From Machine Learning

How much can a video generated by the same diffusion model differ across GPU architectures if the initial noise latent is fixed? [D]

Hi! I am trying to sanity-check an assumption about the reproducibility of diffusion video generation.

Suppose I run the same video diffusion model on two different GPU architectures, with:

  • identical model weights and implementation (same attention backend, etc.)
  • identical prompt and parameters (same number of denoising steps, etc.)
  • a deterministic sampler (no extra noise is injected during inference)
  • the exact same starting noise latent

Could I expect more or less the same generated video?

I understand that there's no way to guarantee bitwise-identical outputs due to floating-point math differences, but could it realistically make the generated videos so different that it'd be immediately noticeable to a human eye? Or would one normally expect only tiny pixel-level/minor perceptual differences?
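To make the question concrete, here is a minimal sketch of the two things I have in mind: why the bits can differ at all (floating-point addition is not associative, so a different reduction order gives a different rounding), and one way the "tiny pixel-level difference" could be quantified. The `psnr` helper and the flat pixel lists are hypothetical names for illustration only, not part of any model or library, and PSNR is just one possible metric that I am assuming here.

```python
import math

# Floating-point addition is not associative: grouping the same three
# numbers differently changes the rounded result. Different GPU kernels
# may reduce sums in different orders, which is one source of drift.
left = (0.1 + 0.2) + 0.3
right = 0.1 + (0.2 + 0.3)
order_matters = left != right
print(left, right, order_matters)


def psnr(a, b, peak=255.0):
    """Peak signal-to-noise ratio (dB) between two equal-length pixel sequences.

    Hypothetical helper: `a` and `b` stand in for the flattened pixel
    values of the same frame produced on two different GPUs.
    """
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mse = sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)
    if mse == 0:
        return math.inf  # identical frames
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy frames: identical except for one pixel that is off by one level.
frame_gpu_a = [10, 20, 30, 40]
frame_gpu_b = [10, 20, 30, 41]
print(psnr(frame_gpu_a, frame_gpu_b))
```

A per-frame score like this would at least separate "differs in the last bits" from "visibly a different video", which is the distinction I am asking about.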

submitted by /u/hellosandrik
