Studying FLUX in the diffusers library was hard, so I built a smaller open-source version [P]

If you've tried to study modern diffusion models by digging through the official diffusers library, you know how overwhelming its complexity and layers of abstraction can be.

I wanted to make FLUX diffusion models easier to study, so I built minFLUX: a PyTorch implementation focused on the core architecture and math of FLUX. Here is the project: https://github.com/purohit10saurabh/minFLUX

What’s inside:

- Minimal FLUX.1 and FLUX.2 implementations with VAE and transformer models

- Line-by-line mappings to the corresponding HuggingFace diffusers source code

- Training loop (VAE encode → flow matching → velocity MSE)

- Inference loop (noise → Euler ODE → VAE decode)

- Shared utilities (RoPE, timestep embeddings)
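For anyone new to flow matching, here is a rough sketch of what the training and inference bullets above boil down to. This is not code from minFLUX or diffusers: `ToyVelocityModel`, `flow_matching_loss`, and `euler_sample` are hypothetical names, the tiny MLP stands in for the FLUX transformer, and random tensors stand in for VAE latents (the VAE encode/decode steps are left out). It also assumes the common rectified-flow convention where t=0 is data, t=1 is noise, and the model predicts the velocity `noise - data`.

```python
import torch
import torch.nn as nn


class ToyVelocityModel(nn.Module):
    """Hypothetical stand-in for the FLUX transformer: maps (x_t, t) to a velocity."""

    def __init__(self, dim: int = 16, hidden: int = 64):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim + 1, hidden), nn.SiLU(), nn.Linear(hidden, dim)
        )

    def forward(self, x_t: torch.Tensor, t: torch.Tensor) -> torch.Tensor:
        # Append the scalar timestep as one extra feature per sample.
        return self.net(torch.cat([x_t, t[:, None]], dim=-1))


def flow_matching_loss(model: nn.Module, latents: torch.Tensor) -> torch.Tensor:
    """Velocity MSE on a straight path between data (t=0) and noise (t=1).

    `latents` has shape (batch, dim) and stands in for VAE-encoded images.
    """
    noise = torch.randn_like(latents)
    t = torch.rand(latents.shape[0], device=latents.device)
    t_b = t[:, None]                            # broadcast over the feature dim
    x_t = (1.0 - t_b) * latents + t_b * noise   # linear interpolation
    target = noise - latents                    # d(x_t)/dt along that path
    return torch.mean((model(x_t, t) - target) ** 2)


@torch.no_grad()
def euler_sample(model: nn.Module, shape: tuple, num_steps: int = 8) -> torch.Tensor:
    """Integrate the learned ODE from pure noise (t=1) down to t=0 with Euler steps."""
    x = torch.randn(shape)
    ts = torch.linspace(1.0, 0.0, num_steps + 1)
    for t_cur, t_next in zip(ts[:-1], ts[1:]):
        t = torch.full((shape[0],), float(t_cur))
        x = x + (t_next - t_cur) * model(x, t)  # step size is negative here
    return x                                    # would be passed to the VAE decoder
```

In a real training loop you would call `flow_matching_loss(model, latents).backward()` on VAE-encoded batches and step an optimizer; at inference, the output of `euler_sample` would go through the VAE decoder to get an image. Real FLUX latents also carry text conditioning and position IDs, which this sketch ignores.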

The most interesting part for me was seeing that FLUX.2 is not just a scaled-up FLUX.1. It improves the transformer blocks, modulation, FFN, VAE normalization, position IDs, etc. The architecture overview of FLUX.2 is attached.

Let me know if you find this interesting! 🙂

https://preview.redd.it/9evuthx2vg8h1.jpg?width=1080&format=pjpg&auto=webp&s=47e4f72f4751e1c11d3928f6dcb43c9e96cbbc0b

submitted by /u/Other-Eye-8152