mirror of https://github.com/huggingface/diffusers.git synced 2025-12-06 12:34:13 +08:00

Sayak Paul 564079f295 [feat]: implement "local" caption upsampling for Flux.2 (#12718)

* feat: implement caption upsampling for flux.2.

* doc

* up

* fix

* up

* fix system prompts 🤷

* up

* up

* up

2025-12-02 04:27:24 +05:30

Flux2

Flux.2 is the most recent series of image generation models from Black Forest Labs, succeeding the Flux.1 series. It is an entirely new model with a new architecture, pre-trained from scratch!

Original model checkpoints for Flux can be found here. Original inference code can be found here.

Tip

Flux.2 can be quite expensive to run on consumer hardware. However, you can apply a suite of optimizations to run it faster and in a more memory-friendly manner. Check out this section for more details. Additionally, Flux.2 can benefit from quantization, which improves memory efficiency at the cost of some inference latency. Refer to this blog post to learn more.

Caching may also speed up inference by storing and reusing intermediate outputs.

Caption upsampling

Flux.2 can potentially generate better outputs with better prompts. We can "upsample" an input prompt by passing the caption_upsample_temperature argument in the pipeline call. The official implementation recommends a value of 0.15.
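A minimal sketch of how the argument might be passed is shown below. Only `Flux2Pipeline` and `caption_upsample_temperature` come from this page; the helper names `build_call_kwargs` and `generate`, the checkpoint id `black-forest-labs/FLUX.2-dev`, the `bfloat16` dtype, and the use of CPU offloading are assumptions made for illustration. The heavy imports are deferred so the keyword-building helper can be inspected without loading the model.

```python
from typing import Any, Dict, Optional

# Value this page attributes to the official implementation.
RECOMMENDED_UPSAMPLE_TEMPERATURE = 0.15


def build_call_kwargs(
    prompt: str,
    upsample_temperature: Optional[float] = RECOMMENDED_UPSAMPLE_TEMPERATURE,
    **extra: Any,
) -> Dict[str, Any]:
    """Hypothetical helper: assemble keyword arguments for the pipeline call.

    Passing upsample_temperature=None leaves caption upsampling out entirely.
    """
    kwargs: Dict[str, Any] = {"prompt": prompt, **extra}
    if upsample_temperature is not None:
        kwargs["caption_upsample_temperature"] = upsample_temperature
    return kwargs


def generate(
    prompt: str,
    model_id: str = "black-forest-labs/FLUX.2-dev",  # assumed checkpoint id
    upsample_temperature: Optional[float] = RECOMMENDED_UPSAMPLE_TEMPERATURE,
):
    """Hypothetical helper: load the pipeline and generate one image."""
    # Imported lazily because loading the model needs a large download and a GPU.
    import torch
    from diffusers import Flux2Pipeline

    pipe = Flux2Pipeline.from_pretrained(model_id, torch_dtype=torch.bfloat16)
    # Assumed memory-saving choice; not required for caption upsampling.
    pipe.enable_model_cpu_offload()
    return pipe(**build_call_kwargs(prompt, upsample_temperature)).images[0]
```

Calling `generate("a cat wearing a spacesuit")` would then run the pipeline with upsampling at the recommended temperature, while `upsample_temperature=None` sends the prompt through unchanged.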

Flux2Pipeline

[[autodoc]] Flux2Pipeline
	- all
	- __call__
