6058 Commits

Tran Thanh Luan
6290fdfda4 [Feat] TaylorSeer Cache (#12648)
* init taylor_seer cache

* make compatible with any tuple size returned

* use logger for printing, add warmup feature

* still update in warmup steps

* refactor, add docs

* add configurable cache, skip compute module

* allow special cache ids only

* add stop_predicts (cooldown)

* update docs

* apply ruff

* update to handle multiple calls per timestep

* refactor to use state manager

* fix format & doc

* chores: naming, remove redundancy

* add docs

* quality & style

* fix taylor precision

* Apply style fixes

* add tests

* Apply style fixes

* Remove TaylorSeerCacheTesterMixin from flux2 tests

* rename identifiers, use more expressive taylor predict loop

* torch compile compatible

* Apply style fixes

* Update src/diffusers/hooks/taylorseer_cache.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* update docs

* make fix-copies

* fix example usage.

* remove tests on flux kontext

---------

Co-authored-by: toilaluan <toilaluan@github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-06 05:39:54 +05:30
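
The TaylorSeer entry above follows the same hook-based caching pattern as diffusers' other cache hooks (FasterCache, FirstBlockCache), so usage plausibly looks like the sketch below. `TaylorSeerCacheConfig` and its field names are assumptions inferred from the commit messages (warmup, cooldown via `stop_predicts`); check `src/diffusers/hooks/taylorseer_cache.py` in the merged PR for the real names.

```python
# Sketch only: TaylorSeerCacheConfig and its fields are assumed from the
# commit messages, not verified against the merged API.
import torch
from diffusers import FluxPipeline
from diffusers.hooks import TaylorSeerCacheConfig  # assumed export

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

config = TaylorSeerCacheConfig(
    warmup_steps=3,    # compute exactly before Taylor predictions kick in
    stop_predicts=45,  # cooldown: return to exact compute for final steps
)
pipe.transformer.enable_cache(config)  # CacheMixin entry point

image = pipe("a photo of a cat", num_inference_steps=50).images[0]
```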
David El Malih
256e010674 Improve docstrings and type hints in scheduling_deis_multistep.py (#12796)
* feat: Add `flow_prediction` to `prediction_type`, introduce `use_flow_sigmas`, `flow_shift`, `use_dynamic_shifting`, and `time_shift_type` parameters, and refine type hints for various arguments.

* style: reformat argument wrapping in `_convert_to_beta` and `index_for_timestep` method signatures.
2025-12-05 08:48:01 -08:00
Sayak Paul
8430ac2a2f [docs] minor fixes to kandinsky docs (#12797)
up
2025-12-05 08:33:05 -08:00
sayakpaul
bb9e713d02 move kandinsky docs. 2025-12-05 21:44:24 +07:00
Álvaro Somoza
c98c157a9e [Docs] Add Z-Image docs (#12775)
* initial

* toctree

* fix

* apply review and fix

* Update docs/source/en/api/pipelines/z_image.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/z_image.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/z_image.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-12-05 11:05:47 -03:00
swappy
f12d161d67 Fix broken group offloading with block_level for models with standalone layers (#12692)
* fix: group offloading to support standalone computational layers in block-level offloading

* test: for models with standalone and deeply nested layers in block-level offloading

* feat: support for block-level offloading in group offloading config

* fix: group offload block modules to AutoencoderKL and AutoencoderKLWan

* fix: update group offloading tests to use AutoencoderKL and adjust input dimensions

* refactor: streamline block offloading logic

* Apply style fixes

* update tests

* update

* fix for failing tests

* clean up

* revert to use skip_keys

* clean up

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-12-05 18:54:05 +05:30
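
For reference, block-level group offloading (the code path fixed above) is applied roughly as follows. `apply_group_offloading` and these arguments are the existing diffusers API; the fix makes standalone layers that live outside the block lists (e.g. `conv_in`/`conv_out` in `AutoencoderKL`) participate in offloading as well.

```python
import torch
from diffusers import AutoencoderKL
from diffusers.hooks import apply_group_offloading

vae = AutoencoderKL.from_pretrained(
    "stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16
)

# block_level moves groups of consecutive blocks onto the onload device on
# demand; standalone layers such as conv_in/conv_out are the case fixed above.
apply_group_offloading(
    vae,
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="block_level",
    num_blocks_per_group=1,
)
```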
David Bertoin
8d415a6f48 PRX Set downscale_freq_shift to 0 for consistency with internal implementation (#12791)
fix timestep embeddings downscale_freq_shift to be consistent with Photoroom's original code
2025-12-04 10:57:14 -10:00
Sayak Paul
7de51b826c [lora] support more ZImage LoRAs (#12790)
up

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-12-04 09:01:11 -10:00
Jiang
cd00ba685b fix spatial compression ratio error for AutoEncoderKLWan during tiled encode (#12753)
fix spatial compression ratio computation error for AutoEncoderKLWan

Co-authored-by: lirui.926 <lirui.926@bytedance.com>
2025-12-04 08:57:13 -10:00
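
A minimal sketch of the tiled-encode path the fix above touches; `enable_tiling` is the existing `AutoencoderKLWan` API, and the repo id is illustrative.

```python
import torch
from diffusers import AutoencoderKLWan

vae = AutoencoderKLWan.from_pretrained(
    "Wan-AI/Wan2.1-T2V-1.3B-Diffusers", subfolder="vae", torch_dtype=torch.float32
)
vae.enable_tiling()  # tiled encode; the spatial compression ratio fix applies here

video = torch.randn(1, 3, 9, 512, 512)  # (B, C, T, H, W) with T = 1 + 4k
latents = vae.encode(video).latent_dist.sample()
```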
David El Malih
2842c14c5f Improve docstrings and type hints in scheduling_unipc_multistep.py (#12767)
refactor: add type hints and update docstrings for UniPCMultistepScheduler parameters and methods.
2025-12-04 10:10:54 -08:00
Sayak Paul
c318686090 Update attention_backends.md to format kernels (#12757) 2025-12-04 07:48:23 -08:00
hlky
6028613226 Z-Image-Turbo from_single_file (#12756)
* Z-Image-Turbo `from_single_file`

* compute_dtype

* -device cast
2025-12-04 20:22:48 +05:30
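
A sketch of the new single-file loading path for Z-Image-Turbo; the checkpoint path below is a placeholder for an actual single-file `.safetensors` checkpoint.

```python
import torch
from diffusers import ZImageTransformer2DModel

# Placeholder path: point it at a real Z-Image-Turbo single-file checkpoint.
transformer = ZImageTransformer2DModel.from_single_file(
    "path/to/z_image_turbo.safetensors", torch_dtype=torch.bfloat16
)
```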
Sayak Paul
a1f36ee3ef [Z-Image] various small changes, Z-Image transformer tests, etc. (#12741)
* start zimage model tests.

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* up

* Revert "up"

This reverts commit bca3e27c96.

* expand upon compilation failure reason.

* Update tests/models/transformers/test_models_transformer_z_image.py

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

* reinitialize the padding tokens to ones to prevent NaN problems.

* updates

* up

* skipping ZImage DiT tests

* up

* up

---------

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
2025-12-03 19:35:46 +05:30
Sayak Paul
d96cbacacd [tests] fix hunyuanvideo 1.5 offloading tests. (#12782)
fix hunyuanvideo 1.5 offloading tests.
2025-12-03 18:07:59 +05:30
Aditya Borate
5ab5946931 Fix: leaf_level offloading breaks after delete_adapters (#12639)
* Fix(peft): Re-apply group offloading after deleting adapters

* Test: Add regression test for group offloading + delete_adapters

* Test: Add assertions to verify output changes after deletion

* Test: Add try/finally to clean up group offloading hooks

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-03 17:39:11 +05:30
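
The regression scenario the fix above covers, sketched with placeholder repo and adapter ids: leaf-level group offloading must remain intact after `delete_adapters`.

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.unet.enable_group_offload(
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="leaf_level",
)
pipe.load_lora_weights("user/sdxl-lora", adapter_name="style")  # placeholder
pipe.delete_adapters("style")  # previously dropped the offloading hooks
```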
Lev Novitskiy
d0c54e5563 Kandinsky 5.0 Video Pro and Image Lite (#12664)
* add transformer pipeline first version


---------

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Charles <charles@huggingface.co>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: dmitrienkoae <dmitrienko.ae@phystech.edu>
Co-authored-by: nvvaulin <nvvaulin@gmail.com>
2025-12-03 00:46:37 -10:00
Dhruv Nair
1908c47600 Deprecate upcast_vae in SDXL based pipelines (#12619)
* update

* update

* Revert "update"

This reverts commit 73906381ab.

* Revert "update"

This reverts commit 21a03f93ef.

* update

* update

* update

* update

* update
2025-12-03 15:53:23 +05:30
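
With `upcast_vae` deprecated, the usual replacement is to cast the VAE directly; a sketch of the common substitute, not necessarily the exact wording the deprecation message points to.

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
# Instead of pipe.upcast_vae():
pipe.vae = pipe.vae.to(dtype=torch.float32)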
Sayak Paul
759ea58708 [core] reuse AttentionMixin for compatible classes (#12463)
* remove attn_processors property

* more

* up

* up more.

* up

* add AttentionMixin to AuraFlow.

* up

* up

* up

* up
2025-12-03 13:58:33 +05:30
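
For context, the `AttentionMixin` surface now shared by these classes looks like this (repo id illustrative; AuraFlow is one of the models the commits mention):

```python
from diffusers import AuraFlowTransformer2DModel

model = AuraFlowTransformer2DModel.from_pretrained(
    "fal/AuraFlow", subfolder="transformer"
)
print(model.attn_processors)  # {qualified_module_name: processor}

# set_attn_processor takes one processor (applied everywhere) or a dict
# keyed like attn_processors:
model.set_attn_processor(next(iter(model.attn_processors.values())))
```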
Sayak Paul
f48f9c250f [core] start varlen variants for attn backend kernels. (#12765)
* start varlen variants for attn backend kernels.

* maybe unflatten heads.

* updates

* remove unused function.

* doc

* up
2025-12-03 13:34:52 +05:30
Kimbing Ng
3c05b9f71c Fixes #12673. record_stream in group offloading is not working properly (#12721)
* Fixes #12673.

    The wrong default_stream is used, leading to the wrong execution order when record_stream is enabled.

* update

* Update test

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-12-03 11:37:11 +05:30
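
Stream-based offloading with `record_stream`, the combination the fix above repairs; the API is the existing one, the model choice is illustrative.

```python
import torch
from diffusers import AutoencoderKL
from diffusers.hooks import apply_group_offloading

model = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

apply_group_offloading(
    model,
    onload_device=torch.device("cuda"),
    offload_device=torch.device("cpu"),
    offload_type="leaf_level",
    use_stream=True,     # overlap transfers with compute
    record_stream=True,  # the flag whose execution order was fixed
)
```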
Jerry Wu
9379b2391b Fix TPU (torch_xla) compatibility error with the tensor repeat func on an empty dim. (#12770)
* Refactor image padding logic to prevent zero tensors in transformer_z_image.py

* Apply style fixes

* Add more support to fix the repeat bug on TPU devices.

* Fix dynamo compile error for multiple if-branches.

---------

Co-authored-by: Mingjia Li <mingjiali@tju.edu.cn>
Co-authored-by: Mingjia Li <mail@mingjia.li>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-12-02 12:51:23 -10:00
Guo-Hua Wang
4f136f842c Add support for Ovis-Image (#12740)
* add ovis_image

* fix code quality

* optimize pipeline_ovis_image.py according to the feedbacks

* optimize imports

* add docs

* make style

* make style

* add ovis to toctree

* oops

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-12-02 11:48:07 -10:00
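
A usage sketch, assuming the new pipeline is exported as `OvisImagePipeline`; both that export and the repo id below are unverified assumptions.

```python
import torch
from diffusers import OvisImagePipeline  # assumed export name

pipe = OvisImagePipeline.from_pretrained(
    "AIDC-AI/Ovis-Image",  # placeholder repo id; verify on the Hub
    torch_dtype=torch.bfloat16,
).to("cuda")
image = pipe("a red panda reading a book").images[0]
```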
CalamitousFelicitousness
edf36f5128 Add ZImage LoRA support and integrate into ZImagePipeline (#12750)
* Add ZImage LoRA support and integrate into ZImagePipeline

* Add LoRA test for Z-Image

* Move the LoRA test

* Fix ZImage LoRA scale support and test configuration

* Add ZImage LoRA test overrides for architecture differences

- Override test_lora_fuse_nan to use ZImage's 'layers' attribute
  instead of 'transformer_blocks'
- Skip block-level LoRA scaling test (not supported in ZImage)
- Add required imports: numpy, torch_device, check_if_lora_correctly_set

* Add ZImageLoraLoaderMixin to LoRA documentation

* Use conditional import for peft.LoraConfig in ZImage tests

* Override test_correct_lora_configs_with_different_ranks for ZImage

ZImage uses 'attention.to_k' naming convention instead of 'attn.to_k',
so the base test's module name search loop never finds a match. This
override uses the correct naming pattern for ZImage architecture.

* Add is_flaky decorator to ZImage LoRA tests initialise padding tokens

* Skip ZImage LoRA test class entirely

Skip the entire ZImageLoRATests class due to non-deterministic behavior
from complex64 RoPE operations and torch.empty padding tokens.
LoRA functionality works correctly with real models.

Clean up removed:
- Individual @unittest.skip decorators
- @is_flaky decorator overrides for inherited methods
- Custom test method overrides
- Global torch deterministic settings
- Unused imports (numpy, is_flaky, check_if_lora_correctly_set)

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-12-02 02:16:30 -03:00
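
A sketch of the LoRA flow this entry enables; the LoRA repo id is a placeholder and the base repo id is assumed from the Z-Image release.

```python
import torch
from diffusers import ZImagePipeline

pipe = ZImagePipeline.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("user/z-image-lora")  # placeholder LoRA repo
image = pipe("a watercolor fox", num_inference_steps=9).images[0]
```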
Sayak Paul
564079f295 [feat]: implement "local" caption upsampling for Flux.2 (#12718)
* feat: implement caption upsampling for flux.2.

* doc

* up

* fix

* up

* fix system prompts 🤷

* up

* up

* up
2025-12-02 04:27:24 +05:30
Sayak Paul
394a48d169 Update bria_fibo.md with minor fixes (#12731)
* Update bria_fibo.md with minor fixes

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-12-02 04:24:19 +05:30
Gal Davidi
99784ae0d2 Rename BriaPipeline to BriaFiboPipeline in documentation (#12758) 2025-12-01 09:34:47 -10:00
DefTruth
fffd964a0f fix FLUX.2 context parallel (#12737) 2025-12-01 09:07:49 -10:00
David El Malih
859b809031 Improve docstrings and type hints in scheduling_euler_ancestral_discrete.py (#12766)
refactor: add type hints to methods and update docstrings for parameters.
2025-12-01 08:38:01 -10:00
David El Malih
d769d8a13b Improve docstrings and type hints in scheduling_heun_discrete.py (#12726)
refactor: improve type hints for `beta_schedule`, `prediction_type`, and `timestep_spacing` parameters, and add return type hints to several methods.
2025-12-01 08:09:36 -08:00
David El Malih
c25582d509 [Docs] Update Imagen Video paper link in schedulers (#12724)
docs: Update Imagen Video paper link in scheduler docstrings.
2025-12-01 08:09:22 -08:00
YiYi Xu
6156cf8f22 Hunyuanvideo15 (#12696)
* add


---------

Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-11-30 20:27:59 -10:00
DefTruth
152f7ca357 fix type-check for z-image transformer (#12739)
* allow type-check for ZImageTransformer2DModel

* make fix-copies
2025-11-29 14:58:33 +05:30
Dhruv Nair
b010a8ce0c [Modular] Add single file support to Modular (#12383)
* update

* update

* update

* update

* Apply style fixes

* update

* update

* update

* update

* update

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-11-28 22:23:04 +05:30
Ayush Sur
1b91856d0e Fix examples not loading LoRA adapter weights from checkpoint (#12690)
* Fix examples not loading LoRA adapter weights from checkpoint

* Updated lora saving logic with accelerate save_model_hook and load_model_hook

* Formatted the changes using ruff

* import and upcasting changed

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-11-28 11:56:39 +05:30
Sayak Paul
01e355516b Enable regional compilation on z-image transformer model (#12736)
up
2025-11-27 07:18:00 -10:00
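
Regional compilation compiles only the repeated transformer blocks instead of the whole graph, which keeps compile time low; `compile_repeated_blocks` is the `ModelMixin` helper this entry wires up for Z-Image (repo id illustrative).

```python
import torch
from diffusers import ZImageTransformer2DModel

transformer = ZImageTransformer2DModel.from_pretrained(
    "Tongyi-MAI/Z-Image-Turbo", subfolder="transformer",
    torch_dtype=torch.bfloat16,
).to("cuda")
transformer.compile_repeated_blocks(fullgraph=True)
```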
Sayak Paul
6bf668c4d2 [chore] remove torch.save from remnant code. (#12717)
remove torch.save from remnant code.
2025-11-27 13:04:09 +05:30
Jerry Wu
e6d4612309 Support unit tests for Z-Image (#12715)
* Add Support for Z-Image.

* Reformatting with make style, black & isort.

* Remove init, Modify import utils, Merge forward in transformers block, Remove once func in pipeline.

* modified main model forward, freqs_cis left

* refactored to add B dim

* fixed stack issue

* fixed modulation bug

* fixed modulation bug

* fix bug

* remove value_from_time_aware_config

* styling

* Fix neg embed and divide (/) bug; reuse zero-pad tensor; turn cat -> repeat; add hint for attn processor.

* Replace padding with pad_sequence; Add gradient checkpointing.

* Fix flash_attn3 in the dispatch attn backend via _flash_attn_forward, replacing its original implementation; add a docstring in the pipeline for that.

* Fix Docstring and Make Style.

* Revert "Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that."

This reverts commit fbf26b7ed1.

* update z-image docstring

* Revert attention dispatcher

* update z-image docstring

* styling

* Restore attention_dispatch.py to its original implementation; a later commit will handle fa3 compatibility.

* Fix the previous bug, and support passing prompt_embeds in args as a list of torch Tensors after prompt pre-encoding.

* Remove einop dependency.

* remove redundant imports & make fix-copies

* fix import

* Support num_images_per_prompt > 1; remove redundant unused variables.

* Fix bugs for num_images_per_prompt with actual batch.

* Add unit tests for Z-Image.

* Refine unit tests and skip cases that need a separate test env; fix unit-test compatibility in the model, mostly precision formatting.

* Add a clean env for the separate test_save_load_float16 test; add note; styling.

* Update dtype mentioned by yiyi.

---------

Co-authored-by: liudongyang <liudongyang0114@gmail.com>
2025-11-26 07:18:57 -10:00
David El Malih
a88a7b4f03 Improve docstrings and type hints in scheduling_dpmsolver_multistep.py (#12710)
* Improve docstrings and type hints in multiple diffusion schedulers

* docs: update Imagen Video paper link to Hugging Face Papers.
2025-11-26 08:38:41 -08:00
Sayak Paul
c8656ed73c [docs] put autopipeline after overview and hunyuanimage in images (#12548)
put autopipeline after overview and hunyuanimage in images
2025-11-26 15:34:22 +05:30
Sayak Paul
94c9613f99 [docs] Correct flux2 links (#12716)
* fix links

* up
2025-11-26 10:46:51 +05:30
Sayak Paul
b91e8c0d0b [lora]: Fix Flux2 LoRA NaN test (#12714)
* up

* Update tests/lora/test_lora_layers_flux2.py

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

---------

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
2025-11-26 09:07:48 +05:30
Andrei Filatov
ac7864624b Update script names in README for Flux2 training (#12713) 2025-11-26 07:02:18 +05:30
Sayak Paul
5ffb73d4ae let's go Flux2 🚀 (#12711)
* add vae

* Initial commit for Flux 2 Transformer implementation

* add pipeline part

* small edits to the pipeline and conversion

* update conversion script

* fix

* up up

* finish pipeline

* Remove Flux IP Adapter logic for now

* Remove deprecated 3D id logic

* Remove ControlNet logic for now

* Add link to ViT-22B paper as reference for parallel transformer blocks such as the Flux 2 single stream block

* update pipeline

* Don't use biases for input projs and output AdaNorm

* up

* Remove bias for double stream block text QKV projections

* Add script to convert Flux 2 transformer to diffusers

* make style and make quality

* fix a few things.

* allow sft files to go.

* fix image processor

* fix batch

* style a bit

* Fix some bugs in Flux 2 transformer implementation

* Fix dummy input preparation and fix some test bugs

* fix dtype casting in timestep guidance module.

* resolve conflicts.

* remove ip adapter stuff.

* Fix Flux 2 transformer consistency test

* Fix bug in Flux2TransformerBlock (double stream block)

* Get remaining Flux 2 transformer tests passing

* make style; make quality; make fix-copies

* remove stuff.

* fix type annotation.

* remove unneeded stuff from tests

* tests

* up

* up

* add sf support

* Remove unused IP Adapter and ControlNet logic from transformer (#9)

* copied from

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>

* up

* up

* up

* up

* up

* Refactor Flux2Attention into separate classes for double stream and single stream attention

* Add _supports_qkv_fusion to AttentionModuleMixin to allow subclasses to disable QKV fusion

* Have Flux2ParallelSelfAttention inherit from AttentionModuleMixin with _supports_qkv_fusion=False

* Log debug message when calling fuse_projections on a AttentionModuleMixin subclass that does not support QKV fusion

* Address review comments

* Update src/diffusers/pipelines/flux2/pipeline_flux2.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* up

* Remove maybe_allow_in_graph decorators for Flux 2 transformer blocks (#12)

* up

* support ostris loras. (#13)

* up

* update schedule

* up

* up (#17)

* add training scripts (#16)

* add training scripts

Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>

* model cpu offload in validation.

* add flux.2 readme

* add img2img and tests

* cpu offload in log validation

* Apply suggestions from code review

* fix

* up

* fixes

* remove i2i training tests for now.

---------

Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
Co-authored-by: linoytsaban <linoy@huggingface.co>

* up

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Daniel Gu <dgu8957@gmail.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-10-53-87-203.ec2.internal>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
Co-authored-by: Linoy Tsaban <linoytsaban@gmail.com>
Co-authored-by: linoytsaban <linoy@huggingface.co>
2025-11-25 21:49:04 +05:30
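
A minimal text-to-image sketch for the new `Flux2Pipeline`; the repo id is assumed from the FLUX naming convention, so verify it on the Hub, and the step/guidance values are illustrative.

```python
import torch
from diffusers import Flux2Pipeline

pipe = Flux2Pipeline.from_pretrained(
    "black-forest-labs/FLUX.2-dev", torch_dtype=torch.bfloat16
)
pipe.enable_model_cpu_offload()  # mirrors the validation path in the training scripts

image = pipe(
    "a bustling night market, cinematic lighting",
    num_inference_steps=28,
    guidance_scale=4.0,
).images[0]
```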
Jerry Wu
4088e8a851 Add Support for Z-Image Series (#12703)
* Add Support for Z-Image.

* Reformatting with make style, black & isort.

* Remove init, Modify import utils, Merge forward in transformers block, Remove once func in pipeline.

* modified main model forward, freqs_cis left

* refactored to add B dim

* fixed stack issue

* fixed modulation bug

* fixed modulation bug

* fix bug

* remove value_from_time_aware_config

* styling

* Fix neg embed and divide (/) bug; reuse zero-pad tensor; turn cat -> repeat; add hint for attn processor.

* Replace padding with pad_sequence; Add gradient checkpointing.

* Fix flash_attn3 in the dispatch attn backend via _flash_attn_forward, replacing its original implementation; add a docstring in the pipeline for that.

* Fix Docstring and Make Style.

* Revert "Fix flash_attn3 in dispatch attn backend by _flash_attn_forward, replace its origin implement; Add DocString in pipeline for that."

This reverts commit fbf26b7ed1.

* update z-image docstring

* Revert attention dispatcher

* update z-image docstring

* styling

* Restore attention_dispatch.py to its original implementation; a later commit will handle fa3 compatibility.

* Fix the previous bug, and support passing prompt_embeds in args as a list of torch Tensors after prompt pre-encoding.

* Remove einop dependency.

* remove redundant imports & make fix-copies

* fix import

---------

Co-authored-by: liudongyang <liudongyang0114@gmail.com>
2025-11-25 05:50:00 -10:00
Junsong Chen
d33d9f6715 fix typo in docs (#12675)
* fix typo in docs

* Update docs/source/en/api/pipelines/sana_video.md

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>

---------

Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
2025-11-24 19:42:16 -08:00
sq
dde8754ba2 Fix variable naming typos in community FluxControlNetFillInpaintPipeline (#12701)
- Fixed variable naming typos (maskkk -> mask_fill, mask_imagee -> mask_image_fill, masked_imagee -> masked_image_fill, masked_image_latentsss -> masked_latents_fill)

These changes improve code readability without affecting functionality.
2025-11-24 15:16:11 -08:00
cdutr
fbcd3ba6b2 [i18n-pt] Fix grammar and expand Portuguese documentation (#12598)
* Updates Portuguese documentation for Diffusers library

Enhances the Portuguese documentation with:
- Restructured table of contents for improved navigation
- Added placeholder page for in-translation content
- Refined language and improved readability in existing pages
- Introduced a new page on basic Stable Diffusion performance guidance

Improves overall documentation structure and user experience for Portuguese-speaking users

* Removes untranslated sections from Portuguese documentation

Cleans up the Portuguese documentation table of contents by removing placeholder sections marked as "Em tradução" (In translation)

Removes the in_translation.md file and associated table of contents entries for sections that are not yet translated, improving documentation clarity
2025-11-24 14:07:32 -08:00
Sayak Paul
d176f61fcf [core] support sage attention + FA2 through kernels (#12439)
* up

* support automatic dispatch.

* disable compile support for now./

* up

* flash too.

* document.

* up

* up

* up

* up
2025-11-24 16:58:07 +05:30
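
Backends are selected per model; a sketch using `set_attention_backend`, where the backend string for the kernels-backed sage variant is an assumption (the attention-backends doc lists the registered names).

```python
import torch
from diffusers import FluxPipeline

pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")

# "sage" is assumed to be the registered name; check the attention
# backends documentation for the kernels-based sage/FA2 variants.
pipe.transformer.set_attention_backend("sage")
image = pipe("a photo of a cat").images[0]
```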
DefTruth
354d35adb0 bugfix: fix chrono-edit context parallel (#12660)
* bugfix: fix chrono-edit context parallel

* bugfix: fix chrono-edit context parallel

* Update src/diffusers/models/transformers/transformer_chronoedit.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update src/diffusers/models/transformers/transformer_chronoedit.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Clean up comments in transformer_chronoedit.py

Removed unnecessary comments regarding parallelization in cross-attention.

* fix style

* fix qc

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-11-24 16:36:53 +05:30
SwayStar123
544ba677dd Add FluxLoraLoaderMixin to Fibo pipeline (#12688)
Update pipeline_bria_fibo.py
2025-11-24 13:31:31 +05:30