Compare commits

...

774 Commits

Author SHA1 Message Date
Aryan
de7cdf6287 Merge modular diffusers with main (#11893)
* [CI] Fix big GPU test marker (#11786)

* update

* update

* First Block Cache (#11180)

* update

* modify flux single blocks to make compatible with cache techniques (without too much model-specific intrusion code)

* remove debug logs

* update

* cache context for different batches of data

* fix hs residual bug for single return outputs; support ltx

* fix controlnet flux

* support flux, ltx i2v, ltx condition

* update

* update

* Update docs/source/en/api/cache.md

* Update src/diffusers/hooks/hooks.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* address review comments pt. 1

* address review comments pt. 2

* cache context refacotr; address review pt. 3

* address review comments

* metadata registration with decorators instead of centralized

* support cogvideox

* support mochi

* fix

* remove unused function

* remove central registry based on review

* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fix

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-07-08 18:30:27 -10:00
yiyixuxu
73c5fe8bb1 Merge branch 'modular-diffusers' of github.com:huggingface/diffusers into modular-diffusers 2025-07-08 22:13:34 +02:00
yiyixuxu
595581d6ba style 2025-07-08 22:13:00 +02:00
yiyixuxu
d27b65411e add more docstrings + experimental marks 2025-07-08 20:23:44 +02:00
yiyixuxu
cb9dca5523 add experimental marks to all modular docs 2025-07-08 20:23:21 +02:00
YiYi Xu
79166dcb47 Merge branch 'main' into modular-diffusers 2025-07-08 05:46:01 -10:00
Sayak Paul
01240fecb0 [training ] add Kontext i2i training (#11858)
* feat: enable i2i fine-tuning in Kontext script.

* readme

* more checks.

* Apply suggestions from code review

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>

* fixes

* fix

* add proj_mlp to the mix

* Update README_flux.md

add note on installing from commit `05e7a854d0a5661f5b433f6dd5954c224b104f0b`

* fix

* fix

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-07-08 21:04:16 +05:30
Steven Liu
ce338d4e4a [docs] LoRA metadata (#11848)
* draft

* hub image

* update

* fix
2025-07-08 08:29:38 -07:00
yiyixuxu
f95c320467 addreess more review comments 2025-07-08 07:11:57 +02:00
yiyixuxu
59abd9514b add link to components manager doc 2025-07-08 06:47:14 +02:00
yiyixuxu
5f3ebef0d7 update remove duplicated config for pag, and remove the description of all the guiders 2025-07-08 06:29:47 +02:00
YiYi Xu
e6ffde2936 Apply suggestions from code review
Co-authored-by: Aryan <aryan@huggingface.co>
2025-07-07 18:25:31 -10:00
yiyixuxu
04171c7345 Merge branch 'modular-diffusers' of github.com:huggingface/diffusers into modular-diffusers 2025-07-08 06:17:08 +02:00
Aryan
be5e10ae61 Copied-from implementation of PAG-guider (#11882)
* update

* fix
2025-07-07 18:16:52 -10:00
yiyixuxu
a2da0004ee add a guide on components manager 2025-07-08 06:16:26 +02:00
yiyixuxu
863c7df543 components manager: use shorter ID, display id instead of name 2025-07-08 06:15:37 +02:00
Sayak Paul
bc55b631fd [tests] remove tests for deprecated pipelines. (#11879)
* remove tests for deprecated pipelines.

* remove folders

* test_pipelines_common
2025-07-08 07:13:16 +05:30
yiyixuxu
e0083b29d5 Merge branch 'modular-diffusers' of github.com:huggingface/diffusers into modular-diffusers 2025-07-07 20:52:54 +02:00
yiyixuxu
6521f599b2 make sure modularpipeline from_pretrained works without modular_model_index 2025-07-07 20:52:37 +02:00
YiYi Xu
0fcce2acd8 Merge branch 'main' into modular-diffusers 2025-07-07 07:17:20 -10:00
Sayak Paul
15d50f16f2 [docs] fix references in flux pipelines. (#11857)
* fix references in flux.

* Update src/diffusers/pipelines/flux/pipeline_flux_kontext.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-07 22:20:34 +05:30
yiyixuxu
ceeb3c1da3 fix 2025-07-07 10:21:01 +02:00
Sayak Paul
2c30287958 [chore] deprecate blip controlnet pipeline. (#11877)
* deprecate blip controlnet pipeline.

* last_supported_version
2025-07-07 13:25:40 +05:30
yiyixuxu
0fcdd699cf style 2025-07-07 09:55:04 +02:00
yiyixuxu
5af003a9e1 update from_componeenet, update_component 2025-07-07 09:51:04 +02:00
yiyixuxu
179d6d958b add subfolder to push_to_hub 2025-07-07 09:50:33 +02:00
yiyixuxu
229c4b355c add from_pretrained/save_pretrained for guider 2025-07-07 09:50:04 +02:00
yiyixuxu
0a4819a755 add sub_folder to save_pretrained() for config mixin 2025-07-07 09:49:29 +02:00
yiyixuxu
7cea9a3bb0 add a guider section on doc 2025-07-07 09:48:28 +02:00
yiyixuxu
23de59e21a add sub_blocks for pipelineBlock 2025-07-06 06:18:34 +02:00
yiyixuxu
4f8b6f5a15 style + copy 2025-07-06 03:23:31 +02:00
yiyixuxu
63e94cbc61 resolve conflicnt 2025-07-06 02:59:32 +02:00
YiYi Xu
2c66fb3a85 Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-07-05 14:26:13 -10:00
Aryan
284f827d6c Modular custom config object serialization (#11868)
* update

* make style
2025-07-05 07:49:35 -10:00
Aryan
b750c69859 Modular Guider ConfigMixin (#11862)
* update

* update

* register to config pag
2025-07-04 17:08:05 -10:00
Aryan
13c51bb038 Modular PAG Guider (#11860)
* update

* fix

* update
2025-07-04 12:19:10 -10:00
Aryan
425a715e35 Fix Wan AccVideo/CausVid fuse_lora (#11856)
* fix

* actually, better fix

* empty commit; trigger tests again

* mark wanvace test as flaky
2025-07-04 21:10:35 +05:30
Benjamin Bossan
2527917528 FIX set_lora_device when target layers differ (#11844)
* FIX set_lora_device when target layers differ

Resolves #11833

Fixes a bug that occurs after calling set_lora_device when multiple LoRA
adapters are loaded that target different layers.

Note: Technically, the accompanying test does not require a GPU because
the bug is triggered even if the parameters are already on the
corresponding device, i.e. loading on CPU and then changing the device
to CPU is sufficient to cause the bug. However, this may be optimized
away in the future, so I decided to test with GPU.

* Update docstring to warn about device mismatch

* Extend docstring with an example

* Fix docstring

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-07-04 19:26:17 +05:30
Sayak Paul
e6639fef70 [benchmarks] overhaul benchmarks (#11565)
* start overhauling the benchmarking suite.

* fixes

* fixes

* checking.

* checking

* fixes.

* error handling and logging.

* add flops and params.

* add more models.

* utility to fire execution of all benchmarking scripts.

* utility to push to the hub.

* push utility improvement

* seems to be working.

* okay

* add torchprofile dep.

* remove total gpu memory

* fixes

* fix

* need a big gpu

* better

* what's happening.

* okay

* separate requirements and make it nightly.

* add db population script.

* update secret name

* update secret.

* population db update

* disable db population for now.

* change to every monday

* Update .github/workflows/benchmark.yml

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* quality improvements.

* reparate hub upload step.

* repository

* remove csv

* check

* update

* update

* threading.

* update

* update

* updaye

* update

* update

* update

* remove peft dep

* upgrade runner.

* fix

* fixes

* fix merging csvs.

* push dataset to the Space repo for analysis.

* warm up.

* add a readme

* Apply suggestions from code review

Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>

* address feedback

* Apply suggestions from code review

* disable db workflow.

* update to bi weekly.

* enable population

* enable

* updaye

* update

* metadata

* fix

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Luc Georges <McPatate@users.noreply.github.com>
2025-07-04 11:04:17 +05:30
Aryan
8c938fb410 [docs] Add a note of _keep_in_fp32_modules (#11851)
* update

* Update docs/source/en/using-diffusers/schedulers.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update schedulers.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-07-02 15:51:57 -07:00
Linoy Tsaban
f864a9a352 [Flux Kontext] Support Fal Kontext LoRA (#11823)
* initial commit

* initial commit

* initial commit

* fix import

* fix prefix

* remove print

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-02 16:57:08 +03:00
Vương Đình Minh
d6fa3298fa update: FluxKontextInpaintPipeline support (#11820)
* update: FluxKontextInpaintPipeline support

* fix: Refactor code, remove mask_image_latents and ruff check

* feat: Add test case and fix with pytest

* Apply style fixes

* copies

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-01 23:34:27 -10:00
Sayak Paul
6f1d6694df [lora] tests for exclude_modules with Wan VACE (#11843)
* wan vace.

* update

* update

* import problem
2025-07-02 14:23:26 +05:30
Ju Hoon Park
0e95aa853e [From Single File] support from_single_file method for WanVACE3DTransformer (#11807)
* add `WandVACETransformer3DModel` in`SINGLE_FILE_LOADABLE_CLASSES`

* add rename keys for `VACE`

add rename keys for `VACE`

* fix typo

Sincere thanks to @nitinmukesh 🙇‍♂️

* support for `1.3B VACE` model

Sincere thanks to @nitinmukesh again🙇‍♂️

* update

* update

* Apply style fixes

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-02 05:55:36 +02:00
Luo Yihang
5ef74fd5f6 fix norm not training in train_control_lora_flux.py (#11832) 2025-07-01 17:37:54 -10:00
Steven Liu
64a9210315 [docs] Deprecated pipelines (#11838)
add warning

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-07-01 14:02:54 -10:00
Steven Liu
d31b8cea3e [docs] Batch generation (#11841)
* draft

* fix

* fix

* feedback

* feedback
2025-07-01 17:00:20 -07:00
Mikko Tukiainen
62e847db5f Use real-valued instead of complex tensors in Wan2.1 RoPE (#11649)
* use real instead of complex tensors in Wan2.1 RoPE

* remove the redundant type conversion

* unpack rotary_emb

* register rotary embedding frequencies as non-persistent buffers

* Apply style fixes

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-07-01 13:57:19 -10:00
Sayak Paul
470458623e [docs] fix single_file example. (#11847)
fix single_file example.
2025-07-01 21:23:27 +05:30
Aryan
a79c3af6bb [single file] Cosmos (#11801)
* update

* update

* update docs
2025-07-01 18:02:58 +05:30
Aryan
3f3f0c16a6 [tests] Fix failing float16 cuda tests (#11835)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-07-01 11:13:58 +05:30
jiqing-feng
f3e1310469 reset deterministic in tearDownClass (#11785)
* reset deterministic in tearDownClass

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix deterministic setting

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-07-01 10:06:54 +05:30
Sayak Paul
87f83d3dd9 [tests] add test for hotswapping + compilation on resolution changes (#11825)
* add resolution changes tests to hotswapping test suite.

* fixes

* docs

* explain duck shapes

* fix
2025-07-01 09:40:34 +05:30
yiyixuxu
3e46c86a93 fix links in the doc 2025-07-01 04:51:49 +02:00
yiyixuxu
8cb5b084b5 up upup 2025-07-01 03:22:27 +02:00
yiyixuxu
13fe248152 add modularpipelineblocks to be pushtohub mixin 2025-07-01 03:22:15 +02:00
yiyixuxu
2e2024152c up up 2025-07-01 03:07:08 +02:00
yiyixuxu
1987c07899 update docstree 2025-07-01 03:06:34 +02:00
yiyixuxu
4543d216ec rename quick start- it is really not quick 2025-07-01 03:06:13 +02:00
yiyixuxu
b5db8aaa6f developer_guide -> end-to-end guide 2025-07-01 03:05:38 +02:00
yiyixuxu
98ea5c9e86 Merge branch 'modular-diffusers' of github.com:huggingface/diffusers into modular-diffusers 2025-06-30 22:10:10 +02:00
yiyixuxu
f27fbceba1 more attemp to fix circular import 2025-06-30 22:09:57 +02:00
YiYi Xu
4b12a60c93 Merge branch 'main' into modular-diffusers 2025-06-30 09:46:44 -10:00
yiyixuxu
abf28d55fb update 2025-06-30 21:45:30 +02:00
Aryan
f064b3bf73 Remove print statement in SCM Scheduler (#11836)
remove print
2025-06-30 09:07:34 -10:00
yiyixuxu
db4b54cfab finish the autopipelines section! 2025-06-30 21:05:32 +02:00
yiyixuxu
0138e176ac remove the get_exeuction_blocks rec from AutoPipelineBlocks repr 2025-06-30 21:05:12 +02:00
Benjamin Bossan
3b079ec3fa ENH: Improve speed of function expanding LoRA scales (#11834)
* ENH Improve speed of expanding LoRA scales

Resolves #11816

The following call proved to be a bottleneck when setting a lot of LoRA
adapters in diffusers:

cdaf84a708/src/diffusers/loaders/peft.py (L482)

This is because we would repeatedly call unet.state_dict(), even though
in the standard case, it is not necessary:

cdaf84a708/src/diffusers/loaders/unet_loader_utils.py (L55)

This PR fixes this by deferring this call, so that it is only run when
it's necessary, not earlier.

* Small fix

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-06-30 20:25:56 +05:30
Sayak Paul
bc34fa8386 [lora]feat: use exclude modules to loraconfig. (#11806)
* feat: use exclude modules to loraconfig.

* version-guard.

* tests and version guard.

* remove print.

* describe the test

* more detailed warning message + shift to debug

* update

* update

* update

* remove test
2025-06-30 20:08:53 +05:30
yiyixuxu
bbd9340781 up 2025-06-30 11:30:06 +02:00
yiyixuxu
363737ec4b add loop sequential blocks 2025-06-30 11:09:08 +02:00
yiyixuxu
c5849ba9d5 more 2025-06-30 09:46:34 +02:00
yiyixuxu
f09b1ccfae start the section on sequential pipelines 2025-06-30 07:48:44 +02:00
yiyixuxu
285f877620 make InsertableDict importable from modular_pipelines 2025-06-30 07:48:26 +02:00
yiyixuxu
c75b88f86f up 2025-06-30 03:23:44 +02:00
YiYi Xu
b43e703fae Update docs/source/en/modular_diffusers/write_own_pipeline_block.md 2025-06-29 14:49:54 -10:00
YiYi Xu
9fae3828a7 Apply suggestions from code review 2025-06-29 14:49:31 -10:00
yiyixuxu
3a3441cb45 start the write your own pipeline block tutorial 2025-06-30 02:47:38 +02:00
yiyixuxu
fdd2bedae9 2024 -> 2025; fix a circular import 2025-06-29 03:00:46 +02:00
YiYi Xu
fedaa00bd5 Merge branch 'main' into modular-diffusers 2025-06-28 14:50:58 -10:00
yiyixuxu
8c680bc0b4 up 2025-06-28 14:11:17 +02:00
yiyixuxu
92b6b43805 add some visuals 2025-06-28 13:39:45 +02:00
yiyixuxu
49ea4d1bf5 style 2025-06-28 12:50:11 +02:00
yiyixuxu
58dbe0c29e finimsh the quickstart! 2025-06-28 12:46:21 +02:00
yiyixuxu
9aaec5b9bc up 2025-06-28 12:46:06 +02:00
yiyixuxu
93760b1888 InsertableOrderedDict -> InsertableDict 2025-06-28 09:15:13 +02:00
yiyixuxu
75540f42ee more blocks -> sub_blocks 2025-06-28 08:54:05 +02:00
yiyixuxu
b543bcc661 docstring blocks -> sub_blocks 2025-06-28 08:53:46 +02:00
yiyixuxu
885a596696 blocks -> sub_blocks; will not by default load all; add load_default_components method on modular_pipeline 2025-06-28 08:52:43 +02:00
yiyixuxu
655512e2cf components manager: change get -> search_models; add get_ids, get_components_by_ids, get_components_by_names 2025-06-28 08:35:50 +02:00
Sayak Paul
05e7a854d0 [lora] fix: lora unloading behvaiour (#11822)
* fix: lora unloading behvaiour

* fix

* update
2025-06-28 12:00:42 +05:30
Aryan
76ec3d1fee Support dynamically loading/unloading loras with group offloading (#11804)
* update

* add test

* address review comments

* update

* fixes

* change decorator order to fix tests

* try fix

* fight tests
2025-06-27 23:20:53 +05:30
Aryan
cdaf84a708 TorchAO compile + offloading tests (#11697)
* update

* update

* update

* update

* update

* user property instead
2025-06-27 18:31:57 +05:30
Sayak Paul
e8e44a510c [CI] disable onnx, mps, flax from the CI (#11803)
* disable onnx, mps, flax

* remove
2025-06-27 16:33:43 +05:30
yiyixuxu
f63d62e091 intermediates_inputs -> intermediate_inputs; component_manager -> components_manager, and more 2025-06-27 12:48:30 +02:00
Sayak Paul
21543de571 remove syncs before denoising in Kontext (#11818) 2025-06-27 15:57:55 +05:30
Aryan
d7dd924ece Kontext fixes (#11815)
fix
2025-06-26 13:03:44 -10:00
Sayak Paul
00f95b9755 Kontext training (#11813)
* support flux kontext

* make fix-copies

* add example

* add tests

* update docs

* update

* add note on integrity checker

* initial commit

* initial commit

* add readme section and fixes in the training script.

* add test

* rectify ckpt_id

* fix ckpt

* fixes

* change id

* update

* Update examples/dreambooth/train_dreambooth_lora_flux_kontext.py

Co-authored-by: Aryan <aryan@huggingface.co>

* Update examples/dreambooth/README_flux.md

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: linoytsaban <linoy@huggingface.co>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-06-26 19:31:42 +03:00
Aryan
eea76892e8 Flux Kontext (#11812)
* support flux kontext

* make fix-copies

* add example

* add tests

* update docs

* update

* add note on integrity checker

* make fix-copies issue

* add copied froms

* make style

* update repository ids

* more copied froms
2025-06-26 21:29:59 +05:30
yiyixuxu
7608d2eb9e style 2025-06-26 12:44:02 +02:00
yiyixuxu
449f299c63 move all the sequential pipelines & auto pipelines to the blocks_presets.py 2025-06-26 12:43:14 +02:00
yiyixuxu
84f4b27dfa modular_pipeline_presets.py -> modular_blocks_presets.py 2025-06-26 12:41:16 +02:00
yiyixuxu
9abac85f77 remove mapping file, move to preeset.py 2025-06-26 12:40:38 +02:00
yiyixuxu
61772f0994 updatee a comment 2025-06-26 12:39:53 +02:00
yiyixuxu
b92cda25e2 move quicktour to first page 2025-06-26 12:39:13 +02:00
kaixuanliu
27bf7fcd0e adjust tolerance criteria for test_float16_inference in unit test (#11809)
Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
2025-06-26 13:19:59 +05:30
Sayak Paul
a185e1ab91 [tests] add a test on torch compile for varied resolutions (#11776)
* add test for checking compile on different shapes.

* update

* update

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-26 10:07:03 +05:30
Animesh Jain
d93381cd41 [rfc][compile] compile method for DiffusionPipeline (#11705)
* [rfc][compile] compile method for DiffusionPipeline

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Apply style fixes

* Update docs/source/en/optimization/fp16.md

* check

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-26 08:41:38 +05:30
Dhruv Nair
3649d7b903 Follow up for Group Offload to Disk (#11760)
* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-06-26 07:24:24 +05:30
yiyixuxu
7492e331b4 fix 2025-06-26 03:43:10 +02:00
yiyixuxu
ab6d63407a style 2025-06-26 03:37:58 +02:00
yiyixuxu
da4242d467 use diffusers ModelHook, raise a import error for accelerate inside enable_auto_cpu_offload 2025-06-26 03:36:34 +02:00
Sayak Paul
10c36e0b78 [chore] post release v0.34.0 (#11800)
* post release v0.34.0

* code quality

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-06-26 06:56:46 +05:30
yiyixuxu
129d658da7 oops, fix 2025-06-26 01:36:43 +02:00
yiyixuxu
75e62385f5 revert changes in pipelines.stable_diffusion_xl folder, can seperate PR later 2025-06-26 01:35:00 +02:00
yiyixuxu
a33206d22b fix 2025-06-26 01:31:51 +02:00
yiyixuxu
a82e211f89 style 2025-06-26 00:48:23 +02:00
yiyixuxu
f3453f05ff copy 2025-06-26 00:47:33 +02:00
yiyixuxu
c437ae72c6 copies 2025-06-25 23:26:59 +02:00
Sayak Paul
8846635873 fix deprecation in lora after 0.34.0 release (#11802) 2025-06-25 08:48:20 -10:00
yiyixuxu
9530245e17 correct code format 2025-06-25 12:10:35 +02:00
yiyixuxu
74b908b7e2 style 2025-06-25 12:04:52 +02:00
yiyixuxu
7d2a633e02 style 2025-06-25 11:26:36 +02:00
YiYi Xu
cb328d3ff9 Apply suggestions from code review 2025-06-24 23:12:26 -10:00
YiYi Xu
8c038f0e62 Update src/diffusers/loaders/lora_base.py 2025-06-24 23:05:23 -10:00
yiyixuxu
5917d7039f remove lora related changes 2025-06-25 11:04:25 +02:00
yiyixuxu
c0327e493e update init 2025-06-25 10:49:09 +02:00
kaixuanliu
dd285099eb adjust to get CI test cases passed on XPU (#11759)
* adjust to get CI test cases passed on XPU

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* fix format issue

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* Apply style fixes

---------

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-06-25 14:02:17 +05:30
YiYi Xu
174628edf4 Merge branch 'main' into modular-diffusers 2025-06-24 22:01:03 -10:00
yiyixuxu
1c9f0a83c9 ujpdate toctree 2025-06-25 09:14:19 +02:00
yiyixuxu
cdaaa40d31 update ComponentSpec.from_component, only update config if it is created with from_config 2025-06-25 08:56:08 +02:00
yiyixuxu
ffbaa890ba move save_pretrained to the correct place 2025-06-25 08:55:06 +02:00
yiyixuxu
e49413d87d update doc 2025-06-25 08:52:15 +02:00
Sayak Paul
80f27d7e8d [tests] skip instead of returning. (#11793)
skip instead of returning.
2025-06-25 08:59:36 +05:30
Sayak Paul
d3e27e05f0 guard omnigen processor. (#11799) 2025-06-24 19:15:34 +05:30
Aryan
5df02fc171 [tests] Fix group offloading and layerwise casting test interaction (#11796)
* update

* update

* update
2025-06-24 17:33:32 +05:30
Sayak Paul
7392c8ff5a [chore] raise as early as possible in group offloading (#11792)
* raise as early as possible in group offloading

* remove check from ModuleGroup
2025-06-24 15:05:23 +05:30
Aryan
474a248f10 [tests] Fix HunyuanVideo Framepack device tests (#11789)
update
2025-06-24 13:49:37 +05:30
yiyixuxu
48e4ff5c05 update overview 2025-06-24 10:17:35 +02:00
yiyixuxu
7c78fb1aad add a overview doc page 2025-06-24 08:16:34 +02:00
YiYi Xu
7bc0a07b19 [lora] only remove hooks that we add back (#11768)
up
2025-06-23 16:49:19 -10:00
Sayak Paul
92542719ed [docs] minor cleanups in the lora docs. (#11770)
* minor cleanups in the lora docs.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* format docs

* fix copies

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-24 08:10:07 +05:30
yiyixuxu
bb4044362e up 2025-06-23 18:37:28 +02:00
yiyixuxu
1ae591e817 update code format 2025-06-23 18:08:55 +02:00
yiyixuxu
42c06e90f4 update doc 2025-06-23 17:55:32 +02:00
yiyixuxu
085ade03be add doc (developer guide) 2025-06-23 16:12:31 +02:00
yiyixuxu
78d2454c7c fix 2025-06-23 16:06:17 +02:00
imbr92
6760300202 Add --lora_alpha and metadata handling to train_dreambooth_lora_sana.py (#11744)
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-06-23 15:46:44 +03:00
Yuanchen Guo
798265f2b6 [Wan] Fix mask padding in Wan VACE pipeline. (#11778) 2025-06-23 16:28:21 +05:30
Dhruv Nair
cd813499be [CI] Skip ONNX Upscale tests (#11774)
update
2025-06-23 12:14:01 +05:30
Sayak Paul
fbddf02807 [tests] properly skip tests instead of return (#11771)
model test updates
2025-06-23 11:59:59 +05:30
Yao Matrix
f20b83a04f enable cpu offloading of new pipelines on XPU & use device agnostic empty to make pipelines work on XPU (#11671)
* commit 1

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* patch 2

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Update pipeline_pag_sana.py

* Update pipeline_sana.py

* Update pipeline_sana_controlnet.py

* Update pipeline_sana_sprint_img2img.py

* Update pipeline_sana_sprint.py

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix fat-thumb while merge conflict

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix ci issues

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-06-23 09:44:16 +05:30
jiqing-feng
ee40088fe5 enable deterministic in bnb 4 bit tests (#11738)
* enable deterministic in bnb 4 bit tests

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* fix 8bit test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-06-23 08:17:36 +05:30
yiyixuxu
19545fd3e1 update components manager __repr__ 2025-06-22 12:59:19 +02:00
yiyixuxu
d12531ddf7 lora: only remove hooks that we add back 2025-06-22 12:32:04 +02:00
yiyixuxu
4751d456f2 shorten loop subblock name 2025-06-22 12:31:16 +02:00
Tolga Cangöz
7fc53b5d66 Fix dimensionalities in apply_rotary_emb functions' comments (#11717)
Fix dimensionality in `apply_rotary_emb` functions' comments.
2025-06-21 12:09:28 -10:00
yiyixuxu
083479c365 ordereddict -> insertableOrderedDict; make sure loader to method works 2025-06-21 04:28:10 +02:00
yiyixuxu
04c16d0a56 update 2025-06-21 04:25:12 +02:00
yiyixuxu
9e58856b7a add __repr__ method for InsertableOrderedDict 2025-06-21 04:24:44 +02:00
Steven Liu
0874dd04dc [docs] LoRA scale scheduling (#11727)
draft
2025-06-20 10:15:29 -07:00
Steven Liu
6184d8a433 [docs] device_map (#11711)
draft

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-06-20 10:14:48 -07:00
Steven Liu
5a6e386464 [docs] Quantization + torch.compile + offloading (#11703)
* draft

* feedback

* update

* feedback

* fix

* feedback

* feedback

* fix

* feedback
2025-06-20 10:11:39 -07:00
yiyixuxu
45392cce11 update the description of StableDiffusionXLDenoiseLoopWrapper 2025-06-20 07:46:54 +02:00
yiyixuxu
8913d59bf3 add to method to modular loader, copied from DiffusionPipeline, not tested yet 2025-06-20 07:46:53 +02:00
yiyixuxu
5a8c1b5f19 add block mappings to modular_diffusers.stable_diffusion_xl.__init__ 2025-06-20 07:46:53 +02:00
yiyixuxu
7ad01a6350 rename modular_pipeline_block_mappings.py to modular_block_mapping 2025-06-20 07:46:45 +02:00
Dhruv Nair
42077e6c73 Fix failing cpu offload test for LTX Latent Upscale (#11755)
update
2025-06-20 06:07:34 +02:00
Sayak Paul
3d8d8485fc fix invalid component handling behaviour in PipelineQuantizationConfig (#11750)
* start

* updates
2025-06-20 07:54:12 +05:30
YiYi Xu
a8e853b791 [modular diffusers] more refactor (#11235)
* add componentspec and configspec

* up

* up

* move methods to blocks

* Modular Diffusers Guiders (#11311)

* cfg; slg; pag; sdxl without controlnet

* support sdxl controlnet

* support controlnet union

* update

* update

* cfg zero*

* use unwrap_module for torch compiled modules

* remove guider kwargs

* remove commented code

* remove old guider

* fix slg bug

* remove debug print

* autoguidance

* smoothed energy guidance

* add note about seg

* tangential cfg

* cfg plus plus

* support cfgpp in ddim

* apply review suggestions

* refactor

* rename enable/disable

* remove cfg++ for now

* rename do_classifier_free_guidance->prepare_unconditional_embeds

* remove unused

* [modular diffusers] introducing ModularLoader (#11462)

* cfg; slg; pag; sdxl without controlnet

---------

Co-authored-by: Aryan <aryan@huggingface.co>

* make loader optional

* remove lora step and ip-adapter step -> no longer needed

* rename pipeline -> components, data -> block_state

* seperate controlnet step into input + denoise

* refactor controlnet union

* reefactor pipeline/block states so that it can dynamically accept kwargs

* remove controlnet union denoise step, refactor & reuse controlnet denoisee step to accept aditional contrlnet kwargs

* allow input_fields as input & update message

* update input formating, consider kwarggs_type inputs with no name, e/g *_controlnet_kwargs

* refactor the denoiseestep using LoopSequential! also add a new file for denoise step

* change warning to debug

* fix get_execusion blocks with loopsequential

* fix auto denoise so all tests pass

* update imports on guiders

* remove modular reelated change from pipelines folder

* made a modular_pipelines folder!

* update __init__

* add notes

* add block state will also make sure modifed intermediates_inputs will be updated

* move block mappings to its own file

* make inputs truly immutable, remove the output logic in sequential pipeline, and update so that intermediates_outputs are only new variables

* decode block, if skip decoding do not need to update latent

* fix imports

* fix import

* fix more

* remove the output step

* make generator intermediates (it is mutable)

* after_denoise -> decoders

* add a to-do for guider cconfig mixin

* refactor component spec: replace create/create_from_pretrained/create_from_config to just create and load method

* refactor modular loader: 1. load only load (pretrained components only if not specific names) 2. update acceept create spec 3. move the updte _componeent_spec logic outside register_components to each method that create/update the component: __init__/update/load

* update components manager

* up

* [WIP] Modular Diffusers support custom code/pipeline blocks (#11539)

* update

* update

* remove the duplicated components_manager file I forgot to deletee

* fix import in block mapping

* add a to-do for modular loader

* prepare_latents_img2img pipeline method -> function, maybe do the same for others?

* update input for loop blocks, do not need to include intermediate

* solve merge conflict: manually add back the remote code change to modular_pipeline

* add node_utils

* modular node!

* add

* refator based on dhruv's feedbacks

* update doc format for kwargs_type

* up

* updatee modular_pipeline.from_pretrained, modular_repo ->pretrained_model_name_or_path

* save_pretrained for serializing config. (#11603)

* save_pretrained for serializing config.

* remove pushtohub

* diffusers-cli rough

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-06-19 15:34:17 -10:00
Dhruv Nair
195926bbdc Update Chroma Docs (#11753)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-06-19 19:33:19 +02:00
Sayak Paul
85a916bb8b make group offloading work with disk/nvme transfers (#11682)
* start implementing disk offloading in group.

* delete diff file.

* updates.patch

* offload_to_disk_path

* check if safetensors already exist.

* add test and clarify.

* updates

* update todos.

* update more docs.

* update docs
2025-06-19 18:09:30 +05:30
Dhruv Nair
3287ce2890 Fix HiDream pipeline test module (#11754)
update
2025-06-19 17:06:14 +05:30
Dhruv Nair
0c11c8c1ac [CI] Fix SANA tests (#11756)
update
2025-06-19 17:06:02 +05:30
Dhruv Nair
fc51583c8a [CI] Fix WAN VACE tests (#11757)
update
2025-06-19 17:03:12 +05:30
Sayak Paul
fb57c76aa1 [LoRA] refactor lora loading at the model-level (#11719)
* factor out stuff from load_lora_adapter().

* simplifying text encoder lora loading.

* fix peft.py

* fix logging locations.

* formatting

* fix

* update

* update

* update
2025-06-19 13:06:25 +05:30
dependabot[bot]
7251bb4fd0 Bump urllib3 from 2.2.3 to 2.5.0 in /examples/server (#11748)
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.2.3 to 2.5.0.
- [Release notes](https://github.com/urllib3/urllib3/releases)
- [Changelog](https://github.com/urllib3/urllib3/blob/main/CHANGES.rst)
- [Commits](https://github.com/urllib3/urllib3/compare/2.2.3...2.5.0)

---
updated-dependencies:
- dependency-name: urllib3
  dependency-version: 2.5.0
  dependency-type: indirect
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-19 11:09:33 +05:30
Aryan
3fba74e153 Add missing HiDream license (#11747)
update
2025-06-19 08:07:47 +05:30
Aryan
a4df8dbc40 Update more licenses to 2025 (#11746)
update
2025-06-19 07:46:01 +05:30
Sayak Paul
48eae6f420 [Quantizers] add is_compileable property to quantizers. (#11736)
add is_compileable property to quantizers.
2025-06-19 07:45:06 +05:30
Dhruv Nair
66394bf6c7 Chroma Follow Up (#11725)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* updte

* update

* update

* update
2025-06-18 22:24:41 +05:30
Sayak Paul
62cce3045d [chore] change to 2025 licensing for remaining (#11741)
change to 2025 licensing for remaining
2025-06-18 20:56:00 +05:30
Sayak Paul
05e867784d [tests] device_map tests for all models. (#11708)
* device_map tests for all models.

* updates

* Update tests/models/test_modeling_common.py

Co-authored-by: Aryan <aryan@huggingface.co>

* fix device_map in test

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2025-06-18 10:52:06 +05:30
Leo Jiang
d72184eba3 [training] add ds support to lora hidream (#11737)
* [training] add ds support to lora hidream

* Apply style fixes

---------

Co-authored-by: J石页 <jiangshuo9@h-partners.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-18 09:26:02 +05:30
Saurabh Misra
5ce4814af1 ️ Speed up method AutoencoderKLWan.clear_cache by 886% (#11665)
* ️ Speed up method `AutoencoderKLWan.clear_cache` by 886%

**Key optimizations:**
- Compute the number of `WanCausalConv3d` modules in each model (`encoder`/`decoder`) **only once during initialization**, store in `self._cached_conv_counts`. This removes unnecessary repeated tree traversals at every `clear_cache` call, which was the main bottleneck (from profiling).
- The internal helper `_count_conv3d_fast` is optimized via a generator expression with `sum` for efficiency.

All comments from the original code are preserved, except for updated or removed local docstrings/comments relevant to changed lines.  
**Function signatures and outputs remain unchanged.**

* Apply style fixes

* Apply suggestions from code review

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Apply style fixes

---------

Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aseem Saxena <aseem.bits@gmail.com>
2025-06-18 08:46:03 +05:30
Linoy Tsaban
1bc6f3dc0f [LoRA training] update metadata use for lora alpha + README (#11723)
* lora alpha

* Apply style fixes

* Update examples/advanced_diffusion_training/README_flux.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* fix readme format

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-06-17 12:19:27 +03:00
Aryan
79bd7ecc78 Support more Wan loras (VACE) (#11726)
update
2025-06-17 10:39:18 +05:30
David Berenstein
9b834f8710 Add Pruna optimization framework documentation (#11688)
* Add Pruna optimization framework documentation

- Introduced a new section for Pruna in the table of contents.
- Added comprehensive documentation for Pruna, detailing its optimization techniques, installation instructions, and examples for optimizing and evaluating models

* Enhance Pruna documentation with image alt text and code block formatting

- Added alt text to images for better accessibility and context.
- Changed code block syntax from diff to python for improved clarity.

* Add installation section to Pruna documentation

- Introduced a new installation section in the Pruna documentation to guide users on how to install the framework.
- Enhanced the overall clarity and usability of the documentation for new users.

* Update pruna.md

* Update pruna.md

* Update Pruna documentation for model optimization and evaluation

- Changed section titles for consistency and clarity, from "Optimizing models" to "Optimize models" and "Evaluating and benchmarking optimized models" to "Evaluate and benchmark models".
- Enhanced descriptions to clarify the use of `diffusers` models and the evaluation process.
- Added a new example for evaluating standalone `diffusers` models.
- Updated references and links for better navigation within the documentation.

* Refactor Pruna documentation for clarity and consistency

- Removed outdated references to FLUX-juiced and streamlined the explanation of benchmarking.
- Enhanced the description of evaluating standalone `diffusers` models.
- Cleaned up code examples by removing unnecessary imports and comments for better readability.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Enhance Pruna documentation with new examples and clarifications

- Added an image to illustrate the optimization process.
- Updated the explanation for sharing and loading optimized models on the Hugging Face Hub.
- Clarified the evaluation process for optimized models using the EvaluationAgent.
- Improved descriptions for defining metrics and evaluating standalone diffusers models.

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-16 12:25:05 -07:00
Carl Thomé
81426b0f19 Fix misleading comment (#11722) 2025-06-16 08:47:00 -10:00
Sayak Paul
f0dba33d82 [training] show how metadata stuff should be incorporated in training scripts. (#11707)
* show how metadata stuff should be incorporated in training scripts.

* typing

* fix

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-06-16 16:42:34 +05:30
Sayak Paul
d1db4f853a [LoRA ]fix flux lora loader when return_metadata is true for non-diffusers (#11716)
* fix flux lora loader when return_metadata is true for non-diffusers

* remove annotation
2025-06-16 14:26:35 +05:30
Edna
8adc6003ba Chroma Pipeline (#11698)
* working state from hameerabbasi and iddl

* working state form hameerabbasi and iddl (transformer)

* working state (normalization)

* working state (embeddings)

* add chroma loader

* add chroma to mappings

* add chroma to transformer init

* take out variant stuff

* get decently far in changing variant stuff

* add chroma init

* make chroma output class

* add chroma transformer to dummy tp

* add chroma to init

* add chroma to init

* fix single file

* update

* update

* add chroma to auto pipeline

* add chroma to pipeline init

* change to chroma transformer

* take out variant from blocks

* swap embedder location

* remove prompt_2

* work on swapping text encoders

* remove mask function

* dont modify mask (for now)

* wrap attn mask

* no attn mask (can't get it to work)

* remove pooled prompt embeds

* change to my own unpooled embeddeer

* fix load

* take pooled projections out of transformer

* ensure correct dtype for chroma embeddings

* update

* use dn6 attn mask + fix true_cfg_scale

* use chroma pipeline output

* use DN6 embeddings

* remove guidance

* remove guidance embed (pipeline)

* remove guidance from embeddings

* don't return length

* dont change dtype

* remove unused stuff, fix up docs

* add chroma autodoc

* add .md (oops)

* initial chroma docs

* undo don't change dtype

* undo arxiv change

unsure why that happened

* fix hf papers regression in more places

* Update docs/source/en/api/pipelines/chroma.md

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* do_cfg -> self.do_classifier_free_guidance

* Update docs/source/en/api/models/chroma_transformer.md

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update chroma.md

* Move chroma layers into transformer

* Remove pruned AdaLayerNorms

* Add chroma fast tests

* (untested) batch cond and uncond

* Add # Copied from for shift

* Update # Copied from statements

* update norm imports

* Revert cond + uncond batching

* Add transformer tests

* move chroma test (oops)

* chroma init

* fix chroma pipeline fast tests

* Update src/diffusers/models/transformers/transformer_chroma.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Move Approximator and Embeddings

* Fix auto pipeline + make style, quality

* make style

* Apply style fixes

* switch to new input ids

* fix # Copied from error

* remove # Copied from on protected members

* try to fix import

* fix import

* make fix-copes

* revert style fix

* update chroma transformer params

* update chroma transformer approximator init params

* update to pad tokens

* fix batch inference

* Make more pipeline tests work

* Make most transformer tests work

* fix docs

* make style, make quality

* skip batch tests

* fix test skipping

* fix test skipping again

* fix for tests

* Fix all pipeline test

* update

* push local changes, fix docs

* add encoder test, remove pooled dim

* default proj dim

* fix tests

* fix equal size list input

* update

* push local changes, fix docs

* add encoder test, remove pooled dim

* default proj dim

* fix tests

* fix equal size list input

* Revert "fix equal size list input"

This reverts commit 3fe4ad67d5.

* update

* update

* update

* update

* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-06-14 06:52:56 +05:30
Aryan
9f91305f85 Cosmos Predict2 (#11695)
* support text-to-image

* update example

* make fix-copies

* support use_flow_sigmas in EDM scheduler instead of maintain cosmos-specific scheduler

* support video-to-world

* update

* rename text2image pipeline

* make fix-copies

* add t2i test

* add test for v2w pipeline

* support edm dpmsolver multistep

* update

* update

* update

* update tests

* fix tests

* safety checker

* make conversion script work without guardrail
2025-06-14 01:51:29 +05:30
Sayak Paul
368958df6f [LoRA] parse metadata from LoRA and save metadata (#11324)
* feat: parse metadata from lora state dicts.

* tests

* fix tests

* key renaming

* fix

* smol update

* smol updates

* load metadata.

* automatically save metadata in save_lora_adapter.

* propagate changes.

* changes

* add test to models too.

* tigher tests.

* updates

* fixes

* rename tests.

* sorted.

* Update src/diffusers/loaders/lora_base.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* review suggestions.

* removeprefix.

* propagate changes.

* fix-copies

* sd

* docs.

* fixes

* get review ready.

* one more test to catch error.

* change to a different approach.

* fix-copies.

* todo

* sd3

* update

* revert changes in get_peft_kwargs.

* update

* fixes

* fixes

* simplify _load_sft_state_dict_metadata

* update

* style fix

* uipdate

* update

* update

* empty commit

* _pack_dict_with_prefix

* update

* TODO 1.

* todo: 2.

* todo: 3.

* update

* update

* Apply suggestions from code review

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* reraise.

* move argument.

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-06-13 14:37:49 +05:30
Aryan
e52ceae375 Support Wan AccVideo lora (#11704)
* update

* make style

* Update src/diffusers/loaders/lora_conversion_utils.py

* add note explaining threshold
2025-06-13 11:55:08 +05:30
Sayak Paul
62cbde8d41 [docs] mention fp8 benefits on supported hardware. (#11699)
* mention fp8 benefits on supported hardware.

* Update docs/source/en/quantization/torchao.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-13 07:17:03 +05:30
Sayak Paul
648e8955cf swap out token for style bot. (#11701) 2025-06-13 06:51:19 +05:30
Sayak Paul
00b179fb1a [docs] add compilation bits to the bitsandbytes docs. (#11693)
* add compilation bits to the bitsandbytes docs.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* finish

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-12 08:49:24 +05:30
Tolga Cangöz
47ef79464f Apply Occam's Razor in position embedding calculation (#11562)
* fix: remove redundant indexing

* style
2025-06-11 13:47:37 -10:00
Joel Schlosser
b272807bc8 Avoid DtoH sync from access of nonzero() item in scheduler (#11696) 2025-06-11 12:03:40 -10:00
rasmi
447ccd0679 Set _torch_version to N/A if torch is disabled. (#11645) 2025-06-11 11:59:54 -10:00
Aryan
f3e09114f2 Improve Wan docstrings (#11689)
* improve docstrings for wan

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* make style

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-06-12 01:18:40 +05:30
Sayak Paul
91545666e0 [tests] model-level device_map clarifications (#11681)
* add clarity in documentation for device_map

* docs

* fix how compiler tester mixins are used.

* propagate

* more

* typo.

* fix tests

* fix order of decroators.

* clarify more.

* more test cases.

* fix doc

* fix device_map docstring in pipeline_utils.

* more examples

* more

* update

* remove code for stuff that is already supported.

* fix stuff.
2025-06-11 22:41:59 +05:30
Sayak Paul
b6f7933044 [tests] tests for compilation + quantization (bnb) (#11672)
* start adding compilation tests for quantization.

* fixes

* make common utility.

* modularize.

* add group offloading+compile

* xfail

* update

* Update tests/quantization/test_torch_compile_utils.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fixes

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-06-11 21:14:24 +05:30
Yao Matrix
33e636cea5 enable torchao test cases on XPU and switch to device agnostic APIs for test cases (#11654)
* enable torchao cases on XPU

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* device agnostic APIs

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* more

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enable test_torch_compile_recompilation_and_graph_break on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* resolve comments

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Matrix YAO <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-06-11 15:17:06 +05:30
Tolga Cangöz
e27142ac64 [Wan] Fix VAE sampling mode in WanVideoToVideoPipeline (#11639)
* fix: vae sampling mode

* fix a typo
2025-06-11 14:19:23 +05:30
Sayak Paul
8e88495da2 [LoRA] support Flux Control LoRA with bnb 8bit. (#11655)
support Flux Control LoRA with bnb 8bit.
2025-06-11 08:32:47 +05:30
Akash Haridas
b79803fe08 Allow remote code repo names to contain "." (#11652)
* allow loading from repo with dot in name

* put new arg at the end to avoid breaking compatibility

* add test for loading repo with dot in name

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-06-10 13:38:54 -10:00
Meatfucker
b0f7036d9a Update pipeline_flux_inpaint.py to fix padding_mask_crop returning only the inpainted area (#11658)
* Update pipeline_flux_inpaint.py to fix padding_mask_crop returning only the inpainted area and not the entire image.

* Apply style fixes

* Update src/diffusers/pipelines/flux/pipeline_flux_inpaint.py
2025-06-10 13:07:22 -04:00
Philip Brown
6c7fad7ec8 Add community class StableDiffusionXL_T5Pipeline (#11626)
* Add community class StableDiffusionXL_T5Pipeline
Will be used with base model opendiffusionai/stablediffusionxl_t5

* Changed pooled_embeds to use projection instead of slice

* "make style" tweaks

* Added comments to top of code

* Apply style fixes
2025-06-09 15:57:51 -04:00
Dhruv Nair
5b0dab1253 Introduce DeprecatedPipelineMixin to simplify pipeline deprecation process (#11596)
* update

* update

* update

* update

* update

* update

* update
2025-06-09 13:03:40 +05:30
Sayak Paul
7c6e9ef425 [tests] Fix how compiler mixin classes are used (#11680)
* fix how compiler tester mixins are used.

* propagate

* more
2025-06-09 09:24:45 +05:30
Valeriy Sofin
f46abfe4ce fixed axes_dims_rope init (huggingface#11641) (#11678) 2025-06-09 01:16:30 +05:30
Aryan
73a9d5856f Wan VACE (#11582)
* initial support

* make fix-copies

* fix no split modules

* add conversion script

* refactor

* add pipeline test

* refactor

* fix bug with mask

* fix for reference images

* remove print

* update docs

* update slices

* update

* update

* update example
2025-06-06 17:53:10 +05:30
Sayak Paul
16c955c5fd [tests] add test for torch.compile + group offloading (#11670)
* add a test for group offloading + compilation.

* tests
2025-06-06 11:34:44 +05:30
jiqing-feng
0f91f2f6fc use deterministic to get stable result (#11663)
* use deterministic to get stable result

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* add deterministic for int8 test

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-06-06 09:14:00 +05:30
Markus Pobitzer
745199a869 [examples] flux-control: use num_training_steps_for_scheduler (#11662)
[examples] flux-control: use num_training_steps_for_scheduler in get_scheduler instead of args.max_train_steps * accelerator.num_processes

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-06-05 14:56:25 +05:30
Sayak Paul
0142f6f35a [chore] bring PipelineQuantizationConfig at the top of the import chain. (#11656)
bring PipelineQuantizationConfig at the top of the import chain.
2025-06-05 14:17:03 +05:30
Dhruv Nair
d04cd95012 [CI] Some improvements to Nightly reports summaries (#11166)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* updatee

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update
2025-06-05 13:55:01 +05:30
Steven Liu
c934720629 [docs] Model cards (#11112)
* initial

* update

* hunyuanvideo

* ltx

* fix

* wan

* gen guide

* feedback

* feedback

* pipeline-level quant config

* feedback

* ltx
2025-06-02 16:55:14 -07:00
Steven Liu
9f48394bf7 [docs] Caching methods (#11625)
* cache

* feedback
2025-06-02 10:58:47 -07:00
Sayak Paul
20273e5503 [tests] chore: rename lora model-level tests. (#11481)
chore: rename lora model-level tests.
2025-06-02 09:21:40 -07:00
Sayak Paul
d4dc4d7654 [chore] misc changes in the bnb tests for consistency. (#11355)
misc changes in the bnb tests for consistency.
2025-06-02 08:41:10 -07:00
Roy Hvaara
3a31b291f1 Use float32 RoPE freqs in Wan with MPS backends (#11643)
Use float32 for RoPE on MPS in Wan
2025-06-02 09:30:09 +05:30
Sayak Paul
b975bceff3 [docs] update torchao doc link (#11634)
update torchao doc link
2025-05-30 08:30:36 -07:00
co63oc
8183d0f16e Fix typos in strings and comments (#11476)
* Fix typos in strings and comments

Signed-off-by: co63oc <co63oc@users.noreply.github.com>

* Update src/diffusers/hooks/hooks.py

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update src/diffusers/hooks/hooks.py

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Update layerwise_casting.py

* Apply style fixes

* update

---------

Signed-off-by: co63oc <co63oc@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-30 18:49:00 +05:30
Yaniv Galron
6508da6f06 typo fix in pipeline_flux.py (#11623) 2025-05-30 11:32:13 +05:30
VLT Media
d0ec6601df Bug: Fixed Image 2 Image example (#11619)
Bug: Fixed Image 2 Image example where a PIL.Image was improperly being asked for an item via index.
2025-05-30 11:30:52 +05:30
Yao Matrix
a7aa8bf28a enable group_offloading and PipelineDeviceAndDtypeStabilityTests on XPU, all passed (#11620)
* enable group_offloading and PipelineDeviceAndDtypeStabilityTests on XPU,
all passed

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix style

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

* fix

Signed-off-by: Matrix YAO <matrix.yao@intel.com>

---------

Signed-off-by: Matrix YAO <matrix.yao@intel.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-05-30 11:30:37 +05:30
Yaniv Galron
3651bdb766 removing unnecessary else statement (#11624)
Co-authored-by: Aryan <aryan@huggingface.co>
2025-05-30 11:29:24 +05:30
Justin Ruan
df55f05358 Fix wrong indent for examples of controlnet script (#11632)
fix wrong indent for training controlnet
2025-05-29 15:26:39 -07:00
Yuanzhou Cai
89ddb6c0a4 [textual_inversion_sdxl.py] fix lr scheduler steps count (#11557)
fix lr scheduler steps count

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-05-29 15:25:45 +03:00
Steven Liu
be2fb77dc1 [docs] PyTorch 2.0 (#11618)
* combine

* Update docs/source/en/optimization/fp16.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-28 09:42:41 -07:00
Sayak Paul
54cddc1e12 [CI] fix the filename for displaying failures in lora ci. (#11600)
fix the filename for displaying failures in lora ci.
2025-05-27 22:27:27 -07:00
Linoy Tsaban
28ef0165b9 [Sana Sprint] add image-to-image pipeline (#11602)
* sana sprint img2img

* fix import

* fix name

* fix image encoding

* fix image encoding

* fix image encoding

* fix image encoding

* fix image encoding

* fix image encoding

* try w/o strength

* try scaling differently

* try with strength

* revert unnecessary changes to scheduler

* revert unnecessary changes to scheduler

* Apply style fixes

* remove comment

* add copy statements

* add copy statements

* add to doc

* add to doc

* add to doc

* add to doc

* Apply style fixes

* empty commit

* fix copies

* fix copies

* fix copies

* fix copies

* fix copies

* docs

* make fix-copies.

* fix doc building error.

* initial commit - add img2img test

* initial commit - add img2img test

* fix import

* fix imports

* Apply style fixes

* empty commit

* remove

* empty commit

* test vocab size

* fix

* fix prompt missing from last commits

* small changes

* fix image processing when input is tensor

* fix order

* Apply style fixes

* empty commit

* fix shape

* remove comment

* image processing

* remove comment

* skip vae tiling test for now

* Apply style fixes

* empty commit

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2025-05-27 22:09:51 +03:00
Sayak Paul
a4da216125 [LoRA] improve LoRA fusion tests (#11274)
* improve lora fusion tests

* more improvements.

* remove comment

* update

* relax tolerance.

* num_fused_loras as a property

Co-authored-by: BenjaminBossan <benjamin.bossan@gmail.com>

* updates

* update

* fix

* fix

Co-authored-by: BenjaminBossan <benjamin.bossan@gmail.com>

* Update src/diffusers/loaders/lora_base.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

---------

Co-authored-by: BenjaminBossan <benjamin.bossan@gmail.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2025-05-27 09:02:12 -07:00
Leo Jiang
5939ace91b Adding NPU for get device function (#11617)
* Adding device choice for npu

* Adding device choice for npu

---------

Co-authored-by: J石页 <jiangshuo9@h-partners.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-27 06:59:15 -07:00
Linoy Tsaban
cc59505e26 [training docs] smol update to README files (#11616)
add comment to install prodigy
2025-05-27 06:26:54 -07:00
Sayak Paul
5f5d02fbf1 Make group offloading compatible with torch.compile() (#11605)
wip: check if we can make go compile compat
2025-05-26 20:15:29 -07:00
Sayak Paul
53748217e6 fix security issue in build docker ci (#11614)
* fix security issue in build docker ci

* better

* update
2025-05-26 22:43:01 +05:30
Dhruv Nair
826f43505d Fix mixed variant downloading (#11611)
* update

* update
2025-05-26 21:43:48 +05:30
Sayak Paul
4af76d0d7d [tests] Changes to the torch.compile() CI and tests (#11508)
* remove compile cuda docker.

* replace compile cuda docker path.

* better manage compilation cache.

* propagate similar to the pipeline tests.

* remove unneeded compile test.

* small.

* don't check for deleted files.
2025-05-26 08:31:04 -07:00
kaixuanliu
b5c2050a16 Fix bug when variant and safetensor file does not match (#11587)
* Apply style fixes

* init test

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* adjust

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* add the variant check when there are no component folders

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update related test cases

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* update related unit test cases

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* adjust

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>

* Apply style fixes

---------

Signed-off-by: Liu, Kaixuan <kaixuan.liu@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-26 14:18:41 +05:30
Steven Liu
7ae546f8d1 [docs] Pipeline-level quantization (#11604)
refactor
2025-05-26 14:12:57 +05:30
Ishan Modi
f64fa9492d [Feature] AutoModel can load components using model_index.json (#11401)
* update

* update

* update

* update

* addressed PR comments

* update

* addressed PR comments

* added tests

* addressed PR comments

* updates

* update

* addressed PR comments

* update

* fix style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-05-26 14:06:36 +05:30
Yao Matrix
049082e013 enable pipeline test cases on xpu (#11527)
* enable several pipeline integration tests on xpu

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* update per comments

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-05-26 12:49:58 +05:30
regisss
f161e277d0 Update Intel Gaudi doc (#11479)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-23 10:41:49 +02:00
Sayak Paul
a5f4cc7f84 [LoRA] minor fix for load_lora_weights() for Flux and a test (#11595)
* fix peft delete adapters for flux.

* add test

* empty commit
2025-05-22 15:44:45 +05:30
Dhruv Nair
c36f8487df Type annotation fix (#11597)
* update

* update
2025-05-21 11:50:33 -10:00
Sayak Paul
54af3ca7fd [chore] allow string device to be passed to randn_tensor. (#11559)
allow string device to be passed to randn_tensor.
2025-05-21 14:55:54 +05:30
Sai Shreyas Bhavanasi
ba8dc7dc49 RegionalPrompting: Inherit from Stable Diffusion (#11525)
* Refactoring Regional Prompting pipeline to use Diffusion Pipeline instead of Stable Diffusion Pipeline

* Apply style fixes
2025-05-20 15:03:16 -04:00
Steven Liu
23a4ff8488 [docs] Remove fast diffusion tutorial (#11583)
remove tutorial

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-20 08:56:12 -07:00
osrm
8705af0914 docs: fix invalid links (#11505)
* fix invalid link lora.md

* fix invalid link controlnet_sdxl.md

The Hugging Face models page now uses the tags parameter instead of the other parameter for tag-based filtering. Therefore, to simultaneously apply both the "Stable Diffusion XL" and "ControlNet" tags, the following URL should be used: https://huggingface.co/models?tags=stable-diffusion-xl,controlnet

* fix invalid link cosine_dpm.md

"https://github.com/Stability-AI/stable-audio-tool" -> "https://github.com/Stability-AI/stable-audio-tools"

* Update controlnet_sdxl.md

* Update cosine_dpm.md

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-05-20 08:55:41 -07:00
Linoy Tsaban
5d4f723b57 [LoRA] kijai wan lora support for I2V (#11588)
* testing

* testing

* testing

* testing

* testing

* i2v

* i2v

* device fix

* testing

* fix

* fix

* fix

* fix

* fix

* Apply style fixes

* empty commit

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-20 17:11:55 +03:00
Aryan
05c8b42b75 LTX 0.9.7-distilled; documentation improvements (#11571)
* add guidance rescale

* update docs

* support adaptive instance norm filter

* fix custom timesteps support

* add custom timestep example to docs

* add a note about best generation settings being available only in the original repository

* use original org hub ids instead of personal

* make fix-copies

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-05-20 02:29:16 +05:30
Quentin Gallouédec
c8bb1ff53e Use HF Papers (#11567)
* Use HF Papers

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-19 06:22:33 -10:00
Dhruv Nair
799adf4a10 [Single File] Fix loading for LTX 0.9.7 transformer (#11578)
update
2025-05-19 06:22:13 -10:00
Sayak Paul
00f9273da2 [WIP][LoRA] start supporting kijai wan lora. (#11579)
* start supporting kijai wan lora.

* diff_b keys.

* Apply suggestions from code review

Co-authored-by: Aryan <aryan@huggingface.co>

* merge ready

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2025-05-19 20:47:44 +05:30
Linoy Tsaban
ceb7af277c [LoRA] support non-diffusers LTX-Video loras (#11572)
* support non diffusers loras for ltxv

* Update src/diffusers/loaders/lora_conversion_utils.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/loaders/lora_pipeline.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Apply style fixes

* empty commit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-19 12:59:55 +03:00
Sayak Paul
6918f6d19a [docs] tip for group offloding + quantization (#11576)
* tip for group offloding + quantization

Co-authored-by: Aryan VS <contact.aryanvs@gmail.com>

* Apply suggestions from code review

Co-authored-by: Aryan <aryan@huggingface.co>

---------

Co-authored-by: Aryan VS <contact.aryanvs@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-05-19 14:49:15 +05:30
apolinário
915c537891 Revert error to warning when loading LoRA from repo with multiple weights (#11568) 2025-05-19 13:33:43 +05:30
space_samurai
8270fa58e4 Doc update (#11531)
Update docs/source/en/using-diffusers/inpaint.md
2025-05-19 13:32:08 +05:30
Yao Matrix
1a10fa0c82 enhance value guard of _device_agnostic_dispatch (#11553)
enhance value guard

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-05-19 06:05:32 +05:30
Sayak Paul
9836f0e000 [docs] Regional compilation docs (#11556)
* add regional compilation docs.

* minor.

* reviwer feedback.

* Update docs/source/en/optimization/torch2.0.md

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>

---------

Co-authored-by: Ilyas Moutawwakil <57442720+IlyasMoutawwakil@users.noreply.github.com>
2025-05-15 19:11:24 +05:30
Sayak Paul
20379d9d13 [tests] add tests for combining layerwise upcasting and groupoffloading. (#11558)
* add tests for combining layerwise upcasting and groupoffloading.

* feedback
2025-05-15 17:16:44 +05:30
Animesh Jain
3a6caba8e4 [gguf] Refactor __torch_function__ to avoid unnecessary computation (#11551)
* [gguf] Refactor __torch_function__ to avoid unnecessary computation

This helps with torch.compile compilation lantency. Avoiding unnecessary
computation should also lead to a slightly improved eager latency.

* Apply style fixes

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-15 14:38:18 +05:30
Dhruv Nair
4267d8f4eb [Single File] GGUF/Single File Support for HiDream (#11550)
* update

* update

* update

* update

* update

* update

* update
2025-05-15 12:25:18 +05:30
Seokhyeon Jeong
f4fa3beee7 [tests] Add torch.compile test for UNet2DConditionModel (#11537)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-14 11:26:12 +05:30
Anwesha Chowdhury
7e3353196c Fix deprecation warnings in test_ltx_image2video.py (#11538)
Fixed 2 warnings that were raised during running LTXImageToVideoPipelineFastTests

Co-authored-by: achowdhury1211@gmail.com <anwesha@LAPTOP-E5QGFMOQ>
2025-05-13 08:05:52 -10:00
Meatfucker
8c249d1401 Update pipeline_flux_img2img.py to add missing vae_slicing and vae_tiling calls. (#11545)
* Update pipeline_flux_img2img.py

Adds missing vae_slicing and vae_tiling calls to FluxImage2ImagePipeline

* Update src/diffusers/pipelines/flux/pipeline_flux_img2img.py

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>

* Update src/diffusers/pipelines/flux/pipeline_flux_img2img.py

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>

* Update src/diffusers/pipelines/flux/pipeline_flux_img2img.py

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>

* Update src/diffusers/pipelines/flux/pipeline_flux_img2img.py

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>

---------

Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-05-13 12:31:45 -04:00
Sayak Paul
b555a03723 [tests] Enable testing for HiDream transformer (#11478)
* add tests for hidream transformer model.

* fix

* Update tests/models/transformers/test_models_transformer_hidream.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-05-13 21:01:20 +05:30
Aryan
06fee551e9 LTX Video 0.9.7 (#11516)
* add upsampling pipeline

* ltx upsample pipeline conversion; pipeline fixes

* make fix-copies

* remove print

* add vae convenience methods

* update

* add tests

* support denoising strength for upscaling & video-to-video

* update docs

* update doc checkpoints

* update docs

* fix

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-05-13 14:57:03 +05:30
Linoy Tsaban
8b99f7e157 [LoRA] small change to support Hunyuan LoRA Loading for FramePack (#11546)
init
2025-05-13 10:15:06 +03:00
Kenneth Gerald Hamilton
07dd6f8c0e [train_dreambooth.py] Fix the LR Schedulers when num_train_epochs is passed in a distributed training env (#11239)
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-13 07:34:01 +05:30
johannaSommer
f8d4a1e283 fix: remove torch_dtype="auto" option from docstrings (#11513)
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-05-12 15:42:35 -10:00
Abdellah Oumida
ddd0cfb497 Fix typo in train_diffusion_orpo_sdxl_lora_wds.py (#11541) 2025-05-12 15:28:29 -10:00
Zhong-Yu Li
4f438de35a Add VisualCloze (#11377)
* VisualCloze

* style quality

* add docs

* add docs

* typo

* Update docs/source/en/api/pipelines/visualcloze.md

* delete einops

* style quality

* Update src/diffusers/pipelines/visualcloze/pipeline_visualcloze.py

* reorg

* refine doc

* style quality

* typo

* typo

* Update src/diffusers/image_processor.py

* add comment

* test

* style

* Modified based on review

* style

* restore image_processor

* update example url

* style

* fix-copies

* VisualClozeGenerationPipeline

* combine

* tests docs

* remove VisualClozeUpsamplingPipeline

* style

* quality

* test examples

* quality style

* typo

* make fix-copies

* fix test_callback_cfg and test_save_load_dduf in VisualClozePipelineFastTests

* add EXAMPLE_DOC_STRING to VisualClozeGenerationPipeline

* delete maybe_free_model_hooks from pipeline_visualcloze_combined

* Apply suggestions from code review

* fix test_save_load_local test; add reason for skipping cfg test

* more save_load test fixes

* fix tests in generation pipeline tests
2025-05-13 02:46:51 +05:30
Evan Han
98cc6d05e4 [test_models_transformer_ltx.py] help us test torch.compile() for impactful models (#11512)
* Update test_models_transformer_ltx.py

* Update test_models_transformer_ltx.py

* Update test_models_transformer_ltx.py

* Update test_models_transformer_ltx.py

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-12 19:26:15 +05:30
Yao Matrix
c3726153fd enable several pipeline integration tests on XPU (#11526)
* enable kandinsky2_2 integration test cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* enable latent_diffusion, dance_diffusion, musicldm, shap_e integration
uts on xpu

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-05-12 16:51:37 +05:30
Aryan
e48f6aeeb4 Hunyuan Video Framepack F1 (#11534)
* support framepack f1

* update docs

* update toctree

* remove typo
2025-05-12 16:11:10 +05:30
Sayak Paul
01abfc8736 [tests] add tests for framepack transformer model. (#11520)
* start.

* add tests for framepack transformer model.

* merge conflicts.

* make to square.

* fixes
2025-05-11 09:50:06 +05:30
Aryan
92fe689f06 Change Framepack transformer layer initialization order (#11535)
update
2025-05-09 23:09:50 +05:30
Yao Matrix
0ba1f76d4d enable print_env on xpu (#11507)
* detect xpu in print_env

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* enhance code, test passed on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-09 16:30:51 +05:30
Yao Matrix
d6bf268a4a enable dit integration cases on xpu (#11523)
* enable dit integration test on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-09 16:06:50 +05:30
James Xu
3c0a0129fe [LTXPipeline] Update latents dtype to match VAE dtype (#11533)
fix: update latents dtype to match vae
2025-05-09 16:05:21 +05:30
Yao Matrix
2d380895e5 enable 7 cases on XPU (#11503)
* enable 7 cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* calibrate A100 expectations

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-05-09 15:52:08 +05:30
Sayak Paul
0c47c954f3 [LoRA] support non-diffusers hidream loras (#11532)
* support non-diffusers hidream loras

* make fix-copies
2025-05-09 14:42:39 +05:30
Sayak Paul
7acf8345f6 [Tests] Enable more general testing for torch.compile() with LoRA hotswapping (#11322)
* refactor hotswap tester.

* fix seeds..

* add to nightly ci.

* move comment.

* move to nightly
2025-05-09 11:29:06 +05:30
Sayak Paul
599c887164 feat: pipeline-level quantization config (#11130)
* feat: pipeline-level quant config.

Co-authored-by: SunMarc <marc.sun@hotmail.fr>

condition better.

support mapping.

improvements.

[Quantization] Add Quanto backend (#10756)

* update

* updaet

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/quantization/quanto.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update src/diffusers/quantizers/quanto/utils.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

[Single File] Add single file loading for SANA Transformer (#10947)

* added support for from_single_file

* added diffusers mapping script

* added testcase

* bug fix

* updated tests

* corrected code quality

* corrected code quality

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

[LoRA] Improve warning messages when LoRA loading becomes a no-op (#10187)

* updates

* updates

* updates

* updates

* notebooks revert

* fix-copies.

* seeing

* fix

* revert

* fixes

* fixes

* fixes

* remove print

* fix

* conflicts ii.

* updates

* fixes

* better filtering of prefix.

---------

Co-authored-by: hlky <hlky@hlky.ac>

[LoRA] CogView4 (#10981)

* update

* make fix-copies

* update

[Tests] improve quantization tests by additionally measuring the inference memory savings (#11021)

* memory usage tests

* fixes

* gguf

[`Research Project`] Add AnyText: Multilingual Visual Text Generation And Editing (#8998)

* Add initial template

* Second template

* feat: Add TextEmbeddingModule to AnyTextPipeline

* feat: Add AuxiliaryLatentModule template to AnyTextPipeline

* Add bert tokenizer from the anytext repo for now

* feat: Update AnyTextPipeline's modify_prompt method

This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe.

* Fill in the `forward` pass of `AuxiliaryLatentModule`

* `make style && make quality`

* `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library`

* Update error handling to raise and logging

* Add `create_glyph_lines` function into `TextEmbeddingModule`

* make style

* Up

* Up

* Up

* Up

* Remove several comments

* refactor: Remove ControlNetConditioningEmbedding and update code accordingly

* Up

* Up

* up

* refactor: Update AnyTextPipeline to include new optional parameters

* up

* feat: Add OCR model and its components

* chore: Update `TextEmbeddingModule` to include OCR model components and dependencies

* chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task

* `make style`

* refactor: Update `AnyTextPipeline`'s docstring

* Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once

* simplify

* `make style`

* Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function

* Simplify for now

* `make style`

* Up

* feat: Add scripts to convert AnyText controlnet to diffusers

* `make style`

* Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule`

* make style

* Up

* Simplify

* Up

* feat: Add safetensors module for loading model file

* Fix device issues

* Up

* Up

* refactor: Simplify

* refactor: Simplify code for loading models and handling data types

* `make style`

* refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule

* refactor: Update dtype in embedding_manager.py to match proj.weight

* Up

* Add attribution and adaptation information to pipeline_anytext.py

* Update usage example

* Will refactor `controlnet_cond_embedding` initialization

* Add `AnyTextControlNetConditioningEmbedding` template

* Refactor organization

* style

* style

* Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding`

* Follow one-file policy

* style

* [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel

* [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py

* [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py

* Refactor AnyTextControlNet to use configurable conditioning embedding channels

* Complete control net conditioning embedding in AnyTextControlNetModel

* up

* [FIX] Ensure embeddings use correct device in AnyTextControlNetModel

* up

* up

* style

* [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline

* [UPDATE] Update example code in anytext.py to use correct font file and improve clarity

* down

* [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing

* update pillow

* [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity

* [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file

* [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency

* 🆙

* style

* [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py

* style

* Update examples/research_projects/anytext/README.md

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Remove commented-out image preparation code in AnyTextPipeline

* Remove unnecessary blank line in README.md

[Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6  (#11018)

* update

* update

* update

* update

* update

* update

* update

* update

* update

fix: mixture tiling sdxl pipeline - adjust gerating time_ids & embeddings  (#11012)

small fix on generating time_ids & embeddings

[LoRA] support wan i2v loras from the world. (#11025)

* support wan i2v loras from the world.

* remove copied from.

* upates

* add lora.

Fix SD3 IPAdapter feature extractor (#11027)

chore: fix help messages in advanced diffusion examples (#10923)

Fix missing **kwargs in lora_pipeline.py (#11011)

* Update lora_pipeline.py

* Apply style fixes

* fix-copies

---------

Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

Fix for multi-GPU WAN inference (#10997)

Ensure that hidden_state and shift/scale are on the same device when running with multiple GPUs

Co-authored-by: Jimmy <39@🇺🇸.com>

[Refactor] Clean up import utils boilerplate (#11026)

* update

* update

* update

Use `output_size` in `repeat_interleave` (#11030)

[hybrid inference 🍯🐝] Add VAE encode (#11017)

* [hybrid inference 🍯🐝] Add VAE encode

* _toctree: add vae encode

* Add endpoints, tests

* vae_encode docs

* vae encode benchmarks

* api reference

* changelog

* Update docs/source/en/hybrid_inference/overview.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Wan Pipeline scaling fix, type hint warning, multi generator fix (#11007)

* Wan Pipeline scaling fix, type hint warning, multi generator fix

* Apply suggestions from code review

[LoRA] change to warning from info when notifying the users about a LoRA no-op (#11044)

* move to warning.

* test related changes.

Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline (#10827)

* Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>

making ```formatted_images``` initialization compact (#10801)

compact writing

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

Fix aclnnRepeatInterleaveIntWithDim error on NPU for get_1d_rotary_pos_embed (#10820)

* get_1d_rotary_pos_embed support npu

* Update src/diffusers/models/embeddings.py

---------

Co-authored-by: Kai zheng <kaizheng@KaideMacBook-Pro.local>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: YiYi Xu <yixu310@gmail.com>

[Tests] restrict memory tests for quanto for certain schemes. (#11052)

* restrict memory tests for quanto for certain schemes.

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fixes

* style

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

[LoRA] feat: support non-diffusers wan t2v loras. (#11059)

feat: support non-diffusers wan t2v loras.

[examples/controlnet/train_controlnet_sd3.py] Fixes #11050 - Cast prompt_embeds and pooled_prompt_embeds to weight_dtype to prevent dtype mismatch (#11051)

Fix: dtype mismatch of prompt embeddings in sd3 controlnet training

Co-authored-by: Andreas Jörg <andreasjoerg@MacBook-Pro-von-Andreas-2.fritz.box>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

reverts accidental change that removes attn_mask in attn. Improves fl… (#11065)

reverts accidental change that removes attn_mask in attn. Improves flux ptxla by using flash block sizes. Moves encoding outside the for loop.

Co-authored-by: Juan Acevedo <jfacevedo@google.com>

Fix deterministic issue when getting pipeline dtype and device (#10696)

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

[Tests] add requires peft decorator. (#11037)

* add requires peft decorator.

* install peft conditionally.

* conditional deps.

Co-authored-by: DN6 <dhruv.nair@gmail.com>

---------

Co-authored-by: DN6 <dhruv.nair@gmail.com>

CogView4 Control Block (#10809)

* cogview4 control training

---------

Co-authored-by: OleehyO <leehy0357@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>

[CI] pin transformers version for benchmarking. (#11067)

pin transformers version for benchmarking.

updates

Fix Wan I2V Quality (#11087)

* fix_wan_i2v_quality

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update pipeline_wan_i2v.py

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>

LTX 0.9.5 (#10968)

* update

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>

make PR GPU tests conditioned on styling. (#11099)

Group offloading improvements (#11094)

update

Fix pipeline_flux_controlnet.py (#11095)

* Fix pipeline_flux_controlnet.py

* Fix style

update readme instructions. (#11096)

Co-authored-by: Juan Acevedo <jfacevedo@google.com>

Resolve stride mismatch in UNet's ResNet to support Torch DDP (#11098)

Modify UNet's ResNet implementation to resolve stride mismatch in Torch's DDP

Fix Group offloading behaviour when using streams (#11097)

* update

* update

Quality options in `export_to_video` (#11090)

* Quality options in `export_to_video`

* make style

improve more.

add placeholders for docstrings.

formatting.

smol fix.

solidify validation and annotation

* Revert "feat: pipeline-level quant config."

This reverts commit 316ff46b76.

* feat: implement pipeline-level quantization config

Co-authored-by: SunMarc <marc@huggingface.co>

* update

* fixes

* fix validation.

* add tests and other improvements.

* add tests

* import quality

* remove prints.

* add docs.

* fixes to docs.

* doc fixes.

* doc fixes.

* add validation to the input quantization_config.

* clarify recommendations.

* docs

* add to ci.

* todo.

---------

Co-authored-by: SunMarc <marc@huggingface.co>
2025-05-09 10:04:44 +05:30
Sayak Paul
393aefcdc7 [tests] fix audioldm2 for transformers main. (#11522)
fix audioldm2 for transformers main.
2025-05-08 21:13:42 +05:30
Aryan
6674a5157f Conditionally import torchvision in Cosmos transformer (#11524)
fix
2025-05-08 19:37:47 +05:30
scxue
784db0eaab Add cross attention type for Sana-Sprint training in diffusers. (#11514)
* test permission

* Add cross attention type for Sana-Sprint.

* Add Sana-Sprint training script in diffusers.

* make style && make quality;

* modify the attention processor with `set_attn_processor` and change `SanaAttnProcessor3_0` to `SanaVanillaAttnProcessor`

* Add import for SanaVanillaAttnProcessor

* Add README file.

* Apply suggestions from code review

* style

* Update examples/research_projects/sana/README.md

---------

Co-authored-by: lawrence-cj <cjs1020440147@icloud.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-08 18:55:29 +05:30
Linoy Tsaban
66e50d4e24 [LoRA] make lora alpha and dropout configurable (#11467)
* add lora_alpha and lora_dropout

* Apply style fixes

* add lora_alpha and lora_dropout

* Apply style fixes

* revert lora_alpha until #11324 is merged

* Apply style fixes

* empty commit

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-05-08 11:54:50 +03:00
sayakpaul
c5c34a4591 Revert "fix audioldm"
This reverts commit 87e508f11f.
2025-05-08 11:30:29 +05:30
sayakpaul
87e508f11f fix audioldm 2025-05-08 11:30:11 +05:30
YiYi Xu
53bd367b03 clean up the __Init__ for stable_diffusion (#11500)
up
2025-05-07 07:01:17 -10:00
Aryan
7b904941bc Cosmos (#10660)
* begin transformer conversion

* refactor

* refactor

* refactor

* refactor

* refactor

* refactor

* update

* add conversion script

* add pipeline

* make fix-copies

* remove einops

* update docs

* gradient checkpointing

* add transformer test

* update

* debug

* remove prints

* match sigmas

* add vae pt. 1

* finish CV* vae

* update

* update

* update

* update

* update

* update

* make fix-copies

* update

* make fix-copies

* fix

* update

* update

* make fix-copies

* update

* update tests

* handle device and dtype for safety checker; required in latest diffusers

* remove enable_gqa and use repeat_interleave instead

* enforce safety checker; use dummy checker in fast tests

* add review suggestion for ONNX export

Co-Authored-By: Asfiya Baig <asfiyab@nvidia.com>

* fix safety_checker issues when not passed explicitly

We could either do what's done in this commit, or update the Cosmos examples to explicitly pass the safety checker

* use cosmos guardrail package

* auto format docs

* update conversion script to support 14B models

* update name CosmosPipeline -> CosmosTextToWorldPipeline

* update docs

* fix docs

* fix group offload test failing for vae

---------

Co-authored-by: Asfiya Baig <asfiyab@nvidia.com>
2025-05-07 20:59:09 +05:30
Sayak Paul
fb29132b98 [docs] minor updates to bitsandbytes docs. (#11509)
* minor updates to bitsandbytes docs.

* Apply suggestions from code review
2025-05-06 18:52:18 +05:30
Valeriy Selitskiy
79371661d1 [lora_conversion] Enhance key handling for OneTrainer components in LORA conversion utility (#11441) (#11487)
* [lora_conversion] Enhance key handling for OneTrainer components in LORA conversion utility (#11441)

* Update src/diffusers/loaders/lora_conversion_utils.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-06 18:44:58 +05:30
Yao Matrix
8c661ea586 enable lora cases on XPU (#11506)
* enable lora cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* remove hunyuanvideo xpu expectation

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-06 14:59:50 +05:30
Aryan
d7ffe60166 Hunyuan Video Framepack (#11428)
* add transformer

* add pipeline

* fixes

* make fix-copies

* update

* add flux mu shift

* update example snippet

* debug

* cleanup

* batch_size=1 optimization

* add pipeline test

* fix for model cpu offloading'

* add last_image support; credits: https://github.com/lllyasviel/FramePack/pull/167

* update example with flf2v

* update penguin url

* fix test

* address review comment: https://github.com/huggingface/diffusers/pull/11428#discussion_r2071032371

* address review comment: https://github.com/huggingface/diffusers/pull/11428#discussion_r2071087689

* Update src/diffusers/pipelines/hunyuan_video/pipeline_hunyuan_video_framepack.py

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-05-06 14:59:38 +05:30
Sayak Paul
10bee525e7 [LoRA] use removeprefix to preserve sanity. (#11493)
* use removeprefix to preserve sanity.

* f-string.
2025-05-06 12:17:57 +05:30
Sayak Paul
d88ae1f52a update dep table. (#11504)
* update dep table.

* fix
2025-05-06 11:14:07 +05:30
Sayak Paul
53f1043cbb Update setup.py to pin min version of peft (#11502) 2025-05-06 10:23:16 +05:30
Aryan
1fa5639438 Fix torchao docs typo for fp8 granular quantization (#11473)
update
2025-05-06 07:54:28 +05:30
RogerSinghChugh
ed4efbd63d Update training script for txt to img sdxl with lora supp with new interpolation. (#11496)
* Update training script for txt to img sdxl with lora supp with new interpolation.

* ran make style and make quality.
2025-05-05 12:33:28 -04:00
Yijun Lee
9c29e938d7 Set LANCZOS as the default interpolation method for image resizing. (#11492)
* Set LANCZOS as the default interpolation method for image resizing.

* style: run make style and quality checks
2025-05-05 12:18:40 -04:00
Sayak Paul
071807c853 [training] feat: enable quantization for hidream lora training. (#11494)
* feat: enable quantization for hidream lora training.

* better handle compute dtype.

* finalize.

* fix dtype.

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-05-05 20:44:35 +05:30
Evan Han
ee1516e5c7 [train_dreambooth_lora_lumina2] Add LANCZOS as the default interpolation mode for image resizing (#11491)
[ADD] interpolation
2025-05-05 10:41:33 -04:00
MinJu-Ha
ec9323996b [train_dreambooth_lora_sdxl] Add --image_interpolation_mode option for image resizing (default to lanczos) (#11490)
feat(train_dreambooth_lora_sdxl): support --image_interpolation_mode with default to lanczos
2025-05-05 10:19:30 -04:00
Parag Ekbote
fc5e906689 [train_text_to_image_sdxl]Add LANCZOS as default interpolation mode for image resizing (#11455)
* Add LANCZOS as default interplotation mode.

* update script

* Update as per code review.

* make style.
2025-05-05 09:52:19 -04:00
Connector Switch
8520d496f0 [Feature] Implement tiled VAE encoding/decoding for Wan model. (#11414)
* implement tiled encode/decode

* address review comments
2025-05-05 16:07:14 +05:30
Yao Matrix
a674914fd5 enable semantic diffusion and stable diffusion panorama cases on XPU (#11459)
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-05-05 15:28:07 +05:30
Yash
ec3d58286d [train_dreambooth_lora_flux_advanced] Add LANCZOS as the default interpolation mode for image resizing (#11472)
* [train_controlnet_sdxl] Add LANCZOS as the default interpolation mode for image resizing

* [train_dreambooth_lora_flux_advanced] Add LANCZOS as the default interpolation mode for image resizing
2025-05-02 18:14:41 -04:00
Yuanzhou
ed6cf52572 [train_dreambooth_lora_sdxl_advanced] Add LANCZOS as the default interpolation mode for image resizing (#11471) 2025-05-02 16:46:01 -04:00
Steven Liu
e23705e557 [docs] Adapters (#11331)
* refactor adapter docs

* ip-adapter

* ip adapter

* fix toctree

* fix toctree

* lora

* images

* controlnet

* feedback

* controlnet

* t2i

* fix typo

* feedback

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-05-02 08:08:33 +05:30
Steven Liu
b848d479b1 [docs] Memory optims (#11385)
* reformat

* initial

* fin

* review

* inference

* feedback

* feedback

* feedback
2025-05-01 11:22:00 -07:00
Vladimir Mandic
d0c02398b9 cache packages_distributions (#11453)
* cache packages_distributions

* remove unused exception reference

* make style

Signed-off-by: Vladimir Mandic <mandic00@live.com>

* change name to _package_map

---------

Signed-off-by: Vladimir Mandic <mandic00@live.com>
Co-authored-by: DN6 <dhruv.nair@gmail.com>
2025-05-01 21:47:52 +05:30
Sayak Paul
5dcdf4ac9a [tests] xfail recent pipeline tests for specific methods. (#11469)
xfail recent pipeline tests for specific methods.
2025-05-01 18:33:52 +05:30
co63oc
86294d3c7f Fix typos in docs and comments (#11416)
* Fix typos in docs and comments

* Apply style fixes

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-30 20:30:53 -10:00
Sayak Paul
d70f8ee18b [WAN] fix recompilation issues (#11475)
* [tests] Add torch.compile() test for WanTransformer3DModel

* fix wan recompilation issues.

* style

---------

Co-authored-by: tongyu0924 <winnie920924@gmail.com>
2025-04-30 20:29:08 -10:00
YiYi Xu
6a509ba862 Merge branch 'main' into modular-diffusers 2025-04-30 17:56:25 -10:00
Yao Matrix
06beecafc5 make autoencoders. controlnet_flux and wan_transformer3d_single_file pass on xpu (#11461)
* make autoencoders. controlnet_flux and wan_transformer3d_single_file
pass on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* Apply style fixes

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-05-01 02:43:31 +05:30
Vaibhav Kumawat
daf0a23958 Add LANCZOS as default interplotation mode. (#11463)
* Add LANCZOS as default interplotation mode.

* LANCZOS as default interplotation

* LANCZOS as default interplotation mode

* Added LANCZOS as default interplotation mode
2025-04-30 14:22:38 -04:00
tongyu
38ced7ee59 [test_models_transformer_hunyuan_video] help us test torch.compile() for impactful models (#11431)
* Update test_models_transformer_hunyuan_video.py

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-04-30 19:11:42 +08:00
Yao Matrix
23c98025b3 make safe diffusion test cases pass on XPU and A100 (#11458)
* make safe diffusion test cases pass on XPU and A100

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* calibrate A100 expected values

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-30 16:05:28 +05:30
captainzz
8cd7426e56 Add StableDiffusion3InstructPix2PixPipeline (#11378)
* upload StableDiffusion3InstructPix2PixPipeline

* Move to community

* Add readme

* Fix images

* remove images

* Change image url

* fix

* Apply style fixes
2025-04-30 06:13:12 -04:00
Daniel Socek
fbce7aeb32 Add generic support for Intel Gaudi accelerator (hpu device) (#11328)
* Add generic support for Intel Gaudi accelerator (hpu device)

Signed-off-by: Daniel Socek <daniel.socek@intel.com>
Co-authored-by: Libin Tang <libin.tang@intel.com>

* Add loggers for generic HPU support

Signed-off-by: Daniel Socek <daniel.socek@intel.com>

* Refactor hpu support with is_hpu_available() logic

Signed-off-by: Daniel Socek <daniel.socek@intel.com>

* Fix style for hpu support update

Signed-off-by: Daniel Socek <daniel.socek@intel.com>

* Decouple soft HPU check from hard device validation to support HPU migration

Signed-off-by: Daniel Socek <daniel.socek@intel.com>

---------

Signed-off-by: Daniel Socek <daniel.socek@intel.com>
Co-authored-by: Libin Tang <libin.tang@intel.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-04-30 14:45:02 +05:30
Yao Matrix
35fada4169 enable unidiffuser test cases on xpu (#11444)
* enable unidiffuser cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix a typo

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-04-30 13:58:00 +05:30
Yao Matrix
fbe2fe5578 enable consistency test cases on XPU, all passed (#11446)
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-04-30 12:41:29 +05:30
Aryan
c86511586f torch.compile fullgraph compatibility for Hunyuan Video (#11457)
udpate
2025-04-30 11:21:17 +05:30
Yao Matrix
60892c55a4 enable marigold_intrinsics cases on XPU (#11445)
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-04-30 11:07:37 +05:30
Aryan
8fe5a14d9b Raise warning instead of error for block offloading with streams (#11425)
raise warning instead of error
2025-04-30 08:26:16 +05:30
Youlun Peng
58431f102c Set LANCZOS as the default interpolation for image resizing in ControlNet training (#11449)
Set LANCZOS as the default interpolation for image resizing
2025-04-29 08:47:02 -04:00
urpetkov-amd
4a9ab650aa Fixing missing provider options argument (#11397)
* Fixing missing provider options argument

* Adding if else for provider options

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Apply style fixes

* Update src/diffusers/pipelines/onnx_utils.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/onnx_utils.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: Uros Petkovic <urpektov@amd.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-28 10:23:05 -10:00
Linoy Tsaban
0ac1d5b482 [Hi-Dream LoRA] fix bug in validation (#11439)
remove unnecessary pipeline moving to cpu in validation

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-04-28 06:22:32 -10:00
Yao Matrix
7567adfc45 enable 28 GGUF test cases on XPU (#11404)
* enable gguf test cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* make SD35LargeGGUFSingleFileTests::test_pipeline_inference pas

Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>

* make FluxControlLoRAGGUFTests::test_lora_loading pass

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* polish code

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* Apply style fixes

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: root <root@a4bf01945cfe.jf.intel.com>
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
Co-authored-by: root <root@a4bf01945cfe.jf.intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-28 21:32:04 +05:30
tongyu
3da98e7ee3 [train_text_to_image_lora] Better image interpolation in training scripts follow up (#11427)
* Update train_text_to_image_lora.py

* update_train_text_to_image_lora
2025-04-28 11:23:24 -04:00
tongyu
b3b04fefde [train_text_to_image] Better image interpolation in training scripts follow up (#11426)
* Update train_text_to_image.py

* update
2025-04-28 10:50:33 -04:00
Sayak Paul
0e3f2713c2 [tests] fix import. (#11434)
fix import.
2025-04-28 13:32:28 +08:00
Yao Matrix
a7e9f85e21 enable test_layerwise_casting_memory cases on XPU (#11406)
* enable test_layerwise_casting_memory cases on XPU

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-04-28 06:38:39 +05:30
Yao Matrix
9ce89e2efa enable group_offload cases and quanto cases on XPU (#11405)
* enable group_offload cases and quanto cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* use backend APIs

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: Yao Matrix <matrix.yao@intel.com>

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Signed-off-by: Yao Matrix <matrix.yao@intel.com>
2025-04-28 06:37:16 +05:30
Sayak Paul
aa5f5d41d6 [tests] add tests to check for graph breaks, recompilation, cuda syncs in pipelines during torch.compile() (#11085)
* test for better torch.compile stuff.

* fixes

* recompilation and graph break.

* clear compilation cache.

* change to modeling level test.

* allow running compilation tests during nightlies.
2025-04-28 08:36:33 +08:00
Mert Erbak
bd96a084d3 [train_dreambooth_lora.py] Set LANCZOS as default interpolation mode for resizing (#11421)
* Set LANCZOS as default interpolation mode for resizing

* [train_dreambooth_lora.py] Set LANCZOS as default interpolation mode for resizing
2025-04-26 01:58:41 -04:00
co63oc
f00a995753 Fix typos in strings and comments (#11407) 2025-04-24 08:53:47 -10:00
Ishan Modi
e8312e7ca9 [BUG] fixed WAN docstring (#11226)
update
2025-04-24 08:49:37 -10:00
Emiliano
7986834572 Fix Flux IP adapter argument in the pipeline example (#11402)
Fix Flux IP adapter argument in the example

IP-Adapter example had a wrong argument. Fix `true_cfg` -> `true_cfg_scale`
2025-04-24 08:41:12 -10:00
Linoy Tsaban
edd7880418 [HiDream LoRA] optimizations + small updates (#11381)
* 1. add pre-computation of prompt embeddings when custom prompts are used as well
2. save model card even if model is not pushed to hub
3. remove scheduler initialization from code example - not necessary anymore (it's now if the base model's config)
4. add skip_final_inference - to allow to run with validation, but skip the final loading of the pipeline with the lora weights to reduce memory reqs

* pre encode validation prompt as well

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* pre encode validation prompt as well

* Apply style fixes

* empty commit

* change default trained modules

* empty commit

* address comments + change encoding of validation prompt (before it was only pre-encoded if custom prompts are provided, but should be pre-encoded either way)

* Apply style fixes

* empty commit

* fix validation_embeddings definition

* fix final inference condition

* fix pipeline deletion in last inference

* Apply style fixes

* empty commit

* layers

* remove readme remarks on only pre-computing when instance prompt is provided and change example to 3d icons

* smol fix

* empty commit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-24 07:48:19 +03:00
Teriks
b4be42282d Kolors additional pipelines, community contrib (#11372)
* Kolors additional pipelines, community contrib

---------

Co-authored-by: Teriks <Teriks@users.noreply.github.com>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-04-23 11:07:27 -10:00
Ishan Modi
a4f9c3cbc3 [Feature] Added Xlab Controlnet support (#11249)
update
2025-04-23 10:43:50 -10:00
Ishan Dutta
4b60f4b602 [train_dreambooth_flux] Add LANCZOS as the default interpolation mode for image resizing (#11395) 2025-04-23 10:47:05 -04:00
Aryan
6cef71de3a Fix group offloading with block_level and use_stream=True (#11375)
* fix

* add tests

* add message check
2025-04-23 18:17:53 +05:30
Ameer Azam
026507c06c Update README_hidream.md (#11386)
Small change
requirements_sana.txt to 
requirements_hidream.txt
2025-04-22 20:08:26 -04:00
YiYi Xu
448c72a230 [HiDream] move deprecation to 0.35.0 (#11384)
up
2025-04-22 08:08:08 -10:00
Aryan
f108ad8888 Update modeling imports (#11129)
update
2025-04-22 06:59:25 -10:00
Linoy Tsaban
e30d3bf544 [LoRA] add LoRA support to HiDream and fine-tuning script (#11281)
* initial commit

* initial commit

* initial commit

* initial commit

* initial commit

* initial commit

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>

* move prompt embeds, pooled embeds outside

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: hlky <hlky@hlky.ac>

* fix import

* fix import and tokenizer 4, text encoder 4 loading

* te

* prompt embeds

* fix naming

* shapes

* initial commit to add HiDreamImageLoraLoaderMixin

* fix init

* add tests

* loader

* fix model input

* add code example to readme

* fix default max length of text encoders

* prints

* nullify training cond in unpatchify for temp fix to incompatible shaping of transformer output during training

* smol fix

* unpatchify

* unpatchify

* fix validation

* flip pred and loss

* fix shift!!!

* revert unpatchify changes (for now)

* smol fix

* Apply style fixes

* workaround moe training

* workaround moe training

* remove prints

* to reduce some memory, keep vae in `weight_dtype` same as we have for flux (as it's the same vae)
bbd0c161b5/examples/dreambooth/train_dreambooth_lora_flux.py (L1207)

* refactor to align with HiDream refactor

* refactor to align with HiDream refactor

* refactor to align with HiDream refactor

* add support for cpu offloading of text encoders

* Apply style fixes

* adjust lr and rank for train example

* fix copies

* Apply style fixes

* update README

* update README

* update README

* fix license

* keep prompt2,3,4 as None in validation

* remove reverse ode comment

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/train_dreambooth_lora_hidream.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* vae offload change

* fix text encoder offloading

* Apply style fixes

* cleaner to_kwargs

* fix module name in copied from

* add requirements

* fix offloading

* fix offloading

* fix offloading

* update transformers version in reqs

* try AutoTokenizer

* try AutoTokenizer

* Apply style fixes

* empty commit

* Delete tests/lora/test_lora_layers_hidream.py

* change tokenizer_4 to load with AutoTokenizer as well

* make text_encoder_four and tokenizer_four configurable

* save model card

* save model card

* revert T5

* fix test

* remove non diffusers lumina2 conversion

---------

Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-22 11:44:02 +03:00
apolinário
6ab62c7431 Add stochastic sampling to FlowMatchEulerDiscreteScheduler (#11369)
* Add stochastic sampling to FlowMatchEulerDiscreteScheduler

This PR adds stochastic sampling to FlowMatchEulerDiscreteScheduler based on b1aeddd7cc  ltx_video/schedulers/rf.py

* Apply style fixes

* Use config value directly

* Apply style fixes

* Swap order

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-04-21 17:18:30 -10:00
Ishan Modi
f59df3bb8b [Refactor] Minor Improvement for import utils (#11161)
* update

* update

* addressed PR comments

* update

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-04-21 09:56:55 -10:00
josephrocca
a00c73a5e1 Support different-length pos/neg prompts for FLUX.1-schnell variants like Chroma (#11120)
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-04-21 09:28:19 -10:00
OleehyO
0434db9a99 [cogview4][feat] Support attention mechanism with variable-length support and batch packing (#11349)
* [cogview4] Enhance attention mechanism with variable-length support and batch packing

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-21 09:27:55 -10:00
Aamir Nazir
aff574fb29 Add Serialized Type Name kwarg in Model Output (#10502)
* Update outputs.py
2025-04-21 08:45:28 -10:00
Ishan Modi
79ea8eb258 [BUG] fixes in kadinsky pipeline (#11080)
* bug fix kadinsky pipeline
2025-04-21 08:41:09 -10:00
Aryan
e7f3a73786 Fix Wan I2V prepare_latents dtype (#11371)
update
2025-04-21 08:18:50 -10:00
PromeAI
7a4a126db8 fix issue that training flux controlnet was unstable and validation r… (#11373)
* fix issue that training flux controlnet was unstable and validation results were unstable

* del unused code pieces, fix grammar

---------

Co-authored-by: Your Name <you@example.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-04-21 08:16:05 -10:00
Kenneth Gerald Hamilton
0dec414d5b [train_dreambooth_lora_sdxl.py] Fix the LR Schedulers when num_train_epochs is passed in a distributed training env (#11240)
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-04-21 12:51:03 +05:30
Linoy Tsaban
44eeba07b2 [Flux LoRAs] fix lr scheduler bug in distributed scenarios (#11242)
* add fix

* add fix

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-21 10:08:45 +03:00
YiYi Xu
5873377a66 [Wan2.1-FLF2V] update conversion script (#11365)
update scheuler config in conversion sript
2025-04-18 14:08:44 -10:00
YiYi Xu
5a2e0f715c update output for Hidream transformer (#11366)
up
2025-04-18 14:07:21 -10:00
Kazuki Yoda
ef47726e2d Fix: StableDiffusionXLControlNetAdapterInpaintPipeline incorrectly inherited StableDiffusionLoraLoaderMixin (#11357)
Fix: Inherit `StableDiffusionXLLoraLoaderMixin`

`StableDiffusionXLControlNetAdapterInpaintPipeline`
used to incorrectly inherit
`StableDiffusionLoraLoaderMixin`
instead of `StableDiffusionXLLoraLoaderMixin`
2025-04-18 12:46:06 -10:00
YiYi Xu
0021bfa1e1 support Wan-FLF2V (#11353)
* update transformer

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2025-04-18 10:27:50 -10:00
Marc Sun
bbd0c161b5 [BNB] Fix test_moving_to_cpu_throws_warning (#11356)
fix

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-04-18 09:44:51 +05:30
Yao Matrix
eef3d65954 enable 2 test cases on XPU (#11332)
* enable 2 test cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Apply style fixes

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-04-17 13:27:41 -10:00
Frank (Haofan) Wang
ee6ad51d96 Update controlnet_flux.py (#11350) 2025-04-17 10:05:01 -10:00
Sayak Paul
4397f59a37 [bitsandbytes] improve dtype mismatch handling for bnb + lora. (#11270)
* improve dtype mismatch handling for bnb + lora.

* add a test

* fix and updates

* update
2025-04-17 19:51:49 +05:30
YiYi Xu
056793295c [Hi Dream] follow-up (#11296)
* add
2025-04-17 01:17:44 -10:00
Sayak Paul
29d2afbfe2 [LoRA] Propagate hotswap better (#11333)
* propagate hotswap to other load_lora_weights() methods.

* simplify documentations.

* updates

* propagate to load_lora_into_text_encoder.

* empty commit
2025-04-17 10:35:38 +05:30
Sayak Paul
b00a564dac [docs] add note about use_duck_shape in auraflow docs. (#11348)
add note about use_duck_shape in auraflow docs.
2025-04-17 10:25:39 +05:30
Sayak Paul
efc9d68b15 [chore] fix lora docs utils (#11338)
fix lora docs utils
2025-04-17 09:25:53 +05:30
nPeppon
3e59d531d1 Fix wrong dtype argument name as torch_dtype (#11346) 2025-04-16 16:00:25 -04:00
Ishan Modi
d63e6fccb1 [BUG] fixed _toctree.yml alphabetical ordering (#11277)
update
2025-04-16 09:04:22 -07:00
Dhruv Nair
59f1b7b1c8 Hunyuan I2V fast tests fix (#11341)
* update

* update
2025-04-16 18:40:33 +05:30
Sayak Paul
ce1063acfa [docs] add a snippet for compilation in the auraflow docs. (#11327)
* add a snippet for compilation in the auraflow docs.

* include speedups.
2025-04-16 11:12:09 +05:30
Sayak Paul
7212f35de2 [single file] enable telemetry for single file loading when using GGUF. (#11284)
* enable telemetry for single file loading when using GGUF.

* quality
2025-04-16 08:33:52 +05:30
Sayak Paul
3252d7ad11 unpin torch versions for onnx Dockerfile (#11290)
unpin torch versions for onnx
2025-04-16 08:16:38 +05:30
Dhruv Nair
b316104ddd Fix Hunyuan I2V for transformers>4.47.1 (#11293)
* update

* update
2025-04-16 07:53:32 +05:30
Álvaro Somoza
d3b2699a7f another fix for FlowMatchLCMScheduler forgotten import (#11330)
fix
2025-04-15 13:53:16 -10:00
Sayak Paul
4b868f14c1 post release 0.33.0 (#11255)
* post release

* update

* fix deprecations

* remaining

* update

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-04-15 06:50:08 -10:00
AstraliteHeart
b6156aafe9 Rewrite AuraFlowPatchEmbed.pe_selection_index_based_on_dim to be torch.compile compatible (#11297)
* Update pe_selection_index_based_on_dim

* Make pe_selection_index_based_on_dim work with torh.compile

* Fix AuraFlowTransformer2DModel's dpcstring default values

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-04-15 19:30:30 +05:30
Sayak Paul
7ecfe29160 [docs] fix hidream docstrings. (#11325)
* fix hidream docstrings.

* fix

* empty commit
2025-04-15 16:26:21 +05:30
Yao Matrix
7edace9a05 fix CPU offloading related fail cases on XPU (#11288)
* fix CPU offloading related fail cases on XPU

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix style

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Apply style fixes

* trigger tests

* test_pipe_same_device_id_offload

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-04-15 09:06:56 +01:00
hlky
6e80d240d3 Fix vae.Decoder prev_output_channel (#11280) 2025-04-15 08:03:50 +01:00
Hameer Abbasi
9352a5ca56 [LoRA] Add LoRA support to AuraFlow (#10216)
* Add AuraFlowLoraLoaderMixin

* Add comments, remove qkv fusion

* Add Tests

* Add AuraFlowLoraLoaderMixin to documentation

* Add Suggested changes

* Change attention_kwargs->joint_attention_kwargs

* Rebasing derp.

* fix

* fix

* Quality fixes.

* make style

* `make fix-copies`

* `ruff check --fix`

* Attept 1 to fix tests.

* Attept 2 to fix tests.

* Attept 3 to fix tests.

* Address review comments.

* Rebasing derp.

* Get more tests passing by copying from Flux. Address review comments.

* `joint_attention_kwargs`->`attention_kwargs`

* Add `lora_scale` property for te LoRAs.

* Make test better.

* Remove useless property.

* Skip TE-only tests for AuraFlow.

* Support LoRA for non-CLIP TEs.

* Restore LoRA tests.

* Undo adding LoRA support for non-CLIP TEs.

* Undo support for TE in AuraFlow LoRA.

* `make fix-copies`

* Sync with upstream changes.

* Remove unneeded stuff.

* Mirror `Lumina2`.

* Skip for MPS.

* Address review comments.

* Remove duplicated code.

* Remove unnecessary code.

* Remove repeated docs.

* Propagate attention.

* Fix TE target modules.

* MPS fix for LoRA tests.

* Unrelated TE LoRA tests fix.

* Fix AuraFlow LoRA tests by applying to the right denoiser layers.

Co-authored-by: AstraliteHeart <81396681+AstraliteHeart@users.noreply.github.com>

* Apply style fixes

* empty commit

* Fix the repo consistency issues.

* Remove unrelated changes.

* Style.

* Fix `test_lora_fuse_nan`.

* fix quality issues.

* `pytest.xfail` -> `ValueError`.

* Add back `skip_mps`.

* Apply style fixes

* `make fix-copies`

---------

Co-authored-by: Warlord-K <warlordk28@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: AstraliteHeart <81396681+AstraliteHeart@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-15 10:41:28 +05:30
Sayak Paul
cefa28f449 [docs] Promote AutoModel usage (#11300)
* docs: promote the usage of automodel.

* bitsandbytes

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-15 09:25:40 +05:30
Beinsezii
8819cda6c0 Add skrample section to community_projects.md (#11319)
Update community_projects.md

https://github.com/huggingface/diffusers/discussions/11158#discussioncomment-12681691
2025-04-14 12:12:59 -10:00
hlky
dcf836cf47 Use float32 on mps or npu in transformer_hidream_image's rope (#11316) 2025-04-14 20:19:21 +01:00
Álvaro Somoza
1cb73cb19f import for FlowMatchLCMScheduler (#11318)
* add

* fix-copies
2025-04-14 06:28:57 -10:00
Linoy Tsaban
ba6008abfe [HiDream] code example (#11317) 2025-04-14 16:19:30 +01:00
Sayak Paul
a8f5134c11 [LoRA] support more SDXL loras. (#11292)
* support more SDXL loras.

* update

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-04-14 17:09:59 +05:30
Fanli Lin
c7f2d239fe make KolorsPipelineFastTests::test_inference_batch_single_identical pass on XPU (#11313)
adjust diff
2025-04-14 11:02:02 +01:00
Yao Matrix
fa1ac50a66 make test_stable_diffusion_karras_sigmas pass on XPU (#11310)
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-04-14 08:15:38 +01:00
Yao Matrix
aa541b9fab make KandinskyV22PipelineInpaintCombinedFastTests::test_float16_inference pass on XPU (#11308)
loose expected_max_diff from 5e-1 to 8e-1 to make
KandinskyV22PipelineInpaintCombinedFastTests::test_float16_inference
pass on XPU

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-04-14 07:49:20 +01:00
Ishan Modi
f1f38ffbee [ControlNet] Adds controlnet for SanaTransformer (#11040)
* added controlnet for sana transformer

* improve code quality

* addressed PR comments

* bug fixes

* added test cases

* update

* added dummy objects

* addressed PR comments

* update

* Forcing update

* add to docs

* code quality

* addressed PR comments

* addressed PR comments

* update

* addressed PR comments

* added proper styling

* update

* Revert "added proper styling"

This reverts commit 344ee8a701.

* manually ordered

* Apply suggestions from code review

---------

Co-authored-by: Aryan <contact.aryanvs@gmail.com>
2025-04-13 19:19:39 +05:30
Tuna Tuncer
36538e1135 Fix incorrect tile_latent_min_width calculations (#11305) 2025-04-13 18:21:50 +05:30
Aryan
97e0ef4db4 Hidream refactoring follow ups (#11299)
* HiDream Image

* update

* -einops

* py3.8

* fix -einops

* mixins, offload_seq, option_components

* docs

* Apply style fixes

* trigger tests

* Apply suggestions from code review

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* joint_attention_kwargs -> attention_kwargs, fixes

* fast tests

* -_init_weights

* style tests

* move reshape logic

* update slice 😴

* supports_dduf

* 🤷🏻‍♂️

* Update src/diffusers/models/transformers/transformer_hidream_image.py

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* address review comments

* update tests

* doc updates

* update

* Update src/diffusers/models/transformers/transformer_hidream_image.py

* Apply style fixes

---------

Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-13 09:45:19 +05:30
Adrien B
ed41db8525 Update autoencoderkl_allegro.md (#11303)
Correction typo
2025-04-13 09:41:30 +05:30
Nikita Starodubcev
ec0b2b3947 flow matching lcm scheduler (#11170)
* add flow matching lcm scheduler
* stochastic sampling
* upscaling for scale-wise generation

* Apply style fixes

* Apply suggestions from code review

Co-authored-by: hlky <hlky@hlky.ac>

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-04-12 11:14:57 -10:00
hlky
0ef29355c9 HiDream Image (#11231)
* HiDream Image


---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-04-11 06:31:34 -10:00
Tuna Tuncer
bc261058ee Fix incorrect tile_latent_min_width calculation in AutoencoderKLMochi (#11294) 2025-04-11 19:11:08 +05:30
Sayak Paul
7054a34978 do not use DIFFUSERS_REQUEST_TIMEOUT for notification bot (#11273)
fix to a constant
2025-04-11 14:19:46 +05:30
Sayak Paul
511d738121 [CI] relax tolerance for unclip further (#11268)
relax tolerance for unclip further.
2025-04-11 14:06:52 +05:30
Sayak Paul
ea5a6a8b7c [Tests] Cleanup lora tests utils (#11276)
* start cleaning up lora test utils for reusability

* update

* updates

* updates
2025-04-10 15:50:34 +05:30
hlky
b8093e6665 Fix LTX 0.9.5 single file (#11271) 2025-04-10 07:06:13 +01:00
Yuqian Hong
e121d0ef67 [BUG] Fix convert_vae_pt_to_diffusers bug (#11078)
* fix attention

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-10 06:59:45 +01:00
Yao Matrix
31c4f24fc1 make test_instant_style_multiple_masks pass on XPU (#11266)
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-04-10 06:23:00 +01:00
xieofxie
0efdf411fb add onnxruntime-qnn & onnxruntime-cann (#11269)
Co-authored-by: hualxie <hualxie@microsoft.com>
2025-04-10 06:00:23 +01:00
Yao Matrix
450dc48a2c make test_dict_tuple_outputs_equivalent pass on XPU (#11265)
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-04-10 05:56:28 +01:00
Yao Matrix
77b4f66b9e make test_stable_diffusion_inpaint_fp16 pass on XPU (#11264)
Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-04-10 05:55:54 +01:00
Yao Matrix
68663f8a17 fix test_vanilla_funetuning failure on XPU and A100 (#11263)
* fix test_vanilla_funetuning failure on XPU and A100

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* change back to 5e-2

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-04-10 05:55:07 +01:00
Sayak Paul
ffda8735be [LoRA] support musubi wan loras. (#11243)
* support musubi wan loras.

* Update src/diffusers/loaders/lora_conversion_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* support i2v loras from musubi too.

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-04-10 09:50:22 +05:30
YiYi Xu
0706786e53 fix wan ftfy import (#11262) 2025-04-09 09:08:34 -10:00
Sayak Paul
5b27f8aba8 fix consisid imports (#11254)
* fix consisid imports

* fix opencv import

* fix
2025-04-09 18:49:32 +05:30
Sayak Paul
d1387ecee5 fix timeout constant (#11252)
* fix timeout constant

* style

* fix
2025-04-09 17:48:52 +05:30
Ilya Drobyshevskiy
6a7c2d0afa fix flux controlnet bug (#11152)
Before this if txt_ids was 3d tensor, line with txt_ids[:1] concat txt_ids by batch dim. Now we first check that txt_ids is 2d tensor (or take first batch element) and then concat by token dim
2025-04-09 13:01:07 +01:00
Dhruv Nair
edc154da09 Update Ruff to latest Version (#10919)
* update

* update

* update

* update
2025-04-09 16:51:34 +05:30
hlky
552cd32058 [docs] AutoModel (#11250)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-04-09 16:42:23 +05:30
Yao Matrix
c36c745ceb fix FluxReduxSlowTests::test_flux_redux_inference case failure on XPU (#11245)
* loose test_float16_inference's tolerance from 5e-2 to 6e-2, so XPU can
pass UT

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

* fix test_pipeline_flux_redux fail on XPU

Signed-off-by: Matrix Yao <matrix.yao@intel.com>

---------

Signed-off-by: Matrix Yao <matrix.yao@intel.com>
2025-04-09 11:41:15 +01:00
hlky
437cb36c65 AutoModel (#11115)
* AutoModel

* ...

* lol

* ...

* add test

* update

* make fix-copies

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-04-09 15:20:07 +05:30
hlky
9ee3dd3862 AudioLDM2 Fixes (#11244) 2025-04-09 14:12:00 +05:30
Sayak Paul
fd02aad402 fix: SD3 ControlNet validation so that it runs on a A100. (#11238)
* fix: SD3 ControlNet validation so that it runs on a A100.

* use backend-agnostic cache and pass devide.
2025-04-09 12:12:53 +05:30
Sayak Paul
6bfacf0418 [LoRA] support more comyui loras for Flux 🚨 (#10985)
* support more comyui loras.

* fix

* fixes

* revert changes in LoRA base.

* no position_embedding

* 🚨 introduce a breaking change to let peft handle module ambiguity

* styling

* remove position embeddings.

* improvements.

* style

* make info instead of NotImplementedError

* Update src/diffusers/loaders/peft.py

Co-authored-by: hlky <hlky@hlky.ac>

* add example.

* robust checks

* updates

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-04-09 09:17:05 +05:30
Sayak Paul
f685981ed0 [docs] minor updates to dtype map docs. (#11237)
minor updates to dtype map docs.
2025-04-09 08:38:17 +05:30
Sayak Paul
b924251dd8 minor update to sana sprint docs. (#11236) 2025-04-09 08:17:45 +05:30
Sayak Paul
1a04812439 [bistandbytes] improve replacement warnings for bnb (#11132)
* improve replacement warnings for bnb

* updates to docs.
2025-04-08 21:18:34 +05:30
Sayak Paul
4b27c4a494 [feat] implement record_stream when using CUDA streams during group offloading (#11081)
* implement record_stream for better performance.

* fix

* style.

* merge #11097

* Update src/diffusers/hooks/group_offloading.py

Co-authored-by: Aryan <aryan@huggingface.co>

* fixes

* docstring.

* remaining todos in low_cpu_mem_usage

* tests

* updates to docs.

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2025-04-08 21:17:49 +05:30
hlky
5d49b3e83b Flux quantized with lora (#10990)
* Flux quantized with lora

* fix

* changes

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Apply style fixes

* enable model cpu offload()

* Update src/diffusers/loaders/lora_pipeline.py

Co-authored-by: hlky <hlky@hlky.ac>

* update

* Apply suggestions from code review

* update

* add peft as an additional dependency for gguf

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-04-08 21:17:03 +05:30
Linoy Tsaban
71f34fc5a4 [Flux LoRA] fix issues in flux lora scripts (#11111)
* remove custom scheduler

* update requirements.txt

* log_validation with mixed precision

* add intermediate embeddings saving when checkpointing is enabled

* remove comment

* fix validation

* add unwrap_model for accelerator, torch.no_grad context for validation, fix accelerator.accumulate call in advanced script

* revert unwrap_model change temp

* add .module to address distributed training bug + replace accelerator.unwrap_model with unwrap model

* changes to align advanced script with canonical script

* make changes for distributed training + unify unwrap_model calls in advanced script

* add module.dtype fix to dreambooth script

* unify unwrap_model calls in dreambooth script

* fix condition in validation run

* mixed precision

* Update examples/advanced_diffusion_training/train_dreambooth_lora_flux_advanced.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* smol style change

* change autocast

* Apply style fixes

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-08 17:40:30 +03:00
Yao Matrix
c51b6bd837 introduce compute arch specific expectations and fix test_sd3_img2img_inference failure (#11227)
* add arch specfic expectations support, to support different arch's numerical characteristics

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* fix typo

Signed-off-by: YAO Matrix <matrix.yao@intel.com>

* Apply suggestions from code review

* Apply style fixes

* Update src/diffusers/utils/testing_utils.py

---------

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-08 14:57:49 +01:00
Benjamin Bossan
fb54499614 [LoRA] Implement hot-swapping of LoRA (#9453)
* [WIP][LoRA] Implement hot-swapping of LoRA

This PR adds the possibility to hot-swap LoRA adapters. It is WIP.

Description

As of now, users can already load multiple LoRA adapters. They can
offload existing adapters or they can unload them (i.e. delete them).
However, they cannot "hotswap" adapters yet, i.e. substitute the weights
from one LoRA adapter with the weights of another, without the need to
create a separate LoRA adapter.

Generally, hot-swapping may not appear not super useful but when the
model is compiled, it is necessary to prevent recompilation. See #9279
for more context.

Caveats

To hot-swap a LoRA adapter for another, these two adapters should target
exactly the same layers and the "hyper-parameters" of the two adapters
should be identical. For instance, the LoRA alpha has to be the same:
Given that we keep the alpha from the first adapter, the LoRA scaling
would be incorrect for the second adapter otherwise.

Theoretically, we could override the scaling dict with the alpha values
derived from the second adapter's config, but changing the dict will
trigger a guard for recompilation, defeating the main purpose of the
feature.

I also found that compilation flags can have an impact on whether this
works or not. E.g. when passing "reduce-overhead", there will be errors
of the type:

> input name: arg861_1. data pointer changed from 139647332027392 to
139647331054592

I don't know enough about compilation to determine whether this is
problematic or not.

Current state

This is obviously WIP right now to collect feedback and discuss which
direction to take this. If this PR turns out to be useful, the
hot-swapping functions will be added to PEFT itself and can be imported
here (or there is a separate copy in diffusers to avoid the need for a
min PEFT version to use this feature).

Moreover, more tests need to be added to better cover this feature,
although we don't necessarily need tests for the hot-swapping
functionality itself, since those tests will be added to PEFT.

Furthermore, as of now, this is only implemented for the unet. Other
pipeline components have yet to implement this feature.

Finally, it should be properly documented.

I would like to collect feedback on the current state of the PR before
putting more time into finalizing it.

* Reviewer feedback

* Reviewer feedback, adjust test

* Fix, doc

* Make fix

* Fix for possible g++ error

* Add test for recompilation w/o hotswapping

* Make hotswap work

Requires https://github.com/huggingface/peft/pull/2366

More changes to make hotswapping work. Together with the mentioned PEFT
PR, the tests pass for me locally.

List of changes:

- docstring for hotswap
- remove code copied from PEFT, import from PEFT now
- adjustments to PeftAdapterMixin.load_lora_adapter (unfortunately, some
  state dict renaming was necessary, LMK if there is a better solution)
- adjustments to UNet2DConditionLoadersMixin._process_lora: LMK if this
  is even necessary or not, I'm unsure what the overall relationship is
  between this and PeftAdapterMixin.load_lora_adapter
- also in UNet2DConditionLoadersMixin._process_lora, I saw that there is
  no LoRA unloading when loading the adapter fails, so I added it
  there (in line with what happens in PeftAdapterMixin.load_lora_adapter)
- rewritten tests to avoid shelling out, make the test more precise by
  making sure that the outputs align, parametrize it
- also checked the pipeline code mentioned in this comment:
  https://github.com/huggingface/diffusers/pull/9453#issuecomment-2418508871;
  when running this inside the with
  torch._dynamo.config.patch(error_on_recompile=True) context, there is
  no error, so I think hotswapping is now working with pipelines.

* Address reviewer feedback:

- Revert deprecated method
- Fix PEFT doc link to main
- Don't use private function
- Clarify magic numbers
- Add pipeline test

Moreover:
- Extend docstrings
- Extend existing test for outputs != 0
- Extend existing test for wrong adapter name

* Change order of test decorators

parameterized.expand seems to ignore skip decorators if added in last
place (i.e. innermost decorator).

* Split model and pipeline tests

Also increase test coverage by also targeting conv2d layers (support of
which was added recently on the PEFT PR).

* Reviewer feedback: Move decorator to test classes

... instead of having them on each test method.

* Apply suggestions from code review

Co-authored-by: hlky <hlky@hlky.ac>

* Reviewer feedback: version check, TODO comment

* Add enable_lora_hotswap method

* Reviewer feedback: check _lora_loadable_modules

* Revert changes in unet.py

* Add possibility to ignore enabled at wrong time

* Fix docstrings

* Log possible PEFT error, test

* Raise helpful error if hotswap not supported

I.e. for the text encoder

* Formatting

* More linter

* More ruff

* Doc-builder complaint

* Update docstring:

- mention no text encoder support yet
- make it clear that LoRA is meant
- mention that same adapter name should be passed

* Fix error in docstring

* Update more methods with hotswap argument

- SDXL
- SD3
- Flux

No changes were made to load_lora_into_transformer.

* Add hotswap argument to load_lora_into_transformer

For SD3 and Flux. Use shorter docstring for brevity.

* Extend docstrings

* Add version guards to tests

* Formatting

* Fix LoRA loading call to add prefix=None

See:
https://github.com/huggingface/diffusers/pull/10187#issuecomment-2717571064

* Run make fix-copies

* Add hot swap documentation to the docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-04-08 17:05:31 +05:30
Álvaro Somoza
723dbdd363 [Training] Better image interpolation in training scripts (#11206)
* initial

* Update examples/dreambooth/train_dreambooth_lora_sdxl.py

Co-authored-by: hlky <hlky@hlky.ac>

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-04-08 12:26:07 +05:30
Bhavay Malhotra
fbf61f465b [train_controlnet.py] Fix the LR schedulers when num_train_epochs is passed in a distributed training env (#8461)
* Create diffusers.yml

* fix num_train_epochs

* Delete diffusers.yml

* Fixed Changes

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-04-08 12:10:09 +05:30
YiYi Xu
96795afc72 Merge branch 'main' into modular-diffusers 2025-04-07 18:05:00 -10:00
Inigo Goiri
841504bb1a Add support to pass image embeddings to the WAN I2V pipeline. (#11175)
* Add support to pass image embeddings to the pipeline.



---------

Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-04-07 15:47:06 -10:00
Steven Liu
fc7a867ae5 [docs] MPS update (#11212)
mps
2025-04-07 14:32:27 -10:00
alex choi
5ded26cdc7 ensure dtype match between diffused latents and vae weights (#8391) 2025-04-07 12:59:10 -10:00
Yao Matrix
506f39af3a enable 1 case on XPU (#11219)
enable case on XPU: 1. tests/quantization/bnb/test_mixed_int8.py::BnB8bitTrainingTests::test_training

Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-07 08:24:21 +01:00
Mikko Tukiainen
8ad68c1393 Add missing MochiEncoder3D.gradient_checkpointing attribute (#11146)
* Add missing 'gradient_checkpointing = False' attr

* Add (limited) tests for Mochi autoencoder

* Apply style fixes

* pass 'conv_cache' as arg instead of kwarg

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-06 02:46:45 +05:30
Edna
41afb6690c Add Wan with STG as a community pipeline (#11184)
* Add stg wan to community pipelines

* remove debug prints

* remove unused comment

* Update doc

* Add credit + fix typo

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-05 04:00:40 +02:00
Tolga Cangöz
13e48492f0 [LTX0.9.5] Refactor LTXConditionPipeline for text-only conditioning (#11174)
* Refactor `LTXConditionPipeline` to add text-only conditioning

* style

* up

* Refactor `LTXConditionPipeline` to streamline condition handling and improve clarity

* Improve condition checks

* Simplify latents handling based on conditioning type

* Refactor rope_interpolation_scale preparation for clarity and efficiency

* Update LTXConditionPipeline docstring to clarify supported input types

* Add LTX Video 0.9.5 model to documentation

* Clarify documentation to indicate support for text-only conditioning without passing `conditions`

* refactor: comment out unused parameters in LTXConditionPipeline

* fix: restore previously commented parameters in LTXConditionPipeline

* fix: remove unused parameters from LTXConditionPipeline

* refactor: remove unnecessary lines in LTXConditionPipeline
2025-04-04 16:43:15 +02:00
Suprhimp
94f2c48d58 [feat]Add strength in flux_fill pipeline (denoising strength for fluxfill) (#10603)
* [feat]add strength in flux_fill pipeline

* Update src/diffusers/pipelines/flux/pipeline_flux_fill.py

* Update src/diffusers/pipelines/flux/pipeline_flux_fill.py

* Update src/diffusers/pipelines/flux/pipeline_flux_fill.py

* [refactor] refactor after review

* [fix] change comment

* Apply style fixes

* empty

* fix

* update prepare_latents from flux.img2img pipeline

* style

* Update src/diffusers/pipelines/flux/pipeline_flux_fill.py

---------
2025-04-04 11:23:30 -03:00
Dhruv Nair
aabf8ce20b Fix Single File loading for LTX VAE (#11200)
update
2025-04-04 18:02:39 +05:30
Kenneth Gerald Hamilton
f10775b1b5 Fixed requests.get function call by adding timeout parameter. (#11156)
* Fixed requests.get function call by adding timeout parameter.

* declare DIFFUSERS_REQUEST_TIMEOUT in constants and import when needed

* remove unneeded os import

* Apply style fixes

---------

Co-authored-by: Sai-Suraj-27 <sai.suraj.27.729@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-04-04 07:23:14 +01:00
célina
6edb774b5e Update Style Bot workflow (#11202)
update style bot workflow
2025-04-03 19:31:49 +02:00
Basile Lewandowski
480510ada9 Change KolorsPipeline LoRA Loader to StableDiffusion (#11198)
Change LoRA Loader to StableDiffusion

Replace the SDXL LoRA Loader Mixin inheritance with the StableDiffusion one
2025-04-03 11:21:11 -03:00
Abhipsha Das
d9023a671a [Model Card] standardize advanced diffusion training sdxl lora (#7615)
* model card gen code

* push modelcard creation

* remove optional from params

* add import

* add use_dora check

* correct lora var use in tags

* make style && make quality

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-04-03 07:43:01 +05:30
Eliseu Silva
c4646a3931 feat: [Community Pipeline] - FaithDiff Stable Diffusion XL Pipeline (#11188)
* feat: [Community Pipeline] - FaithDiff Stable Diffusion XL Pipeline for Image SR.

* added pipeline
2025-04-02 11:33:19 -10:00
Dhruv Nair
c97b709afa Add CacheMixin to Wan and LTX Transformers (#11187)
* update

* update

* update
2025-04-02 10:16:31 -10:00
lakshay sharma
b0ff822ed3 Update import_utils.py (#10329)
added onnxruntime-vitisai for custom build onnxruntime pkg
2025-04-02 20:47:10 +01:00
hlky
78c2fdc52e SchedulerMixin from_pretrained and ConfigMixin Self type annotation (#11192) 2025-04-02 08:24:02 -10:00
hlky
54dac3a87c Fix enable_sequential_cpu_offload in CogView4Pipeline (#11195)
* Fix enable_sequential_cpu_offload in CogView4Pipeline

* make fix-copies
2025-04-02 16:51:23 +01:00
hlky
e5c6027ef8 [docs] torch_dtype map (#11194) 2025-04-02 12:46:28 +01:00
hlky
da857bebb6 Revert save_model in ModelMixin save_pretrained and use safe_serialization=False in test (#11196) 2025-04-02 12:45:36 +01:00
Fanli Lin
52b460feb9 [tests] HunyuanDiTControlNetPipeline inference precision issue on XPU (#11197)
* add xpu part

* fix more cases

* remove some cases

* no canny

* format fix
2025-04-02 12:45:02 +01:00
hlky
d8c617ccb0 allow models to run with a user-provided dtype map instead of a single dtype (#10301)
* allow models to run with a user-provided dtype map instead of a single dtype

* make style

* Add warning, change `_` to `default`

* make style

* add test

* handle shared tensors

* remove warning

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-04-02 09:05:46 +01:00
Bruno Magalhaes
fe2b397426 remove unnecessary call to F.pad (#10620)
* rewrite memory count without implicitly using dimensions by @ic-synth

* replace F.pad by built-in padding in Conv3D

* in-place sums to reduce memory allocations

* fixed trailing whitespace

* file reformatted

* in-place sums

* simpler in-place expressions

* removed in-place sum, may affect backward propagation logic

* removed in-place sum, may affect backward propagation logic

* removed in-place sum, may affect backward propagation logic

* reverted change
2025-04-02 08:19:51 +01:00
Eliseu Silva
be0b7f55cc fix: for checking mandatory and optional pipeline components (#11189)
fix: optional componentes verification on load
2025-04-02 08:07:24 +01:00
jiqing-feng
4d5a96e40a fix autocast (#11190)
Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
2025-04-02 07:26:27 +01:00
Yao Matrix
a7f07c1ef5 map BACKEND_RESET_MAX_MEMORY_ALLOCATED to reset_peak_memory_stats on XPU (#11191)
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
2025-04-02 07:25:48 +01:00
Dhruv Nair
df1d7b01f1 [WIP] Add Wan Video2Video (#11053)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update
2025-04-01 17:22:11 +05:30
Fanli Lin
5a6edac087 [tests] no hard-coded cuda (#11186)
no cuda only
2025-04-01 12:14:31 +01:00
kakukakujirori
e8fc8b1f81 Bug fix in LTXImageToVideoPipeline.prepare_latents() when latents is already set (#10918)
* Bug fix in ltx

* Assume packed latents.

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-31 12:15:43 -10:00
hlky
d6f4774c1c Add latents_mean and latents_std to SDXLLongPromptWeightingPipeline (#11034) 2025-03-31 11:32:29 -10:00
Mark
eb50defff2 [Docs] Fix environment variables in installation.md (#11179) 2025-03-31 09:15:25 -07:00
Aryan
2c59af7222 Raise warning and round down if Wan num_frames is not 4k + 1 (#11167)
* update

* raise warning and round to nearest multiple of scale factor
2025-03-31 13:33:28 +05:30
hlky
75d7e5cc45 Fix LatteTransformer3DModel dtype mismatch with enable_temporal_attentions (#11139) 2025-03-29 15:52:56 +01:00
Dhruv Nair
617c208bb4 [Docs] Update Wan Docs with memory optimizations (#11089)
* update

* update
2025-03-28 19:05:56 +05:30
hlky
5d970a4aa9 WanI2V encode_image (#11164)
* WanI2V encode_image
2025-03-28 18:05:34 +05:30
kentdan3msu
de6a88c2d7 Set self._hf_peft_config_loaded to True when LoRA is loaded using load_lora_adapter in PeftAdapterMixin class (#11155)
set self._hf_peft_config_loaded to True on successful lora load

Sets the `_hf_peft_config_loaded` flag if a LoRA is successfully loaded in `load_lora_adapter`. Fixes bug huggingface/diffusers/issues/11148

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-26 18:31:18 +01:00
Dhruv Nair
7dc52ea769 [Quantization] dtype fix for GGUF + fix BnB tests (#11159)
* update

* update

* update

* update
2025-03-26 22:22:16 +05:30
Junsong Chen
739d6ec731 add a timestep scale for sana-sprint teacher model (#11150) 2025-03-25 08:47:39 -10:00
Aryan
1ddf3f3a19 Improve information about group offloading and layerwise casting (#11101)
* update

* Update docs/source/en/optimization/memory.md

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* apply review suggestions

* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-24 23:25:59 +05:30
Jun Yeop Na
7aac77affa [doc] Fix Korean Controlnet Train doc (#11141)
* remove typo from korean controlnet train doc

* removed more paragraphs to remain in sync with the english document
2025-03-24 09:38:21 -07:00
Aryan
8907a70a36 New HunyuanVideo-I2V (#11066)
* update

* update

* update

* add tests

* update docs

* raise value error

* warning for true cfg and guidance scale

* fix test
2025-03-24 21:18:40 +05:30
Junsong Chen
5dbe4f5de6 [fix SANA-Sprint] (#11142)
* fix bug in sana conversion script;

* add more model paths;

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-23 23:38:14 -10:00
Yuxuan Zhang
1d37f42055 Modify the implementation of retrieve_timesteps in CogView4-Control. (#11125)
* 1

* change to channel 1

* cogview4 control training

* add CacheMixin

* 1

* remove initial_input_channels change for val

* 1

* update

* use 3.5

* new loss

* 1

* use imagetoken

* for megatron convert

* 1

* train con and uc

* 2

* remove guidance_scale

* Update pipeline_cogview4_control.py

* fix

* use cogview4 pipeline with timestep

* update shift_factor

* remove the uncond

* add max length

* change convert and use GLMModel instead of GLMForCasualLM

* fix

* [cogview4] Add attention mask support to transformer model

* [fix] Add attention mask for padded token

* update

* remove padding type

* Update train_control_cogview4.py

* resolve conflicts with #10981

* add control convert

* use control format

* fix

* add missing import

* update with cogview4 formate

* make style

* Update pipeline_cogview4_control.py

* Update pipeline_cogview4_control.py

* remove

* Update pipeline_cogview4_control.py

* put back

* Apply style fixes

---------

Co-authored-by: OleehyO <leehy0357@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-23 21:17:14 +05:30
Tolga Cangöz
0213179ba8 Update README and example code for AnyText usage (#11028)
* [Documentation] Update README and example code with additional usage instructions for AnyText

* [Documentation] Update README for AnyTextPipeline and improve logging in code

* Remove wget command for font file from example docstring in anytext.py
2025-03-23 21:15:57 +05:30
hlky
a7d53a5939 Don't override torch_dtype and don't use when quantization_config is set (#11039)
* Don't use `torch_dtype` when `quantization_config` is set

* up

* djkajka

* Apply suggestions from code review

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-21 21:58:38 +05:30
YiYi Xu
8a63aa5e4f add sana-sprint (#11074)
* add sana-sprint




---------

Co-authored-by: Junsong Chen <cjs1020440147@icloud.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-03-21 06:21:18 -10:00
Aryan
844221ae4e [core] FasterCache (#10163)
* init

* update

* update

* update

* make style

* update

* fix

* make it work with guidance distilled models

* update

* make fix-copies

* add tests

* update

* apply_faster_cache -> apply_fastercache

* fix

* reorder

* update

* refactor

* update docs

* add fastercache to CacheMixin

* update tests

* Apply suggestions from code review

* make style

* try to fix partial import error

* Apply style fixes

* raise warning

* update

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-21 09:35:04 +05:30
CyberVy
9b2c0a7dbe fix _callback_tensor_inputs of sd controlnet inpaint pipeline missing some elements (#11073)
* Update pipeline_controlnet_inpaint.py

* Apply style fixes
2025-03-20 23:56:12 -03:00
Parag Ekbote
f424b1b062 Notebooks for Community Scripts-8 (#11128)
Add 4 Notebooks and update the missing links for the
example README.
2025-03-20 12:24:46 -07:00
YiYi Xu
e9fda3924f remove F.rms_norm for now (#11126)
up
2025-03-20 07:55:01 -10:00
Dhruv Nair
2c1ed50fc5 Provide option to reduce CPU RAM usage in Group Offload (#11106)
* update

* update

* clean up
2025-03-20 17:01:09 +05:30
Fanli Lin
15ad97f782 [tests] make cuda only tests device-agnostic (#11058)
* enable bnb on xpu

* add 2 more cases

* add missing change

* add missing change

* add one more

* enable cuda only tests on xpu

* enable big gpu cases
2025-03-20 10:12:35 +00:00
hlky
9f2d5c9ee9 Flux with Remote Encode (#11091)
* Flux img2img remote encode

* Flux inpaint

* -copied from
2025-03-20 09:44:08 +00:00
Junsong Chen
dc62e6931e [fix bug] PixArt inference_steps=1 (#11079)
* fix bug when pixart-dmd inference with `num_inference_steps=1`

* use return_dict=False and return [1] element for 1-step pixart model, which works for both lcm and dmd
2025-03-20 07:44:30 +00:00
Fanli Lin
56f740051d [tests] enable bnb tests on xpu (#11001)
* enable bnb on xpu

* add 2 more cases

* add missing change

* add missing change

* add one more
2025-03-19 16:33:11 +00:00
Linoy Tsaban
a34d97cef0 [Wan LoRAs] make T2V LoRAs compatible with Wan I2V (#11107)
* @hlky t2v->i2v

* Apply style fixes

* try with ones to not nullify layers

* fix method name

* revert to zeros

* add check to state_dict keys

* add comment

* copies fix

* Revert "copies fix"

This reverts commit 051f534d18.

* remove copied from

* Update src/diffusers/loaders/lora_pipeline.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/loaders/lora_pipeline.py

Co-authored-by: hlky <hlky@hlky.ac>

* update

* update

* Update src/diffusers/loaders/lora_pipeline.py

Co-authored-by: hlky <hlky@hlky.ac>

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Linoy <linoy@hf.co>
Co-authored-by: hlky <hlky@hlky.ac>
2025-03-19 21:44:19 +05:30
Yuqian Hong
fc28791fc8 [BUG] Fix Autoencoderkl train script (#11113)
* add disc_optimizer step (not fix)

* support syncbatchnorm in discriminator
2025-03-19 16:49:02 +05:30
Sayak Paul
ae14612673 [CI] uninstall deps properly from pr gpu tests. (#11102)
uninstall deps properly from pr gpu tests.
2025-03-19 08:58:36 +05:30
hlky
0ab8fe49bf Quality options in export_to_video (#11090)
* Quality options in `export_to_video`

* make style
2025-03-18 10:32:33 -10:00
Aryan
3be6706018 Fix Group offloading behaviour when using streams (#11097)
* update

* update
2025-03-18 14:44:10 +05:30
Cheng Jin
cb1b8b21b8 Resolve stride mismatch in UNet's ResNet to support Torch DDP (#11098)
Modify UNet's ResNet implementation to resolve stride mismatch in Torch's DDP
2025-03-18 07:38:13 +00:00
Juan Acevedo
27916822b2 update readme instructions. (#11096)
Co-authored-by: Juan Acevedo <jfacevedo@google.com>
2025-03-17 20:07:48 -10:00
co63oc
3fe3bc0642 Fix pipeline_flux_controlnet.py (#11095)
* Fix pipeline_flux_controlnet.py

* Fix style
2025-03-17 19:52:15 -10:00
Aryan
813d42cc96 Group offloading improvements (#11094)
update
2025-03-18 11:18:00 +05:30
Sayak Paul
b4d7e9c632 make PR GPU tests conditioned on styling. (#11099) 2025-03-18 11:15:35 +05:30
Aryan
2e83cbbb6d LTX 0.9.5 (#10968)
* update


---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-03-17 16:43:36 -10:00
C
33d10af28f Fix Wan I2V Quality (#11087)
* fix_wan_i2v_quality

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/wan/pipeline_wan_i2v.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update pipeline_wan_i2v.py

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-03-17 06:24:57 -10:00
Sayak Paul
100142586f [CI] pin transformers version for benchmarking. (#11067)
pin transformers version for benchmarking.
2025-03-16 10:27:35 +05:30
Yuxuan Zhang
82188cef04 CogView4 Control Block (#10809)
* cogview4 control training


---------

Co-authored-by: OleehyO <leehy0357@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
2025-03-15 07:15:56 -10:00
Sayak Paul
cc19726f3d [Tests] add requires peft decorator. (#11037)
* add requires peft decorator.

* install peft conditionally.

* conditional deps.

Co-authored-by: DN6 <dhruv.nair@gmail.com>

---------

Co-authored-by: DN6 <dhruv.nair@gmail.com>
2025-03-15 12:56:41 +05:30
Dimitri Barbot
be54a95b93 Fix deterministic issue when getting pipeline dtype and device (#10696)
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-15 07:50:58 +05:30
Juan Acevedo
6b9a3334db reverts accidental change that removes attn_mask in attn. Improves fl… (#11065)
reverts accidental change that removes attn_mask in attn. Improves flux ptxla by using flash block sizes. Moves encoding outside the for loop.

Co-authored-by: Juan Acevedo <jfacevedo@google.com>
2025-03-14 12:47:01 -10:00
Andreas Jörg
8ead643bb7 [examples/controlnet/train_controlnet_sd3.py] Fixes #11050 - Cast prompt_embeds and pooled_prompt_embeds to weight_dtype to prevent dtype mismatch (#11051)
Fix: dtype mismatch of prompt embeddings in sd3 controlnet training

Co-authored-by: Andreas Jörg <andreasjoerg@MacBook-Pro-von-Andreas-2.fritz.box>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-14 17:33:15 +05:30
Sayak Paul
124ac3e81f [LoRA] feat: support non-diffusers wan t2v loras. (#11059)
feat: support non-diffusers wan t2v loras.
2025-03-14 16:01:25 +05:30
Sayak Paul
2f0f281b0d [Tests] restrict memory tests for quanto for certain schemes. (#11052)
* restrict memory tests for quanto for certain schemes.

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fixes

* style

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-14 10:35:19 +05:30
ZhengKai91
ccc8321651 Fix aclnnRepeatInterleaveIntWithDim error on NPU for get_1d_rotary_pos_embed (#10820)
* get_1d_rotary_pos_embed support npu

* Update src/diffusers/models/embeddings.py

---------

Co-authored-by: Kai zheng <kaizheng@KaideMacBook-Pro.local>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-13 09:58:03 -10:00
Yaniv Galron
5e48cd27d4 making ``formatted_images`` initialization compact (#10801)
compact writing

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-13 09:27:14 -10:00
hlky
5551506b29 Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline (#10827)
* Rename Lumina(2)Text2ImgPipeline -> Lumina(2)Pipeline


---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-13 09:24:21 -10:00
Sayak Paul
20e4b6a628 [LoRA] change to warning from info when notifying the users about a LoRA no-op (#11044)
* move to warning.

* test related changes.
2025-03-12 21:20:48 +05:30
hlky
4ea9f89b8e Wan Pipeline scaling fix, type hint warning, multi generator fix (#11007)
* Wan Pipeline scaling fix, type hint warning, multi generator fix

* Apply suggestions from code review
2025-03-12 12:05:52 +00:00
hlky
733b44ac82 [hybrid inference 🍯🐝] Add VAE encode (#11017)
* [hybrid inference 🍯🐝] Add VAE encode

* _toctree: add vae encode

* Add endpoints, tests

* vae_encode docs

* vae encode benchmarks

* api reference

* changelog

* Update docs/source/en/hybrid_inference/overview.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-12 11:23:41 +00:00
hlky
8b4f8ba764 Use output_size in repeat_interleave (#11030) 2025-03-12 07:30:21 +00:00
Dhruv Nair
5428046437 [Refactor] Clean up import utils boilerplate (#11026)
* update

* update

* update
2025-03-12 07:48:34 +05:30
39th president of the United States, probably
e7ffeae0a1 Fix for multi-GPU WAN inference (#10997)
Ensure that hidden_state and shift/scale are on the same device when running with multiple GPUs

Co-authored-by: Jimmy <39@🇺🇸.com>
2025-03-11 07:42:12 -10:00
CyberVy
d87ce2cefc Fix missing **kwargs in lora_pipeline.py (#11011)
* Update lora_pipeline.py

* Apply style fixes

* fix-copies

---------

Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-11 07:34:27 -10:00
wonderfan
36d0553af2 chore: fix help messages in advanced diffusion examples (#10923) 2025-03-11 07:33:55 -10:00
hlky
7e0db46f73 Fix SD3 IPAdapter feature extractor (#11027) 2025-03-11 16:29:27 +00:00
Sayak Paul
e4b056fe65 [LoRA] support wan i2v loras from the world. (#11025)
* support wan i2v loras from the world.

* remove copied from.

* upates

* add lora.
2025-03-11 20:43:29 +05:30
Eliseu Silva
4e3ddd5afa fix: mixture tiling sdxl pipeline - adjust gerating time_ids & embeddings (#11012)
small fix on generating time_ids & embeddings
2025-03-11 04:20:18 -03:00
Dhruv Nair
9add071592 [Quantization] Allow loading TorchAO serialized Tensor objects with torch>=2.6 (#11018)
* update

* update

* update

* update

* update

* update

* update

* update

* update
2025-03-11 10:52:01 +05:30
Tolga Cangöz
b88fef4785 [Research Project] Add AnyText: Multilingual Visual Text Generation And Editing (#8998)
* Add initial template

* Second template

* feat: Add TextEmbeddingModule to AnyTextPipeline

* feat: Add AuxiliaryLatentModule template to AnyTextPipeline

* Add bert tokenizer from the anytext repo for now

* feat: Update AnyTextPipeline's modify_prompt method

This commit adds improvements to the modify_prompt method in the AnyTextPipeline class. The method now handles special characters and replaces selected string prompts with a placeholder. Additionally, it includes a check for Chinese text and translation using the trans_pipe.

* Fill in the `forward` pass of `AuxiliaryLatentModule`

* `make style && make quality`

* `chore: Update bert_tokenizer.py with a TODO comment suggesting the use of the transformers library`

* Update error handling to raise and logging

* Add `create_glyph_lines` function into `TextEmbeddingModule`

* make style

* Up

* Up

* Up

* Up

* Remove several comments

* refactor: Remove ControlNetConditioningEmbedding and update code accordingly

* Up

* Up

* up

* refactor: Update AnyTextPipeline to include new optional parameters

* up

* feat: Add OCR model and its components

* chore: Update `TextEmbeddingModule` to include OCR model components and dependencies

* chore: Update `AuxiliaryLatentModule` to include VAE model and its dependencies for masked image in the editing task

* `make style`

* refactor: Update `AnyTextPipeline`'s docstring

* Update `AuxiliaryLatentModule` to include info dictionary so that text processing is done once

* simplify

* `make style`

* Converting `TextEmbeddingModule` to ordinary `encode_prompt()` function

* Simplify for now

* `make style`

* Up

* feat: Add scripts to convert AnyText controlnet to diffusers

* `make style`

* Fix: Move glyph rendering to `TextEmbeddingModule` from `AuxiliaryLatentModule`

* make style

* Up

* Simplify

* Up

* feat: Add safetensors module for loading model file

* Fix device issues

* Up

* Up

* refactor: Simplify

* refactor: Simplify code for loading models and handling data types

* `make style`

* refactor: Update to() method in FrozenCLIPEmbedderT3 and TextEmbeddingModule

* refactor: Update dtype in embedding_manager.py to match proj.weight

* Up

* Add attribution and adaptation information to pipeline_anytext.py

* Update usage example

* Will refactor `controlnet_cond_embedding` initialization

* Add `AnyTextControlNetConditioningEmbedding` template

* Refactor organization

* style

* style

* Move custom blocks from `AuxiliaryLatentModule` to `AnyTextControlNetConditioningEmbedding`

* Follow one-file policy

* style

* [Docs] Update README and pipeline_anytext.py to use AnyTextControlNetModel

* [Docs] Update import statement for AnyTextControlNetModel in pipeline_anytext.py

* [Fix] Update import path for ControlNetModel, ControlNetOutput in anytext_controlnet.py

* Refactor AnyTextControlNet to use configurable conditioning embedding channels

* Complete control net conditioning embedding in AnyTextControlNetModel

* up

* [FIX] Ensure embeddings use correct device in AnyTextControlNetModel

* up

* up

* style

* [UPDATE] Revise README and example code for AnyTextPipeline integration with DiffusionPipeline

* [UPDATE] Update example code in anytext.py to use correct font file and improve clarity

* down

* [UPDATE] Refactor BasicTokenizer usage to a new Checker class for text processing

* update pillow

* [UPDATE] Remove commented-out code and unnecessary docstring in anytext.py and anytext_controlnet.py for improved clarity

* [REMOVE] Delete frozen_clip_embedder_t3.py as it is in the anytext.py file

* [UPDATE] Replace edict with dict for configuration in anytext.py and RecModel.py for consistency

* 🆙

* style

* [UPDATE] Revise README.md for clarity, remove unused imports in anytext.py, and add author credits in anytext_controlnet.py

* style

* Update examples/research_projects/anytext/README.md

Co-authored-by: Aryan <contact.aryanvs@gmail.com>

* Remove commented-out image preparation code in AnyTextPipeline

* Remove unnecessary blank line in README.md
2025-03-11 01:49:37 +05:30
Sayak Paul
e7e6d85282 [Tests] improve quantization tests by additionally measuring the inference memory savings (#11021)
* memory usage tests

* fixes

* gguf
2025-03-10 21:42:24 +05:30
Aryan
8eefed65bd [LoRA] CogView4 (#10981)
* update

* make fix-copies

* update
2025-03-10 20:24:05 +05:30
Sayak Paul
26149c0ecd [LoRA] Improve warning messages when LoRA loading becomes a no-op (#10187)
* updates

* updates

* updates

* updates

* notebooks revert

* fix-copies.

* seeing

* fix

* revert

* fixes

* fixes

* fixes

* remove print

* fix

* conflicts ii.

* updates

* fixes

* better filtering of prefix.

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-03-10 09:28:32 +05:30
Ishan Modi
0703ce8800 [Single File] Add single file loading for SANA Transformer (#10947)
* added support for from_single_file

* added diffusers mapping script

* added testcase

* bug fix

* updated tests

* corrected code quality

* corrected code quality

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-10 08:38:30 +05:30
Dhruv Nair
f5edaa7894 [Quantization] Add Quanto backend (#10756)
* update

* updaet

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/quantization/quanto.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update src/diffusers/quantizers/quanto/utils.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-10 08:33:05 +05:30
Dhruv Nair
9a1810f0de Fix for fetching variants only (#10646)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update
2025-03-10 07:45:44 +05:30
Sayak Paul
1fddee211e [LoRA] Improve copied from comments in the LoRA loader classes (#10995)
* more sanity of mind with copied from ...

* better

* better
2025-03-08 19:59:21 +05:30
Kinam Kim
b38450d5d2 Add STG to community pipelines (#10960)
* Support STG for video pipelines

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update README.md

* Update pipeline_stg_cogvideox.py

* Update pipeline_stg_hunyuan_video.py

* Update pipeline_stg_ltx.py

* Update pipeline_stg_ltx_image2video.py

* Update pipeline_stg_mochi.py

* Update pipeline_stg_hunyuan_video.py

* Update pipeline_stg_ltx.py

* Update pipeline_stg_ltx_image2video.py

* Update pipeline_stg_mochi.py

* update

* remove rescaling

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-08 00:28:24 +05:30
Dhruv Nair
1357931d74 [Single File] Add single file support for Wan T2V/I2V (#10991)
* update

* update

* update

* update

* update

* update

* update
2025-03-07 22:13:25 +05:30
Sayak Paul
a2d3d6af44 [LoRA] remove full key prefix from peft. (#11004)
remove full key prefix from peft.
2025-03-07 21:51:59 +05:30
hlky
363d1ab7e2 Wan VAE move scaling to pipeline (#10998) 2025-03-07 10:42:17 +00:00
C
6a0137eb3b Fix Graph Breaks When Compiling CogView4 (#10959)
* Fix Graph Breaks When Compiling CogView4

Eliminate this:

```
t]V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles] Recompiling function forward in /home/zeyi/repos/diffusers/src/diffusers/models/transformers/transformer_cogview4.py:374
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles]     triggered by the following guard failure(s):
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles]     - 0/3: ___check_obj_id(L['self'].rope.freqs_h, 139976127328032)    
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles]     - 0/2: ___check_obj_id(L['self'].rope.freqs_h, 139976107780960)    
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles]     - 0/1: ___check_obj_id(L['self'].rope.freqs_h, 140022511848960)    
V0304 10:24:23.421000 3131076 torch/_dynamo/guards.py:2813] [0/4] [__recompiles]     - 0/0: ___check_obj_id(L['self'].rope.freqs_h, 140024081342416)   
```

* Update transformer_cogview4.py

* fix cogview4 rotary pos embed

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-06 22:57:17 -10:00
Aryan
2e5203be04 Hunyuan I2V (#10983)
* update

* update

* update

* add tests

* update

* add model tests

* update docs

* update

* update example

* fix defaults

* update
2025-03-07 12:52:48 +05:30
yupeng1111
d55f41102a fix wan i2v pipeline bugs (#10975)
* fix wan i2v pipeline bugs

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-06 18:57:41 -10:00
LittleNyima
748cb0fab6 Add CogVideoX DDIM Inversion to Community Pipelines (#10956)
* add cogvideox ddim inversion script

* implement as a pipeline, and add documentation

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2025-03-06 10:46:38 -10:00
Dhruv Nair
790a909b54 [Single File] Add user agent to SF download requests. (#10979)
update
2025-03-06 10:45:20 -10:00
CyberVy
54ab475391 Fix Flux Controlnet Pipeline _callback_tensor_inputs Missing Some Elements (#10974)
* Update pipeline_flux_controlnet.py

* Update pipeline_flux_controlnet_image_to_image.py

* Update pipeline_flux_controlnet_inpainting.py

* Update pipeline_flux_controlnet_inpainting.py

* Update pipeline_flux_controlnet_inpainting.py
2025-03-06 14:26:20 -03:00
dependabot[bot]
f103993094 Bump jinja2 from 3.1.5 to 3.1.6 in /examples/research_projects/realfill (#10984)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.5 to 3.1.6.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.5...3.1.6)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-06 11:59:51 +00:00
Sayak Paul
1be0202502 [CI] remove synchornized. (#10980)
removed synchornized.
2025-03-06 17:03:19 +05:30
Pierre Chapuis
ea81a4228d fix default values of Flux guidance_scale in docstrings (#10982) 2025-03-06 16:37:45 +05:30
hlky
b15027636a Fix loading OneTrainer Flux LoRA (#10978)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-06 13:53:36 +05:30
Sayak Paul
6e2a93de70 [tests] fix tests for save load components (#10977)
fix tests
2025-03-06 12:30:37 +05:30
Jun Yeop Na
37b8edfb86 [train_dreambooth_lora.py] Fix the LR Schedulers when num_train_epochs is passed in a distributed training env (#10973)
* updated train_dreambooth_lora to fix the LR schedulers for `num_train_epochs` in distributed training env

* fixed formatting

* remove trailing newlines

* fixed style error
2025-03-06 10:06:24 +05:30
Célina
fbf6b856cc use style bot GH Action from huggingface_hub (#10970)
use style bot GH action from hfh

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-05 23:39:50 +05:30
Linoy Tsaban
e031caf4ea [flux lora training] fix t5 training bug (#10845)
* fix t5 training bug

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-05 13:47:01 +02:00
hlky
08f74a8b92 Add VAE Decode endpoint slow test (#10946) 2025-03-05 11:28:06 +00:00
YiYi Xu
24c062aaa1 update check_input for cogview4 (#10966)
fix
2025-03-04 12:12:54 -10:00
Yuxuan Zhang
a74f02fb40 [Docs] CogView4 comment fix (#10957)
* Update pipeline_cogview4.py

* Use GLM instead of T5 in doc
2025-03-04 11:25:43 -10:00
Eliseu Silva
66bf7ea5be feat: add Mixture-of-Diffusers ControlNet Tile upscaler Pipeline for SDXL (#10951)
* feat: add Mixture-of-Diffusers ControlNet Tile upscaler Pipeline for SDXL

* make style make quality
2025-03-04 17:17:36 -03:00
Alexey Zolotenkov
b8215b1c06 Fix incorrect seed initialization when args.seed is 0 (#10964)
* Fix seed initialization to handle args.seed = 0 correctly

* Apply style fixes

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-04 10:09:52 -10:00
Aryan
3ee899fa0c [LoRA] Support Wan (#10943)
* update

* refactor image-to-video pipeline

* update

* fix copied from

* use FP32LayerNorm
2025-03-05 01:27:34 +05:30
CyberVy
dcd77ce222 Fix the missing parentheses when calling is_torchao_available in quantization_config.py. (#10961)
Update quantization_config.py
2025-03-04 09:52:41 -03:00
a120092009
11d8e3ce2c [Quantization] support pass MappingType for TorchAoConfig (#10927)
* [Quantization] support pass MappingType for TorchAoConfig

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-04 16:40:50 +05:30
Sayak Paul
97fda1b75c [LoRA] feat: support non-diffusers lumina2 LoRAs. (#10909)
* feat: support non-diffusers lumina2 LoRAs.

* revert ipynb changes (but I don't know why this is required ☹️)

* empty

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-03-04 14:40:55 +05:30
Sayak Paul
cc22058324 Update evaluation.md (#10938)
* Update evaluation.md

* Update docs/source/en/conceptual/evaluation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-03-04 13:58:16 +05:30
Fanli Lin
7855ac597e [tests] make tests device-agnostic (part 4) (#10508)
* initial comit

* fix empty cache

* fix one more

* fix style

* update device functions

* update

* update

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/controlnet/test_controlnet.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/controlnet/test_controlnet.py

Co-authored-by: hlky <hlky@hlky.ac>

* with gc.collect

* update

* make style

* check_torch_dependencies

* add mps empty cache

* add changes

* bug fix

* enable on xpu

* update more cases

* revert

* revert back

* Update test_stable_diffusion_xl.py

* Update tests/pipelines/stable_diffusion/test_stable_diffusion.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_img2img.py

Co-authored-by: hlky <hlky@hlky.ac>

* Apply suggestions from code review

Co-authored-by: hlky <hlky@hlky.ac>

* add test marker

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-03-04 08:26:06 +00:00
CyberVy
30cef6bff3 Improve load_ip_adapter RAM Usage (#10948)
* Update ip_adapter.py

* Update ip_adapter.py

* Update ip_adapter.py

* Update ip_adapter.py

* Update ip_adapter.py

* Apply style fixes

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-03-04 07:21:23 +00:00
Ahmed Belgacem
8f15be169f Fix redundant prev_output_channel assignment in UNet2DModel (#10945) 2025-03-03 11:43:15 -10:00
Yuxuan Zhang
f92e599c70 Update pipeline_cogview4.py (#10944) 2025-03-03 09:42:01 -10:00
Parag Ekbote
982f9b38d6 Add Example of IPAdapterScaleCutoffCallback to Docs (#10934)
* Add example of Ip-Adapter-Callback.

* Add image links from HF Hub.
2025-03-03 08:32:45 -08:00
fancydaddy
c9a219b323 add from_single_file to animatediff (#10924)
* Update pipeline_animatediff.py

* Update pipeline_animatediff_controlnet.py

* Update pipeline_animatediff_sparsectrl.py

* Update pipeline_animatediff_video2video.py

* Update pipeline_animatediff_video2video_controlnet.py

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-03 19:11:54 +05:30
Teriks
9e910c4633 Fix SD2.X clip single file load projection_dim (#10770)
* Fix SD2.X clip single file load projection_dim

Infer projection_dim from the checkpoint before loading
from pretrained, override any incorrect hub config.

Hub configuration for SD2.X specifies projection_dim=512
which is incorrect for SD2.X checkpoints loaded from civitai
and similar.

Exception was previously thrown upon attempting to
load_model_dict_into_meta for SD2.X single file checkpoints.

Such LDM models usually require projection_dim=1024

* convert_open_clip_checkpoint use hidden_size for text_proj_dim

* convert_open_clip_checkpoint, revert checkpoint[text_proj_key].shape[1] -> [0]

values are identical

---------

Co-authored-by: Teriks <Teriks@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-03 19:00:39 +05:30
Bubbliiiing
5e3b7d2d8a Add EasyAnimateV5.1 text-to-video, image-to-video, control-to-video generation model (#10626)
* Update EasyAnimate V5.1

* Add docs && add tests && Fix comments problems in transformer3d and vae

* delete comments and remove useless import

* delete process

* Update EXAMPLE_DOC_STRING

* rename transformer file

* make fix-copies

* make style

* refactor pt. 1

* update toctree.yml

* add model tests

* Update layer_norm for norm_added_q and norm_added_k in Attention

* Fix processor problem

* refactor vae

* Fix problem in comments

* refactor tiling; remove einops dependency

* fix docs path

* make fix-copies

* Update src/diffusers/pipelines/easyanimate/pipeline_easyanimate_control.py

* update _toctree.yml

* fix test

* update

* update

* update

* make fix-copies

* fix tests

---------

Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-03-03 18:37:19 +05:30
Sayak Paul
7513162b8b [Tests] Remove more encode prompts tests (#10942)
* fix-copies went uncaught it seems.

* remove more unneeded encode_prompt() tests

* Revert "fix-copies went uncaught it seems."

This reverts commit eefb302791.

* empty
2025-03-03 16:55:01 +05:30
Sayak Paul
4aaa0d21ba [chore] fix-copies to flux pipelines (#10941)
fix-copies went uncaught it seems.
2025-03-03 11:21:57 +05:30
hlky
54043c3e2e Update VAE Decode endpoints (#10939) 2025-03-02 18:29:53 +00:00
hlky
fc4229a0c3 Add remote_decode to remote_utils (#10898)
* Add `remote_decode` to `remote_utils`

* test dependency

* test dependency

* dependency

* dependency

* dependency

* docstrings

* changes

* make style

* apply

* revert, add new options

* Apply style fixes

* deprecate base64, headers not needed

* address comments

* add license header

* init test_remote_decode

* more

* more test

* more test

* skeleton for xl, flux

* more test

* flux test

* flux packed

* no scaling

* -save

* hunyuanvideo test

* Apply style fixes

* init docs

* Update src/diffusers/utils/remote_utils.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* comments

* Apply style fixes

* comments

* hybrid_inference/vae_decode

* fix

* tip?

* tip

* api reference autodoc

* install tip

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2025-03-02 17:10:01 +00:00
hlky
694f9658c1 Support IPAdapter for more Flux pipelines (#10708)
* Support IPAdapter for more Flux pipelines

* -copied from

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-03-02 15:04:12 +00:00
YiYi Xu
2d8a41cae8 [Alibaba Wan Team] continue on #10921 Wan2.1 (#10922)
* Add wanx pipeline, model and example

* wanx_merged_v1

* change WanX into Wan

* fix i2v fp32 oom error

Link: https://code.alibaba-inc.com/open_wanx2/diffusers/codereview/20607813

* support t2v load fp32 ckpt

* add example

* final merge v1

* Update autoencoder_kl_wan.py

* up

* update middle, test up_block

* up up

* one less nn.sequential

* up more

* up

* more

* [refactor] [wip] Wan transformer/pipeline (#10926)

* update

* update

* refactor rope

* refactor pipeline

* make fix-copies

* add transformer test

* update

* update

* make style

* update tests

* tests

* conversion script

* conversion script

* update

* docs

* remove unused code

* fix _toctree.yml

* update dtype

* fix test

* fix tests: scale

* up

* more

* Apply suggestions from code review

* Apply suggestions from code review

* style

* Update scripts/convert_wan_to_diffusers.py

* update docs

* fix

---------

Co-authored-by: Yitong Huang <huangyitong.hyt@alibaba-inc.com>
Co-authored-by: 亚森 <wangjiayu.wjy@alibaba-inc.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-03-02 17:24:26 +05:30
Dhruv Nair
7007febae5 [CI] Update Stylebot Permissions (#10931)
update
2025-03-01 09:43:05 +05:30
Sayak Paul
d230ecc570 [style bot] improve security for the stylebot. (#10908)
* improve security for the stylebot.

* 
2025-02-28 22:01:31 +05:30
hlky
37a5f1b3b6 Experimental per control type scale for ControlNet Union (#10723)
* ControlNet Union scale

* fix

* universal interface

* from_multi

* from_multi
2025-02-27 10:23:38 +00:00
Dhruv Nair
501d9de701 [CI] Fix for failing IP Adapter test in Fast GPU PR tests (#10915)
* update

* update

* update

* update
2025-02-27 14:22:28 +05:30
Dhruv Nair
e5c43b8af7 [CI] Fix Fast GPU tests on PR (#10912)
* update

* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-02-27 14:21:50 +05:30
CyberVy
9a8e8db79f Fix Callback Tensor Inputs of the SD Controlnet Pipelines are missing some elements. (#10907)
* Update pipeline_controlnet_img2img.py

* Update pipeline_controlnet_inpaint.py

* Update pipeline_controlnet.py

---------
2025-02-26 15:36:47 -03:00
Sayak Paul
764d7ed49a [Tests] fix: lumina2 lora fuse_nan test (#10911)
fix: lumina2 lora fuse_nan test
2025-02-26 22:44:49 +05:30
Anton Obukhov
3fab6624fd Marigold Update: v1-1 models, Intrinsic Image Decomposition pipeline, documentation (#10884)
* minor documentation fixes of the depth and normals pipelines

* update license headers

* update model checkpoints in examples
fix missing prediction_type in register_to_config in the normals pipeline

* add initial marigold intrinsics pipeline
update comments about num_inference_steps and ensemble_size
minor fixes in comments of marigold normals and depth pipelines

* update uncertainty visualization to work with intrinsics

* integrate iid


---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-25 14:13:02 -10:00
Yih-Dar
f0ac7aaafc Security fix (#10905)
fix

Co-authored-by: ydshieh <ydshieh@users.noreply.github.com>
2025-02-25 23:25:37 +05:30
CyberVy
613e77f8be Fix Callback Tensor Inputs of the SDXL Controlnet Inpaint and Img2img Pipelines are missing "controlnet_image". (#10880)
* Update pipeline_controlnet_inpaint_sd_xl.py

* Update pipeline_controlnet_sd_xl_img2img.py

* Update pipeline_controlnet_union_inpaint_sd_xl.py

* Update pipeline_controlnet_union_sd_xl_img2img.py

* Update pipeline_controlnet_inpaint_sd_xl.py

* Update pipeline_controlnet_sd_xl_img2img.py

* Update pipeline_controlnet_union_inpaint_sd_xl.py

* Update pipeline_controlnet_union_sd_xl_img2img.py

* Apply make style and make fix-copies fixes

* Update geodiff_molecule_conformation.ipynb

* Delete examples/research_projects/geodiff/geodiff_molecule_conformation.ipynb

* Delete examples/research_projects/gligen/demo.ipynb

* Create geodiff_molecule_conformation.ipynb

* Create demo.ipynb

* Update geodiff_molecule_conformation.ipynb

* Update geodiff_molecule_conformation.ipynb

* Delete examples/research_projects/geodiff/geodiff_molecule_conformation.ipynb

* Add files via upload

* Delete src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint.py

* Add files via upload
2025-02-25 12:53:03 -03:00
Daniel Regado
1450c2ac4f Multi IP-Adapter for Flux pipelines (#10867)
* Initial implementation of Flux multi IP-Adapter

* Update src/diffusers/pipelines/flux/pipeline_flux.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/flux/pipeline_flux.py

Co-authored-by: hlky <hlky@hlky.ac>

* Changes for ipa image embeds

* Update src/diffusers/pipelines/flux/pipeline_flux.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/flux/pipeline_flux.py

Co-authored-by: hlky <hlky@hlky.ac>

* make style && make quality

* Updated ip_adapter test

* Created typing_utils.py

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-02-25 09:51:15 +00:00
Dhruv Nair
cc7b5b873a [CI] Improvements to conditional GPU PR tests (#10859)
* update

* update

* update

* update

* update

* update

* test

* test

* test

* test

* test

* test

* test

* test

* test

* test

* test

* test

* update
2025-02-25 09:49:29 +05:30
Aryan
0404703237 [refactor] Remove additional Flux code (#10881)
* update

* apply review suggestions

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2025-02-24 14:56:30 -10:00
Aryan
13f20c7fe8 [refactor] SD3 docs & remove additional code (#10882)
* update

* update

* update
2025-02-25 03:08:47 +05:30
Dhruv Nair
87599691b9 [Docs] Fix toctree sorting (#10894)
update
2025-02-24 10:05:32 -10:00
Sayak Paul
36517f6124 [chore] correct qk norm list. (#10876)
correct qk norm list.
2025-02-24 07:49:14 -10:00
Aryan
64af74fc58 [docs] Add CogVideoX Schedulers (#10885)
update
2025-02-24 07:02:59 -10:00
SahilCarterr
170833c22a [Fix] fp16 unscaling in train_dreambooth_lora_sdxl (#10889)
Fix fp16 bug

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-02-24 06:49:23 -10:00
Steven Liu
db21c97043 [docs] Flux group offload (#10847)
* flux group-offload

* feedback
2025-02-24 08:47:08 -08:00
Steven Liu
3fdf173084 [docs] Update prompt weighting docs (#10843)
* sd_embed

* feedback
2025-02-24 08:46:26 -08:00
hlky
aba4a5799a Add SD3 ControlNet to AutoPipeline (#10888)
Co-authored-by: puhuk <wetr235@gmail.com>
2025-02-24 06:21:02 -10:00
Sayak Paul
b0550a66cc [LoRA] restrict certain keys to be checked for peft config update. (#10808)
* restruct certain keys to be checked for peft config update.

* updates

* finish./

* finish 2.

* updates
2025-02-24 16:54:38 +05:30
hlky
6f74ef550d Fix torch_dtype in Kolors text encoder with transformers v4.49 (#10816)
* Fix `torch_dtype` in Kolors text encoder with `transformers` v4.49

* Default torch_dtype and warning
2025-02-24 13:37:54 +05:30
Daniel Regado
9c7e205176 Comprehensive type checking for from_pretrained kwargs (#10758)
* More robust from_pretrained init_kwargs type checking

* Corrected for Python 3.10

* Type checks subclasses and fixed type warnings

* More type corrections and skip tokenizer type checking

* make style && make quality

* Updated docs and types for Lumina pipelines

* Fixed check for empty signature

* changed location of helper functions

* make style

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-02-22 13:15:19 +00:00
Steven Liu
64dec70e56 [docs] LoRA support (#10844)
* lora

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-02-22 08:53:02 +05:30
Marc Sun
ffb6777ace remove format check for safetensors file (#10864)
remove check
2025-02-21 19:56:16 +01:00
SahilCarterr
85fcbaf314 [Fix] Docs overview.md (#10858)
Fix docs
2025-02-21 08:03:22 -08:00
hlky
d75ea3c772 device_map in load_model_dict_into_meta (#10851)
* `device_map` in `load_model_dict_into_meta`

* _LOW_CPU_MEM_USAGE_DEFAULT

* fix is_peft_version is_bitsandbytes_version
2025-02-21 12:16:30 +00:00
Dhruv Nair
b27d4edbe1 [CI] Update always test Pipelines list in Pipeline fetcher (#10856)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-02-21 16:24:20 +05:30
Dhruv Nair
2b2d04299c [CI] Fix incorrectly named test module for Hunyuan DiT (#10854)
update
2025-02-21 13:35:40 +05:30
Sayak Paul
6cef7d2366 fix remote vae template (#10852)
fix
2025-02-21 12:00:02 +05:30
Sayak Paul
9055ccb382 [chore] template for remote vae. (#10849)
template for remote vae.
2025-02-21 11:43:36 +05:30
Sayak Paul
1871a69ecb fix: run tests from a pr workflow. (#9696)
* fix: run tests from a pr workflow.

* correct

* update

* checking.
2025-02-21 08:50:37 +05:30
Aryan
e3bc4aab2e SkyReels Hunyuan T2V & I2V (#10837)
* update

* make fix-copies

* update

* tests

* update

* update

* add co-author

Co-Authored-By: Langdx <82783347+Langdx@users.noreply.github.com>

* add co-author

Co-Authored-By: howe <howezhang2018@gmail.com>

* update

---------

Co-authored-by: Langdx <82783347+Langdx@users.noreply.github.com>
Co-authored-by: howe <howezhang2018@gmail.com>
2025-02-21 06:48:15 +05:30
Aryan
f0707751ef Some consistency-related fixes for HunyuanVideo (#10835)
* update

* update
2025-02-21 03:37:07 +05:30
Daniel Regado
d9ee3879b0 SD3 IP-Adapter runtime checkpoint conversion (#10718)
* Added runtime checkpoint conversion

* Updated docs

* Fix for quantized model
2025-02-20 10:35:57 -10:00
Sayak Paul
454f82e6fc [CI] run fast gpu tests conditionally on pull requests. (#10310)
* run fast gpu tests conditionally on pull requests.

* revert unneeded changes.

* simplify PR.
2025-02-20 23:06:59 +05:30
Sayak Paul
1f853504da [CI] install accelerate transformers from main (#10289)
install accelerate transformers from .
2025-02-20 23:06:40 +05:30
Parag Ekbote
51941387dc Notebooks for Community Scripts-7 (#10846)
Add 5 Notebooks, improve their example
scripts and update the missing links for the
example README.
2025-02-20 09:02:09 -08:00
Haoyun Qin
c7a8c4395a fix: support transformer models' generation_config in pipeline (#10779) 2025-02-20 21:49:33 +05:30
Marc Sun
a4c1aac3ae store activation cls instead of function (#10832)
* store cls instead of an obj

* style
2025-02-20 10:38:15 +01:00
Sayak Paul
b2ca39c8ac [tests] test encode_prompt() in isolation (#10438)
* poc encode_prompt() tests

* fix

* updates.

* fixes

* fixes

* updates

* updates

* updates

* revert

* updates

* updates

* updates

* updates

* remove SDXLOptionalComponentsTesterMixin.

* remove tests that directly leveraged encode_prompt() in some way or the other.

* fix imports.

* remove _save_load

* fixes

* fixes

* fixes

* fixes
2025-02-20 13:21:43 +05:30
AstraliteHeart
532171266b Add missing isinstance for arg checks in GGUFParameter (#10834) 2025-02-20 12:49:51 +05:30
Sayak Paul
f550745a2b [Utils] add utilities for checking if certain utilities are properly documented (#7763)
* add; utility to check if attn_procs,norms,acts are properly documented.

* add support listing to the workflows.

* change to 2024.

* small fixes.

* does adding detailed docstrings help?

* uncomment image processor check

* quality

* fix, thanks to @mishig.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* style

* JointAttnProcessor2_0

* fixes

* fixes

* fixes

* fixes

* fixes

* fixes

* Update docs/source/en/api/normalization.md

Co-authored-by: hlky <hlky@hlky.ac>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-02-20 12:37:00 +05:30
Sayak Paul
f10d3c6d04 [LoRA] add LoRA support to Lumina2 and fine-tuning script (#10818)
* feat: lora support for Lumina2.

* fix-copies.

* updates

* updates

* docs.

* fix

* add: training script.

* tests

* updates

* updates

* major updates.

* updates

* fixes

* docs.

* updates

* updates
2025-02-20 09:41:51 +05:30
Sayak Paul
0fb7068364 [tests] use proper gemma class and config in lumina2 tests. (#10828)
use proper gemma class and config in lumina2 tests.
2025-02-20 09:27:07 +05:30
Aryan
f8b54cf037 Remove print statements (#10836)
remove prints
2025-02-19 17:21:07 -10:00
Sayak Paul
680a8ed855 [misc] feat: introduce a style bot. (#10274)
* feat: introduce a style bot.

* updates

* Apply suggestions from code review

Co-authored-by: Guillaume LEGENDRE <glegendre01@gmail.com>

* apply suggestion

* fixes

* updates

---------

Co-authored-by: Guillaume LEGENDRE <glegendre01@gmail.com>
2025-02-19 20:49:10 +05:30
Marc Sun
f5929e0306 [FEAT] Model loading refactor (#10604)
* first draft model loading refactor

* revert name change

* fix bnb

* revert name

* fix dduf

* fix huanyan

* style

* Update src/diffusers/models/model_loading_utils.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* suggestions from reviews

* Update src/diffusers/models/modeling_utils.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* remove safetensors check

* fix default value

* more fix from suggestions

* revert logic for single file

* style

* typing + fix couple of issues

* improve speed

* Update src/diffusers/models/modeling_utils.py

Co-authored-by: Aryan <aryan@huggingface.co>

* fp8 dtype

* add tests

* rename resolved_archive_file to resolved_model_file

* format

* map_location default cpu

* add utility function

* switch to smaller model + test inference

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* rm comment

* add log

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add decorator

* cosine sim instead

* fix use_keep_in_fp32_modules

* comm

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-02-19 17:34:53 +05:30
Sayak Paul
6fe05b9b93 [LoRA] make set_adapters() robust on silent failures. (#9618)
* make set_adapters() robust on silent failures.

* fixes to tests

* flaky decorator.

* fix

* flaky to sd3.

* remove warning.

* sort

* quality

* skip test_simple_inference_with_text_denoiser_multi_adapter_block_lora

* skip testing unsupported features.

* raise warning instead of error.
2025-02-19 14:33:57 +05:30
hlky
2bc82d6381 DiffusionPipeline mixin to+FromOriginalModelMixin/FromSingleFileMixin from_single_file type hint (#10811)
* DiffusionPipeline mixin `to` type hint

* FromOriginalModelMixin from_single_file

* FromSingleFileMixin from_single_file
2025-02-19 07:23:40 +00:00
Sayak Paul
924f880d4d [docs] add missing entries to the lora docs. (#10819)
add missing entries to the lora docs.
2025-02-18 09:10:18 -08:00
puhuk
b75b204a58 Fix max_shift value in flux and related functions to 1.15 (issue #10675) (#10807)
This PR updates the max_shift value in flux to 1.15 for consistency across the codebase. In addition to modifying max_shift in flux, all related functions that copy and use this logic, such as calculate_shift in `src/diffusers/pipelines/stable_diffusion_3/pipeline_stable_diffusion_3_img2img.py`, have also been updated to ensure uniform behavior.
2025-02-18 06:54:56 +00:00
Sayak Paul
c14057c8db [LoRA] improve lora support for flux. (#10810)
update lora support for flux.
2025-02-17 19:04:48 +05:30
Sayak Paul
3579cd2bb7 [chore] update notes generation spaces (#10592)
fix
2025-02-17 09:26:15 +05:30
Parag Ekbote
3e99b5677e Extend Support for callback_on_step_end for AuraFlow and LuminaText2Img Pipelines (#10746)
* Add support for callback_on_step_end for
AuraFlowPipeline and LuminaText2ImgPipeline.

* Apply the suggestions from code review for lumina and auraflow

Co-authored-by: hlky <hlky@hlky.ac>

* Update missing inputs and imports.

* Add input field.

* Apply suggestions from code review-2

Co-authored-by: hlky <hlky@hlky.ac>

* Apply the suggestions from review for unused imports.

Co-authored-by: hlky <hlky@hlky.ac>

* make style.

* Update pipeline_aura_flow.py

* Update pipeline_lumina.py

* Update pipeline_lumina.py

* Update pipeline_aura_flow.py

* Update pipeline_lumina.py

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-02-16 17:28:57 +00:00
Yaniv Galron
952b9131a2 typo fix (#10802) 2025-02-16 20:56:54 +05:30
Yuxuan Zhang
d90cd3621d CogView4 (supports different length c and uc) (#10649)
* init

* encode with glm

* draft schedule

* feat(scheduler): Add CogView scheduler implementation

* feat(embeddings): add CogView 2D rotary positional embedding

* 1

* Update pipeline_cogview4.py

* fix the timestep init and sigma

* update latent

* draft patch(not work)

* fix

* [WIP][cogview4]: implement initial CogView4 pipeline

Implement the basic CogView4 pipeline structure with the following changes:
- Add CogView4 pipeline implementation
- Implement DDIM scheduler for CogView4
- Add CogView3Plus transformer architecture
- Update embedding models

Current limitations:
- CFG implementation uses padding for sequence length alignment
- Need to verify transformer inference alignment with Megatron

TODO:
- Consider separate forward passes for condition/uncondition
  instead of padding approach

* [WIP][cogview4][refactor]: Split condition/uncondition forward pass in CogView4 pipeline

Split the forward pass for conditional and unconditional predictions in the CogView4 pipeline to match the original implementation. The noise prediction is now done separately for each case before combining them for guidance. However, the results still need improvement.

This is a work in progress as the generated images are not yet matching expected quality.

* use with -2 hidden state

* remove text_projector

* 1

* [WIP] Add tensor-reload to align input from transformer block

* [WIP] for older glm

* use with cogview4 transformers forward twice of u and uc

* Update convert_cogview4_to_diffusers.py

* remove this

* use main example

* change back

* reset

* setback

* back

* back 4

* Fix qkv conversion logic for CogView4 to Diffusers format

* back5

* revert to sat to cogview4 version

* update a new convert from megatron

* [WIP][cogview4]: implement CogView4 attention processor

Add CogView4AttnProcessor class for implementing scaled dot-product attention
with rotary embeddings for the CogVideoX model. This processor concatenates
encoder and hidden states, applies QKV projections and RoPE, but does not
include spatial normalization.

TODO:
- Fix incorrect QKV projection weights
- Resolve ~25% error in RoPE implementation compared to Megatron

* [cogview4] implement CogView4 transformer block

Implement CogView4 transformer block following the Megatron architecture:
- Add multi-modulate and multi-gate mechanisms for adaptive layer normalization
- Implement dual-stream attention with encoder-decoder structure
- Add feed-forward network with GELU activation
- Support rotary position embeddings for image tokens

The implementation follows the original CogView4 architecture while adapting
it to work within the diffusers framework.

* with new attn

* [bugfix] fix dimension mismatch in CogView4 attention

* [cogview4][WIP]: update final normalization in CogView4 transformer

Refactored the final normalization layer in CogView4 transformer to use separate layernorm and AdaLN operations instead of combined AdaLayerNormContinuous. This matches the original implementation but needs validation.

Needs verification against reference implementation.

* 1

* put back

* Update transformer_cogview4.py

* change time_shift

* Update pipeline_cogview4.py

* change timesteps

* fix

* change text_encoder_id

* [cogview4][rope] align RoPE implementation with Megatron

- Implement apply_rope method in attention processor to match Megatron's implementation
- Update position embeddings to ensure compatibility with Megatron-style rotary embeddings
- Ensure consistent rotary position encoding across attention layers

This change improves compatibility with Megatron-based models and provides
better alignment with the original implementation's positional encoding approach.

* [cogview4][bugfix] apply silu activation to time embeddings in CogView4

Applied silu activation to time embeddings before splitting into conditional
and unconditional parts in CogView4Transformer2DModel. This matches the
original implementation and helps ensure correct time conditioning behavior.

* [cogview4][chore] clean up pipeline code

- Remove commented out code and debug statements
- Remove unused retrieve_timesteps function
- Clean up code formatting and documentation

This commit focuses on code cleanup in the CogView4 pipeline implementation, removing unnecessary commented code and improving readability without changing functionality.

* [cogview4][scheduler] Implement CogView4 scheduler and pipeline

* now It work

* add timestep

* batch

* change convert scipt

* refactor pt. 1; make style

* refactor pt. 2

* refactor pt. 3

* add tests

* make fix-copies

* update toctree.yml

* use flow match scheduler instead of custom

* remove scheduling_cogview.py

* add tiktoken to test dependencies

* Update src/diffusers/models/embeddings.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* apply suggestions from review

* use diffusers apply_rotary_emb

* update flow match scheduler to accept timesteps

* fix comment

* apply review sugestions

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: 三洋三洋 <1258009915@qq.com>
Co-authored-by: OleehyO <leehy0357@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2025-02-15 21:46:48 +05:30
YiYi Xu
69f919d8b5 follow-up refactor on lumina2 (#10776)
* up
2025-02-14 14:57:27 -10:00
SahilCarterr
a6b843a797 [FIX] check_inputs function in lumina2 (#10784) 2025-02-14 10:55:11 -10:00
puhuk
27b90235e4 Update Custom Diffusion Documentation for Multiple Concept Inference to resolve issue #10791 (#10792)
Update Custom Diffusion Documentation for Multiple Concept Inference

This PR updates the Custom Diffusion documentation to correctly demonstrate multiple concept inference by:

- Initializing the pipeline from a proper foundation model (e.g., "CompVis/stable-diffusion-v1-4") instead of a fine-tuned model.
- Defining model_id explicitly to avoid NameError.
- Correcting method calls for loading attention processors and textual inversion embeddings.
2025-02-14 08:19:11 -08:00
Aryan
9a147b82f7 Module Group Offloading (#10503)
* update

* fix

* non_blocking; handle parameters and buffers

* update

* Group offloading with cuda stream prefetching (#10516)

* cuda stream prefetch

* remove breakpoints

* update

* copy model hook implementation from pab

* update; ~very workaround based implementation but it seems to work as expected; needs cleanup and rewrite

* more workarounds to make it actually work

* cleanup

* rewrite

* update

* make sure to sync current stream before overwriting with pinned params

not doing so will lead to erroneous computations on the GPU and cause bad results

* better check

* update

* remove hook implementation to not deal with merge conflict

* re-add hook changes

* why use more memory when less memory do trick

* why still use slightly more memory when less memory do trick

* optimise

* add model tests

* add pipeline tests

* update docs

* add layernorm and groupnorm

* address review comments

* improve tests; add docs

* improve docs

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggestions from code review

* update tests

* apply suggestions from review

* enable_group_offloading -> enable_group_offload for naming consistency

* raise errors if multiple offloading strategies used; add relevant tests

* handle .to() when group offload applied

* refactor some repeated code

* remove unintentional change from merge conflict

* handle .cuda()

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-02-14 12:59:45 +05:30
Aryan
ab428207a7 Refactor CogVideoX transformer forward (#10789)
update
2025-02-13 12:11:25 -10:00
Aryan
8d081de844 Update FlowMatch docstrings to mention correct output classes (#10788)
update
2025-02-14 02:29:16 +05:30
Aryan
a0c22997fd Disable PEFT input autocast when using fp8 layerwise casting (#10685)
* disable peft input autocast

* use new peft method name; only disable peft input autocast if submodule layerwise casting active

* add test; reference PeftInputAutocastDisableHook in peft docs

* add load_lora_weights test

* casted -> cast

* Update tests/lora/utils.py
2025-02-13 23:12:54 +05:30
Fanli Lin
97abdd2210 make tensors contiguous before passing to safetensors (#10761)
fix contiguous bug
2025-02-13 06:27:53 +00:00
Eliseu Silva
051ebc3c8d fix: [Community pipeline] Fix flattened elements on image (#10774)
* feat: new community mixture_tiling_sdxl pipeline for SDXL mixture-of-diffusers support

* fix use of variable latents to tile_latents

* removed references to modules that are not being used in this pipeline

* make style, make quality

* fixfeat: added _get_crops_coords_list function to pipeline to automatically define ctop,cleft coord to focus on image generation, helps to better harmonize the image and corrects the problem of flattened elements.
2025-02-12 19:50:41 -03:00
Daniel Regado
5105b5a83d MultiControlNetUnionModel on SDXL (#10747)
* SDXL with MultiControlNetUnionModel



---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-02-12 10:48:09 -10:00
hlky
ca6330dc53 Fix use_lu_lambdas and use_karras_sigmas with beta_schedule=squaredcos_cap_v2 in DPMSolverMultistepScheduler (#10740) 2025-02-12 20:33:56 +00:00
Dhruv Nair
28f48f4051 [Single File] Add Single File support for Lumina Image 2.0 Transformer (#10781)
* update

* update
2025-02-12 18:53:49 +05:30
Thanh Le
067eab1b3a Faster set_adapters (#10777)
* Update peft_utils.py

* Update peft_utils.py

* Update peft_utils.py

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-02-12 16:30:09 +05:30
Aryan
57ac673802 Refactor OmniGen (#10771)
* OmniGen model.py

* update OmniGenTransformerModel

* omnigen pipeline

* omnigen pipeline

* update omnigen_pipeline

* test case for omnigen

* update omnigenpipeline

* update docs

* update docs

* offload_transformer

* enable_transformer_block_cpu_offload

* update docs

* reformat

* reformat

* reformat

* update docs

* update docs

* make style

* make style

* Update docs/source/en/api/models/omnigen_transformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* revert changes to examples/

* update OmniGen2DModel

* make style

* update test cases

* Update docs/source/en/api/pipelines/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* typo

* Update src/diffusers/models/embeddings.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/attention.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/omnigen/test_pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/omnigen/test_pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* consistent attention processor

* updata

* update

* check_inputs

* make style

* update testpipeline

* update testpipeline

* refactor omnigen

* more updates

* apply review suggestion

---------

Co-authored-by: shitao <2906698981@qq.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-02-12 14:06:14 +05:30
Le Zhuo
81440fd474 Add support for lumina2 (#10642)
* Add support for lumina2


---------

Co-authored-by: csuhan <hanjiaming@whu.edu.cn>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
Co-authored-by: hlky <hlky@hlky.ac>
2025-02-11 11:38:33 -10:00
Eliseu Silva
c470274865 feat: new community mixture_tiling_sdxl pipeline for SDXL (#10759)
* feat: new community mixture_tiling_sdxl pipeline for SDXL mixture-of-diffusers support

* fix use of variable latents to tile_latents

* removed references to modules that are not being used in this pipeline

* make style, make quality
2025-02-11 18:01:42 -03:00
Shitao Xiao
798e17187d Add OmniGen (#10148)
* OmniGen model.py

* update OmniGenTransformerModel

* omnigen pipeline

* omnigen pipeline

* update omnigen_pipeline

* test case for omnigen

* update omnigenpipeline

* update docs

* update docs

* offload_transformer

* enable_transformer_block_cpu_offload

* update docs

* reformat

* reformat

* reformat

* update docs

* update docs

* make style

* make style

* Update docs/source/en/api/models/omnigen_transformer.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* revert changes to examples/

* update OmniGen2DModel

* make style

* update test cases

* Update docs/source/en/api/pipelines/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/omnigen.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* typo

* Update src/diffusers/models/embeddings.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/attention.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/models/transformers/transformer_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/omnigen/test_pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update tests/pipelines/omnigen/test_pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* Update src/diffusers/pipelines/omnigen/pipeline_omnigen.py

Co-authored-by: hlky <hlky@hlky.ac>

* consistent attention processor

* updata

* update

* check_inputs

* make style

* update testpipeline

* update testpipeline

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
Co-authored-by: Aryan <aryan@huggingface.co>
2025-02-12 02:16:38 +05:30
Dhruv Nair
ed4b75229f [CI] Fix Truffle Hog failure (#10769)
* update

* update
2025-02-11 22:41:03 +05:30
Mathias Parger
8ae8008b0d speedup hunyuan encoder causal mask generation (#10764)
* speedup causal mask generation

* fixing hunyuan attn mask test case
2025-02-11 16:03:15 +05:30
Sayak Paul
c80eda9d3e [Tests] Test layerwise casting with training (#10765)
* add a test to check if we can train with layerwise casting.

* updates

* updates

* style
2025-02-11 16:02:28 +05:30
hlky
7fb481f840 Add Self type hint to ModelMixin's from_pretrained (#10742) 2025-02-10 09:17:57 -10:00
Sayak Paul
9f5ad1db41 [LoRA] fix peft state dict parsing (#10532)
* fix peft state dict parsing

* updates
2025-02-10 18:47:20 +05:30
hlky
464374fb87 EDMEulerScheduler accept sigmas, add final_sigmas_type (#10734) 2025-02-07 06:53:52 +00:00
hlky
d43ce14e2d Quantized Flux with IP-Adapter (#10728) 2025-02-06 07:02:36 -10:00
Leo Jiang
cd0a4a82cf [bugfix] NPU Adaption for Sana (#10724)
* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* NPU Adaption for Sanna

* [bugfix]NPU Adaption for Sanna

---------

Co-authored-by: J石页 <jiangshuo9@h-partners.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-02-06 19:29:58 +05:30
suzukimain
145522cbb7 [Community] Enhanced Model Search (#10417)
* Added `auto_load_textual_inversion` and `auto_load_lora_weights`

* update README.md

* fix

* make quality

* Fix and `make style`
2025-02-05 14:43:53 -10:00
xieofxie
23bc56a02d add provider_options in from_pretrained (#10719)
Co-authored-by: hualxie <hualxie@microsoft.com>
2025-02-05 09:41:41 -10:00
SahilCarterr
5b1dcd1584 [Fix] Type Hint in from_pretrained() to Ensure Correct Type Inference (#10714)
* Update pipeline_utils.py

Added Self in from_pretrained method so  inference will correctly recognize pipeline

* Use typing_extensions

---------

Co-authored-by: hlky <hlky@hlky.ac>
2025-02-04 08:59:31 -10:00
Parag Ekbote
dbe0094e86 Notebooks for Community Scripts-6 (#10713)
* Fix Doc Tutorial.

* Add 4 Notebooks and improve their example
scripts.
2025-02-04 10:12:17 -08:00
Nicolas
f63d32233f Fix train_text_to_image.py --help (#10711) 2025-02-04 11:26:23 +05:30
Sayak Paul
5e8e6cb44f [bitsandbytes] Simplify bnb int8 dequant (#10401)
* fix dequantization for latest bnb.

* smol fixes.

* fix type annotation

* update peft link

* updates
2025-02-04 11:17:14 +05:30
yiyixuxu
12650e1393 up 2025-02-04 02:08:28 +01:00
yiyixuxu
addaad013c more more more refactor 2025-02-03 20:36:05 +01:00
Parag Ekbote
3e35f56b00 Fix Documentation about Image-to-Image Pipeline (#10704)
Fix Doc Tutorial.
2025-02-03 09:54:00 -08:00
Ikpreet S Babra
537891e693 Fixed grammar in "write_own_pipeline" readme (#10706) 2025-02-03 09:53:30 -08:00
yiyixuxu
485f8d1758 more refactor 2025-02-01 21:30:05 +01:00
Vedat Baday
9f28f1abba feat(training-utils): support device and dtype params in compute_density_for_timestep_sampling (#10699)
* feat(training-utils): support device and dtype params in compute_density_for_timestep_sampling

* chore: update type hint

* refactor: use union for type hint

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-02-01 23:04:05 +05:30
yiyixuxu
cff0fd6260 more refactor 2025-02-01 11:36:13 +01:00
yiyixuxu
8ddb20bfb8 up 2025-02-01 05:45:00 +01:00
yiyixuxu
e5089d702b update 2025-01-31 21:55:45 +01:00
Thanh Le
5d2d23986e Fix inconsistent random transform in instruct pix2pix (#10698)
* Update train_instruct_pix2pix.py

Fix inconsistent random transform in instruct_pix2pix

* Update train_instruct_pix2pix_sdxl.py
2025-01-31 08:29:29 -10:00
Max Podkorytov
1ae9b0595f Fix enable memory efficient attention on ROCm (#10564)
* fix enable memory efficient attention on ROCm

while calling CK implementation

* Update attention_processor.py

refactor of picking a set element
2025-01-31 17:15:49 +05:30
SahilCarterr
aad69ac2f3 [FIX] check_inputs function in Auraflow Pipeline (#10678)
fix_shape_error
2025-01-29 13:11:54 -10:00
Vedat Baday
ea76880bd7 fix(hunyuan-video): typo in height and width input check (#10684) 2025-01-30 04:16:05 +05:30
Teriks
33f936154d support StableDiffusionAdapterPipeline.from_single_file (#10552)
* support StableDiffusionAdapterPipeline.from_single_file

* make style

---------

Co-authored-by: Teriks <Teriks@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-01-29 07:18:47 -10:00
yiyixuxu
2c3e4eafa8 fix 2025-01-29 17:58:40 +01:00
Sayak Paul
e6037e8275 [tests] update llamatokenizer in hunyuanvideo tests (#10681)
update llamatokenizer in hunyuanvideo tests
2025-01-29 21:12:57 +05:30
Dimitri Barbot
196aef5a6f Fix pipeline dtype unexpected change when using SDXL reference community pipelines in float16 mode (#10670)
Fix pipeline dtype unexpected change when using SDXL reference community pipelines
2025-01-28 10:46:41 -03:00
Sayak Paul
7b100ce589 [Tests] conditionally check fp8_e4m3_bf16_max_memory < fp8_e4m3_fp32_max_memory (#10669)
* conditionally check if compute capability is met.

* log info.

* fix condition.

* updates

* updates

* updates

* updates
2025-01-28 12:00:14 +05:30
Aryan
c4d4ac21e7 Refactor gradient checkpointing (#10611)
* update

* remove unused fn

* apply suggestions based on review

* update + cleanup 🧹

* more cleanup 🧹

* make fix-copies

* update test
2025-01-28 06:51:46 +05:30
Hanch Han
f295e2eefc [fix] refer use_framewise_encoding on AutoencoderKLHunyuanVideo._encode (#10600)
* fix: refer to use_framewise_encoding on AutoencoderKLHunyuanVideo._encode

* fix: comment about tile_sample_min_num_frames

---------

Co-authored-by: Aryan <aryan@huggingface.co>
2025-01-28 06:51:27 +05:30
Aryan
658e24e86c [core] Pyramid Attention Broadcast (#9562)
* start pyramid attention broadcast

* add coauthor

Co-Authored-By: Xuanlei Zhao <43881818+oahzxl@users.noreply.github.com>

* update

* make style

* update

* make style

* add docs

* add tests

* update

* Update docs/source/en/api/pipelines/cogvideox.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/cogvideox.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Pyramid Attention Broadcast rewrite + introduce hooks (#9826)

* rewrite implementation with hooks

* make style

* update

* merge pyramid-attention-rewrite-2

* make style

* remove changes from latte transformer

* revert docs changes

* better debug message

* add todos for future

* update tests

* make style

* cleanup

* fix

* improve log message; fix latte test

* refactor

* update

* update

* update

* revert changes to tests

* update docs

* update tests

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update

* fix flux test

* reorder

* refactor

* make fix-copies

* update docs

* fixes

* more fixes

* make style

* update tests

* update code example

* make fix-copies

* refactor based on reviews

* use maybe_free_model_hooks

* CacheMixin

* make style

* update

* add current_timestep property; update docs

* make fix-copies

* update

* improve tests

* try circular import fix

* apply suggestions from review

* address review comments

* Apply suggestions from code review

* refactor hook implementation

* add test suite for hooks

* PAB Refactor (#10667)

* update

* update

* update

---------

Co-authored-by: DN6 <dhruv.nair@gmail.com>

* update

* fix remove hook behaviour

---------

Co-authored-by: Xuanlei Zhao <43881818+oahzxl@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: DN6 <dhruv.nair@gmail.com>
2025-01-28 05:09:04 +05:30
Giuseppe Catalano
fb42066489 Revert RePaint scheduler 'fix' (#10644)
Co-authored-by: Giuseppe Catalano <giuseppelorenzo.catalano@unito.it>
2025-01-27 11:16:45 -10:00
Teriks
e89ab5bc26 SDXL ControlNet Union pipelines, make control_image argument immutible (#10663)
controlnet union XL, make control_image immutible

when this argument is passed a list, __call__
modifies its content, since it is pass by reference
the list passed by the caller gets its content
modified unexpectedly

make a copy at method intro so this does not happen

Co-authored-by: Teriks <Teriks@users.noreply.github.com>
2025-01-27 10:53:30 -10:00
victolee0
8ceec90d76 fix check_inputs func in LuminaText2ImgPipeline (#10651) 2025-01-27 09:47:01 -10:00
hlky
158c5c4d08 Add provider_options to OnnxRuntimeModel (#10661) 2025-01-27 09:46:17 -10:00
hlky
41571773d9 [training] Convert to ImageFolder script (#10664)
* [training] Convert to ImageFolder script

* make
2025-01-27 09:43:51 -10:00
hlky
18f7d1d937 ControlNet Union controlnet_conditioning_scale for multiple control inputs (#10666) 2025-01-27 08:15:25 -10:00
Marlon May
f7f36c7d3d Add community pipeline for semantic guidance for FLUX (#10610)
* add community pipeline for semantic guidance for flux

* fix imports in community pipeline for semantic guidance for flux

* Update examples/community/pipeline_flux_semantic_guidance.py

Co-authored-by: hlky <hlky@hlky.ac>

* fix community pipeline for semantic guidance for flux

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
Co-authored-by: hlky <hlky@hlky.ac>
2025-01-27 16:19:46 +02:00
Yuqian Hong
4fa24591a3 create a script to train autoencoderkl (#10605)
* create a script to train vae

* update main.py

* update train_autoencoderkl.py

* update train_autoencoderkl.py

* add a check of --pretrained_model_name_or_path and --model_config_name_or_path

* remove the comment, remove diffusers in requiremnets.txt, add validation_image ote

* update autoencoderkl.py

* quality

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-01-27 16:41:34 +05:30
yiyixuxu
c7020df2cf add model_info 2025-01-27 11:33:27 +01:00
Jacob Helwig
4f3ec5364e Add sigmoid scheduler in scheduling_ddpm.py docs (#10648)
Sigmoid scheduler in scheduling_ddpm.py docs
2025-01-26 15:37:20 -08:00
yiyixuxu
4bed3e306e up up 2025-01-26 13:04:33 +01:00
Leo Jiang
07860f9916 NPU Adaption for Sanna (#10409)
* NPU Adaption for Sanna


---------

Co-authored-by: J石页 <jiangshuo9@h-partners.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-01-24 09:08:52 -10:00
Wenhao Sun
87252d80c3 Add pipeline_stable_diffusion_xl_attentive_eraser (#10579)
* add pipeline_stable_diffusion_xl_attentive_eraser

* add pipeline_stable_diffusion_xl_attentive_eraser_make_style

* make style and add example output

* update Docs

Co-authored-by: Other Contributor <a457435687@126.com>

* add Oral

Co-authored-by: Other Contributor <a457435687@126.com>

* update_review

Co-authored-by: Other Contributor <a457435687@126.com>

* update_review_ms

Co-authored-by: Other Contributor <a457435687@126.com>

---------

Co-authored-by: Other Contributor <a457435687@126.com>
2025-01-24 13:52:45 +00:00
Sayak Paul
5897137397 [chore] add a script to extract loras from full fine-tuned models (#10631)
* feat: add a lora extraction script.

* updates
2025-01-24 11:50:36 +05:30
Yaniv Galron
a451c0ed14 removing redundant requires_grad = False (#10628)
We already set the unet to requires grad false at line 506

Co-authored-by: Aryan <aryan@huggingface.co>
2025-01-24 03:25:33 +05:30
yiyixuxu
00a3bc9d6c fix 2025-01-23 18:16:00 +01:00
YiYi Xu
ccb35acd81 Merge branch 'main' into modular-diffusers 2025-01-23 07:07:11 -10:00
hlky
37c9697f5b Add IP-Adapter example to Flux docs (#10633)
* Add IP-Adapter example to Flux docs

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-01-23 22:15:33 +05:30
Raul Ciotescu
9684c52adf width and height are mixed-up (#10629)
vars mixed-up
2025-01-23 06:40:22 -10:00
Steven Liu
5483162d12 [docs] uv installation (#10622)
* uv

* feedback
2025-01-23 08:34:51 -08:00
Sayak Paul
d77c53b6d2 [docs] fix image path in para attention docs (#10632)
fix image path in para attention docs
2025-01-23 08:22:42 -08:00
yiyixuxu
00cae4e857 docstring doc doc doc 2025-01-23 11:07:13 +01:00
Sayak Paul
78bc824729 [Tests] modify the test slices for the failing flax test (#10630)
* fixes

* fixes

* fixes

* updates
2025-01-23 12:10:24 +05:30
kahmed10
04d40920a7 add onnxruntime-migraphx as part of check for onnxruntime in import_utils.py (#10624)
add onnxruntime-migraphx to import_utils.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2025-01-23 07:49:51 +05:30
yiyixuxu
b3fb4188f5 Merge branch 'modular-diffusers' of github.com:huggingface/diffusers into modular-diffusers 2025-01-22 17:24:06 +01:00
YiYi Xu
71df1581f7 Update src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl_modular.py
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
2025-01-22 06:19:22 -10:00
Dhruv Nair
8d6f6d6b66 [CI] Update HF_TOKEN in all workflows (#10613)
update
2025-01-22 20:03:41 +05:30
Aryan
ca60ad8e55 Improve TorchAO error message (#10627)
improve error message
2025-01-22 19:50:02 +05:30
Aryan
beacaa5528 [core] Layerwise Upcasting (#10347)
* update

* update

* make style

* remove dynamo disable

* add coauthor

Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

* update

* update

* update

* update mixin

* add some basic tests

* update

* update

* non_blocking

* improvements

* update

* norm.* -> norm

* apply suggestions from review

* add example

* update hook implementation to the latest changes from pyramid attention broadcast

* deinitialize should raise an error

* update doc page

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update docs

* update

* refactor

* fix _always_upcast_modules for asym ae and vq_model

* fix lumina embedding forward to not depend on weight dtype

* refactor tests

* add simple lora inference tests

* _always_upcast_modules -> _precision_sensitive_module_patterns

* remove todo comments about review; revert changes to self.dtype in unets because .dtype on ModelMixin should be able to handle fp8 weight case

* check layer dtypes in lora test

* fix UNet1DModelTests::test_layerwise_upcasting_inference

* _precision_sensitive_module_patterns -> _skip_layerwise_casting_patterns based on feedback

* skip test in NCSNppModelTests

* skip tests for AutoencoderTinyTests

* skip tests for AutoencoderOobleckTests

* skip tests for UNet1DModelTests - unsupported pytorch operations

* layerwise_upcasting -> layerwise_casting

* skip tests for UNetRLModelTests; needs next pytorch release for currently unimplemented operation support

* add layerwise fp8 pipeline test

* use xfail

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* add assertion with fp32 comparison; add tolerance to fp8-fp32 vs fp32-fp32 comparison (required for a few models' test to pass)

* add note about memory consumption on tesla CI runner for failing test

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2025-01-22 19:49:37 +05:30
yiyixuxu
d046cf7d35 block state + fix for num_images_per_prompt > 1 for denoise/controlnet union etc 2025-01-22 09:48:57 +01:00
yiyixuxu
68a5185c86 refactor more, ipadapter node, lora node 2025-01-20 03:36:01 +01:00
yiyixuxu
6e2fe26bfd fix more for lora 2025-01-18 08:04:12 +01:00
yiyixuxu
77b5fa59c5 make it work with lora has both text_encoder & unet 2025-01-18 04:12:07 +01:00
yiyixuxu
a226920b52 get_block_state make it less verbose 2025-01-17 01:37:18 +01:00
yiyixuxu
7007f72409 InputParam, OutputParam, get_auto_doc 2025-01-16 11:44:24 +01:00
yiyixuxu
a6804de4a2 add controlnet union to auto & fix for pag 2025-01-12 16:24:01 +01:00
yiyixuxu
7f897a9fc4 fix 2025-01-12 04:50:45 +01:00
yiyixuxu
0966663d2a adjust print 2025-01-11 19:15:54 +01:00
yiyixuxu
fb78f4f12d Merge branch 'modular-diffusers' of github.com:huggingface/diffusers into modular-diffusers 2025-01-11 09:05:56 +01:00
yiyixuxu
2220af6940 refactor 2025-01-11 09:05:47 +01:00
hlky
7a34832d52 [modular] Stable Diffusion XL ControlNet Union (#10509)
StableDiffusionXLControlNetUnionDenoiseStep
2025-01-09 10:29:45 -10:00
yiyixuxu
e973de64f9 fix contro;net inpaint preprocess 2025-01-08 21:47:20 +01:00
yiyixuxu
db94ca882d add controlnet inpaint + more refactor 2025-01-07 20:49:58 +01:00
yiyixuxu
6985906a2e controlnet input & remove the MultiPipelineBlocks class 2025-01-07 01:56:33 +01:00
yiyixuxu
54f410db6c add inpaint 2025-01-06 09:19:59 +01:00
yiyixuxu
c12a05b9c1 update to to not assume pipeline has hf_device_map 2025-01-03 20:57:44 +01:00
yiyixuxu
2e0f5c86cc start to add inpaint 2025-01-03 18:20:39 +01:00
yiyixuxu
1d63306295 make it work with lora 2025-01-03 06:07:25 +01:00
yiyixuxu
6c93626f6f remove run_blocks, just use __call__ 2025-01-02 00:59:12 +01:00
yiyixuxu
72c5bf07c8 add a from_block class method to modular pipeline 2025-01-02 00:49:34 +01:00
yiyixuxu
ed59f90f15 modular pipeline builder -> ModularPipeline 2025-01-01 22:15:48 +01:00
yiyixuxu
a09ca7f27e refactors: block __init__ no longer accept args. remove update_states from pipeline blocks, add update_states to modularpipeline, remove multi-block support for modular pipeline, remove offload support on modular pipeline 2025-01-01 21:43:20 +01:00
yiyixuxu
8c02572e16 add memory_reserve_margin arg to auto offload 2024-12-31 20:08:53 +01:00
yiyixuxu
27dde51de8 add output arg to run_blocks 2024-12-31 18:06:44 +01:00
yiyixuxu
10d4a775f1 style 2024-12-31 09:55:50 +01:00
yiyixuxu
72d9a81d99 components manager 2024-12-31 09:54:46 +01:00
yiyixuxu
4fa85c7963 add model_manager and global offloading method 2024-12-31 02:57:42 +01:00
YiYi Xu
806e8e66fb Merge branch 'main' into modular-diffusers 2024-12-29 00:44:43 -10:00
yiyixuxu
0b90051db8 add vae encoder node 2024-12-19 17:57:12 +01:00
yiyixuxu
b305c779b2 add offload support! 2024-12-14 21:37:21 +01:00
yiyixuxu
2b3cd2d39c update 2024-12-14 03:02:31 +01:00
yiyixuxu
bc3d1c9ee6 add model_cpu_offload_seq + _exlude_from_cpu_offload 2024-12-14 00:24:15 +01:00
yiyixuxu
e50d614636 only add model as expected_component when the model need to run for the block, currently it's added even when only config is needed 2024-12-11 03:39:39 +01:00
hlky
a8df0f1ffb Modular APG (#10173) 2024-12-10 08:22:42 -10:00
yiyixuxu
ace53e2d2f update/refactor 2024-12-10 03:41:28 +01:00
yiyixuxu
ffc2992fc2 add autostep (not complete) 2024-11-16 22:42:06 +01:00
yiyixuxu
c70a285c2c style 2024-10-30 10:33:25 +01:00
yiyixuxu
8b811feece refactor, from_pretrained, from_pipe, remove_blocks, replace_blocks 2024-10-30 10:13:03 +01:00
yiyixuxu
37e8dc7a59 remove img2img blocksgit status consolidate text2img and img2img 2024-10-28 00:37:48 +01:00
yiyixuxu
024a9f5de3 fix so that run_blocks can work with inputs in the state 2024-10-27 18:52:56 +01:00
yiyixuxu
005195c23e add 2024-10-27 15:18:10 +01:00
yiyixuxu
6742f160df up 2024-10-27 14:59:31 +01:00
yiyixuxu
540d303250 refactor guider 2024-10-26 21:17:06 +02:00
yiyixuxu
f1b3036ca1 update pag guider - draft 2024-10-24 00:14:59 +02:00
yiyixuxu
46ec1743a2 refactor guider, remove prepareguidance step to be combinedd into denoisestep 2024-10-23 21:42:40 +02:00
yiyixuxu
70272b1108 combine controlnetstep into contronetdesnoisestep 2024-10-20 19:45:00 +02:00
yiyixuxu
2b6dcbfa1d fix controlnet 2024-10-20 19:23:37 +02:00
yiyixuxu
af9572d759 controlnet 2024-10-19 12:36:12 +02:00
yiyixuxu
ddea157979 add from_pipe + run_blocks 2024-10-17 20:02:36 +02:00
yiyixuxu
ad3f9a26c0 update img2img, result match 2024-10-17 05:47:15 +02:00
yiyixuxu
e8d0980f9f add img2img support - output does not match with non-modular pipeline completely yet (look into later) 2024-10-16 20:56:39 +02:00
yiyixuxu
52a7f1cb97 add dataflow info for each block in builder _repr_ 2024-10-16 09:04:32 +02:00
yiyixuxu
33f85fadf6 add 2024-10-14 19:16:23 +02:00
1615 changed files with 154894 additions and 33482 deletions

View File

@@ -0,0 +1,38 @@
name: "\U0001F31F Remote VAE"
description: Feedback for remote VAE pilot
labels: [ "Remote VAE" ]
body:
- type: textarea
id: positive
validations:
required: true
attributes:
label: Did you like the remote VAE solution?
description: |
If you liked it, we would appreciate it if you could elaborate what you liked.
- type: textarea
id: feedback
validations:
required: true
attributes:
label: What can be improved about the current solution?
description: |
Let us know the things you would like to see improved. Note that we will work optimizing the solution once the pilot is over and we have usage.
- type: textarea
id: others
validations:
required: true
attributes:
label: What other VAEs you would like to see if the pilot goes well?
description: |
Provide a list of the VAEs you would like to see in the future if the pilot goes well.
- type: textarea
id: additional-info
attributes:
label: Notify the members of the team
description: |
Tag the following folks when submitting this feedback: @hlky @sayakpaul

View File

@@ -11,19 +11,20 @@ env:
HF_HOME: /mnt/cache
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
BASE_PATH: benchmark_outputs
jobs:
torch_pipelines_cuda_benchmark_tests:
torch_models_cuda_benchmark_tests:
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL_BENCHMARK }}
name: Torch Core Pipelines CUDA Benchmarking Tests
name: Torch Core Models CUDA Benchmarking Tests
strategy:
fail-fast: false
max-parallel: 1
runs-on:
group: aws-g6-4xlarge-plus
group: aws-g6e-4xlarge
container:
image: diffusers/diffusers-pytorch-compile-cuda
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host --gpus 0
steps:
- name: Checkout diffusers
@@ -35,26 +36,47 @@ jobs:
nvidia-smi
- name: Install dependencies
run: |
apt update
apt install -y libpq-dev postgresql-client
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install pandas peft
python -m uv pip install -r benchmarks/requirements.txt
- name: Environment
run: |
python utils/print_env.py
- name: Diffusers Benchmarking
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_BOT_TOKEN }}
BASE_PATH: benchmark_outputs
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
run: |
export TOTAL_GPU_MEMORY=$(python -c "import torch; print(torch.cuda.get_device_properties(0).total_memory / (1024**3))")
cd benchmarks && mkdir ${BASE_PATH} && python run_all.py && python push_results.py
cd benchmarks && python run_all.py
- name: Push results to the Hub
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_BOT_TOKEN }}
run: |
cd benchmarks && python push_results.py
mkdir $BASE_PATH && cp *.csv $BASE_PATH
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: benchmark_test_reports
path: benchmarks/benchmark_outputs
path: benchmarks/${{ env.BASE_PATH }}
# TODO: enable this once the connection problem has been resolved.
- name: Update benchmarking results to DB
env:
PGDATABASE: metrics
PGHOST: ${{ secrets.DIFFUSERS_BENCHMARKS_PGHOST }}
PGUSER: transformers_benchmarks
PGPASSWORD: ${{ secrets.DIFFUSERS_BENCHMARKS_PGPASSWORD }}
BRANCH_NAME: ${{ github.head_ref || github.ref_name }}
run: |
git config --global --add safe.directory /__w/diffusers/diffusers
commit_id=$GITHUB_SHA
commit_msg=$(git show -s --format=%s "$commit_id" | cut -c1-70)
cd benchmarks && python populate_into_db.py "$BRANCH_NAME" "$commit_id" "$commit_msg"
- name: Report success status
if: ${{ success() }}

View File

@@ -38,9 +38,16 @@ jobs:
token: ${{ secrets.GITHUB_TOKEN }}
- name: Build Changed Docker Images
env:
CHANGED_FILES: ${{ steps.file_changes.outputs.all }}
run: |
CHANGED_FILES="${{ steps.file_changes.outputs.all }}"
for FILE in $CHANGED_FILES; do
echo "$CHANGED_FILES"
for FILE in $CHANGED_FILES; do
# skip anything that isn't still on disk
if [[ ! -f "$FILE" ]]; then
echo "Skipping removed file $FILE"
continue
fi
if [[ "$FILE" == docker/*Dockerfile ]]; then
DOCKER_PATH="${FILE%/Dockerfile}"
DOCKER_TAG=$(basename "$DOCKER_PATH")
@@ -65,13 +72,9 @@ jobs:
image-name:
- diffusers-pytorch-cpu
- diffusers-pytorch-cuda
- diffusers-pytorch-compile-cuda
- diffusers-pytorch-cuda
- diffusers-pytorch-xformers-cuda
- diffusers-pytorch-minimum-cuda
- diffusers-flax-cpu
- diffusers-flax-tpu
- diffusers-onnxruntime-cpu
- diffusers-onnxruntime-cuda
- diffusers-doc-builder
steps:

View File

@@ -13,8 +13,9 @@ env:
PYTEST_TIMEOUT: 600
RUN_SLOW: yes
RUN_NIGHTLY: yes
PIPELINE_USAGE_CUTOFF: 5000
PIPELINE_USAGE_CUTOFF: 0
SLACK_API_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
CONSOLIDATED_REPORT_PATH: consolidated_test_report.md
jobs:
setup_torch_cuda_pipeline_matrix:
@@ -99,11 +100,6 @@ jobs:
with:
name: pipeline_${{ matrix.module }}_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
run_nightly_tests_for_other_torch_modules:
name: Nightly Torch CUDA Tests
@@ -174,11 +170,48 @@ jobs:
name: torch_${{ matrix.module }}_cuda_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run_torch_compile_tests:
name: PyTorch Compile CUDA tests
runs-on:
group: aws-g4dn-2xlarge
container:
image: diffusers/diffusers-pytorch-cuda
options: --gpus 0 --shm-size "16gb" --ipc host
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: |
pip install slack_sdk tabulate
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test,training]
- name: Environment
run: |
python utils/print_env.py
- name: Run torch compile tests on GPU
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
RUN_COMPILE: yes
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "compile" --make-reports=tests_torch_compile_cuda tests/
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/tests_torch_compile_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: torch_compile_test_reports
path: reports
run_big_gpu_torch_tests:
name: Torch tests on big GPU
@@ -215,7 +248,7 @@ jobs:
BIG_GPU_MEMORY: 40
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-m "big_gpu_with_torch_cuda" \
-m "big_accelerator" \
--make-reports=tests_big_gpu_torch_cuda \
--report-log=tests_big_gpu_torch_cuda.log \
tests/
@@ -230,12 +263,7 @@ jobs:
with:
name: torch_cuda_big_gpu_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
torch_minimum_version_cuda_tests:
name: Torch Minimum Version CUDA Tests
runs-on:
@@ -265,7 +293,7 @@ jobs:
- name: Run PyTorch CUDA tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
@@ -292,132 +320,26 @@ jobs:
with:
name: torch_minimum_version_cuda_test_reports
path: reports
run_flax_tpu_tests:
name: Nightly Flax TPU Tests
runs-on:
group: gcp-ct5lp-hightpu-8t
if: github.event_name == 'schedule'
container:
image: diffusers/diffusers-flax-tpu
options: --shm-size "16gb" --ipc host --privileged ${{ vars.V5_LITEPOD_8_ENV}} -v /mnt/hf_cache:/mnt/hf_cache
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
- name: Environment
run: python utils/print_env.py
- name: Run nightly Flax TPU tests
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
run: |
python -m pytest -n 0 \
-s -v -k "Flax" \
--make-reports=tests_flax_tpu \
--report-log=tests_flax_tpu.log \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_flax_tpu_stats.txt
cat reports/tests_flax_tpu_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: flax_tpu_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
run_nightly_onnx_tests:
name: Nightly ONNXRuntime CUDA tests on Ubuntu
runs-on:
group: aws-g4dn-2xlarge
container:
image: diffusers/diffusers-onnxruntime-cuda
options: --gpus 0 --shm-size "16gb" --ipc host
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
- name: Environment
run: python utils/print_env.py
- name: Run Nightly ONNXRuntime CUDA tests
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_onnx_cuda \
--report-log=tests_onnx_cuda.log \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_onnx_cuda_stats.txt
cat reports/tests_onnx_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: tests_onnx_cuda_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
run_nightly_quantization_tests:
name: Torch quantization nightly tests
strategy:
fail-fast: false
max-parallel: 2
matrix:
matrix:
config:
- backend: "bitsandbytes"
test_location: "bnb"
additional_deps: ["peft"]
- backend: "gguf"
test_location: "gguf"
additional_deps: ["peft"]
- backend: "torchao"
test_location: "torchao"
additional_deps: []
- backend: "optimum_quanto"
test_location: "quanto"
additional_deps: []
runs-on:
group: aws-g6e-xlarge-plus
container:
@@ -435,6 +357,9 @@ jobs:
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install -U ${{ matrix.config.backend }}
if [ "${{ join(matrix.config.additional_deps, ' ') }}" != "" ]; then
python -m uv pip install ${{ join(matrix.config.additional_deps, ' ') }}
fi
python -m uv pip install pytest-reportlog
- name: Environment
run: |
@@ -461,11 +386,114 @@ jobs:
with:
name: torch_cuda_${{ matrix.config.backend }}_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run_nightly_pipeline_level_quantization_tests:
name: Torch quantization nightly tests
strategy:
fail-fast: false
max-parallel: 2
runs-on:
group: aws-g6e-xlarge-plus
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "20gb" --ipc host --gpus 0
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install -U bitsandbytes optimum_quanto
python -m uv pip install pytest-reportlog
- name: Environment
run: |
python utils/print_env.py
- name: Pipeline-level quantization tests on GPU
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
BIG_GPU_MEMORY: 40
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
--make-reports=tests_pipeline_level_quant_torch_cuda \
--report-log=tests_pipeline_level_quant_torch_cuda.log \
tests/quantization/test_pipeline_level_quantization.py
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_pipeline_level_quant_torch_cuda_stats.txt
cat reports/tests_pipeline_level_quant_torch_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: torch_cuda_pipeline_level_quant_reports
path: reports
generate_consolidated_report:
name: Generate Consolidated Test Report
needs: [
run_nightly_tests_for_torch_pipelines,
run_nightly_tests_for_other_torch_modules,
run_torch_compile_tests,
run_big_gpu_torch_tests,
run_nightly_quantization_tests,
run_nightly_pipeline_level_quantization_tests,
# run_nightly_onnx_tests,
torch_minimum_version_cuda_tests,
# run_flax_tpu_tests
]
if: always()
runs-on:
group: aws-general-8-plus
container:
image: diffusers/diffusers-pytorch-cpu
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Create reports directory
run: mkdir -p combined_reports
- name: Download all test reports
uses: actions/download-artifact@v4
with:
path: artifacts
- name: Prepare reports
run: |
# Move all report files to a single directory for processing
find artifacts -name "*.txt" -exec cp {} combined_reports/ \;
- name: Install dependencies
run: |
pip install -e .[test]
pip install slack_sdk tabulate
python utils/log_reports.py >> $GITHUB_STEP_SUMMARY
- name: Generate consolidated report
run: |
python utils/consolidated_test_report.py \
--reports_dir combined_reports \
--output_file $CONSOLIDATED_REPORT_PATH \
--slack_channel_name diffusers-ci-nightly
- name: Show consolidated report
run: |
cat $CONSOLIDATED_REPORT_PATH >> $GITHUB_STEP_SUMMARY
- name: Upload consolidated report
uses: actions/upload-artifact@v4
with:
name: consolidated_test_report
path: ${{ env.CONSOLIDATED_REPORT_PATH }}
# M1 runner currently not well supported
# TODO: (Dhruv) add these back when we setup better testing for Apple Silicon
@@ -505,7 +533,7 @@ jobs:
# shell: arch -arch arm64 bash {0}
# env:
# HF_HOME: /System/Volumes/Data/mnt/cache
# HF_TOKEN: ${{ secrets.HF_TOKEN }}
# HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
# run: |
# ${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps \
# --report-log=tests_torch_mps.log \
@@ -561,7 +589,7 @@ jobs:
# shell: arch -arch arm64 bash {0}
# env:
# HF_HOME: /System/Volumes/Data/mnt/cache
# HF_TOKEN: ${{ secrets.HF_TOKEN }}
# HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
# run: |
# ${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps \
# --report-log=tests_torch_mps.log \

17
.github/workflows/pr_style_bot.yml vendored Normal file
View File

@@ -0,0 +1,17 @@
name: PR Style Bot
on:
issue_comment:
types: [created]
permissions:
contents: write
pull-requests: write
jobs:
style:
uses: huggingface/huggingface_hub/.github/workflows/style-bot-action.yml@main
with:
python_quality_dependencies: "[quality]"
secrets:
bot_token: ${{ secrets.HF_STYLE_BOT_ACTION }}

View File

@@ -2,8 +2,7 @@ name: Fast tests for PRs
on:
pull_request:
branches:
- main
branches: [main]
paths:
- "src/diffusers/**.py"
- "benchmarks/**.py"
@@ -12,6 +11,7 @@ on:
- "tests/**.py"
- ".github/**.yml"
- "utils/**.py"
- "setup.py"
push:
branches:
- ci-*
@@ -64,6 +64,7 @@ jobs:
run: |
python utils/check_copies.py
python utils/check_dummies.py
python utils/check_support_list.py
make deps_table_check_updated
- name: Check if failure
if: ${{ failure() }}
@@ -86,11 +87,6 @@ jobs:
runner: aws-general-8-plus
image: diffusers/diffusers-pytorch-cpu
report: torch_cpu_models_schedulers
- name: Fast Flax CPU tests
framework: flax
runner: aws-general-8-plus
image: diffusers/diffusers-flax-cpu
report: flax_cpu
- name: PyTorch Example CPU tests
framework: pytorch_examples
runner: aws-general-8-plus
@@ -120,7 +116,8 @@ jobs:
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate
pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git --no-deps
- name: Environment
run: |
@@ -145,15 +142,6 @@ jobs:
--make-reports=tests_${{ matrix.config.report }} \
tests/models tests/schedulers tests/others
- name: Run fast Flax TPU tests
if: ${{ matrix.config.framework == 'flax' }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Flax" \
--make-reports=tests_${{ matrix.config.report }} \
tests
- name: Run example PyTorch CPU tests
if: ${{ matrix.config.framework == 'pytorch_examples' }}
run: |
@@ -289,8 +277,8 @@ jobs:
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_lora_failures_short.txt
cat reports/tests_models_lora_failures_short.txt
cat reports/tests_peft_main_failures_short.txt
cat reports/tests_models_lora_peft_main_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}

296
.github/workflows/pr_tests_gpu.yml vendored Normal file
View File

@@ -0,0 +1,296 @@
name: Fast GPU Tests on PR
on:
pull_request:
branches: main
paths:
- "src/diffusers/models/modeling_utils.py"
- "src/diffusers/models/model_loading_utils.py"
- "src/diffusers/pipelines/pipeline_utils.py"
- "src/diffusers/pipeline_loading_utils.py"
- "src/diffusers/loaders/lora_base.py"
- "src/diffusers/loaders/lora_pipeline.py"
- "src/diffusers/loaders/peft.py"
- "tests/pipelines/test_pipelines_common.py"
- "tests/models/test_modeling_common.py"
workflow_dispatch:
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
DIFFUSERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
HF_HUB_ENABLE_HF_TRANSFER: 1
PYTEST_TIMEOUT: 600
PIPELINE_USAGE_CUTOFF: 1000000000 # set high cutoff so that only always-test pipelines run
jobs:
check_code_quality:
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check quality
run: make quality
- name: Check if failure
if: ${{ failure() }}
run: |
echo "Quality check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make style && make quality'" >> $GITHUB_STEP_SUMMARY
check_repository_consistency:
needs: check_code_quality
runs-on: ubuntu-22.04
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check repo consistency
run: |
python utils/check_copies.py
python utils/check_dummies.py
python utils/check_support_list.py
make deps_table_check_updated
- name: Check if failure
if: ${{ failure() }}
run: |
echo "Repo consistency check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make fix-copies'" >> $GITHUB_STEP_SUMMARY
setup_torch_cuda_pipeline_matrix:
needs: [check_code_quality, check_repository_consistency]
name: Setup Torch Pipelines CUDA Slow Tests Matrix
runs-on:
group: aws-general-8-plus
container:
image: diffusers/diffusers-pytorch-cpu
outputs:
pipeline_test_matrix: ${{ steps.fetch_pipeline_matrix.outputs.pipeline_test_matrix }}
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
- name: Environment
run: |
python utils/print_env.py
- name: Fetch Pipeline Matrix
id: fetch_pipeline_matrix
run: |
matrix=$(python utils/fetch_torch_cuda_pipeline_test_matrix.py)
echo $matrix
echo "pipeline_test_matrix=$matrix" >> $GITHUB_OUTPUT
- name: Pipeline Tests Artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: test-pipelines.json
path: reports
torch_pipelines_cuda_tests:
name: Torch Pipelines CUDA Tests
needs: setup_torch_cuda_pipeline_matrix
strategy:
fail-fast: false
max-parallel: 8
matrix:
module: ${{ fromJson(needs.setup_torch_cuda_pipeline_matrix.outputs.pipeline_test_matrix) }}
runs-on:
group: aws-g4dn-2xlarge
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host --gpus 0
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
- name: Environment
run: |
python utils/print_env.py
- name: Extract tests
id: extract_tests
run: |
pattern=$(python utils/extract_tests_from_mixin.py --type pipeline)
echo "$pattern" > /tmp/test_pattern.txt
echo "pattern_file=/tmp/test_pattern.txt" >> $GITHUB_OUTPUT
- name: PyTorch CUDA checkpoint tests on Ubuntu
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
if [ "${{ matrix.module }}" = "ip_adapters" ]; then
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_pipeline_${{ matrix.module }}_cuda \
tests/pipelines/${{ matrix.module }}
else
pattern=$(cat ${{ steps.extract_tests.outputs.pattern_file }})
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx and $pattern" \
--make-reports=tests_pipeline_${{ matrix.module }}_cuda \
tests/pipelines/${{ matrix.module }}
fi
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_pipeline_${{ matrix.module }}_cuda_stats.txt
cat reports/tests_pipeline_${{ matrix.module }}_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: pipeline_${{ matrix.module }}_test_reports
path: reports
torch_cuda_tests:
name: Torch CUDA Tests
needs: [check_code_quality, check_repository_consistency]
runs-on:
group: aws-g4dn-2xlarge
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host --gpus 0
defaults:
run:
shell: bash
strategy:
fail-fast: false
max-parallel: 2
matrix:
module: [models, schedulers, lora, others]
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install peft@git+https://github.com/huggingface/peft.git
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
- name: Environment
run: |
python utils/print_env.py
- name: Extract tests
id: extract_tests
run: |
pattern=$(python utils/extract_tests_from_mixin.py --type ${{ matrix.module }})
echo "$pattern" > /tmp/test_pattern.txt
echo "pattern_file=/tmp/test_pattern.txt" >> $GITHUB_OUTPUT
- name: Run PyTorch CUDA tests
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
pattern=$(cat ${{ steps.extract_tests.outputs.pattern_file }})
if [ -z "$pattern" ]; then
python -m pytest -n 1 -sv --max-worker-restart=0 --dist=loadfile -k "not Flax and not Onnx" tests/${{ matrix.module }} \
--make-reports=tests_torch_cuda_${{ matrix.module }}
else
python -m pytest -n 1 -sv --max-worker-restart=0 --dist=loadfile -k "not Flax and not Onnx and $pattern" tests/${{ matrix.module }} \
--make-reports=tests_torch_cuda_${{ matrix.module }}
fi
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_torch_cuda_${{ matrix.module }}_stats.txt
cat reports/tests_torch_cuda_${{ matrix.module }}_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: torch_cuda_test_reports_${{ matrix.module }}
path: reports
run_examples_tests:
name: Examples PyTorch CUDA tests on Ubuntu
needs: [check_code_quality, check_repository_consistency]
runs-on:
group: aws-g4dn-2xlarge
container:
image: diffusers/diffusers-pytorch-cuda
options: --gpus 0 --shm-size "16gb" --ipc host
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
pip uninstall transformers -y && python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git --no-deps
python -m uv pip install -e [quality,test,training]
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run example tests on GPU
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install timm
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v --make-reports=examples_torch_cuda examples/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/examples_torch_cuda_stats.txt
cat reports/examples_torch_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: examples_test_reports
path: reports

View File

@@ -159,102 +159,6 @@ jobs:
name: torch_cuda_test_reports_${{ matrix.module }}
path: reports
flax_tpu_tests:
name: Flax TPU Tests
runs-on:
group: gcp-ct5lp-hightpu-8t
container:
image: diffusers/diffusers-flax-tpu
options: --shm-size "16gb" --ipc host --privileged ${{ vars.V5_LITEPOD_8_ENV}} -v /mnt/hf_cache:/mnt/hf_cache
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
python utils/print_env.py
- name: Run Flax TPU tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 0 \
-s -v -k "Flax" \
--make-reports=tests_flax_tpu \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_flax_tpu_stats.txt
cat reports/tests_flax_tpu_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: flax_tpu_test_reports
path: reports
onnx_cuda_tests:
name: ONNX CUDA Tests
runs-on:
group: aws-g4dn-2xlarge
container:
image: diffusers/diffusers-onnxruntime-cuda
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
python utils/print_env.py
- name: Run ONNXRuntime CUDA tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_onnx_cuda \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_onnx_cuda_stats.txt
cat reports/tests_onnx_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: onnx_cuda_test_reports
path: reports
run_torch_compile_tests:
name: PyTorch Compile CUDA tests
@@ -262,7 +166,7 @@ jobs:
group: aws-g4dn-2xlarge
container:
image: diffusers/diffusers-pytorch-compile-cuda
image: diffusers/diffusers-pytorch-cuda
options: --gpus 0 --shm-size "16gb" --ipc host
steps:
@@ -283,7 +187,7 @@ jobs:
python utils/print_env.py
- name: Run example tests on GPU
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
RUN_COMPILE: yes
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "compile" --make-reports=tests_torch_compile_cuda tests/
@@ -326,7 +230,7 @@ jobs:
python utils/print_env.py
- name: Run example tests on GPU
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "xformers" --make-reports=tests_torch_xformers_cuda tests/
- name: Failure short reports
@@ -349,7 +253,6 @@ jobs:
container:
image: diffusers/diffusers-pytorch-cuda
options: --gpus 0 --shm-size "16gb" --ipc host
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
@@ -359,7 +262,6 @@ jobs:
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
@@ -372,7 +274,7 @@ jobs:
- name: Run example tests on GPU
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install timm

View File

@@ -33,16 +33,6 @@ jobs:
runner: aws-general-8-plus
image: diffusers/diffusers-pytorch-cpu
report: torch_cpu
- name: Fast Flax CPU tests on Ubuntu
framework: flax
runner: aws-general-8-plus
image: diffusers/diffusers-flax-cpu
report: flax_cpu
- name: Fast ONNXRuntime CPU tests on Ubuntu
framework: onnxruntime
runner: aws-general-8-plus
image: diffusers/diffusers-onnxruntime-cpu
report: onnx_cpu
- name: PyTorch Example CPU tests on Ubuntu
framework: pytorch_examples
runner: aws-general-8-plus
@@ -87,24 +77,6 @@ jobs:
--make-reports=tests_${{ matrix.config.report }} \
tests/
- name: Run fast Flax TPU tests
if: ${{ matrix.config.framework == 'flax' }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Flax" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
- name: Run fast ONNXRuntime CPU tests
if: ${{ matrix.config.framework == 'onnxruntime' }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
- name: Run example PyTorch CPU tests
if: ${{ matrix.config.framework == 'pytorch_examples' }}
run: |

View File

@@ -1,12 +1,7 @@
name: Fast mps tests on main
on:
push:
branches:
- main
paths:
- "src/diffusers/**.py"
- "tests/**.py"
workflow_dispatch:
env:
DIFFUSERS_IS_CI: yes

View File

@@ -81,7 +81,7 @@ jobs:
python utils/print_env.py
- name: Slow PyTorch CUDA checkpoint tests on Ubuntu
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
@@ -135,7 +135,7 @@ jobs:
- name: Run PyTorch CUDA tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
@@ -186,7 +186,7 @@ jobs:
- name: Run PyTorch CUDA tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
@@ -213,101 +213,6 @@ jobs:
with:
name: torch_minimum_version_cuda_test_reports
path: reports
flax_tpu_tests:
name: Flax TPU Tests
runs-on: docker-tpu
container:
image: diffusers/diffusers-flax-tpu
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ --privileged
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
python utils/print_env.py
- name: Run slow Flax TPU tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 0 \
-s -v -k "Flax" \
--make-reports=tests_flax_tpu \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_flax_tpu_stats.txt
cat reports/tests_flax_tpu_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: flax_tpu_test_reports
path: reports
onnx_cuda_tests:
name: ONNX CUDA Tests
runs-on:
group: aws-g4dn-2xlarge
container:
image: diffusers/diffusers-onnxruntime-cuda
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
pip uninstall accelerate -y && python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
python utils/print_env.py
- name: Run slow ONNXRuntime CUDA tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_onnx_cuda \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_onnx_cuda_stats.txt
cat reports/tests_onnx_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v4
with:
name: onnx_cuda_test_reports
path: reports
run_torch_compile_tests:
name: PyTorch Compile CUDA tests
@@ -316,7 +221,7 @@ jobs:
group: aws-g4dn-2xlarge
container:
image: diffusers/diffusers-pytorch-compile-cuda
image: diffusers/diffusers-pytorch-cuda
options: --gpus 0 --shm-size "16gb" --ipc host
steps:
@@ -335,9 +240,9 @@ jobs:
- name: Environment
run: |
python utils/print_env.py
- name: Run example tests on GPU
- name: Run torch compile tests on GPU
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
RUN_COMPILE: yes
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "compile" --make-reports=tests_torch_compile_cuda tests/
@@ -380,7 +285,7 @@ jobs:
python utils/print_env.py
- name: Run example tests on GPU
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "xformers" --make-reports=tests_torch_xformers_cuda tests/
- name: Failure short reports
@@ -426,7 +331,7 @@ jobs:
- name: Run example tests on GPU
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_HF_HUB_READ_TOKEN }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install timm

View File

@@ -7,8 +7,8 @@ on:
default: 'diffusers/diffusers-pytorch-cuda'
description: 'Name of the Docker image'
required: true
branch:
description: 'PR Branch to test on'
pr_number:
description: 'PR number to test on'
required: true
test:
description: 'Tests to run (e.g.: `tests/models`).'
@@ -43,8 +43,8 @@ jobs:
exit 1
fi
if [[ ! "$PY_TEST" =~ ^tests/(models|pipelines) ]]; then
echo "Error: The input string must contain either 'models' or 'pipelines' after 'tests/'."
if [[ ! "$PY_TEST" =~ ^tests/(models|pipelines|lora) ]]; then
echo "Error: The input string must contain either 'models', 'pipelines', or 'lora' after 'tests/'."
exit 1
fi
@@ -53,13 +53,13 @@ jobs:
exit 1
fi
echo "$PY_TEST"
shell: bash -e {0}
- name: Checkout PR branch
uses: actions/checkout@v4
with:
ref: ${{ github.event.inputs.branch }}
repository: ${{ github.event.pull_request.head.repo.full_name }}
ref: refs/pull/${{ inputs.pr_number }}/head
- name: Install pytest
run: |

View File

@@ -13,3 +13,6 @@ jobs:
fetch-depth: 0
- name: Secret Scanning
uses: trufflesecurity/trufflehog@main
with:
extra_args: --results=verified,unknown

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

69
benchmarks/README.md Normal file
View File

@@ -0,0 +1,69 @@
# Diffusers Benchmarks
Welcome to Diffusers Benchmarks. These benchmarks are use to obtain latency and memory information of the most popular models across different scenarios such as:
* Base case i.e., when using `torch.bfloat16` and `torch.nn.functional.scaled_dot_product_attention`.
* Base + `torch.compile()`
* NF4 quantization
* Layerwise upcasting
Instead of full diffusion pipelines, only the forward pass of the respective model classes (such as `FluxTransformer2DModel`) is tested with the real checkpoints (such as `"black-forest-labs/FLUX.1-dev"`).
The entrypoint to running all the currently available benchmarks is in `run_all.py`. However, one can run the individual benchmarks, too, e.g., `python benchmarking_flux.py`. It should produce a CSV file containing various information about the benchmarks run.
The benchmarks are run on a weekly basis and the CI is defined in [benchmark.yml](../.github/workflows/benchmark.yml).
## Running the benchmarks manually
First set up `torch` and install `diffusers` from the root of the directory:
```py
pip install -e ".[quality,test]"
```
Then make sure the other dependencies are installed:
```sh
cd benchmarks/
pip install -r requirements.txt
```
We need to be authenticated to access some of the checkpoints used during benchmarking:
```sh
huggingface-cli login
```
We use an L40 GPU with 128GB RAM to run the benchmark CI. As such, the benchmarks are configured to run on NVIDIA GPUs. So, make sure you have access to a similar machine (or modify the benchmarking scripts accordingly).
Then you can either launch the entire benchmarking suite by running:
```sh
python run_all.py
```
Or, you can run the individual benchmarks.
## Customizing the benchmarks
We define "scenarios" to cover the most common ways in which these models are used. You can
define a new scenario, modifying an existing benchmark file:
```py
BenchmarkScenario(
name=f"{CKPT_ID}-bnb-8bit",
model_cls=FluxTransformer2DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
"quantization_config": BitsAndBytesConfig(load_in_8bit=True),
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=model_init_fn,
)
```
You can also configure a new model-level benchmark and add it to the existing suite. To do so, just defining a valid benchmarking file like `benchmarking_flux.py` should be enough.
Happy benchmarking 🧨

View File

@@ -1,346 +0,0 @@
import os
import sys
import torch
from diffusers import (
AutoPipelineForImage2Image,
AutoPipelineForInpainting,
AutoPipelineForText2Image,
ControlNetModel,
LCMScheduler,
StableDiffusionAdapterPipeline,
StableDiffusionControlNetPipeline,
StableDiffusionXLAdapterPipeline,
StableDiffusionXLControlNetPipeline,
T2IAdapter,
WuerstchenCombinedPipeline,
)
from diffusers.utils import load_image
sys.path.append(".")
from utils import ( # noqa: E402
BASE_PATH,
PROMPT,
BenchmarkInfo,
benchmark_fn,
bytes_to_giga_bytes,
flush,
generate_csv_dict,
write_to_csv,
)
RESOLUTION_MAPPING = {
"Lykon/DreamShaper": (512, 512),
"lllyasviel/sd-controlnet-canny": (512, 512),
"diffusers/controlnet-canny-sdxl-1.0": (1024, 1024),
"TencentARC/t2iadapter_canny_sd14v1": (512, 512),
"TencentARC/t2i-adapter-canny-sdxl-1.0": (1024, 1024),
"stabilityai/stable-diffusion-2-1": (768, 768),
"stabilityai/stable-diffusion-xl-base-1.0": (1024, 1024),
"stabilityai/stable-diffusion-xl-refiner-1.0": (1024, 1024),
"stabilityai/sdxl-turbo": (512, 512),
}
class BaseBenchmak:
pipeline_class = None
def __init__(self, args):
super().__init__()
def run_inference(self, args):
raise NotImplementedError
def benchmark(self, args):
raise NotImplementedError
def get_result_filepath(self, args):
pipeline_class_name = str(self.pipe.__class__.__name__)
name = (
args.ckpt.replace("/", "_")
+ "_"
+ pipeline_class_name
+ f"-bs@{args.batch_size}-steps@{args.num_inference_steps}-mco@{args.model_cpu_offload}-compile@{args.run_compile}.csv"
)
filepath = os.path.join(BASE_PATH, name)
return filepath
class TextToImageBenchmark(BaseBenchmak):
pipeline_class = AutoPipelineForText2Image
def __init__(self, args):
pipe = self.pipeline_class.from_pretrained(args.ckpt, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
if args.run_compile:
if not isinstance(pipe, WuerstchenCombinedPipeline):
pipe.unet.to(memory_format=torch.channels_last)
print("Run torch compile")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
if hasattr(pipe, "movq") and getattr(pipe, "movq", None) is not None:
pipe.movq.to(memory_format=torch.channels_last)
pipe.movq = torch.compile(pipe.movq, mode="reduce-overhead", fullgraph=True)
else:
print("Run torch compile")
pipe.decoder = torch.compile(pipe.decoder, mode="reduce-overhead", fullgraph=True)
pipe.vqgan = torch.compile(pipe.vqgan, mode="reduce-overhead", fullgraph=True)
pipe.set_progress_bar_config(disable=True)
self.pipe = pipe
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
def benchmark(self, args):
flush()
print(f"[INFO] {self.pipe.__class__.__name__}: Running benchmark with: {vars(args)}\n")
time = benchmark_fn(self.run_inference, self.pipe, args) # in seconds.
memory = bytes_to_giga_bytes(torch.cuda.max_memory_allocated()) # in GBs.
benchmark_info = BenchmarkInfo(time=time, memory=memory)
pipeline_class_name = str(self.pipe.__class__.__name__)
flush()
csv_dict = generate_csv_dict(
pipeline_cls=pipeline_class_name, ckpt=args.ckpt, args=args, benchmark_info=benchmark_info
)
filepath = self.get_result_filepath(args)
write_to_csv(filepath, csv_dict)
print(f"Logs written to: {filepath}")
flush()
class TurboTextToImageBenchmark(TextToImageBenchmark):
def __init__(self, args):
super().__init__(args)
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
guidance_scale=0.0,
)
class LCMLoRATextToImageBenchmark(TextToImageBenchmark):
lora_id = "latent-consistency/lcm-lora-sdxl"
def __init__(self, args):
super().__init__(args)
self.pipe.load_lora_weights(self.lora_id)
self.pipe.fuse_lora()
self.pipe.unload_lora_weights()
self.pipe.scheduler = LCMScheduler.from_config(self.pipe.scheduler.config)
def get_result_filepath(self, args):
pipeline_class_name = str(self.pipe.__class__.__name__)
name = (
self.lora_id.replace("/", "_")
+ "_"
+ pipeline_class_name
+ f"-bs@{args.batch_size}-steps@{args.num_inference_steps}-mco@{args.model_cpu_offload}-compile@{args.run_compile}.csv"
)
filepath = os.path.join(BASE_PATH, name)
return filepath
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
guidance_scale=1.0,
)
def benchmark(self, args):
flush()
print(f"[INFO] {self.pipe.__class__.__name__}: Running benchmark with: {vars(args)}\n")
time = benchmark_fn(self.run_inference, self.pipe, args) # in seconds.
memory = bytes_to_giga_bytes(torch.cuda.max_memory_allocated()) # in GBs.
benchmark_info = BenchmarkInfo(time=time, memory=memory)
pipeline_class_name = str(self.pipe.__class__.__name__)
flush()
csv_dict = generate_csv_dict(
pipeline_cls=pipeline_class_name, ckpt=self.lora_id, args=args, benchmark_info=benchmark_info
)
filepath = self.get_result_filepath(args)
write_to_csv(filepath, csv_dict)
print(f"Logs written to: {filepath}")
flush()
class ImageToImageBenchmark(TextToImageBenchmark):
pipeline_class = AutoPipelineForImage2Image
url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/benchmarking/1665_Girl_with_a_Pearl_Earring.jpg"
image = load_image(url).convert("RGB")
def __init__(self, args):
super().__init__(args)
self.image = self.image.resize(RESOLUTION_MAPPING[args.ckpt])
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
image=self.image,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
class TurboImageToImageBenchmark(ImageToImageBenchmark):
def __init__(self, args):
super().__init__(args)
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
image=self.image,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
guidance_scale=0.0,
strength=0.5,
)
class InpaintingBenchmark(ImageToImageBenchmark):
pipeline_class = AutoPipelineForInpainting
mask_url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/benchmarking/overture-creations-5sI6fQgYIuo_mask.png"
mask = load_image(mask_url).convert("RGB")
def __init__(self, args):
super().__init__(args)
self.image = self.image.resize(RESOLUTION_MAPPING[args.ckpt])
self.mask = self.mask.resize(RESOLUTION_MAPPING[args.ckpt])
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
image=self.image,
mask_image=self.mask,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
class IPAdapterTextToImageBenchmark(TextToImageBenchmark):
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png"
image = load_image(url)
def __init__(self, args):
pipe = self.pipeline_class.from_pretrained(args.ckpt, torch_dtype=torch.float16).to("cuda")
pipe.load_ip_adapter(
args.ip_adapter_id[0],
subfolder="models" if "sdxl" not in args.ip_adapter_id[1] else "sdxl_models",
weight_name=args.ip_adapter_id[1],
)
if args.run_compile:
pipe.unet.to(memory_format=torch.channels_last)
print("Run torch compile")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
pipe.set_progress_bar_config(disable=True)
self.pipe = pipe
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
ip_adapter_image=self.image,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
class ControlNetBenchmark(TextToImageBenchmark):
pipeline_class = StableDiffusionControlNetPipeline
aux_network_class = ControlNetModel
root_ckpt = "Lykon/DreamShaper"
url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/benchmarking/canny_image_condition.png"
image = load_image(url).convert("RGB")
def __init__(self, args):
aux_network = self.aux_network_class.from_pretrained(args.ckpt, torch_dtype=torch.float16)
pipe = self.pipeline_class.from_pretrained(self.root_ckpt, controlnet=aux_network, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.set_progress_bar_config(disable=True)
self.pipe = pipe
if args.run_compile:
pipe.unet.to(memory_format=torch.channels_last)
pipe.controlnet.to(memory_format=torch.channels_last)
print("Run torch compile")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
pipe.controlnet = torch.compile(pipe.controlnet, mode="reduce-overhead", fullgraph=True)
self.image = self.image.resize(RESOLUTION_MAPPING[args.ckpt])
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
image=self.image,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
class ControlNetSDXLBenchmark(ControlNetBenchmark):
pipeline_class = StableDiffusionXLControlNetPipeline
root_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
def __init__(self, args):
super().__init__(args)
class T2IAdapterBenchmark(ControlNetBenchmark):
pipeline_class = StableDiffusionAdapterPipeline
aux_network_class = T2IAdapter
root_ckpt = "Lykon/DreamShaper"
url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/benchmarking/canny_for_adapter.png"
image = load_image(url).convert("L")
def __init__(self, args):
aux_network = self.aux_network_class.from_pretrained(args.ckpt, torch_dtype=torch.float16)
pipe = self.pipeline_class.from_pretrained(self.root_ckpt, adapter=aux_network, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.set_progress_bar_config(disable=True)
self.pipe = pipe
if args.run_compile:
pipe.unet.to(memory_format=torch.channels_last)
pipe.adapter.to(memory_format=torch.channels_last)
print("Run torch compile")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
pipe.adapter = torch.compile(pipe.adapter, mode="reduce-overhead", fullgraph=True)
self.image = self.image.resize(RESOLUTION_MAPPING[args.ckpt])
class T2IAdapterSDXLBenchmark(T2IAdapterBenchmark):
pipeline_class = StableDiffusionXLAdapterPipeline
root_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/benchmarking/canny_for_adapter_sdxl.png"
image = load_image(url)
def __init__(self, args):
super().__init__(args)

View File

@@ -1,26 +0,0 @@
import argparse
import sys
sys.path.append(".")
from base_classes import ControlNetBenchmark, ControlNetSDXLBenchmark # noqa: E402
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="lllyasviel/sd-controlnet-canny",
choices=["lllyasviel/sd-controlnet-canny", "diffusers/controlnet-canny-sdxl-1.0"],
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_pipe = (
ControlNetBenchmark(args) if args.ckpt == "lllyasviel/sd-controlnet-canny" else ControlNetSDXLBenchmark(args)
)
benchmark_pipe.benchmark(args)

View File

@@ -1,33 +0,0 @@
import argparse
import sys
sys.path.append(".")
from base_classes import IPAdapterTextToImageBenchmark # noqa: E402
IP_ADAPTER_CKPTS = {
# because original SD v1.5 has been taken down.
"Lykon/DreamShaper": ("h94/IP-Adapter", "ip-adapter_sd15.bin"),
"stabilityai/stable-diffusion-xl-base-1.0": ("h94/IP-Adapter", "ip-adapter_sdxl.bin"),
}
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="rstabilityai/stable-diffusion-xl-base-1.0",
choices=list(IP_ADAPTER_CKPTS.keys()),
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
args.ip_adapter_id = IP_ADAPTER_CKPTS[args.ckpt]
benchmark_pipe = IPAdapterTextToImageBenchmark(args)
args.ckpt = f"{args.ckpt} (IP-Adapter)"
benchmark_pipe.benchmark(args)

View File

@@ -1,29 +0,0 @@
import argparse
import sys
sys.path.append(".")
from base_classes import ImageToImageBenchmark, TurboImageToImageBenchmark # noqa: E402
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="Lykon/DreamShaper",
choices=[
"Lykon/DreamShaper",
"stabilityai/stable-diffusion-2-1",
"stabilityai/stable-diffusion-xl-refiner-1.0",
"stabilityai/sdxl-turbo",
],
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_pipe = ImageToImageBenchmark(args) if "turbo" not in args.ckpt else TurboImageToImageBenchmark(args)
benchmark_pipe.benchmark(args)

View File

@@ -1,28 +0,0 @@
import argparse
import sys
sys.path.append(".")
from base_classes import InpaintingBenchmark # noqa: E402
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="Lykon/DreamShaper",
choices=[
"Lykon/DreamShaper",
"stabilityai/stable-diffusion-2-1",
"stabilityai/stable-diffusion-xl-base-1.0",
],
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_pipe = InpaintingBenchmark(args)
benchmark_pipe.benchmark(args)

View File

@@ -1,28 +0,0 @@
import argparse
import sys
sys.path.append(".")
from base_classes import T2IAdapterBenchmark, T2IAdapterSDXLBenchmark # noqa: E402
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="TencentARC/t2iadapter_canny_sd14v1",
choices=["TencentARC/t2iadapter_canny_sd14v1", "TencentARC/t2i-adapter-canny-sdxl-1.0"],
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_pipe = (
T2IAdapterBenchmark(args)
if args.ckpt == "TencentARC/t2iadapter_canny_sd14v1"
else T2IAdapterSDXLBenchmark(args)
)
benchmark_pipe.benchmark(args)

View File

@@ -1,23 +0,0 @@
import argparse
import sys
sys.path.append(".")
from base_classes import LCMLoRATextToImageBenchmark # noqa: E402
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="stabilityai/stable-diffusion-xl-base-1.0",
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=4)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_pipe = LCMLoRATextToImageBenchmark(args)
benchmark_pipe.benchmark(args)

View File

@@ -1,40 +0,0 @@
import argparse
import sys
sys.path.append(".")
from base_classes import TextToImageBenchmark, TurboTextToImageBenchmark # noqa: E402
ALL_T2I_CKPTS = [
"Lykon/DreamShaper",
"segmind/SSD-1B",
"stabilityai/stable-diffusion-xl-base-1.0",
"kandinsky-community/kandinsky-2-2-decoder",
"warp-ai/wuerstchen",
"stabilityai/sdxl-turbo",
]
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="Lykon/DreamShaper",
choices=ALL_T2I_CKPTS,
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_cls = None
if "turbo" in args.ckpt:
benchmark_cls = TurboTextToImageBenchmark
else:
benchmark_cls = TextToImageBenchmark
benchmark_pipe = benchmark_cls(args)
benchmark_pipe.benchmark(args)

View File

@@ -0,0 +1,98 @@
from functools import partial
import torch
from benchmarking_utils import BenchmarkMixin, BenchmarkScenario, model_init_fn
from diffusers import BitsAndBytesConfig, FluxTransformer2DModel
from diffusers.utils.testing_utils import torch_device
CKPT_ID = "black-forest-labs/FLUX.1-dev"
RESULT_FILENAME = "flux.csv"
def get_input_dict(**device_dtype_kwargs):
# resolution: 1024x1024
# maximum sequence length 512
hidden_states = torch.randn(1, 4096, 64, **device_dtype_kwargs)
encoder_hidden_states = torch.randn(1, 512, 4096, **device_dtype_kwargs)
pooled_prompt_embeds = torch.randn(1, 768, **device_dtype_kwargs)
image_ids = torch.ones(512, 3, **device_dtype_kwargs)
text_ids = torch.ones(4096, 3, **device_dtype_kwargs)
timestep = torch.tensor([1.0], **device_dtype_kwargs)
guidance = torch.tensor([1.0], **device_dtype_kwargs)
return {
"hidden_states": hidden_states,
"encoder_hidden_states": encoder_hidden_states,
"img_ids": image_ids,
"txt_ids": text_ids,
"pooled_projections": pooled_prompt_embeds,
"timestep": timestep,
"guidance": guidance,
}
if __name__ == "__main__":
scenarios = [
BenchmarkScenario(
name=f"{CKPT_ID}-bf16",
model_cls=FluxTransformer2DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=model_init_fn,
compile_kwargs={"fullgraph": True},
),
BenchmarkScenario(
name=f"{CKPT_ID}-bnb-nf4",
model_cls=FluxTransformer2DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
"quantization_config": BitsAndBytesConfig(
load_in_4bit=True, bnb_4bit_compute_dtype=torch.bfloat16, bnb_4bit_quant_type="nf4"
),
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=model_init_fn,
),
BenchmarkScenario(
name=f"{CKPT_ID}-layerwise-upcasting",
model_cls=FluxTransformer2DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=partial(model_init_fn, layerwise_upcasting=True),
),
BenchmarkScenario(
name=f"{CKPT_ID}-group-offload-leaf",
model_cls=FluxTransformer2DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=partial(
model_init_fn,
group_offload_kwargs={
"onload_device": torch_device,
"offload_device": torch.device("cpu"),
"offload_type": "leaf_level",
"use_stream": True,
"non_blocking": True,
},
),
),
]
runner = BenchmarkMixin()
runner.run_bencmarks_and_collate(scenarios, filename=RESULT_FILENAME)

View File

@@ -0,0 +1,80 @@
from functools import partial
import torch
from benchmarking_utils import BenchmarkMixin, BenchmarkScenario, model_init_fn
from diffusers import LTXVideoTransformer3DModel
from diffusers.utils.testing_utils import torch_device
CKPT_ID = "Lightricks/LTX-Video-0.9.7-dev"
RESULT_FILENAME = "ltx.csv"
def get_input_dict(**device_dtype_kwargs):
# 512x704 (161 frames)
# `max_sequence_length`: 256
hidden_states = torch.randn(1, 7392, 128, **device_dtype_kwargs)
encoder_hidden_states = torch.randn(1, 256, 4096, **device_dtype_kwargs)
encoder_attention_mask = torch.ones(1, 256, **device_dtype_kwargs)
timestep = torch.tensor([1.0], **device_dtype_kwargs)
video_coords = torch.randn(1, 3, 7392, **device_dtype_kwargs)
return {
"hidden_states": hidden_states,
"encoder_hidden_states": encoder_hidden_states,
"encoder_attention_mask": encoder_attention_mask,
"timestep": timestep,
"video_coords": video_coords,
}
if __name__ == "__main__":
scenarios = [
BenchmarkScenario(
name=f"{CKPT_ID}-bf16",
model_cls=LTXVideoTransformer3DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=model_init_fn,
compile_kwargs={"fullgraph": True},
),
BenchmarkScenario(
name=f"{CKPT_ID}-layerwise-upcasting",
model_cls=LTXVideoTransformer3DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=partial(model_init_fn, layerwise_upcasting=True),
),
BenchmarkScenario(
name=f"{CKPT_ID}-group-offload-leaf",
model_cls=LTXVideoTransformer3DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=partial(
model_init_fn,
group_offload_kwargs={
"onload_device": torch_device,
"offload_device": torch.device("cpu"),
"offload_type": "leaf_level",
"use_stream": True,
"non_blocking": True,
},
),
),
]
runner = BenchmarkMixin()
runner.run_bencmarks_and_collate(scenarios, filename=RESULT_FILENAME)

View File

@@ -0,0 +1,82 @@
from functools import partial
import torch
from benchmarking_utils import BenchmarkMixin, BenchmarkScenario, model_init_fn
from diffusers import UNet2DConditionModel
from diffusers.utils.testing_utils import torch_device
CKPT_ID = "stabilityai/stable-diffusion-xl-base-1.0"
RESULT_FILENAME = "sdxl.csv"
def get_input_dict(**device_dtype_kwargs):
# height: 1024
# width: 1024
# max_sequence_length: 77
hidden_states = torch.randn(1, 4, 128, 128, **device_dtype_kwargs)
encoder_hidden_states = torch.randn(1, 77, 2048, **device_dtype_kwargs)
timestep = torch.tensor([1.0], **device_dtype_kwargs)
added_cond_kwargs = {
"text_embeds": torch.randn(1, 1280, **device_dtype_kwargs),
"time_ids": torch.ones(1, 6, **device_dtype_kwargs),
}
return {
"sample": hidden_states,
"encoder_hidden_states": encoder_hidden_states,
"timestep": timestep,
"added_cond_kwargs": added_cond_kwargs,
}
if __name__ == "__main__":
scenarios = [
BenchmarkScenario(
name=f"{CKPT_ID}-bf16",
model_cls=UNet2DConditionModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "unet",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=model_init_fn,
compile_kwargs={"fullgraph": True},
),
BenchmarkScenario(
name=f"{CKPT_ID}-layerwise-upcasting",
model_cls=UNet2DConditionModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "unet",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=partial(model_init_fn, layerwise_upcasting=True),
),
BenchmarkScenario(
name=f"{CKPT_ID}-group-offload-leaf",
model_cls=UNet2DConditionModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "unet",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=partial(
model_init_fn,
group_offload_kwargs={
"onload_device": torch_device,
"offload_device": torch.device("cpu"),
"offload_type": "leaf_level",
"use_stream": True,
"non_blocking": True,
},
),
),
]
runner = BenchmarkMixin()
runner.run_bencmarks_and_collate(scenarios, filename=RESULT_FILENAME)

View File

@@ -0,0 +1,244 @@
import gc
import inspect
import logging
import os
import queue
import threading
from contextlib import nullcontext
from dataclasses import dataclass
from typing import Any, Callable, Dict, Optional, Union
import pandas as pd
import torch
import torch.utils.benchmark as benchmark
from diffusers.models.modeling_utils import ModelMixin
from diffusers.utils.testing_utils import require_torch_gpu, torch_device
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(name)s: %(message)s")
logger = logging.getLogger(__name__)
NUM_WARMUP_ROUNDS = 5
def benchmark_fn(f, *args, **kwargs):
t0 = benchmark.Timer(
stmt="f(*args, **kwargs)",
globals={"args": args, "kwargs": kwargs, "f": f},
num_threads=1,
)
return float(f"{(t0.blocked_autorange().mean):.3f}")
def flush():
gc.collect()
torch.cuda.empty_cache()
torch.cuda.reset_max_memory_allocated()
torch.cuda.reset_peak_memory_stats()
# Adapted from https://github.com/lucasb-eyer/cnn_vit_benchmarks/blob/15b665ff758e8062131353076153905cae00a71f/main.py
def calculate_flops(model, input_dict):
try:
from torchprofile import profile_macs
except ModuleNotFoundError:
raise
# This is a hacky way to convert the kwargs to args as `profile_macs` cries about kwargs.
sig = inspect.signature(model.forward)
param_names = [
p.name
for p in sig.parameters.values()
if p.kind
in (
inspect.Parameter.POSITIONAL_ONLY,
inspect.Parameter.POSITIONAL_OR_KEYWORD,
)
and p.name != "self"
]
bound = sig.bind_partial(**input_dict)
bound.apply_defaults()
args = tuple(bound.arguments[name] for name in param_names)
model.eval()
with torch.no_grad():
macs = profile_macs(model, args)
flops = 2 * macs # 1 MAC operation = 2 FLOPs (1 multiplication + 1 addition)
return flops
def calculate_params(model):
return sum(p.numel() for p in model.parameters())
# Users can define their own in case this doesn't suffice. For most cases,
# it should be sufficient.
def model_init_fn(model_cls, group_offload_kwargs=None, layerwise_upcasting=False, **init_kwargs):
model = model_cls.from_pretrained(**init_kwargs).eval()
if group_offload_kwargs and isinstance(group_offload_kwargs, dict):
model.enable_group_offload(**group_offload_kwargs)
else:
model.to(torch_device)
if layerwise_upcasting:
model.enable_layerwise_casting(
storage_dtype=torch.float8_e4m3fn, compute_dtype=init_kwargs.get("torch_dtype", torch.bfloat16)
)
return model
@dataclass
class BenchmarkScenario:
name: str
model_cls: ModelMixin
model_init_kwargs: Dict[str, Any]
model_init_fn: Callable
get_model_input_dict: Callable
compile_kwargs: Optional[Dict[str, Any]] = None
@require_torch_gpu
class BenchmarkMixin:
def pre_benchmark(self):
flush()
torch.compiler.reset()
def post_benchmark(self, model):
model.cpu()
flush()
torch.compiler.reset()
@torch.no_grad()
def run_benchmark(self, scenario: BenchmarkScenario):
# 0) Basic stats
logger.info(f"Running scenario: {scenario.name}.")
try:
model = model_init_fn(scenario.model_cls, **scenario.model_init_kwargs)
num_params = round(calculate_params(model) / 1e9, 2)
try:
flops = round(calculate_flops(model, input_dict=scenario.get_model_input_dict()) / 1e9, 2)
except Exception as e:
logger.info(f"Problem in calculating FLOPs:\n{e}")
flops = None
model.cpu()
del model
except Exception as e:
logger.info(f"Error while initializing the model and calculating FLOPs:\n{e}")
return {}
self.pre_benchmark()
# 1) plain stats
results = {}
plain = None
try:
plain = self._run_phase(
model_cls=scenario.model_cls,
init_fn=scenario.model_init_fn,
init_kwargs=scenario.model_init_kwargs,
get_input_fn=scenario.get_model_input_dict,
compile_kwargs=None,
)
except Exception as e:
logger.info(f"Benchmark could not be run with the following error:\n{e}")
return results
# 2) compiled stats (if any)
compiled = {"time": None, "memory": None}
if scenario.compile_kwargs:
try:
compiled = self._run_phase(
model_cls=scenario.model_cls,
init_fn=scenario.model_init_fn,
init_kwargs=scenario.model_init_kwargs,
get_input_fn=scenario.get_model_input_dict,
compile_kwargs=scenario.compile_kwargs,
)
except Exception as e:
logger.info(f"Compilation benchmark could not be run with the following error\n: {e}")
if plain is None:
return results
# 3) merge
result = {
"scenario": scenario.name,
"model_cls": scenario.model_cls.__name__,
"num_params_B": num_params,
"flops_G": flops,
"time_plain_s": plain["time"],
"mem_plain_GB": plain["memory"],
"time_compile_s": compiled["time"],
"mem_compile_GB": compiled["memory"],
}
if scenario.compile_kwargs:
result["fullgraph"] = scenario.compile_kwargs.get("fullgraph", False)
result["mode"] = scenario.compile_kwargs.get("mode", "default")
else:
result["fullgraph"], result["mode"] = None, None
return result
def run_bencmarks_and_collate(self, scenarios: Union[BenchmarkScenario, list[BenchmarkScenario]], filename: str):
if not isinstance(scenarios, list):
scenarios = [scenarios]
record_queue = queue.Queue()
stop_signal = object()
def _writer_thread():
while True:
item = record_queue.get()
if item is stop_signal:
break
df_row = pd.DataFrame([item])
write_header = not os.path.exists(filename)
df_row.to_csv(filename, mode="a", header=write_header, index=False)
record_queue.task_done()
record_queue.task_done()
writer = threading.Thread(target=_writer_thread, daemon=True)
writer.start()
for s in scenarios:
try:
record = self.run_benchmark(s)
if record:
record_queue.put(record)
else:
logger.info(f"Record empty from scenario: {s.name}.")
except Exception as e:
logger.info(f"Running scenario ({s.name}) led to error:\n{e}")
record_queue.put(stop_signal)
logger.info(f"Results serialized to {filename=}.")
def _run_phase(
self,
*,
model_cls: ModelMixin,
init_fn: Callable,
init_kwargs: Dict[str, Any],
get_input_fn: Callable,
compile_kwargs: Optional[Dict[str, Any]],
) -> Dict[str, float]:
# setup
self.pre_benchmark()
# init & (optional) compile
model = init_fn(model_cls, **init_kwargs)
if compile_kwargs:
model.compile(**compile_kwargs)
# build inputs
inp = get_input_fn()
# measure
run_ctx = torch._inductor.utils.fresh_inductor_cache() if compile_kwargs else nullcontext()
with run_ctx:
for _ in range(NUM_WARMUP_ROUNDS):
_ = model(**inp)
time_s = benchmark_fn(lambda m, d: m(**d), model, inp)
mem_gb = torch.cuda.max_memory_allocated() / (1024**3)
mem_gb = round(mem_gb, 2)
# teardown
self.post_benchmark(model)
del model
return {"time": time_s, "memory": mem_gb}

View File

@@ -0,0 +1,74 @@
from functools import partial
import torch
from benchmarking_utils import BenchmarkMixin, BenchmarkScenario, model_init_fn
from diffusers import WanTransformer3DModel
from diffusers.utils.testing_utils import torch_device
CKPT_ID = "Wan-AI/Wan2.1-T2V-14B-Diffusers"
RESULT_FILENAME = "wan.csv"
def get_input_dict(**device_dtype_kwargs):
# height: 480
# width: 832
# num_frames: 81
# max_sequence_length: 512
hidden_states = torch.randn(1, 16, 21, 60, 104, **device_dtype_kwargs)
encoder_hidden_states = torch.randn(1, 512, 4096, **device_dtype_kwargs)
timestep = torch.tensor([1.0], **device_dtype_kwargs)
return {"hidden_states": hidden_states, "encoder_hidden_states": encoder_hidden_states, "timestep": timestep}
if __name__ == "__main__":
scenarios = [
BenchmarkScenario(
name=f"{CKPT_ID}-bf16",
model_cls=WanTransformer3DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=model_init_fn,
compile_kwargs={"fullgraph": True},
),
BenchmarkScenario(
name=f"{CKPT_ID}-layerwise-upcasting",
model_cls=WanTransformer3DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=partial(model_init_fn, layerwise_upcasting=True),
),
BenchmarkScenario(
name=f"{CKPT_ID}-group-offload-leaf",
model_cls=WanTransformer3DModel,
model_init_kwargs={
"pretrained_model_name_or_path": CKPT_ID,
"torch_dtype": torch.bfloat16,
"subfolder": "transformer",
},
get_model_input_dict=partial(get_input_dict, device=torch_device, dtype=torch.bfloat16),
model_init_fn=partial(
model_init_fn,
group_offload_kwargs={
"onload_device": torch_device,
"offload_device": torch.device("cpu"),
"offload_type": "leaf_level",
"use_stream": True,
"non_blocking": True,
},
),
),
]
runner = BenchmarkMixin()
runner.run_bencmarks_and_collate(scenarios, filename=RESULT_FILENAME)

View File

@@ -0,0 +1,166 @@
import argparse
import os
import sys
import gpustat
import pandas as pd
import psycopg2
import psycopg2.extras
from psycopg2.extensions import register_adapter
from psycopg2.extras import Json
register_adapter(dict, Json)
FINAL_CSV_FILENAME = "collated_results.csv"
# https://github.com/huggingface/transformers/blob/593e29c5e2a9b17baec010e8dc7c1431fed6e841/benchmark/init_db.sql#L27
BENCHMARKS_TABLE_NAME = "benchmarks"
MEASUREMENTS_TABLE_NAME = "model_measurements"
def _init_benchmark(conn, branch, commit_id, commit_msg):
gpu_stats = gpustat.GPUStatCollection.new_query()
metadata = {"gpu_name": gpu_stats[0]["name"]}
repository = "huggingface/diffusers"
with conn.cursor() as cur:
cur.execute(
f"INSERT INTO {BENCHMARKS_TABLE_NAME} (repository, branch, commit_id, commit_message, metadata) VALUES (%s, %s, %s, %s, %s) RETURNING benchmark_id",
(repository, branch, commit_id, commit_msg, metadata),
)
benchmark_id = cur.fetchone()[0]
print(f"Initialised benchmark #{benchmark_id}")
return benchmark_id
def parse_args():
parser = argparse.ArgumentParser()
parser.add_argument(
"branch",
type=str,
help="The branch name on which the benchmarking is performed.",
)
parser.add_argument(
"commit_id",
type=str,
help="The commit hash on which the benchmarking is performed.",
)
parser.add_argument(
"commit_msg",
type=str,
help="The commit message associated with the commit, truncated to 70 characters.",
)
args = parser.parse_args()
return args
if __name__ == "__main__":
args = parse_args()
try:
conn = psycopg2.connect(
host=os.getenv("PGHOST"),
database=os.getenv("PGDATABASE"),
user=os.getenv("PGUSER"),
password=os.getenv("PGPASSWORD"),
)
print("DB connection established successfully.")
except Exception as e:
print(f"Problem during DB init: {e}")
sys.exit(1)
try:
benchmark_id = _init_benchmark(
conn=conn,
branch=args.branch,
commit_id=args.commit_id,
commit_msg=args.commit_msg,
)
except Exception as e:
print(f"Problem during initializing benchmark: {e}")
sys.exit(1)
cur = conn.cursor()
df = pd.read_csv(FINAL_CSV_FILENAME)
# Helper to cast values (or None) given a dtype
def _cast_value(val, dtype: str):
if pd.isna(val):
return None
if dtype == "text":
return str(val).strip()
if dtype == "float":
try:
return float(val)
except ValueError:
return None
if dtype == "bool":
s = str(val).strip().lower()
if s in ("true", "t", "yes", "1"):
return True
if s in ("false", "f", "no", "0"):
return False
if val in (1, 1.0):
return True
if val in (0, 0.0):
return False
return None
return val
try:
rows_to_insert = []
for _, row in df.iterrows():
scenario = _cast_value(row.get("scenario"), "text")
model_cls = _cast_value(row.get("model_cls"), "text")
num_params_B = _cast_value(row.get("num_params_B"), "float")
flops_G = _cast_value(row.get("flops_G"), "float")
time_plain_s = _cast_value(row.get("time_plain_s"), "float")
mem_plain_GB = _cast_value(row.get("mem_plain_GB"), "float")
time_compile_s = _cast_value(row.get("time_compile_s"), "float")
mem_compile_GB = _cast_value(row.get("mem_compile_GB"), "float")
fullgraph = _cast_value(row.get("fullgraph"), "bool")
mode = _cast_value(row.get("mode"), "text")
# If "github_sha" column exists in the CSV, cast it; else default to None
if "github_sha" in df.columns:
github_sha = _cast_value(row.get("github_sha"), "text")
else:
github_sha = None
measurements = {
"scenario": scenario,
"model_cls": model_cls,
"num_params_B": num_params_B,
"flops_G": flops_G,
"time_plain_s": time_plain_s,
"mem_plain_GB": mem_plain_GB,
"time_compile_s": time_compile_s,
"mem_compile_GB": mem_compile_GB,
"fullgraph": fullgraph,
"mode": mode,
"github_sha": github_sha,
}
rows_to_insert.append((benchmark_id, measurements))
# Batch-insert all rows
insert_sql = f"""
INSERT INTO {MEASUREMENTS_TABLE_NAME} (
benchmark_id,
measurements
)
VALUES (%s, %s);
"""
psycopg2.extras.execute_batch(cur, insert_sql, rows_to_insert)
conn.commit()
cur.close()
conn.close()
except Exception as e:
print(f"Exception: {e}")
sys.exit(1)

View File

@@ -1,19 +1,19 @@
import glob
import sys
import os
import pandas as pd
from huggingface_hub import hf_hub_download, upload_file
from huggingface_hub.utils import EntryNotFoundError
sys.path.append(".")
from utils import BASE_PATH, FINAL_CSV_FILE, GITHUB_SHA, REPO_ID, collate_csv # noqa: E402
REPO_ID = "diffusers/benchmarks"
def has_previous_benchmark() -> str:
from run_all import FINAL_CSV_FILENAME
csv_path = None
try:
csv_path = hf_hub_download(repo_id=REPO_ID, repo_type="dataset", filename=FINAL_CSV_FILE)
csv_path = hf_hub_download(repo_id=REPO_ID, repo_type="dataset", filename=FINAL_CSV_FILENAME)
except EntryNotFoundError:
csv_path = None
return csv_path
@@ -26,46 +26,50 @@ def filter_float(value):
def push_to_hf_dataset():
all_csvs = sorted(glob.glob(f"{BASE_PATH}/*.csv"))
collate_csv(all_csvs, FINAL_CSV_FILE)
from run_all import FINAL_CSV_FILENAME, GITHUB_SHA
# If there's an existing benchmark file, we should report the changes.
csv_path = has_previous_benchmark()
if csv_path is not None:
current_results = pd.read_csv(FINAL_CSV_FILE)
current_results = pd.read_csv(FINAL_CSV_FILENAME)
previous_results = pd.read_csv(csv_path)
numeric_columns = current_results.select_dtypes(include=["float64", "int64"]).columns
numeric_columns = [
c for c in numeric_columns if c not in ["batch_size", "num_inference_steps", "actual_gpu_memory (gbs)"]
]
for column in numeric_columns:
previous_results[column] = previous_results[column].map(lambda x: filter_float(x))
# get previous values as floats, aligned to current index
prev_vals = previous_results[column].map(filter_float).reindex(current_results.index)
# Calculate the percentage change
current_results[column] = current_results[column].astype(float)
previous_results[column] = previous_results[column].astype(float)
percent_change = ((current_results[column] - previous_results[column]) / previous_results[column]) * 100
# get current values as floats
curr_vals = current_results[column].astype(float)
# Format the values with '+' or '-' sign and append to original values
current_results[column] = current_results[column].map(str) + percent_change.map(
lambda x: f" ({'+' if x > 0 else ''}{x:.2f}%)"
# stringify the current values
curr_str = curr_vals.map(str)
# build an appendage only when prev exists and differs
append_str = prev_vals.where(prev_vals.notnull() & (prev_vals != curr_vals), other=pd.NA).map(
lambda x: f" ({x})" if pd.notnull(x) else ""
)
# There might be newly added rows. So, filter out the NaNs.
current_results[column] = current_results[column].map(lambda x: x.replace(" (nan%)", ""))
# Overwrite the current result file.
current_results.to_csv(FINAL_CSV_FILE, index=False)
# combine
current_results[column] = curr_str + append_str
os.remove(FINAL_CSV_FILENAME)
current_results.to_csv(FINAL_CSV_FILENAME, index=False)
commit_message = f"upload from sha: {GITHUB_SHA}" if GITHUB_SHA is not None else "upload benchmark results"
upload_file(
repo_id=REPO_ID,
path_in_repo=FINAL_CSV_FILE,
path_or_fileobj=FINAL_CSV_FILE,
path_in_repo=FINAL_CSV_FILENAME,
path_or_fileobj=FINAL_CSV_FILENAME,
repo_type="dataset",
commit_message=commit_message,
)
upload_file(
repo_id="diffusers/benchmark-analyzer",
path_in_repo=FINAL_CSV_FILENAME,
path_or_fileobj=FINAL_CSV_FILENAME,
repo_type="space",
commit_message=commit_message,
)
if __name__ == "__main__":

View File

@@ -0,0 +1,6 @@
pandas
psutil
gpustat
torchprofile
bitsandbytes
psycopg2==2.9.9

View File

@@ -1,101 +1,84 @@
import glob
import logging
import os
import subprocess
import sys
from typing import List
import pandas as pd
sys.path.append(".")
from benchmark_text_to_image import ALL_T2I_CKPTS # noqa: E402
logging.basicConfig(level=logging.INFO, format="%(asctime)s %(levelname)s %(name)s: %(message)s")
logger = logging.getLogger(__name__)
PATTERN = "benchmark_*.py"
PATTERN = "benchmarking_*.py"
FINAL_CSV_FILENAME = "collated_results.csv"
GITHUB_SHA = os.getenv("GITHUB_SHA", None)
class SubprocessCallException(Exception):
pass
# Taken from `test_examples_utils.py`
def run_command(command: List[str], return_stdout=False):
"""
Runs `command` with `subprocess.check_output` and will potentially return the `stdout`. Will also properly capture
if an error occurred while running `command`
"""
def run_command(command: list[str], return_stdout=False):
try:
output = subprocess.check_output(command, stderr=subprocess.STDOUT)
if return_stdout:
if hasattr(output, "decode"):
output = output.decode("utf-8")
return output
if return_stdout and hasattr(output, "decode"):
return output.decode("utf-8")
except subprocess.CalledProcessError as e:
raise SubprocessCallException(
f"Command `{' '.join(command)}` failed with the following error:\n\n{e.output.decode()}"
) from e
raise SubprocessCallException(f"Command `{' '.join(command)}` failed with:\n{e.output.decode()}") from e
def main():
python_files = glob.glob(PATTERN)
def merge_csvs(final_csv: str = "collated_results.csv"):
all_csvs = glob.glob("*.csv")
all_csvs = [f for f in all_csvs if f != final_csv]
if not all_csvs:
logger.info("No result CSVs found to merge.")
return
for file in python_files:
print(f"****** Running file: {file} ******")
# Run with canonical settings.
if file != "benchmark_text_to_image.py" and file != "benchmark_ip_adapters.py":
command = f"python {file}"
run_command(command.split())
command += " --run_compile"
run_command(command.split())
# Run variants.
for file in python_files:
# See: https://github.com/pytorch/pytorch/issues/129637
if file == "benchmark_ip_adapters.py":
df_list = []
for f in all_csvs:
try:
d = pd.read_csv(f)
except pd.errors.EmptyDataError:
# If a file existed but was zerobytes or corrupted, skip it
continue
df_list.append(d)
if file == "benchmark_text_to_image.py":
for ckpt in ALL_T2I_CKPTS:
command = f"python {file} --ckpt {ckpt}"
if not df_list:
logger.info("All result CSVs were empty or invalid; nothing to merge.")
return
if "turbo" in ckpt:
command += " --num_inference_steps 1"
final_df = pd.concat(df_list, ignore_index=True)
if GITHUB_SHA is not None:
final_df["github_sha"] = GITHUB_SHA
final_df.to_csv(final_csv, index=False)
logger.info(f"Merged {len(all_csvs)} partial CSVs → {final_csv}.")
run_command(command.split())
command += " --run_compile"
run_command(command.split())
def run_scripts():
python_files = sorted(glob.glob(PATTERN))
python_files = [f for f in python_files if f != "benchmarking_utils.py"]
elif file == "benchmark_sd_img.py":
for ckpt in ["stabilityai/stable-diffusion-xl-refiner-1.0", "stabilityai/sdxl-turbo"]:
command = f"python {file} --ckpt {ckpt}"
for file in python_files:
script_name = file.split(".py")[0].split("_")[-1] # example: benchmarking_foo.py -> foo
logger.info(f"\n****** Running file: {file} ******")
if ckpt == "stabilityai/sdxl-turbo":
command += " --num_inference_steps 2"
partial_csv = f"{script_name}.csv"
if os.path.exists(partial_csv):
logger.info(f"Found {partial_csv}. Removing for safer numbers and duplication.")
os.remove(partial_csv)
run_command(command.split())
command += " --run_compile"
run_command(command.split())
command = ["python", file]
try:
run_command(command)
logger.info(f"{file} finished normally.")
except SubprocessCallException as e:
logger.info(f"Error running {file}:\n{e}")
finally:
logger.info(f"→ Merging partial CSVs after {file}")
merge_csvs(final_csv=FINAL_CSV_FILENAME)
elif file in ["benchmark_sd_inpainting.py", "benchmark_ip_adapters.py"]:
sdxl_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
command = f"python {file} --ckpt {sdxl_ckpt}"
run_command(command.split())
command += " --run_compile"
run_command(command.split())
elif file in ["benchmark_controlnet.py", "benchmark_t2i_adapter.py"]:
sdxl_ckpt = (
"diffusers/controlnet-canny-sdxl-1.0"
if "controlnet" in file
else "TencentARC/t2i-adapter-canny-sdxl-1.0"
)
command = f"python {file} --ckpt {sdxl_ckpt}"
run_command(command.split())
command += " --run_compile"
run_command(command.split())
logger.info(f"\nAll scripts attempted. Final collated CSV: {FINAL_CSV_FILENAME}")
if __name__ == "__main__":
main()
run_scripts()

View File

@@ -1,98 +0,0 @@
import argparse
import csv
import gc
import os
from dataclasses import dataclass
from typing import Dict, List, Union
import torch
import torch.utils.benchmark as benchmark
GITHUB_SHA = os.getenv("GITHUB_SHA", None)
BENCHMARK_FIELDS = [
"pipeline_cls",
"ckpt_id",
"batch_size",
"num_inference_steps",
"model_cpu_offload",
"run_compile",
"time (secs)",
"memory (gbs)",
"actual_gpu_memory (gbs)",
"github_sha",
]
PROMPT = "ghibli style, a fantasy landscape with castles"
BASE_PATH = os.getenv("BASE_PATH", ".")
TOTAL_GPU_MEMORY = float(os.getenv("TOTAL_GPU_MEMORY", torch.cuda.get_device_properties(0).total_memory / (1024**3)))
REPO_ID = "diffusers/benchmarks"
FINAL_CSV_FILE = "collated_results.csv"
@dataclass
class BenchmarkInfo:
time: float
memory: float
def flush():
"""Wipes off memory."""
gc.collect()
torch.cuda.empty_cache()
torch.cuda.reset_max_memory_allocated()
torch.cuda.reset_peak_memory_stats()
def bytes_to_giga_bytes(bytes):
return f"{(bytes / 1024 / 1024 / 1024):.3f}"
def benchmark_fn(f, *args, **kwargs):
t0 = benchmark.Timer(
stmt="f(*args, **kwargs)",
globals={"args": args, "kwargs": kwargs, "f": f},
num_threads=torch.get_num_threads(),
)
return f"{(t0.blocked_autorange().mean):.3f}"
def generate_csv_dict(
pipeline_cls: str, ckpt: str, args: argparse.Namespace, benchmark_info: BenchmarkInfo
) -> Dict[str, Union[str, bool, float]]:
"""Packs benchmarking data into a dictionary for latter serialization."""
data_dict = {
"pipeline_cls": pipeline_cls,
"ckpt_id": ckpt,
"batch_size": args.batch_size,
"num_inference_steps": args.num_inference_steps,
"model_cpu_offload": args.model_cpu_offload,
"run_compile": args.run_compile,
"time (secs)": benchmark_info.time,
"memory (gbs)": benchmark_info.memory,
"actual_gpu_memory (gbs)": f"{(TOTAL_GPU_MEMORY):.3f}",
"github_sha": GITHUB_SHA,
}
return data_dict
def write_to_csv(file_name: str, data_dict: Dict[str, Union[str, bool, float]]):
"""Serializes a dictionary into a CSV file."""
with open(file_name, mode="w", newline="") as csvfile:
writer = csv.DictWriter(csvfile, fieldnames=BENCHMARK_FIELDS)
writer.writeheader()
writer.writerow(data_dict)
def collate_csv(input_files: List[str], output_file: str):
"""Collates multiple identically structured CSVs into a single CSV file."""
with open(output_file, mode="w", newline="") as outfile:
writer = csv.DictWriter(outfile, fieldnames=BENCHMARK_FIELDS)
writer.writeheader()
for file in input_files:
with open(file, mode="r") as infile:
reader = csv.DictReader(infile)
for row in reader:
writer.writerow(row)

View File

@@ -28,9 +28,9 @@ ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3 -m uv pip install --no-cache-dir \
torch==2.1.2 \
torchvision==0.16.2 \
torchaudio==2.1.2 \
torch \
torchvision \
torchaudio\
onnxruntime \
--extra-index-url https://download.pytorch.org/whl/cpu && \
python3 -m uv pip install --no-cache-dir \

View File

@@ -1,50 +0,0 @@
FROM nvidia/cuda:12.1.0-runtime-ubuntu20.04
LABEL maintainer="Hugging Face"
LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
libgl1 \
python3.10 \
python3.10-dev \
python3-pip \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark && \
python3.10 -m pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \
huggingface-hub \
hf_transfer \
Jinja2 \
librosa \
numpy==1.26.4 \
scipy \
tensorboard \
transformers \
hf_transfer
CMD ["/bin/bash"]

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -17,12 +17,6 @@
title: AutoPipeline
- local: tutorials/basic_training
title: Train a diffusion model
- local: tutorials/using_peft_for_inference
title: Load LoRAs for inference
- local: tutorials/fast_diffusion
title: Accelerate inference of text-to-image diffusion models
- local: tutorials/inference_with_big_models
title: Working with big models
title: Tutorials
- sections:
- local: using-diffusers/loading
@@ -33,11 +27,24 @@
title: Load schedulers and models
- local: using-diffusers/other-formats
title: Model files and layouts
- local: using-diffusers/loading_adapters
title: Load adapters
- local: using-diffusers/push_to_hub
title: Push files to the Hub
title: Load pipelines and adapters
- sections:
- local: tutorials/using_peft_for_inference
title: LoRA
- local: using-diffusers/ip_adapter
title: IP-Adapter
- local: using-diffusers/controlnet
title: ControlNet
- local: using-diffusers/t2i_adapter
title: T2I-Adapter
- local: using-diffusers/dreambooth
title: DreamBooth
- local: using-diffusers/textual_inversion_inference
title: Textual inversion
title: Adapters
isExpanded: false
- sections:
- local: using-diffusers/unconditional_image_generation
title: Unconditional image generation
@@ -57,10 +64,10 @@
title: Overview
- local: using-diffusers/create_a_server
title: Create a server
- local: using-diffusers/batched_inference
title: Batch inference
- local: training/distributed_inference
title: Distributed inference
- local: using-diffusers/merge_loras
title: Merge LoRAs
- local: using-diffusers/scheduler_features
title: Scheduler features
- local: using-diffusers/callback
@@ -77,8 +84,26 @@
title: Outpainting
title: Advanced inference
- sections:
- local: using-diffusers/cogvideox
title: CogVideoX
- local: hybrid_inference/overview
title: Overview
- local: hybrid_inference/vae_decode
title: VAE Decode
- local: hybrid_inference/vae_encode
title: VAE Encode
- local: hybrid_inference/api_reference
title: API Reference
title: Hybrid Inference
- sections:
- local: modular_diffusers/getting_started
title: Getting Started
- local: modular_diffusers/components_manager
title: Components Manager
- local: modular_diffusers/write_own_pipeline_block
title: Write your own pipeline block
- local: modular_diffusers/end_to_end_guide
title: End-to-End Developer Guide
title: Modular Diffusers
- sections:
- local: using-diffusers/consisid
title: ConsisID
- local: using-diffusers/sdxl
@@ -87,18 +112,12 @@
title: SDXL Turbo
- local: using-diffusers/kandinsky
title: Kandinsky
- local: using-diffusers/ip_adapter
title: IP-Adapter
- local: using-diffusers/omnigen
title: OmniGen
- local: using-diffusers/pag
title: PAG
- local: using-diffusers/controlnet
title: ControlNet
- local: using-diffusers/t2i_adapter
title: T2I-Adapter
- local: using-diffusers/inference_with_lcm
title: Latent Consistency Model
- local: using-diffusers/textual_inversion_inference
title: Textual inversion
- local: using-diffusers/shap-e
title: Shap-E
- local: using-diffusers/diffedit
@@ -163,14 +182,20 @@
title: gguf
- local: quantization/torchao
title: torchao
- local: quantization/quanto
title: quanto
title: Quantization Methods
- sections:
- local: optimization/fp16
title: Speed up inference
title: Accelerate inference
- local: optimization/cache
title: Caching
- local: optimization/memory
title: Reduce memory usage
- local: optimization/torch2.0
title: PyTorch 2.0
- local: optimization/speed-memory-optims
title: Compile and offloading quantized models
- local: optimization/pruna
title: Pruna
- local: optimization/xformers
title: xFormers
- local: optimization/tome
@@ -197,7 +222,7 @@
- local: optimization/mps
title: Metal Performance Shaders (MPS)
- local: optimization/habana
title: Habana Gaudi
title: Intel Gaudi
- local: optimization/neuron
title: AWS Neuron
title: Optimized hardware
@@ -251,71 +276,91 @@
sections:
- local: api/models/overview
title: Overview
- local: api/models/auto_model
title: AutoModel
- sections:
- local: api/models/controlnet
title: ControlNetModel
- local: api/models/controlnet_union
title: ControlNetUnionModel
- local: api/models/controlnet_flux
title: FluxControlNetModel
- local: api/models/controlnet_hunyuandit
title: HunyuanDiT2DControlNetModel
- local: api/models/controlnet_sana
title: SanaControlNetModel
- local: api/models/controlnet_sd3
title: SD3ControlNetModel
- local: api/models/controlnet_sparsectrl
title: SparseControlNetModel
- local: api/models/controlnet_union
title: ControlNetUnionModel
title: ControlNets
- sections:
- local: api/models/allegro_transformer3d
title: AllegroTransformer3DModel
- local: api/models/aura_flow_transformer2d
title: AuraFlowTransformer2DModel
- local: api/models/chroma_transformer
title: ChromaTransformer2DModel
- local: api/models/cogvideox_transformer3d
title: CogVideoXTransformer3DModel
- local: api/models/consisid_transformer3d
title: ConsisIDTransformer3DModel
- local: api/models/cogview3plus_transformer2d
title: CogView3PlusTransformer2DModel
- local: api/models/cogview4_transformer2d
title: CogView4Transformer2DModel
- local: api/models/consisid_transformer3d
title: ConsisIDTransformer3DModel
- local: api/models/cosmos_transformer3d
title: CosmosTransformer3DModel
- local: api/models/dit_transformer2d
title: DiTTransformer2DModel
- local: api/models/easyanimate_transformer3d
title: EasyAnimateTransformer3DModel
- local: api/models/flux_transformer
title: FluxTransformer2DModel
- local: api/models/hidream_image_transformer
title: HiDreamImageTransformer2DModel
- local: api/models/hunyuan_transformer2d
title: HunyuanDiT2DModel
- local: api/models/hunyuan_video_transformer_3d
title: HunyuanVideoTransformer3DModel
- local: api/models/latte_transformer3d
title: LatteTransformer3DModel
- local: api/models/lumina_nextdit2d
title: LuminaNextDiT2DModel
- local: api/models/ltx_video_transformer3d
title: LTXVideoTransformer3DModel
- local: api/models/lumina2_transformer2d
title: Lumina2Transformer2DModel
- local: api/models/lumina_nextdit2d
title: LuminaNextDiT2DModel
- local: api/models/mochi_transformer3d
title: MochiTransformer3DModel
- local: api/models/omnigen_transformer
title: OmniGenTransformer2DModel
- local: api/models/pixart_transformer2d
title: PixArtTransformer2DModel
- local: api/models/prior_transformer
title: PriorTransformer
- local: api/models/sd3_transformer2d
title: SD3Transformer2DModel
- local: api/models/sana_transformer2d
title: SanaTransformer2DModel
- local: api/models/sd3_transformer2d
title: SD3Transformer2DModel
- local: api/models/stable_audio_transformer
title: StableAudioDiTModel
- local: api/models/transformer2d
title: Transformer2DModel
- local: api/models/transformer_temporal
title: TransformerTemporalModel
- local: api/models/wan_transformer_3d
title: WanTransformer3DModel
title: Transformers
- sections:
- local: api/models/stable_cascade_unet
title: StableCascadeUNet
- local: api/models/unet
title: UNet1DModel
- local: api/models/unet2d
title: UNet2DModel
- local: api/models/unet2d-cond
title: UNet2DConditionModel
- local: api/models/unet2d
title: UNet2DModel
- local: api/models/unet3d-cond
title: UNet3DConditionModel
- local: api/models/unet-motion
@@ -324,22 +369,28 @@
title: UViT2DModel
title: UNets
- sections:
- local: api/models/asymmetricautoencoderkl
title: AsymmetricAutoencoderKL
- local: api/models/autoencoder_dc
title: AutoencoderDC
- local: api/models/autoencoderkl
title: AutoencoderKL
- local: api/models/autoencoderkl_allegro
title: AutoencoderKLAllegro
- local: api/models/autoencoderkl_cogvideox
title: AutoencoderKLCogVideoX
- local: api/models/autoencoderkl_cosmos
title: AutoencoderKLCosmos
- local: api/models/autoencoder_kl_hunyuan_video
title: AutoencoderKLHunyuanVideo
- local: api/models/autoencoderkl_ltx_video
title: AutoencoderKLLTXVideo
- local: api/models/autoencoderkl_magvit
title: AutoencoderKLMagvit
- local: api/models/autoencoderkl_mochi
title: AutoencoderKLMochi
- local: api/models/asymmetricautoencoderkl
title: AsymmetricAutoencoderKL
- local: api/models/autoencoder_dc
title: AutoencoderDC
- local: api/models/autoencoder_kl_wan
title: AutoencoderKLWan
- local: api/models/consistency_decoder_vae
title: ConsistencyDecoderVAE
- local: api/models/autoencoder_oobleck
@@ -372,10 +423,14 @@
title: AutoPipeline
- local: api/pipelines/blip_diffusion
title: BLIP-Diffusion
- local: api/pipelines/chroma
title: Chroma
- local: api/pipelines/cogvideox
title: CogVideoX
- local: api/pipelines/cogview3
title: CogView3
- local: api/pipelines/cogview4
title: CogView4
- local: api/pipelines/consisid
title: ConsisID
- local: api/pipelines/consistency_models
@@ -390,12 +445,16 @@
title: ControlNet with Stable Diffusion 3
- local: api/pipelines/controlnet_sdxl
title: ControlNet with Stable Diffusion XL
- local: api/pipelines/controlnet_sana
title: ControlNet-Sana
- local: api/pipelines/controlnetxs
title: ControlNet-XS
- local: api/pipelines/controlnetxs_sdxl
title: ControlNet-XS with Stable Diffusion XL
- local: api/pipelines/controlnet_union
title: ControlNetUnion
- local: api/pipelines/cosmos
title: Cosmos
- local: api/pipelines/dance_diffusion
title: Dance Diffusion
- local: api/pipelines/ddim
@@ -408,10 +467,16 @@
title: DiffEdit
- local: api/pipelines/dit
title: DiT
- local: api/pipelines/easyanimate
title: EasyAnimate
- local: api/pipelines/flux
title: Flux
- local: api/pipelines/control_flux_inpaint
title: FluxControlInpaint
- local: api/pipelines/framepack
title: Framepack
- local: api/pipelines/hidream
title: HiDream-I1
- local: api/pipelines/hunyuandit
title: Hunyuan-DiT
- local: api/pipelines/hunyuan_video
@@ -438,6 +503,8 @@
title: LEDITS++
- local: api/pipelines/ltx_video
title: LTXVideo
- local: api/pipelines/lumina2
title: Lumina 2.0
- local: api/pipelines/lumina
title: Lumina-T2X
- local: api/pipelines/marigold
@@ -448,6 +515,8 @@
title: MultiDiffusion
- local: api/pipelines/musicldm
title: MusicLDM
- local: api/pipelines/omnigen
title: OmniGen
- local: api/pipelines/pag
title: PAG
- local: api/pipelines/paint_by_example
@@ -460,6 +529,8 @@
title: PixArt-Σ
- local: api/pipelines/sana
title: Sana
- local: api/pipelines/sana_sprint
title: Sana Sprint
- local: api/pipelines/self_attention_guidance
title: Self-Attention Guidance
- local: api/pipelines/semantic_stable_diffusion
@@ -473,40 +544,40 @@
- sections:
- local: api/pipelines/stable_diffusion/overview
title: Overview
- local: api/pipelines/stable_diffusion/text2img
title: Text-to-image
- local: api/pipelines/stable_diffusion/depth2img
title: Depth-to-image
- local: api/pipelines/stable_diffusion/gligen
title: GLIGEN (Grounded Language-to-Image Generation)
- local: api/pipelines/stable_diffusion/image_variation
title: Image variation
- local: api/pipelines/stable_diffusion/img2img
title: Image-to-image
- local: api/pipelines/stable_diffusion/svd
title: Image-to-video
- local: api/pipelines/stable_diffusion/inpaint
title: Inpainting
- local: api/pipelines/stable_diffusion/depth2img
title: Depth-to-image
- local: api/pipelines/stable_diffusion/image_variation
title: Image variation
- local: api/pipelines/stable_diffusion/k_diffusion
title: K-Diffusion
- local: api/pipelines/stable_diffusion/latent_upscale
title: Latent upscaler
- local: api/pipelines/stable_diffusion/ldm3d_diffusion
title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler
- local: api/pipelines/stable_diffusion/stable_diffusion_safe
title: Safe Stable Diffusion
- local: api/pipelines/stable_diffusion/sdxl_turbo
title: SDXL Turbo
- local: api/pipelines/stable_diffusion/stable_diffusion_2
title: Stable Diffusion 2
- local: api/pipelines/stable_diffusion/stable_diffusion_3
title: Stable Diffusion 3
- local: api/pipelines/stable_diffusion/stable_diffusion_xl
title: Stable Diffusion XL
- local: api/pipelines/stable_diffusion/sdxl_turbo
title: SDXL Turbo
- local: api/pipelines/stable_diffusion/latent_upscale
title: Latent upscaler
- local: api/pipelines/stable_diffusion/upscale
title: Super-resolution
- local: api/pipelines/stable_diffusion/k_diffusion
title: K-Diffusion
- local: api/pipelines/stable_diffusion/ldm3d_diffusion
title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler
- local: api/pipelines/stable_diffusion/adapter
title: T2I-Adapter
- local: api/pipelines/stable_diffusion/gligen
title: GLIGEN (Grounded Language-to-Image Generation)
- local: api/pipelines/stable_diffusion/text2img
title: Text-to-image
title: Stable Diffusion
- local: api/pipelines/stable_unclip
title: Stable unCLIP
@@ -520,6 +591,10 @@
title: UniDiffuser
- local: api/pipelines/value_guided_sampling
title: Value-guided sampling
- local: api/pipelines/visualcloze
title: VisualCloze
- local: api/pipelines/wan
title: Wan
- local: api/pipelines/wuerstchen
title: Wuerstchen
title: Pipelines
@@ -529,6 +604,10 @@
title: Overview
- local: api/schedulers/cm_stochastic_iterative
title: CMStochasticIterativeScheduler
- local: api/schedulers/ddim_cogvideox
title: CogVideoXDDIMScheduler
- local: api/schedulers/multistep_dpm_solver_cogvideox
title: CogVideoXDPMScheduler
- local: api/schedulers/consistency_decoder
title: ConsistencyDecoderScheduler
- local: api/schedulers/cosine_dpm
@@ -598,6 +677,8 @@
title: Attention Processor
- local: api/activations
title: Custom activation functions
- local: api/cache
title: Caching methods
- local: api/normalization
title: Custom normalization layers
- local: api/utilities

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -25,3 +25,16 @@ Customized activation functions for supporting various models in 🤗 Diffusers.
## ApproximateGELU
[[autodoc]] models.activations.ApproximateGELU
## SwiGLU
[[autodoc]] models.activations.SwiGLU
## FP32SiLU
[[autodoc]] models.activations.FP32SiLU
## LinearActivation
[[autodoc]] models.activations.LinearActivation

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -147,3 +147,20 @@ An attention processor is a class for applying different types of attention mech
## XLAFlashAttnProcessor2_0
[[autodoc]] models.attention_processor.XLAFlashAttnProcessor2_0
## XFormersJointAttnProcessor
[[autodoc]] models.attention_processor.XFormersJointAttnProcessor
## IPAdapterXFormersAttnProcessor
[[autodoc]] models.attention_processor.IPAdapterXFormersAttnProcessor
## FluxIPAdapterJointAttnProcessor2_0
[[autodoc]] models.attention_processor.FluxIPAdapterJointAttnProcessor2_0
## XLAFluxFlashAttnProcessor2_0
[[autodoc]] models.attention_processor.XLAFluxFlashAttnProcessor2_0

View File

@@ -0,0 +1,36 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# Caching methods
Cache methods speedup diffusion transformers by storing and reusing intermediate outputs of specific layers, such as attention and feedforward layers, instead of recalculating them at each inference step.
## CacheMixin
[[autodoc]] CacheMixin
## PyramidAttentionBroadcastConfig
[[autodoc]] PyramidAttentionBroadcastConfig
[[autodoc]] apply_pyramid_attention_broadcast
## FasterCacheConfig
[[autodoc]] FasterCacheConfig
[[autodoc]] apply_faster_cache
### FirstBlockCacheConfig
[[autodoc]] FirstBlockCacheConfig
[[autodoc]] apply_first_block_cache

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -20,7 +20,15 @@ LoRA is a fast and lightweight training method that inserts and trains a signifi
- [`FluxLoraLoaderMixin`] provides similar functions for [Flux](https://huggingface.co/docs/diffusers/main/en/api/pipelines/flux).
- [`CogVideoXLoraLoaderMixin`] provides similar functions for [CogVideoX](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogvideox).
- [`Mochi1LoraLoaderMixin`] provides similar functions for [Mochi](https://huggingface.co/docs/diffusers/main/en/api/pipelines/mochi).
- [`AuraFlowLoraLoaderMixin`] provides similar functions for [AuraFlow](https://huggingface.co/fal/AuraFlow).
- [`LTXVideoLoraLoaderMixin`] provides similar functions for [LTX-Video](https://huggingface.co/docs/diffusers/main/en/api/pipelines/ltx_video).
- [`SanaLoraLoaderMixin`] provides similar functions for [Sana](https://huggingface.co/docs/diffusers/main/en/api/pipelines/sana).
- [`HunyuanVideoLoraLoaderMixin`] provides similar functions for [HunyuanVideo](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hunyuan_video).
- [`Lumina2LoraLoaderMixin`] provides similar functions for [Lumina2](https://huggingface.co/docs/diffusers/main/en/api/pipelines/lumina2).
- [`WanLoraLoaderMixin`] provides similar functions for [Wan](https://huggingface.co/docs/diffusers/main/en/api/pipelines/wan).
- [`CogView4LoraLoaderMixin`] provides similar functions for [CogView4](https://huggingface.co/docs/diffusers/main/en/api/pipelines/cogview4).
- [`AmusedLoraLoaderMixin`] is for the [`AmusedPipeline`].
- [`HiDreamImageLoraLoaderMixin`] provides similar functions for [HiDream Image](https://huggingface.co/docs/diffusers/main/en/api/pipelines/hidream)
- [`LoraBaseMixin`] provides a base class with several utility methods to fuse, unfuse, unload, LoRAs and more.
<Tip>
@@ -29,6 +37,10 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse
</Tip>
## LoraBaseMixin
[[autodoc]] loaders.lora_base.LoraBaseMixin
## StableDiffusionLoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.StableDiffusionLoraLoaderMixin
@@ -52,11 +64,42 @@ To learn more about how to load LoRA weights, see the [LoRA](../../using-diffuse
## Mochi1LoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.Mochi1LoraLoaderMixin
## AuraFlowLoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.AuraFlowLoraLoaderMixin
## LTXVideoLoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.LTXVideoLoraLoaderMixin
## SanaLoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.SanaLoraLoaderMixin
## HunyuanVideoLoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.HunyuanVideoLoraLoaderMixin
## Lumina2LoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.Lumina2LoraLoaderMixin
## CogView4LoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.CogView4LoraLoaderMixin
## WanLoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.WanLoraLoaderMixin
## AmusedLoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.AmusedLoraLoaderMixin
## LoraBaseMixin
## HiDreamImageLoraLoaderMixin
[[autodoc]] loaders.lora_base.LoraBaseMixin
[[autodoc]] loaders.lora_pipeline.HiDreamImageLoraLoaderMixin
## WanLoraLoaderMixin
[[autodoc]] loaders.lora_pipeline.WanLoraLoaderMixin

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.
# AsymmetricAutoencoderKL
Improved larger variational autoencoder (VAE) model with KL loss for inpainting task: [Designing a Better Asymmetric VQGAN for StableDiffusion](https://arxiv.org/abs/2306.04632) by Zixin Zhu, Xuelu Feng, Dongdong Chen, Jianmin Bao, Le Wang, Yinpeng Chen, Lu Yuan, Gang Hua.
Improved larger variational autoencoder (VAE) model with KL loss for inpainting task: [Designing a Better Asymmetric VQGAN for StableDiffusion](https://huggingface.co/papers/2306.04632) by Zixin Zhu, Xuelu Feng, Dongdong Chen, Jianmin Bao, Le Wang, Yinpeng Chen, Lu Yuan, Gang Hua.
The abstract from the paper is:

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,29 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# AutoModel
The `AutoModel` is designed to make it easy to load a checkpoint without needing to know the specific model class. `AutoModel` automatically retrieves the correct model class from the checkpoint `config.json` file.
```python
from diffusers import AutoModel, AutoPipelineForText2Image
unet = AutoModel.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", subfolder="unet")
pipe = AutoPipelineForText2Image.from_pretrained("stable-diffusion-v1-5/stable-diffusion-v1-5", unet=unet)
```
## AutoModel
[[autodoc]] AutoModel
- all
- from_pretrained

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,32 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# AutoencoderKLWan
The 3D variational autoencoder (VAE) model with KL loss used in [Wan 2.1](https://github.com/Wan-Video/Wan2.1) by the Alibaba Wan Team.
The model can be loaded with the following code snippet.
```python
from diffusers import AutoencoderKLWan
vae = AutoencoderKLWan.from_pretrained("Wan-AI/Wan2.1-T2V-1.3B-Diffusers", subfolder="vae", torch_dtype=torch.float32)
```
## AutoencoderKLWan
[[autodoc]] AutoencoderKLWan
- decode
- all
## DecoderOutput
[[autodoc]] models.autoencoders.vae.DecoderOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.
# AutoencoderKL
The variational autoencoder (VAE) model with KL loss was introduced in [Auto-Encoding Variational Bayes](https://arxiv.org/abs/1312.6114v11) by Diederik P. Kingma and Max Welling. The model is used in 🤗 Diffusers to encode images into latents and to decode latent representations into images.
The variational autoencoder (VAE) model with KL loss was introduced in [Auto-Encoding Variational Bayes](https://huggingface.co/papers/1312.6114v11) by Diederik P. Kingma and Max Welling. The model is used in 🤗 Diffusers to encode images into latents and to decode latent representations into images.
The abstract from the paper is:

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -18,7 +18,7 @@ The model can be loaded with the following code snippet.
```python
from diffusers import AutoencoderKLAllegro
vae = AutoencoderKLCogVideoX.from_pretrained("rhymes-ai/Allegro", subfolder="vae", torch_dtype=torch.float32).to("cuda")
vae = AutoencoderKLAllegro.from_pretrained("rhymes-ai/Allegro", subfolder="vae", torch_dtype=torch.float32).to("cuda")
```
## AutoencoderKLAllegro

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,40 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# AutoencoderKLCosmos
[Cosmos Tokenizers](https://github.com/NVIDIA/Cosmos-Tokenizer).
Supported models:
- [nvidia/Cosmos-1.0-Tokenizer-CV8x8x8](https://huggingface.co/nvidia/Cosmos-1.0-Tokenizer-CV8x8x8)
The model can be loaded with the following code snippet.
```python
from diffusers import AutoencoderKLCosmos
vae = AutoencoderKLCosmos.from_pretrained("nvidia/Cosmos-1.0-Tokenizer-CV8x8x8", subfolder="vae")
```
## AutoencoderKLCosmos
[[autodoc]] AutoencoderKLCosmos
- decode
- encode
- all
## AutoencoderKLOutput
[[autodoc]] models.autoencoders.autoencoder_kl.AutoencoderKLOutput
## DecoderOutput
[[autodoc]] models.autoencoders.vae.DecoderOutput

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,37 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# AutoencoderKLMagvit
The 3D variational autoencoder (VAE) model with KL loss used in [EasyAnimate](https://github.com/aigc-apps/EasyAnimate) was introduced by Alibaba PAI.
The model can be loaded with the following code snippet.
```python
from diffusers import AutoencoderKLMagvit
vae = AutoencoderKLMagvit.from_pretrained("alibaba-pai/EasyAnimateV5.1-12b-zh", subfolder="vae", torch_dtype=torch.float16).to("cuda")
```
## AutoencoderKLMagvit
[[autodoc]] AutoencoderKLMagvit
- decode
- encode
- all
## AutoencoderKLOutput
[[autodoc]] models.autoencoders.autoencoder_kl.AutoencoderKLOutput
## DecoderOutput
[[autodoc]] models.autoencoders.vae.DecoderOutput

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,19 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# ChromaTransformer2DModel
A modified flux Transformer model from [Chroma](https://huggingface.co/lodestones/Chroma)
## ChromaTransformer2DModel
[[autodoc]] ChromaTransformer2DModel

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,30 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# CogView4Transformer2DModel
A Diffusion Transformer model for 2D data from [CogView4]()
The model can be loaded with the following code snippet.
```python
from diffusers import CogView4Transformer2DModel
transformer = CogView4Transformer2DModel.from_pretrained("THUDM/CogView4-6B", subfolder="transformer", torch_dtype=torch.bfloat16).to("cuda")
```
## CogView4Transformer2DModel
[[autodoc]] CogView4Transformer2DModel
## Transformer2DModelOutput
[[autodoc]] models.modeling_outputs.Transformer2DModelOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -11,7 +11,7 @@ specific language governing permissions and limitations under the License. -->
# ConsisIDTransformer3DModel
A Diffusion Transformer model for 3D data from [ConsisID](https://github.com/PKU-YuanGroup/ConsisID) was introduced in [Identity-Preserving Text-to-Video Generation by Frequency Decomposition](https://arxiv.org/pdf/2411.17440) by Peking University & University of Rochester & etc.
A Diffusion Transformer model for 3D data from [ConsisID](https://github.com/PKU-YuanGroup/ConsisID) was introduced in [Identity-Preserving Text-to-Video Generation by Frequency Decomposition](https://huggingface.co/papers/2411.17440) by Peking University & University of Rochester & etc.
The model can be loaded with the following code snippet.

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team and The InstantX Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team and The InstantX Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team and Tencent Hunyuan Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team and Tencent Hunyuan Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -12,7 +12,7 @@ specific language governing permissions and limitations under the License.
# HunyuanDiT2DControlNetModel
HunyuanDiT2DControlNetModel is an implementation of ControlNet for [Hunyuan-DiT](https://arxiv.org/abs/2405.08748).
HunyuanDiT2DControlNetModel is an implementation of ControlNet for [Hunyuan-DiT](https://huggingface.co/papers/2405.08748).
ControlNet was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang, Anyi Rao, and Maneesh Agrawala.

View File

@@ -0,0 +1,29 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# SanaControlNetModel
The ControlNet model was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang, Anyi Rao, Maneesh Agrawala. It provides a greater degree of control over text-to-image generation by conditioning the model on additional inputs such as edge maps, depth maps, segmentation maps, and keypoints for pose detection.
The abstract from the paper is:
*We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers pretrained with billions of images as a strong backbone to learn a diverse set of conditional controls. The neural architecture is connected with "zero convolutions" (zero-initialized convolution layers) that progressively grow the parameters from zero and ensure that no harmful noise could affect the finetuning. We test various conditioning controls, eg, edges, depth, segmentation, human pose, etc, with Stable Diffusion, using single or multiple conditions, with or without prompts. We show that the training of ControlNets is robust with small (<50k) and large (>1m) datasets. Extensive results show that ControlNet may facilitate wider applications to control image diffusion models.*
This model was contributed by [ishan24](https://huggingface.co/ishan24). ❤️
The original codebase can be found at [NVlabs/Sana](https://github.com/NVlabs/Sana), and you can find official ControlNet checkpoints on [Efficient-Large-Model's](https://huggingface.co/Efficient-Large-Model) Hub profile.
## SanaControlNetModel
[[autodoc]] SanaControlNetModel
## SanaControlNetOutput
[[autodoc]] models.controlnets.controlnet_sana.SanaControlNetOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team and The InstantX Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team and The InstantX Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -11,11 +11,11 @@ specific language governing permissions and limitations under the License. -->
# SparseControlNetModel
SparseControlNetModel is an implementation of ControlNet for [AnimateDiff](https://arxiv.org/abs/2307.04725).
SparseControlNetModel is an implementation of ControlNet for [AnimateDiff](https://huggingface.co/papers/2307.04725).
ControlNet was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang, Anyi Rao, and Maneesh Agrawala.
The SparseCtrl version of ControlNet was introduced in [SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models](https://arxiv.org/abs/2311.16933) for achieving controlled generation in text-to-video diffusion models by Yuwei Guo, Ceyuan Yang, Anyi Rao, Maneesh Agrawala, Dahua Lin, and Bo Dai.
The SparseCtrl version of ControlNet was introduced in [SparseCtrl: Adding Sparse Controls to Text-to-Video Diffusion Models](https://huggingface.co/papers/2311.16933) for achieving controlled generation in text-to-video diffusion models by Yuwei Guo, Ceyuan Yang, Anyi Rao, Maneesh Agrawala, Dahua Lin, and Bo Dai.
The abstract from the paper is:

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team and The InstantX Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team and The InstantX Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,30 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# CosmosTransformer3DModel
A Diffusion Transformer model for 3D video-like data was introduced in [Cosmos World Foundation Model Platform for Physical AI](https://huggingface.co/papers/2501.03575) by NVIDIA.
The model can be loaded with the following code snippet.
```python
from diffusers import CosmosTransformer3DModel
transformer = CosmosTransformer3DModel.from_pretrained("nvidia/Cosmos-1.0-Diffusion-7B-Text2World", subfolder="transformer", torch_dtype=torch.bfloat16)
```
## CosmosTransformer3DModel
[[autodoc]] CosmosTransformer3DModel
## Transformer2DModelOutput
[[autodoc]] models.modeling_outputs.Transformer2DModelOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,30 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# EasyAnimateTransformer3DModel
A Diffusion Transformer model for 3D data from [EasyAnimate](https://github.com/aigc-apps/EasyAnimate) was introduced by Alibaba PAI.
The model can be loaded with the following code snippet.
```python
from diffusers import EasyAnimateTransformer3DModel
transformer = EasyAnimateTransformer3DModel.from_pretrained("alibaba-pai/EasyAnimateV5.1-12b-zh", subfolder="transformer", torch_dtype=torch.float16).to("cuda")
```
## EasyAnimateTransformer3DModel
[[autodoc]] EasyAnimateTransformer3DModel
## Transformer2DModelOutput
[[autodoc]] models.modeling_outputs.Transformer2DModelOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,46 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# HiDreamImageTransformer2DModel
A Transformer model for image-like data from [HiDream-I1](https://huggingface.co/HiDream-ai).
The model can be loaded with the following code snippet.
```python
from diffusers import HiDreamImageTransformer2DModel
transformer = HiDreamImageTransformer2DModel.from_pretrained("HiDream-ai/HiDream-I1-Full", subfolder="transformer", torch_dtype=torch.bfloat16)
```
## Loading GGUF quantized checkpoints for HiDream-I1
GGUF checkpoints for the `HiDreamImageTransformer2DModel` can be loaded using `~FromOriginalModelMixin.from_single_file`
```python
import torch
from diffusers import GGUFQuantizationConfig, HiDreamImageTransformer2DModel
ckpt_path = "https://huggingface.co/city96/HiDream-I1-Dev-gguf/blob/main/hidream-i1-dev-Q2_K.gguf"
transformer = HiDreamImageTransformer2DModel.from_single_file(
ckpt_path,
quantization_config=GGUFQuantizationConfig(compute_dtype=torch.bfloat16),
torch_dtype=torch.bfloat16
)
```
## HiDreamImageTransformer2DModel
[[autodoc]] HiDreamImageTransformer2DModel
## Transformer2DModelOutput
[[autodoc]] models.modeling_outputs.Transformer2DModelOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,30 @@
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License. -->
# Lumina2Transformer2DModel
A Diffusion Transformer model for 3D video-like data was introduced in [Lumina Image 2.0](https://huggingface.co/Alpha-VLLM/Lumina-Image-2.0) by Alpha-VLLM.
The model can be loaded with the following code snippet.
```python
from diffusers import Lumina2Transformer2DModel
transformer = Lumina2Transformer2DModel.from_pretrained("Alpha-VLLM/Lumina-Image-2.0", subfolder="transformer", torch_dtype=torch.bfloat16)
```
## Lumina2Transformer2DModel
[[autodoc]] Lumina2Transformer2DModel
## Transformer2DModelOutput
[[autodoc]] models.modeling_outputs.Transformer2DModelOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,30 @@
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# OmniGenTransformer2DModel
A Transformer model that accepts multimodal instructions to generate images for [OmniGen](https://github.com/VectorSpaceLab/OmniGen/).
The abstract from the paper is:
*The emergence of Large Language Models (LLMs) has unified language generation tasks and revolutionized human-machine interaction. However, in the realm of image generation, a unified model capable of handling various tasks within a single framework remains largely unexplored. In this work, we introduce OmniGen, a new diffusion model for unified image generation. OmniGen is characterized by the following features: 1) Unification: OmniGen not only demonstrates text-to-image generation capabilities but also inherently supports various downstream tasks, such as image editing, subject-driven generation, and visual conditional generation. 2) Simplicity: The architecture of OmniGen is highly simplified, eliminating the need for additional plugins. Moreover, compared to existing diffusion models, it is more user-friendly and can complete complex tasks end-to-end through instructions without the need for extra intermediate steps, greatly simplifying the image generation workflow. 3) Knowledge Transfer: Benefit from learning in a unified format, OmniGen effectively transfers knowledge across different tasks, manages unseen tasks and domains, and exhibits novel capabilities. We also explore the models reasoning capabilities and potential applications of the chain-of-thought mechanism. This work represents the first attempt at a general-purpose image generation model, and we will release our resources at https://github.com/VectorSpaceLab/OmniGen to foster future advancements.*
```python
import torch
from diffusers import OmniGenTransformer2DModel
transformer = OmniGenTransformer2DModel.from_pretrained("Shitao/OmniGen-v1-diffusers", subfolder="transformer", torch_dtype=torch.bfloat16)
```
## OmniGenTransformer2DModel
[[autodoc]] OmniGenTransformer2DModel

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
<!--Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!-- Copyright 2024 The HuggingFace Team. All rights reserved.
<!-- Copyright 2025 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

Some files were not shown because too many files have changed in this diff Show More