Compare commits

...

582 Commits

Author SHA1 Message Date
sayakpaul
c586aadef6 Release: v0.29.2 2024-06-27 09:19:56 +05:30
Sayak Paul
1479729dde [LoRA] fix vanilla fine-tuned lora loading. (#8691)
fix vanilla fine-tuned lora loading.
2024-06-27 09:17:44 +05:30
Sayak Paul
64b20500dc [Chore] remove deprecation from transformer2d regarding the output class. (#8698)
* remove deprecation from transformer2d regarding the output class.

* up

* deprecate more
2024-06-27 09:17:32 +05:30
Sayak Paul
aa2b3a3bb6 [LoRA] fix conversion utility so that lora dora loads correctly (#8688)
fix conversion utility so that lora dora loads correctly
2024-06-27 09:17:11 +05:30
Sayak Paul
edc1c8928b [LoRA] refactor lora conversion utility. (#8295)
* refactor lora conversion utility.

* remove error raises.

* add onetrainer support too.
2024-06-27 09:16:37 +05:30
Nan
a0a5427028 [SD3] Fix mis-matched shape when num_images_per_prompt > 1 using without T5 (text_encoder_3=None) (#8558)
* fix shape mismatch when num_images_per_prompt > 1 and text_encoder_3=None

* style

* fix copies

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-06-20 11:10:11 -10:00
YiYi Xu
dc74c7ec4f fix from_single_file for checkpoints with t5 (#8631)
fix single file
2024-06-20 11:06:47 -10:00
王奇勋
2eafde786a Support SD3 ControlNet and Multi-ControlNet. (#8566)
* sd3 controlnet



---------

Co-authored-by: haofanwang <haofanwang.ai@gmail.com>
2024-06-20 11:06:31 -10:00
Carolinabanana
7ec060d449 Fix gradient checkpointing issue for Stable Diffusion 3 (#8542)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-06-20 11:06:18 -10:00
Álvaro Somoza
828e364fbe [SD3 Inference] T5 Token limit (#8506)
* max_sequence_length for the T5

* updated img2img

* apply suggestions

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-06-20 11:05:50 -10:00
Marc Sun
6ea08a90b0 Fix sharding when no device_map is passed (#8531)
* Fix sharding when no device_map is passed

* style

* add tests

* align

* add docstring

* format

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-20 11:05:27 -10:00
Dhruv Nair
10ada4dae6 [Single File Loading] Handle unexpected keys in CLIP models when accelerate isn't installed. (#8462)
* update

* update

* update

* update

* update

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-20 11:05:12 -10:00
Dhruv Nair
7fada49e29 Expand Single File support in SD3 Pipeline (#8517)
* update

* update
2024-06-20 11:04:02 -10:00
Beinsezii
46418bd752 Add Hunyuan AutoPipe mapping (#8505) 2024-06-20 11:03:44 -10:00
Beinsezii
80460d98e1 Add SD3 AutoPipeline mappings (#8489) 2024-06-20 11:03:19 -10:00
Sayak Paul
93b4b105a3 fix warning log for Transformer SD3 (#8496)
fix warning log
2024-06-20 11:02:39 -10:00
YiYi Xu
21ccde8209 prepare for patch release 2024-06-20 11:00:42 -10:00
sayakpaul
39aa3909e8 Release: v0.29.0 2024-06-12 21:12:08 +01:00
Sayak Paul
ec068f9b5b fix dual transformer2d import (#8491)
fix
2024-06-12 21:10:27 +01:00
Ameer Azam
0240d4191a Update README_sd3.md (#8490)
becasue in  Readme  it was not correct
train_dreambooth_sd3.py to train_dreambooth_lora_sd3
2024-06-12 21:08:36 +01:00
Dhruv Nair
04717fd861 Add Stable Diffusion 3 (#8483)
* up

* add sd3

* update

* update

* add tests

* fix copies

* fix docs

* update

* add dreambooth lora

* add LoRA

* update

* update

* update

* update

* import fix

* update

* Update src/diffusers/pipelines/stable_diffusion_3/pipeline_stable_diffusion_3.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* import fix 2

* update

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* update

* update

* update

* fix ckpt id

* fix more ids

* update

* missing doc

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_3.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_3.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update'

* fix

* update

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

* note on gated access.

* requirements

* licensing

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-06-12 20:44:00 +01:00
Jiwook Han
6fd458e99d 🌐 [i18n-KO] Translated conceptual/philosophy.md and 3 other documents to Korean (#8294)
* translation about 3 documents into Korean

* evaluation doc korean translation

* _toctree.yml modify

* doc title fix : philosopy->philosophy

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update philosophy.md (from jungnerd)

---------

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-06-12 09:40:37 -07:00
Greg Hunkins
1066fe4cbc 🤫 Quiet IP Adapter Mask Warning (#8475)
* quiet attn parameters

* fix lint

* make style && make quality

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-12 16:50:13 +01:00
Sayak Paul
d38f69ea25 change max_shard_size to 10GB (#8445)
* change max_shard_size to 10GB

* add notes to the documentation

* Update src/diffusers/models/modeling_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* change to abs limit

---------

Co-authored-by: Lucain <lucainp@gmail.com>
2024-06-12 13:49:13 +01:00
Patrick
0a1c13af79 image_processor.py: Fixed an error in ValueError's message (#8447)
* image_processor.py: Fixed an error in ValueError's message , as the string's join method tried to join types, instead of strings

Bug that occurred:

f"Input is in incorrect format. Currently, we only support {', '.join(supported_formats)}"
TypeError: sequence item 0: expected str instance, type found

* Fixed: C417 Unnecessary `map` usage (rewrite using a generator expression)

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-11 08:09:24 -10:00
YiYi Xu
0028c34432 fix SEGA pipeline (#8467)
* fix

* style

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-11 06:37:49 -10:00
Sayak Paul
d457beed92 Update README.md to update the MaPO project (#8470)
Update README.md
2024-06-11 10:10:45 +01:00
Jianqi Pan
1d9a6a81b9 🔧 chore: use modeling_outputs.Transformer2DModelOutput (#8436)
* 🔧 chore: use modeling_outputs.Transformer2DModelOutput

* 🔧 chore: isort

* 🔧 chore: isort

* style

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2024-06-10 12:11:41 +01:00
Luc Georges
4e0984db6c fix(ci): remove unnecessary permissions (#8457) 2024-06-10 10:49:29 +01:00
Luc Georges
83bc6c94ea feat(ci): add trufflehog secrets detection (#8430) 2024-06-08 07:56:47 +05:30
Lucain
0d68ddf327 Move away from cached_download (#8419)
* Move away from

* unused constant

* Add custom error
2024-06-07 15:43:00 +05:30
Sayak Paul
7d887118b9 [Core] support saving and loading of sharded checkpoints (#7830)
* feat: support saving a model in sharded checkpoints.

* feat: make loading of sharded checkpoints work.

* add tests

* cleanse the loading logic a bit more.

* more resilience while loading from the Hub.

* parallelize shard downloads by using snapshot_download()/

* default to a shard size.

* more fix

* Empty-Commit

* debug

* fix

* uality

* more debugging

* fix more

* initial comments from Benjamin

* move certain methods to loading_utils

* add test to check if the correct number of shards are present.

* add a test to check if loading of sharded checkpoints from the Hub is okay

* clarify the unit when passed as an int.

* use hf_hub for sharding.

* remove unnecessary code

* remove unnecessary function

* lucain's comments.

* fixes

* address high-level comments.

* fix test

* subfolder shenanigans./

* Update src/diffusers/utils/hub_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Apply suggestions from code review

Co-authored-by: Lucain <lucainp@gmail.com>

* remove _huggingface_hub_version as not needed.

* address more feedback.

* add a test for local_files_only=True/

* need hf hub to be at least 0.23.2

* style

* final comment.

* clean up subfolder.

* deal with suffixes in code.

* _add_variant default.

* use weights_name_pattern

* remove add_suffix_keyword

* clean up downloading of sharded ckpts.

* don't return something special when using index.json

* fix more

* don't use bare except

* remove comments and catch the errors better

* fix a couple of things when using is_file()

* empty

---------

Co-authored-by: Lucain <lucainp@gmail.com>
2024-06-07 14:49:10 +05:30
Lucain
b63c956860 Final fix for mirror community pipeline (#8427) 2024-06-07 11:08:33 +02:00
Lucain
716b2062bf Fix mirror community pipeline (#8426) 2024-06-07 11:03:48 +02:00
Lucain
5fd6825d25 Fix mirror_community_pipeline.yml name (#8425) 2024-06-07 11:00:05 +02:00
Lucain
e0fae6fd73 Mirror ./examples/community folder on HF (#8417)
* first draft

* secret

* tiktok

* capital matters

* dataset matter

* don't be a prick

* refact

* only on main or tag

* document with an example

* Update destination dataset

* link

* allow manual trigger

* better

* lin

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-07 10:56:05 +02:00
Tolga Cangöz
ec1aded12e Optimize test files by fixing CPU-offloading usage (#8409)
* Refactor code to remove unnecessary calls to `to(torch_device)`

* Refactor code to remove unnecessary calls to `to("cuda")`

* Update pipeline_stable_diffusion_diffedit.py
2024-06-06 09:51:26 -10:00
Steven Liu
151a56b80e [docs] Single file usage (#8412)
* single file usage

* edit
2024-06-06 12:40:34 -07:00
Sayak Paul
a3faf3f260 [Core] fix: legacy model mapping (#8416)
* fix: legacy model mapping

* remove print
2024-06-06 20:35:05 +05:30
Sayak Paul
867a2b0cf9 [Hunyuan] add optimization related sections to the hunyuan dit docs. (#8402)
* optimizations to the hunyuan dit docs.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/hunyuandit.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-06-06 05:41:38 +05:30
Tolga Cangöz
98730c5dd7 Errata (#8322)
* Fix typos

* Trim trailing whitespaces

* Remove a trailing whitespace

* chore: Update MarigoldDepthPipeline checkpoint to prs-eth/marigold-lcm-v1-0

* Revert "chore: Update MarigoldDepthPipeline checkpoint to prs-eth/marigold-lcm-v1-0"

This reverts commit fd742b30b4.

* pokemon -> naruto

* `DPMSolverMultistep` -> `DPMSolverMultistepScheduler`

* Improve Markdown stylization

* Improve style

* Improve style

* Refactor pipeline variable names for consistency

* up style
2024-06-05 13:59:09 -07:00
Guillaume LEGENDRE
7ebd359446 Update tailscale action to main (#8403) 2024-06-05 18:53:33 +05:30
Hzzone
d3881f35b7 Gligen training (#7906)
* add training code of gligen

* fix code quality tests.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-05 16:26:42 +04:00
Sayak Paul
48207d6689 [Scheduler] fix: EDM schedulers when using the exp sigma schedule. (#8385)
* fix: euledm when using the exp sigma schedule.

* fix-copies

* remove print.

* reduce friction

* yiyi's suggestioms
2024-06-04 19:31:43 -10:00
Sayak Paul
2f6f426f66 [Hunyuan] allow Hunyuan DiT to run under 6GB for GPU VRAM (#8399)
* allow hunyuan dit to run under 6GB for GPU VRAM

* add section in the docs/
2024-06-05 08:24:19 +04:00
Sayak Paul
a0542c1917 [LoRA] Remove legacy LoRA code and related adjustments (#8316)
* remove legacy code from load_attn_procs.

* finish first draft

* fix more.

* fix more

* add test

* add serialization support.

* fix-copies

* require peft backend for lora tests

* style

* fix test

* fix loading.

* empty

* address benjamin's feedback.
2024-06-05 08:15:30 +04:00
Sayak Paul
a8ad6664c2 [Hunyuan] feat: support chunked ff. (#8397)
feat: support chunked ff.
2024-06-05 08:12:18 +04:00
Sayak Paul
14f7b545bd [Hunyuan DiT] feat: enable fusing qkv projections when doing attention (#8396)
* feat: introduce qkv fusion for Hunyuan

* fix copies
2024-06-05 07:58:03 +04:00
leaps
07cd20041c Update code example in pipeline_stable_unclip_img2img.py EXAMPLE_DOC_STRING (#8401)
Update code example in pipeline_stable_unclip_img2img.py

Previous code caused an error when run
2024-06-04 17:22:46 -10:00
Sayak Paul
6ddbf6222c [Transformer2DModel] Handle norm_type safely while remapping (#8370)
* handle norm_type of transformer2d_model safely.

* log an info when old model class is being returned.

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* remove extra stuff

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-06-04 13:39:19 +04:00
Sayak Paul
3ff39e8e86 [HunyuanDiT] minor docs changes in hunyuandit (#8395)
minor docs changes in hunyuandit
2024-06-04 12:18:53 +04:00
townwish4git
6be43bd855 Fix AsymmetricAutoencoderKL forward (#8378) 2024-06-03 17:25:11 -10:00
Marçal Comajoan Cara
dc89434bdc Update transformer2d.md title (#8375)
* Update transformer2d.md title

For the other classes (e.g., UNet2DModel) the title of the documentation coincides with the name of the class, but that was not the case for Transformer2DModel.

* Update model docs titles for consistency with class names
2024-06-03 17:01:21 -07:00
Dhruv Nair
4d633bfe9a Update slow test actions (#8381)
* update

* update

* update

* update
2024-06-03 18:32:34 +05:30
XCL
174cf868ea Tencent Hunyuan Team - Updated Doc for HunyuanDiT (#8383)
* add hunyuandit doc

* update hunyuandit doc

* update hunyuandit 2d model

* update toctree.yml for hunyuandit
2024-06-03 14:02:46 +04:00
XCL
413604405f Tencent Hunyuan Team: add HunyuanDiT related updates (#8240)
* Hunyuan Team: add HunyuanDiT related updates


---------

Co-authored-by: XCLiu <liuxc1996@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
2024-06-01 12:41:21 -10:00
39th president of the United States, probably
bc108e1533 Fix DREAM training (#8302)
Co-authored-by: Jimmy <39@🇺🇸.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-06-01 11:27:57 +04:00
Anton Obukhov
86555c9f59 Fix marigold documentation (#8372)
* rename prs-eth/marigold-lcm-v1-0 into prs-eth/marigold-depth-lcm-v1-0

* update image paths in https://huggingface.co/datasets/huggingface/documentation-images to use main branch

* fix relative paths to other diffusers pages

* Update docs/source/en/using-diffusers/marigold_usage.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-05-31 12:10:05 -10:00
Sayak Paul
983dec3bf7 [Core] Introduce class variants for Transformer2DModel (#7647)
* init for patches

* finish patched model.

* continuous transformer

* vectorized transformer2d.

* style.

* inits.

* fix-copies.

* introduce DiTTransformer2DModel.

* fixes

* use REMAPPING as suggested by @DN6

* better logging.

* add pixart transformer model.

* inits.

* caption_channels.

* attention masking.

* fix use_additional_conditions.

* remove print.

* debug

* flatten

* fix: assertion for sigma

* handle remapping for modeling_utils

* add tests for dit transformer2d

* quality

* placeholder for pixart tests

* pixart tests

* add _no_split_modules

* add docs.

* check

* check

* check

* check

* fix tests

* fix tests

* move Transformer output to modeling_output

* move errors better and bring back use_additional_conditions attribute.

* add unnecessary things from DiT.

* clean up pixart

* fix remapping

* fix device_map things in pixart2d.

* replace Transformer2DModel with appropriate classes in dit, pixart tests

* empty

* legacy mixin classes./

* use a remapping dict for fetching class names.

* change to specifc model types in the pipeline implementations.

* move _fetch_remapped_cls_from_config to modeling_loading_utils.py

* fix dependency problems.

* add deprecation note.
2024-05-31 13:40:27 +05:30
Dhruv Nair
f9fa8a868c Change checkpoint key used to identify CLIP models in single file checkpoints (#8319)
update
2024-05-31 11:20:31 +05:30
Jonah
05be622b1c Fix depth pipeline "input/weight type should be the same" error at fp16 (#8321)
Fix "input/weight type should be the same"

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-30 13:59:49 -10:00
satani99
352d96eb82 Modularize train_text_to_image_lora_sdxl inferencing during and after training in example (#8335)
* Modularized the train_lora_sdxl file

* Modularized the train_lora_sdxl file

* Modularized the train_lora_sdxl file

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-31 04:52:22 +05:30
Genius Patrick
3511a9623f fix(training): lr scheduler doesn't work properly in distributed scenarios (#8312) 2024-05-30 15:23:19 +05:30
Dhruv Nair
42cae93b94 Fix StableDiffusionPipeline when text_encoder=None (#8297)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-29 09:00:51 -10:00
Tolga Cangöz
a2ecce26bc Fix Copying Mechanism typo/bug (#8232)
* Fix copying mechanism typos

* fix copying mecha

* Revert, since they are in TODO

* Fix copying mechanism
2024-05-29 09:37:18 -07:00
Steven Liu
9e00b727ad [docs] Files and formats (#7874)
* files and formats

* fix callout

* feedback

* code sample

* feedback
2024-05-29 09:31:32 -07:00
Steven Liu
f7a4626f4b [docs] DeepFloyd training (#8224)
deepfloyd training

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-29 09:27:37 -07:00
Tolga Cangöz
f4a44b7707 Simplify platform_info assignment in diffusers-cli env (#8298)
chore: Simplify `platform_info` assignment
2024-05-29 17:57:42 +05:30
satani99
3bc3b48c10 Modularize train_text_to_image_lora SD inferencing during and after training in example (#8283)
* Modularized the train_lora file

* Modularized the train_lora file

* Modularized the train_lora file

* Modularized the train_lora file

* Modularized the train_lora file

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-29 10:08:02 +05:30
Sayak Paul
581d8aacf7 post release v0.28.0 (#8286)
* post release v0.28.0

* style
2024-05-29 07:13:22 +05:30
Sayak Paul
ba1bfac20b [Core] Refactor IPAdapterPlusImageProjection a bit (#7994)
* use IPAdapterPlusImageProjectionBlock in IPAdapterPlusImageProjection

* reposition IPAdapterPlusImageProjection

* refactor complete?

* fix heads param retrieval.

* update test dict creation method.
2024-05-29 06:30:47 +05:30
Sayak Paul
5edd0b34fa move vqmodel to models.autoencoders. (#8292)
move vqmodel to models.autoencoders.
2024-05-29 06:30:35 +05:30
Sayak Paul
3a28e36aa1 [Post release 0.28.0] remove deprecated blocks. (#8291)
* remove deprecated blocks.

* update the location paths.
2024-05-29 06:29:43 +05:30
Vladimir Mandic
3393c01c9d fix pixart-sigma negative prompt handling (#8299)
* fix negative prompt

* fix

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-28 13:10:35 -10:00
Steven Liu
1fa8dbc63a [docs] Outpaint (#7964)
* first draft

* edits
2024-05-28 14:42:03 -07:00
Steven Liu
0ab6dc0f23 [docs] Scheduler features (#7990)
* noise schedule

* sigmas and zero snr

* feedback

* feedback
2024-05-28 14:41:22 -07:00
Álvaro Somoza
b2030a249c Fix object has no attribute 'flush' when using without a console (#8271)
fix
2024-05-28 11:19:01 -10:00
Sajad Norouzi
67bef2027c Add Kohya fix to SD pipeline for high resolution generation (#7633)
add kohya high resolution fix.
2024-05-28 10:00:04 -10:00
Sayak Paul
aa676c641f change to yiyi's address. (#7981)
* change to yiyi's address.

* update to diffusers@huggingface.co
2024-05-28 08:28:55 -10:00
Sayak Paul
e6df8edadc [LoRA] attempt at fixing onetrainer lora. (#8242)
* attempt at fixing onetrainer lora.

* fix
2024-05-28 08:25:54 -10:00
Jiwook Han
80cfaebaa1 Fix typo in philosophy.md (#8303)
fix typo in philosophy.md
2024-05-28 10:38:48 -07:00
Álvaro Somoza
ba82414106 [docs] Add controlnet example to marigold (#8289)
* initial doc

* fix wrong LCM sentence

* implement binary colormap without requiring matplotlib
update section about Marigold for ControlNet
update formatting of marigold_usage.md

* fix indentation

---------

Co-authored-by: anton <anton.obukhov@gmail.com>
2024-05-28 11:58:06 -04:00
Sayak Paul
fe5f035f79 install wget. (#8285) 2024-05-27 18:06:07 +05:30
Anton Obukhov
b3d10d6d65 [Pipeline] Marigold depth and normals estimation (#7847)
* implement marigold depth and normals pipelines in diffusers core

* remove bibtex

* remove deprecations

* remove save_memory argument

* remove validate_vae

* remove config output

* remove batch_size autodetection

* remove presets logic
move default denoising_steps and processing_resolution into the model config
make default ensemble_size 1

* remove no_grad

* add fp16 to the example usage

* implement is_matplotlib_available
use is_matplotlib_available, is_scipy_available for conditional imports in the marigold depth pipeline

* move colormap, visualize_depth, and visualize_normals into export_utils.py

* make the denoising loop more lucid
fix the outputs to always be 4d tensors or lists of pil images
support a 4d input_image case
attempt to support model_cpu_offload_seq
move check_inputs into a separate function
change default batch_size to 1, remove any logic to make it bigger implicitly

* style

* rename denoising_steps into num_inference_steps

* rename input_image into image

* rename input_latent into latents

* remove decode_image
change decode_prediction to use the AutoencoderKL.decode method

* move clean_latent outside of progress_bar

* refactor marigold-reusable image processing bits into MarigoldImageProcessor class

* clean up the usage example docstring

* make ensemble functions members of the pipelines

* add early checks in check_inputs
rename E into ensemble_size in depth ensembling

* fix vae_scale_factor computation

* better compatibility with torch.compile
better variable naming

* move export_depth_to_png to export_utils

* remove encode_prediction

* improve visualize_depth and visualize_normals to accept multi-dimensional data and lists
remove visualization functions from the pipelines
move exporting depth as 16-bit PNGs functionality from the depth pipeline
update example docstrings

* do not shortcut vae.config variables

* change all asserts to raise ValueError

* rename output_prediction_type to output_type

* better variable names
clean up variable deletion code

* better variable names

* pass desc and leave kwargs into the diffusers progress_bar
implement nested progress bar for images and steps loops

* implement scale_invariant and shift_invariant flags in the ensemble_depth function
add scale_invariant and shift_invariant flags readout from the model config
further refactor ensemble_depth
support ensembling without alignment
add ensemble_depth docstring

* fix generator device placement checks

* move encode_empty_text body into the pipeline call

* minor empty text encoding simplifications

* adjust pipelines' class docstrings to explain the added construction arguments

* improve the scipy failure condition
add comments
improve docstrings
change the default use_full_z_range to True

* make input image values range check configurable in the preprocessor
refactor load_image_canonical in preprocessor to reject unknown types and return the image in the expected 4D format of tensor and on right device
support a list of everything as inputs to the pipeline, change type to PipelineImageInput
implement a check that all input list elements have the same dimensions
improve docstrings of pipeline outputs
remove check_input pipeline argument

* remove forgotten print

* add prediction_type model config

* add uncertainty visualization into export utils
fix NaN values in normals uncertainties

* change default of output_uncertainty to False
better handle the case of an attempt to export or visualize none

* fix `output_uncertainty=False`

* remove kwargs
fix check_inputs according to the new inputs of the pipeline

* rename prepare_latent into prepare_latents as in other pipelines
annotate prepare_latents in normals pipeline with "Copied from"
annotate encode_image in normals pipeline with "Copied from"

* move nested-capable `progress_bar` method into the pipelines
revert the original `progress_bar` method in pipeline_utils

* minor message improvement

* fix cpu offloading

* move colormap, visualize_depth, export_depth_to_16bit_png, visualize_normals, visualize_uncertainty to marigold_image_processing.py
update example docstrings

* fix missing comma

* change torch.FloatTensor to torch.Tensor

* fix importing of MarigoldImageProcessor

* fix vae offloading
fix batched image encoding
remove separate encode_image function and use vae.encode instead

* implement marigold's intial tests
relax generator checks in line with other pipelines
implement return_dict __call__ argument in line with other pipelines

* fix num_images computation

* remove MarigoldImageProcessor and outputs from import structure
update tests

* update docstrings

* update init

* update

* style

* fix

* fix

* up

* up

* up

* add simple test

* up

* update expected np input/output to be channel last

* move expand_tensor_or_array into the MarigoldImageProcessor

* rewrite tests to follow conventions - hardcoded slices instead of image artifacts
write more smoke tests

* add basic docs.

* add anton's contribution statement

* remove todos.

* fix assertion values for marigold depth slow tests

* fix assertion values for depth normals.

* remove print

* support AutoencoderTiny in the pipelines

* update documentation page
add Available Pipelines section
add Available Checkpoints section
add warning about num_inference_steps

* fix missing import in docstring
fix wrong value in visualize_depth docstring

* [doc] add marigold to pipelines overview

* [doc] add section "usage examples"

* fix an issue with latents check in the pipelines

* add "Frame-by-frame Video Processing with Consistency" section

* grammarly

* replace tables with images with css-styled images (blindly)

* style

* print

* fix the assertions.

* take from the github runner.

* take the slices from action artifacts

* style.

* update with the slices from the runner.

* remove unnecessary code blocks.

* Revert "[doc] add marigold to pipelines overview"

This reverts commit a505165150afd8dab23c474d1a054ea505a56a5f.

* remove invitation for new modalities

* split out marigold usage examples

* doc cleanup

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2024-05-27 17:21:49 +05:30
Dhruv Nair
b82f9f5666 Add zip package to doc builder image (#8284)
update
2024-05-27 15:50:00 +05:30
Sayak Paul
6a5ba1b719 [Workflows] add a more secure way to run tests from a PR. (#7969)
* add a more secure way to run tests from a PR.

* make pytest more secure.

* address dhruv's comments.

* improve validation check.

* Update .github/workflows/run_tests_from_a_pr.yml

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-27 13:47:50 +05:30
Dhaivat Bhatt
4d40c9140c Add details about 1-stage implementation in I2VGen-XL docs (#8282)
* Add details about 1-stage implementation

* Add details about 1-stage implementation
2024-05-27 09:56:32 +05:30
Tolga Cangöz
0ab63ff647 Fix CPU Offloading Usage & Typos (#8230)
* Fix typos

* Fix `pipe.enable_model_cpu_offload()` usage

* Fix cpu offloading

* Update numbers
2024-05-24 11:25:29 -07:00
Tolga Cangöz
db33af065b Fix a grammatical error in the raise messages (#8272)
Fix grammatical error
2024-05-24 11:15:00 -07:00
Yue Wu
1096f88e2b sampling bug fix in diffusers tutorial "basic_training.md" (#8223)
sampling bug fix in basic_training.md

In the diffusers basic training tutorial, setting the manual seed argument (generator=torch.manual_seed(config.seed)) in the pipeline call inside evaluate() function rewinds the dataloader shuffling, leading to overfitting due to the model seeing same sequence of training examples after every evaluation call. Using generator=torch.Generator(device='cpu').manual_seed(config.seed) avoids this.
2024-05-24 11:14:32 -07:00
Dhruv Nair
cef4a51223 Clean up from_single_file docs (#8268)
* update

* update
2024-05-24 17:43:51 +05:30
Lucain
edf5ba6a17 Respect resume_download deprecation V2 (#8267)
* Fix resume_downoad FutureWarning

* only resume download
2024-05-24 12:11:03 +02:00
Sayak Paul
9941f1f61b [Chore] run the documentation workflow in a custom container. (#8266)
run the documentation workflow in a custom container.
2024-05-24 15:10:02 +05:30
Yifan Zhou
46a9db0336 [Community Pipeline] FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation (#8239)
* code and doc

* update paper link

* remove redundant codes

* add example video

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-24 14:44:20 +05:30
Dhruv Nair
370146e4e0 Use freedesktop_os_release() in diffusers cli for Python >=3.10 (#8235)
* update

* update
2024-05-24 13:30:40 +05:30
Dhruv Nair
5cd45c24bf Create custom container for doc builder (#8263)
* update

* update
2024-05-24 12:53:48 +05:30
Dhruv Nair
67b3fe0aae Fix resize issue in SVD pipeline with VideoProcessor (#8229)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-23 11:57:34 +05:30
Dhruv Nair
baab065679 Remove unnecessary single file tests for SD Cascade UNet (#7996)
update
2024-05-22 12:29:59 +05:30
BootesVoid
509741aea7 fix: Attribute error in Logger object (logger.warning) (#8183) 2024-05-22 12:29:11 +05:30
Lucain
e1df77ee1e Use HF_TOKEN env var in CI (#7993) 2024-05-21 14:58:10 +05:30
Steven Liu
fdb1baa05c [docs] VideoProcessor (#7965)
* fix?

* fix?

* fix
2024-05-21 08:18:21 +05:30
Vinh H. Pham
6529ee67ec Make VAE compatible to torch.compile() (#7984)
make VAE compatible to torch.compile()

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-20 13:43:59 -04:00
Sai-Suraj-27
df2bc5ef28 fix: Fixed few docstrings according to the Google Style Guide (#7717)
Fixed few docstrings according to the Google Style Guide.
2024-05-20 10:26:05 -07:00
Aleksei Zhuravlev
a7bf77fc28 Passing cross_attention_kwargs to StableDiffusionInstructPix2PixPipeline (#7961)
* Update pipeline_stable_diffusion_instruct_pix2pix.py

Add `cross_attention_kwargs` to `__call__` method of `StableDiffusionInstructPix2PixPipeline`, which are passed to UNet.

* Update documentation for pipeline_stable_diffusion_instruct_pix2pix.py

* Update docstring

* Update docstring

* Fix typing import
2024-05-20 13:14:34 -04:00
Junsong Chen
0f0defdb65 [docs] add doc for PixArtSigmaPipeline (#7857)
* 1. add doc for PixArtSigmaPipeline;

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Guillaume LEGENDRE <glegendre01@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>
Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Hyoungwon Cho <jhw9811@korea.ac.kr>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Tolga Cangöz <46008593+standardAI@users.noreply.github.com>
Co-authored-by: Philip Pham <phillypham@google.com>
2024-05-20 12:40:57 -04:00
Nikita
19df9f3ec0 Update pipeline_controlnet_inpaint_sd_xl.py (#7983) 2024-05-20 12:24:49 -04:00
Jacob Marks
d6ca120987 Fix typo in "attention" (#7977) 2024-05-20 11:54:29 -04:00
Sayak Paul
fb7ae0184f [tests] fix Pixart Sigma tests (#7966)
* checking tests

* checking ii.

* remove prints.

* test_pixart_1024

* fix 1024.
2024-05-19 20:56:31 +05:30
Sayak Paul
70f8d4b488 remove unsafe workflow. (#7967) 2024-05-17 13:46:24 +05:30
Álvaro Somoza
6c60e430ee Consistent SDXL Controlnet callback tensor inputs (#7958)
* make _callback_tensor_inputs consistent between sdxl pipelines

* forgot this one

* fix failing test

* fix test_components_function

* fix controlnet inpaint tests
2024-05-16 07:15:10 -10:00
Alphin Jain
1221b28eac Fix AttributeError in train_lcm_distill_lora_sdxl_wds.py (#7923)
Fix conditional teacher model check in train_lcm_distill_lora_sdxl_wds.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-16 15:49:54 +05:30
Liang Hou
746f603b20 Fix the text tokenizer name in logger warning of PixArt pipelines (#7912)
Fix CLIP to T5 in logger warning
2024-05-15 18:49:29 -10:00
Sai-Suraj-27
2afea72d29 refactor: Refactored code by Merging isinstance calls (#7710)
* Merged isinstance calls to make the code simpler.

* Corrected formatting errors using ruff.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-15 18:33:19 -10:00
Sayak Paul
0f111ab794 [Workflows] add a workflow that can be manually triggered on a PR. (#7942)
* add a workflow that can be manually triggered on a PR.

* remove sudo

* add command

* small fixes.
2024-05-15 17:18:56 +05:30
Guillaume LEGENDRE
4dd7aaa06f move to GH hosted M1 runner (#7949) 2024-05-15 13:47:36 +05:30
Isamu Isozaki
d27e996ccd Adding VQGAN Training script (#5483)
* Init commit

* Removed einops

* Added default movq config for training

* Update explanation of prompts

* Fixed inheritance of discriminator and init_tracker

* Fixed incompatible api between muse and here

* Fixed output

* Setup init training

* Basic structure done

* Removed attention for quick tests

* Style fixes

* Fixed vae/vqgan styles

* Removed redefinition of wandb

* Fixed log_validation and tqdm

* Nothing commit

* Added commit loss to lookup_from_codebook

* Update src/diffusers/models/vq_model.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Adding perliminary README

* Fixed one typo

* Local changes

* Fixed main issues

* Merging

* Update src/diffusers/models/vq_model.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Testing+Fixed bugs in training script

* Some style fixes

* Added wandb to docs

* Fixed timm test

* get testing suite ready.

* remove return loss

* remove return_loss

* Remove diffs

* Remove diffs

* fix ruff format

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-15 08:47:12 +05:30
Sayak Paul
72780ff5b1 [tests] decorate StableDiffusion21PipelineSingleFileSlowTests with slow. (#7941)
decorate StableDiffusion21PipelineSingleFileSlowTests with slow.
2024-05-14 14:26:21 -10:00
Jingyang Zhang
69fdb8720f [Pipeline] Adding BoxDiff to community examples (#7947)
add boxdiff to community examples
2024-05-14 11:18:29 -10:00
Nikita
b2140a895b Fix added_cond_kwargs when using IP-Adapter in StableDiffusionXLControlNetInpaintPipeline (#7924)
Fix `added_cond_kwargs` when using IP-Adapter

Fix error when using IP-Adapter in pipeline and passing `ip_adapter_image_embeds` instead of `ip_adapter_image`

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-14 10:32:08 -10:00
Sayak Paul
e0e8c58f64 [Core] separate the loading utilities in modeling similar to pipelines. (#7943)
separate the loading utilities in modeling similar to pipelines.
2024-05-14 22:33:43 +05:30
Sayak Paul
cbea5d1725 update to use hf-workflows for reporting the Docker build statuses (#7938)
update to use hf-workflows for reporting
2024-05-14 09:25:13 +05:30
Tolga Cangöz
a1245c2c61 Expansion proposal of diffusers-cli env (#7403)
* Expand `diffusers-cli env`

* SafeTensors -> Safetensors

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Move `safetensors_version = "not installed"` to `else`

* Update `safetensors_version` checking

* Add GPU detection for Linux, Mac OS, and Windows

* Add accelerator detection to environment command

* Add is_peft_version to import_utils

* Update env.py

* Add `huggingface_hub` reference

* Add `transformers` reference

* Add reference for `huggingface_hub`

* Fix print statement in env.py for unusual OS

* Up

* Fix platform information in env.py

* up

* Fix import order in env.py

* ruff

* make style

* Fix platform system check in env.py

* Fix run method return type in env.py

* 🤗

* No need f-string

* Remove location info

* Remove accelerate config

* Refactor env.py to remove accelerate config

* feat: Add support for `bitsandbytes` library in environment command

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-14 08:20:24 +05:30
bssrdf
cdda94f412 fix VAE loading issue in train_dreambooth (#7632)
* fixed vae loading issue #7619

* rerun make style && make quality

* bring back model_has_vae and add change \ to / in config_file_name on windows os to make match work

* add missing import platform

* bring back import model_info

* make config_file_name OS independent

* switch to using Path.as_posix() to resolve OS dependence

* improve style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: bssrdf <bssrdf@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-14 08:19:53 +05:30
dependabot[bot]
5b830aa356 Bump transformers from 4.36.0 to 4.38.0 in /examples/research_projects/realfill (#7635)
Bump transformers in /examples/research_projects/realfill

Bumps [transformers](https://github.com/huggingface/transformers) from 4.36.0 to 4.38.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.36.0...v4.38.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-14 08:17:06 +05:30
Kohei
9e7bae9881 Update requirements.txt for text_to_image (#7892)
Update requirements.txt

If the datasets library is old, it will not read the metadata.jsonl and the label will default to an integer of type int.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-14 08:09:12 +05:30
rebel-kblee
b41ce1e090 fix multicontrolnet save_pretrained logic for compatibility (#7821)
fix multicontrolnet save_pretrained logic for compatibility

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-13 09:32:06 -10:00
Sayak Paul
95d3748453 [LoRA] Fix LoRA tests (side effects of RGB ordering) part ii (#7932)
* check

* check 2.

* update slices
2024-05-13 09:23:48 -10:00
Fabio Rigano
44aa9e566d fix AnimateDiff creation with a unet loaded with IP Adapter (#7791)
* Fix loading from_pipe

* Fix style

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-13 08:15:01 -10:00
Álvaro Somoza
fdb05f54ef Official callbacks (#7761) 2024-05-12 17:10:29 -10:00
HelloWorldBeginner
98ba18ba55 Add Ascend NPU support for SDXL. (#7916)
Co-authored-by: mhh001 <mahonghao1@huawei.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-12 13:34:23 +02:00
Sayak Paul
5bb38586a9 [Core] fix offload behaviour when device_map is enabled. (#7919)
fix offload behaviour when device_map is enabled.
2024-05-12 13:29:43 +02:00
Sai-Suraj-27
ec9e88139a fix: Fixed a wrong link to supported python versions in contributing.md file (#7638)
* Fixed a wrong link to python versions in contributing.md file.

* Updated the link to a permalink, so that it will permanently point to the specific line.
2024-05-12 13:21:18 +02:00
momo
e4f8dca9a0 add custom sigmas and timesteps for StableDiffusionXLControlNet pipeline (#7913)
add custom sigmas and timesteps
2024-05-11 23:33:19 -10:00
HelloWorldBeginner
0267c5233a fix bugs when using deepspeed in sdxl (#7917)
fix bugs when using deepspeed

Co-authored-by: mhh001 <mahonghao1@huawei.com>
2024-05-11 20:49:09 +02:00
Mark Van Aken
be4afa0bb4 #7535 Update FloatTensor type hints to Tensor (#7883)
* find & replace all FloatTensors to Tensor

* apply formatting

* Update torch.FloatTensor to torch.Tensor in the remaining files

* formatting

* Fix the rest of the places where FloatTensor is used as well as in documentation

* formatting

* Update new file from FloatTensor to Tensor
2024-05-10 09:53:31 -10:00
Sayak Paul
04f4bd54ea [Core] introduce videoprocessor. (#7776)
* introduce videoprocessor.

* fix quality

* address yiyi's feedback

* fix preprocess_video call.

* video_processor -> image_processor

* fix

* fix more.

* quality

* image_processor -> video_processor

* support List[List[PIL.Image.Image]]

* change to video_processor.

* documentation

* Apply suggestions from code review

* changes

* remove print.

* refactor video processor (part # 7776) (#7861)

* update

* update remove deprecate

* Update src/diffusers/video_processor.py

* update

* Apply suggestions from code review

* deprecate list of 5d for video and list of 4d for image + apply other feedbacks

* up

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add doc.

* tensor2vid -> postprocess_video.

* refactor preprocess with preprocess_video

* set default values.

* empty commit

* more refactoring of prepare_latents in animatediff vid2vid

* checking documentation

* remove documentation for now.

* fix animatediff sdxl

* fix test failure [part of video processor PR] (#7905)

up

* remove preceed_with_frames.

* doc

* fix

* fix

* remove video input as a single-frame video.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-10 21:02:36 +02:00
Sayak Paul
82be58c512 add missing image processors to the docs (#7910)
add missing processors.
2024-05-10 14:53:57 +02:00
Sayak Paul
6695635696 upgrade to python 3.10 in the Dockerfiles (#7893)
* upgrade to python 3.10

* fix

* try https://askubuntu.com/questions/1459694/can-not-find-python3-10-after-apt-get-installation

* fix

* up

* yes

* okay

* up

* up

* up

* up

* up

* check

* okay

* up

* i[

* fix
2024-05-10 14:29:08 +02:00
YiYi Xu
b934215d4c [scheduler] support custom timesteps and sigmas (#7817)
* support custom sigmas and timesteps, dpm euler

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-05-09 11:07:43 -10:00
YiYi Xu
5ed3abd371 fix _optional_components in StableCascadeCombinedPipeline (#7894)
* fix

* up
2024-05-09 06:32:55 -10:00
Dhruv Nair
1087a510b5 Set max parallel jobs on slow test runners (#7878)
* set max parallel

* update

* update

* update
2024-05-09 19:42:18 +05:30
Sayak Paul
305f2b4498 [Tests] fix things after #7013 (#7899)
* debugging

* save the resulting image

* check if order reversing works.

* checking values.

* up

* okay

* checking

* fix

* remove print
2024-05-09 16:05:35 +02:00
Dhruv Nair
cb0f3b49cb [Refactor] Better align from_single_file logic with from_pretrained (#7496)
* refactor unet single file loading a bit.

* retrieve the unet from create_diffusers_unet_model_from_ldm

* update

* update

* updae

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* tests

* update

* update

* update

* Update docs/source/en/api/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/api/loaders/single_file.md

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/loaders/single_file.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update docs/source/en/api/loaders/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/loaders/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/loaders/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/loaders/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-09 19:00:19 +05:30
Tolga Cangöz
caf9e985df Fix several imports (#7712)
Fix imports
2024-05-09 07:34:44 +02:00
Tolga Cangöz
c1c42698c9 Remove dead code and fix f-string issue (#7720)
* Remove dead code

* PylancereportGeneralTypeIssues: Strings nested within an f-string cannot use the same quote character as the f-string prior to Python 3.12.

* Remove dead code
2024-05-08 13:15:28 -10:00
Pierre Dulac
75aab34675 Allow users to save SDXL LoRA weights for only one text encoder (#7607)
SDXL LoRA weights for text encoders should be decoupled on save

The method checks if at least one of unet, text_encoder and
text_encoder_2 lora weights are passed, which was not reflected in the
implentation.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-08 10:41:58 -10:00
YiYi Xu
35358a2dec fix offload test (#7868)
fix

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-08 07:59:08 -10:00
Aryan
818f760732 [Pipeline] AnimateDiff SDXL (#6721)
* update conversion script to handle motion adapter sdxl checkpoint

* add animatediff xl

* handle addition_embed_type

* fix output

* update

* add imports

* make fix-copies

* add decode latents

* update docstrings

* add animatediff sdxl to docs

* remove unnecessary lines

* update example

* add test

* revert conv_in conv_out kernel param

* remove unused param addition_embed_type_num_heads

* latest IPAdapter impl

* make fix-copies

* fix return

* add IPAdapterTesterMixin to tests

* fix return

* revert based on suggestion

* add freeinit

* fix test_to_dtype test

* use StableDiffusionMixin instead of different helper methods

* fix progress bar iterations

* apply suggestions from review

* hardcode flip_sin_to_cos and freq_shift

* make fix-copies

* fix ip adapter implementation

* fix last failing test

* make style

* Update docs/source/en/api/pipelines/animatediff.md

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* remove todo

* fix doc-builder errors

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-08 21:27:14 +05:30
Philip Pham
f29b93488d Check shape and remove deprecated APIs in scheduling_ddpm_flax.py (#7703)
`model_output.shape` may only have rank 1.

There are warnings related to use of random keys.

```
tests/schedulers/test_scheduler_flax.py: 13 warnings
  /Users/phillypham/diffusers/src/diffusers/schedulers/scheduling_ddpm_flax.py:268: FutureWarning: normal accepts a single key, but was given a key array of shape (1, 2) != (). Use jax.vmap for batching. In a future JAX version, this will be an error.
    noise = jax.random.normal(split_key, shape=model_output.shape, dtype=self.dtype)

tests/schedulers/test_scheduler_flax.py::FlaxDDPMSchedulerTest::test_betas
  /Users/phillypham/virtualenv/diffusers/lib/python3.9/site-packages/jax/_src/random.py:731: FutureWarning: uniform accepts a single key, but was given a key array of shape (1,) != (). Use jax.vmap for batching. In a future JAX version, this will be an error.
    u = uniform(key, shape, dtype, lo, hi)  # type: ignore[arg-type]
```
2024-05-08 13:57:19 +02:00
Tolga Cangöz
d50baf0c63 Fix image upcasting (#7858)
Fix image's upcasting before `vae.encode()` when using `fp16`

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-07 16:45:02 -10:00
Hyoungwon Cho
c2217142bd Modification on the PAG community pipeline (re) (#7876)
* edited_pag_implementation

* update

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2024-05-07 16:35:15 -10:00
Bagheera
8edaf3b79c 7879 - adjust documentation to use naruto dataset, since pokemon is now gated (#7880)
* 7879 - adjust documentation to use naruto dataset, since pokemon is now gated

* replace references to pokemon in docs

* more references to pokemon replaced

* Japanese translation update

---------

Co-authored-by: bghira <bghira@users.github.com>
2024-05-07 09:36:39 -07:00
Álvaro Somoza
23e091564f Fix for "no lora weight found module" with some loras (#7875)
* return layer weight if not found

* better system and test

* key example and typo
2024-05-07 13:54:57 +02:00
Steven Liu
0d23645bd1 [docs] Distilled inference (#7834)
* combine

* edits
2024-05-06 15:07:25 -07:00
Guillaume LEGENDRE
7fa3e5b0f6 Ci - change cache folder (#7867) 2024-05-06 17:55:24 +05:30
Steven Liu
49b959b540 [docs] LCM (#7829)
* lcm

* lcm lora

* fix

* fix hfoption

* edits
2024-05-03 16:08:27 -07:00
HelloWorldBeginner
58237364b1 Add Ascend NPU support for SDXL fine-tuning and fix the model saving bug when using DeepSpeed. (#7816)
* Add Ascend NPU support for SDXL fine-tuning and fix the model saving bug when using DeepSpeed.

* fix check code quality

* Decouple the NPU flash attention and make it an independent module.

* add doc and unit tests for npu flash attention.

---------

Co-authored-by: mhh001 <mahonghao1@huawei.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-03 08:14:34 -10:00
Dhruv Nair
3e35628873 Remove installing python again in container (#7852)
update
2024-05-03 15:09:15 +05:30
Lucain
6a479588db Respect resume_download deprecation (#7843)
* Deprecate resume_download

* align docstring with transformers

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-03 08:42:57 +02:00
Aritra Roy Gosthipaty
fa489eaed6 [Tests] reduce the model size in the blipdiffusion fast test (#7849)
reducing model size
2024-05-03 07:46:48 +05:30
Dhruv Nair
0d7c479023 Update deps for pipe test fetcher (#7838)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-02 20:36:47 +05:30
Guillaume LEGENDRE
ce97d7e19b Change GPU Runners (#7840)
* Move to new GPU Runners for slow tests

* Move to new GPU Runners for nightly tests
2024-05-02 18:48:46 +05:30
Guillaume LEGENDRE
44ba90caff move to new runners (#7839) 2024-05-02 14:53:38 +02:00
Dhruv Nair
3c85a57297 Update CI cache (#7832)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-02 14:03:35 +05:30
Dhruv Nair
03ca11318e Update download diff format tests (#7831)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-02 13:15:38 +05:30
Dhruv Nair
3ffa7b46e5 Fix hanging pipeline fetching (#7837)
update
2024-05-02 13:08:57 +05:30
yunseong Cho
c1b2a89e34 Fix key error for dictionary with randomized order in convert_ldm_unet_checkpoint (#7680)
fix key error for different order

Co-authored-by: yunseong <yunseong.cho@superlabs.us>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-02 10:29:55 +05:30
Aritra Roy Gosthipaty
435d37ce5a [Tests] reduce the model size in the audioldm fast test (#7833)
chore: initial size reduction of models
2024-05-02 06:03:52 +05:30
YiYi Xu
5915c2985d [ip-adapter] fix ip-adapter for StableDiffusionInstructPix2PixPipeline (#7820)
update prepare_ip_adapter_ for pix2pix
2024-05-01 06:27:43 -10:00
YiYi Xu
21a7ff12a7 update the logic of is_sequential_cpu_offload (#7788)
* up

* add comment to the tests + fix dit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-01 06:25:57 -10:00
Sayak Paul
8909ab4b19 [Tests] fix: device map tests for models (#7825)
* fix: device module tests

* remove patch file

* Empty-Commit
2024-05-01 18:45:47 +05:30
Dhruv Nair
c1edb03c37 Fix for pipeline slow test fetcher (#7824)
* update

* update
2024-05-01 17:36:54 +05:30
Steven Liu
0d08370263 [docs] Community pipelines (#7819)
* community pipelines

* feedback

* consolidate
2024-04-30 14:10:14 -07:00
Tolga Cangöz
b8ccb46259 Fix CPU offload in docstring (#7827)
Fix cpu offload
2024-04-30 10:53:27 -07:00
Dhruv Nair
725ead2f5e SSH Runner Workflow Update (#7822)
* add debug workflow

* update
2024-04-30 20:14:18 +05:30
Linoy Tsaban
26a7851e1e Add B-Lora training option to the advanced dreambooth lora script (#7741)
* add blora

* add blora

* add blora

* add blora

* little changes

* little changes

* remove redundancies

* fixes

* add B LoRA to readme

* style

* inference

* defaults + path to loras+ generation

* minor changes

* style

* minor changes

* minor changes

* blora arg

* added --lora_unet_blocks

* style

* Update examples/advanced_diffusion_training/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add commit hash to B-LoRA repo cloneing

* change inference, remove cloning

* change inference, remove cloning
add section about configureable unet blocks

* change inference, remove cloning
add section about configureable unet blocks

* Apply suggestions from code review

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-30 09:46:30 +05:30
Sayak Paul
3fd31eef51 [Core] introduce _no_split_modules to ModelMixin (#6396)
* introduce _no_split_modules.

* unnecessary spaces.

* remove unnecessary kwargs and style

* fix: accelerate imports.

* change to _determine_device_map

* add the blocks that have residual connections.

* add: CrossAttnUpBlock2D

* add: testin

* style

* line-spaces

* quality

* add disk offload test without safetensors.

* checking disk offloading percentages.

* change model split

* add: utility for checking multi-gpu requirement.

* model parallelism test

* splits.

* splits.

* splits

* splits.

* splits.

* splits.

* offload folder to test_disk_offload_with_safetensors

* add _no_split_modules

* fix-copies
2024-04-30 08:46:51 +05:30
Aritra Roy Gosthipaty
b02e2113ff [Tests] reduce the model size in the amused fast test (#7804)
* chore: reducing model sizes

* chore: shrinks further

* chore: shrinks further

* chore: shrinking model for img2img pipeline

* chore: reducing size of model for inpaint pipeline

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-30 08:11:26 +05:30
Aritra Roy Gosthipaty
21f023ec1a [Tests] reduce the model size in the ddpm fast test (#7797)
* chore: reducing unet size for faster tests

* review suggestions

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-30 08:11:03 +05:30
Aritra Roy Gosthipaty
31d9f9ea77 [Tests] reduce the model size in the ddim fast test (#7803)
chore: reducing model size for ddim fast pipeline

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-30 07:54:38 +05:30
Clint Adams
f53352f750 Set main_input_name in StableDiffusionSafetyChecker to "clip_input" (#7500)
FlaxStableDiffusionSafetyChecker sets main_input_name to "clip_input".
This makes StableDiffusionSafetyChecker consistent.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-29 11:39:59 -10:00
RuiningLi
83ae24ce2d Added get_velocity function to EulerDiscreteScheduler. (#7733)
* Added get_velocity function to EulerDiscreteScheduler.

* Fix white space on blank lines

* Added copied from statement

* back to the original.

---------

Co-authored-by: Ruining Li <ruining@robots.ox.ac.uk>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-29 10:32:13 -10:00
jschoormans
8af793b2d4 Adding TextualInversionLoaderMixin for the controlnet_inpaint_sd_xl pipeline (#7288)
* added TextualInversionMixIn to controlnet_inpaint_sd_xl pipeline


---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-29 09:00:53 -10:00
Dhruv Nair
eb96ff0d59 Safetensor loading in AnimateDiff conversion scripts (#7764)
* update

* update
2024-04-29 17:36:50 +05:30
Yushu
a38dd79512 [Pipeline] Fix error of SVD pipeline when num_videos_per_prompt > 1 (#7786)
swap the order for do_classifier_free_guidance concat with repeat

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-29 16:24:16 +05:30
Dhruv Nair
b1c5817a89 Add debugging workflow (#7778)
add debug workflow

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-29 13:44:39 +05:30
Nilesh
235d34cf56 Check for latents, before calling prepare_latents - sdxlImg2Img (#7582)
* Check for latents, before calling prepare_latents - sdxlImg2Img

* Added latents check for all the img2img pipeline

* Fixed silly mistake while checking latents as None
2024-04-28 14:53:29 -10:00
Jenyuan-Huang
5029673987 Update InstantStyle usage in IP-Adapter documentation (#7806)
* enable control ip-adapter per-transformer block on-the-fly


---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: ResearcherXman <xhs.research@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-28 10:34:57 -10:00
Sayak Paul
56bd7e67c2 [Scheduler] introduce sigma schedule. (#7649)
* introduce sigma schedule.

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* address yiyi

* update docstrings.

* implement the schedule for EDMDPMSolverMultistepScheduler

---------

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2024-04-27 07:40:35 +05:30
39th president of the United States, probably
9d16daaf64 Add DREAM training (#6381)
A new function compute_dream_and_update_latents has been added to the
training utilities that allows you to do DREAM rectified training in line
with the paper https://arxiv.org/abs/2312.00210. The method can be used
with an extra argument in the train_text_to_image.py script.

Co-authored-by: Jimmy <39@🇺🇸.com>
2024-04-27 07:19:15 +05:30
Fabio Rigano
8e4ca1b6b2 [Docs] Update image masking and face id example (#7780)
* [Docs] Update image masking and face id example

* Update docs

* Fix docs
2024-04-26 12:51:11 -10:00
Beinsezii
0d2d424fbe Add PixArtSigmaPipeline to AutoPipeline mapping (#7783) 2024-04-26 09:10:20 -10:00
Steven Liu
e24e54fdfa [docs] Fix AutoPipeline docstring (#7779)
fix

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-26 10:09:36 -07:00
btlorch
ebc99a77aa Convert RGB to BGR for the SDXL watermark encoder (#7013)
* Convert channel order to BGR for the watermark encoder. Convert the watermarked BGR images back to RGB. Fixes #6292

* Revert channel order before stacking images to overcome limitations that negative strides are currently not supported

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-25 14:44:53 -10:00
Steven Liu
fa750a15bd [docs] Refactor image quality docs (#7758)
* refactor

* code snippets

* fix path

* fix path in guide

* code outputs

* align toctree title

* title

* fix title
2024-04-25 16:55:35 -07:00
Steven Liu
181688012a [docs] Reproducible pipelines (#7769)
* reproducibility

* feedback

* feedback

* fix path

* github link
2024-04-25 16:15:12 -07:00
Sayak Paul
142f353e1c Fix lora device test (#7738)
* fix lora device test

* fix more.

* fix more/

* quality

* empty

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-25 18:05:27 +05:30
Sayak Paul
b833d0fc80 [Tests] mark UNetControlNetXSModelTests::test_forward_no_control to be flaky (#7771)
decorate UNetControlNetXSModelTests::test_forward_no_control with is_flaky
2024-04-25 07:29:04 +05:30
Sayak Paul
e963621649 [PixArt] fix small nits in pixart sigma (#7767)
fix small nits in pixart sigma
2024-04-25 06:37:35 +05:30
Junsong Chen
39215aa30e PixArt-Sigma Implementation (#7654)
* support PixArt-DMD

---------

Co-authored-by: jschen <chenjunsong4@h-partners.com>
Co-authored-by: badayvedat <badayvedat@gmail.com>
Co-authored-by: Vedat Baday <54285744+badayvedat@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-04-23 22:33:08 -10:00
Dhruv Nair
9ef43f38d4 Fix test for consistency decoder. (#7746)
update
2024-04-24 12:28:11 +05:30
Dhruv Nair
88018fcf20 Fix failing VAE tiling test (#7747)
update
2024-04-24 12:27:45 +05:30
Steven Liu
7404f1e9dc [docs] Clean up toctree (#7715)
* toctree

* optim

* feedback

* improve overview
2024-04-23 09:30:33 -07:00
Sayak Paul
5a69227863 [Metadat utils] fix: json lines ordering. (#7744)
fix: json lines ordering.
2024-04-23 14:32:30 +05:30
Sai-Suraj-27
fc9fecc217 fix: Fixed a wrong decorator by modifying it to @classmethod (#7653)
* Fixed wrong decorator by modifying it to @classmethod.

* Updated the method and it's argument.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-22 14:41:35 -10:00
Fabio Rigano
065f251766 Restore AttnProcessor2_0 in unload_ip_adapter (#7727)
* Restore AttnProcessor2_0 in unload_ip_adapter

* Fix style

* Update test

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-22 13:59:03 -10:00
Jenyuan-Huang
21c747fa0f Support InstantStyle (#7668)
* enable control ip-adapter per-transformer block on-the-fly

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: ResearcherXman <xhs.research@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-22 13:20:19 -10:00
Phil Butler
09129842e7 Remove redundant lines (#7396)
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-22 09:32:16 -10:00
Steven Liu
33b363edfa [docs] AutoPipeline (#7714)
* autopipeline

* edits

* feedback

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-22 10:15:07 -07:00
Dhruv Nair
a9dd86029e Fix Kandinksy V22 tests (#7699)
update
2024-04-22 15:41:59 +05:30
Dhruv Nair
9100652494 Update Wuerschten Test (#7700)
update
2024-04-22 15:41:39 +05:30
Abhinav Gopal
d1e3f489e9 Animatediff Controlnet Community Pipeline IP Adapter Fix (#7413)
* fixed encode_image function signature in controlnet animatediff

* copied encode_image from stable diffusion pipeline

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-19 15:35:07 -10:00
Guillaume LEGENDRE
ae05050db9 fix/add tailscale key in case of failure (#7719)
add tailscale key in case of failure
2024-04-19 10:56:40 +02:00
Sai-Suraj-27
db969cc16d fix: Fixed type annotations for compatability with python 3.8 (#7648)
* Fixed type annotations for compatability with python 3.8

* Add required imports.
2024-04-18 19:34:09 -10:00
Dhruv Nair
3cfe187dc7 Cleanup ControlnetXS (#7701)
* update

* update
2024-04-18 19:32:00 -10:00
Dhruv Nair
90250d9e48 Cast height, width to int inside prepare latents (#7691)
update
2024-04-18 19:30:39 -10:00
YiYi Xu
e5674015f3 adding back test_conversion_when_using_device_map (#7704)
* style


* Fix device map nits (#7705)


---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-18 19:21:32 -10:00
Fabio Rigano
b5c8b555d7 Move IP Adapter Face ID to core (#7186)
* Switch to peft and multi proj layers

* Move Face ID loading and inference to core

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-18 14:13:27 -10:00
Guillaume LEGENDRE
e23c27e905 Add tailscale action to push_test (#7709) 2024-04-18 18:48:39 +05:30
Steven Liu
7635d3d37f [docs] Pipeline loading (#7684)
* pipelines

* schedulers and models

* community pipelines

* feedback
2024-04-17 15:42:27 -07:00
Wentian
9132ce7c58 [Docs] Update TGATE in section optimization. (#7698)
Update tgate.md
2024-04-17 09:37:24 -07:00
Sayak Paul
30c977d1f5 [Workflows] remove installation of redundant modules from flax PR tests (#7662)
remove installation of redundant modules from flax PR tests
2024-04-17 15:16:04 +05:30
Dhruv Nair
f0fa17dd8e Don't install PEFT with UV in slow tests (#7697)
* update

* update
2024-04-17 15:10:38 +05:30
Sai-Suraj-27
c726d02beb fix: Updated ruff configuration to avoid deprecated configuration warning (#7637)
Updated ruff configuration to avoid depreceated config.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-16 15:02:55 -10:00
Wentian
a68503f221 [Docs] Add TGATE in section optimization (#7639)
* Create tgate.md

* Update _toctree.yml

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update tgate.md

* Update tgate.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-04-16 17:58:27 -07:00
Sayak Paul
9d50f7eec1 [Core] is_cosxl_edit arg in SDXL ip2p. (#7650)
* is_cosxl_edit arg in SDXL ip2p.

* Empty-Commit

Co-authored-by: Yiyi Xu <yixu310@gmail.com>

* doc

* remove redundant logic.

* reflect drhuv's comments.

---------

Co-authored-by: Yiyi Xu <yixu310@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-16 22:15:55 +05:30
UmerHA
fda1531d8a Fixing implementation of ControlNet-XS (#6772)
* CheckIn - created DownSubBlocks

* Added extra channels, implemented subblock fwd

* Fixed connection sizes

* checkin

* Removed iter, next in forward

* Models for SD21 & SDXL run through

* Added back pipelines, cleared up connections

* Cleaned up connection creation

* added debug logs

* updated logs

* logs: added input loading

* Update umer_debug_logger.py

* log: Loading hint

* Update umer_debug_logger.py

* added logs

* Changed debug logging

* debug: added more logs

* Fixed num_norm_groups

* Debug: Logging all of SDXL input

* Update umer_debug_logger.py

* debug: updated logs

* checkim

* Readded tests

* Removed debug logs

* Fixed Slow Tests

* Added value ckecks | Updated model_cpu_offload_seq

* accelerate-offloading works ; fast tests work

* Made unet & addon explicit in controlnet

* Updated slow tests

* Added dtype/device to ControlNetXS

* Filled in test model paths

* Added image_encoder/feature_extractor to XL pipe

* Fixed fast tests

* Added comments and docstrings

* Fixed copies

* Added docs ; Updates slow tests

* Moved changes to UNetMidBlock2DCrossAttn

* tiny cleanups

* Removed stray prints

* Removed ip adapters + freeU

- Removed ip adapters + freeU as they don't make sense for ControlNet-XS
- Fixed imports of UNet components

* Fixed test_save_load_float16

* Make style, quality, fix-copies

* Changed loading/saving API for ControlNetXS

- Changed loading/saving API for ControlNetXS
- other small fixes

* Removed ControlNet-XS from research examples

* Make style, quality, fix-copies

* Small fixes

- deleted ControlNetXSModel.init_original
- added time_embedding_mix to StableDiffusionControlNetXSPipeline .from_pretrained / StableDiffusionXLControlNetXSPipeline.from_pretrained
- fixed copy hints

* checkin May 11 '23

* CheckIn Mar 12 '24

* Fixed tests for SD

* Added tests for UNetControlNetXSModel

* Fixed SDXL tests

* cleanup

* Delete Pipfile

* CheckIn Mar 20

Started replacing sub blocks  by `ControlNetXSCrossAttnDownBlock2D` and `ControlNetXSCrossAttnUplock2D`

* check-in Mar 23

* checkin 24 Mar

* Created init for UNetCnxs and CnxsAddon

* CheckIn

* Made from_modules, from_unet and no_control work

* make style,quality,fix-copies & small changes

* Fixed freezing

* Added gradient ckpt'ing; fixed tests

* Fix slow tests(+compile) ; clear naming confusion

* Don't create UNet in init ; removed class_emb

* Incorporated review feedback

- Deleted get_base_pipeline /  get_controlnet_addon for pipes
- Pipes inherit from StableDiffusionXLPipeline
- Made module dicts for cnxs-addon's down/mid/up classes
- Added support for qkv fusion and freeU

* Make style, quality, fix-copies

* Implemented review feedback

* Removed compatibility check for vae/ctrl embedding

* make style, quality, fix-copies

* Delete Pipfile

* Integrated review feedback

- Importing ControlNetConditioningEmbedding now
- get_down/mid/up_block_addon now outside class
- renamed `do_control` to `apply_control`

* Reduced size of test tensors

For this, added `norm_num_groups` as parameter everywhere

* Renamed cnxs-`Addon` to cnxs-`Adapter`

- `ControlNetXSAddon` -> `ControlNetXSAdapter`
- `ControlNetXSAddonDownBlockComponents` -> `DownBlockControlNetXSAdapter`, and similarly for mid/up
- `get_mid_block_addon` -> `get_mid_block_adapter`, and similarly for mid/up

* Fixed save_pretrained/from_pretrained bug

* Removed redundant code

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-16 21:56:20 +05:30
Sayak Paul
cf6e0407e0 don't install peft from the source with uv for now. (#7679) 2024-04-15 09:33:02 +05:30
Sayak Paul
1c000d46e1 fix: metadata token (#7631) 2024-04-15 08:32:27 +05:30
Sayak Paul
08bf754507 make docker-buildx mandatory. (#7652) 2024-04-13 07:26:34 +05:30
kabachuha
2f23437618 Add (Scheduled) Pseudo-Huber Loss training scripts to research projects (#7527)
* add scheduled pseudo-huber loss training scripts

See #7488

* add reduction modes to huber loss

* [DB Lora] *2 multiplier to huber loss cause of 1/2 a^2 conv.

pairing of c6495def1f

* [DB Lora] add option for smooth l1 (huber / delta)

Pairing of dd22958caa

* [DB Lora] unify huber scheduling

Pairing of 19a834c3ab

* [DB Lora] add snr huber scheduler

Pairing of 47fb1a6854

* fixup examples link

* use snr schedule by default in DB

* update all huber scripts with snr

* code quality

* huber: make style && make quality

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-13 07:26:08 +05:30
Benjamin Bossan
2523390c26 FIX Setting device for DoRA parameters (#7655)
Fix a bug that causes the the call to set_lora_device to ignore the DoRA
parameters.
2024-04-12 13:55:46 +02:00
Sai-Suraj-27
279de3c3ff fix: Replaced deprecated logger.warn with logger.warning (#7643)
Fixed deprecated logger.warn with logger.warning.
2024-04-11 09:43:01 -10:00
Yiqin Zhao
8e14535708 Fixed YAML loading. (#7579) 2024-04-11 09:08:42 -10:00
dg845
0bee4d336b LCM Distill Scripts Fix Bug when Initializing Target U-Net (#6848)
* Initialize target_unet from unet rather than teacher_unet so that we correctly add time_embedding.cond_proj if necessary.

* Use UNet2DConditionModel.from_config to initialize target_unet from unet's config.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-11 07:52:12 -10:00
Steven Munn
42f25d601a Skip PEFT LoRA Scaling if the scale is 1.0 (#7576)
* Skip scaling if scale is identity

* move check for weight one to scale and unscale lora

* fix code style/quality

* Empty-Commit

---------

Co-authored-by: Steven Munn <stevenjmunn@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Munn <5297082+stevenjlm@users.noreply.github.com>
2024-04-11 11:02:31 +05:30
Sayak Paul
33c5d125cb [Core] fix img2img pipeline for Playground (#7627)
* playground vae encoding should use std and mean of the vae.

* style.

* fix-copies.
2024-04-11 09:07:38 +05:30
YiYi Xu
aa1f00fd01 Fix cpu offload related slow tests (#7618)
* fix

* up

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-04-10 14:53:45 -10:00
Steven Liu
d95b993427 [docs] T2I (#7623)
* refactor t2i

* add code snippets
2024-04-10 17:10:41 -07:00
Steven Liu
1d480298c1 [docs] Prompt enhancer (#7565)
* prompt enhance

* edits

* align titles

* feedback

* feedback

* feedback

* link to style
2024-04-10 16:09:06 -07:00
Sayak Paul
b2323aa2b7 [Tests] reduce the model sizes in the SD fast tests (#7580)
* give it a shot.

* print.

* correct assertion.

* gather results from the rest of the tests.

* change the assertion values where needed.

* remove print statements.
2024-04-10 11:36:28 -10:00
satani99
37e9d695af Modularize instruct_pix2pix SD inferencing during and after training in examples (#7603)
* Modularize instruct_pix2pix code

* quality check

* quality check

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-10 11:19:16 +05:30
Sayak Paul
a402431de0 [docs] remove duplicate tip block. (#7625)
remove duplicate tip block.
2024-04-10 10:31:11 +05:30
IDKiro
b99b1617cf add the option of upsample function for tiny vae (#7604)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-10 09:27:39 +05:30
Sayak Paul
3e4a6bd2d4 [Core] add "balanced" device_map support to pipelines (#6857)
* get device <-> component mapping when using multiple gpus.

* condition the device_map bits.

* relax condition

* device_map progress.

* device_map enhancement

* some cleaning up and debugging

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* incorporate suggestions from PR.

* remove multi-gpu condition for now.

* guard check the component -> device mapping

* fix: device_memory variable

* dispatching transformers model to have force_hooks=True

* better guarding for transformers device_map

* introduce support balanced_low_memory and balanced_ultra_low_memory.

* remove device_map patch.

* fix: intermediate variable scoping.

* fix: condition in cpu offload.

* fix: flax class restrictions.

* remove modifications from cpu_offload and model_offload

* incorporate changes.

* add a simple forward pass test

* add: torch_device in get_inputs()

* add: tests

* remove print

* safe-guard to(), model offloading and cpu offloading when balanced is used as a device_map.

* style

* remove .

* safeguard device_map with more checks and remove invalid device_mapping strategues.

* make  a class attribute and adjust tests accordingly.

* fix device_map check

* fix test

* adjust comment

* fix: device_map attribute

* fix: dispatching.

* max_memory test for pipeline

* version guard the tests

* fix guard.

* address review feedback.

* reset_device_map method.

* add: test for reset_hf_device_map

* fix a couple things.

* add reset_device_map() in the error message.

* add tests for checking reset_device_map doesn't have unintended consequences.

* fix reset_device_map and offloading tests.

* create _get_final_device_map utility.

* hf_device_map -> _hf_device_map

* add documentation

* add notes suggested by Marc.

* styling.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* move updates within gpu condition.

* other docs related things

* note on ignore a device not specified in .

* provide a suggestion if device mapping errors out.

* fix: typo.

* _hf_device_map -> hf_device_map

* Empty-Commit

* add: example hf_device_map.

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-04-10 08:59:05 +05:30
Sayak Paul
c827e94da0 [Workflows] remove installation of libsndfile1-dev and libgl1 from workflows (#7543)
* remove libsndfile1-dev and libgl1 from workflows and ensure that re present in the respective dockerfiles.

* change to self-hosted runner; let's see 🤞

* add libsndfile1-dev libgl1 for now

* use self-hosted runners for building and push too.
2024-04-10 08:34:56 +05:30
Sayak Paul
44f6b859bf [Core] refactor transformer_2d forward logic into meaningful conditions. (#7489)
* refactor transformer_2d forward logic into meaningful conditions.

* Empty-Commit

* fix: _operate_on_patched_inputs

* fix: _operate_on_patched_inputs

* check

* fix: patch output computation block.

* fix: _operate_on_patched_inputs.

* remove print.

* move operations to blocks.

* more readability neats.

* empty commit

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Revert "Apply suggestions from code review"

This reverts commit 12178b1aa0.

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-10 08:33:19 +05:30
Sayak Paul
ac7ff7d4a3 add utilities for updating diffusers pipeline metadata. (#7573)
* add utilities for updating diffusers pipeline metadata.

* style

* remove first empty line
2024-04-10 08:28:49 +05:30
Fabio Rigano
a0cf607667 Multi-image masking for single IP Adapter (#7499)
* Support multiimage masking

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-09 09:20:57 -10:00
YiYi Xu
a341b536a8 disable test_conversion_when_using_device_map (#7620)
* disable test

* update

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-04-09 09:01:19 -10:00
Christopher Beckham
8e46d97cd8 Add missing restore() EMA call in train SDXL script (#7599)
* Restore unet params back to normal from EMA when validation call is finished

* empty commit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-09 18:07:55 +05:30
Junjie
7e808e768a [Docs] fix bugs in callback docs (#7594) 2024-04-08 08:46:30 -10:00
w4ffl35
7e39516627 Allow more arguments to be passed to convert_from_ckpt (#7222)
Allow safety and feature extractor arguments to be passed to convert_from_ckpt

Allows management of safety checker and feature extractor
from outside of the convert ckpt class.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-08 10:13:48 +05:30
Nguyễn Công Tú Anh
56a76082ed Add AudioLDM2 TTS (#5381)
* add audioldm2 tts

* change gpt2 max new tokens

* remove unnecessary pipeline and class

* add TTS to AudioLDM2Pipeline

* add TTS docs

* delete unnecessary file

* remove unnecessary import

* add audioldm2 slow testcase

* fix code quality

* remove AudioLDMLearnablePositionalEmbedding

* add variable check vits encoder

* add use_learned_position_embedding

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-08 10:11:24 +05:30
YiYi Xu
6133d98ff7 [IF| add set_begin_index for all IF pipelines (#7577)
add set_begin_index for all if pipelines
2024-04-05 06:54:07 -10:00
Sayak Paul
1c60e094de [Tests] reduce block sizes of UNet and VAE tests (#7560)
* reduce block sizes for unet1d.

* reduce blocks for unet_2d.

* reduce block size for unet_motion

* increase channels.

* correctly increase channels.

* reduce number of layers in unet2dconditionmodel tests.

* reduce block sizes for unet2dconditionmodel tests

* reduce block sizes for unet3dconditionmodel.

* fix: test_feed_forward_chunking

* fix: test_forward_with_norm_groups

* skip spatiotemporal tests on MPS.

* reduce block size in AutoencoderKL.

* reduce block sizes for vqmodel.

* further reduce block size.

* make style.

* Empty-Commit

* reduce sizes for ConsistencyDecoderVAETests

* further reduction.

* further block reductions in AutoencoderKL and AssymetricAutoencoderKL.

* massively reduce the block size in unet2dcontionmodel.

* reduce sizes for unet3d

* fix tests in unet3d.

* reduce blocks further in motion unet.

* fix: output shape

* add attention_head_dim to the test configuration.

* remove unexpected keyword arg

* up a bit.

* groups.

* up again

* fix
2024-04-05 10:08:32 +05:30
UmerHA
71f49a5d2a Skip test_freeu_enabled on MPS (#7570)
* Skip `test_freeu_enabled ` on MPS

* Small fixes

- import skip_mps correctly
- disable all instances of test_freeu_enabled

* Empty commit to trigger tests

* Empty commit to trigger CI
2024-04-04 12:16:04 +02:00
Abhinav Gopal
35db2fdea9 Update pipeline_animatediff_video2video.py (#7457)
* Update pipeline_animatediff_video2video.py

* commit with test for whether latent input can be passed into animatediffvid2vid
2024-04-03 19:34:28 +05:30
Sayak Paul
ad55ce6100 [Chore] increase number of workers for the tests. (#7558)
* increase number of workers for the tests.

* move to beefier runner.

* improve the fast push tests too.

* use a beefy machine for pytorch pipeline tests

* up the number of workers further.
2024-04-03 17:11:42 +05:30
Sayak Paul
a9a5b14f35 [Core] refactor transformers 2d into multiple init variants. (#7491)
* refactor transformers 2d into multiple legacy variants.

* fix: init.

* fix recursive init.

* add inits.

* make transformer block creation more modular.

* complete refactor.

* remove forward

* debug

* remove legacy blocks and refactor within the module itself.

* remove print

* guard caption projection

* remove fetcher.

* reduce the number of args.

* fix: norm_type

* group variables that are shared.

* remove _get_transformer_blocks

* harmonize the init function signatures.

* transformer_blocks to common

* repeat .
2024-04-03 12:56:17 +05:30
Beinsezii
aa19025989 UniPC Multistep add rescale_betas_zero_snr (#7531)
* UniPC Multistep add `rescale_betas_zero_snr`

Same patch as DPM and Euler with the patched final alpha cumprod

BF16 doesn't seem to break down, I think cause UniPC upcasts during some
phases already? We could still force an upcast since it only
loses ≈ 0.005 it/s for me but the difference in output is very small. A
better endeavor might upcasting in step() and removing all the other
upcasts elsewhere?

* UniPC ZSNR UT

* Re-add `rescale_betas_zsnr` doc oops
2024-04-02 17:23:55 -10:00
Beinsezii
19ab04ff56 UniPC Multistep fix tensor dtype/device on order=3 (#7532)
* UniPC UTs iterate solvers on FP16

It wasn't catching errs on order==3. Might be excessive?

* UniPC Multistep fix tensor dtype/device on order=3

* UniPC UTs Add v_pred to fp16 test iter

For completions sake. Probably overkill?
2024-04-02 15:41:29 -10:00
Sayak Paul
4a34307702 add: utility to format our docs too 📜 (#7314)
* add: utility to format our docs too 📜

* debugging saga

* fix: message

* checking

* should be fixed.

* revert pipeline_fixture

* remove empty line

* make style

* fix: setup.py

* style.
2024-04-02 20:49:43 +05:30
Bagheera
8e963d1c2a 7529 do not disable autocast for cuda devices (#7530)
* 7529 do not disable autocast for cuda devices

* Remove typecasting error check for non-mps platforms, as a correct autocast implementation makes it a non-issue

* add autocast fix to other training examples

* disable native_amp for dreambooth (sdxl)

* disable native_amp for pix2pix (sdxl)

* remove tests from remaining files

* disable native_amp on huggingface accelerator for every training example that uses it

* convert more usages of autocast to nullcontext, make style fixes

* make style fixes

* style.

* Empty-Commit

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-02 20:15:06 +05:30
Sayak Paul
2b04ec2ff7 [Tests] Speed up fast pipelines part II (#7521)
* start printing the tensors.

* print full throttle

* set static slices for 7 tests.

* remove printing.

* flatten

* disable test for controlnet

* what happens when things are seeded properly?

* set the right value

* style./

* make pia test fail to check things

* print.

* fix pia.

* checking for animatediff.

* fix: animatediff.

* video synthesis

* final piece.

* style.

* print guess.

* fix: assertion for control guess.

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-02 13:24:56 +05:30
Sayak Paul
000fa82a1e [Chore] remove class assignments for linear and conv. (#7553)
* remove class assignments for linear and conv.

* fix: self.nn
2024-04-02 13:01:04 +05:30
Sayak Paul
5d83f50c23 [Release tests] make nightly workflow dispatchable. (#7541)
* make nightly workflow dispatchable.

* add a note about running the release tests to setup.py
2024-04-02 12:21:17 +05:30
Dhruv Nair
5d21d4a204 Fix FreeU tests (#7540)
update
2024-04-02 11:05:50 +05:30
Álvaro Somoza
73ba81090e [Community pipeline] SDXL Differential Diffusion Img2Img Pipeline (#7550)
* initial-commit pipeline created

* updated README.md
2024-04-01 18:15:30 -10:00
YiYi Xu
7956c36aaa add a from_pipe method to DiffusionPipeline (#7241)
* add from_pipe



---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-01 13:02:00 -10:00
haikmanukyan
5266ab7935 add HD-Painter pipeline (#7520)
* add HD-Painter pipeline

* style fixing

* refactor, change doc, fix ruff

* fix docs

* used correct ruff version

---------

Co-authored-by: Hayk Manukyan <youremail@yourdomain.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-01 15:10:44 +05:30
YiYi Xu
7f724a930e fix the cpu offload tests (#7544)
fix
2024-04-01 14:27:14 +05:30
Jianbing Wu
9bef9f4be7 Fix SVD bug (shape of time_context) (#7268)
* Fix SVD bug (shape of `time_context`)

* Formatting code

* Formatting src/diffusers/models/transformers/transformer_temporal.py by `make style && make quality`

---------

Co-authored-by: kevinkhwu <kevinkhwu@tencent.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-01 14:05:52 +05:30
Dhruv Nair
7aa4514260 Fix typo in CPU offload test (#7542)
update
2024-03-31 22:07:17 -10:00
Bingxin Ke
c2e87869be [Community pipeline] Marigold depth estimation update -- align with marigold v0.1.5 (#7524)
* add resample option; check denoise_step; update ckpt path

* Add seeding in pipeline to increase reproducibility

* fix typo

* fix typo
2024-03-30 07:09:02 -10:00
Stephen
ca61287daa Fix IP Adapter Support for SAG Pipeline (#7260)
* fix ip adapter support

* Update sag pipelines tests, adjust sag pipeline to pass tests

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-30 06:15:29 -10:00
Beinsezii
f0c81562a4 Add final_sigma_zero to UniPCMultistep (#7517)
* Add `final_sigma_zero` to UniPCMultistep

Effectively the same trick as DDIM's `set_alpha_to_one` and
DPM's `final_sigma_type='zero'`.
Currently False by default but maybe this should be True?

* `final_sigma_zero: bool` -> `final_sigmas_type: str`

Should 1:1 match DPM Multistep now.

* Set `final_sigmas_type='sigma_min'` in UniPC UTs
2024-03-29 22:23:45 -10:00
Hyoungwon Cho
9d20ed37a2 Perturbed-Attention Guidance (#7512)
* pag_initial

* pag_docs

* edit_docs

* custom

* typo

* delete_docs

* whitespace

* make style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-30 10:52:51 +05:30
Linoy Tsaban
bda1d4faf8 add Instant id sdxl image2image pipeline (#7507)
* initial commit - instantid img2img

* adapting to img2img

* change add_time_ids

* change add_time_ids

* WIP changes

* add strength to timesteps

* check insightface import

* style

* check insightface import changed to warning

* check insightface import changed to warning

* style

---------

Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
2024-03-30 10:25:21 +05:30
UmerHA
77103d71ca Quick-Fix for #7352 block-lora (#7523)
Fixed important typo
2024-03-30 06:42:28 +05:30
UmerHA
0302446819 Implements Blockwise lora (#7352)
* Initial commit

* Implemented block lora

- implemented block lora
- updated docs
- added tests

* Finishing up

* Reverted unrelated changes made by make style

* Fixed typo

* Fixed bug + Made text_encoder_2 scalable

* Integrated some review feedback

* Incorporated review feedback

* Fix tests

* Made every module configurable

* Adapter to new lora test structure

* Final cleanup

* Some more final fixes

- Included examples in `using_peft_for_inference.md`
- Added hint that only attns are scaled
- Removed NoneTypes
- Added test to check mismatching lens of adapter names / weights raise error

* Update using_peft_for_inference.md

* Update using_peft_for_inference.md

* Make style, quality, fix-copies

* Updated tutorial;Warning if scale/adapter mismatch

* floats are forwarded as-is; changed tutorial scale

* make style, quality, fix-copies

* Fixed typo in tutorial

* Moved some warnings into `lora_loader_utils.py`

* Moved scale/lora mismatch warnings back

* Integrated final review suggestions

* Empty commit to trigger CI

* Reverted emoty commit to trigger CI

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-29 21:15:57 +05:30
Dhruv Nair
4d39b7483d Memory clean up on all Slow Tests (#7514)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-29 14:23:28 +05:30
Sayak Paul
fac761694a [Tests] Speed up some fast pipeline tests (#7477)
* speed up test_vae_slicing in animatediff

* speed up test_karras_schedulers_shape for attend and excite.

* style.

* get the static slices out.

* specify torch print options.

* modify

* test run with controlnet

* specify kwarg

* fix: things

* not None

* flatten

* controlnet img2img

* complete controlet sd

* finish more

* finish more

* finish more

* finish more

* finish the final batch

* add cpu check for expected_pipe_slice.

* finish the rest

* remove print

* style

* fix ssd1b controlnet test

* checking ssd1b

* disable the test.

* make the test_ip_adapter_single controlnet test more robust

* fix: simple inpaint

* multi

* disable panorama

* enable again

* panorama is shaky so leave it for now

* remove print

* raise tolerance.
2024-03-29 14:11:38 +05:30
YiYi Xu
34c90dbb31 fix OOM for test_vae_tiling (#7510)
use float16 and add torch.no_grad()
2024-03-29 08:22:39 +05:30
Lvkesheng Shen
e49c04d5d6 Bug fix for controlnetpipeline check_image (#7103)
* Bug fix for controlnetpipeline check_image

Bug fix for controlnetpipeline check_image when using multicontrolnet and prompt list

* Update test_inference_multiple_prompt_input function

* Update test_controlnet.py

add test for multiple prompts and multiple image conditioning

* Update test_controlnet.py

Fix format error

---------

Co-authored-by: Lvkesheng Shen <45848260+Fantast416@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-28 08:25:18 -10:00
YiYi Xu
f238cb0736 cpu_offload: remove all hooks before offload (#7448)
* add remove_all_hooks

* a few more fix and tests

* up

* Update src/diffusers/pipelines/pipeline_utils.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* split tests

* add

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-03-28 08:23:02 -10:00
Bagheera
d78acdedc1 apple mps: training support for SDXL (ControlNet, LoRA, Dreambooth, T2I) (#7447)
* apple mps: training support for SDXL LoRA

* sdxl: support training lora, dreambooth, t2i, pix2pix, and controlnet on apple mps

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-28 14:26:18 +05:30
Sayak Paul
6df103deba add: a helpful message when quality and repo consistency checks fail. (#7475) 2024-03-28 13:51:56 +05:30
Sayak Paul
73f28708be Improve nightly tests (#7385)
* flesh out the nightly tests

* address feedback.
2024-03-28 13:26:34 +05:30
Sayak Paul
0cbc78f04c [Modeling utils chore] import load_model_dict_into_meta only once (#7437)
import load_model_dict_into_meta only once
2024-03-28 13:01:53 +05:30
Thomas Liang
0cc5630945 [Chore] Fix Colab notebook links in README.md (#7495) 2024-03-27 12:36:36 -10:00
UmerHA
0b8e29289d Skip test_lora_fuse_nan on mps (#7481)
Skipping test_lora_fuse_nan on mps

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-27 14:35:59 +05:30
Sayak Paul
ab38ddf64f [chore] make the istructions on fetching all commits clearer. (#7474)
* make the istructions on fetching all commits clearer.

* Update setup.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-27 08:16:46 +05:30
YiYi Xu
ead82fedea fix torch.compile for multi-controlnet of sdxl inpaint (#7476)
fix

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-27 08:08:32 +05:30
Disty0
45b42d1203 Add device arg to offloading with combined pipelines (#7471)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-26 13:45:16 -10:00
Long(Tony) Lian
5199ee4f7b Fix missing raise statements in check_inputs (#7473)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-26 13:34:28 -10:00
Bagheera
544710ef0f diffusers#7426 fix stable diffusion xl inference on MPS when dtypes shift unexpectedly due to pytorch bugs (#7446)
* mps: fix XL pipeline inference at training time due to upstream pytorch bug

* Update src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* apply the safe-guarding logic elsewhere.

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-26 20:05:49 +05:30
M. Tolga Cangöz
443aa14e41 Fix Tiling in ConsistencyDecoderVAE (#7290)
* Fix typos

* Add docstring to `decode` method in `ConsistencyDecoderVAE`

* Fix tiling

* Enable tiled VAE decoding with customizable tile sample size and overlap factor

* Revert "Enable tiled VAE decoding with customizable tile sample size and overlap factor"

This reverts commit 181049675e.

* Add VAE tiling test for `ConsistencyDecoderVAE`

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-26 17:59:08 +05:30
Sayak Paul
288632adf6 [Training utils] add kohya conversion dict. (#7435)
* add kohya conversion dict.

* update readme

* typo

* add filename
2024-03-26 17:31:22 +05:30
Ernie Chu
5ce79cbded Update train_dreambooth_lora_sd15_advanced.py (#7433)
you cannot specify `type="bool"` and `action="store_true"` at the same time.
remove excessive and buggy `type=bool`.

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2024-03-26 12:53:02 +02:00
Marçal Comajoan Cara
d52f3e30f8 Fix broken link (#7472)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-26 10:29:08 +05:30
Sayak Paul
699dfb084c feat: support DoRA LoRA from community (#7371)
* feat: support dora loras from community

* safe-guard dora operations under peft version.

* pop use_dora when False

* make dora lora from kohya work.

* fix: kohya conversion utils.

* add a fast test for DoRA compatibility..

* add a nightly test.
2024-03-26 09:37:33 +05:30
Sayak Paul
484c8ef399 [tests] skip dynamo tests when python is 3.12. (#7458)
skip dynamo tests when python is 3.12.
2024-03-26 08:39:48 +05:30
estelleafl
0dd0528851 Small ldm3d fix (#7464)
* fixed typo

* updated doc to be consistent in naming

* make style/quality

* preprocessing for 4 channels and not 6

* make style

* test for 4c

* make style/quality

* fixed test on cpu

* fixed doc typo

* changed default ckpt to 4c

* Update pipeline_stable_diffusion_ldm3d.py

* fix bug

---------

Co-authored-by: Aflalo <estellea@isl-iam1.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu33.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu38.rr.intel.com>
2024-03-25 15:33:43 -10:00
UmerHA
1cd4732e7f Fixed minor error in test_lora_layers_peft.py (#7394)
* Update test_lora_layers_peft.py

* Update utils.py
2024-03-25 11:35:27 -10:00
M. Tolga Cangöz
a51b6cc86a [Docs] Fix typos (#7451)
* Fix typos

* Fix typos

* Fix typos

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-25 11:48:02 -07:00
Dhruv Nair
3bce0f3da1 Fix for str_to_bool definition in testing utils (#7461)
update
2024-03-25 13:33:09 +05:30
Dhruv Nair
9a34953823 Additional Memory clean up for slow tests (#7436)
* update

* update

* update
2024-03-25 12:19:21 +05:30
Sayak Paul
e29f16cfaa [Research Projects] ORPO diffusion for alignment (#7423)
* barebones orpo

* remove reference model.

* full implementation

* change default of beta_orpo

* add a training command.

* fix: dataloading issues.

* interpreting the formulation.

* revert styling

* add: wds full blown version

* fix: per_gpu_batch_siz

* start debuggin

* debugging

* remove print

* fix

* remove filter keys.

* turn on non-blocking calls.

* device_placement

* let's see.

* add bigger training run command

* reinitialize generator for fair repro

* add: detailed readme and requirements

---------

Co-authored-by: Sayak Paul <sayakpaul@Sayaks-MacBook-Pro-2.local>
2024-03-25 08:37:41 +05:30
M. Tolga Cangöz
f7dfcfd971 [IP-Adapter] Fix IP-Adapter Support and Refactor Callback for StableDiffusionPanoramaPipeline (#7262)
* Add properties and `IPAdapterTesterMixin` tests for `StableDiffusionPanoramaPipeline`

* Update torch manual seed to use `torch.Generator(device=device)`

* Refactor 📞🔙 to support `callback_on_step_end`

* make fix-copies
2024-03-24 16:07:02 -10:00
Sayak Paul
3c67864c5a Remove distutils (#7455)
* strtobool

* replace Command from setuptools.
2024-03-25 06:44:53 +05:30
Aryan
363699044e [refactor] Fix FreeInit behaviour (#7410)
* fix freeinit impl

* fix progress bar

* fix progress bar and remove old code

* fix num_inference_steps==1 case for freeinit by atleast running 1 step when fast sampling enabled
2024-03-22 19:20:00 +05:30
Sayak Paul
9613576191 add: space for calculating memory usagee. (#7414)
* add: space for calculating memory usahe.

* Update docs/source/en/using-diffusers/loading.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-22 08:43:21 +05:30
YiYi Xu
e4356d6488 add a "Community Scripts" section (#7358)
* add

* add tiling

* fix

* fix

* fix

* give community script its own readme

* Update examples/community/README_community_scripts.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/community/README_community_scripts.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/community/README_community_scripts.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/community/README_community_scripts.md

---------

Co-authored-by: Alexis Rolland <alexis.rolland@ubisoft.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-21 10:05:07 -10:00
Sayak Paul
82441460ef [Docs] add missing output image (#7425)
add missing output image
2024-03-21 09:22:06 -07:00
sayakpaul
3e1097cb63 Revert "add: space within docs to calculate mememory usage."
This reverts commit 78990dd960.
2024-03-21 08:33:02 +05:30
sayakpaul
78990dd960 add: space within docs to calculate mememory usage. 2024-03-21 08:32:37 +05:30
Yuanhao Zhai
405a1facd2 fix: enable unet_3d_condition to support time_cond_proj_dim (#7364)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-21 07:46:32 +05:30
M. Tolga Cangöz
3028089e5e Fix typos (#7411)
* Fix typos

* Fix typo in SVD.md
2024-03-20 18:46:47 -07:00
Sayak Paul
b536f39818 [Custom Pipelines with Custom Components] fix multiple things (#7304)
* checking to improve pipelines.

* more fixes.

* add: tip to encourage the usage of revision

* Apply suggestions from code review

* retrigger ci

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-20 18:49:00 +05:30
Sayak Paul
e25e525fde [LoRA test suite] refactor the test suite and cleanse it (#7316)
* cleanse and refactor lora testing suite.

* more cleanup.

* make check_if_lora_correctly_set a utility function

* fix: typo

* retrigger ci

* style
2024-03-20 17:13:52 +05:30
Sayak Paul
de9adb907c clean dep installation step in push_tests (#7382)
* clean dep installation step in push_tests

* fix: deps
2024-03-20 07:30:43 +05:30
Sayak Paul
bf861e65dc [Chore] add: fives names to citations. (#7395)
* add: four names to citations.

* add: steven
2024-03-20 06:37:57 +05:30
Dhruv Nair
4da810b943 Remove insecure torch.load calls (#7393)
update
2024-03-19 12:41:50 -10:00
Stephen
161c6e14b6 Change path to posix (modeling_utils.py) (#6781)
* Change path to posix

* running isort

* run style and quality checks

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-19 11:50:34 -10:00
laksjdjf
a6c9015c4e Fix ControlNetModel.from_unet do not load add_embedding (#7269)
* Fix ControlNetModel.from_unet do not load add_embedding

* delete white space in blank line

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-19 09:45:08 -10:00
PJC
e6a5f99e5c Update pipeline_controlnet_sd_xl_img2img.py (#7353)
* Update pipeline_controlnet_sd_xl_img2img.py

fix: safetensors load error

* fix for pass test

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-19 09:29:39 -10:00
Dhruv Nair
80ff4ba63e Fix issue with prompt embeds and latents in SD Cascade Decoder with multiple image embeddings for a single prompt. (#7381)
* fix

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-19 07:40:14 -10:00
Sayak Paul
b09a2aa308 [LoRA] fix cross_attention_kwargs problems and tighten tests (#7388)
* debugging

* let's see the numbers

* let's see the numbers

* let's see the numbers

* restrict tolerance.

* increase inference steps.

* shallow copy of cross_attentionkwargs

* remove print
2024-03-19 17:53:38 +05:30
YiYi Xu
63b6846849 [scheduler] fix a bug in add_noise (#7386)
* fix

* fix

* add a tests

* fix

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-03-19 00:50:58 -10:00
lawfordp2017
139f707e6e Correction for non-integral image resolutions with quantizations other than float32 (#7356)
* Correction for non-integral image resolutions with quantizations other than float32.

* Support for training, and use of diffusers-style casting.
2024-03-19 16:17:44 +05:30
Aryan
e4546fd5bb [docs] Add missing copied from statements in TCD Scheduler (#7360)
* add missing copied from statements in tcd scheduler

* update docstring

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-19 00:45:36 -10:00
Dhruv Nair
d44e31aec2 Add FreeInit Outputs to Docs Page (#7384)
* update

* fix
2024-03-19 14:13:41 +05:30
Sayak Paul
ce9825b56b [LoRA] pop the LoRA scale so that it doesn't get propagated to the weeds (#7338)
* pop scale from the top-level unet instead of getting it.

* improve readability.

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* fix a little bit.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-19 09:12:05 +05:30
M. Tolga Cangöz
85f9d92883 Fix conditional statement in test_schedulers.py (#7323)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-19 08:28:47 +05:30
M. Tolga Cangöz
916d9812a8 Update loading of config from a file in test_config.py (#7344)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 11:47:36 -10:00
M. Tolga Cangöz
e6a8492242 Use PyTorch's conventional inplace functions (#7332)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 09:12:15 -10:00
Beinsezii
ad0308b3f1 Add Cascade to Auto T2I + Decoder mappings (#7362)
* Add Cascade to Auto T2I + Decoder mappings

* ruff autofix

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 08:58:20 -10:00
M. Tolga Cangöz
e97a633b63 Update access of configuration attributes (#7343)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 08:53:29 -10:00
Sayak Paul
01ac37b331 [LoRA] Clean Kohya conversion utils (#7374)
* clean up the kohya_conversion utility

* state dict assignment
2024-03-18 06:53:37 -10:00
M. Tolga Cangöz
6a05b274cc Fix Typos (#7325)
* Fix PyTorch's convention for inplace functions

* Fix import structure in __init__.py and update config loading logic in test_config.py

* Update configuration access

* Fix typos

* Trim trailing white spaces

* Fix typo in logger name

* Revert "Fix PyTorch's convention for inplace functions"

This reverts commit f65dc4afcb.

* Fix typo in step_index property description

* Revert "Update configuration access"

This reverts commit 8d44e870b8.

* Revert "Fix import structure in __init__.py and update config loading logic in test_config.py"

This reverts commit 2ad5e8bca2.

* Fix typos

* Fix typos

* Fix typos

* Fix a typo: tranform -> transform
2024-03-18 09:48:40 -07:00
Anatoly Belikov
98d46a3f08 delete vae and text encoders after use in SDXL training script (#6693)
delete vae and text encoders after use

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 20:03:53 +05:30
Dhruv Nair
4330a747d4 [Tests] Fix ControlNet Single File tests (#7315)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 11:28:59 +05:30
Sayak Paul
76de6a09fb post-release v0.27.0 (#7329)
* post-release

* quality
2024-03-18 10:52:20 +05:30
Sayak Paul
25caf24ef9 Fix release workflow deps (#7339)
* pop scale from the top-level unet instead of getting it.

* improve readability.

* fix: pypi workflow deps

* revert
2024-03-16 07:18:11 +05:30
Abubakar Abid
8db3c9bc9f Adds docs for gradio.Interface.from_pipeline() (#7346)
* gradio docs

* Update docs/source/en/api/pipelines/stable_diffusion/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* changes

* changes

* changes

* Update docs/source/en/api/pipelines/stable_diffusion/overview.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-16 07:11:28 +05:30
Sayak Paul
e0e9f81971 add: torch to the pypi step. (#7328) 2024-03-15 12:28:12 +05:30
M. Tolga Cangöz
5d848ec07c [Tests] Update a deprecated parameter in test files and fix several typos (#7277)
* Add properties and `IPAdapterTesterMixin` tests for `StableDiffusionPanoramaPipeline`

* Fix variable name typo and update comments

* Update deprecated `output_type="numpy"` to "np" in test files

* Discard changes to src/diffusers/pipelines/stable_diffusion_panorama/pipeline_stable_diffusion_panorama.py

* Update test_stable_diffusion_panorama.py

* Update numbers in README.md

* Update get_guidance_scale_embedding method to use timesteps instead of w

* Update number of checkpoints in README.md

* Add type hints and fix var name

* Fix PyTorch's convention for inplace functions

* Fix a typo

* Revert "Fix PyTorch's convention for inplace functions"

This reverts commit 74350cf65b.

* Fix typos

* Indent

* Refactor get_guidance_scale_embedding method in LEditsPPPipelineStableDiffusionXL class
2024-03-14 12:17:35 -07:00
Dhruv Nair
4974b84564 Update Cascade Tests (#7324)
* update

* update

* update
2024-03-14 20:51:22 +05:30
Linoy Tsaban
83062fb872 [Advanced DreamBooth LoRA SDXL] Support EDM-style training (follow up of #7126) (#7182)
* add edm style training

* style

* finish adding edm training feature

* import fix

* fix latents mean

* minor adjustments

* add edm to readme

* style

* fix autocast and scheduler config issues when using edm

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-14 18:40:14 +05:30
Suraj Patil
b6d7e31d10 add edm schedulers in doc (#7319)
* add edm schedulers in doc

* add in toctree

* address reviewe comments
2024-03-14 11:52:25 +01:00
Anatoly Belikov
53e9aacc10 log loss per image (#7278)
* log loss per image

* add commandline param for per image loss logging

* style

* debug-loss -> debug_loss

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-14 11:41:43 +05:30
Dhruv Nair
41424466e3 [Tests] Fix incorrect constant in VAE scaling test. (#7301)
update
2024-03-14 10:24:01 +05:30
Sayak Paul
95de1981c9 add: pytest log installation (#7313) 2024-03-14 10:01:16 +05:30
Kenneth Gerald Hamilton
0b45b58867 update get_order_list if statement (#7309)
* update get_order_list if statement

* revery
2024-03-13 18:29:42 -10:00
Beinsezii
d3986f18be Change step_offset scheduler docstrings (#7128)
* Change step_offset scheduler docstrings

* Mention it may be needed by some models

* More docstrings

These ones failed literal S&R because I performed it case-sensitive
which is fun.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-13 15:12:00 -10:00
Alexander Bonnet
ee6a3a993d Fix typos in UNet2DConditionModel documentation (#7291)
* fix typo in UNet2DConditionModel documentation

* Fix indentation that may fix doc rendering

* Fix squished doc lines
2024-03-13 09:31:29 -07:00
Michael
b300517305 Add Intro page of TCD (#7259)
* add tcd intro

* resolve repos

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* revise NFEs related

* change inpainting location

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-13 09:21:51 -07:00
jnhuang
ac07b6dc6a Fix Wrong Text-encoder Grad Setting in Custom_Diffusion Training (#7302)
fix index in set textencoder grad

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-13 20:22:44 +05:30
Sayak Paul
46ab56a468 add: support for notifying maintainers about the nightly test status (#7117)
* add: support for notifying maintainers about the nightly test status

* add: a tempoerary workflow for validation.

* cancel in progress.

* runs-on

* clean up

* add: peft dep

* change device.

* multiple edits.

* remove temp workflow.
2024-03-13 16:48:11 +05:30
Sayak Paul
038ff70023 [PyPI publishing] feat: automate the process of pypi publication to some extent. (#7270)
* feat: automate the process of pypi publication to some extent.

* utility to fetch the latest release branch

* correct package name.
2024-03-13 16:27:59 +05:30
Manuel Brack
00eca4b887 [Pipeline] Add LEDITS++ pipelines (#6074)
* Setup LEdits++ file structure

* Fix import

* LEditsPP Stable Diffusion pipeline

* Include variable image aspect ratios

* Implement LEDITS++ for SDXL

* clean up LEditsPPPipelineStableDiffusion

* Adjust inversion output

* Added docu, more cleanup for LEditsPPPipelineStableDiffusion

* clean up LEditsPPPipelineStableDiffusionXL

* Update documentation

* Fix documentation import

* Add skeleton IF implementation

* Fix documentation typo

* Add LEDTIS docu to toctree

* Add missing title

* Finalize SD documentation

* Finalize SD-XL documentation

* Fix code style and quality

* Fix typo

* Fix return types

* added LEditsPPPipelineIF; minor changes for LEditsPPPipelineStableDiffusion and LEditsPPPipelineStableDiffusionXL

* Fix copy reference

* add documentation for IF

* Add first tests

* Fix batching for SD-XL

* Fix text encoding and perfect reconstruction for SD-XL

* Add tests for SD-XL, minor changes

* move user_mask to correct device, use cross_attention_kwargs also for inversion

* Example docstring

* Fix attention resolution for non-square images

* Refactoring for PR review

* Safely remove ledits_utils.py

* Style fixes

* Replace assertions with ValueError

* Remove LEditsPPPipelineIF

* Remove unecessary input checks

* Refactoring of CrossAttnProcessor

* Revert unecessary changes to scheduler

* Remove first progress-bar in inversion

* Refactor scheduler usage and reset

* Use imageprocessor instead of custom logic

* Fix scheduler init warning

* Fix error when running the pipeline in fp16

* Update documentation wrt perfect inversion

* Update tests

* Fix code quality and copy consistency

* Update LEditsPP import

* Remove enable/disable methods that are now in StableDiffusionMixin

* Change import in docs

* Revert import structure change

* Fix ledits imports

---------

Co-authored-by: Katharina Kornmeier <katharina.kornmeier@stud.tu-darmstadt.de>
2024-03-13 12:43:47 +02:00
Dhruv Nair
30132aba30 Update Stable Cascade Conversion Scripts (#7271)
* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-13 12:35:44 +05:30
Dhruv Nair
a17d6d6858 Update Cascade documentation (#7257)
* updates

* update

* update

* Update docs/source/en/api/pipelines/stable_cascade.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2024-03-13 11:29:59 +05:30
Sayak Paul
8efd9ce787 [Chore] clean residue from copy-pasting in the UNet single file loader (#7295)
clean residue from copy-pasting
2024-03-13 11:20:13 +05:30
Dhruv Nair
299c16d0f5 Fix loading Img2Img refiner components in from_single_file (#7282)
* update

* update

* update

* update
2024-03-13 09:25:53 +05:30
Dhruv Nair
69f49195ac Fix passing pooled prompt embeds to Cascade Decoder and Combined Pipeline (#7287)
* update

* update

* update

* update
2024-03-13 09:21:41 +05:30
Dhruv Nair
ed224f94ba Add single file support for Stable Cascade (#7274)
* update

* update

* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-13 08:37:31 +05:30
Sayak Paul
531e719163 [LoRA] use the PyTorch classes wherever needed and start depcrecation cycles (#7204)
* fix PyTorch classes and start deprecsation cycles.

* remove args crafting for accommodating scale.

* remove scale check in feedforward.

* assert against nn.Linear and not CompatibleLinear.

* remove conv_cls and lineaR_cls.

* remove scale

* 👋 scale.

* fix: unet2dcondition

* fix attention.py

* fix: attention.py again

* fix: unet_2d_blocks.

* fix-copies.

* more fixes.

* fix: resnet.py

* more fixes

* fix i2vgenxl unet.

* depcrecate scale gently.

* fix-copies

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* quality

* throw warning when scale is passed to the the BasicTransformerBlock class.

* remove scale from signature.

* cross_attention_kwargs, very nice catch by Yiyi

* fix: logger.warn

* make deprecation message clearer.

* address final comments.

* maintain same depcrecation message and also add it to activations.

* address yiyi

* fix copies

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* more depcrecation

* fix-copies

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-13 07:56:19 +05:30
Sayak Paul
4fbd310fd2 [Chore] switch to logger.warning (#7289)
switch to logger.warning
2024-03-13 06:56:43 +05:30
Dhruv Nair
2ea28d69dc Change export_to_video default (#6990)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-12 17:13:12 +05:30
erliding
a1cb106459 instruct pix2pix pipeline: remove sigma scaling when computing classifier free guidance (#7006)
remove sigma scaling when computing cfg

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-11 10:23:31 +01:00
Sayak Paul
5dd8e04d4b [Dockerfiles] add: a workflow to check if docker containers can be built in case of modifications (#7129)
* add: a workflow to check if docker containers can be built if the files are modified.

* type

* unify docker image build test and push

* make it run on prs too.

* check

* check

* check

* check again.

* remove docker test build file.

* remove extra dependencies./

* check
2024-03-11 08:54:00 +05:30
pravdomil
165af7edd3 Inline InputPadder (#6582)
inline InputPadder
2024-03-09 11:24:07 -10:00
Haofan Wang
6c5f0de713 Support latents_mean and latents_std (#7132)
* update latents_mean and latents_std

* fix typos

* Update src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl_img2img.py

* format

---------

Co-authored-by: ResearcherXman <xhs.research@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-09 08:54:19 -10:00
pravdomil
e64fdcf2ce Fix gmflow_dir (#6583)
* remove sys.path

* update readme
2024-03-09 08:53:17 -10:00
Sayak Paul
ec64f371b1 [Chore] remove tf mention (#7245)
remove tf mention
2024-03-09 11:39:04 +05:30
Aryan
cd6e1f1171 [docs/nits] Fix return values based on return_dict and minor doc updates (#7105)
* fix returns and docs

* handle latent output_type correctly

* revert to old tensor2vid impl

* make fix-copies

* fix return in community animatediff pipes

* fix return docstring

* fix return docs

* add missing quote

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-03-08 18:47:24 -10:00
Xiaodong Wang
6f2b310a17 [UNet_Spatio_Temporal_Condition] fix default num_attention_heads in unet_spatio_temporal_condition (#7205)
fix default num_attention_heads in unet_spatio_temporal_condition

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-08 18:29:06 -10:00
YiYi Xu
e3cd6cae50 update the signature of from_single_file (#7216)
* update the signature of from_single_file

* Update src/diffusers/loaders/single_file.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/loaders/single_file.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/loaders/single_file.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/loaders/single_file.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-08 17:51:51 -10:00
qqii
e5ee05da76 [Community Pipeline] Skip Marigold depth_colored with color_map=None (#7170)
[Community Pipeline] Skip Marigold depth_colored generation by passing color_map=None
2024-03-08 17:51:11 -10:00
Mengqing Cao
e6ff752840 Add npu support (#7144)
* Add npu support

* fix for code quality check

* fix for code quality check
2024-03-08 17:12:55 -10:00
UmerHA
3f9c746fb2 Adds denoising_end parameter to ControlNetPipeline for SDXL (#6175)
* Initial commit

* Removed copy hints, as in original SDXLControlNetPipeline

Removed copy hints, as in original SDXLControlNetPipeline, as the `make fix-copies` seems to have issues with the @property decorator.

* Reverted changes to ControlNetXS

* Addendum to: Removed changes to ControlNetXS

* Added test+docs for mixture of denoiser

* Update docs/source/en/using-diffusers/controlnet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/controlnet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-08 16:42:02 -10:00
Steven Liu
1f22c98820 [docs] IP-Adapter image embedding (#7226)
* update

* fix parameter name

* feedback

* add no mask version

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-08 08:49:58 -08:00
Sayak Paul
b4226bd6a7 [Tests] fix config checking tests (#7247)
* debig

* cast tuples to lists.

* debug

* handle upcast attention

* handle downblock types for vae.

* remove print.

* address Dhruv's comments.

* fix: upblock types.

* upcast attention

* debug

* debug

* debug

* better guarding.

* style
2024-03-08 18:53:07 +05:30
Chi
46fac824be Solve missing clip_sample implementation in FlaxDDIMScheduler. (#7017)
* I added a new doc string to the class. This is more flexible to understanding other developers what are doing and where it's using.

* Update src/diffusers/models/unet_2d_blocks.py

This changes suggest by maintener.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/unet_2d_blocks.py

Add suggested text

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_blocks.py

I changed the Parameter to Args text.

* Update unet_2d_blocks.py

proper indentation set in this file.

* Update unet_2d_blocks.py

a little bit of change in the act_fun argument line.

* I run the black command to reformat style in the code

* Update unet_2d_blocks.py

similar doc-string add to have in the original diffusion repository.

* Fix bug for mention in this issue section #6901

* Update src/diffusers/schedulers/scheduling_ddim_flax.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix linter

* Restore empty line

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-03-08 12:01:59 +01:00
Martin Müller
b33b64f595 Make mid block optional for flax UNet (#7083)
* make mid block optional for flax UNet

* make style
2024-03-08 11:35:06 +01:00
Sayak Paul
9d9744075e [Easy] fix: save_model_card utility of the DreamBooth SDXL LoRA script (#7258)
* fix: save_model_card utility.

* fix a little more to make it more lenient.

* remove lower()
2024-03-08 15:22:23 +05:30
Sayak Paul
d9a3b69806 [Utils] Improve " # Copied from ..." statements in the pipelines (#6917)
* copied from for t2i pipelines without ip adapter support.

* two more pipelines with proper copied from comments.

* revert to the original implementation
2024-03-08 14:42:26 +05:30
Sayak Paul
f7e5954d5e [Tests] fix: VAE tiling tests when setting the right device (#7246)
* debug

* checking

* fix more

* remove device.

* fix-copies
2024-03-08 10:01:25 +05:30
Sayak Paul
8e19c073e5 [Core] throw error when patch inputs and layernorm are provided for Transformers2D (#7200)
* throw error when patch inputs and layernorm are provided for transformers2d.

* add comment on supported norm_types in transformers2d

* more check

* fix: norm _type handling
2024-03-08 09:41:02 +05:30
Steven Liu
f6df16cbb8 [docs] Community tips (#7137)
* tips

* feedback

* callback only
2024-03-07 15:17:26 -08:00
pravdomil
b24f78349c use self.device (#6595)
* use self.device

* use device

* fix

* fix
2024-03-07 12:46:23 -10:00
Steven Liu
3ce905c9d0 [docs] Merge LoRAs (#7213)
* merge loras

* feedback

* torch.compile

* feedback
2024-03-07 11:28:50 -08:00
bimsarapathiraja
f539497ab4 Remove the line. Using it create wrong output (#7075)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-07 10:04:31 -08:00
Dhruv Nair
39dfb7abbd Raise an error when trying to use SD Cascade Decoder with dtype bfloat16 and torch < 2.2 (#7244)
update
2024-03-07 17:55:46 +05:30
Sayak Paul
196835695e fix: support for loading playground v2.5 single file checkpoint. (#7230)
* fix: support for loading playground v2.5 single file checkpoint.

* remove is_playground_model.

* fix: edm key

* apply Dhruv's comments but errors.

* fix: things.

* delegate model_type inference to a function.

* address Dhruv's comment.

* address rest of the comments.

* fix: kwargs

* fix

* update

---------

Co-authored-by: DN6 <dhruv.nair@gmail.com>
2024-03-07 15:31:03 +05:30
Sayak Paul
0d4dfbbd0a [Examples] fix: prior preservation setting in DreamBooth LoRA SDXL script. (#7242)
fix: prior preservation setting in DreamBooth LoRA SDXL script.

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2024-03-07 15:19:58 +05:30
Rinne
ada3bb941b fix: remove duplicated code in TemporalBasicTransformerBlock. (#7212)
fix: remove duplicate code in TemporalBasicTransformerBlock.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-07 13:25:22 +05:30
Linoy Tsaban
b5814c5555 add DoRA training feature to sdxl dreambooth lora script (#7235)
* dora in canonical script

* add mention of DoRA to readme
2024-03-07 11:43:37 +05:30
Paakhhi
9940573618 Refactor Prompt2Prompt: Inherit from DiffusionPipeline (#7211)
refactor: inherit from DiffusionPipeline instead of StableDiffusionPipeline
2024-03-06 19:34:40 -10:00
Sayak Paul
59433ca1ae [Core] move out the utilities from pipeline_utils.py (#7234)
move out the utilities from pipeline_utils.py
2024-03-07 08:45:24 +05:30
Nate Landman
534f5d54fa Update train_dreambooth_lora_sdxl_advanced.py (#7227)
adding the type gives you

```
TypeError: _StoreTrueAction.__init__() got an unexpected keyword argument 'type'
```
2024-03-06 12:41:48 +01:00
Kashif Rasul
40aa47b998 [Pipiline] Wuerstchen v3 aka Stable Cascasde pipeline (#6487)
* initial diffNext v3

* move to v3 folder

* imports

* dry up the unets

* no switch_level

* fix init

* add switch_level tp config

* Fixed some things

* Added pooled text embeddings

* Initial work on adding image encoder

* changes from @dome272

* Stuff for the image encoder processing and variable naming in decoder

* fix arg name

* inference fixes

* inference fixes

* default TimestepBlock without conds

* c_skip=0 by default

* fix bfloat16 to cpu

* use config

* undo temp change

* fix gen_c_embeddings args

* change text encoding

* text encoding

* undo print

* undo .gitignore change

* Allow WuerstchenV3PriorPipeline to use the base DDPM & DDIM schedulers

* use WuerstchenV3Unet in both pipelines

* fix imports

* initial failing tests

* cleanup

* use scheduler.timesterps

* some fixes to the tests, still not fully working

* fix tests

* fix prior tests

* add dropout to the model_kwargs

* more tests passing

* update expected_slice

* initial rename

* rename tests

* rename class names

* make fix-copies

* initial docs

* autodocs

* typos

* fix arg docs

* add text_encoder info

* combined pipeline has optional image arg

* fix documentation

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* use self.config

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* c_in -> in_channels

* removed kwargs from unet's forward

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove older callback api

* removed kwargs and fixed decoder guidance > 1

* decoder takes emeds

* check and use image_embeds

* fixed all but one decoder test

* fix decoder tests

* update callback api

* fix some more combined tests

* push combined pipeline

* initial docs

* fix doc_string

* update combined api

* no test_callback_inputs test for combined pipeline

* add optional components

* fix ordering of components

* fix combined tests

* update convert script

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade_prior.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade_prior.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade_prior.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* fix imports

* move effnet out of deniosing loop

* prompt_embeds_pooled only when doing guidance

* Fix repeat shape

* move StableCascadeUnet to models/unets/

* more descriptive names

* converted when numpy()

* StableCascadePriorPipelineOutput docs

* rename StableCascadeUNet

* add slow tests

* fix slow tests

* update

* update

* updated model_path

* add args for weights

* set push_to_hub to false

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: Dominic Rampas <d6582533@gmail.com>
Co-authored-by: Pablo Pernias <pablo@pernias.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-03-06 15:07:25 +05:30
Jinay Jain
1bc0d37ffe [bug] Fix float/int guidance scale not working in StableVideoDiffusionPipeline (#7143)
* [bug] Fix float/int guidance scale not working in `StableVideoDiffusionPipeline`

* Add test to disable CFG on SVD

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-06 14:05:07 +05:30
bram-w
eb942b866a SDXL Turbo support and example launch (#6473)
* support and example launch for sdxl turbo

* White space fixes

* Trailing whitespace character

* ruff format

* fix guidance_scale and steps for turbo mode

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Radames Ajna <radamajna@gmail.com>
2024-03-06 11:51:01 +05:30
Michael
687bc27727 add TCD Scheduler (#7174)
* add: support TCD scheduler


---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-04 19:43:34 -10:00
iczaw
6246c70d21 [Community] PromptDiffusion Pipeline (#6752)
* Create promptdiffusioncontrolnet.py

* Update __init__.py

Added PromptDiffusionControlNetModel

* Update __init__.py

Added PromptDiffusionControlNetModel

* Update promptdiffusioncontrolnet.py

* Create pipeline_prompt_diffusion.py

Added Prompt Diffusion pipeline.

* Create convert_original_promptdiffusion_to_diffusers.py

* Update convert_from_ckpt.py

Added download_promptdiffusion_from_original_ckpt, convert_promptdiffusion_checkpoint

* Update promptdiffusioncontrolnet.py

* Update pipeline_prompt_diffusion.py

* Update README.md

* Update pipeline_prompt_diffusion.py

* Delete src/diffusers/models/promptdiffusioncontrolnet.py

* Update __init__.py

* Update __init__.py

* Delete scripts/convert_original_promptdiffusion_to_diffusers.py

* Update convert_from_ckpt.py

* Update README.md

* Delete examples/community/pipeline_prompt_diffusion.py

* Create README.md

* Create promptdiffusioncontrolnet.py

* Create convert_original_promptdiffusion_to_diffusers.py

* Create pipeline_prompt_diffusion.py

* Update README.md

* Update pipeline_prompt_diffusion.py

* Update README.md

* Update pipeline_prompt_diffusion.py

* Update convert_original_promptdiffusion_to_diffusers.py

* Update promptdiffusioncontrolnet.py

* Update README.md

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-05 09:06:02 +05:30
Sayak Paul
577b8a2783 [Core] errors should be caught as soon as possible. (#7203)
errors should be caught as soon as possible.
2024-03-05 08:57:38 +05:30
Vinh H. Pham
13f0c8b219 [Docs] Update callback.md code example (#7150)
Update callback.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-05 08:44:19 +05:30
Aryan
fa1bdce3d4 [docs] Improve SVD pipeline docs (#7087)
* update svd docs

* fix example doc string

* update return type hints/docs

* update type hints

* Fix typos in pipeline_stable_video_diffusion.py

* make style && make fix-copies

* Update src/diffusers/pipelines/stable_video_diffusion/pipeline_stable_video_diffusion.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/stable_video_diffusion/pipeline_stable_video_diffusion.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update based on suggestion

---------

Co-authored-by: M. Tolga Cangöz <mtcangoz@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-04 12:12:38 -08:00
Thiago Crepaldi
ca6cdc77a9 Enable PyTorch's FakeTensorMode for EulerDiscreteScheduler scheduler (#7151)
* Enable FakeTensorMode for EulerDiscreteScheduler scheduler

PyTorch's FakeTensorMode does not support `.numpy()` or `numpy.array()`
calls.

This PR replaces `sigmas` numpy tensor by a PyTorch tensor equivalent

Repro

```python
with torch._subclasses.FakeTensorMode() as fake_mode, ONNXTorchPatcher():
    fake_model = DiffusionPipeline.from_pretrained(model_name, low_cpu_mem_usage=False)
```

that otherwise would fail with
`RuntimeError: .numpy() is not supported for tensor subclasses.`

* Address comments
2024-03-04 09:19:59 -10:00
M. Tolga Cangöz
f4977abcd8 Fix typos (#7181)
* Fix typos

* Fix typos

* Fix typos and update documentation in lora.md
2024-03-04 10:28:23 -08:00
fpgaminer
df8559a7f9 Fix: UNet2DModel::__init__ type hints; fixes issue #4806 (#7175)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-04 09:50:34 -08:00
YiYi Xu
8f206a5873 fix a bug in from_config (#7192)
* fix

* fix

* update comment

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-03-04 07:18:59 -10:00
Linoy Tsaban
8da360aa12 [training scripts] add tags of diffusers-training (#7206)
* add tags for diffusers training

* add tags for diffusers training

* add tags for diffusers training

* add tags for diffusers training

* add tags for diffusers training

* add tags for diffusers training

* add dora tags for drambooth lora scripts

* style
2024-03-04 22:17:25 +05:30
Dhruv Nair
869bad3e52 FIx torch and cuda version in ONNX tests (#7164)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-04 15:46:39 +05:30
Linoy Tsaban
01ee0978cc [advanced dreambooth lora sdxl] add DoRA training feature (#7072)
* add is_dora arg

* style

* add dora training feature to sd 1.5 script

* added notes about DoRA training

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-04 11:07:54 +01:00
Sayak Paul
56b68459f5 Update requirements.txt to remove huggingface-cli (#7202)
Internal message: https://huggingface.slack.com/archives/C03Q18WK18T/p1709529892062479
2024-03-04 11:29:01 +05:30
Vinh H. Pham
2ca264244b adding callback_on_step_end for StableDiffusionLDM3DPipeline (#7149)
* refactor callback

* run make style

* add copy

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-03 18:44:14 -10:00
Sayak Paul
b9e1c30d0e [Docs] more elaborate example for peft torch.compile (#7161)
more elaborate example for peft torch.compile
2024-03-04 08:55:30 +05:30
Sayak Paul
03cd62520f feat: add ip adapter benchmark (#6936)
* feat: add ip adapter benchmark

* sdxl support too.

* Empty-Commit
2024-03-03 15:10:55 +05:30
Álvaro Somoza
001b14023e [ip-adapter] fix problem using embeds with the plus version of ip adapters (#7189)
* initial

* check_inputs fix to the rest of pipelines

* add fix for no cfg too

* use of variable

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-02 21:37:00 -10:00
Junsong Chen
f55873b783 Fix PixArt 256px inference (#6789)
* feat 256px diffusers inference bug

* change the max_length of T5 to pipeline config file

* fix bug in convert_pixart_alpha_to_diffusers.py

* Update scripts/convert_pixart_alpha_to_diffusers.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* remove multi_scale_train parser

* Update src/diffusers/pipelines/pixart_alpha/pipeline_pixart_alpha.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/pixart_alpha/pipeline_pixart_alpha.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* styling

* change `model_token_max_length` to call argument.

* Refactoring

* add: max_sequence_length to the docstring.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-03 10:31:21 +05:30
Sayak Paul
ccb93dcad1 Support EDM-style training in DreamBooth LoRA SDXL script (#7126)
* add: dreambooth lora script for Playground v2.5

* fix: kwarg

* address suraj's comments.

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* apply suraj's suggestion

* incorporate changes in the canonical script./

* tracker naming

* fix: schedule determination

* add: two simple tests

* remove playground script

* note about edm-style training

* address pedro's comments.

* address part of Suraj's comments.

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* remove guidance_scale.

* use mse_loss.

* add comments for preconditioning.

* quality

* Update examples/dreambooth/train_dreambooth_lora_sdxl.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* tackle v-pred.

* Empty-Commit

* support edm for sdxl too.

* address suraj's comments.

* Empty-Commit

---------

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2024-03-03 09:28:57 +05:30
YiYi Xu
ec953047bc [stalebot] fix a bug (#7156)
fix

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-03-01 09:44:00 -10:00
Oleh
9a2600ede9 Map speedup (#6745)
* Speed up dataset mapping

* Fix missing columns

* Remove cache files cleanup

* Update examples/text_to_image/train_text_to_image_sdxl.py

* make style

* Fix code style

* style

* Empty-Commit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
2024-03-01 21:38:21 +05:30
Sayak Paul
5f150c4cef fix: loading problem for sdxl lora dreambooth (#7166) 2024-03-01 19:30:48 +05:30
Quentin Lhoest
66f8bd6869 Fix vae_encodings_fn hash in train_text_to_image_sdxl.py (#7171)
Update train_text_to_image_sdxl.py
2024-03-01 16:56:54 +05:30
Dhruv Nair
64a8cd627a [CI] Remove max parallel flag on slow test runners (#7162)
update
2024-03-01 11:38:21 +05:30
Sayak Paul
5d3923b670 Fix LCM benchmark test (#7158)
* make workflow dispatchable.

* fix: lcm lora compile
2024-03-01 10:33:44 +05:30
Sayak Paul
9451235e5a [Urgent][Docker CI] pin uv version for now and a minor change in the Slack notification (#7155)
pin uv version for now.
2024-03-01 10:11:07 +05:30
Sayak Paul
c2b6ac4e34 [CI] fix path filtering in the documentation workflows (#7153)
fix: path
2024-03-01 07:18:49 +05:30
YiYi Xu
06b01ea87e [ip-adapter] refactor prepare_ip_adapter_image_embeds and skip load image_encoder (#7016)
* add

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-29 15:38:49 -10:00
M. Tolga Cangöz
f4fc75035f [Docs] Fix typos (#7131)
* Add copyright notice to relevant files and fix typos

* Set `timestep_spacing` parameter of `StableDiffusionXLPipeline`'s scheduler to `'trailing'`.

* Update `StableDiffusionXLPipeline.from_single_file` by including EulerAncestralDiscreteScheduler with `timestep_spacing="trailing"` param.

* Update model loading method in SDXL Turbo documentation
2024-02-29 13:03:01 -08:00
Dhruv Nair
8f2d13c684 Fix setting fp16 dtype in AnimateDiff convert script. (#7127)
* update

* update
2024-02-29 22:47:39 +05:30
Sayak Paul
fcfa270fbd add: support for notifying the maintainers about the docker ci status. (#7113) 2024-02-29 19:28:52 +05:30
Sayak Paul
56dac1cedc limit documentation workflow runs for relevant changes. (#7125) 2024-02-29 19:01:54 +05:30
Sayak Paul
3daebe2b44 use uv for installing stuff in the workflows. (#7116)
* use uv for installing stuff in the workflows.

* fix: from source installation command when using uv.

* fix uv venv issue

* edit editable installation.

* fix quality installation

* checking

* make editable.

* more check

* check

* add: export step

* venv handling.

* checking.

* fix: dependency workflows.

* peft tests.

* proper way to initialize env.

* Empty-Commit

* Empty-Commit
2024-02-29 08:27:24 +05:30
Aryan
abd922bd0c [docs] unet type hints (#7134)
update
2024-02-28 10:39:01 -10:00
elucida
fa633ed6de refactor: move model helper function in pipeline to a mixin class (#6571)
* move model helper function in pipeline to EfficiencyMixin

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-28 09:26:39 -10:00
YiYi Xu
2cad1a8465 [stalebot] don't close the issue if the stale label is removed (#7106)
* fix

* fix

* remove stalebot's ability to close issues

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-28 08:04:28 -10:00
Dhruv Nair
e6cf21906d [Diffusers CI] Switch slow test runners (#7123)
update
2024-02-28 22:15:04 +05:30
Sayak Paul
7db935a141 fix kwarg in the SDXL LoRA DreamBooth (#7124)
* fix kwarg

* Empty-Commit
2024-02-28 10:21:28 +05:30
Sayak Paul
fa9bc029b4 [Tests] make test steps dependent on certain things and general cleanup of the workflows (#7026)
make tests conditional and other things.
2024-02-28 08:26:46 +05:30
Beinsezii
2e31a759b5 DPMSolverMultistep add rescale_betas_zero_snr (#7097)
* DPMMultistep rescale_betas_zero_snr

* DPM upcast samples in step()

* DPM rescale_betas_zero_snr UT

* DPMSolverMulti move sample upcast after model convert

Avoids having to re-use the dtype.

* Add a newline for Ruff
2024-02-27 11:37:34 -10:00
M. Tolga Cangöz
e51862bbed [Docs] Fix typos (#7118)
Fix typos, formatting and remove trailing whitespace
2024-02-27 12:38:00 -08:00
Suraj Patil
8492db2332 add DPM scheduler with EDM formulation (#7120)
* add DPM scheduler with EDM formulation

* set sigmas in init

* add _compute_sigmas

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* address some review comments

* up,

* add tests

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-02-27 23:39:38 +05:30
Suraj Patil
f57e7bd92c Add EDMEulerScheduler (#7109)
* Add EDMEulerScheduler

* address review comments

* fix import

* fix test

* add tests

* add co-author

Co-authored-by:  @dg845 dgu8957@gmail.com
2024-02-27 17:51:19 +05:30
Sayak Paul
3e3d46924b [Dockerfile] remove uv from docker jax tpu (#7115)
remove uv from docker jax tpu
2024-02-27 16:25:50 +05:30
Suraj Patil
d71ecad8cd denormalize latents with the mean and std if available (#7111)
* denormalize latents with the mean and std if available

* fix denormalize

* add latent mean and std in vae config

* address sayak's comment
2024-02-27 16:20:05 +05:30
Dhruv Nair
ac49f97a75 Add tests to check configs when using single file loading (#7099)
* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-27 15:47:23 +05:30
Sayak Paul
04bafcbbc2 move to uv in the Dockerfiles. (#7094)
move to uv in the Dockerfiles.
2024-02-27 15:11:28 +05:30
Sayak Paul
7081a25618 [Examples] Multiple enhancements to the ControlNet training scripts (#7096)
* log_validation unification for controlnet.

* additional fixes.

* remove print.

* better reuse and loading

* make final inference run conditional.

* Update examples/controlnet/README_sdxl.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* resize the control image in the snippet.

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-02-27 09:18:46 +05:30
Sayak Paul
848f9fe6ce [Core] pass revision in the loading_kwargs. (#7019)
* pass revision in the loading_kwarhs.

* remove revision from load_sub_model.
2024-02-27 08:52:38 +05:30
Younes Belkada
8a692739c0 FIX [PEFT / Core] Copy the state dict when passing it to load_lora_weights (#7058)
* copy the state dict in load lora weights

* fixup
2024-02-27 02:42:23 +01:00
Sayak Paul
5aa31bd674 [Easy] edit issue and PR templates (#7092)
edit templates to remove patrick's name.
2024-02-27 07:10:03 +05:30
jinghuan-Chen
88aa7f6ebf Make LoRACompatibleConv padding_mode work. (#6031)
* Make LoRACompatibleConv padding_mode work.

* Format code style.

* add fast test

* Update src/diffusers/models/lora.py

Simplify the code by patrickvonplaten.

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* code refactor

* apply patrickvonplaten suggestion to simplify the code.

* rm test_lora_layers_old_backend.py and add test case in test_lora_layers_peft.py

* update test case.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-26 14:05:13 -10:00
M. Tolga Cangöz
ad310af0d6 Fix EMA in train_text_to_image_sdxl.py (#7048)
* Fix typos
2024-02-26 10:39:57 -10:00
Dhruv Nair
d603ccb614 Small change to download in dance diffusion convert script (#7070)
* update

* make style
2024-02-26 12:05:19 +05:30
jiqing-feng
fd0f469568 Resize image before crop (#7095)
resize first

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-26 11:14:08 +05:30
Stephen
ae84e405a3 Pass use_linear_projection parameter to mid block in UNetMotionModel (#7035)
* pass linear projection parameter to mid block

* add cond_proj_dim to motion UNet

* run style and quality checks
2024-02-26 10:49:14 +05:30
Aryan
3a66113306 [Community] Bug fix + Latest IP-Adapter impl. for AnimateDiff img2vid/controlnet (#7086)
* fix img2vid; update to latest ip-adapter impl

* update README

* update animatediff controlnet to latest impl
2024-02-26 10:27:42 +05:30
Vinh H. Pham
7f16187182 Modularize Dreambooth LoRA SDXL inferencing during and after training (#6655)
* modularize log validation

* run make style

* revert import wandb

* fix code quality & import wandb

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-26 09:53:12 +05:30
Vinh H. Pham
f11b922b4f Modularize Dreambooth LoRA SD inferencing during and after training (#6654)
* modulize log validation

* run make style and refactor wanddb support

* remove redundant initialization

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-26 09:01:39 +05:30
Steven Liu
3dd4168d4c [docs] Minor updates (#7063)
* updates

* feedback
2024-02-25 09:38:02 -08:00
Fabio Rigano
1c47d1fc05 Fix head_to_batch_dim for IPAdapterAttnProcessor (#7077)
* Fix IPAdapterAttnProcessor

* Fix batch_to_head_dim and revert reshape
2024-02-25 00:05:46 -10:00
Aryan
bbf70c8739 Fix truthy-ness condition in pipelines that use denoising_start (#6912)
* fix denoising start

* fix tests

* remove debug
2024-02-24 23:39:22 -10:00
M. Tolga Cangöz
738c986957 [Refactor] StableDiffusionReferencePipeline inheriting from DiffusionPipeline (#7071)
Refactor StableDiffusionReferencePipeline to inherit from DiffusionPipeline rather than StableDiffusionPipeline
2024-02-23 22:04:31 -10:00
caiyueliang
c09bb588d3 fix: TensorRTStableDiffusionPipeline cannot set guidance_scale (#7065) 2024-02-23 14:59:02 -10:00
bimsarapathiraja
66a7160f9d Change images to image. The variable images is not used anywhere (#7074) 2024-02-23 10:40:21 -10:00
Chong-U Lim
f05ee56b2f Fix docstring of community pipeline imagic (#7062) 2024-02-23 09:37:52 -08:00
M. Tolga Cangöz
34cc7f9b98 Fix typos (#7068) 2024-02-23 09:24:51 -08:00
M. Tolga Cangöz
53605ed00a [Refactor] save_model_card function in text_to_image examples (#7051)
* Refactor save_model_card function to handle images and repo_folder parameters

* Discard changes to examples/text_to_image/train_text_to_image.py

* Discard changes to examples/text_to_image/train_text_to_image_lora_sdxl.py

* Update train_text_to_image_lora.py

* Update train_text_to_image_sdxl.py
2024-02-23 20:57:37 +05:30
Aryan
bb1b76d3bf IPAdapterTesterMixin (#6862)
* begin IPAdapterTesterMixin



---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-22 14:25:33 -10:00
YiYi Xu
e4b8f173b9 re-add unet refactor PR (#7044)
* add

* remove copied from

---------

Co-authored-by: ultranity <1095429904@qq.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-21 22:00:32 -10:00
Hezi Zisman
f0216b7756 allow explicit tokenizer & text_encoder in unload_textual_inversion (#6977)
* allow passing tokenizer & text_encoder to unload_textual_inversion


---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Fabio Rigano <fabio2rigano@gmail.com>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2024-02-21 16:53:25 -10:00
Lincoln Stein
d5f444de4b Update checkpoint_merger pipeline to pass the "variant" argument (#6670)
* make checkpoint_merger pipeline pass the "variant" argument to from_pretrained()

* make style

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-21 15:45:50 -10:00
M. Tolga Cangöz
5a54dc9e95 Fix typos in text_to_image examples (#7050)
Update copyright information and fix typos in text_to_image examples
2024-02-21 16:40:45 -08:00
YiYi Xu
6fedbd850a fix doc example for fom_single_file (#7015)
* fix doc

* remove use_safetensors from signature

* more

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-21 19:11:21 +05:30
pravdomil
1b3cfb1b10 update header (#6596)
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-02-20 10:09:15 -08:00
Dhruv Nair
af13a90ebd Remove disable_full_determinism from StableVideoDiffusion xformers test. (#7039)
* update

* update
2024-02-20 21:43:23 +05:30
Dhruv Nair
3067da1261 Fix load_model_dict_into_meta for ControlNet from_single_file (#7034)
update
2024-02-20 11:01:02 +05:30
Dhruv Nair
6bceaea3fe Update ControlNet Inpaint single file test (#7022)
update
2024-02-20 10:58:30 +05:30
Dhruv Nair
baf9924be7 Fix alt text and image links in AnimateLCM docs (#7029)
update
2024-02-20 08:30:44 +05:30
Nontapat Kaewamporn
d8d208acde Supper IP Adapter weight loading in StableDiffusionXLControlNetInpaintPipeline (#7031)
* support ip adapter loading

* fix style
2024-02-19 09:44:35 -10:00
Vinh H. Pham
e0f33dfca4 IP-Adapter support for StableDiffusionXLControlNetInpaintPipeline (#6941)
* add ip-adapter support

* support ip image embeds

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-19 08:20:24 -10:00
Dhruv Nair
15b125bb0e Add section on AnimateLCM to docs (#7024)
* update

* update

* update
2024-02-19 22:20:37 +05:30
ustcuna
12004bf3a7 [Community Pipelines]Accelerate inference of stable diffusion xl (SDXL) by IPEX on CPU (#6683)
* add stable_diffusion_xl_ipex community pipeline

* make style for code quality check

* update docs as suggested

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-02-19 13:39:08 +01:00
Dhruv Nair
d2fc5ebb95 [Refactor] FreeInit for AnimateDiff based pipelines (#6874)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update
2024-02-19 11:11:42 +05:30
YiYi Xu
779eef95b4 [from_single_file] pass torch_dtype to set_module_tensor_to_device (#6994)
fix

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-19 10:09:19 +05:30
Dhruv Nair
d5b8d1ca04 Fix Pixart Slow Tests (#6962)
* update

* update
2024-02-19 09:50:18 +05:30
Fabio Rigano
eba7e7a6d7 IP-Adapter attention masking (#6847)
* Add attention masking to attn processors

* Update tensor conversion

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-18 18:06:14 -10:00
Sayak Paul
31de879fb4 [IP2P] Make text encoder truly optional in InstructPi2Pix (#6995)
* make text encoder component truly optional.

* more fixes

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-18 20:21:22 +05:30
nbpppp
07349c25fe Fix deprecation warning for torch.utils._pytree._register_pytree_node in PyTorch 2.2 (#7008)
Fixed deprecation warning for torch.utils._pytree._register_pytree_node in PyTorch 2.2

Co-authored-by: Yinghua <yzho0423@uni.sydney.edu.au>
2024-02-18 20:11:33 +05:30
YiYi Xu
8974c50bff [SVD] fix a bug when passing image as tensor (#6999)
* fix

* update docstring

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-17 23:10:31 -10:00
Mikhail Koltakov
c18058b405 Fixed typos in dosctrings of __init__() and in forward() of Unet3DConditionModel (#6663)
* Fixed typos in __init__ and in forward of Unet3DConditionModel

* Resolving conflicts

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-17 21:56:16 -10:00
Thomas Lips
2938d5a672 Add documentation for strength parameter in Controlnet_img2img pipelines (#6951)
copy docstring for `strength` from  stablediffusion img2img pipeline to controlnet img2img pipelines
2024-02-17 21:47:02 -10:00
Sayak Paul
d4ade821cd start depcrecation cycle for lora_attention_proc 👋 (#7007) 2024-02-18 13:11:30 +05:30
Steven Liu
3a7e481611 [docs] Video generation (#6701)
* first draft

* fix path

* fix path

* i2vgen-xl

* review

* modelscopet2v

* feedback
2024-02-16 16:35:37 -08:00
Steven Liu
d649d6c6f3 [docs] Fix callout (#6998)
Update ip_adapter.md
2024-02-16 10:37:12 -10:00
Bhavay Malhotra
777063e1bf Update textual_inversion.py (#6952)
* Update textual_inversion.py

* Apply suggestions from code review

* Update textual_inversion.py

* Update textual_inversion.py

* Update textual_inversion.py

* Update textual_inversion.py

* Update examples/textual_inversion/textual_inversion.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update textual_inversion.py

* styling

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-16 15:39:51 +05:30
Stephen
104afbce84 Standardize model card for textual inversion sdxl (#6963)
* standardize model card

* fix tags

* correct import styling and update tags

* run make style and make quality

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-16 14:27:11 +05:30
co63oc
c0f5346a20 Fix procecss process (#6591)
* Fix words

* Fix

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-15 19:06:33 -10:00
Sayak Paul
087daee2f0 add: peft to the benchmark workflow (#6989) 2024-02-16 09:29:10 +05:30
Paakhhi
7e164d98a8 Fix diffusers import prompt2prompt (#6927)
* Bugfix: correct import for diffusers

* Fix: Prompt2Prompt example

* Format style

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-15 15:30:16 -10:00
Sayak Paul
e6d1728e0a [IP Adapters] feat: allow low_cpu_mem_usage in ip adapter loading (#6946)
* feat: allow low_cpu_mem_usage in ip adapter loading

* reduce the number of device placements.

* documentation.

* throw low_cpu_mem_usage warning only once from the main entry point.
2024-02-15 15:37:17 +05:30
Linoy Tsaban
8f2c7b4df0 [advanced sdxl lora script] - fix #6967 bug when using prior preservation loss (#6968)
* fix bug in micro-conditioning of class images

* fix bug in micro-conditioning of class images

* style
2024-02-15 12:20:05 +05:30
YiYi Xu
2e387dad5f fix IPAdapter unload_ip_adapter test (#6972)
add

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-14 20:42:40 -10:00
Steven Liu
9efe1e52c3 [docs] IP-Adapter (#6897)
* use cases

* first draft

* fix image links

* lcm-lora

* feedback

* review

* feedback

* feedback
2024-02-14 13:23:37 -08:00
Sayak Paul
37b09517b9 fix: controlnet inpaint single file. (#6975) 2024-02-14 19:04:57 +05:30
Sayak Paul
4343ce2c8e [Core] Harmonize single file ckpt model loading (#6971)
* use load_model_into_meta in single file utils

* propagate to autoencoder and controlnet.

* correct class name access behaviour.

* remove torch_dtype from load_model_into_meta; seems unncessary

* remove incorrect kwarg

* style to avoid extra unnecessary line breaks
2024-02-14 10:49:06 +05:30
Younes Belkada
0ca7b68198 [PEFT / docs] Add a note about torch.compile (#6864)
* Update using_peft_for_inference.md

* add more explanation
2024-02-14 02:29:29 +01:00
Dhruv Nair
3cf4f9c735 Allow passing config_file argument to ControlNetModel when using from_single_file (#6959)
* update

* update

* update
2024-02-13 18:54:53 +05:30
Dhruv Nair
40dd9cb2bd Move SDXL T2I Adapter lora test into PEFT workflow (#6965)
update
2024-02-13 17:08:53 +05:30
Dhruv Nair
30bcda7de6 Fix flaky IP Adapter test (#6960)
update
2024-02-13 17:07:39 +05:30
YiYi Xu
9ea62d119a [DPMSolverSinglestepScheduler] correct get_order_list for solver_order=2and lower_order_final=True (#6953)
* add

* change default

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-12 22:10:33 -10:00
Dhruv Nair
a326d61118 Fix configuring VAE from single file mixin (#6950)
* update
2024-02-12 22:10:05 -10:00
Alex Umnov
e7696e20f9 Updated lora inference instructions (#6913)
* Updated lora inference instructions

* Update examples/dreambooth/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update README.md

* Update README.md

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-13 09:35:20 +05:30
Piyush Thakur
4b89aeffe1 [Type annotations] fixed in save_model_card (#6948)
fixed type annotations

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-13 08:56:45 +05:30
Steven Liu
0a1daadef8 [docs] Community pipelines (#6929)
fix
2024-02-12 10:38:13 -08:00
Sayak Paul
371f765908 [Diffusers -> Original SD conversion] fix things (#6933)
* fix: bias loading bug

* fixes for SDXL

* apply changes to the conversion script to match single_file_utils.py

* do transpose to match the single file loading logic.
2024-02-12 17:30:22 +05:30
Piyush Thakur
75aee39eac [Model Card] standardize T2I Adapter Sdxl model card (#6947)
standardize model card template t21-adapter-sdxl
2024-02-12 16:43:20 +05:30
Dhruv Nair
215e6804d3 Unpin torch versions in CI (#6945)
* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-12 16:01:05 +05:30
Disty0
9254d1f39a Pass device to enable_model_cpu_offload in maybe_free_model_hooks (#6937) 2024-02-12 13:42:32 +05:30
Piyush Thakur
e1bdcc7af3 [Model Card] standardize T2I Sdxl Lora model card (#6944)
* standardize model card template t2i-lora-sdxl

* type annotations
2024-02-12 11:45:40 +05:30
Dhruv Nair
84905ca728 Update PixArt Alpha test module to match src module (#6943)
update
2024-02-12 11:01:33 +05:30
Piyush Thakur
6f336650c3 [Model Card] standardize T2I Sdxl model card (#6942)
standardize model card template t2i-sdxl
2024-02-12 10:01:20 +05:30
Piyush Thakur
06a042cd0e [Model Card] standardize T2I Lora model card (#6940)
standardize model card t2i-lora
2024-02-12 10:01:13 +05:30
Piyush Thakur
8772496586 [Model Card] standardize T2I model card (#6939)
* standardize model card

* fix base_model
2024-02-12 10:00:41 +05:30
dg845
35fd84be27 Replace hardcoded values in SchedulerCommonTest with properties (#5479)
---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-02-10 21:34:20 -10:00
YiYi Xu
f2756253e6 Fix a bug in AutoPipeline.from_pipe when switching pipeline with optional components (#6820)
* fix

* add tests

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-09 21:56:52 -10:00
YiYi Xu
0071478d9e allow attention processors to have different signatures (#6915)
add

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-09 21:56:19 -10:00
Sayak Paul
7c8cab313e post release 0.26.2 (#6885)
* post release

* style

* Empty-Commit

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-02-09 07:36:38 -10:00
Sayak Paul
ca9ed5e8d1 [LoRA] deprecate certain lora methods from the old backend. (#6889)
* deprecate certain lora methods from the old backend.

* uncomment necessary things.

* safe remove old lora backend 👋
2024-02-09 17:14:32 +01:00
Bingxin Ke
98b6bee1a1 [Community Pipeline][Bug Fix] marigold_depth_estimation: input image value range (#6787)
[FIX] IMPORTANT: rgb normalization
2024-02-09 16:55:37 +01:00
Dhruv Nair
ab7113487c More IPAdapter test fixes (#6888)
update
2024-02-09 13:54:58 +05:30
camaro
59c307f1d5 Standardize model card for Controlnet (#6910)
* controlnet

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-09 12:29:08 +05:30
Sayak Paul
159885adc6 correct hub_token exposition behaviour (thanks to @bghira). (#6918) 2024-02-08 18:38:27 -10:00
Aryan
7337eea59b [refactor] unnecessary lines in decode_latents in video pipelines (#6682)
* refactor decode latents in video pipelines

* make fix-copies
2024-02-08 17:10:52 -10:00
camaro
f07899a57c Standardize model card for Controlnet SDXL (#6908)
controlnet-sdxl
2024-02-09 07:53:39 +05:30
camaro
a83cc0c0bc Standardize model card for Controlnet flax (#6909)
* controlnet-flax

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-09 07:52:56 +05:30
C Q
db5194a45d Fix Compatibility Issues in stable_diffusion_xl_reference.py (#6251)
* Fix Compatibility Issues in stable_diffusion_xl_reference.py

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-08 10:52:59 -10:00
shaoxiaowang
e6c9c2513f fix examples/community/pipeline_stable_diffusion_xl_instantid.py (#6759)
Co-authored-by: wangshaoxiao <wangshaoxiao@xiaomi.com>
2024-02-08 10:09:40 -10:00
Kyunghwan Kim
d643b6691f Add vae tiling and slicing in img2img and inpaint (#6871)
* Add vae tiling in img2img and inpaint

* Add vae tiling not slicing
2024-02-08 09:50:00 -10:00
Michael
f5c9be3a0a Remove <cat-toy> validation prompt from examples/textual_inversion/textual_inversion_sdxl.py (#6877)
Remove <cat-toy> validation prompt from textual_inversion_sdxl.py

The `<cat-toy>` validation prompt is a default choice for the example task in the README. But no other part of `textual_inversion_sdxl.py` references the cat toy and `textual_inversion.py` has a default validation prompt of `None` as well.

So bring `textual_inversion_sdxl.py` in line with `textual_inversion.py` and change default validation prompt to `None`
2024-02-08 09:46:06 -10:00
Laisky.Cai
1824d0050e fix: load_image should support PIL Image (#6904)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-08 08:25:37 -10:00
Sayak Paul
30e5e81d58 change to 2024 in the license (#6902)
change to 2024
2024-02-08 08:19:31 -10:00
Masamune Ishihara
8de78001df Add fps argument to export_to_gif function. (#6786) 2024-02-08 21:59:51 +05:30
Patryk Bartkowiak
3ac2357794 changed positional parameters to named parameters like in docs (#6905)
Co-authored-by: Patryk Bartkowiak <patryk.bartkowiak@tcl.com>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2024-02-08 21:39:03 +05:30
Ehsan Akhgari
17808a091e Fix bug when converting checkpoint to diffusers format (#6900)
This fixes #6899.
2024-02-08 18:52:11 +05:30
Sayak Paul
491a933a1b [I2VGenXL] attention_head_dim in the UNet (#6872)
* attention_head_dim

* debug

* print more info

* correct num_attention_heads behaviour

* down_block_num_attention_heads -> num_attention_heads.

* correct the image link in doc.

* add: deprecation for num_attention_head

* fix: test argument to use attention_head_dim

* more fixes.

* quality

* address comments.

* remove depcrecation.
2024-02-08 12:30:14 +05:30
Sayak Paul
aa82df52e7 [IP Adapters] introduce ip_adapter_image_embeds in the SD pipeline call (#6868)
* add: support for passing ip adapter image embeddings

* debugging

* make feature_extractor unloading conditioned on safety_checker

* better condition

* type annotation

* index to look into value slices

* more debugging

* debugging

* serialize embeddings dict

* better conditioning

* remove unnecessary prints.

* Update src/diffusers/loaders/ip_adapter.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* make fix-copies and styling.

* styling and further copy fixing.

* fix: check_inputs call in controlnet sdxl img2img pipeline

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-08 11:10:10 +05:30
Srimanth Agastyaraju
a11b0f83b7 Fix: training resume from fp16 for SDXL Consistency Distillation (#6840)
* Fix: training resume from fp16 for lcm distill lora sdxl

* Fix coding quality - run linter

* Fix 1 - shift mixed precision cast before optimizer

* Fix 2 - State dict errors by removing load_lora_into_unet

* Update train_lcm_distill_lora_sdxl.py - Revert default cache dir to None

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-08 11:09:29 +05:30
Sayak Paul
1835510524 Remove torch_dtype in to() to end deprecation (#6886)
* remove torch_dtype from to()

* remove torch_dtype from usage scripts.

* remove old lora backend

* Revert "remove old lora backend"

This reverts commit adcddf6ba4.
2024-02-08 09:38:57 +05:30
camaro
4a3d52850b fix: keyword argument mismatch (#6895) 2024-02-08 09:37:56 +05:30
YiYi Xu
97d004b9b4 [ip-adapter] make sure length of scale is same as number of ip-adapters when using set_ip_adapter_scale (#6884)
add

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-07 10:13:12 -10:00
Sayak Paul
76696dca55 [Model Card] standardize dreambooth model card (#6729)
* feat: standarize model card creation for dreambooth training.

* correct 'inference

* remove comments.

* take component out of kwargs

* style

* add: card template to have a leaner description.

* widget support.

* propagate changes to train_dreambooth_lora

* propagate changes to custom diffusion

* make widget properly type-annotated
2024-02-07 15:07:11 +05:30
Félix Sanz
17612de451 fix: typo in callback function name and property (#6834)
* fix: callback function name is incorrect

On this tutorial there is a function defined and then used inside `callback_on_step_end` argument, but the name was not correct (mismatch)

* fix: typo in num_timestep (correct is num_timesteps)

fixed property name
2024-02-06 12:05:40 -08:00
Dhruv Nair
994360f7a5 Fix last IP Adapter test (#6875)
update
2024-02-06 08:53:40 -10:00
Dhruv Nair
e6a48db633 Refactor Deepfloyd IF tests. (#6855)
* update

* update

* update
2024-02-06 16:43:17 +05:30
sayakpaul
4f1df69d1a Revert "add attention_head_dim"
This reverts commit 15f6b22466.
2024-02-06 14:48:49 +05:30
sayakpaul
15f6b22466 add attention_head_dim 2024-02-06 14:48:07 +05:30
Sayak Paul
e6fd9ada3a [I2vGenXL] clean up things (#6845)
* remove _to_tensor

* remove _to_tensor definition

* remove _collapse_frames_into_batch

* remove lora for not bloating the code.

* remove sample_size.

* simplify code a bit more

* ensure timesteps are always in tensor.
2024-02-06 09:22:07 +05:30
Edward Li
493228a708 Fix AutoencoderTiny with use_slicing (#6850)
* Fix `AutoencoderTiny` with `use_slicing`

When using slicing with AutoencoderTiny, the encoder mistakenly encodes the entire batch for every image in the batch.

* Fixed formatting issue
2024-02-05 09:18:22 -10:00
Dhruv Nair
8bf046b7fb Add single file and IP Adapter support to PIA Pipeline (#6851)
update
2024-02-05 16:23:18 +05:30
Dhruv Nair
bb99623d09 Update IP Adapter tests to use cosine similarity distance (#6806)
* update

* update
2024-02-05 16:22:59 +05:30
Dhruv Nair
fdf55b1f1c Fix posix path issue in testing utils (#6849)
update
2024-02-05 08:57:18 +05:30
小咩Goat
c6f8c310c3 Fix forward pass in UNetMotionModel when gradient checkpoint is enabled (#6744)
fix #6742

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-02-05 08:04:01 +05:30
YiYi Xu
64909f17b7 update IP-adapter code in UNetMotionModel (#6828)
fix

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-05 07:26:46 +05:30
Dhruv Nair
f09ca909c8 Multiple small fixes to Video Pipeline docs (#6805)
* update

* update

* update

* Update src/diffusers/pipelines/i2vgen_xl/pipeline_i2vgen_xl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* update

* update

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-05 07:24:38 +05:30
YiYi Xu
a5fc62f819 add self.use_ada_layer_norm_* params back to BasicTransformerBlock (#6841)
fix sd reference community ppeline

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-04 11:16:44 -10:00
Linoy Tsaban
fbdf26bac5 [dreambooth lora sdxl] add sdxl micro conditioning (#6795)
* add micro conditioning

* remove redundant lines

* style

* fix missing 's'

* fix missing shape bug due to missing RGB if statement

* remove redundant if, change arg order

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-04 16:00:09 +02:00
Fabio Rigano
13001ee315 Bugfix in IPAdapterFaceID (#6835) 2024-02-03 08:56:55 -10:00
Linoy Tsaban
65329aed98 [advanced dreambooth lora sdxl script] new features + bug fixes (#6691)
* add noise_offset param

* micro conditioning - wip

* image processing adjusted and moved to support micro conditioning

* change time ids to be computed inside train loop

* change time ids to be computed inside train loop

* change time ids to be computed inside train loop

* time ids shape fix

* move token replacement of validation prompt to the same section of instance prompt and class prompt

* add offset noise to sd15 advanced script

* fix token loading during validation

* fix token loading during validation in sdxl script

* a little clean

* style

* a little clean

* style

* sdxl script - a little clean + minor path fix

sd 1.5 script - change default resolution value

* ad 1.5 script - minor path fix

* fix missing comma in code example in model card

* clean up commented lines

* style

* remove time ids computed outside training loop - no longer used now that we utilize micro-conditioning, as all time ids are now computed inside the training loop

* style

* [WIP] - added draft readme, building off of examples/dreambooth/README.md

* readme

* readme

* readme

* readme

* readme

* readme

* readme

* readme

* removed --crops_coords_top_left from CLI args

* style

* fix missing shape bug due to missing RGB if statement

* add blog mention at the start of the reamde as well

* Update examples/advanced_diffusion_training/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* change note to render nicely as well

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-03 17:33:43 +02:00
Stephen
02338c9317 Change path to posix (testing_utils.py) (#6803)
change path to pathlib as_posix

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-03 12:44:13 +05:30
Younes Belkada
15ed53d272 Fixes LoRA SDXL training script with DDP + PEFT (#6816)
Update train_dreambooth_lora_sdxl.py
2024-02-03 09:46:32 +05:30
UmerHA
9cc59ba089 [Contributor Experience] Fix test collection on MPS (#6808)
* Update testing_utils.py

* Update testing_utils.py
2024-02-02 20:59:00 +05:30
YiYi Xu
adcbe674a4 [refactor]Scheduler.set_begin_index (#6728) 2024-02-01 09:51:02 -10:00
Sayak Paul
ec9840a5db [Refactor] harmonize the module structure for models in tests (#6738)
* harmonize the module structure for models in tests

* make the folders modules.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-01 14:23:39 +05:30
YiYi Xu
093a03a1a1 add is_torchvision_available (#6800)
* add

* remove transformer

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-01-31 20:01:44 -10:00
Patrick von Platen
c3369f5673 fix torchvision import (#6796) 2024-01-31 12:13:10 -10:00
1126 changed files with 112382 additions and 30050 deletions

View File

@@ -66,32 +66,32 @@ body:
Questions on DiffusionPipeline (Saving, Loading, From pretrained, ...):
Questions on pipelines:
- Stable Diffusion @yiyixuxu @DN6 @sayakpaul @patrickvonplaten
- Stable Diffusion XL @yiyixuxu @sayakpaul @DN6 @patrickvonplaten
- Kandinsky @yiyixuxu @patrickvonplaten
- ControlNet @sayakpaul @yiyixuxu @DN6 @patrickvonplaten
- T2I Adapter @sayakpaul @yiyixuxu @DN6 @patrickvonplaten
- IF @DN6 @patrickvonplaten
- Text-to-Video / Video-to-Video @DN6 @sayakpaul @patrickvonplaten
- Wuerstchen @DN6 @patrickvonplaten
- Stable Diffusion @yiyixuxu @DN6 @sayakpaul
- Stable Diffusion XL @yiyixuxu @sayakpaul @DN6
- Kandinsky @yiyixuxu
- ControlNet @sayakpaul @yiyixuxu @DN6
- T2I Adapter @sayakpaul @yiyixuxu @DN6
- IF @DN6
- Text-to-Video / Video-to-Video @DN6 @sayakpaul
- Wuerstchen @DN6
- Other: @yiyixuxu @DN6
Questions on models:
- UNet @DN6 @yiyixuxu @sayakpaul @patrickvonplaten
- VAE @sayakpaul @DN6 @yiyixuxu @patrickvonplaten
- Transformers/Attention @DN6 @yiyixuxu @sayakpaul @DN6 @patrickvonplaten
- UNet @DN6 @yiyixuxu @sayakpaul
- VAE @sayakpaul @DN6 @yiyixuxu
- Transformers/Attention @DN6 @yiyixuxu @sayakpaul @DN6
Questions on Schedulers: @yiyixuxu @patrickvonplaten
Questions on Schedulers: @yiyixuxu
Questions on LoRA: @sayakpaul @patrickvonplaten
Questions on LoRA: @sayakpaul
Questions on Textual Inversion: @sayakpaul @patrickvonplaten
Questions on Textual Inversion: @sayakpaul
Questions on Training:
- DreamBooth @sayakpaul @patrickvonplaten
- Text-to-Image Fine-tuning @sayakpaul @patrickvonplaten
- Textual Inversion @sayakpaul @patrickvonplaten
- ControlNet @sayakpaul @patrickvonplaten
- DreamBooth @sayakpaul
- Text-to-Image Fine-tuning @sayakpaul
- Textual Inversion @sayakpaul
- ControlNet @sayakpaul
Questions on Tests: @DN6 @sayakpaul @yiyixuxu
@@ -99,7 +99,7 @@ body:
Questions on JAX- and MPS-related things: @pcuenca
Questions on audio pipelines: @DN6 @patrickvonplaten
Questions on audio pipelines: @DN6

View File

@@ -38,13 +38,13 @@ members/contributors who may be interested in your PR.
Core library:
- Schedulers: @yiyixuxu and @patrickvonplaten
- Pipelines: @patrickvonplaten and @sayakpaul
- Training examples: @sayakpaul and @patrickvonplaten
- Docs: @stevhliu and @yiyixuxu
- Schedulers: @yiyixuxu
- Pipelines: @sayakpaul @yiyixuxu @DN6
- Training examples: @sayakpaul
- Docs: @stevhliu and @sayakpaul
- JAX and MPS: @pcuenca
- Audio: @sanchit-gandhi
- General functionalities: @patrickvonplaten and @sayakpaul
- General functionalities: @sayakpaul @yiyixuxu @DN6
Integrations:

View File

@@ -1,6 +1,7 @@
name: Benchmarking tests
on:
workflow_dispatch:
schedule:
- cron: "30 1 1,15 * *" # every 2 weeks on the 1st and the 15th of every month at 1:30 AM
@@ -30,15 +31,15 @@ jobs:
nvidia-smi
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m pip install pandas
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install pandas peft
- name: Environment
run: |
python utils/print_env.py
- name: Diffusers Benchmarking
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.DIFFUSERS_BOT_TOKEN }}
HF_TOKEN: ${{ secrets.DIFFUSERS_BOT_TOKEN }}
BASE_PATH: benchmark_outputs
run: |
export TOTAL_GPU_MEMORY=$(python -c "import torch; print(torch.cuda.get_device_properties(0).total_memory / (1024**3))")

View File

@@ -1,20 +1,57 @@
name: Build Docker images (nightly)
name: Test, build, and push Docker images
on:
pull_request: # During PRs, we just check if the changes Dockerfiles can be successfully built
branches:
- main
paths:
- "docker/**"
workflow_dispatch:
schedule:
- cron: "0 0 * * *" # every day at midnight
concurrency:
group: docker-image-builds
cancel-in-progress: false
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
REGISTRY: diffusers
CI_SLACK_CHANNEL: ${{ secrets.CI_DOCKER_CHANNEL }}
jobs:
build-docker-images:
runs-on: ubuntu-latest
test-build-docker-images:
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
if: github.event_name == 'pull_request'
steps:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
- name: Check out code
uses: actions/checkout@v3
- name: Find Changed Dockerfiles
id: file_changes
uses: jitterbit/get-changed-files@v1
with:
format: 'space-delimited'
token: ${{ secrets.GITHUB_TOKEN }}
- name: Build Changed Docker Images
run: |
CHANGED_FILES="${{ steps.file_changes.outputs.all }}"
for FILE in $CHANGED_FILES; do
if [[ "$FILE" == docker/*Dockerfile ]]; then
DOCKER_PATH="${FILE%/Dockerfile}"
DOCKER_TAG=$(basename "$DOCKER_PATH")
echo "Building Docker image for $DOCKER_TAG"
docker build -t "$DOCKER_TAG" "$DOCKER_PATH"
fi
done
if: steps.file_changes.outputs.all != ''
build-and-push-docker-images:
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
if: github.event_name != 'pull_request'
permissions:
contents: read
@@ -32,17 +69,18 @@ jobs:
- diffusers-flax-tpu
- diffusers-onnxruntime-cpu
- diffusers-onnxruntime-cuda
- diffusers-doc-builder
steps:
- name: Checkout repository
uses: actions/checkout@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
- name: Login to Docker Hub
uses: docker/login-action@v2
with:
username: ${{ env.REGISTRY }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build and push
uses: docker/build-push-action@v3
with:
@@ -50,3 +88,14 @@ jobs:
context: ./docker/${{ matrix.image-name }}
push: true
tags: ${{ env.REGISTRY }}/${{ matrix.image-name }}:latest
- name: Post to a Slack channel
id: slack
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
# Slack channel id, channel name, or user id to post message.
# See also: https://api.slack.com/methods/chat.postMessage#channels
slack_channel: ${{ env.CI_SLACK_CHANNEL }}
title: "🤗 Results of the ${{ matrix.image-name }} Docker Image build"
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}

View File

@@ -7,6 +7,10 @@ on:
- doc-builder*
- v*-release
- v*-patch
paths:
- "src/diffusers/**.py"
- "examples/**"
- "docs/**"
jobs:
build:
@@ -17,7 +21,7 @@ jobs:
package: diffusers
notebook_folder: diffusers_doc
languages: en ko zh ja pt
custom_container: diffusers/diffusers-doc-builder
secrets:
token: ${{ secrets.HUGGINGFACE_PUSH }}
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}

View File

@@ -2,6 +2,10 @@ name: Build PR Documentation
on:
pull_request:
paths:
- "src/diffusers/**.py"
- "examples/**"
- "docs/**"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -16,3 +20,4 @@ jobs:
install_libgl1: true
package: diffusers
languages: en ko zh ja pt
custom_container: diffusers/diffusers-doc-builder

View File

@@ -0,0 +1,89 @@
name: Mirror Community Pipeline
on:
# Push changes on the main branch
push:
branches:
- main
paths:
- 'examples/community/**.py'
# And on tag creation (e.g. `v0.28.1`)
tags:
- '*'
# Manual trigger with ref input
workflow_dispatch:
inputs:
ref:
description: "Either 'main' or a tag ref"
required: true
default: 'main'
jobs:
mirror_community_pipeline:
runs-on: ubuntu-latest
steps:
# Checkout to correct ref
# If workflow dispatch
# If ref is 'main', set:
# CHECKOUT_REF=refs/heads/main
# PATH_IN_REPO=main
# Else it must be a tag. Set:
# CHECKOUT_REF=refs/tags/{tag}
# PATH_IN_REPO={tag}
# If not workflow dispatch
# If ref is 'refs/heads/main' => set 'main'
# Else it must be a tag => set {tag}
- name: Set checkout_ref and path_in_repo
run: |
if [ "${{ github.event_name }}" == "workflow_dispatch" ]; then
if [ -z "${{ github.event.inputs.ref }}" ]; then
echo "Error: Missing ref input"
exit 1
elif [ "${{ github.event.inputs.ref }}" == "main" ]; then
echo "CHECKOUT_REF=refs/heads/main" >> $GITHUB_ENV
echo "PATH_IN_REPO=main" >> $GITHUB_ENV
else
echo "CHECKOUT_REF=refs/tags/${{ github.event.inputs.ref }}" >> $GITHUB_ENV
echo "PATH_IN_REPO=${{ github.event.inputs.ref }}" >> $GITHUB_ENV
fi
elif [ "${{ github.ref }}" == "refs/heads/main" ]; then
echo "CHECKOUT_REF=${{ github.ref }}" >> $GITHUB_ENV
echo "PATH_IN_REPO=main" >> $GITHUB_ENV
else
# e.g. refs/tags/v0.28.1 -> v0.28.1
echo "CHECKOUT_REF=${{ github.ref }}" >> $GITHUB_ENV
echo "PATH_IN_REPO=${${{ github.ref }}#refs/tags/}" >> $GITHUB_ENV
fi
- name: Print env vars
run: |
echo "CHECKOUT_REF: ${{ env.CHECKOUT_REF }}"
echo "PATH_IN_REPO: ${{ env.PATH_IN_REPO }}"
- uses: actions/checkout@v3
with:
ref: ${{ env.CHECKOUT_REF }}
# Setup + install dependencies
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.10"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install --upgrade huggingface_hub
# Check secret is set
- name: whoami
run: huggingface-cli whoami
env:
HF_TOKEN: ${{ secrets.HF_TOKEN_MIRROR_COMMUNITY_PIPELINES }}
# Push to HF! (under subfolder based on checkout ref)
# https://huggingface.co/datasets/diffusers/community-pipelines-mirror
- name: Mirror community pipeline to HF
run: huggingface-cli upload diffusers/community-pipelines-mirror ./examples/community ${PATH_IN_REPO} --repo-type dataset
env:
PATH_IN_REPO: ${{ env.PATH_IN_REPO }}
HF_TOKEN: ${{ secrets.HF_TOKEN_MIRROR_COMMUNITY_PIPELINES }}

View File

@@ -1,6 +1,7 @@
name: Nightly tests on main
name: Nightly and release tests on main/release branch
on:
workflow_dispatch:
schedule:
- cron: "0 0 * * *" # every day at midnight
@@ -12,106 +13,348 @@ env:
PYTEST_TIMEOUT: 600
RUN_SLOW: yes
RUN_NIGHTLY: yes
PIPELINE_USAGE_CUTOFF: 5000
SLACK_API_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
jobs:
run_nightly_tests:
strategy:
fail-fast: false
matrix:
config:
- name: Nightly PyTorch CUDA tests on Ubuntu
framework: pytorch
runner: docker-gpu
image: diffusers/diffusers-pytorch-cuda
report: torch_cuda
- name: Nightly Flax TPU tests on Ubuntu
framework: flax
runner: docker-tpu
image: diffusers/diffusers-flax-tpu
report: flax_tpu
- name: Nightly ONNXRuntime CUDA tests on Ubuntu
framework: onnxruntime
runner: docker-gpu
image: diffusers/diffusers-onnxruntime-cuda
report: onnx_cuda
name: ${{ matrix.config.name }}
runs-on: ${{ matrix.config.runner }}
container:
image: ${{ matrix.config.image }}
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ ${{ matrix.config.runner == 'docker-tpu' && '--privileged' || '--gpus 0'}}
defaults:
run:
shell: bash
setup_torch_cuda_pipeline_matrix:
name: Setup Torch Pipelines Matrix
runs-on: diffusers/diffusers-pytorch-cpu
outputs:
pipeline_test_matrix: ${{ steps.fetch_pipeline_matrix.outputs.pipeline_test_matrix }}
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
if: ${{ matrix.config.runner == 'docker-gpu' }}
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
nvidia-smi
pip install -e .
pip install huggingface_hub
- name: Fetch Pipeline Matrix
id: fetch_pipeline_matrix
run: |
matrix=$(python utils/fetch_torch_cuda_pipeline_test_matrix.py)
echo $matrix
echo "pipeline_test_matrix=$matrix" >> $GITHUB_OUTPUT
- name: Pipeline Tests Artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: test-pipelines.json
path: reports
run_nightly_tests_for_torch_pipelines:
name: Torch Pipelines CUDA Nightly Tests
needs: setup_torch_cuda_pipeline_matrix
strategy:
fail-fast: false
matrix:
module: ${{ fromJson(needs.setup_torch_cuda_pipeline_matrix.outputs.pipeline_test_matrix) }}
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: nvidia-smi
- name: Install dependencies
run: |
python -m pip install -e .[quality,test]
python -m pip install -U git+https://github.com/huggingface/transformers
python -m pip install git+https://github.com/huggingface/accelerate
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
- name: Environment
run: |
python utils/print_env.py
- name: Run nightly PyTorch CUDA tests
if: ${{ matrix.config.framework == 'pytorch' }}
- name: Nightly PyTorch CUDA checkpoint (pipelines) tests
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
- name: Run nightly Flax TPU tests
if: ${{ matrix.config.framework == 'flax' }}
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
run: |
python -m pytest -n 0 \
-s -v -k "Flax" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
- name: Run nightly ONNXRuntime CUDA tests
if: ${{ matrix.config.framework == 'onnxruntime' }}
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
--make-reports=tests_pipeline_${{ matrix.module }}_cuda \
--report-log=tests_pipeline_${{ matrix.module }}_cuda.log \
tests/pipelines/${{ matrix.module }}
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/tests_${{ matrix.config.report }}_failures_short.txt
run: |
cat reports/tests_pipeline_${{ matrix.module }}_cuda_stats.txt
cat reports/tests_pipeline_${{ matrix.module }}_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: ${{ matrix.config.report }}_test_reports
name: pipeline_${{ matrix.module }}_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
run_nightly_tests_for_other_torch_modules:
name: Torch Non-Pipelines CUDA Nightly Tests
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
strategy:
matrix:
module: [models, schedulers, others, examples]
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
- name: Environment
run: python utils/print_env.py
- name: Run nightly PyTorch CUDA tests for non-pipeline modules
if: ${{ matrix.module != 'examples'}}
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_torch_${{ matrix.module }}_cuda \
--report-log=tests_torch_${{ matrix.module }}_cuda.log \
tests/${{ matrix.module }}
- name: Run nightly example tests with Torch
if: ${{ matrix.module == 'examples' }}
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m uv pip install peft@git+https://github.com/huggingface/peft.git
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v --make-reports=examples_torch_cuda \
--report-log=examples_torch_cuda.log \
examples/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_torch_${{ matrix.module }}_cuda_stats.txt
cat reports/tests_torch_${{ matrix.module }}_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: torch_${{ matrix.module }}_cuda_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
run_lora_nightly_tests:
name: Nightly LoRA Tests with PEFT and TORCH
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install peft@git+https://github.com/huggingface/peft.git
python -m uv pip install pytest-reportlog
- name: Environment
run: python utils/print_env.py
- name: Run nightly LoRA tests with PEFT and Torch
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_torch_lora_cuda \
--report-log=tests_torch_lora_cuda.log \
tests/lora
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_torch_lora_cuda_stats.txt
cat reports/tests_torch_lora_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: torch_lora_cuda_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
run_flax_tpu_tests:
name: Nightly Flax TPU Tests
runs-on: docker-tpu
if: github.event_name == 'schedule'
container:
image: diffusers/diffusers-flax-tpu
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --privileged
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
- name: Environment
run: python utils/print_env.py
- name: Run nightly Flax TPU tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 0 \
-s -v -k "Flax" \
--make-reports=tests_flax_tpu \
--report-log=tests_flax_tpu.log \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_flax_tpu_stats.txt
cat reports/tests_flax_tpu_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: flax_tpu_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
run_nightly_onnx_tests:
name: Nightly ONNXRuntime CUDA tests on Ubuntu
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-onnxruntime-cuda
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
- name: Environment
run: python utils/print_env.py
- name: Run nightly ONNXRuntime CUDA tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_onnx_cuda \
--report-log=tests_onnx_cuda.log \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_onnx_cuda_stats.txt
cat reports/tests_onnx_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: ${{ matrix.config.report }}_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
run_nightly_tests_apple_m1:
name: Nightly PyTorch MPS tests on MacOS
runs-on: [ self-hosted, apple-m1 ]
if: github.event_name == 'schedule'
steps:
- name: Checkout diffusers
@@ -132,10 +375,11 @@ jobs:
- name: Install dependencies
shell: arch -arch arm64 bash {0}
run: |
${CONDA_RUN} python -m pip install --upgrade pip
${CONDA_RUN} python -m pip install -e .[quality,test]
${CONDA_RUN} python -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
${CONDA_RUN} python -m pip install git+https://github.com/huggingface/accelerate
${CONDA_RUN} python -m pip install --upgrade pip uv
${CONDA_RUN} python -m uv pip install -e [quality,test]
${CONDA_RUN} python -m uv pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
${CONDA_RUN} python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate
${CONDA_RUN} python -m uv pip install pytest-reportlog
- name: Environment
shell: arch -arch arm64 bash {0}
@@ -146,9 +390,11 @@ jobs:
shell: arch -arch arm64 bash {0}
env:
HF_HOME: /System/Volumes/Data/mnt/cache
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps tests/
${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps \
--report-log=tests_torch_mps.log \
tests/
- name: Failure short reports
if: ${{ failure() }}
@@ -160,3 +406,9 @@ jobs:
with:
name: torch_mps_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY

View File

@@ -0,0 +1,23 @@
name: Notify Slack about a release
on:
workflow_dispatch:
release:
types: [published]
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: '3.8'
- name: Notify Slack about the release
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
run: pip install requests && python utils/notify_slack_about_release.py

View File

@@ -4,6 +4,8 @@ on:
pull_request:
branches:
- main
paths:
- "src/diffusers/**.py"
push:
branches:
- main
@@ -23,10 +25,12 @@ jobs:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -e .
pip install pytest
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pip install --upgrade pip uv
python -m uv pip install -e .
python -m uv pip install pytest
- name: Check for soft dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
pytest tests/others/test_dependencies.py

View File

@@ -4,6 +4,8 @@ on:
pull_request:
branches:
- main
paths:
- "src/diffusers/**.py"
push:
branches:
- main
@@ -23,12 +25,14 @@ jobs:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -e .
pip install "jax[cpu]>=0.2.16,!=0.3.2"
pip install "flax>=0.4.1"
pip install "jaxlib>=0.1.65"
pip install pytest
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pip install --upgrade pip uv
python -m uv pip install -e .
python -m uv pip install "jax[cpu]>=0.2.16,!=0.3.2"
python -m uv pip install "flax>=0.4.1"
python -m uv pip install "jaxlib>=0.1.65"
python -m uv pip install pytest
- name: Check for soft dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
pytest tests/others/test_dependencies.py

View File

@@ -1,49 +0,0 @@
name: Run code quality checks
on:
pull_request:
branches:
- main
push:
branches:
- main
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
check_code_quality:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check quality
run: |
ruff check examples tests src utils scripts
ruff format examples tests src utils scripts --check
check_repository_consistency:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check quality
run: |
python utils/check_copies.py
python utils/check_dummies.py
make deps_table_check_updated

View File

@@ -15,7 +15,7 @@ concurrency:
jobs:
setup_pr_tests:
name: Setup PR Tests
runs-on: docker-cpu
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
container:
image: diffusers/diffusers-pytorch-cpu
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
@@ -32,8 +32,8 @@ jobs:
fetch-depth: 0
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
- name: Environment
run: |
python utils/print_env.py
@@ -73,7 +73,7 @@ jobs:
max-parallel: 2
matrix:
modules: ${{ fromJson(needs.setup_pr_tests.outputs.matrix) }}
runs-on: docker-cpu
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
container:
image: diffusers/diffusers-pytorch-cpu
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
@@ -88,16 +88,18 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pip install -e [quality,test]
python -m pip install accelerate
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run all selected tests on CPU
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 2 --dist=loadfile -v --make-reports=${{ matrix.modules }}_tests_cpu ${{ fromJson(needs.setup_pr_tests.outputs.test_map)[matrix.modules] }}
- name: Failure short reports
@@ -121,7 +123,7 @@ jobs:
config:
- name: Hub tests for models, schedulers, and pipelines
framework: hub_tests_pytorch
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_hub
@@ -143,16 +145,18 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pip install -e [quality,test]
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run Hub tests for models, schedulers, and pipelines on a staging env
if: ${{ matrix.config.framework == 'hub_tests_pytorch' }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
HUGGINGFACE_CO_STAGING=true python -m pytest \
-m "is_staging_test" \
--make-reports=tests_${{ matrix.config.report }} \

View File

@@ -4,6 +4,9 @@ on:
pull_request:
branches:
- main
paths:
- "src/diffusers/**.py"
- "tests/**.py"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -16,7 +19,50 @@ env:
PYTEST_TIMEOUT: 60
jobs:
check_code_quality:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check quality
run: make quality
- name: Check if failure
if: ${{ failure() }}
run: |
echo "Quality check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make style && make quality'" >> $GITHUB_STEP_SUMMARY
check_repository_consistency:
needs: check_code_quality
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check repo consistency
run: |
python utils/check_copies.py
python utils/check_dummies.py
make deps_table_check_updated
- name: Check if failure
if: ${{ failure() }}
run: |
echo "Repo consistency check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make fix-copies'" >> $GITHUB_STEP_SUMMARY
run_fast_tests:
needs: [check_code_quality, check_repository_consistency]
strategy:
fail-fast: false
matrix:
@@ -25,7 +71,7 @@ jobs:
name: LoRA - ${{ matrix.lib-versions }}
runs-on: docker-cpu
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
container:
image: diffusers/diffusers-pytorch-cpu
@@ -43,23 +89,43 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
if [ "${{ matrix.lib-versions }}" == "main" ]; then
python -m pip install -U git+https://github.com/huggingface/peft.git
python -m pip install -U git+https://github.com/huggingface/transformers.git
python -m pip install -U git+https://github.com/huggingface/accelerate.git
python -m pip install -U peft@git+https://github.com/huggingface/peft.git
python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git
python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
else
python -m pip install -U peft transformers accelerate
python -m uv pip install -U peft transformers accelerate
fi
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run fast PyTorch LoRA CPU tests with PEFT backend
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v \
--make-reports=tests_${{ matrix.config.report }} \
tests/lora/test_lora_layers_peft.py
tests/lora/
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v \
--make-reports=tests_models_lora_${{ matrix.config.report }} \
tests/models/ -k "lora"
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_${{ matrix.config.report }}_failures_short.txt
cat reports/tests_models_lora_${{ matrix.config.report }}_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: pr_${{ matrix.config.report }}_test_reports
path: reports

View File

@@ -4,6 +4,14 @@ on:
pull_request:
branches:
- main
paths:
- "src/diffusers/**.py"
- "benchmarks/**.py"
- "examples/**.py"
- "scripts/**.py"
- "tests/**.py"
- ".github/**.yml"
- "utils/**.py"
push:
branches:
- ci-*
@@ -19,34 +27,72 @@ env:
PYTEST_TIMEOUT: 60
jobs:
check_code_quality:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check quality
run: make quality
- name: Check if failure
if: ${{ failure() }}
run: |
echo "Quality check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make style && make quality'" >> $GITHUB_STEP_SUMMARY
check_repository_consistency:
needs: check_code_quality
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check repo consistency
run: |
python utils/check_copies.py
python utils/check_dummies.py
make deps_table_check_updated
- name: Check if failure
if: ${{ failure() }}
run: |
echo "Repo consistency check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make fix-copies'" >> $GITHUB_STEP_SUMMARY
run_fast_tests:
needs: [check_code_quality, check_repository_consistency]
strategy:
fail-fast: false
matrix:
config:
- name: Fast PyTorch Pipeline CPU tests
framework: pytorch_pipelines
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 32-cpu, 256-ram, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_cpu_pipelines
- name: Fast PyTorch Models & Schedulers CPU tests
framework: pytorch_models
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_cpu_models_schedulers
- name: LoRA
framework: lora
runner: docker-cpu
image: diffusers/diffusers-pytorch-cpu
report: torch_cpu_lora
- name: Fast Flax CPU tests
framework: flax
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-flax-cpu
report: flax_cpu
- name: PyTorch Example CPU tests
framework: pytorch_examples
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_example_cpu
@@ -70,18 +116,20 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m pip install accelerate
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run fast PyTorch Pipeline CPU tests
if: ${{ matrix.config.framework == 'pytorch_pipelines' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 8 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/pipelines
@@ -89,23 +137,17 @@ jobs:
- name: Run fast PyTorch Model Scheduler CPU tests
if: ${{ matrix.config.framework == 'pytorch_models' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx and not Dependency" \
--make-reports=tests_${{ matrix.config.report }} \
tests/models tests/schedulers tests/others
- name: Run fast PyTorch LoRA CPU tests
if: ${{ matrix.config.framework == 'lora' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx and not Dependency" \
--make-reports=tests_${{ matrix.config.report }} \
tests/lora
- name: Run fast Flax TPU tests
if: ${{ matrix.config.framework == 'flax' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Flax" \
--make-reports=tests_${{ matrix.config.report }} \
tests
@@ -113,8 +155,9 @@ jobs:
- name: Run example PyTorch CPU tests
if: ${{ matrix.config.framework == 'pytorch_examples' }}
run: |
python -m pip install peft
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install peft timm
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
--make-reports=tests_${{ matrix.config.report }} \
examples
@@ -130,13 +173,14 @@ jobs:
path: reports
run_staging_tests:
needs: [check_code_quality, check_repository_consistency]
strategy:
fail-fast: false
matrix:
config:
- name: Hub tests for models, schedulers, and pipelines
framework: hub_tests_pytorch
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_hub
@@ -160,16 +204,18 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run Hub tests for models, schedulers, and pipelines on a staging env
if: ${{ matrix.config.framework == 'hub_tests_pytorch' }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
HUGGINGFACE_CO_STAGING=true python -m pytest \
-m "is_staging_test" \
--make-reports=tests_${{ matrix.config.report }} \

View File

@@ -4,6 +4,8 @@ on:
pull_request:
branches:
- main
paths:
- "src/diffusers/**.py"
push:
branches:
- main
@@ -23,10 +25,12 @@ jobs:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -e .
pip install torch torchvision torchaudio
pip install pytest
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pip install --upgrade pip uv
python -m uv pip install -e .
python -m uv pip install torch torchvision torchaudio
python -m uv pip install pytest
- name: Check for soft dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
pytest tests/others/test_dependencies.py

View File

@@ -4,7 +4,10 @@ on:
push:
branches:
- main
paths:
- "src/diffusers/**.py"
- "examples/**.py"
- "tests/**.py"
env:
DIFFUSERS_IS_CI: yes
@@ -18,10 +21,9 @@ env:
jobs:
setup_torch_cuda_pipeline_matrix:
name: Setup Torch Pipelines CUDA Slow Tests Matrix
runs-on: docker-gpu
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
container:
image: diffusers/diffusers-pytorch-cpu # this is a CPU image, but we need it to fetch the matrix
options: --shm-size "16gb" --ipc host
image: diffusers/diffusers-pytorch-cpu
outputs:
pipeline_test_matrix: ${{ steps.fetch_pipeline_matrix.outputs.pipeline_test_matrix }}
steps:
@@ -31,21 +33,17 @@ jobs:
fetch-depth: 2
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m pip install git+https://github.com/huggingface/accelerate.git
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
- name: Environment
run: |
python utils/print_env.py
- name: Fetch Pipeline Matrix
id: fetch_pipeline_matrix
run: |
matrix=$(python utils/fetch_torch_cuda_pipeline_test_matrix.py)
echo $matrix
echo "pipeline_test_matrix=$matrix" >> $GITHUB_OUTPUT
- name: Pipeline Tests Artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
@@ -58,13 +56,13 @@ jobs:
needs: setup_torch_cuda_pipeline_matrix
strategy:
fail-fast: false
max-parallel: 1
max-parallel: 8
matrix:
module: ${{ fromJson(needs.setup_torch_cuda_pipeline_matrix.outputs.pipeline_test_matrix) }}
runs-on: docker-gpu
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
@@ -75,15 +73,15 @@ jobs:
nvidia-smi
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m pip install git+https://github.com/huggingface/accelerate.git
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
python utils/print_env.py
- name: Slow PyTorch CUDA checkpoint tests on Ubuntu
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
@@ -96,7 +94,6 @@ jobs:
run: |
cat reports/tests_pipeline_${{ matrix.module }}_cuda_stats.txt
cat reports/tests_pipeline_${{ matrix.module }}_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
@@ -106,16 +103,16 @@ jobs:
torch_cuda_tests:
name: Torch CUDA Tests
runs-on: docker-gpu
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
strategy:
matrix:
module: [models, schedulers, lora, others]
module: [models, schedulers, lora, others, single_file]
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
@@ -124,9 +121,9 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m pip install git+https://github.com/huggingface/accelerate.git
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
@@ -134,7 +131,7 @@ jobs:
- name: Run slow PyTorch CUDA tests
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
@@ -158,10 +155,10 @@ jobs:
peft_cuda_tests:
name: PEFT CUDA Tests
runs-on: docker-gpu
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
@@ -173,10 +170,10 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m pip install git+https://github.com/huggingface/accelerate.git
python -m pip install git+https://github.com/huggingface/peft.git
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m pip install -U peft@git+https://github.com/huggingface/peft.git
- name: Environment
run: |
@@ -184,7 +181,7 @@ jobs:
- name: Run slow PEFT CUDA tests
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
@@ -192,12 +189,17 @@ jobs:
-s -v -k "not Flax and not Onnx and not PEFTLoRALoading" \
--make-reports=tests_peft_cuda \
tests/lora/
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "lora and not Flax and not Onnx and not PEFTLoRALoading" \
--make-reports=tests_peft_cuda_models_lora \
tests/models/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_peft_cuda_stats.txt
cat reports/tests_peft_cuda_failures_short.txt
cat reports/tests_peft_cuda_models_lora_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
@@ -211,7 +213,7 @@ jobs:
runs-on: docker-tpu
container:
image: diffusers/diffusers-flax-tpu
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --privileged
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ --privileged
defaults:
run:
shell: bash
@@ -223,9 +225,9 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m pip install git+https://github.com/huggingface/accelerate.git
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
@@ -233,7 +235,7 @@ jobs:
- name: Run slow Flax TPU tests
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 0 \
-s -v -k "Flax" \
@@ -255,10 +257,10 @@ jobs:
onnx_cuda_tests:
name: ONNX CUDA Tests
runs-on: docker-gpu
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-onnxruntime-cuda
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
@@ -270,9 +272,9 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m pip install git+https://github.com/huggingface/accelerate.git
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
@@ -280,7 +282,7 @@ jobs:
- name: Run slow ONNXRuntime CUDA tests
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
@@ -303,11 +305,11 @@ jobs:
run_torch_compile_tests:
name: PyTorch Compile CUDA tests
runs-on: docker-gpu
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-compile-cuda
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Checkout diffusers
@@ -320,13 +322,14 @@ jobs:
nvidia-smi
- name: Install dependencies
run: |
python -m pip install -e .[quality,test,training]
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test,training]
- name: Environment
run: |
python utils/print_env.py
- name: Run example tests on GPU
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "compile" --make-reports=tests_torch_compile_cuda tests/
- name: Failure short reports
@@ -343,11 +346,11 @@ jobs:
run_xformers_tests:
name: PyTorch xformers CUDA tests
runs-on: docker-gpu
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-xformers-cuda
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Checkout diffusers
@@ -360,13 +363,14 @@ jobs:
nvidia-smi
- name: Install dependencies
run: |
python -m pip install -e .[quality,test,training]
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test,training]
- name: Environment
run: |
python utils/print_env.py
- name: Run example tests on GPU
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "xformers" --make-reports=tests_torch_xformers_cuda tests/
- name: Failure short reports
@@ -383,11 +387,11 @@ jobs:
run_examples_tests:
name: Examples PyTorch CUDA tests on Ubuntu
runs-on: docker-gpu
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Checkout diffusers
@@ -401,16 +405,20 @@ jobs:
- name: Install dependencies
run: |
python -m pip install -e .[quality,test,training]
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test,training]
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run example tests on GPU
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install timm
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v --make-reports=examples_torch_cuda examples/
- name: Failure short reports
@@ -424,4 +432,4 @@ jobs:
uses: actions/upload-artifact@v2
with:
name: examples_test_reports
path: reports
path: reports

View File

@@ -4,6 +4,10 @@ on:
push:
branches:
- main
paths:
- "src/diffusers/**.py"
- "examples/**.py"
- "tests/**.py"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -25,22 +29,22 @@ jobs:
config:
- name: Fast PyTorch CPU tests on Ubuntu
framework: pytorch
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_cpu
- name: Fast Flax CPU tests on Ubuntu
framework: flax
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-flax-cpu
report: flax_cpu
- name: Fast ONNXRuntime CPU tests on Ubuntu
framework: onnxruntime
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-onnxruntime-cpu
report: onnx_cpu
- name: PyTorch Example CPU tests on Ubuntu
framework: pytorch_examples
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_example_cpu
@@ -64,17 +68,19 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev libgl1 -y
python -m pip install -e .[quality,test]
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run fast PyTorch CPU tests
if: ${{ matrix.config.framework == 'pytorch' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
@@ -82,7 +88,8 @@ jobs:
- name: Run fast Flax TPU tests
if: ${{ matrix.config.framework == 'flax' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Flax" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
@@ -90,7 +97,8 @@ jobs:
- name: Run fast ONNXRuntime CPU tests
if: ${{ matrix.config.framework == 'onnxruntime' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
@@ -98,8 +106,9 @@ jobs:
- name: Run example PyTorch CPU tests
if: ${{ matrix.config.framework == 'pytorch_examples' }}
run: |
python -m pip install peft
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install peft timm
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
--make-reports=tests_${{ matrix.config.report }} \
examples

View File

@@ -4,6 +4,9 @@ on:
push:
branches:
- main
paths:
- "src/diffusers/**.py"
- "tests/**.py"
env:
DIFFUSERS_IS_CI: yes
@@ -20,7 +23,7 @@ concurrency:
jobs:
run_fast_tests_apple_m1:
name: Fast PyTorch MPS tests on MacOS
runs-on: [ self-hosted, apple-m1 ]
runs-on: macos-13-xlarge
steps:
- name: Checkout diffusers
@@ -41,11 +44,11 @@ jobs:
- name: Install dependencies
shell: arch -arch arm64 bash {0}
run: |
${CONDA_RUN} python -m pip install --upgrade pip
${CONDA_RUN} python -m pip install -e .[quality,test]
${CONDA_RUN} python -m pip install torch torchvision torchaudio
${CONDA_RUN} python -m pip install git+https://github.com/huggingface/accelerate.git
${CONDA_RUN} python -m pip install transformers --upgrade
${CONDA_RUN} python -m pip install --upgrade pip uv
${CONDA_RUN} python -m uv pip install -e [quality,test]
${CONDA_RUN} python -m uv pip install torch torchvision torchaudio
${CONDA_RUN} python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
${CONDA_RUN} python -m uv pip install transformers --upgrade
- name: Environment
shell: arch -arch arm64 bash {0}
@@ -56,7 +59,7 @@ jobs:
shell: arch -arch arm64 bash {0}
env:
HF_HOME: /System/Volumes/Data/mnt/cache
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
${CONDA_RUN} python -m pytest -n 0 -s -v --make-reports=tests_torch_mps tests/

81
.github/workflows/pypi_publish.yaml vendored Normal file
View File

@@ -0,0 +1,81 @@
# Adapted from https://blog.deepjyoti30.dev/pypi-release-github-action
name: PyPI release
on:
workflow_dispatch:
push:
tags:
- "*"
jobs:
find-and-checkout-latest-branch:
runs-on: ubuntu-latest
outputs:
latest_branch: ${{ steps.set_latest_branch.outputs.latest_branch }}
steps:
- name: Checkout Repo
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.8'
- name: Fetch latest branch
id: fetch_latest_branch
run: |
pip install -U requests packaging
LATEST_BRANCH=$(python utils/fetch_latest_release_branch.py)
echo "Latest branch: $LATEST_BRANCH"
echo "latest_branch=$LATEST_BRANCH" >> $GITHUB_ENV
- name: Set latest branch output
id: set_latest_branch
run: echo "::set-output name=latest_branch::${{ env.latest_branch }}"
release:
needs: find-and-checkout-latest-branch
runs-on: ubuntu-latest
steps:
- name: Checkout Repo
uses: actions/checkout@v3
with:
ref: ${{ needs.find-and-checkout-latest-branch.outputs.latest_branch }}
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -U setuptools wheel twine
pip install -U torch --index-url https://download.pytorch.org/whl/cpu
pip install -U transformers
- name: Build the dist files
run: python setup.py bdist_wheel && python setup.py sdist
- name: Publish to the test PyPI
env:
TWINE_USERNAME: ${{ secrets.TEST_PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.TEST_PYPI_PASSWORD }}
run: twine upload dist/* -r pypitest --repository-url=https://test.pypi.org/legacy/
- name: Test installing diffusers and importing
run: |
pip install diffusers && pip uninstall diffusers -y
pip install -i https://testpypi.python.org/pypi diffusers
python -c "from diffusers import __version__; print(__version__)"
python -c "from diffusers import DiffusionPipeline; pipe = DiffusionPipeline.from_pretrained('fusing/unet-ldm-dummy-update'); pipe()"
python -c "from diffusers import DiffusionPipeline; pipe = DiffusionPipeline.from_pretrained('hf-internal-testing/tiny-stable-diffusion-pipe', safety_checker=None); pipe('ah suh du')"
python -c "from diffusers import *"
- name: Publish to PyPI
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: twine upload dist/* -r pypi

View File

@@ -0,0 +1,73 @@
name: Check running SLOW tests from a PR (only GPU)
on:
workflow_dispatch:
inputs:
docker_image:
default: 'diffusers/diffusers-pytorch-cuda'
description: 'Name of the Docker image'
required: true
branch:
description: 'PR Branch to test on'
required: true
test:
description: 'Tests to run (e.g.: `tests/models`).'
required: true
env:
DIFFUSERS_IS_CI: yes
IS_GITHUB_CI: "1"
HF_HOME: /mnt/cache
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
PYTEST_TIMEOUT: 600
RUN_SLOW: yes
jobs:
run_tests:
name: "Run a test on our runner from a PR"
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: ${{ github.event.inputs.docker_image }}
options: --gpus 0 --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Validate test files input
id: validate_test_files
env:
PY_TEST: ${{ github.event.inputs.test }}
run: |
if [[ ! "$PY_TEST" =~ ^tests/ ]]; then
echo "Error: The input string must start with 'tests/'."
exit 1
fi
if [[ ! "$PY_TEST" =~ ^tests/(models|pipelines) ]]; then
echo "Error: The input string must contain either 'models' or 'pipelines' after 'tests/'."
exit 1
fi
if [[ "$PY_TEST" == *";"* ]]; then
echo "Error: The input string must not contain ';'."
exit 1
fi
echo "$PY_TEST"
- name: Checkout PR branch
uses: actions/checkout@v4
with:
ref: ${{ github.event.inputs.branch }}
repository: ${{ github.event.pull_request.head.repo.full_name }}
- name: Install pytest
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install peft
- name: Run tests
env:
PY_TEST: ${{ github.event.inputs.test }}
run: |
pytest "$PY_TEST"

46
.github/workflows/ssh-runner.yml vendored Normal file
View File

@@ -0,0 +1,46 @@
name: SSH into runners
on:
workflow_dispatch:
inputs:
runner_type:
description: 'Type of runner to test (a10 or t4)'
required: true
docker_image:
description: 'Name of the Docker image'
required: true
env:
IS_GITHUB_CI: "1"
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
HF_HOME: /mnt/cache
DIFFUSERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
jobs:
ssh_runner:
name: "SSH"
runs-on: [single-gpu, nvidia-gpu, "${{ github.event.inputs.runner_type }}", ci]
container:
image: ${{ github.event.inputs.docker_image }}
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0 --privileged
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Tailscale # In order to be able to SSH when a test fails
uses: huggingface/tailscale-action@main
with:
authkey: ${{ secrets.TAILSCALE_SSH_AUTHKEY }}
slackChannel: ${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}
slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
waitForSSH: true

15
.github/workflows/trufflehog.yml vendored Normal file
View File

@@ -0,0 +1,15 @@
on:
push:
name: Secret Leaks
jobs:
trufflehog:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Secret Scanning
uses: trufflesecurity/trufflehog@main

30
.github/workflows/update_metadata.yml vendored Normal file
View File

@@ -0,0 +1,30 @@
name: Update Diffusers metadata
on:
workflow_dispatch:
push:
branches:
- main
- update_diffusers_metadata*
jobs:
update_metadata:
runs-on: ubuntu-22.04
defaults:
run:
shell: bash -l {0}
steps:
- uses: actions/checkout@v3
- name: Setup environment
run: |
pip install --upgrade pip
pip install datasets pandas
pip install .[torch]
- name: Update metadata
env:
HF_TOKEN: ${{ secrets.SAYAK_HF_TOKEN }}
run: |
python utils/update_metadata.py --commit_sha ${{ github.sha }}

View File

@@ -19,6 +19,16 @@ authors:
family-names: Rasul
- given-names: Mishig
family-names: Davaadorj
- given-names: Dhruv
family-names: Nair
- given-names: Sayak
family-names: Paul
- given-names: Steven
family-names: Liu
- given-names: William
family-names: Berman
- given-names: Yiyi
family-names: Xu
- given-names: Thomas
family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -355,7 +355,7 @@ You will need basic `git` proficiency to be able to contribute to
manual. Type `git --help` in a shell and enjoy. If you prefer books, [Pro
Git](https://git-scm.com/book/en/v2) is a very good reference.
Follow these steps to start contributing ([supported Python versions](https://github.com/huggingface/diffusers/blob/main/setup.py#L265)):
Follow these steps to start contributing ([supported Python versions](https://github.com/huggingface/diffusers/blob/42f25d601a910dceadaee6c44345896b4cfa9928/setup.py#L270)):
1. Fork the [repository](https://github.com/huggingface/diffusers) by
clicking on the 'Fork' button on the repository's page. This creates a copy of the code

View File

@@ -42,6 +42,7 @@ repo-consistency:
quality:
ruff check $(check_dirs) setup.py
ruff format --check $(check_dirs) setup.py
doc-builder style src/diffusers docs/source --max_len 119 --check_only
python utils/check_doc_toc.py
# Format source code automatically and check is there are any problems left that need manual fixing
@@ -55,6 +56,7 @@ extra_style_checks:
style:
ruff check $(check_dirs) setup.py --fix
ruff format $(check_dirs) setup.py
doc-builder style src/diffusers docs/source --max_len 119
${MAKE} autogenerate_code
${MAKE} extra_style_checks

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -77,7 +77,7 @@ Please refer to the [How to use Stable Diffusion in Apple Silicon](https://huggi
## Quickstart
Generating outputs is super easy with 🤗 Diffusers. To generate an image from text, use the `from_pretrained` method to load any pretrained diffusion model (browse the [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) for 19000+ checkpoints):
Generating outputs is super easy with 🤗 Diffusers. To generate an image from text, use the `from_pretrained` method to load any pretrained diffusion model (browse the [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) for 25.000+ checkpoints):
```python
from diffusers import DiffusionPipeline
@@ -219,7 +219,7 @@ Also, say 👋 in our public Discord channel <a href="https://discord.gg/G7tWnz9
- https://github.com/deep-floyd/IF
- https://github.com/bentoml/BentoML
- https://github.com/bmaltais/kohya_ss
- +8000 other amazing GitHub repositories 💪
- +11.000 other amazing GitHub repositories 💪
Thank you for using us ❤️.
@@ -238,7 +238,7 @@ We also want to thank @heejkoo for the very helpful overview of papers, code and
```bibtex
@misc{von-platen-etal-2022-diffusers,
author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Dhruv Nair and Sayak Paul and William Berman and Yiyi Xu and Steven Liu and Thomas Wolf},
title = {Diffusers: State-of-the-art diffusion models},
year = {2022},
publisher = {GitHub},

View File

@@ -141,6 +141,7 @@ class LCMLoRATextToImageBenchmark(TextToImageBenchmark):
super().__init__(args)
self.pipe.load_lora_weights(self.lora_id)
self.pipe.fuse_lora()
self.pipe.unload_lora_weights()
self.pipe.scheduler = LCMScheduler.from_config(self.pipe.scheduler.config)
def get_result_filepath(self, args):
@@ -235,6 +236,35 @@ class InpaintingBenchmark(ImageToImageBenchmark):
)
class IPAdapterTextToImageBenchmark(TextToImageBenchmark):
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png"
image = load_image(url)
def __init__(self, args):
pipe = self.pipeline_class.from_pretrained(args.ckpt, torch_dtype=torch.float16).to("cuda")
pipe.load_ip_adapter(
args.ip_adapter_id[0],
subfolder="models" if "sdxl" not in args.ip_adapter_id[1] else "sdxl_models",
weight_name=args.ip_adapter_id[1],
)
if args.run_compile:
pipe.unet.to(memory_format=torch.channels_last)
print("Run torch compile")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
pipe.set_progress_bar_config(disable=True)
self.pipe = pipe
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
ip_adapter_image=self.image,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
class ControlNetBenchmark(TextToImageBenchmark):
pipeline_class = StableDiffusionControlNetPipeline
aux_network_class = ControlNetModel

View File

@@ -0,0 +1,32 @@
import argparse
import sys
sys.path.append(".")
from base_classes import IPAdapterTextToImageBenchmark # noqa: E402
IP_ADAPTER_CKPTS = {
"runwayml/stable-diffusion-v1-5": ("h94/IP-Adapter", "ip-adapter_sd15.bin"),
"stabilityai/stable-diffusion-xl-base-1.0": ("h94/IP-Adapter", "ip-adapter_sdxl.bin"),
}
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="runwayml/stable-diffusion-v1-5",
choices=list(IP_ADAPTER_CKPTS.keys()),
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
args.ip_adapter_id = IP_ADAPTER_CKPTS[args.ckpt]
benchmark_pipe = IPAdapterTextToImageBenchmark(args)
args.ckpt = f"{args.ckpt} (IP-Adapter)"
benchmark_pipe.benchmark(args)

View File

@@ -72,7 +72,7 @@ def main():
command += " --run_compile"
run_command(command.split())
elif file == "benchmark_sd_inpainting.py":
elif file in ["benchmark_sd_inpainting.py", "benchmark_ip_adapters.py"]:
sdxl_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
command = f"python {file} --ckpt {sdxl_ckpt}"
run_command(command.split())

View File

@@ -0,0 +1,52 @@
FROM ubuntu:20.04
LABEL maintainer="Hugging Face"
LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.10 \
python3-pip \
libgl1 \
zip \
wget \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark \
--extra-index-url https://download.pytorch.org/whl/cpu && \
python3.10 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \
huggingface-hub \
Jinja2 \
librosa \
numpy \
scipy \
tensorboard \
transformers \
matplotlib \
setuptools==69.5.1
CMD ["/bin/bash"]

View File

@@ -4,32 +4,36 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
python3-pip \
python3.8-venv && \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
libgl1 \
python3.10 \
python3-pip \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
# follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --upgrade --no-cache-dir \
RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3 -m uv pip install --upgrade --no-cache-dir \
clu \
"jax[cpu]>=0.2.16,!=0.3.2" \
"flax>=0.4.1" \
"jaxlib>=0.1.65" && \
python3 -m pip install --no-cache-dir \
python3 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \

View File

@@ -4,34 +4,38 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
libgl1 \
python3.10 \
python3-pip \
python3.8-venv && \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
# follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3 -m pip install --no-cache-dir \
"jax[tpu]>=0.2.16,!=0.3.2" \
-f https://storage.googleapis.com/jax-releases/libtpu_releases.html && \
python3 -m pip install --upgrade --no-cache-dir \
python3 -m uv pip install --upgrade --no-cache-dir \
clu \
"flax>=0.4.1" \
"jaxlib>=0.1.65" && \
python3 -m pip install --no-cache-dir \
python3 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \

View File

@@ -4,32 +4,36 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
libgl1 \
python3.10 \
python3-pip \
python3.8-venv && \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3 -m uv pip install --no-cache-dir \
torch==2.1.2 \
torchvision==0.16.2 \
torchaudio==2.1.2 \
onnxruntime \
--extra-index-url https://download.pytorch.org/whl/cpu && \
python3 -m pip install --no-cache-dir \
python3 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \

View File

@@ -1,35 +1,39 @@
FROM nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:12.1.0-runtime-ubuntu20.04
LABEL maintainer="Hugging Face"
LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
libgl1 \
python3.10 \
python3-pip \
python3.8-venv && \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
torch==2.1.2 \
torchvision==0.16.2 \
torchaudio==2.1.2 \
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
"onnxruntime-gpu>=1.13.1" \
--extra-index-url https://download.pytorch.org/whl/cu117 && \
python3 -m pip install --no-cache-dir \
python3.10 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \

View File

@@ -4,8 +4,11 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
@@ -13,24 +16,23 @@ RUN apt update && \
ca-certificates \
libsndfile1-dev \
libgl1 \
python3.9 \
python3.9-dev \
python3.10 \
python3-pip \
python3.9-venv && \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3.9 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.9 -m pip install --no-cache-dir --upgrade pip && \
python3.9 -m pip install --no-cache-dir \
torch==2.1.2 \
torchvision==0.16.2 \
torchaudio==2.1.2 \
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark && \
python3.9 -m pip install --no-cache-dir \
python3.10 -m pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \

View File

@@ -4,33 +4,36 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
python3.10 \
python3-pip \
libgl1 \
python3.8-venv && \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
torch==2.1.2 \
torchvision==0.16.2 \
torchaudio==2.1.2 \
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark \
--extra-index-url https://download.pytorch.org/whl/cpu && \
python3 -m pip install --no-cache-dir \
python3.10 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \
@@ -40,6 +43,6 @@ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
numpy \
scipy \
tensorboard \
transformers
transformers matplotlib
CMD ["/bin/bash"]

View File

@@ -4,8 +4,11 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
@@ -13,23 +16,23 @@ RUN apt update && \
ca-certificates \
libsndfile1-dev \
libgl1 \
python3.8 \
python3.10 \
python3-pip \
python3.8-venv && \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
torch==2.1.2 \
torchvision==0.16.2 \
torchaudio==2.1.2 \
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark && \
python3 -m pip install --no-cache-dir \
python3.10 -m pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \

View File

@@ -4,8 +4,11 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
@@ -13,23 +16,23 @@ RUN apt update && \
ca-certificates \
libsndfile1-dev \
libgl1 \
python3.8 \
python3.10 \
python3-pip \
python3.8-venv && \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
torch==2.1.2 \
torchvision==0.16.2 \
torchaudio==2.1.2 \
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark && \
python3 -m pip install --no-cache-dir \
python3.10 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \

View File

@@ -1,5 +1,5 @@
<!---
Copyright 2023- The HuggingFace Team. All rights reserved.
Copyright 2024- The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -242,10 +242,10 @@ Here's an example of a tuple return, comprising several objects:
```
Returns:
`tuple(torch.FloatTensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs:
- ** loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.FloatTensor` of shape `(1,)` --
`tuple(torch.Tensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs:
- ** loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.Tensor` of shape `(1,)` --
Total loss is the sum of the masked language modeling loss and the next sequence prediction (classification) loss.
- **prediction_scores** (`torch.FloatTensor` of shape `(batch_size, sequence_length, config.vocab_size)`) --
- **prediction_scores** (`torch.Tensor` of shape `(batch_size, sequence_length, config.vocab_size)`) --
Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).
```

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -18,151 +18,145 @@
- local: tutorials/basic_training
title: Train a diffusion model
- local: tutorials/using_peft_for_inference
title: Inference with PEFT
title: Load LoRAs for inference
- local: tutorials/fast_diffusion
title: Accelerate inference of text-to-image diffusion models
title: Tutorials
- sections:
- sections:
- local: using-diffusers/loading_overview
title: Overview
- local: using-diffusers/loading
title: Load pipelines, models, and schedulers
- local: using-diffusers/schedulers
title: Load and compare different schedulers
- local: using-diffusers/custom_pipeline_overview
title: Load community pipelines and components
- local: using-diffusers/using_safetensors
title: Load safetensors
- local: using-diffusers/other-formats
title: Load different Stable Diffusion formats
- local: using-diffusers/loading_adapters
title: Load adapters
- local: using-diffusers/push_to_hub
title: Push files to the Hub
title: Loading & Hub
- sections:
- local: using-diffusers/pipeline_overview
title: Overview
- local: using-diffusers/unconditional_image_generation
title: Unconditional image generation
- local: using-diffusers/conditional_image_generation
title: Text-to-image
- local: using-diffusers/img2img
title: Image-to-image
- local: using-diffusers/inpaint
title: Inpainting
- local: using-diffusers/depth2img
title: Depth-to-image
title: Tasks
- sections:
- local: using-diffusers/textual_inversion_inference
title: Textual inversion
- local: training/distributed_inference
title: Distributed inference with multiple GPUs
- local: using-diffusers/reusing_seeds
title: Improve image quality with deterministic generation
- local: using-diffusers/control_brightness
title: Control image brightness
- local: using-diffusers/weighted_prompts
title: Prompt weighting
- local: using-diffusers/freeu
title: Improve generation quality with FreeU
title: Techniques
- sections:
- local: using-diffusers/pipeline_overview
title: Overview
- local: using-diffusers/sdxl
title: Stable Diffusion XL
- local: using-diffusers/sdxl_turbo
title: SDXL Turbo
- local: using-diffusers/kandinsky
title: Kandinsky
- local: using-diffusers/controlnet
title: ControlNet
- local: using-diffusers/shap-e
title: Shap-E
- local: using-diffusers/diffedit
title: DiffEdit
- local: using-diffusers/distilled_sd
title: Distilled Stable Diffusion inference
- local: using-diffusers/callback
title: Pipeline callbacks
- local: using-diffusers/reproducibility
title: Create reproducible pipelines
- local: using-diffusers/custom_pipeline_examples
title: Community pipelines
- local: using-diffusers/contribute_pipeline
title: Contribute a community pipeline
- local: using-diffusers/inference_with_lcm_lora
title: Latent Consistency Model-LoRA
- local: using-diffusers/inference_with_lcm
title: Latent Consistency Model
- local: using-diffusers/svd
title: Stable Video Diffusion
title: Specific pipeline examples
- sections:
- local: training/overview
title: Overview
- local: training/create_dataset
title: Create a dataset for training
- local: training/adapt_a_model
title: Adapt a model to a new task
- sections:
- local: training/unconditional_training
title: Unconditional image generation
- local: training/text2image
title: Text-to-image
- local: training/sdxl
title: Stable Diffusion XL
- local: training/kandinsky
title: Kandinsky 2.2
- local: training/wuerstchen
title: Wuerstchen
- local: training/controlnet
title: ControlNet
- local: training/t2i_adapters
title: T2I-Adapters
- local: training/instructpix2pix
title: InstructPix2Pix
title: Models
- sections:
- local: training/text_inversion
title: Textual Inversion
- local: training/dreambooth
title: DreamBooth
- local: training/lora
title: LoRA
- local: training/custom_diffusion
title: Custom Diffusion
- local: training/lcm_distill
title: Latent Consistency Distillation
- local: training/ddpo
title: Reinforcement learning training with DDPO
title: Methods
title: Training
- sections:
- local: using-diffusers/other-modalities
title: Other Modalities
title: Taking Diffusers Beyond Images
title: Using Diffusers
- local: using-diffusers/loading
title: Load pipelines
- local: using-diffusers/custom_pipeline_overview
title: Load community pipelines and components
- local: using-diffusers/schedulers
title: Load schedulers and models
- local: using-diffusers/other-formats
title: Model files and layouts
- local: using-diffusers/loading_adapters
title: Load adapters
- local: using-diffusers/push_to_hub
title: Push files to the Hub
title: Load pipelines and adapters
- sections:
- local: optimization/opt_overview
- local: using-diffusers/unconditional_image_generation
title: Unconditional image generation
- local: using-diffusers/conditional_image_generation
title: Text-to-image
- local: using-diffusers/img2img
title: Image-to-image
- local: using-diffusers/inpaint
title: Inpainting
- local: using-diffusers/text-img2vid
title: Text or image-to-video
- local: using-diffusers/depth2img
title: Depth-to-image
title: Generative tasks
- sections:
- local: using-diffusers/overview_techniques
title: Overview
- sections:
- local: optimization/fp16
title: Speed up inference
- local: optimization/memory
title: Reduce memory usage
- local: optimization/torch2.0
title: PyTorch 2.0
- local: optimization/xformers
title: xFormers
- local: optimization/tome
title: Token merging
- local: optimization/deepcache
title: DeepCache
title: General optimizations
- local: training/distributed_inference
title: Distributed inference with multiple GPUs
- local: using-diffusers/merge_loras
title: Merge LoRAs
- local: using-diffusers/scheduler_features
title: Scheduler features
- local: using-diffusers/callback
title: Pipeline callbacks
- local: using-diffusers/reusing_seeds
title: Reproducible pipelines
- local: using-diffusers/image_quality
title: Controlling image quality
- local: using-diffusers/weighted_prompts
title: Prompt techniques
title: Inference techniques
- sections:
- local: advanced_inference/outpaint
title: Outpainting
title: Advanced inference
- sections:
- local: using-diffusers/sdxl
title: Stable Diffusion XL
- local: using-diffusers/sdxl_turbo
title: SDXL Turbo
- local: using-diffusers/kandinsky
title: Kandinsky
- local: using-diffusers/ip_adapter
title: IP-Adapter
- local: using-diffusers/controlnet
title: ControlNet
- local: using-diffusers/t2i_adapter
title: T2I-Adapter
- local: using-diffusers/inference_with_lcm
title: Latent Consistency Model
- local: using-diffusers/textual_inversion_inference
title: Textual inversion
- local: using-diffusers/shap-e
title: Shap-E
- local: using-diffusers/diffedit
title: DiffEdit
- local: using-diffusers/inference_with_tcd_lora
title: Trajectory Consistency Distillation-LoRA
- local: using-diffusers/svd
title: Stable Video Diffusion
- local: using-diffusers/marigold_usage
title: Marigold Computer Vision
title: Specific pipeline examples
- sections:
- local: training/overview
title: Overview
- local: training/create_dataset
title: Create a dataset for training
- local: training/adapt_a_model
title: Adapt a model to a new task
- isExpanded: false
sections:
- local: training/unconditional_training
title: Unconditional image generation
- local: training/text2image
title: Text-to-image
- local: training/sdxl
title: Stable Diffusion XL
- local: training/kandinsky
title: Kandinsky 2.2
- local: training/wuerstchen
title: Wuerstchen
- local: training/controlnet
title: ControlNet
- local: training/t2i_adapters
title: T2I-Adapters
- local: training/instructpix2pix
title: InstructPix2Pix
title: Models
- isExpanded: false
sections:
- local: training/text_inversion
title: Textual Inversion
- local: training/dreambooth
title: DreamBooth
- local: training/lora
title: LoRA
- local: training/custom_diffusion
title: Custom Diffusion
- local: training/lcm_distill
title: Latent Consistency Distillation
- local: training/ddpo
title: Reinforcement learning training with DDPO
title: Methods
title: Training
- sections:
- local: optimization/fp16
title: Speed up inference
- local: optimization/memory
title: Reduce memory usage
- local: optimization/torch2.0
title: PyTorch 2.0
- local: optimization/xformers
title: xFormers
- local: optimization/tome
title: Token merging
- local: optimization/deepcache
title: DeepCache
- local: optimization/tgate
title: TGATE
- sections:
- local: using-diffusers/stable_diffusion_jax_how_to
title: JAX/Flax
@@ -172,14 +166,14 @@
title: OpenVINO
- local: optimization/coreml
title: Core ML
title: Optimized model types
title: Optimized model formats
- sections:
- local: optimization/mps
title: Metal Performance Shaders (MPS)
- local: optimization/habana
title: Habana Gaudi
title: Optimized hardware
title: Optimization
title: Accelerate inference and reduce memory
- sections:
- local: conceptual/philosophy
title: Philosophy
@@ -193,7 +187,8 @@
title: Evaluating Diffusion Models
title: Conceptual Guides
- sections:
- sections:
- isExpanded: false
sections:
- local: api/configuration
title: Configuration
- local: api/logging
@@ -201,7 +196,8 @@
- local: api/outputs
title: Outputs
title: Main Classes
- sections:
- isExpanded: false
sections:
- local: api/loaders/ip_adapter
title: IP-Adapter
- local: api/loaders/lora
@@ -215,7 +211,8 @@
- local: api/loaders/peft
title: PEFT
title: Loaders
- sections:
- isExpanded: false
sections:
- local: api/models/overview
title: Overview
- local: api/models/unet
@@ -241,15 +238,26 @@
- local: api/models/consistency_decoder_vae
title: ConsistencyDecoderVAE
- local: api/models/transformer2d
title: Transformer2D
title: Transformer2DModel
- local: api/models/pixart_transformer2d
title: PixArtTransformer2DModel
- local: api/models/dit_transformer2d
title: DiTTransformer2DModel
- local: api/models/hunyuan_transformer2d
title: HunyuanDiT2DModel
- local: api/models/transformer_temporal
title: Transformer Temporal
title: TransformerTemporalModel
- local: api/models/sd3_transformer2d
title: SD3Transformer2DModel
- local: api/models/prior_transformer
title: Prior Transformer
title: PriorTransformer
- local: api/models/controlnet
title: ControlNet
title: ControlNetModel
- local: api/models/controlnet_sd3
title: SD3ControlNetModel
title: Models
- sections:
- isExpanded: false
sections:
- local: api/pipelines/overview
title: Overview
- local: api/pipelines/amused
@@ -270,8 +278,14 @@
title: Consistency Models
- local: api/pipelines/controlnet
title: ControlNet
- local: api/pipelines/controlnet_sd3
title: ControlNet with Stable Diffusion 3
- local: api/pipelines/controlnet_sdxl
title: ControlNet with Stable Diffusion XL
- local: api/pipelines/controlnetxs
title: ControlNet-XS
- local: api/pipelines/controlnetxs_sdxl
title: ControlNet-XS with Stable Diffusion XL
- local: api/pipelines/dance_diffusion
title: Dance Diffusion
- local: api/pipelines/ddim
@@ -284,6 +298,8 @@
title: DiffEdit
- local: api/pipelines/dit
title: DiT
- local: api/pipelines/hunyuandit
title: Hunyuan-DiT
- local: api/pipelines/i2vgenxl
title: I2VGen-XL
- local: api/pipelines/pix2pix
@@ -298,6 +314,10 @@
title: Latent Consistency Models
- local: api/pipelines/latent_diffusion
title: Latent Diffusion
- local: api/pipelines/ledits_pp
title: LEDITS++
- local: api/pipelines/marigold
title: Marigold
- local: api/pipelines/panorama
title: MultiDiffusion
- local: api/pipelines/musicldm
@@ -308,12 +328,16 @@
title: Personalized Image Animator (PIA)
- local: api/pipelines/pixart
title: PixArt-α
- local: api/pipelines/pixart_sigma
title: PixArt-Σ
- local: api/pipelines/self_attention_guidance
title: Self-Attention Guidance
- local: api/pipelines/semantic_stable_diffusion
title: Semantic Guidance
- local: api/pipelines/shap_e
title: Shap-E
- local: api/pipelines/stable_cascade
title: Stable Cascade
- sections:
- local: api/pipelines/stable_diffusion/overview
title: Overview
@@ -321,6 +345,8 @@
title: Text-to-image
- local: api/pipelines/stable_diffusion/img2img
title: Image-to-image
- local: api/pipelines/stable_diffusion/svd
title: Image-to-video
- local: api/pipelines/stable_diffusion/inpaint
title: Inpainting
- local: api/pipelines/stable_diffusion/depth2img
@@ -331,6 +357,8 @@
title: Safe Stable Diffusion
- local: api/pipelines/stable_diffusion/stable_diffusion_2
title: Stable Diffusion 2
- local: api/pipelines/stable_diffusion/stable_diffusion_3
title: Stable Diffusion 3
- local: api/pipelines/stable_diffusion/stable_diffusion_xl
title: Stable Diffusion XL
- local: api/pipelines/stable_diffusion/sdxl_turbo
@@ -344,7 +372,7 @@
- local: api/pipelines/stable_diffusion/ldm3d_diffusion
title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler
- local: api/pipelines/stable_diffusion/adapter
title: Stable Diffusion T2I-Adapter
title: T2I-Adapter
- local: api/pipelines/stable_diffusion/gligen
title: GLIGEN (Grounded Language-to-Image Generation)
title: Stable Diffusion
@@ -363,7 +391,8 @@
- local: api/pipelines/wuerstchen
title: Wuerstchen
title: Pipelines
- sections:
- isExpanded: false
sections:
- local: api/schedulers/overview
title: Overview
- local: api/schedulers/cm_stochastic_iterative
@@ -386,10 +415,16 @@
title: DPMSolverSDEScheduler
- local: api/schedulers/singlestep_dpm_solver
title: DPMSolverSinglestepScheduler
- local: api/schedulers/edm_multistep_dpm_solver
title: EDMDPMSolverMultistepScheduler
- local: api/schedulers/edm_euler
title: EDMEulerScheduler
- local: api/schedulers/euler_ancestral
title: EulerAncestralDiscreteScheduler
- local: api/schedulers/euler
title: EulerDiscreteScheduler
- local: api/schedulers/flow_match_euler_discrete
title: FlowMatchEulerDiscreteScheduler
- local: api/schedulers/heun
title: HeunDiscreteScheduler
- local: api/schedulers/ipndm
@@ -412,12 +447,15 @@
title: ScoreSdeVeScheduler
- local: api/schedulers/score_sde_vp
title: ScoreSdeVpScheduler
- local: api/schedulers/tcd
title: TCDScheduler
- local: api/schedulers/unipc
title: UniPCMultistepScheduler
- local: api/schedulers/vq_diffusion
title: VQDiffusionScheduler
title: Schedulers
- sections:
- isExpanded: false
sections:
- local: api/internal_classes_overview
title: Overview
- local: api/attnprocessor
@@ -430,5 +468,7 @@
title: Utilities
- local: api/image_processor
title: VAE Image Processor
- local: api/video_processor
title: Video Processor
title: Internal classes
title: API

View File

@@ -0,0 +1,231 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Outpainting
Outpainting extends an image beyond its original boundaries, allowing you to add, replace, or modify visual elements in an image while preserving the original image. Like [inpainting](../using-diffusers/inpaint), you want to fill the white area (in this case, the area outside of the original image) with new visual elements while keeping the original image (represented by a mask of black pixels). There are a couple of ways to outpaint, such as with a [ControlNet](https://hf.co/blog/OzzyGT/outpainting-controlnet) or with [Differential Diffusion](https://hf.co/blog/OzzyGT/outpainting-differential-diffusion).
This guide will show you how to outpaint with an inpainting model, ControlNet, and a ZoeDepth estimator.
Before you begin, make sure you have the [controlnet_aux](https://github.com/huggingface/controlnet_aux) library installed so you can use the ZoeDepth estimator.
```py
!pip install -q controlnet_aux
```
## Image preparation
Start by picking an image to outpaint with and remove the background with a Space like [BRIA-RMBG-1.4](https://hf.co/spaces/briaai/BRIA-RMBG-1.4).
<iframe
src="https://briaai-bria-rmbg-1-4.hf.space"
frameborder="0"
width="850"
height="450"
></iframe>
For example, remove the background from this image of a pair of shoes.
<div class="flex flex-row gap-4">
<div class="flex-1">
<img class="rounded-xl" src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/original-jordan.png"/>
<figcaption class="mt-2 text-center text-sm text-gray-500">original image</figcaption>
</div>
<div class="flex-1">
<img class="rounded-xl" src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/no-background-jordan.png"/>
<figcaption class="mt-2 text-center text-sm text-gray-500">background removed</figcaption>
</div>
</div>
[Stable Diffusion XL (SDXL)](../using-diffusers/sdxl) models work best with 1024x1024 images, but you can resize the image to any size as long as your hardware has enough memory to support it. The transparent background in the image should also be replaced with a white background. Create a function (like the one below) that scales and pastes the image onto a white background.
```py
import random
import requests
import torch
from controlnet_aux import ZoeDetector
from PIL import Image, ImageOps
from diffusers import (
AutoencoderKL,
ControlNetModel,
StableDiffusionXLControlNetPipeline,
StableDiffusionXLInpaintPipeline,
)
def scale_and_paste(original_image):
aspect_ratio = original_image.width / original_image.height
if original_image.width > original_image.height:
new_width = 1024
new_height = round(new_width / aspect_ratio)
else:
new_height = 1024
new_width = round(new_height * aspect_ratio)
resized_original = original_image.resize((new_width, new_height), Image.LANCZOS)
white_background = Image.new("RGBA", (1024, 1024), "white")
x = (1024 - new_width) // 2
y = (1024 - new_height) // 2
white_background.paste(resized_original, (x, y), resized_original)
return resized_original, white_background
original_image = Image.open(
requests.get(
"https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/no-background-jordan.png",
stream=True,
).raw
).convert("RGBA")
resized_img, white_bg_image = scale_and_paste(original_image)
```
To avoid adding unwanted extra details, use the ZoeDepth estimator to provide additional guidance during generation and to ensure the shoes remain consistent with the original image.
```py
zoe = ZoeDetector.from_pretrained("lllyasviel/Annotators")
image_zoe = zoe(white_bg_image, detect_resolution=512, image_resolution=1024)
image_zoe
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/zoedepth-jordan.png"/>
</div>
## Outpaint
Once your image is ready, you can generate content in the white area around the shoes with [controlnet-inpaint-dreamer-sdxl](https://hf.co/destitech/controlnet-inpaint-dreamer-sdxl), a SDXL ControlNet trained for inpainting.
Load the inpainting ControlNet, ZoeDepth model, VAE and pass them to the [`StableDiffusionXLControlNetPipeline`]. Then you can create an optional `generate_image` function (for convenience) to outpaint an initial image.
```py
controlnets = [
ControlNetModel.from_pretrained(
"destitech/controlnet-inpaint-dreamer-sdxl", torch_dtype=torch.float16, variant="fp16"
),
ControlNetModel.from_pretrained(
"diffusers/controlnet-zoe-depth-sdxl-1.0", torch_dtype=torch.float16
),
]
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16).to("cuda")
pipeline = StableDiffusionXLControlNetPipeline.from_pretrained(
"SG161222/RealVisXL_V4.0", torch_dtype=torch.float16, variant="fp16", controlnet=controlnets, vae=vae
).to("cuda")
def generate_image(prompt, negative_prompt, inpaint_image, zoe_image, seed: int = None):
if seed is None:
seed = random.randint(0, 2**32 - 1)
generator = torch.Generator(device="cpu").manual_seed(seed)
image = pipeline(
prompt,
negative_prompt=negative_prompt,
image=[inpaint_image, zoe_image],
guidance_scale=6.5,
num_inference_steps=25,
generator=generator,
controlnet_conditioning_scale=[0.5, 0.8],
control_guidance_end=[0.9, 0.6],
).images[0]
return image
prompt = "nike air jordans on a basketball court"
negative_prompt = ""
temp_image = generate_image(prompt, negative_prompt, white_bg_image, image_zoe, 908097)
```
Paste the original image over the initial outpainted image. You'll improve the outpainted background in a later step.
```py
x = (1024 - resized_img.width) // 2
y = (1024 - resized_img.height) // 2
temp_image.paste(resized_img, (x, y), resized_img)
temp_image
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/initial-outpaint.png"/>
</div>
> [!TIP]
> Now is a good time to free up some memory if you're running low!
>
> ```py
> pipeline=None
> torch.cuda.empty_cache()
> ```
Now that you have an initial outpainted image, load the [`StableDiffusionXLInpaintPipeline`] with the [RealVisXL](https://hf.co/SG161222/RealVisXL_V4.0) model to generate the final outpainted image with better quality.
```py
pipeline = StableDiffusionXLInpaintPipeline.from_pretrained(
"OzzyGT/RealVisXL_V4.0_inpainting",
torch_dtype=torch.float16,
variant="fp16",
vae=vae,
).to("cuda")
```
Prepare a mask for the final outpainted image. To create a more natural transition between the original image and the outpainted background, blur the mask to help it blend better.
```py
mask = Image.new("L", temp_image.size)
mask.paste(resized_img.split()[3], (x, y))
mask = ImageOps.invert(mask)
final_mask = mask.point(lambda p: p > 128 and 255)
mask_blurred = pipeline.mask_processor.blur(final_mask, blur_factor=20)
mask_blurred
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/blurred-mask.png"/>
</div>
Create a better prompt and pass it to the `generate_outpaint` function to generate the final outpainted image. Again, paste the original image over the final outpainted background.
```py
def generate_outpaint(prompt, negative_prompt, image, mask, seed: int = None):
if seed is None:
seed = random.randint(0, 2**32 - 1)
generator = torch.Generator(device="cpu").manual_seed(seed)
image = pipeline(
prompt,
negative_prompt=negative_prompt,
image=image,
mask_image=mask,
guidance_scale=10.0,
strength=0.8,
num_inference_steps=30,
generator=generator,
).images[0]
return image
prompt = "high quality photo of nike air jordans on a basketball court, highly detailed"
negative_prompt = ""
final_image = generate_outpaint(prompt, negative_prompt, temp_image, mask_blurred, 7688778)
x = (1024 - resized_img.width) // 2
y = (1024 - resized_img.height) // 2
final_image.paste(resized_img, (x, y), resized_img)
final_image
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/final-outpaint.png"/>
</div>

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -20,14 +20,14 @@ An attention processor is a class for applying different types of attention mech
## AttnProcessor2_0
[[autodoc]] models.attention_processor.AttnProcessor2_0
## FusedAttnProcessor2_0
[[autodoc]] models.attention_processor.FusedAttnProcessor2_0
## AttnAddedKVProcessor
[[autodoc]] models.attention_processor.AttnAddedKVProcessor
## LoRAAttnProcessor
[[autodoc]] models.attention_processor.LoRAAttnProcessor
## AttnAddedKVProcessor2_0
[[autodoc]] models.attention_processor.AttnAddedKVProcessor2_0
## LoRAAttnProcessor2_0
[[autodoc]] models.attention_processor.LoRAAttnProcessor2_0
## CrossFrameAttnProcessor
[[autodoc]] pipelines.text_to_video_synthesis.pipeline_text_to_video_zero.CrossFrameAttnProcessor
## CustomDiffusionAttnProcessor
[[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor
@@ -35,26 +35,26 @@ An attention processor is a class for applying different types of attention mech
## CustomDiffusionAttnProcessor2_0
[[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor2_0
## AttnAddedKVProcessor
[[autodoc]] models.attention_processor.AttnAddedKVProcessor
## CustomDiffusionXFormersAttnProcessor
[[autodoc]] models.attention_processor.CustomDiffusionXFormersAttnProcessor
## AttnAddedKVProcessor2_0
[[autodoc]] models.attention_processor.AttnAddedKVProcessor2_0
## FusedAttnProcessor2_0
[[autodoc]] models.attention_processor.FusedAttnProcessor2_0
## LoRAAttnAddedKVProcessor
[[autodoc]] models.attention_processor.LoRAAttnAddedKVProcessor
## XFormersAttnProcessor
[[autodoc]] models.attention_processor.XFormersAttnProcessor
## LoRAXFormersAttnProcessor
[[autodoc]] models.attention_processor.LoRAXFormersAttnProcessor
## CustomDiffusionXFormersAttnProcessor
[[autodoc]] models.attention_processor.CustomDiffusionXFormersAttnProcessor
## SlicedAttnProcessor
[[autodoc]] models.attention_processor.SlicedAttnProcessor
## SlicedAttnAddedKVProcessor
[[autodoc]] models.attention_processor.SlicedAttnAddedKVProcessor
## XFormersAttnProcessor
[[autodoc]] models.attention_processor.XFormersAttnProcessor
## AttnProcessorNPU
[[autodoc]] models.attention_processor.AttnProcessorNPU

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -25,3 +25,11 @@ All pipelines with [`VaeImageProcessor`] accept PIL Image, PyTorch tensor, or Nu
The [`VaeImageProcessorLDM3D`] accepts RGB and depth inputs and returns RGB and depth outputs.
[[autodoc]] image_processor.VaeImageProcessorLDM3D
## PixArtImageProcessor
[[autodoc]] image_processor.PixArtImageProcessor
## IPAdapterMaskProcessor
[[autodoc]] image_processor.IPAdapterMaskProcessor

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -12,14 +12,18 @@ specific language governing permissions and limitations under the License.
# IP-Adapter
[IP-Adapter](https://hf.co/papers/2308.06721) is a lightweight adapter that enables prompting a diffusion model with an image. This method decouples the cross-attention layers of the image and text features. The image features are generated from an image encoder. Files generated from IP-Adapter are only ~100MBs.
[IP-Adapter](https://hf.co/papers/2308.06721) is a lightweight adapter that enables prompting a diffusion model with an image. This method decouples the cross-attention layers of the image and text features. The image features are generated from an image encoder.
<Tip>
Learn how to load an IP-Adapter checkpoint and image in the [IP-Adapter](../../using-diffusers/loading_adapters#ip-adapter) loading guide.
Learn how to load an IP-Adapter checkpoint and image in the IP-Adapter [loading](../../using-diffusers/loading_adapters#ip-adapter) guide, and you can see how to use it in the [usage](../../using-diffusers/ip_adapter) guide.
</Tip>
## IPAdapterMixin
[[autodoc]] loaders.ip_adapter.IPAdapterMixin
## IPAdapterMaskProcessor
[[autodoc]] image_processor.IPAdapterMaskProcessor

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -12,26 +12,50 @@ specific language governing permissions and limitations under the License.
# Single files
Diffusers supports loading pretrained pipeline (or model) weights stored in a single file, such as a `ckpt` or `safetensors` file. These single file types are typically produced from community trained models. There are three classes for loading single file weights:
The [`~loaders.FromSingleFileMixin.from_single_file`] method allows you to load:
- [`FromSingleFileMixin`] supports loading pretrained pipeline weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
- [`FromOriginalVAEMixin`] supports loading a pretrained [`AutoencoderKL`] from pretrained ControlNet weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
- [`FromOriginalControlnetMixin`] supports loading pretrained ControlNet weights stored in a single file, which can either be a `ckpt` or `safetensors` file.
* a model stored in a single file, which is useful if you're working with models from the diffusion ecosystem, like Automatic1111, and commonly rely on a single-file layout to store and share models
* a model stored in their originally distributed layout, which is useful if you're working with models finetuned with other services, and want to load it directly into Diffusers model objects and pipelines
<Tip>
> [!TIP]
> Read the [Model files and layouts](../../using-diffusers/other-formats) guide to learn more about the Diffusers-multifolder layout versus the single-file layout, and how to load models stored in these different layouts.
To learn more about how to load single file weights, see the [Load different Stable Diffusion formats](../../using-diffusers/other-formats) loading guide.
## Supported pipelines
</Tip>
- [`StableDiffusionPipeline`]
- [`StableDiffusionImg2ImgPipeline`]
- [`StableDiffusionInpaintPipeline`]
- [`StableDiffusionControlNetPipeline`]
- [`StableDiffusionControlNetImg2ImgPipeline`]
- [`StableDiffusionControlNetInpaintPipeline`]
- [`StableDiffusionUpscalePipeline`]
- [`StableDiffusionXLPipeline`]
- [`StableDiffusionXLImg2ImgPipeline`]
- [`StableDiffusionXLInpaintPipeline`]
- [`StableDiffusionXLInstructPix2PixPipeline`]
- [`StableDiffusionXLControlNetPipeline`]
- [`StableDiffusionXLKDiffusionPipeline`]
- [`StableDiffusion3Pipeline`]
- [`LatentConsistencyModelPipeline`]
- [`LatentConsistencyModelImg2ImgPipeline`]
- [`StableDiffusionControlNetXSPipeline`]
- [`StableDiffusionXLControlNetXSPipeline`]
- [`LEditsPPPipelineStableDiffusion`]
- [`LEditsPPPipelineStableDiffusionXL`]
- [`PIAPipeline`]
## Supported models
- [`UNet2DConditionModel`]
- [`StableCascadeUNet`]
- [`AutoencoderKL`]
- [`ControlNetModel`]
- [`SD3Transformer2DModel`]
## FromSingleFileMixin
[[autodoc]] loaders.single_file.FromSingleFileMixin
## FromOriginalVAEMixin
## FromOriginalModelMixin
[[autodoc]] loaders.autoencoder.FromOriginalVAEMixin
## FromOriginalControlnetMixin
[[autodoc]] loaders.controlnet.FromOriginalControlNetMixin
[[autodoc]] loaders.single_file_model.FromOriginalModelMixin

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,6 +1,18 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Consistency Decoder
Consistency decoder can be used to decode the latents from the denoising UNet in the [`StableDiffusionPipeline`]. This decoder was introduced in the [DALL-E 3 technical report](https://openai.com/dall-e-3).
Consistency decoder can be used to decode the latents from the denoising UNet in the [`StableDiffusionPipeline`]. This decoder was introduced in the [DALL-E 3 technical report](https://openai.com/dall-e-3).
The original codebase can be found at [openai/consistencydecoder](https://github.com/openai/consistencydecoder).

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -10,7 +10,7 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
# ControlNet
# ControlNetModel
The ControlNet model was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang, Anyi Rao, Maneesh Agrawala. It provides a greater degree of control over text-to-image generation by conditioning the model on additional inputs such as edge maps, depth maps, segmentation maps, and keypoints for pose detection.

View File

@@ -0,0 +1,42 @@
<!--Copyright 2024 The HuggingFace Team and The InstantX Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# SD3ControlNetModel
SD3ControlNetModel is an implementation of ControlNet for Stable Diffusion 3.
The ControlNet model was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang, Anyi Rao, Maneesh Agrawala. It provides a greater degree of control over text-to-image generation by conditioning the model on additional inputs such as edge maps, depth maps, segmentation maps, and keypoints for pose detection.
The abstract from the paper is:
*We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers pretrained with billions of images as a strong backbone to learn a diverse set of conditional controls. The neural architecture is connected with "zero convolutions" (zero-initialized convolution layers) that progressively grow the parameters from zero and ensure that no harmful noise could affect the finetuning. We test various conditioning controls, eg, edges, depth, segmentation, human pose, etc, with Stable Diffusion, using single or multiple conditions, with or without prompts. We show that the training of ControlNets is robust with small (<50k) and large (>1m) datasets. Extensive results show that ControlNet may facilitate wider applications to control image diffusion models.*
## Loading from the original format
By default the [`SD3ControlNetModel`] should be loaded with [`~ModelMixin.from_pretrained`].
```py
from diffusers import StableDiffusion3ControlNetPipeline
from diffusers.models import SD3ControlNetModel, SD3MultiControlNetModel
controlnet = SD3ControlNetModel.from_pretrained("InstantX/SD3-Controlnet-Canny")
pipe = StableDiffusion3ControlNetPipeline.from_pretrained("stabilityai/stable-diffusion-3-medium-diffusers", controlnet=controlnet)
```
## SD3ControlNetModel
[[autodoc]] SD3ControlNetModel
## SD3ControlNetOutput
[[autodoc]] models.controlnet_sd3.SD3ControlNetOutput

View File

@@ -0,0 +1,19 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# DiTTransformer2DModel
A Transformer model for image-like data from [DiT](https://huggingface.co/papers/2212.09748).
## DiTTransformer2DModel
[[autodoc]] DiTTransformer2DModel

View File

@@ -0,0 +1,20 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# HunyuanDiT2DModel
A Diffusion Transformer model for 2D data from [Hunyuan-DiT](https://github.com/Tencent/HunyuanDiT).
## HunyuanDiT2DModel
[[autodoc]] HunyuanDiT2DModel

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,19 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# PixArtTransformer2DModel
A Transformer model for image-like data from [PixArt-Alpha](https://huggingface.co/papers/2310.00426) and [PixArt-Sigma](https://huggingface.co/papers/2403.04692).
## PixArtTransformer2DModel
[[autodoc]] PixArtTransformer2DModel

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -10,7 +10,7 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
# Prior Transformer
# PriorTransformer
The Prior Transformer was originally introduced in [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://huggingface.co/papers/2204.06125) by Ramesh et al. It is used to predict CLIP image embeddings from CLIP text embeddings; image embeddings are predicted through a denoising diffusion process.

View File

@@ -0,0 +1,19 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# SD3 Transformer Model
The Transformer model introduced in [Stable Diffusion 3](https://hf.co/papers/2403.03206). Its novelty lies in the MMDiT transformer block.
## SD3Transformer2DModel
[[autodoc]] SD3Transformer2DModel

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -10,7 +10,7 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
# Transformer2D
# Transformer2DModel
A Transformer model for image-like data from [CompVis](https://huggingface.co/CompVis) that is based on the [Vision Transformer](https://huggingface.co/papers/2010.11929) introduced by Dosovitskiy et al. The [`Transformer2DModel`] accepts discrete (classes of vector embeddings) or continuous (actual embeddings) inputs.
@@ -38,4 +38,4 @@ It is assumed one of the input classes is the masked latent pixel. The predicted
## Transformer2DModelOutput
[[autodoc]] models.transformers.transformer_2d.Transformer2DModelOutput
[[autodoc]] models.modeling_outputs.Transformer2DModelOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -10,7 +10,7 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
# Transformer Temporal
# TransformerTemporalModel
A Transformer model for video-like data.

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -24,4 +24,4 @@ The abstract from the paper is:
## VQEncoderOutput
[[autodoc]] models.vq_model.VQEncoderOutput
[[autodoc]] models.autoencoders.vq_model.VQEncoderOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -16,7 +16,7 @@ aMUSEd was introduced in [aMUSEd: An Open MUSE Reproduction](https://huggingface
Amused is a lightweight text to image model based off of the [MUSE](https://arxiv.org/abs/2301.00704) architecture. Amused is particularly useful in applications that require a lightweight and fast model such as generating many images quickly at once.
Amused is a vqvae token based transformer that can generate an image in fewer forward passes than many diffusion models. In contrast with muse, it uses the smaller text encoder CLIP-L/14 instead of t5-xxl. Due to its small parameter count and few forward pass generation process, amused can generate many images quickly. This benefit is seen particularly at larger batch sizes.
Amused is a vqvae token based transformer that can generate an image in fewer forward passes than many diffusion models. In contrast with muse, it uses the smaller text encoder CLIP-L/14 instead of t5-xxl. Due to its small parameter count and few forward pass generation process, amused can generate many images quickly. This benefit is seen particularly at larger batch sizes.
The abstract from the paper is:

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -101,6 +101,53 @@ AnimateDiff tends to work better with finetuned Stable Diffusion models. If you
</Tip>
### AnimateDiffSDXLPipeline
AnimateDiff can also be used with SDXL models. This is currently an experimental feature as only a beta release of the motion adapter checkpoint is available.
```python
import torch
from diffusers.models import MotionAdapter
from diffusers import AnimateDiffSDXLPipeline, DDIMScheduler
from diffusers.utils import export_to_gif
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-sdxl-beta", torch_dtype=torch.float16)
model_id = "stabilityai/stable-diffusion-xl-base-1.0"
scheduler = DDIMScheduler.from_pretrained(
model_id,
subfolder="scheduler",
clip_sample=False,
timestep_spacing="linspace",
beta_schedule="linear",
steps_offset=1,
)
pipe = AnimateDiffSDXLPipeline.from_pretrained(
model_id,
motion_adapter=adapter,
scheduler=scheduler,
torch_dtype=torch.float16,
variant="fp16",
).to("cuda")
# enable memory savings
pipe.enable_vae_slicing()
pipe.enable_vae_tiling()
output = pipe(
prompt="a panda surfing in the ocean, realistic, high quality",
negative_prompt="low quality, worst quality",
num_inference_steps=20,
guidance_scale=8,
width=1024,
height=1024,
num_frames=16,
)
frames = output.frames[0]
export_to_gif(frames, "animation.gif")
```
### AnimateDiffVideoToVideoPipeline
AnimateDiff can also be used to generate visually similar videos or enable style/character/background or other edits starting from an initial video, allowing you to seamlessly explore creative possibilities.
@@ -118,7 +165,7 @@ from PIL import Image
adapter = MotionAdapter.from_pretrained("guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16)
# load SD 1.5 based finetuned model
model_id = "SG161222/Realistic_Vision_V5.1_noVAE"
pipe = AnimateDiffVideoToVideoPipeline.from_pretrained(model_id, motion_adapter=adapter, torch_dtype=torch.float16).to("cuda")
pipe = AnimateDiffVideoToVideoPipeline.from_pretrained(model_id, motion_adapter=adapter, torch_dtype=torch.float16)
scheduler = DDIMScheduler.from_pretrained(
model_id,
subfolder="scheduler",
@@ -408,12 +455,126 @@ Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers)
</Tip>
<table>
<tr>
<th align=center>Without FreeInit enabled</th>
<th align=center>With FreeInit enabled</th>
</tr>
<tr>
<td align=center>
panda playing a guitar
<br />
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-no-freeinit.gif"
alt="panda playing a guitar"
style="width: 300px;" />
</td>
<td align=center>
panda playing a guitar
<br/>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatediff-freeinit.gif"
alt="panda playing a guitar"
style="width: 300px;" />
</td>
</tr>
</table>
## Using AnimateLCM
[AnimateLCM](https://animatelcm.github.io/) is a motion module checkpoint and an [LCM LoRA](https://huggingface.co/docs/diffusers/using-diffusers/inference_with_lcm_lora) that have been created using a consistency learning strategy that decouples the distillation of the image generation priors and the motion generation priors.
```python
import torch
from diffusers import AnimateDiffPipeline, LCMScheduler, MotionAdapter
from diffusers.utils import export_to_gif
adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM")
pipe = AnimateDiffPipeline.from_pretrained("emilianJR/epiCRealism", motion_adapter=adapter)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")
pipe.load_lora_weights("wangfuyun/AnimateLCM", weight_name="sd15_lora_beta.safetensors", adapter_name="lcm-lora")
pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()
output = pipe(
prompt="A space rocket with trails of smoke behind it launching into space from the desert, 4k, high resolution",
negative_prompt="bad quality, worse quality, low resolution",
num_frames=16,
guidance_scale=1.5,
num_inference_steps=6,
generator=torch.Generator("cpu").manual_seed(0),
)
frames = output.frames[0]
export_to_gif(frames, "animatelcm.gif")
```
<table>
<tr>
<td><center>
A space rocket, 4K.
<br>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatelcm-output.gif"
alt="A space rocket, 4K"
style="width: 300px;" />
</center></td>
</tr>
</table>
AnimateLCM is also compatible with existing [Motion LoRAs](https://huggingface.co/collections/dn6/animatediff-motion-loras-654cb8ad732b9e3cf4d3c17e).
```python
import torch
from diffusers import AnimateDiffPipeline, LCMScheduler, MotionAdapter
from diffusers.utils import export_to_gif
adapter = MotionAdapter.from_pretrained("wangfuyun/AnimateLCM")
pipe = AnimateDiffPipeline.from_pretrained("emilianJR/epiCRealism", motion_adapter=adapter)
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config, beta_schedule="linear")
pipe.load_lora_weights("wangfuyun/AnimateLCM", weight_name="sd15_lora_beta.safetensors", adapter_name="lcm-lora")
pipe.load_lora_weights("guoyww/animatediff-motion-lora-tilt-up", adapter_name="tilt-up")
pipe.set_adapters(["lcm-lora", "tilt-up"], [1.0, 0.8])
pipe.enable_vae_slicing()
pipe.enable_model_cpu_offload()
output = pipe(
prompt="A space rocket with trails of smoke behind it launching into space from the desert, 4k, high resolution",
negative_prompt="bad quality, worse quality, low resolution",
num_frames=16,
guidance_scale=1.5,
num_inference_steps=6,
generator=torch.Generator("cpu").manual_seed(0),
)
frames = output.frames[0]
export_to_gif(frames, "animatelcm-motion-lora.gif")
```
<table>
<tr>
<td><center>
A space rocket, 4K.
<br>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/animatelcm-motion-lora.gif"
alt="A space rocket, 4K"
style="width: 300px;" />
</center></td>
</tr>
</table>
## AnimateDiffPipeline
[[autodoc]] AnimateDiffPipeline
- all
- __call__
## AnimateDiffSDXLPipeline
[[autodoc]] AnimateDiffSDXLPipeline
- all
- __call__
## AnimateDiffVideoToVideoPipeline
[[autodoc]] AnimateDiffVideoToVideoPipeline

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -20,7 +20,8 @@ The abstract of the paper is the following:
*Although audio generation shares commonalities across different types of audio, such as speech, music, and sound effects, designing models for each type requires careful consideration of specific objectives and biases that can significantly differ from those of other types. To bring us closer to a unified perspective of audio generation, this paper proposes a framework that utilizes the same learning method for speech, music, and sound effect generation. Our framework introduces a general representation of audio, called "language of audio" (LOA). Any audio can be translated into LOA based on AudioMAE, a self-supervised pre-trained representation learning model. In the generation process, we translate any modalities into LOA by using a GPT-2 model, and we perform self-supervised audio generation learning with a latent diffusion model conditioned on LOA. The proposed framework naturally brings advantages such as in-context learning abilities and reusable self-supervised pretrained AudioMAE and latent diffusion models. Experiments on the major benchmarks of text-to-audio, text-to-music, and text-to-speech demonstrate state-of-the-art or competitive performance against previous approaches. Our code, pretrained model, and demo are available at [this https URL](https://audioldm.github.io/audioldm2).*
This pipeline was contributed by [sanchit-gandhi](https://huggingface.co/sanchit-gandhi). The original codebase can be found at [haoheliu/audioldm2](https://github.com/haoheliu/audioldm2).
This pipeline was contributed by [sanchit-gandhi](https://huggingface.co/sanchit-gandhi) and [Nguyễn Công Tú Anh](https://github.com/tuanh123789). The original codebase can be
found at [haoheliu/audioldm2](https://github.com/haoheliu/audioldm2).
## Tips
@@ -36,6 +37,8 @@ See table below for details on the three checkpoints:
| [audioldm2](https://huggingface.co/cvssp/audioldm2) | Text-to-audio | 350M | 1.1B | 1150k |
| [audioldm2-large](https://huggingface.co/cvssp/audioldm2-large) | Text-to-audio | 750M | 1.5B | 1150k |
| [audioldm2-music](https://huggingface.co/cvssp/audioldm2-music) | Text-to-music | 350M | 1.1B | 665k |
| [audioldm2-gigaspeech](https://huggingface.co/anhnct/audioldm2_gigaspeech) | Text-to-speech | 350M | 1.1B |10k |
| [audioldm2-ljspeech](https://huggingface.co/anhnct/audioldm2_ljspeech) | Text-to-speech | 350M | 1.1B | |
### Constructing a prompt
@@ -53,7 +56,7 @@ See table below for details on the three checkpoints:
* The quality of the generated waveforms can vary significantly based on the seed. Try generating with different seeds until you find a satisfactory generation.
* Multiple waveforms can be generated in one go: set `num_waveforms_per_prompt` to a value greater than 1. Automatic scoring will be performed between the generated waveforms and prompt text, and the audios ranked from best to worst accordingly.
The following example demonstrates how to construct good music generation using the aforementioned tips: [example](https://huggingface.co/docs/diffusers/main/en/api/pipelines/audioldm2#diffusers.AudioLDM2Pipeline.__call__.example).
The following example demonstrates how to construct good music and speech generation using the aforementioned tips: [example](https://huggingface.co/docs/diffusers/main/en/api/pipelines/audioldm2#diffusers.AudioLDM2Pipeline.__call__.example).
<Tip>

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -12,42 +12,10 @@ specific language governing permissions and limitations under the License.
# AutoPipeline
`AutoPipeline` is designed to:
1. make it easy for you to load a checkpoint for a task without knowing the specific pipeline class to use
2. use multiple pipelines in your workflow
Based on the task, the `AutoPipeline` class automatically retrieves the relevant pipeline given the name or path to the pretrained weights with the `from_pretrained()` method.
To seamlessly switch between tasks with the same checkpoint without reallocating additional memory, use the `from_pipe()` method to transfer the components from the original pipeline to the new one.
```py
from diffusers import AutoPipelineForText2Image
import torch
pipeline = AutoPipelineForText2Image.from_pretrained(
"runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16, use_safetensors=True
).to("cuda")
prompt = "Astronaut in a jungle, cold color palette, muted colors, detailed, 8k"
image = pipeline(prompt, num_inference_steps=25).images[0]
```
<Tip>
Check out the [AutoPipeline](../../tutorials/autopipeline) tutorial to learn how to use this API!
</Tip>
`AutoPipeline` supports text-to-image, image-to-image, and inpainting for the following diffusion models:
- [Stable Diffusion](./stable_diffusion/overview)
- [ControlNet](./controlnet)
- [Stable Diffusion XL (SDXL)](./stable_diffusion/stable_diffusion_xl)
- [DeepFloyd IF](./deepfloyd_if)
- [Kandinsky 2.1](./kandinsky)
- [Kandinsky 2.2](./kandinsky_v22)
The `AutoPipeline` is designed to make it easy to load a checkpoint for a task without needing to know the specific pipeline class. Based on the task, the `AutoPipeline` automatically retrieves the correct pipeline class from the checkpoint `model_index.json` file.
> [!TIP]
> Check out the [AutoPipeline](../../tutorials/autopipeline) tutorial to learn how to use this API!
## AutoPipelineForText2Image

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,39 @@
<!--Copyright 2023 The HuggingFace Team and The InstantX Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# ControlNet with Stable Diffusion 3
StableDiffusion3ControlNetPipeline is an implementation of ControlNet for Stable Diffusion 3.
ControlNet was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang, Anyi Rao, and Maneesh Agrawala.
With a ControlNet model, you can provide an additional control image to condition and control Stable Diffusion generation. For example, if you provide a depth map, the ControlNet model generates an image that'll preserve the spatial information from the depth map. It is a more flexible and accurate way to control the image generation process.
The abstract from the paper is:
*We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers pretrained with billions of images as a strong backbone to learn a diverse set of conditional controls. The neural architecture is connected with "zero convolutions" (zero-initialized convolution layers) that progressively grow the parameters from zero and ensure that no harmful noise could affect the finetuning. We test various conditioning controls, eg, edges, depth, segmentation, human pose, etc, with Stable Diffusion, using single or multiple conditions, with or without prompts. We show that the training of ControlNets is robust with small (<50k) and large (>1m) datasets. Extensive results show that ControlNet may facilitate wider applications to control image diffusion models.*
This code is implemented by [The InstantX Team](https://huggingface.co/InstantX). You can find pre-trained checkpoints for SD3-ControlNet on [The InstantX Team](https://huggingface.co/InstantX) Hub profile.
<Tip>
Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
</Tip>
## StableDiffusion3ControlNetPipeline
[[autodoc]] StableDiffusion3ControlNetPipeline
- all
- __call__
## StableDiffusion3PipelineOutput
[[autodoc]] pipelines.stable_diffusion_3.pipeline_output.StableDiffusion3PipelineOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,3 +1,15 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# ControlNet-XS
ControlNet-XS was introduced in [ControlNet-XS](https://vislearn.github.io/ControlNet-XS/) by Denis Zavadski and Carsten Rother. It is based on the observation that the control model in the [original ControlNet](https://huggingface.co/papers/2302.05543) can be made much smaller and still produce good results.
@@ -12,5 +24,16 @@ Here's the overview from the [project page](https://vislearn.github.io/ControlNe
This model was contributed by [UmerHA](https://twitter.com/UmerHAdil). ❤️
<Tip>
> 🧠 Make sure to check out the Schedulers [guide](https://huggingface.co/docs/diffusers/main/en/using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
</Tip>
## StableDiffusionControlNetXSPipeline
[[autodoc]] StableDiffusionControlNetXSPipeline
- all
- __call__
## StableDiffusionPipelineOutput
[[autodoc]] pipelines.stable_diffusion.StableDiffusionPipelineOutput

View File

@@ -1,3 +1,15 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# ControlNet-XS with Stable Diffusion XL
ControlNet-XS was introduced in [ControlNet-XS](https://vislearn.github.io/ControlNet-XS/) by Denis Zavadski and Carsten Rother. It is based on the observation that the control model in the [original ControlNet](https://huggingface.co/papers/2302.05543) can be made much smaller and still produce good results.
@@ -12,4 +24,22 @@ Here's the overview from the [project page](https://vislearn.github.io/ControlNe
This model was contributed by [UmerHA](https://twitter.com/UmerHAdil). ❤️
> 🧠 Make sure to check out the Schedulers [guide](https://huggingface.co/docs/diffusers/main/en/using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](https://huggingface.co/docs/diffusers/main/en/using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
<Tip warning={true}>
🧪 Many of the SDXL ControlNet checkpoints are experimental, and there is a lot of room for improvement. Feel free to open an [Issue](https://github.com/huggingface/diffusers/issues/new/choose) and leave us feedback on how we can improve!
</Tip>
<Tip>
Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines.
</Tip>
## StableDiffusionXLControlNetXSPipeline
[[autodoc]] StableDiffusionXLControlNetXSPipeline
- all
- __call__
## StableDiffusionPipelineOutput
[[autodoc]] pipelines.stable_diffusion.StableDiffusionPipelineOutput

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at

View File

@@ -0,0 +1,95 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Hunyuan-DiT
![chinese elements understanding](https://github.com/gnobitab/diffusers-hunyuan/assets/1157982/39b99036-c3cb-4f16-bb1a-40ec25eda573)
[Hunyuan-DiT : A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding](https://arxiv.org/abs/2405.08748) from Tencent Hunyuan.
The abstract from the paper is:
*We present Hunyuan-DiT, a text-to-image diffusion transformer with fine-grained understanding of both English and Chinese. To construct Hunyuan-DiT, we carefully design the transformer structure, text encoder, and positional encoding. We also build from scratch a whole data pipeline to update and evaluate data for iterative model optimization. For fine-grained language understanding, we train a Multimodal Large Language Model to refine the captions of the images. Finally, Hunyuan-DiT can perform multi-turn multimodal dialogue with users, generating and refining images according to the context. Through our holistic human evaluation protocol with more than 50 professional human evaluators, Hunyuan-DiT sets a new state-of-the-art in Chinese-to-image generation compared with other open-source models.*
You can find the original codebase at [Tencent/HunyuanDiT](https://github.com/Tencent/HunyuanDiT) and all the available checkpoints at [Tencent-Hunyuan](https://huggingface.co/Tencent-Hunyuan/HunyuanDiT).
**Highlights**: HunyuanDiT supports Chinese/English-to-image, multi-resolution generation.
HunyuanDiT has the following components:
* It uses a diffusion transformer as the backbone
* It combines two text encoders, a bilingual CLIP and a multilingual T5 encoder
<Tip>
Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers.md) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading.md#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.
</Tip>
## Optimization
You can optimize the pipeline's runtime and memory consumption with torch.compile and feed-forward chunking. To learn about other optimization methods, check out the [Speed up inference](../../optimization/fp16) and [Reduce memory usage](../../optimization/memory) guides.
### Inference
Use [`torch.compile`](https://huggingface.co/docs/diffusers/main/en/tutorials/fast_diffusion#torchcompile) to reduce the inference latency.
First, load the pipeline:
```python
from diffusers import HunyuanDiTPipeline
import torch
pipeline = HunyuanDiTPipeline.from_pretrained(
"Tencent-Hunyuan/HunyuanDiT-Diffusers", torch_dtype=torch.float16
).to("cuda")
```
Then change the memory layout of the pipelines `transformer` and `vae` components to `torch.channels-last`:
```python
pipeline.transformer.to(memory_format=torch.channels_last)
pipeline.vae.to(memory_format=torch.channels_last)
```
Finally, compile the components and run inference:
```python
pipeline.transformer = torch.compile(pipeline.transformer, mode="max-autotune", fullgraph=True)
pipeline.vae.decode = torch.compile(pipeline.vae.decode, mode="max-autotune", fullgraph=True)
image = pipeline(prompt="一个宇航员在骑马").images[0]
```
The [benchmark](https://gist.github.com/sayakpaul/29d3a14905cfcbf611fe71ebd22e9b23) results on a 80GB A100 machine are:
```bash
With torch.compile(): Average inference time: 12.470 seconds.
Without torch.compile(): Average inference time: 20.570 seconds.
```
### Memory optimization
By loading the T5 text encoder in 8 bits, you can run the pipeline in just under 6 GBs of GPU VRAM. Refer to [this script](https://gist.github.com/sayakpaul/3154605f6af05b98a41081aaba5ca43e) for details.
Furthermore, you can use the [`~HunyuanDiT2DModel.enable_forward_chunking`] method to reduce memory usage. Feed-forward chunking runs the feed-forward layers in a transformer block in a loop instead of all at once. This gives you a trade-off between memory consumption and inference runtime.
```diff
+ pipeline.transformer.enable_forward_chunking(chunk_size=1, dim=1)
```
## HunyuanDiTPipeline
[[autodoc]] HunyuanDiTPipeline
- all
- __call__

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -18,11 +18,11 @@ The abstract from the paper is:
*Video synthesis has recently made remarkable strides benefiting from the rapid development of diffusion models. However, it still encounters challenges in terms of semantic accuracy, clarity and spatio-temporal continuity. They primarily arise from the scarcity of well-aligned text-video data and the complex inherent structure of videos, making it difficult for the model to simultaneously ensure semantic and qualitative excellence. In this report, we propose a cascaded I2VGen-XL approach that enhances model performance by decoupling these two factors and ensures the alignment of the input data by utilizing static images as a form of crucial guidance. I2VGen-XL consists of two stages: i) the base stage guarantees coherent semantics and preserves content from input images by using two hierarchical encoders, and ii) the refinement stage enhances the video's details by incorporating an additional brief text and improves the resolution to 1280×720. To improve the diversity, we collect around 35 million single-shot text-video pairs and 6 billion text-image pairs to optimize the model. By this means, I2VGen-XL can simultaneously enhance the semantic accuracy, continuity of details and clarity of generated videos. Through extensive experiments, we have investigated the underlying principles of I2VGen-XL and compared it with current top methods, which can demonstrate its effectiveness on diverse data. The source code and models will be publicly available at [this https URL](https://i2vgen-xl.github.io/).*
The original codebase can be found [here](https://github.com/ali-vilab/i2vgen-xl/). The model checkpoints can be found [here](https://huggingface.co/ali-vilab/).
The original codebase can be found [here](https://github.com/ali-vilab/i2vgen-xl/). The model checkpoints can be found [here](https://huggingface.co/ali-vilab/).
<Tip>
Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines. Also, to know more about reducing the memory usage of this pipeline, refer to the ["Reduce memory usage"] section [here](../../using-diffusers/svd#reduce-memory-usage).
Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-components-across-pipelines) section to learn how to efficiently load the same components into multiple pipelines. Also, to know more about reducing the memory usage of this pipeline, refer to the ["Reduce memory usage"] section [here](../../using-diffusers/svd#reduce-memory-usage).
</Tip>
@@ -31,7 +31,7 @@ Sample output with I2VGenXL:
<table>
<tr>
<td><center>
masterpiece, bestquality, sunset.
library.
<br>
<img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/i2vgen-xl-example.gif"
alt="library"
@@ -43,10 +43,11 @@ Sample output with I2VGenXL:
## Notes
* I2VGenXL always uses a `clip_skip` value of 1. This means it leverages the penultimate layer representations from the text encoder of CLIP.
* It can generate videos of quality that is often on par with [Stable Video Diffusion](../../using-diffusers/svd) (SVD).
* Unlike SVD, it additionally accepts text prompts as inputs.
* It can generate higher resolution videos.
* It can generate videos of quality that is often on par with [Stable Video Diffusion](../../using-diffusers/svd) (SVD).
* Unlike SVD, it additionally accepts text prompts as inputs.
* It can generate higher resolution videos.
* When using the [`DDIMScheduler`] (which is default for this pipeline), less than 50 steps for inference leads to bad results.
* This implementation is 1-stage variant of I2VGenXL. The main figure in the [I2VGen-XL](https://arxiv.org/abs/2311.04145) paper shows a 2-stage variant, however, 1-stage variant works well. See [this discussion](https://github.com/huggingface/diffusers/discussions/7952) for more details.
## I2VGenXLPipeline
[[autodoc]] I2VGenXLPipeline

Some files were not shown because too many files have changed in this diff Show More