Compare commits

1966 Commits

Author SHA1 Message Date
Wauplin
bae4e5f8af Fix PATH_IN_REPO in mirror workflow 2024-06-13 09:48:58 +02:00
Beinsezii
7f51f286a5 Add Hunyuan AutoPipe mapping (#8505) 2024-06-12 16:11:55 -10:00
kkj15dk
829f6defa4 Fix spelling in scheduling_flow_match_euler_discrete.py (#8497)
Update scheduling_flow_match_euler_discrete.py

Spelling:
Foward -> Forward

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-06-12 12:37:47 -10:00
Beinsezii
24bdf4b215 Add SD3 AutoPipeline mappings (#8489) 2024-06-12 12:31:36 -10:00
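
A quick sketch of what the AutoPipeline mapping added here enables (the gated checkpoint id below is assumed as an illustrative example):

```python
import torch
from diffusers import AutoPipelineForText2Image

# With the SD3 mapping in place, AutoPipelineForText2Image resolves this
# checkpoint to StableDiffusion3Pipeline automatically.
pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16
).to("cuda")
print(type(pipe).__name__)  # expected: StableDiffusion3Pipeline
```
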
Radamés Ajna
95e0c3757d Fix small typo (#8498) 2024-06-12 15:30:58 -07:00
Sayak Paul
6cf0be5d3d fix warning log for Transformer SD3 (#8496)
fix warning log
2024-06-12 12:25:18 -10:00
Sayak Paul
ec068f9b5b fix dual transformer2d import (#8491)
fix
2024-06-12 21:10:27 +01:00
Ameer Azam
0240d4191a Update README_sd3.md (#8490)
because in the README it was not correct:
train_dreambooth_sd3.py to train_dreambooth_lora_sd3
2024-06-12 21:08:36 +01:00
Dhruv Nair
04717fd861 Add Stable Diffusion 3 (#8483)
* up

* add sd3

* update

* update

* add tests

* fix copies

* fix docs

* update

* add dreambooth lora

* add LoRA

* update

* update

* update

* update

* import fix

* update

* Update src/diffusers/pipelines/stable_diffusion_3/pipeline_stable_diffusion_3.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* import fix 2

* update

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* update

* update

* update

* fix ckpt id

* fix more ids

* update

* missing doc

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/schedulers/scheduling_flow_match_euler_discrete.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_3.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/stable_diffusion_3.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* fix

* update

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

* Update src/diffusers/models/autoencoders/autoencoder_kl.py

* note on gated access.

* requirements

* licensing

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-06-12 20:44:00 +01:00
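
For reference, a minimal usage sketch of the pipeline added in this commit (the checkpoint id is the gated official SD3 repository, and the inference settings are illustrative rather than the shipped defaults):

```python
import torch
from diffusers import StableDiffusion3Pipeline

pipe = StableDiffusion3Pipeline.from_pretrained(
    "stabilityai/stable-diffusion-3-medium-diffusers", torch_dtype=torch.float16
).to("cuda")

# Settings in the vicinity of the SD3 docs; adjust as needed.
image = pipe(
    "a photo of a cat holding a sign that says hello world",
    num_inference_steps=28,
    guidance_scale=7.0,
).images[0]
image.save("sd3.png")
```
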
Jiwook Han
6fd458e99d 🌐 [i18n-KO] Translated conceptual/philosophy.md and 3 other documents to Korean (#8294)
* translation about 3 documents into Korean

* evaluation doc korean translation

* _toctree.yml modify

* doc title fix : philosopy->philosophy

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/ethical_guidelines.md

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update docs/source/ko/conceptual/evaluation.md

Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>

* Update philosophy.md (from jungnerd)

---------

Co-authored-by: Jihun Lim <31366038+heuristicwave@users.noreply.github.com>
Co-authored-by: Chulhwa (Evan) Han <cjfghk5697@ajou.ac.kr>
2024-06-12 09:40:37 -07:00
Greg Hunkins
1066fe4cbc 🤫 Quiet IP Adapter Mask Warning (#8475)
* quiet attn parameters

* fix lint

* make style && make quality

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-12 16:50:13 +01:00
Sayak Paul
d38f69ea25 change max_shard_size to 10GB (#8445)
* change max_shard_size to 10GB

* add notes to the documentation

* Update src/diffusers/models/modeling_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* change to abs limit

---------

Co-authored-by: Lucain <lucainp@gmail.com>
2024-06-12 13:49:13 +01:00
Patrick
0a1c13af79 image_processor.py: Fixed an error in ValueError's message (#8447)
* image_processor.py: Fixed an error in ValueError's message, as the string's join method tried to join types instead of strings

Bug that occurred:

f"Input is in incorrect format. Currently, we only support {', '.join(supported_formats)}"
TypeError: sequence item 0: expected str instance, type found

* Fixed: C417 Unnecessary `map` usage (rewrite using a generator expression)

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-11 08:09:24 -10:00
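
The underlying issue in the commit above is plain Python: `str.join` only accepts strings, so joining a list of type objects raises the quoted TypeError. A minimal sketch of that kind of fix (variable names are hypothetical, not the exact code from the PR):

```python
supported_formats = (str, int)  # stand-in for a list of accepted types

# Broken: join() receives type objects, not strings -> TypeError
# message = f"we only support {', '.join(supported_formats)}"

# Fixed: convert each entry to its string form first; a generator
# expression also satisfies the C417 lint mentioned in the second bullet.
message = f"we only support {', '.join(str(x) for x in supported_formats)}"
print(message)
```
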
YiYi Xu
0028c34432 fix SEGA pipeline (#8467)
* fix

* style

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-11 06:37:49 -10:00
Sayak Paul
d457beed92 Update README.md to update the MaPO project (#8470)
Update README.md
2024-06-11 10:10:45 +01:00
Jianqi Pan
1d9a6a81b9 🔧 chore: use modeling_outputs.Transformer2DModelOutput (#8436)
* 🔧 chore: use modeling_outputs.Transformer2DModelOutput

* 🔧 chore: isort

* 🔧 chore: isort

* style

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2024-06-10 12:11:41 +01:00
Luc Georges
4e0984db6c fix(ci): remove unnecessary permissions (#8457) 2024-06-10 10:49:29 +01:00
Luc Georges
83bc6c94ea feat(ci): add trufflehog secrets detection (#8430) 2024-06-08 07:56:47 +05:30
Lucain
0d68ddf327 Move away from cached_download (#8419)
* Move away from

* unused constant

* Add custom error
2024-06-07 15:43:00 +05:30
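
For context, the huggingface_hub replacement for the deprecated `cached_download` is `hf_hub_download`, which resolves and caches a single file by repo id and filename (the repo and file names below are illustrative):

```python
from huggingface_hub import hf_hub_download

# Downloads (or reuses from the local cache) a single file from the Hub.
config_path = hf_hub_download(
    repo_id="stabilityai/stable-diffusion-xl-base-1.0",
    filename="model_index.json",
)
print(config_path)
```
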
Sayak Paul
7d887118b9 [Core] support saving and loading of sharded checkpoints (#7830)
* feat: support saving a model in sharded checkpoints.

* feat: make loading of sharded checkpoints work.

* add tests

* cleanse the loading logic a bit more.

* more resilience while loading from the Hub.

* parallelize shard downloads by using snapshot_download()

* default to a shard size.

* more fix

* Empty-Commit

* debug

* fix

* quality

* more debugging

* fix more

* initial comments from Benjamin

* move certain methods to loading_utils

* add test to check if the correct number of shards are present.

* add a test to check if loading of sharded checkpoints from the Hub is okay

* clarify the unit when passed as an int.

* use hf_hub for sharding.

* remove unnecessary code

* remove unnecessary function

* lucain's comments.

* fixes

* address high-level comments.

* fix test

* subfolder shenanigans.

* Update src/diffusers/utils/hub_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

* Apply suggestions from code review

Co-authored-by: Lucain <lucainp@gmail.com>

* remove _huggingface_hub_version as not needed.

* address more feedback.

* add a test for local_files_only=True

* need hf hub to be at least 0.23.2

* style

* final comment.

* clean up subfolder.

* deal with suffixes in code.

* _add_variant default.

* use weights_name_pattern

* remove add_suffix_keyword

* clean up downloading of sharded ckpts.

* don't return something special when using index.json

* fix more

* don't use bare except

* remove comments and catch the errors better

* fix a couple of things when using is_file()

* empty

---------

Co-authored-by: Lucain <lucainp@gmail.com>
2024-06-07 14:49:10 +05:30
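
A minimal sketch of the sharded-checkpoint feature from the user's side (the repo id, shard size, and output paths are illustrative):

```python
import torch
from diffusers import UNet2DConditionModel

# Load any diffusers-format UNet.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)

# Saving with max_shard_size splits the weights into multiple files plus a
# *.safetensors.index.json (the default cap is the 10GB set in a later commit).
unet.save_pretrained("sd15-unet-sharded", max_shard_size="2GB")

# from_pretrained detects the index file and reloads the shards transparently.
reloaded = UNet2DConditionModel.from_pretrained(
    "sd15-unet-sharded", torch_dtype=torch.float16
)
```
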
Lucain
b63c956860 Final fix for mirror community pipeline (#8427) 2024-06-07 11:08:33 +02:00
Lucain
716b2062bf Fix mirror community pipeline (#8426) 2024-06-07 11:03:48 +02:00
Lucain
5fd6825d25 Fix mirror_community_pipeline.yml name (#8425) 2024-06-07 11:00:05 +02:00
Lucain
e0fae6fd73 Mirror ./examples/community folder on HF (#8417)
* first draft

* secret

* tiktok

* capital matters

* dataset matter

* don't be a prick

* refact

* only on main or tag

* document with an example

* Update destination dataset

* link

* allow manual trigger

* better

* lin

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-07 10:56:05 +02:00
Tolga Cangöz
ec1aded12e Optimize test files by fixing CPU-offloading usage (#8409)
* Refactor code to remove unnecessary calls to `to(torch_device)`

* Refactor code to remove unnecessary calls to `to("cuda")`

* Update pipeline_stable_diffusion_diffedit.py
2024-06-06 09:51:26 -10:00
Steven Liu
151a56b80e [docs] Single file usage (#8412)
* single file usage

* edit
2024-06-06 12:40:34 -07:00
Sayak Paul
a3faf3f260 [Core] fix: legacy model mapping (#8416)
* fix: legacy model mapping

* remove print
2024-06-06 20:35:05 +05:30
Sayak Paul
867a2b0cf9 [Hunyuan] add optimization related sections to the hunyuan dit docs. (#8402)
* optimizations to the hunyuan dit docs.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/hunyuandit.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-06-06 05:41:38 +05:30
Tolga Cangöz
98730c5dd7 Errata (#8322)
* Fix typos

* Trim trailing whitespaces

* Remove a trailing whitespace

* chore: Update MarigoldDepthPipeline checkpoint to prs-eth/marigold-lcm-v1-0

* Revert "chore: Update MarigoldDepthPipeline checkpoint to prs-eth/marigold-lcm-v1-0"

This reverts commit fd742b30b4.

* pokemon -> naruto

* `DPMSolverMultistep` -> `DPMSolverMultistepScheduler`

* Improve Markdown stylization

* Improve style

* Improve style

* Refactor pipeline variable names for consistency

* up style
2024-06-05 13:59:09 -07:00
Guillaume LEGENDRE
7ebd359446 Update tailscale action to main (#8403) 2024-06-05 18:53:33 +05:30
Hzzone
d3881f35b7 Gligen training (#7906)
* add training code of gligen

* fix code quality tests.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-06-05 16:26:42 +04:00
Sayak Paul
48207d6689 [Scheduler] fix: EDM schedulers when using the exp sigma schedule. (#8385)
* fix: Euler EDM when using the exp sigma schedule.

* fix-copies

* remove print.

* reduce friction

* yiyi's suggestions
2024-06-04 19:31:43 -10:00
Sayak Paul
2f6f426f66 [Hunyuan] allow Hunyuan DiT to run under 6GB for GPU VRAM (#8399)
* allow hunyuan dit to run under 6GB for GPU VRAM

* add section in the docs/
2024-06-05 08:24:19 +04:00
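
A hedged sketch of the memory-reduction idea behind this commit (the checkpoint id is assumed, and the docs section added here may rely on additional tricks such as quantized text encoders; this shows only the standard offloading lever):

```python
import torch
from diffusers import HunyuanDiTPipeline

pipe = HunyuanDiTPipeline.from_pretrained(
    "Tencent-Hunyuan/HunyuanDiT-Diffusers", torch_dtype=torch.float16
)
# Keep sub-models on the CPU and move each to the GPU only while it runs,
# which is the usual way to fit a large pipeline into a small VRAM budget.
pipe.enable_model_cpu_offload()

image = pipe("an astronaut riding a horse").images[0]
image.save("hunyuan.png")
```
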
Sayak Paul
a0542c1917 [LoRA] Remove legacy LoRA code and related adjustments (#8316)
* remove legacy code from load_attn_procs.

* finish first draft

* fix more.

* fix more

* add test

* add serialization support.

* fix-copies

* require peft backend for lora tests

* style

* fix test

* fix loading.

* empty

* address benjamin's feedback.
2024-06-05 08:15:30 +04:00
Sayak Paul
a8ad6664c2 [Hunyuan] feat: support chunked ff. (#8397)
feat: support chunked ff.
2024-06-05 08:12:18 +04:00
Sayak Paul
14f7b545bd [Hunyuan DiT] feat: enable fusing qkv projections when doing attention (#8396)
* feat: introduce qkv fusion for Hunyuan

* fix copies
2024-06-05 07:58:03 +04:00
leaps
07cd20041c Update code example in pipeline_stable_unclip_img2img.py EXAMPLE_DOC_STRING (#8401)
Update code example in pipeline_stable_unclip_img2img.py

Previous code caused an error when run
2024-06-04 17:22:46 -10:00
Sayak Paul
6ddbf6222c [Transformer2DModel] Handle norm_type safely while remapping (#8370)
* handle norm_type of transformer2d_model safely.

* log an info when old model class is being returned.

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* remove extra stuff

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-06-04 13:39:19 +04:00
Sayak Paul
3ff39e8e86 [HunyuanDiT] minor docs changes in hunyuandit (#8395)
minor docs changes in hunyuandit
2024-06-04 12:18:53 +04:00
townwish4git
6be43bd855 Fix AsymmetricAutoencoderKL forward (#8378) 2024-06-03 17:25:11 -10:00
Marçal Comajoan Cara
dc89434bdc Update transformer2d.md title (#8375)
* Update transformer2d.md title

For the other classes (e.g., UNet2DModel) the title of the documentation coincides with the name of the class, but that was not the case for Transformer2DModel.

* Update model docs titles for consistency with class names
2024-06-03 17:01:21 -07:00
Dhruv Nair
4d633bfe9a Update slow test actions (#8381)
* update

* update

* update

* update
2024-06-03 18:32:34 +05:30
XCL
174cf868ea Tencent Hunyuan Team - Updated Doc for HunyuanDiT (#8383)
* add hunyuandit doc

* update hunyuandit doc

* update hunyuandit 2d model

* update toctree.yml for hunyuandit
2024-06-03 14:02:46 +04:00
XCL
413604405f Tencent Hunyuan Team: add HunyuanDiT related updates (#8240)
* Hunyuan Team: add HunyuanDiT related updates


---------

Co-authored-by: XCLiu <liuxc1996@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
2024-06-01 12:41:21 -10:00
39th president of the United States, probably
bc108e1533 Fix DREAM training (#8302)
Co-authored-by: Jimmy <39@🇺🇸.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-06-01 11:27:57 +04:00
Anton Obukhov
86555c9f59 Fix marigold documentation (#8372)
* rename prs-eth/marigold-lcm-v1-0 into prs-eth/marigold-depth-lcm-v1-0

* update image paths in https://huggingface.co/datasets/huggingface/documentation-images to use main branch

* fix relative paths to other diffusers pages

* Update docs/source/en/using-diffusers/marigold_usage.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-05-31 12:10:05 -10:00
Sayak Paul
983dec3bf7 [Core] Introduce class variants for Transformer2DModel (#7647)
* init for patches

* finish patched model.

* continuous transformer

* vectorized transformer2d.

* style.

* inits.

* fix-copies.

* introduce DiTTransformer2DModel.

* fixes

* use REMAPPING as suggested by @DN6

* better logging.

* add pixart transformer model.

* inits.

* caption_channels.

* attention masking.

* fix use_additional_conditions.

* remove print.

* debug

* flatten

* fix: assertion for sigma

* handle remapping for modeling_utils

* add tests for dit transformer2d

* quality

* placeholder for pixart tests

* pixart tests

* add _no_split_modules

* add docs.

* check

* check

* check

* check

* fix tests

* fix tests

* move Transformer output to modeling_output

* move errors better and bring back use_additional_conditions attribute.

* add unnecessary things from DiT.

* clean up pixart

* fix remapping

* fix device_map things in pixart2d.

* replace Transformer2DModel with appropriate classes in dit, pixart tests

* empty

* legacy mixin classes.

* use a remapping dict for fetching class names.

* change to specifc model types in the pipeline implementations.

* move _fetch_remapped_cls_from_config to modeling_loading_utils.py

* fix dependency problems.

* add deprecation note.
2024-05-31 13:40:27 +05:30
Dhruv Nair
f9fa8a868c Change checkpoint key used to identify CLIP models in single file checkpoints (#8319)
update
2024-05-31 11:20:31 +05:30
Jonah
05be622b1c Fix depth pipeline "input/weight type should be the same" error at fp16 (#8321)
Fix "input/weight type should be the same"

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-30 13:59:49 -10:00
satani99
352d96eb82 Modularize train_text_to_image_lora_sdxl inferencing during and after training in example (#8335)
* Modularized the train_lora_sdxl file

* Modularized the train_lora_sdxl file

* Modularized the train_lora_sdxl file

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-31 04:52:22 +05:30
Genius Patrick
3511a9623f fix(training): lr scheduler doesn't work properly in distributed scenarios (#8312) 2024-05-30 15:23:19 +05:30
Dhruv Nair
42cae93b94 Fix StableDiffusionPipeline when text_encoder=None (#8297)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-29 09:00:51 -10:00
Tolga Cangöz
a2ecce26bc Fix Copying Mechanism typo/bug (#8232)
* Fix copying mechanism typos

* fix copying mecha

* Revert, since they are in TODO

* Fix copying mechanism
2024-05-29 09:37:18 -07:00
Steven Liu
9e00b727ad [docs] Files and formats (#7874)
* files and formats

* fix callout

* feedback

* code sample

* feedback
2024-05-29 09:31:32 -07:00
Steven Liu
f7a4626f4b [docs] DeepFloyd training (#8224)
deepfloyd training

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-29 09:27:37 -07:00
Tolga Cangöz
f4a44b7707 Simplify platform_info assignment in diffusers-cli env (#8298)
chore: Simplify `platform_info` assignment
2024-05-29 17:57:42 +05:30
satani99
3bc3b48c10 Modularize train_text_to_image_lora SD inferencing during and after training in example (#8283)
* Modularized the train_lora file

* Modularized the train_lora file

* Modularized the train_lora file

* Modularized the train_lora file

* Modularized the train_lora file

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-29 10:08:02 +05:30
Sayak Paul
581d8aacf7 post release v0.28.0 (#8286)
* post release v0.28.0

* style
2024-05-29 07:13:22 +05:30
Sayak Paul
ba1bfac20b [Core] Refactor IPAdapterPlusImageProjection a bit (#7994)
* use IPAdapterPlusImageProjectionBlock in IPAdapterPlusImageProjection

* reposition IPAdapterPlusImageProjection

* refactor complete?

* fix heads param retrieval.

* update test dict creation method.
2024-05-29 06:30:47 +05:30
Sayak Paul
5edd0b34fa move vqmodel to models.autoencoders. (#8292)
move vqmodel to models.autoencoders.
2024-05-29 06:30:35 +05:30
Sayak Paul
3a28e36aa1 [Post release 0.28.0] remove deprecated blocks. (#8291)
* remove deprecated blocks.

* update the location paths.
2024-05-29 06:29:43 +05:30
Vladimir Mandic
3393c01c9d fix pixart-sigma negative prompt handling (#8299)
* fix negative prompt

* fix

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-28 13:10:35 -10:00
Steven Liu
1fa8dbc63a [docs] Outpaint (#7964)
* first draft

* edits
2024-05-28 14:42:03 -07:00
Steven Liu
0ab6dc0f23 [docs] Scheduler features (#7990)
* noise schedule

* sigmas and zero snr

* feedback

* feedback
2024-05-28 14:41:22 -07:00
Álvaro Somoza
b2030a249c Fix object has no attribute 'flush' when using without a console (#8271)
fix
2024-05-28 11:19:01 -10:00
Sajad Norouzi
67bef2027c Add Kohya fix to SD pipeline for high resolution generation (#7633)
add kohya high resolution fix.
2024-05-28 10:00:04 -10:00
Sayak Paul
aa676c641f change to yiyi's address. (#7981)
* change to yiyi's address.

* update to diffusers@huggingface.co
2024-05-28 08:28:55 -10:00
Sayak Paul
e6df8edadc [LoRA] attempt at fixing onetrainer lora. (#8242)
* attempt at fixing onetrainer lora.

* fix
2024-05-28 08:25:54 -10:00
Jiwook Han
80cfaebaa1 Fix typo in philosophy.md (#8303)
fix typo in philosophy.md
2024-05-28 10:38:48 -07:00
Álvaro Somoza
ba82414106 [docs] Add controlnet example to marigold (#8289)
* initial doc

* fix wrong LCM sentence

* implement binary colormap without requiring matplotlib
update section about Marigold for ControlNet
update formatting of marigold_usage.md

* fix indentation

---------

Co-authored-by: anton <anton.obukhov@gmail.com>
2024-05-28 11:58:06 -04:00
Sayak Paul
fe5f035f79 install wget. (#8285) 2024-05-27 18:06:07 +05:30
Anton Obukhov
b3d10d6d65 [Pipeline] Marigold depth and normals estimation (#7847)
* implement marigold depth and normals pipelines in diffusers core

* remove bibtex

* remove deprecations

* remove save_memory argument

* remove validate_vae

* remove config output

* remove batch_size autodetection

* remove presets logic
move default denoising_steps and processing_resolution into the model config
make default ensemble_size 1

* remove no_grad

* add fp16 to the example usage

* implement is_matplotlib_available
use is_matplotlib_available, is_scipy_available for conditional imports in the marigold depth pipeline

* move colormap, visualize_depth, and visualize_normals into export_utils.py

* make the denoising loop more lucid
fix the outputs to always be 4d tensors or lists of pil images
support a 4d input_image case
attempt to support model_cpu_offload_seq
move check_inputs into a separate function
change default batch_size to 1, remove any logic to make it bigger implicitly

* style

* rename denoising_steps into num_inference_steps

* rename input_image into image

* rename input_latent into latents

* remove decode_image
change decode_prediction to use the AutoencoderKL.decode method

* move clean_latent outside of progress_bar

* refactor marigold-reusable image processing bits into MarigoldImageProcessor class

* clean up the usage example docstring

* make ensemble functions members of the pipelines

* add early checks in check_inputs
rename E into ensemble_size in depth ensembling

* fix vae_scale_factor computation

* better compatibility with torch.compile
better variable naming

* move export_depth_to_png to export_utils

* remove encode_prediction

* improve visualize_depth and visualize_normals to accept multi-dimensional data and lists
remove visualization functions from the pipelines
move exporting depth as 16-bit PNGs functionality from the depth pipeline
update example docstrings

* do not shortcut vae.config variables

* change all asserts to raise ValueError

* rename output_prediction_type to output_type

* better variable names
clean up variable deletion code

* better variable names

* pass desc and leave kwargs into the diffusers progress_bar
implement nested progress bar for images and steps loops

* implement scale_invariant and shift_invariant flags in the ensemble_depth function
add scale_invariant and shift_invariant flags readout from the model config
further refactor ensemble_depth
support ensembling without alignment
add ensemble_depth docstring

* fix generator device placement checks

* move encode_empty_text body into the pipeline call

* minor empty text encoding simplifications

* adjust pipelines' class docstrings to explain the added construction arguments

* improve the scipy failure condition
add comments
improve docstrings
change the default use_full_z_range to True

* make input image values range check configurable in the preprocessor
refactor load_image_canonical in preprocessor to reject unknown types and return the image as a tensor in the expected 4D format on the right device
support a list of everything as inputs to the pipeline, change type to PipelineImageInput
implement a check that all input list elements have the same dimensions
improve docstrings of pipeline outputs
remove check_input pipeline argument

* remove forgotten print

* add prediction_type model config

* add uncertainty visualization into export utils
fix NaN values in normals uncertainties

* change default of output_uncertainty to False
better handle the case of an attempt to export or visualize none

* fix `output_uncertainty=False`

* remove kwargs
fix check_inputs according to the new inputs of the pipeline

* rename prepare_latent into prepare_latents as in other pipelines
annotate prepare_latents in normals pipeline with "Copied from"
annotate encode_image in normals pipeline with "Copied from"

* move nested-capable `progress_bar` method into the pipelines
revert the original `progress_bar` method in pipeline_utils

* minor message improvement

* fix cpu offloading

* move colormap, visualize_depth, export_depth_to_16bit_png, visualize_normals, visualize_uncertainty to marigold_image_processing.py
update example docstrings

* fix missing comma

* change torch.FloatTensor to torch.Tensor

* fix importing of MarigoldImageProcessor

* fix vae offloading
fix batched image encoding
remove separate encode_image function and use vae.encode instead

* implement marigold's initial tests
relax generator checks in line with other pipelines
implement return_dict __call__ argument in line with other pipelines

* fix num_images computation

* remove MarigoldImageProcessor and outputs from import structure
update tests

* update docstrings

* update init

* update

* style

* fix

* fix

* up

* up

* up

* add simple test

* up

* update expected np input/output to be channel last

* move expand_tensor_or_array into the MarigoldImageProcessor

* rewrite tests to follow conventions - hardcoded slices instead of image artifacts
write more smoke tests

* add basic docs.

* add anton's contribution statement

* remove todos.

* fix assertion values for marigold depth slow tests

* fix assertion values for depth normals.

* remove print

* support AutoencoderTiny in the pipelines

* update documentation page
add Available Pipelines section
add Available Checkpoints section
add warning about num_inference_steps

* fix missing import in docstring
fix wrong value in visualize_depth docstring

* [doc] add marigold to pipelines overview

* [doc] add section "usage examples"

* fix an issue with latents check in the pipelines

* add "Frame-by-frame Video Processing with Consistency" section

* grammarly

* replace tables with images with css-styled images (blindly)

* style

* print

* fix the assertions.

* take from the github runner.

* take the slices from action artifacts

* style.

* update with the slices from the runner.

* remove unnecessary code blocks.

* Revert "[doc] add marigold to pipelines overview"

This reverts commit a505165150afd8dab23c474d1a054ea505a56a5f.

* remove invitation for new modalities

* split out marigold usage examples

* doc cleanup

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2024-05-27 17:21:49 +05:30
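
A short usage sketch of the depth pipeline introduced in this commit (the checkpoint id comes from the Marigold documentation fix earlier in this log; the argument and output field names follow the bullets above and should be treated as assumptions):

```python
import torch
from PIL import Image
from diffusers import MarigoldDepthPipeline

pipe = MarigoldDepthPipeline.from_pretrained(
    "prs-eth/marigold-depth-lcm-v1-0", torch_dtype=torch.float16
).to("cuda")

image = Image.open("input.jpg")  # any RGB photo

# num_inference_steps / ensemble_size follow the renames noted above.
out = pipe(image, num_inference_steps=4, ensemble_size=1)

# Depth visualization lives on the Marigold image processor per the bullets;
# the .prediction field and the returned list of PIL images are assumptions.
vis = pipe.image_processor.visualize_depth(out.prediction)
vis[0].save("depth_colored.png")
```
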
Dhruv Nair
b82f9f5666 Add zip package to doc builder image (#8284)
update
2024-05-27 15:50:00 +05:30
Sayak Paul
6a5ba1b719 [Workflows] add a more secure way to run tests from a PR. (#7969)
* add a more secure way to run tests from a PR.

* make pytest more secure.

* address dhruv's comments.

* improve validation check.

* Update .github/workflows/run_tests_from_a_pr.yml

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-27 13:47:50 +05:30
Dhaivat Bhatt
4d40c9140c Add details about 1-stage implementation in I2VGen-XL docs (#8282)
* Add details about 1-stage implementation

* Add details about 1-stage implementation
2024-05-27 09:56:32 +05:30
Tolga Cangöz
0ab63ff647 Fix CPU Offloading Usage & Typos (#8230)
* Fix typos

* Fix `pipe.enable_model_cpu_offload()` usage

* Fix cpu offloading

* Update numbers
2024-05-24 11:25:29 -07:00
Tolga Cangöz
db33af065b Fix a grammatical error in the raise messages (#8272)
Fix grammatical error
2024-05-24 11:15:00 -07:00
Yue Wu
1096f88e2b sampling bug fix in diffusers tutorial "basic_training.md" (#8223)
sampling bug fix in basic_training.md

In the diffusers basic training tutorial, setting the manual seed argument (generator=torch.manual_seed(config.seed)) in the pipeline call inside the evaluate() function rewinds the dataloader shuffling, leading to overfitting because the model sees the same sequence of training examples after every evaluation call. Using generator=torch.Generator(device='cpu').manual_seed(config.seed) avoids this.
2024-05-24 11:14:32 -07:00
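
The commit message above already contains the essence of the fix; spelled out as a small runnable sketch (the Config class stands in for the tutorial's training config):

```python
import torch

class Config:  # stand-in for the tutorial's TrainingConfig
    seed = 0

config = Config()

# Problematic: torch.manual_seed() reseeds the *global* RNG, which also drives
# the DataLoader's shuffling, so every evaluate() call rewinds the data order.
# generator = torch.manual_seed(config.seed)

# Fix: a dedicated Generator leaves the global RNG (and the DataLoader) alone.
generator = torch.Generator(device="cpu").manual_seed(config.seed)
print(torch.randn(2, generator=generator))
```
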
Dhruv Nair
cef4a51223 Clean up from_single_file docs (#8268)
* update

* update
2024-05-24 17:43:51 +05:30
Lucain
edf5ba6a17 Respect resume_download deprecation V2 (#8267)
* Fix resume_downoad FutureWarning

* only resume download
2024-05-24 12:11:03 +02:00
Sayak Paul
9941f1f61b [Chore] run the documentation workflow in a custom container. (#8266)
run the documentation workflow in a custom container.
2024-05-24 15:10:02 +05:30
Yifan Zhou
46a9db0336 [Community Pipeline] FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation (#8239)
* code and doc

* update paper link

* remove redundant codes

* add example video

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-24 14:44:20 +05:30
Dhruv Nair
370146e4e0 Use freedesktop_os_release() in diffusers cli for Python >=3.10 (#8235)
* update

* update
2024-05-24 13:30:40 +05:30
Dhruv Nair
5cd45c24bf Create custom container for doc builder (#8263)
* update

* update
2024-05-24 12:53:48 +05:30
Dhruv Nair
67b3fe0aae Fix resize issue in SVD pipeline with VideoProcessor (#8229)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-23 11:57:34 +05:30
Dhruv Nair
baab065679 Remove unnecessary single file tests for SD Cascade UNet (#7996)
update
2024-05-22 12:29:59 +05:30
BootesVoid
509741aea7 fix: Attribute error in Logger object (logger.warning) (#8183) 2024-05-22 12:29:11 +05:30
Lucain
e1df77ee1e Use HF_TOKEN env var in CI (#7993) 2024-05-21 14:58:10 +05:30
Steven Liu
fdb1baa05c [docs] VideoProcessor (#7965)
* fix?

* fix?

* fix
2024-05-21 08:18:21 +05:30
Vinh H. Pham
6529ee67ec Make VAE compatible to torch.compile() (#7984)
make VAE compatible to torch.compile()

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-20 13:43:59 -04:00
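
A small sketch of what "compatible with torch.compile()" buys in practice (the VAE checkpoint id is illustrative, and the compile flags are just one reasonable choice):

```python
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained(
    "stabilityai/sd-vae-ft-mse", torch_dtype=torch.float16
).to("cuda")

# Compiling the decoder is the usual speed-up target for image pipelines.
vae.decode = torch.compile(vae.decode, mode="reduce-overhead", fullgraph=True)

latents = torch.randn(1, 4, 64, 64, device="cuda", dtype=torch.float16)
with torch.no_grad():
    image = vae.decode(latents).sample
```
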
Sai-Suraj-27
df2bc5ef28 fix: Fixed few docstrings according to the Google Style Guide (#7717)
Fixed few docstrings according to the Google Style Guide.
2024-05-20 10:26:05 -07:00
Aleksei Zhuravlev
a7bf77fc28 Passing cross_attention_kwargs to StableDiffusionInstructPix2PixPipeline (#7961)
* Update pipeline_stable_diffusion_instruct_pix2pix.py

Add `cross_attention_kwargs` to `__call__` method of `StableDiffusionInstructPix2PixPipeline`, which are passed to UNet.

* Update documentation for pipeline_stable_diffusion_instruct_pix2pix.py

* Update docstring

* Update docstring

* Fix typing import
2024-05-20 13:14:34 -04:00
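
With that argument in place, callers can forward attention-processor options (for example a LoRA scale) through the pipeline call; a hedged sketch (the model id, input image, and scale value are illustrative):

```python
import torch
from diffusers import StableDiffusionInstructPix2PixPipeline
from diffusers.utils import load_image

pipe = StableDiffusionInstructPix2PixPipeline.from_pretrained(
    "timbrooks/instruct-pix2pix", torch_dtype=torch.float16
).to("cuda")

image = load_image("input.jpg")

# cross_attention_kwargs is now forwarded to the UNet's attention processors.
edited = pipe(
    "make it look like a watercolor painting",
    image=image,
    cross_attention_kwargs={"scale": 0.7},
).images[0]
```
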
Junsong Chen
0f0defdb65 [docs] add doc for PixArtSigmaPipeline (#7857)
* 1. add doc for PixArtSigmaPipeline;

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Guillaume LEGENDRE <glegendre01@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
Co-authored-by: Bagheera <59658056+bghira@users.noreply.github.com>
Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Hyoungwon Cho <jhw9811@korea.ac.kr>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Tolga Cangöz <46008593+standardAI@users.noreply.github.com>
Co-authored-by: Philip Pham <phillypham@google.com>
2024-05-20 12:40:57 -04:00
Nikita
19df9f3ec0 Update pipeline_controlnet_inpaint_sd_xl.py (#7983) 2024-05-20 12:24:49 -04:00
Jacob Marks
d6ca120987 Fix typo in "attention" (#7977) 2024-05-20 11:54:29 -04:00
Sayak Paul
fb7ae0184f [tests] fix Pixart Sigma tests (#7966)
* checking tests

* checking ii.

* remove prints.

* test_pixart_1024

* fix 1024.
2024-05-19 20:56:31 +05:30
Sayak Paul
70f8d4b488 remove unsafe workflow. (#7967) 2024-05-17 13:46:24 +05:30
Álvaro Somoza
6c60e430ee Consistent SDXL Controlnet callback tensor inputs (#7958)
* make _callback_tensor_inputs consistent between sdxl pipelines

* forgot this one

* fix failing test

* fix test_components_function

* fix controlnet inpaint tests
2024-05-16 07:15:10 -10:00
Alphin Jain
1221b28eac Fix AttributeError in train_lcm_distill_lora_sdxl_wds.py (#7923)
Fix conditional teacher model check in train_lcm_distill_lora_sdxl_wds.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-16 15:49:54 +05:30
Liang Hou
746f603b20 Fix the text tokenizer name in logger warning of PixArt pipelines (#7912)
Fix CLIP to T5 in logger warning
2024-05-15 18:49:29 -10:00
Sai-Suraj-27
2afea72d29 refactor: Refactored code by Merging isinstance calls (#7710)
* Merged isinstance calls to make the code simpler.

* Corrected formatting errors using ruff.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-15 18:33:19 -10:00
Sayak Paul
0f111ab794 [Workflows] add a workflow that can be manually triggered on a PR. (#7942)
* add a workflow that can be manually triggered on a PR.

* remove sudo

* add command

* small fixes.
2024-05-15 17:18:56 +05:30
Guillaume LEGENDRE
4dd7aaa06f move to GH hosted M1 runner (#7949) 2024-05-15 13:47:36 +05:30
Isamu Isozaki
d27e996ccd Adding VQGAN Training script (#5483)
* Init commit

* Removed einops

* Added default movq config for training

* Update explanation of prompts

* Fixed inheritance of discriminator and init_tracker

* Fixed incompatible API between muse and this repo

* Fixed output

* Setup init training

* Basic structure done

* Removed attention for quick tests

* Style fixes

* Fixed vae/vqgan styles

* Removed redefinition of wandb

* Fixed log_validation and tqdm

* Nothing commit

* Added commit loss to lookup_from_codebook

* Update src/diffusers/models/vq_model.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Adding preliminary README

* Fixed one typo

* Local changes

* Fixed main issues

* Merging

* Update src/diffusers/models/vq_model.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Testing+Fixed bugs in training script

* Some style fixes

* Added wandb to docs

* Fixed timm test

* get testing suite ready.

* remove return loss

* remove return_loss

* Remove diffs

* Remove diffs

* fix ruff format

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-15 08:47:12 +05:30
Sayak Paul
72780ff5b1 [tests] decorate StableDiffusion21PipelineSingleFileSlowTests with slow. (#7941)
decorate StableDiffusion21PipelineSingleFileSlowTests with slow.
2024-05-14 14:26:21 -10:00
Jingyang Zhang
69fdb8720f [Pipeline] Adding BoxDiff to community examples (#7947)
add boxdiff to community examples
2024-05-14 11:18:29 -10:00
Nikita
b2140a895b Fix added_cond_kwargs when using IP-Adapter in StableDiffusionXLControlNetInpaintPipeline (#7924)
Fix `added_cond_kwargs` when using IP-Adapter

Fix error when using IP-Adapter in pipeline and passing `ip_adapter_image_embeds` instead of `ip_adapter_image`

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-14 10:32:08 -10:00
Sayak Paul
e0e8c58f64 [Core] separate the loading utilities in modeling similar to pipelines. (#7943)
separate the loading utilities in modeling similar to pipelines.
2024-05-14 22:33:43 +05:30
Sayak Paul
cbea5d1725 update to use hf-workflows for reporting the Docker build statuses (#7938)
update to use hf-workflows for reporting
2024-05-14 09:25:13 +05:30
Tolga Cangöz
a1245c2c61 Expansion proposal of diffusers-cli env (#7403)
* Expand `diffusers-cli env`

* SafeTensors -> Safetensors

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Move `safetensors_version = "not installed"` to `else`

* Update `safetensors_version` checking

* Add GPU detection for Linux, Mac OS, and Windows

* Add accelerator detection to environment command

* Add is_peft_version to import_utils

* Update env.py

* Add `huggingface_hub` reference

* Add `transformers` reference

* Add reference for `huggingface_hub`

* Fix print statement in env.py for unusual OS

* Up

* Fix platform information in env.py

* up

* Fix import order in env.py

* ruff

* make style

* Fix platform system check in env.py

* Fix run method return type in env.py

* 🤗

* No need f-string

* Remove location info

* Remove accelerate config

* Refactor env.py to remove accelerate config

* feat: Add support for `bitsandbytes` library in environment command

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-14 08:20:24 +05:30
bssrdf
cdda94f412 fix VAE loading issue in train_dreambooth (#7632)
* fixed vae loading issue #7619

* rerun make style && make quality

* bring back model_has_vae and change \ to / in config_file_name on Windows so the match works

* add missing import platform

* bring back import model_info

* make config_file_name OS independent

* switch to using Path.as_posix() to resolve OS dependence

* improve style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: bssrdf <bssrdf@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-14 08:19:53 +05:30
dependabot[bot]
5b830aa356 Bump transformers from 4.36.0 to 4.38.0 in /examples/research_projects/realfill (#7635)
Bump transformers in /examples/research_projects/realfill

Bumps [transformers](https://github.com/huggingface/transformers) from 4.36.0 to 4.38.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.36.0...v4.38.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-05-14 08:17:06 +05:30
Kohei
9e7bae9881 Update requirements.txt for text_to_image (#7892)
Update requirements.txt

If the datasets library is too old, it will not read metadata.jsonl and the label will default to a plain int.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-14 08:09:12 +05:30
rebel-kblee
b41ce1e090 fix multicontrolnet save_pretrained logic for compatibility (#7821)
fix multicontrolnet save_pretrained logic for compatibility

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-13 09:32:06 -10:00
Sayak Paul
95d3748453 [LoRA] Fix LoRA tests (side effects of RGB ordering) part ii (#7932)
* check

* check 2.

* update slices
2024-05-13 09:23:48 -10:00
Fabio Rigano
44aa9e566d fix AnimateDiff creation with a unet loaded with IP Adapter (#7791)
* Fix loading from_pipe

* Fix style

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-13 08:15:01 -10:00
Álvaro Somoza
fdb05f54ef Official callbacks (#7761) 2024-05-12 17:10:29 -10:00
HelloWorldBeginner
98ba18ba55 Add Ascend NPU support for SDXL. (#7916)
Co-authored-by: mhh001 <mahonghao1@huawei.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-12 13:34:23 +02:00
Sayak Paul
5bb38586a9 [Core] fix offload behaviour when device_map is enabled. (#7919)
fix offload behaviour when device_map is enabled.
2024-05-12 13:29:43 +02:00
Sai-Suraj-27
ec9e88139a fix: Fixed a wrong link to supported python versions in contributing.md file (#7638)
* Fixed a wrong link to python versions in contributing.md file.

* Updated the link to a permalink, so that it will permanently point to the specific line.
2024-05-12 13:21:18 +02:00
momo
e4f8dca9a0 add custom sigmas and timesteps for StableDiffusionXLControlNet pipeline (#7913)
add custom sigmas and timesteps
2024-05-11 23:33:19 -10:00
HelloWorldBeginner
0267c5233a fix bugs when using deepspeed in sdxl (#7917)
fix bugs when using deepspeed

Co-authored-by: mhh001 <mahonghao1@huawei.com>
2024-05-11 20:49:09 +02:00
Mark Van Aken
be4afa0bb4 #7535 Update FloatTensor type hints to Tensor (#7883)
* find & replace all FloatTensors to Tensor

* apply formatting

* Update torch.FloatTensor to torch.Tensor in the remaining files

* formatting

* Fix the rest of the places where FloatTensor is used as well as in documentation

* formatting

* Update new file from FloatTensor to Tensor
2024-05-10 09:53:31 -10:00
Sayak Paul
04f4bd54ea [Core] introduce videoprocessor. (#7776)
* introduce videoprocessor.

* fix quality

* address yiyi's feedback

* fix preprocess_video call.

* video_processor -> image_processor

* fix

* fix more.

* quality

* image_processor -> video_processor

* support List[List[PIL.Image.Image]]

* change to video_processor.

* documentation

* Apply suggestions from code review

* changes

* remove print.

* refactor video processor (part # 7776) (#7861)

* update

* update remove deprecate

* Update src/diffusers/video_processor.py

* update

* Apply suggestions from code review

* deprecate list of 5d for video and list of 4d for image + apply other feedbacks

* up

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add doc.

* tensor2vid -> postprocess_video.

* refactor preprocess with preprocess_video

* set default values.

* empty commit

* more refactoring of prepare_latents in animatediff vid2vid

* checking documentation

* remove documentation for now.

* fix animatediff sdxl

* fix test failure [part of video processor PR] (#7905)

up

* remove preceed_with_frames.

* doc

* fix

* fix

* remove video input as a single-frame video.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-10 21:02:36 +02:00
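
A minimal sketch of the new helper (the module path and method names follow the bullets above; the tensor layout and return structure are assumptions):

```python
import torch
from diffusers.video_processor import VideoProcessor

video_processor = VideoProcessor(vae_scale_factor=8)

# A fake decoded video: (batch, channels, frames, height, width) in [0, 1].
frames = torch.rand(1, 3, 8, 256, 256)

# postprocess_video replaces the old tensor2vid helper; with output_type="pil"
# it is expected to return one list of PIL frames per video in the batch.
pil_frames = video_processor.postprocess_video(frames, output_type="pil")
print(len(pil_frames[0]))
```
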
Sayak Paul
82be58c512 add missing image processors to the docs (#7910)
add missing processors.
2024-05-10 14:53:57 +02:00
Sayak Paul
6695635696 upgrade to python 3.10 in the Dockerfiles (#7893)
* upgrade to python 3.10

* fix

* try https://askubuntu.com/questions/1459694/can-not-find-python3-10-after-apt-get-installation

* fix

* up

* yes

* okay

* up

* up

* up

* up

* up

* check

* okay

* up

* i[

* fix
2024-05-10 14:29:08 +02:00
YiYi Xu
b934215d4c [scheduler] support custom timesteps and sigmas (#7817)
* support custom sigmas and timesteps, dpm euler

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-05-09 11:07:43 -10:00
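
A hedged sketch of the call-site effect: pipelines that route through `retrieve_timesteps` accept an explicit timestep list instead of `num_inference_steps` (and, with this change, custom sigmas as well). The model id and the 10-step schedule below are illustrative:

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# An explicit, decreasing timestep schedule (e.g. from an Align-Your-Steps-style recipe).
custom_timesteps = [999, 845, 730, 587, 443, 310, 193, 116, 53, 13]

image = pipe(
    "a cinematic photo of a lighthouse at dusk",
    timesteps=custom_timesteps,
).images[0]
```
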
YiYi Xu
5ed3abd371 fix _optional_components in StableCascadeCombinedPipeline (#7894)
* fix

* up
2024-05-09 06:32:55 -10:00
Dhruv Nair
1087a510b5 Set max parallel jobs on slow test runners (#7878)
* set max parallel

* update

* update

* update
2024-05-09 19:42:18 +05:30
Sayak Paul
305f2b4498 [Tests] fix things after #7013 (#7899)
* debugging

* save the resulting image

* check if order reversing works.

* checking values.

* up

* okay

* checking

* fix

* remove print
2024-05-09 16:05:35 +02:00
Dhruv Nair
cb0f3b49cb [Refactor] Better align from_single_file logic with from_pretrained (#7496)
* refactor unet single file loading a bit.

* retrieve the unet from create_diffusers_unet_model_from_ldm

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* tests

* update

* update

* update

* Update docs/source/en/api/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* Update docs/source/en/api/loaders/single_file.md

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/loaders/single_file.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update docs/source/en/api/loaders/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/loaders/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/loaders/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/loaders/single_file.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-09 19:00:19 +05:30
Tolga Cangöz
caf9e985df Fix several imports (#7712)
Fix imports
2024-05-09 07:34:44 +02:00
Tolga Cangöz
c1c42698c9 Remove dead code and fix f-string issue (#7720)
* Remove dead code

* Pylance reportGeneralTypeIssues: Strings nested within an f-string cannot use the same quote character as the f-string prior to Python 3.12.

* Remove dead code
2024-05-08 13:15:28 -10:00
Pierre Dulac
75aab34675 Allow users to save SDXL LoRA weights for only one text encoder (#7607)
SDXL LoRA weights for text encoders should be decoupled on save

The method checks if at least one of unet, text_encoder and
text_encoder_2 LoRA weights is passed, which was not reflected in the
implementation.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-08 10:41:58 -10:00
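
In practice this means `save_lora_weights` accepts any non-empty subset of the three state dicts; a hedged sketch (the state dicts below are tiny stand-ins for what a training script would produce):

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Stand-ins for the LoRA state dicts produced by a training loop.
unet_lora_layers = {"down_blocks.0.attn.to_q.lora_A.weight": torch.zeros(4, 4)}
text_encoder_lora_layers = {"text_model.q_proj.lora_A.weight": torch.zeros(4, 4)}

# After #7607, omitting text_encoder_2_lora_layers is accepted,
# as long as at least one state dict is provided.
StableDiffusionXLPipeline.save_lora_weights(
    save_directory="my-sdxl-lora",
    unet_lora_layers=unet_lora_layers,
    text_encoder_lora_layers=text_encoder_lora_layers,
)
```
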
YiYi Xu
35358a2dec fix offload test (#7868)
fix

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-08 07:59:08 -10:00
Aryan
818f760732 [Pipeline] AnimateDiff SDXL (#6721)
* update conversion script to handle motion adapter sdxl checkpoint

* add animatediff xl

* handle addition_embed_type

* fix output

* update

* add imports

* make fix-copies

* add decode latents

* update docstrings

* add animatediff sdxl to docs

* remove unnecessary lines

* update example

* add test

* revert conv_in conv_out kernel param

* remove unused param addition_embed_type_num_heads

* latest IPAdapter impl

* make fix-copies

* fix return

* add IPAdapterTesterMixin to tests

* fix return

* revert based on suggestion

* add freeinit

* fix test_to_dtype test

* use StableDiffusionMixin instead of different helper methods

* fix progress bar iterations

* apply suggestions from review

* hardcode flip_sin_to_cos and freq_shift

* make fix-copies

* fix ip adapter implementation

* fix last failing test

* make style

* Update docs/source/en/api/pipelines/animatediff.md

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* remove todo

* fix doc-builder errors

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-08 21:27:14 +05:30
Philip Pham
f29b93488d Check shape and remove deprecated APIs in scheduling_ddpm_flax.py (#7703)
`model_output.shape` may only have rank 1.

There are warnings related to use of random keys.

```
tests/schedulers/test_scheduler_flax.py: 13 warnings
  /Users/phillypham/diffusers/src/diffusers/schedulers/scheduling_ddpm_flax.py:268: FutureWarning: normal accepts a single key, but was given a key array of shape (1, 2) != (). Use jax.vmap for batching. In a future JAX version, this will be an error.
    noise = jax.random.normal(split_key, shape=model_output.shape, dtype=self.dtype)

tests/schedulers/test_scheduler_flax.py::FlaxDDPMSchedulerTest::test_betas
  /Users/phillypham/virtualenv/diffusers/lib/python3.9/site-packages/jax/_src/random.py:731: FutureWarning: uniform accepts a single key, but was given a key array of shape (1,) != (). Use jax.vmap for batching. In a future JAX version, this will be an error.
    u = uniform(key, shape, dtype, lo, hi)  # type: ignore[arg-type]
```
2024-05-08 13:57:19 +02:00
Tolga Cangöz
d50baf0c63 Fix image upcasting (#7858)
Fix image's upcasting before `vae.encode()` when using `fp16`

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-05-07 16:45:02 -10:00
Hyoungwon Cho
c2217142bd Modification on the PAG community pipeline (re) (#7876)
* edited_pag_implementation

* update

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2024-05-07 16:35:15 -10:00
Bagheera
8edaf3b79c 7879 - adjust documentation to use naruto dataset, since pokemon is now gated (#7880)
* 7879 - adjust documentation to use naruto dataset, since pokemon is now gated

* replace references to pokemon in docs

* more references to pokemon replaced

* Japanese translation update

---------

Co-authored-by: bghira <bghira@users.github.com>
2024-05-07 09:36:39 -07:00
Álvaro Somoza
23e091564f Fix for "no lora weight found module" with some loras (#7875)
* return layer weight if not found

* better system and test

* key example and typo
2024-05-07 13:54:57 +02:00
Steven Liu
0d23645bd1 [docs] Distilled inference (#7834)
* combine

* edits
2024-05-06 15:07:25 -07:00
Guillaume LEGENDRE
7fa3e5b0f6 Ci - change cache folder (#7867) 2024-05-06 17:55:24 +05:30
Steven Liu
49b959b540 [docs] LCM (#7829)
* lcm

* lcm lora

* fix

* fix hfoption

* edits
2024-05-03 16:08:27 -07:00
HelloWorldBeginner
58237364b1 Add Ascend NPU support for SDXL fine-tuning and fix the model saving bug when using DeepSpeed. (#7816)
* Add Ascend NPU support for SDXL fine-tuning and fix the model saving bug when using DeepSpeed.

* fix check code quality

* Decouple the NPU flash attention and make it an independent module.

* add doc and unit tests for npu flash attention.

---------

Co-authored-by: mhh001 <mahonghao1@huawei.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-03 08:14:34 -10:00
Dhruv Nair
3e35628873 Remove installing python again in container (#7852)
update
2024-05-03 15:09:15 +05:30
Lucain
6a479588db Respect resume_download deprecation (#7843)
* Deprecate resume_download

* align docstring with transformers

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-03 08:42:57 +02:00
Aritra Roy Gosthipaty
fa489eaed6 [Tests] reduce the model size in the blipdiffusion fast test (#7849)
reducing model size
2024-05-03 07:46:48 +05:30
Dhruv Nair
0d7c479023 Update deps for pipe test fetcher (#7838)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-02 20:36:47 +05:30
Guillaume LEGENDRE
ce97d7e19b Change GPU Runners (#7840)
* Move to new GPU Runners for slow tests

* Move to new GPU Runners for nightly tests
2024-05-02 18:48:46 +05:30
Guillaume LEGENDRE
44ba90caff move to new runners (#7839) 2024-05-02 14:53:38 +02:00
Dhruv Nair
3c85a57297 Update CI cache (#7832)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-02 14:03:35 +05:30
Dhruv Nair
03ca11318e Update download diff format tests (#7831)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-02 13:15:38 +05:30
Dhruv Nair
3ffa7b46e5 Fix hanging pipeline fetching (#7837)
update
2024-05-02 13:08:57 +05:30
yunseong Cho
c1b2a89e34 Fix key error for dictionary with randomized order in convert_ldm_unet_checkpoint (#7680)
fix key error for different order

Co-authored-by: yunseong <yunseong.cho@superlabs.us>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-05-02 10:29:55 +05:30
Aritra Roy Gosthipaty
435d37ce5a [Tests] reduce the model size in the audioldm fast test (#7833)
chore: initial size reduction of models
2024-05-02 06:03:52 +05:30
YiYi Xu
5915c2985d [ip-adapter] fix ip-adapter for StableDiffusionInstructPix2PixPipeline (#7820)
update prepare_ip_adapter_ for pix2pix
2024-05-01 06:27:43 -10:00
YiYi Xu
21a7ff12a7 update the logic of is_sequential_cpu_offload (#7788)
* up

* add comment to the tests + fix dit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-05-01 06:25:57 -10:00
Sayak Paul
8909ab4b19 [Tests] fix: device map tests for models (#7825)
* fix: device module tests

* remove patch file

* Empty-Commit
2024-05-01 18:45:47 +05:30
Dhruv Nair
c1edb03c37 Fix for pipeline slow test fetcher (#7824)
* update

* update
2024-05-01 17:36:54 +05:30
Steven Liu
0d08370263 [docs] Community pipelines (#7819)
* community pipelines

* feedback

* consolidate
2024-04-30 14:10:14 -07:00
Tolga Cangöz
b8ccb46259 Fix CPU offload in docstring (#7827)
Fix cpu offload
2024-04-30 10:53:27 -07:00
Dhruv Nair
725ead2f5e SSH Runner Workflow Update (#7822)
* add debug workflow

* update
2024-04-30 20:14:18 +05:30
Linoy Tsaban
26a7851e1e Add B-Lora training option to the advanced dreambooth lora script (#7741)
* add blora

* add blora

* add blora

* add blora

* little changes

* little changes

* remove redundancies

* fixes

* add B LoRA to readme

* style

* inference

* defaults + path to loras + generation

* minor changes

* style

* minor changes

* minor changes

* blora arg

* added --lora_unet_blocks

* style

* Update examples/advanced_diffusion_training/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add commit hash to B-LoRA repo cloning

* change inference, remove cloning

* change inference, remove cloning
add section about configurable unet blocks

* change inference, remove cloning
add section about configurable unet blocks

* Apply suggestions from code review

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-30 09:46:30 +05:30
Sayak Paul
3fd31eef51 [Core] introduce _no_split_modules to ModelMixin (#6396)
* introduce _no_split_modules.

* unnecessary spaces.

* remove unnecessary kwargs and style

* fix: accelerate imports.

* change to _determine_device_map

* add the blocks that have residual connections.

* add: CrossAttnUpBlock2D

* add: testin

* style

* line-spaces

* quality

* add disk offload test without safetensors.

* checking disk offloading percentages.

* change model split

* add: utility for checking multi-gpu requirement.

* model parallelism test

* splits.

* splits.

* splits

* splits.

* splits.

* splits.

* offload folder to test_disk_offload_with_safetensors

* add _no_split_modules

* fix-copies
2024-04-30 08:46:51 +05:30
Aritra Roy Gosthipaty
b02e2113ff [Tests] reduce the model size in the amused fast test (#7804)
* chore: reducing model sizes

* chore: shrinks further

* chore: shrinks further

* chore: shrinking model for img2img pipeline

* chore: reducing size of model for inpaint pipeline

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-30 08:11:26 +05:30
Aritra Roy Gosthipaty
21f023ec1a [Tests] reduce the model size in the ddpm fast test (#7797)
* chore: reducing unet size for faster tests

* review suggestions

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-30 08:11:03 +05:30
Aritra Roy Gosthipaty
31d9f9ea77 [Tests] reduce the model size in the ddim fast test (#7803)
chore: reducing model size for ddim fast pipeline

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-30 07:54:38 +05:30
Clint Adams
f53352f750 Set main_input_name in StableDiffusionSafetyChecker to "clip_input" (#7500)
FlaxStableDiffusionSafetyChecker sets main_input_name to "clip_input".
This makes StableDiffusionSafetyChecker consistent.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-29 11:39:59 -10:00
RuiningLi
83ae24ce2d Added get_velocity function to EulerDiscreteScheduler. (#7733)
* Added get_velocity function to EulerDiscreteScheduler.

* Fix white space on blank lines

* Added copied from statement

* back to the original.

---------

Co-authored-by: Ruining Li <ruining@robots.ox.ac.uk>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-29 10:32:13 -10:00
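The `get_velocity` helper added above is "copied from" the DDPM scheduler, so a v-prediction training target can be built the same way when training with `EulerDiscreteScheduler`. A small sketch, assuming the usual `(sample, noise, timesteps)` signature; shapes are placeholders:

```python
import torch
from diffusers import EulerDiscreteScheduler

scheduler = EulerDiscreteScheduler(num_train_timesteps=1000)

latents = torch.randn(2, 4, 64, 64)
noise = torch.randn_like(latents)
timesteps = torch.randint(0, scheduler.config.num_train_timesteps, (2,))

# For v-prediction models, the regression target is the "velocity", not the raw noise.
target = scheduler.get_velocity(latents, noise, timesteps)
```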
jschoormans
8af793b2d4 Adding TextualInversionLoaderMixin for the controlnet_inpaint_sd_xl pipeline (#7288)
* added TextualInversionMixIn to controlnet_inpaint_sd_xl pipeline


---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-29 09:00:53 -10:00
Dhruv Nair
eb96ff0d59 Safetensor loading in AnimateDiff conversion scripts (#7764)
* update

* update
2024-04-29 17:36:50 +05:30
Yushu
a38dd79512 [Pipeline] Fix error of SVD pipeline when num_videos_per_prompt > 1 (#7786)
swap the order for do_classifier_free_guidance concat with repeat

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-29 16:24:16 +05:30
Dhruv Nair
b1c5817a89 Add debugging workflow (#7778)
add debug workflow

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-29 13:44:39 +05:30
Nilesh
235d34cf56 Check for latents, before calling prepare_latents - sdxlImg2Img (#7582)
* Check for latents, before calling prepare_latents - sdxlImg2Img

* Added latents check for all the img2img pipeline

* Fixed silly mistake while checking latents as None
2024-04-28 14:53:29 -10:00
Jenyuan-Huang
5029673987 Update InstantStyle usage in IP-Adapter documentation (#7806)
* enable control ip-adapter per-transformer block on-the-fly


---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: ResearcherXman <xhs.research@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-28 10:34:57 -10:00
Sayak Paul
56bd7e67c2 [Scheduler] introduce sigma schedule. (#7649)
* introduce sigma schedule.

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* address yiyi

* update docstrings.

* implement the schedule for EDMDPMSolverMultistepScheduler

---------

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2024-04-27 07:40:35 +05:30
39th president of the United States, probably
9d16daaf64 Add DREAM training (#6381)
A new function compute_dream_and_update_latents has been added to the
training utilities that allows you to do DREAM rectified training in line
with the paper https://arxiv.org/abs/2312.00210. The method can be used
with an extra argument in the train_text_to_image.py script.

Co-authored-by: Jimmy <39@🇺🇸.com>
2024-04-27 07:19:15 +05:30
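A rough idea of how the utility described above slots into the training step of `train_text_to_image.py`; this is a fragment, the surrounding names (`unet`, `noise_scheduler`, `noisy_latents`, `target`, `encoder_hidden_states`, ...) come from that script, and the exact signature is assumed from the PR description:

```python
import torch.nn.functional as F
from diffusers.training_utils import compute_dream_and_update_latents

# Inside the training loop, after the usual noisy_latents/target have been built:
noisy_latents, target = compute_dream_and_update_latents(
    unet,
    noise_scheduler,
    timesteps,
    noise,
    noisy_latents,
    target,
    encoder_hidden_states,
    dream_detail_preservation=1.0,  # exposed as the extra CLI argument
)

model_pred = unet(noisy_latents, timesteps, encoder_hidden_states).sample
loss = F.mse_loss(model_pred.float(), target.float(), reduction="mean")
```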
Fabio Rigano
8e4ca1b6b2 [Docs] Update image masking and face id example (#7780)
* [Docs] Update image masking and face id example

* Update docs

* Fix docs
2024-04-26 12:51:11 -10:00
Beinsezii
0d2d424fbe Add PixArtSigmaPipeline to AutoPipeline mapping (#7783) 2024-04-26 09:10:20 -10:00
Steven Liu
e24e54fdfa [docs] Fix AutoPipeline docstring (#7779)
fix

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-26 10:09:36 -07:00
btlorch
ebc99a77aa Convert RGB to BGR for the SDXL watermark encoder (#7013)
* Convert channel order to BGR for the watermark encoder. Convert the watermarked BGR images back to RGB. Fixes #6292

* Revert the channel order before stacking images to work around the limitation that negative strides are currently not supported

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-25 14:44:53 -10:00
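The watermark fix above is essentially a channel-order shuffle around the encoder (which expects BGR) plus a `.copy()`, because `torch.from_numpy` rejects the negative strides produced by `[..., ::-1]`. A standalone illustration of that idea, not the pipeline code itself:

```python
import numpy as np

rgb = (np.random.rand(64, 64, 3) * 255).astype(np.uint8)

# RGB -> BGR for the watermark encoder; the reversed view has a negative stride.
bgr = rgb[:, :, ::-1]

# ... the watermark would be embedded into `bgr` here ...

# BGR -> RGB again; .copy() makes the array contiguous so torch.from_numpy accepts it.
rgb_again = bgr[:, :, ::-1].copy()
assert rgb_again.strides[-1] > 0
```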
Steven Liu
fa750a15bd [docs] Refactor image quality docs (#7758)
* refactor

* code snippets

* fix path

* fix path in guide

* code outputs

* align toctree title

* title

* fix title
2024-04-25 16:55:35 -07:00
Steven Liu
181688012a [docs] Reproducible pipelines (#7769)
* reproducibility

* feedback

* feedback

* fix path

* github link
2024-04-25 16:15:12 -07:00
Sayak Paul
142f353e1c Fix lora device test (#7738)
* fix lora device test

* fix more.

* fix more/

* quality

* empty

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-25 18:05:27 +05:30
Sayak Paul
b833d0fc80 [Tests] mark UNetControlNetXSModelTests::test_forward_no_control to be flaky (#7771)
decorate UNetControlNetXSModelTests::test_forward_no_control with is_flaky
2024-04-25 07:29:04 +05:30
Sayak Paul
e963621649 [PixArt] fix small nits in pixart sigma (#7767)
fix small nits in pixart sigma
2024-04-25 06:37:35 +05:30
Junsong Chen
39215aa30e PixArt-Sigma Implementation (#7654)
* support PixArt-DMD

---------

Co-authored-by: jschen <chenjunsong4@h-partners.com>
Co-authored-by: badayvedat <badayvedat@gmail.com>
Co-authored-by: Vedat Baday <54285744+badayvedat@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-04-23 22:33:08 -10:00
Dhruv Nair
9ef43f38d4 Fix test for consistency decoder. (#7746)
update
2024-04-24 12:28:11 +05:30
Dhruv Nair
88018fcf20 Fix failing VAE tiling test (#7747)
update
2024-04-24 12:27:45 +05:30
Steven Liu
7404f1e9dc [docs] Clean up toctree (#7715)
* toctree

* optim

* feedback

* improve overview
2024-04-23 09:30:33 -07:00
Sayak Paul
5a69227863 [Metadata utils] fix: json lines ordering. (#7744)
fix: json lines ordering.
2024-04-23 14:32:30 +05:30
Sai-Suraj-27
fc9fecc217 fix: Fixed a wrong decorator by modifying it to @classmethod (#7653)
* Fixed wrong decorator by modifying it to @classmethod.

* Updated the method and its argument.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-22 14:41:35 -10:00
Fabio Rigano
065f251766 Restore AttnProcessor2_0 in unload_ip_adapter (#7727)
* Restore AttnProcessor2_0 in unload_ip_adapter

* Fix style

* Update test

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-22 13:59:03 -10:00
Jenyuan-Huang
21c747fa0f Support InstantStyle (#7668)
* enable control ip-adapter per-transformer block on-the-fly

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: ResearcherXman <xhs.research@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-22 13:20:19 -10:00
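The InstantStyle support above boils down to `set_ip_adapter_scale` accepting per-block dictionaries instead of a single float, so the reference image can be injected only into selected transformer blocks. A sketch following the IP-Adapter docs; the checkpoint names and scale values are illustrative:

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="sdxl_models", weight_name="ip-adapter_sdxl.bin"
)

# Inject the reference image only into one up-block ("style" layers); everything else stays at 0.0.
pipe.set_ip_adapter_scale({"up": {"block_0": [0.0, 1.0, 0.0]}})
```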
Phil Butler
09129842e7 Remove redundant lines (#7396)
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-22 09:32:16 -10:00
Steven Liu
33b363edfa [docs] AutoPipeline (#7714)
* autopipeline

* edits

* feedback

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-22 10:15:07 -07:00
Dhruv Nair
a9dd86029e Fix Kandinsky V22 tests (#7699)
update
2024-04-22 15:41:59 +05:30
Dhruv Nair
9100652494 Update Wuerstchen Test (#7700)
update
2024-04-22 15:41:39 +05:30
Abhinav Gopal
d1e3f489e9 Animatediff Controlnet Community Pipeline IP Adapter Fix (#7413)
* fixed encode_image function signature in controlnet animatediff

* copied encode_image from stable diffusion pipeline

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-19 15:35:07 -10:00
Guillaume LEGENDRE
ae05050db9 fix/add tailscale key in case of failure (#7719)
add tailscale key in case of failure
2024-04-19 10:56:40 +02:00
Sai-Suraj-27
db969cc16d fix: Fixed type annotations for compatibility with python 3.8 (#7648)
* Fixed type annotations for compatibility with python 3.8

* Add required imports.
2024-04-18 19:34:09 -10:00
Dhruv Nair
3cfe187dc7 Cleanup ControlnetXS (#7701)
* update

* update
2024-04-18 19:32:00 -10:00
Dhruv Nair
90250d9e48 Cast height, width to int inside prepare latents (#7691)
update
2024-04-18 19:30:39 -10:00
YiYi Xu
e5674015f3 adding back test_conversion_when_using_device_map (#7704)
* style


* Fix device map nits (#7705)


---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-18 19:21:32 -10:00
Fabio Rigano
b5c8b555d7 Move IP Adapter Face ID to core (#7186)
* Switch to peft and multi proj layers

* Move Face ID loading and inference to core

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-18 14:13:27 -10:00
Guillaume LEGENDRE
e23c27e905 Add tailscale action to push_test (#7709) 2024-04-18 18:48:39 +05:30
Steven Liu
7635d3d37f [docs] Pipeline loading (#7684)
* pipelines

* schedulers and models

* community pipelines

* feedback
2024-04-17 15:42:27 -07:00
Wentian
9132ce7c58 [Docs] Update TGATE in section optimization. (#7698)
Update tgate.md
2024-04-17 09:37:24 -07:00
Sayak Paul
30c977d1f5 [Workflows] remove installation of redundant modules from flax PR tests (#7662)
remove installation of redundant modules from flax PR tests
2024-04-17 15:16:04 +05:30
Dhruv Nair
f0fa17dd8e Don't install PEFT with UV in slow tests (#7697)
* update

* update
2024-04-17 15:10:38 +05:30
Sai-Suraj-27
c726d02beb fix: Updated ruff configuration to avoid deprecated configuration warning (#7637)
Updated ruff configuration to avoid deprecated config.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-16 15:02:55 -10:00
Wentian
a68503f221 [Docs] Add TGATE in section optimization (#7639)
* Create tgate.md

* Update _toctree.yml

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/tgate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update tgate.md

* Update tgate.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-04-16 17:58:27 -07:00
Sayak Paul
9d50f7eec1 [Core] is_cosxl_edit arg in SDXL ip2p. (#7650)
* is_cosxl_edit arg in SDXL ip2p.

* Empty-Commit

Co-authored-by: Yiyi Xu <yixu310@gmail.com>

* doc

* remove redundant logic.

* reflect Dhruv's comments.

---------

Co-authored-by: Yiyi Xu <yixu310@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-16 22:15:55 +05:30
UmerHA
fda1531d8a Fixing implementation of ControlNet-XS (#6772)
* CheckIn - created DownSubBlocks

* Added extra channels, implemented subblock fwd

* Fixed connection sizes

* checkin

* Removed iter, next in forward

* Models for SD21 & SDXL run through

* Added back pipelines, cleared up connections

* Cleaned up connection creation

* added debug logs

* updated logs

* logs: added input loading

* Update umer_debug_logger.py

* log: Loading hint

* Update umer_debug_logger.py

* added logs

* Changed debug logging

* debug: added more logs

* Fixed num_norm_groups

* Debug: Logging all of SDXL input

* Update umer_debug_logger.py

* debug: updated logs

* checkin

* Readded tests

* Removed debug logs

* Fixed Slow Tests

* Added value checks | Updated model_cpu_offload_seq

* accelerate-offloading works ; fast tests work

* Made unet & addon explicit in controlnet

* Updated slow tests

* Added dtype/device to ControlNetXS

* Filled in test model paths

* Added image_encoder/feature_extractor to XL pipe

* Fixed fast tests

* Added comments and docstrings

* Fixed copies

* Added docs ; Updates slow tests

* Moved changes to UNetMidBlock2DCrossAttn

* tiny cleanups

* Removed stray prints

* Removed ip adapters + freeU

- Removed ip adapters + freeU as they don't make sense for ControlNet-XS
- Fixed imports of UNet components

* Fixed test_save_load_float16

* Make style, quality, fix-copies

* Changed loading/saving API for ControlNetXS

- Changed loading/saving API for ControlNetXS
- other small fixes

* Removed ControlNet-XS from research examples

* Make style, quality, fix-copies

* Small fixes

- deleted ControlNetXSModel.init_original
- added time_embedding_mix to StableDiffusionControlNetXSPipeline .from_pretrained / StableDiffusionXLControlNetXSPipeline.from_pretrained
- fixed copy hints

* checkin May 11 '23

* CheckIn Mar 12 '24

* Fixed tests for SD

* Added tests for UNetControlNetXSModel

* Fixed SDXL tests

* cleanup

* Delete Pipfile

* CheckIn Mar 20

Started replacing sub-blocks by `ControlNetXSCrossAttnDownBlock2D` and `ControlNetXSCrossAttnUpBlock2D`

* check-in Mar 23

* checkin 24 Mar

* Created init for UNetCnxs and CnxsAddon

* CheckIn

* Made from_modules, from_unet and no_control work

* make style,quality,fix-copies & small changes

* Fixed freezing

* Added gradient ckpt'ing; fixed tests

* Fix slow tests(+compile) ; clear naming confusion

* Don't create UNet in init ; removed class_emb

* Incorporated review feedback

- Deleted get_base_pipeline /  get_controlnet_addon for pipes
- Pipes inherit from StableDiffusionXLPipeline
- Made module dicts for cnxs-addon's down/mid/up classes
- Added support for qkv fusion and freeU

* Make style, quality, fix-copies

* Implemented review feedback

* Removed compatibility check for vae/ctrl embedding

* make style, quality, fix-copies

* Delete Pipfile

* Integrated review feedback

- Importing ControlNetConditioningEmbedding now
- get_down/mid/up_block_addon now outside class
- renamed `do_control` to `apply_control`

* Reduced size of test tensors

For this, added `norm_num_groups` as parameter everywhere

* Renamed cnxs-`Addon` to cnxs-`Adapter`

- `ControlNetXSAddon` -> `ControlNetXSAdapter`
- `ControlNetXSAddonDownBlockComponents` -> `DownBlockControlNetXSAdapter`, and similarly for mid/up
- `get_mid_block_addon` -> `get_mid_block_adapter`, and similarly for mid/up

* Fixed save_pretrained/from_pretrained bug

* Removed redundant code

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-16 21:56:20 +05:30
Sayak Paul
cf6e0407e0 don't install peft from the source with uv for now. (#7679) 2024-04-15 09:33:02 +05:30
Sayak Paul
1c000d46e1 fix: metadata token (#7631) 2024-04-15 08:32:27 +05:30
Sayak Paul
08bf754507 make docker-buildx mandatory. (#7652) 2024-04-13 07:26:34 +05:30
kabachuha
2f23437618 Add (Scheduled) Pseudo-Huber Loss training scripts to research projects (#7527)
* add scheduled pseudo-huber loss training scripts

See #7488

* add reduction modes to huber loss

* [DB Lora] *2 multiplier to huber loss because of the 1/2 a^2 convention.

pairing of c6495def1f

* [DB Lora] add option for smooth l1 (huber / delta)

Pairing of dd22958caa

* [DB Lora] unify huber scheduling

Pairing of 19a834c3ab

* [DB Lora] add snr huber scheduler

Pairing of 47fb1a6854

* fixup examples link

* use snr schedule by default in DB

* update all huber scripts with snr

* code quality

* huber: make style && make quality

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-13 07:26:08 +05:30
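For reference, the (scheduled) pseudo-Huber loss used by these research scripts is a smooth interpolation between L2 and L1 controlled by a parameter `huber_c`, which the later commits additionally schedule over timesteps/SNR. A hedged sketch of the core term only, with the *2 convention discussed above included and the scheduling left out:

```python
import torch

def pseudo_huber_loss(model_pred: torch.Tensor, target: torch.Tensor, huber_c: float = 0.1) -> torch.Tensor:
    # 2 * c * (sqrt((x - y)^2 + c^2) - c): behaves like L2 for small errors and like L1 for large ones.
    return (2 * huber_c * (torch.sqrt((model_pred - target) ** 2 + huber_c**2) - huber_c)).mean()

loss = pseudo_huber_loss(torch.randn(2, 4, 64, 64), torch.randn(2, 4, 64, 64))
```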
Benjamin Bossan
2523390c26 FIX Setting device for DoRA parameters (#7655)
Fix a bug that causes the call to set_lora_device to ignore the DoRA
parameters.
2024-04-12 13:55:46 +02:00
Sai-Suraj-27
279de3c3ff fix: Replaced deprecated logger.warn with logger.warning (#7643)
Fixed deprecated logger.warn with logger.warning.
2024-04-11 09:43:01 -10:00
Yiqin Zhao
8e14535708 Fixed YAML loading. (#7579) 2024-04-11 09:08:42 -10:00
dg845
0bee4d336b LCM Distill Scripts Fix Bug when Initializing Target U-Net (#6848)
* Initialize target_unet from unet rather than teacher_unet so that we correctly add time_embedding.cond_proj if necessary.

* Use UNet2DConditionModel.from_config to initialize target_unet from unet's config.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-11 07:52:12 -10:00
Steven Munn
42f25d601a Skip PEFT LoRA Scaling if the scale is 1.0 (#7576)
* Skip scaling if scale is identity

* move check for weight one to scale and unscale lora

* fix code style/quality

* Empty-Commit

---------

Co-authored-by: Steven Munn <stevenjmunn@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Munn <5297082+stevenjlm@users.noreply.github.com>
2024-04-11 11:02:31 +05:30
Sayak Paul
33c5d125cb [Core] fix img2img pipeline for Playground (#7627)
* playground vae encoding should use std and mean of the vae.

* style.

* fix-copies.
2024-04-11 09:07:38 +05:30
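The Playground fix above comes down to using the VAE's recorded `latents_mean`/`latents_std` when encoding for img2img instead of the plain `scaling_factor` multiply. A simplified sketch of that normalization (shapes and config access are illustrative, not the pipeline code):

```python
import torch

def normalize_playground_latents(latents, latents_mean, latents_std, scaling_factor):
    # Standardize with the VAE's statistics, then apply the usual scaling factor.
    mean = torch.tensor(latents_mean).view(1, -1, 1, 1).to(latents)
    std = torch.tensor(latents_std).view(1, -1, 1, 1).to(latents)
    return (latents - mean) * scaling_factor / std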
YiYi Xu
aa1f00fd01 Fix cpu offload related slow tests (#7618)
* fix

* up

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-04-10 14:53:45 -10:00
Steven Liu
d95b993427 [docs] T2I (#7623)
* refactor t2i

* add code snippets
2024-04-10 17:10:41 -07:00
Steven Liu
1d480298c1 [docs] Prompt enhancer (#7565)
* prompt enhance

* edits

* align titles

* feedback

* feedback

* feedback

* link to style
2024-04-10 16:09:06 -07:00
Sayak Paul
b2323aa2b7 [Tests] reduce the model sizes in the SD fast tests (#7580)
* give it a shot.

* print.

* correct assertion.

* gather results from the rest of the tests.

* change the assertion values where needed.

* remove print statements.
2024-04-10 11:36:28 -10:00
satani99
37e9d695af Modularize instruct_pix2pix SD inferencing during and after training in examples (#7603)
* Modularize instruct_pix2pix code

* quality check

* quality check

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-10 11:19:16 +05:30
Sayak Paul
a402431de0 [docs] remove duplicate tip block. (#7625)
remove duplicate tip block.
2024-04-10 10:31:11 +05:30
IDKiro
b99b1617cf add the option of upsample function for tiny vae (#7604)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-10 09:27:39 +05:30
Sayak Paul
3e4a6bd2d4 [Core] add "balanced" device_map support to pipelines (#6857)
* get device <-> component mapping when using multiple gpus.

* condition the device_map bits.

* relax condition

* device_map progress.

* device_map enhancement

* some cleaning up and debugging

* Apply suggestions from code review

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>

* incorporate suggestions from PR.

* remove multi-gpu condition for now.

* guard check the component -> device mapping

* fix: device_memory variable

* dispatching transformers model to have force_hooks=True

* better guarding for transformers device_map

* introduce support balanced_low_memory and balanced_ultra_low_memory.

* remove device_map patch.

* fix: intermediate variable scoping.

* fix: condition in cpu offload.

* fix: flax class restrictions.

* remove modifications from cpu_offload and model_offload

* incorporate changes.

* add a simple forward pass test

* add: torch_device in get_inputs()

* add: tests

* remove print

* safe-guard to(), model offloading and cpu offloading when balanced is used as a device_map.

* style

* remove .

* safeguard device_map with more checks and remove invalid device_mapping strategies.

* make  a class attribute and adjust tests accordingly.

* fix device_map check

* fix test

* adjust comment

* fix: device_map attribute

* fix: dispatching.

* max_memory test for pipeline

* version guard the tests

* fix guard.

* address review feedback.

* reset_device_map method.

* add: test for reset_hf_device_map

* fix a couple things.

* add reset_device_map() in the error message.

* add tests for checking reset_device_map doesn't have unintended consequences.

* fix reset_device_map and offloading tests.

* create _get_final_device_map utility.

* hf_device_map -> _hf_device_map

* add documentation

* add notes suggested by Marc.

* styling.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* move updates within gpu condition.

* other docs related things

* note on ignoring a device not specified in .

* provide a suggestion if device mapping errors out.

* fix: typo.

* _hf_device_map -> hf_device_map

* Empty-Commit

* add: example hf_device_map.

---------

Co-authored-by: Marc Sun <57196510+SunMarc@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-04-10 08:59:05 +05:30
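In practice the feature above looks like the following; the checkpoint ID is illustrative and the exact mapping depends on the available GPUs:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",  # illustrative checkpoint
    torch_dtype=torch.float16,
    device_map="balanced",             # spread whole components across the available GPUs
)
print(pipe.hf_device_map)              # e.g. {"unet": 0, "vae": 1, "text_encoder": 0, ...}

# .to(), model offloading and sequential CPU offloading are guarded while a device map
# is active; reset_device_map() returns the pipeline to its regular single-device form.
pipe.reset_device_map()
```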
Sayak Paul
c827e94da0 [Workflows] remove installation of libsndfile1-dev and libgl1 from workflows (#7543)
* remove libsndfile1-dev and libgl1 from workflows and ensure they are present in the respective dockerfiles.

* change to self-hosted runner; let's see 🤞

* add libsndfile1-dev libgl1 for now

* use self-hosted runners for building and push too.
2024-04-10 08:34:56 +05:30
Sayak Paul
44f6b859bf [Core] refactor transformer_2d forward logic into meaningful conditions. (#7489)
* refactor transformer_2d forward logic into meaningful conditions.

* Empty-Commit

* fix: _operate_on_patched_inputs

* fix: _operate_on_patched_inputs

* check

* fix: patch output computation block.

* fix: _operate_on_patched_inputs.

* remove print.

* move operations to blocks.

* more readability neats.

* empty commit

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Revert "Apply suggestions from code review"

This reverts commit 12178b1aa0.

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-10 08:33:19 +05:30
Sayak Paul
ac7ff7d4a3 add utilities for updating diffusers pipeline metadata. (#7573)
* add utilities for updating diffusers pipeline metadata.

* style

* remove first empty line
2024-04-10 08:28:49 +05:30
Fabio Rigano
a0cf607667 Multi-image masking for single IP Adapter (#7499)
* Support multiimage masking

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-04-09 09:20:57 -10:00
YiYi Xu
a341b536a8 disable test_conversion_when_using_device_map (#7620)
* disable test

* update

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-04-09 09:01:19 -10:00
Christopher Beckham
8e46d97cd8 Add missing restore() EMA call in train SDXL script (#7599)
* Restore UNet params from EMA when the validation call is finished

* empty commit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-09 18:07:55 +05:30
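The missing call above is the usual `EMAModel` store/copy/restore dance around validation: stash the live UNet weights, sample with the EMA weights, then put the training weights back. A fragment-level sketch using `diffusers.training_utils.EMAModel` (the surrounding `unet`/`ema_unet` come from the training script):

```python
# ema_unet was created earlier in the script, e.g.
#   from diffusers.training_utils import EMAModel
#   ema_unet = EMAModel(unet.parameters(), model_cls=UNet2DConditionModel, model_config=unet.config)

ema_unet.store(unet.parameters())    # keep the current training weights
ema_unet.copy_to(unet.parameters())  # run validation with the EMA weights
# ... generate and log validation images ...
ema_unet.restore(unet.parameters())  # the call this PR adds: restore the training weights
```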
Junjie
7e808e768a [Docs] fix bugs in callback docs (#7594) 2024-04-08 08:46:30 -10:00
w4ffl35
7e39516627 Allow more arguments to be passed to convert_from_ckpt (#7222)
Allow safety and feature extractor arguments to be passed to convert_from_ckpt

Allows management of safety checker and feature extractor
from outside of the convert ckpt class.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-08 10:13:48 +05:30
Nguyễn Công Tú Anh
56a76082ed Add AudioLDM2 TTS (#5381)
* add audioldm2 tts

* change gpt2 max new tokens

* remove unnecessary pipeline and class

* add TTS to AudioLDM2Pipeline

* add TTS docs

* delete unnecessary file

* remove unnecessary import

* add audioldm2 slow testcase

* fix code quality

* remove AudioLDMLearnablePositionalEmbedding

* add variable check vits encoder

* add use_learned_position_embedding

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-08 10:11:24 +05:30
YiYi Xu
6133d98ff7 [IF] add set_begin_index for all IF pipelines (#7577)
add set_begin_index for all IF pipelines
2024-04-05 06:54:07 -10:00
Sayak Paul
1c60e094de [Tests] reduce block sizes of UNet and VAE tests (#7560)
* reduce block sizes for unet1d.

* reduce blocks for unet_2d.

* reduce block size for unet_motion

* increase channels.

* correctly increase channels.

* reduce number of layers in unet2dconditionmodel tests.

* reduce block sizes for unet2dconditionmodel tests

* reduce block sizes for unet3dconditionmodel.

* fix: test_feed_forward_chunking

* fix: test_forward_with_norm_groups

* skip spatiotemporal tests on MPS.

* reduce block size in AutoencoderKL.

* reduce block sizes for vqmodel.

* further reduce block size.

* make style.

* Empty-Commit

* reduce sizes for ConsistencyDecoderVAETests

* further reduction.

* further block reductions in AutoencoderKL and AssymetricAutoencoderKL.

* massively reduce the block size in unet2dcontionmodel.

* reduce sizes for unet3d

* fix tests in unet3d.

* reduce blocks further in motion unet.

* fix: output shape

* add attention_head_dim to the test configuration.

* remove unexpected keyword arg

* up a bit.

* groups.

* up again

* fix
2024-04-05 10:08:32 +05:30
UmerHA
71f49a5d2a Skip test_freeu_enabled on MPS (#7570)
* Skip `test_freeu_enabled ` on MPS

* Small fixes

- import skip_mps correctly
- disable all instances of test_freeu_enabled

* Empty commit to trigger tests

* Empty commit to trigger CI
2024-04-04 12:16:04 +02:00
Abhinav Gopal
35db2fdea9 Update pipeline_animatediff_video2video.py (#7457)
* Update pipeline_animatediff_video2video.py

* commit with test for whether latent input can be passed into animatediffvid2vid
2024-04-03 19:34:28 +05:30
Sayak Paul
ad55ce6100 [Chore] increase number of workers for the tests. (#7558)
* increase number of workers for the tests.

* move to beefier runner.

* improve the fast push tests too.

* use a beefy machine for pytorch pipeline tests

* up the number of workers further.
2024-04-03 17:11:42 +05:30
Sayak Paul
a9a5b14f35 [Core] refactor transformers 2d into multiple init variants. (#7491)
* refactor transformers 2d into multiple legacy variants.

* fix: init.

* fix recursive init.

* add inits.

* make transformer block creation more modular.

* complete refactor.

* remove forward

* debug

* remove legacy blocks and refactor within the module itself.

* remove print

* guard caption projection

* remove fetcher.

* reduce the number of args.

* fix: norm_type

* group variables that are shared.

* remove _get_transformer_blocks

* harmonize the init function signatures.

* transformer_blocks to common

* repeat .
2024-04-03 12:56:17 +05:30
Beinsezii
aa19025989 UniPC Multistep add rescale_betas_zero_snr (#7531)
* UniPC Multistep add `rescale_betas_zero_snr`

Same patch as DPM and Euler with the patched final alpha cumprod

BF16 doesn't seem to break down, I think because UniPC already upcasts during
some phases? We could still force an upcast since it only loses ≈ 0.005 it/s
for me, but the difference in output is very small. A better endeavor might be
upcasting in step() and removing all the other upcasts elsewhere?

* UniPC ZSNR UT

* Re-add `rescale_betas_zsnr` doc oops
2024-04-02 17:23:55 -10:00
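The flag above gives UniPC the same zero-terminal-SNR rescaling as DPM and Euler. A minimal sketch; pairing it with `timestep_spacing="trailing"` is the usual ZSNR recipe rather than something this commit requires, and the checkpoint is illustrative:

```python
import torch
from diffusers import DiffusionPipeline, UniPCMultistepScheduler

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipe.scheduler = UniPCMultistepScheduler.from_config(
    pipe.scheduler.config,
    rescale_betas_zero_snr=True,   # zero-terminal-SNR betas, as for DPM/Euler
    timestep_spacing="trailing",   # commonly paired with ZSNR; an assumption here
)
```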
Beinsezii
19ab04ff56 UniPC Multistep fix tensor dtype/device on order=3 (#7532)
* UniPC UTs iterate solvers on FP16

It wasn't catching errors on order==3. Might be excessive?

* UniPC Multistep fix tensor dtype/device on order=3

* UniPC UTs Add v_pred to fp16 test iter

For completeness' sake. Probably overkill?
2024-04-02 15:41:29 -10:00
Sayak Paul
4a34307702 add: utility to format our docs too 📜 (#7314)
* add: utility to format our docs too 📜

* debugging saga

* fix: message

* checking

* should be fixed.

* revert pipeline_fixture

* remove empty line

* make style

* fix: setup.py

* style.
2024-04-02 20:49:43 +05:30
Bagheera
8e963d1c2a 7529 do not disable autocast for cuda devices (#7530)
* 7529 do not disable autocast for cuda devices

* Remove typecasting error check for non-mps platforms, as a correct autocast implementation makes it a non-issue

* add autocast fix to other training examples

* disable native_amp for dreambooth (sdxl)

* disable native_amp for pix2pix (sdxl)

* remove tests from remaining files

* disable native_amp on huggingface accelerator for every training example that uses it

* convert more usages of autocast to nullcontext, make style fixes

* make style fixes

* style.

* Empty-Commit

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-02 20:15:06 +05:30
Sayak Paul
2b04ec2ff7 [Tests] Speed up fast pipelines part II (#7521)
* start printing the tensors.

* print full throttle

* set static slices for 7 tests.

* remove printing.

* flatten

* disable test for controlnet

* what happens when things are seeded properly?

* set the right value

* style./

* make pia test fail to check things

* print.

* fix pia.

* checking for animatediff.

* fix: animatediff.

* video synthesis

* final piece.

* style.

* print guess.

* fix: assertion for control guess.

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-02 13:24:56 +05:30
Sayak Paul
000fa82a1e [Chore] remove class assignments for linear and conv. (#7553)
* remove class assignments for linear and conv.

* fix: self.nn
2024-04-02 13:01:04 +05:30
Sayak Paul
5d83f50c23 [Release tests] make nightly workflow dispatchable. (#7541)
* make nightly workflow dispatchable.

* add a note about running the release tests to setup.py
2024-04-02 12:21:17 +05:30
Dhruv Nair
5d21d4a204 Fix FreeU tests (#7540)
update
2024-04-02 11:05:50 +05:30
Álvaro Somoza
73ba81090e [Community pipeline] SDXL Differential Diffusion Img2Img Pipeline (#7550)
* initial-commit pipeline created

* updated README.md
2024-04-01 18:15:30 -10:00
YiYi Xu
7956c36aaa add a from_pipe method to DiffusionPipeline (#7241)
* add from_pipe



---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-01 13:02:00 -10:00
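`from_pipe` above builds a second pipeline class around the components of an existing one without re-loading or duplicating the weights. A sketch assuming the documented usage; the SAG pipeline is just an example target:

```python
import torch
from diffusers import StableDiffusionPipeline, StableDiffusionSAGPipeline

pipe_sd = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Reuse the already-loaded UNet, VAE, text encoder, etc. in another pipeline class.
pipe_sag = StableDiffusionSAGPipeline.from_pipe(pipe_sd)
assert pipe_sag.unet is pipe_sd.unet  # the shared modules cost no extra memory
```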
haikmanukyan
5266ab7935 add HD-Painter pipeline (#7520)
* add HD-Painter pipeline

* style fixing

* refactor, change doc, fix ruff

* fix docs

* used correct ruff version

---------

Co-authored-by: Hayk Manukyan <youremail@yourdomain.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-04-01 15:10:44 +05:30
YiYi Xu
7f724a930e fix the cpu offload tests (#7544)
fix
2024-04-01 14:27:14 +05:30
Jianbing Wu
9bef9f4be7 Fix SVD bug (shape of time_context) (#7268)
* Fix SVD bug (shape of `time_context`)

* Formatting code

* Formatting src/diffusers/models/transformers/transformer_temporal.py by `make style && make quality`

---------

Co-authored-by: kevinkhwu <kevinkhwu@tencent.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-04-01 14:05:52 +05:30
Dhruv Nair
7aa4514260 Fix typo in CPU offload test (#7542)
update
2024-03-31 22:07:17 -10:00
Bingxin Ke
c2e87869be [Community pipeline] Marigold depth estimation update -- align with marigold v0.1.5 (#7524)
* add resample option; check denoise_step; update ckpt path

* Add seeding in pipeline to increase reproducibility

* fix typo

* fix typo
2024-03-30 07:09:02 -10:00
Stephen
ca61287daa Fix IP Adapter Support for SAG Pipeline (#7260)
* fix ip adapter support

* Update sag pipelines tests, adjust sag pipeline to pass tests

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-30 06:15:29 -10:00
Beinsezii
f0c81562a4 Add final_sigma_zero to UniPCMultistep (#7517)
* Add `final_sigma_zero` to UniPCMultistep

Effectively the same trick as DDIM's `set_alpha_to_one` and
DPM's `final_sigma_type='zero'`.
Currently False by default but maybe this should be True?

* `final_sigma_zero: bool` -> `final_sigmas_type: str`

Should 1:1 match DPM Multistep now.

* Set `final_sigmas_type='sigma_min'` in UniPC UTs
2024-03-29 22:23:45 -10:00
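`final_sigmas_type` above mirrors DPM-Solver's option (and DDIM's `set_alpha_to_one`): with `"zero"` the last sigma in the schedule is forced to 0. Roughly:

```python
from diffusers import UniPCMultistepScheduler

# "zero" matches the DPM-Solver trick; "sigma_min" keeps the previous UniPC behaviour.
scheduler = UniPCMultistepScheduler(final_sigmas_type="zero")
```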
Hyoungwon Cho
9d20ed37a2 Perturbed-Attention Guidance (#7512)
* pag_initial

* pag_docs

* edit_docs

* custom

* typo

* delete_docs

* whitespace

* make style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-30 10:52:51 +05:30
Linoy Tsaban
bda1d4faf8 add Instant id sdxl image2image pipeline (#7507)
* initial commit - instantid img2img

* adapting to img2img

* change add_time_ids

* change add_time_ids

* WIP changes

* add strength to timesteps

* check insightface import

* style

* check insightface import changed to warning

* check insightface import changed to warning

* style

---------

Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
2024-03-30 10:25:21 +05:30
UmerHA
77103d71ca Quick-Fix for #7352 block-lora (#7523)
Fixed important typo
2024-03-30 06:42:28 +05:30
UmerHA
0302446819 Implements Blockwise lora (#7352)
* Initial commit

* Implemented block lora

- implemented block lora
- updated docs
- added tests

* Finishing up

* Reverted unrelated changes made by make style

* Fixed typo

* Fixed bug + Made text_encoder_2 scalable

* Integrated some review feedback

* Incorporated review feedback

* Fix tests

* Made every module configurable

* Adapter to new lora test structure

* Final cleanup

* Some more final fixes

- Included examples in `using_peft_for_inference.md`
- Added hint that only attns are scaled
- Removed NoneTypes
- Added test to check mismatching lens of adapter names / weights raise error

* Update using_peft_for_inference.md

* Update using_peft_for_inference.md

* Make style, quality, fix-copies

* Updated tutorial;Warning if scale/adapter mismatch

* floats are forwarded as-is; changed tutorial scale

* make style, quality, fix-copies

* Fixed typo in tutorial

* Moved some warnings into `lora_loader_utils.py`

* Moved scale/lora mismatch warnings back

* Integrated final review suggestions

* Empty commit to trigger CI

* Reverted empty commit to trigger CI

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-29 21:15:57 +05:30
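Block-wise LoRA above means `set_adapters` can take a nested dict of per-block scales instead of a single float (and, as noted, only attention layers are scaled). A sketch following the structure shown in the `using_peft_for_inference` tutorial this PR updates; the adapter name and values are illustrative, and `pipe` is assumed to already have a LoRA loaded:

```python
# e.g. pipe.load_lora_weights("some/lora-repo", adapter_name="style")  # hypothetical repo

scales = {
    "text_encoder": 0.5,
    "text_encoder_2": 0.5,                 # only present on SDXL-style pipelines
    "unet": {
        "down": 0.9,                       # one scale for all down-blocks
        "mid": 1.0,
        "up": {
            "block_0": 0.6,                # one scale for a whole up-block
            "block_1": [0.4, 0.8, 1.0],    # or one scale per transformer in the block
        },
    },
}
pipe.set_adapters("style", scales)
```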
Dhruv Nair
4d39b7483d Memory clean up on all Slow Tests (#7514)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-29 14:23:28 +05:30
Sayak Paul
fac761694a [Tests] Speed up some fast pipeline tests (#7477)
* speed up test_vae_slicing in animatediff

* speed up test_karras_schedulers_shape for attend and excite.

* style.

* get the static slices out.

* specify torch print options.

* modify

* test run with controlnet

* specify kwarg

* fix: things

* not None

* flatten

* controlnet img2img

* complete controlet sd

* finish more

* finish more

* finish more

* finish more

* finish the final batch

* add cpu check for expected_pipe_slice.

* finish the rest

* remove print

* style

* fix ssd1b controlnet test

* checking ssd1b

* disable the test.

* make the test_ip_adapter_single controlnet test more robust

* fix: simple inpaint

* multi

* disable panorama

* enable again

* panorama is shaky so leave it for now

* remove print

* raise tolerance.
2024-03-29 14:11:38 +05:30
YiYi Xu
34c90dbb31 fix OOM for test_vae_tiling (#7510)
use float16 and add torch.no_grad()
2024-03-29 08:22:39 +05:30
Lvkesheng Shen
e49c04d5d6 Bug fix for controlnetpipeline check_image (#7103)
* Bug fix for controlnetpipeline check_image

Bug fix for controlnetpipeline check_image when using multicontrolnet and prompt list

* Update test_inference_multiple_prompt_input function

* Update test_controlnet.py

add test for multiple prompts and multiple image conditioning

* Update test_controlnet.py

Fix format error

---------

Co-authored-by: Lvkesheng Shen <45848260+Fantast416@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-28 08:25:18 -10:00
YiYi Xu
f238cb0736 cpu_offload: remove all hooks before offload (#7448)
* add remove_all_hooks

* a few more fix and tests

* up

* Update src/diffusers/pipelines/pipeline_utils.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* split tests

* add

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-03-28 08:23:02 -10:00
Bagheera
d78acdedc1 apple mps: training support for SDXL (ControlNet, LoRA, Dreambooth, T2I) (#7447)
* apple mps: training support for SDXL LoRA

* sdxl: support training lora, dreambooth, t2i, pix2pix, and controlnet on apple mps

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-28 14:26:18 +05:30
Sayak Paul
6df103deba add: a helpful message when quality and repo consistency checks fail. (#7475) 2024-03-28 13:51:56 +05:30
Sayak Paul
73f28708be Improve nightly tests (#7385)
* flesh out the nightly tests

* address feedback.
2024-03-28 13:26:34 +05:30
Sayak Paul
0cbc78f04c [Modeling utils chore] import load_model_dict_into_meta only once (#7437)
import load_model_dict_into_meta only once
2024-03-28 13:01:53 +05:30
Thomas Liang
0cc5630945 [Chore] Fix Colab notebook links in README.md (#7495) 2024-03-27 12:36:36 -10:00
UmerHA
0b8e29289d Skip test_lora_fuse_nan on mps (#7481)
Skipping test_lora_fuse_nan on mps

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-27 14:35:59 +05:30
Sayak Paul
ab38ddf64f [chore] make the instructions on fetching all commits clearer. (#7474)
* make the instructions on fetching all commits clearer.

* Update setup.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-27 08:16:46 +05:30
YiYi Xu
ead82fedea fix torch.compile for multi-controlnet of sdxl inpaint (#7476)
fix

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-27 08:08:32 +05:30
Disty0
45b42d1203 Add device arg to offloading with combined pipelines (#7471)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-26 13:45:16 -10:00
Long(Tony) Lian
5199ee4f7b Fix missing raise statements in check_inputs (#7473)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-26 13:34:28 -10:00
Bagheera
544710ef0f diffusers#7426 fix stable diffusion xl inference on MPS when dtypes shift unexpectedly due to pytorch bugs (#7446)
* mps: fix XL pipeline inference at training time due to upstream pytorch bug

* Update src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* apply the safe-guarding logic elsewhere.

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-26 20:05:49 +05:30
M. Tolga Cangöz
443aa14e41 Fix Tiling in ConsistencyDecoderVAE (#7290)
* Fix typos

* Add docstring to `decode` method in `ConsistencyDecoderVAE`

* Fix tiling

* Enable tiled VAE decoding with customizable tile sample size and overlap factor

* Revert "Enable tiled VAE decoding with customizable tile sample size and overlap factor"

This reverts commit 181049675e.

* Add VAE tiling test for `ConsistencyDecoderVAE`

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-26 17:59:08 +05:30
Sayak Paul
288632adf6 [Training utils] add kohya conversion dict. (#7435)
* add kohya conversion dict.

* update readme

* typo

* add filename
2024-03-26 17:31:22 +05:30
Ernie Chu
5ce79cbded Update train_dreambooth_lora_sd15_advanced.py (#7433)
You cannot specify `type="bool"` and `action="store_true"` at the same time;
remove the redundant and buggy `type=bool`.

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2024-03-26 12:53:02 +02:00
Marçal Comajoan Cara
d52f3e30f8 Fix broken link (#7472)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-26 10:29:08 +05:30
Sayak Paul
699dfb084c feat: support DoRA LoRA from community (#7371)
* feat: support dora loras from community

* safe-guard dora operations under peft version.

* pop use_dora when False

* make dora lora from kohya work.

* fix: kohya conversion utils.

* add a fast test for DoRA compatibility..

* add a nightly test.
2024-03-26 09:37:33 +05:30
Sayak Paul
484c8ef399 [tests] skip dynamo tests when python is 3.12. (#7458)
skip dynamo tests when python is 3.12.
2024-03-26 08:39:48 +05:30
estelleafl
0dd0528851 Small ldm3d fix (#7464)
* fixed typo

* updated doc to be consistent in naming

* make style/quality

* preprocessing for 4 channels and not 6

* make style

* test for 4c

* make style/quality

* fixed test on cpu

* fixed doc typo

* changed default ckpt to 4c

* Update pipeline_stable_diffusion_ldm3d.py

* fix bug

---------

Co-authored-by: Aflalo <estellea@isl-iam1.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu33.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu38.rr.intel.com>
2024-03-25 15:33:43 -10:00
UmerHA
1cd4732e7f Fixed minor error in test_lora_layers_peft.py (#7394)
* Update test_lora_layers_peft.py

* Update utils.py
2024-03-25 11:35:27 -10:00
M. Tolga Cangöz
a51b6cc86a [Docs] Fix typos (#7451)
* Fix typos

* Fix typos

* Fix typos

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-25 11:48:02 -07:00
Dhruv Nair
3bce0f3da1 Fix for str_to_bool definition in testing utils (#7461)
update
2024-03-25 13:33:09 +05:30
Dhruv Nair
9a34953823 Additional Memory clean up for slow tests (#7436)
* update

* update

* update
2024-03-25 12:19:21 +05:30
Sayak Paul
e29f16cfaa [Research Projects] ORPO diffusion for alignment (#7423)
* barebones orpo

* remove reference model.

* full implementation

* change default of beta_orpo

* add a training command.

* fix: dataloading issues.

* interpreting the formulation.

* revert styling

* add: wds full blown version

* fix: per_gpu_batch_size

* start debugging

* debugging

* remove print

* fix

* remove filter keys.

* turn on non-blocking calls.

* device_placement

* let's see.

* add bigger training run command

* reinitialize generator for fair repro

* add: detailed readme and requirements

---------

Co-authored-by: Sayak Paul <sayakpaul@Sayaks-MacBook-Pro-2.local>
2024-03-25 08:37:41 +05:30
M. Tolga Cangöz
f7dfcfd971 [IP-Adapter] Fix IP-Adapter Support and Refactor Callback for StableDiffusionPanoramaPipeline (#7262)
* Add properties and `IPAdapterTesterMixin` tests for `StableDiffusionPanoramaPipeline`

* Update torch manual seed to use `torch.Generator(device=device)`

* Refactor 📞🔙 to support `callback_on_step_end`

* make fix-copies
2024-03-24 16:07:02 -10:00
Sayak Paul
3c67864c5a Remove distutils (#7455)
* strtobool

* replace Command from setuptools.
2024-03-25 06:44:53 +05:30
Aryan
363699044e [refactor] Fix FreeInit behaviour (#7410)
* fix freeinit impl

* fix progress bar

* fix progress bar and remove old code

* fix num_inference_steps==1 case for freeinit by atleast running 1 step when fast sampling enabled
2024-03-22 19:20:00 +05:30
Sayak Paul
9613576191 add: space for calculating memory usage. (#7414)
* add: space for calculating memory usage.

* Update docs/source/en/using-diffusers/loading.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-22 08:43:21 +05:30
YiYi Xu
e4356d6488 add a "Community Scripts" section (#7358)
* add

* add tiling

* fix

* fix

* fix

* give community script its own readme

* Update examples/community/README_community_scripts.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/community/README_community_scripts.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/community/README_community_scripts.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update examples/community/README_community_scripts.md

---------

Co-authored-by: Alexis Rolland <alexis.rolland@ubisoft.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-21 10:05:07 -10:00
Sayak Paul
82441460ef [Docs] add missing output image (#7425)
add missing output image
2024-03-21 09:22:06 -07:00
sayakpaul
3e1097cb63 Revert "add: space within docs to calculate memory usage."
This reverts commit 78990dd960.
2024-03-21 08:33:02 +05:30
sayakpaul
78990dd960 add: space within docs to calculate memory usage. 2024-03-21 08:32:37 +05:30
Yuanhao Zhai
405a1facd2 fix: enable unet_3d_condition to support time_cond_proj_dim (#7364)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-21 07:46:32 +05:30
M. Tolga Cangöz
3028089e5e Fix typos (#7411)
* Fix typos

* Fix typo in SVD.md
2024-03-20 18:46:47 -07:00
Sayak Paul
b536f39818 [Custom Pipelines with Custom Components] fix multiple things (#7304)
* checking to improve pipelines.

* more fixes.

* add: tip to encourage the usage of revision

* Apply suggestions from code review

* retrigger ci

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-20 18:49:00 +05:30
Sayak Paul
e25e525fde [LoRA test suite] refactor the test suite and cleanse it (#7316)
* cleanse and refactor lora testing suite.

* more cleanup.

* make check_if_lora_correctly_set a utility function

* fix: typo

* retrigger ci

* style
2024-03-20 17:13:52 +05:30
Sayak Paul
de9adb907c clean dep installation step in push_tests (#7382)
* clean dep installation step in push_tests

* fix: deps
2024-03-20 07:30:43 +05:30
Sayak Paul
bf861e65dc [Chore] add: fives names to citations. (#7395)
* add: four names to citations.

* add: steven
2024-03-20 06:37:57 +05:30
Dhruv Nair
4da810b943 Remove insecure torch.load calls (#7393)
update
2024-03-19 12:41:50 -10:00
Stephen
161c6e14b6 Change path to posix (modeling_utils.py) (#6781)
* Change path to posix

* running isort

* run style and quality checks

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-19 11:50:34 -10:00
laksjdjf
a6c9015c4e Fix ControlNetModel.from_unet not loading add_embedding (#7269)
* Fix ControlNetModel.from_unet not loading add_embedding

* delete white space in blank line

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-19 09:45:08 -10:00
PJC
e6a5f99e5c Update pipeline_controlnet_sd_xl_img2img.py (#7353)
* Update pipeline_controlnet_sd_xl_img2img.py

fix: safetensors load error

* fix for pass test

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-19 09:29:39 -10:00
Dhruv Nair
80ff4ba63e Fix issue with prompt embeds and latents in SD Cascade Decoder with multiple image embeddings for a single prompt. (#7381)
* fix

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-19 07:40:14 -10:00
Sayak Paul
b09a2aa308 [LoRA] fix cross_attention_kwargs problems and tighten tests (#7388)
* debugging

* let's see the numbers

* let's see the numbers

* let's see the numbers

* restrict tolerance.

* increase inference steps.

* shallow copy of cross_attentionkwargs

* remove print
2024-03-19 17:53:38 +05:30
YiYi Xu
63b6846849 [scheduler] fix a bug in add_noise (#7386)
* fix

* fix

* add a tests

* fix

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-03-19 00:50:58 -10:00
lawfordp2017
139f707e6e Correction for non-integral image resolutions with quantizations other than float32 (#7356)
* Correction for non-integral image resolutions with quantizations other than float32.

* Support for training, and use of diffusers-style casting.
2024-03-19 16:17:44 +05:30
Aryan
e4546fd5bb [docs] Add missing copied from statements in TCD Scheduler (#7360)
* add missing copied from statements in tcd scheduler

* update docstring

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-19 00:45:36 -10:00
Dhruv Nair
d44e31aec2 Add FreeInit Outputs to Docs Page (#7384)
* update

* fix
2024-03-19 14:13:41 +05:30
Sayak Paul
ce9825b56b [LoRA] pop the LoRA scale so that it doesn't get propagated to the weeds (#7338)
* pop scale from the top-level unet instead of getting it.

* improve readability.

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* fix a little bit.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-19 09:12:05 +05:30
M. Tolga Cangöz
85f9d92883 Fix conditional statement in test_schedulers.py (#7323)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-19 08:28:47 +05:30
M. Tolga Cangöz
916d9812a8 Update loading of config from a file in test_config.py (#7344)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 11:47:36 -10:00
M. Tolga Cangöz
e6a8492242 Use PyTorch's conventional inplace functions (#7332)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 09:12:15 -10:00
Beinsezii
ad0308b3f1 Add Cascade to Auto T2I + Decoder mappings (#7362)
* Add Cascade to Auto T2I + Decoder mappings

* ruff autofix

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 08:58:20 -10:00
M. Tolga Cangöz
e97a633b63 Update access of configuration attributes (#7343)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 08:53:29 -10:00
Sayak Paul
01ac37b331 [LoRA] Clean Kohya conversion utils (#7374)
* clean up the kohya_conversion utility

* state dict assignment
2024-03-18 06:53:37 -10:00
M. Tolga Cangöz
6a05b274cc Fix Typos (#7325)
* Fix PyTorch's convention for inplace functions

* Fix import structure in __init__.py and update config loading logic in test_config.py

* Update configuration access

* Fix typos

* Trim trailing white spaces

* Fix typo in logger name

* Revert "Fix PyTorch's convention for inplace functions"

This reverts commit f65dc4afcb.

* Fix typo in step_index property description

* Revert "Update configuration access"

This reverts commit 8d44e870b8.

* Revert "Fix import structure in __init__.py and update config loading logic in test_config.py"

This reverts commit 2ad5e8bca2.

* Fix typos

* Fix typos

* Fix typos

* Fix a typo: tranform -> transform
2024-03-18 09:48:40 -07:00
Anatoly Belikov
98d46a3f08 delete vae and text encoders after use in SDXL training script (#6693)
delete vae and text encoders after use

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 20:03:53 +05:30
Dhruv Nair
4330a747d4 [Tests] Fix ControlNet Single File tests (#7315)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-18 11:28:59 +05:30
Sayak Paul
76de6a09fb post-release v0.27.0 (#7329)
* post-release

* quality
2024-03-18 10:52:20 +05:30
Sayak Paul
25caf24ef9 Fix release workflow deps (#7339)
* pop scale from the top-level unet instead of getting it.

* improve readability.

* fix: pypi workflow deps

* revert
2024-03-16 07:18:11 +05:30
Abubakar Abid
8db3c9bc9f Adds docs for gradio.Interface.from_pipeline() (#7346)
* gradio docs

* Update docs/source/en/api/pipelines/stable_diffusion/overview.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* changes

* changes

* changes

* Update docs/source/en/api/pipelines/stable_diffusion/overview.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-16 07:11:28 +05:30
Sayak Paul
e0e9f81971 add: torch to the pypi step. (#7328) 2024-03-15 12:28:12 +05:30
M. Tolga Cangöz
5d848ec07c [Tests] Update a deprecated parameter in test files and fix several typos (#7277)
* Add properties and `IPAdapterTesterMixin` tests for `StableDiffusionPanoramaPipeline`

* Fix variable name typo and update comments

* Update deprecated `output_type="numpy"` to "np" in test files

* Discard changes to src/diffusers/pipelines/stable_diffusion_panorama/pipeline_stable_diffusion_panorama.py

* Update test_stable_diffusion_panorama.py

* Update numbers in README.md

* Update get_guidance_scale_embedding method to use timesteps instead of w

* Update number of checkpoints in README.md

* Add type hints and fix var name

* Fix PyTorch's convention for inplace functions

* Fix a typo

* Revert "Fix PyTorch's convention for inplace functions"

This reverts commit 74350cf65b.

* Fix typos

* Indent

* Refactor get_guidance_scale_embedding method in LEditsPPPipelineStableDiffusionXL class
2024-03-14 12:17:35 -07:00
Dhruv Nair
4974b84564 Update Cascade Tests (#7324)
* update

* update

* update
2024-03-14 20:51:22 +05:30
Linoy Tsaban
83062fb872 [Advanced DreamBooth LoRA SDXL] Support EDM-style training (follow up of #7126) (#7182)
* add edm style training

* style

* finish adding edm training feature

* import fix

* fix latents mean

* minor adjustments

* add edm to readme

* style

* fix autocast and scheduler config issues when using edm

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-14 18:40:14 +05:30
Suraj Patil
b6d7e31d10 add edm schedulers in doc (#7319)
* add edm schedulers in doc

* add in toctree

* address reviewe comments
2024-03-14 11:52:25 +01:00
Anatoly Belikov
53e9aacc10 log loss per image (#7278)
* log loss per image

* add commandline param for per image loss logging

* style

* debug-loss -> debug_loss

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-14 11:41:43 +05:30
Dhruv Nair
41424466e3 [Tests] Fix incorrect constant in VAE scaling test. (#7301)
update
2024-03-14 10:24:01 +05:30
Sayak Paul
95de1981c9 add: pytest log installation (#7313) 2024-03-14 10:01:16 +05:30
Kenneth Gerald Hamilton
0b45b58867 update get_order_list if statement (#7309)
* update get_order_list if statement

* revery
2024-03-13 18:29:42 -10:00
Beinsezii
d3986f18be Change step_offset scheduler docstrings (#7128)
* Change step_offset scheduler docstrings

* Mention it may be needed by some models

* More docstrings

These ones failed a literal search-and-replace because I performed it
case-sensitively, which is fun.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-13 15:12:00 -10:00
Alexander Bonnet
ee6a3a993d Fix typos in UNet2DConditionModel documentation (#7291)
* fix typo in UNet2DConditionModel documentation

* Fix indentation that may fix doc rendering

* Fix squished doc lines
2024-03-13 09:31:29 -07:00
Michael
b300517305 Add Intro page of TCD (#7259)
* add tcd intro

* resolve repos

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* revise NFEs related

* change inpainting location

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-13 09:21:51 -07:00
jnhuang
ac07b6dc6a Fix Wrong Text-encoder Grad Setting in Custom_Diffusion Training (#7302)
fix index in set textencoder grad

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-13 20:22:44 +05:30
Sayak Paul
46ab56a468 add: support for notifying maintainers about the nightly test status (#7117)
* add: support for notifying maintainers about the nightly test status

* add: a tempoerary workflow for validation.

* cancel in progress.

* runs-on

* clean up

* add: peft dep

* change device.

* multiple edits.

* remove temp workflow.
2024-03-13 16:48:11 +05:30
Sayak Paul
038ff70023 [PyPI publishing] feat: automate the process of pypi publication to some extent. (#7270)
* feat: automate the process of pypi publication to some extent.

* utility to fetch the latest release branch

* correct package name.
2024-03-13 16:27:59 +05:30
Manuel Brack
00eca4b887 [Pipeline] Add LEDITS++ pipelines (#6074)
* Setup LEdits++ file structure

* Fix import

* LEditsPP Stable Diffusion pipeline

* Include variable image aspect ratios

* Implement LEDITS++ for SDXL

* clean up LEditsPPPipelineStableDiffusion

* Adjust inversion output

* Added docu, more cleanup for LEditsPPPipelineStableDiffusion

* clean up LEditsPPPipelineStableDiffusionXL

* Update documentation

* Fix documentation import

* Add skeleton IF implementation

* Fix documentation typo

* Add LEDTIS docu to toctree

* Add missing title

* Finalize SD documentation

* Finalize SD-XL documentation

* Fix code style and quality

* Fix typo

* Fix return types

* added LEditsPPPipelineIF; minor changes for LEditsPPPipelineStableDiffusion and LEditsPPPipelineStableDiffusionXL

* Fix copy reference

* add documentation for IF

* Add first tests

* Fix batching for SD-XL

* Fix text encoding and perfect reconstruction for SD-XL

* Add tests for SD-XL, minor changes

* move user_mask to correct device, use cross_attention_kwargs also for inversion

* Example docstring

* Fix attention resolution for non-square images

* Refactoring for PR review

* Safely remove ledits_utils.py

* Style fixes

* Replace assertions with ValueError

* Remove LEditsPPPipelineIF

* Remove unnecessary input checks

* Refactoring of CrossAttnProcessor

* Revert unnecessary changes to scheduler

* Remove first progress-bar in inversion

* Refactor scheduler usage and reset

* Use imageprocessor instead of custom logic

* Fix scheduler init warning

* Fix error when running the pipeline in fp16

* Update documentation wrt perfect inversion

* Update tests

* Fix code quality and copy consistency

* Update LEditsPP import

* Remove enable/disable methods that are now in StableDiffusionMixin

* Change import in docs

* Revert import structure change

* Fix ledits imports

---------

Co-authored-by: Katharina Kornmeier <katharina.kornmeier@stud.tu-darmstadt.de>
2024-03-13 12:43:47 +02:00
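At a glance, the LEDITS++ pipelines above follow an invert-then-edit flow: `invert()` caches an inversion of the input image on the pipeline, and the subsequent call applies semantic edits via `editing_prompt`. A rough usage sketch based on the pipeline docs; argument names and values are indicative only and the image URL is a placeholder:

```python
import torch
from diffusers import LEditsPPPipelineStableDiffusion
from diffusers.utils import load_image

pipe = LEditsPPPipelineStableDiffusion.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = load_image("https://example.com/input.png")  # placeholder URL

# Step 1: invert the image; the result is cached on the pipeline.
_ = pipe.invert(image=image, num_inversion_steps=50, skip=0.1)

# Step 2: edit on top of the cached inversion.
edited = pipe(
    editing_prompt=["cherry blossom"],
    edit_guidance_scale=10.0,
    edit_threshold=0.75,
).images[0]
```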
Dhruv Nair
30132aba30 Update Stable Cascade Conversion Scripts (#7271)
* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-13 12:35:44 +05:30
Dhruv Nair
a17d6d6858 Update Cascade documentation (#7257)
* updates

* update

* update

* Update docs/source/en/api/pipelines/stable_cascade.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2024-03-13 11:29:59 +05:30
Sayak Paul
8efd9ce787 [Chore] clean residue from copy-pasting in the UNet single file loader (#7295)
clean residue from copy-pasting
2024-03-13 11:20:13 +05:30
Dhruv Nair
299c16d0f5 Fix loading Img2Img refiner components in from_single_file (#7282)
* update

* update

* update

* update
2024-03-13 09:25:53 +05:30
Dhruv Nair
69f49195ac Fix passing pooled prompt embeds to Cascade Decoder and Combined Pipeline (#7287)
* update

* update

* update

* update
2024-03-13 09:21:41 +05:30
Dhruv Nair
ed224f94ba Add single file support for Stable Cascade (#7274)
* update

* update

* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-13 08:37:31 +05:30
Sayak Paul
531e719163 [LoRA] use the PyTorch classes wherever needed and start depcrecation cycles (#7204)
* fix PyTorch classes and start deprecation cycles.

* remove args crafting for accommodating scale.

* remove scale check in feedforward.

* assert against nn.Linear and not CompatibleLinear.

* remove conv_cls and linear_cls.

* remove scale

* 👋 scale.

* fix: unet2dcondition

* fix attention.py

* fix: attention.py again

* fix: unet_2d_blocks.

* fix-copies.

* more fixes.

* fix: resnet.py

* more fixes

* fix i2vgenxl unet.

* deprecate scale gently.

* fix-copies

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* quality

* throw warning when scale is passed to the BasicTransformerBlock class.

* remove scale from signature.

* cross_attention_kwargs, very nice catch by Yiyi

* fix: logger.warn

* make deprecation message clearer.

* address final comments.

* maintain same deprecation message and also add it to activations.

* address yiyi

* fix copies

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* more deprecation

* fix-copies

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-13 07:56:19 +05:30
Sayak Paul
4fbd310fd2 [Chore] switch to logger.warning (#7289)
switch to logger.warning
2024-03-13 06:56:43 +05:30
Dhruv Nair
2ea28d69dc Change export_to_video default (#6990)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-12 17:13:12 +05:30
erliding
a1cb106459 instruct pix2pix pipeline: remove sigma scaling when computing classifier free guidance (#7006)
remove sigma scaling when computing cfg

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-11 10:23:31 +01:00
Sayak Paul
5dd8e04d4b [Dockerfiles] add: a workflow to check if docker containers can be built in case of modifications (#7129)
* add: a workflow to check if docker containers can be built if the files are modified.

* type

* unify docker image build test and push

* make it run on prs too.

* check

* check

* check

* check again.

* remove docker test build file.

* remove extra dependencies.

* check
2024-03-11 08:54:00 +05:30
pravdomil
165af7edd3 Inline InputPadder (#6582)
inline InputPadder
2024-03-09 11:24:07 -10:00
Haofan Wang
6c5f0de713 Support latents_mean and latents_std (#7132)
* update latents_mean and latents_std

* fix typos

* Update src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl_img2img.py

* format

---------

Co-authored-by: ResearcherXman <xhs.research@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-09 08:54:19 -10:00
pravdomil
e64fdcf2ce Fix gmflow_dir (#6583)
* remove sys.path

* update readme
2024-03-09 08:53:17 -10:00
Sayak Paul
ec64f371b1 [Chore] remove tf mention (#7245)
remove tf mention
2024-03-09 11:39:04 +05:30
Aryan
cd6e1f1171 [docs/nits] Fix return values based on return_dict and minor doc updates (#7105)
* fix returns and docs

* handle latent output_type correctly

* revert to old tensor2vid impl

* make fix-copies

* fix return in community animatediff pipes

* fix return docstring

* fix return docs

* add missing quote

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-03-08 18:47:24 -10:00
Xiaodong Wang
6f2b310a17 [UNet_Spatio_Temporal_Condition] fix default num_attention_heads in unet_spatio_temporal_condition (#7205)
fix default num_attention_heads in unet_spatio_temporal_condition

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-08 18:29:06 -10:00
YiYi Xu
e3cd6cae50 update the signature of from_single_file (#7216)
* update the signature of from_single_file

* Update src/diffusers/loaders/single_file.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/loaders/single_file.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/loaders/single_file.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/loaders/single_file.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-08 17:51:51 -10:00
qqii
e5ee05da76 [Community Pipeline] Skip Marigold depth_colored with color_map=None (#7170)
[Community Pipeline] Skip Marigold depth_colored generation by passing color_map=None
2024-03-08 17:51:11 -10:00
Mengqing Cao
e6ff752840 Add npu support (#7144)
* Add npu support

* fix for code quality check

* fix for code quality check
2024-03-08 17:12:55 -10:00
UmerHA
3f9c746fb2 Adds denoising_end parameter to ControlNetPipeline for SDXL (#6175)
* Initial commit

* Removed copy hints, as in original SDXLControlNetPipeline

Removed copy hints, as in the original SDXLControlNetPipeline, since `make fix-copies` seems to have issues with the @property decorator.

* Reverted changes to ControlNetXS

* Addendum to: Removed changes to ControlNetXS

* Added test+docs for mixture of denoiser

* Update docs/source/en/using-diffusers/controlnet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/controlnet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-08 16:42:02 -10:00
Steven Liu
1f22c98820 [docs] IP-Adapter image embedding (#7226)
* update

* fix parameter name

* feedback

* add no mask version

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-08 08:49:58 -08:00
Sayak Paul
b4226bd6a7 [Tests] fix config checking tests (#7247)
* debug

* cast tuples to lists.

* debug

* handle upcast attention

* handle downblock types for vae.

* remove print.

* address Dhruv's comments.

* fix: upblock types.

* upcast attention

* debug

* debug

* debug

* better guarding.

* style
2024-03-08 18:53:07 +05:30
Chi
46fac824be Solve missing clip_sample implementation in FlaxDDIMScheduler. (#7017)
* Added a new docstring to the class so that other developers can more easily understand what it does and where it is used.

* Update src/diffusers/models/unet_2d_blocks.py

This change was suggested by a maintainer.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/unet_2d_blocks.py

Add suggested text

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_blocks.py

I changed the Parameter to Args text.

* Update unet_2d_blocks.py

proper indentation set in this file.

* Update unet_2d_blocks.py

a little bit of change in the act_fun argument line.

* Ran the black command to reformat the code style

* Update unet_2d_blocks.py

Added a docstring similar to the one in the original diffusion repository.

* Fix bug mentioned in issue #6901

* Update src/diffusers/schedulers/scheduling_ddim_flax.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix linter

* Restore empty line

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-03-08 12:01:59 +01:00
Martin Müller
b33b64f595 Make mid block optional for flax UNet (#7083)
* make mid block optional for flax UNet

* make style
2024-03-08 11:35:06 +01:00
Sayak Paul
9d9744075e [Easy] fix: save_model_card utility of the DreamBooth SDXL LoRA script (#7258)
* fix: save_model_card utility.

* fix a little more to make it more lenient.

* remove lower()
2024-03-08 15:22:23 +05:30
Sayak Paul
d9a3b69806 [Utils] Improve " # Copied from ..." statements in the pipelines (#6917)
* copied from for t2i pipelines without ip adapter support.

* two more pipelines with proper copied from comments.

* revert to the original implementation
2024-03-08 14:42:26 +05:30
Sayak Paul
f7e5954d5e [Tests] fix: VAE tiling tests when setting the right device (#7246)
* debug

* checking

* fix more

* remove device.

* fix-copies
2024-03-08 10:01:25 +05:30
Sayak Paul
8e19c073e5 [Core] throw error when patch inputs and layernorm are provided for Transformers2D (#7200)
* throw error when patch inputs and layernorm are provided for transformers2d.

* add comment on supported norm_types in transformers2d

* more check

* fix: norm_type handling
2024-03-08 09:41:02 +05:30
Steven Liu
f6df16cbb8 [docs] Community tips (#7137)
* tips

* feedback

* callback only
2024-03-07 15:17:26 -08:00
pravdomil
b24f78349c use self.device (#6595)
* use self.device

* use device

* fix

* fix
2024-03-07 12:46:23 -10:00
Steven Liu
3ce905c9d0 [docs] Merge LoRAs (#7213)
* merge loras

* feedback

* torch.compile

* feedback
2024-03-07 11:28:50 -08:00
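
For context on the "Merge LoRAs" doc entry above, a minimal sketch of the documented pattern: load two adapters and weight them with `set_adapters`. The LoRA repository ids and adapter names below are illustrative placeholders.

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
# Placeholder LoRA repos; any two SDXL LoRAs work the same way.
pipe.load_lora_weights("some-user/toy-lora", adapter_name="toy")
pipe.load_lora_weights("some-user/pixel-lora", adapter_name="pixel")

# Combine both adapters, each scaled by its own weight.
pipe.set_adapters(["toy", "pixel"], adapter_weights=[0.7, 0.5])
```
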
bimsarapathiraja
f539497ab4 Remove the line. Using it creates wrong output (#7075)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-07 10:04:31 -08:00
Dhruv Nair
39dfb7abbd Raise an error when trying to use SD Cascade Decoder with dtype bfloat16 and torch < 2.2 (#7244)
update
2024-03-07 17:55:46 +05:30
Sayak Paul
196835695e fix: support for loading playground v2.5 single file checkpoint. (#7230)
* fix: support for loading playground v2.5 single file checkpoint.

* remove is_playground_model.

* fix: edm key

* apply Dhruv's comments, but it still errors.

* fix: things.

* delegate model_type inference to a function.

* address Dhruv's comment.

* address rest of the comments.

* fix: kwargs

* fix

* update

---------

Co-authored-by: DN6 <dhruv.nair@gmail.com>
2024-03-07 15:31:03 +05:30
Sayak Paul
0d4dfbbd0a [Examples] fix: prior preservation setting in DreamBooth LoRA SDXL script. (#7242)
fix: prior preservation setting in DreamBooth LoRA SDXL script.

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2024-03-07 15:19:58 +05:30
Rinne
ada3bb941b fix: remove duplicated code in TemporalBasicTransformerBlock. (#7212)
fix: remove duplicate code in TemporalBasicTransformerBlock.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-07 13:25:22 +05:30
Linoy Tsaban
b5814c5555 add DoRA training feature to sdxl dreambooth lora script (#7235)
* dora in canonical script

* add mention of DoRA to readme
2024-03-07 11:43:37 +05:30
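
The DoRA entries above build on PEFT's DoRA support. A minimal sketch of the one-flag change, assuming a `peft` version that exposes `use_dora` on `LoraConfig`; the target module names are the usual SDXL attention projections.

```python
from peft import LoraConfig

# Enabling DoRA is a single flag on top of a regular LoRA config.
unet_lora_config = LoraConfig(
    r=8,
    lora_alpha=8,
    use_dora=True,  # requires a peft release with DoRA support
    init_lora_weights="gaussian",
    target_modules=["to_k", "to_q", "to_v", "to_out.0"],
)
```
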
Paakhhi
9940573618 Refactor Prompt2Prompt: Inherit from DiffusionPipeline (#7211)
refactor: inherit from DiffusionPipeline instead of StableDiffusionPipeline
2024-03-06 19:34:40 -10:00
Sayak Paul
59433ca1ae [Core] move out the utilities from pipeline_utils.py (#7234)
move out the utilities from pipeline_utils.py
2024-03-07 08:45:24 +05:30
Nate Landman
534f5d54fa Update train_dreambooth_lora_sdxl_advanced.py (#7227)
Adding the `type` keyword gives you:

```
TypeError: _StoreTrueAction.__init__() got an unexpected keyword argument 'type'
```
2024-03-06 12:41:48 +01:00
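
The error above comes from `argparse` itself: a `store_true` action takes no value, so it rejects a `type` argument. A standalone sketch (the flag name is made up):

```python
import argparse

parser = argparse.ArgumentParser()
# A boolean flag: store_true needs no `type` because it never parses a value.
parser.add_argument("--train_text_encoder", action="store_true")

# Adding `type` as well raises:
# TypeError: _StoreTrueAction.__init__() got an unexpected keyword argument 'type'
# parser.add_argument("--train_text_encoder", action="store_true", type=bool)
args = parser.parse_args([])
```
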
Kashif Rasul
40aa47b998 [Pipeline] Wuerstchen v3 aka Stable Cascade pipeline (#6487)
* initial diffNext v3

* move to v3 folder

* imports

* dry up the unets

* no switch_level

* fix init

* add switch_level to config

* Fixed some things

* Added pooled text embeddings

* Initial work on adding image encoder

* changes from @dome272

* Stuff for the image encoder processing and variable naming in decoder

* fix arg name

* inference fixes

* inference fixes

* default TimestepBlock without conds

* c_skip=0 by default

* fix bfloat16 to cpu

* use config

* undo temp change

* fix gen_c_embeddings args

* change text encoding

* text encoding

* undo print

* undo .gitignore change

* Allow WuerstchenV3PriorPipeline to use the base DDPM & DDIM schedulers

* use WuerstchenV3Unet in both pipelines

* fix imports

* initial failing tests

* cleanup

* use scheduler.timesteps

* some fixes to the tests, still not fully working

* fix tests

* fix prior tests

* add dropout to the model_kwargs

* more tests passing

* update expected_slice

* initial rename

* rename tests

* rename class names

* make fix-copies

* initial docs

* autodocs

* typos

* fix arg docs

* add text_encoder info

* combined pipeline has optional image arg

* fix documentation

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* use self.config

* Update src/diffusers/pipelines/stable_cascade/modeling_stable_cascade_common.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* c_in -> in_channels

* removed kwargs from unet's forward

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove older callback api

* removed kwargs and fixed decoder guidance > 1

* decoder takes embeds

* check and use image_embeds

* fixed all but one decoder test

* fix decoder tests

* update callback api

* fix some more combined tests

* push combined pipeline

* initial docs

* fix doc_string

* update combined api

* no test_callback_inputs test for combined pipeline

* add optional components

* fix ordering of components

* fix combined tests

* update convert script

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade_prior.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade_prior.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/stable_cascade/pipeline_stable_cascade_prior.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* fix imports

* move effnet out of the denoising loop

* prompt_embeds_pooled only when doing guidance

* Fix repeat shape

* move StableCascadeUnet to models/unets/

* more descriptive names

* converted when numpy()

* StableCascadePriorPipelineOutput docs

* rename StableCascadeUNet

* add slow tests

* fix slow tests

* update

* update

* updated model_path

* add args for weights

* set push_to_hub to false

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

---------

Co-authored-by: Dominic Rampas <d6582533@gmail.com>
Co-authored-by: Pablo Pernias <pablo@pernias.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: 99991 <99991@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-03-06 15:07:25 +05:30
Jinay Jain
1bc0d37ffe [bug] Fix float/int guidance scale not working in StableVideoDiffusionPipeline (#7143)
* [bug] Fix float/int guidance scale not working in `StableVideoDiffusionPipeline`

* Add test to disable CFG on SVD

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-06 14:05:07 +05:30
bram-w
eb942b866a SDXL Turbo support and example launch (#6473)
* support and example launch for sdxl turbo

* White space fixes

* Trailing whitespace character

* ruff format

* fix guidance_scale and steps for turbo mode

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Radames Ajna <radamajna@gmail.com>
2024-03-06 11:51:01 +05:30
Michael
687bc27727 add TCD Scheduler (#7174)
* add: support TCD scheduler


---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-04 19:43:34 -10:00
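
The TCD scheduler added above follows the usual drop-in scheduler pattern. A sketch under that assumption; the base checkpoint id is the standard SDXL one, and the TCD-distilled LoRA repo id is intentionally omitted.

```python
from diffusers import StableDiffusionXLPipeline, TCDScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0"
)
# Swap the default scheduler for TCD, reusing the existing scheduler config.
pipe.scheduler = TCDScheduler.from_config(pipe.scheduler.config)

# A TCD-distilled LoRA would normally be loaded here as well (repo id omitted),
# after which inference runs with very few steps, e.g. pipe(prompt, num_inference_steps=4).
```
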
iczaw
6246c70d21 [Community] PromptDiffusion Pipeline (#6752)
* Create promptdiffusioncontrolnet.py

* Update __init__.py

Added PromptDiffusionControlNetModel

* Update __init__.py

Added PromptDiffusionControlNetModel

* Update promptdiffusioncontrolnet.py

* Create pipeline_prompt_diffusion.py

Added Prompt Diffusion pipeline.

* Create convert_original_promptdiffusion_to_diffusers.py

* Update convert_from_ckpt.py

Added download_promptdiffusion_from_original_ckpt, convert_promptdiffusion_checkpoint

* Update promptdiffusioncontrolnet.py

* Update pipeline_prompt_diffusion.py

* Update README.md

* Update pipeline_prompt_diffusion.py

* Delete src/diffusers/models/promptdiffusioncontrolnet.py

* Update __init__.py

* Update __init__.py

* Delete scripts/convert_original_promptdiffusion_to_diffusers.py

* Update convert_from_ckpt.py

* Update README.md

* Delete examples/community/pipeline_prompt_diffusion.py

* Create README.md

* Create promptdiffusioncontrolnet.py

* Create convert_original_promptdiffusion_to_diffusers.py

* Create pipeline_prompt_diffusion.py

* Update README.md

* Update pipeline_prompt_diffusion.py

* Update README.md

* Update pipeline_prompt_diffusion.py

* Update convert_original_promptdiffusion_to_diffusers.py

* Update promptdiffusioncontrolnet.py

* Update README.md

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-05 09:06:02 +05:30
Sayak Paul
577b8a2783 [Core] errors should be caught as soon as possible. (#7203)
errors should be caught as soon as possible.
2024-03-05 08:57:38 +05:30
Vinh H. Pham
13f0c8b219 [Docs] Update callback.md code example (#7150)
Update callback.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-05 08:44:19 +05:30
Aryan
fa1bdce3d4 [docs] Improve SVD pipeline docs (#7087)
* update svd docs

* fix example doc string

* update return type hints/docs

* update type hints

* Fix typos in pipeline_stable_video_diffusion.py

* make style && make fix-copies

* Update src/diffusers/pipelines/stable_video_diffusion/pipeline_stable_video_diffusion.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/stable_video_diffusion/pipeline_stable_video_diffusion.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update based on suggestion

---------

Co-authored-by: M. Tolga Cangöz <mtcangoz@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-03-04 12:12:38 -08:00
Thiago Crepaldi
ca6cdc77a9 Enable PyTorch's FakeTensorMode for EulerDiscreteScheduler scheduler (#7151)
* Enable FakeTensorMode for EulerDiscreteScheduler scheduler

PyTorch's FakeTensorMode does not support `.numpy()` or `numpy.array()`
calls.

This PR replaces the `sigmas` numpy array with an equivalent PyTorch tensor.

Repro

```python
import torch
from diffusers import DiffusionPipeline

# `ONNXTorchPatcher` (a torch.onnx internal) and `model_name` are assumed to be defined in the surrounding export script.
with torch._subclasses.FakeTensorMode() as fake_mode, ONNXTorchPatcher():
    fake_model = DiffusionPipeline.from_pretrained(model_name, low_cpu_mem_usage=False)
```

that otherwise would fail with
`RuntimeError: .numpy() is not supported for tensor subclasses.`

* Address comments
2024-03-04 09:19:59 -10:00
M. Tolga Cangöz
f4977abcd8 Fix typos (#7181)
* Fix typos

* Fix typos

* Fix typos and update documentation in lora.md
2024-03-04 10:28:23 -08:00
fpgaminer
df8559a7f9 Fix: UNet2DModel::__init__ type hints; fixes issue #4806 (#7175)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-04 09:50:34 -08:00
YiYi Xu
8f206a5873 fix a bug in from_config (#7192)
* fix

* fix

* update comment

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-03-04 07:18:59 -10:00
Linoy Tsaban
8da360aa12 [training scripts] add tags of diffusers-training (#7206)
* add tags for diffusers training

* add tags for diffusers training

* add tags for diffusers training

* add tags for diffusers training

* add tags for diffusers training

* add tags for diffusers training

* add dora tags for dreambooth lora scripts

* style
2024-03-04 22:17:25 +05:30
Dhruv Nair
869bad3e52 FIx torch and cuda version in ONNX tests (#7164)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-04 15:46:39 +05:30
Linoy Tsaban
01ee0978cc [advanced dreambooth lora sdxl] add DoRA training feature (#7072)
* add is_dora arg

* style

* add dora training feature to sd 1.5 script

* added notes about DoRA training

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-04 11:07:54 +01:00
Sayak Paul
56b68459f5 Update requirements.txt to remove huggingface-cli (#7202)
Internal message: https://huggingface.slack.com/archives/C03Q18WK18T/p1709529892062479
2024-03-04 11:29:01 +05:30
Vinh H. Pham
2ca264244b adding callback_on_step_end for StableDiffusionLDM3DPipeline (#7149)
* refactor callback

* run make style

* add copy

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-03 18:44:14 -10:00
Sayak Paul
b9e1c30d0e [Docs] more elaborate example for peft torch.compile (#7161)
more elaborate example for peft torch.compile
2024-03-04 08:55:30 +05:30
Sayak Paul
03cd62520f feat: add ip adapter benchmark (#6936)
* feat: add ip adapter benchmark

* sdxl support too.

* Empty-Commit
2024-03-03 15:10:55 +05:30
Álvaro Somoza
001b14023e [ip-adapter] fix problem using embeds with the plus version of ip adapters (#7189)
* initial

* check_inputs fix to the rest of pipelines

* add fix for no cfg too

* use of variable

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-03-02 21:37:00 -10:00
Junsong Chen
f55873b783 Fix PixArt 256px inference (#6789)
* feat 256px diffusers inference bug

* change the max_length of T5 to pipeline config file

* fix bug in convert_pixart_alpha_to_diffusers.py

* Update scripts/convert_pixart_alpha_to_diffusers.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* remove multi_scale_train parser

* Update src/diffusers/pipelines/pixart_alpha/pipeline_pixart_alpha.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Update src/diffusers/pipelines/pixart_alpha/pipeline_pixart_alpha.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* styling

* change `model_token_max_length` to call argument.

* Refactoring

* add: max_sequence_length to the docstring.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-03-03 10:31:21 +05:30
Sayak Paul
ccb93dcad1 Support EDM-style training in DreamBooth LoRA SDXL script (#7126)
* add: dreambooth lora script for Playground v2.5

* fix: kwarg

* address suraj's comments.

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* apply suraj's suggestion

* incorporate changes in the canonical script.

* tracker naming

* fix: schedule determination

* add: two simple tests

* remove playground script

* note about edm-style training

* address pedro's comments.

* address part of Suraj's comments.

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* remove guidance_scale.

* use mse_loss.

* add comments for preconditioning.

* quality

* Update examples/dreambooth/train_dreambooth_lora_sdxl.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* tackle v-pred.

* Empty-Commit

* support edm for sdxl too.

* address suraj's comments.

* Empty-Commit

---------

Co-authored-by: Suraj Patil <surajp815@gmail.com>
2024-03-03 09:28:57 +05:30
YiYi Xu
ec953047bc [stalebot] fix a bug (#7156)
fix

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-03-01 09:44:00 -10:00
Oleh
9a2600ede9 Map speedup (#6745)
* Speed up dataset mapping

* Fix missing columns

* Remove cache files cleanup

* Update examples/text_to_image/train_text_to_image_sdxl.py

* make style

* Fix code style

* style

* Empty-Commit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Quentin Lhoest <42851186+lhoestq@users.noreply.github.com>
Co-authored-by: Quentin Lhoest <lhoest.q@gmail.com>
2024-03-01 21:38:21 +05:30
Sayak Paul
5f150c4cef fix: loading problem for sdxl lora dreambooth (#7166) 2024-03-01 19:30:48 +05:30
Quentin Lhoest
66f8bd6869 Fix vae_encodings_fn hash in train_text_to_image_sdxl.py (#7171)
Update train_text_to_image_sdxl.py
2024-03-01 16:56:54 +05:30
Dhruv Nair
64a8cd627a [CI] Remove max parallel flag on slow test runners (#7162)
update
2024-03-01 11:38:21 +05:30
Sayak Paul
5d3923b670 Fix LCM benchmark test (#7158)
* make workflow dispatchable.

* fix: lcm lora compile
2024-03-01 10:33:44 +05:30
Sayak Paul
9451235e5a [Urgent][Docker CI] pin uv version for now and a minor change in the Slack notification (#7155)
pin uv version for now.
2024-03-01 10:11:07 +05:30
Sayak Paul
c2b6ac4e34 [CI] fix path filtering in the documentation workflows (#7153)
fix: path
2024-03-01 07:18:49 +05:30
YiYi Xu
06b01ea87e [ip-adapter] refactor prepare_ip_adapter_image_embeds and skip load image_encoder (#7016)
* add

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-29 15:38:49 -10:00
M. Tolga Cangöz
f4fc75035f [Docs] Fix typos (#7131)
* Add copyright notice to relevant files and fix typos

* Set `timestep_spacing` parameter of `StableDiffusionXLPipeline`'s scheduler to `'trailing'`.

* Update `StableDiffusionXLPipeline.from_single_file` by including EulerAncestralDiscreteScheduler with `timestep_spacing="trailing"` param.

* Update model loading method in SDXL Turbo documentation
2024-02-29 13:03:01 -08:00
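
The doc fix above sets `timestep_spacing="trailing"` on the ancestral Euler scheduler for SDXL Turbo. A sketch of that loading pattern, following the SDXL Turbo docs (single step, no CFG); treat the prompt as a placeholder.

```python
import torch
from diffusers import StableDiffusionXLPipeline, EulerAncestralDiscreteScheduler

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
)
# Turbo checkpoints expect trailing timestep spacing for few-step sampling.
pipe.scheduler = EulerAncestralDiscreteScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing"
)
image = pipe(
    "a cinematic photo of a fox", num_inference_steps=1, guidance_scale=0.0
).images[0]
```
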
Dhruv Nair
8f2d13c684 Fix setting fp16 dtype in AnimateDiff convert script. (#7127)
* update

* update
2024-02-29 22:47:39 +05:30
Sayak Paul
fcfa270fbd add: support for notifying the maintainers about the docker ci status. (#7113) 2024-02-29 19:28:52 +05:30
Sayak Paul
56dac1cedc limit documentation workflow runs for relevant changes. (#7125) 2024-02-29 19:01:54 +05:30
Sayak Paul
3daebe2b44 use uv for installing stuff in the workflows. (#7116)
* use uv for installing stuff in the workflows.

* fix: from source installation command when using uv.

* fix uv venv issue

* edit editable installation.

* fix quality installation

* checking

* make editable.

* more check

* check

* add: export step

* venv handling.

* checking.

* fix: dependency workflows.

* peft tests.

* proper way to initialize env.

* Empty-Commit

* Empty-Commit
2024-02-29 08:27:24 +05:30
Aryan
abd922bd0c [docs] unet type hints (#7134)
update
2024-02-28 10:39:01 -10:00
elucida
fa633ed6de refactor: move model helper function in pipeline to a mixin class (#6571)
* move model helper function in pipeline to EfficiencyMixin

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-28 09:26:39 -10:00
YiYi Xu
2cad1a8465 [stalebot] don't close the issue if the stale label is removed (#7106)
* fix

* fix

* remove stalebot's ability to close issues

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-28 08:04:28 -10:00
Dhruv Nair
e6cf21906d [Diffusers CI] Switch slow test runners (#7123)
update
2024-02-28 22:15:04 +05:30
Sayak Paul
7db935a141 fix kwarg in the SDXL LoRA DreamBooth (#7124)
* fix kwarg

* Empty-Commit
2024-02-28 10:21:28 +05:30
Sayak Paul
fa9bc029b4 [Tests] make test steps dependent on certain things and general cleanup of the workflows (#7026)
make tests conditional and other things.
2024-02-28 08:26:46 +05:30
Beinsezii
2e31a759b5 DPMSolverMultistep add rescale_betas_zero_snr (#7097)
* DPMMultistep rescale_betas_zero_snr

* DPM upcast samples in step()

* DPM rescale_betas_zero_snr UT

* DPMSolverMulti move sample upcast after model convert

Avoids having to re-use the dtype.

* Add a newline for Ruff
2024-02-27 11:37:34 -10:00
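
Following the commit above, `rescale_betas_zero_snr` can be toggled on the multistep DPM solver the same way it already works for DDIM. A hedged sketch; the checkpoint id is a placeholder.

```python
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config,
    rescale_betas_zero_snr=True,  # enforce zero terminal SNR, as added in the PR above
)
```
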
M. Tolga Cangöz
e51862bbed [Docs] Fix typos (#7118)
Fix typos, formatting and remove trailing whitespace
2024-02-27 12:38:00 -08:00
Suraj Patil
8492db2332 add DPM scheduler with EDM formulation (#7120)
* add DPM scheduler with EDM formulation

* set sigmas in init

* add _compute_sigmas

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* address some review comments

* up,

* add tests

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-02-27 23:39:38 +05:30
Suraj Patil
f57e7bd92c Add EDMEulerScheduler (#7109)
* Add EDMEulerScheduler

* address review comments

* fix import

* fix test

* add tests

* add co-author

Co-authored-by:  @dg845 dgu8957@gmail.com
2024-02-27 17:51:19 +05:30
Sayak Paul
3e3d46924b [Dockerfile] remove uv from docker jax tpu (#7115)
remove uv from docker jax tpu
2024-02-27 16:25:50 +05:30
Suraj Patil
d71ecad8cd denormalize latents with the mean and std if available (#7111)
* denormalize latents with the mean and std if available

* fix denormalize

* add latent mean and std in vae config

* address sayak's comment
2024-02-27 16:20:05 +05:30
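
The denormalization described above, written out as a small helper. This is a sketch of the math only, assuming channel-wise `latents_mean`/`latents_std` stored in the VAE config (as for Playground-style checkpoints); the helper name is made up.

```python
import torch

def denormalize_latents(latents, scaling_factor, latents_mean=None, latents_std=None):
    """Undo latent normalization before decoding with the VAE."""
    if latents_mean is not None and latents_std is not None:
        mean = torch.tensor(latents_mean).view(1, -1, 1, 1).to(latents.device, latents.dtype)
        std = torch.tensor(latents_std).view(1, -1, 1, 1).to(latents.device, latents.dtype)
        # Denormalize with the stored statistics when they are available ...
        return latents * std / scaling_factor + mean
    # ... otherwise fall back to the plain scaling-factor division.
    return latents / scaling_factor
```
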
Dhruv Nair
ac49f97a75 Add tests to check configs when using single file loading (#7099)
* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-27 15:47:23 +05:30
Sayak Paul
04bafcbbc2 move to uv in the Dockerfiles. (#7094)
move to uv in the Dockerfiles.
2024-02-27 15:11:28 +05:30
Sayak Paul
7081a25618 [Examples] Multiple enhancements to the ControlNet training scripts (#7096)
* log_validation unification for controlnet.

* additional fixes.

* remove print.

* better reuse and loading

* make final inference run conditional.

* Update examples/controlnet/README_sdxl.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* resize the control image in the snippet.

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2024-02-27 09:18:46 +05:30
Sayak Paul
848f9fe6ce [Core] pass revision in the loading_kwargs. (#7019)
* pass revision in the loading_kwarhs.

* remove revision from load_sub_model.
2024-02-27 08:52:38 +05:30
Younes Belkada
8a692739c0 FIX [PEFT / Core] Copy the state dict when passing it to load_lora_weights (#7058)
* copy the state dict in load lora weights

* fixup
2024-02-27 02:42:23 +01:00
Sayak Paul
5aa31bd674 [Easy] edit issue and PR templates (#7092)
edit templates to remove patrick's name.
2024-02-27 07:10:03 +05:30
jinghuan-Chen
88aa7f6ebf Make LoRACompatibleConv padding_mode work. (#6031)
* Make LoRACompatibleConv padding_mode work.

* Format code style.

* add fast test

* Update src/diffusers/models/lora.py

Simplify the code by patrickvonplaten.

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* code refactor

* apply patrickvonplaten suggestion to simplify the code.

* rm test_lora_layers_old_backend.py and add test case in test_lora_layers_peft.py

* update test case.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-26 14:05:13 -10:00
M. Tolga Cangöz
ad310af0d6 Fix EMA in train_text_to_image_sdxl.py (#7048)
* Fix typos
2024-02-26 10:39:57 -10:00
Dhruv Nair
d603ccb614 Small change to download in dance diffusion convert script (#7070)
* update

* make style
2024-02-26 12:05:19 +05:30
jiqing-feng
fd0f469568 Resize image before crop (#7095)
resize first

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-26 11:14:08 +05:30
Stephen
ae84e405a3 Pass use_linear_projection parameter to mid block in UNetMotionModel (#7035)
* pass linear projection parameter to mid block

* add cond_proj_dim to motion UNet

* run style and quality checks
2024-02-26 10:49:14 +05:30
Aryan
3a66113306 [Community] Bug fix + Latest IP-Adapter impl. for AnimateDiff img2vid/controlnet (#7086)
* fix img2vid; update to latest ip-adapter impl

* update README

* update animatediff controlnet to latest impl
2024-02-26 10:27:42 +05:30
Vinh H. Pham
7f16187182 Modularize Dreambooth LoRA SDXL inferencing during and after training (#6655)
* modularize log validation

* run make style

* revert import wandb

* fix code quality & import wandb

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-26 09:53:12 +05:30
Vinh H. Pham
f11b922b4f Modularize Dreambooth LoRA SD inferencing during and after training (#6654)
* modulize log validation

* run make style and refactor wanddb support

* remove redundant initialization

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-26 09:01:39 +05:30
Steven Liu
3dd4168d4c [docs] Minor updates (#7063)
* updates

* feedback
2024-02-25 09:38:02 -08:00
Fabio Rigano
1c47d1fc05 Fix head_to_batch_dim for IPAdapterAttnProcessor (#7077)
* Fix IPAdapterAttnProcessor

* Fix batch_to_head_dim and revert reshape
2024-02-25 00:05:46 -10:00
Aryan
bbf70c8739 Fix truthy-ness condition in pipelines that use denoising_start (#6912)
* fix denoising start

* fix tests

* remove debug
2024-02-24 23:39:22 -10:00
M. Tolga Cangöz
738c986957 [Refactor] StableDiffusionReferencePipeline inheriting from DiffusionPipeline (#7071)
Refactor StableDiffusionReferencePipeline to inherit from DiffusionPipeline rather than StableDiffusionPipeline
2024-02-23 22:04:31 -10:00
caiyueliang
c09bb588d3 fix: TensorRTStableDiffusionPipeline cannot set guidance_scale (#7065) 2024-02-23 14:59:02 -10:00
bimsarapathiraja
66a7160f9d Change images to image. The variable images is not used anywhere (#7074) 2024-02-23 10:40:21 -10:00
Chong-U Lim
f05ee56b2f Fix docstring of community pipeline imagic (#7062) 2024-02-23 09:37:52 -08:00
M. Tolga Cangöz
34cc7f9b98 Fix typos (#7068) 2024-02-23 09:24:51 -08:00
M. Tolga Cangöz
53605ed00a [Refactor] save_model_card function in text_to_image examples (#7051)
* Refactor save_model_card function to handle images and repo_folder parameters

* Discard changes to examples/text_to_image/train_text_to_image.py

* Discard changes to examples/text_to_image/train_text_to_image_lora_sdxl.py

* Update train_text_to_image_lora.py

* Update train_text_to_image_sdxl.py
2024-02-23 20:57:37 +05:30
Aryan
bb1b76d3bf IPAdapterTesterMixin (#6862)
* begin IPAdapterTesterMixin



---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-22 14:25:33 -10:00
YiYi Xu
e4b8f173b9 re-add unet refactor PR (#7044)
* add

* remove copied from

---------

Co-authored-by: ultranity <1095429904@qq.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-21 22:00:32 -10:00
Hezi Zisman
f0216b7756 allow explicit tokenizer & text_encoder in unload_textual_inversion (#6977)
* allow passing tokenizer & text_encoder to unload_textual_inversion


---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Fabio Rigano <fabio2rigano@gmail.com>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2024-02-21 16:53:25 -10:00
Lincoln Stein
d5f444de4b Update checkpoint_merger pipeline to pass the "variant" argument (#6670)
* make checkpoint_merger pipeline pass the "variant" argument to from_pretrained()

* make style

---------

Co-authored-by: Lincoln Stein <lstein@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-21 15:45:50 -10:00
M. Tolga Cangöz
5a54dc9e95 Fix typos in text_to_image examples (#7050)
Update copyright information and fix typos in text_to_image examples
2024-02-21 16:40:45 -08:00
YiYi Xu
6fedbd850a fix doc example for from_single_file (#7015)
* fix doc

* remove use_safetensors from signature

* more

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-21 19:11:21 +05:30
pravdomil
1b3cfb1b10 update header (#6596)
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-02-20 10:09:15 -08:00
Dhruv Nair
af13a90ebd Remove disable_full_determinism from StableVideoDiffusion xformers test. (#7039)
* update

* update
2024-02-20 21:43:23 +05:30
Dhruv Nair
3067da1261 Fix load_model_dict_into_meta for ControlNet from_single_file (#7034)
update
2024-02-20 11:01:02 +05:30
Dhruv Nair
6bceaea3fe Update ControlNet Inpaint single file test (#7022)
update
2024-02-20 10:58:30 +05:30
Dhruv Nair
baf9924be7 Fix alt text and image links in AnimateLCM docs (#7029)
update
2024-02-20 08:30:44 +05:30
Nontapat Kaewamporn
d8d208acde Support IP Adapter weight loading in StableDiffusionXLControlNetInpaintPipeline (#7031)
* support ip adapter loading

* fix style
2024-02-19 09:44:35 -10:00
Vinh H. Pham
e0f33dfca4 IP-Adapter support for StableDiffusionXLControlNetInpaintPipeline (#6941)
* add ip-adapter support

* support ip image embeds

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-19 08:20:24 -10:00
Dhruv Nair
15b125bb0e Add section on AnimateLCM to docs (#7024)
* update

* update

* update
2024-02-19 22:20:37 +05:30
ustcuna
12004bf3a7 [Community Pipelines] Accelerate inference of stable diffusion xl (SDXL) by IPEX on CPU (#6683)
* add stable_diffusion_xl_ipex community pipeline

* make style for code quality check

* update docs as suggested

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-02-19 13:39:08 +01:00
Dhruv Nair
d2fc5ebb95 [Refactor] FreeInit for AnimateDiff based pipelines (#6874)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update
2024-02-19 11:11:42 +05:30
YiYi Xu
779eef95b4 [from_single_file] pass torch_dtype to set_module_tensor_to_device (#6994)
fix

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-19 10:09:19 +05:30
Dhruv Nair
d5b8d1ca04 Fix Pixart Slow Tests (#6962)
* update

* update
2024-02-19 09:50:18 +05:30
Fabio Rigano
eba7e7a6d7 IP-Adapter attention masking (#6847)
* Add attention masking to attn processors

* Update tensor conversion

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-18 18:06:14 -10:00
Sayak Paul
31de879fb4 [IP2P] Make text encoder truly optional in InstructPix2Pix (#6995)
* make text encoder component truly optional.

* more fixes

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-18 20:21:22 +05:30
nbpppp
07349c25fe Fix deprecation warning for torch.utils._pytree._register_pytree_node in PyTorch 2.2 (#7008)
Fixed deprecation warning for torch.utils._pytree._register_pytree_node in PyTorch 2.2

Co-authored-by: Yinghua <yzho0423@uni.sydney.edu.au>
2024-02-18 20:11:33 +05:30
YiYi Xu
8974c50bff [SVD] fix a bug when passing image as tensor (#6999)
* fix

* update docstring

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-17 23:10:31 -10:00
Mikhail Koltakov
c18058b405 Fixed typos in docstrings of __init__() and in forward() of Unet3DConditionModel (#6663)
* Fixed typos in __init__ and in forward of Unet3DConditionModel

* Resolving conflicts

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-17 21:56:16 -10:00
Thomas Lips
2938d5a672 Add documentation for strength parameter in Controlnet_img2img pipelines (#6951)
copy docstring for `strength` from  stablediffusion img2img pipeline to controlnet img2img pipelines
2024-02-17 21:47:02 -10:00
Sayak Paul
d4ade821cd start deprecation cycle for lora_attention_proc 👋 (#7007) 2024-02-18 13:11:30 +05:30
Steven Liu
3a7e481611 [docs] Video generation (#6701)
* first draft

* fix path

* fix path

* i2vgen-xl

* review

* modelscopet2v

* feedback
2024-02-16 16:35:37 -08:00
Steven Liu
d649d6c6f3 [docs] Fix callout (#6998)
Update ip_adapter.md
2024-02-16 10:37:12 -10:00
Bhavay Malhotra
777063e1bf Update textual_inversion.py (#6952)
* Update textual_inversion.py

* Apply suggestions from code review

* Update textual_inversion.py

* Update textual_inversion.py

* Update textual_inversion.py

* Update textual_inversion.py

* Update examples/textual_inversion/textual_inversion.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update textual_inversion.py

* styling

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-16 15:39:51 +05:30
Stephen
104afbce84 Standardize model card for textual inversion sdxl (#6963)
* standardize model card

* fix tags

* correct import styling and update tags

* run make style and make quality

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-16 14:27:11 +05:30
co63oc
c0f5346a20 Fix procecss process (#6591)
* Fix words

* Fix

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-15 19:06:33 -10:00
Sayak Paul
087daee2f0 add: peft to the benchmark workflow (#6989) 2024-02-16 09:29:10 +05:30
Paakhhi
7e164d98a8 Fix diffusers import prompt2prompt (#6927)
* Bugfix: correct import for diffusers

* Fix: Prompt2Prompt example

* Format style

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-15 15:30:16 -10:00
Sayak Paul
e6d1728e0a [IP Adapters] feat: allow low_cpu_mem_usage in ip adapter loading (#6946)
* feat: allow low_cpu_mem_usage in ip adapter loading

* reduce the number of device placements.

* documentation.

* throw low_cpu_mem_usage warning only once from the main entry point.
2024-02-15 15:37:17 +05:30
Linoy Tsaban
8f2c7b4df0 [advanced sdxl lora script] - fix #6967 bug when using prior preservation loss (#6968)
* fix bug in micro-conditioning of class images

* fix bug in micro-conditioning of class images

* style
2024-02-15 12:20:05 +05:30
YiYi Xu
2e387dad5f fix IPAdapter unload_ip_adapter test (#6972)
add

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-14 20:42:40 -10:00
Steven Liu
9efe1e52c3 [docs] IP-Adapter (#6897)
* use cases

* first draft

* fix image links

* lcm-lora

* feedback

* review

* feedback

* feedback
2024-02-14 13:23:37 -08:00
Sayak Paul
37b09517b9 fix: controlnet inpaint single file. (#6975) 2024-02-14 19:04:57 +05:30
Sayak Paul
4343ce2c8e [Core] Harmonize single file ckpt model loading (#6971)
* use load_model_into_meta in single file utils

* propagate to autoencoder and controlnet.

* correct class name access behaviour.

* remove torch_dtype from load_model_into_meta; seems unnecessary

* remove incorrect kwarg

* style to avoid extra unnecessary line breaks
2024-02-14 10:49:06 +05:30
Younes Belkada
0ca7b68198 [PEFT / docs] Add a note about torch.compile (#6864)
* Update using_peft_for_inference.md

* add more explanation
2024-02-14 02:29:29 +01:00
Dhruv Nair
3cf4f9c735 Allow passing config_file argument to ControlNetModel when using from_single_file (#6959)
* update

* update

* update
2024-02-13 18:54:53 +05:30
Dhruv Nair
40dd9cb2bd Move SDXL T2I Adapter lora test into PEFT workflow (#6965)
update
2024-02-13 17:08:53 +05:30
Dhruv Nair
30bcda7de6 Fix flaky IP Adapter test (#6960)
update
2024-02-13 17:07:39 +05:30
YiYi Xu
9ea62d119a [DPMSolverSinglestepScheduler] correct get_order_list for solver_order=2 and lower_order_final=True (#6953)
* add

* change default

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-12 22:10:33 -10:00
Dhruv Nair
a326d61118 Fix configuring VAE from single file mixin (#6950)
* update
2024-02-12 22:10:05 -10:00
Alex Umnov
e7696e20f9 Updated lora inference instructions (#6913)
* Updated lora inference instructions

* Update examples/dreambooth/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update README.md

* Update README.md

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-13 09:35:20 +05:30
Piyush Thakur
4b89aeffe1 [Type annotations] fixed in save_model_card (#6948)
fixed type annotations

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-13 08:56:45 +05:30
Steven Liu
0a1daadef8 [docs] Community pipelines (#6929)
fix
2024-02-12 10:38:13 -08:00
Sayak Paul
371f765908 [Diffusers -> Original SD conversion] fix things (#6933)
* fix: bias loading bug

* fixes for SDXL

* apply changes to the conversion script to match single_file_utils.py

* do transpose to match the single file loading logic.
2024-02-12 17:30:22 +05:30
Piyush Thakur
75aee39eac [Model Card] standardize T2I Adapter Sdxl model card (#6947)
standardize model card template t21-adapter-sdxl
2024-02-12 16:43:20 +05:30
Dhruv Nair
215e6804d3 Unpin torch versions in CI (#6945)
* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-12 16:01:05 +05:30
Disty0
9254d1f39a Pass device to enable_model_cpu_offload in maybe_free_model_hooks (#6937) 2024-02-12 13:42:32 +05:30
Piyush Thakur
e1bdcc7af3 [Model Card] standardize T2I Sdxl Lora model card (#6944)
* standardize model card template t2i-lora-sdxl

* type annotations
2024-02-12 11:45:40 +05:30
Dhruv Nair
84905ca728 Update PixArt Alpha test module to match src module (#6943)
update
2024-02-12 11:01:33 +05:30
Piyush Thakur
6f336650c3 [Model Card] standardize T2I Sdxl model card (#6942)
standardize model card template t2i-sdxl
2024-02-12 10:01:20 +05:30
Piyush Thakur
06a042cd0e [Model Card] standardize T2I Lora model card (#6940)
standardize model card t2i-lora
2024-02-12 10:01:13 +05:30
Piyush Thakur
8772496586 [Model Card] standardize T2I model card (#6939)
* standardize model card

* fix base_model
2024-02-12 10:00:41 +05:30
dg845
35fd84be27 Replace hardcoded values in SchedulerCommonTest with properties (#5479)
---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-02-10 21:34:20 -10:00
YiYi Xu
f2756253e6 Fix a bug in AutoPipeline.from_pipe when switching pipeline with optional components (#6820)
* fix

* add tests

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-09 21:56:52 -10:00
YiYi Xu
0071478d9e allow attention processors to have different signatures (#6915)
add

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-09 21:56:19 -10:00
Sayak Paul
7c8cab313e post release 0.26.2 (#6885)
* post release

* style

* Empty-Commit

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-02-09 07:36:38 -10:00
Sayak Paul
ca9ed5e8d1 [LoRA] deprecate certain lora methods from the old backend. (#6889)
* deprecate certain lora methods from the old backend.

* uncomment necessary things.

* safe remove old lora backend 👋
2024-02-09 17:14:32 +01:00
Bingxin Ke
98b6bee1a1 [Community Pipeline][Bug Fix] marigold_depth_estimation: input image value range (#6787)
[FIX] IMPORTANT: rgb normalization
2024-02-09 16:55:37 +01:00
Dhruv Nair
ab7113487c More IPAdapter test fixes (#6888)
update
2024-02-09 13:54:58 +05:30
camaro
59c307f1d5 Standardize model card for Controlnet (#6910)
* controlnet

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-09 12:29:08 +05:30
Sayak Paul
159885adc6 correct hub_token exposition behaviour (thanks to @bghira). (#6918) 2024-02-08 18:38:27 -10:00
Aryan
7337eea59b [refactor] unnecessary lines in decode_latents in video pipelines (#6682)
* refactor decode latents in video pipelines

* make fix-copies
2024-02-08 17:10:52 -10:00
camaro
f07899a57c Standardize model card for Controlnet SDXL (#6908)
controlnet-sdxl
2024-02-09 07:53:39 +05:30
camaro
a83cc0c0bc Standardize model card for Controlnet flax (#6909)
* controlnet-flax

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-09 07:52:56 +05:30
C Q
db5194a45d Fix Compatibility Issues in stable_diffusion_xl_reference.py (#6251)
* Fix Compatibility Issues in stable_diffusion_xl_reference.py

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-08 10:52:59 -10:00
shaoxiaowang
e6c9c2513f fix examples/community/pipeline_stable_diffusion_xl_instantid.py (#6759)
Co-authored-by: wangshaoxiao <wangshaoxiao@xiaomi.com>
2024-02-08 10:09:40 -10:00
Kyunghwan Kim
d643b6691f Add vae tiling and slicing in img2img and inpaint (#6871)
* Add vae tiling in img2img and inpaint

* Add vae tiling not slicing
2024-02-08 09:50:00 -10:00
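
With the commit above, the memory-saving VAE switches become available on the img2img (and inpaint) pipelines too. A short sketch; the checkpoint id is a placeholder.

```python
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.enable_vae_tiling()   # decode large images tile by tile
pipe.enable_vae_slicing()  # encode/decode batches one slice at a time
```
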
Michael
f5c9be3a0a Remove <cat-toy> validation prompt from examples/textual_inversion/textual_inversion_sdxl.py (#6877)
Remove <cat-toy> validation prompt from textual_inversion_sdxl.py

The `<cat-toy>` validation prompt is a default choice for the example task in the README. But no other part of `textual_inversion_sdxl.py` references the cat toy and `textual_inversion.py` has a default validation prompt of `None` as well.

So bring `textual_inversion_sdxl.py` in line with `textual_inversion.py` and change default validation prompt to `None`
2024-02-08 09:46:06 -10:00
Laisky.Cai
1824d0050e fix: load_image should support PIL Image (#6904)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-08 08:25:37 -10:00
Sayak Paul
30e5e81d58 change to 2024 in the license (#6902)
change to 2024
2024-02-08 08:19:31 -10:00
Masamune Ishihara
8de78001df Add fps argument to export_to_gif function. (#6786) 2024-02-08 21:59:51 +05:30
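
A one-line illustration of the new `fps` argument mentioned above; the stand-in frames are generated locally, whereas in practice they would come from a video or animation pipeline.

```python
from PIL import Image
from diffusers.utils import export_to_gif

# Stand-in frames; normally produced by e.g. an AnimateDiff pipeline.
frames = [Image.new("RGB", (64, 64), (i * 16, 0, 0)) for i in range(16)]
export_to_gif(frames, "animation.gif", fps=8)
```
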
Patryk Bartkowiak
3ac2357794 changed positional parameters to named parameters like in docs (#6905)
Co-authored-by: Patryk Bartkowiak <patryk.bartkowiak@tcl.com>
Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2024-02-08 21:39:03 +05:30
Ehsan Akhgari
17808a091e Fix bug when converting checkpoint to diffusers format (#6900)
This fixes #6899.
2024-02-08 18:52:11 +05:30
Sayak Paul
491a933a1b [I2VGenXL] attention_head_dim in the UNet (#6872)
* attention_head_dim

* debug

* print more info

* correct num_attention_heads behaviour

* down_block_num_attention_heads -> num_attention_heads.

* correct the image link in doc.

* add: deprecation for num_attention_head

* fix: test argument to use attention_head_dim

* more fixes.

* quality

* address comments.

* remove deprecation.
2024-02-08 12:30:14 +05:30
Sayak Paul
aa82df52e7 [IP Adapters] introduce ip_adapter_image_embeds in the SD pipeline call (#6868)
* add: support for passing ip adapter image embeddings

* debugging

* make feature_extractor unloading conditioned on safety_checker

* better condition

* type annotation

* index to look into value slices

* more debugging

* debugging

* serialize embeddings dict

* better conditioning

* remove unnecessary prints.

* Update src/diffusers/loaders/ip_adapter.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* make fix-copies and styling.

* styling and further copy fixing.

* fix: check_inputs call in controlnet sdxl img2img pipeline

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-08 11:10:10 +05:30
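
The commit above lets callers pass precomputed IP-Adapter image embeddings directly. A hedged sketch: the adapter checkpoint follows the usual IP-Adapter setup, and the embeddings file is assumed to have been computed and serialized in an earlier run.

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")

# Reuse embeddings serialized earlier instead of re-encoding the reference image.
image_embeds = torch.load("ip_adapter_image_embeds.pt")

image = pipe(
    "a polar bear sitting in a chair drinking a milkshake",
    ip_adapter_image_embeds=image_embeds,
    num_inference_steps=30,
).images[0]
```
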
Srimanth Agastyaraju
a11b0f83b7 Fix: training resume from fp16 for SDXL Consistency Distillation (#6840)
* Fix: training resume from fp16 for lcm distill lora sdxl

* Fix coding quality - run linter

* Fix 1 - shift mixed precision cast before optimizer

* Fix 2 - State dict errors by removing load_lora_into_unet

* Update train_lcm_distill_lora_sdxl.py - Revert default cache dir to None

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-08 11:09:29 +05:30
Sayak Paul
1835510524 Remove torch_dtype in to() to end deprecation (#6886)
* remove torch_dtype from to()

* remove torch_dtype from usage scripts.

* remove old lora backend

* Revert "remove old lora backend"

This reverts commit adcddf6ba4.
2024-02-08 09:38:57 +05:30
camaro
4a3d52850b fix: keyword argument mismatch (#6895) 2024-02-08 09:37:56 +05:30
YiYi Xu
97d004b9b4 [ip-adapter] make sure length of scale is same as number of ip-adapters when using set_ip_adapter_scale (#6884)
add

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-07 10:13:12 -10:00
Sayak Paul
76696dca55 [Model Card] standardize dreambooth model card (#6729)
* feat: standardize model card creation for dreambooth training.

* correct 'inference

* remove comments.

* take component out of kwargs

* style

* add: card template to have a leaner description.

* widget support.

* propagate changes to train_dreambooth_lora

* propagate changes to custom diffusion

* make widget properly type-annotated
2024-02-07 15:07:11 +05:30
Félix Sanz
17612de451 fix: typo in callback function name and property (#6834)
* fix: callback function name is incorrect

In this tutorial a function is defined and then used as the `callback_on_step_end` argument, but its name did not match.

* fix: typo in num_timestep (correct is num_timesteps)

fixed property name
2024-02-06 12:05:40 -08:00
Dhruv Nair
994360f7a5 Fix last IP Adapter test (#6875)
update
2024-02-06 08:53:40 -10:00
Dhruv Nair
e6a48db633 Refactor Deepfloyd IF tests. (#6855)
* update

* update

* update
2024-02-06 16:43:17 +05:30
sayakpaul
4f1df69d1a Revert "add attention_head_dim"
This reverts commit 15f6b22466.
2024-02-06 14:48:49 +05:30
sayakpaul
15f6b22466 add attention_head_dim 2024-02-06 14:48:07 +05:30
Sayak Paul
e6fd9ada3a [I2vGenXL] clean up things (#6845)
* remove _to_tensor

* remove _to_tensor definition

* remove _collapse_frames_into_batch

* remove lora for not bloating the code.

* remove sample_size.

* simplify code a bit more

* ensure timesteps are always in tensor.
2024-02-06 09:22:07 +05:30
Edward Li
493228a708 Fix AutoencoderTiny with use_slicing (#6850)
* Fix `AutoencoderTiny` with `use_slicing`

When using slicing with AutoencoderTiny, the encoder mistakenly encodes the entire batch for every image in the batch.

* Fixed formatting issue
2024-02-05 09:18:22 -10:00
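
The slicing path fixed above is enabled like this. A sketch using a commonly referenced tiny-VAE checkpoint (treat the repo id as illustrative); the random batch just demonstrates the call.

```python
import torch
from diffusers import AutoencoderTiny

vae = AutoencoderTiny.from_pretrained("madebyollin/taesd")
vae.enable_slicing()  # with the fix, each image in a batch is encoded separately

batch = torch.rand(4, 3, 512, 512)
latents = vae.encode(batch).latents
```
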
Dhruv Nair
8bf046b7fb Add single file and IP Adapter support to PIA Pipeline (#6851)
update
2024-02-05 16:23:18 +05:30
Dhruv Nair
bb99623d09 Update IP Adapter tests to use cosine similarity distance (#6806)
* update

* update
2024-02-05 16:22:59 +05:30
Dhruv Nair
fdf55b1f1c Fix posix path issue in testing utils (#6849)
update
2024-02-05 08:57:18 +05:30
小咩Goat
c6f8c310c3 Fix forward pass in UNetMotionModel when gradient checkpoint is enabled (#6744)
fix #6742

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-02-05 08:04:01 +05:30
YiYi Xu
64909f17b7 update IP-adapter code in UNetMotionModel (#6828)
fix

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-05 07:26:46 +05:30
Dhruv Nair
f09ca909c8 Multiple small fixes to Video Pipeline docs (#6805)
* update

* update

* update

* Update src/diffusers/pipelines/i2vgen_xl/pipeline_i2vgen_xl.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* update

* update

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-05 07:24:38 +05:30
YiYi Xu
a5fc62f819 add self.use_ada_layer_norm_* params back to BasicTransformerBlock (#6841)
fix sd reference community ppeline

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-02-04 11:16:44 -10:00
Linoy Tsaban
fbdf26bac5 [dreambooth lora sdxl] add sdxl micro conditioning (#6795)
* add micro conditioning

* remove redundant lines

* style

* fix missing 's'

* fix missing shape bug due to missing RGB if statement

* remove redundant if, change arg order

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-04 16:00:09 +02:00
Fabio Rigano
13001ee315 Bugfix in IPAdapterFaceID (#6835) 2024-02-03 08:56:55 -10:00
Linoy Tsaban
65329aed98 [advanced dreambooth lora sdxl script] new features + bug fixes (#6691)
* add noise_offset param

* micro conditioning - wip

* image processing adjusted and moved to support micro conditioning

* change time ids to be computed inside train loop

* change time ids to be computed inside train loop

* change time ids to be computed inside train loop

* time ids shape fix

* move token replacement of validation prompt to the same section of instance prompt and class prompt

* add offset noise to sd15 advanced script

* fix token loading during validation

* fix token loading during validation in sdxl script

* a little clean

* style

* a little clean

* style

* sdxl script - a little clean + minor path fix

sd 1.5 script - change default resolution value

* sd 1.5 script - minor path fix

* fix missing comma in code example in model card

* clean up commented lines

* style

* remove time ids computed outside training loop - no longer used now that we utilize micro-conditioning, as all time ids are now computed inside the training loop

* style

* [WIP] - added draft readme, building off of examples/dreambooth/README.md

* readme

* readme

* readme

* readme

* readme

* readme

* readme

* readme

* removed --crops_coords_top_left from CLI args

* style

* fix missing shape bug due to missing RGB if statement

* add blog mention at the start of the reamde as well

* Update examples/advanced_diffusion_training/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* change note to render nicely as well

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-03 17:33:43 +02:00
Stephen
02338c9317 Change path to posix (testing_utils.py) (#6803)
change path to pathlib as_posix

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-02-03 12:44:13 +05:30
Younes Belkada
15ed53d272 Fixes LoRA SDXL training script with DDP + PEFT (#6816)
Update train_dreambooth_lora_sdxl.py
2024-02-03 09:46:32 +05:30
UmerHA
9cc59ba089 [Contributor Experience] Fix test collection on MPS (#6808)
* Update testing_utils.py

* Update testing_utils.py
2024-02-02 20:59:00 +05:30
YiYi Xu
adcbe674a4 [refactor]Scheduler.set_begin_index (#6728) 2024-02-01 09:51:02 -10:00
Sayak Paul
ec9840a5db [Refactor] harmonize the module structure for models in tests (#6738)
* harmonize the module structure for models in tests

* make the folders modules.

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-02-01 14:23:39 +05:30
YiYi Xu
093a03a1a1 add is_torchvision_available (#6800)
* add

* remove transformer

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-01-31 20:01:44 -10:00
Patrick von Platen
c3369f5673 fix torchvision import (#6796) 2024-01-31 12:13:10 -10:00
Sayak Paul
04cd6adf8c [Feat] add I2VGenXL for image-to-video generation (#6665)
---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-01-31 10:38:51 -10:00
YiYi Xu
66722dbea7 [sdxl k-diffusion pipeline]move sigma to device (#6757)
move sigma to device

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-31 09:29:15 -10:00
YiYi Xu
2e8d18e699 [IP-Adapter] Support multiple IP-Adapters (#6573)
---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Alvaro Somoza <somoza.alvaro@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-01-31 07:11:15 -10:00
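
Roughly, this feature lets `load_ip_adapter` and `set_ip_adapter_scale` take lists, one entry per adapter. A hedged sketch using the commonly published h94/IP-Adapter weights (the specific weight names are assumptions):

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# load two adapters at once and give each its own scale
pipe.load_ip_adapter(
    "h94/IP-Adapter",
    subfolder="models",
    weight_name=["ip-adapter_sd15.bin", "ip-adapter-plus_sd15.bin"],
)
pipe.set_ip_adapter_scale([0.7, 0.5])
# at inference, pass one reference image (or list) per adapter via ip_adapter_image=[...]
```
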
Steven Liu
03373de0db [docs] Add missing parameter (#6775)
add missing param
2024-01-31 08:53:40 -08:00
Dhruv Nair
56bea6b4a1 Add PIA Model/Pipeline (#6698)
* update

* update

* update

* add tests and docs

* clean up

* add to toctree

* fix copies

* pr review feedback

* fix copies

* fix tests

* update docs

* update

* update

* update docs

* update

* update

* update

* update
2024-01-31 18:00:17 +02:00
Dhruv Nair
d7dc0ffd79 Fix setting scaling factor in VAE config (#6779)
fix
2024-01-31 19:47:22 +05:30
Kashif Rasul
97ee616971 add ipo, hinge and cpo loss to dpo trainer (#6788)
add ipo and hinge loss to dpo trainer
2024-01-31 16:41:31 +05:30
Sayak Paul
0fc62d1702 [Kandinsky tests] add is_flaky to test_model_cpu_offload_forward_pass (#6762)
* add is_flaky to test_model_cpu_offload_forward_pass

* style

* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-01-31 14:51:12 +05:30
Dhruv Nair
f4d3f913f4 Pin torch < 2.2.0 in test runners (#6780)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-31 13:41:18 +05:30
Viet Nguyen
1cab64b3be Update train_diffusion_dpo.py (#6754)
* Update train_diffusion_dpo.py

Address #6702

* Update train_diffusion_dpo_sdxl.py

* Empty-Commit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-31 12:46:23 +05:30
Sayak Paul
8d7dc85312 add note about serialization (#6764) 2024-01-31 12:45:40 +05:30
dg845
87a92f779c Fix bug in ResnetBlock2D.forward where LoRA Scale gets Overwritten (#6736)
Fix bug in ResnetBlock2D.forward where the LoRA scale gets overwritten when USE_PEFT_BACKEND is not set and scale_shift is used for the time embedding.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-30 14:43:48 -10:00
Yunxuan Xiao
0db766ba77 [DDPMScheduler] Load alpha_cumprod to device to avoid redundant data movement. (#6704)
* load cumprod tensor to device

Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>

* fixing ci

Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>

* make fix-copies

Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>

---------

Signed-off-by: woshiyyya <xiaoyunxuan1998@gmail.com>
2024-01-30 13:19:37 -10:00
Dhruv Nair
8e94663503 Update export to video to support new tensor_to_vid function in video pipelines (#6715)
update
2024-01-30 19:43:33 +05:30
YiYi Xu
b09b90e24c update ip-adapter slow tests (#6760)
* update slices

* up

* hopefully last one

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-01-29 17:55:41 -10:00
Sajad Norouzi
058b47553e Fix mixed precision fine-tuning for text-to-image-lora-sdxl example. (#6751)
* Fix mixed precision fine-tuning for text-to-image-lora-sdxl example.

* fix text_encoder_two bug.

---------

Co-authored-by: Sajad Norouzi <sajadn@dev-dsk-sajadn-2a-87239470.us-west-2.amazon.com>
2024-01-30 06:55:02 +05:30
xhedit
7f58a76f48 Update lora.md with a more accurate description of rank (#6724)
* Update lora.md with a more accurate description of rank

* Update docs/source/en/training/lora.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-01-29 09:41:51 -08:00
Sayak Paul
09b7bfce91 [Core] move transformer scripts to transformers modules (#6747)
* move transformer scripts to transformers modules

* move transformer model test

* move prior transformer test to  directory

* fix doc path

* correct doc path

* add: __init__.py
2024-01-29 22:28:28 +05:30
Fabio Rigano
5d8b1987ec Add unload_textual_inversion method (#6656)
* Add unload_textual_inversion

* Fix dicts in tokenizer

* Fix quality

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Fix variable name after last update

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-01-29 06:26:22 -10:00
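
A short sketch of how the new method is meant to be used, assuming the sd-concepts-library/cat-toy embedding as the example concept:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

pipe.load_textual_inversion("sd-concepts-library/cat-toy")   # adds the <cat-toy> token
image = pipe("a <cat-toy> sitting on a beach").images[0]

pipe.unload_textual_inversion()  # removes the learned tokens and their embeddings again
```
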
Sayak Paul
acd1962769 correct hflip arg (#6743) 2024-01-28 17:42:34 +05:30
Stephen
5b1b80a5b6 Change os.path to pathlib Path (#6737)
Change os.path to pathlib
2024-01-28 10:24:13 +05:30
gzguevara
8581d9bce4 changed to posix unet (#6719)
changed to posix

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-27 16:31:52 +05:30
Yingtian Liu
c101066227 Correct SNR weighted loss in v-prediction case by only adding 1 to SNR on the denominator (#6307)
* fix minsnr implementation for v-prediction case

* format code

* always compute snr when snr_gamma is specified

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-27 09:18:09 +05:30
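
The correction amounts to dividing the clamped SNR by (SNR + 1) instead of SNR when the model is trained with v-prediction. A hedged sketch of the weighting (the SNR tensor would come from a helper like `compute_snr` in diffusers.training_utils):

```python
import torch

def min_snr_loss_weights(snr: torch.Tensor, snr_gamma: float, prediction_type: str) -> torch.Tensor:
    # clamp the SNR at gamma, following the min-SNR weighting recipe
    clamped = torch.stack([snr, snr_gamma * torch.ones_like(snr)], dim=1).min(dim=1).values
    if prediction_type == "v_prediction":
        return clamped / (snr + 1)  # the fix: +1 only in the denominator
    return clamped / snr            # epsilon prediction keeps the original weighting
```
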
Sayak Paul
d4c7ab7bf1 [Hub] feat: explicitly tag to diffusers when using push_to_hub (#6678)
* feat: explicitly tag to diffusers when using push_to_hub

* remove tags.

* reset repo.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix: tests

* fix: push_to_hub behaviour for tagging from save_pretrained

* Apply suggestions from code review

Co-authored-by: Lucain <lucainp@gmail.com>

* Apply suggestions from code review

Co-authored-by: Lucain <lucainp@gmail.com>

* import fixes.

* add library name to existing model card.

* add: standalone test for generate_model_card

* fix tests for standalone method

* moved library_name to a better place.

* merge create_model_card and generate_model_card.

* fix test

* address lucain's comments

* fix return identation

* Apply suggestions from code review

Co-authored-by: Lucain <lucainp@gmail.com>

* address further comments.

* Update src/diffusers/pipelines/pipeline_utils.py

Co-authored-by: Lucain <lucainp@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lucain <lucainp@gmail.com>
2024-01-26 23:01:48 +05:30
dg845
ea9dc3fa90 Add UFOGenScheduler to Community Examples (#6650)
* Add UFOGenScheduler with diffusers imports changed from relative to absolute.

* make style

* Add community README entry for UFOGenScheduler.
2024-01-26 15:11:14 +02:00
dg845
b4220e97b1 Add Community Example Consistency Training Script (#6717)
* initial commit for unconditional/class-conditional consistency training script

* make style

* Add entry for consistency training script in community README.

* Move consistency training script from community to research_projects/consistency_training

* Add requirements.txt and README to research_projects/consistency_training directory.

* Manually revert community README changes for consistency training.

* Fix path to script after moving script to research projects.

* Add option to load U-Net weights from pretrained model.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-01-26 15:10:57 +02:00
Dhruv Nair
dc85b578c2 Move tests for SD inference variant pipelines into their own modules (#6707)
* update

* update

* update
2024-01-26 14:09:41 +02:00
Dhruv Nair
0d927c7542 Add IP Adapters to slow tests (#6714)
update
2024-01-25 19:52:50 -10:00
Andrew Ishutin
5b93338235 fix custom diffusion training with concept list (#6710)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-26 11:17:51 +08:00
Aryan V S
7c1c705f60 fix community README (#6645) 2024-01-25 17:52:52 -08:00
Aryan V S
9e72016468 [docs] AnimateDiff Video-to-Video (#6712)
* add animatediff vid2vid to docs

* Update docs/source/en/api/pipelines/animatediff.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* apply suggestions from review

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-01-25 17:51:43 -08:00
Patrick von Platen
3e9716f22b Correct sigmas cpu settings (#6708) 2024-01-26 09:32:24 +08:00
Steven Liu
87bfbc320d [docs] UViT2D (#6643)
* uvit2d

* fix

* fix?

* add correct paper

* fix paths

* update abstract
2024-01-25 09:37:28 -08:00
Aryan V S
a517f665a4 AnimateDiff Video to Video (#6328)
* begin animatediff img2video and video2video

* revert animatediff to original implementation

* add img2video as pipeline

* update

* add vid2vid pipeline

* update imports

* update

* remove copied from line for check_inputs

* update

* update examples

* add multi-batch support

* fix __init__.py files

* move img2vid to community

* update community readme and examples

* fix

* make fix-copies

* add vid2vid batch params

* apply suggestions from review

Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

* add test for animatediff vid2vid

* torch.stack -> torch.cat

Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

* make style

* docs for vid2vid

* update

* fix prepare_latents

* fix docs

* remove img2vid

* update README to :main

* remove slow test

* refactor pipeline output

* update docs

* update docs

* merge community readme from :main

* final fix i promise

* add support for url in animatediff example

* update example

* update callbacks to latest implementation

* Update src/diffusers/pipelines/animatediff/pipeline_animatediff_video2video.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/animatediff/pipeline_animatediff_video2video.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix merge

* Apply suggestions from code review

* remove callback and callback_steps as suggested in review

* Update tests/pipelines/animatediff/test_animatediff_video2video.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix import error caused due to unet refactor in #6630

* fix numpy import error after tensor2vid refactor in #6626

* make fix-copies

* fix numpy error

* fix progress bar test

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-01-24 18:22:26 +05:30
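
A rough usage sketch of the new video-to-video pipeline; the motion adapter and base checkpoint ids follow the AnimateDiff docs, and the input frame paths are placeholders:

```python
import torch
from diffusers import AnimateDiffVideoToVideoPipeline, MotionAdapter
from diffusers.utils import export_to_gif, load_image

adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16
)
pipe = AnimateDiffVideoToVideoPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE", motion_adapter=adapter, torch_dtype=torch.float16
).to("cuda")

video = [load_image(f"frame_{i:02d}.png") for i in range(16)]  # placeholder input frames
frames = pipe(prompt="a panda surfing at sunset", video=video, strength=0.6).frames[0]
export_to_gif(frames, "panda_surfing.gif")
```
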
Brandon Strong
16748d1eba SD 1.5 Support For Advanced Lora Training (train_dreambooth_lora_sdxl_advanced.py) (#6449)
* sd1.5 support in separate script

A quick adaptation to support people interested in using this method on 1.5 models.

* sd15 prompt text encoding and unet conversions

as per @linoytsaban 's recommendations. Testing would be appreciated,

* Readability and quality improvements

Removed some mentions of SDXL, and some arguments that don't apply to sd 1.5, and cleaned up some comments.

* make style/quality commands

* tracker rename and run-it doc

* Update examples/advanced_diffusion_training/train_dreambooth_lora_sd15_advanced.py

* Update examples/advanced_diffusion_training/train_dreambooth_lora_sd15_advanced.py

---------

Co-authored-by: Linoy Tsaban <57615435+linoytsaban@users.noreply.github.com>
2024-01-24 11:20:08 +02:00
Haofan Wang
c9081a8abd [Fix bugs] pipeline_controlnet_sd_xl.py (#6653)
* Update pipeline_controlnet_sd_xl.py

* Update pipeline_controlnet_xs_sd_xl.py
2024-01-24 09:18:12 +05:30
Yasuna
0eb68d9ddb [Docs] update: tutorials ja | AUTOPIPELINE.md (#6629)
* add en file

* translate 1-118 lines

* add text

* add toctree

* fix

* fix typo

* fix link
2024-01-23 09:19:55 -08:00
Haofan Wang
9941b3f124 Add InstantID Pipeline (#6673)
* add instantid pipeline

* format

* Update README.md

* Update README.md

* format

---------

Co-authored-by: ResearcherXman <xhs.research@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-23 17:25:23 +05:30
Ayush Mangal
16b9f98b48 [WIP][Community Pipeline] InstaFlow! One-Step Stable Diffusion with Rectified Flow (#6057)
* Add instaflow community pipeline

* Make styling fixes

* Add lora

* Fix formatting

* Add docs

* Update README.md

* Update README.md

* Remove do LORA

* Update readme

* Update README.md

* Update README.md

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-01-23 13:52:50 +02:00
Dhruv Nair
fee93c81eb [Refactor] Update from single file (#6428)
* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* up

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* up

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* clean

* update

* update

* clean up

* clean up

* update

* clean

* clean

* update

* update

* clean up

* fix docs

* update

* update

* Revert "update"

This reverts commit dbfb8f1ea9.

* update

* update

* update

* update

* fix controlnet

* fix scheduler

* fix controlnet tests
2024-01-23 14:42:03 +05:30
Sayak Paul
5308cce994 [Tests] Test for passing local config file to from_single_file() (#6638)
make config file local too.
2024-01-23 14:21:23 +05:30
YiYi Xu
318556b20e fix dpm related slow test failure (#6680)
fix

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-01-22 18:52:05 -10:00
Dhruv Nair
6620eda357 Standardise outputs for video pipelines (#6626)
* update

* update

* update

* update

* update

* update

* update

* clean up

* clean up
2024-01-23 10:07:07 +05:30
Sayak Paul
1f0705adcf [Big refactor] move unets to unets module 🦋 (#6630)
* move unets to  module 🦋

* parameterize unet-level import.

* fix flax unet2dcondition model import

* models __init__

* mildly deprecating models.unet_2d_blocks in favor of models.unets.unet_2d_blocks.

* noqa

* correct deprecation behaviour

* inherit from the actual classes.

* Empty-Commit

* backwards compatibility for unet_2d.py

* backward compatibility for unet_2d_condition

* bc for unet_1d

* bc for unet_1d_blocks
2024-01-23 08:57:58 +05:30
M. Tolga Cangöz
5e96333cb2 Update README (#6669)
Update number of checkpoints and repositories in README
2024-01-22 08:08:07 -08:00
Sayak Paul
da95a28ff6 [Diffusion DPO] apply fixes from #6547 (#6668)
apply fixes from #6547
2024-01-22 20:14:54 +05:30
Dhruv Nair
d66d554dc2 Add tearDown method to LoRA tests. (#6660)
* update

* update
2024-01-22 14:00:37 +05:30
Junsong Chen
c7df846dec add Sa-Solver (#5975)
* add Sa-Solver



---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: scxue <xueshuchen17@mails.ucas.edu.cn>
Co-authored-by: jschen <chenjunsong4@h-partners.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-01-21 21:37:44 -10:00
Vinh H. Pham
8e7bbfbe5a add padding_mask_crop to all inpaint pipelines (#6360)
* add padding_mask_crop
---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-01-21 19:42:22 -10:00
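
In practice the argument crops the image and mask down to the masked region plus some padding before inpainting, then pastes the result back. A minimal sketch with placeholder image paths:

```python
import torch
from diffusers import AutoPipelineForInpainting
from diffusers.utils import load_image

pipe = AutoPipelineForInpainting.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = load_image("input.png")   # placeholder paths
mask_image = load_image("mask.png")

image = pipe(
    prompt="a red vintage sports car",
    image=init_image,
    mask_image=mask_image,
    padding_mask_crop=32,  # inpaint only the masked area plus 32 px of context
).images[0]
```
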
YiYi Xu
e2773c6255 fix SDXL-kdiffusion tests (#6647)
🤞🤞🤞

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-01-19 17:37:29 -10:00
YiYi Xu
ac61eefc9f fix DPM Scheduler with use_karras_sigmas option (#6477)
* fix

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-01-19 07:08:22 -10:00
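
For reference, a hedged sketch of the option this fix touches: swapping a pipeline's scheduler for DPMSolverMultistepScheduler with Karras sigma spacing.

```python
import torch
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True  # Karras noise schedule spacing
)
```
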
HelloWorldBeginner
f95615b823 Fixed the bug related to saving DeepSpeed models. (#6628)
* Fixed the bug related to saving DeepSpeed models.

* Add information about training SD models using DeepSpeed to the README.

* Apply suggestions from code review

---------

Co-authored-by: mhh001 <mahonghao1@huawei.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-19 19:21:57 +05:30
SangKim
a9288b49c9 Modularize InstructPix2Pix SDXL inferencing during and after training in examples (#6569) 2024-01-19 15:47:34 +05:30
elucida
c54419658b refactor: extract init/forward function in UNet2DConditionModel (#6478)
* - extract function for stage in UNet2DConditionModel init & forward
- Add new function get_mid_block() to unet_2d_blocks.py

* add type hint to get_mid_block aligned with get_up_block and get_down_block; rename _set_xxx function

* add type hint and use keyword arguments

* remove `copy from` in versatile diffusion
2024-01-19 12:13:34 +02:00
Aryan V S
6382663dc8 [Community] Experimental AnimateDiff Image to Video (open to improvements) (#6509)
* add animatediff img2vid

* fix

* Update examples/community/README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix code snippet between ip adapter face id and animatediff img2vid

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-01-19 12:05:41 +02:00
spezialspezial
58b8dce129 Update convert_from_ckpt.py / read checkpoint config yaml contents (#6633)
Update convert_from_ckpt.py / read config yaml contents

Added missing reading of config yaml file contents
2024-01-19 08:25:03 +05:30
Dhruv Nair
a65ca8a059 Fix failing tests due to Posix Path (#6627)
update
2024-01-18 19:19:57 +05:30
Steven Liu
5ca062e011 [docs] Fix missing API function (#6604)
fix?
2024-01-17 13:59:09 -08:00
Linoy Tsaban
619e3ab6f6 [bug fix] advanced dreambooth lora sdxl - fixes bugs described in #6486 (#6599)
* fixes bugs:
1. redundant retraction
2. param clone
3. stopping optimization of text encoder params

* param upscaling

* style
2024-01-17 20:11:45 +05:30
Patrick von Platen
9e2804f720 Update pr_test_peft_backend.yml to use 1 process for testing (#6613) 2024-01-17 19:25:30 +05:30
Aryan V S
9112028ed8 FreeInit (#6315)
* freeinit

* update freeinit implementation based on review

Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

* fix

* another fix

* refactor

* fix timesteps missing bug

* apply suggestions from review

Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

* add test for freeinit

* apply suggestions from review

Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

* refactor

* fix test

* fix tensor not on same device

* update

* remove return_intermediate_results

* fix broken freeinit test

* update animatediff docs

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-01-17 17:17:07 +05:30
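
A hedged usage sketch of FreeInit on an AnimateDiff pipeline (checkpoint ids as in the AnimateDiff docs; the keyword values are illustrative):

```python
import torch
from diffusers import AnimateDiffPipeline, MotionAdapter
from diffusers.utils import export_to_gif

adapter = MotionAdapter.from_pretrained(
    "guoyww/animatediff-motion-adapter-v1-5-2", torch_dtype=torch.float16
)
pipe = AnimateDiffPipeline.from_pretrained(
    "SG161222/Realistic_Vision_V5.1_noVAE", motion_adapter=adapter, torch_dtype=torch.float16
).to("cuda")

pipe.enable_free_init(method="butterworth", use_fast_sampling=True)  # re-initialize low-frequency noise
frames = pipe(prompt="a rocket launching into space", num_frames=16).frames[0]
pipe.disable_free_init()
export_to_gif(frames, "rocket.gif")
```
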
Steve Rhoades
dce06680d2 Fixes torch.compile() compatible training (#6589)
resolve conflicts
2024-01-17 07:47:03 +05:30
Bhavay Malhotra
dd63168319 Update installation.md (#6438)
* Update installation.md

* Update installation.md

* Update installation.md
2024-01-16 13:41:38 -08:00
Celestial Phineas
1040dfd9cc [Fix] Multiple image conditionings in a single batch for StableDiffusionControlNetPipeline (#6334)
* [Fix] Multiple image conditionings in a single batch for `StableDiffusionControlNetPipeline`.

* Refactor `check_inputs` in `StableDiffusionControlNetPipeline` to avoid redundant codes.

* Make the behavior of MultiControlNetModel to be the same to the original ControlNetModel

* Keep the code change minimum for nested list support

* Add fast test `test_inference_nested_image_input`

* Remove redundant check for nested image condition in `check_inputs`

Remove the `len(image) == len(prompt)` check from `check_image()`

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Better `ValueError` message for incompatible nested image list size

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Fix syntax error in `check_inputs`

* Remove warning message for multi-ControlNets with multiple prompts

* Fix a typo in test_controlnet.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* Add test case for multiple prompts, single image conditioning in `StableDiffusionMultiControlNetPipelineFastTests`

* Improved `ValueError` message for nested `controlnet_conditioning_scale`

* Documenting the behavior of image list as `StableDiffusionControlNetPipeline` input

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-01-16 10:40:55 -10:00
Sayak Paul
49a4b377c1 remove omegaconf from the residues 👋 (#6600)
remove omegaconf 👋
2024-01-16 21:55:41 +05:30
JuanCarlosPi
dff35a86e4 Change in ip-adapter docs. CLIPVisionModelWithProjection should be im… (#6597)
Change in ip-adapter docs. CLIPVisionModelWithProjection should be imported from transformers, not diffusers
2024-01-16 08:18:13 -08:00
Yondon Fu
8842bcadb9 [SVD] Return np.ndarray when output_type="np" (#6507)
[SVD] Fix output_type="np"
2024-01-16 14:51:36 +05:30
Steve Rhoades
181280baba Fixes training resuming: Advanced Dreambooth LoRa Training (#6566)
* Fixes #6418 Advanced Dreambooth LoRa Training

* change order of import to fix nit

* fix nit, use cast_training_params

* remove torch.compile fix, will move to a new PR

* remove unnecessary import
2024-01-16 14:30:49 +05:30
Charchit Sharma
53f498d2a4 Use of Posix to better support Windows compatibility in testing_utils (#6587)
* changes in utils

* removed loc
2024-01-16 10:27:29 +02:00
Charchit Sharma
990860911f change to posix for better Windows support for lora loaders (#6590)
* posix lora

* changes and style fix
2024-01-16 13:46:29 +05:30
Fabio Rigano
23eed39702 Fix path generation in IP Adapter (#6564)
* Fix path generation on Windows

* Update set_default_attn_processors

* Use pathlib

* Fix quality

* Fix copy

* Revert changes in set_default_attn_processors

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-16 11:25:22 +05:30
YiYi Xu
fefed44543 update slow test for SDXL k-diffusion pipeline (#6588)
update expected slice
2024-01-15 18:54:33 -10:00
Dong
814f56d2fe 🐛 fix ip-adapter controlnet img2img missing code (#6528)
* 🐛 fix ip-adapter controlnet img2img missing code

* 📝 edit test

* 📝 edit test

* 📝 run make style and quality

* 🎨 remove slow tests
2024-01-15 16:41:09 -10:00
SangKim
96d6e16550 Enable image resizing to adjust its height and width in StableDiffusionXLInstructPix2PixPipeline (#6581)
* Enable image resizing to adjust its height and width in StableDiffusionXLInstructPix2PixPipeline

* Ensure that validation is performed at every 'validation_step', not at every step
2024-01-16 07:50:34 +05:30
Aryan V S
c11de13588 [training] fix training resuming problem for fp16 (SD LoRA DreamBooth) (#6554)
* fix training resume

* update

* update
2024-01-16 07:27:06 +05:30
Patrick von Platen
357855f8fc [Docs] Fix controlnet indent (#6578) 2024-01-15 18:12:55 +02:00
Fabio Rigano
f825221b5d [Community Pipeline] IPAdapter FaceID (#6276)
* Add support for IPAdapter FaceID

* Add docs

* Move subfolder to kwargs

* Fix quality

* Fix image encoder loading

* Fix loading + add test

* Move to community folder

* Fix style

* Revert constant update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-15 16:43:54 +02:00
Aryan V S
119d734f6e [AnimateDiff+Controlnet] Fix multicontrolnet support (#6551)
* fix multicontrolnet support

* update README with multicontrolnet example
2024-01-15 16:36:54 +02:00
Sayak Paul
cb4b3f0b78 [OmegaConf] replace it with yaml (#6488)
* remove omegaconf from convert_from_ckpt.

* remove from single_file.

* change to string-based subscripting.

* style

* okay

* fix: vae_param

* no . indexing.

* style

* style

* turn getattrs into explicit if/else

* style

* propagate changes to ldm_uncond.

* propagate to gligen

* propagate to if.

* fix: quotes.

* propagate to audioldm.

* propagate to audioldm2

* propagate to musicldm.

* propagate to vq_diffusion

* propagate to zero123.

* remove omegaconf from diffusers codebase.
2024-01-15 20:02:10 +05:30
Haofan Wang
3d574b3bbe Fix a bug of flip in SDXL training script (#6547)
* Update train_text_to_image_sdxl.py

* Update train_text_to_image_lora_sdxl.py
2024-01-15 16:28:04 +02:00
Charchit Sharma
09903774d9 Make T2I Adapter SDXL Training Script torch.compile compatible (#6577)
update for t2i_adapter
2024-01-15 19:42:56 +05:30
dependabot[bot]
d6a70d8ba8 Bump jinja2 from 3.1.2 to 3.1.3 in /examples/research_projects/realfill (#6539)
Bumps [jinja2](https://github.com/pallets/jinja) from 3.1.2 to 3.1.3.
- [Release notes](https://github.com/pallets/jinja/releases)
- [Changelog](https://github.com/pallets/jinja/blob/main/CHANGES.rst)
- [Commits](https://github.com/pallets/jinja/compare/3.1.2...3.1.3)

---
updated-dependencies:
- dependency-name: jinja2
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2024-01-15 16:10:10 +02:00
Charchit Sharma
e3103e171f Make InstructPix2Pix SDXL Training Script torch.compile compatible (#6576)
* changes for pix2pix_sdxl

* style fix
2024-01-15 17:54:03 +05:30
Charchit Sharma
b053053ac9 Make InstructPix2Pix Training Script torch.compile compatible (#6558)
* added torch.compile for pix2pix

* required changes
2024-01-15 17:03:22 +05:30
Vinh H. Pham
08702fc1cb Make text-to-image SDXL LoRA Training Script torch.compile compatible (#6556)
make compile compatible
2024-01-15 16:58:16 +05:30
Vinh H. Pham
7ce89e979c Make text-to-image SD LoRA Training Script torch.compile compatible (#6555)
make compile compatible
2024-01-15 16:55:08 +05:30
gzguevara
05faf3263b SDXL text-to-image torch compatible (#6550)
* torch compatible

* code quality fix

* ruff style

* ruff format
2024-01-15 16:49:11 +05:30
Sayak Paul
a080f0d3a2 [Training Utils] create a utility for casting the lora params during training. (#6553)
create a utility for casting the lora params during training.
2024-01-15 13:51:13 +05:30
Sayak Paul
79df50388d [Training] fix training resuming problem when using FP16 (SDXL LoRA DreamBooth) (#6514)
* fix: training resume from fp16.

* add: comment

* remove residue from another branch.

* remove more residues.

* thanks to Younes; no hacks.

* style.

* clean things a bit and modularize _set_state_dict_into_text_encoder

* add comment about the fix detailed.
2024-01-12 17:11:06 +05:30
Vinh H. Pham
7d631825b0 Make Dreambooth SD Training Script torch.compile compatible (#6532)
* support compile

* make style

* move unwrap_model inside function

* change unwrap call

* run make style

* Update examples/dreambooth/train_dreambooth.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Revert "Update examples/dreambooth/train_dreambooth.py"

This reverts commit 70ab09732e.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-12 12:50:15 +05:30
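
The recurring pattern across these torch.compile-compatibility commits (sketched here from the commit descriptions, not copied from any one script) is to unwrap both the accelerate wrapper and the torch.compile wrapper before saving or introspecting the model:

```python
def unwrap_model(accelerator, model):
    """Strip accelerate's wrapper and, if present, torch.compile's OptimizedModule wrapper."""
    model = accelerator.unwrap_model(model)
    # torch.compile keeps the original module at ._orig_mod
    return getattr(model, "_orig_mod", model)
```
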
gzguevara
33d2b5b087 SD text-to-image torch compile compatible (#6519)
* added unwrapper

* fiz typo
2024-01-12 09:28:35 +05:30
Suvaditya Mukherjee
f486d34b04 Make ControlNet SD Training Script torch.compile compatible (#6525)
* update: make controlnet script torch compile compatible

Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>

* update: correct earlier mistakes for compilation

Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>

* update: fix code style issues

Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>

---------

Signed-off-by: Suvaditya Mukherjee <suvadityamuk@gmail.com>
2024-01-12 09:27:26 +05:30
Charchit Sharma
e44b205e0b Make ControlNet SDXL Training Script torch.compile compatible (#6526)
* make torch.compile compatible

* fix quality
2024-01-12 09:25:09 +05:30
Vinh H. Pham
60cb44323d Make Dreambooth SD LoRA Training Script torch.compile compatible (#6534)
support compile
2024-01-12 09:24:03 +05:30
Radamés Ajna
1dd0ac9401 [DPO Training] pass tracker name as argument (#6542)
pass tracker name as argument
2024-01-12 09:15:39 +05:30
Yassine El Boudouri
c6b04589b6 Remove conversion to RGB (#6479)
* Remove conversion to RGB

* Add a Conversion Function

* Add type hint for convert_method

* Update src/diffusers/utils/loading_utils.py

Update docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docstring

* Optimize imports

* Optimize imports (2)

* Reformat code

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-12 07:20:24 +05:30
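
A small hedged example of the new hook (placeholder URL): instead of relying on the built-in RGB conversion, callers can pass a `convert_method` callable to `load_image`.

```python
from diffusers.utils import load_image

# keep the alpha channel rather than using the default RGB conversion
rgba_image = load_image(
    "https://example.com/input.png",               # placeholder URL
    convert_method=lambda img: img.convert("RGBA"),
)
```
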
Sayak Paul
5dc3471380 [SVD] support generators that are created on GPU (#6484)
* debug generator

* fix?

* fix?

* fix

* remove print.

* revert none check
2024-01-11 20:08:18 +05:30
Aryan V S
9df566e6da [Community] StyleAligned Pipeline (#6489)
* add stylealigned sdxl pipeline

* bugfix

* update docs

* remove einops dependency

* update README

* update example docstring
2024-01-11 14:35:55 +01:00
Sayak Paul
be0b425762 [Training] make checkpointing compatible when using torch.compile (part II) (#6511)
make checkpointing compatible when using torch.compile.
2024-01-11 18:37:30 +05:30
jquintanilla4
da843b3d53 .load_ip_adapter in StableDiffusionXLAdapterPipeline (#6246)
* Added testing notebook and .load_ip_adapter to XLAdapterPipeline

* Added annotations

* deleted testing notebook

* Update src/diffusers/pipelines/t2i_adapter/pipeline_stable_diffusion_xl_adapter.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* code clean up

* Add feature_extractor and image_encoder to components

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-01-11 13:04:08 +05:30
dg845
17cece072a Fix bug in LCM Distillation Scripts when args.unet_time_cond_proj_dim is used (#6523)
* Fix bug where unet's time_cond_proj_dim is not set correctly if using args.unet_time_cond_proj_dim.

* make style
2024-01-11 08:21:07 +05:30
Steven Liu
a551ddf928 [docs] mask_blur and padding_mask_crop (#6498)
new inpaint features
2024-01-10 08:14:34 -08:00
Steven Liu
1d57892980 [docs] Callbacks (#6471)
edits
2024-01-10 08:14:07 -08:00
antoine-scenario
3e8b63216e Add IP-Adapter to StableDiffusionXLControlNetImg2ImgPipeline (#6293)
* add IP-Adapter to StableDiffusionXLControlNetImg2ImgPipeline

Update src/diffusers/pipelines/controlnet/pipeline_controlnet_sd_xl_img2img.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

fix tests

* fix failing test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-09 22:02:11 -10:00
YiYi Xu
dd4459ad79 [Refactor] splitingResnetBlock2D into multiple blocks (#6166)
---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-09 21:38:05 -10:00
YiYi Xu
6313645b6b add StableDiffusionXLKDiffusionPipeline (#6447)
---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-01-09 16:29:01 -10:00
Rahul Raman
2d1f2182cc example: Train Instruct pix2 pix with lora implementation (#6469)
* base template file - train_instruct_pix2pix.py

* additional import and parser argument requried for lora

* finetune only instructpix2pix model -- no need to include these layers

* inject lora layers

* freeze unet model -- only lora layers are trained

* training modifications to train only lora parameters

* store only lora parameters

* move train script to research project

* run quality and style code checks

* move train script to a new folder

* add README

* update README

* update references in README

---------

Co-authored-by: Rahul Raman <rahulraman@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-10 06:38:19 +05:30
Steven Liu
3be7c96e28 [docs] Stable video diffusion (#6472)
svd
2024-01-09 09:21:58 -08:00
Steven Liu
3c79dd9dbe [docs] PEFT adapter API (#6499)
follow up
2024-01-09 08:09:15 -08:00
Steven Liu
9d767916da [docs] Fast diffusion (#6470)
* edits

* fix

* feedback
2024-01-09 08:08:31 -08:00
Patrick von Platen
ae7cd5ad4c Link issue template to discussions 2024-01-09 17:00:15 +01:00
Sayak Paul
4497b3ec98 [Training] make DreamBooth SDXL LoRA training script compatible with torch.compile (#6483)
* make it torch.compile comaptible

* make the text encoder compatible too.

* style
2024-01-09 20:11:26 +05:30
Yifan Zhou
fc63ebdd3a [Community Pipeline] Rerender-A-Video: Zero-Shot Video-to-Video Translation (#6332)
* upload codes and doc

* lint

* lint

* lint

* update code

* remove blank lines

* Fix load url
2024-01-09 14:55:34 +01:00
Sayak Paul
8b6fae4ce5 [SVD] fix: vae type (#6475)
fix: vae type
2024-01-09 14:50:02 +01:00
jiqing-feng
aa1797e109 enable stable-xl textual inversion (#6421)
* enable stable-xl textual inversion

* check if optimizer_2 exists

* check text_encoder_2 before using

* add textual inversion for sdxl in a single file

* fix style

* fix example style

* reset for error changes

* add readme for sdxl

* fix style

* disable autocast as it will cause cast error when weight_dtype=bf16

* fix spelling error

* fix style and readme and 8bit optimizer

* add README_sdxl.md link

* add tracker key on log_validation

* run style

* rm the second center crop

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-09 15:12:33 +05:30
Patrick von Platen
5bacc2f5af [SAG] Support more schedulers, add better error message and make tests faster (#6465)
* finish

* finish

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-09 09:24:38 +05:30
Yasuna
6ae7e8112a [Docs] update: tutorials ja | INDEX.md, TUTORIAL_OVERVIEW.md, TOCTREE.yml (#6338)
* add tutorials to toctree.yml

* fix title

* fix words

* add overview ja

* fix diffusion to 拡散

* fix line 21

* add space

* delete supported pipeline

* fix tutorial_overview.md

* fix space

* fix typo

* Delete docs/source/ja/tutorials/using_peft_for_inference.md

this file is not translated

* Delete docs/source/ja/tutorials/basic_training.md

this file is not translated

* Delete docs/source/ja/tutorials/autopipeline.md

this file is not translated

* fix toctree
2024-01-08 09:06:46 -08:00
Sayak Paul
e0f349c2b0 correct reviewers in PR template (#6485) 2024-01-08 19:12:04 +05:30
Sayak Paul
774f5c4581 minor changes to the SVD doc (#6466)
minor changes
2024-01-06 08:40:46 +05:30
Sayak Paul
a483a8eddf Rename REAMDE.md to README.md 2024-01-05 20:49:44 +05:30
Lucain
9a9daee724 Fix offline mode import (#6467) 2024-01-05 15:34:40 +01:00
Sayak Paul
585f941366 [Core] introduce PeftAdapterMixin module. (#6416)
* introduce integrations module.

* remove duplicate methods.

* better imports.

* move to loaders.py

* remove peftadaptermixin from modelmixin.

* add: peftadaptermixin selectively.

* add: entry to _toctree

* Empty-Commit
2024-01-05 18:18:28 +05:30
Dhruv Nair
86a26761ac Correctly handle creating model index json files when setting compiled modules in pipelines. (#6436)
update
2024-01-05 18:02:09 +05:30
Liang Hou
6ef2b8a92f Fix amused paper link (#6462) 2024-01-05 13:12:09 +01:00
Vinh H. Pham
3848606c7e [Community Pipeline] Add gluegen (#6433)
* init works

* add gluegen pipeline

* add gluegen code

* add another way to load language adapter

* make style

* Update README.md

* change doc
2024-01-05 13:05:40 +01:00
Sayak Paul
2a97067b84 [Experimental] Diffusion LoRA DPO training (#6422)
* add: experimental script for diffusion dpo training.

* random_crop cli.

* fix: caption tokenization.

* fix: pixel_values index.

* fix: grad?

* debug

* fix: reduction.

* fixes in the loss calculation.

* style

* fix: unwrap call.

* fix: validation inference.

* add: initial sdxl script

* debug

* make sure images in the tuple are of same res

* fix model_max_length

* report print

* boom

* fix: numerical issues.

* fix: resolution

* comment about resize.

* change the order of the training transformation.

* save call.

* debug

* remove print

* manually detaching necessary?

* use the same vae for validation.

* add: readme.
2024-01-05 16:40:06 +05:30
Sayak Paul
ae060fc4f1 [feat] introduce unload_lora(). (#6451)
* introduce unload_lora.

* fix-copies
2024-01-05 16:22:11 +05:30
Sayak Paul
9d945b2b90 0.25.0 post release (#6358)
* post release

* style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2024-01-05 16:13:27 +05:30
Junsheng121
d184291c7d null-text-inversion-pipeline-implementation (#6329)
* null-text-inversion-implementation

* edited

* edited

* edited

* edited

* edited

* edit

* makestyle

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-05 11:35:21 +01:00
Sayak Paul
0a0bb526aa [LoRA deprecation] LoRA deprecation trilogy (#6450)
* debug

* debug

* more debug

* more more debug

* remove tests for LoRAAttnProcessors.

* rename
2024-01-05 15:48:20 +05:30
Linoy Tsaban
2fada8dc1b [bug fix] fixes #6444 - checkpointing save issue in advanced dreambooth lora sdxl script (#6464)
* unwrap text encoder when saving hook only for full text encoder tuning

* unwrap text encoder when saving hook only for full text encoder tuning

* save embeddings in each checkpoint as well

* save embeddings in each checkpoint as well

* save embeddings in each checkpoint as well

* Update examples/advanced_diffusion_training/train_dreambooth_lora_sdxl_advanced.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-05 15:35:24 +05:30
jiqing-feng
f2d51a28f7 Intel Gen 4 Xeon and later support bf16 (#6367)
* Intel Gen 4 Xeon and later support bf16

* fix bf16 notes
2024-01-05 11:47:28 +05:30
Horseee
811fd06292 [Doc] Add DeepCache in section optimization/General optimizations (#6390)
* add documentation for DeepCache

* fix typo

* add wandb url for DeepCache

* fix some typos

* add item in _toctree.yml

* update formats for arguments

* Update deepcache.md

* Update docs/source/en/optimization/deepcache.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* add StableDiffusionXLPipeline in doc

* Separate SDPipeline and SDXLPipeline

* Add the paper link of ablation experiments for hyper-parameters

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-01-05 09:57:08 +05:30
dg845
f3d1333e02 Improve LCM(-LoRA) Distillation Scripts (#6420)
* Make WDS pipeline interpolation type configurable.

* Make the VAE encoding batch size configurable.

* Make lora_alpha and lora_dropout configurable for LCM LoRA scripts.

* Generalize scalings_for_boundary_conditions function and make the timestep scaling configurable.

* Make LoRA target modules configurable for LCM-LoRA scripts.

* Move resolve_interpolation_mode to src/diffusers/training_utils.py and make interpolation type configurable in non-WDS script.

* apply suggestions from review
2024-01-05 06:55:13 +05:30
Steven Liu
acd926f4f2 [docs] Fix local links (#6440)
fix local links
2024-01-04 09:59:11 -08:00
Lucain
691d8d3e15 Respect offline mode when loading pipeline (#6456)
* Respect offline mode when loading model

* default to local entry if connectionerror
2024-01-04 17:05:55 +01:00
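
Roughly, the behaviour being fixed: with the Hub's offline mode enabled, `from_pretrained` should resolve a previously downloaded pipeline from the local cache instead of failing on a network error. A hedged sketch (the environment variable is normally exported in the shell before launching Python):

```python
import os
os.environ["HF_HUB_OFFLINE"] = "1"  # usually set in the shell, before huggingface_hub is imported

from diffusers import DiffusionPipeline

# served entirely from the local cache; raises if the files were never downloaded
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
```
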
Sayak Paul
e7c0af5e71 add: amused paper link. (#6453) 2024-01-04 13:44:54 +05:30
Sayak Paul
107e02160a [LoRA tests] fix stuff related to assertions arising from the recent changes. (#6448)
* debug

* debug test_with_different_scales_fusion_equivalence

* use the right method.

* place it right.

* let's see.

* let's see again

* alright then.

* add a comment.
2024-01-04 12:55:15 +05:30
sayakpaul
6dbef45e6e Revert "debug"
This reverts commit 7715e6c31c.
2024-01-04 10:39:38 +05:30
sayakpaul
7715e6c31c debug 2024-01-04 10:39:00 +05:30
sayakpaul
05b3d36a25 Revert "debug"
This reverts commit fb4aec0ce3.
2024-01-04 10:38:04 +05:30
sayakpaul
fb4aec0ce3 debug 2024-01-04 10:37:28 +05:30
Sayak Paul
63de23e3db disable running peft non-peft lora test in the peft env. (#6437)
* disable running peft non-peft lora test in the peft env.

* Empty-Commit
2024-01-04 10:18:46 +05:30
Chi
2993257f2a Better way to write binarize() function. (#6394)
* I added a new docstring to the class. This makes it easier for other developers to understand what it does and where it is used.

* Update src/diffusers/models/unet_2d_blocks.py

This change was suggested by the maintainer.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/unet_2d_blocks.py

Add suggested text

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_blocks.py

I changed the Parameter to Args text.

* Update unet_2d_blocks.py

proper indentation set in this file.

* Update unet_2d_blocks.py

a little bit of change in the act_fun argument line.

* I ran the black command to reformat the code style

* Update unet_2d_blocks.py

add a docstring similar to the one in the original diffusion repository.

* Better way to write binarize function

* Solve check_code_quality error

* My mistake: I submitted the pull request without reformatting the file

* Update image_processor.py

* remove extra variable and space

* Update image_processor.py

* Run ruff library to reformat my file

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-01-04 09:32:08 +05:30
Sayak Paul
aad18faa3e Update README_sdxl.md to update the LR (#6432)
Update README_sdxl.md
2024-01-03 20:55:51 +05:30
Sayak Paul
d700140076 [LoRA deprecation] handle rest of the stuff related to deprecated lora stuff. (#6426)
* handle rest of the stuff related to deprecated lora stuff.

* fix: copies

* don't modify the uNet in-place.

* fix: temporal autoencoder.

* manually remove lora layers.

* don't copy unet.

* alright

* remove lora attn processors from unet3d

* fix: unet3d.

* styl

* Empty-Commit
2024-01-03 20:54:09 +05:30
Sayak Paul
2e4dc3e25d [LoRA] add: test to check if peft loras are loadable in non-peft envs. (#6400)
* add: test to check if peft loras are loadable in non-peft envs.

* add torch_device approrpiately.

* fix: get_dummy_inputs().

* test logits.

* rename

* debug

* debug

* fix: generator

* new assertion values after fixing the seed.

* shape

* remove print statements and settle this.

* to update values.

* change values when lora config is initialized under a fixed seed.

* update colab link

* update notebook link

* sanity restored by getting the exact same values without peft.
2024-01-03 09:57:49 +05:30
YiYi Xu
3e2961f0b4 [doc] update inpaint doc to use apply_overlay (#6364)
add doc

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-01-02 11:16:36 -10:00
Vinh H. Pham
79c380bc80 Correct how apply_overlay read crop_coords (#6417)
correct reading variables
2024-01-02 19:31:12 +01:00
Aryan V S
e30b661437 Update lpw_xl pipeline to latest diffusers (#6411)
* add clip_skip, freeu, qkv

* fix

* add ip-adapter support

* callback on step end

* update

* fix NoneType bug

* fix

* add guidance scale embedding

* add textual inversion
2024-01-02 16:28:45 +01:00
Linoy Tsaban
b4077af212 [bug fix] using snr gamma and prior preservation loss in the dreambooth lora sdxl training scripts (#6356)
* change timesteps used to calculate snr when --with_prior_preservation is enabled

* change timesteps used to calculate snr when --with_prior_preservation is enabled (canonical script)

* style

* revert canonical script to before snr gamma change

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-02 09:21:39 -06:00
Daniel Socek
9f2bff502e [svd] fix noise_aug_strength type in svd pipe (#6389) 2024-01-02 14:45:07 +01:00
CyrusVorwald
0cb92717f9 add StableDiffusionXLControlNetInpaintPipeline to auto pipeline (#6302)
* add StableDiffusionXLControlNetInpaintPipeline to auto pipeline

* fixed style
2024-01-02 14:44:48 +01:00
Fabio Rigano
86714b72d0 Add unload_ip_adapter method (#6192)
* Add unload_ip_adapter method

* Update attn_processors with original layers

* Add test

* Use set_default_attn_processor

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-02 14:40:46 +01:00
Sayak Paul
61f6c5472a [LoRA] Remove the use of depcrecated loRA functionalities such as LoRAAttnProcessor (#6369)
* start deprecating loraattn.

* fix

* wrap into unet_lora_state_dict

* utilize text_encoder_lora_params

* utilize text_encoder_attn_modules

* debug

* debug

* remove print

* don't use text encoder for test_stable_diffusion_lora

* load the procs.

* set_default_attn_processor

* fix: set_default_attn_processor call.

* fix: lora_components[unet_lora_params]

* checking for 3d.

* 3d.

* more fixes.

* debug

* debug

* debug

* debug

* more debug

* more debug

* more debug

* more debug

* more debug

* more debug

* hack.

* remove comments and prep for a PR.

* appropriate set_lora_weights()

* fix

* fix: test_unload_lora_sd

* fix: test_unload_lora_sd

* use default attention processors.

* debug

* debug nan

* debug nan

* debug nan

* use NaN instead of inf

* remove comments.

* fix: test_text_encoder_lora_state_dict_unchanged

* attention processor default

* default attention processors.

* default

* style
2024-01-02 18:14:04 +05:30
lookas
17546020fc Fix #6409 (#6410)
* Update value_guided_sampling.py

Fix #6409

* Comply with code style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-02 11:16:21 +05:30
2510
8a366b835c Fix gradient-checkpointing option is ignored in SDXL+LoRA training. (#6388) (#6402)
* Fix gradient-checkpointing option is ignored in SDXL+LoRA training. (#6388)

* Fix gradient-checkpointing option is ignored in SD+LoRA training.

* Fix gradient checkpoint is not applied to text encoders. (SDXL+LoRA)

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-01-01 08:51:04 +05:30
Sayak Paul
61d223c884 add: CUDA graph details. (#6408) 2023-12-31 13:43:26 +05:30
apolinário
bf725e044e Add new WebUI conversion state_dict_utils to __init__ utils (#6404)
* Add new state_dict_utils to __init__ utils

* style

---------

Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>
2023-12-30 09:31:39 -06:00
apolinário
1622265e13 Add WebUI format support to Advanced Training Script (#6403)
* Add WebUI format support to Advanced Training Script

* style

---------

Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>
2023-12-30 08:45:49 -06:00
apolinário
0b63ad5ad5 Create convert_diffusers_sdxl_lora_to_webui.py (#6395)
* Create convert_diffusers_sdxl_lora_to_webui.py

* Move some conversion logic to utils

* fix logging import

* Add usage example

---------

Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>
2023-12-30 08:15:11 -06:00
Sayak Paul
6a376ceea2 [LoRA] remove unnecessary components from lora peft test suite (#6401)
remove unnecessary components from lora peft test suite
2023-12-30 18:25:40 +05:30
gzguevara
9f283b01d2 changed w&b report link (#6387) 2023-12-29 19:49:11 +05:30
Sayak Paul
203724e9d9 [Docs] add note on fp16 in fast diffusion (#6380)
add note on fp16
2023-12-29 09:38:50 +05:30
gzguevara
e7044a4221 multi-subject-dreambooth-inpainting with 🤗 datasets (#6378)
* files added

* fixing code quality

* fixing code quality

* fixing code quality

* fixing code quality

* sorted import block

* separated import wandb

* ruff on script

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-29 09:33:49 +05:30
Sayak Paul
034b39b8cb [docs] add details concerning diffusers-specific bits. (#6375)
add details concerning diffusers-specific bits.
2023-12-28 23:12:49 +05:30
Sayak Paul
2db73f4a50 remove delete documentation trigger workflows. (#6373) 2023-12-28 18:26:14 +05:30
Adrian Punga
84d7faebe4 Fix support for MPS in KDPM2AncestralDiscreteScheduler (#6365)
Fix support for MPS

MPS doesn't support float64
2023-12-28 10:22:02 +01:00
YiYi Xu
4c483deb90 [refactor embeddings] gligen + ip-adapter (#6244)
* refactor ip-adapter-imageproj, gligen

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-12-27 18:48:42 -10:00
Sayak Paul
1ac07d8a8d [Training examples] Follow up of #6306 (#6346)
* add to dreambooth lora.

* add: t2i lora.

* add: sdxl t2i lora.

* style

* lcm lora sdxl.

* unwrap

* fix: enable_adapters().
2023-12-28 07:37:50 +05:30
apolinário
1fff527702 Fix keys for lora format on advanced training scripts (#6361)
fix keys for lora format on advanced training scripts
2023-12-27 11:38:03 -06:00
apolinário
645a62bf3b Add PEFT to advanced training script (#6294)
* Fix ProdigyOPT in SDXL Dreambooth script

* style

* style

* Add PEFT to Advanced Training Script

* style

* style

*  style 

* change order for logic operation

* add lora alpha

* style

* Align PEFT to new format

* Update train_dreambooth_lora_sdxl_advanced.py

Apply #6355 fix

---------

Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>
2023-12-27 10:00:32 -03:00
Dhruv Nair
6414d4e4f9 Fix chunking in SVD (#6350)
fix
2023-12-27 13:07:41 +01:00
Andy W
43672b4a22 Fix "push_to_hub only create repo in consistency model lora SDXL training script" (#6102)
* fix

* style fix

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-27 15:25:19 +05:30
dg845
9df3d84382 Fix LCM distillation bug when creating the guidance scale embeddings using multiple GPUs. (#6279)
Fix bug when creating the guidance embeddings using multiple GPUs.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-27 14:25:21 +05:30
Jianqi Pan
c751449011 fix: use retrieve_latents (#6337) 2023-12-27 10:44:26 +05:30
Dhruv Nair
c1e8bdf1d4 Move ControlNetXS into Community Folder (#6316)
* update

* update

* update

* update

* update

* make style

* remove docs

* update

* move to research folder.

* fix-copies

* remove _toctree entry.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-27 08:15:23 +05:30
Sayak Paul
78b87dc25a [LoRA] make LoRAs trained with peft loadable when peft isn't installed (#6306)
* spit diffusers-native format from the get go.

* rejig the peft_to_diffusers mapping.
2023-12-27 08:01:10 +05:30
Will Berman
0af12f1f8a amused update links to new repo (#6344)
* amused update links to new repo

* lint
2023-12-26 22:46:28 +01:00
Justin Ruan
6e123688dc Remove unused parameters and fixed FutureWarning (#6317)
* Remove unused parameters and fixed `FutureWarning`

* Fixed wrong config instance

* update unittest for `DDIMInverseScheduler`
2023-12-26 22:09:10 +01:00
YiYi Xu
f0a588b8e2 adding auto1111 features to inpainting pipeline (#6072)
* add inpaint_full_res

* fix

* update

* move get_crop_region to image processor

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* move apply_overlay to image processor

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-12-26 10:20:29 -10:00
priprapre
fa31704420 [SDXL-IP2P] Update README_sdxl, Replace the link for wandb log with the correct run (#6270)
Replace the link for wandb log with the correct run
2023-12-26 21:13:11 +01:00
Sayak Paul
9d79991da0 [Docs] fix: video rendering on svd. (#6330)
fix: video rendering on svd.
2023-12-26 21:05:22 +01:00
Will Berman
7d865ac9c6 amused other pipelines docs (#6343)
other pipelines
2023-12-26 20:20:32 +01:00
Dhruv Nair
fb02316db8 Add AnimateDiff conversion scripts (#6340)
* add scripts

* update
2023-12-26 22:40:00 +05:30
Dhruv Nair
98a2b3d2d8 Update Animatediff docs (#6341)
* update

* update

* update
2023-12-26 22:39:46 +05:30
Dhruv Nair
2026ec0a02 Interruptable Pipelines (#5867)
* add interruptable pipelines

* add tests

* updatemsmq

* add interrupt property

* make fix copies

* Revert "make fix copies"

This reverts commit 914b35332b.

* add docs

* add tutorial

* Update docs/source/en/tutorials/interrupting_diffusion_process.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tutorials/interrupting_diffusion_process.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update

* fix quality issues

* fix

* update

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-26 22:39:26 +05:30
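
The tutorial added here revolves around a callback that flips the pipeline's interrupt flag. A hedged sketch that stops denoising after a fixed number of steps:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def stop_after(step_limit):
    def callback(pipeline, step, timestep, callback_kwargs):
        if step >= step_limit:
            pipeline._interrupt = True  # remaining steps are skipped; the partial result is returned
        return callback_kwargs
    return callback

image = pipe(
    "a castle at dusk", num_inference_steps=50, callback_on_step_end=stop_after(10)
).images[0]
```
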
dg845
3706aa3305 Add rescale_betas_zero_snr Argument to DDPMScheduler (#6305)
* Add rescale_betas_zero_snr argument to DDPMScheduler.

* Propagate rescale_betas_zero_snr changes to DDPMParallelScheduler.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-26 17:54:30 +01:00
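
A short hedged example of the new flag; it is typically combined with v-prediction and trailing timestep spacing, per the zero-terminal-SNR recipe this change follows:

```python
from diffusers import DDPMScheduler

scheduler = DDPMScheduler.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    subfolder="scheduler",
    rescale_betas_zero_snr=True,      # enforce zero terminal SNR
    prediction_type="v_prediction",
    timestep_spacing="trailing",
)
```
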
Sayak Paul
d4f10ea362 [Diffusion fast] add doc for diffusion fast (#6311)
* add doc for diffusion fast

* add entry to _toctree

* Apply suggestions from code review

* fix titlew

* fix: title entry

* add note about fuse_qkv_projections
2023-12-26 22:19:55 +05:30
Younes Belkada
3aba99af8f [Peft / Lora] Add adapter_names in fuse_lora (#5823)
* add adapter_name in fuse

* add tesrt

* up

* fix CI

* adapt from suggestion

* Update src/diffusers/utils/testing_utils.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* change to `require_peft_version_greater`

* change variable names in test

* Update src/diffusers/loaders/lora.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* break into 2 lines

* final comments

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2023-12-26 16:54:47 +01:00
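
A hedged sketch of the new argument: after loading several LoRAs, fuse only a named subset into the base weights (the LoRA repo ids and adapter names here are placeholders):

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.load_lora_weights("some-user/pixel-art-lora", adapter_name="pixel")  # placeholder repos
pipe.load_lora_weights("some-user/toy-face-lora", adapter_name="toy")

pipe.fuse_lora(adapter_names=["pixel"], lora_scale=0.8)  # fuse only the "pixel" adapter
```
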
Sayak Paul
6683f97959 [Training] Add datasets version of LCM LoRA SDXL (#5778)
* add: script to train lcm lora for sdxl with 🤗 datasets

* suit up the args.

* remove comments.

* fix num_update_steps

* fix batch unmarshalling

* fix num_update_steps_per_epoch

* fix; dataloading.

* fix microconditions.

* unconditional predictions debug

* fix batch size.

* no need to use use_auth_token

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* make vae encoding batch size an arg

* final serialization in kohya

* style

* state dict rejigging

* feat: no separate teacher unet.

* debug

* fix state dict serialization

* debug

* debug

* debug

* remove prints.

* remove kohya utility and make style

* fix serialization

* fix

* add test

* add peft dependency.

* add: peft

* remove peft

* autocast device determination from accelerator

* autocast

* reduce lora rank.

* remove unneeded space

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* style

* remove prompt dropout.

* also save in native diffusers ckpt format.

* debug

* debug

* debug

* better formation of the null embeddings.

* remove space.

* autocast fixes.

* autocast fix.

* hacky

* remove lora_sayak

* Apply suggestions from code review

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* style

* make log validation leaner.

* move back enabled in.

* fix: log_validation call.

* add: checkpointing tests

* taking my chances to see if disabling autocasting has any effect?

* start debugging

* name

* name

* name

* more debug

* more debug

* index

* remove index.

* print length

* print length

* print length

* move unet.train() after add_adapter()

* disable some prints.

* enable_adapters() manually.

* remove prints.

* some changes.

* fix params_to_optimize

* more fixes

* debug

* debug

* remove print

* disable grad for certain contexts.

* Add support for IPAdapterFull (#5911)

* Add support for IPAdapterFull


Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix a bug in `add_noise` function  (#6085)

* fix

* copies

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>

* [Advanced Diffusion Script] Add Widget default text (#6100)

add widget

* [Advanced Training Script] Fix pipe example (#6106)

* IP-Adapter for StableDiffusionControlNetImg2ImgPipeline (#5901)

* adapter for StableDiffusionControlNetImg2ImgPipeline

* fix-copies

* fix-copies

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* IP adapter support for most pipelines (#5900)

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_upscale.py

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_instruct_pix2pix.py

* update tests

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_panorama.py

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py

* support ip-adapter in src/diffusers/pipelines/stable_diffusion_safe/pipeline_stable_diffusion_safe.py

* support ip-adapter in src/diffusers/pipelines/latent_consistency_models/pipeline_latent_consistency_text2img.py

* support ip-adapter in src/diffusers/pipelines/latent_consistency_models/pipeline_latent_consistency_img2img.py

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

* revert changes to sd_attend_and_excite and sd_upscale

* make style

* fix broken tests

* update ip-adapter implementation to latest

* apply suggestions from review

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* fix: lora_alpha

* make vae casting conditional.

* param upcasting

* propagate comments from https://github.com/huggingface/diffusers/pull/6145

Co-authored-by: dg845 <dgu8957@gmail.com>

* [Peft] fix saving / loading when unet is not "unet" (#6046)

* [Peft] fix saving / loading when unet is not "unet"

* Update src/diffusers/loaders/lora.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* undo stablediffusion-xl changes

* use unet_name to get unet for lora helpers

* use unet_name

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [Wuerstchen] fix fp16 training and correct lora args (#6245)

fix fp16 training

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [docs] fix: animatediff docs (#6339)

fix: animatediff docs

* add: note about the new script in readme_sdxl.

* Revert "[Peft] fix saving / loading when unet is not "unet" (#6046)"

This reverts commit 4c7e983bb5.

* Revert "[Wuerstchen] fix fp16 training and correct lora args (#6245)"

This reverts commit 0bb9cf0216.

* Revert "[docs] fix: animatediff docs (#6339)"

This reverts commit 11659a6f74.

* remove tokenize_prompt().

* assistive comments around enable_adapters() and disable_adapters().

---------

Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Fabio Rigano <57982783+fabiorigano@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
Co-authored-by: Charchit Sharma <charchitsharma11@gmail.com>
Co-authored-by: Aryan V S <contact.aryanvs@gmail.com>
Co-authored-by: dg845 <dgu8957@gmail.com>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2023-12-26 21:22:05 +05:30
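A hedged inference sketch for a LoRA produced by the new training script: swap in `LCMScheduler` and sample in a handful of steps. The output directory and prompt are placeholders:

```python
import torch
from diffusers import LCMScheduler, StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("path/to/lcm-lora-sdxl-output")  # placeholder output dir

image = pipe(
    "a photo of an astronaut riding a horse",
    num_inference_steps=4,   # LCM-LoRA targets very few steps
    guidance_scale=1.0,      # little/no CFG is typical for LCM-LoRA
).images[0]
```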
Sayak Paul
4e7b0cb396 [docs] fix: animatediff docs (#6339)
fix: animatediff docs
2023-12-26 19:13:49 +05:30
Kashif Rasul
35b81fffae [Wuerstchen] fix fp16 training and correct lora args (#6245)
fix fp16 training

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-26 11:40:04 +01:00
Kashif Rasul
e0d8c910e9 [Peft] fix saving / loading when unet is not "unet" (#6046)
* [Peft] fix saving / loading when unet is not "unet"

* Update src/diffusers/loaders/lora.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* undo stablediffusion-xl changes

* use unet_name to get unet for lora helpers

* use unet_name

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-26 11:39:28 +01:00
dg845
a3d31e3a3e Change LCM-LoRA README Script Example Learning Rates to 1e-4 (#6304)
Change README LCM-LoRA example learning rates to 1e-4.
2023-12-25 21:29:20 +05:30
Jianqi Pan
84c403aedb fix: cannot set guidance_scale (#6326)
fix: set guidance_scale
2023-12-25 21:16:57 +05:30
Sayak Paul
f4b0b26f7e [Tests] Speed up example tests (#6319)
* remove validation args from textual inversion tests

* reduce number of train steps in textual inversion tests

* fix: directories.

* debug

* fix: directories.

* remove validation tests from textual inversion

* try reducing the time of test_text_to_image_checkpointing_use_ema

* fix: directories

* speed up test_text_to_image_checkpointing

* speed up test_text_to_image_checkpointing_checkpoints_total_limit_removes_multiple_checkpoints

* fix

* speed up test_instruct_pix2pix_checkpointing_checkpoints_total_limit_removes_multiple_checkpoints

* set checkpoints_total_limit to 2.

* test_text_to_image_lora_checkpointing_checkpoints_total_limit_removes_multiple_checkpoints speed up

* speed up test_unconditional_checkpointing_checkpoints_total_limit_removes_multiple_checkpoints

* debug

* fix: directories.

* speed up test_instruct_pix2pix_checkpointing_checkpoints_total_limit

* speed up: test_controlnet_checkpointing_checkpoints_total_limit_removes_multiple_checkpoints

* speed up test_controlnet_sdxl

* speed up dreambooth tests

* speed up test_dreambooth_lora_checkpointing_checkpoints_total_limit_removes_multiple_checkpoints

* speed up test_custom_diffusion_checkpointing_checkpoints_total_limit_removes_multiple_checkpoints

* speed up test_text_to_image_lora_sdxl_text_encoder_checkpointing_checkpoints_total_limit

* speed up # checkpoint-2 should have been deleted

* speed up examples/text_to_image/test_text_to_image.py::TextToImage::test_text_to_image_checkpointing_checkpoints_total_limit

* additional speed ups

* style
2023-12-25 19:50:48 +05:30
Sayak Paul
89459a5d56 fix: lora peft dummy components (#6308)
* fix: lora peft dummy components

* fix: dummy components
2023-12-25 11:26:45 +05:30
Sayak Paul
008d9818a2 fix: t2i adapter paper link (#6314) 2023-12-25 10:45:14 +05:30
mwkldeveloper
2d43094ffc fix RuntimeError: Input type (float) and bias type (c10::Half) should be the same in train_text_to_image_lora.py (#6259)
* fix RuntimeError: Input type (float) and bias type (c10::Half) should be the same

* format source code

* format code

* remove the autocast blocks within the pipeline

* add autocast blocks to pipeline caller in train_text_to_image_lora.py
2023-12-24 14:34:35 +05:30
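The fix moves autocasting out of the pipeline and into the caller; a minimal sketch of the caller-side pattern (model id and prompt are illustrative, not the script's exact code):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Wrap the pipeline call itself in autocast, which avoids the
# float-input / half-bias mismatch named in the commit title.
with torch.autocast("cuda"):
    image = pipe("a validation prompt", num_inference_steps=25).images[0]
```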
Celestial Phineas
7c05b975b7 Fix typos in the ValueError for a nested image list as StableDiffusionControlNetPipeline input. (#6286)
Fixed typos in the `ValueError` for a nested image list as input.
2023-12-24 14:32:24 +05:30
Dhruv Nair
fe574c8b29 LoRA Unfusion test fix (#6291)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-24 14:31:48 +05:30
Sayak Paul
90b9479903 [LoRA PEFT] fix LoRA loading so that correct alphas are parsed (#6225)
* initialize alpha too.

* add: test

* remove config parsing

* store rank

* debug

* remove faulty test
2023-12-24 09:59:41 +05:30
apolinário
df76a39e1b Fix Prodigy optimizer in SDXL Dreambooth script (#6290)
* Fix ProdigyOPT in SDXL Dreambooth script

* style

* style
2023-12-22 06:42:04 -06:00
Bingxin Ke
3369bc810a [Community Pipeline] Add Marigold Monocular Depth Estimation (#6249)
* [Community Pipeline] Add Marigold Monocular Depth Estimation

- add single-file pipeline
- update README

* fix format - add one blank line

* format script with ruff

* use direct image link in example code

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-22 15:41:46 +05:30
Pedro Cuenca
7fe47596af Allow diffusers to load with Flax, w/o PyTorch (#6272) 2023-12-22 09:37:30 +01:00
Dhruv Nair
59d1caa238 Remove peft tests from old lora backend tests (#6273)
update
2023-12-22 13:35:52 +05:30
Dhruv Nair
c022e52923 Remove ONNX inpaint legacy (#6269)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-22 13:35:21 +05:30
Will Berman
4039815276 open muse (#5437)
amused

rename

Update docs/source/en/api/pipelines/amused.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

AdaLayerNormContinuous default values

custom micro conditioning

micro conditioning docs

put lookup from codebook in constructor

fix conversion script

remove manual fused flash attn kernel

add training script

temp remove training script

add dummy gradient checkpointing func

clarify temperatures is an instance variable by setting it

remove additional SkipFF block args

hardcode norm args

rename tests folder

fix paths and samples

fix tests

add training script

training readme

lora saving and loading

non-lora saving/loading

some readme fixes

guards

Update docs/source/en/api/pipelines/amused.md

Co-authored-by: Suraj Patil <surajp815@gmail.com>

Update examples/amused/README.md

Co-authored-by: Suraj Patil <surajp815@gmail.com>

Update examples/amused/train_amused.py

Co-authored-by: Suraj Patil <surajp815@gmail.com>

vae upcasting

add fp16 integration tests

use tuple for micro cond

copyrights

remove casts

delegate to torch.nn.LayerNorm

move temperature to pipeline call

upsampling/downsampling changes
2023-12-21 11:40:55 -08:00
Sayak Paul
5b186b7128 [Refactor] move ldm3d out of stable_diffusion. (#6263)
ldm3d.
2023-12-21 18:59:55 +05:30
Sayak Paul
ab0459f2b7 [Deprecated pipelines] remove pix2pix zero from init (#6268)
remove pix2pix zero from init
2023-12-21 18:17:28 +05:30
Sayak Paul
9c7cc36011 [Refactor] move panorama out of stable_diffusion (#6262)
* move panorama out.

* fix: diffedit

* fix: import.

* fix: import
2023-12-21 18:17:05 +05:30
Sayak Paul
325f6c53ed [Refactor] move attend and excite out of stable_diffusion. (#6261)
* move attend and excite out.

* fix: import

* fix diffedit
2023-12-21 16:49:32 +05:30
Benjamin Bossan
43979c2890 TST Fix LoRA test that fails with PEFT >= 0.7.0 (#6216)
See #6185 for context.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-21 11:50:05 +01:00
Sayak Paul
9ea6ac1b07 [Refactor] move sag out of stable_diffusion (#6264)
move sag out of .
2023-12-21 16:09:49 +05:30
Sayak Paul
2c34c7d6dd [Refactor] move gligen out of stable diffusion. (#6265)
* move gligen out of stable diffusion.

* fix: import

* fix import module
2023-12-21 15:26:52 +05:30
Sayak Paul
bffadde126 [Refactor] move k diffusion out of stable_diffusion (#6267)
move k diffusion out of stable_diffusion
2023-12-21 15:24:24 +05:30
YShow
35a969d297 [Training] remove deprecated method from lora scripts again (#6266)
* remove deprecated method from lora scripts

* check code quality
2023-12-21 14:17:52 +05:30
sayakpaul
c5ff469d0e Revert "move attend and excite out of stable_diffusion"
This reverts commit bcecfbc873.
2023-12-21 12:35:58 +05:30
sayakpaul
bcecfbc873 move attend and excite out of stable_diffusion 2023-12-21 12:35:09 +05:30
Sayak Paul
6269045c5b [Refactor] move diffedit out of stable_diffusion (#6260)
* move diffedit out of stable_diffuson

* fix: import

* style

* fix: import
2023-12-21 12:26:36 +05:30
lvzi
6ca9c4af05 fix: unscale fp16 gradient problem & potential error (#6086) (#6231)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-21 09:09:26 +05:30
dependabot[bot]
0532cece97 Bump transformers from 4.34.0 to 4.36.0 in /examples/research_projects/realfill (#6255)
Bump transformers in /examples/research_projects/realfill

Bumps [transformers](https://github.com/huggingface/transformers) from 4.34.0 to 4.36.0.
- [Release notes](https://github.com/huggingface/transformers/releases)
- [Commits](https://github.com/huggingface/transformers/compare/v4.34.0...v4.36.0)

---
updated-dependencies:
- dependency-name: transformers
  dependency-type: direct:production
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2023-12-21 09:03:17 +05:30
Sayak Paul
22b45304bf [Refactor upsamplers and downsamplers] separate out upsamplers and downsamplers. (#6128)
* separate out upsamplers and downsamplers.

* import all the necessary blocks in resnet for backward comp.

* move upsample2d and downsample2d to utils.

* move downsample_2d to downsamplers.py

* apply feedback

* fix import

* samplers -> sampling
2023-12-20 21:01:33 +05:30
Beinsezii
457abdf2cf EulerAncestral add rescale_betas_zero_snr (#6187)
* EulerAncestral add `rescale_betas_zero_snr`

Uses same infinite sigma fix from EulerDiscrete. Interestingly the
ancestral version had the opposite problem: too much contrast instead of
too little.

* UT for EulerAncestral `rescale_betas_zero_snr`

* EulerAncestral upcast samples during step()

It helps this scheduler too, particularly when the model is using bf16.

While the noise dtype is still the model's, it's automatically upcast for the add, so all it affects is determinism.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-20 13:09:25 +05:30
hako-mikan
ff43dba7ea [Fix] Fix Regional Prompting Pipeline (#6188)
* Update regional_prompting_stable_diffusion.py

* reformat

* reformat

* reformat

* reformat

* reformat

* reformat

* reformat

* reformat

* reformat

* reformat

* reformat

* reformat

* Update regional_prompting_stable_diffusion.py

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-20 10:37:19 +05:30
Steven Liu
5433962992 [docs] Batched seeds (#6237)
batched seed
2023-12-19 16:50:18 -08:00
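The batched-seeds doc boils down to one generator per image, so each sample in a batch is individually reproducible; a small sketch of that pattern (model id and seeds are illustrative):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

prompts = ["a photo of a cat"] * 4
# One generator (and therefore one seed) per image in the batch.
generators = [torch.Generator("cuda").manual_seed(seed) for seed in range(4)]
images = pipe(prompts, generator=generators).images
```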
raven
df476d9f63 [Docs] Fix a code example in the ControlNet Inpainting documentation (#6236)
fix document on masked image in inpainting controlnet
2023-12-19 13:14:37 -08:00
YiYi Xu
3e71a20650 [refactor embeddings]pixart-alpha (#6212)
pixart-alpha

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-12-19 07:07:24 -10:00
Sayak Paul
bf40d7d82a add peft dependency to fast push tests (#6229)
* add peft dependency

* add peft dependency at the correct place.
2023-12-19 13:26:25 +05:30
Dhruv Nair
32ff4773d4 ControlNetXS fixes. (#6228)
update
2023-12-19 11:58:34 +05:30
Sayak Paul
288ceebea5 [T2I LoRA training] fix: unscale fp16 gradient problem (#6119)
* fix: unscale fp16 gradient problem

* fix for dreambooth lora sdxl

* make the type-casting conditional.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-12-19 09:54:17 +05:30
Sayak Paul
9221da4063 fix: init for vae during pixart tests (#6215)
* fix: init for vae during pixart tests

* print the values

* add flatten

* correct assertion value for test_inference

* correct assertion values for test_inference_non_square_images

* run styling

* debug test_inference_with_multiple_images_per_prompt

* fix assertion values for test_inference_with_multiple_images_per_prompt
2023-12-18 18:16:57 -10:00
YiYi Xu
57fde871e1 offload the optional module image_encoder (#6151)
* offload image_encoder

* add test

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-18 15:10:01 -10:00
Fabio Rigano
68e962395c Add converter method for ip adapters (#6150)
* Add converter method for ip adapters

* Move converter method

* Update to image proj converter

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-18 13:46:43 -10:00
Dhruv Nair
781775ea56 Slow Test for Pipelines minor fixes (#6221)
update
2023-12-19 00:45:51 +05:30
Patrick von Platen
fa3c86beaf [SVD] Fix guidance scale (#6002)
* [SVD] Fix guidance scale

* make style
2023-12-18 19:33:24 +01:00
Haofan Wang
7d0a47f387 Update train_text_to_image_lora.py (#6144)
* Update train_text_to_image_lora.py

* Fix typo?

---------

Co-authored-by: M. Tolga Cangöz <46008593+standardAI@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-18 19:33:05 +01:00
Aryan V S
67b3d3267e Support img2img and inpaint in lpw-xl (#6114)
* add img2img and inpaint support to lpw-xl

* update community README

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-18 19:19:11 +01:00
TilmannR
4e77056885 Update README.md (#6191)
Typo: The script for LoRA training is `train_text_to_image_lora_prior.py` not `train_text_to_image_prior_lora.py`.

Alternatively you could rename the file and keep the README.md unchanged.
2023-12-18 19:08:29 +01:00
Dhruv Nair
a0c54828a1 Deprecate Pipelines (#6169)
* deprecate pipe

* make style

* update

* add deprecation message

* format

* remove tests for deprecated pipelines

* remove deprecation message

* make style

* fix copies

* clean up

* clean

* clean

* clean

* clean up

* clean up

* clean up toctree

* clean up

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-12-18 23:08:29 +05:30
Patrick von Platen
8d891e6e1b [Torch Compile] Fix torch compile for svd vae (#6217) 2023-12-18 18:21:17 +01:00
Patrick von Platen
cce1fe2d41 [Text-to-Video] Clean up pipeline (#6213)
* make style

* make style

* make style

* make style
2023-12-18 18:21:09 +01:00
Abin Thomas
d816bcb5e8 Fix t2i. blog url (#6205) 2023-12-18 09:12:28 -08:00
d8ahazard
6976cab7ca Fix possible re-conversion issues after extracting from safetensors (#6097)
* Fix possible re-conversion issues after extracting from diffusers

Properly rename specific vae keys.

* Whoops
2023-12-18 11:51:20 +01:00
Dhruv Nair
fcbed3fa79 Fix SDXL Inpainting from single file with Refiner Model (#6147)
* update

* update

* update
2023-12-18 11:45:37 +01:00
Sayak Paul
b98b314b7a [Training] remove deprecated method from lora scripts. (#6207)
remove deprecated method from lora scripts.
2023-12-18 15:52:43 +05:30
Omar Sanseviero
74558ff65b Nit fix to training params (#6200) 2023-12-18 11:06:16 +01:00
Yudong Jin
49644babd3 Fix the test script in examples/text_to_image/README.md (#6209)
* Update examples/text_to_image/README.md

* Update examples/text_to_image/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-18 15:36:00 +05:30
Sayak Paul
56b3b21693 [Refactor autoencoders] feat: introduce autoencoders module (#6129)
* feat: introduce autoencoders module

* more changes for styling and copy fixing

* path changes in the docs.

* fix: import structure in init.

* fix controlnetxs import
2023-12-18 12:42:15 +05:30
Sayak Paul
9cef07da5a [Benchmarks] fix: lcm benchmarking reporting (#6198)
* fix: lcm benchmarking reporting

* fix generate_csv_dict call.
2023-12-17 15:32:11 +05:30
Sayak Paul
2d94c7838e [Core] feat: enable fused attention projections for other SD and SDXL pipelines (#6179)
* feat: enable fused attention projections for other SD and SDXL pipelines

* add: test for SD fused projections.
2023-12-16 08:45:54 +05:30
Sayak Paul
a81334e3f0 [LoRA] add an error message when dealing with _best_guess_weight_name offline (#6184)
* add an error message when dealing with _best_guess_weight_name offline

* simplify condition
2023-12-16 08:36:08 +05:30
Dhruv Nair
d704a730cd Compile test fix (#6104)
* update

* update
2023-12-15 18:34:46 +05:30
dg845
49db233b35 Clean Up Comments in LCM(-LoRA) Distillation Scripts. (#6145)
* Clean up comments in LCM(-LoRA) distillation scripts.

* Calculate predicted source noise noise_pred correctly for all prediction_types.

* make style

* apply suggestions from review

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-15 18:18:16 +05:30
Dhruv Nair
93ea26f272 Add PEFT to training deps (#6148)
add peft to training deps

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-15 08:39:59 +05:30
Dhruv Nair
f5dfe2a8b0 LoRA test fixes (#6163)
* update

* update

* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-15 08:39:41 +05:30
Patrick von Platen
4836cfad98 [Sigmas] Keep sigmas on CPU (#6173)
* correct

* Apply suggestions from code review

* make style
2023-12-15 07:43:18 +05:30
Kuba
1ccbfbb663 [docs] Add missing \ in lora.md (#6174) 2023-12-14 16:55:43 -08:00
Linoy Tsaban
29dfe22a8e [advanced dreambooth lora sdxl training script] load pipeline for inference only if validation prompt is used (#6171)
* load pipeline for inference only if validation prompt is used

* move things outside

* load pipeline for inference only if validation prompt is used

* fix readme when validation prompt is used

---------

Co-authored-by: linoytsaban <linoy@huggingface.co>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
2023-12-14 11:45:33 -06:00
Aryan V S
56806cdbfd Add missing subclass docs, Fix broken example in SD_safe (#6116)
* fix broken example in pipeline_stable_diffusion_safe

* fix typo in pipeline_stable_diffusion_pix2pix_zero

* add missing docs

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-14 09:20:30 -08:00
Steven Liu
8ccc76ab37 [docs] IP-Adapter API doc (#6140)
add ip-adapter

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-14 09:19:37 -08:00
Monohydroxides
c46711e895 [Community] Add SDE Drag pipeline (#6105)
* Add community pipeline: sde_drag.py

* Update README.md

* Update README.md

Update example code and visual example

* Update sde_drag.py

Update code example.
2023-12-14 20:47:20 +05:30
Sayak Paul
1d686bac81 [feat: Benchmarking Workflow] add stuff for a benchmarking workflow (#5839)
* add poc for benchmarking workflow.

* import

* fix argument

* fix: argument

* fix: path

* fix

* fix

* path

* output csv files.

* workflow cleanup

* append token

* add utility to push to hf dataset

* fix: kw arg

* better reporting

* fix: headers

* better formatting of the numbers.

* better type annotation

* fix: formatting

* momentarily disable check

* push results.

* remove disable check

* introduce base classes.

* img2img class

* add inpainting pipeline

* introduce base benchmark class.

* add img2img and inpainting

* feat: utility to compare changes

* fix

* fix import

* add args

* basepath

* better exception handling

* better path handling

* fix

* fix

* remove

* fix

* fix

* add: support for controlnet.

* image_url -> url

* move images to huggingface hub

* correct urls.

* root_ckpt

* flush before benchmarking

* don't install accelerate from source

* add runner

* simplify Diffusers Benchmarking step

* change runner

* fix: subprocess call.

* filter percentage values

* fix controlnet benchmark

* add t2i adapters.

* fix filter columns

* fix t2i adapter benchmark

* fix init.

* fix

* remove safetensors flag

* fix args print

* fix

* feat: run_command

* add adapter resolution mapping

* benchmark t2i adapter fix.

* fix adapter input

* fix

* convert to L.

* add flush() at appropriate places

* better filtering

* okay

* get env for torch

* convert to float

* fix

* filter out nans.

* better comment

* sdxl

* sdxl for other benchmarks.

* fix: condition

* fix: condition for inpainting

* fix: mapping for resolution

* fix

* include kandinsky and wuerstchen

* fix: Wuerstchen

* Empty-Commit

* [Community] AnimateDiff + Controlnet Pipeline (#5928)

* begin work on animatediff + controlnet pipeline

* complete todos, uncomment multicontrolnet, input checks

Co-Authored-By: EdoardoBotta <botta.edoardo@gmail.com>

* update

Co-Authored-By: EdoardoBotta <botta.edoardo@gmail.com>

* add example

* update community README

* Update examples/community/README.md

---------

Co-authored-by: EdoardoBotta <botta.edoardo@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* EulerDiscreteScheduler add `rescale_betas_zero_snr` (#6024)

* EulerDiscreteScheduler add `rescale_betas_zero_snr`

* Revert "[Community] AnimateDiff + Controlnet Pipeline (#5928)"

This reverts commit 821726d7c0.

* Revert "EulerDiscreteScheduler add `rescale_betas_zero_snr` (#6024)"

This reverts commit 3dc2362b5a.

* add SDXL turbo

* add lcm lora to the mix as well.

* fix

* increase steps to 2 when running turbo i2i

* debug

* debug

* debug

* fix for good

* fix and isolate better

* fuse lora so that torch compile works with peft

* fix: LCMLoRA

* better identification for LCM

* change to cron job

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Aryan V S <contact.aryanvs@gmail.com>
Co-authored-by: EdoardoBotta <botta.edoardo@gmail.com>
Co-authored-by: Beinsezii <39478211+Beinsezii@users.noreply.github.com>
2023-12-12 11:03:34 +05:30
M. Tolga Cangöz
0a401b95b7 [Docs] Fix typos (#6122)
Fix typos and trim trailing whitespaces
2023-12-11 10:55:28 -08:00
Edward Li
664e931bcb Correct type annotation for VaeImageProcessor.numpy_to_pil (#6111)
From `(np.ndarray) -> PIL.Image.Image` to `(np.ndarray) -> List[PIL.Image.Image]`.
2023-12-11 15:22:04 +05:30
Aryan V S
88bdd97ccd IP adapter support for most pipelines (#5900)
* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_upscale.py

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_attend_and_excite.py

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_instruct_pix2pix.py

* update tests

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_panorama.py

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_sag.py

* support ip-adapter in src/diffusers/pipelines/stable_diffusion_safe/pipeline_stable_diffusion_safe.py

* support ip-adapter in src/diffusers/pipelines/latent_consistency_models/pipeline_latent_consistency_text2img.py

* support ip-adapter in src/diffusers/pipelines/latent_consistency_models/pipeline_latent_consistency_img2img.py

* support ip-adapter in src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

* revert changes to sd_attend_and_excite and sd_upscale

* make style

* fix broken tests

* update ip-adapter implementation to latest

* apply suggestions from review

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-10 21:19:14 +05:30
Charchit Sharma
08b453e382 IP-Adapter for StableDiffusionControlNetImg2ImgPipeline (#5901)
* adapter for StableDiffusionControlNetImg2ImgPipeline

* fix-copies

* fix-copies

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-09 11:02:55 +05:30
apolinário
2a111bc9fe [Advanced Training Script] Fix pipe example (#6106) 2023-12-08 15:56:35 +01:00
apolinário
16e6997f0d [Advanced Diffusion Script] Add Widget default text (#6100)
add widget
2023-12-08 12:45:27 +01:00
YiYi Xu
3b9b98656e Fix a bug in add_noise function (#6085)
* fix

* copies

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-12-07 11:35:28 -10:00
Fabio Rigano
b65928b556 Add support for IPAdapterFull (#5911)
* Add support for IPAdapterFull


Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-12-07 06:40:39 -10:00
Beinsezii
6bf1ca2c79 EulerDiscreteScheduler add rescale_betas_zero_snr (#6024)
* EulerDiscreteScheduler add `rescale_betas_zero_snr`
2023-12-06 21:51:04 -10:00
Aryan V S
978dec9014 [Community] AnimateDiff + Controlnet Pipeline (#5928)
* begin work on animatediff + controlnet pipeline

* complete todos, uncomment multicontrolnet, input checks

Co-Authored-By: EdoardoBotta <botta.edoardo@gmail.com>

* update

Co-Authored-By: EdoardoBotta <botta.edoardo@gmail.com>

* add example

* update community README

* Update examples/community/README.md

---------

Co-authored-by: EdoardoBotta <botta.edoardo@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-12-06 21:01:41 -10:00
Dhruv Nair
79a7ab92d1 Fix clearing backend cache from device agnostic testing (#6075)
update
2023-12-07 11:18:31 +05:30
Younes Belkada
c2717317f0 [PEFT] Adapt example scripts to use PEFT (#5388)
* adapt example scripts to use PEFT

* Update examples/text_to_image/train_text_to_image_lora.py

* fix

* add for SDXL

* oops

* make sure to install peft

* fix

* fix

* fix dreambooth and lora

* more fixes

* add peft to requirements.txt

* fix

* final fix

* add peft version in requirements

* remove comment

* change variable names

* add few lines in readme

* add to reqs

* style

* fix issues

* fix lora dreambooth xl tests

* init_lora_weights to gaussian and add out proj where missing

* amend requirements.

* amend requirements.txt

* add correct peft versions

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-07 09:39:29 +05:30
Ian
bf7f9b49a2 Fix typing inconsistency in Euler discrete scheduler (#6052) 2023-12-06 23:45:16 +01:00
UmerHA
e192ae08d3 Add ControlNet-XS support (#5827)
* Check in 23-10-05

* check-in 23-10-06

* check-in 23-10-07 2pm

* check-in 23-10-08

* check-in 231009T1200

* check-in 230109

* checkin 231010

* init + forward run

* checkin

* checkin

* ControlNetXSModel is now saveable+loadable

* Forward works

* checkin

* Pipeline works with `no_control=True`

* checkin

* debug: save intermediate outputs of resnet

* checkin

* Understood time error + fixed connection error

* checkin

* checkin 231106T1600

* turned off detailed debug prints

* time debug logs

* small fix

* Separated control_scale for connections/time

* simplified debug logging

* Full denoising works with control scale = 0

* aligned logs

* Added control_attention_head_dim param

* Passing n_heads instead of dim_head into ctrl unet

* Fixed ctrl midblock bug

* Cleanup

* Fixed time dtype bug

* checkin

* 1. from_unet, 2. base passed, 3. all unet params

* checkin

* Finished docstrings

* cleanup

* make style

* checkin

* more tests pass

* Fixed tests

* removed debug logs

* make style + quality

* make fix-copies

* fixed documentation

* added cnxs to doc toc

* added control start/end param

* Update controlnetxs_sdxl.md

* tried to fix copies..

* Fixed norm_num_groups in from_unet

* added sdxl-depth test

* created SD2.1 controlnet-xs pipeline

* re-added debug logs

* Adjusting group norm ; readded logs

* Added debug log statements

* removed debug logs ; started tests for sd2.1

* updated sd21 tests

* fixed tests

* fixed tests

* slightly increased error tolerance for 1 test

* make style & quality

* Added docs for CNXS-SD

* make fix-copies

* Fixed sd compile test ; fixed gradient ckpointing

* vae downs = cnxs conditioning downs; removed guess

* make style & quality

* Fixed tests

* fixed test

* Incorporated review feedback

* simplified control model surgery

* fixed tests & make style / quality

* Updated docs; deleted pip & cursor files

* Rolled back minimal change to resnet

* Update resnet.py

* Update resnet.py

* Update src/diffusers/models/controlnetxs.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/controlnetxs.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Incorporated review feedback

* Update docs/source/en/api/pipelines/controlnetxs_sdxl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/controlnetxs.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/controlnetxs.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/controlnetxs.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/controlnetxs.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/controlnetxs.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/controlnet_xs/pipeline_controlnet_xs.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/controlnetxs.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/controlnet_xs/pipeline_controlnet_xs_sd_xl.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Incorporated doc feedback

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-12-06 23:33:47 +01:00
Steven Liu
87a09d66f3 [docs] SDXL Turbo (#6065)
api docs
2023-12-06 14:33:14 -08:00
Lucain
75ada25048 Harmonize HF environment variables + deprecate use_auth_token (#6066)
* Harmonize HF environment variables + deprecate use_auth_token

* fix import

* fix
2023-12-06 22:22:31 +01:00
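In practice the deprecation means `token=` (or the shared Hugging Face environment variables) replaces `use_auth_token=`; a sketch, with a placeholder repo id and token:

```python
import torch
from diffusers import DiffusionPipeline

# Before (deprecated):
# pipe = DiffusionPipeline.from_pretrained("org/gated-model", use_auth_token="hf_...")

# After: pass `token=`, or rely on `huggingface-cli login` / the HF_TOKEN env var.
pipe = DiffusionPipeline.from_pretrained(
    "org/gated-model",   # placeholder repo id
    token="hf_...",      # placeholder token
    torch_dtype=torch.float16,
)
```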
Patrick von Platen
2243a59483 [Euler Discrete] Fix sigma (#6078)
* [Euler Discrete] Fix sigma

* make style
2023-12-06 19:59:38 +01:00
apolinário
466d32c442 [Advanced Diffusion Training] Cache latents to avoid VAE passes for every training step (#6076)
* add cache latents

* style
2023-12-06 14:46:53 +01:00
Dhruv Nair
20ba1fdbbd Disable Tests Fetcher (#6060)
update
2023-12-06 18:10:11 +05:30
Pedro Cuenca
ab6672fecd Use CC12M for LCM WDS training example (#5908)
* Fix SD scripts - there are only 2 items per batch

* Adjustments to make the SDXL scripts work with other datasets

* Use public webdataset dataset for examples

* make style

* Minor tweaks to the readmes.

* Stress that the dataset is illustrative.
2023-12-06 10:35:36 +01:00
Dhruv Nair
f90a5139a2 fix 2023-12-06 06:03:58 +00:00
Sayak Paul
a2bc2e14b9 [feat] allow SDXL pipeline to run with fused QKV projections (#6030)
* debug

* from step

* print

* turn sigma a list

* make str

* init_noise_sigma

* comment

* remove prints

* feat: introduce fused projections

* change to a better name

* no grad

* device.

* device

* dtype

* okay

* print

* more print

* fix: unbind -> split

* fix: qkv -> k

* enable disable

* apply attention processor within the method

* attn processors

* _enable_fused_qkv_projections

* remove print

* add fused projection to vae

* add todos.

* add: documentation and cleanups.

* add: test for qkv projection fusion.

* relax assertions.

* relax further

* fix: docs

* fix-copies

* correct error message.

* Empty-Commit

* better conditioning on disable_fused_qkv_projections

* check

* check processor

* bfloat16 computation.

* check latent dtype

* style

* remove copy temporarily

* cast latent to bfloat16

* fix: vae -> self.vae

* remove print.

* add _change_to_group_norm_32

* comment out stuff that didn't work

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* reflect patrick's suggestions.

* fix imports

* fix: disable call.

* fix more

* fix device and dtype

* fix conditions.

* fix more

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-12-06 07:33:26 +05:30
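A short sketch of toggling the fused projections this PR introduces (checkpoint and prompt are illustrative):

```python
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

pipe.fuse_qkv_projections()    # fuse the attention (and VAE) q/k/v projections
image = pipe("a lighthouse at dawn", num_inference_steps=30).images[0]
pipe.unfuse_qkv_projections()  # restore the original, unfused projections
```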
Arsalan
f427345ab1 Device agnostic testing (#5612)
* utils and test modifications to enable device agnostic testing

* device for manual seed in unet1d

* fix generator condition in vae test

* consistency changes to testing

* make style

* add device agnostic testing changes to source and one model test

* make dtype check fns private, log cuda fp16 case

* remove dtype checks from import utils, move to testing_utils

* adding tests for most model classes and one pipeline

* fix vae import
2023-12-05 19:04:13 +05:30
apolinário
6e221334cd [advanced_dreambooth_lora_sdxl_tranining_script] save embeddings locally fix (#6058)
* Update train_dreambooth_lora_sdxl_advanced.py

* remove global function args from dreamboothdataset class

* style

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-05 13:52:34 +01:00
Patrick von Platen
53bc30dd45 [From single file] remove depr warning (#6043) 2023-12-05 18:12:25 +05:30
Radamés Ajna
eacf5e34eb Fix demofusion (#6049)
* Update pipeline_demofusion_sdxl.py

* Update README.md
2023-12-05 18:10:46 +05:30
Dhruv Nair
4c05f7856a Ldm unet convert fix (#6038)
* fix

* fix ldm conversion

* fix linting
2023-12-05 18:01:02 +05:30
Dhruv Nair
bbd3572044 Pin Ruff Version (#6059)
pin ruff
2023-12-05 17:51:37 +05:30
Dhruv Nair
f948778322 Move kandinsky convert script (#6047)
move kandinsky convert script
2023-12-05 15:12:37 +05:30
Steven Liu
4684ea2fe8 [docs] #Copied from mechanism (#6007)
* copied from section

* feedback
2023-12-04 10:12:52 -08:00
Steven Liu
b64f835ea7 [docs] Add Kandinsky 3 (#5988)
* add

* fix api docs

* edits
2023-12-04 10:11:15 -08:00
Linoy Tsaban
880c0fdd36 [advanced dreambooth lora training script][bug_fix] change token_abstraction type to str (#6040)
* improve help tags

* style fix

* changes token_abstraction type to string.
support multiple concepts for pivotal using a comma separated string.

* style fixup

* changed logger to warning (not yet available)

* moved the token_abstraction parsing to be in the same block as where we create the mapping of identifier to token

---------

Co-authored-by: Linoy <linoy@huggingface.co>
2023-12-04 18:38:44 +01:00
RuoyiDu
c36f1c3160 [Community Pipeline] DemoFusion: Democratising High-Resolution Image Generation With No $$$ (#6022)
* Add files via upload

* Update README.md

* Update pipeline_demofusion_sdxl.py

* Update pipeline_demofusion_sdxl.py

* Update examples/community/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-04 19:44:57 +05:30
takuoko
0a08d41961 [Feature] Support IP-Adapter Plus (#5915)
* Support IP-Adapter Plus

* fix format

* restore before black format

* restore before black format

* generic

* Refactor PerceiverAttention

* format

* fix test and refactor PerceiverAttention

* generic encode_image

* keep attention implementation

* merge tests

* encode_image backward compatible

* code quality

* fix controlnet inpaint pipeline

* refactor FFN

* refactor FFN

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2023-12-04 12:43:34 +01:00
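A hedged sketch of loading an IP-Adapter Plus checkpoint and conditioning on a reference image; the repo id, weight name, and image URL below are assumptions for illustration:

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.load_ip_adapter(
    "h94/IP-Adapter", subfolder="models", weight_name="ip-adapter-plus_sd15.bin"
)

reference = load_image("https://example.com/reference.png")  # placeholder URL
image = pipe("best quality, high quality", ip_adapter_image=reference).images[0]
```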
Levi McCallum
e185084a5d Add variant argument to dreambooth lora sdxl advanced (#6021)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-04 12:04:15 +01:00
Dhruv Nair
b21729225a Update Tests Fetcher (#5950)
* update setup and deps table

* update

* update

* update

* up

* up

* update

* up

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* update

* quality fix

* fix failure reporting

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-12-04 12:59:41 +05:30
Parth38
8a812e4e14 Update value_guided_sampling.py (#6027)
* Update value_guided_sampling.py

Changed the scheduler step function, as the predict_epsilon parameter is no longer present in the latest DDPMScheduler

* Update value_guided_sampling.md

Updated a link to a working notebook

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-04 10:36:25 +05:30
gujing
bf92e746c0 fix StableDiffusionTensorRT super args error (#6009) 2023-12-04 10:06:23 +05:30
Linoy Tsaban
b785a155d6 [advanced dreambooth lora sdxl training script] improve help tags (#6035)
* improve help tags

* style fix

---------

Co-authored-by: Linoy <linoy@huggingface.co>
2023-12-04 09:41:25 +05:30
Sayak Paul
d486f0e846 [LoRA serialization] fix: duplicate unet prefix problem. (#5991)
* fix: duplicate unet prefix problem.

* Update src/diffusers/loaders/lora.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-12-02 21:35:16 +05:30
Sayak Paul
3351270627 [PixArt Tests] remove fast tests from slow suite (#5945)
remove fast tests from slow suite
2023-12-02 20:58:27 +05:30
Junsong Chen
4520e1221a adapt PixArtAlphaPipeline for pixart-lcm model (#5974)
* adapt PixArtAlphaPipeline for pixart-lcm model

* remove original_inference_steps from __call__

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-12-02 13:30:40 +05:30
Long(Tony) Lian
618260409f LLMGroundedDiffusionPipeline: inherit from DiffusionPipeline and fix peft (#6023)
* LLMGroundedDiffusionPipeline: inherit from DiffusionPipeline and fix peft

* Use main in the revision in the examples

* Add "Copied from" statements in comments

* Fix formatting with ruff
2023-12-01 09:58:25 -10:00
Patrick von Platen
dadd55fb36 Post Release: v0.24.0 (#5985)
* Post Release: v0.24.0

* post pone deprecation

* post pone deprecation

* Add model_index.json
2023-12-01 18:43:44 +01:00
YiYi Xu
1b6c7ea74e [schedulers] create self.sigmas during __init__ (#6006)
* fix dpm
* all scheulers
2023-12-01 07:15:37 -10:00
YiYi Xu
b41f809a4e [Kandinsky 3.0] Follow-up TODOs (#5944)
clean-up kandinsky 3.0
2023-12-01 07:14:22 -10:00
Patrick von Platen
0f55c17e17 fix style 2023-12-01 15:59:34 +00:00
Charchit Sharma
5058d27f12 added attention_head_dim, attention_type, resolution_idx (#6011) 2023-12-01 16:26:58 +01:00
M. Tolga Cangöz
748c1b3ec7 [Docs] Update a link (#6014)
* Update the location of Python's version

* Trim trailing whitespace
2023-12-01 16:26:25 +01:00
M. Tolga Cangöz
523507034f [logging] Fix assertion bug (#6012)
Fix assertion bug
2023-12-01 16:26:04 +01:00
hako-mikan
46c751e970 [Community Pipeline] Regional Prompting Pipeline (#6015)
* Update README.md

* Update README.md

* Add files via upload

* Update README.md

* Update examples/community/README.md

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-12-01 16:22:59 +01:00
Patrick von Platen
bc1d28c888 [From Single File] Allow Text Encoder to be passed (#6020)
Allow text encoder to be passed
2023-12-01 16:19:04 +01:00
Sayak Paul
af378c1dd1 [Easy] minor edits to setup.py (#5996)
minor edits to setup
2023-12-01 20:38:46 +05:30
Steven Liu
6ba4c5395f [docs] Fix SVD video (#6004)
Update svd.md
2023-12-01 16:07:47 +01:00
Linoy Tsaban
c1e4529541 [advanced_dreambooth_lora_sdxl_tranining_script] readme fix (#6019)
readme
2023-12-01 15:14:57 +01:00
Linoy Tsaban
d29d97b616 [examples/advanced_diffusion_training] bug fixes and improvements for LoRA Dreambooth SDXL advanced training script (#5935)
* imports and readme bug fixes

* bug fix - ensures text_encoder params are dtype==float32 (when using pivotal tuning) even if the rest of the model is loaded in fp16

* added pivotal tuning to readme

* mapping token identifier to new inserted token in validation prompt (if used)

* correct default value of --train_text_encoder_frac

* change default value of  --adam_weight_decay_text_encoder

* validation prompt generations when using pivotal tuning bug fix

* style fix

* textual inversion embeddings name change

* style fix

* bug fix - stopping text encoder optimization halfway

* readme - will include token abstraction and new inserted tokens when using pivotal tuning
- added type to --num_new_tokens_per_abstraction

* style fix

---------

Co-authored-by: Linoy Tsaban <linoy@huggingface.co>
2023-12-01 14:18:43 +01:00
Jongho Choi
7d4a257c7f Remove a duplicated line? (#6010)
Update __init__.py
2023-12-01 15:49:36 +05:30
Kristian Mischke
141cd52d56 Fix LLMGroundedDiffusionPipeline super class arguments (#5993)
* make `requires_safety_checker` a kwarg instead of a positional argument as it's more future-proof

* apply `make style` formatting edits

* add image_encoder to arguments and pass to super constructor
2023-11-30 10:15:14 -10:00
Steven Liu
f72b28c75b [docs] Fix video link (#5986)
Update svd.md
2023-11-29 20:52:25 +01:00
Suraj Patil
ada8109d5b Fix SVD doc (#5983)
fix url
2023-11-29 19:55:05 +01:00
Patrick von Platen
b34acbdcbc [SDXL Turbo] Add some docs (#5982)
* add diffusers example

* add diffusers example

* Comment about making it faster

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-11-29 19:52:07 +01:00
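The gist of the SDXL Turbo docs: a single step and no classifier-free guidance. A minimal sketch (prompt is illustrative):

```python
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/sdxl-turbo", torch_dtype=torch.float16, variant="fp16"
).to("cuda")

# Turbo is distilled for very few steps and guidance_scale=0.0.
image = pipe(
    "a cinematic photo of a raccoon wearing a suit",
    num_inference_steps=1,
    guidance_scale=0.0,
).images[0]
```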
Suraj Patil
63f767ef15 Add SVD (#5895)
* begin model

* finish blocks

* add_embedding

* addition_time_embed_dim

* use TimestepEmbedding

* fix temporal res block

* fix time_pos_embed

* fix add_embedding

* add conversion script

* fix model

* up

* add new resnet blocks

* make forward work

* return sample in original shape

* fix temb shape in TemporalResnetBlock

* add spatio temporal transformers

* add vae blocks

* fix blocks

* update

* update

* fix shapes in Alphablender and add time activation in res block

* use new blocks

* style

* fix temb shape

* fix SpatioTemporalResBlock

* reuse TemporalBasicTransformerBlock

* fix TemporalBasicTransformerBlock

* use TransformerSpatioTemporalModel

* fix TransformerSpatioTemporalModel

* fix time_context dim

* clean up

* make temb optional

* add blocks

* rename model

* update conversion script

* remove UNetMidBlockSpatioTemporal

* add in init

* remove unused arg

* remove unused arg

* remove more unsed args

* up

* up

* check for None

* update vae

* update up/mid blocks for decoder

* begin pipeline

* adapt scheduler

* add guidance scalings

* fix norm eps in temporal transformers

* add temporal autoencoder

* make pipeline run

* fix frame decoding

* decode in float32

* decode n frames at a time

* pass decoding_t to decode_latents

* fix decode_latents

* vae encode/decode in fp32

* fix dtype in TransformerSpatioTemporalModel

* type image_latents same as image_embeddings

* allow using different eps in temporal block for video decoder

* fix default values in vae

* pass num frames in decode

* switch spatial to temporal for mixing in VAE

* fix num frames during split decoding

* cast alpha to sample dtype

* fix attention in MidBlockTemporalDecoder

* fix typo

* fix guidance_scales dtype

* fix missing activation in TemporalDecoder

* skip_post_quant_conv

* add vae conversion

* style

* take guidance scale as input

* up

* allow passing PIL to export_video

* accept fps as arg

* add pipeline and vae in init

* remove hack

* use AutoencoderKLTemporalDecoder

* don't scale image latents

* add unet tests

* clean up unet

* clean TransformerSpatioTemporalModel

* add slow svd test

* clean up

* make temb optional in Decoder mid block

* fix norm eps in TransformerSpatioTemporalModel

* clean up temp decoder

* clean up

* clean up

* use c_noise values for timesteps

* use math for log

* update

* fix copies

* doc

* upcast vae

* update forward pass for gradient checkpointing

* make added_time_ids is tensor

* up

* fix upcasting

* remove post quant conv

* add _resize_with_antialiasing

* fix _compute_padding

* cleanup model

* more cleanup

* more cleanup

* more cleanup

* remove freeu

* remove attn slice

* small clean

* up

* up

* remove extra step kwargs

* remove eta

* remove dropout

* remove callback

* remove merge factor args

* clean

* clean up

* move to dedicated folder

* remove attention_head_dim

* docstr and small fix

* update unet doc strings

* rename decoding_t

* correct linting

* store c_skip and c_out

* cleanup

* clean TemporalResnetBlock

* more cleanup

* clean up vae

* clean up

* begin doc

* more cleanup

* up

* up

* doc

* Improve

* better naming

* better naming

* better naming

* better naming

* better naming

* better naming

* better naming

* better naming

* Apply suggestions from code review

* Default chunk size to None

* add example

* Better

* Apply suggestions from code review

* update doc

* Update src/diffusers/pipelines/stable_diffusion_video/pipeline_stable_diffusion_video.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* style

* Get torch compile working

* up

* rename

* fix doc

* add chunking

* torch compile

* torch compile

* add modelling outputs

* torch compile

* Improve chunking

* Apply suggestions from code review

* Update docs/source/en/using-diffusers/svd.md

* Close diff tag

* remove slicing

* resnet docstr

* add docstr in resnet

* rename

* Apply suggestions from code review

* update tests

* Fix output type latents

* fix more

* fix more

* Update docs/source/en/using-diffusers/svd.md

* fix more

* add pipeline tests

* remove unused arg

* clean  up

* make sure get_scaling receives tensors

* fix euler scheduler

* fix get_scalings

* simply euler for now

* remove old test file

* use randn_tensor to create noise

* fix device for rand tensor

* increase expected_max_difference

* fix test_inference_batch_single_identical

* actually fix test_inference_batch_single_identical

* disable test_save_load_float16

* skip test_float16_inference

* skip test_inference_batch_single_identical

* fix test_xformers_attention_forwardGenerator_pass

* Apply suggestions from code review

* update StableVideoDiffusionPipelineSlowTests

* update image

* add diffusers example

* fix more

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
2023-11-29 19:13:36 +01:00
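A hedged usage sketch of the image-to-video pipeline added here, roughly following the SVD docs; the conditioning-image URL and settings are illustrative:

```python
import torch
from diffusers import StableVideoDiffusionPipeline
from diffusers.utils import export_to_video, load_image

pipe = StableVideoDiffusionPipeline.from_pretrained(
    "stabilityai/stable-video-diffusion-img2vid-xt",
    torch_dtype=torch.float16,
    variant="fp16",
).to("cuda")

image = load_image("https://example.com/conditioning_frame.png")  # placeholder URL
image = image.resize((1024, 576))

generator = torch.manual_seed(42)
frames = pipe(image, decode_chunk_size=8, generator=generator).frames[0]
export_to_video(frames, "generated.mp4", fps=7)
```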
PENGUINLIONG
d1b2a1a957 Fixed custom module importing on Windows (#5891)
* Fixed custom module importing on Windows

Windows uses backslashes and `os.path.join()` follows that convention.

* Apply suggestions from code review

Co-authored-by: Lucain <lucainp@gmail.com>

* Update pipeline_utils.py

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Lucain <lucainp@gmail.com>
2023-11-29 16:33:04 +01:00
Kashif Rasul
01782c220e [Wuerstchen] Adapt lora training example scripts to use PEFT (#5959)
* Adapt lora example scripts to use PEFT

* add to_out.0
2023-11-29 16:18:20 +01:00
vahramtadevosyan
d63a498c3b [Pipeline] Add TextToVideoZeroSDXLPipeline (#4695)
* integrated sdxl for the text2video-zero pipeline

* make fix-copies

* fixed CI issues

* make fix-copies

* added docs and `copied from` statements

* added fast tests

* made a small change in docs

* quality+style check fix

* updated docs. added controlnet inference with sdxl

* added device compatibility for fast tests

* fixed docstrings

* changing vae upcasting

* remove torch.empty_cache to speed up inference

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* made fast tests to run on dummy models only, fixed copied from statements

* fixed testing utils imports

* Added bullet points for SDXL support

* fixed formatting & quality

* Update tests/pipelines/text_to_video/test_text_to_video_zero_sdxl.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/pipelines/text_to_video/test_text_to_video_zero_sdxl.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fixed minor error for merging

* fixed updates of sdxl

* made fast tests inherit from `PipelineTesterMixin` and run in 3-4secs on CPU

* make style && make quality

* reimplemented fast tests w/o default attn processor

* make style & make quality

* make fix-copies

* make fix-copies

* fixed docs

* make style & make quality & make fix-copies

* bug fix in cross attention

* make style && make quality

* make fix-copies

* fix gpu issues

* make fix-copies

* updated pipeline signature

---------

Co-authored-by: Vahram <vahram.tadevosyan@lambda-loginnode02.cm.cluster>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-11-29 16:10:43 +01:00
Marko Kostiv
6a4aad43dc Controlnet ssd 1b support (#5779)
* Add SSD-1B support for controlnet model

* Add conditioning_channels into ControlNet init from unet

* Fix black formatting

* Isort fixes

* Adds SSD-1B controlnet pipeline test with UNetMidBlock2D as mid block

* Overrides failing ssd-1b tests

* Fixes tests after main branch update

* Fixes code quality checks

---------

Co-authored-by: Marko Kostiv <marko@linearity.io>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-11-29 16:10:01 +01:00
Steven Liu
ddd8bd53ed [docs] LCM training (#5796)
* first draft

* feedback
2023-11-29 16:08:05 +01:00
JuanCarlosPi
9f7b2cf2dc Support of ip-adapter to the StableDiffusionControlNetInpaintPipeline (#5887)
* Change pipeline_controlnet_inpaint.py to add ip-adapter support. Changes are similar to those in pipeline_controlnet

* Change tests for the StableDiffusionControlNetInpaintPipeline by adding image_encoder: None

* Update src/diffusers/pipelines/controlnet/pipeline_controlnet_inpaint.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2023-11-29 16:00:24 +01:00
Sayak Paul
895c4b704b [LoRA refactor] move several state dict conversion utils out of lora.py (#5955)
* move several state dict conversion utils out of lora.py

* check

* check

* check

* check

* check

* check

* check

* revert back

* check

* check

* again check

* maybe fix?

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-29 20:24:21 +05:30
Linh Nguyen
636feba552 Rename output_dir argument (#5916)
Fix typo in output_dir argument: "text-inversion-model" → "dreambooth-model"
2023-11-29 15:47:16 +01:00
Andrés Romero
79dc7df03e [bug fix] Inpainting for MultiAdapter (#5922)
* bug in MultiAdapter for Inpainting

* adapter_input is a list for MultiAdapter

---------

Co-authored-by: andres <andres@hax.ai>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-11-29 15:46:26 +01:00
Charchit Sharma
6031ecbd23 added doc for Kandinsky3.0 (#5937)
* added en doc for Kandinsky3.0

* required changes

* Update docs/source/en/api/pipelines/kandinsky3.md

* Update docs/source/en/api/pipelines/kandinsky3.md

* Update docs/source/en/api/pipelines/kandinsky3.md

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-29 15:32:00 +01:00
Sayak Paul
fdd003d8e2 [Tests] Refactor test_examples.py for better readability (#5946)
* control and custom diffusion

* dreambooth

* instructpix2pix and dreambooth ckpting

* t2i adapters.

* text to image ft

* textual inversion

* unconditional

* workflows

* import fix

* fix import
2023-11-29 18:43:59 +05:30
Steven Liu
172acc98b9 [docs] Update pipeline list (#5952)
add to list
2023-11-29 14:08:39 +01:00
estelleafl
5ae3c3a56b [ldm3d] Ldm3d upscaler to community pipeline (#5870)
---------
Co-authored-by: Aflalo <estellea@isl-gpu27.rr.intel.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2023-11-28 09:00:39 -10:00
Soumik Rakshit
21bc59ab24 fix: minor typo in docstring (#5961) 2023-11-28 18:18:34 +05:30
Steven Liu
50a749e909 [docs] Fix space (#5898)
* fix

* minor edits
2023-11-27 11:50:59 -08:00
YiYi Xu
d9075be494 [load_textual_inversion]: allow multiple tokens (#5837)
Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-11-27 06:52:36 -10:00
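A sketch of the multi-token form this change allows, assuming both arguments accept parallel lists; the concept repos and token strings are illustrative:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Load several textual-inversion embeddings in one call (illustrative repos/tokens).
pipe.load_textual_inversion(
    ["sd-concepts-library/cat-toy", "sd-concepts-library/birb-style"],
    token=["<cat-toy>", "<birb-style>"],
)
image = pipe("a <cat-toy> painted in <birb-style>").images[0]
```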
Patrick von Platen
b135b6e905 [From_pretrained] Fix warning (#5948) 2023-11-27 14:35:19 +01:00
T. Xu
14a0d21d2e [Community Pipeline] Diffusion Posterior Sampling for General Noisy Inverse Problems (#5939)
* [community pipeline] dps impl

* add type checking

* pass ruff check

* ruff formatter
2023-11-27 14:29:42 +01:00
Patrick von Platen
ebf581e85f [Tests] Make sure that we don't run tests multiple times (#5949)
* [Tests] Make sure that we don't run tests multiple times

* [Tests] Make sure that we don't run tests multiple times

* [Tests] Make sure that we don't run tests multiple times
2023-11-27 14:18:56 +01:00
Patrick von Platen
e550163b9f [Vae] Make sure all vae's work with latent diffusion models (#5880)
* add comments to explain the code better

* add comments to explain the code better

* add comments to explain the code better

* add comments to explain the code better

* add comments to explain the code better

* fix more

* fix more

* fix more

* fix more

* fix more

* fix more
2023-11-27 14:17:47 +01:00
Viktor Grygorchuk
20f0cbc88f fix: error on device for lpw_stable_diffusion_xl pipeline if pipe.enable_sequential_cpu_offload() enabled (#5885)
fix: set device for pipe.enable_sequential_cpu_offload()
2023-11-27 13:47:47 +01:00
Chi
d72a24b790 Replace multiple variables with one variable. (#5715)
* I added a new docstring to the class. This makes it easier for other developers to understand what it does and where it is used.

* Update src/diffusers/models/unet_2d_blocks.py

This change was suggested by the maintainer.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/unet_2d_blocks.py

Add suggested text

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_blocks.py

I changed the Parameter to Args text.

* Update unet_2d_blocks.py

Set proper indentation in this file.

* Update unet_2d_blocks.py

Made a small change to the act_fun argument line.

* I ran the black command to reformat the code style

* Update unet_2d_blocks.py

Added a similar docstring to match the one in the original diffusion repository.

* I enhanced the code by replacing multiple redundant variables with a single variable, as they all served the same purpose. Additionally, I utilized the get_activation function for improved flexibility in choosing activation functions.

* Used the black package to reformat my file

* Reverted some changes

* Removed the conv_out_padding variable and reused conv_in_padding

* Created conv_out_padding and added it back into the code.

* Ran the black command to solve styling problems

* Added a little space between the comment and the import statement

* I am utilizing the ruff library to address the style issues in my Makefile.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-27 13:34:52 +01:00
ginjia
d3cda804e7 add LoRA weights load and fuse support for IPEX pipeline (#5920)
add IPEX pipeline LoRA weights loading support
2023-11-27 13:32:43 +01:00
dg845
07eac4d65a Fix LCM Stable Diffusion distillation bug related to parsing unet_time_cond_proj_dim (#5893)
* Fix bug related to parsing unet_time_cond_proj_dim.

* Fix analogous bug in the SD-XL LCM distillation script.
2023-11-27 13:00:40 +01:00
Iván de Prado
c079cae3d4 Avoid computing min() that is expensive when do_normalize is False in the image processor (#5896)
Avoid computing min() that is expensive when do_normalize is False

Avoid extra computing when do_normalize is False
2023-11-27 12:46:26 +01:00
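The optimization described above can be sketched as a small guard — not the actual image-processor code, just the principle of skipping the expensive reduction when `do_normalize` is off:

```python
import numpy as np

def needs_denormalize(image: np.ndarray, do_normalize: bool) -> bool:
    # Short-circuit: only pay for the min() reduction when normalization is enabled at all.
    if not do_normalize:
        return False
    # Images that already live in [0, 1] do not need to be mapped back from [-1, 1].
    return bool(image.min() < 0)

print(needs_denormalize(np.random.uniform(-1, 1, (64, 64, 3)), do_normalize=False))  # False, min() never computed
```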
Wang, Yi
c7bfb8b22a set the model to train state before accelerator prepare (#5099)
Signed-off-by: Wang, Yi A <yi.a.wang@intel.com>
2023-11-27 12:43:49 +01:00
dg845
67d070749a Add Custom Timesteps Support to LCMScheduler and Supported Pipelines (#5874)
* Add custom timesteps support to LCMScheduler.

* Add custom timesteps support to StableDiffusionPipeline.

* Add custom timesteps support to StableDiffusionXLPipeline.

* Add custom timesteps support to remaining Stable Diffusion pipelines which support LCMScheduler (img2img, inpaint).

* Add custom timesteps support to remaining Stable Diffusion XL pipelines which support LCMScheduler (img2img, inpaint).

* Add custom timesteps support to StableDiffusionControlNetPipeline.

* Add custom timesteps support to T2I Stable Diffusion (XL) Adapters.

* Clean up Stable Diffusion inpaint tests.

* Manually add support for custom timesteps to AltDiffusion pipelines since make fix-copies doesn't appear to work correctly (it deletes the whole pipeline).

* make style

* Refactor pipeline timestep handling into the retrieve_timesteps function.
2023-11-27 12:39:14 +01:00
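Roughly, the custom-timesteps API added here lets callers pass an explicit schedule instead of `num_inference_steps`; a minimal sketch (the checkpoint id and timestep values are illustrative assumptions):

```python
import torch
from diffusers import StableDiffusionPipeline, LCMScheduler

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16  # placeholder checkpoint
).to("cuda")
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)

# Hand the pipeline an explicit schedule instead of num_inference_steps.
# Internally the retrieve_timesteps helper forwards this list to
# scheduler.set_timesteps(timesteps=...).
image = pipe(
    "a photo of an astronaut riding a horse",
    timesteps=[999, 759, 499, 259],  # illustrative values
    guidance_scale=1.0,
).images[0]
```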
Aryan V S
9c357bda3f Deprecate KarrasVeScheduler and ScoreSdeVpScheduler (#5269)
* deprecated: KarrasVeScheduler, ScoreSdeVpScheduler

* delete tests relevant to deprecated schedulers

* chore: run make style

* fix: import error caused due to incorrect _import_structure after deprecation

* fix: ScoreSdeVpScheduler was not importable from diffusers

* remove import added by assumption

* Update src/diffusers/schedulers/__init__.py as suggested by @patrickvonplaten

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make it a part deprecated

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix

* fix

* fix doc

* fix doc....again.......

* remove karras_ve test folder

Co-Authored-By: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-11-27 12:33:02 +01:00
Sayak Paul
3f7c3511dc [Core] add support for gradient checkpointing in transformer_2d (#5943)
add support for gradient checkpointing in transformer_2d
2023-11-27 16:21:12 +05:30
Junsong Chen
7d6f30e89b [Fix: pixart-alpha] random 512px resolution bug (#5842)
* [Fix: pixart-alpha]
add ASPECT_RATIO_512_BIN in use_resolution_binning for random 512px image generation.

* add slow test file for 512px generation without resolution binning

* fix: slow tests for resolution binning.

---------

Co-authored-by: jschen <chenjunsong4@h-partners.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-11-27 13:35:35 +05:30
Patrick von Platen
6d2e19f746 [Examples] Allow downloading variant model files (#5531)
* add variant

* add variant

* Apply suggestions from code review

* reformat

* fix: textual_inversion.py

* fix: variant in model_info

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2023-11-27 10:43:20 +05:30
Patrick von Platen
2a7f43a73b correct num inference steps 2023-11-24 17:09:26 +00:00
Patrick von Platen
b978334d71 [@cene555][Kandinsky 3.0] Add Kandinsky 3.0 (#5913)
* finalize

* finalize

* finalize

* add slow test

* add slow test

* add slow test

* Fix more

* add slow test

* fix more

* fix more

* fix more

* fix more

* fix more

* fix more

* fix more

* fix more

* fix more

* Better

* Fix more

* Fix more

* add slow test

* Add auto pipelines

* add slow test

* Add all

* add slow test

* add slow test

* add slow test

* add slow test

* add slow test

* Apply suggestions from code review

* add slow test

* add slow test
2023-11-24 17:46:00 +01:00
Sayak Paul
e5f232f76b [Docs] add: 8bit inference with pixart alpha (#5814)
* add: 8bit inference with pixart alpha

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add: note on 4bit.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* address comment

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-24 20:36:33 +05:30
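A condensed sketch of the 8-bit workflow those docs describe — loading only the T5 text encoder with bitsandbytes and computing prompt embeddings without the transformer (the prompt and exact arguments are illustrative; the published doc is the authoritative recipe):

```python
import torch
from transformers import T5EncoderModel
from diffusers import PixArtAlphaPipeline

# Load only the T5 text encoder in 8-bit to cut memory while encoding prompts.
text_encoder = T5EncoderModel.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",
    subfolder="text_encoder",
    load_in_8bit=True,
    device_map="auto",
)
pipe = PixArtAlphaPipeline.from_pretrained(
    "PixArt-alpha/PixArt-XL-2-1024-MS",
    text_encoder=text_encoder,
    transformer=None,  # skip the transformer while only computing prompt embeddings
)

with torch.no_grad():
    prompt_embeddings = pipe.encode_prompt("a small red panda eating bamboo")
```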
Linoy Tsaban
3003ff4947 [bug fix] fix small bug in readme template of sdxl lora training script (#5914)
readme improvement and metadata fix
2023-11-23 19:08:49 +01:00
Linoy Tsaban
5ffa603244 [bug fix] fix small bug in readme template of sdxl lora training script (#5906)
* readme bug fix

* style fix

---------

Co-authored-by: Linoy Tsaban <linoy@huggingface.co>
2023-11-23 12:11:50 +01:00
Linoy Tsaban
0eeee618cf Adds an advanced version of the SD-XL DreamBooth LoRA training script supporting pivotal tuning (#5883)
* sdxl dreambooth lora training script with pivotal tuning

* bug fix - args missing from parse_args

* code quality fixes

* comment unnecessary code from TokenEmbedding handler class

* fixup

---------

Co-authored-by: Linoy Tsaban <linoy@huggingface.co>
2023-11-22 16:27:56 +01:00
Andrés Romero
93f1a14cab ControlNet+Adapter pipeline, and ControlNet+Adapter+Inpaint pipeline (#5869)
* ControlNet+Adapter pipeline, and +Inpaint pipeline


---------

Co-authored-by: andres <andres@hax.ai>
2023-11-21 08:59:29 -10:00
Patrick von Platen
13d73d9303 [Lora] Separate logic (#5809)
* [Lora] Separate logic

* [Lora] Separate logic

* [Lora] Separate logic

* add comments to explain the code better

* add comments to explain the code better
2023-11-21 18:58:37 +01:00
YiYi Xu
ba352aea29 [feat] IP Adapters (author @okotaku ) (#5713)
* add ip-adapter


---------

Co-authored-by: okotaku <to78314910@gmail.com>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-21 07:34:30 -10:00
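A rough usage sketch for the IP-Adapter support introduced here (the `h94/IP-Adapter` repository and weight name are the commonly published ones and the image URL is a placeholder — treat them as assumptions rather than part of this commit):

```python
import torch
from diffusers import StableDiffusionPipeline
from diffusers.utils import load_image

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Attach the image-prompt conditioning weights to the UNet's attention layers.
pipe.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")

ip_image = load_image("https://example.com/reference.png")  # placeholder URL
image = pipe(
    "best quality, high quality",
    ip_adapter_image=ip_image,
    num_inference_steps=50,
).images[0]
```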
Linoy Tsaban
6fac1369d0 Add features to the Dreambooth LoRA SDXL training script (#5508)
* Additions:
- support for different lr for text encoder
- support for Prodigy optimizer
- support for min snr gamma
- support for custom captions and dataset loading from the hub

* adjusted --caption_column behaviour (to -not- use the second column of the dataset by default if --caption_column is not provided)

* fixed --output_dir / --model_dir_name confusion

* added --repeats, --adam_weight_decay_text_encoder
+ some fixes

* Update examples/dreambooth/train_dreambooth_lora_sdxl.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update examples/dreambooth/train_dreambooth_lora_sdxl.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update examples/dreambooth/train_dreambooth_lora_sdxl.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* - import compute_snr from diffusers/training_utils.py
- cluster adamw together
- when using 'prodigy', if --train_text_encoder == True and --text_encoder_lr != --learning rate, changes the lr of the text encoders optimization params to be --learning_rate (otherwise errors)

* shape fixes when custom captions are used

* formatting and a little cleanup

* code styling

* --repeats default value fixed, changed to 1

* bug fix - removed redundant lines of embedding concatenation when using prior_preservation (that duplicated class_prompt embeddings)

* changed dataset loading logic according to the following usecases (to avoid unnecessary dependency on datasets)-
1. user provides --dataset_name
2. user provides local dir --instance_data_dir that contains a metadata .jsonl file
3. user provides local dir --instance_data_dir that contains only images
in cases [1,2] we import datasets and use load_dataset method, in case [3] we process the data same as in the original script setting

* styling fix

* arg name fix

* adjusted the --repeats logic

* -removed redundant arg and 'if' when loading local folder with prompts
-updated readme template
-some default val fixes
-custom caption tests

* image path fix for readme

* code style

* bug fix

* --caption_column arg

* readme fix

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Linoy Tsaban <linoy@huggingface.co>
2023-11-21 17:38:43 +01:00
Steven Liu
1093f9d615 [docs] MusicLDM (#5854)
* fix

* feedback
2023-11-21 15:27:41 +01:00
Aryan V S
81780882b8 Addition of new callbacks to controlnets (#5812)
* add new callbacks to src/diffusers/pipelines/controlnet/pipeline_controlnet.py

* update callbacks

* fix repeated kwarg

* update

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-21 15:22:20 +01:00
Dhruv Nair
ebc7bedeb7 Add tests fetcher (#5848)
* add tests fetcher to utils

* add test fetcher

* update

* update

* remove unused dependency version check script

* update

* fix mistake

* update

* update

* update

* update

* update

* update

* update

* remove concurrency params

* update

* update

* update

* update

* update

* update

* move test fetcher to dedicated workflow
2023-11-21 18:01:44 +05:30
co63oc
ee519cfef5 Update README.md (#5855) 2023-11-21 11:56:13 +01:00
Steven Liu
7457aa67cb [docs] Loader APIs (#5813)
* first draft

* remove old loader doc

* start adding lora code examples

* finish

* add link to loralinearlayer

* feedback

* fix
2023-11-20 10:53:13 -08:00
M. Tolga Cangöz
c72a173906 Revert "[Docs] Update and make improvements" (#5858)
* Revert "[`Docs`] Update and make improvements (#5819)"

This reverts commit c697f52476.

* Update README.md

* Update memory.md

* Update basic_training.md

* Update write_own_pipeline.md

* Update fp16.md

* Update basic_training.md

* Update write_own_pipeline.md

* Update write_own_pipeline.md
2023-11-20 10:22:21 -08:00
dg845
dc21498b43 Update LCMScheduler Inference Timesteps to be More Evenly Spaced (#5836)
* Change LCMScheduler.set_timesteps to pick more evenly spaced inference timesteps.

* Change inference_indices implementation to better match previous behavior.

* Add num_inference_steps=26 test case to test_inference_steps.

* run CI

---------

Co-authored-by: patil-suraj <surajp815@gmail.com>
2023-11-20 15:46:10 +01:00
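The idea of the change — spreading the inference timesteps evenly over the LCM original-step schedule instead of taking a strided slice — can be illustrated with a standalone sketch (not the scheduler's actual code):

```python
import numpy as np

def evenly_spaced_lcm_timesteps(num_train_timesteps=1000, original_steps=50, num_inference_steps=4):
    # The LCM "original" schedule: every k-th training timestep, newest first.
    k = num_train_timesteps // original_steps
    lcm_origin_timesteps = np.arange(1, original_steps + 1) * k - 1
    # Pick indices spread evenly across that schedule rather than a strided slice.
    indices = np.floor(
        np.linspace(0, len(lcm_origin_timesteps), num=num_inference_steps, endpoint=False)
    ).astype(np.int64)
    return lcm_origin_timesteps[::-1][indices]

print(evenly_spaced_lcm_timesteps())  # [999 759 499 259]
```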
Patrick von Platen
3303aec5f8 make style 2023-11-20 12:54:52 +01:00
ginjia
4abbbff618 fix an issue that ipex occupy too much memory, it will not impact per… (#5625)
* fix an issue where ipex occupies too much memory; it will not impact performance

* make style

---------

Co-authored-by: root <jun.chen@intel.com>
Co-authored-by: Meng Guoqing <guoqing.meng@intel.com>
2023-11-20 12:43:29 +01:00
Younes Belkada
fda297703f [test / peft] Fix silent behaviour on PR tests (#5852)
Update pr_test_peft_backend.yml
2023-11-20 12:42:58 +01:00
Roy Hvaara
2695ba8e9d [JAX] Replace uses of jax.devices("cpu") with jax.local_devices(backend="cpu") (#5864)
An upcoming change to JAX will include non-local (addressable) CPU devices in jax.devices() when JAX is used multicontroller-style, where there are multiple Python processes.

This change preserves the current behavior by replacing uses of jax.devices("cpu"), which previously only returned local devices, with jax.local_devices("cpu"), which will return local devices both now and in the future.

This change is always safe (i.e., it should always preserve the previous behavior), but it may sometimes be unnecessary if code is never used in a multicontroller setting.

Co-authored-by: Peter Hawkins <phawkins@google.com>
2023-11-20 12:32:50 +01:00
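In code, the substitution described above amounts to the following before/after sketch:

```python
import jax

# Before: may include non-local (addressable) CPU devices when JAX runs multicontroller-style.
cpu_devices = jax.devices("cpu")

# After: restricted to devices attached to this process, preserving the old behavior.
cpu_devices = jax.local_devices(backend="cpu")

print(cpu_devices)
```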
Aryan V S
3ab921166d [Community] [WIP] LCM Interpolation Pipeline (#5767)
* wip: add interpolate pipeline for lcm

* update documentation

* update documentation
2023-11-20 12:18:17 +01:00
Kashif Rasul
6b04d61cf6 [Styling] stylify using ruff (#5841)
* ruff format

* no need to use doc-builder's black styling as the doc is styled with ruff

* make fix-copies

* comment

* use run_ruff
2023-11-20 11:48:34 +01:00
Sayak Paul
9c7f7fc475 [ControlNet] fix import in single file loading (#5834)
fix import

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-11-20 15:50:36 +05:30
Aryan V S
4adad57e57 UnboundLocalError in SDXLInpaint.prepare_latents() (#5648)
* fix: UnboundLocalError with image_latents

* chore: run make style, quality, fix-copies

* revert changes from make fix-copies

* revert changes from make fix-copies

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2023-11-19 22:39:51 -10:00
Younes Belkada
4e54dfe985 [Tests/LoRA/PEFT] Test also on PEFT / transformers / accelerate latest (#5820)
* add also peft latest on peft CI

* up

* up

* up

* Update .github/workflows/pr_test_peft_backend.yml

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-11-17 19:17:31 +01:00
Sourab Mangrulkar
6f1435332b Speed up the peft lora unload (#5741)
* Update peft_utils.py

* fix bug

* make the util backwards compatible.

Co-Authored-By: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* fix import issue

* refactor the backward compatibility condition

* rename the conditional variable

* address comments

Co-Authored-By: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* address comment

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2023-11-17 19:15:44 +01:00
Patrick von Platen
c6f90daea6 [PEFT] Unpin peft (#5850) 2023-11-17 19:15:02 +01:00
Will Berman
2a84e8bb5a fix memory consistency decoder test (#5828)
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-11-17 09:31:01 -08:00
Lucain
c896b841e4 Set usedforsecurity=False in hashlib methods (FIPS compliance) (#5790)
* Set usedforsecurity=False in hashlib methods (FIPS compliance)

* update version dependency

* bump hfh version

* bump hfh version
2023-11-17 14:56:58 +01:00
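For reference, the pattern applied by this change looks roughly like the snippet below (Python 3.9+ exposes the `usedforsecurity` flag on hashlib constructors; the helper name is illustrative):

```python
import hashlib

def content_fingerprint(data: bytes) -> str:
    # The digest is used only as a cache key, so mark it as non-security-relevant;
    # on FIPS-enabled Python builds this keeps the call usable.
    return hashlib.sha256(data, usedforsecurity=False).hexdigest()

print(content_fingerprint(b"example payload"))
```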
Sayak Paul
69412d0a15 [Docs] add: japanese sdxl as a reference (#5844)
add: japanese sdxl as a reference
2023-11-17 18:14:02 +05:30
Patrick von Platen
913986afa5 Improve setup.py and add dependency check (#5826)
* put peft in requirements

* correct peft

* correct installs

* make style

* make style
2023-11-17 12:05:26 +01:00
Steven Liu
ff573ae245 [docs] Fix title (#5831)
fix section title
2023-11-17 08:53:11 +05:30
M. Tolga Cangöz
c697f52476 [Docs] Update and make improvements (#5819)
Update and make improvements
2023-11-16 13:47:25 -08:00
Suraj Patil
a042909c83 LCM-LoRA docs (#5782)
* begin doc

* fix examples

* add in toctree

* fix toctree

* improve copy

* improve introductions

* add lcm doc

* fix filename

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* address Sayak's comments

* remove controlnet aux

* open in colab

* move to Specific pipeline examples

* update controlent and adapter examples

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-16 16:19:27 +01:00
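The workflow those docs cover boils down to swapping in LCMScheduler and loading an LCM-LoRA; a compressed sketch (the checkpoint ids are the commonly published ones and the prompt is illustrative):

```python
import torch
from diffusers import DiffusionPipeline, LCMScheduler

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# LCM-LoRA distills the base model so a handful of steps is enough.
pipe.scheduler = LCMScheduler.from_config(pipe.scheduler.config)
pipe.load_lora_weights("latent-consistency/lcm-lora-sdxl")

image = pipe(
    "close-up photography of an old man standing in the rain",
    num_inference_steps=4,
    guidance_scale=1.0,
).images[0]
```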
Suraj Patil
64cbd8e27a Support LCM in ControlNet and Adapter pipelines. (#5822)
* support lcm

* fix tests

* fix tests
2023-11-16 14:59:50 +01:00
Aryan V S
038b42db94 Improve docs and type hints (#5759)
* improvement: docs and type hints

* improvement: docs and type hints

minor refactor

* improvement: docs and type hints

* update with suggestions from review

Co-Authored-By: Dhruv Nair <dhruv.nair@gmail.com>

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-11-16 14:00:32 +05:30
M. Tolga Cangöz
ecbe27a07f [Docs] Fix typos and update files at API's Pipelines page 2 (#5748)
* Fix typos, update, add Copyright info, and trim trailing whitespace

* Update docs/source/en/api/pipelines/text_to_video_zero.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* 1 second is not a long video, but 6 seconds is

* Update text_to_video_zero.md

* Update text_to_video_zero.md

* Update text_to_video_zero.md

* Update wuerstchen.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-15 10:54:55 -08:00
M. Tolga Cangöz
3ad4207d1f [Docs] Fix typos, update, and add visualizations at Using Diffusers' Pipelines for Inference Page (#5649)
* Fix typos, update, add visualizations

* Update sdxl.md

* Update controlnet.md

* Update docs/source/en/using-diffusers/shap-e.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/shap-e.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update diffedit.md

* Update kandinsky.md

* Update sdxl.md

* Update controlnet.md

* Update docs/source/en/using-diffusers/controlnet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/controlnet.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update controlnet.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-15 10:30:01 -08:00
MilkClouds
3517fb9430 fix: enabled num_images_per_prompt>1 for lpw_stable_diffusion_xl (community pipeline) (#5807)
* fix: enabled num_images_per_prompt>1 for lpw_stable_diffusion_xl

* style: fixed isort
2023-11-15 07:40:29 -10:00
Dhruv Nair
cdadb023a2 Make Video Tests faster (#5787)
* update test

* update
2023-11-15 10:56:01 +05:30
M. Tolga Cangöz
51fd3dd206 [Docs] Remove .to('cuda') before .enable_model_cpu_offload() (#5795)
Remove .to('cuda') before cpu_offload, trim trailing whitespaces
2023-11-14 17:20:54 -08:00
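The doc fix is about ordering: enable_model_cpu_offload() manages device placement itself, so the pipeline should not be moved to CUDA first. A minimal sketch:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)

# Do not call pipe.to("cuda") first — offloading manages device placement itself.
pipe.enable_model_cpu_offload()  # each component is moved to the GPU only while it runs

image = pipe("a watercolor painting of a lighthouse").images[0]
```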
Pedro Gabriel Gengo Lourenço
98457580c0 Fixed documentation of consistency decoder (#5768)
* Fixed doc for consistency decoder

* Style fix
2023-11-14 15:27:26 -08:00
M. Tolga Cangöz
3b37488fa3 Fix typos, improve, update at main-page files and .github files (#5588)
* Update keywords; remove version cuz it changes constantly?

* Update if necessary

* Fix typos and links

* version_range_max from PyTorch's setup.py; fix typos; 1 checklist is enough?

* Fix a typo

* Fix typos

* There is already a Blank issue link at the bottom of the page; direct to Diffusers' forum

* Add 🌐 Translating a New Language? page

* Update .github/ISSUE_TEMPLATE/translate.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update PHILOSOPHY.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update PHILOSOPHY.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update PHILOSOPHY.md

* Update PHILOSOPHY.md

* Update CONTRIBUTING.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update CONTRIBUTING.md

* Update CONTRIBUTING.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update PHILOSOPHY.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update PHILOSOPHY.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Add X account link

* Update setup.py

* Update CITATION.cff

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-14 11:46:52 -08:00
Kadir Nar
c7260ce253 🔧 Fix import codes in diffusers library (#5792)
* 🔧 Fix import codes in diffusers library

* Refactor imports in community examples
2023-11-14 11:37:59 -08:00
M. Tolga Cangöz
8092017d3f [Docs] Fix typos and update files at API's Pipelines page 1 (#5744)
* Fix typos, update, add Copyright info, and trim trailing whitespace

* Update alt_diffusion.md

* Remove nonoperational demo

* Update docs/source/en/api/pipelines/consistency_models.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/latent_consistency_models.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-14 10:36:20 -08:00
Steven Liu
bae14c8bcb [docs] Update training docs (#5512)
* first draft

* try hfoption syntax

* fix hfoption id

* add text2image

* fix tag

* feedback

* feedbacks

* add textual inversion

* DreamBooth

* lora

* controlnet

* instructpix2pix

* custom diffusion

* t2i

* separate training methods and models

* sdxl

* kandinsky

* wuerstchen

* light edits
2023-11-14 10:29:56 -08:00
Sayak Paul
ded93f798c [Refactor] refactor loaders.py to make it cleaner and leaner. (#5771)
* refactor loaders.py to make it cleaner and leaner.

* refactor loaders init

* inits.

* textual inversion to the init.

* inits.

* remove certain modules from the main init.

* AttnProcsLayers

* fix imports

* avoid circular import.

* fix circular import pt 2.

* address PR comments

* imports

* fix: imports.

* remove from main init for avoiding circular deps.

* remove spurious deps.

* fix-copies.

* fix imports.

* more debug

* more debug

* Apply suggestions from code review

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-14 12:54:28 +01:00
Sayak Paul
a5720e9e31 [PixArt-Alpha] Fix PixArt-Alpha pipeline when number of images to generate is more than 1 (#5752)
* does this fix things?

* attention mask use

* attention mask order

* better masking.

* add: test

* remove mask_feature

* test

* debug

* fix: tests

* deprecate mask_feature

* add deprecation test

* add slow test

* add print statements to retrieve the assertion values.

* fix for the 1024 fast test

* fix test

* fix the remaining

* Apply suggestions from code review

* more debug

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-14 12:45:35 +01:00
Richard Löwenström
16d500455b Use greater than or equal to in version comparisons for peft (#5785) 2023-11-14 12:17:51 +01:00
Yusuke Suzuki
210a07b13c fix exception around NSFW filter on flax stable diffusion (#5675) 2023-11-14 12:16:52 +01:00
Patrick von Platen
0a0ebc7cb4 [LCM] Better error message (#5788) 2023-11-14 12:08:15 +01:00
Patrick von Platen
81df9c85de Unwrap models everywhere (#5789)
more debug
2023-11-14 12:08:03 +01:00
takuoko
bfe94a3993 [Enhance] Support maybe_raise_or_warn for peft (#5653)
* Support maybe_raise_or_warn for peft

* fix by comment

* unwrap function
2023-11-14 11:40:35 +01:00
Lukas Kuhn
c9c5436c94 download_from_original_stable_diffusion_ckpt initializes correct default pipeline for SDXL (#5784)
* feat: sdxl will be automatically detected as pipeline_class

* fix: formatting

* fix: formatting with black

* fix: import pipeline wrongly sorted
2023-11-14 11:35:26 +01:00
Sourab Mangrulkar
9c8eca702c add lora delete feature (#5738)
* add lora delete feature

* added tests and changed condition

* deal with corner cases

* more corner cases

* rename to `delete_adapter_layers` for consistency

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
2023-11-14 10:51:13 +01:00
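A rough sketch of the deletion API added here, assuming the pipeline-level method is named delete_adapters as in current diffusers (the LoRA repositories and adapter names are placeholders):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Hypothetical LoRA repositories, used purely for illustration.
pipe.load_lora_weights("some-user/watercolor-lora", adapter_name="watercolor")
pipe.load_lora_weights("some-user/pixel-lora", adapter_name="pixel")

# Drop an adapter (and its injected layers) that is no longer needed.
pipe.delete_adapters("watercolor")
# A list of names is accepted as well:
pipe.delete_adapters(["pixel"])
```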
co63oc
069123f66e Update checkpoint_merger.py (#5780) 2023-11-14 10:46:11 +01:00
Sayak Paul
ed759f0aee [PixArt-Alpha] Introduce resolution binning (#5739)
* feat: add resolution binning

Co-authored-by: lawrence-cj <jschen@mail.dlut.edu.cn>

* rename

* debug

* add: test

* remove unused variable

* set resolution_binning to False.

---------

Co-authored-by: lawrence-cj <jschen@mail.dlut.edu.cn>
2023-11-14 08:34:59 +05:30
Long(Tony) Lian
5b231aa38b Fix the pipeline name in the examples for LMD+ pipeline. Add a colab link to pipeline README. (#5775)
* Fix the pipeline name in the examples for LMD+ pipeline

* Add LMD+ colab link

* Apply code formatting

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-11-14 07:43:37 +05:30
M. Tolga Cangöz
a359ff7644 [Docs] Fix typos and update files at API's Main Classes, Models, and Schedulers pages (#5720)
* Fix typos, update, add Copyright info, and trim trailing whitespaces

* Update docs/source/en/api/loaders.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/models/autoencoder_tiny.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/models/autoencoder_tiny.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-13 14:32:59 -08:00
Steven Liu
4b45a1e147 [docs] Use other checkpoints with inpaint (#5590)
* tip about inpaint checkpoints

* expand section

* feedback
2023-11-13 12:39:30 -08:00
Steven Liu
f782ca112a [docs] Callbacks (#5735)
* updates

* feedback
2023-11-13 12:11:07 -08:00
Steven Liu
80e78d2cac [docs] Custom community components (#5732)
* fixes

* feedback
2023-11-13 11:01:52 -08:00
JacobYuan7
4d3b4e00ed Update the reference for text_to_video.md (#5706)
* Update the reference for text_to_video.md

The original reference (VideoFusion) might be misleading. VideoFusion is not open-sourced. I am the co-first author of ModelScopeT2V. I change the referred paper to the right one.

* Update docs/source/en/api/pipelines/text_to_video.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-13 19:06:18 +01:00
Thuan H. Nguyen
8fcd52febb Correct code for distributed training of RealFill (#5740)
Correct code for distributed training
2023-11-13 19:01:15 +01:00
Nicolas Hug
0488810f61 Fix realfill example compatibility with latest torchvision version (#5736) 2023-11-13 18:55:17 +01:00
Patrick von Platen
ef7787ea59 make style 2023-11-13 18:54:53 +01:00
Jianqi Pan
1ce4b5f3e3 fix: fix forward function signature of controlnet reference_only pipeline example (#5717)
fix: ignore other args

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-13 18:54:21 +01:00
Kashif Rasul
c9f847a70f [Wuerstchen] fix for when USE_PEFT_BACKEND is True (#5704)
* fix for when USE_PEFT_BACKEND is True

* Update modeling_wuerstchen_prior.py

* revert change

* add lora tests
2023-11-13 18:37:55 +01:00
Younes Belkada
8789d0b6c7 fix styling issues on main (#5754)
fix styling issues
2023-11-13 20:05:15 +05:30
Long(Tony) Lian
b1fbef544c Add LLM-grounded Diffusion (LMD+) pipeline (#5634)
* Add LLM-grounded Diffusion (LMD+) pipeline

* Update the formatting

* Applied formatting
2023-11-11 12:51:05 +05:30
YiYi Xu
a3476d5bd6 Controlnet Img2img: pass height and width to image_processor.preprocess (#5665)
pass height and width to image_processor.preprocess

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-11-10 06:53:20 -10:00
Sayak Paul
1477865e48 post release v0.23.0 (#5730)
* post release

* fix: variant test

* up

* fix: test
2023-11-10 16:35:44 +05:30
aihao
1f87f83e68 add load_datasete data_dir parameter (#5747) 2023-11-09 23:03:14 -10:00
Sayak Paul
77ba494b29 [ConsistencyDecoder] fix: doc type (#5745)
fix: doc type
2023-11-10 14:05:22 +05:30
Garry Dolley
1328aeb274 [Docs] Clarify that these are two separate examples (#5734)
* [Docs] Running the pipeline twice does not appear to be the intention of these examples

One is with `cross_attention_kwargs` and the other (next line) removes it

* [Docs] Clarify that these are two separate examples

One using `scale` and the other without it
2023-11-09 14:26:14 -08:00
M. Tolga Cangöz
53a8439fd1 [Docs] Fix typos and update files at Optimization Page (#5674)
* Fix typos, update, trim trailing whitespace

* Trim trailing whitespaces

* Update docs/source/en/optimization/memory.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/optimization/memory.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update _toctree.yml

* Update adapt_a_model.md

* Reverse

* Reverse

* Reverse

* Update dreambooth.md

* Update instructpix2pix.md

* Update lora.md

* Update overview.md

* Update t2i_adapters.md

* Update text2image.md

* Update text_inversion.md

* Update create_dataset.md

* Update create_dataset.md

* Update create_dataset.md

* Update create_dataset.md

* Update coreml.md

* Delete docs/source/en/training/create_dataset.md

* Original create_dataset.md

* Update create_dataset.md

* Delete docs/source/en/training/create_dataset.md

* Add original file

* Delete docs/source/en/training/create_dataset.md

* Add original one

* Delete docs/source/en/training/text2image.md

* Delete docs/source/en/training/instructpix2pix.md

* Delete docs/source/en/training/dreambooth.md

* Add original files

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-09 13:48:57 -08:00
Suraj Patil
db2d8e76f8 Add LCM Scripts (#5727)
* add lcm scripts

* Co-authored-by: dgu8957@gmail.com
2023-11-09 17:29:12 +01:00
Sayak Paul
bc2ba004c6 [LCM] add: lcm docs. (#5723)
* add: lcm docs.

* correct path

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* up

* add

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-09 17:21:26 +01:00
Patrick von Platen
3d7eaf83d7 LCM Add Tests (#5707)
* lcm add tests

* uP

* Fix all

* uP

* Add

* all

* uP

* uP

* uP

* uP

* uP

* uP

* uP
2023-11-09 15:45:11 +01:00
Patrick von Platen
bf406ea886 Correct consist dec (#5722)
* uP

* Update src/diffusers/models/consistency_decoder_vae.py

* uP

* uP
2023-11-09 13:10:24 +01:00
Will Berman
2fd46405cd consistency decoder (#5694)
* consistency decoder

* rename

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/pipelines/consistency_models/pipeline_consistency_models.py

* uP

* Apply suggestions from code review

* uP

* uP

* uP

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-11-09 12:21:41 +01:00
Dhruv Nair
43346adc1f Install accelerate from PyPI in PR test runner (#5721)
install accelerate from PyPI
2023-11-09 15:38:00 +05:30
takuoko
6110d7c95f [Bugfix] fix error of peft lora when xformers enabled (#5697)
* bugfix peft lora

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-08 22:55:36 +01:00
Dhruv Nair
65ef7a0c5c Fix prompt bug in AnimateDiff (#5702)
* fix prompt bug

* add test
2023-11-08 21:49:09 +01:00
apolinário
6e68c71503 Add adapter fusing + PEFT to the docs (#5662)
* Add adapter fusing + PEFT to the docs

* Update docs/source/en/tutorials/using_peft_for_inference.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/tutorials/using_peft_for_inference.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/tutorials/using_peft_for_inference.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tutorials/using_peft_for_inference.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tutorials/using_peft_for_inference.md

* Update docs/source/en/tutorials/using_peft_for_inference.md

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-08 18:26:53 +01:00
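The docs added here revolve around combining and fusing adapters via the PEFT backend; roughly (the LoRA repositories, adapter names, and prompt are placeholders):

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Hypothetical LoRA checkpoints, purely for illustration.
pipe.load_lora_weights("some-user/toy-lora", adapter_name="toy")
pipe.load_lora_weights("some-user/pixel-lora", adapter_name="pixel")

# Mix both adapters with per-adapter weights...
pipe.set_adapters(["toy", "pixel"], adapter_weights=[1.0, 0.5])

# ...and optionally fuse them into the base weights to drop the LoRA runtime overhead.
pipe.fuse_lora()
image = pipe("a toy robot, pixel art").images[0]
pipe.unfuse_lora()  # restore the original weights if needed
```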
Patrick von Platen
17528afcba Fix styling issues (#5699)
* up

* up

* up

* Empty-Commit

* fix keyword argument call.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-11-08 17:49:06 +05:30
Sayak Paul
78be400761 [PixArt-Alpha] fix mask feature condition. (#5695)
* fix mask feature condition.

* debug

* remove identical test

* set correct

* Empty-Commit
2023-11-08 17:42:46 +05:30
Patrick von Platen
c803a8f8c0 [LCM] Fix img2img (#5698)
* [LCM] Fix img2img

* make fix-copies

* make fix-copies

* make fix-copies

* up
2023-11-08 11:51:46 +01:00
Philipp Hasper
d384265df7 Fixed is_safetensors_compatible() handling of windows path separators (#5650)
Closes #4665
2023-11-08 11:51:15 +01:00
Kirill
11c125667b Fix the misaligned pipeline usage in dreamshaper docstrings (#5700)
Fix the misaligned pipeline usage
2023-11-08 11:49:03 +01:00
YiYi Xu
69996938cf speed up Shap-E fast test (#5686)
skip rendering

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-11-08 11:43:20 +01:00
Chi
9ae90593c0 Replacing the nn.Mish activation function with a get_activation function. (#5651)
* I added a new docstring to the class. This makes it easier for other developers to understand what it does and where it is used.

* Update src/diffusers/models/unet_2d_blocks.py

This change was suggested by the maintainer.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/unet_2d_blocks.py

Add suggested text

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_blocks.py

I changed the Parameter to Args text.

* Update unet_2d_blocks.py

Set proper indentation in this file.

* Update unet_2d_blocks.py

Made a small change to the act_fun argument line.

* I ran the black command to reformat the code style

* Update unet_2d_blocks.py

Added a similar docstring to match the one in the original diffusion repository.

* I removed the dummy variable defined in both the encoder and decoder.

* Now, I ran the black package to reformat my file

* Remove the redundant line from the adapter.py file.

* Used the black package to reformat my file

* Replacing the nn.Mish activation function with a get_activation function allows developers to more easily choose the right activation function for their task. Additionally, removing redundant variables can improve code readability and maintainability.

* I try to fix this: Fast tests for PRs / Fast PyTorch Models & Schedulers CPU tests (pull_request)

* Update src/diffusers/models/resnet.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2023-11-08 07:58:21 +05:30
M. Tolga Cangöz
7942bb8dc2 [Docs] Fix typos, improve, update at Using Diffusers' Task page (#5611)
* Fix typos, improve, update; kandinsky doesn't want fp16 due to deprecation; ogkalu and kohbanye don't have safetensor; add make_image_grid for better visualization

* Update inpaint.md

* Remove erronous Space

* Update docs/source/en/using-diffusers/conditional_image_generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update img2img.md

* load_image() already converts to RGB

* Update depth2img.md

* Update img2img.md

* Update inpaint.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-07 15:42:20 -08:00
dg845
aab6de22c3 Improve LCMScheduler (#5681)
* Refactor LCMScheduler.step such that prev_sample == denoised at the last timestep in the schedule.

* Make timestep scaling when calculating boundary conditions configurable.

* Reparameterize timestep_scaling to be a multiplicative rather than division scaling.

* make style

* fix dtype conversion

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-07 18:48:18 +01:00
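The reparameterized boundary conditions referred to above can be written as a small standalone function — a sketch of the math with timestep_scaling applied multiplicatively and sigma_data = 0.5 as in the consistency-model parameterization, not a copy of the scheduler code:

```python
import math

def lcm_boundary_conditions(timestep: float, timestep_scaling: float = 10.0, sigma_data: float = 0.5):
    # Scale the discrete timestep multiplicatively before computing c_skip / c_out.
    scaled_t = timestep_scaling * timestep
    c_skip = sigma_data**2 / (scaled_t**2 + sigma_data**2)
    c_out = scaled_t / math.sqrt(scaled_t**2 + sigma_data**2)
    return c_skip, c_out

# At t = 0 the skip connection dominates (c_skip == 1, c_out == 0),
# which is why prev_sample == denoised at the last timestep of the schedule.
print(lcm_boundary_conditions(0.0))
```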
Sayak Paul
1dc231d14a [PixArt-Alpha] Support non-square images (#5672)
* debug

* support non-square images

* add: test

* fix: test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-07 18:21:33 +01:00
Sayak Paul
84cd9e8d01 Make sure DDPM and diffusers can be used without Transformers (#5668)
* fix: import bug

* fix

* fix

* fix import utils for lcm

* fix: pixart alpha init

* Fix

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-07 17:38:12 +01:00
Sayak Paul
a8523bffa8 [PixArt-Alpha] fix mask_feature so that precomputed embeddings work with a batch size > 1 (#5677)
* fix embeds

* remove todo

* add: test

* better name
2023-11-07 17:12:47 +01:00
Dhruv Nair
97c8199dbb Explicit torch/flax dependency check (#5673)
* explicit torch dependency check

* update

* update

* update
2023-11-07 16:38:20 +01:00
Dhruv Nair
414d7c4991 Fix Basic Transformer Block (#5683)
* fix

* Update src/diffusers/models/attention.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-07 16:36:49 +01:00
Dhruv Nair
8ca179a0a9 Update free model hooks (#5680)
update free model hooks
2023-11-07 20:50:57 +05:30
Dhruv Nair
71f56c771a Model tests xformers fixes (#5679)
* fix model xformers test

* update
2023-11-07 20:50:41 +05:30
Dhruv Nair
6a89a6c93a Update custom diffusion attn processor (#5663)
update custom diffusion attn processor
2023-11-07 12:46:38 +05:30
Beinsezii
9bafef34bd Add Pixart to AUTO_TEXT2IMAGE_PIPELINES_MAPPING (#5664) 2023-11-07 07:45:56 +05:30
Sayak Paul
64603389da post release (v0.22.0) (#5658)
post release
2023-11-06 16:23:38 +01:00
Patrick von Platen
f05d75c076 [Custom Pipelines] Make sure that community pipelines can use repo revision (#5659)
fix custom pipelines
2023-11-06 15:11:48 +01:00
Sayak Paul
aec3de8bdb correct pipeline class name (#5652) 2023-11-06 14:08:27 +05:30
Sayak Paul
d61889fc17 [Feat] PixArt-Alpha (#5642)
* init pixart alpha pipeline

* fix: import

* script

* script

* script

* add: vae to the pipeline

* add: vae_scale_factor

* add: checkpoint_path

* clean conversion script a bit.

* size embeddings.

* fix: size embedding

* update script

* support for interpolation of position embedding.

* support for conditioning.

* ..

* ..

* ..

* final layer

* final layer

* align if encode_prompt

* support for caption embedding

* refactor

* refactor

* refactor

* start cross attention

* start cross attention

* cross_attention_dim

* cross

* cross

* support for resolution and aspect_ratio

* support for caption projection

* refactor patch embeddings

* batch_size

* up

* commit

* commit

* commit.

* squeeze

* squeeze

* squeeze

* squeeze

* squeeze

* squeeze

* squeeze

* squeeze

* squeeze

* squeeze

* squeeze

* squeeze.

* squeeze.

* fix final block./

* fix final block./

* fix final block./

* clean

* fix: interpolation scale.

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging'

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* make --checkpoint_path non-required.

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* remove num_tokens

* timesteps -> timestep

* timesteps -> timestep

* timesteps -> timestep

* timesteps -> timestep

* timesteps -> timestep

* timesteps -> timestep

* debug

* debug

* update conversion script.

* update conversion script.

* update conversion script.

* debug

* debug

* debug

* clean

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* debug

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* fix

* clean

* fix

* fix

* boom

* boom

* some changes

* boom

* save

* up

* remove i

* fix more tests

* DPMSolverMultistepScheduler

* fix

* offloading

* fix conversion script

* fix conversion script

* remove print

* remove support for negative prompt embeds.

* typo.

* remove extra kwargs

* bring conversion script to where it was

* fix

* trying my luck

* trying my luck again

* again

* again

* again

* clean up

* up

* up

* update example

* support for 512

* remove spacing

* finalize docs.

* test debug

* fix: assertion values.

* debug

* debug

* debug

* fix: repeat

* remove prints.

* Apply suggestions from code review

* Apply suggestions from code review

* Correct more

* Apply suggestions from code review

* Change all

* Clean more

* fix more

* Fix more

* Fix more

* Correct more

* address patrick's comments.

* remove unneeded args

* clean up pipeline.

* style

* make the use of additional conditions better conditioned.

* None better

* dtype

* height and width validation

* add a note about size brackets.

* fix

* spit out slow test outputs.

* fix?

* fix optional test

* fix more

* remove unneeded comment

* debug

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-06 08:40:04 +01:00
YiYi Xu
2b23ec82e8 add callbacks to denoising step (#5427)
* draft1

* update

* style

* move to the end of loop

* update

* update callbak_on_step_end_inputs

* Revert "update"

This reverts commit 5f9b153183.

* Revert "update callbak_on_step_end_inputs"

This reverts commit 44889f4dab.

* update

* update test required_optional_params

* remove self.lora_scale

* img2img

* inpaint

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix

* apply feedbacks on img2img + inpaint: keep only important pipeline attributes

* depth

* pix2pix

* make _callback_tensor_inputs an class variable so that we can use it for testing

* add a basic tst for callback

* add a read-only tensor input timesteps + fix tests

* add second test for callback cfg

* sdxl

* sdxl img2img

* sdxl inpaint

* kandinsky prior

* kandinsky decoder

* kandinsky img2img + combined

* kandinsky inpaint

* fix copies

* fix

* consistent default inputs

* fix copies

* wuerstchen_prior prior

* test_wuerstchen_decoder + fix test for prior

* wuerstchen_combined pipeline + skip tests

* skip test for kandinsky combined

* lcm

* remove timesteps etc

* add doc string

* copies

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* make style and improve tests

* up

* up

* fix more

* fix cfg test

* tests for callbacks

* fix for real

* update

* lcm img2img

* add doc

* add doc page to index

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-11-05 20:00:41 +01:00
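The callback API introduced by this PR can be used along these lines (a minimal sketch; which tensor names are available as callback inputs depends on the pipeline's `_callback_tensor_inputs`, and the prompt is illustrative):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

def disable_cfg_after_40_percent(pipeline, step, timestep, callback_kwargs):
    # Runs at the end of every denoising step; may adjust the declared tensor inputs.
    if step == int(0.4 * pipeline.num_timesteps):
        prompt_embeds = callback_kwargs["prompt_embeds"]
        # Keep only the conditional half once classifier-free guidance is switched off.
        callback_kwargs["prompt_embeds"] = prompt_embeds.chunk(2)[-1]
        pipeline._guidance_scale = 0.0
    return callback_kwargs

image = pipe(
    "a photo of a corgi wearing sunglasses",
    callback_on_step_end=disable_cfg_after_40_percent,
    callback_on_step_end_tensor_inputs=["prompt_embeds"],
).images[0]
```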
Chi
080081bded Remove the redundant line from the adapter.py file. (#5618)
* I added a new docstring to the class. This makes it easier for other developers to understand what it does and where it is used.

* Update src/diffusers/models/unet_2d_blocks.py

This change was suggested by the maintainer.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/unet_2d_blocks.py

Add suggested text

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_blocks.py

I changed the Parameter to Args text.

* Update unet_2d_blocks.py

Set proper indentation in this file.

* Update unet_2d_blocks.py

Made a small change to the act_fun argument line.

* I ran the black command to reformat the code style

* Update unet_2d_blocks.py

Added a similar docstring to match the one in the original diffusion repository.

* I removed the dummy variable defined in both the encoder and decoder.

* Now, I ran the black package to reformat my file

* Remove the redundant line from the adapter.py file.

* Used the black package to reformat my file

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-11-03 22:02:36 -10:00
Sayak Paul
dd9a5caf61 [Core] support for tiny autoencoder in img2img (#5636)
* support for tiny autoencoder in img2img

Co-authored-by: slep0v <37597789+slep0v@users.noreply.github.com>

* copy fix

* line space

* line space

* clean up

* spit out expected value

* spit out expected value

* assertion values.

* assertion values.

---------

Co-authored-by: slep0v <37597789+slep0v@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-11-03 15:31:53 +01:00
M. Tolga Cangöz
a35e72b032 [Docs] Fix typos, improve, update at Using Diffusers' Techniques page (#5627)
Fix typos, improve, update; better visualization
2023-11-03 13:51:41 +01:00
dg845
beb8f216ed Clean up LCM Pipeline and Test Code. (#5641)
* Clean up LCM pipeline and pipeline test code.

* Add comment for LCM img2img sampling loop.
2023-11-03 13:50:48 +01:00
Ryan Dick
7ad70cee74 Model loading speed optimization (#5635)
Move unchanging operation out of loop for speed benefit.
2023-11-03 13:48:13 +01:00
Sayak Paul
60c5eb5877 [Easy] clean up the LCM docstrings. (#5637)
* clean up the LCM docstrings.

* clean up

* fix: examples

* Apply suggestions from code review
2023-11-03 12:14:48 +01:00
YiYi Xu
d122206466 fix a bug in AutoPipeline.from_pipe() when creating a controlnet pipeline from an existing controlnet (#5638)
fix

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-11-03 12:14:19 +01:00
Sayak Paul
c84982a804 [Easy] Minor AnimateDiff Doc nits (#5640)
minor
2023-11-03 16:27:54 +05:30
Dhruv Nair
84e7bb875d Update animatediff docs to include section on Motion LoRAs (#5639)
update animatediff docs
2023-11-03 15:53:59 +05:30
Patrick von Platen
072e00897a [LCM] Make sure img2img works (#5632)
* [LCM] Clean up implementations

* Add all

* correct more

* correct more

* finish

* up
2023-11-02 19:50:47 +01:00
M. Tolga Cangöz
b91d5ddd1a [Docs] Fix typos, improve, update at Using Diffusers' Loading & Hub page (#5584)
* Fix typos, improve, update

* Change to trending and apply some Grammarly fixes

* Grammarly fixes

* Update loading_adapters.md

* Update loading_adapters.md

* Update other-formats.md

* Update push_to_hub.md

* Update loading_adapters.md

* Update loading.md

* Update docs/source/en/using-diffusers/push_to_hub.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update schedulers.md

* Update docs/source/en/using-diffusers/loading.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/loading_adapters.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update A1111 LoRA files part

* Update other-formats.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-02 11:05:43 -07:00
Dhruv Nair
2a8cf8e39f Animatediff Proposal (#5413)
* draft design

* clean up

* clean up

* clean up

* clean up

* clean up

* clean  up

* clean up

* clean up

* clean up

* update pipeline

* clean up

* clean up

* clean up

* add tests

* change motion block

* clean up

* clean up

* clean up

* update

* update

* update

* update

* update

* update

* update

* update

* clean up

* update

* update

* update model test

* update

* update

* update

* update

* make style

* update

* fix embeddings

* update

* merge upstream

* make fix-copies

* fix bug

* fix mistake

* add docs

* update

* clean up

* update

* clean up

* clean up

* fix docstrings

* fix docstrings

* update

* update

* clean  up

* update
2023-11-02 15:04:03 +01:00
M. Tolga Cangöz
9ced7844da [Docs] Fix typos, improve, update at Conceptual Guides page (#5585)
* Fix typos, improve, update

* Update docs/source/en/conceptual/contribution.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conceptual/contribution.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/conceptual/philosophy.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update philosophy.md

* Update philosophy.md

* Update docs/source/en/conceptual/philosophy.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/controlling_generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/using-diffusers/controlling_generation.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Remove e.g.; some Grammarly fixes

* Update docs/source/en/conceptual/philosophy.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update contribution.md

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-02 12:55:23 +01:00
Patrick von Platen
9723f8a557 [Tests] Fix cpu offload test (#5626)
* fix more

* fix more
2023-11-02 12:49:58 +01:00
Sayak Paul
b81f709fb6 [remote code] document trust remote code. (#5620)
document trust remote code.
2023-11-02 12:02:31 +01:00
Steven Liu
75ea54a151 [docs] Kandinsky guide (#4555)
* kandinsky 2.1 first draft

* add kandinsky 2.2

* fix identical section headers

* try hfoptions syntax

* add img2img

* add inpaint

* add interpolate

* fix tag

* more cleanups

* typo

* update hfoptions id

* align hfoptions tags
2023-11-01 15:36:22 -07:00
Patrick von Platen
c0f0582651 [SDXL Adapter] Revert load lora (#5615)
* fix

* fix
2023-11-01 22:18:58 +01:00
M. Tolga Cangöz
b81c69e489 [Docs] Fix typos, improve, update at Get Started page (#5587)
* Fix typos, improve, update

* Update _toctree.yml

* Update docs/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply Grammarly fixes

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-01 22:11:57 +01:00
Younes Belkada
02ba50c610 [PEFT / LoRA] Fix civitai bug when network alpha is an empty dict (#5608)
* fix civitai bug

* add test

* up

* fix test

* added slow test.

* style

* Update src/diffusers/utils/peft_utils.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Update src/diffusers/utils/peft_utils.py

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2023-11-01 22:08:22 +01:00
Patrick von Platen
4f2bf67355 Revert "Fix the order of width and height of original size in SDXL training script" (#5614)
Revert "Fix the order of width and height of original size in SDXL training script (#5382)"

This reverts commit 45db049973.
2023-11-01 22:04:47 +01:00
Chi
29cf163b95 Remove Redundant Variables from Encoder and Decoder (#5569)
* I added a new docstring to the class. This makes it easier for other developers to understand what it does and where it is used.

* Update src/diffusers/models/unet_2d_blocks.py

This change was suggested by the maintainer.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/unet_2d_blocks.py

Add suggested text

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_blocks.py

I changed the Parameter to Args text.

* Update unet_2d_blocks.py

Set proper indentation in this file.

* Update unet_2d_blocks.py

Made a small change to the act_fun argument line.

* I ran the black command to reformat the code style

* Update unet_2d_blocks.py

Added a similar docstring to match the one in the original diffusion repository.

* I removed the dummy variable defined in both the encoder and decoder.

* Now, I ran the black package to reformat my file

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-11-01 21:50:33 +01:00
Patrick von Platen
839c2a5ece fix 2023-11-01 21:39:30 +01:00
ilisparrow
5712c3d2ef [Core] enable lora for sdxl adapters too and add slow tests. (#5555)
* Enable lora for sdxl adapters too.

Issue #5516

* fix: assertion values.

* Use numpy_cosine_similarity_distance on the arrays

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Use numpy_cosine_similarity_distance on the arrays

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Changed imports orders to pass tests

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

---------

Co-authored-by: Ilias A <iliasamri00@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-11-01 21:25:38 +01:00
clarencechen
151998e1c2 Update final CPU offloading code for more diffusion pipelines (#5589)
* Update final model offload for more pipelines

Add test to ensure all pipeline components are returned to CPU after
execution with model offloading

* Add comment to explain early UNet offload in Text-to-Video pipeline

* Style
2023-11-01 21:22:56 +01:00
Steven Liu
d1eb14bc35 [docs] Lu lambdas (#5602)
lu lambdas
2023-11-01 11:47:11 -07:00
M. Tolga Cangöz
5c75a5fbc4 [Docs] Fix typos, improve, update at Tutorials page (#5586)
* Fix typos, improve, update

* Update autopipeline.md

* Update docs/source/en/tutorials/using_peft_for_inference.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tutorials/using_peft_for_inference.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/tutorials/using_peft_for_inference.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-11-01 10:40:47 -07:00
M. Tolga Cangöz
442017ccc8 [Docs] Fix typos (#5583)
* Add Copyright info

* Fix typos, improve, update

* Update deepfloyd_if.md

* Update ldm3d_diffusion.md

* Update opt_overview.md
2023-10-31 10:04:08 -07:00
Dhruv Nair
f1d052c5b8 Update docker image for xformers (#5597)
update docker image for xformers
2023-10-31 15:02:10 +05:30
YiYi Xu
ce9484b139 fix a mistake in text2image training script for kandinsky2.2 (#5244)
fix

Co-authored-by: yiyixuxu <yixu@Yis-MacBook-Pro.local>
2023-10-30 23:06:16 -10:00
Jincheng Miao
ed00ead345 [Community Pipelines] add textual inversion support for stable_diffusion_ipex (#5571) 2023-10-31 11:54:16 +05:30
TimothyAlexisVass
f0b2f6ce05 Fix divide by zero RuntimeWarning (#5543) 2023-10-31 11:39:08 +05:30
Younes Belkada
32fea1cc9b [core / PEFT ]Bump transformers min version for PEFT integration (#5579)
Update constants.py
2023-10-30 19:35:46 +01:00
Aryan V S
bb46be2f18 Fix incorrect loading of custom pipeline (#5568)
* update

* update

* update

* update
2023-10-30 19:32:11 +01:00
Cheng Lu
ac7b1716b7 Stabilize DPM++, especially for SDXL and SDE-DPM++ (#5541)
* stabilize dpmpp for sdxl by using euler at the final step

* add lu's uniform logsnr time steps

* add test

* fix check_copies

* fix tests

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-30 06:36:53 -10:00
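A hedged sketch of how the two stabilizations named above (an Euler step at the final timestep, and Lu's uniform log-SNR timesteps) might be switched on; the euler_at_final and use_lu_lambdas option names are my assumption about what the PR exposes, and the checkpoint id is illustrative.

```
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipe = DiffusionPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config,
    euler_at_final=True,   # take an Euler step at the final timestep for stability
    use_lu_lambdas=True,   # Lu's uniform log-SNR timestep spacing
)
```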
Peter @sHTiF Stefcek
3fc10ded00 add fix to be able use StableDiffusionXLAdapterPipeline.from_single_file (#5547) 2023-10-30 16:46:44 +01:00
Thuan H. Nguyen
5b087e82d1 Add realfill (#5456)
* Add realfill

* Move realfill folder

* Fix some format issues
2023-10-30 15:21:40 +01:00
Younes Belkada
8f3100db9f [PEFT / Tests] Add peft slow tests on push (#5419)
* add peft slow tests workflow

* Update .github/workflows/push_tests.yml

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-30 14:27:00 +01:00
Patrick von Platen
3ec828d6dd Fix moved _expand_mask function (#5581)
* finish

* finish
2023-10-30 14:25:31 +01:00
Gabriel de Souza
9135e54e76 docs: initial pt translation (#5549)
* docs: initial pt translation

* docs: add pt build to github workflow and fix some missing translations
2023-10-27 10:51:35 -07:00
jiaqiw09
e140c0562e fix 'find_unused_parameters' error reported when running on multiple GPUs (#5355)
* fix 'find_unused_parameters' error reported when running on multiple GPUs or NPUs

* fix code check of importing module by its alphabetic order

---------

Co-authored-by: jiaqiw <wangjiaqi50@huawei.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-10-27 22:49:14 +05:30
Steven Liu
595ba6f786 [docs] Internal classes API (#5513)
* internal classes api

* add internal class overview

* fix toctree
2023-10-27 09:48:41 -07:00
Sayak Paul
798591346d [Core] fix FreeU disable method (#5552)
* disable freeu debug

* debug

* potentially fix.

* finish

* manually remove the spaces

* remove tab
2023-10-27 21:29:11 +05:30
YiYi Xu
f912f39b50 correct checkpoint in kandinsky2.2 doc page (#5550)
update checkpoint

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-10-27 08:49:15 +05:30
nickkolok
0d4b459be6 Update train_dreambooth.py - fix typos (#5539) 2023-10-26 13:35:05 -07:00
Patrick von Platen
cee1cd6e9c [Remote code] Add functionality to run remote models, schedulers, pipelines (#5472)
* upload custom remote poc

* up

* make style

* finish

* better name

* Apply suggestions from code review

* Update tests/pipelines/test_pipelines.py

* more fixes

* remove ipdb

* more fixes

* fix more

* finish tests

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-26 17:11:49 +02:00
p1kit
5b448a5e5d [Tests] Optimize test configurations for faster execution (#5535)
Optimize test configurations for faster execution
2023-10-26 16:02:34 +05:30
Patrick von Platen
a69ebe5527 [Tests] Speed up expert of mixture tests (#5533)
* [Tests] Speed up expert of mixture tests

* make style
2023-10-26 09:42:27 +02:00
Chi
ce7f334472 Remove multiple if-else statements in the get_activation function. (#5446)
* I added a new docstring to the class. This makes it easier for other developers to understand what the class does and where it is used.

* Update src/diffusers/models/unet_2d_blocks.py

This change was suggested by the maintainer.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/unet_2d_blocks.py

Add suggested text

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_blocks.py

I changed the 'Parameter' text to 'Args'.

* Update unet_2d_blocks.py

Set proper indentation in this file.

* Update unet_2d_blocks.py

Made a small change to the act_fun argument line.

* I ran the black command to reformat the code style

* Update unet_2d_blocks.py

Added a docstring similar to the one in the original diffusion repository.

* I used the lower() method on the activation function name.

* Replaced the multiple if-else statements with a dictionary of activation functions, so a single lookup retrieves the appropriate function.

* I used the black package to reformat my file

* I defined the ACTIVATION_FUNCTIONS variable outside of the function

* Converted the activation function variable to lower case

* First, I resolved the conflict issue. Then, I ran the Black package to reformat my file.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-26 09:36:30 +05:30
Ran Ran
8959c5b9de Add from_pt flag to enable model from PT (#5501)
* Add from_pt flag to enable model from PT

* Format the file

* Reformat the file
2023-10-25 23:07:34 +02:00
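The commit above does not show which loader gained the flag, so the following is only a hedged illustration of the from_pt idea using the existing Flax model-loading convention; the checkpoint id is illustrative.

```
from diffusers import FlaxUNet2DConditionModel

# Convert the PyTorch checkpoint to Flax parameters on load.
unet, unet_params = FlaxUNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    subfolder="unet",
    from_pt=True,
)
```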
Steven Liu
bc8a08f67c [docs] Loader docs (#5473)
* first draft

* make fix-copies

* add peft section

* manual fix

* make fix-copies again

* manually revert changes to other files
2023-10-25 09:45:05 -07:00
Yi-Xuan XU
dbce14da56 fix a bug on torch_dtype argument in from_single_file of ControlNetModel (#5528)
fix wrong parameter
2023-10-25 17:29:56 +02:00
RampagingSloth
71ad02607d Fix missing punctuation in PHILOSOPHY.md (#5530)
Fix missing punctuation.
2023-10-25 17:29:34 +02:00
Patrick von Platen
dd981256ad make fix-copies 2023-10-25 17:19:38 +02:00
Aryan V S
0c9f174d59 Improve typehints and docs in diffusers/models (#5391)
* improvement: add typehints and docs to src/diffusers/models/attention_processor.py

* improvement: add typehints and docs to src/diffusers/models/vae.py

* improvement: add missing docs in src/diffusers/models/vq_model.py

* improvement: add typehints and docs to src/diffusers/models/transformer_temporal.py

* improvement: add typehints and docs to src/diffusers/models/t5_film_transformer.py

* improvement: add type hints to src/diffusers/models/unet_1d_blocks.py

* improvement: add missing type hints to src/diffusers/models/unet_2d_blocks.py

* fix: CI error (make fix-copies required)

* fix: CI error (make fix-copies required again)

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-10-25 17:19:15 +02:00
Patrick von Platen
d420d71398 make style 2023-10-25 16:12:14 +02:00
Logan
a1fad8286f Add a new community pipeline (#5477)
* Add a new community pipeline

examples/community/latent_consistency_img2img.py

which can be called like this

import torch
from PIL import Image
from diffusers import DiffusionPipeline

# Load the community text-to-image pipeline to reuse its components.
pipe = DiffusionPipeline.from_pretrained(
    "SimianLuo/LCM_Dreamshaper_v7", custom_pipeline="latent_consistency_txt2img", custom_revision="main"
)

# To save GPU memory, torch.float16 can be used, but it may compromise image quality.
pipe.to(torch_device="cuda", torch_dtype=torch.float32)

# LatentConsistencyModelPipeline_img2img is the class defined in
# examples/community/latent_consistency_img2img.py (import it from there).
img2img = LatentConsistencyModelPipeline_img2img(
    vae=pipe.vae,
    text_encoder=pipe.text_encoder,
    tokenizer=pipe.tokenizer,
    unet=pipe.unet,
    # scheduler=pipe.scheduler,
    scheduler=None,
    safety_checker=None,
    feature_extractor=pipe.feature_extractor,
    requires_safety_checker=False,
)

prompt = "a sunset over a mountain lake"  # example prompt
strength = 0.5                            # example img2img strength
img = Image.open("thisismyimage.png")

result = img2img(prompt, img, strength, num_inference_steps=4)

* Apply suggestions from code review

Fix name formatting for scheduler

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update readme (and run formatter on latent_consistency_img2img.py)

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-25 16:11:56 +02:00
Patrick von Platen
dc943eb99d [Schedulers] Fix 2nd order other than heun (#5526)
* [Schedulers] Fix 2nd order other than heun

* Apply suggestions from code review
2023-10-25 14:39:56 +02:00
YiYi Xu
0fc25715a1 fix a bug in 2nd order schedulers when using in ensemble of experts config (#5511)
* fix

* fix copies

* remove heun from tests

* add back heun and fix the tests to include 2nd order

* fix the other test too

* Apply suggestions from code review

* Apply suggestions from code review

* Apply suggestions from code review

* make style

* add more comments

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-25 12:34:05 +02:00
AnyISalIn
de71fa59f5 fix error of peft lora when xformers enabled (#5506)
Signed-off-by: AnyISalIn <anyisalin@gmail.com>
2023-10-25 10:58:15 +05:30
Chengxi Guo
dcbfe662ef fix typo (#5505)
Signed-off-by: mymusise <mymusise1@gmail.com>
2023-10-24 17:14:05 -07:00
dg845
958e17dada Add Latent Consistency Models Pipeline (#5448)
* initial commit for LatentConsistencyModelPipeline and LCMScheduler based on the community pipeline

* Add callback and freeu support.

* apply suggestions from review

* Clean up LCMScheduler

* Remove timeindex argument to LCMScheduler.step.

* Add support for clipping or thresholding the predicted original sample.

* Remove unused methods and arguments in LCMScheduler.

* Improve comment about (lack of) negative prompt support.

* Change input guidance_scale to match the StableDiffusionPipeline (Imagen) CFG formulation.

* Move lcm_origin_steps from pipeline __call__ to LCMScheduler.__init__/config (as origin_steps).

* Fix typo when clipping/thresholding in LCMScheduler.

* Add some initial LCMScheduler tests.

* add type annotations from review

* Fix type annotation bug.

* Override test_add_noise_device in LCMSchedulerTest since hardcoded timesteps doesn't work under default settings.

* Add generator argument to the pipeline prepare_latents call.

* Cast LCMScheduler.timesteps to long in set_timesteps.

* Add onestep and multistep full loop scheduler tests.

* Set default height/width to None and don't hardcode guidance scale embedding dim.

* Add initial LatentConsistencyPipeline fast and slow tests.

* Add initial documentation for LatentConsistencyModelPipeline and LCMScheduler.

* Make remaining failing fast tests pass.

* make style

* Make original_inference_steps configurable from pipeline __call__ again.

* make style

* Remove guidance_rescale arg from pipeline __call__ since LCM currently doesn't support CFG.

* Make LCMScheduler defaults match config of LCM_Dreamshaper_v7 checkpoint.

* Fix LatentConsistencyPipeline slow tests and add dummy expected slices.

* Add checks for original_steps in LCMScheduler.set_timesteps.

* make fix-copies

* Improve LatentConsistencyModelPipeline docs.

* Apply suggestions from code review

Co-authored-by: Aryan V S <avs050602@gmail.com>

* Apply suggestions from code review

Co-authored-by: Aryan V S <avs050602@gmail.com>

* Apply suggestions from code review

Co-authored-by: Aryan V S <avs050602@gmail.com>

* Update src/diffusers/schedulers/scheduling_lcm.py

* Apply suggestions from code review

Co-authored-by: Aryan V S <avs050602@gmail.com>

* finish

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Aryan V S <avs050602@gmail.com>
2023-10-24 21:06:02 +02:00
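A hedged usage sketch for the new pipeline, assuming the SimianLuo/LCM_Dreamshaper_v7 checkpoint mentioned elsewhere in this log resolves to it; the prompt, the 4-step setting, and guidance_scale=8.0 are typical LCM values rather than values taken from this commit.

```
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "SimianLuo/LCM_Dreamshaper_v7", torch_dtype=torch.float32
)
pipe.to("cuda")

# Latent consistency models only need a handful of inference steps.
image = pipe(
    "a photo of a cat wearing a spacesuit",
    num_inference_steps=4,
    guidance_scale=8.0,
).images[0]
```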
Steven Liu
7c3a75a1ce [docs] General updates (#5378)
* first draft

* feedback

* feedback
2023-10-24 11:51:55 -07:00
Isamu Isozaki
b8896a154a Japanese docs (#5478)
* Finished _toctree.yml and index.md

* Finished installation.md

* Properly finished installation.md and almost finished quicktour

* Finished quicktour

* Finished stable diffusion doc

* Fixed _toctree.yml

* Fixed requests

* Fix country code

* Properly push
2023-10-24 11:30:04 -07:00
Bowen Bao
c7617e482a Register BaseOutput subclasses as supported torch.utils._pytree nodes (#5459)
* Register BaseOutput subclasses as supported torch.utils._pytree nodes

* lint

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-10-24 15:01:47 +05:30
Sayak Paul
77241c48af [Core] Refactor activation and normalization layers (#5493)
* move out the activations.

* move normalization layers.

* add doc.

* add doc.

* fix: paths

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* style

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-24 08:49:43 +05:30
Abhishar Sinha
096f84b05f Fixed autoencoder typo (#5500) 2023-10-23 13:59:00 -07:00
YiYi Xu
9e1edfc1ad fix a few issues in controlnet inpaint pipelines (#5470)
* add

* Update docs/source/en/api/pipelines/controlnet_sdxl.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-23 09:24:51 -10:00
Steven Liu
6b06c30a65 [docs] Fix links (#5499)
fix links
2023-10-23 20:39:29 +02:00
zideliu
188d864fa3 [BUG] in transformer_temporal Fix Bugs (#5496)
Fix Bugs
2023-10-23 20:38:41 +02:00
Kyunghwan Kim
6e608d8a35 Fix typo in controlnet docs (#5486) 2023-10-23 20:36:35 +02:00
Dhruv Nair
33293ed504 Fix Slow Tests (#5469)
fix tests
2023-10-23 20:24:31 +02:00
Sayak Paul
48ce118d1c [torch.compile] fix graph break problems partially (#5453)
* fix: controlnet graph?

* fix: sample

* fix:

* remove print

* styling

* fix-copies

* prevent more graph breaks?

* prevent more graph breaks?

* see?

* revert.

* compilation.

* propagate changes to controlnet sdxl pipeline too.

* add: clean version checking.
2023-10-23 23:41:52 +05:30
Patrick von Platen
1ade42f729 make style 2023-10-23 19:43:54 +02:00
Shyam Marjit
677df5ac12 fixed SDXL text encoder training bug #5016 (#5078)
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-23 19:43:43 +02:00
Andrei Filatov
16851efa0f Update README.md (#5497)
Right now, only the "main" branch has this community pipeline code, so it is added manually into the pipeline
2023-10-23 18:57:43 +02:00
Ryan Dick
0eac9cd04e Make T2I-Adapter downscale padding match the UNet (#5435)
* Update get_dummy_inputs(...) in T2I-Adapter tests to take image height and width as params.

* Update the T2I-Adapter unit tests to run with the standard number of UNet down blocks so that all T2I-Adapter down blocks get exercised.

* Update the T2I-Adapter down blocks to better match the padding behavior of the UNet.

* Revert "Update the T2I-Adapter unit tests to run with the standard number of UNet down blocks so that all T2I-Adapter down blocks get exercised."

This reverts commit 6d4a060a34.

* Create utility functions for testing the T2I-Adapter downscaling behavior.

* (minor) Improve readability with an intermediate named variable.

* Statically parameterize T2I-Adapter test dimensions rather than generating them dynamically.

* Fix static checks.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-23 18:52:31 +02:00
Younes Belkada
bc7a4d4917 [PEFT] Fix scale unscale with LoRA adapters (#5417)
* fix scale unscale v1

* final fixes + CI

* fix slow test

* oops

* fix copies

* oops

* oops

* fix

* style

* fix copies

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-21 22:17:18 +05:30
Vishnu V Jaddipal
8dba180885 Added support to create asymmetrical U-Net structures (#5400)
* Added args, kwargs to ```U

* Add UNetMidBlock2D as a supported mid block type

* Fix extra init input for UNetMidBlock2D, change allowed types for Mid-block init

* Update unet_2d_condition.py

* Update unet_2d_condition.py

* Update unet_2d_condition.py

* Update unet_2d_condition.py

* Update unet_2d_condition.py

* Update unet_2d_condition.py

* Update unet_2d_condition.py

* Update unet_2d_condition.py

* Update unet_2d_blocks.py

* Update unet_2d_blocks.py

* Update unet_2d_blocks.py

* Update unet_2d_condition.py

* Update unet_2d_blocks.py

* Updated docstring, increased check strictness

Updated the docstring for ```UNet2DConditionModel``` to include ```reverse_transformer_layers_per_block``` and updated checking for nested list type ```transformer_layers_per_block```

* Add basic shape-check test for asymmetrical unets

* Update src/diffusers/models/unet_2d_blocks.py

Removed blank line

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_condition.py

Remove blank space

* Update unet_2d_condition.py

Changed docstring for `mid_block_type`

* Fixed docstring and wrong default value

* Reformat with black

* Reformat with necessary commands

* Add UNetMidBlockFlat to versatile_diffusion/modeling_text_unet.py to ensure consistency

* Removed args, kwargs, use on mid-block type

* Make fix-copies

* Update src/diffusers/models/unet_2d_condition.py

Wrap into single line

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* make fix-copies

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-20 11:42:28 +02:00
kesimeg
5366db5df1 fix unet2d ignoring class_labels (#5401)
* fix unet2d ignoring class_labels

* unet2.py error message updated

* style and quality changes

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-10-20 14:21:46 +05:30
Patrick von Platen
e516858886 make style 2023-10-18 22:48:49 +02:00
Chi
36a0bacc29 Beautiful docstring added to the UNetMidBlock2D class. (#5389)
* I added a new docstring to the class. This makes it easier for other developers to understand what the class does and where it is used.

* Update src/diffusers/models/unet_2d_blocks.py

This change was suggested by the maintainer.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/unet_2d_blocks.py

Add suggested text

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_blocks.py

I changed the 'Parameter' text to 'Args'.

* Update unet_2d_blocks.py

Set proper indentation in this file.

* Update unet_2d_blocks.py

Made a small change to the act_fun argument line.

* I ran the black command to reformat the code style

* Update unet_2d_blocks.py

Added a docstring similar to the one in the original diffusion repository.

* Update unet_2d_blocks.py

Added a beautiful docstring to the UNetMidBlock2D class.

* Update unet_2d_blocks.py

I replaced the definitions of the resnet_time_scale_shift and resnet_groups parameters.

* Update unet_2d_blocks.py

I removed the extra sentences from the resnet_groups argument description.

* Update unet_2d_blocks.py

I replaced my definition of the attention_head_dim parameter with the maintainer's definition.

* I used the black package to reformat my file

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2023-10-18 22:48:36 +02:00
Patrick von Platen
9ad0530fea make style 2023-10-18 22:39:29 +02:00
linjiapro
45db049973 Fix the order of width and height of original size in SDXL training script (#5382)
* wip

* wip

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-18 22:38:52 +02:00
Liang Hou
a68f5062fb style(sdxl): remove identity assignments (#5418) 2023-10-18 22:38:20 +02:00
__mo_san__
b864d674a5 Update-DeepFloyd-IF-Pipelines-Docstrings (#5304)
* added TODOs

* Enhanced and reformatted the docstrings of IFPipeline methods.

* Enhanced and fixed the docstrings of IFImg2ImgSuperResolutionPipeline methods.

* Enhanced and fixed the docstrings of IFImg2ImgPipeline methods.

* Enhanced and fixed the docstrings of IFInpaintingSuperResolutionPipeline methods.

* Enhanced and fixed the docstrings of IFInpaintingPipeline  methods.

* Enhanced and fixed the docstrings of IFSuperResolutionPipeline methods.

* Update src/diffusers/pipelines/deepfloyd_if/pipeline_if.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/deepfloyd_if/pipeline_if_img2img.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/deepfloyd_if/pipeline_if_img2img_superresolution.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/deepfloyd_if/pipeline_if_inpainting.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/deepfloyd_if/pipeline_if_superresolution.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/deepfloyd_if/pipeline_if_inpainting_superresolution.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* remove redundant code

* fix code style

* revert the ordering to not break backwards compatibility

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-18 21:40:41 +02:00
Patrick von Platen
85dccab7fd Add latent consistency (#5438)
* Add latent consistency

* Update examples/community/README.md

* Add latent consistency

* make fix copies

* Apply suggestions from code review
2023-10-18 14:18:31 +02:00
Sayak Paul
87fd3ce32b [from_single_file()]fix: local single file loading. (#5440)
fix: local single file loading.
2023-10-18 17:33:12 +05:30
Patrick von Platen
109d5bbe0d Merge branch 'main' of https://github.com/huggingface/diffusers 2023-10-18 09:27:46 +00:00
Patrick von Platen
f277d5e540 make fix copies 2023-10-18 09:27:35 +00:00
Dhruv Nair
28e8d1f6ec Fix pipe fetcher for slow tests (#5424)
* fix pipe fetcher

* filter out community pipelines
2023-10-18 00:42:56 +05:30
Dhruv Nair
98a0712d69 Update base image for slow CUDA tests (#5426)
update base image for tests
2023-10-17 23:18:53 +05:30
Susheel Thapa
324d18fba2 Chore: Typo fixed in multiple files (#5422) 2023-10-17 08:17:03 -07:00
Arka
ad8068e414 changed channel parameters for UNET and VAE. Changed configs parameters of CLIPText (#5370)
* changed channel parameters for UNET and VAE. Decreased hidden layers size with increased attention heads and intermediate size

* changed the assertion check range

* clean up

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-10-17 14:46:42 +05:30
Sayak Paul
b4cbbd5ed2 [Examples] Follow up of #5393 (#5420)
* fix: create_repo()

* Empty-Commit
2023-10-17 12:07:39 +05:30
Sayak Paul
8b3d2aeaf8 [Core] Fix/pipeline without text encoders for SDXL (#5301)
* fix: sdxl pipeline when unet is not available.

* fix moe

* account for text

* fix more

* don't make unet optional.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* split conditionals.

* add optional components to sdxl pipeline

* propagate changes to the rest of the pipelines.

* add: test

* add to all

* fix: rest of the pipelines.

* use pipeline_class variable

* separate pipeline mixin

* use safe_serialization

* fix: test

* access actual output.

* add: optional test to adapter and ip2p sdxl pipeline tests/

* add optional test to controlnet sdxl.

* fix tests

* fix ip2p tests

* fix more

* fix more.

* use np output type.

* fix for StableDiffusionXLMultiControlNetPipelineFastTests.

* fix: SDXLOptionalComponentsTesterMixin

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix tests

* Empty-Commit

* revert previous

* quality

* fix: test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-17 11:17:06 +05:30
Patrick von Platen
57239dacd0 make style 2023-10-16 16:29:50 +02:00
Gregg Helt
de12776b3a Add ability to mix usage of T2I-Adapter(s) and ControlNet(s). (#5362)
* Add ability to mix usage of T2I-Adapter(s) and ControlNet(s).
Previously, the UNet2DConditionModel implementation only allowed use of one or the other.
Adds a new forward() arg, down_intrablock_additional_residuals, specifically for T2I-Adapters. If down_intrablock_additional_residuals is not used, backward compatibility is maintained with prior usage of only a T2I-Adapter or only a ControlNet, but not both

* Improving forward() arg docs in src/diffusers/models/unet_2d_condition.py

Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>

* Add deprecation warning if down_block_additional_residues is used for T2I-Adapter (intrablock residuals)

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Oops my bad, fixing last commit.

* Added import of diffusers utils.deprecate

* Conform to max line length

* Modifying T2I-Adapter pipelines to reflect change to UNet forward() arg for T2I-Adapter residuals.

---------

Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-16 16:29:05 +02:00
Sayak Paul
cc12f3ec92 [Examples] Update with HFApi (#5393)
* update training examples to use HFAPI.

* update training example.

* reflect the changes in the korean version too.

* Empty-Commit
2023-10-16 19:34:46 +05:30
Heinz-Alexander Fuetterer
0ea78f9707 chore: fix typos (#5386)
* chore: fix typos

* Update src/diffusers/pipelines/shap_e/renderer.py

Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>

---------

Co-authored-by: psychedelicious <4822129+psychedelicious@users.noreply.github.com>
2023-10-16 15:23:37 +02:00
Sayak Paul
5495073faf [Docs] add docs on peft diffusers integration (#5359)
* add docs on peft diffusers integration/

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: pacman100 <13534540+pacman100@users.noreply.github.com>

* update URLs.

* Apply suggestions from code review

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* Apply suggestions from code review

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* minor changes

* Update docs/source/en/tutorials/using_peft_for_inference.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* reflect the latest changes.

* note about update.

---------

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: pacman100 <13534540+pacman100@users.noreply.github.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-16 18:41:37 +05:30
Kashif Rasul
d03c9099bc [Wuerstchen] text to image training script (#5052)
* initial script

* formatting

* prior trainer wip

* add efficient_net_encoder

* add CLIPTextModel

* add prior ema support

* optimizer

* fix typo

* add dataloader

* prompt_embeds and image_embeds

* initial training loop

* fix output_dir

* fix add_noise

* accelerator check

* make effnet_transforms dynamic

* fix training loop

* add validation logging

* use loaded text_encoder

* use PreTrainedTokenizerFast

* load weight from pickle

* save_model_card

* remove unused file

* fix typos

* save prior pipeline in its own folder

* fix imports

* fix pipe_t2i

* scale image_embeds

* remove snr_gamma

* format

* initial lora prior training

* log_validation and save

* initial gradient working

* remove save/load hooks

* set set_attn_processor on prior_prior

* add lora script

* typos

* use LoraLoaderMixin for prior pipeline

* fix usage

* make fix-copies

* use repo_id

* write_lora_layers is a staticmethod

* use defaults

* fix defaults

* undo

* Update src/diffusers/pipelines/wuerstchen/pipeline_wuerstchen_prior.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/loaders.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/loaders.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/wuerstchen/modeling_wuerstchen_prior.py

* Update src/diffusers/loaders.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/loaders.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add gradient checkpoint support to prior

* gradient_checkpointing

* formatting

* Update examples/wuerstchen/text_to_image/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update examples/wuerstchen/text_to_image/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update examples/wuerstchen/text_to_image/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update examples/wuerstchen/text_to_image/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update examples/wuerstchen/text_to_image/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update examples/wuerstchen/text_to_image/train_text_to_image_lora_prior.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update src/diffusers/loaders.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update examples/wuerstchen/text_to_image/train_text_to_image_prior.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* use default unet and text_encoder

* fix test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-10-16 15:00:33 +02:00
Sayak Paul
93df5bb670 [Examples] fix unconditioning generation training example for mixed-precision training (#5407)
* fix: unconditional generation example

* fix: float in loss.

* apply styling.
2023-10-16 14:11:35 +05:30
Sayak Paul
07b297e7de [Bot] FIX stale.py uses timezone-aware datetime (#5396)
reflect stalebot change from https://github.com/huggingface/peft/pull/1016/
2023-10-16 00:54:50 +05:30
Younes Belkada
2bfa55f4ed [core / PEFT / LoRA] Integrate PEFT into Unet (#5151)
* v1

* add tests and fix previous failing tests

* fix CI

* add tests + v1 `PeftLayerScaler`

* style

* add scale retrieving mechanism system

* fix CI

* up

* up

* simple approach --> not same results for some reason

* fix issues

* fix copies

* remove unneeded method

* active adapters!

* fix merge conflicts

* up

* up

* kohya - test-1

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix scale

* fix copies

* add comment

* multi adapters

* fix tests

* oops

* v1 faster loading - in progress

* Revert "v1 faster loading - in progress"

This reverts commit ac925f8132.

* kohya same generation

* fix some slow tests

* peft integration features for unet lora

1. Support for Multiple ranks/alphas
2. Support for Multiple active adapters
3. Support for enabling/disabling LoRAs

* fix `get_peft_kwargs`

* Update loaders.py

* add some tests

* add unfuse tests

* fix tests

* up

* add set adapter from sourab and tests

* fix multi adapter tests

* style & quality

* style

* remove comment

* fix `adapter_name` issues

* fix unet adapter name for sdxl

* fix enabling/disabling adapters

* fix fuse / unfuse unet

* nit

* fix

* up

* fix cpu offloading

* fix another slow test

* fix another offload test

* add more tests

* all slow tests pass

* style

* fix alpha pattern for unet and text encoder

* Update src/diffusers/loaders.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Update src/diffusers/models/attention.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* up

* up

* clarify comment

* comments

* change comment order

* change comment order

* style & quality

* Update tests/lora/test_lora_layers_peft.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix bugs and add tests

* Update src/diffusers/models/modeling_utils.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Update src/diffusers/models/modeling_utils.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* refactor

* suggestion

* add break statement

* add compile tests

* move slow tests to peft tests as I modified them

* quality

* refactor a bit

* style

* change import

* style

* fix CI

* refactor slow tests one last time

* style

* oops

* oops

* oops

* final tweak tests

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/loaders.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* comments

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* remove comments

* more comments

* try

* revert

* add `safe_merge` tests

* add comment

* style, comments and run tests in fp16

* add warnings

* fix doc test

* replace with `adapter_weights`

* add `get_active_adapters()`

* expose `get_list_adapters` method

* better error message

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* style

* trigger slow lora tests

* fix tests

* maybe fix last test

* revert

* Update src/diffusers/loaders.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Update src/diffusers/loaders.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Update src/diffusers/loaders.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Update src/diffusers/loaders.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* move `MIN_PEFT_VERSION`

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* let's not use class variable

* fix few nits

* change a bit offloading logic

* check earlier

* rm unneeded block

* break long line

* return empty list

* change logic a bit and address comments

* add typehint

* remove parenthesis

* fix

* revert to fp16 in tests

* add to gpu

* revert to old test

* style

* Update src/diffusers/loaders.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* change indent

* Apply suggestions from code review

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sourab Mangrulkar <13534540+pacman100@users.noreply.github.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-13 16:47:03 +02:00
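A hedged sketch of the multi-adapter workflow listed in the commit above (multiple ranks/alphas, multiple active adapters, enabling/disabling); the repository ids and adapter names below are placeholders, not values from the PR.

```
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# Placeholder LoRA repositories; any two compatible adapters would do.
pipe.load_lora_weights("some-user/toy-lora", adapter_name="toy")
pipe.load_lora_weights("some-user/pixel-lora", adapter_name="pixel")

# Activate both adapters with individual weights.
pipe.set_adapters(["toy", "pixel"], adapter_weights=[1.0, 0.5])
image = pipe("a toy robot, pixel art").images[0]

# Adapters can be disabled and re-enabled without reloading.
pipe.disable_lora()
pipe.enable_lora()
```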
Sayak Paul
9bc55e8b7f [Core] Add FreeU to all the core pipelines and their (mostly-used) derivatives (#5376)
* add: freeu to the core sdxl pipeline.

* add: freeu to video2video

* add: freeu to the core SD pipelines.

* add: freeu to image variation for sdxl.

* add: freeu to SD ControlNet pipelines.

* add: freeu to SDXL controlnet pipelines.

* add: freeu to t2i adapter pipelines.

* make fix-copies.
2023-10-13 17:16:16 +05:30
Dhruv Nair
4d2c981d55 New xformers test runner (#5349)
* move xformers to dedicated runner

* fix

* remove ptl from test runner images
2023-10-13 00:32:39 +05:30
Steven Liu
cf03f5b718 [docs] Minor fixes (#5369)
minor fixes
2023-10-11 17:18:29 -07:00
Patrick von Platen
5313aa6231 make style 2023-10-11 11:19:35 +00:00
Chi
ea8364e581 I added a docstring to the class. (#5293)
* I added a new docstring to the class. This makes it easier for other developers to understand what the class does and where it is used.

* Update src/diffusers/models/unet_2d_blocks.py

This change was suggested by the maintainer.

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/unet_2d_blocks.py

Add suggested text

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update unet_2d_blocks.py

I changed the 'Parameter' text to 'Args'.

* Update unet_2d_blocks.py

Set proper indentation in this file.

* Update unet_2d_blocks.py

Made a small change to the act_fun argument line.

* I ran the black command to reformat the code style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-11 13:19:06 +02:00
Soumik Rakshit
e00df25aee Fix StableDiffusionXLImg2ImgPipeline creation in sdxl tutorial (#5367)
fix: StableDiffusionXLImg2ImgPipeline creation in sdxl tutorial
2023-10-11 13:07:53 +02:00
Aryan V S
91fd181245 Improve typehints and docs in diffusers/models (#5312)
* improvement: add missing typehints and docs to diffusers/models/attention.py

* chore: convert doc strings to raw python strings

add missing typehints

* improvement: add missing typehints and docs to diffusers/models/adapter.py

* improvement: add missing typehints and docs to diffusers/models/lora.py

* docs: include suggestion by @sayakpaul in src/diffusers/models/adapter.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* docs: include suggestion by @sayakpaul in src/diffusers/models/adapter.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* docs: include suggestion by @sayakpaul in src/diffusers/models/adapter.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* docs: include suggestion by @sayakpaul in src/diffusers/models/adapter.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/models/lora.py

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-11 13:04:59 +02:00
Sayak Paul
0fa32bd674 [Examples] use loralinear instead of deprecated lora attn procs. (#5331)
* use loralinear instead of deprecated lora attn procs.

* fix parameters()

* fix saving

* add back support for add kv proj.

* fix: param accumulation.

* propagate the changes.
2023-10-11 13:02:42 +02:00
ssusie
aea73834f6 Adding PyTorch XLA support for sdxl inference (#5273)
* Added mark_step for sdxl to run with pytorch xla. Also updated README with instructions for xla

* adding soft dependency on torch_xla

* fix some styling

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-10-11 12:35:37 +02:00
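A rough sketch of the usage pattern described above, assuming torch_xla is installed; the model id and prompt are illustrative. Per the commit, the pipeline now calls xm.mark_step() itself during denoising, so no extra stepping code should be needed here.

```
import torch
import torch_xla.core.xla_model as xm
from diffusers import StableDiffusionXLPipeline

device = xm.xla_device()
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.bfloat16
).to(device)

image = pipe("a watercolor painting of a lighthouse").images[0]
```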
Steven Liu
a139213578 [docs] Create a mask for inpainting (#5322)
add mask making section
2023-10-10 08:29:27 -07:00
Humphrey009
9c82b68f07 fix problem with 'accelerator.is_main_process' when running on multiple GPUs (#5340)
fix problem with 'accelerator.is_main_process' when running on multiple GPUs or NPUs

Co-authored-by: jiaqiw <wangjiaqi50@huawei.com>
2023-10-10 15:39:22 +05:30
Julien Simon
d3e0750d5d Add missing dependency in requirements file (#5345)
Update requirements_sdxl.txt

Add missing 'datasets'

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-10 15:37:58 +05:30
Roy Hvaara
4ac205e32f [JAX] Replace uses of jnp.array in types with jnp.ndarray. (#4719)
`jnp.array` is a function, not a type:
https://jax.readthedocs.io/en/latest/_autosummary/jax.numpy.array.html
so it never makes sense to use `jnp.array` in a type annotation.

Presumably the intent was to write `jnp.ndarray` aka `jax.Array`. Change uses of `jnp.array` to `jnp.ndarray`.

Co-authored-by: Peter Hawkins <phawkins@google.com>
2023-10-09 22:04:40 +02:00
Patrick von Platen
ed2f956072 Fix loading broken LoRAs that could give NaN (#5316)
* Fix fuse Lora

* improve a bit

* make style

* Update src/diffusers/models/lora.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* ciao C file

* ciao C file

* test & make style

---------

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2023-10-09 18:01:55 +02:00
Jake Vanderplas
a844065384 replace references to deprecated KeyArray & PRNGKeyArray (#5324) 2023-10-09 17:31:50 +02:00
Jonathan Whitaker
35952e61c1 Fix links in docs to adapter code (#5323)
Update adapter.md to fix links to adapter pipelines
2023-10-09 17:20:12 +02:00
Patrick von Platen
d199bc62ec make style 2023-10-09 17:12:12 +02:00
Aryan V S
8d314c96ee [HacktoberFest] Add missing docstrings to diffusers/models (#5248)
* add missing docstrings

* chore: run make quality

* improvement: include docs suggestion by @yiyixuxu

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-09 17:09:16 +02:00
Brian Yarbrough
e2c0208c86 Add py.typed for PEP 561 compliance (#5326)
See #5325
2023-10-09 17:08:55 +02:00
Aryan V S
bd72927c63 Improve typehints and docs in diffusers/models (#5299)
* improvement: add typehints and docs to diffusers/models/activations.py

* improvement: add typehints and docs to diffusers/models/resnet.py
2023-10-09 16:29:23 +02:00
__mo_san__
c4d66200b7 make-fast-test-for-StableDiffusionControlNetPipeline-faster (#5292)
* decrease UNet2DConditionModel & ControlNetModel blocks

* decrease UNet2DConditionModel & ControlNetModel blocks

* decrease even more blocks & number of norm groups

* decrease vae block out channels and number of norm groups

* fix code style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-09 14:56:27 +05:30
Sebastian
2ed7e05fc2 Improve performance of fast test by reducing down blocks (#5290)
* Reduce number of down block channels

* Remove debug code

* Set new expected image slice values for sdxl euler test
2023-10-09 14:49:56 +05:30
Pu Cao
cc2c4ae759 fix inference in custom diffusion (#5329)
* Update train_custom_diffusion.py

* make style

* Empty-Commit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-09 10:08:01 +02:00
chuzh
6bd55b54bc Fix [core/GLIGEN]: TypeError when iterating over 0-d tensor with In-painting mode when EulerAncestralDiscreteScheduler is used (#5305)
* fix(gligen_inpaint_pipeline): 🐛 Wrap the timestep() 0-d tensor in a list to convert to 1-d tensor. This avoids the TypeError caused by trying to directly iterate over a 0-dimensional tensor in the denoising stage

* test(gligen/gligen_text_image): unit test using the EulerAncestralDiscreteScheduler

---------

Co-authored-by: zhen-hao.chu <zhen-hao.chu@vitrox.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-09 09:54:01 +02:00
Zeng Xian
0513a8cfd8 fix typo in train dreambooth lora description (#5332) 2023-10-08 14:54:33 +02:00
Shubham S Jagtap
306dc6e751 Update README.md (#5267)
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2023-10-06 14:50:18 -07:00
vedant2003
dd25ef5679 [Hacktoberfest]Fixing issues #5241 (#5255)
* Update pipeline_wuerstchen_prior.py

* prior_num_inference_steps updated

* height, width, num_inference_steps, and guidance_scale synced

* parameters synced

* latent_mean, latent_std, and resolution_multiple synced

* prior_num_inference_steps changed

* Formatted pipeline_wuerstchen_prior.py

* Update src/diffusers/pipelines/wuerstchen/pipeline_wuerstchen_prior.py

---------

Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
2023-10-06 14:19:38 -07:00
TimothyAlexisVass
016866792d Minor fixes (#5309)
tiny fixes
2023-10-06 11:20:06 -07:00
Steven Liu
f0a2c63753 [docs] Improved inpaint docs (#5210)
* start

* finish draft

* add section

* edits

* feedback

* make fix-copies

* rebase
2023-10-06 09:44:24 -07:00
Sayak Paul
7eaae83f16 [LoRA] fix: torch.compile() for lora conv (#5298)
fix: torch.compile() for lora conv
2023-10-06 17:14:47 +02:00
Dhruv Nair
872ae1dd12 Add from single file to StableDiffusionUpscalePipeline and StableDiffusionLatentUpscalePipeline (#5194)
* add from single file

* clean up

* make style

* add single file loading for upscaling
2023-10-06 13:18:13 +02:00
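A minimal sketch of the new single-file loading path added above; the local .safetensors path is a placeholder.

```
from diffusers import StableDiffusionUpscalePipeline

# Load an upscaler from a single checkpoint file instead of a diffusers folder layout.
pipe = StableDiffusionUpscalePipeline.from_single_file(
    "./checkpoints/x4-upscaler.safetensors"
)
```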
Dhruv Nair
6ce01bd647 Bump tolerance on shape test (#5289)
bump tolerance on shape test
2023-10-06 10:25:18 +02:00
Patrick von Platen
0922210c5c Update bug-report.yml 2023-10-06 09:42:20 +02:00
Bagheera
02a8d664a2 Min-SNR Gamma: correct the fix for SNR weighted loss in v-prediction … (#5238)
Min-SNR Gamma: correct the fix for SNR weighted loss in v-prediction by adding 1 to SNR rather than the resulting loss weights

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-10-05 20:52:27 +02:00
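A hedged illustration of the weighting described in the commit title above, shifting the SNR by 1 for v-prediction rather than shifting the finished loss weights; this is my reading of the change, and the exact expression used in the training scripts may differ.

```
import torch

def min_snr_gamma_weights(snr: torch.Tensor, gamma: float, prediction_type: str) -> torch.Tensor:
    # For v-prediction, apply the min-SNR-gamma rule to (SNR + 1) instead of SNR.
    if prediction_type == "v_prediction":
        snr = snr + 1
    return torch.minimum(snr, torch.full_like(snr, gamma)) / snr
```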
Sayak Paul
e6faf607f7 add: entry for DDPO support. (#5250)
* add: entry for DDPO support.

* move to training

* address steven's comments./
2023-10-05 14:29:00 +02:00
Dhruv Nair
d8d8b2ae77 pin torch version (#5297) 2023-10-05 13:00:30 +02:00
Kadir Nar
84b82a6cb7 [Core] Add FreeU mechanism (#5164)
*  Added Fourier filter function to upsample blocks

* 🔧 Update Fourier_filter for float16 support

*  Added UNetFreeUConfig to UNet model for FreeU adaptation 🛠️

* move unet to its original form and add fourier_filter to torch_utils.

* implement freeU enable mechanism

* implement disable mechanism

* resolution index.

* correct resolution idx condition.

* fix copies.

* no need to use resolution_idx in vae.

* spell out the kwargs

* proper config property

* fix attribution setting

* place unet hasattr properly.

* fix: attribute access.

* proper disable

* remove validation method.

* debug

* debug

* debug

* debug

* debug

* debug

* potential fix.

* add: doc.

* fix copies

* add: tests.

* add: support freeU in SDXL.

* set default value of resolution idx.

* set default values for resolution_idx.

* fix copies

* fix rest.

* fix copies

* address PR comments.

* run fix-copies

* move apply_free_u to utils and other minor changes.

* introduce support for video (unet3D)

* minor ups

* consistent fix-copies.

* consistent stuff

* fix-copies

* add: rest

* add: docs.

* fix: tests

* fix: doc path

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* style up

* move to techniques.

* add: slow test for sd freeu.

* add: slow test for sd freeu.

* add: slow test for sd freeu.

* add: slow test for sd freeu.

* add: slow test for sd freeu.

* add: slow test for sd freeu.

* add: slow test for video with freeu

* add: slow test for video with freeu

* add: slow test for video with freeu

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-05 10:37:04 +02:00
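A hedged sketch of the enable/disable mechanism the commit above adds; the checkpoint id and the s1/s2/b1/b2 values are commonly cited Stable Diffusion 1.5 settings from the FreeU authors, not values taken from this commit.

```
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

pipe.enable_freeu(s1=0.9, s2=0.2, b1=1.2, b2=1.4)  # reweight skip/backbone features
image = pipe("an astronaut tending a garden").images[0]
pipe.disable_freeu()  # restore the unmodified UNet behaviour
```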
1toTree
e46ec5f88f Zh doc (#4807)
* Update _toctree.yml

* Add files via upload

* Update docs/source/zh/stable_diffusion.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-10-04 08:54:13 -07:00
Patrick von Platen
25c177aace make style 2023-10-04 11:46:34 +02:00
Anatoly Belikov
c7e08958b8 handle case when controlnet is list or tuple (#5179)
* handle case when controlnet is list

* Update src/diffusers/loaders.py

* Apply suggestions from code review

* Update src/diffusers/loaders.py

* typecheck comment

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-04 11:46:10 +02:00
Dhruv Nair
dd5a36291f New Pipeline Slow Test runners (#5131)
* pipeline fetcher

* update script

* clean up

* clean up

* clean up

* new pipeline runner

* rename tests to match modules

* test actions in pr

* change runner to gpu

* clean up

* clean up

* clean up

* fix report

* fix reporting

* clean up

* show test stats in failure reports

* give names to jobs

* add lora tests

* split torch cuda tests and add compile tests

* clean up

* fix tests

* change push to run only on main

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-04 11:42:17 +02:00
Patrick von Platen
7271f8b717 Fix UniPC scheduler for 1D (#5276) 2023-10-03 11:49:27 +02:00
Patrick von Platen
dfcce3ca6e [Research folder] Add SDXL example (#5275)
* [SDXL Flax] Add research folder

* Add co-author

Co-authored-by: Juan Acevedo <jfacevedo@google.com>

---------

Co-authored-by: Juan Acevedo <jfacevedo@google.com>
2023-10-03 07:51:46 +02:00
Patrick von Platen
2457599114 make fix copies 2023-10-02 17:53:17 +00:00
Patrick von Platen
bdd16116f3 [Schedulers] Fix callback steps (#5261)
* fix all

* make fix copies

* make fix copies
2023-10-02 19:52:53 +02:00
Leng Yue
c8b0f0eb21 Update UniPC to support 1D diffusion. (#5199)
* Update Unipc einsum to support 1D and 3D diffusion.

* Add unittest

* Update unittest & edge case

* Fix unittest

* Fix testing_utils.py

* Fix unittest file

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-02 19:17:46 +02:00
stano
7a4324cce3 Add a docstring for the AutoencoderKL's encode (#5239)
* Add docstring for the AutoencoderKL's encode

#5229

* Support Python 3.8 syntax in AutoencoderKL.decode type hints

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Follow the style guidelines in AutoencoderKL's encode

#5230

---------

Co-authored-by: stano <>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-02 19:17:34 +02:00
stano
37a787a106 Add docstring for the AutoencoderKL's decode (#5242)
* Add docstring for the AutoencoderKL's decode

#5230

* Follow the style guidelines in AutoencoderKL's decode

#5230

---------

Co-authored-by: stano <>
2023-10-02 18:42:32 +02:00
Sayak Paul
d56825e4b4 fix: how training resume logs are printed. (#5117)
* fix: how training resume logs are printed.

* propagate changes to text-to-image scripts.

* propagate changes to instructpix2pix.

* propagate changes to dreambooth

* propagate changes to custom diffusion and instructpix2pix

* propagate changes to kandinsky

* propagate changes to textual inv.

* debug

* fix: checkpointing.

* debug

* debug

* debug

* back to the square

* debug

* debug

* change condition order.

* debug

* debug

* debug

* debug

* revert to original

* clean

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-02 18:29:52 +02:00
dg845
cd1b8d7ca8 [WIP] Refactor UniDiffuser Pipeline and Tests (#4948)
* Add VAE slicing and tiling methods.

* Switch to using VaeImageProcessing for preprocessing and postprocessing of images.

* Rename the VaeImageProcessor to vae_image_processor to avoid a name clash with the CLIPImageProcessor (image_processor).

* Remove the postprocess() function because we're using a VaeImageProcessor instead.

* Remove UniDiffuserPipeline.decode_image_latents because we're using VaeImageProcessor instead.

* Refactor generating text from text latents into a decode_text_latents method.

* Add enable_full_determinism() to UniDiffuser tests.

* make style

* Add PipelineLatentTesterMixin to UniDiffuserPipelineFastTests.

* Remove enable_model_cpu_offload since it is now part of DiffusionPipeline.

* Rename the VaeImageProcessor instance to self.image_processor for consistency with other pipelines and rename the CLIPImageProcessor instance to clip_image_processor to avoid a name clash.

* Update UniDiffuser conversion script.

* Make safe_serialization configurable in UniDiffuser conversion script.

* Rename image_processor to clip_image_processor in UniDiffuser tests.

* Add PipelineKarrasSchedulerTesterMixin to UniDiffuserPipelineFastTests.

* Add initial test for compiling the UniDiffuser model (not tested yet).

* Update encode_prompt and _encode_prompt to match that of StableDiffusionPipeline.

* Turn off standard classifier-free guidance for now.

* make style

* make fix-copies

* apply suggestions from review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-10-02 18:24:55 +02:00
Patrick von Platen
db91e710da make style 2023-10-02 16:14:53 +00:00
Nandika-A
2a62aadcff Add docstrings in forward methods of adapter model (#5253)
* added docstrings in forward methods of T2IAdapter model and FullAdapter model

* added docstrings in forward methods of FullAdapterXL and AdapterBlock models

* Added docstrings in forward methods of adapter models
2023-10-02 18:14:41 +02:00
Patrick von Platen
4f74a5e1f7 [PEFT warnings] Only sure deprecation warnings in the future (#5240)
* [PEFT warnings] Only sure deprecation warnings in the future

* make style
2023-10-02 17:33:32 +02:00
Dhruv Nair
bbe8d3ae13 Compile test fixes (#5235)
compile test fixes
2023-10-02 17:06:10 +02:00
Zanz2
907fd91ce9 Fixed constants.py not using hugging face hub environment variable (#5222) 2023-10-02 15:51:24 +02:00
Pedro Cuenca
0c7cb9a613 Flax: Ignore PyTorch, ONNX files when they coexist with Flax weights (#5237)
Ignore PyTorch, ONNX files when they coexist with Flax weights
2023-10-02 12:17:23 +02:00
Mishig
84e5cc596c Fix doc KO unconditional_image_generation.md (#5236)
Fix indent issue
2023-09-29 18:49:40 +02:00
Younes Belkada
cc92332096 [PEFT / LoRA ] Fix text encoder scaling (#5204)
* move text encoder changes

* fix

* add comment.

* fix tests

* Update src/diffusers/utils/peft_utils.py

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-29 18:30:53 +02:00
Charles Bensimon
9cfd4ef074 Make BaseOutput dataclasses picklable (#5234)
* Make BaseOutput dataclasses picklable

* make style

* Test

* Empty commit

* Simpler and safer
2023-09-29 16:35:16 +02:00
Patrick von Platen
78a78515d6 make style 2023-09-29 08:55:26 +02:00
Seunghyeon Kim
9c03a7da43 Fix DDIMInverseScheduler (#5145)
* fix ddim inverse scheduler

* update test of ddim inverse scheduler

* update test of pix2pix_zero

* update test of diffedit

* fix typo

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-29 08:55:00 +02:00
Steven Liu
1d3120fbaa [docs] Quicktour fixes (#5211)
* fix

* feedback
2023-09-29 08:43:58 +02:00
Dhruv Nair
c78ee143e9 Move more slow tests to nightly (#5220)
* move to nightly

* fix mistake
2023-09-28 19:00:41 +05:30
Ömer Veysel Çağatan
622f35b1d0 fixed vae scaling (#5213) 2023-09-28 15:13:27 +02:00
Younes Belkada
39baf0b41b [PEFT / LoRA ] Kohya checkpoints support (#5202)
kohya support
2023-09-28 15:07:23 +02:00
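A hedged sketch of what Kohya checkpoint support enables on the user side; the file path is a placeholder and the surrounding pipeline setup is illustrative.

```
import torch
from diffusers import StableDiffusionXLPipeline

pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# With this change, load_lora_weights should also accept Kohya-style .safetensors files.
pipe.load_lora_weights("./loras/my_kohya_lora.safetensors")
```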
Nicholas Bardy
1c4c4c48d9 Correct file name in t2i adapter training readme (#5207)
Update README_sdxl.md
2023-09-28 14:51:02 +02:00
Younes Belkada
d840253f6a [PEFT] Fix typo for import (#5217)
Update loaders.py
2023-09-28 17:33:25 +05:30
Pedro Cuenca
536c297a14 Trickle down split_head_dim (#5208) 2023-09-28 01:03:36 +02:00
Benjamin Paine
693a0d08e4 Remove Offensive Language from Community Pipelines (#5206)
* Update run_onnx_controlnet.py

* Update run_tensorrt_controlnet.py
2023-09-27 22:02:25 +02:00
Patrick von Platen
cac7adab11 [Flax SDXL] fix zero out sdxl (#5203) 2023-09-27 18:29:07 +02:00
Patrick von Platen
a584d42ce5 [LoRA, Xformers] Fix xformers lora (#5201)
* fix xformers lora

* improve

* fix
2023-09-27 21:46:32 +05:30
Sayak Paul
cdcc01be0e [Examples] add compute_snr() to training utils. (#5188)
add compute_snr() to training utils.
2023-09-27 21:42:20 +05:30
Dhruv Nair
ba59e92fb0 Fix memory issues in tests (#5183)
* fix memory issues

* set _offload_gpu_id

* set gpu offload id
2023-09-27 14:04:57 +02:00
Sourab Mangrulkar
02247d9ce1 PEFT Integration for Text Encoder to handle multiple alphas/ranks, disable/enable adapters and support for multiple adapters (#5147)
* more fixes

* up

* up

* style

* add in setup

* oops

* more changes

* v1 rzfactor CI

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* few todos

* protect torch import

* style

* fix fuse text encoder

* Update src/diffusers/loaders.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* replace with `recurse_replace_peft_layers`

* keep old modules for BC

* adjustments on `adjust_lora_scale_text_encoder`

* nit

* move tests

* add conversion utils

* remove unneeded methods

* use class method instead

* oops

* use `base_version`

* fix examples

* fix CI

* fix weird error with python 3.8

* fix

* better fix

* style

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add comment

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* conv2d support for recurse remove

* added docstrings

* more docstring

* add deprecate

* revert

* try to fix merge conflicts

* peft integration features for text encoder

1. support multiple rank/alpha values
2. support multiple active adapters
3. support disabling and enabling adapters

* fix bug

* fix code quality

* Apply suggestions from code review

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* fix bugs

* Apply suggestions from code review

Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>

* address comments

Co-Authored-By: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
Co-Authored-By: Patrick von Platen <patrick.v.platen@gmail.com>

* fix code quality

* address comments

* address comments

* Apply suggestions from code review

* find and replace

---------

Co-authored-by: younesbelkada <younesbelkada@gmail.com>
Co-authored-by: Younes Belkada <49240599+younesbelkada@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2023-09-27 13:21:54 +02:00
YiYi Xu
940f9410cb Add test_full_loop_with_noise tests to all scheduler with add_nosie function (#5184)
* add fast tests for dpm-multi

* add more tests

* style

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-09-27 13:08:37 +02:00
Patrick von Platen
ad06e5106e [Docs] Improve xformers page (#5196)
[Docs] Improve
2023-09-27 16:02:15 +05:30
Pedro Cuenca
ae2fc01a91 Wrap lines in docstring (#5190) 2023-09-26 20:10:40 +02:00
Juan Acevedo
16d56c4b4f F/flax split head dim (#5181)
* split_head_dim flax attn

* Make split_head_dim non default

* make style and make quality

* add description for split_head_dim flag

* Update src/diffusers/models/attention_flax.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Juan Acevedo <jfacevedo@google.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-26 20:08:30 +02:00
Patrick von Platen
c82f7bafba [SDXL Flax] fix SDXL flax init (#5187)
* fix SDXL flax init

* finish

* Fix
2023-09-26 19:55:05 +02:00
Pedro Cuenca
d9e7857af3 timestep_spacing for FlaxDPMSolverMultistepScheduler (#5189)
* timestep_spacing for FlaxDPMSolverMultistepScheduler

* Style
2023-09-26 19:54:53 +02:00
Steven Liu
fd1c54abf2 [docs] Improved text-to-image guide (#4938)
* first draft

* edits

* feedback
2023-09-26 09:20:19 -07:00
Dhruv Nair
9946dcf8db Test Fixes for CUDA Tests and Fast Tests (#5172)
* fix other tests

* fix tests

* fix tests

* Update tests/pipelines/shap_e/test_shap_e_img2img.py

* Update tests/pipelines/shap_e/test_shap_e_img2img.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix upstream merge mistake

* fix tests:

* test fix

* Update tests/lora/test_lora_layers_old_backend.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/lora/test_lora_layers_old_backend.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-26 19:08:02 +05:30
Ernie Chu
21e402faa0 fix-VaeImageProcessor-docstring (#5182)
```
do_binarize (`bool`, *optional*, defaults to `True`)
|
v
do_binarize (`bool`, *optional*, defaults to `False`)
```
2023-09-26 15:06:45 +02:00
Bagheera
4a06c74547 Min-SNR Gamma: follow-up fix for zero-terminal SNR models on v-prediction or epsilon (#5177)
* merge with main

* fix flax example

* fix onnx example

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-09-26 18:14:52 +05:30
Bagheera
89d8f84893 Timestep bias for fine-tuning SDXL (#5094)
* Timestep bias for fine-tuning SDXL

* Adjust parameter choices to include "range" and reword the help statements

* Condition our use of weighted timesteps on the value of timestep_bias_strategy

* style

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-09-26 13:45:37 +05:30
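
The timestep-bias commit above lets SDXL fine-tuning over-sample a chosen part of the noise schedule via `timestep_bias_strategy`. A hedged sketch of the underlying idea, with illustrative parameter names rather than the script's actual flags: build a weight vector over the training timesteps, up-weight the chosen range, and draw from the resulting multinomial distribution.

```python
import torch

def sample_biased_timesteps(batch_size: int, num_train_timesteps: int = 1000,
                            bias_begin: int = 0, bias_end: int = 300,
                            bias_multiplier: float = 2.0) -> torch.Tensor:
    weights = torch.ones(num_train_timesteps)
    weights[bias_begin:bias_end] *= bias_multiplier   # emphasize the "range" strategy's window
    weights = weights / weights.sum()
    return torch.multinomial(weights, batch_size, replacement=True)
```
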
Dhruv Nair
bdd2544673 Tests compile fixes (#5148)
* test fix

* fix tests

* fix report name

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-26 11:36:46 +05:30
Patrick von Platen
a91a273d0b [Docs] Try to fix doc builder (#5180)
* try to fix docs

* try to fix docs
2023-09-25 20:24:50 +02:00
Patrick von Platen
bed8aceca1 make style 2023-09-25 20:24:03 +02:00
Ryan Dick
415093335b Fix the total_downscale_factor returned by FullAdapterXL T2IAdapters (#5134)
* Fix FullAdapterXL.total_downscale_factor.

* Fix incorrect error message in T2IAdapter.__init__(...).

* Move T2I-Adapter test_total_downscale_factor(...) to pipeline test file (requested in code review).

* Add more info to error message about an unsupported T2I-Adapter adapter_type.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-25 20:23:14 +02:00
Hengwen Tong
dfdf85d32c [pipeline utils] sanitize pretrained_model_name_or_path (#5173)
Make sure the repo_id is valid before sending it to huggingface_hub to get a more understandable error message.

Re #5110

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-25 20:22:41 +02:00
Bagheera
539846a7d5 SDXL microconditioning documentation should indicate the correct default order of parameters, so that developers know (#5155)
* SDXL microconditioning documentation should indicate the correct default order of parameters, so that developers know

* SDXL microconditioning documentation should indicate the correct default order of parameters, so that developers know

* empty

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-25 20:22:09 +02:00
Patrick von Platen
d70944bf7f fix docs 2023-09-25 19:55:49 +02:00
Patrick von Platen
589cd8100b make style 2023-09-25 19:27:20 +02:00
Carson Katri
6281d2066b Add callbacks to WuerstchenDecoderPipeline and WuerstchenCombinedPipeline (#5154) 2023-09-25 19:26:53 +02:00
Anh71me
28254c79b6 Fix type annotation (#5146)
* Fix type annotation on Scheduler.from_pretrained

* Fix type annotation on PIL.Image
2023-09-25 19:26:39 +02:00
MLRichter
0bc6be6960 Update wuerstchen.md (#5156) 2023-09-25 18:43:08 +02:00
Patrick von Platen
144c3a8b7c [Imports] Fix many import bugs and make sure that doc builder CI test works correctly (#5176)
* [Doc builder] Ensure slow import for doc builder

* Apply suggestions from code review

* env for doc builder

* fix more

* [Diffusers] Set import to slow as env variable

* fix docs

* fix docs

* Apply suggestions from code review

* Apply suggestions from code review

* fix docs

* fix docs
2023-09-25 18:06:51 +02:00
Patrick von Platen
30a512ea69 [Core] Improve .to(...) method, fix offloads multi-gpu, add docstring, add dtype (#5132)
* fix cpu offload

* fix

* fix

* Update src/diffusers/pipelines/pipeline_utils.py

* make style

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* fix more

* fix more

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-09-25 14:10:18 +02:00
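
A hedged usage sketch of the extended `.to(...)` behaviour described above, moving a pipeline to a device and casting its dtype in one call; the model id is only an example:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe = pipe.to("cuda", torch.float16)  # device and dtype in a single .to() call
```
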
Dhruv Nair
92f15f5bd4 Model CPU offload fix for BLIPDiffusion (#5174)
cpu offload fix for blip diffusion
2023-09-25 17:07:32 +05:30
Patrick von Platen
22b19d578e [Tests] Add is flaky decorator (#5139)
* add is flaky decorator

* fix more
2023-09-25 13:24:44 +02:00
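
A minimal sketch of what an "is flaky" test decorator generally does, re-running a test a few times and failing only if every attempt fails; the decorator added in #5139 may differ in naming and behaviour:

```python
import functools
import time

def is_flaky(max_attempts=5, wait_before_retry=None):
    """Retry a flaky test up to max_attempts times before reporting a failure."""
    def decorator(test_func):
        @functools.wraps(test_func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return test_func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # exhausted all retries, surface the failure
                    if wait_before_retry is not None:
                        time.sleep(wait_before_retry)
        return wrapper
    return decorator
```
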
Sayak Paul
787195fe20 Fix/controlnet lora (#5157)
* print

* print

* print

* print

* print

* debugging

* debugging

* debugging

* debugging

* safer condition.

* remove prints and try excepts.

* Empty-Commit

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-25 12:08:05 +02:00
Mishig
48664d62b8 Delete duplicatd doc file (#5169) 2023-09-24 19:58:13 +02:00
YiYi Xu
5b11c5dc77 fix the add_noise function for dpm-multi et al (#5158)
* remove to _device() for sigmas

* update add_noise to use sigmas

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-09-23 09:07:50 -10:00
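
The fix above re-expresses `add_noise` in terms of the scheduler's sigmas. For reference, a minimal sketch of the equivalent DDPM-style formulation, x_t = sqrt(alpha_bar_t) * x_0 + sqrt(1 - alpha_bar_t) * eps, not the sigma-based code from the PR:

```python
import torch

def add_noise(original_samples: torch.Tensor, noise: torch.Tensor,
              timesteps: torch.Tensor, alphas_cumprod: torch.Tensor) -> torch.Tensor:
    alpha_bar = alphas_cumprod[timesteps].flatten()
    while alpha_bar.dim() < original_samples.dim():
        alpha_bar = alpha_bar.unsqueeze(-1)  # broadcast over channel/spatial dims
    return alpha_bar.sqrt() * original_samples + (1.0 - alpha_bar).sqrt() * noise
```
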
Sayak Paul
310cf32801 add: note on whom to tag for issues related to community pipelines. (#5083) 2023-09-23 17:01:37 +01:00
Steven Liu
06b316ef5c [docs] Improved image-to-image guide (#5020)
* finish first draft

* feedback

* feedback
2023-09-22 13:20:30 -07:00
Pedro Cuenca
3651b14cf4 SDXL flax (#4254)
* support transformer_layers_per block in flax UNet

* add support for text_time additional embeddings to Flax UNet

* rename attention layers for VAE

* add shape asserts when renaming attention layers

* transpose VAE attention layers

* add pipeline flax SDXL code [WIP]

* continue add pipeline flax SDXL code [WIP]

* cleanup

* Working on JIT support

Fixed prompt embedding shapes so they work in parallel mode. Assuming we
always have both text encoders for now, for simplicity.

* Fixing embeddings (untested)

* Remove spurious line

* Shard guidance_scale when jitting.

* Decode images

* Fix sharding

* style

* Refiner UNet can be loaded.

* Refiner / img2img pipeline

* Allow latent outputs from base and latent inputs in refiner

This makes it possible to chain base + refiner without having to use the
vae decoder in the base model, the vae encoder in the refiner, skipping
conversions to/from PIL, and avoiding TPU <-> CPU memory copies.

* Adapt to FlaxCLIPTextModelOutput

* Update Flax XL pipeline to FlaxCLIPTextModelOutput

* make fix-copies

* make style

* add euler scheduler

* Fix import

* Fix copies, comment unused code.

* Fix SDXL Flax imports

* Fix euler discrete begin

* improve init import

* finish

* put discrete euler in init

* fix flax euler

* Fix more

* make style

* correct init

* correct init

* Temporarily remove FlaxStableDiffusionXLImg2ImgPipeline

* correct pipelines

* finish

---------

Co-authored-by: Martin Müller <martin.muller.me@gmail.com>
Co-authored-by: patil-suraj <surajp815@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-22 18:34:04 +02:00
Pedro Cuenca
2e860e89eb SDXL: update links to refine docs (#5101)
* SDXL: update links to refine docs

* make style
2023-09-22 13:17:17 +02:00
Younes Belkada
493f9529d7 [PEFT / LoRA] PEFT integration - text encoder (#5058)
* more fixes

* up

* up

* style

* add in setup

* oops

* more changes

* v1 rzfactor CI

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* few todos

* protect torch import

* style

* fix fuse text encoder

* Update src/diffusers/loaders.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* replace with `recurse_replace_peft_layers`

* keep old modules for BC

* adjustments on `adjust_lora_scale_text_encoder`

* nit

* move tests

* add conversion utils

* remove unneeded methods

* use class method instead

* oops

* use `base_version`

* fix examples

* fix CI

* fix weird error with python 3.8

* fix

* better fix

* style

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add comment

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* conv2d support for recurse remove

* added docstrings

* more docstring

* add deprecate

* revert

* try to fix merge conflicts

* v1 tests

* add new decorator

* add saving utilities test

* adapt tests a bit

* add save / from_pretrained tests

* add saving tests

* add scale tests

* fix deps tests

* fix lora CI

* fix tests

* add comment

* fix

* style

* add slow tests

* slow tests pass

* style

* Update src/diffusers/utils/import_utils.py

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* Apply suggestions from code review

Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>

* circumvents pattern finding issue

* left a todo

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update hub path

* add lora workflow

* fix

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Benjamin Bossan <BenjaminBossan@users.noreply.github.com>
2023-09-22 13:03:39 +02:00
hysts
b32555a2da [docs] Add missing parenthesis in the sample code of BLIP Diffusion (#5144)
Add missing parenthesis in the sample code of BLIP Diffusion
2023-09-22 09:38:17 +01:00
YiYi Xu
80c00e5451 add use_karras_sigmas to KDPM2DiscreteScheduler and KDPM2AncestralDiscreteScheduler (#5111)
---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-09-21 13:50:41 -10:00
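
A hedged sketch of the Karras et al. (2022) sigma schedule that `use_karras_sigmas`-style options typically enable, interpolating between sigma_min and sigma_max in rho-space; parameter names are illustrative:

```python
import numpy as np

def karras_sigmas(num_steps: int, sigma_min: float, sigma_max: float, rho: float = 7.0) -> np.ndarray:
    ramp = np.linspace(0, 1, num_steps)
    min_inv_rho = sigma_min ** (1 / rho)
    max_inv_rho = sigma_max ** (1 / rho)
    return (max_inv_rho + ramp * (min_inv_rho - max_inv_rho)) ** rho
```
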
YiYi Xu
2badddfdb6 add multi adapter support to StableDiffusionXLAdapterPipeline (#5127)
fix and add tests

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-09-21 12:54:59 -10:00
Bagheera
d558811b26 Min-SNR gamma support for Dreambooth training (#5107)
* min-SNR gamma for Dreambooth training

* Align the mse_loss_weights style with SDXL training example

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-09-21 22:53:06 +01:00
Ayush Mangal
157c9011d8 Add BLIP Diffusion (#4388)
* Add BLIP Diffusion skeleton

* Add other model components

* Add BLIP2, need to change it for now

* Fix pipeline imports

* Load pretrained ViT

* Make qformer fwd pass same

* Replicate fwd passes

* Fix device bug

* Add accelerate functions

* Remove extra functions from Blip2

* Minor bug

* Integrate initial review changes

* Refactoring

* Refactoring

* Refactor

* Add controlnet

* Refactor

* Update conversion script

* Add image processor

* Shift postprocessing to ImageProcessor

* Refactor

* Fix device

* Add fast tests

* Update conversion script

* Fix checkpoint conversion script

* Integrate review changes

* Integrate review changes

* Remove unused functions from test

* Reuse HF image processor in Cond image

* Create new BlipImageProcessor based on transformers

* Fix image preprocessor

* Minor

* Minor

* Add canny preprocessing

* Fix controlnet preprocessing

* Fix blip diffusion test

* Add controlnet test

* Add initial doc strings

* Integrate review changes

* Refactor

* Update examples

* Remove DDIM comments

* Add copied from for prepare_latents

* Add type annotations

* Add docstrings

* Do black formatting

* Add batch support

* Make tests pass

* Make controlnet tests pass

* Black formatting

* Fix progress bar

* Fix some licensing comments

* Fix imports

* Refactor controlnet

* Make tests faster

* Edit examples

* Black formatting/Ruff

* Add doc

* Minor

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Move controlnet pipeline

* Make tests faster

* Fix imports

* Fix formatting

* Fix make errors

* Fix make errors

* Minor

* Add suggested doc changes

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Edit docs

* Fix 16 bit loading

* Update examples

* Edit toctree

* Update docs/source/en/api/pipelines/blip_diffusion.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Minor

* Add tips

* Edit examples

* Update model paths

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-09-21 17:05:35 +01:00
Bagheera
24563ca654 SNR gamma fixes for v_prediction training (#5106)
Co-authored-by: bghira <bghira@users.github.com>
2023-09-20 21:18:56 +01:00
Younes Belkada
914586f5b6 [core] Use python 3.8 in workflow and setup file (#5122)
* use python 3.7 instead

* Update setup.py
2023-09-20 20:57:06 +02:00
김태민
5b78141fd3 [FIX BUG] add config_files parser #5114 (#5115)
* add config_files parser #5114

* add config_files parser_fix #5114
2023-09-20 16:17:47 +02:00
Sayak Paul
e312b2302b [LoRA] support LyCORIS (#5102)
* better condition.

* debugging

* how about now?

* how about now?

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* support for lycoris.

* style

* add: lycoris test

* fix from_pretrained call.

* fix assertion values.
2023-09-20 10:30:18 +01:00
YiYi Xu
8263cf00f8 refactor DPMSolverMultistepScheduler using sigmas (#4986)
---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-19 11:21:49 -10:00
Bagheera
74e43a4fbd Resolve v_prediction issue for min-SNR gamma weighted loss function (#5096)
* Resolve v_prediction issue for min-SNR gamma weighted loss function

* Combine MSE loss calculation of epsilon and velocity, with a note about the application of the epsilon code to sample prediction

* style

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-09-19 17:31:27 +01:00
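
The commit above reconciles min-SNR-gamma weighting with v-prediction. A hedged sketch of how the per-timestep loss weights typically differ between epsilon- and v-prediction, given a tensor of per-timestep SNR values; the training scripts' exact handling may differ:

```python
import torch

def min_snr_weights(snr: torch.Tensor, snr_gamma: float, prediction_type: str) -> torch.Tensor:
    clipped = torch.clamp(snr, max=snr_gamma)   # min(SNR_t, gamma)
    if prediction_type == "v_prediction":
        return clipped / (snr + 1.0)            # velocity target shifts the denominator by 1
    return clipped / snr                        # "epsilon" (noise) prediction
```
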
Bagheera
81331f3b7d Add x-prediction / prediction_type=sample support for SDXL fine-tuning (#5095)
Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-09-19 16:57:44 +01:00
Dhruv Nair
29970757de Fast Tests on PR improvements: Batch Tests fixes (#5080)
* fix test

* initial commit

* change test

* updates:

* fix tests

* test fix

* test fix

* fix tests

* make test faster

* clean up

* fix precision in test

* fix precision

* Fix tests

* Fix logging test

* fix test

* fix test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-19 18:31:21 +05:30
Dhruv Nair
c2787c11c2 Fixes for Float16 inference Fast CUDA Tests (#5097)
* wip

* fix tests
2023-09-19 17:25:48 +05:30
Dhruv Nair
79a3f39eb5 Move to slow tests to nightly (#5093)
* move slow tests to nightly

* move slow tests to nightly
2023-09-19 16:04:26 +05:30
Dhruv Nair
431dd2f4d6 Fix precision related issues in Kandinsky Pipelines (#5098)
* fix failing tests

* make style
2023-09-19 16:02:21 +05:30
Sayak Paul
edcbb6f42e [WIP] core: add support for clip skip to SDXL (#5057)
* core: add support for clip skip to SDXL

* add clip_skip support to the rest of the pipeline.

* Empty-Commit
2023-09-19 10:51:36 +01:00
Patrick von Platen
5a287d3f23 [SDXL] Make sure multi batch prompt embeds works (#5073)
* [SDXL] Make sure multi batch prompt embeds works

* [SDXL] Make sure multi batch prompt embeds works

* improve more

* improve more

* Apply suggestions from code review
2023-09-19 11:49:49 +02:00
maksymbekuzarovSC
65c162a5b3 Fixed get_word_inds mistake/typo in P2P community pipeline (breaking code examples) (#5089)
Fixed `get_word_inds` mistake/typo in P2P community pipeline

The function `get_word_inds` takes a string of text and either a word (str) or a word index (int), and returns the indices of the token(s) that word is encoded to.

However, there was a typo: the second `if` branch checked whether the word was a `str` **again** instead of an `int`, which caused the [example code from the docs](https://github.com/huggingface/diffusers/tree/main/examples/community#prompt2prompt-pipeline) to fail with an error
2023-09-19 11:34:49 +02:00
Sayak Paul
04d696d650 [Core] Add support for CLIP-skip (#4901)
* add support for clip skip

* fix condition

* fix

* add clip_output_layer_to_default

* expose

* remove the previous functions.

* correct condition.

* apply final layer norm

* address feedback

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* refactor clip_skip.

* port to the other pipelines.

* fix copies one more time

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-19 10:30:53 +01:00
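
A hedged sketch of what a `clip_skip` option usually does when encoding a prompt: take an earlier hidden state of the CLIP text encoder instead of the last one, then re-apply the final layer norm. It assumes a transformers `CLIPTextModel`, and the indexing convention used by the pipelines may differ by one:

```python
import torch

def encode_prompt_with_clip_skip(text_encoder, input_ids: torch.Tensor, clip_skip=None) -> torch.Tensor:
    if clip_skip is None:
        return text_encoder(input_ids)[0]                    # normalized last hidden state
    outputs = text_encoder(input_ids, output_hidden_states=True)
    hidden = outputs.hidden_states[-(clip_skip + 1)]         # skip the last clip_skip layers
    return text_encoder.text_model.final_layer_norm(hidden)  # keep representations well-scaled
```
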
Sayak Paul
ed507680e3 [LoRA] don't break offloading for incompatible lora ckpts. (#5085)
* don't break offloading for incompatible lora ckpts.

* debugging

* better condition.

* fix

* fix

* fix

* fix

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-18 23:46:28 +02:00
Will Berman
7974fad13b remove unused adapter weights in constructor (#5088)
remove adapter weights in MultiAdapter constructor
2023-09-18 22:36:55 +02:00
Will Berman
6d7279adad t2i Adapter community member fix (#5090)
* convert tensorrt controlnet

* Fix code quality

* Fix code quality

* Fix code quality

* Fix code quality

* Fix code quality

* Fix code quality

* Fix number controlnet condition

* Add convert SD XL to onnx

* Add convert SD XL to tensorrt

* Add convert SD XL to tensorrt

* Add examples in comments

* Add examples in comments

* Add test onnx controlnet

* Add tensorrt test

* Remove copied

* Move file test to examples/community

* Remove script

* Remove script

* Remove text

* Fix import

* Fix T2I MultiAdapter

* fix tests

---------

Co-authored-by: dotieuthien <thien.do@mservice.com.vn>
Co-authored-by: dotieuthien <dotieuthien9997@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: dotieuthien <hades@cinnamon.is>
2023-09-18 22:35:49 +02:00
Patrick von Platen
119ad2c3dc [LoRA] Centralize LoRA tests (#5086)
* [LoRA] Centralize LoRA tests

* [LoRA] Centralize LoRA tests

* [LoRA] Centralize LoRA tests

* [LoRA] Centralize LoRA tests

* [LoRA] Centralize LoRA tests
2023-09-18 17:54:33 +02:00
Ruoxi
16b9a57d29 Implement CustomDiffusionAttnProcessor2_0. (#4604)
* Implement `CustomDiffusionAttnProcessor2_0`

* Doc-strings and type annotations for `CustomDiffusionAttnProcessor2_0`. (#1)

* Update attnprocessor.md

* Update attention_processor.py

* Interops for `CustomDiffusionAttnProcessor2_0`.

* Formatted `attention_processor.py`.

* Formatted doc-string in `attention_processor.py`

* Conditional CustomDiffusion2_0 for training example.

* Remove unnecessary reference impl in comments.

* Fix `save_attn_procs`.
2023-09-18 14:49:00 +02:00
Patrick von Platen
7b39f43c06 [Textual inversion] Refactor textual inversion to make it cleaner (#5076)
* [Textual inversion] Clean loading

* [Textual inversion] Clean loading

* [Textual inversion] Clean up

* [Textual inversion] Clean up

* [Textual inversion] Clean up

* [Textual inversion] Clean up
2023-09-18 14:30:34 +02:00
Sayak Paul
bfc606301f add doc around fusing multiple loras. (#5056)
* add doc around fusing multiple loras.

* Apply suggestions from code review

Co-authored-by: apolinário <joaopaulo.passos@gmail.com>

* address poli's comments.

---------

Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
2023-09-18 12:42:58 +01:00
YiYi Xu
6886e28fd8 fix a bug in inpaint pipeline when use regular text2image unet (#5033)
* fix

* fix num_images_per_prompt >1

* other pipelines

* add fast tests for inpaint pipelines

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-09-18 13:40:11 +02:00
Lee Dong Joo
b089102a8e fix guidance_rescale docstring (#5063) 2023-09-18 13:39:12 +02:00
Kashif Rasul
73bb97adfc [LoRA] fix typo in attention_processor.py (#5066)
* [LoRA] fix typo in attention_processor.py

fixes #5062

* make style

* make fix-copies, logger commented out for torch compile
2023-09-16 14:43:18 +02:00
Sayak Paul
38a664a3d6 fix: validation_image arg (#5053) 2023-09-15 12:20:50 +01:00
Kashif Rasul
427feb5359 [Wuerstchen] fix typos in docs (#5051)
* fix typos in docs

* fix for issue  #5023
2023-09-15 12:53:25 +02:00
Gang Wu
9f40d7970e [FIX BUG] type of args in train_instruct_pix2pix_sdxl.py (#4955) 2023-09-15 12:53:07 +02:00
Bagheera
a0198676d7 Remove logger.info statement from Unet2DCondition code to ensure torch compile reliably succeeds (#4982)
* Remove logger.info statement from Unet2DCondition code to ensure torch compile reliably succeeds

* Convert logging statement to a comment for future archaeologists

* Update src/diffusers/models/unet_2d_condition.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-15 12:52:46 +02:00
Patrick von Platen
abc47dece6 [SDXL, Docs] Textual inversion (#5039)
* [SDXL, Docs] Textual inversion

* Update docs/source/en/using-diffusers/sdxl.md

* finish

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-09-15 12:51:36 +02:00
dotieuthien
941473a12f Fix import in examples (#5048)
* convert tensorrt controlnet

* Fix code quality

* Fix code quality

* Fix code quality

* Fix code quality

* Fix code quality

* Fix code quality

* Fix number controlnet condition

* Add convert SD XL to onnx

* Add convert SD XL to tensorrt

* Add convert SD XL to tensorrt

* Add examples in comments

* Add examples in comments

* Add test onnx controlnet

* Add tensorrt test

* Remove copied

* Move file test to examples/community

* Remove script

* Remove script

* Remove text

* Fix import

---------

Co-authored-by: dotieuthien <thien.do@mservice.com.vn>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-15 12:48:06 +02:00
dg845
4c8a05f115 Fix Consistency Models UNet2DMidBlock2D Attention GroupNorm Bug (#4863)
* Add attn_groups argument to UNet2DMidBlock2D to control the internal Attention block's GroupNorm.

* Add docstring for attn_norm_num_groups in UNet2DModel.

* Since the test UNet config uses resnet_time_scale_shift == 'scale_shift', also set attn_norm_num_groups to 32.

* Add test for attn_norm_num_groups to UNet2DModelTests.

* Fix expected slices for slow tests.

* Also fix tolerances for slow tests.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-09-15 11:27:51 +01:00
Dhruv Nair
5fd42e5d61 Add SDXL refiner only tests (#5041)
* add refiner only tests

* make style
2023-09-15 12:58:03 +05:30
YiYi Xu
e70cb1243f [WIP] adding Kandinsky training scripts (#4890)
* Add files via upload

Co-authored-by: Shahmatov Arseniy <62886550+cene555@users.noreply.github.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-09-14 06:58:20 -10:00
YiYi Xu
fe4837a96e add step_index and clear noise_sampler at begining of each loop (#5024)
Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-09-14 06:48:35 -10:00
Patrick von Platen
342c5c02c0 [Release 0.21] Bump version (#5018)
* [Release 0.21] Bump version

* fix & remove

* fix more

* fix all, upload
2023-09-14 18:28:57 +02:00
UmerHA
169fc4add5 Add Prompt2Prompt pipeline (#4563)
* Initial commit P2P

* Replaced CrossAttention, added test skeleton

* bug fixes

* Updated docstring

* Removed unused function

* Created tests

* improved tests

- made fast inference tests faster
- corrected image shape assertions

* Corrected expected output shape in tests

* small fix: test inputs

* Update tests

- used conditional unet2d
- set expected image slices
- edit_kwargs are now not popped, so pipe can be run multiple times

* Fixed bug in int tests

* Fixed tests

* Linting

* Create prompt2prompt.md

* Added to docs toc

* Ran make fix-copies

* Fixed code blocks in docs

* Using same interface as StableDiffusionPipeline

* Fixed small test bug

* Added all options SDPipeline.__call__ has

* Fixed docstring; made __call__ like in SD

* Linting

* Added test for multiple prompts

* Improved docs

* Incorporated feedback

* Reverted formatting on unrelated files

* Moved prompt2prompt to community

- Moved prompt2prompt pipeline from main to community
- Deleted tests
- Moved documentation to community and shorted it

* Update src/diffusers/utils/dummy_torch_and_transformers_objects.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-14 16:39:59 +02:00
bonlime
566bdf4c44 Allow disabling from_pretrained tqdm (#5007) 2023-09-14 16:36:19 +02:00
Younes Belkada
0eb715d715 [LoRA] Make optional arguments explicit (#5038)
make optional arguments explicit
2023-09-14 15:58:54 +02:00
Patrick von Platen
3aa641289c [Import] Add missing settings / Correct some dummy imports (#5036)
* [Import] Add missing settings

* up

* up

* up
2023-09-14 12:42:54 +02:00
Vladimir Mandic
ef29b24fda allow loading of sd models from safetensors without online lookups using local config files (#5019)
finish config_files implementation
2023-09-14 12:30:15 +02:00
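
The commit above finishes the `config_files` support so single-file checkpoints can be loaded without any online config lookup. A hedged sketch of the general offline-loading pattern; the paths are placeholders, and `original_config_file` is used here rather than the new `config_files` mapping, whose exact shape is not shown in this log:

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_single_file(
    "/models/v1-5-pruned-emaonly.safetensors",          # local checkpoint, placeholder path
    original_config_file="/models/v1-inference.yaml",   # local SD config, no hub lookup
    local_files_only=True,
)
```
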
Patrick von Platen
8dc93ad3e4 [Import] Don't force transformers to be installed (#5035)
* [Import] Don't force transformers to be installed

* make style
2023-09-14 11:42:10 +02:00
Dhruv Nair
e2033d2dff Fix model offload bug when key isn't present (#5030)
* fix model offload bug when key isn't present

* make style
2023-09-14 11:02:06 +02:00
Steven Liu
19edca82f1 [docs] Create clearer optimization sections (#4870)
* refactor

* update general optim sections

* update more sections

* few more updates

* benchmark code
2023-09-13 15:21:15 -07:00
Lucain
b954c22a44 Fix broken link in docs (#5015)
fix broken link
2023-09-13 15:40:25 +02:00
Kashif Rasul
77373c5eb1 [Wuerstchen] fix compel usage (#4999)
* fix compel usage

* minor changes in documentation

* fix tests

* fix more

* fix more

* typos

* fix tests

* formatting

---------

Co-authored-by: Dominic Rampas <d6582533@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-13 14:54:59 +02:00
Sayak Paul
0ea51627f1 [Core] Fix dtype in InstructPix2Pix SDXL while computing image_latents (#5013)
* check out dtypes.

* check out dtypes.

* check out dtypes.

* check out dtypes.

* check out dtypes.

* check out dtypes.

* check out dtypes.

* potential fix

* check out dtypes.

* check out dtypes.

* working?
2023-09-13 10:50:24 +01:00
Patrick von Platen
6d6a08f1f1 [Flax->PT] Fix flaky testing (#5011)
fix flaky flax class name
2023-09-13 11:29:13 +02:00
Dhruv Nair
34bfe98eaf Gligen Text to Image fix (#5010)
* fix gligen clip import issue

* fix dtype issue with gligen text to image pipeline

* make fix copies
2023-09-13 10:23:59 +01:00
Patrick von Platen
b47f5115da [Lora] fix lora fuse unfuse (#5003)
* fix lora fuse unfuse

* add same changes to loaders.py

* add test

---------

Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>
2023-09-13 11:21:04 +02:00
Patrick von Platen
324aef6d14 [SDXL] Add LoRA to all pipelines (#4896)
* [SDXL] Add LoRA to all pipelines

* fix all

* fix all

* fix all

* fix more docs

* make style
2023-09-13 11:05:20 +02:00
Sayak Paul
8009272f48 [Tests and Docs] Add a test on serializing pipelines with components containing fused LoRA modules (#4962)
* add: test to ensure pipelines can be saved with fused lora modules.

* add docs about serialization with fused lora.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Empty-Commit

* Update docs/source/en/training/lora.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-13 10:01:37 +01:00
Patrick von Platen
1037287e2b examples fix t2i training (#5001)
* examples fix t2i training

* make style
2023-09-12 23:52:41 +02:00
Steven Liu
6ea95b7a90 Fix PR template (#4984)
fix template
2023-09-12 19:36:38 +02:00
Patrick von Platen
0e0db625d0 Fix safety checker seq offload (#4998)
* fix safety checker

* fix safety checker

* fix safety checker
2023-09-12 18:56:35 +02:00
dg845
1f948109b8 [docs] Fix DiffusionPipeline.enable_sequential_cpu_offload docstring (#4952)
* Fix an unmatched backtick and make description more general for DiffusionPipeline.enable_sequential_cpu_offload.

* make style

* _exclude_from_cpu_offload -> self._exclude_from_cpu_offload

* make style

* apply suggestions from review

* make style
2023-09-12 08:58:47 -07:00
Patrick von Platen
37cb819df5 [Lora] Speed up lora loading (#4994)
* speed up lora loading

* Apply suggestions from code review

* up

* up

* Fix more

* Correct more

* Apply suggestions from code review

* up

* Fix more

* Fix more -

* up

* up
2023-09-12 17:51:15 +02:00
Dhruv Nair
f64d52dbca fix custom diffusion tests (#4996) 2023-09-12 17:50:47 +02:00
Dhruv Nair
4d897aaff5 fix image variation slow test (#4995)
fix image variation tests

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-12 17:45:47 +02:00
Patrick von Platen
b1105269b7 make style 2023-09-12 14:55:27 +00:00
Kashif Rasul
5d28d2217f [Wuerstchen] fix combined pipeline's num_images_per_prompt (#4989)
* fix encode_prompt

* added prompt_embeds and negative_prompt_embeds

* prompt_embeds for the prior only
2023-09-12 16:55:13 +02:00
Kashif Rasul
73bf620dec fix E721 Do not compare types, use isinstance() (#4992) 2023-09-12 16:52:25 +02:00
Dhruv Nair
c806f2fad6 remove extra gligen in import (#4987) 2023-09-12 18:35:29 +05:30
Patrick von Platen
18b7264bd0 [Utils] Correct custom init sort (#4967)
* [Utils] Correct custom init sort

* [Utils] Correct custom init sort

* [Utils] Correct custom init sort

* add type checking

* fix custom init sort

* fix test

* fix tests

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-09-12 11:05:53 +02:00
zhiqiang
d82157b3ce [Bug Fix] Should pass the dtype instead of torch_dtype (#4917)
.
2023-09-11 19:45:58 +02:00
Patrick von Platen
93579650f8 Refactor model offload (#4514)
* [Draft] Refactor model offload

* [Draft] Refactor model offload

* Apply suggestions from code review

* cpu offload updates

* remove model cpu offload from individual pipelines

* add hook to offload models to cpu

* clean up

* model offload

* add model cpu offload string

* make style

* clean up

* fixes for offload issues

* fix tests issues

* resolve merge conflicts

* update src/diffusers/pipelines/pipeline_utils.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* Update src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion.py

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-09-11 19:39:26 +02:00
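
A hedged usage sketch contrasting the two offloading modes this refactor touches: `enable_model_cpu_offload()` moves whole sub-models between CPU and GPU as they are needed, while `enable_sequential_cpu_offload()` offloads at sub-module granularity (lower memory, slower). The model id is only an example:

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.enable_model_cpu_offload()        # or: pipe.enable_sequential_cpu_offload()
image = pipe("a photo of an astronaut").images[0]
```
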
Kashif Rasul
16a056a7b5 Wuerstchen fixes (#4942)
* fix arguments and make example code work

* change arguments in combined test

* Add default timesteps

* style

* fixed test

* fix broken test

* formatting

* fix docstrings

* fix  num_images_per_prompt

* fix doc styles

* please dont change this

* fix tests

* rename to DEFAULT_STAGE_C_TIMESTEPS

---------

Co-authored-by: Dominic Rampas <d6582533@gmail.com>
2023-09-11 15:47:53 +02:00
Patrick von Platen
6c6a246461 Update README.md (#4973)
Add monthly pip installs
2023-09-11 15:45:19 +02:00
Patrick von Platen
6bbee1048b Make sure Flax pipelines can be loaded into PyTorch (#4971)
* Make sure Flax pipelines can be loaded into PyTorch

* add test

* Update src/diffusers/pipelines/pipeline_utils.py
2023-09-11 12:03:49 +02:00
Patrick von Platen
2c60f7d14e [Core] Remove TF import checks (#4968)
[TF] Remove tf
2023-09-11 11:22:40 +02:00
Dhruv Nair
b6e0b016ce Lazy Import for Diffusers (#4829)
* initial commit

* move modules to import struct

* add dummy objects and _LazyModule

* add lazy import to schedulers

* clean up unused imports

* lazy import on models module

* lazy import for schedulers module

* add lazy import to pipelines module

* lazy import altdiffusion

* lazy import audio diffusion

* lazy import audioldm

* lazy import consistency model

* lazy import controlnet

* lazy import dance diffusion ddim ddpm

* lazy import deepfloyd

* lazy import kandinsky

* lazy imports

* lazy import semantic diffusion

* lazy imports

* lazy import stable diffusion

* move sd output to its own module

* clean up

* lazy import t2iadapter

* lazy import unclip

* lazy import versatile and vq diffusion

* lazy import vq diffusion

* helper to fetch objects from modules

* lazy import sdxl

* lazy import txt2vid

* lazy import stochastic karras

* fix model imports

* fix bug

* lazy import

* clean up

* clean up

* fixes for tests

* fixes for tests

* clean up

* remove import of torch_utils from utils module

* clean up

* clean up

* fix mistake import statement

* dedicated modules for exporting and loading

* remove testing utils from utils module

* fixes from  merge conflicts

* Update src/diffusers/pipelines/kandinsky2_2/__init__.py

* fix docs

* fix alt diffusion copied from

* fix check dummies

* fix more docs

* remove accelerate import from utils module

* add type checking

* make style

* fix check dummies

* remove torch import from xformers check

* clean up error message

* fixes after upstream merges

* dummy objects fix

* fix tests

* remove unused module import

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-11 09:56:22 +02:00
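
A hedged sketch of the lazy-import mechanism the commit above introduces, in its simplest PEP 562 form: a package `__init__` declares an import structure up front and only imports a submodule when one of its attributes is first accessed. The real `_LazyModule` also handles dummy objects and `TYPE_CHECKING`, which is omitted here:

```python
# package __init__.py (illustrative; the entries in _import_structure are examples)
import importlib

_import_structure = {
    "pipelines": ["StableDiffusionPipeline"],
    "schedulers": ["DDIMScheduler"],
}
_attr_to_module = {attr: mod for mod, attrs in _import_structure.items() for attr in attrs}

def __getattr__(name):
    if name in _attr_to_module:
        module = importlib.import_module(f".{_attr_to_module[name]}", __name__)
        return getattr(module, name)
    raise AttributeError(f"module {__name__!r} has no attribute {name!r}")
```
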
Sayak Paul
88735249da [Docs] fix: minor formatting in the Würstchen docs (#4965)
fix: minor formatting in the docs
2023-09-11 09:12:53 +02:00
Will Berman
4191ddee11 Revert revert and install accelerate main (#4963)
* Revert "Temp Revert "[Core] better support offloading when side loading is enabled… (#4927)"

This reverts commit 2ab170499e.

* tests: install accelerate from main
2023-09-11 08:49:46 +02:00
Will Berman
2ab170499e Temp Revert "[Core] better support offloading when side loading is enabled… (#4927)
Revert "[Core] better support offloading when side loading is enabled. (#4855)"

This reverts commit e4b8e7928b.
2023-09-08 19:54:59 -07:00
Sayak Paul
914c513ee0 [Docs] add t2i adapter entry to overview of training scripts. (#4946)
add t2i adapter entry to overview of training scripts.
2023-09-09 06:52:11 +05:30
Will Berman
d73e6ad050 guard save model hooks to only execute on main process (#4929) 2023-09-08 10:30:06 -07:00
Sayak Paul
d0cf681a1f [Tests] add: tests for t2i adapter training. (#4947)
add: tests for t2i adapter training.
2023-09-08 19:45:39 +05:30
Suraj Patil
dfec61f4b3 [examples] T2IAdapter training script (#4934)
* add t2i_example script

* remove in channels logic

* remove comments

* remove use_euler arg

* add requirements

* only use canny example

* use datasets

* comments

* make log_validation consistent with other scripts

* add readme

* fix title in readme

* update check_min_version

* change a few minor things.

* add doc entry

* add: test for t2i adapter training

* remove use_auth_token

* fix: logged info.

* remove tests for now.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-09-08 10:03:02 +05:30
Suraj Patil
0ec7a02b6a [StableDiffusionXLAdapterPipeline] allow negative micro conds (#4941)
* allow negative micro conds in t2i pipeline

* Empty-Commit

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-09-08 07:59:42 +05:30
Suraj Patil
626284f8d1 [StableDiffusionXLAdapterPipeline] add adapter_conditioning_factor (#4937)
add adapter_conditioning_factor
2023-09-07 19:05:28 +02:00
Sayak Paul
9800cc5ece [InstructPix2Pix] Fix pipeline implementation and add docs (#4844)
* initial evident fixes.

* instructpix2pix fixes.

* add: entry to doc.

* address PR feedback.

* make fix-copies
2023-09-07 15:34:19 +05:30
Kashif Rasul
541bb6ee63 Würstchen model (#3849)
* initial

* initial

* added initial convert script for paella vqmodel

* initial wuerstchen pipeline

* add LayerNorm2d

* added modules

* fix typo

* use model_v2

* embed clip caption and negative_caption

* fixed name of var

* initial modules in one place

* WuerstchenPriorPipeline

* initial shape

* initial denoising prior loop

* fix output

* add WuerstchenPriorPipeline to __init__.py

* use the noise ratio in the Prior

* try to save pipeline

* save_pretrained working

* Few additions

* add _execution_device

* shape is int

* fix batch size

* fix shape of ratio

* fix shape of ratio

* fix output dataclass

* tests folder

* fix formatting

* fix float16 + started with generator

* Update pipeline_wuerstchen.py

* removed vqgan code

* add WuerstchenGeneratorPipeline

* fix WuerstchenGeneratorPipeline

* fix docstrings

* fix imports

* convert generator pipeline

* fix convert

* Work on Generator Pipeline. WIP

* Pipeline works with our diffuzz code

* apply scale factor

* removed vqgan.py

* use cosine schedule

* redo the denoising loop

* Update src/diffusers/models/resnet.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* use torch.lerp

* use warp-diffusion org

* clip_sample=False,

* some refactoring

* use model_v3_stage_c

* c_cond size

* use clip-bigG

* allow stage b clip to be None

* add dummy

* würstchen scheduler

* minor changes

* set clip=None in the pipeline

* fix attention mask

* add attention_masks to text_encoder

* make fix-copies

* add back clip

* add text_encoder

* gen_text_encoder and tokenizer

* fix import

* updated pipeline test

* undo changes to pipeline test

* nip

* fix typo

* fix output name

* set guidance_scale=0 and remove diffuze

* fix doc strings

* make style

* nip

* removed unused

* initial docs

* rename

* toc

* cleanup

* remove test script

* fix-copies

* fix multi images

* remove dup

* remove unused modules

* undo changes for debugging

* no new line

* remove dup conversion script

* fix doc string

* cleanup

* pass default args

* dup permute

* fix some tests

* fix prepare_latents

* move Prior class to modules

* offload only the text encoder and vqgan

* fix resolution calculation for prior

* nip

* removed testing script

* fix shape

* fix argument to set_timesteps

* do not change .gitignore

* fix resolution calculations + readme

* resolution calculation fix + readme

* small fixes

* Add combined pipeline

* rename generator -> decoder

* Update .gitignore

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* removed efficient_net

* create combined WuerstchenPipeline

* make arguments consistent with VQ model

* fix var names

* no need to return text_encoder_hidden_states

* add latent_dim_scale to config

* split model into its own file

* add WuerschenPipeline to docs

* remove unused latent_size

* register latent_dim_scale

* update script

* update docstring

* use Attention preprocessor

* concat with normed input

* fix-copies

* add docs

* fix test

* fix style

* add to cpu_offloaded_model

* updated type

* remove 1-line func

* updated type

* initial decoder test

* formatting

* formatting

* fix autodoc link

* num_inference_steps is int

* remove comments

* fix example in docs

* Update src/diffusers/pipelines/wuerstchen/diffnext.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* rename layernorm to WuerstchenLayerNorm

* rename DiffNext to WuerstchenDiffNeXt

* added comment about MixingResidualBlock

* move paella vq-vae to pipelines' folder

* initial decoder test

* increased test_float16_inference expected diff

* self_attn is always true

* more passing decoder tests

* batch image_embeds

* fix failing tests

* set the correct dtype

* relax inference test

* update prior

* added combined pipeline test

* faster test

* faster test

* Update src/diffusers/pipelines/wuerstchen/pipeline_wuerstchen_combined.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix issues from review

* update wuerstchen.md + change generator name

* resolve issues

* fix copied from usage and add back batch_size

* fix API

* fix arguments

* fix combined test

* Added timesteps argument + fixes

* Update tests/pipelines/test_pipelines_common.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/pipelines/wuerstchen/test_wuerstchen_prior.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/wuerstchen/pipeline_wuerstchen_combined.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/wuerstchen/pipeline_wuerstchen_combined.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/wuerstchen/pipeline_wuerstchen_combined.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/wuerstchen/pipeline_wuerstchen_combined.py

* up

* Fix more

* failing tests

* up

* up

* correct naming

* correct docs

* correct docs

* fix test params

* correct docs

* fix classifier free guidance

* fix classifier free guidance

* fix more

* fix all

* make tests faster

---------

Co-authored-by: Dominic Rampas <d6582533@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Dominic Rampas <61938694+dome272@users.noreply.github.com>
2023-09-06 16:15:51 +02:00
dg845
b76274cb53 [docs] Fix typo in Inpainting force unmasked area unchanged example (#4910)
Fix typo by replacing init_image_arr and repainted_image_arr with init_image and repainted_image, respectively.
2023-09-06 10:49:01 +02:00
Patrick von Platen
dc3e0ca59b [Textual inversion] Relax loading textual inversion (#4903)
* [Textual inversion] Relax loading textual inversion

* up
2023-09-06 10:39:44 +02:00
Sayak Paul
6c314ad0ce [Docs] add doc entry to explain lora fusion and use of different scales. (#4893)
* add doc entry to explain lora fusion and use of different scales.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-09-06 07:38:13 +05:30
Steven Liu
946bb53c56 [docs] Add stronger warning for SDXL height/width (#4867)
* add size warning

* feedback
2023-09-05 10:50:42 -07:00
YiYi Xu
ea311e6989 remove latent input for kandinsky prior_emb2emb pipeline (#4887)
* remove latent input

* fix test

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-09-04 22:19:49 -10:00
YiYi Xu
4c5718a09c fix a bug in StableDiffusionUpscalePipeline.run_safety_checker (#4886)
fix

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-09-04 22:18:59 -10:00
Patrick von Platen
2340ed629e [Test] Reduce CPU memory (#4897)
* [Test] Reduce CPU memory

* [Test] Reduce CPU memory
2023-09-05 13:18:35 +05:30
Bagheera
cfdfcf2018 Add --vae_precision option to the SDXL pix2pix script so that we have… (#4881)
* Add --vae_precision option to the SDXL pix2pix script so that we have the option of avoiding float32 overhead

* style

---------

Co-authored-by: bghira <bghira@users.github.com>
2023-09-05 09:04:06 +02:00
Sayak Paul
e4b8e7928b [Core] better support offloading when side loading is enabled. (#4855)
* better support offloading when side loading is enabled.

* load_textual_inversion

* better messaging for textual inversion.

* fixes

* address PR feedback.

* sdxl support.

* improve messaging

* recursive removal when cpu sequential offloading is enabled.

* add: lora tests

* recruse.

* add: offload tests for textual inversion.
2023-09-05 06:55:13 +05:30
dg845
55e17907f9 Add dropout parameter to UNet2DModel/UNet2DConditionModel (#4882)
* Add dropout param to get_down_block/get_up_block and UNet2DModel/UNet2DConditionModel.

* Add dropout param to Versatile Diffusion modeling, which has a copy of UNet2DConditionModel and its own get_down_block/get_up_block functions.
2023-09-05 00:02:21 +02:00
Sayak Paul
c81a88b239 [Core] LoRA improvements pt. 3 (#4842)
* throw warning when more than one lora is attempted to be fused.

* introduce support of lora scale during fusion.

* change test name

* changes

* change to _lora_scale

* lora_scale to call whenever applicable.

* debugging

* lora_scale additional.

* cross_attention_kwargs

* lora_scale -> scale.

* lora_scale fix

* lora_scale in patched projection.

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* styling.

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* remove unneeded prints.

* remove unneeded prints.

* assign cross_attention_kwargs.

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* clean up.

* refactor scale retrieval logic a bit.

* fix nonetypw

* fix: tests

* add more tests

* more fixes.

* figure out a way to pass lora_scale.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* unify the retrieval logic of lora_scale.

* move adjust_lora_scale_text_encoder to lora.py.

* introduce dynamic adjustment lora scale support to sd

* fix up copies

* Empty-Commit

* add: test to check fusion equivalence on different scales.

* handle lora fusion warning.

* make lora smaller

* make lora smaller

* make lora smaller

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-04 23:52:31 +02:00
YiYi Xu
2c1677eefe allow passing components to connected pipelines when use the combined pipeline (#4883)
* fix

* add test

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-09-04 06:21:36 -10:00
dg845
c73e609aae Fix get_dummy_inputs for Stable Diffusion Inpaint Tests (#4845)
* Change StableDiffusionInpaintPipelineFastTests.get_dummy_inputs to produce a random image and a white mask_image.

* Add dummy expected slices for the test_stable_diffusion_inpaint tests.

* Remove print statement

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-04 12:04:59 +02:00
Erwann Millon
2fa4b3ffb0 check for unet_lora_layers in sdxl pipeline's save_lora_weights method (#4821)
run make fix-copies and make style
2023-09-04 09:59:59 +02:00
Isamu Isozaki
3201903d94 Retrieval Augmented Diffusion Models (#3297)
* Resetting rdm pr

* Fixed styles

* Fixed style

* Moved to rdm folder+fixed slight errors

* Removed config diff

* Started adding tests

* Adding retrieved images

* Fixed faiss import

* Fixed import errors

* Fixing tests

* Added require_faiss

* Updated dependency table

* Attempt solving consistency test

* Fixed truncation and vocab size issue

* Passed common tests

* Finished up cpu testing on pipeline

* Passed all tests locally

* Removed some slow tests

* Removed diffs from test_pipeline_common

* Remove logs

* Removed diffs from test_pipelines_common

* Fixed style

* Fully fixed styles on diffs

* Fixed name

* Proper rename

* Fixed dummies

* Fixed issue with dummyonnx

* Fixed black style

* Fixed dummies

* Changed ordering

* Fixed logging

* Fixing

* Fixing

* quality

* Debugging regex

* Fix dummies with guess

* Fixed typo

* Attempt fix dummies

* black

* ruff

* fixed ordering

* Logging

* Attempt fix

* Attempt fix dummy

* Attempt fixing styles

* Fixed faiss dependency

* Removed unnecessary deprecations

* Finished up main changes

* Added doc

* Passed tests

* Fixed tests

* Remove invisible watermark

* Fixed ruff errors

* Added prompt embed to tests

* Added tests and made retriever an optional component

* Fixed styles

* Made faiss a dependency of pipeline

* Logging

* Fixed dummies

* Make pipeline test work

* Fixed style

* Moved to research projects

* Remove diff

* Fixed style error

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-04 09:42:04 +02:00
Patrick von Platen
705c592ea9 [Tests] Add combined pipeline tests (#4869)
* [Tests] Add combined pipeline tests

* Update tests/pipelines/kandinsky_v22/test_kandinsky.py
2023-09-02 21:36:20 +02:00
Harutatsu Akiyama
c52acaaf17 [ControlNet SDXL Inpainting] Support inpainting of ControlNet SDXL (#4694)
* [ControlNet SDXL Inpainting] Support inpainting of ControlNet SDXL

Co-authored-by: Jiabin Bai <1355864570@qq.com>

---------

Co-authored-by: Harutatsu Akiyama <kf.zy.qin@gmail.com>
2023-09-02 08:04:22 -10:00
Steven Liu
2c45a53aef [docs] Shap-E guide (#4700)
* first draft

* fixes

* more fixes

* fix toctree
2023-09-01 19:52:41 -07:00
Steven Liu
22ea35cf23 [docs] DiffEdit guide (#4722)
* first draft

* minor edits
2023-09-01 14:18:41 -07:00
YiYi Xu
5c404f20f4 [WIP] masked_latent_inputs for inpainting pipeline (#4819)
* add

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-09-01 06:55:31 -10:00
YiYi Xu
d8b6f5d09e support AutoPipeline.from_pipe between a pipeline and its ControlNet pipeline counterpart (#4861)
add
2023-09-01 06:53:03 -10:00
YiYi Xu
30a5acc39f fix a bug in sdxl-controlnet-img2img when using MultiControlNetModel (#4862)
fix

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-09-01 06:51:59 -10:00
Seongsu Park
0c775544dd [Docs] Korean translation update (#4684)
* Docs kr update 3

controlnet, reproducibility uploaded

generator kept as-is
seamless multi-GPU kept as-is

create_dataset translation, first pass

stable_diffusion_jax

new translation

Add coreml, tome

kr docs minor fix

translate training/instructpix2pix

fix training/instructpix2pix.mdx

using-diffusers/weighting_prompts translation, first pass

add SDXL docs

Translate using-diffusers/loading_overview.md

translate using-diffusers/textual_inversion_inference.md

Conditional image generation (#37)

* stable_diffusion_jax

* index_update

* index_update

* condition_image_generation

---------

Co-authored-by: Seongsu Park <tjdtnsu@gmail.com>

jihwan/stable_diffusion.mdx

custom_diffusion work completed

quicktour work completed

distributed inference & control brightness (#40)

* distributed_inference.mdx

* control_brightness

---------

Co-authored-by: idra79haza <idra79haza@github.com>
Co-authored-by: Seongsu Park <tjdtnsu@gmail.com>

using_safetensors (#41)

* distributed_inference.mdx

* control_brightness

* using_safetensors.mdx

---------

Co-authored-by: idra79haza <idra79haza@github.com>
Co-authored-by: Seongsu Park <tjdtnsu@gmail.com>

delete safetensor short

* Replace mdx with md

* toctree update

* Add controlling_generation

* toctree fix

* colab link, minor fix

* docs name typo fix

* frontmatter fix

* translation fix
2023-09-01 09:23:45 -07:00
Pedro Cuenca
60d259add1 Fix link from API to using-diffusers (#4856)
* Fix link from API to using-diffusers

* Fix link
2023-09-01 15:05:01 +02:00
Dhruv Nair
189e9f01b3 Test Cleanup Precision issues (#4812)
* proposal for flaky tests

* more precision fixes

* move more tests to use cosine distance

* more test fixes

* clean up

* use default attn

* clean up

* update expected value

* make style

* make style

* Apply suggestions from code review

* Update src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py

* make style

* fix failing tests

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-09-01 17:58:37 +05:30
Nguyễn Công Tú Anh
38466c369f Add GLIGEN Text Image implementation (#4777)
* Add GLIGEN Text Image implementation

* add style transfer from image

* fix check_repository_consistency

* add convert script GLIGEN model to Diffusers

* rename attention type

* fix style code

* remove PositionNetTextImage

* Revert "fix check_repository_consistency"

This reverts commit 15f098c96e.

* change attention type name

* update docs for GLIGEN

* change examples with hf-document-image

* fix style

* add CLIPImageProjection for GLIGEN

* Add new encode_prompt, load project matrix in pipe init

* move CLIPImageProjection to stable_diffusion

* add comment
2023-09-01 15:48:01 +05:30
dg845
5f740d0f55 [docs] Add inpainting example for forcing the unmasked area to remain unchanged to the docs (#4536)
* Initial code to add force_unmasked_unchanged argument to StableDiffusionInpaintPipeline.__call__.

* Try to improve StableDiffusionInpaintPipelineFastTests.get_dummy_inputs.

* Use original mask to preserve unmasked pixels in pixel space rather than latent space.

* make style

* start working on note in docs to force unmasked area to be unchanged

* Add example of forcing the unmasked area to remain unchanged.

* Revert "make style"

This reverts commit fa7759293a.

* Revert "Use original mask to preserve unmasked pixels in pixel space rather than latent space."

This reverts commit 092bd0e9e9.

* Revert "Try to improve StableDiffusionInpaintPipelineFastTests.get_dummy_inputs."

This reverts commit ff41cf43c5.

* Revert "Initial code to add force_unmasked_unchanged argument to StableDiffusionInpaintPipeline.__call__."

This reverts commit 989979752a.

---------

Co-authored-by: Will Berman <wlbberman@gmail.com>
2023-08-31 21:29:16 -07:00
YiYi Xu
75f81c25d1 fix sdxl-inpaint fast test (#4859)
fix inpaint test

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-08-31 15:42:58 -10:00
Patrick von Platen
bbf733ab70 [SDXL Inpaint] Correct strength default (#4858) 2023-08-31 20:34:33 +02:00
Steven Liu
aedd78767c [docs] ControlNet guide (#4640)
* first draft

* finish first draft

* feedback and remove sections from API pages

* clean docstrings

* add full code example
2023-08-31 10:02:02 -04:00
Patrick von Platen
7caa3682e4 Remove warn with deprecate (#4850)
* Remove warn with deprecate

* Fix typo with 1.0,0
2023-08-31 15:08:41 +02:00
Ella Charlaix
0edb4cac78 Fix image processor inputs width (#4853)
fix width for np array inputs
2023-08-31 14:50:55 +02:00
Yukun Huang
85b3f08c26 Fix potential type mismatch errors in SDXL pipelines (#4796)
* Fix potential type conversion errors in SDXL pipelines

* make sure vae stays in fp16

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-31 09:22:18 +02:00
Sayak Paul
19f3161d94 [Docs] improve the LoRA doc. (#4838)
* improve the LoRA doc.

* include fuse_lora and unfuse_lora

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-08-31 00:13:15 +05:30
Steven Liu
a1fdfca36f [docs] SDXL (#4428)
* first draft

* reorg toctree

* note about minsdxl

* feedback

* fix

* micro-conditionings

* add tip

* fix section levels

* d'oh fix pipeline names

* feedback

* remove old section
2023-08-30 11:34:55 -04:00
Patrick von Platen
d1e20be664 make style 2023-08-30 14:13:14 +02:00
Anatoly Belikov
af3854d6ad sketch inpaint from a1111 for non-inpaint models (#4824)
* Create masked_stable_diffusion_img2img.py

* add MaskedIm2ImPipeline to readme

* Update README.md
2023-08-30 09:51:28 +02:00
Patrick von Platen
9f1936d2fc Fix Unfuse Lora (#4833)
* Fix Unfuse Lora

* add tests

* Fix more

* Fix more

* Fix all

* make style

* make style
2023-08-30 09:32:25 +05:30
Eugene Antropov
fbca2e0a7a Add loading ckpt from file for SDXL controlNet (#4683)
* Add load ckpt from file for ControlNet SDXL

* Reformat code

* Resort imports

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-30 09:00:53 +05:30
Sayak Paul
3768d4d77c [Core] refactor encode_prompt (#4617)
* refactoring of encode_prompt()

* better handling of device.

* fix: device determination

* fix: device determination 2

* handle num_images_per_prompt

* revert changes in loaders.py and give birth to encode_prompt().

* minor refactoring for encode_prompt().

* make backward compatible.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix: concatenation of the neg and pos embeddings.

* incorporate encode_prompt() in test_stable_diffusion.py

* turn it into big PR.

* make it bigger

* gligen fixes.

* more fixes to gligen

* _encode_prompt -> encode_prompt in tests

* first batch

* second batch

* fix blasphemous mistake

* fix

* fix: hopefully for the final time.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-30 08:57:26 +05:30
Nikhil Gajendrakumar
8ccb619416 VaeImageProcessor: Allow image resizing also for torch and numpy inputs (#4832)
Co-authored-by: Nikhil Gajendrakumar <nikhilkatte@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-29 22:45:05 +02:00
zideliu
0699ac62f0 fix typo (#4822) 2023-08-29 20:54:36 +02:00
Patrick von Platen
a76f2ad538 make style 2023-08-29 09:25:09 +02:00
VitjanZ
7200daa412 Support saving multiple t2i adapter models under one checkpoint (#4798)
* adding save and load for MultiAdapter, adding test

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Adding changes from review test_stable_diffusion_adapter

* import sorting fix

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-29 09:24:40 +02:00
Alexsey Shestacov
3eeaf4e041 Fix convert_original_stable_diffusion_to_diffusers script (#4817)
Fix stable diffusion conversion script
2023-08-29 09:14:45 +02:00
Patrick von Platen
c583f3b452 Fuse loras (#4473)
* Fuse loras

* initial implementation.

* add slow test one.

* styling

* add: test for checking efficiency

* print

* position

* place model offload correctly

* style

* style.

* unfuse test.

* final checks

* remove warning test

* remove warnings altogether

* debugging

* tighten up tests.

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* suit up the generator initialization a bit.

* remove print

* update assertion.

* debugging

* remove print.

* fix: assertions.

* style

* can generator be a problem?

* generator

* correct tests.

* support text encoder lora fusion.

* tighten up tests.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-08-29 09:14:24 +02:00
Chong Mou
12358b986f add models for T2I-Adapter-XL (#4696)
* T2I-Adapter-XL

* update

* update

* add pipeline

* modify pipeline

* modify pipeline

* modify pipeline

* modify pipeline

* modify pipeline

* modify modeling_text_unet

* fix styling.

* fix: copies.

* adapter settings

* new test case

* new test case

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* revert prints.

* new test case

* remove print

* org test case

* add test_pipeline

* styling.

* fix copies.

* modify test parameter

* style.

* add adapter-xl doc

* double quotes in docs

* Fix potential type mismatch

* style.

---------

Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2023-08-29 10:34:07 +05:30
YiYi Xu
5eeedd9e33 add StableDiffusionXLControlNetImg2ImgPipeline (#4592)
---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-28 08:16:27 -10:00
YiYi Xu
a971c598b5 fix auto_pipeline: pass kwargs to load_config (#4793)
* fix

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-28 07:42:16 -10:00
YiYi Xu
934d439a42 fix bug in StableDiffusionXLControlNetPipeline when use guess_mode (#4799)
* fix



---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-28 06:51:17 -10:00
Dhruv Nair
e3f3672f46 Fix Disentangle ONNX and non-ONNX pipeline (#4656)
* initial commit to fix inheritance issue

* clean up sd onnx upscale

* clean up
2023-08-28 21:14:49 +05:30
Mario Namtao Shianti Larcher
87ae330056 [Examples] Save SDXL LoRA weights with chosen precision (#4791)
* Increase min accelerate ver to avoid OOM when mixed precision

* Rm re-instantiation of VAE

* Rm casting to float32

* Del unused models and free GPU

* Fix style
2023-08-28 13:57:40 +05:30
Patrick von Platen
1b46c66132 make style 2023-08-28 07:17:21 +00:00
Yead
031358988b Fix save_path bug in textual inversion training script (#4710)
* Update textual_inversion.py

fixed save_path bug in textual inversion training

* Update test_examples.py

update test_textual_inversion for the updated saved file name

* Update textual_inversion.py

fixed some formatting issues

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-28 09:17:08 +02:00
Shauray Singh
fd35689f25 [WIP] Add Fabric (#4201)
* empty PR

* init

* changes

* starting with the pipeline

* stable diff

* prev

* more things, getting started

* more functions

* making it more readable

* almost done testing

* var changes

* testing

* device

* device support

* maybe

* device malfunctions

* new new

* register

* testing

* exec does not work

* float

* change info

* change of architecture

* might work

* testing with colab

* more attn stuff

* stupid additions

* documenting and testing

* writing tests

* more docs

* tests and docs

* remove test

* empty PR

* init

* changes

* starting with the pipeline

* stable diff

* prev

* more things, getting started

* more functions

* making it more readable

* almost done testing

* var changes

* testing

* device

* device support

* maybe

* device malfunctions

* new new

* register

* testing

* exec does not work

* float

* change info

* change of architecture

* might work

* testing with colab

* more attn stuff

* stupid additions

* documenting and testing

* writing tests

* more docs

* tests and docs

* remove test

* change cross attention

* revert back

* tests

* reverting back to orig

* changes

* test passing

* pipeline changes

* before quality

* quality checks pass

* remove print statements

* doc fixes

* __init__ error something

* update docs, working on dim

* working on encoding

* doc fix

* more fixes

* no more dependent on 512*512

* update docs

* fixes

* test passing

* remove comment

* fixes and migration

* simpler tests

* doc changes

* green CI

* changes

* more docs

* changes

* new images

* to community examples

* selete

* more fixes

* changes

* fix

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-08-28 09:10:55 +02:00
chillpixel
e8c9069d6f Update loaders.py (#4805)
* Update loaders.py

Solves an error sometimes thrown while iterating over state_dict.keys() caused by using the .pop() method within the loop.

* Update loaders.py
2023-08-28 11:23:25 +05:30
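
The pattern behind the fix above is a general Python pitfall rather than anything diffusers-specific: mutating a dict while iterating over its live `.keys()` view raises a `RuntimeError`. A minimal sketch (with a made-up `state_dict`, not the actual `loaders.py` code) of the failure and the usual workaround:

```python
# Minimal sketch of the pitfall (made-up state_dict, not the actual loaders.py code).
state_dict = {"unet.lora.down": 1.0, "unet.lora.up": 2.0, "text_encoder.bias": 0.5}

# This variant can raise "RuntimeError: dictionary changed size during iteration",
# because .keys() is a live view of the dict being mutated inside the loop:
#
#     for key in state_dict.keys():
#         if key.startswith("unet."):
#             state_dict.pop(key)

# Iterating over a snapshot of the keys makes the in-loop .pop() safe.
for key in list(state_dict.keys()):
    if key.startswith("unet."):
        state_dict.pop(key)

print(state_dict)  # {'text_encoder.bias': 0.5}
```

Wrapping the keys in `list(...)` snapshots them up front, so popping entries inside the loop no longer changes the object being iterated.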
Patrick von Platen
766aa50f70 [LoRA Attn Processors] Refactor LoRA Attn Processors (#4765)
* [LoRA Attn] Refactor LoRA attn

* correct for network alphas

* fix more

* fix more tests

* fix more tests

* Move below

* Finish

* better version

* correct serialization format

* fix

* fix more

* fix more

* fix more

* Apply suggestions from code review

* Update src/diffusers/pipelines/stable_diffusion/pipeline_onnx_stable_diffusion_img2img.py

* deprecation

* relax atol for slow test slighly

* Finish tests

* make style

* make style
2023-08-28 10:38:09 +05:30
Patrick von Platen
c4d2823601 [SDXL Lora] Fix last ben sdxl lora (#4797)
* Fix last ben sdxl lora

* Correct typo

* make style
2023-08-26 23:31:56 +02:00
Patrick von Platen
4f8853e481 [Torch compile] Fix torch compile for controlnet (#4795)
Fix torch compile for controlnete
2023-08-26 22:30:02 +02:00
Steven Liu
fed88195e3 [docs] Fix syntax for compel (#4794)
* fix syntax

* update image
2023-08-26 11:33:10 -07:00
Sayak Paul
0de35e4a52 [Tests] Tighten up LoRA loading relaxation (#4787)
* debugging

* better logic for filtering.

* Update src/diffusers/loaders.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-26 15:01:16 +05:30
Canberk Kandemir
0d81e543a2 Unet fix (#4769)
* Optional images variable train_custom_diffusion.py

* Fixed train_custom_diffusion.py

* Revert accidental changes to unet_2d_condition.py

* "Format code with black"
2023-08-26 11:01:24 +02:00
Sayak Paul
3be0ff9056 [Core] Support negative conditions in SDXL (#4774)
* add: support negative conditions.

* fix: key

* add: tests

* address PR feedback.

* add documentation

* add img2img support.

* add inpainting support.

* ad controlnet support

* Apply suggestions from code review

* modify wording in the doc.
2023-08-26 09:13:44 +05:30
Patrick von Platen
2764db3194 [SDXL] Add docs about forcing passed embeddings to be 0 (#4783)
* make style

* make style

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* make style

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-08-25 20:52:45 +02:00
Patrick von Platen
048d901993 make style 2023-08-25 18:51:03 +00:00
cmdr2
cb432c4ebc Allow passing a checkpoint state_dict to convert_from_ckpt (instead of just a string path) (#4653)
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-25 20:50:39 +02:00
YiYi Xu
b7b1a30bc4 refactor prepare_mask_and_masked_image with VaeImageProcessor (#4444)
* refactor image processor for mask
---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-08-25 08:18:48 -10:00
Will Berman
7e5587a5ac instance_prompt->class_prompt (#4784) 2023-08-25 20:06:55 +02:00
Mayank Khanduja
dc8da1d449 Fixed broken link of CLIP doc in evaluation doc (#4760) 2023-08-25 20:04:50 +02:00
Zijian He
3dd540171d fix bug of progress bar in clip guided images mixing (#4729) 2023-08-25 18:54:03 +02:00
YiYi Xu
b3b2d30cd8 fix a bug in from_pretrained when load optional components (#4745)
* fix
---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-25 06:25:48 -10:00
Dhruv Nair
3bba44d74e [WIP ] Proposal to address precision issues in CI (#4775)
* proposal for flaky tests

* clean up
2023-08-25 19:12:09 +05:30
Sanchit Gandhi
b1290d3fb8 Convert MusicLDM (#4579)
* from audioldm

* fix vae

* move to new pipeline

* copied from audioldm

* remove redundant control flow

* iterate

* fix docstring

* finish pipeline

* tests: from audioldm2

* iterate

* finish fast tests

* finish slow integration tests

* add docs

* remove dtype test

* update toctree

* "copied from" in conversion (where possible)

* Update docs/source/en/api/pipelines/musicldm.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix docstring

* make nightly

* style

* fix dtype test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-25 13:31:00 +01:00
Sanchit Gandhi
29a11c2a94 [AudioLDM 2] Pipeline fixes (#4738)
* fix docs

* fix unet docs

* use image output for latents

* fix hub checkpoints

* fix pipeline example

* update example

* return_dict = False

* revert image pipeline output

* revert doc changes

* remove dtype test

* make style

* remove docstring updates

* remove unet docstring update

* Empty commit to re-trigger CI

* fix cpu offload

* fix dtype test

* add offload test
2023-08-25 11:38:10 +01:00
Patrick von Platen
cdacd8f1dd Torch device (#4755) 2023-08-25 11:13:32 +02:00
Sayak Paul
470d51c8ed improve setup.py (#4748) 2023-08-25 13:44:20 +05:30
Andrew Zhu
d6141205cd fix sdxl_lwp empty neg_prompt error issue (#4743)
* fix sdxl_lwp empty neg_prompt error issue

* fix sdxl_lwp empty neg_prompt error issue, update code format

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-08-25 09:55:56 +05:30
Sayak Paul
4447547eda [Examples] fix sdxl dreambooth lora checkpointing. (#4749)
* fix sdxl dreambooth lora checkpointing.

* style
2023-08-25 09:50:02 +05:30
Sayak Paul
5222294748 [LoRA] relax lora loading logic (#4610)
* relax lora loading logic.

* cater to the other cases too.

* fix: variable name

* bring the chaos down.

* check

* deal with checkpointed files.

* Apply suggestions from code review

Co-authored-by: apolinário <joaopaulo.passos@gmail.com>

* style

---------

Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
2023-08-25 09:35:51 +05:30
Mario Namtao Shianti Larcher
c25c46137d [Examples] Add madebyollin VAE to SDXL LoRA example, along with an explanation (#4762)
Add madebyollin VAE to LoRA example, along with an explanation
2023-08-25 09:34:32 +05:30
Will Berman
3105c710ba [fix] multi t2i adapter set total_downscale_factor (#4621)
* [fix] multi t2i adapter set total_downscale_factor

* move image checks into check inputs

* remove copied from
2023-08-24 12:01:23 -07:00
Patrick von Platen
58f5f748f4 [Tests] Fix paint by example (#4761)
* [Tests] Fix paint by example

* Update src/diffusers/pipelines/paint_by_example/image_encoder.py
2023-08-24 16:03:10 +02:00
Dhruv Nair
4f05058bb7 Clean up flaky behaviour on Slow CUDA Pytorch Push Tests (#4759)
use max diff to compare model outputs
2023-08-24 18:58:02 +05:30
Patrick von Platen
5d4413001b make style 2023-08-24 10:19:47 +00:00
Symbiomatrix
863e741614 Bugfix for SDXL model loading in low ram system. (#4628)
Update convert_from_ckpt.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-24 12:19:16 +02:00
Sanchit Gandhi
24c5e7708b [AudioLDM2] Doc fixes (#4739)
* [AudioLDM2] Doc fixes

* update docstrings

* fix unet docstring

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-24 07:20:27 +05:30
YiYi Xu
cd21b965d1 add a step_index counter (#4347)
add self.step_index

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-23 10:49:54 -10:00
Yinzhen Wang
d185b5ed5f change validation scheduler for train_dreambooth.py when training IF (#4333)
* dreambooth training

* train_dreambooth validation scheduler

* set a particular scheduler via a string

* modify readme after setting a particular scheduler via a string

* modify readme after setting a particular scheduler

* use importlib to set a particular scheduler

* import with correct sort
2023-08-23 22:18:17 +02:00
Suraj Patil
709a642827 fix dummy import for AudioLDM2 (#4741)
* fix import

* style
2023-08-23 22:07:47 +02:00
Sanchit Gandhi
0a0fe69aa6 [AudioLDM Docs] Update docstring (#4744) 2023-08-23 11:04:54 -07:00
realliujiaxu
124e76ddc6 [docs] add variant="fp16" flag (#4678) 2023-08-23 10:00:34 -07:00
Sanchit Gandhi
05b0ec63bc [AudioLDM Docs] Fix docs for output (#4737) 2023-08-23 18:02:11 +02:00
Sayak Paul
4909b1e3ac [Examples] fix checkpointing and casting bugs in train_text_to_image_lora_sdxl.py (#4632)
* fix: casting issues.

* fix checkpointing.

* tests

* fix: bugs
2023-08-23 10:58:54 +05:30
Ollin Boer Bohan
052bf3280b Fix AutoencoderTiny encoder scaling convention (#4682)
* Fix AutoencoderTiny encoder scaling convention

  * Add [-1, 1] -> [0, 1] rescaling to EncoderTiny

  * Move [0, 1] -> [-1, 1] rescaling from AutoencoderTiny.decode to DecoderTiny
    (i.e. immediately after the final conv, as early as possible)

  * Fix missing [0, 255] -> [0, 1] rescaling in AutoencoderTiny.forward

  * Update AutoencoderTinyIntegrationTests to protect against scaling issues.
    The new test constructs a simple image, round-trips it through AutoencoderTiny,
    and confirms the decoded result is approximately equal to the source image.
    This test checks behavior with and without tiling enabled.
    This test will fail if new AutoencoderTiny scaling issues are introduced.

  * Context: Raw TAESD weights expect images in [0, 1], but diffusers'
    convention represents images with zero-centered values in [-1, 1],
    so AutoencoderTiny needs to scale / unscale images at the start of
    encoding and at the end of decoding in order to work with diffusers.

* Re-add existing AutoencoderTiny test, update golden values

* Add comments to AutoencoderTiny.forward
2023-08-23 08:38:37 +05:30
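
The scaling convention described in the commit above boils down to two affine maps between the raw-TAESD-style `[0, 1]` range and the diffusers-style zero-centered `[-1, 1]` range. A minimal illustrative sketch, with hypothetical helper names rather than `AutoencoderTiny`'s actual methods:

```python
# Illustrative helpers only (hypothetical names, not AutoencoderTiny's API) for the two
# conventions described above: diffusers-style [-1, 1] vs. raw-TAESD-style [0, 1].
def to_unit_range(x):
    """Map a [-1, 1] image value to [0, 1] before a TAESD-style encoder."""
    return x * 0.5 + 0.5


def to_signed_range(x):
    """Map a [0, 1] decoder output back to the diffusers [-1, 1] convention."""
    return x * 2.0 - 1.0


assert to_signed_range(to_unit_range(-1.0)) == -1.0  # round-trip preserves the value
```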
Patrick von Platen
80871ac597 fix bad error message when transformers is missing (#4714) 2023-08-22 21:25:01 +02:00
Patrick von Platen
6abc66ef28 Fix all docs (#4721)
* [Docs] Fix all

* fix
2023-08-22 21:00:21 +02:00
Patrick von Platen
38efac9f61 Revert "Move controlnet load local tests to nightly (#4543)" (#4713)
This reverts commit 7b07f9812a.
2023-08-22 19:55:15 +02:00
Patrick von Platen
4f6399bedd rename test file to run, so that examples tests do not fail (#4715)
* rename test file to run, so that examples tests do not fail

* [Tests] Rename community tests
2023-08-22 19:54:46 +02:00
Patrick von Platen
6e1af3a777 [Docs] Fix docs controlnet missing /Tip (#4717) 2023-08-22 18:40:26 +02:00
zideliu
f22aad6e3a Add reference_attn & reference_adain support for sdxl (#4502)
* ADD SDXL reference & reference adain

* Update README.md

* Update README.md

* format stable_diffusion_xl_reference.py

* format file

* Format file

* format file

* fix format

* fix format with ruff

* fix format

* Update examples/community/README.md

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update examples/community/README.md

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* Update README.md

* Update README.md & fix typo

* Update README.md

* fix format

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-08-22 20:22:01 +05:30
realliujiaxu
ecded50ad5 add convert diffuser pipeline of XL to original stable diffusion (#4596)
convert diffuser pipeline of XL to original stable diffusion

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-08-22 19:11:06 +05:30
Alex McKinney
e34d9aa681 Replaces DIFFUSERS_TEST_DEVICE backend list with trying device (#4673)
This is a better method than comparing against a list of supported backends as it allows for supporting any number of backends provided they are installed on the user's system.
This should have no effect on the behaviour of tests in Huggingface's CI workers.
See transformers#25506 where this approach has already been added.
2023-08-22 11:48:12 +05:30
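
The idea behind the change above can be sketched as follows: rather than validating `DIFFUSERS_TEST_DEVICE` against a hard-coded allowlist of backends, just try to allocate on the requested device and surface the backend's own error if it is unavailable. This is only a rough sketch, not the actual code added to the test utilities:

```python
import os

import torch

# Rough sketch of the "try the device" approach (not the actual diffusers test code):
# attempt a tiny allocation on the requested device and fail loudly if the backend
# is unknown or not installed.
requested = os.environ.get("DIFFUSERS_TEST_DEVICE", "cpu")
try:
    torch.zeros(1, device=requested)
    torch_device = requested
except Exception as err:  # unknown or uninstalled backend
    raise ValueError(f"Unsupported test device '{requested}'") from err

print(f"Running tests on: {torch_device}")
```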
Sayak Paul
8d30d25794 [LoRA] default to None when fc alphas are not available. (#4706)
default to None when fc alphas are not available.
2023-08-22 08:47:08 +05:30
Sayak Paul
1e0395e791 [LoRA] ensure different LoRA ranks for text encoders can be properly handled (#4669)
* debugging starts

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging ends, but does it?

* more robustness.
2023-08-22 08:21:13 +05:30
Sayak Paul
9141c1f9d5 [Core] enable lora for sdxl controlnets too and add slow tests. (#4666)
* enable lora for sdxl controlnets too.

* add: tests

* fix: assertion values.
2023-08-22 07:13:23 +05:30
dg845
f75b8aa9dd [docs] Add note in UniDiffusers Doc about PyTorch 1.X numerical stability issue (#4703)
* Add note regarding UniDiffuser pipeline numerical stability issues on PyTorch 1.X

* Use the doc-builder warning tag.
2023-08-22 07:12:06 +05:30
Sanchit Gandhi
7a24977ce3 Add AudioLDM 2 (#4549)
* from audioldm

* unet down + mid

* vae, clap, flan-t5

* start sequence audio mae

* iterate on audioldm encoder

* finish encoder

* finish weight conversion

* text pre-processing

* gpt2 pre-processing

* fix projection model

* working

* unet equivalence

* finish in base

* add unet cond

* finish unet

* finish custom unet

* start clean-up

* revert base unet changes

* refactor pre-processing

* tests: from audioldm

* fix some tests

* more fixes

* iterate on tests

* make fix copies

* harden fast tests

* slow integration tests

* finish tests

* update checkpoint

* update copyright

* docs

* remove outdated method

* add docstring

* make style

* remove decode latents

* enable cpu offload

* (text_encoder_1, tokenizer_1) -> (text_encoder, tokenizer)

* more clean up

* more refactor

* build pr docs

* Update docs/source/en/api/pipelines/audioldm2.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* small clean

* tidy conversion

* update for large checkpoint

* generate -> generate_language_model

* full clap model

* shrink clap-audio in tests

* fix large integration test

* fix fast tests

* use generation config

* make style

* update docs

* finish docs

* finish doc

* update tests

* fix last test

* syntax

* finalise tests

* refactor projection model in prep for TTS

* fix fast tests

* style

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-08-21 12:34:21 +01:00
zuojianghua
74d902eb59 add config_file to from_single_file (#4614)
* Update loaders.py

add config_file to from_single_file, for use when download_from_original_stable_diffusion_ckpt is called

* Update loaders.py

add config_file to from_single_file, for use when download_from_original_stable_diffusion_ckpt is called

* change config_file to original_config_file

* make style && make quality

---------

Co-authored-by: jianghua.zuo <jianghua.zuo@weimob.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-08-18 19:33:12 +05:30
Andrew Zhu
d7c4ae619d Add SDXL long weighted prompt pipeline (replace pr:4629) (#4661)
* Add SDXL long weighted prompt pipeline

* Add SDXL long weighted prompt pipeline usage sample in the readme document

* Add SDXL long weighted prompt pipeline usage sample in the readme document, add result image
2023-08-18 11:30:10 +05:30
Isotr0py
67ea2b7afa Support tiled encode/decode for AutoencoderTiny (#4627)
* Impl tae slicing and tiling

* add tae tiling test

* add parameterized test

* formatted code

* fix failed test

* style docs
2023-08-18 09:12:55 +05:30
Sayak Paul
a10107f92b fix: lora sdxl tests (#4652) 2023-08-17 15:59:50 +05:30
Sayak Paul
d0c30cfd37 make post-release (#4650) 2023-08-17 14:16:25 +05:30
Jacqui Wei
7c3e7fedcd Fix use_onnx parameter usage in from_pretrained func and update test_download_no_onnx_by_default test (#4508)
* add missing use_onnx in from_pretrained func

* fix test_download_no_onnx_by_default test func

* address comments

* split test cases
2023-08-17 11:49:32 +05:30
Patrick von Platen
029fb41695 [Safetensors] Make safetensors the default way of saving weights (#4235)
* make safetensors default

* set default save method as safetensors

* update tests

* update to support saving safetensors

* update test to account for safetensors default

* update example tests to use safetensors

* update example to support safetensors

* update unet tests for safetensors

* fix failing loader tests

* fix qc issues

* fix pipeline tests

* fix example test

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-08-17 10:54:28 +05:30
Batuhan Taskaya
852dc76d6d Support higher dimension LoRAs (#4625)
* Support higher dimension LoRAs

* add: tests

* fix: assertion values.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-08-17 10:07:07 +05:30
Scott Lessans
064f150813 Fix UnboundLocalError during LoRA loading (#4523)
* fixed

* add: tests

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-08-17 09:33:35 +05:30
Sayak Paul
5333f4c0ec make things clear in the controlnet sdxl doc. (#4644) 2023-08-17 09:04:28 +05:30
Dhruv Nair
3d08d8dc4e fix loading custom text encoder when using from_single_file (#4571)
fix loading custom text encoder when using from_single_file
2023-08-17 08:41:09 +05:30
Steven Liu
bdc4c3265f [docs] MultiControlNet (#4635)
multicontrolnet docs
2023-08-17 08:14:20 +05:30
Steven Liu
4ff7264d9b [docs] PushToHubMixin (#4622)
* push to hub docs

* fix typo

* feedback

* make style
2023-08-16 13:20:59 -06:00
Sayak Paul
5049599143 [Core] feat: MultiControlNet support for SDXL ControlNet pipeline (#4597)
* core: add multicontrolnet support to sdxl controlnet

* modify checks.

* fix: original_size determination

* add: tests for multi controlnet sdxl.

* remove unnecessary prints.
2023-08-16 20:30:39 +05:30
Suraj Patil
7b93c2a882 [research_projects] SDXL controlnet script (#4633)
add controlnet script
2023-08-16 18:27:08 +05:30
Dirk Morris
a7de96505b Fix unipc use_karras_sigmas exception - fixes huggingface/diffusers#4580 (#4581)
* Fix unipc karras sigmas exception - fixes huggingface/diffusers#4580

* Add unipc scheduler tests for karras sigmas
2023-08-16 10:01:53 +05:30
Sayak Paul
351aab60e9 Update text2image.md to fix the links (#4626) 2023-08-16 09:53:10 +05:30
nikhil-masterful
da5ab51d54 Add GLIGEN implementation (#4441)
* Add GLIGEN implementation

* GLIGEN: Fix code quality check failures

* GLIGEN: Fix Import block un-sorted or un-formatted failures

* GLIGEN: Fix check_repository_consistency failures

* GLIGEN: Add 'PositionNet' to versatile_diffusion/modeling_text_unet.py

* GLIGEN: check_repository_consistency: fix 'copy does not match' error

* GLIGEN: Fix review comments (1)

* GLIGEN: Fix E721 Do not compare types, use `isinstance()` failures

* GLIGEN : Ensure _encode_prompt() copy matches to StableDiffusionPipeline

* GLIGEN: Fix ruff E721 failure in unidiffuser/test_unidiffuser.py

* GLIGEN: doc_builder: restyle pipeline_stable_diffusion_gligen.py

* GLIGEN: reset files unrelated to gligen

* GLIGEN: Fix documentation comments (1)

* GLIGEN: Fix review comments (2)

* GLIGEN: Added FastTest

* GLIGEN: Fix review comments (3)
2023-08-16 09:34:17 +05:30
Sayak Paul
5175d3d7a5 add: train to text image with sdxl script. (#4505)
* add: train to text image with sdxl script.

Co-authored-by: CaptnSeraph <s3raph1m@gmail.com>

* fix: partial func.

* fix: default value of output_dir.

* make style

* set num inference steps to 25.

* remove mentions of LoRA.

* up min version

* add: ema cli arg

* run device placement while running step.

* precompute vae encodings too.

* fix

* debug

* should work now.

* debug

* debug

* goes alright?

* style

* debugging

* debugging

* debugging

* debugging

* fix

* reinit scheduler if prediction_type was passed.

* always cast vae in float32

* better handling of snr.

Co-authored-by: bghira <bghira@users.github.com>

* the vae should be also passed

* add: docs.

* add: sdxl t2i tests

* save the pipeline

* autocast.

* fix: save_model_card

* fix: save_model_card.

---------

Co-authored-by: CaptnSeraph <s3raph1m@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: bghira <bghira@users.github.com>
2023-08-16 09:02:49 +05:30
Sayak Paul
a7508a76f0 add: pushtohubmixin to pipelines and schedulers docs overview. (#4607)
* add: pushtohubmixin to pipelines and schedulers docs overview.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-08-15 22:23:17 +05:30
Sayak Paul
aaef41b5fe [Docs] fix links in the controlling generation doc. (#4612)
* fix links in the controlling generation doc.

* more fixes.
2023-08-15 20:27:13 +05:30
Wang Qiang
078df46bc9 An invalid clerical error in sdxl finetune (#4608) 2023-08-15 10:41:51 +05:30
Sayak Paul
15782fd506 [Pipeline utils] feat: implement push_to_hub for standalone models, schedulers as well as pipelines (#4128)
* feat: implement push_to_hub for standalone models.

* address PR feedback.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* remove max_shard_size.

* add: support for scheduler push_to_hub

* enable push_to_hub support for flax schedulers.

* enable push_to_hub for pipelines.

* Apply suggestions from code review

Co-authored-by: Lucain <lucainp@gmail.com>

* reflect pr feedback.

* address another round of feedback.

* better handling of kwargs.

* add: tests

* Apply suggestions from code review

Co-authored-by: Lucain <lucainp@gmail.com>

* setting hub staging to False for now.

* incorporate staging test as a separate job.

Co-authored-by: ydshieh <2521628+ydshieh@users.noreply.github.com>

* fix: tokenizer loading.

* fix: json dumping.

* move is_staging_test to a better location.

* better treatment to tokens.

* define repo_id to better handle concurrency

* style

* explicitly set token

* Empty-Commit

* move SUER, TOKEN to test

* collate org_repo_id

* delete repo

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lucain <lucainp@gmail.com>
Co-authored-by: ydshieh <2521628+ydshieh@users.noreply.github.com>
2023-08-15 07:39:22 +05:30
Sayak Paul
d93ca26893 [Examples] Update InstructPix2Pix README_sdxl.md to fix mentions (#4574)
* Update README_sdxl.md to fix mentions

* add --push_to_hub

* add --push_to_hub

* fix: mention
2023-08-14 17:48:13 +05:30
Claire Froelich
32963c24c5 Fix git-lfs command typo in docs (#4586)
fix typo in git-lfs command

added missing hyphen. "git lfs" is not a command
2023-08-14 17:21:45 +05:30
AisingioroHao
1b739e7344 Fixed invalid pipeline_class_name parameter. (#4590)
* Fixed invalid pipeline_class_name parameter.

* Fix the format
2023-08-14 17:21:17 +05:30
Sayak Paul
d67eba0f31 [Utility] adds an image grid utility (#4576)
* add: utility for image grid.

* add: return type.

* change necessary places.

* add to utility page.
2023-08-12 10:34:51 +05:30
Steven Liu
714bfed859 [docs] Fix ControlNet SDXL docstring (#4582)
fix
2023-08-11 10:43:40 -07:00
Sayak Paul
d5983a6779 [Examples] fix: network_alpha -> network_alphas (#4572)
network_alpha
2023-08-11 14:18:49 +05:30
Mystfit
796c01534d Fixing repo_id regex validation error on windows platforms (#4358)
* Fixing repo_id regex validation error on windows platforms

* Validating correct URL with prefix is provided

If we are loading a URL then we don't need to use os.path.join and array slicing to split out a repo_id and file path from an absolute filepath. 

We check whether the URL prefix is valid before doing any URL splitting; otherwise we raise a ValueError, since neither a valid filepath nor a valid URL was provided.

* Style fixes

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-08-11 11:11:43 +05:30
Abhipsha Das
c8d86e9f0a Remove code snippets containing is_safetensors_available() (#4521)
* [WIP] Remove code snippets containing `is_safetensors_available()`

* Modifying `import_utils.py`

* update pipeline tests for safetensor default

* fix test related to cached requests

* address import nits

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2023-08-11 11:05:22 +05:30
dotieuthien
b28cd3fba0 Convert Stable Diffusion ControlNet to TensorRT (#4465)
* convert tensorrt controlnet

* Fix code quality

* Fix code quality

* Fix code quality

* Fix code quality

* Fix code quality

* Fix code quality

* Fix number controlnet condition

* Add convert SD XL to onnx

* Add convert SD XL to tensorrt

* Add convert SD XL to tensorrt

* Add examples in comments

* Add examples in comments

* Add test onnx controlnet

* Add tensorrt test

* Remove copied

* Move file test to examples/community

* Remove script

* Remove script

* Remove text

---------

Co-authored-by: dotieuthien <thien.do@mservice.com.vn>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-11 08:12:26 +05:30
Steven Liu
cd7071e750 [docs] Add safetensors flag (#4245)
* add safetensors flag

* apply review
2023-08-10 12:37:23 -07:00
Steven Liu
e31f38b5d6 [docs] Remove attention slicing (#4518)
* remove attention slicing

* apply feedback
2023-08-10 11:00:03 -07:00
Steven Liu
3bd5e073cb [docs] Expand prompt weighting (#4516)
* add more weighting/blend/conjunction

* finish blend/conjunction

* add textual inversion example

* add dreambooth
2023-08-10 10:56:53 -07:00
YiYi Xu
3df52ba8dc [Doc] update sdxl-controlnet repo name (#4564)
* rename

* style

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-08-10 22:02:32 +05:30
Sayak Paul
c697c5ab57 improve controlnet sdxl docs now that we have a good checkpoint. (#4556) 2023-08-10 08:21:36 +05:30
VV-A-VV
3fd45eb10f fix some typo error (#4546)
* fix some typo error

* Undo changes to capitalization
2023-08-10 06:49:25 +05:30
Patrick von Platen
5cbcbe3c63 Revert "introduce minimalistic reimplementation of SDXL on the SDXL doc" (#4548)
Revert "introduce minimalistic reimplementation of SDXL on the SDXL doc (#4532)"

This reverts commit e7e3749498.
2023-08-10 06:49:06 +05:30
Dhruv Nair
7b07f9812a Move controlnet load local tests to nightly (#4543)
move controlnet load local tests to nightly
2023-08-09 23:00:42 +05:30
Steven Liu
16ad13b61d [docs] Clean scheduler api (#4204)
* clean scheduler mixin

* up to dpmsolvermultistep

* finish cleaning

* first draft

* fix overview table

* apply feedback

* update reference code
2023-08-09 09:00:35 -07:00
Dhruv Nair
da0e2fce38 pin ruff version for quality checks (#4539)
* pin ruff version for quality checks

* update dependency versions table

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-08-09 16:46:45 +05:30
Dhruv Nair
a67ff32301 Move slow tests to nightly (#4526)
* move slow pix2pixzero tests to nightly

* move slow panorama tests to nightly

* move txt2video full test to nightly

* clean up

* remove nightly test from text to video pipeline
2023-08-09 12:38:15 +02:00
jere357
3c1b4933bd Changed code that converts tensors to PIL images in the write_your_own_pipeline notebook (#4489)
changed code that converts tensors to PIL images
2023-08-09 15:00:51 +05:30
Sayak Paul
e731ae0ec8 Update README_sdxl.md to include the free-tier Colab Notebook (#4540)
Update README_sdxl.md
2023-08-09 14:32:14 +05:30
Rastislav Švarba
6c5b5b260e Fix push_to_hub in train_text_to_image_lora_sdxl.py example (#4535)
fix: push_to_hub in train text2image lora sdxl
2023-08-09 11:48:24 +05:30
Simo Ryu
e7e3749498 introduce minimalistic reimplementation of SDXL on the SDXL doc (#4532)
minsdxl
2023-08-09 07:33:07 +05:30
Wooyeol Baek
c7c0b57541 Copy lora functions to XLPipelines (#4512)
* add load_lora_weights and save_lora_weights to StableDiffusionXLImg2ImgPipeline

* add load_lora_weights and save_lora_weights to StableDiffusionXLInpaintPipeline

* apply black format

* apply black format

* add copy statement

* fix statements

* fix statements

* fix statements

* run `make fix-copies`
2023-08-08 18:53:22 +05:30
Dhruv Nair
c91272d631 fix indexing issue in sd reference pipeline (#4531) 2023-08-08 15:14:19 +02:00
George He
f0725c5845 Fix misc typos (#4479)
Fix typos
2023-08-07 17:21:19 -07:00
YiYi Xu
aef11cbf66 add pipeline_class_name argument to Stable Diffusion conversion script (#4461)
* add pipeline class

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* style

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-07 06:44:31 -10:00
Dhruv Nair
71c8224159 Moving certain pipelines slow tests to nightly (#4469)
* move audioldm tests to nightly

* move kandinsky im2img ddpm test to nightly

* move flax dpm test to nightly

* move diffedit dpm test to nightly

* move fp16 slow tests to nightly
2023-08-07 17:28:56 +02:00
Patrick von Platen
4367b8a300 move pipeline only when running validation (#4515) 2023-08-07 17:20:18 +02:00
ethansmith2000
f4f854138d grad checkpointing (#4474)
* grad checkpointing

* fix make fix-copies

* fix

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-07 15:26:54 +02:00
Patrick von Platen
e1b5b8ba13 Make sure fp16-fix is used as default (#4510)
* Make sure fp16-fix is used as default

* fix vae

* finish

* fix
2023-08-07 15:16:37 +02:00
Patrick von Platen
dff5ff35a9 [SDXL LoRA] fix batch size lora (#4509)
fix batch size lora
2023-08-07 13:27:13 +02:00
Sayak Paul
b2456717e6 Update lora.md to clarify SDXL support (#4503)
* Update lora.md

* Update lora.md
2023-08-07 11:06:30 +05:30
Vladislav Artemyev
2e69cf16fe Log global_step instead of epoch to tensorboard (#4493)
Co-authored-by: mrlzla <noname@noname.com>
2023-08-07 07:49:39 +05:30
takuoko
9c29bc2df8 [Examples] Support train_text_to_image_lora_sdxl.py (#4365)
* add train_text_to_image_lora_sdxl.py

* add train_text_to_image_lora_sdxl.py

* add test and minor fix

* Update examples/text_to_image/README_sdxl.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* fix unwrap_model rule

* add invisible-watermark in requirements

* del invisible-watermark

* Update examples/text_to_image/README_sdxl.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update examples/text_to_image/README_sdxl.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update examples/text_to_image/train_text_to_image_lora_sdxl.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* del comment & update readme

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-06 13:47:20 +05:30
AisingioroHao
70d098540d Add a data_dir parameter to the load_dataset method. (#4482)
Co-authored-by: AisingioroHao0 <1286098622@qq.com>
2023-08-06 08:45:48 +05:30
Patrick von Platen
ea1fcc28a4 [SDXL] Allow SDXL LoRA to be run with less than 16GB of VRAM (#4470)
* correct

* correct blocks

* finish

* finish

* finish

* Apply suggestions from code review

* fix

* up

* up

* up

* Update examples/dreambooth/README_sdxl.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Apply suggestions from code review

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-08-04 20:06:38 +02:00
Patrick von Platen
66de221409 Update README_sdxl.md (#4472) 2023-08-04 16:23:35 +02:00
Sayak Paul
06f73bd6d1 [Tests] Adds integration tests for SDXL LoRAs (#4462)
* add: integration tests for SDXL LoRAs.

* change pipeline class.

* fix assertion values.

* print values again.

* let's see.

* let's see.

* let's see.

* finish
2023-08-04 16:25:53 +05:30
asfiyab-nvidia
c14c141b86 TensorRT Inpaint pipeline: minor fixes (#4457)
Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
2023-08-04 12:28:28 +02:00
manosplitsis
79ef9e528c Fixed multi-token textual inversion training (#4452)
* added placeholder token concatenation during training

* Update examples/textual_inversion/textual_inversion.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-04 12:21:31 +02:00
Dhruv Nair
801a5e2199 Cleanup Pass on flaky slow tests for Stable Diffusion (#4455)
* lower num inference steps and precision check

* fix flaky inpaint tests

* remove unused imports

* set unet default attn processor
2023-08-04 10:24:56 +02:00
YiYi Xu
1edd0debaa fix-format (#4458)
make style

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-08-03 20:34:37 -07:00
YiYi Xu
29ece0db79 a few fix for kandinsky combined pipeline (#4352)
* add xformer

* enable_sequential_cpu_offload

* style

* Update src/diffusers/pipelines/kandinsky/pipeline_kandinsky_combined.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-08-03 15:10:41 -10:00
Patrick von Platen
1a8843f93e add sdxl to prompt weighting (#4439)
* add sdxl to prompt weighting

* Update docs/source/en/using-diffusers/weighted_prompts.md

* Update docs/source/en/using-diffusers/weighted_prompts.md

* add sdxl to prompt weighting

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Apply suggestions from code review

* Update docs/source/en/using-diffusers/weighted_prompts.md

* Apply suggestions from code review

* correct

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-08-03 21:41:48 +02:00
JinK
e391b789ac Support different strength for Stable Diffusion TensorRT Inpainting pipeline (#4216)
* Support different strength

* run make style
2023-08-03 21:32:44 +02:00
VV-A-VV
d0b8de1262 Delete the duplicate code for the contolnet img 2 img (#4411)
delete the duplicated code from the controlnet-img-2-img pipelines
2023-08-03 20:32:03 +02:00
Yuyang Zhao
b9058754c5 Fix bug caused by typo (#4357)
Fix the typo that causes `omegaconf.errors.ConfigAttributeError: Missing key parms`
2023-08-03 20:27:42 +02:00
Alan Ji
777becda6b fix typo to ensure make test-examples work correctly (#4329)
fix typo to ensure `make test-examples` work correctly
2023-08-03 20:17:48 +02:00
cmdr2
380bfd82c1 Allow controlnets to be loaded (from ckpt) in a parallel thread with a SD model (ckpt), and speed it up slightly (#4298)
Faster controlnet model instantiation, and allow controlnets to be loaded (from ckpt) in a parallel thread with an SD model (ckpt) without tensor errors (race condition)
2023-08-03 20:11:13 +02:00
Steven Liu
5989a85edb [docs] Distilled SD (#4442)
* first draft

* add blog link
2023-08-03 11:03:42 -07:00
Levi McCallum
4188f3063a Add rank argument to train_dreambooth_lora_sdxl.py (#4343)
* Add rank argument to train_dreambooth_lora_sdxl.py

* Update train_dreambooth_lora_sdxl.py
2023-08-03 23:27:30 +05:30
George He
0b4430e840 Fix typerror in pipeline handling for MultiControlNets which only contain a single ControlNet (#4454)
* Handle single controlnet in multicontrolnet

* Fix formatting
2023-08-03 17:37:07 +02:00
Patrick von Platen
a74c995e7d make style 2023-08-03 14:45:06 +00:00
Neil Wang
85aa673bec auto type conversion (#4270)
* type conversion

The default values of `control_guidance_start` and `control_guidance_end` in `StableDiffusionControlNetPipeline.check_inputs` cause `TypeError: object of type 'float' has no len()`

Proposed fix:
Convert `control_guidance_start` and `control_guidance_end` to a list if a float is passed

* Update src/diffusers/pipelines/controlnet/pipeline_controlnet.py

* Update src/diffusers/pipelines/controlnet/pipeline_controlnet.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/controlnet/pipeline_controlnet.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-03 16:44:48 +02:00
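
The proposed fix above amounts to normalizing scalar guidance boundaries into lists so downstream code can call `len()` and `zip()` on them uniformly. A minimal standalone sketch (hypothetical helper, not the actual `check_inputs` implementation):

```python
# Hypothetical standalone helper (not the actual check_inputs code) showing the
# float -> list normalization proposed above, one entry per ControlNet.
def normalize_guidance(control_guidance_start, control_guidance_end, num_controlnets=1):
    if isinstance(control_guidance_start, float):
        control_guidance_start = [control_guidance_start] * num_controlnets
    if isinstance(control_guidance_end, float):
        control_guidance_end = [control_guidance_end] * num_controlnets
    # len() and zip() now work regardless of whether scalars or lists were passed in.
    return list(zip(control_guidance_start, control_guidance_end))


print(normalize_guidance(0.0, 1.0))                     # [(0.0, 1.0)]
print(normalize_guidance(0.0, 1.0, num_controlnets=2))  # [(0.0, 1.0), (0.0, 1.0)]
```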
Dhruv Nair
1d2587bb34 move tests to nightly (#4451)
* move tests to nightly

* clean up code quality issues

* more clean up
2023-08-03 15:25:28 +02:00
Patrick von Platen
372b58108e fix make style 2023-08-03 10:17:00 +00:00
w4ffl35
45171174b8 Prevent online access when desired when using download_from_original_stable_diffusion_ckpt (#4271)
Prevent online access when desired

- Bypass requests with config files option added to download_from_original_stable_diffusion_ckpt
- Adds local_files_only flags to all from_pretrained requests
2023-08-03 12:16:41 +02:00
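
From the user's side, the `local_files_only` flag mentioned above is passed straight through `from_pretrained`; with it set, loading resolves entirely from the local cache and errors out instead of contacting the Hub. A small usage sketch (the checkpoint id is only an example and must already be cached locally):

```python
from diffusers import StableDiffusionPipeline

# With local_files_only=True, from_pretrained resolves everything from the local
# cache and raises an error instead of reaching out to the Hub. The checkpoint id
# below is only an example; any pipeline already present in the cache works.
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5",
    local_files_only=True,
)
```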
cmdr2
4c4fe042a7 Accept pooled_prompt_embeds in the SDXL Controlnet pipeline. Fixes an error if prompt_embeds are passed. (#4309)
* Accept pooled_prompt_embeds in the SDXL Controlnet pipeline. Fixes an error if prompt_embeds are passed.

* Add a test for pooled prompt embeds
2023-08-03 13:05:19 +05:30
Will Berman
47bf8e566c can call encode_prompt with out setting a text encoder instance variable (#4396)
* can call encode_prompt with out setting a text encoder instance variable

* fix
2023-08-02 21:25:30 -07:00
Sayak Paul
18fc40c169 [Feat] add tiny Autoencoder for (almost) instant decoding (#4384)
* add: model implementation of tiny autoencoder.

* add: inits.

* push the latest devs.

* add: conversion script and finish.

* add: scaling factor args.

* debugging

* fix denormalization.

* fix: positional argument.

* handle use_torch_2_0_or_xformers.

* handle post_quant_conv

* handle dtype

* fix: sdxl image processor for tiny ae.

* fix: sdxl image processor for tiny ae.

* unify upcasting logic.

* copied from madness.

* remove trailing whitespace.

* set is_tiny_vae = False

* address PR comments.

* change to AutoencoderTiny

* make act_fn an str throughout

* fix: apply_forward_hook decorator call

* get rid of the special is_tiny_vae flag.

* directly scale the output.

* fix dummies?

* fix: act_fn.

* get rid of the Clamp() layer.

* bring back copied from.

* movement of the blocks to appropriate modules.

* add: docstrings to AutoencoderTiny

* add: documentation.

* changes to the conversion script.

* add doc entry.

* settle tests.

* style

* add one slow test.

* fix

* fix 2

* fix 2

* fix: 4

* fix: 5

* finish integration tests

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* style

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-08-02 23:58:05 +05:30
Xin Kong
615c04db15 [Pipelines] Add community pipeline for Zero123 (#4295)
* add zero123 pipeline to community

* add community doc

* reformat

* update zero123 pipeline, including cc_projection within diffusers; add convert ckpt scripts; support diffusers weights
2023-08-02 19:36:49 +02:00
Steven Liu
ae82a3eb34 [docs] AutoPipeline tutorial (#4273)
* first draft

* tidy api

* apply feedback

* mdx to md

* apply feedback

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-02 10:32:02 -07:00
Sayak Paul
816ca0048f [LoRA] Fix SDXL text encoder LoRAs (#4371)
* temporarily disable text encoder loras.

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging.

* modify doc.

* rename tests.

* print slices.

* fix: assertions

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-08-02 17:00:56 +05:30
Sayak Paul
fef8d2f726 remove mentions of textual inversion from sdxl. (#4404) 2023-08-02 15:29:46 +05:30
Ella Charlaix
579b4b2020 Update documentation (#4422)
* update documentation

* minor
2023-08-02 11:49:22 +02:00
Steven Liu
6c5bd2a38d [docs] Fix SDXL docstring (#4397)
fix guidance scale value
2023-08-01 16:40:15 -07:00
Will Berman
160474ac61 train dreambooth fix pre encode class prompt (#4395) 2023-08-01 12:00:05 -07:00
YiYi Xu
c10861ee1b fix test_float16_inference (#4412)
fix

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-08-01 07:49:50 -10:00
YiYi Xu
94b332c476 support from_single_file for SDXL inpainting (#4408)
fix

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-08-01 07:47:22 -10:00
Dhruv Nair
6f4355f89f Cleanup pass for flaky Slow Tests for Stable diffusion (#4415)
* update expected slice so img2img compile tests pass

* use default attn processor

* use default attn processor and update expected slice value to pass test

* use default attn processor

* set default attn processor and update expected slice

* set default attn processor and change precision for check

* set unet to use default attn processor
2023-08-01 18:21:14 +02:00
estelleafl
05a1cb902c [ldm3d] documentation fixing typos (#4284)
* fixed typo

* updated doc to be consistent in naming

* make style/quality

* preprocessing for 4 channels and not 6

* make style

* test for 4c

* make style/quality

* fixed test on cpu

* fixed doc typo

* changed default ckpt to 4c

* Update pipeline_stable_diffusion_ldm3d.py

---------

Co-authored-by: Aflalo <estellea@isl-iam1.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu33.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu38.rr.intel.com>
2023-08-01 09:03:29 -07:00
Patrick von Platen
c69526a3d5 [AutoPipeline] Correct naming (#4420) 2023-08-01 14:56:27 +02:00
Nishant Rajadhyaksha
6c49d542a3 Update docs of unet_1d.py (#4394)
Update unet_1d.py

highlighting the way the modules are actually fed in the main code, as the order matters: some blocks attach time embeds whilst others do not
2023-07-31 11:04:47 -07:00
Sayak Paul
ba43ce3476 minor doc fixes. (#4380) 2023-07-31 12:15:56 +05:30
Andrey Voroshilov
ea5b0575f8 Clean up duplicate lines in encode_prompt (#4369)
* Clean up duplicate line

* Clean up duplicate lines

* Clean up duplicate line
2023-07-30 15:49:46 +05:30
Patrick von Platen
4f986fb28a [SDXL] Fix dummy imports incorrect naming (#4370)
[SDXL] Fix dummy imports
2023-07-30 12:17:38 +02:00
Harutatsu Akiyama
aae27262f4 [SDXL-IP2P] Add gif for demonstrating training processes (#4342)
* [SDXL-IP2P] Add gif for demonstrating training processes

* [SDXL-IP2P] Add gif for demonstrating training processes

* [SDXL-IP2P] Change gif to URLs

* [SDXL-IP2P] Add URLs in case gif now show

---------

Co-authored-by: Harutatsu Akiyama <kf.zy.qin@gmail.com>
2023-07-30 10:07:10 +05:30
Sayak Paul
34b5b63bb8 Update README.md to have PyPI-friendly path (#4351) 2023-07-29 08:59:18 +05:30
Will Berman
2b1786735e fix fp type in t2i adapter docs (#4350) 2023-07-28 13:01:52 -07:00
Sayak Paul
4a4cdd6b07 [Feat] Support SDXL Kohya-style LoRA (#4287)
* sdxl lora changes.

* better name replacement.

* better replacement.

* debugging

* debugging

* debugging

* debugging

* debugging

* remove print.

* print state dict keys.

* print

* distinguish better

* debuggable.

* fix: tests

* fix: arg from training script.

* access from class.

* run style

* debug

* save intermediate

* some simplifications for SDXL LoRA

* styling

* unet config is not needed in diffusers format.

* fix: dynamic SGM block mapping for SDXL kohya loras (#4322)

* Use lora compatible layers for linear proj_in/proj_out (#4323)

* improve condition for using the sgm_diffusers mapping

* informative comment.

* load compatible keys and embedding layer mapping.

* Get SDXL 1.0 example lora to load

* simplify

* specif ranks and hidden sizes.

* better handling of k rank and hidden

* debug

* debug

* debug

* debug

* debug

* fix: alpha keys

* add check for handling LoRAAttnAddedKVProcessor

* sanity comment

* modifications for text encoder SDXL

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* up

* up

* up

* up

* up

* up

* unneeded comments.

* unneeded comments.

* kwargs for the other attention processors.

* kwargs for the other attention processors.

* debugging

* debugging

* debugging

* debugging

* improve

* debugging

* debugging

* more print

* Fix alphas

* debugging

* debugging

* debugging

* debugging

* debugging

* debugging

* clean up

* clean up.

* debugging

* fix: text

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Batuhan Taskaya <batuhan@python.org>
2023-07-28 19:49:49 +02:00
Patrick von Platen
b7b6d6138d [SDXL] Make watermarker optional under certain circumstances to improve usability of SDXL 1.0 (#4346)
* improve sdxl

* more fixes

* improve sdxl

* improve sdxl

* improve sdxl

* finish
2023-07-28 19:29:22 +02:00
kathath
faa6cbc959 Fix repeat of negative prompt (#4335)
fix repeat of negative prompt
2023-07-28 18:14:22 +02:00
Patrick von Platen
306a7bd047 [ONNX] Don't download ONNX model by default (#4338)
* [Download] Don't download ONNX weights by default

* [Download] Don't download ONNX weights by default

* [Download] Don't download ONNX weights by default

* fix more

* finish

* finish

* finish
2023-07-28 14:02:48 +02:00
Tanupriya Singh
c7250f2b8a correct doc string for default value of guidance_scale (#4339) 2023-07-28 13:54:28 +02:00
Patrick von Platen
18b018c864 [SDXL Refiner] Fix refiner forward pass for batched input (#4327)
* fix_batch_xl

* Fix other pipelines as well

* up

* up

* Update tests/pipelines/stable_diffusion_xl/test_stable_diffusion_xl_inpaint.py

* sort

* up

* Finish it all up Co-authored-by: Bagheera <bghira@users.github.com>

* Co-authored-by: Bagheera bghira@users.github.com

* Co-authored-by: Bagheera <bghira@users.github.com>

* Finish it all up Co-authored-by: Bagheera <bghira@users.github.com>
2023-07-28 12:34:18 +02:00
Sayak Paul
54fab2cd5f Update README_sdxl.md to correct the header (#4330)
Update README_sdxl.md
2023-07-28 09:22:14 +05:30
Sayak Paul
961173064d Honor the SDXL 1.0 licensing from the training scripts. (#4319)
* honor the original license.

* train_instruct_pix2pix_xl -> train_instruct_pix2pix_sdxl
2023-07-28 01:28:36 +05:30
Sayak Paul
7d0d073261 [Tests] add test for pipeline import. (#4276)
* add test for pipeline import.

* Update tests/others/test_dependencies.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* address suggestions

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-28 00:08:15 +05:30
Xinyang Li
01b6ec21fa fix validation option for dreambooth training example (#4317) 2023-07-27 09:58:52 -07:00
Ella Charlaix
92e5ddd295 Fix typo documentation (#4320)
fix typo documentation
2023-07-27 21:31:58 +05:30
Patrick von Platen
1926331eaf [Local loading] Correct bug with local files only (#4318)
* [Local loading] Correct bug with local files only

* file not found error

* fix

* finish
2023-07-27 16:16:46 +02:00
YiYi Xu
5fd3dca5f3 fix a bug in StableDiffusionUpscalePipeline when prompt is None (#4278)
* fix batch_size

* add test

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-07-27 15:07:50 +02:00
Duong A. Nguyen
a2091b7071 Fix SDXL conversion from original to diffusers (#4280)
* fix sdxl conversion

* convention
2023-07-27 15:05:43 +02:00
Patrick von Platen
d8bc1a4e51 [Torch.compile] Fixes torch compile graph break (#4315)
* fix torch compile

* Fix all

* make style
2023-07-27 13:53:36 +02:00
YiYi Xu
80c10d8245 update Kandinsky doc (#4301)
* update doc

* fix an error in autopipe doc

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-07-27 13:10:41 +02:00
Patrick von Platen
20e92586c1 0.20.0dev0 (#4299)
* 0.20.0dev0

* make style
2023-07-26 23:06:18 +02:00
Patrick von Platen
5623ea065a quick fix 2023-07-26 21:01:17 +02:00
Patrick von Platen
16049caf79 quick fix 2023-07-26 18:47:21 +00:00
Patrick von Platen
6a6dfe1cbd Rename (#4294)
* up

* Apply suggestions from code review

* Apply suggestions from code review

* up
2023-07-26 20:41:21 +02:00
Ella Charlaix
b83bdce42a add openvino and onnx runtime SD XL documentation (#4285)
* add openvino SD XL documentation

* add onnx SD XL integration

* rephrase

* update doc

* add images

* update model
2023-07-26 20:25:07 +02:00
camenduru
c6ae9b7df6 Where did this 'x' come from, Elon? (#4277)
* why mdx?

* why mdx?

* why mdx?

* no x for kandinksy either

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-26 18:18:14 +02:00
Patrick von Platen
b3e5cd6b4d [Kandinsky] Add combined pipelines / Fix cpu model offload / Fix inpainting (#4207)
* Add combined pipeline

* Download readme

* Upload

* up

* up

* fix final

* Add enable model cpu offload kandinsky

* finish

* finish

* Fix

* fix more

* make style

* fix kandinsky mask

* fix inpainting test

* add callbacks

* add tests

* fix tests

* Apply suggestions from code review

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* docs

* docs

* correct docs

* fix tests

* add warning

* correct docs

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2023-07-26 17:13:55 +02:00
Patrick von Platen
b37dc3b3cd Fix all missing optional import statements from pipeline folders (#4272)
* fix circular import

* fix imports when watermark not specified

* fix all pipelines
2023-07-26 01:46:05 +02:00
Batuhan Taskaya
ff8f58086b Load Kohya-ss style LoRAs with auxiliary states (#4147)
* Support to load Kohya-ss style LoRA file format (without restrictions)

Co-Authored-By: Takuma Mori <takuma104@gmail.com>
Co-Authored-By: Sayak Paul <spsayakpaul@gmail.com>

* tmp: add sdxl to mlp_modules

---------

Co-authored-by: Takuma Mori <takuma104@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-07-26 00:24:19 +02:00
Sayak Paul
161449d51a [SDXL DreamBooth LoRA] multiple fixes (#4262)
* add automatic licensing.

* debugging

* debugging

* more debugging

* more debugging.

* run make fix-copies.

* change to default tracker.
2023-07-25 21:10:01 +02:00
Steven Liu
34abee0907 [docs] Fix image in SDXL docs (#4267)
fix image link
2023-07-25 09:41:11 -07:00
Harutatsu Akiyama
428dbfecd9 [SDXL and IP2P]: instruction pix2pix XL training and pipeline (#4079)
* Support instruction pix2pix sdxl

* [Community] Implementation of the IADB community pipeline (#3996)

* community pipeline: implementation of iadb

* iadb.py: reformat using black

* iadb.py: linting update

* add kandinsky to readme table (#4081)

Co-authored-by: yiyixuxu <yixu310@gmail,com>

* [From Single File] Force accelerate to be installed (#4078)

force accelerate to be installed

* Support instruction pix2pix sdxl

* Clean up IP2P SDXL code

* [IP2P and SDXL] clean up code

* [IP2P SDXL] Address code reviews

* [IP2P SDXL] Address code reviews, add docs, tests

* [IP2P SDXL] Add README_SDXL

* [IP2P SDXL] Address code reviews

* [IP2P SDXL] Fix the copy problems

* [IP2P SDXL] Add license

* [IP2P SDXL] Address code review for selecting VAE and others

* [IP2P SDXL] Update README_sdxl

* [IP2P SDXL] Update __init__

* [IP2P SDXL] Update dummy_torch_and_transformers_and_invisible_watermark_objects

* address patrick's comments and some additions to readmes.

---------

Co-authored-by: Harutatsu Akiyama <kf.zy.qin@gmail.com>
Co-authored-by: Thomas Chambon <36728882+tchambon@users.noreply.github.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-07-25 18:19:35 +05:30
Ragnar Rova
4e2a021829 Model path for sdxl wrong in dreambooth README (#4261) 2023-07-25 18:06:50 +05:30
Patrick von Platen
ebfe343149 [from_single_file] Fix circular import (#4259)
* up

* finish

* fix final
2023-07-25 14:30:39 +02:00
Sayak Paul
5ef6b8fa53 Update README_sdxl.md to change the note on default hyperparameters (#4258) 2023-07-25 16:57:48 +05:30
YiYi Xu
c11d11d63d [draft v2] AutoPipeline (#4138)
* initial

* style

* from ...pipelines -> from ..pipeline_util

* make style

* fix-copies

* fix value_guided_sampling oops

* style

* add test

* Show failing test

* update from_pipe

* fix

* add controlnet, additional test and register unused original config

* update for controlnet

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* store unused config as private attribute and pass if can

* add doc

* kandinsky inpaint pipeline does not work with decoder checkpoint

* update doc

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* style

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix

* Apply suggestions from code review

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-07-25 13:20:35 +02:00
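A minimal sketch of how the AutoPipeline API introduced above is typically used; the checkpoint id and prompts are illustrative, not taken from the PR:

    import torch
    from diffusers import AutoPipelineForText2Image, AutoPipelineForImage2Image

    # AutoPipelineForText2Image resolves the concrete pipeline class from the checkpoint's config.
    pipe = AutoPipelineForText2Image.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")
    image = pipe("an astronaut riding a horse").images[0]

    # from_pipe reuses the already-loaded components for a different task instead of re-downloading them.
    img2img = AutoPipelineForImage2Image.from_pipe(pipe)
    image = img2img("turn it into a watercolor painting", image=image).images[0]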
Patrick von Platen
d74561da2c [SDXL] Improve docs (#4196)
* Improve docs

* Correct docs

* Add better example inpaint

* make style

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* fix

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-07-25 12:48:25 +02:00
Patrick von Platen
a0422ed0c9 [From Single File] Allow vae to be loaded (#4242)
* Allow vae to be loaded

* up
2023-07-25 12:16:43 +02:00
Will Berman
3dd339379d do not pass list to accelerator.init_trackers (#4248) 2023-07-24 21:10:37 -07:00
nupurkmr9
5652c43f83 Resolve bf16 error as mentioned in this [issue](https://github.com/huggingface/diffusers/issues/4139#issuecomment-1639977304) (#4214)
* resolve bf16 error
2023-07-25 05:41:19 +05:30
Sayak Paul
365e8461ac [SDXL DreamBooth LoRA] add support for text encoder fine-tuning (#4097)
* Allow low precision sd xl

* finish

* finish

* feat: initial draft for supporting text encoder lora finetuning for SDXL DreamBooth

* fix: variable assignments.

* add: autocast block.

* add debugging

* vae dtype hell

* fix: vae dtype hell.

* fix: vae dtype hell 3.

* clean up

* lora text encoder loader.

* fix: unwrapping models.

* add: tests.

* docs.

* handle unexpected keys.

* fix vae dtype in the final inference.

* fix scope problem.

* fix: save_model_card args.

* initialize: prefix to None.

* fix: dtype issues.

* apply fixes.

* debugging.

* debugging

* add: fast tests.

* pre-tokenize.

* address: will's comments.

* fix: loader and tests.

* fix: dataloader.

* simplify dataloader.

* length.

* simplification.

* make style && make quality

* simplify state_dict munging

* fix: tests.

* fix: state_dict packing.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-25 05:35:48 +05:30
Sayak Paul
fed12376c5 [ControlNet SDXL training] fixes in the training script (#4223)
* fix: #4206

* add: sdxl controlnet training smoketest.

* remove unnecessary token inits.

* add: licensing to model card.

* include SDXL licensing in the model card and make public visibility default

* debugging

* debugging

* disable local file download.

* fix: training test.

* fix: ckpt prefix.
2023-07-25 05:31:48 +05:30
Patrick von Platen
95b7de88fd [Docs] Fix from pretrained docs (#4240)
* [Docs] Fix from pretrained docs

* [Docs] Fix from pretrained docs
2023-07-24 20:24:29 +02:00
Apoorva Kulkarni
cbb1ead60b docs: Add missing import statement in textual_inversion inference example (#4227)
docs: Add missing import statement in textual_inversion inference instructions
2023-07-24 11:07:53 -07:00
Steven Liu
5470a4fce3 [docs] Other modalities (#4205)
remove coming soon, rl pipeline
2023-07-24 10:51:24 -07:00
39th president of the United States, probably
e98fabc550 Allow specifying denoising_start and denoising_end as integers representing the discrete timesteps, fixing the XL ensemble not working for many schedulers (#4115)
* Fix the XL ensemble not working for any Karras scheduler sigmas and having an off-by-one bug

* Update src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py

* make style

---------

Co-authored-by: Jimmy <39@🇺🇸.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-24 19:44:35 +02:00
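A rough sketch of the base/refiner hand-off that denoising_end and denoising_start control, shown here with the fractional form (this PR additionally allows passing discrete timesteps as integers); checkpoint ids, prompt, and values are illustrative:

    import torch
    from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

    base = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")
    refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16
    ).to("cuda")

    prompt = "a majestic lion jumping from a big stone at night"
    # The base model denoises the first 80% of the schedule and returns latents ...
    latents = base(prompt=prompt, num_inference_steps=40, denoising_end=0.8, output_type="latent").images
    # ... and the refiner resumes from the same point to finish the remaining 20%.
    image = refiner(prompt=prompt, num_inference_steps=40, denoising_start=0.8, image=latents).images[0]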
Cris
fa356bd4da [docs] Changed path for ControlNet in docs (#4215)
docs: changed path for control net
2023-07-24 10:13:10 -07:00
Patrick von Platen
3ba36f97b8 [SD-XL] Fix sdxl controlnet inference (#4238)
* Fix controlnet xl inference

* correct some sd xl control inference
2023-07-24 18:43:35 +02:00
Patrick von Platen
b288684d25 [SDXL] Fix sd xl encode prompt (#4237)
* [SDXL] Fix sd xl encode prompt

* add tests
2023-07-24 18:37:07 +02:00
Lucain
06eda5b232 Raise initial HTTPError if pipeline is not cached locally (#4230)
* Raise initial HTTPError if pipeline is not cached locally

* make style
2023-07-24 15:35:16 +02:00
Hu Ye
8e5921cac1 fix a bug of prompt embeds in sdxl (#4099)
* fix bug in sdxl

* Update pipeline_stable_diffusion_xl_img2img.py

* Update pipeline_stable_diffusion_xl.py

* Update pipeline_stable_diffusion_xl_img2img.py

* Update pipeline_stable_diffusion_xl_inpaint.py

* Update pipeline_stable_diffusion_xl.py

* Update pipeline_stable_diffusion_xl_img2img.py

* Update pipeline_stable_diffusion_xl_inpaint.py

* Update pipeline_stable_diffusion_xl_img2img.py

* Update pipeline_controlnet_sd_xl.py

* Update pipeline_controlnet_sd_xl.py

* Update pipeline_stable_diffusion_xl.py

* Update pipeline_stable_diffusion_xl_img2img.py

* Update pipeline_stable_diffusion_xl_inpaint.py

* Update test_stable_diffusion_xl.py

* Update test_stable_diffusion_xl.py

* Update test_stable_diffusion_xl.py

add test on prompt_embeds

* add test on prompt_embeds

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-24 10:14:20 +02:00
YiYi Xu
8e8954bd15 fix no CFG for kandinsky pipelines (#4193)
* fix bug when no cfg

* style

* fix no cfg for shap-e and cycle

* style

* fix no cfg for sdxl

* fix copies

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-07-24 10:10:14 +02:00
Jackmin801
09fab5610e [fix] network_alpha when loading unet lora from old format (#4221)
fix: missed network_alpha when loading lora from old format
2023-07-24 06:13:32 +05:30
Apoorva Kulkarni
2e53936c97 docs: Typo in dreambooth example README.md (#4203)
fix: Typo in dreambooth example README.md
2023-07-21 15:16:38 -07:00
Steven Liu
a69754bb87 [docs] Clean up pipeline apis (#3905)
* start with stable diffusion

* fix

* finish stable diffusion pipelines

* fix path to pipeline output

* fix flax paths

* fix copies

* add up to score sde ve

* finish first pass of pipelines

* fix copies

* second review

* align doc titles

* more review fixes

* final review
2023-07-21 11:01:34 -07:00
Kadir Nar
bcc570b910 📄 Renamed File for Better Understanding (#4056)
* 📄 Renamed File for Better Understanding

Renamed the 'rl' file to 'run_locomotion'. This change was made to improve the clarity and readability of the codebase. The 'rl' name was ambiguous, and 'run_locomotion' provides a more clear description of the file's purpose.

Thanks 🙌

* 📁 [Docs] Renamed Directory for Better Clarity

Renamed the 'rl' directory to 'reinforcement_learning'. This change provides a clearer understanding of the directory's purpose and its contents.

* Update examples/reinforcement_learning/README.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* 📝 Update README

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-21 09:08:27 -07:00
Sayak Paul
4dcab9227a [SDXL ControlNet Training] Follow-up fixes (#4188)
* hash computation. thanks to @lhoestq

* disable dtype casting.

* remove comments.
2023-07-21 20:55:33 +05:30
apolinário
aed30dff6b Allow passing different prompts to each text_encoder on stable_diffusion_xl pipelines (#4156)
* sdxl prompt2

* Improve checks

* doc linting

* whoops

* remove cat

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add other pipelines and tests

* Add multi-prompting to docs

* doc and copies check

* Fix copied froms

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Bring back the original code for unrelated files

* Fix tests

* Fix img2img

* Fix all

* fix

---------

Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-07-21 14:50:22 +02:00
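A minimal sketch of the dual-prompt usage this PR enables: prompt feeds the first text encoder and prompt_2 the second. The checkpoint id and prompts are illustrative:

    import torch
    from diffusers import StableDiffusionXLPipeline

    pipe = StableDiffusionXLPipeline.from_pretrained(
        "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
    ).to("cuda")

    # Each prompt is routed to a different text encoder, so content and style can be split.
    image = pipe(
        prompt="a photo of an old warehouse, interior",         # first text encoder
        prompt_2="cinematic lighting, van gogh color palette",  # second text encoder
    ).images[0]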
statelesshz
e2bbaa4f54 make enable_sequential_cpu_offload more generic for third-party devices (#4191)
* make enable_sequential_cpu_offload more generic for third-party devices

* make style
2023-07-21 17:15:09 +05:30
Patrick von Platen
1e853e240e [Safetensors] make safetensors a required dep (#4177) 2023-07-21 12:57:30 +02:00
Batuhan Taskaya
ad787082e2 Fix unloading of LoRAs when xformers attention procs are in use (#4179) 2023-07-21 14:29:20 +05:30
Will Berman
7a47df22a5 remove bentoml doc in favor of blogpost (#4182) 2023-07-21 08:23:36 +05:30
Patrick von Platen
d620070bb3 [ControlNet Training] Remove safety from controlnet (#4180)
Remove safety from controlnet
2023-07-21 08:03:59 +05:30
Patrick von Platen
b7a6e34cc6 [From single file] Make sure that controlnet stays False for from_single_file (#4181)
* fix from single file

* Make sure conversion always works with safetensors
2023-07-21 02:09:11 +02:00
YiYi Xu
47b3346422 Shap-E: add support for mesh output (#4062)
* add output_type=mesh

* update img2img

* make style

* add doc

* make style

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add docstring for output_type

* add a section in doc about hub mesh visualization/ rotation

* update conversion script so default background is white

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update src/diffusers/pipelines/shap_e/pipeline_shap_e_img2img.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* renderer -> shap_e_renderer

* img2img renderer -> shap_e_renderer

* fix tests

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-07-20 18:05:13 +02:00
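A short sketch of the new mesh output path, assuming the openai/shap-e checkpoint and the export_to_ply helper in diffusers.utils; the prompt and file name are illustrative:

    import torch
    from diffusers import ShapEPipeline
    from diffusers.utils import export_to_ply

    pipe = ShapEPipeline.from_pretrained("openai/shap-e", torch_dtype=torch.float16).to("cuda")

    # output_type="mesh" returns meshes instead of rendered frames.
    mesh = pipe("a birthday cupcake", guidance_scale=15.0, output_type="mesh").images[0]
    export_to_ply(mesh, "cupcake.ply")  # write the mesh to a .ply file for visualization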
Ruslan Vorovchenko
07f1fbb18e Asymmetric vqgan (#3956)
* added AsymmetricAutoencoderKL

* fixed copies+dummy

* added script to convert original asymmetric vqgan

* added docs

* updated docs

* fixed style

* fixes, added tests

* update doc

* fixed doc

* fixed tests

* naming

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* naming

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* updated code example

* updated doc

* comments fixes

* added docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* comments fixes

* added inpaint pipeline tests

* comment suggestion: delete method

* yet another fixes

---------

Co-authored-by: Ruslan Vorovchenko <r.vorovchenko@prequelapp.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-20 17:51:06 +02:00
Lim Swee Kiat
2551b73670 Fix bug in ControlNetPipelines with MultiControlNetModel of length 1 (#4032)
* Fix bug in ControlNetPipelines with MultiControlNetModel of length 1

* Add tests for varying number of ControlNet models

* Fix missing indexing for control_guidance_start and control_guidance_end

* Fix code quality

* Separate test for MultiControlNet with one model

* Revert formatting of earlier test
2023-07-20 17:45:08 +02:00
Allen Zhang
930c8fdcb7 fix incorrect attention head dimension in AttnProcessor2_0 (#4154)
fix inner_dim
2023-07-20 17:40:50 +02:00
Patrick von Platen
6b1abba18d Add controlnet and vae from single file (#4084)
* Add controlnet from single file

* Updates

* make style

* finish

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-07-19 14:50:27 +02:00
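A hedged sketch of the single-file loading path added here; the local .safetensors paths are placeholders, not files referenced by the PR:

    from diffusers import ControlNetModel, StableDiffusionControlNetPipeline

    # Load a ControlNet from an original-format checkpoint file rather than a diffusers folder.
    controlnet = ControlNetModel.from_single_file("./control_v11p_sd15_canny.safetensors")  # placeholder path

    # The pipeline itself can also be built from a single original checkpoint.
    pipe = StableDiffusionControlNetPipeline.from_single_file(
        "./v1-5-pruned-emaonly.safetensors",  # placeholder path
        controlnet=controlnet,
    )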
Saurav Maheshkar
470f51cd26 feat: add act_fn param to OutValueFunctionBlock (#3994)
* feat: add act_fn param to OutValueFunctionBlock

* feat: update unet1d tests to not use mish

* feat: add `mish` as the default activation function

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* feat: drop mish tests from unet1d

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-19 14:44:44 +02:00
Patrick von Platen
b7e35dc782 make style 2023-07-19 14:28:35 +02:00
Byron Mallett
c77ac246c1 Fixed SDXL single file loading to use the correct requested pipeline class (#4142)
Using pipeline_class argument to instantiate the correct pipeline when loading SDXL models from single files
2023-07-19 14:40:13 +02:00
Zhao Shenyang
ed2a3584ab Docs/bentoml integration (#4090)
* docs: first draft of BentoML integration

* Update the diffusers doc

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* add BentoML integration guide under Optimization section

* restyle codes

---------

Co-authored-by: Sherlock113 <sherlockxu07@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-07-18 11:56:13 -07:00
Sayak Paul
3eb498e7b4 [Core] add: controlnet support for SDXL (#4038)
* add: controlnet sdxl.

* modifications to controlnet.

* run styling.

* add: __init__.pys

* incorporate https://github.com/huggingface/diffusers/pull/4019 changes.

* run make fix-copies.

* resize the conditioning images.

* remove autocast.

* run styling.

* disable autocast.

* debugging

* device placement.

* back to autocast.

* remove comment.

* save some memory by reusing the vae and unet in the pipeline.

* apply styling.

* Allow low precision sd xl

* finish

* finish

* changes to accommodate the improved VAE.

* modifications to how we handle vae encoding in the training.

* make style

* make existing controlnet fast tests pass.

* change vae checkpoint cli arg.

* fix: vae pretrained paths.

* fix: steps in get_scheduler().

* debugging.

* debugging.

* fix: weight conversion.

* add: docs.

* add: limited tests.

* add: datasets to the requirements.

* update docstrings and incorporate the usage of watermarking.

* incorporate fix from #4083

* fix watermarking dependency handling.

* run make-fix-copies.

* Empty-Commit

* Update requirements_sdxl.txt

* remove vae upcasting part.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* run make style

* run make fix-copies.

* disable support for multicontrolnet.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* run make fix-copies.

* style.

* fix-copies.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-18 18:25:34 +05:30
clarencechen
c6e56e92ed Add Recent Timestep Scheduling Improvements to DDIM Inverse Scheduler (#3865)
* Add Recent Timestep Scheduling Improvements to DDIM Inverse Scheduler

Roll timesteps by one to reflect origin-destination semantic discrepancy

Restore `set_alpha_to_one` option to handle negative initial timesteps

Remove `set_alpha_to_zero` option not used due to previous truncation

* Bugfix

* Remove unnecessary calls to `detach()`

Use `self.image_processor.preprocess` in DiffEdit pipeline functions

* Preprocess list input for inverted image latents in diffedit pipeline

* Add `timestep_spacing` and `steps_offset` to `DPMSolverMultistepInverseScheduler`

* Update expected test results to account for inverting last forward diffusion step

* Fix inversion progress bar bug

* Add first draft for proper fast tests for DDIMInverseScheduler

* Add deprecated DDIMInverseScheduler kwarg to ConfigMixer registry

* Fix test failure in DPMMultistepInverseScheduler

Invert step specification leads to negative noise variance in SDE-based algs

Add first draft for proper fast tests for DPMMultistepInverseScheduler

* Update expected test results to account for inverting last forward diffusion step

Clean up diffedit fast test
2023-07-18 11:35:16 +02:00
Patrick von Platen
27062c3631 Refactor execution device & cpu offload (#4114)
* create general cpu offload & execution device

* Remove boiler plate

* finish

* kp

* Correct offload more pipelines

* up

* Update src/diffusers/pipelines/pipeline_utils.py

* make style

* up
2023-07-18 11:04:40 +02:00
takuoko
6427aa995e [Enhance] Add rank in dreambooth (#4112)
add rank in dreambooth
2023-07-18 11:30:06 +05:30
Seongsu Park
8b18cd8e7f [Docs] Korean translation update (#4022)
* feat) optimization kr translation

* fix) typo, italic setting

* feat) dreambooth, text2image kr

* feat) lora kr

* fix) LoRA

* fix) fp16 fix

* fix) doc-builder style

* fix) revised some wording in the fp16 doc

* fix) fp16 style fix

* fix) opt, training docs update

* merge conflict

* Fix community pipelines (#3266)

* Allow disabling torch 2_0 attention (#3273)

* Allow disabling torch 2_0 attention

* make style

* Update src/diffusers/models/attention.py

* Release: v0.16.1

* feat) toctree update

* feat) toctree update

* Fix custom releases (#3708)

* Fix custom releases

* make style

* Fix loading if unexpected keys are present (#3720)

* Fix loading

* make style

* Release: v0.17.0

* opt_overview

* commit

* Create pipeline_overview.mdx

* unconditional_image_generatoin_1stDraft

*  Add translation for write_own_pipeline.mdx

* conditional translated literally; "unconditional" transliterated

* unconditional_image_generation first draft

* revise

* Update pipeline_overview.mdx

* revise-2

* ♻️ translation fixed for write_own_pipeline.mdx

* complete translate basic_training.mdx

* other-formats.mdx translation complete

* fix tutorials/basic_training.mdx

* revise other-formats

* inpaint Korean translation

* depth2img translation

* translate training/adapt-a-model.mdx

* revised_all

* feedback taken

* using_safetensors.mdx_first_draft

* custom_pipeline_examples.mdx_first_draft

* img2img Korean translation complete

* tutorial_overview edit

* reusing_seeds

* torch2.0

* translate complete

* fix) applied the terminology consistency guidelines

* [fix] refined the translation based on feedback

* fixed typos + corrected parts that violated conventions

* typo, style fix

* toctree update

* copyright fix

* toctree fix

* Update _toctree.yml

---------

Co-authored-by: Chanran Kim <seriousran@gmail.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Lee, Hongkyu <75282888+howsmyanimeprofilepicture@users.noreply.github.com>
Co-authored-by: hyeminan <adios9709@gmail.com>
Co-authored-by: movie5 <oyh5800@naver.com>
Co-authored-by: idra79haza <idra79haza@github.com>
Co-authored-by: Jihwan Kim <cuchoco@naver.com>
Co-authored-by: jungwoo <boonkoonheart@gmail.com>
Co-authored-by: jjuun0 <jh061993@gmail.com>
Co-authored-by: szjung-test <93111772+szjung-test@users.noreply.github.com>
Co-authored-by: idra79haza <37795618+idra79haza@users.noreply.github.com>
Co-authored-by: howsmyanimeprofilepicture <howsmyanimeprofilepicture@gmail.com>
Co-authored-by: hoswmyanimeprofilepicture <hoswmyanimeprofilepicture@gmail.com>
2023-07-17 18:28:08 -07:00
Will Berman
a0597f33ac t2i pipeline (#3932)
* Quick implementation of t2i-adapter

Load adapter module with from_pretrained

Prototyping generalized adapter framework

Writeup doc string for sideload framework(WIP) + some minor update on implementation

Update adapter models

Remove old adapter optional args in UNet

Add StableDiffusionAdapterPipeline unit test

Handle cpu offload in StableDiffusionAdapterPipeline

Auto correct coding style

Update model repo name to "RzZ/sd-v1-4-adapter-pipeline"

Refactor MultiAdapter to better compatible with config system

Export MultiAdapter

Create pipeline document template from controlnet

Create dummy objects

Supporting new AdapterLight model

Fix StableDiffusionAdapterPipeline common pipeline test

[WIP] Update adapter pipeline document

Handle num_inference_steps in StableDiffusionAdapterPipeline

Update definition of Adapter "channels_in"

Update documents

Apply code style

Fix doc typo and merge error

Update doc string and example

Quality of life improvement

Remove redundant code and file from prototyping

Remove unused package

Remove comments

Fix title

Fix typo

Add conditioning scale arg

Bring back old implementation

Offload sideload

Add supply info on document

Update src/diffusers/models/adapter.py

Co-authored-by: Will Berman <wlbberman@gmail.com>

Update MultiAdapter constructor

Swap out custom checkpoint and update pipeline constructor

Update document

Apply suggestions from code review

Co-authored-by: Will Berman <wlbberman@gmail.com>

Correcting style

Following single-file policy

Update auto size in image preprocess func

Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_adapter.py

Co-authored-by: Will Berman <wlbberman@gmail.com>

fix copies

Update adapter pipeline behavior

Add adapter_conditioning_scale doc string

Add the missing doc string

Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Fix few bugs from suggestion

Handle L-mode PIL image as control image

Rename to differentiate adapter resblock

Update src/diffusers/models/adapter.py

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Fix typo

Update adapter parameter name

Update test case and code style

Fix copies

Fix typo

Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_adapter.py

Co-authored-by: Will Berman <wlbberman@gmail.com>

Update Adapter class name

Add checkpoint converting script

Fix style

Fix-copies

Remove dev script

Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

Updates for parameter rename

Fix convert_adapter

remove main

fix diff

more

refactoring

more

more

small fixes

refactor

tests

more slow tests

more tests

Update docs/source/en/api/pipelines/overview.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

add community contributor to docs

Update docs/source/en/api/pipelines/stable_diffusion/adapter.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Update docs/source/en/api/pipelines/stable_diffusion/adapter.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Update docs/source/en/api/pipelines/stable_diffusion/adapter.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Update docs/source/en/api/pipelines/stable_diffusion/adapter.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

Update docs/source/en/api/pipelines/stable_diffusion/adapter.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

fix

remove from_adapters

license

paper link

docs

more url fixes

more docs

fix

fixes

fix

fix

* fix sample inplace add

* additional_kwargs -> additional_residuals

* move t2i adapter pipeline to own module

* preprocess -> _preprocess_adapter_image

* add TencentArc to license

* fix example code links

* add image converter and fix example doc string

* fix links

* clearer additional residual application

---------

Co-authored-by: HimariO <dsfhe49854@gmail.com>
2023-07-17 12:55:44 -07:00
Kadir Nar
3929954613 📝 Update doc with more descriptive title and filename for "IF" section (#4049)
* 📝 Update doc with more descriptive title and filename for "IF" section

Updated the documentation to provide a more descriptive title and filename for the "IF" section. Previously, having only "IF" as the title did not convey a clear meaning. By renaming the section to "DeepFloyd IF," we provide users with a more informative and context-specific heading.

Thanks! 🙌

* 📝 Update name for "IF" section in README

Updated the link and name for the "IF" section in the README file to reflect the new heading "DeepFloyd IF."

* 📝 Fix broken link for "Instruct Pix2Pix" section in README

Fixed the broken link for the "Instruct Pix2Pix" section in the README file. Previously, the link was pointing to an incorrect location due to the presence of "stable_diffusion" in the URL. By removing "stable_diffusion" from the URL, I have corrected the error and ensured that users are directed to the correct section.

* 🔧💼 Updated parameters in _toctree.yml file

- ✏️ Updated 'local' parameter to 'api/pipelines/deepfloyd_if'.
- ✏️ Updated 'title' parameter to 'DeepFloyd IF'.

🎯 These changes aim to improve visibility and accessibility in the documentation of the DeepFloyd IF pipeline. 🚀📚
2023-07-17 09:25:37 -07:00
Apoorva Pandey
f6ce323633 Make setup.py compatible with pipenv (#4121) 2023-07-17 17:31:04 +02:00
edward zhu
6b33c11c5b add noise_sampler_seed to StableDiffusionKDiffusionPipeline.__call__ (#3911)
* add noise_sampler to StableDiffusionKDiffusionPipeline

* fix/docs: Fix the broken doc links (#3897)

* fix/docs: Fix the broken doc links

Signed-off-by: GitHub <noreply@github.com>

* Update docs/source/en/using-diffusers/write_own_pipeline.mdx

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Add video img2img (#3900)

* Add image to image video

* Improve

* better naming

* make fix copies

* add docs

* finish tests

* trigger tests

* make style

* correct

* finish

* Fix more

* make style

* finish

* fix/doc-code: Updating to the latest version parameters (#3924)

fix/doc-code: update to use the new parameter

Signed-off-by: GitHub <noreply@github.com>

* fix/doc: no import torch issue (#3923)

Fix/doc: no import torch issue

Signed-off-by: GitHub <noreply@github.com>

* Correct controlnet out of list error (#3928)

* Correct controlnet out of list error

* Apply suggestions from code review

* correct tests

* correct tests

* fix

* test all

* Apply suggestions from code review

* test all

* test all

* Apply suggestions from code review

* Apply suggestions from code review

* fix more tests

* Fix more

* Apply suggestions from code review

* finish

* Apply suggestions from code review

* Update src/diffusers/schedulers/scheduling_k_dpm_2_ancestral_discrete.py

* finish

* Adding better way to define multiple concepts and also validation capabilities. (#3807)

* - Added validation parameters
- Changed some parameter descriptions to better explain their use.
- Fixed a few typos.
- Added concept_list parameter for better management of multiple subjects
- changed logic for image validation

* - Fixed bad logic for class data root directories

* Defaulting validation_steps to None for an easier logic

* Fixed multiple validation prompts

* Fixed bug on validation negative prompt

* Changed validation logic for tracker.

* Added uuid for validation image labeling

* Fix error when comparing validation prompts and validation negative prompts

* Improved error message when negative prompts for validation are more than the number of prompts

* - Changed image tracking number from epoch to global_step
- Added Typing for functions

* Added some validations more when using concept_list parameter and the regular ones.

* Fixed error message

* Added more validations for validation parameters

* Improved messaging for errors

* Fixed validation error for parameters with default values

* - Added train step to image name for validation
- reformatted code

* - Added train step to image's name for validation
- reformatted code

* Updated README.md file.

* reverted back original script of train_dreambooth.py

* reverted back original script of train_dreambooth.py

* left one blank line at the eof

* reverted back setup.py

* reverted back setup.py

* added same logic for when parameters for prior preservation are used without enabling the flag while using concept_list parameter.

* Ran black formatter.

* fixed a few strings

* fixed import sort with isort and removed fstrings without placeholder

* fixed import order with ruff (since with isort wasn't ok)

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [ldm3d] Update code to be functional with the new checkpoints (#3875)

* fixed typo

* updated doc to be consistent in naming

* make style/quality

* preprocessing for 4 channels and not 6

* make style

* test for 4c

* make style/quality

* fixed test on cpu

---------

Co-authored-by: Aflalo <estellea@isl-iam1.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu33.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu38.rr.intel.com>

* Improve memory text to video (#3930)

* Improve memory text to video

* Apply suggestions from code review

* add test

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* finish test setup

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* revert automatic chunking (#3934)

* revert automatic chunking

* Apply suggestions from code review

* revert automatic chunking

* avoid upcasting by assigning dtype to noise tensor (#3713)

* avoid upcasting by assigning dtype to noise tensor

* make style

* Update train_unconditional.py

* Update train_unconditional.py

* make style

* add unit test for pickle

* revert change

---------

Co-authored-by: root <root@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>

* Fix failing np tests (#3942)

* Fix failing np tests

* Apply suggestions from code review

* Update tests/pipelines/test_pipelines_common.py

* Add `timestep_spacing` and `steps_offset` to schedulers (#3947)

* Add timestep_spacing to DDPM, LMSDiscrete, PNDM.

* Remove spurious line.

* More easy schedulers.

* Add `linspace` to DDIM

* Noise sigma for `trailing`.

* Add timestep_spacing to DEISMultistepScheduler.

Not sure the range is the way it was intended.

* Fix: remove line used to debug.

* Support timestep_spacing in DPMSolverMultistep, DPMSolverSDE, UniPC

* Fix: convert to numpy.

* Use sched. defaults when instantiating from_config

For params not present in the original configuration.

This makes it possible to switch pipeline schedulers even if they use
different timestep_spacing (or any other param).

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Missing args in DPMSolverMultistep

* Test: default args not in config

* Style

* Fix scheduler name in test

* Remove duplicated entries

* Add test for solver_type

This test currently fails in main. When switching from DEIS to UniPC,
solver_type is "logrho" (the default value from DEIS), which gets
translated to "bh1" by UniPC. This is different to the default value for
UniPC: "bh2". This is where the translation happens: 36d22d0709/src/diffusers/schedulers/scheduling_unipc_multistep.py (L171)

* UniPC: use same default for solver_type

Fixes a bug when switching from UniPC from another scheduler (i.e.,
DEIS) that uses a different solver type. The solver is now the same as
if we had instantiated the scheduler directly.

* do not save use default values

* fix more

* fix all

* fix schedulers

* fix more

* finish for real

* finish for real

* flaky tests

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_pix2pix_zero.py

* Default steps_offset to 0.

* Add missing docstrings

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add Consistency Models Pipeline (#3492)

* initial commit

* Improve consistency models sampling implementation.

* Add CMStochasticIterativeScheduler, which implements the multi-step sampler (stochastic_iterative_sampler) in the original code, and make further improvements to sampling.

* Add Unet blocks for consistency models

* Add conversion script for Unet

* Fix bug in new unet blocks

* Fix attention weight loading

* Make design improvements to ConsistencyModelPipeline and CMStochasticIterativeScheduler and add initial version of tests.

* make style

* Make small random test UNet class conditional and set resnet_time_scale_shift to 'scale_shift' to better match consistency model checkpoints.

* Add support for converting a test UNet and non-class-conditional UNets to the consistency models conversion script.

* make style

* Change num_class_embeds to 1000 to better match the original consistency models implementation.

* Add support for distillation in pipeline_consistency_models.py.

* Improve consistency model tests:
	- Get small testing checkpoints from hub
	- Modify tests to take into account "distillation" parameter of ConsistencyModelPipeline
	- Add onestep, multistep tests for distillation and distillation + class conditional
	- Add expected image slices for onestep tests

* make style

* Improve ConsistencyModelPipeline:
	- Add initial support for class-conditional generation
	- Fix initial sigma for onestep generation
	- Fix some sigma shape issues

* make style

* Improve ConsistencyModelPipeline:
	- add latents __call__ argument and prepare_latents method
	- add check_inputs method
	- add initial docstrings for ConsistencyModelPipeline.__call__

* make style

* Fix bug when randomly generating class labels for class-conditional generation.

* Switch CMStochasticIterativeScheduler to configuring a sigma schedule and make related changes to the pipeline and tests.

* Remove some unused code and make style.

* Fix small bug in CMStochasticIterativeScheduler.

* Add expected slices for multistep sampling tests and make them pass.

* Work on consistency model fast tests:
	- in pipeline, call self.scheduler.scale_model_input before denoising
	- get expected slices for Euler and Heun scheduler tests
	- make Euler test pass
	- mark Heun test as expected fail because it doesn't support prediction_type "sample" yet
	- remove DPM and Euler Ancestral tests because they don't support use_karras_sigmas

* make style

* Refactor conversion script to make it easier to add more model architectures to convert in the future.

* Work on ConsistencyModelPipeline tests:
	- Fix device bug when handling class labels in ConsistencyModelPipeline.__call__
	- Add slow tests for onestep and multistep sampling and make them pass
	- Refactor fast tests
	- Refactor ConsistencyModelPipeline.__init__

* make style

* Remove the add_noise and add_noise_to_input methods from CMStochasticIterativeScheduler for now.

* Run python utils/check_copies.py --fix_and_overwrite
python utils/check_dummies.py --fix_and_overwrite to make dummy objects for new pipeline and scheduler.

* Make fast tests from PipelineTesterMixin pass.

* make style

* Refactor consistency models pipeline and scheduler:
	- Remove support for Karras schedulers (only support CMStochasticIterativeScheduler)
	- Move sigma manipulation, input scaling, denoising from pipeline to scheduler
	- Make corresponding changes to tests and ensure they pass

* make style

* Add docstrings and further refactor pipeline and scheduler.

* make style

* Add initial version of the consistency models documentation.

* Refactor custom timesteps logic following DDPMScheduler/IFPipeline and temporarily add torch 2.0 SDPA kernel selection logic for debugging.

* make style

* Convert current slow tests to use fp16 and flash attention.

* make style

* Add slow tests for normal attention on cuda device.

* make style

* Fix attention weights loading

* Update consistency model fast tests for new test checkpoints with attention fix.

* make style

* apply suggestions

* Add add_noise method to CMStochasticIterativeScheduler (copied from EulerDiscreteScheduler).

* Conversion script now outputs pipeline instead of UNet and add support for LSUN-256 models and different schedulers.

* When both timesteps and num_inference_steps are supplied, raise warning instead of error (timesteps take precedence).

* make style

* Add remaining diffusers model checkpoints for models in the original consistency model release and update usage example.

* apply suggestions from review

* make style

* fix attention naming

* Add tests for CMStochasticIterativeScheduler.

* make style

* Make CMStochasticIterativeScheduler tests pass.

* make style

* Override test_step_shape in CMStochasticIterativeSchedulerTest instead of modifying it in SchedulerCommonTest.

* make style

* rename some models

* Improve API

* rename some models

* Remove duplicated block

* Add docstring and make torch compile work

* More fixes

* Fixes

* Apply suggestions from code review

* Apply suggestions from code review

* add more docstring

* update consistency conversion script

---------

Co-authored-by: ayushmangal <ayushmangal@microsoft.com>
Co-authored-by: Ayush Mangal <43698245+ayushtues@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* add test case for StableDiffusionKDiffusionPipeline noise_sampler

---------

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: Aisuko <urakiny@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Andrés Mauricio Repetto Ferrero <amd.repetto@gmail.com>
Co-authored-by: estelleafl <estelle.aflalo@intel.com>
Co-authored-by: Aflalo <estellea@isl-iam1.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu33.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu38.rr.intel.com>
Co-authored-by: Prathik Rao <prathikr@usc.edu>
Co-authored-by: root <root@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: ayushmangal <ayushmangal@microsoft.com>
Co-authored-by: Ayush Mangal <43698245+ayushtues@users.noreply.github.com>
2023-07-17 17:10:17 +02:00
Patrick von Platen
5729829cd8 [From single file] Make accelerate optional (#4132)
* Make accelerate optional

* make accelerate optional
2023-07-17 15:56:21 +02:00
Patrick von Platen
e27500b72c [From ckpt] replace with os path join (#3746)
replace with os path join
2023-07-17 11:39:06 +02:00
Patrick von Platen
fe5911bf3d [Stable Diffusion Inpaint ]Fix dtype inpaint (#4113)
Fix dtype inpaint
2023-07-15 16:10:40 +02:00
Patrick von Platen
b024ebb965 [SD-XL] Add inpainting (#4098)
* Add more

* more

* up

* Get ensemble of expert denoisers working

* Fix code

* add tests

* up
2023-07-14 17:05:44 +02:00
Patrick von Platen
ad8f985e81 Allow low precision vae sd xl (#4083)
* Allow low precision sd xl

* finish

* finish

* make style
2023-07-14 14:51:16 +02:00
Patrick von Platen
ee2f2775b2 [tests] use parent class for monkey patching to not break other tests (#4088)
* [tests] use parent class for monkey patching to not break other tests

* fix
2023-07-14 13:44:44 +02:00
Sayak Paul
692b7a907d [Feat] add: utility for unloading lora. (#4034)
* add: test for testing unloading lora.

* add :reason to skipif.

* initial implementation of lora unload().

* apply styling.

* add: doc.

* change checkpoints.

* reinit generator

* finalize slow test.

* add fast test for unloading lora.
2023-07-14 16:30:18 +05:30
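A minimal sketch of the unload utility added here; the base checkpoint, LoRA repo id, and prompt are illustrative placeholders:

    import torch
    from diffusers import StableDiffusionPipeline

    pipe = StableDiffusionPipeline.from_pretrained(
        "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
    ).to("cuda")

    pipe.load_lora_weights("some-user/some-lora")  # placeholder LoRA repo id
    lora_image = pipe("a pokemon with blue eyes").images[0]

    # unload_lora_weights() removes the LoRA layers and restores the original attention processors.
    pipe.unload_lora_weights()
    base_image = pipe("a pokemon with blue eyes").images[0]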
Patrick von Platen
71c918b848 [Invisible watermark] Correct version (#4087) 2023-07-14 09:30:43 +05:30
Sayak Paul
83ca21f539 fix: minor things in the SDXL docs. (#4070) 2023-07-14 09:01:20 +05:30
Gabriel Birnbaum
f3802eb805 fix requirement in SDXL (#4082) 2023-07-14 02:58:20 +02:00
Patrick von Platen
bfe8b41315 [From Single File] Force accelerate to be installed (#4078)
force accelerate to be installed
2023-07-13 23:16:43 +02:00
YiYi Xu
4535088cec add kandinsky to readme table (#4081)
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-07-13 22:26:51 +02:00
Thomas Chambon
2eceaaef0f [Community] Implementation of the IADB community pipeline (#3996)
* community pipeline: implementation of iadb

* iadb.py: reformat using black

* iadb.py: linting update
2023-07-13 16:49:41 +02:00
Ruoxi
ece55227ff Multiply lr scheduler steps by num_processes. (#3983)
* Multiply lr scheduler steps by `num_processes`.

* Stop multiplying steps by gradient accumulation.
2023-07-13 17:50:25 +05:30
Patrick von Platen
92a57a8e84 Fix kandinsky remove safety (#4065)
* Finish

* make style
2023-07-12 21:42:29 +02:00
Lucain
d7280b7436 Trigger CI on ci-* branches (#3635) 2023-07-12 19:31:36 +02:00
Patrick von Platen
e9eb0938f4 make style 2023-07-12 19:24:47 +02:00
junming huang
a29ea36d62 Update train_unconditional.py (#3899)
increase the timeout when using a big dataset or high resolution
2023-07-12 19:24:28 +02:00
Evgenii Kashin
af48bf2008 Add circular padding for artifact-free StableDiffusionPanoramaPipeline (#4025)
* Add circular padding option

* Fix style with black

* Fix corner case with small image size

* Add circular padding test cases

* Fix docstring

* Improve docstring for circular padding, remove slow test case

* Update docs for circular padding argument

* Add images comparison for circular padding
2023-07-12 20:49:46 +05:30
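A sketch of the panorama pipeline with the new option, assuming the call argument is named circular_padding as in the PR title; the checkpoint id, prompt, and sizes are illustrative:

    import torch
    from diffusers import StableDiffusionPanoramaPipeline, DDIMScheduler

    model_id = "stabilityai/stable-diffusion-2-base"
    scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
    pipe = StableDiffusionPanoramaPipeline.from_pretrained(
        model_id, scheduler=scheduler, torch_dtype=torch.float16
    ).to("cuda")

    # circular_padding wraps the latent horizontally so the left and right edges line up,
    # avoiding the visible seam when the panorama is viewed as a 360-degree image.
    image = pipe(
        "a photo of the dolomites", height=512, width=2048, circular_padding=True
    ).images[0]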
Patrick von Platen
4b50ecceb0 Correct sdxl docs (#4058) 2023-07-12 16:02:31 +02:00
Bagheera
99b540b072 [SDXL] Partial diffusion support for Text2Img and Img2Img Pipelines (#4015)
* diffusers#4003 - initial implementation of max_inference_steps

* diffusers#4003 - initial implementation of max_inference_steps and first_inference_step for img2img

* diffusers#4003 - use first_inference_step as an input arg for get_timestamps in img2img

* diffusers#4003 Do not add noise during img2img when we have a defined first timestep

* diffusers#4003 Mild updates after revert

* diffusers#4003 Missing change

* Show implementation with denoising_start and end

* Apply suggestions from code review

* Update src/diffusers/pipelines/stable_diffusion_xl/pipeline_stable_diffusion_xl.py

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* move to 0.19.0dev

* Apply suggestions from code review

* add exhaustive tests

* add docs

* finish

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* make style

---------

Co-authored-by: bghira <bghira@users.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-07-12 13:54:27 +02:00
Patrick von Platen
b9feed8795 move to 0.19.0dev (#4048) 2023-07-11 22:49:12 +02:00
Kadir Nar
f9cedfb75c 📝 Fix broken link to models documentation (#4026)
* 📝 Fix broken link to models documentation

Corrected the link to the models documentation in the README. Previously, the link was pointing to an incorrect URL. Now, the link directs users to the correct documentation page for more details on the models.

Thanks! 🙌

* Update src/diffusers/models/README.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2023-07-11 12:10:37 -07:00
Michael Monashev
fc7aa64ea8 Fixed pipeline parameter descriptions. (#3955)
Fixed parameter descriptions
2023-07-11 09:52:17 -07:00
Patrick von Platen
d0979f5274 quick fix to make sure sdxl conversion works 2023-07-11 16:51:56 +00:00
Vines
fcb0da7f00 Fix diffedit doc typo (#3977)
fix diffedit code mistake
2023-07-11 09:47:36 -07:00
manosplitsis
8e8b046d24 Typo fix in example docstring in audioldm pipeline (#4044)
typo fix in example docstring in audioldm pipeline
2023-07-11 09:35:11 -07:00
oOraph
5e704a2c71 keep _use_default_values as a list type (#4040)
Signed-off-by: Raphael <oOraph@users.noreply.github.com>
Co-authored-by: Raphael <oOraph@users.noreply.github.com>
2023-07-11 18:21:08 +02:00
Patrick von Platen
8bff782354 Improve single loading file (#4041)
* start improving single file load

* Fix more

* start improving single file load

* Fix sd 2.1

* further improve from_single_file
2023-07-11 17:01:25 +02:00
Lucain
6632823690 FIX force_download in download utility (#4036)
FIX force_download in download utils
2023-07-11 12:20:00 +02:00
Sayak Paul
f74d5e1c2f [diffusers-cli] Add a CLI command to run FP16 and safetensors conversions and opening PRs (#3982)
* feat: add a cli util for fp16 and safetensors format.

* fix: commit_description.

* add: usage example.
2023-07-11 08:30:09 +05:30
Sayak Paul
3d74dc2abd [Examples] Add a training script for SDXL DreamBooth LoRA (#4016)
* add dreambooth lora script for SDXL incorporating latest changes.

* remove use_auth_token=True.

* add: documentation

* remove unneeded cli.

* increase the number of training steps in the readme.

* add LoraLoaderMixin to the subclassing mix.

* add sdxl lora dreambooth test.

* add: inference code sample.

* add: refiner output.

* add LoraLoaderMixin to the mix of classes of StableDiffusionXLImg2ImgPipeline.

* change default resolution of DreamBoothDataset.

* better sdxl report path.

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-07-11 07:38:41 +05:30
Susnato Dhar
dfd7eafbce Fixed wrong link in CONTRIBUTING.md (#4028)
Update CONTRIBUTING.md
2023-07-10 21:00:55 +02:00
Steven Liu
8dd0ddc3c4 [docs] Fix index page (#3997)
* fix

* correct link

* Fix typo

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-07-10 10:24:37 -07:00
Patrick von Platen
080ecf01b3 Improve loading pipe (#4009)
* improve loading subcomponents

* Add test for logging

* improve loading subcomponents

* make style

* make style

* fix

* finish
2023-07-10 18:05:40 +02:00
Pedro Cuenca
7a91ea6c2b Remove remaining not in upscale pipeline (#4020)
Remove remaining `not` in upscale pipeline.
2023-07-10 12:37:50 +02:00
Sayak Paul
e4559f48c1 minor improvements to the SDXL doc. (#3985)
* minor improvements to the SDXL doc.

* use_refiner variable.

* fix: typo.
2023-07-10 16:04:23 +05:30
Pedro Cuenca
d6b861401e Correctly keep vae in float16 when using PyTorch 2 or xFormers (#4019)
* Update pipeline_stable_diffusion_xl.py

fix a bug

* Update pipeline_stable_diffusion_xl_img2img.py

* Update pipeline_stable_diffusion_xl_img2img.py

* Update pipeline_stable_diffusion_upscale.py

* style

---------

Co-authored-by: Hu Ye <xiaohuzc@gmail.com>
2023-07-10 12:19:56 +02:00
Patrick von Platen
e4f6c3799d [DiffusionPipeline] Deprecate not throwing error when loading non-existant variant (#4011)
* Deprecate variant nicely

* make style

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-07-10 12:12:29 +02:00
Patrick von Platen
98c9aac1d5 [SDXL] Fix all sequential offload (#4010)
* Fix all sequential offload

* make style

* make style
2023-07-10 10:27:23 +02:00
Omar Sanseviero
e3d71ad89a Minor nits to Dance Diffusion (#4012)
Update pipeline_dance_diffusion.py
2023-07-10 02:53:06 +05:30
Patrick von Platen
68f61a07d6 Make sure torch compile doesn't access unet config (#4008) 2023-07-10 00:23:55 +05:30
Patrick von Platen
4a3e574807 make style 2023-07-09 16:02:59 +00:00
Will Berman
c2a28c346c Refactor LoRA (#3778)
* refactor to support patching LoRA into T5

instantiate the lora linear layer on the same device as the regular linear layer

get lora rank from state dict

tests

fmt

can create lora layer in float32 even when rest of model is float16

fix loading model hook

remove load_lora_weights_ and T5 dispatching

remove Unet#attn_processors_state_dict

docstrings

* text encoder monkeypatch class method

* fix test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-09 18:02:46 +02:00
Patrick von Platen
78922ed7c7 Add sdxl prompt embeddings (#3995)
* Add sdxl prompt embeddings

* Fix more

* fix some slow tests
2023-07-07 16:50:53 +02:00
Patrick von Platen
6fde5a6dd6 [Tests] Fix some slow tests (#3989)
fix some slow tests
2023-07-07 15:17:57 +02:00
Amiha
d1d0b8afce Don't use bare prints in a library (#3991) 2023-07-07 15:17:50 +02:00
Batuhan Taskaya
04ddad484e Add 'rank' parameter to Dreambooth LoRA training script (#3945) 2023-07-07 17:26:10 +05:30
Saurav Maheshkar
03d829d59e feat: add Dropout to Flax UNet (#3894)
* feat: add Dropout to Flax UNet

* feat: add @compact decorator

* fix: drop nn.compact
2023-07-07 11:38:16 +02:00
Omar Sanseviero
8d8b4311b9 Fix code snippet for Audio Diffusion (#3987) 2023-07-07 10:39:38 +02:00
Yorai Levi
1fbcc78d6e typo in safetensors (safetenstors) (#3976)
* Update pipeline_utils.py

typo in safetensors (safetenstors)

* Update loaders.py

typo in safetensors (safetenstors)

* Update modeling_utils.py

typo in safetensors (safetenstors)
2023-07-07 13:03:51 +05:30
Patrick von Platen
51593da25a fix main docs 2023-07-06 19:28:33 +02:00
Patrick von Platen
38e563d0c7 Fix SD XL Docs (#3971)
* finish sd xl docs

* make style

* Apply suggestions from code review

* uP

* uP

* Correct
2023-07-06 19:21:03 +02:00
Aisuko
b8f089c5a3 fix/doc-code: import torch and fix the broken document address (#3941)
Signed-off-by: GitHub <noreply@github.com>
2023-07-06 09:29:04 -07:00
Patrick von Platen
187ea539ae Improve SD XL (#3968)
* improve sd xl

* correct more

* finish

* make style

* fix more
2023-07-06 18:11:20 +02:00
Patrick von Platen
8bf80fc8d8 disable num attention heads (#3969)
* disable num attention heads

* finish
2023-07-06 17:51:40 +02:00
YiYi Xu
45f6d52b10 Add Shap-E (#3742)
* refactor prior_transformer

adding conversion script

add pipeline

add step_index from pipeline, + remove permute

add zero pad token

remove copy from statement for betas_for_alpha_bar function

* add

* add

* update conversion script for renderer model

* refactor camera a little bit

* clean up

* style

* fix copies

* Update src/diffusers/schedulers/scheduling_heun_discrete.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/shap_e/pipeline_shap_e.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/shap_e/pipeline_shap_e.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* alpha_transform_type

* remove step_index argument

* remove get_sigmas_karras

* remove _yiyi_sigma_to_t

* move the rescale prompt_embeds from prior_transformer to pipeline

* replace baddbmm with einsum to match original repo

* Revert "replace baddbmm with einsum to match original repo"

This reverts commit 3f6b435d65.

* add step_index to scale_model_input

* Revert "move the rescale prompt_embeds from prior_transformer to pipeline"

This reverts commit 5b5a8e6be9.

* move rescale from prior_transformer to pipeline

* correct step_index in scale_model_input

* remove print lines

* refactor prior - reduce arguments

* make style

* add prior_image

* arg embedding_proj_norm -> norm_embedding_proj

* add pre-norm for proj_embedding

* move rescale prompt from pipeline to _encode_prompt

* add img2img pipeline

* style

* copies

* Update src/diffusers/models/prior_transformer.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/prior_transformer.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/prior_transformer.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/prior_transformer.py

add arg: encoder_hid_proj

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/prior_transformer.py

add new config: norm_in_type

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/prior_transformer.py

add new config: added_emb_type

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/prior_transformer.py

rename out_dim -> clip_embed_dim

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/prior_transformer.py

rename config: out_dim -> clip_embed_dim

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/prior_transformer.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/prior_transformer.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* finish refactor prior_tranformer

* make style

* refactor renderer

* fix

* make style

* refactor img2img

* remove params_proj

* add test

* add upcast_softmax to prior_transformer

* enable num_images_per_prompt, add save_gif utility

* add

* add fast test

* make style

* add slow test

* style

* add test for img2img

* refactor

* enable batching

* style

* refactor scheduler

* update test

* style

* attempt to solve batch related tests timeout

* add doc

* Update src/diffusers/pipelines/shap_e/pipeline_shap_e.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/shap_e/pipeline_shap_e_img2img.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* hardcode rendering related config

* update betas_for_alpha_bar on ddpm_scheduler

* fix copies

* fix

* export_to_gif

* style

* second attempt to speed up batching tests

* add doc page to index

* Remove intermediate clipping

* 3rd attempt to speed up batching tests

* Remove time index

* simplify scheduler

* Fix more

* Fix more

* fix more

* make style

* fix schedulers

* fix some more tests

* finish

* add one more test

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* style

* apply feedbacks

* style

* fix copies

* add one example

* style

* add example for img2img

* fix doc

* fix more doc strings

* size -> frame_size

* style

* update doc

* style

* fix on doc

* update repo name

* improve the usage example in shap-e img2img

* add usage examples in the shap-e docs.

* consolidate examples.

* minor fix.

* update doc

* Apply suggestions from code review

* Apply suggestions from code review

* remove upcast

* Make sure background is white

* Update src/diffusers/pipelines/shap_e/pipeline_shap_e.py

* Apply suggestions from code review

* Finish

* Apply suggestions from code review

* Update src/diffusers/pipelines/shap_e/pipeline_shap_e.py

* Make style

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-07-06 15:20:42 +02:00
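The Shap-E commit above wires the new text-to-3D pipeline into `ShapEPipeline` and adds an `export_to_gif` utility. A minimal sketch of the resulting workflow; the hub repo id, prompt, and sampling settings are illustrative assumptions rather than values taken from the commit:

```python
import torch
from diffusers import ShapEPipeline
from diffusers.utils import export_to_gif

# Load the Shap-E text-to-3D pipeline (repo id assumed).
pipe = ShapEPipeline.from_pretrained("openai/shap-e", torch_dtype=torch.float16).to("cuda")

# The pipeline renders a list of turntable frames per prompt; `frame_size`
# is the renamed `size` argument mentioned in the commit body.
frames = pipe(
    "a shark",
    guidance_scale=15.0,
    num_inference_steps=64,
    frame_size=64,
).images[0]

# Save the rendered frames as an animated GIF.
export_to_gif(frames, "shark_3d.gif")
```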
YiYi Xu
746215670a Kandinsky_v22_yiyi (#3936)
* Kandinsky2_2

* fix init kandinsky2_2

* kandinsky2_2 fix inpainting

* rename pipelines: remove decoder + 2_2 -> V22

* Update scheduling_unclip.py

* remove text_encoder and tokenizer arguments from doc string

* add test for text2img

* add tests for text2img & img2img

* fix

* add test for inpaint

* add prior tests

* style

* copies

* add controlnet test

* style

* add a test for controlnet_img2img

* update prior_emb2emb api to accept image_embedding or image

* add a test for prior_emb2emb

* style

* remove try except

* example

* fix

* add doc string examples to all kandinsky pipelines

* style

* update doc

* style

* add a tip about 2.2

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* vae -> movq

* vae -> movq

* style

* fix the #copied from

* remove decoder from file name

* update doc: add a section for kandinsky 2.2

* fix

* fix-copies

* add coped from

* add copies from for prior

* add copies from for prior emb2emb

* copy from for img2img

* copied from for inpaint

* more copied from

* more copies from

* more copies

* remove the yiyi comments

* Apply suggestions from code review

* Self-contained example, pipeline order

* Import prior output instead of redefining.

* Style

* Make VQModel compatible with model offload.

* Fix copies

---------

Co-authored-by: Shahmatov Arseniy <62886550+cene555@users.noreply.github.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-07-06 15:05:42 +02:00
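The Kandinsky 2.2 commit above splits generation into a prior (prompt to CLIP image embeddings) and a decoder that, as the commit notes, no longer takes a text_encoder or tokenizer. A rough sketch of the two-stage call, assuming the community hub repo ids:

```python
import torch
from diffusers import KandinskyV22PriorPipeline, KandinskyV22Pipeline

# Stage 1: the prior maps the prompt to CLIP image embeddings (repo ids assumed).
prior = KandinskyV22PriorPipeline.from_pretrained(
    "kandinsky-community/kandinsky-2-2-prior", torch_dtype=torch.float16
).to("cuda")
image_embeds, negative_image_embeds = prior("a portrait of a corgi, 4k photo").to_tuple()

# Stage 2: the decoder (with its MoVQ "movq" component) turns embeddings into pixels.
decoder = KandinskyV22Pipeline.from_pretrained(
    "kandinsky-community/kandinsky-2-2-decoder", torch_dtype=torch.float16
).to("cuda")
image = decoder(
    image_embeds=image_embeds,
    negative_image_embeds=negative_image_embeds,
    height=512,
    width=512,
).images[0]
```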
Patrick von Platen
bc9a8cef6f [SD-XL] Add new pipelines (#3859)
* Add new text encoder

* add transformers depth

* More

* Correct conversion script

* Fix more

* Fix more

* Correct more

* correct text encoder

* Finish all

* proof that it works in run local xl

* clean up

* Get refiner to work

* Add red castle

* Fix batch size

* Improve pipelines more

* Finish text2image tests

* Add img2img test

* Fix more

* fix import

* Fix embeddings for classic models (#3888)

Fix embeddings for classic SD models.

* Allow multiple prompts to be passed to the refiner (#3895)

* finish more

* Apply suggestions from code review

* add watermarker

* Model offload (#3889)

* Model offload.

* Model offload for refiner / img2img

* Hardcode encoder offload on img2img vae encode

Saves some GPU RAM in img2img / refiner tasks so it remains below 8 GB.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* correct

* fix

* clean print

* Update install warning for `invisible-watermark`

* add: missing docstrings.

* fix and simplify the usage example in img2img.

* fix setup for watermarking.

* Revert "fix setup for watermarking."

This reverts commit 491bc9f5a6.

* fix: watermarking setup.

* fix: op.

* run make fix-copies.

* make sure tests pass

* improve convert

* make tests pass

* make tests pass

* better error message

* finish

* finish

* Fix final test

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-07-06 13:37:27 +02:00
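The SD-XL commit above adds a base text-to-image pipeline plus a refiner that runs an img2img pass on top of it. A minimal base-plus-refiner sketch, assuming the later public 1.0 checkpoints rather than the gated 0.9 research weights referenced at the time:

```python
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

prompt = "a red castle on a hill at sunset"

# Base model produces the initial image.
base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")
image = base(prompt=prompt, num_inference_steps=40).images[0]

# Refiner sharpens details with a low-strength img2img pass over the base output.
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16, variant="fp16"
).to("cuda")
image = refiner(prompt=prompt, image=image, strength=0.3).images[0]
```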
Sayak Paul
b62d9a1fdc [Text-to-video] Add torch.compile() compatibility (#3949)
* use sample directly instead of the dataclass.

* more usage of directly samples instead of dataclasses

* more usage of directly samples instead of dataclasses

* use direct sample in the pipeline.

* direct usage of sample in the img2img case.
2023-07-06 14:30:50 +05:30
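The commit above makes the text-to-video pipeline work on plain sample tensors instead of output dataclasses so its UNet can be wrapped with `torch.compile()`. A sketch of how that is typically used; the repo id and compile options are assumptions:

```python
import torch
from diffusers import DiffusionPipeline

# Text-to-video pipeline (repo id assumed).
pipe = DiffusionPipeline.from_pretrained(
    "damo-vilab/text-to-video-ms-1.7b", torch_dtype=torch.float16
).to("cuda")

# With the dataclass indirection removed, the UNet can be compiled once and the
# optimized graph is reused on every denoising step (requires PyTorch >= 2.0).
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)

video_frames = pipe("a panda surfing a wave", num_inference_steps=25).frames
```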
Sayak Paul
46af98267d [Consistency Models] correct checkpoint url in the doc (#3962)
correct checkpoint url.
2023-07-06 14:22:43 +05:30
Prathik Rao
de1426119d Make UNet2DConditionOutput pickle-able (#3857)
* add default to unet output to prevent it from being a required arg

* add unit test

* make style

* adjust unit test

* mark as fast test

* adjust assert statement in test

---------

Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: root <root@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-07-06 13:12:41 +05:30
Sayak Paul
41ea88f38c Update consistency_models.mdx (#3961) 2023-07-06 10:55:24 +05:30
dg845
aed7499a8d Add Consistency Models Pipeline (#3492)
* initial commit

* Improve consistency models sampling implementation.

* Add CMStochasticIterativeScheduler, which implements the multi-step sampler (stochastic_iterative_sampler) in the original code, and make further improvements to sampling.

* Add Unet blocks for consistency models

* Add conversion script for Unet

* Fix bug in new unet blocks

* Fix attention weight loading

* Make design improvements to ConsistencyModelPipeline and CMStochasticIterativeScheduler and add initial version of tests.

* make style

* Make small random test UNet class conditional and set resnet_time_scale_shift to 'scale_shift' to better match consistency model checkpoints.

* Add support for converting a test UNet and non-class-conditional UNets to the consistency models conversion script.

* make style

* Change num_class_embeds to 1000 to better match the original consistency models implementation.

* Add support for distillation in pipeline_consistency_models.py.

* Improve consistency model tests:
	- Get small testing checkpoints from hub
	- Modify tests to take into account "distillation" parameter of ConsistencyModelPipeline
	- Add onestep, multistep tests for distillation and distillation + class conditional
	- Add expected image slices for onestep tests

* make style

* Improve ConsistencyModelPipeline:
	- Add initial support for class-conditional generation
	- Fix initial sigma for onestep generation
	- Fix some sigma shape issues

* make style

* Improve ConsistencyModelPipeline:
	- add latents __call__ argument and prepare_latents method
	- add check_inputs method
	- add initial docstrings for ConsistencyModelPipeline.__call__

* make style

* Fix bug when randomly generating class labels for class-conditional generation.

* Switch CMStochasticIterativeScheduler to configuring a sigma schedule and make related changes to the pipeline and tests.

* Remove some unused code and make style.

* Fix small bug in CMStochasticIterativeScheduler.

* Add expected slices for multistep sampling tests and make them pass.

* Work on consistency model fast tests:
	- in pipeline, call self.scheduler.scale_model_input before denoising
	- get expected slices for Euler and Heun scheduler tests
	- make Euler test pass
	- mark Heun test as expected fail because it doesn't support prediction_type "sample" yet
	- remove DPM and Euler Ancestral tests because they don't support use_karras_sigmas

* make style

* Refactor conversion script to make it easier to add more model architectures to convert in the future.

* Work on ConsistencyModelPipeline tests:
	- Fix device bug when handling class labels in ConsistencyModelPipeline.__call__
	- Add slow tests for onestep and multistep sampling and make them pass
	- Refactor fast tests
	- Refactor ConsistencyModelPipeline.__init__

* make style

* Remove the add_noise and add_noise_to_input methods from CMStochasticIterativeScheduler for now.

* Run python utils/check_copies.py --fix_and_overwrite
python utils/check_dummies.py --fix_and_overwrite to make dummy objects for new pipeline and scheduler.

* Make fast tests from PipelineTesterMixin pass.

* make style

* Refactor consistency models pipeline and scheduler:
	- Remove support for Karras schedulers (only support CMStochasticIterativeScheduler)
	- Move sigma manipulation, input scaling, denoising from pipeline to scheduler
	- Make corresponding changes to tests and ensure they pass

* make style

* Add docstrings and further refactor pipeline and scheduler.

* make style

* Add initial version of the consistency models documentation.

* Refactor custom timesteps logic following DDPMScheduler/IFPipeline and temporarily add torch 2.0 SDPA kernel selection logic for debugging.

* make style

* Convert current slow tests to use fp16 and flash attention.

* make style

* Add slow tests for normal attention on cuda device.

* make style

* Fix attention weights loading

* Update consistency model fast tests for new test checkpoints with attention fix.

* make style

* apply suggestions

* Add add_noise method to CMStochasticIterativeScheduler (copied from EulerDiscreteScheduler).

* Conversion script now outputs pipeline instead of UNet and add support for LSUN-256 models and different schedulers.

* When both timesteps and num_inference_steps are supplied, raise warning instead of error (timesteps take precedence).

* make style

* Add remaining diffusers model checkpoints for models in the original consistency model release and update usage example.

* apply suggestions from review

* make style

* fix attention naming

* Add tests for CMStochasticIterativeScheduler.

* make style

* Make CMStochasticIterativeScheduler tests pass.

* make style

* Override test_step_shape in CMStochasticIterativeSchedulerTest instead of modifying it in SchedulerCommonTest.

* make style

* rename some models

* Improve API

* rename some models

* Remove duplicated block

* Add docstring and make torch compile work

* More fixes

* Fixes

* Apply suggestions from code review

* Apply suggestions from code review

* add more docstring

* update consistency conversion script

---------

Co-authored-by: ayushmangal <ayushmangal@microsoft.com>
Co-authored-by: Ayush Mangal <43698245+ayushtues@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-05 19:33:58 +02:00
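The Consistency Models commit above adds `ConsistencyModelPipeline` with `CMStochasticIterativeScheduler`, supporting both one-step and multi-step sampling. A brief sketch follows; the checkpoint id and timestep values are assumptions, and, as noted in the commit, explicit `timesteps` take precedence over `num_inference_steps`:

```python
import torch
from diffusers import ConsistencyModelPipeline

# Distilled class-conditional ImageNet-64 checkpoint (repo id assumed).
pipe = ConsistencyModelPipeline.from_pretrained(
    "openai/diffusers-cd_imagenet64_l2", torch_dtype=torch.float16
).to("cuda")

# One-step generation: a single evaluation of the consistency function.
image = pipe(num_inference_steps=1, class_labels=145).images[0]

# Multi-step generation with explicit custom timesteps.
image = pipe(num_inference_steps=None, timesteps=[22, 0], class_labels=145).images[0]
```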
Pedro Cuenca
07c9a08e67 Add timestep_spacing and steps_offset to schedulers (#3947)
* Add timestep_spacing to DDPM, LMSDiscrete, PNDM.

* Remove spurious line.

* More easy schedulers.

* Add `linspace` to DDIM

* Noise sigma for `trailing`.

* Add timestep_spacing to DEISMultistepScheduler.

Not sure the range is the way it was intended.

* Fix: remove line used to debug.

* Support timestep_spacing in DPMSolverMultistep, DPMSolverSDE, UniPC

* Fix: convert to numpy.

* Use sched. defaults when instantiating from_config

For params not present in the original configuration.

This makes it possible to switch pipeline schedulers even if they use
different timestep_spacing (or any other param).

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Missing args in DPMSolverMultistep

* Test: default args not in config

* Style

* Fix scheduler name in test

* Remove duplicated entries

* Add test for solver_type

This test currently fails in main. When switching from DEIS to UniPC,
solver_type is "logrho" (the default value from DEIS), which gets
translated to "bh1" by UniPC. This is different to the default value for
UniPC: "bh2". This is where the translation happens: 36d22d0709/src/diffusers/schedulers/scheduling_unipc_multistep.py (L171)

* UniPC: use same default for solver_type

Fixes a bug when switching from UniPC from another scheduler (i.e.,
DEIS) that uses a different solver type. The solver is now the same as
if we had instantiated the scheduler directly.

* do not save default values

* fix more

* fix all

* fix schedulers

* fix more

* finish for real

* finish for real

* flaky tests

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_pix2pix_zero.py

* Default steps_offset to 0.

* Add missing docstrings

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-05 15:49:30 +02:00
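The scheduler commit above adds `timestep_spacing` and `steps_offset` configs and makes `from_config` fall back to a scheduler's own defaults for params missing from a stored config, so pipelines can swap schedulers freely. A small sketch of overriding the new params while swapping; the model id is an assumption:

```python
from diffusers import StableDiffusionPipeline, DDIMScheduler

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Params absent from the serialized config (such as the new timestep_spacing)
# fall back to DDIM's defaults and can still be overridden explicitly here.
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config, timestep_spacing="trailing", steps_offset=0
)
```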
Patrick von Platen
2837d49079 Fix failing np tests (#3942)
* Fix failing np tests

* Apply suggestions from code review

* Update tests/pipelines/test_pipelines_common.py
2023-07-04 14:00:43 +02:00
Prathik Rao
1997614aa9 avoid upcasting by assigning dtype to noise tensor (#3713)
* avoid upcasting by assigning dtype to noise tensor

* make style

* Update train_unconditional.py

* Update train_unconditional.py

* make style

* add unit test for pickle

* revert change

---------

Co-authored-by: root <root@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-07-04 07:19:49 +05:30
Patrick von Platen
4e898560ce revert automatic chunking (#3934)
* revert automatic chunking

* Apply suggestions from code review

* revert automatic chunking
2023-07-03 23:12:41 +02:00
Patrick von Platen
332d2bbea3 Improve memory text to video (#3930)
* Improve memory text to video

* Apply suggestions from code review

* add test

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* finish test setup

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-07-03 18:17:34 +02:00
estelleafl
b8a5dda56e [ldm3d] Update code to be functional with the new checkpoints (#3875)
* fixed typo

* updated doc to be consistent in naming

* make style/quality

* preprocessing for 4 channels and not 6

* make style

* test for 4c

* make style/quality

* fixed test on cpu

---------

Co-authored-by: Aflalo <estellea@isl-iam1.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu33.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu38.rr.intel.com>
2023-07-03 18:15:46 +02:00
Andrés Mauricio Repetto Ferrero
572d8e2002 Adding better way to define multiple concepts and also validation capabilities. (#3807)
* - Added validation parameters
- Changed some parameter descriptions to better explain their use.
- Fixed a few typos.
- Added concept_list parameter for better management of multiple subjects
- changed logic for image validation

* - Fixed bad logic for class data root directories

* Defaulting validation_steps to None for an easier logic

* Fixed multiple validation prompts

* Fixed bug on validation negative prompt

* Changed validation logic for tracker.

* Added uuid for validation image labeling

* Fix error when comparing validation prompts and validation negative prompts

* Improved error message when there are more validation negative prompts than validation prompts

* - Changed image tracking number from epoch to global_step
- Added Typing for functions

* Added some validations more when using concept_list parameter and the regular ones.

* Fixed error message

* Added more validations for validation parameters

* Improved messaging for errors

* Fixed validation error for parameters with default values

* - Added train step to image name for validation
- reformatted code

* - Added train step to image's name for validation
- reformatted code

* Updated README.md file.

* reverted back original script of train_dreambooth.py

* reverted back original script of train_dreambooth.py

* left one blank line at the eof

* reverted back setup.py

* reverted back setup.py

* added same logic for when parameters for prior preservation are used without enabling the flag while using concept_list parameter.

* Ran black formatter.

* fixed a few strings

* fixed import sort with isort and removed fstrings without placeholder

* fixed import order with ruff (since with isort wasn't ok)

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-07-03 17:55:45 +02:00
Patrick von Platen
2e8668f0af Correct controlnet out of list error (#3928)
* Correct controlnet out of list error

* Apply suggestions from code review

* correct tests

* correct tests

* fix

* test all

* Apply suggestions from code review

* test all

* test all

* Apply suggestions from code review

* Apply suggestions from code review

* fix more tests

* Fix more

* Apply suggestions from code review

* finish

* Apply suggestions from code review

* Update src/diffusers/schedulers/scheduling_k_dpm_2_ancestral_discrete.py

* finish
2023-07-03 15:10:07 +02:00
Aisuko
b298484fd0 fix/doc: no import torch issue (#3923)
Fix/doc: no import torch issue

Signed-off-by: GitHub <noreply@github.com>
2023-07-03 12:28:42 +02:00
Aisuko
f911287cc9 fix/doc-code: Updating to the latest version parameters (#3924)
fix/doc-code: update to use the new parameter

Signed-off-by: GitHub <noreply@github.com>
2023-07-03 12:28:05 +02:00
Patrick von Platen
62825064bf Add video img2img (#3900)
* Add image to image video

* Improve

* better naming

* make fix copies

* add docs

* finish tests

* trigger tests

* make style

* correct

* finish

* Fix more

* make style

* finish
2023-07-02 13:19:27 +02:00
Aisuko
5439e917ca fix/docs: Fix the broken doc links (#3897)
* fix/docs: Fix the broken doc links

Signed-off-by: GitHub <noreply@github.com>

* Update docs/source/en/using-diffusers/write_own_pipeline.mdx

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Signed-off-by: GitHub <noreply@github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-07-01 08:07:59 +02:00
Steven Liu
174dcd697f [docs] Model API (#3562)
* add modelmixin and unets

* remove old model page

* minor fixes

* fix unet2dcondition

* add vqmodel and autoencoderkl

* add rest of models

* fix autoencoderkl path

* fix toctree

* fix toctree again

* apply feedback

* apply feedback

* fix copies

* fix controlnet copy

* fix copies
2023-06-29 17:24:39 -07:00
takuoko
cdf2ae8a84 [Enhance] Add LoRA rank args in train_text_to_image_lora (#3866)
* add rank args in lora finetune

* del network_alpha
2023-06-29 17:09:59 +05:30
Sayak Paul
49949f321d [Tests] add test for checking soft dependencies. (#3847)
* add test for checking soft dependencies.

* address patrick's comments.

* dependency tests should not run twice.

* debugging.

* up.
2023-06-28 22:05:25 +05:30
Uranus
c7469ebe74 fix sde add noise typo (#3839)
* fix sde typo

* fix code style
2023-06-28 15:44:29 +02:00
Wadim Korablin
150013060e Support for manual CLIP loading in StableDiffusionPipeline - txt2img. (#3832)
* Support for manual CLIP loading in StableDiffusionPipeline - txt2img.

* Update src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py

* Update variables & corresponding docs to match previous style.

* Updated to match style & quality of 'diffusers'

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-06-28 15:29:48 +02:00
Patrick von Platen
219636f7e4 improve tolerance 2023-06-28 13:29:36 +00:00
Vincent Neemie
35bac5edec Fixing the global_step key not found (#3844)
* Fixing the global_step key not found

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-06-28 14:36:33 +02:00
Saurav Maheshkar
0bf6aeb885 feat: rename single-letter vars in resnet.py (#3868)
feat: rename single-letter vars
2023-06-28 13:31:32 +02:00
Joachim Blaafjell Holwech
9a45d7fb76 Add guidance start/stop (#3770)
* Add guidance start/stop

* Add guidance start/stop to inpaint class

* Black formatting

* Add support for guidance for multicontrolnet

* Add inclusive end

* Improve design

* correct imports

* Finish

* Finish all

* Correct more

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-06-27 01:04:11 +02:00
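The guidance start/stop commit above lets ControlNet conditioning be applied to only part of the denoising trajectory through the `control_guidance_start` and `control_guidance_end` call arguments. A hedged usage sketch; the model ids and the pre-computed edge image are assumptions:

```python
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained(
    "lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16
)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

control_image = load_image("canny_edges.png")  # pre-computed Canny edge map (path assumed)

# Apply the ControlNet signal only during the first 80% of the denoising steps.
image = pipe(
    "a painting of a woman reading a letter",
    image=control_image,
    control_guidance_start=0.0,
    control_guidance_end=0.8,
).images[0]
```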
regisss
61916fefc4 Update Habana Gaudi doc (#3863)
* Update Habana Gaudi doc

* Fix typo
2023-06-24 21:17:11 +02:00
Sayak Paul
fc6acb6b97 [Docs] add: contributor note in the paradigms docs. (#3852)
add: contributor note in the paradigms docs.
2023-06-22 17:54:35 +05:30
Patrick von Platen
5e3f8fff40 Fix some audio tests (#3841)
* Fix some audio tests

* make style

* fix

* make style
2023-06-22 13:53:27 +02:00
Patrick von Platen
5df2acf7d2 [Conversion] Small fixes (#3848)
* [Conversion] Small fixes

* Update src/diffusers/pipelines/stable_diffusion/convert_from_ckpt.py
2023-06-22 13:52:59 +02:00
Patrick von Platen
88d269461c Correct bad attn naming (#3797)
* relax tolerance slightly

* correct incorrect naming

* correct naming

* correct more

* Apply suggestions from code review

* Fix more

* Correct more

* correct incorrect naming

* Update src/diffusers/models/controlnet.py

* Correct flax

* Correct renaming

* Correct blocks

* Fix more

* Correct more

* make style

* make style

* make style

* make style

* make style

* Fix flax

* make style

* rename

* rename

* rename attn head dim to attention_head_dim

* correct flax

* make style

* improve

* Correct more

* make style

* fix more

* make style

* Update src/diffusers/models/controlnet_flax.py

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-06-22 13:52:48 +02:00
Robert Dargavel Smith
0c6d1bc985 fix audio_diffusion tests (#3850) 2023-06-22 12:27:39 +02:00
Sayak Paul
13e781f9a5 fix: random module seeding (#3846) 2023-06-22 12:26:55 +02:00
Will Berman
0bab447670 relax tol attention conversion test (#3842) 2023-06-21 12:35:38 -07:00
Steven Liu
1f02087607 [docs] More API stuff (#3835)
* clean up loaders

* clean up rest of main class apis

* apply feedback
2023-06-21 11:07:23 -07:00
YiYi Xu
95ea538c79 Add ddpm kandinsky (#3783)
* update doc

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-06-21 07:23:18 -10:00
Hans Brouwer
ef3844d3a8 Support ControlNet models with different number of channels in control images (#3815)
support ControlNet models with a different hint_channels value (e.g. TemporalNet2)
2023-06-21 13:11:45 +02:00
dqueue
3ebbaf7c96 Update control_brightness.mdx (#3825) 2023-06-20 14:09:51 +02:00
Andy Shih
73b125df68 [Pipeline] Add new pipeline for ParaDiGMS -- parallel sampling of diffusion models (#3716)
* add paradigms parallel sampling pipeline

* linting

* ran make fix-copies

* add paradigms parallel sampling pipeline

* linting

* ran make fix-copies

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* changes based on review

* add docs for paradigms

* update docs with paradigms abstract

* improve documentation, and add tests for ddim/ddpm batch_step_no_noise

* fix docs and run make fix-copies

* minor changes to docs.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* move parallel scheduler to new classes for DDPMParallelScheduler and DDIMParallelScheduler

* remove changes for scheduling_ddim, adjust licenses, credits, and commented code

* fix tensor type that is breaking tests

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-06-20 15:04:26 +05:30
Sayak Paul
88eb04489d [Docs] add missing pipelines from the overview pages and minor fixes (#3795)
* add entry for safe stable diffusion to the sd overview page.

* add missing pipelines o the broader overview section in the pipelines.

* address PR feedback./
2023-06-20 11:15:21 +05:30
Sayak Paul
4870626728 [Examples] Improve the model card pushed from the train_text_to_image.py script (#3810)
* refactor: readme serialized from the example when push_to_hub is True.

* fix: batch size arg.

* a bit better formatting

* minor fixes.

* add note on env.

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* condition wandb info better

* make mixed_precision assignment in cli args explicit.

* separate inference block for sample images.

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* address more comments.

* autocast mode.

* correct none image type problem.

* ifx: list assignment.

* minor fix.

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-06-20 08:59:41 +05:30
estelleafl
666743302f [ldm3d] Fixed small typo (#3820)
* fixed typo

* updated doc to be consistent in naming

* make style/quality

---------

Co-authored-by: Aflalo <estellea@isl-iam1.rr.intel.com>
2023-06-19 17:38:02 +02:00
Steven Liu
f7cc9adc05 [docs] Zero SNR (#3776)
* add zero snr doc

* fix image link

* apply feedback

* separate page
2023-06-16 13:19:37 -07:00
Will Berman
59aefe9ea6 device map legacy attention block weight conversion (#3804) 2023-06-16 10:39:20 -07:00
Will Berman
3ddc2b7395 [train text to image] add note to loading from checkpoint (#3806)
add note to loading from checkpoint
2023-06-16 11:54:49 +05:30
Will Berman
d49e2dd54c manual check for checkpoints_total_limit instead of using accelerate (#3681)
* manual check for checkpoints_total_limit instead of using accelerate

* remove controlnet_conditioning_embedding_out_channels
2023-06-15 15:38:54 -07:00
Isotr0py
7bfd2375c7 fix typo (#3800) 2023-06-15 22:00:47 +05:30
Patrick von Platen
ea8ae8c639 Complete set_attn_processor for prior and vae (#3796)
* relax tolerance slightly

* Add more tests

* upload readme

* upload readme

* Apply suggestions from code review

* Improve API Autoencoder KL

* finalize

* finalize tests

* finalize tests

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* up

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-15 17:42:49 +02:00
estelleafl
958d9ec723 Ldm3d first PR (#3668)
* added ldm3d pipeline and updated image processor to support depth

* added description

* added paper reference

* added docs

* fixed bug

* added test

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update tests/pipelines/stable_diffusion/test_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* added reference in indexmdx

* reverted changes to image processor

* added LDM3DOutput

* Fixes with make style

* fix failing tests for make fix-copies

* aligned with our version

* Update pipeline_stable_diffusion_ldm3d.py

updated the guidance scale

* Fix for failing check_code_quality test

* Code review feedback

* Fix typo in ldm3d_diffusion.mdx

* updated the doc accordingly

* copyrights

* fixed test failure

* make style

* added image processor of LDM3D in the documentation:

* added ldm3d doc to toctree

* run make style && make quality

* run make fix-copies

* Update docs/source/en/api/image_processor.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/ldm3d_diffusion.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update docs/source/en/api/pipelines/stable_diffusion/ldm3d_diffusion.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* updated the safety checker to accept tuple

* make style and make quality

* Update src/diffusers/pipelines/stable_diffusion/__init__.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_ldm3d.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* LDM3D output

* up

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Aflalo <estellea@isl-gpu27.rr.intel.com>
Co-authored-by: Anahita Bhiwandiwalla <anahita.bhiwandiwalla@intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu26.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-iam1.rr.intel.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Aflalo <estellea@isl-gpu42.rr.intel.com>
Co-authored-by: Aflalo <estellea@isl-gpu43.rr.intel.com>
2023-06-15 17:36:52 +02:00
Sayak Paul
77f9137f10 feat: add PR template. (#3786)
* feat: add PR template.

* address pr comments.

* Update .github/PULL_REQUEST_TEMPLATE.md

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-06-15 19:41:54 +05:30
Naga Sai Abhinay
231bdf2e56 UnCLIP Image Interpolation -> Keep same initial noise across interpolation steps (#3782)
* Maintain same decoder start noise for all interp steps

* Correct comment

* use batch_size for consistency
2023-06-15 15:15:40 +02:00
Arpan Tripathi
75124fc91e Added LoRA loading to StableDiffusionKDiffusionPipeline (#3751)
Added `LoraLoaderMixin` to `StableDiffusionKDiffusionPipeline`
2023-06-15 15:09:44 +02:00
Patrick von Platen
908e5e9cc6 Fix some bad comment in training scripts (#3798)
* relax tolerance slightly

* correct incorrect naming
2023-06-15 15:07:51 +02:00
cmdr2
2715079344 Fix broken cpu-offloading in legacy inpainting SD pipeline (#3773) 2023-06-15 14:56:40 +02:00
takuoko
1ae15fa64c [Enhance] Update reference (#3723)
* update reference pipeline

* update reference pipeline

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-06-15 14:34:12 +02:00
Sayak Paul
027a365a62 [Bug Report template] modify the issue template to include core maintainers. (#3785)
* modify the issue template to include core maintainers.

* add: entry for audio.

* Update .github/ISSUE_TEMPLATE/bug-report.yml

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-06-15 07:43:07 +05:30
Steven Liu
f96b760658 [docs] Fix Colab notebook cells (#3777)
fix colab notebook cells
2023-06-14 10:21:39 -07:00
YiYi Xu
7761b89d7b update conversion script for Kandinsky unet (#3766)
* update kandinsky conversion script

* style

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-06-14 06:57:53 -10:00
jfozard
ce5504934a Update pipeline_flax_stable_diffusion_controlnet.py (#3306)
Update pipeline_flax_controlnet.py

Change type of images array from jax.numpy.array to numpy.ndarray to permit in-place modification of the array when the safety checker detects a NSFW image.
2023-06-12 14:25:46 -10:00
Patrick von Platen
34d14d7848 [MultiControlNet] Allow save and load (#3747)
* [MultiControlNet] Allow save and load

* Correct more

* [MultiControlNet] Allow save and load

* make style

* Apply suggestions from code review
2023-06-12 18:29:58 +02:00
Patrick von Platen
ef9590712a [Tests] Relax tolerance of flaky failing test (#3755)
relax tolerance slightly
2023-06-12 18:28:30 +02:00
Andranik Movsisyan
a812fb6f5c Text2video zero refinements (#3733)
* fix docs typos. add frame_ids argument to text2video-zero pipeline call

* make style && make quality

* add support of pytorch 2.0 scaled_dot_product_attention for CrossFrameAttnProcessor

* add chunk-by-chunk processing to text2video-zero docs

* make style && make quality

* Update docs/source/en/api/pipelines/text_to_video_zero.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-12 18:03:18 +02:00
Liam Swayne
f46b22ba13 [documentation] grammatical fixes in installation.mdx (#3735)
Update installation.mdx
2023-06-12 17:42:01 +02:00
JeLuF
b2b13cd315 [Documentation] Replace dead link to Flax install guide (#3739)
Replace dead link to Flax documentation

Replace the dead link to the Flax installation guide by a working one: https://flax.readthedocs.io/en/latest/#installation
2023-06-12 17:40:48 +02:00
Patrick von Platen
38adcd21bd [Stable Diffusion Inpaint & ControlNet inpaint] Correct timestep inpaint (#3749)
* Correct timestep inpaint

* make style

* Fix

* Apply suggestions from code review

* make style
2023-06-12 13:59:38 +02:00
Patrick von Platen
790212f4d9 Correct another push token (#3745)
clean up more
2023-06-12 10:29:23 +02:00
Patrick von Platen
11aa105077 Correct Token to upload docs (#3744)
clean up more
2023-06-12 10:04:45 +02:00
Patrick von Platen
abbfe4b5b7 fix zh 2023-06-10 17:54:55 +02:00
Patrick von Platen
1d50f47a58 Merge branch 'main' of https://github.com/huggingface/diffusers 2023-06-10 17:04:59 +02:00
Patrick von Platen
e891b00dfc build docs 2023-06-10 16:58:59 +02:00
Patrick von Platen
27af55d1b4 build docs 2023-06-10 16:56:41 +02:00
YiYi Xu
05361960f2 remove seed (#3734)
* remove seed

* style

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-06-09 08:27:02 -10:00
Patrick von Platen
c42f6ee43e Post 0.17.0 release (#3721)
* Post release

* Post release
2023-06-08 18:08:49 +02:00
Patrick von Platen
f523b11a10 Fix loading if unexpected keys are present (#3720)
* Fix loading

* make style
2023-06-08 16:48:06 +02:00
Zachary Mueller
79fa94ea8b Apply deprecations from Accelerate (#3714)
Apply deprecations
2023-06-08 16:44:22 +02:00
Patrick von Platen
a06317abea [Actions] Fix actions (#3712) 2023-06-07 18:57:28 +01:00
YiYi Xu
500a3ff9ef [docs] add image processor documentation (#3710)
add image processor

Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
2023-06-07 18:35:07 +01:00
Mishig
8caa530069 [doc build] Use secrets (#3707)
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-06-07 18:21:16 +01:00
Kadir Nar
cd6186907c [Community] Support StableDiffusionCanvasPipeline (#3590)
* added StableDiffusionCanvasPipeline pipeline

* Added utils codes to pipe_utils file.

* make style

* delete mixture.py and Text2ImageRegion class

* make style

* Added the codes to the readme.md file.

* Moved functions from pipeline_utils to mix_canvas
2023-06-07 17:43:33 +01:00
Patrick von Platen
803d653748 Fix custom releases (#3708)
* Fix custom releases

* make style
2023-06-07 17:33:54 +01:00
Alex McKinney
cd9d0913d9 Fixes eval generator init in train_text_to_image_lora.py (#3678) 2023-06-07 15:37:13 +05:30
Pedro Cuenca
fdec23188a [Tests] Run slow matrix sequentially (#3500)
[tests] Run slow matrix sequentially.
2023-06-07 11:01:35 +01:00
Max-We
12a232efa9 Fix schedulers zero SNR and rescale classifier free guidance (#3664)
* Implement option for rescaling betas to zero terminal SNR

* Implement rescale classifier free guidance in pipeline_stable_diffusion.py

* focus on DDIM

* make style

* make style

* make style

* make style

* Apply suggestions from Peter Lin

* Apply suggestions from Peter Lin

* make style

* Apply suggestions from code review

* Apply suggestions from code review

* make style

* make style

---------

Co-authored-by: MaxWe00 <gitlab.9v1lq@slmail.me>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-06-07 10:57:10 +01:00
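The zero-SNR commit above adds a `rescale_betas_zero_snr` option to DDIM and a `guidance_rescale` argument to the Stable Diffusion call. A minimal sketch of turning both on; the model id and rescale value are assumptions:

```python
from diffusers import StableDiffusionPipeline, DDIMScheduler

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Rescale betas for zero terminal SNR and sample starting from the last timestep.
pipe.scheduler = DDIMScheduler.from_config(
    pipe.scheduler.config, rescale_betas_zero_snr=True, timestep_spacing="trailing"
)

# guidance_rescale re-normalizes the classifier-free-guidance output at inference time.
image = pipe("a starry night over a calm lake", guidance_rescale=0.7).images[0]
```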
Patrick von Platen
74fd735eb0 Add draft for lora text encoder scale (#3626)
* Add draft for lora text encoder scale

* Improve naming

* fix: training dreambooth lora script.

* Apply suggestions from code review

* Update examples/dreambooth/train_dreambooth_lora.py

* Apply suggestions from code review

* Apply suggestions from code review

* add lora mixin when fit

* add lora mixin when fit

* add lora mixin when fit

* fix more

* fix more

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-06 22:47:46 +01:00
Jason C.H
2de9e2df36 Fix from_ckpt for Stable Diffusion 2.x (#3662) 2023-06-06 22:39:11 +01:00
Isotr0py
11b3002b48 Support views batch for panorama (#3632)
* support views batch for panorama

* add entry for the new argument

* format entry for the new argument

* add view_batch_size test

* fix batch test and a boundary condition

* add more docstrings

* fix a typo

* fix typos

* add: entry to the doc about view_batch_size.

* Revert "add: entry to the doc about view_batch_size."

This reverts commit a36aeaa9ed.

* add a tip on .

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-07 02:50:02 +05:30
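The panorama commit above adds a `view_batch_size` argument so several sliding windows are denoised per forward pass. A rough sketch; the model id, resolution, and batch value are assumptions:

```python
import torch
from diffusers import StableDiffusionPanoramaPipeline, DDIMScheduler

model_id = "stabilityai/stable-diffusion-2-base"
scheduler = DDIMScheduler.from_pretrained(model_id, subfolder="scheduler")
pipe = StableDiffusionPanoramaPipeline.from_pretrained(
    model_id, scheduler=scheduler, torch_dtype=torch.float16
).to("cuda")

# view_batch_size > 1 trades GPU memory for speed when stitching the panorama views.
image = pipe(
    "a photo of the dolomites", height=512, width=2048, view_batch_size=4
).images[0]
```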
stano
10f4ecd177 Fix the Kandinsky docstring examples (#3695)
- use the correct Prior hub model id
- use the new names in KandinskyPriorPipelineOutput
2023-06-06 22:18:14 +01:00
Sayak Paul
de16f64667 feat: when using PT 2.0 use LoRAAttnProcessor2_0 for text enc LoRA. (#3691) 2023-06-06 21:20:53 +01:00
YiYi Xu
017ee1609b refactor Image processor for x4 upscaler (#3692)
* refactor x4 upscaler

* style

* copies

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-06-06 21:08:36 +01:00
Sayak Paul
8669e8313d [LoRA] feat: add lora attention processor for pt 2.0. (#3594)
* feat: add lora attention processor for pt 2.0.

* explicit context manager for SDPA.

* switch to flash attention

* make shapes compatible to work optimally with SDPA.

* fix: circular import problem.

* explicitly specify the flash attention kernel in sdpa

* fall back to efficient attention context manager.

* remove explicit dispatch.

* fix: removed processor.

* fix: remove optional from type annotation.

* feat: make changes regarding LoRAAttnProcessor2_0.

* remove confusing warning.

* formatting.

* relax tolerance for PT 2.0

* fix: loading message.

* remove unnecessary logging.

* add: entry to the docs.

* add: network_alpha argument.

* relax tolerance.
2023-06-06 14:56:05 +05:30
Takuma Mori
b45204ea5a Add function to remove monkey-patch for text encoder LoRA (#3649)
* merge undoable-monkeypatch

* remove TEXT_ENCODER_TARGET_MODULES, refactoring

* move create_lora_weight_file
2023-06-06 14:06:13 +05:30
Steven Liu
a8b0f42c38 [docs] Fix link to loader method (#3680)
fix link to load_lora_weights
2023-06-06 13:37:47 +05:30
Will Berman
41ae670828 move activation dispatches into helper function (#3656)
* move activation dispatches into helper function

* tests
2023-06-05 12:30:48 -07:00
Will Berman
462956be7b small tweaks for parsing thibaudz controlnet checkpoints (#3657) 2023-06-05 10:24:31 -07:00
YiYi Xu
5990014700 [WIP]Vae preprocessor refactor (PR1) (#3557)
VaeImageProcessor.preprocess refactor

* refactored VaeImageProcessor
   - allow passing optional height and width argument to resize()
   - add convert_to_rgb
* refactored prepare_latents method for img2img pipelines so that if we pass latents directly as image input, it will not encode it again
* added a test in test_pipelines_common.py to test latents as image inputs
* refactored img2img pipelines that accept latents as image:
   - controlnet img2img, stable diffusion img2img, instruct_pix2pix

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-05 07:11:00 -10:00
Steven Liu
1a6a647e06 [docs] More API fixes (#3640)
* part 2 of api fixes

* move randn_tensor

* add to toctree

* apply feedback

* more feedback
2023-06-05 09:47:26 -07:00
Sayak Paul
995bbcb9aa [UniDiffuser test] fix one test so that it runs correctly on V100 (#3675)
* fix: assertion.

* assertion fix.
2023-06-05 17:42:31 +05:30
pdoane
d0416ab090 Update Compel documentation for textual inversions (#3663)
* Update Compel documentation for textual inversions

* Fix typo
2023-06-05 16:46:27 +05:30
Vladislav Lyubimov
1994dbcb5e Fix from_ckpt not working properly on windows (#3666) 2023-06-05 11:55:37 +01:00
Patrick von Platen
262d539a8a Correct multi gpu dreambooth (#3673)
Correct multi gpu
2023-06-05 11:03:11 +01:00
Will Berman
0fc2fb71c1 dreambooth upscaling fix added latents (#3659) 2023-06-05 10:32:16 +01:00
Steven Liu
523a50a8eb [docs] Load A1111 LoRA (#3629)
* load a1111 lora

* fix

* apply feedback

* fix
2023-06-05 11:05:42 +05:30
0x1355
de45af4a46 Allow setting num_cycles for cosine_with_restarts lr scheduler (#3606)
Expose num_cycles kwarg of get_schedule() through args.lr_num_cycles.
2023-06-05 10:18:29 +05:30
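The commit above exposes `num_cycles` of the cosine-with-restarts schedule through `--lr_num_cycles`, which the training scripts pass on to `diffusers.optimization.get_scheduler`. A tiny sketch; the optimizer and step counts are placeholders:

```python
import torch
from diffusers.optimization import get_scheduler

model = torch.nn.Linear(4, 4)
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-4)

# With num_cycles=2 the cosine schedule restarts twice across the 1000 training steps.
lr_scheduler = get_scheduler(
    "cosine_with_restarts",
    optimizer=optimizer,
    num_warmup_steps=100,
    num_training_steps=1000,
    num_cycles=2,
)
```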
0x1355
b95cbdf6fc Set step_rules correctly for piecewise_constant scheduler (#3605)
So that schedule_func() calls get_piecewise_constant_schedule() with correctly named kwarg.
2023-06-05 10:16:26 +05:30
Will Berman
7a39691362 linting fix (#3653) 2023-06-02 13:33:19 -07:00
Will Berman
5911a3aa47 dreambooth if docs - stage II, more info (#3628)
* dreambooth if docs - stage II, more info

* Update docs/source/en/training/dreambooth.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/training/dreambooth.mdx

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update docs/source/en/training/dreambooth.mdx

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* download instructions for downsized images

* update source README to match docs

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-02 10:37:13 -07:00
Will Berman
b7af946138 set config from original module but set compiled module on class (#3650)
* set config from original module but set compiled module on class

* add test
2023-06-02 10:26:41 -07:00
asfiyab-nvidia
d3717e6368 add Stable Diffusion TensorRT Inpainting pipeline (#3642)
* add tensorrt inpaint pipeline

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* run make style

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-06-02 18:14:31 +01:00
Kadir Nar
0dbdc0cbae [Community Doc] Updated the filename and readme file. (#3634)
* Updated the filename and readme file.

* reformatter

* reformatter
2023-06-02 17:53:09 +01:00
YiYi Xu
0e8688113a fix inpainting pipeline when providing initial latents (#3641)
* fix latents

* fix copies

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-06-02 17:03:15 +01:00
Kashif Rasul
f1d4743394 fixed typo in example train_text_to_image.py (#3608)
fixed typo
2023-06-02 20:54:54 +05:30
Lachlan Nicholson
a6c7b5b6b7 Iterate over unique tokens to avoid duplicate replacements for multivector embeddings (#3588)
* iterate over unique tokens to avoid duplicate replacements

* added test for multiple references to multi embedding

* adhere to black formatting

* reorder test post-rebase
2023-06-02 16:10:22 +01:00
Takuma Mori
8e552bb4fe Support Kohya-ss style LoRA file format (in a limited capacity) (#3437)
* add _convert_kohya_lora_to_diffusers

* make style

* add scaffold

* match result: unet attention only

* fix monkey-patch for text_encoder

* with CLIPAttention

While the terrible images are no longer produced,
the results do not match those from the hook version.
This may be due to not setting the network_alpha value.

* add to support network_alpha

* generate diff image

* fix monkey-patch for text_encoder

* add test_text_encoder_lora_monkey_patch()

* verify that it's okay to release the attn_procs

* fix closure version

* add comment

* Revert "fix monkey-patch for text_encoder"

This reverts commit bb9c61e6fa.

* Fix to reuse utility functions

* make LoRAAttnProcessor targets to self_attn

* fix LoRAAttnProcessor target

* make style

* fix split key

* Update src/diffusers/loaders.py

* remove TEXT_ENCODER_TARGET_MODULES loop

* add print memory usage

* remove test_kohya_loras_scaffold.py

* add: doc on LoRA civitai

* remove print statement and refactor in the doc.

* fix state_dict test for kohya-ss style lora

* Apply suggestions from code review

Co-authored-by: Takuma Mori <takuma104@gmail.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-06-02 17:40:24 +05:30
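The Kohya-ss commit above teaches `load_lora_weights` to convert A1111/Kohya-style LoRA checkpoints, including the text-encoder part, into diffusers attention processors. A hedged usage sketch; the model id and LoRA file name are assumptions:

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

# Load a Kohya-ss style .safetensors LoRA from the current directory (file name assumed).
pipe.load_lora_weights(".", weight_name="my_character_lora.safetensors")

image = pipe("masterpiece, best quality, portrait of a knight").images[0]
```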
Patrick von Platen
32ea2142c0 [Kandinsky] Improve kandinsky API a bit (#3636)
* Improve docs

* up

* Update docs/source/en/api/pipelines/kandinsky.mdx

* up

* up

* correct more

* further improve

* Update docs/source/en/api/pipelines/kandinsky.mdx

Co-authored-by: YiYi Xu <yixu310@gmail.com>

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2023-06-02 08:57:20 +01:00
Sayak Paul
55dbfa0229 [Docs] include the instruction-tuning blog link in the InstructPix2Pix docs (#3644)
include the instruction-tuning blog link.
2023-06-02 08:04:35 +05:30
Will Berman
4f14b36329 Full Dreambooth IF stage II upscaling (#3561)
* update dreambooth lora to work with IF stage II

* Update dreambooth script for IF stage II upscaler
2023-05-31 09:39:31 -07:00
Will Berman
f751b8844e update dreambooth lora to work with IF stage II (#3560) 2023-05-31 09:39:03 -07:00
Prathik Rao
abb89da4de update code to reflect latest changes as of May 30th (#3616)
* update code to reflect latest changes as of May 30th

* update text to image example

* reflect changes to textual inversion

* make style

* fix typo

* Revert unnecessary readme changes

---------

Co-authored-by: root <root@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
Co-authored-by: Prathik Rao <prathikrao@microsoft.com@orttrainingdev8.d32nl1ml4oruzj4qz3bqlggovf.px.internal.cloudapp.net>
2023-05-31 11:29:04 +02:00
Will Berman
7d0ac4eeab goodbye frog (#3617) 2023-05-30 23:18:01 +01:00
Patrick von Platen
0cc3a7a123 Make sure we also change the config when setting encoder_hid_dim_type=="text_proj" and allow xformers (#3615)
* fix if

* make style

* make style

* add tests for xformers

* make style

* update
2023-05-30 20:47:14 +01:00
Patrick von Platen
9d3ff0794d fix tests (#3614) 2023-05-30 18:59:07 +01:00
Patrick von Platen
a359ab4e29 Update README.md 2023-05-30 18:26:32 +01:00
Patrick von Platen
160c377ddc Make style 2023-05-30 13:14:09 +01:00
Denis
bb22d546c0 [Community] CLIP Guided Images Mixing with Stable DIffusion Pipeline (#3587)
* added clip_guided_images_mixing_stable_diffusion file and readme description

* apply pre-commit

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-30 13:13:45 +01:00
Greg Hunkins
799f5b4e12 [Feat] Enable State Dict For Textual Inversion Loader (#3439)
* enable state dict for textual inversion loader

* Empty-Commit | restart CI

* Empty-Commit | restart CI

* Empty-Commit | restart CI

* Empty-Commit | restart CI

* add tests

* fix tests

* fix tests

* fix tests

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-30 13:13:34 +01:00
takuoko
07ef4855cd [Community, Enhancement] Add reference tricks in README (#3589)
add reference tricks
2023-05-30 12:38:16 +01:00
Kadir Nar
6cbddf558a [Community] Support StableDiffusionTilingPipeline (#3586)
* added mixture pipeline

* added docstring

* update docstring
2023-05-30 12:24:15 +01:00
Rupert Menneer
35a740427e #3487 Fix inpainting strength for various samplers (#3532)
* Throw error if strength adjusted num_inference_steps < 1

* Added new fast test to check ValueError raised when num_inference_steps < 1

when strength adjusts the num_inference_steps then the inpainting pipeline should fail

* fix #3487 initial latents are now only scaled by init_noise_sigma when pure noise

updated this commit w.r.t the latest merge here: https://github.com/huggingface/diffusers/pull/3533

* fix

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-30 12:17:42 +01:00
Sayak Paul
0612f48cd0 [UniDiffuser Tests] Fix some tests (#3609)
* fix: unidiffuser test failures.

* living room.
2023-05-30 12:07:18 +01:00
Kadir Nar
c059cc0992 [docs] update the broken links (#3577) 2023-05-30 11:44:53 +01:00
Patrick von Platen
c0f867afd1 Fix temb attention (#3607)
* Fix temb attention

* Apply suggestions from code review

* make style

* Add tests and fix docker

* Apply suggestions from code review
2023-05-30 11:26:23 +01:00
Sayak Paul
c6ae883751 remove print statements from attention processor. (#3592) 2023-05-29 09:20:31 +05:30
Steven Liu
5559d04237 [docs] Working with different formats (#3534)
* add ckpt

* fix format

* apply feedback

* fix

* include pb

* rename file
2023-05-26 14:37:51 -07:00
Brandon
9917c32916 [docs] update the broken links (#3568)
update the broken links

update the broken links for training folder doc
2023-05-26 12:10:32 -07:00
Steven Liu
ab986769f1 [docs] Maintenance (#3552)
* doc fixes

* fix latex

* parenthesis on inside
2023-05-26 12:04:15 -07:00
Will Berman
bdc75e753d [IF super res] correctly normalize PIL input (#3536)
* [IF super res] correctly normalize PIL input

* 175 -> 127.5
2023-05-26 10:59:44 -07:00
Leon Lin
1d1f648c6b fix dreambooth attention mask (#3541) 2023-05-26 10:58:50 -07:00
Takuma Mori
67cf0445ef Fix to apply LoRAXFormersAttnProcessor instead of LoRAAttnProcessor when xFormers is enabled (#3556)
* fix to use LoRAXFormersAttnProcessor

* add test

* using new LoraLoaderMixin.save_lora_weights

* add test_lora_save_load_with_xformers
2023-05-26 17:33:25 +05:30
dg845
352ca3198c [WIP] Add UniDiffuser model and pipeline (#2963)
* Fix a bug of pano when not doing CFG (#3030)

* Fix a bug of pano when not doing CFG

* enhance code quality

* apply formatting.

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Text2video zero refinements (#3070)

* fix progress bar issue in pipeline_text_to_video_zero.py. Copy scheduler after first backward

* fix tensor loading in test_text_to_video_zero.py

* make style && make quality

* Release: v0.15.0

* [Tests] Speed up panorama tests (#3067)

* fix: norm group test for UNet3D.

* chore: speed up the panorama tests (fast).

* set default value of _test_inference_batch_single_identical.

* fix: batch_sizes default value.

* [Post release] v0.16.0dev (#3072)

* Adds profiling flags, computes train metrics average. (#3053)

* WIP controlnet training

- bugfix --streaming
- bugfix running report_to!='wandb'
- adds memory profile before validation

* Adds final logging statement.

* Sets train epochs to 11.

Looking at a longer ~16ep run, we see only good validation images
after ~11ep:

https://wandb.ai/andsteing/controlnet_fill50k/runs/3j2hx6n8

* Removes --logging_dir (it's not used).

* Adds --profile flags.

* Updates --output_dir=runs/fill-circle-{timestamp}.

* Compute mean of `train_metrics`.

Previously `train_metrics[-1]` was logged, resulting in very bumpy train
metrics.

* Improves logging a bit.

- adds l2_grads gradient norm logging
- adds steps_per_sec
- sets walltime as x coordinate of train/step
- logs controlnet_params config

* Adds --ccache (doesn't really help though).

* minor fix in controlnet flax example (#2986)

* fix the error when using push_to_hub without logging validation

* controlnet_from_pt & controlnet_revision

* add intermediate checkpointing to the guide

* Bugfix --profile_steps

* Sets `TRACKER_PROJECT_NAME='controlnet_fill50k'`.

* Logs fractional epoch.

* Adds relative `walltime` metric.

* Adds `StepTraceAnnotation` and uses `global_step` instead of `step`.

* Applied `black`.

* Streamlines commands in README a bit.

* Removes `--ccache`.

This makes only a very small difference (~1 min) with this model size, so removing
the option introduced in cdb3cc.

* Re-ran `black`.

* Update examples/controlnet/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Converts spaces to tab.

* Removes repeated args.

* Skips first step (compilation) in profiling

* Updates README with profiling instructions.

* Unifies tabs/spaces in README.

* Re-ran style & quality.

---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [Pipelines] Make sure that None functions are correctly not saved (#3080)

* doc string example remove from_pt (#3083)

* [Tests] parallelize (#3078)

* [Tests] parallelize

* finish folder structuring

* Parallelize tests more

* Correct saving of pipelines

* make sure logging level is correct

* try again

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Throw deprecation warning for return_cached_folder (#3092)

Throw deprecation warning

* Allow SD attend and excite pipeline to work with any size output images (#2835)

Allow stable diffusion attend and excite pipeline to work with any size output image. Re: #2476, #2603

* [docs] Update community pipeline docs (#2989)

* update community pipeline docs

* fix formatting

* explain sharing workflows

* Add to support Guess Mode for StableDiffusionControlnetPipleline (#2998)

* add guess mode (WIP)

* fix uncond/cond order

* support guidance_scale=1.0 and batch != 1

* remove magic coeff

* add docstring

* add integration test

* add document to controlnet.mdx

* made the comments a bit more explanatory

* fix table

* fix default value for attend-and-excite (#3099)

* fix default

* remove one line as requested by gc team (#3077)

remove one line

* ddpm custom timesteps (#3007)

add custom timesteps test

add custom timesteps descending order check

docs

timesteps -> custom_timesteps

can only pass one of num_inference_steps and timesteps

* Fix breaking change in `pipeline_stable_diffusion_controlnet.py` (#3118)

fix breaking change

* Add global pooling to controlnet (#3121)

* [Bug fix] Fix img2img processor with safety checker (#3127)

Fix img2img processor with safety checker

* [Bug fix] Make sure correct timesteps are chosen for img2img (#3128)

Make sure correct timesteps are chosen for img2img

* Improve deprecation warnings (#3131)

* Fix config deprecation (#3129)

* Better deprecation message

* Better deprecation message

* Better doc string

* Fixes

* fix more

* fix more

* Improve __getattr__

* correct more

* fix more

* fix

* Improve more

* more improvements

* fix more

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* make style

* Fix all rest & add tests & remove old deprecation fns

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* feat: verification of multi-gpu support for select examples. (#3126)

* feat: verification of multi-gpu support for select examples.

* add: multi-gpu training sections to the relevant doc pages.

* speed up attend-and-excite fast tests (#3079)

* Optimize log_validation in train_controlnet_flax (#3110)

extract pipeline from log_validation

* make style

* Correct textual inversion readme (#3145)

* Update README.md

* Apply suggestions from code review

* Add unet act fn to other model components (#3136)

Adding act fn config to the unet timestep class embedding and conv
activation.

The custom activation defaults to silu which is the default
activation function for both the conv act and the timestep class
embeddings so default behavior is not changed.

The only unet which use the custom activation is the stable diffusion
latent upscaler https://huggingface.co/stabilityai/sd-x2-latent-upscaler/blob/main/unet/config.json
(I ran a script against the hub to confirm).
The latent upscaler does not use the conv activation nor the timestep
class embeddings so we don't change its behavior.
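As a hedged illustration of what an activation-function config boils down to (the helper below is hypothetical, not the library's code; it only shows the silu default described above):

```python
import torch.nn as nn

_ACTIVATIONS = {"silu": nn.SiLU, "swish": nn.SiLU, "mish": nn.Mish, "gelu": nn.GELU, "relu": nn.ReLU}

def get_act_fn(name: str = "silu") -> nn.Module:
    # the default stays "silu", so existing checkpoints keep their behavior
    try:
        return _ACTIVATIONS[name]()
    except KeyError:
        raise ValueError(f"Unsupported activation function: {name}")

time_embed_act = get_act_fn("silu")  # timestep/class-embedding MLP activation
conv_act = get_act_fn("silu")        # post-conv activation
```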

* class labels timestep embeddings projection dtype cast (#3137)

This mimics the dtype cast for the standard time embeddings

* [ckpt loader] Allow loading the Inpaint and Img2Img pipelines, while loading a ckpt model (#2705)

* [ckpt loader] Allow loading the Inpaint and Img2Img pipelines, while loading a ckpt model

* Address review comment from PR

* PyLint formatting

* Some more pylint fixes, unrelated to our change

* Another pylint fix

* Styling fix

* add from_ckpt method as Mixin (#2318)

* add mixin class for pipeline from original sd ckpt

* Improve

* make style

* merge main into

* Improve more

* fix more

* up

* Apply suggestions from code review

* finish docs

* rename

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add TensorRT SD/txt2img Community Pipeline to diffusers along with TensorRT utils (#2974)

* Add SD/txt2img Community Pipeline to diffusers along with TensorRT utils

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* update installation command

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* update tensorrt installation

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* changes
1. Update setting of cache directory
2. Address comments: merge utils and pipeline code.
3. Address comments: Add section in README

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* apply make style

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Correct `Transformer2DModel.forward` docstring (#3074)

⚙️chore(transformer_2d) update function signature for encoder_hidden_states

* Update pipeline_stable_diffusion_inpaint_legacy.py (#2903)

* Update pipeline_stable_diffusion_inpaint_legacy.py

* fix preprocessing of Pil images with adequate batch size

* revert map

* add tests

* reformat

* Update test_stable_diffusion_inpaint_legacy.py

* Update test_stable_diffusion_inpaint_legacy.py

* Update test_stable_diffusion_inpaint_legacy.py

* Update test_stable_diffusion_inpaint_legacy.py

* next try to fix the style

* wth is this

* Update testing_utils.py

* Update testing_utils.py

* Update test_stable_diffusion_inpaint_legacy.py

* Update test_stable_diffusion_inpaint_legacy.py

* Update test_stable_diffusion_inpaint_legacy.py

* Update test_stable_diffusion_inpaint_legacy.py

* Update test_stable_diffusion_inpaint_legacy.py

* Update test_stable_diffusion_inpaint_legacy.py

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Modified altdiffusion pipeline to support altdiffusion-m18 (#2993)

* Modified altdiffusion pipeline to support altdiffusion-m18

* Modified altdiffusion pipeline to support altdiffusion-m18

* Modified altdiffusion pipeline to support altdiffusion-m18

* Modified altdiffusion pipeline to support altdiffusion-m18

* Modified altdiffusion pipeline to support altdiffusion-m18

* Modified altdiffusion pipeline to support altdiffusion-m18

* Modified altdiffusion pipeline to support altdiffusion-m18

---------

Co-authored-by: root <fulong_ye@163.com>

* controlnet training resize inputs to multiple of 8 (#3135)

controlnet training center crop input images to multiple of 8

The pipeline code resizes inputs to multiples of 8.
Not doing this resizing in the training script is causing
the encoded image to have different height/width dimensions
than the encoded conditioning image (which uses a separate
encoder that's part of the controlnet model).

We resize and center crop the inputs to make sure they're the
same size (as well as all other images in the batch). We also
check that the initial resolution is a multiple of 8.
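A minimal sketch of that preprocessing, assuming torchvision transforms (the resolution value is illustrative):

```python
from torchvision import transforms

resolution = 512
if resolution % 8 != 0:
    raise ValueError(
        "resolution must be divisible by 8 so the encoded image and the encoded conditioning image have matching dimensions"
    )

# the same deterministic resize + center crop is applied to the image and the conditioning image
image_transforms = transforms.Compose(
    [
        transforms.Resize(resolution, interpolation=transforms.InterpolationMode.BILINEAR),
        transforms.CenterCrop(resolution),
        transforms.ToTensor(),
    ]
)
```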

* adding custom diffusion training to diffusers examples (#3031)

* diffusers==0.14.0 update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion update

* custom diffusion

* custom diffusion

* custom diffusion

* custom diffusion

* custom diffusion

* apply formatting and get rid of bare except.

* refactor readme and other minor changes.

* misc refactor.

* fix: repo_id issue and loaders logging bug.

* fix: save_model_card.

* fix: save_model_card.

* fix: save_model_card.

* add: doc entry.

* refactor doc.

* custom diffusion

* custom diffusion

* custom diffusion

* apply style.

* remove trailing whitespace.

* fix: toctree entry.

* remove unnecessary print.

* custom diffusion

* custom diffusion

* custom diffusion test

* custom diffusion xformer update

* custom diffusion xformer update

* custom diffusion xformer update

---------

Co-authored-by: Nupur Kumari <nupurkumari@Nupurs-MacBook-Pro.local>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Nupur Kumari <nupurkumari@nupurs-mbp.wifi.local.cmu.edu>

* make style

* Update custom_diffusion.mdx (#3165)

Add missing newlines for rendering the links correctly

* Added distillation for quantization example on textual inversion. (#2760)

* Added distillation for quantization example on textual inversion.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* refined readme and code style.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* Update text2images.py

* refined code of model load and added compatibility check.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* fixed code style.

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* fix C403 [*] Unnecessary `list` comprehension (rewrite as a `set` comprehension)

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

---------

Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>

* Update Noise Autocorrelation Loss Function for Pix2PixZero Pipeline (#2942)

* Update Pix2PixZero Auto-correlation Loss

* Add fast inversion tests

* Clarify purpose and mark as deprecated

Fix inversion prompt broadcasting

* Register modules set to `None` in config for `test_save_load_optional_components`

* Update new tests to coordinate with #2953

* [DreamBooth] add text encoder LoRA support in the DreamBooth training script (#3130)

* add: LoRA text encoder support for DreamBooth example.

* fix initialization.

* fix: modification call.

* add: entry in the readme.

* use dog dataset from hub.

* fix: params to clip.

* add entry to the LoRA doc.

* add: tests for lora.

* remove unnecessary list comprehension.

* Update Habana Gaudi documentation (#3169)

* Update Habana Gaudi doc

* Fix tables

* Add model offload to x4 upscaler (#3187)

* Add model offload to x4 upscaler

* fix

* [docs] Deterministic algorithms (#3172)

deterministic algos

* Update custom_diffusion.mdx to credit the author (#3163)

* Update custom_diffusion.mdx

* fix: unnecessary list comprehension.

* Fix TensorRT community pipeline device set function (#3157)

pass silence_dtype_warnings as kwarg

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make `from_flax` work for controlnet (#3161)

fix from_flax

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [docs] Clarify training args (#3146)

* clarify training arg

* apply feedback

* Multi Vector Textual Inversion (#3144)

* Multi Vector

* Improve

* fix multi token

* improve test

* make style

* Update examples/test_examples.py

* Apply suggestions from code review

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* update

* Finish

* Apply suggestions from code review

---------

Co-authored-by: Suraj Patil <surajp815@gmail.com>

* Add `Karras sigmas` to HeunDiscreteScheduler (#3160)

* Add karras pattern to discrete heun scheduler

* Add integration test

* Fix failing CI on pytorch test on M1 (mps)

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [AudioLDM] Fix dtype of returned waveform (#3189)

* Fix bug in train_dreambooth_lora (#3183)

* Update train_dreambooth_lora.py

fix bug

* Update train_dreambooth_lora.py

* [Community Pipelines] Update lpw_stable_diffusion pipeline (#3197)

* Update lpw_stable_diffusion.py

* fix cpu offload

* Make sure VAE attention works with Torch 2_0 (#3200)

* Make sure attention works with Torch 2_0

* make style

* Fix more

* Revert "[Community Pipelines] Update lpw_stable_diffusion pipeline" (#3201)

Revert "[Community Pipelines] Update lpw_stable_diffusion pipeline (#3197)"

This reverts commit 9965cb50ea.

* [Bug fix] Fix batch size attention head size mismatch (#3214)

* fix mixed precision training on train_dreambooth_inpaint_lora (#3138)

cast to weight dtype

* adding enable_vae_tiling and disable_vae_tiling functions (#3225)

adding enable_vae_tiling and disable_vae_tiling functions
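Typical usage of the new toggles (checkpoint id and sizes are illustrative):

```python
import torch
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

pipe.enable_vae_tiling()   # decode the latents tile by tile to keep VRAM flat on large images
image = pipe("a castle on a mountain, highly detailed", height=1024, width=1024).images[0]
pipe.disable_vae_tiling()  # back to one-shot VAE decoding
```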

* Add ControlNet v1.1 docs (#3226)

Add v1.1 docs

* Fix issue in maybe_convert_prompt (#3188)

When the token used for textual inversion does not have any special symbols (e.g. it is not surrounded by <>), the tokenizer does not properly split the replacement tokens.  Adding a space for the padding tokens fixes this.
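A hypothetical illustration of the multi-vector expansion this affects (the names and helper are made up; the point is that the extra tokens must be space-separated so the tokenizer splits them):

```python
def expand_multi_vector_token(prompt: str, token: str, num_vectors: int) -> str:
    replacement = " ".join([token] + [f"{token}_{i}" for i in range(1, num_vectors)])
    return prompt.replace(token, replacement)

print(expand_multi_vector_token("a photo of mytoken", "mytoken", 3))
# a photo of mytoken mytoken_1 mytoken_2
```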

* Sync cache version check from transformers (#3179)

sync cache version check from transformers

* Fix docs text inversion (#3166)

* Fix docs text inversion

* Apply suggestions from code review

* add model (#3230)

* add

* clean

* up

* clean up more

* fix more tests

* Improve docs further

* improve

* more fixes docs

* Improve docs more

* Update src/diffusers/models/unet_2d_condition.py

* fix

* up

* update doc links

* make fix-copies

* add safety checker and watermarker to stage 3 doc page code snippets

* speed optimizations docs

* memory optimization docs

* make style

* add watermarking snippets to doc string examples

* make style

* use pt_to_pil helper functions in doc strings

* skip mps tests

* Improve safety

* make style

* new logic

* fix

* fix bad onnx design

* make new stable diffusion upscale pipeline model arguments optional

* define has_nsfw_concept when non-pil output type

* lowercase linked to notebook name

---------

Co-authored-by: William Berman <WLBberman@gmail.com>

* Allow return pt x4 (#3236)

* Add all files

* update

* Allow fp16 attn for x4 upscaler (#3239)

* Add all files

* update

* Make sure vae is memory efficient for PT 1

* make style

* fix fast test (#3241)

* Adds a document on token merging (#3208)

* add document on token merging.

* fix headline.

* fix: headline.

* add some samples for comparison.

* [AudioLDM] Update docs to use updated ckpt (#3240)

* [AudioLDM] Update docs to use updated ckpt

* make style

* Release: v0.16.0

* Post release for 0.16.0 (#3244)

* Post release

* fix more

* [docs] only mention one stage (#3246)

* [docs] only mention one stage

* add blurb on auto accepting

---------

Co-authored-by: William Berman <WLBberman@gmail.com>

* Write model card in controlnet training script (#3229)

Write model card in controlnet training script.

* [2064]: Add stochastic sampler (sample_dpmpp_sde) (#3020)

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* Review comments

* [Review comment]: Add is_torchsde_available()

* [Review comment]: Test and docs

* [Review comment]

* [Review comment]

* [Review comment]

* [Review comment]

* [Review comment]

---------

Co-authored-by: njindal <njindal@adobe.com>

* [Stochastic Sampler][Slow Test]: Cuda test fixes (#3257)

[Slow Test]: Cuda test fixes

Co-authored-by: njindal <njindal@adobe.com>

* Remove required from tracker_project_name (#3260)

Remove required from tracker_project_name.

As observed by https://github.com/off99555 in https://github.com/huggingface/diffusers/issues/2695#issuecomment-1470755050, it already has a default value.

* adding required parameters while calling the get_up_block and get_down_block  (#3210)

* removed unnecessary parameters from get_up_block and get_down_block functions

* adding resnet_skip_time_act, resnet_out_scale_factor and cross_attention_norm to get_up_block and get_down_block functions

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [docs] Update interface in repaint.mdx (#3119)

Update repaint.mdx

accommodate #1701

* Update IF name to XL (#3262)

Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>

* fix typo in score sde pipeline (#3132)

* Fix typo in textual inversion JAX training script (#3123)

The pipeline is built as `pipe` but then used as `pipeline`.

* AudioDiffusionPipeline - fix encode method after config changes (#3114)

* config fixes

* deprecate get_input_dims

* Revert "Revert "[Community Pipelines] Update lpw_stable_diffusion pipeline"" (#3265)

Revert "Revert "[Community Pipelines] Update lpw_stable_diffusion pipeline" (#3201)"

This reverts commit 91a2a80eb2.

* Fix community pipelines (#3266)

* update notebook (#3259)

Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>

* [docs] add notes for stateful model changes (#3252)

* [docs] add notes for stateful model changes

* Update docs/source/en/optimization/fp16.mdx

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* link to accelerate docs for discarding hooks

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* [LoRA] quality of life improvements in the loading semantics and docs (#3180)

* 👽 qol improvements for LoRA.

* better function name?

* fix: LoRA weight loading with the new format.

* address Patrick's comments.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* change wording around encouraging the use of load_lora_weights().

* fix: function name.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [Community Pipelines] EDICT pipeline implementation (#3153)

* EDICT pipeline initial commit

- Starting point taking from https://github.com/Joqsan/edict-diffusion

* refactor __init__() method

* minor refactoring

* refactor scheduler code

- remove scheduler and move its methods to the EDICTPipeline class

* make CFG optional
- refactor encode_prompt().
- include optional generator for sampling with vae.
- minor variable renaming

* add EDICT pipeline description to README.md

* replace preprocess() with VaeImageProcessor

* run make style and make quality commands

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [Docs]zh translated docs update (#3245)

* zh translated docs update

* update _toctree

* Update logging.mdx (#2863)

Fix typos

* Add multiple conditions to StableDiffusionControlNetInpaintPipeline (#3125)

* try multi controlnet inpaint

* multi controlnet inpaint

* multi controlnet inpaint

* Let's make sure that dreambooth always uploads to the Hub (#3272)

* Update Dreambooth README

* Adapt all docs as well

* automatically write model card

* fix

* make style

* Diffedit Zero-Shot Inpainting Pipeline (#2837)

* Update Pix2PixZero Auto-correlation Loss

* Add Stable Diffusion DiffEdit pipeline

* Add draft documentation and import code

* Bugfixes and refactoring

* Add option to not decode latents in the inversion process

* Harmonize preprocessing

* Revert "Update Pix2PixZero Auto-correlation Loss"

This reverts commit b218062fed.

* Update annotations

* rename `compute_mask` to `generate_mask`

* Update documentation

* Update docs

* Update Docs

* Fix copy

* Change shape of output latents to batch first

* Update docs

* Add first draft for tests

* Bugfix and update tests

* Add `cross_attention_kwargs` support for all pipeline methods

* Fix Copies

* Add support for PIL image latents

Add support for mask broadcasting

Update docs and tests

Align `mask` argument to `mask_image`

Remove height and width arguments

* Enable MPS Tests

* Move example docstrings

* Fix test

* Fix test

* fix pipeline inheritance

* Harmonize `prepare_image_latents` with StableDiffusionPix2PixZeroPipeline

* Register modules set to `None` in config for `test_save_load_optional_components`

* Move fixed logic to specific test class

* Clean changes to other pipelines

* Update new tests to coordinate with #2953

* Update slow tests for better results

* Safety to avoid potential problems with torch.inference_mode

* Add reference in SD Pipeline Overview

* Fix tests again

* Enforce determinism in noise for generate_mask

* Fix copies

* Widen test tolerance for fp16 based on `test_stable_diffusion_upscale_pipeline_fp16`

* Add LoraLoaderMixin and update `prepare_image_latents`

* clean up repeat and reg

* bugfix

* Remove invalid args from docs

Suppress spurious warning by repeating image before latent to mask gen

* add constant learning rate with custom rule (#3133)

* add constant lr with rules

* add constant with rules in TYPE_TO_SCHEDULER_FUNCTION

* add constant lr rate with rule

* hotfix code quality

* fix doc style

* change name constant_with_rules to piecewise constant
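A small sketch of a piecewise-constant schedule of this kind (the rule-string format here is hypothetical, purely to show the idea):

```python
from torch.optim.lr_scheduler import LambdaLR

def get_piecewise_constant_schedule(optimizer, step_rules: str, last_epoch: int = -1):
    # e.g. "1:100,0.5:200,0.1": multiplier 1 until step 100, 0.5 until step 200, then 0.1
    *rules, final = step_rules.split(",")
    boundaries = [(int(r.split(":")[1]), float(r.split(":")[0])) for r in rules]

    def rule(step: int) -> float:
        for boundary, value in boundaries:
            if step < boundary:
                return value
        return float(final)

    return LambdaLR(optimizer, lr_lambda=rule, last_epoch=last_epoch)
```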

* Allow disabling torch 2_0 attention (#3273)

* Allow disabling torch 2_0 attention

* make style

* Update src/diffusers/models/attention.py

* [doc] add link to training script (#3271)

add link to training script

Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>

* temp disable spectogram diffusion tests (#3278)

The note-seq package throws an error on import because the default installed version of IPython
is not compatible with Python 3.8, which we run in the CI.
https://github.com/huggingface/diffusers/actions/runs/4830121056/jobs/8605954838#step:7:9

* Changed sample[0] to images[0] (#3304)

A pipeline object stores the results in `images` not in `sample`.
Current code blocks don't work.
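The corrected usage, as a minimal sketch (checkpoint id illustrative):

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

output = pipe("a photo of an astronaut riding a horse")
image = output.images[0]     # correct: results live in `.images`
# image = output.sample[0]   # wrong: pipeline outputs have no `sample` attribute
image.save("astronaut.png")
```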

* Typo in tutorial (#3295)

* Torch compile graph fix (#3286)

* fix more

* Fix more

* fix more

* Apply suggestions from code review

* fix

* make style

* make fix-copies

* fix

* make sure torch compile

* Clean

* fix test

* Postprocessing refactor img2img (#3268)

* refactor img2img VaeImageProcessor.postprocess

* remove copy from for init, run_safety_checker, decode_latents

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [Torch 2.0 compile] Fix more torch compile breaks (#3313)

* Fix more torch compile breaks

* add tests

* Fix all

* fix controlnet

* fix more

* Add Horace He as co-author.
Co-authored-by: Horace He <horacehe2007@yahoo.com>

* Add Horace He as co-author.

Co-authored-by: Horace He <horacehe2007@yahoo.com>

---------

Co-authored-by: Horace He <horacehe2007@yahoo.com>

* fix: scale_lr and sync example readme and docs. (#3299)

* fix: scale_lr and sync example readme and docs.

* fix doc link.

* Update stable_diffusion.mdx (#3310)

fixed import statement

* Fix missing variable assign in DeepFloyd-IF-II (#3315)

Fix missing variable assign

lol

* Correct doc build for patch releases (#3316)

Update build_documentation.yml

* Add Stable Diffusion RePaint to community pipelines (#3320)

* Add Stable Diffusion RePaint to community pipelines

- Adds Stable Diffusion RePaint to community pipelines
- Add README entry for pipeline

* Fix: Remove wrong import

- Remove wrong import
- Minor change in comments

* Fix: Code formatting of stable_diffusion_repaint

* Fix: ruff errors in stable_diffusion_repaint

* Fix multistep dpmsolver for cosine schedule (suitable for deepfloyd-if) (#3314)

* fix multistep dpmsolver for cosine schedule (deepfloyd-if)

* fix a typo

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update all dpmsolver (singlestep, multistep, dpm, dpm++) for cosine noise schedule

* add test, fix style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [docs] Improve LoRA docs (#3311)

* update docs

* add to toctree

* apply feedback

* Added input perturbation (#3292)

* Added input perturbation

* Fixed spelling

* Update write_own_pipeline.mdx (#3323)

* update controlling generation doc with latest goodies. (#3321)

* [Quality] Make style (#3341)

* Fix config dpm (#3343)

* Add the SDE variant of DPM-Solver and DPM-Solver++ (#3344)

* add SDE variant of DPM-Solver and DPM-Solver++

* add test

* fix typo

* fix typo

* Add upsample_size to AttnUpBlock2D, AttnDownBlock2D (#3275)

The argument `upsample_size` needs to be added to these modules to allow compatibility with other blocks that require this argument.

* Add UniDiffuser classes to __init__ files, modify transformer block to support pre- and post-LN, add fast default tests, fix some bugs.

* Update fast tests to use test checkpoints stored on the hub and to better match the reference UniDiffuser implementation.

* Fix code with make style.

* Revert "Fix code style with make style."

This reverts commit 10a174a12c.

* Add self.image_encoder, self.text_decoder to list of models to offload to CPU in the enable_sequential_cpu_offload(...)/enable_model_cpu_offload(...) methods to make test_cpu_offload_forward_pass pass.

* Fix code quality with make style.

* Support using a data type embedding for UniDiffuser-v1.

* Add fast test for checking UniDiffuser-v1 sampling.

* Make changes so that the repository consistency tests pass.

* Add UniDiffuser dummy objects via make fix-copies.

* Fix bugs and make improvements to the UniDiffuser pipeline:
	- Improve batch size inference and fix bugs when num_images_per_prompt or num_prompts_per_image > 1
	- Add tests for num_images_per_prompt, num_prompts_per_image > 1
	- Improve check_inputs, especially regarding checking supplied latents
	- Add reset_mode method so that mode inference can be re-enabled after mode is set manually
	- Fix some warnings related to accessing class members directly instead of through their config
	- Small amount of refactoring in pipeline_unidiffuser.py

* Fix code style with make style.

* Add/edit docstrings for added classes and public pipeline methods. Also do some light refactoring.

* Add documentation for UniDiffuser and fix some typos/formatting in docstrings.

* Fix code with make style.

* Refactor and improve the UniDiffuser convert_from_ckpt.py script.

* Move the UniDiffuser convert_from_ckpt.py script to diffusers/scripts/convert_unidiffuser_to_diffusers.py

* Fix code quality via make style.

* Improve UniDiffuser slow tests.

* make style

* Fix some typos in the UniDiffuser docs.

* Remove outdated logic based on transformers version in UniDiffuser pipeline __init__.py

* Remove dependency on einops by refactoring einops operations to pure torch operations.

* make style

* Add slow test on full checkpoint for joint mode and correct expected image slices/text prefixes.

* make style

* Fix mixed precision issue by wrapping the offending code with the torch.autocast context manager.

* Revert "Fix mixed precision issue by wrapping the offending code with the torch.autocast context manager."

This reverts commit 1a58958ab4.

* Add fast test for CUDA/fp16 model behavior (currently failing).

* Fix the mixed precision issue and add additional tests of the pipeline cuda/fp16 functionality.

* make style

* Use a CLIPVisionModelWithProjection instead of CLIPVisionModel for image_encoder to better match the original UniDiffuser implementation.

* Make style and remove some testing code.

* Fix shape errors for the 'joint' and 'img2text' modes.

* Fix tests and remove some testing code.

* Add option to use fixed latents for UniDiffuserPipelineSlowTests and fix issue in modeling_text_decoder.py.

* Improve UniDiffuser docs, particularly the usage examples, and improve slow tests with new expected outputs.

* make style

* Fix examples to load model in float16.

* In image-to-text mode, sample from the autoencoder moment distribution instead of always getting its mode.

* make style

* When encoding the image using the VAE, scale the image latents by the VAE's scaling factor.

* make style

* Clean up code and make slow tests pass.

* make fix-copies

* [docs] Fix docstring (#3334)

fix docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* if dreambooth lora (#3360)

* update IF stage I pipelines

add fixed variance schedulers and lora loading

* added kv lora attn processor

* allow loading into alternative lora attn processor

* make vae optional

* throw away predicted variance

* allow loading into added kv lora layer

* allow load T5

* allow pre compute text embeddings

* set new variance type in schedulers

* fix copies

* refactor all prompt embedding code

class prompts are now included in pre-encoding code
max tokenizer length is now configurable
embedding attention mask is now configurable

* fix for when variance type is not defined on scheduler

* do not pre compute validation prompt if not present

* add example test for if lora dreambooth

* add check for train text encoder and pre compute text embeddings

* Postprocessing refactor all others (#3337)

* add text2img

* fix-copies

* add

* add all other pipelines

* add

* add

* add

* add

* add

* make style

* style + fix copies

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>

* [docs] Improve safetensors docstring (#3368)

* clarify safetensor docstring

* fix typo

* apply feedback

* add: a warning message when using xformers in a PT 2.0 env. (#3365)

* add: a warning message when using xformers in a PT 2.0 env.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* StableDiffusionInpaintingPipeline - resize image w.r.t height and width (#3322)

* StableDiffusionInpaintingPipeline now resizes input images and masks w.r.t. the passed input height and width (the default is already 512). This addresses the common tensor mismatch error. Also moved the type check into the relevant function to keep the main pipeline body tidy.
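A minimal usage sketch of the resizing behaviour described above (checkpoint id, file names and sizes are illustrative):

```python
import torch
from PIL import Image
from diffusers import StableDiffusionInpaintPipeline

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = Image.open("photo.png")  # e.g. 768x513, not a multiple of 8
mask_image = Image.open("mask.png")   # white = repaint, black = keep

# both image and mask are resized to the requested height/width before encoding,
# so mismatched or non-multiple-of-8 inputs no longer trigger tensor-shape errors
result = pipe(
    "a white cat sitting on a bench",
    image=init_image,
    mask_image=mask_image,
    height=512,
    width=512,
).images[0]
```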

* Fixed StableDiffusionInpaintingPrepareMaskAndMaskedImageTests

Due to the previous commit, these tests were failing because height and width now need to be passed into the prepare_mask_and_masked_image function. I have updated the code and added a height/width variable per unit test, as that seemed more appropriate than the previous hard-coded solution.

* Added a resolution test to StableDiffusionInpaintPipelineSlowTests

this unit test simply takes the input and resizes it into something that would otherwise fail (e.g. throw a tensor mismatch error / not a multiple of 8), then passes it through the pipeline and verifies that it produces output with the correct dimensions w.r.t. the passed height and width

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* [docs] Adapt a model (#3326)

* first draft

* apply feedback

* conv_in.weight thrown away

* [docs] Load safetensors (#3333)

* safetensors

* apply feedback

* apply feedback

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* [Docs] Fix stable_diffusion.mdx typo (#3398)

Fix typo in last code block. Correct "prommpts" to "prompt"

* Support ControlNet v1.1 shuffle properly (#3340)

* add inferring_controlnet_cond_batch

* Revert "add inferring_controlnet_cond_batch"

This reverts commit abe8d6311d.

* set guess_mode to True
whenever global_pool_conditions is True

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
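A minimal sketch of that check (assuming a loaded `ControlNetModel`; the helper name is hypothetical):

```python
def resolve_guess_mode(controlnet, guess_mode: bool) -> bool:
    # shuffle-style ControlNet v1.1 checkpoints ship with global_pool_conditions=True,
    # and for those guess mode is forced on regardless of what the caller passed
    global_pool = getattr(controlnet.config, "global_pool_conditions", False)
    return guess_mode or global_pool
```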

* nit

* add integration test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [Tests] better determinism (#3374)

* enable deterministic pytorch and cuda operations.

* disable manual seeding.

* make style && make quality for unet_2d tests.

* enable determinism for the unet2dconditional model.

* add CUBLAS_WORKSPACE_CONFIG for better reproducibility.
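For reference, a minimal sketch of full-determinism settings of this kind (the workspace value is the one CUDA documents for deterministic cuBLAS):

```python
import os

# must be set before CUDA kernels are selected, ideally before importing torch in the test process
os.environ["CUBLAS_WORKSPACE_CONFIG"] = ":4096:8"

import torch

torch.use_deterministic_algorithms(True)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
```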

* relax tolerance (very weird issue, though).

* revert to torch manual_seed() where needed.

* relax more tolerance.

* better placement of the cuda variable and relax more tolerance.

* enable determinism for 3d condition model.

* relax tolerance.

* add: determinism to alt_diffusion.

* relax tolerance for alt diffusion.

* dance diffusion.

* dance diffusion is flaky.

* test_dict_tuple_outputs_equivalent edit.

* fix two more tests.

* fix more ddim tests.

* fix: argument.

* change to diff in place of difference.

* fix: test_save_load call.

* test_save_load_float16 call.

* fix: expected_max_diff

* fix: paint by example.

* relax tolerance.

* add determinism to 1d unet model.

* torch 2.0 regressions seem to be brutal

* determinism to vae.

* add reason to skipping.

* up tolerance.

* determinism to vq.

* determinism to cuda.

* determinism to the generic test pipeline file.

* refactor general pipelines testing a bit.

* determinism to alt diffusion i2i

* up tolerance for alt diff i2i and audio diff

* up tolerance.

* determinism to audioldm

* increase tolerance for audioldm lms.

* increase tolerance for paint by example.

* increase tolerance for repaint.

* determinism to cycle diffusion and sd 1.

* relax tol for cycle diffusion 🚲

* relax tol for sd 1.0

* relax tol for controlnet.

* determinism to img var.

* relax tol for img variation.

* tolerance to i2i sd

* make style

* determinism to inpaint.

* relax tolerance for inpainting.

* determinism for inpainting legacy

* relax tolerance.

* determinism to instruct pix2pix

* determinism to model editing.

* model editing tolerance.

* panorama determinism

* determinism to pix2pix zero.

* determinism to sag.

* sd 2. determinism

* sd. tolerance

* disallow tf32 matmul.

* relax tolerance is all you need.

* make style and determinism to sd 2 depth

* relax tolerance for depth.

* tolerance to diffedit.

* tolerance to sd 2 inpaint.

* up tolerance.

* determinism in upscaling.

* tolerance in upscaler.

* more tolerance relaxation.

* determinism to v pred.

* up tol for v_pred

* unclip determinism

* determinism to unclip img2img

* determinism to text to video.

* determinism to last set of tests

* up tol.

* vq cumsum doesn't have a deterministic kernel

* relax tol

* relax tol

* [docs] Add transformers to install (#3388)

add transformers to install

* [deepspeed] partial ZeRO-3 support (#3076)

* [deepspeed] partial ZeRO-3 support

* cleanup

* improve deepspeed fixes

* Improve

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add omegaconf for tests (#3400)

Add omegaconf

* Fix various bugs with LoRA Dreambooth and Dreambooth script (#3353)

* Improve checkpointing lora

* fix more

* Improve doc string

* Update src/diffusers/loaders.py

* make style

* Apply suggestions from code review

* Update src/diffusers/loaders.py

* Apply suggestions from code review

* Apply suggestions from code review

* better

* Fix all

* Fix multi-GPU dreambooth

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix all

* make style

* make style

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix docker file (#3402)

* up

* up

* fix: deepspeed_plugin retrieval from accelerate state (#3410)

* [Docs] Add `sigmoid` beta_scheduler to docstrings of relevant Schedulers (#3399)

* Add `sigmoid` beta scheduler to `DDPMScheduler` docstring

* Add `sigmoid` beta scheduler to `RePaintScheduler` docstring

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Don't install accelerate and transformers from source (#3415)

* Don't install transformers and accelerate from source (#3414)

* Improve fast tests (#3416)

Update pr_tests.yml

* attention refactor: the trilogy  (#3387)

* Replace `AttentionBlock` with `Attention`

* use _from_deprecated_attn_block check re: @patrickvonplaten

* [Docs] update the PT 2.0 optimization doc with latest findings (#3370)

* add: benchmarking stats for A100 and V100.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* address patrick's comments.

* add: rtx 4090 stats

* ⚔ benchmark reports done

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* 3313 pr link.

* add: plots.

Co-authored-by: Pedro <pedro@huggingface.co>

* fix formatting

* update number percent.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix style rendering (#3433)

* Fix style rendering.

* Fix typo

* unCLIP scheduler do not use note (#3417)

* Replace deprecated command with environment file (#3409)

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix warning message pipeline loading (#3446)

* add stable diffusion tensorrt img2img pipeline (#3419)

* add stable diffusion tensorrt img2img pipeline

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* update docstrings

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* Refactor controlnet and add img2img and inpaint (#3386)

* refactor controlnet and add img2img and inpaint

* First draft to get pipelines to work

* make style

* Fix more

* Fix more

* More tests

* Fix more

* Make inpainting work

* make style and more tests

* Apply suggestions from code review

* up

* make style

* Fix imports

* Fix more

* Fix more

* Improve examples

* add test

* Make sure import is correctly deprecated

* Make sure everything works in compile mode

* make sure authorship is correctly attributed

* [Scheduler] DPM-Solver (++) Inverse Scheduler (#3335)

* Add DPM-Solver Multistep Inverse Scheduler

* Add draft tests for DiffEdit

* Add inverse sde-dpmsolver steps to tune image diversity from inverted latents

* Fix tests

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [Docs] Fix incomplete docstring for resnet.py (#3438)

Fix incomplete docstrings for resnet.py

* fix tiled vae blend extent range (#3384)

fix tiled vae blend extent range

* Small update to "Next steps" section (#3443)

Small update to "Next steps" section:

- PyTorch 2 is recommended.
- Updated improvement figures.

* Allow arbitrary aspect ratio in IFSuperResolutionPipeline (#3298)

* Update pipeline_if_superresolution.py

Allow arbitrary aspect ratio in IFSuperResolutionPipeline by using the input image shape

* IFSuperResolutionPipeline: allow the user to override the height and width through the arguments

* update IFSuperResolutionPipeline width/height doc string to match StableDiffusionInpaintPipeline conventions

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Adding 'strength' parameter to StableDiffusionInpaintingPipeline  (#3424)

* Added explanation of 'strength' parameter

* Added get_timesteps function which relies on new strength parameter

* Added `strength` parameter which defaults to 1.

* Swapped ordering so `noise_timestep` can be calculated before masking the image

this is required when you aren't applying 100% noise to the masked region, e.g. strength < 1.

* Added strength to check_inputs, throws error if out of range

* Changed `prepare_latents` to initialise latents w.r.t strength

inspired from the stable diffusion img2img pipeline, init latents are initialised by converting the init image into a VAE latent and adding noise (based upon the strength parameter passed in), e.g. random when strength = 1, or the init image at strength = 0.
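A rough sketch of that img2img-style logic (function names and details differ from the actual pipeline code):

```python
import torch

def get_timesteps(scheduler, num_inference_steps: int, strength: float):
    # skip the first (1 - strength) fraction of the schedule
    init_timestep = min(int(num_inference_steps * strength), num_inference_steps)
    t_start = max(num_inference_steps - init_timestep, 0)
    return scheduler.timesteps[t_start:], num_inference_steps - t_start

def prepare_latents(image_latents, noise, scheduler, timesteps, strength: float):
    if strength >= 1.0:
        # start from pure noise, scaled by the scheduler's initial sigma
        return noise * scheduler.init_noise_sigma
    # otherwise start from the encoded init image with partial noise added
    return scheduler.add_noise(image_latents, noise, timesteps[:1])
```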

* WIP: Added a unit test for the new strength parameter in the StableDiffusionInpaintingPipeline

still need to add correct regression values

* Created a is_strength_max to initialise from pure random noise

* Updated unit tests w.r.t new strength parameter + fixed new strength unit test

* renamed parameter to avoid confusion with variable of same name

* Updated regression values for new strength test - now passes

* removed 'copied from' comment as this method is now different and divergent from the copy

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Ensure backwards compatibility for prepare_mask_and_masked_image

created a return_image boolean and initialised to false

* Ensure backwards compatibility for prepare_latents

* Fixed copy check typo

* Fixes w.r.t. backward compatibility changes

* make style

* keep function argument ordering same for backwards compatibility in callees with copied from statements

* make fix-copies

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>

* [WIP] Bugfix - Pipeline.from_pretrained is broken when the pipeline is partially downloaded (#3448)

Added bugfix using f-strings.

* Fix gradient checkpointing bugs in freezing part of models (requires_grad=False) (#3404)

* gradient checkpointing bug fix

* bug fix; changes for reviews

* reformat

* reformat

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Make dreambooth lora more robust to orig unet (#3462)

* Make dreambooth lora more robust to orig unet

* up

* Reduce peak VRAM by releasing large attention tensors (as soon as they're unnecessary) (#3463)

Release large tensors in attention (as soon as they're no longer required). Reduces peak VRAM by nearly 2 GB for 1024x1024 (even after slicing), and the savings scale up with image size.
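As a rough illustration of the pattern (not the library's actual attention code), the large intermediates are dropped as soon as the next product is formed:

```python
import torch

def attention(query, key, value, scale):
    # scores is the big [batch*heads, q_len, k_len] tensor that dominates peak memory
    scores = torch.baddbmm(
        torch.empty(query.shape[0], query.shape[1], key.shape[1], device=query.device, dtype=query.dtype),
        query,
        key.transpose(-1, -2),
        beta=0,
        alpha=scale,
    )
    probs = scores.softmax(dim=-1)
    del scores                       # release the pre-softmax scores immediately
    hidden_states = torch.bmm(probs, value)
    del probs                        # and the attention probabilities once they are consumed
    return hidden_states
```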

* Add min snr to text2img lora training script (#3459)

add min snr to text2img lora training script

* Add inpaint lora scale support (#3460)

* add inpaint lora scale support

* add inpaint lora scale test

---------

Co-authored-by: yueyang.hyy <yueyang.hyy@alibaba-inc.com>

* [From ckpt] Fix from_ckpt (#3466)

* Correct from_ckpt

* make style

* Update full dreambooth script to work with IF (#3425)

* Add IF dreambooth docs (#3470)

* parameterize pass single args through tuple (#3477)

* attend and excite tests disable determinism on the class level (#3478)

* dreambooth docs torch.compile note (#3471)

* dreambooth docs torch.compile note

* Update examples/dreambooth/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add: if entry in the dreambooth training docs. (#3472)

* [docs] Textual inversion inference (#3473)

* add textual inversion inference to docs

* add to toctree

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [docs] Distributed inference (#3376)

* distributed inference

* move to inference section

* apply feedback

* update with split_between_processes

* apply feedback

* [{Up,Down}sample1d] explicit view kernel size as number elements in flattened indices (#3479)

explicit view kernel size as number elements in flattened indices

* mps & onnx tests rework (#3449)

* Remove ONNX tests from PR.

They are already a part of push_tests.yml.

* Remove mps tests from PRs.

They are already performed on push.

* Fix workflow name for fast push tests.

* Extract mps tests to a workflow.

For better control/filtering.

* Remove --extra-index-url from mps tests

* Increase tolerance of mps test

This test passes in my Mac (Ventura 13.3) but fails in the CI hardware
(Ventura 13.2). I ran the local tests following the same steps that
exist in the CI workflow.

* Temporarily run mps tests on pr

So we can test.

* Revert "Temporarily run mps tests on pr"

Tests passed, go back to running on push.

* [Attention processor] Better warning message when shifting to `AttnProcessor2_0` (#3457)

* add: debugging when enabling memory-efficient processing

* add: better warning message.

* [Docs] add note on local directory path. (#3397)

add note on local directory path.

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Refactor full determinism (#3485)

* up

* fix more

* Apply suggestions from code review

* fix more

* fix more

* Check it

* Remove 16:8

* fix more

* fix more

* fix more

* up

* up

* Test only stable diffusion

* Test only two files

* up

* Try out spinning up processes that can be killed

* up

* Apply suggestions from code review

* up

* up

* Fix DPM single (#3413)

* Fix DPM single

* add test

* fix one more bug

* Apply suggestions from code review

Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>

---------

Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>

* Add `use_Karras_sigmas` to DPMSolverSinglestepScheduler (#3476)

* add use_karras_sigmas

* add karras test

* add doc
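For illustration, the usual way to opt into the Karras sigma schedule with this scheduler (checkpoint id illustrative):

```python
from diffusers import DiffusionPipeline, DPMSolverSinglestepScheduler

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.scheduler = DPMSolverSinglestepScheduler.from_config(
    pipe.scheduler.config, use_karras_sigmas=True
)

image = pipe("a watercolor painting of a fox", num_inference_steps=20).images[0]
```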

* Adds local_files_only bool to prevent forced online connection (#3486)

* make style

* [Docs] Korean translation (optimization, training) (#3488)

* feat) optimization kr translation

* fix) typo, italic setting

* feat) dreambooth, text2image kr

* feat) lora kr

* fix) LoRA

* fix) fp16 fix

* fix) doc-builder style

* fix) fp16: revised some wording

* fix) fp16 style fix

* fix) opt, training docs update

* feat) toctree update

* feat) toctree update

---------

Co-authored-by: Chanran Kim <seriousran@gmail.com>

* DataLoader respecting EXIF data in Training Images (#3465)

* DataLoader will now bake in any transforms or image manipulations contained in the EXIF

Images may have rotations stored in their EXIF metadata. Training on such images causes those transforms to be ignored, producing unexpected results.
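A minimal sketch of baking the EXIF orientation in before any other transform (roughly what the fix amounts to):

```python
from PIL import Image, ImageOps

def load_training_image(path: str) -> Image.Image:
    image = Image.open(path)
    # apply any rotation/flip stored in the EXIF metadata so the pixels the model
    # trains on match what the user sees in an image viewer
    image = ImageOps.exif_transpose(image)
    return image.convert("RGB")
```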

* Fixed the Dataloading EXIF issue in main DreamBooth training as well

* Run make style (black & isort)

* make style

* feat: allow disk offload for diffuser models (#3285)

* allow disk offload for diffuser models

* sort import

* add max_memory argument

* Changed sample[0] to images[0] (#3304)

A pipeline object stores the results in `images` not in `sample`.
Current code blocks don't work.

* Typo in tutorial (#3295)

* Torch compile graph fix (#3286)

* fix more

* Fix more

* fix more

* Apply suggestions from code review

* fix

* make style

* make fix-copies

* fix

* make sure torch compile

* Clean

* fix test

* Postprocessing refactor img2img (#3268)

* refactor img2img VaeImageProcessor.postprocess

* remove copy from for init, run_safety_checker, decode_latents

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [Torch 2.0 compile] Fix more torch compile breaks (#3313)

* Fix more torch compile breaks

* add tests

* Fix all

* fix controlnet

* fix more

* Add Horace He as co-author.
Co-authored-by: Horace He <horacehe2007@yahoo.com>

* Add Horace He as co-author.

Co-authored-by: Horace He <horacehe2007@yahoo.com>

---------

Co-authored-by: Horace He <horacehe2007@yahoo.com>

* fix: scale_lr and sync example readme and docs. (#3299)

* fix: scale_lr and sync example readme and docs.

* fix doc link.

* Update stable_diffusion.mdx (#3310)

fixed import statement

* Fix missing variable assign in DeepFloyd-IF-II (#3315)

Fix missing variable assign

lol

* Correct doc build for patch releases (#3316)

Update build_documentation.yml

* Add Stable Diffusion RePaint to community pipelines (#3320)

* Add Stable Diffusion RePaint to community pipelines

- Adds Stable Diffusion RePaint to community pipelines
- Add README entry for pipeline

* Fix: Remove wrong import

- Remove wrong import
- Minor change in comments

* Fix: Code formatting of stable_diffusion_repaint

* Fix: ruff errors in stable_diffusion_repaint

* Fix multistep dpmsolver for cosine schedule (suitable for deepfloyd-if) (#3314)

* fix multistep dpmsolver for cosine schedule (deepfloyd-if)

* fix a typo

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update all dpmsolver (singlestep, multistep, dpm, dpm++) for cosine noise schedule

* add test, fix style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [docs] Improve LoRA docs (#3311)

* update docs

* add to toctree

* apply feedback

* Added input perturbation (#3292)

* Added input perturbation

* Fixed spelling

* Update write_own_pipeline.mdx (#3323)

* update controlling generation doc with latest goodies. (#3321)

* [Quality] Make style (#3341)

* Fix config dpm (#3343)

* Add the SDE variant of DPM-Solver and DPM-Solver++ (#3344)

* add SDE variant of DPM-Solver and DPM-Solver++

* add test

* fix typo

* fix typo

* Add upsample_size to AttnUpBlock2D, AttnDownBlock2D (#3275)

The argument `upsample_size` needs to be added to these modules to allow compatibility with other blocks that require this argument.

* Rename --only_save_embeds to --save_as_full_pipeline (#3206)

* Set --only_save_embeds to False by default

Due to how the option is named, it makes more sense to behave like this.

* Refactor only_save_embeds to save_as_full_pipeline

* [AudioLDM] Generalise conversion script (#3328)

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix TypeError when using prompt_embeds and negative_prompt (#2982)

* test: Added test case

* fix: fixed type checking issue on _encode_prompt

* fix: fixed copies consistency

* fix: one copy was not sufficient

* Fix pipeline class on README (#3345)

Update README.md

* Inpainting: typo in docs (#3331)

Typo in docs

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add `use_Karras_sigmas` to LMSDiscreteScheduler (#3351)

* add karras sigma to lms discrete scheduler

* add test for lms_scheduler karras

* reformat test lms

* Batched load of textual inversions (#3277)

* Batched load of textual inversions

- Only call resize_token_embeddings once per batch as it is the most expensive operation
- Allow pretrained_model_name_or_path and token to be an optional list
- Remove Dict from type annotation pretrained_model_name_or_path as it was not supported in this function
- Add comment that single files (e.g. .pt/.safetensors) are supported
- Add comment for token parameter
- Convert token override log message from warning to info
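A usage sketch of the list-style arguments described above (repo ids, file names and tokens are illustrative):

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# passing lists loads several embeddings while resizing the token embeddings only once
pipe.load_textual_inversion(
    ["sd-concepts-library/cat-toy", "./my_embedding.safetensors"],
    token=["<cat-toy>", "<my-style>"],
)
```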

* Update src/diffusers/loaders.py

Check for duplicate tokens

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update condition for None tokens

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make fix-copies

* [docs] Fix docstring (#3334)

fix docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* if dreambooth lora (#3360)

* update IF stage I pipelines

add fixed variance schedulers and lora loading

* added kv lora attn processor

* allow loading into alternative lora attn processor

* make vae optional

* throw away predicted variance

* allow loading into added kv lora layer

* allow load T5

* allow pre compute text embeddings

* set new variance type in schedulers

* fix copies

* refactor all prompt embedding code

class prompts are now included in pre-encoding code
max tokenizer length is now configurable
embedding attention mask is now configurable

* fix for when variance type is not defined on scheduler

* do not pre compute validation prompt if not present

* add example test for if lora dreambooth

* add check for train text encoder and pre compute text embeddings

* Postprocessing refactor all others (#3337)

* add text2img

* fix-copies

* add

* add all other pipelines

* add

* add

* add

* add

* add

* make style

* style + fix copies

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>

* [docs] Improve safetensors docstring (#3368)

* clarify safetensor docstring

* fix typo

* apply feedback

* add: a warning message when using xformers in a PT 2.0 env. (#3365)

* add: a warning message when using xformers in a PT 2.0 env.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* StableDiffusionInpaintingPipeline - resize image w.r.t height and width (#3322)

* StableDiffusionInpaintingPipeline now resizes input images and masks w.r.t. the passed input height and width (the default is already 512). This addresses the common tensor mismatch error. Also moved the type check into the relevant function to keep the main pipeline body tidy.

* Fixed StableDiffusionInpaintingPrepareMaskAndMaskedImageTests

Due to the previous commit, these tests were failing because height and width now need to be passed into the prepare_mask_and_masked_image function. I have updated the code and added a height/width variable per unit test, as that seemed more appropriate than the previous hard-coded solution.

* Added a resolution test to StableDiffusionInpaintPipelineSlowTests

this unit test simply takes the input and resizes it into something that would otherwise fail (e.g. throw a tensor mismatch error / not a multiple of 8), then passes it through the pipeline and verifies that it produces output with the correct dimensions w.r.t. the passed height and width

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* [docs] Adapt a model (#3326)

* first draft

* apply feedback

* conv_in.weight thrown away

* [docs] Load safetensors (#3333)

* safetensors

* apply feedback

* apply feedback

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* [Docs] Fix stable_diffusion.mdx typo (#3398)

Fix typo in last code block. Correct "prommpts" to "prompt"

* Support ControlNet v1.1 shuffle properly (#3340)

* add inferring_controlnet_cond_batch

* Revert "add inferring_controlnet_cond_batch"

This reverts commit abe8d6311d.

* set guess_mode to True
whenever global_pool_conditions is True

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* nit

* add integration test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [Tests] better determinism (#3374)

* enable deterministic pytorch and cuda operations.

* disable manual seeding.

* make style && make quality for unet_2d tests.

* enable determinism for the unet2dconditional model.

* add CUBLAS_WORKSPACE_CONFIG for better reproducibility.

* relax tolerance (very weird issue, though).

* revert to torch manual_seed() where needed.

* relax more tolerance.

* better placement of the cuda variable and relax more tolerance.

* enable determinism for 3d condition model.

* relax tolerance.

* add: determinism to alt_diffusion.

* relax tolerance for alt diffusion.

* dance diffusion.

* dance diffusion is flaky.

* test_dict_tuple_outputs_equivalent edit.

* fix two more tests.

* fix more ddim tests.

* fix: argument.

* change to diff in place of difference.

* fix: test_save_load call.

* test_save_load_float16 call.

* fix: expected_max_diff

* fix: paint by example.

* relax tolerance.

* add determinism to 1d unet model.

* torch 2.0 regressions seem to be brutal

* determinism to vae.

* add reason to skipping.

* up tolerance.

* determinism to vq.

* determinism to cuda.

* determinism to the generic test pipeline file.

* refactor general pipelines testing a bit.

* determinism to alt diffusion i2i

* up tolerance for alt diff i2i and audio diff

* up tolerance.

* determinism to audioldm

* increase tolerance for audioldm lms.

* increase tolerance for paint by example.

* increase tolerance for repaint.

* determinism to cycle diffusion and sd 1.

* relax tol for cycle diffusion 🚲

* relax tol for sd 1.0

* relax tol for controlnet.

* determinism to img var.

* relax tol for img variation.

* tolerance to i2i sd

* make style

* determinism to inpaint.

* relax tolerance for inpainting.

* determinism for inpainting legacy

* relax tolerance.

* determinism to instruct pix2pix

* determinism to model editing.

* model editing tolerance.

* panorama determinism

* determinism to pix2pix zero.

* determinism to sag.

* sd 2. determinism

* sd. tolerance

* disallow tf32 matmul.

* relax tolerance is all you need.

* make style and determinism to sd 2 depth

* relax tolerance for depth.

* tolerance to diffedit.

* tolerance to sd 2 inpaint.

* up tolerance.

* determinism in upscaling.

* tolerance in upscaler.

* more tolerance relaxation.

* determinism to v pred.

* up tol for v_pred

* unclip determinism

* determinism to unclip img2img

* determinism to text to video.

* determinism to last set of tests

* up tol.

* vq cumsum doesn't have a deterministic kernel

* relax tol

* relax tol

* [docs] Add transformers to install (#3388)

add transformers to install

* [deepspeed] partial ZeRO-3 support (#3076)

* [deepspeed] partial ZeRO-3 support

* cleanup

* improve deepspeed fixes

* Improve

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add omegaconf for tests (#3400)

Add omegaconf

* Fix various bugs with LoRA Dreambooth and Dreambooth script (#3353)

* Improve checkpointing lora

* fix more

* Improve doc string

* Update src/diffusers/loaders.py

* make style

* Apply suggestions from code review

* Update src/diffusers/loaders.py

* Apply suggestions from code review

* Apply suggestions from code review

* better

* Fix all

* Fix multi-GPU dreambooth

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix all

* make style

* make style

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix docker file (#3402)

* up

* up

* fix: deepspeed_plugin retrieval from accelerate state (#3410)

* [Docs] Add `sigmoid` beta_scheduler to docstrings of relevant Schedulers (#3399)

* Add `sigmoid` beta scheduler to `DDPMScheduler` docstring

* Add `sigmoid` beta scheduler to `RePaintScheduler` docstring

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Don't install accelerate and transformers from source (#3415)

* Don't install transformers and accelerate from source (#3414)

* Improve fast tests (#3416)

Update pr_tests.yml

* attention refactor: the trilogy  (#3387)

* Replace `AttentionBlock` with `Attention`

* use _from_deprecated_attn_block check re: @patrickvonplaten

* [Docs] update the PT 2.0 optimization doc with latest findings (#3370)

* add: benchmarking stats for A100 and V100.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* address patrick's comments.

* add: rtx 4090 stats

* ⚔ benchmark reports done

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* 3313 pr link.

* add: plots.

Co-authored-by: Pedro <pedro@huggingface.co>

* fix formatting

* update number percent.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix style rendering (#3433)

* Fix style rendering.

* Fix typo

* unCLIP scheduler do not use note (#3417)

* Replace deprecated command with environment file (#3409)

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix warning message pipeline loading (#3446)

* add stable diffusion tensorrt img2img pipeline (#3419)

* add stable diffusion tensorrt img2img pipeline

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* update docstrings

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* Refactor controlnet and add img2img and inpaint (#3386)

* refactor controlnet and add img2img and inpaint

* First draft to get pipelines to work

* make style

* Fix more

* Fix more

* More tests

* Fix more

* Make inpainting work

* make style and more tests

* Apply suggestions from code review

* up

* make style

* Fix imports

* Fix more

* Fix more

* Improve examples

* add test

* Make sure import is correctly deprecated

* Make sure everything works in compile mode

* make sure authorship is correctly attributed

* [Scheduler] DPM-Solver (++) Inverse Scheduler (#3335)

* Add DPM-Solver Multistep Inverse Scheduler

* Add draft tests for DiffEdit

* Add inverse sde-dpmsolver steps to tune image diversity from inverted latents

* Fix tests

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [Docs] Fix incomplete docstring for resnet.py (#3438)

Fix incomplete docstrings for resnet.py

* fix tiled vae blend extent range (#3384)

fix tiled vae blend extent range

* Small update to "Next steps" section (#3443)

Small update to "Next steps" section:

- PyTorch 2 is recommended.
- Updated improvement figures.

* Allow arbitrary aspect ratio in IFSuperResolutionPipeline (#3298)

* Update pipeline_if_superresolution.py

Allow arbitrary aspect ratio in IFSuperResolutionPipeline by using the input image shape

* IFSuperResolutionPipeline: allow the user to override the height and width through the arguments

* update IFSuperResolutionPipeline width/height doc string to match StableDiffusionInpaintPipeline conventions

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Adding 'strength' parameter to StableDiffusionInpaintingPipeline  (#3424)

* Added explanation of 'strength' parameter

* Added get_timesteps function which relies on new strength parameter

* Added `strength` parameter which defaults to 1.

* Swapped ordering so `noise_timestep` can be calculated before masking the image

this is required when you aren't applying 100% noise to the masked region, e.g. strength < 1.

* Added strength to check_inputs, throws error if out of range

* Changed `prepare_latents` to initialise latents w.r.t strength

Inspired by the stable diffusion img2img pipeline: init latents are initialised by converting the init image into a VAE latent and adding noise (based on the strength parameter passed in), e.g. random when strength = 1, or the init image at strength = 0.
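
A minimal sketch of that initialisation, with illustrative names (`vae`, `scheduler`, `init_image`, `timesteps` stand in for the pipeline's own variables and are assumptions, not the actual code):

```python
import torch

def prepare_latents_with_strength(vae, scheduler, init_image, timesteps, strength, generator=None):
    # Hypothetical helper mirroring the behaviour described above.
    image_latents = vae.encode(init_image).latent_dist.sample(generator)
    image_latents = image_latents * vae.config.scaling_factor

    noise = torch.randn_like(image_latents)
    if strength >= 1.0:
        # strength == 1: ignore the init image entirely and start from pure noise.
        return noise
    # strength < 1: noise the image latents only up to the first timestep of the
    # truncated schedule derived from the strength parameter.
    return scheduler.add_noise(image_latents, noise, timesteps[:1])
```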

* WIP: Added a unit test for the new strength parameter in the StableDiffusionInpaintingPipeline

still need to add correct regression values

* Created a is_strength_max to initialise from pure random noise

* Updated unit tests w.r.t new strength parameter + fixed new strength unit test

* renamed parameter to avoid confusion with variable of same name

* Updated regression values for new strength test - now passes

* removed 'copied from' comment as this method is now different and divergent from the copy

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Ensure backwards compatibility for prepare_mask_and_masked_image

created a return_image boolean and initialised to false

* Ensure backwards compatibility for prepare_latents

* Fixed copy check typo

* Fixes w.r.t. backward compatibility changes

* make style

* keep function argument ordering same for backwards compatibility in callees with copied from statements

* make fix-copies

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>

* [WIP] Bugfix - Pipeline.from_pretrained is broken when the pipeline is partially downloaded (#3448)

Added bugfix using f strings.

* Fix gradient checkpointing bugs in freezing part of models (requires_grad=False) (#3404)

* gradient checkpointing bug fix

* bug fix; changes for reviews

* reformat

* reformat

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Make dreambooth lora more robust to orig unet (#3462)

* Make dreambooth lora more robust to orig unet

* up

* Reduce peak VRAM by releasing large attention tensors (as soon as they're unnecessary) (#3463)

Release large tensors in attention (as soon as they're no longer required). Reduces peak VRAM by nearly 2 GB for 1024x1024 (even after slicing), and the savings scale up with image size.
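
A hedged sketch of the pattern (generic attention math, not the repository's actual processor code): each large intermediate is deleted as soon as the next step no longer needs it, so the allocator can reuse the memory of the `[batch*heads, query, key]` tensors.

```python
import torch

def attention_with_early_frees(query, key, value, scale):
    # query: [B*heads, Q, D], key/value: [B*heads, K, D]
    scores = torch.baddbmm(
        torch.empty(query.shape[0], query.shape[1], key.shape[1], dtype=query.dtype, device=query.device),
        query,
        key.transpose(-1, -2),
        beta=0,
        alpha=scale,
    )
    probs = scores.softmax(dim=-1)
    del scores  # the pre-softmax score tensor is the peak memory consumer
    out = torch.bmm(probs, value)
    del probs   # free the attention probabilities before any further projections
    return out
```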

* Add min snr to text2img lora training script (#3459)

add min snr to text2img lora training script

* Add inpaint lora scale support (#3460)

* add inpaint lora scale support

* add inpaint lora scale test

---------

Co-authored-by: yueyang.hyy <yueyang.hyy@alibaba-inc.com>

* [From ckpt] Fix from_ckpt (#3466)

* Correct from_ckpt

* make style

* Update full dreambooth script to work with IF (#3425)

* Add IF dreambooth docs (#3470)

* parameterize pass single args through tuple (#3477)

* attend and excite tests disable determinism on the class level (#3478)

* dreambooth docs torch.compile note (#3471)

* dreambooth docs torch.compile note

* Update examples/dreambooth/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add: if entry in the dreambooth training docs. (#3472)

* [docs] Textual inversion inference (#3473)

* add textual inversion inference to docs

* add to toctree

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [docs] Distributed inference (#3376)

* distributed inference

* move to inference section

* apply feedback

* update with split_between_processes

* apply feedback

* [{Up,Down}sample1d] explicit view kernel size as number elements in flattened indices (#3479)

explicit view kernel size as number elements in flattened indices

* mps & onnx tests rework (#3449)

* Remove ONNX tests from PR.

They are already a part of push_tests.yml.

* Remove mps tests from PRs.

They are already performed on push.

* Fix workflow name for fast push tests.

* Extract mps tests to a workflow.

For better control/filtering.

* Remove --extra-index-url from mps tests

* Increase tolerance of mps test

This test passes on my Mac (Ventura 13.3) but fails on the CI hardware
(Ventura 13.2). I ran the local tests following the same steps that
exist in the CI workflow.

* Temporarily run mps tests on pr

So we can test.

* Revert "Temporarily run mps tests on pr"

Tests passed, go back to running on push.

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
Co-authored-by: Ilia Larchenko <41329713+IliaLarchenko@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Horace He <horacehe2007@yahoo.com>
Co-authored-by: Umar <55330742+mu94-csl@users.noreply.github.com>
Co-authored-by: Mylo <36931363+gitmylo@users.noreply.github.com>
Co-authored-by: Markus Pobitzer <markuspobitzer@gmail.com>
Co-authored-by: Cheng Lu <lucheng.lc15@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Isamu Isozaki <isamu.website@gmail.com>
Co-authored-by: Cesar Aybar <csaybar@gmail.com>
Co-authored-by: Will Rice <will@spokestack.io>
Co-authored-by: Adrià Arrufat <1671644+arrufat@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: At-sushi <dkahw210@kyoto.zaq.ne.jp>
Co-authored-by: Lucca Zenóbio <luccazen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Isotr0py <41363108+Isotr0py@users.noreply.github.com>
Co-authored-by: pdoane <pdoane2@gmail.com>
Co-authored-by: Will Berman <wlbberman@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Rupert Menneer <71332436+rupertmenneer@users.noreply.github.com>
Co-authored-by: sudowind <wfpkueecs@163.com>
Co-authored-by: Takuma Mori <takuma104@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Laureηt <laurentfainsin@protonmail.com>
Co-authored-by: Jongwoo Han <jongwooo.han@gmail.com>
Co-authored-by: asfiyab-nvidia <117682710+asfiyab-nvidia@users.noreply.github.com>
Co-authored-by: clarencechen <clarencechenct@gmail.com>
Co-authored-by: Laureηt <laurent@fainsin.bzh>
Co-authored-by: superlabs-dev <133080491+superlabs-dev@users.noreply.github.com>
Co-authored-by: Dev Aggarwal <devxpy@gmail.com>
Co-authored-by: Vimarsh Chaturvedi <vimarsh.c@gmail.com>
Co-authored-by: 7eu7d7 <31194890+7eu7d7@users.noreply.github.com>
Co-authored-by: cmdr2 <shashank.shekhar.global@gmail.com>
Co-authored-by: wfng92 <43742196+wfng92@users.noreply.github.com>
Co-authored-by: Glaceon-Hyy <ffheyy0017@gmail.com>
Co-authored-by: yueyang.hyy <yueyang.hyy@alibaba-inc.com>

* [Community] reference only control (#3435)

* add reference only control

* add reference only control

* add reference only control

* fix lint

* fix lint

* reference adain

* bugfix EulerAncestralDiscreteScheduler

* fix style fidelity rule

* fix default output size

* del unused line

* fix deterministic

* Support for cross-attention bias / mask (#2634)

* Cross-attention masks

prefer qualified symbol, fix accidental Optional

prefer qualified symbol in AttentionProcessor

prefer qualified symbol in embeddings.py

qualified symbol in transformed_2d

qualify FloatTensor in unet_2d_blocks

move new transformer_2d params attention_mask, encoder_attention_mask to the end of the section which is assumed (e.g. by functions such as checkpoint()) to have a stable positional param interface. regard return_dict as a special-case which is assumed to be injected separately from positional params (e.g. by create_custom_forward()).

move new encoder_attention_mask param to end of CrossAttn block interfaces and Unet2DCondition interface, to maintain positional param interface.
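
For context, the stable-positional-interface constraint comes from how gradient checkpointing is usually wired up; a sketch of that pattern under assumed names (`block`, `hidden_states`, etc. are illustrative, not the repository's exact code):

```python
import torch

def create_custom_forward(module, return_dict=None):
    # checkpoint() forwards its extra arguments positionally, so keyword-only
    # extras such as return_dict are captured in the closure instead.
    def custom_forward(*inputs):
        if return_dict is not None:
            return module(*inputs, return_dict=return_dict)
        return module(*inputs)
    return custom_forward

def checkpointed_block_forward(block, hidden_states, encoder_hidden_states, attention_mask, encoder_attention_mask):
    # New optional parameters are appended at the end so existing positional calls keep working.
    return torch.utils.checkpoint.checkpoint(
        create_custom_forward(block),
        hidden_states,
        encoder_hidden_states,
        attention_mask,
        encoder_attention_mask,  # newly added argument goes last
    )
```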

regenerate modeling_text_unet.py

remove unused import

unet_2d_condition encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

versatile_diffusion/modeling_text_unet.py encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

transformer_2d encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

unet_2d_blocks.py: add parameter name comments

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

revert description. bool-to-bias treatment happens in unet_2d_condition only.

comment parameter names

fix copies, style

* encoder_attention_mask for SimpleCrossAttnDownBlock2D, SimpleCrossAttnUpBlock2D

* encoder_attention_mask for UNetMidBlock2DSimpleCrossAttn

* support attention_mask, encoder_attention_mask in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D, KAttentionBlock. fix binding of attention_mask, cross_attention_kwargs params in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D checkpoint invocations.

* fix mistake made during merge conflict resolution

* regenerate versatile_diffusion

* pass time embedding into checkpointed attention invocation

* always assume encoder_attention_mask is a mask (i.e. not a bias).
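
For reference, a hedged sketch of the usual mask-to-bias conversion (generic attention code, not necessarily the exact implementation here): a boolean keep/ignore mask becomes an additive bias that broadcasts over heads and query tokens.

```python
import torch

def mask_to_bias(mask: torch.Tensor, dtype: torch.dtype) -> torch.Tensor:
    # mask: [batch, key_tokens], 1/True = attend, 0/False = ignore.
    bias = (1 - mask.to(dtype)) * torch.finfo(dtype).min
    # [batch, 1, key_tokens]: broadcasts over every head and every query token.
    return bias.unsqueeze(1)
```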

* style, fix-copies

* add tests for cross-attention masks

* add test for padding of attention mask

* explain mask's query_tokens dim. fix explanation about broadcasting over channels; we actually broadcast over query tokens

* support both masks and biases in Transformer2DModel#forward. document behaviour

* fix-copies

* delete attention_mask docs on the basis I never tested self-attention masking myself. not comfortable explaining it, since I don't actually understand how a self-attn mask can work in its current form: the key length will be different in every ResBlock (we don't downsample the mask when we downsample the image).

* review feedback: the standard Unet blocks shouldn't pass temb to attn (only to resnet). remove from KCrossAttnDownBlock2D,KCrossAttnUpBlock2D#forward.

* remove encoder_attention_mask param from SimpleCrossAttn{Up,Down}Block2D,UNetMidBlock2DSimpleCrossAttn, and mask-choice in those blocks' #forward, on the basis that they only do one type of attention, so the consumer can pass whichever type of attention_mask is appropriate.

* put attention mask padding back to how it was (since the SD use-case it enabled wasn't important, and it breaks the original unclip use-case). disable the test which was added.

* fix-copies

* style

* fix-copies

* put encoder_attention_mask param back into Simple block forward interfaces, to ensure consistency of forward interface.

* restore passing of emb to KAttentionBlock#forward, on the basis that removal caused test failures. restore also the passing of emb to checkpointed calls to KAttentionBlock#forward.

* make simple unet2d blocks use encoder_attention_mask, but only when attention_mask is None. this should fix UnCLIP compatibility.

* fix copies

* do not scale the initial global step by gradient accumulation steps when loading from checkpoint (#3506)

* Remove CPU latents logic for UniDiffuserPipelineFastTests.

* make style

* Revert "Clean up code and make slow tests pass."

This reverts commit ec7fb8735b.

* Revert bad commit and clean up code.

* add: contributor note.

* Batched load of textual inversions (#3277)

* Batched load of textual inversions

- Only call resize_token_embeddings once per batch as it is the most expensive operation (see the sketch after this list)
- Allow pretrained_model_name_or_path and token to be an optional list
- Remove Dict from type annotation pretrained_model_name_or_path as it was not supported in this function
- Add comment that single files (e.g. .pt/.safetensors) are supported
- Add comment for token parameter
- Convert token override log message from warning to info
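
A hedged sketch of the batching idea using standard tokenizer/text-encoder APIs; the variable names and the pre-loaded `(token, embedding)` pairs are illustrative assumptions, not the loader's actual code:

```python
import torch

def load_textual_inversions(tokenizer, text_encoder, loaded_inversions):
    # loaded_inversions: illustrative list of (token_string, embedding_tensor) pairs.
    tokens = [token for token, _ in loaded_inversions]
    embeddings = [emb for _, emb in loaded_inversions]

    tokenizer.add_tokens(tokens)
    # Single resize for the whole batch instead of one resize per embedding.
    text_encoder.resize_token_embeddings(len(tokenizer))

    token_ids = tokenizer.convert_tokens_to_ids(tokens)
    with torch.no_grad():
        for token_id, embedding in zip(token_ids, embeddings):
            text_encoder.get_input_embeddings().weight[token_id] = embedding
```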

* Update src/diffusers/loaders.py

Check for duplicate tokens

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update condition for None tokens

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Revert "add: contributor note."

This reverts commit 302fde9409.

* Re-add contributor note and refactored fast tests fixed latents code to remove CPU specific logic.

* make style

* Refactored the code:
	- Updated the checkpoint ids to the new ids where appropriate
	- Refactored the UniDiffuserTextDecoder methods to return only tensors (and made other changes to support this)
	- Cleaned up the code following suggestions by patrickvonplaten

* make style

* Remove padding logic from UniDiffuserTextDecoder.generate_beam since the inputs are already padded to a consistent length.

* Update checkpoint id for small test v1 checkpoint to hf-internal-testing/unidiffuser-test-v1.

* make style

* Make improvements to the documentation.

* Move ImageTextPipelineOutput documentation from /api/pipelines/unidiffuser.mdx to /api/diffusion_pipeline.mdx.

* Change order of arguments for UniDiffuserTextDecoder.generate_beam.

* make style

* Update docs/source/en/api/pipelines/unidiffuser.mdx

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
Signed-off-by: Ye, Xinyu <xinyu.ye@intel.com>
Co-authored-by: Ernie Chu <51432514+ernestchu@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Andranik Movsisyan <48154088+19and99@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Andreas Steiner <andstein@google.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Joseph Coffland <github@joe.coffland.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Takuma Mori <takuma104@gmail.com>
Co-authored-by: Will Berman <wlbberman@gmail.com>
Co-authored-by: Tommaso De Rossi <beats.by.morse@gmail.com>
Co-authored-by: Cristian Garcia <cgarcia.e88@gmail.com>
Co-authored-by: cmdr2 <secondary.cmdr2@gmail.com>
Co-authored-by: 1lint <105617163+1lint@users.noreply.github.com>
Co-authored-by: asfiyab-nvidia <117682710+asfiyab-nvidia@users.noreply.github.com>
Co-authored-by: Chanchana Sornsoontorn <off9955555@gmail.com>
Co-authored-by: hwuebben <wbben123@yahoo.de>
Co-authored-by: superhero-7 <57797766+superhero-7@users.noreply.github.com>
Co-authored-by: root <fulong_ye@163.com>
Co-authored-by: nupurkmr9 <nupurkmr9@gmail.com>
Co-authored-by: Nupur Kumari <nupurkumari@Nupurs-MacBook-Pro.local>
Co-authored-by: Nupur Kumari <nupurkumari@nupurs-mbp.wifi.local.cmu.edu>
Co-authored-by: Mishig <mishig.davaadorj@coloradocollege.edu>
Co-authored-by: XinyuYe-Intel <xinyu.ye@intel.com>
Co-authored-by: clarencechen <clarencechenct@gmail.com>
Co-authored-by: regisss <15324346+regisss@users.noreply.github.com>
Co-authored-by: Suraj Patil <surajp815@gmail.com>
Co-authored-by: Youssef Adarrab <104783077+youssefadr@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: Chengrui Wang <80876977+crywang@users.noreply.github.com>
Co-authored-by: SkyTNT <SKYTNT@outlook.com>
Co-authored-by: Lucca Zenóbio <luccazen@gmail.com>
Co-authored-by: Isaac <34376531+init-22@users.noreply.github.com>
Co-authored-by: pdoane <pdoane2@gmail.com>
Co-authored-by: Yuchen Fan <fyc0624@gmail.com>
Co-authored-by: Nipun Jindal <jindal.nipun@gmail.com>
Co-authored-by: njindal <njindal@adobe.com>
Co-authored-by: apolinário <joaopaulo.passos@gmail.com>
Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>
Co-authored-by: Xie Zejian <xiezej@gmail.com>
Co-authored-by: Jair Trejo <jairtrejo@gmail.com>
Co-authored-by: Robert Dargavel Smith <teticio@gmail.com>
Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Joqsan <6027118+Joqsan@users.noreply.github.com>
Co-authored-by: NimenDavid <312648004@qq.com>
Co-authored-by: M. Tolga Cangöz <46008593+standardAI@users.noreply.github.com>
Co-authored-by: timegate <timegate@kaist.ac.kr>
Co-authored-by: Jason Kuan <jason9075@users.noreply.github.com>
Co-authored-by: Ilia Larchenko <41329713+IliaLarchenko@users.noreply.github.com>
Co-authored-by: Horace He <horacehe2007@yahoo.com>
Co-authored-by: Umar <55330742+mu94-csl@users.noreply.github.com>
Co-authored-by: Mylo <36931363+gitmylo@users.noreply.github.com>
Co-authored-by: Markus Pobitzer <markuspobitzer@gmail.com>
Co-authored-by: Cheng Lu <lucheng.lc15@gmail.com>
Co-authored-by: Isamu Isozaki <isamu.website@gmail.com>
Co-authored-by: Cesar Aybar <csaybar@gmail.com>
Co-authored-by: Will Rice <will@spokestack.io>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Rupert Menneer <71332436+rupertmenneer@users.noreply.github.com>
Co-authored-by: sudowind <wfpkueecs@163.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Laureηt <laurentfainsin@protonmail.com>
Co-authored-by: Jongwoo Han <jongwooo.han@gmail.com>
Co-authored-by: Laureηt <laurent@fainsin.bzh>
Co-authored-by: superlabs-dev <133080491+superlabs-dev@users.noreply.github.com>
Co-authored-by: Dev Aggarwal <devxpy@gmail.com>
Co-authored-by: Vimarsh Chaturvedi <vimarsh.c@gmail.com>
Co-authored-by: 7eu7d7 <31194890+7eu7d7@users.noreply.github.com>
Co-authored-by: cmdr2 <shashank.shekhar.global@gmail.com>
Co-authored-by: wfng92 <43742196+wfng92@users.noreply.github.com>
Co-authored-by: Glaceon-Hyy <ffheyy0017@gmail.com>
Co-authored-by: yueyang.hyy <yueyang.hyy@alibaba-inc.com>
Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>
Co-authored-by: Isotr0py <41363108+Isotr0py@users.noreply.github.com>
Co-authored-by: w4ffl35 <w4ffl35@ml1.net>
Co-authored-by: Seongsu Park <tjdtnsu@gmail.com>
Co-authored-by: Chanran Kim <seriousran@gmail.com>
Co-authored-by: Ambrosiussen <paul@ambrosiussen.com>
Co-authored-by: Hari Krishna <37787894+hari10599@users.noreply.github.com>
Co-authored-by: Adrià Arrufat <1671644+arrufat@users.noreply.github.com>
Co-authored-by: At-sushi <dkahw210@kyoto.zaq.ne.jp>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: takuoko <to78314910@gmail.com>
Co-authored-by: Birch-san <Birch-san@users.noreply.github.com>
2023-05-26 17:27:30 +05:30
Steven Liu
7948db81c5 [docs] Add AttnProcessor to docs (#3474)
* add attnprocessor to docs

* fix path to class

* create separate page for attnprocessors

* fix path

* fix path for real

* fill in docstrings

* apply feedback

* apply feedback
2023-05-26 17:11:42 +05:30
Patrick von Platen
bf16a97018 Fix controlnet guess mode euler (#3571)
* Fix guess mode controlnet for euler-like schedulers

* make style

* Co-authored-by: Chanchana Sornsoontorn <off.chanchana@gmail.com>

* Add co author Co-authored-by: Chanchana Sornsoontorn <off.chanchana@gmail.com>

* 2nd try
Co-authored-by: Chanchana Sornsoontorn <off.chanchana@gmail.com>
2023-05-26 11:31:51 +01:00
Patrick von Platen
66356e7dd5 Correct inpainting controlnet docs (#3572) 2023-05-26 11:02:30 +01:00
vikasmech
ffa33d631a renamed variable to input_ and output_ (#3507)
* renamed variable to input_ and output_

* changed input_ to inputs and output_ to outputs
2023-05-26 10:34:11 +01:00
Emin Demirci
d8ce53a8c4 Fix loaded_token reference before definition (#3523) 2023-05-26 10:31:02 +01:00
Patrick von Platen
d114d80fd2 [Stable Diffusion Inpainting] Allow standard text-to-img checkpoints to be useable for SD inpainting (#3533)
* Add default to inpaint

* Make sure controlnet also works with normal sd for inpaint

* Add tests

* improve

* Correct encode images function

* Correct inpaint controlnet

* Improve text2img inpaint

* make style

* up

* up

* up

* up

* fix more
2023-05-26 09:47:42 +01:00
YiYi Xu
e5215dee9a fix broken change for vq pipeline (#3563)
fix vq_model

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2023-05-25 14:55:31 -10:00
YiYi Xu
03b7a84cbe Add Kandinsky 2.1 (#3308)
add kandinsky2.1

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Ayush Mangal <43698245+ayushtues@users.noreply.github.com>
Co-authored-by: ayushmangal <ayushmangal@microsoft.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-05-25 11:28:34 -10:00
Patrick von Platen
f19f128735 Add open parti prompts to docs (#3549)
* Add open parti prompts

* More changes
2023-05-25 11:11:20 +01:00
Isotr0py
a94977b8b3 Fix panorama to support all schedulers (#3546)
* refactor blocks init

* refactor blocks loop

* remove unused function and warnings

* fix scheduler update location

* reformat code

* reformat code again

* fix PNDM test case

* reformat pndm test case
2023-05-24 17:58:08 +05:30
Sayak Paul
8e69708b0d [Examples/DreamBooth] refactor save_model_card utility in dreambooth examples (#3543)
refactor save_model_card utility in dreambooth examples.
2023-05-24 16:16:28 +05:30
Will Berman
db56f8a4f5 explicit broadcasts for assignments (#3535) 2023-05-24 11:17:41 +01:00
Will Berman
c13dbd5c3a fix attention mask pad check (#3531) 2023-05-23 13:11:53 -07:00
Pedro Cuenca
bde2cb5d9b Run torch.compile tests in separate subprocesses (#3503)
* Run ControlNet compile test in a separate subprocess

`torch.compile()` spawns several subprocesses and the GPU memory used
was not reclaimed after the test ran. This approach was taken from
`transformers`.
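
A minimal sketch of the isolation pattern (a placeholder model stands in for the real pipeline test; `spawn` gives the child its own CUDA context, and all GPU memory is released when the child exits):

```python
import multiprocessing

def _compile_test_body(queue):
    try:
        import torch
        model = torch.nn.Linear(8, 8).cuda()             # placeholder model for illustration
        compiled = torch.compile(model)
        out = compiled(torch.randn(2, 8, device="cuda"))
        queue.put(out.shape == (2, 8))
    except Exception as exc:                              # surface child failures to the parent
        queue.put(exc)

def test_compile_in_subprocess():
    ctx = multiprocessing.get_context("spawn")
    queue = ctx.Queue()
    proc = ctx.Process(target=_compile_test_body, args=(queue,))
    proc.start()
    result = queue.get()
    proc.join()
    assert result is True, result
```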

* Style

* Prepare a couple more compile tests to run in subprocess.

* Use require_torch_2 decorator.

* Test inpaint_compile in subprocess.

* Run img2img compile test in subprocess.

* Run stable diffusion compile test in subprocess.

* style

* Temporarily trigger on pr to test.

* Revert "Temporarily trigger on pr to test."

This reverts commit 82d76868dd.
2023-05-23 19:24:17 +02:00
Patrick von Platen
abab61d49e Update README.md 2023-05-23 17:29:18 +01:00
Patrick von Platen
b402604de4 Update README.md (#3525) 2023-05-23 17:28:39 +01:00
Patrick von Platen
84ce50f08e Improve README (#3524)
Update README.md
2023-05-23 16:53:34 +01:00
Patrick von Platen
9e2734a710 Make sure Diffusers works even if Hub is down (#3447)
* Make sure Diffusers works even if Hub is down

* Make sure hub down is well tested
2023-05-23 14:22:43 +01:00
Patrick von Platen
d4197bf4d7 Allow custom pipeline loading (#3504) 2023-05-23 13:20:55 +01:00
takuoko
b134f6a8b6 [Community] ControlNet Reference (#3508)
add controlnet reference and bugfix

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-23 13:20:34 +01:00
yingjieh
edc6505193 [Community Pipelines]Accelerate inference of stable diffusion by IPEX on CPU (#3105)
* add stable_diffusion_ipex community pipeline

* Update readme.md

* reformat

* reformat

* Update examples/community/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update examples/community/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update examples/community/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update examples/community/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Update README.md

* Update README.md

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* style

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-05-23 10:55:14 +02:00
Isotr0py
2f997f30ab Fix bug in panorama pipeline when using dpmsolver scheduler (#3499)
fix panorama pipeline with dpmsolver scheduler
2023-05-23 08:55:15 +05:30
Will Berman
67cd460154 do not scale the initial global step by gradient accumulation steps when loading from checkpoint (#3506) 2023-05-22 15:19:56 -07:00
Birch-san
64bf5d33b7 Support for cross-attention bias / mask (#2634)
* Cross-attention masks

prefer qualified symbol, fix accidental Optional

prefer qualified symbol in AttentionProcessor

prefer qualified symbol in embeddings.py

qualified symbol in transformed_2d

qualify FloatTensor in unet_2d_blocks

move new transformer_2d params attention_mask, encoder_attention_mask to the end of the section which is assumed (e.g. by functions such as checkpoint()) to have a stable positional param interface. regard return_dict as a special-case which is assumed to be injected separately from positional params (e.g. by create_custom_forward()).

move new encoder_attention_mask param to end of CrossAttn block interfaces and Unet2DCondition interface, to maintain positional param interface.

regenerate modeling_text_unet.py

remove unused import

unet_2d_condition encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

versatile_diffusion/modeling_text_unet.py encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

transformer_2d encoder_attention_mask docs

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

unet_2d_blocks.py: add parameter name comments

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

revert description. bool-to-bias treatment happens in unet_2d_condition only.

comment parameter names

fix copies, style

* encoder_attention_mask for SimpleCrossAttnDownBlock2D, SimpleCrossAttnUpBlock2D

* encoder_attention_mask for UNetMidBlock2DSimpleCrossAttn

* support attention_mask, encoder_attention_mask in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D, KAttentionBlock. fix binding of attention_mask, cross_attention_kwargs params in KCrossAttnDownBlock2D, KCrossAttnUpBlock2D checkpoint invocations.

* fix mistake made during merge conflict resolution

* regenerate versatile_diffusion

* pass time embedding into checkpointed attention invocation

* always assume encoder_attention_mask is a mask (i.e. not a bias).

* style, fix-copies

* add tests for cross-attention masks

* add test for padding of attention mask

* explain mask's query_tokens dim. fix explanation about broadcasting over channels; we actually broadcast over query tokens

* support both masks and biases in Transformer2DModel#forward. document behaviour

* fix-copies

* delete attention_mask docs on the basis I never tested self-attention masking myself. not comfortable explaining it, since I don't actually understand how a self-attn mask can work in its current form: the key length will be different in every ResBlock (we don't downsample the mask when we downsample the image).

* review feedback: the standard Unet blocks shouldn't pass temb to attn (only to resnet). remove from KCrossAttnDownBlock2D,KCrossAttnUpBlock2D#forward.

* remove encoder_attention_mask param from SimpleCrossAttn{Up,Down}Block2D,UNetMidBlock2DSimpleCrossAttn, and mask-choice in those blocks' #forward, on the basis that they only do one type of attention, so the consumer can pass whichever type of attention_mask is appropriate.

* put attention mask padding back to how it was (since the SD use-case it enabled wasn't important, and it breaks the original unclip use-case). disable the test which was added.

* fix-copies

* style

* fix-copies

* put encoder_attention_mask param back into Simple block forward interfaces, to ensure consistency of forward interface.

* restore passing of emb to KAttentionBlock#forward, on the basis that removal caused test failures. restore also the passing of emb to checkpointed calls to KAttentionBlock#forward.

* make simple unet2d blocks use encoder_attention_mask, but only when attention_mask is None. this should fix UnCLIP compatibility.

* fix copies
2023-05-22 17:27:15 +01:00
takuoko
c4359d63e3 [Community] reference only control (#3435)
* add reference only control

* add reference only control

* add reference only control

* fix lint

* fix lint

* reference adain

* bugfix EulerAncestralDiscreteScheduler

* fix style fidelity rule

* fix default output size

* del unused line

* fix deterministic
2023-05-22 16:21:54 +01:00
Hari Krishna
f3d570c273 feat: allow disk offload for diffuser models (#3285)
* allow disk offload for diffuser models

* sort import

* add max_memory argument

* Changed sample[0] to images[0] (#3304)

A pipeline object stores the results in `images` not in `sample`.
Current code blocks don't work.
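
In other words (illustrative checkpoint id; any text-to-image checkpoint behaves the same way):

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
result = pipe("an astronaut riding a horse")
image = result.images[0]  # the output field is `images`, not `sample`
```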

* Typo in tutorial (#3295)

* Torch compile graph fix (#3286)

* fix more

* Fix more

* fix more

* Apply suggestions from code review

* fix

* make style

* make fix-copies

* fix

* make sure torch compile

* Clean

* fix test

* Postprocessing refactor img2img (#3268)

* refactor img2img VaeImageProcessor.postprocess

* remove copy from for init, run_safety_checker, decode_latents

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [Torch 2.0 compile] Fix more torch compile breaks (#3313)

* Fix more torch compile breaks

* add tests

* Fix all

* fix controlnet

* fix more

* Add Horace He as co-author.
Co-authored-by: Horace He <horacehe2007@yahoo.com>

* Add Horace He as co-author.

Co-authored-by: Horace He <horacehe2007@yahoo.com>

---------

Co-authored-by: Horace He <horacehe2007@yahoo.com>

* fix: scale_lr and sync example readme and docs. (#3299)

* fix: scale_lr and sync example readme and docs.

* fix doc link.

* Update stable_diffusion.mdx (#3310)

fixed import statement

* Fix missing variable assign in DeepFloyd-IF-II (#3315)

Fix missing variable assign

lol

* Correct doc build for patch releases (#3316)

Update build_documentation.yml

* Add Stable Diffusion RePaint to community pipelines (#3320)

* Add Stable Diffusion RePaint to community pipelines

- Adds Stable Diffusion RePaint to community pipelines
- Add Readme entry for pipeline

* Fix: Remove wrong import

- Remove wrong import
- Minor change in comments

* Fix: Code formatting of stable_diffusion_repaint

* Fix: ruff errors in stable_diffusion_repaint

* Fix multistep dpmsolver for cosine schedule (suitable for deepfloyd-if) (#3314)

* fix multistep dpmsolver for cosine schedule (deepfloyd-if)

* fix a typo

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update all dpmsolver (singlestep, multistep, dpm, dpm++) for cosine noise schedule

* add test, fix style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [docs] Improve LoRA docs (#3311)

* update docs

* add to toctree

* apply feedback

* Added input perturbation (#3292)

* Added input perturbation

* Fixed spelling

* Update write_own_pipeline.mdx (#3323)

* update controlling generation doc with latest goodies. (#3321)

* [Quality] Make style (#3341)

* Fix config dpm (#3343)

* Add the SDE variant of DPM-Solver and DPM-Solver++ (#3344)

* add SDE variant of DPM-Solver and DPM-Solver++

* add test

* fix typo

* fix typo

* Add upsample_size to AttnUpBlock2D, AttnDownBlock2D (#3275)

The argument `upsample_size` needs to be added to these modules to allow compatibility with other blocks that require this argument.

* Rename --only_save_embeds to --save_as_full_pipeline (#3206)

* Set --only_save_embeds to False by default

Due to how the option is named, it makes more sense to behave like this.

* Refactor only_save_embeds to save_as_full_pipeline

* [AudioLDM] Generalise conversion script (#3328)

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Fix TypeError when using prompt_embeds and negative_prompt (#2982)

* test: Added test case

* fix: fixed type checking issue on _encode_prompt

* fix: fixed copies consistency

* fix: one copy was not sufficient

* Fix pipeline class on README (#3345)

Update README.md

* Inpainting: typo in docs (#3331)

Typo in docs

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add `use_Karras_sigmas` to LMSDiscreteScheduler (#3351)

* add karras sigma to lms discrete scheduler

* add test for lms_scheduler karras

* reformat test lms

* Batched load of textual inversions (#3277)

* Batched load of textual inversions

- Only call resize_token_embeddings once per batch as it is the most expensive operation
- Allow pretrained_model_name_or_path and token to be an optional list
- Remove Dict from type annotation pretrained_model_name_or_path as it was not supported in this function
- Add comment that single files (e.g. .pt/.safetensors) are supported
- Add comment for token parameter
- Convert token override log message from warning to info

* Update src/diffusers/loaders.py

Check for duplicate tokens

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update condition for None tokens

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make fix-copies

* [docs] Fix docstring (#3334)

fix docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* if dreambooth lora (#3360)

* update IF stage I pipelines

add fixed variance schedulers and lora loading

* added kv lora attn processor

* allow loading into alternative lora attn processor

* make vae optional

* throw away predicted variance

* allow loading into added kv lora layer

* allow load T5

* allow pre compute text embeddings

* set new variance type in schedulers

* fix copies

* refactor all prompt embedding code

class prompts are now included in pre-encoding code
max tokenizer length is now configurable
embedding attention mask is now configurable

* fix for when variance type is not defined on scheduler

* do not pre compute validation prompt if not present

* add example test for if lora dreambooth

* add check for train text encoder and pre compute text embeddings

* Postprocessing refactor all others (#3337)

* add text2img

* fix-copies

* add

* add all other pipelines

* add

* add

* add

* add

* add

* make style

* style + fix copies

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>

* [docs] Improve safetensors docstring (#3368)

* clarify safetensor docstring

* fix typo

* apply feedback

* add: a warning message when using xformers in a PT 2.0 env. (#3365)

* add: a warning message when using xformers in a PT 2.0 env.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* StableDiffusionInpaintingPipeline - resize image w.r.t height and width (#3322)

* StableDiffusionInpaintingPipeline now resizes input images and masks w.r.t. the passed input height and width. Default is already set to 512. This addresses the common tensor mismatch error. Also moved the type check into the relevant function to keep the main pipeline body tidy.
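
A hedged sketch of the resize step (illustrative helper, not the pipeline's actual code): both the image and the mask are resized to the requested height/width so the VAE latents and the downsampled mask line up.

```python
from PIL import Image

def resize_image_and_mask(image: Image.Image, mask: Image.Image, height: int = 512, width: int = 512):
    image = image.resize((width, height), resample=Image.LANCZOS)
    mask = mask.resize((width, height), resample=Image.NEAREST)  # nearest keeps the mask binary
    return image, mask
```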

* Fixed StableDiffusionInpaintingPrepareMaskAndMaskedImageTests

Due to the previous commit these tests were failing, as height and width need to be passed into the prepare_mask_and_masked_image function. I have updated the code and added a height/width variable per unit test, as it seemed more appropriate than the current hard-coded solution

* Added a resolution test to StableDiffusionInpaintPipelineSlowTests

this unit test simply gets the input and resizes it into something that would fail (e.g. a size that would throw a tensor mismatch error / is not a multiple of 8), then passes it through the pipeline and verifies it produces output with the correct dims w.r.t. the passed height and width

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* [docs] Adapt a model (#3326)

* first draft

* apply feedback

* conv_in.weight thrown away

* [docs] Load safetensors (#3333)

* safetensors

* apply feedback

* apply feedback

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* make style

* [Docs] Fix stable_diffusion.mdx typo (#3398)

Fix typo in last code block. Correct "prommpts" to "prompt"

* Support ControlNet v1.1 shuffle properly (#3340)

* add inferring_controlnet_cond_batch

* Revert "add inferring_controlnet_cond_batch"

This reverts commit abe8d6311d.

* set guess_mode to True
whenever global_pool_conditions is True
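
Roughly, the behaviour described above amounts to the following sketch (`controlnet` and `guess_mode` stand in for the pipeline's existing call arguments; this is not the exact implementation):

```python
def resolve_guess_mode(controlnet, guess_mode: bool) -> bool:
    # Shuffle-style ControlNets set global_pool_conditions in their config; for those
    # models guess mode is forced on regardless of what the caller passed.
    global_pool_conditions = getattr(controlnet.config, "global_pool_conditions", False)
    return guess_mode or global_pool_conditions
```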

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* nit

* add integration test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [Tests] better determinism (#3374)

* enable deterministic pytorch and cuda operations.

* disable manual seeding.

* make style && make quality for unet_2d tests.

* enable determinism for the unet2dconditional model.

* add CUBLAS_WORKSPACE_CONFIG for better reproducibility.

* relax tolerance (very weird issue, though).

* revert to torch manual_seed() where needed.

* relax more tolerance.

* better placement of the cuda variable and relax more tolerance.

* enable determinism for 3d condition model.

* relax tolerance.

* add: determinism to alt_diffusion.

* relax tolerance for alt diffusion.

* dance diffusion.

* dance diffusion is flaky.

* test_dict_tuple_outputs_equivalent edit.

* fix two more tests.

* fix more ddim tests.

* fix: argument.

* change to diff in place of difference.

* fix: test_save_load call.

* test_save_load_float16 call.

* fix: expected_max_diff

* fix: paint by example.

* relax tolerance.

* add determinism to 1d unet model.

* torch 2.0 regressions seem to be brutal

* determinism to vae.

* add reason to skipping.

* up tolerance.

* determinism to vq.

* determinism to cuda.

* determinism to the generic test pipeline file.

* refactor general pipelines testing a bit.

* determinism to alt diffusion i2i

* up tolerance for alt diff i2i and audio diff

* up tolerance.

* determinism to audioldm

* increase tolerance for audioldm lms.

* increase tolerance for paint by example.

* increase tolerance for repaint.

* determinism to cycle diffusion and sd 1.

* relax tol for cycle diffusion 🚲

* relax tol for sd 1.0

* relax tol for controlnet.

* determinism to img var.

* relax tol for img variation.

* tolerance to i2i sd

* make style

* determinism to inpaint.

* relax tolerance for inpainting.

* determinism for inpainting legacy

* relax tolerance.

* determinism to instruct pix2pix

* determinism to model editing.

* model editing tolerance.

* panorama determinism

* determinism to pix2pix zero.

* determinism to sag.

* sd 2. determinism

* sd. tolerance

* disallow tf32 matmul.

* relax tolerance is all you need.

* make style and determinism to sd 2 depth

* relax tolerance for depth.

* tolerance to diffedit.

* tolerance to sd 2 inpaint.

* up tolerance.

* determinism in upscaling.

* tolerance in upscaler.

* more tolerance relaxation.

* determinism to v pred.

* up tol for v_pred

* unclip determinism

* determinism to unclip img2img

* determinism to text to video.

* determinism to last set of tests

* up tol.

* vq cumsum doesn't have a deterministic kernel

* relax tol

* relax tol

* [docs] Add transformers to install (#3388)

add transformers to install

* [deepspeed] partial ZeRO-3 support (#3076)

* [deepspeed] partial ZeRO-3 support

* cleanup

* improve deepspeed fixes

* Improve

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Add omegaconf for tests (#3400)

Add omegaconf

* Fix various bugs with LoRA Dreambooth and Dreambooth script (#3353)

* Improve checkpointing lora

* fix more

* Improve doc string

* Update src/diffusers/loaders.py

* make style

* Apply suggestions from code review

* Update src/diffusers/loaders.py

* Apply suggestions from code review

* Apply suggestions from code review

* better

* Fix all

* Fix multi-GPU dreambooth

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix all

* make style

* make style

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix docker file (#3402)

* up

* up

* fix: deepspeed_plugin retrieval from accelerate state (#3410)

* [Docs] Add `sigmoid` beta_scheduler to docstrings of relevant Schedulers (#3399)

* Add `sigmoid` beta scheduler to `DDPMScheduler` docstring

* Add `sigmoid` beta scheduler to `RePaintScheduler` docstring

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Don't install accelerate and transformers from source (#3415)

* Don't install transformers and accelerate from source (#3414)

* Improve fast tests (#3416)

Update pr_tests.yml

* attention refactor: the trilogy  (#3387)

* Replace `AttentionBlock` with `Attention`

* use _from_deprecated_attn_block check re: @patrickvonplaten

* [Docs] update the PT 2.0 optimization doc with latest findings (#3370)

* add: benchmarking stats for A100 and V100.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* address patrick's comments.

* add: rtx 4090 stats

* ⚔ benchmark reports done

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* 3313 pr link.

* add: plots.

Co-authored-by: Pedro <pedro@huggingface.co>

* fix formatting

* update number percent.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix style rendering (#3433)

* Fix style rendering.

* Fix typo

* unCLIP scheduler do not use note (#3417)

* Replace deprecated command with environment file (#3409)

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* fix warning message pipeline loading (#3446)

* add stable diffusion tensorrt img2img pipeline (#3419)

* add stable diffusion tensorrt img2img pipeline

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* update docstrings

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* Refactor controlnet and add img2img and inpaint (#3386)

* refactor controlnet and add img2img and inpaint

* First draft to get pipelines to work

* make style

* Fix more

* Fix more

* More tests

* Fix more

* Make inpainting work

* make style and more tests

* Apply suggestions from code review

* up

* make style

* Fix imports

* Fix more

* Fix more

* Improve examples

* add test

* Make sure import is correctly deprecated

* Make sure everything works in compile mode

* make sure authorship is correctly attributed

* [Scheduler] DPM-Solver (++) Inverse Scheduler (#3335)

* Add DPM-Solver Multistep Inverse Scheduler

* Add draft tests for DiffEdit

* Add inverse sde-dpmsolver steps to tune image diversity from inverted latents

* Fix tests

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* [Docs] Fix incomplete docstring for resnet.py (#3438)

Fix incomplete docstrings for resnet.py

* fix tiled vae blend extent range (#3384)

fix tiled vae blend extent range

* Small update to "Next steps" section (#3443)

Small update to "Next steps" section:

- PyTorch 2 is recommended.
- Updated improvement figures.

* Allow arbitrary aspect ratio in IFSuperResolutionPipeline (#3298)

* Update pipeline_if_superresolution.py

Allow arbitrary aspect ratio in IFSuperResolutionPipeline by using the input image shape

* IFSuperResolutionPipeline: allow the user to override the height and width through the arguments

* update IFSuperResolutionPipeline width/height doc string to match StableDiffusionInpaintPipeline conventions

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Adding 'strength' parameter to StableDiffusionInpaintingPipeline  (#3424)

* Added explanation of 'strength' parameter

* Added get_timesteps function which relies on new strength parameter

* Added `strength` parameter which defaults to 1.

* Swapped ordering so `noise_timestep` can be calculated before masking the image

this is required when you aren't applying 100% noise to the masked region, e.g. strength < 1.

* Added strength to check_inputs, throws error if out of range

* Changed `prepare_latents` to initialise latents w.r.t strength

Inspired by the stable diffusion img2img pipeline: init latents are initialised by converting the init image into a VAE latent and adding noise (based on the strength parameter passed in), e.g. random when strength = 1, or the init image at strength = 0.

* WIP: Added a unit test for the new strength parameter in the StableDiffusionInpaintingPipeline

still need to add correct regression values

* Created a is_strength_max to initialise from pure random noise

* Updated unit tests w.r.t new strength parameter + fixed new strength unit test

* renamed parameter to avoid confusion with variable of same name

* Updated regression values for new strength test - now passes

* removed 'copied from' comment as this method is now different and divergent from the copy

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Ensure backwards compatibility for prepare_mask_and_masked_image

created a return_image boolean and initialised to false

* Ensure backwards compatibility for prepare_latents

* Fixed copy check typo

* Fixes w.r.t. backward compatibility changes

* make style

* keep function argument ordering same for backwards compatibility in callees with copied from statements

* make fix-copies

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>

* [WIP] Bugfix - Pipeline.from_pretrained is broken when the pipeline is partially downloaded (#3448)

Added bugfix using f strings.

* Fix gradient checkpointing bugs in freezing part of models (requires_grad=False) (#3404)

* gradient checkpointing bug fix

* bug fix; changes for reviews

* reformat

* reformat

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Make dreambooth lora more robust to orig unet (#3462)

* Make dreambooth lora more robust to orig unet

* up

* Reduce peak VRAM by releasing large attention tensors (as soon as they're unnecessary) (#3463)

Release large tensors in attention (as soon as they're no longer required). Reduces peak VRAM by nearly 2 GB for 1024x1024 (even after slicing), and the savings scale up with image size.

* Add min snr to text2img lora training script (#3459)

add min snr to text2img lora training script

* Add inpaint lora scale support (#3460)

* add inpaint lora scale support

* add inpaint lora scale test

---------

Co-authored-by: yueyang.hyy <yueyang.hyy@alibaba-inc.com>

* [From ckpt] Fix from_ckpt (#3466)

* Correct from_ckpt

* make style

* Update full dreambooth script to work with IF (#3425)

* Add IF dreambooth docs (#3470)

* parameterize pass single args through tuple (#3477)

* attend and excite tests disable determinism on the class level (#3478)

* dreambooth docs torch.compile note (#3471)

* dreambooth docs torch.compile note

* Update examples/dreambooth/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* add: if entry in the dreambooth training docs. (#3472)

* [docs] Textual inversion inference (#3473)

* add textual inversion inference to docs

* add to toctree

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* [docs] Distributed inference (#3376)

* distributed inference

* move to inference section

* apply feedback

* update with split_between_processes

* apply feedback

* [{Up,Down}sample1d] explicit view kernel size as number elements in flattened indices (#3479)

explicit view kernel size as number elements in flattened indices

* mps & onnx tests rework (#3449)

* Remove ONNX tests from PR.

They are already a part of push_tests.yml.

* Remove mps tests from PRs.

They are already performed on push.

* Fix workflow name for fast push tests.

* Extract mps tests to a workflow.

For better control/filtering.

* Remove --extra-index-url from mps tests

* Increase tolerance of mps test

This test passes on my Mac (Ventura 13.3) but fails on the CI hardware
(Ventura 13.2). I ran the local tests following the same steps that
exist in the CI workflow.

* Temporarily run mps tests on pr

So we can test.

* Revert "Temporarily run mps tests on pr"

Tests passed, go back to running on push.

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
Co-authored-by: Ilia Larchenko <41329713+IliaLarchenko@users.noreply.github.com>
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Horace He <horacehe2007@yahoo.com>
Co-authored-by: Umar <55330742+mu94-csl@users.noreply.github.com>
Co-authored-by: Mylo <36931363+gitmylo@users.noreply.github.com>
Co-authored-by: Markus Pobitzer <markuspobitzer@gmail.com>
Co-authored-by: Cheng Lu <lucheng.lc15@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Isamu Isozaki <isamu.website@gmail.com>
Co-authored-by: Cesar Aybar <csaybar@gmail.com>
Co-authored-by: Will Rice <will@spokestack.io>
Co-authored-by: Adrià Arrufat <1671644+arrufat@users.noreply.github.com>
Co-authored-by: Sanchit Gandhi <93869735+sanchit-gandhi@users.noreply.github.com>
Co-authored-by: At-sushi <dkahw210@kyoto.zaq.ne.jp>
Co-authored-by: Lucca Zenóbio <luccazen@gmail.com>
Co-authored-by: Lysandre Debut <lysandre@huggingface.co>
Co-authored-by: Isotr0py <41363108+Isotr0py@users.noreply.github.com>
Co-authored-by: pdoane <pdoane2@gmail.com>
Co-authored-by: Will Berman <wlbberman@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Rupert Menneer <71332436+rupertmenneer@users.noreply.github.com>
Co-authored-by: sudowind <wfpkueecs@163.com>
Co-authored-by: Takuma Mori <takuma104@gmail.com>
Co-authored-by: Stas Bekman <stas00@users.noreply.github.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
Co-authored-by: Laureηt <laurentfainsin@protonmail.com>
Co-authored-by: Jongwoo Han <jongwooo.han@gmail.com>
Co-authored-by: asfiyab-nvidia <117682710+asfiyab-nvidia@users.noreply.github.com>
Co-authored-by: clarencechen <clarencechenct@gmail.com>
Co-authored-by: Laureηt <laurent@fainsin.bzh>
Co-authored-by: superlabs-dev <133080491+superlabs-dev@users.noreply.github.com>
Co-authored-by: Dev Aggarwal <devxpy@gmail.com>
Co-authored-by: Vimarsh Chaturvedi <vimarsh.c@gmail.com>
Co-authored-by: 7eu7d7 <31194890+7eu7d7@users.noreply.github.com>
Co-authored-by: cmdr2 <shashank.shekhar.global@gmail.com>
Co-authored-by: wfng92 <43742196+wfng92@users.noreply.github.com>
Co-authored-by: Glaceon-Hyy <ffheyy0017@gmail.com>
Co-authored-by: yueyang.hyy <yueyang.hyy@alibaba-inc.com>
2023-05-22 16:11:08 +01:00
Patrick von Platen
2b56e8ca68 make style 2023-05-22 16:49:46 +02:00
Ambrosiussen
b8b5daaee3 DataLoader respecting EXIF data in Training Images (#3465)
* DataLoader will now bake in any transforms or image manipulations contained in the EXIF

Images may have rotations stored in EXIF. Training using such images will cause those transforms to be ignored while training and thus produce unexpected results
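
A minimal sketch of the fix (standard Pillow API; the helper name is illustrative): apply the EXIF orientation before any training transforms so the tensors match what a viewer would display.

```python
from PIL import Image, ImageOps

def load_training_image(path: str) -> Image.Image:
    image = Image.open(path)
    image = ImageOps.exif_transpose(image)  # bake EXIF rotation/flip into the pixels
    return image.convert("RGB")
```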

* Fixed the Dataloading EXIF issue in main DreamBooth training as well

* Run make style (black & isort)
2023-05-22 15:49:35 +01:00
Seongsu Park
229fd8cbca [Docs] Korean translation (optimization, training) (#3488)
* feat) optimization kr translation

* fix) typo, italic setting

* feat) dreambooth, text2image kr

* feat) lora kr

* fix) LoRA

* fix) fp16 fix

* fix) doc-builder style

* fix) fp16: corrected some wording

* fix) fp16 style fix

* fix) opt, training docs update

* feat) toctree update

* feat) toctree update

---------

Co-authored-by: Chanran Kim <seriousran@gmail.com>
2023-05-22 15:46:16 +01:00
Patrick von Platen
a2874af297 make style 2023-05-22 16:44:48 +02:00
w4ffl35
0160e5146f Adds local_files_only bool to prevent forced online connection (#3486) 2023-05-22 15:44:36 +01:00
Isotr0py
194b0a425d Add use_Karras_sigmas to DPMSolverSinglestepScheduler (#3476)
* add use_karras_sigmas

* add karras test

* add doc
2023-05-22 15:43:56 +01:00
Patrick von Platen
6dd3871ae0 Fix DPM single (#3413)
* Fix DPM single

* add test

* fix one more bug

* Apply suggestions from code review

Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>

---------

Co-authored-by: StAlKeR7779 <stalkek7779@yandex.ru>
2023-05-22 14:32:39 +01:00
Patrick von Platen
51843fd7d0 Refactor full determinism (#3485)
* up

* fix more

* Apply suggestions from code review

* fix more

* fix more

* Check it

* Remove 16:8

* fix more

* fix more

* fix more

* up

* up

* Test only stable diffusion

* Test only two files

* up

* Try out spinning up processes that can be killed

* up

* Apply suggestions from code review

* up

* up
2023-05-22 11:15:11 +01:00
Sayak Paul
49ad61c204 [Docs] add note on local directory path. (#3397)
add note on local directory path.

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-21 15:26:56 +05:30
Sayak Paul
4bbc51d94d [Attention processor] Better warning message when shifting to AttnProcessor2_0 (#3457)
* add: debugging to enabling memory efficient processing

* add: better warning message.
2023-05-21 15:26:47 +05:30
Pedro Cuenca
f7b4f51cc2 mps & onnx tests rework (#3449)
* Remove ONNX tests from PR.

They are already a part of push_tests.yml.

* Remove mps tests from PRs.

They are already performed on push.

* Fix workflow name for fast push tests.

* Extract mps tests to a workflow.

For better control/filtering.

* Remove --extra-index-url from mps tests

* Increase tolerance of mps test

This test passes in my Mac (Ventura 13.3) but fails in the CI hardware
(Ventura 13.2). I ran the local tests following the same steps that
exist in the CI workflow.

* Temporarily run mps tests on pr

So we can test.

* Revert "Temporarily run mps tests on pr"

Tests passed, go back to running on push.
2023-05-20 13:43:07 +02:00
Will Berman
85eff637aa [{Up,Down}sample1d] explicit view kernel size as number elements in flattened indices (#3479)
explicit view kernel size as number of elements in flattened indices
2023-05-19 10:45:56 -07:00
Steven Liu
e589bdb956 [docs] Distributed inference (#3376)
* distributed inference

* move to inference section

* apply feedback

* update with split_between_processes

* apply feedback
2023-05-19 10:07:33 -07:00
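A condensed sketch of the split_between_processes pattern this doc describes; run it with `accelerate launch` on a multi-GPU machine. The prompts and model id are placeholders.

```python
from accelerate import PartialState
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
state = PartialState()
pipe.to(state.device)

prompts = ["a dog", "a cat", "a frog", "a bird"]
with state.split_between_processes(prompts) as subset:  # each process gets a slice
    for prompt in subset:
        image = pipe(prompt).images[0]
        image.save(f"{prompt.replace(' ', '_')}_{state.process_index}.png")
```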
Steven Liu
00c76f6ff1 [docs] Textual inversion inference (#3473)
* add textual inversion inference to docs

* add to toctree

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-05-19 09:47:27 -07:00
Sayak Paul
e343443565 add: if entry in the dreambooth training docs. (#3472) 2023-05-19 07:47:28 +05:30
Will Berman
8d646f2294 dreambooth docs torch.compile note (#3471)
* dreambooth docs torch.compile note

* Update examples/dreambooth/README.md

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

* Update examples/dreambooth/README.md

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-05-19 07:40:14 +05:30
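A sketch of the torch.compile note (PyTorch 2.0+): compile the UNet once before inference and accept a slow first call while the graph is captured. The prompt and checkpoint are illustrative.

```python
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
image = pipe("a photo of sks dog in a bucket").images[0]
```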
Will Berman
8917769499 attend and excite tests disable determinism on the class level (#3478) 2023-05-18 10:24:49 -07:00
Will Berman
49b7ccfb96 parameterize pass single args through tuple (#3477) 2023-05-18 10:14:29 -07:00
Will Berman
7200985eab Add IF dreambooth docs (#3470) 2023-05-17 11:56:10 -07:00
Will Berman
c9f939bf98 Update full dreambooth script to work with IF (#3425) 2023-05-17 10:42:20 -07:00
Patrick von Platen
2858d7e15e [From ckpt] Fix from_ckpt (#3466)
* Correct from_ckpt

* make style
2023-05-17 13:26:53 +01:00
Glaceon-Hyy
88295f92d9 Add inpaint lora scale support (#3460)
* add inpaint lora scale support

* add inpaint lora scale test

---------

Co-authored-by: yueyang.hyy <yueyang.hyy@alibaba-inc.com>
2023-05-17 16:58:19 +05:30
wfng92
2faf91dbde Add min snr to text2img lora training script (#3459)
add min snr to text2img lora training script
2023-05-17 16:37:45 +05:30
cmdr2
bd78f63a54 Reduce peak VRAM by releasing large attention tensors (as soon as they're unnecessary) (#3463)
Release large tensors in attention (as soon as they're no longer required). Reduces peak VRAM by nearly 2 GB for 1024x1024 (even after slicing), and the savings scale up with image size.
2023-05-17 11:24:59 +01:00
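An illustrative sketch of the memory pattern behind this change (not the actual diffusers attention code): drop references to large intermediates as soon as they have been consumed so the allocator can reuse that memory at the peak of the next matmul.

```python
import torch

def attention(query, key, value, scale):
    scores = torch.baddbmm(
        torch.empty(query.shape[0], query.shape[1], key.shape[1],
                    device=query.device, dtype=query.dtype),
        query, key.transpose(-1, -2), beta=0, alpha=scale,
    )
    probs = scores.softmax(dim=-1)
    del scores                       # scores are no longer needed; free them early
    out = torch.bmm(probs, value)
    del probs                        # same for the attention probabilities
    return out
```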
Patrick von Platen
3ebd2d1f9e Make dreambooth lora more robust to orig unet (#3462)
* Make dreambooth lora more robust to orig unet

* up
2023-05-17 11:20:13 +01:00
7eu7d7
15f1bab13b Fix gradient checkpointing bugs in freezing part of models (requires_grad=False) (#3404)
* gradient checkpointing bug fix

* bug fix; changes for reviews

* reformat

* reformat

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-17 11:06:04 +01:00
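A generic PyTorch illustration of the pitfall this commit addresses (not the diffusers patch itself): a checkpointed block that mixes frozen and trainable weights, fed by an input that does not require grad. The reentrant checkpoint warns and drops the trainable gradients in this setup; the non-reentrant variant shown below handles it.

```python
import torch
from torch.utils.checkpoint import checkpoint

frozen = torch.nn.Linear(8, 8).requires_grad_(False)  # frozen base layer
lora_like = torch.nn.Linear(8, 8)                     # small trainable layer inside the block

def block(hidden):
    # checkpointed segment mixing frozen and trainable weights
    return frozen(hidden) + lora_like(hidden)

x = torch.randn(4, 8)                                 # input does not require grad
out = checkpoint(block, x, use_reentrant=False)       # non-reentrant: trainable params still get grads
out.sum().backward()
print(lora_like.weight.grad is not None)              # True
```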
Vimarsh Chaturvedi
415c616712 [WIP] Bugfix - Pipeline.from_pretrained is broken when the pipeline is partially downloaded (#3448)
Added bugfix using f strings.
2023-05-17 11:05:33 +01:00
Rupert Menneer
c09c4f3ab7 Adding 'strength' parameter to StableDiffusionInpaintingPipeline (#3424)
* Added explanation of 'strength' parameter

* Added get_timesteps function which relies on new strength parameter

* Added `strength` parameter which defaults to 1.

* Swapped ordering so `noise_timestep` can be calculated before masking the image

this is required when you aren't applying 100% noise to the masked region, e.g. strength < 1.

* Added strength to check_inputs, throws error if out of range

* Changed `prepare_latents` to initialise latents w.r.t strength

Inspired by the stable diffusion img2img pipeline, init latents are initialised by converting the init image into a VAE latent and adding noise (based on the strength parameter passed in), e.g. random when strength = 1, or the init image at strength = 0.

* WIP: Added a unit test for the new strength parameter in the StableDiffusionInpaintingPipeline

still need to add correct regression values

* Created an is_strength_max flag to initialise from pure random noise

* Updated unit tests w.r.t new strength parameter + fixed new strength unit test

* renamed parameter to avoid confusion with variable of same name

* Updated regression values for new strength test - now passes

* removed 'copied from' comment as this method is now different and divergent from the copy

* Update src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_inpaint.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Ensure backwards compatibility for prepare_mask_and_masked_image

created a return_image boolean and initialised to false

* Ensure backwards compatibility for prepare_latents

* Fixed copy check typo

* Fixes w.r.t. backward compatibility changes

* make style

* keep function argument ordering same for backwards compatibility in callees with copied from statements

* make fix-copies

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: William Berman <WLBberman@gmail.com>
2023-05-17 11:05:16 +01:00
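A hedged usage sketch of the new argument: strength=1.0 fully re-noises the masked region, while smaller values keep more of the original image. The image URLs are placeholders.

```python
import torch
from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = load_image("https://example.com/photo.png")  # placeholder
mask_image = load_image("https://example.com/mask.png")   # placeholder
image = pipe(
    "a white cat sitting on a bench",
    image=init_image,
    mask_image=mask_image,
    strength=0.8,  # < 1.0 preserves some information under the mask
).images[0]
```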
Dev Aggarwal
6070b32fcf Allow arbitrary aspect ratio in IFSuperResolutionPipeline (#3298)
* Update pipeline_if_superresolution.py

Allow arbitrary aspect ratio in IFSuperResolutionPipeline by using the input image shape

* IFSuperResolutionPipeline: allow the user to override the height and width through the arguments

* update IFSuperResolutionPipeline width/height doc string to match StableDiffusionInpaintPipeline conventions

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-16 19:21:07 -07:00
Pedro Cuenca
0392eceba8 Small update to "Next steps" section (#3443)
Small update to "Next steps" section:

- PyTorch 2 is recommended.
- Updated improvement figures.
2023-05-16 19:35:47 +01:00
superlabs-dev
92ea5baca2 fix tiled vae blend extent range (#3384)
fix tiled vae blend extent range
2023-05-16 19:33:47 +01:00
Laureηt
754fac82d2 [Docs] Fix incomplete docstring for resnet.py (#3438)
Fix incomplete docstrings for resnet.py
2023-05-16 19:33:34 +01:00
clarencechen
17f9aed79c [Scheduler] DPM-Solver (++) Inverse Scheduler (#3335)
* Add DPM-Solver Multistep Inverse Scheduler

* Add draft tests for DiffEdit

* Add inverse sde-dpmsolver steps to tune image diversity from inverted latents

* Fix tests

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-16 19:26:53 +01:00
Patrick von Platen
886575ee43 Refactor controlnet and add img2img and inpaint (#3386)
* refactor controlnet and add img2img and inpaint

* First draft to get pipelines to work

* make style

* Fix more

* Fix more

* More tests

* Fix more

* Make inpainting work

* make style and more tests

* Apply suggestions from code review

* up

* make style

* Fix imports

* Fix more

* Fix more

* Improve examples

* add test

* Make sure import is correctly deprecated

* Make sure everything works in compile mode

* make sure authorship is correctly attributed
2023-05-16 19:07:21 +01:00
asfiyab-nvidia
9d44e2fb66 add stable diffusion tensorrt img2img pipeline (#3419)
* add stable diffusion tensorrt img2img pipeline

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

* update docstrings

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>

---------

Signed-off-by: Asfiya Baig <asfiyab@nvidia.com>
2023-05-16 14:28:01 +01:00
Patrick von Platen
d2285f5158 fix warning message pipeline loading (#3446) 2023-05-16 12:58:24 +01:00
Jongwoo Han
326f326e17 Replace deprecated command with environment file (#3409)
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-16 12:51:10 +01:00
Will Berman
29b1325a5a unCLIP scheduler do not use note (#3417) 2023-05-15 09:47:14 -06:00
Pedro Cuenca
7a32b6beeb Fix style rendering (#3433)
* Fix style rendering.

* Fix typo
2023-05-15 14:32:34 +05:30
Sayak Paul
bdefabd1a8 [Docs] update the PT 2.0 optimization doc with latest findings (#3370)
* add: benchmarking stats for A100 and V100.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* address patrick's comments.

* add: rtx 4090 stats

* ⚔ benchmark reports done

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* 3313 pr link.

* add: plots.

Co-authored-by: Pedro <pedro@huggingface.co>

* fix formattimg

* update number percent.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-05-13 15:12:01 +05:30
Will Berman
909742dbd6 attention refactor: the trilogy (#3387)
* Replace `AttentionBlock` with `Attention`

* use _from_deprecated_attn_block check re: @patrickvonplaten
2023-05-12 08:54:09 -06:00
Patrick von Platen
28f404349d Improve fast tests (#3416)
Update pr_tests.yml
2023-05-12 14:01:03 +01:00
Patrick von Platen
03e5126978 Don't install transformers and accelerate from source (#3414) 2023-05-12 13:15:23 +01:00
Patrick von Platen
b1b92f4a98 Don't install accelerate and transformers from source (#3415) 2023-05-12 13:14:04 +01:00
Laureηt
7f6373d264 [Docs] Add sigmoid beta_scheduler to docstrings of relevant Schedulers (#3399)
* Add `sigmoid` beta scheduler to `DDPMScheduler` docstring

* Add `sigmoid` beta scheduler to `RePaintScheduler` docstring

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-12 12:48:26 +01:00
Sayak Paul
3a237f4fa2 fix: deepseepd_plugin retrieval from accelerate state (#3410) 2023-05-12 10:02:22 +01:00
Patrick von Platen
1a5797c6d4 Fix docker file (#3402)
* up

* up
2023-05-11 20:28:37 +01:00
Patrick von Platen
f92253015c Fix various bugs with LoRA Dreambooth and Dreambooth script (#3353)
* Improve checkpointing lora

* fix more

* Improve doc string

* Update src/diffusers/loaders.py

* make style

* Apply suggestions from code review

* Update src/diffusers/loaders.py

* Apply suggestions from code review

* Apply suggestions from code review

* better

* Fix all

* Fix multi-GPU dreambooth

* Apply suggestions from code review

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* Fix all

* make style

* make style

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-05-11 19:28:09 +01:00
Patrick von Platen
58c6f9cb71 Add omegaconf for tests (#3400)
Add omegaconf
2023-05-11 18:03:27 +01:00
Stas Bekman
af2a237676 [deepspeed] partial ZeRO-3 support (#3076)
* [deepspeed] partial ZeRO-3 support

* cleanup

* improve deepspeed fixes

* Improve

* make style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-11 16:59:20 +01:00
Steven Liu
d71db894eb [docs] Add transformers to install (#3388)
add transformers to install
2023-05-11 08:52:28 -07:00
Sayak Paul
90f5f3c4d4 [Tests] better determinism (#3374)
* enable deterministic pytorch and cuda operations.

* disable manual seeding.

* make style && make quality for unet_2d tests.

* enable determinism for the unet2dconditional model.

* add CUBLAS_WORKSPACE_CONFIG for better reproducibility.

* relax tolerance (very weird issue, though).

* revert to torch manual_seed() where needed.

* relax more tolerance.

* better placement of the cuda variable and relax more tolerance.

* enable determinism for 3d condition model.

* relax tolerance.

* add: determinism to alt_diffusion.

* relax tolerance for alt diffusion.

* dance diffusion.

* dance diffusion is flaky.

* test_dict_tuple_outputs_equivalent edit.

* fix two more tests.

* fix more ddim tests.

* fix: argument.

* change to diff in place of difference.

* fix: test_save_load call.

* test_save_load_float16 call.

* fix: expected_max_diff

* fix: paint by example.

* relax tolerance.

* add determinism to 1d unet model.

* torch 2.0 regressions seem to be brutal

* determinism to vae.

* add reason to skipping.

* up tolerance.

* determinism to vq.

* determinism to cuda.

* determinism to the generic test pipeline file.

* refactor general pipelines testing a bit.

* determinism to alt diffusion i2i

* up tolerance for alt diff i2i and audio diff

* up tolerance.

* determinism to audioldm

* increase tolerance for audioldm lms.

* increase tolerance for paint by example.

* increase tolerance for repaint.

* determinism to cycle diffusion and sd 1.

* relax tol for cycle diffusion 🚲

* relax tol for sd 1.0

* relax tol for controlnet.

* determinism to img var.

* relax tol for img variation.

* tolerance to i2i sd

* make style

* determinism to inpaint.

* relax tolerance for inpaiting.

* determinism for inpainting legacy

* relax tolerance.

* determinism to instruct pix2pix

* determinism to model editing.

* model editing tolerance.

* panorama determinism

* determinism to pix2pix zero.

* determinism to sag.

* sd 2. determinism

* sd. tolerance

* disallow tf32 matmul.

* relax tolerance is all you need.

* make style and determinism to sd 2 depth

* relax tolerance for depth.

* tolerance to diffedit.

* tolerance to sd 2 inpaint.

* up tolerance.

* determinism in upscaling.

* tolerance in upscaler.

* more tolerance relaxation.

* determinism to v pred.

* up tol for v_pred

* unclip determinism

* determinism to unclip img2img

* determinism to text to video.

* determinism to last set of tests

* up tol.

* vq cumsum doesn't have a deterministic kernel

* relax tol

* relax tol
2023-05-11 16:38:14 +01:00
Takuma Mori
01c056f094 Support ControlNet v1.1 shuffle properly (#3340)
* add inferring_controlnet_cond_batch

* Revert "add inferring_controlnet_cond_batch"

This reverts commit abe8d6311d.

* set guess_mode to True
whenever global_pool_conditions is True

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* nit

* add integration test

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-11 14:58:07 +01:00
sudowind
e0b56d2b18 [Docs] Fix stable_diffusion.mdx typo (#3398)
Fix typo in last code block. Correct "prommpts" to "prompt"
2023-05-11 15:10:16 +02:00
Patrick von Platen
f740d357c9 make style 2023-05-11 11:31:49 +02:00
Steven Liu
5e746753d6 [docs] Load safetensors (#3333)
* safetensors

* apply feedback

* apply feedback

* Apply suggestions from code review

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-11 10:31:27 +01:00
Steven Liu
c49e9ede4d [docs] Adapt a model (#3326)
* first draft

* apply feedback

* conv_in.weight thrown away
2023-05-10 16:02:48 -07:00
Patrick von Platen
82e6fa56f0 make style 2023-05-10 20:16:18 +02:00
Rupert Menneer
edb087a217 StableDiffusionInpaintingPipeline - resize image w.r.t height and width (#3322)
* StableDiffusionInpaintingPipeline now resizes input images and masks w.r.t. the passed input height and width. Default is already set to 512. This addresses the common tensor mismatch error. Also moved the type check into the relevant function to keep the main pipeline body tidy.

* Fixed StableDiffusionInpaintingPrepareMaskAndMaskedImageTests

Due to the previous commit these tests were failing, as height and width need to be passed into the prepare_mask_and_masked_image function. I have updated the code and added a height/width variable per unit test, as it seemed more appropriate than the current hard-coded solution

* Added a resolution test to StableDiffusionInpaintPipelineSlowTests

this unit test simply gets the input and resizes it into something that would fail (e.g. would throw a tensor mismatch error / not a multiple of 8), then passes it through the pipeline and verifies it produces output with the correct dims w.r.t. the passed height and width

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-10 19:14:25 +01:00
Sayak Paul
94a0c644a8 add: a warning message when using xformers in a PT 2.0 env. (#3365)
* add: a warning message when using xformers in a PT 2.0 env.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-10 07:22:04 +05:30
Steven Liu
26832aa5ef [docs] Improve safetensors docstring (#3368)
* clarify safetensor docstring

* fix typo

* apply feedback
2023-05-09 16:15:05 -07:00
YiYi Xu
c559479592 Postprocessing refactor all others (#3337)
* add text2img

* fix-copies

* add

* add all other pipelines

* add

* add

* add

* add

* add

* make style

* style + fix copies

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
2023-05-09 22:28:30 +01:00
Will Berman
a757b2db6e if dreambooth lora (#3360)
* update IF stage I pipelines

add fixed variance schedulers and lora loading

* added kv lora attn processor

* allow loading into alternative lora attn processor

* make vae optional

* throw away predicted variance

* allow loading into added kv lora layer

* allow load T5

* allow pre compute text embeddings

* set new variance type in schedulers

* fix copies

* refactor all prompt embedding code

class prompts are now included in pre-encoding code
max tokenizer length is now configurable
embedding attention mask is now configurable

* fix for when variance type is not defined on scheduler

* do not pre compute validation prompt if not present

* add example test for if lora dreambooth

* add check for train text encoder and pre compute text embeddings
2023-05-09 10:24:36 -07:00
Steven Liu
571bc1ea11 [docs] Fix docstring (#3334)
fix docstring

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-08 12:08:23 -07:00
Patrick von Platen
f381402ec8 make fix-copies 2023-05-08 10:55:02 +02:00
pdoane
3d8b3d7cd8 Batched load of textual inversions (#3277)
* Batched load of textual inversions

- Only call resize_token_embeddings once per batch as it is the most expensive operation
- Allow pretrained_model_name_or_path and token to be an optional list
- Remove Dict from type annotation pretrained_model_name_or_path as it was not supported in this function
- Add comment that single files (e.g. .pt/.safetensors) are supported
- Add comment for token parameter
- Convert token override log message from warning to info

* Update src/diffusers/loaders.py

Check for duplicate tokens

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update condition for None tokens

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-08 09:54:30 +01:00
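A usage sketch of the batched form described above; the concept repos are examples from the sd-concepts-library collection, and the list-based call is the new part.

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.load_textual_inversion(
    ["sd-concepts-library/cat-toy", "sd-concepts-library/gta5-artwork"],
    token=["<cat-toy>", "<gta5-artwork>"],  # one token per embedding
)
image = pipe("a <cat-toy> rendered in <gta5-artwork> style").images[0]
```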
Isotr0py
0ffac97933 Add use_Karras_sigmas to LMSDiscreteScheduler (#3351)
* add karras sigma to lms discrete scheduler

* add test for lms_scheduler karras

* reformat test lms
2023-05-06 12:19:27 +01:00
Lysandre Debut
b0966f5801 Inpainting: typo in docs (#3331)
Typo in docs

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-06 12:13:33 +01:00
Lucca Zenóbio
0407c3e7d0 Fix pipeline class on README (#3345)
Update README.md
2023-05-06 12:06:52 +01:00
At-sushi
7ce3fa010a Fix TypeError when using prompt_embeds and negative_prompt (#2982)
* test: Added test case

* fix: fixed type checking issue on _encode_prompt

* fix: fixed copies consistency

* fix: one copy was not sufficient
2023-05-06 12:04:07 +01:00
Sanchit Gandhi
abd86d1c17 [AudioLDM] Generalise conversion script (#3328)
Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-06 12:00:42 +01:00
Adrià Arrufat
e9aa0925a8 Rename --only_save_embeds to --save_as_full_pipeline (#3206)
* Set --only_save_embeds to False by default

Due to how the option is named, it makes more sense to behave like this.

* Refactor only_save_embeds to save_as_full_pipeline
2023-05-06 12:00:30 +01:00
Will Rice
36f43ea75a Add upsample_size to AttnUpBlock2D, AttnDownBlock2D (#3275)
The argument `upsample_size` needs to be added to these modules to allow compatibility with other blocks that require this argument.
2023-05-05 19:50:41 +01:00
Cheng Lu
27522b585b Add the SDE variant of DPM-Solver and DPM-Solver++ (#3344)
* add SDE variant of DPM-Solver and DPM-Solver++

* add test

* fix typo

* fix typo
2023-05-05 16:03:47 +01:00
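A sketch of switching an existing pipeline to the SDE variant added here; everything except the algorithm_type string is an illustrative default.

```python
from diffusers import DiffusionPipeline, DPMSolverMultistepScheduler

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.scheduler = DPMSolverMultistepScheduler.from_config(
    pipe.scheduler.config, algorithm_type="sde-dpmsolver++"
)
```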
Patrick von Platen
8d4c7d0ea0 Fix config dpm (#3343) 2023-05-05 12:02:33 +01:00
Patrick von Platen
29ad75dc3b [Quality] Make style (#3341) 2023-05-05 10:06:09 +01:00
Sayak Paul
379197a2f0 update controlling generation doc with latest goodies. (#3321) 2023-05-05 11:22:29 +05:30
Cesar Aybar
79c0e24a14 Update write_own_pipeline.mdx (#3323) 2023-05-04 10:58:27 -07:00
Isamu Isozaki
fa9e35fca4 Added input perturbation (#3292)
* Added input perturbation

* Fixed spelling
2023-05-04 18:12:32 +05:30
Steven Liu
4bae76e453 [docs] Improve LoRA docs (#3311)
* update docs

* add to toctree

* apply feedback
2023-05-04 11:28:44 +05:30
Cheng Lu
022479416f Fix multistep dpmsolver for cosine schedule (suitable for deepfloyd-if) (#3314)
* fix multistep dpmsolver for cosine schedule (deepfloy-if)

* fix a typo

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* update all dpmsolver (singlestep, multistep, dpm, dpm++) for cosine noise schedule

* add test, fix style

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-05-03 18:00:59 +01:00
Markus Pobitzer
2dd408504a Add Stable Diffusion RePaint to community pipelines (#3320)
* Add Stable Diffusion RePaint to community pipelines

- Adds Stable Diffusion RePaint to community pipelines
- Add Readme entry for pipeline

* Fix: Remove wrong import

- Remove wrong import
- Minor change in comments

* Fix: Code formatting of stable_diffusion_repaint

* Fix: ruff errors in stable_diffusion_repaint
2023-05-03 17:59:49 +01:00
Patrick von Platen
79bd909dbd Correct doc build for patch releases (#3316)
Update build_documentation.yml
2023-05-03 17:33:41 +01:00
Mylo
63a8ef7b73 Fix missing variable assign in DeepFloyd-IF-II (#3315)
Fix missing variable assign

lol
2023-05-03 17:31:04 +01:00
Umar
0ccad2ad2d Update stable_diffusion.mdx (#3310)
fixed import statement
2023-05-03 15:53:14 +01:00
Sayak Paul
efc48da23b fix: scale_lr and sync example readme and docs. (#3299)
* fix: scale_lr and sync example readme and docs.

* fix doc link.
2023-05-03 10:13:05 +05:30
Patrick von Platen
5c7a35a259 [Torch 2.0 compile] Fix more torch compile breaks (#3313)
* Fix more torch compile breaks

* add tests

* Fix all

* fix controlnet

* fix more

* Add Horace He as co-author.
Co-authored-by: Horace He <horacehe2007@yahoo.com>

* Add Horace He as co-author.

Co-authored-by: Horace He <horacehe2007@yahoo.com>

---------

Co-authored-by: Horace He <horacehe2007@yahoo.com>
2023-05-02 18:51:00 +01:00
YiYi Xu
a7f25b4a88 Postprocessing refactor img2img (#3268)
* refactor img2img VaeImageProcessor.postprocess

* remove copy from for init, run_safety_checker, decode_latents

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

---------

Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-05-01 07:54:09 -10:00
Patrick von Platen
0e82fb19e1 Torch compile graph fix (#3286)
* fix more

* Fix more

* fix more

* Apply suggestions from code review

* fix

* make style

* make fix-copies

* fix

* make sure torch compile

* Clean

* fix test
2023-05-01 16:45:43 +02:00
Ilia Larchenko
709cf554f6 Typo in tutorial (#3295) 2023-05-01 15:44:30 +02:00
Ilia Larchenko
536684eb2f Changed sample[0] to images[0] (#3304)
A pipeline object stores the results in `images` not in `sample`.
Current code blocks don't work.
2023-05-01 15:33:51 +02:00
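The corrected access pattern in a minimal sketch: pipeline outputs expose the generated images under `.images`, not `.sample`.

```python
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
image = pipe("an astronaut riding a horse").images[0]
image.save("astronaut.png")
```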
Will Berman
384c83aa9a temp disable spectogram diffusion tests (#3278)
The note-seq package throws an error on import because the default installed version of IPython
is not compatible with Python 3.8, which we run in the CI.
https://github.com/huggingface/diffusers/actions/runs/4830121056/jobs/8605954838#step:7:9
2023-04-28 12:05:53 -07:00
YiYi Xu
14b460614b [doc] add link to training script (#3271)
add link to training script

Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
2023-04-28 07:14:30 -10:00
Patrick von Platen
4d35d7fea3 Allow disabling torch 2_0 attention (#3273)
* Allow disabling torch 2_0 attention

* make style

* Update src/diffusers/models/attention.py
2023-04-28 13:31:11 +02:00
Jason Kuan
a7b0671c07 add constant learning rate with custom rule (#3133)
* add constant lr with rules

* add constant with rules in TYPE_TO_SCHEDULER_FUNCTION

* add constant lr rate with rule

* hotfix code quality

* fix doc style

* change name constant_with_rules to piecewise constant
2023-04-28 16:29:56 +05:30
clarencechen
be0bfcec4d Diffedit Zero-Shot Inpainting Pipeline (#2837)
* Update Pix2PixZero Auto-correlation Loss

* Add Stable Diffusion DiffEdit pipeline

* Add draft documentation and import code

* Bugfixes and refactoring

* Add option to not decode latents in the inversion process

* Harmonize preprocessing

* Revert "Update Pix2PixZero Auto-correlation Loss"

This reverts commit b218062fed.

* Update annotations

* rename `compute_mask` to `generate_mask`

* Update documentation

* Update docs

* Update Docs

* Fix copy

* Change shape of output latents to batch first

* Update docs

* Add first draft for tests

* Bugfix and update tests

* Add `cross_attention_kwargs` support for all pipeline methods

* Fix Copies

* Add support for PIL image latents

Add support for mask broadcasting

Update docs and tests

Align `mask` argument to `mask_image`

Remove height and width arguments

* Enable MPS Tests

* Move example docstrings

* Fix test

* Fix test

* fix pipeline inheritance

* Harmonize `prepare_image_latents` with StableDiffusionPix2PixZeroPipeline

* Register modules set to `None` in config for `test_save_load_optional_components`

* Move fixed logic to specific test class

* Clean changes to other pipelines

* Update new tests to coordinate with #2953

* Update slow tests for better results

* Safety to avoid potential problems with torch.inference_mode

* Add reference in SD Pipeline Overview

* Fix tests again

* Enforce determinism in noise for generate_mask

* Fix copies

* Widen test tolerance for fp16 based on `test_stable_diffusion_upscale_pipeline_fp16`

* Add LoraLoaderMixin and update `prepare_image_latents`

* clean up repeat and reg

* bugfix

* Remove invalid args from docs

Suppress spurious warning by repeating image before latent to mask gen
2023-04-28 16:28:26 +05:30
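A hedged end-to-end sketch of the DiffEdit flow added here (mask generation, inversion, then masked generation); the argument names follow the pipeline's documented API, and the input image and prompts are placeholders.

```python
import torch
from diffusers import (
    StableDiffusionDiffEditPipeline,
    DDIMScheduler,
    DDIMInverseScheduler,
)
from diffusers.utils import load_image

pipe = StableDiffusionDiffEditPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1", torch_dtype=torch.float16
).to("cuda")
pipe.scheduler = DDIMScheduler.from_config(pipe.scheduler.config)
pipe.inverse_scheduler = DDIMInverseScheduler.from_config(pipe.scheduler.config)

image = load_image("https://example.com/fruit_bowl.png").resize((768, 768))  # placeholder
source_prompt = "a bowl of fruits"
target_prompt = "a bowl of pears"

# generate a mask of the region to edit, invert the image, then regenerate it
mask = pipe.generate_mask(image=image, source_prompt=source_prompt, target_prompt=target_prompt)
inv_latents = pipe.invert(prompt=source_prompt, image=image).latents
edited = pipe(
    prompt=target_prompt,
    mask_image=mask,
    image_latents=inv_latents,
    negative_prompt=source_prompt,
).images[0]
```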
Patrick von Platen
d464214464 Let's make sure that dreambooth always uploads to the Hub (#3272)
* Update Dreambooth README

* Adapt all docs as well

* automatically write model card

* fix

* make style
2023-04-28 11:39:50 +01:00
timegate
6290668254 Add multiple conditions to StableDiffusionControlNetInpaintPipeline (#3125)
* try multi controlnet inpaint

* multi controlnet inpaint

* multi controlnet inpaint
2023-04-28 10:58:10 +01:00
M. Tolga Cangöz
73cc43109b Update logging.mdx (#2863)
Fix typos
2023-04-28 10:57:27 +01:00
NimenDavid
0614fd2038 [Docs]zh translated docs update (#3245)
* zh translated docs update

* update _toctree
2023-04-28 10:23:02 +01:00
Joqsan
462b4edd31 [Community Pipelines] EDICT pipeline implementation (#3153)
* EDICT pipeline initial commit

- Starting point taken from https://github.com/Joqsan/edict-diffusion

* refactor __init__() method

* minor refactoring

* refactor scheduler code

- remove scheduler and move its methods to the EDICTPipeline class

* make CFG optional
- refactor encode_prompt().
- include optional generator for sampling with vae.
- minor variable renaming

* add EDICT pipeline description to README.md

* replace preprocess() with VaeImageProcessor

* run make style and make quality commands

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-04-28 10:11:29 +01:00
Sayak Paul
71de5b7051 [LoRA] quality of life improvements in the loading semantics and docs (#3180)
* 👽 qol improvements for LoRA.

* better function name?

* fix: LoRA weight loading with the new format.

* address Patrick's comments.

* Apply suggestions from code review

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* change wording around encouraging the use of load_lora_weights().

* fix: function name.

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
2023-04-28 11:36:49 +05:30
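A minimal sketch of the recommended entry point discussed above; the LoRA repo id is hypothetical.

```python
from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.load_lora_weights("some-user/some-lora-checkpoint")  # hypothetical repo id
image = pipe("a prompt in the style the LoRA was trained on").images[0]
```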
Will Berman
256e6960cb [docs] add notes for stateful model changes (#3252)
* [docs] add notes for stateful model changes

* Update docs/source/en/optimization/fp16.mdx

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>

* link to accelerate docs for discarding hooks

---------

Co-authored-by: Pedro Cuenca <pedro@huggingface.co>
2023-04-27 11:05:08 -07:00
YiYi Xu
329d1df8f2 update notebook (#3259)
Co-authored-by: yiyixuxu <yixu@yis-macbook-pro.lan>
2023-04-27 07:03:56 -10:00
Patrick von Platen
364d59d13b Fix community pipelines (#3266) 2023-04-27 17:12:08 +01:00
Patrick von Platen
2ced899cc7 Revert "Revert "[Community Pipelines] Update lpw_stable_diffusion pipeline"" (#3265)
Revert "Revert "[Community Pipelines] Update lpw_stable_diffusion pipeline" (#3201)"

This reverts commit 91a2a80eb2.
2023-04-27 16:45:37 +01:00
Robert Dargavel Smith
b63419a28a AudioDiffusionPipeline - fix encode method after config changes (#3114)
* config fixes

* deprecate get_input_dims
2023-04-27 16:27:41 +01:00
Jair Trejo
eb29dbad17 Fix typo in textual inversion JAX training script (#3123)
The pipeline is built as `pipe` but then used as `pipeline`.
2023-04-27 16:24:12 +01:00
Xie Zejian
d92c4d5ab7 fix typo in score sde pipeline (#3132) 2023-04-27 15:39:14 +01:00
apolinário
eade4308da Update IF name to XL (#3262)
Co-authored-by: multimodalart <joaopaulo.passos+multimodal@gmail.com>
2023-04-27 14:26:58 +01:00
Ernie Chu
fa31da29e5 [docs] Update interface in repaint.mdx (#3119)
Update repaint.mdx

accommodate #1701
2023-04-27 13:24:51 +01:00
Isaac
77bfb56241 adding required parameters while calling the get_up_block and get_down_block (#3210)
* removed unnecessary parameters from get_up_block and get_down_block functions

* adding resnet_skip_time_act, resnet_out_scale_factor and cross_attention_norm to get_up_block and get_down_block functions

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2023-04-27 17:01:43 +05:30
Pedro Cuenca
70ef774fa0 Remove required from tracker_project_name (#3260)
Remove required from tracker_project_name.

As observed by https://github.com/off99555 in https://github.com/huggingface/diffusers/issues/2695#issuecomment-1470755050, it already has a default value.
2023-04-27 16:59:18 +05:30
Nipun Jindal
0b64c2c6c3 [Stochastic Sampler][Slow Test]: Cuda test fixes (#3257)
[Slow Test]: Cuda test fixes

Co-authored-by: njindal <njindal@adobe.com>
2023-04-27 14:52:38 +05:30
Nipun Jindal
fd512d7461 [2064]: Add stochastic sampler (sample_dpmpp_sde) (#3020)
* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* [2064]: Add stochastic sampler

* Review comments

* [Review comment]: Add is_torchsde_available()

* [Review comment]: Test and docs

* [Review comment]

* [Review comment]

* [Review comment]

* [Review comment]

* [Review comment]

---------

Co-authored-by: njindal <njindal@adobe.com>
2023-04-27 11:18:38 +05:30
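A sketch of using the stochastic sampler added here; it requires the optional torchsde dependency, and the checkpoint is illustrative.

```python
from diffusers import DiffusionPipeline, DPMSolverSDEScheduler

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipe.scheduler = DPMSolverSDEScheduler.from_config(pipe.scheduler.config)
image = pipe("a castle at sunset", num_inference_steps=25).images[0]
```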
Pedro Cuenca
e0a2bd15f9 Write model card in controlnet training script (#3229)
Write model card in controlnet training script.
2023-04-26 21:22:27 +02:00
Pedro Cuenca
c399de396d [docs] only mention one stage (#3246)
* [docs] only mention one stage

* add blurb on auto accepting

---------

Co-authored-by: William Berman <WLBberman@gmail.com>
2023-04-26 12:06:50 -07:00
Patrick von Platen
f842396367 Post release for 0.16.0 (#3244)
* Post release

* fix more
2023-04-26 17:43:09 +01:00
1432 changed files with 371129 additions and 42076 deletions


@@ -1,5 +1,5 @@
name: "\U0001F41B Bug Report"
description: Report a bug on diffusers
description: Report a bug on Diffusers
labels: [ "bug" ]
body:
- type: markdown
@@ -10,15 +10,16 @@ body:
Thus, issues are of the same importance as pull requests when contributing to this library ❤️.
In order to make your issue as **useful for the community as possible**, let's try to stick to some simple guidelines:
- 1. Please try to be as precise and concise as possible.
*Give your issue a fitting title. Assume that someone which very limited knowledge of diffusers can understand your issue. Add links to the source code, documentation other issues, pull requests etc...*
*Give your issue a fitting title. Assume that someone which very limited knowledge of Diffusers can understand your issue. Add links to the source code, documentation other issues, pull requests etc...*
- 2. If your issue is about something not working, **always** provide a reproducible code snippet. The reader should be able to reproduce your issue by **only copy-pasting your code snippet into a Python shell**.
*The community cannot solve your issue if it cannot reproduce it. If your bug is related to training, add your training script and make everything needed to train public. Otherwise, just add a simple Python code snippet.*
- 3. Add the **minimum amount of code / context that is needed to understand, reproduce your issue**.
- 3. Add the **minimum** amount of code / context that is needed to understand, reproduce your issue.
*Make the life of maintainers easy. `diffusers` is getting many issues every day. Make sure your issue is about one bug and one bug only. Make sure you add only the context, code needed to understand your issues - nothing more. Generally, every issue is a way of documenting this library, try to make it a good documentation entry.*
- 4. For issues related to community pipelines (i.e., the pipelines located in the `examples/community` folder), please tag the author of the pipeline in your issue thread as those pipelines are not maintained.
- type: markdown
attributes:
value: |
For more in-detail information on how to write good issues you can have a look [here](https://huggingface.co/course/chapter8/5?fw=pt)
For more in-detail information on how to write good issues you can have a look [here](https://huggingface.co/course/chapter8/5?fw=pt).
- type: textarea
id: bug-description
attributes:
@@ -46,6 +47,60 @@ body:
attributes:
label: System Info
description: Please share your system info with us. You can run the command `diffusers-cli env` and copy-paste its output below.
placeholder: diffusers version, platform, python version, ...
placeholder: Diffusers version, platform, Python version, ...
validations:
required: true
- type: textarea
id: who-can-help
attributes:
label: Who can help?
description: |
Your issue will be replied to more quickly if you can figure out the right person to tag with @.
If you know how to use git blame, that is the easiest way, otherwise, here is a rough guide of **who to tag**.
All issues are read by one of the core maintainers, so if you don't know who to tag, just leave this blank and
a core maintainer will ping the right person.
Please tag a maximum of 2 people.
Questions on DiffusionPipeline (Saving, Loading, From pretrained, ...):
Questions on pipelines:
- Stable Diffusion @yiyixuxu @DN6 @sayakpaul
- Stable Diffusion XL @yiyixuxu @sayakpaul @DN6
- Kandinsky @yiyixuxu
- ControlNet @sayakpaul @yiyixuxu @DN6
- T2I Adapter @sayakpaul @yiyixuxu @DN6
- IF @DN6
- Text-to-Video / Video-to-Video @DN6 @sayakpaul
- Wuerstchen @DN6
- Other: @yiyixuxu @DN6
Questions on models:
- UNet @DN6 @yiyixuxu @sayakpaul
- VAE @sayakpaul @DN6 @yiyixuxu
- Transformers/Attention @DN6 @yiyixuxu @sayakpaul @DN6
Questions on Schedulers: @yiyixuxu
Questions on LoRA: @sayakpaul
Questions on Textual Inversion: @sayakpaul
Questions on Training:
- DreamBooth @sayakpaul
- Text-to-Image Fine-tuning @sayakpaul
- Textual Inversion @sayakpaul
- ControlNet @sayakpaul
Questions on Tests: @DN6 @sayakpaul @yiyixuxu
Questions on Documentation: @stevhliu
Questions on JAX- and MPS-related things: @pcuenca
Questions on audio pipelines: @DN6
placeholder: "@Username ..."


@@ -1,7 +1,4 @@
contact_links:
- name: Blank issue
url: https://github.com/huggingface/diffusers/issues/new
about: Other
- name: Forum
url: https://discuss.huggingface.co/
about: General usage questions and community discussions
- name: Questions / Discussions
url: https://github.com/huggingface/diffusers/discussions
about: General usage questions and community discussions


@@ -1,5 +1,5 @@
---
name: "\U0001F680 Feature request"
name: "\U0001F680 Feature Request"
about: Suggest an idea for this project
title: ''
labels: ''
@@ -8,13 +8,13 @@ assignees: ''
---
**Is your feature request related to a problem? Please describe.**
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...]
A clear and concise description of what the problem is. Ex. I'm always frustrated when [...].
**Describe the solution you'd like**
**Describe the solution you'd like.**
A clear and concise description of what you want to happen.
**Describe alternatives you've considered**
**Describe alternatives you've considered.**
A clear and concise description of any alternative solutions or features you've considered.
**Additional context**
**Additional context.**
Add any other context or screenshots about the feature request here.


@@ -1,5 +1,5 @@
name: "\U0001F31F New model/pipeline/scheduler addition"
description: Submit a proposal/request to implement a new diffusion model / pipeline / scheduler
name: "\U0001F31F New Model/Pipeline/Scheduler Addition"
description: Submit a proposal/request to implement a new diffusion model/pipeline/scheduler
labels: [ "New model/pipeline/scheduler" ]
body:
@@ -19,7 +19,7 @@ body:
description: |
Please note that if the model implementation isn't available or if the weights aren't open-source, we are less likely to implement it in `diffusers`.
options:
- label: "The model implementation is available"
- label: "The model implementation is available."
- label: "The model weights are available (Only relevant if addition is not a scheduler)."
- type: textarea

.github/ISSUE_TEMPLATE/translate.md (new file, 29 lines)

@@ -0,0 +1,29 @@
---
name: 🌐 Translating a New Language?
about: Start a new translation effort in your language
title: '[<languageCode>] Translating docs to <languageName>'
labels: WIP
assignees: ''
---
<!--
Note: Please search to see if an issue already exists for the language you are trying to translate.
-->
Hi!
Let's bring the documentation to all the <languageName>-speaking community 🌐.
Who would want to translate? Please follow the 🤗 [TRANSLATING guide](https://github.com/huggingface/diffusers/blob/main/docs/TRANSLATING.md). Here is a list of the files ready for translation. Let us know in this issue if you'd like to translate any, and we'll add your name to the list.
Some notes:
* Please translate using an informal tone (imagine you are talking with a friend about Diffusers 🤗).
* Please translate in a gender-neutral way.
* Add your translations to the folder called `<languageCode>` inside the [source folder](https://github.com/huggingface/diffusers/tree/main/docs/source).
* Register your translation in `<languageCode>/_toctree.yml`; please follow the order of the [English version](https://github.com/huggingface/diffusers/blob/main/docs/source/en/_toctree.yml).
* Once you're finished, open a pull request and tag this issue by including #issue-number in the description, where issue-number is the number of this issue. Please ping @stevhliu for review.
* 🙋 If you'd like others to help you with the translation, you can also post in the 🤗 [forums](https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63).
Thank you so much for your help! 🤗

.github/PULL_REQUEST_TEMPLATE.md (new file, 60 lines)

@@ -0,0 +1,60 @@
# What does this PR do?
<!--
Congratulations! You've made it this far! You're not quite done yet though.
Once merged, your PR is going to appear in the release notes with the title you set, so make sure it's a great title that fully reflects the extent of your awesome contribution.
Then, please replace this with a description of the change and which issue is fixed (if applicable). Please also include relevant motivation and context. List any dependencies (if any) that are required for this change.
Once you're done, someone will review your PR shortly (see the section "Who can review?" below to tag some potential reviewers). They may suggest changes to make the code even better. If no one reviewed your PR after a week has passed, don't hesitate to post a new comment @-mentioning the same persons---sometimes notifications get lost.
-->
<!-- Remove if not applicable -->
Fixes # (issue)
## Before submitting
- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
- [ ] Did you read the [contributor guideline](https://github.com/huggingface/diffusers/blob/main/CONTRIBUTING.md)?
- [ ] Did you read our [philosophy doc](https://github.com/huggingface/diffusers/blob/main/PHILOSOPHY.md) (important for complex PRs)?
- [ ] Was this discussed/approved via a GitHub issue or the [forum](https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63)? Please add a link to it if that's the case.
- [ ] Did you make sure to update the documentation with your changes? Here are the
[documentation guidelines](https://github.com/huggingface/diffusers/tree/main/docs), and
[here are tips on formatting docstrings](https://github.com/huggingface/diffusers/tree/main/docs#writing-source-documentation).
- [ ] Did you write any new necessary tests?
## Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
<!-- Your PR will be replied to more quickly if you can figure out the right person to tag with @.
If you know how to use git blame, that is the easiest way, otherwise, here is a rough guide of **who to tag**.
Please tag fewer than 3 people.
Core library:
- Schedulers: @yiyixuxu
- Pipelines: @sayakpaul @yiyixuxu @DN6
- Training examples: @sayakpaul
- Docs: @stevhliu and @sayakpaul
- JAX and MPS: @pcuenca
- Audio: @sanchit-gandhi
- General functionalities: @sayakpaul @yiyixuxu @DN6
Integrations:
- deepspeed: HF Trainer/Accelerate: @pacman100
HF projects:
- accelerate: [different repo](https://github.com/huggingface/accelerate)
- datasets: [different repo](https://github.com/huggingface/datasets)
- transformers: [different repo](https://github.com/huggingface/transformers)
- safetensors: [different repo](https://github.com/huggingface/safetensors)
-->


@@ -4,7 +4,7 @@ description: Sets up miniconda in your ${RUNNER_TEMP} environment and gives you
inputs:
python-version:
description: If set to any value, dont use sudo to clean the workspace
description: If set to any value, don't use sudo to clean the workspace
required: false
type: string
default: "3.9"
@@ -27,7 +27,7 @@ runs:
- name: Get date
id: get-date
shell: bash
run: echo "::set-output name=today::$(/bin/date -u '+%Y%m%d')d"
run: echo "today=$(/bin/date -u '+%Y%m%d')d" >> $GITHUB_OUTPUT
- name: Setup miniconda cache
id: miniconda-cache
uses: actions/cache@v2
@@ -143,4 +143,4 @@ runs:
echo "There is ${AVAIL}KB free space left in $MOUNT, continue"
fi
fi
done
done

.github/workflows/benchmark.yml (new file, 53 lines)

@@ -0,0 +1,53 @@
name: Benchmarking tests
on:
workflow_dispatch:
schedule:
- cron: "30 1 1,15 * *" # every 2 weeks on the 1st and the 15th of every month at 1:30 AM
env:
DIFFUSERS_IS_CI: yes
HF_HOME: /mnt/cache
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
jobs:
torch_pipelines_cuda_benchmark_tests:
name: Torch Core Pipelines CUDA Benchmarking Tests
strategy:
fail-fast: false
max-parallel: 1
runs-on: [single-gpu, nvidia-gpu, a10, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install pandas peft
- name: Environment
run: |
python utils/print_env.py
- name: Diffusers Benchmarking
env:
HF_TOKEN: ${{ secrets.DIFFUSERS_BOT_TOKEN }}
BASE_PATH: benchmark_outputs
run: |
export TOTAL_GPU_MEMORY=$(python -c "import torch; print(torch.cuda.get_device_properties(0).total_memory / (1024**3))")
cd benchmarks && mkdir ${BASE_PATH} && python run_all.py && python push_results.py
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: benchmark_test_reports
path: benchmarks/benchmark_outputs


@@ -1,20 +1,57 @@
name: Build Docker images (nightly)
name: Test, build, and push Docker images
on:
pull_request: # During PRs, we just check if the changes Dockerfiles can be successfully built
branches:
- main
paths:
- "docker/**"
workflow_dispatch:
schedule:
- cron: "0 0 * * *" # every day at midnight
concurrency:
group: docker-image-builds
cancel-in-progress: false
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
REGISTRY: diffusers
CI_SLACK_CHANNEL: ${{ secrets.CI_DOCKER_CHANNEL }}
jobs:
build-docker-images:
runs-on: ubuntu-latest
test-build-docker-images:
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
if: github.event_name == 'pull_request'
steps:
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
- name: Check out code
uses: actions/checkout@v3
- name: Find Changed Dockerfiles
id: file_changes
uses: jitterbit/get-changed-files@v1
with:
format: 'space-delimited'
token: ${{ secrets.GITHUB_TOKEN }}
- name: Build Changed Docker Images
run: |
CHANGED_FILES="${{ steps.file_changes.outputs.all }}"
for FILE in $CHANGED_FILES; do
if [[ "$FILE" == docker/*Dockerfile ]]; then
DOCKER_PATH="${FILE%/Dockerfile}"
DOCKER_TAG=$(basename "$DOCKER_PATH")
echo "Building Docker image for $DOCKER_TAG"
docker build -t "$DOCKER_TAG" "$DOCKER_PATH"
fi
done
if: steps.file_changes.outputs.all != ''
build-and-push-docker-images:
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
if: github.event_name != 'pull_request'
permissions:
contents: read
@@ -26,21 +63,24 @@ jobs:
image-name:
- diffusers-pytorch-cpu
- diffusers-pytorch-cuda
- diffusers-pytorch-compile-cuda
- diffusers-pytorch-xformers-cuda
- diffusers-flax-cpu
- diffusers-flax-tpu
- diffusers-onnxruntime-cpu
- diffusers-onnxruntime-cuda
- diffusers-doc-builder
steps:
- name: Checkout repository
uses: actions/checkout@v3
- name: Set up Docker Buildx
uses: docker/setup-buildx-action@v1
- name: Login to Docker Hub
uses: docker/login-action@v2
with:
username: ${{ env.REGISTRY }}
password: ${{ secrets.DOCKERHUB_TOKEN }}
- name: Build and push
uses: docker/build-push-action@v3
with:
@@ -48,3 +88,14 @@ jobs:
context: ./docker/${{ matrix.image-name }}
push: true
tags: ${{ env.REGISTRY }}/${{ matrix.image-name }}:latest
- name: Post to a Slack channel
id: slack
uses: huggingface/hf-workflows/.github/actions/post-slack@main
with:
# Slack channel id, channel name, or user id to post message.
# See also: https://api.slack.com/methods/chat.postMessage#channels
slack_channel: ${{ env.CI_SLACK_CHANNEL }}
title: "🤗 Results of the ${{ matrix.image-name }} Docker Image build"
status: ${{ job.status }}
slack_token: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}


@@ -6,14 +6,22 @@ on:
- main
- doc-builder*
- v*-release
- v*-patch
paths:
- "src/diffusers/**.py"
- "examples/**"
- "docs/**"
jobs:
build:
build:
uses: huggingface/doc-builder/.github/workflows/build_main_documentation.yml@main
with:
commit_sha: ${{ github.sha }}
install_libgl1: true
package: diffusers
notebook_folder: diffusers_doc
languages: en ko
languages: en ko zh ja pt
custom_container: diffusers/diffusers-doc-builder
secrets:
token: ${{ secrets.HUGGINGFACE_PUSH }}
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}


@@ -2,6 +2,10 @@ name: Build PR Documentation
on:
pull_request:
paths:
- "src/diffusers/**.py"
- "examples/**"
- "docs/**"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -13,5 +17,7 @@ jobs:
with:
commit_sha: ${{ github.event.pull_request.head.sha }}
pr_number: ${{ github.event.number }}
install_libgl1: true
package: diffusers
languages: en ko
languages: en ko zh ja pt
custom_container: diffusers/diffusers-doc-builder


@@ -1,13 +0,0 @@
name: Delete dev documentation
on:
pull_request:
types: [ closed ]
jobs:
delete:
uses: huggingface/doc-builder/.github/workflows/delete_doc_comment.yml@main
with:
pr_number: ${{ github.event.number }}
package: diffusers


@@ -0,0 +1,89 @@
name: Mirror Community Pipeline
on:
# Push changes on the main branch
push:
branches:
- main
paths:
- 'examples/community/**.py'
# And on tag creation (e.g. `v0.28.1`)
tags:
- '*'
# Manual trigger with ref input
workflow_dispatch:
inputs:
ref:
description: "Either 'main' or a tag ref"
required: true
default: 'main'
jobs:
mirror_community_pipeline:
runs-on: ubuntu-latest
steps:
# Checkout to correct ref
# If workflow dispatch
# If ref is 'main', set:
# CHECKOUT_REF=refs/heads/main
# PATH_IN_REPO=main
# Else it must be a tag. Set:
# CHECKOUT_REF=refs/tags/{tag}
# PATH_IN_REPO={tag}
# If not workflow dispatch
# If ref is 'refs/heads/main' => set 'main'
# Else it must be a tag => set {tag}
- name: Set checkout_ref and path_in_repo
run: |
if [ "${{ github.event_name }}" == "workflow_dispatch" ]; then
if [ -z "${{ github.event.inputs.ref }}" ]; then
echo "Error: Missing ref input"
exit 1
elif [ "${{ github.event.inputs.ref }}" == "main" ]; then
echo "CHECKOUT_REF=refs/heads/main" >> $GITHUB_ENV
echo "PATH_IN_REPO=main" >> $GITHUB_ENV
else
echo "CHECKOUT_REF=refs/tags/${{ github.event.inputs.ref }}" >> $GITHUB_ENV
echo "PATH_IN_REPO=${{ github.event.inputs.ref }}" >> $GITHUB_ENV
fi
elif [ "${{ github.ref }}" == "refs/heads/main" ]; then
echo "CHECKOUT_REF=${{ github.ref }}" >> $GITHUB_ENV
echo "PATH_IN_REPO=main" >> $GITHUB_ENV
else
# e.g. refs/tags/v0.28.1 -> v0.28.1
echo "CHECKOUT_REF=${{ github.ref }}" >> $GITHUB_ENV
echo "PATH_IN_REPO=$(echo ${{ github.ref }} | sed 's/^refs\/tags\///')" >> $GITHUB_ENV
fi
- name: Print env vars
run: |
echo "CHECKOUT_REF: ${{ env.CHECKOUT_REF }}"
echo "PATH_IN_REPO: ${{ env.PATH_IN_REPO }}"
- uses: actions/checkout@v3
with:
ref: ${{ env.CHECKOUT_REF }}
# Setup + install dependencies
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.10"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install --upgrade huggingface_hub
# Check secret is set
- name: whoami
run: huggingface-cli whoami
env:
HF_TOKEN: ${{ secrets.HF_TOKEN_MIRROR_COMMUNITY_PIPELINES }}
# Push to HF! (under subfolder based on checkout ref)
# https://huggingface.co/datasets/diffusers/community-pipelines-mirror
- name: Mirror community pipeline to HF
run: huggingface-cli upload diffusers/community-pipelines-mirror ./examples/community ${PATH_IN_REPO} --repo-type dataset
env:
PATH_IN_REPO: ${{ env.PATH_IN_REPO }}
HF_TOKEN: ${{ secrets.HF_TOKEN_MIRROR_COMMUNITY_PIPELINES }}


@@ -1,6 +1,7 @@
name: Nightly tests on main
name: Nightly and release tests on main/release branch
on:
workflow_dispatch:
schedule:
- cron: "0 0 * * *" # every day at midnight
@@ -12,106 +13,348 @@ env:
PYTEST_TIMEOUT: 600
RUN_SLOW: yes
RUN_NIGHTLY: yes
PIPELINE_USAGE_CUTOFF: 5000
SLACK_API_TOKEN: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
jobs:
run_nightly_tests:
strategy:
fail-fast: false
matrix:
config:
- name: Nightly PyTorch CUDA tests on Ubuntu
framework: pytorch
runner: docker-gpu
image: diffusers/diffusers-pytorch-cuda
report: torch_cuda
- name: Nightly Flax TPU tests on Ubuntu
framework: flax
runner: docker-tpu
image: diffusers/diffusers-flax-tpu
report: flax_tpu
- name: Nightly ONNXRuntime CUDA tests on Ubuntu
framework: onnxruntime
runner: docker-gpu
image: diffusers/diffusers-onnxruntime-cuda
report: onnx_cuda
name: ${{ matrix.config.name }}
runs-on: ${{ matrix.config.runner }}
container:
image: ${{ matrix.config.image }}
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ ${{ matrix.config.runner == 'docker-tpu' && '--privileged' || '--gpus 0'}}
defaults:
run:
shell: bash
setup_torch_cuda_pipeline_matrix:
name: Setup Torch Pipelines Matrix
runs-on: diffusers/diffusers-pytorch-cpu
outputs:
pipeline_test_matrix: ${{ steps.fetch_pipeline_matrix.outputs.pipeline_test_matrix }}
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
if: ${{ matrix.config.runner == 'docker-gpu' }}
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
nvidia-smi
pip install -e .
pip install huggingface_hub
- name: Fetch Pipeline Matrix
id: fetch_pipeline_matrix
run: |
matrix=$(python utils/fetch_torch_cuda_pipeline_test_matrix.py)
echo $matrix
echo "pipeline_test_matrix=$matrix" >> $GITHUB_OUTPUT
- name: Pipeline Tests Artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: test-pipelines.json
path: reports
run_nightly_tests_for_torch_pipelines:
name: Torch Pipelines CUDA Nightly Tests
needs: setup_torch_cuda_pipeline_matrix
strategy:
fail-fast: false
matrix:
module: ${{ fromJson(needs.setup_torch_cuda_pipeline_matrix.outputs.pipeline_test_matrix) }}
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: nvidia-smi
- name: Install dependencies
run: |
python -m pip install -e .[quality,test]
python -m pip install -U git+https://github.com/huggingface/transformers
python -m pip install git+https://github.com/huggingface/accelerate
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
- name: Environment
run: |
python utils/print_env.py
- name: Run nightly PyTorch CUDA tests
if: ${{ matrix.config.framework == 'pytorch' }}
- name: Nightly PyTorch CUDA checkpoint (pipelines) tests
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
- name: Run nightly Flax TPU tests
if: ${{ matrix.config.framework == 'flax' }}
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
run: |
python -m pytest -n 0 \
-s -v -k "Flax" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
- name: Run nightly ONNXRuntime CUDA tests
if: ${{ matrix.config.framework == 'onnxruntime' }}
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
--make-reports=tests_pipeline_${{ matrix.module }}_cuda \
--report-log=tests_pipeline_${{ matrix.module }}_cuda.log \
tests/pipelines/${{ matrix.module }}
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/tests_${{ matrix.config.report }}_failures_short.txt
run: |
cat reports/tests_pipeline_${{ matrix.module }}_cuda_stats.txt
cat reports/tests_pipeline_${{ matrix.module }}_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: ${{ matrix.config.report }}_test_reports
name: pipeline_${{ matrix.module }}_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
run_nightly_tests_for_other_torch_modules:
name: Torch Non-Pipelines CUDA Nightly Tests
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
strategy:
matrix:
module: [models, schedulers, others, examples]
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
- name: Environment
run: python utils/print_env.py
- name: Run nightly PyTorch CUDA tests for non-pipeline modules
if: ${{ matrix.module != 'examples'}}
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_torch_${{ matrix.module }}_cuda \
--report-log=tests_torch_${{ matrix.module }}_cuda.log \
tests/${{ matrix.module }}
- name: Run nightly example tests with Torch
if: ${{ matrix.module == 'examples' }}
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m uv pip install peft@git+https://github.com/huggingface/peft.git
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v --make-reports=examples_torch_cuda \
--report-log=examples_torch_cuda.log \
examples/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_torch_${{ matrix.module }}_cuda_stats.txt
cat reports/tests_torch_${{ matrix.module }}_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: torch_${{ matrix.module }}_cuda_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
run_lora_nightly_tests:
name: Nightly LoRA Tests with PEFT and TORCH
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install peft@git+https://github.com/huggingface/peft.git
python -m uv pip install pytest-reportlog
- name: Environment
run: python utils/print_env.py
- name: Run nightly LoRA tests with PEFT and Torch
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_torch_lora_cuda \
--report-log=tests_torch_lora_cuda.log \
tests/lora
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_torch_lora_cuda_stats.txt
cat reports/tests_torch_lora_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: torch_lora_cuda_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
run_flax_tpu_tests:
name: Nightly Flax TPU Tests
runs-on: docker-tpu
if: github.event_name == 'schedule'
container:
image: diffusers/diffusers-flax-tpu
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ --privileged
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
- name: Environment
run: python utils/print_env.py
- name: Run nightly Flax TPU tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 0 \
-s -v -k "Flax" \
--make-reports=tests_flax_tpu \
--report-log=tests_flax_tpu.log \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_flax_tpu_stats.txt
cat reports/tests_flax_tpu_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: flax_tpu_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
run_nightly_onnx_tests:
name: Nightly ONNXRuntime CUDA tests on Ubuntu
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-onnxruntime-cuda
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m uv pip install pytest-reportlog
- name: Environment
run: python utils/print_env.py
- name: Run nightly ONNXRuntime CUDA tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_onnx_cuda \
--report-log=tests_onnx_cuda.log \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_onnx_cuda_stats.txt
cat reports/tests_onnx_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: ${{ matrix.config.report }}_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY
run_nightly_tests_apple_m1:
name: Nightly PyTorch MPS tests on MacOS
runs-on: [ self-hosted, apple-m1 ]
if: github.event_name == 'schedule'
steps:
- name: Checkout diffusers
@@ -132,10 +375,11 @@ jobs:
- name: Install dependencies
shell: arch -arch arm64 bash {0}
run: |
${CONDA_RUN} python -m pip install --upgrade pip
${CONDA_RUN} python -m pip install -e .[quality,test]
${CONDA_RUN} python -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
${CONDA_RUN} python -m pip install git+https://github.com/huggingface/accelerate
${CONDA_RUN} python -m pip install --upgrade pip uv
${CONDA_RUN} python -m uv pip install -e [quality,test]
${CONDA_RUN} python -m uv pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
${CONDA_RUN} python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate
${CONDA_RUN} python -m uv pip install pytest-reportlog
- name: Environment
shell: arch -arch arm64 bash {0}
@@ -146,9 +390,11 @@ jobs:
shell: arch -arch arm64 bash {0}
env:
HF_HOME: /System/Volumes/Data/mnt/cache
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps tests/
${CONDA_RUN} python -m pytest -n 1 -s -v --make-reports=tests_torch_mps \
--report-log=tests_torch_mps.log \
tests/
- name: Failure short reports
if: ${{ failure() }}
@@ -160,3 +406,9 @@ jobs:
with:
name: torch_mps_test_reports
path: reports
- name: Generate Report and Notify Channel
if: always()
run: |
pip install slack_sdk tabulate
python scripts/log_reports.py >> $GITHUB_STEP_SUMMARY


@@ -0,0 +1,23 @@
name: Notify Slack about a release
on:
workflow_dispatch:
release:
types: [published]
jobs:
build:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: '3.8'
- name: Notify Slack about the release
env:
SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
run: pip install requests && python utils/notify_slack_about_release.py


@@ -0,0 +1,36 @@
name: Run dependency tests
on:
pull_request:
branches:
- main
paths:
- "src/diffusers/**.py"
push:
branches:
- main
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
check_dependencies:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pip install --upgrade pip uv
python -m uv pip install -e .
python -m uv pip install pytest
- name: Check for soft dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
pytest tests/others/test_dependencies.py


@@ -0,0 +1,38 @@
name: Run Flax dependency tests
on:
pull_request:
branches:
- main
paths:
- "src/diffusers/**.py"
push:
branches:
- main
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
check_flax_dependencies:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pip install --upgrade pip uv
python -m uv pip install -e .
python -m uv pip install "jax[cpu]>=0.2.16,!=0.3.2"
python -m uv pip install "flax>=0.4.1"
python -m uv pip install "jaxlib>=0.1.65"
python -m uv pip install pytest
- name: Check for soft dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
pytest tests/others/test_dependencies.py


@@ -1,50 +0,0 @@
name: Run code quality checks
on:
pull_request:
branches:
- main
push:
branches:
- main
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
check_code_quality:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.7"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check quality
run: |
black --check examples tests src utils scripts
ruff examples tests src utils scripts
doc-builder style src/diffusers docs/source --max_len 119 --check_only --path_to_docs docs/source
check_repository_consistency:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.7"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check quality
run: |
python utils/check_copies.py
python utils/check_dummies.py
make deps_table_check_updated

.github/workflows/pr_test_fetcher.yml

@@ -0,0 +1,174 @@
name: Fast tests for PRs - Test Fetcher
on: workflow_dispatch
env:
DIFFUSERS_IS_CI: yes
OMP_NUM_THREADS: 4
MKL_NUM_THREADS: 4
PYTEST_TIMEOUT: 60
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
setup_pr_tests:
name: Setup PR Tests
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
container:
image: diffusers/diffusers-pytorch-cpu
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
defaults:
run:
shell: bash
outputs:
matrix: ${{ steps.set_matrix.outputs.matrix }}
test_map: ${{ steps.set_matrix.outputs.test_map }}
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 0
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
- name: Environment
run: |
python utils/print_env.py
echo $(git --version)
- name: Fetch Tests
run: |
python utils/tests_fetcher.py | tee test_preparation.txt
- name: Report fetched tests
uses: actions/upload-artifact@v3
with:
name: test_fetched
path: test_preparation.txt
- id: set_matrix
name: Create Test Matrix
# The `keys` is used as GitHub actions matrix for jobs, i.e. `models`, `pipelines`, etc.
# The `test_map` is used to get the actual identified test files under each key.
# If no test to run (so no `test_map.json` file), create a dummy map (empty matrix will fail)
run: |
if [ -f test_map.json ]; then
keys=$(python3 -c 'import json; fp = open("test_map.json"); test_map = json.load(fp); fp.close(); d = list(test_map.keys()); print(json.dumps(d))')
test_map=$(python3 -c 'import json; fp = open("test_map.json"); test_map = json.load(fp); fp.close(); print(json.dumps(test_map))')
else
keys=$(python3 -c 'keys = ["dummy"]; print(keys)')
test_map=$(python3 -c 'test_map = {"dummy": []}; print(test_map)')
fi
echo $keys
echo $test_map
echo "matrix=$keys" >> $GITHUB_OUTPUT
echo "test_map=$test_map" >> $GITHUB_OUTPUT
run_pr_tests:
name: Run PR Tests
needs: setup_pr_tests
if: contains(fromJson(needs.setup_pr_tests.outputs.matrix), 'dummy') != true
strategy:
fail-fast: false
max-parallel: 2
matrix:
modules: ${{ fromJson(needs.setup_pr_tests.outputs.matrix) }}
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
container:
image: diffusers/diffusers-pytorch-cpu
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pip install -e [quality,test]
python -m pip install accelerate
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run all selected tests on CPU
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 2 --dist=loadfile -v --make-reports=${{ matrix.modules }}_tests_cpu ${{ fromJson(needs.setup_pr_tests.outputs.test_map)[matrix.modules] }}
- name: Failure short reports
if: ${{ failure() }}
continue-on-error: true
run: |
cat reports/${{ matrix.modules }}_tests_cpu_stats.txt
cat reports/${{ matrix.modules }}_tests_cpu_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v3
with:
name: ${{ matrix.modules }}_test_reports
path: reports
run_staging_tests:
strategy:
fail-fast: false
matrix:
config:
- name: Hub tests for models, schedulers, and pipelines
framework: hub_tests_pytorch
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_hub
name: ${{ matrix.config.name }}
runs-on: ${{ matrix.config.runner }}
container:
image: ${{ matrix.config.image }}
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pip install -e [quality,test]
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run Hub tests for models, schedulers, and pipelines on a staging env
if: ${{ matrix.config.framework == 'hub_tests_pytorch' }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
HUGGINGFACE_CO_STAGING=true python -m pytest \
-m "is_staging_test" \
--make-reports=tests_${{ matrix.config.report }} \
tests
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/tests_${{ matrix.config.report }}_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: pr_${{ matrix.config.report }}_test_reports
path: reports


@@ -0,0 +1,131 @@
name: Fast tests for PRs - PEFT backend
on:
pull_request:
branches:
- main
paths:
- "src/diffusers/**.py"
- "tests/**.py"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
DIFFUSERS_IS_CI: yes
OMP_NUM_THREADS: 4
MKL_NUM_THREADS: 4
PYTEST_TIMEOUT: 60
jobs:
check_code_quality:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check quality
run: make quality
- name: Check if failure
if: ${{ failure() }}
run: |
echo "Quality check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make style && make quality'" >> $GITHUB_STEP_SUMMARY
check_repository_consistency:
needs: check_code_quality
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check repo consistency
run: |
python utils/check_copies.py
python utils/check_dummies.py
make deps_table_check_updated
- name: Check if failure
if: ${{ failure() }}
run: |
echo "Repo consistency check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make fix-copies'" >> $GITHUB_STEP_SUMMARY
run_fast_tests:
needs: [check_code_quality, check_repository_consistency]
strategy:
fail-fast: false
matrix:
lib-versions: ["main", "latest"]
name: LoRA - ${{ matrix.lib-versions }}
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
container:
image: diffusers/diffusers-pytorch-cpu
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
if [ "${{ matrix.lib-versions }}" == "main" ]; then
python -m pip install -U peft@git+https://github.com/huggingface/peft.git
python -m uv pip install -U transformers@git+https://github.com/huggingface/transformers.git
python -m uv pip install -U accelerate@git+https://github.com/huggingface/accelerate.git
else
python -m uv pip install -U peft transformers accelerate
fi
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run fast PyTorch LoRA CPU tests with PEFT backend
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v \
--make-reports=tests_${{ matrix.config.report }} \
tests/lora/
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v \
--make-reports=tests_models_lora_${{ matrix.config.report }} \
tests/models/ -k "lora"
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_${{ matrix.config.report }}_failures_short.txt
cat reports/tests_models_lora_${{ matrix.config.report }}_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: pr_${{ matrix.config.report }}_test_reports
path: reports


@@ -4,6 +4,17 @@ on:
pull_request:
branches:
- main
paths:
- "src/diffusers/**.py"
- "benchmarks/**.py"
- "examples/**.py"
- "scripts/**.py"
- "tests/**.py"
- ".github/**.yml"
- "utils/**.py"
push:
branches:
- ci-*
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
@@ -16,34 +27,72 @@ env:
PYTEST_TIMEOUT: 60
jobs:
check_code_quality:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check quality
run: make quality
- name: Check if failure
if: ${{ failure() }}
run: |
echo "Quality check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make style && make quality'" >> $GITHUB_STEP_SUMMARY
check_repository_consistency:
needs: check_code_quality
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install .[quality]
- name: Check repo consistency
run: |
python utils/check_copies.py
python utils/check_dummies.py
make deps_table_check_updated
- name: Check if failure
if: ${{ failure() }}
run: |
echo "Repo consistency check failed. Please ensure the right dependency versions are installed with 'pip install -e .[quality]' and run 'make fix-copies'" >> $GITHUB_STEP_SUMMARY
run_fast_tests:
needs: [check_code_quality, check_repository_consistency]
strategy:
fail-fast: false
matrix:
config:
- name: Fast PyTorch Pipeline CPU tests
framework: pytorch_pipelines
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 32-cpu, 256-ram, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_cpu_pipelines
- name: Fast PyTorch Models & Schedulers CPU tests
framework: pytorch_models
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_cpu_models_schedulers
- name: Fast Flax CPU tests
framework: flax
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-flax-cpu
report: flax_cpu
- name: Fast ONNXRuntime CPU tests
framework: onnxruntime
runner: docker-cpu
image: diffusers/diffusers-onnxruntime-cpu
report: onnx_cpu
- name: PyTorch Example CPU tests
framework: pytorch_examples
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_example_cpu
@@ -67,19 +116,20 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev -y
python -m pip install -e .[quality,test]
python -m pip install -U git+https://github.com/huggingface/transformers
python -m pip install git+https://github.com/huggingface/accelerate
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run fast PyTorch Pipeline CPU tests
if: ${{ matrix.config.framework == 'pytorch_pipelines' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 8 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/pipelines
@@ -87,33 +137,29 @@ jobs:
- name: Run fast PyTorch Model Scheduler CPU tests
if: ${{ matrix.config.framework == 'pytorch_models' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx and not Dependency" \
--make-reports=tests_${{ matrix.config.report }} \
tests/models tests/schedulers tests/others
- name: Run fast Flax TPU tests
if: ${{ matrix.config.framework == 'flax' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Flax" \
--make-reports=tests_${{ matrix.config.report }} \
tests
- name: Run fast ONNXRuntime CPU tests
if: ${{ matrix.config.framework == 'onnxruntime' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
- name: Run example PyTorch CPU tests
if: ${{ matrix.config.framework == 'pytorch_examples' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install peft timm
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
--make-reports=tests_${{ matrix.config.report }} \
examples/test_examples.py
examples
- name: Failure short reports
if: ${{ failure() }}
@@ -126,9 +172,29 @@ jobs:
name: pr_${{ matrix.config.report }}_test_reports
path: reports
run_fast_tests_apple_m1:
name: Fast PyTorch MPS tests on MacOS
runs-on: [ self-hosted, apple-m1 ]
run_staging_tests:
needs: [check_code_quality, check_repository_consistency]
strategy:
fail-fast: false
matrix:
config:
- name: Hub tests for models, schedulers, and pipelines
framework: hub_tests_pytorch
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_hub
name: ${{ matrix.config.name }}
runs-on: ${{ matrix.config.runner }}
container:
image: ${{ matrix.config.image }}
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
@@ -136,45 +202,32 @@ jobs:
with:
fetch-depth: 2
- name: Clean checkout
shell: arch -arch arm64 bash {0}
run: |
git clean -fxd
- name: Setup miniconda
uses: ./.github/actions/setup-miniconda
with:
python-version: 3.9
- name: Install dependencies
shell: arch -arch arm64 bash {0}
run: |
${CONDA_RUN} python -m pip install --upgrade pip
${CONDA_RUN} python -m pip install -e .[quality,test]
${CONDA_RUN} python -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
${CONDA_RUN} python -m pip install git+https://github.com/huggingface/accelerate
${CONDA_RUN} python -m pip install -U git+https://github.com/huggingface/transformers
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
- name: Environment
shell: arch -arch arm64 bash {0}
run: |
${CONDA_RUN} python utils/print_env.py
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run fast PyTorch tests on M1 (MPS)
shell: arch -arch arm64 bash {0}
env:
HF_HOME: /System/Volumes/Data/mnt/cache
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
- name: Run Hub tests for models, schedulers, and pipelines on a staging env
if: ${{ matrix.config.framework == 'hub_tests_pytorch' }}
run: |
${CONDA_RUN} python -m pytest -n 0 -s -v --make-reports=tests_torch_mps tests/
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
HUGGINGFACE_CO_STAGING=true python -m pytest \
-m "is_staging_test" \
--make-reports=tests_${{ matrix.config.report }} \
tests
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/tests_torch_mps_failures_short.txt
run: cat reports/tests_${{ matrix.config.report }}_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: pr_torch_mps_test_reports
name: pr_${{ matrix.config.report }}_test_reports
path: reports


@@ -0,0 +1,36 @@
name: Run Torch dependency tests
on:
pull_request:
branches:
- main
paths:
- "src/diffusers/**.py"
push:
branches:
- main
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
check_torch_dependencies:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pip install --upgrade pip uv
python -m uv pip install -e .
python -m uv pip install torch torchvision torchaudio
python -m uv pip install pytest
- name: Check for soft dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
pytest tests/others/test_dependencies.py


@@ -1,9 +1,13 @@
name: Slow tests on main
name: Slow Tests on main
on:
push:
branches:
- main
paths:
- "src/diffusers/**.py"
- "examples/**.py"
- "tests/**.py"
env:
DIFFUSERS_IS_CI: yes
@@ -12,111 +16,382 @@ env:
MKL_NUM_THREADS: 8
PYTEST_TIMEOUT: 600
RUN_SLOW: yes
PIPELINE_USAGE_CUTOFF: 50000
jobs:
run_slow_tests:
setup_torch_cuda_pipeline_matrix:
name: Setup Torch Pipelines CUDA Slow Tests Matrix
runs-on: [ self-hosted, intel-cpu, 8-cpu, ci ]
container:
image: diffusers/diffusers-pytorch-cpu
outputs:
pipeline_test_matrix: ${{ steps.fetch_pipeline_matrix.outputs.pipeline_test_matrix }}
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
- name: Environment
run: |
python utils/print_env.py
- name: Fetch Pipeline Matrix
id: fetch_pipeline_matrix
run: |
matrix=$(python utils/fetch_torch_cuda_pipeline_test_matrix.py)
echo $matrix
echo "pipeline_test_matrix=$matrix" >> $GITHUB_OUTPUT
- name: Pipeline Tests Artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: test-pipelines.json
path: reports
torch_pipelines_cuda_tests:
name: Torch Pipelines CUDA Slow Tests
needs: setup_torch_cuda_pipeline_matrix
strategy:
fail-fast: false
max-parallel: 8
matrix:
config:
- name: Slow PyTorch CUDA tests on Ubuntu
framework: pytorch
runner: docker-gpu
image: diffusers/diffusers-pytorch-cuda
report: torch_cuda
- name: Slow Flax TPU tests on Ubuntu
framework: flax
runner: docker-tpu
image: diffusers/diffusers-flax-tpu
report: flax_tpu
- name: Slow ONNXRuntime CUDA tests on Ubuntu
framework: onnxruntime
runner: docker-gpu
image: diffusers/diffusers-onnxruntime-cuda
report: onnx_cuda
name: ${{ matrix.config.name }}
runs-on: ${{ matrix.config.runner }}
module: ${{ fromJson(needs.setup_torch_cuda_pipeline_matrix.outputs.pipeline_test_matrix) }}
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: ${{ matrix.config.image }}
options: --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/ ${{ matrix.config.runner == 'docker-tpu' && '--privileged' || '--gpus 0'}}
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
python utils/print_env.py
- name: Slow PyTorch CUDA checkpoint tests on Ubuntu
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_pipeline_${{ matrix.module }}_cuda \
tests/pipelines/${{ matrix.module }}
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_pipeline_${{ matrix.module }}_cuda_stats.txt
cat reports/tests_pipeline_${{ matrix.module }}_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: pipeline_${{ matrix.module }}_test_reports
path: reports
torch_cuda_tests:
name: Torch CUDA Tests
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
strategy:
matrix:
module: [models, schedulers, lora, others, single_file]
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
if : ${{ matrix.config.runner == 'docker-gpu' }}
run: |
nvidia-smi
- name: Install dependencies
run: |
python -m pip install -e .[quality,test]
python -m pip install -U git+https://github.com/huggingface/transformers
python -m pip install git+https://github.com/huggingface/accelerate
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
python utils/print_env.py
- name: Run slow PyTorch CUDA tests
if: ${{ matrix.config.framework == 'pytorch' }}
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
- name: Run slow Flax TPU tests
if: ${{ matrix.config.framework == 'flax' }}
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
run: |
python -m pytest -n 0 \
-s -v -k "Flax" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
- name: Run slow ONNXRuntime CUDA tests
if: ${{ matrix.config.framework == 'onnxruntime' }}
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
--make-reports=tests_torch_cuda \
tests/${{ matrix.module }}
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/tests_${{ matrix.config.report }}_failures_short.txt
run: |
cat reports/tests_torch_cuda_stats.txt
cat reports/tests_torch_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: ${{ matrix.config.report }}_test_reports
name: torch_cuda_test_reports
path: reports
peft_cuda_tests:
name: PEFT CUDA Tests
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
python -m pip install -U peft@git+https://github.com/huggingface/peft.git
- name: Environment
run: |
python utils/print_env.py
- name: Run slow PEFT CUDA tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
# https://pytorch.org/docs/stable/notes/randomness.html#avoiding-nondeterministic-algorithms
CUBLAS_WORKSPACE_CONFIG: :16:8
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx and not PEFTLoRALoading" \
--make-reports=tests_peft_cuda \
tests/lora/
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "lora and not Flax and not Onnx and not PEFTLoRALoading" \
--make-reports=tests_peft_cuda_models_lora \
tests/models/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_peft_cuda_stats.txt
cat reports/tests_peft_cuda_failures_short.txt
cat reports/tests_peft_cuda_models_lora_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: torch_peft_test_reports
path: reports
flax_tpu_tests:
name: Flax TPU Tests
runs-on: docker-tpu
container:
image: diffusers/diffusers-flax-tpu
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ --privileged
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
python utils/print_env.py
- name: Run slow Flax TPU tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 0 \
-s -v -k "Flax" \
--make-reports=tests_flax_tpu \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_flax_tpu_stats.txt
cat reports/tests_flax_tpu_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: flax_tpu_test_reports
path: reports
onnx_cuda_tests:
name: ONNX CUDA Tests
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-onnxruntime-cuda
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/ --gpus 0
defaults:
run:
shell: bash
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
- name: Environment
run: |
python utils/print_env.py
- name: Run slow ONNXRuntime CUDA tests
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_onnx_cuda \
tests/
- name: Failure short reports
if: ${{ failure() }}
run: |
cat reports/tests_onnx_cuda_stats.txt
cat reports/tests_onnx_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: onnx_cuda_test_reports
path: reports
run_torch_compile_tests:
name: PyTorch Compile CUDA tests
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-compile-cuda
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test,training]
- name: Environment
run: |
python utils/print_env.py
- name: Run example tests on GPU
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "compile" --make-reports=tests_torch_compile_cuda tests/
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/tests_torch_compile_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: torch_compile_test_reports
path: reports
run_xformers_tests:
name: PyTorch xformers CUDA tests
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-xformers-cuda
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Install dependencies
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test,training]
- name: Environment
run: |
python utils/print_env.py
- name: Run example tests on GPU
env:
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v -k "xformers" --make-reports=tests_torch_xformers_cuda tests/
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/tests_torch_xformers_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: torch_xformers_test_reports
path: reports
run_examples_tests:
name: Examples PyTorch CUDA tests on Ubuntu
runs-on: docker-gpu
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: diffusers/diffusers-pytorch-cuda
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/hf_cache:/mnt/cache/
options: --gpus 0 --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Checkout diffusers
@@ -130,23 +405,27 @@ jobs:
- name: Install dependencies
run: |
python -m pip install -e .[quality,test,training]
python -m pip install git+https://github.com/huggingface/accelerate
python -m pip install -U git+https://github.com/huggingface/transformers
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test,training]
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run example tests on GPU
env:
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install timm
python -m pytest -n 1 --max-worker-restart=0 --dist=loadfile -s -v --make-reports=examples_torch_cuda examples/
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/examples_torch_cuda_failures_short.txt
run: |
cat reports/examples_torch_cuda_stats.txt
cat reports/examples_torch_cuda_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}


@@ -1,9 +1,17 @@
name: Slow tests on main
name: Fast tests on main
on:
push:
branches:
- main
paths:
- "src/diffusers/**.py"
- "examples/**.py"
- "tests/**.py"
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
env:
DIFFUSERS_IS_CI: yes
@@ -21,22 +29,22 @@ jobs:
config:
- name: Fast PyTorch CPU tests on Ubuntu
framework: pytorch
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_cpu
- name: Fast Flax CPU tests on Ubuntu
framework: flax
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-flax-cpu
report: flax_cpu
- name: Fast ONNXRuntime CPU tests on Ubuntu
framework: onnxruntime
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-onnxruntime-cpu
report: onnx_cpu
- name: PyTorch Example CPU tests on Ubuntu
framework: pytorch_examples
runner: docker-cpu
runner: [ self-hosted, intel-cpu, 8-cpu, ci ]
image: diffusers/diffusers-pytorch-cpu
report: torch_example_cpu
@@ -60,19 +68,19 @@ jobs:
- name: Install dependencies
run: |
apt-get update && apt-get install libsndfile1-dev -y
python -m pip install -e .[quality,test]
python -m pip install -U git+https://github.com/huggingface/transformers
python -m pip install git+https://github.com/huggingface/accelerate
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
- name: Environment
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python utils/print_env.py
- name: Run fast PyTorch CPU tests
if: ${{ matrix.config.framework == 'pytorch' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "not Flax and not Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
@@ -80,7 +88,8 @@ jobs:
- name: Run fast Flax TPU tests
if: ${{ matrix.config.framework == 'flax' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Flax" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
@@ -88,7 +97,8 @@ jobs:
- name: Run fast ONNXRuntime CPU tests
if: ${{ matrix.config.framework == 'onnxruntime' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
-s -v -k "Onnx" \
--make-reports=tests_${{ matrix.config.report }} \
tests/
@@ -96,9 +106,11 @@ jobs:
- name: Run example PyTorch CPU tests
if: ${{ matrix.config.framework == 'pytorch_examples' }}
run: |
python -m pytest -n 2 --max-worker-restart=0 --dist=loadfile \
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install peft timm
python -m pytest -n 4 --max-worker-restart=0 --dist=loadfile \
--make-reports=tests_${{ matrix.config.report }} \
examples/test_examples.py
examples
- name: Failure short reports
if: ${{ failure() }}
@@ -110,56 +122,3 @@ jobs:
with:
name: pr_${{ matrix.config.report }}_test_reports
path: reports
run_fast_tests_apple_m1:
name: Fast PyTorch MPS tests on MacOS
runs-on: [ self-hosted, apple-m1 ]
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Clean checkout
shell: arch -arch arm64 bash {0}
run: |
git clean -fxd
- name: Setup miniconda
uses: ./.github/actions/setup-miniconda
with:
python-version: 3.9
- name: Install dependencies
shell: arch -arch arm64 bash {0}
run: |
${CONDA_RUN} python -m pip install --upgrade pip
${CONDA_RUN} python -m pip install -e .[quality,test]
${CONDA_RUN} python -m pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cpu
${CONDA_RUN} python -m pip install git+https://github.com/huggingface/accelerate
${CONDA_RUN} python -m pip install -U git+https://github.com/huggingface/transformers
- name: Environment
shell: arch -arch arm64 bash {0}
run: |
${CONDA_RUN} python utils/print_env.py
- name: Run fast PyTorch tests on M1 (MPS)
shell: arch -arch arm64 bash {0}
env:
HF_HOME: /System/Volumes/Data/mnt/cache
HUGGING_FACE_HUB_TOKEN: ${{ secrets.HUGGING_FACE_HUB_TOKEN }}
run: |
${CONDA_RUN} python -m pytest -n 0 -s -v --make-reports=tests_torch_mps tests/
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/tests_torch_mps_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: pr_torch_mps_test_reports
path: reports

.github/workflows/push_tests_mps.yml

@@ -0,0 +1,75 @@
name: Fast mps tests on main
on:
push:
branches:
- main
paths:
- "src/diffusers/**.py"
- "tests/**.py"
env:
DIFFUSERS_IS_CI: yes
HF_HOME: /mnt/cache
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
PYTEST_TIMEOUT: 600
RUN_SLOW: no
concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true
jobs:
run_fast_tests_apple_m1:
name: Fast PyTorch MPS tests on MacOS
runs-on: macos-13-xlarge
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: Clean checkout
shell: arch -arch arm64 bash {0}
run: |
git clean -fxd
- name: Setup miniconda
uses: ./.github/actions/setup-miniconda
with:
python-version: 3.9
- name: Install dependencies
shell: arch -arch arm64 bash {0}
run: |
${CONDA_RUN} python -m pip install --upgrade pip uv
${CONDA_RUN} python -m uv pip install -e [quality,test]
${CONDA_RUN} python -m uv pip install torch torchvision torchaudio
${CONDA_RUN} python -m uv pip install accelerate@git+https://github.com/huggingface/accelerate.git
${CONDA_RUN} python -m uv pip install transformers --upgrade
- name: Environment
shell: arch -arch arm64 bash {0}
run: |
${CONDA_RUN} python utils/print_env.py
- name: Run fast PyTorch tests on M1 (MPS)
shell: arch -arch arm64 bash {0}
env:
HF_HOME: /System/Volumes/Data/mnt/cache
HF_TOKEN: ${{ secrets.HF_TOKEN }}
run: |
${CONDA_RUN} python -m pytest -n 0 -s -v --make-reports=tests_torch_mps tests/
- name: Failure short reports
if: ${{ failure() }}
run: cat reports/tests_torch_mps_failures_short.txt
- name: Test suite reports artifacts
if: ${{ always() }}
uses: actions/upload-artifact@v2
with:
name: pr_torch_mps_test_reports
path: reports

.github/workflows/pypi_publish.yaml

@@ -0,0 +1,81 @@
# Adapted from https://blog.deepjyoti30.dev/pypi-release-github-action
name: PyPI release
on:
workflow_dispatch:
push:
tags:
- "*"
jobs:
find-and-checkout-latest-branch:
runs-on: ubuntu-latest
outputs:
latest_branch: ${{ steps.set_latest_branch.outputs.latest_branch }}
steps:
- name: Checkout Repo
uses: actions/checkout@v3
- name: Set up Python
uses: actions/setup-python@v4
with:
python-version: '3.8'
- name: Fetch latest branch
id: fetch_latest_branch
run: |
pip install -U requests packaging
LATEST_BRANCH=$(python utils/fetch_latest_release_branch.py)
echo "Latest branch: $LATEST_BRANCH"
echo "latest_branch=$LATEST_BRANCH" >> $GITHUB_ENV
- name: Set latest branch output
id: set_latest_branch
run: echo "::set-output name=latest_branch::${{ env.latest_branch }}"
release:
needs: find-and-checkout-latest-branch
runs-on: ubuntu-latest
steps:
- name: Checkout Repo
uses: actions/checkout@v3
with:
ref: ${{ needs.find-and-checkout-latest-branch.outputs.latest_branch }}
- name: Setup Python
uses: actions/setup-python@v4
with:
python-version: "3.8"
- name: Install dependencies
run: |
python -m pip install --upgrade pip
pip install -U setuptools wheel twine
pip install -U torch --index-url https://download.pytorch.org/whl/cpu
pip install -U transformers
- name: Build the dist files
run: python setup.py bdist_wheel && python setup.py sdist
- name: Publish to the test PyPI
env:
TWINE_USERNAME: ${{ secrets.TEST_PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.TEST_PYPI_PASSWORD }}
run: twine upload dist/* -r pypitest --repository-url=https://test.pypi.org/legacy/
- name: Test installing diffusers and importing
run: |
pip install diffusers && pip uninstall diffusers -y
pip install -i https://testpypi.python.org/pypi diffusers
python -c "from diffusers import __version__; print(__version__)"
python -c "from diffusers import DiffusionPipeline; pipe = DiffusionPipeline.from_pretrained('fusing/unet-ldm-dummy-update'); pipe()"
python -c "from diffusers import DiffusionPipeline; pipe = DiffusionPipeline.from_pretrained('hf-internal-testing/tiny-stable-diffusion-pipe', safety_checker=None); pipe('ah suh du')"
python -c "from diffusers import *"
- name: Publish to PyPI
env:
TWINE_USERNAME: ${{ secrets.PYPI_USERNAME }}
TWINE_PASSWORD: ${{ secrets.PYPI_PASSWORD }}
run: twine upload dist/* -r pypi


@@ -0,0 +1,73 @@
name: Check running SLOW tests from a PR (only GPU)
on:
workflow_dispatch:
inputs:
docker_image:
default: 'diffusers/diffusers-pytorch-cuda'
description: 'Name of the Docker image'
required: true
branch:
description: 'PR Branch to test on'
required: true
test:
description: 'Tests to run (e.g.: `tests/models`).'
required: true
env:
DIFFUSERS_IS_CI: yes
IS_GITHUB_CI: "1"
HF_HOME: /mnt/cache
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
PYTEST_TIMEOUT: 600
RUN_SLOW: yes
jobs:
run_tests:
name: "Run a test on our runner from a PR"
runs-on: [single-gpu, nvidia-gpu, t4, ci]
container:
image: ${{ github.event.inputs.docker_image }}
options: --gpus 0 --privileged --ipc host -v /mnt/cache/.cache/huggingface:/mnt/cache/
steps:
- name: Validate test files input
id: validate_test_files
env:
PY_TEST: ${{ github.event.inputs.test }}
run: |
if [[ ! "$PY_TEST" =~ ^tests/ ]]; then
echo "Error: The input string must start with 'tests/'."
exit 1
fi
if [[ ! "$PY_TEST" =~ ^tests/(models|pipelines) ]]; then
echo "Error: The input string must contain either 'models' or 'pipelines' after 'tests/'."
exit 1
fi
if [[ "$PY_TEST" == *";"* ]]; then
echo "Error: The input string must not contain ';'."
exit 1
fi
echo "$PY_TEST"
- name: Checkout PR branch
uses: actions/checkout@v4
with:
ref: ${{ github.event.inputs.branch }}
repository: ${{ github.event.pull_request.head.repo.full_name }}
- name: Install pytest
run: |
python -m venv /opt/venv && export PATH="/opt/venv/bin:$PATH"
python -m uv pip install -e [quality,test]
python -m uv pip install peft
- name: Run tests
env:
PY_TEST: ${{ github.event.inputs.test }}
run: |
pytest "$PY_TEST"

.github/workflows/ssh-runner.yml

@@ -0,0 +1,46 @@
name: SSH into runners
on:
workflow_dispatch:
inputs:
runner_type:
description: 'Type of runner to test (a10 or t4)'
required: true
docker_image:
description: 'Name of the Docker image'
required: true
env:
IS_GITHUB_CI: "1"
HF_HUB_READ_TOKEN: ${{ secrets.HF_HUB_READ_TOKEN }}
HF_HOME: /mnt/cache
DIFFUSERS_IS_CI: yes
OMP_NUM_THREADS: 8
MKL_NUM_THREADS: 8
RUN_SLOW: yes
jobs:
ssh_runner:
name: "SSH"
runs-on: [single-gpu, nvidia-gpu, "${{ github.event.inputs.runner_type }}", ci]
container:
image: ${{ github.event.inputs.docker_image }}
options: --shm-size "16gb" --ipc host -v /mnt/cache/.cache/huggingface/diffusers:/mnt/cache/ --gpus 0 --privileged
steps:
- name: Checkout diffusers
uses: actions/checkout@v3
with:
fetch-depth: 2
- name: NVIDIA-SMI
run: |
nvidia-smi
- name: Tailscale # In order to be able to SSH when a test fails
uses: huggingface/tailscale-action@main
with:
authkey: ${{ secrets.TAILSCALE_SSH_AUTHKEY }}
slackChannel: ${{ secrets.SLACK_CIFEEDBACK_CHANNEL }}
slackToken: ${{ secrets.SLACK_CIFEEDBACK_BOT_TOKEN }}
waitForSSH: true


@@ -17,7 +17,7 @@ jobs:
- name: Setup Python
uses: actions/setup-python@v1
with:
python-version: 3.7
python-version: 3.8
- name: Install requirements
run: |

.github/workflows/trufflehog.yml

@@ -0,0 +1,15 @@
on:
push:
name: Secret Leaks
jobs:
trufflehog:
runs-on: ubuntu-latest
steps:
- name: Checkout code
uses: actions/checkout@v4
with:
fetch-depth: 0
- name: Secret Scanning
uses: trufflesecurity/trufflehog@main

.github/workflows/update_metadata.yml

@@ -0,0 +1,30 @@
name: Update Diffusers metadata
on:
workflow_dispatch:
push:
branches:
- main
- update_diffusers_metadata*
jobs:
update_metadata:
runs-on: ubuntu-22.04
defaults:
run:
shell: bash -l {0}
steps:
- uses: actions/checkout@v3
- name: Setup environment
run: |
pip install --upgrade pip
pip install datasets pandas
pip install .[torch]
- name: Update metadata
env:
HF_TOKEN: ${{ secrets.SAYAK_HF_TOKEN }}
run: |
python utils/update_metadata.py --commit_sha ${{ github.sha }}


@@ -0,0 +1,16 @@
name: Upload PR Documentation
on:
workflow_run:
workflows: ["Build PR Documentation"]
types:
- completed
jobs:
build:
uses: huggingface/doc-builder/.github/workflows/upload_pr_documentation.yml@main
with:
package_name: diffusers
secrets:
hf_token: ${{ secrets.HF_DOC_BUILD_PUSH }}
comment_bot_token: ${{ secrets.COMMENT_BOT_TOKEN }}

.gitignore

@@ -1,4 +1,4 @@
# Initially taken from Github's Python gitignore file
# Initially taken from GitHub's Python gitignore file
# Byte-compiled / optimized / DLL files
__pycache__/
@@ -34,7 +34,7 @@ wheels/
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# Usually these files are written by a Python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
@@ -153,7 +153,7 @@ debug.env
# vim
.*.swp
#ctags
# ctags
tags
# pre-commit
@@ -164,6 +164,7 @@ tags
# DS_Store (MacOS)
.DS_Store
# RL pipelines may produce mp4 outputs
*.mp4
@@ -173,4 +174,5 @@ tags
# ruff
.ruff_cache
wandb
# wandb
wandb


@@ -19,6 +19,16 @@ authors:
family-names: Rasul
- given-names: Mishig
family-names: Davaadorj
- given-names: Dhruv
family-names: Nair
- given-names: Sayak
family-names: Paul
- given-names: Steven
family-names: Liu
- given-names: William
family-names: Berman
- given-names: Yiyi
family-names: Xu
- given-names: Thomas
family-names: Wolf
repository-code: 'https://github.com/huggingface/diffusers'
@@ -31,10 +41,12 @@ keywords:
- deep-learning
- pytorch
- image-generation
- hacktoberfest
- diffusion
- text2image
- image2image
- score-based-generative-modeling
- stable-diffusion
- stable-diffusion-diffusers
license: Apache-2.0
version: 0.12.1


@@ -7,7 +7,7 @@ We as members, contributors, and leaders pledge to make participation in our
community a harassment-free experience for everyone, regardless of age, body
size, visible or invisible disability, ethnicity, sex characteristics, gender
identity and expression, level of experience, education, socio-economic status,
nationality, personal appearance, race, religion, or sexual identity
nationality, personal appearance, race, caste, color, religion, or sexual identity
and orientation.
We pledge to act and interact in ways that contribute to an open, welcoming,
@@ -24,7 +24,7 @@ community include:
* Accepting responsibility and apologizing to those affected by our mistakes,
and learning from the experience
* Focusing on what is best not just for us as individuals, but for the
overall diffusers community
overall Diffusers community
Examples of unacceptable behavior include:
@@ -117,8 +117,8 @@ the community.
## Attribution
This Code of Conduct is adapted from the [Contributor Covenant][homepage],
version 2.0, available at
https://www.contributor-covenant.org/version/2/0/code_of_conduct.html.
version 2.1, available at
https://www.contributor-covenant.org/version/2/1/code_of_conduct.html.
Community Impact Guidelines were inspired by [Mozilla's code of conduct
enforcement ladder](https://github.com/mozilla/diversity).

CONTRIBUTING.md

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -14,7 +14,7 @@ specific language governing permissions and limitations under the License.
We ❤️ contributions from the open-source community! Everyone is welcome, and all types of participation not just code are valued and appreciated. Answering questions, helping others, reaching out, and improving the documentation are all immensely valuable to the community, so don't be afraid and get involved if you're up for it!
Everyone is encouraged to start by saying 👋 in our public Discord channel. We discuss the latest trends in diffusion models, ask questions, show off personal projects, help each other with contributions, or just hang out ☕. <a href="https://Discord.gg/G7tWnz98XR"><img alt="Join us on Discord" src="https://img.shields.io/Discord/823813159592001537?color=5865F2&logo=Discord&logoColor=white"></a>
Everyone is encouraged to start by saying 👋 in our public Discord channel. We discuss the latest trends in diffusion models, ask questions, show off personal projects, help each other with contributions, or just hang out ☕. <a href="https://discord.gg/G7tWnz98XR"><img alt="Join us on Discord" src="https://img.shields.io/discord/823813159592001537?color=5865F2&logo=Discord&logoColor=white"></a>
Whichever way you choose to contribute, we strive to be part of an open, welcoming, and kind community. Please, read our [code of conduct](https://github.com/huggingface/diffusers/blob/main/CODE_OF_CONDUCT.md) and be mindful to respect it during your interactions. We also recommend you become familiar with the [ethical guidelines](https://huggingface.co/docs/diffusers/conceptual/ethical_guidelines) that guide our project and ask you to adhere to the same principles of transparency and responsibility.
@@ -28,11 +28,11 @@ the core library.
In the following, we give an overview of different ways to contribute, ranked by difficulty in ascending order. All of them are valuable to the community.
* 1. Asking and answering questions on [the Diffusers discussion forum](https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers) or on [Discord](https://discord.gg/G7tWnz98XR).
* 2. Opening new issues on [the GitHub Issues tab](https://github.com/huggingface/diffusers/issues/new/choose)
* 3. Answering issues on [the GitHub Issues tab](https://github.com/huggingface/diffusers/issues)
* 2. Opening new issues on [the GitHub Issues tab](https://github.com/huggingface/diffusers/issues/new/choose).
* 3. Answering issues on [the GitHub Issues tab](https://github.com/huggingface/diffusers/issues).
* 4. Fix a simple issue, marked by the "Good first issue" label, see [here](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22).
* 5. Contribute to the [documentation](https://github.com/huggingface/diffusers/tree/main/docs/source).
* 6. Contribute a [Community Pipeline](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3Acommunity-examples)
* 6. Contribute a [Community Pipeline](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3Acommunity-examples).
* 7. Contribute to the [examples](https://github.com/huggingface/diffusers/tree/main/examples).
* 8. Fix a more difficult issue, marked by the "Good second issue" label, see [here](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22Good+second+issue%22).
* 9. Add a new pipeline, model, or scheduler, see ["New Pipeline/Model"](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22New+pipeline%2Fmodel%22) and ["New scheduler"](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22New+scheduler%22) issues. For this contribution, please have a look at [Design Philosophy](https://github.com/huggingface/diffusers/blob/main/PHILOSOPHY.md).
@@ -40,7 +40,7 @@ In the following, we give an overview of different ways to contribute, ranked by
As said before, **all contributions are valuable to the community**.
In the following, we will explain each contribution a bit more in detail.
For all contributions 4.-9. you will need to open a PR. It is explained in detail how to do so in [Opening a pull requst](#how-to-open-a-pr)
For all contributions 4-9, you will need to open a PR. It is explained in detail how to do so in [Opening a pull request](#how-to-open-a-pr).
### 1. Asking and answering questions on the Diffusers discussion forum or on the Diffusers Discord
@@ -63,7 +63,7 @@ In the same spirit, you are of immense help to the community by answering such q
**Please** keep in mind that the more effort you put into asking or answering a question, the higher
the quality of the publicly documented knowledge. In the same way, well-posed and well-answered questions create a high-quality knowledge database accessible to everybody, while badly posed questions or answers reduce the overall quality of the public knowledge database.
In short, a high quality question or answer is *precise*, *concise*, *relevant*, *easy-to-understand*, *accesible*, and *well-formated/well-posed*. For more information, please have a look through the [How to write a good issue](#how-to-write-a-good-issue) section.
In short, a high quality question or answer is *precise*, *concise*, *relevant*, *easy-to-understand*, *accessible*, and *well-formated/well-posed*. For more information, please have a look through the [How to write a good issue](#how-to-write-a-good-issue) section.
**NOTE about channels**:
[*The forum*](https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63) is much better indexed by search engines, such as Google. Posts are ranked by popularity rather than chronologically. Hence, it's easier to look up questions and answers that we posted some time ago.
@@ -91,12 +91,12 @@ open a new issue nevertheless and link to the related issue.
New issues usually include the following.
#### 2.1. Reproducible, minimal bug reports.
#### 2.1. Reproducible, minimal bug reports
A bug report should always have a reproducible code snippet and be as minimal and concise as possible.
This means in more detail:
- Narrow the bug down as much as you can, **do not just dump your whole code file**
- Format your code
- Narrow the bug down as much as you can, **do not just dump your whole code file**.
- Format your code.
- Do not include any external libraries except for Diffusers depending on them.
- **Always** provide all necessary information about your environment; for this, you can run: `diffusers-cli env` in your shell and copy-paste the displayed information to the issue.
- Explain the issue. If the reader doesn't know what the issue is and why it is an issue, she cannot solve it.
@@ -105,9 +105,9 @@ This means in more detail:
For more information, please have a look through the [How to write a good issue](#how-to-write-a-good-issue) section.
You can open a bug report [here](https://github.com/huggingface/diffusers/issues/new/choose).
You can open a bug report [here](https://github.com/huggingface/diffusers/issues/new?assignees=&labels=bug&projects=&template=bug-report.yml).
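For illustration, a minimal reproducible snippet attached to a bug report might look like the sketch below; the checkpoint, prompt, and arguments are placeholders rather than details from a real issue:

```python
# Hypothetical reproduction; the checkpoint and arguments are illustrative only.
import torch
from diffusers import DiffusionPipeline

pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.to("cuda")

# The call below is the one that triggers the error described in the report.
image = pipe("an astronaut riding a horse", num_inference_steps=25).images[0]
```

Paired with the output of `diffusers-cli env`, a snippet of this size is usually enough for a maintainer to reproduce the problem.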
#### 2.2. Feature requests.
#### 2.2. Feature requests
A world-class feature request addresses the following points:
@@ -125,21 +125,21 @@ Awesome! Tell us what problem it solved for you.
You can open a feature request [here](https://github.com/huggingface/diffusers/issues/new?assignees=&labels=&template=feature_request.md&title=).
#### 2.3 Feedback.
#### 2.3 Feedback
Feedback about the library design and why it is good or not good helps the core maintainers immensely to build a user-friendly library. To understand the philosophy behind the current design philosophy, please have a look [here](https://huggingface.co/docs/diffusers/conceptual/philosophy). If you feel like a certain design choice does not fit with the current design philosophy, please explain why and how it should be changed. If a certain design choice follows the design philosophy too much, hence restricting use cases, explain why and how it should be changed.
If a certain design choice is very useful for you, please also leave a note as this is great feedback for future design decisions.
You can open an issue about feedback [here](https://github.com/huggingface/diffusers/issues/new?assignees=&labels=&template=feedback.md&title=).
#### 2.4 Technical questions.
#### 2.4 Technical questions
Technical questions are mainly about why certain code of the library was written in a certain way, or what a certain part of the code does. Please make sure to link to the code in question and please provide detail on
why this part of the code is difficult to understand.
You can open an issue about a technical question [here](https://github.com/huggingface/diffusers/issues/new?assignees=&labels=bug&template=bug-report.yml).
#### 2.5 Proposal to add a new model, scheduler, or pipeline.
#### 2.5 Proposal to add a new model, scheduler, or pipeline
If the diffusion model community released a new model, pipeline, or scheduler that you would like to see in the Diffusers library, please provide the following information:
@@ -156,19 +156,19 @@ You can open a request for a model/pipeline/scheduler [here](https://github.com/
Answering issues on GitHub might require some technical knowledge of Diffusers, but we encourage everybody to give it a try even if you are not 100% certain that your answer is correct.
Some tips to give a high-quality answer to an issue:
- Be as concise and minimal as possible
- Be as concise and minimal as possible.
- Stay on topic. An answer to the issue should concern the issue and only the issue.
- Provide links to code, papers, or other sources that prove or encourage your point.
- Answer in code. If a simple code snippet is the answer to the issue or shows how the issue can be solved, please provide a fully reproducible code snippet.
Also, many issues tend to be simply off-topic, duplicates of other issues, or irrelevant. It is of great
help to the maintainers if you can answer such issues, encouraging the author of the issue to be
more precise, provide the link to a duplicated issue or redirect them to [the forum](https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63) or [Discord](https://discord.gg/G7tWnz98XR)
more precise, provide the link to a duplicated issue or redirect them to [the forum](https://discuss.huggingface.co/c/discussion-related-to-httpsgithubcomhuggingfacediffusers/63) or [Discord](https://discord.gg/G7tWnz98XR).
If you have verified that the issued bug report is correct and requires a correction in the source code,
please have a look at the next sections.
For all of the following contributions, you will need to open a PR. It is explained in detail how to do so in the [Opening a pull requst](#how-to-open-a-pr) section.
For all of the following contributions, you will need to open a PR. It is explained in detail how to do so in the [Opening a pull request](#how-to-open-a-pr) section.
### 4. Fixing a "Good first issue"
@@ -202,7 +202,7 @@ Please have a look at [this page](https://github.com/huggingface/diffusers/tree/
### 6. Contribute a community pipeline
[Pipelines](https://huggingface.co/docs/diffusers/api/pipelines/overview) are usually the first point of contact between the Diffusers library and the user.
Pipelines are examples of how to use Diffusers [models](https://huggingface.co/docs/diffusers/api/models) and [schedulers](https://huggingface.co/docs/diffusers/api/schedulers/overview).
Pipelines are examples of how to use Diffusers [models](https://huggingface.co/docs/diffusers/api/models/overview) and [schedulers](https://huggingface.co/docs/diffusers/api/schedulers/overview).
We support two types of pipelines:
- Official Pipelines
@@ -242,27 +242,27 @@ We support two types of training examples:
Research training examples are located in [examples/research_projects](https://github.com/huggingface/diffusers/tree/main/examples/research_projects) whereas official training examples include all folders under [examples](https://github.com/huggingface/diffusers/tree/main/examples) except the `research_projects` and `community` folders.
The official training examples are maintained by the Diffusers' core maintainers whereas the research training examples are maintained by the community.
This is because of the same reasons put forward in [6. Contribute a community pipeline](#contribute-a-community-pipeline) for official pipelines vs. community pipelines: It is not feasible for the core maintainers to maintain all possible training methods for diffusion models.
This is because of the same reasons put forward in [6. Contribute a community pipeline](#6-contribute-a-community-pipeline) for official pipelines vs. community pipelines: It is not feasible for the core maintainers to maintain all possible training methods for diffusion models.
If the Diffusers core maintainers and the community consider a certain training paradigm to be too experimental or not popular enough, the corresponding training code should be put in the `research_projects` folder and maintained by the author.
Both official training and research examples consist of a directory that contains one or more training scripts, a requirements.txt file, and a README.md file. In order for the user to make use of the
training examples, it is required to clone the repository:
```
```bash
git clone https://github.com/huggingface/diffusers
```
as well as to install all additional dependencies required for training:
```
```bash
pip install -r /examples/<your-example-folder>/requirements.txt
```
Therefore when adding an example, the `requirements.txt` file shall define all pip dependencies required for your training example so that once all those are installed, the user can run the example's training script. See, for example, the [DreamBooth `requirements.txt` file](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/requirements.txt).
Training examples of the Diffusers library should adhere to the following philosophy:
- All the code necessary to run the examples should be found in a single Python file
- One should be able to run the example from the command line with `python <your-example>.py --args`
- All the code necessary to run the examples should be found in a single Python file.
- One should be able to run the example from the command line with `python <your-example>.py --args`.
- Examples should be kept simple and serve as **an example** on how to use Diffusers for training. The purpose of example scripts is **not** to create state-of-the-art diffusion models, but rather to reproduce known training schemes without adding too much custom logic. As a byproduct of this point, our examples also strive to serve as good educational materials.
To contribute an example, it is highly recommended to look at already existing examples such as [dreambooth](https://github.com/huggingface/diffusers/blob/main/examples/dreambooth/train_dreambooth.py) to get an idea of how they should look like.
@@ -281,7 +281,7 @@ If you are contributing to the official training examples, please also make sure
usually more complicated to solve than [Good first issues](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22).
The issue description usually gives less guidance on how to fix the issue and requires
a decent understanding of the library by the interested contributor.
If you are interested in tackling a second good issue, feel free to open a PR to fix it and link the PR to the issue. If you see that a PR has already been opened for this issue but did not get merged, have a look to understand why it wasn't merged and try to open an improved PR.
If you are interested in tackling a good second issue, feel free to open a PR to fix it and link the PR to the issue. If you see that a PR has already been opened for this issue but did not get merged, have a look to understand why it wasn't merged and try to open an improved PR.
Good second issues are usually more difficult to get merged compared to good first issues, so don't hesitate to ask for help from the core maintainers. If your PR is almost finished the core maintainers can also jump into your PR and commit to it in order to get it merged.
### 9. Adding pipelines, models, schedulers
@@ -297,7 +297,7 @@ if you don't know yet what specific component you would like to add:
- [Model or pipeline](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22New+pipeline%2Fmodel%22)
- [Scheduler](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22New+scheduler%22)
Before adding any of the three components, it is strongly recommended that you give the [Philosophy guide](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22Good+second+issue%22) a read to better understand the design of any of the three components. Please be aware that
Before adding any of the three components, it is strongly recommended that you give the [Philosophy guide](https://github.com/huggingface/diffusers/blob/main/PHILOSOPHY.md) a read to better understand the design of any of the three components. Please be aware that
we cannot merge model, scheduler, or pipeline additions that strongly diverge from our design philosophy
as it will lead to API inconsistencies. If you fundamentally disagree with a design choice, please
open a [Feedback issue](https://github.com/huggingface/diffusers/issues/new?assignees=&labels=&template=feedback.md&title=) instead so that it can be discussed whether a certain design
@@ -337,8 +337,8 @@ to be merged;
9. Add high-coverage tests. No quality testing = no merge.
- If you are adding new `@slow` tests, make sure they pass using
`RUN_SLOW=1 python -m pytest tests/test_my_new_model.py`.
CircleCI does not run the slow tests, but GitHub actions does every night!
10. All public methods must have informative docstrings that work nicely with markdown. See `[pipeline_latent_diffusion.py](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion.py)` for an example.
CircleCI does not run the slow tests, but GitHub Actions does every night!
10. All public methods must have informative docstrings that work nicely with markdown. See [`pipeline_latent_diffusion.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/pipelines/latent_diffusion/pipeline_latent_diffusion.py) for an example.
11. Due to the rapidly growing repository, it is important to make sure that no files that would significantly weigh down the repository are added. This includes images, videos, and other non-text files. We prefer to leverage a hf.co hosted `dataset` like
[`hf-internal-testing`](https://huggingface.co/hf-internal-testing) or [huggingface/documentation-images](https://huggingface.co/datasets/huggingface/documentation-images) to place these files.
If an external contribution, feel free to add the images to your PR and ask a Hugging Face member to migrate your images
@@ -355,7 +355,7 @@ You will need basic `git` proficiency to be able to contribute to
manual. Type `git --help` in a shell and enjoy. If you prefer books, [Pro
Git](https://git-scm.com/book/en/v2) is a very good reference.
Follow these steps to start contributing ([supported Python versions](https://github.com/huggingface/diffusers/blob/main/setup.py#L244)):
Follow these steps to start contributing ([supported Python versions](https://github.com/huggingface/diffusers/blob/42f25d601a910dceadaee6c44345896b4cfa9928/setup.py#L270)):
1. Fork the [repository](https://github.com/huggingface/diffusers) by
clicking on the 'Fork' button on the repository's page. This creates a copy of the code
@@ -364,7 +364,7 @@ under your GitHub user account.
2. Clone your fork to your local disk, and add the base repository as a remote:
```bash
$ git clone git@github.com:<your Github handle>/diffusers.git
$ git clone git@github.com:<your GitHub handle>/diffusers.git
$ cd diffusers
$ git remote add upstream https://github.com/huggingface/diffusers.git
```
@@ -394,15 +394,15 @@ passes. You should run the tests impacted by your changes like this:
```bash
$ pytest tests/<TEST_TO_RUN>.py
```
Before you run the tests, please make sure you install the dependencies required for testing. You can do so
Before you run the tests, please make sure you install the dependencies required for testing. You can do so
with this command:
```bash
$ pip install -e ".[test]"
```
You can run the full test suite with the following command, but it takes
You can also run the full test suite with the following command, but it takes
a beefy machine to produce a result in a decent amount of time now that
Diffusers has grown a lot. Here is the command for it:
@@ -410,7 +410,7 @@ Diffusers has grown a lot. Here is the command for it:
$ make test
```
🧨 Diffusers relies on `black` and `isort` to format its source code
🧨 Diffusers relies on `ruff` and `isort` to format its source code
consistently. After you make changes, apply automatic style corrections and code verifications
that can't be automated in one go with:
@@ -430,7 +430,7 @@ make a commit with `git commit` to record your changes locally:
```bash
$ git add modified_file.py
$ git commit
$ git commit -m "A descriptive message about your changes."
```
It is a good idea to sync your copy of the code with the original
@@ -493,7 +493,7 @@ To avoid pinging the upstream repository which adds reference notes to each upst
when syncing the main branch of a forked repository, please, follow these steps:
1. When possible, avoid syncing with the upstream using a branch and PR on the forked repository. Instead, merge directly into the forked main.
2. If a PR is absolutely necessary, use the following steps after checking out your branch:
```
```bash
$ git checkout -b your-branch-for-syncing
$ git pull --squash --no-commit upstream main
$ git commit -m '<your message without GitHub references>'
@@ -502,4 +502,4 @@ $ git push --set-upstream origin your-branch-for-syncing
### Style guide
For documentation strings, 🧨 Diffusers follows the [google style](https://google.github.io/styleguide/pyguide.html).
For documentation strings, 🧨 Diffusers follows the [Google style](https://google.github.io/styleguide/pyguide.html).

Makefile

@@ -3,14 +3,14 @@
# make sure to test the local checkout in scripts and not the pre-installed one (don't use quotes!)
export PYTHONPATH = src
check_dirs := examples scripts src tests utils
check_dirs := examples scripts src tests utils benchmarks
modified_only_fixup:
$(eval modified_py_files := $(shell python utils/get_modified_files.py $(check_dirs)))
@if test -n "$(modified_py_files)"; then \
echo "Checking/fixing $(modified_py_files)"; \
black $(modified_py_files); \
ruff $(modified_py_files); \
ruff check $(modified_py_files) --fix; \
ruff format $(modified_py_files);\
else \
echo "No library .py files were modified"; \
fi
@@ -40,23 +40,23 @@ repo-consistency:
# this target runs checks on all files
quality:
black --check $(check_dirs)
ruff $(check_dirs)
doc-builder style src/diffusers docs/source --max_len 119 --check_only --path_to_docs docs/source
ruff check $(check_dirs) setup.py
ruff format --check $(check_dirs) setup.py
doc-builder style src/diffusers docs/source --max_len 119 --check_only
python utils/check_doc_toc.py
# Format source code automatically and check is there are any problems left that need manual fixing
extra_style_checks:
python utils/custom_init_isort.py
doc-builder style src/diffusers docs/source --max_len 119 --path_to_docs docs/source
python utils/check_doc_toc.py --fix_and_overwrite
# this target runs checks on all files and potentially modifies some of them
style:
black $(check_dirs)
ruff $(check_dirs) --fix
ruff check $(check_dirs) setup.py --fix
ruff format $(check_dirs) setup.py
doc-builder style src/diffusers docs/source --max_len 119
${MAKE} autogenerate_code
${MAKE} extra_style_checks
@@ -78,7 +78,7 @@ test:
# Run tests for examples
test-examples:
python -m pytest -n auto --dist=loadfile -s -v ./examples/pytorch/
python -m pytest -n auto --dist=loadfile -s -v ./examples/
# Release stuff

PHILOSOPHY.md

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -22,23 +22,23 @@ In a nutshell, Diffusers is built to be a natural extension of PyTorch. Therefor
## Usability over Performance
- While Diffusers has many built-in performance-enhancing features (see [Memory and Speed](https://huggingface.co/docs/diffusers/optimization/fp16)), models are always loaded with the highest precision and lowest optimization. Therefore, by default diffusion pipelines are always instantiated on CPU with float32 precision if not otherwise defined by the user. This ensures usability across different platforms and accelerators and means that no complex installations are required to run the library.
- Diffusers aim at being a **light-weight** package and therefore has very few required dependencies, but many soft dependencies that can improve performance (such as `accelerate`, `safetensors`, `onnx`, etc...). We strive to keep the library as lightweight as possible so that it can be added without much concern as a dependency on other packages.
- Diffusers aims to be a **light-weight** package and therefore has very few required dependencies, but many soft dependencies that can improve performance (such as `accelerate`, `safetensors`, `onnx`, etc...). We strive to keep the library as lightweight as possible so that it can be added without much concern as a dependency on other packages.
- Diffusers prefers simple, self-explainable code over condensed, magic code. This means that short-hand code syntaxes such as lambda functions, and advanced PyTorch operators are often not desired.
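As a rough sketch of what "highest precision and lowest optimization by default" means in practice (the checkpoint name is only an example):

```python
import torch
from diffusers import DiffusionPipeline

# Default behaviour: weights are loaded in float32 and the pipeline stays on CPU.
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Performance is opt-in: the user explicitly requests half precision and a GPU.
pipe = DiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
)
pipe.to("cuda")
```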
## Simple over easy
As PyTorch states, **explicit is better than implicit** and **simple is better than complex**. This design philosophy is reflected in multiple parts of the library:
As PyTorch states, **explicit is better than implicit** and **simple is better than complex**. This design philosophy is reflected in multiple parts of the library:
- We follow PyTorch's API with methods like [`DiffusionPipeline.to`](https://huggingface.co/docs/diffusers/main/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.to) to let the user handle device management.
- Raising concise error messages is preferred to silently correct erroneous input. Diffusers aims at teaching the user, rather than making the library as easy to use as possible.
- Complex model vs. scheduler logic is exposed instead of magically handled inside. Schedulers/Samplers are separated from diffusion models with minimal dependencies on each other. This forces the user to write the unrolled denoising loop. However, the separation allows for easier debugging and gives the user more control over adapting the denoising process or switching out diffusion models or schedulers.
- Separately trained components of the diffusion pipeline, *e.g.* the text encoder, the unet, and the variational autoencoder, each have their own model class. This forces the user to handle the interaction between the different model components, and the serialization format separates the model components into different files. However, this allows for easier debugging and customization. Dreambooth or textual inversion training
is very simple thanks to diffusers' ability to separate single components of the diffusion pipeline.
- Separately trained components of the diffusion pipeline, *e.g.* the text encoder, the UNet, and the variational autoencoder, each has their own model class. This forces the user to handle the interaction between the different model components, and the serialization format separates the model components into different files. However, this allows for easier debugging and customization. DreamBooth or Textual Inversion training
is very simple thanks to Diffusers' ability to separate single components of the diffusion pipeline.
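The unrolled denoising loop that this separation implies can be sketched as follows, reusing the model and scheduler classes from the README quickstart further down this page (checkpoint and step count are illustrative):

```python
import torch
from diffusers import DDPMScheduler, UNet2DModel

scheduler = DDPMScheduler.from_pretrained("google/ddpm-cat-256")
model = UNet2DModel.from_pretrained("google/ddpm-cat-256")
scheduler.set_timesteps(50)

sample_size = model.config.sample_size
sample = torch.randn((1, 3, sample_size, sample_size))

# The user drives the loop: the model predicts the noise, the scheduler takes the step.
for t in scheduler.timesteps:
    with torch.no_grad():
        noise_pred = model(sample, t).sample
    sample = scheduler.step(noise_pred, t, sample).prev_sample
```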
## Tweakable, contributor-friendly over abstraction
For large parts of the library, Diffusers adopts an important design principle of the [Transformers library](https://github.com/huggingface/transformers), which is to prefer copy-pasted code over hasty abstractions. This design principle is very opinionated and stands in stark contrast to popular design principles such as [Don't repeat yourself (DRY)](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself).
In short, just like Transformers does for modeling files, diffusers prefers to keep an extremely low level of abstraction and very self-contained code for pipelines and schedulers.
Functions, long code blocks, and even classes can be copied across multiple files which at first can look like a bad, sloppy design choice that makes the library unmaintainable.
For large parts of the library, Diffusers adopts an important design principle of the [Transformers library](https://github.com/huggingface/transformers), which is to prefer copy-pasted code over hasty abstractions. This design principle is very opinionated and stands in stark contrast to popular design principles such as [Don't repeat yourself (DRY)](https://en.wikipedia.org/wiki/Don%27t_repeat_yourself).
In short, just like Transformers does for modeling files, Diffusers prefers to keep an extremely low level of abstraction and very self-contained code for pipelines and schedulers.
Functions, long code blocks, and even classes can be copied across multiple files which at first can look like a bad, sloppy design choice that makes the library unmaintainable.
**However**, this design has proven to be extremely successful for Transformers and makes a lot of sense for community-driven, open-source machine learning libraries because:
- Machine Learning is an extremely fast-moving field in which paradigms, model architectures, and algorithms are changing rapidly, which therefore makes it very difficult to define long-lasting code abstractions.
- Machine Learning practitioners like to be able to quickly tweak existing code for ideation and research and therefore prefer self-contained code over one that contains many abstractions.
@@ -47,30 +47,30 @@ Functions, long code blocks, and even classes can be copied across multiple file
At Hugging Face, we call this design the **single-file policy** which means that almost all of the code of a certain class should be written in a single, self-contained file. To read more about the philosophy, you can have a look
at [this blog post](https://huggingface.co/blog/transformers-design-philosophy).
In diffusers, we follow this philosophy for both pipelines and schedulers, but only partly for diffusion models. The reason we don't follow this design fully for diffusion models is because almost all diffusion pipelines, such
as [DDPM](https://huggingface.co/docs/diffusers/v0.12.0/en/api/pipelines/ddpm), [Stable Diffusion](https://huggingface.co/docs/diffusers/v0.12.0/en/api/pipelines/stable_diffusion/overview#stable-diffusion-pipelines), [UnCLIP (Dalle-2)](https://huggingface.co/docs/diffusers/v0.12.0/en/api/pipelines/unclip#overview) and [Imagen](https://imagen.research.google/) all rely on the same diffusion model, the [UNet](https://huggingface.co/docs/diffusers/api/models#diffusers.UNet2DConditionModel).
In Diffusers, we follow this philosophy for both pipelines and schedulers, but only partly for diffusion models. The reason we don't follow this design fully for diffusion models is because almost all diffusion pipelines, such
as [DDPM](https://huggingface.co/docs/diffusers/api/pipelines/ddpm), [Stable Diffusion](https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/overview#stable-diffusion-pipelines), [unCLIP (DALL·E 2)](https://huggingface.co/docs/diffusers/api/pipelines/unclip) and [Imagen](https://imagen.research.google/) all rely on the same diffusion model, the [UNet](https://huggingface.co/docs/diffusers/api/models/unet2d-cond).
Great, now you should have generally understood why 🧨 Diffusers is designed the way it is 🤗.
Great, now you should have generally understood why 🧨 Diffusers is designed the way it is 🤗.
We try to apply these design principles consistently across the library. Nevertheless, there are some minor exceptions to the philosophy or some unlucky design choices. If you have feedback regarding the design, we would ❤️ to hear it [directly on GitHub](https://github.com/huggingface/diffusers/issues/new?assignees=&labels=&template=feedback.md&title=).
## Design Philosophy in Details
Now, let's look a bit into the nitty-gritty details of the design philosophy. Diffusers essentially consist of three major classes, [pipelines](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines), [models](https://github.com/huggingface/diffusers/tree/main/src/diffusers/models), and [schedulers](https://github.com/huggingface/diffusers/tree/main/src/diffusers/schedulers).
Let's walk through more in-detail design decisions for each class.
Now, let's look a bit into the nitty-gritty details of the design philosophy. Diffusers essentially consists of three major classes: [pipelines](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines), [models](https://github.com/huggingface/diffusers/tree/main/src/diffusers/models), and [schedulers](https://github.com/huggingface/diffusers/tree/main/src/diffusers/schedulers).
Let's walk through more detailed design decisions for each class.
### Pipelines
Pipelines are designed to be easy to use (therefore do not follow [*Simple over easy*](#simple-over-easy) 100%)), are not feature complete, and should loosely be seen as examples of how to use [models](#models) and [schedulers](#schedulers) for inference.
Pipelines are designed to be easy to use (therefore do not follow [*Simple over easy*](#simple-over-easy) 100%), are not feature complete, and should loosely be seen as examples of how to use [models](#models) and [schedulers](#schedulers) for inference.
The following design principles are followed:
- Pipelines follow the single-file policy. All pipelines can be found in individual directories under src/diffusers/pipelines. One pipeline folder corresponds to one diffusion paper/project/release. Multiple pipeline files can be gathered in one pipeline folder, as it's done for [`src/diffusers/pipelines/stable-diffusion`](https://github.com/huggingface/diffusers/tree/main/src/diffusers/pipelines/stable_diffusion). If pipelines share similar functionality, one can make use of the [#Copied from mechanism](https://github.com/huggingface/diffusers/blob/125d783076e5bd9785beb05367a2d2566843a271/src/diffusers/pipelines/stable_diffusion/pipeline_stable_diffusion_img2img.py#L251).
- Pipelines all inherit from [`DiffusionPipeline`]
- Pipelines all inherit from [`DiffusionPipeline`].
- Every pipeline consists of different model and scheduler components, that are documented in the [`model_index.json` file](https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/model_index.json), are accessible under the same name as attributes of the pipeline and can be shared between pipelines with [`DiffusionPipeline.components`](https://huggingface.co/docs/diffusers/main/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.components) function.
- Every pipeline should be loadable via the [`DiffusionPipeline.from_pretrained`](https://huggingface.co/docs/diffusers/main/en/api/diffusion_pipeline#diffusers.DiffusionPipeline.from_pretrained) function.
- Pipelines should be used **only** for inference.
- Pipelines should be very readable, self-explanatory, and easy to tweak.
- Pipelines should be designed to build on top of each other and be easy to integrate into higher-level APIs.
- Pipelines are **not** intended to be feature-complete user interfaces. For future complete user interfaces one should rather have a look at [InvokeAI](https://github.com/invoke-ai/InvokeAI), [Diffuzers](https://github.com/abhishekkrthakur/diffuzers), and [lama-cleaner](https://github.com/Sanster/lama-cleaner)
- Pipelines are **not** intended to be feature-complete user interfaces. For future complete user interfaces one should rather have a look at [InvokeAI](https://github.com/invoke-ai/InvokeAI), [Diffuzers](https://github.com/abhishekkrthakur/diffuzers), and [lama-cleaner](https://github.com/Sanster/lama-cleaner).
- Every pipeline should have one and only one way to run it via a `__call__` method. The naming of the `__call__` arguments should be shared across all pipelines.
- Pipelines should be named after the task they are intended to solve.
- In almost all cases, novel diffusion pipelines shall be implemented in a new pipeline folder/file.
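A short sketch of the pipeline-level API described above, assuming the Stable Diffusion checkpoint used elsewhere on this page:

```python
from diffusers import DiffusionPipeline

# Loadable via `from_pretrained`, runnable via `__call__`.
pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
image = pipe("a photograph of an astronaut riding a horse").images[0]

# Components are exposed as attributes and can be shared between pipelines.
print(list(pipe.components.keys()))
```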
@@ -83,28 +83,28 @@ The following design principles are followed:
- Models correspond to **a type of model architecture**. *E.g.* the [`UNet2DConditionModel`] class is used for all UNet variations that expect 2D image inputs and are conditioned on some context.
- All models can be found in [`src/diffusers/models`](https://github.com/huggingface/diffusers/tree/main/src/diffusers/models) and every model architecture shall be defined in its file, e.g. [`unet_2d_condition.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/unet_2d_condition.py), [`transformer_2d.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/transformer_2d.py), etc...
- Models **do not** follow the single-file policy and should make use of smaller model building blocks, such as [`attention.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention.py), [`resnet.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/resnet.py), [`embeddings.py`](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/embeddings.py), etc... **Note**: This is in stark contrast to Transformers' modeling files and shows that models do not really follow the single-file policy.
- Models intend to expose complexity, just like PyTorch's module does, and give clear error messages.
- Models intend to expose complexity, just like PyTorch's `Module` class, and give clear error messages.
- Models all inherit from `ModelMixin` and `ConfigMixin`.
- Models can be optimized for performance when it doesn't demand major code changes, keeps backward compatibility, and gives significant memory or compute gain.
- Models can be optimized for performance when it doesn't demand major code changes, keep backward compatibility, and give significant memory or compute gain.
- Models should by default have the highest precision and lowest performance setting.
- To integrate new model checkpoints whose general architecture can be classified as an architecture that already exists in Diffusers, the existing model architecture shall be adapted to make it work with the new checkpoint. One should only create a new file if the model architecture is fundamentally different.
- Models should be designed to be easily extendable to future changes. This can be achieved by limiting public function arguments, configuration arguments, and "foreseeing" future changes, *e.g.* it is usually better to add `string` "...type" arguments that can easily be extended to new future types instead of boolean `is_..._type` arguments. Only the minimum amount of changes shall be made to existing architectures to make a new model checkpoint work.
- The model design is a difficult trade-off between keeping code readable and concise and supporting many model checkpoints. For most parts of the modeling code, classes shall be adapted for new model checkpoints, while there are some exceptions where it is preferred to add new classes to make sure the code is kept concise and
readable longterm, such as [UNet blocks](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/unet_2d_blocks.py) and [Attention processors](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/cross_attention.py).
- The model design is a difficult trade-off between keeping code readable and concise and supporting many model checkpoints. For most parts of the modeling code, classes shall be adapted for new model checkpoints, while there are some exceptions where it is preferred to add new classes to make sure the code is kept concise and
readable long-term, such as [UNet blocks](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/unet_2d_blocks.py) and [Attention processors](https://github.com/huggingface/diffusers/blob/main/src/diffusers/models/attention_processor.py).
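To illustrate the model-level API, a single component can be loaded on its own; the checkpoint and subfolder below are assumptions chosen to match the Stable Diffusion repository layout:

```python
from diffusers import UNet2DConditionModel

# Models load independently of any pipeline and inherit from ModelMixin and ConfigMixin.
unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet"
)
print(unet.config.sample_size, unet.dtype)
```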
### Schedulers
Schedulers are responsible to guide the denoising process for inference as well as to define a noise schedule for training. They are designed as individual classes with loadable configuration files and strongly follow the **single-file policy**.
The following design principles are followed:
- All schedulers are found in [`src/diffusers/schedulers`](https://github.com/huggingface/diffusers/tree/main/src/diffusers/schedulers).
- Schedulers are **not** allowed to import from large utils files and shall be kept very self-contained.
- One scheduler python file corresponds to one scheduler algorithm (as might be defined in a paper).
- All schedulers are found in [`src/diffusers/schedulers`](https://github.com/huggingface/diffusers/tree/main/src/diffusers/schedulers).
- Schedulers are **not** allowed to import from large utils files and shall be kept very self-contained.
- One scheduler Python file corresponds to one scheduler algorithm (as might be defined in a paper).
- If schedulers share similar functionalities, we can make use of the `#Copied from` mechanism.
- Schedulers all inherit from `SchedulerMixin` and `ConfigMixin`.
- Schedulers can be easily swapped out with the [`ConfigMixin.from_config`](https://huggingface.co/docs/diffusers/main/en/api/configuration#diffusers.ConfigMixin.from_config) method as explained in detail [here](./using-diffusers/schedulers.mdx).
- Schedulers can be easily swapped out with the [`ConfigMixin.from_config`](https://huggingface.co/docs/diffusers/main/en/api/configuration#diffusers.ConfigMixin.from_config) method as explained in detail [here](./docs/source/en/using-diffusers/schedulers.md).
- Every scheduler has to have a `set_num_inference_steps`, and a `step` function. `set_num_inference_steps(...)` has to be called before every denoising process, *i.e.* before `step(...)` is called.
- Every scheduler exposes the timesteps to be "looped over" via a `timesteps` attribute, which is an array of timesteps the model will be called upon
- Every scheduler exposes the timesteps to be "looped over" via a `timesteps` attribute, which is an array of timesteps the model will be called upon.
- The `step(...)` function takes a predicted model output and the "current" sample (x_t) and returns the "previous", slightly more denoised sample (x_t-1).
- Given the complexity of diffusion schedulers, the `step` function does not expose all the complexity and can be a bit of a "black box".
- In almost all cases, novel schedulers shall be implemented in a new scheduling file.
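A small sketch of the scheduler interface described above; note that in released versions of the library the timestep setup method is named `set_timesteps`, and the particular scheduler swap is only an example:

```python
from diffusers import DiffusionPipeline, EulerDiscreteScheduler

pipe = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Schedulers can be swapped via their config, as described above.
pipe.scheduler = EulerDiscreteScheduler.from_config(pipe.scheduler.config)

# The scheduler exposes the timesteps to loop over and a `step` method.
pipe.scheduler.set_timesteps(50)
print(pipe.scheduler.timesteps[:5])
```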

README.md (185 lines changed)

@@ -1,6 +1,22 @@
<!---
Copyright 2022 - The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
-->
<p align="center">
<br>
<img src="./docs/source/en/imgs/diffusers_library.jpg" width="400"/>
<img src="https://raw.githubusercontent.com/huggingface/diffusers/main/docs/source/en/imgs/diffusers_library.jpg" width="400"/>
<br>
<p>
<p align="center">
@@ -10,8 +26,14 @@
<a href="https://github.com/huggingface/diffusers/releases">
<img alt="GitHub release" src="https://img.shields.io/github/release/huggingface/diffusers.svg">
</a>
<a href="https://pepy.tech/project/diffusers">
<img alt="GitHub release" src="https://static.pepy.tech/badge/diffusers/month">
</a>
<a href="CODE_OF_CONDUCT.md">
<img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-2.0-4baaaa.svg">
<img alt="Contributor Covenant" src="https://img.shields.io/badge/Contributor%20Covenant-2.1-4baaaa.svg">
</a>
<a href="https://twitter.com/diffuserslib">
<img alt="X account" src="https://img.shields.io/twitter/url/https/twitter.com/diffuserslib.svg?style=social&label=Follow%20%40diffuserslib">
</a>
</p>
@@ -21,16 +43,16 @@
- State-of-the-art [diffusion pipelines](https://huggingface.co/docs/diffusers/api/pipelines/overview) that can be run in inference with just a few lines of code.
- Interchangeable noise [schedulers](https://huggingface.co/docs/diffusers/api/schedulers/overview) for different diffusion speeds and output quality.
- Pretrained [models](https://huggingface.co/docs/diffusers/api/models) that can be used as building blocks, and combined with schedulers, for creating your own end-to-end diffusion systems.
- Pretrained [models](https://huggingface.co/docs/diffusers/api/models/overview) that can be used as building blocks, and combined with schedulers, for creating your own end-to-end diffusion systems.
## Installation
We recommend installing 🤗 Diffusers in a virtual environment from PyPi or Conda. For more details about installing [PyTorch](https://pytorch.org/get-started/locally/) and [Flax](https://flax.readthedocs.io/en/latest/installation.html), please refer to their official documentation.
We recommend installing 🤗 Diffusers in a virtual environment from PyPI or Conda. For more details about installing [PyTorch](https://pytorch.org/get-started/locally/) and [Flax](https://flax.readthedocs.io/en/latest/#installation), please refer to their official documentation.
### PyTorch
With `pip` (official package):
```bash
pip install --upgrade diffusers[torch]
```
@@ -55,12 +77,13 @@ Please refer to the [How to use Stable Diffusion in Apple Silicon](https://huggi
## Quickstart
Generating outputs is super easy with 🤗 Diffusers. To generate an image from text, use the `from_pretrained` method to load any pretrained diffusion model (browse the [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) for 4000+ checkpoints):
Generating outputs is super easy with 🤗 Diffusers. To generate an image from text, use the `from_pretrained` method to load any pretrained diffusion model (browse the [Hub](https://huggingface.co/models?library=diffusers&sort=downloads) for 25.000+ checkpoints):
```python
from diffusers import DiffusionPipeline
import torch
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")
pipeline = DiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16)
pipeline.to("cuda")
pipeline("An image of a squirrel in Picasso style").images[0]
```
@@ -71,14 +94,13 @@ You can also dig into the models and schedulers toolbox to build your own diffus
from diffusers import DDPMScheduler, UNet2DModel
from PIL import Image
import torch
import numpy as np
scheduler = DDPMScheduler.from_pretrained("google/ddpm-cat-256")
model = UNet2DModel.from_pretrained("google/ddpm-cat-256").to("cuda")
scheduler.set_timesteps(50)
sample_size = model.config.sample_size
noise = torch.randn((1, 3, sample_size, sample_size)).to("cuda")
noise = torch.randn((1, 3, sample_size, sample_size), device="cuda")
input = noise
for t in scheduler.timesteps:
@@ -99,66 +121,107 @@ Check out the [Quickstart](https://huggingface.co/docs/diffusers/quicktour) to l
| **Documentation** | **What can I learn?** |
|---------------------------------------------------------------------|-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
| Tutorial | A basic crash course for learning how to use the library's most important features like using models and schedulers to build your own diffusion system, and training your own diffusion model. |
| Loading | Guides for how to load and configure all the components (pipelines, models, and schedulers) of the library, as well as how to use different schedulers. |
| Pipelines for inference | Guides for how to use pipelines for different inference tasks, batched generation, controlling generated outputs and randomness, and how to contribute a pipeline to the library. |
| Optimization | Guides for how to optimize your diffusion model to run faster and consume less memory. |
| [Tutorial](https://huggingface.co/docs/diffusers/tutorials/tutorial_overview) | A basic crash course for learning how to use the library's most important features like using models and schedulers to build your own diffusion system, and training your own diffusion model. |
| [Loading](https://huggingface.co/docs/diffusers/using-diffusers/loading_overview) | Guides for how to load and configure all the components (pipelines, models, and schedulers) of the library, as well as how to use different schedulers. |
| [Pipelines for inference](https://huggingface.co/docs/diffusers/using-diffusers/pipeline_overview) | Guides for how to use pipelines for different inference tasks, batched generation, controlling generated outputs and randomness, and how to contribute a pipeline to the library. |
| [Optimization](https://huggingface.co/docs/diffusers/optimization/opt_overview) | Guides for how to optimize your diffusion model to run faster and consume less memory. |
| [Training](https://huggingface.co/docs/diffusers/training/overview) | Guides for how to train a diffusion model for different tasks with different training techniques. |
## Supported pipelines
| Pipeline | Paper | Tasks |
|---|---|:---:|
| [alt_diffusion](./api/pipelines/alt_diffusion) | [**AltDiffusion**](https://arxiv.org/abs/2211.06679) | Image-to-Image Text-Guided Generation |
| [audio_diffusion](./api/pipelines/audio_diffusion) | [**Audio Diffusion**](https://github.com/teticio/audio-diffusion.git) | Unconditional Audio Generation |
| [controlnet](./api/pipelines/stable_diffusion/controlnet) | [**ControlNet with Stable Diffusion**](https://arxiv.org/abs/2302.05543) | Image-to-Image Text-Guided Generation |
| [cycle_diffusion](./api/pipelines/cycle_diffusion) | [**Cycle Diffusion**](https://arxiv.org/abs/2210.05559) | Image-to-Image Text-Guided Generation |
| [dance_diffusion](./api/pipelines/dance_diffusion) | [**Dance Diffusion**](https://github.com/williamberman/diffusers.git) | Unconditional Audio Generation |
| [ddpm](./api/pipelines/ddpm) | [**Denoising Diffusion Probabilistic Models**](https://arxiv.org/abs/2006.11239) | Unconditional Image Generation |
| [ddim](./api/pipelines/ddim) | [**Denoising Diffusion Implicit Models**](https://arxiv.org/abs/2010.02502) | Unconditional Image Generation |
| [latent_diffusion](./api/pipelines/latent_diffusion) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752)| Text-to-Image Generation |
| [latent_diffusion](./api/pipelines/latent_diffusion) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752)| Super Resolution Image-to-Image |
| [latent_diffusion_uncond](./api/pipelines/latent_diffusion_uncond) | [**High-Resolution Image Synthesis with Latent Diffusion Models**](https://arxiv.org/abs/2112.10752) | Unconditional Image Generation |
| [paint_by_example](./api/pipelines/paint_by_example) | [**Paint by Example: Exemplar-based Image Editing with Diffusion Models**](https://arxiv.org/abs/2211.13227) | Image-Guided Image Inpainting |
| [pndm](./api/pipelines/pndm) | [**Pseudo Numerical Methods for Diffusion Models on Manifolds**](https://arxiv.org/abs/2202.09778) | Unconditional Image Generation |
| [score_sde_ve](./api/pipelines/score_sde_ve) | [**Score-Based Generative Modeling through Stochastic Differential Equations**](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
| [score_sde_vp](./api/pipelines/score_sde_vp) | [**Score-Based Generative Modeling through Stochastic Differential Equations**](https://openreview.net/forum?id=PxTIG12RRHS) | Unconditional Image Generation |
| [semantic_stable_diffusion](./api/pipelines/semantic_stable_diffusion) | [**Semantic Guidance**](https://arxiv.org/abs/2301.12247) | Text-Guided Generation |
| [stable_diffusion_text2img](./api/pipelines/stable_diffusion/text2img) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Text-to-Image Generation |
| [stable_diffusion_img2img](./api/pipelines/stable_diffusion/img2img) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Image-to-Image Text-Guided Generation |
| [stable_diffusion_inpaint](./api/pipelines/stable_diffusion/inpaint) | [**Stable Diffusion**](https://stability.ai/blog/stable-diffusion-public-release) | Text-Guided Image Inpainting |
| [stable_diffusion_panorama](./api/pipelines/stable_diffusion/panorama) | [**MultiDiffusion**](https://multidiffusion.github.io/) | Text-to-Panorama Generation |
| [stable_diffusion_pix2pix](./api/pipelines/stable_diffusion/pix2pix) | [**InstructPix2Pix**](https://github.com/timothybrooks/instruct-pix2pix) | Text-Guided Image Editing|
| [stable_diffusion_pix2pix_zero](./api/pipelines/stable_diffusion/pix2pix_zero) | [**Zero-shot Image-to-Image Translation**](https://pix2pixzero.github.io/) | Text-Guided Image Editing |
| [stable_diffusion_attend_and_excite](./api/pipelines/stable_diffusion/attend_and_excite) | [**Attend and Excite for Stable Diffusion**](https://attendandexcite.github.io/Attend-and-Excite/) | Text-to-Image Generation |
| [stable_diffusion_self_attention_guidance](./api/pipelines/stable_diffusion/self_attention_guidance) | [**Self-Attention Guidance**](https://ku-cvlab.github.io/Self-Attention-Guidance) | Text-to-Image Generation |
| [stable_diffusion_image_variation](./stable_diffusion/image_variation) | [**Stable Diffusion Image Variations**](https://github.com/LambdaLabsML/lambda-diffusers#stable-diffusion-image-variations) | Image-to-Image Generation |
| [stable_diffusion_latent_upscale](./stable_diffusion/latent_upscale) | [**Stable Diffusion Latent Upscaler**](https://twitter.com/StabilityAI/status/1590531958815064065) | Text-Guided Super Resolution Image-to-Image |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-to-Image Generation |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Image Inpainting |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Depth-Conditional Stable Diffusion**](https://github.com/Stability-AI/stablediffusion#depth-conditional-stable-diffusion) | Depth-to-Image Generation |
| [stable_diffusion_2](./api/pipelines/stable_diffusion_2) | [**Stable Diffusion 2**](https://stability.ai/blog/stable-diffusion-v2-release) | Text-Guided Super Resolution Image-to-Image |
| [stable_diffusion_safe](./api/pipelines/stable_diffusion_safe) | [**Safe Stable Diffusion**](https://arxiv.org/abs/2211.05105) | Text-Guided Generation |
| [stable_unclip](./stable_unclip) | **Stable unCLIP** | Text-to-Image Generation |
| [stable_unclip](./stable_unclip) | **Stable unCLIP** | Image-to-Image Text-Guided Generation |
| [stochastic_karras_ve](./api/pipelines/stochastic_karras_ve) | [**Elucidating the Design Space of Diffusion-Based Generative Models**](https://arxiv.org/abs/2206.00364) | Unconditional Image Generation |
| [unclip](./api/pipelines/unclip) | [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://arxiv.org/abs/2204.06125) | Text-to-Image Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Text-to-Image Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Image Variations Generation |
| [versatile_diffusion](./api/pipelines/versatile_diffusion) | [Versatile Diffusion: Text, Images and Variations All in One Diffusion Model](https://arxiv.org/abs/2211.08332) | Dual Image and Text Guided Generation |
| [vq_diffusion](./api/pipelines/vq_diffusion) | [Vector Quantized Diffusion Model for Text-to-Image Synthesis](https://arxiv.org/abs/2111.14822) | Text-to-Image Generation |
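Each entry above corresponds to a pipeline class in the library. As a minimal, non-authoritative sketch, here is how the text-guided inpainting row can be exercised with the `runwayml/stable-diffusion-inpainting` checkpoint listed in the table further below; the input image and mask URLs are placeholders you would replace with your own:

```python
# Minimal sketch for the "Text-Guided Image Inpainting" row above.
# The checkpoint id appears in the "Popular Tasks & Pipelines" table below;
# the image and mask URLs are placeholders, not values from this document.
import torch
from diffusers import StableDiffusionInpaintPipeline
from diffusers.utils import load_image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting", torch_dtype=torch.float16
).to("cuda")

init_image = load_image("https://example.com/your_image.png")  # image to edit (placeholder URL)
mask_image = load_image("https://example.com/your_mask.png")   # white pixels are repainted (placeholder URL)

result = pipe(
    prompt="a bouquet of flowers in a vase",
    image=init_image,
    mask_image=mask_image,
).images[0]
result.save("inpainted.png")
```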
## Contribution
We ❤️ contributions from the open-source community!
If you want to contribute to this library, please check out our [Contribution guide](https://github.com/huggingface/diffusers/blob/main/CONTRIBUTING.md).
You can browse the [issues](https://github.com/huggingface/diffusers/issues) to find one you'd like to tackle:
- See [Good first issues](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22good+first+issue%22) for general opportunities to contribute
- See [New model/pipeline](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22New+pipeline%2Fmodel%22) to contribute exciting new diffusion models / diffusion pipelines
- See [New scheduler](https://github.com/huggingface/diffusers/issues?q=is%3Aopen+is%3Aissue+label%3A%22New+scheduler%22) to contribute a new scheduler
Also, say 👋 in our public Discord channel <a href="https://discord.gg/G7tWnz98XR"><img alt="Join us on Discord" src="https://img.shields.io/discord/823813159592001537?color=5865F2&logo=discord&logoColor=white"></a>. We discuss the hottest trends about diffusion models, help each other with contributions, personal projects or just hang out ☕.
## Popular Tasks & Pipelines
<table>
<tr>
<th>Task</th>
<th>Pipeline</th>
<th>🤗 Hub</th>
</tr>
<tr style="border-top: 2px solid black">
<td>Unconditional Image Generation</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/ddpm"> DDPM </a></td>
<td><a href="https://huggingface.co/google/ddpm-ema-church-256"> google/ddpm-ema-church-256 </a></td>
</tr>
<tr style="border-top: 2px solid black">
<td>Text-to-Image</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/text2img">Stable Diffusion Text-to-Image</a></td>
<td><a href="https://huggingface.co/runwayml/stable-diffusion-v1-5"> runwayml/stable-diffusion-v1-5 </a></td>
</tr>
<tr>
<td>Text-to-Image</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/unclip">unCLIP</a></td>
<td><a href="https://huggingface.co/kakaobrain/karlo-v1-alpha"> kakaobrain/karlo-v1-alpha </a></td>
</tr>
<tr>
<td>Text-to-Image</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/deepfloyd_if">DeepFloyd IF</a></td>
<td><a href="https://huggingface.co/DeepFloyd/IF-I-XL-v1.0"> DeepFloyd/IF-I-XL-v1.0 </a></td>
</tr>
<tr>
<td>Text-to-Image</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/kandinsky">Kandinsky</a></td>
<td><a href="https://huggingface.co/kandinsky-community/kandinsky-2-2-decoder"> kandinsky-community/kandinsky-2-2-decoder </a></td>
</tr>
<tr style="border-top: 2px solid black">
<td>Text-guided Image-to-Image</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/controlnet">ControlNet</a></td>
<td><a href="https://huggingface.co/lllyasviel/sd-controlnet-canny"> lllyasviel/sd-controlnet-canny </a></td>
</tr>
<tr>
<td>Text-guided Image-to-Image</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/pix2pix">InstructPix2Pix</a></td>
<td><a href="https://huggingface.co/timbrooks/instruct-pix2pix"> timbrooks/instruct-pix2pix </a></td>
</tr>
<tr>
<td>Text-guided Image-to-Image</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/img2img">Stable Diffusion Image-to-Image</a></td>
<td><a href="https://huggingface.co/runwayml/stable-diffusion-v1-5"> runwayml/stable-diffusion-v1-5 </a></td>
</tr>
<tr style="border-top: 2px solid black">
<td>Text-guided Image Inpainting</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/inpaint">Stable Diffusion Inpainting</a></td>
<td><a href="https://huggingface.co/runwayml/stable-diffusion-inpainting"> runwayml/stable-diffusion-inpainting </a></td>
</tr>
<tr style="border-top: 2px solid black">
<td>Image Variation</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/image_variation">Stable Diffusion Image Variation</a></td>
<td><a href="https://huggingface.co/lambdalabs/sd-image-variations-diffusers"> lambdalabs/sd-image-variations-diffusers </a></td>
</tr>
<tr style="border-top: 2px solid black">
<td>Super Resolution</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/upscale">Stable Diffusion Upscale</a></td>
<td><a href="https://huggingface.co/stabilityai/stable-diffusion-x4-upscaler"> stabilityai/stable-diffusion-x4-upscaler </a></td>
</tr>
<tr>
<td>Super Resolution</td>
<td><a href="https://huggingface.co/docs/diffusers/api/pipelines/stable_diffusion/latent_upscale">Stable Diffusion Latent Upscale</a></td>
<td><a href="https://huggingface.co/stabilityai/sd-x2-latent-upscaler"> stabilityai/sd-x2-latent-upscaler </a></td>
</tr>
</table>
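As a rough sketch of how the checkpoints in this table are loaded, the Stable Diffusion text-to-image row (`runwayml/stable-diffusion-v1-5`) can be run with the `AutoPipelineForText2Image` class used by the benchmark code later in this diff; the prompt below is the one those benchmarks reuse:

```python
# Minimal sketch: load the "Text-to-Image" row of the table above and generate one image.
import torch
from diffusers import AutoPipelineForText2Image

pipe = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = pipe(
    "ghibli style, a fantasy landscape with castles",  # same PROMPT the benchmarks below use
    num_inference_steps=50,
).images[0]
image.save("fantasy_landscape.png")
```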
## Popular libraries using 🧨 Diffusers
- https://github.com/microsoft/TaskMatrix
- https://github.com/invoke-ai/InvokeAI
- https://github.com/apple/ml-stable-diffusion
- https://github.com/Sanster/lama-cleaner
- https://github.com/IDEA-Research/Grounded-Segment-Anything
- https://github.com/ashawkey/stable-dreamfusion
- https://github.com/deep-floyd/IF
- https://github.com/bentoml/BentoML
- https://github.com/bmaltais/kohya_ss
- 11,000+ other amazing GitHub repositories 💪
Thank you for using us ❤️.
## Credits
@@ -175,7 +238,7 @@ We also want to thank @heejkoo for the very helpful overview of papers, code and
```bibtex
@misc{von-platen-etal-2022-diffusers,
author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Thomas Wolf},
author = {Patrick von Platen and Suraj Patil and Anton Lozhkov and Pedro Cuenca and Nathan Lambert and Kashif Rasul and Mishig Davaadorj and Dhruv Nair and Sayak Paul and William Berman and Yiyi Xu and Steven Liu and Thomas Wolf},
title = {Diffusers: State-of-the-art diffusion models},
year = {2022},
publisher = {GitHub},

benchmarks/base_classes.py Normal file

@@ -0,0 +1,346 @@
import os
import sys
import torch
from diffusers import (
AutoPipelineForImage2Image,
AutoPipelineForInpainting,
AutoPipelineForText2Image,
ControlNetModel,
LCMScheduler,
StableDiffusionAdapterPipeline,
StableDiffusionControlNetPipeline,
StableDiffusionXLAdapterPipeline,
StableDiffusionXLControlNetPipeline,
T2IAdapter,
WuerstchenCombinedPipeline,
)
from diffusers.utils import load_image
sys.path.append(".")
from utils import ( # noqa: E402
BASE_PATH,
PROMPT,
BenchmarkInfo,
benchmark_fn,
bytes_to_giga_bytes,
flush,
generate_csv_dict,
write_to_csv,
)
RESOLUTION_MAPPING = {
"runwayml/stable-diffusion-v1-5": (512, 512),
"lllyasviel/sd-controlnet-canny": (512, 512),
"diffusers/controlnet-canny-sdxl-1.0": (1024, 1024),
"TencentARC/t2iadapter_canny_sd14v1": (512, 512),
"TencentARC/t2i-adapter-canny-sdxl-1.0": (1024, 1024),
"stabilityai/stable-diffusion-2-1": (768, 768),
"stabilityai/stable-diffusion-xl-base-1.0": (1024, 1024),
"stabilityai/stable-diffusion-xl-refiner-1.0": (1024, 1024),
"stabilityai/sdxl-turbo": (512, 512),
}
class BaseBenchmark:
pipeline_class = None
def __init__(self, args):
super().__init__()
def run_inference(self, args):
raise NotImplementedError
def benchmark(self, args):
raise NotImplementedError
def get_result_filepath(self, args):
pipeline_class_name = str(self.pipe.__class__.__name__)
name = (
args.ckpt.replace("/", "_")
+ "_"
+ pipeline_class_name
+ f"-bs@{args.batch_size}-steps@{args.num_inference_steps}-mco@{args.model_cpu_offload}-compile@{args.run_compile}.csv"
)
filepath = os.path.join(BASE_PATH, name)
return filepath
class TextToImageBenchmark(BaseBenchmark):
pipeline_class = AutoPipelineForText2Image
def __init__(self, args):
pipe = self.pipeline_class.from_pretrained(args.ckpt, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
if args.run_compile:
if not isinstance(pipe, WuerstchenCombinedPipeline):
pipe.unet.to(memory_format=torch.channels_last)
print("Run torch compile")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
if hasattr(pipe, "movq") and getattr(pipe, "movq", None) is not None:
pipe.movq.to(memory_format=torch.channels_last)
pipe.movq = torch.compile(pipe.movq, mode="reduce-overhead", fullgraph=True)
else:
print("Run torch compile")
pipe.decoder = torch.compile(pipe.decoder, mode="reduce-overhead", fullgraph=True)
pipe.vqgan = torch.compile(pipe.vqgan, mode="reduce-overhead", fullgraph=True)
pipe.set_progress_bar_config(disable=True)
self.pipe = pipe
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
def benchmark(self, args):
flush()
print(f"[INFO] {self.pipe.__class__.__name__}: Running benchmark with: {vars(args)}\n")
time = benchmark_fn(self.run_inference, self.pipe, args) # in seconds.
memory = bytes_to_giga_bytes(torch.cuda.max_memory_allocated()) # in GBs.
benchmark_info = BenchmarkInfo(time=time, memory=memory)
pipeline_class_name = str(self.pipe.__class__.__name__)
flush()
csv_dict = generate_csv_dict(
pipeline_cls=pipeline_class_name, ckpt=args.ckpt, args=args, benchmark_info=benchmark_info
)
filepath = self.get_result_filepath(args)
write_to_csv(filepath, csv_dict)
print(f"Logs written to: {filepath}")
flush()
class TurboTextToImageBenchmark(TextToImageBenchmark):
def __init__(self, args):
super().__init__(args)
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
guidance_scale=0.0,
)
class LCMLoRATextToImageBenchmark(TextToImageBenchmark):
lora_id = "latent-consistency/lcm-lora-sdxl"
def __init__(self, args):
super().__init__(args)
self.pipe.load_lora_weights(self.lora_id)
self.pipe.fuse_lora()
self.pipe.unload_lora_weights()
self.pipe.scheduler = LCMScheduler.from_config(self.pipe.scheduler.config)
def get_result_filepath(self, args):
pipeline_class_name = str(self.pipe.__class__.__name__)
name = (
self.lora_id.replace("/", "_")
+ "_"
+ pipeline_class_name
+ f"-bs@{args.batch_size}-steps@{args.num_inference_steps}-mco@{args.model_cpu_offload}-compile@{args.run_compile}.csv"
)
filepath = os.path.join(BASE_PATH, name)
return filepath
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
guidance_scale=1.0,
)
def benchmark(self, args):
flush()
print(f"[INFO] {self.pipe.__class__.__name__}: Running benchmark with: {vars(args)}\n")
time = benchmark_fn(self.run_inference, self.pipe, args) # in seconds.
memory = bytes_to_giga_bytes(torch.cuda.max_memory_allocated()) # in GBs.
benchmark_info = BenchmarkInfo(time=time, memory=memory)
pipeline_class_name = str(self.pipe.__class__.__name__)
flush()
csv_dict = generate_csv_dict(
pipeline_cls=pipeline_class_name, ckpt=self.lora_id, args=args, benchmark_info=benchmark_info
)
filepath = self.get_result_filepath(args)
write_to_csv(filepath, csv_dict)
print(f"Logs written to: {filepath}")
flush()
class ImageToImageBenchmark(TextToImageBenchmark):
pipeline_class = AutoPipelineForImage2Image
url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/benchmarking/1665_Girl_with_a_Pearl_Earring.jpg"
image = load_image(url).convert("RGB")
def __init__(self, args):
super().__init__(args)
self.image = self.image.resize(RESOLUTION_MAPPING[args.ckpt])
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
image=self.image,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
class TurboImageToImageBenchmark(ImageToImageBenchmark):
def __init__(self, args):
super().__init__(args)
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
image=self.image,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
guidance_scale=0.0,
strength=0.5,
)
class InpaintingBenchmark(ImageToImageBenchmark):
pipeline_class = AutoPipelineForInpainting
mask_url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/benchmarking/overture-creations-5sI6fQgYIuo_mask.png"
mask = load_image(mask_url).convert("RGB")
def __init__(self, args):
super().__init__(args)
self.image = self.image.resize(RESOLUTION_MAPPING[args.ckpt])
self.mask = self.mask.resize(RESOLUTION_MAPPING[args.ckpt])
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
image=self.image,
mask_image=self.mask,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
class IPAdapterTextToImageBenchmark(TextToImageBenchmark):
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/load_neg_embed.png"
image = load_image(url)
def __init__(self, args):
pipe = self.pipeline_class.from_pretrained(args.ckpt, torch_dtype=torch.float16).to("cuda")
pipe.load_ip_adapter(
args.ip_adapter_id[0],
subfolder="models" if "sdxl" not in args.ip_adapter_id[1] else "sdxl_models",
weight_name=args.ip_adapter_id[1],
)
if args.run_compile:
pipe.unet.to(memory_format=torch.channels_last)
print("Run torch compile")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
pipe.set_progress_bar_config(disable=True)
self.pipe = pipe
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
ip_adapter_image=self.image,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
class ControlNetBenchmark(TextToImageBenchmark):
pipeline_class = StableDiffusionControlNetPipeline
aux_network_class = ControlNetModel
root_ckpt = "runwayml/stable-diffusion-v1-5"
url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/benchmarking/canny_image_condition.png"
image = load_image(url).convert("RGB")
def __init__(self, args):
aux_network = self.aux_network_class.from_pretrained(args.ckpt, torch_dtype=torch.float16)
pipe = self.pipeline_class.from_pretrained(self.root_ckpt, controlnet=aux_network, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.set_progress_bar_config(disable=True)
self.pipe = pipe
if args.run_compile:
pipe.unet.to(memory_format=torch.channels_last)
pipe.controlnet.to(memory_format=torch.channels_last)
print("Run torch compile")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
pipe.controlnet = torch.compile(pipe.controlnet, mode="reduce-overhead", fullgraph=True)
self.image = self.image.resize(RESOLUTION_MAPPING[args.ckpt])
def run_inference(self, pipe, args):
_ = pipe(
prompt=PROMPT,
image=self.image,
num_inference_steps=args.num_inference_steps,
num_images_per_prompt=args.batch_size,
)
class ControlNetSDXLBenchmark(ControlNetBenchmark):
pipeline_class = StableDiffusionXLControlNetPipeline
root_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
def __init__(self, args):
super().__init__(args)
class T2IAdapterBenchmark(ControlNetBenchmark):
pipeline_class = StableDiffusionAdapterPipeline
aux_network_class = T2IAdapter
root_ckpt = "CompVis/stable-diffusion-v1-4"
url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/benchmarking/canny_for_adapter.png"
image = load_image(url).convert("L")
def __init__(self, args):
aux_network = self.aux_network_class.from_pretrained(args.ckpt, torch_dtype=torch.float16)
pipe = self.pipeline_class.from_pretrained(self.root_ckpt, adapter=aux_network, torch_dtype=torch.float16)
pipe = pipe.to("cuda")
pipe.set_progress_bar_config(disable=True)
self.pipe = pipe
if args.run_compile:
pipe.unet.to(memory_format=torch.channels_last)
pipe.adapter.to(memory_format=torch.channels_last)
print("Run torch compile")
pipe.unet = torch.compile(pipe.unet, mode="reduce-overhead", fullgraph=True)
pipe.adapter = torch.compile(pipe.adapter, mode="reduce-overhead", fullgraph=True)
self.image = self.image.resize(RESOLUTION_MAPPING[args.ckpt])
class T2IAdapterSDXLBenchmark(T2IAdapterBenchmark):
pipeline_class = StableDiffusionXLAdapterPipeline
root_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
url = "https://huggingface.co/datasets/diffusers/docs-images/resolve/main/benchmarking/canny_for_adapter_sdxl.png"
image = load_image(url)
def __init__(self, args):
super().__init__(args)
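For orientation, the classes above are driven by the small `benchmark_*.py` entry points that follow in this diff; a minimal sketch of that flow (argument names and defaults are taken from those scripts, and the `benchmarks/` directory is assumed to be the working directory):

```python
# Sketch of how base_classes.py is exercised by the benchmark_*.py scripts below.
import argparse

from base_classes import TextToImageBenchmark  # assumes benchmarks/ is the CWD / on sys.path

args = argparse.Namespace(
    ckpt="runwayml/stable-diffusion-v1-5",
    batch_size=1,
    num_inference_steps=50,
    model_cpu_offload=False,
    run_compile=False,
)

benchmark = TextToImageBenchmark(args)  # loads the fp16 pipeline onto CUDA
benchmark.benchmark(args)               # times run_inference and writes a CSV via utils.write_to_csv
```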


@@ -0,0 +1,26 @@
import argparse
import sys
sys.path.append(".")
from base_classes import ControlNetBenchmark, ControlNetSDXLBenchmark # noqa: E402
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="lllyasviel/sd-controlnet-canny",
choices=["lllyasviel/sd-controlnet-canny", "diffusers/controlnet-canny-sdxl-1.0"],
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_pipe = (
ControlNetBenchmark(args) if args.ckpt == "lllyasviel/sd-controlnet-canny" else ControlNetSDXLBenchmark(args)
)
benchmark_pipe.benchmark(args)


@@ -0,0 +1,32 @@
import argparse
import sys
sys.path.append(".")
from base_classes import IPAdapterTextToImageBenchmark # noqa: E402
IP_ADAPTER_CKPTS = {
"runwayml/stable-diffusion-v1-5": ("h94/IP-Adapter", "ip-adapter_sd15.bin"),
"stabilityai/stable-diffusion-xl-base-1.0": ("h94/IP-Adapter", "ip-adapter_sdxl.bin"),
}
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="runwayml/stable-diffusion-v1-5",
choices=list(IP_ADAPTER_CKPTS.keys()),
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
args.ip_adapter_id = IP_ADAPTER_CKPTS[args.ckpt]
benchmark_pipe = IPAdapterTextToImageBenchmark(args)
args.ckpt = f"{args.ckpt} (IP-Adapter)"
benchmark_pipe.benchmark(args)


@@ -0,0 +1,29 @@
import argparse
import sys
sys.path.append(".")
from base_classes import ImageToImageBenchmark, TurboImageToImageBenchmark # noqa: E402
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="runwayml/stable-diffusion-v1-5",
choices=[
"runwayml/stable-diffusion-v1-5",
"stabilityai/stable-diffusion-2-1",
"stabilityai/stable-diffusion-xl-refiner-1.0",
"stabilityai/sdxl-turbo",
],
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_pipe = ImageToImageBenchmark(args) if "turbo" not in args.ckpt else TurboImageToImageBenchmark(args)
benchmark_pipe.benchmark(args)


@@ -0,0 +1,28 @@
import argparse
import sys
sys.path.append(".")
from base_classes import InpaintingBenchmark # noqa: E402
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="runwayml/stable-diffusion-v1-5",
choices=[
"runwayml/stable-diffusion-v1-5",
"stabilityai/stable-diffusion-2-1",
"stabilityai/stable-diffusion-xl-base-1.0",
],
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_pipe = InpaintingBenchmark(args)
benchmark_pipe.benchmark(args)


@@ -0,0 +1,28 @@
import argparse
import sys
sys.path.append(".")
from base_classes import T2IAdapterBenchmark, T2IAdapterSDXLBenchmark # noqa: E402
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="TencentARC/t2iadapter_canny_sd14v1",
choices=["TencentARC/t2iadapter_canny_sd14v1", "TencentARC/t2i-adapter-canny-sdxl-1.0"],
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_pipe = (
T2IAdapterBenchmark(args)
if args.ckpt == "TencentARC/t2iadapter_canny_sd14v1"
else T2IAdapterSDXLBenchmark(args)
)
benchmark_pipe.benchmark(args)


@@ -0,0 +1,23 @@
import argparse
import sys
sys.path.append(".")
from base_classes import LCMLoRATextToImageBenchmark # noqa: E402
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="stabilityai/stable-diffusion-xl-base-1.0",
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=4)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_pipe = LCMLoRATextToImageBenchmark(args)
benchmark_pipe.benchmark(args)


@@ -0,0 +1,40 @@
import argparse
import sys
sys.path.append(".")
from base_classes import TextToImageBenchmark, TurboTextToImageBenchmark # noqa: E402
ALL_T2I_CKPTS = [
"runwayml/stable-diffusion-v1-5",
"segmind/SSD-1B",
"stabilityai/stable-diffusion-xl-base-1.0",
"kandinsky-community/kandinsky-2-2-decoder",
"warp-ai/wuerstchen",
"stabilityai/sdxl-turbo",
]
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument(
"--ckpt",
type=str,
default="runwayml/stable-diffusion-v1-5",
choices=ALL_T2I_CKPTS,
)
parser.add_argument("--batch_size", type=int, default=1)
parser.add_argument("--num_inference_steps", type=int, default=50)
parser.add_argument("--model_cpu_offload", action="store_true")
parser.add_argument("--run_compile", action="store_true")
args = parser.parse_args()
benchmark_cls = None
if "turbo" in args.ckpt:
benchmark_cls = TurboTextToImageBenchmark
else:
benchmark_cls = TextToImageBenchmark
benchmark_pipe = benchmark_cls(args)
benchmark_pipe.benchmark(args)


@@ -0,0 +1,72 @@
import glob
import sys
import pandas as pd
from huggingface_hub import hf_hub_download, upload_file
from huggingface_hub.utils._errors import EntryNotFoundError
sys.path.append(".")
from utils import BASE_PATH, FINAL_CSV_FILE, GITHUB_SHA, REPO_ID, collate_csv # noqa: E402
def has_previous_benchmark() -> str:
csv_path = None
try:
csv_path = hf_hub_download(repo_id=REPO_ID, repo_type="dataset", filename=FINAL_CSV_FILE)
except EntryNotFoundError:
csv_path = None
return csv_path
def filter_float(value):
if isinstance(value, str):
return float(value.split()[0])
return value
def push_to_hf_dataset():
all_csvs = sorted(glob.glob(f"{BASE_PATH}/*.csv"))
collate_csv(all_csvs, FINAL_CSV_FILE)
# If there's an existing benchmark file, we should report the changes.
csv_path = has_previous_benchmark()
if csv_path is not None:
current_results = pd.read_csv(FINAL_CSV_FILE)
previous_results = pd.read_csv(csv_path)
numeric_columns = current_results.select_dtypes(include=["float64", "int64"]).columns
numeric_columns = [
c for c in numeric_columns if c not in ["batch_size", "num_inference_steps", "actual_gpu_memory (gbs)"]
]
for column in numeric_columns:
previous_results[column] = previous_results[column].map(lambda x: filter_float(x))
# Calculate the percentage change
current_results[column] = current_results[column].astype(float)
previous_results[column] = previous_results[column].astype(float)
percent_change = ((current_results[column] - previous_results[column]) / previous_results[column]) * 100
# Format the values with '+' or '-' sign and append to original values
current_results[column] = current_results[column].map(str) + percent_change.map(
lambda x: f" ({'+' if x > 0 else ''}{x:.2f}%)"
)
# There might be newly added rows. So, filter out the NaNs.
current_results[column] = current_results[column].map(lambda x: x.replace(" (nan%)", ""))
# Overwrite the current result file.
current_results.to_csv(FINAL_CSV_FILE, index=False)
commit_message = f"upload from sha: {GITHUB_SHA}" if GITHUB_SHA is not None else "upload benchmark results"
upload_file(
repo_id=REPO_ID,
path_in_repo=FINAL_CSV_FILE,
path_or_fileobj=FINAL_CSV_FILE,
repo_type="dataset",
commit_message=commit_message,
)
if __name__ == "__main__":
push_to_hf_dataset()

benchmarks/run_all.py Normal file

@@ -0,0 +1,97 @@
import glob
import subprocess
import sys
from typing import List
sys.path.append(".")
from benchmark_text_to_image import ALL_T2I_CKPTS # noqa: E402
PATTERN = "benchmark_*.py"
class SubprocessCallException(Exception):
pass
# Taken from `test_examples_utils.py`
def run_command(command: List[str], return_stdout=False):
"""
Runs `command` with `subprocess.check_output`, optionally returning its `stdout`. Any error raised while running
`command` is captured and re-raised as a `SubprocessCallException` with the captured output.
"""
try:
output = subprocess.check_output(command, stderr=subprocess.STDOUT)
if return_stdout:
if hasattr(output, "decode"):
output = output.decode("utf-8")
return output
except subprocess.CalledProcessError as e:
raise SubprocessCallException(
f"Command `{' '.join(command)}` failed with the following error:\n\n{e.output.decode()}"
) from e
def main():
python_files = glob.glob(PATTERN)
for file in python_files:
print(f"****** Running file: {file} ******")
# Run with canonical settings.
if file != "benchmark_text_to_image.py":
command = f"python {file}"
run_command(command.split())
command += " --run_compile"
run_command(command.split())
# Run variants.
for file in python_files:
if file == "benchmark_text_to_image.py":
for ckpt in ALL_T2I_CKPTS:
command = f"python {file} --ckpt {ckpt}"
if "turbo" in ckpt:
command += " --num_inference_steps 1"
run_command(command.split())
command += " --run_compile"
run_command(command.split())
elif file == "benchmark_sd_img.py":
for ckpt in ["stabilityai/stable-diffusion-xl-refiner-1.0", "stabilityai/sdxl-turbo"]:
command = f"python {file} --ckpt {ckpt}"
if ckpt == "stabilityai/sdxl-turbo":
command += " --num_inference_steps 2"
run_command(command.split())
command += " --run_compile"
run_command(command.split())
elif file in ["benchmark_sd_inpainting.py", "benchmark_ip_adapters.py"]:
sdxl_ckpt = "stabilityai/stable-diffusion-xl-base-1.0"
command = f"python {file} --ckpt {sdxl_ckpt}"
run_command(command.split())
command += " --run_compile"
run_command(command.split())
elif file in ["benchmark_controlnet.py", "benchmark_t2i_adapter.py"]:
sdxl_ckpt = (
"diffusers/controlnet-canny-sdxl-1.0"
if "controlnet" in file
else "TencentARC/t2i-adapter-canny-sdxl-1.0"
)
command = f"python {file} --ckpt {sdxl_ckpt}"
run_command(command.split())
command += " --run_compile"
run_command(command.split())
if __name__ == "__main__":
main()

benchmarks/utils.py Normal file

@@ -0,0 +1,98 @@
import argparse
import csv
import gc
import os
from dataclasses import dataclass
from typing import Dict, List, Union
import torch
import torch.utils.benchmark as benchmark
GITHUB_SHA = os.getenv("GITHUB_SHA", None)
BENCHMARK_FIELDS = [
"pipeline_cls",
"ckpt_id",
"batch_size",
"num_inference_steps",
"model_cpu_offload",
"run_compile",
"time (secs)",
"memory (gbs)",
"actual_gpu_memory (gbs)",
"github_sha",
]
PROMPT = "ghibli style, a fantasy landscape with castles"
BASE_PATH = os.getenv("BASE_PATH", ".")
TOTAL_GPU_MEMORY = float(os.getenv("TOTAL_GPU_MEMORY", torch.cuda.get_device_properties(0).total_memory / (1024**3)))
REPO_ID = "diffusers/benchmarks"
FINAL_CSV_FILE = "collated_results.csv"
@dataclass
class BenchmarkInfo:
time: float
memory: float
def flush():
"""Runs garbage collection and resets CUDA memory caches and peak-memory statistics."""
gc.collect()
torch.cuda.empty_cache()
torch.cuda.reset_max_memory_allocated()
torch.cuda.reset_peak_memory_stats()
def bytes_to_giga_bytes(bytes):
return f"{(bytes / 1024 / 1024 / 1024):.3f}"
def benchmark_fn(f, *args, **kwargs):
t0 = benchmark.Timer(
stmt="f(*args, **kwargs)",
globals={"args": args, "kwargs": kwargs, "f": f},
num_threads=torch.get_num_threads(),
)
return f"{(t0.blocked_autorange().mean):.3f}"
def generate_csv_dict(
pipeline_cls: str, ckpt: str, args: argparse.Namespace, benchmark_info: BenchmarkInfo
) -> Dict[str, Union[str, bool, float]]:
"""Packs benchmarking data into a dictionary for later serialization."""
data_dict = {
"pipeline_cls": pipeline_cls,
"ckpt_id": ckpt,
"batch_size": args.batch_size,
"num_inference_steps": args.num_inference_steps,
"model_cpu_offload": args.model_cpu_offload,
"run_compile": args.run_compile,
"time (secs)": benchmark_info.time,
"memory (gbs)": benchmark_info.memory,
"actual_gpu_memory (gbs)": f"{(TOTAL_GPU_MEMORY):.3f}",
"github_sha": GITHUB_SHA,
}
return data_dict
def write_to_csv(file_name: str, data_dict: Dict[str, Union[str, bool, float]]):
"""Serializes a dictionary into a CSV file."""
with open(file_name, mode="w", newline="") as csvfile:
writer = csv.DictWriter(csvfile, fieldnames=BENCHMARK_FIELDS)
writer.writeheader()
writer.writerow(data_dict)
def collate_csv(input_files: List[str], output_file: str):
"""Collates multiple identically structured CSVs into a single CSV file."""
with open(output_file, mode="w", newline="") as outfile:
writer = csv.DictWriter(outfile, fieldnames=BENCHMARK_FIELDS)
writer.writeheader()
for file in input_files:
with open(file, mode="r") as infile:
reader = csv.DictReader(infile)
for row in reader:
writer.writerow(row)


@@ -0,0 +1,52 @@
FROM ubuntu:20.04
LABEL maintainer="Hugging Face"
LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.10 \
python3-pip \
libgl1 \
zip \
wget \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark \
--extra-index-url https://download.pytorch.org/whl/cpu && \
python3.10 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \
huggingface-hub \
Jinja2 \
librosa \
numpy \
scipy \
tensorboard \
transformers \
matplotlib \
setuptools==69.5.1
CMD ["/bin/bash"]


@@ -4,32 +4,36 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
python3-pip \
python3.8-venv && \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
libgl1 \
python3.10 \
python3-pip \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
# follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --upgrade --no-cache-dir \
RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3 -m uv pip install --upgrade --no-cache-dir \
clu \
"jax[cpu]>=0.2.16,!=0.3.2" \
"flax>=0.4.1" \
"jaxlib>=0.1.65" && \
python3 -m pip install --no-cache-dir \
python3 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \


@@ -4,34 +4,38 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
libgl1 \
python3.10 \
python3-pip \
python3.8-venv && \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
# follow the instructions here: https://cloud.google.com/tpu/docs/run-in-container#train_a_jax_model_in_a_docker_container
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3 -m pip install --no-cache-dir \
"jax[tpu]>=0.2.16,!=0.3.2" \
-f https://storage.googleapis.com/jax-releases/libtpu_releases.html && \
python3 -m pip install --upgrade --no-cache-dir \
python3 -m uv pip install --upgrade --no-cache-dir \
clu \
"flax>=0.4.1" \
"jaxlib>=0.1.65" && \
python3 -m pip install --no-cache-dir \
python3 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \


@@ -4,32 +4,36 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
libgl1 \
python3.10 \
python3-pip \
python3.8-venv && \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
RUN python3 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3 -m uv pip install --no-cache-dir \
torch==2.1.2 \
torchvision==0.16.2 \
torchaudio==2.1.2 \
onnxruntime \
--extra-index-url https://download.pytorch.org/whl/cpu && \
python3 -m pip install --no-cache-dir \
python3 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \


@@ -1,35 +1,39 @@
FROM nvidia/cuda:11.6.2-cudnn8-devel-ubuntu20.04
FROM nvidia/cuda:12.1.0-runtime-ubuntu20.04
LABEL maintainer="Hugging Face"
LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
libgl1 \
python3.10 \
python3-pip \
python3.8-venv && \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
"onnxruntime-gpu>=1.13.1" \
--extra-index-url https://download.pytorch.org/whl/cu117 && \
python3 -m pip install --no-cache-dir \
python3.10 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \


@@ -0,0 +1,47 @@
FROM nvidia/cuda:12.1.0-runtime-ubuntu20.04
LABEL maintainer="Hugging Face"
LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
libgl1 \
python3.10 \
python3-pip \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark && \
python3.10 -m pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \
huggingface-hub \
Jinja2 \
librosa \
numpy \
scipy \
tensorboard \
transformers
CMD ["/bin/bash"]


@@ -4,31 +4,36 @@ LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
python3.10 \
python3-pip \
python3.8-venv && \
libgl1 \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark \
--extra-index-url https://download.pytorch.org/whl/cpu && \
python3 -m pip install --no-cache-dir \
python3.10 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \
@@ -38,6 +43,6 @@ RUN python3 -m pip install --no-cache-dir --upgrade pip && \
numpy \
scipy \
tensorboard \
transformers
transformers matplotlib
CMD ["/bin/bash"]


@@ -1,42 +1,48 @@
FROM nvidia/cuda:11.7.1-cudnn8-runtime-ubuntu20.04
FROM nvidia/cuda:12.1.0-runtime-ubuntu20.04
LABEL maintainer="Hugging Face"
LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt update && \
apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
python3.8 \
python3-pip \
python3.8-venv && \
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
libgl1 \
python3.10 \
python3-pip \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3 -m venv /opt/venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3 -m pip install --no-cache-dir --upgrade pip && \
python3 -m pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
python3 -m pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \
huggingface-hub \
Jinja2 \
librosa \
numpy \
scipy \
tensorboard \
transformers
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m uv pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark && \
python3.10 -m pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \
huggingface-hub \
Jinja2 \
librosa \
numpy \
scipy \
tensorboard \
transformers \
pytorch-lightning
CMD ["/bin/bash"]


@@ -0,0 +1,48 @@
FROM nvidia/cuda:12.1.0-runtime-ubuntu20.04
LABEL maintainer="Hugging Face"
LABEL repository="diffusers"
ENV DEBIAN_FRONTEND=noninteractive
RUN apt-get -y update \
&& apt-get install -y software-properties-common \
&& add-apt-repository ppa:deadsnakes/ppa
RUN apt install -y bash \
build-essential \
git \
git-lfs \
curl \
ca-certificates \
libsndfile1-dev \
libgl1 \
python3.10 \
python3-pip \
python3.10-venv && \
rm -rf /var/lib/apt/lists
# make sure to use venv
RUN python3.10 -m venv /opt/venv
ENV PATH="/opt/venv/bin:$PATH"
# pre-install the heavy dependencies (these can later be overridden by the deps from setup.py)
RUN python3.10 -m pip install --no-cache-dir --upgrade pip uv==0.1.11 && \
python3.10 -m pip install --no-cache-dir \
torch \
torchvision \
torchaudio \
invisible_watermark && \
python3.10 -m uv pip install --no-cache-dir \
accelerate \
datasets \
hf-doc-builder \
huggingface-hub \
Jinja2 \
librosa \
numpy \
scipy \
tensorboard \
transformers \
xformers
CMD ["/bin/bash"]


@@ -1,5 +1,5 @@
<!---
Copyright 2023- The HuggingFace Team. All rights reserved.
Copyright 2024- The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
@@ -16,7 +16,7 @@ limitations under the License.
# Generating the documentation
To generate the documentation, you first have to build it. Several packages are necessary to build the doc,
you can install them with the following command, at the root of the code repository:
```bash
@@ -68,10 +68,10 @@ The `preview` command only works with existing doc files. When you add a complet
## Adding a new element to the navigation bar
Accepted files are Markdown (.md or .mdx).
Accepted files are Markdown (.md).
Create a file with its extension and put it in the source directory. You can then link it to the toc-tree by putting
the filename without the extension in the [`_toctree.yml`](https://github.com/huggingface/diffusers/blob/main/docs/source/_toctree.yml) file.
the filename without the extension in the [`_toctree.yml`](https://github.com/huggingface/diffusers/blob/main/docs/source/en/_toctree.yml) file.
## Renaming section headers and moving sections
@@ -81,14 +81,14 @@ Therefore, we simply keep a little map of moved sections at the end of the docum
So if you renamed a section from: "Section A" to "Section B", then you can add at the end of the file:
```
```md
Sections that were moved:
[ <a href="#section-b">Section A</a><a id="section-a"></a> ]
```
and of course, if you moved it to another file, then:
```
```md
Sections that were moved:
[ <a href="../new-file#section-b">Section A</a><a id="section-a"></a> ]
@@ -96,7 +96,7 @@ Sections that were moved:
Use the relative style to link to the new file so that the versioned docs continue to work.
For an example of a rich moved section set please see the very end of [the transformers Trainer doc](https://github.com/huggingface/transformers/blob/main/docs/source/en/main_classes/trainer.mdx).
For an example of a rich moved section set please see the very end of [the transformers Trainer doc](https://github.com/huggingface/transformers/blob/main/docs/source/en/main_classes/trainer.md).
## Writing Documentation - Specification
@@ -109,8 +109,8 @@ although we can write them directly in Markdown.
Adding a new tutorial or section is done in two steps:
- Add a new file under `docs/source`. This file can either be ReStructuredText (.rst) or Markdown (.md).
- Link that file in `docs/source/_toctree.yml` on the correct toc-tree.
- Add a new Markdown (.md) file under `docs/source/<languageCode>`.
- Link that file in `docs/source/<languageCode>/_toctree.yml` on the correct toc-tree.
Make sure to put your new file under the proper section. It's unlikely to go in the first section (*Get Started*), so
depending on the intended targets (beginners, more advanced users, or researchers) it should go in sections two, three, or four.
@@ -119,8 +119,8 @@ depending on the intended targets (beginners, more advanced users, or researcher
When adding a new pipeline:
- create a file `xxx.mdx` under `docs/source/api/pipelines` (don't hesitate to copy an existing file as template).
- Link that file in (*Diffusers Summary*) section in `docs/source/api/pipelines/overview.mdx`, along with the link to the paper, and a colab notebook (if available).
- Create a file `xxx.md` under `docs/source/<languageCode>/api/pipelines` (don't hesitate to copy an existing file as template).
- Link that file in (*Diffusers Summary*) section in `docs/source/api/pipelines/overview.md`, along with the link to the paper, and a colab notebook (if available).
- Write a short overview of the diffusion model:
- Overview with paper & authors
- Paper abstract
@@ -129,8 +129,6 @@ When adding a new pipeline:
- Add all the pipeline classes that should be linked in the diffusion model. These classes should be added using our Markdown syntax. By default as follows:
```
## XXXPipeline
[[autodoc]] XXXPipeline
- all
- __call__
@@ -144,11 +142,11 @@ This will include every public method of the pipeline that is documented, as wel
- __call__
- enable_attention_slicing
- disable_attention_slicing
- enable_xformers_memory_efficient_attention
- disable_xformers_memory_efficient_attention
```
You can follow the same process to create a new scheduler under the `docs/source/api/schedulers` folder
You can follow the same process to create a new scheduler under the `docs/source/<languageCode>/api/schedulers` folder.
### Writing source documentation
@@ -156,7 +154,7 @@ Values that should be put in `code` should either be surrounded by backticks: \`
and objects like True, None, or any strings should usually be put in `code`.
When mentioning a class, function, or method, it is recommended to use our syntax for internal links so that our tool
adds a link to its documentation with this syntax: \[\`XXXClass\`\] or \[\`function\`\]. This requires the class or
function to be in the main package.
If you want to create a link to some internal class or function, you need to
@@ -164,7 +162,7 @@ provide its path. For instance: \[\`pipelines.ImagePipelineOutput\`\]. This will
`pipelines.ImagePipelineOutput` in the description. To get rid of the path and only keep the name of the object you are
linking to in the description, add a ~: \[\`~pipelines.ImagePipelineOutput\`\] will generate a link with `ImagePipelineOutput` in the description.
The same works for methods so you can either use \[\`XXXClass.method\`\] or \[~\`XXXClass.method\`\].
The same works for methods so you can either use \[\`XXXClass.method\`\] or \[\`~XXXClass.method\`\].
#### Defining arguments in a method
@@ -196,8 +194,8 @@ Here's an example showcasing everything so far:
For optional arguments or arguments with defaults we follow the following syntax: imagine we have a function with the
following signature:
```
def my_function(x: str = None, a: float = 1):
```py
def my_function(x: str=None, a: float=3.14):
```
then its documentation should look like this:
@@ -206,7 +204,7 @@ then its documentation should look like this:
Args:
x (`str`, *optional*):
This argument controls ...
a (`float`, *optional*, defaults to 1):
a (`float`, *optional*, defaults to `3.14`):
This argument is used to ...
```
@@ -244,10 +242,10 @@ Here's an example of a tuple return, comprising several objects:
```
Returns:
`tuple(torch.FloatTensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs:
- ** loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.FloatTensor` of shape `(1,)` --
`tuple(torch.Tensor)` comprising various elements depending on the configuration ([`BertConfig`]) and inputs:
- ** loss** (*optional*, returned when `masked_lm_labels` is provided) `torch.Tensor` of shape `(1,)` --
Total loss is the sum of the masked language modeling loss and the next sequence prediction (classification) loss.
- **prediction_scores** (`torch.FloatTensor` of shape `(batch_size, sequence_length, config.vocab_size)`) --
- **prediction_scores** (`torch.Tensor` of shape `(batch_size, sequence_length, config.vocab_size)`) --
Prediction scores of the language modeling head (scores for each vocabulary token before SoftMax).
```
@@ -268,4 +266,3 @@ We have an automatic script running with the `make style` command that will make
This script may have some weird failures if you made a syntax mistake or if you uncover a bug. Therefore, it's
recommended to commit your changes before running `make style`, so you can revert the changes done by that script
easily.


@@ -1,10 +1,22 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
### Translating the Diffusers documentation into your language
As part of our mission to democratize machine learning, we'd love to make the Diffusers library available in many more languages! Follow the steps below if you want to help translate the documentation into your language 🙏.
**🗞️ Open an issue**
To get started, navigate to the [Issues](https://github.com/huggingface/diffusers/issues) page of this repo and check if anyone else has opened an issue for your language. If not, open a new issue by selecting the "Translation template" from the "New issue" button.
To get started, navigate to the [Issues](https://github.com/huggingface/diffusers/issues) page of this repo and check if anyone else has opened an issue for your language. If not, open a new issue by selecting the "🌐 Translating a New Language?" from the "New issue" button.
Once an issue exists, post a comment to indicate which chapters you'd like to work on, and we'll add your name to the list.
@@ -16,7 +28,7 @@ First, you'll need to [fork the Diffusers repo](https://docs.github.com/en/get-s
Once you've forked the repo, you'll want to get the files on your local machine for editing. You can do that by cloning the fork with Git as follows:
```bash
git clone https://github.com/YOUR-USERNAME/diffusers.git
git clone https://github.com/<YOUR-USERNAME>/diffusers.git
```
**📋 Copy-paste the English version with a new language code**
@@ -29,18 +41,18 @@ You'll only need to copy the files in the [`docs/source/en`](https://github.com/
```bash
cd ~/path/to/diffusers/docs
cp -r source/en source/LANG-ID
cp -r source/en source/<LANG-ID>
```
Here, `LANG-ID` should be one of the ISO 639-1 or ISO 639-2 language codes -- see [here](https://www.loc.gov/standards/iso639-2/php/code_list.php) for a handy table.
Here, `<LANG-ID>` should be one of the ISO 639-1 or ISO 639-2 language codes -- see [here](https://www.loc.gov/standards/iso639-2/php/code_list.php) for a handy table.
**✍️ Start translating**
The fun part comes - translating the text!
The first thing we recommend is translating the part of the `_toctree.yml` file that corresponds to your doc chapter. This file is used to render the table of contents on the website.
> 🙋 If the `_toctree.yml` file doesn't yet exist for your language, you can create one by copy-pasting from the English version and deleting the sections unrelated to your chapter. Just make sure it exists in the `docs/source/LANG-ID/` directory!
> 🙋 If the `_toctree.yml` file doesn't yet exist for your language, you can create one by copy-pasting from the English version and deleting the sections unrelated to your chapter. Just make sure it exists in the `docs/source/<LANG-ID>/` directory!
The fields you should add are `local` (with the name of the file containing the translation; e.g. `autoclass_tutorial`), and `title` (with the title of the doc in your language; e.g. `Load pretrained instances with an AutoClass`) -- as a reference, here is the `_toctree.yml` for [English](https://github.com/huggingface/diffusers/blob/main/docs/source/en/_toctree.yml):


@@ -6,4 +6,4 @@ INSTALL_CONTENT = """
# ! pip install git+https://github.com/huggingface/diffusers.git
"""
notebook_first_cells = [{"type": "code", "content": INSTALL_CONTENT}]


@@ -12,102 +12,168 @@
- local: tutorials/tutorial_overview
title: Overview
- local: using-diffusers/write_own_pipeline
title: Understanding models and schedulers
title: Understanding pipelines, models and schedulers
- local: tutorials/autopipeline
title: AutoPipeline
- local: tutorials/basic_training
title: Train a diffusion model
- local: tutorials/using_peft_for_inference
title: Load LoRAs for inference
- local: tutorials/fast_diffusion
title: Accelerate inference of text-to-image diffusion models
title: Tutorials
- sections:
- sections:
- local: using-diffusers/loading_overview
title: Overview
- local: using-diffusers/loading
title: Load pipelines, models, and schedulers
- local: using-diffusers/schedulers
title: Load and compare different schedulers
- local: using-diffusers/custom_pipeline_overview
title: Load community pipelines
- local: using-diffusers/kerascv
title: Load KerasCV Stable Diffusion checkpoints
title: Loading & Hub
- sections:
- local: using-diffusers/pipeline_overview
title: Overview
- local: using-diffusers/unconditional_image_generation
title: Unconditional image generation
- local: using-diffusers/conditional_image_generation
title: Text-to-image generation
- local: using-diffusers/img2img
title: Text-guided image-to-image
- local: using-diffusers/inpaint
title: Text-guided image-inpainting
- local: using-diffusers/depth2img
title: Text-guided depth-to-image
- local: using-diffusers/reusing_seeds
title: Improve image quality with deterministic generation
- local: using-diffusers/reproducibility
title: Create reproducible pipelines
- local: using-diffusers/custom_pipeline_examples
title: Community pipelines
- local: using-diffusers/contribute_pipeline
title: How to contribute a community pipeline
- local: using-diffusers/using_safetensors
title: Using safetensors
- local: using-diffusers/stable_diffusion_jax_how_to
title: Stable Diffusion in JAX/Flax
- local: using-diffusers/weighted_prompts
title: Weighting Prompts
title: Pipelines for Inference
- sections:
- local: training/overview
title: Overview
- local: using-diffusers/loading
title: Load pipelines
- local: using-diffusers/custom_pipeline_overview
title: Load community pipelines and components
- local: using-diffusers/schedulers
title: Load schedulers and models
- local: using-diffusers/other-formats
title: Model files and layouts
- local: using-diffusers/loading_adapters
title: Load adapters
- local: using-diffusers/push_to_hub
title: Push files to the Hub
title: Load pipelines and adapters
- sections:
- local: using-diffusers/unconditional_image_generation
title: Unconditional image generation
- local: using-diffusers/conditional_image_generation
title: Text-to-image
- local: using-diffusers/img2img
title: Image-to-image
- local: using-diffusers/inpaint
title: Inpainting
- local: using-diffusers/text-img2vid
title: Text or image-to-video
- local: using-diffusers/depth2img
title: Depth-to-image
title: Generative tasks
- sections:
- local: using-diffusers/overview_techniques
title: Overview
- local: training/distributed_inference
title: Distributed inference with multiple GPUs
- local: using-diffusers/merge_loras
title: Merge LoRAs
- local: using-diffusers/scheduler_features
title: Scheduler features
- local: using-diffusers/callback
title: Pipeline callbacks
- local: using-diffusers/reusing_seeds
title: Reproducible pipelines
- local: using-diffusers/image_quality
title: Controlling image quality
- local: using-diffusers/weighted_prompts
title: Prompt techniques
title: Inference techniques
- sections:
- local: advanced_inference/outpaint
title: Outpainting
title: Advanced inference
- sections:
- local: using-diffusers/sdxl
title: Stable Diffusion XL
- local: using-diffusers/sdxl_turbo
title: SDXL Turbo
- local: using-diffusers/kandinsky
title: Kandinsky
- local: using-diffusers/ip_adapter
title: IP-Adapter
- local: using-diffusers/controlnet
title: ControlNet
- local: using-diffusers/t2i_adapter
title: T2I-Adapter
- local: using-diffusers/inference_with_lcm
title: Latent Consistency Model
- local: using-diffusers/textual_inversion_inference
title: Textual inversion
- local: using-diffusers/shap-e
title: Shap-E
- local: using-diffusers/diffedit
title: DiffEdit
- local: using-diffusers/inference_with_tcd_lora
title: Trajectory Consistency Distillation-LoRA
- local: using-diffusers/svd
title: Stable Video Diffusion
- local: using-diffusers/marigold_usage
title: Marigold Computer Vision
title: Specific pipeline examples
- sections:
- local: training/overview
title: Overview
- local: training/create_dataset
title: Create a dataset for training
- local: training/adapt_a_model
title: Adapt a model to a new task
- isExpanded: false
sections:
- local: training/unconditional_training
title: Unconditional image generation
- local: training/text2image
title: Text-to-image
- local: training/sdxl
title: Stable Diffusion XL
- local: training/kandinsky
title: Kandinsky 2.2
- local: training/wuerstchen
title: Wuerstchen
- local: training/controlnet
title: ControlNet
- local: training/t2i_adapters
title: T2I-Adapters
- local: training/instructpix2pix
title: InstructPix2Pix
title: Models
- isExpanded: false
sections:
- local: training/text_inversion
title: Textual Inversion
- local: training/dreambooth
title: DreamBooth
- local: training/text2image
title: Text-to-image
- local: training/lora
title: Low-Rank Adaptation of Large Language Models (LoRA)
- local: training/controlnet
title: ControlNet
- local: training/instructpix2pix
title: InstructPix2Pix Training
title: LoRA
- local: training/custom_diffusion
title: Custom Diffusion
title: Training
- sections:
- local: using-diffusers/rl
title: Reinforcement Learning
- local: using-diffusers/audio
title: Audio
- local: using-diffusers/other-modalities
title: Other Modalities
title: Taking Diffusers Beyond Images
title: Using Diffusers
- local: training/lcm_distill
title: Latent Consistency Distillation
- local: training/ddpo
title: Reinforcement learning training with DDPO
title: Methods
title: Training
- sections:
- local: optimization/opt_overview
title: Overview
- local: optimization/fp16
title: Memory and Speed
title: Speed up inference
- local: optimization/memory
title: Reduce memory usage
- local: optimization/torch2.0
title: Torch2.0 support
title: PyTorch 2.0
- local: optimization/xformers
title: xFormers
- local: optimization/onnx
title: ONNX
- local: optimization/open_vino
title: OpenVINO
- local: optimization/coreml
title: Core ML
- local: optimization/mps
title: MPS
- local: optimization/habana
title: Habana Gaudi
- local: optimization/tome
title: Token Merging
title: Optimization/Special Hardware
title: Token merging
- local: optimization/deepcache
title: DeepCache
- local: optimization/tgate
title: TGATE
- sections:
- local: using-diffusers/stable_diffusion_jax_how_to
title: JAX/Flax
- local: optimization/onnx
title: ONNX
- local: optimization/open_vino
title: OpenVINO
- local: optimization/coreml
title: Core ML
title: Optimized model formats
- sections:
- local: optimization/mps
title: Metal Performance Shaders (MPS)
- local: optimization/habana
title: Habana Gaudi
title: Optimized hardware
title: Accelerate inference and reduce memory
- sections:
- local: conceptual/philosophy
title: Philosophy
@@ -121,154 +187,284 @@
title: Evaluating Diffusion Models
title: Conceptual Guides
- sections:
- sections:
- local: api/models
title: Models
- local: api/diffusion_pipeline
title: Diffusion Pipeline
- local: api/logging
title: Logging
- isExpanded: false
sections:
- local: api/configuration
title: Configuration
- local: api/logging
title: Logging
- local: api/outputs
title: Outputs
- local: api/loaders
title: Loaders
title: Main Classes
- sections:
- isExpanded: false
sections:
- local: api/loaders/ip_adapter
title: IP-Adapter
- local: api/loaders/lora
title: LoRA
- local: api/loaders/single_file
title: Single files
- local: api/loaders/textual_inversion
title: Textual Inversion
- local: api/loaders/unet
title: UNet
- local: api/loaders/peft
title: PEFT
title: Loaders
- isExpanded: false
sections:
- local: api/models/overview
title: Overview
- local: api/models/unet
title: UNet1DModel
- local: api/models/unet2d
title: UNet2DModel
- local: api/models/unet2d-cond
title: UNet2DConditionModel
- local: api/models/unet3d-cond
title: UNet3DConditionModel
- local: api/models/unet-motion
title: UNetMotionModel
- local: api/models/uvit2d
title: UViT2DModel
- local: api/models/vq
title: VQModel
- local: api/models/autoencoderkl
title: AutoencoderKL
- local: api/models/asymmetricautoencoderkl
title: AsymmetricAutoencoderKL
- local: api/models/autoencoder_tiny
title: Tiny AutoEncoder
- local: api/models/consistency_decoder_vae
title: ConsistencyDecoderVAE
- local: api/models/transformer2d
title: Transformer2DModel
- local: api/models/pixart_transformer2d
title: PixArtTransformer2DModel
- local: api/models/dit_transformer2d
title: DiTTransformer2DModel
- local: api/models/hunyuan_transformer2d
title: HunyuanDiT2DModel
- local: api/models/transformer_temporal
title: TransformerTemporalModel
- local: api/models/sd3_transformer2d
title: SD3Transformer2DModel
- local: api/models/prior_transformer
title: PriorTransformer
- local: api/models/controlnet
title: ControlNetModel
title: Models
- isExpanded: false
sections:
- local: api/pipelines/overview
title: Overview
- local: api/pipelines/alt_diffusion
title: AltDiffusion
- local: api/pipelines/audio_diffusion
title: Audio Diffusion
- local: api/pipelines/amused
title: aMUSEd
- local: api/pipelines/animatediff
title: AnimateDiff
- local: api/pipelines/attend_and_excite
title: Attend-and-Excite
- local: api/pipelines/audioldm
title: AudioLDM
- local: api/pipelines/cycle_diffusion
title: Cycle Diffusion
- local: api/pipelines/audioldm2
title: AudioLDM 2
- local: api/pipelines/auto_pipeline
title: AutoPipeline
- local: api/pipelines/blip_diffusion
title: BLIP-Diffusion
- local: api/pipelines/consistency_models
title: Consistency Models
- local: api/pipelines/controlnet
title: ControlNet
- local: api/pipelines/controlnet_sdxl
title: ControlNet with Stable Diffusion XL
- local: api/pipelines/controlnetxs
title: ControlNet-XS
- local: api/pipelines/controlnetxs_sdxl
title: ControlNet-XS with Stable Diffusion XL
- local: api/pipelines/dance_diffusion
title: Dance Diffusion
- local: api/pipelines/ddim
title: DDIM
- local: api/pipelines/ddpm
title: DDPM
- local: api/pipelines/deepfloyd_if
title: DeepFloyd IF
- local: api/pipelines/diffedit
title: DiffEdit
- local: api/pipelines/dit
title: DiT
- local: api/pipelines/if
title: IF
- local: api/pipelines/hunyuandit
title: Hunyuan-DiT
- local: api/pipelines/i2vgenxl
title: I2VGen-XL
- local: api/pipelines/pix2pix
title: InstructPix2Pix
- local: api/pipelines/kandinsky
title: Kandinsky 2.1
- local: api/pipelines/kandinsky_v22
title: Kandinsky 2.2
- local: api/pipelines/kandinsky3
title: Kandinsky 3
- local: api/pipelines/latent_consistency_models
title: Latent Consistency Models
- local: api/pipelines/latent_diffusion
title: Latent Diffusion
- local: api/pipelines/ledits_pp
title: LEDITS++
- local: api/pipelines/marigold
title: Marigold
- local: api/pipelines/panorama
title: MultiDiffusion
- local: api/pipelines/musicldm
title: MusicLDM
- local: api/pipelines/paint_by_example
title: PaintByExample
- local: api/pipelines/pndm
title: PNDM
- local: api/pipelines/repaint
title: RePaint
- local: api/pipelines/stable_diffusion_safe
title: Safe Stable Diffusion
- local: api/pipelines/score_sde_ve
title: Score SDE VE
title: Paint by Example
- local: api/pipelines/pia
title: Personalized Image Animator (PIA)
- local: api/pipelines/pixart
title: PixArt-α
- local: api/pipelines/pixart_sigma
title: PixArt-Σ
- local: api/pipelines/self_attention_guidance
title: Self-Attention Guidance
- local: api/pipelines/semantic_stable_diffusion
title: Semantic Guidance
- local: api/pipelines/spectrogram_diffusion
title: "Spectrogram Diffusion"
- local: api/pipelines/shap_e
title: Shap-E
- local: api/pipelines/stable_cascade
title: Stable Cascade
- sections:
- local: api/pipelines/stable_diffusion/overview
title: Overview
- local: api/pipelines/stable_diffusion/text2img
title: Text-to-Image
title: Text-to-image
- local: api/pipelines/stable_diffusion/img2img
title: Image-to-Image
title: Image-to-image
- local: api/pipelines/stable_diffusion/svd
title: Image-to-video
- local: api/pipelines/stable_diffusion/inpaint
title: Inpaint
title: Inpainting
- local: api/pipelines/stable_diffusion/depth2img
title: Depth-to-Image
title: Depth-to-image
- local: api/pipelines/stable_diffusion/image_variation
title: Image-Variation
- local: api/pipelines/stable_diffusion/upscale
title: Super-Resolution
title: Image variation
- local: api/pipelines/stable_diffusion/stable_diffusion_safe
title: Safe Stable Diffusion
- local: api/pipelines/stable_diffusion/stable_diffusion_2
title: Stable Diffusion 2
- local: api/pipelines/stable_diffusion/stable_diffusion_3
title: Stable Diffusion 3
- local: api/pipelines/stable_diffusion/stable_diffusion_xl
title: Stable Diffusion XL
- local: api/pipelines/stable_diffusion/sdxl_turbo
title: SDXL Turbo
- local: api/pipelines/stable_diffusion/latent_upscale
title: Stable-Diffusion-Latent-Upscaler
- local: api/pipelines/stable_diffusion/pix2pix
title: InstructPix2Pix
- local: api/pipelines/stable_diffusion/attend_and_excite
title: Attend and Excite
- local: api/pipelines/stable_diffusion/pix2pix_zero
title: Pix2Pix Zero
- local: api/pipelines/stable_diffusion/self_attention_guidance
title: Self-Attention Guidance
- local: api/pipelines/stable_diffusion/panorama
title: MultiDiffusion Panorama
- local: api/pipelines/stable_diffusion/controlnet
title: Text-to-Image Generation with ControlNet Conditioning
- local: api/pipelines/stable_diffusion/model_editing
title: Text-to-Image Model Editing
title: Latent upscaler
- local: api/pipelines/stable_diffusion/upscale
title: Super-resolution
- local: api/pipelines/stable_diffusion/k_diffusion
title: K-Diffusion
- local: api/pipelines/stable_diffusion/ldm3d_diffusion
title: LDM3D Text-to-(RGB, Depth), Text-to-(RGB-pano, Depth-pano), LDM3D Upscaler
- local: api/pipelines/stable_diffusion/adapter
title: T2I-Adapter
- local: api/pipelines/stable_diffusion/gligen
title: GLIGEN (Grounded Language-to-Image Generation)
title: Stable Diffusion
- local: api/pipelines/stable_diffusion_2
title: Stable Diffusion 2
- local: api/pipelines/stable_unclip
title: Stable unCLIP
- local: api/pipelines/stochastic_karras_ve
title: Stochastic Karras VE
- local: api/pipelines/text_to_video
title: Text-to-Video
title: Text-to-video
- local: api/pipelines/text_to_video_zero
title: Text-to-Video Zero
title: Text2Video-Zero
- local: api/pipelines/unclip
title: UnCLIP
- local: api/pipelines/latent_diffusion_uncond
title: Unconditional Latent Diffusion
- local: api/pipelines/versatile_diffusion
title: Versatile Diffusion
- local: api/pipelines/vq_diffusion
title: VQ Diffusion
title: unCLIP
- local: api/pipelines/unidiffuser
title: UniDiffuser
- local: api/pipelines/value_guided_sampling
title: Value-guided sampling
- local: api/pipelines/wuerstchen
title: Wuerstchen
title: Pipelines
- sections:
- isExpanded: false
sections:
- local: api/schedulers/overview
title: Overview
- local: api/schedulers/ddim
title: DDIM
- local: api/schedulers/cm_stochastic_iterative
title: CMStochasticIterativeScheduler
- local: api/schedulers/consistency_decoder
title: ConsistencyDecoderScheduler
- local: api/schedulers/ddim_inverse
title: DDIMInverse
title: DDIMInverseScheduler
- local: api/schedulers/ddim
title: DDIMScheduler
- local: api/schedulers/ddpm
title: DDPM
title: DDPMScheduler
- local: api/schedulers/deis
title: DEIS
- local: api/schedulers/dpm_discrete
title: DPM Discrete Scheduler
- local: api/schedulers/dpm_discrete_ancestral
title: DPM Discrete Scheduler with ancestral sampling
- local: api/schedulers/euler_ancestral
title: Euler Ancestral Scheduler
- local: api/schedulers/euler
title: Euler scheduler
- local: api/schedulers/heun
title: Heun Scheduler
- local: api/schedulers/ipndm
title: IPNDM
- local: api/schedulers/lms_discrete
title: Linear Multistep
title: DEISMultistepScheduler
- local: api/schedulers/multistep_dpm_solver_inverse
title: DPMSolverMultistepInverse
- local: api/schedulers/multistep_dpm_solver
title: Multistep DPM-Solver
- local: api/schedulers/pndm
title: PNDM
- local: api/schedulers/repaint
title: RePaint Scheduler
title: DPMSolverMultistepScheduler
- local: api/schedulers/dpm_sde
title: DPMSolverSDEScheduler
- local: api/schedulers/singlestep_dpm_solver
title: Singlestep DPM-Solver
title: DPMSolverSinglestepScheduler
- local: api/schedulers/edm_multistep_dpm_solver
title: EDMDPMSolverMultistepScheduler
- local: api/schedulers/edm_euler
title: EDMEulerScheduler
- local: api/schedulers/euler_ancestral
title: EulerAncestralDiscreteScheduler
- local: api/schedulers/euler
title: EulerDiscreteScheduler
- local: api/schedulers/flow_match_euler_discrete
title: FlowMatchEulerDiscreteScheduler
- local: api/schedulers/heun
title: HeunDiscreteScheduler
- local: api/schedulers/ipndm
title: IPNDMScheduler
- local: api/schedulers/stochastic_karras_ve
title: Stochastic Kerras VE
title: KarrasVeScheduler
- local: api/schedulers/dpm_discrete_ancestral
title: KDPM2AncestralDiscreteScheduler
- local: api/schedulers/dpm_discrete
title: KDPM2DiscreteScheduler
- local: api/schedulers/lcm
title: LCMScheduler
- local: api/schedulers/lms_discrete
title: LMSDiscreteScheduler
- local: api/schedulers/pndm
title: PNDMScheduler
- local: api/schedulers/repaint
title: RePaintScheduler
- local: api/schedulers/score_sde_ve
title: ScoreSdeVeScheduler
- local: api/schedulers/score_sde_vp
title: ScoreSdeVpScheduler
- local: api/schedulers/tcd
title: TCDScheduler
- local: api/schedulers/unipc
title: UniPCMultistepScheduler
- local: api/schedulers/score_sde_ve
title: VE-SDE
- local: api/schedulers/score_sde_vp
title: VP-SDE
- local: api/schedulers/vq_diffusion
title: VQDiffusionScheduler
title: Schedulers
- sections:
- local: api/experimental/rl
title: RL Planning
title: Experimental Features
- isExpanded: false
sections:
- local: api/internal_classes_overview
title: Overview
- local: api/attnprocessor
title: Attention Processor
- local: api/activations
title: Custom activation functions
- local: api/normalization
title: Custom normalization layers
- local: api/utilities
title: Utilities
- local: api/image_processor
title: VAE Image Processor
- local: api/video_processor
title: Video Processor
title: Internal classes
title: API

View File

@@ -0,0 +1,231 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Outpainting
Outpainting extends an image beyond its original boundaries, allowing you to add, replace, or modify visual elements in an image while preserving the original image. Like [inpainting](../using-diffusers/inpaint), you want to fill the white area (in this case, the area outside of the original image) with new visual elements while keeping the original image (represented by a mask of black pixels). There are a couple of ways to outpaint, such as with a [ControlNet](https://hf.co/blog/OzzyGT/outpainting-controlnet) or with [Differential Diffusion](https://hf.co/blog/OzzyGT/outpainting-differential-diffusion).
This guide will show you how to outpaint with an inpainting model, ControlNet, and a ZoeDepth estimator.
Before you begin, make sure you have the [controlnet_aux](https://github.com/huggingface/controlnet_aux) library installed so you can use the ZoeDepth estimator.
```py
!pip install -q controlnet_aux
```
## Image preparation
Start by picking an image to outpaint with and remove the background with a Space like [BRIA-RMBG-1.4](https://hf.co/spaces/briaai/BRIA-RMBG-1.4).
<iframe
src="https://briaai-bria-rmbg-1-4.hf.space"
frameborder="0"
width="850"
height="450"
></iframe>
For example, remove the background from this image of a pair of shoes.
<div class="flex flex-row gap-4">
<div class="flex-1">
<img class="rounded-xl" src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/original-jordan.png"/>
<figcaption class="mt-2 text-center text-sm text-gray-500">original image</figcaption>
</div>
<div class="flex-1">
<img class="rounded-xl" src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/no-background-jordan.png"/>
<figcaption class="mt-2 text-center text-sm text-gray-500">background removed</figcaption>
</div>
</div>
[Stable Diffusion XL (SDXL)](../using-diffusers/sdxl) models work best with 1024x1024 images, but you can resize the image to any size as long as your hardware has enough memory to support it. The transparent background in the image should also be replaced with a white background. Create a function (like the one below) that scales and pastes the image onto a white background.
```py
import random
import requests
import torch
from controlnet_aux import ZoeDetector
from PIL import Image, ImageOps
from diffusers import (
AutoencoderKL,
ControlNetModel,
StableDiffusionXLControlNetPipeline,
StableDiffusionXLInpaintPipeline,
)
def scale_and_paste(original_image):
aspect_ratio = original_image.width / original_image.height
if original_image.width > original_image.height:
new_width = 1024
new_height = round(new_width / aspect_ratio)
else:
new_height = 1024
new_width = round(new_height * aspect_ratio)
resized_original = original_image.resize((new_width, new_height), Image.LANCZOS)
white_background = Image.new("RGBA", (1024, 1024), "white")
x = (1024 - new_width) // 2
y = (1024 - new_height) // 2
white_background.paste(resized_original, (x, y), resized_original)
return resized_original, white_background
original_image = Image.open(
requests.get(
"https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/no-background-jordan.png",
stream=True,
).raw
).convert("RGBA")
resized_img, white_bg_image = scale_and_paste(original_image)
```
To avoid adding unwanted extra details, use the ZoeDepth estimator to provide additional guidance during generation and to ensure the shoes remain consistent with the original image.
```py
zoe = ZoeDetector.from_pretrained("lllyasviel/Annotators")
image_zoe = zoe(white_bg_image, detect_resolution=512, image_resolution=1024)
image_zoe
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/zoedepth-jordan.png"/>
</div>
## Outpaint
Once your image is ready, you can generate content in the white area around the shoes with [controlnet-inpaint-dreamer-sdxl](https://hf.co/destitech/controlnet-inpaint-dreamer-sdxl), an SDXL ControlNet trained for inpainting.
Load the inpainting ControlNet, the ZoeDepth ControlNet, and the VAE, and pass them to the [`StableDiffusionXLControlNetPipeline`]. Then create an optional `generate_image` convenience function to outpaint an initial image.
```py
controlnets = [
ControlNetModel.from_pretrained(
"destitech/controlnet-inpaint-dreamer-sdxl", torch_dtype=torch.float16, variant="fp16"
),
ControlNetModel.from_pretrained(
"diffusers/controlnet-zoe-depth-sdxl-1.0", torch_dtype=torch.float16
),
]
vae = AutoencoderKL.from_pretrained("madebyollin/sdxl-vae-fp16-fix", torch_dtype=torch.float16).to("cuda")
pipeline = StableDiffusionXLControlNetPipeline.from_pretrained(
"SG161222/RealVisXL_V4.0", torch_dtype=torch.float16, variant="fp16", controlnet=controlnets, vae=vae
).to("cuda")
def generate_image(prompt, negative_prompt, inpaint_image, zoe_image, seed: int = None):
if seed is None:
seed = random.randint(0, 2**32 - 1)
generator = torch.Generator(device="cpu").manual_seed(seed)
image = pipeline(
prompt,
negative_prompt=negative_prompt,
image=[inpaint_image, zoe_image],
guidance_scale=6.5,
num_inference_steps=25,
generator=generator,
controlnet_conditioning_scale=[0.5, 0.8],
control_guidance_end=[0.9, 0.6],
).images[0]
return image
prompt = "nike air jordans on a basketball court"
negative_prompt = ""
temp_image = generate_image(prompt, negative_prompt, white_bg_image, image_zoe, 908097)
```
Paste the original image over the initial outpainted image. You'll improve the outpainted background in a later step.
```py
x = (1024 - resized_img.width) // 2
y = (1024 - resized_img.height) // 2
temp_image.paste(resized_img, (x, y), resized_img)
temp_image
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/initial-outpaint.png"/>
</div>
> [!TIP]
> Now is a good time to free up some memory if you're running low!
>
> ```py
> pipeline = None
> torch.cuda.empty_cache()
> ```
Now that you have an initial outpainted image, load the [`StableDiffusionXLInpaintPipeline`] with the [RealVisXL](https://hf.co/SG161222/RealVisXL_V4.0) model to generate the final outpainted image with better quality.
```py
pipeline = StableDiffusionXLInpaintPipeline.from_pretrained(
"OzzyGT/RealVisXL_V4.0_inpainting",
torch_dtype=torch.float16,
variant="fp16",
vae=vae,
).to("cuda")
```
Prepare a mask for the final outpainted image. To create a more natural transition between the original image and the outpainted background, blur the mask to help it blend better.
```py
mask = Image.new("L", temp_image.size)
mask.paste(resized_img.split()[3], (x, y))
mask = ImageOps.invert(mask)
final_mask = mask.point(lambda p: p > 128 and 255)
mask_blurred = pipeline.mask_processor.blur(final_mask, blur_factor=20)
mask_blurred
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/blurred-mask.png"/>
</div>
Create a better prompt and pass it to the `generate_outpaint` function to generate the final outpainted image. Again, paste the original image over the final outpainted background.
```py
def generate_outpaint(prompt, negative_prompt, image, mask, seed: int = None):
if seed is None:
seed = random.randint(0, 2**32 - 1)
generator = torch.Generator(device="cpu").manual_seed(seed)
image = pipeline(
prompt,
negative_prompt=negative_prompt,
image=image,
mask_image=mask,
guidance_scale=10.0,
strength=0.8,
num_inference_steps=30,
generator=generator,
).images[0]
return image
prompt = "high quality photo of nike air jordans on a basketball court, highly detailed"
negative_prompt = ""
final_image = generate_outpaint(prompt, negative_prompt, temp_image, mask_blurred, 7688778)
x = (1024 - resized_img.width) // 2
y = (1024 - resized_img.height) // 2
final_image.paste(resized_img, (x, y), resized_img)
final_image
```
<div class="flex justify-center">
<img src="https://huggingface.co/datasets/stevhliu/testing-images/resolve/main/final-outpaint.png"/>
</div>

View File

@@ -0,0 +1,27 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Activation functions
Customized activation functions for supporting various models in 🤗 Diffusers.
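These blocks can also be instantiated on their own. Below is a minimal sketch (the `dim_in`/`dim_out` values and tensor shapes are illustrative assumptions, not recommended settings):
```py
import torch
from diffusers.models.activations import GEGLU, GELU

hidden_states = torch.randn(2, 64, 320)  # (batch, sequence, channels)

geglu = GEGLU(dim_in=320, dim_out=1280)  # gated GELU used in feed-forward blocks
gelu = GELU(dim_in=320, dim_out=1280)    # linear projection followed by GELU

print(geglu(hidden_states).shape)  # torch.Size([2, 64, 1280])
print(gelu(hidden_states).shape)   # torch.Size([2, 64, 1280])
```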
## GELU
[[autodoc]] models.activations.GELU
## GEGLU
[[autodoc]] models.activations.GEGLU
## ApproximateGELU
[[autodoc]] models.activations.ApproximateGELU

View File

@@ -0,0 +1,60 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Attention Processor
An attention processor is a class for applying different types of attention mechanisms.
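For example, a processor can be swapped onto a model with `set_attn_processor`. The snippet below is a short sketch (the model repository ID is only an example):
```py
import torch
from diffusers import UNet2DConditionModel
from diffusers.models.attention_processor import AttnProcessor2_0

unet = UNet2DConditionModel.from_pretrained(
    "runwayml/stable-diffusion-v1-5", subfolder="unet", torch_dtype=torch.float16
)
# use PyTorch 2.0 scaled dot-product attention in every attention layer
unet.set_attn_processor(AttnProcessor2_0())
print({type(p).__name__ for p in unet.attn_processors.values()})
```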
## AttnProcessor
[[autodoc]] models.attention_processor.AttnProcessor
## AttnProcessor2_0
[[autodoc]] models.attention_processor.AttnProcessor2_0
## AttnAddedKVProcessor
[[autodoc]] models.attention_processor.AttnAddedKVProcessor
## AttnAddedKVProcessor2_0
[[autodoc]] models.attention_processor.AttnAddedKVProcessor2_0
## CrossFrameAttnProcessor
[[autodoc]] pipelines.text_to_video_synthesis.pipeline_text_to_video_zero.CrossFrameAttnProcessor
## CustomDiffusionAttnProcessor
[[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor
## CustomDiffusionAttnProcessor2_0
[[autodoc]] models.attention_processor.CustomDiffusionAttnProcessor2_0
## CustomDiffusionXFormersAttnProcessor
[[autodoc]] models.attention_processor.CustomDiffusionXFormersAttnProcessor
## FusedAttnProcessor2_0
[[autodoc]] models.attention_processor.FusedAttnProcessor2_0
## LoRAAttnAddedKVProcessor
[[autodoc]] models.attention_processor.LoRAAttnAddedKVProcessor
## LoRAXFormersAttnProcessor
[[autodoc]] models.attention_processor.LoRAXFormersAttnProcessor
## SlicedAttnProcessor
[[autodoc]] models.attention_processor.SlicedAttnProcessor
## SlicedAttnAddedKVProcessor
[[autodoc]] models.attention_processor.SlicedAttnAddedKVProcessor
## XFormersAttnProcessor
[[autodoc]] models.attention_processor.XFormersAttnProcessor
## AttnProcessorNPU
[[autodoc]] models.attention_processor.AttnProcessorNPU

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -12,8 +12,13 @@ specific language governing permissions and limitations under the License.
# Configuration
Schedulers from [`~schedulers.scheduling_utils.SchedulerMixin`] and models from [`ModelMixin`] inherit from [`ConfigMixin`] which conveniently takes care of storing all the parameters that are
passed to their respective `__init__` methods in a JSON-configuration file.
Schedulers from [`~schedulers.scheduling_utils.SchedulerMixin`] and models from [`ModelMixin`] inherit from [`ConfigMixin`] which stores all the parameters that are passed to their respective `__init__` methods in a JSON-configuration file.
<Tip>
To use private or [gated](https://huggingface.co/docs/hub/models-gated#gated-models) models, log-in with `huggingface-cli login`.
</Tip>
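For example, here is a minimal sketch of inspecting and saving a config (the repository ID is only an example):
```py
from diffusers import DDIMScheduler

scheduler = DDIMScheduler.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="scheduler")

# every __init__ argument is recorded in a frozen config dict ...
print(scheduler.config.num_train_timesteps)

# ... and can be written back out as a scheduler_config.json file
scheduler.save_config("my-scheduler")
```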
## ConfigMixin

View File

@@ -1,47 +0,0 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Pipelines
The [`DiffusionPipeline`] is the easiest way to load any pretrained diffusion pipeline from the [Hub](https://huggingface.co/models?library=diffusers) and to use it in inference.
<Tip>
One should not use the Diffusion Pipeline class for training or fine-tuning a diffusion model. Individual
components of diffusion pipelines are usually trained individually, so we suggest to directly work
with [`UNetModel`] and [`UNetConditionModel`].
</Tip>
Any diffusion pipeline that is loaded with [`~DiffusionPipeline.from_pretrained`] will automatically
detect the pipeline type, *e.g.* [`StableDiffusionPipeline`] and consequently load each component of the
pipeline and pass them into the `__init__` function of the pipeline, *e.g.* [`~StableDiffusionPipeline.__init__`].
Any pipeline object can be saved locally with [`~DiffusionPipeline.save_pretrained`].
## DiffusionPipeline
[[autodoc]] DiffusionPipeline
- all
- __call__
- device
- to
- components
## ImagePipelineOutput
By default diffusion pipelines return an object of class
[[autodoc]] pipelines.ImagePipelineOutput
## AudioPipelineOutput
By default diffusion pipelines return an object of class
[[autodoc]] pipelines.AudioPipelineOutput

View File

@@ -0,0 +1,35 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# VAE Image Processor
The [`VaeImageProcessor`] provides a unified API for [`StableDiffusionPipeline`]s to prepare image inputs for VAE encoding and post-processing outputs once they're decoded. This includes transformations such as resizing, normalization, and conversion between PIL Image, PyTorch, and NumPy arrays.
All pipelines with [`VaeImageProcessor`] accept PIL Images, PyTorch tensors, or NumPy arrays as image inputs and return outputs based on the `output_type` argument specified by the user. You can pass encoded image latents directly to a pipeline and return latents from it with the `output_type` argument (for example, `output_type="latent"`). This allows you to take the generated latents from one pipeline and pass them to another pipeline as input without ever leaving the latent space. It also makes it much easier to chain multiple pipelines together by passing PyTorch tensors directly between them.
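For example, the sketch below keeps the hand-off between an SDXL base and refiner pipeline entirely in latent space (the model repository IDs are only examples, and both pipelines are assumed to share the same VAE latent space):
```py
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0", torch_dtype=torch.float16
).to("cuda")

prompt = "a photo of an astronaut riding a horse"
latents = base(prompt, output_type="latent").images  # latents, not decoded images
image = refiner(prompt, image=latents).images[0]     # decode only at the very end
```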
## VaeImageProcessor
[[autodoc]] image_processor.VaeImageProcessor
## VaeImageProcessorLDM3D
The [`VaeImageProcessorLDM3D`] accepts RGB and depth inputs and returns RGB and depth outputs.
[[autodoc]] image_processor.VaeImageProcessorLDM3D
## PixArtImageProcessor
[[autodoc]] image_processor.PixArtImageProcessor
## IPAdapterMaskProcessor
[[autodoc]] image_processor.IPAdapterMaskProcessor

View File

@@ -1,4 +1,4 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -10,13 +10,6 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
# DEIS
# Overview
Fast Sampling of Diffusion Models with Exponential Integrator.
## Overview
Original paper can be found [here](https://arxiv.org/abs/2204.13902). The original implementation can be found [here](https://github.com/qsh-zh/deis).
## DEISMultistepScheduler
[[autodoc]] DEISMultistepScheduler
The APIs in this section are more experimental and prone to breaking changes. Most of them are used internally for development, but they may also be useful to you if you're interested in building a diffusion model with some custom parts or if you're interested in some of our helper utilities for working with 🤗 Diffusers.

View File

@@ -1,42 +0,0 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Loaders
There are many ways to train adapter neural networks for diffusion models, such as
- [Textual Inversion](./training/text_inversion.mdx)
- [LoRA](https://github.com/cloneofsimo/lora)
- [Hypernetworks](https://arxiv.org/abs/1609.09106)
Such adapter neural networks often only consist of a fraction of the number of weights compared
to the pretrained model and as such are very portable. The Diffusers library offers an easy-to-use
API to load such adapter neural networks via the [`loaders.py` module](https://github.com/huggingface/diffusers/blob/main/src/diffusers/loaders.py).
**Note**: This module is still highly experimental and prone to future changes.
## LoaderMixins
### UNet2DConditionLoadersMixin
[[autodoc]] loaders.UNet2DConditionLoadersMixin
### TextualInversionLoaderMixin
[[autodoc]] loaders.TextualInversionLoaderMixin
### LoraLoaderMixin
[[autodoc]] loaders.LoraLoaderMixin
### FromCkptMixin
[[autodoc]] loaders.FromCkptMixin

View File

@@ -0,0 +1,29 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# IP-Adapter
[IP-Adapter](https://hf.co/papers/2308.06721) is a lightweight adapter that enables prompting a diffusion model with an image. This method decouples the cross-attention layers of the image and text features. The image features are generated from an image encoder.
<Tip>
Learn how to load an IP-Adapter checkpoint and image in the IP-Adapter [loading](../../using-diffusers/loading_adapters#ip-adapter) guide, and see how to use it in the [usage](../../using-diffusers/ip_adapter) guide.
</Tip>
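Below is a short sketch of loading and applying an IP-Adapter checkpoint (the repository, subfolder, weight name, and reference image path are examples):
```py
import torch
from diffusers import AutoPipelineForText2Image
from diffusers.utils import load_image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipeline.load_ip_adapter("h94/IP-Adapter", subfolder="models", weight_name="ip-adapter_sd15.bin")
pipeline.set_ip_adapter_scale(0.6)  # how strongly the image prompt steers generation

ip_image = load_image("ip_adapter_reference.png")  # illustrative path to any reference image
image = pipeline("a polar bear, best quality", ip_adapter_image=ip_image).images[0]
```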
## IPAdapterMixin
[[autodoc]] loaders.ip_adapter.IPAdapterMixin
## IPAdapterMaskProcessor
[[autodoc]] image_processor.IPAdapterMaskProcessor

View File

@@ -0,0 +1,32 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# LoRA
LoRA is a fast and lightweight training method that inserts and trains a significantly smaller number of parameters instead of all the model parameters. This produces a smaller file (~100 MB) and makes it easier to quickly train a model to learn a new concept. LoRA weights are typically loaded into the UNet, the text encoder, or both. There are two classes for loading LoRA weights:
- [`LoraLoaderMixin`] provides functions for loading and unloading, fusing and unfusing, enabling and disabling, and otherwise managing LoRA weights. This class can be used with any model.
- [`StableDiffusionXLLoraLoaderMixin`] is a [Stable Diffusion XL (SDXL)](../../api/pipelines/stable_diffusion/stable_diffusion_xl) version of the [`LoraLoaderMixin`] class for loading and saving LoRA weights. It can only be used with the SDXL model.
<Tip>
To learn more about how to load LoRA weights, see the [LoRA](../../using-diffusers/loading_adapters#lora) loading guide.
</Tip>
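Below is a minimal sketch of the loading workflow (the LoRA repository and weight name are examples):
```py
import torch
from diffusers import AutoPipelineForText2Image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")

# inserts the LoRA layers into the UNet (and text encoders if the checkpoint contains them)
pipeline.load_lora_weights(
    "ostris/ikea-instructions-lora-sdxl", weight_name="ikea_instructions_xl_v1_5.safetensors"
)

pipeline.fuse_lora()            # merge LoRA weights into the base weights for faster inference
pipeline.unfuse_lora()          # undo the merge
pipeline.unload_lora_weights()  # remove the LoRA layers entirely
```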
## LoraLoaderMixin
[[autodoc]] loaders.lora.LoraLoaderMixin
## StableDiffusionXLLoraLoaderMixin
[[autodoc]] loaders.lora.StableDiffusionXLLoraLoaderMixin

View File

@@ -0,0 +1,25 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# PEFT
Diffusers supports loading adapters such as [LoRA](../../using-diffusers/loading_adapters) with the [PEFT](https://huggingface.co/docs/peft/index) library through the [`~loaders.peft.PeftAdapterMixin`] class. This allows modeling classes in Diffusers, such as [`UNet2DConditionModel`], to load an adapter.
<Tip>
Refer to the [Inference with PEFT](../../tutorials/using_peft_for_inference.md) tutorial for an overview of how to use PEFT in Diffusers for inference.
</Tip>
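Below is a minimal sketch of attaching a new LoRA adapter to a UNet with PEFT (the rank, alpha, and target modules are illustrative values, not recommended settings):
```py
from diffusers import UNet2DConditionModel
from peft import LoraConfig

unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet")

lora_config = LoraConfig(
    r=4,
    lora_alpha=4,
    init_lora_weights="gaussian",
    target_modules=["to_k", "to_q", "to_v", "to_out.0"],
)
unet.add_adapter(lora_config)  # provided by PeftAdapterMixin
unet.disable_adapters()        # adapters can be toggled off ...
unet.enable_adapters()         # ... and back on
```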
## PeftAdapterMixin
[[autodoc]] loaders.peft.PeftAdapterMixin

View File

@@ -0,0 +1,59 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Single files
The [`~loaders.FromSingleFileMixin.from_single_file`] method allows you to load:
* a model stored in a single file, which is useful if you're working with models from the diffusion ecosystem, like Automatic1111, and commonly rely on a single-file layout to store and share models
* a model stored in its original distribution layout, which is useful if you're working with models finetuned with other services that you want to load directly into Diffusers model objects and pipelines
> [!TIP]
> Read the [Model files and layouts](../../using-diffusers/other-formats) guide to learn more about the Diffusers-multifolder layout versus the single-file layout, and how to load models stored in these different layouts.
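For example, here is a minimal sketch of loading SDXL from a single `.safetensors` file (the URL is only an example):
```py
import torch
from diffusers import StableDiffusionXLPipeline

pipeline = StableDiffusionXLPipeline.from_single_file(
    "https://huggingface.co/stabilityai/stable-diffusion-xl-base-1.0/blob/main/sd_xl_base_1.0.safetensors",
    torch_dtype=torch.float16,
).to("cuda")
```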
## Supported pipelines
- [`StableDiffusionPipeline`]
- [`StableDiffusionImg2ImgPipeline`]
- [`StableDiffusionInpaintPipeline`]
- [`StableDiffusionControlNetPipeline`]
- [`StableDiffusionControlNetImg2ImgPipeline`]
- [`StableDiffusionControlNetInpaintPipeline`]
- [`StableDiffusionUpscalePipeline`]
- [`StableDiffusionXLPipeline`]
- [`StableDiffusionXLImg2ImgPipeline`]
- [`StableDiffusionXLInpaintPipeline`]
- [`StableDiffusionXLInstructPix2PixPipeline`]
- [`StableDiffusionXLControlNetPipeline`]
- [`StableDiffusionXLKDiffusionPipeline`]
- [`LatentConsistencyModelPipeline`]
- [`LatentConsistencyModelImg2ImgPipeline`]
- [`StableDiffusionControlNetXSPipeline`]
- [`StableDiffusionXLControlNetXSPipeline`]
- [`LEditsPPPipelineStableDiffusion`]
- [`LEditsPPPipelineStableDiffusionXL`]
- [`PIAPipeline`]
## Supported models
- [`UNet2DConditionModel`]
- [`StableCascadeUNet`]
- [`AutoencoderKL`]
- [`ControlNetModel`]
## FromSingleFileMixin
[[autodoc]] loaders.single_file.FromSingleFileMixin
## FromOriginalModelMixin
[[autodoc]] loaders.single_file_model.FromOriginalModelMixin

View File

@@ -0,0 +1,27 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Textual Inversion
Textual Inversion is a training method for personalizing models by learning new text embeddings from a few example images. The file produced from training is extremely small (a few KBs) and the new embeddings can be loaded into the text encoder.
[`TextualInversionLoaderMixin`] provides a function for loading Textual Inversion embeddings from Diffusers and Automatic1111 into the text encoder and loading a special token to activate the embeddings.
<Tip>
To learn more about how to load Textual Inversion embeddings, see the [Textual Inversion](../../using-diffusers/loading_adapters#textual-inversion) loading guide.
</Tip>
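Below is a minimal sketch (the concept repository and its `<cat-toy>` placeholder token are examples from the sd-concepts-library):
```py
import torch
from diffusers import AutoPipelineForText2Image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")
pipeline.load_textual_inversion("sd-concepts-library/cat-toy")

# the learned embedding is activated by its special token in the prompt
image = pipeline("a <cat-toy> sitting on a park bench").images[0]
```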
## TextualInversionLoaderMixin
[[autodoc]] loaders.textual_inversion.TextualInversionLoaderMixin

View File

@@ -0,0 +1,27 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# UNet
Some training methods - like LoRA and Custom Diffusion - typically target the UNet's attention layers, but these training methods can also target other non-attention layers. Instead of training all of a model's parameters, only a subset of the parameters are trained, which is faster and more efficient. This class is useful if you're *only* loading weights into a UNet. If you need to load weights into the text encoder or a text encoder and UNet, try using the [`~loaders.LoraLoaderMixin.load_lora_weights`] function instead.
The [`UNet2DConditionLoadersMixin`] class provides functions for loading and saving weights, fusing and unfusing LoRAs, disabling and enabling LoRAs, and setting and deleting adapters.
<Tip>
To learn more about how to load LoRA weights, see the [LoRA](../../using-diffusers/loading_adapters#lora) loading guide.
</Tip>
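Below is a short sketch of loading LoRA weights into only the UNet with `load_attn_procs` (the repository and weight name are examples of a LoRA that targets just the UNet):
```py
import torch
from diffusers import AutoPipelineForText2Image

pipeline = AutoPipelineForText2Image.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
).to("cuda")
pipeline.unet.load_attn_procs(
    "jbilcke-hf/sdxl-cinematic-1", weight_name="pytorch_lora_weights.safetensors"
)
```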
## UNet2DConditionLoadersMixin
[[autodoc]] loaders.unet.UNet2DConditionLoadersMixin

View File

@@ -0,0 +1,96 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Logging
🤗 Diffusers has a centralized logging system to easily manage the verbosity of the library. The default verbosity is set to `WARNING`.
To change the verbosity level, use one of the direct setters. For instance, to change the verbosity to the `INFO` level:
```python
import diffusers
diffusers.logging.set_verbosity_info()
```
You can also use the environment variable `DIFFUSERS_VERBOSITY` to override the default verbosity. You can set it
to one of the following: `debug`, `info`, `warning`, `error`, `critical`. For example:
```bash
DIFFUSERS_VERBOSITY=error ./myprogram.py
```
Additionally, some `warnings` can be disabled by setting the environment variable
`DIFFUSERS_NO_ADVISORY_WARNINGS` to a true value, like `1`. This disables any warning logged by
[`logger.warning_advice`]. For example:
```bash
DIFFUSERS_NO_ADVISORY_WARNINGS=1 ./myprogram.py
```
Here is an example of how to use the same logger as the library in your own module or script:
```python
from diffusers.utils import logging
logging.set_verbosity_info()
logger = logging.get_logger("diffusers")
logger.info("INFO")
logger.warning("WARN")
```
All methods of the logging module are documented below. The main methods are
[`logging.get_verbosity`] to get the current level of verbosity in the logger and
[`logging.set_verbosity`] to set the verbosity to the level of your choice.
In order from the least verbose to the most verbose:
| Method | Integer value | Description |
|----------------------------------------------------------:|--------------:|----------------------------------------------------:|
| `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL` | 50 | only report the most critical errors |
| `diffusers.logging.ERROR` | 40 | only report errors |
| `diffusers.logging.WARNING` or `diffusers.logging.WARN` | 30 | only report errors and warnings (default) |
| `diffusers.logging.INFO` | 20 | only report errors, warnings, and basic information |
| `diffusers.logging.DEBUG` | 10 | report all information |
By default, `tqdm` progress bars are displayed during model download. [`logging.disable_progress_bar`] and [`logging.enable_progress_bar`] are used to enable or disable this behavior.
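For example:
```python
from diffusers.utils import logging

logging.disable_progress_bar()  # hide download progress bars
logging.enable_progress_bar()   # show them again
```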
## Base setters
[[autodoc]] utils.logging.set_verbosity_error
[[autodoc]] utils.logging.set_verbosity_warning
[[autodoc]] utils.logging.set_verbosity_info
[[autodoc]] utils.logging.set_verbosity_debug
## Other functions
[[autodoc]] utils.logging.get_verbosity
[[autodoc]] utils.logging.set_verbosity
[[autodoc]] utils.logging.get_logger
[[autodoc]] utils.logging.enable_default_handler
[[autodoc]] utils.logging.disable_default_handler
[[autodoc]] utils.logging.enable_explicit_format
[[autodoc]] utils.logging.reset_format
[[autodoc]] utils.logging.enable_progress_bar
[[autodoc]] utils.logging.disable_progress_bar

View File

@@ -1,98 +0,0 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Logging
🧨 Diffusers has a centralized logging system, so that you can setup the verbosity of the library easily.
Currently the default verbosity of the library is `WARNING`.
To change the level of verbosity, just use one of the direct setters. For instance, here is how to change the verbosity
to the INFO level.
```python
import diffusers
diffusers.logging.set_verbosity_info()
```
You can also use the environment variable `DIFFUSERS_VERBOSITY` to override the default verbosity. You can set it
to one of the following: `debug`, `info`, `warning`, `error`, `critical`. For example:
```bash
DIFFUSERS_VERBOSITY=error ./myprogram.py
```
Additionally, some `warnings` can be disabled by setting the environment variable
`DIFFUSERS_NO_ADVISORY_WARNINGS` to a true value, like *1*. This will disable any warning that is logged using
[`logger.warning_advice`]. For example:
```bash
DIFFUSERS_NO_ADVISORY_WARNINGS=1 ./myprogram.py
```
Here is an example of how to use the same logger as the library in your own module or script:
```python
from diffusers.utils import logging
logging.set_verbosity_info()
logger = logging.get_logger("diffusers")
logger.info("INFO")
logger.warning("WARN")
```
All the methods of this logging module are documented below, the main ones are
[`logging.get_verbosity`] to get the current level of verbosity in the logger and
[`logging.set_verbosity`] to set the verbosity to the level of your choice. In order (from the least
verbose to the most verbose), those levels (with their corresponding int values in parenthesis) are:
- `diffusers.logging.CRITICAL` or `diffusers.logging.FATAL` (int value, 50): only report the most
critical errors.
- `diffusers.logging.ERROR` (int value, 40): only report errors.
- `diffusers.logging.WARNING` or `diffusers.logging.WARN` (int value, 30): only reports error and
warnings. This the default level used by the library.
- `diffusers.logging.INFO` (int value, 20): reports error, warnings and basic information.
- `diffusers.logging.DEBUG` (int value, 10): report all information.
By default, `tqdm` progress bars will be displayed during model download. [`logging.disable_progress_bar`] and [`logging.enable_progress_bar`] can be used to suppress or unsuppress this behavior.
## Base setters
[[autodoc]] logging.set_verbosity_error
[[autodoc]] logging.set_verbosity_warning
[[autodoc]] logging.set_verbosity_info
[[autodoc]] logging.set_verbosity_debug
## Other functions
[[autodoc]] logging.get_verbosity
[[autodoc]] logging.set_verbosity
[[autodoc]] logging.get_logger
[[autodoc]] logging.enable_default_handler
[[autodoc]] logging.disable_default_handler
[[autodoc]] logging.enable_explicit_format
[[autodoc]] logging.reset_format
[[autodoc]] logging.enable_progress_bar
[[autodoc]] logging.disable_progress_bar

View File

@@ -1,107 +0,0 @@
<!--Copyright 2023 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Models
Diffusers contains pretrained models for popular algorithms and modules for creating the next set of diffusion models.
The primary function of these models is to denoise an input sample, by modeling the distribution $p_\theta(\mathbf{x}_{t-1}|\mathbf{x}_t)$.
The models are built on the base class ['ModelMixin'] that is a `torch.nn.module` with basic functionality for saving and loading models both locally and from the HuggingFace hub.
## ModelMixin
[[autodoc]] ModelMixin
## UNet2DOutput
[[autodoc]] models.unet_2d.UNet2DOutput
## UNet2DModel
[[autodoc]] UNet2DModel
## UNet1DOutput
[[autodoc]] models.unet_1d.UNet1DOutput
## UNet1DModel
[[autodoc]] UNet1DModel
## UNet2DConditionOutput
[[autodoc]] models.unet_2d_condition.UNet2DConditionOutput
## UNet2DConditionModel
[[autodoc]] UNet2DConditionModel
## UNet3DConditionOutput
[[autodoc]] models.unet_3d_condition.UNet3DConditionOutput
## UNet3DConditionModel
[[autodoc]] UNet3DConditionModel
## DecoderOutput
[[autodoc]] models.vae.DecoderOutput
## VQEncoderOutput
[[autodoc]] models.vq_model.VQEncoderOutput
## VQModel
[[autodoc]] VQModel
## AutoencoderKLOutput
[[autodoc]] models.autoencoder_kl.AutoencoderKLOutput
## AutoencoderKL
[[autodoc]] AutoencoderKL
## Transformer2DModel
[[autodoc]] Transformer2DModel
## Transformer2DModelOutput
[[autodoc]] models.transformer_2d.Transformer2DModelOutput
## TransformerTemporalModel
[[autodoc]] models.transformer_temporal.TransformerTemporalModel
## Transformer2DModelOutput
[[autodoc]] models.transformer_temporal.TransformerTemporalModelOutput
## PriorTransformer
[[autodoc]] models.prior_transformer.PriorTransformer
## PriorTransformerOutput
[[autodoc]] models.prior_transformer.PriorTransformerOutput
## ControlNetOutput
[[autodoc]] models.controlnet.ControlNetOutput
## ControlNetModel
[[autodoc]] ControlNetModel
## FlaxModelMixin
[[autodoc]] FlaxModelMixin
## FlaxUNet2DConditionOutput
[[autodoc]] models.unet_2d_condition_flax.FlaxUNet2DConditionOutput
## FlaxUNet2DConditionModel
[[autodoc]] FlaxUNet2DConditionModel
## FlaxDecoderOutput
[[autodoc]] models.vae_flax.FlaxDecoderOutput
## FlaxAutoencoderKLOutput
[[autodoc]] models.vae_flax.FlaxAutoencoderKLOutput
## FlaxAutoencoderKL
[[autodoc]] FlaxAutoencoderKL
## FlaxControlNetOutput
[[autodoc]] models.controlnet_flax.FlaxControlNetOutput
## FlaxControlNetModel
[[autodoc]] FlaxControlNetModel

View File

@@ -0,0 +1,60 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# AsymmetricAutoencoderKL
An improved, larger variational autoencoder (VAE) model with KL loss for the inpainting task: [Designing a Better Asymmetric VQGAN for StableDiffusion](https://arxiv.org/abs/2306.04632) by Zixin Zhu, Xuelu Feng, Dongdong Chen, Jianmin Bao, Le Wang, Yinpeng Chen, Lu Yuan, Gang Hua.
The abstract from the paper is:
*StableDiffusion is a revolutionary text-to-image generator that is causing a stir in the world of image generation and editing. Unlike traditional methods that learn a diffusion model in pixel space, StableDiffusion learns a diffusion model in the latent space via a VQGAN, ensuring both efficiency and quality. It not only supports image generation tasks, but also enables image editing for real images, such as image inpainting and local editing. However, we have observed that the vanilla VQGAN used in StableDiffusion leads to significant information loss, causing distortion artifacts even in non-edited image regions. To this end, we propose a new asymmetric VQGAN with two simple designs. Firstly, in addition to the input from the encoder, the decoder contains a conditional branch that incorporates information from task-specific priors, such as the unmasked image region in inpainting. Secondly, the decoder is much heavier than the encoder, allowing for more detailed recovery while only slightly increasing the total inference cost. The training cost of our asymmetric VQGAN is cheap, and we only need to retrain a new asymmetric decoder while keeping the vanilla VQGAN encoder and StableDiffusion unchanged. Our asymmetric VQGAN can be widely used in StableDiffusion-based inpainting and local editing methods. Extensive experiments demonstrate that it can significantly improve the inpainting and editing performance, while maintaining the original text-to-image capability. The code is available at https://github.com/buxiangzhiren/Asymmetric_VQGAN*
Evaluation results can be found in section 4.1 of the original paper.
## Available checkpoints
* [https://huggingface.co/cross-attention/asymmetric-autoencoder-kl-x-1-5](https://huggingface.co/cross-attention/asymmetric-autoencoder-kl-x-1-5)
* [https://huggingface.co/cross-attention/asymmetric-autoencoder-kl-x-2](https://huggingface.co/cross-attention/asymmetric-autoencoder-kl-x-2)
## Example Usage
```python
from diffusers import AsymmetricAutoencoderKL, StableDiffusionInpaintPipeline
from diffusers.utils import load_image, make_image_grid

prompt = "a photo of a person with beard"
img_url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/celeba_hq_256.png"
mask_url = "https://huggingface.co/datasets/hf-internal-testing/diffusers-images/resolve/main/repaint/mask_256.png"

original_image = load_image(img_url).resize((512, 512))
mask_image = load_image(mask_url).resize((512, 512))

# load the inpainting pipeline and swap its default VAE for the asymmetric variant
pipe = StableDiffusionInpaintPipeline.from_pretrained("runwayml/stable-diffusion-inpainting")
pipe.vae = AsymmetricAutoencoderKL.from_pretrained("cross-attention/asymmetric-autoencoder-kl-x-1-5")
pipe.to("cuda")

image = pipe(prompt=prompt, image=original_image, mask_image=mask_image).images[0]
make_image_grid([original_image, mask_image, image], rows=1, cols=3)
```
## AsymmetricAutoencoderKL
[[autodoc]] models.autoencoders.autoencoder_asym_kl.AsymmetricAutoencoderKL
## AutoencoderKLOutput
[[autodoc]] models.autoencoders.autoencoder_kl.AutoencoderKLOutput
## DecoderOutput
[[autodoc]] models.autoencoders.vae.DecoderOutput

View File

@@ -0,0 +1,57 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Tiny AutoEncoder
Tiny AutoEncoder for Stable Diffusion (TAESD) was introduced in [madebyollin/taesd](https://github.com/madebyollin/taesd) by Ollin Boer Bohan. It is a tiny distilled version of Stable Diffusion's VAE that can quickly decode the latents in a [`StableDiffusionPipeline`] or [`StableDiffusionXLPipeline`] almost instantly.
To use with Stable Diffusion v2.1:
```python
import torch
from diffusers import DiffusionPipeline, AutoencoderTiny

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base", torch_dtype=torch.float16
)
# replace the full-size VAE with the distilled Tiny AutoEncoder for near-instant decoding
pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesd", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "slice of delicious New York-style berry cheesecake"
image = pipe(prompt, num_inference_steps=25).images[0]
image
```
To use with Stable Diffusion XL 1.0:
```python
import torch
from diffusers import DiffusionPipeline, AutoencoderTiny
pipe = DiffusionPipeline.from_pretrained(
"stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesdxl", torch_dtype=torch.float16)
pipe = pipe.to("cuda")
prompt = "slice of delicious New York-style berry cheesecake"
image = pipe(prompt, num_inference_steps=25).images[0]
image
```
## AutoencoderTiny
[[autodoc]] AutoencoderTiny
## AutoencoderTinyOutput
[[autodoc]] models.autoencoders.autoencoder_tiny.AutoencoderTinyOutput

View File

@@ -0,0 +1,58 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# AutoencoderKL
The variational autoencoder (VAE) model with KL loss was introduced in [Auto-Encoding Variational Bayes](https://arxiv.org/abs/1312.6114v11) by Diederik P. Kingma and Max Welling. The model is used in 🤗 Diffusers to encode images into latents and to decode latent representations into images.
The abstract from the paper is:
*How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets? We introduce a stochastic variational inference and learning algorithm that scales to large datasets and, under some mild differentiability conditions, even works in the intractable case. Our contributions are two-fold. First, we show that a reparameterization of the variational lower bound yields a lower bound estimator that can be straightforwardly optimized using standard stochastic gradient methods. Second, we show that for i.i.d. datasets with continuous latent variables per datapoint, posterior inference can be made especially efficient by fitting an approximate inference model (also called a recognition model) to the intractable posterior using the proposed lower bound estimator. Theoretical advantages are reflected in experimental results.*
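A minimal encode/decode round trip looks roughly like the sketch below; the checkpoint (`stabilityai/sd-vae-ft-mse`) and the random input tensor are illustrative only:
```py
import torch
from diffusers import AutoencoderKL

vae = AutoencoderKL.from_pretrained("stabilityai/sd-vae-ft-mse")

# a dummy image batch scaled to [-1, 1]; in practice this comes from a preprocessed PIL image
image = torch.randn(1, 3, 512, 512)
with torch.no_grad():
    latents = vae.encode(image).latent_dist.sample()  # AutoencoderKLOutput -> posterior sample
    reconstruction = vae.decode(latents).sample       # DecoderOutput -> image-shaped tensor
```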
## Loading from the original format
By default the [`AutoencoderKL`] should be loaded with [`~ModelMixin.from_pretrained`], but it can also be loaded
from the original format using [`FromOriginalVAEMixin.from_single_file`] as follows:
```py
from diffusers import AutoencoderKL
url = "https://huggingface.co/stabilityai/sd-vae-ft-mse-original/blob/main/vae-ft-mse-840000-ema-pruned.safetensors" # can also be a local file
model = AutoencoderKL.from_single_file(url)
```
## AutoencoderKL
[[autodoc]] AutoencoderKL
- decode
- encode
- all
## AutoencoderKLOutput
[[autodoc]] models.autoencoders.autoencoder_kl.AutoencoderKLOutput
## DecoderOutput
[[autodoc]] models.autoencoders.vae.DecoderOutput
## FlaxAutoencoderKL
[[autodoc]] FlaxAutoencoderKL
## FlaxAutoencoderKLOutput
[[autodoc]] models.vae_flax.FlaxAutoencoderKLOutput
## FlaxDecoderOutput
[[autodoc]] models.vae_flax.FlaxDecoderOutput

View File

@@ -0,0 +1,30 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Consistency Decoder
The consistency decoder can be used to decode the latents from the denoising UNet in the [`StableDiffusionPipeline`]. This decoder was introduced in the [DALL-E 3 technical report](https://openai.com/dall-e-3).
The original codebase can be found at [openai/consistencydecoder](https://github.com/openai/consistencydecoder).
<Tip warning={true}>
Inference is only supported for 2 iterations as of now.
</Tip>
The pipeline could not have been contributed without the help of [madebyollin](https://github.com/madebyollin) and [mrsteyk](https://github.com/mrsteyk) from [this issue](https://github.com/openai/consistencydecoder/issues/1).
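A minimal usage sketch, assuming the `openai/consistency-decoder` checkpoint and a CUDA device are available:
```py
import torch
from diffusers import ConsistencyDecoderVAE, StableDiffusionPipeline

# swap the pipeline's default VAE decoder for the consistency decoder
vae = ConsistencyDecoderVAE.from_pretrained("openai/consistency-decoder", torch_dtype=torch.float16)
pipe = StableDiffusionPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", vae=vae, torch_dtype=torch.float16
).to("cuda")

image = pipe("horse", generator=torch.manual_seed(0)).images[0]
```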
## ConsistencyDecoderVAE
[[autodoc]] ConsistencyDecoderVAE
- all
- decode

View File

@@ -0,0 +1,50 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# ControlNetModel
The ControlNet model was introduced in [Adding Conditional Control to Text-to-Image Diffusion Models](https://huggingface.co/papers/2302.05543) by Lvmin Zhang, Anyi Rao, Maneesh Agrawala. It provides a greater degree of control over text-to-image generation by conditioning the model on additional inputs such as edge maps, depth maps, segmentation maps, and keypoints for pose detection.
The abstract from the paper is:
*We present ControlNet, a neural network architecture to add spatial conditioning controls to large, pretrained text-to-image diffusion models. ControlNet locks the production-ready large diffusion models, and reuses their deep and robust encoding layers pretrained with billions of images as a strong backbone to learn a diverse set of conditional controls. The neural architecture is connected with "zero convolutions" (zero-initialized convolution layers) that progressively grow the parameters from zero and ensure that no harmful noise could affect the finetuning. We test various conditioning controls, eg, edges, depth, segmentation, human pose, etc, with Stable Diffusion, using single or multiple conditions, with or without prompts. We show that the training of ControlNets is robust with small (<50k) and large (>1m) datasets. Extensive results show that ControlNet may facilitate wider applications to control image diffusion models.*
## Loading from the original format
By default the [`ControlNetModel`] should be loaded with [`~ModelMixin.from_pretrained`], but it can also be loaded
from the original format using [`FromOriginalControlnetMixin.from_single_file`] as follows:
```py
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
url = "https://huggingface.co/lllyasviel/ControlNet-v1-1/blob/main/control_v11p_sd15_canny.pth" # can also be a local path
controlnet = ControlNetModel.from_single_file(url)
url = "https://huggingface.co/runwayml/stable-diffusion-v1-5/blob/main/v1-5-pruned.safetensors" # can also be a local path
pipe = StableDiffusionControlNetPipeline.from_single_file(url, controlnet=controlnet)
```
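For generation, a hedged end-to-end sketch with the canny-conditioned checkpoint is shown below; the edge-map path is a placeholder for any Canny edge image of the desired composition:
```py
import torch
from diffusers import StableDiffusionControlNetPipeline, ControlNetModel
from diffusers.utils import load_image

controlnet = ControlNetModel.from_pretrained("lllyasviel/sd-controlnet-canny", torch_dtype=torch.float16)
pipe = StableDiffusionControlNetPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", controlnet=controlnet, torch_dtype=torch.float16
).to("cuda")

# placeholder path: supply any Canny edge map as the conditioning image
canny_image = load_image("path/to/canny_edge_map.png")
image = pipe("a colorful bird", image=canny_image, num_inference_steps=20).images[0]
```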
## ControlNetModel
[[autodoc]] ControlNetModel
## ControlNetOutput
[[autodoc]] models.controlnet.ControlNetOutput
## FlaxControlNetModel
[[autodoc]] FlaxControlNetModel
## FlaxControlNetOutput
[[autodoc]] models.controlnet_flax.FlaxControlNetOutput

View File

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -10,6 +10,10 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
-# TODO
+# DiTTransformer2DModel
-Coming soon!
+A Transformer model for image-like data from [DiT](https://huggingface.co/papers/2212.09748).
+## DiTTransformer2DModel
+[[autodoc]] DiTTransformer2DModel

View File

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -10,11 +10,11 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
-# VQDiffusionScheduler
+# HunyuanDiT2DModel
-## Overview
+A Diffusion Transformer model for 2D data from [Hunyuan-DiT](https://github.com/Tencent/HunyuanDiT).
-Original paper can be found [here](https://arxiv.org/abs/2111.14822)
+## HunyuanDiT2DModel
+[[autodoc]] HunyuanDiT2DModel
-## VQDiffusionScheduler
-[[autodoc]] VQDiffusionScheduler

View File

@@ -0,0 +1,28 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Models
🤗 Diffusers provides pretrained models for popular algorithms and modules to create custom diffusion systems. The primary function of models is to denoise an input sample as modeled by the distribution \\(p_{\theta}(x_{t-1}|x_{t})\\).
All models are built from the base [`ModelMixin`] class which is a [`torch.nn.Module`](https://pytorch.org/docs/stable/generated/torch.nn.Module.html) providing basic functionality for saving and loading models, locally and from the Hugging Face Hub.
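As a quick illustration of that shared functionality (using [`UNet2DConditionModel`] purely as an example; the repo id and local path are illustrative):
```py
from diffusers import UNet2DConditionModel

# download a pretrained model from the Hub, save it locally, and reload it from disk
unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet")
unet.save_pretrained("./local-unet")
unet = UNet2DConditionModel.from_pretrained("./local-unet")
```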
## ModelMixin
[[autodoc]] ModelMixin
## FlaxModelMixin
[[autodoc]] FlaxModelMixin
## PushToHubMixin
[[autodoc]] utils.PushToHubMixin

View File

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -10,11 +10,10 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
-# Variance Exploding Stochastic Differential Equation (VE-SDE) scheduler
+# PixArtTransformer2DModel
-## Overview
+A Transformer model for image-like data from [PixArt-Alpha](https://huggingface.co/papers/2310.00426) and [PixArt-Sigma](https://huggingface.co/papers/2403.04692).
-Original paper can be found [here](https://arxiv.org/abs/2011.13456).
+## PixArtTransformer2DModel
-## ScoreSdeVeScheduler
-[[autodoc]] ScoreSdeVeScheduler
+[[autodoc]] PixArtTransformer2DModel

View File

@@ -0,0 +1,27 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# PriorTransformer
The Prior Transformer was originally introduced in [Hierarchical Text-Conditional Image Generation with CLIP Latents](https://huggingface.co/papers/2204.06125) by Ramesh et al. It is used to predict CLIP image embeddings from CLIP text embeddings; image embeddings are predicted through a denoising diffusion process.
The abstract from the paper is:
*Contrastive models like CLIP have been shown to learn robust representations of images that capture both semantics and style. To leverage these representations for image generation, we propose a two-stage model: a prior that generates a CLIP image embedding given a text caption, and a decoder that generates an image conditioned on the image embedding. We show that explicitly generating image representations improves image diversity with minimal loss in photorealism and caption similarity. Our decoders conditioned on image representations can also produce variations of an image that preserve both its semantics and style, while varying the non-essential details absent from the image representation. Moreover, the joint embedding space of CLIP enables language-guided image manipulations in a zero-shot fashion. We use diffusion models for the decoder and experiment with both autoregressive and diffusion models for the prior, finding that the latter are computationally more efficient and produce higher-quality samples.*
## PriorTransformer
[[autodoc]] PriorTransformer
## PriorTransformerOutput
[[autodoc]] models.transformers.prior_transformer.PriorTransformerOutput

View File

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -10,11 +10,10 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
-# Variance exploding, stochastic sampling from Karras et. al
+# SD3 Transformer Model
-## Overview
+The Transformer model introduced in [Stable Diffusion 3](https://hf.co/papers/2403.03206). Its novelty lies in the MMDiT transformer block.
-Original paper can be found [here](https://arxiv.org/abs/2206.00364).
+## SD3Transformer2DModel
-## KarrasVeScheduler
-[[autodoc]] KarrasVeScheduler
+[[autodoc]] SD3Transformer2DModel

View File

@@ -0,0 +1,41 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# Transformer2DModel
A Transformer model for image-like data from [CompVis](https://huggingface.co/CompVis) that is based on the [Vision Transformer](https://huggingface.co/papers/2010.11929) introduced by Dosovitskiy et al. The [`Transformer2DModel`] accepts discrete (classes of vector embeddings) or continuous (actual embeddings) inputs.
When the input is **continuous**:
1. Project the input and reshape it to `(batch_size, sequence_length, feature_dimension)`.
2. Apply the Transformer blocks in the standard way.
3. Reshape to image.
When the input is **discrete**:
<Tip>
It is assumed one of the input classes is the masked latent pixel. The predicted classes of the unnoised image don't contain a prediction for the masked pixel because the unnoised image cannot be masked.
</Tip>
1. Convert input (classes of latent pixels) to embeddings and apply positional embeddings.
2. Apply the Transformer blocks in the standard way.
3. Predict classes of unnoised image.
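A minimal sketch of the continuous-input path, using a small, untrained configuration chosen purely for illustration:
```py
import torch
from diffusers import Transformer2DModel

model = Transformer2DModel(
    num_attention_heads=2,
    attention_head_dim=32,
    in_channels=64,
    norm_num_groups=32,
    num_layers=1,
)
hidden_states = torch.randn(1, 64, 16, 16)  # (batch, channels, height, width)
with torch.no_grad():
    out = model(hidden_states).sample  # projected back to the input's spatial shape
```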
## Transformer2DModel
[[autodoc]] Transformer2DModel
## Transformer2DModelOutput
[[autodoc]] models.transformers.transformer_2d.Transformer2DModelOutput

View File

@@ -1,4 +1,4 @@
-<!--Copyright 2023 The HuggingFace Team. All rights reserved.
+<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
@@ -10,11 +10,14 @@ an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express o
specific language governing permissions and limitations under the License.
-->
-# improved pseudo numerical methods for diffusion models (iPNDM)
+# TransformerTemporalModel
-## Overview
+A Transformer model for video-like data.
-Original implementation can be found [here](https://github.com/crowsonkb/v-diffusion-pytorch/blob/987f8985e38208345c1959b0ea767a625831cc9b/diffusion/sampling.py#L296).
+## TransformerTemporalModel
-## IPNDMScheduler
-[[autodoc]] IPNDMScheduler
+[[autodoc]] models.transformers.transformer_temporal.TransformerTemporalModel
+## TransformerTemporalModelOutput
+[[autodoc]] models.transformers.transformer_temporal.TransformerTemporalModelOutput

View File

@@ -0,0 +1,25 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# UNetMotionModel
The [UNet](https://huggingface.co/papers/1505.04597) model was originally introduced by Ronneberger et al. for biomedical image segmentation, but it is also commonly used in 🤗 Diffusers because it outputs images that are the same size as the input. It is one of the most important components of a diffusion system because it facilitates the actual diffusion process. There are several variants of the UNet model in 🤗 Diffusers, depending on its number of dimensions and whether it is a conditional model or not. The [`UNetMotionModel`] is a 2D UNet model extended with temporal motion modules for video generation.
The abstract from the paper is:
*There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.*
## UNetMotionModel
[[autodoc]] UNetMotionModel
## UNet3DConditionOutput
[[autodoc]] models.unets.unet_3d_condition.UNet3DConditionOutput

View File

@@ -0,0 +1,25 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# UNet1DModel
The [UNet](https://huggingface.co/papers/1505.04597) model was originally introduced by Ronneberger et al. for biomedical image segmentation, but it is also commonly used in 🤗 Diffusers because it outputs images that are the same size as the input. It is one of the most important components of a diffusion system because it facilitates the actual diffusion process. There are several variants of the UNet model in 🤗 Diffusers, depending on its number of dimensions and whether it is a conditional model or not. This is a 1D UNet model.
The abstract from the paper is:
*There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.*
## UNet1DModel
[[autodoc]] UNet1DModel
## UNet1DOutput
[[autodoc]] models.unets.unet_1d.UNet1DOutput

View File

@@ -0,0 +1,31 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# UNet2DConditionModel
The [UNet](https://huggingface.co/papers/1505.04597) model was originally introduced by Ronneberger et al. for biomedical image segmentation, but it is also commonly used in 🤗 Diffusers because it outputs images that are the same size as the input. It is one of the most important components of a diffusion system because it facilitates the actual diffusion process. There are several variants of the UNet model in 🤗 Diffusers, depending on its number of dimensions and whether it is a conditional model or not. This is a 2D UNet conditional model.
The abstract from the paper is:
*There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.*
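A minimal sketch of a single denoising forward pass, assuming access to the `runwayml/stable-diffusion-v1-5` weights; the latents and text embeddings below are random placeholders rather than real VAE/CLIP outputs:
```py
import torch
from diffusers import UNet2DConditionModel

unet = UNet2DConditionModel.from_pretrained("runwayml/stable-diffusion-v1-5", subfolder="unet")

latents = torch.randn(1, 4, 64, 64)        # noisy latents for a 512x512 image
text_embeddings = torch.randn(1, 77, 768)  # placeholder for CLIP text embeddings
with torch.no_grad():
    noise_pred = unet(latents, timestep=10, encoder_hidden_states=text_embeddings).sample
# noise_pred has the same shape as the input latents
```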
## UNet2DConditionModel
[[autodoc]] UNet2DConditionModel
## UNet2DConditionOutput
[[autodoc]] models.unets.unet_2d_condition.UNet2DConditionOutput
## FlaxUNet2DConditionModel
[[autodoc]] models.unets.unet_2d_condition_flax.FlaxUNet2DConditionModel
## FlaxUNet2DConditionOutput
[[autodoc]] models.unets.unet_2d_condition_flax.FlaxUNet2DConditionOutput

View File

@@ -0,0 +1,25 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# UNet2DModel
The [UNet](https://huggingface.co/papers/1505.04597) model was originally introduced by Ronneberger et al. for biomedical image segmentation, but it is also commonly used in 🤗 Diffusers because it outputs images that are the same size as the input. It is one of the most important components of a diffusion system because it facilitates the actual diffusion process. There are several variants of the UNet model in 🤗 Diffusers, depending on its number of dimensions and whether it is a conditional model or not. This is a 2D UNet model.
The abstract from the paper is:
*There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.*
## UNet2DModel
[[autodoc]] UNet2DModel
## UNet2DOutput
[[autodoc]] models.unets.unet_2d.UNet2DOutput

View File

@@ -0,0 +1,25 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# UNet3DConditionModel
The [UNet](https://huggingface.co/papers/1505.04597) model was originally introduced by Ronneberger et al. for biomedical image segmentation, but it is also commonly used in 🤗 Diffusers because it outputs images that are the same size as the input. It is one of the most important components of a diffusion system because it facilitates the actual diffusion process. There are several variants of the UNet model in 🤗 Diffusers, depending on its number of dimensions and whether it is a conditional model or not. This is a 3D UNet conditional model.
The abstract from the paper is:
*There is large consent that successful training of deep networks requires many thousand annotated training samples. In this paper, we present a network and training strategy that relies on the strong use of data augmentation to use the available annotated samples more efficiently. The architecture consists of a contracting path to capture context and a symmetric expanding path that enables precise localization. We show that such a network can be trained end-to-end from very few images and outperforms the prior best method (a sliding-window convolutional network) on the ISBI challenge for segmentation of neuronal structures in electron microscopic stacks. Using the same network trained on transmitted light microscopy images (phase contrast and DIC) we won the ISBI cell tracking challenge 2015 in these categories by a large margin. Moreover, the network is fast. Segmentation of a 512x512 image takes less than a second on a recent GPU. The full implementation (based on Caffe) and the trained networks are available at http://lmb.informatik.uni-freiburg.de/people/ronneber/u-net.*
## UNet3DConditionModel
[[autodoc]] UNet3DConditionModel
## UNet3DConditionOutput
[[autodoc]] models.unets.unet_3d_condition.UNet3DConditionOutput

View File

@@ -0,0 +1,39 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# UVit2DModel
The [U-ViT](https://hf.co/papers/2301.11093) model is a vision transformer (ViT) based UNet. This model incorporates elements from ViT (considers all inputs such as time, conditions and noisy image patches as tokens) and a UNet (long skip connections between the shallow and deep layers). The skip connection is important for predicting pixel-level features. An additional 3x3 convolutional block is applied prior to the final output to improve image quality.
The abstract from the paper is:
*Currently, applying diffusion models in pixel space of high resolution images is difficult. Instead, existing approaches focus on diffusion in lower dimensional spaces (latent diffusion), or have multiple super-resolution levels of generation referred to as cascades. The downside is that these approaches add additional complexity to the diffusion framework. This paper aims to improve denoising diffusion for high resolution images while keeping the model as simple as possible. The paper is centered around the research question: How can one train a standard denoising diffusion models on high resolution images, and still obtain performance comparable to these alternate approaches? The four main findings are: 1) the noise schedule should be adjusted for high resolution images, 2) It is sufficient to scale only a particular part of the architecture, 3) dropout should be added at specific locations in the architecture, and 4) downsampling is an effective strategy to avoid high resolution feature maps. Combining these simple yet effective techniques, we achieve state-of-the-art on image generation among diffusion models without sampling modifiers on ImageNet.*
## UVit2DModel
[[autodoc]] UVit2DModel
## UVit2DConvEmbed
[[autodoc]] models.unets.uvit_2d.UVit2DConvEmbed
## UVitBlock
[[autodoc]] models.unets.uvit_2d.UVitBlock
## ConvNextBlock
[[autodoc]] models.unets.uvit_2d.ConvNextBlock
## ConvMlmLayer
[[autodoc]] models.unets.uvit_2d.ConvMlmLayer

View File

@@ -0,0 +1,27 @@
<!--Copyright 2024 The HuggingFace Team. All rights reserved.
Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with
the License. You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on
an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the
specific language governing permissions and limitations under the License.
-->
# VQModel
The VQ-VAE model was introduced in [Neural Discrete Representation Learning](https://huggingface.co/papers/1711.00937) by Aaron van den Oord, Oriol Vinyals and Koray Kavukcuoglu. The model is used in 🤗 Diffusers to decode latent representations into images. Unlike [`AutoencoderKL`], the [`VQModel`] works in a quantized latent space.
The abstract from the paper is:
*Learning useful representations without supervision remains a key challenge in machine learning. In this paper, we propose a simple yet powerful generative model that learns such discrete representations. Our model, the Vector Quantised-Variational AutoEncoder (VQ-VAE), differs from VAEs in two key ways: the encoder network outputs discrete, rather than continuous, codes; and the prior is learnt rather than static. In order to learn a discrete latent representation, we incorporate ideas from vector quantisation (VQ). Using the VQ method allows the model to circumvent issues of "posterior collapse" -- where the latents are ignored when they are paired with a powerful autoregressive decoder -- typically observed in the VAE framework. Pairing these representations with an autoregressive prior, the model can generate high quality images, videos, and speech as well as doing high quality speaker conversion and unsupervised learning of phonemes, providing further evidence of the utility of the learnt representations.*
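A minimal encode/decode sketch, using the VQ-VAE from `CompVis/ldm-celebahq-256` as an illustrative checkpoint and a random tensor in place of a real image:
```py
import torch
from diffusers import VQModel

vq = VQModel.from_pretrained("CompVis/ldm-celebahq-256", subfolder="vqvae")

image = torch.randn(1, 3, 256, 256)  # dummy image batch in [-1, 1]
with torch.no_grad():
    latents = vq.encode(image).latents
    # decode() quantizes the latents against the learned codebook before decoding
    reconstruction = vq.decode(latents).sample
```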
## VQModel
[[autodoc]] VQModel
## VQEncoderOutput
[[autodoc]] models.autoencoders.vq_model.VQEncoderOutput

Some files were not shown because too many files have changed in this diff.