Commit Graph

818 Commits

Author SHA1 Message Date
Sayak Paul
c8a236ba5c [Core] Add PAG support for PixArtSigma (#8921)
* feat: add pixart sigma pag.

* inits.

* fixes

* fix

* remove print.

* copy paste methods to the pixart pag mixin

* fix-copies

* add documentation.

* add tests.

* remove correction file.

* remove pag_applied_layers

* empty
2024-12-23 13:02:14 +05:30
Sayak Paul
7739beb740 Flux pipeline (#9043)
add flux!

Signed-off-by: Adrien <adrien@huggingface.co>
Co-authored-by: Adrien <adrien.69740@gmail.com>
Co-authored-by: Anatoly Belikov <abelikov@singularitynet.io>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
2024-12-23 13:02:14 +05:30
Aryan
e28e5373f9 PAG variant for AnimateDiff (#8789)
* add animatediff pag pipeline

* remove unnecessary print

* make fix-copies

* fix ip-adapter bug

* update docs

* add fast tests and fix bugs

* update

* update

* address review comments

* update ip adapter single test expected slice

* implement test_from_pipe_consistent_config; fix expected slice values

* LoraLoaderMixin->StableDiffusionLoraLoaderMixin; add latest freeinit test
2024-12-23 13:02:14 +05:30
Aryan
cf513e4205 [core] Move community AnimateDiff ControlNet to core (#8972)
* add animatediff controlnet to core

* make style; remove unused method

* fix copied from comment

* add tests

* changes to make tests work

* add utility function to load videos

* update docs

* update pipeline example

* make style

* update docs with example

* address review comments

* add latest freeinit test from #8969

* LoraLoaderMixin -> StableDiffusionLoraLoaderMixin

* fix docs

* Update src/diffusers/utils/loading_utils.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fix: variable out of scope

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-12-23 13:02:14 +05:30
Yoach Lacombe
030a134311 Stable Audio integration (#8716)
* WIP modeling code and pipeline

* add custom attention processor + custom activation + add to init

* correct ProjectionModel forward

* add stable audio to __initèè

* add autoencoder and update pipeline and modeling code

* add half Rope

* add partial rotary v2

* add temporary modfis to scheduler

* add EDM DPM Solver

* remove TODOs

* clean GLU

* remove att.group_norm to attn processor

* revert back src/diffusers/schedulers/scheduling_dpmsolver_multistep.py

* refactor GLU -> SwiGLU

* remove redundant args

* add channel multiples in autoencoder docstrings

* changes in docsrtings and copyright headers

* clean pipeline

* further cleaning

* remove peft and lora and fromoriginalmodel

* Delete src/diffusers/pipelines/stable_audio/diffusers.code-workspace

* make style

* dummy models

* fix copied from

* add fast oobleck tests

* add brownian tree

* oobleck autoencoder slow tests

* remove TODO

* fast stable audio pipeline tests

* add slow tests

* make style

* add first version of docs

* wrap is_torchsde_available to the scheduler

* fix slow test

* test with input waveform

* add input waveform

* remove some todos

* create stableaudio gaussian projection + make style

* add pipeline to toctree

* fix copied from

* make quality

* refactor timestep_features->time_proj

* refactor joint_attention_kwargs->cross_attention_kwargs

* remove forward_chunk

* move StableAudioDitModel to transformers folder

* correct convert + remove partial rotary embed

* apply suggestions from yiyixuxu -> removing attn.kv_heads

* remove temb

* remove cross_attention_kwargs

* further removal of cross_attention_kwargs

* remove text encoder autocast to fp16

* continue removing autocast

* make style

* refactor how text and audio are embedded

* add paper

* update example code

* make style

* unify projection model forward + fix device placement

* make style

* remove fuse qkv

* apply suggestions from review

* Update src/diffusers/pipelines/stable_audio/pipeline_stable_audio.py

Co-authored-by: YiYi Xu <yixu310@gmail.com>

* make style

* smaller models in fast tests

* pass sequential offloading fast tests

* add docs for vae and autoencoder

* make style and update example

* remove useless import

* add cosine scheduler

* dummy classes

* cosine scheduler docs

* better description of scheduler

---------

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-12-23 13:02:14 +05:30
Sayak Paul
3566f4b18a [Docs] credit where it's due for Lumina and Latte. (#9000)
credit where it's due for Lumina and Latte.
2024-12-23 13:02:14 +05:30
Álvaro Somoza
edddf3d417 [Kolors] Add IP Adapter (#8901)
* initial draft

* apply suggestions

* fix failing test

* added ipa to img2img

* add docs

* apply suggestions
2024-12-23 13:02:14 +05:30
Aryan
b7ddd2bb99 [core] AnimateDiff SparseCtrl (#8897)
* initial sparse control model draft

* remove unnecessary implementation

* copy animatediff pipeline

* remove deprecated callbacks

* update

* update pipeline implementation progress

* make style

* make fix-copies

* update progress

* add partially working pipeline

* remove debug prints

* add model docs

* dummy objects

* improve motion lora conversion script

* fix bugs

* update docstrings

* remove unnecessary model params; docs

* address review comment

* add copied from to zero_module

* copy animatediff test

* add fast tests

* update docs

* update

* update pipeline docs

* fix expected slice values

* fix license

* remove get_down_block usage

* remove temporal_double_self_attention from get_down_block

* update

* update docs with org and documentation images

* make from_unet work in sparsecontrolnetmodel

* add latest freeinit test from #8969

* make fix-copies

* LoraLoaderMixin -> StableDiffsuionLoraLoaderMixin
2024-12-23 13:02:14 +05:30
RandomGamingDev
13ef7e1b98 Added accelerator based gradient accumulation for basic_example (#8966)
added accelerator based gradient accumulation for basic_example

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:14 +05:30
Sayak Paul
6d11129c5a [Chore] add LoraLoaderMixin to the inits (#8981)
* introduce  to promote reusability.

* up

* add more tests

* up

* remove comments.

* fix fuse_nan test

* clarify the scope of fuse_lora and unfuse_lora

* remove space

* rewrite fuse_lora a bit.

* feedback

* copy over load_lora_into_text_encoder.

* address dhruv's feedback.

* fix-copies

* fix issubclass.

* num_fused_loras

* fix

* fix

* remove mapping

* up

* fix

* style

* fix-copies

* change to SD3TransformerLoRALoadersMixin

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* up

* handle wuerstchen

* up

* move lora to lora_pipeline.py

* up

* fix-copies

* fix documentation.

* comment set_adapters().

* fix-copies

* fix set_adapters() at the model level.

* fix?

* fix

* loraloadermixin.

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-12-23 13:02:14 +05:30
YiYi Xu
a754d9071e Revert "[LoRA] introduce LoraBaseMixin to promote reusability." (#8976)
Revert "[LoRA] introduce LoraBaseMixin to promote reusability. (#8774)"

This reverts commit 527430d0a4.
2024-12-23 13:02:14 +05:30
Sayak Paul
82b37a4cc3 [LoRA] introduce LoraBaseMixin to promote reusability. (#8774)
* introduce  to promote reusability.

* up

* add more tests

* up

* remove comments.

* fix fuse_nan test

* clarify the scope of fuse_lora and unfuse_lora

* remove space

* rewrite fuse_lora a bit.

* feedback

* copy over load_lora_into_text_encoder.

* address dhruv's feedback.

* fix-copies

* fix issubclass.

* num_fused_loras

* fix

* fix

* remove mapping

* up

* fix

* style

* fix-copies

* change to SD3TransformerLoRALoadersMixin

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* up

* handle wuerstchen

* up

* move lora to lora_pipeline.py

* up

* fix-copies

* fix documentation.

* comment set_adapters().

* fix-copies

* fix set_adapters() at the model level.

* fix?

* fix

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-12-23 13:02:14 +05:30
RandomGamingDev
7f10257852 Added Code for Gradient Accumulation to work for basic_training (#8961)
added line allowing gradient accumulation to work for basic_training example
2024-12-23 13:02:14 +05:30
Jiwook Han
eed4531f9c Reflect few contributions on ethical_guidelines.md that were not reflected on #8294 (#8914)
fix_ethical_guidelines.md
2024-12-23 13:02:14 +05:30
Sayak Paul
8e76f5b6b5 [Docs] small fixes to pag guide. (#8920)
small fixes to pag guide.
2024-12-23 13:02:14 +05:30
Seongsu Park
0e59db02d9 🌐 [i18n-KO] Translated docs to Korean (added 7 docs and etc) (#8804)
* remove unused docs

* add ko-18n docs

* docs typo, edit etc

* reorder list, add `in translation` in toctree

* fix minor translation

* fix docs minor tone, etc
2024-12-23 13:02:14 +05:30
Aryan
0f2c512fb6 [docs] pipeline docs for latte (#8844)
* add pipeline docs for latte

* add inference time to latte docs

* apply review suggestions
2024-12-23 13:02:14 +05:30
Nguyễn Công Tú Anh
adcd3682bf add PAG support sd15 controlnet (#8820)
* add pag support sd15 controlnet

* fix quality import

* remove unecessary import

* remove if state

* fix tests

* remove useless function

* add sd1.5 controlnet pag docs

---------

Co-authored-by: anhnct8 <anhnct8@fpt.com>
2024-12-23 13:02:14 +05:30
Sayak Paul
0ace726d8a [Docs] add AuraFlow docs (#8851)
* add pipeline documentation.

* add api spec for pipeline

* model documentation

* model spec
2024-12-23 13:02:14 +05:30
Dhruv Nair
c166a0a90d Add single file loading support for AnimateDiff (#8819)
* update

* update

* update

* update
2024-12-23 13:02:14 +05:30
Álvaro Somoza
1028de9d9d [Core] Add Kolors (#8812)
* initial draft
2024-12-23 13:02:14 +05:30
Xin Ma
b8742cb946 Latte: Latent Diffusion Transformer for Video Generation (#8404)
* add Latte to diffusers

* remove print

* remove print

* remove print

* remove unuse codes

* remove layer_norm_latte and add a flag

* remove layer_norm_latte and add a flag

* update latte_pipeline

* update latte_pipeline

* remove unuse squeeze

* add norm_hidden_states.ndim == 2: # for Latte

* fixed test latte pipeline bugs

* fixed test latte pipeline bugs

* delete sh

* add doc for latte

* add licensing

* Move Transformer3DModelOutput to modeling_outputs

* give a default value to sample_size

* remove the einops dependency

* change norm2 for latte

* modify pipeline of latte

* update test for Latte

* modify some codes for latte

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* modify for Latte pipeline

* video_length -> num_frames; update prepare_latents copied from

* make fix-copies

* make style

* typo: videe -> video

* update

* modify for Latte pipeline

* modify latte pipeline

* modify latte pipeline

* modify latte pipeline

* modify latte pipeline

* modify for Latte pipeline

* Delete .vscode directory

* make style

* make fix-copies

* add latte transformer 3d to docs _toctree.yml

* update example

* reduce frames for test

* fixed bug of _text_preprocessing

* set num frame to 1 for testing

* remove unuse print

* add text = self._clean_caption(text) again

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Aryan <contact.aryanvs@gmail.com>
Co-authored-by: Aryan <aryan@huggingface.co>
2024-12-23 13:02:14 +05:30
PommesPeter
2256ec51ff [Alpha-VLLM Team] Add Lumina-T2X to diffusers (#8652)
---------

Co-authored-by: zhuole1025 <zhuole1025@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-12-23 13:02:14 +05:30
Sayak Paul
11c5f6bfcd Revert "[LoRA] introduce LoraBaseMixin to promote reusability." (#8773)
Revert "[LoRA] introduce `LoraBaseMixin` to promote reusability. (#8670)"

This reverts commit a2071a1837.
2024-12-23 13:02:14 +05:30
Sayak Paul
2686552727 [LoRA] introduce LoraBaseMixin to promote reusability. (#8670)
* introduce  to promote reusability.

* up

* add more tests

* up

* remove comments.

* fix fuse_nan test

* clarify the scope of fuse_lora and unfuse_lora

* remove space
2024-12-23 13:02:13 +05:30
Jiwook Han
08db291f18 Reflect few contributions on philosophy.md that were not reflected on #8294 (#8690)
* Update philosophy.md 

Some contributions were not reflected previously, so I am resubmitting them.

* Update docs/source/ko/conceptual/philosophy.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/ko/conceptual/philosophy.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-23 13:02:13 +05:30
Dhruv Nair
56735380d8 Fix mistake in Single File Docs page (#8765)
update
2024-12-23 13:02:13 +05:30
Dhruv Nair
a039005206 Remove legacy single file model loading mixins (#8754)
update
2024-12-23 13:02:13 +05:30
YiYi Xu
ace869b5ac [doc] add a tip about using SDXL refiner with hunyuan-dit and pixart (#8735)
* up

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-23 13:02:13 +05:30
Shauray Singh
5f10c18270 add PAG support for SD architecture (#8725)
* add pag to sd pipelines
2024-12-23 13:02:13 +05:30
Sayak Paul
23145f2d9b [Chore] remove deprecation from transformer2d regarding the output class. (#8698)
* remove deprecation from transformer2d regarding the output class.

* up

* deprecate more
2024-12-23 13:02:13 +05:30
XCL
f488493082 [Tencent Hunyuan Team] Add Hunyuan-DiT ControlNet Inference (#8694)
* add controlnet support

---------

Co-authored-by: xingchaoliu <xingchaoliu@tencent.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-12-23 13:02:13 +05:30
YiYi Xu
b078f93d6a [doc] add more about from_pipe API for PAG doc (#8701)
* add more about from_pipe API

* Update docs/source/en/using-diffusers/pag.md

* Update docs/source/en/using-diffusers/pag.md

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
2024-12-23 13:02:13 +05:30
Sayak Paul
3ecdab0995 add docs on model sharding (#8658)
* add docs on model sharding

* add entry to _toctree.

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* simplify wording

* add a note on transformer library handling

* move device placement section

* Update docs/source/en/training/distributed_inference.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-23 13:02:13 +05:30
Álvaro Somoza
89a6943efc [Docs] SD3 T5 Token limit doc (#8654)
* doc for max_sequence_length

* better position and changed note to tip

* apply suggestions

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:13 +05:30
YiYi Xu
5efc438c7e add PAG support (#7944)
* first draft


---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: Junhwa Song <ethan9867@gmail.com>
Co-authored-by: Ahn Donghoon (안동훈 / suno) <suno.vivid@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-23 13:02:13 +05:30
Steven Liu
473acb9579 [docs] Add note for float8 (#8685)
add note
2024-12-23 13:02:13 +05:30
Sayak Paul
1850ad61c4 [Docs] add note on caching in fast diffusion (#8675)
* add note on caching in fast diffusion

* formatting

* Update docs/source/en/tutorials/fast_diffusion.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-23 13:02:13 +05:30
Tolga Cangöz
f6172748c6 Errata - Fix typos and improve style (#8571)
* Fix typos

* Fix typos & up style

* chore: Update numbers

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:13 +05:30
Tolga Cangöz
1ced1c40d8 Discourage using deprecated revision parameter (#8573)
* Discourage using `revision`

* `make style && make quality`

* Refactor code to use 'variant' instead of 'revision'

* `revision="bf16"` -> `variant="bf16"`

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:13 +05:30
Tolga Cangöz
2c56360222 Errata - Trim trailing white space in the whole repo (#8575)
* Trim all the trailing white space in the whole repo

* Remove unnecessary empty places

* make style && make quality

* Trim trailing white space

* trim

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:13 +05:30
Tolga Cangöz
c20d0ec72d Errata - Fix typos & improve contributing page (#8572)
* Fix typos & improve contributing page

* `make style && make quality`

* fix typos

* Fix typo

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:13 +05:30
Sayak Paul
22ebb0ab89 [LoRA] get rid of the legacy lora remnants and make our codebase lighter (#8623)
* get rid of the legacy lora remnants and make our codebase lighter

* fix depcrecated lora argument

* fix

* empty commit to trigger ci

* remove print

* empty
2024-12-23 13:02:13 +05:30
王奇勋
86da4dcf8e Support SD3 ControlNet and Multi-ControlNet. (#8566)
* sd3 controlnet



---------

Co-authored-by: haofanwang <haofanwang.ai@gmail.com>
2024-12-23 13:02:13 +05:30
Vasco Ramos
410eb1ad86 [SD3 Docs] Corrected title about loading model with T5 "without" -> "with" (#8602)
[SD3 Docs] Corrected title about loading model with T5

Corrected the documentation title to "Loading the single file checkpoint with T5" Previously, it incorrectly stated "Loading the single file checkpoint without T5" which contradicted the code snippet showing how to load the SD3 checkpoint with the T5 model
2024-12-23 13:02:13 +05:30
Sayak Paul
db74292bb3 [Core] Add shift_factor to SD3 tiny autoencoder (#8618)
* shift factor argument to tiny

* remove shift factor rejigging from the sd3 docs
2024-12-23 13:02:13 +05:30
MaoXianXin
c8deda5834 A backslash is missing from the run command (#8471) 2024-12-23 13:02:13 +05:30
Álvaro Somoza
fcbb2ffac9 [SD3] TAESD3 docs (#8607)
* tased3 docs

* apply suggestion

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:13 +05:30
AmosDinh
925e2343d7 Syntax error in readme example "pipe" -> "pipeline" (#8601)
Update controlnet.md

Syntax error pipe -> pipeline
2024-12-23 13:02:13 +05:30
Dhruv Nair
fa640cc7d7 Expand Single File support in SD3 Pipeline (#8517)
* update

* update
2024-12-23 13:02:12 +05:30