satani99
8a17331c29
Add StableDiffusionXLControlNetPAGImg2ImgPipeline ( #8990 )
...
* Added pag controlnet sdxl img2img pipeline
---------
Co-authored-by: YiYi Xu <yixu310@gmail.com >
2024-12-23 13:02:15 +05:30
Tolga Cangöz
259b12a077
[Docs] Fix CPU offloading usage ( #9207 )
...
* chore: Fix cpu offloading usage
* Trim trailing white space
* docs: update Kolors model link in kolors.md
2024-12-23 13:02:15 +05:30
Aryan
0f74c69416
[refactor] CogVideoX followups + tiled decoding support ( #9150 )
...
* refactor context parallel cache; update torch compile time benchmark
* add tiling support
* make style
* remove num_frames % 8 == 0 requirement
* update default num_frames to original value
* add explanations + refactor
* update torch compile example
* update docs
* update
* clean up if-statements
* address review comments
* add test for vae tiling
* update docs
* update docs
* update docstrings
* add modeling test for cogvideox transformer
* make style
2024-12-23 13:02:15 +05:30
林金鹏
095393a5b8
Support SD3 controlnet inpainting ( #9099 )
...
* add controlnet inpainting pipeline
* [SD3] add controlnet inpaint example
* update example and fix code style
* fix code style with ruff
* Update controlnet_sd3.md : add control inpaint pipeline
* Update docs/source/en/api/pipelines/controlnet_sd3.md
Co-authored-by: Aryan <contact.aryanvs@gmail.com >
* Update docs/source/en/api/pipelines/controlnet_sd3.md
Co-authored-by: Aryan <contact.aryanvs@gmail.com >
* Update docs/source/en/api/pipelines/controlnet_sd3.md
Co-authored-by: Aryan <contact.aryanvs@gmail.com >
* Update src/diffusers/pipelines/controlnet_sd3/pipeline_stable_diffusion_3_controlnet_inpainting.py
Co-authored-by: Aryan <contact.aryanvs@gmail.com >
* Update __init__.py : add sd3 control pipelines
* Update pipeline : add new param doc & check input reference.
* fix typo
* make style & make quality
* add unittest for sd3 controlnet inpaint
---------
Co-authored-by: 鹏徙 <linjinpeng.ljp@alibaba-inc.com >
Co-authored-by: Aryan <contact.aryanvs@gmail.com >
2024-12-23 13:02:15 +05:30
David Steinberg
d8d8e86924
Fix a dead link ( #9116 )
...
Co-authored-by: Aryan <aryan@huggingface.co >
2024-12-23 13:02:15 +05:30
zR
dbf5d348e6
Add CogVideoX text-to-video generation model ( #9082 )
...
* add CogVideoX
---------
Co-authored-by: Aryan <aryan@huggingface.co >
Co-authored-by: sayakpaul <spsayakpaul@gmail.com >
Co-authored-by: Aryan <contact.aryanvs@gmail.com >
Co-authored-by: yiyixuxu <yixu310@gmail.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2024-12-23 13:02:15 +05:30
latentCall145
f771be1d7b
Flux fp16 inference fix ( #9097 )
...
* clipping for fp16
* fix typo
* added fp16 inference to docs
* fix docs typo
* include link for fp16 investigation
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-12-23 13:02:15 +05:30
Álvaro Somoza
3510d0ef5e
[Kolors] Add PAG ( #8934 )
...
* txt2img pag added
* autopipe added, fixed case
* style
* apply suggestions
* added fast tests, added todo tests
* revert dummy objects for kolors
* fix pag dummies
* fix test imports
* update pag tests
* add kolor pag to docs
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-12-23 13:02:15 +05:30
Dhruv Nair
47874e837d
[Single File] Add single file support for Flux Transformer ( #9083 )
...
* update
* update
* update
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-12-23 13:02:15 +05:30
Ahn Donghoon (안동훈 / suno)
f25823781d
add PAG support for Stable Diffusion 3 ( #8861 )
...
add pag sd3
---------
Co-authored-by: HyoungwonCho <jhw9811@korea.ac.kr >
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
Co-authored-by: crepejung00 <jaewoojung00@naver.com >
Co-authored-by: YiYi Xu <yixu310@gmail.com >
Co-authored-by: Aryan <contact.aryanvs@gmail.com >
Co-authored-by: Aryan <aryan@huggingface.co >
2024-12-23 13:02:15 +05:30
Dhruv Nair
faa0826328
update
2024-12-23 13:02:15 +05:30
Sayak Paul
8881fc9872
[Docs] add stable cascade unet doc. ( #9066 )
...
* add stable cascade unet doc.
* fix path
2024-12-23 13:02:15 +05:30
Aryan
9dbffc8c60
PAG variant for HunyuanDiT, PAG refactor ( #8936 )
...
* copy hunyuandit pipeline
* pag variant of hunyuan dit
* add tests
* update docs
* make style
* make fix-copies
* Update src/diffusers/pipelines/pag/pag_utils.py
* remove incorrect copied from
* remove pag hunyuan attn procs to resolve conflicts
* add pag attn procs again
* new implementation for pag_utils
* revert pag changes
* add pag refactor back; update pixart sigma
* update pixart pag tests
* apply suggestions from review
Co-Authored-By: yixu310@gmail.com
* make style
* update docs, fix tests
* fix tests
* fix test_components_function since list not accepted as valid __init__ param
* apply patch to fix broken tests
Co-Authored-By: Sayak Paul <spsayakpaul@gmail.com >
* make style
* fix hunyuan tests
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-12-23 13:02:15 +05:30
Sayak Paul
0db81141b9
[Flux] minor documentation fixes for flux. ( #9048 )
...
* minor documentation fixes for flux.
* clipskip
* add gist
2024-12-23 13:02:15 +05:30
Tolga Cangöz
c6ac793955
Errata: Fix typos & \s+$ ( #9008 )
...
* Fix typos
* chore: Fix typos
* chore: Update README.md for promptdiffusion example
* Trim trailing white spaces
* Fix a typo
* update number
* chore: update number
* Trim trailing white space
* Update README.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
* Update README.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2024-12-23 13:02:14 +05:30
Sayak Paul
c8a236ba5c
[Core] Add PAG support for PixArtSigma ( #8921 )
...
* feat: add pixart sigma pag.
* inits.
* fixes
* fix
* remove print.
* copy paste methods to the pixart pag mixin
* fix-copies
* add documentation.
* add tests.
* remove correction file.
* remove pag_applied_layers
* empty
2024-12-23 13:02:14 +05:30
Sayak Paul
7739beb740
Flux pipeline ( #9043 )
...
add flux!
Signed-off-by: Adrien <adrien@huggingface.co >
Co-authored-by: Adrien <adrien.69740@gmail.com >
Co-authored-by: Anatoly Belikov <abelikov@singularitynet.io >
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
Co-authored-by: yiyixuxu <yixu310@gmail.com >
2024-12-23 13:02:14 +05:30
Aryan
e28e5373f9
PAG variant for AnimateDiff ( #8789 )
...
* add animatediff pag pipeline
* remove unnecessary print
* make fix-copies
* fix ip-adapter bug
* update docs
* add fast tests and fix bugs
* update
* update
* address review comments
* update ip adapter single test expected slice
* implement test_from_pipe_consistent_config; fix expected slice values
* LoraLoaderMixin->StableDiffusionLoraLoaderMixin; add latest freeinit test
2024-12-23 13:02:14 +05:30
Aryan
cf513e4205
[core] Move community AnimateDiff ControlNet to core ( #8972 )
...
* add animatediff controlnet to core
* make style; remove unused method
* fix copied from comment
* add tests
* changes to make tests work
* add utility function to load videos
* update docs
* update pipeline example
* make style
* update docs with example
* address review comments
* add latest freeinit test from #8969
* LoraLoaderMixin -> StableDiffusionLoraLoaderMixin
* fix docs
* Update src/diffusers/utils/loading_utils.py
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
* fix: variable out of scope
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
2024-12-23 13:02:14 +05:30
Yoach Lacombe
030a134311
Stable Audio integration ( #8716 )
...
* WIP modeling code and pipeline
* add custom attention processor + custom activation + add to init
* correct ProjectionModel forward
* add stable audio to __init__
* add autoencoder and update pipeline and modeling code
* add half Rope
* add partial rotary v2
* add temporary modifs to scheduler
* add EDM DPM Solver
* remove TODOs
* clean GLU
* remove att.group_norm to attn processor
* revert back src/diffusers/schedulers/scheduling_dpmsolver_multistep.py
* refactor GLU -> SwiGLU
* remove redundant args
* add channel multiples in autoencoder docstrings
* changes in docstrings and copyright headers
* clean pipeline
* further cleaning
* remove peft and lora and fromoriginalmodel
* Delete src/diffusers/pipelines/stable_audio/diffusers.code-workspace
* make style
* dummy models
* fix copied from
* add fast oobleck tests
* add brownian tree
* oobleck autoencoder slow tests
* remove TODO
* fast stable audio pipeline tests
* add slow tests
* make style
* add first version of docs
* wrap is_torchsde_available to the scheduler
* fix slow test
* test with input waveform
* add input waveform
* remove some todos
* create stableaudio gaussian projection + make style
* add pipeline to toctree
* fix copied from
* make quality
* refactor timestep_features->time_proj
* refactor joint_attention_kwargs->cross_attention_kwargs
* remove forward_chunk
* move StableAudioDitModel to transformers folder
* correct convert + remove partial rotary embed
* apply suggestions from yiyixuxu -> removing attn.kv_heads
* remove temb
* remove cross_attention_kwargs
* further removal of cross_attention_kwargs
* remove text encoder autocast to fp16
* continue removing autocast
* make style
* refactor how text and audio are embedded
* add paper
* update example code
* make style
* unify projection model forward + fix device placement
* make style
* remove fuse qkv
* apply suggestions from review
* Update src/diffusers/pipelines/stable_audio/pipeline_stable_audio.py
Co-authored-by: YiYi Xu <yixu310@gmail.com >
* make style
* smaller models in fast tests
* pass sequential offloading fast tests
* add docs for vae and autoencoder
* make style and update example
* remove useless import
* add cosine scheduler
* dummy classes
* cosine scheduler docs
* better description of scheduler
---------
Co-authored-by: YiYi Xu <yixu310@gmail.com >
2024-12-23 13:02:14 +05:30
Sayak Paul
3566f4b18a
[Docs] credit where it's due for Lumina and Latte. ( #9000 )
...
credit where it's due for Lumina and Latte.
2024-12-23 13:02:14 +05:30
Álvaro Somoza
edddf3d417
[Kolors] Add IP Adapter ( #8901 )
...
* initial draft
* apply suggestions
* fix failing test
* added ipa to img2img
* add docs
* apply suggestions
2024-12-23 13:02:14 +05:30
Aryan
b7ddd2bb99
[core] AnimateDiff SparseCtrl ( #8897 )
...
* initial sparse control model draft
* remove unnecessary implementation
* copy animatediff pipeline
* remove deprecated callbacks
* update
* update pipeline implementation progress
* make style
* make fix-copies
* update progress
* add partially working pipeline
* remove debug prints
* add model docs
* dummy objects
* improve motion lora conversion script
* fix bugs
* update docstrings
* remove unnecessary model params; docs
* address review comment
* add copied from to zero_module
* copy animatediff test
* add fast tests
* update docs
* update
* update pipeline docs
* fix expected slice values
* fix license
* remove get_down_block usage
* remove temporal_double_self_attention from get_down_block
* update
* update docs with org and documentation images
* make from_unet work in sparsecontrolnetmodel
* add latest freeinit test from #8969
* make fix-copies
* LoraLoaderMixin -> StableDiffusionLoraLoaderMixin
2024-12-23 13:02:14 +05:30
Sayak Paul
6d11129c5a
[Chore] add LoraLoaderMixin to the inits ( #8981 )
...
* introduce to promote reusability.
* up
* add more tests
* up
* remove comments.
* fix fuse_nan test
* clarify the scope of fuse_lora and unfuse_lora
* remove space
* rewrite fuse_lora a bit.
* feedback
* copy over load_lora_into_text_encoder.
* address dhruv's feedback.
* fix-copies
* fix issubclass.
* num_fused_loras
* fix
* fix
* remove mapping
* up
* fix
* style
* fix-copies
* change to SD3TransformerLoRALoadersMixin
* Apply suggestions from code review
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
* up
* handle wuerstchen
* up
* move lora to lora_pipeline.py
* up
* fix-copies
* fix documentation.
* comment set_adapters().
* fix-copies
* fix set_adapters() at the model level.
* fix?
* fix
* loraloadermixin.
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
2024-12-23 13:02:14 +05:30
YiYi Xu
a754d9071e
Revert "[LoRA] introduce LoraBaseMixin to promote reusability." ( #8976 )
...
Revert "[LoRA] introduce LoraBaseMixin to promote reusability. (#8774 )"
This reverts commit 527430d0a4 .
2024-12-23 13:02:14 +05:30
Sayak Paul
82b37a4cc3
[LoRA] introduce LoraBaseMixin to promote reusability. ( #8774 )
...
* introduce to promote reusability.
* up
* add more tests
* up
* remove comments.
* fix fuse_nan test
* clarify the scope of fuse_lora and unfuse_lora
* remove space
* rewrite fuse_lora a bit.
* feedback
* copy over load_lora_into_text_encoder.
* address dhruv's feedback.
* fix-copies
* fix issubclass.
* num_fused_loras
* fix
* fix
* remove mapping
* up
* fix
* style
* fix-copies
* change to SD3TransformerLoRALoadersMixin
* Apply suggestions from code review
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
* up
* handle wuerstchen
* up
* move lora to lora_pipeline.py
* up
* fix-copies
* fix documentation.
* comment set_adapters().
* fix-copies
* fix set_adapters() at the model level.
* fix?
* fix
---------
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com >
2024-12-23 13:02:14 +05:30
Aryan
0f2c512fb6
[docs] pipeline docs for latte ( #8844 )
...
* add pipeline docs for latte
* add inference time to latte docs
* apply review suggestions
2024-12-23 13:02:14 +05:30
Nguyễn Công Tú Anh
adcd3682bf
add PAG support sd15 controlnet ( #8820 )
...
* add pag support sd15 controlnet
* fix quality import
* remove unnecessary import
* remove if state
* fix tests
* remove useless function
* add sd1.5 controlnet pag docs
---------
Co-authored-by: anhnct8 <anhnct8@fpt.com >
2024-12-23 13:02:14 +05:30
Sayak Paul
0ace726d8a
[Docs] add AuraFlow docs ( #8851 )
...
* add pipeline documentation.
* add api spec for pipeline
* model documentation
* model spec
2024-12-23 13:02:14 +05:30
Dhruv Nair
c166a0a90d
Add single file loading support for AnimateDiff ( #8819 )
...
* update
* update
* update
* update
2024-12-23 13:02:14 +05:30
Álvaro Somoza
1028de9d9d
[Core] Add Kolors ( #8812 )
...
* initial draft
2024-12-23 13:02:14 +05:30
Xin Ma
b8742cb946
Latte: Latent Diffusion Transformer for Video Generation ( #8404 )
...
* add Latte to diffusers
* remove print
* remove print
* remove print
* remove unused code
* remove layer_norm_latte and add a flag
* remove layer_norm_latte and add a flag
* update latte_pipeline
* update latte_pipeline
* remove unused squeeze
* add norm_hidden_states.ndim == 2: # for Latte
* fixed test latte pipeline bugs
* fixed test latte pipeline bugs
* delete sh
* add doc for latte
* add licensing
* Move Transformer3DModelOutput to modeling_outputs
* give a default value to sample_size
* remove the einops dependency
* change norm2 for latte
* modify pipeline of latte
* update test for Latte
* modify some codes for latte
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* modify for Latte pipeline
* video_length -> num_frames; update prepare_latents copied from
* make fix-copies
* make style
* typo: videe -> video
* update
* modify for Latte pipeline
* modify latte pipeline
* modify latte pipeline
* modify latte pipeline
* modify latte pipeline
* modify for Latte pipeline
* Delete .vscode directory
* make style
* make fix-copies
* add latte transformer 3d to docs _toctree.yml
* update example
* reduce frames for test
* fixed bug of _text_preprocessing
* set num frame to 1 for testing
* remove unused print
* add text = self._clean_caption(text) again
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
Co-authored-by: YiYi Xu <yixu310@gmail.com >
Co-authored-by: Aryan <contact.aryanvs@gmail.com >
Co-authored-by: Aryan <aryan@huggingface.co >
2024-12-23 13:02:14 +05:30
PommesPeter
2256ec51ff
[Alpha-VLLM Team] Add Lumina-T2X to diffusers ( #8652 )
...
---------
Co-authored-by: zhuole1025 <zhuole1025@gmail.com >
Co-authored-by: YiYi Xu <yixu310@gmail.com >
2024-12-23 13:02:14 +05:30
Sayak Paul
11c5f6bfcd
Revert "[LoRA] introduce LoraBaseMixin to promote reusability." ( #8773 )
...
Revert "[LoRA] introduce `LoraBaseMixin` to promote reusability. (#8670 )"
This reverts commit a2071a1837 .
2024-12-23 13:02:14 +05:30
Sayak Paul
2686552727
[LoRA] introduce LoraBaseMixin to promote reusability. ( #8670 )
...
* introduce to promote reusability.
* up
* add more tests
* up
* remove comments.
* fix fuse_nan test
* clarify the scope of fuse_lora and unfuse_lora
* remove space
2024-12-23 13:02:13 +05:30
Dhruv Nair
a039005206
Remove legacy single file model loading mixins ( #8754 )
...
update
2024-12-23 13:02:13 +05:30
YiYi Xu
ace869b5ac
[doc] add a tip about using SDXL refiner with hunyuan-dit and pixart ( #8735 )
...
* up
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2024-12-23 13:02:13 +05:30
Shauray Singh
5f10c18270
add PAG support for SD architecture ( #8725 )
...
* add pag to sd pipelines
2024-12-23 13:02:13 +05:30
Sayak Paul
23145f2d9b
[Chore] remove deprecation from transformer2d regarding the output class. ( #8698 )
...
* remove deprecation from transformer2d regarding the output class.
* up
* deprecate more
2024-12-23 13:02:13 +05:30
XCL
f488493082
[Tencent Hunyuan Team] Add Hunyuan-DiT ControlNet Inference ( #8694 )
...
* add controlnet support
---------
Co-authored-by: xingchaoliu <xingchaoliu@tencent.com >
Co-authored-by: yiyixuxu <yixu310@gmail.com>
2024-12-23 13:02:13 +05:30
Álvaro Somoza
89a6943efc
[Docs] SD3 T5 Token limit doc ( #8654 )
...
* doc for max_sequence_length
* better position and changed note to tip
* apply suggestions
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-12-23 13:02:13 +05:30
YiYi Xu
5efc438c7e
add PAG support ( #7944 )
...
* first draft
---------
Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: Junhwa Song <ethan9867@gmail.com >
Co-authored-by: Ahn Donghoon (안동훈 / suno) <suno.vivid@gmail.com >
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com >
2024-12-23 13:02:13 +05:30
Steven Liu
473acb9579
[docs] Add note for float8 ( #8685 )
...
add note
2024-12-23 13:02:13 +05:30
Tolga Cangöz
f6172748c6
Errata - Fix typos and improve style ( #8571 )
...
* Fix typos
* Fix typos & up style
* chore: Update numbers
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-12-23 13:02:13 +05:30
Tolga Cangöz
1ced1c40d8
Discourage using deprecated revision parameter ( #8573 )
...
* Discourage using `revision`
* `make style && make quality`
* Refactor code to use 'variant' instead of 'revision'
* `revision="bf16"` -> `variant="bf16"`
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-12-23 13:02:13 +05:30
Tolga Cangöz
2c56360222
Errata - Trim trailing white space in the whole repo ( #8575 )
...
* Trim all the trailing white space in the whole repo
* Remove unnecessary empty places
* make style && make quality
* Trim trailing white space
* trim
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com >
2024-12-23 13:02:13 +05:30
Sayak Paul
22ebb0ab89
[LoRA] get rid of the legacy lora remnants and make our codebase lighter ( #8623 )
...
* get rid of the legacy lora remnants and make our codebase lighter
* fix deprecated lora argument
* fix
* empty commit to trigger ci
* remove print
* empty
2024-12-23 13:02:13 +05:30
王奇勋
86da4dcf8e
Support SD3 ControlNet and Multi-ControlNet. ( #8566 )
...
* sd3 controlnet
---------
Co-authored-by: haofanwang <haofanwang.ai@gmail.com >
2024-12-23 13:02:13 +05:30
Vasco Ramos
410eb1ad86
[SD3 Docs] Corrected title about loading model with T5 "without" -> "with" ( #8602 )
...
[SD3 Docs] Corrected title about loading model with T5
Corrected the documentation title to "Loading the single file checkpoint with T5". Previously, it incorrectly stated "Loading the single file checkpoint without T5", which contradicted the code snippet showing how to load the SD3 checkpoint with the T5 model.
2024-12-23 13:02:13 +05:30
Sayak Paul
db74292bb3
[Core] Add shift_factor to SD3 tiny autoencoder ( #8618 )
...
* shift factor argument to tiny
* remove shift factor rejigging from the sd3 docs
2024-12-23 13:02:13 +05:30