diffusers

mirror of https://github.com/huggingface/diffusers.git synced 2026-02-15 23:37:08 +08:00

Author	SHA1	Message	Date
satani99	8a17331c29	Add StableDiffusionXLControlNetPAGImg2ImgPipeline (#8990 ) * Added pad controlnet sdxl img2img pipeline --------- Co-authored-by: YiYi Xu <yixu310@gmail.com>	2024-12-23 13:02:15 +05:30
Tolga Cangöz	259b12a077	[`Docs`] Fix CPU offloading usage (#9207 ) * chore: Fix cpu offloading usage * Trim trailing white space * docs: update Kolors model link in kolors.md	2024-12-23 13:02:15 +05:30
Aryan	0f74c69416	[refactor] CogVideoX followups + tiled decoding support (#9150 ) * refactor context parallel cache; update torch compile time benchmark * add tiling support * make style * remove num_frames % 8 == 0 requirement * update default num_frames to original value * add explanations + refactor * update torch compile example * update docs * update * clean up if-statements * address review comments * add test for vae tiling * update docs * update docs * update docstrings * add modeling test for cogvideox transformer * make style	2024-12-23 13:02:15 +05:30
林金鹏	095393a5b8	Support SD3 controlnet inpainting (#9099 ) * add controlnet inpainting pipeline * [SD3] add controlnet inpaint example * update example and fix code style * fix code style with ruff * Update controlnet_sd3.md : add control inpaint pipeline * Update docs/source/en/api/pipelines/controlnet_sd3.md Co-authored-by: Aryan <contact.aryanvs@gmail.com> * Update docs/source/en/api/pipelines/controlnet_sd3.md Co-authored-by: Aryan <contact.aryanvs@gmail.com> * Update docs/source/en/api/pipelines/controlnet_sd3.md Co-authored-by: Aryan <contact.aryanvs@gmail.com> * Update src/diffusers/pipelines/controlnet_sd3/pipeline_stable_diffusion_3_controlnet_inpainting.py Co-authored-by: Aryan <contact.aryanvs@gmail.com> * Update __init__.py : add sd3 control pipelines * Update pipeline : add new param doc & check input reference. * fix typo * make style & make quality * add unittest for sd3 controlnet inpaint --------- Co-authored-by: 鹏徙 <linjinpeng.ljp@alibaba-inc.com> Co-authored-by: Aryan <contact.aryanvs@gmail.com>	2024-12-23 13:02:15 +05:30
David Steinberg	d8d8e86924	Fix a dead link (#9116 ) Co-authored-by: Aryan <aryan@huggingface.co>	2024-12-23 13:02:15 +05:30
zR	dbf5d348e6	Add CogVideoX text-to-video generation model (#9082 ) * add CogVideoX --------- Co-authored-by: Aryan <aryan@huggingface.co> Co-authored-by: sayakpaul <spsayakpaul@gmail.com> Co-authored-by: Aryan <contact.aryanvs@gmail.com> Co-authored-by: yiyixuxu <yixu310@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-12-23 13:02:15 +05:30
latentCall145	f771be1d7b	Flux fp16 inference fix (#9097 ) * clipping for fp16 * fix typo * added fp16 inference to docs * fix docs typo * include link for fp16 investigation --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:15 +05:30
Álvaro Somoza	3510d0ef5e	[Kolors] Add PAG (#8934 ) * txt2img pag added * autopipe added, fixed case * style * apply suggestions * added fast tests, added todo tests * revert dummy objects for kolors * fix pag dummies * fix test imports * update pag tests * add kolor pag to docs --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:15 +05:30
Dhruv Nair	47874e837d	[Single File] Add single file support for Flux Transformer (#9083 ) * update * update * update --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:15 +05:30
Ahn Donghoon (안동훈 / suno)	f25823781d	add PAG support for Stable Diffusion 3 (#8861 ) add pag sd3 --------- Co-authored-by: HyoungwonCho <jhw9811@korea.ac.kr> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: crepejung00 <jaewoojung00@naver.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: Aryan <contact.aryanvs@gmail.com> Co-authored-by: Aryan <aryan@huggingface.co>	2024-12-23 13:02:15 +05:30
Dhruv Nair	faa0826328	update	2024-12-23 13:02:15 +05:30
Sayak Paul	8881fc9872	[Docs] add stable cascade unet doc. (#9066 ) * add stable cascade unet doc. * fix path	2024-12-23 13:02:15 +05:30
Aryan	9dbffc8c60	PAG variant for HunyuanDiT, PAG refactor (#8936 ) * copy hunyuandit pipeline * pag variant of hunyuan dit * add tests * update docs * make style * make fix-copies * Update src/diffusers/pipelines/pag/pag_utils.py * remove incorrect copied from * remove pag hunyuan attn procs to resolve conflicts * add pag attn procs again * new implementation for pag_utils * revert pag changes * add pag refactor back; update pixart sigma * update pixart pag tests * apply suggestions from review Co-Authored-By: yixu310@gmail.com * make style * update docs, fix tests * fix tests * fix test_components_function since list not accepted as valid __init__ param * apply patch to fix broken tests Co-Authored-By: Sayak Paul <spsayakpaul@gmail.com> * make style * fix hunyuan tests --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:15 +05:30
Sayak Paul	0db81141b9	[Flux] minor documentation fixes for flux. (#9048 ) * minor documentation fixes for flux. * clipskip * add gist	2024-12-23 13:02:15 +05:30
Tolga Cangöz	c6ac793955	Errata: Fix typos & `\s+$` (#9008 ) * Fix typos * chore: Fix typos * chore: Update README.md for promptdiffusion example * Trim trailing white spaces * Fix a typo * update number * chore: update number * Trim trailing white space * Update README.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> * Update README.md Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-12-23 13:02:14 +05:30
Sayak Paul	c8a236ba5c	[Core] Add PAG support for PixArtSigma (#8921 ) * feat: add pixart sigma pag. * inits. * fixes * fix * remove print. * copy paste methods to the pixart pag mixin * fix-copies * add documentation. * add tests. * remove correction file. * remove pag_applied_layers * empty	2024-12-23 13:02:14 +05:30
Sayak Paul	7739beb740	Flux pipeline (#9043 ) add flux! Signed-off-by: Adrien <adrien@huggingface.co> Co-authored-by: Adrien <adrien.69740@gmail.com> Co-authored-by: Anatoly Belikov <abelikov@singularitynet.io> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: yiyixuxu <yixu310@gmail.com>	2024-12-23 13:02:14 +05:30
Aryan	e28e5373f9	PAG variant for AnimateDiff (#8789 ) * add animatediff pag pipeline * remove unnecessary print * make fix-copies * fix ip-adapter bug * update docs * add fast tests and fix bugs * update * update * address review comments * update ip adapter single test expected slice * implement test_from_pipe_consistent_config; fix expected slice values * LoraLoaderMixin->StableDiffusionLoraLoaderMixin; add latest freeinit test	2024-12-23 13:02:14 +05:30
Aryan	cf513e4205	[core] Move community AnimateDiff ControlNet to core (#8972 ) * add animatediff controlnet to core * make style; remove unused method * fix copied from comment * add tests * changes to make tests work * add utility function to load videos * update docs * update pipeline example * make style * update docs with example * address review comments * add latest freeinit test from #8969 * LoraLoaderMixin -> StableDiffusionLoraLoaderMixin * fix docs * Update src/diffusers/utils/loading_utils.py Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * fix: variable out of scope --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2024-12-23 13:02:14 +05:30
Yoach Lacombe	030a134311	Stable Audio integration (#8716 ) * WIP modeling code and pipeline * add custom attention processor + custom activation + add to init * correct ProjectionModel forward * add stable audio to __initèè * add autoencoder and update pipeline and modeling code * add half Rope * add partial rotary v2 * add temporary modfis to scheduler * add EDM DPM Solver * remove TODOs * clean GLU * remove att.group_norm to attn processor * revert back src/diffusers/schedulers/scheduling_dpmsolver_multistep.py * refactor GLU -> SwiGLU * remove redundant args * add channel multiples in autoencoder docstrings * changes in docsrtings and copyright headers * clean pipeline * further cleaning * remove peft and lora and fromoriginalmodel * Delete src/diffusers/pipelines/stable_audio/diffusers.code-workspace * make style * dummy models * fix copied from * add fast oobleck tests * add brownian tree * oobleck autoencoder slow tests * remove TODO * fast stable audio pipeline tests * add slow tests * make style * add first version of docs * wrap is_torchsde_available to the scheduler * fix slow test * test with input waveform * add input waveform * remove some todos * create stableaudio gaussian projection + make style * add pipeline to toctree * fix copied from * make quality * refactor timestep_features->time_proj * refactor joint_attention_kwargs->cross_attention_kwargs * remove forward_chunk * move StableAudioDitModel to transformers folder * correct convert + remove partial rotary embed * apply suggestions from yiyixuxu -> removing attn.kv_heads * remove temb * remove cross_attention_kwargs * further removal of cross_attention_kwargs * remove text encoder autocast to fp16 * continue removing autocast * make style * refactor how text and audio are embedded * add paper * update example code * make style * unify projection model forward + fix device placement * make style * remove fuse qkv * apply suggestions from review * Update src/diffusers/pipelines/stable_audio/pipeline_stable_audio.py Co-authored-by: YiYi Xu <yixu310@gmail.com> * make style * smaller models in fast tests * pass sequential offloading fast tests * add docs for vae and autoencoder * make style and update example * remove useless import * add cosine scheduler * dummy classes * cosine scheduler docs * better description of scheduler --------- Co-authored-by: YiYi Xu <yixu310@gmail.com>	2024-12-23 13:02:14 +05:30
Sayak Paul	3566f4b18a	[Docs] credit where it's due for Lumina and Latte. (#9000 ) credit where it's due for Lumina and Latte.	2024-12-23 13:02:14 +05:30
Álvaro Somoza	edddf3d417	[Kolors] Add IP Adapter (#8901 ) * initial draft * apply suggestions * fix failing test * added ipa to img2img * add docs * apply suggestions	2024-12-23 13:02:14 +05:30
Aryan	b7ddd2bb99	[core] AnimateDiff SparseCtrl (#8897 ) * initial sparse control model draft * remove unnecessary implementation * copy animatediff pipeline * remove deprecated callbacks * update * update pipeline implementation progress * make style * make fix-copies * update progress * add partially working pipeline * remove debug prints * add model docs * dummy objects * improve motion lora conversion script * fix bugs * update docstrings * remove unnecessary model params; docs * address review comment * add copied from to zero_module * copy animatediff test * add fast tests * update docs * update * update pipeline docs * fix expected slice values * fix license * remove get_down_block usage * remove temporal_double_self_attention from get_down_block * update * update docs with org and documentation images * make from_unet work in sparsecontrolnetmodel * add latest freeinit test from #8969 * make fix-copies * LoraLoaderMixin -> StableDiffsuionLoraLoaderMixin	2024-12-23 13:02:14 +05:30
Sayak Paul	6d11129c5a	[Chore] add `LoraLoaderMixin` to the inits (#8981 ) * introduce to promote reusability. * up * add more tests * up * remove comments. * fix fuse_nan test * clarify the scope of fuse_lora and unfuse_lora * remove space * rewrite fuse_lora a bit. * feedback * copy over load_lora_into_text_encoder. * address dhruv's feedback. * fix-copies * fix issubclass. * num_fused_loras * fix * fix * remove mapping * up * fix * style * fix-copies * change to SD3TransformerLoRALoadersMixin * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * up * handle wuerstchen * up * move lora to lora_pipeline.py * up * fix-copies * fix documentation. * comment set_adapters(). * fix-copies * fix set_adapters() at the model level. * fix? * fix * loraloadermixin. --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2024-12-23 13:02:14 +05:30
YiYi Xu	a754d9071e	Revert "[LoRA] introduce LoraBaseMixin to promote reusability." (#8976 ) Revert "[LoRA] introduce LoraBaseMixin to promote reusability. (#8774)" This reverts commit `527430d0a4`.	2024-12-23 13:02:14 +05:30
Sayak Paul	82b37a4cc3	[LoRA] introduce LoraBaseMixin to promote reusability. (#8774 ) * introduce to promote reusability. * up * add more tests * up * remove comments. * fix fuse_nan test * clarify the scope of fuse_lora and unfuse_lora * remove space * rewrite fuse_lora a bit. * feedback * copy over load_lora_into_text_encoder. * address dhruv's feedback. * fix-copies * fix issubclass. * num_fused_loras * fix * fix * remove mapping * up * fix * style * fix-copies * change to SD3TransformerLoRALoadersMixin * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * up * handle wuerstchen * up * move lora to lora_pipeline.py * up * fix-copies * fix documentation. * comment set_adapters(). * fix-copies * fix set_adapters() at the model level. * fix? * fix --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2024-12-23 13:02:14 +05:30
Aryan	0f2c512fb6	[docs] pipeline docs for latte (#8844 ) * add pipeline docs for latte * add inference time to latte docs * apply review suggestions	2024-12-23 13:02:14 +05:30
Nguyễn Công Tú Anh	adcd3682bf	add PAG support sd15 controlnet (#8820 ) * add pag support sd15 controlnet * fix quality import * remove unecessary import * remove if state * fix tests * remove useless function * add sd1.5 controlnet pag docs --------- Co-authored-by: anhnct8 <anhnct8@fpt.com>	2024-12-23 13:02:14 +05:30
Sayak Paul	0ace726d8a	[Docs] add AuraFlow docs (#8851 ) * add pipeline documentation. * add api spec for pipeline * model documentation * model spec	2024-12-23 13:02:14 +05:30
Dhruv Nair	c166a0a90d	Add single file loading support for AnimateDiff (#8819 ) * update * update * update * update	2024-12-23 13:02:14 +05:30
Álvaro Somoza	1028de9d9d	[Core] Add Kolors (#8812 ) * initial draft	2024-12-23 13:02:14 +05:30
Xin Ma	b8742cb946	Latte: Latent Diffusion Transformer for Video Generation (#8404 ) * add Latte to diffusers * remove print * remove print * remove print * remove unuse codes * remove layer_norm_latte and add a flag * remove layer_norm_latte and add a flag * update latte_pipeline * update latte_pipeline * remove unuse squeeze * add norm_hidden_states.ndim == 2: # for Latte * fixed test latte pipeline bugs * fixed test latte pipeline bugs * delete sh * add doc for latte * add licensing * Move Transformer3DModelOutput to modeling_outputs * give a default value to sample_size * remove the einops dependency * change norm2 for latte * modify pipeline of latte * update test for Latte * modify some codes for latte * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * modify for Latte pipeline * video_length -> num_frames; update prepare_latents copied from * make fix-copies * make style * typo: videe -> video * update * modify for Latte pipeline * modify latte pipeline * modify latte pipeline * modify latte pipeline * modify latte pipeline * modify for Latte pipeline * Delete .vscode directory * make style * make fix-copies * add latte transformer 3d to docs _toctree.yml * update example * reduce frames for test * fixed bug of _text_preprocessing * set num frame to 1 for testing * remove unuse print * add text = self._clean_caption(text) again --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: YiYi Xu <yixu310@gmail.com> Co-authored-by: Aryan <contact.aryanvs@gmail.com> Co-authored-by: Aryan <aryan@huggingface.co>	2024-12-23 13:02:14 +05:30
PommesPeter	2256ec51ff	[Alpha-VLLM Team] Add Lumina-T2X to diffusers (#8652 ) --------- Co-authored-by: zhuole1025 <zhuole1025@gmail.com> Co-authored-by: YiYi Xu <yixu310@gmail.com>	2024-12-23 13:02:14 +05:30
Sayak Paul	11c5f6bfcd	Revert "[LoRA] introduce `LoraBaseMixin` to promote reusability." (#8773 ) Revert "[LoRA] introduce `LoraBaseMixin` to promote reusability. (#8670)" This reverts commit `a2071a1837`.	2024-12-23 13:02:14 +05:30
Sayak Paul	2686552727	[LoRA] introduce `LoraBaseMixin` to promote reusability. (#8670 ) * introduce to promote reusability. * up * add more tests * up * remove comments. * fix fuse_nan test * clarify the scope of fuse_lora and unfuse_lora * remove space	2024-12-23 13:02:13 +05:30
Dhruv Nair	a039005206	Remove legacy single file model loading mixins (#8754 ) update	2024-12-23 13:02:13 +05:30
YiYi Xu	ace869b5ac	[doc] add a tip about using SDXL refiner with hunyuan-dit and pixart (#8735 ) * up * Apply suggestions from code review Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com> --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-12-23 13:02:13 +05:30
Shauray Singh	5f10c18270	add PAG support for SD architecture (#8725 ) * add pag to sd pipelines	2024-12-23 13:02:13 +05:30
Sayak Paul	23145f2d9b	[Chore] remove deprecation from transformer2d regarding the output class. (#8698 ) * remove deprecation from transformer2d regarding the output class. * up * deprecate more	2024-12-23 13:02:13 +05:30
XCL	f488493082	[Tencent Hunyuan Team] Add Hunyuan-DiT ControlNet Inference (#8694 ) * add controlnet support --------- Co-authored-by: xingchaoliu <xingchaoliu@tencent.com> Co-authored-by: yiyixuxu <yixu310@gmail,com>	2024-12-23 13:02:13 +05:30
Álvaro Somoza	89a6943efc	[Docs] SD3 T5 Token limit doc (#8654 ) * doc for max_sequence_length * better position and changed note to tip * apply suggestions --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:13 +05:30
YiYi Xu	5efc438c7e	add PAG support (#7944 ) * first draft --------- Co-authored-by: yiyixuxu <yixu310@gmail,com> Co-authored-by: Junhwa Song <ethan9867@gmail.com> Co-authored-by: Ahn Donghoon (안동훈 / suno) <suno.vivid@gmail.com> Co-authored-by: Sayak Paul <spsayakpaul@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2024-12-23 13:02:13 +05:30
Steven Liu	473acb9579	[docs] Add note for float8 (#8685 ) add note	2024-12-23 13:02:13 +05:30
Tolga Cangöz	f6172748c6	Errata - Fix typos and improve style (#8571 ) * Fix typos * Fix typos & up style * chore: Update numbers --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:13 +05:30
Tolga Cangöz	1ced1c40d8	Discourage using deprecated `revision` parameter (#8573 ) * Discourage using `revision` * `make style && make quality` * Refactor code to use 'variant' instead of 'revision' * `revision="bf16"` -> `variant="bf16"` --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:13 +05:30
Tolga Cangöz	2c56360222	Errata - Trim trailing white space in the whole repo (#8575 ) * Trim all the trailing white space in the whole repo * Remove unnecessary empty places * make style && make quality * Trim trailing white space * trim --------- Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>	2024-12-23 13:02:13 +05:30
Sayak Paul	22ebb0ab89	[LoRA] get rid of the legacy lora remnants and make our codebase lighter (#8623 ) * get rid of the legacy lora remnants and make our codebase lighter * fix depcrecated lora argument * fix * empty commit to trigger ci * remove print * empty	2024-12-23 13:02:13 +05:30
王奇勋	86da4dcf8e	Support SD3 ControlNet and Multi-ControlNet. (#8566 ) * sd3 controlnet --------- Co-authored-by: haofanwang <haofanwang.ai@gmail.com>	2024-12-23 13:02:13 +05:30
Vasco Ramos	410eb1ad86	[SD3 Docs] Corrected title about loading model with T5 "without" -> "with" (#8602 ) [SD3 Docs] Corrected title about loading model with T5 Corrected the documentation title to "Loading the single file checkpoint with T5" Previously, it incorrectly stated "Loading the single file checkpoint without T5" which contradicted the code snippet showing how to load the SD3 checkpoint with the T5 model	2024-12-23 13:02:13 +05:30
Sayak Paul	db74292bb3	[Core] Add `shift_factor` to SD3 tiny autoencoder (#8618 ) * shift factor argument to tiny * remove shift factor rejigging from the sd3 docs	2024-12-23 13:02:13 +05:30

1 2 3 4 5 ...

339 Commits