mirror of https://github.com/huggingface/diffusers.git synced 2025-12-11 06:54:32 +08:00

UmerHA e192ae08d3 Add ControlNet-XS support (#5827)

* Check in 23-10-05

* check-in 23-10-06

* check-in 23-10-07 2pm

* check-in 23-10-08

* check-in 231009T1200

* check-in 230109

* checkin 231010

* init + forward run

* checkin

* checkin

* ControlNetXSModel is now saveable+loadable

* Forward works

* checkin

* Pipeline works with `no_control=True`

* checkin

* debug: save intermediate outputs of resnet

* checkin

* Understood time error + fixed connection error

* checkin

* checkin 231106T1600

* turned off detailled debug prints

* time debug logs

* small fix

* Separated control_scale for connections/time

* simplified debug logging

* Full denoising works with control scale = 0

* aligned logs

* Added control_attention_head_dim param

* Passing n_heads instead of dim_head into ctrl unet

* Fixed ctrl midblock bug

* Cleanup

* Fixed time dtype bug

* checkin

* 1. from_unet, 2. base passed, 3. all unet params

* checkin

* Finished docstrings

* cleanup

* make style

* checkin

* more tests pass

* Fixed tests

* removed debug logs

* make style + quality

* make fix-copies

* fixed documentation

* added cnxs to doc toc

* added control start/end param

* Update controlnetxs_sdxl.md

* tried to fix copies..

* Fixed norm_num_groups in from_unet

* added sdxl-depth test

* created SD2.1 controlnet-xs pipeline

* re-added debug logs

* Adjusting group norm ; readded logs

* Added debug log statements

* removed debug logs ; started tests for sd2.1

* updated sd21 tests

* fixed tests

* fixed tests

* slightly increased error tolerance for 1 test

* make style & quality

* Added docs for CNXS-SD

* make fix-copies

* Fixed sd compile test ; fixed gradient ckpointing

* vae downs = cnxs conditioning downs; removed guess

* make style & quality

* Fixed tests

* fixed test

* Incorporated review feedback

* simplified control model surgery

* fixed tests & make style / quality

* Updated docs; deleted pip & cursor files

* Rolled back minimal change to resnet

* Update resnet.py

* Update resnet.py

* Update src/diffusers/models/controlnetxs.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Update src/diffusers/models/controlnetxs.py

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>

* Incorporated review feedback

* Update docs/source/en/api/pipelines/controlnetxs_sdxl.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/controlnetxs.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/controlnetxs.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/controlnetxs.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/controlnetxs.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/models/controlnetxs.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/controlnet_xs/pipeline_controlnet_xs.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update docs/source/en/api/pipelines/controlnetxs.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Update src/diffusers/pipelines/controlnet_xs/pipeline_controlnet_xs_sd_xl.py

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* Incorporated doc feedback

---------

Co-authored-by: Patrick von Platen <patrick.v.platen@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

2023-12-06 23:33:47 +01:00

Pipelines

Pipelines provide a simple way to run state-of-the-art diffusion models for inference by bundling all of the necessary components (multiple independently trained models, schedulers, and processors) into a single end-to-end class. Pipelines are flexible: they can be adapted to use different schedulers or even different model components.
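To make the bundling idea concrete, here is a minimal, library-free sketch. It is not 🤗 Diffusers code: `ToyModel`, `ToyPipeline`, `FullStepScheduler`, and `HalfStepScheduler` are hypothetical names invented for illustration, and the "denoising" arithmetic is deliberately trivial. It only shows how a single class can hold a model and a scheduler, and how the scheduler can be swapped without touching the model.

```python
from dataclasses import dataclass, replace
from typing import Protocol


class Scheduler(Protocol):
    """Anything with a `step` method can act as a scheduler in this toy."""

    def step(self, sample: float, model_output: float) -> float: ...


@dataclass(frozen=True)
class FullStepScheduler:
    """Hypothetical scheduler: removes all of the predicted noise per step."""

    def step(self, sample: float, model_output: float) -> float:
        return sample - model_output


@dataclass(frozen=True)
class HalfStepScheduler:
    """Hypothetical scheduler: removes half of the predicted noise per step."""

    def step(self, sample: float, model_output: float) -> float:
        return sample - 0.5 * model_output


@dataclass(frozen=True)
class ToyModel:
    """Hypothetical 'denoiser' that predicts half of the sample as noise."""

    def __call__(self, sample: float) -> float:
        return 0.5 * sample


@dataclass(frozen=True)
class ToyPipeline:
    """Bundles a model and a scheduler behind one end-to-end call."""

    model: ToyModel
    scheduler: Scheduler

    def __call__(self, sample: float, num_steps: int = 2) -> float:
        for _ in range(num_steps):
            sample = self.scheduler.step(sample, self.model(sample))
        return sample


pipe = ToyPipeline(model=ToyModel(), scheduler=FullStepScheduler())
# Swap only the scheduler; the model component is reused as-is.
slow_pipe = replace(pipe, scheduler=HalfStepScheduler())
```

Calling `pipe(1.0)` and `slow_pipe(1.0)` runs the same model under two different update rules, which is the kind of flexibility described above.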

All pipelines are built from the base [DiffusionPipeline] class, which provides basic functionality for loading, downloading, and saving all the components. When a checkpoint is loaded with [~DiffusionPipeline.from_pretrained], the specific pipeline type (for example [StableDiffusionPipeline]) is detected automatically, and the pipeline components are loaded and passed to the __init__ function of that pipeline.

You shouldn't use the [DiffusionPipeline] class for training. Individual components (for example, [UNet2DModel] and [UNet2DConditionModel]) of diffusion pipelines are usually trained individually, so we suggest directly working with them instead.

Pipelines do not offer any training functionality. The [~DiffusionPipeline.__call__] method is decorated with torch.no_grad, which disables PyTorch's autograd for the whole call, because pipelines are meant for inference only. If you're interested in training, please take a look at the Training guides instead!
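The effect of decorating `__call__` this way can be illustrated without PyTorch. The sketch below is a toy analogue, not PyTorch's implementation: `toy_no_grad`, `is_grad_enabled`, and `NoGradPipeline` are hypothetical names, and a plain module-level flag stands in for autograd's internal state. It shows the general pattern of a context-manager decorator that switches tracking off when the method is entered and restores the previous state when it returns.

```python
from contextlib import ContextDecorator

# Toy stand-in for the autograd "gradients are being tracked" state.
_GRAD_ENABLED = True


def is_grad_enabled() -> bool:
    return _GRAD_ENABLED


class toy_no_grad(ContextDecorator):
    """Usable as a context manager or as a decorator, like a no-grad guard."""

    def __enter__(self):
        global _GRAD_ENABLED
        self._previous = _GRAD_ENABLED  # remember the state to restore later
        _GRAD_ENABLED = False
        return self

    def __exit__(self, *exc_info):
        global _GRAD_ENABLED
        _GRAD_ENABLED = self._previous
        return False  # do not swallow exceptions


class NoGradPipeline:
    @toy_no_grad()
    def __call__(self) -> bool:
        # Everything executed inside the call sees tracking disabled.
        return is_grad_enabled()


tracking_inside_call = NoGradPipeline()()
tracking_after_call = is_grad_enabled()
```

Here `tracking_inside_call` is `False` while `tracking_after_call` is `True`: tracking is off only for the duration of the decorated call, which is why nothing computed inside it can be used for a backward pass.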

The table below lists all the pipelines currently available in 🤗 Diffusers and the tasks they support. Click on a pipeline to view its abstract and published paper.

Pipeline	Tasks
AltDiffusion	image2image
AnimateDiff	text2video
Attend-and-Excite	text2image
Audio Diffusion	image2audio
AudioLDM	text2audio
AudioLDM2	text2audio
BLIP Diffusion	text2image
Consistency Models	unconditional image generation
ControlNet	text2image, image2image, inpainting
ControlNet with Stable Diffusion XL	text2image
ControlNet-XS	text2image
ControlNet-XS with Stable Diffusion XL	text2image
Cycle Diffusion	image2image
Dance Diffusion	unconditional audio generation
DDIM	unconditional image generation
DDPM	unconditional image generation
DeepFloyd IF	text2image, image2image, inpainting, super-resolution
DiffEdit	inpainting
DiT	text2image
GLIGEN	text2image
InstructPix2Pix	image editing
Kandinsky 2.1	text2image, image2image, inpainting, interpolation
Kandinsky 2.2	text2image, image2image, inpainting
Kandinsky 3	text2image, image2image
Latent Consistency Models	text2image
Latent Diffusion	text2image, super-resolution
LDM3D	text2image, text-to-3D, text-to-pano, upscaling
MultiDiffusion	text2image
MusicLDM	text2audio
Paint by Example	inpainting
ParaDiGMS	text2image
Pix2Pix Zero	image editing
PixArt-α	text2image
PNDM	unconditional image generation
RePaint	inpainting
Score SDE VE	unconditional image generation
Self-Attention Guidance	text2image
Semantic Guidance	text2image
Shap-E	text-to-3D, image-to-3D
Spectrogram Diffusion
Stable Diffusion	text2image, image2image, depth2image, inpainting, image variation, latent upscaler, super-resolution
Stable Diffusion Model Editing	model editing
Stable Diffusion XL	text2image, image2image, inpainting
Stable Diffusion XL Turbo	text2image, image2image, inpainting
Stable unCLIP	text2image, image variation
Stochastic Karras VE	unconditional image generation
T2I-Adapter	text2image
Text2Video	text2video, video2video
Text2Video-Zero	text2video
unCLIP	text2image, image variation
Unconditional Latent Diffusion	unconditional image generation
UniDiffuser	text2image, image2text, image variation, text variation, unconditional image generation, unconditional audio generation
Value-guided planning	value guided sampling
Versatile Diffusion	text2image, image variation
VQ Diffusion	text2image
Wuerstchen	text2image

DiffusionPipeline

autodoc DiffusionPipeline
- all
- __call__
- device
- to
- components

FlaxDiffusionPipeline

autodoc pipelines.pipeline_flax_utils.FlaxDiffusionPipeline

PushToHubMixin

autodoc utils.PushToHubMixin
