mirror of
https://github.com/huggingface/diffusers.git
synced 2026-02-13 06:15:31 +08:00
update create pipeline section
This commit is contained in:
@@ -12,27 +12,28 @@ specific language governing permissions and limitations under the License.
|
||||
|
||||
# ModularPipeline
|
||||
|
||||
[`ModularPipeline`] converts [`~modular_pipelines.ModularPipelineBlocks`]'s into an executable pipeline that loads models and performs the computation steps defined in the block. It is the main interface for running a pipeline and it is very similar to the [`DiffusionPipeline`] API.
|
||||
[`ModularPipeline`] converts [`~modular_pipelines.ModularPipelineBlocks`] into an executable pipeline that loads models and performs the computation steps defined in the blocks. It is the main interface for running a pipeline and the API is very similar to [`DiffusionPipeline`] but with a few key differences.
|
||||
|
||||
The main difference is to include an expected `output` argument in the pipeline.
|
||||
**Loading is lazy.** With [`DiffusionPipeline`], [`~DiffusionPipeline.from_pretrained`] creates the pipeline and loads all models at the same time. With [`ModularPipeline`], creating and loading are two separate steps: [`~ModularPipeline.from_pretrained`] reads the configuration and knows where to load each component from, but doesn't actually load the model weights. You load the models later with [`~ModularPipeline.load_components`], which is where you pass loading arguments like `torch_dtype` and `quantization_config`.
|
||||
|
||||
**Two ways to create a pipeline.** You can use [`~ModularPipeline.from_pretrained`] with an existing diffusers model repository — it automatically maps to the default pipeline blocks and then converts to a [`ModularPipeline`] with no extra setup. Currently supported models include SDXL, Wan, Qwen, Z-Image, Flux, and Flux2. You can also assemble your own pipeline from [`ModularPipelineBlocks`] and convert it with the [`~ModularPipelineBlocks.init_pipeline`] method (see [Creating a pipeline](#creating-a-pipeline) for more details).
|
||||
|
||||
**Running the pipeline is the same.** Once loaded, you call the pipeline with the same arguments you're used to. A single [`ModularPipeline`] can support multiple workflows (text-to-image, image-to-image, inpainting, etc.) when the pipeline blocks use [`AutoPipelineBlocks`](./auto_pipeline) to automatically select the workflow based on your inputs.
|
||||
|
||||
Below are complete examples for text-to-image, image-to-image, and inpainting with SDXL.
|
||||
|
||||
<hfoptions id="example">
|
||||
<hfoption id="text-to-image">
|
||||
|
||||
```py
|
||||
import torch
|
||||
from diffusers.modular_pipelines import SequentialPipelineBlocks
|
||||
from diffusers.modular_pipelines.stable_diffusion_xl import TEXT2IMAGE_BLOCKS
|
||||
|
||||
blocks = SequentialPipelineBlocks.from_blocks_dict(TEXT2IMAGE_BLOCKS)
|
||||
|
||||
modular_repo_id = "YiYiXu/modular-loader-t2i-0704"
|
||||
pipeline = blocks.init_pipeline(modular_repo_id)
|
||||
from diffusers import ModularPipeline
|
||||
|
||||
pipeline = ModularPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
|
||||
pipeline.load_components(torch_dtype=torch.float16)
|
||||
pipeline.to("cuda")
|
||||
|
||||
image = pipeline(prompt="Astronaut in a jungle, cold color palette, muted colors, detailed, 8k", output="images")[0]
|
||||
image = pipeline(prompt="Astronaut in a jungle, cold color palette, muted colors, detailed, 8k").images[0]
|
||||
image.save("modular_t2i_out.png")
|
||||
```
|
||||
|
||||
@@ -41,21 +42,17 @@ image.save("modular_t2i_out.png")
|
||||
|
||||
```py
|
||||
import torch
|
||||
from diffusers.modular_pipelines import SequentialPipelineBlocks
|
||||
from diffusers.modular_pipelines.stable_diffusion_xl import IMAGE2IMAGE_BLOCKS
|
||||
|
||||
blocks = SequentialPipelineBlocks.from_blocks_dict(IMAGE2IMAGE_BLOCKS)
|
||||
|
||||
modular_repo_id = "YiYiXu/modular-loader-t2i-0704"
|
||||
pipeline = blocks.init_pipeline(modular_repo_id)
|
||||
from diffusers import ModularPipeline
|
||||
from diffusers.utils import load_image
|
||||
|
||||
pipeline = ModularPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
|
||||
pipeline.load_components(torch_dtype=torch.float16)
|
||||
pipeline.to("cuda")
|
||||
|
||||
url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/diffusers/sdxl-text2img.png"
|
||||
init_image = load_image(url)
|
||||
prompt = "a dog catching a frisbee in the jungle"
|
||||
image = pipeline(prompt=prompt, image=init_image, strength=0.8, output="images")[0]
|
||||
image = pipeline(prompt=prompt, image=init_image, strength=0.8).images[0]
|
||||
image.save("modular_i2i_out.png")
|
||||
```
|
||||
|
||||
@@ -64,15 +61,10 @@ image.save("modular_i2i_out.png")
|
||||
|
||||
```py
|
||||
import torch
|
||||
from diffusers.modular_pipelines import SequentialPipelineBlocks
|
||||
from diffusers.modular_pipelines.stable_diffusion_xl import INPAINT_BLOCKS
|
||||
from diffusers import ModularPipeline
|
||||
from diffusers.utils import load_image
|
||||
|
||||
blocks = SequentialPipelineBlocks.from_blocks_dict(INPAINT_BLOCKS)
|
||||
|
||||
modular_repo_id = "YiYiXu/modular-loader-t2i-0704"
|
||||
pipeline = blocks.init_pipeline(modular_repo_id)
|
||||
|
||||
pipeline = ModularPipeline.from_pretrained("stabilityai/stable-diffusion-xl-base-1.0")
|
||||
pipeline.load_components(torch_dtype=torch.float16)
|
||||
pipeline.to("cuda")
|
||||
|
||||
@@ -83,96 +75,160 @@ init_image = load_image(img_url)
|
||||
mask_image = load_image(mask_url)
|
||||
|
||||
prompt = "A deep sea diver floating"
|
||||
image = pipeline(prompt=prompt, image=init_image, mask_image=mask_image, strength=0.85, output="images")[0]
|
||||
image = pipeline(prompt=prompt, image=init_image, mask_image=mask_image, strength=0.85).images[0]
|
||||
image.save("moduar_inpaint_out.png")
|
||||
```
|
||||
|
||||
</hfoption>
|
||||
</hfoptions>
|
||||
|
||||
This guide will show you how to create a [`ModularPipeline`] and manage the components in it.
|
||||
|
||||
## Adding blocks
|
||||
|
||||
Blocks are [`InsertableDict`] objects that can be inserted at specific positions, providing a flexible way to mix-and-match blocks.
|
||||
|
||||
Use [`~modular_pipelines.modular_pipeline_utils.InsertableDict.insert`] on either the block class or `sub_blocks` attribute to add a block.
|
||||
|
||||
```py
|
||||
# BLOCKS is dict of block classes, you need to add class to it
|
||||
BLOCKS.insert("block_name", BlockClass, index)
|
||||
# sub_blocks attribute contains instance, add a block instance to the attribute
|
||||
t2i_blocks.sub_blocks.insert("block_name", block_instance, index)
|
||||
```
|
||||
|
||||
Use [`~modular_pipelines.modular_pipeline_utils.InsertableDict.pop`] on either the block class or `sub_blocks` attribute to remove a block.
|
||||
|
||||
```py
|
||||
# remove a block class from preset
|
||||
BLOCKS.pop("text_encoder")
|
||||
# split out a block instance on its own
|
||||
text_encoder_block = t2i_blocks.sub_blocks.pop("text_encoder")
|
||||
```
|
||||
|
||||
Swap blocks by setting the existing block to the new block.
|
||||
|
||||
```py
|
||||
# Replace block class in preset
|
||||
BLOCKS["prepare_latents"] = CustomPrepareLatents
|
||||
# Replace in sub_blocks attribute using an block instance
|
||||
t2i_blocks.sub_blocks["prepare_latents"] = CustomPrepareLatents()
|
||||
```
|
||||
This guide will show you how to create a [`ModularPipeline`] and manage the components in it, and run it.
|
||||
|
||||
## Creating a pipeline
|
||||
|
||||
There are two ways to create a [`ModularPipeline`]. Assemble and create a pipeline from [`ModularPipelineBlocks`] or load an existing pipeline with [`~ModularPipeline.from_pretrained`].
|
||||
There are two ways to create a [`ModularPipeline`]. Assemble and create a pipeline from [`ModularPipelineBlocks`] with [`~ModularPipelineBlocks.init_pipeline`], or load an existing pipeline with [`~ModularPipeline.from_pretrained`].
|
||||
|
||||
You should also initialize a [`ComponentsManager`] to handle device placement and memory and component management.
|
||||
|
||||
> [!TIP]
|
||||
> Refer to the [ComponentsManager](./components_manager) doc for more details about how it can help manage components across different workflows.
|
||||
|
||||
<hfoptions id="create">
|
||||
<hfoption id="ModularPipelineBlocks">
|
||||
### init_pipeline
|
||||
|
||||
Use the [`~ModularPipelineBlocks.init_pipeline`] method to create a [`ModularPipeline`] from the component and configuration specifications. This method loads the *specifications* from a `modular_model_index.json` file, but it doesn't load the *models* yet.
|
||||
[`~ModularPipelineBlocks.init_pipeline`] converts any [`ModularPipelineBlocks`] into a [`ModularPipeline`].
|
||||
|
||||
Let's define a minimal block to see how it works:
|
||||
```py
|
||||
from diffusers import ComponentsManager
|
||||
from diffusers.modular_pipelines import SequentialPipelineBlocks
|
||||
from diffusers.modular_pipelines.stable_diffusion_xl import TEXT2IMAGE_BLOCKS
|
||||
from transformers import CLIPTextModel
|
||||
from diffusers.modular_pipelines import (
|
||||
ComponentSpec,
|
||||
ModularPipelineBlocks,
|
||||
PipelineState,
|
||||
)
|
||||
|
||||
t2i_blocks = SequentialPipelineBlocks.from_blocks_dict(TEXT2IMAGE_BLOCKS)
|
||||
class MyBlock(ModularPipelineBlocks):
|
||||
@property
|
||||
def expected_components(self):
|
||||
return [
|
||||
ComponentSpec(
|
||||
name="text_encoder",
|
||||
type_hint=CLIPTextModel,
|
||||
pretrained_model_name_or_path="openai/clip-vit-large-patch14",
|
||||
),
|
||||
]
|
||||
|
||||
modular_repo_id = "YiYiXu/modular-loader-t2i-0704"
|
||||
components = ComponentsManager()
|
||||
t2i_pipeline = t2i_blocks.init_pipeline(modular_repo_id, components_manager=components)
|
||||
def __call__(self, components, state: PipelineState) -> PipelineState:
|
||||
return components, state
|
||||
```
|
||||
|
||||
</hfoption>
|
||||
<hfoption id="from_pretrained">
|
||||
Call [`~ModularPipelineBlocks.init_pipeline`] to convert it into a pipeline. The `blocks` attribute on the pipeline is the blocks it was created from — it determines the expected inputs, outputs, and computation logic.
|
||||
```py
|
||||
block = MyBlock()
|
||||
pipe = block.init_pipeline()
|
||||
pipe.blocks
|
||||
```
|
||||
```
|
||||
MyBlock {
|
||||
"_class_name": "MyBlock",
|
||||
"_diffusers_version": "0.37.0.dev0"
|
||||
}
|
||||
```
|
||||
|
||||
The [`~ModularPipeline.from_pretrained`] method creates a [`ModularPipeline`] from a modular repository on the Hub.
|
||||
Call [`~ModularPipelineBlocks.init_pipeline`] to convert it into a pipeline. The `blocks` attribute on the pipeline is the blocks it was created from — it determines the expected inputs, outputs, and computation logic.
|
||||
|
||||
> [!WARNING]
|
||||
> Blocks are mutable — you can freely add, remove, or swap blocks before creating a pipeline. However, once a pipeline is created, modifying `pipeline.blocks` won't affect the pipeline because it returns a copy. If you want a different block structure, create a new pipeline after modifying the blocks.
|
||||
|
||||
When you call [`~ModularPipelineBlocks.init_pipeline`] without a repository, it uses the `pretrained_model_name_or_path` defined in the block's [`ComponentSpec`] to determine where to load each component from. Printing the pipeline shows the component loading configuration.
|
||||
|
||||
```py
|
||||
pipe
|
||||
ModularPipeline {
|
||||
"_blocks_class_name": "MyBlock",
|
||||
"_class_name": "ModularPipeline",
|
||||
"_diffusers_version": "0.37.0.dev0",
|
||||
"text_encoder": [
|
||||
null,
|
||||
null,
|
||||
{
|
||||
"pretrained_model_name_or_path": "openai/clip-vit-large-patch14",
|
||||
"revision": null,
|
||||
"subfolder": "",
|
||||
"type_hint": [
|
||||
"transformers",
|
||||
"CLIPTextModel"
|
||||
],
|
||||
"variant": null
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
If you pass a repository to [`~ModularPipelineBlocks.init_pipeline`], it overrides the loading path by matching your block's components against the pipeline config in that repository (`model_index.json` or `modular_model_index.json`).
|
||||
|
||||
In the example below, the `pretrained_model_name_or_path` will be updated to `"stabilityai/stable-diffusion-xl-base-1.0"`.
|
||||
```py
|
||||
pipe = block.init_pipeline("stabilityai/stable-diffusion-xl-base-1.0")
|
||||
pipe
|
||||
ModularPipeline {
|
||||
"_blocks_class_name": "MyBlock",
|
||||
"_class_name": "ModularPipeline",
|
||||
"_diffusers_version": "0.37.0.dev0",
|
||||
"text_encoder": [
|
||||
null,
|
||||
null,
|
||||
{
|
||||
"pretrained_model_name_or_path": "stabilityai/stable-diffusion-xl-base-1.0",
|
||||
"revision": null,
|
||||
"subfolder": "text_encoder",
|
||||
"type_hint": [
|
||||
"transformers",
|
||||
"CLIPTextModel"
|
||||
],
|
||||
"variant": null
|
||||
}
|
||||
]
|
||||
}
|
||||
```
|
||||
|
||||
If a component in your block doesn't exist in the repository, it remains `null` and is skipped during [`~ModularPipeline.load_components`].
|
||||
|
||||
### from_pretrained
|
||||
|
||||
[`~ModularPipeline.from_pretrained`] is a convenient way to create a [`ModularPipeline`] without defining blocks yourself.
|
||||
|
||||
It works with three types of repositories.
|
||||
|
||||
**A regular diffusers repository.** Pass any supported model repository and it automatically maps to the default pipeline blocks. Currently supported models include SDXL, Wan, Qwen, Z-Image, Flux, and Flux2.
|
||||
```py
|
||||
from diffusers import ModularPipeline, ComponentsManager
|
||||
|
||||
components = ComponentsManager()
|
||||
pipeline = ModularPipeline.from_pretrained("YiYiXu/modular-loader-t2i-0704", components_manager=components)
|
||||
pipeline = ModularPipeline.from_pretrained(
|
||||
"stabilityai/stable-diffusion-xl-base-1.0", components_manager=components
|
||||
)
|
||||
```
|
||||
|
||||
Add the `trust_remote_code` argument to load a custom [`ModularPipeline`].
|
||||
|
||||
**A modular repository.** These repositories contain a `modular_model_index.json` that specifies where to load each component from — the components can come from different repositories and the modular repository itself may not contain any model weights. For example, [diffusers/flux2-bnb-4bit-modular](https://huggingface.co/diffusers/flux2-bnb-4bit-modular) loads a quantized transformer from one repository and the remaining components from another. See [Modular repository](#modular-repository) for more details on the format.
|
||||
```py
|
||||
from diffusers import ModularPipeline, ComponentsManager
|
||||
|
||||
components = ComponentsManager()
|
||||
modular_repo_id = "YiYiXu/modular-diffdiff-0704"
|
||||
diffdiff_pipeline = ModularPipeline.from_pretrained(modular_repo_id, trust_remote_code=True, components_manager=components)
|
||||
pipeline = ModularPipeline.from_pretrained(
|
||||
"diffusers/flux2-bnb-4bit-modular", components_manager=components
|
||||
)
|
||||
```
|
||||
|
||||
**A modular repository with custom code.** Some repositories include custom pipeline blocks alongside the loading configuration. Add `trust_remote_code=True` to load them. See [Custom blocks](./custom_blocks) for how to create your own.
|
||||
```py
|
||||
from diffusers import ModularPipeline, ComponentsManager
|
||||
|
||||
components = ComponentsManager()
|
||||
pipeline = ModularPipeline.from_pretrained(
|
||||
"diffusers/Florence2-image-Annotator", trust_remote_code=True, components_manager=components
|
||||
)
|
||||
```
|
||||
|
||||
</hfoption>
|
||||
</hfoptions>
|
||||
|
||||
## Loading components
|
||||
|
||||
@@ -184,7 +240,7 @@ A [`ModularPipeline`] doesn't automatically instantiate with components. It only
|
||||
```py
|
||||
import torch
|
||||
|
||||
t2i_pipeline.load_components(torch_dtype=torch.float16)
|
||||
pipeline.load_components(torch_dtype=torch.float16)
|
||||
t2i_pipeline.to("cuda")
|
||||
```
|
||||
|
||||
@@ -196,7 +252,7 @@ The example below only loads the UNet and VAE.
|
||||
```py
|
||||
import torch
|
||||
|
||||
t2i_pipeline.load_components(names=["unet", "vae"], torch_dtype=torch.float16)
|
||||
pipeline.load_components(names=["unet", "vae"], torch_dtype=torch.float16)
|
||||
```
|
||||
|
||||
</hfoption>
|
||||
|
||||
Reference in New Issue
Block a user