Fix ZImage LoRA conversion to support more LoRA formats.

cogvideo example: Distribute VAE video encoding across processes in CogVideoX LoRA training (#13207)

* Distribute VAE video encoding across processes in CogVideoX LoRA training

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>

* Apply style fixes

---------

Signed-off-by: jiqing-feng <jiqing.feng@intel.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-03-05 00:00:50 +08:00 · 2026-03-04 16:27:01 +05:30 · 2026-03-04 15:09:01 +05:30 · 2026-03-04 12:19:08 +05:30 · 2026-03-03 02:36:36 -10:00 · 2026-03-02 20:50:07 -10:00
17 changed files with 1211 additions and 373 deletions
--- a/.github/workflows/benchmark.yml
+++ b/.github/workflows/benchmark.yml
@@ -62,20 +62,6 @@ jobs:
        with:
          name: benchmark_test_reports
          path: benchmarks/${{ env.BASE_PATH }}
-      
-      # TODO: enable this once the connection problem has been resolved.
-      - name: Update benchmarking results to DB
-        env:
-          PGDATABASE: metrics
-          PGHOST: ${{ secrets.DIFFUSERS_BENCHMARKS_PGHOST }}
-          PGUSER: transformers_benchmarks
-          PGPASSWORD: ${{ secrets.DIFFUSERS_BENCHMARKS_PGPASSWORD }}
-          BRANCH_NAME: ${{ github.head_ref || github.ref_name }}
-        run: |
-          git config --global --add safe.directory /__w/diffusers/diffusers
-          commit_id=$GITHUB_SHA
-          commit_msg=$(git show -s --format=%s "$commit_id" | cut -c1-70)
-          cd benchmarks && python populate_into_db.py "$BRANCH_NAME" "$commit_id" "$commit_msg"

      - name: Report success status
        if: ${{ success() }}
--- a/benchmarks/populate_into_db.py
+++ b/benchmarks/populate_into_db.py
@@ -1,166 +0,0 @@
-import argparse
-import os
-import sys
-
-import gpustat
-import pandas as pd
-import psycopg2
-import psycopg2.extras
-from psycopg2.extensions import register_adapter
-from psycopg2.extras import Json
-
-
-register_adapter(dict, Json)
-
-FINAL_CSV_FILENAME = "collated_results.csv"
-# https://github.com/huggingface/transformers/blob/593e29c5e2a9b17baec010e8dc7c1431fed6e841/benchmark/init_db.sql#L27
-BENCHMARKS_TABLE_NAME = "benchmarks"
-MEASUREMENTS_TABLE_NAME = "model_measurements"
-
-
-def _init_benchmark(conn, branch, commit_id, commit_msg):
-    gpu_stats = gpustat.GPUStatCollection.new_query()
-    metadata = {"gpu_name": gpu_stats[0]["name"]}
-    repository = "huggingface/diffusers"
-    with conn.cursor() as cur:
-        cur.execute(
-            f"INSERT INTO {BENCHMARKS_TABLE_NAME} (repository, branch, commit_id, commit_message, metadata) VALUES (%s, %s, %s, %s, %s) RETURNING benchmark_id",
-            (repository, branch, commit_id, commit_msg, metadata),
-        )
-        benchmark_id = cur.fetchone()[0]
-        print(f"Initialised benchmark #{benchmark_id}")
-        return benchmark_id
-
-
-def parse_args():
-    parser = argparse.ArgumentParser()
-    parser.add_argument(
-        "branch",
-        type=str,
-        help="The branch name on which the benchmarking is performed.",
-    )
-
-    parser.add_argument(
-        "commit_id",
-        type=str,
-        help="The commit hash on which the benchmarking is performed.",
-    )
-
-    parser.add_argument(
-        "commit_msg",
-        type=str,
-        help="The commit message associated with the commit, truncated to 70 characters.",
-    )
-    args = parser.parse_args()
-    return args
-
-
-if __name__ == "__main__":
-    args = parse_args()
-    try:
-        conn = psycopg2.connect(
-            host=os.getenv("PGHOST"),
-            database=os.getenv("PGDATABASE"),
-            user=os.getenv("PGUSER"),
-            password=os.getenv("PGPASSWORD"),
-        )
-        print("DB connection established successfully.")
-    except Exception as e:
-        print(f"Problem during DB init: {e}")
-        sys.exit(1)
-
-    try:
-        benchmark_id = _init_benchmark(
-            conn=conn,
-            branch=args.branch,
-            commit_id=args.commit_id,
-            commit_msg=args.commit_msg,
-        )
-    except Exception as e:
-        print(f"Problem during initializing benchmark: {e}")
-        sys.exit(1)
-
-    cur = conn.cursor()
-
-    df = pd.read_csv(FINAL_CSV_FILENAME)
-
-    # Helper to cast values (or None) given a dtype
-    def _cast_value(val, dtype: str):
-        if pd.isna(val):
-            return None
-
-        if dtype == "text":
-            return str(val).strip()
-
-        if dtype == "float":
-            try:
-                return float(val)
-            except ValueError:
-                return None
-
-        if dtype == "bool":
-            s = str(val).strip().lower()
-            if s in ("true", "t", "yes", "1"):
-                return True
-            if s in ("false", "f", "no", "0"):
-                return False
-            if val in (1, 1.0):
-                return True
-            if val in (0, 0.0):
-                return False
-            return None
-
-        return val
-
-    try:
-        rows_to_insert = []
-        for _, row in df.iterrows():
-            scenario = _cast_value(row.get("scenario"), "text")
-            model_cls = _cast_value(row.get("model_cls"), "text")
-            num_params_B = _cast_value(row.get("num_params_B"), "float")
-            flops_G = _cast_value(row.get("flops_G"), "float")
-            time_plain_s = _cast_value(row.get("time_plain_s"), "float")
-            mem_plain_GB = _cast_value(row.get("mem_plain_GB"), "float")
-            time_compile_s = _cast_value(row.get("time_compile_s"), "float")
-            mem_compile_GB = _cast_value(row.get("mem_compile_GB"), "float")
-            fullgraph = _cast_value(row.get("fullgraph"), "bool")
-            mode = _cast_value(row.get("mode"), "text")
-
-            # If "github_sha" column exists in the CSV, cast it; else default to None
-            if "github_sha" in df.columns:
-                github_sha = _cast_value(row.get("github_sha"), "text")
-            else:
-                github_sha = None
-
-            measurements = {
-                "scenario": scenario,
-                "model_cls": model_cls,
-                "num_params_B": num_params_B,
-                "flops_G": flops_G,
-                "time_plain_s": time_plain_s,
-                "mem_plain_GB": mem_plain_GB,
-                "time_compile_s": time_compile_s,
-                "mem_compile_GB": mem_compile_GB,
-                "fullgraph": fullgraph,
-                "mode": mode,
-                "github_sha": github_sha,
-            }
-            rows_to_insert.append((benchmark_id, measurements))
-
-        # Batch-insert all rows
-        insert_sql = f"""
-        INSERT INTO {MEASUREMENTS_TABLE_NAME} (
-            benchmark_id,
-            measurements
-        )
-        VALUES (%s, %s);
-        """
-
-        psycopg2.extras.execute_batch(cur, insert_sql, rows_to_insert)
-        conn.commit()
-
-        cur.close()
-        conn.close()
-    except Exception as e:
-        print(f"Exception: {e}")
-        sys.exit(1)
--- a/docs/source/en/modular_diffusers/custom_blocks.md
+++ b/docs/source/en/modular_diffusers/custom_blocks.md
@@ -332,4 +332,49 @@ Make your custom block work with Mellon's visual interface. See the [Mellon Cust
 Browse the [Modular Diffusers Custom Blocks](https://huggingface.co/collections/diffusers/modular-diffusers-custom-blocks) collection for inspiration and ready-to-use blocks.

 </hfoption>
-</hfoptions>
+</hfoptions>
+
+## Dependencies
+
+Declaring package dependencies in custom blocks prevents runtime import errors later on. Diffusers validates the dependencies and emits a warning if a package is missing or incompatible.
+
+Set a `_requirements` attribute in your block class, mapping package names to version specifiers.
+
+```py
+from diffusers.modular_pipelines import PipelineBlock
+
+class MyCustomBlock(PipelineBlock):
+    _requirements = {
+        "transformers": ">=4.44.0",
+        "sentencepiece": ">=0.2.0"
+    }
+```
+
+When blocks with different requirements are combined, Diffusers merges their requirements.
+
+```py
+from diffusers.modular_pipelines import PipelineBlock, SequentialPipelineBlocks
+
+class BlockA(PipelineBlock):
+    _requirements = {"transformers": ">=4.44.0"}
+    # ...
+
+class BlockB(PipelineBlock):
+    _requirements = {"sentencepiece": ">=0.2.0"}
+    # ...
+
+pipe = SequentialPipelineBlocks.from_blocks_dict({
+    "block_a": BlockA,
+    "block_b": BlockB,
+})
+```
+
+When a block is saved with [`~ModularPipeline.save_pretrained`], its requirements are written to the `modular_config.json` file. When the block is loaded, Diffusers checks each requirement against the current environment. If a package isn't found or its version doesn't match, Diffusers emits one of the following warnings.
+
+```md
+# missing package
+xyz-package was specified in the requirements but wasn't found in the current environment.
+
+# version mismatch
+xyz requirement 'specific-version' is not satisfied by the installed version 'actual-version'. Things might work unexpected.
+```
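The "missing package" case described in the docs above can be illustrated with the standard library alone. This is a minimal sketch, not Diffusers' own validation code: `find_missing` is a hypothetical helper, and it only detects packages that are not installed. Checking the version specifier (for example `>=4.44.0`) would need a specifier parser and is left out here.

```python
from importlib.metadata import PackageNotFoundError, version


def find_missing(requirements: dict[str, str]) -> list[str]:
    """Return the names of required packages that are not installed.

    Hypothetical helper for illustration; version specifiers are ignored.
    """
    missing = []
    for name in requirements:
        try:
            version(name)  # raises PackageNotFoundError if the package is absent
        except PackageNotFoundError:
            missing.append(name)
    return missing


# A made-up package name that should not exist in any environment.
requirements = {"xyz-package-that-does-not-exist": ">=1.0"}
for name in find_missing(requirements):
    print(f"{name} was specified in the requirements but wasn't found in the current environment.")
```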
--- a/docs/source/en/using-diffusers/automodel.md
+++ b/docs/source/en/using-diffusers/automodel.md
@@ -97,5 +97,32 @@ If the custom model inherits from the [`ModelMixin`] class, it gets access to th
 > )
 > ```

+### Saving custom models
+
+Use [`~ConfigMixin.register_for_auto_class`] to add the `auto_map` entry to `config.json` automatically when saving. This avoids having to manually edit the config file.
+
+```py
+# my_model.py
+from diffusers import ModelMixin, ConfigMixin
+
+class MyCustomModel(ModelMixin, ConfigMixin):
+    ...
+
+MyCustomModel.register_for_auto_class("AutoModel")
+
+model = MyCustomModel(...)
+model.save_pretrained("./my_model")
+```
+
+The saved `config.json` will include the `auto_map` field.
+
+```json
+{
+  "auto_map": {
+    "AutoModel": "my_model.MyCustomModel"
+  }
+}
+```
+
 > [!NOTE]
 > Learn more about implementing custom models in the [Community components](../using-diffusers/custom_pipeline_overview#community-components) guide.
--- a/examples/cogvideo/train_cogvideox_lora.py
+++ b/examples/cogvideo/train_cogvideox_lora.py
@@ -1232,22 +1232,49 @@ def main(args):
        id_token=args.id_token,
    )

-    def encode_video(video, bar):
-        bar.update(1)
+    def encode_video(video):
        video = video.to(accelerator.device, dtype=vae.dtype).unsqueeze(0)
        video = video.permute(0, 2, 1, 3, 4)  # [B, C, F, H, W]
        latent_dist = vae.encode(video).latent_dist
        return latent_dist

+    # Distribute video encoding across processes: each process only encodes its own shard
+    num_videos = len(train_dataset.instance_videos)
+    num_procs = accelerator.num_processes
+    local_rank = accelerator.process_index
+    local_count = len(range(local_rank, num_videos, num_procs))
+
    progress_encode_bar = tqdm(
-        range(0, len(train_dataset.instance_videos)),
-        desc="Loading Encode videos",
+        range(local_count),
+        desc="Encoding videos",
+        disable=not accelerator.is_local_main_process,
    )
-    train_dataset.instance_videos = [
-        encode_video(video, progress_encode_bar) for video in train_dataset.instance_videos
-    ]
+
+    encoded_videos = [None] * num_videos
+    for i, video in enumerate(train_dataset.instance_videos):
+        if i % num_procs == local_rank:
+            encoded_videos[i] = encode_video(video)
+            progress_encode_bar.update(1)
    progress_encode_bar.close()

+    # Broadcast encoded latent distributions so every process has the full set
+    if num_procs > 1:
+        import torch.distributed as dist
+
+        from diffusers.models.autoencoders.vae import DiagonalGaussianDistribution
+
+        ref_params = next(v for v in encoded_videos if v is not None).parameters
+        for i in range(num_videos):
+            src = i % num_procs
+            if encoded_videos[i] is not None:
+                params = encoded_videos[i].parameters.contiguous()
+            else:
+                params = torch.empty_like(ref_params)
+            dist.broadcast(params, src=src)
+            encoded_videos[i] = DiagonalGaussianDistribution(params)
+
+    train_dataset.instance_videos = encoded_videos
+
    def collate_fn(examples):
        videos = [example["instance_video"].sample() * vae.config.scaling_factor for example in examples]
        prompts = [example["instance_prompt"] for example in examples]
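The sharding in the hunk above is a round-robin assignment: video `i` is encoded by process `i % num_procs` and later broadcast from that process. The sketch below reproduces only the index arithmetic, without `torch` or `accelerate`; `shard_indices` and `owner` are illustrative names, not functions from the training script.

```python
def shard_indices(num_items: int, num_procs: int, rank: int) -> list[int]:
    """Indices handled by process `rank` under round-robin assignment."""
    return list(range(rank, num_items, num_procs))


def owner(index: int, num_procs: int) -> int:
    """Rank that encodes item `index` and acts as the broadcast source."""
    return index % num_procs


num_videos, num_procs = 10, 4
shards = [shard_indices(num_videos, num_procs, rank) for rank in range(num_procs)]

for rank, indices in enumerate(shards):
    # len(indices) matches `local_count` in the script, i.e. the progress bar length.
    print(f"rank {rank}: {indices}")
```

Every index appears in exactly one shard, and the shard sizes differ by at most one, which is why each process can size its progress bar with `len(range(local_rank, num_videos, num_procs))`.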
--- a/src/diffusers/commands/custom_blocks.py
+++ b/src/diffusers/commands/custom_blocks.py
@@ -89,8 +89,6 @@ class CustomBlocksCommand(BaseDiffusersCLICommand):
        # automap = self._create_automap(parent_class=parent_class, child_class=child_class)
        # with open(CONFIG, "w") as f:
        #     json.dump(automap, f)
-        with open("requirements.txt", "w") as f:
-            f.write("")

    def _choose_block(self, candidates, chosen=None):
        for cls, base in candidates:
--- a/src/diffusers/configuration_utils.py
+++ b/src/diffusers/configuration_utils.py
@@ -107,6 +107,38 @@ class ConfigMixin:
    has_compatibles = False

    _deprecated_kwargs = []
+    _auto_class = None
+
+    @classmethod
+    def register_for_auto_class(cls, auto_class="AutoModel"):
+        """
+        Register this class with the given auto class so that it can be loaded with `AutoModel.from_pretrained(...,
+        trust_remote_code=True)`.
+
+        When the config is saved, the resulting `config.json` will include an `auto_map` entry mapping the auto class
+        to this class's module and class name.
+
+        Args:
+            auto_class (`str` or type, *optional*, defaults to `"AutoModel"`):
+                The auto class to register this class with. Can be a string (e.g. `"AutoModel"`) or the class itself.
+                Currently only `"AutoModel"` is supported.
+
+        Example:
+
+        ```python
+        from diffusers import ModelMixin, ConfigMixin
+
+
+        class MyCustomModel(ModelMixin, ConfigMixin): ...
+
+
+        MyCustomModel.register_for_auto_class("AutoModel")
+        ```
+        """
+        if auto_class != "AutoModel":
+            raise ValueError(f"Only 'AutoModel' is supported, got '{auto_class}'.")
+
+        cls._auto_class = auto_class

    def register_to_config(self, **kwargs):
        if self.config_name is None:
@@ -621,6 +653,12 @@ class ConfigMixin:
        # pop the `_pre_quantization_dtype` as torch.dtypes are not serializable.
        _ = config_dict.pop("_pre_quantization_dtype", None)

+        if getattr(self, "_auto_class", None) is not None:
+            module = self.__class__.__module__.split(".")[-1]
+            auto_map = config_dict.get("auto_map", {})
+            auto_map[self._auto_class] = f"{module}.{self.__class__.__name__}"
+            config_dict["auto_map"] = auto_map
+
        return json.dumps(config_dict, indent=2, sort_keys=True) + "\n"

    def to_json_file(self, json_file_path: str | os.PathLike):
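How the `auto_map` entry ends up in the serialized config can be seen in isolation with a small stand-in class. This is a sketch that mirrors the logic in the hunk above; `FakeConfigMixin` is a hypothetical mock, not the real `ConfigMixin`, and it takes the config dict as an argument instead of reading it from the instance.

```python
import json


class FakeConfigMixin:
    """Hypothetical stand-in that mimics only the auto_map logic."""

    _auto_class = None

    @classmethod
    def register_for_auto_class(cls, auto_class="AutoModel"):
        if auto_class != "AutoModel":
            raise ValueError(f"Only 'AutoModel' is supported, got '{auto_class}'.")
        # Set on the subclass the method is called on, not on the base class.
        cls._auto_class = auto_class

    def to_json_string(self, config_dict):
        config_dict = dict(config_dict)
        if getattr(self, "_auto_class", None) is not None:
            # Only the last component of the module path is kept.
            module = self.__class__.__module__.split(".")[-1]
            auto_map = dict(config_dict.get("auto_map", {}))
            auto_map[self._auto_class] = f"{module}.{self.__class__.__name__}"
            config_dict["auto_map"] = auto_map
        return json.dumps(config_dict, indent=2, sort_keys=True) + "\n"


class MyCustomModel(FakeConfigMixin):
    pass


before = json.loads(MyCustomModel().to_json_string({"sample_size": 32}))
MyCustomModel.register_for_auto_class("AutoModel")
after = json.loads(MyCustomModel().to_json_string({"sample_size": 32}))
print(after["auto_map"])
```

The module part of the entry comes from `__module__`, so a class defined in a script that is run directly is recorded under `__main__` rather than under the file name.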
--- a/src/diffusers/loaders/lora_conversion_utils.py
+++ b/src/diffusers/loaders/lora_conversion_utils.py
@@ -2519,6 +2519,13 @@ def _convert_non_diffusers_z_image_lora_to_diffusers(state_dict):
    if has_default:
        state_dict = {k.replace("default.", ""): v for k, v in state_dict.items()}

+    # Normalize ZImage-specific dot-separated module names to underscore form so they
+    # match the diffusers model parameter names (context_refiner, noise_refiner).
+    state_dict = {
+        k.replace("context.refiner.", "context_refiner.").replace("noise.refiner.", "noise_refiner."): v
+        for k, v in state_dict.items()
+    }
+
    converted_state_dict = {}
    all_keys = list(state_dict.keys())
    down_key = ".lora_down.weight"
@@ -2529,19 +2536,18 @@ def _convert_non_diffusers_z_image_lora_to_diffusers(state_dict):
    has_non_diffusers_lora_id = any(down_key in k or up_key in k for k in all_keys)
    has_diffusers_lora_id = any(a_key in k or b_key in k for k in all_keys)

+    def get_alpha_scales(down_weight, alpha_key):
+        rank = down_weight.shape[0]
+        alpha = state_dict.pop(alpha_key).item()
+        scale = alpha / rank  # LoRA is scaled by 'alpha / rank' in forward pass, so we need to scale it back here
+        scale_down = scale
+        scale_up = 1.0
+        while scale_down * 2 < scale_up:
+            scale_down *= 2
+            scale_up /= 2
+        return scale_down, scale_up
+
    if has_non_diffusers_lora_id:
-
-        def get_alpha_scales(down_weight, alpha_key):
-            rank = down_weight.shape[0]
-            alpha = state_dict.pop(alpha_key).item()
-            scale = alpha / rank  # LoRA is scaled by 'alpha / rank' in forward pass, so we need to scale it back here
-            scale_down = scale
-            scale_up = 1.0
-            while scale_down * 2 < scale_up:
-                scale_down *= 2
-                scale_up /= 2
-            return scale_down, scale_up
-
        for k in all_keys:
            if k.endswith(down_key):
                diffusers_down_key = k.replace(down_key, ".lora_A.weight")
@@ -2554,13 +2560,69 @@ def _convert_non_diffusers_z_image_lora_to_diffusers(state_dict):
                converted_state_dict[diffusers_down_key] = down_weight * scale_down
                converted_state_dict[diffusers_up_key] = up_weight * scale_up

-    # Already in diffusers format (lora_A/lora_B), just pop
+    # Already in diffusers format (lora_A/lora_B), apply alpha scaling and pop.
    elif has_diffusers_lora_id:
        for k in all_keys:
-            if a_key in k or b_key in k:
-                converted_state_dict[k] = state_dict.pop(k)
-            elif ".alpha" in k:
+            if k.endswith(a_key):
+                diffusers_up_key = k.replace(a_key, b_key)
+                alpha_key = k.replace(a_key, ".alpha")
+
+                down_weight = state_dict.pop(k)
+                up_weight = state_dict.pop(diffusers_up_key)
+                scale_down, scale_up = get_alpha_scales(down_weight, alpha_key)
+                converted_state_dict[k] = down_weight * scale_down
+                converted_state_dict[diffusers_up_key] = up_weight * scale_up
+
+    # Handle dot-format LoRA keys: ".lora.down.weight" / ".lora.up.weight".
+    # Some external ZImage trainers (e.g. Anime-Z) use dots instead of underscores in
+    # lora weight names and also include redundant keys:
+    #   - "qkv.lora.*"    duplicates individual "to.q/k/v.lora.*" keys → skip qkv
+    #   - "out.lora.*"    duplicates "to_out.0.lora.*" keys → skip bare out
+    #   - "to.q/k/v.lora.*" → normalise to "to_q/k/v.lora_A/B.weight"
+    lora_dot_down_key = ".lora.down.weight"
+    lora_dot_up_key = ".lora.up.weight"
+    has_lora_dot_format = any(lora_dot_down_key in k for k in state_dict)
+
+    if has_lora_dot_format:
+        dot_keys = list(state_dict.keys())
+        for k in dot_keys:
+            if lora_dot_down_key not in k:
+                continue
+            if k not in state_dict:
+                continue  # already popped by a prior iteration
+
+            base = k[: -len(lora_dot_down_key)]
+
+            # Skip combined "qkv" projection — individual to.q/k/v keys are also present.
+            if base.endswith(".qkv"):
                state_dict.pop(k)
+                state_dict.pop(k.replace(lora_dot_down_key, lora_dot_up_key), None)
+                state_dict.pop(base + ".alpha", None)
+                continue
+
+            # Skip bare "out.lora.*" — "to_out.0.lora.*" covers the same projection.
+            if re.search(r"\.out$", base) and ".to_out" not in base:
+                state_dict.pop(k)
+                state_dict.pop(k.replace(lora_dot_down_key, lora_dot_up_key), None)
+                continue
+
+            # Normalise "to.q/k/v" → "to_q/k/v" for the diffusers output key.
+            norm_k = re.sub(
+                r"\.to\.([qkv])" + re.escape(lora_dot_down_key) + r"$",
+                r".to_\1" + lora_dot_down_key,
+                k,
+            )
+            norm_base = norm_k[: -len(lora_dot_down_key)]
+            alpha_key = norm_base + ".alpha"
+
+            diffusers_down = norm_k.replace(lora_dot_down_key, ".lora_A.weight")
+            diffusers_up = norm_k.replace(lora_dot_down_key, ".lora_B.weight")
+
+            down_weight = state_dict.pop(k)
+            up_weight = state_dict.pop(k.replace(lora_dot_down_key, lora_dot_up_key))
+            scale_down, scale_up = get_alpha_scales(down_weight, alpha_key)
+            converted_state_dict[diffusers_down] = down_weight * scale_down
+            converted_state_dict[diffusers_up] = up_weight * scale_up

    if len(state_dict) > 0:
        raise ValueError(f"`state_dict` should be empty at this point but has {state_dict.keys()=}")
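Two pieces of the conversion above can be checked without tensors: how `alpha / rank` is split between the down and up weights, and how a dot-format key is rewritten. The sketch below uses plain floats and strings; `split_alpha_scale` and `normalize_down_key` are illustrative names, and the example key is made up rather than taken from a real checkpoint.

```python
import re

LORA_DOT_DOWN = ".lora.down.weight"


def split_alpha_scale(rank: int, alpha: float) -> tuple[float, float]:
    """Split alpha / rank into two factors whose product is unchanged."""
    scale_down, scale_up = alpha / rank, 1.0
    # Move powers of two from the up factor to the down factor to balance them.
    while scale_down * 2 < scale_up:
        scale_down *= 2
        scale_up /= 2
    return scale_down, scale_up


def normalize_down_key(key: str) -> str:
    """Rewrite a dot-format down key into the lora_A naming."""
    key = key.replace("context.refiner.", "context_refiner.")
    key = key.replace("noise.refiner.", "noise_refiner.")
    key = re.sub(
        r"\.to\.([qkv])" + re.escape(LORA_DOT_DOWN) + r"$",
        r".to_\1" + LORA_DOT_DOWN,
        key,
    )
    return key.replace(LORA_DOT_DOWN, ".lora_A.weight")


print(split_alpha_scale(rank=16, alpha=1.0))
print(normalize_down_key("context.refiner.0.attention.to.k.lora.down.weight"))
```

Because the loop only multiplies and divides by two, the product of the two factors stays equal to `alpha / rank`, so the scaled weights reproduce the same effective LoRA update.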
--- a/src/diffusers/modular_pipelines/modular_pipeline.py
+++ b/src/diffusers/modular_pipelines/modular_pipeline.py
@@ -14,6 +14,7 @@
 import importlib
 import inspect
 import os
+import sys
 import traceback
 import warnings
 from collections import OrderedDict
@@ -28,10 +29,16 @@ from tqdm.auto import tqdm
 from typing_extensions import Self

 from ..configuration_utils import ConfigMixin, FrozenDict
-from ..pipelines.pipeline_loading_utils import _fetch_class_library_tuple, simple_get_class_obj
+from ..pipelines.pipeline_loading_utils import (
+    LOADABLE_CLASSES,
+    _fetch_class_library_tuple,
+    _unwrap_model,
+    simple_get_class_obj,
+)
 from ..utils import PushToHubMixin, is_accelerate_available, logging
 from ..utils.dynamic_modules_utils import get_class_from_dynamic_module, resolve_trust_remote_code
 from ..utils.hub_utils import load_or_create_model_card, populate_model_card
+from ..utils.torch_utils import is_compiled_module
 from .components_manager import ComponentsManager
 from .modular_pipeline_utils import (
    MODULAR_MODEL_CARD_TEMPLATE,
@@ -40,6 +47,7 @@ from .modular_pipeline_utils import (
    InputParam,
    InsertableDict,
    OutputParam,
+    _validate_requirements,
    combine_inputs,
    combine_outputs,
    format_components,
@@ -290,6 +298,7 @@ class ModularPipelineBlocks(ConfigMixin, PushToHubMixin):

    config_name = "modular_config.json"
    model_name = None
+    _requirements: dict[str, str] | None = None
    _workflow_map = None

    @classmethod
@@ -404,6 +413,9 @@ class ModularPipelineBlocks(ConfigMixin, PushToHubMixin):
                "Selected model repository does not happear to have any custom code or does not have a valid `config.json` file."
            )

+        if "requirements" in config and config["requirements"] is not None:
+            _ = _validate_requirements(config["requirements"])
+
        class_ref = config["auto_map"][cls.__name__]
        module_file, class_name = class_ref.split(".")
        module_file = module_file + ".py"
@@ -428,8 +440,13 @@ class ModularPipelineBlocks(ConfigMixin, PushToHubMixin):
        module = full_mod.rsplit(".", 1)[-1].replace("__dynamic__", "")
        parent_module = self.save_pretrained.__func__.__qualname__.split(".", 1)[0]
        auto_map = {f"{parent_module}": f"{module}.{cls_name}"}
-
        self.register_to_config(auto_map=auto_map)
+
+        # resolve requirements
+        requirements = _validate_requirements(getattr(self, "_requirements", None))
+        if requirements:
+            self.register_to_config(requirements=requirements)
+
        self.save_config(save_directory=save_directory, push_to_hub=push_to_hub, **kwargs)
        config = dict(self.config)
        self._internal_dict = FrozenDict(config)
@@ -651,6 +668,15 @@ class ConditionalPipelineBlocks(ModularPipelineBlocks):
        combined_outputs = combine_outputs(*named_outputs)
        return combined_outputs

+    @property
+    # Copied from diffusers.modular_pipelines.modular_pipeline.SequentialPipelineBlocks._requirements
+    def _requirements(self) -> dict[str, str]:
+        requirements = {}
+        for block_name, block in self.sub_blocks.items():
+            if getattr(block, "_requirements", None):
+                requirements[block_name] = block._requirements
+        return requirements
+
    # used for `__repr__`
    def _get_trigger_inputs(self) -> set:
        """
@@ -1240,6 +1266,14 @@ class SequentialPipelineBlocks(ModularPipelineBlocks):
            expected_configs=self.expected_configs,
        )

+    @property
+    def _requirements(self) -> dict[str, str]:
+        requirements = {}
+        for block_name, block in self.sub_blocks.items():
+            if getattr(block, "_requirements", None):
+                requirements[block_name] = block._requirements
+        return requirements
+

 class LoopSequentialPipelineBlocks(ModularPipelineBlocks):
    """
@@ -1378,6 +1412,15 @@ class LoopSequentialPipelineBlocks(ModularPipelineBlocks):
    def outputs(self) -> list[str]:
        return next(reversed(self.sub_blocks.values())).intermediate_outputs

+    @property
+    # Copied from diffusers.modular_pipelines.modular_pipeline.SequentialPipelineBlocks._requirements
+    def _requirements(self) -> dict[str, str]:
+        requirements = {}
+        for block_name, block in self.sub_blocks.items():
+            if getattr(block, "_requirements", None):
+                requirements[block_name] = block._requirements
+        return requirements
+
    def __init__(self):
        sub_blocks = InsertableDict()
        for block_name, block in zip(self.block_names, self.block_classes):
@@ -1700,6 +1743,8 @@ class ModularPipeline(ConfigMixin, PushToHubMixin):
            _blocks_class_name=self._blocks.__class__.__name__ if self._blocks is not None else None
        )

+        self._pretrained_model_name_or_path = pretrained_model_name_or_path
+
    @property
    def default_call_parameters(self) -> dict[str, Any]:
        """
@@ -1826,44 +1871,136 @@ class ModularPipeline(ConfigMixin, PushToHubMixin):
        )
        return pipeline

-    def save_pretrained(self, save_directory: str | os.PathLike, push_to_hub: bool = False, **kwargs):
+    def save_pretrained(
+        self,
+        save_directory: str | os.PathLike,
+        safe_serialization: bool = True,
+        variant: str | None = None,
+        max_shard_size: int | str | None = None,
+        push_to_hub: bool = False,
+        **kwargs,
+    ):
        """
-        Save the pipeline to a directory. It does not save components, you need to save them separately.
+        Save the pipeline and all its components to a directory, so that it can be re-loaded using the
+        [`~ModularPipeline.from_pretrained`] class method.

        Args:
            save_directory (`str` or `os.PathLike`):
-                Path to the directory where the pipeline will be saved.
-            push_to_hub (`bool`, optional):
-                Whether to push the pipeline to the huggingface hub.
-            **kwargs: Additional arguments passed to `save_config()` method
+                Directory to save the pipeline to. Will be created if it doesn't exist.
+            safe_serialization (`bool`, *optional*, defaults to `True`):
+                Whether to save the model using `safetensors` or the traditional PyTorch way with `pickle`.
+            variant (`str`, *optional*):
+                If specified, weights are saved in the format `pytorch_model.<variant>.bin`.
+            max_shard_size (`int` or `str`, defaults to `None`):
+                The maximum size of a checkpoint before it is sharded. Each checkpoint shard will then be smaller
+                than this size. If expressed as a string, it must be digits followed by a unit (like `"5GB"`).
+                If expressed as an integer, the unit is bytes.
+            push_to_hub (`bool`, *optional*, defaults to `False`):
+                Whether to push the pipeline to the Hugging Face model hub after saving it.
+            **kwargs: Additional keyword arguments:
+                - `overwrite_modular_index` (`bool`, *optional*, defaults to `False`):
+                    When saving a Modular Pipeline, its components in `modular_model_index.json` may reference repos
+                    different from the destination repo. Setting this to `True` updates all component references in
+                    `modular_model_index.json` so they point to the repo specified by `repo_id`.
+                - `repo_id` (`str`, *optional*):
+                    The repository ID to push the pipeline to. Defaults to the last component of `save_directory`.
+                - `commit_message` (`str`, *optional*):
+                    Commit message for the push to hub operation.
+                - `private` (`bool`, *optional*):
+                    Whether the repository should be private.
+                - `create_pr` (`bool`, *optional*, defaults to `False`):
+                    Whether to create a pull request instead of pushing directly.
+                - `token` (`str`, *optional*):
+                    The Hugging Face token to use for authentication.
        """
+        overwrite_modular_index = kwargs.pop("overwrite_modular_index", False)
+        repo_id = kwargs.pop("repo_id", save_directory.split(os.path.sep)[-1])
+
        if push_to_hub:
            commit_message = kwargs.pop("commit_message", None)
            private = kwargs.pop("private", None)
            create_pr = kwargs.pop("create_pr", False)
            token = kwargs.pop("token", None)
-            repo_id = kwargs.pop("repo_id", save_directory.split(os.path.sep)[-1])
+            update_model_card = kwargs.pop("update_model_card", False)
            repo_id = create_repo(repo_id, exist_ok=True, private=private, token=token).repo_id

-            # Generate modular pipeline card content
-            card_content = generate_modular_model_card_content(self.blocks)
+        for component_name, component_spec in self._component_specs.items():
+            if component_spec.default_creation_method != "from_pretrained":
+                continue

-            # Create a new empty model card and eventually tag it
+            component = getattr(self, component_name, None)
+            if component is None:
+                continue
+
+            model_cls = component.__class__
+            if is_compiled_module(component):
+                component = _unwrap_model(component)
+                model_cls = component.__class__
+
+            save_method_name = None
+            for library_name, library_classes in LOADABLE_CLASSES.items():
+                if library_name in sys.modules:
+                    library = importlib.import_module(library_name)
+                else:
+                    logger.info(
+                        f"{library_name} is not installed. Cannot save {component_name} as {library_classes} from {library_name}"
+                    )
+                    continue
+
+                for base_class, save_load_methods in library_classes.items():
+                    class_candidate = getattr(library, base_class, None)
+                    if class_candidate is not None and issubclass(model_cls, class_candidate):
+                        save_method_name = save_load_methods[0]
+                        break
+                if save_method_name is not None:
+                    break
+
+            if save_method_name is None:
+                logger.warning(f"self.{component_name}={component} of type {type(component)} cannot be saved.")
+                continue
+
+            save_method = getattr(component, save_method_name)
+            save_method_signature = inspect.signature(save_method)
+            save_method_accept_safe = "safe_serialization" in save_method_signature.parameters
+            save_method_accept_variant = "variant" in save_method_signature.parameters
+            save_method_accept_max_shard_size = "max_shard_size" in save_method_signature.parameters
+
+            save_kwargs = {}
+            if save_method_accept_safe:
+                save_kwargs["safe_serialization"] = safe_serialization
+            if save_method_accept_variant:
+                save_kwargs["variant"] = variant
+            if save_method_accept_max_shard_size and max_shard_size is not None:
+                save_kwargs["max_shard_size"] = max_shard_size
+
+            component_save_path = os.path.join(save_directory, component_name)
+            save_method(component_save_path, **save_kwargs)
+
+            if component_name not in self.config:
+                continue
+
+            has_no_load_id = not hasattr(component, "_diffusers_load_id") or component._diffusers_load_id == "null"
+            if overwrite_modular_index or has_no_load_id:
+                library, class_name, component_spec_dict = self.config[component_name]
+                component_spec_dict["pretrained_model_name_or_path"] = repo_id if push_to_hub else save_directory
+                component_spec_dict["subfolder"] = component_name
+                self.register_to_config(**{component_name: (library, class_name, component_spec_dict)})
+
+        self.save_config(save_directory=save_directory)
+
+        if push_to_hub:
+            card_content = generate_modular_model_card_content(self.blocks)
            model_card = load_or_create_model_card(
                repo_id,
                token=token,
                is_pipeline=True,
                model_description=MODULAR_MODEL_CARD_TEMPLATE.format(**card_content),
                is_modular=True,
+                update_model_card=update_model_card,
            )
            model_card = populate_model_card(model_card, tags=card_content["tags"])
-
            model_card.save(os.path.join(save_directory, "README.md"))

-        # YiYi TODO: maybe order the json file to make it more readable: configs first, then components
-        self.save_config(save_directory=save_directory)
-
-        if push_to_hub:
            self._upload_folder(
                save_directory,
                repo_id,
@@ -2131,8 +2268,9 @@ class ModularPipeline(ConfigMixin, PushToHubMixin):
            ```

        Notes:
-            - Components with trained weights should be loaded with `AutoModel.from_pretrained()` or
-            `ComponentSpec.load()` so that loading specs are preserved for serialization.
+            - Components loaded with `AutoModel.from_pretrained()` or `ComponentSpec.load()` will have
+            loading specs preserved for serialization. Custom or locally loaded components without Hub references will
+            have their `modular_model_index.json` entries updated automatically during `save_pretrained()`.
            - ConfigMixin objects without weights (e.g., schedulers, guiders) can be passed directly.
        """

@@ -2154,13 +2292,10 @@ class ModularPipeline(ConfigMixin, PushToHubMixin):
                new_component_spec = current_component_spec
                if hasattr(self, name) and getattr(self, name) is not None:
                    logger.warning(f"ModularPipeline.update_components: setting {name} to None (spec unchanged)")
-            elif current_component_spec.default_creation_method == "from_pretrained" and not (
-                hasattr(component, "_diffusers_load_id") and component._diffusers_load_id is not None
+            elif (
+                current_component_spec.default_creation_method == "from_pretrained"
+                and getattr(component, "_diffusers_load_id", None) is None
            ):
-                logger.warning(
-                    f"ModularPipeline.update_components: {name} has no valid _diffusers_load_id. "
-                    f"This will result in empty loading spec, use ComponentSpec.load() for proper specs"
-                )
                new_component_spec = ComponentSpec(name=name, type_hint=type(component))
            else:
                new_component_spec = ComponentSpec.from_component(name, component)
@@ -2233,17 +2368,49 @@ class ModularPipeline(ConfigMixin, PushToHubMixin):
                    elif "default" in value:
                        # check if the default is specified
                        component_load_kwargs[key] = value["default"]
+            # Only pass trust_remote_code to components from the same repo as the pipeline.
+            # When a user passes trust_remote_code=True, they intend to trust code from the
+            # pipeline's repo, not from external repos referenced in modular_model_index.json.
+            trust_remote_code_stripped = False
+            if (
+                "trust_remote_code" in component_load_kwargs
+                and self._pretrained_model_name_or_path is not None
+                and spec.pretrained_model_name_or_path != self._pretrained_model_name_or_path
+            ):
+                component_load_kwargs.pop("trust_remote_code")
+                trust_remote_code_stripped = True
+
+            if not spec.pretrained_model_name_or_path:
+                logger.info(f"Skipping component `{name}`: no pretrained model path specified.")
+                continue
+
            try:
                components_to_register[name] = spec.load(**component_load_kwargs)
            except Exception:
-                logger.warning(
-                    f"\nFailed to create component {name}:\n"
-                    f"- Component spec: {spec}\n"
-                    f"- load() called with kwargs: {component_load_kwargs}\n"
-                    "If this component is not required for your workflow you can safely ignore this message.\n\n"
-                    "Traceback:\n"
-                    f"{traceback.format_exc()}"
-                )
+                tb = traceback.format_exc()
+                if trust_remote_code_stripped and "trust_remote_code" in tb:
+                    warning_msg = (
+                        f"Failed to load component `{name}` from external repository "
+                        f"`{spec.pretrained_model_name_or_path}`.\n\n"
+                        f"`trust_remote_code=True` was not forwarded to `{name}` because it comes from "
+                        f"a different repository than the pipeline (`{self._pretrained_model_name_or_path}`). "
+                        f"For safety, `trust_remote_code` is only forwarded to components from the same "
+                        f"repository as the pipeline.\n\n"
+                        f"You need to load this component manually with `trust_remote_code=True` and pass it "
+                        f"to the pipeline via `pipe.update_components()`. For example, if it is a custom model:\n\n"
+                        f'  {name} = AutoModel.from_pretrained("{spec.pretrained_model_name_or_path}", trust_remote_code=True)\n'
+                        f"  pipe.update_components({name}={name})\n"
+                    )
+                else:
+                    warning_msg = (
+                        f"Failed to create component {name}:\n"
+                        f"- Component spec: {spec}\n"
+                        f"- load() called with kwargs: {component_load_kwargs}\n"
+                        "If this component is not required for your workflow you can safely ignore this message.\n\n"
+                        "Traceback:\n"
+                        f"{tb}"
+                    )
+                logger.warning(warning_msg)

        # Register all components at once
        self.register_components(**components_to_register)
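
Note on the `trust_remote_code` handling in the hunk above: the flag is dropped for any component whose repository differs from the pipeline's own. A minimal standalone sketch of that rule follows. The function name `strip_external_trust_remote_code` and its arguments are hypothetical and exist only for illustration; in the diff the logic lives inline in the component-loading loop.

```python
def strip_external_trust_remote_code(load_kwargs, pipeline_repo, component_repo):
    """Return (kwargs, stripped) following the rule sketched in the diff above.

    Hypothetical helper: `trust_remote_code` is removed when the component
    comes from a different repository than the pipeline.
    """
    kwargs = dict(load_kwargs)  # copy so the caller's dict is left untouched
    stripped = False
    if (
        "trust_remote_code" in kwargs
        and pipeline_repo is not None
        and component_repo != pipeline_repo
    ):
        kwargs.pop("trust_remote_code")
        stripped = True
    return kwargs, stripped


# Component hosted in a different repo: the flag is not forwarded.
external_kwargs, external_stripped = strip_external_trust_remote_code(
    {"trust_remote_code": True, "torch_dtype": "float16"}, "org/pipeline", "other/model"
)

# Component hosted in the pipeline's own repo: the flag is kept.
same_kwargs, same_stripped = strip_external_trust_remote_code(
    {"trust_remote_code": True}, "org/pipeline", "org/pipeline"
)
```

The `stripped` flag is what lets the caller pick the more specific warning message when a later load failure mentions `trust_remote_code`.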
--- a/src/diffusers/modular_pipelines/modular_pipeline_utils.py
+++ b/src/diffusers/modular_pipelines/modular_pipeline_utils.py
@@ -22,10 +22,12 @@ from typing import Any, Literal, Type, Union, get_args, get_origin

 import PIL.Image
 import torch
+from packaging.specifiers import InvalidSpecifier, SpecifierSet

 from ..configuration_utils import ConfigMixin, FrozenDict
 from ..loaders.single_file_utils import _is_single_file_path_or_url
 from ..utils import DIFFUSERS_LOAD_ID_FIELDS, is_torch_available, logging
+from ..utils.import_utils import _is_package_available


 if is_torch_available():
@@ -50,11 +52,7 @@ This modular pipeline is composed of the following blocks:

 {components_description} {configs_section}

-## Input/Output Specification
-
-### Inputs {inputs_description}
-
-### Outputs {outputs_description}
+{io_specification_section}
 """


@@ -311,6 +309,12 @@ class ComponentSpec:
                f"`type_hint` is required when loading a single file model but is missing for component: {self.name}"
            )

+        # `torch_dtype` is not an accepted parameter for tokenizers and processors.
+        # As a result, it gets stored in `init_kwargs`, which are written to the config
+        # during save. This causes JSON serialization to fail when saving the component.
+        if self.type_hint is not None and not issubclass(self.type_hint, torch.nn.Module):
+            kwargs.pop("torch_dtype", None)
+
        if self.type_hint is None:
            try:
                from diffusers import AutoModel
@@ -328,6 +332,12 @@ class ComponentSpec:
                else getattr(self.type_hint, "from_pretrained")
            )

+            # `torch_dtype` is not an accepted parameter for tokenizers and processors.
+            # As a result, it gets stored in `init_kwargs`, which are written to the config
+            # during save. This causes JSON serialization to fail when saving the component.
+            if not issubclass(self.type_hint, torch.nn.Module):
+                kwargs.pop("torch_dtype", None)
+
            try:
                component = load_method(pretrained_model_name_or_path, **load_kwargs, **kwargs)
            except Exception as e:
@@ -799,6 +809,46 @@ def format_output_params(output_params, indent_level=4, max_line_length=115):
    return format_params(output_params, "Outputs", indent_level, max_line_length)


+def format_params_markdown(params, header="Inputs"):
+    """Format a list of InputParam or OutputParam objects as a markdown bullet-point list.
+
+    Suitable for model cards rendered on Hugging Face Hub.
+
+    Args:
+        params: list of InputParam or OutputParam objects to format
+        header: Header text (e.g. "Inputs" or "Outputs")
+
+    Returns:
+        A formatted markdown string, or empty string if params is empty.
+    """
+    if not params:
+        return ""
+
+    def get_type_str(type_hint):
+        if isinstance(type_hint, UnionType) or get_origin(type_hint) is Union:
+            type_strs = [t.__name__ if hasattr(t, "__name__") else str(t) for t in get_args(type_hint)]
+            return " | ".join(type_strs)
+        return type_hint.__name__ if hasattr(type_hint, "__name__") else str(type_hint)
+
+    lines = [f"**{header}:**\n"] if header else []
+    for param in params:
+        type_str = get_type_str(param.type_hint) if param.type_hint != Any else ""
+        name = f"**{param.kwargs_type}" if param.name is None and param.kwargs_type is not None else param.name
+        param_str = f"- `{name}` (`{type_str}`"
+
+        if hasattr(param, "required") and not param.required:
+            param_str += ", *optional*"
+            if param.default is not None:
+                param_str += f", defaults to `{param.default}`"
+        param_str += ")"
+
+        desc = param.description if param.description else "No description provided"
+        param_str += f": {desc}"
+        lines.append(param_str)
+
+    return "\n".join(lines)
+
+
 def format_components(components, indent_level=4, max_line_length=115, add_empty_lines=True):
    """Format a list of ComponentSpec objects into a readable string representation.

@@ -972,6 +1022,89 @@ def make_doc_string(
    return output


+def _validate_requirements(reqs):
+    if reqs is None:
+        normalized_reqs = {}
+    else:
+        if not isinstance(reqs, dict):
+            raise ValueError(
+                "Requirements must be provided as a dictionary mapping package names to version specifiers."
+            )
+        normalized_reqs = _normalize_requirements(reqs)
+
+    if not normalized_reqs:
+        return {}
+
+    final: dict[str, str] = {}
+    for req, specified_ver in normalized_reqs.items():
+        req_available, req_actual_ver = _is_package_available(req)
+        if not req_available:
+            logger.warning(f"{req} was specified in the requirements but wasn't found in the current environment.")
+
+        if specified_ver:
+            try:
+                specifier = SpecifierSet(specified_ver)
+            except InvalidSpecifier as err:
+                raise ValueError(f"Requirement specifier '{specified_ver}' for {req} is invalid.") from err
+
+            if req_actual_ver == "N/A":
+                logger.warning(
+                    f"Version of {req} could not be determined to validate requirement '{specified_ver}'. Things might behave unexpectedly."
+                )
+            elif not specifier.contains(req_actual_ver, prereleases=True):
+                logger.warning(
+                    f"{req} requirement '{specified_ver}' is not satisfied by the installed version {req_actual_ver}. Things might behave unexpectedly."
+                )
+
+        final[req] = specified_ver
+
+    return final
+
+
+def _normalize_requirements(reqs):
+    if not reqs:
+        return {}
+
+    normalized: "OrderedDict[str, str]" = OrderedDict()
+
+    def _accumulate(mapping: dict[str, Any]):
+        for pkg, spec in mapping.items():
+            if isinstance(spec, dict):
+                # This is recursive because blocks are composable. This way, we can merge requirements
+                # from multiple blocks.
+                _accumulate(spec)
+                continue
+
+            pkg_name = str(pkg).strip()
+            if not pkg_name:
+                raise ValueError("Requirement package name cannot be empty.")
+
+            spec_str = "" if spec is None else str(spec).strip()
+            if spec_str and not spec_str.startswith(("<", ">", "=", "!", "~")):
+                spec_str = f"=={spec_str}"
+
+            existing_spec = normalized.get(pkg_name)
+            if existing_spec is not None:
+                if not existing_spec and spec_str:
+                    normalized[pkg_name] = spec_str
+                elif existing_spec and spec_str and existing_spec != spec_str:
+                    try:
+                        combined_spec = SpecifierSet(",".join(filter(None, [existing_spec, spec_str])))
+                    except InvalidSpecifier:
+                        logger.warning(
+                            f"Conflicting requirements for '{pkg_name}' detected: '{existing_spec}' vs '{spec_str}'. Keeping '{existing_spec}'."
+                        )
+                    else:
+                        normalized[pkg_name] = str(combined_spec)
+                continue
+
+            normalized[pkg_name] = spec_str
+
+    _accumulate(reqs)
+
+    return normalized
+
+
 def combine_inputs(*named_input_lists: list[tuple[str, list[InputParam]]]) -> list[InputParam]:
    """
    Combines multiple lists of InputParam objects from different blocks. For duplicate inputs, updates only if current
@@ -1055,8 +1188,7 @@ def generate_modular_model_card_content(blocks) -> dict[str, Any]:
            - blocks_description: Detailed architecture of blocks
            - components_description: List of required components
            - configs_section: Configuration parameters section
-            - inputs_description: Input parameters specification
-            - outputs_description: Output parameters specification
+            - io_specification_section: Input/Output specification (per-workflow or unified)
            - trigger_inputs_section: Conditional execution information
            - tags: List of relevant tags for the model card
    """
@@ -1075,15 +1207,6 @@ def generate_modular_model_card_content(blocks) -> dict[str, Any]:
            if block_desc:
                blocks_desc_parts.append(f"   - {block_desc}")

-            # add sub-blocks if any
-            if hasattr(block, "sub_blocks") and block.sub_blocks:
-                for sub_name, sub_block in block.sub_blocks.items():
-                    sub_class = sub_block.__class__.__name__
-                    sub_desc = sub_block.description.split("\n")[0] if getattr(sub_block, "description", "") else ""
-                    blocks_desc_parts.append(f"   - *{sub_name}*: `{sub_class}`")
-                    if sub_desc:
-                        blocks_desc_parts.append(f"     - {sub_desc}")
-
    blocks_description = "\n".join(blocks_desc_parts) if blocks_desc_parts else "No blocks defined."

    components = getattr(blocks, "expected_components", [])
@@ -1109,63 +1232,76 @@ def generate_modular_model_card_content(blocks) -> dict[str, Any]:
        if configs_description:
            configs_section = f"\n\n## Configuration Parameters\n\n{configs_description}"

-    inputs = blocks.inputs
-    outputs = blocks.outputs
+    # Branch on whether workflows are defined
+    has_workflows = getattr(blocks, "_workflow_map", None) is not None

-    # format inputs as markdown list
-    inputs_parts = []
-    required_inputs = [inp for inp in inputs if inp.required]
-    optional_inputs = [inp for inp in inputs if not inp.required]
+    if has_workflows:
+        workflow_map = blocks._workflow_map
+        parts = []

-    if required_inputs:
-        inputs_parts.append("**Required:**\n")
-        for inp in required_inputs:
-            if hasattr(inp.type_hint, "__name__"):
-                type_str = inp.type_hint.__name__
-            elif inp.type_hint is not None:
-                type_str = str(inp.type_hint).replace("typing.", "")
-            else:
-                type_str = "Any"
-            desc = inp.description or "No description provided"
-            inputs_parts.append(f"- `{inp.name}` (`{type_str}`): {desc}")
+        # If blocks overrides outputs (e.g. to return just "images" instead of all intermediates),
+        # use that as the shared output for all workflows
+        blocks_outputs = blocks.outputs
+        blocks_intermediate = getattr(blocks, "intermediate_outputs", None)
+        shared_outputs = (
+            blocks_outputs if blocks_intermediate is not None and blocks_outputs != blocks_intermediate else None
+        )

-    if optional_inputs:
-        if required_inputs:
-            inputs_parts.append("")
-        inputs_parts.append("**Optional:**\n")
-        for inp in optional_inputs:
-            if hasattr(inp.type_hint, "__name__"):
-                type_str = inp.type_hint.__name__
-            elif inp.type_hint is not None:
-                type_str = str(inp.type_hint).replace("typing.", "")
-            else:
-                type_str = "Any"
-            desc = inp.description or "No description provided"
-            default_str = f", default: `{inp.default}`" if inp.default is not None else ""
-            inputs_parts.append(f"- `{inp.name}` (`{type_str}`){default_str}: {desc}")
+        parts.append("## Workflow Input Specification\n")

-    inputs_description = "\n".join(inputs_parts) if inputs_parts else "No specific inputs defined."
+        # Per-workflow details: show trigger inputs with full param descriptions
+        for wf_name, trigger_inputs in workflow_map.items():
+            trigger_input_names = set(trigger_inputs.keys())
+            try:
+                workflow_blocks = blocks.get_workflow(wf_name)
+            except Exception:
+                parts.append(f"<details>\n<summary><strong>{wf_name}</strong></summary>\n")
+                parts.append("*Could not resolve workflow blocks.*\n")
+                parts.append("</details>\n")
+                continue

-    # format outputs as markdown list
-    outputs_parts = []
-    for out in outputs:
-        if hasattr(out.type_hint, "__name__"):
-            type_str = out.type_hint.__name__
-        elif out.type_hint is not None:
-            type_str = str(out.type_hint).replace("typing.", "")
-        else:
-            type_str = "Any"
-        desc = out.description or "No description provided"
-        outputs_parts.append(f"- `{out.name}` (`{type_str}`): {desc}")
+            wf_inputs = workflow_blocks.inputs
+            # Show only trigger inputs with full parameter descriptions
+            trigger_params = [p for p in wf_inputs if p.name in trigger_input_names]

-    outputs_description = "\n".join(outputs_parts) if outputs_parts else "Standard pipeline outputs."
+            parts.append(f"<details>\n<summary><strong>{wf_name}</strong></summary>\n")

-    trigger_inputs_section = ""
-    if hasattr(blocks, "trigger_inputs") and blocks.trigger_inputs:
-        trigger_inputs_list = sorted([t for t in blocks.trigger_inputs if t is not None])
-        if trigger_inputs_list:
-            trigger_inputs_str = ", ".join(f"`{t}`" for t in trigger_inputs_list)
-            trigger_inputs_section = f"""
+            inputs_str = format_params_markdown(trigger_params, header=None)
+            parts.append(inputs_str if inputs_str else "No additional inputs required.")
+            parts.append("")
+
+            parts.append("</details>\n")
+
+        # Common Inputs & Outputs section (like non-workflow pipelines)
+        all_inputs = blocks.inputs
+        all_outputs = shared_outputs if shared_outputs is not None else blocks.outputs
+
+        inputs_str = format_params_markdown(all_inputs, "Inputs")
+        outputs_str = format_params_markdown(all_outputs, "Outputs")
+        inputs_description = inputs_str if inputs_str else "No specific inputs defined."
+        outputs_description = outputs_str if outputs_str else "Standard pipeline outputs."
+
+        parts.append(f"\n## Input/Output Specification\n\n{inputs_description}\n\n{outputs_description}")
+
+        io_specification_section = "\n".join(parts)
+        # Suppress trigger_inputs_section when workflows are shown (it's redundant)
+        trigger_inputs_section = ""
+    else:
+        # Unified I/O section (original behavior)
+        inputs = blocks.inputs
+        outputs = blocks.outputs
+        inputs_str = format_params_markdown(inputs, "Inputs")
+        outputs_str = format_params_markdown(outputs, "Outputs")
+        inputs_description = inputs_str if inputs_str else "No specific inputs defined."
+        outputs_description = outputs_str if outputs_str else "Standard pipeline outputs."
+        io_specification_section = f"## Input/Output Specification\n\n{inputs_description}\n\n{outputs_description}"
+
+        trigger_inputs_section = ""
+        if hasattr(blocks, "trigger_inputs") and blocks.trigger_inputs:
+            trigger_inputs_list = sorted([t for t in blocks.trigger_inputs if t is not None])
+            if trigger_inputs_list:
+                trigger_inputs_str = ", ".join(f"`{t}`" for t in trigger_inputs_list)
+                trigger_inputs_section = f"""
 ### Conditional Execution

 This pipeline contains blocks that are selected at runtime based on inputs:
@@ -1178,7 +1314,18 @@ This pipeline contains blocks that are selected at runtime based on inputs:
    if hasattr(blocks, "model_name") and blocks.model_name:
        tags.append(blocks.model_name)

-    if hasattr(blocks, "trigger_inputs") and blocks.trigger_inputs:
+    if has_workflows:
+        # Derive tags from workflow names
+        workflow_names = set(blocks._workflow_map.keys())
+        if any("inpainting" in wf for wf in workflow_names):
+            tags.append("inpainting")
+        if any("image2image" in wf for wf in workflow_names):
+            tags.append("image-to-image")
+        if any("controlnet" in wf for wf in workflow_names):
+            tags.append("controlnet")
+        if any("text2image" in wf for wf in workflow_names):
+            tags.append("text-to-image")
+    elif hasattr(blocks, "trigger_inputs") and blocks.trigger_inputs:
        triggers = blocks.trigger_inputs
        if any(t in triggers for t in ["mask", "mask_image"]):
            tags.append("inpainting")
@@ -1206,8 +1353,7 @@ This pipeline uses a {block_count}-block architecture that can be customized and
        "blocks_description": blocks_description,
        "components_description": components_description,
        "configs_section": configs_section,
-        "inputs_description": inputs_description,
-        "outputs_description": outputs_description,
+        "io_specification_section": io_specification_section,
        "trigger_inputs_section": trigger_inputs_section,
        "tags": tags,
    }
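
Note on `_normalize_requirements` in the diff above: nested dictionaries are flattened recursively (so requirements from composed blocks merge into one mapping), and a bare version such as `2.1.0` is prefixed with `==`. The sketch below is a simplified, stdlib-only approximation. The name `normalize_requirements` is hypothetical, and conflicting specifiers are joined with a comma here instead of being combined and validated through `packaging.specifiers.SpecifierSet` as the diff does.

```python
def normalize_requirements(reqs):
    """Flatten nested requirement dicts and prefix bare versions with '=='.

    Simplified sketch: differing specifiers for one package are joined with
    a comma, without the SpecifierSet validation used in the actual change.
    """
    normalized = {}

    def accumulate(mapping):
        for pkg, spec in mapping.items():
            if isinstance(spec, dict):
                # Nested dict: requirements contributed by a sub-block.
                accumulate(spec)
                continue

            name = str(pkg).strip()
            if not name:
                raise ValueError("Requirement package name cannot be empty.")

            spec_str = "" if spec is None else str(spec).strip()
            if spec_str and not spec_str.startswith(("<", ">", "=", "!", "~")):
                spec_str = f"=={spec_str}"

            existing = normalized.get(name)
            if existing is None or (not existing and spec_str):
                # First sighting, or an earlier unpinned entry gains a specifier.
                normalized[name] = spec_str
            elif spec_str and spec_str != existing:
                # Two different specifiers: keep both constraints.
                normalized[name] = f"{existing},{spec_str}"

    accumulate(reqs or {})
    return normalized


merged = normalize_requirements(
    {
        "block_a": {"torch": "2.1.0"},
        "block_b": {"torch": ">=2.0", "numpy": None},
    }
)
```

Here `merged` maps `torch` to `==2.1.0,>=2.0` and `numpy` to an empty specifier, meaning any version is accepted.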
--- a/src/diffusers/schedulers/scheduling_ipndm.py
+++ b/src/diffusers/schedulers/scheduling_ipndm.py
@@ -31,14 +31,18 @@ class IPNDMScheduler(SchedulerMixin, ConfigMixin):
    Args:
        num_train_timesteps (`int`, defaults to 1000):
            The number of diffusion steps to train the model.
-        trained_betas (`np.ndarray`, *optional*):
+        trained_betas (`np.ndarray` or `List[float]`, *optional*):
            Pass an array of betas directly to the constructor to bypass `beta_start` and `beta_end`.
    """

    order = 1

    @register_to_config
-    def __init__(self, num_train_timesteps: int = 1000, trained_betas: np.ndarray | list[float] | None = None):
+    def __init__(
+        self,
+        num_train_timesteps: int = 1000,
+        trained_betas: np.ndarray | list[float] | None = None,
+    ):
        # set `betas`, `alphas`, `timesteps`
        self.set_timesteps(num_train_timesteps)

@@ -56,21 +60,29 @@ class IPNDMScheduler(SchedulerMixin, ConfigMixin):
        self._begin_index = None

    @property
-    def step_index(self):
+    def step_index(self) -> int | None:
        """
        The index counter for current timestep. It will increase 1 after each scheduler step.
+
+        Returns:
+            `int` or `None`:
+                The index counter for current timestep.
        """
        return self._step_index

    @property
-    def begin_index(self):
+    def begin_index(self) -> int | None:
        """
        The index for the first timestep. It should be set from pipeline with `set_begin_index` method.
+
+        Returns:
+            `int` or `None`:
+                The index for the first timestep.
        """
        return self._begin_index

    # Copied from diffusers.schedulers.scheduling_dpmsolver_multistep.DPMSolverMultistepScheduler.set_begin_index
-    def set_begin_index(self, begin_index: int = 0):
+    def set_begin_index(self, begin_index: int = 0) -> None:
        """
        Sets the begin index for the scheduler. This function should be run from pipeline before the inference.

@@ -169,7 +181,7 @@ class IPNDMScheduler(SchedulerMixin, ConfigMixin):
        Args:
            model_output (`torch.Tensor`):
                The direct output from learned diffusion model.
-            timestep (`int`):
+            timestep (`int` or `torch.Tensor`):
                The current discrete timestep in the diffusion chain.
            sample (`torch.Tensor`):
                A current instance of a sample created by the diffusion process.
@@ -228,7 +240,30 @@ class IPNDMScheduler(SchedulerMixin, ConfigMixin):
        """
        return sample

-    def _get_prev_sample(self, sample, timestep_index, prev_timestep_index, ets):
+    def _get_prev_sample(
+        self,
+        sample: torch.Tensor,
+        timestep_index: int,
+        prev_timestep_index: int,
+        ets: torch.Tensor,
+    ) -> torch.Tensor:
+        """
+        Predicts the previous sample based on the current sample, timestep indices, and running model outputs.
+
+        Args:
+            sample (`torch.Tensor`):
+                The current sample.
+            timestep_index (`int`):
+                Index of the current timestep in the schedule.
+            prev_timestep_index (`int`):
+                Index of the previous timestep in the schedule.
+            ets (`torch.Tensor`):
+                The running sequence of model outputs.
+
+        Returns:
+            `torch.Tensor`:
+                The predicted previous sample.
+        """
        alpha = self.alphas[timestep_index]
        sigma = self.betas[timestep_index]

@@ -240,5 +275,5 @@ class IPNDMScheduler(SchedulerMixin, ConfigMixin):

        return prev_sample

-    def __len__(self):
+    def __len__(self) -> int:
        return self.config.num_train_timesteps
--- a/src/diffusers/utils/dynamic_modules_utils.py
+++ b/src/diffusers/utils/dynamic_modules_utils.py
@@ -299,7 +299,10 @@ def get_cached_module_file(
    # Download and cache module_file from the repo `pretrained_model_name_or_path` of grab it if it's a local file.
    pretrained_model_name_or_path = str(pretrained_model_name_or_path)

-    module_file_or_url = os.path.join(pretrained_model_name_or_path, module_file)
+    if subfolder is not None:
+        module_file_or_url = os.path.join(pretrained_model_name_or_path, subfolder, module_file)
+    else:
+        module_file_or_url = os.path.join(pretrained_model_name_or_path, module_file)

    if os.path.isfile(module_file_or_url):
        resolved_module_file = module_file_or_url
@@ -384,7 +387,11 @@ def get_cached_module_file(
                if not os.path.exists(submodule_path / module_folder):
                    os.makedirs(submodule_path / module_folder)
            module_needed = f"{module_needed}.py"
-            shutil.copyfile(os.path.join(pretrained_model_name_or_path, module_needed), submodule_path / module_needed)
+            if subfolder is not None:
+                source_path = os.path.join(pretrained_model_name_or_path, subfolder, module_needed)
+            else:
+                source_path = os.path.join(pretrained_model_name_or_path, module_needed)
+            shutil.copyfile(source_path, submodule_path / module_needed)
    else:
        # Get the commit hash
        # TODO: we will get this info in the etag soon, so retrieve it from there and not here.
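
Note on the `subfolder` handling in `get_cached_module_file` above: both the module file lookup and the copy of relative imports now include the subfolder when one is given. A minimal sketch of that path resolution, using a hypothetical helper name `resolve_module_path` (the diff performs the join inline):

```python
import os


def resolve_module_path(root, module_file, subfolder=None):
    """Join root, optional subfolder, and module file into one local path.

    Hypothetical helper mirroring the two branches added in the diff.
    """
    if subfolder is not None:
        return os.path.join(root, subfolder, module_file)
    return os.path.join(root, module_file)


with_subfolder = resolve_module_path("my_pipeline", "modeling.py", subfolder="custom_model")
without_subfolder = resolve_module_path("my_pipeline", "modeling.py")
```

This matches the layout exercised by the new test in `tests/models/test_models_auto.py`, where `modeling.py` sits inside the `custom_model` subfolder of the local directory.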
--- a/src/diffusers/utils/hub_utils.py
+++ b/src/diffusers/utils/hub_utils.py
@@ -107,6 +107,7 @@ def load_or_create_model_card(
    widget: list[dict] | None = None,
    inference: bool | None = None,
    is_modular: bool = False,
+    update_model_card: bool = False,
 ) -> ModelCard:
    """
    Loads or creates a model card.
@@ -133,6 +134,9 @@ def load_or_create_model_card(
            `load_or_create_model_card` from a training script.
        is_modular: (`bool`, optional): Boolean flag to denote if the model card is for a modular pipeline.
            When True, uses model_description as-is without additional template formatting.
+        update_model_card: (`bool`, optional): When True, regenerates the model card content even if one
+            already exists on the remote repo. Existing card metadata (tags, license, etc.) is preserved. Only
+            supported for modular pipelines (i.e., `is_modular=True`).
    """
    if not is_jinja_available():
        raise ValueError(
@@ -141,9 +145,17 @@ def load_or_create_model_card(
            " To install it, please run `pip install Jinja2`."
        )

+    if update_model_card and not is_modular:
+        raise ValueError("`update_model_card=True` is only supported for modular pipelines (`is_modular=True`).")
+
    try:
        # Check if the model card is present on the remote repo
        model_card = ModelCard.load(repo_id_or_path, token=token)
+        # For modular pipelines, regenerate card content when requested (preserve existing metadata)
+        if update_model_card and is_modular and model_description is not None:
+            existing_data = model_card.data
+            model_card = ModelCard(model_description)
+            model_card.data = existing_data
    except (EntryNotFoundError, RepositoryNotFoundError):
        # Otherwise create a model card from template
        if from_training:
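
Note on `update_model_card` above: when the flag is set for a modular pipeline, the card body is regenerated while the existing metadata is carried over. The sketch below illustrates that pattern with a stand-in `Card` dataclass. `Card` and `regenerate_card` are hypothetical and are not part of `huggingface_hub` or `diffusers`; the real code operates on `ModelCard` objects.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Card:
    """Hypothetical stand-in for a model card: body text plus metadata."""

    text: str
    data: dict = field(default_factory=dict)


def regenerate_card(existing: Card, new_text: Optional[str], update: bool, is_modular: bool) -> Card:
    """Replace the card body when requested, keeping the existing metadata."""
    if update and not is_modular:
        raise ValueError("`update_model_card=True` is only supported for modular pipelines.")
    if update and new_text is not None:
        # New body, same metadata (tags, license, ...).
        return Card(text=new_text, data=existing.data)
    return existing


old_card = Card(text="old description", data={"tags": ["modular-diffusers"], "license": "apache-2.0"})
new_card = regenerate_card(old_card, "regenerated description", update=True, is_modular=True)
```

Without the flag, the existing card is returned unchanged, which matches the behavior before this change.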
--- a/tests/models/test_models_auto.py
+++ b/tests/models/test_models_auto.py
@@ -1,9 +1,15 @@
+import json
+import os
+import tempfile
 import unittest
 from unittest.mock import MagicMock, patch

+import torch
 from transformers import CLIPTextModel, LongformerModel

+from diffusers import ConfigMixin
 from diffusers.models import AutoModel, UNet2DConditionModel
+from diffusers.models.modeling_utils import ModelMixin


 class TestAutoModel(unittest.TestCase):
@@ -35,6 +41,45 @@ class TestAutoModel(unittest.TestCase):
        )
        assert isinstance(model, CLIPTextModel)

+    def test_load_dynamic_module_from_local_path_with_subfolder(self):
+        CUSTOM_MODEL_CODE = (
+            "import torch\n"
+            "from diffusers import ModelMixin, ConfigMixin\n"
+            "from diffusers.configuration_utils import register_to_config\n"
+            "\n"
+            "class CustomModel(ModelMixin, ConfigMixin):\n"
+            "    @register_to_config\n"
+            "    def __init__(self, hidden_size=8):\n"
+            "        super().__init__()\n"
+            "        self.linear = torch.nn.Linear(hidden_size, hidden_size)\n"
+            "\n"
+            "    def forward(self, x):\n"
+            "        return self.linear(x)\n"
+        )
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            subfolder = "custom_model"
+            model_dir = os.path.join(tmpdir, subfolder)
+            os.makedirs(model_dir)
+
+            with open(os.path.join(model_dir, "modeling.py"), "w") as f:
+                f.write(CUSTOM_MODEL_CODE)
+
+            config = {
+                "_class_name": "CustomModel",
+                "_diffusers_version": "0.0.0",
+                "auto_map": {"AutoModel": "modeling.CustomModel"},
+                "hidden_size": 8,
+            }
+            with open(os.path.join(model_dir, "config.json"), "w") as f:
+                json.dump(config, f)
+
+            torch.save({}, os.path.join(model_dir, "diffusion_pytorch_model.bin"))
+
+            model = AutoModel.from_pretrained(tmpdir, subfolder=subfolder, trust_remote_code=True)
+            assert model.__class__.__name__ == "CustomModel"
+            assert model.config["hidden_size"] == 8
+

 class TestAutoModelFromConfig(unittest.TestCase):
    @patch(
@@ -100,3 +145,51 @@ class TestAutoModelFromConfig(unittest.TestCase):
    def test_from_config_raises_on_none(self):
        with self.assertRaises(ValueError, msg="Please provide a `pretrained_model_name_or_path_or_dict`"):
            AutoModel.from_config(None)
+
+
+class TestRegisterForAutoClass(unittest.TestCase):
+    def test_register_for_auto_class_sets_attribute(self):
+        class DummyModel(ModelMixin, ConfigMixin):
+            config_name = "config.json"
+
+        DummyModel.register_for_auto_class("AutoModel")
+        self.assertEqual(DummyModel._auto_class, "AutoModel")
+
+    def test_register_for_auto_class_rejects_unsupported(self):
+        class DummyModel(ModelMixin, ConfigMixin):
+            config_name = "config.json"
+
+        with self.assertRaises(ValueError, msg="Only 'AutoModel' is supported"):
+            DummyModel.register_for_auto_class("AutoPipeline")
+
+    def test_auto_map_in_saved_config(self):
+        class DummyModel(ModelMixin, ConfigMixin):
+            config_name = "config.json"
+
+        DummyModel.register_for_auto_class("AutoModel")
+        model = DummyModel()
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            model.save_config(tmpdir)
+            config_path = os.path.join(tmpdir, "config.json")
+            with open(config_path, "r") as f:
+                config = json.load(f)
+
+        self.assertIn("auto_map", config)
+        self.assertIn("AutoModel", config["auto_map"])
+        module_name = DummyModel.__module__.split(".")[-1]
+        self.assertEqual(config["auto_map"]["AutoModel"], f"{module_name}.DummyModel")
+
+    def test_no_auto_map_without_register(self):
+        class DummyModel(ModelMixin, ConfigMixin):
+            config_name = "config.json"
+
+        model = DummyModel()
+
+        with tempfile.TemporaryDirectory() as tmpdir:
+            model.save_config(tmpdir)
+            config_path = os.path.join(tmpdir, "config.json")
+            with open(config_path, "r") as f:
+                config = json.load(f)
+
+        self.assertNotIn("auto_map", config)
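
Note on `TestRegisterForAutoClass` above: the tests expect the saved config to contain an `auto_map` entry of the form `<last module segment>.<class name>`. A minimal sketch of that mapping, assuming a hypothetical helper `auto_map_entry` (not part of diffusers) and reusing the error text the test checks for:

```python
def auto_map_entry(cls, auto_class="AutoModel"):
    # The tests only accept "AutoModel"; anything else is rejected.
    if auto_class != "AutoModel":
        raise ValueError("Only 'AutoModel' is supported")
    # Keep only the last segment of the dotted module path, as the test does.
    module_name = cls.__module__.split(".")[-1]
    return {auto_class: f"{module_name}.{cls.__name__}"}


class DummyModel:
    """Hypothetical placeholder class used only for this illustration."""


entry = auto_map_entry(DummyModel)
```

The resulting dictionary has the same shape as the `"auto_map"` value asserted in `test_auto_map_in_saved_config`.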
--- a/tests/modular_pipelines/test_modular_pipelines_common.py
+++ b/tests/modular_pipelines/test_modular_pipelines_common.py
@@ -10,6 +10,11 @@ import torch
 import diffusers
 from diffusers import AutoModel, ComponentsManager, ModularPipeline, ModularPipelineBlocks
 from diffusers.guiders import ClassifierFreeGuidance
+from diffusers.modular_pipelines import (
+    ConditionalPipelineBlocks,
+    LoopSequentialPipelineBlocks,
+    SequentialPipelineBlocks,
+)
 from diffusers.modular_pipelines.modular_pipeline_utils import (
    ComponentSpec,
    ConfigSpec,
@@ -19,7 +24,13 @@ from diffusers.modular_pipelines.modular_pipeline_utils import (
 )
 from diffusers.utils import logging

-from ..testing_utils import backend_empty_cache, numpy_cosine_similarity_distance, require_accelerator, torch_device
+from ..testing_utils import (
+    CaptureLogger,
+    backend_empty_cache,
+    numpy_cosine_similarity_distance,
+    require_accelerator,
+    torch_device,
+)


 class ModularPipelineTesterMixin:
@@ -429,6 +440,117 @@ class ModularGuiderTesterMixin:
        assert max_diff > expected_max_diff, "Output with CFG must be different from normal inference"


+class TestCustomBlockRequirements:
+    def get_dummy_block_pipe(self):
+        class DummyBlockOne:
+            # keep two arbitrary deps so that we can test warnings.
+            _requirements = {"xyz": ">=0.8.0", "abc": ">=10.0.0"}
+
+        class DummyBlockTwo:
+            # keep two dependencies that will be available during testing.
+            _requirements = {"transformers": ">=4.44.0", "diffusers": ">=0.2.0"}
+
+        pipe = SequentialPipelineBlocks.from_blocks_dict(
+            {"dummy_block_one": DummyBlockOne, "dummy_block_two": DummyBlockTwo}
+        )
+        return pipe
+
+    def get_dummy_conditional_block_pipe(self):
+        class DummyBlockOne:
+            _requirements = {"xyz": ">=0.8.0", "abc": ">=10.0.0"}
+
+        class DummyBlockTwo:
+            _requirements = {"transformers": ">=4.44.0", "diffusers": ">=0.2.0"}
+
+        class DummyConditionalBlocks(ConditionalPipelineBlocks):
+            block_classes = [DummyBlockOne, DummyBlockTwo]
+            block_names = ["block_one", "block_two"]
+            block_trigger_inputs = []
+
+            def select_block(self, **kwargs):
+                return "block_one"
+
+        return DummyConditionalBlocks()
+
+    def get_dummy_loop_block_pipe(self):
+        class DummyBlockOne:
+            _requirements = {"xyz": ">=0.8.0", "abc": ">=10.0.0"}
+
+        class DummyBlockTwo:
+            _requirements = {"transformers": ">=4.44.0", "diffusers": ">=0.2.0"}
+
+        return LoopSequentialPipelineBlocks.from_blocks_dict({"block_one": DummyBlockOne, "block_two": DummyBlockTwo})
+
+    def test_sequential_block_requirements_save_load(self, tmp_path):
+        pipe = self.get_dummy_block_pipe()
+        pipe.save_pretrained(tmp_path)
+
+        config_path = tmp_path / "modular_config.json"
+
+        with open(config_path, "r") as f:
+            config = json.load(f)
+
+        assert "requirements" in config
+        requirements = config["requirements"]
+
+        expected_requirements = {
+            "xyz": ">=0.8.0",
+            "abc": ">=10.0.0",
+            "transformers": ">=4.44.0",
+            "diffusers": ">=0.2.0",
+        }
+        assert expected_requirements == requirements
+
+    def test_sequential_block_requirements_warnings(self, tmp_path):
+        pipe = self.get_dummy_block_pipe()
+
+        logger = logging.get_logger("diffusers.modular_pipelines.modular_pipeline_utils")
+        logger.setLevel(30)
+
+        with CaptureLogger(logger) as cap_logger:
+            pipe.save_pretrained(tmp_path)
+
+        template = "{req} was specified in the requirements but wasn't found in the current environment"
+        msg_xyz = template.format(req="xyz")
+        msg_abc = template.format(req="abc")
+        assert msg_xyz in str(cap_logger.out)
+        assert msg_abc in str(cap_logger.out)
+
+    def test_conditional_block_requirements_save_load(self, tmp_path):
+        pipe = self.get_dummy_conditional_block_pipe()
+        pipe.save_pretrained(tmp_path)
+
+        config_path = tmp_path / "modular_config.json"
+        with open(config_path, "r") as f:
+            config = json.load(f)
+
+        assert "requirements" in config
+        expected_requirements = {
+            "xyz": ">=0.8.0",
+            "abc": ">=10.0.0",
+            "transformers": ">=4.44.0",
+            "diffusers": ">=0.2.0",
+        }
+        assert expected_requirements == config["requirements"]
+
+    def test_loop_block_requirements_save_load(self, tmp_path):
+        pipe = self.get_dummy_loop_block_pipe()
+        pipe.save_pretrained(tmp_path)
+
+        config_path = tmp_path / "modular_config.json"
+        with open(config_path, "r") as f:
+            config = json.load(f)
+
+        assert "requirements" in config
+        expected_requirements = {
+            "xyz": ">=0.8.0",
+            "abc": ">=10.0.0",
+            "transformers": ">=4.44.0",
+            "diffusers": ">=0.2.0",
+        }
+        assert expected_requirements == config["requirements"]
+
+
 class TestModularModelCardContent:
    def create_mock_block(self, name="TestBlock", description="Test block description"):
        class MockBlock:
@@ -483,8 +605,7 @@ class TestModularModelCardContent:
            "blocks_description",
            "components_description",
            "configs_section",
-            "inputs_description",
-            "outputs_description",
+            "io_specification_section",
            "trigger_inputs_section",
            "tags",
        ]
@@ -581,18 +702,19 @@ class TestModularModelCardContent:
        blocks = self.create_mock_blocks(inputs=inputs)
        content = generate_modular_model_card_content(blocks)

-        assert "**Required:**" in content["inputs_description"]
-        assert "**Optional:**" in content["inputs_description"]
-        assert "prompt" in content["inputs_description"]
-        assert "num_steps" in content["inputs_description"]
-        assert "default: `50`" in content["inputs_description"]
+        io_section = content["io_specification_section"]
+        assert "**Inputs:**" in io_section
+        assert "prompt" in io_section
+        assert "num_steps" in io_section
+        assert "*optional*" in io_section
+        assert "defaults to `50`" in io_section

    def test_inputs_description_empty(self):
        """Test handling of pipelines without specific inputs."""
        blocks = self.create_mock_blocks(inputs=[])
        content = generate_modular_model_card_content(blocks)

-        assert "No specific inputs defined" in content["inputs_description"]
+        assert "No specific inputs defined" in content["io_specification_section"]

    def test_outputs_description_formatting(self):
        """Test that outputs are correctly formatted."""
@@ -602,15 +724,16 @@ class TestModularModelCardContent:
        blocks = self.create_mock_blocks(outputs=outputs)
        content = generate_modular_model_card_content(blocks)

-        assert "images" in content["outputs_description"]
-        assert "Generated images" in content["outputs_description"]
+        io_section = content["io_specification_section"]
+        assert "images" in io_section
+        assert "Generated images" in io_section

    def test_outputs_description_empty(self):
        """Test handling of pipelines without specific outputs."""
        blocks = self.create_mock_blocks(outputs=[])
        content = generate_modular_model_card_content(blocks)

-        assert "Standard pipeline outputs" in content["outputs_description"]
+        assert "Standard pipeline outputs" in content["io_specification_section"]

    def test_trigger_inputs_section_with_triggers(self):
        """Test that trigger inputs section is generated when present."""
@@ -628,35 +751,6 @@ class TestModularModelCardContent:

        assert content["trigger_inputs_section"] == ""

-    def test_blocks_description_with_sub_blocks(self):
-        """Test that blocks with sub-blocks are correctly described."""
-
-        class MockBlockWithSubBlocks:
-            def __init__(self):
-                self.__class__.__name__ = "ParentBlock"
-                self.description = "Parent block"
-                self.sub_blocks = {
-                    "child1": self.create_child_block("ChildBlock1", "Child 1 description"),
-                    "child2": self.create_child_block("ChildBlock2", "Child 2 description"),
-                }
-
-            def create_child_block(self, name, desc):
-                class ChildBlock:
-                    def __init__(self):
-                        self.__class__.__name__ = name
-                        self.description = desc
-
-                return ChildBlock()
-
-        blocks = self.create_mock_blocks()
-        blocks.sub_blocks["parent"] = MockBlockWithSubBlocks()
-
-        content = generate_modular_model_card_content(blocks)
-
-        assert "parent" in content["blocks_description"]
-        assert "child1" in content["blocks_description"]
-        assert "child2" in content["blocks_description"]
-
    def test_model_description_includes_block_count(self):
        """Test that model description includes the number of blocks."""
        blocks = self.create_mock_blocks(num_blocks=5)
@@ -715,6 +809,18 @@ class TestLoadComponentsSkipBehavior:
        assert pipe.unet is not None
        assert getattr(pipe, "vae", None) is None

+    def test_load_components_selective_loading_incremental(self):
+        """Loading a subset of components should not affect already-loaded components."""
+        pipe = ModularPipeline.from_pretrained("hf-internal-testing/tiny-stable-diffusion-xl-pipe")
+
+        pipe.load_components(names="unet", torch_dtype=torch.float32)
+        pipe.load_components(names="text_encoder", torch_dtype=torch.float32)
+
+        assert hasattr(pipe, "unet")
+        assert pipe.unet is not None
+        assert hasattr(pipe, "text_encoder")
+        assert pipe.text_encoder is not None
+
    def test_load_components_skips_invalid_pretrained_path(self):
        pipe = ModularPipeline.from_pretrained("hf-internal-testing/tiny-stable-diffusion-xl-pipe")

@@ -730,6 +836,112 @@ class TestLoadComponentsSkipBehavior:
        assert not hasattr(pipe, "test_component") or pipe.test_component is None


+class TestCustomModelSavePretrained:
+    def test_save_pretrained_updates_index_for_local_model(self, tmp_path):
+        """When a component without _diffusers_load_id (custom/local model) is saved,
+        modular_model_index.json should point to the save directory."""
+        import json
+
+        pipe = ModularPipeline.from_pretrained("hf-internal-testing/tiny-stable-diffusion-xl-pipe")
+        pipe.load_components(torch_dtype=torch.float32)
+
+        pipe.unet._diffusers_load_id = "null"
+
+        save_dir = str(tmp_path / "my-pipeline")
+        pipe.save_pretrained(save_dir)
+
+        with open(os.path.join(save_dir, "modular_model_index.json")) as f:
+            index = json.load(f)
+
+        _library, _cls, unet_spec = index["unet"]
+        assert unet_spec["pretrained_model_name_or_path"] == save_dir
+        assert unet_spec["subfolder"] == "unet"
+
+        _library, _cls, vae_spec = index["vae"]
+        assert vae_spec["pretrained_model_name_or_path"] == "hf-internal-testing/tiny-stable-diffusion-xl-pipe"
+
+    def test_save_pretrained_roundtrip_with_local_model(self, tmp_path):
+        """A pipeline with a custom/local model should be saveable and re-loadable with identical outputs."""
+        pipe = ModularPipeline.from_pretrained("hf-internal-testing/tiny-stable-diffusion-xl-pipe")
+        pipe.load_components(torch_dtype=torch.float32)
+
+        pipe.unet._diffusers_load_id = "null"
+
+        original_state_dict = pipe.unet.state_dict()
+
+        save_dir = str(tmp_path / "my-pipeline")
+        pipe.save_pretrained(save_dir)
+
+        loaded_pipe = ModularPipeline.from_pretrained(save_dir)
+        loaded_pipe.load_components(torch_dtype=torch.float32)
+
+        assert loaded_pipe.unet is not None
+        assert loaded_pipe.unet.__class__.__name__ == pipe.unet.__class__.__name__
+
+        loaded_state_dict = loaded_pipe.unet.state_dict()
+        assert set(original_state_dict.keys()) == set(loaded_state_dict.keys())
+        for key in original_state_dict:
+            assert torch.equal(original_state_dict[key], loaded_state_dict[key]), f"Mismatch in {key}"
+
+    def test_save_pretrained_updates_index_for_model_with_no_load_id(self, tmp_path):
+        """Test the workflow of updating the pipeline with a custom model and then saving it:
+        modular_model_index.json should point to the save directory."""
+        import json
+
+        from diffusers import UNet2DConditionModel
+
+        pipe = ModularPipeline.from_pretrained("hf-internal-testing/tiny-stable-diffusion-xl-pipe")
+        pipe.load_components(torch_dtype=torch.float32)
+
+        unet = UNet2DConditionModel.from_pretrained(
+            "hf-internal-testing/tiny-stable-diffusion-xl-pipe", subfolder="unet"
+        )
+        assert not hasattr(unet, "_diffusers_load_id")
+
+        pipe.update_components(unet=unet)
+
+        save_dir = str(tmp_path / "my-pipeline")
+        pipe.save_pretrained(save_dir)
+
+        with open(os.path.join(save_dir, "modular_model_index.json")) as f:
+            index = json.load(f)
+
+        _library, _cls, unet_spec = index["unet"]
+        assert unet_spec["pretrained_model_name_or_path"] == save_dir
+        assert unet_spec["subfolder"] == "unet"
+
+        _library, _cls, vae_spec = index["vae"]
+        assert vae_spec["pretrained_model_name_or_path"] == "hf-internal-testing/tiny-stable-diffusion-xl-pipe"
+
+    def test_save_pretrained_overwrite_modular_index(self, tmp_path):
+        """With overwrite_modular_index=True, all component references should point to the save directory."""
+        import json
+
+        pipe = ModularPipeline.from_pretrained("hf-internal-testing/tiny-stable-diffusion-xl-pipe")
+        pipe.load_components(torch_dtype=torch.float32)
+
+        save_dir = str(tmp_path / "my-pipeline")
+        pipe.save_pretrained(save_dir, overwrite_modular_index=True)
+
+        with open(os.path.join(save_dir, "modular_model_index.json")) as f:
+            index = json.load(f)
+
+        for component_name in ["unet", "vae", "text_encoder", "text_encoder_2"]:
+            if component_name not in index:
+                continue
+            _library, _cls, spec = index[component_name]
+            assert spec["pretrained_model_name_or_path"] == save_dir, (
+                f"{component_name} should point to save dir but got {spec['pretrained_model_name_or_path']}"
+            )
+            assert spec["subfolder"] == component_name
+
+        loaded_pipe = ModularPipeline.from_pretrained(save_dir)
+        loaded_pipe.load_components(torch_dtype=torch.float32)
+
+        assert loaded_pipe.unet is not None
+        assert loaded_pipe.vae is not None
+
+
 class TestModularPipelineInitFallback:
    """Test that ModularPipeline.__init__ falls back to default_blocks_name when
    _blocks_class_name is a base class (e.g. SequentialPipelineBlocks saved by from_blocks_dict)."""
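
Note on `TestCustomBlockRequirements` in this file: the tests expect the `_requirements` of all sub-blocks to be merged into one `requirements` mapping, with a warning for each package missing from the environment. A minimal sketch of that idea, assuming hypothetical helpers `collect_requirements` and `_installed` (the actual diffusers implementation is not shown in this diff); only the warning text is taken from the test.

```python
import importlib.metadata
import logging

logger = logging.getLogger("requirements_sketch")  # hypothetical logger name


def _installed(name):
    # Assumption: a requirement counts as available if a distribution with that name is installed.
    try:
        importlib.metadata.version(name)
        return True
    except importlib.metadata.PackageNotFoundError:
        return False


def collect_requirements(blocks, is_installed=_installed):
    merged = {}
    for block in blocks:
        # Blocks without a `_requirements` attribute contribute nothing.
        merged.update(getattr(block, "_requirements", None) or {})
    missing = [name for name in merged if not is_installed(name)]
    for name in missing:
        logger.warning("%s was specified in the requirements but wasn't found in the current environment", name)
    return merged, missing


class DummyBlockOne:
    _requirements = {"xyz": ">=0.8.0", "abc": ">=10.0.0"}


class DummyBlockTwo:
    _requirements = {"transformers": ">=4.44.0", "diffusers": ">=0.2.0"}


# A fake availability check keeps the example independent of the local environment.
merged, missing = collect_requirements(
    [DummyBlockOne, DummyBlockTwo],
    is_installed=lambda name: name in {"transformers", "diffusers"},
)
```

Passing `is_installed` explicitly is a design choice of this sketch so the result does not depend on which packages happen to be installed.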
--- a/tests/modular_pipelines/test_modular_pipelines_custom_blocks.py
+++ b/tests/modular_pipelines/test_modular_pipelines_custom_blocks.py
@@ -192,6 +192,156 @@ class TestModularCustomBlocks:
        assert len(pipe.components) == 1
        assert pipe.component_names[0] == "transformer"

+    def test_trust_remote_code_not_propagated_to_external_repo(self):
+        """When a modular pipeline repo references a component from an external repo that has custom
+        code (auto_map in config), calling load_components(trust_remote_code=True) should NOT
+        propagate trust_remote_code to that external component. The external component should fail
+        to load."""
+
+        from diffusers import ModularPipeline
+
+        CUSTOM_MODEL_CODE = (
+            "import torch\n"
+            "from diffusers import ModelMixin, ConfigMixin\n"
+            "from diffusers.configuration_utils import register_to_config\n"
+            "\n"
+            "class CustomModel(ModelMixin, ConfigMixin):\n"
+            "    @register_to_config\n"
+            "    def __init__(self, hidden_size=8):\n"
+            "        super().__init__()\n"
+            "        self.linear = torch.nn.Linear(hidden_size, hidden_size)\n"
+            "\n"
+            "    def forward(self, x):\n"
+            "        return self.linear(x)\n"
+        )
+
+        with tempfile.TemporaryDirectory() as external_repo_dir, tempfile.TemporaryDirectory() as pipeline_repo_dir:
+            # Step 1: Create an external model repo with custom code (requires trust_remote_code)
+            with open(os.path.join(external_repo_dir, "modeling.py"), "w") as f:
+                f.write(CUSTOM_MODEL_CODE)
+
+            config = {
+                "_class_name": "CustomModel",
+                "_diffusers_version": "0.0.0",
+                "auto_map": {"AutoModel": "modeling.CustomModel"},
+                "hidden_size": 8,
+            }
+            with open(os.path.join(external_repo_dir, "config.json"), "w") as f:
+                json.dump(config, f)
+
+            torch.save({}, os.path.join(external_repo_dir, "diffusion_pytorch_model.bin"))
+
+            # Step 2: Create a custom block that references the external repo.
+            # Define both the class (for direct use) and its code string (for block.py).
+            class ExternalRefBlock(ModularPipelineBlocks):
+                @property
+                def expected_components(self):
+                    return [
+                        ComponentSpec(
+                            "custom_model",
+                            AutoModel,
+                            pretrained_model_name_or_path=external_repo_dir,
+                        )
+                    ]
+
+                @property
+                def inputs(self) -> List[InputParam]:
+                    return [InputParam("prompt", type_hint=str, required=True)]
+
+                @property
+                def intermediate_inputs(self) -> List[InputParam]:
+                    return []
+
+                @property
+                def intermediate_outputs(self) -> List[OutputParam]:
+                    return [OutputParam("output", type_hint=str)]
+
+                def __call__(self, components, state: PipelineState) -> PipelineState:
+                    block_state = self.get_block_state(state)
+                    block_state.output = "test"
+                    self.set_block_state(state, block_state)
+                    return components, state
+
+            EXTERNAL_REF_BLOCK_CODE_STR = (
+                "from typing import List\n"
+                "from diffusers import AutoModel\n"
+                "from diffusers.modular_pipelines import (\n"
+                "    ComponentSpec,\n"
+                "    InputParam,\n"
+                "    ModularPipelineBlocks,\n"
+                "    OutputParam,\n"
+                "    PipelineState,\n"
+                ")\n"
+                "\n"
+                "class ExternalRefBlock(ModularPipelineBlocks):\n"
+                "    @property\n"
+                "    def expected_components(self):\n"
+                "        return [\n"
+                "            ComponentSpec(\n"
+                '                "custom_model",\n'
+                "                AutoModel,\n"
+                f'                pretrained_model_name_or_path="{external_repo_dir}",\n'
+                "            )\n"
+                "        ]\n"
+                "\n"
+                "    @property\n"
+                "    def inputs(self) -> List[InputParam]:\n"
+                '        return [InputParam("prompt", type_hint=str, required=True)]\n'
+                "\n"
+                "    @property\n"
+                "    def intermediate_inputs(self) -> List[InputParam]:\n"
+                "        return []\n"
+                "\n"
+                "    @property\n"
+                "    def intermediate_outputs(self) -> List[OutputParam]:\n"
+                '        return [OutputParam("output", type_hint=str)]\n'
+                "\n"
+                "    def __call__(self, components, state: PipelineState) -> PipelineState:\n"
+                "        block_state = self.get_block_state(state)\n"
+                '        block_state.output = "test"\n'
+                "        self.set_block_state(state, block_state)\n"
+                "        return components, state\n"
+            )
+
+            # Save the block config, write block.py, then load back via from_pretrained
+            block = ExternalRefBlock()
+            block.save_pretrained(pipeline_repo_dir)
+
+            # auto_map will reference the module name derived from ExternalRefBlock.__module__,
+            # which is "test_modular_pipelines_custom_blocks". Write the code file with that name.
+            code_path = os.path.join(pipeline_repo_dir, "test_modular_pipelines_custom_blocks.py")
+            with open(code_path, "w") as f:
+                f.write(EXTERNAL_REF_BLOCK_CODE_STR)
+
+            block = ModularPipelineBlocks.from_pretrained(pipeline_repo_dir, trust_remote_code=True)
+            pipe = block.init_pipeline()
+            pipe.save_pretrained(pipeline_repo_dir)
+
+            # Step 3: Load the pipeline from the saved directory.
+            loaded_pipe = ModularPipeline.from_pretrained(pipeline_repo_dir, trust_remote_code=True)
+
+            assert loaded_pipe._pretrained_model_name_or_path == pipeline_repo_dir
+            assert loaded_pipe._component_specs["custom_model"].pretrained_model_name_or_path == external_repo_dir
+            assert getattr(loaded_pipe, "custom_model", None) is None
+
+            # Step 4a: load_components WITHOUT trust_remote_code.
+            # The external custom model should still fail to load.
+            loaded_pipe.load_components()
+            assert getattr(loaded_pipe, "custom_model", None) is None
+
+            # Step 4b: load_components with trust_remote_code=True.
+            # trust_remote_code should be stripped for the external component, so it fails.
+            # The warning should contain guidance about manually loading with trust_remote_code.
+            loaded_pipe.load_components(trust_remote_code=True)
+            assert getattr(loaded_pipe, "custom_model", None) is None
+
+            # Step 4c: Manually load with AutoModel and update_components — this should work.
+            from diffusers import AutoModel
+
+            custom_model = AutoModel.from_pretrained(external_repo_dir, trust_remote_code=True)
+            loaded_pipe.update_components(custom_model=custom_model)
+            assert getattr(loaded_pipe, "custom_model", None) is not None
+
    def test_custom_block_loads_from_hub(self):
        repo_id = "hf-internal-testing/tiny-modular-diffusers-block"
        block = ModularPipelineBlocks.from_pretrained(repo_id, trust_remote_code=True)
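
Note on `test_trust_remote_code_not_propagated_to_external_repo` above: the test expects `trust_remote_code=True` to apply only to components hosted in the pipeline's own repo. A minimal sketch of that rule, assuming a hypothetical helper `kwargs_for_component` and assuming "external" simply means the component's repo differs from the pipeline's repo (how diffusers actually decides this is not shown in this diff):

```python
def kwargs_for_component(pipeline_repo, component_repo, load_kwargs):
    # Work on a copy so the caller's kwargs are left untouched.
    kwargs = dict(load_kwargs)
    # Assumption: a component is "external" when it lives in a different repo than the pipeline.
    if component_repo != pipeline_repo:
        kwargs.pop("trust_remote_code", None)
    return kwargs


same_repo = kwargs_for_component("org/pipeline", "org/pipeline", {"trust_remote_code": True})
external = kwargs_for_component("org/pipeline", "other/model", {"trust_remote_code": True})
```

Under this rule, loading an external custom-code component requires an explicit `AutoModel.from_pretrained(..., trust_remote_code=True)` followed by `update_components`, which is what step 4c of the test does.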
--- a/tests/quantization/torchao/test_torchao.py
+++ b/tests/quantization/torchao/test_torchao.py
@@ -74,7 +74,7 @@ if is_torchao_available():

@require_torch
@require_torch_accelerator
-@require_torchao_version_greater_or_equal("0.7.0")
+@require_torchao_version_greater_or_equal("0.14.0")
 class TorchAoConfigTest(unittest.TestCase):
    def test_to_dict(self):
        """
@@ -132,7 +132,7 @@ class TorchAoConfigTest(unittest.TestCase):
 # Slices for these tests have been obtained on our aws-g6e-xlarge-plus runners
@require_torch
@require_torch_accelerator
-@require_torchao_version_greater_or_equal("0.7.0")
+@require_torchao_version_greater_or_equal("0.14.0")
 class TorchAoTest(unittest.TestCase):
    def tearDown(self):
        gc.collect()
@@ -587,7 +587,7 @@ class TorchAoTest(unittest.TestCase):
 # Slices for these tests have been obtained on our aws-g6e-xlarge-plus runners
@require_torch
@require_torch_accelerator
-@require_torchao_version_greater_or_equal("0.7.0")
+@require_torchao_version_greater_or_equal("0.14.0")
 class TorchAoSerializationTest(unittest.TestCase):
    model_name = "hf-internal-testing/tiny-flux-pipe"

@@ -698,23 +698,22 @@ class TorchAoSerializationTest(unittest.TestCase):
        self._check_serialization_expected_slice(quant_method, quant_method_kwargs, expected_slice, device)


-@require_torchao_version_greater_or_equal("0.7.0")
+@require_torchao_version_greater_or_equal("0.14.0")
 class TorchAoCompileTest(QuantCompileTests, unittest.TestCase):
    @property
    def quantization_config(self):
        return PipelineQuantizationConfig(
-            quant_mapping={
-                "transformer": TorchAoConfig(quant_type="int8_weight_only"),
-            },
+            quant_mapping={"transformer": TorchAoConfig(Int8WeightOnlyConfig())},
        )

-    @unittest.skip(
-        "Changing the device of AQT tensor with module._apply (called from doing module.to() in accelerate) does not work "
-        "when compiling."
-    )
    def test_torch_compile_with_cpu_offload(self):
+        pipe = self._init_pipeline(self.quantization_config, torch.bfloat16)
+        pipe.enable_model_cpu_offload()
+        # No compilation because it fails with:
        # RuntimeError: _apply(): Couldn't swap Linear.weight
-        super().test_torch_compile_with_cpu_offload()
+
+        # small resolutions to ensure speedy execution.
+        pipe("a dog", num_inference_steps=2, max_sequence_length=16, height=256, width=256)

    @parameterized.expand([False, True])
    @unittest.skip(
@@ -745,7 +744,7 @@ class TorchAoCompileTest(QuantCompileTests, unittest.TestCase):
 # Slices for these tests have been obtained on our aws-g6e-xlarge-plus runners
@require_torch
@require_torch_accelerator
-@require_torchao_version_greater_or_equal("0.7.0")
+@require_torchao_version_greater_or_equal("0.14.0")
@slow
@nightly
 class SlowTorchAoTests(unittest.TestCase):
@@ -907,7 +906,7 @@ class SlowTorchAoTests(unittest.TestCase):

@require_torch
@require_torch_accelerator
-@require_torchao_version_greater_or_equal("0.7.0")
+@require_torchao_version_greater_or_equal("0.14.0")
@slow
@nightly
 class SlowTorchAoPreserializedModelTests(unittest.TestCase):
Author	SHA1	Message	Date
sayakpaul	1ccb2bd4f6	fix zimage lora conversion to support for more lora.	2026-03-04 16:27:01 +05:30
jiqing-feng	88798242bc	cogvideo example: Distribute VAE video encoding across processes in CogVideoX LoRA training (#13207) * Distribute VAE video encoding across processes in CogVideoX LoRA training Signed-off-by: jiqing-feng <jiqing.feng@intel.com> * Apply style fixes --------- Signed-off-by: jiqing-feng <jiqing.feng@intel.com> Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>	2026-03-04 15:09:01 +05:30
Sayak Paul	4a2833c1c2	[Modular] implement requirements validation for custom blocks (#12196) * feat: implement requirements validation for custom blocks. * up * unify. * up * add tests * Apply suggestions from code review Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * reviewer feedback. * [docs] validation for custom blocks (#13156) validation * move to tmp_path fixture. * propagate to conditional and loopsequential blocks. * up * remove collected tests --------- Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>	2026-03-04 12:19:08 +05:30
YiYi Xu	1fe688a651	[modular] not pass trust_remote_code to external repos (#13204) * add * update warn * add a test * updaqte * update_component with custom model * add more tests * Apply suggestion from @DN6 Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com> * up --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>	2026-03-03 02:36:36 -10:00
YiYi Xu	bbbcdd87bd	[modular]Update model card to include workflow (#13195) * up * up * update * remove test --------- Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal> Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>	2026-03-02 20:50:07 -10:00
Dhruv Nair	47e8faf3b9	Clean up accidental files (#13202) update	2026-03-03 00:35:58 +05:30
David El Malih	c2fdd2d048	docs: improve docstring scheduling_ipndm.py (#13198) Improve docstring scheduling ipndm	2026-03-02 09:42:55 -08:00
Dhruv Nair	84ff061b1d	[Modular] Save Modular Pipeline weights to Hub (#13168) * update * update * update * update * update * update	2026-03-02 22:20:42 +05:30
Dhruv Nair	3fd14f1acf	[AutoModel] Allow registering `auto_map` to model config (#13186) * update * update	2026-03-02 22:13:25 +05:30
Dhruv Nair	e7fe4ce92f	[AutoModel] Fix bug with subfolders and local model paths when loading custom code (#13197) * update * update	2026-03-02 17:44:25 +05:30
Sayak Paul	3d9085565b	remove db utils from benchmarking (#13199)	2026-03-02 16:39:56 +05:30
Sayak Paul	5b54496131	[tests] enable cpu offload test in torchao without compilation. (#12704) enable cpu offload test in torchao without compilation.	2026-03-02 15:03:58 +05:30