Merge branch 'main' into tests-load-components

Add PRXPipeline in AUTO_TEXT2IMAGE_PIPELINES_MAPPING (#13257)
Update Documentation for NVIDIA Cosmos (#13251)
2026-03-16 13:37:55 +08:00 · 2026-03-12 20:57:38 +05:30 · 2026-03-11 14:39:24 -03:00 · 2026-03-11 09:14:56 -07:00 · 2026-03-10 17:55:08 +05:30 · 2026-03-10 17:49:51 +05:30
7 changed files with 86 additions and 50 deletions
--- a/docs/source/en/_toctree.yml
+++ b/docs/source/en/_toctree.yml
@@ -532,8 +532,6 @@
        title: ControlNet-XS with Stable Diffusion XL
      - local: api/pipelines/controlnet_union
        title: ControlNetUnion
-      - local: api/pipelines/cosmos
-        title: Cosmos
      - local: api/pipelines/ddim
        title: DDIM
      - local: api/pipelines/ddpm
@@ -677,6 +675,8 @@
        title: CogVideoX
      - local: api/pipelines/consisid
        title: ConsisID
+      - local: api/pipelines/cosmos
+        title: Cosmos
      - local: api/pipelines/framepack
        title: Framepack
      - local: api/pipelines/helios
--- a/docs/source/en/api/pipelines/cosmos.md
+++ b/docs/source/en/api/pipelines/cosmos.md
@@ -21,29 +21,31 @@
 > [!TIP]
 > Make sure to check out the Schedulers [guide](../../using-diffusers/schedulers) to learn how to explore the tradeoff between scheduler speed and quality, and see the [reuse components across pipelines](../../using-diffusers/loading#reuse-a-pipeline) section to learn how to efficiently load the same components into multiple pipelines.

-## Loading original format checkpoints
-
-Original format checkpoints that have not been converted to diffusers-expected format can be loaded using the `from_single_file` method.
+## Basic usage

 ```python
 import torch
-from diffusers import Cosmos2TextToImagePipeline, CosmosTransformer3DModel
+from diffusers import Cosmos2_5_PredictBasePipeline
+from diffusers.utils import export_to_video

-model_id = "nvidia/Cosmos-Predict2-2B-Text2Image"
-transformer = CosmosTransformer3DModel.from_single_file(
-    "https://huggingface.co/nvidia/Cosmos-Predict2-2B-Text2Image/blob/main/model.pt",
-    torch_dtype=torch.bfloat16,
-).to("cuda")
-pipe = Cosmos2TextToImagePipeline.from_pretrained(model_id, transformer=transformer, torch_dtype=torch.bfloat16)
+model_id = "nvidia/Cosmos-Predict2.5-2B"
+pipe = Cosmos2_5_PredictBasePipeline.from_pretrained(
+    model_id, revision="diffusers/base/post-trained", torch_dtype=torch.bfloat16
+)
 pipe.to("cuda")

-prompt = "A close-up shot captures a vibrant yellow scrubber vigorously working on a grimy plate, its bristles moving in circular motions to lift stubborn grease and food residue. The dish, once covered in remnants of a hearty meal, gradually reveals its original glossy surface. Suds form and bubble around the scrubber, creating a satisfying visual of cleanliness in progress. The sound of scrubbing fills the air, accompanied by the gentle clinking of the dish against the sink. As the scrubber continues its task, the dish transforms, gleaming under the bright kitchen lights, symbolizing the triumph of cleanliness over mess."
+prompt = "As the red light shifts to green, the red bus at the intersection begins to move forward, its headlights cutting through the falling snow. The snowy tire tracks deepen as the vehicle inches ahead, casting fresh lines onto the slushy road. Around it, streetlights glow warmer, illuminating the drifting flakes and wet reflections on the asphalt. Other cars behind start to edge forward, their beams joining the scene. The stillness of the urban street transitions into motion as the quiet snowfall is punctuated by the slow advance of traffic through the frosty city corridor."
 negative_prompt = "The video captures a series of frames showing ugly scenes, static with no motion, motion blur, over-saturation, shaky footage, low resolution, grainy texture, pixelated images, poorly lit areas, underexposed and overexposed scenes, poor color balance, washed out colors, choppy sequences, jerky movements, low frame rate, artifacting, color banding, unnatural transitions, outdated special effects, fake elements, unconvincing visuals, poorly edited content, jump cuts, visual noise, and flickering. Overall, the video is of poor quality."

 output = pipe(
-    prompt=prompt, negative_prompt=negative_prompt, generator=torch.Generator().manual_seed(1)
-).images[0]
-output.save("output.png")
+    image=None,
+    video=None,
+    prompt=prompt,
+    negative_prompt=negative_prompt,
+    num_frames=93,
+    generator=torch.Generator().manual_seed(1),
+).frames[0]
+export_to_video(output, "text2world.mp4", fps=16)
 ```

 ## Cosmos2_5_TransferPipeline
--- a/docs/source/en/api/pipelines/overview.md
+++ b/docs/source/en/api/pipelines/overview.md
@@ -44,6 +44,7 @@ The table below lists all the pipelines currently available in 🤗 Diffusers an
 | [ControlNet with Stable Diffusion XL](controlnet_sdxl) | text2image |
 | [ControlNet-XS](controlnetxs) | text2image |
 | [ControlNet-XS with Stable Diffusion XL](controlnetxs_sdxl) | text2image |
+| [Cosmos](cosmos) | text2video, video2video |
 | [Dance Diffusion](dance_diffusion) | unconditional audio generation |
 | [DDIM](ddim) | unconditional image generation |
 | [DDPM](ddpm) | unconditional image generation |
--- a/src/diffusers/modular_pipelines/modular_pipeline.py
+++ b/src/diffusers/modular_pipelines/modular_pipeline.py
@@ -14,7 +14,6 @@
 import importlib
 import inspect
 import os
-import shutil
 import sys
 import traceback
 import warnings
@@ -1884,36 +1883,6 @@ class ModularPipeline(ConfigMixin, PushToHubMixin):
        )
        return pipeline

-    def _maybe_save_custom_code(self, save_directory: str | os.PathLike):
-        """Save custom code files (blocks config and Python modules) to the save directory."""
-        if self._blocks is None:
-            return
-
-        blocks_module = type(self._blocks).__module__
-        is_custom_code = not blocks_module.startswith("diffusers.") and blocks_module != "diffusers"
-        if not is_custom_code:
-            return
-
-        os.makedirs(save_directory, exist_ok=True)
-
-        self._blocks.save_pretrained(save_directory)
-
-        source_file = inspect.getfile(type(self._blocks))
-        module_file = os.path.basename(source_file)
-        dest_file = os.path.join(save_directory, module_file)
-
-        if os.path.abspath(source_file) != os.path.abspath(dest_file):
-            shutil.copyfile(source_file, dest_file)
-
-        from ..utils.dynamic_modules_utils import get_relative_import_files
-
-        for rel_file in get_relative_import_files(source_file):
-            rel_name = os.path.relpath(rel_file, os.path.dirname(source_file))
-            rel_dest = os.path.join(save_directory, rel_name)
-            if os.path.abspath(rel_file) != os.path.abspath(rel_dest):
-                os.makedirs(os.path.dirname(rel_dest), exist_ok=True)
-                shutil.copyfile(rel_file, rel_dest)
-
    def save_pretrained(
        self,
        save_directory: str | os.PathLike,
@@ -2029,8 +1998,6 @@ class ModularPipeline(ConfigMixin, PushToHubMixin):
                component_spec_dict["subfolder"] = component_name
                self.register_to_config(**{component_name: (library, class_name, component_spec_dict)})

-        self._maybe_save_custom_code(save_directory)
-
        self.save_config(save_directory=save_directory)

        if push_to_hub:
--- a/src/diffusers/pipelines/auto_pipeline.py
+++ b/src/diffusers/pipelines/auto_pipeline.py
@@ -95,6 +95,7 @@ from .pag import (
    StableDiffusionXLPAGPipeline,
 )
 from .pixart_alpha import PixArtAlphaPipeline, PixArtSigmaPipeline
+from .prx import PRXPipeline
 from .qwenimage import (
    QwenImageControlNetPipeline,
    QwenImageEditInpaintPipeline,
@@ -185,6 +186,7 @@ AUTO_TEXT2IMAGE_PIPELINES_MAPPING = OrderedDict(
        ("z-image-controlnet-inpaint", ZImageControlNetInpaintPipeline),
        ("z-image-omni", ZImageOmniPipeline),
        ("ovis", OvisImagePipeline),
+        ("prx", PRXPipeline),
    ]
 )

--- a/src/diffusers/pipelines/cosmos/pipeline_cosmos2_5_transfer.py
+++ b/src/diffusers/pipelines/cosmos/pipeline_cosmos2_5_transfer.py
@@ -82,13 +82,16 @@ EXAMPLE_DOC_STRING = """
        ```python
        >>> import cv2
        >>> import numpy as np
+        >>> from PIL import Image
        >>> import torch
        >>> from diffusers import Cosmos2_5_TransferPipeline, AutoModel
        >>> from diffusers.utils import export_to_video, load_video

        >>> model_id = "nvidia/Cosmos-Transfer2.5-2B"
        >>> # Load a Transfer2.5 controlnet variant (edge, depth, seg, or blur)
-        >>> controlnet = AutoModel.from_pretrained(model_id, revision="diffusers/controlnet/general/edge")
+        >>> controlnet = AutoModel.from_pretrained(
+        ...     model_id, revision="diffusers/controlnet/general/edge", torch_dtype=torch.bfloat16
+        ... )
        >>> pipe = Cosmos2_5_TransferPipeline.from_pretrained(
        ...     model_id, controlnet=controlnet, revision="diffusers/general", torch_dtype=torch.bfloat16
        ... )
--- a/tests/modular_pipelines/test_modular_pipelines_common.py
+++ b/tests/modular_pipelines/test_modular_pipelines_common.py
@@ -5,6 +5,7 @@ from typing import Callable

 import pytest
 import torch
+from huggingface_hub import hf_hub_download

 import diffusers
 from diffusers import AutoModel, ComponentsManager, ModularPipeline, ModularPipelineBlocks
@@ -32,6 +33,33 @@ from ..testing_utils import (
 )


+def _get_specified_components(path_or_repo_id, cache_dir=None):
+    if os.path.isdir(path_or_repo_id):
+        config_path = os.path.join(path_or_repo_id, "modular_model_index.json")
+    else:
+        try:
+            config_path = hf_hub_download(
+                repo_id=path_or_repo_id,
+                filename="modular_model_index.json",
+                local_dir=cache_dir,
+            )
+        except Exception:
+            return None
+
+    with open(config_path) as f:
+        config = json.load(f)
+
+    components = set()
+    for k, v in config.items():
+        if isinstance(v, (str, int, float, bool)):
+            continue
+        for entry in v:
+            if isinstance(entry, dict) and (entry.get("repo") or entry.get("pretrained_model_name_or_path")):
+                components.add(k)
+                break
+    return components
+
+
 class ModularPipelineTesterMixin:
    """
    It provides a set of common tests for each modular pipeline,
@@ -360,6 +388,39 @@ class ModularPipelineTesterMixin:

        assert torch.abs(image_slices[0] - image_slices[1]).max() < 1e-3

+    def test_load_expected_components_from_pretrained(self, tmp_path):
+        pipe = self.get_pipeline()
+        expected = _get_specified_components(self.pretrained_model_name_or_path, cache_dir=tmp_path)
+        if not expected:
+            pytest.skip("Skipping test as we couldn't fetch the expected components.")
+
+        actual = {
+            name
+            for name in pipe.components
+            if getattr(pipe, name, None) is not None
+            and getattr(getattr(pipe, name), "_diffusers_load_id", None) not in (None, "null")
+        }
+        assert expected == actual, f"Component mismatch: missing={expected - actual}, unexpected={actual - expected}"
+
+    def test_load_expected_components_from_save_pretrained(self, tmp_path):
+        pipe = self.get_pipeline()
+        save_dir = str(tmp_path / "saved-pipeline")
+        pipe.save_pretrained(save_dir)
+
+        expected = _get_specified_components(save_dir)
+        loaded_pipe = ModularPipeline.from_pretrained(save_dir)
+        loaded_pipe.load_components(torch_dtype=torch.float32)
+
+        actual = {
+            name
+            for name in loaded_pipe.components
+            if getattr(loaded_pipe, name, None) is not None
+            and getattr(getattr(loaded_pipe, name), "_diffusers_load_id", None) not in (None, "null")
+        }
+        assert expected == actual, (
+            f"Component mismatch after save/load: missing={expected - actual}, unexpected={actual - expected}"
+        )
+
    def test_modular_index_consistency(self, tmp_path):
        pipe = self.get_pipeline()
        components_spec = pipe._component_specs
Author	SHA1	Message	Date
Sayak Paul	faca9b90a7	Merge branch 'main' into tests-load-components	2026-03-12 20:57:38 +05:30
Alvaro Bartolome	81c354d879	Add `PRXPipeline` in `AUTO_TEXT2IMAGE_PIPELINES_MAPPING` (#13257)	2026-03-11 14:39:24 -03:00
Miguel Martin	0a2c26d0a4	Update Documentation for NVIDIA Cosmos (#13251) * fix docs * update main example	2026-03-11 09:14:56 -07:00
sayakpaul	a1f63a398c	up	2026-03-10 17:55:08 +05:30
sayakpaul	bf846f722c	u[	2026-03-10 17:49:51 +05:30
sayakpaul	78a86e85cf	fix	2026-03-10 17:46:55 +05:30
sayakpaul	7673ab1757	fix	2026-03-10 16:50:27 +05:30
sayakpaul	b7648557d4	test load_components.	2026-03-10 16:09:02 +05:30
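
The two new tests in `tests/modular_pipelines/test_modular_pipelines_common.py` compare the components named in `modular_model_index.json` against the components that actually carry a load ID. The sketch below reproduces that comparison on in-memory data so the logic can be followed without downloading a checkpoint. It is an illustration only: `SAMPLE_INDEX`, `FakeComponent`, the load-ID strings, and the dict-based `loaded_component_names` are hypothetical stand-ins, and the real tests read attributes from a `ModularPipeline` instance instead.

```python
import json

# Hypothetical index contents shaped like the entries the helper iterates over.
# Names, versions, and repo IDs are made up for illustration.
SAMPLE_INDEX = json.loads(
    """
    {
      "_class_name": "ExamplePipeline",
      "_diffusers_version": "0.0.0",
      "unet": ["diffusers", "UNet2DConditionModel", {"repo": "example/repo", "subfolder": "unet"}],
      "vae": ["diffusers", "AutoencoderKL", {"pretrained_model_name_or_path": "example/repo"}],
      "scheduler": ["diffusers", "EulerDiscreteScheduler", {"repo": null}],
      "guider": [null, null, {"repo": ""}]
    }
    """
)


def specified_components(config):
    """Return the names of entries that point at a non-empty repo or path."""
    components = set()
    for name, value in config.items():
        # Scalar entries are plain config values, not component specs.
        if isinstance(value, (str, int, float, bool)):
            continue
        for entry in value:
            if isinstance(entry, dict) and (
                entry.get("repo") or entry.get("pretrained_model_name_or_path")
            ):
                components.add(name)
                break
    return components


class FakeComponent:
    """Hypothetical stand-in for a loaded model that records a load ID."""

    def __init__(self, load_id):
        self._diffusers_load_id = load_id


def loaded_component_names(components):
    """Keep components whose load ID is set and is not the "null" sentinel."""
    return {
        name
        for name, component in components.items()
        if component is not None
        and getattr(component, "_diffusers_load_id", None) not in (None, "null")
    }


expected = specified_components(SAMPLE_INDEX)
actual = loaded_component_names(
    {
        "unet": FakeComponent("example/repo|unet"),
        "vae": FakeComponent("example/repo|vae"),
        "scheduler": FakeComponent("null"),
        "guider": None,
    }
)
print(f"missing={expected - actual}, unexpected={actual - expected}")
```

Entries whose spec has an empty or null `repo` are left out of the expected set, which is why a component created from config alone should not be reported as missing.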