mirror of https://github.com/huggingface/diffusers.git synced 2025-12-16 01:14:47 +08:00

Aryan 3f329a426a [core] Mochi T2V (#9769)

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>

2024-11-05 20:33:41 +05:30


MochiTransformer3DModel

A Diffusion Transformer model for 3D video-like data was introduced in Mochi-1 Preview by Genmo.

The model can be loaded with the following code snippet.

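import torch
from diffusers import MochiTransformer3DModel

# Load only the transformer weights from the "transformer" subfolder of the checkpoint.
transformer = MochiTransformer3DModel.from_pretrained("genmo/mochi-1-preview", subfolder="transformer", torch_dtype=torch.float16).to("cuda")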
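The snippet above hard-codes `"cuda"` and `torch.float16`. On machines without a GPU, or where a different half-precision type is preferred, the device and dtype can be chosen at run time instead. The following is a minimal sketch under stated assumptions, not part of the library: `select_load_options` and `load_transformer` are hypothetical helper names, and preferring `bfloat16` when the GPU supports it is an assumption rather than a documented recommendation for this checkpoint.

```python
def select_load_options(cuda_available: bool, bf16_supported: bool) -> dict:
    """Hypothetical helper: pick a device name and a torch dtype name.

    Assumption: fall back to float32 on CPU, and prefer bfloat16 over
    float16 on GPUs that support it.
    """
    if not cuda_available:
        return {"device": "cpu", "dtype": "float32"}
    return {"device": "cuda", "dtype": "bfloat16" if bf16_supported else "float16"}


def load_transformer(repo_id: str = "genmo/mochi-1-preview"):
    """Hypothetical wrapper around the loading call shown in this page."""
    # Imported lazily so the selection logic above can be used without torch installed.
    import torch
    from diffusers import MochiTransformer3DModel

    cuda = torch.cuda.is_available()
    opts = select_load_options(cuda, cuda and torch.cuda.is_bf16_supported())
    model = MochiTransformer3DModel.from_pretrained(
        repo_id,
        subfolder="transformer",
        torch_dtype=getattr(torch, opts["dtype"]),  # e.g. torch.bfloat16
    )
    return model.to(opts["device"])
```

Calling `load_transformer()` downloads the checkpoint, so it is left uncalled here; the selection logic itself can be checked in isolation.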

MochiTransformer3DModel

autodoc MochiTransformer3DModel

Transformer2DModelOutput

autodoc models.modeling_outputs.Transformer2DModelOutput
