mirror of https://github.com/huggingface/diffusers.git synced 2025-12-08 21:44:27 +08:00


Aryan 0d1d267b12 [core] Allegro T2V (#9736)

* update

* refactor transformer part 1

* refactor part 2

* refactor part 3

* make style

* refactor part 4; modeling tests

* make style

* refactor part 5

* refactor part 6

* gradient checkpointing

* pipeline tests (broken atm)

* update

* add coauthor

Co-Authored-By: Huan Yang <hyang@fastmail.com>

* refactor part 7

* add docs

* make style

* add coauthor

Co-Authored-By: YiYi Xu <yixu310@gmail.com>

* make fix-copies

* undo unrelated change

* revert changes to embeddings, normalization, transformer

* refactor part 8

* make style

* refactor part 9

* make style

* fix

* apply suggestions from review

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* update example

* remove attention mask for self-attention

* update

* copied from

* update

* update

---------

Co-authored-by: Huan Yang <hyang@fastmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

2024-10-29 13:14:36 +05:30


AutoencoderKLAllegro

The 3D variational autoencoder (VAE) model with KL loss used in Allegro was introduced in "Allegro: Open the Black Box of Commercial-Level Video Generation Model" by RhymesAI.

The model can be loaded with the following code snippet.

import torch
from diffusers import AutoencoderKLAllegro

# Load only the VAE weights from the "vae" subfolder of the Allegro checkpoint.
vae = AutoencoderKLAllegro.from_pretrained("rhymes-ai/Allegro", subfolder="vae", torch_dtype=torch.float32).to("cuda")
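The "KL loss" in the model's name refers to the Kullback-Leibler regularization term commonly used when training VAEs. As a rough illustration only (this is not Allegro's training code, and the helper names `gaussian_kl` and `sample_latent` are hypothetical), the sketch below computes the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior, and draws a latent sample with the reparameterization trick. It uses plain Python lists rather than tensors to stay self-contained.

```python
import math
import random


def gaussian_kl(mean, logvar):
    """KL( N(mean, exp(logvar)) || N(0, 1) ), summed over all dimensions.

    Closed form per dimension: 0.5 * (mean^2 + var - 1 - log(var)).
    """
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mean, logvar))


def sample_latent(mean, logvar, rng):
    """Reparameterized sample: z = mean + std * eps, with eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mean, logvar)]


# Hypothetical posterior parameters for a 3-dimensional latent.
mean = [0.5, -1.0, 0.0]
logvar = [0.0, 0.0, 0.0]  # unit variance in every dimension

kl = gaussian_kl(mean, logvar)
z = sample_latent(mean, logvar, random.Random(0))
print(f"KL term: {kl:.4f}, latent sample has {len(z)} dimensions")
```

The KL term is zero exactly when the posterior matches the prior (zero mean, unit variance) and grows as the posterior drifts away from it, which is what keeps the latent space well behaved for the downstream diffusion model.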

AutoencoderKLAllegro

autodoc AutoencoderKLAllegro
- decode
- encode
- all

AutoencoderKLOutput

autodoc models.autoencoders.autoencoder_kl.AutoencoderKLOutput

DecoderOutput

autodoc models.autoencoders.vae.DecoderOutput
