mirror of https://github.com/huggingface/diffusers.git synced 2025-12-08 13:34:27 +08:00

Files

Sayak Paul 18fc40c169 [Feat] add tiny Autoencoder for (almost) instant decoding (#4384 )

* add: model implementation of tiny autoencoder.

* add: inits.

* push the latest devs.

* add: conversion script and finish.

* add: scaling factor args.

* debugging

* fix denormalization.

* fix: positional argument.

* handle use_torch_2_0_or_xformers.

* handle post_quant_conv

* handle dtype

* fix: sdxl image processor for tiny ae.

* fix: sdxl image processor for tiny ae.

* unify upcasting logic.

* copied from madness.

* remove trailing whitespace.

* set is_tiny_vae = False

* address PR comments.

* change to AutoencoderTiny

* make act_fn an str throughout

* fix: apply_forward_hook decorator call

* get rid of the special is_tiny_vae flag.

* directly scale the output.

* fix dummies?

* fix: act_fn.

* get rid of the Clamp() layer.

* bring back copied from.

* movement of the blocks to appropriate modules.

* add: docstrings to AutoencoderTiny

* add: documentation.

* changes to the conversion script.

* add doc entry.

* settle tests.

* style

* add one slow test.

* fix

* fix 2

* fix 2

* fix: 4

* fix: 5

* finish integration tests

* Apply suggestions from code review

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* style

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

2023-08-02 23:58:05 +05:30

1.4 KiB

Raw Blame History

Tiny AutoEncoder

Tiny AutoEncoder for Stable Diffusion (TAESD) was introduced in madebyollin/taesd by Ollin Boer Bohan. It is a tiny distilled version of Stable Diffusion's VAE that can quickly decode the latents in a [StableDiffusionPipeline] or [StableDiffusionXLPipeline] almost instantly.

To use with Stable Diffusion v-2.1:

import torch
from diffusers import DiffusionPipeline, AutoencoderTiny

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-2-1-base", torch_dtype=torch.float16
)
pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesd", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "slice of delicious New York-style berry cheesecake"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("cheesecake.png")

To use with Stable Diffusion XL 1.0

import torch
from diffusers import DiffusionPipeline, AutoencoderTiny

pipe = DiffusionPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0", torch_dtype=torch.float16
)
pipe.vae = AutoencoderTiny.from_pretrained("madebyollin/taesdxl", torch_dtype=torch.float16)
pipe = pipe.to("cuda")

prompt = "slice of delicious New York-style berry cheesecake"
image = pipe(prompt, num_inference_steps=25).images[0]
image.save("cheesecake_sdxl.png")

AutoencoderTiny

autodoc AutoencoderTiny

AutoencoderTinyOutput

autodoc models.autoencoder_tiny.AutoencoderTinyOutput

1.4 KiB Raw Blame History

Tiny AutoEncoder

AutoencoderTiny

AutoencoderTinyOutput

1.4 KiB

Raw Blame History