# Control image brightness

The Stable Diffusion pipeline is mediocre at generating images that are either very bright or dark, as explained in the [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://huggingface.co/papers/2305.08891) paper. The solutions proposed in the paper are currently implemented in the [`DDIMScheduler`], which you can use to improve the lighting in your images.

💡 Take a look at the paper linked above for more details about the proposed solutions!

One of the solutions is to train a model with *v_prediction* and *v_loss*. Add the following flag to the `train_text_to_image.py` or `train_text_to_image_lora.py` scripts to enable `v_prediction`:

```bash
--prediction_type="v_prediction"
```
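
For reference, a full training invocation with the flag in place might look like the sketch below. The model name, dataset, and output directory here are placeholders for illustration, not values from this guide:

```bash
accelerate launch train_text_to_image.py \
  --pretrained_model_name_or_path="runwayml/stable-diffusion-v1-5" \
  --dataset_name="lambdalabs/naruto-blip-captions" \
  --prediction_type="v_prediction" \
  --output_dir="sd-v-prediction"
```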

For example, let's use the [`ptx0/pseudo-journey-v2`](https://huggingface.co/ptx0/pseudo-journey-v2) checkpoint, which has been finetuned with `v_prediction`.

Next, configure the following parameters in the [`DDIMScheduler`]:

1. `rescale_betas_zero_snr=True` rescales the noise schedule to zero terminal signal-to-noise ratio (SNR)
2. `timestep_spacing="trailing"` starts sampling from the last timestep

```py
from diffusers import DiffusionPipeline, DDIMScheduler

pipeline = DiffusionPipeline.from_pretrained("ptx0/pseudo-journey-v2", use_safetensors=True)

# switch the scheduler in the pipeline to use the DDIMScheduler with the
# zero terminal SNR and trailing timestep settings from the paper
pipeline.scheduler = DDIMScheduler.from_config(
    pipeline.scheduler.config, rescale_betas_zero_snr=True, timestep_spacing="trailing"
)
pipeline.to("cuda")
```
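
If you want to see the effect of `timestep_spacing="trailing"`, you can inspect the scheduler's timesteps directly. This is just an illustrative check, not a required step:

```py
# with "trailing" spacing, the schedule begins at the final training timestep
# (999 for Stable Diffusion), so sampling starts from pure noise instead of
# skipping the last, noisiest step
pipeline.scheduler.set_timesteps(num_inference_steps=5)
print(pipeline.scheduler.timesteps)
```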

Finally, in your call to the pipeline, set `guidance_rescale` to prevent overexposure:

```py
prompt = "A lion in galaxies, spirals, nebulae, stars, smoke, iridescent, intricate detail, octane render, 8k"
image = pipeline(prompt, guidance_rescale=0.7).images[0]
image
```
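
Under the hood, `guidance_rescale` blends the classifier-free guidance output back toward the standard deviation of the text-conditioned prediction, which is the rescaling proposed in the paper. Below is a minimal standalone sketch of that step for illustration; the diffusers pipelines apply an equivalent rescaling internally whenever `guidance_rescale > 0`:

```py
import torch

def rescale_noise_cfg(noise_cfg, noise_pred_text, guidance_rescale=0.0):
    # compute per-sample standard deviations of the text-conditioned
    # prediction and the guided (classifier-free guidance) prediction
    std_text = noise_pred_text.std(dim=list(range(1, noise_pred_text.ndim)), keepdim=True)
    std_cfg = noise_cfg.std(dim=list(range(1, noise_cfg.ndim)), keepdim=True)
    # rescale the guided prediction to match the text prediction's scale,
    # then interpolate with the original to avoid "plain looking" images
    rescaled = noise_cfg * (std_text / std_cfg)
    return guidance_rescale * rescaled + (1 - guidance_rescale) * noise_cfg
```

A value of `0.7` keeps most of the rescaled prediction while retaining some of the original guidance output, which is the trade-off the paper recommends.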