Commit Graph

4913 Commits

Author SHA1 Message Date
Sayak Paul
8ade2c94d0 [Hunyuan DiT] feat: enable fusing qkv projections when doing attention (#8396)
* feat: introduce qkv fusion for Hunyuan

* fix copies
2024-12-23 13:02:12 +05:30
leaps
942bc38be9 Update code example in pipeline_stable_unclip_img2img.py EXAMPLE_DOC_STRING (#8401)
Update code example in pipeline_stable_unclip_img2img.py

Previous code caused an error when run
2024-12-23 13:02:12 +05:30
Sayak Paul
4f77f52853 [Transformer2DModel] Handle norm_type safely while remapping (#8370)
* handle norm_type of transformer2d_model safely.

* log an info when old model class is being returned.

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* remove extra stuff

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-12-23 13:02:12 +05:30
Sayak Paul
9b1c118692 [HunyuanDiT] minor docs changes in hunyuandit (#8395)
minor docs changes in hunyuandit
2024-12-23 13:02:12 +05:30
townwish4git
c0a81199d3 Fix AsymmetricAutoencoderKL forward (#8378) 2024-12-23 13:02:12 +05:30
Marçal Comajoan Cara
26e7b4bd9a Update transformer2d.md title (#8375)
* Update transformer2d.md title

For the other classes (e.g., UNet2DModel) the title of the documentation coincides with the name of the class, but that was not the case for Transformer2DModel.

* Update model docs titles for consistency with class names
2024-12-23 13:02:12 +05:30
Dhruv Nair
92b915aca7 Update slow test actions (#8381)
* update

* update

* update

* update
2024-12-23 13:02:12 +05:30
XCL
d6c7e17867 Tencent Hunyuan Team - Updated Doc for HunyuanDiT (#8383)
* add hunyuandit doc

* update hunyuandit doc

* update hunyuandit 2d model

* update toctree.yml for hunyuandit
2024-12-23 13:02:12 +05:30
XCL
43510582cd Tencent Hunyuan Team: add HunyuanDiT related updates (#8240)
* Hunyuan Team: add HunyuanDiT related updates


---------

Co-authored-by: XCLiu <liuxc1996@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail.com>
2024-12-23 13:02:12 +05:30
39th president of the United States, probably
7f58fad44d Fix DREAM training (#8302)
Co-authored-by: Jimmy <39@🇺🇸.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-12-23 13:02:12 +05:30
Anton Obukhov
a495ed3e8b Fix marigold documentation (#8372)
* rename prs-eth/marigold-lcm-v1-0 into prs-eth/marigold-depth-lcm-v1-0

* update image paths in https://huggingface.co/datasets/huggingface/documentation-images to use main branch

* fix relative paths to other diffusers pages

* Update docs/source/en/using-diffusers/marigold_usage.md

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

---------

Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2024-12-23 13:02:12 +05:30
Sayak Paul
b9e55fcbf7 [Core] Introduce class variants for Transformer2DModel (#7647)
* init for patches

* finish patched model.

* continuous transformer

* vectorized transformer2d.

* style.

* inits.

* fix-copies.

* introduce DiTTransformer2DModel.

* fixes

* use REMAPPING as suggested by @DN6

* better logging.

* add pixart transformer model.

* inits.

* caption_channels.

* attention masking.

* fix use_additional_conditions.

* remove print.

* debug

* flatten

* fix: assertion for sigma

* handle remapping for modeling_utils

* add tests for dit transformer2d

* quality

* placeholder for pixart tests

* pixart tests

* add _no_split_modules

* add docs.

* check

* check

* check

* check

* fix tests

* fix tests

* move Transformer output to modeling_output

* move errors better and bring back use_additional_conditions attribute.

* add unnecessary things from DiT.

* clean up pixart

* fix remapping

* fix device_map things in pixart2d.

* replace Transformer2DModel with appropriate classes in dit, pixart tests

* empty

* legacy mixin classes./

* use a remapping dict for fetching class names.

* change to specifc model types in the pipeline implementations.

* move _fetch_remapped_cls_from_config to modeling_loading_utils.py

* fix dependency problems.

* add deprecation note.
2024-12-23 13:02:12 +05:30
Dhruv Nair
f6553106f7 Change checkpoint key used to identify CLIP models in single file checkpoints (#8319)
update
2024-12-23 13:02:12 +05:30
Jonah
6f0153747c Fix depth pipeline "input/weight type should be the same" error at fp16 (#8321)
Fix "input/weight type should be the same"

Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-12-23 13:02:12 +05:30
satani99
760e5993de Modularize train_text_to_image_lora_sdxl inferencing during and after training in example (#8335)
* Modularized the train_lora_sdxl file

* Modularized the train_lora_sdxl file

* Modularized the train_lora_sdxl file

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:12 +05:30
Genius Patrick
a1cfb0accc fix(training): lr scheduler doesn't work properly in distributed scenarios (#8312) 2024-12-23 13:02:12 +05:30
Dhruv Nair
4be03cfa82 Fix StableDiffusionPipeline when text_encoder=None (#8297)
* update

* update

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:12 +05:30
Tolga Cangöz
b2abd9e371 Fix Copying Mechanism typo/bug (#8232)
* Fix copying mechanism typos

* fix copying mecha

* Revert, since they are in TODO

* Fix copying mechanism
2024-12-23 13:02:12 +05:30
Steven Liu
c62a927ba1 [docs] Files and formats (#7874)
* files and formats

* fix callout

* feedback

* code sample

* feedback
2024-12-23 13:02:12 +05:30
Steven Liu
8ccc3f225c [docs] DeepFloyd training (#8224)
deepfloyd training

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:12 +05:30
Tolga Cangöz
64e3d998b3 Simplify platform_info assignment in diffusers-cli env (#8298)
chore: Simplify `platform_info` assignment
2024-12-23 13:02:12 +05:30
satani99
630a71361f Modularize train_text_to_image_lora SD inferencing during and after training in example (#8283)
* Modularized the train_lora file

* Modularized the train_lora file

* Modularized the train_lora file

* Modularized the train_lora file

* Modularized the train_lora file

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:12 +05:30
Sayak Paul
e135b4dadf post release v0.28.0 (#8286)
* post release v0.28.0

* style
2024-12-23 13:02:12 +05:30
Sayak Paul
93c3abef71 [Core] Refactor IPAdapterPlusImageProjection a bit (#7994)
* use IPAdapterPlusImageProjectionBlock in IPAdapterPlusImageProjection

* reposition IPAdapterPlusImageProjection

* refactor complete?

* fix heads param retrieval.

* update test dict creation method.
2024-12-23 13:02:12 +05:30
Sayak Paul
3a932ec0bd move vqmodel to models.autoencoders. (#8292)
move vqmodel to models.autoencoders.
2024-12-23 13:02:12 +05:30
Sayak Paul
84f7039fb3 [Post release 0.28.0] remove deprecated blocks. (#8291)
* remove deprecated blocks.

* update the location paths.
2024-12-23 13:02:12 +05:30
Vladimir Mandic
57843c9fac fix pixart-sigma negative prompt handling (#8299)
* fix negative prompt

* fix

---------

Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: YiYi Xu <yixu310@gmail.com>
2024-12-23 13:02:12 +05:30
Steven Liu
e73f80f05a [docs] Outpaint (#7964)
* first draft

* edits
2024-12-23 13:02:12 +05:30
Steven Liu
6e0c2947e7 [docs] Scheduler features (#7990)
* noise schedule

* sigmas and zero snr

* feedback

* feedback
2024-12-23 13:02:12 +05:30
Álvaro Somoza
f59e4b1df9 Fix object has no attribute 'flush' when using without a console (#8271)
fix
2024-12-23 13:02:12 +05:30
Sajad Norouzi
b52b0d247e Add Kohya fix to SD pipeline for high resolution generation (#7633)
add kohya high resolution fix.
2024-12-23 13:02:12 +05:30
Sayak Paul
c512cc9845 change to yiyi's address. (#7981)
* change to yiyi's address.

* update to diffusers@huggingface.co
2024-12-23 13:02:12 +05:30
Sayak Paul
a33f850d5c [LoRA] attempt at fixing onetrainer lora. (#8242)
* attempt at fixing onetrainer lora.

* fix
2024-12-23 13:02:12 +05:30
Jiwook Han
36ce49851d Fix typo in philosophy.md (#8303)
fix typo in philosophy.md
2024-12-23 13:02:12 +05:30
Álvaro Somoza
eedcdafe25 [docs] Add controlnet example to marigold (#8289)
* initial doc

* fix wrong LCM sentence

* implement binary colormap without requiring matplotlib
update section about Marigold for ControlNet
update formatting of marigold_usage.md

* fix indentation

---------

Co-authored-by: anton <anton.obukhov@gmail.com>
2024-12-23 13:02:12 +05:30
Sayak Paul
879ad6d38c install wget. (#8285) 2024-12-23 13:02:12 +05:30
Anton Obukhov
0be111f3d0 [Pipeline] Marigold depth and normals estimation (#7847)
* implement marigold depth and normals pipelines in diffusers core

* remove bibtex

* remove deprecations

* remove save_memory argument

* remove validate_vae

* remove config output

* remove batch_size autodetection

* remove presets logic
move default denoising_steps and processing_resolution into the model config
make default ensemble_size 1

* remove no_grad

* add fp16 to the example usage

* implement is_matplotlib_available
use is_matplotlib_available, is_scipy_available for conditional imports in the marigold depth pipeline

* move colormap, visualize_depth, and visualize_normals into export_utils.py

* make the denoising loop more lucid
fix the outputs to always be 4d tensors or lists of pil images
support a 4d input_image case
attempt to support model_cpu_offload_seq
move check_inputs into a separate function
change default batch_size to 1, remove any logic to make it bigger implicitly

* style

* rename denoising_steps into num_inference_steps

* rename input_image into image

* rename input_latent into latents

* remove decode_image
change decode_prediction to use the AutoencoderKL.decode method

* move clean_latent outside of progress_bar

* refactor marigold-reusable image processing bits into MarigoldImageProcessor class

* clean up the usage example docstring

* make ensemble functions members of the pipelines

* add early checks in check_inputs
rename E into ensemble_size in depth ensembling

* fix vae_scale_factor computation

* better compatibility with torch.compile
better variable naming

* move export_depth_to_png to export_utils

* remove encode_prediction

* improve visualize_depth and visualize_normals to accept multi-dimensional data and lists
remove visualization functions from the pipelines
move exporting depth as 16-bit PNGs functionality from the depth pipeline
update example docstrings

* do not shortcut vae.config variables

* change all asserts to raise ValueError

* rename output_prediction_type to output_type

* better variable names
clean up variable deletion code

* better variable names

* pass desc and leave kwargs into the diffusers progress_bar
implement nested progress bar for images and steps loops

* implement scale_invariant and shift_invariant flags in the ensemble_depth function
add scale_invariant and shift_invariant flags readout from the model config
further refactor ensemble_depth
support ensembling without alignment
add ensemble_depth docstring

* fix generator device placement checks

* move encode_empty_text body into the pipeline call

* minor empty text encoding simplifications

* adjust pipelines' class docstrings to explain the added construction arguments

* improve the scipy failure condition
add comments
improve docstrings
change the default use_full_z_range to True

* make input image values range check configurable in the preprocessor
refactor load_image_canonical in preprocessor to reject unknown types and return the image in the expected 4D format of tensor and on right device
support a list of everything as inputs to the pipeline, change type to PipelineImageInput
implement a check that all input list elements have the same dimensions
improve docstrings of pipeline outputs
remove check_input pipeline argument

* remove forgotten print

* add prediction_type model config

* add uncertainty visualization into export utils
fix NaN values in normals uncertainties

* change default of output_uncertainty to False
better handle the case of an attempt to export or visualize none

* fix `output_uncertainty=False`

* remove kwargs
fix check_inputs according to the new inputs of the pipeline

* rename prepare_latent into prepare_latents as in other pipelines
annotate prepare_latents in normals pipeline with "Copied from"
annotate encode_image in normals pipeline with "Copied from"

* move nested-capable `progress_bar` method into the pipelines
revert the original `progress_bar` method in pipeline_utils

* minor message improvement

* fix cpu offloading

* move colormap, visualize_depth, export_depth_to_16bit_png, visualize_normals, visualize_uncertainty to marigold_image_processing.py
update example docstrings

* fix missing comma

* change torch.FloatTensor to torch.Tensor

* fix importing of MarigoldImageProcessor

* fix vae offloading
fix batched image encoding
remove separate encode_image function and use vae.encode instead

* implement marigold's intial tests
relax generator checks in line with other pipelines
implement return_dict __call__ argument in line with other pipelines

* fix num_images computation

* remove MarigoldImageProcessor and outputs from import structure
update tests

* update docstrings

* update init

* update

* style

* fix

* fix

* up

* up

* up

* add simple test

* up

* update expected np input/output to be channel last

* move expand_tensor_or_array into the MarigoldImageProcessor

* rewrite tests to follow conventions - hardcoded slices instead of image artifacts
write more smoke tests

* add basic docs.

* add anton's contribution statement

* remove todos.

* fix assertion values for marigold depth slow tests

* fix assertion values for depth normals.

* remove print

* support AutoencoderTiny in the pipelines

* update documentation page
add Available Pipelines section
add Available Checkpoints section
add warning about num_inference_steps

* fix missing import in docstring
fix wrong value in visualize_depth docstring

* [doc] add marigold to pipelines overview

* [doc] add section "usage examples"

* fix an issue with latents check in the pipelines

* add "Frame-by-frame Video Processing with Consistency" section

* grammarly

* replace tables with images with css-styled images (blindly)

* style

* print

* fix the assertions.

* take from the github runner.

* take the slices from action artifacts

* style.

* update with the slices from the runner.

* remove unnecessary code blocks.

* Revert "[doc] add marigold to pipelines overview"

This reverts commit a505165150afd8dab23c474d1a054ea505a56a5f.

* remove invitation for new modalities

* split out marigold usage examples

* doc cleanup

---------

Co-authored-by: yiyixuxu <yixu310@gmail.com>
Co-authored-by: yiyixuxu <yixu310@gmail,com>
Co-authored-by: sayakpaul <spsayakpaul@gmail.com>
2024-12-23 13:02:12 +05:30
Dhruv Nair
bc984f82ee Add zip package to doc builder image (#8284)
update
2024-12-23 13:02:12 +05:30
Sayak Paul
1be1282cbc [Workflows] add a more secure way to run tests from a PR. (#7969)
* add a more secure way to run tests from a PR.

* make pytest more secure.

* address dhruv's comments.

* improve validation check.

* Update .github/workflows/run_tests_from_a_pr.yml

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2024-12-23 13:02:12 +05:30
Dhaivat Bhatt
d1ddc5eb78 Add details about 1-stage implementation in I2VGen-XL docs (#8282)
* Add details about 1-stage implementation

* Add details about 1-stage implementation
2024-12-23 13:02:12 +05:30
Tolga Cangöz
d8d7a0e307 Fix CPU Offloading Usage & Typos (#8230)
* Fix typos

* Fix `pipe.enable_model_cpu_offload()` usage

* Fix cpu offloading

* Update numbers
2024-12-23 13:02:12 +05:30
Tolga Cangöz
c69aff7798 Fix a grammatical error in the raise messages (#8272)
Fix grammatical error
2024-12-23 13:02:12 +05:30
Yue Wu
6c302b4b09 sampling bug fix in diffusers tutorial "basic_training.md" (#8223)
sampling bug fix in basic_training.md

In the diffusers basic training tutorial, setting the manual seed argument (generator=torch.manual_seed(config.seed)) in the pipeline call inside evaluate() function rewinds the dataloader shuffling, leading to overfitting due to the model seeing same sequence of training examples after every evaluation call. Using generator=torch.Generator(device='cpu').manual_seed(config.seed) avoids this.
2024-12-23 13:02:12 +05:30
Dhruv Nair
915c094068 Clean up from_single_file docs (#8268)
* update

* update
2024-12-23 13:02:12 +05:30
Lucain
ebb3b44eec Respect resume_download deprecation V2 (#8267)
* Fix resume_downoad FutureWarning

* only resume download
2024-12-23 13:02:12 +05:30
Sayak Paul
7425b46859 [Chore] run the documentation workflow in a custom container. (#8266)
run the documentation workflow in a custom container.
2024-12-23 13:02:12 +05:30
Yifan Zhou
ae1fb33d4c [Community Pipeline] FRESCO: Spatial-Temporal Correspondence for Zero-Shot Video Translation (#8239)
* code and doc

* update paper link

* remove redundant codes

* add example video

---------

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:12 +05:30
Dhruv Nair
d57bfe7405 Use freedesktop_os_release() in diffusers cli for Python >=3.10 (#8235)
* update

* update
2024-12-23 13:02:12 +05:30
Dhruv Nair
84b6effa66 Create custom container for doc builder (#8263)
* update

* update
2024-12-23 13:02:12 +05:30
Dhruv Nair
6534924ae1 Fix resize issue in SVD pipeline with VideoProcessor (#8229)
update

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2024-12-23 13:02:12 +05:30