* remove str option for quantization config in torchao
* Apply style fixes
* minor fixes
* Added AOBaseConfig docs to torchao.md
* minor fixes for removing str option torchao
* minor change to add back int and uint check
* minor fixes
* minor fixes to tests
* Update tests/quantization/torchao/test_torchao.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update docs/source/en/quantization/torchao.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update tests/quantization/torchao/test_torchao.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* version=2 update to test_torchao.py
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* feat: implement three RAE encoders (dinov2, siglip2, mae)
* feat: finish first version of autoencoder_rae
* fix formatting
* make fix-copies
* initial doc
* fix latent_mean / latent_var init types to accept config-friendly inputs
* use mean and std convention
* cleanup
* add rae to diffusers script
* use imports
* use attention
* remove unneeded class
* example training script
* input and ground truth sizes have to be the same
* fix argument
* move loss to training script
* cleanup
* simplify mixins
* fix training script
* fix entrypoint for instantiating the AutoencoderRAE
* added encoder_image_size config
* undo last change
* fixes from pretrained weights
* cleanups
* address reviews
* fix train script to use pretrained
* fix conversion script review
* latent normalization buffers are now always registered with no-op defaults
* Update examples/research_projects/autoencoder_rae/README.md
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Update src/diffusers/models/autoencoders/autoencoder_rae.py
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* use image url
* Encoder is frozen
* fix slow test
* remove config
* use ModelTesterMixin and AutoencoderTesterMixin
* make quality
* strip final layernorm when converting
* _strip_final_layernorm_affine for training script
* fix test
* add dispatch forward and update conversion script
* update training script
* error out as soon as possible and add comments
* Update src/diffusers/models/autoencoders/autoencoder_rae.py
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* use buffer
* inline
* Update src/diffusers/models/autoencoders/autoencoder_rae.py
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* remove optional
* _noising takes a generator
* Update src/diffusers/models/autoencoders/autoencoder_rae.py
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
* fix api
* rename
* remove unittest
* use randn_tensor
* fix device map on multigpu
* check if the key is missing in the original state dict and only then add to the allow_missing set
* remove initialize_weights
---------
Co-authored-by: wangyuqi <wangyuqi@MBP-FJDQNJTWYN-0208.local>
Co-authored-by: Kashif Rasul <kashif.rasul@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
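For readers unfamiliar with the "mean and std convention" and the "no-op defaults" mentioned in the RAE commits above, the idea can be sketched in plain Python. This is only an illustration under assumptions: `LatentNormalizer`, `latents_mean`, and `latents_std` are hypothetical names chosen here, and the real `AutoencoderRAE` code may differ.

```python
class LatentNormalizer:
    """Hypothetical helper for illustration; not the AutoencoderRAE implementation."""

    def __init__(self, latents_mean=None, latents_std=None):
        # Assumed no-op defaults: mean 0 and std 1 leave latents unchanged,
        # so the statistics can always be stored even when none are supplied.
        self.latents_mean = 0.0 if latents_mean is None else float(latents_mean)
        self.latents_std = 1.0 if latents_std is None else float(latents_std)

    def normalize(self, latents):
        # Shift by the mean, then scale by the standard deviation.
        return [(x - self.latents_mean) / self.latents_std for x in latents]

    def denormalize(self, latents):
        # Inverse of normalize.
        return [x * self.latents_std + self.latents_mean for x in latents]
```

With the defaults, `normalize` and `denormalize` return their input unchanged, which is why always registering the statistics does not alter behavior for checkpoints that carry none.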
* Fix Helios paper link in documentation
Updated the link to the Helios paper for accuracy.
* Fix reference link in HeliosTransformer3DModel documentation
Updated the reference link for the Helios Transformer model paper.
* Update Helios research paper link in documentation
* Update Helios research paper link in documentation
* LTX2 condition pipeline initial commit
* Fix pipeline import error
* Implement LTX-2-style general image conditioning
* Blend denoising output and clean latents in sample space instead of velocity space
* make style and make quality
* make fix-copies
* Rename LTX2VideoCondition image to frames
* Update LTX2ConditionPipeline example
* Remove support for image and video in __call__
* Put latent_idx_from_index logic inline
* Improve comment on using the conditioning mask in denoising loop
* Apply suggestions from code review
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
* make fix-copies
* Migrate to Python 3.9+ style type annotations without explicit typing imports
* Forward kwargs from preprocess/postprocess_video to preprocess/postprocess resp.
* Center crop LTX-2 conditions following original code
* Duplicate video and audio position ids if using CFG
* make style and make quality
* Remove unused index_type arg to preprocess_conditions
* Add # Copied from for _normalize_latents
* Fix _normalize_latents # Copied from statement
* Add LTX-2 condition pipeline docs
* Remove TODOs
* Support only unpacked latents (5D for video, 4D for audio)
* Remove # Copied from for prepare_audio_latents
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
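The "sample space instead of velocity space" blending from the LTX-2 condition commits above can be sketched as follows. This is a rough illustration, not the pipeline's code: it assumes a flow-matching parameterization where `x_t = (1 - sigma) * x0 + sigma * noise` and the model predicts `v = noise - x0`, and the function name and arguments are hypothetical.

```python
def blend_in_sample_space(noisy, velocity, clean, mask, sigma):
    """Blend predicted clean samples with known clean latents.

    Assumes x_t = (1 - sigma) * x0 + sigma * noise and v = noise - x0,
    so the predicted clean sample is x0 = x_t - sigma * v.
    mask is 1.0 where a conditioning latent is given and 0.0 elsewhere.
    """
    blended = []
    for x_t, v, x0_clean, m in zip(noisy, velocity, clean, mask):
        # Convert the velocity prediction to a clean-sample prediction.
        x0_pred = x_t - sigma * v
        # Keep the known clean latent where the mask is set.
        blended.append(m * x0_clean + (1.0 - m) * x0_pred)
    return blended
```

Under these assumptions, conditioned positions are pinned to their clean latents directly, rather than mixing velocities whose meaning depends on the current noise level.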
* up
* up up
* update outputs
* style
* add modular_auto_docstring!
* more auto docstring
* style
* up up up
* more more
* up
* address feedback
* add TODO in the description for empty docstring
* refactor based on dhruv's feedback: remove the class method
* add template method
* up
* up up up
* apply auto docstring
* make style
* remove space in make docstring
* Apply suggestions from code review
* revert change in z
* fix
* Apply style fixes
* include auto-docstring check in the modular ci. (#13004)
* initial support: workflow
* up up
* treat loop sequential pipeline blocks as leaf
* update qwen image docstring note
* add workflow support for sdxl
* add a test suite
* add test for qwen-image
* refactor flux a bit, separate modular_blocks into modular_blocks_flux and modular_blocks_flux_kontext + support workflow
* refactor flux2: separate blocks for klein_base + workflow
* qwen: remove import support for stuff other than the default blocks
* add workflow support for wan
* sdxl: remove some imports
* refactor z
* update flux2 auto core denoise
* add workflow test for z and flux2
* Apply suggestions from code review
* Apply suggestions from code review
* add test for flux
* add workflow test for flux
* add test for flux-klein
* sdxl: modular_blocks.py -> modular_blocks_stable_diffusion_xl.py
* style
* up
* add auto docstring
* workflow_names -> available_workflows
* fix workflow test for klein base
* Apply suggestions from code review
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* fix workflow tests
* qwen: edit -> image_conditioned to be consistent with flux kontext/2 such
* remove Optional
* update type hints
* update guider update_components
* fix more
* update docstring auto again
---------
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
* Support different pipeline outputs for LTX 2 encode_video
* Update examples to use improved encode_video function
* Fix comment
* Address review comments
* make style and make quality
* Have non-iterator video inputs respect video_chunks_number
* make style and make quality
* Add warning when encode_video receives a non-denormalized np.ndarray
* make style and make quality
---------
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* Add ZImageInpaintPipeline
Updated the pipeline structure to include ZImageInpaintPipeline
alongside ZImagePipeline and ZImageImg2ImgPipeline.
Implemented the ZImageInpaintPipeline class for inpainting
tasks, including necessary methods for encoding prompts,
preparing masked latents, and denoising.
Enhanced the auto_pipeline to map the new ZImageInpaintPipeline
for inpainting generation tasks.
Added unit tests for ZImageInpaintPipeline to ensure
functionality and performance.
Updated dummy objects to include ZImageInpaintPipeline for
testing purposes.
* Add documentation and improve test stability for ZImageInpaintPipeline
- Add torch.empty fix for x_pad_token and cap_pad_token in test
- Add # Copied from annotations for encode_prompt methods
- Add documentation with usage example and autodoc directive
* Address PR review feedback for ZImageInpaintPipeline
Add batch size validation and callback handling fixes per review,
using diffusers conventions rather than suggested code verbatim.
* Update src/diffusers/pipelines/z_image/pipeline_z_image_inpaint.py
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
* Update src/diffusers/pipelines/z_image/pipeline_z_image_inpaint.py
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
* Add input validation and fix XLA support for ZImageInpaintPipeline
- Add missing is_torch_xla_available import for TPU support
- Add xm.mark_step() in denoising loop for proper XLA execution
- Add check_inputs() method for comprehensive input validation
- Call check_inputs() at the start of __call__
Addresses PR review feedback from @asomoza.
* Cleanup
---------
Co-authored-by: Álvaro Somoza <asomoza@users.noreply.github.com>
* add metadata field to input/output param
* refactor mellonparam: move the template outside, add metaclass, define some generic template for custom node
* add from_custom_block
* style
* up up fix
* add mellon guide
* add to toctree
* style
* add mellon_types
* style
* mellon_type -> input_types + output_types
* update doc
* add quant info to components manager
* fix more
* up up
* fix components manager
* update custom block guide
* update
* style
* add a warn for mellon and add new guides to overview
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/modular_diffusers/mellon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* more update on custom block guide
* Update docs/source/en/modular_diffusers/mellon.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* a few manual
* apply suggestion: turn into bullets
* support define mellon meta with MellonParam directly, and update doc
* add the video
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
* add a real quick start guide
* Update docs/source/en/modular_diffusers/quickstart.md
* update a bit more
* fix
* Apply suggestions from code review
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/modular_diffusers/quickstart.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* Update docs/source/en/modular_diffusers/quickstart.md
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
* update more
* Apply suggestions from code review
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
* address more feedback: move components manager earlier, explain blocks vs sub-blocks etc
* more
* remove the link to the mellon guide, which does not exist in this PR yet
---------
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>