Commit Graph

3605 Commits

Author SHA1 Message Date
Kashif Rasul
7debd07541 Merge branch 'main' into rae 2026-02-26 11:08:08 +01:00
Kirill Stukalov
97c2c6e397 Fix wrong do_classifier_free_guidance threshold in ZImagePipeline (#13183)
Z-Image uses CFG formula `pred = pos + scale * (pos - neg)` where
`guidance_scale = 0` means no guidance. The threshold should be `> 0`
instead of `> 1` to match this formula.
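A minimal sketch of the corrected branch (illustrative only, not the actual pipeline code; scalar floats stand in for the noise-prediction tensors):

```python
def apply_guidance(pos: float, neg: float, guidance_scale: float) -> float:
    # Z-Image convention: guidance_scale == 0 means "no guidance", so the
    # CFG branch must check `> 0`. The old `> 1` check wrongly skipped
    # guidance for scales in (0, 1], where the formula still changes pred.
    do_classifier_free_guidance = guidance_scale > 0
    if not do_classifier_free_guidance:
        return pos
    return pos + guidance_scale * (pos - neg)
```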

Co-authored-by: Hezlich2 <typretypre@gmail.com>
2026-02-25 15:08:11 -10:00
Miguel Martin
212db7b999 Cosmos Transfer2.5 Auto-Regressive Inference Pipeline (#13114)
* AR

* address comments

* address comments 2
2026-02-25 14:42:29 -10:00
Sayak Paul
31058485f1 [attention backends] use dedicated wrappers from fa3 for cp. (#13165)
* use dedicated wrappers from fa3 for cp.

* up
2026-02-26 00:36:01 +05:30
Kashif Rasul
b297868201 fixes from pretrained weights 2026-02-25 13:38:22 +00:00
SYM.BOT
1f6ac1c3d1 fix: graceful fallback when attention backends fail to import (#13060)
* fix: graceful fallback when attention backends fail to import

## Problem

External attention backends (flash_attn, xformers, sageattention, etc.) may be
installed but fail to import at runtime due to ABI mismatches. For example,
when `flash_attn` is compiled against PyTorch 2.4 but used with PyTorch 2.8,
the import fails with:

```
OSError: .../flash_attn_2_cuda.cpython-311-x86_64-linux-gnu.so: undefined symbol: _ZN3c104cuda9SetDeviceEab
```

The current code uses `importlib.util.find_spec()` to check if packages exist,
but this only verifies the package is installed, not that it can actually be
imported. When the import fails, diffusers crashes instead of falling back to
native PyTorch attention.

## Solution

Wrap all external attention backend imports in try-except blocks that catch
`ImportError` and `OSError`. On failure:
1. Log a warning message explaining the issue
2. Set the corresponding `_CAN_USE_*` flag to `False`
3. Set the imported functions to `None`

This allows diffusers to gracefully degrade to PyTorch's native SDPA
(scaled_dot_product_attention) instead of crashing.
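The pattern can be sketched with a hypothetical `safe_import` helper (this is not the actual diffusers code, which guards each backend's imports inline; the exception tuple matches the one described above):

```python
import importlib
import logging

logger = logging.getLogger(__name__)

def safe_import(module_name: str):
    """Import a backend module, returning (module, can_use) and degrading
    gracefully when the import itself blows up (e.g. an ABI-mismatch OSError)."""
    try:
        return importlib.import_module(module_name), True
    except (ImportError, OSError, RuntimeError) as exc:
        logger.warning("%s is unavailable (%s); falling back to native SDPA.",
                       module_name, exc)
        return None, False

# A present stdlib module imports fine...
math_mod, can_use_math = safe_import("math")
# ...while a broken or absent backend degrades instead of crashing.
fa_mod, can_use_fa = safe_import("flash_attn_definitely_not_installed")
```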

## Affected backends

- flash_attn (Flash Attention)
- flash_attn_3 (Flash Attention 3)
- aiter (AMD Instinct)
- sageattention (SageAttention)
- flex_attention (PyTorch Flex Attention)
- torch_npu (Huawei NPU)
- torch_xla (TPU/XLA)
- xformers (Meta xFormers)

## Testing

Tested with PyTorch 2.8.0 and flash_attn 2.7.4.post1 (compiled for PyTorch 2.4).
Before: crashes on import. After: logs warning and uses native attention.

* address review: use single logger and catch RuntimeError

- Move logger to module level instead of creating per-backend loggers
- Add RuntimeError to exception list alongside ImportError and OSError

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

* Apply style fixes

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-02-24 13:37:39 +05:30
Sayak Paul
5e94d62eb4 migrate to transformers v5 (#12976)
* switch to transformers main again.

* more

* up

* up

* fix group offloading.

* attributes

* up

* up

* tie embedding issue.

* fix t5 stuff for more.

* matrix configuration to see differences between 4.57.3 and main failures.

* change qwen expected slice because of how init is handled in v5.

* same stuff.

* up

* up

* Revert "up"

This reverts commit 515dd06db5.

* Revert "up"

This reverts commit 5274ffdd7f.

* up

* up

* fix with peft_format.

* just keep main for easier debugging.

* remove torchvision.

* empty

* up

* up with skyreelsv2 fixes.

* fix skyreels type annotation.

* up

* up

* fix variant loading issues.

* more fixes.

* fix dduf

* fix

* fix

* fix

* more fixes

* fixes

* up

* up

* fix dduf test

* up

* more

* update

* hopefully, final?

* one last breath

* always install from main

* up

* audioldm tests

* up

* fix PRX tests.

* up

* kandinsky fixes

* qwen fixes.

* prx

* hidream
2026-02-24 10:53:56 +05:30
dg845
7ab2011759 Fix AutoModel typing Import Error (#13178)
Fix typing import by converting to Python 3.9+ style type hint
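The change in question is the standard PEP 585 modernization (an illustrative sketch, not the actual diff; the function names are made up):

```python
# Pre-3.9 style: requires importing generic aliases from typing.
from typing import Dict, Optional

def old_style(config: Dict[str, int]) -> Optional[str]: ...

# Python 3.9+ style: builtin generics, no typing import needed,
# so a missing `typing` import can no longer break module loading.
def new_style(config: dict[str, int]) -> "str | None": ...
```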
2026-02-24 07:58:43 +05:30
Dhruv Nair
4890e9bf70 Allow Automodel to use from_config with custom code. (#13123)
* update

* update
2026-02-23 21:55:59 +05:30
David Bertoin
f1e5914120 Fix T5GemmaEncoder loading for transformers 5.x composite T5GemmaConfig (#13143) 2026-02-23 15:45:45 +05:30
Kashif Rasul
28a02eb226 undo last change 2026-02-23 10:05:24 +00:00
Kashif Rasul
61885f37e3 added encoder_image_size config 2026-02-23 09:59:26 +00:00
Kashif Rasul
c68b812cb0 fix entrypoint for instantiating the AutoencoderRAE 2026-02-23 09:40:18 +00:00
Álvaro Somoza
a80b19218b Support Flux Klein peft (fal) lora format (#13169)
peft (fal) lora format
2026-02-21 10:31:18 +05:30
Animesh Jain
01de02e8b4 [gguf][torch.compile time] Convert to plain tensor earlier in dequantize_gguf_tensor (#13166)
[gguf] Convert to plain tensor earlier in dequantize_gguf_tensor

Once dequantize_gguf_tensor fetches the quant_type attribute from the
GGUFParameter tensor subclass, there is no further need to run the
actual dequantize operations on the tensor subclass; we can convert
to a plain tensor right away.

This not only makes PyTorch eager execution faster, but also reduces
torch.compile tracing time from 36 seconds to 10 seconds, because there is a
lot less code to trace now.
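The idea can be sketched like this (a toy stand-in, assuming nothing about the real GGUFParameter class beyond a metadata attribute; `Tensor.as_subclass` is the standard way to drop to a plain tensor):

```python
import torch

class QuantParam(torch.Tensor):
    """Stand-in for a GGUFParameter-style tensor subclass carrying quant metadata."""
    quant_type = "Q4_K"  # illustrative; real code reads this per parameter

def dequantize(t: torch.Tensor) -> torch.Tensor:
    qt = getattr(t, "quant_type", None)  # the metadata read needs the subclass...
    # ...but the math does not: convert to a plain torch.Tensor immediately so
    # neither eager dispatch nor the torch.compile tracer walks subclass hooks.
    plain = t.as_subclass(torch.Tensor)
    return plain.float() if qt is not None else plain
```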
2026-02-20 09:31:52 +05:30
Sayak Paul
99daaa802d [core] Enable CP for kernels-based attention backends (#12812)
* up

* up

* up

* up

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2026-02-19 18:16:50 +05:30
dg845
fe78a7b7c6 Fix ftfy import for PRX Pipeline (#13154)
* Guard ftfy import with is_ftfy_available

* Remove xfail for PRX pipeline tests as they appear to work on transformers>4.57.1

* make style and make quality
2026-02-18 20:44:33 -08:00
dxqb
a577ec36df Flux2: Tensor tuples can cause issues for checkpointing (#12777)
* split tensors inside the transformer blocks to avoid checkpointing issues

* clean up, fix type hints

* fix merge error

* Apply style fixes

---------

Co-authored-by: s <you@example.com>
Co-authored-by: dg845 <58458699+dg845@users.noreply.github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
2026-02-18 17:03:22 -08:00
David El Malih
64734b2115 docs: improve docstring scheduling_flow_match_lcm.py (#13160)
Improve docstring scheduling flow match lcm
2026-02-18 10:52:02 -08:00
Dhruv Nair
f81e653197 [CI] Add ftfy as a test dependency (#13155)
* update

* update

* update

* update

* update

* update
2026-02-18 22:51:10 +05:30
Kashif Rasul
d8b2983b9e Merge branch 'main' into rae 2026-02-17 10:10:40 +01:00
zhangtao0408
bcbbded7c3 [Bug] Fix QwenImageEditPlus Series on NPU (#13017)
* [Bug Fix][Qwen-Image-Edit] Fix Qwen-Image-Edit series on NPU

* Enhance NPU attention handling by converting attention mask to boolean and refining mask checks.

* Refine attention mask handling in NPU attention function to improve validation and conversion logic.

* Clean Code

* Refine attention mask processing in NPU attention functions to enhance performance and validation.

* Remove item() ops on npu fa backend.

* Reuse NPU attention mask by `_maybe_modify_attn_mask_npu`

* Apply style fixes

* Update src/diffusers/models/attention_dispatch.py

---------

Co-authored-by: zhangtao <zhangtao529@huawei.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2026-02-17 09:10:40 +05:30
Kashif Rasul
a4fc9f64b2 simplify mixins 2026-02-16 12:52:20 +00:00
Kashif Rasul
fc5295951a cleanup 2026-02-16 12:40:36 +00:00
Kashif Rasul
96520c4ff1 move loss to training script 2026-02-16 12:35:18 +00:00
Sayak Paul
35086ac06a [core] support device type device_maps to work with offloading. (#12811)
* support device type device_maps to work with offloading.

* add tests.

* fix tests

* skip tests where it's not supported.

* empty

* up

* up

* fix allegro.
2026-02-16 16:31:45 +05:30
Dhruv Nair
59e7a46928 [Pipelines] Remove k-diffusion (#13152)
* remove k-diffusion

* fix copies
2026-02-16 13:54:24 +05:30
Kashif Rasul
906d79a432 input and ground truth sizes have to be the same 2026-02-16 00:02:27 +00:00
Kashif Rasul
6a9bde6964 remove unneeded class 2026-02-15 23:55:06 +00:00
Kashif Rasul
e6d449933d use attention 2026-02-15 23:50:52 +00:00
Kashif Rasul
7cbbf271f3 use imports 2026-02-15 23:33:30 +00:00
Kashif Rasul
0d59b22732 cleanup 2026-02-15 23:19:13 +00:00
Kashif Rasul
d7cb12470b use mean and std convention 2026-02-15 22:57:02 +00:00
Kashif Rasul
f06ea7a901 fix latent_mean / latent_var init types to accept config-friendly inputs 2026-02-15 22:51:36 +00:00
Kashif Rasul
24acab0bcc make fix-copies 2026-02-15 22:44:16 +00:00
Kashif Rasul
0850c8cdc9 fix formatting 2026-02-15 22:39:59 +00:00
Kashif Rasul
3ecf89d044 Merge branch 'main' into rae 2026-02-15 23:05:44 +01:00
Álvaro Somoza
b0dc51da31 [LTX2] Fix wrong lora mixin (#13144)
change lora mixin

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
2026-02-15 11:36:17 +05:30
YiYi Xu
c919ec0611 [Modular] add explicit workflow support (#13028)
* up

* up up

* update outputs

* style

* add modular_auto_docstring!

* more auto docstring

* style

* up up up

* more more

* up

* address feedbacks

* add TODO in the description for empty docstring

* refactor based on dhruv's feedback: remove the class method

* add template method

* up

* up up up

* apply auto docstring

* make style

* remove space in make docstring

* Apply suggestions from code review

* revert change in z

* fix

* Apply style fixes

* include auto-docstring check in the modular ci. (#13004)

* initial support: workflow

* up up

* treat loop sequential pipeline blocks as leaf

* update qwen image docstring note

* add workflow support for sdxl

* add a test suite

* add test for qwen-image

* refactor flux a bit, separate modular_blocks into modular_blocks_flux and modular_blocks_flux_kontext + support workflow

* refactor flux2: separate blocks for klein_base + workflow

* qwen: remove import support for stuff other than the default blocks

* add workflow support for wan

* sdxl: remove some imports:

* refactor z

* update flux2 auto core denoise

* add workflow test for z and flux2

* Apply suggestions from code review

* Apply suggestions from code review

* add test for flux

* add workflow test for flux

* add test for flux-klein

* sdxl: modular_blocks.py -> modular_blocks_stable_diffusion_xl.py

* style

* up

* add auto docstring

* workflow_names -> available_workflows

* fix workflow test for klein base

* Apply suggestions from code review

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* fix workflow tests

* qwen: edit -> image_conditioned to be consistent with flux kontext/2

* remove Optional

* update type hints

* update guider update_components

* fix more

* update docstring auto again

---------

Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2026-02-14 16:18:48 -10:00
YiYi Xu
3c7506b294 [Modular] update doc for ModularPipeline (#13100)
* update create pipeline section

* update more

* update more

* more

* add a section on running pipeline modularly

* refactor update_components, remove support for spec

* style

* bullet points

* update the pipeline block

* small fix in state doc

* update sequential doc

* fix link

* small update on quickstart

* add a note on how to run pipeline without the components manager

* Apply suggestions from code review

Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>

* remove the supported models mention

* update more

* up

* revert type hint changes

---------

Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-161-123.ec2.internal>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>
Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>
2026-02-14 11:43:28 -10:00
YiYi Xu
19ab0ecb9e fix guider (#13147)
fix
2026-02-14 11:12:22 -10:00
YiYi Xu
5b00a18374 fix MT5Tokenizer (#13146)
up
2026-02-14 09:40:07 -10:00
YiYi Xu
6141ae2348 [Modular] add different pipeline blocks to init (#13145)
* up

* style + copies

* fix

---------

Co-authored-by: yiyi@huggingface.co <yiyi@ip-26-0-160-103.ec2.internal>
2026-02-13 18:36:47 -10:00
Sayak Paul
3c1c62ec9d [docs] fix ltx2 i2v docstring. (#13135)
* fix ltx2 i2v docstring.

* up
2026-02-14 08:40:16 +05:30
Sayak Paul
8abcf351c9 feat: implement apply_lora_scale to remove boilerplate. (#12994)
* feat: implement apply_lora_scale to remove boilerplate.

* apply to the rest.

* up

* remove more.

* remove.

* fix

* apply feedback.
2026-02-13 23:25:46 +05:30
Sayak Paul
2843b3d37a Sunset Python 3.8 & get rid of explicit typing exports where possible (#12524)
* drop python 3.8

* remove list, tuple, dict from typing

* fold Unions into |

* up

* fix a bunch and please me.

* up

* up

* up

* up

* up

* up

* enforce 3.10.0.

* up

* up

* up

* up

* up

* up

* up

* up

* Update setup.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* up.

* python 3.10.

* fix

* up

* up

* up

* up

* final

* up

* fix typing utils.

* up

* up

* up

* up

* up

* up

* fix

* up

* up

* up

* up

* up

* up

* handle modern types.

* up

* up

* fix ip adapter type checking.

* up

* up

* up

* up

* up

* up

* up

* revert docstring changes.

* keep deleted files deleted.

* keep deleted files deleted.

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
2026-02-13 18:16:51 +05:30
Sayak Paul
76af013a41 fix cosmos transformer typing. (#13134) 2026-02-13 14:51:19 +05:30
David El Malih
5f3ea22513 docs: improve docstring scheduling_flow_match_heun_discrete.py (#13130)
Improve docstring scheduling flow match heun discrete
2026-02-12 14:32:04 -08:00
dg845
985d83c948 Fix LTX-2 Inference when num_videos_per_prompt > 1 and CFG is Enabled (#13121)
Fix LTX-2 inference when num_videos_per_prompt > 1 and CFG is enabled
2026-02-11 22:35:29 -08:00
Miguel Martin
a1816166a5 Cosmos Transfer2.5 inference pipeline: general/{seg, depth, blur, edge} (#13066)
* initial conversion script

* cosmos control net block

* CosmosAttention

* base model conversion

* wip

* pipeline updates

* convert controlnet

* pipeline: working without controls

* wip

* debugging

* Almost working

* temp

* control working

* cleanup + detail on neg_encoder_hidden_states

* convert edge

* pos emb for control latents

* convert all chkpts

* resolve TODOs

* remove prints

* Docs

* add siglip image reference encoder

* Add unit tests

* controlnet: add duplicate layers

* Additional tests

* skip less

* skip less

* remove image_ref

* minor

* docs

* remove skipped test in transfer

* Don't crash process

* formatting

* revert some changes

* remove skipped test

* make style

* Address comment + fix example

* CosmosAttnProcessor2_0 revert + CosmosAttnProcessor2_5 changes

* make style

* make fix-copies
2026-02-11 18:33:09 -10:00