mirror of https://github.com/huggingface/diffusers.git synced 2026-01-27 14:06:03 +08:00


Aryan 0454fbb30b First Block Cache (#11180)

* update

* modify flux single blocks to make compatible with cache techniques (without too much model-specific intrusion code)

* remove debug logs

* update

* cache context for different batches of data

* fix hs residual bug for single return outputs; support ltx

* fix controlnet flux

* support flux, ltx i2v, ltx condition

* update

* update

* Update docs/source/en/api/cache.md

* Update src/diffusers/hooks/hooks.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* address review comments pt. 1

* address review comments pt. 2

* cache context refactor; address review pt. 3

* address review comments

* metadata registration with decorators instead of centralized

* support cogvideox

* support mochi

* fix

* remove unused function

* remove central registry based on review

* update

---------

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

2025-07-09 03:27:15 +05:30

1.1 KiB


Caching methods

Cache methods speed up diffusion transformers by storing and reusing the intermediate outputs of specific layers, such as attention and feedforward layers, instead of recalculating them at each inference step.
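The general idea can be illustrated with a minimal, library-independent sketch. Everything in it is hypothetical and only for illustration: `StepCachedLayer`, `expensive_layer`, `run_denoising`, and the `refresh_every` parameter are not part of the diffusers API, and a plain float stands in for a hidden-state tensor. The wrapper recomputes the layer on some steps and returns the stored output on the others.

```python
from typing import Callable, Optional


class StepCachedLayer:
    """Toy wrapper (hypothetical): recompute every `refresh_every` steps, else reuse the cache."""

    def __init__(self, layer: Callable[[float], float], refresh_every: int = 2):
        self.layer = layer
        self.refresh_every = refresh_every
        self.cached: Optional[float] = None
        self.calls = 0  # how many times the wrapped layer actually ran

    def __call__(self, x: float, step: int) -> float:
        # Recompute on the first call and on every `refresh_every`-th step.
        if self.cached is None or step % self.refresh_every == 0:
            self.cached = self.layer(x)
            self.calls += 1
        # On all other steps the stored output is returned unchanged.
        return self.cached


def expensive_layer(x: float) -> float:
    # Stand-in for an attention or feedforward block.
    return 0.5 * x + 1.0


def run_denoising(num_steps: int = 6, refresh_every: int = 2):
    """Run a toy denoising loop and report how often the layer was computed."""
    layer = StepCachedLayer(expensive_layer, refresh_every)
    x = 8.0
    for step in range(num_steps):
        x = x - 0.1 * layer(x, step)
    return x, layer.calls


if __name__ == "__main__":
    _, calls = run_denoising(num_steps=6, refresh_every=2)
    print(f"layer computed {calls} times over 6 steps")
```

Reusing a stale output trades some accuracy for speed, which is why the actual methods documented below expose configuration classes to control when a cached value is reused.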

CacheMixin

autodoc CacheMixin

PyramidAttentionBroadcastConfig

autodoc PyramidAttentionBroadcastConfig

autodoc apply_pyramid_attention_broadcast

FasterCacheConfig

autodoc FasterCacheConfig

autodoc apply_faster_cache

FirstBlockCacheConfig

autodoc FirstBlockCacheConfig

autodoc apply_first_block_cache
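As a rough intuition for this method, the sketch below assumes that First Block Cache works by comparing the residual of the first transformer block between consecutive steps and, when the relative change is small, skipping the remaining blocks and reusing their cached residual. This is a toy under that assumption, not the library's implementation: `ToyFirstBlockCache`, its attributes, and the scalar "blocks" are hypothetical, and a plain float stands in for a hidden-state tensor.

```python
from typing import Callable, List, Optional

Block = Callable[[float], float]


class ToyFirstBlockCache:
    """Hypothetical illustration: skip the tail blocks when the first block barely changes."""

    def __init__(self, blocks: List[Block], threshold: float):
        self.first, self.rest = blocks[0], blocks[1:]
        self.threshold = threshold
        self.prev_first_residual: Optional[float] = None
        self.tail_residual: Optional[float] = None
        self.skipped = 0  # number of calls that reused the cached tail residual

    def __call__(self, x: float) -> float:
        # The first block always runs; its residual is the change signal.
        h = self.first(x)
        first_residual = h - x

        if self.prev_first_residual is not None and self.tail_residual is not None:
            change = abs(first_residual - self.prev_first_residual)
            scale = abs(self.prev_first_residual) + 1e-8
            if change / scale < self.threshold:
                # Small relative change: reuse the cached residual of the tail blocks.
                self.skipped += 1
                return h + self.tail_residual

        # Otherwise run the remaining blocks and refresh both cached values.
        self.prev_first_residual = first_residual
        out = h
        for block in self.rest:
            out = block(out)
        self.tail_residual = out - h
        return out


if __name__ == "__main__":
    model = ToyFirstBlockCache(
        [lambda x: 1.1 * x, lambda x: 2.0 * x, lambda x: x - 3.0], threshold=0.1
    )
    for x in (1.0, 1.05, 2.0):
        print(x, model(x), "skipped so far:", model.skipped)
```

In this toy, a larger `threshold` skips more often and drifts further from the exact output, while a `threshold` of zero never skips.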
