mirror of https://github.com/huggingface/diffusers.git synced 2025-12-16 01:14:47 +08:00

Tran Thanh Luan 6290fdfda4 [Feat] TaylorSeer Cache (#12648)

* init taylor_seer cache

* make compatible with any tuple size returned

* use logger for printing, add warmup feature

* still update in warmup steps

* refactor, add docs

* add configurable cache, skip compute module

* allow special cache ids only

* add stop_predicts (cooldown)

* update docs

* apply ruff

* update to handle multiple calls per timestep

* refactor to use state manager

* fix format & doc

* chores: naming, remove redundancy

* add docs

* quality & style

* fix taylor precision

* Apply style fixes

* add tests

* Apply style fixes

* Remove TaylorSeerCacheTesterMixin from flux2 tests

* rename identifiers, use more expressive taylor predict loop

* torch compile compatible

* Apply style fixes

* Update src/diffusers/hooks/taylorseer_cache.py

Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>

* update docs

* make fix-copies

* fix example usage.

* remove tests on flux kontext

---------

Co-authored-by: toilaluan <toilaluan@github.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Co-authored-by: Dhruv Nair <dhruv.nair@gmail.com>
Co-authored-by: Sayak Paul <spsayakpaul@gmail.com>

2025-12-06 05:39:54 +05:30

1.2 KiB

Caching methods

Caching methods speed up diffusion transformers by storing and reusing the intermediate outputs of specific layers, such as attention and feedforward layers, instead of recomputing them at every inference step.
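The idea of reusing a layer's output across inference steps can be illustrated with a toy, library-independent sketch. Everything here is hypothetical: `CachedLayer`, `skip_range`, and `expensive_layer` are illustrative names, not part of the diffusers API, and the real cache hooks decide when to recompute in more elaborate ways.

```python
from typing import Callable, Optional


class CachedLayer:
    """Toy wrapper (hypothetical): recompute the wrapped layer only every
    `skip_range` steps and return the stored output in between."""

    def __init__(self, layer: Callable[[float], float], skip_range: int = 2) -> None:
        self.layer = layer
        self.skip_range = skip_range          # illustrative knob, not a diffusers parameter
        self.cached: Optional[float] = None   # last computed output
        self.step = 0                         # number of calls seen so far
        self.num_computes = 0                 # how often the real layer actually ran

    def __call__(self, x: float) -> float:
        # Recompute on the first call and on every `skip_range`-th step.
        if self.cached is None or self.step % self.skip_range == 0:
            self.cached = self.layer(x)
            self.num_computes += 1
        self.step += 1
        return self.cached


def expensive_layer(x: float) -> float:
    # Stand-in for an attention or feedforward block.
    return x * x + 1.0


cached = CachedLayer(expensive_layer, skip_range=2)
# Inputs drift slowly between steps, as hidden states tend to during denoising.
outputs = [cached(x) for x in [1.0, 1.1, 1.2, 1.3]]
```

With `skip_range=2`, the wrapped layer runs on only two of the four steps; the other two steps return the stored output, trading a small approximation error for less computation.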

CacheMixin

autodoc CacheMixin

PyramidAttentionBroadcastConfig

autodoc PyramidAttentionBroadcastConfig

autodoc apply_pyramid_attention_broadcast

FasterCacheConfig

autodoc FasterCacheConfig

autodoc apply_faster_cache

FirstBlockCacheConfig

autodoc FirstBlockCacheConfig

autodoc apply_first_block_cache

TaylorSeerCacheConfig

autodoc TaylorSeerCacheConfig

autodoc apply_taylorseer_cache
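
The name and the commit notes above (a "taylor predict loop", warmup, and cooldown steps) suggest that this cache forecasts skipped outputs instead of reusing them unchanged. The following is a minimal sketch of Taylor-style extrapolation in general, assuming scalar features sampled at equally spaced steps. It is not the implementation behind `apply_taylorseer_cache`, and `finite_differences` and `taylor_predict` are hypothetical helper names.

```python
from math import factorial
from typing import List


def finite_differences(history: List[float], interval: int, max_order: int) -> List[float]:
    """Estimate derivatives at the newest sample with backward differences.

    `history` holds outputs from fully computed steps, oldest first,
    spaced `interval` steps apart. Returns [value, 1st derivative, ...].
    """
    derivs = [history[-1]]
    level = list(history)
    for _ in range(max_order):
        if len(level) < 2:
            break  # not enough samples for a higher-order estimate
        level = [(b - a) / interval for a, b in zip(level, level[1:])]
        derivs.append(level[-1])
    return derivs


def taylor_predict(derivs: List[float], elapsed: int) -> float:
    """Extrapolate `elapsed` steps past the newest computed sample."""
    return sum(d * elapsed**k / factorial(k) for k, d in enumerate(derivs))


# Outputs of a (toy) layer computed at steps 0, 2 and 4.
history = [1.0, 7.0, 13.0]
derivs = finite_differences(history, interval=2, max_order=2)
predicted = taylor_predict(derivs, elapsed=1)  # forecast for the skipped step 5
```

In this example the samples lie on a straight line, so the forecast matches the true value exactly; for curved trajectories the backward differences only approximate the derivatives and the forecast carries some error.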