Compare commits


2 Commits

| Author | SHA1 | Message | Date |
| --- | --- | --- | --- |
| Sayak Paul | 611034eb74 | Update docs/source/en/optimization/attention_backends.md (Co-authored-by: Steven Liu <59462357+stevhliu@users.noreply.github.com>) | 2026-03-18 23:31:40 +05:30 |
| Sayak Paul | 052d5e6d5f | Update attention_backends.md | 2026-03-18 15:43:53 +05:30 |


@@ -35,7 +35,7 @@ The [`~ModelMixin.set_attention_backend`] method iterates through all the module
The example below demonstrates how to enable the `_flash_3_hub` implementation for FlashAttention-3 from the [`kernels`](https://github.com/huggingface/kernels) library, which allows you to instantly use optimized compute kernels from the Hub without requiring any setup.
> [!NOTE]
> FlashAttention-3 is only supported on Hopper GPUs. On non-Hopper architectures, fall back to FlashAttention with `set_attention_backend("flash")`, which itself requires an Ampere GPU at a minimum.
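The architecture check in the note above can be sketched as a small helper. This is a hypothetical function, not part of the Diffusers API; it only assumes the backend strings (`"_flash_3_hub"`, `"flash"`) used by `set_attention_backend` in these docs, and that Hopper GPUs report CUDA compute capability 9.x and Ampere 8.x:

```python
def pick_flash_backend(compute_capability: tuple[int, int]) -> str:
    """Map a CUDA compute capability to an attention backend name.

    Hypothetical helper: chooses FlashAttention-3 on Hopper and falls
    back to FlashAttention on pre-Hopper GPUs, per the note above.
    """
    major, _ = compute_capability
    if major >= 9:   # Hopper (SM 9.x) and newer
        return "_flash_3_hub"
    if major >= 8:   # Ampere (SM 8.x): fall back to FlashAttention
        return "flash"
    raise ValueError("FlashAttention requires an Ampere GPU at a minimum")

print(pick_flash_backend((9, 0)))  # _flash_3_hub
print(pick_flash_backend((8, 6)))  # flash
```

In practice you would pass `torch.cuda.get_device_capability()` and hand the result to `set_attention_backend`, as in the full example below.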
```py
import torch