Disaggregated Prefill V1
This example contains scripts that demonstrate disaggregated prefill in the offline setting of vLLM.
Files
- run.sh - A helper script that will run prefill_example.py and decode_example.py sequentially.
  - Make sure you are in the examples/offline_inference/disaggregated-prefill-v1 directory before running run.sh.
- prefill_example.py - A script which performs prefill only, saving the KV state to the local_storage directory and the prompts to output.txt.
- decode_example.py - A script which performs decode only, loading the KV state from the local_storage directory and the prompts from output.txt.
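
The ordering matters because the decode step reads what the prefill step wrote. The contents of run.sh are not reproduced here, so the following is only a minimal Python sketch of the behavior described above, not the actual helper script. The function name run_sequentially and its parameters are hypothetical; only the two script names come from the file list.

```python
import subprocess
import sys
from pathlib import Path

# Script names are taken from the file list above; the rest is illustrative.
SCRIPTS = ("prefill_example.py", "decode_example.py")


def run_sequentially(workdir, scripts=SCRIPTS, runner=subprocess.run):
    """Run each script in order from workdir, stopping at the first failure.

    Hypothetical helper: a sketch of what a sequential runner could look
    like, not a copy of run.sh.
    """
    workdir = Path(workdir)

    # Mirror the "make sure you are in the example directory" note by
    # refusing to start if either script is missing from workdir.
    missing = [name for name in scripts if not (workdir / name).is_file()]
    if missing:
        raise FileNotFoundError(
            f"not found in {workdir}: {', '.join(missing)}"
        )

    executed = []
    for name in scripts:
        # check=True makes subprocess.run raise CalledProcessError on a
        # non-zero exit code, so decode is never started after a failed
        # prefill.
        runner([sys.executable, name], cwd=workdir, check=True)
        executed.append(name)
    return executed
```

Calling run_sequentially(Path.cwd()) from inside the example directory would run prefill first and decode second. The runner parameter exists only so the ordering can be checked without launching vLLM.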