Varun Sundar Rabindranath
|
fb0571b077
|
[GPTOSS][DP/EP][Marlin] Enable GPTOSS Batched DP/EP using Marlin kernels (#25997)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
|
2025-10-16 12:53:11 -07:00 |
|
Kay Yan
|
02d709a6f1
|
[docs] standardize Hugging Face env var to HF_TOKEN (deprecates HUGGING_FACE_HUB_TOKEN) (#27020)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
|
2025-10-16 15:31:02 +01:00 |
|
Chendi.Xue
|
509cdc0370
|
[DOC][XPU]update feature parity with Intel GPU (#26954)
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
|
2025-10-15 20:07:10 -07:00 |
|
Pradeep Dasigi
|
4794c2bd92
|
Olmo 3 tool parser and tests (#26143)
Signed-off-by: Pradeep Dasigi <pradeepd@allenai.org>
|
2025-10-15 16:36:12 +00:00 |
|
Boyuan Feng
|
f57438338d
|
[BugFix] Patch inductor memory plan logic (#26878)
Signed-off-by: Boyuan Feng <boyuan@meta.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-15 12:51:45 +00:00 |
|
wangxiyuan
|
8f4b313c37
|
[Misc] rename torch_dtype to dtype (#26695)
Signed-off-by: wangxiyuan <wangxiyuan1007@gmail.com>
|
2025-10-15 12:11:48 +00:00 |
|
youkaichao
|
650b51f9f9
|
[doc] add Context Parallel Deployment doc (#26877)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-10-15 16:33:52 +08:00 |
|
Cyrus Leung
|
6256697997
|
[Doc] ruff format remaining Python examples (#26795)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-15 01:25:49 -07:00 |
|
Michael Yao
|
c43ca8259e
|
[Docs] Move build.inc into arm.inc (#26862)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
|
2025-10-14 20:35:08 -07:00 |
|
Tao Hui
|
85a65e7f51
|
[Model] Add DeepSeek-V3.1 reasoning parser (split from PR #24972) (#25589)
Signed-off-by: taohui <taohui3@gmail.com>
Signed-off-by: Tao Hui <taohui3@gmail.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: Chauncey <chaunceyjiang@gmail.com>
|
2025-10-15 11:09:52 +08:00 |
|
Morrison Turnansky
|
96b9aa5aa0
|
[Frontend][torch.compile] CompilationConfig Overhaul (#20283): name change compilation level to compilation mode, deprecation compilation level (#26355)
Signed-off-by: morrison-turnansky <mturnans@redhat.com>
Signed-off-by: Morrison Turnansky <mturnans@redhat.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-15 02:51:16 +00:00 |
|
HDCharles
|
b92ab3deda
|
Notice for deprecation of AutoAWQ (#26820)
Signed-off-by: HDCharles <39544797+HDCharles@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-14 13:39:59 -07:00 |
|
Cyrus Leung
|
9c4cb68339
|
[Chore] Remove SupportsV0Only interface and update supported models docs (#26783)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-14 04:55:10 -07:00 |
|
Cyrus Leung
|
ef9676a1f1
|
[Doc] ruff format some Python examples (#26767)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-14 03:21:53 -07:00 |
|
Chendi.Xue
|
7e6edb1469
|
[NIXL][HeteroTP] Enable KV transfer from HND prefill to NHD decode (#26556)
Signed-off-by: Chendi Xue <chendi.xue@intel.com>
|
2025-10-14 09:46:05 +00:00 |
|
Michael Yao
|
2e36cdbe2b
|
[Docs] Add a start tag to build.inc.md (#26747)
Signed-off-by: windsonsea <haifeng.yao@daocloud.io>
|
2025-10-13 21:51:55 -07:00 |
|
Maximilien de Bayser
|
fe3edb4cf0
|
Add support for the /rerank endpoint in vllm bench serve (#26602)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-10-14 04:25:43 +00:00 |
|
Michael Goin
|
3e051bda82
|
[UX] Replace VLLM_ALL2ALL_BACKEND with --all2all-backend (#26732)
Signed-off-by: mgoin <mgoin64@gmail.com>
|
2025-10-13 18:12:52 -07:00 |
|
wang.yuqi
|
767c3ab869
|
[Model][0/N] Improve all pooling task | clean up (#25817)
Signed-off-by: wang.yuqi <noooop@126.com>
|
2025-10-13 16:44:50 +08:00 |
|
CSWYF3634076
|
782505ed8e
|
[Model] Add reasoning_parser and tool_parser for Ernie45 thinking (#25027)
Signed-off-by: wangyafeng <wangyafeng@baidu.com>
|
2025-10-13 15:55:20 +08:00 |
|
Harry Mellor
|
8fcaaf6a16
|
Update Optional[x] -> x | None and Union[x, y] to x | y (#26633)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-12 09:51:31 -07:00 |
|
Xiong Wang
|
19a9b169bf
|
Add Qwen3-Omni moe thinker (#25550)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
Signed-off-by: Roger Wang <hey@rogerw.io>
Signed-off-by: Isotr0py <mozf@mail2.sysu.edu.cn>
Co-authored-by: Xiong Wang <feizi.wx@alibaba-inc.com>
Co-authored-by: DarkLight1337 <tlleungac@connect.ust.hk>
Co-authored-by: Roger Wang <hey@rogerw.io>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-10 17:00:56 +00:00 |
|
Shane A
|
8d2b8c0ff2
|
[Model] Add FlexOlmo model implementation (#24923)
Signed-off-by: Shane A <shanea@allenai.org>
|
2025-10-10 09:43:15 -07:00 |
|
Julien Denize
|
c6187f55f7
|
Refactor MistralTokenizer (#26358)
Signed-off-by: Julien Denize <julien.denize@mistral.ai>
|
2025-10-09 22:48:58 +00:00 |
|
youkaichao
|
e6e898f95d
|
[doc] add Volcengine as a compute sponsor (#26477)
Signed-off-by: youkaichao <youkaichao@gmail.com>
|
2025-10-09 17:11:47 +08:00 |
|
Wenlong Wang
|
43ab8cfaa5
|
[MM][Doc] Add documentation for configurable mm profiling (#26200)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
|
2025-10-08 23:21:20 -07:00 |
|
Harry Mellor
|
e09d1753ec
|
Remove Python 3.9 support ahead of PyTorch 2.9 in v0.11.1 (#26416)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-08 10:40:42 -07:00 |
|
Chendi.Xue
|
9fc983c707
|
[NIXL][non-cuda] Add install script for nixl with non-cuda ucx (#25959)
Signed-off-by: Chendi Xue <Chendi.Xue@intel.com>
|
2025-10-08 14:19:53 +00:00 |
|
Michael Goin
|
1b86bd8e18
|
Add more libraries to rlhf.md (#26374)
Signed-off-by: Michael Goin <mgoin64@gmail.com>
|
2025-10-07 20:59:41 +00:00 |
|
Paul Pak
|
320feae6f5
|
[Model] Lfm2Moe (#26344)
Signed-off-by: Paul Pak <paulpak58@gmail.com>
|
2025-10-07 16:03:05 +00:00 |
|
antrec
|
6f59beaf0b
|
[Model] Add support for ModernBertForTokenClassification (#26340)
Signed-off-by: Antoine Recanati Le Goat <antoine.recanati@sancare.fr>
Signed-off-by: antrec <antoine.recanati@gmail.com>
Co-authored-by: Antoine Recanati Le Goat <antoine.recanati@sancare.fr>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
|
2025-10-07 14:29:19 +00:00 |
|
fxmarty-amd
|
41f1cf38f2
|
[Feature][OCP MX] Support mxfp6 and mixed mxfp6-mxfp4 (#21166)
|
2025-10-07 09:35:26 -04:00 |
|
fhl2000
|
63773a6200
|
[Docs] add docs for cuda graph v1 (#24374)
Signed-off-by: fhl <2410591650@qq.com>
Signed-off-by: fhl2000 <63384265+fhl2000@users.noreply.github.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
|
2025-10-07 05:25:05 -07:00 |
|
Sergio Paniego Blanco
|
883b42896a
|
Add TRL example notebook to RLHF docs (#26346)
Signed-off-by: sergiopaniego <sergiopaniegoblanco@gmail.com>
|
2025-10-07 11:31:28 +00:00 |
|
Cyrus Leung
|
7e4cd070b0
|
[V0 Deprecation] Remove VLLM_USE_V1 from docs and scripts (#26336)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-07 16:46:44 +08:00 |
|
Sage Moore
|
c50901f3b9
|
[Docs][DBO] Add initial doc that describes the DBO implementation (#26024)
Signed-off-by: Sage Moore <sage@neuralmagic.com>
|
2025-10-07 00:47:28 +00:00 |
|
Varun Sundar Rabindranath
|
93540958b8
|
[Docs] Fix broken table in moe_kernel_features doc (#26314)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
|
2025-10-06 15:58:05 -04:00 |
|
Cyrus Leung
|
44b9af5bb2
|
[Benchmark] Enable MM Embedding benchmarks (#26310)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-06 19:51:58 +00:00 |
|
abhisheksheth28
|
77c95f72f7
|
[Doc] add KAITO to integrations (#25521)
Signed-off-by: "Abhishek Sheth" <absheth@microsoft.com>
|
2025-10-06 17:30:03 +08:00 |
|
Aritra Roy Gosthipaty
|
59f30d0448
|
[Docs] Edit HF Inference Endpoints documentation (#26275)
Signed-off-by: Aritra Roy Gosthipaty <aritra.born2fly@gmail.com>
Signed-off-by: ariG23498 <aritra.born2fly@gmail.com>
|
2025-10-06 10:13:09 +01:00 |
|
orangeng
|
59b477645c
|
[Doc] Edited minor typo (#26266)
Signed-off-by: Orange Ng <ngquanhao@outlook.com>
|
2025-10-05 19:53:09 -07:00 |
|
Elieser Pereira
|
f509a20846
|
[DOC] Update production-stack.md (#26177)
Signed-off-by: Elieser Pereira <elieser.pereiraa@gmail.com>
|
2025-10-05 21:32:48 +00:00 |
|
Harry Mellor
|
d6953beb91
|
Convert formatting to use ruff instead of yapf + isort (#26247)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-05 07:06:22 -07:00 |
|
Maximilien de Bayser
|
e0986ea07b
|
Add documentation for granite 4 tool calling (#26175)
Signed-off-by: Max de Bayser <mbayser@br.ibm.com>
|
2025-10-05 07:35:42 +00:00 |
|
Cyrus Leung
|
4570535ec4
|
[Model] CLIP Embedding Support (#26010)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-04 06:21:42 -07:00 |
|
Harry Mellor
|
d3d649efec
|
Support expert parallel in Transformers backend (#26162)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
|
2025-10-04 04:35:04 +00:00 |
|
Varun Sundar Rabindranath
|
7ef40bb983
|
[GPTOSS][DP/EP][Marlin] Enable GPTOSS DP/EP using Marlin kernels (#25488)
Signed-off-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: Varun Sundar Rabindranath <vsundarr@redhat.com>
Co-authored-by: mgoin <mgoin64@gmail.com>
|
2025-10-03 20:13:13 -04:00 |
|
Wenlong Wang
|
79aa244678
|
[Multi Modal] Configurable MM Profiling (#25631)
Signed-off-by: wwl2755 <wangwenlong2755@gmail.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-03 03:59:10 -07:00 |
|
Cyrus Leung
|
f9a8084e48
|
[Model] Use merge_by_field_config for MM models (InternVL family) (#26153)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
|
2025-10-03 01:59:06 -07:00 |
|
Harry Mellor
|
10d765482d
|
FusedMoE support for the Transformers backend (#22650)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
|
2025-10-02 23:12:15 -07:00 |
|