180 Commits

Author SHA1 Message Date
Noa Neria
6366c098d7 Validating Runai Model Streamer Integration with S3 Object Storage (#29320)
Signed-off-by: Noa Neria <noa@run.ai>
2025-12-04 18:04:43 +08:00
Shengqi Chen
1109f98288 [CI] fix docker image build by specifying merge-base commit id when downloading pre-compiled wheels (#29930)
Signed-off-by: Shengqi Chen <harry-chen@outlook.com>
2025-12-03 14:08:19 -08:00
Amr Mahdi
f5d3d93c40 [docker] Build CUDA kernels in separate Docker stage for faster rebuilds (#29452)
Signed-off-by: Amr Mahdi <amrmahdi@meta.com>
2025-12-03 11:41:53 +00:00
Andreas Karatzas
506ed87e87 [ROCm][CI][Bugfix] Disable Flash/MemEfficient SDP on ROCm to avoid HF Transformers accuracy issues (#29909)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-12-03 10:36:49 +08:00
Benjamin Bartels
2d613de9ae [CI/Build] Fixes missing runtime dependencies (#29822)
Signed-off-by: bbartels <benjamin@bartels.dev>
2025-12-02 10:21:49 -08:00
Andreas Karatzas
ea3370b428 [ROCm][Bugfix] Patch for the Multi-Modal Processor Test group (#29702)
Signed-off-by: Andreas Karatzas <akaratza@amd.com>
2025-11-29 01:31:44 +00:00
Li, Jiang
e2f56c309d [CPU] Update torch 2.9.1 for CPU backend (#29664)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-11-28 13:37:54 +00:00
Cyrus Leung
b34e8775a3 Revert "[CPU]Update CPU PyTorch to 2.9.0 (#29589)" (#29647)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-27 22:43:18 -08:00
scydas
35657bcd7a [CPU]Update CPU PyTorch to 2.9.0 (#29589)
Signed-off-by: scyda <scyda@outlook.com>
Co-authored-by: Li, Jiang <jiang1.li@intel.com>
2025-11-28 09:34:33 +08:00
Andrii Skliar
a5345bf49d [BugFix] Fix plan API Mismatch when using latest FlashInfer (#29426)
Signed-off-by: Andrii Skliar <askliar@askliar-mlt.client.nvidia.com>
Co-authored-by: Andrii Skliar <askliar@askliar-mlt.client.nvidia.com>
2025-11-27 11:34:59 -08:00
Alec
c4c0354eec [CI/Build] allow user modify pplx and deepep ref by ENV or command line (#29131)
Signed-off-by: alec-flowers <aflowers@nvidia.com>
2025-11-26 17:41:16 +00:00
汪志鹏
7012d8b45e [Docker] Optimize Dockerfile: consolidate apt-get and reduce image size by ~200MB (#29060)
Signed-off-by: princepride <wangzhipeng628@gmail.com>
2025-11-24 19:54:00 -07:00
Kunshang Ji
b8328b49fb [XPU] upgrade torch & ipex 2.9 on XPU platform (#29307)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-11-25 09:34:47 +08:00
Benjamin Bartels
4d6afcaddc [CI/Build] Moves to cuda-base runtime image while retaining minimal JIT dependencies (#29270)
Signed-off-by: bbartels <benjamin@bartels.dev>
Signed-off-by: Benjamin Bartels <benjamin@bartels.dev>
2025-11-24 11:40:54 -08:00
Roger Wang
0ff70821c9 [Core] Deprecate xformers (#29262)
Signed-off-by: Roger Wang <hey@rogerw.io>
2025-11-24 04:18:55 +00:00
Benjamin Bartels
eb5352a770 [CI/build] Removes source compilation from runtime image (#26966)
Signed-off-by: bbartels <benjamin@bartels.dev>
2025-11-22 10:23:09 -08:00
Charlie Fu
75648b16dd [ROCm][CI] Fix config/test_config_generation.py (#29142)
Signed-off-by: charlifu <charlifu@amd.com>
2025-11-21 17:12:16 +00:00
Cyrus Leung
9452863088 Revert "Revert #28875 (#29159)" (#29179)
Signed-off-by: DarkLight1337 <tlleungac@connect.ust.hk>
2025-11-21 04:27:43 -08:00
Bhagyashri
2b1b3dfa4b Update Dockerfile to use gcc-toolset-14 and fix test case failures on power (ppc64le) (#28957)
Signed-off-by: Bhagyashri <Bhagyashri.Gaikwad2@ibm.com>
2025-11-21 12:24:09 +00:00
Cyrus Leung
4d7231e774 Revert #28875 (#29159)
2025-11-21 01:40:17 -08:00
Qidong Su
698024ecce [Doc] update installation guide regarding aarch64+cuda pytorch build (#28875)
Signed-off-by: Qidong Su <soodoshll@gmail.com>
Co-authored-by: Cyrus Leung <cyrus.tl.leung@gmail.com>
2025-11-20 19:40:25 -08:00
Fadi Arafeh
3168285fca [cpu][ci] Add initial set of tests for Arm CPUs (#28657)
Signed-off-by: Fadi Arafeh <fadi.arafeh@arm.com>
2025-11-20 02:37:09 +00:00
liuzhenwei
d64429bb36 [NIXL][XPU] update install script of NIXL (#28778)
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
2025-11-17 03:01:33 +00:00
Gregory Shtrasberg
75f01b9d3c [ROCm][CI/Build] Upgrade to ROCm 7.1 and AITER main (#28753)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-11-14 15:53:21 -08:00
Gregory Shtrasberg
5a84b76b86 [ROCm][CI/Build] Change install location of uv (#28741)
Signed-off-by: Gregory Shtrasberg <Gregory.Shtrasberg@amd.com>
2025-11-14 21:34:18 +00:00
amdfaa
a7791eac9d [CI/Build] Install uv for AMD MI300: Language Models Tests (Hybrid) %N (#28142)
Signed-off-by: amdfaa <107946068+amdfaa@users.noreply.github.com>
Signed-off-by: zhewenli <zhewenli@meta.com>
Co-authored-by: zhewenli <zhewenli@meta.com>
2025-11-13 14:34:55 +00:00
Li, Jiang
7f829be7d3 [CPU] Refactor CPU attention backend (#27954)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
2025-11-12 09:43:06 +08:00
Harry Mellor
811df41ee9 Update Flashinfer from v0.4.1 to v0.5.2 (#27952)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-11-07 16:24:42 -08:00
R3hankhan
e04492449e [Hardware][IBM Z] Optimize s390x Dockerfile (#28023)
Signed-off-by: Rehan Khan <Rehan.Khan7@ibm.com>
2025-11-05 11:25:44 -08:00
Zhewen Li
2f84ae1f27 [CI/Build] Update LM Eval Version in AMD CI (#27944)
Signed-off-by: zhewenli <zhewenli@meta.com>
2025-11-04 06:36:40 +00:00
Kunshang Ji
7f4bdadb92 [XPU]Refine Dockerfile.xpu, avoid oneccl dependency issue (#27964)
Signed-off-by: Kunshang Ji <kunshang.ji@intel.com>
2025-11-03 07:36:59 +00:00
Huy Do
ba33e8830d Reapply "Install pre-built xformers-0.0.32.post2 built with pt-2.9.0" (#27768)
Signed-off-by: Huy Do <huydhn@gmail.com>
2025-10-30 10:22:30 -07:00
Benjamin Bartels
17d055f527 [Feat] Adds runai distributed streamer (#27230)
Signed-off-by: bbartels <benjamin@bartels.dev>
Signed-off-by: Benjamin Bartels <benjamin@bartels.dev>
Co-authored-by: omer-dayan <omdayan@nvidia.com>
Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
2025-10-29 21:09:10 -07:00
Simon Mo
9007bf57e6 Revert "Install pre-built xformers-0.0.32.post2 built with pt-2.9.0" (#27714)
2025-10-28 20:58:01 -07:00
Huy Do
f257544709 Install pre-built xformers-0.0.32.post2 built with pt-2.9.0 (#27598)
Signed-off-by: Huy Do <huydhn@gmail.com>
Co-authored-by: Roger Wang <hey@rogerw.io>
2025-10-28 19:39:15 -07:00
Li, Jiang
d34f5fe939 [Bugfix][CPU] Fallback oneDNN linear to torch linear to fix half gemm support on legecy platforms (#27526)
Signed-off-by: jiang1.li <jiang1.li@intel.com>
Co-authored-by: Isotr0py <mozf@mail2.sysu.edu.cn>
2025-10-27 23:25:44 -07:00
Micah Williamson
921e78f4bb [ROCm] Update AITER branch for ROCm base docker (#27586)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2025-10-27 17:22:33 +00:00
ioana ghiban
435be10db9 Fix AArch64 CPU Docker pipeline (#27331)
Signed-off-by: Ioana Ghiban <ioana.ghiban@arm.com>
2025-10-24 05:11:01 -07:00
Huy Do
ed540d6d4c Update release pipeline for PyTorch 2.9.0 (#27303)
Signed-off-by: Huy Do <huydhn@gmail.com>
2025-10-22 09:18:01 +00:00
Huy Do
becb7de40b Update PyTorch to 2.9.0+cu129 (#24994)
Co-authored-by: Luka Govedič <ProExpertProg@users.noreply.github.com>
2025-10-21 17:20:18 -04:00
Micah Williamson
aa1356ec53 [ROCm] Update Triton, Torch, and AITER branches for ROCm base Dockerfile (#27206)
Signed-off-by: Micah Williamson <micah.williamson@amd.com>
2025-10-21 12:01:23 -04:00
Harry Mellor
bd66b8529b [CI] Install pre-release version of apache-tvm-ffi for flashinfer (#27262)
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
2025-10-21 14:23:56 +00:00
ioana ghiban
1c691f4a71 AArch64 CPU Docker pipeline (#26931)
2025-10-20 07:09:40 -04:00
jiahanc
41d3071918 [NVIDIA] [Perf] Update to leverage flashinfer trtllm FP4 MOE throughput kernel (#26714)
Signed-off-by: jiahanc <173873397+jiahanc@users.noreply.github.com>
Co-authored-by: Michael Goin <mgoin64@gmail.com>
2025-10-16 16:20:25 -07:00
Zhewen Li
44c8555621 [CI/Build] Fix AMD import failures in CI (#26841)
Signed-off-by: zhewenli <zhewenli@meta.com>
2025-10-16 07:28:20 +00:00
Michael Goin
04b5f9802d [CI] Raise VLLM_MAX_SIZE_MB to 500 due to failing Build wheel - CUDA 12.9 (#26722)
Signed-off-by: mgoin <mgoin64@gmail.com>
2025-10-14 10:52:05 -07:00
liuzhenwei
27ed39a347 [XPU] Upgrade NIXL to remove CUDA dependency (#26570)
Signed-off-by: zhenwei-intel <zhenwei.liu@intel.com>
2025-10-11 05:15:23 +00:00
Nishidha Panpaliya
8f8474fbe3 [CI/Build] Fix ppc64le CPU build and tests (#22443)
Signed-off-by: Nishidha Panpaliya <nishidha.panpaliya@partner.ibm.com>
2025-10-11 13:04:42 +08:00
Michael Goin
c9d33c60dc [UX] Add FlashInfer as default CUDA dependency (#26443)
Signed-off-by: mgoin <mgoin64@gmail.com>
Co-authored-by: Wentao Ye <44945378+yewentao256@users.noreply.github.com>
2025-10-09 14:10:02 -07:00
elvischenv
5e49c3e777 Bump Flashinfer to v0.4.0 (#26326)
Signed-off-by: elvischenv <219235043+elvischenv@users.noreply.github.com>
2025-10-08 23:58:44 -07:00