Tags: qmx/vllm
[Test] Add tests for n parameter in chat completions API (vllm-project#35283) Signed-off-by: KrxGu <krishom70@gmail.com>
[Bugfix] Fix MTP accuracy for GLM-5 (vllm-project#34385) Signed-off-by: mgoin <mgoin64@gmail.com> (cherry picked from commit ec12d39)
Patch protobuf for CVE-2026-0994 (vllm-project#34253) Signed-off-by: Seiji Eicher <seiji@anyscale.com> Co-authored-by: Kevin H. Luu <khluu000@gmail.com> (cherry picked from commit 5045d5c)
[Frontend][last/5] Make pooling entrypoints request schema consensus. (vllm-project#31127) Signed-off-by: wang.yuqi <yuqi.wang@daocloud.io>
[Bugfix] Disable TRTLLM attention when KV transfer is enabled (vllm-project#33192) Signed-off-by: Zhanqiu Hu <zh338@cornell.edu>
[BugFix][Spec Decoding] Fix negative accepted tokens metric crash (vllm-project#33729) Signed-off-by: Nick Hill <nickhill123@gmail.com>
[torch.compile] Don't do the fast moe cold start optimization if there is speculative decoding (vllm-project#33624) Signed-off-by: Richard Zou <zou3519@gmail.com> Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com> (cherry picked from commit 5eac9a1)
[Docs] Adding links and intro to Speculators and LLM Compressor (vllm-project#32849) Signed-off-by: Aidan Reilly <aireilly@redhat.com> Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com> Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>