Tags: li2zhi/vllm

v0.15.2rc0

[Bugfix] Disable TRTLLM attention when KV transfer is enabled (vllm-project#33192)

Signed-off-by: Zhanqiu Hu <zh338@cornell.edu>

v0.15.1

[BugFix][Spec Decoding] Fix negative accepted tokens metric crash (vllm-project#33729)

Signed-off-by: Nick Hill <nickhill123@gmail.com>

v0.15.1rc1

[BugFix][Spec Decoding] Fix negative accepted tokens metric crash (vllm-project#33729)

Signed-off-by: Nick Hill <nickhill123@gmail.com>

v0.15.1rc0

[torch.compile] Don't do the fast moe cold start optimization if there is speculative decoding (vllm-project#33624)

Signed-off-by: Richard Zou <zou3519@gmail.com>
Co-authored-by: Robert Shaw <114415538+robertgshaw2-redhat@users.noreply.github.com>
(cherry picked from commit 5eac9a1)

v0.16.0rc0

[Docs] Adding links and intro to Speculators and LLM Compressor (vllm-project#32849)

Signed-off-by: Aidan Reilly <aireilly@redhat.com>
Signed-off-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>
Co-authored-by: Harry Mellor <19981378+hmellor@users.noreply.github.com>

v0.15.0

[Release] [CI] Optim release pipeline (vllm-project#33156)

Signed-off-by: tjtanaa <tunjian.tan@embeddedllm.com>
(cherry picked from commit f9d0359)

v0.15.0rc3

Revert "Enable Cross layers KV cache layout at NIXL Connector (vllm-project#30207)" (vllm-project#33241)

Signed-off-by: Or Ozeri <oro@il.ibm.com>
Co-authored-by: Kevin H. Luu <khluu000@gmail.com>
(cherry picked from commit 2e8de86)

v0.15.0rc2

Relax protobuf library version constraints (vllm-project#33202)

Signed-off-by: Jeffrey Wang <jeffreywang@anyscale.com>
(cherry picked from commit a97b5e2)

v0.15.0rc1

[AMD][Kernel][BugFix] Use correct scale in concat_and_cache_ds_mla_kernel when on gfx942 (vllm-project#32976)

Signed-off-by: Randall Smith <ransmith@amd.com>
Signed-off-by: Randall Smith <Randall.Smith@amd.com>
Co-authored-by: Randall Smith <ransmith@amd.com>

v0.15.0rc0

[Bugfix] Fix Dtypes for Pynccl Wrapper (vllm-project#33030)

Signed-off-by: Robert Shaw <robshaw@redhat.com>
Co-authored-by: Robert Shaw <robshaw@redhat.com>
(cherry picked from commit 43a013c)