Pull requests: vllm-project/vllm
Pull requests list
[Bugfix] Fix gridDim.y overflow for large row counts
bug
Something isn't working
#45255
opened Jun 11, 2026 by
JasonLi314
3 tasks done
[MM][CG] Support ViT full CUDA graph for Ernie-4.5-VL image inference
documentation
Improvements or additions to documentation
multi-modality
Related to multi-modality (#4194)
nvidia
#45254
opened Jun 11, 2026 by
qyYue1389
Contributor
docs: add fix disclosure policy to SECURITY.md
documentation
Improvements or additions to documentation
#45253
opened Jun 11, 2026 by
jperezdealgaba
Contributor
[Security] Fix DoS via prompt_embeds on M-RoPE models
v1
#45252
opened Jun 11, 2026 by
jperezdealgaba
Contributor
[Bugfix] Restrict FlashInfer cuDNN FP8 ViT attention gate to Blackwell (SM 100)
bug
Something isn't working
nvidia
#45251
opened Jun 11, 2026 by
wentian-byte
1 of 3 tasks
[Bugfix][Frontend] Keep a reference to the background abort task in disagg api_router
bug
Something isn't working
frontend
#45249
opened Jun 11, 2026 by
kratos0718
[Bugfix] Fix CPU memory leak in worker_busy_loop RPC deserialization path
bug
Something isn't working
v1
#45248
opened Jun 11, 2026 by
hiepnnguyentcu
Draft
3 of 4 tasks
[Tests][Reasoning] Add test coverage for Step3ReasoningParser
#45247
opened Jun 11, 2026 by
z-priyanshu
[Bugfix] Pre-compile _zero_kv_blocks_kernel and _compute_slot_mapping_kernel during warmup
bug
Something isn't working
#45245
opened Jun 11, 2026 by
z-priyanshu
2 of 3 tasks
minicpmv4_6: fix ImageSize (W,H) order for placeholder token calculation
#45244
opened Jun 11, 2026 by
tc-mb
Contributor
[RISC-V] Enable BF16 on VLEN=256 hardware
ci/build
cpu
Related to CPU backends
#45243
opened Jun 11, 2026 by
velonica0
Contributor
[Frontend] Add site-packages support for reasoning/tool parser plugins
tool-calling
#45241
opened Jun 11, 2026 by
odashi
2 of 4 tasks
[XPU][DeepSeek-V4] Fix MTP: sync with upstream fixes #44821 and #43746
deepseek
Related to DeepSeek models
intel-gpu
Related to Intel GPU
#45240
opened Jun 11, 2026 by
majian4work
Contributor
fix(ep): use floor/ceil for n_local_physical_experts bookkeeping in DeepSeek-V2 and Qwen3-MoE
ci/build
cpu
Related to CPU backends
deepseek
Related to DeepSeek models
documentation
Improvements or additions to documentation
frontend
needs-rebase
nvidia
qwen
Related to Qwen models
rocm
Related to AMD ROCm
v1
[Core] Avoid mixed batch on D-node using spec dec by isolating prefill-tail request
v1
#45237
opened Jun 11, 2026 by
qianlihuang
Contributor
Draft
4 tasks
[ROCm] Enable ROCm Attention Sinks and Connector-Friendly KV Layouts
documentation
Improvements or additions to documentation
kv-connector
rocm
Related to AMD ROCm
v1
#45234
opened Jun 11, 2026 by
AndreasKaratzas
Member
Draft
Bump the minor-update group across 1 directory with 150 updates
ci/build
dependencies
Pull requests that update a dependency file
nvidia
rocm
Related to AMD ROCm
#45233
opened Jun 11, 2026 by
dependabot
Bot
[FlexAttention] make custom mask mods fully cudagraphable
nvidia
v1
#45232
opened Jun 11, 2026 by
liangel-02
Contributor
Draft
[Bugfix][KV-transfer] MoRIIO: READ-mode stability fixes (completion IDs, DP routing, drain, keepalive)
bug
Something isn't working
documentation
Improvements or additions to documentation
kv-connector
v1
#45230
opened Jun 11, 2026 by
chaeminlim-mb
Draft
3 of 4 tasks
[V1][Spec Decode] Relaxed acceptance for thinking-phase tokens (port of #22238)
v1
#45229
opened Jun 11, 2026 by
chaeminlim-mb
Draft
3 of 4 tasks
[Core][KV-transfer] MoRIIO: multi-node TP prefill→decode dispatch via published host list
documentation
Improvements or additions to documentation
kv-connector
#45228
opened Jun 11, 2026 by
chaeminlim-mb
Draft
3 of 4 tasks
[Bugfix][ROCm] MLA MTP decode: size verification metadata for real qlen/dtype, gate persistent path, fix CUDA-graph padding
bug
Something isn't working
nvidia
rocm
Related to AMD ROCm
v1
#45227
opened Jun 11, 2026 by
chaeminlim-mb
Draft
5 tasks done
[ROCm][MoE] Route batched expert layout through flat-reshape wrapper for AITER FP8
rocm
Related to AMD ROCm
#45226
opened Jun 11, 2026 by
chaeminlim-mb
Draft
6 tasks done
[Bugfix][ROCm][MoE] MoRI: pass num_qp_per_pe/quant_type explicitly; preserve router top-k for finalize
bug
Something isn't working
rocm
Related to AMD ROCm
#45225
opened Jun 11, 2026 by
chaeminlim-mb
Draft
4 tasks done