Tags: Pradyun92/vllm
[Compilation Bug] Fix Inductor Graph Output with Shape Issue (vllm-project#24772) Signed-off-by: yewentao256 <zhyanwentao@126.com>
[Bugfix] fixes the causal_conv1d_update kernel update non-speculative decoding cases (vllm-project#24680) Signed-off-by: Tao He <linzhu.ht@alibaba-inc.com> Co-authored-by: Cyrus Leung <tlleungac@connect.ust.hk>
this is only used to fix nightly wheel version, not a real release candidate
Do not use eval() to convert unknown types (vllm-project#23266) Signed-off-by: Russell Bryant <rbryant@redhat.com> Signed-off-by: simon-mo <simon.mo@hey.com>
Use Blackwell FlashInfer MXFP4 MoE by default if available (vllm-project#23008) Signed-off-by: mgoin <mgoin64@gmail.com>
fix: gptq marlin weight loading failure (vllm-project#23066)
Add think chunk (vllm-project#21333) Signed-off-by: Julien Denize <julien.denize@mistral.ai>