Pull requests: alibaba/rtp-llm

fix: modify fp16 layernorm in PreLayers function
#493 opened Dec 24, 2025 by junna2016
fix - support get dtype from attn in python
#492 opened Dec 24, 2025 by Nancheng-11
Feature/weight mask
#491 opened Dec 24, 2025 by zhanghuanzj
feat: support trt paged attn
#490 opened Dec 24, 2025 by Vinkle-hzt
feat - remove reshape for deepgemm weight
#488 opened Dec 24, 2025 by alibaba-miji
hotfix: support multi merge copy for rocm mtp
#486 opened Dec 24, 2025 by liaocz
feat: adapt glm-4.7 reasoning parsing
#485 opened Dec 24, 2025 by soaringk
[WIP] Features/refactor endpoint
#483 opened Dec 23, 2025 by wanglining97
Develop/backend refactor rebase
#480 opened Dec 22, 2025 by wanglining97
feat: release cache store when quit
#479 opened Dec 22, 2025 by zhangchicc
[ROCm] feat: support custom all gather
#475 opened Dec 22, 2025 by yyccli
Feature/clean sample
#474 opened Dec 21, 2025 by LLLLKKKK
feat: support python-xqa with CUDA 12.9
#473 opened Dec 19, 2025 by qqbbiu
Feature/support qwen next
#472 opened Dec 19, 2025 by alibaba-miji
support fp4 moe for decode
#470 opened Dec 18, 2025 by LingYeAI
feat - add trt llm gen attention
#469 opened Dec 18, 2025 by zerozw
fix: partial cpp smoke tp py model
#467 opened Dec 17, 2025 by JackTan25
[draft] Features/refactor frontend
#452 opened Dec 15, 2025 by wanglining97
Feat/refactor cuda graph ut
#450 opened Dec 15, 2025 by JackTan25
Feature/flashinfer python merge
#448 opened Dec 12, 2025 by zerozw