Pull requests: radixark/miles

Pull requests list

[fix] stop merging agentic turns at first non-COMPLETED turn
#1323 opened Jun 12, 2026 by Shi-Dong (Contributor)
feat: add FlashQLA backend for Qwen GDN linear-attention layers
#1318 opened Jun 11, 2026 by Zhichenzzz (Contributor)
fix: load Qwen 3.5 checkpoint with unfused experts
#1317 opened Jun 10, 2026 by lawrence-harmonic (Contributor)
[doc, CI] doc driven CI
#1312 opened Jun 9, 2026 by guapisolo (Collaborator)
fix(qwen3-vl): per-segment mRoPE + vision under CP + THD packing
#1308 opened Jun 8, 2026 by Zhichenzzz (Contributor)
fix(mtp): track megatron mtp_model_layer rename in raw converters
#1307 opened Jun 8, 2026 by Zhichenzzz (Contributor)
DO NOT MERGE: CI test (label: run-ci-model-scripts, Run model script smoke tests)
#1306 opened Jun 8, 2026 by yueming-yuan (Collaborator)
[NPU] Feature add npu docker
#1305 opened Jun 8, 2026 by codemayq
Inject rank and millisecond timestamp into Ray train actor log lines
#1303 opened Jun 7, 2026 by fzyzcjy (Collaborator)
[feat] balance data by FLOPs (label: run-ci-megatron)
#1302 opened Jun 6, 2026 by yueming-yuan (Collaborator)
Add AMD support for DeepSeek V4
#1300 opened Jun 5, 2026 by Xinyu-Kang
ci: make manual Docker overlay builds configurable
#1299 opened Jun 5, 2026 by guapisolo (Collaborator)
Docker ci refactor
#1297 opened Jun 4, 2026 by Zhichenzzz (Contributor, Draft)
[AMD] Merge MI300/MI350-5 Dockerfiles
#1294 opened Jun 4, 2026 by JessicaJiang-123 (Contributor)