Tags: cchuter/ds4

mgpu-v0.1.0

mgpu v0.1.0

Multi-GPU performance branch: prefill tensorcore, decode split-KV/flash-decode
attention, routed-MoE gate/up decode launch geometry, q8->f16 cache reserve.