- The Chinese University of Hong Kong
- Hong Kong SAR
- UTC +08:00
- https://txxx926.github.io/
Stars
OpenClaw-RL: Train any agent simply by talking
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
slime is an LLM post-training framework for RL Scaling.
Bash is all you need - A nano claude code–like 「agent harness」, built from 0 to 1
DFlash: Block Diffusion for Flash Speculative Decoding
Official implementation of "FOCUS: DLLMs Know How to Tame Their Compute Bound".
Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond
cuTile is a programming model for writing parallel kernels for NVIDIA GPUs
torchcomms: a modern PyTorch communications API
An Easy-to-use, Scalable and High-performance Agentic RL Framework based on Ray (PPO & DAPO & REINFORCE++ & VLM & TIS & vLLM & Ray & Async RL)
Allow torch tensor memory to be released and resumed later
Official implementation for DenseMixer: Improving MoE Post-Training with Precise Router Gradient
📚 Modern C++ Tutorial: C++11/14/17/20 On the Fly | https://changkun.de/modern-cpp/
Distributed Compiler based on Triton for Parallel Systems
A PyTorch native platform for training generative AI models
verl: Volcano Engine Reinforcement Learning for LLMs
CUDA Python: Performance meets Productivity
[ASPLOS'25] Towards End-to-End Optimization of LLM-based Applications with Ayo
[SenSys'24] PieBridge: Fast and Parameter-Efficient On-Device Training via Proxy Networks
DeepEP: an efficient expert-parallel communication library
Official Repo for Open-Reasoner-Zero