lkevinzc

🎯

Learning

zclzc lkevinzc

@google-deepmind

166 followers · 163 following

Pinned

sail-sg/oat Public

🌾 OAT: A research-friendly framework for LLM online alignment, including reinforcement learning, preference learning, etc.

Python · 661 stars · 63 forks

axon-rl/gem Public

A Gym for Agentic LLMs

Python · 494 stars · 33 forks

sail-sg/understand-r1-zero Public

Understanding R1-Zero-Like Training: A Critical Perspective

Python · 1.3k stars · 59 forks

mosecorg/mosec Public

A high-performance ML model serving framework that offers dynamic batching and CPU/GPU pipelines to fully exploit your compute machine

Python · 901 stars · 73 forks

spiral-rl/spiral Public

SPIRAL: Self-Play on Zero-Sum Games Incentivizes Reasoning via Multi-Agent Multi-Turn Reinforcement Learning

Python · 195 stars · 22 forks

sail-sg/Precision-RL Public

Defeating the Training-Inference Mismatch via FP16

Python · 195 stars · 17 forks