- https://rogerw.io
- in/rogerywang
- @rogerw0108
Stars
- A high-throughput and memory-efficient inference and serving engine for LLMs
- EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
- A framework for efficient model inference with omni-modality models
- Entropy Based Sampling and Parallel CoT Decoding
- VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
- PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI
- Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
- Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference
- A PyTorch native library for training speculative decoding models