View ywang96's full-sized avatar

Organizations

@vllm-project

12 stars written in Python

A high-throughput and memory-efficient inference and serving engine for LLMs

Python | 75,900 stars | 15,373 forks | Updated Apr 9, 2026

EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL

Python | 4,829 stars | 367 forks | Updated Apr 6, 2026

A fast multimodal LLM for real-time voice

Python | 4,396 stars | 370 forks | Updated Dec 12, 2025

A framework for efficient model inference with omni-modality models

Python | 4,215 stars | 724 forks | Updated Apr 9, 2026

Entropy Based Sampling and Parallel CoT Decoding

Python | 3,431 stars | 321 forks | Updated Nov 13, 2024

VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo

Python | 1,807 stars | 174 forks | Updated Apr 9, 2026

Pytorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from MetaAI

Python | 1,333 stars | 71 forks | Updated Jan 27, 2026

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

Python | 767 stars | 200 forks | Updated Apr 2, 2026

Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference

Python | 247 stars | 17 forks | Updated Feb 3, 2026

A PyTorch native library for training speculative decoding models

Python | 76 stars | 12 forks | Updated Apr 9, 2026

Manages vllm-nccl dependency

Python | 17 stars | 3 forks | Updated Jun 3, 2024

Cross-platform transformer training

Python | 3 stars | 1 fork | Updated Nov 14, 2025