- 16:30 (UTC -07:00)
- https://rogerw.io
- in/rogerywang
- @rogerw0108
Stars
A high-throughput and memory-efficient inference and serving engine for LLMs
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
FlashMLA: Efficient Multi-head Latent Attention Kernels
CUDA Templates and Python DSLs for High-Performance Linear Algebra
A Datacenter Scale Distributed Inference Serving Framework
EasyR1: An Efficient, Scalable, Multi-Modality RL Training Framework based on veRL
A framework for efficient model inference with omni-modality models
Entropy Based Sampling and Parallel CoT Decoding
How to optimize some algorithms in CUDA.
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
PyTorch implementation of Transfusion, "Predict the Next Token and Diffuse Images with One Multi-Modal Model", from Meta AI
Train speculative decoding models effortlessly and port them smoothly to SGLang serving.
Discrete Diffusion Forcing (D2F): dLLMs Can Do Faster-Than-AR Inference