View WoosukKwon's full-sized avatar

Machine Learning Engineering Open Book

Python 17,827 1,132 Updated Mar 16, 2026

FlashSampling: Fast and Memory-Efficient Exact Sampling (https://huggingface.co/papers/2603.15854)

Python 69 6 Updated Apr 25, 2026

Early-stage Rust drop-in alternative frontend for vLLM

Rust 26 1 Updated Apr 29, 2026

Tensors and Dynamic neural networks in Python with strong GPU acceleration

Python 99,529 27,619 Updated Apr 29, 2026

NVIDIA Inference Xfer Library (NIXL)

C++ 1,010 307 Updated Apr 29, 2026

wentao.site / Hugo Template / A template repository for Hugo based blog

55 3 Updated Mar 21, 2026

A PyTorch native platform for training generative AI models

Python 5,281 801 Updated Apr 29, 2026

A framework for efficient model inference with omni-modality models

Python 4,556 855 Updated Apr 29, 2026

Easy, Fast, and Scalable Multimodal AI

Python 124 9 Updated Apr 17, 2026

verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework

Python 21,012 3,767 Updated Apr 29, 2026

TPU inference for vLLM, with unified JAX and PyTorch support.

Python 306 172 Updated Apr 29, 2026

SkyRL: A Modular Full-stack RL Library for LLMs

Python 1,795 311 Updated Apr 29, 2026

Post-training with Tinker

Python 3,185 403 Updated Apr 29, 2026

[NeurIPS 2025] Scaling Speculative Decoding with Lookahead Reasoning

Python 68 7 Updated Oct 31, 2025

A high-performance and light-weight router for vLLM large scale deployment

Rust 212 73 Updated Apr 29, 2026

Common recipes to run vLLM

JavaScript 763 246 Updated Apr 29, 2026

Open-source implementation of AlphaEvolve

Python 6,110 978 Updated Mar 18, 2026

Nano vLLM

Python 13,181 2,017 Updated Apr 26, 2026

Achieve state of the art inference performance with modern accelerators on Kubernetes

Shell 3,100 442 Updated Apr 29, 2026

A Datacenter Scale Distributed Inference Serving Framework

Rust 6,697 1,072 Updated Apr 29, 2026

ArcticInference: vLLM plugin for high-throughput, low-latency inference

Python 426 60 Updated Apr 23, 2026

MiMo: Unlocking the Reasoning Potential of Language Model – From Pretraining to Posttraining

Python 2,082 88 Updated Jun 5, 2025

Democratizing Reinforcement Learning for LLMs

Python 5,460 547 Updated Apr 28, 2026

[ACL 2025 Long Main] Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions

Python 43 7 Updated Apr 21, 2025

NumPy aware dynamic Python compiler using LLVM

Python 10,998 1,256 Updated Apr 28, 2026

[NeurIPS 2025] Simple extension on vLLM to help you speed up reasoning model without training.

Python 228 31 Updated May 31, 2025

A collection of GPT system prompts and various prompt injection/leaking knowledge.

HTML 10,543 1,468 Updated Apr 23, 2026

FAIR Sequence Modeling Toolkit 2

Python 1,128 140 Updated Apr 27, 2026