- University of Virginia
- Charlottesville, VA
- http://tddg.github.io
- https://ds2-lab.github.io/
- @yuecheng87
- in/yue-cheng
Starred repositories
Memory Sparse Attention - an end-to-end trainable memory framework for 100M-token (hundred-million-scale) contexts
AI agents that automatically run research on single-GPU nanochat training
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Unsloth Studio is a web UI for training and running open models like Qwen, DeepSeek, gpt-oss and Gemma locally.
ZipLLM: An efficient, lossless data reduction pipeline for large-scale LLM storage (NSDI'26)
A Claude Code plugin that automatically captures everything Claude does during your coding sessions, compresses it with AI (using Claude's agent-sdk), and injects relevant context back into future …
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Create Epic Math and Physics Animations & Study Notes From Text and Images.
💫 Toolkit to help you get started with Spec-Driven Development
Open-source implementation of AlphaEvolve
Mooncake is the serving platform for Kimi, a leading LLM service provided by Moonshot AI.
A Comprehensive Benchmark to Evaluate LLMs as Agents (ICLR'24)
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language model for tool learning.
What if we could pack single purpose, powerful AI Agents into a single python file?
DeepSpeed is a deep learning optimization library that makes distributed training and inference easy, efficient, and effective.
Inspect a command's effects before modifying your live system
21 Lessons, Get Started Building with Generative AI
A tiny yet powerful LLM inference system tailored for researching purpose. vLLM-equivalent performance with only 2k lines of code (2% of vLLM).
Scalene: a high-performance, high-precision CPU, GPU, and memory profiler for Python with AI-powered optimization proposals
A command-line productivity tool powered by AI large language models like GPT-5 that helps you accomplish your tasks faster and more efficiently.
Envision a future where everyone can read all the code of an educational operating system.
Sparsity-aware deep learning inference runtime for CPUs
LLM Serving Performance Evaluation Harness