Stars
A compact implementation of SGLang, designed to demystify the complexities of modern LLM serving systems.
Provider-neutral Agent Skill for Codex, Claude Code, and agentic harness design.
Build your own high-performance LLM inference engine in C++ and CUDA - a smaller version of vLLM
This project aims to replicate mainstream open-source model architectures with limited computational resources, implementing mini models with 100-200M parameters.
Implement a ChatGPT-like LLM in PyTorch from scratch, step by step
Graphs that teach > graphs that impress. Turn any code into an interactive knowledge graph you can explore, search, and ask questions about. Works with Claude Code, Codex, Cursor, Copilot, Gemini C…
A GPT-2 inference engine written from scratch in CUDA and C++. Implements custom CUDA kernels for tiled matrix multiplication, LayerNorm, fused attention, transformer blocks, KV cache management, a…
Generate beautiful dark-themed system architecture diagrams as standalone HTML/SVG files. Works as a Claude AI skill.
🚀 An open-source, hands-on curriculum bridging the gap from basic RL concepts to LLM alignment, RLVR, and advanced agentic systems.
AI-powered job search system built on Claude Code. 14 skill modes, Go dashboard, PDF generation, batch processing.
Sutskever 30 implementations inspired by https://papercode.vercel.app/ | For Agents, use https://github.com/pageman/Sutskever-Agent | Polyglot / Multi-Backend version at https://github.com/pageman/s…