Stars
Train the smallest LM you can that fits in 16MB. Best model wins!
A lightweight inference engine supporting speculative speculative decoding (SSD).
Course on Flash-attention in Triton
Lists of company-wise questions. Every CSV file in the companies directory corresponds to a list of LeetCode questions for a specific company, based on the LeetCode company tags. Updated as of 20…
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
Student version of Assignment 1 for Stanford CS336 - Language Modeling From Scratch
Official inference repo for FLUX.2 models
Claude Code skill that removes signs of AI-generated writing from text
A curriculum for learning about GPU performance engineering, from scratch to what the frontier AI labs do
Shortest solutions for CS231n 2021-2026
~950-line, minimal, extensible LLM inference engine built from scratch.
Simple OSINT script to find Instagram profiles by name and e-mail/phone
Implementing scalable LLMs in pure JAX (no third-party libraries)
Based on Nano-vLLM, a simple replication of vLLM with self-contained paged attention and flash attention implementations
Train your own speech AI model from scratch
Official code for "Why Knowledge Distillation Works in Generative Models: A Minimal Working Explanation" (NeurIPS 2025)
Agentic AI Infrastructure for magnifying HUMAN capabilities.
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
Examples demonstrating available options to program multiple GPUs in a single node or a cluster