Stars
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
A free and strong UCI chess engine
Tilus is a tile-level kernel programming language with explicit control over shared memory and registers.
CUDA Templates and Python DSLs for High-Performance Linear Algebra
Official Implementation of EAGLE-1 (ICML'24), EAGLE-2 (EMNLP'24), and EAGLE-3 (NeurIPS'25).
Production-tested AI infrastructure tools for efficient AGI development and community-driven innovation
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
A book for Learning the Foundations of LLMs
An extremely fast Python linter and code formatter, written in Rust.
Your AI second brain. Self-hostable. Get answers from the web or your docs. Build custom agents, schedule automations, do deep research. Turn any online or local LLM into your personal, autonomous …
KECC: KAIST Educational C Compiler. IMPORTANT: DON'T FORK!
An easy-to-understand TensorOp Matmul tutorial
How to optimize some algorithms in CUDA.
A curated list for Efficient Large Language Models
Awesome-LLM: a curated list of Large Language Models
A high-throughput and memory-efficient inference and serving engine for LLMs
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
GPU programming related news and material links
High-speed GEMV kernels, up to 2.7x speedup compared to the PyTorch baseline.
[EuroSys'24] Minuet: Accelerating 3D Sparse Convolutions on GPUs
程序员延寿指南 | A programmer's guide to living longer
An open-source efficient deep learning framework/compiler, written in Python.