Stars
My learning notes for ML SYS.
MoE training for Me and You and maybe other people
minimal DL library in C: 24 NAIVE cuda/cpu ops, autodiff engine, python API (ops bindings/layers/models), tensor abstraction, strides, complex indexing (multi-dim slices like numpy), computation-gr…
a teaching deep learning framework: the bridge from micrograd to tinygrad
High-Performance Implementation of OpenAI's TikToken.
Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers.
A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.
The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!
This repository delivers end-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for re…
Supercharge Your LLM Application Evaluations 🚀
A Book about Pythonic Application Architecture Patterns for Managing Complexity. Cosmos is the Opposite of Chaos you see. O'R. wouldn't actually let us call it "Cosmic Python" tho.
This repository contains a curated collection of 300+ case studies from over 80 companies, detailing practical applications and insights into machine learning (ML) system design. The contents are o…
A reading list on LLM based Synthetic Data Generation 🔥
sail-sg / VocabularyParallelism
Forked from NVIDIA/Megatron-LM
Vocabulary Parallelism
Ongoing research training transformer models at scale
Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry lead…
SGLang is a fast serving framework for large language models and vision language models.
Browser extension that simplifies the GitHub interface and adds useful features
Machine Learning Engineering Open Book
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
Minimalistic 4D-parallelism distributed training framework for educational purposes
This repo is meant to serve as a guide for Machine Learning/AI technical interviews.
Step-by-step optimization of CUDA SGEMM
Small-scale distributed training of sequential deep learning models, built on NumPy and MPI.