GPU Mode
Organizations

@awesome-mlss

My learning notes for ML SYS.

Python 4,751 301 Updated Dec 22, 2025

MoE training for Me and You and maybe other people

Python 289 25 Updated Dec 17, 2025

minimal DL library in C: 24 NAIVE cuda/cpu ops, autodiff engine, python API (ops bindings/layers/models), tensor abstraction, strides, complex indexing (multi-dim slices like numpy), computation-gr…

C++ 29 4 Updated Dec 17, 2025

a teaching deep learning framework: the bridge from micrograd to tinygrad

Python 8 Updated Dec 22, 2025

High-Performance Implementation of OpenAI's TikToken.

C++ 465 12 Updated Jul 3, 2025

Hallucinations (Confabulations) Document-Based Benchmark for RAG. Includes human-verified questions and answers.

HTML 238 7 Updated Aug 7, 2025

s1: Simple test-time scaling

Python 6,618 764 Updated Jun 25, 2025

A course of learning LLM inference serving on Apple Silicon for systems engineers: build a tiny vLLM + Qwen.

Python 3,564 246 Updated Dec 18, 2025
JavaScript 173 13 Updated Oct 23, 2025

The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!

Python 2,352 301 Updated Dec 22, 2025

This repository delivers end-to-end, code-first tutorials covering every layer of production-grade GenAI agents, guiding you from spark to scale with proven patterns and reusable blueprints for re…

Jupyter Notebook 16,000 2,099 Updated Dec 11, 2025

Supercharge Your LLM Application Evaluations 🚀

Python 11,824 1,178 Updated Dec 22, 2025

A Book about Pythonic Application Architecture Patterns for Managing Complexity. Cosmos is the Opposite of Chaos you see. O'R. wouldn't actually let us call it "Cosmic Python" tho.

Python 3,676 543 Updated Sep 8, 2025

This repository contains a curated collection of 300+ case studies from over 80 companies, detailing practical applications and insights into machine learning (ML) system design. The contents are o…

5,144 705 Updated Aug 5, 2025

Scalable and Performant Data Loading

Python 355 21 Updated Dec 22, 2025

A reading list on LLM based Synthetic Data Generation 🔥

1,492 90 Updated Jun 5, 2025

Vocabulary Parallelism

Python 24 Updated Mar 10, 2025

Ongoing research training transformer models at scale

Python 14,673 3,404 Updated Dec 23, 2025

Pax is a Jax-based machine learning framework for training large scale models. Pax allows for advanced and fully configurable experimentation and parallelization, and has demonstrated industry lead…

Python 542 71 Updated Dec 20, 2025
Dockerfile 1 Updated Nov 8, 2025

SGLang is a fast serving framework for large language models and vision language models.

Python 21,908 3,836 Updated Dec 23, 2025

Fastest kernels written from scratch

Cuda 500 61 Updated Sep 18, 2025

:octocat: Browser extension that simplifies the GitHub interface and adds useful features

TypeScript 29,956 1,637 Updated Dec 22, 2025

Machine Learning Engineering Open Book

Python 16,079 987 Updated Dec 20, 2025

Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.

Python 305 55 Updated Dec 23, 2025

Minimalistic 4D-parallelism distributed training framework for educational purposes

Python 1,926 149 Updated Aug 26, 2025

This repo is meant to serve as a guide for Machine Learning/AI technical interviews.

Jupyter Notebook 7,352 1,332 Updated Nov 28, 2025

CUDA Library Samples

C++ 2,250 431 Updated Dec 22, 2025

Step-by-step optimization of CUDA SGEMM

Cuda 416 54 Updated Mar 30, 2022

Small scale distributed training of sequential deep learning models, built on Numpy and MPI.

Python 153 7 Updated Oct 19, 2023