Stars
A bidirectional pipeline parallelism algorithm for computation-communication overlap in DeepSeek V3/R1 training.
A PyTorch native platform for training generative AI models
Minimalistic 4D-parallelism distributed training framework for educational purposes
Command-line program to download image galleries and collections from several image hosting sites
Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.
Production-grade client-side tracing, profiling, and analysis for complex software systems.
Open-source framework for the research and development of foundation models.
Beta unofficial migration assistant for moving from Claude Code to OpenAI Codex CLI
The official implementation for [NeurIPS2025 Oral] Gated Attention for Large Language Models: Non-linearity, Sparsity, and Attention-Sink-Free
Use Codex from Claude Code to review code or delegate tasks.
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
FDFO: Finite Difference Flow Optimization
The 100 line AI agent that solves GitHub issues or helps you in your command line. Radically simple, no huge configs, no giant monorepo—but scores >74% on SWE-bench verified!
PCCL (Prime Collective Communications Library) implements fault tolerant collective communications over IP
Skills for Real Engineers. Straight from my .claude directory.
EleutherAI / nanoGPT-mup
Forked from karpathy/nanoGPT
The simplest, fastest repository for training/finetuning medium-sized GPTs.
🌻 Flexible and fast ZSH plugin manager
Python tool for converting files and office documents to Markdown.
The simplest, fastest repository for training/finetuning medium-sized GPTs.
OpenDiLoCo: An Open-Source Framework for Globally Distributed Low-Communication Training
Run, manage, and scale AI workloads on any AI infrastructure. Use one system to access & manage all AI compute (Kubernetes, Slurm, 20+ clouds, on-prem).