Stars
Build an agent harness and control it end-to-end. Open-source SDK for production AI agents in Python & TypeScript - any model, any cloud.
verl/HybridFlow: A Flexible and Efficient RL Post-Training Framework
Anthropic's original performance take-home, now open for you to try!
A comprehensive 0-to-1 guide for building self-improving LLM applications with the DSPy framework
Build and run agents you can see, understand and trust.
Implementation of all RL algorithms in a simpler way
The Startup CTO's Handbook, a book covering leadership, management and technical topics for leaders of software engineering teams
Official inference framework for 1-bit LLMs
Debug, evaluate, and monitor your LLM applications, RAG systems, and agentic workflows with comprehensive tracing, automated evaluations, and production-ready dashboards.
OpenID Connect (OIDC) identity and OAuth 2.0 provider with pluggable connectors
A JavaScript library like PyTorch, with GPU acceleration.
Lightweight, standalone C++ inference engine for Google's Gemma models.
An unnecessarily tiny implementation of GPT-2 in NumPy.
Container runtimes on macOS (and Linux) with minimal setup
Friends don't let friends make certain types of data visualization - What are they and why are they bad.
An Open Source text-to-speech system built by inverting Whisper.
Tools for merging pretrained large language models.
Apache Hamilton helps data scientists and engineers define testable, modular, self-documenting dataflows that encode lineage/tracing and metadata. Runs and scales everywhere Python does.
Best Practices on Recommendation Systems
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
Foundational Models for State-of-the-Art Speech and Text Translation
Convert PDF to markdown + JSON quickly with high accuracy
Lightweight inference library for ONNX files, written in C++. It can run Stable Diffusion XL 1.0 on a RPI Zero 2 (or in 298MB of RAM) but also Mistral 7B on desktops and servers. ARM, x86, WASM, RI…
Bootstrap Kubernetes the hard way. No scripts.
A high-throughput and memory-efficient inference and serving engine for LLMs
An end-to-end implementation of intent prediction with Metaflow and other cool tools