Starred repositories
Summaries and resources for Designing Machine Learning Systems book (Chip Huyen, O'Reilly 2022)
Fit interpretable models. Explain blackbox machine learning.
Distributed Compiler based on Triton for Parallel Systems
A lightweight sandboxing tool for enforcing filesystem and network restrictions on arbitrary processes at the OS level, without requiring a container.
Framework for creating high fidelity and complex RL environments and evaluation tasks
A protocol for connecting any editor to any agent
An annotated implementation of the Transformer paper.
Solve puzzles. Improve your PyTorch.
NexusRaven-13B, a new SOTA Open-Source LLM for function calling. This repo contains everything for reproducing our evaluation on NexusRaven-13B and baselines.
OpenChat: Advancing Open-source Language Models with Imperfect Data
Claude Code is an agentic coding tool that lives in your terminal, understands your codebase, and helps you code faster by executing routine tasks, explaining complex code, and handling git workflo…
Lakehouse native graph engine with git-style workflows
C++-based high-performance parallel environment execution engine (vectorized env) for general RL environments.
SWE-bench: Can Language Models Resolve Real-world GitHub Issues?
Collection of the system designs driven by LLMs
A collection of 500+ real-world ML & LLM system design case studies from 100+ companies. Learn how top tech firms implement GenAI in production.
AI system design guide for engineers building production AI systems and evals.
TimesFM (Time Series Foundation Model) is a pretrained time-series foundation model developed by Google Research for time-series forecasting.
Scalable toolkit for efficient model reinforcement
Lance Namespace is an open specification for describing access and operations against a collection of tables in a multimodal lakehouse
Agent framework for the JVM. Pronounced Em-BAY-bel /ɛmˈbeɪbəl/
A library to convert a pydantic model to a pyarrow schema
Miles is an enterprise-facing reinforcement learning framework for LLM and VLM post-training, forked from and co-evolving with slime.
A fast, feature-rich static code analyzer & language server for Python
Apache Paimon Python: the Python implementation of Apache Paimon.
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…