Stars
AI agents running research on single-GPU nanochat training automatically
Ultra fast and portable Parakeet implementation for on-device inference in C++ using Axiom with MPS+Unified Memory
Fast Open-Source Search & Clustering engine × for Vectors & Arbitrary Objects × in C++, C, Python, JavaScript, Rust, Java, Objective-C, Swift, C#, GoLang, and Wolfram 🔍
A lightweight data processing framework built on DuckDB and 3FS.
🗻 Log-structured, embeddable key-value storage engine written in Rust
🦛 CHONK docs with Chonkie ✨ — The lightweight ingestion library for fast, efficient and robust RAG pipelines
Official Repo for InSTA: Towards Internet-Scale Training For Agents
RayDP provides simple APIs for running Spark on Ray and integrating Spark with AI libraries.
SeaweedFS is a distributed storage system for object storage (S3), file systems, and Iceberg tables, designed to handle billions of files with O(1) disk access and effortless horizontal scaling.
A Kubernetes operator to install and manage Dragonfly instances.
Scalable and efficient data transformation framework - backwards compatible with dbt.
LMCache: Supercharge Your LLM with the Fastest KV Cache Layer
A modern replacement for Redis and Memcached
Recipes to scale inference-time compute of open models
A summary of existing representative LLM text datasets.
Train high-quality text-to-image diffusion models in a data & compute efficient manner
Streamlines and simplifies prompt design for both developers and non-technical users with a low-code approach.
A high-throughput and memory-efficient inference and serving engine for LLMs
High-performance data engine for AI and multimodal workloads. Process images, audio, video, and structured data at any scale
Open Lakehouse Format for Multimodal AI. Convert from Parquet in 2 lines of code for 100x faster random access, vector index, and data versioning. Compatible with Pandas, DuckDB, Polars, Pyarrow, a…
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Resources, examples & tutorials for multimodal AI, RAG and agents using vector search and LLMs
Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.