Stars
Roo Code gives you a whole dev team of AI agents in your code editor.
🎒 Token-Oriented Object Notation (TOON) – Compact, human-readable, schema-aware JSON for LLM prompts. Spec, benchmarks, TypeScript SDK.
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
🧠 Laws, Theories, Principles and Patterns for developers and technologists.
🚀 2.3x faster than MinIO for 4KB object payloads. RustFS is an open-source, S3-compatible high-performance object storage system supporting migration and coexistence with other S3-compatible platfor…
Deploy an AI Analyst in less than 2 mins — connect any LLM to any data source with centralized context management, observability, and control. Text-to-SQL, Text-to-Python, Text-to-Dashboard
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
A curated list of 100+ libraries and frameworks for AI engineers building with LLMs
File Diff using the Patience Diff algorithm. https://opensource.janestreet.com/patdiff/
The Auron accelerator for distributed computing frameworks (e.g., Spark) leverages native vectorized execution to accelerate query processing
A lightweight data processing framework built on DuckDB and 3FS.
QuestDB is a high-performance, open-source, time-series database
Perforator is a cluster-wide continuous profiling tool designed for large data centers
AI-Powered Data Processing: Use LOTUS to process all of your datasets with LLMs and embeddings. Enjoy up to 1000x speedups with fast, accurate query processing that's as simple as writing Pandas code
A fast, simple, and cost-effective tool to replicate data from Postgres to data warehouses, queues, and storage
Rokku project. This project acts as a proxy on top of any S3 storage solution providing services like authentication, authorization, short-term tokens, and lineage.
Distributed SQL Query Engine in Python using Ray
A generic framework for on-demand, incrementalized computation. Inspired by adapton, glimmer, and rustc's query system.
A portable Multimodal Lakehouse powered by Ray that brings exabyte-level scalability and fast, ACID-compliant, change-data-capture to your ML and analytics workloads.
Production-ready C++ Asynchronous Framework with rich functionality
Free universal database tool and SQL client
Jackrabbit Relay is an API endpoint for cryptocurrency/forex exchanges.
Reverse proxy for AWS S3 with basic authentication.
High-performance diffing of large datasets across databases
Tool for easy backup and restore for ClickHouse® using object storage for backup files.
📊 Cube Core is an open-source semantic layer and LookML alternative for AI, BI and embedded analytics