Starred repositories
Modin: Scale your Pandas workflows by changing a single line of code
Command-line sampling profiler for macOS, Linux, and Windows
Ministack: Free, open-source local AWS emulator - 35+ services, Terraform compatible, real databases. Free forever. MIT licensed.
This is the companion repository for the book How Query Engines Work.
The book Distributed systems: for fun and profit
An open-source C++ library developed and used at Facebook.
Apache Spark - A unified analytics engine for large-scale data processing
Postgres extension for vector search (DiskANN), complements pgvector for performance and scale. Postgres OSS licensed.
A lightweight, lightning-fast, in-process vector database
Open-source IDE for exploring and testing APIs (lightweight alternative to Postman/Insomnia)
Container runtimes on macOS (and Linux) with minimal setup
Dev environments for numerous languages based on Nix flakes [maintainer=@lucperkins]
A syntax-highlighting pager for git, diff, grep, and blame output
Bf-Tree is a modern read-write-optimized concurrent larger-than-memory range index in Rust from MS Research.
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
SeaweedFS is a distributed storage system for object storage (S3), file systems, and Iceberg tables, designed to handle billions of files with O(1) disk access and effortless horizontal scaling.
VimTeX: A modern Vim and Neovim filetype plugin for LaTeX files.
FlatBuffers: Memory Efficient Serialization Library
chDB is an in-process OLAP SQL Engine 🚀 powered by ClickHouse
The lean application framework for Python. Build sophisticated user interfaces with a simple Python API. Run your apps in the terminal and a web browser.
Free and Open Source, Distributed, RESTful Search Engine
The official Waterfox 💧 source code repository