Stars
Apache Iggy: Hyper-Efficient Message Streaming at Laser Speed
SeaweedFS is a distributed storage system for object storage (S3), file systems, and Iceberg tables, designed to handle billions of files with O(1) disk access and effortless horizontal scaling.
DataFrame-like library and AI agent for working with Apache Iceberg in Python, using pyiceberg plus natively implemented procedure extensions
Repo of adapters converting a Cerbos Query Plan to a data fetching layer
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
Faster C implementation of the bitstruct Python library
The world's fastest open query engine for sub-second analytics both on and off the data lakehouse. With the flexibility to support nearly any scenario, StarRocks provides best-in-class performance …
pip installer for OpenAPI Generator
New Polars implementation of the classic featurewiz MRMR algorithm. Created by Ram Seshadri. Collaborators welcome.
Use advanced feature engineering strategies and select the best features from your dataset with a single line of code. Created by Ram Seshadri. Collaborators welcome.
GitHub Action to set up `ssh-agent` with a private key
Apache Druid: a high performance real-time analytics database.
Learn the basics of Apache Druid® from leaders in the community with these notebooks and useful tools.
Unified Training of Universal Time Series Forecasting Transformers
FastStream is a powerful and easy-to-use asynchronous Python framework for building services that interact with event streams such as Apache Kafka, RabbitMQ, NATS, MQTT, and Redis.
Utility to make it easy for a package to report its own version.
Fast job queuing and RPC in Python with asyncio and Redis.
Distributed task queue with full async support
An extremely fast Python package and project manager, written in Rust.
An extremely fast Python linter and code formatter, written in Rust.