Starred repositories
Free and Open Source, Distributed, RESTful Search Engine
Orchestrate everything - from scripts to data, infra, AI, and business - as code, with UI and AI Copilot. Simple. Fast. Scalable.
Apache Pulsar - distributed pub-sub messaging system
Apache Doris is an easy-to-use, high performance and unified analytics database.
Apache Druid: a high performance real-time analytics database.
Style and Grammar Checker for 25+ Languages
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Change data capture for a variety of databases. Please log issues at https://issues.redhat.com/browse/DBZ.
The Metadata Platform for your Data and AI Stack
Apache Beam is a unified programming model for Batch and Streaming data processing.
Upserts, Deletes And Incremental Processing on Big Data.
Apache Pinot - A realtime distributed OLAP datastore
Java client for Kubernetes & OpenShift
Supplementary resources for the AWS Lambda Developer Guide
First open-source data discovery and observability platform. We make life easy for data practitioners so you can focus on your business.
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
Apache XTable (incubating) is a cross-table converter for lakehouse table formats that facilitates interoperability across data processing systems and query engines.
Artemis - Interactive Learning with Automated Feedback
Streaming Anomaly Detection Solution by using Pub/Sub, Dataflow, BQML & Cloud DLP
The Amazon S3 Tables catalog is a client library that bridges control plane operations provided by S3 Tables to engines like Apache Spark, Flink, and others when used with the Iceberg table format.