Stars
Change data capture for a variety of databases. Please log issues at https://github.com/debezium/dbz/issues.
Alluxio, data orchestration for analytics and machine learning in the cloud
Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
Repo for counting stars and contributing. Press F to pay respect to glorious developers.
This is an archive of the SparkRDMA project. The new repository with RDMA shuffle acceleration for Apache Spark is here: https://github.com/Nvidia/sparkucx
The open source AI engineering platform for agents, LLMs, and ML models. MLflow enables teams of all sizes to debug, evaluate, monitor, and optimize production-quality AI applications while control…
Apache Spark - A unified analytics engine for large-scale data processing
Apache Beam is a unified programming model for Batch and Streaming data processing.
An easy-to-use, self-service, open BI reporting and dashboard platform.