Stars
Apache Spark - A unified analytics engine for large-scale data processing
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
Byzer (formerly MLSQL): A low-code open-source programming language for data pipelines, analytics, and AI.
Gluten is a middle layer responsible for offloading JVM-based SQL engines' execution to native engines.
Apache DataFusion Comet Spark Accelerator
Spark RAPIDS plugin - accelerate Apache Spark with GPUs
This is the development repository for sparkMeasure, a tool and library designed for efficient analysis and troubleshooting of Apache Spark jobs. It focuses on easing the collection and examination…
All the things about TPC-DS in Apache Spark