Stars
Source code for the X Recommendation Algorithm
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
State of the Art Natural Language Processing
Deequ is a library built on top of Apache Spark for defining "unit tests for data", which measure data quality in large datasets.
Apache Kyuubi is a distributed and multi-tenant gateway to provide serverless SQL on data warehouses and lakehouses.
A Spark plugin for reading and writing Excel files
Apache Spark Connector for SQL Server and Azure SQL
Plug-and-play implementation of an Apache Spark custom data source for AWS DynamoDB.
Performant Redshift data source for Apache Spark