Stars
Flink CDC is a streaming data integration tool
Apache DolphinScheduler is a modern data orchestration platform for building high-performance workflows with low code.
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
Apache Spark - A unified analytics engine for large-scale data processing
Apache Pulsar - distributed pub-sub messaging system
An open-source storage framework that enables building a Lakehouse architecture with compute engines including Spark, PrestoDB, Flink, Trino, and Hive and APIs
Upserts, Deletes And Incremental Processing on Big Data.
A standard style for README files
Easy-to-use PyTorch-based framework for RecSys models
Easy-to-use, modular, and extensible package of deep-learning-based CTR models.