Starred repositories
Apache Spark - A unified analytics engine for large-scale data processing
CMAK is a tool for managing Apache Kafka clusters
REST job server for Apache Spark
Byzer (formerly MLSQL): A low-code open-source programming language for data pipelines, analytics and AI.
Livy is an open source REST interface for interacting with Apache Spark from anywhere
Connect Spark to HBase for reading and writing data with ease
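💥 🚀 Wraps Spark Streaming to adjust the batch time dynamically (computation runs as soon as data arrives); 🚀 supports adding and removing topics at runtime; 🚀 wraps Spark Streaming 1.6 with Kafka 0.10 to support SSL.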
Adds remote multi-language invocation (ThriftServer, HttpServer) and distributed execution (DataX on YARN) to DataX (https://github.com/alibaba/DataX)
A small project for manually managing Kafka offsets in ZooKeeper when integrating Spark Streaming with Kafka
Scalable recommendation system written in Scala using the Apache Spark framework
A simple project intended to demo Spark and get developers up and running quickly