Starred repositories
Spring Batch is a framework for writing batch applications using Java and Spring
An Open Standard for lineage metadata collection
BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data ever…
Dr. Elephant is a job and flow-level performance monitoring and tuning tool for Apache Hadoop and Apache Spark
Nessie: Transactional Catalog for Data Lakes with Git-like semantics
Hopsworks - Data-Intensive AI platform with a Feature Store
Microservices-based streaming and batch data processing in Cloud Foundry and Kubernetes
Apache Celeborn is an elastic and high-performance service for shuffle and spilled data.
An agile, fast Java MVC framework with a rich domain model, made especially for the server side of mobile applications
https://openjdk.org/projects/code-tools/jcstress
Uniffle is a high performance, general purpose Remote Shuffle Service.
I listed all the questions that I wrote at least twice at https://seanforfun.github.io/leetcode/. All my solutions can be found in the leetcode column. I also made my own conclusions about data …
Source code for Modern Java Recipes book from O'Reilly
From Taobao's diamond: http://code.taobao.org/p/diamond/src/
Remote shuffle service for Apache Spark to store shuffle data on remote servers.
The Eclipse Memory Analyzer is a fast and feature-rich Java heap dump analyzer that helps you find memory leaks and reduce memory consumption.
Sample code for a presentation on ZooKeeper.
Functional testing framework for Big Data pipelines.