- alluxio Public
  Forked from Alluxio/alluxio
  Alluxio, data orchestration for analytics and machine learning in the cloud
  Java | Apache License 2.0 | Updated Apr 29, 2025
- azkaban Public
  Forked from azkaban/azkaban
  Azkaban workflow manager.
  Java | Apache License 2.0 | Updated Dec 19, 2019
- bitsail Public
  Forked from bytedance/bitsail
  BitSail is a distributed high-performance data integration engine which supports batch, streaming and incremental scenarios. BitSail is widely used to synchronize hundreds of trillions of data ever…
  Java | Apache License 2.0 | Updated Dec 5, 2023
- cassandra Public
  Forked from apache/cassandra
  Apache Cassandra®
  Java | Apache License 2.0 | Updated Mar 24, 2025
- dubbo Public
  Forked from apache/dubbo
  Apache Dubbo is a high-performance, java based, open source RPC framework.
  Java | Apache License 2.0 | Updated Mar 27, 2023
- fluss Public
  Forked from apache/fluss
  Fluss is a streaming storage built for real-time analytics.
  Java | Apache License 2.0 | Updated Oct 27, 2025
- hudi Public
  Forked from apache/hudi
  Upserts, Deletes And Incremental Processing on Big Data.
  Java | Apache License 2.0 | Updated Oct 21, 2020
- iceberg Public
  Forked from apache/iceberg
  Apache Iceberg
  Java | Apache License 2.0 | Updated Apr 7, 2025
- kafka Public
  Forked from apache/kafka
  Mirror of Apache Kafka
  Java | Apache License 2.0 | Updated Jun 8, 2024
- paimon Public
  Forked from apache/paimon
  Apache Paimon is a lake format that enables building a Realtime Lakehouse Architecture with Flink and Spark for both streaming and batch operations.
  Java | Apache License 2.0 | Updated Oct 27, 2025
- pulsar Public
  Forked from apache/pulsar
  Apache Pulsar - distributed pub-sub messaging system
  Java | Apache License 2.0 | Updated Jan 15, 2024
- ranger Public
  Forked from apache/ranger
  Apache Ranger - To enable, monitor and manage comprehensive data security across the Hadoop platform and beyond
  Java | Apache License 2.0 | Updated Oct 30, 2024
- ray Public
  Forked from ray-project/ray
  Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
  Python | Apache License 2.0 | Updated Jun 6, 2025
- sofa-bolt Public
  Forked from sofastack/sofa-bolt
  SOFABolt is a lightweight, easy to use and high performance remoting framework based on Netty.
  Java | Apache License 2.0 | Updated Mar 14, 2019
- spark Public
  Forked from apache/spark
  Apache Spark - A unified analytics engine for large-scale data processing
  Scala | Apache License 2.0 | Updated Dec 17, 2024
- Distributed transactional key-value database, originally created to complement TiDB
  Rust | Apache License 2.0 | Updated Dec 16, 2024
- trino Public
  Forked from trinodb/trino
  Official repository of Trino, the distributed SQL query engine for big data, formerly known as PrestoSQL (https://trino.io)
  Java | Apache License 2.0 | Updated Aug 13, 2023