Forks

    Repositories list

    • gluten

      Public
      Scala
      Apache License 2.0
      434100 Updated Nov 5, 2024
    • ClickHouse is a free analytic DBMS for big data.
      C++
      Apache License 2.0
      6.9k405 Updated Nov 5, 2024
    • orc

      Public
      Apache ORC - the smallest, fastest columnar storage for Hadoop workloads
      Java
      Apache License 2.0
      483000 Updated Oct 28, 2024
    • spark

      Public
      Apache Spark
      Scala
      Apache License 2.0
      28k000 Updated Jul 31, 2024
    • doris

      Public
      Apache Doris is an easy-to-use, high performance and unified analytics database.
      Java
      Apache License 2.0
      3.3k000 Updated Aug 4, 2023
    • rate

      Public
      Golang rate limiter for distributed system
      Go
      MIT License
      13000 Updated Jul 3, 2023
    • libhdfs3

      Public
      HDFS file read access for ClickHouse
      C++
      Apache License 2.0
      56000 Updated Jun 25, 2023
    • arrow

      Public
      Apache Arrow is a cross-language development platform for in-memory data. It specifies a standardized language-independent columnar memory format for flat and hierarchical data, organized for efficient analytic operations on modern hardware. It also provides computational libraries and zero-copy streaming messaging and interprocess communication.
      C++
      Apache License 2.0
      3.5k000 Updated Mar 29, 2023
    • isa-l

      Public
      Intelligent Storage Acceleration Library
      C
      BSD 3-Clause "New" or "Revised" License
      300000 Updated Mar 21, 2023
    • C library for the MaxMind DB file format
      C
      Apache License 2.0
      239000 Updated Feb 11, 2023
    • Official home of Presto, the distributed SQL query engine for big data
      Java
      Apache License 2.0
      3k300 Updated Jan 25, 2023
    • ByConity

      Public
      ByConity is an open source cloud-native data warehouse
      C++
      Apache License 2.0
      331000 Updated Jan 10, 2023
    • redis

      Public
      Redis is an in-memory database that persists on disk. The data model is key-value, but many different kinds of values are supported: Strings, Lists, Sets, Sorted Sets, Hashes, Streams, HyperLogLogs, Bitmaps.
      C
      BSD 3-Clause "New" or "Revised" License
      24k000 Updated Aug 1, 2022
    • sysroot

      Public
      Files for cross-compilation
      C
      21000 Updated Jul 19, 2022
    • gporca

      Public
      A modular query optimizer for big data
      C++
      Apache License 2.0
      228000 Updated Jul 15, 2022
    • NameNodeAnalytics is a self-help utility for scouting and maintaining the namespace of an HDFS instance.
      Java
      Apache License 2.0
      71000 Updated Jun 14, 2022
    • seastar

      Public
      High performance server-side application framework
      C++
      Apache License 2.0
      1.5k000 Updated Jan 21, 2022
    • Fast & memory efficient hashtable based on robin hood hashing for C++11/14/17/20
      C++
      MIT License
      145000 Updated Jan 12, 2022
    • dpdk

      Public
      Mirror of Data Plane Development Kit, git://dpdk.org/dpdk (http://dpdk.org)
      C++
      GNU General Public License v2.0
      196000 Updated Oct 13, 2021
    • Apache DolphinScheduler is a distributed and extensible workflow scheduler platform with powerful DAG visual interfaces, dedicated to solving complex job dependencies in the data pipeline and providing various types of jobs available `out of the box`.
      Java
      Apache License 2.0
      4.6k000 Updated Aug 23, 2021
    • brpc

      Public
      Most common RPC framework used throughout Baidu, with 600,000+ instances and 500+ kinds of services, called "baidu-rpc" inside Baidu.
      C++
      Apache License 2.0
      4k600 Updated Jul 14, 2021
    • Shaded version of Apache Hadoop for Trino
      Java
      Apache License 2.0
      48000 Updated Jul 1, 2021
    • ranger

      Public
      Mirror of Apache Ranger
      Java
      Apache License 2.0
      971000 Updated Mar 4, 2021
    • alluxio

      Public
      Alluxio, data orchestration for analytics and machine learning in the cloud
      Java
      Apache License 2.0
      2.9k000 Updated Nov 2, 2020
    • kyuubi

      Public
      Kyuubi is an enhanced edition of Apache Spark's primordial Thrift JDBC/ODBC Server.
      Scala
      Apache License 2.0
      915100 Updated Oct 20, 2020
    • hive

      Public
      Apache Hive
      Java
      Apache License 2.0
      4.7k101 Updated Oct 12, 2020
    • rttr

      Public
      C++ Reflection Library
      C++
      MIT License
      439001 Updated Aug 21, 2020
    • Java
      16000 Updated Jul 7, 2020
    • braft

      Public
      An industrial-grade C++ implementation of RAFT consensus algorithm based on brpc, widely used inside Baidu to build highly-available distributed systems.
      C++
      Apache License 2.0
      883000 Updated Jun 29, 2020
    • atlas

      Public
      Apache Atlas
      Java
      Apache License 2.0
      847000 Updated May 21, 2020