Stars
Apache Spark - A unified analytics engine for large-scale data processing
Oh my tmux! My self-contained, pretty & versatile tmux configuration made with 💛🩷💙🖤❤️🤍
Mirror of the official PostgreSQL GIT repository. Note that this is just a *mirror* - we don't work with pull requests on github. To contribute, please see https://wiki.postgresql.org/wiki/Submitti…
A platform to build and run apps that are elastic, agile, and resilient. SDK, libraries, and hosted environments.
A Flexible and Powerful Parameter Server for large-scale machine learning
Notes talking about the design and implementation of Apache Spark
A small utility to modify the dynamic linker and RPATH of ELF executables
A high performance and generic framework for distributed DNN training
LinDB is a scalable, high performance, high availability distributed time series database.
Stream summarizer and cardinality estimator.
A new data structure for accurate on-line accumulation of rank-based statistics such as quantiles and trimmed means
Deep Learning Pipelines for Apache Spark
Tonbo is an embedded database for serverless and edge runtimes.
Get Method Sampling from Java Flight Recorder Dump and convert to FlameGraph compatible format.
An end-to-end machine learning and data mining framework on Hadoop
Type-safe data migration tool for Slick, Git and beyond.
Classical RecSys algorithms implemented by using TensorFlow Estimators
An iterative computing framework for both Hadoop MapReduce and Hadoop YARN.
junshiguo / shifu
Forked from ShifuML/shifu
An end-to-end machine learning and data mining framework on Hadoop
junshiguo / AMC
Forked from brett-chen/AMC
Code for KDD 2014 paper "Mining Topics in Documents: Standing on the Shoulders of Big Data"
junshiguo / EnWikiIndexing
Forked from Raysmond/EnWikiIndexing
Aim to create distributed inverted indexes of English Wikipedia dump using Hadoop.