Stars
High Performance Inter-Thread Messaging Library
Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for Keras, TensorFlow, and ONNX/PyTorch, a modular and tiny C++ library for running mat…
Alluxio, data orchestration for analytics and machine learning in the cloud
Tutorials for using RabbitMQ in various ways
A Flexible and Powerful Parameter Server for large-scale machine learning
Web-based notebook that enables data-driven, interactive data analytics and collaborative documents with SQL, Scala and more.
OrientDB is the most versatile DBMS supporting Graph, Document, Reactive, Full-Text and Geospatial models in one Multi-Model product. OrientDB can run distributed (Multi-Master), supports SQL, ACID…
CrateDB is a distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time, even with complex queries. It is PostgreSQL-compatible, and based on Lucene.
Example code from the Learning Spark book
Please visit https://github.com/h2oai/h2o-3 for latest H2O
A cluster computing framework for processing large-scale geospatial data
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning
Open Source ML Model Versioning, Metadata, and Experiment Management
Maven plugin which includes build-time git repository information in a POJO / *.properties file. Make your apps tell you exactly which version they were built from! Priceless in large distributed de…
MapReduce, Spark, Java, and Scala for the Data Algorithms book
An open source ML system for the end-to-end data science lifecycle
Snippets and small examples demonstrating Kafka features and configs
Building Microservices with Spring Boot
AWS libraries/modules for working with Kinesis aggregated record data
Source code for Big Data: Principles and best practices of scalable realtime data systems
Code repository for O'Reilly Hadoop Application Architectures book
Hadoop Crypto Ledger - Analyzing CryptoLedgers, such as Bitcoin Blockchain, on Big Data platforms, such as Hadoop/Spark/Flink/Hive
Supporting material (code, schemas, etc.) for Unified Log Processing (Manning Publications)