Stars
Free and Open Source, Distributed, RESTful Search Engine
Apache Druid: a high performance real-time analytics database.
OpenRefine is a free, open source power tool for working with messy data and improving it
Apache Beam is a unified programming model for Batch and Streaming data processing.
Apache Pinot - A realtime distributed OLAP datastore
The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).
Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log-like data
A flexible and scalable container-based Selenium Grid with video recording, live preview, basic auth & dashboard.
Apache Ambari simplifies provisioning, managing, and monitoring of Apache Hadoop clusters.
🐘 Elasticsearch real-time search and analytics natively integrated with Hadoop
Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large-scale machine learning
Open Source ML Model Versioning, Metadata, and Experiment Management
Elassandra = Elasticsearch + Apache Cassandra
Plugin to integrate Learning to Rank (aka machine learning for better relevance) with Elasticsearch
Hopsworks - Data-Intensive AI platform with a Feature Store
Multi Model Server is a tool for serving neural net models for inference
A scalable, mature and versatile web crawler based on Apache Storm
An extensible distributed system for reliable nearline data streaming at scale
REST web service for the true real-time scoring (<1 ms) of Scikit-Learn, R and Apache Spark models