MegaMixedReality

51 stars written in Java

Free and Open Source, Distributed, RESTful Search Engine

Java 75,717 25,746 Updated Dec 24, 2025

Mirror of Apache Kafka

Java 31,579 14,850 Updated Dec 24, 2025

Apache Flink

Java 25,634 13,806 Updated Dec 24, 2025

Apache Druid: a high performance real-time analytics database.

Java 13,906 3,770 Updated Dec 23, 2025

Apache ZooKeeper

Java 12,693 7,328 Updated Dec 19, 2025

OpenRefine is a free, open source power tool for working with messy data and improving it

Java 11,664 2,112 Updated Dec 23, 2025

Apache Beam is a unified programming model for Batch and Streaming data processing.

Java 8,416 4,468 Updated Dec 24, 2025

AI + Data, online. https://vespa.ai

Java 6,687 686 Updated Dec 22, 2025

Apache Pinot - A realtime distributed OLAP datastore

Java 5,993 1,436 Updated Dec 24, 2025

Apache NiFi

Java 5,884 2,916 Updated Dec 24, 2025

Apache Ignite

Java 5,023 1,932 Updated Dec 24, 2025

Open Source Web Crawler for Java

Java 4,617 1,920 Updated Nov 4, 2021

The Apache Tika toolkit detects and extracts metadata and text from over a thousand different file types (such as PPT, XLS, and PDF).

Java 3,477 903 Updated Dec 22, 2025

Apache Flume is a distributed, reliable, and available service for efficiently collecting, aggregating, and moving large amounts of log-like data

Java 2,556 1,557 Updated Oct 10, 2024

Easy-to-use lightweight web crawler

Java 2,518 883 Updated Dec 3, 2025

A flexible and scalable container based Selenium Grid with video recording, live preview, basic auth & dashboard.

Java 2,368 568 Updated Sep 11, 2021

Apache Ambari simplifies provisioning, managing, and monitoring of Apache Hadoop clusters.

Java 2,277 1,744 Updated Dec 19, 2025

🐘 Elasticsearch real-time search and analytics natively integrated with Hadoop

Java 2,022 999 Updated Dec 19, 2025

Oryx 2: Lambda architecture on Apache Spark, Apache Kafka for real-time large scale machine learning

Java 1,784 404 Updated Aug 16, 2021

Distributed Big Data Orchestration Service

Java 1,759 371 Updated Dec 17, 2025

Open Source ML Model Versioning, Metadata, and Experiment Management

Java 1,744 288 Updated Jul 23, 2024

Elassandra = Elasticsearch + Apache Cassandra

Java 1,719 199 Updated May 26, 2025

Plugin to integrate Learning to Rank (aka machine learning for better relevance) with Elasticsearch

Java 1,521 375 Updated Oct 30, 2025

An open source event analytics platform

Java 1,335 145 Updated Apr 5, 2022

Hopsworks - Data-Intensive AI platform with a Feature Store

Java 1,271 152 Updated Feb 10, 2025

Multi Model Server is a tool for serving neural net models for inference

Java 1,025 232 Updated May 20, 2024

A scalable, mature and versatile web crawler based on Apache Storm

Java 953 268 Updated Dec 23, 2025

An extensible distributed system for reliable nearline data streaming at scale

Java 951 140 Updated Nov 11, 2025

Apache Metron

Java 867 506 Updated Aug 13, 2025

REST web service for the true real-time scoring (<1 ms) of Scikit-Learn, R and Apache Spark models

Java 588 172 Updated Dec 1, 2025