mdagost

Organizations

@dssg

46 stars written in Java

The official home of the Presto distributed SQL query engine for big data

Java 16,698 5,532 Updated Apr 27, 2026

Zuul is a gateway service that provides dynamic routing, monitoring, resiliency, security, and more.

Java 14,011 2,443 Updated Apr 27, 2026

Apache Druid: a high performance real-time analytics database.

Java 13,982 3,776 Updated Apr 27, 2026

The Context Platform for your Data and AI Stack

Java 11,848 3,460 Updated Apr 27, 2026

OpenRefine is a free, open source power tool for working with messy data and improving it

Java 11,814 2,137 Updated Apr 24, 2026

A Camera component for React Native. Also supports barcode scanning!

Java 9,641 3,531 Updated Jun 7, 2023

Distributed and fault-tolerant realtime computation: stream processing, continuous computation, distributed RPC, and more

Java 8,786 1,646 Updated Aug 16, 2017

Use SQL to query Elasticsearch

Java 7,017 1,532 Updated Feb 21, 2026

High-quality QR Code generator library in Java, TypeScript/JavaScript, Python, Rust, C++, C.

Java 6,542 1,256 Updated Jan 23, 2025

Apache Pinot - A realtime distributed OLAP datastore

Java 6,069 1,472 Updated Apr 27, 2026

JanusGraph: an open-source, distributed graph database

Java 5,770 1,208 Updated Apr 24, 2026

CrateDB is a distributed and scalable SQL database for storing and analyzing massive amounts of data in near real-time, even with complex queries. It is PostgreSQL-compatible, and based on Lucene.

Java 4,388 594 Updated Apr 27, 2026

Apache Parquet Java

Java 3,053 1,531 Updated Apr 26, 2026

Jenkins plugin to run dynamic agents in a Kubernetes/Docker environment

Java 2,307 1,273 Updated Apr 6, 2026

Stream summarizer and cardinality estimator.

Java 2,266 556 Updated Nov 28, 2019

A Graph Traversal Language (no longer active - see Apache TinkerPop)

Java 1,953 229 Updated Aug 16, 2021
Java 1,939 169 Updated Jul 17, 2021

Secor is a service implementing Kafka log persistence

Java 1,860 532 Updated Mar 10, 2026

A large-scale entity and relation database supporting aggregation of properties

Java 1,793 364 Updated Jun 6, 2025

Workload Automation System

Java 1,334 231 Updated Apr 23, 2026

Hopsworks - Data-Intensive AI platform with a Feature Store

Java 1,294 156 Updated Feb 10, 2025

A Java package to automatically detect anomalies in large scale time-series data

Java 1,193 326 Updated Nov 14, 2023

Scalable identity resolution, entity resolution, data mastering and deduplication using ML

Java 1,189 161 Updated Apr 27, 2026

A platform for visualization and real-time monitoring of data workflows

Java 1,170 197 Updated Jan 22, 2020

Twitter's collection of LZO and Protocol Buffer-related Hadoop, Pig, Hive, and HBase code.

Java 1,134 384 Updated Apr 10, 2023

A software library of stochastic streaming algorithms, a.k.a. sketches.

Java 952 220 Updated Apr 22, 2026

Semantic Parser with Execution

Java 843 295 Updated May 1, 2023

The metric correlation component of Etsy's Kale system

Java 709 69 Updated Apr 18, 2017

Hadoop library for large-scale data processing, now an Apache Incubator project

Java 581 132 Updated Jul 8, 2014

A tool that translates augmented Markdown into HTML or LaTeX

Java 474 31 Updated Jun 19, 2022