Stars
GraphQL for Java with Spring Boot made easy.
A list of engineering manager resource links.
Prefect is a workflow orchestration framework for building resilient data pipelines in Python.
Querybook is a Big Data Querying UI, combining collocated table metadata and a simple notebook interface.
ClickHouse® is a real-time analytics database management system
Zuul is a gateway service that provides dynamic routing, monitoring, resiliency, security, and more.
Secor is a service implementing Kafka log persistence
A distributed data integration framework that simplifies common aspects of big data integration such as data ingestion, replication, organization and lifecycle management for both streaming and bat…
An open source, high scalability toolkit in Java for Entity Resolution.
DBND is an agile pipeline framework that helps data engineering teams track and orchestrate their data processes.
The OpenAPI Specification Repository
The leading data integration platform for ETL / ELT data pipelines from APIs, databases & files to data warehouses, data lakes & data lakehouses. Both self-hosted and Cloud-hosted.
Data Pipeline Framework using the singer.io spec
A community discussion platform: Brings together the main features from StackOverflow, Slack, Discourse, Reddit, and Disqus blog comments.
COVID REST API for India data, using Cloudflare Workers
🦔 PostHog is an all-in-one developer platform for building successful products. We offer product analytics, web analytics, session replay, error tracking, feature flags, experimentation, surveys, d…
Collect, aggregate, and visualize a data ecosystem's metadata
Guide your users through a tour of your app
Fair-code workflow automation platform with native AI capabilities. Combine visual building with custom code, self-host or cloud, 400+ integrations.
Apache DolphinScheduler is a modern data orchestration platform for quickly creating high-performance workflows with low code.
Fast web applications through dynamic, partially-stateful dataflow
Pravega - Streaming as a new software defined storage primitive
Kylo is a data lake management software platform and framework for enabling scalable enterprise-class data lakes on big data technologies such as Teradata, Apache Spark and/or Hadoop. Kylo is licen…
Titanoboa makes complex workflows easy. It is a low-code workflow orchestration platform for the JVM: distributed, highly scalable, and fault tolerant.
Apache BookKeeper - a scalable, fault tolerant and low latency storage service optimized for append-only workloads
Apache Pulsar - distributed pub-sub messaging system