🚀 Seamlessly run and scale Hadoop in Docker with zero hassle
Updated Aug 28, 2025 - Dockerfile
- Docker image builds for a Hadoop sandbox.
- HDFS single-node container for local testing.
- Helm chart for Apache Hadoop using multi-arch Docker images.
- 🐳 Docker images for the Hadoop ecosystem.
- Hadoop3-HA-Docker: a production-ready, fault-tolerant Hadoop cluster deployed with Docker Compose. It automates the setup of a fully distributed Hadoop ecosystem with high-availability (HA) features, designed for reliability, scalability, and real-world big data workloads.
- Local playground for Spark and Jupyter notebooks, with Iceberg support.
- Set up a local Spark cluster, Hadoop (HDFS), Airflow, and PostgreSQL on Docker with ease, without any local installations.
- Apache Hadoop development environment integrated with Jupyter Notebook using Docker.
- Standalone Spark setup with Hadoop and Hive, running on Docker containers.
- A template repository providing a convenient Apache Hadoop instance in Dev Containers.
- A simple big data stack with Docker.
- BigData Pipeline: a local testing environment for experimenting with various storage solutions (RDBMS, HDFS), query engines (Trino), schedulers (Airflow), and ETL/ELT tools (dbt). It supports MySQL, Hadoop, Hive, Kudu, and more.
- A Docker image containing the necessary tools for big data work.
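A pattern common to several of the repositories above is a Docker Compose file that wires HDFS daemons together for local testing. A minimal sketch of that idea follows; the image name (`apache/hadoop:3`), commands, environment variable, and port mapping are assumptions based on typical Hadoop-on-Docker setups, not taken from any specific repository listed here:

```yaml
# Hypothetical docker-compose.yml for a single-node HDFS sandbox.
# Image name and settings are assumptions; adjust them to match the
# repository you actually use.
services:
  namenode:
    image: apache/hadoop:3
    command: ["hdfs", "namenode"]
    ports:
      - "9870:9870"   # NameNode web UI
    environment:
      # Create the NameNode metadata directory on first start
      ENSURE_NAMENODE_DIR: /tmp/hadoop-root/dfs/name
  datanode:
    image: apache/hadoop:3
    command: ["hdfs", "datanode"]
    depends_on:
      - namenode
```

With a setup along these lines, `docker compose up -d` starts the daemons and the NameNode web UI becomes reachable at http://localhost:9870.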