Collection of Apache Spark docker images for OKDP
Updated Dec 15, 2025 - Dockerfile
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
A comprehensive starter kit for Apache Spark, featuring Docker-based setup and example applications demonstrating various Spark capabilities.
A robust, scalable on-premises data lake
Set up a local Spark cluster, Hadoop (HDFS), Airflow, and PostgreSQL on Docker with ease, without any local installations
Set up an Apache Spark cluster with Hadoop (HDFS) and Airflow on Docker
The Apache Spark with Scala 2.13 Docker image is a lightweight, easy-to-use image for running Apache Spark with Scala 2.13 support on your system.
Docker setup for Apache Spark and the R sparklyr package
This repository holds examples and documentation about the most used tools in the data engineering ecosystem.
PySpark in Docker Containers
Apache Spark cluster connected to a Jupyter Notebook instance
Running Apache Spark with Docker Swarm
Small development environment setup for Apache Spark with Docker
Docker image of Morpheus, the openCypher project previously known as Cypher for Apache Spark
Container configurations saved from other work-related tasks or personal projects
Created by Matei Zaharia
Released May 26, 2014