Setting up a simple Apache Spark environment for working with Spark during development.
Apache Spark is an open-source, distributed, general-purpose cluster-computing framework. It provides an interface for programming entire clusters with implicit data parallelism and fault tolerance.
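As a quick sanity check for such a containerized development environment, the minimal PySpark script below starts a local session and runs a trivial aggregation. It is a sketch that assumes the `pyspark` package is installed in the image; no external cluster is needed.

```python
from pyspark.sql import SparkSession

# Start a local Spark session inside the container; local[*] uses all
# available cores rather than connecting to an external cluster.
spark = (
    SparkSession.builder
    .master("local[*]")
    .appName("smoke-test")
    .getOrCreate()
)

# Build a tiny DataFrame and run a trivial aggregation to confirm that
# the JVM, the Python bindings, and the executor all work together.
df = spark.createDataFrame([("spark", 1), ("docker", 2)], ["name", "value"])
df.groupBy().sum("value").show()

spark.stop()
```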
Container configurations saved from other work and personal projects.
Docker image of Morpheus, the openCypher project previously known as Cypher for Apache Spark.
A small Docker-based development environment for Apache Spark.
Running Apache Spark with Docker Swarm.
Apache Spark cluster connected to a Jupyter Notebook instance
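In a setup like this, the notebook builds its Spark session against the standalone master rather than running Spark locally. The sketch below assumes a compose service named `spark-master` exposing the standalone master's default port 7077; adjust the URL and resources to match your stack.

```python
from pyspark.sql import SparkSession

# Connect the notebook's driver to a standalone Spark master.
# "spark-master" is an assumed docker-compose service name; 7077 is the
# standalone master's default port.
spark = (
    SparkSession.builder
    .master("spark://spark-master:7077")
    .appName("jupyter-notebook")
    .config("spark.executor.memory", "1g")
    .getOrCreate()
)

# A small distributed job to confirm executors on the cluster respond.
spark.range(1_000_000).selectExpr("sum(id)").show()
```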
PySpark in Docker Containers
This repository holds examples and documentation for the most widely used tools in the data engineering ecosystem.
Docker setup for Apache Spark and the R sparklyr package
A lightweight, easy-to-use Docker image for running Apache Spark built against Scala 2.13.
Set up an Apache Spark cluster with Hadoop (HDFS) and Airflow on Docker.
Set up a local Spark cluster, Hadoop (HDFS), Airflow, and PostgreSQL on Docker with ease, without any local installations.
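In stacks like these, Airflow typically drives Spark through the Spark provider's `SparkSubmitOperator`. The DAG below is a sketch, not taken from the repositories above: the application path is hypothetical, and the `spark_default` connection is assumed to point at the cluster's master URL.

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.apache.spark.operators.spark_submit import SparkSubmitOperator

# A minimal DAG that submits a PySpark job to the containerized cluster
# once a day.
with DAG(
    dag_id="spark_example",
    start_date=datetime(2019, 8, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    submit_job = SparkSubmitOperator(
        task_id="submit_spark_job",
        application="/opt/airflow/jobs/example_job.py",  # hypothetical path
        conn_id="spark_default",  # assumed connection to the Spark master
        name="example-job",
    )
```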
A robust, scalable on-premises data lake
A comprehensive starter kit for Apache Spark, featuring Docker-based setup and example applications demonstrating various Spark capabilities.
Created by Matei Zaharia
Released May 26, 2014