This repository contains the Dockerfile for apache-airflow, used for Docker's automated build published to the public Docker Hub Registry.
- Based on the official Debian Jessie image (debian:jessie), with the official Postgres image as the metadata backend and Redis as the queue
- Install Docker
- Install Docker Compose
- Follows the Airflow releases from the Python Package Index
For example, if you need to install extra packages, edit the Dockerfile and then build it:
python build-airflow-image.py
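As a hypothetical sketch, an extra-packages edit could add an Airflow extras install to the Dockerfile before running the build script (the extras names below are illustrative; pick the ones you actually need, and match the Airflow version pinned in this image):

```dockerfile
# Illustrative Dockerfile addition: install extra Airflow packages.
# The extras list (crypto, postgres, redis) is an example, not a requirement.
RUN pip install "apache-airflow[crypto,postgres,redis]"
```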
By default we use the LocalExecutor:
docker-compose -f docker-compose-LocalExecutor.yml up -d
Support for the CeleryExecutor may be added at some point. The original repo supports it via:
docker-compose -f docker-compose-CeleryExecutor.yml up -d
If you want to use the Ad Hoc Query feature, make sure you've configured the connection: go to Admin -> Connections, edit "postgres_default", and set these values (equivalent to the values in airflow.cfg/docker-compose*.yml):
- Host : postgres
- Schema : airflow
- Login : airflow
- Password : airflow
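Alternatively, Airflow can pick up a connection from an AIRFLOW_CONN_* environment variable. A sketch equivalent to the values above, assuming it is placed in the environment section of each Airflow service in the compose file (the port 5432 is the Postgres default and is an assumption here):

```yaml
# Hypothetical docker-compose environment entry, equivalent to the
# "postgres_default" connection values listed above:
environment:
  - AIRFLOW_CONN_POSTGRES_DEFAULT=postgresql://airflow:airflow@postgres:5432/airflow
```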
For encrypted connection passwords (with the Local or Celery Executor), every container must share the same fernet_key. By default docker-airflow generates the fernet_key at startup, so you have to set an environment variable in the docker-compose file (e.g. docker-compose-LocalExecutor.yml) to use the same key across containers. To generate a fernet_key:
python -c "from cryptography.fernet import Fernet; FERNET_KEY = Fernet.generate_key().decode(); print(FERNET_KEY)"
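A minimal sketch of wiring the generated key into the compose file. The FERNET_KEY variable name assumes the docker-airflow entrypoint reads it; the value shown is a placeholder, and the same entry must appear on every Airflow service:

```yaml
# Set the same fernet_key on every Airflow service in
# docker-compose-LocalExecutor.yml (placeholder value shown):
environment:
  - FERNET_KEY=<paste the generated key here>
```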
See the Airflow documentation for details.
- Common dependencies between this module and the parent module should be added to the parent's "requirements.txt"; these dependencies are copied and installed into the container image by the build script
- Create a file "airflow-requirements.txt" with the desired Python modules
- Mount this file as a volume: -v $(pwd)/airflow-requirements.txt:/airflow-requirements.txt
- The entrypoint.sh script executes the pip install command (with the --user option)
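Put together, mounting the requirements file in the compose file might look like this sketch (the service name `webserver` is an assumption; repeat the volume entry for scheduler/worker services if your compose file defines them):

```yaml
# Hypothetical compose fragment mounting airflow-requirements.txt
# so entrypoint.sh can pip-install it at container startup:
services:
  webserver:
    volumes:
      - ./airflow-requirements.txt:/airflow-requirements.txt
```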
- Airflow: localhost:8080
- Flower: localhost:5555
Easy scaling using docker-compose:
docker-compose scale worker=5
This can be used to scale to a multi node setup using docker swarm.
Fork, improve and PR. ;-)