A small boilerplate project for data science, built on Docker.

To get started, copy the example environment files:
cp config/.env.jupyter.example .env.jupyter
cp config/.env.minio.example .env.minio
cp config/.env.postgres.example .env.postgres
cp config/.env.airflow.example .env.airflow
cp config/.env.database.example .env.database
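As an illustration, the Postgres env file typically carries the stock postgres image variables; the values below are placeholders, and the project's example files may use different names:

# .env.postgres -- illustrative placeholders only, use your own credentials
POSTGRES_USER=ds_user
POSTGRES_PASSWORD=change-me
POSTGRES_DB=warehouse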
Provide appropriate values in the .env files, then run:

docker-compose up -d

If you want to persist Jupyter settings, commit the container before taking it down:
docker commit jupyter   # optionally pass an image name, e.g. docker commit jupyter jupyter:saved
# and then
docker-compose down

Otherwise you can simply stick to docker-compose start and docker-compose stop.
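The stop/start route keeps the containers (and their writable layers) around, so no commit is needed:

# pause the stack without removing containers; Jupyter settings survive
docker-compose stop
# resume later
docker-compose start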
Services:
jupyterlab: Jupyter notebooks and JupyterLab for interactive analysis (localhost:8888)
minio: An S3-compatible object store, similar to AWS S3 (localhost:9001)
postgres: Relational database
metabase: Dashboards and data exploration
superset: Another visualisation and exploration service
airflow: Scheduler and task runner (localhost:8080)
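Once the stack is up, a quick sanity check against the ports listed above (only the ports this README names; the metabase and superset ports depend on your compose file):

docker-compose ps               # confirm every service is running
curl -I http://localhost:8888   # JupyterLab
curl -I http://localhost:9001   # MinIO
curl -I http://localhost:8080   # Airflow webserver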
Structure:
config: Environment variables, keys, secrets, etc.
jupyter: Notebooks
dags: Airflow DAG definitions (the tasks the scheduler runs)
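Files dropped into dags are picked up by the Airflow scheduler. One way to verify from the host, assuming the compose service is named airflow (the CLI differs by version: airflow dags list on 2.x, airflow list_dags on 1.x):

# check that Airflow sees your DAGs (service name 'airflow' is an assumption)
docker-compose exec airflow airflow dags list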
Data folders:
data: Data mounted onto minio
db_data: Postgres database persistent volume
metabase_data: Metabase data persistent volume
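Since these are plain host directories (bind mounts, per the layout above), backing up the stack can be as simple as archiving them while the containers are stopped:

# snapshot the persistent state; run after docker-compose stop to avoid mid-write copies
tar czf backup.tgz data/ db_data/ metabase_data/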
Inspired by data-science-stack