- Install Docker Desktop
- Create a .env file in the repo root by copying .env.template
- Fill in the desired POSTGRES_PASSWORD value in the .env file
- Build containers:

      docker compose up -d --build

Check out the jupyterlab container logs and click on the link that looks like http://127.0.0.1:8089/lab?token=...
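
The .env steps in the list above can be sketched as a small shell helper. This is only an illustration: `init_env` is a hypothetical name, and it assumes the template stores the password on a line starting with `POSTGRES_PASSWORD=` (if no such line exists, one is appended).

```shell
# Hypothetical helper: create an env file from a template and set POSTGRES_PASSWORD.
# Usage: init_env TEMPLATE TARGET PASSWORD
init_env() {
  template="$1"
  target="$2"

  [ -f "$template" ] || { echo "template not found: $template" >&2; return 1; }
  # Refuse to overwrite an env file that already exists.
  [ ! -e "$target" ] || { echo "already exists: $target" >&2; return 1; }

  # Pass the password through the environment so awk does not
  # interpret backslashes in it.
  PW="$3" awk '
    /^POSTGRES_PASSWORD=/ { print "POSTGRES_PASSWORD=" ENVIRON["PW"]; found = 1; next }
    { print }
    END { if (!found) print "POSTGRES_PASSWORD=" ENVIRON["PW"] }
  ' "$template" > "$target"
}
```

From the repo root this would be called as `init_env .env.template .env 'your-password'`. Every other line of the template is copied unchanged.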
Open the Trino CLI:

    docker exec -it trino trino

At the Trino prompt:

    SHOW SCHEMAS FROM db;
    USE db.public;
    SHOW TABLES FROM public;

Start the Spark services:

    docker compose --profile spark up -d

- Spark Master UI: http://localhost:9090
- Spark Worker A UI: http://localhost:10091
- Spark Worker B UI: http://localhost:10092
- Spark History Server: http://localhost:4040 (when a job is running)
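
To check the Spark UIs listed above without opening a browser, something like the following could be used. It is a sketch that assumes `curl` is installed on the host; `check_spark_uis` and the short names it prints are made up for illustration, while the URLs are the ones from the list.

```shell
# Print "up" or "down" for each Spark UI, based on whether it answers over HTTP.
check_spark_uis() {
  for entry in \
    "master=http://localhost:9090" \
    "worker-a=http://localhost:10091" \
    "worker-b=http://localhost:10092" \
    "history=http://localhost:4040"
  do
    name="${entry%%=*}"   # text before the first "="
    url="${entry#*=}"     # text after the first "="
    if curl --silent --fail --max-time 3 --output /dev/null "$url"; then
      echo "$name up $url"
    else
      echo "$name down $url"
    fi
  done
}
```

Run `check_spark_uis` after the Spark profile has started. The port 4040 entry is expected to report down unless a job is running.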
Submit the example SparkPi job from the spark-master container:

    docker exec -it spark-master /bin/bash
    cd /opt/spark/bin
    ./spark-submit --master spark://0.0.0.0:7077 \
      --name spark-pi \
      --class org.apache.spark.examples.SparkPi \
      local:///opt/spark/examples/jars/spark-examples_2.12-3.5.1.jar 100

Connect with Beeline from the spark-master container:

    docker exec -it spark-master /bin/bash
    ./bin/beeline
    !connect jdbc:hive2://localhost:10000 scott tiger

Then run some example statements:

    show databases;
    create table hive_example(a string, b int) partitioned by(c int);
    alter table hive_example add partition(c=1);
    insert into hive_example partition(c=1) values('a', 1), ('a', 2),('b',3);
    select count(distinct a) from hive_example;
    select sum(b) from hive_example;

Open cqlsh in the scylla-1 container:

    docker exec -it scylla-1 cqlsh

Then create a keyspace and a table, and insert a row:

    CREATE KEYSPACE data
    WITH replication = {'class': 'SimpleStrategy', 'replication_factor' : 3};
    USE data;
    CREATE TABLE data.users (
        user_id uuid PRIMARY KEY,
        first_name text,
        last_name text,
        age int
    );
    INSERT INTO data.users (user_id, first_name, last_name, age)
    VALUES (123e4567-e89b-12d3-a456-426655440000, 'Polly', 'Partition', 77);
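
To confirm the insert without staying in an interactive session, a one-shot query can be run through cqlsh's `-e` option. The wrapper below is a minimal sketch; `cql` is a hypothetical function name, and it assumes the container is called scylla-1 as in the command above.

```shell
# Run a single CQL statement in the scylla-1 container and print the result.
cql() {
  docker exec scylla-1 cqlsh -e "$1"
}
```

For example, `cql "SELECT first_name, last_name, age FROM data.users;"` should list the row inserted above once the cluster is up.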
Create a Kafka topic:

    docker exec -it kafka kafka-topics.sh --create --topic test --bootstrap-server 127.0.0.1:9095

Check out the .env.template file. Copy the Airflow-related variables into your .env file and
update their values where necessary.
The Airflow UI is available at http://localhost:8081/.
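
Copying the Airflow variables by hand can be approximated with the sketch below. It rests on an assumption that is not confirmed here: that the Airflow-related variables in .env.template all start with `AIRFLOW`. The function name `copy_airflow_vars` is hypothetical.

```shell
# Append AIRFLOW* lines from the template to the env file,
# skipping any variable the env file already defines.
# Usage: copy_airflow_vars TEMPLATE TARGET
copy_airflow_vars() {
  template="$1"
  target="$2"
  grep '^AIRFLOW' "$template" | while IFS= read -r line; do
    key="${line%%=*}"   # variable name, without the value
    grep -q "^${key}=" "$target" 2>/dev/null || printf '%s\n' "$line" >> "$target"
  done
}
```

After `copy_airflow_vars .env.template .env`, the copied values still need to be reviewed and edited, since the template only supplies defaults or placeholders.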
To use the Slack integration, create a Slack app and set the AIRFLOW_CONN_SLACK_API_DEFAULT
environment variable to a value containing the Slack API key. If you don't want to use this integration,
remove the AIRFLOW_CONN_SLACK_API_DEFAULT variable from your .env file.
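
As a rough illustration of what the variable might look like, the sketch below prints a .env line in Airflow's connection URI form, with the Slack token in the password position. This is an assumption about the expected format, not something this setup confirms; check it against the Airflow Slack provider version you run. `slack_conn_line` is a hypothetical helper, and tokens containing characters outside letters, digits, and hyphens would need URL-encoding first.

```shell
# Print a .env line that stores a Slack token as an Airflow connection URI.
# Usage: slack_conn_line TOKEN
slack_conn_line() {
  printf 'AIRFLOW_CONN_SLACK_API_DEFAULT=slack://:%s@\n' "$1"
}
```

Something like `slack_conn_line "$SLACK_TOKEN" >> .env` would append the line, where `SLACK_TOKEN` is a shell variable you set yourself.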
Mongo is available on port 27018.