A real-time streaming tool for stocks and cryptocurrencies.
Yampa is a data engineering project that streams real-time cryptocurrency trade data from Coincap into a Redpanda cluster and processes it using ClickHouse for real-time analytics. The processed data can be used for various analyses, including the statistical distributions of trading patterns.
- Data Collection: Yampa CLI (Go)
- Data Streaming: Redpanda (Kafka API-compatible)
- Stream Processing: Redpanda Connect
- Storage & Analytics: ClickHouse
Trade Distributions - A companion project that visualizes the statistical patterns in cryptocurrency trading data collected by Yampa. The analysis reveals interesting patterns in trade volume distributions that resemble well-known probability distributions.
- Create a docker network for the Redpanda containers:
  docker network create --driver bridge --attachable redpanda-net
- Start your Redpanda cluster:
  cd redpanda && docker compose up -d && cd ..
- Start streaming trades from Coincap into Redpanda:
  - Copy the contents of yampa-cli/.env.example into yampa-cli/.env
  - Start the yampa-cli container:
    cd yampa-cli && docker compose up --build -d && cd ..
- View the trade data in the Redpanda Console UI at localhost:8080
- Copy the contents of clickhouse/.env.example into clickhouse/.env
- Start your ClickHouse container:
  cd clickhouse && docker compose up -d && cd ..
- Create the trades database and raw_trades table in the ClickHouse UI at localhost:8123 using the clickhouse/trades/tables/raw_trades.sql schema.
- Copy the contents of redpanda-connect/.env.example into redpanda-connect/.env
- Start your Redpanda Connect cluster:
  cd redpanda-connect && docker compose up -d && cd ..
- Create the ClickHouse sink connector in the Redpanda Console UI using the redpanda-connect/connectors/clickhouse-sink.json configuration.
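
The three "copy the contents of .env.example into .env" steps above can be scripted. The following is a minimal sketch, assuming it is run from the repository root and that the directory names match those in the steps. `copy_env_files` is a hypothetical helper for illustration, not part of the project. It leaves any existing .env file untouched so local settings are not overwritten.

```shell
# Sketch: create each component's .env from its .env.example.
# copy_env_files is a hypothetical helper; the directory names are taken
# from the setup steps and are assumed to exist under the given root.
# The function body runs in a subshell so its variables do not leak.
copy_env_files() (
  root="${1:-.}"
  for dir in yampa-cli clickhouse redpanda-connect; do
    example="$root/$dir/.env.example"
    target="$root/$dir/.env"
    if [ ! -f "$example" ]; then
      # Nothing to copy for this component; report and move on.
      echo "skip: $example not found" >&2
      continue
    fi
    if [ -f "$target" ]; then
      # Never overwrite an existing .env.
      echo "keep: $target already exists"
    else
      cp "$example" "$target"
      echo "created: $target"
    fi
  done
)
```

After defining the function, calling `copy_env_files` with no argument operates on the current directory. Review the resulting .env files before starting the containers, since the example values may need to be adjusted.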
This project is open source and available under the MIT License.