This repository demonstrates how to build a simple but real-time personal analytics pipeline with ZenML. It simultaneously tracks your typing speed and polls Spotify for the currently playing track, segments your keystrokes by track, fetches each artist's genres, computes words-per-minute (WPM) per track, and visualizes the results.
ZenML is an MLOps framework that lets you define:
- Steps: Modular, reusable functions that produce/consume artifacts.
- Pipelines: Ordered sequences of steps wired together.
- Stacks: Execution environments, artifact stores, orchestrators, etc.
Here, our pipeline has two steps:
- `collect_and_track`
  - Listens to your keyboard events for a user-configurable duration
  - Polls Spotify's "currently playing" API at a user-configurable interval
  - Splits your keystrokes into time segments per track
  - Fetches each artist's genres in batch
  - Computes WPM for each track segment
  - Returns a `pandas.DataFrame` with one row per track segment containing: `track_id`, `track_name`, `artist_name`, `genres`, `duration_seconds`, `keypresses`, `wpm`
- `visualize`
  - Takes the DataFrame of per-track segments
  - Computes average WPM per track (or per genre)
  - Plots a bar chart (`data/wpm_by_genre.png`)
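As a rough sketch, and assuming the modern `@step` API, the two step signatures look something like this (the real definitions live under `steps/` and may differ):

```python
# Hypothetical signatures; the real implementations live under steps/.
import pandas as pd
from zenml import step


@step
def collect_and_track(duration: int = 60, poll_interval: int = 1) -> pd.DataFrame:
    """Listen for keystrokes, poll Spotify, and return one row per track segment."""
    ...


@step
def visualize(df: pd.DataFrame) -> None:
    """Aggregate WPM per track (or genre) and save a bar chart under data/."""
    ...
```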
These steps are wired into a single ZenML pipeline:

```python
from zenml import pipeline

# Step implementations live under steps/ (import paths assumed from the layout below).
from steps.collect_and_track import collect_and_track
from steps.visualize import visualize


@pipeline
def correlation_pipeline(duration: int = 60, poll_interval: int = 1):
    df = collect_and_track(duration=duration, poll_interval=poll_interval)
    visualize(df)
```
The repository is laid out as follows:

```
├── data/                        ← token cache & output plots
├── pipelines/
│   └── correlation_pipeline.py  ← pipeline definition
├── run.py                       ← CLI entrypoint
├── steps/
│   ├── collect_and_track.py     ← combined data-collection step
│   └── visualize.py             ← plotting step
├── utils/
│   └── spotify_auth.py          ← OAuth helper
├── .python-version              ← Python version for pyenv
├── .env-example                 ← Environment variables template
├── requirements.txt
└── README.md
```
- Clone & Install

  ```bash
  git clone https://github.com/euxhenh/spotype.git
  cd spotype

  # Optional: set Python version with pyenv (if using pyenv)
  pyenv install 3.12.0   # or your preferred version
  pyenv local 3.12.0

  # Create and activate a virtual environment
  python -m venv venv
  source venv/bin/activate

  # Install dependencies
  pip install -r requirements.txt
  ```

- Initialize ZenML

  ```bash
  zenml init
  ```

  You'll see a default stack with a local orchestrator & artifact store.
- Configure Spotify OAuth

  Set up a Spotify Developer app:
  - Go to the Spotify Developer Dashboard
  - Create a new app
  - Set the redirect URI to: `http://127.0.0.1:8888/callback`
  - Copy your Client ID and Client Secret

  Create the environment file:

  ```bash
  cp .env-example .env
  ```

  Edit `.env` with your actual Spotify credentials:

  ```bash
  SPOTIPY_CLIENT_ID="your_actual_client_id_here"
  SPOTIPY_CLIENT_SECRET="your_actual_client_secret_here"
  SPOTIPY_REDIRECT_URI="http://127.0.0.1:8888/callback"
  ```

  Test authentication:

  ```bash
  python auth_check.py
  ```

  This will open your browser, ask you to grant the "Playback State" permission, and save a token to `data/.spotify_token.json`.
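  For reference, the OAuth helper in `utils/spotify_auth.py` amounts to something like the sketch below; the function name is an assumption, but the scope and cache path follow the behaviour described above:

  ```python
  # Hypothetical sketch of utils/spotify_auth.py; names are illustrative.
  import os

  import spotipy
  from spotipy.oauth2 import SpotifyOAuth


  def get_spotify_client() -> spotipy.Spotify:
      """Return an authenticated Spotify client, caching the token under data/."""
      auth_manager = SpotifyOAuth(
          client_id=os.environ["SPOTIPY_CLIENT_ID"],
          client_secret=os.environ["SPOTIPY_CLIENT_SECRET"],
          redirect_uri=os.environ["SPOTIPY_REDIRECT_URI"],
          scope="user-read-playback-state",       # the "Playback State" permission
          cache_path="data/.spotify_token.json",  # token cache used by the demo
      )
      return spotipy.Spotify(auth_manager=auth_manager)
  ```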
- Run the Pipeline

  ```bash
  python run.py --duration 90
  ```

  `--duration` (seconds): total time to track typing & poll Spotify. For a better solution, this should be turned into a cron job, but for the purposes of this demo it should suffice.

  ZenML will:
  - Spin up a pipeline run
  - Execute `collect_and_track` (real-time keystroke + track polling)
  - Execute `visualize` (bar chart of WPM by genre)
  - Save the plot to `data/wpm_by_track.png`

  You can also view run metadata & logs in the local ZenML dashboard.
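  `run.py` itself is a thin CLI wrapper around the pipeline; a minimal sketch (argument handling is assumed, and only `--duration` is shown) could look like:

  ```python
  # Hypothetical sketch of run.py; the real entrypoint may parse more options.
  import argparse

  from pipelines.correlation_pipeline import correlation_pipeline

  if __name__ == "__main__":
      parser = argparse.ArgumentParser(
          description="Track typing speed against the currently playing Spotify track."
      )
      parser.add_argument(
          "--duration", type=int, default=60,
          help="Total seconds to listen for keystrokes and poll Spotify.",
      )
      args = parser.parse_args()

      # Calling the @pipeline-decorated function triggers a ZenML run;
      # poll_interval keeps its default of 1 second.
      correlation_pipeline(duration=args.duration)
  ```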
- Keyboard Listener
  - Uses `pynput` to timestamp every key press.
- Spotify Polling
  - Uses `spotipy` + OAuth to fetch your currently playing track.
  - Detects when a track changes to split keystrokes into segments.
- Genre Enrichment
  - Batches artist IDs through `sp.artists(...)` to fetch each artist's genre list.
- WPM Calculation
  - Counts key presses in each segment, converts them to "words" (assuming 5 chars/word), and normalizes by segment duration to compute WPM.
- Visualization
  - Groups segments by track (or genre) and plots average WPM.
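A few illustrative sketches of these pieces follow; all helper names and data structures are assumptions unless stated otherwise. The keystroke side can be handled by a non-blocking `pynput` listener that records a timestamp per key press:

```python
# Sketch: capture key-press timestamps in a background thread with pynput.
import time

from pynput import keyboard

keypress_times: list[float] = []


def on_press(key) -> None:
    """Record the wall-clock time of every key press."""
    keypress_times.append(time.time())


listener = keyboard.Listener(on_press=on_press)
listener.start()   # non-blocking; runs in its own thread
# ... the main thread polls Spotify for `duration` seconds ...
# listener.stop() once collection ends
```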
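Track changes can then be detected by polling `sp.currently_playing()` and cutting a new segment whenever the track ID changes, roughly as in this sketch (`get_spotify_client` is the assumed helper from the OAuth sketch; `duration` and `poll_interval` mirror the step parameters):

```python
# Sketch: poll Spotify and split time into per-track segments.
import time

from utils.spotify_auth import get_spotify_client  # assumed helper sketched above

duration, poll_interval = 60, 1      # step parameters (defaults shown)
sp = get_spotify_client()

segments: list[dict] = []            # one dict per (track, time window)
current = None                       # (track_id, track_name, artist_id, start_time)

start = time.time()
while time.time() - start < duration:
    playback = sp.currently_playing()            # None when nothing is playing
    item = playback.get("item") if playback else None
    track_id = item["id"] if item else None

    if current is None or track_id != current[0]:
        if current is not None:
            # The track changed: close the previous segment.
            segments.append({
                "track_id": current[0], "track_name": current[1],
                "artist_id": current[2], "start": current[3], "end": time.time(),
            })
        current = (
            track_id,
            item["name"] if item else None,
            item["artists"][0]["id"] if item else None,
            time.time(),
        )

    time.sleep(poll_interval)

# The final open segment is closed the same way once the loop ends.
```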
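Genre enrichment and the WPM formula then operate on those segments and timestamps; a sketch reusing the names from the previous snippets:

```python
# Sketch: batch-fetch artist genres and compute WPM per segment.
artist_ids = list({s["artist_id"] for s in segments if s["artist_id"]})

genres_by_artist: dict[str, list[str]] = {}
for i in range(0, len(artist_ids), 50):          # sp.artists() accepts up to 50 IDs
    batch = sp.artists(artist_ids[i:i + 50])["artists"]
    genres_by_artist.update({a["id"]: a["genres"] for a in batch})

for seg in segments:
    presses = [t for t in keypress_times if seg["start"] <= t < seg["end"]]
    minutes = (seg["end"] - seg["start"]) / 60
    seg["keypresses"] = len(presses)
    seg["wpm"] = (len(presses) / 5) / minutes if minutes > 0 else 0.0  # 5 chars ≈ 1 word
    seg["genres"] = genres_by_artist.get(seg["artist_id"], [])
```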
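Finally, the visualization step is essentially a pandas groupby plus a matplotlib bar chart; for example:

```python
# Sketch: average WPM per track as a horizontal bar chart.
import matplotlib.pyplot as plt
import pandas as pd


def plot_wpm(df: pd.DataFrame, out_path: str = "data/wpm_by_genre.png") -> None:
    """Plot average WPM per track and save the figure to out_path."""
    avg = df.groupby("track_name")["wpm"].mean().sort_values()
    ax = avg.plot(kind="barh", figsize=(8, 0.4 * len(avg) + 1))
    ax.set_xlabel("Average WPM")
    ax.set_ylabel("Track")
    plt.tight_layout()
    plt.savefig(out_path)
    plt.close()
```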
- Focus vs. Energy buckets: map genres to "focus"/"energetic" categories.
- ZenML Secrets: store your Spotify credentials securely in a managed secrets store.
- Cloud Orchestration: switch your stack from local to Airflow/Prefect/Kubernetes for scheduled, containerized runs.
- Dashboards: integrate with Streamlit or Plotly Dash for live exploration.
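For the focus-vs-energy idea, one simple starting point would be a keyword-based mapping from Spotify genre strings to buckets (the keywords below are purely illustrative):

```python
# Illustrative genre-to-bucket mapping for the "focus vs. energetic" extension.
FOCUS_KEYWORDS = ("ambient", "classical", "lo-fi", "piano", "jazz")
ENERGETIC_KEYWORDS = ("metal", "edm", "punk", "dance", "drum and bass")


def bucket_for(genres: list[str]) -> str:
    """Assign 'focus', 'energetic', or 'other' based on a segment's genre list."""
    joined = " ".join(genres).lower()
    if any(k in joined for k in FOCUS_KEYWORDS):
        return "focus"
    if any(k in joined for k in ENERGETIC_KEYWORDS):
        return "energetic"
    return "other"
```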