solana-clipper

Turn long crypto-podcast recordings (Supertalks, Ownership, and similar Solana-ecosystem shows) into transcripts and clip-ready post-production briefs that a video editor can drop straight into CapCut.

Two stages, one repo:

1. Transcription: transcribe.py runs every MP3 in audio/ through OpenAI Whisper, using vocabulary.txt as a proper-noun hint to reduce mishearings. Outputs output/<episode>/raw_transcript.txt and transcript.srt.
2. Clip pipeline: driven by Claude Code (or any LLM coding agent) following prompts/master_prompt.md. It cleans the transcript, proposes 5–8 standalone clip candidates, and writes per-clip briefs (.md) plus rebased SRTs (.srt) into output/<episode>/<episode>_clips/NN_slug/. The <episode>_clips/ folder is then handed off to a video editor who already has the master video.

The vocabulary list (vocabulary.txt) is the longest-lived asset here — it grows with every episode as new project names, tokens, and people get confirmed.
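As a rough illustration of how a vocabulary file can feed Whisper as a hint, here is a minimal sketch. It is not the code in transcribe.py: the helper names are hypothetical, and it assumes vocabulary.txt holds one term per line with optional `#` comment lines. The API call uses the `prompt` and `response_format` parameters of the OpenAI transcription endpoint.

```python
from pathlib import Path


def load_vocabulary(path: Path) -> list[str]:
    """Read one term per line, skipping blanks, '#' comments, and duplicates.

    The one-term-per-line layout is an assumption about vocabulary.txt.
    """
    terms: list[str] = []
    seen: set[str] = set()
    for line in path.read_text(encoding="utf-8").splitlines():
        term = line.strip()
        if not term or term.startswith("#"):
            continue
        key = term.casefold()  # case-insensitive de-duplication
        if key not in seen:
            seen.add(key)
            terms.append(term)
    return terms


def build_prompt(terms: list[str]) -> str:
    """Join the terms into one comma-separated hint string."""
    return ", ".join(terms)


def transcribe_to_srt(mp3_path: Path, vocab_path: Path) -> str:
    """Send one MP3 to the Whisper API and return SRT text (needs OPENAI_API_KEY)."""
    from openai import OpenAI  # imported lazily so the helpers above work offline

    client = OpenAI()
    with mp3_path.open("rb") as audio:
        return client.audio.transcriptions.create(
            model="whisper-1",
            file=audio,
            prompt=build_prompt(load_vocabulary(vocab_path)),
            response_format="srt",
        )
```

Keeping the file parsing separate from the API call means the vocabulary handling can be checked without a network connection or an API key.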

Quickstart (macOS)

One line in Terminal sets up everything — Homebrew, Python, ffmpeg, the repo itself, dependencies, and your .env. You'll be prompted to paste your OpenAI API key once.

curl -fsSL https://raw.githubusercontent.com/ozgtg797/solana-clipper/main/install.sh | bash

The repo lands in ~/solana-clipper. Re-running the command later just pulls the latest.

If the repo is still private, the raw URL above won't work without authentication. Either make the repo public (Settings → General → Change repository visibility; the repo holds no secrets because .env is gitignored), or clone it with the GitHub CLI and run the installer locally: gh repo clone ozgtg797/solana-clipper && cd solana-clipper && bash install.sh.

Manual setup

If you'd rather install everything by hand:

git clone https://github.com/ozgtg797/solana-clipper.git
cd solana-clipper
pip install -r requirements.txt
cp .env.example .env
# then edit .env and put your OPENAI_API_KEY there

setup.sh is a separate health-check that verifies your machine has everything ready. See SETUP.md for the full walkthrough (Serbian).

Usage

Transcribe new episodes

Drop MP3s into audio/, then:

python3 transcribe.py

For each MP3 the script writes output/<stem>/raw_transcript.txt and output/<stem>/transcript.srt. Episodes that already have both files are skipped, so re-running is safe.
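The skip rule described above can be sketched as follows. This is an illustration of the idea rather than the logic in transcribe.py; the function names are hypothetical, and only the directory layout and the two output filenames come from this README.

```python
from pathlib import Path

# The two files that mark an episode as fully transcribed.
EXPECTED_OUTPUTS = ("raw_transcript.txt", "transcript.srt")


def is_done(mp3: Path, output_dir: Path) -> bool:
    """True only if both expected outputs exist under output/<stem>/."""
    episode_dir = output_dir / mp3.stem
    return all((episode_dir / name).is_file() for name in EXPECTED_OUTPUTS)


def pending_episodes(audio_dir: Path, output_dir: Path) -> list[Path]:
    """Return the MP3s that still need transcription, in sorted order."""
    return sorted(
        mp3 for mp3 in audio_dir.glob("*.mp3") if not is_done(mp3, output_dir)
    )
```

Requiring both files means an episode whose run was interrupted after the first file was written is picked up again on the next run.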

MP3s over 25 MB must be compressed before the Whisper API will accept them. Known-good recipe for clean speech (mono, 16 kHz sample rate, 32 kbps):

ffmpeg -i input.mp3 -ac 1 -ar 16000 -b:a 32k output.mp3
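The 25 MB check and the ffmpeg recipe above can be scripted. The sketch below is an assumption-laden helper, not part of the repo: the function names are hypothetical, and treating 25 MB as 25 × 1024 × 1024 bytes is a guess at the exact cutoff.

```python
from pathlib import Path

# Assumed byte cutoff for the 25 MB upload limit.
MAX_UPLOAD_BYTES = 25 * 1024 * 1024


def needs_compression(mp3: Path) -> bool:
    """True if the file is larger than the assumed upload limit."""
    return mp3.stat().st_size > MAX_UPLOAD_BYTES


def ffmpeg_command(src: Path, dst: Path) -> list[str]:
    """Build the README's recipe: mono, 16 kHz sample rate, 32 kbps."""
    return [
        "ffmpeg", "-i", str(src),
        "-ac", "1",       # downmix to one channel
        "-ar", "16000",   # resample to 16 kHz
        "-b:a", "32k",    # target audio bitrate
        str(dst),
    ]
```

To run it, pass the list to `subprocess.run(ffmpeg_command(src, dst), check=True)`. At 32 kbps a 25 MB file holds somewhere around 105 to 110 minutes of audio, so very long episodes may still exceed the limit after compression.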

Make clips for an episode

Open the repo with Claude Code and ask it to "make clips for <episode>" (or use the Serbian triggers napravi klipove, "make clips", and pokreni clips, "run clips"). It reads prompts/master_prompt.md and runs the pipeline:

1. Pre-analysis: flags ambiguous proper nouns and asks you to confirm spellings (interactive).
2. After confirmation, appends new terms to vocabulary.txt.
3. Writes cleaned.srt (intermediate).
4. Proposes a clip shortlist; you pick which to keep.
5. Writes output/<episode>/<episode>_clips/NN_slug/NN_slug.md + NN_slug.srt for each chosen clip.

The final folder is what you ship to the video editor.
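"Rebased" here means each clip's SRT starts at 00:00:00,000 instead of carrying timestamps from the full episode. In this repo that step is performed by the LLM agent following master_prompt.md; the sketch below only illustrates the arithmetic. The function names are hypothetical, and clamping cues that straddle a clip boundary is an assumption, not a documented rule of the pipeline.

```python
import re
from datetime import timedelta

TIMESTAMP = re.compile(r"(\d{2}):(\d{2}):(\d{2}),(\d{3})")


def parse_ts(text: str) -> timedelta:
    """Parse an SRT timestamp such as 00:01:02,500."""
    h, m, s, ms = (int(part) for part in TIMESTAMP.fullmatch(text.strip()).groups())
    return timedelta(hours=h, minutes=m, seconds=s, milliseconds=ms)


def format_ts(value: timedelta) -> str:
    """Format a timedelta as an SRT timestamp, clamping negatives to zero."""
    total_ms = max(0, value // timedelta(milliseconds=1))
    h, rem = divmod(total_ms, 3_600_000)
    m, rem = divmod(rem, 60_000)
    s, ms = divmod(rem, 1000)
    return f"{h:02d}:{m:02d}:{s:02d},{ms:03d}"


def rebase_srt(srt_text: str, clip_start: str, clip_end: str) -> str:
    """Keep cues that overlap the clip, shift them to clip time, renumber from 1."""
    start, end = parse_ts(clip_start), parse_ts(clip_end)
    cues = []
    for block in re.split(r"\n\s*\n", srt_text.strip()):
        lines = block.splitlines()
        if len(lines) < 3 or "-->" not in lines[1]:
            continue  # skip malformed blocks
        cue_start, cue_end = (parse_ts(t) for t in lines[1].split("-->"))
        if cue_end <= start or cue_start >= end:
            continue  # cue lies entirely outside the clip
        # Assumption: clamp cues that straddle a boundary to the clip window.
        new_start = max(cue_start, start) - start
        new_end = min(cue_end, end) - start
        cues.append((new_start, new_end, lines[2:]))
    out = []
    for number, (new_start, new_end, text) in enumerate(cues, start=1):
        header = f"{number}\n{format_ts(new_start)} --> {format_ts(new_end)}\n"
        out.append(header + "\n".join(text))
    return "\n\n".join(out) + "\n"
```

Timestamps are handled as whole milliseconds throughout, which avoids floating-point drift when shifting cues.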

Layout

solana-clipper/
├── transcribe.py            # Whisper -> raw_transcript.txt + transcript.srt
├── vocabulary.txt           # proper-noun hints (constantly updated)
├── prompts/
│   └── master_prompt.md     # rules the LLM follows for the clips pipeline
├── CLAUDE.md                # short operational guide for Claude Code
├── SETUP.md                 # local-setup walkthrough (Serbian)
├── install.sh               # one-shot dep installer (macOS)
├── setup.sh                 # health check: verifies the machine is ready
├── audio/                   # gitignored — drop your MP3s here
└── output/                  # gitignored — generated artifacts per episode

Notes

The cleaned SRT is an intermediate artifact, not finished captions. Caption boundaries that slice through stutters or half-words still need a human pass with eyes/ears on the source video. Don't ship cleaned.srt directly.
English is the default transcription language. Flag Serbian (or other) episodes explicitly when running.
This tool is built for the Solana podcast ecosystem but the pipeline (Whisper + vocabulary hints + LLM-driven clip extraction) generalizes to any long-form spoken-word content.
