A powerful and easy-to-use tool for transcribing podcast MP3 files into clean, readable transcripts using OpenAI's Whisper AI model.
- High Accuracy: Uses OpenAI's Whisper model for state-of-the-art transcription
- Batch Processing: Transcribe entire folders of MP3 files at once
- Multiple Formats: Save transcripts as TXT, JSON, and SRT (subtitle) formats
- Timestamps: Optional timestamps for each segment
- Language Support: Auto-detects language or specify manually
- Clean Output: Beautiful, readable transcripts
- Progress Tracking: Visual progress bars for long transcriptions
- Flexible Models: Choose from 5 model sizes (speed vs accuracy)
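Under the hood the tool drives the openai-whisper Python API; the sketch below shows roughly what a single-file transcription looks like and what Whisper returns (a minimal sketch only; `episode.mp3` is a placeholder filename, not a file shipped with this repo):

```python
# Minimal sketch of a single-file transcription with the openai-whisper API.
# "episode.mp3" is a placeholder filename.
import whisper

model = whisper.load_model("base")        # one of: tiny, base, small, medium, large
result = model.transcribe("episode.mp3")  # auto-detects the language by default

print(result["language"])                 # detected language code, e.g. "en"
print(result["text"])                     # the full transcript as one string
for seg in result["segments"]:            # per-segment timestamps, in seconds
    print(f"[{seg['start']:.1f} -> {seg['end']:.1f}] {seg['text']}")
```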
- Python 3.8 or higher
- macOS, Linux, or Windows
- At least 2GB RAM (more for larger models)
- Internet connection (first time only, to download models)
- Clone or download this repository
- Install dependencies:
  pip install -r requirements.txt
  Note for macOS with Apple Silicon (M1/M2/M3): If you have issues, install PyTorch first:
  pip install torch torchaudio
  pip install -r requirements.txt
- Verify installation:
  python transcribe.py --help
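As an extra sanity check, you can confirm that the underlying packages import cleanly (this assumes the tool depends on the openai-whisper and PyTorch packages, as the install steps above suggest):

```python
# Sanity check: both imports succeed only if the dependencies installed correctly.
import torch    # PyTorch backend that Whisper runs on
import whisper  # the openai-whisper package

print("PyTorch version:", torch.__version__)
print("Available Whisper models:", whisper.available_models())
```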
Transcribe all MP3 files in a folder:
python transcribe.py /path/to/your/podcasts

This will:
- Find all MP3 files in the specified folder
- Transcribe each file using the `base` model
- Save transcripts to a `transcripts/` folder
- Generate both TXT and JSON formats
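Conceptually, the batch run amounts to something like the sketch below (illustrative only, not the tool's actual source; the folder path and output directory name are placeholders):

```python
# Illustrative sketch of the batch workflow: find MP3s, transcribe each one,
# and save TXT and JSON transcripts. Not the tool's actual source code.
import json
from pathlib import Path

import whisper

def transcribe_folder(folder: str, output_dir: str = "transcripts", model_name: str = "base") -> None:
    model = whisper.load_model(model_name)
    out = Path(output_dir)
    out.mkdir(exist_ok=True)

    for mp3 in sorted(Path(folder).glob("*.mp3")):
        result = model.transcribe(str(mp3))

        # Plain-text transcript
        (out / f"{mp3.stem}.txt").write_text(result["text"].strip(), encoding="utf-8")

        # Full result (text, segments, detected language) as JSON
        (out / f"{mp3.stem}.json").write_text(
            json.dumps(result, ensure_ascii=False, indent=2, default=str),
            encoding="utf-8",
        )

transcribe_folder("/path/to/your/podcasts")
```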
Use a more accurate model:
python transcribe.py /path/to/podcasts --model medium

Specify output directory:

python transcribe.py /path/to/podcasts --output my_transcripts

Specify language (faster than auto-detect):

python transcribe.py /path/to/podcasts --language en

Choose output formats:

# Save as TXT only
python transcribe.py /path/to/podcasts --formats txt

# Save all formats (TXT, JSON, SRT)
python transcribe.py /path/to/podcasts --formats txt json srt

Complete example:

python transcribe.py ~/Downloads/podcasts \
    --model medium \
    --output transcripts \
    --language en \
    --formats txt json srt

Choose the right model for your needs:
| Model | Speed | Accuracy | RAM Usage | Best For |
|---|---|---|---|---|
| tiny | ⚡⚡⚡⚡⚡ | ⭐⭐ | ~1 GB | Quick drafts, testing |
| base | ⚡⚡⚡⚡ | ⭐⭐⭐ | ~1 GB | Default, good balance |
| small | ⚡⚡⚡ | ⭐⭐⭐⭐ | ~2 GB | Better accuracy |
| medium | ⚡⚡ | ⭐⭐⭐⭐⭐ | ~5 GB | High quality transcripts |
| large | ⚡ | ⭐⭐⭐⭐⭐⭐ | ~10 GB | Professional use |
Recommendation: Start with the `base` model. Upgrade to `medium` or `large` if you need better accuracy.
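If you are unsure which size to pick, a quick informal benchmark on one short episode can help (a sketch only; `sample.mp3` is a placeholder for any short file you have on hand):

```python
# Rough benchmark: transcribe one short file with each model size and
# compare wall-clock time. "sample.mp3" is a placeholder filename.
import time
import whisper

for name in ("tiny", "base", "small"):
    model = whisper.load_model(name)
    start = time.perf_counter()
    result = model.transcribe("sample.mp3")
    print(f"{name}: {time.perf_counter() - start:.1f}s, {len(result['text'])} characters")
```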
TXT output: Clean, readable text with two sections:
- Transcript with timestamps for each segment
- Clean transcript without timestamps (perfect for reading)
Example:
Transcript: podcast_episode_123.mp3
Generated: 2025-11-04 10:30:00
================================================================================
[00:00 -> 00:05] Welcome to the show, today we're talking about AI.
[00:05 -> 00:12] This is a fascinating topic that affects everyone.
================================================================================
CLEAN TRANSCRIPT (no timestamps):
Welcome to the show, today we're talking about AI. This is a fascinating topic that affects everyone.
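The `[MM:SS -> MM:SS]` labels are presumably just minute:second renderings of the segment start and end times, which Whisper reports in seconds:

```python
# How a segment time in seconds maps to the [MM:SS -> MM:SS] labels above.
def mmss(seconds: float) -> str:
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"

print(f"[{mmss(0.0)} -> {mmss(5.3)}]")  # prints: [00:00 -> 00:05]
```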
JSON output: Structured data with full details, including:
- Audio filename
- Detected language
- Complete text
- All segments with timestamps
- Generation timestamp
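A hypothetical way to work with the JSON output from Python (the key names used here mirror Whisper's own output and may not match this tool's exact schema; open a generated file to check):

```python
# Hypothetical reader for a generated JSON transcript. Key names ("text",
# "segments", "start", "end") are assumptions based on Whisper's output.
import json

with open("transcripts/podcast_episode_123.json", encoding="utf-8") as f:
    data = json.load(f)

print(data["text"][:200])  # first 200 characters of the full transcript
for seg in data["segments"]:
    print(seg["start"], "->", seg["end"], seg["text"])
```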
SRT output: Standard subtitle format that can be used with video players or video editing software.
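For reference, this is roughly how Whisper-style segments map onto SRT entries (a sketch of the standard format, not the tool's own code):

```python
# SRT format: a numbered entry, an "HH:MM:SS,mmm --> HH:MM:SS,mmm" time range,
# then the caption text, separated by blank lines.
def srt_timestamp(seconds: float) -> str:
    ms = int(round(seconds * 1000))
    hours, ms = divmod(ms, 3_600_000)
    minutes, ms = divmod(ms, 60_000)
    secs, ms = divmod(ms, 1_000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"

def segments_to_srt(segments) -> str:
    lines = []
    for i, seg in enumerate(segments, start=1):
        lines += [
            str(i),
            f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}",
            seg["text"].strip(),
            "",
        ]
    return "\n".join(lines)

print(segments_to_srt([{"start": 0.0, "end": 5.0, "text": "Welcome to the show."}]))
```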
Whisper supports 99+ languages. Common language codes:
- `en` - English
- `es` - Spanish
- `fr` - French
- `de` - German
- `it` - Italian
- `pt` - Portuguese
- `ja` - Japanese
- `ko` - Korean
- `zh` - Chinese
Or omit the `--language` flag entirely for auto-detection!
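If you call the Whisper API directly, the equivalent of `--language` is the `language` argument to `transcribe()`, which skips the detection pass (sketch only; `episode.mp3` is a placeholder filename):

```python
# Passing a language code up front skips Whisper's language-detection pass.
import whisper

model = whisper.load_model("base")
result = model.transcribe("episode.mp3", language="en")
print(result["text"])
```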
- First Run: The first time you run the tool, it will download the selected Whisper model (~100MB-3GB depending on size). This happens once per model.
- Audio Quality: Higher quality audio = better transcriptions. Whisper is very robust, though!
- Processing Time:
  - A 1-hour podcast takes ~5-10 minutes with the `base` model
  - Use `tiny` for quick tests
  - Use `medium` or `large` for important content
- Batch Processing: Process multiple files overnight if you have many podcasts.
- Storage: Transcripts are small (typically <100KB per hour of audio).
Common issues:

If Whisper itself is missing, install it directly:

pip install openai-whisper

If you run out of memory, use a smaller model:

python transcribe.py /path/to/podcasts --model tiny

If transcription is slow:
- Use a smaller model (`tiny` or `base`)
- Ensure no other heavy processes are running
- Check if your CPU supports optimizations

If accuracy is poor:
- Try a larger model (`medium` or `large`)
- Ensure audio quality is good
- Specify the language explicitly
# 1. Put your podcast MP3s in a folder
mkdir ~/podcasts_to_transcribe
# (copy your MP3 files there)
# 2. Run the transcription
python transcribe.py ~/podcasts_to_transcribe --model base
# 3. Find your transcripts
ls transcripts/
# 4. Open and enjoy!
open transcripts/podcast_episode.txt

Found a bug or want a feature? Feel free to:
- Open an issue
- Submit a pull request
- Share your feedback
MIT License - feel free to use this tool for personal or commercial projects!
- Built with OpenAI Whisper
- Inspired by the need for accessible podcast transcripts
If you find this tool useful, please star the repo and share it with others!
Questions? Open an issue or check the Whisper documentation.