AI-powered karaoke video creator using Docker Spleeter for ML-based vocal separation and OpenAI Whisper for transcription.
Automatically download trending music videos and convert them to karaoke format every day!
# Quick start - Set up daily automation
./setup_daily_karaoke.sh
# Or run manually now
python3 daily_karaoke_generator.py
# Test your setup
./test_setup.sh
Complete Daily Karaoke Guide
- Downloads trending music videos daily from YouTube
- Separates vocals from instrumentals using Docker Spleeter
- Creates karaoke videos with instrumental-only tracks
- Organizes output in timestamped folders
- Generates comprehensive reports and logs
- Automatic cleanup of old runs
Perfect for: Daily karaoke content creation, music libraries, entertainment venues
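The automatic cleanup of old runs could look roughly like the sketch below. It assumes run folders follow the `run_YYYYMMDD_HHMMSS` naming used for output, and the 7-day retention period and the function name `prune_old_runs` are illustrative assumptions, not taken from the project's scripts.

```python
import shutil
from datetime import datetime, timedelta
from pathlib import Path


def prune_old_runs(base_dir, keep_days=7, now=None):
    """Delete run_YYYYMMDD_HHMMSS folders older than keep_days; return their names.

    keep_days=7 is an assumed default, not the project's documented setting.
    """
    now = now or datetime.now()
    cutoff = now - timedelta(days=keep_days)
    removed = []
    for run_dir in sorted(Path(base_dir).glob("run_*")):
        try:
            # The folder name itself carries the run timestamp.
            stamp = datetime.strptime(run_dir.name, "run_%Y%m%d_%H%M%S")
        except ValueError:
            continue  # Not a timestamped run folder: leave it alone.
        if run_dir.is_dir() and stamp < cutoff:
            shutil.rmtree(run_dir)
            removed.append(run_dir.name)
    return removed
```

Reading the age from the folder name rather than the file modification time keeps the result independent of copies or backups that reset timestamps.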
- Docker Spleeter integration - True ML-based vocal/instrumental separation
- OpenAI Whisper transcription - Accurate speech-to-text with word-level timing
- Synchronized subtitles - Word highlighting with professional typography
- Multiple vocal levels - 0%, 5%, 10%, 15%, 25%, 50%, 75%
- Bilingual support - Original language + transliteration
- Cloud deployment - Google Cloud Run ready
- Daily automation - Scheduled runs via cron or systemd
- Python 3.9+
- Docker (for Spleeter audio separation)
- FFmpeg
- yt-dlp (for video downloads)
- OpenAI API key (optional, for Whisper transcription)
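A quick way to confirm the command-line prerequisites before a run is to look them up on `PATH`. This is only a sketch: the tool names come from the list above, while `missing_tools` is a hypothetical helper and says nothing about what `test_setup.sh` actually checks.

```python
import shutil
import sys

# Executables named in the prerequisites list.
REQUIRED_TOOLS = ("docker", "ffmpeg", "yt-dlp")


def missing_tools(tools=REQUIRED_TOOLS):
    """Return the tools from `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


def python_ok(minimum=(3, 9)):
    """Check the running interpreter against the documented minimum version."""
    return sys.version_info >= minimum


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing tools:", ", ".join(missing))
    if not python_ok():
        print("Python 3.9+ is required")
```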
# 1. Install dependencies
pip install -r requirements.txt
pip install yt-dlp
# 2. Pull Spleeter Docker image
docker pull researchdeezer/spleeter:3.8-2stems
# 3. Test setup
./test_setup.sh
# 4. Set up daily automation (optional)
./setup_daily_karaoke.sh
Create a .env file:
# Optional: For Whisper transcription
export OPENAI_API_KEY="your-openai-api-key"
# Optional: For cloud storage
export BUCKET_NAME="your-gcs-bucket"
# Optional: Custom port for web service
export PORT=8080
# Run daily karaoke generation manually
python3 daily_karaoke_generator.py
# Or use the wrapper script
./run_karaoke_now.sh
Output will be saved to: karaoke_daily_runs/run_YYYYMMDD_HHMMSS/
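The folder naming above maps directly onto a `strftime` pattern. A minimal sketch, assuming the folder is created at the start of each run (`new_run_dir` is a hypothetical helper name):

```python
from datetime import datetime
from pathlib import Path


def new_run_dir(base="karaoke_daily_runs", now=None):
    """Create and return <base>/run_YYYYMMDD_HHMMSS for the current run."""
    now = now or datetime.now()
    run_dir = Path(base) / now.strftime("run_%Y%m%d_%H%M%S")
    # exist_ok=False surfaces an accidental second run within the same second.
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir
```

Because the timestamp is zero-padded and ordered from year to second, sorting the folder names alphabetically also sorts the runs chronologically.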
# Start processing service
docker-compose up processing
# Run in background
docker-compose up -d processing
# View logs
docker-compose logs -f processing
# Test Spleeter directly
mkdir -p input output
docker-compose --profile tools run spleeter separate -i /input/video.mp4 -o /output -p spleeter:2stems
# Multiple vocal levels (0%, 5%, 10%, 15%, 25%, 50%, 75%)
python create_multi_vocal_karaoke.py
# Low vocal levels (5%, 10%, 15%)
python create_low_vocal_karaoke.py
# Download & process YouTube videos
python download_and_create_karaoke.py
- Set up Google Cloud Project
gcloud config set project YOUR_PROJECT_ID
gcloud services enable run.googleapis.com cloudbuild.googleapis.com storage.googleapis.com
- Create Storage Bucket
gsutil mb gs://soniq-karaoke-videos
- Deploy with Cloud Build
gcloud builds submit --config=cloudbuild.yaml
Health Check
curl https://soniq-karaoke-HASH-uc.a.run.app/health
Process Video
curl -X POST https://soniq-karaoke-HASH-uc.a.run.app/process \
-H "Content-Type: application/json" \
-d '{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"vocal_levels": [0.0, 0.25, 0.5]
}'
Response:
{
"job_id": "uuid",
"title": "Video Title",
"videos": [
{
"vocal_level": 0,
"url": "https://storage.googleapis.com/bucket/file.mp4",
"filename": "karaoke_0_vocal.mp4"
}
]
}
YouTube URL → yt-dlp → Docker Spleeter → OpenAI Whisper → FFmpeg → Cloud Storage
                ↓             ↓                ↓            ↓
          Video File  Vocal/Instrumental   Subtitles  Karaoke Video
- Download - Extract video from YouTube URL
- Separate - Use Docker Spleeter for ML-based audio separation
- Transcribe - OpenAI Whisper for word-level timestamps
- Subtitle - Create synchronized ASS subtitles with highlighting
- Mix - Combine vocals/instrumentals at specified levels
- Render - Generate final karaoke video with FFmpeg
- Upload - Store in Google Cloud Storage
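The Mix and Render steps can be sketched as a single FFmpeg invocation that keeps the instrumental at full volume and scales the vocal stem by the requested level. This is a sketch under assumptions: the file names, the function `mix_command`, and the choice to copy the video stream are illustrative, and `normalize=0` on `amix` requires FFmpeg 4.4 or newer. Burning in subtitles would require re-encoding the video instead of copying it.

```python
def mix_command(video, instrumental, vocals, vocal_level, output):
    """Build an ffmpeg argument list mixing vocals at vocal_level (0.0 to 1.0)."""
    if not 0.0 <= vocal_level <= 1.0:
        raise ValueError("vocal_level must be between 0.0 and 1.0")
    filter_graph = (
        # Input 2 is the vocal stem: scale it to the requested level.
        f"[2:a]volume={vocal_level:.2f}[voc];"
        # Input 1 is the instrumental: mix without amix's automatic attenuation.
        "[1:a][voc]amix=inputs=2:duration=longest:normalize=0[aout]"
    )
    return [
        "ffmpeg", "-y",
        "-i", video, "-i", instrumental, "-i", vocals,
        "-filter_complex", filter_graph,
        "-map", "0:v", "-map", "[aout]",  # picture from the video, audio from the mix
        "-c:v", "copy", "-c:a", "aac",
        output,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. With Spleeter's 2-stem model the inputs would typically be `accompaniment.wav` and `vocals.wav`.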
- Python Spleeter - ML audio separation (replaced Docker-in-Docker)
- OpenAI Whisper - Speech transcription
- yt-dlp - YouTube video downloading
- FFmpeg - Video/audio processing
- Flask - Web API framework
- Google Cloud Run - Serverless deployment
- Google Cloud Storage - Video storage
Trigger "soniqpush" is active: pushes to the main branch deploy to Cloud Run automatically.
Created karaoke videos with various vocal levels:
- punjaban_karaoke_0_vocal.mp4 - Pure instrumental
- punjaban_karaoke_25_vocal.mp4 - Light vocal guide
- punjaban_karaoke_50_vocal.mp4 - Balanced mix
- punjaban_karaoke_75_vocal.mp4 - Strong vocal guide
Perfect for different karaoke preferences and skill levels!
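The file names above suggest that each vocal level, given as a fraction such as 0.25, is rendered as a whole percentage in the output name. A small sketch of that mapping follows; the pattern is inferred from the example names, and `karaoke_filename` is a hypothetical helper, not necessarily how the project builds its paths.

```python
def karaoke_filename(level, prefix=""):
    """Map a vocal level fraction (0.0 to 1.0) to an output file name."""
    percent = round(level * 100)  # 0.25 -> 25
    stem = f"{prefix}_karaoke" if prefix else "karaoke"
    return f"{stem}_{percent}_vocal.mp4"


if __name__ == "__main__":
    for level in (0.0, 0.25, 0.5, 0.75):
        print(karaoke_filename(level, prefix="punjaban"))
```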
MIT License - See LICENSE file for details.
# GitHub Auto-Deployment Test Fri 22 Aug 2025 15:28:44 EDT