AI-powered karaoke video creator using Docker Spleeter for ML-based vocal separation and OpenAI Whisper for transcription.
- Docker Spleeter integration - True ML-based vocal/instrumental separation
- OpenAI Whisper transcription - Accurate speech-to-text with word-level timing
- Synchronized subtitles - Word highlighting with professional typography
- Multiple vocal levels - 0%, 5%, 10%, 15%, 25%, 50%, 75%
- Bilingual support - Original language + transliteration
- Cloud deployment - Google Cloud Run ready
- Python 3.9+
- Docker
- FFmpeg
- OpenAI API key
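The requirements above can be checked before installing anything. The following is a minimal sketch, not part of the project: the function name `missing_prerequisites` is hypothetical, and it assumes Docker and FFmpeg are installed under the executable names `docker` and `ffmpeg`.

```python
import os
import shutil
import sys

# Assumption: these executable names match the local Docker and FFmpeg installs.
REQUIRED_TOOLS = ("docker", "ffmpeg")


def missing_prerequisites(environ=None, which=shutil.which, version=None):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    environ = os.environ if environ is None else environ
    version = sys.version_info if version is None else version
    problems = []
    # The README asks for Python 3.9 or newer.
    if tuple(version[:2]) < (3, 9):
        problems.append("Python 3.9+ is required")
    # Docker and FFmpeg must be discoverable on PATH.
    for tool in REQUIRED_TOOLS:
        if which(tool) is None:
            problems.append(f"{tool} not found on PATH")
    # The OpenAI API key is read from the environment.
    if not environ.get("OPENAI_API_KEY"):
        problems.append("OPENAI_API_KEY is not set")
    return problems
```

Calling `missing_prerequisites()` with no arguments inspects the current interpreter and environment; the optional parameters exist only so the checks can be exercised without touching the real system.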
pip install -r requirements.txt
export OPENAI_API_KEY="your-openai-api-key"
# Multiple vocal levels (0%, 5%, 10%, 15%, 25%, 50%, 75%)
python create_multi_vocal_karaoke.py
# Low vocal levels (5%, 10%, 15%)
python create_low_vocal_karaoke.py
# Download & process YouTube videos
python download_and_create_karaoke.py
- Set up Google Cloud Project
gcloud config set project YOUR_PROJECT_ID
gcloud services enable run.googleapis.com cloudbuild.googleapis.com storage.googleapis.com
- Create Storage Bucket
gsutil mb gs://soniq-karaoke-videos
- Deploy with Cloud Build
gcloud builds submit --config=cloudbuild.yaml
Health Check
curl https://soniq-karaoke-HASH-uc.a.run.app/health
Process Video
curl -X POST https://soniq-karaoke-HASH-uc.a.run.app/process \
-H "Content-Type: application/json" \
-d '{
"url": "https://www.youtube.com/watch?v=VIDEO_ID",
"vocal_levels": [0.0, 0.25, 0.5]
}'
Response:
{
  "job_id": "uuid",
  "title": "Video Title",
  "videos": [
    {
      "vocal_level": 0,
      "url": "https://storage.googleapis.com/bucket/file.mp4",
      "filename": "karaoke_0_vocal.mp4"
    }
  ]
}
YouTube URL → yt-dlp → Docker Spleeter → OpenAI Whisper → FFmpeg → Cloud Storage
                ↓             ↓                ↓            ↓
         Video File  Vocal/Instrumental    Subtitles  Karaoke Video
- Download - Extract video from YouTube URL
- Separate - Use Docker Spleeter for ML-based audio separation
- Transcribe - OpenAI Whisper for word-level timestamps
- Subtitle - Create synchronized ASS subtitles with highlighting
- Mix - Combine vocals/instrumentals at specified levels
- Render - Generate final karaoke video with FFmpeg
- Upload - Store in Google Cloud Storage
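To illustrate the Subtitle step, here is a minimal sketch of turning word-level timestamps into one ASS karaoke line. It is not the project's implementation: the function names and the `(text, start, end)` tuple layout are assumptions. It relies only on the standard ASS conventions of `H:MM:SS.cc` timestamps and `{\k<centiseconds>}` karaoke tags.

```python
def to_centiseconds(seconds):
    """Convert seconds to whole centiseconds, the unit ASS uses for \\k durations."""
    return int(round(seconds * 100))


def ass_time(seconds):
    """Format seconds as an ASS timestamp (H:MM:SS.cc)."""
    hours, rem = divmod(to_centiseconds(seconds), 360000)
    minutes, rem = divmod(rem, 6000)
    secs, cs = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def karaoke_dialogue(words, style="Default"):
    """Build one ASS Dialogue line from ordered (text, start, end) tuples in seconds.

    Assumption: the tuples come from a word-level transcription step.
    """
    if not words:
        raise ValueError("at least one word is required")
    line_start, line_end = words[0][1], words[-1][2]
    cursor = to_centiseconds(line_start)
    parts = []
    for text, start, end in words:
        start_cs, end_cs = to_centiseconds(start), to_centiseconds(end)
        # A pause between words becomes an empty \k segment so highlighting stays in sync.
        if start_cs > cursor:
            parts.append(f"{{\\k{start_cs - cursor}}}")
        parts.append(f"{{\\k{max(end_cs - start_cs, 0)}}}{text} ")
        cursor = max(end_cs, cursor)
    body = "".join(parts).rstrip()
    return f"Dialogue: 0,{ass_time(line_start)},{ass_time(line_end)},{style},,0,0,0,,{body}"
```

Working in integer centiseconds from absolute times, rather than summing float differences, keeps rounding error from accumulating across a long line.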
- Python Spleeter - ML audio separation (replaced Docker-in-Docker)
- OpenAI Whisper - Speech transcription
- yt-dlp - YouTube video downloading
- FFmpeg - Video/audio processing
- Flask - Web API framework
- Google Cloud Run - Serverless deployment
- Google Cloud Storage - Video storage
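Because the service is exposed as a web API on Cloud Run, the `/process` endpoint documented earlier can also be called from Python. The sketch below uses only the standard library. The helper names are hypothetical, and the payload and response shapes are taken from the curl example and sample response in this README rather than from the service code.

```python
import json
import urllib.request


def build_process_request(base_url, video_url, vocal_levels):
    """Build (but do not send) a POST request for the /process endpoint."""
    payload = json.dumps({"url": video_url, "vocal_levels": list(vocal_levels)})
    return urllib.request.Request(
        f"{base_url.rstrip('/')}/process",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def video_urls_by_level(response_body):
    """Map each returned vocal level to its video URL.

    Assumption: the response matches the sample JSON shown in this README.
    """
    return {video["vocal_level"]: video["url"] for video in response_body.get("videos", [])}
```

To actually send the request, pass it to `urllib.request.urlopen` and decode the body with `json.load`. Processing a video may take a while, so a generous `timeout` argument is probably appropriate.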
Trigger "soniqpush" is active: pushes to the main branch automatically deploy to Cloud Run.
Created karaoke videos with various vocal levels:
- punjaban_karaoke_0_vocal.mp4 - Pure instrumental
- punjaban_karaoke_25_vocal.mp4 - Light vocal guide
- punjaban_karaoke_50_vocal.mp4 - Balanced mix
- punjaban_karaoke_75_vocal.mp4 - Strong vocal guide
Perfect for different karaoke preferences and skill levels!
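One way to produce a given vocal level is to scale the separated vocal stem and mix it back over the instrumental with FFmpeg. The sketch below only builds the command line and is an assumption about how such a mix could be done, not the project's actual code. The function names are hypothetical, and `normalize=0` on `amix` requires a reasonably recent FFmpeg (4.4 or later).

```python
def output_filename(vocal_level, prefix="karaoke"):
    """Name a video after its vocal level as a percentage, e.g. karaoke_25_vocal.mp4."""
    return f"{prefix}_{int(round(vocal_level * 100))}_vocal.mp4"


def build_mix_command(vocals_path, instrumental_path, vocal_level, output_path):
    """Return an FFmpeg argument list that mixes vocals at vocal_level (0.0 to 1.0)."""
    if not 0.0 <= vocal_level <= 1.0:
        raise ValueError("vocal_level must be between 0.0 and 1.0")
    # Scale the vocal stem, then sum it with the untouched instrumental.
    # normalize=0 stops amix from rescaling the inputs itself.
    filter_graph = (
        f"[0:a]volume={vocal_level:.2f}[v];"
        "[v][1:a]amix=inputs=2:duration=longest:normalize=0[mix]"
    )
    return [
        "ffmpeg", "-y",
        "-i", vocals_path,
        "-i", instrumental_path,
        "-filter_complex", filter_graph,
        "-map", "[mix]",
        output_path,
    ]
```

The returned list can be passed to `subprocess.run(cmd, check=True)`. Keeping command construction separate from execution makes the vocal-level logic easy to verify without FFmpeg installed.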
MIT License - See LICENSE file for details.
# GitHub Auto-Deployment Test Fri 22 Aug 2025 15:28:44 EDT