Soniq Music DL - AI Karaoke Creator

🎵 AI-powered karaoke video creator using Docker Spleeter for ML-based vocal separation and OpenAI Whisper for transcription.

Features

🤖 Docker Spleeter integration - True ML-based vocal/instrumental separation
🗣️ OpenAI Whisper transcription - Accurate speech-to-text with word-level timing
📝 Synchronized subtitles - Word highlighting with professional typography
🎚️ Multiple vocal levels - 0%, 5%, 10%, 15%, 25%, 50%, 75%
🌐 Bilingual support - Original language + transliteration
☁️ Cloud deployment - Google Cloud Run ready

Local Usage

Prerequisites

Python 3.9+
Docker
FFmpeg
OpenAI API key

Installation

pip install -r requirements.txt

Environment Variables

export OPENAI_API_KEY="your-openai-api-key"
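
A script can fail fast with a clear message when the key is missing. The following is a minimal sketch, assuming the scripts read the key from the environment; `require_api_key` is a hypothetical helper for illustration, not a function from this repository.

```python
import os


def require_api_key(env=None):
    """Return the OpenAI API key, or raise a clear error if it is unset.

    `env` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; export it before running.")
    return key
```

Calling `require_api_key()` at the top of a script surfaces a missing key before any download or separation work starts.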

Create Karaoke Videos

# Multiple vocal levels (0%, 5%, 10%, 15%, 25%, 50%, 75%)
python create_multi_vocal_karaoke.py

# Low vocal levels (5%, 10%, 15%) 
python create_low_vocal_karaoke.py

# Download & process YouTube videos
python download_and_create_karaoke.py

Cloud Deployment

Deploy to Google Cloud Run

Set up Google Cloud Project

gcloud config set project YOUR_PROJECT_ID
gcloud services enable run.googleapis.com cloudbuild.googleapis.com storage.googleapis.com

Create Storage Bucket

gsutil mb gs://soniq-karaoke-videos

Deploy with Cloud Build

gcloud builds submit --config=cloudbuild.yaml

API Usage

Health Check

curl https://soniq-karaoke-HASH-uc.a.run.app/health

Process Video

curl -X POST https://soniq-karaoke-HASH-uc.a.run.app/process \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=VIDEO_ID",
    "vocal_levels": [0.0, 0.25, 0.5]
  }'

Response:

{
  "job_id": "uuid",
  "title": "Video Title",
  "videos": [
    {
      "vocal_level": 0,
      "url": "https://storage.googleapis.com/bucket/file.mp4",
      "filename": "karaoke_0_vocal.mp4"
    }
  ]
}
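
The same request can be made from Python using only the standard library. This is a sketch based on the request and response shapes shown above; the helper names and the 900-second timeout are assumptions for illustration, and `HASH` in the base URL is the same placeholder used in the curl examples.

```python
import json
import urllib.request

# Placeholder base URL from the curl examples; replace HASH with your service's value.
BASE_URL = "https://soniq-karaoke-HASH-uc.a.run.app"


def build_process_request(video_url, vocal_levels, base_url=BASE_URL):
    """Build (but do not send) a POST request for the /process endpoint."""
    body = json.dumps({"url": video_url, "vocal_levels": vocal_levels}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def video_urls_by_level(response_body):
    """Map each vocal level in a /process response to its video URL."""
    payload = json.loads(response_body)
    return {video["vocal_level"]: video["url"] for video in payload["videos"]}


def process_video(video_url, vocal_levels, base_url=BASE_URL, timeout=900):
    """Send the request and return {vocal_level: url}. Performs network I/O.

    The timeout is an assumed value; processing time depends on the video.
    """
    request = build_process_request(video_url, vocal_levels, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return video_urls_by_level(response.read().decode("utf-8"))
```

For example, `process_video("https://www.youtube.com/watch?v=VIDEO_ID", [0.0, 0.25, 0.5])` would return a dictionary keyed by vocal level once the service responds.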

Architecture

YouTube URL → yt-dlp → Docker Spleeter → OpenAI Whisper → FFmpeg → Cloud Storage
                ↓              ↓             ↓           ↓
            Video File    Vocal/Instrumental  Subtitles   Karaoke Video

Video Pipeline

Download - Extract video from YouTube URL
Separate - Use Docker Spleeter for ML-based audio separation
Transcribe - OpenAI Whisper for word-level timestamps
Subtitle - Create synchronized ASS subtitles with highlighting
Mix - Combine vocals/instrumentals at specified levels
Render - Generate final karaoke video with FFmpeg
Upload - Store in Google Cloud Storage
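
The Subtitle step can be illustrated with ASS karaoke tags. The sketch below turns word-level timestamps into a single ASS `Dialogue` line, using one `\k` tag per word (durations in centiseconds). It assumes each word is a dict with `word`, `start`, and `end` keys (times in seconds); that input shape, the function names, and the use of `\k` for highlighting are assumptions, not details confirmed by this repository.

```python
def ass_time(seconds):
    """Format seconds as an ASS timestamp (H:MM:SS.cc, centisecond precision)."""
    total_cs = int(round(seconds * 100))
    hours, rem = divmod(total_cs, 360000)
    minutes, rem = divmod(rem, 6000)
    secs, cs = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def karaoke_dialogue(words, style="Default"):
    """Build one ASS Dialogue line with a karaoke (k) tag per word.

    Each word's tag lasts until the next word starts, so every highlight
    begins at that word's own start time even when words are separated
    by short silences. The last word lasts until its own end time.
    """
    starts_cs = [int(round(w["start"] * 100)) for w in words]
    end_cs = int(round(words[-1]["end"] * 100))
    boundaries = starts_cs[1:] + [end_cs]
    parts = []
    for w, begin, until in zip(words, starts_cs, boundaries):
        duration = max(until - begin, 0)  # guard against overlapping timestamps
        parts.append(f"{{\\k{duration}}}{w['word'].strip()}")
    text = " ".join(parts)
    start = ass_time(words[0]["start"])
    end = ass_time(words[-1]["end"])
    # Fields: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text
    return f"Dialogue: 0,{start},{end},{style},,0,0,0,,{text}"
```

A full subtitle file also needs `[Script Info]`, `[V4+ Styles]`, and `[Events]` headers; the style definition controls the highlight colors and typography.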

Technologies

Python Spleeter - ML audio separation (replaces the earlier Docker-in-Docker setup)
OpenAI Whisper - Speech transcription
yt-dlp - YouTube video downloading
FFmpeg - Video/audio processing
Flask - Web API framework
Google Cloud Run - Serverless deployment
Google Cloud Storage - Video storage

GitHub Auto-Deploy

✅ The "soniqpush" trigger is active: pushes to the main branch deploy to Cloud Run automatically.

Examples

Sample karaoke videos created at various vocal levels:

punjaban_karaoke_0_vocal.mp4 - Pure instrumental
punjaban_karaoke_25_vocal.mp4 - Light vocal guide
punjaban_karaoke_50_vocal.mp4 - Balanced mix
punjaban_karaoke_75_vocal.mp4 - Strong vocal guide

Perfect for different karaoke preferences and skill levels!
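
The relationship between a vocal level and its output can be sketched in a few lines. The example below assumes the mix keeps the instrumental at full volume and adds the vocals scaled by the vocal level; the function names and that mixing rule are illustrative assumptions, while the filename pattern follows the examples listed above.

```python
def output_filename(title, vocal_level):
    """Name an output file after its vocal level as a whole-number percentage."""
    percent = int(round(vocal_level * 100))
    return f"{title}_karaoke_{percent}_vocal.mp4"


def mix_samples(instrumental, vocals, vocal_level):
    """Mix float samples in [-1.0, 1.0]: full instrumental plus scaled vocals.

    The result is clipped to [-1.0, 1.0] so loud passages cannot overflow.
    """
    if len(instrumental) != len(vocals):
        raise ValueError("stems must have the same number of samples")
    return [
        max(-1.0, min(1.0, inst + vocal_level * voc))
        for inst, voc in zip(instrumental, vocals)
    ]
```

With a vocal level of 0.0 the mix is the instrumental stem unchanged, which corresponds to the "pure instrumental" file in the list above.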

License

MIT License - See LICENSE file for details.

GitHub Auto-Deployment Test Fri 22 Aug 2025 15:28:44 EDT

GitHub trigger test - Fri 22 Aug 2025 15:38:10 EDT
