
robinsingh1/karaoke-content


Soniq Music DL - AI Karaoke Creator

🎡 AI-powered karaoke video creator using Docker Spleeter for ML-based vocal separation and OpenAI Whisper for transcription.

🚀 NEW: Daily Automated Karaoke Generation

Automatically download trending music videos and convert them to karaoke format every day!

# Quick start - Set up daily automation
./setup_daily_karaoke.sh

# Or run manually now
python3 daily_karaoke_generator.py

# Test your setup
./test_setup.sh

📚 Complete Daily Karaoke Guide →

What it does

  1. ✅ Downloads trending music videos daily from YouTube
  2. ✅ Separates vocals from instrumentals using Docker Spleeter
  3. ✅ Creates karaoke videos with instrumental-only tracks
  4. ✅ Organizes output in timestamped folders
  5. ✅ Generates comprehensive reports and logs
  6. ✅ Cleans up old runs automatically

Perfect for: Daily karaoke content creation, music libraries, entertainment venues
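
The automatic cleanup step could be approximated as below. This is a minimal sketch, not the project's actual code: the function names and the 7-day retention default are assumptions, and only the `run_YYYYMMDD_HHMMSS` folder naming comes from this README.

```python
import shutil
from datetime import datetime, timedelta
from pathlib import Path

RUN_PREFIX = "run_"
RUN_FORMAT = "%Y%m%d_%H%M%S"  # matches the run_YYYYMMDD_HHMMSS folder names


def find_old_runs(base_dir, keep_days, now=None):
    """Return run folders whose name timestamp is older than keep_days."""
    now = now or datetime.now()
    cutoff = now - timedelta(days=keep_days)
    old = []
    for path in sorted(Path(base_dir).glob(RUN_PREFIX + "*")):
        if not path.is_dir():
            continue
        try:
            stamp = datetime.strptime(path.name[len(RUN_PREFIX):], RUN_FORMAT)
        except ValueError:
            continue  # name does not parse as a run folder; leave it alone
        if stamp < cutoff:
            old.append(path)
    return old


def cleanup_old_runs(base_dir, keep_days=7, now=None):
    """Delete old run folders and return the paths that were removed.

    keep_days=7 is an assumed default, not a documented setting.
    """
    removed = find_old_runs(base_dir, keep_days, now)
    for path in removed:
        shutil.rmtree(path)
    return removed
```

Judging age from the folder name rather than the file modification time keeps the result stable even if files inside a run are touched later.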


Features

  • 🤖 Docker Spleeter integration - True ML-based vocal/instrumental separation
  • 🗣️ OpenAI Whisper transcription - Accurate speech-to-text with word-level timing
  • 📝 Synchronized subtitles - Word highlighting with professional typography
  • 🎚️ Multiple vocal levels - 0%, 5%, 10%, 15%, 25%, 50%, 75%
  • 🌐 Bilingual support - Original language + transliteration
  • ☁️ Cloud deployment - Google Cloud Run ready
  • ⏰ Daily automation - Scheduled runs via cron or systemd
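
To illustrate how word-level timings can drive word highlighting, here is a hedged sketch that turns a list of timed words into one ASS `Dialogue` line using `\k` karaoke tags (durations in centiseconds). The `(text, start, end)` tuple shape and both function names are assumptions for illustration; the project's own subtitle code may differ.

```python
def ass_time(seconds):
    """Format seconds as an ASS timestamp, H:MM:SS.cc (centiseconds)."""
    total_cs = int(round(seconds * 100))
    hours, rem = divmod(total_cs, 360000)
    minutes, rem = divmod(rem, 6000)
    secs, cs = divmod(rem, 100)
    return f"{hours}:{minutes:02d}:{secs:02d}.{cs:02d}"


def karaoke_dialogue(words, style="Default"):
    """Build one ASS Dialogue line from (text, start, end) word tuples.

    Each word is highlighted until the next word starts, so short gaps
    between words are absorbed into the preceding word's \\k duration.
    """
    if not words:
        raise ValueError("words must not be empty")
    line_start = words[0][1]
    line_end = words[-1][2]
    parts = []
    for index, (text, start, end) in enumerate(words):
        next_start = words[index + 1][1] if index + 1 < len(words) else end
        duration_cs = max(int(round((next_start - start) * 100)), 1)
        parts.append(f"{{\\k{duration_cs}}}{text}")
    body = " ".join(parts)
    return (
        f"Dialogue: 0,{ass_time(line_start)},{ass_time(line_end)},"
        f"{style},,0,0,0,,{body}"
    )
```

A line built this way still needs the usual `[Script Info]`, `[V4+ Styles]`, and `[Events]` sections around it before a renderer will accept the file.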

Local Usage

Prerequisites

  • Python 3.9+
  • Docker (for Spleeter audio separation)
  • FFmpeg
  • yt-dlp (for video downloads)
  • OpenAI API key (optional, for Whisper transcription)

Quick Setup

# 1. Install dependencies
pip install -r requirements.txt
pip install yt-dlp

# 2. Pull Spleeter Docker image
docker pull researchdeezer/spleeter:3.8-2stems

# 3. Test setup
./test_setup.sh

# 4. Set up daily automation (optional)
./setup_daily_karaoke.sh

Environment Variables

Create a .env file:

# Optional: For Whisper transcription
export OPENAI_API_KEY="your-openai-api-key"

# Optional: For cloud storage
export BUCKET_NAME="your-gcs-bucket"

# Optional: Custom port for web service
export PORT=8080
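
On the Python side, these variables could be read roughly as follows. This is a sketch under assumptions: the `Settings` class and `load_settings` function are hypothetical names, and only the three variable names and the 8080 port come from the example above.

```python
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Settings:
    openai_api_key: Optional[str]  # None disables Whisper transcription
    bucket_name: Optional[str]     # None disables cloud storage uploads
    port: int


def load_settings(env=None):
    """Read the optional settings, treating empty strings as unset."""
    env = os.environ if env is None else env
    return Settings(
        openai_api_key=env.get("OPENAI_API_KEY") or None,
        bucket_name=env.get("BUCKET_NAME") or None,
        port=int(env.get("PORT") or "8080"),
    )
```

Accepting a plain mapping instead of always reading `os.environ` makes the function easy to exercise in tests.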

Daily Karaoke Generation

# Run daily karaoke generation manually
python3 daily_karaoke_generator.py

# Or use the wrapper script
./run_karaoke_now.sh

Output will be saved to: karaoke_daily_runs/run_YYYYMMDD_HHMMSS/
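
A timestamped run folder with this naming can be created as sketched below. `make_run_dir` is a hypothetical helper shown for illustration, not necessarily how `daily_karaoke_generator.py` does it.

```python
from datetime import datetime
from pathlib import Path


def make_run_dir(base_dir="karaoke_daily_runs", now=None):
    """Create and return base_dir/run_YYYYMMDD_HHMMSS for the current run."""
    now = now or datetime.now()
    run_dir = Path(base_dir) / now.strftime("run_%Y%m%d_%H%M%S")
    # exist_ok=False surfaces an accidental second run in the same second
    run_dir.mkdir(parents=True, exist_ok=False)
    return run_dir
```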

Docker Compose Usage

# Start processing service
docker-compose up processing

# Run in background
docker-compose up -d processing

# View logs
docker-compose logs -f processing

# Test Spleeter directly
mkdir -p input output
docker-compose --profile tools run spleeter separate -i /input/video.mp4 -o /output -p spleeter:2stems

Create Karaoke Videos (Manual)

# Multiple vocal levels (0%, 5%, 10%, 15%, 25%, 50%, 75%)
python create_multi_vocal_karaoke.py

# Low vocal levels (5%, 10%, 15%)
python create_low_vocal_karaoke.py

# Download & process YouTube videos
python download_and_create_karaoke.py

Cloud Deployment

Deploy to Google Cloud Run

  1. Set up Google Cloud Project
gcloud config set project YOUR_PROJECT_ID
gcloud services enable run.googleapis.com cloudbuild.googleapis.com storage.googleapis.com
  2. Create Storage Bucket
gsutil mb gs://soniq-karaoke-videos
  3. Deploy with Cloud Build
gcloud builds submit --config=cloudbuild.yaml

API Usage

Health Check

curl https://soniq-karaoke-HASH-uc.a.run.app/health

Process Video

curl -X POST https://soniq-karaoke-HASH-uc.a.run.app/process \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://www.youtube.com/watch?v=VIDEO_ID",
    "vocal_levels": [0.0, 0.25, 0.5]
  }'

Response:

{
  "job_id": "uuid",
  "title": "Video Title",
  "videos": [
    {
      "vocal_level": 0,
      "url": "https://storage.googleapis.com/bucket/file.mp4",
      "filename": "karaoke_0_vocal.mp4"
    }
  ]
}
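
The same call can be made from Python with only the standard library. The sketch below builds the request and extracts the per-level URLs from a response shaped like the example above; the helper names are hypothetical, and the field names are taken from that example rather than from a formal API schema.

```python
import json
import urllib.request


def build_process_request(base_url, video_url, vocal_levels):
    """Build (but do not send) the POST /process request."""
    payload = {"url": video_url, "vocal_levels": list(vocal_levels)}
    return urllib.request.Request(
        base_url.rstrip("/") + "/process",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def video_urls_by_level(response_body):
    """Map each vocal_level in the response to its video URL."""
    data = json.loads(response_body)
    return {video["vocal_level"]: video["url"] for video in data.get("videos", [])}
```

To send the request, pass it to `urllib.request.urlopen(request)` and feed the response body to `video_urls_by_level`. Processing a full video may take a while, so a generous `timeout` argument is probably sensible.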

Architecture

YouTube URL → yt-dlp → Docker Spleeter → OpenAI Whisper → FFmpeg → Cloud Storage
                ↓              ↓             ↓           ↓
            Video File    Vocal/Instrumental  Subtitles   Karaoke Video

Video Pipeline

  1. Download - Extract video from YouTube URL
  2. Separate - Use Docker Spleeter for ML-based audio separation
  3. Transcribe - OpenAI Whisper for word-level timestamps
  4. Subtitle - Create synchronized ASS subtitles with highlighting
  5. Mix - Combine vocals/instrumentals at specified levels
  6. Render - Generate final karaoke video with FFmpeg
  7. Upload - Store in Google Cloud Storage
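
As an illustration of the Mix step, the sketch below builds an FFmpeg command that scales the separated vocal track to a given level and sums it with the instrumental. It is one possible approach, not necessarily the one this project uses: `build_mix_command` is a hypothetical name, and the `normalize=0` option of the `amix` filter assumes a reasonably recent FFmpeg build (4.4 or later, to my knowledge).

```python
def build_mix_command(instrumental, vocals, vocal_level, output):
    """Return an ffmpeg argument list mixing vocals at vocal_level (0.0-1.0)."""
    if not 0.0 <= vocal_level <= 1.0:
        raise ValueError("vocal_level must be between 0.0 and 1.0")
    filter_graph = (
        # scale the vocal stem, then sum it with the untouched instrumental
        f"[1:a]volume={vocal_level:.2f}[v];"
        "[0:a][v]amix=inputs=2:duration=longest:normalize=0[mix]"
    )
    return [
        "ffmpeg", "-y",
        "-i", instrumental,
        "-i", vocals,
        "-filter_complex", filter_graph,
        "-map", "[mix]",
        output,
    ]
```

With Spleeter's 2-stem model the inputs would typically be `accompaniment.wav` and `vocals.wav`. The returned list can be passed to `subprocess.run(cmd, check=True)`.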

Technologies

  • Python Spleeter - ML audio separation (replaced Docker-in-Docker)
  • OpenAI Whisper - Speech transcription
  • yt-dlp - YouTube video downloading
  • FFmpeg - Video/audio processing
  • Flask - Web API framework
  • Google Cloud Run - Serverless deployment
  • Google Cloud Storage - Video storage

GitHub Auto-Deploy

✅ Trigger "soniqpush" is active: pushes to the main branch deploy to Cloud Run automatically.

Examples

Created karaoke videos with various vocal levels:

  • punjaban_karaoke_0_vocal.mp4 - Pure instrumental
  • punjaban_karaoke_25_vocal.mp4 - Light vocal guide
  • punjaban_karaoke_50_vocal.mp4 - Balanced mix
  • punjaban_karaoke_75_vocal.mp4 - Strong vocal guide

Perfect for different karaoke preferences and skill levels!

License

MIT License - See LICENSE file for details.

About

Complete karaoke creation system with video processing, Modal integration, and AI-powered vocal separation using Spleeter
