An AI music site that automatically generates a karaoke player
Updated Mar 10, 2023 - Python
Extract phone-level alignment and phonemic transcript from kaldi ali.*.gz files
Use kaldi pretrained nnet3 model to align individual sentences and get phone-level transcripts
ElevenLabs Force Alignment SRT Generator - Generate synchronized subtitles with AI-powered semantic segmentation
Public version of my Computer-Aided Pronunciation Training (CAPT) system (server)
An automated workflow that generates timestamped subtitles from a video file with custom control using regex, Java and multiple online tools.
Given forced alignment results, we obtain words with their respective durations from concatenated phones.
Takes audio (mp3) and text input (string) and force aligns the text to the audio. Uses stable-ts and whisperx.
Perform force alignment on Mandarin data using aidatatang pretrained model at https://kaldi-asr.org/models/m10
Solution for Zalo AI Challenge 3 - Lyrics Alignment
Generate audiobooks from plain EPUB files in EPUB 3 Media Overlays format (SMIL) using high-quality TTS engines like Azure and Kokoro.
A small wrapper package around whisper-timestamped. Create force-aligned transcription TextGrids from raw audio!
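
One of the descriptions above mentions obtaining words and their durations from concatenated phones in forced alignment results. A minimal sketch of that idea follows. It is not taken from any of the listed repositories. It assumes each phone arrives as a (label, start, duration) tuple in seconds and that labels carry Kaldi-style word-position suffixes (`_B`, `_I`, `_E`, `_S`). The function name `phones_to_words` and the tuple layout are hypothetical.

```python
from typing import List, Tuple

# Assumed input shape: (label, start_sec, duration_sec), e.g. ("HH_B", 0.20, 0.05)
Phone = Tuple[str, float, float]
Word = Tuple[List[str], float, float]  # (phones, start_sec, duration_sec)


def phones_to_words(phones: List[Phone]) -> List[Word]:
    """Group word-position-tagged phones into words with summed durations."""
    words: List[Word] = []
    current: List[str] = []
    start = 0.0
    total = 0.0
    for label, begin, dur in phones:
        base, _, pos = label.rpartition("_")
        if not base or pos not in {"B", "I", "E", "S"}:
            continue  # skip silence and other untagged phones such as "SIL"
        if pos in {"B", "S"}:
            # A word-initial or singleton phone opens a new word.
            current, start, total = [], begin, 0.0
        current.append(base)
        total += dur
        if pos in {"E", "S"}:
            # A word-final or singleton phone closes the word.
            words.append((current, start, round(total, 3)))
            current = []
    return words
```

Summing the phone durations, rather than subtracting the first start time from the last end time, keeps silence inside a word out of the total. Which of the two is wanted depends on the downstream use.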