Laya – Audio & Text Alignment Tool

Laya is a lightweight browser-based tool built to align audio with word-level text for Sanskrit shlokas and other structured text, such as epics like the Ramayana. It supports manual timestamping, flagging, and transcript export — useful for training Text-to-Speech (TTS) models and precise sloka playback experiences (e.g., for Swara TTS).

⸻

✨ Features • 🎧 Audio Playback with Visual Sync

•	Highlighted line follows audio playback in real-time.

•	🧠 Manual Timestamping & Syncing

•	Assign timestamps with keyboard (↓ and ↑)

•	Adjust timestamps with precision (←/→)

•	🎛️ Transcript Editing

•	Add (Alt + Enter) and delete (Alt + Backspace) lines

•	Update timestamps directly via playback

•	🚩 Flagging Mechanism

•	Mark lines with issues like mispronunciation or low volume

•	Add reviewer comments

•	📝 Export

•	Download transcript as CSV with timestamps, flags, and comments

•	🗂️ Kanda & Sarga Organization

•	Load audio and transcript data by Kanda and Sarga divisions

⸻

⌨️ Keyboard Shortcuts

Shortcut Action Shift + Space Play / Pause ↓ (Down Arrow) Assign current time to next line ↑ (Up Arrow) Reset time for current line ← / → Adjust time by ±250ms Shift + ← / → Seek audio by ±1 second Alt + Enter Add line below current Alt + Backspace Delete current line (with confirm)

⸻

📦 Folder Structure

SwaraSangraha/ ├── ramayana2/ │ ├── audio/ # Audio files organized as /Kanda/Sarga.mp3 │ └── word_data.json # Input transcript JSON

⸻

🛠️ Setup & Usage

Serve Locally

You can open index.html directly in a browser, or serve via:

npx serve

or

python3 -m http.server

Data Format (JSON)

word_data.json should be structured as:

[ { "Kanda": "1", "Sarga": "1", "Word": "धर्मक्षेत्रे", "Word Start": "0.42" }, ... ]

Audio Format

MP3 audio files should be placed in:

SwaraSangraha/ramayana2/audio//.mp3

⸻

🧾 Export Format

Exported transcript.csv includes:

start,sentence,flags,comment 00:00:42,"धर्मक्षेत्रे","mispronunciation;low_volume","Needs re-recording" ...

⸻

🔍 Future Enhancements • Auto-save to DB via API • Forced alignment suggestions as initial timestamp • Multi-speaker or layered annotation support • Integration with Smruthi platform for preview/playback

⸻

🤝 Contributing

Feel free to fork and enhance! Submit issues or pull requests for bugs, UX suggestions, or new features.

⸻

🧠 Inspiration

Built for Swara TTS and Smruthi, to preserve and align sacred texts with accurate timing, pronunciation, and structure. Inspired by tools like Musixmatch for audio-word sync.

⸻

Name		Name	Last commit message	Last commit date
Latest commit History 62 Commits
SwaraSangraha		SwaraSangraha
code		code
.DS_Store		.DS_Store
.gitignore		.gitignore
README.md		README.md
error_log.txt		error_log.txt
index.html		index.html
jsonToCSV.py		jsonToCSV.py
ocr.py		ocr.py
ramayana_slokas.csv		ramayana_slokas.csv
ramayana_slokas.json		ramayana_slokas.json
test.html		test.html
test.py		test.py
vAlmIkIrAmAyaNa.pdf		vAlmIkIrAmAyaNa.pdf

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Repository files navigation

Laya – Audio & Text Alignment Tool

or

About

Uh oh!

Releases

Packages

Languages

imradhe/laya

Folders and files

Latest commit

History

Repository files navigation

Laya – Audio & Text Alignment Tool

or

About

Resources

Uh oh!

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages