A modern web interface for Gujarati Text-to-Speech synthesis using the F5-TTS model.
- Text-to-Speech Conversion: Convert Gujarati text to natural-sounding speech
- Advanced Generation Parameters: Customize speech generation with adjustable parameters
- Dark/Light Theme: Toggle between dark and light mode for comfortable viewing
- Audio History: View, play, and download previously generated audio
- Responsive Design: Works on desktop and mobile devices
- User Authentication: Basic login system to protect the application
- Backend: Flask (Python)
- Frontend: HTML, CSS, JavaScript
- TTS Engine: F5-TTS model via Gradio API
- Audio Processing: Generated WAV files with customizable settings
- Python 3.8+
- Gradio API running locally on port 7860
- The F5-TTS model files for Gujarati
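Before starting the web app, it can help to confirm that something is actually listening on port 7860. The following is a minimal sketch using only the Python standard library. The helper name `is_port_open` is hypothetical and not part of this project, and the check only confirms that the port accepts TCP connections, not that the Gradio API behind it is the F5-TTS model.

```python
import socket


def is_port_open(host: str = "127.0.0.1", port: int = 7860, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds (hypothetical helper)."""
    try:
        # create_connection raises OSError (e.g. ConnectionRefusedError) on failure
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    if is_port_open():
        print("Something is listening on port 7860.")
    else:
        print("Nothing is listening on port 7860; start the Gradio API first.")
```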
- Clone the repository:
git clone https://github.com/Ahir7/f5-tts-web-app.git
cd f5-tts-web-app
- Create a virtual environment and activate it:
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Configure the model paths in app.py: Update the LANGUAGES dictionary with your local model paths.
- Start the Gradio API for the TTS model on port 7860.
- Run the Flask application:
python app.py
- Access the web interface at http://localhost:5000
- Login: Use the default credentials (admin/admin123), or update them in the users dictionary in app.py
- Enter Text: Type or paste Gujarati text in the text area
- Adjust Parameters (optional):
  - NFE Step: Number of function evaluations used during generation (default: 32)
  - Speed: Speech speed multiplier (0.5-2.0)
  - Random Seed: Set a fixed value for reproducible output (or -1 for a random seed)
  - Remove Silence: Toggle to trim silence from the generated audio
  - Use EMA: Toggle use of the Exponential Moving Average model weights
- Generate: Click "Generate Speech" to process the text
- Listen & Download: Play the audio in the browser or download it
- View History: Access previously generated audio files
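The generation parameters listed above can be validated before they are sent to the TTS backend. The sketch below is an illustration only. The names `GenerationParams` and `normalize_params` are hypothetical, and the defaults for speed, silence removal, and EMA are assumptions; only the NFE default of 32, the 0.5-2.0 speed range, and the -1 seed convention come from this document.

```python
import random
from dataclasses import dataclass


@dataclass
class GenerationParams:
    """Hypothetical container for the parameters exposed in the UI."""
    nfe_step: int = 32            # default stated in this README
    speed: float = 1.0            # assumed default; documented range is 0.5-2.0
    seed: int = -1                # -1 means "pick a random seed"
    remove_silence: bool = False  # assumed default
    use_ema: bool = True          # assumed default


def normalize_params(params: GenerationParams) -> GenerationParams:
    """Return a copy with speed clamped to 0.5-2.0 and a seed of -1 resolved."""
    if params.nfe_step < 1:
        raise ValueError("nfe_step must be a positive integer")
    # Clamp speed into the documented range
    speed = min(max(params.speed, 0.5), 2.0)
    # Replace the -1 sentinel with a concrete random seed
    seed = params.seed
    if seed == -1:
        seed = random.randint(0, 2**31 - 1)
    return GenerationParams(
        nfe_step=params.nfe_step,
        speed=speed,
        seed=seed,
        remove_silence=params.remove_silence,
        use_ema=params.use_ema,
    )
```

Resolving the -1 seed up front means the concrete seed can be stored alongside the audio, so a given output can be regenerated later.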
- app.py: Main Flask application
- templates/: HTML templates
  - index.html: Main TTS interface
  - login.html: Login page
  - history.html: Audio history page
- static/: Static assets
  - css/style.css: Application styling
  - js/script.js: Client-side functionality
  - audio/: Generated audio files
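Since generated audio is stored under static/audio/, a history view could be built by listing the WAV files in that directory. The following is a rough sketch of that idea, not the code in app.py. The function name `list_audio_history` is hypothetical, and sorting by modification time is an assumption about how the history might be ordered.

```python
from pathlib import Path


def list_audio_history(audio_dir: str = "static/audio") -> list:
    """Return WAV file names in audio_dir, newest first (hypothetical helper)."""
    directory = Path(audio_dir)
    if not directory.is_dir():
        # No audio has been generated yet, or the path is wrong
        return []
    wav_files = [
        p for p in directory.iterdir()
        if p.is_file() and p.suffix.lower() == ".wav"
    ]
    # Sort by modification time so the most recent file comes first
    wav_files.sort(key=lambda p: p.stat().st_mtime, reverse=True)
    return [p.name for p in wav_files]
```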
- Adding Languages: Add new language configurations to the LANGUAGES dictionary in app.py
- Styling: Modify the static/css/style.css file to change the appearance
- Users: Update the users dictionary in app.py to manage authentication
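The exact layout of the LANGUAGES and users dictionaries is defined in app.py and is not reproduced here. As a rough illustration only, they might look something like the sketch below. Every key name (`name`, `model_path`, `vocab_path`), every path, and both helper functions are assumptions made for this example; check app.py for the real structure before editing.

```python
import hmac

# Hypothetical shape; the real keys in app.py may differ.
LANGUAGES = {
    "gujarati": {
        "name": "Gujarati",
        "model_path": "/path/to/gujarati/model.pt",   # placeholder path
        "vocab_path": "/path/to/gujarati/vocab.txt",  # placeholder path
    },
}

# Hypothetical shape: username -> password (the default credentials from this README)
users = {"admin": "admin123"}


def add_language(languages: dict, code: str, name: str,
                 model_path: str, vocab_path: str) -> dict:
    """Return a new dictionary with one extra language entry (original is untouched)."""
    updated = dict(languages)
    updated[code] = {"name": name, "model_path": model_path, "vocab_path": vocab_path}
    return updated


def check_credentials(user_table: dict, username: str, password: str) -> bool:
    """Compare a submitted password against the stored one in constant time."""
    expected = user_table.get(username)
    if expected is None:
        return False
    return hmac.compare_digest(expected.encode("utf-8"), password.encode("utf-8"))
```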
- F5-TTS Model Team for the text-to-speech technology
- Contributors and maintainers of the original F5-TTS project
- Database integration for user management
- Multiple language support
- Batch processing capabilities
- API key authentication
- Customizable voice profiles