Speech recognition using NVIDIA's Parakeet TDT model.
- 🎙️ Speech-to-text transcription using NVIDIA Parakeet TDT model
- 📊 Real-time transcription with progress tracking
- 📝 Support for multiple audio formats (WAV, FLAC)
- 📈 Transcription history with export options
- 🎯 Optimized for both short and long audio files
- 💻 GPU acceleration support with fallback to CPU
- Install dependencies: `pip install -r requirements.txt`
- Run the application: `streamlit run app.py`

- Python 3.8+
- NVIDIA GPU with CUDA support (strongly recommended for optimal performance)
- FFmpeg (for audio processing)
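
The requirements above, together with the GPU-to-CPU fallback listed in the features, can be checked before launching the app. The following is a minimal sketch, not the app's actual startup code: `check_environment` is a hypothetical helper, and it assumes PyTorch is the backend that reports CUDA availability, which this README does not state.

```python
import shutil
import sys


def check_environment():
    """Return a dict describing which of the listed requirements are met.

    Hypothetical helper for illustration; not part of app.py.
    """
    report = {
        # Python 3.8+ is listed as a requirement.
        "python_ok": sys.version_info >= (3, 8),
        # FFmpeg must be discoverable on PATH for audio processing.
        "ffmpeg_found": shutil.which("ffmpeg") is not None,
    }
    try:
        # Assumption: PyTorch is installed and is what the model runs on.
        import torch

        report["device"] = "cuda" if torch.cuda.is_available() else "cpu"
    except ImportError:
        # Without PyTorch we cannot query CUDA, so report the CPU fallback.
        report["device"] = "cpu"
    return report


if __name__ == "__main__":
    for name, value in check_environment().items():
        print(f"{name}: {value}")
```

If `device` comes back as `cpu`, transcription should still work per the feature list, but it will likely be slower than on a CUDA-capable GPU.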
- Upload an audio file or record directly in the browser
- Wait for the model to process and transcribe the audio
- View and export transcription results
- NVIDIA GPU with CUDA support is strongly recommended for optimal performance
- Long audio files (>8 minutes) will automatically use optimized settings
- Maximum recommended audio duration is 30 minutes
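
The duration notes above can be expressed as a simple pre-check before uploading a file. This is a sketch under stated assumptions: the 8-minute and 30-minute thresholds come from the notes, while the function names and the `standard` / `long` / `over_limit` labels are hypothetical. The app's actual optimized settings are not described in this README, so the sketch only classifies the file. It reads WAV files with the standard library; FLAC would need a different reader.

```python
import wave

LONG_AUDIO_SECONDS = 8 * 60        # "long audio" threshold from the notes
MAX_RECOMMENDED_SECONDS = 30 * 60  # recommended maximum from the notes


def wav_duration_seconds(path):
    """Return the duration of a PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav_file:
        return wav_file.getnframes() / wav_file.getframerate()


def classify_duration(seconds):
    """Map a duration to a hypothetical label used only in this sketch."""
    if seconds > MAX_RECOMMENDED_SECONDS:
        return "over_limit"
    if seconds > LONG_AUDIO_SECONDS:
        return "long"
    return "standard"
```

For example, `classify_duration(wav_duration_seconds("meeting.wav"))` would tell you whether a recording (here a placeholder file name) falls in the range where the app switches to its long-audio settings, or exceeds the recommended maximum and may be worth splitting first.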