mogwai commented Nov 10, 2025

This change simplifies the voice activity detection (VAD) implementation by
replacing pyannote.audio with Silero VAD, which offers several benefits:

  • No HuggingFace authentication required
  • Lighter weight, with fewer dependencies
  • Simpler API
  • Existing function signatures are preserved

Changes:

  • Rewrote src/vui/vad.py to use Silero VAD instead of pyannote
  • Removed pyannote.audio dependency from pyproject.toml
  • Updated readme.md to remove pyannote authentication instructions
  • Added merge_segments helper function for post-processing
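
As an illustration of the kind of post-processing a merge helper performs, here is a minimal sketch. It is not the code from this PR: the name `merge_segments_sketch`, the `max_gap` parameter, and the `{"start", "end"}` dict layout in seconds are assumptions made for the example. The idea is to join neighbouring speech segments whenever the silence between them is short.

```python
from typing import Dict, List

# Hypothetical segment layout: {"start": seconds, "end": seconds}.
Segment = Dict[str, float]


def merge_segments_sketch(segments: List[Segment], max_gap: float = 0.3) -> List[Segment]:
    """Merge segments whose gap to the previous segment is at most max_gap seconds.

    Illustrative only; the actual merge_segments in src/vui/vad.py may use a
    different signature and different rules.
    """
    merged: List[Segment] = []
    for seg in sorted(segments, key=lambda s: s["start"]):
        if merged and seg["start"] - merged[-1]["end"] <= max_gap:
            # Close enough to the previous segment: extend it.
            merged[-1]["end"] = max(merged[-1]["end"], seg["end"])
        else:
            # Copy so the caller's dicts are not mutated.
            merged.append({"start": seg["start"], "end": seg["end"]})
    return merged


if __name__ == "__main__":
    raw = [
        {"start": 0.0, "end": 1.0},
        {"start": 1.2, "end": 2.0},  # 0.2 s gap: merged with the first
        {"start": 3.0, "end": 4.0},  # 1.0 s gap: kept separate
    ]
    print(merge_segments_sketch(raw))
```

Sorting first keeps the sketch correct for unordered input, and taking the `max` of the two end times handles overlapping segments.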

Improvements:
- Fixed double loading of Silero VAD model
- Stored both model and utils in the pipeline so they are loaded only once
- Added comprehensive test suite for validation
- Improved code documentation
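
One way to avoid loading the model twice is to cache the loaded `(model, utils)` pair behind a single accessor. The sketch below shows that pattern only and is not the code from this PR: `make_cached_loader` and `load_fn` are hypothetical names, and the loader is injected so the example runs without torch installed.

```python
from functools import lru_cache
from typing import Any, Callable, Tuple

Pipeline = Tuple[Any, Any]  # (model, utils)


def make_cached_loader(load_fn: Callable[[], Pipeline]) -> Callable[[], Pipeline]:
    """Wrap load_fn so the expensive load runs at most once.

    In a real module, load_fn would presumably wrap something like
    torch.hub.load("snakers4/silero-vad", "silero_vad"), which returns a
    (model, utils) pair; that wiring is an assumption here.
    """

    @lru_cache(maxsize=1)
    def get_pipeline() -> Pipeline:
        model, utils = load_fn()
        # Keep both together so callers never trigger a second load
        # just to obtain the utility functions.
        return model, utils

    return get_pipeline


if __name__ == "__main__":
    load_count = 0

    def fake_load() -> Pipeline:
        # Stand-in for the real model download and initialisation.
        global load_count
        load_count += 1
        return object(), ("get_speech_timestamps",)

    get_pipeline = make_cached_loader(fake_load)
    first = get_pipeline()
    second = get_pipeline()
    print(first is second, load_count)
```

Returning the model and utils as one cached tuple mirrors the "store both in the pipeline" change described above: every caller gets the same objects, and the underlying loader runs once.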

Test files added:
- test_vad.py: Full integration test with synthetic audio
- test_vad_code_review.py: Code structure validation
- test_vad_simple.py: Module structure test