- Beijing
- alanlee.fun
- @bluekirin93
- https://www.zhihu.com/people/lyjwf1216
Audio/Video
Python bindings for FFmpeg - with complex filtering support
A machine learning-based video super resolution and frame interpolation framework. Est. Hack the Valley II, 2018.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
Inference and training library for high-quality TTS models.
🐸💬 - a deep learning toolkit for Text-to-Speech, battle-tested in research and production
A slightly improved version of the official XTTS fine-tuning code
Web UI for running XTTS and fine-tuning it
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Neural building blocks for speaker diarization: speech activity detection, speaker change detection, overlapped speech detection, speaker embedding