Stars
Skills for Real Engineers. Straight from my .claude directory.
Simple, unified interface to multiple Generative AI providers
Automatic Speech Recognition with Speaker Diarization based on OpenAI Whisper
WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
Faster Whisper transcription with CTranslate2
Real-time, web-based Speech-to-Text app with Streamlit
A multimodal approach to emotion recognition using audio and text.
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Self-Supervised Speech Pre-training and Representation Learning Toolkit
MMSA is a unified framework for Multimodal Sentiment Analysis.
Sequence-to-sequence framework with a focus on Neural Machine Translation based on PyTorch
Robust Speech Recognition via Large-Scale Weak Supervision
Yet another Thai word segmentation tool that employs multiple types of linguistic information with attention mechanisms.
Pycord is a modern, easy-to-use, feature-rich, and async-ready API wrapper for Discord written in Python
Discovering Interpretable GAN Controls [NeurIPS 2020]