Stars
Character-aware audio-only subtitling
A curated collection of SQA research
A toolkit dedicated to speech evaluation.
Speech Security and Privacy Compendium - Mini
The official code repo of "HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection"
Official repository for "AM-RADIO: Reduce All Domains Into One"
The official repository of Dynamic-SUPERB.
PyTorch implementation of the ICASSP-24 paper: "Improving Audio Captioning Models with Fine-grained Audio Features, Text Embedding Supervision, and LLM Mix-up Augmentation"
DinoSR: Self-Distillation and Online Clustering for Self-supervised Speech Representation Learning
NeMo: a toolkit for conversational AI
TMT: Tri-Modal Translation between Speech, Image, and Text by Processing Different Modalities as Different Languages
Alexgichamba / espnet
Forked from espnet/espnet
End-to-End Speech Processing Toolkit
VITS2: Improving Quality and Efficiency of Single-Stage Text-to-Speech with Adversarial Learning and Architecture Design
A toolkit for the Spoken Language Understanding Evaluation (SLUE) benchmark. Refer to the paper at https://arxiv.org/abs/2111.10367 for more details. Official website: https://asappresearch.github.io/slue-toolkit/
Script to download corpora from the Linguistic Data Consortium (LDC)
soumimaiti / espnet
Forked from espnet/espnet
End-to-End Speech Processing Toolkit
Confidence interval computation for evaluation in machine learning using the bootstrapping approach
Scripts for data generation, scoring, and data manifest preparation for the CHiME-8 DASR task.
A repo containing download guidance and the corresponding scripts for the VoxBlink dataset.