The repository provides links to collections of influential and interesting research papers from top AI conferences, with open-source code to promote reproducibility and provide detailed implementa…

Python 118 5 Updated Oct 24, 2025

pyf98 / DPHuBERT

INTERSPEECH 2023: "DPHuBERT: Joint Distillation and Pruning of Self-Supervised Speech Models"

Python 117 12 Updated Jan 26, 2024

YoonjinXD / kadtk

A standardized toolkit of Kernel Audio Distance (KAD)—a distribution-free, unbiased, and computationally efficient metric for evaluating generative audio.

Python 97 9 Updated Jun 12, 2025

SJTMusicTeam / SVS_system

A system for singing voice synthesis

Python 79 19 Updated Jan 11, 2023

mrzjy / GenshinDialog

Extracting character conversations in Genshin Project

Python 75 8 Updated Feb 6, 2025

tiantiaf0627 / vox-profile-release

Vox-Profile Benchmark

Python 75 12 Updated Feb 16, 2026

crlandsc / torch-log-wmse

logWMSE, an audio quality metric & loss function with support for digital silence target. Useful for training and evaluating audio source separation systems.

Python 46 1 Updated Jan 29, 2026

hugeBlack / GenshinTextSearch

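A multilingual text search tool for Genshin Impact. It searches all in-game text and voice lines by keyword, and can be used for foreign-language learning, story research, model training, and similar purposes.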

Python 46 4 Updated Sep 3, 2024

SonyCSLParis / audio-metrics

Compute distribution-based quality metrics for audio data using embeddings, with a focus on music.

Python 43 3 Updated Jan 15, 2026

nomonosound / log-wmse-audio-quality

logWMSE, an audio quality metric with support for digital silence target. Useful for evaluating audio source separation systems, even when there are many audio tracks or stems.

Python 38 3 Updated Jun 24, 2025

jerryuhoo / VISinger

Forked from PlayVoice/VI-SVS

Uses VITS and Opencpop to develop singing voice synthesis; different from VISinger.

Python 37 2 Updated Feb 24, 2023

Jiatong ftshijt

Stars

tencent-ailab / pika

xiaomi-research / r1-aqa

SJTMusicTeam / Muskits

neubig / starter-repo

sarulab-speech / UTMOSv2

Stability-AI / stable-audio-metrics

OpenBMB / UltraEval-Audio

YatingMusic / compound-word-transformer

nttcslab-sp / kaldiio

espnet / espnet_model_zoo

microsoft / fadtk

M4Singer / M4Singer

baichuan-inc / Baichuan-Audio

Takaaki-Saeki / DiscreteSpeechMetrics

a43992899 / MARBLE

espnet / espnet_onnx

unilight / sheet

jfsantos / SRMRpy

facebookresearch / SimulEval

DmitryRyumin / NewEraAI-Papers