Stars
Project for training an SSL-based deepfake speech detector
[ACM MM 2025] Officially released code for ALLM4ADD
Official code for "F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching"
Research and development (R&D) is crucial for the enhancement of industrial productivity, especially in the AI era, where the core aspects of R&D are mainly focused on data and models. We are commi…
The PyTorch implementation of BAM for Partialspoof Audio Localization.
[ICLR 2025] SONICS: Synthetic Or Not - Identifying Counterfeit Songs
A list of tools, papers and code related to Deepfake Detection.
A library for calculating the FLOPs in the forward() process based on torch.fx
SALMONN family: A suite of advanced multi-modal LLMs
A list of tools, papers and code related to Fake Audio Detection.
The official Soundwave repository
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
Unified Efficient Fine-Tuning of 100+ LLMs & VLMs (ACL 2024)
Official PyTorch implementation of "AASIST: Audio Anti-Spoofing using Integrated Spectro-Temporal Graph Attention Networks"
AI Audio Datasets (AI-ADS) 🎵, including Speech, Music, and Sound Effects, which can provide training data for Generative AI, AIGC, AI model training, intelligent audio tool development, and audio a…
This is the source code for Detecting Adversarial Data by Probing Multiple Perturbations Using Expected Perturbation Score (ICML 2023).
Models and code for RepCodec: A Speech Representation Codec for Speech Tokenization
Robust Speech Recognition via Large-Scale Weak Supervision
Implementation of CoBERT: Self-Supervised Speech Representation Learning Through Code Representation Learning
Unofficial PyTorch implementation of SoundStream with training code and a 16 kHz pretrained checkpoint
A generative speech model for daily dialogue.
A Python Library for Outlier and Anomaly Detection, Integrating Classical and Deep Learning Techniques
Amphion (/æmˈfaɪən/) is a toolkit for Audio, Music, and Speech Generation. Its purpose is to support reproducible research and help junior researchers and engineers get started in the field of audi…
Defending against Adversarial Audio via Diffusion Model (ICLR 2023)
A comprehensive benchmark of deepfake detection
Official PyTorch implementation of "t-EER: Parameter-Free Tandem Evaluation Metric of Countermeasures and Biometric Comparators"
AdvSV stands as the first dataset developed specifically for evaluating Speaker Verification (SV) systems against adversarial attacks. It aims to benchmark the robustness of ASV models in the face…