EBM

This repository contains the official PyTorch implementation of the following paper:

Title : Enhancing Audio Deepfake Detection by Improving Representation Similarity of Bonafide Speech
Authors : Seung-bin Kim, Hyun-seo Shin, Jungwoo Heo, Chan-yeong Lim, Kyo-Won Koo, Jisoo Son, Sanghyun Hong, Souhwan Jung, and Ha-Jin Yu

Abstract

The key to audio deepfake detection is distinguishing bonafide speech from carefully generated spoofed speech. The more distinguishable they are, the better and more generalizable the detection becomes. In this work, we propose a novel approach to enhance this distinguishability in the latent space. Inspired by one-class classification, we formulate an objective function that encourages the contraction of bonafide samples while dispersing fake speech samples during training. Our objective consists of two key components: Bonafide-Pair Learning (BPL) loss and an Extended One-Class Softmax (EOC-S) loss. The BPL reduces intra-class variance by aligning the embeddings of augmented bonafide pairs, while the EOC-S leverages Adam-based centroid updates and margin constraints to reinforce separability from spoofed data. Experimental results on ASVspoof datasets demonstrate that our proposed approach enhances detection performance across diverse attack scenarios.

Our experimental code is adapted from the code linked here.

Data

The ASVspoof 2019 LA and ASVspoof 2021 datasets were used for training and testing. The ASVspoof 2019 LA training set consists of 2,580 bona fide samples and 22,800 spoof samples.
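The two counts above imply a strongly imbalanced training set, with roughly one bona fide sample in ten. The short calculation below only restates that arithmetic from the quoted figures; it is not part of the repository code.

```python
# Sample counts quoted above for the ASVspoof 2019 LA training set.
NUM_BONAFIDE = 2_580
NUM_SPOOF = 22_800

total = NUM_BONAFIDE + NUM_SPOOF
bonafide_fraction = NUM_BONAFIDE / total
spoof_per_bonafide = NUM_SPOOF / NUM_BONAFIDE

print(f"total samples      : {total}")
print(f"bona fide fraction : {bonafide_fraction:.1%}")
print(f"spoof per bona fide: {spoof_per_bonafide:.2f}")
```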

Additionally, we applied vocoder augmentation to the training set using HiFi-GAN. The vocoder was applied only to spoof samples; the procedure for applying HiFi-GAN is described here.

Environment

We ran our experiments in the NVIDIA GPU Cloud (NGC) Docker image nvcr.io/nvidia/pytorch:23.08-py3.

Build the Docker image and start the container:

./docker_build.sh
./docker_run.sh

Before running docker_run.sh, change the mapped data path (/data) to match your machine. The dataset paths are specified in arguments.py and must be set accordingly:

In arguments.py

'path_19LA' : '/data/ASVspoof2019',
'path_21LA' : '/data/ASVspoof2021_LA_eval',
'path_21DF' : '/data/ASVspoof2021_DF'
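A wrong mount or a mistyped path usually surfaces only once data loading starts, so it can help to check the directories first. The sketch below is a standalone helper and not part of this repository: `DATASET_PATHS` and `find_missing_paths` are hypothetical names, and the paths are copied from the excerpt above.

```python
from pathlib import Path

# Keys and values copied from the arguments.py excerpt; adjust to your mount.
DATASET_PATHS = {
    "path_19LA": "/data/ASVspoof2019",
    "path_21LA": "/data/ASVspoof2021_LA_eval",
    "path_21DF": "/data/ASVspoof2021_DF",
}


def find_missing_paths(paths):
    """Return the {key: path} entries whose directory does not exist."""
    return {key: p for key, p in paths.items() if not Path(p).is_dir()}


if __name__ == "__main__":
    missing = find_missing_paths(DATASET_PATHS)
    for key, p in missing.items():
        print(f"[missing] {key}: {p}")
    if not missing:
        print("All dataset directories found.")
```

Running it inside the container, before main.py, reports any key whose directory is not visible at the mapped location.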

A basic logger stores information locally. To additionally use the online Neptune logger:

# Neptune: add 'neptune_user' and 'neptune_token'
# to the "system_args" dictionary, for example:
'neptune_user'  : 'user-name',
'neptune_token' : 'NEPTUNE_TOKEN'
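To avoid committing a token to version control, the two entries could be read from environment variables instead of being written into arguments.py. This is only a sketch under assumptions: the variable names `NEPTUNE_USER` and `NEPTUNE_TOKEN` and the helper `neptune_credentials` are hypothetical, and the `system_args` dictionary here is a stand-in for the one in arguments.py.

```python
import os


def neptune_credentials(env=os.environ):
    """Build the two logger entries from environment variables, if both are set."""
    user = env.get("NEPTUNE_USER")    # hypothetical variable name
    token = env.get("NEPTUNE_TOKEN")  # hypothetical variable name
    if not user or not token:
        return {}  # nothing to add; only the local logger would be used
    return {"neptune_user": user, "neptune_token": token}


# Stand-in for the "system_args" dictionary in arguments.py.
system_args = {}
system_args.update(neptune_credentials())
```

Whether the training code accepts a `system_args` without these two keys is not stated here, so check how the logger is constructed before relying on the fallback.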

Training

Training on a single GPU

CUDA_VISIBLE_DEVICES=0 python3 main.py

Training on multiple GPUs

CUDA_VISIBLE_DEVICES=0,1 python3 main.py
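Both commands differ only in the value of CUDA_VISIBLE_DEVICES, which limits the GPUs a process can see. The sketch below shows how that variable can be inspected from Python; `visible_gpu_ids` is a hypothetical helper, and how main.py distributes work across the visible devices is not covered here.

```python
import os


def visible_gpu_ids(env=os.environ):
    """Return the device IDs listed in CUDA_VISIBLE_DEVICES, or None if unset."""
    raw = env.get("CUDA_VISIBLE_DEVICES")
    if raw is None:
        return None  # variable unset: CUDA exposes every GPU on the machine
    # An empty string yields an empty list, i.e. no GPU is visible.
    return [item.strip() for item in raw.split(",") if item.strip()]


ids = visible_gpu_ids()
if ids is None:
    print("CUDA_VISIBLE_DEVICES is unset; all GPUs are visible.")
else:
    print(f"{len(ids)} GPU(s) requested: {ids}")
```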

Citation

Please cite the paper below if you make use of the code.

@article{kim2025ebm,
  title={Enhancing Audio Deepfake Detection by Improving Representation Similarity of Bonafide Speech},
  author={Seung-bin Kim and Hyun-seo Shin and Jungwoo Heo and Chan-yeong Lim and Kyo-Won Koo and Jisoo Son and Sanghyun Hong and Souhwan Jung and Ha-Jin Yu},
  journal={},
  year={2025}
}
