A curated list of different papers and datasets in various areas of audio-visual processing
Updated Jan 30, 2024
ACM MM 2021: 'Is Someone Speaking? Exploring Long-term Temporal Features for Audio-visual Active Speaker Detection'
Implementation of "EPIC-Fusion: Audio-Visual Temporal Binding for Egocentric Action Recognition, ICCV, 2019" in PyTorch
This repo contains the official PyTorch implementation of: Diverse and Aligned Audio-to-Video Generation via Text-to-Video Model Adaptation
An audio visualizer for React. Provides separate components to visualize both live audio and audio blobs.
Libvisual Audio Visualization
🎙 Generates waveform paths for SVG 🎶
Human emotion understanding using a multimodal dataset.
Source code for "Sparse in Space and Time: Audio-visual Synchronisation with Trainable Selectors" (Spotlight at BMVC 2022)
Programmatic minimalistic audio visualizations.
[CVPR 2023] Collecting Cross-Modal Presence-Absence Evidence for Weakly-Supervised Audio-Visual Event Perception
Audio-visual corruption modeling from the paper "Watch or Listen: Robust Audio-Visual Speech Recognition with Visual Corruption Modeling and Reliability Scoring" (CVPR 2023)
Efficient synchronization from sparse cues
Audio Visual Scene-Aware Dialog (AVSD) Challenge at the 10th Dialog System Technology Challenge (DSTC)
Transformer-based online speech recognition system with TensorFlow 2
[ECCV 2022] Joint-Modal Label Denoising for Weakly-Supervised Audio-Visual Video Parsing
Official code for the WACV 2024 paper "Annotation-free Audio-Visual Segmentation"
Code for the CVPR 2021 paper "Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing"
Official source code of the INTERSPEECH 2023 paper: "Audio-Visual Speech Separation in Noisy Environments with a Lightweight Iterative Model" (AVLIT)