- Gwangju Institute of Science and Technology
- Republic of Korea
- https://seongqjini.com
Stars
(Interspeech 2026, official code) MeCo: One-Step MeanFlow-based Corrector for Multi-Channel Speech Separation
(ICASSP 2025, official code) FlowSE: Flow Matching-based Speech Enhancement
This app is intended to automatically create a corpus for ASR systems using pseudo-labeling.
🇺🇦 Open Source Ukrainian Text-to-Speech datasets
🇺🇦 Speech Recognition & Synthesis for Ukrainian
An example of Django project with basic user functionality.
Emotional Dialogue Acts corpus contains dialogue act labels for the multimodal conversational emotion datasets IEMOCAP and MELD. https://www.aclweb.org/anthology/2020.lrec-1.78/
This repo contains implementation of different architectures for emotion recognition in conversations.
FastSpeech2, modified for training on the KSS dataset. Modified from https://github.com/ming024/FastSpeech2
This is the official GitHub repo for our ACM MM 2025 paper: "Grounding Emotion Recognition with Visual Prototypes: VEGA--Revisiting CLIP in MERC"
Context filtering with graph-based multi-frequency propagation for multimodal emotion recognition in conversations.
This repository represents the official implementation of the paper titled "Test-Time Prompt Tuning for Zero-Shot Depth Completion (ICCV 2025 Highlight)".
Official page of "DeepASA: An Object-Oriented Multi-Purpose Network for Auditory Scene Analysis"
Source code for ICASSP 2022 paper "MM-DFN: Multimodal Dynamic Fusion Network For Emotion Recognition in Conversations".
PyTorch implementation of the paper "DialogueGCN: A Graph Convolutional Neural Network for Emotion Recognition in Conversation".