Stars
💩 Profanity means swear words. The adjective is 'profane'. Profanities can also be called curse ("cuss") words, dirty words, bad words, foul language, obscenity, obscene language, or expletives. It…
This repository contains the official implementation code of the paper Improving Multimodal Fusion with Hierarchical Mutual Information Maximization for Multimodal Sentiment Analysis, accepted at E…
A flexible, high-performance serving system for machine learning models
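Speech recognition implemented with PaddlePaddle, focused on Chinese speech recognition. The project is complete and recognition quality is good. Supports training and inference on Windows and Linux, and inference on Nvidia Jetson development boards.
This repository contains various models targeting multimodal representation learning and multimodal fusion for downstream tasks such as multimodal sentiment analysis.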
[NeurIPS 2021] Multiscale Benchmarks for Multimodal Representation Learning
This project is dedicated to the implementation and research of Kolmogorov-Arnold convolutional networks. The repository includes implementations of 1D, 2D, and 3D convolutions with different kern…
An efficient pure-PyTorch implementation of Kolmogorov-Arnold Network (KAN).
An unofficial implementation of Graph Transformer (Masked Label Prediction: Unified Message Passing Model for Semi-Supervised Classification) - IJCAI 2021
PyTorch implementation for the paper: Multivariate, Multi-frequency and Multimodal: Rethinking Graph Neural Networks for Emotion Recognition in Conversation, CVPR 2023.
[ACM MM 2019] Official Implementation for MMGCN: Multi-modal Graph Convolution Network for Personalized Recommendation of Micro-video
This repo contains implementations of different architectures for emotion recognition in conversations.
[ACL'19] [PyTorch] Multimodal Transformer
PaddlePaddle-based implementation of Modeling Relational Data with Graph Convolutional Networks
[EMNLP2023] Conversation Understanding using Relational Temporal Graph Neural Networks with Auxiliary Cross-Modality Interaction
Emotional Speech Recognition with Pre-trained Deep Visual Models
State-of-the-Art Text Embeddings
PyTorch code for the ACL-IJCNLP accepted paper "Directed Acyclic Graph Network for Conversational Emotion Recognition"
Open source speech codec designed for communications quality speech between 700 and 3200 bit/s. The main application is low bandwidth HF/VHF digital radio.
INTERSPEECH 2023-2024 Papers: A complete collection of influential and exciting research papers from the INTERSPEECH 2023-24 conference. Explore the latest advances in speech and language processin…
MetricAug: A Distortion Metric-Lead Augmentation Strategy for Training Noise-Robust Speech Emotion Recognizer Official Implementation