Lists (32)
AIGC
Attention
Audio-Visual
Awesome
Awesome series
ChatGPT
CNN
Contrastive Learning
Deep Learning
Diffusion
ERC
Emotion recognition in conversation
FER
Facial expression recognition
Funny and Useful AI Tool
job
LLM
Math
MER
Multimodal emotion recognition
Missing Modality
MMSA
Multimodal sentiment analysis
Multimodal Learning
NLP
OOD
Open Vocabulary Learning
Paper Writing
Prompt Learning
Self-supervised Learning
Semi-supervised learning
SER
Speech emotion recognition
Time Series
Uncertainty Learning
Video Transformer
Vision Transformer
VLP
Vision-Language Pre-training
Starred repositories
A curated list of papers, models, datasets, and benchmarks for unified multi-modal embedding models.
Official repository for the paper “Rethinking Facial Expression Recognition in the Era of Multimodal Large Language Models”
A vision foundation model for affective and facial recognition tasks
EmoCapCLIP: Learning Transferable Facial Emotion Representations from Large-Scale Semantically Rich Captions
[CVPR'25] AVF-MAE++: Scaling Affective Video Facial Masked Autoencoders via Efficient Audio-Visual Self-Supervised Learning
Janus-Series: Unified Multimodal Understanding and Generation Models
HunyuanVideo: A Systematic Framework For Large Video Generation Model
A repository for storing models that have been inter-converted between various frameworks. Supported frameworks are TensorFlow, PyTorch, ONNX, OpenVINO, TFJS, TFTRT, TensorFlowLite (Float32/16/INT8…
Official implementation of MagicFace: High-Fidelity Facial Expression Editing with Action-Unit Control
LAFS: Landmark-based Facial Self-supervised Learning for Face Recognition
EC-STFL: Expression-Clustered Spatiotemporal Feature Learning. It is proposed for video-based Facial Expression Recognition (FER) task.
This repository provides the codes for MMA-DFER: multimodal (audiovisual) emotion recognition method. This is an official implementation for the paper MMA-DFER: MultiModal Adaptation of unimodal mo…
[CVPR 2024] EmoVIT: Revolutionizing Emotion Insights with Visual Instruction Tuning
Awesome speech/audio LLMs, representation learning, and codec models
📄 A collection of résumé templates suited to Chinese (LaTeX, HTML/JS, and so on), maintained by @hoochanlon
[IJCAI 2024] EAT: Self-Supervised Pre-Training with Efficient Audio Transformer
[Information Fusion 2024] HiCMAE: Hierarchical Contrastive Masked Autoencoder for Self-Supervised Audio-Visual Emotion Recognition
Toolkits for Multimodal Emotion Recognition
Official code of "VRA: Variational Rectified Activation for Out-of-distribution Detection"
A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models (ICASSP 2024)
Repository with the code of the paper: A proposal for Multimodal Emotion Recognition using aural transformers and Action Units on RAVDESS dataset
Awesome list for research on CLIP (Contrastive Language-Image Pre-Training).
This repo lists relevant papers summarized in our survey paper: A Systematic Survey of Prompt Engineering on Vision-Language Foundation Models.
A curated list of prompt-based papers in computer vision and vision-language learning.