30th ACM Multimedia 2022: Lisboa, Portugal
- João Magalhães, Alberto Del Bimbo, Shin'ichi Satoh, Nicu Sebe, Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni:
MM '22: The 30th ACM International Conference on Multimedia, Lisboa, Portugal, October 10 - 14, 2022. ACM 2022, ISBN 978-1-4503-9203-7
Keynote Talks
- Yoelle Maarek:
Alexa, let's work together! How Alexa Helps Customers Complete Tasks with Verbal and Visual Guidance in the Alexa Prize TaskBot Challenge. 1-2 - Nuria Oliver:
Data Science against COVID-19: The Valencian Experience. 3-4 - Douwe Kiela:
Grounding, Meaning and Foundation Models: Adventures in Multimodal Machine Learning. 5
Oral Session I: Engaging Users with Multimedia -- Emotional and Social Signals
- Rui Li, Yiting Wang, Wei-Long Zheng, Bao-Liang Lu:
A Multi-view Spectral-Spatial-Temporal Masked Autoencoder for Decoding Emotions with Self-supervised Learning. 6-14 - Teng Sun, Wenjie Wang, Liqiang Jing, Yiran Cui, Xuemeng Song, Liqiang Nie:
Counterfactual Reasoning for Out-of-distribution Multimodal Sentiment Analysis. 15-23 - Yuanyuan Liu, Wei Dai, Chuanxu Feng, Wenbin Wang, Guanghao Yin, Jiabei Zeng, Shiguang Shan:
MAFW: A Large-scale, Multi-modal, Compound Affective Database for Dynamic Facial Expression Recognition in the Wild. 24-32 - Shengzhe Liu, Xin Zhang, Jufeng Yang:
SER30K: A Large-Scale Dataset for Sticker Emotion Recognition. 33-41
Poster Session I: Engaging Users with Multimedia -- Emotional and Social Signals
- Jicai Pan, Shangfei Wang, Lin Fang:
Representation Learning through Multimodal Attention and Time-Sync Comments for Affective Video Content Analysis. 42-50 - Xujin Li, Wei Wei, Shuang Qiu, Huiguang He:
TFF-Former: Temporal-Frequency Fusion Transformer for Zero-training Decoding of Two BCI Tasks. 51-59 - Yuedong Chen, Xu Yang, Tat-Jen Cham, Jianfei Cai:
Towards Unbiased Visual Emotion Recognition via Causal Intervention. 60-69 - Michal Balazia, Philipp Müller, Ákos Levente Tánczos, August von Liechtenstein, François Brémond:
Bodily Behaviors in Social Interaction: Novel Annotations and State-of-the-Art Evaluation. 70-79 - Niki Maria Foteinopoulou, Ioannis Patras:
Learning from Label Relationships in Human Affect. 80-89 - Ziyi Ye, Xiaohui Xie, Yiqun Liu, Zhihong Wang, Xuesong Chen, Min Zhang, Shaoping Ma:
Brain Topography Adaptive Network for Satisfaction Modeling in Interactive Information Access System. 90-100 - Yan Wang, Yixuan Sun, Wei Song, Shuyong Gao, Yiwen Huang, Zhaoyu Chen, Weifeng Ge, Wenqiang Zhang:
DPCNet: Dual Path Multi-Excitation Collaborative Network for Facial Expression Representation Learning in Videos. 101-110 - Yingjie Chen, Chong Chen, Xiao Luo, Jianqiang Huang, Xian-Sheng Hua, Tao Wang, Yun Liang:
Pursuing Knowledge Consistency: Supervised Hierarchical Contrastive Learning for Facial Action Unit Recognition. 111-119 - Shiqing Zhang, Ruixin Liu, Yijiao Yang, Xiaoming Zhao, Jun Yu:
Unsupervised Domain Adaptation Integrating Transformer and Mutual Information for Cross-Corpus Speech Emotion Recognition. 120-129 - Zhen Xing, Weimin Tan, Ruian He, Yangle Lin, Bo Yan:
Co-Completion for Occluded Facial Expression Recognition. 130-140 - Weichen Yu, Hongyuan Yu, Yan Huang, Liang Wang:
Generalized Inter-class Loss for Gait Recognition. 141-150 - Fan Qi, Zixin Zhang, Xianshan Yang, Huaiwen Zhang, Changsheng Xu:
Feeling Without Sharing: A Federated Video Emotion Recognition Framework Via Privacy-Agnostic Hybrid Aggregation. 151-160 - Jianjian Shao, Zhenqian Wu, Yuanyan Luo, Shudong Huang, Xiaorong Pu, Yazhou Ren:
Self-Paced Label Distribution Learning for In-The-Wild Facial Expression Recognition. 161-169 - Yong Zhao, Haifeng Chen, Hichem Sahli, Ke Lu, Dongmei Jiang:
Uncertainty-Aware Semi-Supervised Learning of 3D Face Rigging from Single Image. 170-179 - Junyu Chen, Qianqian Xu, Zhiyong Yang, Xiaochun Cao, Qingming Huang:
A Unified Framework against Topology and Class Imbalance. 180-188 - Yang Yu, Dong Zhang, Shoushan Li:
Unified Multi-modal Pre-training for Few-shot Sentiment Analysis with Prompt-based Learning. 189-198 - Zhicheng Zhang, Jufeng Yang:
Temporal Sentiment Localization: Listen and Look in Untrimmed Videos. 199-208 - Xinyu Cheng, Wei Wei, Changde Du, Shuang Qiu, Sanli Tian, Xiaojun Ma, Huiguang He:
VigilanceNet: Decouple Intra- and Inter-Modality Learning for Multimodal Vigilance Estimation in RSVP-Based BCI. 209-217 - Lijuan Wang, Guoli Jia, Ning Jiang, Haiying Wu, Jufeng Yang:
EASE: Robust Facial Expression Recognition via Emotion Ambiguity-SEnsitive Cooperative Networks. 218-227 - Bo-Kai Ruan, Ling Lo, Hong-Han Shuai, Wen-Huang Cheng:
Mimicking the Annotation Process for Recognizing the Micro Expressions. 228-236
Oral Session II: Engaging User with Multimedia -- Multimedia Search and Recommendation
- Peng-Fei Zhang, Guangdong Bai, Zi Huang, Xin-Shun Xu:
Machine Unlearning for Image Retrieval: A Generative Scrubbing Approach. 237-245 - Jianfeng Dong, Xianke Chen, Minsong Zhang, Xun Yang, Shujie Chen, Xirong Li, Xun Wang:
Partially Relevant Video Retrieval. 246-257 - Fangxiong Xiao, Lixi Deng, Jingjing Chen, Houye Ji, Xiaorui Yang, Zhuoye Ding, Bo Long:
From Abstract to Details: A Generative Multimodal Fusion Framework for Recommendation. 258-267 - Weili Guan, Xuemeng Song, Haoyu Zhang, Meng Liu, Chung-Hsing Yeh, Xiaojun Chang:
Bi-directional Heterogeneous Graph Hashing towards Efficient Outfit Recommendation. 268-276 - MeiYu Liang, Junping Du, Xiaowen Cao, Yang Yu, Kangkang Lu, Zhe Xue, Min Zhang:
Semantic Structure Enhanced Contrastive Adversarial Hash Network for Cross-media Representation Learning. 277-285 - Dan Song, Yue Yang, Weizhi Nie, Xuanya Li, An-An Liu:
Cross-Domain 3D Model Retrieval Based On Contrastive Learning And Label Propagation. 286-295 - Zhixin Ma, Chong-Wah Ngo:
Interactive Video Corpus Moment Retrieval using Reinforcement Learning. 296-306 - Chao Huang, Yabo Liu, Zheng Zhang, Chengliang Liu, Jie Wen, Yong Xu, Yaowei Wang:
Hierarchical Graph Embedded Pose Regularity Learning via Spatio-Temporal Transformer for Abnormal Behavior Detection. 307-315 - Yue Zhao, Weizhi Nie, Zan Gao, Anan Liu:
HMTN: Hierarchical Multi-scale Transformer Network for 3D Shape Recognition. 316-324 - Peng-Fei Zhang, Zi Huang, Guangdong Bai, Xin-Shun Xu:
IDEAL: High-Order-Ensemble Adaptation Network for Learning with Noisy Labels. 325-333 - Yu Zheng, Chen Gao, Jingtao Ding, Lingling Yi, Depeng Jin, Yong Li, Meng Wang:
DVR: Micro-Video Recommendation Optimizing Watch-Time-Gain under Duration Bias. 334-345
Poster Session II: Engaging User with Multimedia -- Multimedia Search and Recommendation
- Bolin Zhang, Chao Yang, Bin Jiang, Xiaokang Zhou:
Video Moment Retrieval with Hierarchical Contrastive Learning. 346-355 - Avinash Madasu, Junier Oliva, Gedas Bertasius:
Learning to Retrieve Videos by Asking Questions. 356-365 - Jinan Sun, Haixin Wang, Xiao Luo, Shikun Zhang, Wei Xiang, Chong Chen, Xian-Sheng Hua:
HEART: Towards Effective Hash Codes under Label Noise. 366-375 - Zongshen Mu, Yueting Zhuang, Jie Tan, Jun Xiao, Siliang Tang:
Learning Hybrid Behavior Patterns for Multimedia Recommendation. 376-384 - Feiyu Chen, Junjie Wang, Yinwei Wei, Hai-Tao Zheng, Jie Shao:
Breaking Isolation: Multimodal Graph Fusion for Multimedia Recommendation by Edge-wise Modulation. 385-394 - Jianwei Zhu, Zhixin Li, Yufei Zeng, Jiahui Wei, Huifang Ma:
Image-Text Matching with Fine-Grained Relational Dependency and Bidirectional Attention-Based Generative Networks. 395-403 - Yuxi Sun, Shanshan Feng, Xutao Li, Yunming Ye, Jian Kang, Xu Huang:
Visual Grounding in Remote Sensing Images. 404-412 - Guolong Wang, Xun Wu, Zhaoyuan Liu, Junchi Yan:
Prompt-based Zero-shot Video Moment Retrieval. 413-421 - Yabing Wang, Jianfeng Dong, Tianxiang Liang, Minsong Zhang, Rui Cai, Xun Wang:
Cross-Lingual Cross-Modal Retrieval with Noise-Robust Learning. 422-433 - Ziyue Wang, Aozhu Chen, Fan Hu, Xirong Li:
Learn to Understand Negation in Video Retrieval. 434-443 - Yongjie Zhu, Chunhui Han, Yuefeng Zhan, Bochen Pang, Zhaoju Li, Hao Sun, Si Li, Boxin Shi, Nan Duan, Weiwei Deng, Ruofei Zhang, Liangjie Zhang, Qi Zhang:
AdsCVLR: Commercial Visual-Linguistic Representation Modeling in Sponsored Search. 444-452 - Junfeng Tu, Xueliang Liu, Zongxiang Lin, Richang Hong, Meng Wang:
Differentiable Cross-modal Hashing via Multimodal Transformers. 453-461 - Zhixin Ling, Zhen Xing, Jiangtong Li, Li Niu:
Multi-Level Region Matching for Fine-Grained Sketch-Based Image Retrieval. 462-470 - Xiaolin Zheng, Jiajie Su, Weiming Liu, Chaochao Chen:
DDGHM: Dual Dynamic Graph with Hybrid Metric Training for Cross-Domain Sequential Recommendation. 471-481 - Yong Zhuang, Tong Yu, Junda Wu, Shiqu Wu, Shuai Li:
Spatial-Temporal Aligned Multi-Agent Learning for Visual Dialog Systems. 482-490 - Huafeng Liu, Liping Jing, Dahai Yu, Mingjie Zhou, Michael Ng:
Learning Intrinsic and Extrinsic Intentions for Cold-start Recommendation with Neural Stochastic Processes. 491-500 - Pingting Hong, Dayan Wu, Bo Li, Weiping Wang:
Camera-specific Informative Data Augmentation Module for Unbalanced Person Re-identification. 501-510 - Zhiqiang Guo, Guohui Li, Jianjun Li, Huaicong Chen:
TopicVAE: Topic-aware Disentanglement Representation Learning for Enhanced Recommendation. 511-520 - Chao Huang, Chengliang Liu, Zheng Zhang, Zhihao Wu, Jie Wen, Qiuping Jiang, Yong Xu:
Pixel-Level Anomaly Detection via Uncertainty-aware Prototypical Transformer. 521-530 - Lei Tan, Pingyang Dai, Rongrong Ji, Yongjian Wu:
Dynamic Prototype Mask for Occluded Person Re-Identification. 531-540 - Nan Pu, Yu Liu, Wei Chen, Erwin M. Bakker, Michael S. Lew:
Meta Reconciliation Normalization for Lifelong Person Re-Identification. 541-549 - Lin Wang, Wanqian Zhang, Dayan Wu, Fei Zhu, Bo Li:
Attack is the Best Defense: Towards Preemptive-Protection Person Re-Identification. 550-559 - Kai Chen, Weihua Chen, Tao He, Rong Du, Fan Wang, Xiuyu Sun, Yuchen Guo, Guiguang Ding:
TAGPerson: A Target-Aware Generation Pipeline for Person Re-identification. 560-571 - Dayan Wu, Qinghang Su, Bo Li, Weiping Wang:
Efficient Hash Code Expansion by Recycling Old Bits. 572-580 - Desheng Cai, Shengsheng Qian, Quan Fang, Jun Hu, Changsheng Xu:
Adaptive Anti-Bottleneck Multi-Modal Graph Learning Network for Personalized Micro-video Recommendation. 581-590 - Uttaran Bhattacharya, Gang Wu, Stefano Petrangeli, Viswanathan Swaminathan, Dinesh Manocha:
Show Me What I Like: Detecting User-Specific Video Highlights Using Content-Based Multi-Head Attention. 591-600 - Kai Wang, Yifan Wang, Xing Xu, Xin Liu, Weihua Ou, Huimin Lu:
Prototype-based Selective Knowledge Distillation for Zero-Shot Sketch Based Image Retrieval. 601-609 - Siyuan Li, Xing Xu, Zailei Zhou, Yang Yang, Guoqing Wang, Heng Tao Shen:
ARRA: Absolute-Relative Ranking Attack against Image Retrieval. 610-618 - Xiaoyu Du, Zike Wu, Fuli Feng, Xiangnan He, Jinhui Tang:
Invariant Representation Learning for Multimedia Recommendation. 619-628 - Tianyuan Xu, Xueliang Liu, Zhen Huang, Dan Guo, Richang Hong, Meng Wang:
Early-Learning regularized Contrastive Learning for Cross-Modal Retrieval with Noisy Labels. 629-637 - Yiwei Ma, Guohai Xu, Xiaoshuai Sun, Ming Yan, Ji Zhang, Rongrong Ji:
X-CLIP: End-to-End Multi-grained Contrastive Learning for Video-Text Retrieval. 638-647 - Yi Zhong, Chengyao Wang, Shiyong Li, Zhu Zhou, Yaowei Wang, Wei-Shi Zheng:
Mixed Supervision for Instance Learning in Object Detection with Few-shot Annotation. 648-658 - Zeyu Ma, Wei Ju, Xiao Luo, Chong Chen, Xian-Sheng Hua, Guangming Lu:
Improved Deep Unsupervised Hashing via Prototypical Learning. 659-667 - Rui Wang, Feng Chen, Jun Tang, Pu Yan:
Adaptive Camera Margin for Mask-guided Domain Adaptive Person Re-identification. 668-677 - Shengshan Hu, Ziqi Zhou, Yechao Zhang, Leo Yu Zhang, Yifeng Zheng, Yuanyuan He, Hai Jin:
BadHash: Invisible Backdoor Attacks against Deep Hashing with Clean Label. 678-686 - Xiaohao Liu, Zhulin Tao, Jiahong Shao, Lifang Yang, Xianglin Huang:
EliMRec: Eliminating Single-modal Bias in Multimedia Recommendation. 687-695 - Zhicheng Sun, Yadong Mu:
Patch-based Knowledge Distillation for Lifelong Person Re-Identification. 696-707
Oral Session III: Engaging User with Multimedia -- Summarization, Analytics, and Storytelling
- Xiaodong Chen, Wu Liu, Xinchen Liu, Yongdong Zhang, Jungong Han, Tao Mei:
MAPLE: Masked Pseudo-Labeling autoEncoder for Semi-supervised Point Cloud Action Recognition. 708-718 - Xun Jiang, Xing Xu, Zhiguo Chen, Jingran Zhang, Jingkuan Song, Fumin Shen, Huimin Lu, Heng Tao Shen:
DHHN: Dual Hierarchical Hybrid Network for Weakly-Supervised Audio-Visual Video Parsing. 719-727
Poster Session III: Engaging User with Multimedia -- Summarization, Analytics, and Storytelling
- Dixin Luo, Yutong Wang, Angxiao Yue, Hongteng Xu:
Weakly-Supervised Temporal Action Alignment Driven by Unbalanced Spectral Fused Gromov-Wasserstein Distance. 728-739 - Jiehang Xie, Xuanbai Chen, Shao-Ping Lu, Yulu Yang:
A Knowledge Augmented and Multimodal-Based Framework for Video Summarization. 740-749 - Dizhan Xue, Shengsheng Qian, Quan Fang, Changsheng Xu:
MMT: Image-guided Story Ending Generation with Multimodal Memory Transformer. 750-758 - Niankai Zhang, Junli Zhao, Fuqing Duan, Zhenkuan Pan, Zhongke Wu, Mingquan Zhou, Xianfeng Gu:
An End-to-End Conditional Generative Adversarial Network Based on Depth Map for 3D Craniofacial Reconstruction. 759-768 - Bowen Li, Philip H. S. Torr, Thomas Lukasiewicz:
Clustering Generative Adversarial Networks for Story Visualization. 769-778 - Jiayin Cai, Changlin Li, Xin Tao, Chun Yuan, Yu-Wing Tai:
DeViT: Deformed Vision Transformers in Video Inpainting. 779-789 - Ming Yao, Yu Bai, Wei Du, Xuejun Zhang, Heng Quan, Fuli Cai, Hongwei Kang:
Multi-Level Spatiotemporal Network for Video Summarization. 790-798
Oral Session IV: Experience -- Interactions and Quality of Experience
- Li Yang, Mai Xu, Tie Liu, Liangyu Huo, Xinbo Gao:
TVFormer: Trajectory-guided Visual Quality Assessment on 360° Images with Transformers. 799-808 - Zheng Lin, Zheng-Peng Duan, Zhao Zhang, Chun-Le Guo, Ming-Ming Cheng:
KnifeCut: Refining Thin Part Segmentation with Cutting Lines. 809-817 - Minju Kim, Yuhyun Lee, Jungjin Lee:
Multi-view Layout Design for VR Concert Experience. 818-826 - Kui Jiang, Zhongyuan Wang, Chen Chen, Zheng Wang, Laizhong Cui, Chia-Wen Lin:
Magic ELF: Image Deraining Meets Association Learning and Transformer. 827-836 - Liang Liao, Kangmin Xu, Haoning Wu, Chaofeng Chen, Wenxiu Sun, Qiong Yan, Weisi Lin:
Exploring the Effectiveness of Video Perceptual Representation in Blind Video Quality Assessment. 837-846 - Mengshun Hu, Kui Jiang, Zhixiang Nie, Zheng Wang:
You Only Align Once: Bidirectional Interaction for Spatial-Temporal Video Super-Resolution. 847-855 - Wei Sun, Xiongkuo Min, Wei Lu, Guangtao Zhai:
A Deep Learning based No-reference Quality Assessment Model for UGC Videos. 856-865
Poster Session IV: Experience -- Interactions and Quality of Experience
- Szu-Wei Fu, Yaran Fan, Yasaman Hosseinkashi, Jayant Gupchup, Ross Cutler:
Improving Meeting Inclusiveness using Speech Interruption Analysis. 887-895 - Yaohui Li, Yuzhe Yang, Huaxiong Li, Haoxing Chen, Liwu Xu, Leida Li, Yaqian Li, Yandong Guo:
Transductive Aesthetic Preference Propagation for Personalized Image Aesthetics Assessment. 896-904 - Zheng Lin, Zhao Zhang, Linghao Han, Shao-Ping Lu:
Multi-Mode Interactive Image Segmentation. 905-914 - Nasim Jamshidi Avanaki, Steven Schmidt, Thilo Michael, Saman Zadtootaghaj, Sebastian Möller:
Deep-BVQM: A Deep-learning Bitstream-based Video Quality Model. 915-923 - Anton Ratnarajah, Zhenyu Tang, Rohith Aralikatti, Dinesh Manocha:
MESH2IR: Neural Acoustic Impulse Response Generator for Complex 3D Scenes. 924-933 - Wei Zhou, Zhou Wang:
Quality Assessment of Image Super-Resolution: Balancing Deterministic and Statistical Fidelity. 934-942 - Chaofan Zhang, Shiguang Liu:
No-reference Omnidirectional Image Quality Assessment Based on Joint Network. 943-951 - Abhishek Kumar, Lik-Hang Lee, Jagmohan Chauhan, Xiang Su, Mohammad Ashraful Hoque, Susanna Pirttikangas, Sasu Tarkoma, Pan Hui:
PassWalk: Spatial Authentication Leveraging Lateral Shift and Gaze on Mobile Headsets. 952-960 - Jun Fu, Chen Hou, Wei Zhou, Jiahua Xu, Zhibo Chen:
Adaptive Hypergraph Convolutional Network for No-Reference 360-degree Image Quality Assessment. 961-969 - Xingran Liao, Baoliang Chen, Hanwei Zhu, Shiqi Wang, Mingliang Zhou, Sam Kwong:
DeepWSD: Projecting Degradations in Perceptual Space to Wasserstein Distance in Deep Feature Space. 970-978 - Bohua Peng, Mobarakol Islam, Mei Tu:
Angular Gap: Reducing the Uncertainty of Image Difficulty through Model Calibration. 979-987 - Min Wang, Hao Yang, Qing Cheng:
GCL: Graph Calibration Loss for Trustworthy Graph Neural Network. 988-996 - Yixuan Gao, Xiongkuo Min, Yucheng Zhu, Jing Li, Xiao-Ping Zhang, Guangtao Zhai:
Image Quality Assessment: From Mean Opinion Score to Opinion Score Distribution. 997-1005 - Zihan Zhou, Yong Xu, Ruotao Xu, Yuhui Quan:
No-Reference Image Quality Assessment Using Dynamic Complex-Valued Neural Model. 1006-1015 - Tong Shao, Deming Zhai, Junjun Jiang, Xianming Liu:
Hybrid Conditional Deep Inverse Tone Mapping. 1016-1024 - Yili Jin, Junhua Liu, Fangxin Wang, Shuguang Cui:
Where Are You Looking?: A Large-Scale Dataset of Head and Gaze Behavior for 360-Degree Videos and a Pilot Study. 1025-1034
Oral Session V: Experience -- Art and Culture
- Zhengyan Tong, Xiaohang Wang, Shengchao Yuan, Xuanhong Chen, Junjie Wang, Xiangzhong Fang:
Im2Oil: Stroke-Based Oil Painting Rendering with Linearly Controllable Fineness Via Adaptive Sampling. 1035-1046 - Chen Zhang, LuChin Chang, Songruoyao Wu, Xu Tan, Tao Qin, Tie-Yan Liu, Kejun Zhang:
ReLyMe: Improving Lyric-to-Melody Generation by Incorporating Lyric-Melody Relationships. 1047-1056 - Zihao Wang, Kejun Zhang, Yuxing Wang, Chen Zhang, Qihao Liang, Pengfei Yu, Yongsheng Feng, Wenbo Liu, Yikai Wang, Yuntai Bao, Yiheng Yang:
SongDriver: Real-time Music Accompaniment Generation without Logical Latency nor Exposure Bias. 1057-1067 - Yijun Wang, Tao Liang, Jianxin Lin:
CACOLIT: Cross-domain Adaptive Co-learning for Imbalanced Image-to-Image Translation. 1068-1076 - Kyungwon Lee, Yu-Kyung Jang, Jaewoo Jung, Dong Hwan Kim, Hyun-Jean Lee, Seung Ah Lee:
EuglPollock: Rethinking Interspecies Collaboration through Art Making. 1077-1084
Poster Session V: Experience -- Art and Culture
- Nisha Huang, Fan Tang, Weiming Dong, Changsheng Xu:
Draw Your Art Dream: Diverse Digital Art Synthesis with Multimodal Guided Diffusion. 1085-1094 - Zhizhong Wang, Zhanjie Zhang, Lei Zhao, Zhiwen Zuo, Ailin Li, Wei Xing, Dongming Lu:
AesUST: Towards Aesthetic-Enhanced Universal Style Transfer. 1095-1106 - Matthias Springstein, Stefanie Schneider, Christian Althaus, Ralph Ewerth:
Semi-supervised Human Pose Estimation in Art-historical Images. 1107-1116 - Shenglan Cui, Fang Liu, Tongqing Zhou, Mohan Zhang:
Understanding and Identifying Artwork Plagiarism with the Wisdom of Designers: A Case Study on Poster Artworks. 1117-1127 - Quanwei Yang, Xinchen Liu, Wu Liu, Hongtao Xie, Xiaoyan Gu, Lingyun Yu, Yongdong Zhang:
REMOT: A Region-to-Whole Framework for Realistic Human Motion Transfer. 1128-1137 - Zixuan Wang, Jia Jia, Haozhe Wu, Junliang Xing, Jinghe Cai, Fanbo Meng, Guowen Chen, Yanfeng Wang:
GroupDancer: Music to Multi-People Dance Synthesis with Style Collaboration. 1138-1146 - Daqian Shi, Xiaolei Diao, Lida Shi, Hao Tang, Yang Chi, Chuntao Li, Hao Xu:
CharFormer: A Glyph Fusion based Attentive Framework for High-precision Character Image Denoising. 1147-1155 - Guang Yang, Wu Liu, Xinchen Liu, Xiaoyan Gu, Juan Cao, Jintao Li:
Delving into the Frequency: Temporally Consistent Human Motion Transfer in the Fourier Space. 1156-1166 - Zhimeng Zhang, Yu Ding:
Adaptive Affine Transformation: A Simple and Effective Operation for Spatial Misaligned Image Generation. 1167-1176 - Daqian Shi, Xiaolei Diao, Hao Tang, Xiaomin Li, Hao Xing, Hao Xu:
RCRN: Real-world Character Image Restoration Network via Skeleton Extraction. 1177-1185 - Yupei Lin, Sen Zhang, Tianshui Chen, Yongyi Lu, Guangping Li, Yukai Shi:
Exploring Negatives in Contrastive Learning for Unpaired Image-to-Image Translation. 1186-1194 - Xiang Chang, Fei Chao, Changjing Shang, Qiang Shen:
Sundial-GAN: A Cascade Generative Adversarial Networks Framework for Deciphering Oracle Bone Inscriptions. 1195-1203 - Xueyao Zhang, Jinchao Zhang, Yao Qiu, Li Wang, Jie Zhou:
Structure-Enhanced Pop Music Generation via Harmony-Aware Learning. 1204-1213 - Xingzhong Hou, Boxiao Liu, Shuai Zhang, Lulin Shi, Zite Jiang, Haihang You:
Dynamic Weighted Semantic Correspondence for Few-Shot Image Generative Adaptation. 1214-1222 - Zhejing Hu, Xiao Ma, Yan Liu, Gong Chen, Yongxu Liu:
The Beauty of Repetition in Machine Composition Scenarios. 1223-1231 - Xin Huang, Dong Liang, Hongrui Cai, Juyong Zhang, Jinyuan Jia:
CariPainter: Sketch Guided Interactive Caricature Generation. 1232-1240 - Jieun Lee, Hyeonwoo Kim, Jonghwa Shim, Eenjun Hwang:
Cartoon-Flow: A Flow-Based Generative Adversarial Network for Arbitrary-Style Photo Cartoonization. 1241-1251
Oral Session VI: Experience -- Multimedia Applications
- Yiling Wu, Xinfeng Zhang, Yaowei Wang, Qingming Huang:
Span-based Audio-Visual Localization. 1252-1260 - Jibin Gao, Junfu Pu, Honglun Zhang, Ying Shan, Wei-Shi Zheng:
PC-Dance: Posture-controllable Music-driven Dance Synthesis. 1261-1269 - Haipeng Liu, Yang Wang, Meng Wang, Yong Rui:
Delving Globally into Texture and Structure for Image Inpainting. 1270-1278 - Zeyu Ma, Yang Yang, Guoqing Wang, Xing Xu, Heng Tao Shen, Mingxing Zhang:
Rethinking Open-World Object Detection in Autonomous Driving Scenarios. 1279-1288 - Zhihua Hu, Bo Duan, Yanfeng Zhang, Mingwei Sun, Jingwei Huang:
MVLayoutNet: 3D Layout Reconstruction with Multi-view Panoramas. 1289-1298 - Jiaming Li, Hongtao Xie, Lingyun Yu, Yongdong Zhang:
Wavelet-enhanced Weakly Supervised Local Feature Learning for Face Forgery Detection. 1299-1308 - Xiaoyu Ma, Yaqi Wang, Chang Liu, Suiyu Zhang, Dingguo Yu:
ADGNet: Attention Discrepancy Guided Deep Neural Network for Blind Image Quality Assessment. 1309-1318 - Jingjing Wu, Pengyuan Lyu, Guangming Lu, Chengquan Zhang, Kun Yao, Wenjie Pei:
Decoupling Recognition from Detection: Single Shot Self-Reliant Scene Text Spotter. 1319-1328 - Chaofeng Chen, Xinyu Shi, Yipeng Qin, Xiaoming Li, Xiaoguang Han, Tao Yang, Shihui Guo:
Real-World Blind Super-Resolution via Feature Matching with Implicit High-Resolution Priors. 1329-1338 - Mengya Han, Heliang Zheng, Chaoyue Wang, Yong Luo, Han Hu, Bo Du:
Leveraging GAN Priors for Few-Shot Part Segmentation. 1339-1347 - Bo Fang, Wenhao Wu, Chang Liu, Yu Zhou, Dongliang He, Weiping Wang:
MaMiCo: Macro-to-Micro Semantic Correspondence for Self-supervised Video Representation Learning. 1348-1357 - Jinwang Pan, Deming Zhai, Yuanchao Bai, Junjun Jiang, Debin Zhao, Xianming Liu:
ChebyLighter: Optimal Curve Estimation for Low-light Image Enhancement. 1358-1366 - Xiaotong Lu, Teng Xi, Baopu Li, Gang Zhang, Weisheng Dong, Guangming Shi:
Bayesian based Re-parameterization for DNN Model Pruning. 1367-1375 - Dejia Xu, Hayk Poghosyan, Shant Navasardyan, Yifan Jiang, Humphrey Shi, Zhangyang Wang:
ReCoRo: Region-Controllable Robust Light Enhancement with User-Specified Imprecise Masks. 1376-1386 - Aaron Chadha, Ioannis Katsavounidis, Ayan Kumar Bhunia, Cosmin Stejerean, Mohammad Umar Karim Khan, Yiannis Andreopoulos:
Domain-Specific Fusion Of Objective Video Quality Metrics. 1387-1395 - Wen Yang, Jinjian Wu, Jupo Ma, Leida Li, Weisheng Dong, Guangming Shi:
Learning for Motion Deblurring with Hybrid Frames and Events. 1396-1404 - Yulei Lu, Yawei Luo, Li Zhang, Zheyang Li, Yi Yang, Jun Xiao:
Bidirectional Self-Training with Multiple Anisotropic Prototypes for Domain Adaptive Semantic Segmentation. 1405-1415 - Hui Lin, Zhiheng Ma, Xiaopeng Hong, Yaowei Wang, Zhou Su:
Semi-supervised Crowd Counting via Density Agency. 1416-1426 - Huachen Fang, Jinjian Wu, Leida Li, Junhui Hou, Weisheng Dong, Guangming Shi:
AEDNet: Asynchronous Event Denoising with Spatial-Temporal Correlation among Irregular Data. 1427-1435 - Hansen Feng, Lizhi Wang, Yuzhi Wang, Hua Huang:
Learnability Enhancement for Low-light Raw Denoising: Where Paired Real Data Meets Noise Modeling. 1436-1444 - Qian Cao, Xu Chen, Ruihua Song, Hao Jiang, Guang Yang, Zhao Cao:
Multi-Modal Experience Inspired AI Creation. 1445-1454 - Boming Zhao, Bangbang Yang, Zhenyang Li, Zuoyue Li, Guofeng Zhang, Jiashu Zhao, Dawei Yin, Zhaopeng Cui, Hujun Bao:
Factorized and Controllable Neural Re-Rendering of Outdoor Scene for Photo Extrapolation. 1455-1464 - Zhuowen Yuan, Zhengxin You, Sheng Li, Zhenxing Qian, Xinpeng Zhang, Alex C. Kot:
On Generating Identifiable Virtual Faces. 1465-1473 - Peijia Zheng, Zhiwei Cai, Huicong Zeng, Jiwu Huang:
Keyword Spotting in the Homomorphic Encrypted Domain Using Deep Complex-Valued CNN. 1474-1483 - Zhangkai Ni, Wenhan Yang, Hanli Wang, Shiqi Wang, Lin Ma, Sam Kwong:
Cycle-Interactive Generative Adversarial Network for Robust Unsupervised Low-Light Enhancement. 1484-1492 - Yunhao Li, Zhenbo Yu, Yucheng Zhu, Bingbing Ni, Guangtao Zhai, Wei Shen:
Skeleton2Humanoid: Animating Simulated Characters for Physically-plausible Motion In-betweening. 1493-1502 - Jiahao Li, Bin Li, Yan Lu:
Hybrid Spatial-Temporal Entropy Modelling for Neural Video Compression. 1503-1511 - Shuai Li, Kaixin Wang, Yanbo Gao, Xun Cai, Mao Ye:
Geometric Warping Error Aware CNN for DIBR Oriented View Synthesis. 1512-1521
Poster Session VI: Experience -- Multimedia Applications
- Jinbao Wang, Guoyang Xie, Yawen Huang, Yefeng Zheng, Yaochu Jin, Feng Zheng:
FedMed-ATL: Misaligned Unpaired Cross-Modality Neuroimage Synthesis via Affine Transform Loss. 1522-1531 - Rui Ma, Mengxi Guo, Yi Hou, Fan Yang, Yuan Li, Huizhu Jia, Xiaodong Xie:
Towards Blind Watermarking: Combining Invertible and Non-invertible Mechanisms. 1532-1542 - Kaixiong Gong, Shuang Li, Shugang Li, Rui Zhang, Chi Harold Liu, Qiang Chen:
Improving Transferability for Domain Adaptive Detection Transformers. 1543-1551 - Dariusz Mikulowski:
Support for Teaching Mathematics of the Blind by Sighted Tutors Through Multisensual Access to Formulas with Braille Converters and Speech. 1552-1560 - Yunning Cao, Ye Ma, Min Zhou, Chuanbin Liu, Hongtao Xie, Tiezheng Ge, Yuning Jiang:
Geometry Aligned Variational Transformer for Image-conditioned Layout Generation. 1561-1571 - Xianggang Yu, Jiapeng Tang, Yipeng Qin, Chenghong Li, Xiaoguang Han, Linchao Bao, Shuguang Cui:
PVSeRF: Joint Pixel-, Voxel- and Surface-Aligned Radiance Field for Single-Image Novel View Synthesis. 1572-1583 - Chaowei Fang, Dingwen Zhang, Liang Wang, Yulun Zhang, Lechao Cheng, Junwei Han:
Cross-Modality High-Frequency Transformer for MR Image Super-Resolution. 1584-1592 - Xintian Wu, Hanbin Zhao, Liangli Zheng, Shouhong Ding, Xi Li:
Adma-GAN: Attribute-Driven Memory Augmented GANs for Text-to-Image Generation. 1593-1602 - Chang Tang, Zhenglai Li, Weiqing Yan, Guanghui Yue, Wei Zhang:
Efficient Multiple Kernel Clustering via Spectral Perturbation. 1603-1611 - Yang Yang, Jingshuai Zhang, Fan Gao, Xiaoru Gao, Hengshu Zhu:
DOMFN: A Divergence-Orientated Multi-Modal Fusion Network for Resume Assessment. 1612-1620 - Ping Wei, Sheng Li, Xinpeng Zhang, Ge Luo, Zhenxing Qian, Qing Zhou:
Generative Steganography Network. 1621-1629 - Haiping Wang, Yuan Liu, Zhen Dong, Wenping Wang:
You Only Hypothesize Once: Point Cloud Registration with Rotation-equivariant Descriptors. 1630-1641 - Dingkang Yang, Shuai Huang, Haopeng Kuang, Yangtao Du, Lihua Zhang:
Disentangled Representation Learning for Multimodal Emotion Recognition. 1642-1651 - Yi Huang, Xiaoshan Yang, Ji Zhang, Changsheng Xu:
Relative Alignment Network for Source-Free Multimodal Video Domain Adaptation. 1652-1660 - Lin Yuan, Linguo Liu, Xiao Pu, Zhao Li, Hongbo Li, Xinbo Gao:
PRO-Face: A Generic Framework for Privacy-preserving Recognizable Obfuscation of Face Images. 1661-1669 - Xuanhan Wang, Yan Dai, Lianli Gao, Jingkuan Song:
Skeleton-based Action Recognition via Adaptive Cross-Form Learning. 1670-1678 - Yi Zhang, Weixuan Liang, Xinwang Liu, Sisi Dai, Siwei Wang, Liyang Xu, En Zhu:
Sample Weighted Multiple Kernel K-means via Min-Max optimization. 1679-1687 - Hanlei Zhang, Hua Xu, Xin Wang, Qianrui Zhou, Shaojie Zhao, Jiayan Teng:
MIntRec: A New Dataset for Multimodal Intent Recognition. 1688-1697 - Zhangming Li, Shengsheng Qian, Jie Cao, Quan Fang, Changsheng Xu:
Adaptive Transformer-Based Conditioned Variational Autoencoder for Incomplete Social Event Classification. 1698-1707 - Dingkang Yang, Haopeng Kuang, Shuai Huang, Lihua Zhang:
Learning Modality-Specific and -Agnostic Representations for Asynchronous Multimodal Language Sequences. 1708-1717 - Zijin Wu, Xingyi Li, Juewen Peng, Hao Lu, Zhiguo Cao, Weicai Zhong:
DoF-NeRF: Depth-of-Field Meets Neural Radiance Fields. 1718-1729 - Mingjin Zhang, Haichen Bai, Jing Zhang, Rui Zhang, Chaoyue Wang, Jie Guo, Xinbo Gao:
RKformer: Runge-Kutta Transformer with Random-Connection Attention for Infrared Small Target Detection. 1730-1738 - Liqiang Yin, Ruize Han, Wei Feng, Song Wang:
Self-Supervised Human Pose based Multi-Camera Video Synchronization. 1739-1748 - Zhekai Du, Jingjing Li, Lin Zuo, Lei Zhu, Ke Lu:
Energy-Based Domain Generalization for Face Anti-Spoofing. 1749-1757 - Jiajian Zhao, Yifan Zhao, Xiaowu Chen, Jia Li:
Revisiting Stochastic Learning for Generalizable Person Re-identification. 1758-1768 - Zhuo Chen, Chaoyue Wang, Haimei Zhao, Bo Yuan, Xiu Li:
D2Animator: Dual Distillation of StyleGAN For High-Resolution Face Animation. 1769-1778 - Lijian Gao, Ling Zhou, Qirong Mao, Ming Dong:
Adaptive Hierarchical Pooling for Weakly-supervised Sound Event Detection. 1779-1787 - Juze Zhang, Jingya Wang, Ye Shi, Fei Gao, Lan Xu, Jingyi Yu:
Mutual Adaptive Reasoning for Monocular 3D Multi-Person Pose Estimation. 1788-1796 - Fengjun Li, Xin Feng, Fanglin Chen, Guangming Lu, Wenjie Pei:
Learning Generalizable Latent Representations for Novel Degradations in Super-Resolution. 1797-1807 - Run Wang, Haoxuan Li, Lingzhou Mu, Jixing Ren, Shangwei Guo, Li Liu, Liming Fang, Jing Chen, Lina Wang:
Rethinking the Vulnerability of DNN Watermarking: Are Watermarks Robust against Naturalness-aware Perturbations? 1808-1818 - Xiao Pan, Peike Li, Zongxin Yang, Huiling Zhou, Chang Zhou, Hongxia Yang, Jingren Zhou, Yi Yang:
In-N-Out Generative Learning for Dense Unsupervised Video Segmentation. 1819-1827 - Rishubh Parihar, Ankit Dhiman, Tejan Karmali, Venkatesh R:
Everything is There in Latent Space: Attribute Editing and Attribute Style Manipulation by StyleGAN Latent Space Exploration. 1828-1836 - Dongyu She, Kun Xu:
An Image-to-video Model for Real-Time Video Enhancement. 1837-1846 - Xuesong Niu, Jili Gu, Guoxin Zhang, Pengfei Wan, Zhongyuan Wang:
Learning an Inference-accelerated Network from a Pre-trained Model with Frequency-enhanced Feature Distillation. 1847-1856 - Mingjin Zhang, Ke Yue, Jing Zhang, Yunsong Li, Xinbo Gao:
Exploring Feature Compensation and Cross-level Correlation for Infrared Small Target Detection. 1857-1865 - Fuming You, Jingjing Li, Zhi Chen, Lei Zhu:
Pixel Exclusion: Uncertainty-aware Boundary Discovery for Active Cross-Domain Semantic Segmentation. 1866-1874 - Mingjia Li, Yuanbin Fu, Xinhui Li, Xiaojie Guo:
Deep Flexible Structure Preserving Image Smoothing. 1875-1883 - Taeheon Kim, Youngjoon Yu, Yong Man Ro:
Defending Physical Adversarial Attack on Object Detection via Adversarial Patch-Feature Energy. 1905-1913 - Shankhanil Mitra, Rajiv Soundararajan:
Multiview Contrastive Learning for Completely Blind Video Quality Assessment of User Generated Content. 1914-1924 - Lechao Cheng, Chaowei Fang, Dingwen Zhang, Guanbin Li, Gang Huang:
Compound Batch Normalization for Long-tailed Image Classification. 1925-1934 - Yalan Ye, Ziqi Liu, Yangwuyong Zhang, Jingjing Li, Hengtao Shen:
Alleviating Style Sensitivity then Adapting: Source-free Domain Adaptation for Medical Image Segmentation. 1935-1944 - Jian Liu, Yufeng Chen, Jinan Xu:
Multimedia Event Extraction From News With a Unified Contrastive Learning Framework. 1945-1953 - Bolun Zheng, Xiaokai Pan, Hua Zhang, Xiaofei Zhou, Gregory G. Slabaugh, Chenggang Yan, Shanxin Yuan:
DomainPlus: Cross Transform Domain Learning towards High Dynamic Range Imaging. 1954-1963 - Shuai Wang, Da Yang, Yubin Wu, Yang Liu, Hao Sheng:
Tracking Game: Self-adaptative Agent based Multi-object Tracking. 1964-1972 - Gangwei Jiang, Shiyao Wang, Tiezheng Ge, Yuning Jiang, Ying Wei, Defu Lian:
Self-Supervised Text Erasing with Controllable Image Synthesis. 1973-1983 - Zijie Wang, Aichun Zhu, Jingyi Xue, Xili Wan, Chao Liu, Tian Wang, Yifeng Li:
Look Before You Leap: Improving Text-based Person Retrieval by Learning A Consistent Cross-modal Common Manifold. 1984-1992 - Xingxing Zhang, Zhizhe Liu, Weikai Yang, Liyuan Wang, Jun Zhu:
The More, The Better? Active Silencing of Non-Positive Transfer for Efficient Multi-Domain Few-Shot Classification. 1993-2001 - Lu Zhang, Yang Wang, Jiaogen Zhou, Chenbo Zhang, Yinglu Zhang, Jihong Guan, Yatao Bian, Shuigeng Zhou:
Hierarchical Few-Shot Object Detection: Problem, Benchmark and Method. 2002-2011 - Renshuai Tao, Tianbo Wang, Ziyang Wu, Cong Liu, Aishan Liu, Xianglong Liu:
Few-shot X-ray Prohibited Item Detection: A Benchmark and Weak-feature Enhancement Network. 2012-2020 - Shilv Cai, Zhijun Zhang, Liqun Chen, Luxin Yan, Sheng Zhong, Xu Zou:
High-Fidelity Variable-Rate Image Compression via Invertible Activation Transformation. 2021-2031 - Xudong Mao, Liujuan Cao, Aurele Tohokantche Gnanha, Zhenguo Yang, Qing Li, Rongrong Ji:
Cycle Encoding of a StyleGAN Encoder for Improved Reconstruction and Editability. 2032-2041 - Yeqi Bai, Tao Ma, Lipo Wang, Zhenjie Zhang:
Speech Fusion to Face: Bridging the Gap Between Human's Vocal Characteristics and Facial Imaging. 2042-2050 - Wei Li, Tianzhao Yang, Xiao Wu, Xian-Jun Du, Jian-Jun Qiao:
Learning Action-guided Spatio-temporal Transformer for Group Activity Recognition. 2051-2060 - Yangyang Guo, Liqiang Nie, Yongkang Wong, Yibing Liu, Zhiyong Cheng, Mohan S. Kankanhalli:
A Unified End-to-End Retriever-Reader Framework for Knowledge-based VQA. 2061-2069 - Tengyu Ma, Long Ma, Xin Fan, Zhongxuan Luo, Risheng Liu:
PIA: Parallel Architecture with Illumination Allocator for Joint Enhancement and Detection in Low-Light. 2070-2078 - Abhinav Aggarwal, Yash Pandya, Lokesh A. Ravindranathan, Laxmi S. Ahire, Manivel Sethu, Kaustav Nandy:
Robust Actor Recognition in Entertainment Multimedia at Scale. 2079-2087 - Yufan Zhang, Junkai Man, Peng Sun:
MF-Net: A Novel Few-shot Stylized Multilingual Font Generation Method. 2088-2096 - Yuan Sun, Dezhong Peng, Haixiao Huang, Zhenwen Ren:
Feature and Semantic Views Consensus Hashing for Image Set Classification. 2097-2105 - Che Sun, Yunde Jia, Yuwei Wu:
Evidential Reasoning for Video Anomaly Detection. 2106-2114 - Danni Xu, Ruimin Hu, Zheng Wang, Linbo Luo, Dengshi Li, Wenjun Zeng:
Gaze- and Spacing-flow Unveil Intentions: Hidden Follower Discovery. 2115-2123 - Hongcheng Zhang, Xu Zhao, Dongqi Wang:
Semi-supervised Learning for Multi-label Video Action Detection. 2124-2134 - Bo Zhang, Jiakang Yuan, Baopu Li, Tao Chen, Jiayuan Fan, Botian Shi:
Learning Cross-Image Object Semantic Relation in Transformer for Few-Shot Fine-Grained Image Classification. 2135-2144 - Mengshun Hu, Kui Jiang, Liang Liao, Zhixiang Nie, Jing Xiao, Zheng Wang:
Progressive Spatial-temporal Collaborative Network for Video Frame Interpolation. 2145-2153 - Xinwei Xue, Jia He, Long Ma, Yi Wang, Xin Fan, Risheng Liu:
Best of Both Worlds: See and Understand Clearly in the Dark. 2154-2162 - Xin Jin, Tianyu He, Xu Shen, Tongliang Liu, Xinchao Wang, Jianqiang Huang, Zhibo Chen, Xian-Sheng Hua:
Meta Clustering Learning for Large-scale Unsupervised Person Re-identification. 2163-2172 - Xiaotong Luo, Mingliang Dai, Yulun Zhang, Yuan Xie, Ding Liu, Yanyun Qu, Yun Fu, Junping Zhang:
Adjustable Memory-efficient Image Super-resolution via Individual Kernel Sparsity. 2173-2181 - Ning Wang, Jing Zhang, Lefei Zhang, Dacheng Tao:
GT-MUST: Gated Try-on by Learning the Mannequin-Specific Transformation. 2182-2190 - Chen Long, Wenxiao Zhang, Ruihui Li, Hao Wang, Zhen Dong, Bisheng Yang:
PC2-PU: Patch Correlation and Point Correlation for Effective Point Cloud Upsampling. 2191-2201 - Luoyuan Xu, Tao Guan, Yuesong Wang, Yawei Luo, Zhuo Chen, Wenkai Liu, Wei Yang:
Self-Supervised Multi-view Stereo via Adjacent Geometry Guided Volume Completion. 2202-2210 - Zhenbo Shi, Zhi Chen, Zhenbo Xu, Wei Yang, Liusheng Huang:
AtHom: Two Divergent Attentions Stimulated By Homomorphic Training in Text-to-Image Synthesis. 2211-2219 - Zhiqiang Fu, Yao Zhao, Dongxia Chang, Yiming Wang, Jie Wen, Xingxing Zhang, Guodong Guo:
One-step Low-Rank Representation for Clustering. 2220-2228 - Syed Muhammad Israr, Feng Zhao:
Customizing GAN Using Few-shot Sketches. 2229-2238 - Mustafa Shukor, Bharath Bhushan Damodaran, Xu Yao, Pierre Hellier:
Video Coding using Learned Latent GAN Compression. 2239-2248 - Qiujing Lu, Yipeng Zhang, Mingjian Lu, Vwani Roychowdhury:
Action-conditioned On-demand Motion Generation. 2249-2257 - Wenxu Shi, Lei Zhang, Weijie Chen, Shiliang Pu:
Universal Domain Adaptive Object Detector. 2258-2266 - Han Fang, Zhaoyang Jia, Zehua Ma, Ee-Chien Chang, Weiming Zhang:
PIMoG: An Effective Screen-shooting Noise-Layer Simulation for Deep-Learning-Based Watermarking Network. 2267-2275 - Puneet Mathur, Atula Tejaswi Neerkaje, Malika Chhibber, Ramit Sawhney, Fuming Guo, Franck Dernoncourt, Sanghamitra Dutta, Dinesh Manocha:
MONOPOLY: Financial Prediction from MONetary POLicY Conference Videos Using Multimodal Cues. 2276-2285 - Pan Mu, Haotian Qian, Cong Bai:
Structure-Inferred Bi-level Model for Underwater Image Enhancement. 2286-2295 - Yazhou Xing, Yu Li, Xintao Wang, Ye Zhu, Qifeng Chen:
Composite Photograph Harmonization with Complete Background Cues. 2296-2304 - Ke Qiu, Yawen Lai, Shiyi Liu, Ronggang Wang:
Self-supervised Multi-view Stereo via Inter and Intra Network Pseudo Depth. 2305-2313 - Zhenzhong Kuang, Longbin Teng, Zhou Yu, Jun Yu, Jianping Fan, Mingliang Xu:
Delegate-based Utility Preserving Synthesis for Pedestrian Image Anonymization. 2314-2323 - Mingqian Wang, Yujun Zhang, Wei Feng, Lei Zhu, Song Wang:
Video Instance Lane Detection via Deep Temporal and Geometry Consistency Constraints. 2324-2332 - Xu Liu, Jianing Li, Xianqi Zhang, Jingyuan Sun, Xiaopeng Fan, Yonghong Tian:
Learning Visible Surface Area Estimation for Irregular Objects. 2333-2343 - Qinwei Chang, Leichao Huang, Shaoteng Liu, Hualuo Liu, Tianshu Yang, Yexin Wang:
Blind Robust Video Watermarking Based on Adaptive Region Selection and Channel Reference. 2344-2350 - Yongqi Zhai, Luyang Tang, Yi Ma, Rui Peng, Ronggang Wang:
Disparity-based Stereo Image Compression with Aligned Cross-View Priors. 2351-2360 - Junkun Yuan, Xu Ma, Defang Chen, Kun Kuang, Fei Wu, Lanfen Lin:
Label-Efficient Domain Generalization via Collaborative Exploration and Generalization. 2361-2370 - Wufan Wang, Lei Zhang, Hua Huang:
Progressive Unsupervised Learning of Local Descriptors. 2371-2379 - Dong Zhang, Jinhui Tang, Kwang-Ting Cheng:
Graph Reasoning Transformer for Image Parsing. 2380-2389 - Qiang Liu, Tongqing Zhou, Zhiping Cai, Yonghao Tang:
Opportunistic Backdoor Attacks: Exploring Human-imperceptible Vulnerabilities on Speech Recognition Systems. 2390-2398 - Zhiqiang Gao, Shufei Zhang, Kaizhu Huang, Qiufeng Wang, Rui Zhang, Chaoliang Zhong:
Certifying Better Robust Generalization for Unsupervised Domain Adaptation. 2399-2410 - Yu Yin, Joseph P. Robinson, Yun Fu:
Multimodal In-bed Pose and Shape Estimation under the Blankets. 2411-2419 - Xiaoyu Han, Shengping Zhang, Qinglin Liu, Zonglin Li, Chenyang Wang:
Progressive Limb-Aware Virtual Try-On. 2420-2429 - Anna Zhu, Zhanhui Yin, Brian Kenji Iwana, Xinyu Zhou, Shengwu Xiong:
Text Style Transfer based on Multi-factor Disentanglement and Mixture. 2430-2440 - Zhaoyi Wan, Dejia Xu, Zhangyang Wang, Jian Wang, Jiebo Luo:
Cloud2Sketch: Augmenting Clouds with Imaginary Sketches. 2441-2451 - Daiheng Gao, Xindi Zhang, Xingyu Chen, Andong Tan, Bang Zhang, Pan Pan, Ping Tan:
CycleHand: Increasing 3D Pose Estimation Ability on In-the-wild Monocular Image through Cyclic Flow. 2452-2463 - Ziwen He, Wei Wang, Weinan Guan, Jing Dong, Tieniu Tan:
Defeating DeepFakes via Adversarial Visual Reconstruction. 2464-2472 - Xichu Ma, Yuchen Wang, Ye Wang:
Content based User Preference Modeling in Music Generation. 2473-2482 - Liliang Chen, Jiaqi Li, Han Huang, Yandong Guo:
CrossHuman: Learning Cross-guidance from Multi-frame Images for Human Reconstruction. 2483-2494 - Zhiqian Lin, Jiangke Lin, Lincheng Li, Yi Yuan, Zhengxia Zou:
High-Quality 3D Face Reconstruction with Affine Convolutional Networks. 2495-2503 - Astitva Srivastava, Chandradeep Pokhariya, Sai Sagar Jinka, Avinash Sharma:
xCloth: Extracting Template-free Textured 3D Clothes from a Monocular Image. 2504-2512 - Kangneng Zhou, Xiaobin Zhu, Daiheng Gao, Kai Lee, Xinjie Li, Xu-Cheng Yin:
SD-GAN: Semantic Decomposition for Face Image Synthesis with Discrete Attribute. 2513-2524 - Rongjie Huang, Chenye Cui, Feiyang Chen, Yi Ren, Jinglin Liu, Zhou Zhao, Baoxing Huai, Zhefeng Wang:
SingGAN: Generative Adversarial Network For High-Fidelity Singing Voice Generation. 2525-2535 - Yinpeng Chen, Zhiyu Pan, Min Shi, Hao Lu, Zhiguo Cao, Weicai Zhong:
Design What You Desire: Icon Generation from Orthogonal Application and Theme Labels. 2536-2546 - Zhaohui Jing, Youjian Zhang, Chaoyue Wang, Daqing Liu, Yong Xia:
Semantically-Consistent Dynamic Blurry Image Generation for Image Deblurring. 2547-2555 - Xintao Wang, Chao Dong, Ying Shan:
RepSR: Training Efficient VGG-style Super-Resolution Networks with Structural Re-Parameterization and Batch Normalization. 2556-2564 - Shuoyi Chen, Mang Ye, Bo Du:
Rotation Invariant Transformer for Recognizing Object in UAVs. 2565-2574 - Feifei Shao, Yawei Luo, Ping Liu, Jie Chen, Yi Yang, Yulei Lu, Jun Xiao:
Active Learning for Point Cloud Semantic Segmentation via Spatial-Structural Diversity Reasoning. 2575-2585 - Ji Zhang, Jingkuan Song, Lianli Gao, Hengtao Shen:
Free-Lunch for Cross-Domain Few-Shot Learning: Style-Aware Episodic Training with Robust Contrastive Learning. 2586-2594 - Rongjie Huang, Zhou Zhao, Huadai Liu, Jinglin Liu, Chenye Cui, Yi Ren:
ProDiff: Progressive Fast Diffusion Model for High-Quality Text-to-Speech. 2595-2605 - Yifeng Zhou, Chuming Lin, Donghao Luo, Yong Liu, Ying Tai, Chengjie Wang, Mingang Chen:
Joint Learning Content and Degradation Aware Feature for Blind Super-Resolution. 2606-2616 - Wenjing Wang, Zhengbo Xu, Haofeng Huang, Jiaying Liu:
Self-Aligned Concave Curve: Illumination Enhancement for Unsupervised Adaptation. 2617-2626 - Hong Ding, Fei Luo, Caoqing Jiang, Gang Fu, Zipei Chen, Shenghong Hu, Chunxia Xiao:
Photorealistic Style Transfer via Adaptive Filtering and Channel Seperation. 2627-2635 - Junyu Chen, Qianqian Xu, Zhiyong Yang, Ke Ma, Xiaochun Cao, Qingming Huang:
Recurrent Meta-Learning against Generalized Cold-start Problem in CTR Prediction. 2636-2644 - Liutao Yang, Rongjun Ge, Shichang Feng, Daoqiang Zhang:
Learning Projection Views for Sparse-View CT Reconstruction. 2645-2653 - Peichi Zhou, Dingbo Lu, Chen Li, Jian Zhang, Long Liu, Changbo Wang:
Unsupervised Textured Terrain Generation via Differentiable Rendering. 2654-2662 - Nikita Drobyshev, Jenya Chelishev, Taras Khakhulin, Aleksei Ivakhnenko, Victor Lempitsky, Egor Zakharov:
MegaPortraits: One-shot Megapixel Neural Head Avatars. 2663-2671 - Xin Ding, Tsuyoshi Takatani, Zhongyuan Wang, Ying Fu, Yinqiang Zheng:
Event-guided Video Clip Generation from Blurry Images. 2672-2680 - Jianhui Chang, Jian Zhang, Youmin Xu, Jiguo Li, Siwei Ma, Wen Gao:
Consistency-Contrast Learning for Conceptual Coding. 2681-2690 - Mandi Luo, Jie Cao, Ran He:
Order-aware Human Interaction Manipulation. 2691-2699 - Zipei Chen, Xiao Lu, Ling Zhang, Chunxia Xiao:
Semi-supervised Video Shadow Detection via Image-assisted Pseudo-label Generation. 2700-2708 - Xiaohao Xu, Jinglu Wang, Xiang Ming, Yan Lu:
Towards Robust Video Object Segmentation with Adaptive Object Calibration. 2709-2718 - Chengming Xu, Chen Liu, Siqian Yang, Yabiao Wang, Shijie Zhang, Lijie Jia, Yanwei Fu:
Split-PU: Hardness-aware Training Strategy for Positive-Unlabeled Learning. 2719-2729 - Jialei Xu, Xianming Liu, Yuanchao Bai, Junjun Jiang, Kaixuan Wang, Xiaozhi Chen, Xiangyang Ji:
Multi-Camera Collaborative Depth Prediction via Consistent Structure Estimation. 2730-2738 - Wenxue Cui, Shaohui Liu, Debin Zhao:
Fast Hierarchical Deep Unfolding Network for Image Compressed Sensing. 2739-2748 - Hongming Luo, Fei Zhou, Kin-Man Lam, Guoping Qiu:
Restoration of User Videos Shared on Social Media. 2749-2757 - Chenyang Qi, Junming Chen, Xin Yang, Qifeng Chen:
Real-time Streaming Video Denoising with Bidirectional Buffers. 2758-2766 - Yudong Liang, Bin Wang, Wenqi Ren, Jiaying Liu, Wenjian Wang, Wangmeng Zuo:
Learning Hierarchical Dynamics with Spatial Adjacency for Image Enhancement. 2767-2776 - Tao Xiang, Hangcheng Liu, Shangwei Guo, Hantao Liu, Tianwei Zhang:
Text's Armor: Optimized Local Adversarial Perturbation Against Scene Text Editing Attacks. 2777-2785 - Jiayun Fu, Bin B. Zhu, Haidong Zhang, Yayi Zou, Song Ge, Weiwei Cui, Yun Wang, Dongmei Zhang, Xiaojing Ma, Hai Jin:
ChartStamp: Robust Chart Embedding for Real-World Applications. 2786-2795 - Yan Hong, Li Niu, Jianfu Zhang, Liqing Zhang:
Few-shot Image Generation Using Discrete Content Representation. 2796-2804 - Jiaxin Zhang, Canjie Luo, Lianwen Jin, Fengjun Guo, Kai Ding:
Marior: Margin Removal and Iterative Content Rectification for Document Dewarping in the Wild. 2805-2815 - Wenhan Yang, Rizhao Cai, Alex C. Kot:
Image Inpainting Detection via Enriched Attentive Pattern with Near Original Image Augmentation. 2816-2824 - Haojia Lin, Lijiang Li, Xiawu Zheng, Fei Chao, Rongrong Ji:
Searching Lightweight Neural Network for Image Signal Processing. 2825-2833 - Zhengxin You, Qichao Ying, Sheng Li, Zhenxing Qian, Xinpeng Zhang:
Image Generation Network for Covert Transmission in Online Social Network. 2834-2842 - Bin Yang, Mang Ye, Jun Chen, Zesen Wu:
Augmented Dual-Contrastive Aggregation Learning for Unsupervised Visible-Infrared Person Re-Identification. 2843-2851 - Nikhil Bansal, Kartik Gupta, Kiruthika Kannan, Sivani Pentapati, Ravi Kiran Sarvadevabhatla:
DrawMon: A Distributed System for Detection of Atypical Sketch Content in Concurrent Pictionary Games. 2852-2861 - Jiali You, Zhenwen Ren, Quansen Sun, Yuan Sun, Xingfeng Li:
Approximate Shifted Laplacian Reconstruction for Multiple Kernel Clustering. 2862-2870 - Wujin Li, Jiawei Zhan, Jinbao Wang, Bizhong Xia, Bin-Bin Gao, Jun Liu, Chengjie Wang, Feng Zheng:
Towards Continual Adaptation in Industrial Anomaly Detection. 2871-2880 - Cheng Xiong, Guorui Feng, Xinran Li, Xinpeng Zhang, Chuan Qin:
Neural Network Model Protection with Piracy Identification and Tampering Localization Capability. 2881-2889 - Gang He, Kepeng Xu, Li Xu, Chang Wu, Ming Sun, Xing Wen, Yu-Wing Tai:
SDRTV-to-HDRTV via Hierarchical Dynamic Context Feature Mapping. 2890-2898 - Chen Tang, Haoyu Zhai, Kai Ouyang, Zhi Wang, Yifei Zhu, Wenwu Zhu:
Arbitrary Bit-width Network: A Joint Layer-Wise Quantization and Adaptive Inference Approach. 2899-2908 - Yiqin Zhao, Sheng Wei, Tian Guo:
Privacy-preserving Reflection Rendering for Augmented Reality. 2909-2918
Oral Session VII: Multimedia Systems -- Systems and Middleware
- Zitai Wang, Qianqian Xu, Ke Ma, Xiaochun Cao, Qingming Huang:
Confederated Learning: Going Beyond Centralization. 2939-2947 - Insoo Lee, Seyeon Kim, Sandesh Dhawaskar Sathyanarayana, Kyungmin Bin, Song Chong, Kyunghan Lee, Dirk Grunwald, Sangtae Ha:
R-FEC: RL-based FEC Adjustment for Better QoE in WebRTC. 2948-2956
Poster Session VII: Multimedia Systems -- Systems and Middleware
- Xingshuo Han, Guowen Xu, Yuan Zhou, Xuehuan Yang, Jiwei Li, Tianwei Zhang:
Physical Backdoor Attacks to Lane Detection Systems in Autonomous Driving. 2957-2968 - Haochen Wang, Jie Liu, Yongtuo Liu, Subhransu Maji, Jan-Jakob Sonke, Efstratios Gavves:
Dynamic Transformer for Few-shot Instance Segmentation. 2969-2977 - Hao Pan, Feitong Tan, Wenhao Li, Yi-Chao Chen, Guangtao Xue:
OISSR: Optical Image Stabilization Based Super Resolution on Smartphone Cameras. 2978-2986 - Iryanto Jaya, Yusen Li, Wentong Cai:
Improving Scalability, Sustainability and Availability via Workload Distribution in Edge-Cloud Gaming. 2987-2995 - Shahram Ghandeharizadeh:
Display of 3D Illuminations using Flying Light Specks. 2996-3005
Oral Session VIII: Multimedia Systems -- Transport and Delivery
- Nuowen Kan, Yuankun Jiang, Chenglin Li, Wenrui Dai, Junni Zou, Hongkai Xiong:
Improving Generalization for Neural Adaptive Video Streaming via Meta Reinforcement Learning. 3006-3016 - Taslim Murad, Anh Nguyen, Zhisheng Yan:
DAO: Dynamic Adaptive Offloading for Video Analytics. 3017-3025 - Rui-Xiao Zhang, Changpeng Yang, Xiaochan Wang, Tianchi Huang, Chenglei Wu, Jiangchuan Liu, Lifeng Sun:
AggCast: Practical Cost-effective Scheduling for Large-scale Cloud-edge Crowdsourced Live Streaming. 3026-3034 - Shengzhong Liu, Tianshi Wang, Jinyang Li, Dachun Sun, Mani B. Srivastava, Tarek F. Abdelzaher:
AdaMask: Enabling Machine-Centric Video Streaming with Adaptive Frame Masking for DNN Inference Offloading. 3035-3044
Poster Session VIII: Multimedia Systems -- Transport and Delivery
- Tiesong Zhao, Weize Feng, Hongji Zeng, Yiwen Xu, Yuzhen Niu, Jiaying Liu:
Learning-Based Video Coding with Joint Deep Compression and Enhancement. 3045-3054 - Han Gao, Jinzhong Cui, Mao Ye, Shuai Li, Yu Zhao, Xiatian Zhu:
Structure-Preserving Motion Estimation for Learned Video Compression. 3055-3063 - Tianchi Huang, Chao Zhou, Lianchen Jia, Rui-Xiao Zhang, Lifeng Sun:
Learned Internet Congestion Control for Short Video Uploading. 3064-3075 - Wenhao Tang, Sheng Huang, Xiaoxian Zhang, Luwen Huangfu:
PicT: A Slim Weakly Supervised Vision Transformer for Pavement Distress Classification. 3076-3084 - Hang Yuan, Wei Gao, Ge Li, Zhu Li:
Rate-Distortion-Guided Learning Approach with Cross-Projection Information for V-PCC Fast CU Decision. 3085-3093 - Shishir Subramanyam, Irene Viola, Jack Jansen, Evangelos Alexiou, Alan Hanjalic, Pablo César:
Evaluating the Impact of Tiled User-Adaptive Real-Time Point Cloud Streaming on VR Remote Communication. 3094-3103 - Devdeep Ray, Vicente Bobadilla Riquelme, Srinivasan Seshan:
Prism: Handling Packet Loss for Ultra-low Latency Video. 3104-3114 - Jin Zhou, Na Li, Yao Liu, Shuochao Yao, Songqing Chen:
Exploring Spherical Autoencoder for Spherical Video Content Processing. 3115-3123 - Jianxin Shi, Lingjun Pu, Xinjing Yuan, Qianyun Gong, Jingdong Xu:
Sophon: Super-Resolution Enhanced 360° Video Streaming with Visual Saliency-aware Prefetch. 3124-3133 - Tzu-Kuan Hung, I-Chun Huang, Samuel Rhys Cox, Wei Tsang Ooi, Cheng-Hsin Hsu:
Error Concealment of Dynamic 3D Point Cloud Streaming. 3134-3142 - Yiyun Lu, Yifei Zhu, Zhi Wang:
Personalized 360-Degree Video Streaming: A Meta-Learning Approach. 3143-3151
Oral Session IX: Multimedia Systems -- Data Systems Management and Indexing
- Evgenia Romanenkova, Alexander Stepikin, Matvey Morozov, Alexey Zaytsev:
InDiD: Instant Disorder Detection via a Principled Neural Network. 3152-3162 - An Qin, Mengbai Xiao, Ben Huang, Xiaodong Zhang:
Maze: A Cost-Efficient Video Deduplication System at Web-scale. 3163-3172
Poster Session IX: Multimedia Systems -- Data Systems Management and Indexing
- Chengyin Xu, Zenghao Chai, Zhengzhuo Xu, Chun Yuan, Yanbo Fan, Jue Wang:
HyP2 Loss: Beyond Hypersphere Metric Space for Multi-label Image Retrieval. 3173-3184 - Heng Lian, John Scovil Atwood, Bojian Hou, Jian Wu, Yi He:
Online Deep Learning from Doubly-Streaming Data. 3185-3194 - Hyunmin Jung, Hyuk-Jae Lee, Chae-Eun Rhee:
Re-ordered Micro Image based High Efficient Residual Coding in Light Field Compression. 3195-3204 - Yu Mao, Yufei Cui, Tei-Wei Kuo, Chun Jason Xue:
Accelerating General-purpose Lossless Compression via Simple and Scalable Parameterization. 3205-3213
Oral Session X: Understanding Multimedia Content -- Multimodal Fusion and Embeddings
- Mengzhu Wang, Jianlong Yuan, Qi Qian, Zhibin Wang, Hao Li:
Semantic Data Augmentation based Distance Metric Learning for Domain Generalization. 3214-3223 - Yuehao Yin, Bin Zhu, Jingjing Chen, Lechao Cheng, Yu-Gang Jiang:
Mix-DANN and Dynamic-Modal-Distillation for Video Domain Adaptation. 3224-3233 - Liqiang Nie, Leigang Qu, Dai Meng, Min Zhang, Qi Tian, Alberto Del Bimbo:
Search-oriented Micro-video Captioning. 3234-3243 - Jiannan Ge, Hongtao Xie, Shaobo Min, Pandeng Li, Yongdong Zhang:
Dual Part Discovery Network for Zero-Shot Learning. 3244-3252 - Yi Bin, Wenhao Shi, Jipeng Zhang, Yujuan Ding, Yang Yang, Heng Tao Shen:
Non-Autoregressive Cross-Modal Coherence Modelling. 3253-3261 - Ning Liao, Yifeng Liu, Xiaobo Li, Chenyi Lei, Guoxin Wang, Xian-Sheng Hua, Junchi Yan:
CoHOZ: Contrastive Multimodal Prompt Tuning for Hierarchical Open-set Zero-shot Recognition. 3262-3271 - Zhi-Qi Cheng, Qi Dai, Siyao Li, Teruko Mitamura, Alexander Hauptmann:
GSRFormer: Grounded Situation Recognition Transformer with Alternate Semantic Attention Refinement. 3272-3281 - Qinyi Du, Qingqing Wang, Keqian Li, Jidong Tian, Liqiang Xiao, Yaohui Jin:
CALM: Commen-Sense Knowledge Augmentation for Document Image Understanding. 3282-3290 - Dapeng Chen, Min Wang, Haobin Chen, Lin Wu, Jing Qin, Wei Peng:
Cross-Modal Retrieval with Heterogeneous Graph Embedding. 3291-3300 - Yujie Mo, Yuhuan Chen, Liang Peng, Xiaoshuang Shi, Xiaofeng Zhu:
Simple Self-supervised Multiplex Graph Representation Learning. 3301-3309 - Tom Braude, Idan Schwartz, Alexander G. Schwing, Ariel Shamir:
Ordered Attention for Coherent Visual Storytelling. 3310-3318 - Zhong Wang, Lin Zhang, Ying Shen, Yicong Zhou:
LVI-ExC: A Target-free LiDAR-Visual-Inertial Extrinsic Calibration Framework. 3319-3327 - Xiangming Gu, Longshen Ou, Danielle Ong, Ye Wang:
MM-ALT: A Multimodal Automatic Lyric Transcription System. 3328-3337 - Yachao Zhang, Miaoyu Li, Yuan Xie, Cuihua Li, Cong Wang, Zhizhong Zhang, Yanyun Qu:
Self-supervised Exclusive Learning for 3D Segmentation with Cross-Modal Unsupervised Domain Adaptation. 3338-3346 - Yafei Zhang, Yongzeng Wang, Huafeng Li, Shuang Li:
Cross-Compatible Embedding and Semantic Consistent Feature Construction for Sketch Re-identification. 3347-3355 - Liang Yang, Weihang Peng, Wenmiao Zhou, Bingxin Niu, Junhua Gu, Chuan Wang, Yuanfang Guo, Dongxiao He, Xiaochun Cao:
Difference Residual Graph Neural Networks. 3356-3364
Poster Session X: Understanding Multimedia Content -- Multimodal Fusion and Embeddings
- Man Zhou, Jie Huang, Keyu Yan, Gang Yang, Aiping Liu, Chongyi Li, Feng Zhao:
Normalization-based Feature Selection and Restitution for Pan-sharpening. 3365-3374 - Man Zhou, Jie Huang, Chongyi Li, Hu Yu, Keyu Yan, Naishan Zheng, Feng Zhao:
Adaptively Learning Low-high Frequency Information Integration for Pan-sharpening. 3375-3384 - Rongyao Hu, Liang Peng, Jiangzhang Gan, Xiaoshuang Shi, Xiaofeng Zhu:
Complementary Graph Representation Learning for Functional Neuroimaging Identification. 3385-3393 - Jiwei Guo, Jiajia Tang, Weichen Dai, Yu Ding, Wanzeng Kong:
Dynamically Adjust Word Representations Using Unaligned Multimodal Information. 3394-3402 - Weiqing Yan, Jindong Xu, Jinglei Liu, Guanghui Yue, Chang Tang:
Bipartite Graph-based Discriminative Feature Learning for Multi-View Clustering. 3403-3411 - Xingfeng Li, Quansen Sun, Zhenwen Ren, Yinghui Sun:
Dynamic Incomplete Multi-view Imputing and Clustering. 3412-3420 - Shudong Huang, Yixi Liu, Yazhou Ren, Ivor W. Tsang, Zenglin Xu, Jiancheng Lv:
Learning Smooth Representation for Multi-view Subspace Clustering. 3421-3429 - Mianzhao Wang, Fan Shi, Xu Cheng, Meng Zhao, Yao Zhang, Chen Jia, Weiwei Tian, Shengyong Chen:
LFBCNet: Light Field Boundary-aware and Cascaded Interaction Network for Salient Object Detection. 3430-3439 - Junpu Zhang, Liang Li, Siwei Wang, Jiyuan Liu, Yue Liu, Xinwang Liu, En Zhu:
Multiple Kernel Clustering with Dual Noise Minimization. 3440-3450 - Hui Cui, Lei Zhu, Jingjing Li, Zheng Zhang, Weili Guan:
Webly Supervised Image Hashing with Lightweight Semantic Transfer Network. 3451-3460 - Chenxi Ma, Bo Yan, Qing Lin, Weimin Tan, Siming Chen:
Rethinking Super-Resolution as Text-Guided Details Generation. 3461-3469 - Nan Yin, Li Shen, Baopu Li, Mengzhu Wang, Xiao Luo, Chong Chen, Zhigang Luo, Xian-Sheng Hua:
DEAL: An Unsupervised Domain Adaptive Framework for Graph-level Classification. 3470-3479 - Pinci Yang, Xin Wang, Xuguang Duan, Hong Chen, Runze Hou, Cong Jin, Wenwu Zhu:
AVQA: A Dataset for Audio-Visual Question Answering on Videos. 3480-3491 - Jinyu Yang, Zhe Li, Feng Zheng, Ales Leonardis, Jingkuan Song:
Prompting for Multi-Modal Tracking. 3492-3500 - Anjun Chen, Xiangyu Wang, Shaohao Zhu, Yanxu Li, Jiming Chen, Qi Ye:
mmBody Benchmark: 3D Body Reconstruction Dataset and Analysis for Millimeter Wave Radar. 3501-3510 - Haizhuang Liu, Huimin Ma, Yilin Wang, Bochao Zou, Tianyu Hu, Rongquan Wang, Jiansheng Chen:
Eliminating Spatial Ambiguity for Weakly Supervised 3D Object Detection without Spatial Labels. 3511-3520 - Zhongwei Qiu, Qiansheng Yang, Jian Wang, Dongmei Fu:
Dynamic Graph Reasoning for Multi-person 3D Pose Estimation. 3521-3529 - Junlong Li, Yiheng Xu, Tengchao Lv, Lei Cui, Cha Zhang, Furu Wei:
DiT: Self-supervised Pre-training for Document Image Transformer. 3530-3539 - Nathan Louis, Jason J. Corso, Tylan N. Templin, Travis D. Eliason, Daniel P. Nicolella:
Learning to Estimate External Forces of Human Motion in Video. 3540-3548 - Meihuizi Jia, Xin Shen, Lei Shen, Jinhui Pang, Lejian Liao, Yang Song, Meng Chen, Xiaodong He:
Query Prior Matters: A MRC Framework for Multimodal Named Entity Recognition. 3549-3558 - Md Fahim Faysal Khan, Anusha Devulapally, Siddharth Advani, Vijaykrishnan Narayanan:
Robust Multimodal Depth Estimation using Transformer based Generative Adversarial Networks. 3559-3568 - Fu'ze Cong, Shibiao Xu, Li Guo, Yinbing Tian:
Caption-Aware Medical VQA via Semantic Focusing and Progressive Cross-Modality Comprehension. 3569-3577 - Guiyang Luo, Hui Zhang, Quan Yuan, Jinglin Li:
Complementarity-Enhanced and Redundancy-Minimized Collaboration Network for Multi-agent Perception. 3578-3586 - Qian Yang, Yunxin Li, Baotian Hu, Lin Ma, Yuxin Ding, Min Zhang:
Chunk-aware Alignment and Lexical Constraint for Visual Entailment with Natural Language Explanations. 3587-3597 - Xuelin Zhu, Jiuxin Cao, Jiawei Ge, Weijia Liu, Bo Liu:
Two-Stream Transformer for Multi-Label Image Classification. 3598-3607 - Dulanga Weerakoon, Vigneshwaran Subbaraju, Tuan Tran, Archan Misra:
SoftSkip: Empowering Multi-Modal Dynamic Pruning for Single-Stage Referring Comprehension. 3608-3616 - Ronghao Dang, Zhuofan Shi, Liuyi Wang, Zongtao He, Chengju Liu, Qijun Chen:
Unbiased Directed Object Attention Graph for Object Navigation. 3617-3627 - Meng Sun, Ju Ren, Xin Wang, Wenwu Zhu, Yaoxue Zhang:
FastPR: One-stage Semantic Person Retrieval via Self-supervised Learning. 3628-3636 - Yingchen Yu, Fangneng Zhan, Rongliang Wu, Jiahui Zhang, Shijian Lu, Miaomiao Cui, Xuansong Xie, Xian-Sheng Hua, Chunyan Miao:
Towards Counterfactual Image Manipulation via CLIP. 3637-3645 - Jiaqing Fan, Tiankang Su, Kaihua Zhang, Qingshan Liu:
Bidirectionally Learning Dense Spatio-temporal Feature Propagation Network for Unsupervised Video Object Segmentation. 3646-3655 - Shuyong Gao, Haozhe Xing, Wei Zhang, Yan Wang, Qianyu Guo, Wenqiang Zhang:
Weakly Supervised Video Salient Object Detection via Point Supervision. 3656-3665 - Rui Yan, Peng Huang, Xiangbo Shu, Junhao Zhang, Yonghua Pan, Jinhui Tang:
Look Less Think More: Rethinking Compositional Action Recognition. 3666-3675 - Xinhang Wan, Jiyuan Liu, Weixuan Liang, Xinwang Liu, Yi Wen, En Zhu:
Continual Multi-view Clustering. 3676-3684 - Tiejian Zhang, Xinwang Liu, En Zhu, Sihang Zhou, Zhibin Dong:
Efficient Anchor Learning-based Multi-view Clustering - A Late Fusion Method. 3685-3693 - Xianshuai Cao, Yuliang Shi, Jihu Wang, Han Yu, Xinjun Wang, Zhongmin Yan:
Cross-modal Knowledge Graph Contrastive Learning for Machine Learning Method Recommendation. 3694-3702 - Zan Gao, Hongwei Wei, Weili Guan, Weizhi Nie, Meng Liu, Meng Wang:
Multigranular Visual-Semantic Embedding for Cloth-Changing Person Re-identification. 3703-3711 - Liang Li, Baihua Zheng, Weiwei Sun:
Adaptive Structural Similarity Preserving for Unsupervised Cross Modal Hashing. 3712-3721 - Hao Sun, Hongyi Wang, Jiaqing Liu, Yen-Wei Chen, Lanfen Lin:
CubeMLP: An MLP-based Model for Multimodal Sentiment Analysis and Depression Estimation. 3722-3729 - Bicheng Guo, Tao Chen, Shibo He, Haoyu Liu, Lilin Xu, Peng Ye, Jiming Chen:
Generalized Global Ranking-Aware Neural Architecture Ranker for Efficient Image Classifier Search. 3730-3741 - Jinxiang Liu, Chen Ju, Weidi Xie, Ya Zhang:
Exploiting Transformation Invariance and Equivariance for Self-supervised Sound Localisation. 3742-3753 - Yanbin Hao, Jingru Duan, Hao Zhang, Bin Zhu, Pengyuan Zhou, Xiangnan He:
Unsupervised Video Hashing with Multi-granularity Contextualization and Multi-structure Preservation. 3754-3763 - Haiyang Liu, Naoya Iwamoto, Zihao Zhu, Zhengqing Li, You Zhou, Elif Bozkurt, Bo Zheng:
DisCo: Disentangled Implicit Content and Rhythm Learning for Diverse Co-Speech Gestures Synthesis. 3764-3773 - Man-Sheng Chen, Tuo Liu, Chang-Dong Wang, Dong Huang, Jian-Huang Lai:
Adaptively-weighted Integral Space for Fast Multiview Clustering. 3774-3782 - Zhiying Jiang, Zengxi Zhang, Xin Fan, Risheng Liu:
Towards All Weather and Unobstructed Multi-Spectral Image Stitching: Algorithm and Benchmark. 3783-3791 - Shizhe Hu, Ruilin Geng, Zhaoxu Cheng, Chaoyang Zhang, Guoliang Zou, Zhengzheng Lou, Yangdong Ye:
A Parameter-free Multi-view Information Bottleneck Clustering Method by Cross-view Weighting. 3792-3800 - Mengze Li, Tianbao Wang, Haoyu Zhang, Shengyu Zhang, Zhou Zhao, Wenqiao Zhang, Jiaxu Miao, Shiliang Pu, Fei Wu:
HERO: HiErarchical spatio-tempoRal reasOning with Contrastive Action Correspondence for End-to-End Video Object Grounding. 3801-3810 - Xiaoyu Zhou, Xiaotong Song, Hao Wu, Jingran Zhang, Xing Xu:
MAVT-FG: Multimodal Audio-Visual Transformer for Weakly-supervised Fine-Grained Recognition. 3811-3819 - Haichao Shi, Xiaoyu Zhang, Changsheng Li, Lixing Gong, Yong Li, Yongjun Bao:
Dynamic Graph Modeling for Weakly-Supervised Temporal Action Localization. 3820-3828 - Miaoyu Li, Yachao Zhang, Yuan Xie, Zuodong Gao, Cuihua Li, Zhizhong Zhang, Yanyun Qu:
Cross-Domain and Cross-Modal Knowledge Distillation in Domain Adaptation for 3D Semantic Segmentation. 3829-3837 - Eric Zhongcong Xu, Zeyang Song, Satoshi Tsutsui, Chao Feng, Mang Ye, Mike Zheng Shou:
AVA-AVD: Audio-visual Speaker Diarization in the Wild. 3838-3847 - Bo Peng, Liren He, Yining Qiu, Dong Wu, Mingmin Chi:
Image-Signal Correlation Network for Textile Fiber Identification. 3848-3856 - Derong Xu, Tong Xu, Shiwei Wu, Jingbo Zhou, Enhong Chen:
Relation-enhanced Negative Sampling for Multimodal Knowledge Graph Completion. 3857-3866 - Wuxuan Shi, Mang Ye, Bo Du:
Symmetric Uncertainty-Aware Feature Transmission for Depth Super-Resolution. 3867-3876 - Jiawei Fan, Yu Zhao, Xie Yu, Lihua Ma, Junqi Liu, Fangqiu Yi, Boxun Li:
DTR: An Information Bottleneck Based Regularization Framework for Video Action Recognition. 3877-3885 - Jin Yuan, Feng Hou, Yangzhou Du, Zhongchao Shi, Xin Geng, Jianping Fan, Yong Rui:
Self-Supervised Graph Neural Network for Multi-Source Domain Adaptation. 3907-3916 - Ho Yin Au, Jie Chen, Junkun Jiang, Yike Guo:
ChoreoGraph: Music-conditioned Automatic Dance Choreography over a Style and Tempo Consistent Dynamic Graph. 3917-3925 - Rui Peng, Tao Zhang, Bing Li, Yitong Wang:
Pixelwise Adaptive Discretization with Uncertainty Sampling for Depth Completion. 3926-3935 - Zhe Xue, Junping Du, Hai Zhu, Zhongchao Guan, Yunfei Long, Yu Zang, Meiyu Liang:
Robust Diversified Graph Contrastive Network for Incomplete Multi-view Clustering. 3936-3944 - Xiyu Wang, Yuecong Xu, Jianfei Yang, Kezhi Mao:
Calibrating Class Weights with Multi-Modal Information for Partial Video Domain Adaptation. 3945-3954 - Duo Chen, Zixin Tang, Yiguang Liu:
Cyclical Fusion: Accurate 3D Reconstruction via Cyclical Monotonicity. 3955-3964 - Tengfei Liang, Yi Jin, Wu Liu, Songhe Feng, Tao Wang, Yidong Li:
Keypoint-Guided Modality-Invariant Discriminative Learning for Visible-Infrared Person Re-identification. 3965-3973 - Gang Yang, Li Zhang, Man Zhou, Aiping Liu, Xun Chen, Zhiwei Xiong, Feng Wu:
Model-Guided Multi-Contrast Deep Unfolding Network for MRI Super-resolution Reconstruction. 3974-3982 - Fei Zhao, Chunhui Li, Zhen Wu, Shangyu Xing, Xinyu Dai:
Learning from Different text-image Pairs: A Relation-enhanced Graph Convolutional Network for Multimodal NER. 3983-3992 - Shuo Wang, Xinyu Zhang, Yanbin Hao, Chengbing Wang, Xiangnan He:
Multi-directional Knowledge Transfer for Few-Shot Learning. 3993-4002 - Yiming Sun, Bing Cao, Pengfei Zhu, Qinghua Hu:
DetFusion: A Detection-driven Infrared and Visible Image Fusion Network. 4003-4011 - Cuiqun Chen, Mang Ye, Meibin Qi, Bo Du:
Sketch Transformer: Asymmetrical Disentanglement Learning from Dynamic Synthesis. 4012-4020 - Jinxiang Lai, Siqian Yang, Guannan Jiang, Xi Wang, Yuxi Li, Zihui Jia, Xiaochen Chen, Jun Liu, Bin-Bin Gao, Wei Zhang, Yuan Xie, Chengjie Wang:
Rethinking the Metric in Few-shot Learning: From an Adaptive Multi-Distance Perspective. 4021-4030 - Yuanbin Wang, Leyan Zhu, Shaofei Huang, Tianrui Hui, Xiaojie Li, Fei Wang, Si Liu:
Cross-Modality Domain Adaptation for Freespace Detection: A Simple yet Effective Baseline. 4031-4042 - Jin Xie, Rao Muhammad Anwer, Hisham Cholakkal, Jing Nie, Jiale Cao, Jorma Laaksonen, Fahad Shahbaz Khan:
Learning a Dynamic Cross-Modal Network for Multispectral Pedestrian Detection. 4043-4052 - Haihan Wang, Shangfei Wang, Lin Fang:
Two-Stage Multi-Scale Resolution-Adaptive Network for Low-Resolution Face Recognition. 4053-4062 - Xuan Zhang, Xun Liang, Xiangping Zheng, Bo Wu, Yuhui Guo:
When True Becomes False: Few-Shot Link Prediction beyond Binary Relations through Mining False Positive Entities. 4063-4071 - Hanjia Lyu, Jiebo Luo:
Understanding Political Polarization via Jointly Modeling Users, Connections and Multimodal Contents on Heterogeneous Graphs. 4072-4082
Oral Session XI: Understanding Multimedia Content -- Vision and Language
- Yupan Huang, Tengchao Lv, Lei Cui, Yutong Lu, Furu Wei:
LayoutLMv3: Pre-training for Document AI with Unified Text and Image Masking. 4083-4091 - Daizong Liu, Xiaoye Qu, Wei Hu:
Reducing the Vision and Language Bias for Temporal Sentence Grounding. 4092-4101 - Luchuan Song, Xiaodan Li, Zheng Fang, Zhenchao Jin, Yuefeng Chen, Chenliang Xu:
Face Forgery Detection via Symmetric Transformer. 4102-4111 - Zaisheng Li, Yi Li, Liang Qiao, Pengfei Li, Zhanzhan Cheng, Yi Niu, Shiliang Pu, Xi Li:
End-to-End Compound Table Understanding with Multi-Modal Modeling. 4112-4121 - Yiyuan Zhang, Yuqi Ji:
Modality Eigen-Encodings Are Keys to Open Modality Informative Containers. 4122-4131 - Yue Ma, Yali Wang, Yue Wu, Ziyu Lyu, Siran Chen, Xiu Li, Yu Qiao:
Visual Knowledge Graph for Human Action Reasoning in Videos. 4132-4141 - Feilong Chen, Duzhen Zhang, Xiuyi Chen, Jing Shi, Shuang Xu, Bo Xu:
Unsupervised and Pseudo-Supervised Vision-Language Alignment in Visual Dialog. 4142-4153 - Jingqun Tang, Su Qiao, Benlei Cui, Yuhang Ma, Sheng Zhang, Dimitrios Kanoulas:
You Can even Annotate Text with Voice: Transcription-only-Supervised Text Spotting. 4154-4163 - Chao Bi, Shuhui Wang, Zhe Xue, Shengbo Chen, Qingming Huang:
Inferential Visual Question Generation. 4164-4174 - Gal-Lev Shalev, Gabi Shalev, Joseph Keshet:
A Baseline for Detecting Out-of-Distribution Examples in Image Captioning. 4175-4184 - Jingyuan Xu, Hongtao Xie, Chuanbin Liu, Yongdong Zhang:
Proxy Probing Decoder for Weakly Supervised Object Localization: A Baseline Investigation. 4185-4193 - Yusheng Zhao, Jinyu Chen, Chen Gao, Wenguan Wang, Lirong Yang, Haibing Ren, Huaxia Xia, Si Liu:
Target-Driven Structured Transformer Planner for Vision-Language Navigation. 4194-4203 - Xingchen Li, Long Chen, Wenbo Ma, Yi Yang, Jun Xiao:
Integrating Object-aware and Interaction-aware Knowledge for Weakly Supervised Scene Graph Generation. 4204-4213 - Mingkun Yang, Minghui Liao, Pu Lu, Jing Wang, Shenggao Zhu, Hualin Luo, Qi Tian, Xiang Bai:
Reading and Writing: Discriminative and Generative Modeling for Self-Supervised Text Recognition. 4214-4223 - Xudong Tian, Jun Liu, Zhizhong Zhang, Chengjie Wang, Yanyun Qu, Yuan Xie, Lizhuang Ma:
Hierarchical Walking Transformer for Object Re-Identification. 4224-4232 - Siying Wu, Xueyang Fu, Feng Wu, Zheng-Jun Zha:
Cross-modal Semantic Alignment Pre-training for Vision-and-Language Navigation. 4233-4241 - Rundong He, Zhongyi Han, Xiankai Lu, Yilong Yin:
RONF: Reliable Outlier Synthesis under Noisy Feature Space for Out-of-Distribution Detection. 4242-4251 - Yasunori Ohishi, Marc Delcroix, Tsubasa Ochiai, Shoko Araki, Daiki Takeuchi, Daisuke Niizumi, Akisato Kimura, Noboru Harada, Kunio Kashino:
ConceptBeam: Concept Driven Target Speech Extraction. 4252-4260 - Haoyu Cao, Xin Li, Jiefeng Ma, Deqiang Jiang, Antai Guo, Yiqing Hu, Hao Liu, Yinsong Liu, Bo Ren:
Query-driven Generative Network for Document Information Extraction in the Wild. 4261-4271 - Dezhi Peng, Xinyu Wang, Yuliang Liu, Jiaxin Zhang, Mingxin Huang, Songxuan Lai, Jing Li, Shenggao Zhu, Dahua Lin, Chunhua Shen, Xiang Bai, Lianwen Jin:
SPTS: Single-Point Text Spotting. 4272-4281 - Yiyang Ma, Huan Yang, Bei Liu, Jianlong Fu, Jiaying Liu:
AI Illustrator: Translating Raw Descriptions into Images by Prompt-based Cross-Modal Generation. 4282-4290 - Xiaoyu Zhang, Yulin Jin, Tao Wang, Jian Lou, Xiaofeng Chen:
Purifier: Plug-and-play Backdoor Mitigation for Pre-trained Models Via Anomaly Activation Suppression. 4291-4299 - Junsheng Wang, Tiantian Gong, Zhixiong Zeng, Changchang Sun, Yan Yan:
C3CMR: Cross-Modality Cross-Instance Contrastive Learning for Cross-Media Retrieval. 4300-4308 - Aihua Zheng, Peng Pan, Hongchao Li, Chenglong Li, Bin Luo, Chang Tan, Ruoran Jia:
Progressive Attribute Embedding for Accurate Cross-modality Person Re-ID. 4309-4317 - Lihua Zhou, Mao Ye, Xiatian Zhu, Shuaifeng Li, Yiguang Liu:
Class Discriminative Adversarial Learning for Unsupervised Domain Adaptation. 4318-4326 - Zhuowei Chen, Zhendong Mao, Shancheng Fang, Bo Hu:
Background Layout Generation and Object Knowledge Transfer for Text-to-Image Generation. 4327-4335 - Rengang Li, Baoyu Fan, Xiaochuan Li, Runze Zhang, Zhenhua Guo, Kun Zhao, Yaqian Zhao, Weifeng Gong, Endong Wang:
Towards Further Comprehension on Referring Expression with Rationale. 4336-4344 - Mengqi Huang, Zhendong Mao, Penghui Wang, Quan Wang, Yongdong Zhang:
DSE-GAN: Dynamic Semantic Evolution Generative Adversarial Network for Text-to-Image Generation. 4345-4354 - Hao Wei, Shuhui Wang, Xinzhe Han, Zhe Xue, Bin Ma, Xiaoming Wei, Xiaolin Wei:
Synthesizing Counterfactual Samples for Effective Image-Text Matching. 4355-4364 - Jingjing Zhang, Shancheng Fang, Zhendong Mao, Zhiwei Zhang, Yongdong Zhang:
Fine-tuning with Multi-modal Entity Prompts for News Image Captioning. 4365-4373 - Yangjun Mao, Long Chen, Zhihong Jiang, Dong Zhang, Zhimeng Zhang, Jian Shao, Jun Xiao:
Rethinking the Reference-based Distinctive Image Captioning. 4374-4384 - Alex Falcon, Giuseppe Serra, Oswald Lanz:
A Feature-space Multimodal Data Augmentation Technique for Text-video Retrieval. 4385-4394
Poster Session XI: Understanding Multimedia Content -- Vision and Language
- Zejun Li, Zhihao Fan, Huaixiao Tou, Jingjing Chen, Zhongyu Wei, Xuanjing Huang:
MVPTR: Multi-Level Semantic Alignment for Vision-Language Pre-Training via Multi-Stage Learning. 4395-4405 - Prince Jha, Gaël Dias, Alexis Lechervy, José G. Moreno, Anubhav Jangra, Sebastião Pais, Sriparna Saha:
Combining Vision and Language Representations for Patch-based Identification of Lexico-Semantic Relations. 4406-4415 - Weidong Chen, Dexiang Hong, Yuankai Qi, Zhenjun Han, Shuhui Wang, Laiyun Qing, Qingming Huang, Guorong Li:
Multi-Attention Network for Compressed Video Referring Object Segmentation. 4416-4425 - Kai Niu, Linjiang Huang, Yan Huang, Peng Wang, Liang Wang, Yanning Zhang:
Cross-modal Co-occurrence Attributes Alignments for Person Search by Language. 4426-4434 - Heqian Qiu, Hongliang Li, Taijin Zhao, Lanxiao Wang, Qingbo Wu, Fanman Meng:
RefCrowd: Grounding the Target in Crowd with Referring Expressions. 4435-4444 - Qiming Yang, Kai Zhang, Chaoxiang Lan, Zhi Yang, Zheyang Li, Wenming Tan, Jun Xiao, Shiliang Pu:
Unified Normalization for Accelerating and Stabilizing Transformers. 4445-4455 - Hui Zhu, Yongchun Lü, Hongbin Wang, Xunyi Zhou, Qin Ma, Yanhong Liu, Ning Jiang, Xin Wei, Linchengxi Zeng, Xiaofang Zhao:
Enhancing Semi-Supervised Learning with Cross-Modal Knowledge. 4456-4465 - Zi Qian, Xin Wang, Xuguang Duan, Hong Chen, Wenwu Zhu:
Dynamic Spatio-Temporal Modular Network for Video Question Answering. 4466-4477 - Xiao Wang, Tian Gan, Yinwei Wei, Jianlong Wu, Dai Meng, Liqiang Nie:
Micro-video Tagging via Jointly Modeling Social Influence and Tag Relation. 4478-4486 - Qiang Zhou, Chaohui Yu, Hao Luo, Zhibin Wang, Hao Li:
MimCo: Masked Image Modeling Pre-training with Contrastive Teacher. 4487-4495 - Gaoxiang Cong, Liang Li, Zhenhuan Liu, Yunbin Tu, Weijun Qin, Shenyuan Zhang, Chengang Yan, Wenyu Wang, Bin Jiang:
LS-GAN: Iterative Language-based Image Manipulation via Long and Short Term Consistency Reasoning. 4496-4504 - Chuanpeng Yang, Fuqing Zhu, Guihua Liu, Jizhong Han, Songlin Hu:
Multimodal Hate Speech Detection via Cross-Domain Knowledge Transfer. 4505-4514 - Zhiyuan Ma, Jianjun Li, Guohui Li, Kaiyan Huang:
CMAL: A Novel Cross-Modal Associative Learning Framework for Vision-Language Pre-Training. 4515-4524 - Xujie Zhang, Yu Sha, Michael C. Kampffmeyer, Zhenyu Xie, Zequn Jie, Chengwen Huang, Jianqing Peng, Xiaodan Liang:
ARMANI: Part-level Garment-Text Alignment for Unified Cross-Modal Fashion Design. 4525-4535 - Daizong Liu, Wei Hu:
Skimming, Locating, then Perusing: A Human-Like Framework for Natural Language Video Localization. 4536-4545 - Guangzhi Wang, Yangyang Guo, Yongkang Wong, Mohan S. Kankanhalli:
Distance Matters in Human-Object Interaction Detection. 4546-4554 - Chen-Wei Xie, Jianmin Wu, Yun Zheng, Pan Pan, Xian-Sheng Hua:
Token Embeddings Alignment for Cross-Modal Retrieval. 4555-4563 - Zan-Xia Jin, Mike Zheng Shou, Fang Zhou, Satoshi Tsutsui, Jingyan Qin, Xu-Cheng Yin:
From Token to Word: OCR Token Evolution via Contrastive Learning and Semantic Matching for Text-VQA. 4564-4572 - Xinyu Huang, Youcai Zhang, Ying Cheng, Weiwei Tian, Ruiwei Zhao, Rui Feng, Yuejie Zhang, Yaqian Li, Yandong Guo, Xiaobo Zhang:
IDEA: Increasing Text Diversity via Online Multi-Label Recognition for Vision-Language Pre-training. 4573-4583 - Guohao Li, Hu Yang, Feng He, Zhifan Feng, Yajuan Lyu, Hua Wu, Haifeng Wang:
CLOP: Video-and-Language Pre-Training with Knowledge Regularizations. 4584-4593 - Yudong Li, Xianxu Hou, Zhe Zhao, Linlin Shen, Xuefeng Yang, Kimmo Yan:
Talk2Face: A Unified Sequence-based Framework for Diverse Face Generation and Analysis Tasks. 4594-4604 - Zhenyu Wu, Zhou Ren, Yi Wu, Zhangyang Wang, Gang Hua:
TxVAD: Improved Video Action Detection by Transformers. 4605-4613 - Xin Li, Yan Zheng, Yiqing Hu, Haoyu Cao, Yunfei Wu, Deqiang Jiang, Yinsong Liu, Bo Ren:
Relational Representation Learning in Visually-Rich Documents. 4614-4624 - Zihao Wang, Junli Wang, Changjun Jiang:
Unified Multimodal Model with Unlikelihood Training for Visual Dialog. 4625-4634 - Manyi Zhang, Yuxin Ren, Zihao Wang, Chun Yuan:
Tackling Instance-Dependent Label Noise with Dynamic Distribution Calibration. 4635-4644 - Muhammad Umer Anwaar, Zhihui Pan, Martin Kleinsteuber:
On Leveraging Variational Graph Embeddings for Open World Compositional Zero-Shot Learning. 4645-4654 - Feifei Zhang, Ming Yan, Ji Zhang, Changsheng Xu:
Comprehensive Relationship Reasoning for Composed Query Based Image Retrieval. 4655-4664 - Ramtin Hosseini, Pengtao Xie:
Image Understanding by Captioning with Differentiable Architecture Search. 4665-4673 - Muqi Huang, Lefei Zhang:
Atrous Pyramid Transformer with Spectral Convolution for Image Inpainting. 4674-4683 - Ding Ma, Xiangqian Wu:
QuadTreeCapsule: QuadTree Capsules for Deep Regression Tracking. 4684-4693 - Qixin Deng, Binh Huy Le, Aobo Jin, Zhigang Deng:
End-to-End 3D Face Reconstruction with Expressions and Specular Albedos from Single In-the-wild Images. 4694-4703 - Yunqing He, Tongwei Ren, Jinhui Tang, Gangshan Wu:
Heterogeneous Learning for Scene Graph Generation. 4704-4713 - Yicong Li, Xiang Wang, Junbin Xiao, Tat-Seng Chua:
Equivariant and Invariant Grounding for Video Question Answering. 4714-4722 - Yan Yu, Yuchen Zhai, Yin Zhang:
Align and Adapt: A Two-stage Adaptation Framework for Unsupervised Domain Adaptation. 4723-4732 - Yutong Tan, Zheng Lin, Peng Fu, Mingyu Zheng, Lanrui Wang, Yanan Cao, Weiping Wang:
Detach and Attach: Stylized Image Captioning without Paired Stylized Dataset. 4733-4741 - Wei Zhang, Xiaohong Zhang, Sheng Huang, Yuting Lu, Kun Wang:
PixelSeg: Pixel-by-Pixel Stochastic Semantic Segmentation for Ambiguous Medical Images. 4742-4750 - Wei Zhang, Xiaohong Zhang, Sheng Huang, Yuting Lu, Kun Wang:
A Probabilistic Model for Controlling Diversity and Accuracy of Ambiguous Medical Image Segmentation. 4751-4759 - Ziyu Zhao, Zhenyao Wu, Xinyi Wu, Canyu Zhang, Song Wang:
Crossmodal Few-shot 3D Point Cloud Semantic Segmentation. 4760-4768 - Ben Fei, Weidong Yang, Wen-Ming Chen, Lipeng Ma:
VQ-DcTr: Vector-Quantized Autoencoder With Dual-channel Transformer Points Splitting for 3D Point Cloud Completion. 4769-4778 - Baoli Sun, Xinchen Ye, Tiantian Yan, Zhihui Wang, Haojie Li, Zhiyong Wang:
Fine-grained Action Recognition with Robust Motion Representation Decoupling and Concentration. 4779-4788 - Sheng Fang, Shuhui Wang, Junbao Zhuo, Qingming Huang, Bin Ma, Xiaoming Wei, Xiaolin Wei:
Concept Propagation via Attentional Knowledge Graph Reasoning for Video-Text Retrieval. 4789-4800 - Jingye Wang, Ruoyi Du, Dongliang Chang, Kongming Liang, Zhanyu Ma:
Domain Generalization via Frequency-domain-based Feature Disentanglement and Interaction. 4821-4829 - Runpeng Hou, Ziyuan Ye, Chengyu Yang, Linhao Fu, Chao Liu, Quanying Liu:
Immunofluorescence Capillary Imaging Segmentation: Cases Study. 4830-4838 - Siyuan Liang, Aishan Liu, Jiawei Liang, Longkang Li, Yang Bai, Xiaochun Cao:
Imitated Detectors: Stealing Knowledge of Black-box Object Detectors. 4839-4847 - Wu Zheng, Li Jiang, Fanbin Lu, Yangyang Ye, Chi-Wing Fu:
Boosting Single-Frame 3D Object Detection by Simulating Multi-Frame Point Clouds. 4848-4856 - Fengbin Zhu, Wenqiang Lei, Fuli Feng, Chao Wang, Haozhou Zhang, Tat-Seng Chua:
Towards Complex Document Understanding By Discrete Reasoning. 4857-4866 - Hanlin Li, Guanting Dong, Yueyi Zhang, Xiaoyan Sun, Zhiwei Xiong:
RPPformer-Flow: Relative Position Guided Point Transformer for Scene Flow Estimation. 4867-4876 - Wenjin Wang, Zhengjie Huang, Bin Luo, Qianglong Chen, Qiming Peng, Yinxu Pan, Weichong Yin, Shikun Feng, Yu Sun, Dianhai Yu, Yin Zhang:
mmLayout: Multi-grained MultiModal Transformer for Document Understanding. 4877-4886 - Haoran Wang, Di Xu, Dongliang He, Fu Li, Zhong Ji, Jungong Han, Errui Ding:
Boosting Video-Text Retrieval with Explicit High-Level Semantics. 4887-4898 - Hengyi Zhou, Longjun Liu, Haonan Zhang, Nanning Zheng:
Rethinking the Mechanism of the Pattern Pruning and the Circle Importance Hypothesis. 4899-4908 - Xinya Wu, Duo Zheng, Ruonan Wang, Jiashen Sun, Minzhen Hu, Fangxiang Feng, Xiaojie Wang, Huixing Jiang, Fan Yang:
A Region-based Document VQA. 4909-4920 - Hui Lu, Xuan Cheng, Wentao Xia, Pan Deng, Minghui Liu, Tianshu Xie, Xiaomin Wang, Ming Liu:
CyclicShift: A Data Augmentation Method For Enriching Data Patterns. 4921-4929 - Jinqiang Wang, Rui Hu, Chaoquan Jiang, Rui Hu, Jitao Sang:
Counterexample Contrastive Learning for Spurious Correlation Elimination. 4930-4938 - Tao Jin, Zhou Zhao, Meng Zhang, Xingshan Zeng:
MC-SLT: Towards Low-Resource Signer-Adaptive Sign Language Translation. 4939-4947 - Yang Qin, Dezhong Peng, Xi Peng, Xu Wang, Peng Hu:
Deep Evidential Learning with Noisy Correspondence for Cross-modal Retrieval. 4948-4956 - Hongyu Gao, Chao Zhu, Mengyin Liu, Weibo Gu, Hongfa Wang, Wei Liu, Xu-Cheng Yin:
CAliC: Accurate and Efficient Image-Text Retrieval via Contrastive Alignment and Visual Contexts Modeling. 4957-4966 - Meng Cao, Ji Jiang, Long Chen, Yuexian Zou:
Correspondence Matters for Video Referring Expression Comprehension. 4967-4976 - Zheng Wang, Zhenwei Gao, Xing Xu, Yadan Luo, Yang Yang, Heng Tao Shen:
Point to Rectangle Matching for Image Text Retrieval. 4977-4986 - Ruijie Hou, Yanran Li, Ningyu Zhang, Yulin Zhou, Xiaosong Yang, Zhao Wang:
Shifting Perspective to See Difference: A Novel Multi-view Method for Skeleton based Action Recognition. 4987-4995 - Yi Zhang, Junyang Wang, Jitao Sang:
Counterfactually Measuring and Eliminating Social Bias in Vision-Language Pre-training Models. 4996-5004 - Jiaming Zhang, Qi Yi, Jitao Sang:
Towards Adversarial Attack on Vision-Language Pre-training Models. 5005-5013 - Wei Wang, Yu Zhou, Jiahao Lyu, Dayan Wu, Guoqing Zhao, Ning Jiang, Weiping Wang:
TPSNet: Reverse Thinking of Thin Plate Splines for Arbitrary Shape Scene Text Representation. 5014-5025 - Zhengcong Fei:
Efficient Modeling of Future Context for Image Captioning. 5026-5035 - Banglei Guan, Ji Zhao:
Relative Pose Estimation for Multi-Camera Systems from Point Correspondences with Scale Ratio. 5036-5044 - Jun Peng, Han Pan, Yiyi Zhou, Jing He, Xiaoshuai Sun, Yan Wang, Yongjian Wu, Rongrong Ji:
Towards Open-Ended Text-to-Face Generation, Combination and Manipulation. 5045-5054 - Dongqing Wu, Huihui Li, Cang Gu, Lei Guo, Hang Liu:
Improving Fusion of Region Features and Grid Features via Two-Step Interaction for Image-Text Retrieval. 5055-5064 - Weixin An, Yingjie Yue, Yuanyuan Liu, Fanhua Shang, Hongying Liu:
A Numerical DEs Perspective on Unfolded Linearized ADMM Networks for Inverse Problems. 5065-5073 - Yonghui Wang, Wengang Zhou, Zhenbo Lu, Houqiang Li:
UDoc-GAN: Unpaired Document Illumination Correction with Background Light Prior. 5074-5082 - Juncheng Li, Junlin Xie, Linchao Zhu, Long Qian, Siliang Tang, Wenqiao Zhang, Haochen Shi, Shengyu Zhang, Longhui Wei, Qi Tian, Yueting Zhuang:
Dilated Context Integrated Network with Cross-Modal Consensus for Temporal Emotion Localization in Videos. 5083-5092 - Dong Wang, Yicheng Liu, Liangji Fang, Fanhua Shang, Yuanyuan Liu, Hongying Liu:
Balanced Gradient Penalty Improves Deep Long-Tailed Learning. 5093-5101 - Jinlu Zhang, Yujin Chen, Zhigang Tu:
Uncertainty-Aware 3D Human Pose Estimation from Monocular Video. 5102-5113 - Wenpeng Xing, Jie Chen:
MVSPlenOctree: Fast and Generic Reconstruction of Radiance Fields in PlenOctree from Multi-view Stereo. 5114-5122 - Junkun Jiang, Jie Chen, Yike Guo:
A Dual-Masked Auto-Encoder for Robust Motion Capture with Spatial-Temporal Skeletal Token Completion. 5123-5131 - Jun Peng, Xiaoxiong Du, Yiyi Zhou, Jing He, Yunhang Shen, Xiaoshuai Sun, Rongrong Ji:
Learning Dynamic Prior Knowledge for Text-to-Face Pixel Synthesis. 5132-5141 - Jingzheng Li, Hailong Sun:
Correct Twice at Once: Learning to Correct Noisy Labels for Robust Deep Learning. 5142-5151 - Zhihong Chen, Guanbin Li, Xiang Wan:
Align, Reason and Learn: Enhancing Medical Vision-and-Language Pre-training with Knowledge. 5152-5161 - Lingwei Dang, Yongwei Nie, Chengjiang Long, Qing Zhang, Guiqing Li:
Diverse Human Motion Prediction via Gumbel-Softmax Sampling from an Auxiliary Space. 5162-5171 - Meng Wang, Chaoyue Wang, Xiaojie Guo, Jiawan Zhang:
Towards High-Fidelity Face Normal Estimation. 5172-5180 - Yuxuan Wang, Jiakai Wang, Zixin Yin, Ruihao Gong, Jingyi Wang, Aishan Liu, Xianglong Liu:
Generating Transferable Adversarial Examples against Vision Transformers. 5181-5190 - Yan Xia, Zhou Zhao, Shangwei Ye, Yang Zhao, Haoyuan Li, Yi Ren:
Video-Guided Curriculum Learning for Spoken Video Grounding. 5191-5200 - Chen Li, Li Song, Xueyi Zou, Jiaming Guo, Youliang Yan, Wenjun Zhang:
Multi-Scale Coarse-to-Fine Transformer for Frame Interpolation. 5201-5209 - Pengpeng Zeng, Jinkuan Zhu, Jingkuan Song, Lianli Gao:
Progressive Tree-Structured Prototype Network for End-to-End Image Captioning. 5210-5218 - Miaohui Wang, Zhuowei Xu, Yuanhao Gong, Wuyuan Xie:
S-CCR: Super-Complete Comparative Representation for Low-Light Image Quality Inference In-the-wild. 5219-5227 - Mohammed M. Alghamdi, He Wang, Andrew J. Bulpitt, David C. Hogg:
Talking Head from Speech Audio using a Pre-trained Image Generator. 5228-5236 - Junjie Li, Zilei Wang, Yuan Gao, Xiaoming Hu:
Exploring High-quality Target Domain Information for Unsupervised Domain Adaptive Semantic Segmentation. 5237-5245 - Aishwarya Agarwal, Biplab Banerjee, Fabio Cuzzolin, Subhasis Chaudhuri:
Semantics-Driven Generative Replay for Few-Shot Class Incremental Learning. 5246-5254 - Lingling Gao, Yanli Ji, Yang Yang, Heng Tao Shen:
Global-Local Cross-View Fisher Discrimination for View-Invariant Action Recognition. 5255-5264 - Chenchen Ye, Lizi Liao, Suyu Liu, Tat-Seng Chua:
Reflecting on Experiences for Response Generation. 5265-5273 - Rengang Li, Cong Xu, Zhenhua Guo, Baoyu Fan, Runze Zhang, Wei Liu, Yaqian Zhao, Weifeng Gong, Endong Wang:
AI-VQA: Visual Question Answering based on Agent Interaction with Interpretability. 5274-5282 - Bo Xu, Jiake Xie, Han Huang, Ziwen Li, Cheng Lu, Yong Tang, Yandong Guo:
Situational Perception Guided Image Matting. 5283-5293 - Zhenjie Yu, Kai Chen, Shuang Li, Bingfeng Han, Chi Harold Liu, Shuigen Wang:
ROMA: Cross-Domain Region Similarity Matching for Unpaired Nighttime Infrared to Daytime Visible Video Translation. 5294-5302 - Liming Zhai, Qing Guo, Xiaofei Xie, Lei Ma, Yi Estelle Wang, Yang Liu:
A3GAN: Attribute-Aware Anonymization Networks for Face De-identification. 5303-5313 - Zijie Wang, Aichun Zhu, Jingyi Xue, Xili Wan, Chao Liu, Tian Wang, Yifeng Li:
CAIBC: Capturing All-round Information Beyond Color for Text-based Person Retrieval. 5314-5322 - Miao Zhang, Shuang Xu, Yongri Piao, Dongxiang Shi, Shusen Lin, Huchuan Lu:
PreyNet: Preying on Camouflaged Objects. 5323-5332 - Hanzhe Sun, Jun Liu, Zhizhong Zhang, Chengjie Wang, Yanyun Qu, Yuan Xie, Lizhuang Ma:
Not All Pixels Are Matched: Dense Contrastive Learning for Cross-Modality Person Re-Identification. 5333-5341 - Shiting Xu, Zhiheng Zhou, Junyuan Shang:
Asymmetric Adversarial-based Feature Disentanglement Learning for Cross-Database Micro-Expression Recognition. 5342-5350 - Yuhua Sun, Tailai Zhang, Xingjun Ma, Pan Zhou, Jian Lou, Zichuan Xu, Xing Di, Yu Cheng, Lichao Sun:
Backdoor Attacks on Crowd Counting. 5351-5360 - Kangcheng Liu:
Robust Industrial UAV/UGV-Based Unsupervised Domain Adaptive Crack Recognitions with Depth and Edge Awareness: From System and Database Constructions to Real-Site Inspections. 5361-5370 - Ziqiang Li, Yongxin Ge, Jiaruo Yu, Zhongming Chen:
Forcing the Whole Video as Background: An Adversarial Learning Strategy for Weakly Temporal Action Localization. 5371-5379 - Yifu Ding, Haotong Qin, Qinghua Yan, Zhenhua Chai, Junjie Liu, Xiaolin Wei, Xianglong Liu:
Towards Accurate Post-Training Quantization for Vision Transformer. 5380-5388 - Zhaoyang Jia, Yan Lu, Houqiang Li:
Neighbor Correspondence Matching for Flow-based Video Frame Synthesis. 5389-5397 - Xuewen Yang, Yingru Liu, Xin Wang:
ReFormer: The Relational Transformer for Image Captioning. 5398-5406 - Yu Xiong, Fabian Caba Heilbron, Dahua Lin:
Transcript to Video: Efficient Clip Sequencing from Texts. 5407-5416 - Senbo Yan, Liang Peng, Chuer Yu, Zheng Yang, Haifeng Liu, Deng Cai:
Domain Reconstruction and Resampling for Robust Salient Object Detection. 5417-5426 - Ye Liu, Liang Wan, Huazhu Fu, Jing Qin, Lei Zhu:
Phase-based Memory Network for Video Dehazing. 5427-5435 - Jun-Hao Zhuang, Yi-Si Luo, Xile Zhao, Tai-Xiang Jiang, Bichuan Guo:
UConNet: Unsupervised Controllable Network for Image and Video Deraining. 5436-5445 - Ziqi Jiang, Shengyu Zhang, Siyuan Yao, Wenqiao Zhang, Sihan Zhang, Juncheng Li, Zhou Zhao, Fei Wu:
Weakly-supervised Disentanglement Network for Video Fingerspelling Detection. 5446-5455 - Hongxiang Huang, Daihui Yang, Gang Dai, Zhen Han, Yuyi Wang, Kin-Man Lam, Fan Yang, Shuangping Huang, Yongge Liu, Mengchao He:
AGTGAN: Unpaired Image Translation for Photographic Ancient Character Generation. 5456-5467 - Yiren Song:
CLIPTexture: Text-Driven Texture Synthesis. 5468-5476 - Junjie Wang, Zhenbo Yu, Zhengyan Tong, Hang Wang, Jinxian Liu, Wenjun Zhang, Xiaoyan Wu:
OCR-Pose: Occlusion-aware Contrastive Representation for Unsupervised 3D Human Pose Estimation. 5477-5485 - Wencan Huang, Zhou Zhao, Jinzheng He, Mingmin Zhang:
DualSign: Semi-Supervised Sign Language Production with Balanced Multi-Modal Multi-Task Dual Transformation. 5486-5495 - Ce Zheng, Matías Mendieta, Pu Wang, Aidong Lu, Chen Chen:
A Lightweight Graph Transformer Network for Human Mesh Reconstruction from 2D Human Pose. 5496-5507 - Yue He, Minyue Jiang, Xiaoqing Ye, Liang Du, Zhikang Zou, Wei Zhang, Xiao Tan, Errui Ding:
Repainting and Imitating Learning for Lane Detection. 5508-5516 - Hao Wang, Guosheng Lin, Steven C. H. Hoi, Chunyan Miao:
Paired Cross-Modal Data Augmentation for Fine-Grained Image-to-Text Retrieval. 5517-5526 - Yulu Zhang, Liang Sang, Marcin Grzegorzek, John See, Cong Yang:
BlumNet: Graph Component Detection for Object Skeleton Extraction. 5527-5536 - Zihan Ding, Zi-han Ding, Tianrui Hui, Junshi Huang, Xiaoming Wei, Xiaolin Wei, Si Liu:
PPMN: Pixel-Phrase Matching Network for One-Stage Panoptic Narrative Grounding. 5537-5546 - Guangchen Shi, Yirui Wu, Jun Liu, Shaohua Wan, Wenhai Wang, Tong Lu:
Incremental Few-Shot Semantic Segmentation via Embedding Adaptive-Update and Hyper-class Representation. 5547-5556 - Zhenyu Wu, Lin Wang, Wei Wang, Tengfei Shi, Chenglizhao Chen, Aimin Hao, Shuo Li:
Synthetic Data Supervised Salient Object Detection. 5557-5565 - Zhiyin Shao, Xinyu Zhang, Meng Fang, Zhifeng Lin, Jian Wang, Changxing Ding:
Learning Granularity-Unified Representations for Text-to-Image Person Re-identification. 5566-5574 - Cheng Chen, Ji Zhang, Jingkuan Song, Lianli Gao:
Class Gradient Projection For Continual Learning. 5575-5583 - Song Chang, Youfang Lin, Shuo Zhang:
Flexible Hybrid Lenses Light Field Super-Resolution using Layered Refinement. 5584-5592 - Jingliang Li, Zhengda Lu, Yiqun Wang, Ying Wang, Jun Xiao:
DS-MVSNet: Unsupervised Multi-view Stereo via Depth Synthesis. 5593-5601 - Min Zhang, Zhihong Pan, Xin Zhou, C.-C. Jay Kuo:
Enhancing Image Rescaling using Dual Latent Variables in Invertible Neural Network. 5602-5610 - Qi Liu, Nianjuan Jiang, Jiangbo Lu, Mingang Chen, Ran Yi, Lizhuang Ma:
ScatterNet: Point Cloud Learning via Scatters. 5611-5619 - Wenxuan Ma, Jinming Zhang, Shuang Li, Chi Harold Liu, Yulin Wang, Wei Li:
Making The Best of Both Worlds: A Domain-Oriented Transformer for Unsupervised Domain Adaptation. 5620-5629 - Shengeng Tang, Richang Hong, Dan Guo, Meng Wang:
Gloss Semantic-Enhanced Network with Online Back-Translation for Sign Language Production. 5630-5638 - Bo Ju, Zhikang Zou, Xiaoqing Ye, Minyue Jiang, Xiao Tan, Errui Ding, Jingdong Wang:
Paint and Distill: Boosting 3D Object Detection with Semantic Passing Network. 5639-5648 - Shuangrui Ding, Rui Qian, Hongkai Xiong:
Dual Contrastive Learning for Spatio-temporal Representation. 5649-5658 - Huilin Zhu, Jingling Yuan, Zhengwei Yang, Xian Zhong, Zheng Wang:
Fine-Grained Fragment Diffusion for Cross Domain Crowd Counting. 5659-5668 - Teng Yang, Yue Wang, Lu Zhang, Jinqing Qi, Huchuan Lu:
Depth-inspired Label Mining for Unsupervised RGB-D Salient Object Detection. 5669-5677 - Yongqi Wang, Zhou Zhao:
FastLTS: Non-Autoregressive End-to-End Unconstrained Lip-to-Speech Synthesis. 5678-5687 - Ruitong Gan, Junsong Fan, Yuxi Wang, Zhaoxiang Zhang:
Interact with Open Scenes: A Life-long Evolution Framework for Interactive Segmentation Models. 5688-5697 - Duo Zheng, Fandong Meng, Qingyi Si, Hairun Fan, Zipeng Xu, Jie Zhou, Fangxiang Feng, Xiaojie Wang:
Visual Dialog for Spotting the Differences between Pairs of Similar Images. 5698-5709 - Xiang-Jun Shen, Zhaorui Xu, Liangjun Wang, Zechao Li:
Time and Memory Efficient Large-Scale Canonical Correlation Analysis in Fourier Domain. 5710-5718
Oral Session XII: Understanding Multimedia Content -- Media Interpretation
- Jie Zhang, Yin Zhao, Kai Qian:
Enlarging the Long-time Dependencies via RL-based Memory Network in Movie Affective Analysis. 5739-5750 - Shuhan Zhong, Sizhe Song, Guanyao Li, S.-H. Gary Chan:
A Tree-Based Structure-Aware Transformer Decoder for Image-To-Markup Generation. 5751-5760 - Junbao Zhuo, Yan Zhu, Shuhao Cui, Shuhui Wang, Bin Ma, Qingming Huang, Xiaoming Wei, Xiaolin Wei:
Zero-shot Video Classification with Appropriate Web and Task Knowledge Transfer. 5761-5772 - Hao Zhang, Lechao Cheng, Yanbin Hao, Chong-Wah Ngo:
Long-term Leap Attention, Short-term Periodic Shift for Video Classification. 5773-5782 - Jianjun Xu, Hongtao Xie, Hai Xu, Yuxin Wang, Sun'ao Liu, Yongdong Zhang:
Boat in the Sky: Background Decoupling and Object-aware Pooling for Weakly Supervised Semantic Segmentation. 5783-5792 - Shuang Wang, Lianli Gao, Xinyu Lyu, Yuyu Guo, Pengpeng Zeng, Jingkuan Song:
Dynamic Scene Graph Generation via Temporal Prior Inference. 5793-5801 - Xinyao Li, Zhekai Du, Jingjing Li, Lei Zhu, Ke Lu:
Source-Free Active Domain Adaptation via Energy-Based Locality Preserving Transfer. 5802-5810 - Jingbei Li, Yi Meng, Xixin Wu, Zhiyong Wu, Jia Jia, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
Inferring Speaking Styles from Multi-modal Conversational Context by Multi-scale Relational Graph Convolutional Networks. 5811-5820 - Cláudio Bartolomeu, Rui Nóbrega, David Semedo:
Understanding News Text and Images Connection with Context-enriched Multimodal Transformers. 5821-5832 - Daichi Zhang, Fanzhao Lin, Yingying Hua, Pengju Wang, Dan Zeng, Shiming Ge:
Deepfake Video Detection with Spatiotemporal Dropout Transformer. 5833-5841 - Jiaqi Ma, Shengyuan Yan, Lefei Zhang, Guoli Wang, Qian Zhang:
ELMformer: Efficient Raw Image Restoration with a Locally Multiplicative Transformer. 5842-5852 - Hongbo Sun, Xiangteng He, Yuxin Peng:
SIM-Trans: Structure Information Modeling Transformer for Fine-grained Visual Categorization. 5853-5861 - Zhipeng Yu, Qianqian Xu, Yangbangyan Jiang, Haoyu Qin, Qingming Huang:
Pay Attention to Your Positive Pairs: Positive Pair Aware Contrastive Knowledge Distillation. 5862-5870 - Menglu Wang, Xueyang Fu, Jiawei Liu, Zheng-Jun Zha:
JPEG Compression-aware Image Forgery Localization. 5871-5879
Poster Session XII: Understanding Multimedia Content -- Media Interpretation
- Yi Tan, Yanbin Hao, Hao Zhang, Shuo Wang, Xiangnan He:
Hierarchical Hourglass Convolutional Network for Efficient Video Classification. 5880-5891 - Jin Wei, Yuan Zhang, Yu Zhou, Gangyan Zeng, Zhi Qiao, Youhui Guo, Haiying Wu, Hongbin Wang, Weiping Wang:
TextBlock: Towards Scene Text Spotting without Fine-grained Detection. 5892-5902 - Jianyuan Ni, Anne H. H. Ngu, Yan Yan:
Progressive Cross-modal Knowledge Distillation for Human Action Recognition. 5903-5912 - Zijie Yang, Lingxi Xie, Xinyue Huo, Sheng Tang, Qi Tian, Yongdong Zhang:
Finding the Host from the Lesion by Iteratively Mining the Registration Graph. 5913-5922 - Jonathan Samuel Lumentut, In Kyu Park:
3D Body Reconstruction Revisited: Exploring the Test-time 3D Body Mesh Refinement Strategy via Surrogate Adaptation. 5923-5933 - Felix Ott, David Rügamer, Lucas Heublein, Bernd Bischl, Christopher Mutschler:
Domain Adaptation for Time-Series Classification to Mitigate Covariate Shift. 5934-5943 - Pavel Korshunov, Sébastien Marcel:
Face Anthropometry Aware Audio-visual Age Verification. 5944-5951 - Xiaoxuan Chai, Junchi Zhou, Hang Zhou, Jui-Hsin Lai:
PDD-GAN: Prior-based GAN Network with Decoupling Ability for Single Image Dehazing. 5952-5960 - Yechao Xu, Zhengxing Sun, Qian Li, Yunhan Sun, Shoutong Luo:
Active Patterns Perceived for Stochastic Video Prediction. 5961-5969 - Nan Song, Chi Zhang, Guosheng Lin:
Few-shot Open-set Recognition Using Background as Unknowns. 5970-5979 - Yibo Wang, Yunhu Ye, Yuanpeng Mao, Yanwei Yu, Yuanping Song:
Self-supervised Scene Text Segmentation with Object-centric Layered Representations Augmented by Text Regions. 5980-5989 - Cunling Bian, Wei Feng, Song Wang:
Self-Supervised Representation Learning for Skeleton-Based Group Activity Recognition. 5990-5998 - Zehui Chen, Zhenyu Li, Shiquan Zhang, Liangji Fang, Qinhong Jiang, Feng Zhao:
Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection. 5999-6008 - Qianyu Zhou, Ke-Yue Zhang, Taiping Yao, Ran Yi, Shouhong Ding, Lizhuang Ma:
Adaptive Mixture of Experts Learning for Generalizable Face Anti-Spoofing. 6009-6018 - Meijie Zhang, Jianwu Li, Tianfei Zhou:
Multi-Granular Semantic Mining for Weakly Supervised Semantic Segmentation. 6019-6028 - Siwei Su, Haijian Wang, Meng Yang:
Consistency Learning based on Class-Aware Style Variation for Domain Generalizable Semantic Segmentation. 6029-6038 - Yinsong Xu, Zhuqing Jiang, Aidong Men, Yang Liu, Qingchao Chen:
Delving into the Continuous Domain Adaptation. 6039-6049 - Zihua Liu, Songyan Zhang, Zhicheng Wang, Masatoshi Okutomi:
Digging Into Normal Incorporated Stereo Matching. 6050-6060 - Wenjing Huang, Shikui Tu, Lei Xu:
Box-FaceS: A Bidirectional Method for Box-Guided Face Component Editing. 6061-6071 - Xuhao Jiang, Weimin Tan, Ri Cheng, Shili Zhou, Bo Yan:
Learning Parallax Transformer Network for Stereo Image JPEG Artifacts Removal. 6072-6082 - Ri Cheng, Yuqi Sun, Bo Yan, Weimin Tan, Chenxi Ma:
Geometry-Aware Reference Synthesis for Multi-View Image Super-Resolution. 6083-6093 - Xinyan Zu, Haiyang Yu, Bin Li, Xiangyang Xue:
Chinese Character Recognition with Augmented Character Profile Matching. 6094-6102 - Qianyue Bao, Fang Liu, Yang Liu, Licheng Jiao, Xu Liu, Lingling Li:
Hierarchical Scene Normality-Binding Modeling for Anomaly Detection in Surveillance Videos. 6103-6112 - Haiyang Ying, Jinzhi Zhang, Yuzhe Chen, Zheng Cao, Jing Xiao, Ruqi Huang, Lu Fang:
ParseMVS: Learning Primitive-aware Surface Representations for Sparse Multi-view Stereopsis. 6113-6124 - Jiong Wang, Zhou Zhao, Fei Wu:
Set-Based Face Recognition Beyond Disentanglement: Burstiness Suppression With Variance Vocabulary. 6125-6135 - Jinkai Zheng, Xinchen Liu, Xiaoyan Gu, Yaoqi Sun, Chuang Gan, Jiyong Zhang, Wu Liu, Chenggang Yan:
Gait Recognition in the Wild with Multi-hop Temporal Switch. 6136-6145 - Zan Gao, Shenghao Chen, Yangyang Guo, Weili Guan, Jie Nie, Anan Liu:
Generic Image Manipulation Localization through the Lens of Multi-scale Spatial Inconsistence. 6146-6154 - Wenmiao Hu, Yichen Zhang, Yuxuan Liang, Yifang Yin, Andrei Georgescu, An Tran, Hannes Kruppa, See-Kiong Ng, Roger Zimmermann:
Beyond Geo-localization: Fine-grained Orientation of Street-view Images by Cross-view Matching with Satellite Imagery. 6155-6164 - Chen Qian, Hui Zhang:
Region-based Pixels Integration Mechanism for Weakly Supervised Semantic Segmentation. 6165-6173 - Zhongwei Qiu, Qiansheng Yang, Jian Wang, Dongmei Fu:
IVT: An End-to-End Instance-guided Video Transformer for 3D Pose Estimation. 6174-6182 - Rui Cao, Kaiyi Zhang, Yang Chen, Ximing Yang, Cheng Jin:
Point Cloud Completion via Multi-Scale Edge Convolution and Attention. 6183-6192 - Suiyi Zhao, Zhao Zhang, Richang Hong, Mingliang Xu, Haijun Zhang, Meng Wang, Shuicheng Yan:
CRNet: Unsupervised Color Retention Network for Blind Motion Deblurring. 6193-6201 - Yanyan Wei, Zhao Zhang, Huan Zheng, Richang Hong, Yi Yang, Meng Wang:
SGINet: Toward Sufficient Interaction Between Single Image Deraining and Semantic Segmentation. 6202-6210 - Jiahuan Ren, Zhao Zhang, Richang Hong, Mingliang Xu, Haijun Zhang, Mingbo Zhao, Meng Wang:
Robust Low-Rank Convolution Network for Image Denoising. 6211-6219 - Suiyi Zhao, Zhao Zhang, Richang Hong, Mingliang Xu, Yi Yang, Meng Wang:
FCL-GAN: A Lightweight and Real-Time Baseline for Unsupervised Blind Image Deblurring. 6220-6229 - Huabin Liu, Weixian Lv, John See, Weiyao Lin:
Task-adaptive Spatial-Temporal Video Sampler for Few-shot Action Recognition. 6230-6240 - Jiashuo Yu, Ying Cheng, Rui-Wei Zhao, Rui Feng, Yuejie Zhang:
MM-Pyramid: Multimodal Pyramid Attentional Network for Audio-Visual Event Localization and Video Parsing. 6241-6249 - Sindhu B. Hegde, K. R. Prajwal, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar:
Lip-to-Speech Synthesis for Arbitrary Speakers in the Wild. 6250-6258 - Chaofan Chen, Xiaoshan Yang, Ming Yan, Changsheng Xu:
Attribute-guided Dynamic Routing Graph Network for Transductive Few-shot Learning. 6259-6268 - Ye Liu, Lingfeng Qiao, Di Yin, Zhuoxuan Jiang, Xinghua Jiang, Deqiang Jiang, Bo Ren:
OS-MSL: One Stage Multimodal Sequential Link Framework for Scene Segmentation and Classification. 6269-6277 - Jiashuo Yu, Jinyu Liu, Ying Cheng, Rui Feng, Yuejie Zhang:
Modality-aware Contrastive Instance Learning with Self-Distillation for Weakly-Supervised Audio-Visual Violence Detection. 6278-6287 - Zhicai Wang, Yanbin Hao, Xingyu Gao, Hao Zhang, Shuo Wang, Tingting Mu, Xiangnan He:
Parameterization of Cross-token Relations with Relative Positional Encoding for Vision MLP. 6288-6299 - Jian-Jun Qiao, Zhi-Qi Cheng, Xiao Wu, Wei Li, Ji Zhang:
Real-time Semantic Segmentation with Parallel Multiple Views Feature Augmentation. 6300-6308 - Jie Huang, Man Zhou, Yajing Liu, Mingde Yao, Feng Zhao, Zhiwei Xiong:
Exposure-Consistency Representation Learning for Exposure Correction. 6309-6317 - Jiawei Zhan, Jun Liu, Wei Tang, Guannan Jiang, Xi Wang, Bin-Bin Gao, Tianliang Zhang, Wenlong Wu, Wei Zhang, Chengjie Wang, Yuan Xie:
Global Meets Local: Effective Multi-Label Image Classification via Category-Aware Weak Supervision. 6318-6326 - Qi He, Zhaoquan Yuan, Xiao Wu, Jun-Yan He:
Domain-Specific Conditional Jigsaw Adaptation for Enhancing Transferability and Discriminability. 6327-6336 - Guang Yu, Siqi Wang, Zhiping Cai, Xinwang Liu, Chengkun Wu:
Effective Video Abnormal Event Detection by Learning A Consistency-Aware High-Level Feature Extractor. 6337-6346 - Yiran Wang, Zhiyu Pan, Xingyi Li, Zhiguo Cao, Ke Xian, Jianming Zhang:
Less is More: Consistent Video Depth Estimation with Masked Frames Modeling. 6347-6358 - Huan Zheng, Zhao Zhang, Haijun Zhang, Yi Yang, Shuicheng Yan, Meng Wang:
Deep Multi-Resolution Mutual Learning for Image Inpainting. 6359-6367 - Linhai Zhuo, Yuqian Fu, Jingjing Chen, Yixin Cao, Yu-Gang Jiang:
TGDM: Target Guided Dynamic Mixup for Cross-Domain Few-Shot Learning. 6368-6376 - Zizheng Yang, Mingde Yao, Jie Huang, Man Zhou, Feng Zhao:
SIR-Former: Stereo Image Restoration Using Transformer. 6377-6385 - Zhengming Zhou, Qiulei Dong:
Learning Occlusion-aware Coarse-to-Fine Depth Map for Self-supervised Monocular Depth Estimation. 6386-6395 - Arghya Pal, Sailaja Rajanala, Raphael C.-W. Phan, KokSheik Wong:
Guess-It-Generator: Generating in a Lewis Signaling Framework through Logical Reasoning. 6396-6405 - Mengmeng Liu, Zhi Ma, Tao Li, Yanfeng Jiang, Kai Wang:
Long-Term Person Re-identification with Dramatic Appearance Change: Algorithm and Benchmark. 6406-6415 - Chuanming Wang, Huiyuan Fu, Huadong Ma:
PaCL: Part-level Contrastive Learning for Fine-grained Few-shot Image Classification. 6416-6424 - Gang Xu, Qibin Hou, Le Zhang, Ming-Ming Cheng:
FMNet: Frequency-Aware Modulation Network for SDR-to-HDR Translation. 6425-6435 - Ji Zhang, Zhi-Qi Cheng, Xiao Wu, Wei Li, Jian-Jun Qiao:
CrossNet: Boosting Crowd Counting with Localization. 6436-6444 - Chen Wang, Xian Wu, Yuan-Chen Guo, Song-Hai Zhang, Yu-Wing Tai, Shi-Min Hu:
NeRF-SR: High Quality Neural Radiance Fields using Supersampling. 6445-6454 - Xinpeng Li, Xiaojiang Peng:
Rail Detection: An Efficient Row-based Network and a New Benchmark. 6455-6463 - Yanyan Wei, Zhao Zhang, Mingliang Xu, Richang Hong, Jicong Fan, Shuicheng Yan:
Robust Attention Deraining Network for Synchronous Rain Streaks and Raindrops Removal. 6464-6472 - Weihong Lin, Zheng Sun, Chixiang Ma, Mingze Li, Jiawei Wang, Lei Sun, Qiang Huo:
TSRFormer: Table Structure Recognition with Transformers. 6473-6482 - Jinghao Zhang, Jie Huang, Mingde Yao, Man Zhou, Feng Zhao:
Structure- and Texture-Aware Learning for Low-Light Image Enhancement. 6483-6492 - Fengyi Zhang, Hui Zeng, Tianjun Zhang, Lin Zhang:
CLUT-Net: Learning Adaptively Compressed Representations of 3DLUTs for Lightweight Image Enhancement. 6493-6501 - Pedro Ramoneda, Dasaem Jeong, Eita Nakamura, Xavier Serra, Marius Miron:
Automatic Piano Fingering from Partially Annotated Scores using Autoregressive Neural Networks. 6502-6510 - Sindhu B. Hegde, Rudrabha Mukhopadhyay, Vinay P. Namboodiri, C. V. Jawahar:
Extreme-scale Talking-Face Video Upsampling with Audio-Visual Priors. 6511-6520 - Naishan Zheng, Jie Huang, Qi Zhu, Man Zhou, Feng Zhao, Zheng-Jun Zha:
Enhancement by Your Aesthetic: An Intelligible Unsupervised Personalized Enhancer for Low-Light Images. 6521-6529 - Han Ling, Quansen Sun, Zhenwen Ren, Yazhou Liu, Hongyuan Wang, Zichen Wang:
Scale-flow: Estimating 3D Motion from Video. 6530-6538 - Danna Xue, Fei Yang, Pei Wang, Luis Herranz, Jinqiu Sun, Yu Zhu, Yanning Zhang:
SlimSeg: Slimmable Semantic Segmentation with Boundary Supervision. 6539-6548 - Huiyu Duan, Wei Shen, Xiongkuo Min, Danyang Tu, Jing Li, Guangtao Zhai:
Saliency in Augmented Reality. 6549-6558 - Ye Deng, Siqi Hui, Sanping Zhou, Deyu Meng, Jinjun Wang:
T-former: An Efficient Transformer for Image Inpainting. 6559-6568 - Hao Liu, Bin Chen, Bo Wang, Chunpeng Wu, Feng Dai, Peng Wu:
Cycle Self-Training for Semi-Supervised Object Detection with Distribution Consistency Reweighting. 6569-6578 - Jiahui Zhang, Fangneng Zhan, Rongliang Wu, Yingchen Yu, Wenqing Zhang, Bai Song, Xiaoqin Zhang, Shijian Lu:
VMRF: View Matching Neural Radiance Fields. 6579-6587 - Yuqian Fu, Yu Xie, Yanwei Fu, Jingjing Chen, Yu-Gang Jiang:
ME-D2N: Multi-Expert Domain Decompositional Network for Cross-Domain Few-Shot Learning. 6609-6617 - Xiao Wang, Zheng Wang, Wu Liu, Xin Xu, Qijun Zhao, Shin'ichi Satoh:
Towards Causality Inference for Very Important Person Localization. 6618-6626 - Keyang Cheng, Yu Si, Hao Zhou, Rabia Tahir:
MMDV: Interpreting DNNs via Building Evaluation Metrics, Manual Manipulation and Decision Visualization. 6627-6635 - Chengjie Ge, Xueyang Fu, Zheng-Jun Zha:
Learning Dual Convolutional Dictionaries for Image De-raining. 6636-6644 - Hu Yu, Jie Huang, Yajing Liu, Qi Zhu, Man Zhou, Feng Zhao:
Source-Free Domain Adaptation for Real-World Image Dehazing. 6645-6654 - Xiangyu Miao, Shangfei Wang:
Knowledge Guided Representation Disentanglement for Face Recognition from Low Illumination Images. 6655-6663 - Tao Zhou, Wenhan Luo, Zhiguo Shi, Jiming Chen, Qi Ye:
APPTracker: Improving Tracking Multiple Objects in Low-Frame-Rate Videos. 6664-6674 - Jiaxu Leng, Jia Wang, Xinbo Gao, Bo Hu, Ji Gan, Chenqiang Gao:
ICNet: Joint Alignment and Reconstruction via Iterative Collaboration for Video Super-Resolution. 6675-6684 - Junshan Hu, Chaoxu Guo, Liansheng Zhuang, Biao Wang, Tiezheng Ge, Yuning Jiang, Houqiang Li:
Estimation of Reliable Proposal Quality for Temporal Action Detection. 6685-6695 - Zenggui Chen, Zhouhui Lian:
Semi-supervised Semantic Segmentation via Prototypical Contrastive Learning. 6696-6705 - Chiawei Kuo, Yi-Ting Tsai, Hong-Han Shuai, Yi-Ren Yeh, Ching-Chun Huang:
Towards Understanding Cross Resolution Feature Matching for Surveillance Face Recognition. 6706-6716 - Yurui Zhu, Xueyang Fu, Chengzhi Cao, Xi Wang, Qibin Sun, Zheng-Jun Zha:
Single Image Shadow Detection via Complementary Mechanism. 6717-6726 - Qiqi Bao, Rui Zhu, Bowen Gang, Pengyang Zhao, Wenming Yang, Qingmin Liao:
Distilling Resolution-robust Identity Knowledge for Texture-Enhanced Face Hallucination. 6727-6736 - Jia Wang, Tianhao Lan, Jie Chen, Chengwen Luo, Chao Wu, Jianqiang Li:
Phoneme-Aware Adaptation with Discrepancy Minimization and Dynamically-Classified Vector for Text-independent Speaker Verification. 6737-6745 - Jiaxu Leng, Mingpi Tan, Xinbo Gao, Wen Lu, Zongyi Xu:
Anomaly Warning: Learning and Memorizing Future Semantic Patterns for Unsupervised Ex-ante Potential Anomaly Prediction. 6746-6754 - Yuxi Mi, Yuge Huang, Jiazhen Ji, Hongquan Liu, Xingkun Xu, Shouhong Ding, Shuigeng Zhou:
DuetFace: Collaborative Privacy-Preserving Face Recognition via Channel Splitting in the Frequency Domain. 6755-6764 - Youze Xue, Jiansheng Chen, Yudong Zhang, Cheng Yu, Huimin Ma, Hongbing Ma:
3D Human Mesh Reconstruction by Learning to Sample Joint Adaptive Tokens for Transformers. 6765-6773 - Yanling Tian, Di Chen, Yunan Liu, Shanshan Zhang, Jian Yang:
Grouped Adaptive Loss Weighting for Person Search. 6774-6782 - Weilai Xiang, Hongyu Yang, Di Huang, Yunhong Wang:
Multi-view Gait Video Synthesis. 6783-6791 - Yuwei Zhou, Xin Wang, Hong Chen, Xuguang Duan, Chaoyu Guan, Wenwu Zhu:
Curriculum-NAS: Curriculum Weight-Sharing Neural Architecture Search. 6792-6801 - Ya-Nan Zhang, Linlin Shen, Qiufu Li:
Content and Gradient Model-driven Deep Network for Single Image Reflection Removal. 6802-6812 - Haoru Zhao, Zhaorui Gu, Bing Zheng, Haiyong Zheng:
TransCNN-HAE: Transformer-CNN Hybrid AutoEncoder for Blind Image Inpainting. 6813-6821 - Tangwen Qian, Yongjun Xu, Zhao Zhang, Fei Wang:
Trajectory Prediction from Hierarchical Perspective. 6822-6830 - Zhiyuan Zhao, Qingjie Liu, Yunhong Wang:
Exploring Effective Knowledge Transfer for Few-shot Object Detection. 6831-6839 - Xinhua Cheng, Mengxi Jia, Qian Wang, Jian Zhang:
More is better: Multi-source Dynamic Parsing Attention for Occluded Person Re-identification. 6840-6849 - Gyumin Shim, Minsoo Lee, Jaegul Choo:
ReFu: Refine and Fuse the Unobserved View for Detail-Preserving Single-Image 3D Human Reconstruction. 6850-6859 - Mingii Choi, Sangyeong Lee, Heesun Jung, Jong-Uk Hou:
Transformers in Spectral Domain for Estimating Image Geometric Transformation. 6860-6867
Brave New Ideas Session
- Renrui Zhang, Ziyao Zeng, Ziyu Guo, Yafeng Li:
Can Language Understand Depth? 6868-6874 - Yongkang Wong, Shaojing Fan, Yangyang Guo, Ziwei Xu, Karen Stephen, Rishabh Sheoran, Anusha Bhamidipati, Vivek Barsopia, Jianquan Liu, Mohan S. Kankanhalli:
Compute to Tell the Tale: Goal-Driven Narrative Generation. 6875-6882 - Jitao Sang, Xian Zhao, Jiaming Zhang, Zhiyu Lin:
Benign Adversarial Attack: Tricking Models for Goodness. 6883-6889 - Kurtis Haut, Caleb Wohn, Victor Antony, Aidan Goldfarb, Melissa Welsh, Dillanie Sumanthiran, Md. Rafayet Ali, Ehsan Hoque:
Demographic Feature Isolation for Bias Research using Deepfakes. 6890-6897 - Yoko Yamakata, Akihisa Ishino, Akiko Sunto, Sosuke Amano, Kiyoharu Aizawa:
Recipe-oriented Food Logging for Nutritional Management. 6898-6904
Doctoral Consortium
- Vignesh V. Menon:
Video Coding Enhancements for HTTP Adaptive Streaming. 6905-6909 - Xiaoyu Lin:
Unsupervised Multi-object Tracking via Dynamical VAE and Variational Inference. 6910-6914 - Igor Morawski:
Enabling Effective Low-Light Perception using Ubiquitous Low-Cost Visible-Light Cameras. 6915-6919 - Manuel Silva:
Interaction with Immersive Cultural Heritage Environments: Using XR Technologies to Represent Multiple Perspectives on Serralves Museum. 6920-6924 - Maurits J. R. Bleeker:
Multi-modal Learning Algorithms and Network Architectures for Information Extraction and Retrieval. 6925-6929 - Travis Seng:
Enriching Existing Educational Video Datasets to Improve Slide Classification and Analysis. 6930-6934 - Diogo Tavares:
Zero-shot Generalization of Multimodal Dialogue Agents. 6935-6939 - Diogo Silva:
The First Impression: Understanding the Impact of Multimodal System Responses on User Behavior in Task-oriented Agents. 6940-6943
Technical Demonstrators
- Wei Xu, Bowen Tian, Lijie Luo, Weiming Yang, Xianke Wang, Lei Wu:
SingMaster: A Sight-singing Evaluation System of "Shoot and Sing" Based on Smartphone. 6944-6946 - Kele Xu, Ming Feng, Weiquan Huang:
Seeing Speech: Magnetic Resonance Imaging-Based Vocal Tract Deformation Visualization Using Cross-Modal Transformer. 6947-6949 - Antonio Origlia, Martina Di Bratto, Maria Di Maro, Sabrina Mennella:
Developing Embodied Conversational Agents in the Unreal Engine: The FANTASIA Plugin. 6950-6951 - Yuanfeng Song, Rongzhong Lian, Yixin Chen, Di Jiang, Xuefang Zhao, Conghui Tan, Qian Xu, Raymond Chi-Wing Wong:
A Platform for Deploying the TFE Ecosystem of Automatic Speech Recognition. 6952-6954 - Ignacio Reimat, Yanni Mei, Evangelos Alexiou, Jack Jansen, Jie Li, Shishir Subramanyam, Irene Viola, Johan Oomen, Pablo César:
Mediascape XR: A Cultural Heritage Experience in Social VR. 6955-6957 - Ziyi Wang, Xingqi Wang, Zeyu Jin, Xiaohan Li, Shikun Sun, Jia Jia:
AI Carpet: Automatic Generation of Aesthetic Carpet Pattern. 6958-6960 - Yuki Tajima, Shota Okubo, Tomoaki Konno, Toshiharu Horiuchi, Tatsuya Kobayashi:
Sync Sofa: Sofa-type Side-by-side Communication Experience Based on Multimodal Expression. 6961-6963 - Xin Jin, Shu Zhao, Le Zhang, Xin Zhao, Qiang Deng, Chaoen Xiao:
Attribute Controllable Beautiful Caucasian Face Generation by Aesthetics Driven Reinforcement Learning. 6964-6966 - Giuseppe Becchi, Andrea Ferracani, Filippo Principi, Alberto Del Bimbo:
An AI Powered Re-Identification System for Real-time Contextual Multimedia Applications. 6967-6969 - Zhilong Zhou, Shiyao Wang, Tiezheng Ge, Yuning Jiang:
A High-resolution Image-based Virtual Try-on System in Taobao E-commerce Scenario. 6970-6972 - Wei Duan, Zhe Zhang, Yi Yu, Keizo Oyama:
Interpretable Melody Generation from Lyrics with Discrete-Valued Adversarial Training. 6973-6975 - Chuanhang Yan, Yu Sun, Qian Bao, Jinhui Pang, Wu Liu, Tao Mei:
WOC: A Handy Webcam-based 3D Online Chatroom. 6976-6978 - Pin-Xuan Liu, Tse-Yu Pan, Hsin-Shih Lin, Hung-Kuo Chu, Min-Chun Hu:
BetterSight: Immersive Vision Training for Basketball Players. 6979-6981 - Florent Geniet, Valérie Gouet-Brunet, Mathieu Brédif:
ALEGORIA: Joint Multimodal Search and Spatial Navigation into the Geographic Iconographic Heritage. 6982-6984 - Lorenzo Agnolucci, Leonardo Galteri, Marco Bertini, Alberto Del Bimbo:
Restoration of Analog Videos Using Swin-UNet. 6985-6987 - Shing Ming Wong, Chien-Wen Chen, Tse-Yu Pan, Hung-Kuo Chu, Min-Chun Hu:
GetWild: A VR Editing System with AI-Generated 3D Object and Terrain. 6988-6990 - Ting-Yang Kao, Tse-Yu Pan, Chen-Ni Chen, Tsung-Hsun Tsai, Hung-Kuo Chu, Min-Chun Hu:
ScoreActuary: Hoop-Centric Trajectory-Aware Network for Fine-Grained Basketball Shot Analysis. 6991-6993 - Tiago Fornelos, Pedro Valente, Rafael Ferreira, Diogo Tavares, Diogo Silva, David Semedo, João Magalhães, Nuno Correia:
A Conversational Shopping Assistant for Online Virtual Stores. 6994-6996 - Rafael Ferreira, Diogo Silva, Diogo Tavares, Frederico Vicente, Mariana Bonito, Gustavo Gonçalves, Rui Margarido, Paula Figueiredo, Helder Rodrigues, David Semedo, João Magalhães:
TWIZ: The Multimodal Conversational Task Wizard. 6997-6999 - Maria Giovanna Donadio, Filippo Principi, Andrea Ferracani, Marco Bertini, Alberto Del Bimbo:
Engaging Museum Visitors with Gamification of Body and Facial Expressions. 7000-7002
Grand Challenges
- Lutharsanen Kunam, Luca Rossetto, Abraham Bernstein:
A Multi-Stream Approach for Video Understanding. 7003-7007 - Weilong Chen, Chenghao Huang, Weimin Yuan, Xiaolu Chen, Wenhao Hu, Xinran Zhang, Yanru Zhang:
Title-and-Tag Contrastive Vision-and-Language Transformer for Social Media Popularity Prediction. 7008-7012 - Meng Liu, Shuyan Zhai, Yongqiang Li, Weili Guan, Liqiang Nie:
A Baseline for ViCo Conversational Head Generation Challenge. 7013-7015 - Chuin Hong Yap, Moi Hoon Yap, Adrian K. Davison, Connah Kendrick, Jingting Li, Su-Jing Wang, Ryan Cunningham:
3D-CNN for Facial Micro- and Macro-expression Spotting on Long Video Sequences using Temporal Oriented Reference Frame. 7016-7020 - Chao Zhou, Yixuan Ban, Yangchao Zhao, Liang Guo, Bing Yu:
PDAS: Probability-Driven Adaptive Streaming for Short Video. 7021-7025 - Tamás Grósz, Dejan Porjazovski, Yaroslav Getman, Sudarsana Reddy Kadiri, Mikko Kurimo:
Wav2vec2-based Paralinguistic Systems to Recognise Vocalised Emotions and Stuttering. 7026-7029 - Si-Ze Qian, Yuhong Xie, Zipeng Pan, Yuan Zhang, Tao Lin:
DAM: Deep Reinforcement Learning based Preload Algorithm with Action Masking for Short Video Streaming. 7030-7034 - Ricong Huang, Weizhi Zhong, Guanbin Li:
Audio-driven Talking Head Generation with Transformer and 3D Morphable Model. 7035-7039 - Siyang Sun, Xiong Xiong, Yun Zheng:
Two stage Multi-Modal Modeling for Video Interaction Analysis in Deep Video Understanding Challenge. 7040-7044 - Jianmin Wu, Liming Zhao, Dangwei Li, Chen-Wei Xie, Siyang Sun, Yun Zheng:
Deeply Exploit Visual and Language Information for Social Media Popularity Prediction. 7045-7049 - Ailin Huang, Zhewei Huang, Shuchang Zhou:
Perceptual Conversational Head Generation with Regularized Driver and Enhanced Renderer. 7050-7054 - Chen-Wei Xie, Siyang Sun, Liming Zhao, Jianmin Wu, Dangwei Li, Yun Zheng:
Deep Video Understanding with a Unified Multi-Modal Retrieval Framework. 7055-7059 - Kang You, Kele Xu, Boqing Zhu, Ming Feng, Dawei Feng, Bo Liu, Tian Gao, Bo Ding:
Masked Modeling-based Audio Representation for ACM Multimedia 2022 Computational Paralinguistics ChallengE. 7060-7064 - Wei Zhao, Peng Xiao, Rongju Zhang, Yijun Wang, Jianxin Lin:
Semantic-aware Responsive Listener Head Synthesis. 7065-7069 - Yingwei Pan, Yehao Li, Jianjie Luo, Jun Xu, Ting Yao, Tao Mei:
Auto-captions on GIF: A Large-scale Video-sentence Dataset for Vision-language Pre-training. 7070-7074 - Keith Curtis, George Awad, Shahzad Rajput, Ian Soboroff:
The ACM Multimedia 2022 Deep Video Understanding Grand Challenge. 7075-7078 - Tian Lv, Yu-Hui Wen, Zhiyao Sun, Zipeng Ye, Yong-Jin Liu:
Generating Smooth and Facial-Details-Enhanced Talking Head Video: A Perspective of Pre and Post Processes. 7079-7083 - Xutong Zuo, Yishu Li, Mohan Xu, Wei Tsang Ooi, Jiangchuan Liu, Junchen Jiang, Xinggong Zhang, Kai Zheng, Yong Cui:
Bandwidth-Efficient Multi-video Prefetching for Short Video Streaming. 7084-7088 - Xiaochen Cai, Hengxing Cai, Boqing Zhu, Kele Xu, Weiwei Tu, Dawei Feng:
Multiple Temporal Fusion based Weakly-supervised Pre-training Techniques for Video Categorization. 7089-7093 - Sean Campos, Devesh Khandelwal, Shwetha C. Nagaraj, Fred Nugen, Alberto Todeschini:
Deep Learning-Based Acoustic Mosquito Detection in Noisy Conditions Using Trainable Kernels and Augmentations. 7094-7098 - Fuyan Ma, Ziyu Ma, Bin Sun, Shutao Li:
TA-CNN: A Unified Network for Human Behavior Analysis in Multi-Person Conversations. 7099-7103 - Shakeel A. Sheikh, Md. Sahidullah, Slim Ouni, Fabrice Hirsch:
End-to-End and Self-Supervised Learning for ComParE 2022 Stuttering Sub-Challenge. 7104-7108 - Philipp Müller, Michael Dietz, Dominik Schiller, Dominike Thomas, Hali Lindsay, Patrick Gebhard, Elisabeth André, Andreas Bulling:
MultiMediate'22: Backchannel Detection and Agreement Estimation in Group Interactions. 7109-7114 - Ximing Wu, Lei Zhang, Laizhong Cui:
QoE-aware Download Control and Bitrate Adaptation for Short Video Streaming. 7115-7119 - Björn W. Schuller, Anton Batliner, Shahin Amiriparian, Christian Bergler, Maurice Gerczuk, Natalie Holz, Pauline Larrouy-Maestri, Sebastian P. Bayerl, Korbinian Riedhammer, Adria Mallol-Ragolta, Maria Pateraki, Harry Coppock, Ivan Kiskin, Marianne Sinka, Stephen J. Roberts:
The ACM Multimedia 2022 Computational Paralinguistics Challenge: Vocalisations, Stuttering, Activity, & Mosquitoes. 7120-7124 - Xinqi Fan, Ali Raza Shahid, Hong Yan:
Adaptive Dual Motion Model for Facial Micro-Expression Generation. 7125-7129 - Chih-Chung Hsu, Pi-Ju Tsai, Ting-Chun Yeh, Xiu-Yu Hou:
A Comprehensive Study of Spatiotemporal Feature Learning for Social Media Popularity Prediction. 7130-7134 - Moreno La Quatra, Lorenzo Vaiani, Alkis Koudounas, Luca Cagliero, Paolo Garza, Elena Baralis:
How Much Attention Should we Pay to Mosquitoes? 7135-7139 - Tuan-Vinh La, Minh-Son Dao, Quang-Tien Tran, Thanh-Phuc Tran, Anh-Duy Tran, Duc-Tien Dang-Nguyen:
A Combination of Visual-Semantic Reasoning and Text Entailment-based Boosting Algorithm for Cheapfake Detection. 7140-7144 - Quang-Tien Tran, Thanh-Phuc Tran, Minh-Son Dao, Tuan-Vinh La, Anh-Duy Tran, Duc-Tien Dang-Nguyen:
A Textual-Visual-Entailment-based Unsupervised Algorithm for Cheapfake Detection. 7145-7149 - Sirui Zhao, Shukang Yin, Huaying Tang, Rijin Jin, Yifan Xu, Tong Xu, Enhong Chen:
Fine-grained Micro-Expression Generation based on Thin-Plate Spline and Relative AU Constraint. 7150-7154 - Gulshan Sharma, Abhinav Dhall, Ramanathan Subramanian:
A Transformer Based Approach for Activity Detection. 7155-7159 - Wenhao Leng, Sirui Zhao, Yiming Zhang, Shifeng Liu, Xinglong Mao, Hao Wang, Tong Xu, Enhong Chen:
ABPN: Apex and Boundary Perception Network for Micro- and Macro-Expression Spotting. 7160-7164 - Beibei Zhang, Yaqun Fang, Tongwei Ren, Gangshan Wu:
Multimodal Analysis for Deep Video Understanding with Video Language Transformer. 7165-7169 - Jingting Li, Moi Hoon Yap, Wen-Huang Cheng, John See, Xiaopeng Hong, Xiaobai Li, Su-Jing Wang, Adrian K. Davison, Yante Li, Zizhao Dong:
MEGC2022: ACM Multimedia 2022 Micro-Expression Grand Challenge. 7170-7174 - Yuan Zhao, Xin Tong, Zichong Zhu, Jianda Sheng, Lei Dai, Lingling Xu, Xuehai Xia, Yu Jiang, Jiao Li:
Rethinking Optical Flow Methods for Micro-Expression Spotting. 7175-7179 - Muhannad Alkaddour, Abhinav Dhall, Usman Tariq, Hasan Al-Nashash, Fares Al-Shargie:
Sentiment-aware Classifier for Out-of-Context Caption Detection. 7180-7184 - Penggang Qin, Jiarui Yu, Yan Gao, Derong Xu, Yunkai Chen, Shiwei Wu, Tong Xu, Enhong Chen, Yanbin Hao:
Unified QA-aware Knowledge Graph Generation Based on Multi-modal Modeling. 7185-7189 - Garima Sharma, Kalin Stefanov, Abhinav Dhall, Jianfei Cai:
Graph-based Group Modelling for Backchannel Detection. 7190-7194 - Claude Montacié, Marie-José Caraty, Nikola Lackovic:
Audio Features from the Wav2Vec 2.0 Embeddings for the ACM Multimedia 2022 Stuttering Challenge. 7195-7199 - Yunpeng Tan, Fangyu Liu, Bowei Li, Zheng Zhang, Bo Zhang:
An Efficient Multi-View Multimodal Data Processing Framework for Social Media Popularity Prediction. 7200-7204 - Jun Yu, Zhongpeng Cai, Zepeng Liu, Guochen Xie, Peng He:
Facial Expression Spotting Based on Optical Flow Features. 7205-7209 - Jun Yu, Guochen Xie, Zhongpeng Cai, Peng He, Fang Gao, Qiang Ling:
Micro Expression Generation with Thin-plate Spline Motion Model and Face Parsing. 7210-7214 - Raksha Ramesh, Vishal Anand, Zifan Chen, Yifei Dong, Yun Chen, Ching-Yung Lin:
Leveraging Text Representation and Face-head Tracking for Long-form Multimodal Semantic Relation Understanding. 7215-7219 - Miriam Redi, Georges Quénot:
Overview of the Multimedia Grand Challenges 2022. 7220-7222
Interactive Arts
- Manuel Silva, Luana Santos, Luís Teixeira, José Vasco Carvalho:
All is Noise: In Search of Enlightenment, a VR Experience. 7223-7224 - Johnny DiBlasi, Carlos Castellanos, Bello Bello:
Beauty: Machine Microbial Interface as Artistic Experimentation. 7225-7226 - Xinrui Wang, Yulu Song, Xiaohui Wang:
Being's Spread: Mirror of Life Interconnection. 7227-7228 - Tiago Rorke:
CAPTCHA the Flag: Interactive Plotter Livestream. 7229-7230 - Bo Shui, Xiaohui Wang:
Cellular Trending: Fragmented Information Dissemination on Social Media Through Generative Lens. 7231-7232 - Sofia Hinckel Dias, Sara Rodrigues Silva, Beatriz Rodrigues Silva, Rui Nóbrega:
Collaboration Superpowers: The Process of Crafting an Interactive Storytelling Animation. 7233-7234 - Varvara Guljajeva, Mar Canet Sola:
Dream Painter: An Interactive Art Installation Bridging Audience Interaction, Robotics, and Creative AI. 7235-7236 - Jorge Forero, Gilberto Bernardes, Mónica Mendes:
Emotional Machines: Toward Affective Virtual Environments. 7237-7238 - Jiaxiang You, Yinyu Chen, Xiaohui Wang:
Fragrance In Sight: Personalized Perfume Production Based on Style Recognition. 7239-7240 - Ze Gao, Anqi Wang, Pan Hui, Tristan Braud:
Meditation in Motion: Interactive Media Art Visualization Based on Ancient Tai Chi Chuan. 7241-7242 - Hugo Pauget Ballesteros, Gilles Azzaro, Jean Mélou, Yvain Quéau, Jean-Denis Durou:
Read Your Voice: A Playful Interactive Sound Encoder/Decoder. 7243-7244 - Tai-Chen Tsai, Tse-Yu Pan, Min-Chun Hu, Ya-Lun Tao:
StimulusLoop: Game-Actuated Mutuality Artwork for Evoking Affective State. 7245-7247 - Emily Graber, Charles Picasso, Elaine Chew:
Viva Contemporary! Mobile Music Laboratory. 7248-7249 - Yuqian Sun, Chenhang Cheng, Ying Xu, Yihua Li, Chang Hee Lee, Ali Asadipour:
Wander: An AI-driven Chatbot to Visit the Future Earth. 7250-7251
Industry Session
- Zhenyu Zhang, Bowen Yu, Haiyang Yu, Tingwen Liu, Cheng Fu, Jingyang Li, Chengguang Tang, Jian Sun, Yongbin Li:
Layout-Aware Information Extraction for Document-Grounded Dialogue: Dataset, Method and Demonstration. 7252-7260 - Shiyao Wang, Qi Liu, Yicheng Zhong, Zhilong Zhou, Tiezheng Ge, Defu Lian, Yuning Jiang:
CreaGAN: An Automatic Creative Generation Framework for Display Advertising. 7261-7269 - Qinghui Sun, Jie Gu, Xiaoxiao Xu, Renjun Xu, Ke Liu, Bei Yang, Hong Liu, Huan Xu:
Learning Interest-oriented Universal User Representation via Self-supervision. 7270-7278 - Ruicheng Liu, Jialing Liang, Peiquan Jin, Yi Wang:
MMH-index: Enhancing Apache Lucene with High-Performance Multi-Modal Indexing and Searching. 7279-7289 - Qi Yang, Sergey I. Nikolenko, Alfred Huang, Aleksandr Farseev:
Personality-Driven Social Multimedia Content Recommendation. 7290-7299 - Junwu Zhang, Mang Ye, Yao Yang:
Learnable Privacy-Preserving Anonymization for Pedestrian Images. 7300-7308 - Wenke Huang, Mang Ye, Bo Du, Xiang Gao:
Few-Shot Model Agnostic Federated Learning. 7309-7316 - He Li, Mang Ye, Cong Wang, Bo Du:
Pyramidal Transformer with Conv-Patchify for Person Re-identification. 7317-7326
Open Source Session
- Sachin Mehta, Farzad Abdolhosseini, Mohammad Rastegari:
CVNets: High Performance Library for Computer Vision. 7327-7330 - Yue Zhou, Xue Yang, Gefan Zhang, Jiabao Wang, Yanyi Liu, Liping Hou, Xue Jiang, Xingzhao Liu, Junchi Yan, Chengqi Lyu, Wenwei Zhang, Kai Chen:
MMRotate: A Rotated Object Detection Benchmark using PyTorch. 7331-7334 - Stéphane Massonnet, Marco Romanelli, Rémi Lebret, Niels Poulsen, Karl Aberer:
MoZuMa: A Model Zoo for Multimedia Applications. 7335-7338 - Wei Gao, Hang Yuan, Yang Guo, Lvfang Tao, Zhanyuan Cai, Ge Li:
OpenHardwareVC: An Open Source Library for 8K UHD Video Coding Hardware Implementation. 7339-7342 - Abdelhak Bentaleb, Zhengdao Zhan, Farzad Tashtarian, May Lim, Saad Harous, Christian Timmerer, Hermann Hellwagner, Roger Zimmermann:
Low Latency Live Streaming Implementation in DASH and HLS. 7343-7346 - Wei Gao, Hua Ye, Ge Li, Huiming Zheng, Yuyang Wu, Liang Xie:
OpenPointCloud: An Open-Source Algorithm Library of Deep Learning Based Point Cloud Compression. 7347-7350 - Haodong Duan, Jiaqi Wang, Kai Chen, Dahua Lin:
PYSKL: Towards Good Practices for Skeleton Action Recognition. 7351-7354 - Liang Qiao, Hui Jiang, Ying Chen, Can Li, Pengfei Li, Zaisheng Li, Baorui Zou, Dashan Guo, Yingda Xu, Yunlu Xu, Zhanzhan Cheng, Yi Niu:
DavarOCR: A Toolbox for OCR and Multi-Modal Document Understanding. 7355-7358 - Yuwei Zhou, Hong Chen, Zirui Pan, Chuanhao Yan, Fanqi Lin, Xin Wang, Wenwu Zhu:
CurML: A Curriculum Machine Learning Library. 7359-7363
Reproducibility Session
- Xin Jin, Ke Liu, Dongqing Zou, Zhonglan Li, Heng Huang, Vajira Thambawita:
Reproducibility Companion Paper: Focusing on Persons: Colorizing Old Images Learning from Modern Historical Movies. 7364-7367
Tutorial Overviews
- Fernando Pereira:
Deep Learning-based Point Cloud Coding for Immersive Experiences. 7368-7370 - Yiannis Andreopoulos, Cosmin Stejerean:
Advances in Quality Assessment Of Video Streaming Systems: Algorithms, Methods, Tools. 7371 - Zheng Wang, Dan Xu, Zhedong Zheng, Kui Jiang:
Multimedia Content Understanding in Harsh Environments. 7372-7373 - Ioannis Pitas, Ioannis Mademlis:
Autonomous UAV Cinematography. 7374-7376 - Xin Wang, Xiaohan Lan, Wenwu Zhu:
Video Grounding and Its Generalization. 7377-7379 - Federico Becattini, Tiberio Uricchio:
Memory Networks. 7380-7382 - Jakub Lokoc, Klaus Schoeffmann, Werner Bailer, Luca Rossetto, Björn Þór Jónsson:
Open Challenges of Interactive Video Search and Evaluation. 7383-7385
Workshop Overviews
- Hideo Saito, Thomas B. Moeslund, Rainer Lienhart:
MMSports'22: 5th International ACM Workshop on Multimedia Content Analysis in Sports. 7386-7388 - Shahin Amiriparian, Lukas Christ, Andreas König, Eva-Maria Meßner, Alan Cowen, Erik Cambria, Björn W. Schuller:
MuSe 2022 Challenge: Multimodal Humour, Emotional Reactions, and Stress. 7389-7391 - Wei Gao, Ge Li, Hui Yuan, Raouf Hamzaoui, Zhu Li, Shan Liu:
APCCPA '22: 1st International Workshop on Advances in Point Cloud Compression, Processing and Analysis. 7392-7393 - Xavier Alameda-Pineda, Qin Jin, Vincent Oria, Laura Toni:
M4MM '22: 1st International Workshop on Methodologies for Multimedia. 7394-7396 - Jingting Li, Moi Hoon Yap, Wen-Huang Cheng, John See, Xiaopeng Hong, Xiaobai Li, Su-Jing Wang:
FME '22: 2nd Workshop on Facial Micro-Expression: Advanced Techniques for Multi-Modal Facial Expression Analysis. 7397-7399 - Mohan S. Kankanhalli, Jianquan Liu, Yongkang Wong, Karen Stephen, Rishabh Sheoran, Anusha Bhamidipati:
NarSUM '22: 1st Workshop on User-centric Narrative Summarization of Long Videos. 7400-7401 - Yoko Yamakata, Atsushi Hashimoto, Jingjing Chen:
CEA++'22: 1st International Workshop on Multimedia for Cooking, Eating, and related APPlications. 7402-7404 - Jianhua Tao, Jiangyan Yi, Cunhang Fan, Ruibo Fu, Shan Liang, Pengyuan Zhang, Haizhou Li, Helen Meng, Dong Yu, Masato Akagi:
DDAM '22: 1st International Workshop on Deepfake Detection for Audio Multimedia. 7405-7406 - Dingwen Zhang, Chaowei Fang, Wu Liu, Xinchen Liu, Jingkuan Song, Hongyuan Zhu, Wenbing Huang, John Smith:
HCMA'22: 3rd International Workshop on Human-Centric Multimedia Analysis. 7407-7409 - Luca Rossetto, Werner Bailer, Jakub Lokoc, Klaus Schoeffmann:
IMuR 2022: Introduction to the 2nd Workshop on Interactive Multimedia Retrieval. 7410-7411 - Irene Viola, Hadi Amirpour, Maria Torres Vega:
IXR '22: 1st Workshop on Interactive eXtended Reality. 7412-7413 - Stavroula G. Mougiakakou, Giovanni Maria Farinella, Keiji Yanai, Dario Allegra:
MADiMa'22: 7th International Workshop on Multimedia Assisted Dietary Management. 7414-7415 - Xuemeng Song, Jingjing Chen, Federico Becattini, Weili Guan, Yibing Zhan, Tat-Seng Chua:
MCFR'22: 1st Workshop on Multimedia Computing towards Fashion Recommendation. 7416-7417 - Si Liu, Qin Jin, Luoqi Liu, Zongheng Tang, Linli Lin:
PIC'22: 4th Person in Context Workshop. 7418-7419 - Ravi Prakash, Mylène C. Q. Farias, Marcelo M. Carvalho, Ryan P. McMahan:
PIES-ME '22: 1st Workshop on Photorealistic Image and Environment Synthesis for Multimedia Experiments. 7420-7422 - Jing Li, Patrick Le Callet, Xinbo Gao, Zhi Li, Wen Lu, Jiachen Yang, Junle Wang:
QoEVMA'22: 2nd Workshop on Quality of Experience (QoE) in Visual Multimedia Applications. 7423-7425 - Valérie Gouet-Brunet, Ronak Kosti, Li Weng:
SUMAC '22: 4th ACM International workshop on Structuring and Understanding of Multimedia heritAge Contents. 7426-7427 - Liang Liao, Dan Xu, Yang Wu, Xiao Wang, Jing Xiao:
UoLMM'22: 2nd International Workshop on Robust Understanding of Low-quality Multimedia Data: Unitive Enhancement, Analysis and Evaluation. 7428-7430