ICASSP 2023: Rhodes Island, Greece
- IEEE International Conference on Acoustics, Speech and Signal Processing ICASSP 2023, Rhodes Island, Greece, June 4-10, 2023. IEEE 2023, ISBN 978-1-7281-6327-7
- Tongzhou Chen, Cyril Allauzen, Yinghui Huang, Daniel S. Park, David Rybach, W. Ronny Huang, Rodrigo Cabrera, Kartik Audhkhasi, Bhuvana Ramabhadran, Pedro J. Moreno, Michael Riley:
Large-Scale Language Model Rescoring on Long-Form Data. 1-5 - Haibo Ye, Fangyu Zhou, Xinjie Li, Qingheng Zhang:
Balanced Mixup Loss for Long-Tailed Visual Recognition. 1-5 - Hanbing Liu, Yanru Wu, Yang Liu, Ercan E. Kuruoglu, Xuan Zhang:
SDG-L: A Semiparametric Deep Gaussian Process based Framework for Battery Capacity Prediction. 1-5 - Harshat Kumar, Alejandro Parada-Mayorga, Alejandro Ribeiro:
Algebraic Convolutional Filters on Lie Group Algebras. 1-5 - Atsushi Miyashita, Tomoki Toda:
Representation of Vocal Tract Length Transformation Based on Group Theory. 1-5 - Aochuan Chen, Peter Lorenz, Yuguang Yao, Pin-Yu Chen, Sijia Liu:
Visual Prompting for Adversarial Robustness. 1-5 - Yuzhou Chen, Sotiris Batsakis, H. Vincent Poor:
Higher-Order Spatio-Temporal Neural Networks for Covid-19 Forecasting. 1-5 - Domenico Mattia Cinque, Claudio Battiloro, Paolo Di Lorenzo:
Pooling Strategies for Simplicial Convolutional Networks. 1-5 - Jerry Gu, Liam Collins, Debashri Roy, Aryan Mokhtari, Sanjay Shakkottai, Kaushik R. Chowdhury:
Meta-Learning for Image-Guided Millimeter-Wave Beam Selection in Unseen Environments. 1-5 - Amlu Anna Joshy, P. N. Parameswaran, Siddharth R. Nair, Rajeev Rajan:
Statistical Analysis of Speech Disorder Specific Features to Characterise Dysarthria Severity Level. 1-5 - Brian Yan, Matthew Wiesner, Ondrej Klejch, Preethi Jyothi, Shinji Watanabe:
Towards Zero-Shot Code-Switched Speech Recognition. 1-5 - Jian Chen, Wei Wang, Junxin Chen, Ming Cai:
Dynamic Vehicle Graph Interaction for Trajectory Prediction Based on Video Signals. 1-5 - Thien-Phuc Doan, Long Nguyen-Vu, Souhwan Jung, Kihun Hong:
BTS-E: Audio Deepfake Detection Using Breathing-Talking-Silence Encoder. 1-5 - Yahong Zhang, Sheng Shi, Chenchen Fan, Yixin Wang, Wenli Ouyang, Wei Fan, Jianping Fan:
Long-Tailed Recognition with Causal Invariant Transformation. 1-5 - Xiu Zheng, Yuan Huang, Jie Tang:
Reliable Cluster-Based Framework for Open Set Domain Adaptation. 1-5 - Jing-Xuan Zhang, Genshun Wan, Zhen-Hua Ling, Jia Pan, Jianqing Gao, Cong Liu:
Self-Supervised Audio-Visual Speech Representations Learning by Multimodal Self-Distillation. 1-5 - Weiquan Huang, Fu Zhang:
Semi-Supervised Semantic Segmentation with Structured Output Space Adaption. 1-5 - Gaopeng Xu, Xianliang Wang, Sang Wang, Junfeng Yuan, Wei Guo, Wei Li, Jie Gao:
The NIO System for Audio-Visual Diarization and Recognition in MISP Challenge 2022. 1-2 - Chenghu Du, Shengwu Xiong:
CF-VTON: Multi-Pose Virtual Try-on with Cross-Domain Fusion. 1-5 - Subhashini Venugopalan, Jimmy Tobin, Samuel J. Yang, Katie Seaver, Richard J. N. Cave, Pan-Pan Jiang, Neil Zeghidour, Rus Heywood, Jordan R. Green, Michael P. Brenner:
Speech Intelligibility Classifiers from 550k Disordered Speech Samples. 1-5 - Kassem Kallas, Teddy Furon:
Mixer: DNN Watermarking using Image Mixup. 1-5 - Kaushani Majumder, Sibi Raj B. Pillai, Satish Mulleti:
Clustered Greedy Algorithm For Large-Scale Sensor Selection. 1-5 - Ke Hu, Tara N. Sainath, Bo Li, Nan Du, Yanping Huang, Andrew M. Dai, Yu Zhang, Rodrigo Cabrera, Zhifeng Chen, Trevor Strohman:
Massively Multilingual Shallow Fusion with Large Language Models. 1-5 - Dazhao Du, Bing Su, Zhewei Wei:
Preformer: Predictive Transformer with Multi-Scale Segment-Wise Correlations for Long-Term Time Series Forecasting. 1-5 - Ziyue Wang, Ya-Feng Liu, Zhaorui Wang, Wei Yu:
Scaling Law Analysis for Covariance Based Activity Detection in Cooperative Multi-Cell Massive Mimo. 1-5 - Michael Chan, Li Zhu, Korosh Vatanparvar, Hewon Jung, Jilong Kuang, Jun Alex Gao:
Improving Heart Rate and Heart Rate Variability Estimation from Video Through a HR-RR-Tuned Filter. 1-5 - Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe:
Intermpl: Momentum Pseudo-Labeling With Intermediate CTC Loss. 1-5 - Jiewen Zhu, Shengjia Chen, Lexiao Li, Luping Ji:
Sanet: Spatial Attention Network with Global Average Contrast Learning for Infrared Small Target Detection. 1-5 - Manila Kodali, Sudarsana Reddy Kadiri, Laura Laaksonen, Paavo Alku:
Automatic Classification of Vocal Intensity Category from Speech. 1-5 - Xingming Wang, Hao Wu, Chen Ding, Chuanzeng Huang, Ming Li:
Exploring Universal Singing Speech Language Identification Using Self-Supervised Learning Based Front-End Features. 1-5 - Jochen Fink, Renato L. G. Cavalcante, Zoran Utkovski, Slawomir Stanczak:
Deep-Unfolded Adaptive Projected Subgradient Method For Mimo Detection. 1-5 - Sofia Suvorova, Ali Pezeshki, Ross Kyprianou, Bill Moran:
A Radar-Jammer Zero-Sum Repeated Bayesian Game. 1-5 - Shuo Feng, Piji Li:
Ancient Chinese Word Segmentation and Part-of-Speech Tagging Using Distant Supervision. 1-5 - Yao Lu, Zhiyi Chen, Zehui Chen, Jie Hu, Liujuan Cao, Shengchuan Zhang:
CANDY: Category-Kernelized Dynamic Convolution for Instance Segmentation. 1-5 - Liuyin Wang, Mingchao Li, Hai-Tao Zheng:
High-Level Feature Fusion Network for Session-Based Social Recommendation. 1-5 - Mingliang Dai, Zhizhong Huang, Jiaqi Gao, Hongming Shan, Junping Zhang:
Cross-Head Supervision for Crowd Counting with Noisy Annotations. 1-5 - Liana Khamidullina, André L. F. de Almeida, Martin Haardt:
Rate Splitting and Precoding Strategies for Multi-User MIMO Broadcast Channels with Common and Private Streams. 1-5 - Lei Zhang, Jie Liu, Yanqi Bao, Jie Wang:
Region-Awared Transformer with Asymmetric Loss in Multi-Label Classification. 1-5 - Mehul Kumar, Jiyeon Kim, Dhananjaya Gowda, Abhinav Garg, Chanwoo Kim:
Self-Supervised Accent Learning for Under-Resourced Accents Using Native Language Data. 1-5 - Jun Wang, Peng Yao, Feng Deng, Jianchao Tan, Chengru Song, Xiaorui Wang:
NAS-DYMC: NAS-Based Dynamic Multi-Scale Convolutional Neural Network for Sound Event Detection. 1-5 - Xianyu Wang, Yuhan Zhang, Weihua He, Yaoyuan Wang, Minglei Li, Yuchen Wang, Jingyi Zhang, Shunbo Zhou, Ziyang Zhang:
Audio-Driven High Definetion and Lip-Synchronized Talking Face Generation Based on Face Reenactment. 1-5 - Han Ding, Wenjing Song, Cui Zhao, Fei Wang, Ge Wang, Wei Xi, Jizhong Zhao:
Knowledge-Graph Augmented Music Representation for Genre Classification. 1-5 - Da Li, Bo Tang, Lei Xue:
Co-Design for Mimo Radar and Mimo Communication Aided by Reconfigurable Intelligent Surface. 1-5 - Daizong Liu, Pan Zhou:
Jointly Visual- and Semantic-Aware Graph Memory Networks for Temporal Sentence Localization in Videos. 1-5 - Yudong Zhang, Wei Lu, Xu Wang, Pengkun Wang, Yang Wang:
Pondering About Task Spatial Misalignment: Classification-Localization Equilibrated Object Detection. 1-5 - Andrea Marinoni, Marine Mercier, Qian Shi, Sivasakthy Selvakumaran, Mark Girolami:
Incorporating Reliability in Graph Information Propagation by Fluid Dynamics Diffusion: A case of Multimodal Semisupervised Deep Learning. 1-5 - Zhao Ren, Thanh Tam Nguyen, Yi Chang, Björn W. Schuller:
Fast Yet Effective Speech Emotion Recognition with Self-Distillation. 1-5 - Marco A. Oliveira, Vitor Almeida, João Silva, Aníbal J. S. Ferreira:
Analysis and Re-Synthesis of Natural Cricket Sounds Assessing the Perceptual Relevance of Idiosyncratic Parameters. 1-5 - Yikang Wei, Yahong Han:
Exploring Instance Relation for Decentralized Multi-Source Domain Adaptation. 1-5 - Yihong Wu, Yuwen Heng, Mahesan Niranjan, Hansung Kim:
Depth Estimation for a Single Omnidirectional Image with Reversed-Gradient Warming-up Thresholds Discriminator. 1-5 - Ysobel Sims, Alexandre Mendes, Stephan K. Chalup:
Enhanced Embeddings in Zero-Shot Learning for Environmental Audio. 1-5 - Youngki Kwon, Hee-Soo Heo, Bong-Jin Lee, You Jin Kim, Jee-Weon Jung:
Absolute Decision Corrupts Absolutely: Conservative Online Speaker Diarisation. 1-5 - Paul-Gauthier Noé, Xiaoxiao Miao, Xin Wang, Junichi Yamagishi, Jean-François Bonastre, Driss Matrouf:
Hiding Speaker's Sex in Speech Using Zero-Evidence Speaker Representation in an Analysis/Synthesis Pipeline. 1-5 - Seyed Saman Saboksayr, Gonzalo Mateos:
Dual-Based Online Learning of Dynamic Network Topologies. 1-5 - Benjamin Z. Reichman, Anirudh Sundar, Christopher Richardson, Tamara Zubatiy, Prithwijit Chowdhury, Aaryan Shah, Jack Truxal, Micah Grimes, Dristi Shah, Woo Ju Chee, Saif Punjwani, Atishay Jain, Larry Heck:
Outside Knowledge Visual Question Answering Version 2.0. 1-5 - Zihui Cai, Hongwei Ding, Xuemeng Wu, Mohan Xu, Xiaohui Cui:
Hierarchical Transformer for Multi-Label Trailer Genre Classification. 1-5 - Georgios Rizos, Rafael A. Calvo, Björn W. Schuller:
Positive-Pair Redundancy Reduction Regularisation for Speech-Based Asthma Diagnosis Prediction. 1-5 - Xunmeng Wu, Zai Yang, Jian-Feng Cai, Zongben Xu:
Spectral Super-Resolution on the Unit Circle Via Gradient Descent. 1-5 - Seongyeon Park, Myungseo Song, Bohyung Kim, Tae-Hyun Oh:
Unsupervised Pre-Training for Data-Efficient Text-to-Speech on Low Resource Languages. 1-5 - Fengming Liang, Changlin Fan, Bo Xiao, Kongming Liang:
Semantic Centralized Contrastive Learning for Unsupervised Hashing. 1-5 - Chia-Sheng Liu, Jia-Fong Yeh, Hao Hsu, Hung-Ting Su, Ming-Sui Lee, Winston H. Hsu:
BIRD-PCC: Bi-Directional Range Image-Based Deep Lidar Point Cloud Compression. 1-5 - Zengrui Jin, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shujie Hu, Jiajun Deng, Guinan Li, Xunying Liu:
Adversarial Data Augmentation Using VAE-GAN for Disordered Speech Recognition. 1-5 - Guanjun Li, Wei Xue, Wenju Liu, Jiangyan Yi, Jianhua Tao:
GCC-Speaker: Target Speaker Localization with Optimal Speaker-Dependent Weighting in Multi-Speaker Scenarios. 1-5 - Yihe Wang, Yitong Li, Yasheng Wang, Fei Mi, Pingyi Zhou, Jin Liu, Xin Jiang, Qun Liu:
History, Present and Future: Enhancing Dialogue Generation with Few-Shot History-Future Prompt. 1-5 - Yang Zhang, Krishna C. Puvvada, Vitaly Lavrukhin, Boris Ginsburg:
Conformer-Based Target-Speaker Automatic Speech Recognition For Single-Channel Audio. 1-5 - Dan Berrebbi, Brian Yan, Shinji Watanabe:
Avoid Overthinking in Self-Supervised Models for Speech Recognition. 1-5 - Sarah Miller, Christina Karam, Achour Idoughi, Kodai Kikuchi, Keigo Hirakawa:
A Bayesian Perspective on Noise2Noise: Theory and Extensions. 1-5 - Yuhongze Zhou, Liguang Zhou, Issam Hadj Laradji, Tin Lun Lam, Yangsheng Xu:
Affinity Learning With Blind-Spot Self-Supervision for Image Denoising. 1-5 - Tzeviya Sylvia Fuchs, Yedid Hoshen:
Unsupervised Word Segmentation Using Temporal Gradient Pseudo-Labels. 1-5 - Nauman Dawalatabad, Sameer Khurana, Antoine Laurent, James R. Glass:
On Unsupervised Uncertainty-Driven Speech Pseudo-Label Filtering and Model Calibration. 1-5 - Kisoo Kwon, Kuhwan Jeong, Junghyun Park, Hwidong Na, Jinwoo Shin:
String-Based Molecule Generation Via Multi-Decoder VAE. 1-5 - Xinzhou Xu, Jun Deng, Zixing Zhang, Zhen Yang, Björn W. Schuller:
Zero-Shot Speech Emotion Recognition Using Generative Learning with Reconstructed Prototypes. 1-5 - Roberto Pereira, Xavier Mestre, David Gregoratti:
Consistent Estimators of a New Class of Covariance Matrix Distances in the Large Dimensional Regime. 1-5 - Yu Bai, Ruian He, Weimin Tan, Bo Yan, Yangle Lin:
Fine-Grained Blind Face Inpainting with 3D Face Component Disentanglement. 1-5 - Yibin Tang, Ying Chen, Yuan Gao, Aimin Jiang, Lin Zhou:
ADHD Classification with Biomarker Identification Using a Triplet Loss Attention Auto-Encoding Network. 1-5 - Rakib Hyder, M. Salman Asif:
Compressive Sensing with Tensorized Autoencoder. 1-5 - Zhengzhuo Xu, Shuo Yang, Xingjun Wang, Chun Yuan:
Rethink Long-Tailed Recognition with Vision Transforms. 1-5 - Ruoyu Wang, Jun Du, Tian Gao:
Quantum Transfer Learning Using the Large-Scale Unsupervised Pre-Trained Model Wavlm-Large for Synthetic Speech Detection. 1-5 - Daeun Kyung, Kyungmin Jo, Jaegul Choo, Joonseok Lee, Edward Choi:
Perspective Projection-Based 3d CT Reconstruction from Biplanar X-Rays. 1-5 - Yanan Lin, Keyu Chen, Shihao Zhou, Yunan Huang, Yunqi Lei:
CO-NET: Classification-Oriented Point Cloud Sampling via Informative Feature Learning and Non-Overlapped Local Adjustment. 1-5 - Rémi Delogne, Vincent Schellekens, Laurent Daudet, Laurent Jacques:
Signal Processing with Optical Quadratic Random Sketches. 1-5 - Ferdinand Jost, Vassillen Chizhov, Joachim Weickert:
Optimising Different Feature Types for Inpainting-Based Image Representations. 1-5 - Fuyan Ma, Bin Sun, Shutao Li:
Logo-Former: Local-Global Spatio-Temporal Transformer for Dynamic Facial Expression Recognition. 1-5 - Yun-Ning Hung, Chao-Han Huck Yang, Pin-Yu Chen, Alexander Lerch:
Low-Resource Music Genre Classification with Cross-Modal Neural Model Reprogramming. 1-5 - Jiukai Sun, Ganchao Liu, Xuelong Li, Yuan Yuan:
Difference Guided VHR Remote Sensing Image Change Detection. 1-5 - Ryuichi Yamamoto, Reo Yoneyama, Tomoki Toda:
NNSVS: A Neural Network-Based Singing Voice Synthesis Toolkit. 1-5 - Yuya Nishi, Takumi Takahashi, Hiroki Iimori, Giuseppe Abreu, Shinsuke Ibi, Seiichi Sampei:
Wireless Location Tracking via Complex-Domain Super MDS with Time Series Self-Localization Information. 1-5 - Adarsh M. Subramaniam, Akshayaa Magesh, Venugopal V. Veeravalli:
Adaptive Step-Size Methods for Compressed SGD. 1-5 - Khoa Anh Ngo, Kyuhong Shim, Byonghyo Shim:
Spatial Cross-Attention for Transformer-Based Image Captioning. 1-5 - Tong Lei, Zhongshu Hou, Yuxiang Hu, Wanyu Yang, Tianchi Sun, Xiaobin Rong, Dahan Wang, Kai Chen, Jing Lu:
A Low-Latency Hybrid Multi-Channel Speech Enhancement System For Hearing Aids. 1-2 - Guangzhi Sun, Chao Zhang, Philip C. Woodland:
End-to-End Spoken Language Understanding with Tree-Constrained Pointer Generator. 1-5 - Anastasia Kuznetsova, Aswin Sivaraman, Minje Kim:
The Potential of Neural Speech Synthesis-Based Data Augmentation for Personalized Speech Enhancement. 1-5 - Sarbani Ghose, Deepak Mishra, Santi P. Maity, George C. Alexandropoulos:
RIS Reflection and Placement Optimisation for Underlay D2D Communications in Cognitive Cellular Networks. 1-5 - Tianyu Geng, Feng Ji, Pratibha, Wee Peng Tay:
Modulo EEG Signal Recovery Using Transformer. 1-5 - Bach-Tung Pham, Ting-Yu Wang, Phuong Le Thi, Khai-Thinh Nguyen, Yuan-Shan Lee, Tzu-Chiang Tai, Jia-Ching Wang:
Dense Adversarial Transfer Learning Based On Class-Invariance. 1-5 - Yuang Li, Xianrui Zheng, Philip C. Woodland:
Self-Supervised Learning-Based Source Separation for Meeting Data. 1-5 - Gerrit Maus, Dieter Brückmann:
Joint Angle and Respiration Estimation for Passive and Device-Free Respiration Monitoring. 1-5 - Yingting Li, Ambuj Mehrish, Rishabh Bhardwaj, Navonil Majumder, Bo Cheng, Shuai Zhao, Amir Zadeh, Rada Mihalcea, Soujanya Poria:
Evaluating Parameter-Efficient Transfer Learning Approaches on SURE Benchmark for Speech Understanding. 1-5 - Kartik Audhkhasi, Brian Farris, Bhuvana Ramabhadran, Pedro J. Moreno:
Modular Conformer Training for Flexible End-to-End ASR. 1-5 - Zihan Zhang, Shimin Zhang, Mingshuai Liu, Yanhong Leng, Zhe Han, Li Chen, Lei Xie:
Two-Step Band-Split Neural Network Approach For Full-Band Residual Echo Suppression. 1-2 - Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao:
Learning Speech Representations with Flexible Hidden Feature Dimensions. 1-5 - Vikram Krishnamurthy:
Adaptive Filtering Algorithms For Set-Valued Observations-Symmetric Measurement Approach To Unlabeled And Anonymized Data. 1-5 - Dianlong You, Houlin Wang, Bingxin Liu, Yang Yu, Zhiming Li:
DL-NET: Dilation Location Network for Temporal Action Detection. 1-5 - Vanya Bannihatti Kumar, Shanbo Cheng, Ningxin Peng, Yuchen Zhang:
Visual Information Matters for ASR Error Correction. 1-5 - Xiangping Zheng, Xun Liang, Bo Wu, Junlan Feng, Yuhui Guo, Sensen Zhang:
Intent Does Matter! Propagating High-Order Relations for Exploring Interest Preferences. 1-5 - Tom O'Malley, Shaojin Ding, Arun Narayanan, Quan Wang, Rajeev Rikhye, Qiao Liang, Yanzhang He, Ian McGraw:
Conditional Conformer: Improving Speaker Modulation For Single And Multi-User Speech Enhancement. 1-5 - Qin Lu, Konstantinos D. Polyzos:
Gaussian Process Dynamical Modeling for Adaptive Inference Over Graphs. 1-5 - Sakila S. Jayaweera, Beibei Wang, Xiaolu Zeng, Wei-Hsiang Wang, K. J. Ray Liu:
WIFI-Based Robust Child Presence Detection for Smart Cars. 1-5 - Hayato Futami, Jessica Huynh, Siddhant Arora, Shih-Lun Wu, Yosuke Kashiwagi, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe:
The Pipeline System of ASR and NLU with MLM-based data Augmentation Toward Stop Low-Resource Challenge. 1-2 - Steven Vander Eeckt, Hugo Van hamme:
Weight Averaging: A Simple Yet Effective Method to Overcome Catastrophic Forgetting in Automatic Speech Recognition. 1-5 - Yan Zhao, Jincen Wang, Yuan Zong, Wenming Zheng, Hailun Lian, Li Zhao:
Deep Implicit Distribution Alignment Networks for cross-Corpus Speech Emotion Recognition. 1-5 - Byeonggeun Kim, Jun-Tae Lee, Seunghan Yang, Simyung Chang:
Scalable Weight Reparametrization for Efficient Transfer Learning. 1-5 - Mohammad Reza Hasanabadi, Majid Behdad, Davood Gharavian:
MFCCGAN: A Novel MFCC-Based Speech Synthesizer Using Adversarial Learning. 1-5 - Mingming Zhang, Ye Du, Zhenghui Hu, Qingjie Liu, Yunhong Wang:
BISVP: Building Footprint Extraction Via Bidirectional Serialized Vertex Prediction. 1-5 - Naman Khetan, Tushar Arora, Samee Ur Rehman, Deepak K. Gupta:
Implicitly Rotation Equivariant Neural Networks. 1-5 - Nirmesh Shah, Mayank Kumar Singh, Naoya Takahashi, Naoyuki Onoe:
Nonparallel Emotional Voice Conversion for Unseen Speaker-Emotion Pairs Using Dual Domain Adversarial Network & Virtual Domain Pairing. 1-5 - Yukun Zhang, Chuan Wang, Sanyi Zhang, Xiaochun Cao:
A Database for Multi-Modal Short Video Quality Assessment. 1-5 - Chakka Sai Pradeep, Neelam Sinha, Banibrata Mukhopadhyay:
Measuring Deviation from Stochasticity in Time-Series Using Autoencoder Based Time-Invariant Representation: Application to Black Hole Data. 1-5 - Rahul Pandey, Roger Ren, Qi Luo, Jing Liu, Ariya Rastrow, Ankur Gandhe, Denis Filimonov, Grant P. Strimel, Andreas Stolcke, Ivan Bulyko:
Procter: Pronunciation-Aware Contextual Adapter For Personalized Speech Recognition In Neural Transducers. 1-5 - Jitendra K. Tugnait:
Estimation of High-Dimensional Differential Graphs from Multi-Attribute Data. 1-5 - Li Huang, Hongmei Wu, Qiang Gao, Guisong Liu:
Attention Localness in Shared Encoder-Decoder Model For Text Summarization. 1-5 - Puja Trivedi, Danai Koutra, Jayaraman J. Thiagarajan:
A Closer Look At Scoring Functions And Generalization Prediction. 1-5 - Ke Liu, Jingzhao Hu, Jun Feng:
Speech Emotion Recognition Based on Low-Level Auto-Extracted Time-Frequency Features. 1-5 - M. Amin Manouchehrpour, Harvinder Lehal, Mahsa Salmani, Timothy N. Davidson:
TDMA-Based Multi-User Binary Computation Offloading in the Finite-Block-Length Regime. 1-5 - Zheng Tan, Longxiu Huang, HanQin Cai, Yifei Lou:
Non-Convex Approaches for Low-Rank Tensor Completion under Tubal Sampling. 1-5 - Costas A. Kokke, Mario Coutino, Laura Anitori, Richard Heusdens, Geert Leus:
Sensor Selection for Angle of Arrival Estimation Based on the Two-Target Cramér-Rao Bound. 1-5 - Yannan Chen, Licheng Zhao, Yaowen Zhang, Kaiming Shen:
Inverse Quadratic Transform for Minimizing A Sum of Ratios. 1-5 - Chen Chen, Dong Wang, Thomas Fang Zheng:
CN-CVS: A Mandarin Audio-Visual Dataset for Large Vocabulary Continuous Visual to Speech Synthesis. 1-5 - W. Bastiaan Kleijn, Michael Chinen, Felicia S. C. Lim, Jan Skoglund:
Multi-Channel Audio Signal Generation. 1-5 - Kyungsu Kim, Minju Park, Haesun Joung, Yunkee Chae, Yeongbeom Hong, Seonghyeon Go, Kyogu Lee:
Show Me the Instruments: Musical Instrument Retrieval From Mixture Audio. 1-5 - Ben Hayes, Charalampos Saitis, György Fazekas:
Sinusoidal Frequency Estimation by Gradient Descent. 1-5 - David Ramírez, Ignacio Santamaría, Louis L. Scharf:
Passive Detection of Rank-One Gaussian Signals for Known Channel Subspaces and Arbitrary Noise. 1-5 - Yufeng Wu, Baowei Wang, Changyu Dai, Yi Yuan, Bin Li, Weiqian Zheng, Hao Wu:
Enhancing Robustness and Imperceptibility of Blind Watermarking with Improved Message Processor. 1-5 - Chenyu Huang, Weimin Tan, Jiaxing Shi, Zhen Xing, Bo Yan:
Uncer2Natural: Uncertainty-Aware Unsupervised Image Denoising. 1-5 - Ruolin Su, Jingfeng Yang, Ting-Wei Wu, Biing-Hwang Juang:
Choice Fusion As Knowledge For Zero-Shot Dialogue State Tracking. 1-5 - Sixiang Chen, Tian Ye, Jun Shi, Yun Liu, Jingxia Jiang, Erkang Chen, Peng Chen:
DEHRFormer: Real-Time Transformer for Depth Estimation and Haze Removal from Varicolored Haze Scenes. 1-5 - Dongseong Hwang, Khe Chai Sim, Yu Zhang, Trevor Strohman:
Comparison of Soft and Hard Target RNN-T Distillation for Large-Scale ASR. 1-5 - Hongyi Pan, Xin Zhu, Zhilu Ye, Pai-Yen Chen, Ahmet Enis Çetin:
Real-Time Wireless ECG-Derived Respiration Rate Estimation using an Autoencoder with a DCT Layer. 1-5 - Gary Wang, Kyle Kastner, Ankur Bapna, Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Yu Zhang:
Understanding Shared Speech-Text Representations. 1-5 - Jinghan Jia, Yihua Zhang, Dogyoon Song, Sijia Liu, Alfred O. Hero III:
Robustness-Preserving Lifelong Learning Via Dataset Condensation. 1-5 - Yibo Zhang, Ping Gong, Zelin Wang, Zhe Li, Xuanyuan Yang:
DialogMI: A Dialogue Model Based on Enhancing Dialogue Mutual Information. 1-5 - Yi-Chiao Wu, Israel D. Gebru, Dejan Markovic, Alexander Richard:
Audiodec: An Open-Source Streaming High-Fidelity Neural Audio Codec. 1-5 - Yongxiang Feng, Weihua He, Kaichao You, Bing Liu, Ziyang Zhang, Yaoyuan Wang, Minglei Li, Yihang Lou, Jiawei Li, Guoqi Li, Jianxing Liao:
Test-Time Training-Free Domain Adaptation. 1-5 - Haoyu Lu, Nan Li, Tongtong Song, Longbiao Wang, Jianwu Dang, Xiaobao Wang, Shiliang Zhang:
Speech and Noise Dual-Stream Spectrogram Refine Network With Speech Distortion Loss For Robust Speech Recognition. 1-5 - Jinshuai Yang, Zhongliang Yang, Xinrui Ge, Jiajun Zou, Yue Gao, Yongfeng Huang:
LINK: Linguistic Steganalysis Framework with External Knowledge. 1-5 - Jinting Wu, Mei Tu:
A Person Identification System for the ICASSP 2023 e-Prevention Challenge. 1-2 - Xuandi Fu, Kanthashree Mysore Sathyendra, Ankur Gandhe, Jing Liu, Grant P. Strimel, Ross McGowan, Athanasios Mouchtaris:
Robust Acoustic And Semantic Contextual Biasing In Neural Transducers For Speech Recognition. 1-5 - Navneet Agrawal, Renato L. G. Cavalcante, Slawomir Stanczak:
Dynamic Distributed Convex Optimization "Over-The-Air" In Decentralized Wireless Networks. 1-5 - Weidong Chen, Xiaofen Xing, Xiangmin Xu, Jianxin Pang, Lan Du:
DST: Deformable Speech Transformer for Emotion Recognition. 1-5 - Matthias Blochberger, Filip Elvander, Randall Ali, Jan Østergaard, Jesper Jensen, Marc Moonen, Toon van Waterschoot:
Distributed Adaptive Norm Estimation for Blind System Identification in Wireless Sensor Networks. 1-5 - Hans Van Gorp, Merel M. van Gilst, Pedro Fonseca, Sebastiaan Overeem, Ruud J. G. van Sloun:
Aleatoric Uncertainty Estimation of Overnight Sleep Statistics Through Posterior Sampling Using Conditional Normalizing Flows. 1-5 - Tongzi Wu, Yuhao Zhou, Wang Ling, Hojin Yang, Joana Veloso, Lin Sun, Ruixin Huang, Norberto Guimaraes, Scott Sanner:
Towards Dialogue Modeling Beyond Text. 1-5 - Xiangui Kang, Pengcheng Su, Zisheng Huang, Yifang Chen, Jie Wang:
Double Compression Detection Based on the De-Blocking Filtering of HEVC Videos. 1-5 - Valentin Debarnot, Sidharth Gupta, Konik Kothari, Ivan Dokmanic:
Joint Cryo-ET Alignment and Reconstruction with Neural Deformation Fields. 1-5 - Shinta Otake, Rei Kawakami, Nakamasa Inoue:
Parameter Efficient Transfer Learning for Various Speech Processing Tasks. 1-5 - Esaú Villatoro-Tello, Srikanth R. Madikeri, Juan Zuluaga-Gomez, Bidisha Sharma, Seyyed Saeed Sarfjoo, Iuliia Nigmatulina, Petr Motlícek, Alexei V. Ivanov, Aravind Ganapathiraju:
Effectiveness of Text, Acoustic, and Lattice-Based Representations in Spoken Language Understanding Tasks. 1-5 - Georgi Tinchev, Marta Czarnowska, Kamil Deja, Kayoko Yanagisawa, Marius Cotescu:
Modelling Low-Resource Accents Without Accent-Specific TTS Frontend. 1-5 - Arghya Pal, Sailaja Rajanala, Raphaël C.-W. Phan, KokSheik Wong:
Self Supervised Bert for Legal Text Classification. 1-5 - Xiongbiao Luo:
A New Personalized Efficacy Atlas for Pallidal Deep Brain Stimulation. 1-5 - Harry Dong, Megna Shah, Sean Donegan, Yuejie Chi:
Deep Unfolded Tensor Robust PCA With Self-Supervised Learning. 1-5 - Ryota Komatsu, Yusuke Kimura, Takuma Okamoto, Takahiro Shinozaki:
Continuous Action Space-Based Spoken Language Acquisition Agent Using Residual Sentence Embedding and Transformer Decoder. 1-5 - Junhao Wang, Li Lu, Zhongjie Ba, Feng Lin, Kui Ren:
Shift to Your Device: Data Augmentation for Device-Independent Speaker Verification Anti-Spoofing. 1-5 - Chun-Yi Li, Yen-Yu Lin, Wei-Chen Chiu:
Decontamination Transformer For Blind Image Inpainting. 1-5 - Ruixia Zhang, Zhiqiong Wang, Zhongyang Wang, Junchang Xin:
A Dynamic Cross-Scale Transformer with Dual-Compound Representation for 3D Medical Image Segmentation. 1-5 - Yuzheng Wang, Zhaoyu Chen, Dingkang Yang, Yang Liu, Siao Liu, Wenqiang Zhang, Lizhe Qi:
Adversarial Contrastive Distillation with Adaptive Denoising. 1-5 - Jiawei Chen, Peijie Huang, Guotai Huang, Qianer Li, Yuhong Xu:
SDTN: Speaker Dynamics Tracking Network for Emotion Recognition in Conversation. 1-5 - Ioannis Tsiamas, Gerard I. Gállego, José A. R. Fonollosa, Marta R. Costa-jussà:
Efficient Speech Translation with Dynamic Latent Perceivers. 1-5 - Chaoran Yang, Qing Ling, Xueli Sheng, Mengfei Mu, Andreas Jakobsson:
Sparse and Structured Modelling of Underwater Acoustic Channel Impulse Responses. 1-5 - Ran Ji, Jiarui Li, Wentao He, Jianfeng Ren, Xudong Jiang:
Dual-Stream Siamese Vision Transformer With Mutual Attention For Radar Gait Verification. 1-5 - Camilo Aguilar, Mathias Ortner, Josiane Zerubia:
Enhanced GM-PHD Filter for Real Time Satellite Multi-Target Tracking. 1-5 - Yikemaiti Sataer, Chuanqi Shi, Miao Gao, Yunlong Fan, Bin Li, Zhiqiang Gao:
Integrating Syntactic and Semantic Knowledge in AMR Parsing with Heterogeneous Graph Attention Network. 1-5 - E. Kobayashi, Hiroyasu Yasuda, Kiyoshi Hayasaka, Yu Otake, Shunsuke Ono, Shogo Muramatsu:
Multi-Resolution Convolutional Dictionary Learning for Riverbed Dynamics Modeling. 1-5 - Priyesh Shukla, Sureshkumar S., Alex C. Stutts, Sathya Ravi, Theja Tulabandhula, Amit Ranjan Trivedi:
Robust Monocular Localization of Drones by Adapting Domain Maps to Depth Prediction Inaccuracies. 1-5 - Kaiwen Zhou, Zhilin Chen, Guochen Liu, Zhitang Chen:
A Novel Extrapolation Technique to Accelerate WMMSE. 1-5 - Takuya Fujihashi, Toshiaki Koike-Akino, Takashi Watanabe:
Soft 2D-to-3D Delivery Using Deep Graph Neural Networks for Holographic-Type Communication. 1-5 - Jie Chen, Xingchen Song, Zhendong Peng, Binbin Zhang, Fuping Pan, Zhiyong Wu:
LightGrad: Lightweight Diffusion Probabilistic Model for Text-to-Speech. 1-5 - Anselm Lohmann, Toon van Waterschoot, Jörg Bitzer, Simon Doclo:
Dereverberation in Acoustic Sensor Networks Using weighted Prediction Error with Microphone-Dependent Prediction Delays. 1-5 - Ke Yang, Sixian Wang, Jincheng Dai, Kailin Tan, Kai Niu, Ping Zhang:
WITT: A Wireless Image Transmission Transformer for Semantic Communications. 1-5 - Iván López-Espejo, Ram C. M. C. Shekar, Zheng-Hua Tan, Jesper Jensen, John H. L. Hansen:
Filterbank Learning for Noise-Robust Small-Footprint Keyword Spotting. 1-5 - Vandad Davoodnia, Ali Etemad:
Human Pose Estimation from Ambiguous Pressure Recordings with Spatio-Temporal Masked Transformers. 1-5 - Zekai Li, Wei Peng:
Self-Adaptive Reasoning on Sub-Questions for Multi-Hop Question Answering. 1-5 - Hugo Jaquard, Michaël Fanuel, Pierre-Olivier Amblard, Rémi Bardenet, Simon Barthelmé, Nicolas Tremblay:
Smoothing Complex-Valued Signals on Graphs with Monte-Carlo. 1-5 - Xian Zhong, Shuaipeng Su, Wenxuan Liu, Xuemei Jia, Wenxin Huang, Mengdie Wang:
Neighborhood Information-Based Label Refinement for Person Re-Identification with Label Noise. 1-5 - Oliver Watts, Lovisa Wihlborg, Cassia Valentini-Botinhao:
PUFFIN: Pitch-Synchronous Neural Waveform Generation for Fullband Speech on Modest Devices. 1-5 - Wilmer Lobato, Felipe Farias, William Cruz, Marcellus Amadeus:
Performance Comparison of TTS Models for Brazilian Portuguese to Establish a Baseline. 1-5 - Aaron Geldert, Nils Meyer-Kahlen, Sebastian J. Schlecht:
Interpolation of Spatial Room Impulse Responses Using Partial Optimal Transport. 1-5 - Marzieh Ajirak, Petar M. Djuric:
A Gaussian Latent Variable Model for Incomplete Mixed Type Data. 1-5 - Kang Li, Yan Song, Li-Rong Dai, Ian McLoughlin, Xin Fang, Lin Liu:
AST-SED: An Effective Sound Event Detection Method Based on Audio Spectrogram Transformer. 1-5 - Mariam Saeed, Marwan Torki:
Lit the Darkness: Three-Stage Zero-Shot Learning for Low-Light Enhancement with Multi-Neighbor Enhancement Factors. 1-2 - Jie Liu, Yixuan Liu, Xue Han, Chao Deng, Junlan Feng:
ESCL: Equivariant Self-Contrastive Learning for Sentence Representations. 1-5 - Shangeth Rajaa, Kriti Anandan, Swaraj Dalmia, Tarun Gupta, Eng Siong Chng:
Improving Spoken Language Identification with Map-Mix. 1-5 - Huy Phan, Elisabeth R. M. Heremans, Oliver Y. Chén, Philipp Koch, Alfred Mertins, Maarten De Vos:
Improving Automatic Sleep Staging Via Temporal Smoothness Regularization. 1-5 - Ori Kenig, Koby Todros, Tülay Adali:
Robust GMM Parameter Estimation via the K-BM Algorithm. 1-5 - Nayeon Kim, Moonsub Byeon, Daehyun Ji, Dokwan Oh:
D-3DLD: Depth-Aware Voxel Space Mapping for Monocular 3D Lane Detection with Uncertainty. 1-5 - Wei Huang, Yixin Zhao, Xuechao Wu, Le Yin:
Improved Indoor Localization With NLOS Signal Propagations. 1-5 - Chan-Jan Hsu, Ho-Lam Chung, Hung-Yi Lee, Yu Tsao:
T5lephone: Bridging Speech and Text Self-Supervised Models for Spoken Language Understanding Via Phoneme Level T5. 1-5 - Federico Baldassarre, Alaaeldin El-Nouby, Hervé Jégou:
Variable Rate Allocation for Vector-Quantized Autoencoders. 1-5 - Chao Liao, Jinwen Huang, Huan Yuan, Peng Yao, Jianchao Tan, Dawei Zhang, Feng Deng, Xiaorui Wang, Chengru Song:
Dynamic TF-TDNN: Dynamic Time Delay Neural Network Based on Temporal-Frequency Attention for Dialect Recognition. 1-5 - Toshiki Orihara, Kazi Mahmudul Hassan, Toshihisa Tanaka:
Active Selection of Source Patients in Transfer Learning for Epileptic Seizure Detection Using Riemannian Manifold. 1-5 - Alexandra Vioni, Georgia Maniati, Nikolaos Ellinas, June Sig Sung, Inchul Hwang, Aimilios Chalamandaris, Pirros Tsiakoulis:
Investigating Content-Aware Neural Text-to-Speech MOS Prediction Using Prosodic and Linguistic Features. 1-5 - Vincent P. Martin, Aymeric Ferron, Jean-Luc Rouas, Pierre Philip:
"Prediction of Sleepiness Ratings from Voice by Man and Machine": A Perceptual Experiment Replication Study. 1-5 - Dong Wu, Bin Liang, Xiangjun Liu, Xuan Zang, Mingmin Chi:
Bipartite Graph Convolutional Networks with Adversarial Domain Transfer. 1-5 - Dongmin Huang, Lingwei Wang, Hongzhou Lu, Wenjin Wang:
A Contrastive Embedding-Based Domain Adaptation Method for Lung Sound Recognition in Children Community-Acquired Pneumonia. 1-5 - Jie Wei, Guanyu Hu, Luu Anh Tuan, Xinyu Yang, Wenjing Zhu:
Multi-Scale Receptive Field Graph Model for Emotion Recognition in Conversations. 1-5 - Pascal A. Schirmer, Iosif Mporas:
A Wavelet Scattering Approach for Load Identification with Limited Amount of Training Data. 1-5 - Xiaoliang Wu, Peter Bell, Ajitha Rajan:
Explanations for Automatic Speech Recognition. 1-5 - Jiaxin Ye, Xin-Cheng Wen, Yujie Wei, Yong Xu, Kunhong Liu, Hongming Shan:
Temporal Modeling Matters: A Novel Temporal Emotional Modeling Approach for Speech Emotion Recognition. 1-5 - Fangyuan Chi, Yixiao Wang, Panos Nasiopoulos, Victor C. M. Leung, Mahsa T. Pourazad:
Federated Semi-Supervised Learning for Object Detection in Autonomous Driving. 1-5 - Shuai Tao, Himavanth Reddy, Jesper Rindom Jensen, Mads Græsbøll Christensen:
Frequency Bin-Wise Single Channel Speech Presence Probability Estimation Using Multiple DNNs. 1-5 - Yunzuo Zhang, Weili Kang, Yameng Liu, Pengfei Zhu:
Joint Multi-Level Feature Network for Lightweight Person Re-Identification. 1-5 - Yan Wang, Xin Luo, Zhen-Duo Chen, Peng-Fei Zhang, Meng Liu, Xin-Shun Xu:
FedVMR: A New Federated Learning Method for Video Moment Retrieval. 1-5 - Aleksej Chinaev, Niklas Knaepper, Gerald Enzner:
Long-Term Synchronization of Wireless Acoustic Sensor Networks with Nonpersistent Acoustic Activity Using Coherence State. 1-5 - Jiawei Liu, Hao Wang, Weining Wang, Xingjian He, Jing Liu:
WL-MSR: Watch and Listen for Multimodal Subtitle Recognition. 1-5 - Sándor Plósz, István Gyöngy, Jonathan Leach, Steve McLaughlin, Gerald S. Buller, Abderrahim Halimi:
Fast Multiscale 3D Reconstruction Using Single-Photon Lidar Data. 1-5 - Michael Nigro, Sridhar Krishnan:
SARdBScene: Dataset and ResNet Baseline for Audio Scene Source Counting and Analysis. 1-5 - Enes Krijestorac, Hazem Sallouha, Shamik Sarkar, Danijela Cabric:
Agile Radio Map Prediction Using Deep Learning. 1-2 - Lang Wang, Juan Liu, Peng Jiang, Dehua Cao, Baochuan Pang:
DDN: Dynamic Aggregation Enhanced Dual-Stream Network for Medical Image Classification. 1-5 - Han-Sol Lee, Moonkyu Song, Junseo Lee, Yeol-Min Seong, Ducksoo Kim, Kwanghyuk Bae, Seongwook Song:
An Antispoofing Approach in Biometric Authentication System for a Smartcard. 1-5 - Pierre Houdouin, Esa Ollila, Frédéric Pascal:
Regularized EM Algorithm. 1-5 - Shreyas Jaiswal, Ruchi Pandey, Santosh Nannuru:
Deep Architecture for DOA Trajectory Localization. 1-5 - Rumeysa Bodur, Binod Bhattarai, Tae-Kyun Kim:
Joint Training of Hierarchical GANs and Semantic Segmentation for Expression Translation. 1-5 - Ya Tang, Xiongjun Ye, Xuanya Li, Zhineng Chen:
Multi-Object Localization and Irrelevant-Semantic Separation for Nuclei Segmentation in Histopathology Images. 1-5 - Bangjian Zhou, Jieming Pan, Maheswari Sivan, Aaron Voon-Yew Thean, J. Senthilnath:
Quantile Online Learning for Semiconductor Failure Analysis. 1-5 - Qi Zhang, Zhongchang Sun, Luis C. Herrera, Shaofeng Zou:
Data-Driven Quickest Change Detection in Markov Models. 1-5 - Florian Hilgemann, Peter Jax:
Order Reduction of Multi-Channel FIR Filters by Balanced Truncation. 1-5 - Chengyou Jia, Minnan Luo, Zhuohang Dang, Xiaojun Chang, Qinghua Zheng:
Towards Real-Time Person Search with Invariant Feature Learning. 1-5 - Jinchao Li, Xixin Wu, Kaitao Song, Dongsheng Li, Xunying Liu, Helen Meng:
A Hierarchical Regression Chain Framework for Affective Vocal Burst Recognition. 1-5 - Yicheng Xiao, Yue Ma, Shuyan Li, Hantao Zhou, Ran Liao, Xiu Li:
SemanticAC: Semantics-Assisted Framework for Audio Classification. 1-5 - Xingrong Dong, Zhaoxian Wu, Qing Ling, Zhi Tian:
Distributed Online Learning With Adversarial Participants In An Adversarial Environment. 1-5 - Haitao Xu, Liangfa Wei, Jie Zhang, Jianming Yang, Yannan Wang, Tian Gao, Xin Fang, Li-Rong Dai:
A Multi-Scale Feature Aggregation Based Lightweight Network for Audio-Visual Speech Enhancement. 1-5 - Mingliang Zhai, Kang Ni, Jiucheng Xie, Hao Gao:
Spike-Based Optical Flow Estimation Via Contrastive Learning. 1-5 - Georgios Chochlakis, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, Shrikanth Narayanan:
Leveraging Label Correlations in a Multi-Label Setting: a Case Study in Emotion. 1-5 - Shammur Absar Chowdhury, Ahmed Ali:
Multilingual Word Error Rate Estimation: E-WER3. 1-5 - Jiexin Wang, Jiahao Chen, Bing Su:
Toward Auto-Evaluation With Confidence-Based Category Relation-Aware Regression. 1-5 - Jizhou Li, Bin Chen, Guibin Zan, Guannan Qian, Piero Pianetta, Yijin Liu:
Subspace Modeling Enabled High-Sensitivity X-Ray Chemical Imaging. 1-5 - Zhizheng Yang, Xun Wang, Dongyu Xia, Wei Wang, Haipeng Dai:
Sequence-Based Device-Free Gesture Recognition Framework for Multi-Channel Acoustic Signals. 1-5 - Sergey Novoselov, Vladimir Volokhov, Galina Lavrentyeva:
Universal Speaker Recognition Encoders for Different Speech Segments Duration. 1-5 - Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee:
Leveraging Phone-Level Linguistic-Acoustic Similarity For Utterance-Level Pronunciation Scoring. 1-5 - Wang Chen, Peizhen Chen, Weijie Chen, Luojun Lin:
Customized Automatic Face Beautification. 1-5 - Caitlin Richter, Jón Guðnason:
Relative Dynamic Time Warping Comparison for Pronunciation Errors. 1-5 - Tian Feng, Qiming Chen, Yao Shi, Xun Lang, Lei Xie, Hongye Su:
A Hybrid Deep Neural Network for Nonlinear Causality Analysis in Complex Industrial Control System. 1-5 - Haoran Zhao, Nan Li, Runqiang Han, Xiguang Zheng, Chen Zhang, Liang Guo, Bing Yu:
A Low-Latency Deep Hierarchical Fusion Network for Fullband Acoustic Echo Cancellation. 1-2 - Seong-Gyun Leem, Daniel Fulford, Jukka-Pekka Onnela, David Gard, Carlos Busso:
Adapting a Self-Supervised Speech Representation for Noisy Speech Emotion Recognition by Using Contrastive Teacher-Student Learning. 1-5 - Nicolas Horst, Priyanka Das, Mathias Wien:
A Template Matching Approach for Reference Picture Padding in Video Coding. 1-5 - Masanori Tsujikawa, Akihiko Sugiyama, Ken Hanazawa, Yoshinobu Kajikawa:
Linear Microphone Array Parallel to the Driving Direction for in-Car Speech Enhancement. 1-5 - Zhenxiao Cheng, Jie Zhou, Wen Wu, Qin Chen, Liang He:
Tell Model Where to Attend: Improving Interpretability of Aspect-Based Sentiment Classification via Small Explanation Annotations. 1-5 - Zhenzhen You, Yan Yan, Zhenghao Shi, Minghua Zhao, Jing Yan, Haiqin Liu, Xinhong Hei, Xiaoyong Ren:
Laryngeal Leukoplakia Classification Via Dense Multiscale Feature Extraction in White Light Endoscopy Images. 1-5 - Zihan Zhao, Yu Wang, Yanfeng Wang:
Knowledge-Aware Bayesian Co-Attention for Multimodal Emotion Recognition. 1-5 - Ting-Wei Lin, Chao-Lin Liu, Li Su:
Audio-Driven Facial Landmark Generation in Violin Performance using 3DCNN Network with Self Attention Model. 1-5 - Chin-Yun Yu, Sung-Lin Yeh, György Fazekas, Hao Tang:
Conditioning and Sampling in Variational Diffusion Models for Speech Super-Resolution. 1-5 - Xingke Song, Xiaoying Yang, Jianfeng Ren, Ruibin Bai, Xudong Jiang:
Solving Jigsaw Puzzle of Large Eroded Gaps Using Puzzlet Discriminant Network. 1-5 - Jianing Long, Qingmeng Zhu, Hao He, Zhipeng Yu, Qilin Zhang, Zhihong Zhang:
3D Point Cloud Completion Based on Multi-Scale Degradation. 1-5 - Bin Yang, Jun Chen, Mang Ye:
Top-K Visual Tokens Transformer: Selecting Tokens for Visible-Infrared Person Re-Identification. 1-5 - Yoav Noah, Nir Shlezinger:
Distributed ADMM with Limited Communications Via Deep Unfolding. 1-5 - Stijn Kindt, Jenthe Thienpondt, Nilesh Madhu:
Exploiting Speaker Embeddings for Improved Microphone Clustering and Speech Separation in ad-hoc Microphone Arrays. 1-5 - Djallel Bouneffouf, Oznur Alkan, Raphaël Féraud, Baihan Lin:
Question Answering System with Sparse and Noisy Feedback. 1-5 - Zhezheng Hao, Zhoumin Lu, Feiping Nie, Rong Wang, Xuelong Li:
Multi-View K-Means with Laplacian Embedding. 1-5 - Matthew J. Goupell, Marjan Davoodian, Sarah Weinstein, David Gadzinski, Dmitry N. Zotkin, Kaushik Sethunath, Ramani Duraiswami:
Rapid Audiometric Evaluation for Personalized Headphone Listening. 1-5 - Prateek Verma, Chris Chafe:
A Content Adaptive Learnable "Time-Frequency" Representation for audio Signal Processing. 1-5 - Salamata Konate, Léo Lebrat, Rodrigo Santa Cruz, Clinton Fookes, Andrew P. Bradley, Olivier Salvado:
Bias Identification with RankPix Saliency. 1-5 - Othman Istaiteh, Yasmeen Kussad, Yahya Daqour, Maria Habib, Mohammad Habash, Dhananjaya Gowda:
A Transformer-Based E2E SLU Model for Improved Semantic Parsing. 1-2 - Arian Bakhtiarnia, Nemanja Milosevic, Qi Zhang, Dragana Bajovic, Alexandros Iosifidis:
Dynamic Split Computing for Efficient Deep EDGE Intelligence. 1-5 - Ju-Hyung Lee, Joohan Lee, Seon-Ho Lee, Andreas F. Molisch:
PMNet: Large-Scale Channel Prediction System for ICASSP 2023 First Pathloss Radio Map Prediction Challenge. 1-2 - Omid Rezaei, Mohammad Mahdi Naghsh, Seyed Mohammad Karbasi, Mohammad Mahdi Nayebi:
Resource Allocation for UAV-Enabled Integrated Sensing and Communication (ISAC) via Multi-Objective Optimization. 1-5 - Sizhe Chen, Qinghua Tao, Zhixing Ye, Xiaolin Huang:
Measuring the Transferability of ℓ∞ Attacks by the ℓ2 Norm. 1-5 - Zhenyao He, Wei Xu, Hong Shen, Derrick Wing Kwan Ng, Yonina C. Eldar, Xiaohu You:
Integrated Sensing and Full-Duplex Communication: Joint Transceiver Beamforming and Power Allocation. 1-5 - Zeyu Wang, Haibin Shen, Changyou Men, Quan Sun, Kejie Huang:
Thermal Infrared Image Inpainting Via Edge-Aware Guidance. 1-5 - Haole Ke, Lin Li, Peipei Wang, Jingling Yuan, Xiaohui Tao:
Tree-Like Interaction Learning for Bundle Recommendation. 1-5 - Martin Gölz, Abdelhak M. Zoubir, Visa Koivunen:
Spatial Inference Using Censored Multiple Testing with FDR Control. 1-5 - Alexandros Gkillas, Dimitris Ampeliotis, Kostas Berberidis:
A Highly Interpretable Deep Equilibrium Network for Hyperspectral Image Deconvolution. 1-5 - Jingzhou Hu, Kejun Huang:
Identifiable Bounded Component Analysis Via Minimum Volume Enclosing Parallelotope. 1-5 - Weiji Zhao, Kefeng Huang, Chongyang Zhang:
Modulation-Based Center Alignment and Motion Mining for Spatial Temporal Action Detection. 1-5 - Yao Wei, Haoxiang Wang, Mingze Sun, Jiawang Liu:
Attention Based Relation Network for Facial Action Units Recognition. 1-5 - Yunpeng Bai, Yayuan Xiao, Xuan Hou, Ying Li, Changjing Shang, Qiang Shen:
SAR Image Despeckling with Residual-in-Residual Dense Generative Adversarial Network. 1-5 - Youngjun Kwak, Minyoung Jung, Hunjae Yoo, Jinho Shin, Changick Kim:
Liveness Score-Based Regression Neural Networks for Face Anti-Spoofing. 1-5 - Gaosheng Zhang, Shilei Miao, Linghui Tang, Peijia Qian:
A Two-Stage System for Spoken Language Understanding. 1-2 - Emilie Chouzenoux, Víctor Elvira:
Graphit: Iterative Reweighted ℓ1 Algorithm for Sparse Graph Inference in State-Space Models. 1-5 - Cyprien Gille, Frédéric Guyard, Michel Barlaud:
A New Semi-Supervised Classification Method Using a Supervised Autoencoder for Biomedical Applications. 1-5 - Elsa Rizk, Stefan Vlaski, Ali H. Sayed:
Local Graph-Homomorphic Processing for Privatized Distributed Systems. 1-5 - Zhi Zhou, Xianjin Li, Jia He, Xiaoyan Bi, Yan Chen, Guangjian Wang, Peiying Zhu:
6G Integrated Sensing and Communication - Sensing Assisted Environmental Reconstruction and Communication. 1-5 - Payal Mohapatra, Bashima Islam, Md Tamzeed Islam, Ruochen Jiao, Qi Zhu:
Efficient Stuttering Event Detection Using Siamese Networks. 1-5 - Peiyu Zhang, Ayush Bhandari:
Unlimited Sampling in Phase Space. 1-5 - Yu Wu, Dongfang Shen, Jiabao Jin, Guanping Xu, Yinran Chen, Xiongbiao Luo:
Local-Global Progressive U-Transformers for Accurate Hepatic and Portal Veins Segmentation in Abdominal MR Images. 1-5 - Zijian Gao, Kele Xu, Hongda Jia, Tianjiao Wan, Bo Ding, Dawei Feng, Xinjun Mao, Huaimin Wang:
Complementary Learning System Based Intrinsic Reward in Reinforcement Learning. 1-5 - Roy Sheffer, Yossi Adi:
I Hear Your True Colors: Image Guided Audio Generation. 1-5 - Bamelak Tadele, Volodymyr Shyianov, Faouzi Bellili, Amine Mezghani:
Channel Estimation with Tightly-Coupled Antenna Arrays. 1-5 - Fernando Pedraza, Giuseppe Caire:
Neurally Augmented State Space Model for Simultaneous Communication and Tracking with Low Complexity Receivers. 1-5 - Yujie Zheng, Chong Wang, Yi Chen, Jiangbo Qian, Jun Wang, Jiafei Wu:
Enlightening the Student in Knowledge Distillation. 1-5 - Guanghao Meng, Tao Dai, Bin Chen, Naiqi Li, Yong Jiang, Shu-Tao Xia:
Difficulty-Aware Data Augmentor for Scene Text Recognition. 1-5 - Anna Lopatnikova, Minh-Ngoc Tran:
Quantum Variational Bayes on Manifolds. 1-5 - Junghyun Koo, Marco A. Martínez Ramírez, Wei-Hsiang Liao, Stefan Uhlich, Kyogu Lee, Yuki Mitsufuji:
Music Mixing Style Transfer: A Contrastive Learning Approach to Disentangle Audio Effects. 1-5 - Hyemi Kim, Jiyun Park, Taegyun Kwon, Dasaem Jeong, Juhan Nam:
A Study of Audio Mixing Methods for Piano Transcription in Violin-Piano Ensembles. 1-5 - Han Han, Tao Jiang, Wei Yu:
Active Beam Tracking with Reconfigurable Intelligent Surface. 1-5 - Kevin Scheck, Tanja Schultz:
Multi-Speaker Speech Synthesis from Electromyographic Signals by Soft Speech Unit Prediction. 1-5 - George Retsinas, Giorgos Sfikas, Panagiotis Paraskevas Filntisis, Petros Maragos:
Newton-Based Trainable Learning Rate. 1-5 - Junyu Liu, Jianfeng Ren, Hongliang Sun, Xudong Jiang:
Face Recognition on Point Cloud with cGAN-TOP for Denoising. 1-5 - Amir Weiss, Andrew C. Singer, Gregory W. Wornell:
Towards Robust Data-Driven Underwater Acoustic Localization: A Deep CNN Solution with Performance Guarantees for Model Mismatch. 1-5 - Jiahao Xu, Xufeng Yan, Cui Peng, Xinquan Wu, Lipeng Gu, Yanbiao Niu:
UAV Local Path Planning Based on Improved Proximal Policy Optimization Algorithm. 1-5 - Ju-Seok Seong, Jeong-Hwan Choi, Jehyun Kyung, Ye-Rin Jeoung, Joon-Hyuk Chang:
Noise-Aware Target Extension with Self-Distillation for Robust Speech Recognition. 1-5 - Hassan Taherian, DeLiang Wang:
Multi-Resolution Location-Based Training for Multi-Channel Continuous Speech Separation. 1-5 - Sanglee Park, Seung-won Hwang, Jungmin So:
SMCL: Saliency Masked Contrastive Learning for Long-Tailed Visual Recognition. 1-5 - Hessa Alfalahi, Ahsan Khandoker, Ghada Alhussein, Leontios J. Hadjileontiadis:
Cochlear Decomposition: A Novel Bio-Inspired Multiscale Analysis Framework. 1-5 - Xudong Pan, Mi Zhang, Duocai Wu:
RØROS: Building a Responsive Online Recommender System via Meta-Gradients Updating. 1-5 - Zhongweiyang Xu, Xulin Fan, Mark Hasegawa-Johnson:
Dual-Path Cross-Modal Attention for Better Audio-Visual Speech Extraction. 1-5 - Yifei Shen, Yuqing Ren, Andreas Toftegaard Kristensen, Xiaohu You, Chuan Zhang, Andreas Burg:
Improved Belief Propagation Decoding of Turbo Codes. 1-5 - Yadong Guan, Guibin Zheng, Jiqing Han, Huanliang Wang:
Subband Dependency Modeling for Sound Event Detection. 1-5 - Jiajiong Cao, Yufan Liu, Weiming Bai, Jingting Ding, Liang Li:
Nasty-SFDA: Source Free Domain Adaptation from a Nasty Model. 1-5 - Kehai Qiu, Stefanos Bakirtzis, Hui Song, Ian J. Wassell, Jie Zhang:
Deep Learning-Based Path Loss Prediction for Outdoor Wireless Communication Systems. 1-2 - Mohamed Elminshawi, Srikanth Raj Chetupalli, Emanuël A. P. Habets:
Beamformer-Guided Target Speaker Extraction. 1-5 - Mufan Sang, Yong Zhao, Gang Liu, John H. L. Hansen, Jian Wu:
Improving Transformer-Based Networks with Locality for Automatic Speaker Verification. 1-5 - Paula Andrea Pérez-Toro, Dalia Rodríguez-Salas, Tomás Arias-Vergara, Sebastian P. Bayerl, Philipp Klumpp, Korbinian Riedhammer, Maria Schuster, Elmar Nöth, Andreas K. Maier, Juan Rafael Orozco-Arroyave:
Transferring Quantified Emotion Knowledge for the Detection of Depression in Alzheimer's Disease Using Forestnets. 1-5 - Han-Mo Ou, Naresh R. Shanbhag:
Enhancing the Accuracy of Resistive In-Memory Architectures using Adaptive Signal Processing. 1-5 - Gaku Narita, Junichi Shimizu, Taketo Akama:
GANStrument: Adversarial Instrument Sound Synthesis with Pitch-Invariant Instance Conditioning. 1-5 - Valentin Bolz, Johannes Rueß, Andreas Zell:
Data-Driven Graph Convolutional Neural Networks for Power System Contingency Analysis. 1-5 - Randall Balestriero, Yann LeCun:
Fast and Exact Enumeration of Deep Networks Partitions Regions. 1-5 - Neel Bhandari, Pin-Yu Chen:
Lost In Translation: Generating Adversarial Examples Robust to Round-Trip Translation. 1-5 - Giovana Morais, Matthew E. P. Davies, Marcelo Queiroz, Magdalena Fuentes:
Tempo vs. Pitch: Understanding Self-Supervised Tempo Estimation. 1-5 - Xian Zhong, Wei Li, Liang Liao, Jing Xiao, Wenxuan Liu, Wenxin Huang, Zheng Wang:
Bat: Bi-Alignment Based On Transformation in Multi-Target Domain Adaptation for Semantic Segmentation. 1-5 - Yonghao Liu, Di Liang, Fang Fang, Sirui Wang, Wei Wu, Rui Jiang:
Time-Aware Multiway Adaptive Fusion Network for Temporal Knowledge Graph Question Answering. 1-5 - Zexin Fan, Kejiang Chen, Chuan Qin, Kai Zeng, Weiming Zhang, Nenghai Yu:
Image Adversarial Steganography Based on Joint Distortion. 1-5 - Abinay Reddy Naini, Mary A. Kohler, Carlos Busso:
Unsupervised Domain Adaptation for Preference Learning Based Speech Emotion Recognition. 1-5 - Frederik Bous, Axel Roebel:
Analysis and Transformation of Voice Level in Singing Voice. 1-5 - Rehana Mahfuz, Yinyi Guo, Erik Visser:
Improving Audio Captioning Using Semantic Similarity Metrics. 1-5 - Layne Berry, Yi-Jen Shih, Hsuan-Fu Wang, Heng-Jui Chang, Hung-Yi Lee, David Harwath:
M-SpeechCLIP: Leveraging Large-Scale, Pre-Trained Models for Multilingual Speech to Image Retrieval. 1-5 - Ya Jiang, Hang Chen, Jun Du, Qing Wang, Chin-Hui Lee:
Incorporating Lip Features into Audio-Visual Multi-Speaker DOA Estimation by Gated Fusion. 1-5 - Dongyue Li, Yaping Yan, Dong Liang, Songlin Du:
MSFORMER: Multi-Scale Transformer with Neighborhood Consensus for Feature Matching. 1-5 - Yakun Ju, Kin-Man Lam, Jun Xiao, Cong Zhang, Cuixin Yang, Junyu Dong:
Efficient Feature Fusion for Learning-Based Photometric Stereo. 1-5 - Bo-Wen Zhang, Yan Yan, Jiapei Yu:
Contrastive Learning of Sentence Embeddings in Product Search. 1-5 - Leonardo Spampinato, Alessia Tarozzi, Chiara Buratti, Riccardo Marini:
DRL Path Planning for UAV-Aided V2X Networks: Comparing Discrete to Continuous Action Spaces. 1-5 - Jianrong Wang, Yaxin Zhao, Hongkai Fan, Tianyi Xu, Qi Li, Sen Li, Li Liu:
Memory-Augmented Contrastive Learning for Talking Head Generation. 1-5 - Zhiyuan Peng, Mingjie Shao, Xuanji He, Xu Li, Tan Lee, Ke Ding, Guanglu Wan:
Covariance Regularization for Probabilistic Linear Discriminant Analysis. 1-5 - Koyo Sato, Shunsuke Ono:
Robust Hyperspectral Anomaly Detection with Simultaneous Mixed Noise Removal via Constrained Convex Optimization. 1-5 - Onkar Susladkar, Prajwal Gatti, Santosh Kumar Yadav:
SLBERT: A Novel Pre-Training Framework for Joint Speech and Language Modeling. 1-5 - Wuti Xiong:
CD-FSOD: A Benchmark For Cross-Domain Few-Shot Object Detection. 1-5 - Huaying Xue, Xiulian Peng, Yan Lu:
Contrast-PLC: Contrastive Learning for Packet Loss Concealment. 1-5 - Zhi Zhong, Masato Hirano, Kazuki Shimada, Kazuya Tateishi, Shusuke Takahashi, Yuki Mitsufuji:
An Attention-Based Approach to Hierarchical Multi-Label Music Instrument Classification. 1-5 - Zhengding Luo, Dongyuan Shi, Xiaoyi Shen, Junwei Ji, Woon-Seng Gan:
Deep Generative Fixed-Filter Active Noise Control. 1-5 - Pourya Shamsolmoali, Masoumeh Zareapoor, Eric Granger:
Image Completion Via Dual-Path Cooperative Filtering. 1-5 - Kyusung Seo, Joonhyung Park, Jaeyun Song, Eunho Yang:
WeavSpeech: Data Augmentation Strategy For Automatic Speech Recognition Via Semantic-Aware Weaving. 1-5 - Yifan Peng, Jaesong Lee, Shinji Watanabe:
I3D: Transformer Architectures with Input-Dependent Dynamic Depth for Speech Recognition. 1-5 - Heejin Do, Yunsu Kim, Gary Geunbae Lee:
Hierarchical Pronunciation Assessment with Multi-Aspect Attention. 1-5 - Zilong Li, Qianqian Ren, Long Chen, Jianguo Sun:
Dual-Stage Graph Convolution Network With Graph Learning For Traffic Prediction. 1-5 - Ziyang Luo, Zhipeng Hu, Yadong Xi, Rongsheng Zhang, Jing Ma:
I-Tuning: Tuning Frozen Language Models with Image for Lightweight Image Captioning. 1-5 - Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng:
Context-Aware Coherent Speaking Style Prediction with Hierarchical Transformers for Audiobook Speech Synthesis. 1-5 - Shuaitao Zhang, Yuan Zhang, Zheng Zhao, Di Xie, Shiliang Pu:
HPFTN: Hierarchical Progressive Fusion Transformer Network for Video Denoising. 1-5 - Xiangyu Yang, Boris Joukovsky, Nikos Deligiannis:
Relevance Propagation through Deep Conditional Random Fields. 1-5 - François Grondin, Marc-Antoine Maheux, Jean-Samuel Lauzon, Jonathan Vincent, François Michaud:
Fast Cross-Correlation for TDoA Estimation on Small Aperture Microphone Arrays. 1-5 - Xueqi Gao, Chao Xu, Yihang Song, Jing Hu, Jian Xiao, Zhaopeng Meng:
Node-Wise Domain Adaptation Based on Transferable Attention for Recognizing Road Rage via EEG. 1-5 - Haoxu Wang, Ming Cheng, Qiang Fu, Ming Li:
The DKU Post-Challenge Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge: Deep Analysis. 1-5 - Linlin Yang, Hongying Liu, Fanhua Shang, Yuanyuan Liu:
Adaptive Non-Local Generative Adversarial Networks for Low-Dose CT Image Denoising. 1-5 - Yulu Jin, Lifeng Lai:
Adversarially Robust Fairness-Aware Regression. 1-5 - Chen Wang, Jiang Zhong, Qizhu Dai, Yafei Qi, Rongzhen Li, Qin Lei, Bin Fang, Xue Li:
PRRD: Pixel-Region Relation Distillation For Efficient Semantic Segmentation. 1-5 - Shuting Dong, Feng Lu, Chun Yuan:
Frequency Reciprocal Action and Fusion for Single Image Super-Resolution. 1-5 - Eun Som Jeon, Suhas Lohit, Rushil Anirudh, Pavan K. Turaga:
Robust Time Series Recovery and Classification Using Test-Time Noise Simulator Networks. 1-5 - Jiayi Tian, Chao Fang, Haonan Wang, Zhongfeng Wang:
BEBERT: Efficient And Robust Binary Ensemble BERT. 1-5 - Junlin Liu, Xinchen Lyu:
Distance-Based Online Label Inference Attacks Against Split Learning. 1-5 - Steven M. Hernandez, Ding Zhao, Shaojin Ding, Antoine Bruguier, Rohit Prabhavalkar, Tara N. Sainath, Yanzhang He, Ian McGraw:
Sharing Low Rank Conformer Weights for Tiny Always-On Ambient Speech Recognition Models. 1-5 - Yuan Cao, Danchen Zhang, Xin Zheng, Hongming Shan, Junping Zhang:
Mutual Information Based Reweighting for Precipitation Nowcasting. 1-5 - Weijun Huang, Jia Huang, Guowei Wang, Hongzhou Lu, Min He, Wenjin Wang:
Exploiting CCTV Cameras for Hand Hygiene Recognition in ICU. 1-5 - Zhong Meng, Weiran Wang, Rohit Prabhavalkar, Tara N. Sainath, Tongzhou Chen, Ehsan Variani, Yu Zhang, Bo Li, Andrew Rosenberg, Bhuvana Ramabhadran:
JEIT: Joint End-to-End Model and Internal Language Model Training for Speech Recognition. 1-5 - Ti Wang, Hong Liu, Runwei Ding, Wenhao Li, Yingxuan You, Xia Li:
Interweaved Graph and Attention Network for 3D Human Pose Estimation. 1-5 - Nilaksh Das, Monica Sunkara, Sravan Bodapati, Jinglun Cai, Devang Kulshreshtha, Jeff Farris, Katrin Kirchhoff:
Mask the Bias: Improving Domain-Adaptive Generalization of CTC-Based ASR with Internal Language Model Estimation. 1-5 - Nafiul Rashid, Md Mahbubur Rahman, Tousif Ahmed, Jilong Kuang, Jun Alex Gao:
BreathIE: Estimating Breathing Inhale Exhale Ratio Using Motion Sensor Data from Consumer Earbuds. 1-5 - Linfeng Feng, Yijun Gong, Xiao-Lei Zhang:
Soft Label Coding for end-to-end Sound Source Localization with ad-hoc Microphone Arrays. 1-5 - Huixiang Wen, Shan Chang, Luo Zhou:
Light Projection-Based Physical-World Vanishing Attack Against Car Detection. 1-5 - Hui Chen, Hanyi Zhang, Longbiao Wang, Kong Aik Lee, Meng Liu, Jianwu Dang:
Self-Supervised Audio-Visual Speaker Representation with Co-Meta Learning. 1-5 - Arshdeep Singh, Mark D. Plumbley:
Efficient Similarity-Based Passive Filter Pruning for Compressing CNNs. 1-5 - Gongping Huang, Jacob Benesty, Israel Cohen, Emil Winebrand, Jingdong Chen, Walter Kellermann:
Switching Kronecker Product Linear Filtering for Multispeaker Adaptive Speech Dereverberation. 1-5 - Disheng Li, Wei Liu, Yuriy V. Zakharov, Paul D. Mitchell:
Graph Signal Processing for Narrowband Direction of Arrival Estimation. 1-5 - Shuo Zhang, Jing Liu:
Anomalous Signal Detection for Cyber-Physical Systems Using Interpretable Causal Neural Network. 1-5 - George Dimas, Anastasios Koulaouzidis, Dimitris K. Iakovidis:
Co-Operative CNN for Visual Saliency Prediction on WCE Images. 1-5 - Shengdi Qin, Shunli Zhang, Yu Zhang, Haoyu Gao:
CAENet: Using Collaborative Attention Transformer and Add-Boost Strategy for Single Image Deraining. 1-5 - Yuhe Ding, Jian Liang, Jie Cao, Aihua Zheng, Ran He:
Modify: Model-Driven Face Stylization Without Style Images. 1-5 - Taihui Li, Zhong Zhuang, Hengkang Wang, Ju Sun:
Random Projector: Efficient Deep Image Prior. 1-5 - Chunyu Qiang, Peng Yang, Hao Che, Ying Zhang, Xiaorui Wang, Zhongyuan Wang:
Improving Prosody for Cross-Speaker Style Transfer by Semi-Supervised Style Extractor and Hierarchical Modeling in Speech Synthesis. 1-5 - Xian Zhong, Aoyu Yi, Wenxuan Liu, Wenxin Huang, Chengming Zou, Zheng Wang:
Background-Weakening Consistency Regularization for Semi-Supervised Video Action Detection. 1-5 - Yuan-Pei Lin, Ting-Ming Yang:
Robust Angle Estimation for Hybrid mmWave Systems. 1-5 - Atsunori Ogawa, Takafumi Moriya, Naoyuki Kamo, Naohiro Tawara, Marc Delcroix:
Iterative Shallow Fusion of Backward Language Model for End-To-End Speech Recognition. 1-5 - Carlos Alejandro López, Jaume Riba:
Data Driven Joint Sensor Fusion and Regression Based on Geometric Mean Squared Error. 1-5 - Songpei Xu, Chaitanya Kaul, Xuri Ge, Roderick Murray-Smith:
Continuous Interaction with A Smart Speaker via Low-Dimensional Embeddings of Dynamic Hand Pose. 1-5 - Yanxing Wang, Shengqi Zhu, Guisheng Liao, Lan Lan, Zhuochen Chen, Feilong Liu:
Resolving Doppler Ambiguity Via Spread Phase Alignment in FDA-MIMO Radar. 1-5 - Verena Lachner, Katharina Schaar, Ralf Zimmermann:
CSM In Motion Vector Steganalysis: The Effect of Coders on Motion Vectors in H.264 Video Encoding. 1-5 - Yu Zheng, David C. Zhu, Jian Ren, Taosheng Liu, Karl J. Friston, Tongtong Li:
A Mathematical Model for Neuronal Activity and Brain Information Processing Capacity. 1-5 - Hsuan-Jui Chen, Yen Meng, Hung-yi Lee:
Once-for-All Sequence Compression for Self-Supervised Speech Models. 1-5 - Taylan Kargin, Fariborz Salehi, Babak Hassibi:
Asymptotic Distribution of Stochastic Mirror Descent Iterates in Average Ensemble Models. 1-5 - Aymane Abdali, Vincent Gripon, Lucas Drumetz, Bartosz Boguslawski:
Active Learning for Efficient Few-Shot Classification. 1-5 - Zhenyu Piao, Miseul Kim, Hyungchan Yoon, Hong-Goo Kang:
HappyQuokka System for ICASSP 2023 Auditory EEG Challenge. 1-2 - Plácido L. Vidal, Joaquim de Moura, Jorge Novo, Marcos Ortega, Jaime S. Cardoso:
Transformer-Based Multi-Prototype Approach for Diabetic Macular Edema Analysis in OCT Images. 1-5 - Can Han, Suncheng Xiang, Dahong Qian:
MTDL-NET: Morphological and Temporal Discriminative Learning for Heartbeat Classification. 1-5 - Kuan-Lin Chen, Daniel D. E. Wong, Ke Tan, Buye Xu, Anurag Kumar, Vamsi Krishna Ithapu:
Leveraging Heteroscedastic Uncertainty in Learning Complex Spectral Mapping for Single-Channel Speech Enhancement. 1-5 - Petteri Pulkkinen, Visa Koivunen:
Model-Free Online Learning for Waveform Optimization In Integrated Sensing And Communications. 1-5 - Ye-Rin Jeoung, Joon-Young Yang, Jeong-Hwan Choi, Joon-Hyuk Chang:
Improving Transformer-Based End-to-End Speaker Diarization by Assigning Auxiliary Losses to Attention Heads. 1-5 - Jie Tan, Hengyi Cai, Hongshen Chen, Hong Cheng, Helen Meng, Zhuoye Ding:
Contrastive Learning with Dialogue Attributes for Neural Dialogue Generation. 1-5 - Boyu Hou, Chengyu Wang, Xiaoqing Chen, Minghui Qiu, Liang Feng, Jun Huang:
Prompt-Distiller: Few-Shot Knowledge Distillation for Prompt-Based Language Learners with Dual Contrastive Learning. 1-5 - Dat Thanh Nguyen, Kamal Gopikrishnan Nambiar, André Kaup:
Deep Probabilistic Model for Lossless Scalable Point Cloud Attribute Compression. 1-5 - Qiquan Xiao, Yuan Zhang, Xuanya Li, Kai Hu:
Boundary Cue Guidance and Contextual Feature Mining for Glass Segmentation. 1-5 - William Chettleburgh, Zhishen Huang, Ming-Hsuan Yang:
Fast Robust Principle Component Analysis Using Gauss-Newton Iterations. 1-5 - Yuntao Li, Zhenpeng Su, Yutian Li, Hanchu Zhang, Sirui Wang, Wei Wu, Yan Zhang:
T5-SR: A Unified Seq-to-Seq Decoding Strategy for Semantic Parsing. 1-5 - Luca Barbieri, Bernardo Camajori Tedeschini, Mattia Brambilla, Monica Nicoli:
Implicit Vehicle Positioning with Cooperative Lidar Sensing. 1-5 - Zehua Zhang, Shiyun Xu, Xuyi Zhuang, Lianyu Zhou, Heng Li, Mingjiang Wang:
Two-Stage UNet with Multi-Axis Gated Multilayer Perceptron for Monaural Noisy-Reverberant Speech Enhancement. 1-5 - Syed A. Hamza, Kyle Juretus, Moeness G. Amin, Fauzia Ahmad:
Deep Learning Sparse Array Design Using Binary Switching Configurations. 1-5 - Sung-Feng Huang, Chia-Ping Chen, Zhi-Sheng Chen, Yu-Pao Tsai, Hung-Yi Lee:
Personalized Lightweight Text-to-Speech: Voice Cloning with Adaptive Structured Pruning. 1-5 - Yuma Shirahata, Ryuichi Yamamoto, Eunwoo Song, Ryo Terashima, Jae-Min Kim, Kentaro Tachibana:
Period VITS: Variational Inference with Explicit Pitch Modeling for End-To-End Emotional Speech Synthesis. 1-5 - Grant Davidson, Mark Vinton, Per Ekstrand, Cong Zhou, Lars F. Villemoes, Lie Lu:
High Quality Audio Coding with MDCTNet. 1-5 - Zhenghao Guo, Verity M. McClelland, Wei Dai, Zoran Cvetkovic:
Structured Errors-in-Variables Modelling for Cortico-Muscular Coherence Enhancement. 1-5 - Torsten Schlett, Sebastian Schachner, Christian Rathgeb, Juan E. Tapia, Christoph Busch:
Effect of Lossy Compression Algorithms on Face Image Quality and Recognition. 1-5 - Tal Peer, Simon Welker, Timo Gerkmann:
DiffPhase: Generative Diffusion-Based STFT Phase Retrieval. 1-5 - Dawei Dai, Yutang Li, Liang Wang, Shiyu Fu, Shuyin Xia, Guoyin Wang:
Sketch Less Face Image Retrieval: A New Challenge. 1-5 - Wen Cheng, Shichen Dong, Wei Wang:
W2KPE: Keyphrase Extraction with Word-Word Relation. 1-2 - Muhammad Saad Saeed, Shah Nawaz, Muhammad Haris Khan, Muhammad Zaigham Zaheer, Karthik Nandakumar, Muhammad Haroon Yousaf, Arif Mahmood:
Single-branch Network for Multimodal Training. 1-5 - Changzeng Fu, Zhenghan Chen, Jiaqi Shi, Bowen Wu, Chaoran Liu, Carlos Toshinori Ishi, Hiroshi Ishiguro:
HAG: Hierarchical Attention with Graph Network for Dialogue Act Classification in Conversation. 1-5 - Dror Jacoby, Jonatan Ostrometzky, Hagit Messer:
Model-based vs. Data-driven Approaches for Predicting Rain-induced Attenuation in Commercial Microwave Links: A Comparative Empirical Study. 1-5 - Muzhou Yu, Sia Huat Tan, Kailu Wu, Runpei Dong, Linfeng Zhang, Karsheng Ma:
CORSD: Class-Oriented Relational Self Distillation. 1-5 - Jun Xue, Cunhang Fan, Jiangyan Yi, Chenglong Wang, Zhengqi Wen, Dan Zhang, Zhao Lv:
Learning From Yourself: A Self-Distillation Method For Fake Speech Detection. 1-5 - Zhangying Weng, Peng Li, Xin Zhuang, Xuefeng Yan, Lina Gong, Haoran Xie, Mingqiang Wei:
ifUNet++: Iterative Feedback UNet++ for Infrared Small Target Detection. 1-5 - Weihang Ding, Mohammad Shikh-Bahaei:
HARQ Delay Minimization of 5G Wireless Network with Imperfect Feedback. 1-5 - Yuchen Wong, Qingni Shen, Cong Li, Cunzhan Liu, Tianxiang Ai:
Detecting Malicious Migration on Edge to Prevent Running Data Leakage. 1-5 - Chunyang Fu, Xiang Zhang, Thuong Nguyen-Canh, Xiaozhong Xu, Ge Li, Shan Liu:
Surface-Sampling Based Objective Quality Assessment Metrics for Meshes. 1-5 - Annika Briegleb, Mhd Modar Halimeh, Walter Kellermann:
Exploiting Spatial Information with the Informed Complex-Valued Spatial Autoencoder for Target Speaker Extraction. 1-5 - Ryouichi Nishimura, Kenichi Takizawa:
Simultaneous Estimation of Direction of Arrival and Sound Speed Using a Non-Uniform Sensor Array. 1-5 - Rossen Nenov, Dang-Khoa Nguyen, Peter Balazs:
Faster Than Fast: Accelerating the Griffin-Lim Algorithm. 1-5 - Shengfang Zhai, Qingni Shen, Xiaoyi Chen, Weilong Wang, Cong Li, Yuejian Fang, Zhonghai Wu:
NCL: Textual Backdoor Defense Using Noise-Augmented Contrastive Learning. 1-5 - Yongzi Yu, Wanyong Qiu, Chen Quan, Kun Qian, Zhihua Wang, Yu Ma, Bin Hu, Björn W. Schuller, Yoshiharu Yamamoto:
Federated Intelligent Terminals Facilitate Stuttering Monitoring. 1-5 - Farwa Abbas, Verity M. McClelland, Zoran Cvetkovic, Wei Dai:
SS-ADMM: Stationary and Sparse Granger Causal Discovery for Cortico-Muscular Coupling. 1-5 - Feihu Jin, Jinliang Lu, Jiajun Zhang:
Unified Prompt Learning Makes Pre-Trained Language Models Better Few-Shot Learners. 1-5 - Charalampos Symeonidis, Ioannis Mademlis, Ioannis Pitas, Nikos Nikolaidis:
Efficient Feature Extraction for Non-Maximum Suppression in Visual Person Detection. 1-5 - Akshayaa Magesh, Zhongchang Sun, Venugopal V. Veeravalli, Shaofeng Zou:
Robust Hypothesis Testing With Moment Constrained Uncertainty Sets. 1-5 - Rohan R. Pote, Bhaskar D. Rao:
Light-Weight Sequential SBL Algorithm: An Alternative to OMP. 1-5 - Yunyang Zeng, Joseph Konan, Shuo Han, David Bick, Muqiao Yang, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
TAPLoss: A Temporal Acoustic Parameter Loss for Speech Enhancement. 1-5 - Fang-Qi Li, Shi-Lin Wang, Yun Zhu:
Measure and Countermeasure of the Capsulation Attack Against Backdoor-Based Deep Neural Network Watermarks. 1-5 - Zijian Yang, Wei Zhou, Ralf Schlüter, Hermann Ney:
Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers. 1-5 - Yajie Liu, Xinmeng Xu, Weiping Tu, Yuhong Yang, Li Xiao:
Improving Acoustic Echo Cancellation by Mixing Speech Local and Global Features with Transformer. 1-5 - Tanoj Langore, Te-Cheng Hsu, Yi-Hsien Hsieh, Che Lin:
LE-DTA: Local Extrema Convolution for Drug Target Affinity Prediction. 1-5 - Antonio Agudo:
Detail-Aware Uncalibrated Photometric Stereo. 1-5 - Timur Locher, Guy Revach, Nir Shlezinger, Ruud J. G. van Sloun, Rik Vullings:
Hierarchical Filtering With Online Learned Priors for ECG Denoising. 1-5 - Ioannis C. Tsaknakis, Prashant Khanduri, Mingyi Hong:
An Implicit Gradient Method for Constrained Bilevel Problems Using Barrier Approximation. 1-5 - Tanvir Mahmud, Feng Liang, Yaling Qing, Diana Marculescu:
CLIP4VideoCap: Rethinking CLIP for Video Captioning with Multiscale Temporal Fusion and Commonsense Knowledge. 1-5 - Rowel Atienza:
EfficientSpeech: An On-Device Text to Speech Model. 1-5 - Chang-Sung Sung, Jun-Cheng Chen, Chu-Song Chen:
Hearing and Seeing Abnormality: Self-Supervised Audio-Visual Mutual Learning for Deepfake Detection. 1-5 - Zhifang Guo, Yichong Leng, Yihan Wu, Sheng Zhao, Xu Tan:
PromptTTS: Controllable Text-to-Speech with Text Descriptions. 1-5 - Sébastien Journé, Nicolas Le Bihan, Florent Chatelain, Julien Flamant:
Polarized Signal Singular Spectrum Analysis with Complex SSA. 1-5 - Lequan Lin, Junbin Gao:
A Magnetic Framelet-Based Convolutional Neural Network for Directed Graphs. 1-5 - Shivani Gowda, Yifan Hu, Mandy Korpusik:
Multi-Modal Food Classification in a Diet Tracking System with Spoken and Visual Inputs. 1-5 - Yuhang Yang, Haihua Xu, Hao Huang, Eng Siong Chng, Sheng Li:
Speech-Text Based Multi-Modal Training with Bidirectional Attention for Improved Speech Recognition. 1-5 - Bin Xie, Hao Tang, Bin Duan, Dawen Cai, Yan Yan:
MLP-GAN for Brain Vessel Image Segmentation. 1-5 - Ruixiang Chen, Sheng Liu, Junhao Chen, Bingnan Guo, Feng Zhang:
VLKP: Video Instance Segmentation with Visual-Linguistic Knowledge Prompts. 1-5 - Guanlong Zhao, Quan Wang, Han Lu, Yiling Huang, Ignacio López-Moreno:
Augmenting Transformer-Transducer Based Speaker Change Detection with Token-Level Training Loss. 1-5 - Francisco Teixeira, Alberto Abad, Bhiksha Raj, Isabel Trancoso:
Privacy-Preserving Automatic Speaker Diarization. 1-5 - Haniyeh Ehsani Oskouie, Farzan Farnia:
Interpretation of Neural Networks is Susceptible to Universal Adversarial Perturbations. 1-5 - Xovee Xu, Yutao Wei, Pengyu Wang, Xucheng Luo, Fan Zhou, Goce Trajcevski:
Diffusion Probabilistic Modeling for Fine-Grained Urban Traffic Flow Inference with Relaxed Structural Constraint. 1-5 - Weiquan Fan, Xiaofen Xing, Bolun Cai, Xiangmin Xu:
MGAT: Multi-Granularity Attention Based Transformers for Multi-Modal Emotion Recognition. 1-5 - Jacob J. Webber, Cassia Valentini-Botinhao, Evelyn Williams, Gustav Eje Henter, Simon King:
Autovocoder: Fast Waveform Generation from a Learned Speech Representation Using Differentiable Digital Signal Processing. 1-5 - Alan Yang, Tara Yasmin Mina, Grace Xingxin Gao:
Binary Sequence Set Optimization for CDMA Applications via Mixed-Integer Quadratic Programming. 1-5 - Yang Zhou, Hongxia Wang, Qiang Zeng, Rui Zhang, Sijiang Meng:
A Discriminative Multi-Channel Noise Feature Representation Model for Image Manipulation Localization. 1-5 - Xueyan Zhou, Jiacen Guo, Hao Liu, Chao Wang:
A Fusion-Based and Multi-Layer Method for Low Light Image Enhancement. 1-5 - Puneesh Deora, Christos Thrampoulidis:
On Weighted Cross-Entropy for Label-Imbalanced Separable Data: An Algorithmic-Stability Study. 1-5 - Ansel MacLaughlin, Anna Rumshisky, Rinat Khaziev, Anil Ramakrishna, Yuval Merhav, Rahul Gupta:
Self-Healing Through Error Detection, Attribution, and Retraining. 1-5 - Yuxin Yang, Xia Sun, Qiang Lu, Richard F. E. Sutcliffe, Jun Feng:
A Sentiment and Syntactic-Aware Graph Convolutional Network for Aspect-Level Sentiment Classification. 1-5 - Magda Amiridi, Cheng Qian, Nicholas D. Sidiropoulos, Lucas M. Glass:
Enrollment Rate Prediction in Clinical Trials based on CDF Sketching and Tensor Factorization tools. 1-5 - Zhongling Liu, Rujie Liu, Ziqiang Shi, Liu Liu, Xiaoyu Mi, Kentaro Murase:
Semi-Supervised Contrastive Learning with Soft Mask Attention for Facial Action Unit Detection. 1-5 - Sankha Subhra Bhattacharjee, Liming Shi, Guoli Ping, Xiaoxiang Shen, Mads Græsbøll Christensen:
Study and Design of Robust Personal Sound Zones with VAST Using Low Rank RIRs. 1-5 - Junyi He, Meimei Wu, Meng Li, Xiaobo Zhu, Feng Ye:
Multilevel Transformer for Multimodal Emotion Recognition. 1-5 - Jie Qin, Peng Zheng, Yichao Yan, Rong Quan, Xiaogang Cheng, Bingbing Ni:
Movienet-PS: A Large-Scale Person Search Dataset in the Wild. 1-5 - Mohammad Amin Omidi, Babak Seyfe, Shahrokh Valaee:
Reducing the Computational Complexity of Learning with Random Convolutional Features. 1-5 - Steven Vander Eeckt, Hugo Van hamme:
Using Adapters to Overcome Catastrophic Forgetting in End-to-End Automatic Speech Recognition. 1-5 - Linhan Zhang, Qian Chen, Wen Wang, Chong Deng, Xin Cao, Kongzhang Hao, Yuxin Jiang, Wei Wang:
Weighted Sampling for Masked Language Modeling. 1-5 - Chenyang Gao, Yue Gu, Francesco Calivá, Yuzong Liu:
Self-Supervised Speech Representation Learning for Keyword-Spotting With Light-Weight Transformers. 1-5 - Robert Kuku Fotock, Alessio Zappone, Marco Di Renzo:
Energy Efficiency Maximization in RIS-aided Networks with Global Reflection Constraints. 1-5 - Hongjia Zhai, Hai Li, Hanzhi Zhang, Hujun Bao, Guofeng Zhang:
Self-Distillation Hashing for Efficient Hamming Space Retrieval. 1-5 - Rokia Abdein, Xuezhi Xiang, Ning Lv, Abdulmotaleb El-Saddik:
Deformable Cross Attention for Learning Optical Flow. 1-5 - Lihua Zhang, Quan Liu, Zhigang Huang, Lan Wu:
Learning Unbiased Rewards with Mutual Information in Adversarial Imitation Learning. 1-5 - Shiyu Chen, Wenxin Yu, Qi Wang, Jun Gong, Peng Chen:
Image Inpainting with Semantic-Aware Transformer. 1-5 - Woan-Shiuan Chien, Chi-Chun Lee:
Achieving Fair Speech Emotion Recognition via Perceptual Fairness. 1-5 - Fan Cui, Liyong Guo, Lang He, Jiyao Liu, Ercheng Pei, Yujun Wang, Dongmei Jiang:
Relate Auditory Speech to EEG by Shallow-Deep Attention-Based Network. 1-2 - Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu:
EmoDiff: Intensity Controllable Emotional Text-to-Speech with Soft-Label Guidance. 1-5 - Qin Shi, Liang Liu, Shuowen Zhang:
Joint Data Association, NLOS Mitigation, and Clutter Suppression for Networked Device-Free Sensing in 6G Cellular Network. 1-5 - Kazuyuki Arikawa, Shoichi Koyama, Hiroshi Saruwatari:
Spatial Active Noise Control Method Based on Sound Field Interpolation from Reference Microphone Signals. 1-5 - Cal Peyser, Michael Picheny, Kyunghyun Cho, Rohit Prabhavalkar, W. Ronny Huang, Tara N. Sainath:
A Comparison of Semi-Supervised Learning Techniques for Streaming ASR at Scale. 1-5 - Zehua Zhang, Shiyun Xu, Xuyi Zhuang, Yukun Qian, Lianyu Zhou, Mingjiang Wang:
Half-Temporal and Half-Frequency Attention U2Net for Speech Signal Improvement. 1-2 - Salvatore Calcagno, Raffaele Mineo, Daniela Giordano, Concetto Spampinato:
Ensemble and Personalized Transformer Models for Subject Identification and Relapse Detection in E-Prevention Challenge. 1-2 - Daniel Fejgin, Simon Doclo:
Assisted RTF-Vector-Based Binaural Direction of Arrival Estimation Exploiting A Calibrated External Microphone Array. 1-5 - Akshay S. Bondre, Christ D. Richmond, Ahmed Alkhateeb, Nicolò Michelusi:
Sparse Delay-Doppler Channel Estimation for OTFS Modulation Using 2D-MUSIC. 1-5 - Jiancai Zhu, Jiabao Zhao, Jiayi Zhou, Liang He, Jing Yang, Zhi Zhang:
Uncertainty-Aware Few-Shot Class-Incremental Learning. 1-5 - Jiayan Guo, Meiqi Chen, Yan Zhang, Jianqiang Huang, Zhiwei Liu:
Hierarchical Hypergraph Recurrent Attention Network for Temporal Knowledge Graph Reasoning. 1-5 - Shana Moothedath, Namrata Vaswani:
Comparing Decentralized Gradient Descent Approaches and Guarantees. 1-5 - Hao-Wen Dong, Ke Chen, Shlomo Dubnov, Julian J. McAuley, Taylor Berg-Kirkpatrick:
Multitrack Music Transformer. 1-5 - Mingrui He, Tianyu Chen, Haoyi Zhou, Shanghang Zhang, Jianxin Li:
BadRes: Reveal the Backdoors Through Residual Connection. 1-5 - Mohamed Gueye, Yazid Attabi, Maxime Dumas:
Row Conditional-TGAN for Generating Synthetic Relational Databases. 1-5 - Ziya Gülgün, Erik G. Larsson:
Channel Estimation in Massive MIMO with Heavy-Tailed Noise: Gaussian-Mixture Versus Cauchy Models. 1-4 - Serge Kas Hanna, Zhiyuan Tan, Wen Xu, Antonia Wachter-Zeh:
Codes Correcting Burst and Arbitrary Erasures for Reliable and Low-Latency Communication. 1-5 - Daniel Tompkins, Dimitra Emmanouilidou, Soham Deshmukh, Benjamin Elizalde:
Multi-View Learning for Speech Emotion Recognition with Categorical Emotion, Categorical Sentiment, and Dimensional Scores. 1-5 - Teerapat Jenrungrot, Michael Chinen, W. Bastiaan Kleijn, Jan Skoglund, Zalán Borsos, Neil Zeghidour, Marco Tagliasacchi:
LMCodec: A Low Bitrate Speech Codec with Causal Transformer Models. 1-5 - Prasenjit Mondal, Ayush Pant, Sachin Soni:
Dewarping Documents Using C2 Continuous Boundary Estimation. 1-5 - Michele Cirillo, Vincenzo Matta, Ali H. Sayed:
Learning Dynamic Graphs under Partial Observability. 1-5 - Jie Wang, Zhicong Chen, Haodong Zhou, Lin Li, Qingyang Hong:
Community Detection Graph Convolutional Network for Overlap-Aware Speaker Diarization. 1-5 - Simon Vary, Hazan Daglayan, Laurent Jacques, Pierre-Antoine Absil:
Low-Rank Plus Sparse Trajectory Decomposition for Direct Exoplanet Imaging. 1-5 - Zhicong Chen, Jie Wang, Wenxuan Hu, Lin Li, Qingyang Hong:
Unsupervised Speaker Verification Using Pre-Trained Model and Label Correction. 1-5 - Tengtao Song, Nuo Chen, Ji Jiang, Zhihong Zhu, Yuexian Zou:
Improving Retrieval-Based Dialogue System Via Syntax-Informed Attention. 1-5 - Qianshuo Hu, Hong Liu, Huaqiu Wang, Mengyuan Liu:
Body Prior Guided Graph Convolutional Neural Network for Skeleton-Based Action Recognition. 1-5 - Haibin Yu, Yuxuan Hu, Yao Qian, Ma Jin, Linquan Liu, Shujie Liu, Yu Shi, Yanmin Qian, Edward Lin, Michael Zeng:
Code-Switching Text Generation and Injection in Mandarin-English ASR. 1-5 - Binglin Li, Jie Liang, Haisheng Fu, Jingning Han:
ROI-Based Deep Image Compression with Swin Transformers. 1-5 - Baichuan Huang, Azra Abtahi, Amir Aminifar:
Lightweight Machine Learning for Seizure Detection on Wearable Devices. 1-2 - Jiayu Li, Tianyun Zhang, Shengmin Jin, Reza Zafarani:
Semi-Supervised Graph Ultra-Sparsifier Using Reweighted ℓ1 Optimization. 1-5 - Junxiang Ruan, Xiangtao Kong, Wenqi Huang, Wenming Yang:
Retiformer: Retinex-Based Enhancement In Transformer For Low-Light Image. 1-5 - Daniel Nicholls, Jack Wells, Alex W. Robinson, Amirafshar Moshtaghpour, Maryna Kobylynska, Roland A. Fleck, Angus I. Kirkland, Nigel D. Browning:
A Targeted Sampling Strategy for Compressive Cryo Focused Ion Beam Scanning Electron Microscopy. 1-5 - Bin Ren, Hao Tang, Yiming Wang, Xia Li, Wei Wang, Nicu Sebe:
PI-Trans: Parallel-ConvMLP and Implicit-Transformation Based GAN for Cross-View Image Translation. 1-5 - Ahmed M. A. Shaalan, Jun Du:
Super Dilated Nested Arrays with Ideal Critical Weights and Increased Degrees of Freedom. 1-5 - Jan Dorazil, Bernard H. Fleury, Franz Hlawatsch:
Bayesian Methods for Optical Flow Estimation Using a Variational Approximation, with Applications to Ultrasound. 1-5 - Shuo-Yiin Chang, Chao Zhang, Tara N. Sainath, Bo Li, Trevor Strohman:
Context-Aware End-to-End ASR Using Self-Attentive Embedding and Tensor Fusion. 1-5 - Ying Zhou, Xuefeng Liang, Shiquan Zheng, Huijun Xuan, Takatsune Kumada:
Adaptive Mask Co-Optimization for Modal Dependence in Multimodal Learning. 1-5 - Mocho Go, Hideyuki Tachibana:
GSWIN: Gated MLP Vision Model with Hierarchical Structure of Shifted Window. 1-5 - Xiang Gao, Honghui Lin, Yu Li, Ruiyan Fang, Xin Zhang:
Look and Think: Intrinsic Unification of Self-Attention and Convolution for Spatial-Channel Specificity. 1-5 - Leheng Sheng, Wenhan Wang, Zhiyi Shi, Jichao Zhan, Youyong Kong:
Brainnetformer: Decoding Brain Cognitive States with Spatial-Temporal Cross Attention. 1-5 - Binh P. Nguyen, Michael Nigro, Alice Rueda, Venkat Bhat, Sridhar Krishnan:
Digital Phenotype Representation by Statistical, Information Theory, Data-Driven Approach with Digital Health Data. 1-5 - Ankita Pasad, Bowen Shi, Karen Livescu:
Comparative Layer-Wise Analysis of Self-Supervised Speech Models. 1-5 - Yong-Yeon Jo, Young Sang Choi, Jong-Hwan Jang, Joon-Myoung Kwon:
ECGT2T: Towards Synthesizing Twelve-Lead Electrocardiograms from Two Asynchronous Leads. 1-5 - Aleksandr Laptev, Vladimir Bataev, Igor Gitman, Boris Ginsburg:
Powerful and Extensible WFST Framework for RNN-Transducer Losses. 1-5 - Georgios Vasileios Karanikolas, Alba Pagès-Zamora, Georgios B. Giannakis:
Higher-Order Link Prediction Via Learnable Maximum Mean Discrepancy. 1-5 - Zhixuan Li, Ruohua Shi, Tiejun Huang, Tingting Jiang:
OAFormer: Learning Occlusion Distinguishable Feature for Amodal Instance Segmentation. 1-5 - Supritha M. Shetty, Shraddha Revankar, Nalini C. Iyer, K. T. Deepak:
F0 Estimation From Telephone Speech Using Deep Feature Loss. 1-5 - Xujiang Zhao, Xuchao Zhang, Chen Zhao, Jin-Hee Cho, Lance M. Kaplan, Dong Hyun Jeong, Audun Jøsang, Haifeng Chen, Feng Chen:
Multi-Label Temporal Evidential Neural Networks for Early Event Detection. 1-5 - Md. Ershadul Haque, Manoranjan Paul, Anwar Ulhaq, Tanmoy Debnath:
A Novel State Connection Strategy for Quantum Computing to Represent and Compress Digital Images. 1-5 - Ben Gabrielson, Mingyu Sun, Mohammad A. B. S. Akhonda, Vince D. Calhoun, Tülay Adali:
Independent Vector Analysis with Multivariate Gaussian Model: a Scalable Method by Multilinear Regression. 1-5 - Mohsen Abdoli, Gordon Clare, Félix Henry:
GOP-Based Latent Refinement for Learned Video Coding. 1-5 - Yongqiang Wang, Zhehuai Chen, Chengjian Zheng, Yu Zhang, Wei Han, Parisa Haghani:
Accelerating RNN-T Training and Inference Using CTC Guidance. 1-5 - Haoxiang Zhang, He Jiang, Ziqiang Wang, Deqiang Cheng:
Ontology-Aware Network for Zero-Shot Sketch-Based Image Retrieval. 1-5 - Rajat Hebbar, Digbalay Bose, Krishna Somandepalli, Veena Vijai, Shrikanth Narayanan:
A Dataset for Audio-Visual Sound Event Detection in Movies. 1-5 - William Chen, Brian Yan, Jiatong Shi, Yifan Peng, Soumi Maiti, Shinji Watanabe:
Improving Massively Multilingual ASR with Auxiliary CTC Objectives. 1-5 - Qinglin Zhang, Chong Deng, Jiaqing Liu, Hai Yu, Qian Chen, Wen Wang, Zhijie Yan, Jinglin Liu, Yi Ren, Zhou Zhao:
MUG: A General Meeting Understanding and Generation Benchmark. 1-5 - Rémi Piau, Thomas Maugey, Aline Roumy:
Learning on Entropy Coded Images with CNN. 1-5 - Bo Dekker, Alfred C. Schouten, Odette Scharenborg:
DAIS: The Delft Database of EEG Recordings of Dutch Articulated and Imagined Speech. 1-5 - Chengyu Zheng, Yuan Zhou, Xiulian Peng, Yuan Zhang, Yan Lu:
Real-Time Speech Enhancement with Dynamic Attention Span. 1-5 - Bohan Tang, Siheng Chen, Xiaowen Dong:
Learning Hypergraphs From Signals With Dual Smoothness Prior. 1-5 - Kazuhiro Kobayashi, Tomoki Hayashi, Tomoki Toda:
Low-Latency Electrolaryngeal Speech Enhancement Based on FastSpeech2-Based Voice Conversion and Self-Supervised Speech Representation. 1-5 - Yuan Huang, Yuting Tang, Xiu Zheng, Jie Tang:
CPD-GAN: Cascaded Pyramid Deformation GAN for Pose Transfer. 1-5 - Jenthe Thienpondt, Nilesh Madhu, Kris Demuynck:
Margin-Mixup: A Method for Robust Speaker Verification In Multi-Speaker Audio. 1-5 - Peiying Wang, Chaoqun Duan, Meng Chen, Xiaodong He:
Improving Disfluency Detection with Multi-Scale Self Attention and Contrastive Learning. 1-5 - Zhongyu Yang, Chen Shen, Wei Shao, Tengfei Xing, Runbo Hu, Pengfei Xu, Hua Chai, Ruini Xue:
CANet: Curved Guide Line Network with Adaptive Decoder for Lane Detection. 1-5 - Jae-Heung Cho, Joon-Hyuk Chang:
CAN2V: CAN-Bus Data-Based Seq2Seq Model for Vehicle Velocity Prediction. 1-5 - Mahdi Namazifar, Devamanyu Hazarika, Dilek Hakkani-Tür:
Role of Bias Terms in Dot-Product Attention. 1-5 - Angélica S. Z. Suárez, Clément Laroche, Line H. Clemmensen, Sneha Das:
On Crowdsourcing-Design with Comparison Category Rating for Evaluating Speech Enhancement Algorithms. 1-5 - Yuning Wu, Jiatong Shi, Tao Qian, Dongji Gao, Qin Jin:
Phoneix: Acoustic Feature Processing Strategy for Enhanced Singing Pronunciation With Phoneme Distribution Predictor. 1-5 - Saidur R. Pavel, Yimin D. Zhang, Maria S. Greco, Fulvio Gini:
Deep Learning-Based Compressive Sampling Optimization in Massive MIMO Systems. 1-5 - Tao Li, Huayu Shou, Yuchen Deng, Yu Zhou, Chenqi Shi, Pengpeng Chen:
A Novel Heart Rate Estimation Method Exploiting Heartbeat Second Harmonic Reconstruction Via Millimeter Wave Radar. 1-5 - Daniel Mas Montserrat, Alexander G. Ioannidis:
Adversarial Attacks on Genotype Sequences. 1-5 - Francesco Binucci, Paolo Banelli:
BER-Aware Dynamic Resource Management for Edge-Assisted Goal-Oriented Communications. 1-5 - Chenxu Niu, Yue Hu, Wei Peng, Yuqiang Xie:
Learning to Balance the Global Coherence and Informativeness in Knowledge-Grounded Dialogue Generation. 1-5 - Wenbo Shi, Wenming Yang, Qingmin Liao:
Robust Content-Variant Reference Image Quality Assessment Via Similar Patch Matching. 1-5 - Yanjia Li, Lahiru Samarakoon, Ivan Fung:
Improving Non-Autoregressive Speech Recognition with Autoregressive Pretraining. 1-5 - Niladri Halder, K. P. Arunkumar, Chandra R. Murthy:
Variational Bayesian Channel Estimation in Wideband Multi-Scale Multi-Lag Channels. 1-5 - Jian Xiong, Sifan Wu, Wang Luo, Jinli Suo, Hao Gao:
ψ-Net: Point Structural Information Network for No-Reference Point Cloud Quality Assessment. 1-5 - Shuxin Qin, Yongcan Luo, Gaofeng Tao:
Memory-Augmented U-Transformer For Multivariate Time Series Anomaly Detection. 1-5 - Jonah Anton, Harry Coppock, Pancham Shukla, Björn W. Schuller:
Audio Barlow Twins: Self-Supervised Audio Representation Learning. 1-5 - Boyang Zhang, Suping Wu, Meining Jia:
Time-Frequency Awareness Network For Human Mesh Recovery From Videos. 1-5 - Jiseob Kim, Kyuhong Shim, Junhan Kim, Byonghyo Shim:
Vision Transformer-Based Feature Extraction for Generalized Zero-Shot Learning. 1-5 - Na Jiang, Wei Quan, Qichuan Geng, Zhi-Ping Shi, Peng Xu:
Exploiting 3D Human Recovery for Action Recognition with Spatio-Temporal Bifurcation Fusion. 1-5 - Chao Liu, Ruipeng Ma, Zheng Si, Mingmin Chi:
A Method of Constructing and Automatically Labeling Radio Frequency Signal Training Dataset for UAV. 1-5 - Zixuan Xiao, Shengshi Yao, Jincheng Dai, Sixian Wang, Kai Niu, Ping Zhang:
Wireless Deep Speech Semantic Transmission. 1-5 - Zihao Guo, Shilin Wang:
Content-Insensitive Dynamic Lip Feature Extraction for Visual Speaker Authentication Against Deepfake Attacks. 1-5 - Felix Wu, Kwangyoun Kim, Shinji Watanabe, Kyu Jeong Han, Ryan McDonald, Kilian Q. Weinberger, Yoav Artzi:
Wav2Seq: Pre-Training Speech-to-Text Encoder-Decoder Models Using Pseudo Languages. 1-5 - Xiaoyu Lin, Xiaoyu Bie, Simon Leglaive, Laurent Girin, Xavier Alameda-Pineda:
Speech Modeling with a Hierarchical Transformer Dynamical VAE. 1-5 - Ibrahim Alkanhal, Abdullah Almansour, Lamia Alsalloom, Raied Aljadaany, Marios Savvides:
Cov Loss: Covariance-Based Loss for Deep Face Recognition. 1-5 - Xue Yao, Guolong Cui, Xianxiang Yu:
Dual-Use Signal Design for MIMO Radcom with Inter-Pulse Index Modulation. 1-5 - Victor Solo:
Asymptotic Bias and Variance of Kernel Ridge Regression. 1-5 - Atli Þór Sigurgeirsson, Simon King:
Do Prosody Transfer Models Transfer Prosody? 1-5 - Shanshan Wang, Soumya Tripathy, Annamaria Mesaros:
Self-Supervised Learning of Audio Representations using Angular Contrastive Loss. 1-5 - Mohan Zhou, Yalong Bai, Wei Zhang, Ting Yao, Tiejun Zhao, Tao Mei:
Visual-Aware Text-to-Speech. 1-5 - Liwen Peng, Songlei Jian, Dongsheng Li, Siqi Shen:
MRML: Multimodal Rumor Detection by Deep Metric Learning. 1-5 - Zhimin He, Jiangbo Qian, Diqun Yan, Chong Wang, Yu Xin:
Animal Re-Identification Algorithm for Posture Diversity. 1-5 - Daichi Guo, Guanting Dong, Dayuan Fu, Yuxiang Wu, Chen Zeng, Tingfeng Hui, Liwen Wang, Xuefeng Li, Zechen Wang, Keqing He, Xinyue Cui, Weiran Xu:
Revisit Out-Of-Vocabulary Problem For Slot Filling: A Unified Contrastive Framework With Multi-Level Data Augmentations. 1-5 - Farshad G. Veshki, Sergiy A. Vorobyov:
Efficient Online Convolutional Dictionary Learning Using Approximate Sparse Components. 1-5 - Yidi Zhang, Wenqi Huang, Wenming Yang:
Global Matching-Optimization Network for Stereo Depth Estimation. 1-5 - Xuechao He, Jiaojiao Zhang, Qing Ling:
Byzantine-Robust and Communication-Efficient Personalized Federated Learning. 1-5 - Jin Zeng, Yang Liu, Gene Cheung, Wei Hu:
Sparse Graph Learning with Spectrum Prior for Deep Graph Convolutional Networks. 1-5 - Li Fu, Siqi Li, Qingtao Li, Liping Deng, Fangzhu Li, Lu Fan, Meng Chen, Xiaodong He:
UFO2: A Unified Pre-Training Framework for Online and Offline Speech Recognition. 1-5 - Haohan Luo, Feng Wang:
A Simulation-Based Framework for Urban Traffic Accident Detection. 1-5 - Chris Henry, Rijun Liao, Ruiyuan Lin, Zhebin Zhang, Hongyu Sun, Zhu Li:
Lightweight Fisher Vector Transfer Learning for Video Deduplication. 1-5 - Ofer Schwartz, Ayal Schwartz:
RNN-Based Step-Size Estimation for the RLS Algorithm with Application to Acoustic Echo Cancellation. 1-5 - Wei Huang, Haiyang Zhang, Nir Shlezinger, Yonina C. Eldar:
Joint Microstrip Selection and Beamforming Design for MmWave Systems with Dynamic Metasurface Antennas. 1-5 - Hexiang Zhang, Zhenghua Xu, Dan Yao, Shuo Zhang, Junyang Chen, Thomas Lukasiewicz:
Multi-Head Feature Pyramid Networks for Breast Mass Detection. 1-5 - Zishuo Zhao, Yuexiang Xie, Jingyou Xie, Zhenzhou Lin, Yaliang Li, Ying Shen:
Source-Free Unsupervised Domain Adaptation for Question Answering. 1-5 - Qingqing Zhao, Yanting Ma, Petros Boufounos, Saleh Nabi, Hassan Mansour:
Deep Born Operator Learning for Reflection Tomographic Imaging. 1-5 - Kangdi Mei, Xinyun Ding, Yinlong Liu, Zhiqiang Guo, Feiyang Xu, Xin Li, Tuya Naren, Jiahong Yuan, Zhenhua Ling:
The USTC System for ADReSS-M Challenge. 1-2 - Hai Victor Habi, Hagit Messer, Yoram Bresler:
Learned Generative Misspecified Lower Bound. 1-5 - Tzu-Ting Chuang, Ting-Yun Wei, Yu-Hsing Hsieh, Chu-Song Chen, Huei-Fang Yang:
Continual Cell Instance Segmentation of Microscopy Images. 1-5 - Pengcheng Guo, He Wang, Bingshen Mu, Ao Zhang, Peikun Chen:
The NPU-ASLP System for Audio-Visual Speech Recognition in MISP 2022 Challenge. 1-2 - Dianwen Ng, Ruixi Zhang, Jia Qi Yip, Chong Zhang, Yukun Ma, Trung Hieu Nguyen, Chongjia Ni, Eng Siong Chng, Bin Ma:
Contrastive Speech Mixup for Low-Resource Keyword Spotting. 1-5 - Zechao Hu, Adrian G. Bors:
Few but Informative Local Hash Code Matching for Image Retrieval. 1-5 - Samik Sadhu, Hynek Hermansky:
Importance of Different Temporal Modulations of Speech: A Tale of Two Perspectives. 1-5 - Shehzeen Hussain, Paarth Neekhara, Jocelyn Huang, Jason Li, Boris Ginsburg:
ACE-VC: Adaptive and Controllable Voice Conversion Using Explicitly Disentangled Self-Supervised Speech Representations. 1-5 - Charles Hovine, Alexander Bertrand:
A Distributed Adaptive Algorithm for Non-Smooth Spatial Filtering Problems. 1-5 - Xiaotong Zhang, Peng He, Han Liu, Zhengxi Yin, Xinyue Liu, Xianchao Zhang:
Knowledge-Aware Graph Convolutional Network with Utterance-Specific Window Search for Emotion Recognition In Conversations. 1-5 - Leonardo Fierro, Alec Wright, Vesa Välimäki, Matti S. Hämäläinen:
Extreme Audio Time Stretching Using Neural Synthesis. 1-5 - Arian Eamaz, Farhang Yeganegi, Mojtaba Soltanalian:
CyPMLI: WISL-Minimized Unimodular Sequence Design via Power Method-Like Iterations. 1-5 - Anup Singh, Kris Demuynck, Vipul Arora:
Simultaneously Learning Robust Audio Embeddings and Balanced Hash Codes for Query-by-Example. 1-5 - Fan Hu, Aozhu Chen, Xirong Li:
Towards Making a Trojan-Horse Attack on Text-to-Image Retrieval. 1-5 - Solomon Goldgraber Casspi, Oliver Hüsser, Guy Revach, Nir Shlezinger:
LQGNET: Hybrid Model-Based and Data-Driven Linear Quadratic Stochastic Control. 1-5 - Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux:
Optimal Condition Training for Target Source Separation. 1-5 - Tianxiao Han, Jiancheng Tang, Qianqian Yang, Yiping Duan, Zhaoyang Zhang, Zhiguo Shi:
Generative Model based Highly Efficient Semantic Communication Approach for Image Transmission. 1-5 - Jian Ni, Yong Liao:
Semantics-Disentangled Contrastive Embedding for Generalized Zero-Shot Learning. 1-5 - Guillaume Le Guludec, Christine Guillemot:
Joint Neural Representation for Multiple Light Fields. 1-5 - Bin Liu, Fengfu Li, Xiaogang Wang, Bo Zhang, Junchi Yan:
Ternary Weight Networks. 1-5 - Chengyuan Xu, Kang Liu, Xuelong Li:
Meta++ Network for Few-Shot Aerospace Crack Segmentation. 1-5 - Xingyu Liu, Pengfei Ren, Yuchen Chen, Cong Liu, Jing Wang, Haifeng Sun, Qi Qi, Jing-Yu Wang:
Sample-Adapt Fusion Network for RGB-D Hand Detection in the Wild. 1-5 - Alireza Keshavarzian, Hojjat Salehinejad, Shahrokh Valaee:
Representation Learning of Clinical Multivariate Time Series with Random Filter Banks. 1-5 - Jiachen Lian, Alan W. Black, Yijing Lu, Louis Goldstein, Shinji Watanabe, Gopala Krishna Anumanchipalli:
Articulatory Representation Learning via Joint Factor Analysis and Neural Matrix Factorization. 1-5 - Simon Durand, Daniel Stoller, Sebastian Ewert:
Contrastive Learning-Based Audio to Lyrics Alignment for Multiple Languages. 1-5 - Ershad Banijamali, Pegah Kharazmi, Sepehr Eghbali, Jixuan Wang, Clement Chung, Samridhi Choudhary:
Pyramid Dynamic Inference: Encouraging Faster Inference Via Early Exit Boosting. 1-5 - Chenyue Zhang, Yiran He, Hoi-To Wai:
Product Graph Learning From Multi-Attribute Graph Signals with Inter-Layer Coupling. 1-5 - Bilal Taha, Dae Yon Hwang, Dimitrios Hatzinakos:
EEG Emotion Recognition Via Ensemble Learning Representations. 1-5 - Yunkai Zhuang, Shangdong Yang, Wenbin Li, Yang Gao:
Convergence Analysis of Graphical Game-Based Nash Q-Learning using the Interaction Detection Signal of N-Step Return. 1-5 - Hanqing Liu, Wei Wang, Niu Hu, Hai-Tao Zheng, Rui Xie, Wei Wu, Yang Bai:
Guide and Select: A Transformer-Based Multimodal Fusion Method for Points of Interest Description Generation. 1-5 - Nesryne Mejri, Enjie Ghorbel, Djamila Aouada:
UNTAG: Learning Generic Features for Unsupervised Type-Agnostic Deepfake Detection. 1-5 - Th. Beroud, Patrice Abry, Yannick Malevergne, Marc Senneret, Gerald Perrin, J. Macq:
Wasserstein GAN Synthesis for Time Series with Complex Temporal Dynamics: Frugal Architectures and Arbitrary Sample-Size Generation. 1-5 - Zhiyuan Zhao, Lijun Wu, Chuanxin Tang, Dacheng Yin, Yucheng Zhao, Chong Luo:
Filler Word Detection with Hard Category Mining and Inter-Category Focal Loss. 1-5 - Chengxiao Luo, Yiming Li, Yong Jiang, Shu-Tao Xia:
Untargeted Backdoor Attack Against Object Detection. 1-5 - Guan-Yuan Chen, Ya-Fen Yeh, Von-Wun Soo:
RAT: Radial Attention Transformer for Singing Technique Recognition. 1-5 - Ryan M. Corey, Andrew C. Singer:
Immersive Enhancement and Removal of Loudspeaker Sound Using Wireless Assistive Listening Systems and Binaural Hearing Devices. 1-2 - Hyungseob Lim, Jihyun Lee, Byeong Hyeon Kim, Inseon Jang, Hong-Goo Kang:
End-to-End Neural Audio Coding in the MDCT Domain. 1-5 - Carlos Hurtado, Sarath Shekkizhar, Javier Ruiz Hidalgo, Antonio Ortega:
Study of Manifold Geometry Using Multiscale Non-Negative Kernel Graphs. 1-5 - Salime Bameri, Khalid Almahorg, Ramy H. Gohary, Amr El-Keyi, Yahia Ahmed:
Downlink Covariance Estimation in URA FDD Massive MIMO Systems. 1-5 - Cheng Tian, Zhiming Luo, Guimin Shi, Shaozi Li:
Frequency-Aware Attentional Feature Fusion for Deepfake Detection. 1-5 - Niklas Smedemark-Margulies, Basak Celik, Tales Imbiriba, Aziz Kocanaogullari, Deniz Erdogmus:
Recursive Estimation of User Intent From Noninvasive Electroencephalography Using Discriminative Models. 1-5 - Pengfei Sun, Ehsan Eqlimi, Yansong Chua, Paul Devos, Dick Botteldooren:
Adaptive Axonal Delays in Feedforward Spiking Neural Networks for Accurate Spoken Word Recognition. 1-5 - Chen Lin, Ye Liu, Siyu An, Di Yin:
Unsupervised Extractive Summarization With Heterogeneous Graph Embeddings for Chinese Documents. 1-5 - Chen Chen, Yuchen Hu, Weiwei Weng, Eng Siong Chng:
Metric-Oriented Speech Enhancement Using Diffusion Probabilistic Model. 1-5 - Abdulrahman Takiddin, Rachad Atat, Muhammad Ismail, Katherine R. Davis, Erchin Serpedin:
A Graph Neural Network Multi-Task Learning-Based Approach for Detection and Localization of Cyberattacks in Smart Grids. 1-5 - Chao Zhou, Can Chen, Dengyin Zhang:
A Flow-Guided Non-Local Alignment Network for Video Compressive Sensing Reconstruction. 1-5 - Tanuka Bhattacharjee, Chowdam Venkata Thirumala Kumar, Yamini Belur, Atchayaram Nalini, Ravi Yadav, Prasanta Kumar Ghosh:
Static and Dynamic Source and Filter Cues for Classification of Amyotrophic Lateral Sclerosis Patients and Healthy Subjects. 1-5 - Ryosuke Watanabe, Keisuke Nonaka, Eduardo Pavez, Tatsuya Kobayashi, Antonio Ortega:
Graph-Based Point Cloud Color Denoising with 3-Dimensional Patch-Based Similarity. 1-5 - Wei-Bang Jiang, Xu Yan, Wei-Long Zheng, Bao-Liang Lu:
Elastic Graph Transformer Networks for EEG-Based Emotion Recognition. 1-5 - Subash Timilsina, Sagar Shrestha, Xiao Fu:
Deep Spectrum Cartography Using Quantized Measurements. 1-5 - Jingqi Li, Jiaqi Gao, Yuzhen Zhang, Hongming Shan, Junping Zhang:
Motion Matters: A Novel Motion Modeling for Cross-View Gait Feature Learning. 1-5 - Tara N. Sainath, Rohit Prabhavalkar, Diamantino Caseiro, Pat Rondon, Cyril Allauzen:
Improving Contextual Biasing with Text Injection. 1-5 - Sreyan Ghosh, Ashish Seth, Srinivasan Umesh, Dinesh Manocha:
MAST: Multiscale Audio Spectrogram Transformers. 1-5 - Xun Wu, Guolong Wang, Zhaoyuan Liu, Xuan Dang, Zheng Qin:
Instance-Aware Hierarchical Structured Policy for Prompt Learning in Vision-Language Models. 1-5 - Hao Wu, Jiajie Wang, Zhonglin Zu:
Interaction-Assisted Multi-Modal Representation Learning for Recommendation. 1-5 - Koichi Saito, Naoki Murata, Toshimitsu Uesaka, Chieh-Hsin Lai, Yuhta Takida, Takao Fukui, Yuki Mitsufuji:
Unsupervised Vocal Dereverberation with Diffusion-Based Generative Models. 1-5 - Jiayi Gao, Jiaxing Li, Ke Zhang, Youyong Kong:
Topology Uncertainty Modeling For Imbalanced Node Classification on Graphs. 1-5 - Ziqian Ning, Qicong Xie, Pengcheng Zhu, Zhichao Wang, Liumeng Xue, Jixun Yao, Lei Xie, Mengxiao Bi:
Expressive-VC: Highly Expressive Voice Conversion with Attention Fusion of Bottleneck and Perturbation Features. 1-5 - Sara Pérez-Vieites, Víctor Elvira:
Adaptive Gaussian Nested Filter for Parameter Estimation and State Tracking in Dynamical Systems. 1-5 - Liusha Yang, Matthew R. McKay, Xun Wang:
Hypothesis Test for Leakage Detection in Water Pipelines with High-Dimensional Sensor Signals. 1-5 - Wing W. Y. Ng, Peixin Zheng, Ting Wang, Jianjun Zhang, Yinhao Liang, Hui Zhou, Dan Liang, Guangming Li, Xinhua Wei:
LSSED: A Robust Segmentation Network for Inflamed Appendix from CT Images. 1-5 - Anlei Zhu, Yinghui Wang, Wei Li, Pengjiang Qian:
Structural Reparameterization Lightweight Network for Video Action Recognition. 1-5 - Kyriakos Stylianopoulos, Mattia Merluzzi, Paolo Di Lorenzo, George C. Alexandropoulos:
Lyapunov-Driven Deep Reinforcement Learning for Edge Inference Empowered by Reconfigurable Intelligent Surfaces. 1-5 - Jinlong Xue, Yayue Deng, Fengping Wang, Ya Li, Yingming Gao, Jianhua Tao, Jianqing Sun, Jiaen Liang:
M2-CTTS: End-to-End Multi-Scale Multi-Modal Conversational Text-to-Speech Synthesis. 1-5 - Jianwei Yu, Hangting Chen, Yi Luo, Rongzhi Gu, Weihua Li, Chao Weng:
TSpeech-AI System Description to the 5th Deep Noise Suppression (DNS) Challenge. 1-2 - Xulong Zhang, Haobin Tang, Jianzong Wang, Ning Cheng, Jian Luo, Jing Xiao:
Dynamic Alignment Mask CTC: Improved Mask CTC With Aligned Cross Entropy. 1-5 - Riccardo Corvi, Davide Cozzolino, Giada Zingarini, Giovanni Poggi, Koki Nagano, Luisa Verdoliva:
On The Detection of Synthetic Images Generated by Diffusion Models. 1-5 - Yifan Peng, Kwangyoun Kim, Felix Wu, Prashant Sridhar, Shinji Watanabe:
Structured Pruning of Self-Supervised Pre-Trained Models for Speech Recognition and Understanding. 1-5 - Martina Cilia, Diego Valsesia, Giulia Fracastoro, Enrico Magli:
Multi-Level Fusion for Burst Super-Resolution with Deep Permutation-Invariant Conditioning. 1-5 - Lang Wang, Juan Liu, Peng Jiang, Dehua Cao, Baochuan Pang:
LGVIT: Local-Global Vision Transformer for Breast Cancer Histopathological Image Classification. 1-5 - Zihao Wu, Huy Tran, Hamed Pirsiavash, Soheil Kolouri:
Is Multi-Task Learning an Upper Bound for Continual Learning? 1-5 - Qian Li, Franck Multon, Adnane Boukhayma:
Learning Generalizable Light Field Networks from Few Images. 1-5 - Hong Ye Tan, Subhadip Mukherjee, Junqi Tang, Andreas Hauptmann, Carola-Bibiane Schönlieb:
Robust Data-Driven Accelerated Mirror Descent. 1-5 - Jinchao Li, Kaitao Song, Junan Li, Bo Zheng, Dongsheng Li, Xixin Wu, Xunying Liu, Helen Meng:
Leveraging Pretrained Representations With Task-Related Keywords for Alzheimer's Disease Detection. 1-5 - Jingyi Zhang, Peng Zhang, Jingjing Wang, Di Xie, Shiliang Pu:
Learning Expressive And Generalizable Motion Features For Face Forgery Detection. 1-5 - Aref Miri Rekavandi, Abd-Krim Seghouane, Farid Boussaïd, Mohammed Bennamoun:
Extended Expectation Maximization for Under-Fitted Models. 1-5 - Ban-Sok Shin, Luis Wientgens, Dmitriy Shutin:
Parallel 2D Seismic Ray Tracing Using CUDA on a Jetson Nano. 1-5 - Yixing Peng, Quan Wang, Zhendong Mao, Yong-Dong Zhang:
SADE: A Self-Adaptive Expert for Multi-Dataset Question Answering. 1-5 - Xinyan Pu, Ke Zhang, Huazhong Shu, Jean-Louis Coatrieux, Youyong Kong:
Graph Contrastive Learning with Learnable Graph Augmentation. 1-5 - Tilak Purohit, Sarthak Yadav, Bogdan Vlasenko, S. Pavankumar Dubagunta, Mathew Magimai-Doss:
Towards Learning Emotion Information from Short Segments of Speech. 1-5 - Dandan Shan, Zihan Li, Wentao Chen, Qingde Li, Jie Tian, Qingqi Hong:
Coarse-to-Fine Covid-19 Segmentation via Vision-Language Alignment. 1-5 - Clémence Prévost, Valentin Leplat:
Nonnegative Block-Term Decomposition with the β-Divergence: Joint Data Fusion and Blind Spectral Unmixing. 1-5 - Pavel Andreev, Aibek Alanov, Oleg Ivanov, Dmitry P. Vetrov:
HIFI++: A Unified Framework for Bandwidth Extension and Speech Enhancement. 1-5 - Yingrui Xu, Jingyuan Hu, Jingguo Ge, Yulei Wu, Tong Li, Hui Li:
Contrastive Learning at the Relation and Event Level for Rumor Detection. 1-5 - Claudio Battiloro, Zhiyang Wang, Hans Riess, Paolo Di Lorenzo, Alejandro Ribeiro:
Tangent Bundle Filters and Neural Networks: From Manifolds to Cellular Sheaves and Back. 1-5 - Vivien Cabannes, Alberto Bietti, Randall Balestriero:
On Minimal Variations for Unsupervised Representation Learning. 1-5 - Jeroen Overdevest, A. G. C. Koppelaar, M. J. G. Bekooij, J. Youn, Ruud J. G. van Sloun:
Signal Reconstruction for FMCW Radar Interference Mitigation Using Deep Unfolding. 1-5 - Evan Scope Crafts, Bo Zhao:
Bayesian Cramér-Rao Bound Estimation With Score-Based Models. 1-5 - Bolaji Yusuf, Aditya Gourav, Ankur Gandhe, Ivan Bulyko:
On-the-Fly Text Retrieval for End-to-End ASR Adaptation. 1-5 - Wenjie Liu, Bingshu Wang, Jiangbin Zheng, Wenmin Wang:
Shadow Removal of Text Document Images Using Background Estimation and Adaptive Text Enhancement. 1-5 - Peter Wu, Li-Wei Chen, Cheol Jun Cho, Shinji Watanabe, Louis Goldstein, Alan W. Black, Gopala Krishna Anumanchipalli:
Speaker-Independent Acoustic-to-Articulatory Speech Inversion. 1-5 - Milan Aryal, Nasim Yahya Soltani:
Position-Aware Graph-Based Learning of Whole Slide Images. 1-5 - Zhanchao Huang, Wei Li, Ran Tao:
Multimodal Knowledge Distillation for Arbitrary-Oriented Object Detection in Aerial Images. 1-5 - Feng Zhang, Sheng Liu, Bingnan Guo, Ruixiang Chen, Junhao Chen:
SQA: Strong Guidance Query with Self-Selected Attention for Human-Object Interaction Detection. 1-5 - Dmitry Kozlov, Mikhail Bakulin, Stanislav Pavlov, Aleksandr Zuev, Mariya Krylova, Igor Kharchikov:
Learning Properties of Holomorphic Neural Networks of Dual Variables. 1-5 - R. Channing Moore, Daniel P. W. Ellis, Eduardo Fonseca, Shawn Hershey, Aren Jansen, Manoj Plakal:
Dataset Balancing Can Hurt Model Performance. 1-5 - Qingqiu Li, Jilan Xu, Runtian Yuan, Yuejie Zhang, Rui Feng:
SCSGNet: Spatial-Correlated and Shape-Guided Network for Breast Mass Segmentation. 1-5 - Dading Chong, Helin Wang, Peilin Zhou, Qingcheng Zeng:
Masked Spectrogram Prediction for Self-Supervised Audio Pre-Training. 1-5 - Zahra Esmaeilbeig, Arian Eamaz, Kumar Vijay Mishra, Mojtaba Soltanalian:
Joint Waveform and Passive Beamformer Design in Multi-IRS-Aided Radar. 1-5 - Jiaqi Gao, Xinyang Jiang, Yuqing Yang, Dongsheng Li, Lili Qiu:
Unsupervised Video Anomaly Detection For Stereotypical Behaviours in Autism. 1-5 - Chen He, Weisheng Gong, Yangrui Dong, Xie Xie, Z. Jane Wang:
Capacity Maximization for Active RIS Assisted Outdoor-to-Indoor Communication System. 1-5 - Michael Krause, Christof Weiß, Meinard Müller:
Soft Dynamic Time Warping for Multi-Pitch Estimation and Beyond. 1-5 - Vinay Kothapally, Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu:
Deep Neural Mel-Subband Beamformer for in-Car Speech Separation. 1-5 - Yeting Guo, Fang Liu, Tongqing Zhou, Zhiping Cai, Nong Xiao:
Efficient Personalized Federated Learning on Selective Model Training. 1-5 - Pranay Dighe, Prateeth Nayak, Oggi Rudovic, Erik Marchi, Xiaochuan Niu, Ahmed H. Tewfik:
Audio-to-Intent Using Acoustic-Textual Subword Representations from End-to-End ASR. 1-5 - Jens Hauser, Zhao Meng, Damian Pascual, Roger Wattenhofer:
BERT is Robust! A Case Against Word Substitution-Based Adversarial Attacks. 1-5 - Anbai Jiang, Wei-Qiang Zhang, Yufeng Deng, Pingyi Fan, Jia Liu:
Unsupervised Anomaly Detection and Localization of Machine Audio: A GAN-Based Approach. 1-5 - Xin Zeng, Yiqiang Chen, Benfeng Xu, Tengxiang Zhang:
Modaldrop: Modality-Aware Regularization for Temporal-Spectral Fusion in Human Activity Recognition. 1-5 - Shabhrish Reddy Uddehal, Tilo Strutz, Hannah Och, André Kaup:
Image Segmentation for Improved Lossless Screen Content Compression. 1-5 - Yiyue Chen, Abolfazl Hashemi, Haris Vikalo:
Accelerated Distributed Stochastic Non-Convex Optimization over Time-Varying Directed Networks. 1-5 - Jiani Liu, Qinghua Tao, Ce Zhu, Yipeng Liu, Johan A. K. Suykens:
Tensorized LSSVMs For Multitask Regression. 1-5 - Reinhard Wiesmayr, Gian Marti, Chris Dick, Haochuan Song, Christoph Studer:
Bit Error and Block Error Rate Training for ML-Assisted Communication. 1-5 - Prajwal Singh, Pankaj Pandey, Krishna P. Miyapuram, Shanmuganathan Raman:
EEG2IMAGE: Image Reconstruction from EEG Brain Signals. 1-5 - Ze-Yu Mi, Kun Long, Yu-Bin Yang:
Multiple Domain-Adversarial Ensemble Learning for Domain Generalization. 1-5 - Rafael Valle, João Felipe Santos, Kevin J. Shih, Rohan Badlani, Bryan Catanzaro:
High-Acoustic Fidelity Text To Speech Synthesis With Fine-Grained Control Of Speech Attributes. 1-5 - Han Yang, Fei Gu, Jieping Ye:
Rethinking Learning-Based Method for Lossless Genome Compression. 1-5 - Ivan Rodin, Antonino Furnari, Dimitrios Mavroeidis, Giovanni Maria Farinella:
Egocentric Action Anticipation for Personal Health. 1-5 - Jinjiang Liu, Xueliang Zhang:
Inplace Cepstral Speech Enhancement System for the ICASSP 2023 Clarity Challenge. 1-2 - Yu Jin, Juan Liu, Hua Chen, Wensi Duan, Dehua Cao, Baochuan Pang:
MASKED-AP: Attention Pyramid Convolutional Neural Network with Mask for Cervical Cell Classification. 1-5 - Deyuan Wang, Tiantian Zhang, Caixia Yuan, Xiaojie Wang:
Joint Modeling for ASR Correction and Dialog State Tracking. 1-5 - Gwendal Le Vaillant, Thierry Dutoit:
Synthesizer Preset Interpolation Using Transformer Auto-Encoders. 1-5 - Carlos A. Gómez-Vega, Moe Z. Win, Andrea Conti:
UWB Localization-of-Things Via Soft Information: Network Experimentation in Indoor Environment. 1-5 - Pierre David, Patrick Le Callet, Suiyi Ling, Haixiong Wang, Ioannis Katsavounidis, Zafar Shahid, Cosmin Stejerean:
Estimating Uncertainty On Video Quality Metrics. 1-5 - Xin Tong, Zhaoyang Zhang, Zhaohui Yang:
Multi-View Millimeter-Wave Imaging Over Wireless Cellular Network. 1-5 - Hamza Djelouat, Markus Leinonen, Markku J. Juntti:
Joint Estimation of Clustered User Activity and Correlated Channels with Unknown Covariance in mMTC. 1-5 - Yuchao Feng, Jiawei Jiang, Honghui Xu, Jianwei Zheng:
Building Change Detection Using Cross-Temporal Feature Interaction Network. 1-5 - Shaikhah Alkhadhr, Mohamed Almekkawy:
Modeling the Wave Equation Using Physics-Informed Neural Networks Enhanced With Attention to Loss Weights. 1-5 - Longfei Yan, Weilong Huang, W. Bastiaan Kleijn, Thushara D. Abhayapala:
Neural Optimization Of Geometry And Fixed Beamformer For Linear Microphone Arrays. 1-5 - Mohammad Zeineldeen, Kartik Audhkhasi, Murali Karthick Baskar, Bhuvana Ramabhadran:
Robust Knowledge Distillation from RNN-T Models with Noisy Training Labels Using Full-Sum Loss. 1-5 - Tianhao Zhang, Qi Liu, Xinyuan Qian, Song-Lu Chen, Feng Chen, Xu-Cheng Yin:
Self-Convolution for Automatic Speech Recognition. 1-5 - Florian Eilers, Xiaoyi Jiang:
Building Blocks for a Complex-Valued Transformer Architecture. 1-5 - Youngjoon Jang, Youngtaek Oh, Jae-Won Cho, Myungchul Kim, Dong-Jin Kim, In So Kweon, Joon Son Chung:
Self-Sufficient Framework for Continuous Sign Language Recognition. 1-5 - Xiaohuai Le, Li Chen, Chao He, Yiqing Guo, Cheng Chen, Xianjun Xia, Jing Lu:
Personalized Speech Enhancement Combining Band-Split RNN and Speaker Attentive Module. 1-2 - William McDonald, Cedric Le Gentil, Teresa A. Vidal-Calleja:
Global Localisation in Continuous Magnetic Vector Fields Using Gaussian Processes. 1-5 - Marcus Valtonen Örnhag, Stefan Ingi Adalbjörnsson, Püren Güler, Mojtaba Mahdavi:
A Critical Look at Recent Trends in Compression of Channel State Information. 1-5 - Takaaki Saeki, Heiga Zen, Zhehuai Chen, Nobuyuki Morioka, Gary Wang, Yu Zhang, Ankur Bapna, Andrew Rosenberg, Bhuvana Ramabhadran:
Virtuoso: Massive Multilingual Speech-Text Joint Semi-Supervised Learning for Text-to-Speech. 1-5 - Davide Salvi, Paolo Bestagini, Stefano Tubaro:
Reliability Estimation for Synthetic Speech Detection. 1-5 - Amir Shirian, Mona Ahmadian, Krishna Somandepalli, Tanaya Guha:
Heterogeneous Graph Learning for Acoustic Event Classification. 1-5 - Jiakun Shen, Xueshuai Zhang, Pengyuan Zhang, Yonghong Yan, Shaoxing Zhang, Zhihua Huang, Yanfen Tang, Yu Wang, Fujie Zhang, Aijun Sun:
Piecewise Position Encoding in Convolutional Neural Network for Cough-Based Covid-19 Detection. 1-5 - Huaizheng Zhang, Pinxue Guo, Zhongwen Le, Wenqiang Zhang:
Robust Video Object Segmentation with Restricted Attention. 1-5 - Honghua Cai, Jiahui Pan:
Two-Phase Prototypical Contrastive Domain Generalization for Cross-Subject EEG-Based Emotion Recognition. 1-5 - Alexander Bohlender, Liesbeth Roelens, Nilesh Madhu:
Improved Deep Speaker Localization and Tracking: Revised Training Paradigm and Controlled Latency. 1-5 - Xiaoxiao Wu, Dongxing Xu, Haoran Wei, Yanhua Long:
Few-Shot Continual Learning with Weight Alignment and Positive Enhancement for Bioacoustic Event Detection. 1-5 - Kangjie Zheng, Longyue Wang, Zhihao Wang, Binqi Chen, Ming Zhang, Zhaopeng Tu:
Towards a Unified Training for Levenshtein Transformer. 1-5 - Yoshinao Sato, Narumitsu Ikeda, Hirokazu Takahashi:
Shuffleaugment: A Data Augmentation Method Using Time Shuffling. 1-5 - Jin Woo Lee, Kyogu Lee:
Neural Fourier Shift for Binaural Speech Rendering. 1-5 - Andrea Montibeller, Fernando Pérez-González:
Exploiting PRNU and Linear Patterns in Forensic Camera Attribution under Complex Lens Distortion Correction. 1-5 - Siddarth Asokan, Fatwir Sheikh Mohammed, Chandra Sekhar Seelamantula:
A Game of Snakes and GANs. 1-5 - Hexin Liu, Haihua Xu, Leibny Paola García, Andy W. H. Khong, Yi He, Sanjeev Khudanpur:
Reducing Language Confusion for Code-Switching Speech Recognition with Token-Level Language Diarization. 1-5 - Nikolaos Antoniou, Athanasios Katsamanis, Theodoros Giannakopoulos, Shrikanth Narayanan:
Designing and Evaluating Speech Emotion Recognition Systems: A Reality Check Case Study with IEMOCAP. 1-5 - Shaoqi Sun, Yuanzhao Zhai, Kele Xu, Dawei Feng, Bo Ding:
Progressive Diversifying Policy for Multi-Agent Reinforcement Learning. 1-5 - Siqi Zhang, Lu Zhang, Zhiyong Liu:
Refined Pseudo Labeling for Source-Free Domain Adaptive Object Detection. 1-5 - Beibei Hu, Qiang Li, Xianjun Xia:
The Ajmide Topic Segmentation System for the ICASSP 2023 General Meeting Understanding and Generation Challenge. 1-2 - Damir Rakhimov, Martin Haardt:
Equivalence of Aperture Reduction in Element Space and Constrained Combination of DFT Beams in Beamspace. 1-5 - Jerome R. Bellegarda:
Prefix-Level Detection and Autocorrection of Keyboard Input Errors. 1-5 - Simon Tarboush, Anum Ali, Tareq Y. Al-Naffouri:
Compressive Estimation of Near Field Channels for Ultra Massive-MIMO Wideband THz Systems. 1-5 - Shaohan Wu, Brian L. Hughes:
Antenna Impedance Estimation in Correlated Rayleigh Fading Channels. 1-5 - Homa Esfahanizadeh, William Wu, Manya Ghobadi, Regina Barzilay, Muriel Médard:
InfoShape: Task-Based Neural Data Shaping via Mutual Information. 1-5 - Jung-Chun Chi, Chiao-En Chen, Yuan-Hao Huang:
-Complexity Low-Rank Approximation SVD for Massive Matrix in Tensor Train Format. 1-5 - Ziwei Niu, Hongyi Wang, Hao Sun, Shuyi Ouyang, Yen-Wei Chen, Lanfen Lin:
MCKD: Mutually Collaborative Knowledge Distillation For Federated Domain Adaptation And Generalization. 1-5 - Yan Zhang, Xiyuan Gao, Xiao Pu, Tao Wang, Xinbo Gao:
DecomFormer: Decompose Self-Attention Via Fourier Transform for VHR Aerial Image Scene Classification. 1-5 - Soumee Guha, Olivia de Cuba, Andreas Gahlmann, Scott T. Acton:
Diffusionnet: An Efficient Framework to Classify Single-Molecule Images with Latent Entropy Minimization. 1-5 - Robin Scheibler, Youna Ji, Soo-Whan Chung, Jaeuk Byun, Soyeon Choe, Min-Seok Choi:
Diffusion-Based Generative Speech Source Separation. 1-5 - Xiaomei Shi, Min Zhang, Shouhai Xia, Ruxue Zhang, Jun Feng:
Local Feature Enhanced Adversarial Network for the Blind Image Quality Assessment. 1-5 - Bin Fu, Hongliang He, Pengxu Wei, Jie Chen:
Learning Task-Aligned Mask Query for Instance Segmentation. 1-5 - Ahmet M. Elbir, Kumar Vijay Mishra, Symeon Chatzinotas:
NBA-OMP: Near-Field Beam-Split-Aware Orthogonal Matching Pursuit for Wideband THz Channel Estimation. 1-5 - Ruiming Guo, Ayush Bhandari:
ITER-SIS: Robust Unlimited Sampling Via Iterative Signal Sieving. 1-5 - Reo Yoneyama, Ryuichi Yamamoto, Kentaro Tachibana:
Nonparallel High-Quality Audio Super Resolution with Domain Adaptation and Resampling CycleGANs. 1-5 - Jun Zhang:
Surrogate Based Post-HOC Calibration for Distributional Shift. 1-5 - Leo Davy, Nelly Pustelnik, Patrice Abry:
Combining Dual-Tree Wavelet Analysis and Proximal Optimization for Anisotropic Scale-Free Texture Segmentation. 1-5 - Lechi Li, Chen Dai, Yuxuan Xia, Lennart Svensson:
Deep Fusion of Multi-Object Densities Using Transformer. 1-5 - Kaihui Cheng, Chule Yang, Zunlin Fan, Dayan Wu, Naiyang Guan:
TeAw: Text-Aware Few-Shot Remote Sensing Image Scene Classification. 1-5 - Hao Zhang, Meng Yu, Dong Yu:
Deep AHS: A Deep Learning Approach to Acoustic Howling Suppression. 1-5 - Jean-Marie Lemercier, Julius Richter, Simon Welker, Timo Gerkmann:
Analysing Diffusion-based Generative Approaches Versus Discriminative Approaches for Speech Restoration. 1-5 - Shengchang Xiao, Xueshuai Zhang, Pengyuan Zhang:
Multi-Dimensional Frequency Dynamic Convolution with Confident Mean Teacher for Sound Event Detection. 1-5 - Xiaoqiang Wang, Yanqing Liu, Jinyu Li, Sheng Zhao:
Improving Contextual Spelling Correction by External Acoustics Attention and Semantic Aware Data Augmentation. 1-5 - Jaeyoung Huh, Shujaat Khan, Eun Sun Lee, Jong Chul Ye:
Ultrasound Image Quality Control Using Speech-Assisted Switchable CycleGAN. 1-5 - Yu Zhou, Liyuan Guo, Lianghai Jin:
Quaternion Orthogonal Transformer for Facial Expression Recognition in the Wild. 1-5 - Li Zhang, Qing Wang, Hongji Wang, Yue Li, Wei Rao, Yannan Wang, Lei Xie:
Distance-Based Weight Transfer for Fine-Tuning From Near-Field to Far-Field Speaker Verification. 1-5 - Dor H. Shmuel, Julian P. Merkofer, Guy Revach, Ruud J. G. van Sloun, Nir Shlezinger:
Deep Root MUSIC Algorithm for Data-Driven DoA Estimation. 1-5 - Nabil Mohsen, Ammar Hawbani, Xingfu Wang, Benjamin Bairrington, Liang Zhao, Saeed H. Alsamhi:
Enhanced Coprime Array Configuration for DoA Estimation of Non-Circular Signals. 1-5 - Rahul Mourya, João F. C. Mota:
MCNeT: Measurement-Consistent Networks Via A Deep Implicit Layer For Solving Inverse Problems. 1-5 - Joachim Ott, Shih-Chii Liu:
Biologically-Inspired Continual Learning of Human Motion Sequences. 1-5 - Daisuke Niizumi, Daiki Takeuchi, Yasunori Ohishi, Noboru Harada, Kunio Kashino:
Masked Modeling Duo: Learning Representations by Encouraging Both Networks to Model the Input. 1-5 - Randall Balestriero, Yann LeCun:
Police: Provably Optimal Linear Constraint Enforcement For Deep Neural Networks. 1-5 - Yiwei Chen, Chen Jiang, Yu Pan:
Single-Photon Image Super-Resolution via Self-Supervised Learning. 1-5 - Frank Zalkow, Prachi Govalkar, Meinard Müller, Emanuël A. P. Habets, Christian Dittmar:
Evaluating Speech-Phoneme Alignment and its Impact on Neural Text-To-Speech Synthesis. 1-5 - Pablo Pérez Zarazaga, Gustav Eje Henter, Zofia Malisz:
A Processing Framework to Access Large Quantities of Whispered Speech Found in ASMR. 1-5 - Jiawei Jiang, Yuchao Feng, Honghui Xu, Jianwei Zheng:
Low-Dose CT Reconstruction Via Optimization-Inspired GAN. 1-5 - Kaiwei Dai, Nan Kang, Li Kuang:
CTTSR: A Hybrid CNN-Transformer Network for Scene Text Image Super-Resolution. 1-5 - Yuhui Zhao, Ruichun Yang, Ning Yang, Tao Lin, Qiuai Fu, Yuchi Ma:
Robust Log-Based Anomaly Detection with Hierarchical Contrastive Learning. 1-5 - Aradhita Sharma, Glen S. Uehara, Vivek Sivaraman Narayanaswamy, Leslie Miller, Andreas Spanias:
Signal Analysis-Synthesis Using the Quantum Fourier Transform. 1-5 - Yang Lu, Pinxin Qian, Gang Huang, Hanzi Wang:
Personalized Federated Learning on Long-Tailed Data via Adversarial Feature Augmentation. 1-5 - Yao Sun, Hanyi Zhang, Longbiao Wang, Kong Aik Lee, Meng Liu, Jianwu Dang:
Noise-Disentanglement Metric Learning for Robust Speaker Verification. 1-5 - Apoorva Chawla, Domenico Ciuonzo, Pierluigi Salvo Rossi:
Sparse Bayesian Learning Assisted Decision Fusion in Millimeter Wave Massive MIMO Sensor Networks. 1-5 - Zhiyu Liu, Baojiang Zhong:
Line Segment Matching Based on Intersection-Enhanced Point Correspondences. 1-5 - Qiongqiong Wang, Kong Aik Lee, Tianchi Liu:
Incorporating Uncertainty from Speaker Embedding Estimation to Speaker Verification. 1-5 - Bastian Eisele, Ali Bereyhi, Ralf R. Müller:
Multiple Target Measurements: Bayesian Framework for Moving Object Detection in MIMO Radar. 1-5 - Xie Chen, Ziyang Ma, Changli Tang, Yujin Wang, Zhisheng Zheng:
Front-End Adapter: Adapting Front-End Input of Speech Based Self-Supervised Learning for Speech Recognition. 1-5 - Jiyoung Lee, Joon Son Chung, Soo-Whan Chung:
Imaginary Voice: Face-Styled Diffusion Model for Text-to-Speech. 1-5 - Desh Raj, Junteng Jia, Jay Mahadeokar, Chunyang Wu, Niko Moritz, Xiaohui Zhang, Ozlem Kalinli:
Anchored Speech Recognition with Neural Transducers. 1-5 - Yeqin Zhang, Haomin Fu, Cheng Fu, Haiyang Yu, Yongbin Li, Cam-Tu Nguyen:
Coarse-To-Fine Knowledge Selection for Document Grounded Dialogs. 1-5 - Ehsan Variani, Ke Wu, David Rybach, Cyril Allauzen, Michael Riley:
Alignment Entropy Regularization. 1-5 - Ruiqi Jia, Xianbing Feng, Xiaoqing Lyu, Zhi Tang:
Graph-Graph Context Dependency Attention for Graph Edit Distance. 1-5 - Liu Chen, Michael Deisher, Munir Georges:
An End-to-End Neural Network for Image-to-Audio Transformation. 1-5 - Jiatong Shi, Yun Tang, Ann Lee, Hirofumi Inaguma, Changhan Wang, Juan Pino, Shinji Watanabe:
Enhancing Speech-To-Speech Translation with Multiple TTS Targets. 1-5 - Qi Wang, Weiwei Fang, Meng Wang, Yusong Cheng:
Classification-Based Dynamic Network for Efficient Super-Resolution. 1-5 - Danwei Cai, Weiqing Wang, Ming Li, Rui Xia, Chuanzeng Huang:
Pretraining Conformer with ASR for Speaker Verification. 1-5 - Junhao Hu, Shirin Shoushtari, Zihao Zou, Jiaming Liu, Zhixin Sun, Ulugbek S. Kamilov:
Robustness of Deep Equilibrium Architectures to Changes in the Measurement Model. 1-5 - Koshi Watanabe, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama:
Learning Graph Laplacian from Intrinsic Patterns via Gaussian Process. 1-5 - Tung-Sheng Huang, Ping-Chung Yu, Li Su:
Note and Playing Technique Transcription of Electric Guitar Solos in Real-World Music Performance. 1-5 - Alessandro Ragano, Emmanouil Benetos, Andrew Hines:
Audio Quality Assessment of Vinyl Music Collections Using Self-Supervised Learning. 1-5 - Mariel Estévez, Luciana Ferrer:
Study on the Fairness of Speaker Verification Systems Across Accent and Gender Groups. 1-5 - Huicheng Pi, Senmao Tian, Ming Lu, Jiaming Liu, Yandong Guo, Shunli Zhang:
A Comprehensive Comparison of Projections in Omnidirectional Super-Resolution. 1-5 - Kun Yang, Jing Liu, Dingkang Yang, Hanqi Wang, Peng Sun, Yanni Zhang, Yan Liu, Liang Song:
A Novel Efficient Multi-View Traffic-Related Object Detection Framework. 1-5 - Yixuan Weng, Bin Li:
Visual Answer Localization with Cross-Modal Mutual Knowledge Transfer. 1-5 - Hongjian Xiao, Ling Li, Danilo P. Mandic:
ClassA Entropy for the Analysis of Structural Complexity of Physiological Signals. 1-5 - Jiahong Li, Chenda Li, Yifei Wu, Yanmin Qian:
Robust Audio-Visual ASR with Unified Cross-Modal Attention. 1-5 - Zhuangqi Chen, Xianjun Xia, Siyu Sun, Ziqian Wang, Cheng Chen, Guoliang Xie, Pingjian Zhang, Yijian Xiao:
A Progressive Neural Network for Acoustic Echo Cancellation. 1-2 - Alexander Fuchs, Christian Knoll, Nima N. Moghadam, Alexey Pak, Jinliang Huang, Erik Leitinger, Franz Pernkopf:
Self-Attention for Enhanced OAMP Detection in MIMO Systems. 1-5 - Xingming Lv, Lei Wu, Zhenwei Cheng, Xiangxu Meng:
End-to-End Unsupervised Sketch to Image Generation. 1-5 - Jingchen Xu, Yali Zhang, Ze Li, Jinjia Wang:
Image Fusion Via Slice-Based Convolutional Sparse Representation. 1-5 - Qiu-Shi Zhu, Long Zhou, Jie Zhang, Shujie Liu, Yu-Chen Hu, Li-Rong Dai:
Robust Data2VEC: Noise-Robust Speech Representation Learning for ASR by Combining Regression and Improved Contrastive Learning. 1-5 - Xiaohui Liu, Meng Liu, Longbiao Wang, Kong Aik Lee, Hanyi Zhang, Jianwu Dang:
Leveraging Positional-Related Local-Global Dependency for Synthetic Speech Detection. 1-5 - Yuanchao Li, Peter Bell, Catherine Lai:
Multimodal Dyadic Impression Recognition via Listener Adaptive Cross-Domain Fusion. 1-5 - Yuntong Li, Shaowei Wang, Yingying Wang, Jin Li, Yuqiu Qian, Bangzhou Xin, Wei Yang:
Fine-Grained Private Knowledge Distillation. 1-5 - Florian Klein, Sebastià V. Amengual Garí:
The R3VIVAL Dataset: Repository of Room Responses and 360 Videos of a Variable Acoustics Lab. 1-5 - Jiahao Liu, Pengcheng Guo, Yonghong Song:
NC-WAMKD: Neighborhood Correction Weight-Adaptive Multi-Teacher Knowledge Distillation for Graph-Based Semi-Supervised Node Classification. 1-5 - Amrit Romana, Kazuhito Koishida:
Toward A Multimodal Approach for Disfluency Detection and Categorization. 1-5 - Payal Mohapatra, Akash Pandey, Sinan Keten, Wei Chen, Qi Zhu:
Person Identification with Wearable Sensing Using Missing Feature Encoding and Multi-Stage Modality Fusion. 1-2 - Ze-Yu Mi, Yu-Bin Yang:
Dual Meta Calibration Mix for Improving Generalization in Meta-Learning. 1-5 - Nati Ofir:
Multispectral Image Fusion based on Super Pixel Segmentation. 1-5 - Feng-Ju Chang, Thejaswi Muniyappa, Kanthashree Mysore Sathyendra, Kai Wei, Grant P. Strimel, Ross McGowan:
Dialog Act Guided Contextual Adapter for Personalized Speech Recognition. 1-5 - Ioana Boier:
Multiresolution Signal Processing of Financial Market Objects. 1-5 - Tao Wang, Hui Yu, Zexin Lu, Zhongzhou Zhang, Jiliu Zhou, Yi Zhang:
Stay In The Middle: A Semi-Supervised Model for CT Metal Artifact Reduction. 1-5 - Yawen Yang, Xuming Hu, Fukun Ma, Shu'ang Li, Aiwei Liu, Lijie Wen, Philip S. Yu:
Gaussian Prior Reinforcement Learning for Nested Named Entity Recognition. 1-5 - Khanh Quoc Dinh, Kwang Pyo Choi:
Learned Video Coding with Motion Compensation Mixture Model. 1-5 - Yunqing Hu, Xuan Jin, Xi Chen, Yin Zhang:
Dual Collaborative Visual-Semantic Mapping for Multi-Label Zero-Shot Image Recognition. 1-5 - Amit Kumar Singh Yadav, Ziyue Xiang, Emily R. Bartusiak, Paolo Bestagini, Stefano Tubaro, Edward J. Delp:
ASSD: Synthetic Speech Detection in the AAC Compressed Domain. 1-5 - Cassia Valentini-Botinhao, Andrea Lorena Aldana Blanco, Ondrej Klejch, Peter Bell:
Efficient Intelligibility Evaluation Using Keyword Spotting: A Study on Audio-Visual Speech Enhancement. 1-5 - Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Roshan S. Sharma, Kohei Matsuura, Shinji Watanabe:
Speech Summarization of Long Spoken Document: Improving Memory Efficiency of Speech/Text Encoders. 1-5 - Gustav Zetterqvist, Fredrik Gustafsson, Gustaf Hendeby:
Using Received Power in Microphone Arrays to Estimate Direction of Arrival. 1-5 - Rui Zhang, Yajing Sun, Jingyuan Yang, Wei Peng:
Knowledge-Augmented Frame Semantic Parsing with Hybrid Prompt-Tuning. 1-5 - Minsu Kim, Joanna Hong, Yong Man Ro:
Lip-to-Speech Synthesis in the Wild with Multi-Task Learning. 1-5 - Tianxiao Xu, Zihao Zheng, Xinshuo Hu, Zetian Sun, Yu Zhao, Baotian Hu:
HITSZ TMG at ICASSP 2023 SPGC Shared Task: Leveraging Pre-Training and Distillation Method for Title Generation with Limited Resource. 1-2 - Bo Peng, Liren He, Dong Wu, Mingmin Chi, Jintao Chen:
A Multi-Signal Perception Network for Textile Composition Identification. 1-5 - Farhad Javanmardi, Saska Tirronen, Manila Kodali, Sudarsana Reddy Kadiri, Paavo Alku:
Wav2vec-Based Detection and Severity Level Classification of Dysarthria From Speech. 1-5 - Jun Chen, Yupeng Shi, Wenzhe Liu, Wei Rao, Shulin He, Andong Li, Yannan Wang, Zhiyong Wu, Shidong Shang, Chengshi Zheng:
Gesper: A Unified Framework for General Speech Restoration. 1-2 - Bo Chen, Chao Wu, Wenbin Zhao:
SEPDIFF: Speech Separation Based on Denoising Diffusion Model. 1-5 - Dayong Li, Xian Li, Xiaofei Li:
DVQVC: An Unsupervised Zero-Shot Voice Conversion Framework. 1-5 - Danna Xue, Luis Herranz, Javier Vazquez-Corral, Yanning Zhang:
Burst Perception-Distortion Tradeoff: Analysis and Evaluation. 1-5 - Xudong Tan, Menghan Hu, Guangtao Zhai, Yan Zhu, Wenfang Li, Xiaoping Zhang:
Unobtrusive Respiratory Monitoring System for Intensive Care. 1-5 - Hyunseung Chung, Jiho Kim, Joon-Myoung Kwon, Ki-Hyun Jeon, Min Sung Lee, Edward Choi:
Text-to-ECG: 12-Lead Electrocardiogram Synthesis Conditioned on Clinical Text Reports. 1-5 - Xiaoheng Sun, Yuejie Gao, Hanyao Lin, Huaping Liu:
Tg-Critic: A Timbre-Guided Model For Reference-Independent Singing Evaluation. 1-5 - Mengqun Jin, Kai Li, Shuyan Li, Chunming He, Xiu Li:
Towards Realizing the Value of Labeled Target Samples: A Two-Stage Approach for Semi-Supervised Domain Adaptation. 1-5 - Kuan-Chen Wang, Kai-Chun Liu, Sheng-Yu Peng, Yu Tsao:
ECG Artifact Removal from Single-Channel Surface EMG Using Fully Convolutional Networks. 1-5 - Landon Butler, Alejandro Parada-Mayorga, Alejandro Ribeiro:
Learning with Multigraph Convolutional Filters. 1-5 - Yikun Xiang, Feng Xi, Shengyao Chen:
LiQuiD-MIMO Radar: Distributed MIMO Radar with Low-Bit Quantization. 1-5 - Xiaorong Shi, Liping Yi, Xiaoguang Liu, Gang Wang:
FFEDCL: Fair Federated Learning with Contrastive Learning. 1-5 - Di Xiao, Qin Tang, Aozhu Zhao, Min Li:
Robust Watermarking Scheme in Encrypted Domain Based on Integer Lifting Wavelet Transform and Compressed Sensing. 1-5 - Zhihong Zhu, Weiyuan Xu, Xuxin Cheng, Tengtao Song, Yuexian Zou:
A Dynamic Graph Interactive Framework with Label-Semantic Injection for Spoken Language Understanding. 1-5 - Jianrong Wang, Jinyu Liu, Xuewei Li, Mei Yu, Jie Gao, Qiang Fang, Li Liu:
Two-Stream Joint-Training for Speaker Independent Acoustic-to-Articulatory Inversion. 1-5 - Donghyeong Kim, Chaewon Park, Suhwan Cho, Sangyoun Lee:
FAPM: Fast Adaptive Patch Memory for Real-Time Industrial Anomaly Detection. 1-5 - Angela F. Gao, Oscar Leong, He Sun, Katherine L. Bouman:
Image Reconstruction without Explicit Priors. 1-5 - Shuo Tang, Gerald LaMountain, Tales Imbiriba, Pau Closas:
On Parametric Misspecified Bayesian Cramér-Rao Bound: An Application to Linear/Gaussian Systems. 1-5 - Mikael Sørensen, Nicholas D. Sidiropoulos:
Radio-Astronomy Imaging and Interference Excision Using Tensor Decomposition and Canonical Correlation Analysis. 1-5 - Evan Bell, Shijun Liang, Qing Qu, Saiprasad Ravishankar:
Robust Self-Guided Deep Image Prior. 1-5 - Guan-Ting Lin, Qingming Tang, Chieh-Chi Kao, Viktor Rozgic, Chao Wang:
Weight-Sharing Supernet for Searching Specialized Acoustic Event Classification Networks Across Device Constraints. 1-5 - Victor Ardulov, Shrikanth Narayanan:
Navigating and Reaching Therapeutic Goals with Dynamical Systems in Conversation-Based Interventions. 1-5 - Michele Cirillo, Virginia Bordignon, Vincenzo Matta, Ali H. Sayed:
The Role of Memory in Social Learning When Sharing Partial Opinions. 1-5 - Yurun He, Nobuaki Minematsu, Daisuke Saito:
Multiple Acoustic Features Speech Emotion Recognition Using Cross-Attention Transformer. 1-5 - Xin-Yi Li, Pei-Nan Zhong, Di Chen, Yu-Bin Yang:
Core: Transferable Long-Range Time Series Forecasting Enhanced by Covariates-Guided Representation. 1-5 - Kaiyun Zhang, Wenkang Fan, Yinran Chen, Xiongbiao Luo:
DGN: Descriptor Generation Network for Feature Matching in Monocular Endoscopy 3D Reconstruction. 1-5 - Ping Jiang, Xiaoheng Deng, Shichao Zhang:
Decoupled Visual Causality for Robust Detection. 1-5 - Naveed Ahmad, Malcolm Egan, Jean-Marie Gorce, Jilles Steeve Dibangoye, Frédéric Le Mouël:
Optimization of Sensor Configurations for Fault Identification in Smart Buildings. 1-5 - Mohamad Jouni, Daniele Picone, Mauro Dalla Mura:
Model-Based Spectral Reconstruction Of Interferometric Acquisitions. 1-5 - Zhikang Zhang, Bruno Machado Trindade, Michael Green, Zifan Yu, Christopher Pawlowicz, Fengbo Ren:
Automatic Error Detection in Integrated Circuits Image Segmentation: A Data-Driven Approach. 1-5 - Vasileios Kalantzis, Panagiotis A. Traganitis:
Matrix Resolvent Eigenembeddings for Dynamic Graphs. 1-5 - Bo Wen, Chen Du, Truong Q. Nguyen:
Encoder-Decoder Graph Convolutional Network for Automatic Timed-Up-and-Go and Sit-to-Stand Segmentation. 1-5 - Cong Xu, Yuhang Li, Dae Lee, Dae Hoon Park, Hongda Mao, Huyen Do, Jonathan Chung, Dinesh Nair:
Augmentation Robust Self-Supervised Learning for Human Activity Recognition. 1-5 - Ming Cheng, Haoxu Wang, Ziteng Wang, Qiang Fu, Ming Li:
The WHU-Alibaba Audio-Visual Speaker Diarization System for the MISP 2022 Challenge. 1-2 - Thibault Sellam, Ankur Bapna, Joshua Camp, Diana Mackinnon, Ankur P. Parikh, Jason Riesa:
SQuId: Measuring Speech Naturalness in Many Languages. 1-5 - Shun Zhang, Tongliang Li, Jiaqi Bai, Zhoujun Li:
Label-Guided Contrastive Learning for Out-of-Domain Detection. 1-5 - Gaojie Li, Qing Liu, Haotian Liu, Yixiong Liang:
A Novel Transformer-Based Pipeline for Lung Cytopathological Whole Slide Image Classification. 1-5 - Haibo Shen, Yihao Luo, Xiang Cao, Liangqi Zhang, Juyu Xiao, Tianjiang Wang:
Training Stronger Spiking Neural Networks with Biomimetic Adaptive Internal Association Neurons. 1-5 - Siddhant Arora, Hayato Futami, Emiru Tsunoo, Brian Yan, Shinji Watanabe:
Joint Modelling of Spoken Language Understanding Tasks with Integrated Dialog History. 1-5 - Chenye Cui, Zhou Zhao, Yi Ren, Jinglin Liu, Rongjie Huang, Feiyang Chen, Zhefeng Wang, Baoxing Huai, Fei Wu:
VarietySound: Timbre-Controllable Video to Sound Generation Via Unsupervised Information Disentanglement. 1-5 - Yu Tang, Hongwei Zhang:
A Frequency-Weighted Leaky FxLMS Algorithm with Application to Feedback Active Noise Control Systems. 1-5 - Shi Tang, Xinchen Ye, Fei Xue, Rui Xu:
Cross-Modality Depth Estimation via Unsupervised Stereo RGB-to-Infrared Translation. 1-5 - Lihua Zhang, Quan Liu, Xiongzhen Zhang, Yapeng Xu:
A Perturbation-Based Policy Distillation Framework with Generative Adversarial Nets. 1-5 - Shuqi Sun, Xiaohui Yang, Jingliang Peng:
YOLO-Based Lightweight Object Detection With Structure Simplification And Attention Enhancement. 1-5 - Usman Mahmood, Zening Fu, Vince D. Calhoun, Sergey M. Plis:
Glacier: Glass-Box Transformer for Interpretable Dynamic Neuroimaging. 1-5 - Xiangze Bao, Yun-Hao Yuan, Yun Li, Jipeng Qiang, Yi Zhu:
Learning Supervised Covariation Projection Through General Covariance. 1-5 - Pengcheng Dong, Chuntao Wang, Zhenyong Lu, Kai Zhang, Wenbo Wan, Jiande Sun:
S-Feature Pyramid Network and Attention Model for Drone Detection. 1-2 - Yun Zhong, Fan Zhang, Yiannis Demiris:
Contrastive Self-Supervised Learning for Automated Multi-Modal Dance Performance Assessment. 1-5 - Xin-Yi Li, Wei-Jun Lei, Yu-Bin Yang:
From Easy to Hard: Two-Stage Selector and Reader for Multi-Hop Question Answering. 1-5 - Yuhan Zhang, He Zhu, Shan Yu:
Adaptive Data Augmentation for Contrastive Learning. 1-5 - Tomohiko Nakamura, Shinnosuke Takamichi, Naoko Tanji, Satoru Fukayama, Hiroshi Saruwatari:
jaCappella Corpus: A Japanese a Cappella Vocal Ensemble Corpus. 1-5 - Liang Xu, Lizhong Wang, Sijun Bi, Hanyue Liu, Jing Wang:
Semi-Supervised Sound Event Detection with Pre-Trained Model. 1-5 - Weihong Bao, Liyang Chen, Chaoyong Zhou, Sicheng Yang, Zhiyong Wu:
Wavsyncswap: End-To-End Portrait-Customized Audio-Driven Talking Face Generation. 1-5 - Haonan Han, Rui Yang, Shuyan Li, Runze Hu, Xiu Li:
SSGD: A Smartphone Screen Glass Dataset for Defect Detection. 1-5 - Rami Botros, Rohit Prabhavalkar, Johan Schalkwyk, Ciprian Chelba, Tara N. Sainath, Françoise Beaufays:
Lego-Features: Exporting Modular Encoder Features for Streaming and Deliberation ASR. 1-5 - Ting Hu, Christoph Meinel, Haojin Yang:
Boosting BERT Subnets with Neural Grafting. 1-5 - Jielin Qiu, Jiacheng Zhu, Mengdi Xu, Peide Huang, Michael A. Rosenberg, Douglas Weber, Emerson Liu, Ding Zhao:
Cardiac Disease Diagnosis on Imbalanced Electrocardiography Data Through Optimal Transport Augmentation. 1-5 - Nicholas Glaze, Artun Bayer, Xiaoqian Jiang, Sean I. Savitz, Santiago Segarra:
Graph Representation Learning For Stroke Recurrence Prediction. 1-5 - Qidong Wang, Lili Guo, Shifei Ding, Jian Zhang, Xiao Xu:
SFEMGN: Image Denoising with Shallow Feature Enhancement Network and Multi-Scale ConvGRU. 1-5 - Zihao Cui, Shilei Zhang, Yanan Chen, Yingying Gao, Chao Deng, Junlan Feng:
Semi-Supervised Speech Enhancement Based On Speech Purity. 1-5 - Lucas Goncalves, Carlos Busso:
Learning Cross-Modal Audiovisual Representations with Ladder Networks for Emotion Recognition. 1-5 - Rapolas Daugintis, Roberto Barumerli, Lorenzo Picinali, Michele Geronazzo:
Classifying Non-Individual Head-Related Transfer Functions with A Computational Auditory Model: Calibration And Metrics. 1-5 - Sashi Novitasari, Sakriani Sakti, Satoshi Nakamura:
Self-Adaptive Incremental Machine Speech Chain for Lombard TTS with High-Granularity ASR Feedback in Dynamic Noise Condition. 1-5 - Ernst Aschbacher, Florian Frühauf, Anja Kurz, Peter Nopp:
Design and Performance of the Low-Power Noise Reduction Algorithm of the Med-El Sonnet 2™ Cochlear Implant Audio Processor. 1-5 - Tanuka Bhattacharjee, Yamini Belur, Atchayaram Nalini, Ravi Yadav, Prasanta Kumar Ghosh:
Exploring the Role of Fricatives in Classifying Healthy Subjects and Patients with Amyotrophic Lateral Sclerosis and Parkinson's Disease. 1-5 - Yusuke Akamatsu, Yoshifumi Onishi, Hitoshi Imaoka:
Blood Oxygen Saturation Estimation from Facial Video Via DC and AC Components of Spatio-Temporal Map. 1-5 - Kleanthis Avramidis, Shanti Stewart, Shrikanth Narayanan:
On the Role of Visual Context in Enriching Music Representations. 1-5 - Decai Chen, Haofei Lu, Ingo Feldmann, Oliver Schreer, Peter Eisert:
Dynamic Multi-View Scene Reconstruction Using Neural Implicit Surface. 1-5 - Marco Olivieri, Luca Comanducci, Mirco Pezzoli, Davide Balsarri, Luca Menescardi, Michele Buccoli, Simone Pecorino, Antonio Grosso, Fabio Antonacci, Augusto Sarti:
Real-Time Multichannel Speech Separation and Enhancement Using a Beamspace-Domain-Based Lightweight CNN. 1-5 - Dancheng Liu, Kazim Ergun, Tajana Simunic Rosing:
Towards a Robust and Efficient Classifier for Real World Radio Signal Modulation Classification. 1-5 - Joseph Caroselli, Arun Narayanan, Nathan Howard, Tom O'Malley:
Cleanformer: A Multichannel Array Configuration-Invariant Neural Enhancement Frontend for ASR in Smart Speakers. 1-5 - Luis Felipe Florenzan Reyes, Francesco Smarra, Alessandro D'Innocenzo, Marco Levorato:
CADET: Control-Aware Dynamic Edge Computing for Real-Time Target Tracking in UAV Systems. 1-5 - Qi Hu, Ning Ma, Guy J. Brown:
Robust Binaural Sound Localisation with Temporal Attention. 1-5 - Royston Rodrigues, Masahiro Tani:
SemGeo: Semantic Keywords for Cross-View Image Geo-Localization. 1-5 - Ruizhi Wang, Xiangtao Wang, Zhenghua Xu, Wenting Xu, Junyang Chen, Thomas Lukasiewicz:
MvCo-DoT: Multi-View Contrastive Domain Transfer Network for Medical Report Generation. 1-5 - Han Zhang, Maoguo Gong, Feiping Nie, Xuelong Li:
Multilayer Subspace Learning With Self-Sparse Robustness for Two-Dimensional Feature Extraction. 1-5 - Yuzi Yan, Yu Dong, Kai Ma, Yuan Shen:
Approximation Error Back-Propagation for Q-Function in Scalable Reinforcement Learning with Tree Dependence Structure. 1-5 - Junwei Huang, Karthik Ganesan, Soumi Maiti, Young Min Kim, Xuankai Chang, Paul Liang, Shinji Watanabe:
FindAdaptNet: Find and Insert Adapters by Learned Layer Importance. 1-5 - Xingchao Jian, Wee Peng Tay:
Kernel Ridge Regression for Generalized Graph Signal Processing. 1-5 - Jun Shi, Bingcai Wei, Gang Zhou, Liye Zhang:
Sandformer: CNN and Transformer under Gated Fusion for Sand Dust Image Restoration. 1-5 - Yaoyao Du, Zixiao Zhang, Zhihao Li, Peng Wei, Qingmin Liao, Wenming Yang:
Flowpose: Conditional Normalizing Flows for 3D Human Pose and Shape Estimation from Monocular Videos. 1-5 - Ganghui Ru, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao:
Improving Music Genre Classification from multi-modal Properties of Music and Genre Correlations Perspective. 1-5 - Renyou Xie, Chaojie Li, Xiaojun Zhou, Zhaoyang Dong:
Asynchronous Federated Learning for Real-Time Multiple Licence Plate Recognition Through Semantic Communication. 1-5 - Haiyu Zhang, Shaolin Su, Yu Zhu, Jinqiu Sun, Yanning Zhang:
Boosting No-Reference Super-Resolution Image Quality Assessment with Knowledge Distillation and Extension. 1-5 - Marc Baltes, Nidal Abuhajar, Ye Yue, Charles D. Smith, Jundong Liu:
Joint ANN-SNN Co-training for Object Localization and Image Segmentation. 1-5 - Subba Reddy Oota, Khushbu Pahwa, Mounika Marreddy, Manish Gupta, Raju S. Bapi:
Neural Architecture of Speech. 1-5 - Hengfang Wang, Yasi Zhang, Xiaojun Mao, Zhonglei Wang:
Transductive Matrix Completion with Calibration for Multi-Task Learning. 1-5 - Anjie Peng, Zhi Lin, Hui Zeng, Wenxin Yu, Xiangui Kang:
Boosting Transferability of Adversarial Example via an Enhanced Euler's Method. 1-5 - Thomas Feuillen, Shankar Mysore Rama R. Bhavani, Ayush Bhandari:
Unlimited Sampling Radar: Life Below the Quantization Noise. 1-5 - Wenjing Han, Yirong Chen, Xiaofen Xing, Guohua Zhou, Xiangmin Xu:
Speaker-Aware Hierarchical Transformer For Personality Recognition In Multiparty Dialogues. 1-5 - Tobias Cord-Landwehr, Christoph Böddeker, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach:
Frame-Wise and Overlap-Robust Speaker Embeddings for Meeting Diarization. 1-5 - Burak Bartan, Mert Pilanci:
Convex Optimization of Deep Polynomial and ReLU Activation Neural Networks. 1-5 - Wen-Chiao Tsai, Chi-Wei Chen, An-Yeu Andy Wu:
Compressive Channel Estimation for IRS-Aided Millimeter-Wave Systems via Two-Stage Lamp Network. 1-5 - Xiangyu Huang, Caidan Zhao, Zhiqiang Wu:
A Video Anomaly Detection Framework Based on Appearance-Motion Semantics Representation Consistency. 1-5 - Ruimin Peng, Changming Zhao, Yifan Xu, Jun Jiang, Guangtao Kuang, Jianbo Shao, Dongrui Wu:
WAVELET2VEC: A Filter Bank Masked Autoencoder for EEG-Based Seizure Subtype Classification. 1-5 - Yujie Yang, Kun Zhang, Zhiyong Wu, Helen Meng:
Keyword-Specific Acoustic Model Pruning for Open-Vocabulary Keyword Spotting. 1-5 - Dongji Gao, Jiatong Shi, Shun-Po Chuang, Leibny Paola García, Hung-Yi Lee, Shinji Watanabe, Sanjeev Khudanpur:
EURO: ESPnet Unsupervised ASR Open-Source Toolkit. 1-5 - Christian A. Schroth, Stefan Vlaski, Abdelhak M. Zoubir:
Robust M-Estimation Based Distributed Expectation Maximization Algorithm with Robust Aggregation. 1-5 - Shai Aharon, Gil Ben-Artzi:
Hypernetwork-Based Adaptive Image Restoration. 1-5 - Chenxu Wang, Ping Jian, Hai Wang:
Numerical Semantic Modeling for Implicit Discourse Relation Recognition. 1-5 - Yuguang Yang, Yu Pan, Jingjing Yin, Jiangyu Han, Lei Ma, Heng Lu:
Hybridformer: Improving Squeezeformer with Hybrid Attention and NSR Mechanism. 1-5 - Evandro Gouvêa, Ali Dadgar, Shahab Jalalvand, Rathi Chengalvarayan, Badrinath Jayakumar, Ryan Price, Nicholas Ruiz, Jennifer McGovern, Srinivas Bangalore, Benjamin Stern:
TRUSTERA: A Live Conversation Redaction System. 1-5 - Qing Yao, Longxiu Huang, Sui Tang:
Space-Time Variable Density Samplings for Sparse Bandlimited Graph Signals Driven by Diffusion Operators. 1-5 - Michael P. Sheehan, Julián Tachella, Mike E. Davies:
Hardware Friendly Spline Sketched Lidar. 1-5 - José Vinícius de Miranda Cardoso, Jiaxi Ying, Sandeep Kumar, Daniel P. Palomar:
Estimating Normalized Graph Laplacians in Financial Markets. 1-5 - Eike Jannik Nustede, Jörn Anemüller:
Single-Channel Speech Enhancement with Deep Complex U-Networks and Probabilistic Latent Space Models. 1-5 - Tran Thanh Phong Nguyen, Son Lam Phung, Vinod Gopaldasani, Jane Whitelaw:
Bilateral Coarse-to-Fine Network for Point Cloud Completion. 1-5 - Bishwadeep Das, Elvin Isufi:
Online Vector Autoregressive Models Over Expanding Graphs. 1-5 - Yuxin Wen, Jonas Geiping, Micah Goldblum, Tom Goldstein:
STYX: Adaptive Poisoning Attacks Against Byzantine-Robust Defenses in Federated Learning. 1-5 - Denis C. Ilie-Ablachim, Andra Baltoiu, Bogdan Dumitrescu:
Sparse Representations with Cone Atoms. 1-5 - Gwantae Kim, Seonghyeok Noh, Insung Ham, Hanseok Ko:
MPE4G: Multimodal Pretrained Encoder for Co-Speech Gesture Generation. 1-5 - Jiangjing Hu, Fengyu Wang, Wenjun Xu, Hui Gao, Ping Zhang:
Scalable Multi-Task Semantic Communication System with Feature Importance Ranking. 1-5 - Xianchao Wu:
Enhancing Unsupervised Speech Recognition with Diffusion GANs. 1-5 - Timothée Dhaussy, Bassam Jabaian, Fabrice Lefèvre, Radu Horaud:
Audio-Visual Speaker Diarization in the Framework of Multi-User Human-Robot Interaction. 1-5 - Ruidi Chen, Boran Hao, Ioannis Ch. Paschalidis:
Distributionally Robust Multiclass Classification and Applications in Deep Image Classifiers. 1-2 - Vishal Sunder, Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury, Eric Fosler-Lussier:
Fine-Grained Textual Knowledge Transfer to Improve RNN Transducers for Speech Recognition and Understanding. 1-5 - Lifan Xu, Shunqiao Sun, Yimin D. Zhang, Athina P. Petropulu:
Joint Antenna Selection and Beamforming in Integrated Automotive Radar Sensing-Communications with Quantized Double Phase Shifters. 1-5 - Qingsong Wen, Linxiao Yang, Liang Sun:
Robust Dominant Periodicity Detection for Time Series with Missing Data. 1-5 - Yanyi Chen, Ruifang Liu, Xiyan Liu, Yidong Shi, Ge Bai:
An Interpretable Model Using Evidence Information for Multi-Hop Question Answering Over Long Texts. 1-5 - Heqing Cheng, Yong Feng, Mingliang Zhou, Xiancai Xiong, Yongheng Wang, Baohua Qiang:
Joint Robust Representation And Generalization Enhancement For Cross-Modality Person Re-Identification. 1-5 - Eun Jung Yeo, Kwanghee Choi, Sunhee Kim, Minhwa Chung:
Automatic Severity Classification of Dysarthric Speech by Using Self-Supervised Model with Multi-Task Learning. 1-5 - Peijie Dong, Xin Niu, Zhiliang Tian, Lujun Li, Xiaodong Wang, Zimian Wei, Hengyue Pan, Dongsheng Li:
Progressive Meta-Pooling Learning for Lightweight Image Classification Model. 1-5 - Chenyang Zhao, Chuanfei Hu, Hang Shao, Zhe Wang, Yongxiong Wang:
Towards Trustworthy Multi-Label Sewer Defect Classification via Evidential Deep Learning. 1-5 - Yuanming Zhang, Haoxin Ruan, Ziyan Yuan, Haoliang Du, Xia Gao, Jing Lu:
A Learnable Spatial Mapping for Decoding the Directional Focus of Auditory Attention Using EEG. 1-5 - Chenzhong Yin, Zhihong Pan, Xin Zhou, Le Kang, Paul Bogdan:
Raising The Limit of Image Rescaling Using Auxiliary Encoding. 1-5 - Bardia Safaei, Vibashan VS, Celso M. de Melo, Shuowen Hu, Vishal M. Patel:
Open-Set Automatic Target Recognition. 1-5 - Luan Vinícius Fiorio, Boris Karanov, Johan David, Wim J. van Houtum, Frans Widdershoven, Ronald M. Aarts:
Semi-Supervised Learning with Per-Class Adaptive Confidence Scores for Acoustic Environment Classification with Imbalanced Data. 1-5 - Loes van Bemmel, Zhuoran Liu, Nik Vaessen, Martha A. Larson:
Beyond Neural-on-Neural Approaches to Speaker Gender Protection. 1-5 - Hang Shao, Tian Tan, Wei Wang, Xun Gong, Yanmin Qian:
Joint Discriminator and Transfer Based Fast Domain Adaptation For End-To-End Speech Recognition. 1-5 - Marco Carpentiero, Vincenzo Matta, Ali H. Sayed:
Compressed Distributed Regression over Adaptive Networks. 1-5 - Zhaoyi Xu, Donglin Gao, Shuping Li, Chung-Tse Michael Wu, Athina P. Petropulu:
Flexible Beam Design for Vital Sign Monitoring Using a Phased Array Equipped With Double-Phase Shifters. 1-5 - Yihan Lin, Xunquan Chen, Ryoichi Takashima, Tetsuya Takiguchi:
Zero-Shot Sound Event Classification Using a Sound Attribute Vector with Global and Local Feature Learning. 1-5 - Abhishek Mondal, Deepak Mishra, Ganesh Prasad, Ashraf Hossain:
Deep Reinforcement Learning for Green UAV-Assisted Data Collection. 1-5 - Peter Wißbrock, Yvonne Richter, David Pelkmann, Zhao Ren, Gregory Palmer:
Cutting Through the Noise: An Empirical Comparison of Psycho-Acoustic and Envelope-based Features for Machinery Fault Detection. 1-5 - Yosuke Kashiwagi, Siddhant Arora, Hayato Futami, Jessica Huynh, Shih-Lun Wu, Yifan Peng, Brian Yan, Emiru Tsunoo, Shinji Watanabe:
E-Branchformer-Based E2E SLU Toward Stop on-Device Challenge. 1-2 - Gasper Begus, Alan Zhou, Peter Wu, Gopala Krishna Anumanchipalli:
Articulation GAN: Unsupervised Modeling of Articulatory Learning. 1-5 - Vlad Pandelea, Edoardo Ragusa, Paolo Gastaldo, Erik Cambria:
Selecting Language Models Features via Software-Hardware Co-Design. 1-5 - Georgi Shopov, Stefan Gerdjikov, Stoyan Mihov:
StreamSpeech: Low-Latency Neural Architecture for High-Quality on-Device Speech Synthesis. 1-5 - Le Trung Thanh, Aref Miri Rekavandi, Abd-Krim Seghouane, Karim Abed-Meraim:
Robust Subspace Tracking with Contamination Mitigation via α-Divergence. 1-5 - Jiayao Sun, Dawei Luo, Zhaoxia Li, Jindong Li, Yukai Ju, Yang Li:
Multi-Task Sub-Band Network For Deep Residual Echo Suppression. 1-2 - Yue Zhang, Tao Lei, Shaoxiong Han, Yetong Xu, Asoke K. Nandi:
Local-Global Siamese Network with Efficient Inter-Scale Feature Learning for Change Detection in VHR Remote Sensing Images. 1-5 - Mohammad Samragh, Arnav Kundu, Ting-Yao Hu, Aman Chadha, Ashish Srivastava, Minsik Cho, Oncel Tuzel, Devang Naik:
I See What You Hear: A Vision-Inspired Method to Localize Words. 1-5 - Tie Liu, Mai Xu, Shengxi Li, Chaoran Chen, Li Yang, Zhuoyi Lv:
Learnt Mutual Feature Compression for Machine Vision. 1-5 - Claudio Battiloro, Paolo Di Lorenzo, Sergio Barbarossa:
Topological Slepians: Maximally Localized Representations of Signals Over Simplicial Complexes. 1-5 - Jiaxiang You, Yuanman Li, Rongqin Liang, Yuxuan Tan, Jiantao Zhou, Xia Li:
Image Sharing Chain Detection via Sequence-To-Sequence Model. 1-5 - Fangzheng Lin, Heming Sun, Jinming Liu, Jiro Katto:
Multistage Spatial Context Models for Learned Image Compression. 1-5 - Taishi Nakashima, Rintaro Ikeshita, Nobutaka Ono, Shoko Araki, Tomohiro Nakatani:
Fast Online Source Steering Algorithm for Tracking Single Moving Source Using Online Independent Vector Analysis. 1-5 - Aniruddh Sikdar, Sumanth Udupa, Suresh Sundaram:
Fully Complex-Valued Deep Learning Model for Visual Perception. 1-5 - Shogo Seki, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko:
JSV-VC: Jointly Trained Speaker Verification and Voice Conversion Models. 1-5 - Haolan Zhan, Xuming Lin, Shaobo Cui, Zhongzhou Zhao, Wei Zhou, Haiqing Chen:
Towards Zero-Shot Personalized Table-to-Text Generation with Contrastive Persona Distillation. 1-5 - Hemant Yadav, Sunayana Sitaram, Rajiv Ratn Shah:
Analysing the Masked Predictive Coding Training Criterion for Pre-Training a Speech Representation Model. 1-5 - Ao Zhang, He Wang, Pengcheng Guo, Yihui Fu, Lei Xie, Yingying Gao, Shilei Zhang, Junlan Feng:
VE-KWS: Visual Modality Enhanced End-to-End Keyword Spotting. 1-5 - Yaowu Fan, Jia Wan, Yuan Yuan, Qi Wang:
Weakly-Supervised Scene-Specific Crowd Counting Using Real-Synthetic Hybrid Data. 1-5 - Nazif Can Tamer, Yigitcan Özer, Meinard Müller, Xavier Serra:
TAPE: An End-to-End Timbre-Aware Pitch Estimator. 1-5 - Marek Hilton, Pier Luigi Dragotti:
Sparse Asynchronous Samples from Networks of TEMs for Reconstruction of Classes of Non-Bandlimited Signals. 1-5 - Catalin Zorila, Rama Doddipatla:
On the Effectiveness of Monoaural Target Source Extraction for Distant end-to-end Automatic Speech Recognition. 1-5 - Gourav Datta, Zeyu Liu, Md. Abdullah-Al Kaiser, Souvik Kundu, Joe Mathai, Zihan Yin, Ajey P. Jacob, Akhilesh R. Jaiswal, Peter A. Beerel:
In-Sensor & Neuromorphic Computing Are all You Need for Energy Efficient Computer Vision. 1-5 - Qiankun Tang, Yuan Zhang, Xiaogang Xu, Jun Wang, Yimin Guo:
Input-Dependent Dynamical Channel Association For Knowledge Distillation. 1-5 - Yifeng Fan, Colin Vaz, Di He, Jahn Heymann, Viet Anh Trinh, Zhe Zhang, Venkatesh Ravichandran:
Towards Accurate and Real-Time End-of-Speech Estimation. 1-5 - Danwei Cai, Zexin Cai, Ming Li:
Identifying Source Speakers for Voice Conversion Based Spoofing Attacks on Speaker Verification Systems. 1-5 - Maxim Khomiakov, Alejandro Valverde Mahou, Alba Reinders Sánchez, Jes Frellsen, Michael Riis Andersen:
Learning To Generate 3D Representations of Building Roofs Using Single-View Aerial Imagery. 1-5 - Alexander Johnson, Vishwas M. Shetty, Mari Ostendorf, Abeer Alwan:
Leveraging Multiple Sources in Automatic African American English Dialect Detection for Adults and Children. 1-5 - Fuxiang Tao, Xuri Ge, Wei Ma, Anna Esposito, Alessandro Vinciarelli:
Multi-Local Attention for Speech-Based Depression Detection. 1-5 - Bin Duan, Xingxian Liu, Shusen Wang, Yajing Xu, Bo Xiao:
Relational Representation Learning for Zero-Shot Relation Extraction with Instance Prompting and Prototype Rectification. 1-5 - Yue Zheng, Yali Li, Shengjin Wang:
Divcon: Learning Concept Sequences for Semantically Diverse Image Captioning. 1-5 - Jun Bai, Chuantao Yin, Hanhua Hong, Jianfei Zhang, Chen Li, Yanmeng Wang, Wenge Rong:
Permutation Invariant Training for Paraphrase Identification. 1-5 - Fen Wang, Taihao Li, Xue Zhang:
Revisit Sampling Theory of Bandlimited Graph Signals: One Bridge Between GSP and DSP. 1-5 - Jian Ni, Yong Liao:
Structure-Preserving and Redundancy-Free Features Refinement for Generalized Zero-Shot Learning. 1-5 - Jiaping Yu, Tongqing Zhou, Zhiping Cai, Wenyuan Kuang:
Tracking Targets in Hyper-Scale Cameras Using Movement Predication. 1-5 - Zhiliang Wu, Kang Zhang, Changchang Sun, Hanyu Xuan, Yan Yan:
Flow-Guided Deformable Alignment Network with Self-Supervision for Video Inpainting. 1-5 - Rohit Agarwal, Gyanendra Das, Saksham Aggarwal, Alexander Horsch, Dilip K. Prasad:
MABNet: Master Assistant Buddy Network With Hybrid Learning for Image Retrieval. 1-5 - Vasudha Kowtha, Miquel Espi Marques, Jonathan Huang, Yichi Zhang, Carlos Avendaño:
Learning to Detect Novel and Fine-Grained Acoustic Sequences Using Pretrained Audio Representations. 1-5 - Nils Laurent, Sylvain Meignen, Marcelo Alejandro Colominas, Juan Manuel Miramont, François Auger:
A Novel Approach Based on Voronoï Cells to Classify Spectrogram Zeros of Multicomponent Signals. 1-5 - Nicolas Boizard, Kevin El Haddad, Thierry Ravet, François Cresson, Thierry Dutoit:
Deep Learning-Based Stereo Camera Multi-Video Synchronization. 1-5 - Clément Cosserat, Ben Gabrielson, Emilie Chouzenoux, Jean-Christophe Pesquet, Tülay Adali:
A Proximal Approach to IVA-G with Convergence Guarantees. 1-5 - Fabian Latorre, Chenghao Liu, Doyen Sahoo, Steven C. H. Hoi:
OTW: Optimal Transport Warping for Time Series. 1-5 - Jae-Hong Lee, Dong-Hyun Kim, Joon-Hyuk Chang:
Repackagingaugment: Overcoming Prediction Error Amplification in Weight-Averaged Speech Recognition Models Subject to Self-Training. 1-5 - Wei-Cheng Lin, Carlos Busso:
Role of Lexical Boundary Information in Chunk-Level Segmentation for Speech Emotion Recognition. 1-5 - Sabri Mustafa Kahya, Muhammet Sami Yavuz, Eckehard G. Steinbach:
MCROOD: Multi-Class Radar Out-Of-Distribution Detection. 1-5 - Zhen Wu, Yizhe Lu, Xinyu Dai:
An Empirical Study and Improvement for Speech Emotion Recognition. 1-5 - Tiantian Gong, Junsheng Wang, Liyan Zhang:
Rethink Pair-Wise Self-Supervised Cross-Modal Retrieval From A Contrastive Learning Perspective. 1-5 - Jhony H. Giraldo, Sajid Javed, Arif Mahmood, Fragkiskos D. Malliaros, Thierry Bouwmans:
Higher-Order Sparse Convolutions in Graph Neural Networks. 1-5 - Jian Wu, Zhuo Chen, Min Hu, Xiong Xiao, Jinyu Li:
Speaker Change Detection For Transformer Transducer ASR. 1-5 - Lixiang Lian, Ben Wang:
Regularized Deep Generative Model Learning for Real-Time Massive MIMO Channel Tracking. 1-5 - Xiyan Liu, Yidong Shi, Ruifang Liu, Ge Bai, Yanyi Chen:
Narrow Down Before Selection: A Dynamic Exclusion Model for Multiple-Choice QA. 1-5 - Yao Luo, Jinshan Pan, Jinhui Tang:
SVMV: Spatiotemporal Variance-Supervised Motion Volume for Video Frame Interpolation. 1-5 - Zhenyu Wang, Li Wan, Biqiao Zhang, Yiteng Huang, Shang-Wen Li, Ming Sun, Xin Lei, Zhaojun Yang:
Disentangled Training with Adversarial Examples for Robust Small-Footprint Keyword Spotting. 1-5 - Tuan Nguyen, Salima Mdhaffar, Natalia A. Tomashenko, Jean-François Bonastre, Yannick Estève:
Federated Learning for ASR Based on wav2vec 2.0. 1-5 - Osimone Imhogiemhe, Julien Flamant, Xavier Luciani, Yassine Zniyed, Sebastian Miron:
Low-Rank Tensor Decompositions for Quaternion Multiway Arrays. 1-5 - Anthony Faustine, Lucas Pereira:
Applying Symmetrical Component Transform for Industrial Appliance Classification in Non-Intrusive Load Monitoring. 1-5 - Bob Van Dyck, Liuyin Yang, Marc M. Van Hulle:
Decoding Auditory EEG Responses Using an Adapted Wavenet. 1-2 - Yuchen Sun, Kejun Huang:
Volume-Regularized Nonnegative Tucker Decomposition with Identifiability Guarantees. 1-5 - Xiangtao Wang, Ruizhi Wang, Biao Tian, Jiaojiao Zhang, Shuo Zhang, Junyang Chen, Thomas Lukasiewicz, Zhenghua Xu:
MPS-AMS: Masked Patches Selection and Adaptive Masking Strategy Based Self-Supervised Medical Image Segmentation. 1-5 - Hagar Kafri, Marco Olivieri, Fabio Antonacci, Mordehay Moradi, Augusto Sarti, Sharon Gannot:
Grad-CAM-Inspired Interpretation of Nearfield Acoustic Holography using Physics-Informed Explainable Neural Network. 1-5 - Andrew O'Brien, Rosina Weber, Edward Kim:
Investigating SINDy as a Tool for Causal Discovery in Time Series Signals. 1-5 - Chin Choy Chai, Xiaoping Zhang:
Iterative Water-Filling Power and Subcarrier Allocation for Multicarrier NOMA Downlink. 1-5 - Bin Li, Yixuan Weng, Bin Sun, Shutao Li:
Learning To Locate Visual Answer In Video Corpus Using Question. 1-5 - Dingyi Zeng, Wenyu Chen, Wanlong Liu, Li Zhou, Hong Qu:
Rethinking Random Walk in Graph Representation Learning. 1-5 - Shreyas Chaudhari, Srinivasa Pranav, José M. F. Moura:
Learning Gradients of Convex Functions with Monotone Gradient Networks. 1-5 - Yi Zhu, Mahil Hussain Shaik, Tiago H. Falk:
On the Importance of Different Cough Phases for COVID-19 Detection. 1-5 - Kathleen MacWilliam, Filip Elvander, Toon van Waterschoot:
Simultaneous Acoustic Echo Sorting and 3-D Room Geometry Inference. 1-5 - Liuyin Wang, Mingchao Li, Hai-Tao Zheng:
Rethinking Rule-Based Approaches in Session-Based Recommendation. 1-5 - Yunzuo Zhang, Tian Zhang, Cunyu Wu, Yuxin Zheng:
Hierarchical Spatiotemporal Feature Fusion Network For Video Saliency Prediction. 1-5 - Yanbin Wang, Weifan Zhu, Haitao Xu, Zhan Qin, Kui Ren, Wenrui Ma:
A Large-Scale Pretrained Deep Model for Phishing URL Detection. 1-5 - Stefan Braun, Erik McDermott, Roger Hsiao:
Neural Transducer Training: Reduced Memory Consumption with Sample-Wise Computation. 1-5 - Yasitha Warahena Liyanage, Daphney-Stavroula Zois:
Interpretability in the Context of Sequential Cost-Sensitive Feature Acquisition. 1-5 - Jie Chen, Shizhe Zhou:
Vision2Touch: Imaging Estimation of Surface Tactile Physical Properties. 1-5 - Yafeng Chen, Siqi Zheng, Hui Wang, Luyao Cheng, Qian Chen:
Pushing the Limits of Self-Supervised Speaker Verification using Regularized Distillation Framework. 1-5 - Mu Yang, Andros Tjandra, Chunxi Liu, David Zhang, Duc Le, Ozlem Kalinli:
Learning ASR Pathways: A Sparse Multilingual ASR Model. 1-5 - Kfir M. Cohen, Sangwoo Park, Osvaldo Simeone, Shlomo Shamai Shitz:
Calibrating AI Models for Few-Shot Demodulation via Conformal Prediction. 1-5 - Pin-Jie Liao, Yu-Cheng Huang, Chen-Kuo Chiang, Shang-Hong Lai:
Robust Multi-Object Tracking With Spatial Uncertainty. 1-5 - Jifan Yang, Zhongyuan Wang, Baojin Huang, Lianbing Deng:
Continuous Learning for Blind Image Quality Assessment with Contrastive Transformer. 1-5 - Jen-Tzung Chien, Yuan-An Chen:
Self-Supervised Adversarial Training for Contrastive Sentence Embedding. 1-5 - Rao Ma, Xiaobo Wu, Jin Qiu, Yanan Qin, Haihua Xu, Peihao Wu, Zejun Ma:
Internal Language Model Estimation Based Adaptive Language Model Fusion for Domain Adaptation. 1-5 - Ruize Xu, Ruoxuan Feng, Shi-Xiong Zhang, Di Hu:
MMCosine: Multi-Modal Cosine Loss Towards Balanced Audio-Visual Fine-Grained Learning. 1-5 - Sheida Nozari, Ali Krayani, Pablo Marin, Lucio Marcenaro, David Martín, Carlo S. Regazzoni:
Adapting Exploratory Behaviour in Active Inference for Autonomous Driving. 1-5 - Shuai Wang, Yanqing Xu, Yanli Yuan, Xiuhua Wang, Tony Q. S. Quek:
Boosting Semi-Supervised Federated Learning with Model Personalization and Client-Variance-Reduction. 1-5 - Junyan Jiang, Gus Xia:
Self-Supervised Hierarchical Metrical Structure Modeling. 1-5 - Anastasios Alexandridis, Kanthashree Mysore Sathyendra, Grant P. Strimel, Feng-Ju Chang, Ariya Rastrow, Nathan Susanj, Athanasios Mouchtaris:
Gated Contextual Adapters For Selective Contextual Biasing In Neural Transducers. 1-5 - Li-Wei Chen, Alexander Rudnicky:
Exploring Wav2vec 2.0 Fine Tuning for Improved Speech Emotion Recognition. 1-5 - Honglei Xu, Shaohui Liu, Yan Shu, Feng Jiang:
A Progressive Image Dehazing Framework with Inter and Intra Contrastive Learning. 1-5 - Kai Li, Yi Luo:
On The Design and Training Strategies for RNN-Based Online Neural Speech Separation Systems. 1-5 - TaeSoo Kim, Daniel Rho, Gahui Lee, JaeHan Park, Jong Hwan Ko:
Regression to Classification: Waveform Encoding for Neural Field-Based Audio Signal Representation. 1-5 - Eloi Moliner, Jaakko Lehtinen, Vesa Välimäki:
Solving Audio Inverse Problems with a Diffusion Model. 1-5 - Junhyeok Lee, Seungu Han, Hyunjae Cho, Wonbin Jung:
PhaseAug: A Differentiable Augmentation for Speech Synthesis to Simulate One-to-Many Mapping. 1-5 - Jianglin Jin, Jiabo Ye, Xin Lin, Liang He:
Pseudo-Query Generation For Semi-Supervised Visual Grounding With Knowledge Distillation. 1-5 - Hugo Brehier, Arnaud Breloy, Mohammed Nabil El Korso, Sandeep Kumar:
Robust and Globally Sparse PCA via Majorization-Minimization and Variable Splitting. 1-5 - Satwika Bhogavalli, Éric Grivel, K. V. S. Hari, Vincent Corretja:
Waveform Design to Improve the Estimation of Target Parameters Using the Fourier Transform Method in a MIMO OFDM DFRC System. 1-5 - Jun Diao, Lin Zhou, Lin Bai:
Achievable Error Exponents for Almost Fixed-Length M-Ary Hypothesis Testing. 1-5 - Yangcheng Li, Zefang Yu, Suncheng Xiang, Ting Liu, Yuzhuo Fu:
AV-TAD: Audio-Visual Temporal Action Detection With Transformer. 1-5 - Constance Douwes, Giovanni Bindi, Antoine Caillon, Philippe Esling, Jean-Pierre Briot:
Is Quality Enough? Integrating Energy Consumption in a Large-Scale Evaluation of Neural Audio Synthesis Models. 1-5 - Cheol Jun Cho, Peter Wu, Abdelrahman Mohamed, Gopala Krishna Anumanchipalli:
Evidence of Vocal Tract Articulation in Self-Supervised Learning of Speech. 1-5 - Rim Abrougui, Géraldine Damnati, Johannes Heinecke, Frédéric Béchet:
Abstract Representation for Multi-Intent Spoken Language Understanding. 1-5 - Guy Sagi, Nir Shlezinger, Tirza Routtenberg:
Extended Kalman Filter for Graph Signals in Nonlinear Dynamic Systems. 1-5 - Hanwen Bi, Thushara D. Abhayapala, Fei Ma, Prasanga N. Samarasinghe:
Spherical Sector Harmonics Based Soundfield Radial Extrapolation And Robustness Analysis. 1-5 - Karthikeyan Natesan Ramamurthy, Aldo Guzmán-Sáenz, Mustafa Hajij:
TOPO-MLP: A Simplicial Network without Message Passing. 1-5 - Andrea Nardin, Tales Imbiriba, Pau Closas:
Jamming Source Localization Using Augmented Physics-Based Model. 1-5 - Alkis Koudounas, Eliana Pastor, Giuseppe Attanasio, Vittorio Mazzia, Manuel Giollo, Thomas Gueudré, Luca Cagliero, Luca de Alfaro, Elena Baralis, Daniele Amberti:
Exploring Subgroup Performance in End-to-End Speech Models. 1-5 - Xiao Zhao, Liuzhen Su, Xukun Zhang, Dingkang Yang, Mingyang Sun, Shunli Wang, Peng Zhai, Lihua Zhang:
D-CONFORMER: Deformable Sparse Transformer Augmented Convolution for Voxel-Based 3D Object Detection. 1-5 - Xianrui Wang, Andreas Brendel, Gongping Huang, Yichen Yang, Walter Kellermann, Jingdong Chen:
Spatially Informed Independent Vector Analysis for Source Extraction Based on the Convolutive Transfer Function Model. 1-5 - Yicheng Hsu, Chenghung Ma, Mingsian R. Bai:
Model-Matching Principle Applied to the Design of an Array-Based All-Neural Binaural Rendering System for Audio Telepresence. 1-5 - Yuanbo Hou, Yun Wang, Wenwu Wang, Dick Botteldooren:
GCT: Gated Contextual Transformer for Sequential Audio Tagging. 1-5 - Yuwei Chen, Zengde Deng, Yinzhi Zhou, Zaiyi Chen, Yujie Chen, Haoyuan Hu:
An Online Algorithm for Chance Constrained Resource Allocation. 1-5 - Arie N. Arya, Yao Lei Xu, Ljubisa Stankovic, Danilo P. Mandic:
Hierarchical Graph Learning for Stock Market Prediction Via a Domain-Aware Graph Pooling Operator. 1-5 - Changheng Li, Richard C. Hendriks:
Noise PSD Insensitive RTF Estimation in a Reverberant and Noisy Environment. 1-5 - Xiaohan Zhang, Dong Wang, Xiaohong Ma:
Efficient Siamese Network for UAV Tracking. 1-5 - Yuhan Li, Jesper Rindom Jensen, Maozhong Fu, Zhenmiao Deng, Mads Græsbøll Christensen:
Sparse Bayesian Learning Based Three-Dimensional Imaging for Antenna Array Radar. 1-5 - Sibo Tong, Philip Harding, Simon Wiesler:
Slot-Triggered Contextual Biasing For Personalized Speech Recognition Using Neural Transducers. 1-5 - Karthik Comandur, Yunpeng Li, Santosh Nannuru:
Particle Flow Gaussian Sum Particle Filter. 1-5 - Yiqing Shen, Liwu Xu, Yuzhe Yang, Yaqian Li, Yandong Guo:
Mixed Sample Augmentation for Online Distillation. 1-5 - Ashutosh Singh, Ashish Singh, Aria Masoomi, Tales Imbiriba, Erik G. Learned-Miller, Deniz Erdogmus:
Inv-Senet: Invariant Self Expression Network for Clustering Under Biased Data. 1-5 - Ali Golmakani, Mostafa Sadeghi, Romain Serizel:
Audio-Visual Speech Enhancement with a Deep Kalman Filter Generative Model. 1-5 - Xianghu Yue, Junyi Ao, Xiaoxue Gao, Haizhou Li:
Token2vec: A Joint Self-Supervised Pre-Training Framework Using Unpaired Speech and Text. 1-5 - Ismail El-Yamany, Abdelrahman Wael, Noha Adly, Marwan Torki:
STACKMAPS: A Visualization Technique for Diabetic Retinopathy Grading. 1-5 - Qianying Liu, Zhuo Gong, Zhengdong Yang, Yuhang Yang, Sheng Li, Chenchen Ding, Nobuaki Minematsu, Hao Huang, Fei Cheng, Chenhui Chu, Sadao Kurohashi:
Hierarchical Softmax for End-To-End Low-Resource Multilingual Speech Recognition. 1-5 - Kazuki Naganuma, Shunsuke Ono:
Static-Scene Constrained Optimization for Matrix/Tensor-Decomposition-free Foreground-Background Separation. 1-5 - Zhiyu Wang, Xuezhi Yang, Hongzhou Lu, Caifeng Shan, Wenjin Wang:
Benchmark of Physiological Model Based and Deep Learning Based Remote Photoplethysmography in Automotive Applications. 1-5 - Salam Hamieh, Vincent Heiries, Hussein Al Osman, Christelle Godin:
Relapse Detection in Patients with Psychotic Disorders Using Unsupervised Learning on Smartwatch Signals. 1-2 - Stefanos Koffas, Luca Pajola, Stjepan Picek, Mauro Conti:
Going in Style: Audio Backdoors Through Stylistic Transformations. 1-5 - Sunwoo Kim, Kyuhong Shim, Luong Trung Nguyen, Byonghyo Shim:
Semantic-Preserving Augmentation for Robust Image-Text Retrieval. 1-5 - Masato Fujitake:
A3S: Adversarial Learning of Semantic Representations for Scene-Text Spotting. 1-5 - Lei Wang, Zhibin Jiao, Qiyong Zhao, Jie Zhu, Yang Fu:
Framewise Multiple Sound Source Localization and Counting Using Binaural Spatial Audio Signals. 1-5 - Angello Hoyos, Mariano Rivera:
Hadamard Layer to Improve Semantic Segmentation. 1-5 - Xin Xiong, Eduardo Pavez, Antonio Ortega, Balu Adsumilli:
Rate-Distortion Optimization with Alternative References for UGC Video Compression. 1-5 - Yazhen Xie, Yanglin Huang, Yuan Zhang, Xuanya Li, Xiongjun Ye, Kai Hu:
Transwnet: Integrating Transformers into CNNs via Row and Column Attention for Abdominal Multi-Organ Segmentation. 1-5 - Ege C. Kaya, Mehmet Berk Sahin, Abolfazl Hashemi:
Communication-Constrained Exchange of Zeroth-Order Information with Application to Collaborative Target Tracking. 1-5 - Rahul Kumar Gupta, Shilka Roy, Sujit Jos, V. S. Unni, Lauren Lavoie, Frederic Medous, Walter Smith:
Information Extraction from Pill Bottle Images via Text Stitching. 1-5 - Jiahong Zhang, Lihong Cao, Moning Zhang, Wenlong Fu:
Extracting the Brain-Like Representation by an Improved Self-Organizing Map for Image Classification. 1-5 - Ying Mo, Hongyin Tang, Jiahao Liu, Qifan Wang, Zenglin Xu, Jingang Wang, Wei Wu, Zhoujun Li:
Multi-Task Transformer with Relation-Attention and Type-Attention for Named Entity Recognition. 1-5 - Daniel Yue Zhang, Soumya Saha, Sarah Campbell:
Phonetic RNN-Transducer for Mispronunciation Diagnosis. 1-5 - Zecheng Wang, Yik-Cheung Tam:
Suffix Retrieval-Augmented Language Modeling. 1-5 - Javier Maroto, Gérôme Bovet, Pascal Frossard:
Maximum Likelihood Distillation for Robust Modulation Classification. 1-5 - Jung Uk Kim, Seong Tae Kim:
Towards Robust Audio-Based Vehicle Detection Via Importance-Aware Audio-Visual Learning. 1-5 - Xiaojie Gu, Renze Lou, Lin Sun, Shangxin Li:
PAGE: A Position-Aware Graph-Based Model for Emotion Cause Entailment in Conversation. 1-5 - Qiang Zhou, Chaohui Yu, Zhibin Wang, Fan Wang:
D2Q-DETR: Decoupling and Dynamic Queries for Oriented Object Detection with Transformers. 1-5 - Yi Zheng, Heming Jing, Qiujie Xie, Yuejie Zhang, Rui Feng, Tao Zhang, Shang Gao:
Video Captioning via Relation-Aware Graph Learning. 1-5 - Wen-Chin Huang, Benjamin N. Peloquin, Justine Kao, Changhan Wang, Hongyu Gong, Elizabeth Salesky, Yossi Adi, Ann Lee, Peng-Jen Chen:
A Holistic Cascade System, Benchmark, and Human Evaluation Protocol for Expressive Speech-to-Speech Translation. 1-5 - Farhad Pakdaman, Moncef Gabbouj:
Comprehensive Complexity Assessment of Emerging Learned Image Compression on CPU and GPU. 1-5 - Yanmeng Wang, Qingjiang Shi, Tsung-Hui Chang:
Batch Normalization Damages Federated Learning on Non-IID Data: Analysis and Remedy. 1-5 - Anirudh S. Sundar, Gokce Keskin, Chander Chandak, I-Fan Chen, Pegah Ghahremani, Shalini Ghosh:
Prune Then Distill: Dataset Distillation with Importance Sampling. 1-5 - Xuxin Cheng, Zhihong Zhu, Hongxiang Li, Yaowei Li, Yuexian Zou:
SSVMR: Saliency-Based Self-Training for Video-Music Retrieval. 1-5 - Haoyang Ma, Zeyu Li, Hongyu Guo:
A Contrastive Framework to Enhance Unsupervised Sentence Representation Learning. 1-5 - Lucas Maison, Yannick Estève:
Improving Accented Speech Recognition with Multi-Domain Training. 1-5 - Qiquan Zhang, Hongxu Zhu, Qi Song, Xinyuan Qian, Zhaoheng Ni, Haizhou Li:
Ripple Sparse Self-Attention for Monaural Speech Enhancement. 1-5 - Xun Gong, Yu Wu, Jinyu Li, Shujie Liu, Rui Zhao, Xie Chen, Yanmin Qian:
LongFNT: Long-Form Speech Recognition with Factorized Neural Transducer. 1-5 - Ruoqi Li, Huimin Yu, Kaiyang Du, Zhuoling Xiao, Bo Yan, Zhengxi Yuan:
Adaptive Semantic Fusion Framework for Unsupervised Monocular Depth Estimation. 1-5 - Pawel Swietojanski, Stefan Braun, Dogan Can, Thiago Fraga da Silva, Arnab Ghoshal, Takaaki Hori, Roger Hsiao, Henry Mason, Erik McDermott, Honza Silovsky, Ruchir Travadi, Xiaodan Zhuang:
Variable Attention Masking for Configurable Transformer Transducer Speech Recognition. 1-5 - Javier Álvarez-Vizoso, Diego Cuevas, Carlos Beltrán, Ignacio Santamaría, Vít Tucek, Gunnar Peters:
Noncoherent Multiuser Grassmannian Constellations for the MIMO Multiple Access Channel. 1-5 - Wenlong Hang, Jiaxing Li, Shuang Liang, Yuan Wu, Baiying Lei, Jing Qin, Yu Zhang, Kup-Sze Choi:
FedEEG: Federated EEG Decoding Via Inter-Subject Structure Matching. 1-5 - Jean-Marc Valin, Jan Büthe, Ahmed Mustafa:
Low-Bitrate Redundancy Coding of Speech Using A Rate-Distortion-Optimized Variational Autoencoder. 1-5 - Daliang Ouyang, Su He, Guozhong Zhang, Mingzhu Luo, Huaiyong Guo, Jian Zhan, Zhijie Huang:
Efficient Multi-Scale Attention Module with Cross-Spatial Learning. 1-5 - Anna Meyer, André Kaup:
A Novel Cross-Component Context Model for End-to-End Wavelet Image Coding. 1-5 - Muhammad Salaar Arif Khan, Salman Nadeem, Zubair Khalid:
Sampling Order-Limited Signals on the Sphere. 1-5 - Marie Guyomard, Susana Barbosa, Lionel Fillatre:
Understandable ReLU Neural Network For Signal Classification. 1-5 - Narcís Cardona, J. Samuel Romero, Wenfei Yang, Jian Li:
Integrating the Sensing and Radio Communications Channel Modelling From Radar Mutual Interference. 1-5 - Yanbin He, Geethu Joseph:
Structure-Aware Sparse Bayesian Learning-Based Channel Estimation for Intelligent Reflecting Surface-Aided MIMO. 1-5 - Wuyang Liu, Yanzhen Ren, Jingru Wang:
Attention Mixup: An Accurate Mixup Scheme Based On Interpretable Attention Mechanism for Multi-Label Audio Classification. 1-5 - Leo Hsu, Visar Berisha:
Does Human Speech Follow Benford's Law? 1-5 - Mohammad Hassan Vali, Tom Bäckström:
Stochastic Optimization of Vector Quantization Methods in Application to Speech and Image Processing. 1-5 - Gokcan Tatli, Alper T. Erdogan:
A Bayesian Perspective for Determinant Minimization Based Robust Structured Matrix Factorization. 1-5 - Haoyu Li, Yun Liu, Junichi Yamagishi:
Joint Noise Reduction and Listening Enhancement for Full-End Speech Enhancement. 1-5 - Xingjian Du, Zijie Wang, Xia Liang, Huidong Liang, Bilei Zhu, Zejun Ma:
Bytecover3: Accurate Cover Song Identification On Short Queries. 1-5 - Xingyu Bai, Taiqiang Wu, Han Guo, Zhe Zhao, Xuefeng Yang, Jiayi Li, Weijie Liu, Qi Ju, Weigang Guo, Yujiu Yang:
Recouple Event Field via Probabilistic Bias for Event Extraction. 1-5 - Suhang Ye, Zebo Hong, Jiawen Zheng, Shengchuan Zhang:
Improving Occluded Human Pose Estimation Via Linked Joints. 1-5 - Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
SLICER: Learning Universal Audio Representations Using Low-Resource Self-Supervised Pre-Training. 1-5 - Yimin Hu, Guorui Yu, Yuejie Zhang, Rui Feng, Tao Zhang, Xuequan Lu, Shang Gao:
Motion-Aware Video Paragraph Captioning via Exploring Object-Centered Internal Knowledge. 1-5 - Romain Serizel, Samuele Cornell, Nicolas Turpault:
Performance Above All? Energy Consumption vs. Performance, a Study on Sound Event Detection with Heterogeneous Data. 1-5 - Dan Ben Ami, Kobi Cohen, Qing Zhao:
Client Selection for Generalization in Accelerated Federated Learning: A Bandit Approach. 1-5 - Xavier Gitiaux, Aditya Khant, Ebrahim Beyrami, Chandan K. A. Reddy, Jayant Gupchup, Ross Cutler:
AURA: Privacy-Preserving Augmentation to Improve Test Set Diversity in Speech Enhancement. 1-5 - Wenye Lin, Yangming Li, Lemao Liu, Shuming Shi, Hai-Tao Zheng:
A Simple Yet Effective Approach to Structured Knowledge Distillation. 1-5 - Ekkasit Pinyoanuntapong, Ayman Ali, Pu Wang, Minwoo Lee, Chen Chen:
Gaitmixer: Skeleton-Based Gait Representation Learning Via Wide-Spectrum Multi-Axial Mixer. 1-5 - Charlie Windolf, Angelique C. Paulk, Yoav Kfir, Eric Trautmann, Domokos Meszéna, William Muñoz, Irene Caprara, Mohsen Jamali, Julien Boussard, Ziv M. Williams, Sydney S. Cash, Liam Paninski, Erdem Varol:
Robust Online Multiband Drift Estimation in Electrophysiology Data. 1-5 - Marcelin Tworski, Stéphane Lathuilière:
Test Your Samples Jointly: Pseudo-Reference for Image Quality Evaluation. 1-5 - Haiyan Jin, Dawei Wei, Haonan Su:
Deep Low Light Image Enhancement Via Multi-Scale Recursive Feature Enhancement and Curve Adjustment. 1-5 - Émile Pierret, Bruno Galerne:
Stochastic Super-Resolution For Gaussian Textures. 1-5 - Yufeng Tan, Youjun Xiang, Lei Cai, Pengcheng Wang, Ying Zhang, Yuli Fu:
Two-Stage Video De-Raining with Spatio-Temporal Fusion and Illumination-Invariant Detail Preservation. 1-5 - Paul Rodríguez:
Improving the Stochastic Gradient Descent's Test Accuracy by Manipulating the ℓ∞ Norm of its Gradient Approximation. 1-5 - Guangchen Wang, Peng Cheng, Zhuo Chen, Wei Xiang, Branka Vucetic, Yonghui Li:
Inverse Reinforcement Learning with Graph Neural Networks for IoT Resource Allocation. 1-5 - Ruhan He, Shanshan Xiang, Tao Peng, Yongsheng Yu:
Monocular 3D Human Pose Estimation Based on Global Temporal-Attentive and Joints-Attention In Video. 1-5 - Sebastian O. Jordan, Thomas W. Sherson, Richard Heusdens:
Convergence of Stochastic PDMM. 1-5 - Zuheng Kang, Yayun He, Jianzong Wang, Junqing Peng, Xiaoyang Qu, Jing Xiao:
Feature-Rich Audio Model Inversion for Data-Free Knowledge Distillation Towards General Sound Classification. 1-5 - Gabriel Mittag, Babak Naderi, Vishak Gopal, Ross Cutler:
LSTM-Based Video Quality Prediction Accounting for Temporal Distortions in Videoconferencing Calls. 1-5 - Tao Li, Haodong Zhou, Jie Wang, Qingyang Hong, Lin Li:
The XMU System for Audio-Visual Diarization and Recognition in MISP Challenge 2022. 1-2 - Andrei Andrusenko, Rauf Nasretdinov, Aleksei Romanenko:
UCONV-Conformer: High Reduction of Input Sequence Length for End-to-End Speech Recognition. 1-5 - Sagar Shrestha, Xiao Fu, Mingyi Hong:
Towards Efficient and Optimal Joint Beamforming and Antenna Selection: A Machine Learning Approach. 1-5 - Yaoxun Xu, Baiji Liu, Qiaochu Huang, Xingchen Song, Zhiyong Wu, Shiyin Kang, Helen Meng:
CB-Conformer: Contextual Biasing Conformer for Biased Word Recognition. 1-5 - Xiangping Zheng, Xun Liang, Bo Wu:
Select The Best: Enhancing Graph Representation with Adaptive Negative Sample Selection. 1-5 - Ching Hua Lee, Chouchang Yang, Yilin Shen, Hongxia Jin:
Improved Mask-Based Neural Beamforming for Multichannel Speech Enhancement by Snapshot Matching Masking. 1-5 - Yingcong Li, Samet Oymak:
On The Fairness of Multitask Representation Learning. 1-5 - Kinan Abbas, Matthieu Puigt, Gilles Delmaire, Gilles Roussel:
Joint Unmixing And Demosaicing Methods For Snapshot Spectral Images. 1-5 - Xiaohuan Wu, Ji Sun, Xiaoyuan Jia, Shuxin Wang:
Source Localization for Extremely Large-Scale Antenna Arrays with Spatial Non-Stationarity. 1-5 - Wanli Ni, Jingheng Zheng, Yonina C. Eldar, Changsheng You, Kaibin Huang:
Semi-Federated Learning for Edge Intelligence with Imperfect SIC. 1-5 - Andrew Rouditchenko, Yung-Sung Chuang, Nina Shvetsova, Samuel Thomas, Rogério Feris, Brian Kingsbury, Leonid Karlinsky, David Harwath, Hilde Kuehne, James R. Glass:
C2KD: Cross-Lingual Cross-Modal Knowledge Distillation for Multilingual Text-Video Retrieval. 1-5 - Ruohao Yan, Huaping Zhang, Wushour Silamu, Askar Hamdulla:
Unsupervised Word Segmentation Based on Word Influence. 1-5 - Chen Tang, Hongbo Zhang, Tyler Loakman, Chenghua Lin, Frank Guerin:
Terminology-Aware Medical Dialogue Generation. 1-5 - Javier Kipen, Joakim Jaldén, Shyamprasad N. Raja, Saumey Jain:
Efficient Implementation of Robust CUSUM Algorithm to Characterize Nanogaps Measurements with Heavy-Tailed Noise. 1-5 - Songlin Yang, Wei Wang, Bo Peng, Jing Dong:
Designing A 3D-Aware StyleNeRF Encoder for Face Editing. 1-5 - Tousif Ahmed, Md. Mahbubur Rahman, Ebrahim Nemati, Jilong Kuang, Alex Gao:
Mouth Breathing Detection Using Audio Captured Through Earbuds. 1-5 - Yimao Sun, K. C. Ho, Yanbing Yang, Lei Zhang, Liangyin Chen:
Robust Iterative Solution for Linear Array-Based 3-D Localization by Message Passing. 1-5 - Ulrik Kowalk, Simon Doclo, Jörg Bitzer:
Geometry-Aware DOA Estimation Using a Deep Neural Network with Mixed-Data Input Features. 1-5 - Cong Yu, Yu-Dong Zhang, Zhi Wu, Chunyang Xie, Zhi Lu, Yang Hu, Yan Chen:
Fast 3D Human Pose Estimation Using RF Signals. 1-5 - Peipei Xu, Fu Wang, Wenjie Ruan, Chi Zhang, Xiaowei Huang:
Sora: Scalable Black-Box Reachability Analyser on Neural Networks. 1-5 - Baobei Xu, Shukai Fang, Zhaoyang Li, Shicai Yang, Di Xie, Shiliang Pu:
PRIME: 3D Human Pose and Body Shape Recovery with Perspective Projection. 1-5 - Yuanzhe Chen, Ming Tu, Tang Li, Xin Li, Qiuqiang Kong, Jiaxin Li, Zhichao Wang, Qiao Tian, Yuping Wang, Yuxuan Wang:
Streaming Voice Conversion via Intermediate Bottleneck Features and Non-Streaming Teacher Guidance. 1-5 - Siying Liu, Qiankun Liu, Qi Chu, Bin Liu, Nenghai Yu:
Dual-Feature Enhancement for Weakly Supervised Temporal Action Localization. 1-5 - Jixun Yao, Yi Lei, Qing Wang, Pengcheng Guo, Ziqian Ning, Lei Xie, Hai Li, Junhui Liu, Danming Xie:
Preserving Background Sound in Noise-Robust Voice Conversion Via Multi-Task Learning. 1-5 - Viet-Quoc Pham, Nao Mishima:
Focusing on Targets for Improving Weakly Supervised Visual Grounding. 1-5 - Jeong Hun Yeo, Minsu Kim, Yong Man Ro:
Multi-Temporal Lip-Audio Memory for Visual Speech Recognition. 1-5 - Hanchen Pei, Yuhong Yang, Xufeng Chen, Qingmu Liu, Hongyang Chen, Weiping Tu, Song Lin:
PMMSD: Development of the Matrix Sentence Intelligibility Dataset for Mandarin with Lombard Effect. 1-5 - Hongzhi Liu, Kaizhong Zheng, Shujian Yu, Badong Chen:
Towards a More Stable and General Subgraph Information Bottleneck. 1-5 - Shiyuan Xing, Changlong Lin, Yuchen Li, Huandong Wang:
An Adaptive DFE Using Light-Pattern-Protection Algorithm in 12 nm CMOS Technology. 1-5 - Sahaj Mistry, Shreyas Chatterjee, Ajeet Kumar Verma, Vinit Jakhetiya, Badri N. Subudhi, Sunil Prasad Jaiswal:
Drone-vs-Bird: Drone Detection Using YOLOv7 with CSRT Tracker. 1-2 - Debasmit Das, Shubhankar Borse, Hyojin Park, Kambiz Azarian, Hong Cai, Risheek Garrepalli, Fatih Porikli:
Transadapt: A Transformative Framework for Online Test Time Adaptive Semantic Segmentation. 1-5 - Pranav Kulkarni, P. P. Vaidyanathan:
Interpolation Filter Model For Ramanujan Subspace Signals. 1-5 - Yen-Ting Lin, Chen-Yu Chiang:
EGAN: A Neural Excitation Generation Model Based on Generative Adversarial Networks with Harmonics and Noise Input. 1-5 - Keita Goto, Shinta Otake, Rei Kawakami, Nakamasa Inoue:
Step Restriction for Improving Adversarial Attacks. 1-5 - Harlin Lee, Aaqib Saeed, Andrea L. Bertozzi:
Active Learning of Non-Semantic Speech Tasks with Pretrained Models. 1-5 - Dilan Senaratne, Jinsub Kim:
Sparse Error Correction for Power Network Parameters. 1-5 - Jiapeng Zhang, Yongxiong Wang, Zhiqun Pan, Zhenhui Tang, Lijun Chen, Jinlong Liu:
LDTSF: A Label-Decoupling Teacher-Student Framework for Semi-Supervised Echocardiography Segmentation. 1-5 - Le Yu, Tongyan Hua, Wenming Yang, Peng Ye, Qingmin Liao:
CDHD: Contrastive Dreamer for Hint Distillation. 1-5 - Tsung-Han Tsai, Wei-Chung Wan:
NL-DSE: Non-Local Neural Network with Decoder-Squeeze-and-Excitation for Monocular Depth Estimation. 1-4 - Masaki Yoshida, Ren Togo, Takahiro Ogawa, Miki Haseyama:
Binauralization Robust To Camera Rotation Using 360° Videos. 1-5 - Saierdaer Yusuyin, Hao Huang, Junhua Liu, Cong Liu:
Investigation into Phone-Based Subword Units for Multilingual End-to-End Speech Recognition. 1-5 - Wenxi Ma, Tianxiang Hou, Qianji Di, Zhongang Qi, Ying Shan, Hanzi Wang:
ERBNet: An Effective Representation Based Network for Unbiased Scene Graph Generation. 1-5 - Hao Yang, Shuyuan Lin, Runqing Jiang, Yang Lu, Hanzi Wang:
DQFORMER: Dynamic Query Transformer for Lane Detection. 1-5 - Koichi Miyazaki, Masato Murata, Tomoki Koriyama:
Structured State Space Decoder for Speech Recognition and Synthesis. 1-5 - Jin Woo Lee, Sungho Lee, Kyogu Lee:
Global HRTF Interpolation Via Learned Affine Transformation of Hyper-Conditioned Features. 1-5 - Long Feng, Guohua Geng, Chen Guo, Longquan Yan, Xingrui Ma, Zhan Li, Kang Li:
Gender-Cartoon: Image Cartoonization Method Based on Gender Classification. 1-5 - Alberto Natali, Geert Leus:
Blind Polynomial Regression. 1-5 - Chengze Yu, Taiqiang Wu, Jiayi Li, Xingyu Bai, Yujiu Yang:
Syngen: A Syntactic Plug-And-Play Module for Generative Aspect-Based Sentiment Analysis. 1-5 - Thomas Stogiannopoulos, Grigorios-Aris Cheimariotis, Nikolaos Mitianoudis:
A Non-Contact SpO2 Estimation Using Video Magnification and Infrared Data. 1-5 - Yuting Yang, Yuke Li, Binbin Du:
Improving CTC-Based ASR Models With Gated Interlayer Collaboration. 1-5 - Jiahao Xie, Wei Xu, Dingkang Liang, Zhanyu Ma, Kongming Liang, Weidong Liu, Rui Wang, Ling Jin:
Super-Resolution Information Enhancement for Crowd Counting. 1-5 - Nada Osman, Guglielmo Camporese, Lamberto Ballan:
TAMformer: Multi-Modal Transformer with Learned Attention Mask for Early Intent Prediction. 1-5 - Jun Wang, Benedetta Tondi, Mauro Barni:
Classification of Synthetic Facial Attributes by Means of Hybrid Classification/Localization Patch-Based Analysis. 1-5 - Hari Hara Suthan Chittoor, Osvaldo Simeone:
Learning Quantum Entanglement Distillation With Noisy Classical Communications. 1-5 - Veljko Boljanovic, Danijela Cabric:
Joint Millimeter-Wave AoD and AoA Estimation Using one OFDM Symbol and Frequency-Dependent Beams. 1-5 - Naoya Takahashi, Mayank Kumar Singh, Yuki Mitsufuji:
Hierarchical Diffusion Models for Singing Voice Neural Vocoder. 1-5 - Diyu Yang, Shimin Tang, Singanallur V. Venkatakrishnan, Mohammad Samin Nur Chowdhury, Yuxuan Zhang, Hassina Z. Bilheux, Gregery T. Buzzard, Charles A. Bouman:
An Edge Alignment-Based Orientation Selection Method for Neutron Tomography. 1-5 - Tian Yang, Hongbo Bo, Xinyu Yang, Jun Gao, Zijian Shi:
Conditional LS-GAN Based Skylight Polarization Image Restoration and Application in Meridian Localization. 1-5 - Samuel Pinilla, Kumar Vijay Mishra, Brian M. Sadler:
Unique Bispectrum Inversion for Signals with Finite Spectral/Temporal Support. 1-5 - Xiuheng Wang, Ricardo Augusto Borsoi, Cédric Richard, Jie Chen:
Change Point Detection with Neural Online Density-Ratio Estimator. 1-5 - Ilan Price, Jared Tanner:
Improved Projection Learning for Lower Dimensional Feature Maps. 1-5 - Nelly Pustelnik:
On The Primal and Dual Formulations Of The Discrete Mumford-Shah Functional. 1-5 - Sarina Meyer, Florian Lux, Julia Koch, Pavel Denisov, Pascal Tilli, Ngoc Thang Vu:
Prosody Is Not Identity: A Speaker Anonymization Approach Using Prosody Cloning. 1-5 - Juhyun Lyu, Jinseok Yang, Junghee Kim, Woohyung Lim, Wonbin Ahn, Dongwan Kang, Minjae Kim, Nam Soo Kim:
Multi-Resolution Sequence Aggregation and Model-Agnostic Framework for Time-Series Forecasting. 1-5 - Hyun Joon Park, Seok Woo Yang, Jin Sob Kim, Wooseok Shin, Sung Won Han:
TriAAN-VC: Triple Adaptive Attention Normalization for Any-to-Any Voice Conversion. 1-5 - Miroslaw Pawlak, Mateusz Pabian, Dominik Rzepka:
Asymptotically Optimal Nonparametric Classification Rules for Spike Train Data. 1-5 - Arijit Ukil, Leandro Marín, Antonio J. Jara:
Priv-Aug-Shap-ECGResNet: Privacy Preserving Shapley-Value Attributed Augmented Resnet for Practical Single-Lead Electrocardiogram Classification. 1-5 - Muqiao Yang, Naoyuki Kanda, Xiaofei Wang, Jian Wu, Sunit Sivasankaran, Zhuo Chen, Jinyu Li, Takuya Yoshioka:
Simulating Realistic Speech Overlaps Improves Multi-Talker ASR. 1-5 - Xujun Wei, Zechu Zhou, Pinxue Guo, Wenqiang Zhang:
Gated Enhanced RPN and Hybrid-View for Few-Shot Object Detection. 1-5 - Arash Ahmadian, Louis S. P. Liu, Yue Fei, Konstantinos N. Plataniotis, Mahdi S. Hosseini:
Pseudo-Inverted Bottleneck Convolution for DARTS Search Space. 1-5 - Haolin Zuo, Rui Liu, Jinming Zhao, Guanglai Gao, Haizhou Li:
Exploiting Modality-Invariant Feature for Robust Multimodal Emotion Recognition with Missing Modalities. 1-5 - Chengjie Ke, Hao Liang, Duidui Li, Xin Tian:
High-Frequency Transformer Network Based on Window Cross-Attention for Pansharpening. 1-5 - Bojan Kolosnjaji, Apostolis Zarras:
Label-Efficient and Robust Learning from Multiple Experts. 1-5 - Rohan Badlani, Akshit Arora, Subhankar Ghosh, Rafael Valle, Kevin J. Shih, João Felipe Santos, Boris Ginsburg, Bryan Catanzaro:
Vani: Very-Lightweight Accent-Controllable TTS for Native And Non-Native Speakers With Identity Preservation. 1-2 - Ankit Shah, Larry Tang, Po Hao Chou, Yi Yu Zheng, Ziqian Ge, Bhiksha Raj:
An Approach to Ontological Learning from Weak Labels. 1-5 - Seungeun Lee:
Facial Texture Perceiver: Towards High-Fidelity Facial Texture Recovery with Input-Level Inductive Biased Perceiver IO. 1-5 - Bo Liu, Fenglei Chang, Wenpeng Luan, Bochao Zhao:
Improved Appliance Transient Feature Extraction Via Template Matching. 1-5 - Xue Jiang, Xiulian Peng, Yuan Zhang, Yan Lu:
Disentangled Feature Learning for Real-Time Neural Speech Coding. 1-5 - Jiaming Cheng, Cong Pang, Ruiyu Liang, Jingjie Fan, Li Zhao:
Dual-Path Dilated Convolutional Recurrent Network with Group Attention for Multi-Channel Speech Enhancement. 1-2 - Bastiaan Tamm, Rik Vandenberghe, Hugo Van hamme:
Cross-Lingual Transfer Learning for Alzheimer's Detection from Spontaneous Speech. 1-2 - Ruifu Li, Danijela Cabric:
Robust Adaptive Beamforming with Proximal Method. 1-5 - Juyeop Kim, Jun-Ho Choi, Soobeom Jang, Jong-Seok Lee:
Amicable Aid: Perturbing Images to Improve Classification Performance. 1-5 - Shiqi Wei, Ziyu Wang, Weiguo Gao, Gus Xia:
Controllable Music Inpainting with Mixed-Level and Disentangled Representation. 1-5 - Xuan Shi, Erica Cooper, Xin Wang, Junichi Yamagishi, Shrikanth Narayanan:
Can Knowledge of End-to-End Text-to-Speech Models Improve Neural MIDI-to-Audio Synthesis Systems? 1-5 - Takenori Yoshimura, Shinji Takaki, Kazuhiro Nakamura, Keiichiro Oura, Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda:
Embedding a Differentiable Mel-Cepstral Synthesis Filter to a Neural Speech Synthesis System. 1-5 - Honggang Liu, Jinlong Yang, Yue Xu, Le Yang:
Optimizing Distributed Multi-Sensor Multi-Target Tracking Algorithm Based On Labeled Multi-Bernoulli Filter. 1-5 - Hee-Soo Heo, Youngki Kwon, Bong-Jin Lee, You Jin Kim, Jee-Weon Jung:
High-Resolution Embedding Extractor for Speaker Diarisation. 1-5 - Ryo Shichida, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama:
Estimation of Visual Contents from Human Brain Signals via VQA Based on Brain-Specific Attention. 1-5 - Arjun Singh, Vitaly Petrov, Josep Miquel Jornet:
Utilization of Bessel Beams in Wideband Sub Terahertz Communication Systems to Mitigate Beamsplit Effects in the Near-field. 1-5 - Pierre Develter, Jonathan Bosse, Olivier Rabaste, Philippe Forster, Jean Philippe Ovarlez:
False Alarm Regulation for Off-Grid Target Detection With The Matched Filter. 1-5 - Tsz Kin Lam, Shigehiko Schamoni, Stefan Riezler:
Make More of Your Data: Minimal Effort Data Augmentation for Automatic Speech Recognition and Translation. 1-5 - Samuel Thomas, Hong-Kwang Jeff Kuo, George Saon, Brian Kingsbury:
Multi-Speaker Data Augmentation for Improved End-to-End Automatic Speech Recognition. 1-5 - Sumit Kumar, B. Anshuman, Linus Rüttimann, Richard H. R. Hahnloser, Vipul Arora:
Balanced Deep CCA for Bird Vocalization Detection. 1-5 - Hien Ohnaka, Shinnosuke Takamichi, Keisuke Imoto, Yuki Okamoto, Kazuki Fujii, Hiroshi Saruwatari:
Visual Onoma-to-Wave: Environmental Sound Synthesis from Visual Onomatopoeias and Sound-Source Images. 1-5 - Ming-Yen Chen, Mahdin Rohmatillah, Ching-Hsien Lee, Jen-Tzung Chien:
Meta Learning for Domain Agnostic Soft Prompt. 1-5 - Zhe Li, Man-Wai Mak, Helen Mei-Ling Meng:
Discriminative Speaker Representation Via Contrastive Learning with Class-Aware Attention in Angular Space. 1-5 - Junda Liao, Qin Liu, Takeshi Ikenaga:
Pyramid Spatial Feature Transform and Shared-Offsets Deformable Alignment Based Convolutional Network for HDR Imaging. 1-5 - Sophia E. Economou, Ada Warren, Edwin Barnes:
The Role of Initial Entanglement in Adaptive Gibbs State Preparation on Quantum Computers. 1-5 - Bo Li, Dongseong Hwang, Zhouyuan Huo, Junwen Bai, Guru Prakash, Tara N. Sainath, Khe Chai Sim, Yu Zhang, Wei Han, Trevor Strohman, Françoise Beaufays:
Efficient Domain Adaptation for Speech Foundation Models. 1-5 - Kun Hu, Mingyu Cao, Mengzhu Wang, Long Lan, Wenjing Yang, Huibin Tan:
Enhanced DCF Tracker Regularized by Reliable Sample Construction. 1-5 - Haopeng Kuang, Dingkang Yang, Shunli Wang, Xiaoying Wang, Lihua Zhang:
Towards Simultaneous Segmentation Of Liver Tumors And Intrahepatic Vessels Via Cross-Attention Mechanism. 1-5 - Dohoon Kim, Minwoo Shin, Joonki Paik:
PU-Edgeformer: Edge Transformer for Dense Prediction in Point Cloud Upsampling. 1-5 - Yajing Wang, Zongwei Luo:
Causal Discovery and Causal Inference Based Counterfactual Fairness in Machine Learning. 1-5 - Hyun Ryu, Junil Choi:
EMC2-Net: Joint Equalization and Modulation Classification Based on Constellation Network. 1-5 - Shanshan Zheng, Yachao Zhang, Hongyi Huang, Yanyun Qu:
Sample-Aware Knowledge Distillation for Long-Tailed Learning. 1-5 - Xudong Mou, Rui Wang, Tiejun Wang, Jie Sun, Bo Li, Tianyu Wo, Xudong Liu:
Deep Autoencoding One-Class Time Series Anomaly Detection. 1-5 - Minkyu Kim, Kim Sung-Bin, Tae-Hyun Oh:
Prefix Tuning for Automated Audio Captioning. 1-5 - Xingyue Shi, Hong Liu, Wei Shi, Zihui Zhou, Yidi Li:
Boosting Person Re-Identification with Viewpoint Contrastive Learning and Adversarial Training. 1-5 - Biyun Sheng, Yan Bao, Fu Xiao, Linqing Gui:
DyLiteRADHAR: Dynamic Lightweight Slowfast Network for Human Activity Recognition Using mmWave Radar. 1-5 - Karl El Hajal, Zihan Wu, Neil Scheidwasser-Clow, Gasser Elbanna, Milos Cernak:
Efficient Speech Quality Assessment Using Self-Supervised Framewise Embeddings. 1-5 - Claudio Battiloro, Stefania Sardellitti, Sergio Barbarossa, Paolo Di Lorenzo:
Topological Signal Processing Over Weighted Simplicial Complexes. 1-5 - Yi-Xing Lin, Cheng-Hsun Pai, Phuong Thi Le, Bima Prihasto, Chien-Ling Huang, Jia-Ching Wang:
Code-Switching Speech Synthesis Based on Self-Supervised Learning and Domain Adaptive Speaker Encoder. 1-5 - Hao Li, Li Li, Yunmeng Huang, Ning Li, Yongtao Zhang:
An Adaptive Plug-and-Play Network for Few-Shot Learning. 1-5 - Shreya G. Upadhyay, Luz Martinez-Lucas, Bo-Hao Su, Wei-Cheng Lin, Woan-Shiuan Chien, Ya-Tse Wu, William Katz, Carlos Busso, Chi-Chun Lee:
Phonetic Anchor-Based Transfer Learning to Facilitate Unsupervised Cross-Lingual Speech Emotion Recognition. 1-5 - Abdullah Karaaslanli, Selin Aviyente:
Dynamic Signed Graph Learning. 1-5 - Yan Liu, Xiaokang Chen, Qi Dai:
Parallel Sentence-Level Explanation Generation for Real-World Low-Resource Scenarios. 1-5 - Bing Zhu, Sheng Xu, Feng Rice, Kutluyil Dogançay:
Angle-Of-Arrival Target Tracking Using A Mobile UAV In External Signal-Denied Environment. 1-5 - Zhidi Lin, Lei Cheng, Feng Yin, Lexi Xu, Shuguang Cui:
Output-Dependent Gaussian Process State-Space Model. 1-5 - Mingshuai Liu, Shubo Lv, Zihan Zhang, Runduo Han, Xiang Hao, Xianjun Xia, Li Chen, Yijian Xiao, Lei Xie:
Two-Stage Neural Network for ICASSP 2023 Speech Signal Improvement Challenge. 1-2 - Xinfa Zhu, Yi Lei, Kun Song, Yongmao Zhang, Tao Li, Lei Xie:
Multi-Speaker Expressive Speech Synthesis via Multiple Factors Decoupling. 1-5 - Jaeuk Byun, Youna Ji, Soo-Whan Chung, Soyeon Choe, Min-Seok Choi:
An Empirical Study on Speech Restoration Guided by Self-Supervised Speech Representation. 1-5 - Weikuo Guo, Xiangwei Kong:
Embrace Smaller Attention: Efficient Cross-Modal Matching with Dual Gated Attention Fusion. 1-5 - Sathvik Udupa, C. Siddarth, Prasanta Kumar Ghosh:
Improved Acoustic-to-Articulatory Inversion Using Representations from Pretrained Self-Supervised Learning Models. 1-5 - Xiaoxue Gao, Xianghu Yue, Haizhou Li:
Self-Transcriber: Few-Shot Lyrics Transcription With Self-Training. 1-5 - Milind Rao, Gopinath Chennupati, Gautam Tiwari, Anit Kumar Sahu, Anirudh Raju, Ariya Rastrow, Jasha Droppo:
Federated Self-Learning with Weak Supervision for Speech Recognition. 1-5 - Dong Yang, Tomoki Koriyama, Yuki Saito, Takaaki Saeki, Detai Xin, Hiroshi Saruwatari:
Duration-Aware Pause Insertion Using Pre-Trained Language Model for Multi-Speaker Text-To-Speech. 1-5 - Yonggang Hu, Sharon Gannot, Thushara D. Abhayapala:
Generalized Relative Harmonic Coefficients. 1-5 - Yashish M. Siriwardena, Carol Y. Espy-Wilson:
The Secret Source: Incorporating Source Features to Improve Acoustic-To-Articulatory Speech Inversion. 1-5 - Hayato Futami, Emiru Tsunoo, Kentaro Shibata, Yosuke Kashiwagi, Takao Okuda, Siddhant Arora, Shinji Watanabe:
Streaming Joint Speech Recognition and Disfluency Detection. 1-5 - Zhichao Wang, Xinsheng Wang, Lei Xie, Yuanzhe Chen, Qiao Tian, Yuping Wang:
Delivering Speaking Style in Low-Resource Voice Conversion with Multi-Factor Constraints. 1-5 - Alexandros Stergiou, Dima Damen:
Play It Back: Iterative Attention For Audio Recognition. 1-5 - Akihiko Sugiyama:
Adaptive Noise Canceller Algorithm with SNR-Based Stepsize and Data-Dependent Averaging. 1-5 - Zhiyuan Ren, Zhihong Pan, Xin Zhou, Le Kang:
Diffusion Motion: Generate Text-Guided 3D Human Motion by Diffusion Model. 1-5 - Anik Chattopadhyay, Arunava Banerjee:
Beyond Rate Coding: Signal Coding and Reconstruction Using Lean Spike Trains. 1-5 - Parvin Malekzadeh, Ming Hou, Konstantinos N. Plataniotis:
A Unified Uncertainty-Aware Exploration: Combining Epistemic and Aleatory Uncertainty. 1-5 - Johan Pauwels, Lorenzo Picinali:
On the Relevance of the Differences Between HRTF Measurement Setups for Machine Learning. 1-5 - Hansheng Guo, Juncheng Li, Guangwei Gao, Zhi Li, Tieyong Zeng:
PFT-SSR: Parallax Fusion Transformer for Stereo Image Super-Resolution. 1-5 - Geumbyeol Hwang, Sunwon Hong, Seunghyun Lee, Sungwoo Park, Gyeongsu Chae:
DisCoHead: Audio-and-Video-Driven Talking Head Generation by Disentangled Control of Head Pose and Facial Expressions. 1-5 - Tao Bai, Chen Chen, Lingjuan Lyu, Jun Zhao, Bihan Wen:
Towards Adversarially Robust Continual Learning. 1-5 - Kun Cao, Na Qi, Wei Xu, Qing Zhu, Shibo Xu, Changxin Pan:
G2CNN: Geometric Prior Based GCNN for Single-View 3D Reconstruction with Loop Subdivision. 1-5 - Zhen Wang, Xuedan Yan, Qian He, Rick S. Blum:
Target Velocity Estimation for Quantization-Based Cooperative MIMO Radar and Communications System. 1-5 - Iordanis Thoidis, Clément Gaultier, Tobias Goehring:
Perceptual Analysis of Speaker Embeddings for Voice Discrimination between Machine And Human Listening. 1-5 - Xiaoquan Ke, Man-Wai Mak, Helen M. Meng:
Feature Selection and Text Embedding for Detecting Dementia from Spontaneous Cantonese. 1-5 - Yifei Xin, Dongchao Yang, Yuexian Zou:
Improving Text-Audio Retrieval by Text-Aware Attention Pooling and Prior Matrix Revised Loss. 1-5 - Muhammad Asad Lodhi, Jiahao Pang, Dong Tian:
Sparse Convolution Based Octree Feature Propagation for Lidar Point Cloud Compression. 1-5 - Akshad Shyam, Kusum Komalavally, Monika Gautam, Vamshikrishna Kancharla, Vennela Gudisa, Virendra Patil, Aanandh Balasubramanian, Sumohana S. Channappayya:
An Automotive Radar Dataset For Object Classification. 1-5 - Junghwan Lee, Yao Xie, Xiuyuan Cheng:
Training Neural Networks for Sequential Change-Point Detection. 1-5 - Sam Perochon, Laurent Oudre:
Unsupervised Action Segmentation of Untrimmed Egocentric Videos. 1-5 - Jiaxin Guo, Minghan Wang, Xiaosong Qiao, Daimeng Wei, Hengchao Shang, Zongyao Li, Zhengzhe Yu, Yinglu Li, Chang Su, Min Zhang, Shimin Tao, Hao Yang:
UCorrect: An Unsupervised Framework for Automatic Speech Recognition Error Correction. 1-5 - Zijin Yin, Runpu Wei, Kongming Liang, Yiyang Lin, Wei Liu, Zhanyu Ma, Min Min, Jun Guo:
Semantic Memory Guided Image Representation for Polyp Segmentation. 1-5 - Xiaohan Zhao, Yongzhe Li, Ran Tao:
Efficient Large-Scale Multi-Unimodular Waveform Design with Good Correlation Properties via Direct Phase Optimizations. 1-5 - Sangeeta Bhattacharjee, Kumar Vijay Mishra, Ramesh Annavajjala, Chandra R. Murthy:
Multi-Carrier Wideband OCDM-Based THz Automotive Radar. 1-5 - Jisoo Kim, Hyebin Ahn, Byounghyun Yoo:
Abusive Activity Detection with Multi-Modality Based on Convolutional Neural Network. 1-5 - Xinran Lyu, Libao Zhang:
Progressive Refinement Learning Based on Feature Cross Perception for Residential Areas Semantic Segmentation. 1-5 - Khandker Sadia Rahman, Daphney-Stavroula Zois, Charalampos Chelmis:
Bayesian Network Modeling and Prediction of Transitions Within the Homelessness System. 1-5 - Min Hyun Han, Sung Hwan Mun, Minchan Kim, Myeonghun Jeong, Sunghwan Ahn, Nam Soo Kim:
Improving Learning Objectives for Speaker Verification from the Perspective of Score Comparison. 1-5 - Kuncheng Luo, Zhiheng Li:
Real-Time Human Reconstruction Based on Human Pose Prior and Epipolar Refinement. 1-5 - Shangda Wu, Xiaobing Li, Maosong Sun:
Chord-Conditioned Melody Harmonization With Controllable Harmonicity. 1-5 - Jungjun Kim, Changjin Han, Gyuhyeon Nam, Gyeongsu Chae:
Good Neighbors are All You Need for Chinese Grapheme-To-Phoneme Conversion. 1-5 - Srinath Tankasala, Long Chen, Andreas Stolcke, Anirudh Raju, Qianli Deng, Chander Chandak, Aparna Khare, Roland Maas, Venkatesh Ravichandran:
Cross-Utterance ASR Rescoring with Graph-Based Label Propagation. 1-5 - Xin Huang, Jiake Xie, Bo Xu, Han Huang, Ziwen Li, Cheng Lu, Yandong Guo, Yong Tang:
Ultra Real-Time Portrait Matting via Parallel Semantic Guidance. 1-5 - Abdullah Karaaslanli, Satabdi Saha, Tapabrata Maiti, Selin Aviyente:
Multiple Signed Graph Learning for Gene Regulatory Network Inference. 1-5 - Wei Tang, Zuyao Ma, Haifeng Sun, Jingyu Wang:
Learning Sparse Alignments via Optimal Transport for Cross-Domain Fake News Detection. 1-5 - Yufan Liu, Jiajiong Cao, Weiming Bai, Bing Li, Weiming Hu:
Learning from the Raw Domain: Cross Modality Distillation for Compressed Video Action Recognition. 1-5 - Zhuo Chen, Naoyuki Kanda, Jian Wu, Yu Wu, Xiaofei Wang, Takuya Yoshioka, Jinyu Li, Sunit Sivasankaran, Sefik Emre Eskimez:
Speech Separation with Large-Scale Self-Supervised Learning. 1-5 - Christine Beauchene, Michael S. Brandstein, Stephanie Haro, Thomas F. Quatieri, Christopher J. Smalt:
Subject-Specific Adaptation for a Causally-Trained Auditory-Attention Decoding System. 1-5 - Xiaowen Ma, Mengting Ma, Chenlu Hu, Zhiyuan Song, Ziyan Zhao, Tian Feng, Wei Zhang:
Log-Can: Local-Global Class-Aware Network For Semantic Segmentation of Remote Sensing Images. 1-5 - Heitor R. Guimarães, Arthur Pimentel, Anderson R. Avila, Mehdi Rezagholizadeh, Boxing Chen, Tiago H. Falk:
Robustdistiller: Compressing Universal Speech Representations for Enhanced Environment Robustness. 1-5 - Yiwen Wang, Zijian Lan, Xihong Wu, Tianshu Qu:
TT-Net: Dual-Path Transformer Based Sound Field Translation in the Spherical Harmonic Domain. 1-5 - Jiyue Wang, Yanxiong Li, Qianhua He, Wei Xie:
Clean Sample Guided Self-Knowledge Distillation for Image Classification. 1-5 - You Zhang, Yuxiang Wang, Zhiyao Duan:
HRTF Field: Unifying Measured HRTF Magnitude Representation with Neural Fields. 1-5 - James Zachary Hare, Lance M. Kaplan:
Improved Small Sample Hypothesis Testing Using the Uncertain Likelihood Ratio. 1-5 - Julian Neri, Sebastian Braun:
Towards Real-Time Single-Channel Speech Separation in Noisy and Reverberant Environments. 1-5 - Dingbang Li, Xin Lin, Haibin Cai, Wenzhou Chen:
Visual Graph Reasoning Network. 1-5 - Maxime Leiber, Yosra Marnissi, Axel Barrau, Mohamed El Badaoui:
Differentiable Adaptive Short-Time Fourier Transform with Respect to the Window Length. 1-5 - Zhantu Lin, Xiaoyan Zhang:
FFFN: Fashion Feature Fusion Network by Co-Attention Model for Fashion Recommendation. 1-5 - Yuanzhao Zhai, Kele Xu, Bo Ding, Dawei Feng, Zijian Gao, Huaimin Wang:
Diversifying Message Aggregation in Multi-Agent Communication Via Normalized Tensor Nuclear Norm Regularization. 1-5 - Wei Liu, Kaiqi Fu, Xiaohai Tian, Shuju Shi, Wei Li, Zejun Ma, Tan Lee:
An ASR-Free Fluency Scoring Approach with Self-Supervised Learning. 1-5 - Samuele Cornell, Zhong-Qiu Wang, Yoshiki Masuyama, Shinji Watanabe, Manuel Pariente, Nobutaka Ono, Stefano Squartini:
Multi-Channel Speaker Extraction with Adversarial Training: The Wavlab Submission to The Clarity ICASSP 2023 Grand Challenge. 1-2 - Weinan Tong, Jiaxu Zhu, Jun Chen, Zhiyong Wu, Shiyin Kang, Helen Meng:
TFCnet: Time-Frequency Domain Corrector for Speech Separation. 1-5 - Zohreh Hajiakhondi-Meybodi, Arash Mohammadi, Ming Hou, Jamshid Abouei, Konstantinos N. Plataniotis:
ViT-Cat: Parallel Vision Transformers With Cross Attention Fusion for Popularity Prediction in MEC Networks. 1-5 - Chenpeng Du, Yiwei Guo, Feiyu Shen, Kai Yu:
Multi-Speaker Multi-Lingual VQTTS System for LIMMITS 2023 Challenge. 1-2 - Muqiao Yang, Joseph Konan, David Bick, Yunyang Zeng, Shuo Han, Anurag Kumar, Shinji Watanabe, Bhiksha Raj:
Paaploss: A Phonetic-Aligned Acoustic Parameter Loss for Speech Enhancement. 1-5 - Xinnan Zhang, Yuanbo Cheng, Xiaolei Shang, Jun Liu:
Optimal Mixed-ADC Arrangement for DOA Estimation Via CRB Using ULA. 1-5 - Shuo Huang, Jia Jia, Zongxin Yang, Wei Wang, Haozhe Wu, Yi Yang, Junliang Xing:
Shuffled Autoregression for Motion Interpolation. 1-5 - Abilé Magbonde, Franck Quaine, Bertrand Rivet:
Constrained Non-Negative PARAFAC2 for Electromyogram Separation. 1-5 - Oscar Ferraz, Helder Araújo, Vítor Silva, Gabriel Falcão Paiva Fernandes:
Benchmarking Convolutional Neural Network Inference on Low-Power Edge Devices. 1-5 - Zipeng Li, Xian Zhong, Shuqin Chen, Wenxuan Liu, Wenxin Huang, Lin Li:
Background Disturbance Mitigation for Video Captioning Via Entity-Action Relocation. 1-5 - Cem Ates Musluoglu, Alexander Bertrand:
A Computationally Efficient Algorithm for Distributed Adaptive Signal Fusion Based on Fractional Programs. 1-5 - Lingfeng Xu, Kimberly D. Mueller, Julie Liss, Visar Berisha:
Decorrelating Language Model Embeddings for Speech-Based Prediction of Cognitive Impairment. 1-5 - Manish Kumar Singh, Konstantinos D. Polyzos, Panagiotis A. Traganitis, Sairaj V. Dhople, Georgios B. Giannakis:
Physics-Informed Transfer Learning for Voltage Stability Margin Prediction. 1-5 - Yu Zhang, Yue Wang, Zhi Tian, Geert Leus, Gong Zhang:
Super-Resolution Harmonic Retrieval of Non-Circular Signals. 1-5 - Vasileios Mygdalis, Ioannis Pitas:
Exploiting One-Class Classification Optimization Objectives for Increasing Adversarial Robustness. 1-5 - Xiang Li, Yucheng Zhou:
Disentangled and Robust Representation Learning for Bragging Classification in Social Media. 1-5 - B. Ashwini, Vrinda Narayan, Jainendra Shukla:
SPASHT: Semantic and Pragmatic Speech Features for Automatic Assessment of Autism. 1-5 - Wei Chen, Yulin He, Zhengfa Liang, Yulan Guo:
Adaptive Scale and Spatial Aggregation for Real-Time Object Detection. 1-5 - Kuan-Lin Chen, Ching Hua Lee, Bhaskar D. Rao, Harinath Garudadri:
A DNN Based Normalized Time-Frequency Weighted Criterion for Robust Wideband DoA Estimation. 1-5 - Jinhai Yang, Hua Yang:
Sine: Similarity-Regularized Intra-Class Exploitation for Cross-Granularity Few-Shot Learning. 1-5 - Dan Berrebbi, Ronan Collobert, Navdeep Jaitly, Tatiana Likhomanenko:
More Speaking or More Speakers? 1-5 - Santiago Pascual, Gautam Bhattacharya, Chunghsin Yeh, Jordi Pons, Joan Serrà:
Full-Band General Audio Synthesis with Score-Based Diffusion. 1-5 - Kohei Matsuura, Takanori Ashihara, Takafumi Moriya, Tomohiro Tanaka, Atsunori Ogawa, Marc Delcroix, Ryo Masumura:
Leveraging Large Text Corpora For End-To-End Speech Summarization. 1-5 - Georgios Kollias, Vassilis Kalantzis, Theodoros Salonidis, Shashanka Ubaru:
Quantum Graph Transformers. 1-5 - Yusuke Yasuda, Tomoki Toda:
Text-To-Speech Synthesis Based on Latent Variable Conversion Using Diffusion Probabilistic Model and Variational Autoencoder. 1-5 - Federico Landini, Mireia Díez, Alicia Lozano-Diez, Lukás Burget:
Multi-Speaker and Wide-Band Simulated Conversations as Training Data for End-to-End Neural Diarization. 1-5 - Viet-Anh Nguyen, Anh H. T. Nguyen, Andy W. H. Khong:
Improving Performance of Real-Time Full-Band Blind Packet-Loss Concealment with Predictive Network. 1-5 - Andrea Guamo-Morocho, Roberto López-Valcarce:
Frequency-Selective Hybrid Beamforming For mmWave Full-Duplex. 1-5 - Youngwon Choi, Eunkyun Lee, Inseon Jang, Jong Won Shin:
Individual Sub-Band Estimation Approach to Bandwidth Extension and Enhancement of Coded Speech. 1-5 - Ting Dang, Antoni Dimitriadis, Jingyao Wu, Vidhyasaharan Sethu, Eliathamby Ambikairajah:
Constrained Dynamical Neural ODE for Time Series Modelling: A Case Study on Continuous Emotion Prediction. 1-5 - Linlong Wu, Bowen Wang, Ziyang Cheng, Bhavani Shankar Mysore Rama Rao, Björn E. Ottersten:
Joint Symbol-Level Precoding and Sub-Block-Level RIS Design for Dual-Function Radar-Communications. 1-5 - Weijie Xiong, Jinfeng Hu, Kai Zhong:
MIMO Radar Transmit Beampattern Matching Via Manifold Optimization. 1-5 - Raphaël Baena, Lucas Drumetz, Vincent Gripon:
Entropy Based Feature Regularization to Improve Transferability of Deep Learning Models. 1-5 - Jakub Mosinski, Piotr Bilinski, Thomas Merritt, Abdelhamid Ezzerg, Daniel Korzekwa:
AE-Flow: Autoencoder Normalizing Flow. 1-5 - Ruiming Guo, Ayush Bhandari:
Unlimited Sampling of FRI Signals Independent of Sampling Rate. 1-5 - Wen Wu, Chao Zhang, Philip C. Woodland:
Self-Supervised Representations in Speech-Based Depression Detection. 1-5 - Jian Yang, Chen Li, Xuelong Li:
Underwater Image Restoration with Light-Aware Progressive Network. 1-5 - Andrei Buciulea, Antonio G. Marques:
Graph Learning from Gaussian and Stationary Graph Signals. 1-5 - Hui Lan, Cheolkon Jung, Yang Liu, Ming Li:
CNN Filter for RPR-Based SR in VVC with Wavelet Decomposition. 1-5 - Jung Uk Kim, Yong Man Ro:
Similarity Relation Preserving Cross-Modal Learning for Multispectral Pedestrian Detection Against Adversarial Attacks. 1-5 - Reem Gody, David Harwath:
Unsupervised Fine-Tuning Data Selection for ASR Using Self-Supervised Speech Models. 1-5 - Yujie Yang, Changsheng Quan, Xiaofei Li:
MCNET: Fuse Multiple Cues for Multichannel Speech Enhancement. 1-5 - Lili Yin, Di Wu, Zhibin Qiu, Hao Huang:
Mitigating Domain Dependency for Improved Speech Enhancement Via SNR Loss Boosting. 1-5 - Lunchen Xie, Kaiyu Huang, Fan Xu, Qingjiang Shi:
ZO-DARTS: Differentiable Architecture Search with Zeroth-Order Approximation. 1-5 - Yu Rong, Kumar Vijay Mishra, Daniel W. Bliss:
Wireless Sensing for Simultaneous Human Vocal Sound and Heart Sound Recognition. 1-5 - Zhi Zhu, Yoshinao Sato:
Domain Adaptation without Catastrophic Forgetting on a Small-Scale Partially-Labeled Corpus for Speech Emotion Recognition. 1-5 - Letian Zhang, Jinping Wang, Lu Jie, Nanjie Chen, Xiaojun Tan, Zhifei Duan:
LMBAO: A Landmark Map for Bundle Adjustment Odometry in LiDAR SLAM. 1-5 - Ziyu Zhu, Wenlei Liu, Zhidong Deng:
Learnable Flow Model Conditioned on Graph Representation Memory for Anomaly Detection. 1-5 - Justin Cano, Yi Ding, Gaël Pagès, Eric Chaumette, Jerome Le Ny:
A Robust Kalman Filter Based Approach for Indoor Robot Positioning with Multi-Path Contaminated UWB Data. 1-5 - Kaixuan Zhang, Zihan Liu, Jiashang Hu, Shilin Wang:
An Auto-Encoder Based Method for Camera Fingerprint Compression. 1-5 - Naoyuki Kanda, Jian Wu, Xiaofei Wang, Zhuo Chen, Jinyu Li, Takuya Yoshioka:
VarArray Meets t-SOT: Advancing the State of the Art of Streaming Distant Conversational Speech Recognition. 1-5 - Hayate Kojima, Hikari Noguchi, Koki Yamada, Yuichi Tanaka:
Restoration of Time-Varying Graph Signals using Deep Algorithm Unrolling. 1-5 - Zhouyuan Huo, Khe Chai Sim, Bo Li, Dongseong Hwang, Tara N. Sainath, Trevor Strohman:
Resource-Efficient Transfer Learning from Speech Foundation Model Using Hierarchical Feature Fusion. 1-5 - Vasista Sai Lodagala, Sreyan Ghosh, Srinivasan Umesh:
Data2vec-Aqc: Search for the Right Teaching Assistant in the Teacher-Student Training Setup. 1-5 - Lester Phillip Violeta, Ding Ma, Wen-Chin Huang, Tomoki Toda:
Intermediate Fine-Tuning Using Imperfect Synthetic Speech for Improving Electrolaryngeal Speech Recognition. 1-5 - Samy Labsir, Alexandre Renaux, Jordi Vilà-Valls, Éric Chaumette:
Cramér-Rao Bound on Lie Groups with Observations on Lie Groups: Application to SE(2). 1-5 - Hongyu Fu, Yijing Yang, Vinod K. Mishra, C.-C. Jay Kuo:
Classification via Subspace Learning Machine (SLM): Methodology and Performance Evaluation. 1-5 - Yuwei Ren, Matt Zivney, Yin Huang, Eddie Choy, Chirag Patel, Hao Xu:
Speaker Diaphragm Excursion Prediction: Deep Attention and Online Adaptation. 1-5 - Imen Ayadi, Florent Bouchard, Frédéric Pascal:
Elliptical Wishart Distribution: Maximum Likelihood Estimator from Information Geometry. 1-5 - Yiwei Wei, Shaozu Yuan, Meng Chen, Longbiao Wang:
Enhancing Multimodal Alignment with Momentum Augmentation for Dense Video Captioning. 1-5 - Shuaiqi Chen, Xiaofen Xing, Weibin Zhang, Weidong Chen, Xiangmin Xu:
DWFormer: Dynamic Window Transformer for Speech Emotion Recognition. 1-5 - Junwei Liao, Duyu Tang, Fan Zhang, Shuming Shi:
Skillnet-NLG: General-Purpose Natural Language Generation with a Sparsely Activated Approach. 1-5 - Chen He, Weisheng Gong, Yangrui Dong, Xie Xie, Z. Jane Wang:
Radio Map Based UAV Target Localization. 1-5 - Matteo Torcoli, Emanuël A. P. Habets:
Better Together: Dialogue Separation and Voice Activity Detection for Audio Personalization in TV. 1-5 - Michaela Areti Zervou, Effrosyni Doutsi, Panagiotis Tsakalides:
Efficient Protein Structural Class Prediction Via Chaos Game Representation and Recurrent Neural Networks. 1-5 - Ming Cheng, Weiqing Wang, Yucong Zhang, Xiaoyi Qin, Ming Li:
Target-Speaker Voice Activity Detection Via Sequence-to-Sequence Prediction. 1-5 - Jing Yang, Zhiqiang You, Zhiwei Zhong, Peng Liu, Langqi Mei, Shenguang Huang:
DTTR: Detecting Text with Transformers. 1-5 - Shanxiang Lyu:
Optimized Dithering for Quantization Index Modulation. 1-5 - Santiago Ruiz, Toon van Waterschoot, Marc Moonen:
Centralized Cascade Multi-Channel Noise Reduction and Acoustic Feedback Cancellation in a Wireless Acoustic Sensor And Actuator Network. 1-5 - Sebastian Ellis, Stefan Goetze, Heidi Christensen:
Moving Towards Non-Binary Gender Identification Via Analysis of System Errors in Binary Gender Classification. 1-5 - Jie Zhang, Yi Xiao, Yan Zheng, Zhenni Wang, Chi-Sing Leung:
Semantic-Aware Gated Fusion Network For Interactive Colorization. 1-5 - Loveneet Saini, Axel Acosta, Gor Hakobyan:
Graph Neural Networks for Object Type Classification Based on Automotive Radar Point Clouds and Spectra. 1-5 - Siang-Ruei Wu, Chun-Tse Li, Hao-Chung Cheng:
Efficient Data Loading with Quantum Autoencoder. 1-5 - Xin Lu, Weixiang Zhao, Yanyan Zhao, Bing Qin, Zhentao Zhang, Junjie Wen:
A Topic-Enhanced Approach for Emotion Distribution Forecasting in Conversations. 1-5 - Jin-Seong Choi, Jae-Hong Lee, Chae-Won Lee, Joon-Hyuk Chang:
M-CTRL: A Continual Representation Learning Framework with Slowly Improving Past Pre-Trained Model. 1-5 - Nicolas Zilberstein, Chris Dick, Rahman Doost-Mohammady, Ashutosh Sabharwal, Santiago Segarra:
Accelerated Massive MIMO Detector Based on Annealed Underdamped Langevin Dynamics. 1-5 - Changan Chen, Wei Sun, David Harwath, Kristen Grauman:
Learning Audio-Visual Dereverberation. 1-5 - Zechao Hu, Adrian G. Bors:
Enabling Large-Scale Image Search with Co-Attention Mechanism. 1-5 - Wook-Hyung Kim, Cheul-Hee Hahm, Anant Baijal, Namuk Kim, Ilhyun Cho, Jayoon Koo:
LiNuIQA: Lightweight No-Reference Image Quality Assessment Based on Non-Uniform Weighting. 1-5 - Yingxuan You, Hong Liu, Xia Li, Wenhao Li, Ti Wang, Runwei Ding:
Gator: Graph-Aware Transformer with Motion-Disentangled Regression for Human Mesh Recovery from a 2D Pose. 1-5 - Zhaowei Chen, Peng Li, Zeyong Wei, Honghua Chen, Haoran Xie, Mingqiang Wei, Fu Lee Wang:
Geogcn: Geometric Dual-Domain Graph Convolution Network For Point Cloud Denoising. 1-5 - Xiao-Min Zeng, Yan Song, Zhu Zhuo, Yu Zhou, Yu-Hong Li, Hui Xue, Li-Rong Dai, Ian McLoughlin:
Joint Generative-Contrastive Representation Learning for Anomalous Sound Detection. 1-5 - Gokul Karthik Kumar, Praveen S. V, Pratyush Kumar, Mitesh M. Khapra, Karthik Nandakumar:
Towards Building Text-to-Speech Systems for the Next Billion Users. 1-5 - Tian Huey Teh, Vivian Hu, Devang S. Ram Mohan, Zack Hodari, Christopher G. R. Wallis, Tomás Gómez Ibarrondo, Alexandra Torresquintero, James Leoni, Mark J. F. Gales, Simon King:
Ensemble Prosody Prediction For Expressive Speech Synthesis. 1-5 - Junlin Hou, Fan Xiao, Jilan Xu, Rui Feng, Yue Zhang, Haidong Zou, Lina Lu, Wenwen Xue:
Diabetic Retinopathy Grading with Weakly-Supervised Lesion Priors. 1-5 - Vedran Mihal, Markus Püschel:
Möbius Total Variation for Directed Acyclic Graphs. 1-5 - Pallavi Kaushik, Ilina Tripathi, Partha Pratim Roy:
Motor Activity Recognition Using EEG Data and Ensemble of Stacked BLSTM-LSTM Network and Transformer Model. 1-5 - Trisha Mittal, Zakaria Aldeneh, Masha Fedzechkina, Anurag Ranjan, Barry-John Theobald:
Naturalistic Head Motion Generation from Speech. 1-5 - Pranav Kulkarni, P. P. Vaidyanathan:
Difference Coarrays of Rational Arrays. 1-5 - Yuchen Hu, Chen Chen, Ruizhe Li, Qiushi Zhu, Eng Siong Chng:
Gradient Remedy for Multi-Task Learning in End-to-End Noise-Robust Speech Recognition. 1-5 - Durgesh Singh, Ahcène Boubekki, Robert Jenssen, Michael C. Kampffmeyer:
Supercm: Revisiting Clustering for Semi-Supervised Learning. 1-5 - Shingo Takemoto, Shunsuke Ono:
Enhancing Spatio-Spectral Regularization by Structure Tensor Modeling for Hyperspectral Image Denoising. 1-5 - Sudheer Kovela, Rafael Valle, Ambrish Dantrey, Bryan Catanzaro:
Any-to-Any Voice Conversion with F0 and Timbre Disentanglement and Novel Timbre Conditioning. 1-5 - Maria Tzelepi, Paraskevi Nousi, Anastasios Tefas:
Improving Electric Load Demand Forecasting with Anchor-Based Forecasting Method. 1-5 - Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka:
Breaking the Trade-Off in Personalized Speech Enhancement With Cross-Task Knowledge Distillation. 1-5 - Omar Zamzam, Haleh Akrami, Richard M. Leahy:
Learning From Positive and Unlabeled Data Using Observer-GAN. 1-5 - Yan-Tsung Peng, Wei-Hua Li:
Rain2Avoid: Self-Supervised Single Image Deraining. 1-5 - Cai Wen, Timothy N. Davidson:
Transceiver Design for MIMO-DFRC Systems. 1-5 - Jun Chen, Wei Rao, Zilin Wang, Jiuxin Lin, Zhiyong Wu, Yannan Wang, Shidong Shang, Helen Meng:
Inter-Subnet: Speech Enhancement with Subband Interaction. 1-5 - Juan Cerviño, Juan Andrés Bazerque, Miguel Calvo-Fullana, Alejandro Ribeiro:
Multi-Task Bias-Variance Trade-Off Through Functional Constraints. 1-5 - Zhuoran Xu, Yang Yang, Zhixiang Zhang, Weiming Zhang:
No Reference Quality Assessment for Screen Content Images Based on Entire and High-Influence Regions. 1-5 - Yonathan Eder, Zhuoyang Liu, Yonina C. Eldar:
Sparse Non-Contact Multiple People Localization and Vital Signs Monitoring Via FMCW Radar. 1-5 - Ben Luijten, Boudewine W. Ossenkoppele, Nico de Jong, Martin D. Verweij, Yonina C. Eldar, Massimo Mischi, Ruud J. G. van Sloun:
Neural Maximum-a-Posteriori Beamforming for Ultrasound Imaging. 1-5 - Longbin Jin, Yealim Oh, Hyunseo Kim, Hyuntaek Jung, Hyo Jin Jon, Jung Eun Shin, Eun Yi Kim:
CONSEN: Complementary and Simultaneous Ensemble for Alzheimer's Disease Detection and MMSE Score Prediction. 1-2 - Swapnil Bhosale, Rupayan Chakraborty, Sunil Kumar Kopparapu:
A Novel Metric For Evaluating Audio Caption Similarity. 1-5 - Shu Liu, Yan Xu, Tongming Wan, Xiaoyan Kui:
A Dual-Branch Adaptive Distribution Fusion Framework for Real-World Facial Expression Recognition. 1-5 - Ishan D. Khurjekar, Peter Gerstoft, Christoph F. Mecklenbräuker, Zoi-Heleni Michalopoulou:
Direction-of-Arrival Estimation Using Gaussian Process Interpolation. 1-5 - Ruchao Fan, Yiming Wang, Yashesh Gaur, Jinyu Li:
CTCBERT: Advancing Hidden-Unit BERT with CTC Objectives. 1-5 - Quang Minh Nguyen, Nhan Khanh Le, Lam M. Nguyen:
Scalable and Secure Federated XGBoost. 1-5 - Zhijun Liu, Yiwei Guo, Kai Yu:
DiffVoice: Text-to-Speech with Latent Diffusion. 1-5 - R. Gnana Praveen, Eric Granger, Patrick Cardinal:
Recursive Joint Attention for Audio-Visual Fusion in Regression Based Emotion Recognition. 1-5 - Hongwei Yu, Jiansheng Chen, Huimin Ma, Cheng Yu, Xinlong Ding:
Defending Against Universal Patch Attacks by Restricting Token Attention in Vision Transformers. 1-5 - Qizhi Wang, Wei Huang, Yuan Zhang, Xuanya Li, Xiongjun Ye, Kai Hu:
Automatic Segmentation of Nasopharyngeal Carcinoma in CT Images Using Dual Attention and Edge Detection. 1-5 - Agnimitra Dasgupta, Carlo Graziani, Zichao Wendy Di:
Simultaneous Reconstruction and Uncertainty Quantification for Tomography. 1-5 - Pai Zhu, Hyun Jin Park, Alex Park, Angelo Scorza Scarpati, Ignacio López-Moreno:
Locale Encoding for Scalable Multilingual Keyword Spotting Models. 1-5 - Jaroslav Cmejla, Zbynek Koldovský, Václav Kautský, Tülay Adali:
Dynamic Independent Component Extraction with Blending Mixing Vector: Lower Bound on Mean Interference-to-Signal Ratio. 1-5 - Ruixian Liu, Peter Gerstoft:
SD-PINN: Physics Informed Neural Networks for Spatially Dependent PDEs. 1-5 - Zhengyuan Liu, Nancy F. Chen:
Picking the Underused Heads: A Network Pruning Perspective of Attention Head Selection for Fusing Dialogue Coreference Information. 1-5 - Do June Min, Andreas Stolcke, Anirudh Raju, Colin Vaz, Di He, Venkatesh Ravichandran, Viet Anh Trinh:
Adaptive Endpointing with Deep Contextual Multi-Armed Bandits. 1-5 - Weitao Yuan, Yuren Bian, Shengbei Wang, Masashi Unoki, Wenwu Wang:
An Improved Optimal Transport Kernel Embedding Method with Gating Mechanism for Singing Voice Separation and Speaker Identification. 1-5 - Oguzhan Ulucan, Diclehan Ulucan, Marc Ebner:
Block-Based Color Constancy: The Deviation of Salient Pixels. 1-5 - Siyuan Shen, Feng Liu, Aimin Zhou:
Mingling or Misalignment? Temporal Shift for Speech Emotion Recognition with Pre-Trained Representations. 1-5 - Yadong Niu, Nan Li, Xihong Wu, Jing Chen:
A Model-Based Hearing Compensation Method Using a Self-Supervised Framework. 1-5 - Maciej Niedzwiecki, Artur Gancza, Lu Shen, Yuriy V. Zakharov:
On Bidirectional Preestimates and Their Application to Identification of Fast Time-Varying Systems. 1-5 - Ignacio Santamaría, Mohammad Soleymani, Eduard A. Jorswieck, Jesús Gutiérrez:
Interference Leakage Minimization in RIS-Assisted MIMO Interference Channels. 1-5 - Jahangir Alam, Woo Hyun Kang, Abderrahim Fathan:
Hybrid Neural Network with Cross- and Self-Module Attention Pooling for Text-Independent Speaker Verification. 1-5 - Jinxin Guo, Jiaqiang Zhang, ShaoJie Li, Xiaojing Zhang, Ming Ma:
MTFD: Multi-Teacher Fusion Distillation for Compressed Video Action Recognition. 1-5 - Koen C. E. van de Camp, Hamdi Joudeh, Duarte J. Antunes, Ruud J. G. van Sloun:
Active Subsampling Using Deep Generative Models by Maximizing Expected Information Gain. 1-5 - N. Shashaank, Berker Banar, Mohammad Rasool Izadi, Jeremy Kemmerer, Shuo Zhang, Chuan-Che Jeff Huang:
HiSSNet: Sound Event Detection and Speaker Identification via Hierarchical Prototypical Networks for Low-Resource Headphones. 1-5 - Xiaopeng Yan, Yindi Yang, Zhihao Guo, Liangliang Peng, Lei Xie:
The NPU-Elevoc Personalized Speech Enhancement System for ICASSP 2023 DNS Challenge. 1-2 - Chunfeng Wang, Peisong Huang, Yuxiang Zou, Haoyu Zhang, Shichao Liu, Xiang Yin, Zejun Ma:
LiteG2P: A Fast, Light and High Accuracy Model for Grapheme-to-Phoneme Conversion. 1-5 - Chengxiang Lei, Sichao Fu, Yuetian Wang, Wenhao Qiu, Yachen Hu, Qinmu Peng, Xinge You:
Self-Supervised Guided Hypergraph Feature Propagation for Semi-Supervised Classification with Missing Node Features. 1-5 - Prateek Keserwani, Srinivas Soumitri Miriyala, Vikram Nelvoy Rajendiran, Pradeep N. Shivamurthappa:
Receptive Field Reliant Zero-Cost Proxies for Neural Architecture Search. 1-5 - Po-Chih Chen, P. P. Vaidyanathan:
Error Analysis of Convolutional Beamspace Algorithms. 1-5 - Yi Luo:
Streaming Multi-Channel Speech Separation with Online Time-Domain Generalized Wiener Filter. 1-5 - Xuyang Liu, Yuan Zheng:
Class-Guided Triple Head Prediction Network for Long-Tail Object Detection. 1-5 - Jingyu Li, Yusheng Tian, Tan Lee:
Convolution-Based Channel-Frequency Attention for Text-Independent Speaker Verification. 1-5 - Prachi Singh, Amrit Kaul, Sriram Ganapathy:
Supervised Hierarchical Clustering Using Graph Neural Networks for Speaker Diarization. 1-5 - Masato Hagiwara:
AVES: Animal Vocalization Encoder Based on Self-Supervision. 1-5 - Feifei Xiong, Minya Dong, Kechenying Zhou, Houwei Zhu, Jinwei Feng:
Deep Subband Network for Joint Suppression of Echo, Noise and Reverberation in Real-Time Fullband Speech Communication. 1-5 - Kareem Metwaly, Junho Kweon, Khaled Alhujaili, Maria Greco, Fulvio Gini, Vishal Monga:
Interpretable, Unrolled Deep Radar Beampattern Design. 1-5 - Dimitris Kompostiotis, Dimitris Vordonis, Vassilis Paliouras:
Received Power Maximization with Practical Phase-Dependent Amplitude Response in RIS-Aided OFDM Wireless Communications. 1-5 - Tianrui Wang, Xie Chen, Zhuo Chen, Shu Yu, Weibin Zhu:
An Adapter Based Multi-Label Pre-Training for Speech Separation and Enhancement. 1-5 - Ruolin Su, Zhongkai Sun, Sixing Lu, Chengyuan Ma, Chenlei Guo:
Clicker: Attention-Based Cross-Lingual Commonsense Knowledge Transfer. 1-5 - David M. Chan, Shalini Ghosh, Ariya Rastrow, Björn Hoffmeister:
Domain Adaptation with External Off-Policy Acoustic Catalogs for Scalable Contextual End-to-End Automated Speech Recognition. 1-5 - Emna Ghorbel, Mahmoud Ghorbel, Slim M'hiri:
Data Augmentation Based On Invariant Shape Blending For Deep Learning Classification. 1-5 - Stéphane Ragot, Adriana Vasilache:
Spherical Vector Quantization for Spatial Direction Coding. 1-5 - Zhe Liu, Yue Hui, Fuchun Peng:
Group Personalized Federated Learning. 1-5 - Jisheng Bai, Siwei Huang, Han Yin, Yafei Jia, Mou Wang, Jianfeng Chen:
3D Audio Signal Processing Systems for Speech Enhancement and Sound Localization and Detection. 1-2 - Abdellah Rahmani, Arun Venkitaraman, Pascal Frossard:
A Meta-GNN Approach to Personalized Seizure Detection and Classification. 1-5 - Sian Jin, Pu Wang, Petros Boufounos, Ryuhei Takahashi, Sumit Roy:
Spatial-Domain Object Detection Under MIMO-FMCW Automotive Radar Interference. 1-5 - Moritz Garkisch, Vahid Jamali, Robert Schober:
Codebook-Based User Tracking in IRS-Assisted mmWave Communication Networks. 1-5 - Sina Hafezi, Alastair H. Moore, Pierre Guiraud, Patrick A. Naylor, Jacob Donley, Vladimir Tourbabin, Thomas Lunner:
Subspace Hybrid Beamforming for Head-Worn Microphone Arrays. 1-5 - Yukai Ju, Jun Chen, Shimin Zhang, Shulin He, Wei Rao, Weixin Zhu, Yannan Wang, Tao Yu, Shidong Shang:
TEA-PSE 3.0: Tencent-Ethereal-Audio-Lab Personalized Speech Enhancement System For ICASSP 2023 DNS-Challenge. 1-2 - Kun Hu, Xianchen Zhou, Mingyu Cao, Mengzhu Wang, Guangjie Gao, Wenjing Yang, Huibin Tan:
Progressive Perception Learning for Distribution Modulation in Siamese Tracking. 1-2 - Seungheon Doh, Minz Won, Keunwoo Choi, Juhan Nam:
Textless Speech-to-Music Retrieval Using Emotion Similarity. 1-5 - Zeyu Xiong, Daizong Liu, Pan Zhou, Jiahao Zhu:
Tracking Objects and Activities with Attention for Temporal Sentence Grounding. 1-5 - Vincent K. M. Cheung, Yueh-Po Peng, Jing-Hua Lin, Li Su:
Decoding Musical Pitch from Human Brain Activity with Automatic Voxel-Wise Whole-Brain fMRI Feature Selection. 1-5 - Shijun Wang, Jón Guðnason, Damian Borth:
Fine-Grained Emotional Control of Text-to-Speech: Learning to Rank Inter- and Intra-Class Emotion Intensities. 1-5 - Xuchu Chen, Yu Pu, Jinpeng Li, Wei-Qiang Zhang:
Cross-Lingual Alzheimer's Disease Detection Based on Paralinguistic and Pre-Trained Features. 1-2 - Frank Cwitkowitz, Toni Hirvonen, Anssi Klapuri:
Fretnet: Continuous-Valued Pitch Contour Streaming For Polyphonic Guitar Tablature Transcription. 1-5 - Kriti Kumar, Angshul Majumdar, Achanna Anil Kumar, M. Girish Chandra:
Unsupervised Domain Adaptation via Subspace Interpolating Deep Dictionary Learning: A Case Study in Machine Inspection. 1-5 - Thibault Maho, Teddy Furon, Erwan Le Merrer:
Model Fingerprinting with Benign Inputs. 1-5 - Yuyun Lian, Yongshan Zhang, Xuxiang Feng, Xinwei Jiang, Zhihua Cai:
Low-Rank Constrained Memory Autoencoder for Hyperspectral Anomaly Detection. 1-5 - Jiexing Qi, Shuhao Li, Zhixin Guo, Yusheng Huang, Chenghu Zhou, Weinan Zhang, Xinbing Wang, Zhouhan Lin:
Text Classification In The Wild: A Large-Scale Long-Tailed Name Normalization Dataset. 1-5 - Chan-Shuo Hu, Sung-Wei Tseng, Xin-Yun Fan, Chen-Kuo Chiang:
Vehicle View Synthesis by Generative Adversarial Network. 1-5 - Hanlu Yang, Fateme Ghayem, Ben Gabrielson, Mohammad A. B. S. Akhonda, Vince D. Calhoun, Tülay Adali:
Constrained Independent Component Analysis Based on Entropy Bound Minimization for Subgroup Identification from Multi-subject fMRI Data. 1-5 - Peiyuan Zhai, Raj Thilak Rajan:
Distributed Gaussian Process Hyperparameter Optimization for Multi-Agent Systems. 1-5 - Jun-Hwa Kim, Namho Kim, Chee Sun Won:
High-Speed Drone Detection Based On Yolo-V8. 1-2 - Wenkang Fan, Kaiyun Zhang, Hong Shi, Jianhua Chen, Yinran Chen, Xiongbiao Luo:
Deep Triple-Supervision Learning Unannotated Surgical Endoscopic Video Data for Monocular Dense Depth Estimation. 1-5 - Amirhossein Ahmadian, Fredrik Lindsten:
Enhancing Representation Learning with Deep Classifiers in Presence of Shortcut. 1-5 - Huajian Fang, Niklas Wittmer, Johannes Twiefel, Stefan Wermter, Timo Gerkmann:
Partially Adaptive Multichannel Joint Reduction of Ego-Noise and Environmental Noise. 1-5 - Rui Xu, Xun Liang:
Adaptive Submanifold-Preserving Sparse Regression for Feature Selection And Multiclass Classification. 1-5 - Rui Xu, Xun Liang:
Semi-Supervised Local Structured Feature Learning with Dynamic Maximum Entropy Graph. 1-5 - Xingyun Mao, Heng Qiao:
On Super-Resolution with Separation Prior. 1-5 - Woncheol Shin, Gyubok Lee, Jiyoung Lee, Eunyi Lyou, Joonseok Lee, Edward Choi:
Exploration Into Translation-Equivariant Image Quantization. 1-5 - Rémi Andre, Xavier Luciani:
Sparsity Constraint Implementation for the Joint Eigenvalue Decomposition of Matrices. 1-5 - Da-Hee Yang, Joon-Hyuk Chang:
Selective Film Conditioning with CTC-Based ASR Probability for Speech Enhancement. 1-5 - Pegah Kharazmi, Zhewei Zhao, Clement Chung, Samridhi Choudhary:
Distill-Quantize-Tune - Leveraging Large Teachers for Low-Footprint Efficient Multilingual NLU on Edge. 1-5 - Julius Richter, Simon Welker, Jean-Marie Lemercier, Bunlong Lay, Tal Peer, Timo Gerkmann:
Speech Signal Improvement Using Causal Generative Diffusion Models. 1-2 - Sooyoung Park, Arda Senocak, Joon Son Chung:
MarginNCE: Robust Sound Localization with a Negative Margin. 1-5 - Xu Cao, Wenqian Ye, Elena Sizikova, Xue Bai, Megan Coffee, Hongwu Zeng, Jianguo Cao:
Vitasd: Robust Vision Transformer Baselines for Autism Spectrum Disorder Facial Diagnosis. 1-5 - Meishu Song, Andreas Triantafyllopoulos, Zijiang Yang, Hiroki Takeuchi, Toru Nakamura, Akifumi Kishi, Tetsuro Ishizawa, Kazuhiro Yoshiuchi, Xin Jing, Vincent Karas, Zhonghao Zhao, Kun Qian, Bin Hu, Björn W. Schuller, Yoshiharu Yamamoto:
Daily Mental Health Monitoring from Speech: A Real-World Japanese Dataset and Multitask Learning Analysis. 1-5 - Lisha Chen, Momin Abbas, Tianyi Chen:
A Nested Ensemble Method to Bilevel Machine Learning. 1-5 - Taiga Kawamura, Natsuki Ueno, Nobutaka Ono:
Element Selection with Wide Class of Optimization Criteria Using Non-Convex Sparse Optimization. 1-5 - Kai Wang, Yuhang Yang, Hao Huang, Ying Hu, Sheng Li:
SpeakerAugment: Data Augmentation for Generalizable Source Separation via Speaker Parameter Manipulation. 1-5 - Yan Deng, Long Zhou, Yuanhao Yi, Shujie Liu, Lei He:
Prosody-Aware SpeechT5 for Expressive Neural TTS. 1-5 - Kévin Planolles, Marc Chaumont, Frédéric Comby:
A Study on the Invariance in Security Whatever the Dimension of Images for the Steganalysis by Deep-Learning. 1-5 - Kostas Tsampourakis, Víctor Elvira:
An Augmented Gaussian Sum Filter through a mixture Decomposition. 1-5 - Xubo Liu, Haohe Liu, Qiuqiang Kong, Xinhao Mei, Mark D. Plumbley, Wenwu Wang:
Simple Pooling Front-Ends for Efficient Audio Classification. 1-5 - Joon Byun, Seungmin Shin, Youngcheol Park, Jongmo Sung, Seungkwon Beack:
A Perceptual Neural Audio Coder with a Mean-Scale Hyperprior. 1-5 - Mojtaba Heydari, Ju-Chiang Wang, Zhiyao Duan:
SingNet: a real-time Singing Voice beat and Downbeat Tracking System. 1-5 - Jian Guan, Youde Liu, Qiaoxi Zhu, Tieran Zheng, Jiqing Han, Wenwu Wang:
Time-Weighted Frequency Domain Audio Representation with GMM Estimator for Anomalous Sound Detection. 1-5 - Mumin Jin, Prashant Serai, Jilong Wu, Andros Tjandra, Vimal Manohar, Qing He:
Voice-Preserving Zero-Shot Multiple Accent Conversion. 1-5 - Jiaqing Liu, Chong Deng, Qinglin Zhang, Qian Chen, Wen Wang:
Meeting Action Item Detection with Regularized Context Modeling. 1-5 - Marzieh Hashemipour-Nazari, Renate Debets, Kees Goossens, Alexios Balatsoukas-Stimming:
Recursive/Iterative Unique Projection-Aggregation Decoding of Reed-Muller Codes. 1-5 - Daniel G. Tiglea, Renato Candido, Luis Antonio Azpicueta-Ruiz, Magno T. M. Silva:
Reducing the Communication and Computational Cost of Random Fourier Features Kernel LMS in Diffusion Networks. 1-5 - Leiyu Xie, Yuxing Yang, Zeyu Fu, Syed Mohsen Naqvi:
One-Shot Medical Action Recognition With A Cross-Attention Mechanism And Dynamic Time Warping. 1-5 - Xiaohuan Wu, Yaxin Liu, Xiaoyuan Jia:
Gridless Target Localization for FDA-MIMO Radar with Sparse Arrays. 1-5 - Hui Zhu, Yongchun Lü, Hongyu Zhao, Guoqing Zhao, Xiaofang Zhao:
Not All Classes are Equal: Adaptively Focus-Aware Confidence for Semi-Supervised Object Detection. 1-5 - Saurabh Sihag, Gonzalo Mateos, Corey McMillan, Alejandro Ribeiro:
Predicting Brain Age Using Transferable Covariance Neural Networks. 1-5 - Jinliang Lu, Feihu Jin, Jiajun Zhang:
Adapter Tuning With Task-Aware Attention Mechanism. 1-5 - Ziyang Wang, Sissi Xiaoxiao Wu, Junjie Zhu, Yingying Zhu:
A Privacy-Preserving Trajectory Mining Model. 1-5 - Rakesh Iyer:
NVOC-22: A Low Cost Mel Spectrogram Vocoder for Mobile Devices. 1-5 - Yu Chen, Wen Ding, Junjie Lai:
Improving Noisy Student Training on Non-Target Domain Data for Automatic Speech Recognition. 1-5 - Muskan Gupta, Gokul Kannan, Ranjitha Prasad, Garima Gupta:
Deep Survival Analysis and Counterfactual Inference Using Balanced Representations. 1-5 - Zefang Yu, Yanping Hu, Suncheng Xiang, Ting Liu, Yuzhuo Fu:
CC-PoseNet: Towards Human Pose Estimation in Crowded Classrooms. 1-5 - Yanwu Yang, Guoqing Cai, Chenfei Ye, Yang Xiang, Ting Ma:
Tensor-based Complex-valued Graph Neural Network for Dynamic Coupling Multimodal brain Networks. 1-5 - Gaurav Chaudhary, Laxmidhar Behera, Tushar Sandhan:
Active Perception System for Enhanced Visual Signal Recovery Using Deep Reinforcement Learning. 1-5 - Huiyuan Sun, Prasanga N. Samarasinghe, Thushara D. Abhayapala:
Blind Source Counting and Separation with Relative Harmonic Coefficients. 1-5 - Moein Ahmadi, Mohammad Alaee-Kerahroodi, M. R. Bhavani Shankar, Björn E. Ottersten:
Subspace-Based Detector For Distributed mmWave MIMO Radar Sensors. 1-5 - Dongmei Wang, Xiong Xiao, Naoyuki Kanda, Takuya Yoshioka, Jian Wu:
Target Speaker Voice Activity Detection with Transformers and Its Integration with End-To-End Neural Diarization. 1-5 - Kai Liu, Ziqing Du, Xucheng Wan, Huan Zhou:
X-SEPFORMER: End-To-End Speaker Extraction Network with Explicit Optimization on Speaker Confusion. 1-5 - Hyungjun Lim, Younggwan Kim, Kiho Yeom, Eunjoo Seo, Hoodong Lee, Stanley Jungkyu Choi, Honglak Lee:
Lightweight Feature Encoder for Wake-Up Word Detection Based on Self-Supervised Speech Representation. 1-5 - Jaesung Huh, Jacob Chalk, Evangelos Kazakos, Dima Damen, Andrew Zisserman:
Epic-Sounds: A Large-Scale Dataset of Actions that Sound. 1-5 - Matthew Phelps, Ryan Swindle, J. Zachary Gazak, Andrew Vandenberg, Justin Fletcher:
SPECTRANET-SO(3): Learning Satellite Orientation from Optical Spectra by Implicitly Modeling Mutually Exclusive Probability Distributions on The Rotation Manifold. 1-5 - Hao Wu, Bo Yang, Xiaopeng Ke, Siyi He, Fengyuan Xu, Sheng Zhong:
GAPter: Gray-Box Data Protector for Deep Learning Inference Services at User Side. 1-5 - Roshan Sharma, Weipeng He, Ju Lin, Egor Lakomkin, Yang Liu, Kaustubh Kalgaonkar:
Egocentric Audio-Visual Noise Suppression. 1-5 - Liwen You, Erika Pelaez Coyotl, Suren Gunturu, Maarten Van Segbroeck:
Transformer-Based Bioacoustic Sound Event Detection on Few-Shot Learning Tasks. 1-5 - Chule Yang, Chao Zhang, Zunlin Fan, Zeting Yu, Qianchong Sun, Mengyuan Dai:
A Multi-Channel Aggregation Framework for Object Detection in Large-Scale SAR Image. 1-5 - Peipei Liu, Xin Zheng, Hong Li, Jie Liu, Yimo Ren, Hongsong Zhu, Limin Sun:
Improving the Modality Representation with multi-view Contrastive Learning for Multimodal Sentiment Analysis. 1-5 - Tianrun Chen, Chenglong Fu, Lanyun Zhu, Papa Mao, Jia Zhang, Ying Zang, Lingyun Sun:
Deep3DSketch: 3D Modeling from Free-Hand Sketches with View- and Structural-Aware Adversarial Training. 1-5 - Karren D. Yang, Ting-Yao Hu, Jen-Hao Rick Chang, Hema Swetha Koppula, Oncel Tuzel:
Text is all You Need: Personalizing ASR Models Using Controllable Speech Synthesis. 1-5 - Jakob Möderl, Erik Leitinger, Franz Pernkopf, Klaus Witrisal:
Variational Message Passing-Based Respiratory Motion Estimation and Detection Using Radar Signals. 1-5 - Kaiyang Liu, Wendong Gan, Chenchen Yuan:
MAID: A Conditional Diffusion Model for Long Music Audio Inpainting. 1-5 - Félix Mathieu, Thomas Courtat, Gaël Richard, Geoffroy Peeters:
Learning Interpretable Filters In Wav-UNet For Speech Enhancement. 1-5 - Hainan Xu, Fei Jia, Somshubra Majumdar, Shinji Watanabe, Boris Ginsburg:
Multi-Blank Transducers for Speech Recognition. 1-5 - Irfan Al-Hussaini, Cassie S. Mitchell:
Towards Interpretable Seizure Detection Using Wearables. 1-2 - Suzhen Wang, Yifeng Ma, Yu Ding:
Exploring Complementary Features in Multi-Modal Speech Emotion Recognition. 1-5 - Johannes W. de Vries, Miao Sun, Natasja M. S. de Groot, Richard C. Hendriks:
Estimation of Cardiac Fibre Direction Based on Activation Maps. 1-5 - Sarthak Gupta, Vassilis Kekatos:
A Quantum Approach for Stochastic Constrained Binary Optimization. 1-5 - Wanying Ge, Hemlata Tak, Massimiliano Todisco, Nicholas W. D. Evans:
Can Spoofing Countermeasure And Speaker Verification Systems Be Jointly Optimised? 1-5 - Rohun Agrawal, Oscar Leong:
Alternating Phase Langevin Sampling with Implicit Denoiser Priors for Phase Retrieval. 1-5 - Haozhe Xing, Shuyong Gao, Hao Tang, Tsui Qin Mok, Yanlan Kang, Wenqiang Zhang:
TINYCOD: Tiny and Effective Model for Camouflaged Object Detection. 1-5 - Junxuan Huang, Junsong Yuan, Chunming Qiao, Yatong An, Cheng Lu, Bai Chen:
POINTACL: Adversarial Contrastive Learning for Robust Point Clouds Representation Under Adversarial Attack. 1-5 - Minjun Zhu, Yixuan Weng, Shizhu He, Cunguang Wang, Kang Liu, Li Cai, Jun Zhao:
Learning to Build Reasoning Chains by Reliable Path Retrieval. 1-5 - Andros Tjandra, Nayan Singhal, David Zhang, Ozlem Kalinli, Abdelrahman Mohamed, Duc Le, Michael L. Seltzer:
Massively Multilingual ASR on 70 Languages: Tokenization, Architecture, and Generalization Capabilities. 1-5 - Meng Feng, Chieh-Chi Kao, Qingming Tang, Amit Solomon, Viktor Rozgic, Chao Wang:
FedRPO: Federated Relaxed Pareto Optimization for Acoustic Event Classification. 1-5 - Ge Li, Ruonan Zhang:
A Point is A Wave: Point-Wave Network for Place Recognition. 1-5 - Takao Kawamura, Yuma Kinoshita, Nobutaka Ono, Robin Scheibler:
Effectiveness of Inter- and Intra-Subarray Spatial Features for Acoustic Scene Classification. 1-5 - Karim Helwani, Paris Smaragdis, Michael M. Goodwin:
Generative Modeling Based Manifold Learning for Adaptive Filtering Guidance. 1-5 - Yachun Li, Jingjing Wang, Yuhui Chen, Di Xie, Shiliang Pu:
Single Domain Dynamic Generalization for Iris Presentation Attack Detection. 1-5 - Haibo Shen, Juyu Xiao, Yihao Luo, Xiang Cao, Liangqi Zhang, Tianjiang Wang:
Training Robust Spiking Neural Networks with Viewpoint Transform and Spatiotemporal Stretching. 1-5 - Zefan Tian, Rongjie Wang, Zhenyu Wang, Ronggang Wang:
HQP-MVS: High-Quality Plane Priors Assisted Multi-View Stereo for Low-Textured Areas. 1-5 - E. Fekas, Athanasia Zlatintsi, Panagiotis Paraskevas Filntisis, Christos Garoufis, Niki Efthymiou, Petros Maragos:
Relapse Prediction from Long-Term Wearable Data Using Self-Supervised Learning and Survival Analysis. 1-5 - Victor Solo:
On Tracking a Stochastically Time-Varying Subspace. 1-5 - Anni Yu, Yu-Bin Yang:
Learning to Explain: a Gradient-based Attribution Method for Interpreting Super-Resolution Networks. 1-5 - Yi-Zhan Xu, Chih-Yao Chen, Cheng-Te Li:
SUVR: A Search-Based Approach to Unsupervised Visual Representation Learning. 1-5 - Peijie Dong, Xin Niu, Lujun Li, Zhiliang Tian, Xiaodong Wang, Zimian Wei, Hengyue Pan, Dongsheng Li:
RD-NAS: Enhancing One-Shot Supernet Ranking Ability Via Ranking Distillation From Zero-Cost Proxies. 1-5 - Soo-Chang Pei, Kuo-Wei Chang:
Binary Image Fast Perfect Recovery from Sparse 2D-DFT Coefficients. 1-5 - Leying Zhang, Zhengyang Chen, Yanmin Qian:
Adaptive Large Margin Fine-Tuning For Robust Speaker Verification. 1-5 - Yushan Qian, Bo Wang, Ting-En Lin, Yinhe Zheng, Ying Zhu, Dongming Zhao, Yuexian Hou, Yuchuan Wu, Yongbin Li:
Empathetic Response Generation via Emotion Cause Transition Graph. 1-5 - Marvin Borsdorf, Saurav Pahuja, Gabriel Ivucic, Siqi Cai, Haizhou Li, Tanja Schultz:
Multi-Head Attention and GRU for Improved Match-Mismatch Classification of Speech Stimulus and EEG Response. 1-2 - Yoohwan Kwon, Soo-Whan Chung:
MoLE: Mixture Of Language Experts For Multi-Lingual Automatic Speech Recognition. 1-5 - Renat Sergazinov, Mohammadreza Armandpour, Irina Gaynanova:
Gluformer: Transformer-Based Personalized Glucose Forecasting with Uncertainty Quantification. 1-5 - Benedikt Böck, Michael Baur, Valentina Rizzello, Wolfgang Utschick:
Variational Inference Aided Estimation of Time Varying Channels. 1-5 - Ziheng Jiao, Hongyuan Zhang, Xuelong Li:
Learn Topological Representation with Flexible Manifold Layer. 1-5 - Jaemin Jung, Youkyum Kim, Jihwan Park, Youshin Lim, Byeong-Yeol Kim, Youngjoon Jang, Joon Son Chung:
Metric Learning for User-Defined Keyword Spotting. 1-5 - Mingjie Tian, Fausto Giunchiglia, Rui Song, Xing Chen, Hao Xu:
Enhancing Ontology Translation Through Cross-Lingual Agreement. 1-5 - Mirza Asif Haider, Saidur R. Pavel, Yimin D. Zhang, Elias Aboutanios:
Active IRS-Assisted MIMO Channel Estimation and Prediction. 1-5 - Zikang Jin, Changchun Yin, Piji Li, Lu Zhou, Liming Fang, Xiangmao Chang, Zhe Liu:
Multi-Layer Feature Division Transferable Adversarial Attack. 1-5 - Mostafa Sadeghi, Romain Serizel:
Fast and Efficient Speech Enhancement with Variational Autoencoders. 1-5 - Ke Li, Jay Mahadeokar, Jinxi Guo, Yangyang Shi, Gil Keren, Ozlem Kalinli, Michael L. Seltzer, Duc Le:
Improving fast-slow Encoder based Transducer with Streaming Deliberation. 1-5 - Ruixuan Wang, Yue Qi, Mojtaba Vaezi, Xun Jiao, Moeness G. Amin:
Strategies for Enhanced Signal Modulation Classifications Under Unknown Symbol Rates and Noise Conditions. 1-5 - Yanqing Xu, Enbin Song, Qingjiang Shi, Tsung-Hui Chang:
Sparse Aggregation-Based Channel Estimation For Massive MIMO Systems With Decentralized Baseband Processing. 1-5 - Byeongho Jo, Seungkwon Beack, Taejin Lee:
Audio Coding With Unified Noise Shaping And Phase Contrast Control. 1-5 - Maosheng Yang, Bishwadeep Das, Elvin Isufi:
Online Edge Flow Prediction Over Expanding Simplicial Complexes. 1-5 - Jie Huang, Xiachong Feng, Yangfan Ye, Liang Zhao, Xiaocheng Feng, Bing Qin, Ting Liu:
Dialogue Context Modelling for Action Item Detection: Solution for ICASSP 2023 MUG Challenge Track 5. 1-2 - Peizhu Gong, Jin Liu, Xiliang Zhang, Xingye Li:
A Multi-Stage Hierarchical Relational Graph Neural Network for Multimodal Sentiment Analysis. 1-5 - Evangelos Georgatos, Christos Mavrokefalidis, Kostas Berberidis:
Fully Distributed Federated Learning with Efficient Local Cooperations. 1-5 - Djallel Bouneffouf, Mayank Agarwal, Irina Rish:
Dialogue System with Missing Observation. 1-5 - Chi Wang, Jian Gao, Yang Hua, Hui Wang:
Cross-Domain Learning with Normalizing Flow. 1-5 - Siqi Cai, Jingling Yuan, Lin Li:
A Mutual Implicit Sentiment Analysis Model with Bundle-Aware Contrastive Learning. 1-5 - Meng Liu, Kong Aik Lee, Longbiao Wang, Hanyi Zhang, Chang Zeng, Jianwu Dang:
Cross-Modal Audio-Visual Co-Learning for Text-Independent Speaker Verification. 1-5 - Yassine El Ouahidi, Lucas Drumetz, Giulia Lioi, Nicolas Farrugia, Bastien Pasdeloup, Vincent Gripon:
Spatial Graph Signal Interpolation with an Application for Merging BCI Datasets with Various Dimensionalities. 1-5 - Xiaojun Meng, Wenlin Dai, Yasheng Wang, Baojun Wang, Zhiyong Wu, Xin Jiang, Qun Liu:
Lexicon-injected Semantic Parsing for Task-Oriented Dialog. 1-5 - Zihan Chen, Zeshen Li, Howard H. Yang, Tony Q. S. Quek:
Personalizing Federated Learning with Over-The-Air Computations. 1-5 - Dang Nguyen, Trang Nguyen, Khai Nguyen, Dinh Q. Phung, Hung Hai Bui, Nhat Ho:
On Cross-Layer Alignment for Model Fusion of Heterogeneous Neural Networks. 1-5 - Xin Wang, Junichi Yamagishi:
Spoofed Training Data for Speech Spoofing Countermeasure Can Be Efficiently Created Using Neural Vocoders. 1-5 - Yimeng Zhuang:
Heuristic Masking for Text Representation Pretraining. 1-5 - Chih-Wei Lin, Zhongsheng Chen:
U-Shiftformer: Brain Tumor Segmentation Using A Shifted Attention Mechanism. 1-5 - Bowen Zhang, Daijun Ding, Guangning Xu, Jinjin Guo, Zhichao Huang, Xu Huang:
Twitter Stance Detection via Neural Production Systems. 1-5 - Euntae Choi, Youshin Lim, Byeong-Yeol Kim, Hyung Yong Kim, Hanbin Lee, Yunkyu Lim, Seung Woo Yu, Sungjoo Yoo:
Masked Token Similarity Transfer for Compressing Transformer-Based ASR Models. 1-5 - Brent De Weerdt, Yonina C. Eldar, Nikos Deligiannis:
Designing Transformer Networks for Sparse Recovery of Sequential Data Using Deep Unfolding. 1-5 - Xiongjie Chen, Yunpeng Li, Yongxin Yang:
Batch-Ensemble Stochastic Neural Networks for Out-of-Distribution Detection. 1-5 - Zhengkun Tian, Hongyu Xiang, Min Li, Feifei Lin, Ke Ding, Guanglu Wan:
Peak-First CTC: Reducing the Peak Latency of CTC Models by Applying Peak-First Regularization. 1-5 - Zhonghan Niu, Qing-Long Zhang, Yi Fan, Yu-Bin Yang:
M2TSR: Multi-Range and Mix-Grained Transformer for Single Image Super-Resolution. 1-5 - Nele Sophie Brügge, Esfandiar Mohammadi, Alexander Münchau, Tobias Bäumer, Christian Frings, Christian Beste, Veit Rößner, Heinz Handels:
Towards Privacy and Utility in Tourette TIC Detection Through Pretraining Based on Publicly Available Video Data of Healthy Subjects. 1-5 - Jian Cui, Lin Li, Xin Zhang, Jingling Yuan:
Multimodal Propaganda Detection Via Anti-Persuasion Prompt enhanced contrastive learning. 1-5 - Xingxian Liu, Bin Duan, Bo Xiao, Yajing Xu:
Query-Utterance Attention With Joint Modeling For Query-Focused Meeting Summarization. 1-5 - Ji-Hoon Kim, Hongsun Yang, Yooncheol Ju, Ilhwan Kim, Byeongyeol Kim:
CROSSSPEECH: Speaker-Independent Acoustic Representation for Cross-Lingual Speech Synthesis. 1-5 - Cristian J. Vaca-Rubio, Pablo Ramirez-Espinosa, Kimmo Kansanen, Zheng-Hua Tan, Elisabeth de Carvalho:
Radio Sensing with Large Intelligent Surface for 6G. 1-5 - Kohei Saijo, Tetsuji Ogawa:
Self-Remixing: Unsupervised Speech Separation via Separation and Remixing. 1-5 - Irina-Elena Veliche, Pascale Fung:
Improving Fairness and Robustness in End-to-End Speech Recognition Through Unsupervised Clustering. 1-5 - Junyi He, Di Zhang, Shumeng Liu, Yuezhi Zhou, Yaoxue Zhang:
Managing Information Updating with Edge Computing: A Distributed and Learning Approach. 1-5 - Rodrigo Cabral Farias, Sebastian Miron:
Projected Hierarchical ALS for Generalized Boolean Matrix Factorization. 1-5 - Yaqi Zhang, Yan Lu, Bin Liu, Zhiwei Zhao, Qi Chu, Nenghai Yu:
Evopose: A Recursive Transformer for 3D Human Pose Estimation with Kinematic Structure Priors. 1-5 - Menghao Zhang, Jingyu Wang, Jing Wang, Qi Qi, Zirui Zhuang, Haifeng Sun, Ning Xiao:
Robust Video Anomaly Detection Framework via Prior Knowledge and Multi-Path Frame Prediction. 1-5 - Feng Chen, Shiwen Deng, Tieran Zheng, Yongjun He, Jiqing Han:
Graph-Based Spectro-Temporal Dependency Modeling for Anti-Spoofing. 1-5 - Daniel Faronbi, Irán R. Román, Juan Pablo Bello:
Exploring Approaches to Multi-Task Automatic Synthesizer Programming. 1-5 - Yiyang Li, Hongqiu Wu, Hai Zhao:
Contrastive Learning of Functionality-Aware Code Embeddings. 1-5 - Yuxiang Zhang, Mengmeng Zhang, Wei Li, Ran Tao:
Multi-Modal Domain Generalization for Cross-Scene Hyperspectral Image Classification. 1-5 - Marco Comunità, Christian J. Steinmetz, Huy Phan, Joshua D. Reiss:
Modelling Black-Box Audio Effects with Time-Varying Feature Modulation. 1-5 - Jaume Banus, Augustin Ogier, Roger Hullin, Philippe Meyer, Ruud B. van Heeswijk, Jonas Richiardi:
Deep Spatio-Temporal Multiplex Graph Learning for Cardiac Imaging Classification. 1-5 - Rishabh Khurana, Jayesh Rajkumar Vachhani, Sourabh Vasant Gothe, Pranay Kashyap:
Repetition Counting from Compressed Videos Using Sparse Residual Similarity. 1-5 - Hiroki Kuroda, Daichi Kitahara, Eiichi Yoshikawa, Hiroshi Kikuchi, Tomoo Ushio:
Sparsity-Smoothness-Aware Power Spectral Density Estimation with Application to Phased Array Weather Radar. 1-5 - Stavros Sykiotis, Maria Kaselimi, Anastasios Doulamis, Nikolaos Doulamis:
Continilm: A Continual Learning Scheme for Non-Intrusive Load Monitoring. 1-5 - Benno Weck, Xavier Serra:
Data Leakage in Cross-Modal Retrieval Training: A Case Study. 1-5 - Carla Schenker, Xiulin Wang, Evrim Acar:
PARAFAC2-Based Coupled Matrix and Tensor Factorizations. 1-5 - Yuxuan Zhang, Chao Xu, Howard H. Yang, Xijun Wang, Tony Q. S. Quek:
DPP-Based Client Selection for Federated Learning with Non-IID Data. 1-5 - Motoi Omachi, Brian Yan, Siddharth Dalmia, Yuya Fujita, Shinji Watanabe:
Align, Write, Re-Order: Explainable End-to-End Speech Translation via Operation Sequence Generation. 1-5 - Ahmed Omran, Neil Zeghidour, Zalán Borsos, Félix de Chaumont Quitry, Malcolm Slaney, Marco Tagliasacchi:
Disentangling Speech from Surroundings with Neural Embeddings. 1-5 - Amitay Sicherman, Yossi Adi:
Analysing Discrete Self Supervised Speech Representation For Spoken Language Modeling. 1-5 - Youness Moukafih, Mounir Ghogho, Kamel Smaïli:
Supervised Contrastive Learning as Multi-Objective Optimization for Fine-Tuning Large Pre-Trained Language Models. 1-5 - Weiyi Yu, Yiming Lei, Hongming Shan:
Fan-Net: Fourier-Based Adaptive Normalization for Cross-Domain Stroke Lesion Segmentation. 1-5 - Peng Wang, Xi Huang, Li Cui:
IR-ECG: Invertible Reconstruction of ECG. 1-5 - Anurag Kumar, Ke Tan, Zhaoheng Ni, Pranay Manocha, Xiaohui Zhang, Ethan Henderson, Buye Xu:
Torchaudio-Squim: Reference-Less Speech Quality and Intelligibility Measures in Torchaudio. 1-5 - Zihao Zhang, Nan Sang, Xupeng Wang, Mumuxin Cai:
SC-Net: Salient Point and Curvature Based Adversarial Point Cloud Generation Network. 1-5 - Akshay Raina, Vipul Arora:
SyncNet: Correlating Objective for Time Delay Estimation in Audio Signals. 1-5 - Dan Meng, Xue Wang, Jun Wang:
Backdoor Attack Against Automatic Speaker Verification Models in Federated Learning. 1-5 - Xilong Wang, Yaofei Wang, Kejiang Chen, Jinyang Ding, Weiming Zhang, Nenghai Yu:
ICStega: Image Captioning-based Semantically Controllable Linguistic Steganography. 1-5 - Hritam Basak, Soumitri Chattopadhyay, Rohit Kundu, Sayan Nag, Rammohan Mallipeddi:
Ideal: Improved Dense Local Contrastive Learning For Semi-Supervised Medical Image Segmentation. 1-5 - Jicun Li, Xingjian Li, Tianyang Wang, Shi Wang, Yanan Cao, Cheng-Zhong Xu, Dejing Dou:
Improving BERT Fine-Tuning via Stabilizing Cross-Layer Mutual Information. 1-5 - Yuzheng Wang, Zuhao Ge, Zhaoyu Chen, Xian Liu, Chuangjia Ma, Yunquan Sun, Lizhe Qi:
Explicit and Implicit Knowledge Distillation via Unlabeled Data. 1-5 - Michail Chatzianastasis, Loukas Ilias, Dimitris Askounis, Michalis Vazirgiannis:
Neural Architecture Search with Multimodal Fusion Methods for Diagnosing Dementia. 1-5 - Janghwan Lee, Youngdeok Hwang, Jungwook Choi:
Finding Optimal Numerical Format for Sub-8-Bit Post-Training Quantization of Vision Transformers. 1-5 - Xiaoqing Chen, Chengyu Wang, Junwei Dong, Minghui Qiu, Liang Feng, Jun Huang:
Boosting Prompt-Based Few-Shot Learners Through Out-of-Domain Knowledge Distillation. 1-5 - Tomoya Nishida, Takashi Endo, Yohei Kawaguchi:
Zero-Shot Domain Adaptation of Anomalous Samples for Semi-Supervised Anomaly Detection. 1-5 - Yingyi Ma, Zhe Liu, Xuedong Zhang:
Adaptive Multi-Corpora Language Model Training for Speech Recognition. 1-5 - Huan Zhao, Haijiao Chen, Yufeng Xiao, Zixing Zhang:
Privacy-Enhanced Federated Learning Against Attribute Inference Attack for Speech Emotion Recognition. 1-5 - Minmin Yi, Houchun Ning, Peng Liu:
FedSD: A New Federated Learning Structure Used in Non-iid Data. 1-5 - Tala Abdallah, Nisrine Jrad, Fahed Abdallah, Anne Humeau-Heurtier, Patrick Van Bogaert:
Cross-Site Generalization for Imbalanced Epileptic Classification. 1-5 - Shaokai Li, Peng Song, Liang Ji, Yun Jin, Wenming Zheng:
A Generalized Subspace Distribution Adaptation Framework for Cross-Corpus Speech Emotion Recognition. 1-5 - Haifeng Zhao, Hongzhi Wan, Lili Huang, Mingwei Cao:
G2PL: Lexicon Enhanced Chinese Polyphone Disambiguation Using BERT Adapter with a New Dataset. 1-5 - Heming Wang, Yao Qian, Hemin Yang, Naoyuki Kanda, Peidong Wang, Takuya Yoshioka, Xiaofei Wang, Yiming Wang, Shujie Liu, Zhuo Chen, DeLiang Wang, Michael Zeng:
DATA2VEC-SG: Improving Self-Supervised Learning Representations for Speech Generation Tasks. 1-5 - Jiguang He, Aymen Fakhreddine, George C. Alexandropoulos:
Joint Channel and Direction Estimation for Ground-to-UAV Communications Enabled by a Simultaneous Reflecting and Sensing RIS. 1-5 - Maximo Cobos, Mirco Pezzoli, Fabio Antonacci, Augusto Sarti:
Acoustic Source Localization in the Spherical Harmonics Domain Exploiting Low-Rank Approximations. 1-5 - Mateusz Guzik, Konrad Kowalczyk:
Convolutive NTF for Ambisonic Source Separation under Reverberant Conditions. 1-5 - Maria Giulia Preti, Thomas William Arthur Bolton, Alessandra Griffa, Dimitri Van De Ville:
Graph Signal Processing For Neuroimaging to Reveal Dynamics of Brain Structure-Function Coupling. 1-5 - Dilki Wijekoon, Amine Mezghani, Ekram Hossain:
Beamforming Optimization in RIS-Aided MIMO Systems Under Multiple-Reflection Effects. 1-5 - Chenda Li, Yao Qian, Zhuo Chen, Dongmei Wang, Takuya Yoshioka, Shujie Liu, Yanmin Qian, Michael Zeng:
Target Sound Extraction with Variable Cross-Modality Clues. 1-5 - Marouane Tliba, Aladine Chetouani, Giuseppe Valenzise, Frédéric Dufaux:
PCQA-Graphpoint: Efficient Deep-Based Graph Metric for Point Cloud Quality Assessment. 1-5 - Zilin Wang, Peng Liu, Jun Chen, Sipan Li, Jinfeng Bai, Gang He, Zhiyong Wu, Helen Meng:
A Synthetic Corpus Generation Method for Neural Vocoder Training. 1-5 - Yuting He, Renjie Huang, Yangguang Shi, Guoqiang Xiao, Bin Yang, Yuqi Li:
Scale-Adaptive Tiny Object Detection Enhanced by Across-Scale and Shape-Preserved Semantic Location. 1-5 - Mingjie Shao, Wing-Kin Ma, Yatao Liu:
Symbol-Level Precoding is Related to Parameter Estimation from Quantized Data. 1-5 - Yijun Lin, Xingzhe Su, Fengge Wu, Junsuo Zhao:
Exploring Progressive Hybrid-Degraded Image Processing for Homography Estimation. 1-5 - Chenyang Li, Zhi-Qi Cheng, Jun-Yan He, Pengyu Li, Bin Luo, Han-Yuan Chen, Yifeng Geng, Jin-Peng Lan, Xuansong Xie:
Longshortnet: Exploring Temporal and Semantic Features Fusion In Streaming Perception. 1-5 - Xiaoyan Wang, Minghan Shao, Dongyan Guo, Ying Cui, Xiaojie Huang, Ming Xia, Cong Bai:
Multi-Stage Aggregation Transformer for Medical Image Segmentation. 1-5 - Pengteng Li, Ying He, Dongfu Yin, F. Richard Yu, Pinhao Song:
Bagging R-CNN: Ensemble for Object Detection in Complex Traffic Scenes. 1-5 - Yanting Zhang, Shuanghong Wang, Yuxuan Fan, Gaoang Wang, Cairong Yan:
TransLink: Transformer-Based Embedding for Tracklets' Global Link. 1-5 - Yifan Yuan, Siteng Ma, Hongming Shan, Junping Zhang:
DO-FAM: Disentangled Non-Linear Latent Navigation For Facial Attribute Manipulation. 1-5 - Yanchun Li, Xinan He, Shujuan Tian, Zhetao Li, Saiqin Long:
Deep Feature Aggregation for Lightweight Single Image Super-Resolution. 1-5 - Taihui Li, Hengkang Wang, Le Peng, Xian'e Tang, Ju Sun:
Robust Autoencoders for Collective Corruption Removal. 1-5 - Alessandro Ilic Mezza, Giulio Zanetti, Maximo Cobos, Fabio Antonacci:
Zero-Shot Anomalous Sound Detection in Domestic Environments Using Large-Scale Pretrained Audio Pattern Recognition Models. 1-5 - Christos Kolomvakis, Nicolas Gillis:
Robust Binary Component Decompositions. 1-5 - Hongmeng Liu, Jiapeng Zhao, Yixuan Huo, Yuyan Wang, Chun Liao, Liyan Shen, Shiyao Cui, Jinqiao Shi:
URM4DMU: A User Representation Model for Darknet Markets Users. 1-5 - Ming-Yi Hong, Shih-Yen Chang, Hao-Wei Hsu, Yi-Hsiang Huang, Chih-Yu Wang, Che Lin:
TreeXGNN: can gradient-boosted decision trees help boost heterogeneous graph neural networks? 1-5 - Tomás Kerepecký, Jiaming Liu, Xue Wen Ng, David W. Piston, Ulugbek S. Kamilov:
Dual-Cycle: Self-Supervised Dual-View Fluorescence Microscopy Image Reconstruction using CycleGAN. 1-5 - Tingting Zhang, Feng Xu, Sergiy A. Vorobyov:
Transmit Energy Focusing For Parameter Estimation in Transmit Beamspace Slow-Time MIMO Radar. 1-5 - Juntae Kim, Sung Min Ban:
Phase-Aware Spoof Speech Detection Based On Res2Net with Phase Network. 1-5 - Fei Ye, Adrian G. Bors:
Dynamic Scalable Self-Attention Ensemble for Task-Free Continual Learning. 1-5 - Wei Xu, Na Qi, Qing Zhu, Jingzhong Qi, Longlu Huang, Kun Cao, Yuxin Bao, Qianwen Wang:
Color Guided Depth Map Super-Resolution with Nonlocal Autoregressive Modeling. 1-5 - Saki Mizuno, Nobukatsu Hojo, Satoshi Kobashikawa, Ryo Masumura:
Next-Speaker Prediction Based on Non-Verbal Information in Multi-Party Video Conversation. 1-5 - Jinggang Chen, Xiaoyang Qu, Junjie Li, Jianzong Wang, Jiguang Wan, Jing Xiao:
Detecting Out-of-Distribution Examples Via Class-Conditional Impressions Reappearing. 1-5 - Ha Minh Tan, Kai-Wen Liang, Jia-Ching Wang:
Discriminative Vector Learning with Application to Single Channel Speech Separation. 1-5 - Achyut Mani Tripathi, Aakansha Mishra:
Sub-Band Contrastive Learning-Based Knowledge Distillation For Sound Classification. 1-5 - Aditya Sant, Bhaskar D. Rao:
Regularized Neural Detection for Millimeter Wave Massive MIMO Communication Systems with One-Bit ADCs. 1-5 - Haonan Wang, Connor Imes, Souvik Kundu, Peter A. Beerel, Stephen P. Crago, John Paul Walters:
Quantpipe: Applying Adaptive Post-Training Quantization For Distributed Transformer Pipelines In Dynamic Edge Environments. 1-5 - Siddhant Arora, Hayato Futami, Shih-Lun Wu, Jessica Huynh, Yifan Peng, Yosuke Kashiwagi, Emiru Tsunoo, Brian Yan, Shinji Watanabe:
A Study on the Integration of Pipeline and E2E SLU Systems for Spoken Semantic Parsing Toward STOP Quality Challenge. 1-2 - Lei Zhang, Chunyu Lin, Kang Liao, Yao Zhao:
Towards Reliable Image Outpainting: Learning Structure-Aware Multimodal Fusion with Depth Guidance. 1-5 - Philipp Götz, Cagdas Tuna, Andreas Walther, Emanuël A. P. Habets:
Contrastive Representation Learning for Acoustic Parameter Estimation. 1-5 - Zhijin Chen, Branko Ristic, Du Yong Kim:
Possibilistic Bernoulli Filter for Extended Target Tracking. 1-5 - T. Mitchell Roddenberry, Vincent P. Grande, Florian Frantzen, Michael T. Schaub, Santiago Segarra:
Signal Processing On Product Spaces. 1-5 - Antonio Montanaro, Diego Valsesia, Enrico Magli:
Towards Hyperbolic Regularizers For Point Cloud Part Segmentation. 1-5 - Bagus Tris Atmaja, Akira Sasou:
Evaluating Variants of wav2vec 2.0 on Affective Vocal Burst Tasks. 1-5 - Claudio J. Bordin, Caio Gomes de Figueredo, Marcelo G. S. Bruno:
Distributed Bayesian Tracking on the Special Euclidean Group Using Lie Algebra Parametric Approximations. 1-5 - Ghazaleh Ardeshiri, Azadeh Vosoughi:
EH-Enabled Distributed Detection Over Temporally Correlated Markovian MIMO Channels. 1-5 - Benjamin Elizalde, Soham Deshmukh, Mahmoud Al Ismail, Huaming Wang:
CLAP: Learning Audio Concepts from Natural Language Supervision. 1-5 - Xinbiao Liu, Bin Liang, Junyu Niu, Chaofeng Sha, Dong Wu:
Dual-Graph Co-Representation Learning for Knowledge-Graph Enhanced Recommendation. 1-5 - Hengyue Liang, Buyun Liang, Ying Cui, Tim Mitchell, Ju Sun:
Optimization for Robustness Evaluation Beyond ℓp Metrics. 1-5 - Hao Zhang, Lin Mei, Cheolkon Jung:
Long Range Imaging Using Multispectral Fusion of RGB and NIR Images. 1-5 - Hui Tang, Yao Lu, Qi Xuan:
SR-init: An Interpretable Layer Pruning Method. 1-5 - Dafeng Zhang, Xiaobing Wang, Zhezhu Jin:
MRNET: Multi-Refinement Network for Dual-Pixel Images Defocus Deblurring. 1-5 - Martin Jälmby, Filip Elvander, Toon van Waterschoot:
Fast Low-Latency Convolution by Low-Rank Tensor Approximation. 1-5 - Zhe Liu, Xuedong Zhang, Fuchun Peng:
Mitigating Unintended Memorization in Language Models Via Alternating Teaching. 1-5 - Gege Qi, Yuefeng Chen, Yao Zhu, Binyuan Hui, Xiaodan Li, Xiaofeng Mao, Rong Zhang, Hui Xue:
Transaudio: Towards the Transferable Adversarial Audio Attack Via Learning Contextualized Perturbations. 1-5 - Chengliang Wang, Haojian Ning, Xinrun Chen, Shiying Li:
DB-UNet: MLP Based Dual Branch UNet for Accurate Vessel Segmentation in OCTA Images. 1-5 - Soheil Zabihi, Elahe Rahimian, Amir Asif, Arash Mohammadi:
Light-Weight CNN-Attention Based Architecture for Hand Gesture Recognition Via Electromyography. 1-5 - Mingyuan Fan, Wenzhong Guo, Zuobin Ying, Ximeng Liu:
Enhance Transferability of Adversarial Examples with Model Architecture. 1-5 - Xiaoming Ren, Chao Li, Shenjian Wang, Biao Li:
Practice of the Conformer Enhanced Audio-Visual HuBERT on Mandarin and English. 1-5 - Huayi Zhou, Fei Jiang, Jiaxin Si, Lili Xiong, Hongtao Lu:
Stuart: Individualized Classroom Observation of Students with Automatic Behavior Recognition And Tracking. 1-5 - Juliano G. C. Ribeiro, Shoichi Koyama, Hiroshi Saruwatari:
Kernel Interpolation of Acoustic Transfer Functions with Adaptive Kernel for Directed and Residual Reverberations. 1-5 - Juntao Zhang, Yihao Luo, Peng Cheng, Zehan Li, Hao Wu, Kun Yu, Wenbo An, Jun Zhou:
An Application of Quantum Mechanics to Attention Methods in Computer Vision. 1-5 - Jiawei Zhang, Tiantian Wang, Zhixi Feng, Shuyuan Yang:
AMC-Net: An Effective Network for Automatic Modulation Classification. 1-5 - Po-Chih Chen, P. P. Vaidyanathan:
Unitary ESPRIT for Coprime Arrays. 1-5 - Chengmei Yang, Shuai Jiang, Bowei He, Chen Ma, Lianghua He:
Mutually Guided Few-Shot Learning For Relational Triple Extraction. 1-5 - Aaman Rebello, Kriton Konstantinidis, Yao Lei Xu, Danilo P. Mandic:
Tensor Completion for Efficient and Accurate Hyperparameter Optimisation in Large-Scale Statistical Learning. 1-5 - Kyohei Unno, Kohei Matsuzaki, Satoshi Komorita, Kei Kawamura:
Rate-Distortion Optimized Variable-Node-size Trisoup for Point Cloud Coding. 1-5 - Yuchen Hu, Chen Chen, Heqing Zou, Xionghu Zhong, Eng Siong Chng:
Unifying Speech Enhancement and Separation with Gradient Modulation for End-to-End Noise-Robust Speech Separation. 1-5 - Jianwen Qi, Jie Zhang, Yongshan Zhang, Xinwei Jiang, Zhihua Cai:
Tensor Decomposition Based Latent Feature Clustering for Hyperspectral Band Selection. 1-5 - Bubai Maji, Monorama Swain, Rajlakshmi Guha, Aurobinda Routray:
Multimodal Emotion Recognition Based on Deep Temporal Features Using Cross-Modal Transformer and Self-Attention. 1-5 - Yan Zhang, Pengcheng Zheng, Jianan Jiang, Xiao Pu, Xinbo Gao:
FCIR: Rethink Aerial Image Super Resolution with Fourier Analysis. 1-5 - Jian Pei, Gang Wang, K. C. Ho, Lei Huang:
Bias Reduced Semidefinite Relaxation Method for Multistatic Localization in the Absence of Transmitter Position And Its Synchronization. 1-5 - Adam Mekhiche, Antonio Maria Cipriano, Charly Poulliat:
Expectation Propagation on Factor Graphs Based on Matrix Decomposition. 1-5 - Evonne P. C. Lee, Guangzhi Sun, Chao Zhang, Philip C. Woodland:
Spectral Clustering-Aware Learning of Embeddings for Speaker Diarisation. 1-5 - Cheng Chu, Grant Skipper, Martin Swany, Fan Chen:
IQGAN: Robust Quantum Generative Adversarial Network for Image Synthesis On NISQ Devices. 1-5 - Yifan Wang, Luka Murn, Luis Herranz, Fei Yang, Marta Mrak, Wei Zhang, Shuai Wan, Marc Górriz Blanch:
Efficient Super-Resolution for Compression Of Gaming Videos. 1-5 - Jixun Yao, Qing Wang, Yi Lei, Pengcheng Guo, Lei Xie, Namin Wang, Jie Liu:
Distinguishable Speaker Anonymization Based on Formant and Fundamental Frequency Scaling. 1-5 - Rohith Aralikatti, Christoph Böddeker, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Reverberation as Supervision For Speech Separation. 1-5 - Takashi Fukuda, Samuel Thomas:
Effective Training of RNN Transducer Models on Diverse Sources of Speech and Text Data. 1-5 - Mohan Li, Cong-Thanh Do, Rama Doddipatla:
Cumulative Attention Based Streaming Transformer ASR with Internal Language Model Joint Training and Rescoring. 1-5 - Sahar Sadrizadeh, AmirHossein Dabiri Aghdam, Ljiljana Dolamic, Pascal Frossard:
Targeted Adversarial Attacks Against Neural Machine Translation. 1-5 - Shengkui Zhao, Bin Ma:
D2Former: A Fully Complex Dual-Path Dual-Decoder Conformer Network Using Joint Complex Masking and Complex Spectral Mapping for Monaural Speech Enhancement. 1-5 - Haici Yang, Wootaek Lim, Minje Kim:
Neural Feature Predictor and Discriminative Residual Coding for Low-Bitrate Speech Coding. 1-5 - Sebastião Quintas, Alberto Abad, Julie Mauclair, Virginie Woisard, Julien Pinquier:
Towards Reducing Patient Effort for the Automatic Prediction of Speech Intelligibility in Head and Neck Cancers. 1-5 - Zicheng Zhang, Yingjie Zhou, Wei Sun, Xiongkuo Min, Yuzhe Wu, Guangtao Zhai:
Perceptual Quality Assessment for Digital Human Heads. 1-5 - Shaoke Fang, Qingsong Liu, Lei Xu, Wenfei Wu:
Learning To Regularized Resource Allocation with Budget Constraints. 1-5 - Yang Yang, Shao-Fu Shih, Hakan Erdogan, Jamie Menjay Lin, Chehung Lee, Yunpeng Li, George Sung, Matthias Grundmann:
Guided Speech Enhancement Network. 1-5 - Libo Zhang, Weiming Xiong, Ku Zhao, Kehan Chen, Mingyang Zhong:
Maskdul: Data Uncertainty Learning in Masked Face Recognition. 1-5 - Shih-Lun Wu, Yi-Hsuan Yang:
Compose & Embellish: Well-Structured Piano Performance Generation via A Two-Stage Approach. 1-5 - Junfeng Guan, Sohrab Madani, Waleed Ahmed, Samah Hussein, Saurabh Gupta, Haitham Hassanieh:
Exploiting Virtual Array Diversity for Accurate Radar Detection. 1-5 - Sungho Lee, Jaehyun Park, Seungryeol Paik, Kyogu Lee:
Blind Estimation of Audio Processing Graph. 1-5 - Mark Anderson, Tomi Kinnunen, Naomi Harte:
Learnable Frontends That Do Not Learn: Quantifying Sensitivity To Filterbank Initialisation. 1-5 - Kaiqi Zhao, Yitao Chen, Ming Zhao:
A Contrastive Knowledge Transfer Framework for Model Compression and Transfer Learning. 1-5 - Chang-Bin Jeon, Hyeongi Moon, Keunwoo Choi, Ben Sangbae Chon, Kyogu Lee:
Medleyvox: An Evaluation Dataset for Multiple Singing Voices Separation. 1-5 - Lixin Cao, Jun Wang, Ben Yang, Dan Su, Dong Yu:
Trinet: Stabilizing Self-Supervised Learning From Complete or Slow Collapse. 1-5 - An Dang, Toan H. Vu, Le Dinh Nguyen, Jia-Ching Wang:
EMIX: A Data Augmentation Method for Speech Emotion Recognition. 1-5 - Daniele Ugo Leonzio, Paolo Bestagini, Marco Marcon, Gian Paolo Quarta, Stefano Tubaro:
Water Leak Detection and Localization Using Convolutional Autoencoders. 1-5 - Xin Li, Chunping Liu, Yi Ji:
Associative Learning Network for Coherent Visual Storytelling. 1-5 - Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao:
VQ-CL: Learning Disentangled Speech Representations with Contrastive Learning and Vector Quantization. 1-5 - Sangwook Han, Youngdo Ahn, Kyeongmuk Kang, Jong Won Shin:
Short-Segment Speaker Verification Using ECAPA-TDNN with Multi-Resolution Encoder. 1-5 - Lukas Rapp, Luca Schmid, Andrej Rode, Laurent Schmalen:
Structural Optimization of Factor Graphs for Symbol Detection via Continuous Clustering and Machine Learning. 1-5 - Zhepei Wang, Ritwik Giri, Devansh Shah, Jean-Marc Valin, Michael M. Goodwin, Paris Smaragdis:
A Framework for Unified Real-Time Personalized and Non-Personalized Speech Enhancement. 1-5 - Jianxiu Li, Urbashi Mitra:
Channel State Information-Free Artificial Noise-Aided Location-Privacy Enhancement. 1-5 - Minh Vu, Yuki Akiyama, Konstantinos Slavakis:
Dynamic Selection of p-norm in Linear Adaptive Filtering via online Kernel-based Reinforcement Learning. 1-5 - Jinseok Park, Hyung Yong Kim, Jihwan Park, Byeong-Yeol Kim, Shukjae Choi, Yunkyu Lim:
Joint Unsupervised and Supervised Learning for Context-Aware Language Identification. 1-5 - Zili Huang, Zhuo Chen, Naoyuki Kanda, Jian Wu, Yiming Wang, Jinyu Li, Takuya Yoshioka, Xiaofei Wang, Peidong Wang:
Self-Supervised Learning with Bi-Label Masked Speech Prediction for Streaming Multi-Talker Speech Recognition. 1-5 - Eunseop Lee, Inhan Kim, Daijin Kim:
Weight-Based Mask For Domain Adaptation. 1-5 - Shuo Xiao, Xiaojing Qiu, Chaogang Tang, Zhenzhen Huang:
A Spatial-Temporal ECG Emotion Recognition Model Based on Dynamic Feature Fusion. 1-5 - Ying Cao, Elsa Rizk, Stefan Vlaski, Ali H. Sayed:
Multi-Agent Adversarial Training Using Diffusion Learning. 1-5 - Huang Xie, Okko Räsänen, Tuomas Virtanen:
On Negative Sampling for Contrastive Audio-Text Retrieval. 1-5 - Hengbo Liu, Ziqing Ma, Linxiao Yang, Tian Zhou, Rui Xia, Yi Wang, Qingsong Wen, Liang Sun:
SADI: A Self-Adaptive Decomposed Interpretable Framework for Electric Load Forecasting Under Extreme Events. 1-5 - Michele Esposito, Giancarlo Valente, Yenisel Plasencia Calaña, Michel Dumontier, Bruno L. Giordano, Elia Formisano:
Semantically-Informed Deep Neural Networks For Sound Recognition. 1-5 - Hang-Rui Hu, Yan Song, Jian-Tao Zhang, Li-Rong Dai, Ian McLoughlin, Zhu Zhuo, Yu Zhou, Yu-Hong Li, Hui Xue:
StarGAN-VC Based Cross-Domain Data Augmentation for Speaker Verification. 1-5 - Kuan-Po Huang, Tzu-hsun Feng, Yu-Kuan Fu, Tsu-Yuan Hsu, Po-Chieh Yen, Wei-Cheng Tseng, Kai-Wei Chang, Hung-Yi Lee:
Ensemble Knowledge Distillation of Self-Supervised Speech Models. 1-5 - Pingchuan Ma, Alexandros Haliassos, Adriana Fernandez-Lopez, Honglie Chen, Stavros Petridis, Maja Pantic:
Auto-AVSR: Audio-Visual Speech Recognition with Automatic Labels. 1-5 - Ramon Sanabria, Nikolay Bogoychev, Nina Markl, Andrea Carmantini, Ondrej Klejch, Peter Bell:
The Edinburgh International Accents of English Corpus: Towards the Democratization of English ASR. 1-5 - Jaewoo Lee, Kapje Sung, Daeul Park, Younghan Jeon:
KEPS-NET: Robust Parking slot Detection based Keypoint estimation for High Localization Accuracy. 1-5 - Alexandre L'Her, Angélique Drémeau, Florent Le Courtois, Gaultier Real, Xavier Cristol, Yann Stéphan:
Towards Improved Sonar Performance Using Environment-Informed Sparse Sub-Array Processing. 1-5 - Amirhossein Hajavi, Ali Etemad:
A Study on Bias and Fairness in Deep Speaker Recognition. 1-5 - Hyeongju Kim, Hyeong-Seok Choi:
Towards Trustworthy Phoneme Boundary Detection with Autoregressive Model and Improved Evaluation Metric. 1-5 - Takuya Fujimura, Tomoki Toda:
Analysis Of Noisy-Target Training For DNN-Based Speech Enhancement. 1-5 - Takafumi Moriya, Takanori Ashihara, Hiroshi Sato, Kohei Matsuura, Tomohiro Tanaka, Ryo Masumura:
Improving Scheduled Sampling for Neural Transducer-Based ASR. 1-5 - Beltrán Labrador, Guanlong Zhao, Ignacio López-Moreno, Angelo Scorza Scarpati, Liam Fowl, Quan Wang:
Exploring Sequence-to-Sequence Transformer-Transducer Models for Keyword Spotting. 1-5 - Sourabh Vasant Gothe, Jayesh Rajkumar Vachhani, Rishabh Khurana, Pranay Kashyap:
Self-Similarity is all You Need for Fast and Light-Weight Generic Event Boundary Detection. 1-5 - Yan Shu, Shaohui Liu, Yu Zhou, Honglei Xu, Feng Jiang:
EI2SR: Learning an Enhanced Intra-Instance Semantic Relationship for Arbitrary-Shaped Scene Text Detection. 1-5 - João Prazeres, Zhe Luo, António M. G. Pinheiro, Luís Alberto da Silva Cruz, Stuart W. Perry:
JPEG Pleno Call for Proposals Responses Quality Assessment. 1-5 - Yifan Liu, Youbao Tang, Ning Zhang, Ruei-Sung Lin, Haoqian Wang:
Prior-Enhanced Temporal Action Localization Using Subject-Aware Spatial Attention. 1-5 - Kaushal Santosh Bhogale, Abhigyan Raman, Tahir Javed, Sumanth Doddapaneni, Anoop Kunchukuttan, Pratyush Kumar, Mitesh M. Khapra:
Effectiveness of Mining Audio and Text Pairs from Public Data for Improving ASR Systems for Low-Resource Languages. 1-5 - Abijith Jagannath Kamath, Chandra Sekhar Seelamantula:
Multichannel Time-Encoding of Finite-Rate-of-Innovation Signals. 1-5 - Rui Li, Xueqian Wang, Gang Li, Xiao-Ping Zhang:
TEFISTA-NET: GTD Parameter Estimation of Low-Frequency Ultra-Wideband Radar via Model-Based Deep Learning. 1-5 - Maryam Fazel-Zarandi, Wei-Ning Hsu:
Cocktail HuBERT: Generalized Self-Supervised Pre-Training for Mixture and Single-Source Speech. 1-5 - Zhongshu Hou, Qinwen Hu, Tianchi Sun, Yuxiang Hu, Changbao Zhu, Kai Chen:
Convolutional Recurrent MetricGAN With Spectral Dimension Compression For Full-Band Speech Enhancement. 1-2 - Yi Fan, Zhonghan Niu, Yu-Bin Yang:
Data-Aware Zero-Shot Neural Architecture Search for Image Recognition. 1-5 - Zexin Cai, Weiqing Wang, Ming Li:
Waveform Boundary Detection for Partially Spoofed Audio. 1-5 - Cong Han, Nima Mesgarani:
Online Binaural Speech Separation Of Moving Speakers With A Wavesplit Network. 1-5 - Wujiang Xu, Runzhong Wang, Xiaobo Guo, Shaoshuai Li, Qiongxu Ma, Yunan Zhao, Sheng Guo, Zhenfeng Zhu, Junchi Yan:
MHSCNET: A Multimodal Hierarchical Shot-Aware Convolutional Network for Video Summarization. 1-5 - Keqi Deng, Philip C. Woodland:
Adaptable End-to-End ASR Models Using Replaceable Internal LMs and Residual Softmax. 1-5 - Zhiheng Luan, Yanzhen Ren, Li Peng, Xiong Chen, Xiuping Yang, Weiping Tu, Yuhong Yang:
Learning From Single-Expert Annotated Labels for Automatic Sleep Staging. 1-5 - Junzhang Jia, Xuetong Wu, Jamie S. Evans, Jingge Zhu:
On the Value of Stochastic Side Information in Online Learning. 1-5 - T. Mitchell Roddenberry, Santiago Segarra:
Windowed Fourier Analysis for Signal Processing on Graph Bundles. 1-5 - Parijat Dube, Theodoros Salonidis, Parikshit Ram, Ashish Verma:
Runtime Prediction of Machine Learning Algorithms in AutoML Systems. 1-5 - Sashank Macha, Om Oza, Alex Escott, Francesco Calivá, Robbie Armitano, Santosh Kumar Cheekatmalla, Sree Hari Krishnan Parthasarathi, Yuzong Liu:
Fixed-Point Quantization Aware Training for on-Device Keyword-Spotting. 1-5 - Ying Zhang, Liang Wen, Lizhong Wang, Yinji Piao, Weijing Shi, Kwang Pyo Choi:
Distortion-Aware Convolutional Neural Network-Based Interpolation Filter for AVS3. 1-5 - Binghuai Lin, Liyuan Wang:
Multi-Lingual Pronunciation Assessment with Unified Phoneme Set and Language-Specific Embeddings. 1-5 - Shikun Zhang, Fengyi Song, Ge Song, Ming Yang:
SDRNet: Shape Decoupled Regression Network for 3d face Reconstruction. 1-5 - Baichuan Zhang, Fanyang Meng, Runwei Ding, Mengyuan Liu:
Multi-Stream Facial Adaptive Network for Expression Recognition from a Single Image. 1-5 - Hongji Wang, Chengdong Liang, Shuai Wang, Zhengyang Chen, Binbin Zhang, Xu Xiang, Yanlei Deng, Yanmin Qian:
Wespeaker: A Research and Production Oriented Speaker Embedding Learning Toolkit. 1-5 - Bac Nguyen, Stefan Uhlich, Fabien Cardinaux:
Improving Self-Supervised Learning for Audio Representations by Feature Diversity and Decorrelation. 1-5 - Shashwat Jain, Kunal Pattanayak, Vikram Krishnamurthy, Christopher Berry:
Adaptive ECCM for Mitigating Smart Jammers. 1-5 - Wenbo Shi, Robert A. Malaney:
Signal Processing And Quantum State Tomography on Noisy Devices. 1-5 - Yi Wang, Jiajun Deng, Tianzi Wang, Bo Zheng, Shoukang Hu, Xunying Liu, Helen Meng:
Exploiting Prompt Learning with Pre-Trained Language Models for Alzheimer's Disease Detection. 1-5 - Yekun Chai, Qiyue Yin, Junge Zhang:
Improved Training Of Mixture-Of-Experts Language GANs. 1-5 - Hongsen He, Jingdong Chen, Jacob Benesty, Yi Yu:
A Frequency-Domain Recursive Least-Squares Adaptive Filtering Algorithm Based On A Kronecker Product Decomposition. 1-5 - Qiang Zeng, Hongxia Wang, Yang Zhou, Rui Zhang, Sijiang Meng:
A Parallel Attention Mechanism for Image Manipulation Detection and Localization. 1-5 - Jianwei Yu, Yi Luo:
Efficient Monaural Speech Enhancement with Universal Sample Rate Band-Split RNN. 1-5 - Hai-Miao Hu, Zhenbo Xu, Wenshuai Xu, You Song, YiTao Zhang, Liu Liu, Zhilin Han, Ajin Meng:
One-Shot Neural Band Selection for Spectral Recovery. 1-5 - Zhong-Qiu Wang, Samuele Cornell, Shukjae Choi, Younglo Lee, Byeong-Yeol Kim, Shinji Watanabe:
TF-GRIDNET: Making Time-Frequency Domain Models Great Again for Monaural Speaker Separation. 1-5 - Anthony Frion, Lucas Drumetz, Mauro Dalla Mura, Guillaume Tochon, Abdeldjalil Aïssa-El-Bey:
Leveraging Neural Koopman Operators to Learn Continuous Representations of Dynamical Systems from Scarce Data. 1-5 - Alec S. Xu, Laura Balzano, Jeffrey A. Fessler:
HeMPPCAT: Mixtures of Probabilistic Principal Component analysers for data with heteroscedastic noise. 1-5 - Xueliang Wang, Wenqi Huang, Wenming Yang, Qingmin Liao:
Spatial Correlation Fusion Network for Few-Shot Segmentation. 1-5 - Tomer Hershkovitz, Martin Haardt, Arie Yeredor:
Various Performance Bounds on the Estimation of Low-Rank Probability Mass Function Tensors from Partial Observations. 1-5 - Jake Stuchbury-Wass, Erika Bondareva, Kayla-Jade Butkow, Sanja Scepanovic, Zoran Radivojevic, Cecilia Mascolo:
Heart Rate Extraction from Abdominal Audio Signals. 1-5 - Anindya Bijoy Das, Aditya Ramamoorthy, David J. Love, Christopher G. Brinton:
Coded Matrix Computations for D2D-Enabled Linearized Federated Learning. 1-5 - Xinyuan Li, Yu Wang, Jien Kato:
Long-Tailed Image Recognition with Dynamic Re-Weighting. 1-5 - Quan Wei, Ziping Zhao:
Large Covariance Matrix Estimation with Oracle Statistical Rate. 1-5 - Sohee Jang, Jiye Kim, Yeon-Ju Kim, Joon-Hyuk Chang:
Adaptive Time-Scale Modification for Improving Speech Intelligibility Based On Phoneme Clustering For Streaming Services. 1-5 - Tao Li, Shilian Wu, Zengfu Wang:
Mask Guided Selective Context Decoding for Handwritten Chinese Text Recognition. 1-5 - Jiaang Li, Quan Wang, Zhendong Mao:
Inductive Relation Prediction from Relational Paths and Context with Hierarchical Transformers. 1-5 - Tejas Jayashankar, Jilong Wu, Leda Sari, David Kant, Vimal Manohar, Qing He:
Self-Supervised Representations for Singing Voice Conversion. 1-5 - Kevin Wilkinghoff:
Design Choices for Learning Embeddings from Auxiliary Tasks for Domain Generalization in Anomalous Sound Detection. 1-5 - Chutian Wang, Stefan Vlaski:
Robust Network Topologies for Distributed Learning. 1-5 - Xinyue Zhang, Guodong Wang, Lijuan Yang, Chenglizhao Chen:
Lightweight Portrait Segmentation Via Edge-Optimized Attention. 1-5 - Hyunseok Oh, Juheon Yi, Youngki Lee:
Papez: Resource-Efficient Speech Separation with Auditory Working Memory. 1-5 - Travis M. Bartley, Fei Jia, Krishna C. Puvvada, Samuel Kriman, Boris Ginsburg:
Accidental Learners: Spoken Language Identification in Multilingual Self-Supervised Models. 1-5 - Anuja Vats, Ahmed Kedir Mohammed, Marius Pedersen, Nirmalie Wiratunga:
This Changes to That: Combining Causal and Non-Causal Explanations to Generate Disease Progression in Capsule Endoscopy. 1-5 - Simon Rouard, Francisco Massa, Alexandre Défossez:
Hybrid Transformers for Music Source Separation. 1-5 - Chen-Chou Lo, Patrick Vandewalle:
RCDPT: Radar-Camera Fusion Dense Prediction Transformer. 1-5 - Guangyu Chen, Yuanyuan Cao:
A Reality Check and a Practical Baseline for Semantic Speech Embedding. 1-5 - Zhangzi Zhu, Shuai Wang, Hong Qu:
Improving Image Captioning with Control Signal of Sentence Quality. 1-5 - Kecheng Chen, Haoliang Li, Hong Yan:
Cross-Domain Object Classification Via Successive Subspace Alignment. 1-5 - Zhen-Tang Huang, Yan-He Chen, Mei-Chen Yeh:
Weakly- and Semi-Supervised Object Localization. 1-5 - Zimian Wei, Hengyue Pan, Lujun Li, Menglong Lu, Xin Niu, Peijie Dong, Dongsheng Li:
DMFormer: Closing the gap Between CNN and Vision Transformers. 1-5 - S. S. Krishna Chaitanya Bulusu, Nuutti Tervo, Praneeth Susarla, Mikko J. Sillanpää, Olli Silvén, Markku J. Juntti, Aarno Pärssinen:
Machine Learning-Aided Piece-Wise Modeling Technique of Power Amplifier for Digital Predistortion. 1-5 - Yixuan Li, Huaping Liu, Qiang Jin, Miaomiao Cai, Peng Li:
TrOMR: Transformer-Based Polyphonic Optical Music Recognition. 1-5 - Christos Korgialas, Constantine Kotropoulos:
Electric Network Frequency Detection Using Least Absolute Deviations. 1-5 - Akinori F. Ebihara, Taiki Miyagawa, Kazuyuki Sakurai, Hitoshi Imaoka:
Toward Asymptotic Optimality: Sequential Unsupervised Regression of Density Ratio for Early Classification. 1-5 - Zehan Tan, Weidong Yang, Shuai Wu:
Retrieval-Based Natural 3D Human Motion Generation. 1-5 - Ravi Pranjal, Ranjana Seshadri, Rakesh Kumar Sanath Kumar Kadaba, Tiantian Feng, Shrikanth S. Narayanan, Theodora Chaspari:
Toward Privacy-Enhancing Ambulatory-Based Well-Being Monitoring: Investigating User Re-Identification Risk in Multimodal Data. 1-5 - Soumi Maiti, Yifan Peng, Takaaki Saeki, Shinji Watanabe:
Speechlmscore: Evaluating Speech Generation Using Speech Language Model. 1-5 - Chandan Gautam, Aditya Kane, Savitha Ramasamy, Suresh Sundaram:
Unsupervised Out-of-Distribution Detection Using Few in-Distribution Samples. 1-5 - Chung Kwan Lai, Bhan Lam, Dongyuan Shi, Woon-Seng Gan:
Real-Time Modelling of Observation Filter in the Remote Microphone Technique for an Active Noise Control Application. 1-5 - Yangning Li, Jiaoyan Chen, Yinghui Li, Yuejia Xiang, Xi Chen, Hai-Tao Zheng:
Vision, Deduction and Alignment: An Empirical Study on Multi-Modal Knowledge Graph Alignment. 1-5 - Najla D. Al Futaisi, Alejandrina Cristià, Björn W. Schuller:
Hearttoheart: The Arts of Infant Versus Adult-Directed Speech Classification. 1-5 - Sitian Li, Alexios Balatsoukas-Stimming, Andreas Burg:
Single-Anchor UWB Localization Using Channel Impulse Response Distributions. 1-5 - Yang Xiao, Liejun Wang, Tongguan Wang, Huicheng Lai:
Scoreformer: Score Fusion-Based Transformers for Weakly-Supervised Violence Detection. 1-5 - Georgios Paraskevopoulos, Chandrashekhar Lavania, Lovish Chum, Shiva Sundaram:
Multi-Scale Compositional Constraints for Representation Learning on Videos. 1-5 - Shivam Agarwal, Ritesh Soun, Rahul Shivani, Vishnu Varanasi, Navroop Gill, Ramit Sawhney:
HyperSteg: Hyperbolic Learning for Deep Steganography. 1-5 - Yutang Xia, Yang Luo, Wu Luo, Qingni Shen, Yahui Yang, Zhonghai Wu:
A Role Engineering Approach Based on Spectral Clustering Analysis for Restful Permissions in Cloud. 1-5 - Shengjia Chen, Jiewen Zhu, Luping Ji, Hongjun Pan, Yuhao Xu:
AugTarget Data Augmentation for Infrared Small Target Detection. 1-5 - Lihua Ni, Di Zhang, Tianyi Xing, Maoyan Ran, Ning Liu, Qun Wan:
Direct Position Determination with One-Bit Signal for Multiple Targets. 1-5 - Junhao Chen, Sheng Liu, Ruixiang Chen, Bingnan Guo, Feng Zhang:
IAST: Instance Association Relying on Spatio-Temporal Features for Video Instance Segmentation. 1-5 - Zirun Zhu, Hemin Yang, Min Tang, Ziyi Yang, Sefik Emre Eskimez, Huaming Wang:
Real-Time Audio-Visual End-To-End Speech Enhancement. 1-5 - George Close, William Ravenscroft, Thomas Hain, Stefan Goetze:
Perceive and Predict: Self-Supervised Speech Representation Based Loss Functions for Speech Enhancement. 1-5 - Miguel Heredia Conde:
Transient Dictionary Learning for Compressed Time-of-Flight Imaging. 1-5 - Shihao Ren, Yikang Ding, Jinli Liao, Xinghui Li, Jia Guo, Wensen Feng, Xueqian Wang:
Volumetric 3D Reconstruction with Window-Wise Global Feature Aggregation. 1-5 - Wenjing Liu, Xuanya Li, Kai Hu, Xieping Gao:
Exploiting Multi-Decision and Deep Refinement for Ultrasound Image Segmentation. 1-5 - Siwen Ding, You Zhang, Zhiyao Duan:
SAMO: Speaker Attractor Multi-Center One-Class Learning For Voice Anti-Spoofing. 1-5 - Rehan Ahmad, Md Asif Jalal, Muhammad Umar Farooq, Anna Ollerenshaw, Thomas Hain:
Towards Domain Generalisation in ASR with Elitist Sampling and Ensemble Knowledge Distillation. 1-5 - Allen Yan, Jinsub Kim, Raviv Raich:
Forensics for Adversarial Machine Learning Through Attack Mapping Identification. 1-5 - Sai Zhang, Yuwei Hu, Xiaojie Wang, Caixia Yuan:
An Asynchronous Updating Reinforcement Learning Framework for Task-Oriented Dialog System. 1-5 - Ke Wu, Ehsan Variani, Tom Bagby, Michael Riley:
LAST: Scalable Lattice-Based Speech Modelling in JAX. 1-5 - Varun A. Kelkar, Dehong Liu, Hiroshi Inoue, Makoto Kanemaru:
Sparsity-Driven Joint Blind Deconvolution-Demodulation with Application to Motor Fault Detection. 1-5 - Jinglun Cai, Mingda Li, Ziyan Jiang, Eunah Cho, Zheng Chen, Yang Liu, Xing Fan, Chenlei Guo:
KG-ECO: Knowledge Graph Enhanced Entity Correction For Query Rewriting. 1-5 - Soky Kak, Sheng Li, Chenhui Chu, Tatsuya Kawahara:
Domain and Language Adaptation Using Heterogeneous Datasets for Wav2vec2.0-Based Speech Recognition of Low-Resource Language. 1-5 - Zhen Long, Ce Zhu, Pierre Comon, Yipeng Liu:
Feature Space Recovery for Incomplete Multi-View Clustering. 1-5 - Harry Gao, Weijie Gan, Zhixin Sun, Ulugbek S. Kamilov:
SINCO: A Novel Structural Regularizer for Image Compression Using Implicit Neural Representations. 1-5 - Jikai Li, Shogo Muramatsu:
Inter-Scale Sure-Let Denoise with Structured Deep Image Prior: Interpretable Self-Supervised Learning. 1-5 - Tuan-Nam Nguyen, Ngoc-Quan Pham, Alexander Waibel:
SYNTACC: Synthesizing Multi-Accent Speech By Weight Factorization. 1-5 - Rakshith Subramanyam, Kowshik Thopalli, Spring Berman, Pavan K. Turaga, Jayaraman J. Thiagarajan:
Single-Shot Domain Adaptation via Target-Aware Generative Augmentations. 1-5 - Tong Xu, Yiming Li, Yong Jiang, Shu-Tao Xia:
BATT: Backdoor Attack with Transformation-Based Triggers. 1-5 - Yu-Hsuan Chen, Fu-Cheng Pan, Yu-Chien Liao, Jao-Hong Kao, Yu-Chiang Frank Wang:
Semantics-Aware Gamma Correction for Unsupervised Low-Light Image Enhancement. 1-5 - Jie Cheng, Maria Juhlin, Wen-Qin Wang, Andreas Jakobsson:
Optimal Carrier Frequency Design for Frequency Diverse Array MIMO Radar. 1-5 - Zhe Zhang, Huachen Gao, Yuxi Hu, Ronggang Wang:
N2MVSNet: Non-Local Neighbors Aware Multi-View Stereo Network. 1-5 - Francesco Barbato, Giulia Rizzoli, Pietro Zanuttigh:
DepthFormer: Multimodal Positional Encodings and Cross-Input Attention for Transformer-based Segmentation Networks. 1-5 - Tamir L. S. Gez, Kobi Cohen:
Subgradient Descent Learning with Over-the-Air Computation. 1-5 - Xin Luo, Wei Chen, Chen Li, Bin Zhou, Yusong Tan:
Domain Generalized Fundus Image Segmentation via Dual-Level Mixing. 1-5 - Aaron Berk, Yanting Ma, Petros Boufounos, Pu Wang, Hassan Mansour:
Deep Proximal Gradient Method for Learned Convex Regularizers. 1-5 - Kai Zhang, Tian Jin, Feng Zhang, Jiande Sun:
Long-Short Attention Network For The Spectral Super-Resolution Of Multispectral Images. 1-5 - Christophe Dupuy, Jimit Majmudar, Jixuan Wang, Tanya G. Roosta, Rahul Gupta, Clement Chung, Jie Ding, Salman Avestimehr:
Quantifying Catastrophic Forgetting in Continual Federated Learning. 1-5 - Liu Yang, Yu Jiang, Junkun Hong, Zhenjie Wu, Zhan Yang, Jun Long:
Stacking-Based Attention Temporal Convolutional Network for Action Segmentation. 1-5 - Guozhu Jiang, Jie Zhang, Yongshan Zhang, Xinwei Jiang, Zhihua Cai:
Structured-Anchor Projected Clustering for Hyperspectral Images. 1-5 - Xuan-Phi Nguyen, Sravya Popuri, Changhan Wang, Yun Tang, Ilia Kulikov, Hongyu Gong:
Improving Speech-to-Speech Translation Through Unlabeled Text. 1-5 - Zhiying Deng, Jianjun Li, Zhiqiang Guo, Guohui Li:
Multi-Aspect Interest Neighbor-Augmented Network for Next-Basket Recommendation. 1-5 - Yuhang Liang, Zheng Lin, Fengcheng Yuan, Hanwen Zhang, Lei Wang, Weiping Wang:
Towards Polymorphic Adversarial Examples Generation for Short Text. 1-5 - Qiao ZhongZheng, Minghui Hu, Xudong Jiang, Ponnuthurai Nagaratnam Suganthan, Savitha Ramasamy:
Class-Incremental Learning on Multivariate Time Series Via Shape-Aligned Temporal Distillation. 1-5 - Jiaming Liang, Meiqin Liu, Chao Yao, Chunyu Lin, Yao Zhao:
SIGVIC: Spatial Importance Guided Variable-Rate Image Compression. 1-5 - Xuran Lv, Jinyong Cheng, Guohua Lv, Zhonghe Wei:
A Deep Fusion Rule for Infrared and Visible Image Fusion: Feature Communication for Importance Assessment. 1-5 - Hong Yu, Yuanqiu Liu, Baokun Qi, Zhaolong Hu, Han Liu:
End-to-End Non-Autoregressive Image Captioning. 1-5 - Huiyuan Sun, Prasanga N. Samarasinghe, Thushara D. Abhayapala:
Active Noise Control over 3D Space: A Realistic Error Microphone Geometry Design. 1-5 - Junwei Ji, Dongyuan Shi, Zhengding Luo, Xiaoyi Shen, Woon-Seng Gan:
A Practical Distributed Active Noise Control Algorithm Overcoming Communication Restrictions. 1-5 - Kangli Zeng, Zhongyuan Wang, Tao Lu, Jianyu Chen:
Structure-Aware Multi-Feature Co-Learning for Dual Branch Face Super Resolution. 1-5 - Nguyen Anh Tu, Duong Xuan Hieu, Tu Minh Phuong, Ngo Xuan Bach:
A Bidirectional Joint Model for Spoken Language Understanding. 1-5 - Odysseas S. Chlapanis, Georgios Paraskevopoulos, Alexandros Potamianos:
Adapted Multimodal BERT with Layer-Wise Fusion for Sentiment Analysis. 1-5 - Kawon Han, Songcheol Hong:
Cough Detection Using Millimeter-Wave FMCW Radar. 1-5 - Ryosuke Sawata, Takahiro Ogawa, Miki Haseyama:
Class-Aware Shared Gaussian Process Dynamic Model. 1-5 - Junsu Jang, Florian Meyer, Eric R. Snyder, Sean M. Wiggins, Simone Baumann-Pickering, John A. Hildebrand:
Passive Acoustic Tracking of Whales in 3-D. 1-5 - Byeong Hyeon Kim, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong-Goo Kang:
Progressive Multi-Stage Neural Audio Codec with Psychoacoustic Loss and Discriminator. 1-5 - Kashyap Patel, Anton Kovalyov, Issa M. S. Panahi:
UX-Net: Filter-and-Process-Based Improved U-Net for real-time time-domain audio Separation. 1-5 - Yuqi Chen, Juan Liu, Zhiqun Zuo, Peng Jiang, Yu Jin, Guangsheng Wu:
Classifying Pathological Images Based on Multi-Instance Learning and End-to-End Attention Pooling. 1-5 - Weiwei Wang, Victor E. DeBrunner, Linda S. DeBrunner:
Fast Convolution Algorithm for Real-Valued Finite Length Sequences. 1-5 - Shengjie Liu, Chuang Zhu, Yuan Li, Wenqi Tang:
WUDA: Unsupervised Domain Adaptation Based on Weak Source Domain Labels. 1-5 - Weixin Zhu, Zilin Wang, Jiuxin Lin, Chang Zeng, Tao Yu:
SSI-Net: A Multi-Stage Speech Signal Improvement System for ICASSP 2023 SSI Challenge. 1-2 - Mingyi Yang, Luis Herranz, Fei Yang, Luka Murn, Marc Górriz Blanch, Shuai Wan, Fuzheng Yang, Marta Mrak:
Semantic Preprocessor for Image Compression for Machines. 1-5 - Yuejiang Li, H. Vicky Zhao, Gene Cheung:
Eigen-Decomposition-Free Directed Graph Sampling via Gershgorin Disc Alignment. 1-5 - George Saon, Ankit Gupta, Xiaodong Cui:
Diagonal State Space Augmented Transformers for Speech Recognition. 1-5 - Libio Gonçalves Braz, Allmin Pradhap Singh Susaiyah:
A Controllable Lifestyle Simulator for Use in Deep Reinforcement Learning Algorithms. 1-5 - Edmilson da Silva Morais, Matheus Damasceno, Hagai Aronowitz, Aharon Satt, Ron Hoory:
Modeling Turn-Taking in Human-To-Human Spoken Dialogue Datasets Using Self-Supervised Features. 1-5 - Jiatong Shi, Chan-Jan Hsu, Ho-Lam Chung, Dongji Gao, Paola García, Shinji Watanabe, Ann Lee, Hung-Yi Lee:
Bridging Speech and Textual Pre-Trained Models With Unsupervised ASR. 1-5 - Haoran Bi, Maksym Kyryliuk, Zhiyi Wang, Cristian Meo, Yanbo Wang, Ruben Imhoff, Remko Uijlenhoet, Justin Dauwels:
Nowcasting of Extreme Precipitation Using Deep Generative Models. 1-5 - Binghuai Lin, Liyuan Wang:
Robust multi-modal speech emotion recognition with ASR error adaptation. 1-5 - Mohammad Javad Salehi, Mohammad NaseriTehrani, Antti Tölli:
Multicast Beamformer Design for MIMO Coded Caching Systems. 1-5 - Sergio Rozada, Antonio G. Marques:
Matrix Low-Rank Approximation for Policy Gradient Methods. 1-5 - Jhon A. Castro-Correa, Jhony H. Giraldo, Anindya Mondal, Mohsen Badiey, Thierry Bouwmans, Fragkiskos D. Malliaros:
Time-Varying Signals Recovery Via Graph Neural Networks. 1-5 - Lauri Juvela, Eero-Pekka Damskägg, Aleksi Peussa, Jaakko Mäkinen, Thomas Sherson, Stylianos I. Mimilakis, Kimmo Rauhanen, Athanasios Gotsopoulos:
End-to-End Amp Modeling: from Data to Controllable Guitar Amplifier Models. 1-5 - Ziji Zhang, Ping Gong, Haotian Sun, Pingping Wu, Xuanyuan Yang:
Dynamic Local and Global Context Exploration for Small Object Detection. 1-5 - Suliang Bu, Tuo Zhao, Yunxin Zhao:
Joint Estimation of DOA and Distance in Noisy Reverberant Conditions. 1-5 - Giridhar Pamisetty, Sahukari Chaitanya Varun, K. Sri Rama Murty:
Lightweight Prosody-TTS for Multi-Lingual Multi-Speaker Scenario. 1-2 - Hongbo Wang, Weimin Xiong, Yifan Song, Dawei Zhu, Yu Xia, Sujian Li:
DocRED-FE: A Document-Level Fine-Grained Entity and Relation Extraction Dataset. 1-5 - Kangkang Lu, Manh Cuong Nguyen, Xun Xu, Chuan Sheng Foo:
On Adversarial Robustness of Audio Classifiers. 1-5 - Tre DiPassio, Michael C. Heilemann, Benjamin Thompson, Mark F. Bocko:
Estimating Acoustic Direction of Arrival Using a Single Structural Sensor on a Resonant Surface. 1-5 - Ninon Devis, Nils Demerlé, Sarah Nabi, David Genova, Philippe Esling:
Continuous Descriptor-Based Control for Deep Audio Synthesis. 1-5 - A. Foroughi, Christian Rathgeb, Mathias Ibsen, Christoph Busch:
Benchmarking Cross-Domain Face Recognition with Avatars, Caricatures and Sketches. 1-5 - Zhihao Liu, Zhiwei Xu, Guoliang Fan:
Hierarchical Multi-Agent Reinforcement Learning with Intrinsic Reward Rectification. 1-5 - Yuhuan Lin, Tongda Xu, Ziyu Zhu, Yanghao Li, Zhe Wang, Yan Wang:
Your Camera Improves Your Point Cloud Compression. 1-5 - Thilo von Neumann, Christoph Böddeker, Keisuke Kinoshita, Marc Delcroix, Reinhold Haeb-Umbach:
On Word Error Rate Definitions and Their Efficient Computation for Multi-Speaker Speech Recognition Systems. 1-5 - Kai Cheng, Xinhua Zeng, Yang Liu, Mengyang Zhao, Chengxin Pang, Xing Hu:
Spatial-Temporal Graph Convolutional Network Boosted Flow-Frame Prediction For Video Anomaly Detection. 1-5 - Zhao Song, Ke Yang, Naiyang Guan, Junjie Zhu, Peng Qiao, Qingyong Hu:
VPPT: Visual Pre-Trained Prompt Tuning Framework for Few-Shot Image Classification. 1-5 - Shuyue Stella Li, Xiangyu Zhang, Shu Zhou, Hongchao Shu, Ruixing Liang, Hexin Liu, Leibny Paola García:
PQLM - Multilingual Decentralized Portable Quantum Language Model. 1-5 - Huibin Tan, Kun Hu, Mingyu Cao, Mengzhu Wang, Liyang Xu, Wenjing Yang:
Decomposition, Interaction, Reconstruction Meets Global Context Learning In Visual Tracking. 1-5 - Xiaocui Yang, Shi Feng, Daling Wang, Pengfei Hong, Soujanya Poria:
Multiple Contrastive Learning for Multimodal Sentiment Analysis. 1-5 - Saksham Aggarwal, Taneesh Gupta, Pawan Kumar Sahu, Arnav Chavan, Rishabh Tiwari, Dilip K. Prasad, Deepak K. Gupta:
On Designing Light-Weight Object Trackers Through Network Pruning: Use CNNs or Transformers? 1-5 - Mo Zhou, Martin Bo Møller, Christian Pedersen, Jan Østergaard:
Robust FIR Filters for Wireless Low-Frequency Sound Zones. 1-5 - Thomas Guilmeau, Emilie Chouzenoux, Víctor Elvira:
Adaptive Simulated Annealing Through Alternating Rényi Divergence Minimization. 1-5 - Jiaming Wang, Zhihao Du, Shiliang Zhang:
TOLD: a Novel Two-Stage Overlap-Aware Framework for Speaker Diarization. 1-5 - Xiangyu Bai, Le Jiang, Yedi Luo, Aniket Gupta, Pushyami Kaveti, Hanumant Singh, Sarah Ostadabbas:
An Evaluation Platform to Scope Performance of Synthetic Environments in Autonomous Ground Vehicles Simulation. 1-5 - Xun Gong, Wei Wang, Hang Shao, Xie Chen, Yanmin Qian:
Factorized AED: Factorized Attention-Based Encoder-Decoder for Text-Only Domain Adaptive ASR. 1-5 - Siyuan Shan, Yang Li, Junier B. Oliva:
NRTSI: Non-Recurrent Time Series Imputation. 1-5 - Florence Regol, Anja Kroon, Mark Coates:
Evaluation of Categorical Generative Models - Bridging the Gap Between Real and Synthetic Data. 1-5 - Camille Noufi, Jonathan Berger, Karen J. Parker, Daniel L. Bowling:
Acoustically-Driven Phoneme Removal that Preserves Vocal Affect Cues. 1-5 - Jie Wang, Menglong Xu, Jingyong Hou, Binbin Zhang, Xiao-Lei Zhang, Lei Xie, Fuping Pan:
Wekws: A Production First Small-Footprint End-to-End Keyword Spotting Toolkit. 1-5 - Kenneth Ooi, Karn N. Watcharasupat, Bhan Lam, Zhen-Ting Ong, Woon-Seng Gan:
Autonomous Soundscape Augmentation with Multimodal Fusion of Visual and Participant-Linked Inputs. 1-5 - Huabao Chen, Xiaolong Huang, Qiankun Li, Jianqing Wang, Bo Fang, Junxin Chen:
LABANet: Lead-Assisting Backbone Attention Network for Oral Multi-Pathology Segmentation. 1-5 - Tal Shaharabany, Lior Wolf:
Learning a Weight Map for Weakly-Supervised Localization. 1-5 - Zhaolong Zhang, Yangdong Chen, Yuejie Zhang, Rui Feng, Tao Zhang:
Boosting Fine-Grained Sketch-Based Image Retrieval with Self-Supervised Learning. 1-5 - Tomohiro Tanaka, Ryo Masumura, Mana Ihori, Hiroshi Sato, Taiga Yamane, Takanori Ashihara, Kohei Matsuura, Takafumi Moriya:
Leveraging Language Embeddings for Cross-Lingual Self-Supervised Speech Representation Learning. 1-5 - Mert Cemri, Virginia Bordignon, Mert Kayaalp, Valentina Shumovskaia, Ali H. Sayed:
Asynchronous Social Learning. 1-5 - A. Ulvog, Joshua Rapp, Toshiaki Koike-Akino, Hassan Mansour, Petros Boufounos, Kieran Parsons:
Phase Unwrapping in Correlated Noise for FMCW Lidar Depth Estimation. 1-5 - Wenting Li, Jiahong Yang, Haibo Cheng, Ping Wang, Kaitai Liang:
Improved Wordpcfg for Passwords with Maximum Probability Segmentation. 1-5 - Kun Yu, Morteza Darvish Morshedi Hosseini, Anjie Peng, Hui Zeng, Miroslav Goljan:
Make Your Enemy Your Friend: Improving Image Rotation Angle Estimation with Harmonics. 1-5 - Xidong Mu, Yuanwei Liu:
Rate Region Characterization for Semantics and Bits based Multiuser Communications. 1-5 - Konstantinos Georgiadis, Mehmet Kerim Yucel, Evangelos Skartados, Valia Dimaridou, Anastasios Drosou, Albert Saà-Garriga, Bruno Manganelli:
LP-IOANet: Efficient High Resolution Document Shadow Removal. 1-5 - Tam Thuc Do, Philip A. Chou, Gene Cheung:
Volumetric Attribute Compression for 3D Point Clouds Using Feedforward Network with Geometric Attention. 1-5 - Kefan Ma, Zheng Huang, Xinrui Deng, Jie Guo, Weidong Qiu:
LED: Label Correlation Enhanced Decoder for Multi-Label Text Classification. 1-5 - Jialun Cai, Hong Liu, Runwei Ding, Wenhao Li, Jianbing Wu, Miaoju Ban:
HTNet: Human Topology Aware Network for 3D Human Pose Estimation. 1-5 - Murtiza Ali, Aditya Arie Nugraha, Karan Nathwani:
Exploiting Sparse Recovery Algorithms for Semi-Supervised Training of Deep Neural Networks for Direction-of-Arrival Estimation. 1-5 - Xiaoheng Deng, Lirong Liao, Ping Jiang, Yurong Qian:
Towards Scale Adaptive Underwater Detection Through Refined Pyramid Grid. 1-5 - Yu Xuan, Xiangyu Zhang, Shuyue Stella Li, Zihan Shen, Xin Xie, Leibny Paola García, Roberto Togneri:
A New Approach to Extract Fetal Electrocardiogram Using Affine Combination of Adaptive Filters. 1-5 - Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Rohit Prabhavalkar, Tara N. Sainath, Trevor Strohman:
From English to More Languages: Parameter-Efficient Model Reprogramming for Cross-Lingual Speech Recognition. 1-5 - Zaiyi Hu, Binglu Wang, Xuelong Li:
Densitytoken: Weakly-Supervised Crowd Counting with Density Classification. 1-5 - Shuo Wang, Xiangyu Kong, Xiulian Peng, Hesam Movassagh, Vinod Prakash, Yan Lu:
Dasformer: Deep Alternating Spectrogram Transformer For Multi/Single-Channel Speech Separation. 1-5 - Valentina Shumovskaia, Mert Kayaalp, Ali H. Sayed:
Identifying Opinion Influencers over Social Networks. 1-5 - Huaibo Zhao, Shinya Fujie, Tetsuji Ogawa, Jin Sakuma, Yusuke Kida, Tetsunori Kobayashi:
Conversation-Oriented ASR with Multi-Look-Ahead CBS Architecture. 1-5 - Ling Zhao, Yunpeng Ma, Shanxiong Chen, Jun Zhou:
Deep Double Self-Expressive Subspace Clustering. 1-5 - Yaochi Zhao, Sen Chen, Qiong Chen, Zhuhua Hu:
Combining Loss Reweighting and Sample Resampling for Long-Tailed Instance Segmentation. 1-5 - Zhenduo Zhao, Zhuo Li, Wenchao Wang, Pengyuan Zhang:
PCF: ECAPA-TDNN with Progressive Channel Fusion for Speaker Verification. 1-5 - Ronak Mehta, Sathya N. Ravi, Vikas Singh:
Robustness and Convergence of Mirror Descent for Blind Deconvolution. 1-5 - Markus Müller, Anastasios Alexandridis, Zach Trozenski, Joel Whiteman, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris, Siegfried Kunzmann:
Multilingual End-To-End Spoken Language Understanding For Ultra-Low Footprint Applications. 1-5 - Ahmed Adel Attia, Carol Y. Espy-Wilson:
Masked Autoencoders are Articulatory Learners. 1-5 - Wei Zhu, Peng Wang, Xiaoling Wang, Yuan Ni, Guotong Xie:
ACF: Aligned Contrastive Finetuning For Language and Vision Tasks. 1-5 - Yalong Jiang, Huining Li, Changkang Li:
A Physically Explainable Framework for Human-Related Anomaly Detection. 1-5 - Liang Zhao, Zihao Wang, Yukun Yuan, Feng Ding:
Unrestricted Anchor Graph Based GCN for Incomplete Multi-View Clustering. 1-5 - Yuhong Zhang, Hengsheng Zhang, Li Song, Rong Xie, Wenjun Zhang:
Dual-Head Fusion Network for Image Enhancement. 1-5 - Anton Ratnarajah, Ishwarya Ananthabhotla, Vamsi Krishna Ithapu, Pablo Hoffmann, Dinesh Manocha, Paul Calamia:
Towards Improved Room Impulse Response Estimation for Speech Recognition. 1-5 - Yuhui Guo, Xun Liang, James T. Kwok, Xiangping Zheng, Bo Wu, Yuefeng Ma:
Cross-Modal Matching and Adaptive Graph Attention Network for RGB-D Scene Recognition. 1-5 - Yang Wu, Hao Zhang, Lingyan Liang, Yaqian Zhao, Kaihua Zhang:
Group-Wise Co-Salient Object Detection with Siamese Transformers Via Brownian Distance Covariance Matching. 1-5 - Saarang Panchavati, Samuel Vander Dussen, Hemal Semwal, Ahmed Ali, Justin Chen, Haoran Li, Corey W. Arnold, William Speier:
Pretrained Transformers for Seizure Detection. 1-2 - Xing Wei, Bin Wen, Lei Chen, Yujie Liu, Chong Zhao, Yang Lu:
Contrastive Domain Adaptation Via Delimitation Discriminator. 1-5 - Christo Kurisummoottil Thomas, Dirk Slock:
Alternating Constrained Minimization Based Approximate Message Passing. 1-5 - Zewang Zhang, Yibin Zheng, Xinhui Li, Li Lu:
WeSinger 2: Fully Parallel Singing Voice Synthesis via Multi-Singer Conditional Adversarial Training. 1-5 - Jingqi Li, Yuzhen Zhang, Hongming Shan, Junping Zhang:
Gaitcotr: Improved Spatial-Temporal Representation for Gait Recognition with a Hybrid Convolution-Transformer Framework. 1-5 - Ge Zhu, Yujia Yan, Juan Pablo Cáceres, Zhiyao Duan:
Transcription Free Filler Word Detection with Neural Semi-CRFs. 1-5 - R. S. Prasobh Sankar, Sundeep Prabhakar Chepuri:
Quantized Precoding and RIS-Assisted Modulation for Integrated Sensing and Communications Systems. 1-5 - Kun Wei, Long Zhou, Ziqiang Zhang, Liping Chen, Shujie Liu, Lei He, Jinyu Li, Furu Wei:
Joint Pre-Training with Speech and Bilingual Text for Direct Speech to Speech Translation. 1-5 - Adarsh Barik, Jean Honorio:
Provable Computational and Statistical Guarantees for Efficient Learning of Continuous-Action Graphical Games. 1-5 - Yuanyuan Wang, Yang Zhang, Zhiyong Wu, Zhihan Yang, Tao Wei, Kun Zou, Helen Meng:
DASA: Difficulty-Aware Semantic Augmentation for Speaker Verification. 1-5 - Aditya Kanade, Mansi Sharma, Manivannan Muniyandi:
Attention-Guided Deep Learning Framework For Movement Quality Assessment. 1-5 - Brayan Monroy, Jorge Bacca, Henry Arguello:
Deep Adaptive Superpixels For Hadamard Single Pixel Imaging In Near-Infrared Spectrum. 1-5 - Paul M. Reuter, Christian Rollwage, Bernd T. Meyer:
Multilingual Query-by-Example Keyword Spotting with Metric Learning and Phoneme-to-Embedding Mapping. 1-5 - Meizheng Peng, Xu Jia, Min Peng:
A Benchmark for Evaluating Robustness of Spoken Language Understanding Models in Slot Filling. 1-5 - Chao Zhang, Bo Li, Tara N. Sainath, Trevor Strohman, Shuo-Yiin Chang:
UML: A Universal Monolingual Output Layer For Multilingual ASR. 1-5 - Dong Yang, Fei Jiang, Wei Wu, Xuefei Fang, Muyong Cao:
Low-Complexity Acoustic Echo Cancellation with Neural Kalman Filtering. 1-5 - Chaojian Li, Wenwan Chen, Jiayi Yuan, Yingyan Celine Lin, Ashutosh Sabharwal:
ERSAM: Neural Architecture Search for Energy-Efficient and Real-Time Social Ambiance Measurement. 1-5 - Gerald Matz, Claudio Verardo, Thomas Dittrich:
Efficient Learning of Balanced Signature Graphs. 1-5 - Ali Elkahky, Wei-Ning Hsu, Paden Tomasello, Tu Anh Nguyen, Robin Algayres, Yossi Adi, Jade Copet, Emmanuel Dupoux, Abdelrahman Mohamed:
Do Coarser Units Benefit Cluster Prediction-Based Speech Pre-Training? 1-5 - Takashi Maekaku, Yuya Fujita, Xuankai Chang, Shinji Watanabe:
Fully Unsupervised Topic Clustering of Unlabelled Spoken Audio Using Self-Supervised Representation Learning and Topic Model. 1-5 - Jingyi Li, Weiping Tu, Li Xiao:
Freevc: Towards High-Quality Text-Free One-Shot Voice Conversion. 1-5 - Christopher Ick, Adib Mehrabi, Wenyu Jin:
Blind Acoustic Room Parameter Estimation Using Phase Features. 1-5 - Jabran Akhtar:
High-Resolution Neural Network Processing of LFM Radar Pulses. 1-5 - Jiahuan Zhang, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama:
Defense Against Black-Box Adversarial Attacks Via Heterogeneous Fusion Features. 1-5 - Guangwei Li, Xuenan Xu, Lingfeng Dai, Mengyue Wu, Kai Yu:
Diverse and Vivid Sound Generation from Text Descriptions. 1-5 - Mark Cartwright, Magdalena Fuentes, Charlie Mydlarz, Fabio Miranda, Juan Pablo Bello:
Does a Quieter City Mean Fewer Complaints? The Sounds of New York City During Covid-19 Lockdown. 1-5 - Biao Liu, Xiaoyu Wu, Bo Yuan:
ICEL: Learning with Inconsistent Explanations. 1-5 - Xinlu Zhuang, Yunjie Ge, Baolin Zheng, Qian Wang:
Adversarial Network Pruning by Filter Robustness Estimation. 1-5 - Pinjun Zheng, Hui Chen, Tarig Ballal, Henk Wymeersch, Tareq Y. Al-Naffouri:
Misspecified Cramér-Rao Bound of RIS-Aided Localization Under Geometry Mismatch. 1-5 - Peixuan Liu, Yinghui Wang, Jinlong Yang, Wei Li:
An Adaptive Enhancement Method for Gastrointestinal Low-Light Images of Capsule Endoscope. 1-5 - Abin Jose, Rijo Roy, Dennis Eschweiler, Ina Laube, Reza Azad, Daniel Moreno-Andrés, Johannes Stegmaier:
End-to-End Classification of Cell-Cycle Stages with Center-Cell Focus Tracker Using Recurrent Neural Networks. 1-5 - Satish Mulleti, Yonina C. Eldar:
High-Dynamic Range ADC for Finite-Rate-of-Innovation Signals. 1-5 - Haoran Deng, Guy Revach, Hai Morgenstern, Nir Shlezinger:
Kalmanbot: Kalmannet-Aided Bollinger Bands for Pairs Trading. 1-5 - Samar Hadou, Charilaos I. Kanatsoulis, Alejandro Ribeiro:
Space-Time Graph Neural Networks with Stochastic Graph Perturbations. 1-5 - Arjun Gupta, Pablo Hoffmann, Sebastian Prepelita, Philip W. Robinson, Vamsi K. Ithapu, David L. Alon:
Learning to Personalize Equalization for High-Fidelity Spatial Audio Reproduction. 1-5 - Yicong He, George K. Atia:
Robust and Parallelizable Tensor Completion Based on Tensor Factorization and Maximum Correntropy Criterion. 1-5 - Jessica Centers, Jeffrey L. Krolik:
Multi-User Methods for Vibrational Radar Backscatter Communications. 1-5 - Jaechang Kim, Yunjoo Lee, Hyun Mi Cho, Dong Woo Kim, Chi Hoon Song, Jungseul Ok:
Activity-Informed Industrial Audio Anomaly Detection Via Source Separation. 1-5 - Shaohuan Zhou, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng:
Enhancing the Vocal Range of Single-Speaker Singing Voice Synthesis with Melody-Unsupervised Pre-Training. 1-5 - Irene Martín-Morató, Manu Harju, Paul Ahokas, Annamaria Mesaros:
Training Sound Event Detection with Soft Labels from Crowdsourced Annotations. 1-5 - Jingchao Gao, Ao Tang, Weiyu Xu:
Optimal Compression for Minimizing Classification Error Probability: An Information-Theoretic Approach. 1-5 - Patrick L. Combettes, Jean-Christophe Pesquet, Audrey Repetti:
A Variational Inequality Model for Learning Neural Networks. 1-5 - Huijing Zhan, Ling Li, Shaohua Li, Weide Liu, Manas Gupta, Alex C. Kot:
Towards Explainable Recommendation Via BERT-Guided Explanation Generator. 1-5 - Harm Lameris, Shivam Mehta, Gustav Eje Henter, Joakim Gustafson, Éva Székely:
Prosody-Controllable Spontaneous TTS with Neural HMMs. 1-5 - Yusong Wu, Ke Chen, Tianyu Zhang, Yuchen Hui, Taylor Berg-Kirkpatrick, Shlomo Dubnov:
Large-Scale Contrastive Language-Audio Pretraining with Feature Fusion and Keyword-to-Caption Augmentation. 1-5 - Yukun Ma, Trung Hieu Nguyen, Jinjie Ni, Wen Wang, Qian Chen, Chong Zhang, Bin Ma:
Auxiliary Pooling Layer For Spoken Language Understanding. 1-5 - Shaowu Chen, Weize Sun, Lei Huang:
WHC: Weighted Hybrid Criterion for Filter Pruning on Convolutional Neural Networks. 1-5 - Peng Sun, Zhenyu Wen, Yejian Zhou, Zhen Hong, Tao Lin:
Neural Mode Estimation. 1-5 - Jin Liu, Xi Wang, Xiaomeng Fu, Yesheng Chai, Cai Yu, Jiao Dai, Jizhong Han:
OPT: One-shot Pose-Controllable Talking Head Generation. 1-5 - Ryosuke Isono, Kazuki Naganuma, Shunsuke Ono:
Robust Spatiotemporal Fusion of Satellite Images via Convex Optimization. 1-5 - Hojjat Salehinejad, Navid Hasanzadeh, Radomir Djogo, Shahrokh Valaee:
Joint Human Orientation-Activity Recognition Using WiFi Signals for Human-Machine Interaction. 1-5 - Guangning Xu, Jinyang Yang, Jinjin Guo, Zhichao Huang, Bowen Zhang:
Int-GNN: A User Intention Aware Graph Neural Network for Session-Based Recommendation. 1-5 - Ruizhe Huang, Matthew Wiesner, Leibny Paola García-Perera, Daniel Povey, Jan Trmal, Sanjeev Khudanpur:
Building Keyword Search System from End-To-End ASR Systems. 1-5 - Yixiao Xu, Xiaolei Liu, Teng Hu, Bangzhou Xin, Run Yang:
Sparse Black-Box Inversion Attack with Limited Information. 1-5 - Haihui Chen, Likai Ran, Xixia Sun, Chao Cai:
SW-WAVENET: Learning Representation from Spectrogram and Wavegram Using Wavenet for Anomalous Sound Detection. 1-5 - Sofoklis Kakouros, Themos Stafylakis, Ladislav Mosner, Lukás Burget:
Speech-Based Emotion Recognition with Self-Supervised Models Using Attentive Channel-Wise Correlations and Label Smoothing. 1-5 - Shinnosuke Matsuo, Ryoma Bise, Seiichi Uchida, Daiki Suehiro:
Learning From Label Proportion with Online Pseudo-Label Decision by Regret Minimization. 1-5 - Keigo Takeuchi:
Long-Memory Message-Passing for Spatially Coupled Systems. 1-5 - Michael Rotman, Lior Wolf:
Energy Regularized RNNs for Solving Non-Stationary Bandit Problems. 1-5 - Jiuxin Lin, Xinyu Cai, Heinrich Dinkel, Jun Chen, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Zhiyong Wu, Yujun Wang, Helen Meng:
Av-Sepformer: Cross-Attention Sepformer for Audio-Visual Target Speaker Extraction. 1-5 - Sergio Cruces:
On the Minimum Perimeter Criterion for Bounded Component Analysis. 1-5 - Pravallika Lavanuru, Sawon Pratiher, Karuna P. Sahoo, Mrinal Acharya, Sreejith S, Nirmalya Ghosh, Amit Patra:
Parasympathetic-Sympathetic Causal Interactions and Perceived Workload for Varying Difficulty Affective Computing Tasks. 1-5 - Suwon Shon, Felix Wu, Kwangyoun Kim, Prashant Sridhar, Karen Livescu, Shinji Watanabe:
Context-Aware Fine-Tuning of Self-Supervised Speech Models. 1-5 - Xinyu Zhang, Han Ying, Ye Tao, Youlu Xing, Guihuan Feng:
General Category Network: Handwritten Mathematical Expression Recognition with Coarse-Grained Recognition Task. 1-5 - Cristian J. Vaca-Rubio, Pu Wang, Toshiaki Koike-Akino, Ye Wang, Petros Boufounos, Petar Popovski:
mmWave Wi-Fi Trajectory Estimation with Continuous-Time Neural Dynamic Learning. 1-5 - Navid Hasanzadeh, Shahrokh Valaee, Hojjat Salehinejad:
Multi-Observation Hidden Semi-Markov Model for Photoplethysmogram Signal Semantic Segmentation. 1-5 - Adel Moumen, Titouan Parcollet:
Stabilising and Accelerating Light Gated Recurrent Units for Automatic Speech Recognition. 1-5 - Yuhan Xiang, Kaijian Liu, Shixiang Tang, Lei Bai, Feng Zhu, Rui Zhao, Xianming Lin:
Trust Your Partner's Friends: Hierarchical Cross-Modal Contrastive Pre-Training for Video-Text Retrieval. 1-5 - Tuo Zhang, Tiantian Feng, Samiul Alam, Sunwoo Lee, Mi Zhang, Shrikanth S. Narayanan, Salman Avestimehr:
FedAudio: A Federated Learning Benchmark for Audio Tasks. 1-5 - You Jin Kim, Hee-Soo Heo, Jee-Weon Jung, Youngki Kwon, Bong-Jin Lee, Joon Son Chung:
Advancing the Dimensionality Reduction of Speaker Embeddings for Speaker Diarisation: Disentangling Noise and Informing Speech Activity. 1-5 - Aki Härmä, Ulf Großekathöfer, Okke Ouweltjes, Venkata Srikanth Nallanthighal:
Forecasting of Breathing Events from Speech for Respiratory Support. 1-5 - Emmanouil Theodosis, Demba E. Ba:
Learning Silhouettes with Group Sparse Autoencoders. 1-5 - Charles Laroche, Andrés Almansa, Eva Coupeté, Matias Tassano:
Provably Convergent Plug & Play Linearized ADMM, Applied to Deblurring Spatially Varying Kernels. 1-5 - Caoyun Fan, Wenqing Chen, Jidong Tian, Yitian Li, Hao He, Yaohui Jin:
Improving the Out-of-Distribution Generalization Capability of Language Models: Counterfactually-Augmented Data is not Enough. 1-5 - Chenda Li, Yifei Wu, Yanmin Qian:
Predictive Skim: Contrastive Predictive Coding for Low-Latency Online Speech Separation. 1-5 - Lei Kang, Lichao Zhang, Dazhi Jiang:
Learning Robust Self-Attention Features for Speech Emotion Recognition with Label-Adaptive Mixup. 1-5 - Wuwei Huang, Renren Jin, Wen Zhang, Jian Luan, Bin Wang, Deyi Xiong:
Joint Training and Decoding for Multilingual End-to-End Simultaneous Speech Translation. 1-5 - Yuening Li, Wing-Kin Ma, Ruiyuan Wu, Huikang Liu:
A Simple Scheme for Coupled Factorization for Hyperspectral Super-Resolution: Exploiting Sparsity in an Easy Way. 1-5 - Haoyin Yan, Haitao Xu, Qing Wang, Jie Zhang:
The NERCSLIP-USTC System for the L3DAS23 Challenge Task2: 3D Sound Event Localization and Detection (SELD). 1-2 - Zeqin Yu, Bin Li, Yuzhen Lin, Jinhua Zeng, Jishen Zeng:
Learning to Locate the Text Forgery in Smartphone Screenshots. 1-5 - Gary C. F. Lee, Amir Weiss, Alejandro Lancho, Yury Polyanskiy, Gregory W. Wornell:
On Neural Architectures for Deep Learning-Based Source Separation of Co-Channel OFDM Signals. 1-5 - Yunchang Liu, Hong Jiang, Qi Zhang:
Mixed Far-field and Near-field Source Localization Based on Low-Rank Matrix Reconstruction. 1-5 - Heng Zhang, Bing Su:
Decaying Contrast for Fine-Grained Video Representation Learning. 1-5 - Run Chen, Seokhwan Kim, Alexandros Papangelis, Julia Hirschberg, Yang Liu, Dilek Hakkani-Tür:
Identifying Entrainment in Task-Oriented Conversations. 1-5 - Anuj Diwan, Ching-Feng Yeh, Wei-Ning Hsu, Paden Tomasello, Eunsol Choi, David Harwath, Abdelrahman Mohamed:
Continual Learning for On-Device Speech Recognition Using Disentangled Conformers. 1-5 - Hang Zheng, Chengwei Zhou, Sergiy A. Vorobyov, Zhiguo Shi:
Tensorized Neural Layer Decomposition for 2-D DOA Estimation. 1-5 - Darius Petermann, Gordon Wichern, Aswin Shanmugam Subramanian, Jonathan Le Roux:
Hyperbolic Audio Source Separation. 1-5 - Dimitrios Bralios, Efthymios Tzinis, Gordon Wichern, Paris Smaragdis, Jonathan Le Roux:
Latent Iterative Refinement for Modular Source Separation. 1-5 - Kai Wang, Fangdong Chen, Zongmiao Ye, Li Wang, Xiaoyang Wu, Shiliang Pu:
A Spatio-Temporal Decomposition Network for Compressed Video Quality Enhancement. 1-5 - Yuhao Liu, Cheng Gong, Longbiao Wang, Xixin Wu, Qiuyu Liu, Jianwu Dang:
VF-Taco2: Towards Fast and Lightweight Synthesis for Autoregressive Models with Variation Autoencoder and Feature Distillation. 1-5 - Weiran Wang, Ding Zhao, Shaojin Ding, Hao Zhang, Shuo-Yiin Chang, David Rybach, Tara N. Sainath, Yanzhang He, Ian McGraw, Shankar Kumar:
Multi-Output RNN-T Joint Networks for Multi-Task Learning of ASR and Auxiliary Tasks. 1-5 - Kang You, Bo Liu, Kele Xu, Yunsheng Xiong, Qisheng Xu, Ming Feng, Tamás Gábor Csapó, Boqing Zhu:
Raw Ultrasound-Based Phonetic Segments Classification Via Mask Modeling. 1-5 - Joose Sainio, Alexandre Mercat, Jarno Vanne:
RDO Candidate Selection for Maximizing Coding Efficiency in a Practical HEVC Encoder. 1-5 - Bo Jiang, Hamid Krim, Tianfu Wu, Derya Cansever:
Implicit Bayes Adaptation: A Collaborative Transport Approach. 1-5 - Wen Wang, Wanli Ni, Hui Tian, Yonina C. Eldar:
Multi-Functional Reconfigurable Intelligent Surface. 1-5 - Dalu Guo, Ke Zhang, Jiaxing Li, Youyong Kong:
Topgformer: Topological-Based Graph Transformer for Mapping Brain Structural Connectivity to Functional Connectivity. 1-5 - He Zhu, Ce Li, Haitian Yang, Yan Wang, Weiqing Huang:
Prompt Makes Mask Language Models Better Adversarial Attackers. 1-5 - Rodrigo B. Pinheiro, Jean-Eudes Marvie, Giuseppe Valenzise, Frédéric Dufaux:
NF-PCAC: Normalizing Flow Based Point Cloud Attribute Compression. 1-5 - Shashwat Jain, Vikram Krishnamurthy, Muralidhar Rangaswamy, Bosung Kang, Sandeep Gogineni:
Radar Clutter Covariance Estimation: A Nonlinear Spectral Shrinkage Approach. 1-5 - Duc Le, Frank Seide, Yuhao Wang, Yang Li, Kjell Schubert, Ozlem Kalinli, Michael L. Seltzer:
Factorized Blank Thresholding for Improved Runtime Efficiency of Neural Transducers. 1-5 - Wei-Hsiang Wang, Xiaolu Zeng, Beibei Wang, Yexin Cao, K. J. Ray Liu:
Improved Wifi-Based Respiration Tracking via Contrast Enhancement. 1-2 - Yanyan Huang, Yong Wang, Kun Shi, Chaojie Gu, Yu Fu, Cheng Zhuo, Zhiguo Shi:
HDNet: Hierarchical Dynamic Network for Gait Recognition using Millimeter-wave radar. 1-5 - Shengkui Zhao, Bin Ma:
MossFormer: Pushing the Performance Limit of Monaural Speech Separation Using Gated Single-Head Transformer with Convolution-Augmented Joint Self-Attentions. 1-5 - Pablo Luesia-Lahoz, Diego Gutierrez, Adolfo Muñoz:
Zone Plate Virtual Lenses for Memory-Constrained NLOS Imaging. 1-5 - Felix Burkhardt, Anna Derington, Matthias Kahlau, Klaus R. Scherer, Florian Eyben, Björn W. Schuller:
Masking Speech Contents by Random Splicing: is Emotional Expression Preserved? 1-5 - Shaik Basheeruddin Shah, Satish Mulleti, Yonina C. Eldar:
Lasso-Based Fast Residual Recovery For Modulo Sampling. 1-5 - Guilherme Schu, Parvaneh Janbakhshi, Ina Kodrasi:
On Using the UA-Speech and Torgo Databases to Validate Automatic Dysarthric Speech Classification Approaches. 1-5 - Xinyu Lin, Yingjie Zhou, Xun Zhang, Yipeng Liu, Ce Zhu:
Efficient and Effective Multi-Camera Pose Estimation with Weighted M-Estimate Sample Consensus. 1-5 - Yanhua Wang, Qiubo Pei, Xueyao Hu, Jiamin Long, Hao Yu, Le Zheng:
Doppler-Coded Joint Division Multiple Access Waveform for Automotive MIMO Radar. 1-5 - Satoshi Tsutsui, Zhengyang Su, Bihan Wen:
Benchmarking White Blood Cell Classification under Domain Shift. 1-5 - Zicheng Cai, Lei Chen, Hai-Lin Liu:
BHE-DARTS: Bilevel Optimization Based on Hypergradient Estimation for Differentiable Architecture Search. 1-5 - Wenqing Wang, Lingqing Zhang, Chi-Man Pun, Jiucheng Xie:
Boosting Face Recognition Performance with Synthetic Data and Limited Real Data. 1-5 - Fumio Nihei, Ryo Ishii, Yukiko I. Nakano, Atsushi Fukayama, Takao Nakamura:
Whether Contribution of Features Differ Between Video-Mediated and In-Person Meetings in Important Utterance Estimation. 1-5 - Tae-Jin Woo, Woo-Jeoung Nam, Yeong-Joon Ju, Seong-Whan Lee:
Compensatory Debiasing For Gender Imbalances In Language Models. 1-5 - Ruiteng Zhang, Jianguo Wei, Xugang Lu, Wenhuan Lu, Di Jin, Lin Zhang, Junhai Xu:
Optimal Transport with a Diversified Memory Bank for Cross-Domain Speaker Verification. 1-5 - Florian Schmid, Khaled Koutini, Gerhard Widmer:
Efficient Large-Scale Audio Tagging Via Transformer-to-CNN Knowledge Distillation. 1-5 - Rana M. Khalil, Alexandra Papanicolaou, Renee Ti Chou, Bobby E. Gibbs, Samira Anderson, Sandra Gordon-Salant, Michael P. Cummings, Matthew J. Goupell:
Using Machine Learning to Understand the Relationships Between Audiometric Data, Speech Perception, Temporal Processing, And Cognition. 1-5 - Joseph Kim, Dong Wu, Mingmin Chi, Gaoqi Xu:
Hierarchical Multi-Task Learning for Fabric Component Analysis Based on NIR Spectral Signals. 1-5 - Konstantinos D. Polyzos, Qin Lu, Georgios B. Giannakis:
Bayesian Optimization with Ensemble Learning Models and Adaptive Expected Improvement. 1-5 - Darius Petermann, Inseon Jang, Minje Kim:
Native Multi-Band Audio Coding Within Hyper-Autoencoded Reconstruction Propagation Networks. 1-5 - Xiangxiang Gao, Wei Zhu, Jiasheng Gao, Congrui Yin:
F-PABEE: Flexible-Patience-Based Early Exiting For Single-Label and Multi-Label Text Classification Tasks. 1-5 - Xuhang Chen, Xiaodong Cun, Chi-Man Pun, Shuqiang Wang:
Shadocnet: Learning Spatial-Aware Tokens in Transformer for Document Shadow Removal. 1-5 - Bozhen Hu, Zelin Zang, Jun Xia, Lirong Wu, Cheng Tan, Stan Z. Li:
Deep Manifold Graph Auto-Encoder For Attributed Graph Embedding. 1-5 - Han Han, Vincent Lostanlen, Mathieu Lagrange:
Perceptual-Neural-Physical Sound Matching. 1-5 - Jiang Zhu, Xiangming Meng, Xupeng Lei, Qinghua Guo:
A Unitary Transform Based Generalized Approximate Message Passing. 1-5 - Nikola Lackovic, Claude Montacié, Cédric Lequilliec, Marie-José Caraty:
Healthcall Corpus and Transformer Embeddings from Healthcare Customer-Agent Conversations. 1-5 - Jonathan Svirsky, Ofir Lindenbaum:
SG-VAD: Stochastic Gates Based Speech Activity Detection. 1-5 - Zhaoxi Mu, Xinyu Yang, Wenjing Zhu:
Multi-Dimensional and Multi-Scale Modeling for Speech Separation Optimized by Discriminative Learning. 1-5 - Antonino Maria Rizzo, Luca Magri, Pietro Invernizzi, Enrico Sozio, Stefano Piciaccia, Alberto Tanzi, Stefano Binetti, Cesare Alippi, Giacomo Boracchi:
Anomaly Detection in Optical Spectra Via Joint Optimization. 1-5 - Moyu Terao, Eita Nakamura, Kazuyoshi Yoshii:
Neural Band-to-Piano Score Arrangement with Stepless Difficulty Control. 1-5 - Robert P. Spang, Karl El Hajal, Sebastian Möller, Milos Cernak:
Personalized Task Load Prediction in Speech Communication. 1-5 - Luke Snow, Vikram Krishnamurthy, Brian M. Sadler:
Identifying Coordination in a Cognitive Radar Network - A Multi-Objective Inverse Reinforcement Learning Approach. 1-5 - Jinglei Shi, Christine Guillemot:
Light Field Compression Via Compact Neural Scene Representation. 1-5 - Souvik Kundu, Sairam Sundaresan, Sharath Nittur Sridhar, Shunlin Lu, Han Tang, Peter A. Beerel:
Sparse Mixture Once-for-all Adversarial Training for Efficient in-situ Trade-off between Accuracy and Robustness of DNNs. 1-5 - Tong Zhang, Wenxue Cui, Chen Hui, Feng Jiang:
Hierarchical Interactive Reconstruction Network for Video Compressive Sensing. 1-5 - Francesco Linsalata, Nassar Ksairi:
On the Joint Estimation of Phase Noise and Time-Varying Channels for OFDM under High-Mobility Conditions. 1-5 - Tian Cheng, Masataka Goto:
U-Beat: A Multi-Scale Beat Tracking Model Based on Wave-U-Net. 1-5 - Kazuhiro Yamawaki, Xian-Hua Han:
Local to Global Prior Learning for Blind Unsupervised Image Super Resolution. 1-5 - Zhengbo Wang, Jian Liang, Zilei Wang, Tieniu Tan:
Notice of Removal: Exploiting Semantic Attributes for Transductive Zero-Shot Learning. 1-5 - Zhuocheng Jiang, Yue Tian, Yangmin Ding, Sarper Ozharar, Ting Wang:
Utility Pole Localization by Learning from Ambient Traces on Distributed Acoustic Sensing. 1-5 - Chun Ren, Danfeng Yan, Yuanqiang Cai, Yangchun Li:
Semi-Swinderain: Semi-Supervised Image Deraining Network Using Swin Transformer. 1-5 - Shuonan Chen, Bovey Y. Rao, Stephanie Herrlinger, Attila Losonczy, Liam Paninski, Erdem Varol:
Multimodal Microscopy Image Alignment Using Spatial and Shape Information and a Branch-and-Bound Algorithm. 1-5 - Peng Zheng, Zhen Huang, Yong Dou, Yeqing Yan:
Rumor Detection Via Assessing the Spreading Propensity of Users. 1-5 - Zuhaib Akhtar, Mohammad Omar Khursheed, Dongsu Du, Yuzong Liu:
Small-Footprint Slimmable Networks for Keyword Spotting. 1-5 - Kun Yang, Jun Lu:
DMSA: Dynamic Multi-Scale Unsupervised Semantic Segmentation Based On Adaptive Affinity. 1-5 - Shikun Sun, Jia Jia, Haozhe Wu, Zijie Ye, Junliang Xing:
MSNet: A Deep Architecture Using Multi-Sentiment Semantics for Sentiment-Aware Image Style Transfer. 1-5 - Xu Guo, Ming Ma, Jiaqiang Zhang, Shaojie Li:
YOLOX-B: A Better YOLOX Model for Real-Time Driver Behavior Detection. 1-5 - Siwei Zhang, Tobias Baumgartner, Emanuel Staudinger, Robert Pöhlmann, Fabio Broghammer, Armin Dammann:
Autonomous Navigation of a Robotic Swarm in Space Exploration Missions. 1-5 - Guanghao Zheng, Yuchen Liu, Wenrui Dai, Chenglin Li, Junni Zou, Hongkai Xiong:
Learning Causal Representations for Generalizable Face Anti Spoofing. 1-5 - Jixuan Wang, Martin Radfar, Kai Wei, Clement Chung:
End-to-End Spoken Language Understanding Using Joint CTC Loss and Self-Supervised, Pretrained Acoustic Encoders. 1-5 - Chenchen Fan, Yixin Wang, Yahong Zhang, Wenli Ouyang:
Interpretable Multi-Scale Neural Network for Granger Causality Discovery. 1-5 - Oyku Deniz Kose, Yanning Shen:
Dynamic Fair Node Representation Learning. 1-5 - Zhikang Zhang, Zifan Yu, Suya You, Raghuveer Rao, Sanjeev Agarwal, Fengbo Ren:
Enhanced Low-Resolution LiDAR-Camera Calibration via Depth Interpolation and Supervised Contrastive Learning. 1-5 - Yi Chang, Zhao Ren, Thanh Tam Nguyen, Kun Qian, Björn W. Schuller:
Knowledge Transfer for On-Device Speech Emotion Recognition With Neural Structured Learning. 1-5 - Tien-Ju Yang, Yonghui Xiao, Giovanni Motta, Françoise Beaufays, Rajiv Mathews, Mingqing Chen:
Online Model Compression for Federated Learning with Large Models. 1-5 - Bowen Tan, Linfeng Xu, Zihuan Qiu, Qingbo Wu, Fanman Meng:
MFAT: A Multi-Level Feature Aggregated Transformer for Person Re-Identification. 1-5 - Zepeng Zhang, Ziping Zhao, Kaiming Shen:
Enhancing the Efficiency of WMMSE and FP for Beamforming by Minorization-Maximization. 1-5 - Julius Ott, Lorenzo Servadei, Jose A. Arjona-Medina, Enrico Rinaldi, Gianfranco Mauro, Daniela Sanchez Lopera, Michael Stephan, Thomas Stadelmayer, Avik Santra, Robert Wille:
MEET: A Monte Carlo Exploration-Exploitation Trade-Off for Buffer Sampling. 1-5 - Bo Pang, Yongquan Fu, Siyuan Ren, Siqi Shen, Ye Wang, Qing Liao, Yan Jia:
A Multi-Modal Approach For Context-Aware Network Traffic Classification. 1-5 - François Effa, Romain Serizel, Jean-Pierre Arz, Nicolas Grimault:
Lightweight Annotation and Class Weight Training for Automatic Estimation of Alarm Audibility in Noise. 1-5 - Yuxuan Sun, Chenglu Zhu, Yunlong Zhang, Honglin Li, Pingyi Chen, Lin Yang:
Assessing the Robustness of Deep Learning-Assisted Pathological Image Analysis Under Practical Variables of Imaging System. 1-5 - Amir Aghabiglou, Matthieu Terris, Adrian Jackson, Yves Wiaux:
Deep Network Series for Large-Scale High-Dynamic Range Imaging. 1-5 - Dianwen Ng, Ruixi Zhang, Jia Qi Yip, Zhao Yang, Jinjie Ni, Chong Zhang, Yukun Ma, Chongjia Ni, Eng Siong Chng, Bin Ma:
De'hubert: Disentangling Noise in a Self-Supervised Model for Robust Speech Recognition. 1-5 - Anahita Baninajjar, Kamran Hosseini, Ahmed Rezine, Amir Aminifar:
SafeDeep: A Scalable Robustness Verification Framework for Deep Neural Networks. 1-5 - Honglong Wang, Yanjie Fu, Junjie Li, Meng Ge, Longbiao Wang, Xinyuan Qian:
Stream Attention Based U-Net for L3DAS23 Challenge. 1-2 - Sahar Husseini, Jean-Luc Dugelay, Fabien Aili, Emmanuel Nars:
A 3D-Assisted Framework to Evaluate the Quality of Head Motion Replication by Reenactment Deepfake Generators. 1-5 - Zhengyan Sheng, Yang Ai, Zhen-Hua Ling:
Zero-Shot Personalized Lip-To-Speech Synthesis with Face Image Based Voice Control. 1-5 - Kevin J. Mitchell, Khaled Kassem, Chaitanya Kaul, Valentin Kapitany, Philip Binner, Andrew Ramsay, Daniele Faccio, Roderick Murray-Smith:
mmSense: Detecting Concealed Weapons with a Miniature Radar Sensor. 1-5 - Quchen Fu, Szu-Wei Fu, Yaran Fan, Yu Wu, Zhuo Chen, Jayant Gupchup, Ross Cutler:
Real-Time Speech Interruption Analysis: from Cloud to Client Deployment. 1-5 - Suhee Jo, Younggun Lee, Yookyung Shin, Yeongtae Hwang, Taesu Kim:
Cross-Speaker Emotion Transfer by Manipulating Speech Style Latents. 1-5 - Aya Watanabe, Shinnosuke Takamichi, Yuki Saito, Detai Xin, Hiroshi Saruwatari:
Mid-Attribute Speaker Generation Using Optimal-Transport-Based Interpolation of Gaussian Mixture Models. 1-5 - Moshe Mandel, Or Tal, Yossi Adi:
AERO: Audio Super Resolution in the Spectral Domain. 1-5 - Haodong Zhao, Wei Du, Fangqi Li, Peixuan Li, Gongshen Liu:
FedPrompt: Communication-Efficient and Privacy-Preserving Prompt Tuning in Federated Learning. 1-5 - Dekai Sun, Yancheng He, Jiqing Han:
Using Auxiliary Tasks in Multimodal Fusion of Wav2vec 2.0 and BERT for Multimodal Emotion Recognition. 1-5 - Yixiao Yang, Ran Tao:
Single-Shot Fractional Fourier Phase Retrieval. 1-5 - Wentao Zhu, Mohamed Omar:
Multiscale Audio Spectrogram Transformer for Efficient Audio Classification. 1-5 - Cheng Tan, Zhangyang Gao, Jun Xia, Bozhen Hu, Stan Z. Li:
Global-Context Aware Generative Protein Design. 1-5 - John B. Harvill, Jarred Barber, Arun Nair, Ramin Pishehvar:
SPADE: Self-Supervised Pretraining for Acoustic Disentanglement. 1-5 - Tong Xia, Jing Han, Abhirup Ghosh, Cecilia Mascolo:
Cross-Device Federated Learning for Mobile Health Diagnostics: A First Study on COVID-19 Detection. 1-5 - Wei Tsung Lu, Ju-Chiang Wang, Yun-Ning Hung:
Multitrack Music Transcription with a Time-Frequency Perceiver. 1-5 - Chris Henry, M. Salman Asif, Zhu Li:
Privacy Preserving Face Recognition with Lensless Camera. 1-5 - Jing Lu, Chunlei Wu, Leiquan Wang, Shaozu Yuan, Jie Wu:
Nested Attention Network with Graph Filtering for Visual Question and Answering. 1-5 - Yu-Jhe Li, Matthew O'Toole, Kris Kitani:
ST-MVDNet++: Improve Vehicle Detection with Lidar-Radar Geometrical Augmentation via Self-Training. 1-5 - Kanishka Tyagi, Shan Zhang, Yihang Zhang, John L. Kirkwood, Sanling Song, Narbik Manukian:
Machine Learning Based Early Debris Detection Using Automotive Low Level Radar Data. 1-5 - Sangeon Yong, Li Su, Juhan Nam:
A Phoneme-Informed Neural Network Model For Note-Level Singing Transcription. 1-5 - Jiahuan Ji, Baojiang Zhong, Kai-Kuang Mu:
A Content-Based Multi-Scale Network for Single Image Super-Resolution. 1-5 - Yuqi Yang, Songyun Yang, Jiyang Xie, Zhongwei Si, Kai Guo, Ke Zhang, Kongming Liang:
Multi-Head Uncertainty Inference for Adversarial Attack Detection. 1-5 - Sathvik Udupa, Prasanta Kumar Ghosh:
Real-Time MRI Video Synthesis from Time Aligned Phonemes with Sequence-to-Sequence Networks. 1-5 - Huipeng Ma, Qiu Tang, Ni Zhang, Rui Xu, Yanhua Shao, Wei Yan, Yaojun Wang:
Spteae: A Soft Prompt Transfer Model for Zero-Shot Cross-Lingual Event Argument Extraction. 1-5 - Heidi Lei, Arm Wonghirundacha, Irmak Bukey, T. J. Tsai:
Audio Cross Verification Using Dual Alignment Likelihood Ratio Test. 1-5 - Jhih-Cing Huang, Yu-Lin Tsai, Chao-Han Huck Yang, Cheng-Fang Su, Chia-Mu Yu, Pin-Yu Chen, Sy-Yen Kuo:
Certified Robustness of Quantum Classifiers Against Adversarial Examples Through Quantum Noise. 1-5 - William Todo, Merwann Selmani, Béatrice Laurent, Jean-Michel Loubes:
Counterfactual Explanation for Multivariate Times Series Using A Contrastive Variational Autoencoder. 1-5 - Zongyu Zhang, Yujie Gu, Zhiguo Shi:
Explicit Ziv-Zakai Bound for Multiple Sources DOA Estimation. 1-5 - Zefa Hu, Xiuyi Chen, Haoran Wu, Minglun Han, Ziyi Ni, Jing Shi, Shuang Xu, Bo Xu:
Matching-Based Term Semantics Pre-Training for Spoken Patient Query Understanding. 1-5 - Yutong Shao, Arun Kumar, Ndapa Nakashole:
Database-Aware ASR Error Correction for Speech-to-SQL Parsing. 1-5 - Hongxin Lin, Yunwei Chiu, Peiyuan Wu:
AMPose: Alternately Mixed Global-Local Attention Model for 3D Human Pose Estimation. 1-5 - Jiamu Sheng, Jiayuan Fan, Peng Ye, Jianjian Cao:
JNDMix: JND-Based Data Augmentation for No-Reference Image Quality Assessment. 1-5 - Reo Yoneyama, Yi-Chiao Wu, Tomoki Toda:
Source-Filter HiFi-GAN: Fast and Pitch Controllable High-Fidelity Neural Vocoder. 1-5 - Xiyuan Zhang, Ranak Roy Chowdhury, Jingbo Shang, Rajesh K. Gupta, Dezhi Hong:
Towards Diverse and Coherent Augmentation for Time-Series Forecasting. 1-5 - Weikai Kong, Shuhong Ye, Chenglin Yao, Jianfeng Ren:
Confidence-Based Event-Centric Online Video Question Answering on a Newly Constructed ATBS Dataset. 1-5 - Zakaria Aldeneh, Masha Fedzechkina, Skyler Seto, Katherine Metcalf, Miguel Sarabia, Nicholas Apostoloff, Barry-John Theobald:
On the Role of Lip Articulation in Visual Speech Perception. 1-5 - Chengwei Ouyang, Kexin Fei, Haoshuai Zhou, Congxi Lu, Linkai Li:
A Multi-Stage Low-Latency Enhancement System for Hearing Aids. 1-2 - Chengwen Zhang, Danqin Wu:
Modeling Global Latent Semantic in Multi-Turn Conversations with Random Context Reconstruction. 1-5 - Yiyang Liu, Chenxin Li, Xiaotong Tu, Xinghao Ding, Yue Huang:
Hint-Dynamic Knowledge Distillation. 1-5 - Adriano Durao, Joel P. Arrais, Bernardete Ribeiro, Gabriel Falcao:
On the Quantization of Recurrent Neural Networks for SMILES Generation. 1-5 - Xinjun Pei, Xiaoheng Deng, Shengwei Tian, Kaiping Xue:
Efficient Privacy Preserving Graph Neural Network for Node Classification. 1-5 - Shuvendu Roy, Ali Etemad:
Temporal Contrastive Learning with Curriculum. 1-5 - Rui-Chen Zheng, Yang Ai, Zhen-Hua Ling:
Speech Reconstruction from Silent Tongue and Lip Articulation by Pseudo Target Generation and Domain Adversarial Training. 1-5 - Miseul Kim, Zhenyu Piao, Jihyun Lee, Hong-Goo Kang:
Style Modeling for Multi-Speaker Articulation-to-Speech. 1-5 - Alexandre Bruckert, Mona Abid, Matthieu Perreira Da Silva, Patrick Le Callet:
Could the BubbleView Metaphor be used to Infer Visual Attention on 3D Graphical Content? 1-5 - Christo Kurisummoottil Thomas, Walid Saad:
Reliable Beamforming at Terahertz Bands: Are Causal Representations the Way Forward? 1-5 - Jian Xu, Yang Lei, Guangqi Zhu, Yunling Feng, Bo Xiao, Qifeng Qian, Yajing Xu:
SL-MoE: A Two-Stage Mixture-of-Experts Sequence Learning Framework for Forecasting Rapid Intensification of Tropical Cyclone. 1-5 - Haibo Shen, Yihao Luo, Xiang Cao, Liangqi Zhang, Juyu Xiao, Tianjiang Wang:
Training Robust Spiking Neural Networks on Neuromorphic Data with Spatiotemporal Fragments. 1-5 - Marie Kunesová, Jindrich Matousek, Jan Lehecka, Jan Svec, Josef Michálek, Daniel Tihelka, Martin Bulín, Zdenek Hanzlícek, Markéta Rezácková:
Ensemble of Deep Neural Network Models for MOS Prediction. 1-5 - Ho-Hsiang Wu, Oriol Nieto, Juan Pablo Bello, Justin Salamon:
Audio-Text Models Do Not Yet Leverage Natural Language. 1-5 - Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda:
Singing Voice Synthesis Based on a Musical Note Position-Aware Attention Mechanism. 1-5 - Linghui Cai, Zhenhua Tang:
Stereoscopic Video Retargeting Based on Camera Motion Classification. 1-5 - Jianlong Yuan, Yuanhong Xu, Zhibin Wang:
K2NN: Self-Supervised Learning with Hierarchical Nearest Neighbors for Remote Sensing. 1-5 - Wenyun Li, Guo Zhong, Xingyu Lu, Chi-Man Pun:
Locality Preserving Multiview Graph Hashing For Large Scale Remote Sensing Image Search. 1-5 - Zhongjie Li, Bin Zhao, Gaoyan Zhang, Jianwu Dang:
Brain Network Features Differentiate Intentions from Different Emotional Expressions of the Same Text. 1-5 - Sandipan Dhar, Padmanabha Banerjee, Nanda Dulal Jana, Swagatam Das:
Voice Conversion Using Feature Specific Loss Function Based Self-Attentive Generative Adversarial Network. 1-5 - Michalis Vrigkas, Virginia Tagka, Marina E. Plissiti, Christophoros Nikou:
Composition of Motion from Video Animation Through Learning Local Transformations. 1-5 - Liangjie Huang, Tian Yuan, Yunming Liang, Zeyu Chen, Can Wen, Yanlu Xie, Jinsong Zhang, Dengfeng Ke:
LIMI-VC: A Light Weight Voice Conversion Model with Mutual Information Disentanglement. 1-5 - Yashas Malur Saidutta, Rakshith Sharma Srinivasa, Ching Hua Lee, Chouchang Yang, Yilin Shen, Hongxia Jin:
To Wake-Up or Not to Wake-Up: Reducing Keyword False Alarm by Successive Refinement. 1-5 - Ramon Sanabria, Hao Tang, Sharon Goldwater:
Analyzing Acoustic Word Embeddings from Pre-Trained Self-Supervised Speech Models. 1-5 - Shutong Niu, Jun Du, Qing Wang, Li Chai, Huaxin Wu, Zhaoxu Nian, Lei Sun, Yi Fang, Jia Pan, Chin-Hui Lee:
An Experimental Study on Sound Event Localization and Detection Under Realistic Testing Conditions. 1-5 - Silpa Babu, Selin Aviyente, Namrata Vaswani:
Tensor Low Rank Column-Wise Compressive Sensing for Dynamic Imaging. 1-5 - Zhihua Li, Lijun Yin:
Multimodal Facial Action Unit Detection with Physiological Signals. 1-5 - Dennis Wei, Dmitry M. Malioutov:
A Statistical Interpretation of the Maximum Subarray Problem. 1-5 - Vikramjit Mitra, Vasudha Kowtha, Hsiang-Yun Sherry Chien, Erdrin Azemi, Carlos Avendaño:
Pre-Trained Model Representations and Their Robustness Against Noise for Speech Emotion Analysis. 1-5 - Mike Thornton, Danilo P. Mandic, Tobias Reichenbach:
Relating EEG Recordings to Speech Using Envelope Tracking and The Speech-FFR. 1-2 - Tao Hong, Zeren Zhang, Jinwen Ma:
PCSalmix: Gradient Saliency-Based Mix Augmentation for Point Cloud Classification. 1-5 - Allen Chang, Mary Knapp, James LaBelle, John Swoboda, Ryan Volz, Philip J. Erickson:
Removing Radio Frequency Interference From Auroral Kilometric Radiation With Stacked Autoencoders. 1-5 - Detai Xin, Sharath Adavanne, Federico Ang, Ashish Kulkarni, Shinnosuke Takamichi, Hiroshi Saruwatari:
Improving Speech Prosody of Audiobook Text-To-Speech Synthesis with Acoustic and Textual Contexts. 1-5 - Yiran He, Hoi-To Wai:
Central Nodes Detection from Partially Observed Graph Signals. 1-5 - Zakir Hussain Shaik, Erik G. Larsson:
Distributed Signal Processing for Out-of-System Interference Suppression in Cell-Free Massive MIMO. 1-5 - Heinrich Dinkel, Yongqing Wang, Zhiyong Yan, Junbo Zhang, Yujun Wang:
Unified Keyword Spotting and Audio Tagging on Mobile Devices with Transformers. 1-5 - Shiwei Wu, Kang Zhang, Xia Yuan, Chunxia Zhao:
Infrared and Visible Image Fusion by Using Multi-Scale Transformation and Fractional-Order Gradient Information. 1-5 - Itay Buchnik, Damiano Steger, Guy Revach, Ruud J. G. van Sloun, Tirza Routtenberg, Nir Shlezinger:
Learned Kalman Filtering in Latent Space with High-Dimensional Data. 1-5 - Jee-Weon Jung, Hee-Soo Heo, Bong-Jin Lee, Jaesung Huh, Andrew Brown, Youngki Kwon, Shinji Watanabe, Joon Son Chung:
In Search of Strong Embedding Extractors for Speaker Diarisation. 1-5 - Philippe Gonzalez, Tommy Sonne Alstrøm, Tobias May:
On Batching Variable Size Inputs for Training End-to-End Speech Enhancement Systems. 1-5 - Weichen Xu, Tianhao Fu:
Overcoming the Seesaw in Monocular 3D Object Detection Via Language Knowledge Transferring. 1-5 - Natsuki Akaishi, Kohei Yatabe, Yasuhiro Oikawa:
Improving Phase-Vocoder-Based Time Stretching by Time-Directional Spectrogram Squeezing. 1-5 - Talha Bozkus, Urbashi Mitra:
Ensemble Graph Q-Learning for Large Scale Networks. 1-5 - Daria Botvynko, Carlos Granero-Belinchón, Simon Van Gennip, Abdesslam Benzinou, Ronan Fablet:
Deep Learning for Lagrangian Drift Simulation at The Sea Surface. 1-5 - Boyang Lyu, Thuan Nguyen, Matthias Scheutz, Prakash Ishwar, Shuchin Aeron:
A Principled Approach to Model Validation in Domain Generalization. 1-5 - Yongchang Li, Juncheng Jia, Yan Zuo, Weipeng Zhu:
TinyOOD: Effective Out-of-Distribution Detection for TinyML. 1-5 - Yalan Ye, Yutuo He, Wanjing Huang, Qiaosen Dong, Chong Wang, Guoqing Wang:
Cross-Subject Mental Fatigue Detection based on Separable Spatio-Temporal Feature Aggregation. 1-2 - Xingchen Song, Di Wu, Zhiyong Wu, Binbin Zhang, Yuekai Zhang, Zhendong Peng, Wenpeng Li, Fuping Pan, Changbao Zhu:
TrimTail: Low-Latency Streaming ASR with Simple But Effective Spectrogram-Level Length Penalty. 1-5 - Ngai-Wing Kwong, Yui-Lam Chan, Sik-Ho Tsang, Daniel Pak-Kong Lun:
Optimized Quality Feature Learning for Video Quality Assessment. 1-5 - Po-Wei Chen, Von-Wun Soo:
A Few Shot Learning of Singing Technique Conversion Based on Cycle Consistency Generative Adversarial Networks. 1-5 - Hongbo Chen, Dongchen Zhu, Guanghui Zhang, Wenjun Shi, Xiaolin Zhang, Jiamao Li:
CM-CS: Cross-Modal Common-Specific Feature Learning For Audio-Visual Video Parsing. 1-5 - Andriy Enttsel, Filippo Martinini, Alex Marchioni, Mauro Mangia, Riccardo Rovatti, Gianluca Setti:
Second-Order Statistic Deviation to Model Anomalies in the Design of Unsupervised Detectors. 1-5 - Julien Hauret, Thomas Joubaud, Véronique Zimpfer, Éric Bavu:
EBEN: Extreme Bandwidth Extension Network Applied To Speech Signals Captured With Noise-Resilient Body-Conduction Microphones. 1-5 - Christos Papaioannidis, Ioannis Mademlis, Ioannis Pitas:
Fast Single-Person 2D Human Pose Estimation Using Multi-Task Convolutional Neural Networks. 1-5 - Yining Wang, Ye Hu, Hongyang Du, Tao Luo, Dusit Niyato:
Multi-Agent Reinforcement Learning for Covert Semantic Communications over Wireless Networks. 1-5 - Rui Zhou, Wenye Zhu, Xiaofei Li:
Speech Dereverberation with a Reverberation Time Shortening Target. 1-5 - Sunjae Yoon, Ji Woo Hong, SooHwan Eom, Hee Suk Yoon, Eunseop Yoon, Daehyeok Kim, Junyeong Kim, Chanwoo Kim, Chang D. Yoo:
Counterfactual Two-Stage Debiasing For Video Corpus Moment Retrieval. 1-5 - Ziming Wang, Han Yu, Xiaoguang Zhu, Zengwen Li, Changxue Chen, Liang Song:
Learning 3D Human Pose and Shape Estimation Using Uncertainty-Aware Body Part Segmentation. 1-5 - Xianchao Zhang, Guanglu Wang, Xiaotong Zhang, Han Liu, Zhengxi Yin, Wentao Yang:
TRICL: Triplet Continual Learning. 1-5 - Constantin Patsch, Eckehard G. Steinbach:
Self-Attention Based Action Segmentation Using Intra-And Inter-Segment Representations. 1-5 - Liang Zhao, Zihao Wang, Ziyue Wang, Zhikui Chen:
Multi-View Graph Regularized Deep Autoencoder-Like NMF Framework. 1-5 - Tong Wei, Linlong Wu, Kumar Vijay Mishra, M. R. Bhavani Shankar:
RIS-Aided Wideband DFRC with Reconfigurable Holographic Surface. 1-5 - Arijit Roy, Constantinos Psomas, Ioannis Krikidis:
Wireless Power Transfer Using Chirp Waveforms. 1-5 - Zhuohang Li, Jiaxin Zhang, Jian Liu:
Speech Privacy Leakage from Shared Gradients in Distributed Learning. 1-5 - Pierre-Michel Bousquet, Mickael Rouvier:
Jeffreys Divergence-Based Regularization of Neural Network Output Distribution Applied to Speaker Recognition. 1-5 - Jennifer Williams, Vahid Yazdanpanah, Sebastian Stein:
Privacy-Preserving Occupancy Estimation. 1-5 - Han Chen, Yan Song, Zhu Zhuo, Yu Zhou, Yu-Hong Li, Hui Xue, Ian McLoughlin:
An Effective Anomalous Sound Detection Method Based on Representation Learning with Simulated Anomalies. 1-5 - Mingbin Xu, Congzheng Song, Ye Tian, Neha Agrawal, Filip Granqvist, Rogier C. van Dalen, Xiao Zhang, Arturo Argueta, Shiyi Han, Yaqiao Deng, Leo Liu, Anmol Walia, Alex Jin:
Training Large-Vocabulary Neural Language Models by Private Federated Learning for Resource-Constrained Devices. 1-5 - Lei Liu, Li Liu:
Cross-Modal Mutual Learning for Cued Speech Recognition. 1-5 - Behrad Soleimani, Henning F. Schepker, Majid Mirbagheri:
Neural-AFC: Learning-Based Step-Size Control for Adaptive Feedback Cancellation with Closed-Loop Model Training. 1-5 - L. Yashvanth, Chandra R. Murthy:
Comparative Study of IRS Assisted Opportunistic Communications Over i.i.d. and LoS Channels. 1-5 - Pavel Rumiantsev, Mark Coates:
Performing Neural Architecture Search Without Gradients. 1-5 - Anda Cheng, Jian Cheng:
APGP: Accuracy-Preserving Generative Perturbation for Defending Against Model Cloning Attacks. 1-5 - Xiaorui Wang, Jun Wang, Xin Tang, Peng Gao, Rui Fang, Guotong Xie:
Filter Pruning Via Filters Similarity in Consecutive Layers. 1-5 - Panagiotis Koromilas, Mihalis A. Nicolaou, Theodoros Giannakopoulos, Yannis Panagakis:
MMATR: A Lightweight Approach for Multimodal Sentiment Analysis Based on Tensor Methods. 1-5 - Zeping Min, Qian Ge, Guanhua Huang:
SAN: A Robust End-to-End ASR Model Architecture. 1-5 - Seungheon Doh, Minz Won, Keunwoo Choi, Juhan Nam:
Toward Universal Text-To-Music Retrieval. 1-5 - Ha Minh Tan, Duc-Quang Vu, Jia-Ching Wang:
Selinet: A Lightweight Model for Single Channel Speech Separation. 1-5 - Neil Irwin Bernardo, Jingge Zhu, Yonina C. Eldar, Jamie S. Evans:
Hardware-Limited Non-Uniform Task-Based Quantizers. 1-5 - Siyu Sun, Jian Jin, Zhe Han, Xianjun Xia, Li Chen, Yijian Xiao, Piao Ding, Shenyi Song, Roberto Togneri, Haijian Zhang:
A Lightweight Fourier Convolutional Attention Encoder for Multi-Channel Speech Enhancement. 1-5 - Stepan Mazokha, Sanaz Naderi, Georgios I. Orfanidis, George Sklivanitis, Dimitris A. Pados, Jason O. Hallstrom:
Single-Sample Direction-of-Arrival Estimation for Fast and Robust 3D Localization With Real Measurements from a Massive MIMO System. 1-5 - Huajian Fang, Timo Gerkmann:
Uncertainty Estimation in Deep Speech Enhancement Using Complex Gaussian Mixture Models. 1-5 - Saumya Y. Sahai, Jing Liu, Thejaswi Muniyappa, Kanthashree Mysore Sathyendra, Anastasios Alexandridis, Grant P. Strimel, Ross McGowan, Ariya Rastrow, Feng-Ju Chang, Athanasios Mouchtaris, Siegfried Kunzmann:
Dual-Attention Neural Transducers for Efficient Wake Word Spotting in Speech Recognition. 1-5 - Zhaoxi Mu, Xinyu Yang, Xiangyuan Yang, Wenjing Zhu:
A Multi-Stage Triple-Path Method For Speech Separation in Noisy and Reverberant Environments. 1-5 - Jin-Peng Lan, Zhi-Qi Cheng, Jun-Yan He, Chenyang Li, Bin Luo, Xu Bao, Wangmeng Xiang, Yifeng Geng, Xuansong Xie:
Procontext: Exploring Progressive Context Transformer for Tracking. 1-5 - Zhangping Liu, Bin Liu, Zhiwei Zhao, Qi Chu, Nenghai Yu:
Dual-Uncertainty Guided Curriculum Learning and Part-Aware Feature Refinement for Domain Adaptive Person Re-Identification. 1-5 - Christos Plachouras, Marius Miron:
Music Rearrangement Using Hierarchical Segmentation. 1-5 - Shujie Hu, Xurong Xie, Zengrui Jin, Mengzhe Geng, Yi Wang, Mingyu Cui, Jiajun Deng, Xunying Liu, Helen Meng:
Exploring Self-Supervised Pre-Trained ASR Models for Dysarthric and Elderly Speech Recognition. 1-5 - Xuan Li, Li Zhang:
Geometric Matrix Completion with Collaborative Routing Between Capsules. 1-5 - Yuxuan Wu, Yifan He, Xinlu Liu, Yi Wang, Roger B. Dannenberg:
Transplayer: Timbre Style Transfer with Flexible Timbre Control. 1-5 - Davide Albertini, Gioele Greco, Alberto Bernardini, Augusto Sarti:
Diffusion-Based Sound Source Localization Using Networks of Planar Microphone Arrays. 1-5 - Hang Wang, Sahar Karami, Ousmane Dia, Hippolyt Ritter, Ehsan Emamjomeh-Zadeh, Jiahui Chen, Zhen Xiang, David J. Miller, George Kesidis:
Training Set Cleansing of Backdoor Poisoning by Self-Supervised Representation Learning. 1-5 - Ruoshu Wang, Shengfa Miao, Di Liu, Xin Jin, Weisheng Zhang:
Multi-Layer Seasonal Perception Network for Time Series Forecasting. 1-5 - Bonan Ding, Jin Xie, Jing Nie:
C2BN: Cross-Modality and Cross-Scale Balance Network for Multi-Modal 3D Object Detection. 1-5 - Shashank P, B. N. Bharath:
Online Caching with Fetching Cost for Arbitrary Demand Pattern: A Drift-Plus-Penalty Approach. 1-5 - Panagiotis Tzirakis, Alice Baird, Jeffrey A. Brooks, Christopher Gagne, Lauren Kim, Michael Opara, Christopher B. Gregory, Jacob Metrick, Garrett Boseck, Vineet Tiruvadi, Björn W. Schuller, Dacher Keltner, Alan Cowen:
Large-Scale Nonverbal Vocalization Detection Using Transformers. 1-5 - Kyungho Kim, Junseo Lee, Jihwa Lee:
Image Generation is May All You Need for VQA. 1-5 - Hiroki Okamura, Keisuke Maeda, Ren Togo, Takahiro Ogawa, Miki Haseyama:
Improving Dropout in Graph Convolutional Networks for Recommendation via Contrastive Loss. 1-5 - Svantje Voit, Gerald Enzner:
Neural Network Models with Integrated Training and Adaptation For Nonlinear Acoustic System Identification. 1-5 - Yizhe Zhu, Chunhui Zhang, Qiong Liu, Xi Zhou:
Audio-Driven Talking Head Video Generation with Diffusion Model. 1-5 - Hao Zhang, Nianwen Si, Yaqi Chen, Wenlin Zhang, Xukui Yang, Dan Qu, Zhen Li:
Decoupled Non-Parametric Knowledge Distillation for End-to-End Speech Translation. 1-5 - Hui Wang, Jie Sun, Tianyu Wo, Xudong Liu:
FED-3DA: A Dynamic and Personalized Federated Learning Framework. 1-5 - Binghuai Lin, Liyuan Wang:
Multi-Modal ASR Error Correction with Joint ASR Error Detection. 1-5 - Pablo Alonso-Jiménez, Xavier Favory, Hadrien Foroughmand, Grigoris Bourdalas, Xavier Serra, Thomas Lidy, Dmitry Bogdanov:
Pre-Training Strategies Using Contrastive Learning and Playlist Information for Music Classification and Similarity. 1-5 - Peichao Wang, Qian He:
Heart Rate Estimation and Performance Analysis using MIMO Radar with Dispersed Antennas. 1-5 - Alexey Shovkun, Andrey Kiryasov, Ilya Zakharov, Mariam Khayretdinova:
Optimization of the Deep Neural Networks for Seizure Detection. 1-2 - Cyprien Gille, Frédéric Guyard, Marc Antonini, Michel Barlaud:
Learning Sparse Auto-Encoders for Green AI Image Coding. 1-5 - Ee-Leng Tan, Santi Peksi, Woon-Seng Gan:
Implementing Continuous HRTF Measurement in Near-Field. 1-5 - Muyang Yi, Dong Liang, Rui Wang, Yue Ding, Hongtao Lu:
Spammer Detection on Short Video Applications: A New Challenge and Baselines. 1-5 - Wei Kang, Liyong Guo, Fangjun Kuang, Long Lin, Mingshuang Luo, Zengwei Yao, Xiaoyu Yang, Piotr Zelasko, Daniel Povey:
Fast and Parallel Decoding for Transducer. 1-5 - Chenyue Zhang, Hang Chen, Jun Du, Bao-Cai Yin, Jia Pan, Chin-Hui Lee:
Incorporating Visual Information Reconstruction into Progressive Learning for Optimizing Audio-Visual Speech Enhancement. 1-5 - Nasimuddin Ahmed, Shivam Singhal, Aniruddha Sinha, Avik Ghose:
A Patient Invariant Model Towards the Prediction of Freezing of Gait. 1-5 - Amin Radbord, Italo Atzeni, Antti Tölli:
Multi-User Data Detection in Massive MIMO with 1-Bit ADCs. 1-5 - Rajarshi Saha, Mert Pilanci, Andrea J. Goldsmith:
Low Precision Representations for High Dimensional Models. 1-5 - He-Yen Hsieh, Ding-Jie Chen, Cheng-Wei Chang, Tyng-Luh Liu:
One-Shot Action Detection via Attention Zooming In. 1-5 - Ailin Li, Lei Zhao, Zhiwen Zuo, Zhizhong Wang, Wei Xing, Dongming Lu:
CRFAST: Clip-Based Reference-Guided Facial Image Semantic Transfer. 1-5 - Jing-Yu Liu, Yan-Ming Zhang, Fei Yin, Cheng-Lin Liu:
Streaming Stroke Classification of Online Handwriting. 1-5 - Yonas Sium, Georgios Kollias, Tsuyoshi Idé, Payel Das, Naoki Abe, Aurélie C. Lozano, Qi Li:
Direction Aware Positional and Structural Encoding for Directed Graph Neural Networks. 1-5 - Mian Zhang, Xiabing Zhou, Wenliang Chen, Min Zhang:
Emotion Recognition in Conversation from Variable-Length Context. 1-5 - Pierre Guiraud, Alastair H. Moore, Rebecca R. Vos, Patrick A. Naylor, Mike Brookes:
The MBSTOI Binaural Intelligibility Metric Using a Close-Talking Microphone Reference. 1-5 - Jiaxing Lin, Runxin Xu, Baobao Chang:
TABLEIE: Capturing the Interactions Among Sub-Tasks in Information Extraction via Double Tables. 1-5 - Keon Lee, Kyumin Park, Daeyoung Kim:
DailyTalk: Spoken Dialogue Dataset for Conversational Text-to-Speech. 1-5 - Hyeonggon Ryu, Arda Senocak, In So Kweon, Joon Son Chung:
Hindi as a Second Language: Improving Visually Grounded Speech with Semantically Similar Samples. 1-5 - Kin Wai Cheuk, Ryosuke Sawata, Toshimitsu Uesaka, Naoki Murata, Naoya Takahashi, Shusuke Takahashi, Dorien Herremans, Yuki Mitsufuji:
Diffroll: Diffusion-Based Generative Music Transcription with Unsupervised Pretraining Capability. 1-5 - Marcos A. Cantu, Volker Hohmann:
Spectro-Temporal Post-Filtering Via Short-Time Target Cancellation for Directional Speech Enhancement in a Dual-Microphone Hearing Aid. 1-5 - Hao Xie, Weizhe Yuan, Bin Kang, Songlin Du:
CFFMixer: Multi-Dimensional Feature Fusion for Object Detection. 1-5 - Heming Wang, DeLiang Wang:
Cross-Domain Diffusion Based Speech Enhancement for Very Noisy Speech. 1-5 - Kristina Tesch, Timo Gerkmann:
Spatially Selective Deep Non-Linear Filters For Speaker Extraction. 1-5 - Yang Liu, Yangyang Shi, Yun Li, Kaustubh Kalgaonkar, Sriram Srinivasan, Xin Lei:
SCA: Streaming Cross-Attention Alignment For Echo Cancellation. 1-5 - Daofeng Liu, Fan Lyu, Linyan Li, Zhenping Xia, Fuyuan Hu:
Centroid Distance Distillation for Effective Rehearsal in Continual Learning. 1-5 - Anirudh Rayas, Rajasekhar Anguluri, Jiajun Cheng, Gautam Dasarathy:
Differential Analysis for Networks Obeying Conservation Laws. 1-5 - Ahmed Mustafa, Jean-Marc Valin, Jan Büthe, Paris Smaragdis, Mike Goodwin:
Framewise WaveGAN: High Speed Adversarial Vocoder in Time Domain with Very Low Computational Complexity. 1-5 - Jookyung Song, Yeonjin Chang, Seonguk Park, Nojun Kwak:
Semantics-Guided Object Removal for Facial Images: with Broad Applicability and Robust Style Preservation. 1-5 - Nguyen Phan, Ta Duc Huy, Soan Thi Minh Duong, Nguyen Hoang Tran, Sam Tran, Dao Huu Hung, Chanh D. Tr. Nguyen, Trung H. Bui, Steven Q. H. Truong:
LoGoViT: Local-Global Vision Transformer for Object Re-Identification. 1-5 - Chaewon Park, Minhyeok Lee, Suhwan Cho, Donghyeong Kim, Sangyoun Lee:
Two-Stream Decoder Feature Normality Estimating Network for Industrial Anomaly Detection. 1-5 - Alec Wright, Vesa Välimäki, Lauri Juvela:
Adversarial Guitar Amplifier Modelling with Unpaired Data. 1-5 - Huadong Tang, Youpeng Zhao, Yingying Jiang, Zhuoxin Gan, Qiang Wu:
Class-Aware Contextual Information for Semantic Segmentation. 1-5 - Zhong-Qiu Wang, Samuele Cornell, Shukjae Choi, Younglo Lee, Byeong-Yeol Kim, Shinji Watanabe:
Neural Speech Enhancement with Very Low Algorithmic Latency and Complexity via Integrated Full- and Sub-Band Modeling. 1-5 - Baptiste Magnier, Ghulam Sakhi Shokouh, Louis Berthier, Marcel Pie, Adrien Ruggiero:
2DSBG: A 2D Semi Bi-Gaussian Filter Adapted for Adjacent and Multi-Scale Line Feature Detection. 1-5 - Dong Chen, Duoqian Miao, Xue Rong Zhao:
Hyneter: Hybrid Network Transformer for Object Detection. 1-5 - Li-Wei Chen, Shinji Watanabe, Alexander Rudnicky:
A Unified One-Shot Prosody and Speaker Conversion System with Self-Supervised Discrete Speech Units. 1-5 - Zhongjie Yu, Shuyang Wang, Lin Chen, Zhongwei Cheng:
Halluaudio: Hallucinate Frequency as Concepts For Few-Shot Audio Classification. 1-5 - Alexey Sholokhov, Nikita Kuzmin, Kong Aik Lee, Eng Siong Chng:
Probabilistic Back-ends for Online Speaker Recognition and Clustering. 1-5 - Yuhao Liu, Chen Cui, Marzieh Ajirak, Petar M. Djuric:
Estimation of Time-Varying Graph Topologies from Graph Signals. 1-5 - Pranav U. Damale, Edwin K. P. Chong, Louis L. Scharf:
Wiener Filtering Without Covariance Matrix Inversion. 1-5 - Yanjue Song, Nilesh Madhu:
Aiding Speech Harmonic Recovery in DNN-Based Single Channel Noise Reduction Using Cepstral Excitation Manipulation (CEM) Components. 1-5 - Vadim Popov, Amantur Amatov, Mikhail A. Kudinov, Vladimir Gogoryan, Tasnima Sadekova, Ivan Vovk:
Optimal Transport in Diffusion Modeling for Conversion Tasks in Audio Domain. 1-5 - Meng Liu, Ran Yi, Lizhuang Ma:
EMCLR: Expectation Maximization Contrastive Learning Representations. 1-5 - Weimin Wang, Qiong Chang:
VAN-ICP: GPU-Accelerated Approximate Nearest Neighbor Search for ICP Registration via Voxel Dilation. 1-5 - Huijie Guo, Lei Shi:
Ultimate Negative Sampling for Contrastive Learning. 1-5 - Yisi Liu, Peter Wu, Alan W. Black, Gopala Krishna Anumanchipalli:
A Fast and Accurate Pitch Estimation Algorithm Based on the Pseudo Wigner-Ville Distribution. 1-5 - Yuantian Huang, Satoshi Iizuka, Kazuhiro Fukui:
Free-View Expressive Talking Head Video Editing. 1-5 - Jinjin Guo, Zhichao Huang, Guangning Xu, Bowen Zhang, Chaoqun Duan:
Knowledge-Aware Few Shot Learning for Event Detection from Short Texts. 1-5 - Yuvraj Singh, Jahnvi Singh Rohela, Satish Mulleti:
Sample-Efficient Robust MMV Recovery Algorithm. 1-5 - W. Ronny Huang, Shuo-Yiin Chang, Tara N. Sainath, Yanzhang He, David Rybach, Robert David, Rohit Prabhavalkar, Cyril Allauzen, Cal Peyser, Trevor D. Strohman:
E2E Segmentation in a Two-Pass Cascaded Encoder ASR Model. 1-5 - Yunbo Qiu, Yue Jin, Lebin Yu, Jian Wang, Xudong Zhang:
Promoting Cooperation in Multi-Agent Reinforcement Learning via Mutual Help. 1-5 - Yuchen Liang, Venugopal V. Veeravalli:
Quickest Change Detection with Leave-one-out Density Estimation. 1-5 - Ting Zou, Zhong Qian, Peifeng Li, Qiaoming Zhu:
Cross-Modal Adversarial Contrastive Learning for Multi-Modal Rumor Detection. 1-5 - Zhenxi Song, Zian Pei, Huixia Ren, Lin Zhu, Yi Guo, Zhiguo Zhang:
Disambiguation of Cognitive Impairment Diagnosis with EEG-Based Dual-Contrastive Learning. 1-5 - Lin Li, Peipei Wang, Xinhao Zheng, Qing Xie:
Code-Enhanced Fine-Grained Semantic Matching For Tag Recommendation In Software Information Sites. 1-5 - Qiulin Wang, Wenxuan Hu, Lin Li, Qingyang Hong:
Meta Learning with Adaptive Loss Weight for Low-Resource Speech Recognition. 1-5 - Félix Gontier, Romain Serizel, Christophe Cerisara:
Spice+: Evaluation of Automatic Audio Captioning Systems with Pre-Trained Language Models. 1-5 - Yongjin Yuan, Zheng Wang, Feiping Nie, Xuelong Li:
Unsupervised Feature Selection with Self-Weighted and ℓ2,0-Norm Constraint. 1-5 - Biao Fu, Peigen Ye, Liang Zhang, Pei Yu, Cong Hu, Xiaodong Shi, Yidong Chen:
A Token-Level Contrastive Framework for Sign Language Translation. 1-5 - Minghe Zhu, Lei Li, Shuqiang Xia, Tsung-Hui Chang:
Information and Sensing Beamforming Optimization for Multi-User Multi-Target MIMO ISAC Systems. 1-5 - Lei Lin, Shuangtao Li, Xiaodong Shi:
LEAPT: Learning Adaptive Prefix-to-Prefix Translation For Simultaneous Machine Translation. 1-5 - Yang Ai, Zhen-Hua Ling:
Neural Speech Phase Prediction Based on Parallel Estimation Architecture and Anti-Wrapping Losses. 1-5 - Tin-Han Chi, Kai-Chun Liu, Chia-Yeh Hsieh, Yu Tsao, Chia-Tai Chan:
PreFallKD: Pre-Impact Fall Detection Via CNN-ViT Knowledge Distillation. 1-5 - Masaya Kawamura, Yuma Shirahata, Ryuichi Yamamoto, Kentaro Tachibana:
Lightweight and High-Fidelity End-to-End Text-to-Speech with Multi-Band Generation and Inverse Short-Time Fourier Transform. 1-5 - Hui Li, Jinghan Jia, Shijun Liang, Yuguang Yao, Saiprasad Ravishankar, Sijia Liu:
SMUG: Towards Robust MRI Reconstruction by Smoothed Unrolling. 1-5 - Vassilis Kalantzis, Georgios Kollias, Shashanka Ubaru, Theodoros Salonidis:
Accelerating Matrix Trace Estimation by Aitken's Δ2 Process. 1-5 - Chi Wang, Jian Gao, Yang Hua, Hui Wang:
Flowreg: Latent Space Regularization Using Normalizing Flow For Limited Samples Learning. 1-5 - Andreas Zinonos, Alexandros Haliassos, Pingchuan Ma, Stavros Petridis, Maja Pantic:
Learning Cross-Lingual Visual Speech Representations. 1-5 - Julian Wechsler, Srikanth Raj Chetupalli, Wolfgang Mack, Emanuël A. P. Habets:
Multi-Microphone Speaker Separation by Spatial Regions. 1-5 - Jingyang Yuan, Xiao Luo, Yifang Qin, Yusheng Zhao, Wei Ju, Ming Zhang:
Learning on Graphs under Label Noise. 1-5 - Ruijie Tao, Kong Aik Lee, Zhan Shi, Haizhou Li:
Speaker Recognition with Two-Step Multi-Modal Deep Cleansing. 1-5 - Fred Goodyer, Bashar I. Ahmad, Simon J. Godsill:
GaPP: Multi-Target Tracking with Gaussian Processes. 1-5 - Bima Prihasto, Yi-Xing Lin, Phuong Thi Le, Chien-Lin Huang, Jia-Ching Wang:
CNEG-VC: Contrastive Learning Using Hard Negative Example In Non-Parallel Voice Conversion. 1-5 - Valentina Sanguineti, Sanket Kumar Thakur, Pietro Morerio, Alessio Del Bue, Vittorio Murino:
Audio-Visual Inpainting: Reconstructing Missing Visual Information with Sound. 1-5 - Yingkang Cao, Xiaodi Wu:
Distributed Quantum Sensing Network with Geographically Constrained Measurement Strategies. 1-5 - Lin Zhan, Jiayuan Fan, Peng Ye, Jianjian Cao:
A2S-NAS: Asymmetric Spectral-Spatial Neural Architecture Search for Hyperspectral Image Classification. 1-5 - Yusheng Huang, Jiexing Qi, Xinbing Wang, Zhouhan Lin:
Asymmetric Polynomial Loss for Multi-Label Classification. 1-5 - Sheng Yang, Yiming Li, Yong Jiang, Shu-Tao Xia:
Backdoor Defense via Suppressing Model Shortcuts. 1-5 - Dafeng Zhang, Jiangbo Guo, Zhezhu Jin:
Context-Aware Face Clustering with Graph Convolutional Networks. 1-5 - Steven Davy, Niamh Belton, Joshua Tobin, Owais Bin Zuber, Liu Dong, Yuan Xuewen:
A Causal Convolutional Approach for Packet Loss Concealment in Low Powered Devices. 1-5 - Jianyu Xiong, Tao Dai, Yaohua Zha, Xin Wang, Shu-Tao Xia:
Semantic Preserving Learning for Task-Oriented Point Cloud Downsampling. 1-5 - Katrin Tomanek, Katie Seaver, Pan-Pan Jiang, Richard Cave, Lauren Harrell, Jordan R. Green:
An Analysis of Degenerating Speech Due to Progressive Dysarthria on ASR Performance. 1-5 - Xurong Xie, Xunying Liu, Hui Chen, Hongan Wang:
Unsupervised Model-Based Speaker Adaptation of End-To-End Lattice-Free MMI Model for Speech Recognition. 1-5 - Xiaomeng Wu, Yongqing Sun, Akisato Kimura:
Deep Quantigraphic Image Enhancement via Comparametric Equations. 1-5 - Sourav Mishra, Suresh Sundaram:
A Memory-Free Evolving Bipolar Neural Network for Efficient Multi-Label Stream Learning. 1-5 - Nan Su, Bingzhu Du, Yuchi Zhang, Chao Liu, Yongliang Wang, Hong Chen, Xin Lu:
Precognition in Contextual Spoken Language Understanding via Knowledge Distillation. 1-5 - Ping Hu, Virginia Bordignon, Mert Kayaalp, Ali H. Sayed:
Performance of Social Machine Learning Under Limited Data. 1-5 - Minghan Wang, Yinglu Li, Jiaxin Guo, Xiaosong Qiao, Chang Su, Min Zhang, Shimin Tao, Hao Yang:
Zephyr: Zero-Shot Punctuation Restoration. 1-5 - Lyndon R. Duong, Bohan Li, Cheng Chen, Jingning Han:
Multi-Rate Adaptive Transform Coding for Video Compression. 1-5 - Toros Arikan, Amir Weiss, Hari Vishnu, Grant B. Deane, Andrew C. Singer, Gregory W. Wornell:
Learning Environmental Structure Using Acoustic Probes with a Deep Neural Network. 1-5 - Jubum Han, Mateusz Matuszewski, Olaf Sikorski, Hosang Sung, Hoonyoung Cho:
Randmasking Augment: A Simple and Randomized Data Augmentation For Acoustic Scene Classification. 1-5 - Berken Utku Demirel, Khaldoon Al-Naimi, Fahim Kawsar, Alessandro Montanari:
Cancelling Intermodulation Distortions for Otoacoustic Emission Measurements with Earbuds. 1-5 - Qianyu Yang, Anna Guerra, Francesco Guidi, Nir Shlezinger, Haiyang Zhang, Davide Dardari, Baoyun Wang, Yonina C. Eldar:
Near-field Localization with Dynamic Metasurface Antennas. 1-5 - Hassaan Hashmi, Spyridon Pougkakiotis, Dionysios S. Kalogerias:
Model-Free Learning of Optimal Beamformers for Passive IRS-Assisted Sumrate Maximization. 1-5 - Rajsuryan Singh, Pablo Zinemanas, Xavier Serra, Juan Pablo Bello, Magdalena Fuentes:
Flowgrad: Using Motion for Visual Sound Source Localization. 1-5 - Su Yan, Herman Verinaz-Jadan, Junjie Huang, Nathan Daly, Catherine Higgitt, Pier Luigi Dragotti:
Super-Resolution for Macro X-Ray Fluorescence Data Collected from Old Master Paintings. 1-5 - Jian Wu, Liping Wang, Hailin Pan, Binyu Wang:
MLCGAN: Multi-Lead ECG Synthesis with Multi Label Conditional Generative Adversarial Network. 1-5 - Rongzhen Li, Jiang Zhong, Zhongxuan Xue, Qizhu Dai, Chen Wang, Xue Li:
Commdre: Document-Level Relation Extraction with Self-Supervised Commonsense Learning. 1-5 - Tosiron Adegbija:
Jazznet: A Dataset of Fundamental Piano Patterns for Music Audio Machine Learning Research. 1-5 - Ruohui Zheng, Libao Zhang:
UAV Remote Sensing Image Dehazing Based on Multi-Dimensional Saliency Awareness Unequal Network. 1-5 - Shimin Huang, Cheolkon Jung, Yang Liu, Ming Li:
CNN Filter for Super-Resolution with RPR Functionality in VVC. 1-5 - Hayato Terao, Wataru Noguchi, Hiroyuki Iizuka, Masahito Yamamoto:
Efficient Compressed Video Action Recognition Via Late Fusion with a Single Network. 1-5 - Yang Liu, Di Li, Wei Zhu, Dingkang Yang, Jing Liu, Liang Song:
MSN-net: Multi-Scale Normality Network for Video Anomaly Detection. 1-5 - Yao Zhou, Ruidan Su, Shikui Tu, Lei Xu:
A Deep Temporal Factor Analysis Method for Large Scale Financial Portfolio Selection. 1-5 - Ignacio Hounie, Juan Elenter, Alejandro Ribeiro:
Neural Networks with Quantization Constraints. 1-5 - Yifei Xin, Xiulian Peng, Yan Lu:
Improving Speech Enhancement via Event-Based Query. 1-5 - Qianying Liu, Chaitanya Kaul, Jun Wang, Christos Anagnostopoulos, Roderick Murray-Smith, Fani Deligianni:
Optimizing Vision Transformers for Medical Image Segmentation. 1-5 - Xiaoyu Liu, Hanlin Lu, Jianbo Yuan, Xinyu Li:
CAT: Causal Audio Transformer for Audio Classification. 1-5 - Berk Iskender, Marc Louis Klasky, Brian M. Patterson, Yoram Bresler:
Factorized Projection-Domain Spatio-Temporal Regularization for Dynamic Tomography. 1-5 - Yangyi Liu, Sadaf Salehkalaibar, Stefano Rini, Jun Chen:
M22: Rate-Distortion Inspired Gradient Compression. 1-5 - Haobin Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao:
QI-TTS: Questioning Intonation Control for Emotional Speech Synthesis. 1-5 - Usman Hassan, Dongjie Chen, Sen-Ching S. Cheung, Chen-Nee Chuah:
HE-GAN: Differentially Private GAN Using Hamiltonian Monte Carlo Based Exponential Mechanism. 1-5 - Shuai Wang, Zipei Yan, Daoan Zhang, Haining Wei, Zhongsen Li, Rui Li:
Prototype Knowledge Distillation for Medical Segmentation with Missing Modality. 1-5 - Zahra Hafezi Kafshgari, Ivan V. Bajic, Parvaneh Saeedi:
Smart Split-Federated Learning over Noisy Channels for Embryo Image Segmentation. 1-5 - Thomas Strypsteen, Alexander Bertrand:
Neural Source Coding For Bandwidth-Efficient Brain-Computer Interfacing With Wireless Neuro-Sensor Networks. 1-5 - Philipp Scholl, Aras Bacho, Holger Boche, Gitta Kutyniok:
The Uniqueness Problem of Physical Law Learning. 1-5 - Masato Hagiwara, Benjamin Hoffman, Jen-Yu Liu, Maddie Cusimano, Felix Effenberger, Katie Zacarian:
BEANS: The Benchmark of Animal Sounds. 1-5 - Marcelo Alejandro Colominas, Sylvain Meignen:
Making Synchrosqueezing Locally Adaptive in The Time-Frequency Plane. 1-5 - Changhong Wang, Vincent Lostanlen, Mathieu Lagrange:
Explainable audio Classification of Playing Techniques with Layer-wise Relevance Propagation. 1-5 - Vinay Gupta, Laxmidhar Behera, Tushar Sandhan:
Graph Based Semantic Ensemble of Riemannian Neural Structured Learning for BCI-EEG Signal Classification. 1-5 - Cyprien Doz, Chengfang Ren, Jean Philippe Ovarlez, Romain Couillet:
Large Dimensional Analysis of LS-SVM Transfer Learning: Application to PolSAR Classification. 1-5 - Fangzhou Wang, A. Lee Swindlehurst:
Hybrid RIS-Assisted Interference Mitigation for Spectrum Sharing. 1-5 - Shulin He, Wei Rao, Jinjiang Liu, Jun Chen, Yukai Ju, Xueliang Zhang, Yannan Wang, Shidong Shang:
Speech Enhancement with Intelligent Neural Homomorphic Synthesis. 1-5 - Yiran Song, Lizhuang Ma:
CLMAE: A Liter and Faster Masked Autoencoders. 1-5 - Peng Sun, Jie Su, Zhenyu Wen, Yejian Zhou, Zhen Hong, Shanqing Yu, Huaji Zhou:
Boosting Signal Modulation Few-Shot Learning with Pre-Transformation. 1-5 - Théo Deschamps-Berger, Lori Lamel, Laurence Devillers:
Exploring Attention Mechanisms for Multimodal Emotion Recognition in an Emergency Call Center Corpus. 1-5 - Nguyen Minh Tran, Muhammad Miftahul Amri, Je Hyeon Park, Dong In Kim, Kae Won Choi:
An Efficient Beam-Sharing Algorithm for RIS-aided Simultaneous Wireless Information and Power Transfer Applications. 1-5 - Guanqun Bi, Yanan Cao, Piji Li, Yuqiang Xie, Fang Fang, Zheng Lin:
Seri: Sketching-Reasoning-Integrating Progressive Workflow for Empathetic Response Generation. 1-5 - Liang Zhao, Qiongjie Xie, Sontao Wu, Shubin Ma:
An End-to-End Framework for Partial View-Aligned Clustering with Graph Structure. 1-5 - Nan Jing, Yu Zhang:
A Lightweight Convolutional Neural Network using Feature Filtering Module. 1-5 - Zhe Wang, Shilong Wu, Hang Chen, Mao-Kui He, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Baocai Yin, Jia Pan, Jianqing Gao, Cong Liu:
The Multimodal Information Based Speech Processing (MISP) 2022 Challenge: Audio-Visual Diarization And Recognition. 1-5 - Ehsan Raei, Mohammad Alaee-Kerahroodi, Bhavani Shankar, Björn E. Ottersten:
Range-ISL Minimization and Spectral Shaping in MIMO Radar Systems via Waveform Design. 1-5 - Frederik Hoppe, Felix Krahmer, Claudio Mayrink Verdun, Marion I. Menzel, Holger Rauhut:
High-Dimensional Confidence Regions in Sparse MRI. 1-5 - Luca Barbieri, Osvaldo Simeone, Monica Nicoli:
Channel-Driven Decentralized Bayesian Federated Learning for Trustworthy Decision Making in D2D Networks. 1-5 - Junyi Peng, Themos Stafylakis, Rongzhi Gu, Oldrich Plchot, Ladislav Mosner, Lukás Burget, Jan Cernocký:
Parameter-Efficient Transfer Learning of Pre-Trained Transformer Models for Speaker Verification Using Adapters. 1-5 - Zhibin Qiu, Mengfan Fu, Yinfeng Yu, LiLi Yin, Fuchun Sun, Hao Huang:
SRTNET: Time Domain Speech Enhancement via Stochastic Refinement. 1-5 - Haolin Zhuang, Shun Lei, Long Xiao, Weiqin Li, Liyang Chen, Sicheng Yang, Zhiyong Wu, Shiyin Kang, Helen Meng:
GTN-Bailando: Genre Consistent long-Term 3D Dance Generation Based on Pre-Trained Genre Token Network. 1-5 - Van-Thinh Nguyen, Hung-Cuong Pham, Dang-Khoa Mac:
How to Push the Fastest Model 50x Faster: Streaming Non-Autoregressive Speech Synthesis on Resouce-Limited Devices. 1-5 - Yiran Song, Qianyu Zhou, Lizhuang Ma:
Rethinking Implicit Neural Representations For Vision Learners. 1-5 - Hiroki Kanagawa, Yusuke Ijima:
Enhancement of Text-Predicting Style Token With Generative Adversarial Network for Expressive Speech Synthesis. 1-5 - Qingheng Zhang, Haibo Ye, Kaicheng Yu:
Mendam: Multi-Expert Network with Distribution-Aware Momentum for Long-Tailed Recognition. 1-5 - Anna Silnova, Niko Brümmer, Albert Swart, Lukás Burget:
Toroidal Probabilistic Spherical Discriminant Analysis. 1-5 - Filip Elvander:
Estimating Inharmonic Signals with Optimal Transport Priors. 1-5 - Hyungyo Kim, Naresh R. Shanbhag:
Boosting the Accuracy of SRAM-Based in-Memory Architectures Via Maximum Likelihood-Based Error Compensation Method. 1-5 - Bing Han, Zhengyang Chen, Yanmin Qian:
Exploring Binary Classification Loss for Speaker Verification. 1-5 - Xiaolin Zhu, Dongli Wang, Yan Zhou:
Hierarchical Spatial-Temporal Transformer with Motion Trajectory for Individual Action and Group Activity Recognition. 1-5 - Ziping Zhao, Huan Wang, Haishuai Wang, Björn W. Schuller:
Hierarchical Network with Decoupled Knowledge Distillation for Speech Emotion Recognition. 1-5 - Hao Yen, François G. Germain, Gordon Wichern, Jonathan Le Roux:
Cold Diffusion for Speech Enhancement. 1-5 - Minh Tran, Mohammad Soleymani:
A Speech Representation Anonymization Framework via Selective Noise Perturbation. 1-5 - André G. C. Pacheco, Frank A. C. Cabello, Adriana M. O. Fonoff, Paula G. Rodrigues, Otávio A. B. Penatti, Paula R. Pinto:
Towards Low-Power Heart Rate Estimation Based on User's Demographics and Activity Level For Wearables. 1-5 - Jin Sob Kim, Hyun Joon Park, Wooseok Shin, Sung Won Han:
AD-YOLO: You Look Only Once in Training Multiple Sound Event Localization and Detection. 1-5 - Farzan Niknejad Mazandarani, Paul S. Babyn, Javad Alirezaie:
UNeXt: a Low-Dose CT denoising UNet model with the modified ConvNeXt block. 1-5 - Arnav Kundu, Mohammad Samragh, Minsik Cho, Priyanka Padmanabhan, Devang Naik:
HEiMDaL: Highly Efficient Method for Detection and Localization of Wake-Words. 1-5 - Kai Shigemi, Shuji Komeiji, Takumi Mitsuhashi, Yasushi Iimura, Hiroharu Suzuki, Hidenori Sugano, Koichi Shinoda, Kohei Yatabe, Toshihisa Tanaka:
Synthesizing Speech from ECoG with a Combination of Transformer-Based Encoder and Neural Vocoder. 1-5 - Liyong Guo, Xiaoyu Yang, Quandong Wang, Yuxiang Kong, Zengwei Yao, Fan Cui, Fangjun Kuang, Wei Kang, Long Lin, Mingshuang Luo, Piotr Zelasko, Daniel Povey:
Predicting Multi-Codebook Vector Quantization Indexes for Knowledge Distillation. 1-5 - Chao He, Hongxi Wei:
Transformer-Based Deep Hashing Method for Multi-Scale Feature Fusion. 1-5 - Seyed Mehdi Iranmanesh, Sherry X. Chen, Kuo-Chin Lien:
Pair DETR: Toward Faster Convergent DETR. 1-5 - Mazin Hnewa, Alireza Rahimpour, Justin Miller, Devesh Upadhyay, Hayder Radha:
Cross Modality Knowledge Distillation for Robust Pedestrian Detection in Low Light and Adverse Weather Conditions. 1-5 - Dimitris N. Makropoulos, Antigoni Tsiami, Aristides Prospathopoulos, Dimitris Kassis, Alexandros Frantzis, Emmanuel K. Skarsoulis, George Piperakis, Petros Maragos:
Convolutional Recurrent Neural Networks for the Classification of Cetacean Bioacoustic Patterns. 1-5 - Minxiang Ye, Yifei Zhang, Shiqiang Zhu, Anhuan Xie, Senwei Xiang:
Semi-Supervised Domain Generalization with Graph-Based Classifier. 1-5 - Jun Xia, Ge Wang, Bozhen Hu, Cheng Tan, Jiangbin Zheng, Yongjie Xu, Stan Z. Li:
Wordreg: Mitigating the Gap between Training and Inference with Worst-Case Drop Regularization. 1-5 - Wentao Lei, Lei Liu, Li Liu:
Spatio-Temporal Structure Consistency for Semi-Supervised Medical Image Classification. 1-5 - Bi-Cheng Yan, Hsin-Wei Wang, Yi-Cheng Wang, Berlin Chen:
Effective Graph-Based Modeling of Articulation Traits for Mispronunciation Detection and Diagnosis. 1-5 - Jingyu Lin, Yan Yan, Hanzi Wang:
A Dual-Path Transformer Network for Scene Text Detection. 1-5 - Mingliang Zhai, Kang Ni, Jiucheng Xie, Hao Gao:
Cross-Modal Optical Flow Estimation via Modality Compensation and Alignment. 1-5 - Manon Dampfhoffer, Thomas Mesquida, Emmanuel Hardy, Alexandre Valentian, Lorena Anghel:
Leveraging Sparsity with Spiking Recurrent Neural Networks for Energy-Efficient Keyword Spotting. 1-5 - Kun He, Changyu Li, Dongyang Zhang, Jie Shao:
Capturing Cross-Scale Disparity for Stereo Image Super-Resolution. 1-5 - Xiaokang Liu, Zhiqiang Wang, Kai Hu, Xieping Gao:
Pseudo Multi-Source Domain Extension and Selective Pseudo-Labeling for Unsupervised Domain Adaptive Medical Image Segmentation. 1-5 - Zexu Pan, Wupeng Wang, Marvin Borsdorf, Haizhou Li:
ImagineNet: Target Speaker Extraction with Intermittent Visual Cue Through Embedding Inpainting. 1-5 - Mingliang Zhai, Kang Ni, Jiucheng Xie, Hao Gao:
Learning Scene Flow from 3D Point Clouds with Cross-Transformer and Global Motion Cues. 1-5 - Mingyu Shao, Li Lu, Ye Ding, Qing Liao:
Minimising Distortion for GAN-Based Facial Attribute Manipulation. 1-5 - Yosuke Higuchi, Tetsuji Ogawa, Tetsunori Kobayashi, Shinji Watanabe:
BECTRA: Transducer-Based End-To-End ASR with BERT-Enhanced Encoder. 1-5 - Chao Tan, Yang Cao, Sheng Li, Masatoshi Yoshikawa:
General or Specific? Investigating Effective Privacy Protection in Federated Learning for Speech Emotion Recognition. 1-5 - Kleanthis Avramidis, Kranti Adsul, Digbalay Bose, Shrikanth Narayanan:
Signal Processing Grand Challenge 2023 - E-Prevention: Sleep Behavior as an Indicator of Relapses in Psychotic Patients. 1-2 - Haozhao Ma, Chuang Yang, Yuan Yuan, Qi Wang:
Optimal Kernel for Real-Time Arbitrary-Shaped Text Detection. 1-5 - Felix Schwock, Julien A. Bloch, Les Atlas, Shima Abadi, Azadeh Yazdan-Shahmorad:
Estimating and Analyzing Neural Information flow using Signal Processing on Graphs. 1-5 - Maxime Poli, Emmanuel Dupoux, Rachid Riad:
Introducing Topography in Convolutional Neural Networks. 1-5 - Fei Ye, Adrian G. Bors:
Compressing Cross-Domain Representation via Lifelong Knowledge Distillation. 1-5 - Lu Liu, A. Lee Swindlehurst:
Overlay Cognitive Radio Using Symbol Level Precoding With Quantized CSI. 1-5 - Lincon S. Souza, Bojan Batalo, Keisuke Yamazaki:
A Geometric Surrogate for Simulation Calibration. 1-5 - Anderson Nogueira Cotrim, Hélio Pedrini:
Residual Squeeze-and-Excitation U-Shaped Network for Minutia Extraction in Contactless Fingerprint Images. 1-5 - Yuan Tseng, Cheng-I Jeff Lai, Hung-Yi Lee:
Cascading and Direct Approaches to Unsupervised Constituency Parsing on Spoken Sentences. 1-5 - Zhiyuan Zha, Bihan Wen, Xin Yuan, Jiantao Zhou, Ce Zhu:
Hyperspectral Image Denoising Via Nonlocal Rank Residual Modeling. 1-5 - Pranav Kadam, Hardik Prajapati, Min Zhang, Jintang Xue, Shan Liu, C.-C. Jay Kuo:
S3I-PointHop: SO(3)-Invariant PointHop for 3D Point Cloud Classification. 1-5 - Zheyu Wu, Ya-Feng Liu, Bo Jiang, Yu-Hong Dai:
Efficient Quantized Constant Envelope Precoding for Multiuser Downlink Massive MIMO Systems. 1-5 - Mark Lindsey, Tyler Vuong, Richard M. Stern:
Unsupervised Voice Type Discrimination Score Adaptation Using X-Vector Clusters. 1-5 - Kun Song, Yongmao Zhang, Yi Lei, Jian Cong, Hanzhao Li, Lei Xie, Gang He, Jinfeng Bai:
DSPGAN: A GAN-Based Universal Vocoder for High-Fidelity TTS by Time-Frequency Domain Supervision from DSP. 1-5 - Xiaoyu Liu, Xu Li, Joan Serrà:
Quantitative Evidence on Overlooked Aspects of Enrollment Speaker Embeddings for Target Speaker Separation. 1-5 - Songtao Lu, Tian Gao:
Meta-Dag: Meta Causal Discovery Via Bilevel Optimization. 1-5 - Tao Liu, Zhengyang Chen, Yanmin Qian, Kai Yu:
Multi-Speaker End-to-End Multi-Modal Speaker Diarization System for the MISP 2022 Challenge. 1-2 - Marie Kunesová, Zbynek Zajíc:
Multitask Detection of Speaker Changes, Overlapping Speech and Voice Activity Using Wav2vec 2.0. 1-5 - R. A. Borsoi, Isabell Lehmann, Mohammad A. B. S. Akhonda, Vince D. Calhoun, Konstantin Usevich, David Brie, Tülay Adali:
Coupled CP Tensor Decomposition with Shared and Distinct Components for Multi-Task fMRI Data Fusion. 1-5 - Zhongyuan Zhao, Bojan Radojicic, Gunjan Verma, Ananthram Swami, Santiago Segarra:
Delay-Aware Backpressure Routing Using Graph Neural Networks. 1-5 - Jiaming Zhou, Shiwan Zhao, Ning Jiang, Guoqing Zhao, Yong Qin:
MADI: Inter-Domain Matching and Intra-Domain Discrimination for Cross-Domain Speech Recognition. 1-5 - Yifan Zhang, Shaojie Li, Xuan Yang:
Knowledge Distillation with Active Exploration and Self-Attention Based Inter-Class Variation Transfer for Image Segmentation. 1-5 - Yichen Zhang, Shujian Yu, Badong Chen:
Sequential Invariant Information Bottleneck. 1-5 - Jun Qi, Xiao-Lei Zhang, Javier Tejedor:
Optimizing Quantum Federated Learning Based on Federated Quantum Natural Gradient Descent. 1-5 - Yunpeng Li, Yue Hu, Wei Peng, Yuqiang Xie:
Think Before You Speak: Concept-Guided Explicit Persona Reasoning for Personalized Dialogue Generation. 1-5 - Jinjie Ni, Yukun Ma, Wen Wang, Qian Chen, Dianwen Ng, Han Lei, Trung Hieu Nguyen, Chong Zhang, Bin Ma, Erik Cambria:
Adaptive Knowledge Distillation Between Text and Speech Pre-Trained Models. 1-5 - Yiming Sun, Yang Li, Changbo Wang:
Multi-Source Templates Learning for Real-Time Aerial Tracking. 1-5 - Katerina Papadimitriou, Gerasimos Potamianos:
Sign Language Recognition via Deformable 3D Convolutions and Modulated Graph Convolutional Networks. 1-5 - Michael A. Akeroyd, Will Bailey, Jon Barker, Trevor J. Cox, John F. Culling, Simone Graetzer, Graham Naylor, Zuzanna Podwinska, Zehai Tu:
The 2nd Clarity Enhancement Challenge for Hearing Aid Speech Intelligibility Enhancement: Overview and Outcomes. 1-5 - Yusuke Fujita, Tatsuya Komatsu, Robin Scheibler, Yusuke Kida, Tetsuji Ogawa:
Neural Diarization with Non-Autoregressive Intermediate Attractors. 1-5 - Amy Bastine, Thushara D. Abhayapala, Jihui Aimee Zhang:
Room Impulse Response Reconstruction Based on Spatio-Temporal-Spectral Features Learned from a Spherical Microphone Array Measurement. 1-5 - Wei Zhou, Haotian Wu, Jingjing Xu, Mohammad Zeineldeen, Christoph Lüscher, Ralf Schlüter, Hermann Ney:
Enhancing and Adversarial: Improve ASR with Speaker Labels. 1-5 - Mansooreh Montazerin, Elahe Rahimian, Farnoosh Naderkhani, Seyed Farokh Atashzar, Hamid Alinejad-Rokny, Arash Mohammadi:
HYDRA-HGR: A Hybrid Transformer-Based Architecture for Fusion of Macroscopic and Microscopic Neural Drive Information. 1-5 - Songwei Zheng, Dong Zhang, Chunyan Yu, Danhong Zhu, Longlong Zhu, Hao Liu, Zhongzheng Huang:
Vision Transformer with Progressive Tokenization for CT Metal Artifact Reduction. 1-5 - Aaron Master, Lie Lu, Jonas Samuelsson, Heidi-Maria Lehtonen, Scott Norcross, Nathan Swedlow, Audrey Howard:
Deepspace: Dynamic Spatial and Source CUE Based Source Separation for Dialog Enhancement. 1-5 - Rodrigo Mira, Buye Xu, Jacob Donley, Anurag Kumar, Stavros Petridis, Vamsi Krishna Ithapu, Maja Pantic:
LA-VOCE: LOW-SNR Audio-Visual Speech Enhancement Using Neural Vocoders. 1-5 - Hyun-Joon Nam, Hong-June Park:
Pitch Mark Detection from Noisy Speech Waveform Using Wave-U-Net. 1-5 - Jeongmin Chae, Praneeth Narayanamurthy, Selin Bac, Shaama Mallikarjun Sharada, Urbashi Mitra:
Column-Based Matrix Approximation with Quasi-Polynomial Structure. 1-5 - Akash Gupta, Rohun Tripathi, Wondong Jang:
MODEFORMER: Modality-Preserving Embedding For Audio-Video Synchronization Using Transformers. 1-5 - Yu Chen, Mingyu Yang, Hun-Seok Kim:
Search for Efficient Deep Visual-Inertial Odometry Through Neural Architecture Search. 1-5 - Xiangyu Huang, Caidan Zhao, Chenxing Gao, Lvdong Chen, Zhiqiang Wu:
Synthetic Pseudo Anomalies for Unsupervised Video Anomaly Detection: A Simple Yet Efficient Framework Based on Masked Autoencoder. 1-5 - Soheil Khorram, Anshuman Tripathi, Jaeyoung Kim, Han Lu, Qian Zhang, Rohit Prabhavalkar, Hasim Sak:
Cross-Training: A Semi-Supervised Training Scheme for Speech Recognition. 1-5 - Ahmad Sajedi, Yuri A. Lawryshyn, Konstantinos N. Plataniotis:
A New Probabilistic Distance Metric with Application in Gaussian Mixture Reduction. 1-5 - Xiaoqi Wang, Jian Xiong, Bo Li, Jinli Suo, Hao Gao:
Learning Hybrid Representations of Semantics and Distortion for Blind Image Quality Assessment. 1-5 - Marco Gaido, Yun Tang, Ilia Kulikov, Rongqing Huang, Hongyu Gong, Hirofumi Inaguma:
Named Entity Detection and Injection for Direct Speech Translation. 1-5 - Rui Huang, Qingyi Zhao, Ruofei Wang, Caihua Liu, Sihua Gao, Yuxiang Zhang, Wei Fan:
ScaleMix: Intra- And Inter-Layer Multiscale Feature Combination for Change Detection. 1-5 - Zhiheng Hu, Yongzhen Wang, Peng Li, Jie Qin, Haoran Xie, Mingqiang Wei:
ISmallNet: Densely Nested Network with Label Decoupling for Infrared Small Target Detection. 1-5 - Yifei Xin, Dongchao Yang, Fan Cui, Yujun Wang, Yuexian Zou:
Improving Weakly Supervised Sound Event Detection with Causal Intervention. 1-5 - Zehra Shah, Shiang Qi, Fei Wang, Mahtab Farrokh, Mashrura Tasnim, Eleni Stroulia, Russell Greiner, Manos Plitsis, Athanasios Katsamanis:
Exploring Language-Agnostic Speech Representations Using Domain Knowledge for Detecting Alzheimer's Dementia. 1-2 - Eli Kurtz, Youxiang Zhu, Tiffany M. Driesse, Bang Tran, John A. Batsis, Robert M. Roth, Xiaohui Liang:
Early Detection of Cognitive Decline Using Voice Assistant Commands. 1-5 - Ioannis Tsetis, Xiaotong Cheng, Setareh Maghsudi:
A Bandit Online Convex Optimization Approach To Distributed Energy Management In Networked Systems. 1-5 - Zhiyang Zhou, Shihui Liu:
Learning to Auto-Correct for High-Quality Spectrograms. 1-5 - Yi Zhang, Isao Yamada:
A Compensated Shrinkage Affine Projection Algorithm for Debiased Sparse Adaptive Filtering. 1-5 - Mona Zehni, Zhizhen Zhao:
CryoSWD: Sliced Wasserstein Distance Minimization for 3D Reconstruction in Cryo-electron Microscopy. 1-5 - Likai Wang, Ruize Han, Wei Feng:
Combining the Silhouette and Skeleton Data for Gait Recognition. 1-5 - Feng Hou, Yao Zhang, Yang Liu, Jin Yuan, Cheng Zhong, Yang Zhang, Zhongchao Shi, Jianping Fan, Zhiqiang He:
Learning How to Learn Domain-Invariant Parameters for Domain Generalization. 1-5 - Sung-Lin Yeh, Hao Tang:
Learning Dependencies of Discrete Speech Representations with Neural Hidden Markov Models. 1-5 - Siavash Golkar, David Lipshutz, Tiberiu Tesileanu, Dmitri B. Chklovskii:
An Online Algorithm for Contrastive Principal Component Analysis. 1-5 - Fangjian Lin, Yizhe Ma, ShengWei Tian:
Exploring Vision Transformer Layer Choosing for Semantic Segmentation. 1-5 - Kristian Fischer, Fabian Brand, Christian Blum, André Kaup:
Saliency-Driven Hierarchical Learned Image Coding for Machines. 1-5 - Xiaoyi Shen, Dongyuan Shi, Zhengding Luo, Junwei Ji, Woon-Seng Gan:
A Momentum Two-Gradient Direction Algorithm with Variable Step Size Applied to Solve Practical Output Constraint Issue for Active Noise Control. 1-5 - Jaimin Shah, Martina Cardone, Alex Dytso, Cynthia Rush:
When is MIMO Massive in Radar? 1-5 - Jiachen Luo, Huy Phan, Joshua D. Reiss:
Cross-Modal Fusion Techniques for Utterance-Level Emotion Recognition from Text and Speech. 1-5 - Peter G. Vouras, Kumar Vijay Mishra, Alexandra B. Artusio-Glimpse:
Phase Retrieval for Rydberg Quantum Arrays. 1-5 - Chen Zhang, Shubham Bansal, Aakash Lakhera, Jinzhu Li, Gang Wang, Sandeepkumar Satpal, Sheng Zhao, Lei He:
LeanSpeech: The Microsoft Lightweight Speech Synthesis System for LIMMITS Challenge 2023. 1-2 - Xinjian Li, Ye Jia, Chung-Cheng Chiu:
Textless Direct Speech-to-Speech Translation with Discrete Speech Representation. 1-5 - Ajinkya Jayawant, Antonio Ortega:
Towards Bandwidth Estimation for Graph Signal Reconstruction. 1-5 - Jinchen Zeng, Rick Butler, John van den Dobbelsteen, Benno H. W. Hendriks, Maarten Van der Elst, Justin Dauwels:
Automatic Camera Pose Estimation by Key-Point Matching of Reference Objects. 1-5 - Weidong Dai, Xuejun Yan, Jingjing Wang, Di Xie, Shiliang Pu:
MDR-MFI: Multi-Branch Decoupled Regression and Multi-Scale Feature Interaction for Partial-to-Partial Cloud Registration. 1-5 - Guoqiu Li, Shengjie Chen, Yujiu Yang, Zhenhua Guo:
A Two-Branch Network for Video Anomaly Detection with Spatio-Temporal Feature Learning. 1-5 - James Orme-Rogers, Ajitesh Srivastava:
Spatio-Temporal Attention in Multi-Granular Brain Chronnectomes For Detection of Autism Spectrum Disorder. 1-5 - Pascal Bacchus, Renaud Fraisse, Aline Roumy, Christine Guillemot:
Joint Compression and Demosaicking For Satellite Images. 1-5 - Pranay Manocha, Israel D. Gebru, Anurag Kumar, Dejan Markovic, Alexander Richard:
Nord: Non-Matching Reference Based Relative Depth Estimation from Binaural Speech. 1-5 - Fotios Drakopoulos, Arthur Van Den Broucke, Sarah Verhulst:
A DNN-Based Hearing-Aid Strategy For Real-Time Processing: One Size Fits All. 1-5 - Costas Mavromatis, George Karypis:
Global and Nodal Mutual Information Maximization in Heterogeneous Graphs. 1-5 - Haitong Zhang, Xinyuan Yu, Yue Lin:
NSV-TTS: Non-Speech Vocalization Modeling And Transfer In Emotional Text-To-Speech. 1-5 - Natalie Lang, Elad Sofer, Nir Shlezinger, Rafael G. L. D'Oliveira, Salim El Rouayheb:
CPA: Compressed Private Aggregation for Scalable Federated Learning Over Massive Networks. 1-5 - Yushu Zhang, Gang Li, Xiao-Ping Zhang, You He:
Transformer-based tracking Network for Maneuvering Targets. 1-5 - Zaharah Allah Bukhsh, Aaqib Saeed:
On Out-of-Distribution Detection for Audio with Deep Nearest Neighbors. 1-5 - Fengbo Lan, Gene Cheung, Prabhkirat Arora, Deinabo Richard-Koko, Lisa Cole:
On Designing A 3D Imaging Summer Project For Ontario's High School Students During Covid-19 Pandemic. 1-5 - Weiming Xu, Zhihao Guo:
Tayloraecnet: A Taylor Style Neural Network For Full-Band Echo Cancellation. 1-2 - Xinyu Lin, Yingjie Zhou, Yipeng Liu, Ce Zhu:
Level-Line Guided Edge Drawing for Robust Line Segment Detection. 1-5 - Jiaqi Cao, Lixiang Lian, Yijie Mao, Bruno Clerckx:
Adaptive CSI Feedback with Hidden Semantic Information Transfer. 1-5 - Bryce Irvin, Marko Stamenovic, Mikolaj Kegler, Li-Chia Yang:
Self-Supervised Learning for Speech Enhancement Through Synthesis. 1-5 - Sadaf Khademi, Shahin Heidarian, Parnian Afshar, Farnoosh Naderkhani, Anastasia Oikonomou, Konstantinos N. Plataniotis, Arash Mohammadi:
Spatio-Temporal Hybrid Fusion of CAE and SWin Transformers for Lung Cancer Malignancy Prediction. 1-5 - Minsung Kim, Kyle Jamieson:
Finer-Grained Decomposition for Parallel Quantum MIMO Processing. 1-5 - Shichao Sun, Ruifeng Yuan, Wenjie Li, Sujian Li:
Improving Sentence Similarity Estimation for Unsupervised Extractive Summarization. 1-5 - Dexin Liao, Tao Jiang, Feng Wang, Lin Li, Qingyang Hong:
Towards A Unified Conformer Structure: from ASR to ASV Task. 1-5 - Hanzhuo Wang, Xingjian Wang, Chengwei Zhou, Wenchao Meng, Zhiguo Shi:
Low in Resolution, High in Precision: UAV Detection with Super-Resolution and Motion Information Extraction. 1-5 - Wei Wang, Yanmin Qian:
HuBERT-AGG: Aggregated Representation Distillation of Hidden-Unit BERT for Robust Speech Recognition. 1-5 - Chao Xue, Di Liang, Sirui Wang, Jing Zhang, Wei Wu:
Dual Path Modeling for Semantic Matching by Perceiving Subtle Conflicts. 1-5 - MohammadReza Ebrahimi, Navona Calarco, Colin Hawco, Aristotle N. Voineskos, Ashish Khisti:
Time-Resolved fMRI Shared Response Model Using Gaussian Process Factor Analysis. 1-5 - Jiarui Wang, Prasanga N. Samarasinghe, Thushara D. Abhayapala, Jihui Aimee Zhang:
Image Source Method Based on the Directional Impulse Responses. 1-5 - Jongho Choi, Kyogu Lee:
Pop2Piano: Pop Audio-Based Piano Cover Generation. 1-5 - Dawei Liang, Hang Su, Tarun Singh, Jay Mahadeokar, Shanil Puri, Jiedan Zhu, Edison Thomaz, Mike Seltzer:
Dynamic Speech Endpoint Detection with Regression Targets. 1-5 - Zan Mao, Xinyu Tong, Ze Luo:
Semi-Supervised Remote Sensing Image Change Detection Using Mean Teacher Model for Constructing Pseudo-Labels. 1-5 - Tong Ye, Zhitao Li, Jianzong Wang, Ning Cheng, Jing Xiao:
Efficient Uncertainty Estimation with Gaussian Process for Reliable Dialog Response Retrieval. 1-5 - Rui Zhao, Jian Xue, Partha Parthasarathy, Veljko Miljanic, Jinyu Li:
Fast and Accurate Factorized Neural Transducer for Text Adaption of End-to-End Speech Recognition Models. 1-5 - Zijian Fan, Xinwei Cao, Giampiero Salvi, Torbjørn Svendsen:
Using Modified Adult Speech as Data Augmentation for Child Speech Recognition. 1-5 - Mingyu Liu, Yijie Wang, Hongzuo Xu, Xiaohui Zhou, Bin Li, Yongjun Wang:
Smoothing Point Adjustment-Based Evaluation of Time Series Anomaly Detection. 1-5 - Charles E. Thornton, William W. Howard, R. Michael Buehrer:
Online Learning-Based Waveform Selection for Improved Vehicle Recognition in Automotive Radar. 1-5 - Yaohua Zha, Rongsheng Li, Tao Dai, Jianyu Xiong, Xin Wang, Shu-Tao Xia:
SFR: Semantic-Aware Feature Rendering of Point Cloud. 1-5 - Pavlos Stoikos, Olympia Axelou, George Floros, Nestoras E. Evmorfopoulos, Georgios I. Stamoulis:
On the Reduction of Large-Scale Room Acoustic Models. 1-5 - Juan Cerviño, Luana Ruiz, Alejandro Ribeiro:
Training Graph Neural Networks on Growing Stochastic Graphs. 1-5 - Caroline P. A. Moraes, Bruno Aristimunha, Lucas Heck Dos Santos, Walter Hugo Lopez Pinaya, Raphael Yokoingawa de Camargo, Denis G. Fantinato, Aline Neves:
Applying Independent Vector Analysis on EEG-Based Motor Imagery Classification. 1-5 - Thomas Decker, Michael Lebacher, Volker Tresp:
Does Your Model Think Like an Engineer? Explainable AI for Bearing Fault Detection with Deep Learning. 1-5 - Tianxiang Chen, Qi Chu, Zhentao Tan, Bin Liu, Nenghai Yu:
BAUENet: Boundary-Aware Uncertainty Enhanced Network for Infrared Small Target Detection. 1-5 - Wan-Cyuan Fan, Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu:
IoU-Aware Multi-Expert Cascade Network Via Dynamic Ensemble for Long-Tailed Object Detection. 1-5 - Ying Li, Lei Cheng, Feng Yin, Michael Minyi Zhang, Sergios Theodoridis:
Overcoming Posterior Collapse in Variational Autoencoders Via EM-Type Training. 1-5 - Fateme Ghayem, Hanlu Yang, Furkan Kantar, Seung-Jun Kim, Vince D. Calhoun, Tülay Adali:
New Interpretable Patterns and Discriminative Features from Brain Functional Network Connectivity using Dictionary Learning. 1-5 - Joshin Krishnan, Rohan T. Money, Baltasar Beferull-Lozano, Elvin Isufi:
Simplicial Vector Autoregressive Model For Streaming Edge Flows. 1-5 - Juan Song, Zhilei Liu:
Self-Supervised Facial Action Unit Detection with Region and Relation Learning. 1-5 - Ricardo Augusto Borsoi, Tales Imbiriba, Deniz Erdogmus:
A Deep Disentangled Approach for Interpretable Hyperspectral Unmixing. 1-2 - Haoran Zhang, Yinghong Tian, Liang Yuan, Yue Lu:
Invariant Adversarial Imitation Learning From Visual Inputs. 1-5 - Olukorede Fakorede, Ashutosh Nirala, Modeste Atsague, Jin Tian:
Improving Adversarial Robustness with Hypersphere Embedding and Angular-Based Regularizations. 1-5 - Ali Raza Syed, Michael I. Mandel:
Estimating Shapley Values of Training Utterances for Automatic Speech Recognition Models. 1-5 - Jian Guan, Feiyang Xiao, Youde Liu, Qiaoxi Zhu, Wenwu Wang:
Anomalous Sound Detection Using Audio Representation with Machine ID Based Contrastive Learning Pretraining. 1-5 - Kailai Li, Jiawei Sun, Ruoxin Chen, Wei Ding, Kexue Yu, Jie Li, Chentao Wu:
Towards Practical Edge Inference Attacks Against Graph Neural Networks. 1-5 - Wei Kang, Zengwei Yao, Fangjun Kuang, Liyong Guo, Xiaoyu Yang, Long Lin, Piotr Zelasko, Daniel Povey:
Delay-Penalized Transducer for Low-Latency Streaming ASR. 1-5 - Marija Iloska, Mónica F. Bugallo:
Generalized Two-Stage Particle Filter for High Dimensions. 1-5 - Jifan Zhang, Zhe Wu, Xinfeng Zhang, Guoli Song, Yaowei Wang, Jie Chen:
Recurrent Fine-Grained Self-Attention Network for Video Crowd Counting. 1-5 - In-Sun Hwang, Youngsub Han, Byoung-Ki Jeon:
CyFi-TTS: Cyclic Normalizing Flow with Fine-Grained Representation for End-to-End Text-to-Speech. 1-5 - Ke Liu, Dongya Wu, Dekui Wang, Jun Feng:
Speech Emotion Recognition via Heterogeneous Feature Learning. 1-5 - William Ravenscroft, Stefan Goetze, Thomas Hain:
Deformable Temporal Convolutional Networks for Monaural Noisy Reverberant Speech Separation. 1-5 - Ziyu Jia, Youfang Lin, Yuhan Zhou, Xiyang Cai, Peng Zheng, Qiang Li, Jing Wang:
Exploiting Interactivity and Heterogeneity for Sleep Stage Classification Via Heterogeneous Graph Neural Network. 1-5 - Hsin-Yi Lin, Huan-Hsin Tseng, Yu Tsao:
On the Robustness of Non-Intrusive Speech Quality Model by Adversarial Examples. 1-5 - Weizhou Shen, Xiaojun Quan, Ke Yang:
Generic Dependency Modeling for Multi-Party Conversation. 1-5 - Bingchun Luo, Wei Yu:
Residual Hybrid Attention Network for Compression Artifact Reduction. 1-5 - Nobutaka Ito, Masashi Sugiyama:
Audio Signal Enhancement with Learning from Positive and Unlabeled Data. 1-5 - Mehmet Can Hücümenoglu, Pulak Sarangi, Robin Rajamäki, Piya Pal:
To Regularize or Not to Regularize: The Role of Positivity in Sparse Array Interpolation with a Single Snapshot. 1-5 - Nikos Piperigkos, Aris S. Lalos, Kostas Berberidis, Christos Anagnostopoulos:
Cooperative Five Degrees Of Freedom Motion Estimation For A Swarm Of Autonomous Vehicles. 1-2 - Hyunjong Ok, Seong-Bae Park:
Post-Trained Language Model Adaptive to Extractive Summarization of Long Spoken Documents. 1-2 - Matthew Howard, Keigo Hirakawa:
Event-Based Visual Microphone. 1-5 - Satoshi Kamiya, Kazuhiro Hotta, Taka-aki Tsunoyama, Akihiro Kusumi:
Single-Particle Tracking by Graph Transformer. 1-5 - Tomoro Tanaka, Kohei Yatabe, Yasuhiro Oikawa:
UPGLADE: Unplugged Plug-and-Play Audio Declipper Based on Consensus Equilibrium of DNN and Sparse Optimization. 1-5 - Zhiyang Wang, Luana Ruiz, Alejandro Ribeiro:
Convolutional Filtering on Sampled Manifolds. 1-5 - Takanori Ashihara, Takafumi Moriya, Kohei Matsuura, Tomohiro Tanaka:
Exploration of Language Dependency for Japanese Self-Supervised Speech Representation Models. 1-5 - Jiali Gong, Hongfan Gao, Jiahao Chao, Zhou Zhou, Zhengfeng Yang, Zhenbing Zeng:
Kernel Estimation and Deconvolution for Blind Image Super-Resolution. 1-5 - Rodrigo Diaz, Ben Hayes, Charalampos Saitis, György Fazekas, Mark B. Sandler:
Rigid-Body Sound Synthesis with Differentiable Modal Resonators. 1-5 - Manuel Morante, Jan Østergaard, Sergios Theodoridis:
Interpretable Nonnegative Incoherent Deep Dictionary Learning for FMRI Data Analysis. 1-5 - Nafiseh Jabbari Tofighi, Mohamed Hedi Elfkir, Nevrez Imamoglu, Cagri Ozcinar, Erkut Erdem, Aykut Erdem:
ST360IQ: No-Reference Omnidirectional Image Quality Assessment With Spherical Vision Transformers. 1-5 - Tan M. Nguyen, Tam Nguyen, Long Bui, Hai Do, Duy Khuong Nguyen, Dung D. Le, Hung Tran-The, Nhat Ho, Stanley J. Osher, Richard G. Baraniuk:
A Probabilistic Framework for Pruning Transformers Via a Finite Admixture of Keys. 1-5 - Guillaume Lauga, Elisa Riccietti, Nelly Pustelnik, Paulo Gonçalves:
Multilevel FISTA for Image Restoration. 1-5 - Tongtong Su, Jinsong Zhang, Gang Wang, Xiaoguang Liu:
Self-Supervised Learning with Explorative Knowledge Distillation. 1-5 - Hao Shi, Masato Mimura, Longbiao Wang, Jianwu Dang, Tatsuya Kawahara:
Time-Domain Speech Enhancement Assisted by Multi-Resolution Frequency Encoder and Decoder. 1-5 - Pengbin Yu, Jianjun Wang, Chen Xu:
Matrix Recovery using Deep Generative Priors with Low-Rank Deviations. 1-5 - Rui Zhang, Yang Hua, Tao Song, Zhengui Xue, Ruhui Ma, Haibing Guan:
Online Residual-Based Key Frame Sampling with Self-Coach Mechanism and Adaptive Multi-Level Feature Fusion. 1-5 - Jianan Chen, Sakriani Sakti:
An Isotropy Analysis for Self-Supervised Acoustic Unit Embeddings on the Zero Resource Speech Challenge 2021 Framework. 1-5 - Sachini Piyoni Ekanayake, Daphney-Stavroula Zois, Charalampos Chelmis:
Sequential Datum-Wise Joint Feature Selection and Classification in the Presence of External Classifier. 1-5 - Abibulla Atawulla, Xi Zhou, Yating Yang, Bo Ma, Fengyi Yang:
A Slot-Shared Span Prediction-Based Neural Network for Multi-Domain Dialogue State Tracking. 1-5 - Yixin Wang, Wei Wei, Ye Wang:
Phonation Mode Detection in Singing: A Singer Adapted Model. 1-5 - Jiachi Liu, Sishi Xiong, Yuehuan He, Tong Zhou, Liwen Wang, Xuefeng Li, Bo Xiao:
SIAST: A Slot Imbalance-Aware Self-Training Scheme for Semi-Supervised Slot Filling. 1-5 - Lavanya Venkatasubramaniam, Vishal Sunder, Eric Fosler-Lussier:
End-to-End Word-Level Disfluency Detection and Classification in Children's Reading Assessment. 1-5 - Christos G. Tsinos, Theodoros A. Tsiftsis, Robert Schober:
Symbol Level Precoding in the RF Domain for Low Hardware Complexity RIS-Assisted MU-MISO Systems. 1-5 - Ryosuke Watanabe, Keisuke Nonaka, Eduardo Pavez, Tatsuya Kobayashi, Antonio Ortega:
Graph Wavelet-Based Point Cloud Geometric Denoising with Surface-Consistent Non-Negative Kernel Regression. 1-5 - Yue Wang, Mingrong Gong, Lei Xia, Qieshi Zhang, Jun Cheng:
Efficiently Fusing Sparse Lidar for Enhanced Self-Supervised Monocular Depth Estimation. 1-5 - Sanqian Li, Muxing Xiong, Bing Yang, Xiaoqing Zhang, Risa Higashita, Jiang Liu:
OCT Image Blind Despeckling Based on Gradient Guided Filter with Speckle Statistical Prior. 1-5 - Lingwei Meng, Jiawen Kang, Mingyu Cui, Yuejiao Wang, Xixin Wu, Helen Meng:
A Sidecar Separator Can Convert A Single-Talker Speech Recognition System to A Multi-Talker One. 1-5 - Bandhav Veluri, Justin Chan, Malek Itani, Tuochao Chen, Takuya Yoshioka, Shyamnath Gollakota:
Real-Time Target Sound Extraction. 1-5 - Chen Chen, Yuchen Hu, Heqing Zou, Linhui Sun, Eng Siong Chng:
Unsupervised Noise Adaptation Using Data Simulation. 1-5 - Zili Huang, Desh Raj, Paola García, Sanjeev Khudanpur:
Adapting Self-Supervised Models to Multi-Talker Speech Recognition Using Speaker Embeddings. 1-5 - Ding Zhang, Yinghui Li, Qingyu Zhou, Shirong Ma, Yangning Li, Yunbo Cao, Hai-Tao Zheng:
Contextual Similarity is More Valuable Than Character Similarity: An Empirical Study for Chinese Spell Checking. 1-5 - Chao-Han Huck Yang, Bo Li, Yu Zhang, Nanxin Chen, Tara N. Sainath, Sabato Marco Siniscalchi, Chin-Hui Lee:
A Quantum Kernel Learning Approach to Acoustic Modeling for Spoken Command Recognition. 1-5 - Sun-Kyung Lee, Jong-Hwan Kim:
SENER: Sentiment Element Named Entity Recognition for Aspect-Based Sentiment Analysis. 1-5 - Eric Grinstein, Mike Brookes, Patrick A. Naylor:
Graph Neural Networks for Sound Source Localization on Distributed Microphone Networks. 1-5 - Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka, Shogo Seki:
Wave-U-Net Discriminator: Fast and Lightweight Discriminator for Generative Adversarial Network-Based Speech Synthesis. 1-5 - Hao Dang, Yuekai Zhang, Xingqun Qi, Wanting Zhou, Muyi Sun:
Lightvessel: Exploring Lightweight Coronary Artery Vessel Segmentation Via Similarity Knowledge Distillation. 1-5 - Sixiang Chen, Tian Ye, Yun Liu, Taodong Liao, Jingxia Jiang, Erkang Chen, Peng Chen:
MSP-Former: Multi-Scale Projection Transformer for Single Image Desnowing. 1-5 - Mohammad Salimibeni, Arash Mohammadi:
RL-IFF: Indoor Localization via Reinforcement Learning-Based Information Fusion. 1-5 - Haitao Tang, Yu Fu, Lei Sun, Jiabin Xue, Dan Liu, Yongchao Li, Zhiqiang Ma, Minghui Wu, Jia Pan, Genshun Wan, Ming'en Zhao:
Reducing the GAP Between Streaming and Non-Streaming Transducer-Based ASR by Adaptive Two-Stage Knowledge Distillation. 1-5 - Kexin Zhu, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao:
Improving EEG-based Emotion Recognition by Fusing Time-Frequency and Spatial Representations. 1-5 - Salih Atici, Hongyi Pan, Mohammed H. Elnagar, Veerasathpurush Allareddy, Omar Suhaym, Rashid Ansari, Ahmet Enis Çetin:
Classification of the Cervical Vertebrae Maturation (CVM) Stages Using the Tripod Network. 1-5 - Qing Wang, Jun Du, Zhaoxu Nian, Shutong Niu, Li Chai, Huaxin Wu, Jia Pan, Chin-Hui Lee:
Loss Function Design for DNN-Based Sound Event Localization and Detection on Low-Resource Realistic Data. 1-5 - Huan Zhang, Simon Dixon:
Disentangling the Horowitz Factor: Learning Content and Style From Expressive Piano Performance. 1-5 - Guy Gubnitsky, Roee Diamant:
Inter-Pulse Estimation for Sperm Whale Click Detection. 1-5 - David Reixach:
Multi-Dimensional Signal Recovery Using Low-Rank Deconvolution. 1-5 - Xianrui Wang, Ningning Pan, Jacob Benesty, Jingdong Chen:
On Multiple-Input/Binaural-Output Antiphasic Speaker Signal Extraction. 1-5 - Wei-Chen Lin, Ching-Te Chiu, Kuan-Chang Shih:
RGB-D Based Pose-Invariant Face Recognition Via Attention Decomposition Module. 1-5 - Jie Hu, Mengze Zeng, Enhua Wu:
Bag of Tricks with Quantized Convolutional Neural Networks for Image Classification. 1-5 - Chitralekha Gupta, Purnima Kamath, Yize Wei, Zhuoyao Li, Suranga Nanayakkara, Lonce Wyse:
Towards Controllable Audio Texture Morphing. 1-5 - Jiaxing Li, Chenqi Kong, Shiqi Wang, Haoliang Li:
Two-Branch Multi-Scale Deep Neural Network for Generalized Document Recapture Attack Detection. 1-5 - Ron M. Hecht, Ohad Rahamim, Shaul Oron, Andrea Forgacs, Gershon Celniker, Dan Levi, Omer Tsimhoni:
Gaze Pre-Train For Improving Disparity Estimation Networks. 1-5 - Hui Guo, Xin Wang, Siwei Lyu:
Detection of Real-Time Deepfakes in Video Conferencing with Active Probing and Corneal Reflection. 1-5 - Huijiao Wang, Xulei Yang:
Efficient Practices for Profile-to-Frontal Face Synthesis and Recognition. 1-5 - Nhan Thanh Nguyen, Mengyuan Ma, Nir Shlezinger, Yonina C. Eldar, A. Lee Swindlehurst, Markku J. Juntti:
Deep Unfolding-Enabled Hybrid Beamforming Design for mmWave Massive MIMO Systems. 1-5 - Qi Chen, Ziyang Ma, Tao Liu, Xu Tan, Qu Lu, Kai Yu, Xie Chen:
Improving Few-Shot Learning for Talking Face System with TTS Data Augmentation. 1-5 - Samuel Yen-Chi Chen:
Quantum Deep Recurrent Reinforcement Learning. 1-5 - Brandon Le Bon, Mikaël Le Pendu, Christine Guillemot:
Unrolled Fourier Disparity Layer Optimization for Scene Reconstruction from Few-Shots Focal Stacks. 1-5 - Han Gao, Shuo Zhao, Huiyan Li, Li Liu, You Wang, Ruifen Hu, Jin Zhang, Guang Li:
Bimodal Fusion Network for Basic Taste Sensation Recognition from Electroencephalography and Electromyography. 1-5 - Xilai Li, Goeric Huybrechts, Srikanth Ronanki, Jeff Farris, Sravan Bodapati:
Dynamic Chunk Convolution for Unified Streaming and Non-Streaming Conformer ASR. 1-5 - Chengdong Liang, Xiao-Lei Zhang, Binbin Zhang, Di Wu, Shengqiang Li, Xingchen Song, Zhendong Peng, Fuping Pan:
Fast-U2++: Fast and Accurate End-to-End Speech Recognition in Joint CTC/Attention Frames. 1-5 - Lei Yang, Wei Liu, Lufen Tan, Jaemo Yang, Han-Gil Moon:
Target Speaker Extraction with Ultra-Short Reference Speech by VE-VE Framework. 1-5 - Jinjiang Liu, Xueliang Zhang:
ICCRN: Inplace Cepstral Convolutional Recurrent Neural Network for Monaural Speech Enhancement. 1-5 - Mahmut Karakaya, Ramazan Savas Aygün:
Retinal Biomarkers for Detecting Diabetic Retinopaty Using Smartphone-Based Deep Learning Frameworks. 1-5 - Omran Alamayreh, Giovanna Maria Dimitri, Jun Wang, Benedetta Tondi, Mauro Barni:
Which Country is This Picture From? New Data and Methods For DNN-Based Country Recognition. 1-5 - Sarper Aydin, Ceyhun Eksin:
Networked Policy Gradient Play in Markov Potential Games. 1-5 - Hao Yen, Woojay Jeon:
Improvements to Embedding-Matching Acoustic-to-Word ASR Using Multiple-Hypothesis Pronunciation-Based Embeddings. 1-5 - Shuli Zhuang, Pengfei Xia, Bin Li:
An Empirical Study of Backdoor Attacks on Masked Auto Encoders. 1-5 - Shunit Truzman, Guy Revach, Nir Shlezinger, Itzik Klein:
Outlier-Insensitive Kalman Filtering Using NUV Priors. 1-5 - Nicolas Heintz, Simon Geirnaert, Tom Francart, Alexander Bertrand:
Unbiased Unsupervised Stimulus Reconstruction for EEG-Based Auditory Attention Decoding. 1-5 - Ao Li, Yugang Ji, Guanyi Chu, Xiao Wang, Dong Li, Chuan Shi:
Clustering-Based Supervised Contrastive Learning for Identifying Risk Items on Heterogeneous Graph. 1-5 - Xuxin Cheng, Qianqian Dong, Fengpeng Yue, Tom Ko, Mingxuan Wang, Yuexian Zou:
M3ST: Mix at Three Levels for Speech Translation. 1-5 - Bac Nguyen, Fabien Cardinaux, Stefan Uhlich:
AutoTTS: End-to-End Text-to-Speech Synthesis Through Differentiable Duration Modeling. 1-5 - Sree Hari Krishnan Parthasarathi, Lu Zeng, Dilek Hakkani-Tür:
Conversational Text-to-SQL: An Odyssey into State-of-the-Art and Challenges Ahead. 1-5 - Minki Kang, Dongchan Min, Sung Ju Hwang:
Grad-StyleSpeech: Any-Speaker Adaptive Text-to-Speech Synthesis with Diffusion Models. 1-5 - Thai Binh Nguyen, Le Duc Minh Nhat, Quang Minh Nguyen, Quoc Truong Do, Chi Mai Luong, Alexander Waibel:
AdapITN: A Fast, Reliable, and Dynamic Adaptive Inverse Text Normalization. 1-5 - Zhiyi Chen, Yao Lu, Xinzhe Deng, Jia Meng, Shengchuan Zhang, Liujuan Cao:
Self-Paced Partial Domain-Aware Learning for Face Anti-Spoofing. 1-5 - Emilian Postolache, Jordi Pons, Santiago Pascual, Joan Serrà:
Adversarial Permutation Invariant Training for Universal Sound Separation. 1-5 - Kexin Feng, Theodora Chaspari:
A Knowledge-Driven Vowel-Based Approach of Depression Classification from Speech Using Data Augmentation. 1-5 - Shuo Liu, Adria Mallol-Ragolta, Björn W. Schuller:
COVID-19 Detection from Speech in Noisy Conditions. 1-5 - Liangqi Zhang, Yihao Luo, Xiang Cao, Haibo Shen, Tianjiang Wang:
Frequency and Scale Perspectives of Feature Extraction. 1-5 - Kaveen Liyanage, Reese Pearsall, Clemente Izurieta, Bradley M. Whitaker:
Dictionary Learning on Graph Data with Weisfieler-Lehman Sub-Tree Kernel and KSVD. 1-5 - Kang Du, Yu Xiang:
Generalized Invariant Matching Property Via Lasso. 1-5 - Guanting Dong, Zechen Wang, Liwen Wang, Daichi Guo, Dayuan Fu, Yuxiang Wu, Chen Zeng, Xuefeng Li, Tingfeng Hui, Keqing He, Xinyue Cui, QiXiang Gao, Weiran Xu:
A Prototypical Semantic Decoupling Method via Joint Contrastive Learning for Few-Shot Named Entity Recognition. 1-5 - Oggi Rudovic, Wonil Chang, Vineet Garg, Pranay Dighe, Pramod Simha, Jack Berkowitz, Ahmed Hussen Abdelaziz, Sachin Kajarekar, Erik Marchi, Saurabh Adya:
Less Is More: A Unified Architecture for Device-Directed Speech Detection with Multiple Invocation Types. 1-5 - Yuqian Kuang, Xiaopeng Fan:
Collaborative Audio-Visual Event Localization Based on Sequential Decision and Cross-Modal Consistency. 1-5 - Miaomiao Zhang, Ji Chen, Xiaoyan Fu, Ge Xin, Jingzhi Zhang, Na Jiang, Jan D'hooge:
Hankel Structured Low Rank and Sparse Representation Via L0-Norm Optimization for Compressed Ultrasound Plane Wave Signal Reconstruction. 1-5 - Weihang Ding, Mohammad Shikh-Bahaei:
An Efficient Relay Selection Scheme for Relay-assisted HARQ. 1-5 - Jiguang He, Aymen Fakhreddine, Henk Wymeersch, George C. Alexandropoulos:
Compressed-Sensing-Based 3D Localization with Distributed Passive Reconfigurable Intelligent Surfaces. 1-5 - Haleh Akrami, Hannes Gamper:
Speech MOS Multi-Task Learning and Rater Bias Correction. 1-5 - Xiaoliu Luo, Zhao Duan, Taiping Zhang:
Spatial Similarity Guidance for Few-Shot Segmentation. 1-5 - Yu-Shan Tai, Ming-Guang Lin, An-Yeu Andy Wu:
TSPTQ-ViT: Two-Scaled Post-Training Quantization for Vision Transformer. 1-5 - Saska Tirronen, Farhad Javanmardi, Manila Kodali, Sudarsana Reddy Kadiri, Paavo Alku:
Utilizing Wav2Vec In Database-Independent Voice Disorder Detection. 1-5 - Ke Liu, Dekui Wang, Dongya Wu, Jun Feng:
Speech Emotion Recognition Via Two-Stream Pooling Attention With Discriminative Channel Weighting. 1-5 - Xiaomeng Liu, Christian Schaible, Timothy N. Davidson:
Multiple Access Computation Offloading for the K-User Case. 1-5 - Georgios Chochlakis, Gireesh Mahajan, Sabyasachee Baruah, Keith Burghardt, Kristina Lerman, Shrikanth Narayanan:
Using Emotion Embeddings to Transfer Knowledge between Emotions, Languages, and Annotation Formats. 1-5 - Florian Angulo, Slim Essid, Geoffroy Peeters, Christophe Mietlicki:
Cosmopolite Sound Monitoring (CoSMo): A Study of Urban Sound Event Detection Systems Generalizing to Multiple Cities. 1-5 - Mengge Liu, Wen Zhang, Xiang Li, Jian Luan, Bin Wang, Yuhang Guo, Shuoying Chen:
Rethinking the Reasonability of the Test Set for Simultaneous Machine Translation. 1-5 - Jiale Liu, Yu-Wei Zhan, Xin Luo, Zhen-Duo Chen, Yongxin Wang, Xin-Shun Xu:
Prototype-Based Layered Federated Cross-Modal Hashing. 1-2 - Yu-Lei Li, Yang Lu, Jie Li, Hanzi Wang:
Learning to Reconnect Interrupted Trajectories for Weakly Supervised Multi-Object Tracking. 1-5 - Dichucheng Li, Mingjin Che, Wenwu Meng, Yulun Wu, Yi Yu, Fan Xia, Wei Li:
Frame-Level Multi-Label Playing Technique Detection Using Multi-Scale Network and Self-Attention Mechanism. 1-5 - Yinghao Aaron Li, Cong Han, Xilin Jiang, Nima Mesgarani:
Phoneme-Level Bert for Enhanced Prosody of Text-To-Speech with Grapheme Predictions. 1-5 - Zijie Ye, Jia Jia, Haozhe Wu, Shuo Huang, Shikun Sun, Junliang Xing:
Salient Co-Speech Gesture Synthesizing with Discrete Motion Representation. 1-5 - Cheng Chu, Lei Jiang, Martin Swany, Fan Chen:
QTROJAN: A Circuit Backdoor Against Quantum Neural Networks. 1-5 - Digbalay Bose, Rajat Hebbar, Krishna Somandepalli, Shrikanth Narayanan:
Contextually-Rich Human Affect Perception Using Multimodal Scene Information. 1-5 - Ziwang Xu, Lanqing Guo, Shuyan Zhang, Alex C. Kot, Bihan Wen:
Unsupervised Deep Digital Staining for Microscopic Cell Images via Knowledge Distillation. 1-5 - Hao Tan, Jianjun Wang, Weichao Kong:
Deep Plug-and-Play for Tensor Robust Principal Component Analysis. 1-5 - Dayong Wang, Yu Sun, Weisheng Li, Lele Xie, Xin Lu, Frédéric Dufaux, Ce Zhu:
A Novel Mode Selection-Based Fast Intra Prediction Algorithm for Spatial SHVC. 1-5 - Djordje Batic, Giulia Tanoni, Lina Stankovic, Vladimir Stankovic, Emanuele Principi:
Improving Knowledge Distillation for Non-Intrusive Load Monitoring Through Explainability Guided Learning. 1-5 - Rimita Lahiri, Md. Nasir, Catherine Lord, So Hyun Kim, Shrikanth Narayanan:
A Context-Aware Computational Approach for Measuring Vocal Entrainment in Dyadic Conversations. 1-5 - Syed Rifat Mahmud Rafee, György Fazekas, Geraint A. Wiggins:
HIPI: A Hierarchical Performer Identification Model Based on Symbolic Representation of Music. 1-5 - Xiaohu You, Chi Li, Jianwei Xu, Mi Zhang:
AutoGCF: Personalized Aggregation on Neural Graph Collaborative Filtering. 1-5 - Junwen Duan, Han Jiang, Ying Yu:
MHLAT: Multi-Hop Label-Wise Attention Model for Automatic ICD Coding. 1-5 - Qinglin Zhang, Chong Deng, Jiaqing Liu, Hai Yu, Qian Chen, Wen Wang, Zhijie Yan, Jinglin Liu, Yi Ren, Zhou Zhao:
Overview of the ICASSP 2023 General Meeting Understanding and Generation Challenge (MUG). 1-2 - Trevor J. Cox, Jon Barker, Will Bailey, Simone Graetzer, Michael A. Akeroyd, John F. Culling, Graham Naylor:
Overview of the 2023 ICASSP SP Clarity Challenge: Speech Enhancement for Hearing Aids. 1-2 - Christos Chatzichristos, Miguel Bhagubai, Wim Van Paesschen, Maarten De Vos:
Epilepsy Detection Grand Challenge. 1-2 - Christian Marinoni, Riccardo F. Gramaccioni, Changan Chen, Aurelio Uncini, Danilo Comminiello:
Overview of the L3DAS23 Challenge on Audio-Visual Extended Reality. 1-2 - Çagkan Yapar, Fabian Jaensch, Ron Levie, Gitta Kutyniok, Giuseppe Caire:
The First Pathloss Radio Map Prediction Challenge. 1-2 - Lies Bollens, Mohammad Jalilpour-Monesi, Bernd Accou, Jonas Vanthornhout, Hugo Van hamme, Tom Francart:
ICASSP 2023 Auditory EEG Decoding Challenge. 1-2 - Saturnino Luz, Fasih Haider, Davida Fromm, Ioulietta Lazarou, Ioannis Kompatsiaris, Brian MacWhinney:
Multilingual Alzheimer's Dementia Recognition through Spontaneous Speech: A Signal Processing Grand Challenge. 1-2 - Angelo Coluccia, Alessio Fascista, Lars Sommer, Arne Schumann, Anastasios Dimou, Dimitrios Zarpalas, Nabin Sharma:
Drone-vs-Bird Detection Grand Challenge at ICASSP2023. 1-2 - Abhayjeet Singh, Amala Nagireddi, Deekshitha G, Jesuraja Bandekar, Roopa R., Sandhya Badiger, Sathvik Udupa, Prasanta Kumar Ghosh, Hema A. Murthy, Heiga Zen, Pranaw Kumar, Kamal Kant, Amol Bole, Bira Chandra Singh, Keiichi Tokuda, Mark Hasegawa-Johnson, Philipp Olbrich:
Lightweight, Multi-Speaker, Multi-Lingual Indic Text-to-Speech. 1-2 - Athanasia Zlatintsi, Panagiotis Paraskevas Filntisis, Niki Efthymiou, Christos Garoufis, George Retsinas, Thomas Sounapoglou, Ilias Maglogiannis, Panayiotis Tsanakas, Nikolaos Smyrnis, Petros Maragos:
E-Prevention: The ICASSP-2023 Challenge on Person Identification and Relapse Detection from Continuous Recordings of Biosignals. 1-2 - Akshat Shrivastava, Suyoun Kim, Paden Tomasello, Ali Elkahky, Daniel Lazar, Trang Le, Shan Jiang, Duc Le, Aleksandr Livshits, Ahmed Aly:
ICASSP 2023 Spoken Language Understanding Grand Challenge. 1-2 - Hang Chen, Shilong Wu, Yusheng Dai, Zhe Wang, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Bao-Cai Yin, Jia Pan, Jianqing Gao, Cong Liu:
Summary on the Multimodal Information Based Speech Processing (MISP) 2022 Challenge. 1-2