ICASSP 2024: Seoul, Korea
- IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2024, Seoul, Republic of Korea, April 14-19, 2024. IEEE 2024, ISBN 979-8-3503-4485-1
- Jiwei Shen, Hu Lu, Hao Zhang, Shujing Lyu, Yue Lu:
Enhanced Deep Reinforcement Learning for Parcel Singulation in Non-Stationary Environments. 1-5 - Yaowei Li, Yating Liu, Xuxin Cheng, Zhihong Zhu, Hongxiang Li, Bang Yang, Zhiqi Huang:
KC-Prompt: End-To-End Knowledge-Complementary Prompting for Rehearsal-Free Continual Learning. 1-5 - Miao Jiang, Min Li, Junxing Ren, Weiqing Huang:
HOICS: Zero-Shot HOI Detection via Compatibility Self-Learning. 1-5 - Xuanhao Zhang, Hui Kou, Chenjie Xia, Hao Cai, Bo Liu:
Small-Footprint Automatic Speech Recognition System using Two-Stage Transfer Learning based Symmetrized Ternary Weight Network. 1-5 - Zhenjiao Liu, Xiao Wang, Xiaodi Huang, Guanlin Li, Ke Sun, Zhikui Chen:
Incomplete Multi-View Representation Learning Through Anchor Graph-Based GCN and Information Bottleneck. 1-5 - Samuel Fernández-Menduiña, Joshua Rapp, Hassan Mansour, M. Greiff, Kieran Parsons:
Tracking Beyond the Unambiguous Range with Modulo Single-Photon Lidar. 6-10 - Yhonatan Kvich, Yonina C. Eldar:
Modulo Sampling and Recovery in Shift-Invariant Spaces. 11-15 - Chaoqun Gong, Yuqin Dai, Ronghui Li, Achun Bao, Jun Li, Jian Yang, Yachao Zhang, Xiu Li:
Text2Avatar: Text to 3D Human Avatar Generation with Codebook-Driven Body Controllable Attribute. 16-20 - Tao Chen, Minxing Li, Ziming Liu:
The Joint Grid-Free DOA and Polarization Estimation Algorithm based on Atomic Norm Minimization. 21-25 - Shaolei Feng, Xiaoguang Lu, Deshana Kaushal Desai, Lei Guan:
A Learning-Based System for Automatic Intentional Non-Adherence Detection from Dosing Videos. 26-30 - Jingqing Ruan, Runpeng Xie, Xuantang Xiong, Shuang Xu, Bo Xu:
MaDE: Multi-Scale Decision Enhancement for Multi-Agent Reinforcement Learning. 31-35 - Yuanbo Wen, Tao Gao, Ziqi Li, Jing Zhang, Ting Chen:
Encoder-Minimal and Decoder-Minimal Framework for Remote Sensing Image Dehazing. 36-40 - Tao Chen, Qi An, Minxing Li:
An Error Self-Corrected DOA Estimation Model for Sparse Array Based on ANM. 41-45 - Yijia Zhang, Deepak Mishra, Hassan Habibi Gharakheili, Derrick Wing Kwan Ng:
UAV Operation Time Minimization for Wireless-Powered Data Collection. 46-50 - Christophe El Zeinaty, Glenn Herrou, Wassim Hamidouche, Daniel Ménard:
Dicetrack: Lightweight Dice Classification on Resource-Constrained Platforms with Optimized Deep Learning Models. 51-55 - Kaiyuan Hu, Hongjie Liao, Mingxiao Li, Fangxin Wang:
MMCOUNT: Stationary Crowd Counting System Based on Commodity Millimeter-Wave Radar. 56-60 - Zirui Wan, Saeid Sanei:
Crowd Modeling and Control Via Cooperative Adaptive Filtering. 61-65 - Pavlo Hilei, Marian Petruk, Ievgen Korotkyi, Oleg Farenyuk:
Deep Learning AMR Model Inference Acceleration with CFU for Edge Systems. 66-70 - Masahito Togami, Jean-Marc Valin, Karim Helwani, Ritwik Giri, Umut Isik, Michael M. Goodwin:
Real-Time Stereo Speech Enhancement with Spatial-Cue Preservation Based on Dual-Path Structure. 71-75 - Deeksha Chandola, Enas Altarawneh, Michael Jenkin, Manos Papagelis:
SERC-GCN: Speech Emotion Recognition In Conversation Using Graph Convolutional Networks. 76-80 - Tenghao Cai, Lei Li, Tsung-Hui Chang:
Sensing-Assisted Distributed User Scheduling and Beamforming in Multi-Cell mmWave Networks. 81-85 - Jiayuan Gao, Yingwei Zhang, Yiqiang Chen, Tengxiang Zhang, Boshi Tang, Xiaoyu Wang:
Unsupervised Human Activity Recognition Via Large Language Models and Iterative Evolution. 91-95 - Tao Chen, Ziming Liu, Lei Zhan:
ANM-Based Source Localization Under Mixed Field. 96-100 - Ran Wang, Jing Sun, Cheng Xu, Ruixue Li, Shihong Duan, Xiaotong Zhang:
Reinforcement Learning Compensated Filter for Multi-Agents Cooperative Localization. 101-105 - Entong He, Yuxiang Yang, Chenshu Wu:
Quantum Ranging Enhanced TDoA Localization. 106-110 - Haoyu Wang, Jinbo Chen, Dongheng Zhang, Zhi Lu, Changwei Wu, Yang Hu, Qibin Sun, Yan Chen:
Contactless Radar Heart Rate Variability Monitoring Via Deep Spatio-Temporal Modeling. 111-115 - Nikolaos Palaiodimopoulos, Vítor Fortes Rey, Matthias Tschöpe, Christina Jörg, Paul Lukowicz, Maximilian Kiefer-Emmanouilidis:
Quantum Inspired Image Augmentation Applicable to Waveguides and Optical Image Transfer Via Anderson Localization. 116-120 - Anestis Kaimakamidis, Ioannis Pitas:
Political Tweet Sentiment Analysis for Public Opinion Polling. 121-125 - Victor R. J. Deville, C. M. Lievers, Jonathan H. Manton:
Enhanced Axle-Based Vehicle Classification Using Angle-Based Micro-Doppler Signature. 126-130 - Su Fong Chien, David Chieng, Samuel Y. C. Chen, Charilaos C. Zarakovitis, Heng Siong Lim, Y. H. Xu:
Applying Hybrid Quantum LSTM for Indoor Localization Based on RSSI. 131-135 - Hengxi Zhang, Zhendong Shi, Yuanquan Hu, Wenbo Ding, Ercan E. Kuruoglu, Xiao-Ping Zhang:
Optimizing Trading Strategies in Quantitative Markets Using Multi-Agent Reinforcement Learning. 136-140 - Yan Zhang, Xin Liu, Zuping Zhang:
Motif-Matching Based Sub-Braingraph Level Networks for Noisy Resting-State fMRI Analysis. 141-145 - Judith Herrmann, Raphael Kunert, Ron Hachmon, Aviv Markus, Allison Gunby-Mann, Sarel Cohen, Tobias Friedrich, Peter Chin:
Detecting Continuous Gravitational Waves Using Generated Training Data. 146-150 - Titan Yuan, Filip Maksimovic, David C. Burnett, Kristofer S. J. Pister:
Hardware-Limited Time Constant Estimation Using a Weighted Linear Regression. 151-155 - Kunwar Pritiraj Rajput, Linlong Wu, M. R. Bhavani Shankar, Pramod K. Varshney:
Joint Transmit Precoders and Passive Reflection Beamformer Design in IRS-Aided IoT Networks. 156-160 - Zhiqiang Zhou, Linxiao Yang, Qingsong Wen, Liang Sun:
RobustTSVar: A Robust Time Series Variance Estimation Algorithm. 161-165 - Xu Wang, Dongheng Zhang, Fengquan Zhan, Xuecheng Xie, Pengcheng Huang, Yang Hu, Yan Chen:
RoFi: Robust WiFi Intrusion Detection via Distribution Matching. 166-170 - Wuxia Hu, Yang Yang, Yonina C. Eldar, Chunyan Feng, Caili Guo:
Digital Task-Oriented Communication with Hardware-Limited Task-Based Quantization. 171-175 - Shuai Yang, Dongheng Zhang, Jinbo Chen, Fang Zhou, Guanzhong Wang, Qibin Sun, Yan Chen:
Automotive Radar Interference Mitigation Via SINR Maximization. 176-180 - Keshab K. Parhi:
A Low-Latency FFT-IFFT Cascade Architecture. 181-185 - Seyed Ali Ghazi Asgar, Kaan Sel, Anando Paul, Roderic I. Pettigrew, Roozbeh Jafari:
Cuffless Blood Pressure Estimation Using Magnetic Flux In A Ring Form Factor. 186-190 - Xuantang Xiong, Linghui Meng, Jingqing Ruan, Shuang Xu, Bo Xu:
UNeC: Unsupervised Exploring In Controllable Space. 191-195 - Jia-Yu Yang, Chih-I Ho, Pei-Yun Tsai, Hung-Ju Lin, Tzung-Dau Wang:
MAML-Based 24-Hour Personalized Blood Pressure Estimation from Wrist Photoplethysmography Signals in Free-Living Context. 196-200 - Shuyi Ren, Beichen Huang, Xiaoyang Li, Kaiming Shen:
Aerial-IRS-Assisted Load Balancing In Downlink Networks. 201-205 - Yu-Min Chiu, Ching-Te Chiu, Dao-Heng Luo:
Multi-Layer Relation Knowledge Distillation For Fingerprint Restoration. 206-210 - Toivo Henningson, Stefan Ingi Adalbjörnsson, Anders Berkeman, Carl Drougge, Xavante Erickson, Alexander Hunt:
A Concept for a SLAM Back End Hardware Accelerator. 211-215 - Ganlin Zhang, Dongheng Zhang, Hongyu Deng, Yun Wu, Fengquan Zhan, Yan Chen:
Practical Challenge and Solution for IRS-Aided Indoor Localization System. 216-220 - Qu Yang, Qianhui Liu, Nan Li, Meng Ge, Zeyang Song, Haizhou Li:
SVAD: A Robust, Low-Power, and Light-Weight Voice Activity Detection with Spiking Neural Networks. 221-225 - Zeyang Song, Jibin Wu, Malu Zhang, Mike Zheng Shou, Haizhou Li:
Spiking-Leaf: A Learnable Auditory Front-End for Spiking Neural Networks. 226-230 - Zheng Si, Chao Liu, Jianyu Liu, Yinhao Zhou:
Application of SNNS Model Based On Multi-Dimensional Attention In Drone Radio Frequency Signal Classification. 231-235 - Yize Sun, Jiarui Liu, Yunpu Ma, Volker Tresp:
Differentiable Quantum Architecture Search For Job Shop Scheduling Problem. 236-240 - Peichao Wang, Qian He:
Low-Complexity GLRT Based Quickest Detection With Unknown Parameters. 241-245 - Irtaza Shahid, Khaldoon Al-Naimi, Ting Dang, Yang Liu, Fahim Kawsar, Alessandro Montanari:
Towards Enabling DPOAE Estimation on Single-Speaker Earbuds. 246-250 - Bo Han, Liangjian Han:
Efficient 3D Position Estimation in Badminton Scene. 251-255 - Kevin Wilkinghoff, Keisuke Imoto:
F1-EV score: Measuring The Likelihood of Estimating a Good Decision Threshold for Semi-Supervised Anomaly Detection. 256-260 - Xinlei Niu, Jing Zhang, Christian Walder, Charles Patrick Martin:
SoundLoCD: An Efficient Conditional Discrete Contrastive Latent Diffusion Model for Text-to-Sound Generation. 261-265 - Christopher Hahne, Michel Hayoz, Raphael Sznitman:
StofNet: Super-Resolution Time of Flight Network. 266-270 - Yiming Li, Xiangdong Wang, Hong Liu, Rui Tao, Long Yan, Kazushige Ouchi:
Semi-Supervised Sound Event Detection with Local and Global Consistency Regularization. 271-275 - Kevin Wilkinghoff:
Self-Supervised Learning for Anomalous Sound Detection. 276-280 - Yushu Wu, Xiao Quan, Mohammad Rasool Izadi, Chuan-Che Jeff Huang:
"It os Okay to be Uncommon": Quantizing Sound Event Detection Networks on Hardware Accelerators with Uncommon Sub-Byte Support. 281-285 - Shansong Liu, Atin Sakkeer Hussain, Chenshuo Sun, Ying Shan:
Music Understanding LLaMA: Advancing Text-to-Music Generation with Question Answering and Captioning. 286-290 - Heinrich Dinkel, Yongqing Wang, Zhiyong Yan, Junbo Zhang, Yujun Wang:
CED: Consistent Ensemble Distillation for Audio Tagging. 291-295 - Ali Gökçe, Hüseyin Hacihabiboglu:
Semi-Blind Estimation of Direct-to-Reverberant Energy Ratio Using Residual Energy Test Statistics. 296-300 - Haojie Wei, Xueke Cao, Wenbo Xu, Tangpeng Dan, Yueguo Chen:
DJCM: A Deep Joint Cascade Model for Singing Voice Separation and Vocal Pitch Estimation. 301-305 - Rhiannon Mogridge, George Close, Robert Sutherland, Thomas Hain, Jon Barker, Stefan Goetze, Anton Ragni:
Non-Intrusive Speech Intelligibility Prediction for Hearing-Impaired Users Using Intermediate ASR Features and Human Memory Models. 306-310 - Jiayi Zhang, Rita Singh:
Vocal Fold Dynamics for Automatic Detection of Amyotrophic Lateral Sclerosis from Voice. 311-315 - Shih-Lun Wu, Xuankai Chang, Gordon Wichern, Jee-Weon Jung, François G. Germain, Jonathan Le Roux, Shinji Watanabe:
Improving Audio Captioning Models with Fine-Grained Audio Features, Text Embedding Supervision, and LLM Mix-Up Augmentation. 316-320 - Yoshihide Tomita, Shoichi Koyama, Hiroshi Saruwatari:
Localizing Acoustic Energy in Sound Field Synthesis by Directionally Weighted Exterior Radiation Suppression. 321-325 - Jia Qi Yip, Shengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang, Hao Wang, Trung Hieu Nguyen, Kun Zhou, Dianwen Ng, Eng Siong Chng, Bin Ma:
SPGM: Prioritizing Local Features for Enhanced Speech Separation Performance. 326-330 - Mahesh Kumar Nandwana, Yifan He, Joseph Liu, Xiao Yu, Charles Shang, Eloi du Bois, Morgan McGuire, Kiran Bhat:
Voice Toxicity Detection Using Multi-Task Learning. 331-335 - Benjamin Elizalde, Soham Deshmukh, Huaming Wang:
Natural Language Supervision For General-Purpose Audio Representations. 336-340 - Yao Qiu, Jinchao Zhang, Yong Shan, Jie Zhou:
Enhancing Note-Level Singing Transcription Model with Unlabeled and Weakly Labeled Data. 341-345 - Yo Sasaki, Yasushige Nakayama:
Simultaneous Interior and Exterior Sound Field Synthesis Using Cylindrical and Spherical Loudspeaker Arrays. 346-350 - George Close, William Ravenscroft, Thomas Hain, Stefan Goetze:
Multi-CMGAN+/+: Leveraging Multi-Objective Speech Quality Metric Prediction for Speech Enhancement. 351-355 - Johannes Zeitler, Michael Krause, Meinard Müller:
Soft Dynamic Time Warping with Variable Step Weights. 356-360 - Yi-Chiao Wu, Dejan Markovic, Steven Krenn, Israel D. Gebru, Alexander Richard:
ScoreDec: A Phase-Preserving High-Fidelity Audio Codec with a Generalized Score-Based Diffusion Post-Filter. 361-365 - Ali Vosoughi, Luca Bondi, Ho-Hsiang Wu, Chenliang Xu:
Learning Audio Concepts from Counterfactual Natural Language. 366-370 - Soham Deshmukh, Benjamin Elizalde, Dimitra Emmanouilidou, Bhiksha Raj, Rita Singh, Huaming Wang:
Training Audio Captioning Models without Audio. 371-375 - Pranay Manocha, Donald Williamson, Adam Finkelstein:
Corn: Co-Trained Full- and No-Reference Speech Quality Assessment. 376-380 - Jozef Coldenhoff, Andrew Harper, Paul Kendrick, Tijana Stojkovic, Milos Cernak:
Multi-Channel MOSRA: Mean Opinion Score and Room Acoustics Estimation Using Simulated Data and A Teacher Model. 381-385 - Idan Cohen, Sharon Gannot, Ofir Lindenbaum:
Unsupervised Acoustic Scene Mapping Based on Acoustic Features and Dimensionality Reduction. 386-390 - Manuel Milling, Andreas Triantafyllopoulos, Iosif Tsangko, Simon David Noel Rampp, Björn Wolfgang Schuller:
Bringing the Discussion of Minima Sharpness to the Audio Domain: A Filter-Normalised Evaluation for Acoustic Scene Classification. 391-395 - Chih-Cheng Chang, Li Su:
Beast: Online Joint Beat and Downbeat Tracking Based on Streaming Transformer. 396-400 - Yiqun Zhang, Xinmeng Xu, Weiping Tu:
Improving Acoustic Echo Cancellation by Exploring Speech and Echo Affinity with Multi-Head Attention. 401-405 - Pavan Seshadri, Chaeyeon Han, Bon-Woo Koo, Noah Posner, Subhrajit Guhathakurta, Alexander Lerch:
ASPED: An Audio Dataset for Detecting Pedestrians. 406-410 - Yuki Okamoto, Keisuke Imoto, Shinnosuke Takamichi, Ryotaro Nagase, Takahiro Fukumori, Yoichi Yamashita:
Environmental Sound Synthesis from Vocal Imitations and Sound Event Labels. 411-415 - Mattes Ohlenbusch, Christian Rollwage, Simon Doclo:
Multi-Microphone Noise Data Augmentation for DNN-Based Own Voice Reconstruction for Hearables in Noisy Environments. 416-420 - Shulin He, Jinjiang Liu, Hao Li, Yang Yang, Fei Chen, Xueliang Zhang:
3S-TSE: Efficient Three-Stage Target Speaker Extraction for Real-Time and Low-Resource Applications. 421-425 - Yi Luo, Rongzhi Gu:
Improving Music Source Separation with SIMO Stereo Band-Split RNN. 426-430 - Yichi Wang, Jie Zhang, Shihao Chen, Weitai Zhang, Zhongyi Ye, Xinyuan Zhou, Lirong Dai:
A Study of Multichannel Spatiotemporal Features and Knowledge Distillation on Robust Target Speaker Extraction. 431-435 - Clara Borrelli, James Rae, Dogac Basaran, Matt McVicar, Mehrez Souden, Matthias Mauch:
Resource-Constrained Stereo Singing Voice Cancellation. 436-440 - Zhengding Luo, Dongyuan Shi, Xiaoyi Shen, Woon-Seng Gan:
Unsupervised Learning Based End-to-End Delayless Generative Fixed-Filter Active Noise Control. 441-445 - Younglo Lee, Shukjae Choi, Byeong-Yeol Kim, Zhongqiu Wang, Shinji Watanabe:
Boosting Unknown-Number Speaker Separation with Transformer Decoder-Based Attractor. 446-450 - Youqiang Zheng, Weiping Tu, Li Xiao, Xinmeng Xu:
Srcodec: Split-Residual Vector Quantization for Neural Speech Codec. 451-455 - Haocheng Guo, Xiaohuai Le, Kai Chen, Jing Lu:
A Light-Weight State Detection Model for Kalman-Filter-Based Acoustic Feedback Cancellation with Rapid Recovery from Abrupt Path Changes. 456-460 - Anbin Qi, Xiang Xie, Jing Wang:
Mtdiffusion: Multi-Task Diffusion Model With Dual-Unet for Foley Sound Generation. 461-465 - Chenglong Jiang, Ying Gao, Hao Jin, Linrong Pan, Wing W. Y. Ng:
Fastmandarin: Efficient Local Modeling for Natural Mandarin Speech Synthesis. 461-465 - Shrishti Saha Shetu, Soumitro Chakrabarty, Oliver Thiergart, Edwin Mabande:
Ultra Low Complexity Deep Learning Based Noise Suppression. 466-470 - Carlotta Anemüller, Oliver Thiergart, Emanuël A. P. Habets:
Binaural Rendering of Heterogeneous Sound Sources with Extent. 471-475 - Jan Büthe, Ahmed Mustafa, Jean-Marc Valin, Karim Helwani, Michael M. Goodwin:
NOLACE: Improving Low-Complexity Speech Codec Enhancement Through Adaptive Temporal Shaping. 476-480 - Wei Tsung Lu, Ju-Chiang Wang, Qiuqiang Kong, Yun-Ning Hung:
Music Source Separation With Band-Split Rope Transformer. 481-485 - Yiming Li, Xiangdong Wang, Hong Liu:
Audio-Free Prompt Tuning for Language-Audio Models. 491-495 - Pengyu Wang, Xiaofei Li:
RVAE-EM: Generative Speech Dereverberation Based On Recurrent Variational Auto-Encoder And Convolutive Transfer Function. 496-500 - Weilong Huang, Cheng Xue, Jinwei Feng, W. Bastiaan Kleijn:
A Practical Online Multichannel Dereverberation Approach with Data-Reuse Technique. 501-505 - Yile Angela Zhang, Fei Ma, Thushara D. Abhayapala, Prasanga N. Samarasinghe, Amy Bastine:
An Active Noise Control System Based On Soundfield Interpolation Using A Physics-Informed Neural Network. 506-510 - Fan Zhang, Chao Pan, Jacob Benesty, Jingdong Chen:
Directional Gain Based Noise Covariance Matrix Estimation for MVDR Beamforming. 511-515 - Soonhyeon Choi, Jung-Woo Choi:
Noisy-Arcmix: Additive Noisy Angular Margin Loss Combined With Mixup For Anomalous Sound Detection. 516-520 - Dichucheng Li, Yinghao Ma, Weixing Wei, Qiuqiang Kong, Yulun Wu, Mingjin Che, Fan Xia, Emmanouil Benetos, Wei Li:
Mertech: Instrument Playing Technique Detection Using Self-Supervised Pretrained Model with Multi-Task Finetuning. 521-525 - Haesun Joung, Kyogu Lee:
Music Auto-Tagging with Robust Music Representation Learned via Domain Adversarial Training. 526-530 - Théo Mariotte, Antonio Almudévar, Marie Tahon, Alfonso Ortega Giménez:
An Explainable Proxy Model for Multilabel Audio Segmentation. 531-535 - Jae-Won Kim, Byeongho Jo, Seungkwon Beack, Hochong Park:
Pre-Echo Reduction in Transform Audio Coding via Temporal Envelope Control with Machine Learning Based Estimation. 536-540 - Wuyang Liu, Yanzhen Ren:
Semantic Proximity Alignment: Towards Human Perception-Consistent Audio Tagging by Aligning with Label Text Description. 541-545 - Jordi Pons, Xiaoyu Liu, Santiago Pascual, Joan Serrà:
GASS: Generalizing Audio Source Separation with Large-Scale Data. 546-550 - An-Yan Chang, Jing-Tong Tzeng, Huan-Yu Chen, Chih-Wei Sung, Chun-Hsiang Huang, Edward Pei-Chuan Huang, Chi-Chun Lee:
GaP-Aug: Gamma Patch-Wise Correction Augmentation Method for Respiratory Sound Classification. 551-555 - Srikanth Burra, Asutosh Kar, Mads Græsbøll Christensen:
Conjugate Gradient Based Adaptive Algorithm for Nonlinear AEC. 556-560 - Keigo Wakayama, Tsubasa Ochiai, Marc Delcroix, Masahiro Yasuda, Shoichiro Saito, Shoko Araki, Akira Nakayama:
Online Target Sound Extraction with Knowledge Distillation from Partially Non-Causal Teacher. 561-565 - Youqiang Zheng, Weiping Tu, Li Xiao, Xinmeng Xu:
SuperCodec: A Neural Speech Codec with Selective Back-Projection Network. 566-570 - Guochen Yu, Xiguang Zheng, Nan Li, Runqiang Han, Chengshi Zheng, Chen Zhang, Chao Zhou, Qi Huang, Bing Yu:
BAE-Net: a Low Complexity and High Fidelity Bandwidth-Adaptive Neural Network for Speech Super-Resolution. 571-575 - Ron Moisseev, Gal Itzhak, Israel Cohen:
Array Geometry Optimization for Region-of-Interest Near-Field Beamforming. 576-580 - Yi Yuan, Haohe Liu, Xubo Liu, Qiushi Huang, Mark D. Plumbley, Wenwu Wang:
Retrieval-Augmented Text-to-Audio Generation. 581-585 - Liang Xu, Jing Wang, Jianqian Zhang, Xiang Xie:
LightCodec: A High Fidelity Neural Audio Codec with Low Computation Complexity. 586-590 - Zhihao Du, Shiliang Zhang, Kai Hu, Siqi Zheng:
FunCodec: A Fundamental, Reproducible and Integrable Open-Source Toolkit for Neural Speech Codec. 591-595 - Carlos Hernandez-Olivan, Koichi Saito, Naoki Murata, Chieh-Hsin Lai, Marco A. Martínez Ramírez, Wei-Hsiang Liao, Yuki Mitsufuji:
VRDMG: Vocal Restoration via Diffusion Posterior Sampling with Multiple Guidance. 596-600 - Satoru Emura:
Permutation-Alignment Method Using Manifold Optimization for Frequency-Domain Blind Source Separation. 601-605 - Zhijian Jiang, Haoming Li, Nengheng Zheng:
Two-Stage Acoustic Echo Cancellation Network with Dual-Path Alignment. 606-610 - Satvik Venkatesh, Arthur Benilov, Philip Coleman, Frederic Roskam:
Real-Time Low-Latency Music Source Separation Using Hybrid Spectrogram-TasNet. 611-615 - Amal Emthyas, Sebastià V. Amengual Garí, Enzo De Sena:
Binaural Room Transfer Function Interpolation Via System Inversion. 616-620 - Hassan Taherian, Ashutosh Pandey, Daniel Wong, Buye Xu, DeLiang Wang:
Leveraging Sound Localization to Improve Continuous Speaker Separation. 621-625 - Bunlong Lay, Jean-Marie Lemercier, Julius Richter, Timo Gerkmann:
Single and Few-Step Diffusion for Generative Speech Enhancement. 626-630 - Stefano Damiano, Luca Bondi, Shabnam Ghaffarzadegan, Andre Guntoro, Toon van Waterschoot:
Can Synthetic Data Boost the Training of Deep Acoustic Vehicle Counting Networks? 631-635 - Kazuki Shimada, Kengo Uchida, Yuichiro Koyama, Takashi Shibuya, Shusuke Takahashi, Yuki Mitsufuji, Tatsuya Kawahara:
Zero- and Few-Shot Sound Event Localization and Detection. 636-640 - Xiaoli Tang, Jihui Aimee Zhang, Thushara D. Abhayapala:
Active Noise Control Over A Large Region with Multiple Spherical Microphone Arrays In Wave Domain. 641-645 - Mayuka Kono, Yutaro Hirao, Monica Perusquía-Hernández, Naoya Isoyama, Hideaki Uchiyama, Nobuchika Sakata, Jun Takamatsu, Kiyoshi Kiyokawa:
U2R: Underwater Ultrasonic Reflection Wave Dataset Toward Pose-Invariant Material Recognition. 646-650 - Guendalina Milano, Oliver Thiergart, Emanuël A. P. Habets:
Sector-Based Interference Cancellation for Robust Keyword Spotting Applications Using an Informed MPDR Beamformer. 651-655 - Huaying Xue, Xiulian Peng, Yan Lu:
Low-Latency Speech Enhancement via Speech Token Generation. 661-665 - Bar Shaybet, Anurag Kumar, Vladimir Tourbabin, Boaz Rafaely:
Ambisonics Networks - The Effect of Radial Functions Regularization. 666-670 - Matthew C. McCallum, Matthew E. P. Davies, Florian Henkel, Jaehun Kim, Samuel E. Sandberg:
On The Effect Of Data-Augmentation On Local Embedding Properties In The Contrastive Learning Of Music Audio Representations. 671-675 - Jonah Casebeer, Junkai Wu, Paris Smaragdis:
Meta-AF Echo Cancellation for Improved Keyword Spotting. 676-680 - Vikas Tokala, Eric Grinstein, Mike Brookes, Simon Doclo, Jesper Jensen, Patrick A. Naylor:
Binaural Speech Enhancement Using Deep Complex Convolutional Transformer Networks. 681-685 - Matthew C. McCallum, Florian Henkel, Jaehun Kim, Samuel E. Sandberg, Matthew E. P. Davies:
Similar but Faster: Manipulation of Tempo in Music Audio Embeddings for Tempo Prediction and Search. 686-690 - Gyuhak Kim, Ho-Hsiang Wu, Luca Bondi, Bing Liu:
Multi-Modal Continual Pre-Training For Audio Encoders. 691-695 - Babak Naderi, Ross Cutler, Nicolae-Catalin Ristea:
Multi-Dimensional Speech Quality Assessment in Crowdsourcing. 696-700 - Mikko Heikkinen, Archontis Politis, Tuomas Virtanen:
Neural Ambisonics Encoding For Compact Irregular Microphone Arrays. 701-705 - María Alfaro-Contreras, Antonio Ríos-Vila, Jose J. Valero-Mas, Jorge Calvo-Zaragoza:
A Transformer Approach for Polyphonic Audio-to-Score Transcription. 706-710 - Hao Zhang, Yixuan Zhang, Meng Yu, Dong Yu:
Advancing Acoustic Howling Suppression Through Recursive Training of Neural Networks. 711-715 - Yuanbo Hou, Qiaoqiao Ren, Siyang Song, Yuxin Song, Wenwu Wang, Dick Botteldooren:
Multi-Level Graph Learning For Audio Event Classification And Human-Perceived Annoyance Rating Prediction. 716-720 - Cong Han, Kevin W. Wilson, Scott Wisdom, John R. Hershey:
Unsupervised Multi-Channel Separation And Adaptation. 721-725 - Riku Arakawa, Mathieu Parvaix, Chiong Lai, Hakan Erdogan, Alex Olwal:
Quantifying The Effect Of Simulator-Based Data Augmentation For Speech Recognition On Augmented Reality Glasses. 726-730 - Daniel Fejgin, Elior Hadad, Sharon Gannot, Zbynek Koldovský, Simon Doclo:
Comparison Of Frequency-Fusion Mechanisms For Binaural Direction-Of-Arrival Estimation For Multiple Speakers. 731-735 - Jens Heitkaemper, Arun Narayanan, Turaj Zakizadeh Shabestary, Sankaran Panchapagesan, James Walker, Bhalchandra Gajare, Shlomi Regev, Ajay Dudani, Alexander Gruenstein:
Improving Acoustic Echo Cancellation for Voice Assistants Using Neural Echo Suppression and Multi-Microphone Noise Reduction. 736-740 - Ke Chen, Jiaqi Su, Zeyu Jin:
MDX-GAN: Enhancing Perceptual Quality in Multi-Class Source Separation Via Adversarial Training. 741-745 - Karn N. Watcharasupat, Alexander Lerch:
Quantifying Spatial Audio Quality Impairment. 746-750 - Ravi Shankar, Ke Tan, Buye Xu, Anurag Kumar:
A Closer Look at Wav2vec2 Embeddings for On-Device Single-Channel Speech Enhancement. 751-755 - Kunxing Lu, Xianrui Wang, Tetsuya Ueda, Shoji Makino, Jingdong Chen:
A Computationally Efficient Semi-Blind Source Separation Approach for Nonlinear Echo Cancellation Based on an Element-Wise Iterative Source Steering. 756-760 - Luca Della Libera, Cem Subakan, Mirco Ravanelli, Samuele Cornell, Frédéric Lepoutre, François Grondin:
Resource-Efficient Separation Transformer. 761-765 - Yu Du, Xu Liu, Yansong Chua:
Spiking Structured State Space Model for Monaural Speech Enhancement. 766-770 - Jinhua Liang, Huy Phan, Emmanouil Benetos:
Learning from Taxonomy: Multi-Label Few-Shot Classification for Everyday Sound Recognition. 771-775 - Xudong Zhao, Xueqin Luo, Gongping Huang, Jingdong Chen, Jacob Benesty:
Differential Beamforming with Null Constraints for Spherical Microphone Arrays. 776-780 - Yang Xiang, Jingguang Tian, Xinhui Hu, Xinkang Xu, Zhaohui Yin:
A Deep Representation Learning-Based Speech Enhancement Method Using Complex Convolution Recurrent Variational Autoencoder. 781-785 - Yichen Yang, Haowen Li, Xianrui Wang, Wen Zhang, Shoji Makino, Jingdong Chen:
Stereophonic Music Source Separation with Spatially-Informed Bridging Band-Split Network. 786-790 - Mahmoud Namazi, Kenneth Rose:
Ultra-Low Delay Lossless Compression of Higher Order Ambisonics. 791-795 - Gaël Le Lan, Varun Nagaraja, Ernie Chang, David Kant, Zhaoheng Ni, Yangyang Shi, Forrest N. Iandola, Vikas Chandra:
Stack-and-Delay: A New Codebook Pattern for Music Generation. 796-800 - Jayeon Yi, Junghyun Koo, Kyogu Lee:
DDD: A Perceptually Superior Low-Response-Time DNN-Based Declipper. 801-805 - Li Li, Shogo Seki:
Remixed2remixed: Domain Adaptation for Speech Enhancement by Noise2noise Learning with Remixing. 806-810 - Wei-Yang Lin, Yu-Chiang Frank Wang, Li Su:
Enhancing Violin Fingering Generation through Audio-Symbolic Fusion. 811-815 - Lior Arbel, Ishwarya Ananthabhotla, Zamir Ben-Hur, David Lou Alon, Boaz Rafaely:
On HRTF Notch Frequency Prediction using Anthropometric Features and Neural Networks. 816-820 - Yiqiang Cai, Peihong Zhang, Shengchen Li:
TF-SepNet: An Efficient 1D Kernel Design in CNNs for Low-Complexity Acoustic Scene Classification. 821-825 - Seungheon Doh, Minhee Lee, Dasaem Jeong, Juhan Nam:
Enriching Music Descriptions with A Finetuned-LLM and Metadata for Text-to-Music Retrieval. 826-830 - Ryandhimas E. Zezario, Bo-Ren Brian Bai, Chiou-Shann Fuh, Hsin-Min Wang, Yu Tsao:
Multi-Task Pseudo-Label Learning for Non-Intrusive Speech Quality Assessment Model. 831-835 - Matteo Torcoli, Chih-Wei Wu, Sascha Dick, Phillip A. Williams, Mhd Modar Halimeh, William Wolcott, Emanuël A. P. Habets:
ODAQ: Open Dataset of Audio Quality. 836-840 - Pil Moo Byun, Joon-Hyuk Chang:
Generalized SpecAugment via Multi-Rectangle Inverse Masking For Acoustic Scene Classification. 841-845 - Tal Peer, Simon Welker, Johannes Kolhoff, Timo Gerkmann:
A Flexible Online Framework for Projection-Based STFT Phase Retrieval. 846-850 - Hanyue Liu, Miao Liu, Jing Wang, Xiang Xie, Lidong Yang:
Non-Intrusive Speech Quality Assessment with Multi-Task Learning Based on Tensor Network. 851-855 - Côme Peladeau, Geoffroy Peeters:
Blind Estimation of Audio Effects Using an Auto-Encoder Approach and Differentiable Digital Signal Processing. 856-860 - Zixing Zhang, Tao Pang, Jing Han, Björn W. Schuller:
Intelligent Cardiac Auscultation for Murmur Detection via Parallel-Attentive Models with Uncertainty Estimation. 861-865 - Zeyu Xie, Baihan Li, Xuenan Xu, Mengyue Wu, Kai Yu:
Enhancing Audio Generation Diversity with Visual Information. 866-870 - Kazuki Matsumoto, Kohei Yatabe:
Determined BSS by Combination of IVA and DNN via Proximal Average. 871-875 - Wangjin Zhou, Zhengdong Yang, Chenhui Chu, Sheng Li, Raj Dabre, Yi Zhao, Tatsuya Kawahara:
MOS-FAD: Improving Fake Audio Detection Via Automatic Mean Opinion Score Prediction. 876-880 - Huy Phan, Byeonggeun Kim, Vu Nguyen, Andrew Bydlon, Qingming Tang, Chieh-Chi Kao, Chao Wang:
Cross-Triggering Issue in Audio Event Detection and Mitigation. 881-885 - Alireza Nezamdoust, Mario Huemer, Aurelio Uncini, Danilo Comminiello:
Efficient Functional Link Adaptive Filters Based On Nearest Kronecker Product Decomposition. 886-890 - Stepan Shishkin, Danilo Hollosi, Stefan Goetze, Simon Doclo:
Active Learning for Sound Event Classification Using Bayesian Neural Networks with Gaussian Variational Posterior. 896-900 - Aolin Hu, Xueshuai Zhang, Shaoxing Zhang, Pengyuan Zhang, Yu Lu, Pengfei Ye, Qingwei Zhao, Yonghong Yan:
Snore Sound Features Based on Percussive Enhancing and Positional Encoding Combined with Multi-Task Learning for OSAHS Detection. 901-905 - Nikolay D. Gaubitch, David Looney:
On The Role of Room Acoustics in Audio Presentation Attack Detection. 906-910 - Nian Shao, Xian Li, Xiaofei Li:
Fine-Tune the Pretrained ATST Model for Sound Event Detection. 911-915 - Manjunath Mulimani, Annamaria Mesaros:
Class-Incremental Learning for Multi-Label Audio Classification. 916-920 - David Sundström, Filip Elvander, Andreas Jakobsson:
Estimation of Impulse Responses for a Moving Source Using Optimal Transport Regularization. 921-925 - Yiyuan Yang, Kaichen Zhou, Niki Trigoni, Andrew Markham:
SSL-Net: A Synergistic Spectral and Learning-Based Network for Efficient Bird Sound Classification. 926-930 - Jiakun Shen, Xueshuai Zhang, Pengyuan Zhang, Yonghong Yan, Qingwei Zhao, Ta Li, Yanfen Tang, Shaoxing Zhang:
One-Epoch Training with Single Test Sample in Test Time for Better Generalization of Cough-Based COVID-19 Detection Model. 931-935 - Marco Comunità, Riccardo F. Gramaccioni, Emilian Postolache, Emanuele Rodolà, Danilo Comminiello, Joshua D. Reiss:
Syncfusion: Multimodal Onset-Synchronized Video-to-Audio Foley Synthesis. 936-940 - Zhiwei Lin, Jun Chen, Boshi Tang, Binzhu Sha, Jing Yang, Yaolong Ju, Fan Fan, Shiyin Kang, Zhiyong Wu, Helen Meng:
Multi-View Midivae: Fusing Track- and Bar-View Representations for Long Multi-Track Symbolic Music Generation. 941-945 - Gaël Richard, Pierre Chouteau, Bernardo Torres:
A Fully Differentiable Model for Unsupervised Singing Voice Separation. 946-950 - Manvi Agarwal, Changhong Wang, Gaël Richard:
Structure-Informed Positional Encoding for Music Generation. 951-955 - Antonin Gagneré, Slim Essid, Geoffroy Peeters:
Adapting Pitch-Based Self Supervised Learning Models for Tempo Estimation. 956-960 - Yuanyuan Wang, Hangting Chen, Dongchao Yang, Jianwei Yu, Chao Weng, Zhiyong Wu, Helen Meng:
Consistent and Relevant: Rethink the Query Embedding in General Sound Separation. 961-965 - Yuankun Xie, Haonan Cheng, Yutian Wang, Long Ye:
An Efficient Temporary Deepfake Location Approach Based Embeddings for Partially Spoofed Audio Detection. 966-970 - Xiaobin Rong, Tianchi Sun, Xu Zhang, Yuxiang Hu, Changbao Zhu, Jing Lu:
GTCRN: A Speech Enhancement Model Requiring Ultralow Computational Resources. 971-975 - Aurian Quelennec, Michel Olvera, Geoffroy Peeters, Slim Essid:
On The Choice of the Optimal Temporal Support for Audio Classification with Pre-Trained Embeddings. 976-980 - Rong Xie, Anqi Tu, Chuang Shi, Stephen Elliott, Huiyong Li, Le Zhang:
Cognitive Virtual Sensing Technique for Feedforward Active Noise Control. 981-985 - Teysir Baoueb, Haocheng Liu, Mathieu Fontaine, Jonathan Le Roux, Gaël Richard:
SpecDiff-GAN: A Spectrally-Shaped Noise Diffusion GAN for Speech and Music Synthesis. 986-990 - Inseon Jang, Haici Yang, Wootaek Lim, Seungkwon Beack, Minje Kim:
Personalized Neural Speech Codec. 991-995 - Yuliang Zhang, Roberto Togneri, David Huang:
A Unified Loss Function to Tackle Inter-Class and Intra-Class Data Imbalance in Sound Event Detection. 996-1000 - Qiquan Zhang, Meng Ge, Hongxu Zhu, Eliathamby Ambikairajah, Qi Song, Zhaoheng Ni, Haizhou Li:
An Empirical Study on the Impact of Positional Encoding in Transformer-Based Monaural Speech Enhancement. 1001-1005 - Vasudha Sathyapriyan, Michael Syskind Pedersen, Mike Brookes, Jan Østergaard, Patrick A. Naylor, Jesper Jensen:
Speech Enhancement in Hearing Aids Using Target Speech Presence Estimation Based on a Delayed Remote Microphone Signal. 1006-1010 - Alessandro Ragano, Jan Skoglund, Andrew Hines:
NOMAD: Unsupervised Learning of Perceptual Embeddings For Speech Enhancement and Non-Matching Reference Audio Quality Assessment. 1011-1015 - Yoshiki Masuyama, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
NIIRF: Neural IIR Filter Field for HRTF Upsampling and Personalization. 1016-1020 - Yadong Guan, Jiqing Han, Hongwei Song, Wenjie Song, Guibin Zheng, Tieran Zheng, Yongjun He:
Contrastive Loss Based Frame-Wise Feature Disentanglement for Polyphonic Sound Event Detection. 1021-1025 - Shiqi Zhang, Zheng Qiu, Daiki Takeuchi, Noboru Harada, Shoji Makino:
Unrestricted Global Phase Bias-Aware Single-Channel Speech Enhancement with Conformer-Based Metric GAN. 1026-1030 - Mihailo Kolundzija, Mathew Shaji Kavalekalam, Ivana Balic, Michelle Mao, Raúl Casas:
Low Bitrate Loss Resilience Scheme for a Speech Enhancing Neural Codec. 1031-1035 - Yin-Jyun Luo, Sebastian Ewert, Simon Dixon:
Unsupervised Pitch-Timbre Disentanglement of Musical Instruments Using a Jacobian Disentangled Sequential Autoencoder. 1036-1040 - Shota Okubo, Toshiharu Horiuchi:
Three-Dimensional Sound Wave Propagation Reproduction by CE-FDTD Simulation Applying Actual Radiation Characteristics. 1041-1045 - Zhiheng Wang, Hongsen He, Jingdong Chen, Jacob Benesty, Yi Yu:
A Steered Response Power Approach with Bilinear Prediction-Based Trade-Off Prewhitening for Speaker Localization. 1046-1050 - Xavier Riley, Drew Edwards, Simon Dixon:
High Resolution Guitar Transcription Via Domain Adaptation. 1051-1055 - Tong Xiao, Simon Doclo:
Effect of Target Signals and Delays on Spatially Selective Active Noise Control for Open-Fitting Hearables. 1056-1060 - Tony Alex, Sara Ahmed, Armin Mustafa, Muhammad Awais, Philip J. B. Jackson:
Max-AST: Combining Convolution, Local and Global Self-Attentions for Audio Event Classification. 1061-1065 - Shuhua Liu, Chunyu Zhang, Binshuai Li, Niantong Qin, Huanting Cheng, Huayu Zhang:
TIA: A Teaching Intonation Assessment Dataset in Real Teaching Situations. 1066-1070 - Shuai Yu, Jun Liu, Yi Yu, Wei Li:
A Scalable Sparse Transformer Model for Singing Melody Extraction. 1071-1075 - Haohe Liu, Ke Chen, Qiao Tian, Wenwu Wang, Mark D. Plumbley:
AudioSR: Versatile Audio Super-Resolution at Scale. 1076-1080 - Manos Plitsis, Theodoros Kouzelis, Georgios Paraskevopoulos, Vassilis Katsouros, Yannis Panagakis:
Investigating Personalization Methods in Text to Music Generation. 1081-1085 - Akshay Raina, Sayeedul Islam Sheikh, Vipul Arora:
Learning Ontology Informed Representations with Constraints for Acoustic Event Detection. 1086-1090 - Xuenan Xu, Xiaohang Xu, Zeyu Xie, Pingyue Zhang, Mengyue Wu, Kai Yu:
A Detailed Audio-Text Data Simulation Pipeline Using Single-Event Sounds. 1091-1095 - Francesca Ronchini, Romain Serizel:
Performance and Energy Balance: A Comprehensive Study of State-of-the-Art Sound Event Detection Systems. 1096-1100 - Anselm Lohmann, Toon van Waterschoot, Jörg Bitzer, Simon Doclo:
Microphone Subset Selection for the Weighted Prediction Error Algorithm Using a Group Sparsity Penalty. 1101-1105 - Nils Marggraf-Turley, Michael Lovedee-Turner, Enzo De Sena:
HRTF Recommendation Based on the Predicted Binaural Colouration Model. 1106-1110 - Xingjian Du, Pei Zou, Mingyu Liu, Xia Liang, Minghang Chu, Bilei Zhu:
ByteHum: Fast and Accurate Query-by-Humming in the Wild. 1111-1115 - Yingxue Gao, Huan Zhao, Zixing Zhang:
Adaptive Speech Emotion Representation Learning Based On Dynamic Graph. 11116-11120 - Julian D. Parker, Janne Spijkervet, Katerina Kosta, Furkan Yesiler, Boris Kuznetsov, Ju-Chiang Wang, Matt Avent, Jitong Chen, Duc Le:
STEMGEN: A Music Generation Model That Listens. 1116-1120 - Christoph Hold, Leo McCormack, Archontis Politis, Ville Pulkki:
Perceptually-Motivated Spatial Audio Codec for Higher-Order Ambisonics Compression. 1121-1125 - Xingjian Du, Zhesong Yu, Jiaju Lin, Bilei Zhu, Qiuqiang Kong:
Joint Music and Language Attention Models for Zero-Shot Music Tagging. 1126-1130 - Zhongweiyang Xu, Yong Xu, Vinay Kothapally, Heming Wang, Muqiao Yang, Dong Yu:
SPATIALCODEC: Neural Spatial Speech Coding. 1131-1135 - Jianwei Yu, Hangting Chen, Yanyao Bian, Xiang Li, Yi Luo, Jinchuan Tian, Mengyang Liu, Jiayi Jiang, Shuai Wang:
AutoPrep: An Automatic Preprocessing Framework for In-The-Wild Speech Data. 1136-1140 - Gabriel Meseguer-Brocal, Dorian Desblancs, Romain Hennequin:
An Experimental Comparison of Multi-View Self-Supervised Methods for Music Tagging. 1141-1145 - Giuseppe Concialdi, Alkis Koudounas, Eliana Pastor, Barbara Di Eugenio, Elena Baralis:
Ainur: Harmonizing Speed and Quality in Deep Music Generation Through Lyrics-Audio Embeddings. 1146-1150 - Tomoki Ariga, Yosuke Higuchi, Kazutoshi Hayasaka, Naoki Okamoto, Tetsuji Ogawa:
Parody Detection Using Source-Target Attention with Teacher-Forced Lyrics. 1151-1155 - Dimitrios Bralios, Gordon Wichern, François G. Germain, Zexu Pan, Sameer Khurana, Chiori Hori, Jonathan Le Roux:
Generation or Replication: Auscultating Audio Latent Diffusion Models. 1156-1160 - Sreyan Ghosh, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramani Duraiswami, Dinesh Manocha:
RECAP: Retrieval-Augmented Audio Captioning. 1161-1165 - Marco Pasini, Maarten Grachten, Stefan Lattner:
Bass Accompaniment Generation Via Latent Diffusion. 1166-1170 - Harshvardhan C. Takawale, Nirupam Roy:
Learning Speaker-Listener Mutual Head Orientation by Leveraging HRTF and Voice Directivity on Headphones. 1171-1175 - Bernardo Torres, Geoffroy Peeters, Gaël Richard:
Unsupervised Harmonic Parameter Estimation Using Differentiable DSP and Spectral Optimal Transport. 1176-1180 - Arvind Krishna Sridhar, Yinyi Guo, Erik Visser, Rehana Mahfuz:
Parameter Efficient Audio Captioning with Faithful Guidance Using Audio-Text Shared Latent Representation. 1181-1185 - Dennis Fedorishin, Livio Forte, Philip Schneider, Srirangaraj Setlur, Venu Govindaraju:
Fine-Grained Engine Fault Sound Event Detection Using Multimodal Signals. 1186-1190 - Darius Petermann, Minje Kim:
Hyperbolic Distance-Based Speech Separation. 1191-1195 - Jiarui Hai, Helin Wang, Dongchao Yang, Karan Thakkar, Najim Dehak, Mounya Elhilali:
DPM-TSE: A Diffusion Probabilistic Model for Target Sound Extraction. 1196-1200 - Yang Yang, George Sung, Shao-Fu Shih, Hakan Erdogan, Chehung Lee, Matthias Grundmann:
Binaural Angular Separation Network. 1201-1205 - Ke Chen, Yusong Wu, Haohe Liu, Marianna Nezhurina, Taylor Berg-Kirkpatrick, Shlomo Dubnov:
MusicLDM: Enhancing Novelty in text-to-music Generation Using Beat-Synchronous mixup Strategies. 1206-1210 - Alexander Gebhard, Andreas Triantafyllopoulos, Teresa Bez, Lukas Christ, Alexander Kathan, Björn W. Schuller:
Exploring Meta Information for Audio-Based Zero-Shot Bird Classification. 1211-1215 - Minje Kim, Trausti Kristjansson:
Scalable and Efficient Speech Enhancement Using Modified Cold Diffusion: A Residual Learning Approach. 1216-1220 - Irán R. Román, Christopher Ick, Sivan Ding, Adrian S. Roman, Brian McFee, Juan Pablo Bello:
Spatial Scaper: A Library to Simulate and Augment Soundscapes for Sound Event Localization and Detection in Realistic Rooms. 1221-1225 - Minz Won, Yun-Ning Hung, Duc Le:
A Foundation Model for Music Informatics. 1226-1230 - Huiyuan Sun, Howe Yuan Zhu, Minh T. D. Nguyen, Vincent Nguyen, Chin-Teng Lin, Craig T. Jin:
From RIR to BRIR: A Sparse Recovery Beamforming Approach for Virtual Binaural Sound Rendering. 1231-1235 - Huiyuan Sun, Craig T. Jin, Thushara D. Abhayapala, Prasanga N. Samarasinghe:
Active Noise Control Over 3D Space with A Dynamic Noise Source. 1236-1240 - Karan Thakkar, Jiarui Hai, Mounya Elhilali:
Investigating Self-Supervised Deep Representations for EEG-Based Auditory Attention Decoding. 1241-1245 - Seungmin Shin, Joon Byun, Jongmo Sung, Seungkwon Beack, Youngcheol Park:
Quantization Noise Masking in Perceptual Neural Audio Coder. 1246-1250 - Haici Yang, Inseon Jang, Minje Kim:
Generative De-Quantization for Neural Speech Codec Via Latent Diffusion. 1251-1255 - Ruimin Wu, Xianke Wang, Yuqing Li, Wei Xu, Wenqing Cheng:
Piano Transcription with Harmonic Attention. 1256-1260 - Xi Liu, Szu-Jui Chen, John H. L. Hansen:
Dual-Path Minimum-Phase and All-Pass Decomposition Network for Single Channel Speech Dereverberation. 1261-1265 - Yucong Zhang, Juan Liu, Yao Tian, Haifeng Liu, Ming Li:
A Dual-Path Framework with Frequency-and-Time Excited Network for Anomalous Sound Detection. 1266-1270 - Hejing Zhang, Qiaoxi Zhu, Jian Guan, Haohe Liu, Feiyang Xiao, Jiantong Tian, Xinhao Mei, Xubo Liu, Wenwu Wang:
First-Shot Unsupervised Anomalous Sound Detection with Unknown Anomalies Estimated by Metadata-Assisted Audio Generation. 1271-1275 - Weinan Tong, Jiaxu Zhu, Jun Chen, Shiyin Kang, Tao Jiang, Yang Li, Zhiyong Wu, Helen Meng:
SCNet: Sparse Compression Network for Music Source Separation. 1276-1280 - Xilin Jiang, Cong Han, Yinghao Aaron Li, Nima Mesgarani:
Exploring Self-supervised Contrastive Learning of Spatial Sound Event Representation. 1281-1285 - Yongyi Zang, Yi Zhong, Frank Cwitkowitz, Zhiyao Duan:
SynthTab: Leveraging Synthesized Data for Guitar Tablature Transcription. 1286-1290 - Frank Cwitkowitz, Kin Wai Cheuk, Woosung Choi, Marco A. Martínez Ramírez, Keisuke Toyama, Wei-Hsiang Liao, Yuki Mitsufuji:
Timbre-Trap: A Low-Resource Framework for Instrument-Agnostic Music Transcription. 1291-1295 - Kahyun Choi, Minje Kim:
A Comparative Analysis of Poetry Reading Audio: Singing, Narrating, or Somewhere in Between? 1296-1300 - Donghang Wu, Xihong Wu, Tianshu Qu:
A Hybrid Deep-Online Learning Based Method for Active Noise Control in Wave Domain. 1301-1305 - Enis Berk Çoban, Megan Perra, Michael I. Mandel:
Towards High Resolution Weather Monitoring With Sound Data. 1306-1310 - Longling Zhang, Lyqi Liu, Dan Meng, Jun Wang, Shengshan Hu:
Stealthy Backdoor Attack Towards Federated Automatic Speaker Verification. 1311-1315 - David Robinson, Adelaide Robinson, Lily Akrapongpisak:
Transferable Models for Bioacoustics with Human Language Supervision. 1316-1320 - Adrian S. Roman, Irán R. Román, Juan Pablo Bello:
Robust DoA Estimation from Deep Acoustic Imaging. 1321-1325 - Bing Han, Zhiqiang Lv, Anbai Jiang, Wen Huang, Zhengyang Chen, Yufeng Deng, Jiawei Ding, Cheng Lu, Wei-Qiang Zhang, Pingyi Fan, Jia Liu, Yanmin Qian:
Exploring Large Scale Pre-Trained Models for Robust Machine Anomalous Sound Detection. 1326-1330 - Azalea Gui, Hannes Gamper, Sebastian Braun, Dimitra Emmanouilidou:
Adapting Frechet Audio Distance for Generative Music Evaluation. 1331-1335 - Shaoheng Xu, Jihui Aimee Zhang, Thushara D. Abhayapala, Amy Bastine, Wei-Ting Lai, Prasanga N. Samarasinghe:
Sparse Sound Field Representation Using Complex Orthogonal Matching Pursuit. 1336-1340 - Chunxi Wang, Maoshen Jia, Meiran Li, Changchun Bao, Wenyu Jin:
Attention Is All You Need For Blind Room Volume Estimation. 1341-1345 - Chengbo Chang, Ziye Yang, Jie Chen:
Plug-and-Play MVDR Beamforming for Speech Separation. 1346-1350 - Sankha Subhra Bhattacharjee, Srikanth Burra, Jesper Rindom Jensen, Liming Shi, Guoli Ping, Jingkai Weng, Mads Græsbøll Christensen:
Broadband Personal Sound Zone Control in the Presence of Nonlinearities. 1351-1355 - Florian Henkel, Jaehun Kim, Matthew C. McCallum, Samuel E. Sandberg, Matthew E. P. Davies:
Tempo Estimation as Fully Self-Supervised Binary Classification. 1356-1360 - Yun Liang, Hai Lin, Shaojian Qiu, Yihang Zhang:
AAT: Adapting Audio Transformer for Various Acoustics Recognition Tasks. 1361-1365 - Jun-You Wang, Chung-Che Wang, Chon-In Leong, Jyh-Shing Roger Jang:
MIR-MLPop: A Multilingual Pop Music Dataset with Time-Aligned Lyrics and Audio. 1366-1370 - Jiyun Park, Sangeon Yong, Taegyun Kwon, Juhan Nam:
A Real-Time Lyrics Alignment System Using Chroma and Phonetic Features for Classical Vocal Performance. 1371-1375 - Yurii Iotov, Sidsel Marie Nørholm, Peter John McCutcheon, Mads Græsbøll Christensen:
Improving Speech Attenuation in Headphones using Harmonic Model Decomposition and Multiple-Frequency ANC. 1376-1380 - Zizheng Zhang, Chen Chen, Hsin-Hung Chen, Xiang Liu, Yuchen Hu, Eng Siong Chng:
Noise-Aware Speech Separation with Contrastive Learning. 1381-1385 - Ernst Seidel, Pejman Mowlaee, Tim Fingscheidt:
Efficient High-Performance Bark-Scale Neural Network for Residual Echo and Noise Suppression. 1386-1390 - Ryosuke Tanaka, Satoshi Tamura:
Few-Shot Anomalous Sound Detection Based on Anomaly Map Estimation Using Pseudo Abnormal Data. 1391-1395 - Dail Kim, Min-Sang Baek, Yungyeo Kim, Joon-Hyuk Chang:
Improving Target Sound Extraction with Timestamp Knowledge Distillation. 1396-1400 - Donghyun Kim, Yungyeo Kim, Joon-Hyuk Chang:
CLASS: Continual Learning Approach for Speech Super-Resolution. 1401-1405 - Chenglong Wang, Jiayi He, Jiangyan Yi, Jianhua Tao, Chu Yuan Zhang, Xiaohui Zhang:
Multi-Scale Permutation Entropy for Audio Deepfake Detection. 1406-1410 - Masahiro Yasuda, Shoichiro Saito, Akira Nakayama, Noboru Harada:
6DoF SELD: Sound Event Localization and Detection Using Microphones and Motion Tracking Sensors on Self-Motioning Human. 1411-1415 - Jiu Feng, Mehmet Hamza Erol, Joon Son Chung, Arda Senocak:
From Coarse to Fine: Efficient Training for Audio Spectrogram Transformers. 1416-1420 - Santiago Cuervo, Ricard Marxer:
Speech Foundation Models on Intelligibility Prediction for Hearing-Impaired Listeners. 1421-1425 - June-Woo Kim, Sangmin Bae, Won-Yang Cho, Byungjo Lee, Ho-Young Jung:
Stethoscope-Guided Supervised Contrastive Learning for Cross-Domain Adaptation on Respiratory Sound Classification. 1431-1435 - Laura Lechler, Kamil Wójcicki:
Crowdsourced Multilingual Speech Intelligibility Testing. 1441-1445 - Yuzhuo Liu, Xubo Liu, Yan Zhao, Yuanyuan Wang, Rui Xia, Pingchuan Tain, Yuxuan Wang:
Audio Prompt Tuning for Universal Sound Separation. 1446-1450 - Yuto Kondo, Hirokazu Kameoka, Kou Tanaka, Takuhiro Kaneko:
Selecting N-Lowest Scores for Training MOS Prediction Models. 1451-1455 - Tatsuya Komatsu, Yusuke Fujita, Kazuya Takeda, Tomoki Toda:
Audio Difference Learning for Audio Captioning. 1456-1460 - Yanjue Song, Nilesh Madhu:
Phase Reconstruction in Single Channel Speech Enhancement Based on Phase Gradients and Estimated Clean-Speech Amplitudes. 1461-1465 - Vitjan Zavrtanik, Matija Marolt, Matej Kristan, Danijel Skocaj:
Anomalous Sound Detection by Feature-Level Anomaly Simulation. 1466-1470 - Xingda Li, Fan Zhuo, Dan Luo, Jun Chen, Shiyin Kang, Zhiyong Wu, Tao Jiang, Yang Li, Han Fang, Yahui Zhou:
Generating Stereophonic Music with Single-Stage Language Models. 1471-1475 - Federico Miotello, Luca Comanducci, Mirco Pezzoli, Alberto Bernardini, Fabio Antonacci, Augusto Sarti:
Reconstruction of Sound Field Through Diffusion Models. 1476-1480 - Tianchi Sun, Tong Lei, Xu Zhang, Yuxiang Hu, Changbao Zhu, Jing Lu:
A Lightweight Hybrid Multi-Channel Speech Extraction System with Directional Voice Activity Detection. 1486-1490 - Jin Woo Lee, Min Jun Choi, Kyogu Lee:
String Sound Synthesizer On GPU-Accelerated Finite Difference Scheme. 1491-1495 - Ying Hu, Haitao Xu, Zhongcun Guo, Hao Huang, Liang He:
SMMA-Net: An Audio Clue-Based Target Speaker Extraction Network with Spectrogram Matching and Mutual Attention. 1496-1500 - Jiuqiang Li, Zheng Wang, Shilei Zhu:
Mixed Informed Transformer for Few-Shot Medical Image Segmentation. 1501-1505 - Zhihao Yu, Chaohe Zhang, Yasha Wang, Wen Tang, Jiangtao Wang, Liantao Ma:
Predict and Interpret Health Risk Using EHR Through Typical Patients. 1506-1510 - Yee-Fan Tan, Junn Yong Loo, Chee-Ming Ting, Fuad Noman, Raphaël C.-W. Phan, Hernando Ombao:
BrainFC-CGAN: A Conditional Generative Adversarial Network for Brain Functional Connectivity Augmentation and Aging Synthesis. 1511-1515 - Xuechen Guo, Wenhao Hu, Chiming Ni, Wenhao Chai, Shiyan Li, Gaoang Wang:
Blind Inpainting with Object-Aware Discrimination for Artificial Marker Removal. 1516-1520 - Minheng Chen, Zhirun Zhang, Shuheng Gu, Youyong Kong:
Embedded Feature Similarity Optimization with Specific Parameter Initialization for 2D/3D Medical Image Registration. 1521-1525 - Xinzhe Zheng, Sijie Ji, Chenshu Wu:
Predicting Adverse Events for Patients with Type-1 Diabetes Via Self-Supervised Learning. 1526-1530 - Siteng Ma, Haochang Wu, Aonghus Lawlor, Ruihai Dong:
Breaking the Barrier: Selective Uncertainty-Based Active Learning for Medical Image Segmentation. 1531-1535 - Zhuotong Cai, Jingmin Xin, Siyuan Dong, John A. Onofrey, Nanning Zheng, James S. Duncan:
Symmetric Consistency with Cross-Domain Mixup for Cross-Modality Cardiac Segmentation. 1536-1540 - Renqi Chen, Jingjing Luo, Fan Nian, Yuhui Cen, Yiheng Peng, Zekuan Yu:
SSHNN: Semi-Supervised Hybrid NAS Network for Echocardiographic Image Segmentation. 1541-1545 - Mengjiao Yao, Xiang Gao:
Gland Instance Segmentation by Full Resolution Multi-Scale Dilation Residual Networks. 1546-1550 - Yining Qiu, Yuxi Li, Jiafu Wu, Zhenye Gan, Mingmin Chi, Yabiao Wang, Chengjie Wang, Pei Wang:
Learning Hybrid Negative Probability Model for Weakly-Supervised Whole Slide Image Recognition. 1551-1555 - Peiji Chen, Dian Li, Yifan Tang, Shunta Togo, Hiroshi Yokoi, Yinlai Jiang:
Dynamic Label Smoothing Strategy for Biosignal Classification. 1556-1560 - Jun Liu, Wenyi Wang, Nuo Shen, Wei Wang, Kuanquan Wang, Qince Li, Yongfeng Yuan, Henggui Zhang, Gongning Luo:
Mutualreg: Mutual Learning for Unsupervised Medical Image Registration. 1561-1565 - Yinda Chen, Wei Huang, Xiaoyu Liu, Shiyu Deng, Qi Chen, Zhiwei Xiong:
Learning Multiscale Consistency for Self-Supervised Electron Microscopy Instance Segmentation. 1566-1570 - Xingcan Hu, Li Xiao, Yu-Ping Wang:
A Graph Neural Network Based Fusion of MRI-Derived Brain Network and Clinical Data for Glioblastoma Survival Prediction. 1571-1575 - Jiang Shang, Sifan Zhou:
LK-UNet: Large Kernel Design for 3D Medical Image Segmentation. 1576-1580 - Minghui Wu, Yangdi Xu, Yingying Xu, Guangwei Wu, Qingqing Chen, Hongxiang Lin:
Stable Optimization for Large Vision Model Based Deep Image Prior in Cone-Beam CT Reconstruction. 1581-1585 - Yan Li, Zhuoran Zheng, Wenqi Ren, Yunfeng Nie, Jingang Zhang, Xiuyi Jia:
Frequency Aware and Graph Fusion Network for Polyp Segmentation. 1586-1590 - Luyuan Xie, Cong Li, Xin Zhang, Shengfang Zhai, Yuejian Fang, Qingni Shen, Zhonghai Wu:
TRLS: A Time Series Representation Learning Framework Via Spectrogram for Medical Signal Processing. 1591-1595 - Jiacheng Hao, Junhai Xu, Mengting Liu, Jianguo Wei:
SSR-GPCsT: Deep Learning Models Based on Functional Connectivity Maps in Autism Research. 1596-1600 - Xutao Guo, Yanwu Yang, Chenfei Ye, Guoqing Cai, Ting Ma:
CALSeg: Improving Calibration of Medical Image Segmentation Via Variational Label Smoothing. 1601-1605 - Mehrab Bin Morshed, Md Mahbubur Rahman, Viswam Nathan, Li Zhu, Jungmok Bae, Christina Rosa, Wendy Berry Mendes, Jilong Kuang, Alex Gao:
Core Body Temperature and its Role in Detecting Acute Stress: A Feasibility Study. 1606-1610 - Ryo Fujii, Ryo Hachiuma, Hideo Saito:
Weakly Semi-Supervised Tool Detection in Minimally Invasive Surgery Videos. 1611-1615 - Ming Wu, Hao Qi, Wenkang Fan, Sunkui Ke, Hui-Qing Zeng, Yinran Chen, Xiongbiao Luo:
CHAT: Cascade Hole-Aware Transformers with Geometric Spatial Consistency for Accurate Monocular Endoscopic Depth Estimation. 1616-1620 - Yintao Zhou, Meng Pang, Wei Huang, Binghui Wang:
Early Diagnosing Parkinson's Disease Via a Deep Learning Model Based on Augmented Facial Expression Data. 1621-1625 - Yan Zhang, Xin Liu, Zuping Zhang:
DDN-Net: Deep Residual Shrinkage Denoising Networks with Channel-Wise Adaptively Soft Thresholds for Automated Major Depressive Disorder Identification. 1626-1630 - Jiyao Wang, Ange Wang, Haolong Hu, Kaishun Wu, Dengbo He:
Multi-Source Domain Generalization for ECG-Based Cognitive Load Estimation: Adversarial Invariant and Plausible Uncertainty Learning. 1631-1635 - Chenyang Li, Zhili Zhang, Peipei Li, Zhaofeng He:
I3FDM: IRIS Inpainting Via Inverse Fusion of Diffusion Models. 1636-1640 - Wenjing Zhang, Hao Yu, Manli Zhang, Gongpeng Cao, Guixia Kang, Lixin Cai:
MATPR-UNet: A Multi Attention Two-Path Residual UNet for Focal Cortical Dysplasia Lesions Segmentation. 1641-1645 - Qijia Shao, Li Zhu, Mohsin Y. Ahmed, Korosh Vatanparvar, Migyeong Gwak, Nafiul Rashid, Jungmok Bae, Jilong Kuang, Alex Gao:
Normalization is All You Need: Robust Full-Range Contactless SpO2 Estimation Across Users. 1646-1650 - Jiawei Jiang, Jie Wu, Yueqian Quan, Jiacheng Chen, Jianwei Zheng:
Memory-Augmented Dual-Domain Unfolding Network for MRI Reconstruction. 1651-1655 - Yuan Zhang, Yaolei Qi, Xiaoming Qi, Lotfi Senhadji, Yongyue Wei, Feng Chen, Guanyu Yang:
Fedsoda: Federated Cross-Assessment and Dynamic Aggregation for Histopathology Segmentation. 1656-1660 - Boon Peng Yap, Beng Koon Ng:
Single-Source Domain Generalization in Fundus Image Segmentation Via Moderating and Interpolating Input Space Augmentation. 1661-1665 - Lingrui Gu, Weijian Deng, Guoli Wang:
UNAD: Universal Anatomy-Initialized Noise Distribution Learning Framework Towards Low-Dose CT Denoising. 1671-1675 - Chengyu Yuan, Hao Xiong, Guoqing Shangguan, Hualei Shen, Dong Liu, Haojie Zhang, Zhonghua Liu, Kun Qian, Bin Hu, Björn W. Schuller, Yoshiharu Yamamoto, Shlomo Berkovsky:
Deep Fusion of Shifted MLP and CNN for Medical Image Segmentation. 1676-1680 - Po-Chen Lin, Jeng-Lin Li, Woan-Shiuan Chien, Chi-Chun Lee:
In-The-Wild Physiological-Based Stress Detection Using Federated Strategy. 1681-1685 - Jiuming Qin, Che Liu, Sibo Cheng, Yike Guo, Rossella Arcucci:
Freeze the Backbones: a Parameter-Efficient Contrastive Approach to Robust Medical Vision-Language Pre-Training. 1686-1690 - Xianglong Wang, Xifeng An, Eric Rigall, Shu Zhang, Hui Yu, Junyu Dong:
A Method for X-Ray Image Landmarks Localization using Cyclic Coordinate-Guided Strategy. 1691-1695 - Yuanzhe Peng, Jieming Bian, Jie Xu:
Fedmm: Federated Multi-Modal Learning with Modality Heterogeneity in Computational Pathology. 1696-1700 - Guorui Liao, Jiawei Liu, Yuxuan Liang, Shu Wang, Li Liu:
Fall Prediction by a Spatio-Temporal Multi-Channel Causal Model from Wearable Sensors Data. 1701-1705 - Jing Xia, Yi Hao Chan, Deepank Girish, Jagath C. Rajapakse:
Brain Structure-Function Interaction Network for Fluid Cognition Prediction. 1706-1710 - Linyu Xing, Mengxi Chen, Jiangchao Yao, Ya Zhang, Yanfeng Wang:
Pre-Post Interaction Learning for Brain Tumor Segmentation with Missing MRI Modalities. 1711-1715 - Yuping Huang, Weisheng Li, Guofen Wang, Xiaoyu Qiao, Huanyu Chen:
CT and MRI Fusion with Anisotropic Guided Filtering. 1716-1720 - Yingwei Zhang, Changru Guo, Yiqiang Chen, Zeping Lv, Qing Li:
Effective Connectivity-Based Multi-View Feature Learning Method for Dementia Diagnosis with fNIRS Signal. 1721-1725 - Jiaqi Cui, Yan Wang, Lu Wen, Pinxian Zeng, Xi Wu, Jiliu Zhou, Dinggang Shen:
Image2Points: A 3D Point-Based Context Clusters GAN for High-Quality PET Image Reconstruction. 1726-1730 - Zhenyu Zhang, Benlu Wang, Weijie Liang, Yizhi Li, Xuechen Guo, Guanhong Wang, Shiyan Li, Gaoang Wang:
SAM-Guided Enhanced Fine-Grained Encoding with Mixed Semantic Learning for Medical Image Captioning. 1731-1735 - Yu-Tung Liu, Kuan-Chen Wang, Kai-Chun Liu, Sheng-Yu Peng, Yu Tsao:
SDEMG: Score-Based Diffusion Model for Surface Electromyographic Signal Denoising. 1736-1740 - Zhuang Xie, Jianguo Wei, Wenhuan Lu, Zhongjie Li, Chunli Wang, Gaoyan Zhang:
EEG-Based Fast Auditory Attention Detection in Real-Life Scenarios Using Time-Frequency Attention Mechanism. 1741-1745 - Guoxin Wang, Sheng Shi, Shan An, Fengmei Fan, Wenshu Ge, Qi Wang, Feng Yu, Zhiren Wang:
A Bi-Pyramid Multimodal Fusion Method for the Diagnosis Of Bipolar Disorders. 1746-1750 - Bo Wang, Hang Zhao, Xiongfei Li, Mingjie Tian, Bo Huang, Feiyang Yang:
Multi-Task Self-Supervised Learning for Medical Image Segmentation. 1751-1755 - Yuda Bi, Anees Abrol, Jing Sui, Vince D. Calhoun:
Cross-Modal Synthesis of Structural MRI and Functional Connectivity Networks via Conditional ViT-GANs. 1756-1760 - Fangyao Shen, Zehao Zhang, Yong Peng, Hongjie Guo, Lina Chen, Hong Gao:
Self-Supervised Learning for Sleep Stage Classification with Temporal Augmentation and False Negative Suppression. 1761-1765 - Yujie Liu, Peng Zhou, Zongmin Li:
VMCC-NET: Uncovering Challenging Regions in Semi-Supervised Medical Image Segmentation with Voxel Mask Based Cyclic-Consistency Network. 1766-1770 - Chengliang Wang, Xinrun Chen, Haojian Ning, Shiying Li:
SAM-OCTA: A Fine-Tuning Strategy for Applying Foundation Model OCTA Image Segmentation Tasks. 1771-1775 - Shitao Zheng, Dongrui Wu:
Semi-Supervised Domain Adaptation for EEG-Based Sleep Stage Classification. 1776-1780 - Qi Bi, Hao Zheng, Xu Sun, Jingjun Yi, Wentian Zhang, Yawen Huang, Yuexiang Li, Yefeng Zheng:
Self-Supervised Cross-Level Consistency Learning For Fundus Image Classification. 1781-1785 - Joao Pereira, Dimitrios Halatsis, Balint Hodossy, Dario Farina:
Tackling Electrode Shift in Gesture Recognition with HD-EMG Electrode Subsets. 1786-1790 - Zexin Feng, Na Zeng, Jiansheng Fang, Xingyue Wang, Xiaoxi Lu, Heng Meng, Jiang Liu:
Flattening Singular Values of Factorized Convolution for Medical Images. 1791-1795 - Tatsuki Seino, Naoki Saito, Takahiro Ogawa, Satoshi Asamizu, Miki Haseyama:
Confidence-Aware Spatial-Temporal Attention Graph Convolutional Network for Skeleton-Based Expert-Novice Level Classification. 1796-1800 - Bozhen Hu, Zelin Zang, Cheng Tan, Stan Z. Li:
Deep Manifold Transformation for Protein Representation Learning. 1801-1805 - Ziyi Li, Li-Ming Zhao, Wei-Long Zheng, Bao-Liang Lu:
Temporal-Spatial Prediction: Pre-Training on Diverse Datasets for EEG Classification. 1806-1810 - Johanna Wilroth, Emina Alickovic, Martin A. Skoglund, Martin Enqvist:
Nonlinearity Detection and Compensation for EEG-Based Speech Tracking. 1811-1815 - Wenlong Chen, Chuanwen Feng, Ao Ke, Xike Xie, S. Kevin Zhou:
Out-of-Distribution Detection for Learning-Based Chest X-Ray Diagnosis. 1816-1820 - He Zhu, Ren Togo, Takahiro Ogawa, Miki Haseyama:
Prompt-Based Personalized Federated Learning for Medical Visual Question Answering. 1821-1825 - Ziqiang Chen, Kang Wang, Yun Liu:
Efficient Polyp Segmentation via Integrity Learning. 1826-1830 - Trung Vu, Hanlu Yang, Francisco Laport, Ben Gabrielson, Vince D. Calhoun, Tülay Adali:
A Robust and Scalable Method with an Analytic Solution for Multi-Subject fMRI Data Analysis. 1831-1835 - Michaela Areti Zervou, Effrosyni Doutsi, Yannis Pantazis, Panagiotis Tsakalides:
Multitask Classification of Antimicrobial Peptides for Simultaneous Assessment of Antimicrobial Property and Structural Fold. 1836-1840 - Wei-Bang Jiang, Ziyi Li, Wei-Long Zheng, Bao-Liang Lu:
Functional Emotion Transformer for EEG-Assisted Cross-Modal Emotion Recognition. 1841-1845 - Erlei Zhang, Weihao Chen, Xiaowei Xu, Zhicheng Zhang, Jinglei Li:
Breast Ultrasound Computer-Aided Diagnosis Using Structure-Aware Triplet Path Networks. 1846-1850 - Ruixing Liang, Xiangyu Zhang, Qiong Li, Lai Wei, Hexin Liu, Avisha Kumar, Kelley M. Kempski Leadingham, Joshua Punnoose, Leibny Paola García, Amir Manbachi:
Unidirectional Brain-Computer Interface: Artificial Neural Network Encoding Natural Images to fMRI Response in the Visual Cortex. 1851-1855 - Tianxiang Xia, Rong Zhang, Zhenzuo Chen, Guomin Xie, Xiping Wu, Zhongyue Lv, Lijun Guo:
Progressive Learning Based Knowledge Distillation for Low Resolution Cerebral Microbleed Segmentation. 1856-1860 - Chenglin Liu, Binquan Wang, Zhi Wu:
PN-DetX: A Dedicated Framework for Pulmonary Nodule Detection in X-Ray Images. 1861-1865 - Chiao-Yi Wang, Faranguisse Kakhi Sadrieh, Yi-Ting Shen, Giovanni Oppizzi, Li-Qun Zhang, Yang Tao:
Real-Time Privacy-Preserving Fall Risk Assessment with a Single Body-Worn Tracking Camera. 1866-1870 - Yu-Ting Lan, Wei-Bang Jiang, Wei-Long Zheng, Bao-Liang Lu:
CEMOAE: A Dynamic Autoencoder with Masked Channel Modeling for Robust EEG-Based Emotion Recognition. 1871-1875 - Lu Wen, Zhenghao Feng, Yun Hou, Peng Wang, Xi Wu, Jiliu Zhou, Yan Wang:
DCL-Net: Dual Contrastive Learning Network for Semi-Supervised Multi-Organ Segmentation. 1876-1880 - Yuexiao Liang, Zhineng Chen, Xin Chen, Caiyan Jia, Xiongjun Ye, Xieping Gao:
Dual Contrastive Learning Guided Pathological Image Re-Staining. 1881-1885 - Kun Huang, Xiao Ma, Na Su, Songtao Yuan, Qiang Chen:
Model-Based Label-to-Image Diffusion for Semi-Supervised Choroidal Vessel Segmentation. 1886-1890 - Bingzhi Chen, Jiawei Zhu, Yishu Liu, Biqing Zeng, Jiahui Pan, Meirong Ding:
Medical Vision-Language Representation Learning with Cross-Modal Multi-Teacher Contrastive Distillation. 1891-1895 - Chengxi Zhu, Yong Peng, Yinfeng Fang, Wanzeng Kong:
Label Rectified and Graph Adaptive Semi-Supervised Regression for Electrode Shifted Gesture Recognition. 1896-1900 - Yibin Tang, Jikang Ding, Aimin Jiang, Chun Wang, Yuan Gao:
High-Accuracy Anxiety Disorder Identification Through Subspace-Enhanced Hypergraph Neural Network. 1901-1905 - Wenbo Qi, Wenyong Zhou, Ngai Wong, S. C. Chan:
Hybrid Module with Multiple Receptive Fields and Self-Attention Layers for Medical Image Segmentation. 1906-1910 - Yuan Gao, Xiaotong Wang, Aimin Jiang, Ying Chen, Yibin Tang:
ADHD Diagnosis and Biomarker Detection Based on Multimodal Graph Convolutional Neural Network. 1911-1915 - Xinrui Chen, Renao Yan, Yizhi Wang, Jiawen Li, Junru Cheng, Tian Guan, Yonghong He:
HIQ: One-Shot Network Quantization for Histopathological Image Classification. 1916-1920 - Jiwon Lee, Eunsong Kang, Junyeong Maeng, Heung-Il Suk:
Eigendecomposition-Based Spatial-Temporal Attention for Brain Cognitive States Identification. 1921-1925 - Yi Guo, Chao Tang, Hao Wu, Badong Chen:
EEG Emotion Recognition Based on Dynamical Graph Attention Network. 1921-1925 - Pengxuan Gao, Tianyu Liu, Jia-Wen Liu, Bao-Liang Lu, Wei-Long Zheng:
Multimodal Multi-View Spectral-Spatial-Temporal Masked Autoencoder for Self-Supervised Emotion Recognition. 1926-1930 - Xiangyu Kong, Zeyu Ren, Lu Liu:
Semi-Supervised Volumetric Medical Image Segmentation via Class Prototype Guided Distribution-Aligned Representation Learning. 1931-1935 - Jiezhou He, Zhiming Luo, Wei Peng, Songzhi Su, Shaozi Li:
CC-DA: Cross-Domain Consistency Data Augmentation for 3D Tumor Segmentation. 1936-1940 - Xiran Xu, Bo Wang, Yujie Yan, Xihong Wu, Jing Chen:
A DenseNet-Based Method for Decoding Auditory Spatial Attention with EEG. 1946-1950 - Xiao Chen, Xiaokun Dai, Xueli Liu, Xinrong Chen:
SPTESleepNet: Automatic Sleep Staging Model Based On Strip Patch Embeddings And Transformer Encoder. 1951-1955 - Zongmin Li, Xuanting Li, Jiayue Fan, Zhonghao Du, Chaozhi Yang:
Non-iterative Pyramid Network for Unsupervised Deformable Medical Image Registration. 1956-1960 - Renhe Liu, Yu Liu, Han Wang, Kai Hu, Shan Du:
A Novel Medical Image Fusion Framework Integrating Multi-scale Encoder-Decoder with Discrete Wavelet Decomposition. 1961-1965 - Haojian Ning, Chengliang Wang, Xinrun Chen, Shiying Li:
An Accurate and Efficient Neural Network for OCTA Vessel Segmentation and a New Dataset. 1966-1970 - Srikireddy Dhanunjay Reddy, Tharun Kumar Reddy:
GM-VRC: Semantic Topological Data Ensemble Approach for EEG Signal Classification. 1971-1975 - Yifan Song, Songpengcheng Xia, Jiarui Yang, Ling Pei:
A Learning-Based Multi-Node Fusion Positioning Method Using Wearable Inertial Sensors. 1976-1980 - Xiaochen He, Baoyao Yang, Fei Lyu:
MMS: Morphology-Mixup Stylized Data Generation for Single Domain Generalization in Medical Image Segmentation. 1981-1985 - Mei Yu, Hexin Wang, Xuzhou Fu, Jie Gao, Zhiqiang Liu, Xuewei Li:
DualGCN-MIL: Whole Slide Image Classification Based on Double Relationship Graph Learning. 1986-1990 - Zheyun Qin, Xiaoming Xi, Yilong Yin:
Distribution-Aware Contrastive Learning for Robust Medical Image Segmentation. 1991-1995 - Wenjie Song, Jiqing Han, Jianchen Li, Guibin Zheng, Tieran Zheng, Yongjun He:
Modeling Quasi-Periodic Dependency via Self-Supervised Pre-Training for Respiratory Sound Classification. 1996-2000 - Zeming He, Gaoyan Zhang:
CEDNet: A Continuous Emotion Detection Network for Naturalistic Stimuli Using MEG Signals. 2001-2005 - Jian Chen, Xing Wu, Chengliang Wang, Zailin Yang, Xuelian Wu, Longrong Ran, Yao Liu:
Texture-Unet: A Texture-Aware Network for Bone Marrow Smear Whole-Slide Image Region of Interest Segmentation. 2006-2010 - Shang-Jui Kuo, Po-Han Huang, Chia-Ching Lin, Jeng-Lin Li, Ming-Ching Chang:
Improving Limited Supervised Foot Ulcer Segmentation Using Cross-Domain Augmentation Strategies. 2011-2015 - Ruihan Qin, Zhenxi Song, Huixia Ren, Zian Pei, Lin Zhu, Xue Shi, Yi Guo, Honghai Liu, Min Zhang, Zhiguo Zhang:
BNMTrans: A Brain Network Sequence-Driven Manifold-Based Transformer for Cognitive Impairment Detection Using EEG. 2016-2020 - Xinxu Zhou, Zhen Liang, Weishan Ye, Junqi Xue, Honghai Liu, Min Zhang, Zhiguo Zhang:
EmoTVR: A Hybrid Model to Estimate Continuous-Time and Continuous-Level Emotion from Electroencephalography. 2021-2025 - Han Chen, Wenxuan Wu, Xiaofen Xing, Xiangmin Xu:
Clinical Scores Prediction and Medication Adjustment for Course of Parkinson's Disease. 2026-2030 - Stanislas Ducotterd, Sebastian Neumayer, Michael Unser:
Learning a Convex Patch-Based Synthesis Model via Deep Equilibrium. 2031-2035 - Christine Beauchene, Michael S. Brandstein, Thomas F. Quatieri, Eric Thompson, Christopher J. Smalt:
A Neurophysiological-Auditory "Listen Receipt" for Communication Enhancement. 2036-2040 - Nastassia Vysotskaya, Noah Maul, Alessandra Fusco, Souvik Hazra, Jens Harnisch, Tomás Arias-Vergara, Andreas K. Maier:
Transforming Cardiovascular Health: a Transformer-Based Approach to Continuous, Non-Invasive Blood Pressure Estimation via Radar Sensing. 2041-2045 - Migyeong Gwak, Korosh Vatanparvar, Li Zhu, Nafiul Rashid, Mohsin Y. Ahmed, Jungmok Bae, Jilong Kuang, Alex Gao:
Multimodal Breathing Rate Estimation Using Facial Motion and rPPG From RGB Camera. 2046-2050 - Chen Zhou, Lingjing Hu:
A Neural Syntax Parser for Coronary Artery Anatomical Labeling in Coronary CT Angiography. 2051-2055 - Wei Wang, Xingcan Hu, Li Xiao, Yu-Ping Wang:
Adaptive Multiview Community-Preserved Graph Convolutional Network for Multiatlas-Based Functional Connectivity Analysis. 2056-2060 - Niki Efthymiou, George Retsinas, Panagiotis Paraskevas Filntisis, Petros Maragos:
Augmenting Transformer Autoencoders with Phenotype Classification for Robust Detection of Psychotic Relapses. 2061-2065 - Yonathan Eder, Ravit Abel, Avi Schroeder, Yonina C. Eldar:
Localization and Tracking of Gold Nanoparticles Using mmWave FMCW Radar. 2066-2070 - Ram Sapkota, Bishal Thapaliya, Pranav Suresh, Bhaskar Ray, Vince D. Calhoun, Jingyu Liu:
Multimodal Imaging Feature Extraction with Reference Canonical Correlation Analysis Underlying Intelligence. 2071-2075 - John Stewart Fabila-Carrasco, Avalon Campbell-Cousins, Mario A. Parra-Rodriguez, Javier Escudero:
Graph-Based Permutation Patterns for the Analysis of Task-Related fMRI Signals on DTI Networks in Mild Cognitive Impairment. 2076-2080 - Siddhant Gautam, Angqi Li, Saiprasad Ravishankar:
Patient-Adaptive and Learned MRI Data Undersampling Using Neighborhood Clustering. 2081-2085 - Shadi Sartipi, Müjdat Çetin:
Multi-Source Domain Adaptation with Transformer-Based Feature Generation for Subject-Independent EEG-Based Emotion Recognition. 2086-2090 - Ramesh Kumar Sah, Md. Mahbubur Rahman, Viswam Nathan, Li Zhu, Jungmok Bae, Christina Rosa, Wendy Berry Mendes, Jilong Kuang, Jun Alex Gao:
Heart Rate Variability Estimation with Dynamic Fine Filtering and Global-Local Context Outlier Removal. 2091-2095 - Rabindra Khadka, Pedro G. Lind, Gustavo B. M. Mello, Michael A. Riegler, Anis Yazidi:
Inducing Inductive Bias in Vision Transformer for EEG Classification. 2096-2100 - Suhas BN, Rakshith Sharma Srinivasa, Yashas Malur Saidutta, Jaejin Cho, Ching Hua Lee, Chouchang Yang, Yilin Shen, Hongxia Jin:
End-To-End Personalized Cuff-Less Blood Pressure Monitoring Using ECG and PPG Signals. 2101-2105 - Hongyi Pan, Bin Wang, Zheyuan Zhang, Xin Zhu, Debesh Jha, Ahmet Enis Çetin, Concetto Spampinato, Ulas Bagci:
Domain Generalization with Fourier Transform and Soft Thresholding. 2106-2110 - David J. Lin, Md Mahbubur Rahman, Li Zhu, Viswam Nathan, Jungmok Bae, Christina Rosa, Wendy Berry Mendes, Jilong Kuang, Jun Alex Gao:
Ballistocardiogram-Based Heart Rate Variability Estimation for Stress Monitoring using Consumer Earbuds. 2111-2115 - Yuhao Zhang, Shaoming Duan, Xinyu Zha, Jinhang Su, Peiyi Han, Chuanyi Liu:
FEDKA: Federated Knowledge Augmentation for Multi-Center Medical Image Segmentation on non-IID Data. 2116-2120 - Conghao Wang, Hiok Hian Ong, Shunsuke Chiba, Jagath C. Rajapakse:
De Novo Molecule Generation with Graph Latent Diffusion Model. 2121-2125 - Zi-Chen Fan, Di Li, Susanto Rahardja:
A Novel Discrete Fractional Complex Hadamard Transform for Medical Image Encryption. 2126-2130 - Taylor Lawson, John H. L. Hansen:
Situational Signal Processing with Ecological Momentary Assessment: Leveraging Environmental Context for Cochlear Implant Users. 2131-2135 - Jose Hoyos Sanchez, Batoul Taki, Waheed U. Bajwa, Anand D. Sarwate:
Federated Learning of Tensor Generalized Linear Models with Low Separation Rank. 2136-2140 - Hanlu Yang, Meiby Ortiz-Bouza, Trung Vu, Francisco Laport, Vince D. Calhoun, Selin Aviyente, Tülay Adali:
Subgroup Identification Through Multiplex Community Structure Within Functional Connectivity Networks. 2141-2145 - Dingding Ye, Charan Santhirasegaran, Ryan Pai, Genevera I. Allen, Joseph Young:
Addressing Confounds in Functional Connectivity Analyses of Calcium Imaging. 2146-2150 - Yiqian Xu, Rui-Wei Zhao, Rui Feng:
Lesion-Aware Open Set Medical Image Recognition with Domain Shift. 2151-2155 - Qiqi Xian, Zhe Sage Chen:
Estimating Directed Spectral Information Flow between Multi-Resolution Time Series. 2156-2159 - Xin Zhu, Hongyi Pan, Shuaiang Rong, Ahmet Enis Çetin:
Electroencephalogram Sensor Data Compression Using an Asymmetrical Sparse Autoencoder with a Discrete Cosine Transform Layer. 2160-2164 - Yuanpin Zhou, Huogen Wang, Yanfeng Bai, Yidong Wan, Chaohui Jin, Ming Chen, Xiaodong Teng:
Digital Pathology Image Deblurring Via Local Focus Quality Assessment. 2165-2169 - Yudong Yang, Rongfeng Su, Xiaokang Liu, Nan Yan, Lan Wang:
An Audio-Textual Diffusion Model for Converting Speech Signals into Ultrasound Tongue Imaging Data. 2170-2174 - Suizhi Huang, Shalayiding Sirejiding, Yuxiang Lu, Yue Ding, Leheng Liu, Hui Zhou, Hongtao Lu:
YOLO-Med: Multi-Task Interaction Network for Biomedical Images. 2175-2179 - Xipeng Pan, Feihu Hou, Zhenbing Liu, Siyang Feng, Rushi Lan:
EOFD-Net: Edge Optimization and Feature Denoising for Weakly Supervised Deep Nuclei Segmentation with Point Annotations. 2180-2184 - Yu Rong, Kawon Han, Isabella Lenz, Daniel W. Bliss:
Motion-Tolerant Radar-Based Heart Sound Detection. 2185-2189 - Bo Wang, Xiran Xu, Longxiang Zhang, Boda Xiao, Xihong Wu, Jing Chen:
Semantic Reconstruction of Continuous Language from MEG Signals. 2190-2194 - Yi Hao Chan, Jun Liang Ang, Sukrit Gupta, Yinan He, Jagath C. Rajapakse:
Subtype-Specific Biomarkers of Alzheimer's Disease from Anatomical and Functional Connectomes via Graph Neural Networks. 2195-2199 - Jiawei Li, Chunxu Guo, Li Fu, Lu Fan, Edward F. Chang, Yuanning Li:
Neural2speech: A Transfer Learning Framework for Neural-Driven Speech Reconstruction. 2200-2204 - Qichang Chen, Zhonghang Zhu, Lianxin Wang, Liansheng Wang:
Shifted-Rectangle-Window Based Transformer for Non-Displaced Femoral Neck Fracture Diagnosis. 2205-2209 - Minxi Yang, Dahua Gao, Jiaxuan Li, Wenlong Xu, Xiaodan Song, Guangming Shi:
Mosic: Multimodal Semantic Integrated Communication for Health Monitoring in IoT Scenarios. 2210-2214 - Yuda Jin, Weidong Chen, Yuanhe Tian, Yan Song, Chenggang Yan, Zhendong Mao:
Improving Radiology Report Generation with D2-Net: When Diffusion Meets Discriminator. 2215-2219 - Gang Liu, Hongyang Li, Zerui He, Shenjun Zhong:
Enhancing Generalization in Medical Visual Question Answering Tasks Via Gradient-Guided Model Perturbation. 2220-2224 - Peili Chen, Linyang He, Li Fu, Lu Fan, Edward F. Chang, Yuanning Li:
Do Self-Supervised Speech and Language Models Extract Similar Representations as Human Brain? 2225-2229 - Kazi Mahmudul Hassan, Xuyang Zhao, Hidenori Sugano, Toshihisa Tanaka:
Detection of Epileptic Seizures in Long EEG Recordings Using an Anomaly Detector with Artifact Rejection. 2230-2234 - Aimin Jiang, Shanshan Hou, Yibin Tang, Yanping Zhu:
Joint Spatio-Temporal Filtering of Motion Imagery EEG Signals for Data Alignment in Transfer Learning. 2235-2239 - Ruilin Wang, Xiongfei Li, Mingjie Tian, Feiyang Yang, Xiaoli Zhang:
Patch-Level Knowledge Distillation and Regularization for Missing Modality Medical Image Segmentation. 2240-2244 - Kunpeng Qiu, Zhiying Zhou, Yongxin Guo:
Learn From Zoom: Decoupled Supervised Contrastive Learning For WCE Image Classification. 2245-2249 - Yue Hu, Huiying Xu, Xinzhong Zhu, Negalign Wake Hundera:
V-DDPM: MRI Rician Noise Removal Model Based on VST and DDPM. 2250-2254 - Veronika Ecker, Marcel Früh, Bin Yang, Sergios Gatidis, Thomas Küstner:
Deep Regression for Biological Age Estimation in Multiple Organs: Investigations on 40,000 Subjects of the UK Biobank. 2255-2259 - Meisheng Zhang, Chenye Wang, Wenxuan Zou, Xingqun Qi, Muyi Sun, Wanting Zhou:
Contrmix: Progressive Mixed Contrastive Learning for Semi-Supervised Medical Image Segmentation. 2260-2264 - Seorim Hwang, Jaebin Cha, Junyeong Heo, Sungpil Cho, Youngcheol Park:
Multi-Label Abnormality Classification from 12-Lead ECG Using A 2D Residual U-Net. 2265-2269 - Zhiyong Jin, Guangqi Wen, Peng Cao, Lingwen Liu, Jinzhu Yang, Xinrong Zhu, Osmar R. Zaïane, Fei Wang:
Towards Disease-Aware Self-Supervised Dynamic Brain Network Learning For Mental Diagnosis. 2270-2274 - Yiwen Ruan, Rui Jin, Zhaorui Liu, Caishan Wang, Lei Zhang, Tao Peng:
Delineation of Prostate Cancer Via Enhanced AI-Based Algorithm In Ultrasound Images. 2275-2279 - Jintong Hu, Hui Che, Zishuo Li, Wenming Yang:
Residual Dense Swin Transformer for Continuous Depth-Independent Ultrasound Imaging. 2280-2284 - Hongyu Shi, Kaizhong Zheng, Huaning Wang, Baojuan Li, Badong Chen:
Predicting rTMS Treatment Effects Using Open-Loop Control and Neural Manifold. 2285-2289 - Li Li, Jiahui He, Yunxin Tang, Youjian Zhang, Jie Wang, Guanqun Zhou, Zhicheng Zhang:
SRECT: Machine-Specific Spatial-Resolution Enhancement in Computed Tomography. 2290-2294 - Jiayu Zhang, Dexuan Xu, Yiwei Lou, Yu Huang:
A Novel Multi-Atlas Fusion Model Based On Contrastive Learning For Functional Connectivity Graph Diagnosis. 2295-2299 - Peng Du, Baijia Ni, Xiaodong Ju, Xingce Wang, Zhongke Wu, Gege Lou, Keying Hua:
3D Automated Quantitative Calculations Based on CT Images of the Hip Joint. 2300-2304 - Suvadeep Maiti, Shivam Kumar Sharma, Raju S. Bapi:
Enhancing Healthcare with EOG: A Novel Approach to Sleep Stage Classification. 2305-2309 - Xiaolong Zhong, Fei Wu, Zhong Yin, Gang Liu:
An Attention-Enhanced Retentive Broad Learning System for Subject-Generic Emotion Recognition from EEG Signals. 2310-2314 - Lang Wang, Peng Jiang, Wensi Duan, Dehua Cao, Baochuan Pang, Juan Liu:
Coupling Self-Supervised and Supervised Contrastive Learning for Multiple Classification of Cervical Cytological Whole Slide Images. 2315-2319 - Siqi Cai, Ran Zhang, Haizhou Li:
Robust Decoding of the Auditory Attention from EEG Recordings Through Graph Convolutional Networks. 2320-2324 - Xiang Li, Jian Song, Zhigang Zhao, Chunxiao Wang, Dawei Song, Bin Hu:
A Supervised Information Enhanced Multi-Granularity Contrastive Learning Framework for EEG Based Emotion Recognition. 2325-2329 - Chenyi Zhou, Hualiang Wang, Xiaomeng Li, Wanlu Liu, Zuozhu Liu:
Multimodal Survival Ensemble Network: Integrating Genomic and Histopathological Insights for Enhanced Cancer Prognosis. 2330-2334 - Yingxin Lai, Guoqing Yang, Yifan He, Zhiming Luo, Shaozi Li:
Selective Domain-Invariant Feature for Generalizable Deepfake Detection. 2335-2339 - Gaoxiang Li, Ying Zhang, Yanlin Luo:
Multi-Task Cascaded Attention Network for Brain Tumor Segmentation and Classification. 2340-2344 - Huadeng Wang, Jiejiang Yu, Bingbing Li, Xipeng Pan, Zhenbing Liu, Rushi Lan, Xiaonan Luo:
Gland Segmentation Via Dual Encoders and Boundary-Enhanced Attention. 2345-2349 - Joohyung Lee, Heejeong Nam, Kwanhyung Lee, Sangchul Hahn:
Compact and De-Biased Negative Instance Embedding for Multi-Instance Learning on Whole-Slide Image Classification. 2350-2354 - Zhengda He, Linjie Chen, Jiaying Xu, Hao Lv, Rui-ning Zhou, Jianhua Hu, Yadong Chen, Yang Gao:
TD-GPT: Target Protein-Specific Drug Molecule Generation GPT. 2355-2359 - Nan Ding, Florence Rossant, Hélène Urien, Jérémie Sublime, Paul Bastelica, Christophe Baudouin, Michel Pâques:
A Complete Method for the 3D Reconstruction of Axonal Pathways from 2 Orthogonal 3D OCT Images of the Lamina Cribrosa. 2360-2364 - Changsheng Ma, Taicheng Guo, Qiang Yang, Xiuying Chen, Xin Gao, Shangsong Liang, Nitesh V. Chawla, Xiangliang Zhang:
A Property-Guided Diffusion Model For Generating Molecular Graphs. 2365-2369 - Yaping Zhao, Edmund Y. Lam:
SASA: Saliency-Aware Self-Adaptive Snapshot Compressive Imaging. 2370-2374 - Mingtao Huang, Ranhao Zhang, Xueming Li, Yuan Shen:
Fast Alignment Algorithm for Cryo-EM Particle Images Based on Harmonic Analysis. 2375-2379 - Wenbo Li, Zhipeng Mo, Yilin Shen, Hongxia Jin:
Unified sRGB Real Noise Synthesizing with Adaptive Feature Modulation. 2380-2384 - YinWei Du, Jian Wang, Xing Wu, Xian-Hua Han:
Dual Directional Complementary Gradient Fusion and Deep Refinement for Hyperspectral Image Super Resolution. 2385-2389 - Takumi Takabe, Xian-Hua Han, Yen-Wei Chen:
Deep Versatile Hyperspectral Reconstruction Model from A Snapshot Measurement with Arbitrary Masks. 2390-2394 - Jiuqiang Li, Yutong Ke:
Hybrid Convolution-Transformer for Lightweight Single Image Super-Resolution. 2395-2399 - Zean Chen, Yeyao Chen, Mei Yu, Haiyong Xu, Gangyi Jiang:
Hybrid Domain Learning towards Light Field Spatial Super-Resolution using Heterogeneous Imaging. 2400-2404 - Quanquan Xiao, Haiyan Jin, Haonan Su, Fengyuan Zuo, Yuanlin Zhang, Zhaolin Xiao, Bin Wang:
SPGFusion: A Semantic Prior Guided Infrared and Visible Image Fusion Network. 2405-2409 - Jiazhang Zheng, Lei Li, Qiuping Liao, Cheng Li, Li Li, Yangxing Liu:
Darkshot: Lighting Dark Images with Low-Compute and High-Quality. 2410-2414 - Yadong Li, Dongheng Zhang, Ruixu Geng, Jincheng Wu, Yang Hu, Qibin Sun, Yan Chen:
IFNet: Imaging and Focusing Network for Handheld mmWave Devices. 2415-2419 - Refaldi I. D. Putra, Tatsuya Ishikawa, Naomi Simumba, Michiaki Tatsubori:
Sandwiched Lo-Res Simulation for Scalable Flood Modeling. 2420-2424 - Wenwu Gong, Zhejun Huang, Lili Yan:
Enhanced Low-Rank and Sparse Tucker Decomposition For Image Completion. 2425-2429 - Chen-Bin Feng, Jie Zhang, Jiaxue Li, Yicong Zhou:
Seam Mask Guided Partial Reconstruction with Quantum-Inspired Local Aggregation For Deep Image Stitching. 2430-2434 - Shijie Zhang, Boyan Jiang, Keke He, Junwei Zhu, Ying Tai, Chengjie Wang, Yinda Zhang, Yanwei Fu:
T-Pixel2Mesh: Combining Global and Local Transformer for 3D Mesh Generation from a Single Image. 2435-2439 - Denghui Yang, Yifan Ding, Hao Zhang, Yizhou Li:
PVitNet: An Effective Approach for Android Malware Detection Using Pyramid Feature Processing and Vision Transformer. 2440-2444 - Siwei Li, Mingxuan Liu, Yating Zhang, Shu Chen, Haoxiang Li, Zifei Dou, Hong Chen:
SAM-DEBLUR: Let Segment Anything Boost Image Deblurring. 2445-2449 - Qian Li, Rao Fu, Cheng Wen:
Reference Line Network: On Simultaneous Gaussian Line Detection and Connection Graph Inference. 2450-2454 - Simon Welker, Tal Peer, Henry N. Chapman, Timo Gerkmann:
Live Iterative Ptychography with Projection-Based Algorithms. 2455-2459 - Jorge Bacca, Brayan Monroy, Henry Arguello:
Deep Plug-and-Play Algorithm for Unsaturated Imaging. 2460-2464 - Tom Tirer:
Iteratively Preconditioned Guidance of Denoising (Diffusion) Models For Image Restoration. 2465-2469 - Sreemanti Dey, Snigdha Saha, Berthy T. Feng, Manxiu Cui, Laure Delisle, Oscar Leong, Lihong V. Wang, Katherine L. Bouman:
Score-based Diffusion Models for Photoacoustic Tomography Image Reconstruction. 2470-2474 - Yvette Y. Lin, Angela F. Gao, Katherine L. Bouman:
Imaging An Evolving Black Hole By Leveraging Shared Structure. 2475-2479 - Jixuan Liang, Yanshan Li:
A Fast Blind Deblurring Algorithm Using Local Gradient Product Prior. 2480-2484 - Jiabao Li, Yuqi Li, Ciliang Sun, Chong Wang, Jinhui Xiang:
SPEC-NERF: Multi-Spectral Neural Radiance Fields. 2485-2489 - Ziwen Li, Bo Xu, Cheng Lu:
KD-Former: Transformer Knowledge Distillation for Image Matting. 2490-2494 - Shengli Yan, Yuan Rao, Wenhui Hou:
Detection in Complex Scenes Using RGB and Depth Multimodal Feature Fusion. 2495-2499 - Xian-Hua Han, Huiyan Jiang, Yen-Wei Chen:
Hyperspectral Image Reconstruction Using Hierarchical Neural Architecture Search from A Snapshot Image. 2500-2504 - Jorge Bacca, Marcus Carlsson, Brayan Monroy, Henry Arguello:
Plug-And-Play Algorithm Coupled with Low-Rank Quadratic Envelope Regularization for Compressive Spectral Imaging. 2505-2509 - Jia Chen, Jinlong Qin, Saishang Zhong, Kai Yang, Xinrong Hu, Tao Peng, Rui Li:
SGM: A Dataset for 3D Garment Reconstruction from Single Hand-Drawn Sketch. 2510-2514 - Kartheek Kumar Reddy Nareddy, Abijith Jagannath Kamath, Chandra Sekhar Seelamantula:
Image Restoration with Generalized L2 Loss and Convergent Plug-and-Play Priors. 2515-2519 - Ryosuke Isono, Shunsuke Ono:
Temporally-Guided Total Variation For Robust Spatiotemporal Fusion Of Satellite Images. 2520-2524 - Abhishek Shreekant Bhandiwad, Abijith Jagannath Kamath, Siddarth Asokan, Chandra Sekhar Seelamantula:
Variational Analysis of Adversarial Regularization for Solving Inverse Problems. 2525-2529 - Aleksei Sholokhov, Joshua Rapp, Saleh Nabi, Steven L. Brunton, J. Nathan Kutz, Hassan Mansour:
Single-Pixel Imaging Of Dynamic Flows Using Neural Ode Regularization. 2530-2534 - Robinson Czajkowski, John Murray-Bruce:
Two-Edge-Resolved 3D Non-Line-of-Sight Imaging: A Fisher Information Equalized Discretization. 2535-2539 - Zheng Zhou, Peter Gerstoft, Kim Olsen:
Fusion of Multi-Resolution Seismic Tomography Maps with Physics-Informed Probability Graphical Models. 2540-2544 - Saishang Zhong, Jiashu Wang, Xinrong Hu:
PMDI: Combining Parametric-Model and Depth-Aware Implicit Function for Single-View Human Reconstruction. 2545-2549 - Alexander Lin, Demba E. Ba:
An Efficient Algorithm For Clustered Multi-Task Compressive Sensing. 2550-2554 - Tianbo Liu, Songping Mai, Xiaoyu Wang:
Deep Learning Based Single-Shot Profilometry by Three-Channel Binary-Defocused Projection. 2555-2559 - Zhuofeng Wu, Yusuke Monno, Masatoshi Okutomi:
Self-Supervised Spatially Variant PSF Estimation for Aberration-Aware Depth-from-Defocus. 2560-2564 - Yousef Kotp, Marwan Torki:
Flare-Free Vision: Empowering Uformer with Depth Insights. 2565-2569 - Wenjiao Bian, Yusuke Monno, Masatoshi Okutomi:
Reflection Removal Using Recurrent Polarization-to-Polarization Network. 2570-2574 - Xun Wu, Fanqing Meng, Yaqi Wu, Jiawei Zhang, Feng Zhang:
An Efficient Transformer For Demosaicing Via Compressed Multi-Branch Attention Mechanism. 2575-2576 - Yanting Wang, Feng Li, Han Zhang:
TA2P: Task-Aware Adaptive Pruning Method for Image Classification on Edge Devices. 2580-2584 - Tingyou Li, Zixin Xu, Yong S. Chu, Xiaojing Huang, Jizhou Li:
Coordinate-Based Neural Network for Fourier Phase Retrieval. 2585-2589 - Daniele Picone, Mohamad Jouni, Mauro Dalla Mura:
Spectro-Spatial Hyperspectral Image Reconstruction From Interferometric Acquisitions. 2590-2594 - Shihui Zhang, Ziteng Xue, Yuhong Jiang, Houlin Wang:
Opnet: Deep Occlusion Perception Network with Boundary Awareness for Amodal Instance Segmentation. 2595-2599 - Ling Lin, Congcong Zhu, Lin Zhou, Jingrun Chen:
Toward Quantifiable Face Age Transformation. 2600-2604 - Rao Fu, Cheng Wen, Qian Li:
IMFIT: Normal Estimation via Learning Neural Implicit Surface. 2605-2609 - Zhenhu Zhang, Xin Cao, Li Jin, Xueying Qin, Ruofeng Tong:
Semi-Decoupled 6D Pose Estimation via Multi-Modal Feature Fusion. 2610-2614 - Ting Liu, Yue Hu, Wansen Wu, Youkai Wang, Kai Xu, Quanjun Yin:
DAP: Domain-Aware Prompt Learning for Vision-and-Language Navigation. 2615-2619 - Shansi Zhang, Edmund Y. Lam:
Unsupervised Disparity Estimation for Light Field Videos. 2620-2624 - Chunqing Ruan, Mengzhu Wang, Shanshan Wang, Tianyi Liang, Wei Yu:
SBM: Smoothness-Based Minimization for Domain Generalization. 2625-2629 - Haoxing Chen, Yaohui Li, Zhangxuan Gu, Zhuoer Xu, Jun Lan, Huaxiong Li:
Segment Anything Model Meets Image Harmonization. 2630-2634 - Tian Yang, Cong Shen, Tiantian Yuan:
CoSLR: Contrastive Chinese Sign Language Recognition with Prior Knowledge and Multi-Tasks Joint Learning. 2635-2639 - Jucai Zhai, Yang Liu, Pengcheng Zeng, Chihao Ma, Xinan Wang, Yong Zhao:
Efficient Fusion of Depth Information for Defocus Deblurring. 2640-2644 - Kun Hu, Zhaoyangfan Huang, Xingjun Wang:
Highlight Removal Network Based on an Improved Dichromatic Reflection Model. 2645-2649 - Yinghui Xing, Litao Qu, Kai Zhang, Yan Zhang, Xiuwei Zhang, Yanning Zhang:
Complementary Fusion Network Based on Frequency Hybrid Attention for Pansharpening. 2650-2654 - Chao Yang, Yong Fan, Cheng Lu:
Dropout Multi-Head Attention for Single Image Super-Resolution. 2655-2659 - Shang Gao, Chenyang Yu, Pingping Zhang, Huchuan Lu:
Part Representation Learning with Teacher-Student Decoder for Occluded Person Re-Identification. 2660-2664 - Wenjie Liu, Xinlong Shi, Xianzhong Liu:
Flipping Consistent and Counterfactual Attention Network for Facial Expression Recognition. 2665-2669 - Xingshuo Han, Xiao Wang, Kui Jiang, Wei Liu, Ruimin Hu, Xuefeng Pan, Xin Xu:
Mutuality Attribute Makes Better Video Anomaly Detection. 2670-2674 - Lijun Wang:
Multi-Modality Conditional Diffusion Model for Time Series Forecasting of Live Sales Volume. 2675-2679 - Chao Wang, Yubiao Yue, Bingchun Luo, Yujie Chen, Jun Xue:
PseKD: Phase-Shift Encoded Knowledge Distillation for Oriented Object Detection in Remote Sensing Images. 2680-2684 - Jiuqiang Li, Shilei Zhu:
Channel-Spatial Transformer for Efficient Image Super-Resolution. 2685-2689 - Yueqian Quan, Honghui Xu, Yidong Yan, Hang Zheng, Jianwei Zheng:
HMNet: Hierarchical Microscale-Aware Network for Infrared Small Target Detection. 2690-2694 - Lei Zhao, Xiao-Lei Zhang:
A Hierarchical Multi-Proxy Loss with Dynamic Main-Proxy for Deep Metric Learning. 2695-2699 - Ruofei Wang, Renjie Wan, Zongyu Guo, Qing Guo, Rui Huang:
SPY-Watermark: Robust Invisible Watermarking for Backdoor Attack. 2700-2704 - Hui Zhang, Bingran Kuang, Yajie Zhao:
Camera Calibration using a Single View of a Symmetric Object. 2705-2709 - Soojung Hong, Kwanghee Choi:
Correcting Faulty Road Maps by Image Inpainting. 2710-2714 - Jinyu Shi, Wenjie Wu:
SRP-UOD: Multi-Branch Hybrid Network Framework Based on Structural Re-Parameterization for Underwater Small Object Detection. 2715-2719 - Taiwei Zhang, Zhenghui Hu, Weixin Li, Qingjie Liu, Yunhong Wang:
Read, Spell and Repeat: Scene Text Recognition with Vision-Language Circular Refinement. 2720-2724 - Deyi Ji, Siqi Gao, Mingyuan Tao, Hongtao Lu, Feng Zhao:
Changenet: Multi-Temporal Asymmetric Change Detection Dataset. 2725-2729 - Zhangxuan Gu, Haoxing Chen, Zhuoer Xu:
Diffusioninst: Diffusion Model for Instance Segmentation. 2730-2734 - Wang Yin, Peng Lu, Xujun Peng:
COLORFLOW: A Conditional Normalizing Flow for Image Colorization. 2735-2739 - Xiaoyan Tian, Ye Jin, Zhao Zhang, Peng Liu, Xianglong Tang:
MTIDNet: A Multimodal Temporal Interest Detection Network for Video Summarization. 2740-2744 - Masoud Mokhtari, Fatemeh Taheri Dezaki, Timo Bolkart, Betty Mohler Tesch, Rahul Suresh, Amin Banitalebi-Dehkordi:
Skin Tone Disentanglement in 2D Makeup Transfer With Graph Neural Networks. 2745-2749 - Eungi Lee, Eung-Joo Lee, Syed Muhammad Anwar, Seok Bong Yoo:
Child FER: Domain-Agnostic Facial Expression Recognition in Children Using a Secondary Image Diffusion Model. 2750-2754 - Lijian Yang, Jian-Xun Mi, Guofen Wang, Weisheng Li:
Window-Based Convolutional Sparse Coding: Towards A Unified Framework. 2755-2759 - Pengwei Yin, Jingjing Wang, Jiawu Dai, Xiaojun Wu:
NERF-GAZE: A Head-Eye Redirection Parametric Model for Gaze Estimation. 2760-2764 - Xuyang Liu, Siteng Huang, Yachen Kang, Honggang Chen, Donglin Wang:
VGDIFFZERO: Text-To-Image Diffusion Models Can Be Zero-Shot Visual Grounders. 2765-2769 - Wei Ji, You Qin, Long Chen, Yinwei Wei, Yiming Wu, Roger Zimmermann:
Mrtnet: Multi-Resolution Temporal Network for Video Sentence Grounding. 2770-2774 - Lei Liao, Mao Feng, Meng Yang:
Human Guided Cross-Modal Reasoning with Semantic Attention Learning for Visual Question Answering. 2775-2779 - Yuan Cao, Di Jiang, Guanqun Hou, Fan Deng, Xinjia Chen, Qiang Yang:
Learn to Cluster Faces with Better Subgraphs. 2780-2784 - Hengsheng Zhang, Xinning Chai, Yuhong Zhang, Rong Xie, Li Song:
Hdrtvformer: Efficient SDRTV-to-HDRTV via Affine Transformation and Spatial-Aware Transformer. 2785-2789 - Wenbo Zhou, Dongdong Chen, Jing Liao, Jie Zhang, Kejiang Chen, Weiming Zhang, Nenghai Yu:
Attribute-Aware Head Swapping Guided by 3D Modeling. 2790-2794 - Elena Camuffo, Umberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay:
FFT-Based Selection and Optimization of Statistics for Robust Recognition of Severely Corrupted Images. 2795-2799 - Manuel Lage Cañellas, Constantino Álvarez Casado, Le Nguyen, Miguel Bordallo López:
Estimating Exercise-Induced Fatigue from Thermal Facial Images. 2800-2804 - Zhiwei Xiong, Yunfan Zhang, Zhiqi Shen, Peiran Ren, Han Yu:
Image Aesthetics Assessment Via Learnable Queries. 2805-2809 - Babak Naderi, Ross Cutler:
A Crowdsourcing Approach to Video Quality Assessment. 2810-2814 - Ting Li, Jianshu Chao, Deyu An:
Style Adaptation for Domain-Adaptive Semantic Segmentation. 2815-2819 - Zhidan Ran, Xiaobo Lu, Wei Liu:
Anomaly-Aware Semantic Self-Alignment Framework for Video-Based Person Re-Identification. 2820-2824 - Takuya Fujihashi, Sorachi Kato, Toshiaki Koike-Akino:
Implicit Neural Representation For Low-Overhead Graph-Based Holographic-Type Communications. 2825-2829 - Masato Fujitake:
RL-LOGO: Deep Reinforcement Learning Localization for Logo Recognition. 2830-2834 - Songqi Pan, Sheng Liu, Yuan Feng, Yineng Zhang, Xiaopeng Tian, Jiantao Yang:
POSE-HMR: Heuristic Transformer with Postural Prior Constraints for 3D Human Mesh Reconstruction. 2835-2839 - Han Gao, Hao Wu, Peiwen Dong, Yixin Xu, Fengyuan Xu, Sheng Zhong:
MuSR: Multi-Scale 3D Scenes Reconstruction based on Monocular Video. 2840-2844 - Jiaqi Su, Weiran Chen, Yi Ji, Chunping Liu:
Glocal Cascading Network for Topic Enhanced Visual Storytelling. 2845-2849 - Jia-Wei Ma, Min Liang, Haixia Man, Shu Tian, Jingyan Qin, Xu-Cheng Yin:
Attention Decoupling for Query-Based Object Detection. 2850-2854 - Jiayu Yang, Chunhui Yang, Yongqi Zhai, Qi Wang, Xinghao Pan, Ronggang Wang:
Improving Learned Video Compression by Exploring Spatial Redundancy. 2860-2864 - Driton Salihu, Adam Misik, Yuankai Wu, Constantin Patsch, Eckehard G. Steinbach:
NPRF: Neural Painted Radiosity Fields for Neural Implicit Rendering and Surface Reconstruction. 2865-2869 - Xin Li, Feng Xu, Runliang Xia, Nan Xu, Fan Liu, Chi Yuan, Qian Huang, Xin Lyu:
Locality-Enhanced Transformer for Semantic Segmentation of High-Resolution Remote Sensing Images. 2870-2874 - Zifan Yu, Erfan Bank Tavakoli, Meida Chen, Suya You, Raghuveer Rao, Sanjeev Agarwal, Fengbo Ren:
Tokenmotion: Motion-Guided Vision Transformer for Video Camouflaged Object Detection via Learnable Token Selection. 2875-2879 - Qilei Li, Jiabo Huang, Jian Hu, Shaogang Gong:
Feature-Distribution Perturbation and Calibration for Generalized ReID. 2880-2884 - Jiexin Wang, Jiahao Chen, Bing Su:
Domain-Adaptive and Subgroup-Specific Cascaded Temperature Regression for Out-of-Distribution Calibration. 2885-2889 - Chaofei Wang, Xiangan Zhao, Kai Wang, Shuai Wu, Jiayu Xiao, Guotong Geng:
ADIFT: Zero-Shot Generative Model Adaption Via Adaptive Domain-Invariant Feature Transfer. 2890-2894 - Tianle Lv, Shuang Li, Jiaxu Leng, Xinbo Gao:
MGRL: Mutual-Guidance Representation Learning for Text-to-Image Person Retrieval. 2895-2899 - Sidun Liu, Peng Qiao, Yong Dou:
Improving Motion Deblur By Multi-Output Learning. 2900-2904 - Shanzhi Yin, Tongda Xu, Yongsheng Liang, Yuanyuan Wang, Yanghao Li, Yan Wang, Jingjing Liu:
Bandwidth-Efficient Inference for Neural Image Compression. 2905-2909 - Xiao Liu, Guangyi Chen, Yansong Tang, Guangrun Wang, Xiao-Ping Zhang, Ser-Nam Lim:
Language-Free Compositional Action Generation via Decoupling Refinement. 2910-2914 - Li Yao, Ao Gao, Yan Wan:
REGIR: Refined Geometry for Single-Image Implicit Clothed Human Reconstruction. 2915-2919 - Jiancheng Huang, Yifan Liu, Jiaxi Lv, Shifeng Chen:
Entwined Inversion: Tune-Free Inversion For Real Image Faithful Reconstruction and Editing. 2920-2924 - Rokia Abdein, Xuezhi Xiang, Yiming Chen, Mingliang Zhai, Abdulmotaleb El-Saddik:
Self-Supervised Multi-Scale Hierarchical Refinement Method for Joint Learning of Optical Flow and Depth. 2925-2929 - Yican Liu, Jiacheng Li, Delu Zeng:
Low Redundant Attention Network for Efficient Image Super-Resolution. 2950-2954 - Yanhui Guo, Fangzhou Luo, Shaoyuan Xu:
Self-Supervised Face Image Restoration with a One-Shot Reference. 2930-2934 - Xukai Zhao, Yuxing Lu, Jinzhuo Wang:
Multiscale Scoring Model for Enhanced Urban Perception Evaluation. 2935-2939 - Yuxuan Zhou, Liangcai Gao, Zhi Tang, Baole Wei:
Recognition-Guided Diffusion Model for Scene Text Image Super-Resolution. 2940-2944 - Shuo Zhang, Jing Liu:
Feature-Constrained and Attention-Conditioned Distillation Learning for Visual Anomaly Detection. 2945-2949 - Wei Jiang, Junru Li, Kai Zhang, Li Zhang:
LVC-LGMC: Joint Local and Global Motion Compensation for Learned Video Compression. 2955-2959 - Keli Deng, Peng Wang, Yuntao Qian:
RGB Images Enhancing Hyperspectral Image Denoising with Diffusion Model. 2960-2964 - Zicheng Zhang, Yingjie Zhou, Chunyi Li, Kang Fu, Wei Sun, Xiaohong Liu, Xiongkuo Min, Guangtao Zhai:
A Reduced-Reference Quality Assessment Metric for Textured Mesh Digital Humans. 2965-2969 - Xiaolu Chen, Haote Xu, Chenghao Deng, Xiaotong Tu, Xinghao Ding, Yue Huang:
Implicit Foreground-Guided Network for Anomaly Detection and Localization. 2970-2974 - Bruno Korbar, Jaesung Huh, Andrew Zisserman:
Look, Listen and Recognise: Character-Aware Audio-Visual Subtitling. 2975-2979 - Shaoxu Li, Ye Pan:
Instant Photorealistic Neural Radiance Fields Stylization. 2980-2984 - Xiangbo Gao, Qinliang Lin, Cheng Luo, Weicheng Xie, Linlin Shen, Keerthy Kusumam, Siyang Song:
Scale-Free And Task-Generic Attack: Generating Photo-Realistic Adversarial Patterns With Patch Quilting Generator. 2985-2989 - Jianan Wang, Zhiliang Wu, Hanyu Xuan, Yan Yan:
Text-Video Completion Networks With Motion Compensation And Attention Aggregation. 2990-2994 - Jing Zhang, Tengfei Zhao, Shiyu Hu, Xin Zhao:
Robust Single-Particle Cryo-EM Image Denoising and Restoration. 2995-2999 - Feng Zhou, Pei Shen, Ju Dai, Na Jiang, Yong Hu, Yu-Kun Lai, Paul L. Rosin:
AHRNET: Attention and Heatmap-Based Regressor for Hand Pose Estimation and Mesh Recovery. 3000-3004 - Cunjuan Zhu, Dongdong Cui, Qi Jia, Weimin Wang, Yu Liu, Michael S. Lew:
Sketch-Based 3D Shape Retrieval With Multi-View Fusion Transformer. 3005-3009 - Mingyuan Ge, Yewen Li, Honghao Wu, Mingyong Li:
JM-CLIP: A Joint Modal Similarity Contrastive Learning Model for Video-Text Retrieval. 3010-3014 - Linhuang Wang, Xin Kang, Fei Ding, Satoshi Nakagawa, Fuji Ren:
MSSTNet: A Multi-Scale Spatio-Temporal CNN-Transformer Network for Dynamic Facial Expression Recognition. 3015-3019 - Yilian Zhong, Jiaming Liu, Xuan Huang, Jingjing Liu, Yibo Fan, Minfeng Wu:
CDCNet: A Fast and Lightweight Dehazing Network with Color Distortion Correction. 3020-3024 - Huy Le, Tung Kieu, Anh Nguyen, Ngan Le:
WAVER: Writing-Style Agnostic Text-Video Retrieval Via Distilling Vision-Language Models Through Open-Vocabulary Knowledge. 3025-3029 - Marc Windsheimer, Fabian Brand, André Kaup:
Multiscale Augmented Normalizing Flows for Image Compression. 3030-3034 - Yan Hong, Jianfu Zhang, Zhongyi Sun, Ke Yan:
ProAug: Prototype-Based Augmentation for Long-Tailed Image Classification. 3035-3039 - Shaobo Zhang, Sheng Liu, Fei Gao, Yuan Feng:
Dynamic Mutual-Activated Transformer for Human Motion Prediction. 3040-3044 - Yan Hong, Li Niu, Jianfu Zhang:
Arbitrary Style Transfer with Prototype-Based Channel Alignment. 3045-3049 - Ziming Liu, Ezio Malis, Philippe Martinet:
One-Stage Deep Stereo Network. 3050-3054 - Peng Qin, Youneng Bao, Fanyang Meng, Wen Tan, Chao Li, Genhong Wang, Yongsheng Liang:
Leveraging Redundancy in Feature for Efficient Learned Image Compression. 3055-3059 - Shangjie Wang, Yan Zhang:
3DSAM: Segment Anything in NeRF. 3060-3064 - Hichem Sahbi:
TCMP: End-to-End Topologically Consistent Magnitude Pruning for Miniaturized Graph Convolutional Networks. 3065-3069 - Hichem Sahbi:
DAMP: Distribution-Aware Magnitude Pruning for Budget-Sensitive Graph Convolutional Networks. 3070-3074 - Dejun Zhang, Xiaowei Lin, Benxin Yi, Yiqi Wu:
Active Learning with Core-Set Sampling and Scale-Sensitive Loss for 3D Object Detection. 3075-3079 - Shaopan Wang, Jiezhou He, Xin He, Jiaoyue Hu, Zuguo Liu, Zhiming Luo:
DEEPOREDNET: Contrastive Learning-Based Attention-Weighted Dual Channel Residual Network for Ocular Redness Assessment. 3080-3084 - Jianxun Lou, Xinbo Wu, Richard White, Yingying Wu, Hantao Liu:
Time-Interval Visual Saliency Prediction in Mammogram Reading. 3085-3089 - Haifeng Zhao, Rui Zhou, Shaojie Zhang, Yanping Fu:
Single Image Reflection Removal Using Feature Difference Enhancement. 3090-3094 - Mengting Ma, Chenlu Hu, Huanting Zhang, Xiaowen Ma, Tian Feng, Wei Zhang:
CROCFUN: Cross-Modal Conditional Fusion Network for Pansharpening. 3095-3099 - Jingchao Hou, Guanghui He:
Redefining Night Vision: The Power of MSR-Driven Neural ISP. 3100-3104 - Huang Huang, Qiang Wan, Jari Korhonen:
High Resolution Image Quality Database. 3105-3109 - Bolin Jiang, Yuqiu Xie, Jiawei Li, Naiqi Li, Yong Jiang, Shu-Tao Xia:
CAGEN: Controllable Anomaly Generator using Diffusion Model. 3110-3114 - Trung Hoang, Jon S. McElvain, Vishal Monga:
Fast and Physically Enriched Deep Network for Joint Low-Light Enhancement and Image Deblurring. 3115-3119 - Ryo Nakamura, Ryu Tadokoro, Eisuke Yamagata, Yusuke Kondo, Kensho Hara, Hirokatsu Kataoka, Nakamasa Inoue:
Pseudo-Outlier Synthesis Using Q-Gaussian Distributions for Out-of-Distribution Detection. 3120-3124 - Guanqun Liu, Xiaoshuai Hao:
Customized Treatment Per Pixel for Blind Image Super-Resolution. 3125-3129 - Yijin Liu, Guoqiang Xiao, Michael S. Lew, Song Wu:
Multi-Scale Fusion of Gated Neighborhood Attention Transformers for Single Image Deraining. 3130-3134 - Lu Kang, Guoqiang Xiao, Michael S. Lew, Song Wu:
Arbitrary Style Transfer Based on Content Integrity and Style Consistency Enhancement. 3135-3139 - Yinglu Zhang, Chenbo Zhang, Lu Zhang, Tianying Liu, Jihong Guan, Xinkai Liang, Jiajia Zhao, Shuigeng Zhou:
Tail Classes Matter: Long-Tailed Object Detection Revisited. 3140-3144 - Yongjian Zhao, Xinyan Cao, Siqi Liu, Jinming Che, Wei Ren, Jian Cao, Jinlong Lin:
A Facial Expression Transfer Method Based on 3DMM and Diffusion Models. 3145-3149 - Yao Lu, Yutian Huang, Jiaqi Nie, Zuohui Chen, Qi Xuan:
RK-CORE: An Established Methodology for Exploring the Hierarchical Structure within Datasets. 3150-3154 - Yuhong He, Long Peng, Lu Wang, Jun Cheng:
Latent Degradation Representation Constraint for Single Image Deraining. 3155-3159 - LeoWu TomyEnrique, Xiangcheng Du, Kangliang Liu, Han Yuan, Zhao Zhou, Cheng Jin:
Efficient Scene Text Image Super-Resolution with Semantic Guidance. 3160-3164 - Li Li, You Qin, Wei Ji, Yuxiao Zhou, Roger Zimmermann:
Domain-Wise Invariant Learning for Panoptic Scene Graph Generation. 3165-3169 - Suncheng Xiang, Cang Liu, Jiacheng Ruan, Shilun Cai, Sijia Du, Dahong Qian:
VT-ReID: Learning Discriminative Visual-Text Representation for Polyp Re-Identification. 3170-3174 - Haoye Dong, Jun Liu, Dong Huang:
DF-VTON: Dense Flow Guided Virtual Try-On Network. 3175-3179 - Jianfei Jiang, Mingwei Cao, Jun Yi, Chenglong Li:
DI-MVS: Learning Efficient Multi-View Stereo With Depth-Aware Iterations. 3180-3184 - Taohong Zhu, Jun Shen, Chali Wang, Huiyuan Xiong:
Drop Sparse Convolution for 3D Object Detection. 3185-3189 - Fan Zhang, Wei Qin, Weijieying Ren, Lei Wang, Zetong Chen, Richang Hong:
Gradient-Aware Logit Adjustment Loss for Long-Tailed Classifier. 3190-3194 - Chao Wei, Zhidong Deng:
Open-Vocabulary Skeleton Action Recognition with Diffusion Graph Convolutional Network and Pre-Trained Vision-Language Models. 3195-3199 - Rui Deng, Qian Wu, Yuke Li, Haoran Fu:
Differentiable Resolution Compression and Alignment for Efficient Video Classification and Retrieval. 3200-3204 - Chengzhang Yu, Xianjun Yang, Wenxia Bao, Shaonan Wang, Zhiming Yao:
A Self-Supervised Pressure Map Human Keypoint Detection Approach: Optimizing Generalization and Computational Efficiency Across Datasets. 3205-3209 - Jaehyeop Choi, Youngbaek Kim, Younghyun Lee:
Robust Face Recognition Based on an Angle-Aware Loss and Masked Autoencoder Pre-Training. 3210-3214 - Andy Regensky, André Kaup:
Geometry-Corrected Geodesic Motion Modeling with Per-Frame Camera Motion for 360-Degree Video Compression. 3215-3219 - Chen Zhao, Weiling Cai, Chenyu Dong, Ziqi Zeng:
Toward Sufficient Spatial-Frequency Interaction for Gradient-Aware Underwater Image Enhancement. 3220-3224 - Yuanbo Wen, Tao Gao, Ziqi Li, Jing Zhang, Ting Chen:
Multi-Dimension Queried and Interacting Network for Stereo Image Deraining. 3225-3229 - Wenxuan Zhang, Xuechao Zou, Li Wu, Xiaoying Wang, Jianqiang Huang, Junliang Xing:
ARFA: An Asymmetric Receptive Field Autoencoder Model for Spatiotemporal Prediction. 3230-3234 - Yingjie Tang, Shou Feng, Chunhui Zhao, Yongqi Chen, Yuanze Fan, Maosheng Wei:
A Lightweight Change Detection Method Based on Feature Interaction and Transformer for High Resolution Remote Sensing Images. 3235-3239 - Haisheng Li, Fang Yuan:
SIANet: Support Information-Aware Network for Category-Agnostic Pose Estimation. 3240-3244 - Hao Zhang, Zixuan Sun, Yuhui Zheng, Kaihua Zhang, Gang Dong, Lingyan Liang, Yaqian Zhao:
Glance, Focus and Refinement Network for Remote Sensing Change Detection. 3245-3249 - Yuqi Tan, Yuang Peng, Hao Fang, Bin Chen, Shu-Tao Xia:
WaterDiff: Perceptual Image Watermarks Via Diffusion Model. 3250-3254 - Kang Fu, Yicong Peng, Zicheng Zhang, Qihang Xu, Xiaohong Liu, Jia Wang, Guangtao Zhai:
AttentionLUT: Attention Fusion-Based Canonical Polyadic LUT for Real-Time Image Enhancement. 3255-3259 - Yiming Wu, Hangfei Li, Fangfang Wang, Yilong Zhang, Ronghua Liang:
Self-Distilled Dynamic Fusion Network for Language-Based Fashion Retrieval. 3260-3264 - Frank Sippel, Jürgen Seiler, André Kaup:
A Guided Upsampling Network for Short Wave Infrared Images Using Graph Regularization. 3265-3269 - Nicolas Horst, Mathias Wien:
Complexity Reduction of Template Matching-Based Reference Picture Padding in Video Coding. 3270-3274 - Hao Qi, Ming Wu, Sunkui Ke, Xiangxing Chen, Hui-Qing Zeng, Yinran Chen, Xiongbiao Luo:
Deep Residual W-Unit Learning with Semantic Embedding for Automatic Pulmonary CT Artery-Vein Separation. 3275-3279 - Xing Wei, Zhaoxin Ji, Fan Yang, Chong Zhao, Bin Wen, Yang Lu:
Self-Training Domain Adaptation Via Weight Transmission Between Generators. 3280-3284 - Bowei Zhang, Rongting Xu, Peng Cui:
TransCycle: A Data Augmentation Method for 3D Human Pose Estimation. 3285-3289 - Airat Kotliar-Shapirov, Sergei Gostilovich, Anastasia Sozykina, Anh Huy Phan, Andrzej Cichocki:
Granger Connectivity Analysis as a Block-Term Tensor Regression for eSport Players. 3290-3294 - Antonio Luigi Stefani, Niccoló Bisagno, Nicola Conci:
MapFlow: Multi-Agent Pedestrian Trajectory Prediction Using Normalizing Flow. 3295-3299 - Heunseung Lim, Jungkyoo Shin, Hyoungki Choi, Dohoon Kim, Eunwoo Kim, Joonki Paik:
Gravitated Latent Space Loss Generated by Metric Tensor for High-Dynamic Range Imaging. 3300-3304 - Tianhao Xue, Gang Zhou, Runlin He, Zhong Wang, Juan Chen, Zhenhong Jia:
RVDNet: A Two-Stage Network for Real-World Video Desnowing with Domain Adaptation. 3305-3309 - Renqiu Xia, Dongyuan Zhang, Yixin Dong, Juanping Zhao, Wenlong Liao, Tao He, Junchi Yan:
Efficient Architecture Search for Real-Time Instance Segmentation. 3310-3314 - Zhong Wang, Gang Zhou, Jing Ma, Tianhao Xue, Zhenhong Jia:
Beyond the Snowfall: Enhancing Snowy Day Object Detection Through Progressive Restoration and Multi-Feature Fusion. 3315-3319 - Yuting Wu, Xian-Feng Han, Guoqiang Xiao:
Language-Driven Open-Vocabulary 3D Semantic Segmentation with Knowledge Distillation. 3320-3324 - Shuili Zhang, Hongzhang Mu, Quangang Li, Chenglong Xiao, Tingwen Liu:
Fine-Grained Features Alignment and Fusion for Text-Video Cross-Modal Retrieval. 3325-3329 - Guang Yang, Yin Tang, Zhijian Wu, Jun Li, Jianhua Xu, Xili Wan:
DMKD: Improving Feature-Based Knowledge Distillation for Object Detection Via Dual Masking Augmentation. 3330-3334 - Fei Zhu, Wanqian Zhang, Dayan Wu, Lin Wang, Bo Li, Weiping Wang:
Exploring Targeted Universal Adversarial Attack for Deep Hashing. 3335-3339 - Ze-Yu Mi, Yu-Bin Yang:
CUTDEM: Depth-Aware Enhanced Multi-View Image Mixing for Light Field Super-Resolution. 3340-3344 - Haiyi Liu, Beibei Wang, Lu Zhang, Jianmin Ji, Yanyong Zhang:
BEVoxSeg: BEV-Voxel Representation for Fast and Accurate Camera-Based 3D Segmentation. 3345-3349 - Tianyi Song, Jiuxin Cao, Kun Wang, Bo Liu, Xiaofeng Zhang:
Causal-Story: Local Causal Attention Utilizing Parameter-Efficient Tuning for Visual Story Synthesis. 3350-3354 - Bahador Rashidi, Kiarash Aghakasiri, Chao Gao, Shuting Zhang, Yue Zhang, Ying Liu, Fengyu Sun:
A Multiscale Objective Function for Camera Color Correction. 3355-3359 - Yifan Zhang, Chunzhen Lin, Donglin Cao, Dazhen Lin:
End-To-End Spatially-Constrained Multi-Perspective Fine-Grained Image Captioning. 3360-3364 - Siyang Pan, Jiaqian Yu, Dongwook Lee, Yiwei Chen, Chao Zhang, Qiang Wang, ByungIn Yoo:
Efficient Learning on Successive Test Time Augmentation. 3365-3369 - Lena Eichermüller, Gaurang Chaudhari, Ioannis Katsavounidis, Zhijun Lei, Hassene Tmar, Christian Herglotz, André Kaup:
Encoding Time and Energy Model for SVT-AV1 Based on Video Complexity. 3370-3374 - Zhenghao Zhao, Ye Zhu, Xiaoguang Zhu, Yuzhang Shang, Yan Yan:
Supplementing Missing Visions Via Dialog for Scene Graph Generations. 3375-3379 - Savas Özkan, Mete Ozay, Tom Robinson:
Texture and Normal Map Estimation for 3D Face Reconstruction. 3380-3384 - Jiahui Li, Pourya Shamsolmoali, Yue Lu:
Autoregressive 3D Shape Completion via Sphere-Guided Disentangled Representation. 3385-3389 - Pai Chet Ng, Juwei Lu, Konstantinos N. Plataniotis:
AQF: Assessing the Quality of Hyperspectral Reconstruction with a Learnable Metric. 3390-3394 - Yaokun Fang, Changxi Huang, Chengxin Zhao, Hefei Ling, Xunjie Lin, Jinlong Guo:
DITW: A High-Performance Deep-Independent Template-Based Watermarking. 3395-3399 - Youze Xue, Jiansheng Chen, Hongbing Ma, Huimin Ma:
Refining 3D Human Mesh via Model-Free Offsets Estimation. 3400-3404 - Linhao Xu, Lin Zhao, Xinxin Sun, Di Wang, Guangyu Li, Kedong Yan:
A Comprehensive Framework for Occluded Human Pose Estimation. 3405-3409 - Yunzuo Zhang, Yameng Liu, Weili Kang:
M2SUM: Multi-Granularity Scale-Adaptive Video Summarizer towards Informative Context Representation Learning. 3410-3414 - Junyan Huo, Xue Hao, Shuai Wan, Fuzheng Yang, Ming Li:
Adaptive Chroma Block Vector Derivation from Luma for Screen Content Coding. 3415-3419 - Hongning Liu, Pengming Feng, Mingjie Xie, Dongli Xu, Jian Guan, Guangjun He, Rubo Zhang:
FPN with GMM Based Feature Enhancement Strategy for Object Detection in Remote Sensing Images. 3420-3424 - Xun Sun, Baojiang Zhong, Kai-Kuang Ma:
Corner Detection Based on a Rotation-Invariant and Noise-Insensitive Curvature Measurement. 3425-3429 - Zhi Cao, Youneng Bao, Fanyang Meng, Chao Li, Wen Tan, Genhong Wang, Yongsheng Liang:
Enhancing Adversarial Training with Prior Knowledge Distillation for Robust Image Compression. 3430-3434 - Xiaoyan Sun, Yan Li, De Cheng, Dingwen Zhang, Ling Gao, Luofeng Zhai, Jiande Sun:
Gradient and Brightness Guided Low-Light Enhancement with Attention-Based Self-Paced Learning. 3435-3439 - Peng Liu, Fanyi Wang, Jingwen Su, Yanhao Zhang, Guojun Qi:
Lightweight High-Resolution Subject Matting in the Real World. 3440-3444 - Liuxue Ju, Chengdao Pu, Jun Yu, Wen Su:
Image Harmonization Based on Hierarchical Dynamics. 3445-3449 - Pei Wang, Jiumei He, Qingsen Yan, Yu Zhu, Jinqiu Sun, Yanning Zhang:
Diffevent: Event Residual Diffusion for Image Deblurring. 3450-3454 - Ziqing Wang, Qidong Zhao, Jinku Cui, Xu Liu, Dongkuan Xu:
Autost: Training-Free Neural Architecture Search For Spiking Transformers. 3455-3459 - Jiansheng Chen, Yining Qin, Poyu Lin, Jiawei Li, Youze Xue, Huimin Ma:
Center of Pressure Estimation by Analyzing Walking Videos. 3460-3464 - Chenxin Wen, Yan Gao, Jie Li:
Trades++: Enhancing Multi-Object Tracking of Real Low Confidence Targets Using a Pyramid-Like Self-Attention Model. 3465-3469 - Yongjia Ma, Bin Dou, Tianyu Zhang, Zejian Yuan:
RD-NERF: Neural Robust Distilled Feature Fields for Sparse-View Scene Segmentation. 3470-3474 - Yue Yang, Kaipeng Zhang, Yuying Ge, Wenqi Shao, Zeyue Xue, Yu Qiao, Ping Luo:
Align, Adapt and Inject: Audio-Guided Image Generation, Editing and Stylization. 3475-3479 - Xiaomei Feng, Qi Jia, Yu Liu, Xin Fan, Longin Jan Latecki:
Depth-Guided Dominant Plane Perception for Unsupervised Homography Estimation. 3480-3484 - Yijun Wang, Yuping Ye, Feifei Gu, Zhan Song, Xiaodong Bai:
Adaptive Head Pose Estimation with Real-Time Structured Light. 3485-3489 - Chongke Bi, Xiaoxing Liu, Zhilei Liu:
NERF-AD: Neural Radiance Field With Attention-Based Disentanglement For Talking Face Synthesis. 3490-3494 - Jintao Luo, Juan Li, Tonglin Cheng:
RDANet: Reject Domain Attention Network For Confused Facial Expression Recognition. 3495-3499 - Yuhu Feng, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama:
Privacy Preserving Gaze Estimation Via Federated Learning Adapted To Egocentric Video. 3500-3504 - Songsong Feng, Shengye Yan:
Internal Location Assistance for Temporal Action Proposal Generation. 3505-3509 - Yujie Zang, Yaochen Li, Luguang Cao, Ruitao Lu:
Template-Guided Data Augmentation for Unbiased Scene Graph Generation. 3510-3514 - Zhongyi Sha, Baojiang Zhong:
Prediction-Correction Line Segment Detection. 3515-3519 - Chaoran Li, Chao Yan, Xiaojia Xiang, Jun Lai, Han Zhou, Dengqing Tang:
HADGEO: Image Based 3-DoF Cross-View Geo-Localization with Hard Sample Mining. 3520-3524 - Nitin Arora, Prachi Sharma, Pradeep Kumar, Subhash Chander Sharma:
RTLBP-AN Efficient Local Pattern For Facial Images Retrieval. 3525-3529 - M. E. A. Kherchouche, Franck Galpin, Thierry Dumas, Daniel Ménard, L. Zhang:
RD-cost Regression Speed Up Technique for VVC Intra Block Partitioning. 3530-3534 - Xiaomeng Xin, Heping Song, Jianping Gou:
A New Similarity-Based Relational Knowledge Distillation Method. 3535-3539 - Fengshuo Zhang:
Multiscale Attention Distillation for Object Detection. 3540-3544 - Yuechen Xie, Haobo Jiang, Jin Xie:
Mask6D: Masked Pose Priors for 6D Object Pose Estimation. 3545-3549 - Alik Pramanick, Sandipan Sarma, Arijit Sur:
X-CAUNET: Cross-Color Channel Attention with Underwater Image-Enhancing Transformer. 3550-3554 - Hongwei Luo, Wei Liu, Cheng Chen:
A Two-Stage Dehazing Framework Based on Inverted Image Curve-Enhancement. 3555-3559 - Alik Pramanick, Dhruvil Megha, Arijit Sur:
Attention-Based Spatial-Frequency Information Network for Underwater Single Image Super-Resolution. 3560-3564 - Shuang Liang, Jiaming Lu, Yiyang Cai:
Label Correction For Sketch-Based 3D Shape Retrieval. 3565-3569 - Nanhao Liang, Yong Liu, Wenfang Sun, Yingwei Xia, Fan Wang:
CKT-RCM: Clip-Based Knowledge Transfer and Relational Context Mining for Unbiased Panoptic Scene Graph Generation. 3570-3574 - Frank Sippel, Nils Genser, Hannah Och, Jürgen Seiler, André Kaup:
Color Agnostic Cross-Spectral Disparity Estimation. 3575-3579 - Jing Zhang, Hongxi Wei, Qing Zhang, Xiandong Chen, Jingtao Ma:
HENet: Hyperbolic-Based Encoder-Decoder Network for Word Spotting in Historical Mongolian Documents. 3580-3584 - Jie Wang, Zheng Wang, Rong Wang, Feiping Nie, Xuelong Li:
Outlier-Robust Feature Selection with ℓ2,1-Norm Minimization and Group Row-Sparsity Induced Constraints. 3585-3589 - Zi Wang, Huaibo Huang, Aihua Zheng, Chenglong Li, Ran He:
Parallel Augmentation and Dual Enhancement for Occluded Person Re-Identification. 3590-3594 - Baoye Zhang, Wenxiang Shen, Bin Tan, Die Hu, Jun Wu:
Surface-Constrained Progressive Feature Preserving Point Cloud Compression. 3595-3599 - Siwei Meng, Wuzhen Shi:
Fusing Structure and Appearance Features in Facial Expression Recognition Transformer. 3600-3604 - Xuelin Shen, Kangsheng Yin, Xu Wang, Yulin He, Shiqi Wang, Wenhan Yang:
Image Coding for Analytics via Adversarially Augmented Adaptation. 3605-3609 - Ziqi He, Mengjia Xue, Yunhao Du, Zhicheng Zhao, Fei Su:
Dynamic Clustering and Cluster Contrastive Learning for Unsupervised Person Re-ID With Feature Distribution Alignment. 3610-3614 - Huachen Gao, Shihe Shen, Zhe Zhang, Kaiqiang Xiong, Rui Peng, Zhirui Gao, Qi Wang, Yugui Xie, Ronggang Wang:
FDC-NeRF: Learning Pose-Free Neural Radiance Fields with Flow-Depth Consistency. 3615-3619 - Jialu Xiong, Yefei Wang, Jinshan Zeng:
CLIP-Font: Semantic Self-Supervised Few-Shot Font Generation with CLIP. 3620-3624 - Yuxiang Lu, Shalayiding Sirejiding, Bayram Bayramli, Suizhi Huang, Yue Ding, Hongtao Lu:
Task Indicating Transformer for Task-Conditional Dense Predictions. 3625-3629 - Tao Liu, Chenpeng Du, Shuai Fan, Feilong Chen, Kai Yu:
DiffDub: Person-Generic Visual Dubbing Using Inpainting Renderer with Diffusion Auto-Encoder. 3630-3634 - Yungeng Zhang, Yuan Chang, Yun Shen, Peng Ding, Wei Liang, Mingchuan Yang:
Unsupervised Learning of Facial Optical Flow via Occlusion-Aware Global-Local Matching. 3635-3639 - Yachun Mi, Yu Li, Yan Shu, Shaohui Liu:
ZE-FESG: A Zero-Shot Feature Extraction Method Based on Semantic Guidance for No-Reference Video Quality Assessment. 3640-3644 - Yong Zhang, Chunan Yu, Chenglong Fu, Yuanqi Hu, Ying Zang:
Spatio-Temporal Action Detection with a Motion Sense and Semantic Correction Framework. 3645-3649 - Guanming Liu, Zhihua Wei, Heng Zhang, Rui Wang, Aiquan Yuan, Chuanbao Liu, Biao Chen, Guodong Cao:
Extending Implicit Neural Representations for Text-to-Image Generation. 3650-3654 - Qipei Li, Zefeng Ying, Da Pan, Zhaoxin Fan, Ping Shi:
ESTGN: Enhanced Self-Mined Text Guided Super-Resolution Network for Superior Image Super Resolution. 3655-3659 - Seungkwon Kim, Sangyeon Kim, Seung-Hun Nam:
A Framework for Portrait Stylization with Skin-Tone Awareness and Nudity Identification. 3660-3664 - Shuqing Luo, Wei Gao:
A General Framework for Rotation Invariant Point Cloud Analysis. 3665-3669 - Hannah Och, Shabhrish Reddy Uddehal, Tilo Strutz, André Kaup:
Enhanced Color Palette Modeling For Lossless Screen Content Compression. 3670-3674 - Junkai Fang, Nan Fang, Fei Huang, Jinglin Zhou, Maoying Qiao, Fei Gao:
Learning Discriminative Style Representations for Unsupervised and Few-Shot Artistic Portrait Drawing Generation. 3675-3679 - Mao Feng, Lei Liao, Meng Yang:
Implicit-Knowledge-Guided Align Before Understanding for KB-VQA. 3680-3684 - Hannah Och, Shabhrish Reddy Uddehal, Tilo Strutz, André Kaup:
Improved Screen Content Coding in VVC Using Soft Context Formation. 3685-3689 - Hao Tang, Junyuan Guo, Teng Wang, Yanwei Yu, Chao Wang:
Efficient Joint Rectification of Photometric and Geometric Distortions in Document Images. 3690-3694 - Wangze Xu, Qi Wang, Xinghao Pan, Ronggang Wang:
HDPNERF: Hybrid Depth Priors for Neural Radiance Fields from Sparse Input Views. 3695-3699 - Guojing Ge, Qi Song, Guibo Zhu, Yuting Zhang, Jinglu Chen, Miao Xin, Ming Tang, Jinqiao Wang:
BFRFormer: Transformer-Based Generator for Real-World Blind Face Restoration. 3700-3704 - Yanyi Zhang, Qi Jia, Xin Fan, Yu Liu, Ran He:
CSCNet: Class-Specified Cascaded Network for Compositional Zero-Shot Learning. 3705-3709 - Meng Tian, Ye Xiang, Lifang Wu:
Exploring Spatio-Temporal Discriminative Cues for Group Activity Recognition Via Contrastive Learning. 3710-3714 - Yiming Wang, Qian Huang, Bin Tang, Wenting Liu, Wenchao Shan, Qian Xu:
Learned Video Compression with Spatial-Temporal Optimization. 3715-3719 - Zikai Xu, Bin Liu, Fei Hu, Weihai Li, Nenghai Yu:
DSIS: A Novel (K, N) Threshold Deniable Secret Image Sharing Scheme with Lossless Recovery. 3720-3724 - Minglong Dong, Dongliang Zhou, Jianghong Ma, Haijun Zhang:
Towards Intelligent Design: A Self-Driven Framework for Collocated Clothing Synthesis Leveraging Fashion Styles and Textures. 3725-3729 - Qian Li, Cheng Wen, Rao Fu:
Incremental Tensor Decomposition for Few Shot Neural Radiance Field. 3730-3734 - Xuewei Li, Yilong Fan, Hao Zheng, Jie Gao, Xi Wei, Mei Yu:
Balanced And Discriminative Contrastive Learning For Class-Imbalanced Medical Images. 3735-3739 - Asuka Ishii, Hiroo Ikeda:
3D Pose Estimation from Monocular Video with Camera-Bone Angle Regularization on the Image Feature. 3740-3744 - Mengtang Li, Jie Zhu, Zhixin Huang, Chao Gou:
Imitating the Human Visual System for Scanpath Predicting. 3745-3749 - Qian Qiao, Yu Xie, Ziyin Zeng, Fanzhang Li:
TALDS-Net: Task-Aware Adaptive Local Descriptors Selection for Few-Shot Image Classification. 3750-3754 - Yuanyuan Gao, Pengfei Ren, Mingen Shu, Rui Chu, Jubiao Li, Jing Jin, Wei Li:
SO-Net: Model-Agnostic Sequential Hand Pose Optimization Framework. 3755-3759 - Weixing Xie, Xiao Dong, Yong Yang, Qiqin Lin, Jingze Chen, Junfeng Yao, Xiaohu Guo:
DRSM: Efficient Neural 4D Decomposition for Dynamic Reconstruction in Stationary Monocular Cameras. 3760-3764 - Ruoxi Zhu, Minfeng Wu, Xiankui Xiong, Xuanpeng Zhu, Yibo Fan:
Multi-Weather Degradation-Aware Transformer for Image Restoration. 3765-3769 - Xiaying Chen, Yue Zhou:
Efficient Hierarchical Stripe Attention for Lightweight Image Super-Resolution. 3770-3774 - Zhenwei Cheng, Lei Wu, Changshuo Wang, Xiangxu Meng:
Scene Sketch-to-Image Synthesis Based on Multi-Object Control. 3775-3779 - Yunfang Niu, Dong Yi, Lingxiang Wu, Zhiwei Liu, Pengxiang Cai, Jinqiao Wang:
PFDM: Parser-Free Virtual Try-On via Diffusion Model. 3780-3784 - Huanyu Chen, Weisheng Li, Xinbo Gao, Bin Xiao, Feiyan Li, Yuping Huang:
Facial Aesthetic Enhancement Network for Asian Faces Based on Differential Facial Aesthetic Activations. 3785-3789 - Mohammad Ghasempour, Hadi Amirpour, Mohammad Ghanbari, Christian Timmerer:
Energy-Aware Resolution Selection for Per-Title Encoding. 3790-3794 - Kexin Wu, Fan Tang, Ning Liu, Oliver Deussen, Thi Ngoc Hanh Le, Weiming Dong, Tong-Yee Lee:
Lighting Image/Video Style Transfer Methods by Iterative Channel Pruning. 3800-3804 - Suxian Xiang, Hao Yue, Chenxi Huang, Ping Li:
A Prior Driven Semi-Supervised ViTGAN for Image Recolorization. 3805-3809 - Hadi Amirpour, Jingwen Zhu, Patrick Le Callet, Christian Timmerer:
A Real-Time Video Quality Metric for HTTP Adaptive Streaming. 3810-3814 - Neha Tarigopula, Preyas Garg, Skanda Muralidhar, Sandrine Tornay, Dinesh Babu Jayagopi, Mathew Magimai-Doss:
Content-Based Objective Evaluation of Artificially Generated Sign Language Videos. 3815-3819 - Feihong He, Gang Li, Lingyu Si, Leilei Yan, Shimeng Hou, Hongwei Dong, Fanzhang Li:
CartoonDiff: Training-free Cartoon Image Generation with Diffusion Transformer Models. 3825-3829 - Kaixuan Chen, Qianji Di, Yang Lu, Hanzi Wang:
Semantic-Guided Network with Contrastive Learning for Video Caption. 3830-3834 - Xuan Wang, Mengyuan Liu:
Eye Motion Matters for 3D Face Reconstruction. 3835-3839 - Woonghyun Ka, Jae Young Lee, Jaehyun Choi, Junmo Kim:
Stereo-Matching Knowledge Distilled Monocular Depth Estimation Filtered by Multiple Disparity Consistency. 3840-3844 - Xilai Li, Xiaosong Li, Haishu Tan, Jinyang Li:
SAMF: Small-Area-Aware Multi-Focus Image Fusion for Object Detection. 3845-3849 - Yihan Zhang, Yichu Fang, Qian Zhang:
Focus Fusion Network for Visible and Infrared Image Fusion. 3850-3854 - Yuanpeng He, Lijian Li, Tianxiang Zhan, Wenpin Jiao, Chi-Man Pun:
Generalized Uncertainty-Based Evidential Fusion with Hybrid Multi-Head Attention for Weak-Supervised Temporal Action Localization. 3855-3859 - Cong Hu, Biao Fu, Pei Yu, Liang Zhang, Xiaodong Shi, Yidong Chen:
An Explicit Multi-Modal Fusion Method for Sign Language Translation. 3860-3864 - Daiki Kimura, Tatsuya Ishikawa, Masanori Mitsugi, Yasunori Kitakoshi, Takahiro Tanaka, Naomi Simumba, Kentaro Tanaka, Hiroaki Wakabayashi, Masato Sampei, Michiaki Tatsubori:
SAR2NDVI: Pre-Training for SAR-to-NDVI Image Translation. 3865-3869 - Yang Wang, Jun Xu, Jiaogen Zhou, Jihong Guan:
Video Anomaly Prediction: Problem, Dataset and Method. 3870-3874 - Jichen Yang, Fangfan Chen, Rohan Kumar Das, Zhengyu Zhu, Shunsi Zhang:
Adaptive-Avg-Pooling Based Attention Vision Transformer for Face Anti-Spoofing. 3875-3879 - Baihong Lin, Hanxing Chi, Zengrong Lin, Jun Hu, Liang Wang, Jianxiao Zou, Shicai Fan:
Dual Rank-1 Tensor Attention Module for Convolutional Neural Networks. 3880-3884 - Jiawei Yao, Xiaochao Pan, Tong Wu, Xiaofeng Zhang:
Building Lane-Level Maps from Aerial Images. 3890-3894 - Zhenyu Qiu, Qiang Qi, Yang Lu, Yan Yan, Hanzi Wang:
Proposal Distillation of Multi-Modal Feature Aggregation Network for Video Object Detection. 3895-3899 - Yang Su, Baojiang Zhong, Zikai Wang, Kai-Kuang Ma:
Ellipse Detection Based On Structure-Preserving Anisotropic Edge Extraction. 3900-3904 - Joshna Manoj Reddy, Tony Fredrick, Salman Siddique Khan, Kaushik Mitra:
Near-Field Neural Rendering Guided by Single-Shot Photometric Stereo. 3905-3909 - Dung Vo, Chenguang Liu, McClain Nelson:
Extremely Light-Weight Learning Based LDR to PQ HDR Conversion Using Bernstein Curves. 3910-3914 - Tam Thuc Do, Philip A. Chou, Gene Cheung:
Volumetric 3D Point Cloud Attribute Compression: Learned Polynomial Bilateral Filter for Prediction. 3915-3919 - Junseok Ahn, Youngjoon Jang, Joon Son Chung:
Slowfast Network for Continuous Sign Language Recognition. 3920-3924 - Yanfang Deng, Canlong Zhang, Zhixin Li, Chunrong Wei, Zhiwen Wang, Shuqi Pan:
Gradually Spatio-Temporal Feature Activation for Target Tracking. 3925-3929 - Huayu Wang, Zekun Jiang, Lingxi Xie, Dongsheng Jiang, Wei Shen, Qi Tian:
Domain-Adaptive Semantic Segmentation Emerges From Vision-Language Supervised Domain-Debiased Self-Training. 3930-3934 - Zheng Wang, Laurence T. Yang, Bocheng Ren, Jinglin Zhao, Zhe Li, Guolei Zeng:
Capturing Detail Variations for Lightweight Neural Radiance Fields. 3935-3939 - Zhongling Wang, Raymond Zhou, Shahrukh Athar, Wenbo Yang, Zhou Wang:
Boosting Image Quality Assessment Performance: Unsupervised Score Fusion by Deep Maximum a Posteriori Estimation. 3940-3944 - Qunyue Huang, Bin Fang, Xi Ai, Tianyu Nie:
Perceiving Multi-Layer Representations for No-reference Image Quality Assessment. 3945-3949 - Tianci Xie, Siyang Luo, Zhenghan Chen, Xiaoxuan Liang:
Semanticmapper: Region-Specific Domain Adaptation for 3D Shapes Through Lexical Delineation. 3950-3954 - Zhiqi Li, Xiaosong Yang, Jian Jun Zhang:
GAMAFlow: Estimating 3D Scene Flow via Grouped Attention and Global Motion Aggregation. 3955-3959 - Yongpeng Chang, Guangchun Gao:
MMAFlow: Matching-Guided Motion Aggregation for Optical Flow Estimation. 3960-3964 - Huiyun Cao, Wenqi Huang, Wenming Yang:
NLSIT: A Non-Local Stereo Interaction Transformer for Stereo Image Super-Resolution. 3965-3969 - Babak Naderi, Ross Cutler, Nabakumar Singh Khongbantabam, Yasaman Hosseinkashi, Henrik Turbell, Albert Sadovnikov, Quan Zou:
VCD: A Video Conferencing Dataset for Video Compression. 3970-3974 - Yaoyu Su, Shaohui Wang, Haoqian Wang:
DT-NeRF: Decomposed Triplane-Hash Neural Radiance Fields For High-Fidelity Talking Portrait Synthesis. 3975-3979 - Anqi Shi, Huaqiu Chen, Hong Lu, Rui Zhang:
Buffered Gaussian Modeling for Vectorized HD Map Construction. 3980-3984 - Esin Koyuncu, Timofey Solovyev, Johannes Sauer, Elena Alshina, André Kaup:
Quantized Decoder in Learned Image Compression for Deterministic Reconstruction. 3985-3989 - Xinyu Yang, Feixiang Zhou, Huiyu Zhou:
Online Mouse Behavior Detection by Historical Dependency and Typical Instances. 3990-3994 - Jucheng Song, Chi-Man Pun, Haolun Li, Rushi Lan, Jiucheng Xie, Hao Gao:
Local Optimization Networks for Multi-View Multi-Person Human Posture Estimation. 3995-3999 - Ziqiang Shi, Rujie Liu:
Noisy Image Restoration Based on Conditional Acceleration Score Approximation. 4000-4004 - Taihui Li, Anish Lahiri, Yutong Dai, Owen Mayer:
Joint Demosaicing And Denoising With Double Deep Image Priors. 4005-4009 - Haoke Xiao, Lv Tang, Bo Li, Zhiming Luo, Shaozi Li:
Zero-Shot Co-Salient Object Detection Framework. 4010-4014 - Yuchen Zhou, Guang Tan, Chao Gou:
Hierarchical Home Action Understanding with Implicit and Explicit Prior Knowledge. 4015-4019 - Hongyu Ye, Ke Xu, Xinghao Jiang, Tanfeng Sun:
Learning Spatio-Temporal Relations with Multi-Scale Integrated Perception for Video Anomaly Detection. 4020-4024 - Ryosuke Watanabe, Keisuke Nonaka, Eduardo Pavez, Tatsuya Kobayashi, Antonio Ortega:
Fast Graph-Based Denoising For Point Cloud Color Information. 4025-4029 - Yuhang Ming, Jian Ma, Xingrui Yang, Weichen Dai, Yong Peng, Wanzeng Kong:
AEGIS-Net: Attention-Guided Multi-Level Feature Aggregation for Indoor Place Recognition. 4030-4034 - Yangdong Chen, Yanfei Wang, Yuejie Zhang, Rui Feng, Tao Zhang, Xuequan Lu, Shang Gao:
Fine-Granularity Face Sketch Synthesis. 4035-4039 - Haisheng Fu, Feng Liang, Jie Liang, Zhenman Fang, Guohe Zhang, Jingning Han:
Efficient Learned Image Compression with Selective Kernel Residual Module and Channel-Wise Causal Context Model. 4040-4044 - Yanwu Yang, Xutao Guo, Guoqing Cai, Chenfei Ye, Ting Ma:
Topology-Regularized Self-Knowledge Distillation for Transductive-Inductive Learning of Brain Disorder Diagnosis. 4045-4049 - Xinran Lyu, Libao Zhang:
Phase Learning Based on Interactive Perception for Limited-Sample Residential Area Semantic Segmentation. 4050-4054 - Xingyu Ding, Weiqiang Wang:
MAS-NET: Mixed-Feature Attention Siamese Network for Change Detection on Remote Sensing Images. 4055-4059 - Haochen Chang, Jing Chen, Yilin Li, Jixiang Chen, Xiaofeng Zhang:
Wavelet-Decoupling Contrastive Enhancement Network for Fine-Grained Skeleton-Based Action Recognition. 4060-4064 - Hao Kong, Jie Xu, Shenjian Gong, Jian Yang, Shanshan Zhang:
Adaptive Pedestrian Trajectory Prediction via Target-Directed Angle Augmentation. 4065-4069 - Xudong Jin, Jianfeng Xu, Kei Kawamura:
Embedded Graph Representation for Inter-Frame Coding of Dynamic Meshes. 4070-4074 - Tzuhsuan Huang, Chen-Che Huang, Chung-Hao Ku, Jun-Cheng Chen:
Blenda: Domain Adaptive Object Detection Through Diffusion-Based Blending. 4075-4079 - Ruohui Zheng, Libao Zhang:
Unsupervised Remote Sensing Haze Removal Based on Saliency-Guided Transmission Refinement. 4080-4084 - Che Chen, Lin Chen, Xue Jiang, Xingzhao Liu, Abdelhak M. Zoubir:
Deep Unrolling Network for SAR Image Despeckling. 4085-4089 - Wanning Zhu, Lin Tan, Libao Zhang:
SDRNet: Saliency-Guided Dynamic Restoration Network for Rain and Haze Removal in Nighttime Images. 4090-4094 - Yang Zou, Xingyuan Li, Zhiying Jiang, Tiantian Yan, Jinyuan Liu:
Adaptive Multi-Exposure Fusion for Enhanced Neural Radiance Fields. 4095-4099 - Yushin Cho, Madhu Krishnan, Xin Zhao, Shan Liu:
Adaptive Secondary Transform Sets for Video Coding Beyond AV1. 4100-4104 - Junda Xu, Libao Zhang:
Hazy Remote Sensing Images Semantic Segmentation for Weakly Annotation Based on Saliency-Aware Alignment Strategy. 4105-4109 - Qi An, Mengshi Qi, Huadong Ma:
Multi-Stage Contrastive Regression for Action Quality Assessment. 4110-4114 - Luchuan Song, Pinxin Liu, Guojun Yin, Chenliang Xu:
Adaptive Super Resolution for One-Shot Talking-Head Generation. 4115-4119 - Jiangqi Liu, Feng Wang:
Mixed-Attention Auto Encoder for Multi-Class Industrial Anomaly Detection. 4120-4124 - Xiaoyu Jin, Wenqi Huang, Lingyu Liang, Yang Wu, Qunsheng Zeng, Ruiye Zhou, Zhuojun Cai, Jianing Shang, Wenming Yang:
DEGAN: Discrimination Enhanced GAN for Perceptual-Oriented Super-Resolution. 4125-4129 - Zhiyang Chen, Yousong Zhu, Zhaowen Li, Fan Yang, Chaoyang Zhao, Jinqiao Wang, Ming Tang:
The Devil is in Details: Delving Into Lite FFN Design for Vision Transformers. 4130-4134 - Yixin Lei, Xingyuan Li, Zhiying Jiang, Xinrui Ju, Jinyuan Liu:
AEAM3D: Adverse Environment-Adaptive Monocular 3D Object Detection via Feature Extraction Regularization. 4135-4139 - Hongru Wang, Baohang Zhou, Zhengkun Zhang, Yiming Du, David Ho, Kam-Fai Wong:
M3sum: A Novel Unsupervised Language-Guided Video Summarization. 4140-4144 - Long Feng, Guohua Geng, Yong Ren, Zhen Li, Yangyang Liu, Kang Li:
CReStyler: Text-Guided Single Image Style Transfer Method Based on CNN and Restormer. 4145-4149 - Sipeng Yang, Hongyu Huang, Qingchuan Zhu, Xiaogang Jin:
SR-VFA: Accurate Self-Refined Face Alignment in Videos. 4150-4154 - Jie Wu, Chunlei Wu, Yiwei Wei, Xiuxuan Shen, Leiquan Wang:
Memory Self-Calibrated Network for Visual Grounding. 4155-4159 - Zihan Gao, Peng Gao, Wei Yin, Yifan Liu, Zengchang Qin:
Robust Lightweight Depth Estimation Model via Data-Free Distillation. 4160-4164 - Binglei Li, Zhizhong Huang, Hongming Shan, Junping Zhang:
Semantic Latent Decomposition with Normalizing Flows for Face Editing. 4165-4169 - Xu Zhang, Rui Tang, Guipeng Zhang, Dehui Kong, Ke Xu:
Low-Light Raw Image Enhancement on a Dataset Suffering Light Effects. 4170-4174 - Jiajun Ling, Yifan Chen, Qimin Cheng, Xiao Huang:
Zigzag Attention: A Structural Aware Module For Lane Detection. 4175-4179 - Jiayin Wen, Dianwei Wang, Jie Fang, Yuanqing Li, Zhijie Xu:
Multi-Object Tracking for Unmanned Aerial Vehicles Based on Multi-Frame Feature Fusion. 4180-4184 - Guanchen Ding, Chang Wen Chen:
Towards Omniscient Feature Alignment for Video Rescaling. 4190-4194 - Xinran Lyu, Libao Zhang:
Semantic Segmentation for Multi-Scene Remote Sensing Images with Noisy Labels Based on Uncertainty Perception. 4195-4199 - Linya Zheng, Fan Zhang, Haichao Peng, Yong Wang, Yinran Chen, Xiongbiao Luo:
Loop Structure-Aware Learning for Fully Automated Pulmonary Fissure Completeness Assessment. 4200-4204 - Shiwei Zhao, Shengye Yan:
Bounding Box-Guided Pseudo Point Clouds Early-Fusion and Density Optimize for 3D Object Detection. 4205-4209 - Jian Xiong, Junhao Wu, Wang Luo, Jiucheng Xie, Hao Gao:
Geometry Compression Artifact Removal for V-PCC over a Wide Bitrate Range. 4210-4214 - Shuhong Liao, Chuanmin Jia, Hongfei Fan, Jingwen Yan, Siwei Ma:
Rate-Quality Based Rate Control Model for Neural Video Compression. 4215-4219 - Peng Liu, Chuanxu Wang, Min Zhao:
Modal Consensus and Contextual Separation for Weakly Supervised Temporal Action Localization. 4220-4224 - Mengyi Zhao, Mengyuan Liu, Bin Ren, Shuling Dai, Nicu Sebe:
Denoising Diffusion Probabilistic Models for Action-Conditioned 3D Motion Generation. 4225-4229 - Hongming Fu, Guanyao Wu, Zhu Liu, Tiantian Yan, Jinyuan Liu:
Segmentation-Driven Infrared and Visible Image Fusion Via Transformer-Enhanced Architecture Searching. 4230-4234 - Anzhe Cheng, Heng Ping, Zhenkun Wang, Xiongye Xiao, Chenzhong Yin, Shahin Nazarian, Mingxi Cheng, Paul Bogdan:
Unlocking Deep Learning: A BP-Free Approach for Parallel Block-Wise Training of Neural Networks. 4235-4239 - Pranay Kashyap, Sourabh Vasant Gothe, Vibhav Agarwal, Jayesh Rajkumar Vachhani:
SAM-GEBD: Zero-Cost Approach for Generic Event Boundary Detection. 4240-4244 - Baole Wei, Minghang He, Liangcai Gao, Duoyou Zhou, Xiang Bai, Zhi Tang:
Maskstr: Guide Scene Text Recognition Models with Masking. 4245-4249 - Dunbo Ning, Wenjing Chen, Wei Xie, Hao Sun:
Spatial Formation-Guided Network for Group Activity Recognition. 4250-4254 - Li Yu, Yanjun Gao, Farhad Pakdaman, Moncef Gabbouj:
Panoramic Image Inpainting with Gated Convolution and Contextual Reconstruction Loss. 4255-4259 - Arne Berresheim, Antonio Agudo:
Photovoltaic Power Forecasting Using Sky Images and Sun Motion. 4260-4264 - Chao Wu, Xiaobin Chang, Ruixuan Wang:
Generalizable Two-Branch Framework for Image Class-Incremental Learning. 4265-4269 - Yiyu Liu, Fengshan Zhao, Qin Liu, Takeshi Ikenaga:
Multi-Level Spatial-Temporal Feature Aggregation and Alignment-Based Selective Residual Dense Propagation Module for HDR Video Reconstruction. 4270-4274 - Xiaomeng Yang, Dongbao Yang, Zhi Qiao, Yu Zhou:
Accurate and Robust Scene Text Recognition via Adversarial Training. 4275-4279 - Tagon Sompong, Chawan Piansaddhayanon, Ekapol Chuangsuwanich:
Attribute-Aware Amplification of Facial Feature Sequences for Facial Emotion Recognition. 4280-4284 - Jiarun Song, Shengnan Wang, Fuzheng Yang:
Perceptual Quality Evaluation for Faster Playback Videos. 4285-4289 - Sergio Sánchez Santiesteban, Sara Atito, Muhammad Awais, Yi-Zhe Song, Josef Kittler:
Improved Image Captioning Via Knowledge Graph-Augmented Models. 4290-4294 - Zipei Yan, Zhengji Liu, Jizhou Li:
Boosting of Implicit Neural Representation-Based Image Denoiser. 4295-4299 - Hicham Talaoubrid, Khizar Hayat, Baptiste Magnier:
Straightforward Adaptation of Particle Filter to Fish Eye Images for Top View Pedestrian Tracking. 4300-4304 - Qian Wu, Ruoxuan Cui, Yuke Li, Haoqi Zhu:
HaltingVT: Adaptive Token Halting Transformer for Efficient Video Recognition. 4305-4309 - Ning Wang, Yifei She, Rui Xu, Bin Liu, Haojie Li, Zhiyong Wang, Zhihui Wang:
Bridging the Gap: Sketch to Color Diffusion Model with Semantic Prompt Learning. 4310-4314 - Zihang Chen, Zhu Liu, Jinyuan Liu:
Local Contrast Prior-Guided Cross Aggregation Model for Effective Infrared Small Target Detection. 4315-4319 - Xinyu Wang, Xiaochuan Wang, Ruijun Liu, Xiankai Huang:
Rating-Augmented No-Reference Point Cloud Quality Assessment Using Multi-Task Learning. 4320-4324 - Se Jin Park, Minsu Kim, Jeongsoo Choi, Yong Man Ro:
Exploring Phonetic Context-Aware Lip-Sync for Talking Face Generation. 4325-4329 - Xiuzhen He, Yan Wang, Qiaoli Sun, Fangxu Zhou:
PFCF-Net: A Network Based on Progressive Feature Interaction and Cross-Scale Feature Fusion for Remote Sensing Change Detection. 4330-4334 - Marouane Tliba, Aladine Chetouani, Giuseppe Valenzise, Frédéric Dufaux:
Balancing Representation Abstractions and Local Details Preservation for 3D Point Cloud Quality Assessment. 4335-4339 - Guowen Kuang, Xin Lu, Jingran Xia, Hao Geng, Xu Wang, Jinfeng Yang:
Real-Oriented Object Detection Driven by Intelligent Stockbreeding. 4345-4349 - Haokun Zhu, Juang Ian Chong, Teng Hu, Ran Yi, Yu-Kun Lai, Paul L. Rosin:
SAMVG: A Multi-Stage Image Vectorization Model with the Segment-Anything Model. 4350-4354 - Jiaxu Wang, Bo Xu, Hao Cheng, Renjing Xu:
DONE: Dynamic Neural Representation Via Hyperplane Neural ODE. 4355-4359 - Xing Wu, Zhi Li, Junfeng Yao, Quan Qian, Jian Zhang, Qun Sun, Yike Guo:
EDM: Synthetic Data from Exemplar Diffusion Model Improves Non-Communicable Diseases Detection. 4360-4364 - Liyuan Qi, Olaoluwa R. Popoola, Jingyan Wang, Muhammad Ali Imran, Wasim Ahmad:
3D Hand Joint and Grasping Estimation for Teleoperation System. 4365-4369 - Wanqiang Cai, Bin Wang:
LV-SEGFORMER: Towards More Accurate Leaf-Vein Segmentation with Transformer. 4370-4374 - Chang Liu, Aimin Jiang, Yibin Tang, Yanping Zhu, Qi Chen:
3D Point Cloud Semantic Segmentation Based on Diffusion Model. 4375-4379 - Gwanhyeong Koo, Sunjae Yoon, Chang D. Yoo:
Wavelet-Guided Acceleration of Text Inversion in Diffusion-Based Image Editing. 4380-4384 - Hatef Otroshi-Shahreza, Alexandre Veuthey, Sébastien Marcel:
Face Recognition Using Lensless Camera. 4385-4389 - Feifan Wang, Yuan Zong, Jie Zhu, Mengting Wei, Xiaolin Xu, Cheng Lu, Wenming Zheng:
Progressively Learning from Macro-Expressions for Micro-Expression Recognition. 4390-4394 - Dayong Wang, Yishen Deng, Weisheng Li, Xin Lu, Frédéric Dufaux, Bo Hang, Ce Zhu:
Fast Intra Mode Prediction Algorithms for SCBS in VVC SCC. 4395-4399 - Sensen Song, Dayong Ren, Zhenhong Jia, Fei Shi:
Adaptive Gaussian Regularization Constrained Sparse Subspace Clustering for Image Segmentation. 4400-4404 - Boya Wang, Shuo Wang, Dong Ye, Ziwen Dou:
Deep Neighbor Layer Aggregation for Lightweight Self-Supervised Monocular Depth Estimation. 4405-4409 - Zhaoyong Yan, Gang Li, Lingyu Si, Hongwei Dong:
FPGNet: Single Image Deraining with High-Frequency Channel and Frequency Domain Prior Guidance. 4410-4414 - Shougan Pan, Zhengwentai Sun, Chenxing Wang, Junkai Zhang:
A 3D Virtual Try-On Method with Global-Local Alignment and Diffusion Model. 4415-4419 - Lingyu Si, Gang Li, Hongwei Dong, Changwen Zheng, Fanjiang Xu, Fuchun Sun:
Radardiff: Improving Sea Clutter Suppression Using Diffusion Models for Radar Images. 4420-4424 - Qiang Li, Qianchen Mao, Wenjie Liu, Jinbao Wang, Wenmin Wang, Bingshu Wang:
Local Information Guided Global Integration for Infrared Small Target Detection. 4425-4429 - Qiang Yang, Xiaodong Wu, Xiuying Chen, Xin Gao, Xiangliang Zhang:
Think as People: Context-Driven Multi-Image News Captioning with Adaptive Dual Attention. 4430-4434 - Zhe Ye, Diqun Yan, Li Dong, Kailai Shen:
Breaking Speaker Recognition with Paddingback. 4435-4439 - Weifeng Ou, Lai-Man Po, Xiu-Feng Huang:
Joint Learning of Identity and Vein Features for Enhanced Representations in Vascular Biometrics. 4440-4444 - Kensuke Wagata, Andrew Beng Jin Teoh:
Cross-Domain Cross-Task Transfer Mobile Touch-Stroke Authentication. 4445-4449 - Zhuofan Yang, Qiushi Li, Shenghai Luo, Shunquan Tan, Bin Li:
Improving VGG-Style Convnet for JPEG Steganalysis. 4450-4454 - Shuai Ren, Yuxiao Li, Bo Li, Hao Gong, Qiuyu Feng:
A Multi-Carrier Information Hiding Algorithm Based on Layered Compression of 3D Point Cloud Model. 4455-4459 - Pengcheng Lu, Liang Cai, Keting Yin:
SourceP: Detecting Ponzi Schemes on Ethereum with Source Code. 4465-4469 - Tianyi Zheng, Qinji Yu, Zhaoyu Chen, Jia Wang:
FAMIM: A Novel Frequency-Domain Augmentation Masked Image Model Framework for Domain Generalizable Face Anti-Spoofing. 4470-4474 - Hui Zeng, Biwei Chen, Anjie Peng:
Enhancing Targeted Transferability via Feature Space Fine-Tuning. 4475-4479 - Chao-Bo Yan, Fang-Qi Li, Shi-Lin Wang:
Data-Free Watermark for Deep Neural Networks by Truncated Adversarial Distillation. 4480-4484 - Pengfei Zhao, Haojie Yuan, Qi Chu, Shubin Xu, Nenghai Yu:
Delving Deeper Into Vulnerable Samples in Adversarial Training. 4490-4494 - Yaoxing Wang, Qian Yu, Ling Lin, Zhendong Li, Hao Liu:
Language-Driven Ordinal Learning for Imbalanced Head Pose Estimation. 4495-4499 - Wentang Song, Yuzhen Lin, Bin Li:
Towards Generic Deepfake Detection with Dynamic Curriculum. 4500-4504 - Jeong Gyu Park, Sisung Liu, Je Hyeong Hong:
XMP: A Cross-Attention Multi-Scale Performer for File Fragment Classification. 4505-4509 - Quanlong Guan, Tian Zhang, Yu Qin, Yuyu Zhou, Yangguang Zhu, Yuansheng Zhong, Xiujie Huang, Zhifei Duan, Zhefu Li, Changjiang Liu, Xiaofeng Wu:
Transformer Model with Multi-Type Classification Decisions for Intrusion Attack Detection of Track Traffic and Vehicle. 4510-4514 - Heng Wang, Hongxia Wang, Xinyi Huang, Zhenhao Shi:
Enhanced Screen Shooting Resilient Document Watermarking. 4515-4519 - Zihan Chen, Tianrui Liu, Jun-Jie Huang, Wentao Zhao, Xing Bi, Meng Wang:
Invertible Mosaic Image Hiding Network for Very Large Capacity Image Steganography. 4520-4524 - Yijun Liu, Honglan Yu, Feifei Dai, Xiaoyan Gu, Chenxu Cui, Bo Li, Weiping Wang:
FUR-API: Dataset and Baselines Toward Realistic API Anomaly Detection. 4525-4529 - Xin Liu, Ning Xi, Ke Cheng, Jiaxuan Fu, Xinghui Zhu, Yulong Shen, Jianfeng Ma:
Securely and Efficiently Outsourcing Neural Network Inference via Parallel MSB Extraction. 4530-4534 - Yu Bu, Yulin Zhu, Longling Geng, Kai Zhou:
Uncovering Strong Ties: A Study of Indirect Sybil Attack on Signed Social Network. 4535-4539 - Fengfan Zhou, Hefei Ling, Yuxuan Shi, Jiazhong Chen, Ping Li:
Improving Visual Quality and Transferability of Adversarial Attacks on Face Recognition Simultaneously with Adversarial Restoration. 4540-4544 - Bing Fan, Shu Hu, Feng Ding:
Synthesizing Black-Box Anti-Forensics Deepfakes With High Visual Quality. 4545-4549 - Ali Moradi Shahmiri, Chih Wei Ling, Cheuk Ting Li:
Communication-Efficient Laplace Mechanism for Differential Privacy via Random Quantization. 4550-4554 - Li Wang, Jiaqi Li, Yuhao Luo, Jiahao Zheng, Lei Wang, Hao Li, Ke Xu, Chengfang Fang, Jie Shi, Zhizheng Wu:
ADVSV: An Over-the-Air Adversarial Attack Dataset for Speaker Verification. 4555-4559 - Ruifan Zhang, Jianyi Liu, Ru Zhang:
Controllable Semantic Linguistic Steganography via Summarization Generation. 4560-4564 - Miaoxin Ye, Dongxia Huang, Kangkang Wei, Weiqi Luo:
A Novel Residual-Guided Learning Method for Image Steganography. 4565-4569 - Ziyang Yu, Ting Yang, Qiong Chang, Yu Liu, Weimin Wang:
Attribution-Based Scanline Perturbation Attack on 3D Detectors of Lidar Point Clouds. 4570-4574 - Yanyixiao Wang, Peiya Li:
JPEG Encryption with DC Prediction and Run-Based RS Pairs Permutation. 4575-4579 - Chengrui Gao, Ziyuan Yang, Min Zhu, Andrew Beng Jin Teoh:
Scale-Aware Competition Network for Palmprint Recognition. 4580-4584 - Xiu-Feng Huang, Lai-Man Po, Weifeng Ou:
Motion Transfer-Driven Intra-Class Data Augmentation for Finger Vein Recognition. 4585-4589 - E. Chen, Yang Cao, Yifei Ge:
Rényi Differential Privacy in the Shuffle Model: Enhanced Amplification Bounds. 4590-4594 - Fei Zhang, Hongxia Wang, Mingze He, Ling Yang:
Exploring Consistent Spatio-Temporal Distortion and Stable 3-D DCT Coefficients for Robust Blind Video Watermarking. 4595-4599 - Haoyi Wang, Victor Sanchez, Chang-Tsun Li:
Cross-Age Contrastive Learning for Age-Invariant Face Recognition. 4600-4604 - Yuankun Xie, Jingjing Zhou, Xiaolin Lu, Zhenghao Jiang, Yuxin Yang, Haonan Cheng, Long Ye:
FSD: An Initial Chinese Dataset for Fake Song Detection. 4605-4609 - Fei Zhang, Hongxia Wang, Mingze He, Ling Yang, Jinhe Li:
Adaptive Video Watermarking with Perceptual Guarantee and Efficiency Optimization. 4610-4614 - Peter Rot, Janez Krizaj, Peter Peer, Vitomir Struc:
Enhancing Gender Privacy with Photo-Realistic Fusion of Disentangled Spatial Segments. 4615-4619 - Renyang Liu, Wei Zhou, Jinhong Zhang, Haoran Li, Ruxin Wang:
CNFA: Conditional Normalizing Flow for Query-Limited Attack. 4620-4624 - Folco Bertini Baldassini, Huy H. Nguyen, Ching-Chung Chang, Isao Echizen:
Cross-Attention Watermarking of Large Language Models. 4625-4629 - Yuchen Wong, Chen Yan, Shengfang Zhai, Cong Li, Qingni Shen:
Security Equivalence Assessment between Cloud Standards by Mapping of Control Items. 4630-4634 - Jiaqi Li, Li Wang, Liumeng Xue, Lei Wang, Zhizheng Wu:
An Initial Investigation of Neural Replay Simulator for Over-The-Air Adversarial Perturbations to Automatic Speaker Verification. 4635-4639 - Jiatong Liu, Mingcheng Zhang, Jianpeng Ke, Lina Wang:
AdvShadow: Evading DeepFake Detection via Adversarial Shadow Attack. 4640-4644 - Tianwei Zuo, Yiping Duan, Qiyuan Du, Xiaoming Tao:
Semantic Security: A Digital Watermark Method for Image Semantic Preservation. 4645-4649 - Patrick O'Reilly, Zeyu Jin, Jiaqi Su, Bryan Pardo:
Maskmark: Robust Neural Watermarking for Real and Synthetic Speech. 4650-4654 - Lun Wang, Om Thakkar, Rajiv Mathews:
Unintended Memorization in Large ASR Models, and How to Mitigate It. 4655-4659 - Cong Zhang, Yuezun Li, Honggang Qi, Siwei Lyu:
Enhancing Adversarial Robustness of DNNs Via Weight Decorrelation in Training. 4660-4664 - JongWon Hwang, Andrew Beng Jin Teoh:
Periocular Biometrics Enhancement Through Multimodal Embeddings And Classifier Adaptation. 4665-4669 - Haibin Wu, Heng-Cheng Kuo, Yu Tsao, Hung-Yi Lee:
Scalable Ensemble-Based Detection Method Against Adversarial Attacks For Speaker Verification. 4670-4674 - Yu Sun, Gaojian Xiong, Xianxun Yao, Kailang Ma, Jian Cui:
GI-PIP: Do We Require Impractical Auxiliary Dataset for Gradient Inversion Attacks? 4675-4679 - Wei Du, Tongxin Yuan, Haodong Zhao, Gongshen Liu:
NWS: Natural Textual Backdoor Attacks Via Word Substitution. 4680-4684 - Edoardo Daniele Cannas, P. Beaus, Paolo Bestagini, F. Marques, Stefano Tubaro:
A One-Class Approach to Detect Super-Resolution Satellite Imagery with Spectral Features. 4685-4689 - Caili Gao, Qisheng Xu, Peng Qiao, Kele Xu, Xifu Qian, Yong Dou:
Adapter-Based Incremental Learning for Face Forgery Detection. 4690-4694 - Jianmin Dong, Datian Peng, Taihao Li:
Least-Effort Adversarial Attack Against Gait-Based Identity Recognition System. 4695-4699 - Jiyuan Liu, Wenping Wei, Zhendong Li, Guanfeng Li, Hao Liu:
Invariant Motion Representation Learning for 3D Talking Face Synthesis. 4700-4704 - Lu Yuan, Jiyan Sun, Shangyuan Zhuang, Yinlong Liu, Liru Geng, Jing Zou, Peizhe Xin, Weiqing Huang, Wei Ma:
Manticore: An Unsupervised Intrusion Detection System Based on Contrastive Learning in 5G Networks. 4705-4709 - Wenhan Hou, Bo Cui, Yongxin Chen, Ru Li:
CLPSD: Detecting Ethereum Phishing Scams based on Curriculum Learning. 4710-4714 - Yulun Wu, Mingrui Lao, Yanming Guo, Dongmei Chen, Tianyuan Yu:
Boosting Adversarial Robustness Distillation Via Hybrid Decomposed Knowledge. 4715-4719 - Nan Sun, Chenxin Zhao, Sijing Xie, Hefei Ling:
InvertedFontNet: Font Watermarking based on Perturbing Style Manifold. 4720-4724 - Michele Panariello, Francesco Nespoli, Massimiliano Todisco, Nicholas W. D. Evans:
Speaker Anonymization Using Neural Audio Codec Language Models. 4725-4729 - Qixiang Li, Zhaoya Wang, Lianwen Jin, Nurbiya Yadikar, Kurban Ubul:
MMHSV: A Multimodal Handwritten Signature Verification Fusing Dynamic and Static Feature. 4730-4734 - Andrea Gemelli, Dasara Shullani, Daniele Baracchi, Simone Marinai, Alessandro Piva:
Structure Matters: Analyzing Videos Via Graph Neural Networks for Social Media Platform Attribution. 4735-4739 - Huanhuan Ma, Jinghao Zhang, Qiang Liu, Shu Wu, Liang Wang:
Interpretable Multimodal Out-of-Context Detection with Soft Logic Regularization. 4740-4744 - Shaoyou Zeng, Wenhao Wang, Fangjun Huang, Yanmei Fang:
LOFT: Latent Space Optimization and Generator Fine-Tuning for Defending Against Deepfakes. 4750-4754 - Kaiyi Pang, Minhao Bai, Jinshuai Yang, Huili Wang, Minghu Jiang, Yongfeng Huang:
FREmax: A Simple Method Towards Truly Secure Generative Linguistic Steganography. 4755-4759 - Haitian Zhang, Guang Hua, Wen Yang:
Poisoning-Free Defense Against Black-Box Model Extraction. 4760-4764 - Peishuai Sun, Chengxiang Si, Shuhao Li, Zhenyu Cheng, Shuyuan Zhao, Qingyun Liu:
A Targeted Adversarial Attack Method for Multi-Classification Malicious Traffic Detection. 4765-4769 - Hao Fang, Ajian Liu, Ning Jiang, Quan Lu, Guoqing Zhao, Jun Wan:
VL-FAS: Domain Generalization via Vision-Language Model For Face Anti-Spoofing. 4770-4774 - Zikai Xu, Bin Liu, Fei Hu, Weihai Li, Nenghai Yu:
SE-SIS: Shadow-Embeddable Lossless Secret Image Sharing for Greyscale Images. 4775-4779 - Anjith George, Sébastien Marcel:
Heterogeneous Face Recognition Using Domain Invariant Units. 4780-4784 - Anna Leschanowsky, Ünal Ege Gaznepoglu, Nils Peters:
Voice Anonymization for All-Bias Evaluation of the Voice Privacy Challenge Baseline Systems. 4785-4789 - Giulia Bertazzini, Daniele Baracchi, Dasara Shullani, Massimo Iuliani, Alessandro Piva:
A Codec-Based Approach for Video Life-Cycle Characterization in Social Networks. 4790-4794 - Zongyi Li, Zhongyang Li, Yuxuan Shi, Hefei Ling, Jiazhong Chen, Runsheng Wang, Ping Li:
Uncertainty-Guided Person Search Model with Auxiliary Shallow Feature Exploration. 4795-4799 - Wei Huang, Yinggui Wang, Anda Cheng, Aihui Zhou, Chaofan Yu, Lei Wang:
A Fast, Performant, Secure Distributed Training Framework For LLM. 4800-4804 - Luca Cuccovillo, Milica Gerhardt, Patrick Aichroth:
Audio Transformer for Synthetic Speech Detection via Formant Magnitude and Phase Analysis. 4805-4809 - Matthew Jagielski, Om Thakkar, Lun Wang:
Noise Masking Attacks and Defenses for Pretrained Speech Models. 4810-4814 - Pierre Fernandez, Guillaume Couairon, Teddy Furon, Matthijs Douze:
Functional Invariants To Watermark Large Transformers. 4815-4819 - Benjamin D. Kim, Vipindev Adat Vasudevan, Jongchan Woo, Alejandro Cohen, Rafael G. L. D'Oliveira, Thomas Stahlbuhk, Muriel Médard:
Crypto-Mine: Cryptanalysis Via Mutual Information Neural Estimation. 4820-4824 - Eleonora Breci, Luca Guarnera, Sebastiano Battiato:
Innovative Methods for Non-Destructive Inspection of Handwritten Documents. 4825-4829 - Pavel Korshunov, Anjith George, Gökhan Özbulak, Sébastien Marcel:
Vulnerability of Face Age Verification to Replay Attacks. 4830-4834 - Pretom Roy Ovi, Aryya Gangopadhyay:
Gradient Inversion Attacks on Acoustic Signals: Revealing Security Risks in Audio Recognition Systems. 4835-4839 - Chu-Xiao Zuo, Zhi-Jun Jia, Wu-Jun Li:
AdvTTS: Adversarial Text-to-Speech Synthesis Attack on Speaker Identification Systems. 4840-4844 - Chenzhong Yin, Hantang Zhang, Mingxi Cheng, Xiongye Xiao, Xinghe Chen, Xin Ren, Paul Bogdan:
Discovering Malicious Signatures in Software from Structural Interactions. 4845-4849 - Yinyin Peng, Donghui Hu, Gang Pei, Yaofei Wang:
Image Steganography with Deep Orthogonal Fusion of Multi-Scale Channel Attention. 4850-4854 - Qiuxia Wu, Zicheng Wang, Kunming Su, Sangni Xu:
GSTNet: Gait Spatio-Temporal Network for Gait Recognition Using Millimeter-Wave Radar. 4855-4859 - Shoham Hanina, Alon Zolfi, Yuval Elovici, Asaf Shabtai:
Universal Adversarial Attack Against Speaker Recognition Models. 4860-4864 - Qiongxiu Li, Lixia Luo:
On the Privacy of Federated Clustering: a Cryptographic View. 4865-4869 - Wenjun Qian, Qingni Shen, Haoran Xu, Xi Huang, Zhonghai Wu:
DROPFL: Client Dropout Attacks Against Federated Learning Under Communication Constraints. 4870-4874 - Ge Han, Ahmed Salem, Zheng Li, Shanqing Guo, Michael Backes, Yang Zhang:
Detection and Attribution of Models Trained on Generated Data. 4875-4879 - Ziang Li, Chengxiang Si, Zhenyu Cheng, Shuyuan Zhao, Yong Ding:
MLMTD: A Multi-Layer Malicious Traffic Detection Model Based on Multi-Branch Octave Convolution and Attention Mechanism. 4880-4884 - Cui Chen, Zuping Zhang, Panrui Tang, Junyu Zhang:
Nebnet: Exploiting Node-Edge Bi-Level Network for Gene Expression Prediction. 4885-4889 - Yangzhao Xiang, Mutellip Mamut, Nurbiya Yadikar, Ghalipjan Ibrahim, Kurban Ubul:
The Collaboration of 3D Convolutions and CRO-TSM in Lipreading. 4890-4894 - Kun Guo, Haochen Zhu, Gang Cao:
Effective Image Tampering Localization Via Enhanced Transformer and Co-Attention Fusion. 4895-4899 - Heqing Zou, Meng Shen, Yuchen Hu, Chen Chen, Eng Siong Chng, Deepu Rajan:
Cross-Modality and Within-Modality Regularization for Audio-Visual Deepfake Detection. 4900-4904 - Jian Ge, Jianwu Rui, Hengtai Ma, Bin Li, Yeping He:
HySense: Hybrid Event Occurrence Detection Method for IoT Devices. 4905-4909 - Han Zhang, Xutao Yu, Zaichen Zhang, Bingcheng Zhu:
Robust and Imperceptible Commercial Camera-Screen Communication with 60Hz Refresh Rate. 4910-4914 - Siyang Luo, Ziyi Jiang, Zhenghan Chen, Xiaoxuan Liang:
Domain Adaptive Graph Classification. 4915-4919 - Behrooz Razeghi, Parsa Rahimi, Sébastien Marcel:
Deep Variational Privacy Funnel: General Modeling with Applications in Face Recognition. 4920-4924 - Yan He, Fei Peng, Min Long, Kwok-Yan Lam:
Causality-Inspired Single-Source Domain Generalization for Face Anti-Spoofing. 4925-4929 - Hatef Otroshi-Shahreza, Sébastien Marcel:
Face Reconstruction from Partially Leaked Facial Embeddings. 4930-4934 - Jiazhen Wang, Bin Liu, Changtao Miao, Zhiwei Zhao, Wanyi Zhuang, Qi Chu, Nenghai Yu:
Exploiting Modality-Specific Features for Multi-Modal Manipulation Detection and Grounding. 4935-4939 - Yuwei Han, Yuni Lai, Yulin Zhu, Kai Zhou:
Cost Aware Untargeted Poisoning Attack Against Graph Neural Networks. 4940-4944 - Yue Gao, Jinshuai Yang, Cheng Chen, Kaiyi Pang, Yongfeng Huang:
Enhancing Steganography of Generative Image Based on Image Retouching. 4945-4949 - Gewangzi Du, Liwei Chen, Tongshuai Wu, Chenguang Zhu, Gang Shi:
CPMSVD: Cross-Project Multiclass Software Vulnerability Detection Via Fused Deep Feature and Domain Adaptation. 4950-4954 - Shota Horiguchi, Kota Dohi, Yohei Kawaguchi:
Streaming Active Learning for Regression Problems Using Regression via Classification. 4955-4959 - Shuangjie Li, Baoming Zhang, Jianqing Song, Yifan Xia, Junyuan Xie, Chongjun Wang:
Seeking Similarities While Removing Differences: Graph Neural Networks Based on Node Correlation. 4960-4964 - Ryosuke Sonoda, Ramya Srinivasan:
Mutual Information-Based Fair Active Learning. 4965-4969 - Qi Zhang, Yanfeng Sun, Jipeng Guo, Shaofan Wang, Jinghua Li, Junbin Gao, Baocai Yin:
AutoFGNN: A Framework for Extracting All Frequency Information from Large-Scale Graphs. 4970-4974 - Qi Wu, Chengjia Wang, Xiaohui Li, Guangxing Wu, Marta Vallejo, Ruixuan Wang:
Audio-Aided Learning Framework for Image Classification with Limited Training Images. 4975-4979 - Feifei Fu, Yizhao Gao, Zhiwu Lu, Haoran Wu, Shiqi Zhao:
Unsupervised Continual Learning of Image Representation Via Rememory-Based Simsiam. 4980-4984 - Yaping Zhao, Guanghan Li, Edmund Y. Lam:
Cross-Camera Human Motion Transfer by Time Series Analysis. 4985-4989 - Hao Yang, Ruochen Gu, Zihan Yang, Anyong Hu, Tie Jun Cui, Jungang Miao:
Pmmwdeconv: Unsupervised Data-Consistent Blind Passive Millimeterwave Image Deconvolution with Global Context Priors. 4990-4994 - Qiankun Tang:
Stable Knowledge Transfer for Contrastive Distillation. 4995-4999 - Wenjie Yang, Shengzhong Zhang, Zengfeng Huang:
Enhancing Performance of Coarsened Graphs with Gradient-Matching. 5000-5004 - John Martinsson, Maria Sandsten:
DMEL: The Differentiable Log-Mel Spectrogram as a Trainable Layer in Neural Networks. 5005-5009 - Lan Wu, Quan Liu, Lihua Zhang, Zhigang Huang:
Offline Reinforcement Learning with Policy Guidance and Uncertainty Estimation. 5010-5014 - Weichen Xu, Jian Cao, Tianhao Fu, Awen Bai, Ruilong Ren, Zicong Hu, Xixin Cao, Xing Zhang:
J-MAE: Jigsaw Meets Masked Autoencoders in X-Ray Security Inspection. 5015-5019 - Yuyan Chen, Xing Zhao, Ji Gan, Jiaxu Leng, Yan Zhang, Xinbo Gao:
Structure-Aware in-Air Handwritten Text Recognition with Graph-Guided Cross-Modality Translator. 5020-5024 - Lincon S. Souza, Takumi Kobayashi, Yasunori Nishimori, Yasuko Sugase-Miyamoto, Kenji Kawano, Shotaro Akaho, Narihisa Matsumoto:
Local Distance Correlation Embedding for Time-Series Analysis on Riemannian Manifolds. 5025-5029 - Ban Chen, Xin Jin, Youxin Chen, Longhai Wu, Jie Chen, Jayoon Koo, Cheul-Hee Hahm:
Dynamic Video Frame Interpolation with Integrated Difficulty Pre-Assessment. 5030-5034 - Jing Wang, Yuang Liu, Qiang Zhou, Fan Wang:
Language-Guided Few-Shot Semantic Segmentation. 5035-5039 - Huijing Zhan, Jung-Jae Kim, Guimei Liu:
Contrastive Learning with Bidirectional Transformers for Knowledge Tracing. 5040-5044 - Ben Maman, Johannes Zeitler, Meinard Müller, Amit H. Bermano:
Performance Conditioning for Diffusion-Based Multi-Instrument Music Synthesis. 5045-5049 - Kumari Nishu, Minsik Cho, Paul Dixon, Devang Naik:
Flexible Keyword Spotting Based on Homogeneous Audio-Text Embedding. 5050-5054 - Shaokang Dong, Chao Li, Wubing Chen, Hongye Cao, Wenbin Li, Yang Gao:
Multi-Agent Exploration via Self-Learning and Social Learning. 5055-5059 - Haotian Zhang, Hong Qi:
DuNet: A Robust End-to-End Deep Neural Network Framework for Imbalanced Classification. 5060-5064 - Zexu Sun, Xu Chen:
M3TN: Multi-Gate Mixture-of-Experts Based Multi-Valued Treatment Network for Uplift Modeling. 5065-5069 - Yu Qian, Xiaoshuang Li, Jian Cao, Jie Zhang, Hufei Li, Jue Chen:
Boosting Pruned Networks with Linear Over-Parameterization. 5070-5074 - Cheng-Yen Yang, Hsiang-Wei Huang, Zhongyu Jiang, Hao Wang, Farron Wallace, Jenq-Neng Hwang:
A Density-Guided Temporal Attention Transformer for Indiscernible Object Counting in Underwater Videos. 5075-5079 - Mohamad Dhaini, Maxime Berar, Paul Honeine, Antonin Van Exem:
Contrastive Learning for Regression on Hyperspectral Data. 5080-5084 - Shuhao Shi, Zhengyan Wang, Jian Chen, Kai Qiao, Jie Yang, Bin Yan:
Representation Learning across Feature and Topology Views with Output Correction for Graph Convolutional Networks. 5085-5089 - Shuoshuo Chen, Yushun Tang, Zhehan Kan, Zhihai He:
Learning Inference-Time Drift Sensor-Actuator for Domain Generalization. 5090-5094 - Lijun Wang:
CLT: Cooperative Lottery Ticket Hypothesis in Live Streaming Sales Prediction. 5095-5099 - Shaohua Li, Haixiang Zhang, Hanjie Ma, Jie Feng, Mingfeng Jiang:
Efficient Posenet with Coarse to Fine Transformer. 5100-5104 - Weichen Xu, Xinxin Xu, Tianhao Fu, Jian Cao, Xiaoyang Xu, Yuetian Huang, Xixin Cao, Xing Zhang:
SweepMM: A High-Quality Multimodal Dataset for Sweeping Robots in Home Scenarios for Vision-Language Model. 5105-5109 - Zhiqun Pan, Yongxiong Wang, Jiapeng Zhang, Xiaoming Wang, Guangpeng Wang:
Mitigating Optimization Conflict in Domain Adversarial Neural Network via Uncertainty-Aware. 5110-5114 - Ahmad Sajedi, Samir Khaki, Yuri A. Lawryshyn, Konstantinos N. Plataniotis:
ProbMCL: Simple Probabilistic Contrastive Learning for Multi-Label Visual Classification. 5115-5119 - Wei Ding, Zhennan Chen, Hanpeng Jiang, Yuanguo Lin, Fan Lin:
Trend-Heuristic Reinforcement Learning Framework for News-Oriented Stock Portfolio Management. 5120-5124 - Xiaodong Wang, Junbao Zhuo, Shuhao Cui, Shuhui Wang, Yuejian Fang:
Learning Invariant Representation with Consistency and Diversity for Semi-Supervised Source Hypothesis Transfer. 5125-5129 - Hongbo Kang, Yong Wang, Mengyuan Liu, Doudou Wu, Peng Liu, Xinlin Yuan, Wenming Yang:
Diffusion-Based Pose Refinement and Multi-Hypothesis Generation for 3D Human Pose Estimation. 5130-5134 - Jingyu Zhuang, Kuo Wang, Liang Lin, Guanbin Li:
Credible Teacher for Semi-Supervised Object Detection in Open Scene. 5135-5139 - Baiqi Li, Yedi Ma, Yufei Liu, Hongyan Gu, Zhenghan Chen, Xinli Huang:
Federated Learning on Distributed Graphs Considering Multiple Heterogeneities. 5140-5144 - Rockson Agyeman, Bernhard Rinner:
Real-Time Multi-Human Parsing on Embedded Devices. 5145-5149 - Xinqian Chen, Jin Zhang, Xiaoli Gong:
G2G: Generalized Learning by Cross-Domain Knowledge Transfer for Federated Domain Generalization. 5150-5154 - Ruicheng Niu, Ziyuan Zhu, Chaofei Li, Dan Meng:
Search Robust and Adaptable Architecture. 5155-5159 - Yuang Liu, Jing Wang, Qiang Zhou, Fan Wang, Jun Wang, Wei Zhang:
DMT: Comprehensive Distillation with Multiple Self-Supervised Teachers. 5160-5164 - YeongHyeon Park, Sungho Kang, Myung Jin Kim, Hyeonho Jeong, Hyunkyu Park, Hyeong-Seok Kim, Juneho Yi:
Neural Network Training Strategy To Enhance Anomaly Detection Performance: A Perspective On Reconstruction Loss Amplification. 5165-5169 - Bin Wang, Jun Fang, Hongbin Li, Yonina C. Eldar:
A Stochastic Gradient Approach for Communication Efficient Confederated Learning. 5170-5174 - Yingjie Sun, Fanrui Zeng, Jiamin Xiao, Yuxiao Deng, Yifan Ding, Yizhou Li:
GPTCN: Gated Parallel Transformer Convolutional Networks for Downstream-Task User Representation Learning on App Usage. 5175-5179 - Pranay Lohia, Badri Narayana Patro, Naveen Panwar, Vijay Agneeswaran:
SPASE: Spatial Saliency Explanation For Time Series Models. 5180-5184 - Kai Xu, Lichun Wang, Huiyong Zhang, Baocai Yin:
Self-Knowledge Distillation with Learning from Role-Model Samples. 5185-5189 - Juncheng Jin, Junhao Zhang, Junjie Tang, Shengrui Liang, Zehui Qu:
Spatio-Temporal Data Mining with Information Integrity Protection: Graph Signal Based Air Quality Prediction. 5190-5194 - Xiaoou Zhang, Yang Gao, Yang Liu, Yujia Zhu, Peng Zhang, Chuan Zhou, Qingyun Liu, Hongyang Chen:
Meta Structure Search for Link Weight Prediction in Heterogeneous Graphs. 5195-5199 - Haitao Huang, Chi-Man Pun, Haolun Li, Mengqi Liu, Jian Xiong, Hao Gao:
DeformMLP: Dynamic Large-Scale Receptive Field MLP Networks for Human Motion Prediction. 5200-5204 - Yifan Pan, Guibo Luo, Bairong Li, Yuesheng Zhu:
Enhanced Unsupervised Domain Adaptation with Dual-Attention Between Classification and Domain Alignment. 5205-5209 - Lei Guan:
AdaPlus: Integrating Nesterov Momentum and Precise Stepsize Adjustment on AdamW Basis. 5210-5214 - Harry J. Davies, Yuyang Miao, Amir Nassibi, Morteza Khaleghimeybodi, Danilo P. Mandic:
Segmented Error Minimisation (SEMI) for Robust Training of Deep Learning Models with Non-Linear Shifts in Reference Data. 5215-5219 - Yilang Zhang, Bingcong Li, Georgios B. Giannakis:
Meta-Learning With Versatile Loss Geometries for Fast Adaptation Using Mirror Descent. 5220-5224 - Sachini Piyoni Ekanayake, Daphney-Stavroula Zois:
Sequential Acquisition of Features and Experts for Datum-Wise Classification. 5225-5229 - Zijian Liu, Ping Jiang, Lixin Lin, Xiaoheng Deng:
Edge Attention Learning for Efficient Camouflaged Object Detection. 5230-5234 - Jun-Jie Huang, Tianrui Liu, Jingyuan Xia, Meng Wang, Pier Luigi Dragotti:
DURRNET: Deep Unfolded Single Image Reflection Removal Network with Joint Prior. 5235-5239 - Bin Jiang, Zhihao Li, M. Salman Asif, Xun Cao, Zhan Ma:
Token-Based Spatiotemporal Representation of the Events. 5240-5244 - Yujie Li, Zezhi Shao, Yongjun Xu, Qiang Qiu, Zhaogang Cao, Fei Wang:
Dynamic Frequency Domain Graph Convolutional Network for Traffic Forecasting. 5245-5249 - Shuanghao Bai, Wanqi Zhou, Zhirong Luan, Donglin Wang, Badong Chen:
Improving Cross-Domain Few-Shot Classification with Multilayer Perceptron. 5250-5254 - Lan Wu, Quan Liu, Lihua Zhang, Zhigang Huang:
Offline Reinforcement Learning with Generative Adversarial Networks and Uncertainty Estimation. 5255-5259 - Christos Korgialas, Evangelia Pantraki, Constantine Kotropoulos:
Interpretable Face Aging: Enhancing Conditional Adversarial Autoencoders with LIME Explanations. 5260-5264 - Vivien Cabannes, Charles Arnal:
Touring Sampling With Pushforward Maps. 5265-5269 - Boqi Dai, Kai Ouyang, Jun Yuan, Miaoxin Chen, Xingyu Lu, Weiwen Liu, Rui Zhang, Hai-Tao Zheng:
Enhancing Multi-Task Models For Recommendation with Tensor Trace Norm. 5270-5274 - Zijun Long, Richard McCreadie, Gerardo Aragon-Camarasa, Zaiqiao Meng:
LaCViT: A Label-Aware Contrastive Fine-Tuning Framework for Vision Transformers. 5275-5279 - Hailiang Xu, Haozhen Situ:
Dynamic Model Structure Adjustment to Realize Quantum Continual Learning Based on Quantum Data. 5280-5284 - Lin Wang, Jingjing Zhang:
Adaptive Multi-Armed Bandit Learning for Task Offloading in Mobile Edge Computing. 5285-5289 - Xuannan Liu, Yaoyao Zhong, Weihong Deng, Hongzhi Shi, Xingchen Cui, Yunfeng Yin, Dongchao Wen:
Enhancing Generalization Of Invisible Facial Privacy Cloak Via Gradient Accumulation. 5290-5294 - Hanyuan Zhang, Yuqi Chen, Xinyu Zhang, Qize Jiang, Liang Li, Baihua Zheng, Weiwei Sun:
Modeling Route Representation With Mixed-Scale Hierarchical Transformer. 5295-5299 - Valentin Breaz, Richard Wilkinson:
Randomized Maximum Likelihood Via High-Dimensional Bayesian Optimization. 5300-5304 - Hafiz Tiomoko Ali, Umberto Michieli, Ji Joong Moon, Daehyun Kim, Mete Ozay:
Deep Neural Network Models Trained with a Fixed Random Classifier Transfer Better Across Domains. 5305-5309 - Ishan D. Khurjekar, Peter Gerstoft:
Multi-Source DOA Estimation With Statistical Coverage Guarantees. 5310-5314 - Ernie Chang, Sidd Srinivasan, Mahi Luthra, Pin-Jie Lin, Varun Nagaraja, Forrest N. Iandola, Zechun Liu, Zhaoheng Ni, Changsheng Zhao, Yangyang Shi, Vikas Chandra:
On the Open Prompt Challenge in Conditional Audio Generation. 5315-5319 - Ernie Chang, Pin-Jie Lin, Yang Li, Sidd Srinivasan, Gaël Le Lan, David Kant, Yangyang Shi, Forrest N. Iandola, Vikas Chandra:
In-Context Prompt Editing for Conditional Audio Generation. 5320-5324 - Vladimir Iashin, Weidi Xie, Esa Rahtu, Andrew Zisserman:
Synchformer: Efficient Synchronization From Sparse Cues. 5325-5329 - Sanket R. Jantre, Nathan M. Urban, Xiaoning Qian, Byung-Jun Yoon:
Learning Active Subspaces for Effective and Scalable Uncertainty Quantification in Deep Neural Networks. 5330-5334 - Xianfeng Li, Weijie Chen, Shicai Yang, Yishuang Li, Wenhao Guan, Lin Li:
Multivariate Fourier Distribution Perturbation: Domain Shifts with Uncertainty in Frequency Domain. 5335-5339 - Miao Jing, Vidhyasaharan Sethu, Beena Ahmed:
A Probability Gradient Based Approach for Sampling Boundaries of In-Domain Data. 5340-5344 - Jinzhi Zheng, Libo Zhang, Yanjun Wu, Chen Zhao:
BPDO: Boundary Points Dynamic Optimization for Arbitrary Shape Scene Text Detection. 5345-5349 - Yujun Cheng, Zhewei Zhang, Shengjin Wang:
RCIF: Towards Robust Distributed DNN Collaborative Inference Under Highly Lossy Networks. 5350-5354 - Bin Liu, Yuchen Luo, Shaofeng Zhang, Zehuan Yuan, Changdong Xu, Boan Chen, Junchi Yan:
View Crafting For Instance-Level Representation from Scene Images. 5355-5359 - Kexin Ke, Jian Yang, Yingjie Liu, Mingsong Chen, Xian Wei, Xuan Tang:
Social LODE: Human Trajectory Prediction with Latent ODEs. 5360-5364 - Jen-Tzung Chien, Yuan-An Chen:
Towards a Unified View of Adversarial Training: A Contrastive Perspective. 5365-5369 - Qiwei Ye, Fan Yang, Jing Lu, Yu Tang, Linbo Qiao, Yunwei Zhao:
FDIG: A Fine-Grained Data Integration Approach for Group Recommendation. 5370-5374 - Jingsen Zhang, Xiaohe Bo, Chenxi Wang, Quanyu Dai, Zhenhua Dong, Ruiming Tang, Xu Chen:
Active Explainable Recommendation with Limited Labeling Budgets. 5375-5379 - Yue Xu, Xin Liu, Kun He, Shao Huang, Yaodong Zhao, Jie Gu:
Image Mixing and Gradient Smoothing to Enhance the SAR Image Attack Transferability. 5380-5384 - Xin Liu, Yali Li, Shengjin Wang:
Learning Generalizable Visual Representations via Self-Supervised Information Bottleneck. 5385-5389 - Zheng Zeng, Tiecheng Song, Xinran Ma, Yinghao Jiu, Huaiyi Sun:
Joint Classification of Hyperspectral and Lidar Data Using Cross-Modal Hierarchical Frequency Fusion Network. 5390-5394 - Shuchang Zhang, Hongxia Wang:
A Sequential Averaging Plug-and-Play Method for Image Restoration Via Fixed-Point Projection. 5395-5399 - Fuhan Cai, Zhongqiang Zhang, Duo Liu, Xiangzhong Fang:
COPHTC: Contrastive Learning with Prompt Tuning for Hierarchical Text Classification. 5400-5404 - Guangzhi Zhao, Yuting Hou, Kedian Mu:
Prompting to Prompt for Rehearsal-Free Class Incremental Learning. 5405-5409 - Xiaohan Li, Zhaofeng He, Kai Huang, Zhibo Yang, Gaowei Zhang:
Enhancing Short- and Long-Term Sea Surface Temperature Forecasting with a Static and Dynamic Learnable Personalized Graph Convolution Network. 5410-5414 - Zhenwei Zhang, Ruiqi Wang, Ran Ding, Yuantao Gu:
Unravel Anomalies: An End-to-End Seasonal-Trend Decomposition Approach for Time Series Anomaly Detection. 5415-5419 - Bingzhi Chen, Haoming Zhou, Yishu Liu, Biqing Zeng, Guangming Lu, Zheng Zhang:
Decoupled Self-Adaptive Distribution Regularization for Few-Shot Image Classification. 5420-5424 - Yuze Liu, Ziming Zhao, Tiehua Zhang, Kang Wang, Xin Chen, Xiaowei Huang, Jun Yin, Zhishu Shen:
Exploiting Spatial-Temporal Data for Sleep Stage Classification via Hypergraph Learning. 5430-5434 - Afrina Tabassum, Dung N. Tran, Trung Dang, Ismini Lourentzou, Kazuhito Koishida:
uaMix-MAE: Efficient Tuning of Pretrained Audio Transformers with Unsupervised Audio Mixtures. 5435-5439 - Songlin Yang, Jing Li, Kuanzhi Shi, Yu Chen, Yunlong Zhu, Xudong He, Jinlong Wu, Chenling Pan:
Spatial-Temporal Interaction Decoding Transformer for Unsupervised Multivariate Time Series Anomaly Detection. 5440-5444 - Ying Peng, Yihong Dong, Muqiao Yang, Songtao Lu, Qingjiang Shi:
Signal Transformer: Complex-Valued Attention and Meta-Learning for Signal Recognition. 5445-5449 - Huacheng Li, Chunhe Xia, Tianbo Wang, Wanshuang Lin, Changnan Jiang, Chen Chen, Yuan Zhao:
Multi-Signal Fusion of Social Diffusion Graph with Bi-Directional Semantic Consistency. 5450-5454 - Junjie Wang, Tomas Nordström:
Inputmix: A Strategy to Regularize and Balance Multi-Modality and Multi-View Model Learning. 5455-5459 - Yufeng Yin, Ishwarya Ananthabhotla, Vamsi Krishna Ithapu, Stavros Petridis, Yu-Hsiang Wu, Christi Miller:
Hearing Loss Detection From Facial Expressions in One-On-One Conversations. 5460-5464 - Sota Miyamoto, Takuma Yagi, Yuto Makimoto, Mahiro Ukai, Yoshitaka Ushiku, Atsushi Hashimoto, Nakamasa Inoue:
PolarDB: Formula-Driven Dataset for Pre-Training Trajectory Encoders. 5465-5469 - Mohamad Hassan N C, Avigyan Bhattacharya, Victor G. Turrisi Da Costa, Biplab Banerjee, Elisa Ricci:
Enhancing the Domain Robustness of Self-Supervised Pre-Training with Synthetic Images. 5470-5474 - Yuxin Song, Cheng Luo, Aaron Jackson, Xi Jia, Weicheng Xie, Linlin Shen, Hatice Gunes, Siyang Song:
MERG: Multi-Dimensional Edge Representation Generation Layer for Graph Neural Networks. 5475-5479 - Shiran Bian, Xiaofan Li, Yachao Zhang, Jiayong Zhong, Yanyun Qu:
One-Stage Training Generative Paradigm for Generalized Zero-Shot Learning. 5480-5484 - Zhengyang Chi, Junbin Gao:
Maximal Coding Rate Reduction for Graph Embeddings. 5485-5489 - Hanyu Guo, Wanchuan Yu, Yan Yan, Hanzi Wang:
Bi-Directional Motion Attention with Contrastive Learning for Few-Shot Action Recognition. 5490-5494 - Ying Lv, Jianpeng Ma, Qilin Li, Gang Xu:
Trusted Deep Domain Adaptation with Uncertainty Measure Based on Evidence Theory. 5495-5499 - Cheng-Yi Lee, Cheng-Chang Tsai, Ching-Chia Kao, Chun-Shien Lu, Chia-Mu Yu:
Defending against Clean-Image Backdoor Attack in Multi-Label Classification. 5500-5504 - Wenbo Zhou, Guoqing Zheng, Xinghao Ding:
Dataset Distillation with Channel Efficient Process. 5505-5509 - Sungguk Cha, Jusung Lee, Younghyun Lee, Cheoljong Yang:
Visually Dehallucinative Instruction Generation. 5510-5514 - Guohui Li, Xuanang Ding, Ling Yuan, Lu Zhang, Qian Rong:
Towards Resource-Efficient and Secure Federated Multimedia Recommendation. 5515-5519 - Le Jiang, Hongqiang Cheng, Xiaozhou Ye, Ye Ouyang:
Multi-Teacher Distillation for Incremental Object Detection. 5520-5524 - Xinlong Ding, Jiansheng Chen, Hongwei Yu, Yu Shang, Huimin Ma:
Enhancing Adversarial Transferability in Object Detection with Bidirectional Feature Distortion. 5525-5529 - Jean-Baptiste Malagnoux, Matthieu Kowalski:
From Convolutional Sparse Coding To *-NMF Factorization of Time-Frequency Coefficients. 5530-5534 - H. Cai, Sulaiman A. Alghunaim, Ali H. Sayed:
Diffusion Optimistic Learning for Min-Max Optimization. 5535-5539 - Dezhao Chen, Wenhui Hua:
Hierarchical VAE Based Semantic Communications for POMDP Tasks. 5540-5544 - Shiwei Liu, Yong Xu, Siliang Ma:
Hypergraph-Enhanced Self-Supervised Robust Graph Learning for Social Recommendation. 5545-5549 - Zuogang Shang, Zhibin Zhao, Shibin Wang, Ruqiang Yan:
Anomaly Detection from a Frequency Perspective: M-Band Wavelet Packet Anomaly Detection Network. 5550-5554 - Shenzhi Yang, Li Zhang, Xiaofang Zhang:
FastGAT: Simple and Efficient Graph Attention Neural Network with Global-Aware Adaptive Computational Node Attention. 5555-5559 - Lin Pan, Qianqian Ren:
Urban Traffic Flow Forecasting Based on Spatial-Temporal Graph Contrastive Learning. 5560-5564 - Likun Zhang, Jingwei Sun, Shoukun Guo, Fenghua Li, Jin Cao, Ben Niu:
Interpreting Memorization in Deep Learning from Data Distribution. 5565-5569 - Le Trung Thanh, Karim Abed-Meraim, Philippe Ravier, Olivier Buttelli, Ales Holobar:
Joint INDSCAL Decomposition Meets Blind Source Separation. 5570-5574 - Le Trung Thanh, Karim Abed-Meraim, Philippe Ravier, Olivier Buttelli, Ales Holobar:
Tensorial Convolutive Blind Source Separation. 5575-5579 - Feng Cao, Chang Liu, Deyu Li, Yuhua Qian, Chao Zhang, Hu Zhang:
Local and Global Feature Adaptive Adjustment Network for Remote Sensing Image Scene Classification. 5580-5584 - Ziwei Niu, Hao Sun, Shuyi Ouyang, Shiao Xie, Yen-Wei Chen, Ruofeng Tong, Lanfen Lin:
IRLSG: Invariant Representation Learning for Single-Domain Generalization in Medical Image Segmentation. 5585-5589 - Yuhao Zhou, Minjia Shi, Yuxin Tian, Yuanxi Li, Qing Ye, Jiancheng Lv:
Federated CINN Clustering for Accurate Clustered Federated Learning. 5590-5594 - Lei Wang, Pinyi Huang, Wangyang Cai, Xiyao Liu:
Micro-Expression Recognition by Fusing Action Unit Detection and Spatio-Temporal Features. 5595-5599 - Zhaozhe Hu, Jia-Li Yin, Bin Chen, Luojun Lin, Bo-Hao Chen, Ximeng Liu:
MEAT: Median-Ensemble Adversarial Training for Improving Robustness and Generalization. 5600-5604 - Yunling Feng, Yang Lei, Xinjie Yang, Jian Xu, Xingxian Liu, Bo Xiao, Yajing Xu:
SIMMKD: Simple Mask-Flow Keypoint Detection for Both Typhoon Detection and Typhoon Eye Location. 5605-5609 - Fabiola Espinoza Castellon, Eduardo Fernandes Montesuma, Fred Maurice Ngolè Mboula, Aurélien Mayoue, Antoine Souloumiac, Cédric Gouy-Pailler:
Federated Dataset Dictionary Learning for Multi-Source Domain Adaptation. 5610-5614 - Minxue Niu, Zhaobo K. Zheng, Kumar Akash, Teruhisa Misu:
Beyond Empirical Windowing: An Attention-Based Approach for Trust Prediction In Autonomous Vehicles. 5615-5619 - Eduardo Fernandes Montesuma, Fred Maurice Ngolè Mboula, Antoine Souloumiac:
Multi-Source Domain Adaptation Meets Dataset Distillation through Dataset Dictionary Learning. 5620-5624 - Rongyao Cai, Linpeng Peng, Zhengming Lu, Kexin Zhang, Yong Liu:
DCS: Debiased Contrastive Learning with Weak Supervision for Time Series Classification. 5625-5629 - Alina Ciocarlan, Sylvie Le Hégarat-Mascle, Sidonie Lefebvre, Arnaud Woiselle, Clara Barbanson:
A Contrario Paradigm for YOLO-Based Infrared Small Target Detection. 5630-5634 - M. M. Amaan Valiuddin, Christiaan G. A. Viviers, Ruud van Sloun, Peter H. N. de With, Fons van der Sommen:
Retaining Informative Latent Variables in Probabilistic Segmentation. 5635-5639 - Qiaowei Miao, Junkun Yuan, Shengyu Zhang, Fei Wu, Kun Kuang:
DomainDiff: Boost Out-of-Distribution Generalization with Synthetic Data. 5640-5644 - Juntao Hu, Yuan Wu:
Regularized Conditional Alignment for Multi-Domain Text Classification. 5645-5649 - Xinpeng Lv, Wanrong Huang, Haotian Wang, Ruochun Jin, Xueqiong Li, Zhipeng Lin, Shuman Li, Yongquan Feng, Yuhua Tang:
Modality Re-Balance for Visual Question Answering: A Causal Framework. 5650-5654 - Kwanghee Choi, Jee-Weon Jung, Shinji Watanabe:
Understanding Probe Behaviors Through Variational Bounds of Mutual Information. 5655-5659 - Pedram Bakhtiarifard, Christian Igel, Raghavendra Selvan:
EC-NAS: Energy Consumption Aware Tabular Benchmarks for Neural Architecture Search. 5660-5664 - Louis Leconte, Van Minh Nguyen, Eric Moulines:
FAVANO: Federated Averaging with Asynchronous Nodes. 5665-5669 - Shayan Mohajer Hamidi, Linfeng Ye:
Robustness Against Adversarial Attacks Via Learning Confined Adversarial Polytopes. 5670-5674 - Jerome R. Bellegarda:
Multilingual Transliteration for Pan-Indic Keyboard Input. 5675-5679 - Qipeng Qian, Tanwi Mallick:
Wavelet-Inspired Multiscale Graph Convolutional Recurrent Network for Traffic Forecasting. 5680-5684 - Abhisek Chakraborty:
Probabilistic Spike Train Inference. 5685-5689 - Xiqiao Fang, Qingfeng Wu, Lu Cao:
SPCL-MER: Supervised Prototypical Contrastive Learning for Micro-Expression Recognition. 5690-5694 - Hanbo Cheng, Jun Du, Pengfei Hu, Jiefeng Ma, Zhenrong Zhang, Mobai Xue:
Viewing Writing as Video: Optical Flow based Multi-Modal Handwritten Mathematical Expression Recognition. 5695-5699 - Jingyi Wang, Da Huang, Xinghao Wu, Yuhua Tang, Long Lan:
Continuous Review and Timely Correction: Enhancing the Resistance to Noisy Labels via Self-Not-True Distillation. 5700-5704 - Zhibo Lou, Shinta Otake, Zhengxiao Li, Rei Kawakami, Nakamasa Inoue:
Cubic Knowledge Distillation for Speech Emotion Recognition. 5705-5709 - Ben Chen, Xuechao Zou, Yu Zhang, Jiayu Li, Kai Li, Junliang Xing, Pin Tao:
LEFormer: A Hybrid CNN-Transformer Architecture for Accurate Lake Extraction from Remote Sensing Imagery. 5710-5714 - Yuxing Zhi, Junhuai Li, Huaijun Wang, Jing Chen, Ting Cao:
A Fine-Grained Tri-Modal Interaction Model for Multimodal Sentiment Analysis. 5715-5719 - Edward Kim, Maryam Daniali, Jocelyn Rego, Garrett T. Kenyon:
The Selectivity and Competition of the Mind's Eye in Visual Perception. 5720-5724 - Fei Gao, Luofeng Zhang, Yuanming Zhang:
FDNet: A Novel Multivariate Time Series Classification Model Through Fusing Feature and Difference. 5725-5729 - Anirban Chakraborty, Abhisek Chakraborty:
Scalable Model-Based Gaussian Process Clustering. 5730-5734 - Wenbo Liu, Yifan He, Jihong Guan, Shuigeng Zhou:
Multivariate Time Series Forecasting with Causal-Temporal Attention Network. 5735-5739 - Jie Mei, Jenq-Neng Hwang:
ESA: Expert-and-Samples-Aware Incremental Learning Under Longtail Distribution. 5740-5744 - Hongyan Xu, Xiaohuan Pei, Xiu Su, Shan You, Chang Xu:
TCNAS: Transformer Architecture Evolving in Code Clone Detection. 5745-5749 - Hung Chun Hsu, Ting-Le Lin, Bo-Jun Wu, Ming-Yi Hong, Che Lin, Chih-Yu Wang:
FincGAN: A GAN Framework of Imbalanced Node Classification on Heterogeneous Graph Neural Network. 5750-5754 - Kyu Ri Park, Youngmin Oh, Jung Uk Kim:
Enhancing Audio-Visual Question Answering with Missing Modality via Trans-Modal Associative Learning. 5755-5759 - Yinjie Zhang, Ming Shao, Wenlong Shi, Haifeng Xia, Siyu Xia:
Autonomous Generative Feature Replay for Non-Exemplar Class-Incremental Learning. 5760-5764 - Chen Yang, Tongtong Liu, Didi Jiao, Wenhui Li:
MVITP: Multi-View Image-Text Perception for Few-Shot Remote Sensing Image Classification. 5765-5769 - Qi Wang, Wenxin Yu, Lu Che, Chang Liu, Zhiqiang Zhang, Jun Gong, Peng Chen:
Similarity Knowledge Distillation with Calibrated Mask. 5770-5774 - Jie Yan, Quan Liu, Lihua Zhang:
Offline Reinforcement Learning Based on Next State Supervision. 5775-5779 - Chao Wei, Zhidong Deng:
A Novel Contrastive Diffusion Graph Convolutional Network for Few-Shot Skeleton-Based Action Recognition. 5780-5784 - Hyunjin Kim, Jungwoo Shin, Wansoo Kim, Alberto A. Del Barrio:
1-D Spatial Attention in Binarized Convolutional Neural Networks. 5785-5789 - Pei-Sze Tan, Sailaja Rajanala, Arghya Pal, Shu-Min Leong, Raphaël C.-W. Phan, Huey Fang Ong:
Causally Uncovering Bias in Video Micro-Expression Recognition. 5790-5794 - Kyoungoh Lee, Kwang-Ju Kim, Pyong-Kun Kim, In-Su Jang:
TRET: Two Stream-Based Regionally Enhanced Transformers for Person Re-Identification. 5795-5799 - Masahiro Kohjima:
On The Equivalence Of Dynamic Mode Decomposition And Complex Nonnegative Matrix Factorization. 5800-5804 - Vasileios Tsouvalas, Aaqib Saeed, Tanir Ozcelebi, Nirvana Meratnia:
Communication-Efficient Federated Learning Through Adaptive Weight Clustering And Server-Side Distillation. 5805-5809 - Zhaokai Zhang, Tianpeng Feng, Yang Liu, Chunnan Sheng, Fanyi Wang, He Cai:
DBS: Differentiable Budget-Aware Searching For Channel Pruning. 5810-5814 - Zhi Chen, Cuifeng Du, Xiujie Huang, Zelong Lin, Yuyu Zhou, Quanlong Guan, Zhefu Li, Shuanghuan Lv, Xiaofeng Wu, Xiaotian Zhuang:
Deformation And Penetration Hybrid Detection-Net For Parcels Inspection In Industrial Supply Chain. 5815-5819 - Jianwen Yang, Xiao Zhang, Jun Xu:
Smooth Start: A Unified Approach for Gradual Transition from Cold to Old in Recommender Systems. 5820-5824 - Mengtian Zhang, Bo Jiang, Yuye Ling, Xinbing Wang:
Learning With Non-Uniform Label Noise: A Cluster-Dependent Weakly Supervised Approach. 5825-5829 - Zixuan Sun, Huihui Song, Kaihua Zhang, Gang Dong, Lingyan Liang, Yaqian Zhao:
Segment Anything Model Guided Semantic Knowledge Learning For Remote Sensing Change Detection. 5830-5834 - Yunfeng Xu, Shaohui Zhao, Hexun Fan, Jialin Wang:
GLMAE: Graph Representation Learning Method Combining Generative Learning and Masking Autoencoder. 5835-5839 - Zuo Zuo, Zongze Wu, Badong Chen, Xiaopin Zhong:
A Reconstruction-Based Feature Adaptation for Anomaly Detection with Self-Supervised Multi-Scale Aggregation. 5840-5844 - Fusataka Kuniyoshi, Inazumi Masanobu, Toshiyuki Koga:
Forecasting Torsional Resonance in Electric Vehicles by Learning a Quantile Regressor. 5845-5849 - Tal Vol, Loai Danial, Nir Shlezinger:
Power-Aware Task-Based Learning of Neuromorphic ADCs. 5850-5854 - Yuki Akiyama, Konstantinos Slavakis:
Proximal Bellman Mappings for Reinforcement Learning and Their Application to Robust Adaptive Filtering. 5855-5859 - Long Chen, Qianqian Ren, Zilong Li, Hui Xu:
Adaptive Multi-View Joint Contrastive Learning on Graphs. 5860-5864 - Zhongquan Jian, Jiajian Li, Junfeng Yao, Meihong Wang, Qingqiang Wu:
Conversation Clique-Based Model for Emotion Recognition In Conversation. 5865-5869 - Haoxin Xu, Bihao Hu, Xiaoqing Gu, Longwei Zheng:
A Learning Resource Recommendation Algorithm Based on Online Learning Behavior. 5870-5874 - Gehang Zhang, Jiawei Sheng, Shicheng Wang, Tingwen Liu:
Noise-Disentangled Graph Contrastive Learning via Low-Rank and Sparse Subspace Decomposition. 5880-5884 - Zhenrong Liu, Yang Li, Yi Gong, Yik-Chung Wu:
Learning a Low-Rank Feature Representation: Achieving Better Trade-Off Between Stability and Plasticity in Continual Learning. 5885-5889 - Chao Li, Shaokang Dong, Shangdong Yang, Hongye Cao, Wenbin Li, Yang Gao:
Multi-Agent Sparse Interaction Modeling is an Anomaly Detection Problem. 5890-5894 - Wenbo Qiao, Peng Zhang, Jiaming Zhao, Chang Yang:
Quantum Topic Model: Topic Modeling Using Variational Quantum Circuits. 5895-5899 - Hanpeng Jiang, Zhennan Chen, Wei Ding, Fan Lin:
Asformer: Learning From Adjacent Scale. 5900-5904 - Zeyu Liu, Heyan Chai, Qing Liao:
Learning from Easy to Hard: Multi-Task Learning with Data Scheduling. 5905-5909 - Renyang Liu, Wei Zhou, Sixing Wu, Jun Zhao, Kwok-Yan Lam:
SSTA: Salient Spatially Transformed Attack. 5910-5914 - Yan Yang, Dongdong Ren, Chenglei Peng, Jing Huo, Wenbin Li, Yang Gao:
Dynamic Replay Training for Class-Incremental Learning. 5915-5919 - Constantin Patsch, Jinghan Zhang, Yuankai Wu, Marsil Zakour, Driton Salihu, Eckehard G. Steinbach:
Long-Term Action Anticipation Based on Contextual Alignment. 5920-5924 - Sunqi Lin, Chong Wang, Yujie Zheng, Chenchen Tao, Xinmiao Dai, Yuqi Li:
Distill Vision Transformers to CNNs via Teacher Collaboration. 5925-5929 - Hao Ren, Mingwei Wang, Xinyu Lei, Mengli Zhang, Wenpeng Li, Chen Liu:
Refinement Bird's Eye View Feature for 3D Lane Detection with Dual-Branch View Transformation Module. 5930-5934 - Kangkang Ai, Haigen Hu, Qianwei Zhou, Qiu Guan:
SGT: Self-Guided Transformer for Few-Shot Semantic Segmentation. 5935-5939 - Qiongxiu Li, Wenrui Yu, Changlong Ji, Richard Heusdens:
Topology-Dependent Privacy Bound for Decentralized Federated Learning. 5940-5944 - Zihao Zhao, Yang Liu, Wenbo Ding, Xiao-Ping Zhang:
Federated PAC-Bayesian Learning on Non-IID Data. 5945-5949 - Zelong Sun, Guoxing Yang, Zhiwu Lu, Hao Jiang, Guojie Zhu, Zhao Cao:
Image Retrieval with Composed Query by Multi-Scale Multi-Modal Fusion. 5950-5954 - Eunseop Shin, Incheon Cho, Muhammad Awais, A. F. M. Shahab Uddin, Younho Jang, Sung-Ho Bae:
G-SHARP: Globally Shared Kernel with Pruning for Efficient CNNs. 5955-5959 - Peiyuan Liu, Beiliang Wu, Naiqi Li, Tao Dai, Fengmao Lei, Jigang Bao, Yong Jiang, Shu-Tao Xia:
WFTNet: Exploiting Global and Local Periodicity in Long-Term Time Series Forecasting. 5960-5964 - Shang-Fu Chen, Cheng-Xun Wen, Wen-Huang Cheng, Kai-Lung Hua:
Representation and Boundary Enhancement for Action Segmentation Using Transformer. 5965-5969 - Xiaoyong Ni, Guy Revach, Nir Shlezinger:
Adaptive KalmanNet: Data-Driven Kalman Filter with Fast Adaptation. 5970-5974 - Sergio Rozada, Antonio G. Marques:
Tensor Low-Rank Approximation of Finite-Horizon Value Functions. 5975-5979 - Fabian Perez, Jhon Lopez, Henry Arguello:
Privacy-Preserving Deep Learning Using Deformable Operators for Secure Task Learning. 5980-5984 - Di Yang, Yihao Huang, Qing Guo, Felix Juefei-Xu, Ming Hu, Yang Liu, Geguang Pu:
Architecture-Agnostic Iterative Black-Box Certified Defense Against Adversarial Patches. 5985-5989 - Aniket Singh, Anoop M. Namboodiri:
Image Attribution by Generating Images. 5990-5994 - Mostafa Bella, Shahram Hosseini, Hicham Saylani, Thierry Contini, Tristan Grégoire, Yannick Deville:
Fourier Domain Approach for Galaxy Spectra Decontamination and Deconvolution. 5995-5999 - Brian Zhang, Yuguang Yao, Sijia Liu:
Elevating Visual Prompting in Transfer Learning Via Pruned Model Ensembles: No Retrain, No Pain. 6000-6004 - Xiaochen Zheng, Xingyu Chen, Manuel Schürch, Amina Mollaysa, Ahmed Allam, Michael Krauthammer:
Simple Contrastive Representation Learning for Time Series Forecasting. 6005-6009 - William F. Jenkins, Peter Gerstoft:
Bayesian Optimization with Gaussian Processes for Robust Localization. 6010-6014 - Shen Wang, Xiaofeng Cheng, Ming Xie, Yuhang Ling, Chao Liu, Mingmin Chi, Pei Wang, Zhongyi Sun, Yabiao Wang:
Search for Gravitational Wave Probes - A Self-Supervised Learning for Pulsars Based on Signal Contexts. 6015-6019 - Boris Joukovsky, Brent De Weerdt, Nikos Deligiannis:
Learned Layered Coding for Successive Refinement in the Wyner-Ziv Problem. 6020-6024 - Junwei Su, Shan Wu, Jinhui Li:
MTRGL: Effective Temporal Correlation Discerning Through Multi-Modal Temporal Relational Graph Learning. 6025-6029 - Pingyue Zhang, Mengyue Wu, Kai Yu:
Semantic-Enhanced Supervised Contrastive Learning. 6030-6034 - Dapeng Li, Na Lou, Bin Zhang, Zhiwei Xu, Guoliang Fan:
Adaptive Parameter Sharing for Multi-Agent Reinforcement Learning. 6035-6039 - Yiping Wei, Kunyu Peng, Alina Roitberg, Jiaming Zhang, Junwei Zheng, Ruiping Liu, Yufan Chen, Kailun Yang, Rainer Stiefelhagen:
Elevating Skeleton-Based Action Recognition with Efficient Multi-Modality Self-Supervision. 6040-6044 - Zepu Yi, Songfeng Lu, Xueming Tang, Junjun Wu, Jianxin Zhu:
MACCN: Multi-Modal Adaptive Co-Attention Fusion Contrastive Learning Networks for Fake News Detection. 6045-6049 - Duolin Sun, Yimou Wang, Joey Zhaoyu Zuo, Huan Zheng:
Haformer: Heterogeneous Aggregation Transformer for Single Image Deraining. 6050-6054 - Kai-Wen Chen, Chen-Kuo Chiang:
Prototype-Guided Masking for Unsupervised Domain Adaptation. 6055-6059 - Zheng-An Zhu, Chen-Kuo Chiang:
Generative Extension Positive Pairs and Improving Sample Selection Based on Contrastive Learning for Unsupervised Person Re-Identification. 6060-6064 - Yao Liu, Yongfei Zhang, Xin Wang, Shan Yang:
Heuristic-Driven, Type-Specific Embedding in Parallel Spaces for Enhancing Knowledge Graph Reasoning. 6065-6069 - Dongze Wu, Jun Gao, Feng Yin:
Bayesian-Boosted MetaLoc: Efficient Training and Guaranteed Generalization for Indoor Localization. 6070-6074 - Roula Nassif, Soummya Kar, Stefan Vlaski:
Learning Dynamics of Low-Precision Clipped SGD with Momentum. 6075-6079 - Myung Cho, Meghana Chikkam, Weiyu Xu, Lifeng Lai:
Tree Network Design for Faster Distributed Machine Learning Process with Distributed Dual Coordinate Ascent. 6080-6084 - Hanyuan Zhang, Xinyu Zhang, Qize Jiang, Liang Li, Baihua Zheng, Weiwei Sun:
Trajectory Set Empowered Hypergraph Transformer for Mobile Sensor Based Traffic Prediction. 6085-6089 - Ioannis Kordonis, Emmanouil Theodosis, George Retsinas, Petros Maragos:
Matrix Factorization in Tropical and Mixed Tropical-Linear Algebras. 6090-6094 - Zhongyu Jiang, Haorui Ji, Cheng-Yen Yang, Jenq-Neng Hwang:
2D Human Pose Estimation Calibration and Keypoint Visibility Classification. 6095-6099 - Renkun Ni, Yonghui Xiao, Phoenix Meadowlark, Oleg Rybakov, Tom Goldstein, Ananda Theertha Suresh, Ignacio López-Moreno, Mingqing Chen, Rajiv Mathews:
FedAQT: Accurate Quantized Training with Federated Learning. 6100-6104 - Smit Marvaniya, Jitendra Singh, Nicolas Galichet, Fred Ochieng Otieno, Geeth de Mel, Kommy Weldemariam:
Encoding Seasonal Climate Predictions with Modular Neural Network. 6105-6109 - Utkarsh Oggy Sarawgi, John Berkowitz, Vineet Garg, Arnav Kundu, Minsik Cho, Sai Srujana Buddi, Saurabh Adya, Ahmed H. Tewfik:
Streaming Anchor Loss: Augmenting Supervision with Temporal Significance. 6110-6114 - Qibo Chen, Weizhong Jin, Shuchang Li, Mengdi Liu, Li Yu, Jian Jiang, Xiaozheng Wang:
Exploration of Visual Prompt in Grounded Pre-Trained Open-Set Detection. 6115-6119 - Parvin Malekzadeh, Konstantinos N. Plataniotis, Zissis Poulos, Zeyu Wang:
A Robust Quantile Huber Loss with Interpretable Parameter Adjustment in Distributional Reinforcement Learning. 6120-6124 - Sungjin Park, Edward Choi:
Multimodal Transformer with a Low-Computational-Cost Guarantee. 6125-6129 - Xinlei Gao, Jing Liu:
OADAS: Optimizing Global Perturbation Attacks with Dual-Path Attribution Synergy. 6130-6134 - Xiaohui Zhou, Yijie Wang, Hongzuo Xu, Mingyu Liu:
Boundary-Driven Active Learning for Anomaly Detection in Time Series Data Streams. 6135-6139 - Yongqi Liu, Jiashuang Zhou, Xiaoqin Du:
Human Motion Generation via Conditioned GMVAE with TUNet. 6140-6144 - Yan Wang, Zhixuan Chu, Tao Zhou, Caigao Jiang, Hongyan Hao, Minjie Zhu, Xindong Cai, Qing Cui, Longfei Li, James Y. Zhang, Siqiao Xue, Jun Zhou:
Enhancing Event Sequence Modeling with Contrastive Relational Inference. 6145-6149 - Jinxu Zhao, Guanting Dong, Yueyan Qiu, Tingfeng Hui, Xiaoshuai Song, Daichi Guo, Weiran Xu:
Noise-BERT: A Unified Perturbation-Robust Framework with Noise Alignment Pre-Training for Noisy Slot Filling Task. 6150-6154 - Yixian Luo, Shaowu Yang, Tianrui Liu, Huibin Tan, Ruochun Jin, Hengzhu Liu, Xueqiong Li:
Radar Recognition in the Wild: Enhancing Radar Emitter Recognition through Auto-Correlation Model-Agnostic Meta Learning. 6155-6159 - Meng Gao, Wei Chen, Tengjiao Wang, Dawei Lu, Jiabin Zheng:
Adaptive Image-Enhanced Knowledge Graph Completion. 6160-6164 - Lorenzo Luzi, Daniel LeJeune, Ali Siahkoohi, Sina Alemohammad, Vishwanath Saragadam, Hossein Babaei, Naiming Liu, Zichao Wang, Richard G. Baraniuk:
Titan: Bringing the Deep Image Prior to Implicit Representations. 6165-6169 - Yajun Jian, Chihui Zhuang, Wenyan He, Kaiwen Du, Yang Lu, Hanzi Wang:
Spatio-Temporal Correlation Learning for Multiple Object Tracking. 6170-6174 - Wangyu Wu, Tianhong Dai, Xiaowei Huang, Fei Ma, Jimin Xiao:
Image Augmentation with Controlled Diffusion for Weakly-Supervised Semantic Segmentation. 6175-6179 - Jitendra K. Tugnait:
Delay Embedding for Matrix Graphical Model Learning from Dependent Data. 6180-6184 - Tong Chen, Guanchao Feng, Petar M. Djuric:
Improving Open-Set Recognition with Bayesian Metric Learning. 6185-6189 - Keke Tang, Wenyu Zhao, Weilong Peng, Xiang Fang, Xiaodong Cui, Peican Zhu, Zhihong Tian:
Reparameterization Head for Efficient Multi-Input Networks. 6190-6194 - Yuxuan Zhang, Yiren Song, Jinpeng Yu, Han Pan, Zhongliang Jing:
Fast Personalized Text to Image Synthesis with Attention Injection. 6195-6199 - Dennis Fedorishin, Lie Lu, Srirangaraj Setlur, Venu Govindaraju:
Audio Match Cutting: Finding and Creating Matching Audio Transitions in Movies and Videos. 6200-6204 - Xiaolong Xiong, Jinhan Cui, Rui Xie, Shuzhan Guo, Jun Zhou:
Large-Scale Multi-View Multiple Clustering. 6205-6209 - Pranoy Panda, Siddharth Tandon, Vineeth N. Balasubramanian:
FW-Shapley: Real-Time Estimation of Weighted Shapley Values. 6210-6214 - Elaheh Motamedi, Kian Behzad, Rojin Zandi, Hojjat Salehinejad, Milad Siami:
Robustness Evaluation of Machine Learning Models for Robot Arm Action Recognition in Noisy Environments. 6215-6219 - Sining Jiang, Yujun Lan, Weigang Wang, Zhongwen Guo:
Pyramid: A Heterogeneous Data Integration Algorithm Based on Hierarchical Graph. 6220-6224 - Xiu Su, Shan You, Hongyan Xu, Xiuxing Li, Jun Long, Yi Chen, Chang Xu:
Beyond the Limit of Weight-Sharing: Pioneering Space-Evolving NAS with Large Language Models. 6225-6229 - Dohoon Kim, Minwoo Shin, Jaeseok Ryu, Heunseung Lim, Joonki Paik:
PU-EdgeFormer++: An Advanced Hierarchical Edge Transformer for Arbitrary-Scale Point Cloud Upsampling Using Distance Fields. 6230-6234 - Masaki Kashiwagi, Keisuke Maeda, Ren Togo, Takahiro Ogawa, Miki Haseyama:
Enhancing Noisy Label Learning Via Unsupervised Contrastive Loss with Label Correction Based on Prior Knowledge. 6235-6239 - Utsav Tiwari, Srinivas Soumitri Miriyala, Vikram Nelvoy Rajendiran:
Edge Deployable Distributed Evolutionary Optimization Based Calibration Method for Neural Quantization. 6240-6244 - Jiayu Zhai, Lequan Lin, Dai Shi, Junbin Gao:
Bregman Graph Neural Network. 6250-6254 - Ruiguo Yu, Yue Chen, Mankun Zhao, Jian Yu, Tianyi Xu, Mei Yu, Xuewei Li:
Debiasing Recommenders Through Personalized Popularity-Aware Margins. 6255-6259 - Srinivas Soumitri Miriyala, P. K. Suhas, Utsav Tiwari, Vikram Nelvoy Rajendiran:
Mixed Precision Neural Quantization with Multi-Objective Bayesian Optimization for on-Device Deployment. 6260-6264 - Sangwook Park, Angeles Salles, Kathryne Allen, Cynthia F. Moss, Mounya Elhilali:
Biomimetic Mappings for Active Sonar Object Recognition in Clutter. 6265-6269 - Sonal Kumar, Anirudh Phukan, Arijit Sur:
IPCL: Iterative Pseudo-Supervised Contrastive Learning to Improve Self-Supervised Feature Representation. 6270-6274 - Zhiwei Zuo, Zhuo Tang, Bin Wang, Kenli Li, Anwitaman Datta:
ECIL-MU: Embedding Based Class Incremental Learning and Machine Unlearning. 6275-6279 - Rongwei Yu, Peihao Zhang, Jingyi Xiang:
DG-RainDiff: Depth-Guided Dynamic Message Passing Diffusion Model for Mixture of Rain Removal. 6280-6284 - Yizhan Li, Rongwei Yu, Junjie Shi, Lina Wang:
Diff-HOD: Diffusion Model for Object Detection in Hazy Weather Conditions. 6285-6289 - Yuxuan Liu, Haozhao Wang, Shuang Wang, Zhiming He, Wenchao Xu, Jialiang Zhu, Fan Yang:
Disentangle Estimation of Causal Effects from Cross-Silo Data. 6290-6294 - Xiang Ao, Xiaohui Li, Xu-Yao Zhang, Chenglin Liu:
Prototype Calibration with Synthesized Samples for Zero-Shot Chinese Character Recognition. 6295-6299 - Shaoxu Cheng, Chiyuan He, Kailong Chen, Linfeng Xu, Hongliang Li, Fanman Meng, Qingbo Wu:
Vision-Sensor Attention Based Continual Multimodal Egocentric Activity Recognition. 6300-6304 - Mingyu Xu, Zheng Lian, Bin Liu, Zerui Chen, Jianhua Tao:
Pseudo Labels Regularization for Imbalanced Partial-Label Learning. 6305-6309 - Wenyan He, Yajun Jian, Yang Lu, Hanzi Wang:
Visual-Linguistic Representation Learning with Deep Cross-Modality Fusion for Referring Multi-Object Tracking. 6310-6314 - Linyan Yang, Jingwei Cheng, Chuanhao Xu, Xihao Wang, Jiayi Li, Fu Zhang:
Attr-Int: A Simple and Effective Entity Alignment Framework for Heterogeneous Knowledge Graphs. 6315-6319 - Shuai Chen, Mingyi Zhang, Junge Zhang, Kaiqi Huang:
Task-Wise Prompt Query Function for Rehearsal-Free Continual Learning. 6320-6324 - Ya-Wei Eileen Lin, Yuval Kluger, Ronen Talmon:
Hyperbolic Diffusion Procrustes Analysis for Intrinsic Representation of Hierarchical Data Sets. 6325-6329 - Yiqi Wu, Kelin Song, Xuan Huang, Dejun Zhang:
Mitigating Intra-Class Variance in Few-Shot Point Cloud Classification. 6330-6334 - Guanghui Hu, Yang Liu, Qing He, Xiang Ao:
F2GNN: An Adaptive Filter with Feature Segmentation for Graph-Based Fraud Detection. 6335-6339 - Jie Zhang, Yuan Sun, Yu Guo, Zheng Wang, Feiping Nie, Fei Wang:
Multi-View Subspace Clustering With Consensus Graph Contrastive Learning. 6340-6344 - Bingkang Shi, Xiaodan Zhang, Dehan Kong, Yulei Wu, Zongzhen Liu, Honglei Lyu, Longtao Huang:
General Phrase Debiaser: Debiasing Masked Language Models at a Multi-Token Level. 6345-6349 - Yuxin Shi, Han Yu:
Fairness-Aware Job Scheduling for Multi-Job Federated Learning. 6350-6354 - Sandipan Sarma, Pradnesh Kalkar, Arijit Sur:
Boosting Zero-Shot Human-Object Interaction Detection with Vision-Language Transfer. 6355-6359 - Hejung Yang, Hong-Goo Kang:
On Fine-Tuning Pre-Trained Speech Models With EMA-Target Self-Supervised Loss. 6360-6364 - Daniele Ugo Leonzio, Paolo Bestagini, Marco Marcon, Gian Paolo Quarta, Stefano Tubaro:
Water Leak Detection via Domain Adaptation. 6365-6369 - Shijie Chen, Rongquan Wang, Xin Li, Yuchen Wu, Haizhuang Liu, Jiansheng Chen, Huimin Ma:
PLS: Unsupervised Domain Adaptation for 3D Object Detection Via Pseudo-Label Sizes. 6370-6374 - Davinder Pal Singh, Lala Shakti Swarup Ray, Bo Zhou, Sungho Suh, Paul Lukowicz:
A Novel Local-Global Feature Fusion Framework for Body-Weight Exercise Recognition with Pressure Mapping Sensors. 6375-6379 - Boyuan Zhu, Fagui Liu, Xi Chen, Quan Tang:
EK-Net: Real-Time Scene Text Detection with Expand Kernel Distance. 6380-6384 - Jeongkyun Park, Jung-Wook Hwang, Kwanghee Choi, Seung-Hyeon Lee, Jun Hwan Ahn, Rae-Hong Park, Hyung-Min Park:
OLKAVS: An Open Large-Scale Korean Audio-Visual Speech Dataset. 6385-6389 - Yiming Zhao, Haoyu Lu, Shiqi Zhao, Haoran Wu, Zhiwu Lu:
Multi-Level Contrastive Learning For Hybrid Cross-Modal Retrieval. 6390-6394 - Naoufal El Bekri, Lucas Drumetz, Franck Vermet:
Time Changed Normalizing Flows for Accurate SDE Modeling. 6395-6399 - Zifan Jia, Qingsong Liu, Xiaoyan Gu, Haihui Fan, Feifei Dai, Bo Li, Weiping Wang:
Online Caching With Switching Cost and Operational Long-Term Constraints: An Online Learning Approach. 6400-6404 - Ying Huang, Qingfeng Du, Yongqi Han, Cheng He, Fulong Tian:
Semi-Supervised Metrics-Based Self-Training Root Cause Analysis for Cloud-Native Systems with Class-Imbalanced Data. 6405-6409 - Yue Liu, Shanlin Xiao, Bo Li, Zhiyi Yu:
SparseSpikformer: A Co-Design Framework for Token and Weight Pruning in Spiking Transformer. 6410-6414 - Jiani Liu, Ce Zhu, Yang Chen, Xiaolin Huang, Yipeng Liu:
Phase Retrieval by Tensor Total Least Squares. 6415-6419 - Aakansha Mishra, Srinivas Soumitri Miriyala, Vikram Nelvoy Rajendiran:
Learning Representations from Explainable and Connectionist Approaches for Visual Question Answering. 6420-6424 - Mete Ozay:
Joint Embedding Learning and Latent Subspace Probing for Cross-Domain Few-Shot Keyword Spotting. 6425-6429 - Shuxian Huang, Ye Wang, Kai Chen, Yan Jia:
Temporal Relational Context Learning for Extrapolation Reasoning on Temporal Knowledge Graphs. 6430-6434 - Wei Wan, Yuxuan Ning, Shengshan Hu, Lulu Xue, Minghui Li, Leo Yu Zhang, Hai Jin:
MISA: Unveiling the Vulnerabilities in Split Federated Learning. 6435-6439 - Hao Zheng, Peng Liang, Yu Tang, Yanqi Shi, Linbo Qiao, Dongsheng Li:
3D Parallelism for Transformers via Integer Programming. 6440-6444 - Kelvin Ting Zuo Han, Shengxuming Zhang, Gerard Marcos Freixas, Zunlei Feng, Cheng Jin:
Target Optimization Direction Guided Transfer Learning for Image Classification. 6445-6449 - Bhartendu Kumar, Kunal N. Chaudhury:
Lipschitz-Constrained Convolutional Layers Using Convex Projection. 6450-6454 - Qirong Liang, Da Pan, Zefeng Ying, Ping Shi:
DefocusSR: An Efficient Framework for Defocus Image Super-Resolution Guided by Depth Information. 6455-6459 - Zhen Xu, Ziqiang Chen, Yaqiang Wu, Hui Li, Wanjun Lv, Lianwen Jin, Qianying Wang:
A Multi-Scale Bimodal Fusion Network for Robust and Accurate Online Handwriting Recognition. 6460-6464 - Kejia Wan, Yuntao Liu, Hengzhu Liu, Xinhai Xu:
Unraveling Explainable Reinforcement Learning Using Behavior Tree Structures. 6465-6469 - Haoyu Liu, Yuanhai Xue, Xiaoming Yu:
Disentangled Graph Representation with Contrastive Learning for Rumor Detection. 6470-6474 - Liuzhenghao Lv, Wei Fang, Li Yuan, Yonghong Tian:
Optimal ANN-SNN Conversion with Group Neurons. 6475-6479 - Dan Lin, Philip Hann Yung Lee, Yiming Li, Ruoyu Wang, Kim-Hui Yap, Bingbing Li, You Shing Ngim:
Multi-Modality Action Recognition Based on Dual Feature Shift in Vehicle Cabin Monitoring. 6480-6484 - Spilios Evmorfos, Athina P. Petropulu:
A Meta-Preconditioning Approach for Deep Q-Learning. 6485-6489 - Maurice Kuschel, Tanuj Hasija, Timothy Marrinan:
Rademacher Complexity Regularization for Correlation-Based Multiview Representation Learning. 6490-6494 - Zheng Nan, Ting Dang, Vidhyasaharan Sethu, Beena Ahmed:
Variational Connectionist Temporal Classification for Order-Preserving Sequence Modeling. 6495-6499 - Haozhe Yang, Yuhan Xiang, Ke Sun, Jianlong Hu, Xianming Lin:
Towards Video-Text Retrieval Adversarial Attack. 6500-6504 - Feng Ding, Xiu Liu, Xinyi Wang, Fangming Zhong:
Dual-Mix for Cross-Modal Retrieval with Noisy Labels. 6505-6509 - Arian Bakhtiarnia, Qi Zhang, Alexandros Iosifidis:
Accurate Gigapixel Crowd Counting by Iterative Zooming and Refinement. 6510-6514 - Jiawei Zhang, Yufan Chen, Cheng Jin, Lei Zhu, Yuantao Gu:
EPA: Neural Collapse Inspired Robust Out-of-distribution Detector. 6515-6519 - Shiqin Tang, Yining Dong, S. Joe Qin:
A PLS-Integrated Lasso Method With Application in Index Tracking. 6520-6524 - Jiyi Li:
A Comparative Study on Annotation Quality of Crowdsourcing and LLM Via Label Aggregation. 6525-6529 - Fei He, Yipeng Liu, Da Shen, Yangyang Jiang, Ying Li, Ce Zhu:
Multi-Band Speech Tensor Decomposition for Interactive Feature Extraction in Early Dysphagia Screening. 6530-6534 - Jinpei Guo, Shaofeng Zhang, Runzhong Wang, Chang Liu, Junchi Yan:
GMTR: Graph Matching Transformers. 6535-6539 - Xin Du, Shan Zhong, Wenhao Ying, Yi Wang, Shengrong Gong:
CDA-MBPO: Corrected Data Aggregation for Model-Based Policy Optimization. 6540-6544 - Zinuo You, Pengju Zhang, Jin Zheng, John Cartlidge:
Multi-Relational Graph Diffusion Neural Network with Parallel Retention for Stock Trends Classification. 6545-6549 - Mei Yu, Yujian Zhang, Xuewei Li, Ruixuan Zhang, Han Jiang, Jie Gao, Zhiqiang Liu:
Multi-Level Augmentation Consistency Learning and Sample Selection for Semi-Supervised Domain Generalization. 6550-6554 - Benjamin Rise, Murat Uney, Xiaowei Huang:
Two-Stage Transfer Learning for Fusion and Classification of Airborne Hyperspectral Imagery. 6555-6559 - Xiang Zhang, Qiang Zhu, Tao Hu, Qingsen Yan:
EiffHDR: An Efficient Network for Multi-Exposure High Dynamic Range Imaging. 6560-6564 - Yusen Zhang, Yusong Tan, Songlei Jian, Qingbo Wu, Kenli Li:
DGLP: Incorporating Orientation Information for Enhanced Link Prediction in Directed Graphs. 6565-6569 - Xiao Liu, Jun-Jie Huang, Wentao Zhao:
GCIA: A Black-Box Graph Injection Attack Method Via Graph Contrastive Learning. 6570-6574 - Zhufeng Shao, Shoujin Wang, Wenpeng Lu, Weiyu Zhang, Hongjiao Guan, Long Zhao:
Filter-Enhanced Hypergraph Transformer for Multi-Behavior Sequential Recommendation. 6575-6579 - Zijun Long, George Killick, Richard McCreadie, Gerardo Aragon-Camarasa:
Multiway-Adapter: Adapting Multimodal Large Language Models for Scalable Image-Text Retrieval. 6580-6584 - Artur Shagidanov, Hayk Poghosyan, Xinyu Gong, Zhangyang Wang, Shant Navasardyan, Humphrey Shi:
Grounded-Instruct-Pix2Pix: Improving Instruction Based Image Editing with Automatic Target Grounding. 6585-6589 - Leonardo Rossi, Vittorio Bernuzzi, Tomaso Fontanini, Massimo Bertozzi, Andrea Prati:
Memory-Augmented Online Video Anomaly Detection. 6590-6594 - Yuanhang Qiu:
Stochastic Configuration Networks for Laboratory Seismic Time-to-Failure Prediction. 6595-6599 - Yizhou Chen, Wangjie Xu, Xincheng Wu, Meng Zhang, Bing Luo:
Personalized Local Differentially Private Federated Learning with Adaptive Client Sampling. 6600-6604 - Zaixin Ou, Yongsheng Pan, Yuanning Li, Fang Xie, Qihao Guo, Dinggang Shen:
Synthesizing Aβ-PET Via An Image And Label Conditioning Latent Diffusion Model For Detecting Amyloid Status. 6610-6614 - Minglang Qiao, Mai Xu, Shijie Wen, Lai Jiang, Shengxi Li, Tao Xu, Yunjin Chen, Leonid Sigal:
Saliency Prediction of Sports Videos: A Large-Scale Database and a Self-Adaptive Approach. 6615-6619 - Shangdong Liu, Xiaofan Yue, Fei Wu, Jing Sun, Yujian Feng, Yimu Ji:
Semantic Distillation and Structural Alignment Network for Fake News Detection. 6620-6624 - Chu-Chun Yu, Ming-Yi Hong, Chiok-Yew Ho, Che Lin:
Push4Rec: Temporal and Contextual Trend-Aware Transformer Push Notification Recommender. 6625-6629 - Zhengyu Chen, Teng Xiao, Donglin Wang, Min Zhang:
Pareto Graph Self-Supervised Learning. 6630-6634 - Nishanth Shetty, Manikanta Bandla, Nishit Neema, Siddarth Asokan, Chandra Sekhar Seelamantula:
Momentum-Imbued Langevin Dynamics (MILD) for Faster Sampling. 6635-6639 - Zhe Zhang, Taketo Akama:
HyperGANStrument: Instrument Sound Synthesis and Editing With Pitch-Invariant Hypernetworks. 6640-6644 - Qingming Li, Xiaohang Li, Li Zhou, Xiaoran Yan:
AdaFL: Adaptive Client Selection and Dynamic Contribution Evaluation for Efficient Federated Learning. 6645-6649 - Masahiro Nakano, Ryohei Shibue, Kunio Kashino:
Sunflower Strategy for Bayesian Relational Data Analysis. 6650-6654 - Xu Wang, Kele Xu, Ting Yu, Bo Ding, Dawei Feng:
Transformer-Inspired Lightweight Model for Efficient Time Series Forecasting. 6655-6659 - Hongyu Zhu, Sichu Liang, Wentao Hu, Fang-Qi Li, Yali Yuan, Shi-Lin Wang, Guang Cheng:
Improve Deep Forest with Learnable Layerwise Augmentation Policy Schedules. 6660-6664 - Weishang Wu, Xiaoheng Deng:
Motion Latent Diffusion for Stochastic Trajectory Prediction. 6665-6669 - Ziru Zeng, Yue Ding, Hongtao Lu:
Enhancing Cross-Domain Detection: Adaptive Class-Aware Contrastive Transformer. 6670-6674 - Jiakang Chen, Di You, Deniz Gündüz, Pier Luigi Dragotti:
CommIN: Semantic Image Communications as an Inverse Problem with INN-Guided Diffusion Models. 6675-6679 - Jingzhou Hu, Kejun Huang:
Complex Bounded Component Analysis: Identifiability and Algorithm. 6680-6684 - Patrick Cormac English, Erfan A. Shams, John D. Kelleher, Julie Carson-Berndsen:
Following the Embedding: Identifying Transition Phenomena in Wav2vec 2.0 Representations of Speech Audio. 6685-6689 - Tianjiao Wan, Yutao Dou, Kele Xu, Zijian Gao, Bo Ding, Dawei Feng, Huaimin Wang:
Temporal Inconsistency-Based Active Learning. 6690-6694 - Nikolaus Mutsam, Alexander Fuchs, Fabio Ziegler, Franz Pernkopf:
Data-Scarce Condition Modeling Requires Model-Based Prior Regularization. 6695-6699 - Xu Wang, Pengfei Gu, Yudong Zhang, Binwu Wang, Pengkun Wang, Yang Wang:
Gradient Reactivation Enhanced Causal Attention for Out-Of-Distribution Generalizable Graph Classification. 6700-6704 - Lincan Li, Kaixiang Yang, Jichao Bi, Fengji Luo:
STS-CCL: Spatial-Temporal Synchronous Contextual Contrastive Learning for Urban Traffic Forecasting. 6705-6709 - Arthur Michon, Charly Poulliat, Adam Mekhiche, Antonio Maria Cipriano:
Extrinsic Versus APP Information Feedback in Turbo VEP MU-MIMO Receivers: Optimization Via Deep Unfolding. 6710-6714 - Shuxin Liu, Jiliang Li, Wei Ke, Hao Yin:
Multi-Attention Enhanced Discriminator for GAN-Based Anomalous Sound Detection. 6715-6719 - Junzhe Chen, Qiao Yang, Senmao Tian, Shunli Zhang:
Adaptive Quantization with Mixed-Precision Based on Low-Cost Proxy. 6720-6724 - Yuecheng Li, Jialong Chen, Chuan Chen, Lei Yang, Zibin Zheng:
Contrastive Deep Nonnegative Matrix Factorization For Community Detection. 6725-6729 - Yuanming Li, Gwantae Kim, Jeong-gi Kwak, Bonhwa Ku, Hanseok Ko:
Towards Multi-Domain Face Landmark Detection with Synthetic Data from Diffusion Model. 6730-6734 - Jaeyeon Kim, Jaeyoon Jung, Jinjoo Lee, Sang Hoon Woo:
EnCLAP: Combining Neural Audio Codec and Audio-Text Joint Embedding for Automated Audio Captioning. 6735-6739 - Pengfei Li, Jinlong He, Gang Liu, Shenjun Zhong:
PECR: Parameter-Efficient Transfer Learning with Cross-Modal Representation Learning for Remote Sensing Visual Question Answering. 6740-6744 - Nan Zhang, Fan Xiao, Junlin Hou, Ruiwei Zhao, Xiaobo Zhang, Rui Feng:
Cross-Image Distillation for Semi-Supervised Semantic Segmentation. 6745-6749 - Yuanqing Song, Yuhao Liu, Petar M. Djuric:
Novel Architecture of Deep Feature-Based Gaussian Processes with an Ensemble of Kernels. 6750-6754 - Suhua Zhang, Fangming Zhong, Zhikui Chen:
Context-Aware and Contrastiveness-Driven Feature Learning for Cross-Domain Few-Shot Hyperspectral Image Classification. 6755-6759 - Zhendong Liu, Jie Zhang, Qiangqiang He, Chongjun Wang:
Understanding Data Augmentation From A Robustness Perspective. 6760-6764 - Xin Zhang, Jiaxian Guo, Paul Yoo, Yutaka Matsuo, Yusuke Iwasawa:
Paste and Harmonize via Denoising: Subject-Driven Image Editing with Frozen Pre-Trained Diffusion Model. 6765-6769 - Jeonghun Baek, Yusuke Matsui, Kiyoharu Aizawa:
Cross-Lingual Learning in Multilingual Scene Text Recognition. 6770-6774 - Kai Li, Yi Luo:
Subnetwork-To-Go: Elastic Neural Network with Dynamic Training and Customizable Inference. 6775-6779 - Zhendong Liu, Wenyu Jiang, Ming Guo, Chongjun Wang:
From Game Theory to Visual Recognition: Advancing DNN Robustness. 6780-6784 - Wenbo Wang, Ben Chen, Bingquan Liu, Xinxin Wang, Luwei Yang, Wen Jiang, Wei Ning, Jian Guan:
Mutual Information Assisted Graph Convolution Network for Cold-Start Recommendation. 6785-6789 - Junqi Xue, Ruihan Qin, Xinxu Zhou, Honghai Liu, Min Zhang, Zhiguo Zhang:
Fusing Multi-Level Features from Audio and Contextual Sentence Embedding from Text for Interview-Based Depression Detection. 6790-6794 - Renxiang Guan, Zihao Li, Xianju Li, Chang Tang:
Pixel-Superpixel Contrastive Learning and Pseudo-Label Correction for Hyperspectral Image Clustering. 6795-6799 - Junsu Kim, Sumin Hong, Chanwoo Kim, Jihyeon Kim, Yihalem Yimolal Tiruneh, Jeongwan On, Jihyun Song, Sunhwa Choi, Seungryul Baek:
Class-Wise Buffer Management for Incremental Object Detection: An Effective Buffer Training Strategy. 6800-6804 - Xianbo Xu, Bart van Erp, Tanya Ignatenko:
Context-Aware Preference Learning System Based on Dirichlet Process Gaussian Mixture Model. 6805-6809 - Puja Trivedi, Danai Koutra, Jayaraman J. Thiagarajan:
On Estimating Link Prediction Uncertainty Using Stochastic Centering. 6810-6814 - Yuqing Li, Haoming Huang, Jian Xu, Shao-Lun Huang:
NAC: Mitigating Noisy Correspondence in Cross-Modal Matching Via Neighbor Auxiliary Corrector. 6815-6819 - Yoonjin Chung, Junwon Lee, Juhan Nam:
T-Foley: A Controllable Waveform-Domain Diffusion Model for Temporal-Event-Guided Foley Sound Synthesis. 6820-6824 - Rakshith Subramanyam, T. S. Jayram, Rushil Anirudh, Jayaraman J. Thiagarajan:
Exploring the Utility of CLIP Priors for Visual Relationship Prediction. 6825-6829 - Bin Guo, John H. L. Hansen:
T-EnFP: An Efficient Transformer Encoder-Based System for Driving Behavior Classification. 6830-6834 - Minsu Kim, Walid Saad:
Analysis of the Memorization and Generalization Capabilities of AI Agents: are Continual Learners Robust? 6840-6844 - Guo-Zhao Liao, Xiao-Feng Gong, Qiu-Hua Lin:
Target Localization Based on Multistatic MIMO Radar via Double Coupled Canonical Polyadic Decomposition. 6845-6849 - Shuai Ju, Chenxu Wang:
Beyond Simple Text Style Transfer: Unveiling Compound Text Style Transfer with Prompt-Based Pre-Trained Language Models. 6850-6854 - Xin-Tong Liu, Xiao-Feng Gong, Dong Zhao, Qiu-Hua Lin:
An Adaptive Algorithm for Tracking Third-Order Coupled Canonical Polyadic Decomposition. 6855-6859 - Lucia Testa, Claudio Battiloro, Stefania Sardellitti, Sergio Barbarossa:
Stability of Graph Convolutional Neural Networks Through The Lens of Small Perturbation Analysis. 6865-6869 - Lu Zhang, Baolin Zheng:
FIBA: Federated Invisible Backdoor Attack. 6870-6874 - William M. Watkins, Heehwan Wang, Sangyoon Bae, Huan-Hsin Tseng, Jiook Cha, Samuel Yen-Chi Chen, Shinjae Yoo:
Quantum Privacy Aggregation of Teacher Ensembles (QPATE) for Privacy Preserving Quantum Machine Learning. 6875-6879 - Jeremy Speth, Korosh Vatanparvar, Li Zhu, Jilong Kuang, Alex Gao:
Freq2Time: Weakly Supervised Learning of Camera-Based RPPG from Heart Rate. 6880-6884 - Damian Owerko, Fernando Gama, Alejandro Ribeiro:
Unsupervised Optimal Power Flow Using Graph Neural Networks. 6885-6889 - Yuan Tseng, Layne Berry, Yiting Chen, I-Hsiang Chiu, Hsuan-Hao Lin, Max Liu, Puyuan Peng, Yi-Jen Shih, Hung-Yu Wang, Haibin Wu, Poyao Huang, Chun-Mao Lai, Shang-Wen Li, David Harwath, Yu Tsao, Abdelrahman Mohamed, Chi-Luen Feng, Hung-Yi Lee:
AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models. 6890-6894 - Xin Qi, Han Zhang, Feiping Nie:
Discriminative Semi-Supervised Feature Selection Via a Class-Credible Pseudo-Label Learning Framework. 6895-6899 - Khe Chai Sim, Zhouyuan Huo, Tsendsuren Munkhdalai, Nikhil Siddhartha, Adam Stooke, Zhong Meng, Bo Li, Tara N. Sainath:
A Comparison of Parameter-Efficient ASR Domain Adaptation Methods for Universal Speech and Language Models. 6900-6904 - Ming-Chang Chiu, Yingfei Wang, Yen-Ju Kuo, Pin-Yu Chen:
DDI-CoCo: A Dataset for Understanding the Effect of Color Contrast in Machine-Assisted Skin Disease Detection. 6905-6909 - Pravin Nair, Kunal N. Chaudhury:
Convergent Plug-And-Play Using Contractive Denoisers. 6910-6914 - Jianyi Zhang, Saeed Vahidian, Martin Kuo, Chunyuan Li, Ruiyi Zhang, Tong Yu, Guoyin Wang, Yiran Chen:
Towards Building The FederatedGPT: Federated Instruction Tuning. 6915-6919 - Dung Le, Huy Nguyen, Khai Nguyen, Trang Nguyen, Nhat Ho:
Fast Approximation of the Generalized Sliced-Wasserstein Distance. 6920-6924 - Xiaoteng Shen, Liangcai Su, Xi Xiao, Yi Li:
Multi-Interest Learning for Multi-Modal Paper Recommendation. 6925-6929 - Zihan Chen, Jundong Li, Cong Shen:
Personalized Federated Learning with Attention-Based Client Selection. 6930-6934 - Anastasios Arsenos, Dimitrios Kollias, Evangelos Petrongonas, Christos Skliros, Stefanos D. Kollias:
Uncertainty-Guided Contrastive Learning For Single Source Domain Generalisation. 6935-6939 - Esther Rodrigo Bonet, Nikos Deligiannis:
Physics-Guided Variational Graph Autoencoder For Air Quality Inference. 6940-6944 - Jinhyeok Kim, Inha Lee, Kyungdon Joo:
Fracture Assembly with Segmentation And Iterative Registration. 6945-6949 - Sabri Mustafa Kahya, Muhammet Sami Yavuz, Eckehard G. Steinbach:
HAROOD: Human Activity Classification and Out-Of-Distribution Detection with Short-Range FMCW Radar. 6950-6954 - Xinyu Feng, Qingni Shen, Cong Li, Yuejian Fang, Zhonghai Wu:
Privacy Preserving Federated Learning from Multi-Input Functional Proxy Re-Encryption. 6955-6959 - Jackson Michaels, Juncheng B. Li, Laura Yao, Lijun Yu, Zach Wood-Doughty, Florian Metze:
Audio-Journey: Open Domain Latent Diffusion Based Text-To-Audio Generation. 6960-6964 - Zhongchang Sun, Yousef El-Laham, Svitlana Vyetrenko:
Neural Stochastic Differential Equations with Change Points: A Generative Adversarial Approach. 6965-6969 - Qiang He, Xinwen Hou:
MEPE: A Minimalist Ensemble Policy Evaluation Operator for Deep Reinforcement Learning. 6970-6974 - Heshan Devaka Fernando, Lisha Chen, Songtao Lu, Pin-Yu Chen, Miao Liu, Subhajit Chaudhury, Keerthiram Murugesan, Gaowen Liu, Meng Wang, Tianyi Chen:
Variance Reduction Can Improve Trade-Off in Multi-Objective Learning. 6975-6979 - Emilian Postolache, Giorgio Mariani, Luca Cosmo, Emmanouil Benetos, Emanuele Rodolà:
Generalized Multi-Source Inference for Text Conditioned Music Diffusion Models. 6980-6984 - Domenico Parente, Nastaran Darabi, Alex C. Stutts, Theja Tulabandhula, Amit Ranjan Trivedi:
Conformalized Multimodal Uncertainty Regression and Reasoning. 6985-6989 - Tianle Zhang, Jiaxu Liu, Yanghao Zhang, Ronghui Mu, Wenjie Ruan:
DeepGRE: Global Robustness Evaluation of Deep Neural Networks. 6990-6994 - Muhammad Taimoor Haseeb, Ahmad Hammoudeh, Gus Xia:
GPT-4 Driven Cinematic Music Generation Through Text Processing. 6995-6999 - Alkis Koudounas, Eliana Pastor, Giuseppe Attanasio, Luca de Alfaro, Elena Baralis:
Prioritizing Data Acquisition for End-to-End Speech Model Improvement. 7000-7004 - Muhammad A. Shah, Bhiksha Raj:
Fixed Inter-Neuron Covariability Induces Adversarial Robustness. 7005-7009 - Antonio Almudévar, Théo Mariotte, Alfonso Ortega Giménez, Marie Tahon:
Unsupervised Multiple Domain Translation Through Controlled Disentanglement in Variational Autoencoder. 7010-7014 - Alican Akman, Björn W. Schuller:
AttHear: Explaining Audio Transformers Using Attention-Aware NMF. 7015-7019 - Zakaria Elabid, Daniel Busby, Abdenour Hadid:
Knowledge-Based Convolutional Neural Network for the Simulation and Prediction of Two-Phase Darcy Flows. 7020-7024 - Kaito Shiku, Shinnosuke Matsuo, Daiki Suehiro, Ryoma Bise:
Counting Network for Learning from Majority Label. 7025-7029 - Kawisorn Kamtue, José M. F. Moura, Orathai Sangpetch, Paulo Garcia:
PHYOT: Physics-Informed Object Tracking in Surveillance Cameras. 7030-7034 - Mert Indibi, Selin Aviyente:
Spatiotemporal Group Anomaly Detection via Graph Total Variation on Tensors. 7035-7039 - Yousef El-Laham, Elizabeth Fons, Dillon Daudert, Svitlana Vyetrenko:
Augment on Manifold: Mixup Regularization with UMAP. 7040-7044 - John Shi, Shreyas Chaudhari, José M. F. Moura:
Graph Convolutional Neural Networks In The Companion Model. 7045-7049 - Hossein Souri, Pirazh Khorramshahi, Chun Pong Lau, Micah Goldblum, Rama Chellappa:
Identifying Attack-Specific Signatures in Adversarial Examples. 7050-7054 - Periklis Theodoropoulos, Konstantinos E. Nikolakakis, Dionysis Kalogerias:
Federated Learning Under Restricted User Availability. 7055-7059 - Zheng Xing, Junting Chen:
HMM-based CSI Embedding for Trajectory Recovery from RSS Measurements of Non-Cooperative Devices. 7060-7064 - Kexin Zhang, Qingsong Wen, Chaoli Zhang, Liang Sun, Yong Liu:
Skip-Step Contrastive Predictive Coding for Time Series Anomaly Detection. 7065-7069 - Jieren Deng, Xin Zhou, Hao Tian, Zhihong Pan, Derek Aguiar:
GBSD: Generative Bokeh with Stage Diffusion. 7070-7074 - Shaohua Liu, Yinglong Zhu, Pengfei Yao, Tianlu Mao, Zhaoqi Wang:
SpectrumNet: Spectrum-Based Trajectory Encode Neural Network for Pedestrian Trajectory Prediction. 7075-7079 - Khondoker Murad Hossain, Tim Oates:
Ten-Guard: Tensor Decomposition for Backdoor Attack Detection in Deep Neural Networks. 7080-7084 - Mashrura Tasnim, Ramon E. Diaz-Ramos, Eleni Stroulia, Luis A. Trejo:
A Machine-Learning Model for Detecting Depression, Anxiety, and Stress from Speech. 7085-7089 - Geyunqian Zu, Shengjie Zhao, Jin Zeng, Shilong Dong, Zixuan Chen:
SEA-GNN: Sequence Extension Augmented Graph Neural Network for Sequential Recommendation. 7090-7094 - Liang Du, Xiaodong Li, Yan Chen, Gui Yang, Mian Ilyas Ahmad, Peng Zhou:
Higher Order Multiple Graph Filtering for Structured Graph Learning. 7095-7099 - Mohammad Jafari, Yimeng Zhang, Yihua Zhang, Sijia Liu:
The Power of Few: Accelerating and Enhancing Data Reweighting with Coreset Selection. 7100-7104 - Muqiao Yang, Umberto Cappellazzo, Xiang Li, Bhiksha Raj:
Improving Continual Learning of Acoustic Scene Classification via Mutual Information Optimization. 7105-7109 - Fen Wang, Taihao Li, Wuyue Zhang, Xue Zhang, Cheng Yang:
Graph-Enhanced Hybrid Sampling for Multi-Armed Bandit Recommendation. 7110-7114 - Jaidev Gill, Vala Vakilian, Christos Thrampoulidis:
Engineering the Neural Collapse Geometry of Supervised-Contrastive Loss. 7115-7119 - Jin Liu, Dejiao Zeng, Ludi Li, Hanhe Lin, Xu Tian:
Source-Free Domain Adaptation for Millimeter Wave Radar Based Human Activity Recognition. 7120-7124 - Muqiao Yang, Chunlei Zhang, Yong Xu, Zhongweiyang Xu, Heming Wang, Bhiksha Raj, Dong Yu:
uSee: Unified Speech Enhancement And Editing with Conditional Diffusion Models. 7125-7129 - Ping Xu, Yue Wang, Xiang Chen, Zhi Tian:
Communication-Efficient Decentralized Dynamic Kernel Learning. 7135-7139 - Yilin Wang, Tao Chen, Yuliang Tang, Lianfen Huang:
Enhanced KPI Anomaly Detection: An Unsupervised Hybrid Model with Dynamic Threshold. 7140-7144 - Yuwen Yang, Chang Liu, Xun Cai, Suizhi Huang, Hongtao Lu, Yue Ding:
UNIDEAL: Curriculum Knowledge Distillation Federated Learning. 7145-7149 - Yimin Deng, Huaizhen Tang, Xulong Zhang, Ning Cheng, Jing Xiao, Jianzong Wang:
Learning Disentangled Speech Representations with Contrastive Learning and Time-Invariant Retrieval. 7150-7154 - Tian Tan, Weimin Tan, Xuhao Jiang, Yueming Jiang, Bo Yan:
Bridging The Domain Gap Arising from Text Description Differences for Stable Text-To-Image Generation. 7155-7159 - Jiabin Lin, Karuna Anna Sajeevan, Bibek Acharya, Shana Moothedath, Ratul Chowdhury:
Distributed Stochastic Contextual Bandits for Protein Drug Interaction. 7160-7164 - Chen Peng, Di Zhang, Urbashi Mitra:
Graph Identification and Upper Confidence Evaluation for Causal Bandits with Linear Models. 7165-7169 - Changkyu Choi, Shujian Yu, Michael Kampffmeyer, Arnt-Børre Salberg, Nils Olav Handegard, Robert Jenssen:
DIB-X: Formulating Explainability Principles for a Self-Explainable Model Through Information Theoretic Learning. 7170-7174 - Liang Du, Yunhui Liang, Mian Ilyas Ahmad, Peng Zhou:
K-Means Clustering Based on Chebyshev Polynomial Graph Filtering. 7175-7179 - Akram Heidarizadeh, George K. Atia:
Adversarial Domain Adaptation for Classification with Nested Dichotomies. 7180-7184 - Bowen Tao, Lan Li, Xin-Chun Li, De-Chuan Zhan:
CLAF: Contrastive Learning with Augmented Features for Imbalanced Semi-Supervised Learning. 7185-7189 - Yichen Zhu, Bo Jiang:
StableMiss+: Prediction with Incomplete Data Under Agnostic Mask Distribution Shift. 7190-7194 - Meng Xu, Bo Jiang, Wenqiang Pu, Ya-Feng Liu, Anthony Man-Cho So:
An Efficient Alternating Riemannian/Projected Gradient Descent Ascent Algorithm for Fair Principal Component Analysis. 7195-7199 - Jingze Lu, Kaijun Ren, Taikang Yuan, Wuxin Wang:
Phase-Space-Guided Deep Learning For Time Series Forecasting. 7200-7204 - Liu Yang, Kurt Butler, Petar M. Djuric:
Sequential Detection of Anomalies in Noisy Outputs of an Unknown Function Using Gaussian and Yule-Simon Processes. 7205-7209 - Binghan Chen, Jianlong Hu, Xiawu Zheng, Wei Lin, Fei Chao, Rongrong Ji:
Functionally Similar Multi-Label Knowledge Distillation. 7210-7214 - Huiqing Qi, Shengli Tan, Xiaoliu Luo:
Self-Supervised Dual Generative Networks for Edge-Preserving Image Smoothing. 7215-7219 - Vinícius Araújo Rabello Landeira, Jardel Oliveira Santos, Hitoshi Nagano:
Comparing and Combining Audio Processing and Deep Learning Features for Classification of Heartbeat Sounds. 7220-7224 - Chun-Ti Chou, Vincent S. Tseng:
Self-Supervised Pulse-Aware Interpretable Disentangled ECG Representation Learning. 7225-7229 - Chenghao Li, Dake Chen, Yuke Zhang, Peter A. Beerel:
Mitigate Replication and Copying in Diffusion Models with Generalized Caption and Dual Fusion Enhancement. 7230-7234 - Vivek Sivaraman Narayanaswamy, Rushil Anirudh, Jayaraman J. Thiagarajan:
The Double-Edged Sword Of AI Safety: Balancing Anomaly Detection and OOD Generalization Via Model Anchoring. 7235-7239 - Zhiyuan Wang, Xiaoyang Qu, Jing Xiao, Bokui Chen, Jianzong Wang:
INCPrompt: Task-Aware Incremental Prompting for Rehearsal-Free Class-Incremental Learning. 7240-7244 - Haofeng Sun, Hui Tian, Wanli Ni, Jingheng Zheng:
On the Convergence of Hierarchical Federated Learning with Gradient Quantization and Imperfect Transmission. 7245-7249 - Huaze Tang, Yuanquan Hu, Fanfan Zhao, Junji Yan, Ting Dong, Wenbo Ding:
M3ARL: Moment-Embedded Mean-Field Multi-Agent Reinforcement Learning for Continuous Action Space. 7250-7254 - Matthew Repasky, Xiuyuan Cheng, Yao Xie:
Stage-Regularized Neural Stein Critics For Testing Goodness-Of-Fit Of Generative Models. 7255-7259 - Risako Tanigawa, Yasunori Ishii:
Hear-Your-Action: Human Action Recognition by Ultrasound Active Sensing. 7260-7264 - Zhiyuan Wang, Xiaoyang Qu, Jing Xiao, Bokui Chen, Jianzong Wang:
P2DT: Mitigating Forgetting in Task-Incremental Learning with Progressive Prompt Decision Transformer. 7265-7269 - Kai Yang, Wenxin Tai, Zhenhui Li, Ting Zhong, Guangqiang Yin, Yong Wang, Fan Zhou:
Exploring Self-Explainable Street-Level IP Geolocation with Graph Information Bottleneck. 7270-7274 - Fahim Faisal Niloy, Kishor Kumar Bhaumik, Simon S. Woo:
Source-Free Online Domain Adaptive Semantic Segmentation of Satellite Images Under Image Degradation. 7275-7279 - Prasanna Reddy Pulakurthi, Mahsa Mozaffari, Sohail A. Dianat, Majid Rabbani, Jamison Heard, Raghuveer Rao:
Enhancing GAN Performance Through Neural Architecture Search and Tensor Decomposition. 7280-7284 - Oluwasegun A. Somefun, Stefan Lee, V. John Mathews:
AUTOSGM: A Unified Lowpass Regularization Framework for Accelerated Learning. 7285-7289 - Shizhe Ding, Boyang Xia, Jingyan Sui, Dongbo Bu:
Accurate Interpolation of Scattered Data Via Learning Relation Graph. 7290-7294 - Hao Sun, Junting Chen, Yuan Luo:
Tensor-Guided Interpolation For Off-Grid Power Spectrum Map Construction. 7295-7299 - Andreea-Maria Oncescu, João F. Henriques, Andrew Zisserman, Samuel Albanie, A. Sophia Koepke:
A Sound Approach: Using Large Language Models to Generate Audio Descriptions for Egocentric Text-Audio Retrieval. 7300-7304 - Alejandro Parada-Mayorga, Landon Butler, Alejandro Ribeiro:
Non Commutative Convolutional Signal Models in Neural Networks: Stability to Small Deformations. 7305-7309 - Xiangyu Xiong, Yue Sun, Xiaohong Liu, Chan-Tong Lam, Tong Tong, Hao Chen, Qinquan Gao, Wei Ke, Tao Tan:
A Parameterized Generative Adversarial Network Using Cyclic Projection for Explainable Medical Image Classifications. 7310-7314 - Zhe Zhao, Pengkun Wang, Haibin Wen, Yudong Zhang, Binwu Wang, Yang Wang:
Graph Networks Stand Strong: Enhancing Robustness via Stability Constraints. 7315-7319 - Yantong Lai, Yijun Su, Lingwei Wei, Tianci Wang, Daren Zha, Xin Wang:
Adaptive Spatial-Temporal Hypergraph Fusion Learning for Next POI Recommendation. 7320-7324 - Fangyuan Chi, Yixiao Wang, Panos Nasiopoulos, Victor C. M. Leung:
Multi-Modal GPT-4 Aided Action Planning and Reasoning for Self-driving Vehicles. 7325-7329 - Divyanshu Daiya, Monika Yadav, Harshit Singh Rao:
Diffstock: Probabilistic Relational Stock Market Predictions Using Diffusion Models. 7335-7339 - Zi Huang, Akila Pemasiri, Simon Denman, Clinton Fookes, Terrence Martin:
Multi-Stage Learning for Radar Pulse Activity Segmentation. 7340-7344 - Chao Zheng, Liming Wang, Zhen Xu, Hongjia Li:
Mutual Information Based Noise Scale Optimization for Gradient Leakage Resistant Federated Learning. 7345-7349 - Taewoong Kang, Jeongsik Oh, Jaeseong Lee, Sunghyun Park, Jaegul Choo:
Expression Domain Translation Network for Cross-Domain Head Reenactment. 7356-7359 - Wenxin Liang, Zhiliang Hao, Han Liu, Hongyang Chen:
Boosting Zero-Shot Node Classification via Dependency Capture and Discriminative Feature Learning. 7360-7364 - Advait Kumar, Shirsha Bose, Mohamad Hassan N C, Biplab Banerjee:
SPDG-Net: Semantics Preserving Domain Augmentation through Style Interpolation for Multi-Source Domain Generalization. 7365-7369 - Shichuan Zhang, Sunyi Zheng, Zhongyi Shui, Lin Yang:
HLS-FGVC: Hierarchical Label Semantics Enhanced Fine-Grained Visual Classification. 7370-7374 - Zhipeng Lin, Wenjing Yang, Long Lan, Mingyang Geng, Haotian Wang, Haoang Chi, Xueqiong Li, Ji Wang:
Diversifying Cross-Domain Few-Shot Learning via Multimodal Image Editing. 7375-7379 - Yunzuo Zhang, Yuxin Zheng, Cunyu Wu, Tian Zhang, Yameng Liu:
ECPNet: An Enhanced Curve Perception Network for Lane Detection. 7380-7384 - Jianping Li, Xiao Ke, Zhihao Wang, JinCheng Wan, Guozhen Tan:
Cutransnet: Transformers to Make Strong Encoders for Multi-Task Vision Perception of Autonomous Driving. 7385-7389 - Junyi Wang, Bin Chen, Wenrui Fan, Yongjiang Liu:
Co-Salient Object Detection via Discriminative Prototypes Contrast. 7390-7394 - Robin Francis, Sundeep Prabhakar Chepuri:
Differentially Private Federated Frank-Wolfe. 7395-7399 - Zhao Sun, Yulong Pei, Defu Li, Qinke Peng:
CGN: A Simple Yet Effective Multi-Channel Gated Network for Long-Term Time Series Forecasting. 7400-7404 - Qiqi Zhou, Yichen Zhu:
When Training-Free NAS Meets Vision Transformers: A Neural Tangent Kernel Perspective. 7405-7409 - Sadegh Mahdavi, Renjie Liao, Christos Thrampoulidis:
Revisiting the Equivalence of In-Context Learning and Gradient Descent: The Impact of Data Distribution. 7410-7414 - Hossein Mirzaei, Mohammad Jafari, Hamid Reza Dehbashi, Zeinab Sadat Taghavi, Mohammad Sabokrou, Mohammad Hossein Rohban:
Killing It With Zero-Shot: Adversarially Robust Novelty Detection. 7415-7419 - Zhimin Zhang, Xiang Gao, Wei Hu:
InvariantOODG: Learning Invariant Features of Point Clouds for Out-of-Distribution Generalization. 7420-7424 - Jaekwon Im, Juhan Nam:
DiffRENT: A Diffusion Model for Recording Environment Transfer of Speech. 7425-7429 - Sebastian Eliassen, Raghavendra Selvan:
Activation Compression of Graph Neural Networks Using Block-Wise Quantization with Improved Variance Minimization. 7430-7434 - Ahmed Ben Yahmed, Clément Calauzènes, Vianney Perchet:
Strategic Arms with Side Communication Prevail Over Low-Regret MAB Algorithms. 7435-7439 - Puneesh Deora, Bhavya Vasudeva, Vatsal Sharan, Christos Thrampoulidis:
Fast Test Error Rates for Gradient-Based Algorithms on Separable Data. 7440-7444 - Matthieu Gallet, Ammar Mian, Abdourrahmane M. Atto:
Renyi Divergences Learning for Explainable Classification of SAR Image Pairs. 7445-7449 - Chia-Hsin Lin, Charles Jones, Björn W. Schuller, Harry Coppock, Alican Akman:
Synthia's Melody: A Benchmark Framework for Unsupervised Domain Adaptation in Audio. 7450-7454 - Subhajit Chaudhury, Toshihiko Yamasaki:
Adversarial Robustness of Convolutional Models Learned in the Frequency Domain. 7455-7459 - Guanglu Wang, Xianchao Zhang, Han Liu, Xiaotong Zhang, Jie Mu, Wentao Yang, Linlin Zong:
Continual Learning with Class-Level Minimally Interfered Update. 7460-7464 - Janghoon Cho, Sunghyun Park, Hyunsin Park, Hyoungwoo Park, Seunghan Yang, Sungrack Yun:
Balanced Learning for Multi-Domain Long-Tailed Speaker Recognition. 7465-7469 - Yu-Mei Huang, Hui-Nien Hung, Vincent S. Tseng:
Privacy-Preserving Attention-Weighted Multi-Source Domain Adaptation for EEG Motor Imagery. 7470-7474 - Fadlullah Raji, John Murray-Bruce:
Towards 3D Computational Periscopy with an Ordinary Camera: A Separable Non-Linear Least Squares Formulation. 7475-7479 - Haleh Akrami, Omar Zamzam, Anand A. Joshi, Sergül Aydöre, Richard M. Leahy:
Beta Quantile Regression for Robust Estimation of Uncertainty in the Presence of Outliers. 7480-7484 - Xu Zhang, Zhengang Huang, Yunzhi Wu, Xun Lu, Erpeng Qi, Yunkai Chen, Zhongya Xue, Peng Wang, Wei Wang:
Self-Adaptive Scale Handling for Forecasting Time Series with Scale Heterogeneity. 7485-7489 - Wentao Shi, Baoqi Huang, Bing Jia:
SCRN: A Spectrogram Convolutional Recurrent Network for AoA Estimation Using Bluetooth 5. 7490-7494 - Guoxing Yang, Haoyu Lu, Chongxuan Li, Guang Zhou, Haoran Wu, Zhiwu Lu:
Progressive Image Synthesis from Semantics to Details with Denoising Diffusion GAN. 7495-7499 - Yulan Hu, Sheng Ouyang, Zhirui Yang, Yi Zhao, Junchen Wan, Fuzheng Zhang, Zhongyuan Wang, Yong Liu:
GFMAE: Self-Supervised GNN-Free Masked Autoencoders. 7500-7504 - Nianlong Gu, Kanghwi Lee, Maris Basha, Sumit Kumar Ram, Guanghao You, Richard H. R. Hahnloser:
Positive Transfer of the Whisper Speech Transformer to Human and Animal Voice Activity Detection. 7505-7509 - Chen Liu, Shibo He, Haoyu Liu, Shizhong Li:
TreeMIL: A Multi-Instance Learning Framework for Time Series Anomaly Detection with Inexact Supervision. 7510-7514 - Zexi Liu, Bohan Tang, Ziyuan Ye, Xiaowen Dong, Siheng Chen, Yanfeng Wang:
Hypergraph Transformer for Semi-Supervised Classification. 7515-7519 - Linghui Meng, Xi Zhang, Dengpeng Xing, Bo Xu:
A New Pre-Training Paradigm for Offline Multi-Agent Reinforcement Learning with Suboptimal Data. 7520-7524 - Hong Yin, Jiang Zhong, Rongzhen Li, Jiaqi Wang, Chen Wang, Qizhu Dai, Xue Li:
Facilitating Message Passing with Potential Links for Knowledge Graph Completion. 7525-7529 - Ankit Shah, Fuyu Tang, Zelin Ye, Rita Singh, Bhiksha Raj:
Importance of Negative Sampling in Weak Label Learning. 7530-7534 - Fengfan Zhao, Ercan Engin Kuruoglu:
Sequential Monte Carlo Graph Convolutional Network for Dynamic Brain Connectivity. 7535-7539 - Yunhang Yao, Min Gao, Hongwei Zhou, Zongwei Wang, Zehua Zhao, Qingyu Xiong:
Ranking Enhanced Fine-Grained Contrastive Learning for Recommendation. 7540-7544 - Lixu Wang, Shichao Xu, Xinyu Du, Qi Zhu:
DACR: Distribution-Augmented Contrastive Reconstruction for Time-Series Anomaly Detection. 7545-7549 - Charilaos I. Kanatsoulis, Alejandro Ribeiro:
Graph Neural Networks are More Powerful than We Think. 7550-7554 - Perrine Bauchot, Angélique Drémeau, Florian Sévellec, Ronan Fablet:
Impact of Sampling Strategies on the Monitoring of Climate Regime Shifts with a Learning Data Assimilation Method. 7555-7559 - Zichao Deng, Han Yu:
Noise-Resistant Graph Neural Network for Node Classification. 7560-7564 - Nazreen Shah, Prachi Goyal, Ranjitha Prasad:
Importance Sampling Based Federated Unsupervised Representation Learning. 7565-7569 - Rajdeep Dutta, Qincheng Wang, Ankur Singh, Dhruv Kumarjiguda, Xiaoli Li, Senthilnath Jayavelu:
Interpretable Policy Extraction with Neuro-Symbolic Reinforcement Learning. 7570-7574 - Burak Hasircioglu, Deniz Gündüz:
Communication Efficient Private Federated Learning Using Dithering. 7575-7579 - Chenyu Xu, Sihai Zhang, Zhengdao Wang:
Computational Complexity of Asynchronous Policy Iteration for Two-Player Zero-Sum Markov Games. 7580-7584 - Fanhua Li, Yuanyuan Deng, Bo Zhou, Qihui Wu:
Partial Convolutional Based-Radio Map Reconstruction for Urban Environments with Inaccessible Areas. 7585-7589 - Federico Malato, Florian Leopold, Andrew Melnik, Ville Hautamäki:
Zero-Shot Imitation Policy Via Search In Demonstration Dataset. 7590-7594 - Shu Zheng, Tiandi Ye, Xiang Li, Ming Gao:
Federated Learning via Consensus Mechanism on Heterogeneous Data: A New Perspective on Convergence. 7595-7599 - Yuefeng Ma, Lanzhen Guo:
A Counterfactual Inspired Framework For Quantifying Edge Effects On GNNs Fairness. 7600-7604 - Waleed El-Geresy, Deniz Gündüz:
Adversarial Jamming for Autoencoder Distribution Matching. 7605-7609 - Yujie Li, Yifu Wang, Zihang Ma, Xinghe Wang, Yutao Tang:
SOD-UAV: Small Object Detection For Unmanned Aerial Vehicle Images Via Improved YOLOv7. 7610-7614 - Yi-Cheng Lai, Chen-Yu Wang, Feng-Tsun Chien:
When Green Learning Meets Federated Learning: Toward Distributed Learning with Low Complexity and Model Heterogeneity. 7615-7619 - Hounsu Kim, Soonbeom Choi, Juhan Nam:
Expressive Acoustic Guitar Sound Synthesis with an Instrument-Specific Input Representation and Diffusion Outpainting. 7620-7624 - Akash Sen, Pradyumna Pradhan, Ramunaidu Randhi, C. S. Sastry:
Unrolled Proximal Gradient Descent Method for Non-Negative Least Squares Problem. 7625-7629 - Stav Danino, Igal Bilik:
Automated Labeling of Automotive Radar Azimuth Multipath. 7630-7634 - Jianfu Zhang, Yan Hong, Dawei Cheng, Liqing Zhang, Qibin Zhao:
Hierarchical Attacks on Large-Scale Graph Neural Networks. 7635-7639 - Jiawei Yan, Yuxing Yang, Syed Mohsen Naqvi:
Object Detection Oriented Privacy-Preserving Frame-Level Video Anomaly Detection. 7640-7644 - Songhui Zhao, Sujuan Hou:
Multi-Stage Progressive Refinement and RoI Context Enhancement Network for Small Logo Detection. 7645-7649 - Zhihua Chen, Lei Liang, Yeting Huang, Lei Dai, Ran Li, Bin Sheng:
Context-Aware Transformer for Single Image Rain Streaks Removal. 7650-7654 - Li Ge, Xue Jiang, Lin Chen, Xingzhao Liu, Martin Haardt:
Leveraging Tensor Subspace Prior: Enhanced Sum of Nuclear Norm Minimization for Tensor Completion. 7655-7659 - Xiang Zhang, Tao Hu, Jiashuang He, Qingsen Yan:
Efficient Content Reconstruction for High Dynamic Range Imaging. 7660-7664 - Mario Döbler, Florian Marencke, Robert A. Marsden, Bin Yang:
Diversity-Aware Buffer for Coping with Temporally Correlated Data Streams in Online Test-Time Adaptation. 7665-7669 - Haiyan Lan, Qiaoxi Zhu, Jian Guan, Yuming Wei, Wenwu Wang:
Hierarchical Metadata Information Constrained Self-Supervised Learning for Anomalous Sound Detection under Domain Shift. 7670-7674 - Said Ouala, Laurent Debreu, Bertrand Chapron, Fabrice Collard, Lucile Gaultier, Ronan Fablet:
Neural Ordinary Differential Equations with Trainable Solvers. 7675-7679 - Yi Zhang, Ce Zhang:
Test-Time Distribution Learning Adapter for Cross-Modal Visual Reasoning. 7680-7684 - Himanshu Singh, A V. Subramanyam:
Language Guided Adversarial Purification. 7685-7689 - Yujie Li, Zihang Ma, Xinghe Wang, Yifu Wang, Benying Tan:
Vision Transformer with 2D Explicit Position Encoding. 7690-7694 - Ming Fang, Xinning Du, Qi Liu, Yunpeng Zhou, Qiwen Liang, Shuhua Liu:
Which is the Better Teacher Action? A New Ranking Model and Dataset. 7695-7699 - Pengjia Tu, Cheng Tian, Dandan Du, Junhuai Li, Huaijun Wang:
Activity Recognition Method Based on Kernel Supervised Laplacian Eigenmaps. 7700-7704 - Cheng Cheng, Ziping Zhao:
Accelerating Gradient Descent for Over-Parameterized Asymmetric Low-Rank Matrix Sensing via Preconditioning. 7705-7709 - Jianwei Sun, Yang An, Xinyu Jiang, Qian Li, Yulong Liu, Yongshun Gong:
Synonym Replacement and Generation Enhancement for Document Augmentation. 7710-7714 - Bumsu Park, Heedong Do, Namyoon Lee:
Multi-Rate Variable-Length CSI Compression for FDD Massive MIMO. 7715-7719 - Hua Jiang, Yixiong Chen, Li Liu, Xiaoguang Han, Xiao-Ping Zhang:
Leveraging Noisy Labels of Nearest Neighbors for Label Correction and Sample Selection. 7720-7724 - Yufeng Xie, Yinan Wang, Han Wang, Qingshan Li:
Self-Supervised Reinforcement Learning for Out-of-Distribution Recovery via Auxiliary Reward. 7725-7729 - Lingyong Fang, Gongshen Liu, Ru Zhang:
Multi-Grained Multimodal Interaction Network for Sentiment Analysis. 7730-7734 - Ling Guo, Guoguo Ai, Hui Yan:
Adaptive Order Aggregator and Extractor Graph Neural Network. 7735-7739 - Kai Song, Zhengtan Wang, Huhe Dai, Yuan Zheng:
CENet: Content-Aware Enhanced Network for Practical Scene Parsing. 7740-7744 - Hongwei Yao, Jian Lou, Zhan Qin:
PoisonPrompt: Backdoor Attack on Prompt-Based Large Language Models. 7745-7749 - Huiwen Luo, Guoqiang Zhang:
On Optimizing Timesteps of an EDM Based Diffusion Sampling Procedure. 7750-7754 - Xuanjun Chen, Haibin Wu, Chung-Che Wang, Hung-Yi Lee, Jyh-Shing Roger Jang:
Multimodal Transformer Distillation for Audio-Visual Synchronization. 7755-7759 - Zhengpin Li, Jian Wang:
Spectral Graph Neural Networks with Generalized Laguerre Approximation. 7760-7764 - Qiyuan Ou, Pei Zhang, Sihang Zhou, En Zhu:
One-Step Late Fusion Multi-View Clustering with Compressed Subspace. 7765-7769 - Abrar Zahin, Weizhi Li, Gautam Dasarathy:
Rapid Change Localization in Dynamic Graphical Models. 7770-7774 - Jiabin Liu, Zheng Wei, Zhengpin Li, Xiaojun Mao, Jian Wang, Zhongyu Wei, Qi Zhang:
SAM: A Self-Adaptive Attention Module for Context-Aware Recommendation System. 7775-7779 - Cheng Zhou, Guangxia Li, Yulong Shen:
A Simple and Effective Method for Anomaly Detection on Attributed Graphs via Feature Consistency. 7780-7784 - Dimitrios Psarras, Christos Papaioannidis, Vasileios Mygdalis, Ioannis Pitas:
A Unified DNN-Based System for Industrial Pipeline Segmentation. 7785-7789 - Parth Thaker, Vineet Sunil Gattani, Vignesh Tirukkonda, Pouria Saidi, Gautam Dasarathy:
Non-Stationary Bandits with Periodic Behavior: Harnessing Ramanujan Periodicity Transforms to Conquer Time-Varying Challenges. 7790-7794 - Hao Hu, Zhixi Feng, Ruoxue Li, Yue Ma, Shuyuan Yang:
A Novel Cross-Sensor Self-Supervised Learning Method for Rotating Machinery Fault Diagnosis. 7795-7799 - Bo Xu, Hao Zheng, Zhigang Hu, Liu Yang, Meiguang Zheng, Xianting Feng, Wei Lin:
Double Reverse Regularization Network Based on Self-Knowledge Distillation for SAR Object Classification. 7800-7804 - Xinglong Wu, Hui He, Zejun Wang, Yu Tai, Sheng Yin, Hongwei Yang, Weizhe Zhang:
How to Bridge Graph and Sequence Patterns in Session-Based Recommendation? A Self-Supervised Method. 7805-7809 - Xianting Feng, Hao Zheng, Zhigang Hu, Liu Yang, Meiguang Zheng:
Dual-Stream Contrastive Predictive Network with Joint Handcrafted Feature View for SAR Ship Classification. 7810-7814 - Guohui Li, Li Zou, Zhiying Deng, Qi Chen:
Neighborhood-Enhanced Multimodal Collaborative Filtering for Item Cold Start Recommendation. 7815-7819 - Jinzhi Zheng, Libo Zhang, Yanjun Wu, Chen Zhao:
Text Region Multiple Information Perception Network for Scene Text Detection. 7820-7824 - Abhiram Kolli, Filippo Casamassima, Horst Possegger, Horst Bischof:
Robust Localization of Key Fob Using Channel Impulse Response of Ultra Wide Band Sensors for Keyless Entry Systems. 7825-7829 - Subhajit Dutta Chowdhury, Zhiyu Ni, Qingyuan Peng, Souvik Kundu, Pierluigi Nuzzo:
Analyzing Adversarial Vulnerabilities of Graph Lottery Tickets. 7830-7834 - Calvin Murdock, Ishwarya Ananthabhotla, Hao Lu, Vamsi Krishna Ithapu:
Self-Motion As Supervision For Egocentric Audiovisual Localization. 7835-7839 - Seongmin Lee, Jeonghaeng Lee, Hyewon Song, Sanghoon Lee:
Speech-Driven Emotional 3D Talking Face Animation Using Emotional Embeddings. 7840-7844 - Yuhang Ling, Yuxi Li, Zhenye Gan, Jiangning Zhang, Mingmin Chi, Yabiao Wang:
TransAVS: End-to-End Audio-Visual Segmentation with Transformer. 7845-7849 - Aozhu Chen, Fangming Zhou, Ziyuan Wang, Xirong Li:
CLIPRerank: An Extremely Simple Method For Improving Ad-Hoc Video Search. 7850-7854 - Yimo Ren, Jinfa Wang, Jie Liu, Peipei Liu, Hong Li, Hongsong Zhu, Limin Sun:
A Relation-Aware Heterogeneous Graph Transformer on Dynamic Fusion for Multimodal Classification Tasks. 7855-7859 - Liqiang Jing, Xuemeng Song, Xinxing Zu, Na Zheng, Zhongzhou Zhao, Liqiang Nie:
VK-G2T: Vision and Context Knowledge Enhanced Gloss2text. 7860-7864 - Fei Zhao, Mengxi Guo, Shijie Zhao, Junlin Li, Li Zhang, Xiaodong Xie:
A Tri-Dynamic Preprocessing Framework for UGC Video Compression. 7865-7869 - Junkun Jiang, Jie Chen:
Exploring Latent Cross-Channel Embedding for Accurate 3D Human Pose Reconstruction in a Diffusion Framework. 7870-7874 - Bingyang Cui, Qi Yang, Kaifa Yang, Yiling Xu, Xiaozhong Xu, Shan Liu:
SJTU-TMQA: A Quality Assessment Database for Static Mesh with Texture Map. 7875-7879 - Zhen Wang, Dongyuan Li, Manabu Okumura:
Multimodal Graph-Based Audio-Visual Event Localization. 7880-7884 - Umberto Michieli, Jijoong Moon, Daehyun Kim, Mete Ozay:
Object-Conditioned Bag of Instances for Few-Shot Personalized Instance Recognition. 7885-7889 - Tianjun Mao, Shansong Liu, Yunxuan Zhang, Dian Li, Ying Shan:
Unified Pretraining Target Based Video-Music Retrieval with Music Rhythm and Video Optical Flow Information. 7890-7894 - Pengyue Lin, Zhihan Yu, Mingcong Lu, Fangxiang Feng, Ruifan Li, Xiaojie Wang:
Visual Prompt Tuning for Weakly Supervised Phrase Grounding. 7895-7899 - Jian Zhu, Yu Cui, Zhangmin Huang, Xingyu Li, Lei Liu, Lingfang Zeng, Li-Rong Dai:
Adaptive Confidence Multi-View Hashing for Multimedia Retrieval. 7900-7904 - Weigang Wang, Zhongwen Guo, Chao Yang, Jinxin Wang, Sining Jiang, Tianao Zhang:
Joint-Semantics Multi-Similarity Hashing for Cross-Modal Retrieval. 7905-7909 - Haijian Liang, Weicheng Xie, Xilin He, Siyang Song, Linlin Shen:
Circular Decomposition and Cross-Modal Recombination for Multimodal Sentiment Analysis. 7910-7914 - Shansong Liu, Xu Li, Dian Li, Ying Shan:
HumTrans: A Novel Open-Source Dataset for Humming Melody Transcription and Beyond. 7915-7919 - Bowen Huang, Davi Lazzarotto, Touradj Ebrahimi:
Temporal Conditional Coding for Dynamic Point Cloud Geometry Compression. 7920-7924 - Yu-Ping Ruan, Shoukang Han, Taihao Li, Yanfeng Wu:
Fusing Modality-Specific Representations and Decisions for Multimodal Emotion Recognition. 7925-7929 - Jingzhe Li, Chengji Wang, Zhiming Luo, Yuxian Wu, Xingpeng Jiang:
Modality-Dependent Sentiments Exploring for Multi-Modal Sentiment Classification. 7930-7934 - Yating Liu, Yaowei Li, Zimo Liu, Wenming Yang, Yaowei Wang, Qingmin Liao:
CLIP-Based Synergistic Knowledge Transfer for Text-Based Person Retrieval. 7935-7939 - Fan Yu, Haoxu Wang, Ziyang Ma, Shiliang Zhang:
Hourglass-AVSR: Down-Up Sampling-Based Computational Efficiency Model for Audio-Visual Speech Recognition. 7940-7944 - Sicheng Yang, Zunnan Xu, Haiwei Xue, Yongkang Cheng, Shaoli Huang, Mingming Gong, Zhiyong Wu:
FreeTalker: Controllable Speech and Text-Driven Gesture Generation Based on Diffusion Models for Enhanced Speaker Naturalness. 7945-7949 - Shengzhe You, Libo Weng, Fei Gao:
Weakly Supervised Few-Shot Segmentation Through Textual Prompt. 7950-7954 - Yingxue Pang, Shijie Zhao, Mengxi Guo, Junlin Li, Li Zhang:
Region-Adaptive Video Sharpening Via Rate-Perception Optimization. 7955-7959 - Sichun Luo, Jiansheng Wang, Aojun Zhou, Li Ma, Linqi Song:
Large Language Models Augmented Rating Prediction in Recommender System. 7960-7964 - Yating Liu, Ziyu Shan, Yujie Zhang, Yiling Xu:
MFT-PCQA: Multi-Modal Fusion Transformer for No-Reference Point Cloud Quality Assessment. 7965-7969 - Minsu Kim, Jeongsoo Choi, Soumi Maiti, Jeong Hun Yeo, Shinji Watanabe, Yong Man Ro:
Towards Practical and Efficient Image-to-Speech Captioning with Vision-Language Pre-Training and Multi-Modal Tokens. 7970-7974 - Nan Li, Songlin Du:
Underlying-Complementarity and Surrounding-Correspondence for Multi-View Clustering. 7975-7979 - Miao Liu, Jing Wang, Xinyuan Qian, Xiang Xie:
Visually Guided Binaural Audio Generation with Cross-Modal Consistency. 7980-7984 - Yiqi Jin, Ziyu Zhu, Tongda Xu, Yuhuan Lin, Yan Wang:
ECM-OPCC: Efficient Context Model for Octree-Based Point Cloud Compression. 7985-7989 - Yunqi Li, Shulin Liu, Haonan Cheng, Long Ye:
Binauralmusic: A Diverse Dataset for Improving Cross-Modal Binaural Audio Generation. 7990-7994 - Rui Zhang, Xiaoran Yan:
Video-Language Graph Convolutional Network for Human Action Recognition. 7995-7999 - Trung Hieu Le, Xavier Pic, Jeremy Mateos, Marc Antonini:
Implicit Neural Multiple Description for DNA-Based Data Storage. 8000-8004 - Bo-Wei Tseng, Kenneth Yang, Yu-Hua Hu, Wen-Li Wei, Jen-Chun Lin:
Music-to-Dance Poses: Learning to Retrieve Dance Poses from Music. 8005-8009 - Qiang Su, Zhixin Li:
Multi-Source Dynamic Interactive Network Collaborative Reasoning Image Captioning. 8010-8014 - Xiaoya Fan, Yuntao Liu, Zhong Wang:
Electroencephalogram Helps Few-Shot Learning. 8015-8019 - Ruixiang Xue, Jiaxin Li, Tong Chen, Dandan Ding, Xun Cao, Zhan Ma:
NeRI: Implicit Neural Representation of LiDAR Point Cloud Using Range Image Sequence. 8020-8024 - Zhongjie Mao, Yucheng Wang, Xi Chen, Jia Yan:
Textual Tokens Classification for Multi-Modal Alignment in Vision-Language Tracking. 8025-8029 - Ruijia Fan, Hong Liu, Yidi Li, Peini Guo, Guoquan Wang, Ti Wang:
AttA-NET: Attention Aggregation Network for Audio-Visual Emotion Recognition. 8030-8034 - Zhanbei Cui, Tongda Xu, Jia Wang, Yu Liao, Yan Wang:
GeneFormer: Learned Gene Compression using Transformer-Based Context Modeling. 8035-8039 - Amit Sofer, Shlomo E. Chazan:
C-CLAPA: Improving Text-Audio Cross Domain Retrieval with Captioning and Augmentations. 8040-8044 - Lingwei Wei, Dou Hu, Wei Zhou, Songlin Hu:
Transferring Structure Knowledge: A New Task to Fake News Detection towards Cold-Start Propagation. 8045-8049 - Mehdi Fatan, Emanuele Mincato, Dimitra Pintzou, Mariella Dimiccoli:
3M-Transformer: A Multi-Stage Multi-Stream Multimodal Transformer for Embodied Turn-Taking Prediction. 8050-8054 - Kai Wang, Dimitrios Hatzinakos:
MOMA: Mixture-of-Modality-Adaptations for Transferring Knowledge from Image Models Towards Efficient Audio-Visual Action Recognition. 8055-8059 - Zhuoyao Gu, Miao Pang, Zhen Xing, Weimin Tan, Xuhao Jiang, Bo Yan:
Facial Micro-Motion-Aware Mixup for Micro-Expression Recognition. 8060-8064 - Jeongsoo Choi, Minsu Kim, Se Jin Park, Yong Man Ro:
Text-Driven Talking Face Synthesis by Reprogramming Audio-Driven Models. 8065-8069 - Jaehyuk Jang, Yooseung Wang, Changick Kim:
Towards Robust Multimodal Prompting with Missing Modalities. 8070-8074 - Weichen Zhao, Yuxing Lu, Ge Jiao, Yuan Yang:
Dual-Color Granularity Alignment for Text-Based Person Search. 8075-8079 - Natalie Lang, Itamar Assaf, Omer Bokobza, Nir Shlezinger:
Data-Driven Lattices for Vector Quantization. 8080-8084 - Masaya Sato, Keisuke Maeda, Ren Togo, Takahiro Ogawa, Miki Haseyama:
Caption Unification for Multi-View Lifelogging Images Based on In-Context Learning with Heterogeneous Semantic Contents. 8085-8089 - Anfeng Xu, Kevin Huang, Tiantian Feng, Helen Tager-Flusberg, Shrikanth Narayanan:
Audio-Visual Child-Adult Speaker Classification in Dyadic Interactions. 8090-8094 - Jiwei Shen, Hu Lu, Hao Zhang, Shujing Lyu, Yue Lu:
Enhancing Reinforcement Learning via Causally Correct Input Identification and Targeted Intervention. 8095-8099 - Xinyuan Qian, Zexu Pan, Qiquan Zhang, Kainan Chen, Shoufeng Lin:
GLMB 3D Speaker Tracking with Video-Assisted Multi-Channel Audio Optimization Functions. 8100-8104 - Chanul Park, Dahyun Jeon, Seongwook Lee:
Camera-Radar Association for Data Annotation. 8105-8109 - Kuan Liu, Yanmin Zhu, Zhaobo Wang, Ke Wang, Gang Zhou:
Rethinking Normals: Direction Guided Point Cloud Recognition. 8110-8114 - Tianfei Ling, Deyuan Chen, Baobin Li:
MDAVIF: A Multi-Domain Acoustical-Visual Information Fusion Model for Depression Recognition from Vlog Data. 8115-8119 - Kuan Liu, Yanmin Zhu, Zhaobo Wang, Ke Wang, Gang Zhou:
Axis Order Invariance Learned from Point Clouds. 8120-8124 - Jiapeng Liu, Chengyang Fang, Liang Li, Bing Li, Dayong Hu, Can Ma:
Prompting Large Language Models with Fine-Grained Visual Relations from Scene Graph for Visual Question Answering. 8125-8129 - Chengyang Fang, Liang Li, Jiapeng Liu, Bing Li, Dayong Hu, Can Ma:
Segment then Match: Find the Carrier before Reasoning in Scene-Text VQA. 8130-8134 - Shanti Stewart, Kleanthis Avramidis, Tiantian Feng, Shrikanth Narayanan:
Emotion-Aligned Contrastive Learning Between Images and Music. 8135-8139 - Haruka Matsuda, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama:
Multi-Object Editing in Personalized Text-To-Image Diffusion Model Via Segmentation Guidance. 8140-8144 - Qi Huang, Pingting Cai, Tanyue Nie, Jinshan Zeng:
CLIP-MSA: Incorporating Inter-Modal Dynamics and Common Knowledge to Multimodal Sentiment Analysis With CLIP. 8145-8149 - He Wang, Pengcheng Guo, Pan Zhou, Lei Xie:
MLCA-AVSR: Multi-Layer Cross Attention Fusion Based Audio-Visual Speech Recognition. 8150-8154 - Nanjie Chen, Jinping Wang, Hao Chen, Ying Shen, Shuai Wang, Xiaojun Tan:
BEVLOC: End-to-End 6-DoF Localization Via Cross-Modality Correlation Under Bird's Eye View. 8155-8159 - Dingxin Cheng, Shuhan Kong, Wenyu Wang, Meixia Qu, Bin Jiang:
Long Term Memory-Enhanced Via Causal Reasoning for Text-To-Video Retrieval. 8160-8164 - Jiamu Li, Dongheng Zhang, Qi Chen, Yadong Li, Jianyang Wang, Wenxuan Li, Yang Hu, Qibin Sun, Yan Chen:
SIMFALL: A Data Generator for RF-Based Fall Detection. 8165-8169 - Rodrigo B. Pinheiro, Jean-Eudes Marvie, Giuseppe Valenzise, Frédéric Dufaux:
Reducing the Complexity of Normalizing Flow Architectures for Point Cloud Attribute Compression. 8170-8174 - Hao-Chiang Shao, Yu-Hsien Lin, Chia-Wen Lin:
A Fine-Grained Attribute Pre-Labeling Method Based on Label Dependency and Feature Similarity Dynamics. 8175-8179 - Binqiang Huang, Zhijie Huang, Shoujie Lan, Qinghai Zheng, Yuanlong Yu:
Incomplete Multi-View Clustering Via Inference and Evaluation. 8180-8184 - Qiaochu Huang, Xu He, Boshi Tang, Haolin Zhuang, Liyang Chen, Shuochen Gao, Zhiyong Wu, Haozhi Huang, Helen Meng:
Enhancing Expressiveness in Dance Generation Via Integrating Frequency and Music Style Information. 8185-8189 - Weichen Zhao, Yuxing Lu, Ge Jiao, Yuan Yang:
Concentrated Reasoning and Unified Reconstruction for Multi-Modal Media Manipulation. 8190-8194 - Alexandr Axyonov, Dmitry Ryumin, Denis Ivanko, Alexey M. Kashevnik, Alexey Karpov:
Audio-Visual Speech Recognition In-The-Wild: Multi-Angle Vehicle Cabin Corpus and Attention-Based Method. 8195-8199 - Xi Chen:
MMRBN: Rule-Based Network for Multimodal Emotion Recognition. 8200-8204 - Hualin Ren, Christian Ritz, Jiahong Zhao, Daeyoung Jang:
Towards an Objective Quality Metric for Interpolated Directional Room Impulse Responses. 8205-8209 - Andréas Pastor, Pierre R. Lebreton, Toinon Vigier, Patrick Le Callet:
Comparison of Conditions for Omnidirectional Video with Spatial Audio in Terms of Subjective Quality and Impacts on Objective Metrics Resolving Power. 8210-8214 - Baogui Xu, Yafei Lu, Bing Su, Xiaoran Yan:
Position-Aware Active Learning for Multi-Modal Entity Alignment. 8215-8219 - Shivam Mehta, Ruibo Tu, Simon Alexanderson, Jonas Beskow, Éva Székely, Gustav Eje Henter:
Unified Speech and Gesture Synthesis Using Flow Matching. 8220-8224 - Salvador Medina, Sarah L. Taylor, Carsten Stoll, Gareth Edwards, Alex Hauptmann, Shinji Watanabe, Iain A. Matthews:
PhISANet: Phonetically Informed Speech Animation Network. 8225-8229 - Che Liu, Zhongwei Wan, Sibo Cheng, Mi Zhang, Rossella Arcucci:
ETP: Learning Transferable ECG Representations via ECG-Text Pre-Training. 8230-8234 - Qian Gao, Yanling Hao, Yuanwei Liu:
AutoSen: Improving Automatic WiFi Human Sensing through Cross-Modal Autoencoder. 8235-8239 - Gautam Krishna, Sameer Dharur, Oggi Rudovic, Pranay Dighe, Saurabh Adya, Ahmed Hussen Abdelaziz, Ahmed H. Tewfik:
Modality Drop-Out for Multimodal Device Directed Speech Detection Using Verbal and Non-Verbal Features. 8240-8244 - Zuhui Wang, Yunting Yin, I. V. Ramakrishnan:
Enhancing Image-Text Matching with Adaptive Feature Aggregation. 8245-8249 - Deqian Kong, Furqan Khan, Xu Zhang, Prateek Singhal, Ying Nian Wu:
Long-Term Social Interaction Context: The Key to Egocentric Addressee Detection. 8250-8254 - Mingwei Sun, Kunpeng Zhang:
Sec2Sec Co-Attention Transformer for Video-Based Apparent Affective Prediction. 8255-8259 - Xiuyun Ma, Na Lv:
Human Motion Capture Data Segmentation Based on ST-GCN. 8260-8264 - Desen Yuan:
Balancing Easy and Hard Distortions: A Multi-Rate Knowledge Distillation Strategy for Blind Image Quality Assessment. 8265-8269 - Sabyasachee Baruah, Shrikanth Narayanan:
Character Attribute Extraction from Movie Scripts Using LLMs. 8270-8275 - Bingyuan Zhang, Xulong Zhang, Ning Cheng, Jun Yu, Jing Xiao, Jianzong Wang:
EmoTalker: Emotionally Editable Talking Face Generation via Diffusion Model. 8276-8280 - Ronghui Li, Yuqin Dai, Yachao Zhang, Jun Li, Jian Yang, Jie Guo, Xiu Li:
Exploring Multi-Modal Control in Music-Driven Dance Generation. 8281-8285 - Jianhua Dong, Shengrong Zhao, Hu Liang:
Learning Fine-Grained Information Alignment for Calibrated Cross-Modal Retrieval. 8286-8290 - Jincheng Wu, Ruixu Geng, Yadong Li, Dongheng Zhang, Zhi Lu, Yang Hu, Yan Chen:
Diffradar: High-Quality mmWave Radar Perception With Diffusion Probabilistic Model. 8291-8295 - Haiwei Xue, Sicheng Yang, Zhensong Zhang, Zhiyong Wu, Minglei Li, Zonghong Dai, Helen Meng:
Conversational Co-Speech Gesture Generation via Modeling Dialog Intention, Emotion, and Context with Diffusion Models. 8296-8300 - Yusong Wang, Dongyuan Li, Jialun Shen:
Inter-Modality and Intra-Sample Alignment for Multi-Modal Emotion Recognition. 8301-8305 - Zhixin Huang, Yuchen Zhou, Jie Zhu, Chao Gou:
Driver Scanpath Prediction Based On Inverse Reinforcement Learning. 8306-8310 - Guolong Wang, Yike Tan, Hangyu Lin, Chuchun Zhang:
Keep Knowledge in Perception: Zero-Shot Image Aesthetic Assessment. 8311-8315 - Lingling Li, Weicong Li, Qiyuan Ding, Chengpei Tang, Keze Wang:
Gesture Generation Via Diffusion Model with Attention Mechanism. 8316-8320 - Wootaek Lim, Juhan Nam:
Enhancing Spatial Audio Generation with Source Separation and Channel Panning Loss. 8321-8325 - Qiujie Xie, Qiming Feng, Yuejie Zhang, Rui Feng, Tao Zhang, Shang Gao:
ControlCap: Controllable Captioning via No-Fuss Lexicon. 8326-8330 - Ting Wang, Zongkai Wu, Feiyu Yao, Donglin Wang:
Graph-Based Environment Representation for Vision-and-Language Navigation in Continuous Environments. 8331-8335 - Xin Sun, Xiangyu Ren, Xiaohao Xie:
A Novel Multimodal Sentiment Analysis Model Based on Gated Fusion and Multi-Task Learning. 8336-8340 - Yi Zhao, Chunyu Qiang, Hao Li, Yulan Hu, Wangjin Zhou, Sheng Li:
Enhancing Realism in 3D Facial Animation Using Conformer-Based Generation and Automated Post-Processing. 8341-8345 - Jiaqing He, Yanzhen Ren, Liming Zhai, Wuyang Liu:
FCC-MF: Detecting Violence in Audio-Visual Context with Frame-Wise Cluster Contrast and Modality-Stage Flooding. 8346-8350 - Shilong Wu, Chenxi Wang, Hang Chen, Yusheng Dai, Chenyue Zhang, Ruoyu Wang, Hongbo Lan, Jun Du, Chin-Hui Lee, Jingdong Chen, Sabato Marco Siniscalchi, Odette Scharenborg, Zhong-Qiu Wang, Jia Pan, Jianqing Gao:
The Multimodal Information Based Speech Processing (MISP) 2023 Challenge: Audio-Visual Target Speaker Extraction. 8351-8355 - Yuxin Guo, Shijie Ma, Yuhao Zhao, Hu Su, Wei Zou:
Cross Pseudo-Labeling for Semi-Supervised Audio-Visual Source Localization. 8356-8360 - Jongbhin Woo, Hyeonggon Ryu, Arda Senocak, Joon Son Chung:
Speech Guided Masked Image Modeling for Visually Grounded Speech. 8361-8365 - Rui Zhang, Jingyi Xu, Weidong Yang, Lipeng Ma, Menglong Chen, Ben Fei:
Learning Density Regulated and Multi-View Consistent Unsigned Distance Fields. 8366-8370 - Min Zheng, Chunpeng Wu, Yue Wang, Yantao Jia, Weiwei Liu, Long Lin, Shuai Chen, Fei Zhou:
Fast Cross-Modality Knowledge Transfer via a Contextual Autoencoder Transformation. 8371-8375 - Kaixuan Wu, Donglin Cao:
Evidence-Aware Multimodal Chinese Social Media Rumor Detection. 8376-8380 - Qiancheng Wei, Xiaoping Jiang, Ying Liu, Qiya Su, Muyao Yu:
Small Object Detection on the Water Surface Based on Radar and Camera Fusion. 8381-8385 - Jiangwei Deng, Yuhao An, Thomas H. Li, Shan Liu, Ge Li:
ScanPCGC: Learning-Based Lossless Point Cloud Geometry Compression using Sequential Slice Representation. 8386-8390 - Chaeyoung Jung, Suyeon Lee, Kihyun Nam, Kyeongha Rho, You Jin Kim, Youngjoon Jang, Joon Son Chung:
TalkNCE: Improving Active Speaker Detection with Talk-Aware Contrastive Learning. 8391-8395 - Yoonsoo Nam, Adam Lehavi, Daniel Yang, Digbalay Bose, Swabha Swayamdipta, Shrikanth Narayanan:
Does Video Summarization Require Videos? Quantifying the Effectiveness of Language in Video Summarization. 8396-8400 - Yun-Hao Yuan, Mingzhi Hao, Yun Li, Jipeng Qiang, Yi Zhu, Xiaobo Shen:
Learning Spectral Canonical ℱ-Correlation Representation for Face Super-Resolution. 8401-8405 - Yuhan Hao, Xin Jin, Dongyu Du:
Multi-Dimensional Geometric Feature-Based Calibration Method for LiDAR and Camera Fusion. 8406-8410 - Saki Mizuno, Nobukatsu Hojo, Kazutoshi Shinoda, Keita Suzuki, Mana Ihori, Hiroshi Sato, Tomohiro Tanaka, Naotaka Kawata, Satoshi Kobashikawa, Ryo Masumura:
Talking Face Generation for Impression Conversion Considering Speech Semantics. 8411-8415 - Mingyao Zhou, Wenjing Chen, Hao Sun, Wei Xie:
Cross-Modal Multiscale Difference-Aware Network for Joint Moment Retrieval and Highlight Detection. 8416-8420 - Yaru Chen, Ruohao Guo, Xubo Liu, Peipei Wu, Guangyao Li, Zhenbo Li, Wenwu Wang:
CM-PIE: Cross-Modal Perception for Interactive-Enhanced Audio-Visual Video Parsing. 8421-8425 - Jingshu Zhang, Yueru Chen, Guoqing Liu, Wei Gao, Ge Li:
Efficient Point Cloud Attribute Compression Framework using Attribute-Guided Graph Fourier Transform. 8426-8430 - Zhikai Hu, Yiu-Ming Cheung, Mengke Li, Weichao Lan, Donglin Zhang:
Key Points Centered Sparse Hashing for Cross-Modal Retrieval. 8431-8435 - Ruishan Huang, Pengpeng Yu, Shaolin Liao, Fan Liang:
Efficient Point Cloud Attribute Compression Using Rich Parallelizable Context Model. 8436-8440 - Biyun Yao, Wuzhen Shi:
Speaker-Centric Multimodal Fusion Networks for Emotion Recognition in Conversations. 8441-8445 - Jiarui Yang, Songpengcheng Xia, Yifan Song, Qi Wu, Ling Pei:
mmBaT: A Multi-Task Framework for mmWave-Based Human Body Reconstruction and Translation Prediction. 8446-8450 - Junwei Zhang, Shufeng Li, Libiao Jin, Wei Liu, Hing Cheung So:
Multi-Beam Multiplexing Design with Phase-Only Excitation Based on Hybrid Beamforming Architectures. 8451-8455 - Yanbin Zou, Liehu Wu, Yimao Sun:
Analysis of an Elliptic Localization Algorithm Using Fixed Point Iteration. 8456-8460 - Xue Xiong, Hao Liang, Bin Liao:
Robust Beamforming for DFRC Systems in Complex Environments. 8461-8465 - Zhengang Guo, Wei Dai:
Joint Multi-Band DOA Estimation Using Low-Rank Matrix Recovery. 8466-8470 - Yanan Wu, Andreas Jakobsson:
Adaptive Grid 2-D Direction of Arrival Estimation Method Using an Integrated Dictionary. 8471-8475 - Beichuan Tang, Yimao Sun, K. C. Ho, Lei Zhang, Yanbing Yang:
Multidimensional Scaling-Based TDOA Localization in Modified Polar Representation. 8476-8480 - Yifeng Xiong, Fan Liu, Marco Lops:
Generalized Deterministic-Random Tradeoff of Integrated Sensing and Communications: The Sensing-Optimal Operating Point. 8481-8485 - Haodong Guo, Hua Chen, Hongguang Lin, Wei Liu, Qing Shen, Gang Wang:
A New Fourth-Order Sparse Array Generator Based on Sum-Difference Co-Array Analysis. 8486-8490 - Sebastian Semper, J. Chuang, Samuel Berweger, Camillo Gentile:
Using Temporal Consistency for Compressed Sensing in High-Resolution mmWave Sounding. 8491-8495 - Md. Waqeeb T. S. Chowdhury, Yimin D. Zhang, Wei Liu, Maria S. Greco:
Identifiability Analysis of Sensor Arrays with Sensors off Half-Wavelength Grid. 8496-8500 - Liang Liu, Zhouchen Li, Jiancheng An, Lu Gan, Hongbin Li:
A CCM-Based Joint DOA-Frequency Estimation and Signal Recovery with Efficient Sub-Nyquist Sampling. 8501-8505 - Liang Liu, Zhouchen Li, Jiancheng An, Lu Gan, Hongbin Li:
DOA Estimation for Switch-Element Arrays Based on Sparse Representation. 8506-8510 - Noriyuki Tonami, Wataru Kohno, Sakiko Mishima, Yumi Arai, Reishi Kondo, Tomoyuki Hino:
Low-Rank Constrained Multichannel Signal Denoising Considering Channel-Dependent Sensitivity Inspired by Self-Supervised Learning for Optical Fiber Sensing. 8511-8515 - Jiaxiong Fang, Juan Liu, Hua Chen, Wei Liu, Ye Tian, Gang Wang:
Three-Dimensional Spatial-Temporal Near-Field Passive Localization Based on an Exact Spatial Propagation Model. 8516-8520 - Joseph S. Picard, Amitay Bar, Ronen Talmon:
Direct Position Determination by Covariance-Fitting on the Riemannian Manifold of Hermitian Positive Definite Matrices. 8521-8525 - Xiangtian Meng, Fenggang Yan, Maria Greco, Fulvio Gini, Ming Jin:
Reduced-Dimensional Decomposition and Eigenspace Reconstruction of Coherent Sources with Arbitrary Rectangle Arrays. 8526-8530 - Moein Ahmadi, Mohammad Alaee-Kerahroodi, Linlong Wu, Bhavani Shankar M. R., Björn E. Ottersten:
Detector Design for Distributed Multichannel Radar Sensors in Colored Interference Environments. 8531-8535 - Xueqin Luo, Jilu Jin, Gongping Huang, Yingke Zhao, Jingdong Chen, Jacob Benesty:
On the Design of Planar Differential Microphone Arrays with Specified Beamwidth or Sidelobe Level. 8536-8540 - Ching Hua Lee, Kashyap Patel, Chouchang Yang, Yilin Shen, Hongxia Jin:
An MVDR-Embedded U-Net Beamformer for Effective and Robust Multichannel Speech Enhancement. 8541-8545 - Xinghao Qu, Zhigang Shang, Gang Qiao, Jixing Qin, Xuerui Liu:
Subspace-Based Co-Array Processing For Nested Arrays without Eigendecomposition. 8546-8550 - Xiaohu Li, Jiawei Liu, Guorui Liao, Mingrui Yin, Shu Wang, Guoxin Su, Jun Liao, Li Liu:
Predicting Fall Events by a Spatio-Temporal Topological Network with Multiple Wearable Sensors. 8551-8555 - Xianzhen Guo, Qin Shi, Liang Liu, Shuowen Zhang:
User-Assisted Networked Sensing in OFDM Cellular Network with Erroneous Anchor Position Information. 8556-8560 - Jilu Jin, Xueqin Luo, Gongping Huang, Jingdong Chen, Jacob Benesty:
Beamforming Through Online Convex Combination of Differential Beamformers. 8561-8565 - Weixin Meng, Xiaoyu Li, Andong Li, Jian Li, Xiaodong Li, Chengshi Zheng:
All Neural Kronecker Product Beamforming for Speech Extraction with Large-Scale Microphone Arrays. 8566-8570 - Sebastian O. Jordan, Qiongxiu Li, Richard Heusdens:
Privacy-Preserving Distributed Optimisation using Stochastic PDMM. 8571-8575 - Junkai Ji, Wei Mao, Feng Xi, Shengyao Chen:
TransMUSIC: A Transformer-Aided Subspace Method for DOA Estimation with Low-Resolution ADCs. 8576-8580 - Chenkang Duan, Ye Tian, Wei Liu:
Deep Convolution Network Based Super Resolution DOA Estimation with Toeplitz and Sparse Prior. 8581-8585 - Kunkun SongGong, Pufen Zhang, Xiongwei Zhang, Meng Sun, Wenwu Wang:
Multi-Speaker Localization in the Circular Harmonic Domain on Small Aperture Microphone Arrays Using Deep Convolutional Networks. 8586-8590 - Ruiwa Sun, Congwei Feng, Huawei Chen:
Further Results on the Design Of Real-Valued Wideband Beamformers Using Adaptive-Array-Theory-Inspired Weighted Least Squares. 8591-8595 - Hugo Brehier, Arnaud Breloy, Chengfang Ren, Guillaume Ginolhac:
Through-The-Wall Radar Imaging With Wall Clutter Removal Via Riemannian Optimization On The Fixed-Rank Manifold. 8596-8600 - Mirza Asif Haider, Yimin D. Zhang, Elias Aboutanios:
Channel Estimation and Prediction in Wireless Communications Assisted by Semi-Passive RIS. 8601-8605 - Luning Lin, Hang Zheng, Sergiy A. Vorobyov, Chengwei Zhou, Zhiguo Shi:
Sensing-Aided Communication Channel Estimation with Tensor-Based Moving Target Localization. 8606-8610 - Wenzhe Lu, Mingyu Jiang, Heng Qiao:
Unified Analysis of Correlation-Aware Joint Sparse Support Recovery with ℓ0-Norm Constraint. 8611-8615 - Zihan Chen, Xiaolu Zeng, Xiaopeng Yang, Jiarong Zhao, Junbo Gong:
High-Resolution Through-Wall Imaging Using Data Fusion and Reasoning. 8616-8620 - Edoardo Focante, Nitin Jonathan Myers, Geethu Joseph, Ashish Pandharipande:
Situation-Aware Adaptive Transmit Beamforming for Automotive Radars. 8621-8625 - Tunç Alkanat, Ashish Pandharipande:
Automotive Radar Point Cloud Parametric Density Estimation using Camera Images. 8636-8640 - Shen Zhong, Zhongyu Li, Junjie Wu, Jianyu Yang:
A Novel 3-D Focusing Scheme for Distributed SAR Tomography. 8641-8645 - Jiancheng Liao, Xiaolu Zeng, Xiaopeng Yang, Zixiang Yin, Junbo Gong:
Block Adaptive Subspace Pursuit Method for Wall Clutter Mitigation. 8646-8650 - Weijie Chen, Yaling Deng, Chongtao Guo, Yuan Ma, Bin Liao:
Transmit Beampattern Optimization for MIMO-ISAC Systems with Hybrid Beamforming. 8651-8655 - Waradon Phokhinanan, Nicolas Obin, Sylvain Argentieri:
Auditory Cortex-Inspired Spectral Attention Modulation for Binaural Sound Localization in HRTF Mismatch. 8656-8660 - Kuranage Roche Rayan Ranasinghe, Hyeon Seok Rou, Giuseppe Thadeu Freitas de Abreu:
Fast and Efficient Sequential Radar Parameter Estimation in MIMO-OTFS Systems. 8661-8665 - Zongyu Wang, Yuhan Li, Yihan Su, Tianyao Huang, Yimin Liu:
Fundamental Limits of Direction Finding in Distributed Arrays Exploiting Auxiliary Sources. 8666-8670 - Andreas Jansson, Andreas Jakobsson:
Close-Range Direction of Arrival Estimation in the Presence of Clock Jitter. 8671-8675 - Sihan Wen, Zongyu Zhang, Chengwei Zhou, Zhiguo Shi:
Ziv-Zakai Bound for DOA Estimation with Gain-Phase Error. 8681-8685 - Yusun Shul, Jung-Woo Choi:
CST-Former: Transformer with Channel-Spectro-Temporal Attention for Sound Event Localization and Detection. 8686-8690 - Han Wang, Yiming Zhou, Eduardo Pérez, Florian Römer:
Jointly Learning Selection Matrices for Transmitters, Receivers and Fourier Coefficients in Multichannel Imaging. 8691-8695 - Yu Chen, Xinyuan Qian, Zexu Pan, Kainan Chen, Haizhou Li:
LOCSELECT: Target Speaker Localization with an Auditory Selective Hearing Mechanism. 8696-8700 - Biao Xue, Gong Zhang, Fulvio Gini, Maria S. Greco, Henry Leung:
An Optimized Interleaved OFDM Chirp Orthogonal Waveform Design for Dechirped Miniature MMW MIMO Radar. 8701-8705 - Wen-Qin Wang:
FDA-MIMO Radar Using Ambiguity Function for Target Two-Dimensional Localization. 8706-8710 - S. P. Tripathi, Bertrand Chapron, Fabrice Collard, Gilles Guitton, Manuel Lopez-Radcenco, Alexis Mouche, Ronan Fablet:
Deep Learning Inversion of Ocean Wave Spectrum from SAR Satellite Observations. 8711-8715 - Mohammadreza Bagheri Jazi, Seyed Mohammad Karbasi, Prabhu Babu:
Three-Dimensional Decoupled Atomic Norm Minimization. 8716-8720 - Eleftherios Kofidis:
Adaptive Joint Channel Estimation/Data Detection in Flexible Multicarrier MIMO Systems - A Tensor-Based Approach. 8721-8725 - Gerald C. Nwalozie, Damir Rakhimov, Martin Haardt:
Robust Near-Field Beamforming for Millimeter Wave Communication System with Aperture Perturbations. 8726-8730 - M. Hartenstein, F. Ollivier, F. Silva, P. Luizard:
Batch Substitution Calibration of a MEMS Microphone Array: Impact of Sensor Performance Dispersion on Directivity Estimation. 8731-8735 - Yu Zhang, Yue Wang, Zhipeng Cai, Fangqing Wen, Gong Zhang:
Harmonic Retrieval for Non-Circular Coherent Signals via Double Decoupled Atomic Norm Minimization. 8736-8740 - Ziyu Zhou, Wei Dai:
Multispectral RF Imaging Using Multiple Narrow-Band FMCW Signals. 8741-8745 - Menghong Cai, Bin Wang, Jun Fang:
Max-Min Beamforming for Multi-User Massive MIMO Systems: An Alternating Projection-Based Approach. 8746-8750 - Hanqin Gong, Dongheng Zhang, Jinbo Chen, Yadong Li, Guixin Xu, Yuqin Yuan, Yang Hu, Yan Chen:
Enabling Orientation-Free mmWave-Based Vital Sign Sensing with Multi-Domain Signal Analysis. 8751-8755 - Jinyi Yang, Lin Chen, Xue Jiang, Wei Liu:
Frequency-Domain Signal Reconstruction for Dynamic Time-Domain Weighting Hybrid Precoding with Beam Squint. 8756-8760 - Yuexian Wang, Qianyuan Shi, Chuang Han, Ling Wang, Chintha Tellambura:
Sparse Bayesian Learning-Based Direct Localization for Distributed Sensor Arrays with Unknown Gain and Phase Errors. 8761-8765 - Chunxuan Shi, Yongzhe Li, Ran Tao:
Fast Algorithm Design for the Constant-Envelope Precoding in Massive MIMO Communications with Interference Exploitation. 8766-8770 - Yuxuan Zhen, Chunxuan Shi, Yongzhe Li, Ran Tao:
Design of Spatial-Slow-Time Constant-Modulus Waveform Transmission and Receive Adaptive Filter for Dual-Function Radar Communications with Reconfigurable Intelligent Surface. 8771-8775 - Xiaonan Xu, Yongzhe Li, Ran Tao, Tao Shan:
OFDM Waveform Design with Good Correlation Level and Peak-to-Mean Envelope Power Ratio for the Joint MIMO Radar And Communications. 8776-8780 - Ilya Gurvich, Ido Leichter, Dharmendar Reddy Palle, Yossi Asher, Alon Vinnikov, Igor Abramovski, Vishak Gopal, Ross Cutler, Eyal Krupka:
A Real-Time Active Speaker Detection System Integrating an Audio-Visual Signal with a Spatial Querying Mechanism. 8781-8785 - Patitapaban Palo, Aurobinda Routray, Ritesh Chandra Tewari:
A Graph Neural Network Based Approach for Fault Delineation in Seismic Data using Graph Total Variation and Multigraph. 8786-8790 - Qi Zhang, Hong Jiang, Yunchang Liu:
Newtonalized Orthogonal Matching Pursuit for Mixed Far-Field and Near-Field Source Localization. 8791-8795 - Tianyi Liu, Sai Pavan Deram, Khaled Ardah, Martin Haardt, Marc E. Pfetsch, Marius Pesavento:
Gridless Parameter Estimation in Partly Calibrated Rectangular Arrays. 8796-8800 - Yongwei Huang, Jiachao Liang:
Joint Robust Optimal Transmit and Receive Beamforming Designs for a DFRC System for the MIMO Radar and Secondary Multicast Communication in a Cognitive Radio Network. 8801-8805 - Jens Gulin, Kalle Åström:
GCC-PHAT Re-Imagined - A U-Net Filter for Audio TDOA Peak-Selection. 8806-8810 - Wentao Shi, Tao Zhang, Baoqi Huang, Bing Jia:
Enhancing AoA Estimation Via Phase Modeling of Bluetooth 5 CTE Signals. 8811-8815 - Davide Berghi, Peipei Wu, Jinzheng Zhao, Wenwu Wang, Philip J. B. Jackson:
Fusion of Audio and Visual Embeddings for Sound Event Localization and Detection. 8816-8820 - Gil Geva, Olivier Warusfel, Shlomo Dubnov, Tammuz Dubnov, Amir Amedi, Yacov Hel-Or:
Binaural Sound Source Localization Using a Hybrid Time and Frequency Domain Model. 8821-8825 - Michael Shifrin, Joseph Tabrikian, Igal Bilik:
Identifiability Study of Near-Field Automotive SAR. 8826-8830 - Tianyi Xing, Yimao Sun, Lihua Ni, Xiangyu Peng, Kehao Zhang, Qun Wan:
Solution and Analysis For 3-D Localization In Closed-Form Integrating SA and TDOA Measurements. 8831-8835 - Maria Francis, K. V. S. Hari:
Selective User Forwarded Cell-Free Massive MIMO with Quantized Symbols. 8836-8840 - Sikai Ge, Zhiqiang Wei, Zai Yang:
Target Signal Power Improvement and Clutter Suppression via Beamforming for Integrated Sensing and Communication Systems. 8841-8845 - Weichao Zheng, Zai Yang:
Reweighted Atomic Norm Minimization for One-Bit Multichannel Spectral Compressed Sensing. 8846-8850 - Mahdi Koloushani, Mohammad Mahdi Naghsh, Mohammad Reza Taban, Seyed Mohammad Karbasi:
Multitarget Tracking in the Presence of Velocity Ambiguity for Automotive Radar. 8851-8855 - Tomer Raviv, Alon Goldmann, Ofek Vayner, Yair Be'ery, Nir Shlezinger:
CRC-Aided Learned Ensembles of Belief-Propagation Polar Decoders. 8856-8860 - Xing Zhang, Haiyang Zhang, Yonina C. Eldar:
Sparse Channel Representation and Estimation in Near Field Communications. 8861-8865 - Heedong Do, Namyoon Lee:
Global Optimization of Active RIS in Linear Time. 8866-8870 - Hao Zhang, Qingfeng Lin, Yang Li, Lei Cheng, Yik-Chung Wu:
Bayesian Activity Detection for Massive Connectivity in Cell-Free IoT Networks. 8871-8875 - Sriram Ganesan, Neelesh B. Mehta, Rimalapudi Sarvendranath:
A Novel Demodulation and Selection Pilot Power Trade-Off for Codebook-Based IRS with Imperfect Channel Estimates. 8876-8880 - Rui Zhou, Wenqiang Pu, Licheng Zhao, Ming-Yi You, Qingjiang Shi, Sergios Theodoridis:
Cooperative Sensing Via Matrix Factorization of the Partially Received Sample Covariance Matrix. 8881-8885 - Franz Weißer, Nurettin Turan, Dominik Semmler, Wolfgang Utschick:
Data-Aided Channel Estimation Utilizing Gaussian Mixture Models. 8886-8890 - Shuyan Ji, Constantinos Psomas, John Thompson:
Correlation-Based Machine Learning Techniques for Channel Estimation with Fluid Antennas. 8891-8895 - Heqiang Wang, Jie Xu:
Friends to Help: Saving Federated Learning from Client Dropout. 8896-8900 - Chunli Song, Xiaohua Chen, Wenqiu Zhu, Yucan Zhou, Xiaoyan Gu, Bo Li:
Meta-Knowledge Enhanced Data Augmentation for Federated Person Re-Identification. 8901-8905 - Xiaotong Zhao, Xi Wang, Juncheng Wang, Qingjiang Shi:
A Stochastic Proximal WMMSE for Ergodic Sum Rate Maximization. 8906-8910 - Wenqiang Pu, Jiawei Zhang, Rui Zhou, Xiao Fu, Mingyi Hong:
A Smoothed Bregman Proximal Gradient Algorithm for Decentralized Nonconvex Optimization. 8911-8915 - Lei Wang, Jieming Bian, Jie Xu:
Federated Learning with Instance-Dependent Noisy Label. 8916-8920 - Qian Xiang, Cong Sun, Danpu Liu:
Location Optimization for RIS Aided mmWave Downlink Network. 8921-8925 - Chin Choy Chai, Xiao-Ping Zhang:
Unified Probability Distributions of Generalized Composite Fading with Inverse-Type Distributions of Large-Scale Shadowing/Fluctuations. 8926-8930 - Zhiguo Wang, Jiageng Wu, Ya-Feng Liu, Fan Liu:
Globally Optimal Beamforming Design for Integrated Sensing and Communication Systems. 8931-8935 - Yuhao Liu, Xinyu Bian, Yizhou Xu, Tianqi Hou, Wenjie Wang, Yuyi Mao, Jun Zhang:
Decentralizing Coherent Joint Transmission Precoding Via Deterministic Equivalents. 8936-8940 - Jun Sun, Ye Yuan, Maria Sabrina Greco, Fulvio Gini, Wei Yi:
Anti-Deception Jamming Power Optimization Strategy for Multi-Target Tracking Tasks in Multi-Radar Systems. 8941-8945 - Jiaojiao Zhang, Jiang Hu, Mikael Johansson:
Composite Federated Learning with Heterogeneous Data. 8946-8950 - Zhongyuan Zhao, Jake B. Perazzone, Gunjan Verma, Santiago Segarra:
Congestion-Aware Distributed Task Offloading in Wireless Multi-Hop Networks Using Graph Neural Networks. 8951-8955 - Wenhai Lai, Kaiming Shen:
Blind Beamforming for Intelligent Reflecting Surface: A Reinforcement Learning Approach. 8956-8960 - Dario Tagliaferri, Marouan Mizmizi, Silvia Mura, Umberto Spagnolini:
RIS Localization and Spatially Wideband Filtering Effects. 8961-8965 - Jinghui Guan, Rui Zhou, Wenqiang Pu, Qingjiang Shi, Tsung-Hui Chang:
A Robust GLRT Detector Against Missing Data in Cooperative Sensing. 8966-8970 - Emile Ghizzo, Axel Garcia Pena, Julien Lesouple, Carl Milner, Christophe Macabiau:
Assessing GNSS Carrier-to-Noise-Density Ratio Estimation in The Presence of Meaconer Interference. 8971-8975 - Yihao Chen, Bin Tan, Jun Wu, Die Hu:
PJSCC: A Puncturing-Based Joint Source Channel Coding Scheme with Hierarchical Down-Sampling Layer. 8976-8980 - Clayton A. Harper, Mitchell A. Thornton, Eric C. Larson:
Learnable Statistical Moments Pooling for Automatic Modulation Classification. 8981-8985 - Zhaoyi Xu, Athina P. Petropulu:
Time-Modulated Intelligent Reflecting Surface for Waveform Security. 8986-8990 - Leah Woldemariam, Hang Liu, Anna Scaglione:
Low-Complexity Vector Source Coding for Discrete Long Sequences with Unknown Distributions. 8991-8995 - Spyridon Peppas, Nicholas D. Sidiropoulos:
Binary Signal Alignment: Optimal Solution is Polynomial-Time and Linear-Time Solution is Quasi-Optimal. 8996-9000 - Tzu-Hsuan Huang, Yeong-Luh Ueng:
A Binary BP Decoding Using Posterior Adjustment for Quantum LDPC Codes. 9001-9005 - Qianru Wang, Qingyang Li, Bin Guo, Jiangtao Cui:
Efficient Federated Learning with Smooth Aggregation for Non-IID Data from Multiple Edges. 9006-9010 - Mengying Sun, Wanli Ni, Xiaodong Xu, Xiaofeng Tao:
Deep Reinforcement Learning for Energy Minimization in Multi-RIS-Aided Cell-Free MEC Networks. 9011-9015 - Lunan Sun, Caili Guo, Mingzhe Chen, Yang Yang:
Privacy-Aware Joint Source-Channel Coding For Image Transmission Based On Disentangled Information Bottleneck. 9016-9020 - Daniel Bonilla Licea, Giuseppe Silano, Mounir Ghogho, Martin Saska:
Omnidirectional Multi-Rotor Aerial Vehicle Pose Optimization: A Novel Approach to Physical Layer Security. 9021-9025 - Emanuele Peschiera, Xavier Mestre, François Rottenberg:
Energy-Saving Cell-Free Massive MIMO Precoders with a per-AP Wideband Kronecker Channel Model. 9026-9030 - Michael Baur, Nurettin Turan, Benedikt Fesl, Wolfgang Utschick:
Channel Estimation in Underdetermined Systems Utilizing Variational Autoencoders. 9031-9035 - Qian Zhang, Mingjie Shao, Qiang Li, Ju Liu:
An Efficient Algorithm for Multiuser Sum-Rate Maximization of Large-Scale Active RIS-Aided MIMO System. 9036-9040 - Jiawang Zeng, Deepak Mishra, Hassan Habibi Gharakheili, Aruna Seneviratne:
Secure Energy Efficiency Fairness Maximization in Backscatter Throughput Constrained UAV-Assisted Data Collection. 9041-9045 - Pengfei Yin, Dongheng Zhang, Tianyu Zhang, Shuai Yang, Guanzhong Wang, Yang Hu, Yan Chen:
AutoCali: Enhancing AoA-based Indoor Localization through Automatic Phase Calibration. 9046-9050 - Kexin Huang, Chaohua Shi, Lu Gan, Hongqing Liu:
Understanding Gaussian Noise Mismatch: A Hellinger Distance Approach. 9051-9055 - Itsik Bergel:
Deep Optimization of Relay Networks-Using Relays as Neurons. 9056-9060 - Weijun Zhang, Hao Han, Mingwei Li, Yulong Tian:
Towards Faster End-to-End Data Transmission Over Voice Channels. 9061-9065 - Shenjian Wang, Shuichi Ohno:
One-bit Quantization Robust to Angle-of-Arrivals for Uniform Linear Antenna Array. 9066-9070 - Jingran Lin, Weijie Xiong, Qiang Li, Xiangze Kong, Yuhan Zhang:
Joint Admission Control and Beamformer Design for Mobile Users: Stay Here or Move to a Better Position? 9071-9075 - Jiayu Mao, Aylin Yener:
Personalized Over-The-Air Federated Learning with Personalized Reconfigurable Intelligent Surfaces. 9076-9080 - Zhihao Tao, Zhaoyi Xu, Athina P. Petropulu:
How Secure is the Time-Modulated Array-Enabled OFDM Directional Modulation? 9081-9085 - Diego Cuevas, Javier Álvarez-Vizoso, Mikel Gutiérrez, Ignacio Santamaría, Vít Tucek, Gunnar Peters:
Hardware Impairments-Aware Design of noncoherent Grassmannian Constellations. 9086-9090 - Xilai Fan, Ya-Feng Liu, Bo Jiang:
Joint Beamforming and Compression Design for Per-Antenna Power Constrained Cooperative Cellular Networks. 9091-9095 - Emrecan Kutay, Aylin Yener:
Classification-Oriented Semantic Wireless Communications. 9096-9100 - Mohammad NaseriTehrani, Mohammad Javad Salehi, Antti Tölli:
Multicast Transmission Design With Enhanced DOF For MIMO Coded Caching Systems. 9101-9105 - Zhaohui Yang, Mingzhe Chen, Yuchen Liu, Zhaoyang Zhang:
Optimizing Synchronization Delay for Digital Twin over Wireless Networks. 9106-9110 - Yaela Gabay, Nir Shlezinger, Tirza Routtenberg, Yasaman Ghasempour, George C. Alexandropoulos, Yonina C. Eldar:
Leaky Waveguide Antennas for Downlink Wideband THz Communications. 9111-9115 - Or Ohev Shalom, Amir Leshem, Waheed U. Bajwa:
Mitigating Data Injection Attacks on Federated Learning. 9116-9120 - Robert M. Oliveira, Rodrigo C. de Lamare:
Adaptive Reweighted Sparse Belief Propagation Decoding for Polar Codes. 9121-9125 - Yangming Lai, Musa Furkan Keskin, Henk Wymeersch, Luca Venturino, Wei Yi, Lingjiang Kong:
Subspace-Based Detection in OFDM ISAC Systems Under Different Constellations. 9126-9130 - M. Amin Manouchehrpour, Timothy N. Davidson:
Joint Computing and Communication Resource Allocation for TDMA-Based Binary Computation Offloading. 9136-9140 - Panagiotis N. Gavriilidis, Italo Atzeni, George C. Alexandropoulos:
Metasurface-Based Receivers with 1-bit ADCs for Multi-User Uplink Communications. 9141-9145 - Chong Zhang, Min Dong, Ben Liang, Ali Afana, Yahia Ahmed:
Multi-Model Wireless Federated Learning with Downlink Beamforming. 9146-9150 - Xin Zhu, Hongyi Pan, Salih Atici, Ahmet Enis Çetin:
Stein Variational Gradient Descent-Based Detection for Random Access with Preambles in MTC. 9151-9155 - Zhaoye Pan, Haoqi Yang, Huikang Liu:
Utilizing Second-Order Information in Noisy Information-Sharing Environments for Distributed Optimization. 9156-9160 - Tianyu Fang, Yijie Mao:
Optimal Beamforming Structure for Rate Splitting Multiple Access. 9161-9165 - Yangrui Dong, Fan Li, Cunyan Ma, Chen He, Z. Jane Wang:
UAV-Based Dynamic Object Tracking with Radio Map. 9166-9170 - Mingxiao Li, Rui Jin, Liyao Xiang, Kaiming Shen, Shuguang Cui:
CROSSWORD: A Semantic Approach To Text Compression Via Masking. 9171-9175 - Mengting Chen, Ziping Zhao:
Joint Blind Deconvolution And Demixing Of Sparse Signals Via Factorization And Nonconvex Optimization. 9176-9180 - Sourajit Das, Navid Naderializadeh, Alejandro Ribeiro:
State-Augmented Information Routing In Communication Systems With Graph Neural Networks. 9181-9185 - Zhenqiao Cheng, Nanxi Li, Jianchi Zhu, Xiaoming She, Chongjun Ouyang, Peng Chen:
Enabling Secure Wireless Communications via Movable Antennas. 9186-9190 - Girim Kwon, Zhenyu Liu, Andrea Conti, Hyuncheol Park, Moe Z. Win:
Integrated Localization and Communication in 3GPP Industrial Environments. 9191-9195 - Yaru Zhao, Yakun Huang:
SemDA: Communication-Efficient Data Aggregation Through Distributed Semantic Transmission. 9196-9200 - Ehsan Lari, Vinay Chakravarthi Gogineni, Reza Arablouei, Stefan Werner:
On The Resilience Of Online Federated Learning To Model Poisoning Attacks Through Partial Sharing. 9201-9205 - Hongbin Zhu, Hua Qian:
Optimal Structure of Receive Beamforming for over-The-Air Computation. 9206-9210 - Cunyan Ma, Xiaoya Li, Yangrui Dong, Chen He:
Coverage Analysis For mmWave UAV Networks with Static and Dynamic Blockages. 9211-9215 - Anubhab Chowdhury, Chandra R. Murthy:
Pilot Length Minimization via AP-UE Clustering in Cell-Free Systems. 9216-9220 - William W. Zheng, Jamison R. Ebert, Stefano Rini, Jean-François Chamberland:
Coding for the Unsourced B-Channel with Erasures: Enhancing the Linked Loop Code. 9221-9225 - Wai-Yiu Keung, Yatao Liu, Wing-Kin Ma:
Robust Symbol-Level Precoding via a Symbol-Perturbed Zero-Forcing Structure. 9226-9230 - Yujun Cheng, Zhewei Zhang, Shengjin Wang:
FED-SDS: Adaptive Structured Dynamic Sparsity for Federated Learning Under Heterogeneous Clients. 9231-9235 - Martin Andersson, Tung Thanh Vu, Pål K. Frenger, Erik G. Larsson:
Uplink Symbol Detection in Dynamic TDD MIMO Systems with AP-AP Interference. 9236-9240 - Umar Rashid, Rafay Chughtai:
Scaling Results for Robust Distributed Estimation in Sensor Networks Using Order Statistics. 9241-9245 - Elsa Rizk, Kun Yuan, Ali H. Sayed:
Asynchronous Diffusion Learning with Agent Subsampling and Local Updates. 9246-9250 - Wai-Yiu Keung, Hei Victor Cheng, Wing-Kin Ma:
Transmitting Data Through Reconfigurable Intelligent Surface: A Spatial Sigma-Delta Modulation Approach. 9251-9255 - Shima Eslami, Bikshapathi Gouda, Antti Tölli:
Near-Field MIMO Channel Reconstruction Via Limited Geometry Feedback. 9256-9260 - Alexandra Gallyas-Sanhueza, Gian Marti, Victoria M. T. Palhares, Reinhard Wiesmayr, Christoph Studer:
LoFi User Scheduling for Multiuser MIMO Wireless Systems. 9261-9265 - Amus Chee Yuen Goay, Deepak Mishra, Ross D. Murch, Aruna Seneviratne:
Tag Antenna Structure Calibrated Backscattering Signal Detection. 9266-9270 - Zhenyu Liu, Stefano Maranò, Moe Z. Win:
An Asymptotically Achievable Rate Bound for Establishing High-Fidelity Entanglements in Quantum Networks. 9271-9275 - Zhaorui Guo, Jiyan Sun, Jiadong Fu, Lu Yuan, Shangyuan Zhuang, Liru Geng, Yinlong Liu, Wei Ma:
Fast and Accurate Root Cause Analysis Based on Signalling Messages for 5G Networks. 9276-9280 - Chaohao Fu, Weijia Jia, Na Ruan:
Client-Free Federated Unlearning via Training Reconstruction with Anchor Subspace Calibration. 9281-9285 - Abdulaziz Al-Amodi, Nour Kouzayha, Nasir Saeed, Mudassir Masood, Tareq Y. Al-Naffouri:
Energy Efficient Wake-Up Solution for Large-Scale Internet of Underwater Things Networks. 9286-9290 - Yifan Wu, Michael B. Wakin, Peter Gerstoft, Yongsung Park:
Non-Uniform Frequency Spacing for Regularization-Free Gridless DOA. 9291-9295 - Keita Kume, Isao Yamada:
A Variable Smoothing for Nonconvexly Constrained Nonsmooth Optimization with Application to Sparse Spectral Clustering. 9296-9300 - Ziang Li, Kailun Wu, Yiwen Guo, Changshui Zhang:
Learned ISTA with Error-Based Thresholding for Adaptive Sparse Coding. 9301-9305 - Stefanie Horstmann, David Ramírez, Peter J. Schreier:
Multistatic Passive Detection of Cyclostationary Signals. 9306-9310 - Yi-Peng Wang, Wei-Ta Chu:
Multiple Player Tracking With 3D Projection and Spatio-Temporal Information In Multi-View Sports Videos. 9311-9315 - Valentina Shumovskaia, Mert Kayaalp, Ali H. Sayed:
Distributed Decision-Making for Community Structured Networks. 9316-9320 - Soo-Chang Pei, Kuo-Wei Chang:
Shift Operator and Separation Filter for Different Period Mixed Signals Using Companion Matrix. 9321-9325 - Soo-Chang Pei, Kuo-Wei Chang:
Diagonalize Integral Graph by DCT. 9326-9330 - Runhua Wang, Qing Ling, Zhi Tian:
D3: Dual-Domain Defenses for Byzantine-Resilient Decentralized Resource Allocation. 9331-9335 - Haoxiang Ye, Heng Zhu, Qing Ling:
On the Tradeoff Between Privacy Preservation and Byzantine-Robustness in Decentralized Learning. 9336-9340 - Yan Xing, Qi'ao Xu, Jingcheng Zeng, Rui Huang, Sihua Gao, Weifeng Xu, Yuxiang Zhang, Wei Fan:
Cross Branch Feature Fusion Decoder for Consistency Regularization-Based Semi-Supervised Change Detection. 9341-9345 - Lorenzo Ortega, Stefano Fortunati:
Misspecified Time-Delay and Doppler Estimation over Non Gaussian Scenarios. 9346-9350 - Marco Carpentiero, Virginia Bordignon, Vincenzo Matta, Ali H. Sayed:
Social Learning with Adaptive Models. 9351-9355 - Rui Huang, Qingyi Zhao, Yan Xing, Sihua Gao, Weifeng Xu, Yuxiang Zhang, Wei Fan:
A Saliency Enhanced Feature Fusion Based Multiscale RGB-D Salient Object Detection Network. 9356-9360 - Samuel Pinilla, Siu Lun Yeung, Jeyan Thiyagalingam:
Global Convergence of Alternating Direction Method of Multipliers for Invex Objective Losses. 9361-9365 - Amir Weiss, Yuval Kochman, Gregory W. Wornell:
A Joint Data Compression and Time-Delay Estimation Distributed Systems via Extremum Encoding. 9366-9370 - Hang Liu, Anna Scaglione, Sean Peisert:
Privacy Leakage In Graph Signal To Graph Matching Problems. 9371-9375 - Zikai Wang, Baojiang Zhong, Kai-Kuang Ma:
Ellipse Detection Based on Contrast-Guided Arc Enhancement. 9376-9380 - Syed Ahmed Pasha, Victor Solo:
Vector Nonlinear Hawkes Model with Inhibition. 9381-9385 - Kenta Yanagiya, Junya Hara, Hiroshi Higashi, Yuichi Tanaka, Antonio Ortega:
Lossy Compression of Adjacency Matrices by Graph Filter Banks. 9386-9390 - Guangtong Zhang, Qihua Liang, Zhiyi Mo, Ning Li, Bineng Zhong:
Visual Adapt for RGBD Tracking. 9391-9395 - Jiuqiang Li, Shilei Zhu:
Multi-View Interactive Compromise Learning for Group Recommendation. 9396-9400 - Victor M. Tenorio, Samuel Rey, Antonio G. Marques:
Blind Deconvolution of Sparse Graph Signals in the Presence of Perturbations. 9406-9410 - Yuval Haitman, Joseph M. Francos:
Mesh-RTUME: Universal Manifold Embedding for Estimating 3D Rigid Transformations of Surfaces. 9411-9415 - Thu Ha Phi, Alexandre Hippert-Ferrer, Florent Bouchard, Arnaud Breloy:
Robust Low-Rank Correlation Fitting. 9416-9420 - Matthew Callahan, Trung Vu, Raviv Raich:
Provable Randomized Coordinate Descent for Matrix Completion. 9421-9425 - Han Shen, Santiago Paternain, Gaowen Liu, Ramana Kompella, Tianyi Chen:
A Method for Bilevel Optimization with Convex Lower-Level Problem. 9426-9430 - Niruhan Viswarupan, Gene Cheung, Fengbo Lan, Michael S. Brown:
Mixed Graph Signal Analysis of Joint Image Denoising / Interpolation. 9431-9435 - Haohe Li, Chong Wang, Shenghao Yu, Zheng Huo, Yujie Zheng, Jiangbo Qian:
Zero-Shot Object Detection with Partitioned Contrastive Feature Alignment. 9436-9440 - Asuka Tamaru, Junya Hara, Hiroshi Higashi, Yuichi Tanaka, Antonio Ortega:
Optimizing k in kNN Graphs with Graph Learning Perspective. 9441-9445 - Zhenhuan Xu, Yongfei Wu, Liming Zhang, Yidi Li:
Adaptive Fourier Decomposition Based Signal Extraction on Weak Electromagnetic Field. 9446-9450 - Zhenchang Xia, Guanqun Zheng, Shengwu Xiong, Jia Wu, Junyin Wang, Chenghu Du:
IFNET: Integrating Data Augmentation and Decoupled Attention Fusion for 3D Object Detection. 9451-9455 - Yi Zhang, Isao Yamada:
Computing an Entire Solution Path of a Nonconvexly Regularized Convex Sparse Model. 9456-9460 - Jiaojiao Zhang, Dominik Fay, Mikael Johansson:
Dynamic Privacy Allocation for Locally Differentially Private Federated Learning with Composite Objectives. 9461-9465 - Duc Thien Nguyen, Konstantinos Slavakis:
Multi-Linear Kernel Regression and Imputation via Manifold Learning: The Dynamic MRI Case. 9466-9470 - Kyohei Suzuki, Masahiro Yukawa:
External Division of Two Proximity Operators: An Application to Signal Recovery with Structured Sparsity. 9471-9475 - Xi Yao, Wei Dai:
Accelerated Recovery of Spectrally Sparse Signals via Modified Proximal Gradient in Hankel Space. 9476-9480 - Xue Yang, Yue Zhou, Wenlong Liao, Tao He, Junchi Yan:
AlphaRotate: A Rotation Detection Benchmark Using TensorFlow. 9481-9485 - Roman Jacome, Edwin Vargas, Kumar Vijay Mishra, Brian M. Sadler, Henry Arguello:
Multi-Antenna ISAC Receiver with n-Tuple Blind Deconvolution. 9486-9490 - Yiran Yang, Liyan Xie:
Sequential Wasserstein Uncertainty Sets for Minimax Robust Online Change Detection. 9491-9495 - Wenyi Yan, Lu Gan, Shaoqing Hu, Hongqing Liu:
Towards Optimized Multi-Channel Modulo-ADCs: Moduli Selection Strategies and Bit Depth Analysis. 9496-9500 - Mor Oren-Loberman, Vered Azar, Wasim Huleihel:
Online Auditing of Information Flow. 9501-9505 - Fangqing Xiao, Dirk Slock:
Parameter Estimation Via Expectation Maximization - Expectation Consistent Algorithm. 9506-9510 - Yair Sorek, Koby Todros:
Robust Regression Analysis Based on the K-Divergence. 9511-9515 - Rohan T. Money, Joshin Krishnan, Baltasar Beferull-Lozano, Elvin Isufi:
Evolution Backcasting of Edge Flows From Partial Observations Using Simplicial Vector Autoregressive Models. 9516-9520 - Yaoqi Hu, Axi Niu, Yu Zhu, Qingsen Yan, Jinqiu Sun, Yanning Zhang:
Multiple Object Tracking Based on Occlusion-Aware Embedding Consistency Learning. 9521-9525 - Fei Chen, Gene Cheung, Xue Zhang:
Soft Image Segmentation Using Gradient Graph Laplacian Regularizer. 9526-9530 - Takayuki Sasaki, Yukihiro Bandoh, Masaki Kitahara:
Sparse Regularization Based on Reverse Ordered Weighted L1-Norm and Its Application to Edge-Preserving Smoothing. 9531-9535 - Mingyu Jiang, Wenzhe Lu, Heng Qiao:
A New Perspective on Understanding Resolution Limit Via an Asymptotic Study of Christoffel-Darboux Kernel Based Spectrum Estimator. 9536-9540 - Yodai Suzuki, Ryosuke Isono, Shunsuke Ono:
A Convergent Primal-Dual Deep Plug-and-Play Algorithm for Constrained Image Restoration. 9541-9545 - Julien Valognes, Maria A. Amer:
Ranking of Visual Trackers Using Robust Error Norms. 9546-9550 - Garweet Sresth, Ajit Rajwade, Satish Mulleti:
Unlabelled Sensing with Priors: Algorithm and Bounds. 9551-9555 - Fred Goodyer, Bashar I. Ahmad, Simon J. Godsill:
Reversible Jump Markov Chain Monte Carlo for Pulse Fitting. 9556-9560 - Jiahui Pan, Pengjie Shen, Hui Zhang, Xueliang Zhang:
Efficient Multi-Channel Speech Enhancement with Spherical Harmonics Injection for Directional Encoding. 9561-9565 - Kaushani Majumder, Sibi Raj B. Pillai, Yonina C. Eldar, Satish Mulleti:
Adaptive Sensor Selection with Deterministic Priors for DoA Tracking. 9566-9570 - Andreas G. Angelou, Georgios K. Apostolidis, Leontios J. Hadjileontiadis:
Dynamic Bandwidth Variational Mode Decomposition. 9571-9575 - Zhengdao Yuan, Qinghua Guo, Yonina C. Eldar, Yonghui Li:
Unitary Approximate Message Passing for Matrix Factorization. 9576-9580 - Yue Huang, Zhaoxian Wu, Qing Ling:
On the Convergence of Single-Timescale Multi-Sequence Stochastic Approximation Without Fixed Point Smoothness. 9581-9585 - Eyal Fishel Ben, Nikita Tsarov, Tslil Tapiro, Itay Nuri, Nir Shlezinger:
Learn to Track-Before-Detect via Neural Dynamic Programming. 9586-9590 - Quentin Laborde, Antoine Mazarguil, Laurent Oudre:
Graph Local-Smooth Dictionary Learning. 9591-9595 - Mohamed Akrout, Tiancheng Gao, Faouzi Bellili, Amine Mezghani:
Vector Approximate Message Passing with Arbitrary I.I.D. Noise Priors. 9596-9600 - Tianyi Li, Geert Leus:
Finding Representative Sampling Subsets on Graphs via Submodularity. 9601-9605 - Rongyao Hu, Xinyu Yuan, Yan Qiao, Benchu Zhang, Pei Zhao:
Unsupervised Anomaly Detection for Multivariate Time Series Using Diffusion Model. 9606-9610 - Mukilan Karuppasamy, Mohamed Akrout, Faouzi Bellili, Amine Mezghani:
Distributed Vector Approximate Message Passing. 9611-9615 - Jie Chang, Huhe Dai, Yuan Zheng:
CAG-FPN: Channel Self-Attention Guided Feature Pyramid Network for Object Detection. 9616-9620 - Martin Voigt Vejling, Christophe A. N. Biscio, Petar Popovski:
Multi-Sensor Multi-Scan Radar Sensing of Multiple Extended Targets. 9621-9625 - Marom Dadon, Wasim Huleihel, Tamir Bendory:
Statistical and Computational Limits of Detecting and Recovering Hidden Submatrices. 9626-9630 - Ya Liu, Junbin Liu, Wing-Kin Ma:
Cardinality-Constrained Binary Quadratic Optimization via Extreme Point Pursuit, with Application to the Densest K-Subgraph Problem. 9631-9635 - Yuliang Zhu, Ruiming Guo, Peiyu Zhang, Ayush Bhandari:
Frequency Estimation via Sub-Nyquist Unlimited Sampling. 9636-9640 - Haoxiang Ye, Qing Ling:
On the Generalization Error of Byzantine-Resilient Decentralized Learning. 9641-9645 - Saghar Bagheri, Gene Cheung, Tim Eadie, Antonio Ortega:
Joint Signal Interpolation / Time-Varying Graph Estimation Via Smoothness and Low-Rank Priors. 9646-9650 - Zihao He, Qianyu Shu, Jinming Wen, Hing Cheung So:
A Novel Iterative Thresholding Algorithm for Arctangent Regularization Problem. 9651-9655 - Shanshan Zou, Ziping Zhao:
Large Covariance Matrix Estimation Based on Factor Models via Nonconvex Optimization. 9656-9660 - Semin Kwak, Laura Shimabukuro, Antonio Ortega:
Frequency Analysis and Filter Design for Directed Graphs with Polar Decomposition. 9661-9665 - Xiaoping Liu, Gong Chen, Jun Shi, Ran Tao:
Signal Reconstruction from Nonideal Samples in Fractional Fourier Transform Domain. 9666-9670 - Yuening Li, Xiao Fu, Wing-Kin Ma:
Probabilistic Simplex Component Analysis via Variational Auto-Encoding. 9671-9675 - Ori Ohayon, Arie Yeredor:
Blind Separation of Noisy Mixtures Over Galois Fields. 9677-9680 - Panagiotis Misiakos, Vedran Mihal, Markus Püschel:
Learning Signals and Graphs from Time-Series Graph Data with Few Causes. 9681-9685 - Saulo Cardoso Barreto, Julien Flamant, Sebastian Miron, David Brie:
Physically-Constrained Block-Term Tensor Decomposition for Polarimetric Image Recovery. 9686-9690 - Jordi Pérez-Guijarro, Alba Pagès-Zamora, Javier R. Fonollosa:
Extension of Clifford Data Regression Methods for Quantum Error Mitigation. 9691-9695 - Wataru Yata, Isao Yamada:
Imposing Early and Asymptotic Constraints on LiGME with Application to Nonconvex Enhancement of Fused Lasso Models. 9696-9700 - Benjamin Cox, Sara Pérez-Vieites, Nicolas Zilberstein, Martin Sevilla, Santiago Segarra, Víctor Elvira:
End-to-End Learning of Gaussian Mixture Proposals Using Differentiable Particle Filters and Neural Networks. 9701-9705 - Sara El Bouch, Jérôme Galy, Eric Chaumette, Jordi Vilà-Valls:
A Modified Cramér-Rao Bound for Discrete-Time Markovian Dynamic Systems. 9706-9710 - Gal Shtendel, Ayush Bhandari:
Dual-Channel Unlimited Sampling for Bandpass Signals. 9711-9715 - Jasin Machkour, Arnaud Breloy, Michael Muma, Daniel P. Palomar, Frédéric Pascal:
Sparse PCA with False Discovery Rate Controlled Variable Selection. 9716-9720 - Martin Sevilla, Santiago Segarra:
Bayesian Topology Inference on Partially Known Networks from Input-Output Pairs. 9721-9725 - Pradyumna Pradhan, Shaik Basheeruddin Shah, Ramunaidu Randhi, Yonina C. Eldar:
Recursive-Tail-FISTA for Sparse Signal Recovery. 9726-9730 - Abijith Jagannath Kamath, Chandra Sekhar Seelamantula:
Neuromorphic Sensing Meets Unlimited Sampling. 9731-9735 - Xiuheng Wang, Ricardo Augusto Borsoi, Cédric Richard:
Riemannian Diffusion Adaptation over Graphs with Application to Online Distributed PCA. 9736-9740 - Roshaan Soundarapandian, Amitalok J. Budkuley, Stefano Rini:
On Time-Encoded Sampling for Multigenerator Shift Invariant Spaces. 9741-9745 - Alexander Möllers, Alexander Immer, Vincent Fortuin, Elvin Isufi:
Hodge-Aware Contrastive Learning. 9746-9750 - Michael Scholkemper, Damin Kühn, Gerion Nabbefeld, Simon Musall, Björn Kampa, Michael T. Schaub:
A Wasserstein Graph Distance Based on Distributions of Probabilistic Node Embeddings. 9751-9755 - Fernando Llorente, Petar M. Djuric:
Dynamic Random Feature Gaussian Processes for Bayesian Optimization of Time-Varying Functions. 9756-9760 - Aarthi Venkat, Joyce A. Chew, Ferran Cardoso Rodriguez, Christopher J. Tape, Michael Perlmutter, Smita Krishnaswamy:
Directed Scattering for Knowledge Graph-Based Cellular Signaling Analysis. 9761-9765 - Dmitriy Shutin:
Asymptotic Behavior of Super-Resolution Sparse Bayesian Learning. 9766-9770 - Pedro Izquierdo Lehmann, Aline Xavier, Marcelo E. Andia, Carlos A. Sing-Long:
Exact Classification of NMR Spectra from NMR Signals. 9771-9775 - Lang Liu, Zaïd Harchaoui:
The Rao, Wald, and Likelihood-Ratio Tests under Generalized Self-Concordance. 9776-9780 - Hanyang Jiang, Yao Xie:
A Graph-Prediction-Based Approach for Debiasing Underreported Data. 9781-9785 - Xinhui Rong, Victor Solo:
Symmetric VAR(1) Modelling with Guaranteed Stability. 9786-9790 - Seonho Kim, Kiryung Lee:
Sequence of Linear Program for Robust Phase Retrieval. 9791-9795 - Eisuke Yamagata, Shunsuke Ono:
Risk-Managed Sparse Index Tracking Via Market Graph Clustering. 9796-9800 - Darukeesan Pakiyarajah, Eduardo Pavez, Antonio Ortega:
Irregularity-Aware Bandlimited Approximation for Graph Signal Interpolation. 9801-9805 - John Shi, José M. F. Moura:
Graph Signal Processing: The 2D Companion Model. 9806-9810 - Rod Rofougaran, Shinjae Yoo, Huan-Hsin Tseng, Samuel Yen-Chi Chen:
Federated Quantum Machine Learning with Differential Privacy. 9811-9815 - Jeongmin Chae, Praneeth Narayanamurthy, Selin Bac, Shaama Mallikarjun Sharada, Urbashi Mitra:
Sketched Column-Based Matrix Approximation With Side Information. 9816-9820 - Paris A. Karakasis, Nicholas D. Sidiropoulos:
Multivariate Density Estimation Using Low-Rank Fejér-Riesz Factorization. 9821-9825 - Hongwei Wang, Xi Zheng, Hongbin Li:
Kalman Filtering With Unlimited Sensing. 9826-9830 - Jie Li, Yan Huang, Qihui Wu, Arye Nehorai:
A Riemannian-Based Joint Design Framework of MIMO Radar Transmit Waveform and Receive Filter Via Information Theory. 9831-9835 - Zhaoye Pan, Xiaolu Wang, Huikang Liu, Jun Zhang:
An Efficient Hierarchical Block Coordinate Descent Method for Time-Varying Graphical Lasso. 9836-9840 - Skyepaphora Griffith, Glen Takahara, Wesley S. Burr:
Spectrogram Smoothing for Estimation of the Evolutionary Spectra of Uniformly Modulated Processes. 9841-9845 - Bing Wang, Hangbin Ye, Xingpeng Zhang, Dong He, Xin Wang, Qiuli Wang, Chunlan Zhao:
Object Correlation Matrix for Two-Stage Object Detection Network. 9846-9850 - Liangqi Zhong, Shengye Yan:
Self Knowledge Distillation Based On Layer-Wise Weighted Feature Imitation For Efficient Object Detection. 9851-9855 - James Shiniti Nagai, Ivan G. Costa, Michael T. Schaub:
Optimal Transport Distances for Directed, Weighted Graphs: A Case Study With Cell-Cell Communication Networks. 9856-9860 - Andrei Buciulea, Elvin Isufi, Geert Leus, Antonio G. Marques:
Learning Graphs and Simplicial Complexes from Data. 9861-9865 - Nacer Yousfi, Karim Abed-Meraim, Yosra Marnissi, Maxime Leiber, Mohamed El Badaoui:
Neural Network-Based Symbolic Regression for Empirical Modeling of the Behavior of a Planetary Gearbox. 9866-9870 - Morad Halihal, Tirza Routtenberg:
Cramer-Rao Bound for Admittance Matrix Estimation under Laplacian Constraints. 9871-9875 - Mohammad Sabbaqi, Elvin Isufi:
Inferring Time Varying Signals over Uncertain Graphs. 9876-9880 - Hessa Alfalahi, Ahsan H. Khandoker, Leontios J. Hadjileontiadis:
Spiral Shape Matters: Novel Bio-Inspired Cochlear Cepstrum. 9881-9885 - Yuxuan Zhang, Jian Wang:
Robust Recovery of Joint Sparse Signals via Simultaneous Orthogonal Matching Pursuit. 9886-9890 - Shivani Yadav, Dipanjan Gope, K. Uma Maheswari, Prasanta Kumar Ghosh:
An Unsupervised Segmentation of Vocal Breath Sounds. 9891-9895 - Vincent P. Grande, Michael T. Schaub:
Disentangling the Spectral Properties of the Hodge Laplacian: Not All Small Eigenvalues Are Equal. 9896-9900 - Bishwadeep Das, Elvin Isufi:
Tensor Graph Decomposition for Temporal Networks. 9901-9905 - Hai Victor Habi, Hagit Messer, Yoram Bresler:
Learning the Barankin Lower Bound on DOA Estimation Error. 9906-9910 - Malaak Khatib, Nadav Harel, Yochai Ben-Horin, Yael Radzyner, Tirza Routtenberg:
Cyclic Misspecified Cramer-Rao Bound for Periodic Parameter Estimation. 9911-9915 - Nadav E. Rosenthal, Joseph Tabrikian:
Asymptotically Tight Misspecified Bayesian Cramér-Rao Bound. 9916-9920 - Koyo Sato, Kazuki Naganuma, Shunsuke Ono:
Enhancing Hyperspectral Anomaly Detection by Difference-of-Convex Sparse Anomaly Modeling. 9921-9925 - Boyuan Zhang, Shuyuan Zhu, Tong Xie, Xibang Yang, Yahui Liu, Bing Zeng:
Filamentary Convolution for Spoken Language Identification: A Brain-Inspired Approach. 9926-9930 - Victor M. Tenorio, Madeline Navarro, Santiago Segarra, Antonio G. Marques:
Recovering Missing Node Features with Local Structure-Based Embeddings. 9931-9935 - Luana Ruiz, Ningyuan Teresa Huang, Soledad Villar:
A Spectral Analysis of Graph Neural Networks on Dense and Sparse Graphs. 9936-9940 - Kevin Wilkinghoff, Alessia Cornaggia-Urrigshardt:
TACos: Learning Temporally Structured Embeddings for Few-Shot Keyword Spotting with Dynamic Time Warping. 9941-9945 - Jiawen Huang, Donglin Cao, Dazhen Lin:
Leverage Causal Graphs and Rumor-Refuting Texts for Interpretable Rumor Analysis. 9946-9950 - Zhixin Guo, Jianping Zhou, Jiexing Qi, Mingxuan Yan, Ziwei He, Guanjie Zheng, Zhouhan Lin, Xinbing Wang, Chenghu Zhou:
Towards Controlled Table-to-Text Generation with Scientific Reasoning. 9951-9955 - Jeremy H. M. Wong, Nancy F. Chen:
Distilling Distributional Uncertainty from a Gaussian Process. 9956-9960 - Yang Wu, Jing Yang, Liming Wang, Zhen Xu:
Graph-Aware Multi-View Fusion for Rumor Detection on Social Media. 9961-9965 - Feifan Song, Lianzhe Huang, Houfeng Wang:
A Unified Framework for Multi-Intent Spoken Language Understanding with Prompting. 9966-9970 - Xiang Zhang, Shizhu He, Kang Liu, Jun Zhao:
Unsupervised Learning of Neural Semantic Mappings with the Hungarian Algorithm for Compositional Semantics. 9971-9975 - Aoxiong Yin, Tianyun Zhong, Haoyuan Li, Siliang Tang, Zhou Zhao:
Language Model is a Branch Predictor for Simultaneous Machine Translation. 9976-9980 - Pengyu Xu, Mingyang Song, Ziyi Li, Sijin Lu, Liping Jing, Jian Yu:
Taming Prompt-Based Data Augmentation for Long-Tailed Extreme Multi-Label Text Classification. 9981-9985 - Zhiyun Fan, Linhao Dong, Jun Zhang, Lu Lu, Zejun Ma:
SA-SOT: Speaker-Aware Serialized Output Training for Multi-Talker ASR. 9986-9990 - Hui Wan, Hongkang Li, Songtao Lu, Xiaodong Cui, Marina Danilevsky:
How Can Personalized Context Help? Exploring Joint Retrieval of Passage and Personalized Context. 9991-9995 - Xi Wang, Ruoqing Zhao, Hongliang Dai, Piji Li:
An Empirical Investigation of Domain Adaptation Ability for Chinese Spelling Check Models. 9996-10000 - Yichao Du, Zhirui Zhang, Linan Yue, Xu Huang, Yuqing Zhang, Tong Xu, Linli Xu, Enhong Chen:
Communication-Efficient Personalized Federated Learning for Speech-to-Text Tasks. 10001-10005 - Kuan-Po Huang, Chih-Kai Yang, Yu-Kuan Fu, Ewan Dunbar, Hung-Yi Lee:
Zero Resource Code-Switched Speech Benchmark Using Speech Utterance Pairs for Multiple Spoken Languages. 10006-10010 - Desheng Wang, Jing Wang, Hao Zheng, Yanbin Hou:
Automatic Temporal Alignment for Pitch Estimation Evaluation. 10011-10015 - Jingyi Zhou, Jie Zhou, Jiabao Zhao, Siyin Wang, Haijun Shan, Tao Gui, Qi Zhang, Xuanjing Huang:
A Soft Contrastive Learning-Based Prompt Model for Few-Shot Sentiment Analysis. 10016-10020 - Yu Pan, Yanni Hu, Yuguang Yang, Wen Fei, Jixun Yao, Heng Lu, Lei Ma, Jianjun Zhao:
GEmo-CLAP: Gender-Attribute-Enhanced Contrastive Language-Audio Pretraining for Accurate Speech Emotion Recognition. 10021-10025 - Meishan Zhang, Bin Wang, Hao Fei, Min Zhang:
In-Context Learning for Few-Shot Nested Named Entity Recognition. 10026-10030 - Zhenbin Chen, Zhixin Li, Ying Huang, Zhenjun Tang:
Adaptive Prompt Construction Method for Relation Extraction. 10031-10035 - Otavio Braga, Wei Xia, Keith Johnson, Alice Chuang, Yunfan Ye, Olivier Siohan, Tuan Anh Nguyen:
Large Scale Self-Supervised Pretraining for Active Speaker Detection. 10036-10040 - Zexin Cai, Ming Li:
Invertible Voice Conversion with Parallel Data. 10041-10045 - Yu Gu, Xianlong Luo, Meng Yang:
Incomplete Observations Bias Suppression for Abductive Natural Language Inference. 10046-10050 - Minghui Wu, Haitao Tang, Jiahuan Fan, Ruoyu Wang, Hang Chen, Yanyong Zhang, Jun Du, Hengshun Zhou, Lei Sun, Xin Fang, Tian Gao, Genshun Wan, Jia Pan, Jianqing Gao:
Implicit Enhancement of Target Speaker in Speaker-Adaptive ASR through Efficient Joint Optimization. 10051-10055 - Yueting Yang, Xintong Zhang, Jinan Xu, Wenjuan Han:
Empowering Vision-Language Models for Reasoning Ability through Large Language Models. 10056-10060 - Heting Gao, Mark Hasegawa-Johnson, Chang D. Yoo:
G2PU: Grapheme-To-Phoneme Transducer with Speech Units. 10061-10065 - Prabhav Agrawal, Thilo Köhler, Zhiping Xiu, Prashant Serai, Qing He:
Ultra-Lightweight Neural Differential DSP Vocoder for High Quality Speech Synthesis. 10066-10070 - Souvik Kundu, Sharath Nittur Sridhar, Maciej Szankin, Sairam Sundaresan:
Sensi-Bert: Towards Sensitivity Driven Fine-Tuning for Parameter-Efficient Language Model. 10071-10075 - Jie Liu, Xue Han, Chao Deng, Junlan Feng:
Robust Self-Supervised Learning with Contrast Samples for Natural Language Understanding. 10076-10080 - Liang He, Zhihua Fang, Zuoer Chen, Minqiang Xu, Ying Meng, Penghao Wang:
Multi-View Speaker Embedding Learning for Enhanced Stability and Discriminability. 10081-10085 - Zhiyuan Zha, Pengnian Qi, Xigang Bao, Mengyuan Tian, Biao Qin:
M3TQA: Multi-View, Multi-Hop and Multi-Stage Reasoning for Temporal Question Answering. 10086-10090 - Zhida Song, Liang He, Penghao Wang, Ying Hu, Hao Huang:
Introducing Multilingual Phonetic Information to Speaker Embedding for Speaker Verification. 10091-10095 - Zhihong Lei, Ernest Pusateri, Shiyi Han, Leo Liu, Mingbin Xu, Tim Ng, Ruchir Travadi, Youyuan Zhang, Mirko Hannemann, Man-Hung Siu, Zhen Huang:
Personalization of CTC-Based End-to-End Speech Recognition Using Pronunciation-Driven Subword Tokenization. 10096-10100 - Leyuan Qu, Wei Wang, Cornelius Weber, Pengcheng Yue, Taihao Li, Stefan Wermter:
Improving Speech Emotion Recognition with Unsupervised Speaking Style Transfer. 10101-10105 - Sensen Zhang, Xun Liang, Simin Niu, Junlan Feng, Chen Feng, Mengwei Wang:
Temporal Knowledge Graph Embedding using Householder Transformations. 10106-10110 - Siyuan Shen, Yu Gao, Feng Liu, Hanyang Wang, Aimin Zhou:
Emotion Neural Transducer for Fine-Grained Speech Emotion Recognition. 10111-10115 - Junnan Liu, Wenlong Du, Qingquan Li, Xuewei Wang, Zhongjun Zhou, Jin Liu:
S-Evaluator: Enhance Factual Consistency Evaluator with Adversarial Data Synthesized by Large Language Model. 10116-10120 - Takashi Shibuya, Yuhta Takida, Yuki Mitsufuji:
BigVSAN: Enhancing GAN-Based Neural Vocoders with Slicing Adversarial Network. 10121-10125 - Holger Severin Bovbjerg, Jesper Jensen, Jan Østergaard, Zheng-Hua Tan:
Self-Supervised Pretraining for Robust Personalized Voice Activity Detection in Adverse Conditions. 10126-10130 - Qing-Tian Xu, Jie Zhang, Zhen-Hua Ling:
An End-to-End EEG Channel Selection Method with Residual Gumbel Softmax for Brain-Assisted Speech Enhancement. 10131-10135 - Oscar Chang, Hank Liao, Dmitriy Serdyuk, Ankit Shahy, Olivier Siohan:
Conformer is All You Need for Visual Speech Recognition. 10136-10140 - Rayan Daod Nathoo, Mikolaj Kegler, Marko Stamenovic:
Two-Step Knowledge Distillation for Tiny Speech Enhancement. 10141-10145 - Soha Sadat Mahdi, Eirini Papagiannopoulou, Nikos Deligiannis, Hichem Sahli:
Co-Occurrence Graph-Enhanced Hierarchical Prediction of ICD Codes. 10146-10150 - Chuanneng Sun, Zeeshan Ahmed, Yingyi Ma, Zhe Liu, Lucas Kabela, Yutong Pang, Ozlem Kalinli:
Contextual Biasing of Named-Entities with Large Language Models. 10151-10155 - Haochen Wu, Jie Zhang, Zhentao Zhang, Wenting Zhao, Bin Gu, Wu Guo:
Robust Spoof Speech Detection Based on Multi-Scale Feature Aggregation and Dynamic Convolution. 10156-10160 - Hai Zhu, Xin Wang, Kun Wang, Huayi Zhan:
Temporal Convolution Shrinkage Network for Keyword Spotting. 10161-10165 - Takanori Ashihara, Marc Delcroix, Takafumi Moriya, Kohei Matsuura, Taichi Asami, Yusuke Ijima:
What Do Self-Supervised Speech and Speaker Models Learn? New Findings from a Cross Model Layer-Wise Analysis. 10166-10170 - Hongzhan Lin, Haiqin Yang, Ziyang Luo, Jing Ma:
Unleashing Trigger-Free Event Detection: Revealing Event Correlations Via a Contrastive Derangement Framework. 10171-10175 - Takuma Udagawa, Masayuki Suzuki, Gakuto Kurata, Masayasu Muraoka, George Saon:
Multiple Representation Transfer from Large Language Models to End-to-End ASR Systems. 10176-10180 - Zelin Ying, Chen Li, Yu Dong, Qiuqiang Kong, Qiao Tian, Yuanyuan Huo, Yuxuan Wang:
A Unified Front-End Framework for English Text-to-Speech Synthesis. 10181-10185 - Chunyu Qiang, Hao Li, Hao Ni, He Qu, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang:
Minimally-Supervised Speech Synthesis with Conditional Diffusion Model and Language Model: A Comparative Study of Semantic Coding. 10186-10190 - Lishi Zuo, Man-Wai Mak, Youzhi Tu:
Promoting Independence of Depression and Speaker Features for Speaker Disentanglement in Speech-Based Depression Detection. 10191-10195 - Chunyu Qiang, Hao Li, Yixin Tian, Ruibo Fu, Tao Wang, Longbiao Wang, Jianwu Dang:
Learning Speech Representation from Contrastive Token-Acoustic Pretraining. 10196-10200 - Li Li, Yijie Li, Dongxing Xu, Haoran Wei, Yanhua Long:
Accent-Specific Vector Quantization for Joint Unsupervised and Supervised Training in Accent Robust Speech Recognition. 10201-10205 - Shijue Huang, Libo Qin, Bingbing Wang, Geng Tu, Ruifeng Xu:
SDIF-DA: A Shallow-to-Deep Interaction Framework with Data Augmentation for Multi-Modal Intent Detection. 10206-10210 - Maxime Burchi, Krishna C. Puvvada, Jagadeesh Balam, Boris Ginsburg, Radu Timofte:
Multilingual Audio-Visual Speech Recognition with Hybrid CTC/RNN-T Fast Conformer. 10211-10215 - Yuan Shangguan, Haichuan Yang, Danni Li, Chunyang Wu, Yassir Fathullah, Dilin Wang, Ayushi Dalmia, Raghuraman Krishnamoorthi, Ozlem Kalinli, Junteng Jia, Jay Mahadeokar, Xin Lei, Mike Seltzer, Vikas Chandra:
TODM: Train Once Deploy Many Efficient Supernet-Based RNN-T Compression For On-Device ASR Models. 10216-10220 - Jian Zhang, Jing Ma, Xiaochen Guo, Lin Li, Liang He:
A Speaker Recognition Method Based on Stable Learning. 10221-10225 - Shenjie Jiang, Peng Song, Shaokai Li, Run Wang, Wenming Zheng:
Multi-Source Unsupervised Transfer Components Learning for Cross-Domain Speech Emotion Recognition. 10226-10230 - Hao Xu, Jing Yang, Jiahao Wang, Wenxin Hu:
Build a 50+ Hours Chinese Mandarin Corpus for Children's Speech Recognition. 10231-10235 - Jakob Poncelet, Hugo Van hamme:
Unsupervised Accent Adaptation Through Masked Language Model Correction of Discrete Self-Supervised Speech Units. 10236-10240 - Lingjun Meng, Jozef Coldenhoff, Paul Kendrick, Tijana Stojkovic, Andrew Harper, Kiril Ratmanski, Milos Cernak:
On Real-Time Multi-Stage Speech Enhancement Systems. 10241-10245 - Chengwen Zhang, Yuhao Zhang, Bo Cheng:
RL-EMO: A Reinforcement Learning Framework for Multimodal Emotion Recognition. 10246-10250 - Annahita Sarré, Hagar Salpeter, Deliane Bechar, Laurent Cohen, Yair Lakretz:
Automatic Recognition of Gesture Identity and Onset of Cued-Speech. 10251-10255 - Bobbi Aditya, Mahdin Rohmatillah, Liang-Hsuan Tai, Jen-Tzung Chien:
Attention-Guided Adaptation for Code-Switching Speech Recognition. 10256-10260 - Zhe Liu, Ozlem Kalinli:
Forgetting Private Textual Sequences in Language Models Via Leave-One-Out Ensemble. 10261-10265 - Jian Huang, Yancheng Bai, Yang Cai, Wei Bian:
A Study on the Adverse Impact of Synthetic Speech on Speech Recognition. 10266-10270 - Yuke Lin, Xiaoyi Qin, Guoqing Zhao, Ming Cheng, Ning Jiang, Haiying Wu, Ming Li:
VoxBlink: A Large Scale Speaker Verification Dataset on Camera. 10271-10275 - Yifan Wang, Qingyan Guo, Xinzhe Ni, Chufan Shi, Lemao Liu, Haiyun Jiang, Yujiu Yang:
Hint-Enhanced In-Context Learning Wakes Large Language Models Up For Knowledge-Intensive Tasks. 10276-10280 - Xihui Wang, Xiaojun Wu:
Can ChatGPT Serve as a Multi-Criteria Decision Maker? A Novel Approach to Supplier Evaluation. 10281-10285 - Sadeen Alharbi, Areeb Alowisheq, Zoltán Tüske, Kareem Darwish, Abdullah Alrajeh, Abdulmajeed Alrowithi, Aljawharah Bin Tamran, Asma Ibrahim, Raghad Aloraini, Raneem Alnajim, Ranya Alkahtani, Renad Almuasaad, Sara Alrasheed, Shaykhah Alsubaie, Yaser Alonaizan:
SADA: Saudi Audio Dataset for Arabic. 10286-10290 - Sunmook Choi, Sanghyeok Chung, Seungeun Lee, Soyul Han, Taein Kang, Jaejin Seo, Il-Youp Kwak, Seungsang Oh:
TB-ResNet: Bridging the Gap from TDNN to ResNet in Automatic Speaker Verification with Temporal-Bottleneck Enhancement. 10291-10295 - Xiaoliang Wu, Peter Bell, Ajitha Rajan:
Can We Trust Explainable AI Methods on ASR? An Evaluation on Phoneme Recognition. 10296-10300 - Shengpeng Ji, Jialong Zuo, Minghui Fang, Ziyue Jiang, Feiyang Chen, Xinyu Duan, Baoxing Huai, Zhou Zhao:
TextrolSpeech: A Text Style Control Speech Corpus with Codec Language Text-to-Speech Models. 10301-10305 - Haoyu Dong, Mengkang Hu, Qinyu Xu, Haochen Wang, Yue Hu:
OpenTE: Open-Structure Table Extraction From Text. 10306-10310 - Xin Wang, Junichi Yamagishi:
Can Large-Scale Vocoded Spoofed Data Improve Speech Spoofing Countermeasure with a Self-Supervised Front End? 10311-10315 - Guan-Ting Lin, Prashanth Gurunath Shivakumar, Ankur Gandhe, Chao-Han Huck Yang, Yile Gu, Shalini Ghosh, Andreas Stolcke, Hung-Yi Lee, Ivan Bulyko:
Paralinguistics-Enhanced Large Language Modeling of Spoken Dialogue. 10316-10320 - Jae Hyun Park, Joon-Gyu Maeng, Taejun Bak, Young-Sun Joo:
SYNTHE-SEES: Face Based Text-to-Speech for Virtual Speaker. 10321-10325 - Jaemin Lim, Kiyeon Kim:
Wav2vec-VC: Voice Conversion via Hidden Representations of Wav2vec 2.0. 10326-10330 - Rikui Huang, Wei Wei, Xiaoye Qu, Wenfeng Xie, Xianling Mao, Dangyang Chen:
Joint Multi-Facts Reasoning Network for Complex Temporal Question Answering Over Knowledge Graph. 10331-10335 - Duc-Tuan Truong, Ruijie Tao, Jia Qi Yip, Kong Aik Lee, Eng Siong Chng:
Emphasized Non-Target Speaker Knowledge in Knowledge Distillation for Automatic Speaker Verification. 10336-10340 - Ju-Ho Kim, Jungwoo Heo, Hyun-seo Shin, Chan-yeong Lim, Ha-Jin Yu:
Diff-SV: A Unified Hierarchical Framework for Noise-Robust Speaker Verification Using Score-Based Diffusion Probabilistic Models. 10341-10345 - Xian Shi, Yexin Yang, Zerui Li, Yanni Chen, Zhifu Gao, Shiliang Zhang:
SeACo-Paraformer: A Non-Autoregressive ASR System with Flexible and Effective Hotword Customization Ability. 10346-10350 - Xixi Zhou, Xin Jie, Sheng Zhou, Keyue Shi, Zhi Yu, Jiajun Bu, Haishuai Wang:
Inversive-Reasoning Augmentation for Natural Language Inference. 10351-10355 - Shengkui Zhao, Yukun Ma, Chongjia Ni, Chong Zhang, Hao Wang, Trung Hieu Nguyen, Kun Zhou, Jia Qi Yip, Dianwen Ng, Bin Ma:
MossFormer2: Combining Transformer and RNN-Free Recurrent Network for Enhanced Time-Domain Monaural Speech Separation. 10356-10360 - Shulin He, Huaiwen Zhang, Wei Rao, Kanghao Zhang, Yukai Ju, Yang Yang, Xueliang Zhang:
Hierarchical Speaker Representation for Target Speaker Extraction. 10361-10365 - Dianwen Ng, Chong Zhang, Ruixi Zhang, Yukun Ma, Fabian Ritter Gutierrez, Trung Hieu Nguyen, Chongjia Ni, Shengkui Zhao, Eng Siong Chng, Bin Ma:
Are Soft Prompts Good Zero-Shot Learners for Speech Recognition? 10366-10370 - Soumya Dutta, Sriram Ganapathy:
Zero Shot Audio To Audio Emotion Transfer With Speaker Disentanglement. 10371-10375 - Zhuhai Li, Wu Guo, Jie Zhang:
Generating High-Quality Adversarial Examples with Universal Perturbation-Based Adaptive Network and Improved Perceptual Loss. 10376-10380 - Sara Papi, Peidong Wang, Junkun Chen, Jian Xue, Naoyuki Kanda, Jinyu Li, Yashesh Gaur:
Leveraging Timestamp Information for Serialized Joint Streaming Recognition and Translation. 10381-10385 - Alexandre Bittar, Paul Dixon, Mohammad Samragh, Kumari Nishu, Devang Naik:
Improving Vision-Inspired Keyword Spotting Using Dynamic Module Skipping in Streaming Conformer Encoder. 10386-10390 - Hendrik Laux, Emil Mededovic, Ahmed Hallawa, Lukas Martin, Arne Peine, Anke Schmeink:
LITEVSR: Efficient Visual Speech Recognition by Learning from Speech Representations of Unlabeled Data. 10391-10395 - Renchang Dong, Yijie Li, Dongxing Xu, Yanhua Long:
Cross-Modal Parallel Training for Improving End-to-End Accented Speech Recognition. 10396-10400 - Yifan Yang, Feiyu Shen, Chenpeng Du, Ziyang Ma, Kai Yu, Daniel Povey, Xie Chen:
Towards Universal Speech Discrete Tokens: A Case Study for ASR and TTS. 10401-10405 - Kuan-Hsun Ho, Jeih-weih Hung, Berlin Chen:
What Do Neural Networks Listen to? Exploring the Crucial Bands in Speech Enhancement Using SINC-Convolution. 10406-10410 - Yi He, Lei Yang, Hanyi Wang, Yun Zhu, Shilin Wang:
Speaker-Adaptive Lipreading Via Spatio-Temporal Information Learning. 10411-10415 - Clément Le Moine Veillon, Victor Rosi, Pablo Arias Sarah, Léane Salais, Nicolas Obin:
BWSNET: Automatic Perceptual Assessment of Audio Signals. 10416-10420 - Junyi Peng, Marc Delcroix, Tsubasa Ochiai, Oldrich Plchot, Shoko Araki, Jan Cernocký:
Target Speech Extraction with Pre-Trained Self-Supervised Learning Models. 10421-10425 - Zehua Zhang, Xingwei Liang, Ruifeng Xu, Mingjiang Wang:
Hybrid Attention Time-Frequency Analysis Network for Single-Channel Speech Enhancement. 10426-10430 - Philippe Gonzalez, Zheng-Hua Tan, Jan Østergaard, Jesper Jensen, Tommy Sonne Alstrøm, Tobias May:
Diffusion-Based Speech Enhancement in Matched and Mismatched Conditions Using a Heun-Based Sampler. 10431-10435 - Caoyun Fan, Jidong Tian, Yitian Li, Hao He, Yaohui Jin:
Comparable Demonstrations Are Important In In-Context Learning: A Novel Perspective On Demonstration Selection. 10436-10440 - Kangrui Ruan, Xin He, Jiyang Wang, Xiaozhou Zhou, Helian Feng, Ali Kebarighotbi:
S2E: Towards an End-to-End Entity Resolution Solution from Acoustic Signal. 10441-10445 - Thinh Pham, Dat Quoc Nguyen:
JPIS: A Joint Model for Profile-Based Intent Detection and Slot Filling with Slot-to-Intent Attention. 10446-10450 - Dominik Wagner, Alexander W. Churchill, Siddharth Sigtia, Panayiotis G. Georgiou, Matt Mirsamadi, Aarshee Mishra, Erik Marchi:
A Multimodal Approach to Device-Directed Speech Detection with Large Language Models. 10451-10455 - Amin Edraki, Wai-Yip Chan, Jesper Jensen, Daniel Fogerty:
Speaker Adaptation For Enhancement Of Bone-Conducted Speech. 10456-10460 - Thai-Binh Nguyen, Alexander Waibel:
Synthetic Conversations Improve Multi-Talker ASR. 10461-10465 - Ruiteng Zhang, Jianguo Wei, Xugang Lu, Yongwei Li, Wenhuan Lu, Di Jin, Junhai Xu:
Self-Supervised Domain Exploration with an Optimal Transport Regularization for Open Set Cross-Domain Speech Emotion Recognition. 10466-10470 - Jeong Hun Yeo, Minsu Kim, Shinji Watanabe, Yong Man Ro:
Visual Speech Recognition for Languages with Limited Labeled Data Using Automatic Labels from Whisper. 10471-10475 - Xue Yang, Changchun Bao, Jing Zhou, Xianhong Chen:
Target Speaker Extraction by Directly Exploiting Contextual Information in the Time-Frequency Domain. 10476-10480 - Xinmeng Xu, Yiqun Zhang, Weiping Tu, Yuhong Yang:
An Efficient and Interpretable Speech Enhancement Network Via Deep Dictionary Learning. 10481-10485 - Xinmeng Xu, Chang Han, Yiqun Zhang, Weiping Tu, Yuhong Yang:
Curricular Contrastive Regularization for Speech Enhancement with Self-Supervised Representations. 10486-10490 - Chang Huai You, Minghui Dong:
A Study on Combining Non-Parallel and Parallel Methodologies for Mandarin-English Cross-Lingual Voice Conversion. 10491-10495 - Yanfeng Wu, Pengcheng Yue, Leyuan Qu, Taihao Li, Yu-Ping Ruan:
Multi-Modal Emotion Recognition Using Multiple Acoustic Features and Dual Cross-Modal Transformer. 10496-10500 - Wenhao Guan, Qi Su, Haodong Zhou, Shiyu Miao, Xingjia Xie, Lin Li, Qingyang Hong:
Reflow-TTS: A Rectified Flow Model for High-Fidelity Text-to-Speech. 10501-10505 - Changjiang Zhao, Shulin He, Xueliang Zhang:
SICRN: Advancing Speech Enhancement through State Space Model and Inplace Convolution Techniques. 10506-10510 - Xingwei Liang, Zehua Zhang, Mingjiang Wang, Ruifeng Xu:
Lightweight Multi-Axial Transformer with Frequency Prompt for Single Channel Speech Enhancement. 10511-10515 - Hongcheng Liu, Zhe Chen, Hui Li, Pingjie Wang, Yanfeng Wang, Yu Wang:
MSG-BART: Multi-Granularity Scene Graph-Enhanced Encoder-Decoder Language Model for Video-Grounded Dialogue Generation. 10516-10520 - Di Liang, Nian Shao, Xiaofei Li:
Frame-Wise Streaming end-to-end Speaker Diarization with Non-Autoregressive Self-Attention-Based Attractors. 10521-10525 - Yifan Yang, Yice Zhang, Ruifeng Xu:
Enhancing Generative Aspect-Based Sentiment Analysis with Relation-Level Supervision and Prompt. 10526-10530 - Tian-Hao Zhang, Dinghao Zhou, Guiping Zhong, Jiaming Zhou, Baoxiang Li:
CIF-T: A Novel CIF-Based Transducer Architecture for Automatic Speech Recognition. 10531-10535 - Xiaoyu Yang, Wei Kang, Zengwei Yao, Yifan Yang, Liyong Guo, Fangjun Kuang, Long Lin, Daniel Povey:
PromptASR for Contextualized ASR with Controllable Style. 10536-10540 - Xinyu Yang, Hengxuan Wang, Huiling Jin, Zhenguo Zhang, Xiaojie Yuan:
Knowledge-Aware Prompt Learning Framework for Korean-Chinese Microblog Sentiment Analysis. 10541-10545 - Ziyang Zhuang, Kun Zou, Chenfeng Miao, Ming Fang, Tao Wei, Zijian Li, Wei Hu, Shaojun Wang, Jing Xiao:
Improving Attention-Based End-to-End Speech Recognition by Monotonic Alignment Attention Matrix Reconstruction. 10546-10550 - Xue Han, Qing Wang, Yitong Wang, Jiahui Wang, Chao Deng, Junlan Feng:
Feature Mixing-Based Active Learning for Multi-Label Text Classification. 10551-10555 - Chenji Lu, Ge Bai, Shilong Li, Ying Liu, Xiyan Liu, Zerong Zeng, Ruifang Liu:
CausalME: Balancing bi-modalities in Visual Question Answering. 10556-10560 - Wen Shen Teo, Yasuhiro Minami:
CIF-RNNT: Streaming ASR Via Acoustic Word Embeddings with Continuous Integrate-and-Fire and RNN-Transducers. 10561-10565 - Shilong Li, Chenji Lu, Ge Bai, Ying Liu, Xiyan Liu, Zhang Zhang, Ruifang Liu:
CDUMA: An Adaptive Approach for Mitigating Confounder for MCQA. 10566-10570 - Jixun Yao, Yuguang Yang, Yi Lei, Ziqian Ning, Yanni Hu, Yu Pan, Jingjing Yin, Hongbin Zhou, Heng Lu, Lei Xie:
Promptvc: Flexible Stylistic Voice Conversion in Latent Space Driven by Natural Language Prompts. 10571-10575 - Beida Zheng, Mijit Ablimit, Askar Hamdulla:
Cross-Modal Alignment for End-to-End Spoken Language Understanding Based on Momentum Contrastive Learning. 10576-10580 - Hyun-seo Shin, Jungwoo Heo, Ju-ho Kim, Chan-yeong Lim, Wonbin Kim, Ha-Jin Yu:
HM-CONFORMER: A Conformer-Based Audio Deepfake Detection System with Hierarchical Pooling and Multi-Level Classification Token Aggregation Methods. 10581-10585 - Yerbolat Khassanov, Zhipeng Chen, Tianfeng Chen, Tze Yuang Chong, Wei Li, Lu Lu, Zejun Ma:
Extending Multilingual ASR to New Languages Using Supplementary Encoder and Decoder Components. 10586-10590 - Ying Fang, Xiaofei Li:
Unimodal Aggregation for CTC-Based Speech Recognition. 10591-10595 - Vincent P. Martin, Jean-Luc Rouas, Pierre Philip:
Automatic Detection Of Sleepiness-Related Syndromes and Symptoms Using Voice and Speech Biomarkers. 10596-10600 - Sho Inoue, Kun Zhou, Shuai Wang, Haizhou Li:
Hierarchical Emotion Prediction and Control in Text-to-Speech Synthesis. 10601-10605 - Vincent P. Martin, Jean-Luc Rouas:
Estimating Symptoms and Clinical Signs Instead of Disorders: The Path Toward The Clinical Use of Voice and Speech Biomarkers In Psychiatry. 10606-10610 - Jingguang Tian, Xinhui Hu, Xinkang Xu:
Learning Emotion-Invariant Speaker Representations for Speaker Verification. 10611-10615 - Yicheng Gu, Xueyao Zhang, Liumeng Xue, Zhizheng Wu:
Multi-Scale Sub-Band Constant-Q Transform Discriminator for High-Fidelity Vocoder. 10616-10620 - Fan Yu, Haoxu Wang, Xian Shi, Shiliang Zhang:
LCB-Net: Long-Context Biasing for Audio-Visual Speech Recognition. 10621-10625 - Zengrui Jin, Xurong Xie, Tianzi Wang, Mengzhe Geng, Jiajun Deng, Guinan Li, Shujie Hu, Xunying Liu:
Towards Automatic Data Augmentation for Disordered Speech Recognition. 10626-10630 - Mingxiu Cai, Daling Wang, Shi Feng, Yifei Zhang:
PECER: Empathetic Response Generation Via Dynamic Personality Extraction and Contextual Emotional Reasoning. 10631-10635 - Karim M. Ibrahim, Antony Perzo, Simon Leglaive:
Towards Improving Speech Emotion Recognition Using Synthetic Data Augmentation from Emotion Conversion. 10636-10640 - Hayato Futami, Emiru Tsunoo, Yosuke Kashiwagi, Hiroaki Ogawa, Siddhant Arora, Shinji Watanabe:
Phoneme-Aware Encoding for Prefix-Tree-Based Contextual ASR. 10641-10645 - Hao Li, Yanan Cao, Yubing Ren, Fang Fang, Lanxue Zhang, Yingjie Li, Shi Wang:
Sorting, Reasoning, and Extraction: An Easy-to-Hard Reasoning Framework for Document-Level Event Argument Extraction. 10646-10650 - Daijun Ding, Rong Chen, Liwen Jing, Bowen Zhang, Xu Huang, Li Dong, Xiaowen Zhao, Ge Song:
Cross-Target Stance Detection by Exploiting Target Analytical Perspectives. 10651-10655 - Yinru He, Guihua Wen, Pei Yang, Dongliang Chen:
Speech Relationship Learning for Cross-Corpus Speech Emotion Recognition. 10656-10660 - Ziqiang Shi, Rujie Liu:
Langwave: Realistic Voice Generation Based on High-Order Langevin Dynamics. 10661-10665 - Junjie Li, Ruijie Tao, Zexu Pan, Meng Ge, Shuai Wang, Haizhou Li:
Audio-Visual Active Speaker Extraction for Sparsely Overlapped Multi-Talker Speech. 10666-10670 - Lei Yang, Wei Liu, Ruijie Meng, Gunwoo Lee, Soonho Baek, Han-Gil Moon:
Fspen: an Ultra-Lightweight Network for Real Time Speech Enhancement. 10671-10675 - Varsha Suresh, Salah Aït-Mokhtar, Caroline Brun, Ioan Calapodescu:
An Adapter-Based Unified Model for Multiple Spoken Language Processing Tasks. 10676-10680 - Minghui Xu, Zishan Guo, Yulong Zeng, Deyi Xiong:
Enhanced Transfer Learning with Efficient Modeling and Adaptive Fusion of Knowledge Via Prompt Tuning. 10681-10685 - Eliya Nachmani, Alon Levkovitch, Yifan Ding, Chulayuth Asawaroengchai, Heiga Zen, Michelle Tadmor Ramanovich:
Translatotron 3: Speech to Speech Translation with Monolingual Data. 10686-10690 - Guillermo Cámbara, Patrick Lumban Tobing, Mikolaj Babianski, Ravichander Vipperla, Duo Wang, Ron Shmelkin, Giuseppe Coccia, Orazio Angelini, Arnaud Joly, Mateusz Lajszczak, Vincent Pollet:
Mapache: Masked Parallel Transformer for Advanced Speech Editing and Synthesis. 10691-10695 - Wangyou Zhang, Jee-weon Jung, Yanmin Qian:
Improving Design of Input Condition Invariant Speech Enhancement. 10696-10700 - Antoine Nzeyimana:
Improving Kinyarwanda Speech Recognition Via Semi-Supervised Learning. 10701-10705 - Yayue Deng, Jinlong Xue, Yukang Jia, Qifei Li, Yichen Han, Fengping Wang, Yingming Gao, Dengfeng Ke, Ya Li:
Concss: Contrastive-based Context Comprehension for Dialogue-Appropriate Prosody in Conversational Speech Synthesis. 10706-10710 - Nikolaos Lagos, Ioan Calapodescu:
Unsupervised Multi-Domain Data Selection for ASR Fine-Tuning. 10711-10715 - Thomas Palmeira Ferraz, Marcely Zanon Boito, Caroline Brun, Vassilina Nikoulina:
Multilingual Distilwhisper: Efficient Distillation of Multi-Task Speech Models Via Language-Specific Experts. 10716-10720 - Kangwook Jang, Sungnyun Kim, Hoirin Kim:
STaR: Distilling Speech Temporal Relation for Lightweight Speech Self-Supervised Learning Models. 10721-10725 - Dechuan Teng, Chunlin Lu, Xiao Xu, Wanxiang Che, Libo Qin:
Pro-HAN: A Heterogeneous Graph Attention Network for Profile-based Spoken Language Understanding. 10726-10730 - Ziyi Xu, Marvin Sach, Jan Pirklbauer, Tim Fingscheidt:
Employing Real Training Data for Deep Noise Suppression. 10731-10735 - Tan Dat Nguyen, Ji-Hoon Kim, Youngjoon Jang, Jaehun Kim, Joon Son Chung:
Fregrad: Lightweight and Fast Frequency-Aware Diffusion Vocoder. 10736-10740 - Liang He, Ruida Li, Mengqi Niu:
A Study on Graph Embedding for Speaker Recognition. 10741-10745 - Lingxing Kong, Zheng Ma, Jianbing Zhang, Liang He, Jiajun Chen:
EmoRED: A Dataset for Relation Extraction in Texts with Emoticons. 10746-10750 - Zhe Li, Man-Wai Mak, Helen Mei-Ling Meng:
Dual Parameter-Efficient Fine-Tuning for Speaker Representation Via Speaker Prompt Tuning and Adapters. 10751-10755 - Shaojin Ding, David Qiu, David Rim, Yanzhang He, Oleg Rybakov, Bo Li, Rohit Prabhavalkar, Weiran Wang, Tara N. Sainath, Zhonglin Han, Jian Li, Amir Yazdanbakhsh, Shivani Agrawal:
USM-Lite: Quantization and Sparsity Aware Fine-Tuning for Speech Recognition with Universal Speech Models. 10756-10760 - Awais Khan, Khalid Mahmood Malik, Shah Nawaz:
Frame-to-Utterance Convergence: A Spectra-Temporal Approach for Unified Spoofing Detection. 10761-10765 - Yihao Wang, Zhongdi Wu, Joseph Nese, Akihito Kamata, Vedant Nilabh, Eric C. Larson:
Improving Oral Reading Fluency Assessment Through Sub-Sequence Matching of Acoustic Word Embeddings. 10766-10770 - Arpita Vats, Zhe Liu, Peng Su, Debjyoti Paul, Yingyi Ma, Yutong Pang, Zeeshan Ahmed, Ozlem Kalinli:
Recovering from Privacy-Preserving Masking with Large Language Models. 10771-10775 - Jaejin Cho, Rakshith Sharma Srinivasa, Ching Hua Lee, Yashas Malur Saidutta, Chouchang Yang, Yilin Shen, Hongxia Jin:
Zero-Shot Intent Classification Using a Semantic Similarity Aware Contrastive Loss and Large Language Model. 10776-10780 - Chunyu Qiang, Hao Li, Yixin Tian, Yi Zhao, Ying Zhang, Longbiao Wang, Jianwu Dang:
High-Fidelity Speech Synthesis with Minimal Supervision: All Using Diffusion Models. 10781-10785 - Yunyun Wang, Jiaqi Su, Adam Finkelstein, Zeyu Jin:
GR0: Self-Supervised Global Representation Learning for Zero-Shot Voice Conversion. 10786-10790 - Stefano Bannò, Rao Ma, Mengjie Qian, Kate M. Knill, Mark J. F. Gales:
Towards End-to-End Spoken Grammatical Error Correction. 10791-10795 - Wonjune Kang, Yun Wang, Shun Zhang, Arthur Hinsvark, Qing He:
Multi-Task Learning for Front-End Text Processing in TTS. 10796-10800 - Heng-Jui Chang, Ning Dong, Ruslan Mavlyutov, Sravya Popuri, Yu-An Chung:
COLLD: Contrastive Layer-to-Layer Distillation for Compressing Multilingual Pre-Trained Speech Encoders. 10801-10805 - Ju-Chieh Chou, Chung-Ming Chien, Karen Livescu:
AV2WAV: Diffusion-Based Re-Synthesis from Continuous Self-Supervised Features for Audio-Visual Speech Enhancement. 10806-10810 - Rricha Jalota, Lyan Verwimp, Markus Nußbaum-Thom, Amr El-Desoky Mousa, Arturo Argueta, Youssef Oualil:
Towards A World-English Language Model for on-Device Virtual Assistants. 10811-10815 - Pin-Jui Ku, I-Fan Chen, Chao-Han Huck Yang, Anirudh Raju, Pranav Dheram, Pegah Ghahremani, Brian King, Jing Liu, Roger Ren, Phani Sankar Nidadavolu:
Hot-Fixing Wake Word Recognition for End-to-End ASR Via Neural Model Reprogramming. 10816-10820 - Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
Stable Distillation: Regularizing Continued Pre-Training for Low-Resource Automatic Speech Recognition. 10821-10825 - Zuzhao Ye, Gregory Ciccarelli, Brian Kulis:
Maximum-Entropy Adversarial Audio Augmentation for Keyword Spotting. 10826-10830 - Ching Hua Lee, Chouchang Yang, Rakshith Sharma Srinivasa, Yashas Malur Saidutta, Jaejin Cho, Yilin Shen, Hongxia Jin:
Leveraging Self-Supervised Speech Representations for Domain Adaptation in Speech Enhancement. 10831-10835 - Chenyang Gao, Brecht Desplanques, Chelsea J.-T. Ju, Aman Chadha, Andreas Stolcke:
Post-Training Embedding Alignment for Decoupling Enrollment and Runtime Speaker Recognition Models. 10836-10840 - Junwen Bai, Bo Li, Qiujia Li, Tara N. Sainath, Trevor Strohman:
Efficient Adapter Finetuning for Tail Languages in Streaming Multilingual ASR. 10841-10845 - Katrin Tomanek, Jimmy Tobin, Subhashini Venugopalan, Richard Cave, Katie Seaver, Jordan R. Green, Rus Heywood:
Large Language Models As A Proxy For Human Evaluation In Assessing The Comprehensibility Of Disordered Speech Transcription. 10846-10850 - Atli Sigurgeirsson, Simon King:
Controllable Speaking Styles Using A Large Language Model. 10851-10855 - Yingyi Ma, Zhe Liu, Ozlem Kalinli:
Correction Focused Language Model Training For Speech Recognition. 10856-10860 - Taejin Park, Kunal Dhawan, Nithin Rao Koluguri, Jagadeesh Balam:
Enhancing Speaker Diarization with Large Language Models: A Contextual Beam Search Approach. 10861-10865 - Mu Yang, Naoyuki Kanda, Xiaofei Wang, Junkun Chen, Peidong Wang, Jian Xue, Jinyu Li, Takuya Yoshioka:
Diarist: Streaming Speech Translation with Speaker Diarization. 10866-10870 - Yamato Ohtani, Takuma Okamoto, Tomoki Toda, Hisashi Kawai:
FIRNet: Fundamental Frequency Controllable Fast Neural Vocoder With Trainable Finite Impulse Response Filter. 10871-10875 - Michael Hentschel, Yuta Nishikawa, Tatsuya Komatsu, Yusuke Fujita:
Keep Decoding Parallel With Effective Knowledge Distillation From Language Models To End-To-End Speech Recognisers. 10876-10880 - Akshay Muppidi, Martin Radfar:
Emohrnet: High-Resolution Neural Network Based Speech Emotion Recognition. 10881-10885 - Hexin Liu, Leibny Paola García, Xiangyu Zhang, Andy W. H. Khong, Sanjeev Khudanpur:
Enhancing Code-Switching Speech Recognition With Interactive Language Biases. 10886-10890 - Youzhi Tu, Man-Wai Mak, Jen-Tzung Chien:
Contrastive Speaker Embedding With Sequential Disentanglement. 10891-10895 - Yui Sudo, Muhammad Shakeel, Yosuke Fukumoto, Yifan Peng, Shinji Watanabe:
Contextualized Automatic Speech Recognition With Attention-Based Bias Phrase Boosted Beam Search. 10896-10900 - Shuai Wang, Qibing Bai, Qi Liu, Jianwei Yu, Zhengyang Chen, Bing Han, Yanmin Qian, Haizhou Li:
Leveraging in-the-wild Data for Effective Self-supervised Pretraining in Speaker Recognition. 10901-10905 - Xiaoxia Cheng, Weiming Lu:
Shapley Value Guided Extractive Text Summarization. 10906-10910 - Ying Zhang, Depeng Dang, Ning Wang, Hu Gao:
A Prompt-Based Method with Multi-View Optimization for Open Relation Extraction. 10911-10915 - Linqin Wang, Zhengtao Yu, Shengxiang Gao, Cunli Mao, Yuxin Huang:
DETS: End-to-End Single-Stage Text-to-Speech Via Hierarchical Diffusion GAN Models. 10916-10920 - Yang Zhang, Travis M. Bartley, Mariana Graterol-Fuenmayor, Vitaly Lavrukhin, Evelina Bakhturina, Boris Ginsburg:
A Chat about Boring Problems: Studying GPT-Based Text Normalization. 10921-10925 - Cheng Luo, Yiguang Liu, Wenhui Sun, Zhoujian Sun:
Multi-Modality Speech Recognition Driven by Background Visual Scenes. 10926-10930 - A F. M. Saif, Xiaodong Cui, Han Shen, Songtao Lu, Brian Kingsbury, Tianyi Chen:
Joint Unsupervised and Supervised Training for Automatic Speech Recognition via Bilevel Optimization. 10931-10935 - Liming Wang, Mark Hasegawa-Johnson, Chang D. Yoo:
Unsupervised Speech Recognition with N-skipgram and Positional Unigram Matching. 10936-10940 - Song Li, Yongbin You, Xuezhi Wang, Ke Ding, Guanglu Wan:
Enhancing Multilingual Speech Recognition through Language Prompt Tuning and Frame-Level Language Adapter. 10941-10945 - Won-Gook Choi, Donghyun Seong, Joon-Hyuk Chang:
Adversarial Learning on Compressed Posterior Space for Non-Iterative Score-based End-to-End Text-to-Speech. 10946-10950 - Gene-Ping Yang, Yue Gu, Sashank Macha, Qingming Tang, Yuzong Liu:
On-Device Constrained Self-Supervised Learning for Keyword Spotting via Quantization Aware Pre-Training and Fine-Tuning. 10951-10955 - Mahdin Rohmatillah, Jen-Tzung Chien:
Revise the NLU: A Prompting Strategy for Robust Dialogue System. 10956-10960 - Lester Phillip Violeta, Wen-Chin Huang, Ding Ma, Ryuichi Yamamoto, Kazuhiro Kobayashi, Tomoki Toda:
Electrolaryngeal Speech Intelligibility Enhancement through Robust Linguistic Encoders. 10961-10965 - Mohammad Rasool Izadi, Yujia Yan, Shuo Zhang, Robert Stevenson:
Towards Optimal Voice Disentanglement with Weak Supervision. 10966-10970 - Eesung Kim, Yun Tang, Taeyeon Ki, Divya Neelagiri, Vijendra Raj Apsingekar:
Joint End-to-End Spoken Language Understanding and Automatic Speech Recognition Training Based on Unified Speech-to-Text Pre-Training. 10971-10975 - Jiajun Deng, Xurong Xie, Guinan Li, Mingyu Cui, Mengzhe Geng, Zengrui Jin, Tianzi Wang, Shujie Hu, Zhaoqing Li, Xunying Liu:
Towards High-Performance and Low-Latency Feature-Based Speaker Adaptation of Conformer Speech Recognition Systems. 10976-10980 - Yuhang Sun, Chenxing Li, Biao Li:
Branchformer-Based TDNN for Automatic Speaker Verification. 10981-10985 - Nineli Lashkarashvili, Wen Wu, Guangzhi Sun, Philip C. Woodland:
Parameter Efficient Finetuning for Speech Emotion Recognition and Domain Adaptation. 10986-10990 - Wei Kang, Xiaoyu Yang, Zengwei Yao, Fangjun Kuang, Yifan Yang, Liyong Guo, Long Lin, Daniel Povey:
Libriheavy: A 50,000 Hours ASR Corpus with Punctuation Casing and Context. 10991-10995 - Linjuan Zhang, Kong Aik Lee, Lin Zhang, Longbiao Wang, Baoning Niu:
CPAUG: Refining Copy-Paste Augmentation for Speech Anti-Spoofing. 10996-11000 - Long Mai, Julie Carson-Berndsen:
Enhancing Conversation Smoothness in Language Learning Chatbots: An Evaluation of GPT4 for ASR Error Correction. 11001-11005 - Jiaming Zhou, Shiwan Zhao, Yaqi Liu, Wenjia Zeng, Yong Chen, Yong Qin:
KNN-CTC: Enhancing ASR via Retrieval of CTC Pseudo Labels. 11006-11010 - Weiqing Wang, Danwei Cai, Ming Cheng, Ming Li:
Joint Inference of Speaker Diarization and ASR with Multi-Stage Information Sharing. 11011-11015 - Yang Yang, Yury Kartynnik, Yunpeng Li, Jiuqiang Tang, Xing Li, George Sung, Matthias Grundmann:
STREAMVC: Real-Time Low-Latency Voice Conversion. 11016-11020 - Hanako Segawa, Tsubasa Ochiai, Marc Delcroix, Tomohiro Nakatani, Rintaro Ikeshita, Shoko Araki, Takeshi Yamada, Shoji Makino:
Neural Network-Based Virtual Microphone Estimation with Virtual Microphone and Beamformer-Level Multi-Task Loss. 11021-11025 - Jennifer Santoso, Kenkichi Ishizuka, Taiichi Hashimoto:
Large Language Model-Based Emotional Speech Annotation Using Context and Acoustic Feature for Speech Emotion Recognition. 11026-11030 - Kazuma Iwamoto, Tsubasa Ochiai, Marc Delcroix, Rintaro Ikeshita, Hiroshi Sato, Shoko Araki, Shigeru Katagiri:
How Does End-To-End Speech Recognition Training Impact Speech Enhancement Artifacts? 11031-11035 - Ting Zou, Zhong Qian, Peifeng Li, Qiaoming Zhu:
PVCG: Prompt-Based Vision-Aware Classification and Generation for Multi-Modal Rumor Detection. 11036-11040 - Yuhao Zhang, Kaiqi Kou, Bei Li, Chen Xu, Chunliang Zhang, Tong Xiao, Jingbo Zhu:
Soft Alignment of Modality Space for End-to-End Speech Translation. 11041-11045 - Shaoshi Ling, Yuxuan Hu, Shuangbei Qian, Guoli Ye, Yao Qian, Yifan Gong, Ed Lin, Michael Zeng:
Adapting Large Language Model with Speech for Fully Formatted End-to-End Speech Recognition. 11046-11050 - Haoqin Sun, Shiwan Zhao, Xuechen Wang, Wenjia Zeng, Yong Chen, Yong Qin:
Fine-Grained Disentangled Representation Learning For Multimodal Emotion Recognition. 11051-11055 - Qian Chen, Wen Wang, Qinglin Zhang, Siqi Zheng, Shiliang Zhang, Chong Deng, Yukun Ma, Hai Yu, Jiaqing Liu, Chong Zhang:
Loss Masking Is Not Needed In Decoder-Only Transformer For Discrete-Token-Based ASR. 11056-11060 - Gang Zhao, Yidong Shi, Shudong Lu, Xinjie Yang, Guanting Dong, Jian Xu, Xiaocheng Gong, Si Li:
Type-Aware Decoding Via Explicitly Aggregating Event Information for Document-Level Event Extraction. 11061-11065 - Jiajun He, Xiaohan Shi, Xingfeng Li, Tomoki Toda:
MF-AED-AEC: Speech Emotion Recognition by Leveraging Multimodal Fusion, ASR Error Detection, and ASR Error Correction. 11066-11070 - Mengchao Zhang, Mei Tu, Fan Zhang, Song Liu:
A Cross Search Method for Data Augmentation in Neural Machine Translation. 11071-11075 - Haoxu Wang, Fan Yu, Xian Shi, Yuezhang Wang, Shiliang Zhang, Ming Li:
SlideSpeech: A Large Scale Slide-Enriched Audio-Visual Corpus. 11076-11080 - Chong-Xin Gan, Man-Wai Mak, Weiwei Lin, Jen-Tzung Chien:
Asymmetric Clean Segments-Guided Self-Supervised Learning for Robust Speaker Verification. 11081-11085 - Yidi Jiang, Zhengyang Chen, Ruijie Tao, Liqun Deng, Yanmin Qian, Haizhou Li:
Prompt-Driven Target Speech Diarization. 11086-11090 - Abhinav Garg, Jiyeon Kim, Sushil Khyalia, Chanwoo Kim, Dhananjaya Gowda:
Data Driven Grapheme-to-Phoneme Representations for a Lexicon-Free Text-to-Speech. 11091-11095 - Chen Gao, Xugong Qin, Peng Zhang, Yongquan He, Xinjian Huang, Ming Zhou, Liehuang Zhu, Qingfeng Tan:
MHPS: Multimodality-Guided Hierarchical Policy Search for Knowledge Graph Reasoning. 11096-11100 - Huanxi Liu, Yuanzhao Zhai, Kele Xu, Dawei Feng, Yiying Li:
Nuclear-Norm Maximization for Low-Rank Updates. 11101-11105 - Ziqian Ning, Yuepeng Jiang, Pengcheng Zhu, Shuai Wang, Jixun Yao, Lei Xie, Mengxiao Bi:
Dualvc 2: Dynamic Masked Convolution for Unified Streaming and Non-Streaming Voice Conversion. 11106-11110 - Vikramjit Mitra, Jingping Nie, Erdrin Azemi:
Investigating Salient Representations and Label Variance in Dimensional Speech Emotion Analysis. 11111-11115 - Yiwei Guo, Chenpeng Du, Ziyang Ma, Xie Chen, Kai Yu:
VoiceFlow: Efficient Text-To-Speech with Rectified Flow Matching. 11121-11125 - Jianwei Cui, Yu Gu, Chao Weng, Jie Zhang, Liping Chen, Lirong Dai:
Sifisinger: A High-Fidelity End-to-End Singing Voice Synthesizer Based on Source-Filter Model. 11126-11130 - Dong-Hyun Kim, Jae-Hong Lee, Joon-Hyuk Chang:
Text-Only Unsupervised Domain Adaptation for Neural Transducer-Based ASR Personalization Using Synthesized Data. 11131-11135 - Xupeng Zha, Huan Zhao, Zixing Zhang:
Esihgnn: Event-State Interactions Infused Heterogeneous Graph Neural Network for Conversational Emotion Recognition. 11136-11140 - Hui Lu, Xixin Wu, Haohan Guo, Songxiang Liu, Zhiyong Wu, Helen Meng:
Unifying One-Shot Voice Conversion and Cloning with Disentangled Speech Representations. 11141-11145 - Ziyang Ma, Wen Wu, Zhisheng Zheng, Yiwei Guo, Qian Chen, Shiliang Zhang, Xie Chen:
Leveraging Speech PTM, Text LLM, And Emotional TTS For Speech Emotion Recognition. 11146-11150 - Jian Huang, Yuanyuan Pu, Dongming Zhou, Hang Shi, Zhengpeng Zhao, Dan Xu, Jinde Cao:
Multimodal Sentiment Analysis Based on 3D Stereoscopic Attention. 11151-11155 - Suwon Shon, Kwangyoun Kim, Prashant Sridhar, Yi-Te Hsu, Shinji Watanabe, Karen Livescu:
Generative Context-Aware Fine-Tuning of Self-Supervised Speech Models. 11156-11160 - Yiming Wang, Jinyu Li:
Residualtransformer: Residual Low-Rank Learning With Weight-Sharing For Transformer Layers. 11161-11165 - Jae-Sung Bae, Joun Yeop Lee, Ji-Hyun Lee, Seongkyu Mun, Taehwa Kang, Hoon-Young Cho, Chanwoo Kim:
Latent Filling: Latent Space Data Augmentation for Zero-Shot Speech Synthesis. 11166-11170 - Amit Kumar Singh Yadav, Kratika Bhagtani, Sriram Baireddy, Paolo Bestagini, Stefano Tubaro, Edward J. Delp:
Mdrt: Multi-Domain Synthetic Speech Localization. 11171-11175 - Wenyuan Zhang, Xinghua Zhang, Shiyao Cui, Kun Huang, Xuebin Wang, Tingwen Liu:
Adaptive Data Augmentation for Aspect Sentiment Quad Prediction. 11176-11180 - Jiangyu Han, Federico Landini, Johan Rohdin, Mireia Díez, Lukás Burget, Yuhang Cao, Heng Lu, Jan Cernocký:
Diacorrect: Error Correction Back-End for Speaker Diarization. 11181-11185 - Hyunjun Heo, Ui-Hyeop Shin, Ran Lee, Youngju Cheon, Hyung-Min Park:
NeXt-TDNN: Modernizing Multi-Scale Temporal Convolution Backbone for Speaker Verification. 11186-11190 - Jiaxin Duan, Fengyu Lu, Junfei Liu:
Alleviating Hallucinations Via Supportive Window Indexing in Abstractive Summarization. 11191-11195 - Anja Virkkunen, Marek Sarvas, Guangpu Huang, Tamás Grósz, Mikko Kurimo:
Investigating the Clusters Discovered By Pre-Trained AV-HuBERT. 11196-11200 - Tiantian Feng, Rajat Hebbar, Shrikanth Narayanan:
TRUST-SER: On The Trustworthiness Of Fine-Tuning Pre-Trained Speech Embeddings For Speech Emotion Recognition. 11201-11205 - Xinkai Du, Quanjie Han, Yalin Sun, Chao Lv, Maosong Sun:
Label Dependencies-Aware Set Prediction Networks for Multi-Label Text Classification. 11206-11210 - Jinpeng Li, Wei-Qiang Zhang:
Whisper-Based Transfer Learning for Alzheimer Disease Classification: Leveraging Speech Segments with Full Transcripts as Prompts. 11211-11215 - Jiaxin Duan, Fengyu Lu, Junfei Liu:
Iterative Autoregressive Generation for Abstractive Summarization. 11216-11220 - Yuzhu Wang, Archontis Politis, Tuomas Virtanen:
Attention-Driven Multichannel Speech Enhancement in Moving Sound Source Scenarios. 11221-11225 - Tzu-Ting Yang, Hsin-Wei Wang, Yi-Cheng Wang, Chi-Han Lin, Berlin Chen:
An Effective Mixture-Of-Experts Approach For Code-Switching Speech Recognition Leveraging Encoder Disentanglement. 11226-11230 - Lauri Juvela, Xin Wang:
Collaborative Watermarking for Adversarial Speech Synthesis. 11231-11235 - Changli Tang, Wenyi Yu, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Extending Large Language Models for Speech and Audio Captioning. 11236-11240 - Yuqing Li, Wenyuan Zhang, Binbin Li, Siyu Jia, Zisen Qi, Xingbang Tan:
Dynamic Multi-Scale Context Aggregation for Conversational Aspect-Based Sentiment Quadruple Analysis. 11241-11245 - Zhongren Dong, Zixing Zhang, Weixiang Xu, Jing Han, Jianjun Ou, Björn W. Schuller:
HAFFormer: A Hierarchical Attention-Free Framework for Alzheimer's Disease Detection From Spontaneous Speech. 11246-11250 - Jingze Lu, Yuxiang Zhang, Wenchao Wang, Zengqiang Shang, Pengyuan Zhang:
One-Class Knowledge Distillation for Spoofing Speech Detection. 11251-11255 - Yuquan Lan, Dongxu Li, Yunqi Zhang, Hui Zhao, Gang Zhao:
RSED: Zero-Shot Relation Triplet Extraction via Relation Selection and Entity Boundary Detection. 11256-11260 - Kazuki Yamauchi, Yusuke Ijima, Yuki Saito:
STYLECAP: Automatic Speaking-Style Captioning from Speech Based on Speech and Language Self-Supervised Learning Models. 11261-11265 - Yu Gu, Qiushi Zhu, Guangzhi Lei, Chao Weng, Dan Su:
DurIAN-E 2: Duration Informed Attention Network with Adaptive Variational Autoencoder and Adversarial Learning for Expressive Text-to-Speech Synthesis. 11266-11270 - Yinghan Shen, Yu Yan, Dechun Yin, Huawei Shen:
BCC: Bidirectional Consistency Constraint Method for Hierarchical Text Classification. 11271-11275 - Karel Benes, Martin Kocour, Lukás Burget:
Hystoc: Obtaining Word Confidences for Fusion of End-To-End ASR Systems. 11276-11280 - Naohiro Tawara, Marc Delcroix, Atsushi Ando, Atsunori Ogawa:
NTT Speaker Diarization System for Chime-7: Multi-Domain, Multi-Microphone end-to-end and Vector Clustering Diarization. 11281-11285 - Itzik Malkiel, Yakir Yehuda, Jonathan Ephrath, Ori Katz, Oren Barkan, Nir Nice, Noam Koenigstein:
Unsupervised Topic-Conditional Extractive Summarization. 11286-11290 - Yudong Li, Yuhao Feng, Wen Zhou, Zhe Zhao, Linlin Shen, Cheng Hou, Xianxu Hou:
Dynamic Data Sampler for Cross-Language Transfer Learning in Large Language Models. 11291-11295 - Hang Shao, Bei Liu, Yanmin Qian:
One-Shot Sensitivity-Aware Mixed Sparsity Pruning for Large Language Models. 11296-11300 - Jian-Tao Zhang, Yan Song, Jin Li, Wu Guo, Hao-Yu Song, Ian McLoughlin:
Meta Representation Learning Method for Robust Speaker Verification in Unseen Domains. 11301-11305 - Hao Wang, Zhengshan Xue, Yikun Lei, Deyi Xiong:
End-to-End Speech Translation with Mutual Knowledge Distillation. 11306-11310 - Yi Ma, Kong Aik Lee, Ville Hautamäki, Meng Ge, Haizhou Li:
Gradient Weighting for Speaker Verification in Extremely Low Signal-to-Noise Ratio. 11311-11315 - Yuan Gao, Hao Shi, Chenhui Chu, Tatsuya Kawahara:
Enhancing Two-Stage Finetuning for Speech Emotion Recognition Using Adapters. 11316-11320 - Seunghee Han, Se Jin Park, Chae Won Kim, Yong Man Ro:
Persona Extraction Through Semantic Similarity for Emotional Support Conversation Generation. 11321-11325 - Liyizhe Peng, Zixing Zhang, Tao Pang, Jing Han, Huan Zhao, Hao Chen, Björn W. Schuller:
Customising General Large Language Models for Specialised Emotion Recognition Tasks. 11326-11330 - Mohammad Zeineldeen, Albert Zeyer, Ralf Schlüter, Hermann Ney:
Chunked Attention-Based Encoder-Decoder Model for Streaming Speech Recognition. 11331-11335 - Zhichao Wu, Qiulin Li, Sixing Liu, Qun Yang:
DCTTS: Discrete Diffusion Model with Contrastive Learning for Text-to-Speech Generation. 11336-11340 - Shivam Mehta, Ruibo Tu, Jonas Beskow, Éva Székely, Gustav Eje Henter:
Matcha-TTS: A Fast TTS Architecture with Conditional Flow Matching. 11341-11345 - Li Zhou, Wenyu Chen, Yong Cao, Dingyi Zeng, Wanlong Liu, Hong Qu:
MLPs Compass: What is Learned When MLPs are Combined with PLMs? 11346-11350 - Yu Xi, Hao Li, Baochen Yang, Haoyu Li, Hainan Xu, Kai Yu:
TDT-KWS: Fast and Accurate Keyword Spotting Using Token-and-Duration Transducer. 11351-11355 - Elio Gruttadauria, Mathieu Fontaine, Slim Essid:
Online Speaker Diarization of Meetings Guided by Speech Separation. 11356-11360 - Itzik Malkiel, Uri Alon, Yakir Yehuda, Shahar Keren, Oren Barkan, Royi Ronen, Noam Koenigstein:
SEGLLM: Topic-Oriented Call Segmentation Via LLM-Based Conversation Synthesis. 11361-11365 - Ziyi Ni, Minglun Han, Feilong Chen, Linghui Meng, Jing Shi, Pin Lv, Bo Xu:
ViLaS: Exploring the Effects of Vision and Language Context in Automatic Speech Recognition. 11366-11370 - Weiguang Chen, Tran The Anh, Xionghu Zhong, Eng Siong Chng:
Enhancing Low-Latency Speaker Diarization with Spatial Dictionary Learning. 11371-11375 - Andong Li, Rilin Chen, Yu Gu, Chao Weng, Dan Su:
Opine: Leveraging a Optimization-Inspired Deep Unfolding Method for Multi-Channel Speech Enhancement. 11376-11380 - Arun Baby, George Joseph, Shatrughan Singh:
Robust Speaker Personalisation Using Generalized Low-Rank Adaptation for Automatic Speech Recognition. 11381-11385 - Shoutao Guo, Shaolei Zhang, Yang Feng:
Glancing Future for Simultaneous Machine Translation. 11386-11390 - Mingyang Mei, Yue Hu, Yifan Deng, Xingsheng Zhang, Yunpeng Li, Hao You:
Summarizing Community-Based Question-Answer Pairs with Focus Rectification. 11391-11395 - Bingshen Mu, Pengcheng Guo, Dake Guo, Pan Zhou, Wei Chen, Lei Xie:
Automatic Channel Selection and Spatial Feature Integration for Multi-Channel Speech Recognition Across Various Array Topologies. 11396-11400 - Charlotte Fooks, Oliver Niebuhr:
Assessing Vibroacoustic Sound Massage Through The Biosignal of Human Speech: Evidence of Improved Wellbeing. 11401-11405 - Yongheng Zhang, Peng Wang, Qiguang Chen, Jingxuan Zhou, Yongmei Wang, Min Li, Libo Qin:
LabCLIP: Label-Enhanced Clip for Improving Zero-Shot Text Classification. 11406-11410 - Shihao Chen, Liping Chen, Jie Zhang, Kong-Aik Lee, Zhenhua Ling, Lirong Dai:
Adversarial Speech for Voice Privacy Protection from Personalized Speech Generation. 11411-11415 - Sevada Hovsepyan, Mathew Magimai-Doss:
Syllable Level Features for Parkinson's Disease Detection from Speech. 11416-11420 - Xiaoxiao Miao, Xin Wang, Erica Cooper, Junichi Yamagishi, Nicholas W. D. Evans, Massimiliano Todisco, Jean-François Bonastre, Mickael Rouvier:
Synvox2: Towards A Privacy-Friendly Voxceleb2 Dataset. 11421-11425 - Kangdi Mei, Zhaoci Liu, Hui-Peng Du, Hengyu Li, Yang Ai, Liping Chen, Zhenhua Ling:
Considering Temporal Connection between Turns for Conversational Speech Synthesis. 11426-11430 - Alexandros Haliassos, Andreas Zinonos, Rodrigo Mira, Stavros Petridis, Maja Pantic:
BRAVEn: Improving Self-supervised pre-training for Visual and Auditory Speech Recognition. 11431-11435 - Mohammed Maqsood Shaik, Dietrich Klakow, Badr M. Abdullah:
Self-Supervised Adaptive Pre-Training of Multilingual Speech Models for Language and Dialect Identification. 11436-11440 - Xiao-Ying Zhao, Qiushi Zhu, Yuchen Hu:
An Experimental Comparison of Noise-Robust Text-To-Speech Synthesis Systems Based On Self-Supervised Representation. 11441-11445 - Lin Wang, Haithm M. Al-Gunid, Ammar Hawbani, Yan Xiong:
Cooking-Clip: Context-Aware Language-Image Pretraining for Zero-Shot Recipe Generation. 11446-11450 - Weitai Zhang, Hanyi Zhang, Chenxuan Liu, Zhongyi Ye, Xinyuan Zhou, Chao Lin, Lirong Dai:
Pre-Trained Acoustic-and-Textual Modeling for End-To-End Speech-To-Text Translation. 11451-11455 - Zexu Pan, Gordon Wichern, François G. Germain, Sameer Khurana, Jonathan Le Roux:
NeuroHeed+: Improving Neuro-Steered Speaker Extraction with Joint Auditory Attention Detection. 11456-11460 - Zhenxi Lin, Ziheng Zhang, Xian Wu, Yefeng Zheng:
Improving Biomedical Entity Linking with Retrieval-Enhanced Learning. 11461-11465 - Rehan Ahmad, Muhammad Umar Farooq, Thomas Hain:
Progressive Unsupervised Domain Adaptation for ASR Using Ensemble Models and Multi-Stage Training. 11466-11470 - Kenichi Fujita, Hiroshi Sato, Takanori Ashihara, Hiroki Kanagawa, Marc Delcroix, Takafumi Moriya, Yusuke Ijima:
Noise-Robust Zero-Shot Text-to-Speech Synthesis Conditioned on Self-Supervised Speech-Representation Model with Adapters. 11471-11475 - Fenting Liu, Feifei Xiong, Yiya Hao, Kechenying Zhou, Chenhui Zhang, Jinwei Feng:
AS-pVAD: A Frame-Wise Personalized Voice Activity Detection Network with Attentive Score Loss. 11476-11480 - Xuankai Chang, Brian Yan, Kwanghee Choi, Jee-Weon Jung, Yichen Lu, Soumi Maiti, Roshan S. Sharma, Jiatong Shi, Jinchuan Tian, Shinji Watanabe, Yuya Fujita, Takashi Maekaku, Pengcheng Guo, Yao-Fei Cheng, Pavel Denisov, Kohei Saijo, Hsiu-Hsuan Wang:
Exploring Speech Recognition, Translation, and Understanding with Discrete Speech Units: A Comparative Study. 11481-11485 - Qifei Li, Yingming Gao, Cong Wang, Yayue Deng, Jinlong Xue, Yichen Han, Ya Li:
Frame-Level Emotional State Alignment Method for Speech Emotion Recognition. 11486-11490 - William Ravenscroft, Stefan Goetze, Thomas Hain:
Combining Conformer and Dual-Path-Transformer Networks for Single Channel Noisy Reverberant Speech Separation. 11491-11495 - Hongxuan Wang, Prahlad Vadakkepat:
Gradient-Based Dimensionality Reduction for Speech Emotion Recognition Using Deep Networks. 11496-11500 - Yusheng Tian, Jingyu Li, Tan Lee:
Creating Personalized Synthetic Voices from Articulation Impaired Speech Using Augmented Reconstruction Loss. 11501-11505 - Minjie Tang, Hao Huang, Wenbo Zhang, Liang He:
Phase Continuity-Aware Self-Attentive Recurrent Network with Adaptive Feature Selection for Robust VAD. 11506-11510 - Simin Huang, Peijie Huang, Yuhong Xu, Jingzhou Liang, Jingde Niu:
Exploring Label Hierarchy in Dialogue Intent Classification. 11511-11515 - Jiankai Zhu, Peijie Huang, Ziheng Ruan, Yuhui Zhu, Chaojie Liang, Yuhong Xu:
Anchor-Guided GAN with Contrastive Loss for Low-Resource Out-of-Domain Detection. 11516-11520 - Sen Liu, Yiwei Guo, Xie Chen, Kai Yu:
StoryTTS: A Highly Expressive Text-to-Speech Dataset with Rich Textual Expressiveness Annotations. 11521-11525 - Shikhar Bharadwaj, Min Ma, Shikhar Vashishth, Ankur Bapna, Sriram Ganapathy, Vera Axelrod, Siddharth Dalmia, Wei Han, Yu Zhang, Daan van Esch, Sandy Ritchie, Partha Talukdar, Jason Riesa:
Multimodal Modeling for Spoken Language Identification. 11526-11530 - Jian Wu, Naoyuki Kanda, Takuya Yoshioka, Rui Zhao, Zhuo Chen, Jinyu Li:
T-SOT FNT: Streaming Multi-Talker ASR with Text-Only Domain Adaptation Capability. 11531-11535 - Shaoyao Huang, Ziqiang Cao, Luozheng Qin, Jun Gao, Jun Zhang:
Contrastive Learning with High-Quality and Low-Quality Augmented Data for Query-Focused Summarization. 11536-11540 - Yumnah Mohamied, Peter Bell:
Bootstrap Predictive Coding: Investigating a Non-Contrastive Self-Supervised Learning Approach. 11541-11545 - Takaaki Saeki, Gary Wang, Nobuyuki Morioka, Isaac Elias, Kyle Kastner, Andrew Rosenberg, Bhuvana Ramabhadran, Heiga Zen, Françoise Beaufays, Hadar Shemtov:
Extending Multilingual Speech Synthesis to 100+ Languages without Transcribed Data. 11546-11550 - Jon Barker, Michael A. Akeroyd, Will Bailey, Trevor J. Cox, John F. Culling, Jennifer Firth, Simone Graetzer, Graham Naylor:
The 2nd Clarity Prediction Challenge: A Machine Learning Challenge for Hearing Aid Intelligibility Prediction. 11551-11555 - Haoxu Wang, Ming Cheng, Qiang Fu, Ming Li:
Robust Wake Word Spotting With Frame-Level Cross-Modal Attention Based Audio-Visual Conformer. 11556-11560 - Ziqian Wang, Xinfa Zhu, Zihan Zhang, Yuanjun Lv, Ning Jiang, Guoqing Zhao, Lei Xie:
SELM: Speech Enhancement using Discrete Tokens and Language Models. 11561-11565 - Qiang Huang, Feng Huang, Dehao Tao, Yuetong Zhao, Bingkun Wang, Yongfeng Huang:
CoQ: An Empirical Framework for Multi-hop Question Answering Empowered by Large Language Models. 11566-11570 - Liang Li, Qisheng Liao, Meiting Lai, Di Liang, Shangsong Liang:
Local and Global: Text Matching Via Syntax Graph Calibration. 11571-11575 - Zhangyin Feng, Yong Dai, Fan Zhang, Duyu Tang, Xiaocheng Feng, Shuangzhi Wu, Bing Qin, Yunbo Cao, Shuming Shi:
SkillNet-X: A Multilingual Multitask Model with Sparsely Activated Skills. 11576-11580 - Qian Wang, Jia-Chen Gu, Zhen-Hua Ling:
Multiscale Matching Driven by Cross-Modal Similarity Consistency for Audio-Text Retrieval. 11581-11585 - Yu Sheng, Linjing Li, Yifei Wang, Daniel Zeng:
Integrating Language Models with Symbolic Formulas for First-Order Logic Reasoning. 11586-11590 - Yishuang Li, Hukai Huang, Zhicong Chen, Wenhao Guan, Jiayan Lin, Lin Li, Qingyang Hong:
SR-HuBERT: An Efficient Pre-Trained Model for Speaker Verification. 11591-11595 - Zhenyu Zhou, Junhui Chen, Namin Wang, Lantian Li, Dong Wang:
An Investigation of Distribution Alignment in Multi-Genre Speaker Recognition. 11596-11600 - Liping Chen, Kong Aik Lee, Wu Guo, Zhen-Hua Ling:
Modeling Pseudo-Speaker Uncertainty in Voice Anonymization. 11601-11605 - John Harvill, Rinat Khaziev, Scarlett Li, Randy Cogill, Lidan Wang, Gopinath Chennupati, Hari Thadakamalla:
Significant ASR Error Detection for Conversational Voice Assistants. 11606-11610 - Haocheng Liu, Teysir Baoueb, Mathieu Fontaine, Jonathan Le Roux, Gaël Richard:
GLA-GRAD: A Griffin-Lim Extended Waveform Generation Diffusion Model. 11611-11615 - Chenyang Lyu, Wenxi Li, Tianbo Ji, Yi Yu, Longyue Wang:
Semantic Enrichment for Video Question Answering with Gated Graph Neural Networks. 11616-11620 - Yanan Zhang, Chaofan Wu, Rongkun Shi, Yiying Zhang:
Using Clustering to Improve the Performance of Few-Shot Learning. 11621-11625 - Gaobin Yang, Maokui He, Shutong Niu, Ruoyu Wang, Yanyan Yue, Shuangqing Qian, Shilong Wu, Jun Du, Chin-Hui Lee:
Neural Speaker Diarization Using Memory-Aware Multi-Speaker Embedding with Sequence-to-Sequence Architecture. 11626-11630 - Erik Goron, Lena Asai, Elias Rut, Martin Dinov:
Improving Domain Generalization in Speech Emotion Recognition with Whisper. 11631-11635 - Yuxiang Zhang, Jingze Lu, Zengqiang Shang, Wenchao Wang, Pengyuan Zhang:
Improving Short Utterance Anti-Spoofing with AASIST2. 11636-11640 - Yuncong Liu, Lu Chen, Kai Yu:
Label-Aware Auxiliary Learning for Dialogue State Tracking. 11641-11645 - Yong Wang, Cheng Lu, Hailun Lian, Yan Zhao, Björn W. Schuller, Yuan Zong, Wenming Zheng:
Speech Swin-Transformer: Exploring a Hierarchical Transformer with Shifted Windows for Speech Emotion Recognition. 11646-11650 - Navin Raj Prabhu, Bunlong Lay, Simon Welker, Nale Lehmann-Willenbrock, Timo Gerkmann:
EMOCONV-Diff: Diffusion-Based Speech Emotion Conversion for Non-Parallel and in-the-Wild Data. 11651-11655 - Zhangyin Feng, Xiaocheng Feng, Dezhi Zhao, Maojin Yang, Bing Qin:
Retrieval-Generation Synergy Augmented Large Language Models. 11661-11665 - Yu Xi, Baochen Yang, Hao Li, Jiaqi Guo, Kai Yu:
Contrastive Learning with Audio Discrimination for Customizable Keyword Spotting in Continuous Speech. 11666-11670 - Binbin Shen, Jie Wang, Meng Meng, Yujun Wang:
TNFormer: Single-Pass Multilingual Text Normalization with a Transformer Decoder Model. 11671-11675 - Yin-Jyun Luo, Simon Dixon:
Posterior Variance-Parameterised Gaussian Dropout: Improving Disentangled Sequential Autoencoders for Zero-Shot Voice Conversion. 11676-11680 - Siddhant Arora, George Saon, Shinji Watanabe, Brian Kingsbury:
Semi-Autoregressive Streaming ASR with Label Context. 11681-11685 - Zhichen Yuan, C. L. Philip Chen, Shuzhen Li, Tong Zhang:
Disentanglement Network: Disentangle the Emotional Features from Acoustic Features for Speech Emotion Recognition. 11686-11690 - Hongliang Sun, Xiaofeng Bi, Dianbo Sui, Zhiying Tu:
A Federated Graph to Embedding Approach for Knowledge Graph Completion. 11691-11695 - Cheng Lu, Yuan Zong, Hailun Lian, Yan Zhao, Björn W. Schuller, Wenming Zheng:
Improving Speaker-Independent Speech Emotion Recognition using Dynamic Joint Distribution Adaptation. 11696-11700 - Jin Zhao, Chao Liu, Jiaqing Liang, Zhixu Li, Yanghua Xiao:
A Novel Cascade Instruction Tuning Method for Biomedical NER. 11701-11705 - Hui Yan, Zhenchun Lei, Changhong Liu, Yong Zhou:
GMM-ResNext: Combining Generative and Discriminative Models for Speaker Verification. 11706-11710 - Danilo de Oliveira, Timo Gerkmann:
Distilling HuBERT with LSTMs via Decoupled Knowledge Distillation. 11711-11715 - Dario Albesano, Nicola Ferri, Felix Weninger, Puming Zhan:
Improving Speed/Accuracy Tradeoff for Online Streaming ASR via Real-Valued and Trainable Strides. 11716-11720 - Baifeng Li, Qingmu Liu, Yuhong Yang, Hongyang Chen, Weiping Tu, Song Lin:
EMALG: An Enhanced Mandarin Lombard Grid Corpus with Meaningful Sentences. 11721-11725 - Xuechen Zhao, Lei Tian, Feng Xie, Bin Zhou, Haiyang Wang, Hongzhou Wu, Liqun Gao:
MSFR: Stance Detection Based on Multi-Aspect Semantic Feature Representation via Hierarchical Contrastive Learning. 11726-11730 - Vishal Sunder, Beulah Karrolla, Eric Fosler-Lussier:
End-To-End Real Time Tracking of Children's Reading with Pointer Network. 11731-11735 - Wenhao Zhang, Shiyao Cui, Wenyuan Zhang, Xinghua Zhang, Tingwen Liu, Hongbo Xu:
Improving Chinese Spelling Correction with Text-Phonetics Differentiation and Adaptive Fusion. 11736-11740 - Takashi Maekaku, Jiatong Shi, Xuankai Chang, Yuya Fujita, Shinji Watanabe:
HuBERTopic: Enhancing Semantic Representation of HuBERT Through Self-Supervision Utilizing Topic Model. 11741-11745 - Feiyu Shen, Yiwei Guo, Chenpeng Du, Xie Chen, Kai Yu:
Acoustic BPE for Speech Generation with Discrete Tokens. 11746-11750 - Wei Liu, Ying Qin, Zhiyuan Peng, Tan Lee:
Sparsely Shared LoRA on Whisper for Child Speech Recognition. 11751-11755 - Rini A. Sharon, Debdoot Mukherjee:
Study of Abuse Detection in Continuous Speech for Indian Languages. 11756-11760 - Yi Huang, Wei Hu, Junlan Feng:
A Generative Adversarial Framework for Dialogue Generation with Neural Architecture Search. 11761-11765 - Haotian Wang, Jun Du, Yusheng Dai, Chin-Hui Lee, Yuling Ren, Yu Liu:
Improving Multi-Modal Emotion Recognition Using Entropy-Based Fusion and Pruning-Based Network Architecture Optimization. 11766-11770 - Qin Zhang, Hao Ge, Xiaojun Chen, Meng Fang:
Unsupervised Multiple Choices Question Answering Via Universal Corpus. 11771-11775 - Yan Liu, Yazheng Yang, Xiaokang Chen:
Improving Long Text Understanding with Knowledge Distilled from Summarization Model. 11776-11780 - Wen Huang, Bing Han, Shuai Wang, Zhengyang Chen, Yanmin Qian:
Robust Cross-Domain Speaker Verification with Multi-Level Domain Adapters. 11781-11785 - Guorui Yu, Yimin Hu, Yiqian Xu, Yuejie Zhang, Rui Feng, Tao Zhang, Shang Gao:
Exploring Object-Centered External Knowledge for Fine-Grained Video Paragraph Captioning. 11786-11790 - Wei-Cheng Lin, Shabnam Ghaffarzadegan, Luca Bondi, Abinaya Kumar, Samarjit Das, Ho-Hsiang Wu:
CLAP4Emo: ChatGPT-Assisted Speech Emotion Retrieval with Natural Language Supervision. 11791-11795 - Kevin Cai, Chonghua Liu, David M. Chan:
Anim-400K: A Large-Scale Dataset for Automated End to End Dubbing of Video. 11796-11800 - Guanlong Zhao, Yongqiang Wang, Jason Pelecanos, Yu Zhang, Hank Liao, Yiling Huang, Han Lu, Quan Wang:
USM-SCD: Multilingual Speaker Change Detection Based on Large Pretrained Foundation Models. 11801-11805 - David M. Chan, Shalini Ghosh, Hitesh Tulsiani, Ariya Rastrow, Björn Hoffmeister:
Task Oriented Dialogue as a Catalyst for Self-Supervised Automatic Speech Recognition. 11806-11810 - Xin Zhang, Yu Liu, Zhehuan Zhao:
Semantics Driven Multi-View Knowledge Graph Embedding for Cross-Lingual Entity Alignment. 11811-11815 - Rohit Prabhavalkar, Zhong Meng, Weiran Wang, Adam Stooke, Xingyu Cai, Yanzhang He, Arun Narayanan, Dongseong Hwang, Tara N. Sainath, Pedro J. Moreno:
Extreme Encoder Output Frame Rate Reduction: Improving Computational Latencies of Large End-to-End Models. 11816-11820 - Sergio Duarte Torres, Arunasish Sen, Aman Rana, Lukas Drude, Alejandro Gomez-Alanis, Andreas Schwarz, Leif Rädel, Volker Leutnant:
Promptformer: Prompted Conformer Transducer for ASR. 11821-11825 - Bohao Yang, Chen Tang, Chenghua Lin:
Improving Medical Dialogue Generation with Abstract Meaning Representations. 11826-11830 - Ruizhe Huang, Xiaohui Zhang, Zhaoheng Ni, Li Sun, Moto Hira, Jeff Hwang, Vimal Manohar, Vineel Pratap, Matthew Wiesner, Shinji Watanabe, Daniel Povey, Sanjeev Khudanpur:
Less Peaky and More Accurate CTC Forced Alignment by Label Priors. 11831-11835 - Keqi Deng, Philip C. Woodland:
FastInject: Injecting Unpaired Text Data into CTC-Based ASR Training. 11836-11840 - Bogdan Vlasenko, Sargam Vyas, Mathew Magimai-Doss:
Comparing Data-Driven and Handcrafted Features for Dimensional Emotion Recognition. 11841-11845 - Yan Zhao, Jincen Wang, Cheng Lu, Sunan Li, Björn W. Schuller, Yuan Zong, Wenming Zheng:
Emotion-Aware Contrastive Adaptation Network for Source-Free Cross-Corpus Speech Emotion Recognition. 11846-11850 - Krishna Subramani, Jean-Marc Valin, Jan Büthe, Paris Smaragdis, Mike Goodwin:
Noise-Robust DSP-Assisted Neural Pitch Estimation With Very Low Complexity. 11851-11855 - Samuele Cornell, Jee-Weon Jung, Shinji Watanabe, Stefano Squartini:
One Model to Rule Them All? Towards End-to-End Joint Speaker Diarization and Speech Recognition. 11856-11860 - Woan-Shiuan Chien, Shreya G. Upadhyay, Chi-Chun Lee:
Balancing Speaker-Rater Fairness for Gender-Neutral Speech Emotion Recognition. 11861-11865 - Rijul Gupta, Catherine J. Madill, Dhanshree R. Gunjawate, Duy Duong Nguyen, Craig T. Jin:
Addressing Data Scarcity in Voice Disorder Detection with Self-Supervised Models. 11866-11870 - Dominik Klement, Mireia Díez, Federico Landini, Lukás Burget, Anna Silnova, Marc Delcroix, Naohiro Tawara:
Discriminative Training of VBx Diarization. 11871-11875 - Hao Yen, Sabato Marco Siniscalchi, Chin-Hui Lee:
Boosting End-to-End Multilingual Phoneme Recognition Through Exploiting Universal Speech Attributes Constraints. 11876-11880 - Meena M. Chandra Shekar, John H. L. Hansen:
Apollo's Unheard Voices: Graph Attention Networks for Speaker Diarization and Clustering for Fearless Steps Apollo Collection. 11881-11885 - Tobias Cord-Landwehr, Christoph Böddeker, Catalin Zorila, Rama Doddipatla, Reinhold Haeb-Umbach:
Geodesic Interpolation of Frame-Wise Speaker Embeddings for the Diarization of Meeting Scenarios. 11886-11890 - Huazhen Wang, Huan Wang, Jianguo Chen, Shiyue Zhu, Hao Zhou, Yifei Zhao:
Communication-Oriented Automatic Assessment System for Accented Spoken Chinese in Read-Aloud Tasks. 11891-11895 - Peng-Jen Chen, Bowen Shi, Kelvin Niu, Ann Lee, Wei-Ning Hsu:
M2BART: Multilingual and Multimodal Encoder-Decoder Pre-Training for Any-to-Any Machine Translation. 11896-11900 - Yang Li, Liangzhen Lai, Yuan Shangguan, Forrest N. Iandola, Zhaoheng Ni, Ernie Chang, Yangyang Shi, Vikas Chandra:
Folding Attention: Memory and Power Optimization for On-Device Transformer-Based Streaming Speech Recognition. 11901-11905 - Dongmei Wang, Xiong Xiao, Naoyuki Kanda, Midia Yousefi, Takuya Yoshioka, Jian Wu:
Profile-Error-Tolerant Target-Speaker Voice Activity Detection. 11906-11910 - David Palzer, Matthew Maciejewski, Eric Fosler-Lussier:
Improving Neural Diarization through Speaker Attribute Attractors and Local Dependency Modeling. 11911-11915 - Dan-Andrei Iliescu, Devang S. Ram Mohan, Tian Huey Teh, Zack Hodari:
Controllable Prosody Generation with Partial Inputs. 11916-11920 - Amrutha Prasad, Andrés Carofilis, Geoffroy Vanderreydt, Driss Khalil, Srikanth R. Madikeri, Petr Motlícek, Christof Schüpbach:
Fine-Tuning Self-Supervised Models for Language Identification Using Orthonormal Constraint. 11921-11925 - Rongjie Shi, Oliver Niebuhr, Wentao Gu, Nafiseh Taghva:
The Effects of Loudness and Smiling on Timbre Features: Implications for Charismatic Voices in Mandarin, German and Danish. 11926-11930 - Avihu Dekel, Slava Shechtman, Raul Fernandez, David Haws, Zvi Kons, Ron Hoory:
Speak While You Think: Streaming Speech Synthesis During Text Generation. 11931-11935 - Hira Dhamyal, Benjamin Elizalde, Soham Deshmukh, Huaming Wang, Bhiksha Raj, Rita Singh:
Prompting Audios Using Acoustic Properties for Emotion Representation. 11936-11940 - Brian Yan, Xuankai Chang, Antonios Anastasopoulos, Yuya Fujita, Shinji Watanabe:
Cross-Modal Multi-Tasking for Speech-to-Text Translation via Hard Parameter Sharing. 11941-11945 - Xinglong Wu, C. L. Philip Chen, Shuzhen Li, Tony Zhang:
Snapshot Prompt Ensemble for Parameter-Efficient Soft Prompt Transfer. 11946-11950 - Ju Lin, Niko Moritz, Yiteng Huang, Ruiming Xie, Ming Sun, Christian Fuegen, Frank Seide:
AGADIR: Towards Array-Geometry Agnostic Directional Speech Recognition. 11951-11955 - Yuan Xu, Meng Yang:
MCM-CSD: Multi-Granularity Context Modeling with Contrastive Speaker Detection for Emotion Recognition in Real-Time Conversation. 11956-11960 - Bo Li, Yuyan Chen, Liang Zeng:
Kenet: Knowledge-Enhanced DOC-Label Attention Network for Multi-Label Text Classification. 11961-11965 - David Budaghyan, Charles C. Onu, Arsenii Gorin, Cem Subakan, Doina Precup:
CryCeleb: A Speaker Verification Dataset Based on Infant Cry Sounds. 11966-11970 - Amir Hussein, Brian Yan, Antonios Anastasopoulos, Shinji Watanabe, Sanjeev Khudanpur:
Enhancing End-to-End Conversational Speech Translation Through Target Language Context Utilization. 11971-11975 - Debaditya Shome, Ali Etemad:
Speech Emotion Recognition with Distilled Prosodic and Linguistic Affect Representations. 11976-11980 - Adam Sabra, Cyprian M. Wronka, Michelle Mao, Samer L. Hijazi:
SECP: A Speech Enhancement-Based Curation Pipeline for Scalable Acquisition of Clean Speech. 11981-11985 - Jiawen Kang, Lingwei Meng, Mingyu Cui, Haohan Guo, Xixin Wu, Xunying Liu, Helen Meng:
Cross-Speaker Encoding Network for Multi-Talker Speech Recognition. 11986-11990 - Zili Huang, Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
UniX-Encoder: A Universal X-Channel Speech Encoder for Ad-Hoc Microphone Array Speech Processing. 11991-11995 - Min Zhang, Jianfeng He, Shuo Lei, Murong Yue, Linhan Wang, Chang-Tien Lu:
Can LLM Find the Green Circle? Investigation and Human-Guided Tool Manipulation for Compositional Generalization. 11996-12000 - Vahid Ahmadi Kalkhorani, Anurag Kumar, Ke Tan, Buye Xu, DeLiang Wang:
Audiovisual Speaker Separation with Full- and Sub-Band Modeling in the Time-Frequency Domain. 12001-12005 - Amir Hussein, Dorsa Zeinali, Ondrej Klejch, Matthew Wiesner, Brian Yan, Shammur Absar Chowdhury, Ahmed Ali, Shinji Watanabe, Sanjeev Khudanpur:
Speech Collage: Code-Switched Audio Generation by Collaging Monolingual Corpora. 12006-12010 - Petar Ivanov, Ivan Koychev, Momchil Hardalov, Preslav Nakov:
Detecting Check-Worthy Claims in Political Debates, Speeches, and Interviews Using Audio Data. 12011-12015 - Hussein Yusufali, Roger K. Moore, Stefan Goetze:
Refining Text Input For Augmentative and Alternative Communication (AAC) Devices: Analysing Language Model Layers For Optimisation. 12016-12020 - Paula Andrea Pérez-Toro, Judith Dineley, Agnieszka Kaczkowska, Pauline Conde, Yuezhou Zhang, Faith Matcham, Sara Siddi, Josep Maria Haro, Stuart Bruce, Til Wykes, Raquel Bailón, Srinivasan Vairavan, Richard J. B. Dobson, Andreas K. Maier, Elmar Nöth, Juan Rafael Orozco-Arroyave, Vaibhav A. Narayan, Nicholas Cummins:
Longitudinal Modeling of Depression Shifts Using Speech and Language. 12021-12025 - Hainan Xu, Zhehuai Chen, Fei Jia, Boris Ginsburg:
Transducers with Pronunciation-Aware Embeddings for Automatic Speech Recognition. 12026-12030 - Abinay Reddy Naini, Mary A. Kohler, Elizabeth Richerson, Donita Robinson, Carlos Busso:
Generalization of Self-Supervised Learning-Based Representations for Cross-Domain Speech Emotion Recognition. 12031-12035 - Luz Martinez-Lucas, Carlos Busso:
Dynamic Speech Emotion Recognition Using A Conditional Neural Process. 12036-12040 - Vahid Noroozi, Somshubra Majumdar, Ankur Kumar, Jagadeesh Balam, Boris Ginsburg:
Stateful Conformer with Cache-Based Inference for Streaming Automatic Speech Recognition. 12041-12045 - Yihan Cao, Xu Chen, Lun Du, Hao Chen, Qiang Fu, Shi Han, Yushu Du, Yanbin Kang, Guangming Lu, Zi Li:
TAROT: A Hierarchical Framework with Multitask Co-Pretraining on Semi-Structured Data Towards Effective Person-Job Fit. 12046-12050 - Alexander H. Liu, Sung-Lin Yeh, James R. Glass:
Revisiting Self-supervised Learning of Speech Representation from a Mutual Information Perspective. 12051-12055 - Mingqiu Wang, Izhak Shafran, Hagen Soltau, Wei Han, Yuan Cao, Dian Yu, Laurent El Shafey:
Retrieval Augmented End-to-End Spoken Dialog Models. 12056-12060 - Cheol Jun Cho, Abdelrahman Mohamed, Alan W. Black, Gopala Krishna Anumanchipalli:
Self-Supervised Models of Speech Infer Universal Articulatory Kinematics. 12061-12065 - Sudipta Paul, Lingyu Zhang, Yilin Shen, Hongxia Jin:
Enabling Device Control Planning Capabilities of Small Language Model. 12066-12070 - Jee-Weon Jung, Roshan S. Sharma, William Chen, Bhiksha Raj, Shinji Watanabe:
AugSumm: Towards Generalizable Speech Summarization Using Synthetic Labels from Large Language Models. 12071-12075 - Cheol Jun Cho, Abdelrahman Mohamed, Shang-Wen Li, Alan W. Black, Gopala Krishna Anumanchipalli:
SD-HuBERT: Sentence-Level Self-Distillation Induces Syllabic Organization in HuBERT. 12076-12080 - Ismail Rasim Ulgen, Zongyang Du, Carlos Busso, Berrak Sisman:
Revealing Emotional Clusters in Speaker Embeddings: A Contrastive Learning Strategy for Speech Emotion Recognition. 12081-12085 - Amit Meghanani, Thomas Hain:
SCORE: Self-Supervised Correspondence Fine-Tuning for Improved Content Representations. 12086-12090 - Xiangheng He, Junjie Chen, Björn W. Schuller:
Task Selection and Assignment for Multi-Modal Multi-Task Dialogue Act Classification with Non-Stationary Multi-Armed Bandits. 12091-12095 - Jiyang Tang, Kwangyoun Kim, Suwon Shon, Felix Wu, Prashant Sridhar:
Improving ASR Contextual Biasing with Guided Attention. 12096-12100 - Zhenchun Lei, Hui Yan, Changhong Liu, Yong Zhou, Minglei Ma:
GMM-ResNet2: Ensemble of Group ResNet Networks for Synthetic Speech Detection. 12101-12105 - Zhehao Zhang, Tong Yu, Handong Zhao, Kaige Xie, Lina Yao, Shuai Li:
Exploring Soft Prompt Initialization Strategy for Few-Shot Continual Text Classification. 12106-12110 - Krishna C. Puvvada, Nithin Rao Koluguri, Kunal Dhawan, Jagadeesh Balam, Boris Ginsburg:
Discrete Audio Representation as an Alternative to Mel-Spectrograms for Speaker and Speech Recognition. 12111-12115 - Tiantian Feng, Shrikanth Narayanan:
Foundation Model Assisted Automatic Speech Emotion Recognition: Transcribing, Annotating, and Augmenting. 12116-12120 - Jinhan Wang, Long Chen, Aparna Khare, Anirudh Raju, Pranav Dheram, Di He, Minhua Wu, Andreas Stolcke, Venkatesh Ravichandran:
Turn-Taking and Backchannel Prediction with Acoustic and Large Language Model Fusion. 12121-12125 - Enting Zhou, You Zhang, Zhiyao Duan:
Learning Arousal-Valence Representation from Categorical Emotion Labels of Speech. 12126-12130 - Mufan Sang, John H. L. Hansen:
Efficient Adapter Tuning of Pre-Trained Speech Models for Automatic Speaker Verification. 12131-12135 - Chien-Yu Huang, Ke-Han Lu, Shih-Heng Wang, Chi-Yuan Hsiao, Chun-Yi Kuan, Haibin Wu, Siddhant Arora, Kai-Wei Chang, Jiatong Shi, Yifan Peng, Roshan S. Sharma, Shinji Watanabe, Bhiksha Ramakrishnan, Shady Shehata, Hung-Yi Lee:
Dynamic-Superb: Towards a Dynamic, Collaborative, and Comprehensive Instruction-Tuning Benchmark For Speech. 12136-12140 - Zehua Zhou, Haoyuan Yang, Takahiro Shinozaki:
Self-Supervised Speaker Verification with Adaptive Threshold and Hierarchical Training. 12141-12145 - Haobin Tang, Xulong Zhang, Ning Cheng, Jing Xiao, Jianzong Wang:
ED-TTS: Multi-Scale Emotion Modeling Using Cross-Domain Emotion Diarization for Emotional Speech Synthesis. 12146-12150 - Junlong Deng, Yanzhen Ren, Tong Zhang, Hongcheng Zhu, Zongkun Sun:
VFD-Net: Vocoder Fingerprints Detection for Fake Audio. 12151-12155 - Yongyi Zang, You Zhang, Mojtaba Heydari, Zhiyao Duan:
SingFake: Singing Voice Deepfake Detection. 12156-12160 - Zeyu Yang, Minchuan Chen, Yanping Li, Wei Hu, Shaojun Wang, Jing Xiao, Zijian Li:
ESVC: Combining Adaptive Style Fusion and Multi-Level Feature Disentanglement for Expressive Singing Voice Conversion. 12161-12165 - Qihao Yang, Xuelin Wang, Yong Li, Lap-Kei Lee, Fu Lee Wang, Tianyong Hao:
MTA: A Lightweight Multilingual Text Alignment Model for Cross-Language Visual Word Sense Disambiguation. 12166-12170 - Hanzhao Li, Xinfa Zhu, Liumeng Xue, Yang Song, Yunlin Chen, Lei Xie:
Spontts: Modeling and Transferring Spontaneous Style for TTS. 12171-12175 - Chen Xu, Xiaoqian Liu, Erfeng He, Yuhao Zhang, Qianqian Dong, Tong Xiao, Jingbo Zhu, Dapeng Man, Wu Yang:
Bridging the Gaps of Both Modality and Language: Synchronous Bilingual CTC for Speech Translation and Speech Recognition. 12176-12180 - Tsun-An Hsieh, Jacob Donley, Daniel Wong, Buye Xu, Ashutosh Pandey:
On the Importance of Neural Wiener Filter for Resource Efficient Multichannel Speech Enhancement. 12181-12185 - Hyun-Wook Yoon, Jin-Seob Kim, Ryuichi Yamamoto, Ryo Terashima, Chan-Ho Song, Jae-Min Kim, Eunwoo Song:
Enhancing Multilingual TTS with Voice Conversion Based Data Augmentation and Posterior Embedding. 12186-12190 - Junjie Wu, Lemao Liu, Wei Bi, Dit-Yan Yeung:
Rethinking Targeted Adversarial Attacks for Neural Machine Translation. 12191-12195 - Heitor R. Guimarães, Arthur Pimentel, Anderson R. Avila, Tiago H. Falk:
VIC-KD: Variance-Invariance-Covariance Knowledge Distillation to Make Keyword Spotting More Robust Against Adversarial Attacks. 12196-12200 - Jiamin Xie, Ke Li, Jinxi Guo, Andros Tjandra, Yuan Shangguan, Leda Sari, Chunyang Wu, Junteng Jia, Jay Mahadeokar, Ozlem Kalinli:
Dynamic ASR Pathways: An Adaptive Masking Approach Towards Efficient Pruning of a Multilingual ASR Model. 12201-12205 - Ashutosh Pandey, Buye Xu:
Decoupled Spatial and Temporal Processing for Resource Efficient Multichannel Speech Enhancement. 12206-12210 - Sijie Feng, Haoxiang Su, Hongyan Xie, Di Wu, Hao Huang, Wushour Silamu:
Fact-Aware Summarization with Contrastive Learning for Few-Shot Dialogue State Tracking. 12211-12215 - Sadia Nowrin, Keith Vertanen:
Leveraging Large Pretrained Models for Line-by-Line Spoken Program Recognition. 12216-12220 - Haozhe Shan, Albert Gu, Zhong Meng, Weiran Wang, Krzysztof Choromanski, Tara N. Sainath:
Augmenting Conformers With Structured State-Space Sequence Models For Online Speech Recognition. 12221-12225 - Rupak Vignesh Swaminathan, Grant P. Strimel, Ariya Rastrow, Sri Harish Mallidi, Kai Zhen, Hieu Duy Nguyen, Nathan Susanj, Athanasios Mouchtaris:
Max-Margin Transducer Loss: Improving Sequence-Discriminative Training Using a Large-Margin Learning Strategy. 12226-12230 - Pranay Dighe, Yi Su, Shangshang Zheng, Yunshu Liu, Vineet Garg, Xiaochuan Niu, Ahmed H. Tewfik:
Leveraging Large Language Models for Exploiting ASR Uncertainty. 12231-12235 - Ze Li, Yuke Lin, Ning Jiang, Xiaoyi Qin, Guoqing Zhao, Haiying Wu, Ming Li:
Multi-Objective Progressive Clustering for Semi-Supervised Domain Adaptation in Speaker Verification. 12236-12240 - Bang Zeng, Ming Cheng, Yao Tian, Haifeng Liu, Ming Li:
Efficient Personal Voice Activity Detection with Wake Word Reference Speech. 12241-12245 - Jin Jiang, Xiaojun Wan, Wei Peng, Rongjun Li, Jingyuan Yang, Yanquan Zhou:
Cross Modal Training for ASR Error Correction with Contrastive Learning. 12246-12250 - Hongshen Xu, Ruisheng Cao, Su Zhu, Sheng Jiang, Hanchong Zhang, Lu Chen, Kai Yu:
A Birgat Model for Multi-Intent Spoken Language Understanding with Hierarchical Semantic Frames. 12251-12255 - Gowtham Ramesh, Kartik Audhkhasi, Bhuvana Ramabhadran:
Task Vector Algebra for ASR Models. 12256-12260 - Ziwei He, Jian Yuan, Le Zhou, Jingwen Leng, Bo Jiang:
Fovea Transformer: Efficient Long-Context Modeling with Structured Fine-To-Coarse Attention. 12261-12265 - Hwabyeong Chae, Sunggu Lee:
Small-Footprint Convolutional Neural Network with Reduced Feature Map for Voice Activity Detection. 12266-12270 - Mengbo Li, Yuanzhong Zheng, Dichucheng Li, Yulun Wu, Yaoxuan Wang, Haojun Fei:
MS-SENet: Enhancing Speech Emotion Recognition Through Multi-Scale Feature Fusion with Squeeze-and-Excitation Blocks. 12271-12275 - Hangting Chen, Jianwei Yu, Chao Weng:
Complexity Scaling for Speech Denoising. 12276-12280 - Max Morrison, Pranav Pawar, Nathan Pruyne, Jennifer Cole, Bryan Pardo:
Crowdsourced and Automatic Speech Prominence Estimation. 12281-12285 - Wenyao Cui, Jiahao Cai, Baohua Zhang, Yongyi Huang, Huaping Zhang:
Bridging the Gap: A Self-Learning Model Using Implicit Knowledge for Chinese Spelling Correction. 12286-12290 - Rosy Southwell, Wayne H. Ward, Viet Anh Trinh, Charis Clevenger, Clay Clevenger, Emily Watts, Jason G. Reitman, Sidney D'Mello, Jacob Whitehill:
Automatic Speech Recognition Tuned for Child Speech in the Classroom. 12291-12295 - Junjie Li, Yiwei Guo, Xie Chen, Kai Yu:
SEF-VC: Speaker Embedding Free Zero-Shot Voice Conversion with Cross Attention. 12296-12300 - Di Wu, Liting Jiang, Lili Yin, Kai Wang, Haoxiang Su, Zhe Li, Hao Huang:
Dual Level Intent-Slot Interaction for Improved Multi-Intent Spoken Language Understanding. 12301-12305 - Yuejiao Wang, Xixin Wu, Disong Wang, Lingwei Meng, Helen Meng:
UNIT-DSR: Dysarthric Speech Reconstruction System Using Speech Unit Normalization. 12306-12310 - Huimeng Wang, Zengrui Jin, Mengzhe Geng, Shujie Hu, Guinan Li, Tianzi Wang, Haoning Xu, Xunying Liu:
Enhancing Pre-Trained ASR System Fine-Tuning for Dysarthric Speech Recognition Using Adversarial Data Augmentation. 12311-12315 - Xueyuan Chen, Xi Wang, Shaofei Zhang, Lei He, Zhiyong Wu, Xixin Wu, Helen Meng:
Stylespeech: Self-Supervised Style Enhancing with VQ-VAE-Based Pre-Training for Expressive Audiobook Speech Synthesis. 12316-12320 - Hee-Soo Heo, Kihyun Nam, Bong-Jin Lee, Youngki Kwon, Minjae Lee, You Jin Kim, Joon Son Chung:
Rethinking Session Variability: Leveraging Session Embeddings for Session Robustness in Speaker Verification. 12321-12325 - Hsuan Su, Ting-Yao Hu, Hema Swetha Koppula, Raviteja Vemulapalli, Jen-Hao Rick Chang, Karren D. Yang, Gautam Varma Mantena, Oncel Tuzel:
Corpus Synthesis for Zero-Shot ASR Domain Adaptation Using Large Language Models. 12326-12330 - Feng Ma, Yanhui Tu, Maokui He, Ruoyu Wang, Shutong Niu, Lei Sun, Zhongfu Ye, Jun Du, Jia Pan, Chin-Hui Lee:
A Spatial Long-Term Iterative Mask Estimation Approach for Multi-Channel Speaker Diarization and Speech Recognition. 12331-12335 - Hyung Yong Kim, Byeong-Yeol Kim, Yunkyu Lim, Jihwan Park, Jinseok Park, Youshin Lim, Seung Woo Yu, Hanbin Lee:
Learning Contextualized Representation on Discrete Space Via Hierarchical Product Quantization. 12336-12340 - Xueyuan Chen, Yuejiao Wang, Xixin Wu, Disong Wang, Zhiyong Wu, Xunying Liu, Helen Meng:
Exploiting Audio-Visual Features with Pretrained AV-HuBERT for Multi-Modal Dysarthric Speech Reconstruction. 12341-12345 - Zilin Yuan, Borun Chen, Yimeng Dai, Yinghui Li, Hai-Tao Zheng, Rui Zhang:
An Anchor Learning Approach for Citation Field Learning. 12346-12350 - Kentaro Seki, Shinnosuke Takamichi, Takaaki Saeki, Hiroshi Saruwatari:
Diversity-Based Core-Set Selection for Text-to-Speech with Linguistic and Acoustic Features. 12351-12355 - Shefali Garg, Zhouyuan Huo, Khe Chai Sim, Suzan Schwartz, Mason Chua, Alëna Aksënova, Tsendsuren Munkhdalai, Levi King, Darryl Wright, Zion Mengesha, Dongseong Hwang, Tara N. Sainath, Françoise Beaufays, Pedro Moreno Mengibar:
Improving Speech Recognition for African American English with Audio Classification. 12356-12360 - Zhizhao Luo, Youchen Wang, Wenjun Ke, Rui Qi, Yikai Guo, Peng Wang:
Boosting LLMs with Ontology-Aware Prompt for NER Data Augmentation. 12361-12365 - Jaeyeon Kim, Injune Hwang, Kyogu Lee:
Learning Semantic Information from Raw Audio Signal Using Both Contextual and Phonetic Representations. 12366-12370 - Yu Zheng, Yajun Zhang, Chuanying Niu, Yibin Zhan, Yanhua Long, Dongxing Xu:
Score Calibration Based on Consistency Measure Factor for Speaker Verification. 12371-12375 - Subhajit Chaudhury, Keerthiram Murugesan, Thomas Carta, Kartik Talamadupula, Michiaki Tatsubori:
Leveraging Visual Handicaps for Text-Based Reinforcement Learning. 12376-12380 - Han Liu, Junjie Sun, Xiaotong Zhang, Hongyang Chen:
New Intent Discovery with Multi-View Clustering. 12381-12385 - Pham Viet Thanh, Ngo Thi Thu Huyen, Pham Ngoc Quan, Nguyen Thi Thu Trang:
A Robust Pitch-Fusion Model for Speech Emotion Recognition in Tonal Languages. 12386-12390 - Robin Netzorg, Bohan Yu, Andrea Guzman, Peter Wu, Luna McNulty, Gopala Krishna Anumanchipalli:
Towards an Interpretable Representation of Speaker Identity via Perceptual Voice Qualities. 12391-12395 - Yoonhyung Lee, Kyomin Jung:
Boosting Speech Enhancement with Clean Self-Supervised Features Via Conditional Variational Autoencoders. 12396-12400 - Jia Yi, Xiaoming Wu, Xiangzhi Liu:
Context-Guided and Syntactic Augmented Dual Graph Convolutional Network for Aspect-Based Sentiment Analysis. 12401-12405 - Egor Lakomkin, Chunyang Wu, Yassir Fathullah, Ozlem Kalinli, Michael L. Seltzer, Christian Fuegen:
End-to-End Speech Recognition Contextualization with Large Language Models. 12406-12410 - Seongmin Park, Kyungho Kim, Jaejin Seo, Jihwa Lee:
Unsupervised Extractive Dialogue Summarization in Hyperdimensional Space. 12411-12415 - Tao Li, Feng Wang, Wenhao Guan, Lingyan Huang, Qingyang Hong, Lin Li:
Improving Multi-Speaker ASR With Overlap-Aware Encoding And Monotonic Attention. 12416-12420 - Zhichao Yin, Binyuan Hui, Min Yang, Fei Huang, Yongbin Li:
DialCLIP: Empowering CLIP As Multi-Modal Dialog Retriever. 12421-12425 - Qiuming Zhao, Guangzhi Sun, Chao Zhang, Mingxing Xu, Thomas Fang Zheng:
Enhancing Quantised End-to-End ASR Models Via Personalisation. 12426-12430 - Krishna Gurugubelli, Sahil Mohamed, Rajesh Krishna K. S:
Comparative Study of Tokenization Algorithms for End-to-End Open Vocabulary Keyword Detection. 12431-12435 - Fulu Li, Zhiwen Xie, Guangyou Zhou:
Theme-Enhanced Hard Negative Sample Mining for Open-Domain Question Answering. 12436-12440 - Zhengjie Huang, Pingsheng Liu, Gerard de Melo, Liang He, Linlin Wang:
Generating Persona-Aware Empathetic Responses with Retrieval-Augmented Prompt Learning. 12441-12445 - Tsu-Hsien Shih, Chin-Yuan Yeh, Ming-Syan Chen:
Does Audio Deepfake Detection Rely on Artifacts? 12446-12450 - Ali Golmakani, Mostafa Sadeghi, Xavier Alameda-Pineda, Romain Serizel:
A Weighted-Variance Variational Autoencoder Model for Speech Enhancement. 12451-12455 - Takuma Okamoto, Yamato Ohtani, Tomoki Toda, Hisashi Kawai:
ConvNeXt-TTS And ConvNeXt-VC: ConvNeXt-Based Fast End-To-End Sequence-To-Sequence Text-To-Speech And Voice Conversion. 12456-12460 - Zih-Jyun Lin, Yi-Ju Chen, Po-Chih Kuo, Likai Huang, Chaur-Jong Hu, Cheng-Yu Chen:
Dementia Assessment Using Mandarin Speech with an Attention-Based Speech Recognition Encoder. 12461-12465 - Mostafa Sadeghi, Romain Serizel:
Posterior Sampling Algorithms for Unsupervised Speech Enhancement with Recurrent Variational Autoencoder. 12466-12470 - Cong Ma, Yaping Zhang, Yang Zhao, Yu Zhou, Chengqing Zong:
Vector Quantization Knowledge Transfer for End-to-End Text Image Machine Translation. 12471-12475 - Chyi-Jiunn Lin, Guan-Ting Lin, Yung-Sung Chuang, Wei-Lun Wu, Shang-Wen Li, Abdelrahman Mohamed, Hung-Yi Lee, Lin-Shan Lee:
SpeechDPR: End-To-End Spoken Passage Retrieval For Open-Domain Spoken Question Answering. 12476-12480 - Berné Nortier, Mostafa Sadeghi, Romain Serizel:
Unsupervised Speech Enhancement with Diffusion-Based Generative Models. 12481-12485 - Jennifer Williams, Karla Pizzi, Natalia A. Tomashenko, Sneha Das:
Anonymizing Speaker Voices: Easy to Imitate, Difficult to Recognize? 12491-12495 - Wei-Tung Hsu, Chin-Po Chen, Chi-Chun Lee:
Concealing Medical Condition by Node Toggling in ASR for Dementia Patients. 12496-12500 - Xi Chen, Jiakun Pei, Liumeng Xue, Mingyang Zhang:
Transfer the Linguistic Representations from TTS to Accent Conversion with Non-Parallel Data. 12501-12505 - Jean-Eudes Ayilo, Mostafa Sadeghi, Romain Serizel:
Diffusion-Based Speech Enhancement with a Weighted Generative-Supervised Learning Loss. 12506-12510 - Shaoxiang Dang, Tetsuya Matsumoto, Yoshinori Takeuchi, Hiroaki Kudo:
A Separation Priority Pipeline for Single-Channel Speech Separation in Noisy Environments. 12511-12515 - Hao Ma, Zhiyuan Peng, Mingjie Shao, Jing Li, Ju Liu:
Extending Whisper with Prompt Tuning to Target-Speaker ASR. 12516-12520 - Haoxiang Su, Sijie Feng, Hongyan Xie, Di Wu, Hao Huang, Zhongjiang He, Shuangyong Song, Ruiyu Fang, Xiaomeng Huang, Wushour Silamu:
Domain-Slot Aware Contrastive Learning for Improved Dialogue State Tracking. 12521-12525 - Shinnosuke Takamichi, Hiroki Maeda, Joonyong Park, Daisuke Saito, Hiroshi Saruwatari:
Do Learned Speech Symbols Follow Zipf's Law? 12526-12530 - Wanying Ge, Xin Wang, Junichi Yamagishi, Massimiliano Todisco, Nicholas W. D. Evans:
Spoofing Attack Augmentation: Can Differently-Trained Attack Models Improve Generalisation? 12531-12535 - Siya Xu, Xinyan Wen:
Automatic Design of Adapter Architectures for Enhanced Parameter-Efficient Fine-Tuning. 12536-12540 - Junwei Yin, Min Gao, Kai Shu, Jia Wang, Yinqiu Huang, Wei Zhou:
Fine-Grained Discrepancy Contrastive Learning for Robust Fake News Detection. 12541-12545 - Yong Zhang, Hanzhang Li, Zhitao Li, Ning Cheng, Ming Li, Jing Xiao, Jianzong Wang:
Leveraging Biases in Large Language Models: "bias-kNN" for Effective Few-Shot Learning. 12546-12550 - Doyeop Kwak, Jaemin Jung, Kihyun Nam, Youngjoon Jang, Jee-Weon Jung, Shinji Watanabe, Joon Son Chung:
VoxMM: Rich Transcription of Conversations in the Wild. 12551-12555 - Hongtao Wang, Ang Li:
Are Deep Neural Networks Robust to Named Entities? An Adversarial Attack and Defense Perspective. 12556-12560 - Takuhiro Kaneko, Hirokazu Kameoka, Kou Tanaka:
Training Generative Adversarial Network-Based Vocoder with Limited Data Using Augmentation-Conditional Discriminator. 12561-12565 - Yeonghyeon Lee, Inmo Yeon, Juhan Nam, Joon Son Chung:
VoiceLDM: Text-to-Speech with Environmental Context. 12566-12571 - Ashish Seth, Sreyan Ghosh, Srinivasan Umesh, Dinesh Manocha:
FusDom: Combining in-Domain and Out-of-Domain Knowledge for Continuous Self-Supervised Learning. 12572-12576 - Binzhu Sha, Xu Li, Zhiyong Wu, Ying Shan, Helen Meng:
Neural Concatenative Singing Voice Conversion: Rethinking Concatenation-Based Approach for One-Shot Singing Voice Conversion. 12577-12581 - Jiaxiang Chen, Yu Hong, Chaoqun Liu, Qingting Xu, Guodong Zhou:
Decoupling and Refilling: A Simple Data Augmentation Method for Aspect Term Extraction. 12582-12586 - Yuewei Zhang, Huanbin Zou, Jie Zhu:
A Two-Stage Framework in Cross-Spectrum Domain for Real-Time Speech Enhancement. 12587-12591 - Shashi Kumar, Srikanth R. Madikeri, Iuliia Nigmatulina, Esaú Villatoro-Tello, Petr Motlícek, Karthik Pandia, S. Pavankumar Dubagunta, Aravind Ganapathiraju:
Multitask Speech Recognition and Speaker Change Detection for Unknown Number of Speakers. 12592-12596 - Abderrahim Fathan, Jahangir Alam:
Self-Supervised Speaker Verification Employing A Novel Clustering Algorithm. 12597-12601 - Ilja Baumann, Dominik Wagner, Maria Schuster, Elmar Nöth, Tobias Bocklet:
Towards Interpretability of Automatic Phoneme Analysis in Cleft Lip and Palate Speech. 12602-12606 - Ahmed Amine Ben Abdallah, Ata Kabboudi, Amir Kanoun, Salah Zaiem:
Leveraging Data Collection and Unsupervised Learning for Code-Switched Tunisian Arabic Automatic Speech Recognition. 12607-12611 - Linfeng Yu, Wangyou Zhang, Chenpeng Du, Leying Zhang, Zheng Liang, Yanmin Qian:
Generation-Based Target Speech Extraction with Speech Discretization and Vocoder. 12612-12616 - Esaú Villatoro-Tello, Srikanth R. Madikeri, Bidisha Sharma, Driss Khalil, Shashi Kumar, Iuliia Nigmatulina, Petr Motlícek, Aravind Ganapathiraju:
Probability-Aware Word-Confusion-Network-To-Text Alignment Approach for Intent Classification. 12617-12621 - Dong-Min Byun, Sang-Hoon Lee, Ji-Sang Hwang, Seong-Whan Lee:
MIDI-Voice: Expressive Zero-Shot Singing Voice Synthesis via MIDI-Driven Priors. 12622-12626 - Zijian Yang, Wei Zhou, Ralf Schlüter, Hermann Ney:
On the Relation Between Internal Language Model and Sequence Discriminative Training for Neural Transducers. 12627-12631 - Suyeon Lee, Chaeyoung Jung, Youngjoon Jang, Jaehun Kim, Joon Son Chung:
Seeing Through The Conversation: Audio-Visual Speech Separation Based on Diffusion Model. 12632-12636 - Wenyi Yu, Changli Tang, Guangzhi Sun, Xianzhao Chen, Tian Tan, Wei Li, Lu Lu, Zejun Ma, Chao Zhang:
Connecting Speech Encoder and Large Language Model for ASR. 12637-12641 - Yong-Hyeok Lee, Namhyun Cho:
iPhonMatchNet: Zero-Shot User-Defined Keyword Spotting Using Implicit Acoustic Echo Cancellation. 12642-12646 - Zihao Zheng, Tao He, Ming Liu, Zhongyuan Wang, Ruiji Fu, Bing Qin:
Relational Graph-Bridged Image-Text Interaction: A Novel Method for Multi-Modal Relation Extraction. 12647-12651 - Mrinmoy Bhattacharjee, Iuliia Nigmatulina, Amrutha Prasad, Pradeep Rangappa, Srikanth R. Madikeri, Petr Motlícek, Hartmut Helmke, Matthias Kleinert:
Contextual Biasing Methods for Improving Rare Word Detection in Automatic Speech Recognition. 12652-12656 - Yuhan Liu, Neng Gao, Yifei Zhang, Zhe Kong:
Enhancing Document-Level Event Extraction via Structure-Aware Heterogeneous Graph with Multi-Granularity Subsentences. 12657-12661 - Shun Lei, Yixuan Zhou, Liyang Chen, Dan Luo, Zhiyong Wu, Xixin Wu, Shiyin Kang, Tao Jiang, Yahui Zhou, Yuxing Han, Helen Meng:
Improving Language Model-Based Zero-Shot Text-to-Speech Synthesis with Multi-Scale Acoustic Prompts. 12662-12666 - Wanli Sun, Zehai Tu, Anton Ragni:
Energy-Based Models for Speech Synthesis. 12667-12671 - Reo Shimizu, Ryuichi Yamamoto, Masaya Kawamura, Yuma Shirahata, Hironori Doi, Tatsuya Komatsu, Kentaro Tachibana:
PromptTTS++: Controlling Speaker Identity in Prompt-Based Text-To-Speech Using Natural Language Descriptions. 12672-12676 - Zongcheng Ji, Yinlong Xiao:
LLET: Lightweight Lexicon-Enhanced Transformer for Chinese NER. 12677-12681 - Heejin Choi, Jae-Sung Bae, Joun Yeop Lee, Seongkyu Mun, Jihwan Lee, Hoon-Young Cho, Chanwoo Kim:
MELS-TTS: Multi-Emotion Multi-Lingual Multi-Speaker Text-To-Speech System Via Disentangled Style Tokens. 12682-12686 - Jinxi Guo, Niko Moritz, Yingyi Ma, Frank Seide, Chunyang Wu, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer:
Effective Internal Language Model Training and Fusion for Factorized Transducer Model. 12687-12691 - Dehua Tao, Tan Lee, Harold Chui, Sarah Luk:
Modeling Intrapersonal and Interpersonal Influences for Automatic Estimation of Therapist Empathy in Counseling Conversation. 12692-12696 - Tianhua Qi, Wenming Zheng, Cheng Lu, Yuan Zong, Hailun Lian:
PAVITS: Exploring Prosody-Aware VITS for End-to-End Emotional Voice Conversion. 12697-12701 - Yinlin Guo, Haofan Huang, Xi Chen, He Zhao, Yuehai Wang:
Audio Deepfake Detection With Self-Supervised WavLM And Multi-Fusion Attentive Classifier. 12702-12706 - SooHwan Eom, Eunseop Yoon, Hee Suk Yoon, Chanwoo Kim, Mark Hasegawa-Johnson, Chang D. Yoo:
AdaMER-CTC: Connectionist Temporal Classification with Adaptive Maximum Entropy Regularization for Automatic Speech Recognition. 12707-12711 - Haozhou Li, Xinyuan Wang, Hongkai Du, Wentong Sun, Qinke Peng:
SADE: A Speaker-Aware Dual Encoding Model Based on Diagbert for Medical Triage and Pre-Diagnosis. 12712-12716 - Yinlong Xiao, Zongcheng Ji, Jianqiang Li:
Dust: Dual-Grained Syntax-Aware Transformer Network for Chinese Named Entity Recognition. 12717-12721 - Seung-Bin Kim, Sang-Hoon Lee, Seong-Whan Lee:
TranSentence: Speech-to-Speech Translation via Language-Agnostic Sentence-Level Speech Encoding without Language-Parallel Data. 12722-12726 - Yuxuan Yuan, Yue Zhou, Xiaodong Shi:
Memory-Augmented Speech-to-Text Translation with Multi-Scale Context Translation Strategy. 12727-12731 - Jingyu Li, Tan Lee:
Efficient Black-Box Speaker Verification Model Adaptation With Reprogramming And Backend Learning. 12732-12736 - Yong Ren, Tao Wang, Jiangyan Yi, Le Xu, Jianhua Tao, Chu Yuan Zhang, Junzuo Zhou:
Fewer-Token Neural Speech Codec with Time-Invariant Codes. 12737-12741 - Shentong Mo, Miao Xin:
Tree of Uncertain Thoughts Reasoning for Large Language Models. 12742-12746 - Thomas Rolland, Alberto Abad:
Exploring Adapters with Conformers for Children's Automatic Speech Recognition. 12747-12751 - Yassine El Kheir, Shammur Absar Chowdhury, Ahmed Ali:
L1-Aware Multilingual Mispronunciation Detection Framework. 12752-12756 - Thomas Rolland, Alberto Abad:
Improved Children's Automatic Speech Recognition Combining Adapters and Synthetic Data Augmentation. 12757-12761 - Yuxuan Li, Jianguo Wei, Qiang Fang, Xugang Lu:
Evaluation of an Improved Ultrasonic Imaging Helmet for Observing Articulatory Data. 12762-12766 - Chowdam Venkata Thirumala Kumar, Tanuka Bhattacharjee, Seena Vengalil, Saraswati Nashi, Madassu Keerthipriya, Yamini Belur, Atchayaram Nalini, Prasanta Kumar Ghosh:
Spectral Analysis of Vowels and Fricatives at Varied Levels of Dysarthria Severity for Amyotrophic Lateral Sclerosis. 12767-12771 - Caihua Yang, Jianzhu Bao, Bin Liang, Ruifeng Xu:
Enhancing Argumentative Relation Classification by Multi-Granularity Retrieval and Heterogeneous Graph Reasoning. 12772-12776 - Liangyi Kang, Jie Liu, Dan Ye, Zhiyang Zhou:
Context-Aware Dual Attention Network for Multimodal Sarcasm Detection. 12777-12781 - Yukiya Hono, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda:
PeriodGrad: Towards Pitch-Controllable Neural Vocoder Based on a Diffusion Probabilistic Model. 12782-12786 - Christopher Simic, Tobias Bocklet:
Self-Supervised Adaptive AV Fusion Module for Pre-Trained ASR Models. 12787-12791 - Yukai Wan, Yuqi Shi, Binghuai Lin, Yanlu Xie:
A Study of Mispronunciation Detection and Diagnosis Based on Meta-Learning. 12792-12796 - Hao Yang, Min Zhang, Daimeng Wei, Jiaxin Guo:
CSNet: Contrastive Siamese Network for Robust SLU. 12797-12801 - Anshuman Tripathi, Soheil Khorram, Han Lu, Jaeyoung Kim, Qian Zhang, Hasim Sak:
Monte Carlo Self-Training for Speech Recognition. 12802-12806 - Stephen R. Alty, Clive Cheong Took:
Widrow-Hoff LMS Adaline Demonstrator for Schools and Colleges. 12807-12810 - Shuaishuai Zu, Songtao Cai, Weitao Tang, Chuyu Wang, Li Li, Jun Shen:
GuessKT: Improving Knowledge Tracing via Considering Guess Behaviors. 12811-12815 - John H. L. Hansen, Aditya Joglekar, Meena M. Chandra Shekar, Szu-Jui Chen, Xi Liu:
Fearless Steps Apollo: Team Communications Based Community Resource Development for Science, Technology, Education, and Historical Preservation. 12816-12820 - Zelong Yi, Hua Chen, Zhiwei Jiang, Wei Liu, Qing Wang, Gang Wang:
3-D Near-Field Localization by Jointly Exploiting Spatial and Temporal Information Based on a Nonuniform Cross Array. 12821-12825 - Ohad Levy, Nir Shlezinger:
Rapid Hybrid Modular Receive Beamforming Via Learned Optimization. 12826-12830 - Zhilin Chen, Shihao Yan, Xiaobo Zhou, Feng Shu, Jiande Sun, Derrick Wing Kwan Ng:
IRS-Assisted Covert Communication with a BPP Distributed Warden outside a Safety Zone. 12831-12835 - Baptiste Chatelier, Luc Le Magoarou, Vincent Corlay, Matthieu Crussière:
Model-Based Learning for Location-to-Channel Mapping. 12836-12840 - Ismail Alkhouri, Shijun Liang, Rongrong Wang, Qing Qu, Saiprasad Ravishankar:
Diffusion-Based Adversarial Purification for Robust Deep MRI Reconstruction. 12841-12845 - Rongxuan Peng, Xianbo Mo, Shunquan Tan, Bin Li, Jiwu Huang:
A Keyless Extraction Framework Targeting at Deep Learning Based Image-Within-Image Models. 12846-12850 - Huiping Huang, Tianjian Zhang, Feng Yin, Bin Liao, Henk Wymeersch:
Joint DOA Estimation and Distorted Sensor Detection Under Entangled Low-Rank and Row-Sparse Constraints. 12851-12855 - Kevin Everson, Yile Gu, Chao-Han Huck Yang, Prashanth Gurunath Shivakumar, Guan-Ting Lin, Jari Kolehmainen, Ivan Bulyko, Ankur Gandhe, Shalini Ghosh, Wael Hamza, Hung-Yi Lee, Ariya Rastrow, Andreas Stolcke:
Towards ASR Robust Spoken Language Understanding Through in-Context Learning with Word Confusion Networks. 12856-12860 - Yufan Fan, Marius Pesavento:
Localization in Sensor Networks Using Distributed Low-Rank Matrix Completion. 12861-12865 - Keigo Takeuchi:
Decentralized Generalized Approximate Message-Passing for Tree-Structured Networks. 12866-12870 - Indrasish Ghosh, Arpan Chattopadhyay, Kumar Vijay Mishra, Athina P. Petropulu:
Multicast with Multiple Wardens in IRS-Aided Covert DFRC System. 12871-12875 - Saidur R. Pavel, Yimin D. Zhang, Shunqiao Sun, André L. F. de Almeida:
Tensor Reconstruction-Based Sparse Array 2-D DOA Estimation of Mixed Coherent and Uncorrelated Signals. 12876-12880 - Zhidi Lin, Juan Maroñas, Ying Li, Feng Yin, Sergios Theodoridis:
Towards Efficient Modeling and Inference in Multi-Dimensional Gaussian Process State-Space Models. 12881-12885 - Yao Tang, Guangxu Zhu, Wei Xu, Man Hon Cheung, Tat-Ming Lok, Shuguang Cui:
Integrating Sensing, Communication, and Computation in the Sky. 12886-12890 - Pranav Kulkarni, P. P. Vaidyanathan:
Sparse, Weight-Constrained Arrays With O(N) Aperture for Reduced Mutual Coupling. 12891-12895 - Hongyang Du, Guangyuan Liu, Dusit Niyato, Jiayi Zhang, Jiawen Kang, Zehui Xiong, Bo Ai, Dong In Kim:
Generative AI-Aided Joint Training-Free Secure Semantic Communications via Multi-Modal Prompts. 12896-12900 - Hallysson Oliveira, Stiven S. Dias, Marcelo G. S. Bruno:
Simultaneous Positioning and Tracking Using Dynamic Factor Graphs and Geometric Average Fusion. 12901-12905 - Thomas Kropfreiter, Florian Meyer, Franz Hlawatsch:
A Distributed Joint Integrated Probabilistic Data Association (JIPDA) Filter with Soft Object Association. 12906-12910 - Ban-Sok Shin, Dhruv Patel, Luis Wientgens, Dmitriy Shutin, Armin Dekorsy:
Multi-Agent 3D Seismic Exploration Using Adapt-then-Combine Full Waveform Inversion in a Hardware-in-the-Loop System. 12911-12915 - Chenyue Zhang, Hoi-To Wai:
Learning Multiplex Graph With Inter-Layer Coupling. 12916-12920 - Francesco Guidi, Anna Guerra, Emanuele Mengoli, Alberto Zanella:
A Statistical Characterization Of Communication Performance In RIS-Aided Networks. 12921-12925 - Qimei Chen, Yipeng Liang, Hao Jiang:
Integrated Sensing And Communication In Unlicensed mmWave Bands: Joint Beamforming Training And Energy Allocation. 12926-12930 - Xianyan Fu, Xiao-Lei Zhang, Chao-Han Huck Yang, Jun Qi:
Exploiting A Quantum Multiple Kernel Learning Approach For Low-Resource Spoken Command Recognition. 12931-12935 - Shana Moothedath, Namrata Vaswani:
Decentralized Low Rank Matrix Recovery from Column-Wise Projections by Alternating GD and Minimization. 12936-12940 - Wanqing Xiong, Zailiang Chen, Qing Liu, Wenjia Wu, Jian Zhang, Hailan Shen:
PaCaS-WAA: Patch-Based Contrastive Semi-Supervised Learning with Wavelet Guidance and Adaptive Augmentation for Tumour Segmentation. 12941-12945 - Yitao Zhang, Lan Lan, Guisheng Liao, Shengqi Zhu, Jingwei Xu, Hing Cheung So:
Mainlobe Deceptive Jammer Suppression Using FDA-MIMO Radar in the Presence of Multipath Propagation. 12946-12950 - Hao Shi, Kazuki Shimada, Masato Hirano, Takashi Shibuya, Yuichiro Koyama, Zhi Zhong, Shusuke Takahashi, Tatsuya Kawahara, Yuki Mitsufuji:
Diffusion-Based Speech Enhancement with Joint Generative and Predictive Decoders. 12951-12955 - Chengwei Wei, Runqi Pang, C.-C. Jay Kuo:
A Green Learning Approach to Spoofed Speech Detection. 12956-12960 - Shihang Lu, Fan Liu, Fuwang Dong, Yifeng Xiong, Jie Xu, Ya-Feng Liu:
Sensing with Random Signals. 12961-12965 - Jun Wu, Weijie Yuan, Zhiqiang Wei, Jinjin Yan, Derrick Wing Kwan Ng:
Optimal BER Minimum Precoder Design for OTFS-Based ISAC Systems. 12966-12970 - Onur Günlü, Matthieu R. Bloch, Rafael F. Schaefer, Aylin Yener:
Nonasymptotic Performance Limits of Low-Latency Secure Integrated Sensing and Communication Systems. 12971-12975 - Leonardo Spampinato, Enrico Testi, Chiara Buratti, Riccardo Marini:
MADRL-Based UAVs Trajectory Design with Anti-Collision Mechanism in Vehicular Networks. 12976-12980 - Yanling An, Shaohai Hu, Shuaiqi Liu, Zeyao Wang, Xinrui Wang, Xiaole Ma:
Cross-Subject EEG Emotion Recognition Based on Interconnected Dynamic Domain Adaptation. 12981-12985 - Simone Fiorellino, Claudio Battiloro, Paolo Di Lorenzo:
Topological Neural Networks over the Air. 12986-12990 - Shuaiqi Liu, Siqi Wang, Beibei Liang, Bing Li, Jianpeng Xu:
Diagnosis of Autism Spectrum Disorder Based on Contrastive Functional Connectivity Graph Learning Network. 12991-12995 - Petteri Pulkkinen, Visa Koivunen:
Partially Observable Model-Based Learning for ISAC Resource Allocation. 12996-13000 - Ling Zhao, Shuaiqi Liu, Bing Li, Wenjia Cai, Ping Liang, Jie Yu, Jie Zhao:
A Hybrid CNN-Transformer for Focal Liver Lesion Classification. 13001-13005 - Erik Sausa, Pavel Rajmic, Franz Hlawatsch:
Likelihood Consensus 2.0: Reducing Interagent Communication in Distributed Bayesian Target Tracking. 13006-13010 - Alejandro Gonzalez-Garrido, Jorge Querol, Symeon Chatzinotas:
Analysis of the SINR in LEO-PNT Systems with 5G PRS Multiplexing: Integration of PRS and NTN. 13016-13020 - Eleonora Grassucci, Yuki Mitsufuji, Ping Zhang, Danilo Comminiello:
Enhancing Semantic Communication with Deep Generative Models: An Overview. 13021-13025 - Xusheng Zhang, Cho-Chun Chiu, Ting He:
Energy-Efficient Decentralized Learning Via Graph Sparsification. 13026-13030 - Bong-Seok Kim, Jonghun Lee, Youngseok Jin, Sangdong Kim, Ram M. Narayanan:
MIMO Imaging Method with Iterative-Based Super-Resolution for Automotive Radar. 13031-13035 - Kwanyoung Kim, Jong Chul Ye:
Noise2one: One-Shot Image Denoising with Local Implicit Learning. 13036-13040 - Steve Blandino, Jihoon Bang, Jian Wang, Samuel Berweger, Jack Chuang, Jelena Senic, Tanguy Ropitault, Camillo Gentile, Nada Golmie:
Low Overhead DMG Sensing for Vital Signs Detection. 13041-13045 - Gourav Datta, Zeyu Liu, Peter A. Beerel:
Training Ultra-Low-Latency Spiking Neural Networks from Scratch. 13046-13050 - Nebiyou Yismaw, Ulugbek S. Kamilov, M. Salman Asif:
Parameter-Efficient Adaptation for Computational Imaging. 13051-13055 - Zahra Esmaeilbeig, Kumar Vijay Mishra, Mojtaba Soltanalian:
Space-Time Adaptive Processing for Radars in Connected and Automated Vehicular Platoons. 13056-13060 - Kris Li, David Ramirez, Kumar Vijay Mishra, Ashutosh Sabharwal:
Repurposing MU-MIMO Downlink For Joint Wireless Communications And Imaging Via Virtual Users. 13061-13065 - William Chen, Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe:
Train Long and Test Long: Leveraging Full Document Contexts in Speech Processing. 13066-13070 - Zeyu Jiang, Xiaohong Liu, Guoxing Yang, Weizhi Li, Aini Li, Guangyu Wang:
DIFFSC: Semantic Communication Framework With Enhanced Denoising Through Diffusion Probabilistic Models. 13071-13075 - Nicolò Michelusi:
CSI-Free Over-The-Air Decentralized Learning Over Frequency Selective Channels. 13076-13080 - Yunqiao Hu, Shunqiao Sun:
IHT-Inspired Neural Network for Single-Snapshot DOA Estimation with Sparse Linear Arrays. 13081-13085 - Lei Li, Tenghao Cai, Tsung-Hui Chang:
ISAC Beamforming Optimization For Robust Transmission In Dynamic mmWave MIMO Networks. 13086-13090 - Michael Macedo Diniz, Anderson Rocha:
Open-Set Deepfake Detection To Fight The Unknown. 13091-13095 - Hao-Hsuan Chang, Vishnu V. Ratnam, Hao Chen, Junsu Choi, Charlie Jianzhong Zhang:
Multi-Person Respiration Rate Estimation With Single Pair Of Transmit And Receive Antenna. 13096-13100 - Silvia Mura, Marouan Mizmizi, Umberto Spagnolini, Athina P. Petropulu:
Enhanced Channel Estimation in mm-Wave MIMO Systems Leveraging Integrated Communication and Sensing. 13101-13105 - Robert Pöhlmann, Emanuel Staudinger, Siwei Zhang, Armin Dammann:
A Joint Look on Lunar Satellite and Cooperative Surface PNT. 13106-13110 - Xiaohuan Wu, Jiang Wang, Yazhou Liu, Jianing Li:
A Near-Field Source Localization Method for Uniform/Sparse Centrally Symmetric Rectangular Arrays. 13111-13115 - Xugang Lu, Peng Shen, Yu Tsao, Hisashi Kawai:
Hierarchical Cross-Modality Knowledge Transfer with Sinkhorn Attention for CTC-Based ASR. 13116-13120 - Yehonatan Dahan, Guy Revach, Jindrich Duník, Nir Shlezinger:
Uncertainty Quantification in Deep Learning Based Kalman Filters. 13121-13125 - Chong Wang, Yi Yu, Lanqing Guo, Bihan Wen:
Benchmarking Adversarial Robustness of Image Shadow Removal with Shadow-Adaptive Attacks. 13126-13130 - Yujie Yang, Haochen Qin, Hang Zhou, Chengcheng Wang, Tianyu Guo, Kai Han, Yunhe Wang:
A Robust Audio Deepfake Detection System via Multi-View Feature. 13131-13135 - Eleonora Grassucci, Christian Marinoni, Andrea Rodriguez, Danilo Comminiello:
Diffusion Models for Audio Semantic Communication. 13136-13140 - Emilie Chouzenoux, Víctor Elvira:
Graphical Inference in Non-Markovian Linear-Gaussian State-Space Models. 13141-13145 - K. R. Stunnenberg, Richard C. Hendriks, J. L. Vroegop, M. L. Adank, Borbála Hunyadi:
Tensor Decomposition-Based Data Fusion for Biomarker Extraction from Multiple EEG Experiments. 13146-13150 - Al Depope, Marco Mondelli, Matthew R. Robinson:
Inference of Genetic Effects via Approximate Message Passing. 13151-13155 - Augusto Santos, Diogo Rente, Rui Seabra, José M. F. Moura:
Inferring the Graph of Networked Dynamical Systems under Partial Observability and Spatially Colored Noise. 13156-13160 - Jie Yang, Yixin Yang, Bin Liao:
Sparse Bayesian Synthetic Aperture Processing Based DOA Estimation with Deformed Towed Arrays. 13161-13165 - Lei Wang, Pinjun Zheng, Xing Liu, Tarig Ballal, Tareq Y. Al-Naffouri:
Beamforming Design and Performance Evaluation for RIS-Aided Localization Using LEO Satellite Signals. 13166-13170 - Yuan Liu, M. R. Bhavani Shankar, Linlong Wu, Björn E. Ottersten:
Debris Sensing Based on LEO Constellation: An Intersatellite Channel Parameter Estimation Approach. 13171-13175 - Giada Zingarini, Davide Cozzolino, Riccardo Corvi, Giovanni Poggi, Luisa Verdoliva:
M3DSYNTH: A Dataset of Medical 3D Images with AI-Generated Local Manipulations. 13176-13180 - Ali Krayani, Khalid Khan, Lucio Marcenaro, Mario Marchese, Carlo S. Regazzoni:
Self-Supervised Path Planning in UAV-Aided Wireless Networks Based on Active Inference. 13181-13185 - Samuel Yen-Chi Chen:
Efficient Quantum Recurrent Reinforcement Learning Via Quantum Reservoir Computing. 13186-13190 - Thummaluru Siddartha Reddy, Sundeep Prabhakar Chepuri:
Sampling and Recovery of Signals Over Product Cell Structures. 13191-13195 - Domenico Gaglione, Leonardo Maria Millefiori, Paolo Braca, Peter Willett, Moe Z. Win:
Tracking of Multiple Spawning Targets with Heterogeneous Sensors for Seabed-To-Space Situational Awareness. 13196-13200 - Jaebok Lee, Hyunwoo Park, Hyeonjin Chung, Sunwoo Kim:
Radio SLAM with Hybrid Sensing for Mixed Reflection Type Environments. 13201-13205 - Bingyin Li, Xiaoyu Xu, Sheyang Tang, Li Yu, Zhou Wang:
Human Perception-Guided Meta-Training for Few-Shot NeRF. 13206-13210 - Pranay Sharma, Jiarui Li, Gauri Joshi:
On Improved Distributed Random Reshuffling over Networks. 13211-13215 - Lital Dabush, Tirza Routtenberg:
Kalman Filter for Tracking Network Dynamic. 13216-13220 - Pei An, Di Zhu, You Yang, Jie Ma:
Low-Rank Completion Based Normal Guided Lidar Point Cloud Up-Sampling. 13221-13225 - Marco Piavanini, Luca Barbieri, Mattia Brambilla, Monica Nicoli:
Deep Unfolded Annealed Stein Particle Filter for Vehicle Tracking. 13226-13230 - Zijun Wan, Yunying Wu, Mohamed Baha Ben Ticha, Gaël Le Godais, Philippe Kahane, Stéphan Chabardès, Weidong Chen, Shaomin Zhang, Blaise Yvert:
An Interpretable and Generalizable Speech Detector Based on a CNN-LSTM Framework. 13231-13235 - Christos Merkatas, Simo Särkkä:
A Gibbs Sampler for Bayesian Nonparametric State-Space Models. 13236-13240 - He Wang, Yuejie Chi:
Communication-Efficient Federated Optimization over Semi-Decentralized Networks. 13241-13245 - Jennifer Drexler Fox, Desh Raj, Natalie Delworth, Quinn McNamara, Corey Miller, Migüel Jetté:
Updated Corpora and Benchmarks for Long-Form Speech Recognition. 13246-13250 - Georgios Vasileios Karanikolas, Alba Pagès-Zamora, Georgios B. Giannakis:
A Bayesian Approach to High-Order Link Prediction. 13251-13255 - Souvik Kundu, Rui-Jie Zhu, Akhilesh Jaiswal, Peter A. Beerel:
Recent Advances in Scalable Energy-Efficient and Trustworthy Spiking Neural Networks: from Algorithms to Technology. 13256-13260 - Sorachi Kato, Pu Wang, Toshiaki Koike-Akino, Takuya Fujihashi, Hassan Mansour, Petros Boufounos:
Object Trajectory Estimation with Multi-Band Wi-Fi Neural Dynamic Fusion. 13261-13265 - Ryoma Yataka, Pu Wang, Petros Boufounos, Ryuhei Takahashi:
Radar Perception with Scalable Connective Temporal Relations for Autonomous Driving. 13266-13270 - Shoutik Mukherjee, Peter Jendrichovsky, Patrick O. Kanold, Behtash Babadi:
Reinforcement Learning-Guided Optogenetic Stimulation Policies for Robust Functional Network Discovery. 13271-13275 - Chen Cui, Petar M. Djuric:
Inference of Time-Varying Graph Topologies via Gaussian Processes. 13276-13280 - Zilu Zhao, Fangqing Xiao, Dirk Slock:
Vector Approximate Message Passing for Not So Large N.I.I.D. Generalized I/O Linear Models. 13281-13285 - Ioannis Gavras, Italo Atzeni, George C. Alexandropoulos:
Near-Field Localization with 1-bit Quantized Hybrid A/D Reception. 13286-13290 - Nicolas Zilberstein, Ananthram Swami, Santiago Segarra:
Joint Channel Estimation and Data Detection in Massive MIMO Systems Based on Diffusion Models. 13291-13295 - Chiori Hori, Pu Wang, Mahbub Rahman, Cristian J. Vaca-Rubio, Sameer Khurana, Anoop Cherian, Jonathan Le Roux:
Wi-Fi Based Indoor Monitoring Enhanced by Multimodal Fusion. 13296-13300 - Ioannis Gavras, George C. Alexandropoulos:
Joint Near-Field Target Tracking and Communications with Full Duplex Holographic MIMO. 13301-13305 - W. Ronny Huang, Cyril Allauzen, Tongzhou Chen, Kilol Gupta, Ke Hu, James Qin, Yu Zhang, Yongqiang Wang, Shuo-Yiin Chang, Tara N. Sainath:
Multilingual and Fully Non-Autoregressive ASR with Large Language Model Fusion: A Comprehensive Study. 13306-13310 - Abhiroop Bhattacharjee, Ruokai Yin, Abhishek Moitra, Priyadarshini Panda:
Are SNNs Truly Energy-efficient? - A Hardware Perspective. 13311-13315 - Zhiwei Tang, Tsung-Hui Chang:
FedLion: Faster Adaptive Federated Optimization with Fewer Communication. 13316-13320 - Soumi Maiti, Yifan Peng, Shukjae Choi, Jee-Weon Jung, Xuankai Chang, Shinji Watanabe:
VoxtLM: Unified Decoder-Only Models for Consolidating Speech Recognition, Synthesis and Speech, Text Continuation Tasks. 13326-13330 - Liana Khamidullina, Martin Haardt:
Coupled Block-Term Tensor Decomposition for Near-Field Localization in Multi-Static MIMO Radar Systems. 13331-13335 - Gerald Matz:
On Generalized Signature Graphs. 13336-13340 - Cheng-chieh Yeh, Amirreza Shirani, Weicheng Zhang, Tuomo Raitio, Ramya Rasipuram, Ladan Golipour, David Winarsky:
Dialog Modeling in Audiobook Synthesis. 13341-13345 - Ashkan Faghiri, Armin Iraji, Tülay Adali, Vince D. Calhoun:
Analysis of High-Order Brain Networks Resolved in Time and Frequency Using CP Decomposition. 13346-13350 - Yassir Fathullah, Chunyang Wu, Egor Lakomkin, Junteng Jia, Yuan Shangguan, Ke Li, Jinxi Guo, Wenhan Xiong, Jay Mahadeokar, Ozlem Kalinli, Christian Fuegen, Mike Seltzer:
Prompting Large Language Models with Speech Recognition Abilities. 13351-13355 - Md. Abdullah-Al Kaiser, Akhilesh R. Jaiswal:
Hardware-Algorithm Co-Design Enabling Processing-In-Pixel-In-Memory (P2M) for Neuromorphic Vision Sensors. 13356-13360 - Xiang Li, Feng-Gang Yan, Ming Jin, Maria Sabrina Greco, Fulvio Gini:
Generalized Hole-Filling Strategy for Overlapping Hole-Existing Coprime Arrays for DOA Estimation. 13361-13365 - Nithin Rao Koluguri, Samuel Kriman, Georgy Zelenfroind, Somshubra Majumdar, Dima Rekesh, Vahid Noroozi, Jagadeesh Balam, Boris Ginsburg:
Investigating End-to-End ASR Architectures for Long Form Audio Transcription. 13366-13370 - Natarajan Balaji Shankar, Alexander Johnson, Christina Chance, Hariram Veeramani, Abeer Alwan:
CORAAL QA: A Dataset and Framework for Open Domain Spontaneous Speech Question Answering from Long Audio Files. 13371-13375 - Jinyang Li, Ang Li, Weiwen Jiang:
QUAPPROX: A Framework for Benchmarking the Approximability of Variational Quantum Circuit. 13376-13380 - Pu Wang, Petros Boufounos:
Monostatic DMG Passive Sensing with Hypothesis Testing. 13381-13385 - Subhadip Mukherjee, Sören Dittmer, Zakhar Shumaylov, Sebastian Lunz, Ozan Öktem, Carola B. Schönlieb:
Data-Driven Convex Regularizers for Inverse Problems. 13386-13390 - Niall Lyons, Avik Santra, Vikram Kumar Ramanna, Kiran Uln, Rakesh Taori, Ashutosh Pandey:
WIFIACT: Enhancing Human Sensing Through Environment Robust Preprocessing And Bayesian Self-Supervised Learning. 13391-13395 - Aboulnasr Hassanien, Elias Aboutanios:
A Hybrid Slow-Time Coding Framework for Automotive MIMO Radar. 13396-13400 - Tyler Wang, Huan-Hsin Tseng, Shinjae Yoo:
Quantum Federated Learning with Quantum Networks. 13401-13405 - Khurram Usman Mazher, Andrew M. Graff, Nuria González Prelcic, Robert W. Heath Jr.:
Automotive Radar Interference Characterization: FMCW or PMCW? 13406-13410 - Joan Palacios, Murat Bayraktar, Nuria González Prelcic, Hao Chen:
High Accuracy Device Localization in Indoor mmWave Networks Exploiting Channel Sparsity and Virtual Anchor Mapping. 13411-13415 - Pao-Sheng Vincent Sun, Arren Glover, Chiara Bartolozzi, Arindam Basu:
Memory Efficient Corner Detection for Event-Driven Dynamic Vision Sensors. 13416-13420 - Siyin Wang, Chao-Han Huck Yang, Ji Wu, Chao Zhang:
Can Whisper Perform Speech-Based In-Context Learning? 13421-13425 - Weitong Zhai, Xiangrong Wang, Moeness G. Amin, Maria S. Greco, Fulvio Gini:
IRS-Assisted Joint Sensing and Communication Design for Autonomous Driving. 13426-13430 - Rupam Kalyan Chakraborty, Geethu Joseph, Chandra R. Murthy:
Bayesian Learning-Based Kalman Smoothing For Linear Dynamical Systems With Unknown Sparse Inputs. 13431-13435 - Jianjun Gao, Kim-Hui Yap, Kejun Wu, Duc Tri Phan, Kratika Garg, Boon Siew Han:
Contextual Human Object Interaction Understanding from Pre-Trained Large Language Model. 13436-13440 - Chi Zhang, Mehmet Akçakaya:
Uncertainty-Guided Physics-Driven Deep Learning Reconstruction via Cyclic Measurement Consistency. 13441-13445 - Robin Rajamäki, Mehmet Can Hücümenoglu, Pulak Sarangi, Piya Pal:
Effect of Beampattern on Matrix Completion with Sparse Arrays. 13451-13455 - Yi Wang, Qiongyang Hu, Lap-Pui Chau:
Weakly-Supervised Crowd Counting with Token Attention and Fusion: A Simple and Effective Baseline. 13456-13460 - Nguyen Van Phi, Tran Minh Duc, Pham Huy Hieu, Tran Quoc Long:
Echocardiography Video Synthesis from End Diastolic Semantic Map Via Diffusion Model. 13461-13465 - Chandler Timm C. Doloriel, Ngai-Man Cheung:
Frequency Masking for Universal Deepfake Detection. 13466-13470 - Jiang Zhu, Xupeng Lei, Mihai-Alin Badiu:
Estimation of Spectral Lines Using Expectation Propagation. 13471-13475 - Bohan Tang, Siheng Chen, Xiaowen Dong:
Hypergraph-MLP: Learning on Hypergraphs Without Message Passing. 13476-13480 - Sumit Bam Shrestha, Jonathan Timcheck, Edward Paxon Frady, Leobardo Campos-Macias, Mike Davies:
Efficient Video and Audio Processing with Loihi 2. 13481-13485 - Minxi Yang, Dahua Gao, Feng Xie, Jiaxuan Li, Xiaodan Song, Guangming Shi:
SG2SC: A Generative Semantic Communication Framework for Scene Understanding-Oriented Image Transmission. 13486-13490 - Chengyuan He, Chengwei Zhou, Zhiguo Shi, Jiming Chen:
Deep INCM Reconstruction for Adaptive Beamforming. 13491-13495 - Jeongwan Kang, Paulson Eberechukwu, Jeonghaeng Lee, Henk Wymeersch, Sunwoo Kim:
Fundamental Performance Bounds for Carrier Phase Positioning in LEO-PNT Systems. 13496-13500 - Francesco Pezone, Osman Musa, Giuseppe Caire, Sergio Barbarossa:
Semantic-Preserving Image Coding Based on Conditional Diffusion Models. 13501-13505 - Hyelin Nam, Jihong Park, Jinho Choi, Mehdi Bennis, Seong-Lyun Kim:
Language-Oriented Communication with Semantic Coding and Knowledge Distillation for Text-to-Image Generation. 13506-13510 - Amirhossein Javaheri, Arash Amini, Farokh Marvasti, Daniel P. Palomar:
Joint Signal Recovery and Graph Learning from Incomplete Time-Series. 13511-13515 - Wenlong Wang, Zai Yang, Xunmeng Wu:
On Unique Localization of Uncorrelated Constant-Modulus Sources Using Sparse Linear Arrays. 13516-13520 - Zhehuai Chen, He Huang, Andrei Andrusenko, Oleksii Hrinchuk, Krishna C. Puvvada, Jason Li, Subhankar Ghosh, Jagadeesh Balam, Boris Ginsburg:
SALM: Speech-Augmented Language Model with in-Context Learning for Speech Recognition and Translation. 13521-13525 - Lei Ding, Han Yuan:
Fast Dynamics of Brain-wide Patterns on Neuronal Oscillations. 13526-13530 - Da Chang, Xi-Nian Zuo:
Healthy Aging is Marked by Entropy Reduction in Cortical Spontaneous Activity. 13531-13535