ICASSP 2022: Virtual and Singapore
- IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2022, Virtual and Singapore, 23-27 May 2022. IEEE 2022, ISBN 978-1-6654-0541-6
- Shibo Zhang, Ebrahim Nemati, Minh Dinh, Nathan Folkman, Tousif Ahmed, Md. Mahbubur Rahman, Jilong Kuang, Nabil Alshurafa, Alex Gao:
Coughtrigger: Earbuds IMU Based Cough Detection Activator Using An Energy-Efficient Sensitivity-Prioritized Time Series Classifier. 1-5 - Hoang Truong, Alessandro Montanari, Fahim Kawsar:
Non-Invasive Blood Pressure Monitoring with Multi-Modal In-Ear Sensing. 6-10 - Xiaolu Zeng, Beibei Wang, Chenshu Wu, Sai Deepika Regani, K. J. Ray Liu:
Intelligent Wi-Fi Based Child Presence Detection System. 11-15 - Wenxuan Li, Dongheng Zhang, Yadong Li, Zhi Wu, Jinbo Chen, Dong Zhang, Yang Hu, Qibin Sun, Yan Chen:
Real-Time Fall Detection Using Mmwave Radar. 16-20 - Dae Yon Hwang, Pai Chet Ng, Yuanhao Yu, Yang Wang, Petros Spachos, Dimitrios Hatzinakos, Konstantinos N. Plataniotis:
Hierarchical Deep Learning Model with Inertial and Physiological Sensors Fusion for Wearable-Based Human Activity Recognition. 21-25 - Yu-Chen Lin, Tsun-An Hsieh, Kuo-Hsuan Hung, Cheng Yu, Harinath Garudadri, Yu Tsao, Tei-Wei Kuo:
Speech Recovery For Real-World Self-Powered Intermittent Devices. 26-30 - Ai Okano, Yoshinobu Kajikawa:
Phase Control of Parametric Array Loudspeaker by Optimizing Sideband Weights. 31-35 - Florian Scalvini, Camille Bordeau, Maxime Ambard, Cyrille Migniot, Julien Dubois:
Low-Latency Human-Computer Auditory Interface Based on Real-Time Vision Analysis. 36-40 - Akihiko Sugiyama:
Robust Adaptive Noise Canceller Algorithm with Snr-Based Stepsize Control and Noise-Path Gain Compensation. 41-45 - Chao Liu, Linlin Gao, Ruobing Jiang:
Neartracker: Acoustic 2-D Target Tracking with Nearby Reflector in Siso System. 46-50 - Harinarayanan. E. V, Sachin Ghanekar:
An Efficient Method For Generic Dsp Implementation Of Dilated Convolution. 51-55 - Yu-Shan Tai, Chieh-Fang Teng, Cheng-Yang Chang, An-Yeu Andy Wu:
Compression-Aware Projection with Greedy Dimension Reduction for Convolutional Neural Network Activations. 56-60 - Simon Narduzzi, Siavash Arjomand Bigdeli, Shih-Chii Liu, L. Andrea Dunbar:
Optimizing The Consumption Of Spiking Neural Networks With Activity Regularization. 61-65 - Sujan Kumar Gonugondla, Naresh R. Shanbhag:
IMPQ: Reduced Complexity Neural Networks Via Granular Precision Assignment. 66-70 - Youngeun Kim, Hyoungseob Park, Abhishek Moitra, Abhiroop Bhattacharjee, Yeshwanth Venkatesha, Priyadarshini Panda:
Rate Coding Or Direct Coding: Which One Is Better For Accurate, Robust, And Energy-Efficient Spiking Neural Networks? 71-75 - Linghao Song, Yuze Chi, Jason Cong:
PYXIS: An Open-Source Performance Dataset Of Sparse Accelerators. 76-80 - Zuozhou Pan, Zhiping Lin, Yuanjin Zheng, Zong Meng:
Fast Fault Diagnosis Method Of Rolling Bearings In Multi-Sensor Measurement Enviroment. 81-85 - Diaa Badawi, Ishaan Bassi, Sule Ozev, Ahmet Enis Çetin:
Detecting Anomaly in Chemical Sensors via Regularized Contrastive Learning. 86-90 - Cheng Tang, Junkai Ji, Qiuzhen Lin, Yan Zhou:
Evolutionary Neural Architecture Design of Liquid State Machine for Image Classification. 91-95 - Huy Phan, Yi Xie, Jian Liu, Yingying Chen, Bo Yuan:
Invisible and Efficient Backdoor Attacks for Compressed Deep Neural Networks. 96-100 - Cheng-Hung Lo, Pei-Yun Tsai:
Tensor-Based Orthogonal Matching Pursuit with Phase Rotation for Channel Estimation In Hybrid Beamforming Mimo-Ofdm Systems. 101-105 - Darius Petermann, Minje Kim:
Spain-Net: Spatially-Informed Stereophonic Music Source Separation. 106-110 - Siyuan Yuan, Zhepei Wang, Umut Isik, Ritwik Giri, Jean-Marc Valin, Michael M. Goodwin, Arvindh Krishnaswamy:
Improved Singing Voice Separation with Chromagram-Based Pitch-Aware Remixing. 111-115 - Haici Yang, Shivani Firodiya, Nicholas J. Bryan, Minje Kim:
Don't Separate, Learn To Remix: End-To-End Neural Remixing With Joint Optimization. 116-120 - Yu Wang, Daniel Stoller, Rachel M. Bittner, Juan Pablo Bello:
Few-Shot Musical Source Separation. 121-125 - Ethan Manilow, Patrick O'Reilly, Prem Seetharaman, Bryan Pardo:
Source Separation By Steering Pretrained Music Models. 126-130 - Xuewen Yao, Megan Micheletti, Mckensey Johnson, Edison Thomaz, Kaya de Barbaro:
Infant Crying Detection In Real-World Environments. 131-135 - Qin Zhang, Qingming Tang, Chieh-Chi Kao, Ming Sun, Yang Liu, Chao Wang:
Wikitag: Wikipedia-Based Knowledge Embeddings Towards Improved Acoustic Event Classification. 136-140 - Magdalena Fuentes, Bea Steers, Pablo Zinemanas, Martín Rocamora, Luca Bondi, Julia Wilkins, Qianyi Shi, Yao Hou, Samarjit Das, Xavier Serra, Juan Pablo Bello:
Urban Sound & Sight: Dataset And Benchmark For Audio-Visual Urban Scene Understanding. 141-145 - Sai Srinadhu Katta, Kide Vuojärvi, Sivaprasad Nandyala, Ulla-Maria Kovalainen, Lauren Baddeley:
Real-World On-Board Uav Audio Data Set For Propeller Anomalies. 146-150 - Yuan Gong, Jin Yu, James R. Glass:
Vocalsound: A Dataset for Improving Human Vocal Sounds Recognition. 151-155 - Kento Nagatomo, Masahiro Yasuda, Kohei Yatabe, Shoichiro Saito, Yasuhiro Oikawa:
Wearable Seld Dataset: Dataset For Sound Event Localization And Detection Using Wearable Devices Around Head. 156-160 - Viet-Anh Nguyen, Anh H. T. Nguyen, Andy W. H. Khong:
Tunet: A Block-Online Bandwidth Extension Model Based On Transformers And Self-Supervised Pretraining. 161-165 - Jinjiang Liu, Xueliang Zhang:
DRC-NET: Densely Connected Recurrent Convolutional Neural Network for Speech Dereverberation. 166-170 - Jean-Marie Lemercier, Joachim Thiemann, Raphael Koning, Timo Gerkmann:
Customizable End-To-End Optimization Of Online Neural Network-Supported Dereverberation For Hearing Devices. 171-175 - Naoyuki Kamo, Rintaro Ikeshita, Keisuke Kinoshita, Tomohiro Nakatani:
Importance of Switch Optimization Criterion in Switching WPE Dereverberation. 176-180 - Ziyu Wang, Dejing Xu, Gus Xia, Ying Shan:
Audio-To-Symbolic Arrangement Via Cross-Modal Music Representation Learning. 181-185 - Shiqi Wei, Gus Xia, Yixiao Zhang, Liwei Lin, Weiguo Gao:
Music Phrase Inpainting Using Long-Term Representation and Contrastive Loss. 186-190 - Yi Zou, Pei Zou, Yi Zhao, Kaixiang Zhang, Ran Zhang, Xiaorui Wang:
Melons: Generating Melody With Long-Term Structure Using Transformers And Structure Graph. 191-195 - Moyu Terao, Yuki Hiramatsu, Ryoto Ishizuka, Yiming Wu, Kazuyoshi Yoshii:
Difficulty-Aware Neural Band-to-Piano Score Arrangement based on Note- and Statistic-Level Criteria. 196-200 - Pedro Ramoneda, Nazif Can Tamer, Vsevolod Eremenko, Xavier Serra, Marius Miron:
Score Difficulty Analysis for Piano Performance Education based on Fingering. 201-205 - Zhipeng Chen, Yiya Hao, Yaobin Chen, Gong Chen, Liang Ruan:
A Neural Network-based Howling Detection Method for Real-Time Communication Applications. 206-210 - Tomer Fireaizen, Saar Ron, Omer Bobrowski:
Alarm Sound Detection Using Topological Signal Processing. 211-215 - Osamu Ichikawa, Yuuto Shima, Takahiro Nakayama, Hajime Shirouzu:
A Method For Estimating The Grouping Of Participants In Classroom Group Work Using Only Audio Information. 216-220 - Yuki Okamoto, Shota Horiguchi, Masaaki Yamamoto, Keisuke Imoto, Yohei Kawaguchi:
Environmental Sound Extraction Using Onomatopoeic Words. 221-225 - Masahiro Yasuda, Yasunori Ohishi, Shoichiro Saito:
Echo-Aware Adaptation of Sound Event Localization and Detection in Unknown Environments. 226-230 - Juncheng B. Li, Shuhui Qu, Xinjian Li, Bernie Po-Yao Huang, Florian Metze:
On Adversarial Robustness Of Large-Scale Audio Visual Learning. 231-235 - Haibin Wu, Po-Chun Hsu, Ji Gao, Shanshan Zhang, Shen Huang, Jian Kang, Zhiyong Wu, Helen Meng, Hung-Yi Lee:
Adversarial Sample Detection for Speaker Verification by Neural Vocoders. 236-240 - Naoya Takahashi, Yuki Mitsufuji:
Amicable Examples for Informed Source Separation. 241-245 - David M. Chan, Shalini Ghosh, Debmalya Chakrabarty, Björn Hoffmeister:
Multi-Modal Pre-Training for Automated Speech Recognition. 246-250 - Ryota Tsunoda, Ryo Aihara, Ryoichi Takashima, Tetsuya Takiguchi, Yoshie Imai:
Speaker-Targeted Audio-Visual Speech Recognition Using a Hybrid CTC/Attention Model with Interference Loss. 251-255 - Yifei Wu, Chenda Li, Jinfeng Bai, Zhongqin Wu, Yanmin Qian:
Time-Domain Audio-Visual Speech Separation on Low Quality Videos. 256-260 - Mhd Modar Halimeh, Walter Kellermann:
Complex-Valued Spatial Autoencoders for Multichannel Speech Enhancement. 261-265 - Zhi-Wei Tan, Anh H. T. Nguyen, Yuan Liu, Andy W. H. Khong:
Multichannel Noise Reduction Using Dilated Multichannel U-Net and Pre-Trained Single-Channel Network. 266-270 - Hassan Taherian, Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Zhuo Chen, Xuedong Huang:
One Model to Enhance Them All: Array Geometry Agnostic Multi-Channel Personalized Speech Enhancement. 271-275 - Cong Han, Emine Merve Kaya, Kyle Hoefer, Malcolm Slaney, Simon Carlile:
Multi-Channel Speech Denoising for Machine Ears. 276-280 - Zhong-Qiu Wang, DeLiang Wang:
Localization based Sequential Grouping for Continuous Speech Separation. 281-285 - Mieszko Fras, Marcin Witkowski, Konrad Kowalczyk:
Convolutional Weighted Minimum Mean Square Error Filter for Joint Source Separation and Dereverberation. 286-290 - Ethan Manilow, Curtis Hawthorne, Cheng-Zhi Anna Huang, Bryan Pardo, Jesse H. Engel:
Improving Source Separation by Explicitly Modeling Dependencies between Sources. 291-295 - Yuichiro Koyama, Naoki Murata, Stefan Uhlich, Giorgio Fabbro, Shusuke Takahashi, Yuki Mitsufuji:
Music Source Separation With Deep Equilibrium Models. 296-300 - Natsuki Akaishi, Kohei Yatabe, Yasuhiro Oikawa:
Harmonic and Percussive Sound Separation Based on Mixed Partial Derivative of Phase Spectrogram. 301-305 - Enric Gusó, Jordi Pons, Santiago Pascual, Joan Serrà:
On Loss Functions and Evaluation Metrics for Music Source Separation. 306-310 - Sangwook Park, Mounya Elhilali:
Time-Balanced Focal Loss for Audio Event Detection. 311-315 - Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Naoya Takahashi, Emiru Tsunoo, Yuki Mitsufuji:
Multi-ACCDOA: Localizing And Detecting Overlapping Sounds From The Same Class With Auxiliary Duplicating Permutation Invariant Training. 316-320 - Arman Zharmagambetov, Qingming Tang, Chieh-Chi Kao, Qin Zhang, Ming Sun, Viktor Rozgic, Jasha Droppo, Chao Wang:
Improved Representation Learning For Acoustic Event Classification Using Tree-Structured Ontology. 321-325 - Sandeep Kothinti, Mounya Elhilali:
Temporal Contrastive-Loss for Audio Event Detection. 326-330 - Xu Wang, Xiangjinzi Zhang, Yunfei Zi, Shengwu Xiong:
A Frame Loss of Multiple Instance Learning for Weakly Supervised Sound Event Detection. 331-335 - Heinrich Dinkel, Zhiyong Yan, Yongqing Wang, Junbo Zhang, Yujun Wang:
Pseudo Strong Labels for Large Scale Weakly Supervised Audio Tagging. 336-340 - Wenyu Jin, Tim Schoof, Henning F. Schepker:
Individualized Hear-Through For Acoustic Transparency Using PCA-Based Sound Pressure Estimation At The Eardrum. 341-345 - Benjamin Lentz, Rainer Martin, Kirsten Oberländer, Christiane Völter:
On Spectral and Temporal Sparsification of Speech Signals for the Improvement of Speech Perception in CI Listeners. 346-350 - Fotios Drakopoulos, Sarah Verhulst:
A Differentiable Optimisation Framework for The Design of Individualised DNN-based Hearing-Aid Strategies. 351-355 - Sefik Emre Eskimez, Takuya Yoshioka, Huaming Wang, Xiaofei Wang, Zhuo Chen, Xuedong Huang:
Personalized speech enhancement: new models and Comprehensive evaluation. 356-360 - Jinxu Xiang, Yuyang Zhu, Rundi Wu, Ruilin Xu, Yuko Ishiwaka, Changxi Zheng:
Dynamic Sliding Window for Realtime Denoising Networks. 361-365 - Sunwoo Kim, Minje Kim:
Bloom-Net: Blockwise Optimization for Masking Networks Toward Scalable and Efficient Speech Enhancement. 366-370 - Tianrui Wang, Weibin Zhu, Yingying Gao, Junlan Feng, Shilei Zhang:
HGCN: Harmonic Gated Compensation Network for Speech Enhancement. 371-375 - Wenbin Jiang, Zhijun Liu, Kai Yu, Fei Wen:
Speech Enhancement with Neural Homomorphic Synthesis. 376-380 - Yang Xiang, Jesper Lisby Højvang, Morten Højfeldt Rasmussen, Mads Græsbøll Christensen:
A Bayesian Permutation Training Deep Representation Learning Method for Speech Enhancement with Variational Autoencoder. 381-385 - Huajian Fang, Tal Peer, Stefan Wermter, Timo Gerkmann:
Integrating Statistical Uncertainty into Neural Network-Based Speech Enhancement. 386-390 - Viet Anh Trinh, Sebastian Braun:
Unsupervised Speech Enhancement with Speech Recognition Embedding and Disentanglement Losses. 391-395 - Xianke Wang, Wei Xu, Weiming Yang, Wenqing Cheng:
Musicyolo: A Sight-Singing Onset/Offset Detection Framework Based on Object Detection Instead of Spectrum Frames. 396-400 - Yun-Ning Hung, Ju-Chiang Wang, Xuchen Song, Wei Tsung Lu, Minz Won:
Modeling Beats and Downbeats with a Time-Frequency Transformer. 401-405 - Michael Krause, Meinard Müller:
Hierarchical Classification of Singing Activity, Gender, and Type in Complex Music Recordings. 406-410 - Qiqi He, Xiaoheng Sun, Yi Yu, Wei Li:
Deepchorus: A Hybrid Model of Multi-Scale Convolution And Self-Attention for Chorus Detection. 411-415 - Ju-Chiang Wang, Yun-Ning Hung, Jordan B. L. Smith:
To Catch A Chorus, Verse, Intro, or Anything Else: Analyzing a Song with Structural Functions. 416-420 - Mojtaba Heydari, Matthew C. McCallum, Andreas F. Ehmann, Zhiyao Duan:
A Novel 1D State Space for Efficient Music Rhythmic Analysis. 421-425 - Haici Yang, Sanna Wager, Spencer Russell, Mike Luo, Minje Kim, Wontak Kim:
Upmixing Via Style Transfer: A Variational Autoencoder for Disentangling Spatial Images And Musical Content. 426-430 - Ricardo Falcón Pérez, Kazuki Shimada, Yuichiro Koyama, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Mixup: Directional Loudness Modification as Data Augmentation for Sound Event Localization and Detection. 431-435 - Tobias Kabzinski, Peter Jax:
Towards Faster Continuous Multi-Channel HRTF Measurements Based On Learning System Models. 436-440 - Bowen Zhi, Dmitry N. Zotkin, Ramani Duraiswami:
Towards Fast And Convenient End-To-End HRTF Personalization. 441-445 - Mateusz Guzik, Konrad Kowalczyk:
Wishart Localization Prior On Spatial Covariance Matrix In Ambisonic Source Separation Using Non-Negative Tensor Factorization. 446-450 - Jiawen Huang, Emmanouil Benetos, Sebastian Ewert:
Improving Lyrics Alignment Through Joint Pitch Detection. 451-455 - Ilaria Manco, Emmanouil Benetos, Elio Quinton, György Fazekas:
Learning Music Audio Representations Via Weak Language Supervision. 456-460 - David Giuseppe Badiane, Raffaele Malvermi, Sebastian Gonzalez, Fabio Antonacci, Augusto Sarti:
On the Prediction of the Frequency Response of a Wooden Plate from Its Mechanical Parameters. 461-465 - Bo-Yu Chen, Wei-Han Hsu, Wei-Hsiang Liao, Marco A. Martínez Ramírez, Yuki Mitsufuji, Yi-Hsuan Yang:
Automatic DJ Transitions with Differentiable Audio Effects and Generative Adversarial Networks. 466-470 - Han Chen, Yan Song, Li-Rong Dai, Ian McLoughlin, Lin Liu:
Self-Supervised Representation Learning for Unsupervised Anomalous Sound Detection Under Domain Shift. 471-475 - Vasileios Tsouvalas, Aaqib Saeed, Tanir Ozcelebi:
Federated Self-Training for Data-Efficient Audio Recognition. 476-480 - Meng Feng, Chieh-Chi Kao, Qingming Tang, Ming Sun, Viktor Rozgic, Spyros Matsoukas, Chao Wang:
Federated Self-Supervised Learning for Acoustic Event Classification. 481-485 - Kwanghee Choi, Martin Kersner, Jacob Morton, Buru Chang:
Temporal Knowledge Distillation for on-device Audio Classification. 486-490 - Ognjen (Oggi) Rudovic, Akanksha Bindal, Vineet Garg, Pramod Simha, Pranay Dighe, Sachin Kajarekar:
Streaming on-Device Detection of Device Directed Speech from Voice and Touch-Based Invocation. 491-495 - Hiroshi Sawada, Rintaro Ikeshita, Keisuke Kinoshita, Tomohiro Nakatani:
Multi-Frame Full-Rank Spatial Covariance Analysis for Underdetermined BSS in Reverberant Environments. 496-500 - Aditya Arie Nugraha, Kouhei Sekiguchi, Mathieu Fontaine, Yoshiaki Bando, Kazuyoshi Yoshii:
Flow-Based Fast Multichannel Nonnegative Matrix Factorization for Blind Source Separation. 501-505 - Yudong He, He Wang, Qifeng Chen, Richard Hau Yue So:
Harvesting Partially-Disjoint Time-Frequency Information for Improving Degenerate Unmixing Estimation Technique. 506-510 - Shogo Seki, Hirokazu Kameoka, Li Li:
Investigation And Comparison of Optimization Methods for Variational Autoencoder-Based Underdetermined Multichannel Source Separation. 511-515 - Li Li, Hirokazu Kameoka, Shogo Seki:
HBP: An Efficient Block Permutation Solver Using Hungarian Algorithm and Spectrogram Inpainting for Multichannel Audio Source Separation. 516-520 - Chenxing Li, Yang Wang, Feng Deng, Zhuo Zhang, Xiaorui Wang, Zhongyuan Wang:
EAD-Conformer: a Conformer-Based Encoder-Attention-Decoder-Network for Multi-Task Audio Source Separation. 521-525 - Darius Petermann, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux:
The Cocktail Fork Problem: Three-Stem Audio Separation for Real-World Soundtracks. 526-530 - Félix Mathieu, Thomas Courtat, Gaël Richard, Geoffroy Peeters:
Phase Shifted Bedrosian Filterbank: An Interpretable Audio Front-End for Time-Domain Audio Source Separation. 531-535 - Rahil Parikh, Ilya Kavalerov, Carol Y. Espy-Wilson, Shihab A. Shamma:
Harmonicity Plays a Critical Role in DNN Based Versus in Biologically-Inspired Monaural Speech Segregation Systems. 536-540 - Changsheng Quan, Xiaofei Li:
Multi-Channel Narrow-Band Deep Speech Separation with Full-Band Permutation Invariant Training. 541-545 - Cunhang Fan, Zhao Lv, Shengbing Pei, Mingyue Niu:
Csenet: Complex Squeeze-and-Excitation Network for Speech Depression Level Prediction. 546-550 - Ebrahim Nemati, Xuhai Xu, Viswam Nathan, Korosh Vatanparvar, Tousif Ahmed, Md. Mahbubur Rahman, Dan McCaffrey, Jilong Kuang, Alex Gao:
Ubilung: Multi-Modal Passive-Based Lung Health Assessment. 551-555 - Neeraj Kumar Sharma, Srikanth Raj Chetupalli, Debarpan Bhattacharya, Debottam Dutta, Pravin Mote, Sriram Ganapathy:
The Second Dicova Challenge: Dataset and Performance Analysis for Diagnosis of Covid-19 Using Acoustics. 556-560 - Xing-Yu Chen, Qiu-Shi Zhu, Jie Zhang, Li-Rong Dai:
Supervised and Self-Supervised Pretraining Based Covid-19 Detection Using Acoustic Breathing/Cough/Speech Signals. 561-565 - Madhu R. Kamble, Jose Patino, Maria A. Zuluaga, Massimiliano Todisco:
Exploring Auditory Acoustic Features for The Diagnosis of Covid-19. 566-570 - Anton Ratnarajah, Shi-Xiong Zhang, Meng Yu, Zhenyu Tang, Dinesh Manocha, Dong Yu:
Fast-Rir: Fast Neural Diffuse Room Impulse Response Generator. 571-575 - Juliano G. C. Ribeiro, Shoichi Koyama, Hiroshi Saruwatari:
Region-to-Region Kernel Interpolation of Acoustic Transfer Function with Directional Weighting. 576-580 - Philipp Götz, Cagdas Tuna, Andreas Walther, Emanuël A. P. Habets:
Blind Reverberation Time Estimation in Dynamic Acoustic Conditions. 581-585 - Maozhong Fu, Jesper Rindom Jensen, Yuhan Li, Mads Græsbøll Christensen:
Sparse Modeling of The Early Part of Noisy Room Impulse Responses with Sparse Bayesian Learning. 586-590 - Jack Deadman, Jon Barker:
Improved Simulation of Realistically-Spatialised Simultaneous Speech Using Multi-Camera Analysis in The Chime-5 Dataset. 591-595 - Mattia Papa, Clara Borrelli, Paolo Bestagini, Fabio Antonacci, Augusto Sarti, Stefano Tubaro:
A Data-Driven Approach for Acoustic Parameter Similarity Estimation of Speech Recording. 596-600 - Yudong Zhao, György Fazekas, Mark B. Sandler:
Violinist Identification Using Note-Level Timbre Feature Distributions. 601-605 - Hang Zhao, Chen Zhang, Bilei Zhu, Zejun Ma, Kejun Zhang:
S3T: Self-Supervised Pre-Training with Swin Transformer For Music Classification. 606-610 - Morgan Buisson, Pablo Alonso-Jiménez, Dmitry Bogdanov:
Ambiguity Modelling with Label Distribution Learning for Music Classification. 611-615 - Xingjian Du, Ke Chen, Zijie Wang, Bilei Zhu, Zejun Ma:
Bytecover2: Towards Dimensionality Reduction of Latent Embedding for Efficient Cover Song Identification. 616-620 - Ke Chen, Shuai Yu, Cheng-i Wang, Wei Li, Taylor Berg-Kirkpatrick, Shlomo Dubnov:
Tonet: Tone-Octave Network for Singing Melody Extraction from Polyphonic Music. 621-625 - Shuai Yu, Xi Chen, Wei Li:
Hierarchical Graph-Based Neural Network for Singing Melody Extraction. 626-630 - Michel Olvera, Emmanuel Vincent, Gilles Gasso:
On The Impact of Normalization Strategies in Unsupervised Adversarial Domain Adaptation for Acoustic Scene Classification. 631-635 - Tom Denton, Scott Wisdom, John R. Hershey:
Improving Bird Classification with Unsupervised Sound Separation. 636-640 - Francesco Paissan, Alberto Ancilotto, Alessio Brutti, Elisabetta Farella:
Scalable Neural Architectures for End-to-End Environmental Sound Classification. 641-645 - Ke Chen, Xingjian Du, Bilei Zhu, Zejun Ma, Taylor Berg-Kirkpatrick, Shlomo Dubnov:
HTS-AT: A Hierarchical Token-Semantic Audio Transformer for Sound Classification and Detection. 646-650 - You Wang, David V. Anderson:
Hybrid Attention-Based Prototypical Networks for Few-Shot Sound Classification. 651-655 - Karn N. Watcharasupat, Thi Ngoc Tho Nguyen, Woon-Seng Gan, Shengkui Zhao, Bin Ma:
End-to-End Complex-Valued Multidilated Convolutional Neural Network for Joint Acoustic Echo Cancellation and Noise Suppression. 656-660 - Ziteng Wang, Yueyue Na, Biao Tian, Qiang Fu:
NN3A: Neural Network Supported Acoustic Echo Cancellation, Noise Suppression and Automatic Gain Control for Real-Time Communications. 661-665 - Jan Franzen, Tim Fingscheidt:
Deep Residual Echo Suppression and Noise Reduction: A Multi-Input FCRN Approach in a Hybrid Speech Enhancement System. 666-670 - Hao Zhang, DeLiang Wang:
Neural Cascade Architecture for Joint Acoustic Echo and Noise Suppression. 671-675 - Santiago Ruiz, Toon van Waterschoot, Marc Moonen:
Cascade Multi-Channel Noise Reduction and Acoustic Feedback Cancellation. 676-680 - Chenda Li, Lei Yang, Weiqin Wang, Yanmin Qian:
Skim: Skipping Memory Lstm for Low-Latency Real-Time Continuous Speech Separation. 681-685 - Aswin Sivaraman, Scott Wisdom, Hakan Erdogan, John R. Hershey:
Adapting Speech Separation to Real-World Meetings using Mixture Invariant Training. 686-690 - Eisuke Konno, Daisuke Saito, Nobuaki Minematsu:
Quantifying Discriminability between NMF Bases. 691-695 - Hassan Taherian, Ke Tan, DeLiang Wang:
Location-Based Training for Multi-Channel Talker-Independent Speaker Separation. 696-700 - Robin Scheibler:
SDR - Medium Rare with Fast Computations. 701-705 - Hirokazu Kameoka, Shogo Seki, Li Li, Chihiro Watanabe:
Attentionpit: Soft Permutation Invariant Training for Audio Source Separation with Attention Mechanism. 706-710 - Olga Slizovskaia, Gordon Wichern, Zhong-Qiu Wang, Jonathan Le Roux:
Locate This, Not that: Class-Conditioned Sound Event DOA Estimation. 711-715 - Thi Ngoc Tho Nguyen, Douglas L. Jones, Karn N. Watcharasupat, Huy Phan, Woon-Seng Gan:
SALSA-Lite: A Fast and Effective Feature for Polyphonic Sound Event Localization and Detection with Microphone Arrays. 716-720 - Bing Yang, Hong Liu, Xiaofei Li:
SRP-DNN: Learning Direct-Path Phase Difference for Multiple Moving Sound Source Localization. 721-725 - Yonggang Hu, Sharon Gannot:
Closed-Form Single Source Direction-of-Arrival Estimator Using First-Order Relative Harmonic Coefficients. 726-730 - Jianhua Geng, Sifan Wang, Xin Lou:
A Slide-Save Based Framework for Multi-Source DOA Extraction with Closely Spaced Sources. 731-735 - Yu Chen, Bowen Liu, Zijian Zhang, Hun-Seok Kim:
An End-to-End Deep Learning Framework For Multiple Audio Source Separation And Localization. 736-740 - Amir Ivry, Israel Cohen, Baruch Berdugo:
Deep Adaptation Control for Acoustic Echo Cancellation. 741-745 - Amir Ivry, Israel Cohen, Baruch Berdugo:
Off-the-Shelf Deep Integration For Residual-Echo Suppression. 746-750 - Chenggang Zhang, Jinjiang Liu, Xueliang Zhang:
A Complex Spectral Mapping with Inplace Convolution Recurrent Neural Networks For Acoustic Echo Cancellation. 751-755 - Hao Zhang, Srivatsan Kandadai, Harsha Rao, Minje Kim, Tarun Pruthi, Trausti Kristjansson:
Deep Adaptive Aec: Hybrid of Deep Learning and Adaptive Acoustic Echo Cancellation. 756-760 - Yurii Iotov, Sidsel Marie Nørholm, Valiantsin Belyi, Mads Dyrholm, Mads Græsbøll Christensen:
Computationally Efficient Fixed-Filter ANC for Speech Based on Long-Term Prediction for Headphone Applications. 761-765 - Thomas Haubner, Andreas Brendel, Walter Kellermann:
End-To-End Deep Learning-Based Adaptation Control for Frequency-Domain Adaptive System Identification. 766-770 - Grigoris Bastas, Stefanos Koutoupis, Maximos A. Kaliakatsos-Papakostas, Vassilis Katsouros, Petros Maragos:
A Few-Sample Strategy for Guitar Tablature Transcription Based on Inharmonicity Analysis and Playability Constraints. 771-775 - Longshen Ou, Ziyi Guo, Emmanouil Benetos, Jiqing Han, Ye Wang:
Exploring Transformer's Potential on Automatic Piano Transcription. 776-780 - Rachel M. Bittner, Juan José Bosch, David Rubinstein, Gabriel Meseguer-Brocal, Sebastian Ewert:
A Lightweight Instrument-Agnostic Model for Polyphonic Note Transcription and Multipitch Estimation. 781-785 - Yu-Hua Chen, Wen-Yi Hsiao, Tsu-Kuang Hsieh, Jyh-Shing Roger Jang, Yi-Hsuan Yang:
Towards Automatic Transcription of Polyphonic Electric Guitar Music: A New Dataset and a Multi-Loss Transformer Model. 786-790 - Xiaoxue Gao, Chitralekha Gupta, Haizhou Li:
Genre-Conditioned Acoustic Models for Automatic Lyrics Transcription of Polyphonic Music. 791-795 - Sangeun Kum, Jongpil Lee, Keunhyoung Luke Kim, Taehyoung Kim, Juhan Nam:
Pseudo-Label Transfer from Frame-Level to Note-Level in a Teacher-Student Framework for Singing Transcription from Polyphonic Music. 796-800 - Noriyuki Tonami, Keisuke Imoto, Ryotaro Nagase, Yuki Okamoto, Takahiro Fukumori, Yoichi Yamashita:
Sound Event Detection Guided by Semantic Contexts of Scenes. 801-805 - Keigo Wakayama, Shoichiro Saito:
CNN-Transformer with Self-Attention Network for Sound Event Detection. 806-810 - Dongchao Yang, Helin Wang, Yuexian Zou, Zhongjie Ye, Wenwu Wang:
A Mutual Learning Framework for Few-Shot Sound Event Detection. 811-815 - Youde Liu, Jian Guan, Qiaoxi Zhu, Wenwu Wang:
Anomalous Sound Detection Using Spectral-Temporal Information Fusion. 816-820 - Yadong Guan, Jiabin Xue, Guibin Zheng, Jiqing Han:
Sparse Self-Attention for Semi-Supervised Sound Event Detection. 821-825 - Hayato Endo, Hiromitsu Nishizaki:
Peer Collaborative Learning for Polyphonic Sound Event Detection. 826-830 - Srikanth Korse, Nicola Pia, Kishan Gupta, Guillaume Fuchs:
PostGAN: A GAN-Based Post-Processor to Enhance the Quality of Coded Speech. 831-835 - Kishan Gupta, Srikanth Korse, Bernd Edler, Guillaume Fuchs:
A DNN Based Post-Filter to Enhance the Quality of Coded Speech in MDCT Domain. 836-840 - Eloi Moliner, Vesa Välimäki:
A Two-Stage U-Net for High-Fidelity Denoising of Historical Recordings. 841-845 - Marvin Borsdorf, Kevin Scheck, Haizhou Li, Tanja Schultz:
Experts Versus All-Rounders: Target Language Extraction for Multiple Target Languages. 846-850 - Guangwei Li, Xuenan Xu, Heinrich Dinkel, Mengyue Wu, Kai Yu:
Category-Adapted Sound Event Enhancement with Weakly Labeled Data. 851-855 - Rubén M. Clavería, Simon J. Godsill:
Sequential MCMC Methods for Audio Signal Enhancement. 856-860 - Tejas Jayashankar, Thilo Köhler, Kaustubh Kalgaonkar, Zhiping Xiu, Jilong Wu, Ju Lin, Prabhav Agrawal, Qing He:
Architecture for Variable Bitrate Neural Speech Codec with Configurable Computation Complexity. 861-865 - Xue Jiang, Xiulian Peng, Chengyu Zheng, Huaying Xue, Yuan Zhang, Yan Lu:
End-to-End Neural Speech Coding for Real-Time Communications. 866-870 - Seungmin Shin, Joon Byun, Youngcheol Park, Jongmo Sung, Seungkwon Beack:
Deep Neural Network (DNN) Audio Coder Using A Perceptually Improved Training Method. 871-875 - Chanwoo Lee, Hyungseob Lim, Jihyun Lee, Inseon Jang, Hong-Goo Kang:
Progressive Multi-Stage Neural Audio Coding with Guided References. 876-880 - Ehab A. AlBadawy, Andrew Gibiansky, Qing He, Jilong Wu, Ming-Ching Chang, Siwei Lyu:
Vocbench: A Neural Vocoder Benchmark for Speech Synthesis. 881-885 - Chandan K. A. Reddy, Vishak Gopal, Ross Cutler:
Dnsmos P.835: A Non-Intrusive Perceptual Objective Speech Quality Metric to Evaluate Noise Suppressors. 886-890 - Pranay Manocha, Zeyu Jin, Adam Finkelstein:
SQAPP: No-Reference Speech Quality Assessment Via Pairwise Preference. 891-895 - Wen-Chin Huang, Erica Cooper, Junichi Yamagishi, Tomoki Toda:
LDNet: Unified Listener Dependent Modeling in MOS Prediction for Synthetic Speech. 896-900 - Marju Purin, Sten Sootla, Mateja Sponza, Ando Saabas, Ross Cutler:
AECMOS: A Speech Quality Assessment Metric for Echo Impairment. 901-905 - Miao Liu, Jing Wang, Shicong Li, Fei Xiang, Yue Yao, Lidong Yang:
MOS Predictor for Synthetic Speech with I-Vector Inputs. 906-910 - Daan Ratering, W. Bastiaan Kleijn, Jean Gonzalez Silva, Riccardo M. G. Ferrari:
Wave-Domain Approach for Cancelling Noise Entering Open Windows. 911-915 - Tobias Gburrek, Joerg Schmalenstroeer, Reinhold Haeb-Umbach:
On Synchronization of Wireless Acoustic Sensor Networks in the Presence of Time-Varying Sampling Rate Offsets and Speaker Changes. 916-920 - Takuya Yoshioka, Xiaofei Wang, Dongmei Wang:
Picknet: Real-Time Channel Selection for Ad Hoc Microphone Arrays. 921-925 - Jarred Barber, Yifeng Fan, Tao Zhang:
End-To-End Alexa Device Arbitration. 926-930 - Natsuki Ueno, Nobutaka Ono:
Instantaneous Linear Dimensionality Reduction of Multichannel Time-Series Signal for Array Signal Processing. 931-935 - Srdan Kitic, Jérôme Daniel:
Generalized Time Domain Velocity Vector. 936-940 - Masaya Kawamura, Tomohiko Nakamura, Daichi Kitamura, Hiroshi Saruwatari, Yu Takahashi, Kazunobu Kondo:
Differentiable Digital Signal Processing Mixture Model for Synthesis Parameter Extraction from Mixture of Harmonic Sounds. 941-945 - Yashish M. Siriwardena, Guilhem Marion, Shihab A. Shamma:
The Mirrornet: Learning Audio Synthesizer Controls Inspired by Sensorimotor Interaction. 946-950 - Hao-Wen Dong, Cong Zhou, Taylor Berg-Kirkpatrick, Julian J. McAuley:
Deep Performer: Score-to-Audio Music Performance Synthesis. 951-955 - Chien-Feng Liao, Jen-Yu Liu, Yi-Hsuan Yang:
KaraSinger: Score-Free Singing Voice Synthesis with VQ-VAE Using Mel-Spectrograms. 956-960 - Jihyun Lee, Hyungseob Lim, Chanwoo Lee, Inseon Jang, Hong-Goo Kang:
Adversarial Audio Synthesis Using a Harmonic-Percussive Discriminator. 961-965 - Jing Yang, Chulhong Min, Akhil Mathur, Fahim Kawsar:
SleepGAN: Towards Personalized Sleep Therapy Music. 966-970 - Xuenan Xu, Mengyue Wu, Kai Yu:
Diversity-Controllable and Accurate Audio Captioning Based on Neural Condition. 971-975 - Andrey Guzhov, Federico Raue, Jörn Hees, Andreas Dengel:
Audioclip: Extending Clip to Image, Text and Audio. 976-980 - Zelin Zhou, Zhiling Zhang, Xuenan Xu, Zeyu Xie, Mengyue Wu, Kenny Q. Zhu:
Can Audio Captions Be Evaluated With Image Caption Metrics? 981-985 - Pablo M. Delgado, Jürgen Herre:
A Data-Driven Cognitive Salience Model for Objective Perceptual Audio Quality Assessment. 986-990 - Ryosuke Sawata, Yosuke Kashiwagi, Shusuke Takahashi:
Improving Character Error Rate is Not Equal to Having Clean Speech: Speech Enhancement for ASR Systems with Black-Box Acoustic Models. 991-995 - Sebastian Braun, Hannes Gamper:
Effect of Noise Suppression Losses on Speech Distortion and ASR Performance. 996-1000 - Alix Jeannerot, Niels de Koeijer, Pablo Martínez-Nuevo, Martin Bo Møller, Jakob Dyreby, Paolo Prandoni:
Increasing Loudness in Audio Signals: A Perceptually Motivated Approach to Preserve Audio Quality. 1001-1005 - Sebastian J. Schlecht, Leonardo Fierro, Vesa Välimäki, Juha Backman:
Audio Peak Reduction Using a Synced allpass Filter. 1006-1010 - Tomoro Tanaka, Kohei Yatabe, Masahiro Yasuda, Yasuhiro Oikawa:
APPLADE: Adjustable Plug-and-Play Audio Declipper Combining DNN with Sparse Optimization. 1011-1015 - Daniel Tompkins, Kshitiz Kumar, Jian Wu:
Maximizing Audio Event Detection Model Performance on Small Datasets Through Knowledge Transfer, Data Augmentation, and Pretraining: an Ablation Study. 1016-1020 - Janek Ebbers, Reinhold Haeb-Umbach, Romain Serizel:
Threshold Independent Evaluation of Sound Event Detection Scores. 1021-1025 - Seyed M. R. Modaresi, Aomar Osmani, Mohammadreza Razzazi, Abdelghani Chibani:
Multimodal Evaluation Method for Sound Event Detection. 1026-1030 - Francesca Ronchini, Romain Serizel:
A Benchmark of State-of-the-Art Sound Event Detection Systems Evaluated on Synthetic Soundscapes. 1031-1035 - Hye-jin Shim, Jee-weon Jung, Ju-ho Kim, Ha-Jin Yu:
Attentive Max Feature Map and Joint Training for Acoustic Scene Classification. 1036-1040 - Hu Hu, Sabato Marco Siniscalchi, Chao-Han Huck Yang, Chin-Hui Lee:
A Variational Bayesian Approach to Learning Latent Variables for Acoustic Knowledge Transfer. 1041-1045 - Christian Bergler, Manuel Schmitt, Andreas K. Maier, Rachael Xi Cheng, Volker Barth, Elmar Nöth:
ORCA-PARTY: An Automatic Killer Whale Sound Type Separation Toolkit Using Deep Learning. 1046-1050 - Mirco Pezzoli, Maximo Cobos, Fabio Antonacci, Augusto Sarti:
Sparsity-Based Sound Field Separation in the Spherical Harmonics Domain. 1051-1055 - Kazuyuki Arikawa, Shoichi Koyama, Hiroshi Saruwatari:
Spatial Active Noise Control Based on Individual Kernel Interpolation of Primary and Secondary Sound Fields. 1056-1060 - Sipei Zhao, Ian S. Burnett:
Time-Domain Acoustic Contrast Control with A Spatial Uniformity Constraint for Personal Audio Systems. 1061-1065 - Liming Shi, Guoli Ping, Xiaoxiang Shen, Mads Græsbøll Christensen:
Generation of Personal Sound Fields in Reverberant Environments Using Interframe Correlation. 1066-1070 - Jesper Brunnström, Shoichi Koyama, Marc Moonen:
Variable Span Trade-Off Filter for Sound Zone Control with Kernel Interpolation Weighting. 1071-1075 - Nara Hahn, Frank Schultz, Sascha Spors:
Time Domain Radial Filter Design for Spherical Waves. 1076-1080 - Junxiao Sun, Ke Zhang, Shuyi Niu, Yan Zhang, Youyong Kong:
Feature Space Message Passing Network for Medical Image Semantic Segmentation. 1081-1085 - Yixin Wang, Zhe Xu, Jiang Tian, Jie Luo, Zhongchao Shi, Yang Zhang, Jianping Fan, Zhiqiang He:
Cross-Domain Few-Shot Learning for Rare-Disease Skin Lesion Segmentation. 1086-1090 - Chen Li, Wei Chen, Xin Luo, Yulin He, Yusong Tan:
Adaptive Pseudo Labeling for Source-Free Domain Adaptation in Medical Image Segmentation. 1091-1095 - Abdullah F. Al-Battal, Imanuel R. Lerman, Truong Q. Nguyen:
Object Detection and Tracking in Ultrasound Scans Using an Optical Flow and Semantic Segmentation Framework Based on Convolutional Neural Networks. 1096-1100 - Dachuan Shi, Ruiyang Liu, Linmi Tao, Chun Yuan:
Heuristic Dropout: An Efficient Regularization Method for Medical Image Segmentation Models. 1101-1105 - Paria Jeihouni, Omid Dehzangi, Annahita Amireskandari, Ali Dabouei, Ali Rezai, Nasser M. Nasrabadi:
Superresolution and Segmentation of OCT Scans Using Multi-Stage Adversarial Guided Attention Training. 1106-1110 - Yusuke Akamatsu, Yoshifumi Onishi, Hitoshi Imaoka:
Heart Rate and Oxygen Saturation Estimation from Facial Video with Multimodal Physiological Data Generation. 1111-1115 - Kuan-Chen Wang, Kai-Chun Liu, Hsin-Min Wang, Yu Tsao:
EMGSE: Acoustic/EMG Fusion for Multimodal Speech Enhancement. 1116-1120 - Sawon Pratiher, Apoorva Srivastava, Yedla Bindu Priyatha, Nirmalya Ghosh, Amit Patra:
A Dilated Residual Vision Transformer for Atrial Fibrillation Detection from Stacked Time-Frequency ECG Representations. 1121-1125 - Crystal T. Wei, Ming-En Hsieh, Chien-Liang Liu, Vincent S. Tseng:
Contrastive Heartbeats: Contrastive Learning for Self-Supervised ECG Representation and Phenotyping. 1126-1130 - Omid Dehzangi, Paria Jeihouni, Jad Ramadan, Victor S. Finomore, Nasser M. Nasrabadi, Ali Rezai:
Ubiquitous Physiological Prediction of SUD Patients' Wellness State Using Memory-Based Convolutional Models. 1131-1135 - Mu Yang, Darpit Dave, Madhav Erraguntla, Gerard L. Coté, Ricardo Gutierrez-Osuna:
Joint Hypoglycemia Prediction and Glucose Forecasting via Deep Multi-Task Learning. 1136-1140 - Siddharth Subramani, Achuth Rao M. V, Anwesha Roy, Prasanna Suresh Hegde, Prasanta Kumar Ghosh:
SegNet-Based Deep Representation Learning for Dysphagia Classification. 1141-1145 - Francois Buet-Golfouse, Hans Roggeman, Islam Utyagulov:
Robust Collaborative Learning for Sequence Modelling. 1146-1150 - Jen-Cheng Hou, Aileen McGonigal, Fabrice Bartolomei, Monique Thonnat:
A Self-Supervised Pre-Training Framework for Vision-Based Seizure Classification. 1151-1155 - Huaiwen Luo, Lu Zhang, Lianyu Zhou, Xu Lin, Zehuai Zhang, Mingjiang Wang:
Design of Real-Time System Based on Machine Learning for Snoring and OSA Detection. 1156-1160 - Kaan Sel, Noah Huerta, Michael S. Sacks, Roozbeh Jafari:
Parametric Modeling of Human Wrist for Bioimpedance-Based Physiological Sensing. 1161-1165 - José Fernando Adrán Otero, Oscar Soláns Caballer, Pere Martí-Puig, Zhe Sun, Toshihisa Tanaka, Jordi Solé-Casals:
Preliminary Results on the Generation of Artificial Handwriting Data Using a Decomposition-Recombination Strategy. 1166-1170 - Suguru Kanoga, Takayuki Hoshino, Mitsunori Tada:
A Style Transfer Mapping and Fine-Tuning Subject Transfer Framework Using Convolutional Neural Networks for Surface Electromyogram Pattern Recognition. 1171-1175 - Chencheng Guo, Hui Qian, Baoling Hong:
Feature-Based Sensing Matrix Design for Analog to Information Converters. 1176-1180 - K. M. Naimul Hassan, Md. Shamiul Alam Hridoy, Naima Tasnim, Atia Faria Chowdhury, Tanvir Alam Roni, Sheikh Tabrez, Arik Subhana, Celia Shahnaz:
ALSNet: A Dilated 1-D CNN for Identifying ALS from Raw EMG Signal. 1181-1185 - Bilal Ahmad, Liana Khamidullina, Alexey Alexandrovich Korobkov, Alla Manina, Jens Haueisen, Martin Haardt:
Joint Model Order Estimation for Multiple Tensors with A Coupled Mode and Applications to the Joint Decomposition of EEG, MEG Magnetometer, and Gradiometer Tensors. 1186-1190 - Zhikang Zhang, Jonathan Zhao, Fengbo Ren:
An Experimental Study on Transferring Data-Driven Image Compressive Sensing to Bioelectric Signals. 1191-1195 - Elahe Rahimian, Soheil Zabihi, Amir Asif, Dario Farina, Seyed Farokh Atashzar, Arash Mohammadi:
Hand Gesture Recognition Using Temporal Convolutions and Attention Mechanism. 1196-1200 - Bo Fang, Junxin Chen, Wei Wang, Yicong Zhou:
Combining Multiple Style Transfer Networks and Transfer Learning For LGE-CMR Segmentation. 1201-1205 - Jaeyoung Huh, Shujaat Khan, Jong Chul Ye:
Multi-Domain Unpaired Ultrasound Image Artifact Removal Using a Single Convolutional Neural Network. 1206-1210 - Xiao Li, Huizhi Liang, Sidhartha Nagala, Jane Chen:
Improving Ultrasound Image Classification with Local Texture Quantisation. 1211-1215 - Tristan S. W. Stevens, Nishith Chennakeshava, Frederik J. de Bruijn, Martin Pekar, Ruud J. G. van Sloun:
Accelerated Intravascular Ultrasound Imaging using Deep Reinforcement Learning. 1216-1220 - Nishith Chennakeshava, Tristan S. W. Stevens, Frederik J. de Bruijn, Andrew Hancock, Martin Pekar, Yonina C. Eldar, Massimo Mischi, Ruud J. G. van Sloun:
Deep Proximal Unfolding For Image Recovery from Under-Sampled Channel Data in Intravascular Ultrasound. 1221-1225 - Gongpeng Cao, Yiping Wang, Manli Zhang, Jing Zhang, Guixia Kang, Xin Xu:
Multiview Long-Short Spatial Contrastive Learning For 3D Medical Image Analysis. 1226-1230 - Khuong Vo, Manoj Vishwanath, Ramesh Srinivasan, Nikil D. Dutt, Hung Cao:
Composing Graphical Models with Generative Adversarial Networks for EEG Signal Modeling. 1231-1235 - David Bethge, Philipp Hallgarten, Tobias Grosse-Puppendahl, Mohamed Kari, Ralf Mikut, Albrecht Schmidt, Ozan Özdenizci:
Domain-Invariant Representation Learning from EEG with Private Encoders. 1236-1240 - Guangyi Zhang, Ali Etemad:
Holistic Semi-Supervised Approaches for EEG Representation Learning. 1241-1245 - Pankaj Pandey, Gulshan Sharma, Krishna P. Miyapuram, Ramanathan Subramanian, Derek Lomas:
Music Identification Using Brain Responses to Initial Snippets. 1246-1250 - Wei Xu, Jing Wang, Ziyu Jia, Zhiqing Hong, Yunze Li, Youfang Lin:
Multi-Level Spatial-Temporal Adaptation Network for Motor Imagery Classification. 1251-1255 - Lies Bollens, Tom Francart, Hugo Van hamme:
Learning Subject-Invariant Representations from Speech-Evoked EEG Using Variational Autoencoders. 1256-1260 - Xinru Dai, Tai Ma, Haibin Cai, Ying Wen:
Unsupervised Hierarchical Translation-Based Model for Multi-Modal Medical Image Registration. 1261-1265 - Zailiang Chen, Hailei Lan, Yongan Meng, Yuchen Xiong, Jing Luo, Hailan Shen:
FAZ-BV: A Diabetic Macular Ischemia Grading Framework Combining Faz Attention Network and Blood Vessel Enhancement Filters. 1266-1270 - Lijuan Lu, Shun Miao, Ling Ye:
Fracture Detection and Localization in Chest X-Rays Using Semi-Supervised Learning with Dynamic Sharpening. 1271-1275 - Ryan Zhang, Jiadai Zhu, Stephen Yang, Mahdi S. Hosseini, Angelo Genovese, Lina Chen, Corwyn Rowsell, Savvas Damaskinos, Sonal Varma, Konstantinos N. Plataniotis:
Histokt: Cross Knowledge Transfer in Computational Pathology. 1276-1280 - Giovana Augusta Benvenuto, Marilaine Colnago, Wallace Casaca:
Unsupervised Deep Learning Network for Deformable Fundus Image Registration. 1281-1285 - Huijuan Yang, Aaron S. Coyner, Feri Guretno, Ivan Ho Mien, Chuan Sheng Foo, J. Peter Campbell, Susan Ostmo, Michael F. Chiang, Pavitra Krishnaswamy:
A Minimally Supervised Approach for Medical Image Quality Assessment in Domain Shift Settings. 1286-1290 - Yanbin He, Zhiyang Lu, Jun Wang, Jun Shi:
A Channel Attention Based MLP-Mixer Network for Motor Imagery Decoding With EEG. 1291-1295 - Miguel Angrick, Maarten C. Ottenhoff, Lorenz Diener, Darius Ivucic, Gabriel Ivucic, Sophocles Goulis, Albert J. Colon, G. Louis Wagner, Dean J. Krusienski, Pieter Leonard Kubben, Tanja Schultz, Christian Herff:
Towards Closed-Loop Speech Synthesis from Stereotactic EEG: A Unit Selection Approach. 1296-1300 - Jaeun Phyo, Wonjun Ko, Eunjin Jeon, Heung-Il Suk:
Enhancing Contextual Encoding With Stage-Confusion and Stage-Transition Estimation for EEG-Based Sleep Staging. 1301-1305 - Hadi Habibzadeh, Kevin J. Long, Ally E. Atkins, Daphney-Stavroula Zois, James J. S. Norton:
Improving BCI-based Color Vision Assessment Using Gaussian Process Regression. 1306-1310 - Shuji Komeiji, Kai Shigemi, Takumi Mitsuhashi, Yasushi Iimura, Hiroharu Suzuki, Hidenori Sugano, Koichi Shinoda, Toshihisa Tanaka:
Transformer-Based Estimation of Spoken Sentences Using Electrocorticography. 1311-1315 - Marzieh Ajirak, Cassandra Heiselman, J. Gerald Quirk, Petar M. Djuric:
Boost Ensemble Learning for Classification of CTG SIGNALS. 1316-1320 - Yifan Wang, Ying Lan:
Multi-View Learning Based on Non-Redundant Fusion for Icu Patient Mortality Prediction. 1321-1325 - Tong Chen, Guanchao Feng, Cassandra Heiselman, J. Gerald Quirk, Petar M. Djuric:
Improving Phase-Rectified Signal Averaging for Fetal Heart Rate Analysis. 1326-1330 - Liu Yang, Cassandra Heiselman, J. Gerald Quirk, Petar M. Djuric:
Unsupervised Clustering and Analysis of Contraction-Dependent Fetal Heart Rate Segments. 1331-1335 - Orestis Apostolou, Vasileios S. Charisis, Georgios K. Apostolidis, Leontios J. Hadjileontiadis:
A Method for Detecting Coronary Artery Disease using Noisy Ultrashort Electrocardiogram Recordings. 1336-1340 - Nele Sophie Brügge, Jan Graßhoff, Arne Weigenand, Philipp Rostalski:
Multi-Task Gaussian Process Regression for the Detection of Sleep Cycles in Premature Infants. 1341-1345 - Silpa Babu, Seyedehsara Nayer, Sajan Goud Lingala, Namrata Vaswani:
Fast Low Rank Column-Wise Compressive Sensing For Accelerated Dynamic MRI. 1346-1350 - Sizhuo Liu, Philip Schniter, Rizwan Ahmad:
MRI Recovery with a Self-Calibrated Denoiser. 1351-1355 - Wanqi Zhang, Lulu Wang, Wei Chen, Yuanyuan Jia, Zhongshi He, Jinglong Du:
3d Cross-Scale Feature Transformer Network for Brain Mr Image Super-Resolution. 1356-1360 - Harsh Singh, Ognjen Arandjelovic:
Data Efficient Support Vector Machine Training Using the Minimum Description Length Principle. 1361-1365 - Yuanpin Zhou, Yao Lu:
Multiple Instance Learning with Task-Specific Multi-Level Features for Weakly Annotated Histopathological Image Classification. 1366-1370 - Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama:
Self-Knowledge Distillation based Self-Supervised Learning for Covid-19 Detection from Chest X-Ray Images. 1371-1375 - Rui Xu, Yufeng Wang, Xinchen Ye, Pengcheng Wu, Yen-Wei Chen, Fangyi Xu, Wenchao Zhu, Chao Chen, Yong Zhou, Hongjie Hu, Xiaofeng Qu, Shoji Kido, Noriyuki Tomiyama:
Pixel-Level and Affinity-Level Knowledge Distillation for Unsupervised Segmentation of Covid-19 Lesions. 1376-1380 - Nastaran Enshaei, Moezedin Javad Rafiee, Arash Mohammadi, Farnoosh Naderkhani:
Data Shapley Value for Handling Noisy Labels: An Application in Screening Covid-19 Pneumonia from Chest CT Scans. 1381-1385 - Xiongbiao Luo:
Accurate Multiscale Selective Fusion of CT and Video Images for Real-Time Endoscopic Camera 3D Tracking in Robotic Surgery. 1386-1390 - Ruixiang Geng, Qing Liu, Shuo Feng, Yixiong Liang:
Learning Deep Pathological Features for WSI-Level Cervical Cancer Grading. 1391-1395 - Bowen Xu, Wenqiang Zhang:
Selective Scale Cascade Attention Network for Breast Cancer Histopathology Image Classification. 1396-1400 - Archishman Biswas, Hernando C. Ombao:
Frequency-Specific Non-Linear Granger Causality in a Network of Brain Signals. 1401-1405 - Kosuke Fukumori, Noboru Yoshida, Hidenori Sugano, Madoka Nakajima, Toshihisa Tanaka:
Epileptic Spike Detection by Recurrent Neural Networks with Self-Attention Mechanism. 1406-1410 - Jian Yin, Yuan Wang:
Topological Correlation of Brain Signals. 1411-1415 - Bahman Abdi-Sargezeh, Antonio Valentín, Gonzalo Alarcón, Saeid Sanei:
Online Detection of Scalp-Invisible Mesial-Temporal Brain Interictal Epileptiform Discharges from EEG. 1416-1420 - Yulu Wang, Yiwen Sun, Lei Fang, Changshui Zhang:
Leveraging Sparse Coding for EEG Based Emotion Recognition in Shooting. 1421-1425 - Weilai Li, Lanfeng Zhong, Weixi Xiang, Tongzhou Kang, Dakun Lai:
A Novel Unsupervised Autoencoder-Based HFOs Detector in Intracranial EEG Signals. 1426-1430 - Fei Ye, Zhiqiang Wang, Sheng Zhu, Xuanya Li, Kai Hu:
A Novel Convolutional Neural Network Based on Adaptive Multi-Scale Aggregation and Boundary-Aware for Lateral Ventricle Segmentation on MR images. 1431-1435 - Wentao Liu, Huihua Yang, Tong Tian, Xipeng Pan, Weijin Xu:
Multiscale Attention Aggregation Network for 2D Vessel Segmentation. 1436-1440 - Xinxin Shan, Tai Ma, Anqi Gu, Haibin Cai, Ying Wen:
TCRNet: Make Transformer, CNN and RNN Complement Each Other. 1441-1445 - Ke Zheng, Junhai Xu, Jianguo Wei:
Double Noise Mean Teacher Self-Ensembling Model for Semi-Supervised Tumor Segmentation. 1446-1450 - Siming Yuan, Qing Liu, Shenghui Liao, Fuchang Han, Haitao Wei, Yingqi Zhang:
Rethinking Computer-Aided Pelvis Segmentation. 1451-1455 - Hyunwoo Yu, Jae-hun Shim, Jaeho Kwak, Jou Won Song, Suk-Ju Kang:
Vision Transformer-Based Retina Vessel Segmentation with Deep Adaptive Gamma Correction. 1456-1460 - Yuan Wang, Moo K. Chung, Julius Fridriksson:
Spectral Permutation Test on Persistence Diagrams. 1461-1465 - Isabell Lehmann, Evrim Acar, Tanuj Hasija, Mohammad A. B. S. Akhonda, Vince D. Calhoun, Peter J. Schreier, Tülay Adali:
Multi-Task fMRI Data Fusion Using IVA and PARAFAC2. 1466-1470 - Hanlu Yang, Mohammad A. B. S. Akhonda, Fateme Ghayem, Qunfang Long, Vince D. Calhoun, Tülay Adali:
Independent Vector Analysis Based Subgroup Identification from Multisubject fMRI Data. 1471-1475 - Damian Pascual, Béni Egressy, Nicolas Affolter, Yiming Cai, Oliver Richter, Roger Wattenhofer:
Improving Brain Decoding Methods and Evaluation. 1476-1480 - Xiaofeng Liu, Fangxu Xing, Maureen Stone, Jerry L. Prince, Jangwon Kim, Georges El Fakhri, Jonghye Woo:
Cmri2spec: Cine MRI Sequence to Spectrogram Synthesis via A Pairwise Heterogeneous Translator. 1481-1485 - Wenhan Wang, Youyong Kong, Zhenghua Hou, Chunfeng Yang, Yonggui Yuan:
Spatio-Temporal Attention Graph Convolution Network for Functional Connectome Classification. 1486-1490 - Avrajit Ghosh, Michael T. McCann, Saiprasad Ravishankar:
Bilevel Learning of ℓ1 Regularizers with Closed-Form Gradients (BLORC). 1491-1495 - V. S. Unni, Ruturaj G. Gavaskar, Kunal N. Chaudhury:
Multiband Image Fusion with Controllable Error Guarantees. 1496-1500 - Zhuojie Huang, Shuping Zhao, Lunke Fei, Jigang Wu:
Weighted Graph Embedded Low-Rank Projection Learning for Feature Extraction. 1501-1505 - Vasiliki Kouni, Georgios Paraskevopoulos, Holger Rauhut, George C. Alexandropoulos:
ADMM-DAD Net: A Deep Unfolding Network for Analysis Compressed Sensing. 1506-1510 - Alexander Lin, Andrew H. Song, Berkin Bilgic, Demba E. Ba:
High-Dimensional Sparse Bayesian Learning without Covariance Matrices. 1511-1515 - Baoshun Shi, Yuxin Wang, Qiusheng Lian:
A Trainable Bounded Denoiser Using Double Tight Frame Network for Snapshot Compressive Imaging. 1516-1520 - Seobin Park, Tae Hyun Kim:
Progressive Image Super-Resolution via Neural Differential Equation. 1521-1525 - Yuhui Quan, Xinran Qin, Mingqin Chen, Yan Huang:
High-Quality Self-Supervised Snapshot Hyperspectral Imaging. 1526-1530 - Abderrahim Halimi, Jakeoung Koo, Robert A. Lamb, Gerald S. Buller, Steve McLaughlin:
Robust Bayesian Reconstruction of Multispectral Single-Photon 3D Lidar Data with Non-Uniform Background. 1531-1535 - Quentin Febvre, Ronan Fablet, Julien Le Sommer, Clément Ubelmann:
Joint Calibration and Mapping of Satellite Altimetry Data Using Trainable Variational Models. 1536-1540 - Michalis Giannopoulos, Grigorios Tsagkatakis, Panagiotis Tsakalides:
4D Convolutional Neural Networks for Multi-Spectral and Multi-Temporal Remote Sensing Data Classification. 1541-1545 - Cheick T. Cissé, Ahed Alboody, Matthieu Puigt, Gilles Roussel, Vincent Vantrepotte, Cédric Jamet, Trung-Kien Tran:
A New Deep Learning Method for Multispectral Image Time Series Completion Using Hyperspectral Data. 1546-1550 - Xinyi Wei, Hans Van Gorp, Lizeth Gonzalez-Carabarin, Daniel Freedman, Yonina C. Eldar, Ruud J. G. van Sloun:
Image Denoising with Deep Unfolding And Normalizing Flows. 1551-1555 - Rohit Ranade, Yangwen Liang, Shuangquan Wang, Dongwoon Bai, Jungwon Lee:
3D Texture Super Resolution via the Rendering Loss. 1556-1560 - Changhun Sung, Byungdeok Kim:
Bundle ICP with Virtual Depth for Hand-Held 3d Scanner. 1561-1565 - Julián Tachella, Michael P. Sheehan, Mike E. Davies:
Sketched RT3D: How to Reconstruct Billions of Photons Per Second. 1566-1570 - Naveen Kuruba, Neel Badadare, Vikram Narayan, Satish Putta:
A Generic Method to Estimate Camera Extrinsic Parameters. 1571-1575 - Yash Sanghvi, Abhiram Gnanasambandan, Stanley H. Chan:
Photon-Limited Deblurring Using Algorithm Unrolling. 1576-1580 - Wenpeng Xing, Jie Chen:
NEX+: Novel View Synthesis with Neural Regularisation Over Multi-Plane Images. 1581-1585 - Daniel Nicholls, Alex W. Robinson, Jack Wells, Amirafshar Moshtaghpour, Mounib Bahri, Angus I. Kirkland, Nigel D. Browning:
Compressive Scanning Transmission Electron Microscopy. 1586-1590 - Simon Welker, Tal Peer, Henry N. Chapman, Timo Gerkmann:
Deep Iterative Phase Retrieval for Ptychography. 1591-1595 - Vinayak Killedar, Chandra Sekhar Seelamantula:
Compressive Phase Retrieval Based On Sparse Latent Generative Priors. 1596-1600 - Abdulrahman M. Alanazi, Singanallur V. Venkatakrishnan, Hector J. Santos-Villalobos, Gregery T. Buzzard, Charles A. Bouman:
Model-Based Reconstruction for Collimated Beam Ultrasound Systems. 1601-1605 - Tim Straubinger, Robert Xiao, Helge Rhodin:
Learned Acoustic Reconstruction Using Synthetic Aperture Focusing. 1606-1610 - Guanze Liu, Bo Xu, Han Huang, Cheng Lu, Yandong Guo:
SDETR: Attention-Guided Salient Object Detection with Transformer. 1611-1615 - Kristian Fischer, Markus Hofbauer, Christopher B. Kuhn, Eckehard G. Steinbach, André Kaup:
Evaluation of Video Coding for Machines without Ground Truth. 1616-1620 - Thuc Nguyen Huu, Vinh Van Duong, Jonghoon Yim, Byeungwoo Jeon:
Raw Plenoptic Video Coding Under Hexagonal Lattice Resolution of Motion Vectors. 1621-1624 - Kianoush Jafari, Alireza Aminlou, Miska M. Hannuksela:
Comparison of Boundary Artifact Removal Methods in Coding of Generalized Cubemap Projection Using VVC. 1625-1629 - Shen Wang, Yibing Fu, Chen Zhu, Li Song, Wenjun Zhang:
Low-Complexity Multi-Model CNN in-Loop Filter for AVS3. 1630-1634 - Junyan Huo, Yu Sun, Haixin Wang, Shuai Wan, Fuzheng Yang, Ming Li:
Unified Matrix Coding for NN Originated MIP in H.266/VVC. 1635-1639 - Yuanyuan Xu, Taoyu Yang, Zengjie Tan, Haolun Lan:
FOV-Based Coding Optimization for 360-Degree Virtual Reality Videos. 1640-1644 - Jian Wang, Xinyue Li, Wei Song, Zhichao Zhang, Weiqi Guo:
Multi-Hierarchy Proxy Structure for Deep Metric Learning. 1645-1649 - Michail Kaseris, Ioannis Mademlis, Ioannis Pitas:
Exploiting Caption Diversity for Unsupervised Video Summarization. 1650-1654 - Wanqian Zhang, Dayan Wu, Chule Yang, Bo Li, Weiping Wang:
Clustering and Separating Similarities for Deep Unsupervised Hashing. 1655-1659 - Junying Huang, Fan Chen, Keze Wang, Liang Lin, Dongyu Zhang:
Enhancing Prototypical Few-Shot Learning By Leveraging The Local-Level Strategy. 1660-1664 - Chao Zhou, Miguel R. D. Rodrigues:
Blind Unmixing Using A Double Deep Image Prior. 1665-1669 - Yi Liu, Yanjie Liang, Qiangqiang Wu, Liming Zhang, Hanzi Wang:
A New Framework for Multiple Deep Correlation Filters Based Object Tracking. 1670-1674 - Bo-Hao Chen, Hsiang-Yin Cheng, Jia-Li Yin:
Adaptive Actor-Critic Bilateral Filter. 1675-1679 - Niklas Kämper, Joachim Weickert:
Domain Decomposition Algorithms for Real-Time Homogeneous Diffusion Inpainting in 4K. 1680-1684 - Michiaki Tatsubori, Takao Moriyama, Tatsuya Ishikawa, Paolo Fraccaro, Anne Jones, Blair Edwards, Julian Kuehnert, Sekou L. Remy:
Deep Temporal Interpolation of Radar-Based Precipitation. 1685-1689 - Zikai Sun, Thierry Blu:
A Nonlinear Steerable Complex Wavelet Decomposition of Images. 1690-1694 - Xiang Cao, Haibo Shen, Liangqi Zhang, Yihao Luo, Tianjiang Wang:
Kernel Estimation Network for Blind Super-Resolution. 1695-1699 - Yixiong Zhang, Zhipeng Su, Feng Qi, Jianyang Zhou, Xiaoping Zhang:
Terahertz Image Restoration Benchmarking Dataset. 1700-1704 - Xingrun Xing, Yalong Jiang, Baochang Zhang, Wenrui Ding, Yangguang Li, Hongguang Li, Huan Peng:
Binary Dense Predictors for Human Pose Estimation Based on Dynamic Thresholds and Filtering. 1705-1709 - Haidong Zhu, Zhaoheng Zheng, Mohammad Soleymani, Ram Nevatia:
Self-Supervised Learning for Sentiment Analysis via Image-Text Matching. 1710-1714 - Wei-Yu Lee, Jheng-Yu Wang, Yu-Chiang Frank Wang:
Domain-Agnostic Meta-Learning for Cross-Domain Few-Shot Classification. 1715-1719 - Dahyun Kim, Sunjae Yoon, Ji Woo Hong, Chang D. Yoo:
Semantic Association Network for Video Corpus Moment Retrieval. 1720-1724 - Nida Itrat Abbasi, Siyang Song, Hatice Gunes:
Statistical, Spectral and Graph Representations for Video-Based Facial Expression Recognition in Children. 1725-1729 - Nakyeong Yang, Taegwan Kang, Kyomin Jung:
Deriving Explainable Discriminative Attributes Using Confusion About Counterfactual Class. 1730-1734 - Chenghu Du, Feng Yu, Minghua Jiang, Yaxin Zhao, Xiong Wei, Tao Peng, Xinrong Hu:
Realistic Monocular-To-3d Virtual Try-On Via Multi-Scale Characteristics Capture. 1735-1739 - Ehsan Pajouheshgar, Tong Zhang, Sabine Süsstrunk:
Optimizing Latent Space Directions for Gan-Based Local Image Editing. 1740-1744 - Jingning Xu, Benlai Tang, Mingjie Wang, Siyuan Bian, Wenyi Guo, Xiang Yin, Zejun Ma:
Towards Using Clothes Style Transfer for Scenario-Aware Person Video Generation. 1745-1749 - Somi Jeong, Jiyoung Lee, Kwanghoon Sohn:
Multi-Domain Unsupervised Image-to-Image Translation with Appearance Adaptive Convolution. 1750-1754 - Yifan Yuan, Siteng Ma, Junping Zhang:
VR-FAM: Variance-Reduced Encoder with Nonlinear Transformation for Facial Attribute Manipulation. 1755-1759 - George Eskandar, Mohamed Abdelsamad, Karim Armanious, Shuai Zhang, Bin Yang:
Wavelet-Based Unsupervised Label-to-Image Translation. 1760-1764 - Sadid Sahami, Gene Cheung, Chia-Wen Lin:
Fast Graph Sampling for Short Video Summarization Using Gershgorin Disc Alignment. 1765-1769 - Xiaopeng Ke, Boyu Chang, Hao Wu, Fengyuan Xu, Sheng Zhong:
Towards Practical and Efficient Long Video Summary. 1770-1774 - Sunhee Hwang, Minsong Ki, Seung-Hyun Lee, Sanghoon Park, Byoung-Ki Jeon:
Cut And Continuous Paste Towards Real-Time Deep Fall Detection. 1775-1779 - Aditya Singh, Saheb Chhabra, Puspita Majumdar, Richa Singh, Mayank Vatsa:
Mannet: A Large-Scale Manipulated Image Detection Dataset And Baseline Evaluations. 1780-1784 - Laura Kart, Niv Cohen:
Approaches Toward Physical and General Video Anomaly Detection. 1785-1789 - Suiyi Ling, Andreas Pastor, Junle Wang, Patrick Le Callet:
Considering User Agreement in Learning to Predict the Aesthetic Quality. 1790-1794 - Qi Zheng, Zhengzhong Tu, Yibo Fan, Xiaoyang Zeng, Alan C. Bovik:
No-Reference Quality Assessment of Variable Frame-Rate Videos Using Temporal Bandpass Statistics. 1795-1799 - Joel Jung, Alexandre Giraud, Meijia Song, Songnan Li, Xiang Li, Shan Liu:
Towards Joint Frame-Level and MOS Quality Predictions with Low-Complexity Objective Models. 1800-1804 - Satyam Mohla, Anshul Nasery, Biplab Banerjee:
Teaching CNNs to Mimic Human Visual Cognitive Process & Regularise Texture-Shape Bias. 1805-1809 - Shaoguo Wen, Suiyi Ling, Junle Wang, Ximing Chen, Yanqing Jing, Patrick Le Callet:
Subjective And Objective Quality Assessment Of Mobile Gaming Video. 1810-1814 - Yanzhe Zhong, Huadong Pan, Bangjie Tang, Zhonggeng Liu, Yiming Zhu, Jun Yin:
ER-PIQA: A Task-Guided Pedestrian Image Quality Assessment Via Embedding Reconstruction. 1815-1819 - Mohsen Zand, Haleh Damirchi, Andrew Farley, Mahdiyar Molahasani, Michael A. Greenspan, Ali Etemad:
Multiscale Crowd Counting and Localization By Multitask Point Supervision. 1820-1824 - Yu-Zhang Chen, Tsung-Jung Liu, Kuan-Hsien Liu:
Super-Resolution of Satellite Images by two-Dimensional RRDB and Edge-Enhancement Generative Adversarial Network. 1825-1829 - Saurabh Sahu, Palash Goyal:
Leveraging Local Temporal Information for Multimodal Scene Classification. 1830-1834 - Menghao Li, Mingtao Pei, Wei Liang:
Predicting Human Motion Using Key Subsequences. 1835-1839 - Ruxin Ding, Jianfeng Ren, Heng Yu, Jiawei Li:
Dynamic Texture Recognition Using PDV Hashing and Dictionary Learning on Multi-Scale Volume Local Binary Pattern. 1840-1844 - Qing Gao, Mingtao Pei, Hongyu Shen:
Do You Live a Healthy Life? Analyzing Lifestyle by Visual Life Logging. 1845-1849 - Liping Huang, Taizo Suzuki:
Weighted Wavelet-Based Spectral-Spatial Transforms For CFA-Sampled Raw Camera Image Compression Considering Image Features. 1850-1854 - Dongyang Li, Zhenhong Sun, Zhiyu Tan, Xiuyu Sun, Fangyi Zhang, Yichen Qian, Hao Li:
Jmpnet: Joint Motion Prediction for Learning-Based Video Compression. 1855-1859 - Fabian Brand, Christian Herglotz, André Kaup:
A Low-Parametric Model for Bit-Rate Estimation of VVC Residual Coding. 1860-1864 - Vignesh V. Menon, Hadi Amirpour, Mohammed Ghanbari, Christian Timmerer:
OPTE: Online Per-Title Encoding for Live Video Streaming. 1865-1869 - Kedeng Tong, Xin Jin, Chen Wang, Fan Jiang:
SADN: Learned Light Field Image Compression with Spatial-Angular Decorrelation. 1870-1874 - Wenfeng Li, Zongcai Du, Hao He, Jie Tang, Gangshan Wu:
Hierarchical Feature Aggregation Network for Deep Image Compression. 1875-1879 - Tianyou Chen, Xiaoguang Hu, Jin Xiao, Guofeng Zhang, Shaojie Wang:
Accurate Instance Segmentation Via Collaborative Learning. 1880-1884 - Jiehua Zhang, Zhuo Su, Yanghe Feng, Xin Lu, Matti Pietikäinen, Li Liu:
Dynamic Binary Neural Network by Learning Channel-Wise Thresholds. 1885-1889 - Wanyu Wu, Wei Wang, Kui Jiang, Xin Xu, Ruimin Hu:
Self-Supervised Learning on A Lightweight Low-Light Image Enhancement Model with Curve Refinement. 1890-1894 - Jingquan Wang, Jing Xu, Yu Pan, Zenglin Xu:
Semantically Proportional Patchmix for Few-Shot Learning. 1895-1899 - Zhikui Chen, Tiandong Ji, Suhua Zhang, Fangming Zhong:
Noise Suppression for Improved Few-Shot Learning. 1900-1904 - Cheryl Sze Yin Wong, Guo Yang, Arulmurugan Ambikapathi, Savitha Ramasamy:
Online Continual Learning Using Enhanced Random Vector Functional Link Networks. 1905-1909 - Miaohua Zhang, Yongsheng Gao, Jun Zhou:
A Generalized Kernel Risk Sensitive Loss for Robust Two-Dimensional Singular Value Decomposition. 1910-1914 - Xiangling Ding, Pu Huang, Dengyong Zhang, Xianfeng Zhao:
Video Frame Interpolation via Local Lightweight Bidirectional Encoding with Channel Attention Cascade. 1915-1919 - Yue Lv, Wenming Yang, Wangmeng Zuo, Qingmin Liao, Rui Zhu:
Sain: Similarity-Aware Video Frame Interpolation. 1920-1924 - Zejia Fan, Jiaying Liu, Wenhan Yang, Wei Xiang, Zongming Guo:
Self-Learned Video Super-Resolution with Augmented Spatial and Temporal Context. 1925-1929 - Jiahui Liu, Mingcai Zhou, Meng Xiao:
Deformable Convolution Dense Network for Compressed Video Quality Enhancement. 1930-1934 - Siying Liu, Roxana Alexandru, Pier Luigi Dragotti:
Convolutional ISTA Network with Temporal Consistency Constraints for Video Reconstruction from Event Cameras. 1935-1939 - Xuezhi Tong, Rui Wang, Chuan Wang, Sanyi Zhang, Xiaochun Cao:
PMP-NET: Rethinking Visual Context for Scene Graph Generation. 1940-1944 - Feicheng Huang, Zhixin Li:
Improve Image Captioning Via Relation Modeling. 1945-1949 - Lei Cui, Huan Peng, Yangguang Li, Chuming Li, Xingrun Xing:
Equal Loss: A Simple Loss Function for Noise Robust Learning. 1950-1954 - Boyang Wan, Wenhui Jiang, Yuming Fang:
Informative Attention Supervision for Grounded Video Description. 1955-1959 - Jialu Zhang, Qian Zhang, Jianfeng Ren, Yitian Zhao, Jiang Liu:
Spatial-Context-Aware Deep Neural Network for Multi-Class Image Classification. 1960-1964 - Hongjun Wu, Mengzhu Li, Yongcheng Liu, Hongzhe Liu, Cheng Xu, Xuewei Li:
Transtl: Spatial-Temporal Localization Transformer for Multi-Label Video Classification. 1965-1969 - Kyuyeon Kim, Junsik Jung, Woo Jae Kim, Sung-Eui Yoon:
Deep Video Inpainting Guided by Audio-Visual Self-Supervision. 1970-1974 - Guangwei Li, Xuenan Xu, Mengyue Wu, Kai Yu:
Navigating Audio-Visual Event Detection Across Mismatched Modalities. 1975-1979 - Donglai Wei, Chen-Geng Liu, Yang Liu, Jing Liu, Xiao-Guang Zhu, Xinhua Zeng:
Look, Listen and Pay More Attention: Fusing Multi-Modal Information for Video Violence Detection. 1980-1984 - Changsheng Xu, Zhenlong Xu, Yifan He, Shuigeng Zhou, Jihong Guan:
Multi-Modal Learning with Text Merging for TEXTVQA. 1985-1989 - Ping Wang, Yijie Cao, Lei Lu:
A Novel Part Feature Integration and Fusion Method for Fine-Grained Vehicle Recognition. 1990-1994 - Yiqiang Chen, Feng Liu, Ke Pei:
Monocular Vehicle 3D Bounding Box Estimation Using Homograhy and Geometry in Traffic Scene. 1995-1999 - Xin Yi, Bo Ma, Jiahao Wu:
FSM: Feature Sampling Module for Object Detection. 2000-2004 - Senyun Kuang, Shijin Meng, Bo Xiao, Lv Tang, Bo Li:
Rethinking Two-B-Real Net for Real-Time Salient Object Detection. 2005-2009 - Bo Cui, Hui Qu, Xuhui Huang, Shan Yu:
Balanced Ranking and Sorting For Class Incremental Object Detection. 2010-2014 - Yihao Luo, Xiang Cao, Juntao Zhang, Leixilan Pan, Tianjiang Wang, Qi Feng:
Multi-Scale Reinforcement Learning Strategy for Object Detection. 2015-2019 - Zhihao Wu, Chengliang Liu, Chao Huang, Jie Wen, Yong Xu:
Deep Object Detection with Example Attribute Based Prediction Modulation. 2020-2024 - Shanzhi Yin, Chao Li, Youneng Bao, Yongsheng Liang, Fanyang Meng, Wei Liu:
Universal Efficient Variable-Rate Neural Image Compression. 2025-2029 - Bowen Li, Xin Yao, Chao Li, Youneng Bao, Fanyang Meng, Yongsheng Liang:
AdderIC: Towards Low Computation Cost Image Compression. 2030-2034 - Saiping Zhang, Luis Herranz, Marta Mrak, Marc Górriz Blanch, Shuai Wan, Fuzheng Yang:
DCNGAN: A Deformable Convolution-Based GAN with QP Adaptation for Perceptual Quality Enhancement of Compressed Video. 2035-2039 - Anne-Flore Perrin, Yejing Xie, Tao Zhang, Yiting Liao, Junlin Li, Patrick Le Callet:
Specialised Video Quality Model For Enhanced User Generated Content (UGC) With Special Effects. 2040-2044 - Andreas Pastor, Lukás Krasula, Xiaoqing Zhu, Zhi Li, Patrick Le Callet:
Improving Maximum Likelihood Difference Scaling Method To Measure Inter Content Scale. 2045-2049 - Ao-Xiang Zhang, Yuan-Gen Wang:
Texture Information Boosts Video Quality Assessment. 2050-2054 - Keisuke Ozawa:
Plug-and-Play and Relay Regularizations on Noisy Low Rank Tensor Completion for Snapshot Multispectral Image Restoration. 2055-2059 - Ashish Tiwari, Shanmuganathan Raman:
LERPS: Lighting Estimation and Relighting for Photometric Stereo. 2060-2064 - Huiyu Duan, Xiongkuo Min, Wei Shen, Guangtao Zhai:
A Unified Two-Stage Model for Separating Superimposed Images. 2065-2069 - Siyu Huang, Haoyi Xiong, Tianyang Wang, Bihan Wen, Qingzhong Wang, Zeyu Chen, Jun Huan, Dejing Dou:
Parameter-Free Style Projection for Arbitrary Image Style Transfer. 2070-2074 - Yangfan Sun, Zhu Li, Li Li, Shizheng Wang, Wei Gao:
Optimization of Compressive Light Field Display in Dual-Guided Learning. 2075-2079 - Yusuke Matsui, Yoshiki Imaizumi, Naoya Miyamoto, Naoki Yoshifuji:
ARM 4-BIT PQ: SIMD-Based Acceleration for Approximate Nearest Neighbor Search on ARM. 2080-2084 - Chao Wang, Yi Gu, Jie Li, Xinlei He, Zirui Zhang, Yuting Gao, Chentao Wu:
Iterative Learning for Distorted Image Restoration. 2085-2089 - Xiaoyu Zhang, Wei Gao, Hui Yuan, Ge Li:
JE2NET: Joint Exploitation and Exploration in Reinforcement Learning Based Image Restoration. 2090-2094 - Kun Yang, Juan Zhang, Xiaoqi Lang:
Multiple Patch-Aware Network for Faster Real-World Image Dehazing. 2095-2099 - Zhenyu Tang, Long Ma, Xiaoke Shang, Xin Fan:
Learning to Fuse Heterogeneous Features for Low-Light Image Enhancement. 2100-2104 - Jiachun Li, Kunkun Qin, Ruotao Xu, Hui Ji:
Deep Scale-Aware Image Smoothing. 2105-2109 - Yanbo Gao, Menghu Jia, Shuai Li, Xun Cai, Mao Ye, Frédéric Dufaux:
A Multiscale Gradient-Backpropagation Optimization Framework for Deformable Convolution Based Compressed Video Enhancement. 2110-2114 - Tomohiro Hayase, Suguru Yasutomi, Nakamasa Inoue:
Downstream Augmentation Generation For Contrastive Learning. 2115-2119 - Chao Dong, Qi Ye, Wenchao Meng, Kaixiang Yang:
Few-Shot Learning with Improved Local Representations via Bias Rectify Module. 2120-2124 - Pichao Wang, Fan Wang, Hao Li:
Image-to-Video Re-Identification via Mutual Discriminative Knowledge Transfer. 2125-2129 - Fangxin Liu, Wenbo Zhao, Yongbiao Chen, Zongwu Wang, Fei Dai:
DynSNN: A Dynamic Approach to Reduce Redundancy in Spiking Neural Networks. 2130-2134 - Yongsheng Zhang, Qing Liu, Yang Zhao, Yixiong Liang:
MEJIGCLU: More Effective Jigsaw Clustering For Unsupervised Visual Representation Learning. 2135-2139 - Cheng Zhuang, Yunlian Sun:
Ganet: Unary Attention Reaches Pairwise Attention Via Implicit Group Clustering in Light-Weight CNNs. 2140-2144 - Ting-Wei Chang, Wei-Chen Chiu, Ching-Chun Huang:
Find The Way Back: Invertible Kernel Estimator For Blind Image Super-Resolution. 2145-2149 - Haoquan Wang, Gang Zhang, Zhichun Lei:
Fine-Grained Dynamic Loss for Accurate Single-Image Super-Resolution. 2150-2154 - Gongzhe Li, Linwei Qiu, Haopeng Zhang, Fengying Xie, Zhiguo Jiang:
Multi-Frame Super-Resolution With Raw Images Via Modified Deformable Convolution. 2155-2159 - Yan Wang, Yao Lu, Shunzhou Wang, Wenyao Zhang, Zijian Wang:
Local-Global Feature Aggregation for Light Field Image Super-Resolution. 2160-2164 - Hao He, Zongcai Du, Wenfeng Li, Jie Tang, Gangshan Wu:
Pyramid Fusion Attention Network For Single Image Super-Resolution. 2165-2169 - Xian Zhong, Zhuo Zhou, Wenxuan Liu, Kui Jiang, Xuemei Jia, Wenxin Huang, Zheng Wang:
VCD: View-Constraint Disentanglement for Action Recognition. 2170-2174 - Chengming Zou, Ducheng Yuan, Long Lan, Haoang Chi:
Privacy-Preserving Action Recognition. 2175-2179 - Hongcheng Zhang, Xu Zhao:
Spatio-Temporal Motion Aggregation Network for Video Action Detection. 2180-2184 - Yanhao Jing, Feng Wang:
TP-VIT: A Two-Pathway Vision Transformer for Video Action Recognition. 2185-2189 - Yang Liu, Jing Liu, Xiaoguang Zhu, Donglai Wei, Xiaohong Huang, Liang Song:
Learning Task-Specific Representation for Video Anomaly Detection with Spatial-Temporal Attention. 2190-2194 - Mengzhu Li, Hongjun Wu, Yongcheng Liu, Hongzhe Liu, Cheng Xu, Xuewei Li:
W-ART: Action Relation Transformer for Weakly-Supervised Temporal Action Localization. 2195-2199 - Jinpeng Liu, Song Wu, Dehong He, Guoqiang Xiao:
MS-ROCANet: Multi-Scale Residual Orthogonal-Channel Attention Network for Scene Text Detection. 2200-2204 - Shan Liu, Guoqiang Xiao, Xiaohui Xu, Song Wu:
Bi-Directional Normalization and Color Attention-Guided Generative Adversarial Network for Image Enhancement. 2205-2209 - Zhikui Chen, Han Wang, Suhua Zhang, Fangming Zhong:
Dual-Attention Network for Few-Shot Segmentation. 2210-2214 - Jiapeng Li, Ge Li, Thomas H. Li:
Attention Guided Invariance Selection for Local Feature Descriptors. 2215-2219 - Jiahao Wang, Mingdeng Cao, Shuwei Shi, Baoyuan Wu, Yujiu Yang:
Attention Probe: Vision Transformer Distillation in the Wild. 2220-2224 - Bin Jiang, Fangqiang Xu, Jun Xia, Chao Yang, Wei Huang, Yun Huang:
Stacked Multi-Scale Attention Network for Image Colorization. 2225-2229 - Han Wang, Yali Li, Shengjin Wang:
CRPN: Distinguish Novel Categories Via Class-Relevant Region Proposal Network for Few-Shot Object Detection. 2230-2234 - Zhishan Li, Mingmu Chen, Yifan He, Lei Xie, Hongye Su:
An Efficient Framework for Detection and Recognition of Numerical Traffic Signs. 2235-2239 - Zongyao Li, Ren Togo, Takahiro Ogawa, Miki Haseyama:
Divergence-Guided Feature Alignment for Cross-Domain Object Detection. 2240-2244 - Jun Wang, Hefeng Zhou, Xiaohan Yu:
PGTRNET: Two-Phase Weakly Supervised Object Detection with Pseudo Ground Truth Refinement. 2245-2249 - Weijie Liu, Chong Wang, Shenghao Yu, Chenchen Tao, Jun Wang, Jiafei Wu:
Novel Instance Mining with Pseudo-Margin Evaluation for Few-Shot Object Detection. 2250-2254 - Chuang Yang, Mulin Chen, Yuan Yuan, Qi Wang:
BiP-Net: Bidirectional Perspective Strategy Based Arbitrary-Shaped Text Detection Network. 2255-2259 - Tim Heydrich, Yimin Yang, Xiangyu Ma, Yu Liu, Shan Du:
A Novel Lightweight Network for Fast Monocular Depth Estimation. 2260-2264 - Tim Heydrich, Yimin Yang, Shan Du:
A Lightweight Self-Supervised Training Framework for Monocular Depth Estimation. 2265-2269 - Hao Liu, Hui Yuan, Raouf Hamzaoui, Wei Gao, Shuai Li:
PU-Refiner: A Geometry Refiner with Adversarial Learning for Point Cloud Upsampling. 2270-2274 - Bo-Fan Chen, Yang-Ming Yeh, Yi-Chang Lu:
CF-Net: Complementary Fusion Network for Rotation Invariant Point Cloud Completion. 2275-2279 - Zihao Zhang, Nan Sang, Xupeng Wang:
TH-Net: A Method Of Single 3d Object Tracking Based On Transformers And Hausdorff Distance. 2280-2284 - Hengxin Feng, Weifeng Liu, Yanjiang Wang, Baodi Liu:
Enrich Features for Few-Shot Point Cloud Classification. 2285-2289 - Jaewoo Lee, Daeul Park, Dongwook Lee, Daehyun Ji:
Semi-Supervised 360° Depth Estimation from Multiple Fisheye Cameras with Pixel-Level Selective Loss. 2290-2294 - Wei Zhong, Yazhi Yuan, Xinchen Ye, Dian Zheng, Rui Xu:
Underwater Stereo Matching Via Unsupervised Appearance And Feature Adaptation Networks. 2295-2299 - Pei Tang, Liangrui Peng, Ruijie Yan, Haodong Shi, Gang Yao, Changsong Liu, Jie Li, Yuqi Zhang:
Domain Adaptation via Mutual Information Maximization for Handwriting Recognition. 2300-2304 - Ang Li, Jian Hu, Chilin Fu, Xiaolu Zhang, Jun Zhou:
Attribute-Conditioned Face Swapping Network for Low-Resolution Images. 2305-2309 - Ying Bian, Peng Zhang, Jingjing Wang, Chunmao Wang, Shiliang Pu:
Learning Multiple Explainable and Generalizable Cues for Face Anti-Spoofing. 2310-2314 - Bastien Laville, Laure Blanc-Féraud, Gilles Aubert:
Off-The-Grid Covariance-Based Super-Resolution Fluctuation Microscopy. 2315-2319 - Zhiyuan Zha, Bihan Wen, Xin Yuan, Jiantao Zhou, Ce Zhu:
Simultaneous Nonlocal Low-Rank And Deep Priors For Poisson Denoising. 2320-2324 - Yiming Liu, Yanni Zhang, Qiang Li, Jun Kong, Miao Qi, Jianzhong Wang:
Double Closed-Loop Network for Image Deblurring. 2325-2329 - Ying Zhang, Youjun Xiang, Lei Cai, Yuli Fu, Wanliang Huo, Junjun Xia:
Single Image De-Raining with High-Low Frequency Guidance. 2330-2334 - Wu Yang, Wuzhen Shi:
Detail Generation and Fusion Networks for Image Inpainting. 2335-2339 - Hong Liu, Ying Zhu, Guoliang Hua, Weibo Huang, Runwei Ding:
Adaptive Weighted Network With Edge Enhancement Module For Monocular Self-Supervised Depth Estimation. 2340-2344 - Diclehan Karakaya, Oguzhan Ulucan, Mehmet Türkan:
Pas-Mef: Multi-Exposure Image Fusion Based On Principal Component Analysis, Adaptive Well-Exposedness And Saliency Map. 2345-2349 - Miaoju Ban, Runwei Ding, Jian Zhang, Tianyu Guo, Tao Wang:
PDD-Net: A Precise Defect Detection Network Based on Point Set Representation. 2350-2354 - Renhui Zhang, Tiancheng Lin, Rui Zhang, Yi Xu:
Solving The Long-Tailed Problem Via Intra- And Inter-Category Balance. 2355-2359 - Zhanchao Huang, Wei Li, Ran Tao:
Extracting and Distilling Direction-Adaptive Knowledge for Lightweight Object Detection in Remote Sensing Images. 2360-2364 - Xiaoliu Luo, Jing Luo, Zhao Duan, Jin Tan, Taiping Zhang:
Pseudo-Interacting Guided Network for Few-Shot Segmentation. 2365-2369 - Yuehui Wang, Qing Wang, Dongyu Zhang:
Few-Shot Generation By Modeling Stereoscopic Priors. 2370-2374 - Kohei Matsuzaki, Kei Kawamura:
Relative Viewpoint Estimation Based on Structured 3d Representation Alignment. 2375-2379 - Minxiang Ye, Yifei Zhang, Shiqiang Zhu, Anhuan Xie, Dan Zhang:
Deep Markov Clustering for Panoptic Segmentation. 2380-2384 - Libo Liu, Chengjian Huang, Chunsheng Cai, Xiaodong Zhang, Qingmao Hu:
Multi-Task Learning Improves the Brain Stoke Lesion Segmentation. 2385-2389 - Hongyi Wang, Shiao Xie, Lanfen Lin, Yutaro Iwamoto, Xian-Hua Han, Yen-Wei Chen, Ruofeng Tong:
Mixed Transformer U-Net for Medical Image Segmentation. 2390-2394 - Wankang Zeng, Wenkang Fan, Dongfang Shen, Yinran Chen, Xiongbiao Luo:
Contrastive Translation Learning For Medical Image Segmentation. 2395-2399 - Tianfang Meng, Wenqiang Zhang:
Fast Video Object Segmentation via Dynamic YOLACT. 2400-2404 - Tiyu Fang, Zhen Liang, Xiuli Shao, Zihao Dong, Jinping Li:
Depth Removal Distillation for RGB-D Semantic Segmentation. 2405-2409 - Lingzhao Ju, Xu Zhao:
Mask-Based Attention Parallel Network for in-the-Wild Facial Expression Recognition. 2410-2414 - Lifang Zhou, Siqin Li, Yi Wang, Junlin Liu:
SDNET: Lightweight Facial Expression Recognition For Sample Disequilibrium. 2415-2419 - Mengting Wei, Wenming Zheng, Yuan Zong, Xingxun Jiang, Cheng Lu, Jiateng Liu:
A Novel Micro-Expression Recognition Approach Using Attention-Based Magnification-Adaptive Networks. 2420-2424 - Weidong Tian, Housen Zhang, Chen Peng, Zhong-Qiu Zhao:
Lipreading Model Based On Whole-Part Collaborative Learning. 2425-2429 - Ahmed Al-Hindawi, Marcela P. Vizcaychipi, Yiannis Demiris:
What Is The Patient Looking At? Robust Gaze-Scene Intersection Under Free-Viewing Conditions. 2430-2434 - Haoxian Huang, Luqian Ren, Zhuo Yang, Yinwei Zhan, Qieshi Zhang, Jujian Lv:
GAZEATTENTIONNET: Gaze Estimation with Attentions. 2435-2439 - Yang Yang, Yonghua Zhang, Xiaojie Guo:
Low-Light Image Enhancement via Feature Restoration. 2440-2444 - Xiaoyu Zhang, Wei Gao:
HIRL: Hybrid Image Restoration Based on Hierarchical Deep Reinforcement Learning via Two-Step Analysis. 2445-2449 - Chengrong Wang, Chenjie Cao, Yanwei Fu, Xiangyang Xue:
High-Fidelity Portrait Editing Via Exploring Differentiable Guided Sketches from the Latent Space. 2450-2454 - Zhihong Pan:
Learning Adjustable Image Rescaling with Joint Optimization of Perception and Distortion. 2455-2459 - Wenjun Chen, Chunling Yang, Xin Yang:
FSOINET: Feature-Space Optimization-Inspired Network For Image Compressive Sensing. 2460-2464 - Keuntek Lee, Yeong Il Jang, Nam Ik Cho:
Disentangled Feature-Guided Multi-Exposure High Dynamic Range Imaging. 2465-2469 - Peilun Du, Xiaolong Zheng, Liang Liu, Huadong Ma:
Defending Against Universal Attack Via Curvature-Aware Category Adversarial Training. 2470-2474 - Yunjian Zhang, Yanwei Liu, Jinxia Liu, Pengwei Zhan, Liming Wang, Zhen Xu:
SP Attack: Single-Perspective Attack for Generating Adversarial Omnidirectional Images. 2475-2479 - Yachun Li, Ying Lian, Jingjing Wang, Yuhui Chen, Chunmao Wang, Shiliang Pu:
Few-Shot One-Class Domain Adaptation Based On Frequency For Iris Presentation Attack Detection. 2480-2484 - Margarita Geleta, Cristina Punti, Kevin McGuinness, Jordi Pons, Cristian Canton, Xavier Giró-i-Nieto:
Pixinwav: Residual Steganography for Hiding Pixels in Audio. 2485-2489 - Yurui Xie, Ling Guan:
A Semi-Handcrafted Keypoint Detector with Discriminative Feature Encoding. 2490-2494 - Antonio Agudo:
Safari from Visual Signals: Recovering Volumetric 3d Shapes. 2495-2499 - Farshad G. Veshki, Sergiy A. Vorobyov:
Coupled Feature Learning Via Structured Convolutional Sparse Coding for Multimodal Image Fusion. 2500-2504 - Rongtao Xu, Changwei Wang, Bin Fan, Yuyang Zhang, Shibiao Xu, Weiliang Meng, Xiaopeng Zhang:
DOMAINDESC: Learning Local Descriptors With Domain Adaptation. 2505-2509 - Arya Aftab, Alireza Morsali, Shahrokh Ghaemmaghami:
Multi-Head Relu Implicit Neural Representation Networks. 2510-2514 - ZhaoJing Zhou, Yun Zhou, Zhuqing Jiang, Aidong Men, Haiying Wang:
An Efficient Method for Model Pruning Using Knowledge Distillation with Few Samples. 2515-2519 - Guangyu Ren, Tianhong Dai, Tania Stathaki:
Adaptive Intra-Group Aggregation for Co-Saliency Detection. 2520-2524 - Tanmoy Mukherjee, Nikos Deligiannis:
Novel Class Discovery: A Dependency Approach. 2525-2528 - Yanfeng Liu, Qiang Li, Yuan Yuan, Qi Wang:
Single-Shot Balanced Detector for Geospatial Object Detection. 2529-2533 - Ruixin Shi, Junzheng Zhang, Yong Li, Shiming Ge:
Regularized Latent Space Exploration for Discriminative Face Super-Resolution. 2534-2538 - Yi Hou, Chengyang Li, Yuheng Lu, Liping Zhu, Yuan Li, Huizhu Jia, Xiaodong Xie:
Enhancing and Dissecting Crowd Counting by Synthetic Data. 2539-2543 - Chenghu Du, Feng Yu, Minghua Jiang, Xiong Wei, Tao Peng, Xinrong Hu:
Multi-Pose Virtual Try-On Via Self-Adaptive Feature Filtering. 2544-2548 - Jie Zhang, Yi Xiao, Guo Chen, Qingping Sun, Fangqiang Xu, Chi-Sing Leung:
Histogram-Guided Semantic-Aware Colorization. 2549-2553 - Green Rosh K. S, Nikhil Krishnan, B. H. Pawan Prasad, Sachin Deepak Lomte:
Content Preserving Scale Space Network for Fast Image Restoration from Noisy-Blurry Pairs. 2554-2558 - Rong Bao, Yurui Ren, Ge Li, Wei Gao, Shan Liu:
Flow-Based Point Cloud Completion Network with Adversarial Refinement. 2559-2563 - Zezeng Li, Weimin Wang, Na Lei, Rui Wang:
Weakly Supervised Point Cloud Upsampling via Optimal Transport. 2564-2568 - Ryosuke Watanabe, Keisuke Nonaka, Haruhisa Kato, Eduardo Pavez, Tatsuya Kobayashi, Antonio Ortega:
Point Cloud Denoising Using Normal Vector-Based Graph Wavelet Shrinkage. 2569-2573 - Anique Akhtar, Zhu Li, Geert Van Der Auwera, Jianle Chen:
Dynamic Point Cloud Interpolation. 2574-2578 - Shashank N. Sridhara, Eduardo Pavez, Antonio Ortega, Ryosuke Watanabe, Keisuke Nonaka:
Point Cloud Attribute Compression Via Chroma Subsampling. 2579-2583 - Lili Zhao, Xuhu Lin, Wenyi Wang, Kai-Kuang Ma, Jianwen Chen:
Rangeinet: Fast Lidar Point Cloud Temporal Interpolation. 2584-2588 - Lianlei Shan, Weiqiang Wang:
MBNet: A Multi-Resolution Branch Network for Semantic Segmentation Of Ultra-High Resolution Images. 2589-2593 - Yuxuan Zhang, Wei Yang:
BSOLO: Boundary-Aware One-Stage Instance Segmentation SOLO. 2594-2598 - Shaoping Jiang, Xiangmin Xu, Fang Liu, Xiaofen Xing, Lin Wang:
CS-GResNet: A Simple and Highly Efficient Network for Facial Expression Recognition. 2599-2603 - Bingxu Lu, Qinghua Hu, Yu Wang, Guosheng Hu:
RCANet: Row-Column Attention Network for Semantic Segmentation. 2604-2608 - Zhaozhi Xie, Hongtao Lu:
Exploring Category Consistency for Weakly Supervised Semantic Segmentation. 2609-2613 - Hyeonbin Hwang, Soyeon Kim, Wei-Jin Park, Jiho Seo, Kyungtae Ko, Hyeon Yeo:
Vision Transformer Equipped With Neural Resizer On Facial Expression Recognition Task. 2614-2618 - Kaining Ying, Zhenhua Wang, Cong Bai, Pengfei Zhou:
ISDA: Position-Aware Instance Segmentation with Deformable Attention. 2619-2623 - Zhenfei Zhang, Ming-Ching Chang, Tien D. Bui:
Improving Class Activation Map for Weakly Supervised Object Localization. 2624-2628 - Ruizhe Chen, Zhenqi Fu, Yue Huang, En Cheng, Xinghao Ding:
A Robust Object Segmentation Network for UnderWater Scenes. 2629-2633 - Leiping Jie, Hui Zhang:
A Fast and Efficient Network for Single Image Shadow Detection. 2634-2638 - Arvi Jonnarth, Michael Felsberg:
Importance Sampling Cams For Weakly-Supervised Segmentation. 2639-2643 - Qingfeng Liu, Hai Su, Mostafa El-Khamy, Kee-Bong Song:
DeepGBASS: Deep Guided Boundary-Aware Semantic Segmentation. 2644-2648 - Talha Hanif Butt, Murtaza Taj:
Camera Calibration Through Camera Projection Loss. 2649-2653 - Christopher Walker, Yuxing Wang, Yawen Lu, Guoyu Lu:
Inferring Camera Intrinsics Based on Surfaces of Revolution: A Single Image Geometric Network Approach for Camera Calibration. 2654-2658 - Sibo Zhang, Jiahong Yuan, Miao Liao, Liangjun Zhang:
Text2video: Text-Driven Talking-Head Video Synthesis with Personalized Phoneme - Pose Dictionary. 2659-2663 - Mohamed Afham, Udith Haputhanthri, Jathurshan Pradeepkumar, Mithunjha Anandakumar, Ashwin De Silva, Chamira U. S. Edussooriya:
Towards Accurate Cross-Domain in-Bed Human Pose Estimation. 2664-2668 - Yu Sun, Tianyu Huang, Qian Bao, Wu Liu, Wenpeng Gao, Yili Fu:
Learning Monocular Mesh Recovery of Multiple Body Parts Via Synthesis. 2669-2673 - Xiyang Liu, Peng Li, Ding Ni, Yan Wang, Hui Xue:
LightPose: A Lightweight and Efficient Model with Transformer for Human Pose Estimation. 2674-2678 - Qier An, Yuan Shen:
On The Observability in Visual Slam Networks. 2679-2683 - Yuxiao Li, Santiago Mazuelas, Yuan Shen:
Variational Bayesian Framework for Advanced Image Generation with Domain-Related Variables. 2684-2688 - Marina Gardella, Tina Nikoukhah, Yanhao Li, Quentin Bammey:
The Impact of JPEG Compression on Prior Image Noise. 2689-2693 - Tin Lay Nwe, Ramanpreet Singh Pahwa, Richard Chang, Oo Zaw Min, Jie Wang, Yiqun Li, Dongyun Lin, Shitala Prasad, Sheng Dong:
On the Use of Component Structural Characteristics for Voxel Segmentation in Semicon 3D Images. 2694-2698 - Zihan Zhang, Thierry Blu:
Blind Source Separation via a Weak Exclusion Principle. 2699-2703 - Yuqi Zhang, Qi Qian, Chong Liu, Weihua Chen, Fan Wang, Hao Li, Rong Jin:
Graph Convolution for Re-Ranking in Person Re-Identification. 2704-2708 - Jing Yang, Canlong Zhang, Zhixin Li, Yanping Tang:
Multi-Level Relation Aware Network for Person Re-Identification. 2709-2713 - Zhaopeng Dou, Zhongdao Wang, Yali Li, Shengjin Wang:
Progressive-Granularity Retrieval Via Hierarchical Feature Alignment for Person Re-Identification. 2714-2718 - Minjung Kim, MyeongAh Cho, Heansung Lee, Suhwan Cho, Sangyoun Lee:
Occluded Person Re-Identification Via Relational Adaptive Feature Correction Learning. 2719-2723 - Shiping Li, Min Cao, Min Zhang:
Learning Semantic-Aligned Feature Representation for Text-Based Person Search. 2724-2728 - Xuezhi Xiang, Ning Lv, Yulong Qiao:
Transformer-Based Person Search Model with Symmetric Online Instance Matching. 2729-2733 - Qingye Zhao, Xin Chen, Zhuoyu Zhao, Enyi Tang, Xuandong Li:
Wassertrain: An Adversarial Training Framework Against Wasserstein Adversarial Attacks. 2734-2738 - Siao Liu, Zhaoyu Chen, Wei Li, Jiwei Zhu, Jiafeng Wang, Wenqiang Zhang, Zhongxue Gan:
Efficient Universal Shuffle Attack for Visual Object Tracking. 2739-2743 - Riran Cheng, Nan Sang, Yinyuan Zhou, Xupeng Wang:
Non-Rigid Transformation Based Adversarial Attack Against 3d Object Tracking. 2744-2748 - Zhengyi Wang, Xupeng Wang, Ferdous Sohel, Mohammed Bennamoun, Yong Liao, Jiali Yu:
Adversary Distillation for One-Shot Attacks on 3D Target Tracking. 2749-2753 - Yin Yin Low, Angeline Tanvy, Raphaël C.-W. Phan, Xiaojun Chang:
AdverFacial: Privacy-Preserving Universal Adversarial Perturbation Against Facial Micro-Expression Leakages. 2754-2758 - Suryabhan Singh Hada, Miguel Á. Carreira-Perpiñán:
Interpretable Image Classification Using Sparse Oblique Decision Trees. 2759-2763 - Zhenqi Fu, Xiaopeng Lin, Wu Wang, Yue Huang, Xinghao Ding:
Underwater Image Enhancement Via Learning Water Type Desensitized Representations. 2764-2768 - Ziyin Ma, Changjae Oh:
A Wavelet-Based Dual-Stream Network for Underwater Image Enhancement. 2769-2773 - Shu Chai, Zhenqi Fu, Yue Huang, Xiaotong Tu, Xinghao Ding:
Unsupervised and Untrained Underwater Image Restoration Based on Physical Image Formation Model. 2774-2778 - Zhenlong Wang, Weifeng Liu, Yanjiang Wang, Baodi Liu:
Agcyclegan: Attention-Guided Cyclegan for Single Underwater Image Restoration. 2779-2783 - Shuhan Qi, Jianjun Du, Mingyan Wu, Hong Yi, Linlin Tang, Tao Qian, Xuan Wang:
Underwater Small Target Detection Based on Deformable Convolutional Pyramid. 2784-2788 - Kaixin Chen, Lin Zhang, Ying Shen, Yicong Zhou:
Towards Controllable and Physical Interpretable Underwater Scene Simulation. 2789-2793 - Yongshan Zhang, Xinxin Wang, Zhenyu Wang, Xinwei Jiang, Yicong Zhou:
Graph Learning Based Autoencoder for Hyperspectral Band Selection. 2794-2798 - Fengchao Xiong, Minchao Ye, Jun Zhou, Jianfeng Lu, Yuntao Qian:
Multitask Sparse Neural Network for Hyperspectral Image Denoising. 2799-2803 - Chen Xiaoyue, Xianghai Cao:
Hyperspectral Image Classification Based on Co-Learning Through Dual-Architecture Ensemble. 2804-2808 - Zhuanfeng Li, Fengchao Xiong, Jianfeng Lu, Jun Zhou, Yuntao Qian:
Material-Guided Siamese Fusion Network for Hyperspectral Object Tracking. 2809-2813 - Xiuheng Wang, Jie Chen, Cédric Richard:
Hyperspectral Image Super-Resolution with Deep Priors and Degradation Model Inversion. 2814-2818 - Na Liu, Wei Li, Ran Tao:
Geometric Low-Rank Tensor Approximation for Remotely Sensed Hyperspectral And Multispectral Imagery Fusion. 2819-2823 - Haoyue Tian, Pan Gao, Ran Wei, Manoranjan Paul:
Dilated Convolutional Neural Network-Based Deep Reference Picture Generation for Video Compression. 2824-2828 - Yanghao Li, Xinyao Chen, Jisheng Li, Jiangtao Wen, Yuxing Han, Shan Liu, Xiaozhong Xu:
Rate Control for Learned Video Compression. 2829-2833 - Xuekai Wei, Mingliang Zhou, Weijia Jia:
Global Optimization Solution for Dynamic Adaptive 360-Degree Streaming. 2834-2838 - Juliano S. Assine, José Cândido Silveira Santos Filho, Eduardo Valle:
Collaborative Object Detectors Adaptive to Bandwidth and Computation. 2839-2843 - Mu Li, Baojiang Zhong, Kai-Kuang Ma:
MA-NET: Multi-Scale Attention-Aware Network for Optical Flow Estimation. 2844-2848 - Yizhuo Li, Cewu Lu:
Modeling Human Memory in Multi-Object Tracking with Transformers. 2849-2853 - Chang-Sheng Lin, Chia-Yi Hsu, Pin-Yu Chen, Chia-Mu Yu:
Real-World Adversarial Examples Via Makeup. 2854-2858 - Joseph Clements, Yingjie Lao:
In Pursuit of Preserving the Fidelity of Adversarial Images. 2859-2863 - Meiling Li, Nan Zhong, Xinpeng Zhang, Zhenxing Qian, Sheng Li:
Object-Oriented Backdoor Attack Against Image Captioning. 2864-2868 - Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich:
Towards Robust Speech-to-Text Adversarial Attack. 2869-2873 - Yixiao Xu, Xiaolei Liu, Mingyong Yin, Teng Hu, Kangyi Ding:
Sparse Adversarial Attack For Video Via Gradient-Based Keyframe Selection. 2874-2878 - Hui Zeng, Kang Deng, Biwei Chen, Anjie Peng:
How Secure Are The Adversarial Examples Themselves? 2879-2883 - Xiaohui Zhao, Yang Yu, Rongrong Ni, Yao Zhao:
Exploring Complementarity of Global and Local Spatiotemporal Information for Fake Face Video Detection. 2884-2888 - Edoardo Daniele Cannas, János Horváth, Sriram Baireddy, Paolo Bestagini, Edward J. Delp, Stefano Tubaro:
Panchromatic Imagery Copy-Paste Localization Through Data-Driven Sensor Attribution. 2889-2893 - Lv Chen, Dengpan Ye, Yueyun Shang, Jiaqing Huang:
Robust Video Hashing Based on Local Fluctuation Preserving for Tracking Deep Fake Videos. 2894-2898 - Ping Wang, Kunlin Liu, Wenbo Zhou, Hang Zhou, Honggu Liu, Weiming Zhang, Nenghai Yu:
ADT: Anti-Deepfake Transformer. 2899-2903 - Hui Guo, Shu Hu, Xin Wang, Ming-Ching Chang, Siwei Lyu:
Eyes Tell All: Irregular Pupil Shapes Reveal GAN-Generated Faces. 2904-2908 - Antonio Theophilo, Rafael Padilha, Fernanda A. Andaló, Anderson Rocha:
Explainable Artificial Intelligence for Authorship Attribution on Social Media. 2909-2913 - Guiping Zhu, Mingzhu Ma, Yuwen Huang, Kuikui Wang, Gongping Yang:
Dual-Domain Low-Rank Fusion Deep Metric Learning for Off-the-Person ECG Biometrics. 2914-2918 - Kanghao Zhang, Shan Liang, Shuai Nie, Shulin He, Jiahui Pan, Xueliang Zhang, Haoxin Ma, Jiangyan Yi:
A Robust Deep Audio Splicing Detection Method via Singularity Detection Feature. 2919-2923 - Kuikui Wang, Gongping Yang, Yuwen Huang, Lu Yang, Yilong Yin:
Online Ecg Biometrics Via Hadamard Code. 2924-2928 - Ziyue Xiang, Paolo Bestagini, Stefano Tubaro, Edward J. Delp:
Forensic Analysis and Localization of Multiply Compressed MP3 Audio Using Transformers. 2929-2933 - Chong Liu, Yuqi Zhang, Weihua Chen, Fan Wang, Hao Li, Yi-Dong Shen:
Adaptive Matching Strategy for Multi-Target Multi-Camera Tracking. 2934-2938 - Hanye Huang, Youjun Xiang, Guodong Yang, Lingling Lv, Xianfeng Li, Zichun Weng, Yuli Fu:
Generalized Face Anti-Spoofing via Cross-Adversarial Disentanglement with Mixing Augmentation. 2939-2943 - Taoshan Zhang, Youjun Xiang, Xianfeng Li, Zichun Weng, Zhen Chen, Yuli Fu:
Free Lunch for Cross-Domain Occluded Face Recognition without Source Data. 2944-2948 - Zijun Zhuang, Hongtao Lu:
Coneface: Approximate Pairwise Loss for Face Recognition. 2949-2953 - Jie Jiang, Yunlian Sun:
Depth-Based Ensemble Learning Network For Face Anti-Spoofing. 2954-2958 - Eklavya Sarkar, Pavel Korshunov, Laurent Colbois, Sébastien Marcel:
Are GAN-based morphs threatening face recognition? 2959-2963 - Yulu Jin, Lifeng Lai:
Privacy Protection In Learning Fair Representations. 2964-2968 - Le Feng, Sheng Li, Zhenxing Qian, Xinpeng Zhang:
Stealthy Backdoor Attack with Adversarial Training. 2969-2973 - Dan Zhao, Hong Chen, Suyun Zhao, Ruixuan Liu, Cuiping Li, Xiaoying Zhang:
Fldp: Flexible Strategy For Local Differential Privacy. 2974-2978 - Mohammad Amin Zarrabian, Ni Ding, Parastoo Sadeghi, Thierry Rakotoarivelo:
Enhancing Utility In The Watchdog Privacy Mechanism. 2979-2983 - Michele Cirillo, Mario Di Mauro, Vincenzo Matta, Giuseppe Basileo:
Cyber-Threat Propagation over Network-Slicing Architectures. 2984-2988 - Ecenaz Erdemir, Pier Luigi Dragotti, Deniz Gündüz:
Privacy-Aware Communication over a Wiretap Channel with Generative Networks. 2989-2993 - Ran Shi, Jian Xiong, Tong Qiao:
Encrypted Image Visual Security Index via Non-Local Recognizable Degree Evaluation. 2994-2998 - Lu Miao, Wei Yang, Rong Hu, Lu Li, Liusheng Huang:
Against Backdoor Attacks In Federated Learning With Differential Privacy. 2999-3003 - Xinying Liao, Jiaye Xue, Shengxing Yu, Ximeng Liu, Jiangang Shu:
SecMPNN: 3-Party Privacy-Preserving Molecular Structure Properties Inference. 3004-3008 - Behrooz Razeghi, Shideh Rezaeifar, Sohrab Ferdowsi, Taras Holotyak, Slava Voloshynovskiy:
Compressed Data Sharing Based On Information Bottleneck Model. 3009-3013 - Thibault Maho, Teddy Furon, Erwan Le Merrer:
Randomized Smoothing Under Attack: How Good is it in Practice? 3014-3018 - Chau Yi Li, Andrea Cavallaro:
Training Privacy-Preserving Video Analytics Pipelines by Suppressing Features That Reveal Information About Private Attributes. 3019-3023 - Yulong Wang, Xingshu Chen, Qixu Wang, Run Yang, Bangzhou Xin:
Unsupervised Anomaly Detection for Container Cloud Via BILSTM-Based Variational Auto-Encoder. 3024-3028 - Fusen Wang, Jun Sang, Chunlin Huang, Bin Cai, Hong Xiang, Nong Sang:
Applying Deep Learning to Known-Plaintext Attack on Chaotic Image Encryption Schemes. 3029-3033 - Jiahong Xie, Haibo Cheng, Rong Zhu, Ping Wang, Kaitai Liang:
WordMarkov: A New Password Probability Model of Semantics. 3034-3038 - Cong Li, Qingni Shen, Zhikang Xie, Jisheng Dong, Yuejian Fang, Zhonghai Wu:
Efficient Identity-Based Chameleon Hash for Mobile Devices. 3039-3043 - Xiaoxi He, Haibo Cheng, Jiahong Xie, Ping Wang, Kaitai Liang:
Passtrans: An Improved Password Reuse Model Based on Transformer. 3044-3048 - Fang-Qi Li, Shi-Lin Wang, Yun Zhu:
Fostering The Robustness Of White-Box Deep Neural Network Watermarks By Neuron Alignment. 3049-3053 - Pierre Fernandez, Alexandre Sablayrolles, Teddy Furon, Hervé Jégou, Matthijs Douze:
Watermarking Images in Self-Supervised Latent Spaces. 3054-3058 - Haozhe Chen, Weiming Zhang, Kunlin Liu, Kejiang Chen, Han Fang, Nenghai Yu:
Speech Pattern Based Black-Box Model Watermarking for Automatic Speech Recognition. 3059-3063 - Guobiao Li, Sheng Li, Zhenxing Qian, Xinpeng Zhang:
Encryption Resistant Deep Neural Network Watermarking. 3064-3068 - Yongbaek Cho, Changhoon Kim, Yezhou Yang, Yi Ren:
Attributable Watermarking of Speech Generative Models. 3069-3073 - Biao Yi, Hanzhou Wu, Guorui Feng, Xinpeng Zhang:
Exploiting Language Model For Efficient Linguistic Steganalysis. 3074-3078 - Chuan Qin, Na Zhao, Weiming Zhang, Nenghai Yu:
Patch Steganalysis: A Sampling Based Defense Against Adversarial Steganography. 3079-3083 - Jinliu Feng, Yaofei Wang, Kejiang Chen, Weiming Zhang, Nenghai Yu:
An Effective Steganalysis for Robust Steganography with Repetitive JPEG Compression. 3084-3088 - Ge Luo, Ping Wei, Shuwen Zhu, Xinpeng Zhang, Zhenxing Qian, Sheng Li:
Image Steganalysis with Convolutional Vision Transformer. 3089-3093 - Paul-Gauthier Noé, Andreas Nautsch, Driss Matrouf, Pierre-Michel Bousquet, Jean-François Bonastre:
A Bridge between Features and Evidence for Binary Attribute-Driven Perfect Privacy. 3094-3098 - Yi Xu, Chong Xiao Wang, Yang Song, Wee Peng Tay:
Preserving Trajectory Privacy in Driving Data Release. 3099-3103 - Joseph T. Colonel, Christian J. Steinmetz, Marcus Michelen, Joshua D. Reiss:
Direct Design of Biquad Filter Cascades with Deep Learning by Sampling Random Polynomials. 3104-3108 - Tom Gajecki, Waldo Nogueira:
An End-to-End Deep Learning Speech Coding and Denoising Strategy for Cochlear Implants. 3109-3113 - Jun Qi, Javier Tejedor:
Exploiting Hybrid Models of Tensor-Train Networks For Spoken Command Recognition. 3114-3118 - Gaëtan Frusque, Olga Fink:
Learnable Wavelet Packet Transform for Data-Adapted Spectrograms. 3119-3123 - Nikhil Kandpal, Oriol Nieto, Zeyu Jin:
Music Enhancement via Image Translation and Vocoding. 3124-3128 - Rui Lu, Baigong Zheng, Jiarui Hai, Fei Tao, Zhiyao Duan, Ji Liu:
Progressive Teacher-Student Training Framework for Music Tagging. 3129-3133 - Kuikui Wang, Gongping Yang, Yuwen Huang, Lu Yang, Yilong Yin:
Joint Dual-Domain Matrix Factorization for ECG Biometric Recognition. 3134-3138 - Kuan-Chuan Peng:
Iterative Self Knowledge Distillation - from Pothole Classification to Fine-Grained and Covid Recognition. 3139-3143 - Mengzhu Wang, Shan An, Xiao Luo, Xiong Peng, Wei Yu, Junyang Chen, Zhigang Luo:
Attention-based Adversarial Partial Domain Adaptation. 3144-3148 - Qi Xiao, Hebi Li, Jin Tian, Zhengdao Wang:
Group-Wise Feature Selection for Supervised Learning. 3149-3153 - Junhua Liao, Haihan Duan, Wanbin Zhao, Yanbing Yang, Liangyin Chen:
A Light Weight Model for Video Shot Occlusion Detection. 3154-3158 - Zhen Xiang, David J. Miller, Siheng Chen, Xi Li, George Kesidis:
Detecting Backdoor Attacks against Point Cloud Classifiers. 3159-3163 - Haibin Wu, Bo Zheng, Xu Li, Xixin Wu, Hung-Yi Lee, Helen Meng:
Characterizing the Adversarial Vulnerability of Speech self-Supervised Learning. 3164-3168 - Joel Shor, Aren Jansen, Wei Han, Daniel S. Park, Yu Zhang:
Universal Paralinguistic Speech Representations Using self-Supervised Conformers. 3169-3173 - Qiu-Shi Zhu, Jie Zhang, Zi-qiang Zhang, Ming-Hui Wu, Xin Fang, Li-Rong Dai:
A Noise-Robust Self-Supervised Pre-Training Model Based Speech Representation Learning for Automatic Speech Recognition. 3174-3178 - Samuel Kessler, Bethan Thomas, Salah Karout:
An Adapter Based Pre-Training for Efficient and Scalable Self-Supervised Speech Representation Learning. 3179-3183 - Qiqi Wang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao:
DRVC: A Framework of Any-to-Any Voice Conversion with Self-Supervised Learning. 3184-3188 - Santiago Cuervo, Maciej Grabias, Jan Chorowski, Grzegorz Ciesielski, Adrian Lancucki, Pawel Rychlikowski, Ricard Marxer:
Contrastive Prediction Strategies for Unsupervised Segmentation and Categorization of Phonemes and Words. 3189-3193 - Itzik Klein, Guy Revach, Nir Shlezinger, Jonas E. Mehr, Ruud J. G. van Sloun, Yonina C. Eldar:
Uncertainty in Data-Driven Kalman Filtering for Partially Known State-Space Models. 3194-3198 - Jingzi Gu, Dayan Wu, Peng Fu, Bo Li, Weiping Wang:
Deep Piecewise Hashing for Efficient Hamming Space Retrieval. 3199-3203 - Arnaud Deleruyelle, John Klein, Cristian Versari:
SODA: Self-Organizing Data Augmentation in Deep Neural Networks Application to Biomedical Image Segmentation Tasks. 3204-3208 - Alexander Richard, Peter Sheridan Dodds, Vamsi Krishna Ithapu:
Deep Impulse Responses: Estimating and Parameterizing Filters with Deep Networks. 3209-3213 - Zhipeng He, Yongshi Zhong, Jiahui Pan:
Joint Temporal Convolutional Networks and Adversarial Discriminative Domain Adaptation for EEG-Based Cross-Subject Emotion Recognition. 3214-3218 - Lusine Abrahamyan, Anh Minh Truong, Wilfried Philips, Nikos Deligiannis:
Gradient Variance Loss for Structure-Enhanced Image Super-Resolution. 3219-3223 - Shaoyu Zhang, Chen Chen, Xiujuan Zhang, Silong Peng:
Label-Occurrence-Balanced Mixup for Long-Tailed Recognition. 3224-3228 - Chuanfei Hu, Weijie Sheng, Bo Dong, Xinde Li:
TNTC: Two-Stream Network with Transformer-Based Complementarity for Gait-Based Emotion Recognition. 3229-3233 - Yuan Zhang, Jian Cao, Ling Zhang, Xiangcheng Liu, Zhiyi Wang, Feng Ling, Weiqian Chen:
A Free Lunch from ViT: Adaptive Attention Multi-Scale Fusion Transformer for Fine-Grained Visual Recognition. 3234-3238 - Hyungtae Lee, Heesung Kwon:
Self-Supervised Contrastive Learning for Cross-Domain Hyperspectral Image Representation. 3239-3243 - Mingye Xie, Ting Liu, Yuzhuo Fu:
GOS: A Large-Scale Annotated Outdoor Scene Synthetic Dataset. 3244-3248 - Antoine Tadros, Sébastien Drouyer, Rafael Grompone von Gioi:
Out-Of-Distribution As A Target Class in Semi-Supervised Learning. 3249-3252 - Hadi Hojjati, Narges Armanfard:
Self-Supervised Acoustic Anomaly Detection Via Contrastive Learning. 3253-3257 - Yen Meng, Yi-Hui Chou, Andy T. Liu, Hung-yi Lee:
Don't Speak Too Fast: The Impact of Data Bias on Self-Supervised Speech Models. 3258-3262 - Ibuki Kuroyanagi, Tatsuya Komatsu:
Self-Supervised Learning Method Using Multiple Sampling Strategies for General-Purpose Audio Representation. 3263-3267 - Varun Krishna, Sriram Ganapathy:
Self Supervised Representation Learning with Deep Clustering for Acoustic Unit Discovery from Raw Speech. 3268-3272 - Shu Wang, Yuhuang Hu, Shih-Chii Liu:
T-NGA: Temporal Network Grafting Algorithm for Learning to Process Spiking Audio Sensor Events. 3273-3277 - Xiyao Ma, Zheng Gao, Qian Hu, Mohamed Abdelhady:
Contrastive Knowledge Graph Attention Network for Request-Based Recipe Recommendation. 3278-3282 - Hui Zhu, Xiaofang Zhao:
TargetDrop: A Targeted Regularization Method for Convolutional Neural Networks. 3283-3287 - Xuan Hou, Yunpeng Bai, Haonan Shi, Ying Li:
Coarse-To-Fine Unsupervised Change Detection for Remote Sensing Images Via Object-Based MRF and Inception UNET. 3288-3292 - Mingyuan Fan, Yang Liu, Cen Chen, Shengxing Yu, Wenzhong Guo, Ximeng Liu:
Combating False Sense of Security: Breaking the Defense of Adversarial Training Via Non-Gradient Adversarial Attack. 3293-3297 - Haoli Bai, Hongda Mao, Dinesh Nair:
Dynamically Pruning Segformer for Efficient Semantic Segmentation. 3298-3302 - Sudhir Yarram, Jialian Wu, Pan Ji, Yi Xu, Junsong Yuan:
Deformable VisTR: Spatio Temporal Deformable Attention for Video Instance Segmentation. 3303-3307 - Chao Yang, Xianzhi Wang, Lina Yao, Guodong Long, Jing Jiang, Guandong Xu:
Attentional Gated Res2net for Multivariate Time Series Classification. 3308-3312 - Max Revay, Victor Solo:
Convex Clustering for Autocorrelated Time Series. 3313-3317 - Amil Dravid, Florian Schiffers, Yunan Wu, Oliver Cossairt, Aggelos K. Katsaggelos:
Investigating the Potential of Auxiliary-Classifier Gans for Image Classification in Low Data Regimes. 3318-3322 - Kunlei Jing, Xinman Zhang, Zhiyuan Yang, Bihan Wen:
Feature Augmentation Learning for Few-Shot Palmprint Image Recognition With Unconstrained Acquisition. 3323-3327 - Qiankun Tang, Xiaogang Xu, Jun Wang:
Prime Knowledge with Local Pattern Consistency for Knowledge Distillation. 3328-3332 - Xi Li, Zhen Xiang, David J. Miller, George Kesidis:
Test-Time Detection of Backdoor Triggers for Poisoned Deep Neural Networks. 3333-3337 - Haonan Huang, Yihao Luo, Guoxu Zhou, Qibin Zhao:
Multi-View Data Representation Via Deep Autoencoder-Like Nonnegative Matrix Factorization. 3338-3342 - Bariscan Bozkurt, Alper T. Erdogan:
On Identifiable Polytope Characterization for Polytopic Matrix Factorization. 3343-3347 - Quoc-Tung Le, Léon Zheng, Elisa Riccietti, Rémi Gribonval:
Fast Learning of Fast Transforms, with Guarantees. 3348-3352 - Hao Sun, Junting Chen:
Regression Assisted Matrix Completion for Reconstructing a Propagation Field with Application to Source Localization. 3353-3357 - Abhishek Sharma, Maks Ovsjanikov:
Matrix Decomposition on Graphs: A Simplified Functional View. 3358-3362 - Satish Mulleti, Haiyang Zhang, Yonina C. Eldar:
Learning to Sample for Sparse Signals. 3363-3367 - Alexander Lin, Andrew H. Song, Demba E. Ba:
Mixture Model Auto-Encoders: Deep Clustering Through Dictionary Learning. 3368-3372 - Yerlan Idelbayev, Miguel Á. Carreira-Perpiñán:
Exploring the Effect of ℓ0/ℓ2 Regularization in Neural Network Pruning using the LC Toolkit. 3373-3377 - Paul Irofti, Cristian Rusu, Andrei Patrascu:
Dictionary Learning with Uniform Sparse Representations for Anomaly Detection. 3378-3382 - Ruixian Liu, Michael J. Bianco, Peter Gerstoft, Bhaskar D. Rao:
Data-Driven Spatially Dependent PDE Identification. 3383-3387 - Shusen Liu, Rushil Anirudh, Jayaraman J. Thiagarajan, Peer-Timo Bremer:
Sparsity Improves Unsupervised Attribute Discovery in Stylegan. 3388-3392 - Sanghyun Yoo, Ohyun Kwon, Hoshik Lee:
Image-to-Graph Transformers for Chemical Structure Recognition. 3393-3397 - S. H. Shabbeer Basha, Sheethal N. Gowda, Dakala Jayachandra:
A Simple Hybrid Filter Pruning for Efficient Edge Inference. 3398-3402 - Bilel Kanoun, Mohamed Abderrazak Cherif, Isabelle Manighetti, Yuliya Tarabalka, Josiane Zerubia:
An Enhanced Deep Learning Approach for Tectonic Fault and Fracture Extraction in Very High Resolution Optical Images. 3403-3407 - Jiwon Kim, Youngjo Min, Mira Kim, Seungryong Kim:
Joint Learning of Feature Extraction and Cost Aggregation for Semantic Correspondence. 3408-3412 - Junhan Kim, Byonghyo Shim:
Generalized Zero-Shot Learning Using Conditional Wasserstein Autoencoder. 3413-3417 - Yiyang Shen, Yidan Feng, Weiming Wang, Dong Liang, Jing Qin, Haoran Xie, Mingqiang Wei:
MBA-RainGAN: A Multi-Branch Attention Generative Adversarial Network for Mixture of Rain Removal. 3418-3422 - David Peter, Wolfgang Roth, Franz Pernkopf:
End-to-End Keyword Spotting Using Neural Architecture Search and Quantization. 3423-3427 - Zefang Yu, Yangcheng Li, Yicheng Liu, Ting Liu, Yuzhuo Fu:
Synpose: A Large-Scale and Densely Annotated Synthetic Dataset for Human Pose Estimation in Classroom. 3428-3432 - Chunyu Wang, Peixian Gong, Lihua Zhang:
Stpointgcn: Spatial Temporal Graph Convolutional Network for Multiple People Recognition Using Millimeter-Wave Radar. 3433-3437 - Hanhui Li, Xinggan Peng, Huiping Zhuang, Zhiping Lin:
Multiple Temporal Context Embedding Networks for Unsupervised Time Series Anomaly Detection. 3438-3442 - Ramit Sawhney, Atula Tejaswi Neerkaje:
Intermix: An Interference-Based Data Augmentation and Regularization Technique for Automatic Deep Sound Classification. 3443-3447 - Weibo Zhang, Fuqing Zhu, Jizhong Han, Tao Guo, Songlin Hu:
Cross-Layer Aggregation with Transformers for Multi-Label Image Classification. 3448-3452 - Prarthana Bhattacharyya, Chenge Li, Xiaonan Zhao, István Fehérvári, Jason Sun:
Visual Representation Learning with Self-Supervised Attention for Low-Label High-Data Regime. 3453-3457 - Guang Li, Ren Togo, Takahiro Ogawa, Miki Haseyama:
TriBYOL: Triplet BYOL for Self-Supervised Representation Learning. 3458-3462 - Chun-Hsiao Yeh, Cheng-Yao Hong, Yen-Chi Hsu, Tyng-Luh Liu:
SAGA: Self-Augmentation with Guided Attention for Representation Learning. 3463-3467 - Chuanfei Hu, Yongxiong Wang:
An Anomaly Detection Method Based on Self-Supervised Learning with Soft Label Assignment for Defect Visual Inspection. 3468-3472 - Bert de Vries, Iris A. M. Huijben, René D. Kok, Ruud J. G. van Sloun, Rik Vullings:
Contrastive Predictive Coding for Anomaly Detection of Fetal Health from the Cardiotocogram. 3473-3477 - Hui Tang, Xun Liang, Yuhui Guo, Xiangping Zheng, Bo Wu:
Graph Fine-Grained Contrastive Representation Learning. 3478-3482 - Zhen Yu, Yifeng Xiong, Kun He, Shao Huang, Yaodong Zhao, Jie Gu:
Position-Invariant Adversarial Attacks on Neural Modulation Recognition. 3483-3487 - Haziq Razali, Yiannis Demiris:
Using a Single Input to Forecast Human Action Keystates in Everyday Pick and Place Actions. 3488-3492 - Alessandro Cappelli, Ruben Ohana, Julien Launay, Laurent Meunier, Iacopo Poli, Florent Krzakala:
Adversarial Robustness by Design Through Analog Computing And Synthetic Gradients. 3493-3497 - Vincent Roulet, Zaïd Harchaoui:
Differentiable Programming A La Moreau. 3498-3502 - Hongyan Xu, Xiu Su, Shan You, Tao Huang, Fei Wang, Chen Qian, Changshui Zhang, Chang Xu, Dadong Wang, Arcot Sowmya:
Data Agnostic Filter Gating For Efficient Deep Networks. 3503-3507 - Abu Hasnat Mohammad Rubaiyat, Mohammad Shifat-E.-Rabbi, Yan Zhuang, Shiying Li, Gustavo K. Rohde:
Nearest Subspace Search in The Signed Cumulative Distribution Transform Space For 1d Signal Classification. 3508-3512 - Bowen Zhao, Chen Chen, Xi Xiao, Qi Ju, Shutao Xia:
Energy Alignment for Bias Rectification in Class Incremental Learning. 3513-3517 - Lexing Huang, Senlin Cai, Yihong Zhuang, Changxing Jing, Yue Huang, Xiaotong Tu, Xinghao Ding:
A Two-Stage Contrastive Learning Framework For Imbalanced Aerial Scene Recognition. 3518-3522 - Joshua K. Lee, Yuheng Bu, Prasanna Sattigeri, Rameswar Panda, Gregory W. Wornell, Leonid Karlinsky, Rogério Feris:
A Maximal Correlation Approach to Imposing Fairness in Machine Learning. 3523-3527 - Yan Zhang, Xue Jiang, Siqi Liu, Bo Hu, Xinbo Gao:
Boundary-Aware Bias Loss for Transformer-Based Aerial Image Segmentation Model. 3528-3532 - Yanpeng Zhou, Maosen Wang, Manas Gupta, Arulmurugan Ambikapathi, Ponnuthurai Nagaratnam Suganthan, Savitha Ramasamy:
Investigating Robustness of Biological vs. Backprop Based Learning. 3533-3537 - Abdullah Abdulaziz, Jianxin Zhou, Angela Di Fulvio, Yoann Altmann, Stephen McLaughlin:
Semi-Supervised Gaussian Mixture Variational Autoencoder for Pulse Shape Discrimination. 3538-3542 - Huidong Liang, Junbin Gao:
How Neural Processes Improve Graph Link Prediction. 3543-3547 - Shuyu Lin, Ronald Clark, Niki Trigoni, Stephen J. Roberts:
Uncertainty Estimation with a VAE-Classifier Hybrid Model. 3548-3552 - Milan Aryal, Nasim Yahya Soltani:
Context-Aware Graph-Based Self-Supervised Learning of Whole Slide Images. 3553-3557 - Zaharah Allah Bukhsh:
Contrastive Sensor Transformer for Predictive Maintenance of Industrial Assets. 3558-3562 - Heyan Chai, Weijun Su, Siyu Tang, Ye Ding, Binxing Fang, Qing Liao:
Improving Anomaly Detection with a Self-Supervised Task Based on Generative Adversarial Network. 3563-3567 - Jun Zhan, Siqi Wang, Xiandong Ma, Chengkun Wu, Canqun Yang, Detian Zeng, Shilin Wang:
Stgat-Mad: Spatial-Temporal Graph Attention Network For Multivariate Time Series Anomaly Detection. 3568-3572 - Yuxiang Zhang, Wei Li, Mengmeng Zhang, Ran Tao:
Dual Graph Cross-Domain Few-Shot Learning for Hyperspectral Image Classification. 3573-3577 - Julie Choi:
Personalized Pagerank Graph Attention Networks. 3578-3582 - Muberra Ozmen, Hao Zhang, Pengyun Wang, Mark Coates:
Multi-Relation Message Passing for Multi-Label Text Classification. 3583-3587 - Xiangping Zheng, Xun Liang, Bo Wu, Yuhui Guo, Hui Tang:
Adaptive Attention Graph Capsule Network. 3588-3592 - Lorenzo Giusti, Claudio Battiloro, Paolo Di Lorenzo, Sergio Barbarossa:
Graph Convolutional Networks With Autoencoder-Based Compression And Multi-Layer Graph Learning. 3593-3597 - Julian P. Merkofer, Guy Revach, Nir Shlezinger, Ruud J. G. van Sloun:
Deep Augmented Music Algorithm for Data-Driven Doa Estimation. 3598-3602 - Dianwen Ng, Yunqi Chen, Biao Tian, Qiang Fu, Eng Siong Chng:
Convmixer: Feature Interactive Convolution with Curriculum Learning for Small Footprint and Noisy Far-Field Keyword Spotting. 3603-3607 - Michael J. Bianco, Peter Gerstoft:
Semi-Supervised Source Localization With Residual Physical Learning. 3608-3612 - George Sammit, Zhongjie Wu, Yihao Wang, Zhongdi Wu, Akihito Kamata, Joseph Nese, Eric C. Larson:
Automated Prosody Classification for Oral Reading Fluency with Quadratic Kappa Loss and Attentive X-Vectors. 3613-3617 - Xujiang Zhao, Xuchao Zhang, Wei Cheng, Wenchao Yu, Yuncong Chen, Haifeng Chen, Feng Chen:
Seed: Sound Event Early Detection Via Evidential Uncertainty. 3618-3622 - Inês Nolasco, Dan Stowell:
Rank-Based Loss For Learning Hierarchical Representations. 3623-3627 - Keisuke Ozawa:
On The Relaxation of Orthogonal Tensor Rank and Its Nonconvex Riemannian Optimization for Tensor Completion. 3628-3632 - Wenjin Qin, Hailin Wang, Weijun Ma, Jianjun Wang:
Robust High-Order Tensor Recovery Via Nonconvex Low-Rank Approximation. 3633-3637 - Kriton Konstantinidis, Yao Lei Xu, Qibin Zhao, Danilo P. Mandic:
Variational Bayesian Tensor Networks with Structured Posteriors. 3638-3642 - Soo Min Kwon, Xin Li, Anand D. Sarwate:
Low-Rank Phase Retrieval with Structured Tensor Models. 3643-3647 - Yuchen Sun, Kejun Huang:
HOQRI: Higher-Order QR Iteration for Scalable Tucker Decomposition. 3648-3652 - Sergio Rozada, Antonio G. Marques:
A Multi-Resolution Low-Rank Tensor Decomposition. 3653-3657 - Lillian Zhou, Dhruv Guliani, Andreas Kabel, Giovanni Motta, Françoise Beaufays:
Exploring Heterogeneous Characteristics of Layers in ASR Models for More Efficient Training. 3658-3662 - Ting Zhong, Haoyang Yu, Rongfan Li, Xovee Xu, Xucheng Luo, Fan Zhou:
Probabilistic Fine-Grained Urban Flow Inference with Normalizing Flows. 3663-3667 - Shiliang Chen, Wentao He, Jianfeng Ren, Xudong Jiang:
Attention-Based Dual-Stream Vision Transformer for Radar Gait Recognition. 3668-3672 - Marcio L. Lima de Oliveira, Marco Jan Gerrit Bekooij:
Deep-MLE: Fusion between a Neural Network and MLE for A Single Snapshot DOA Estimation. 3673-3677 - Ha Minh Tan, Duc-Quang Vu, Chung-Ting Lee, Yung-Hui Li, Jia-Ching Wang:
Selective Mutual Learning: An Efficient Approach for Single Channel Speech Separation. 3678-3682 - John B. Harvill, Yash R. Wani, Moitreya Chatterjee, Mustafa Alam, David G. Beiser, David Chestek, Mark Hasegawa-Johnson, Narendra Ahuja:
Detection of Covid-19 from Joint Time and Frequency Analysis of Speech, Breathing and Cough Audio. 3683-3687 - Chen Chen, Yuchen Hu, Nana Hou, Xiaofeng Qi, Heqing Zou, Eng Siong Chng:
Self-Critical Sequence Training for Automatic Speech Recognition. 3688-3692 - Quchen Fu, Zhongwei Teng, Jules White, Maria E. Powell, Douglas C. Schmidt:
FastAudio: A Learnable Audio Front-End For Spoof Speech Detection. 3693-3697 - Yifei Zhao, Yazid Attabi, Benoît Champagne, Wei-Ping Zhu:
Complex IRM-Aware Training for Voice Activity Detection Using Attention Model. 3698-3702 - Jaechang Kim, Yunjoo Lee, Seunghoon Hong, Jungseul Ok:
Learning Continuous Representation of Audio for Arbitrary Scale Super Resolution. 3703-3707 - Shunsuke Hidaka, Kohei Wakamiya, Tokihiko Kaburagi:
An Investigation of the Effectiveness of Phase for Audio Classification. 3708-3712 - Leonardo Pepino, Pablo Riera, Luciana Ferrer:
Study of Positional Encoding Approaches for Audio Spectrogram Transformers. 3713-3717 - Jian Han, Yali Li, Shengjin Wang:
Few-Shot Object Detection with Local Correspondence RPN and Attentive Head. 3718-3722 - Hak Gu Kim, Davide Nanni, Sabine Süsstrunk:
Natural-Looking Adversarial Examples from Freehand Sketches. 3723-3727 - Guodong Shen, Yuqi Ouyang, Victor Sanchez:
Video Anomaly Detection via Prediction Network with Enhanced Spatio-Temporal Memory Exchange. 3728-3732 - Francesca Pistilli, Diego Valsesia, Giulia Fracastoro, Enrico Magli:
Signal Compression via Neural Implicit Representations. 3733-3737 - Yuan Cao, Lei Chen, Danchen Zhang, Leiming Ma, Hongming Shan:
Hybrid Weighting Loss for Precipitation Nowcasting from Radar Images. 3738-3742 - Yidian Sun, Jiwei Zhang, Wendong Wang:
Adversarial Learning Enhancement for 3D Human Pose and Shape Estimation. 3743-3747 - Min Zhang, Siteng Huang, Donglin Wang:
Domain Generalized Few-Shot Image Classification via Meta Regularization Network. 3748-3752 - Junxuan Huang, Junsong Yuan, Chunming Qiao:
Generation for Unsupervised Domain Adaptation: A Gan-Based Approach for Object Classification with 3D Point Cloud Data. 3753-3757 - Xin-Chun Li, Yan-Jia Wang, Le Gan, De-Chuan Zhan:
Exploring Transferability Measures and Domain Selection in Cross-Domain Slot Filling. 3758-3762 - Yuan Wu, Diana Inkpen, Ahmed El-Roby:
Maximum Batch Frobenius Norm for Multi-Domain Text Classification. 3763-3767 - Sudhir Yarram, Ming Yang, Junsong Yuan, Chunming Qiao:
Joint Global-Local Alignment for Domain Adaptive Semantic Segmentation. 3768-3772 - Zhiming Wang, Yantian Luo, Danlan Huang, Ning Ge, Jianhua Lu:
Category-Adaptive Domain Adaptation for Semantic Segmentation. 3773-3777 - Sara Björk, Jonas Nordhaug Myhre, Thomas Haugland Johansen:
Simpler is Better: Spectral Regularization and Up-Sampling Techniques for Variational Autoencoders. 3778-3782 - Yair Schiff, Vijil Chenthamarakshan, Samuel C. Hoffman, Karthikeyan Natesan Ramamurthy, Payel Das:
Augmenting Molecular Deep Generative Models with Topological Data Analysis Representations. 3783-3787 - Arthur Conmy, Subhadip Mukherjee, Carola-Bibiane Schönlieb:
Stylegan-Induced Data-Driven Regularization for Inverse Problems. 3788-3792 - Oyebade K. Oyedotun, Djamila Aouada:
A Closer Look at Autoencoders for Unsupervised Anomaly Detection. 3793-3797 - Sina Alemohammad, Hossein Babaei, C. J. Barberan, Naiming Liu, Lorenzo Luzi, Blake Mason, Richard G. Baraniuk:
NFT-K: Non-Fungible Tangent Kernels. 3798-3802 - Jian Zhang, Runwei Ding, Miaoju Ban, Tianyu Guo:
FDSNeT: An Accurate Real-Time Surface Defect Segmentation Network. 3803-3807 - Paul Moore, Theodor-Mihai Iliant, Filip-Alexandru Ion, Yue Wu, Terry J. Lyons:
Path Signatures for Non-Intrusive Load Monitoring. 3808-3812 - Alexander Hvatov:
Data-Driven Approach for the Floquet Propagator Inverse Problem Solution. 3813-3817 - Chaozheng Guo, Lin Zhang, Ying Shen, Yicong Zhou:
Chunkfusion: A Learning-Based RGB-D 3D Reconstruction Framework Via Chunk-Wise Integration. 3818-3822 - Ishan D. Khurjekar, Joel B. Harley:
Closing the Sim-to-Real Gap in Guided Wave Damage Detection with Adversarial Training of Variational Auto-Encoders. 3823-3827 - Andrea Littardi, Anders Hildeman, Mihalis A. Nicolaou:
Deep Learning on the Sphere for Multi-model Ensembling of Significant Wave Height. 3828-3832 - Wang Lu, Jindong Wang, Yiqiang Chen:
Local and Global Alignments for Generalizable Sensor-Based Human Activity Recognition. 3833-3837 - Wei Zhang, Zhipeng Li, Yiduo Guo, Ao Qiu, Yanjun Li, Yibing Shi:
Study on Time-of-Flight Estimation in Ultrasonic Well Logging Tool: Model-Driven Transfer Learning. 3838-3842 - Peng Yuan, Weijie Chen, Shicai Yang, Yunyi Xuan, Di Xie, Yueting Zhuang, Shiliang Pu:
Simulation-and-Mining: Towards Accurate Source-Free Unsupervised Domain Adaptive Object Detection. 3843-3847 - Zhaoyang Li, Long Zhao, Weijie Chen, Shicai Yang, Di Xie, Shiliang Pu:
Target-Aware Auto-Augmentation for Unsupervised Domain Adaptive Object Detection. 3848-3852 - Xinyi Liu, Tao Dai, Shu-Tao Xia, Yong Jiang:
Self-Ensemble Variance Regularization for Domain Adaptation. 3853-3857 - Junchu Huang, Weijie Chen, Shicai Yang, Di Xie, Shiliang Pu, Yueting Zhuang:
Transductive Clip with Class-Conditional Contrastive Learning. 3858-3862 - Reinmar J. Kobler, Jun-ichiro Hirayama, Motoaki Kawanabe:
Controlling The Fréchet Variance Improves Batch Normalization on the Symmetric Positive Definite Manifold. 3863-3867 - Maryam Abdolali, Nicolas Gillis:
Subspace Clustering Using Unsupervised Data Augmentation. 3868-3872 - Dominik Fay, Jens Sjölund, Tobias J. Oechtering:
Private Learning Via Knowledge Transfer with High-Dimensional Targets. 3873-3877 - Hongming Li, Shujian Yu, José C. Príncipe:
Deep Deterministic Independent Component Analysis for Hyperspectral Unmixing. 3878-3882 - Lorenzo Servadei, Huawei Sun, Julius Ott, Michael Stephan, Souvik Hazra, Thomas Stadelmayer, Daniela Sanchez Lopera, Robert Wille, Avik Santra:
Label-Aware Ranked Loss for Robust People Counting Using Automotive In-Cabin Radar. 3883-3887 - Randall Balestriero, Zichao Wang, Richard G. Baraniuk:
DeepHull: Fast Convex Hull Approximation in High Dimensions. 3888-3892 - Jihai Zhang, Fangquan Lin, Wei Jiang, Cheng Yang, Gaoge Liu:
Neighbor-Augmented Transformer-Based Embedding for Retrieval. 3893-3897 - Georgios Panagiotatos, Nikolaos Passalis, Avraam Tsantekidis, Anastasios Tefas:
Sentiment-Aware Distillation for Bitcoin Trend Forecasting Under Partial Observability. 3898-3902 - Longshaokan Wang, Lingda Wang, Mina Georgieva, Paulo Machado, Abinaya Ulagappa, Safwan Ahmed, Yan Lu, Arjun Bakshi, Farhad Ghassemi:
Robust Nonparametric Distribution Forecast with Backtest-Based Bootstrap and Adaptive Residual Selection. 3903-3907 - Nozomu Onodera, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama:
Variational Bayesian Graph Convolutional Network for Robust Collaborative Filtering. 3908-3912 - Zhishan Zhao, Sen Yang, Guohui Liu, Dawei Feng, Kele Xu:
FINT: Field-Aware Interaction Neural Network for Click-Through Rate Prediction. 3913-3917 - Heike Brock, Randy Gomez:
Making The Unknown More Certain: A Stacked Ensemble Classifier for Open Gesture Recognition with a Social Robot. 3918-3922 - Zheng Wei, Zhengpin Li, Xiaojun Mao, Jian Wang:
Applying Differential Privacy to Tensor Completion. 3923-3927 - Yao Lei Xu, Kriton Konstantinidis, Shengxi Li, Ljubisa Stankovic, Danilo P. Mandic:
Low-Complexity Attention Modelling via Graph Tensor Networks. 3928-3932 - Li-Dan Kuang, Biao Wang, Qiu-Hua Lin, Haopeng Zhang, Jianming Zhang, Wenjun Li, Feng Li, Vince D. Calhoun:
An Accelerated Rank-(L, L, 1, 1) Block Term Decomposition Of Multi-Subject Fmri Data Under Spatial Orthonormality Constraint. 3933-3937 - Bo Wu, Xun Liang, Xiangping Zheng, Yuhui Guo, Hui Tang:
Improving Dynamic Graph Convolutional Network with Fine-Grained Attention Mechanism. 3938-3942 - Boxi Weng, Jian Sun, Alireza Sadeghi, Gang Wang:
AdaPID: An Adaptive PID Optimizer for Training Deep Neural Networks. 3943-3947 - Brian Whiteaker, Peter Gerstoft:
Memory in Echo State Networks and the Controllability Matrix Rank. 3948-3952 - Jun Xia, Cheng Tan, Lirong Wu, Yongjie Xu, Stan Z. Li:
OT Cleaner: Label Correction as Optimal Transport. 3953-3957 - John Chen, Cameron R. Wolfe, Zhao Li, Anastasios Kyrillidis:
Demon: Improved Neural Network Training With Momentum Decay. 3958-3962 - Josen Daniel De Leon, Rowel Atienza:
Depth Pruning with Auxiliary Networks for Tinyml. 3963-3967 - Yunling Zheng, Carson Hu, Guang Lin, Meng Yue, Bao Wang, Jack Xin:
Glassoformer: A Query-Sparse Transformer for Post-Fault Power Grid Voltage Prediction. 3968-3972 - Yuwen Deng, Donghai Guan, Yanyu Chen, Weiwei Yuan, Jiemin Ji, Mingqiang Wei:
Sar-Shipnet: Sar-Ship Detection Neural Network via Bidirectional Coordinate Attention and Multi-Resolution Feature Fusion. 3973-3977 - Mohammadsadegh Shamsabardeh, Bahar Azari, Beatriz Martínez-López:
Spatio-Temporal PRRS Epidemic Forecasting via Factorized Deep Generative Modeling. 3978-3982 - Harshat Kumar, Hojjat Seyed Mousavi, Behrooz Shahsavari:
Fusion-Id: A Photoplethysmography and Motion Sensor Fusion Biometric Authenticator With Few-Shot on-Boarding. 3983-3987 - Zepeng Huo, Taowei Ji, Yifei Liang, Shuai Huang, Zhangyang Wang, Xiaoning Qian, Bobak Mortazavi:
Dynimp: Dynamic Imputation for Wearable Sensing Data through Sensory and Temporal Relatedness. 3988-3992 - Cheryl Sze Yin Wong, Guo Yang, Nancy F. Chen, Savitha Ramasamy:
Incremental Context Aware Attentive Knowledge Tracing. 3993-3997 - Alexander Campbell, Lorena Qendro, Pietro Liò, Cecilia Mascolo:
Robust and Efficient Uncertainty Aware Biosignal Classification via Early Exit Ensembles. 3998-4002 - Xinyu Yuan, Wenhan Wang, Youyong Kong, Jiasong Wu, Guanyu Yang, Huazhong Shu:
Temporal Cross-Graph Network for Brain Functional Activity Prediction. 4003-4007 - Qiang He, Xinwen Hou, Yu Liu:
POPO: Pessimistic Offline Policy Optimization. 4008-4012 - Qifeng Lin, Qing Ling:
Byzantine-Robust Federated Deep Deterministic Policy Gradient. 4013-4017 - Duo Xu, Faramarz Fekri:
Improving Actor-Critic Reinforcement Learning Via Hamiltonian Monte Carlo Method. 4018-4022 - Mingzhe Chen, Xi Xiao, Wanpeng Zhang, Xiaotian Gao:
Efficient and Stable Information Directed Exploration for Continuous Reinforcement Learning. 4023-4027 - Xiaojie Li, Chaoran Cui, Donglin Cao, Juan Du, Chunyun Zhang:
Hypergraph-Based Reinforcement Learning for Stock Portfolio Selection. 4028-4032 - Jie Chen, Weiqi Liu, Jian Pu:
Memory-Based Message Passing: Decoupling the Message for Propagation from Discrimination. 4033-4037 - Hao Wu, Jiangchao Yao:
PEAR: Photographic Embedding for Aesthetic Rating. 4038-4042 - Pratyusha Das, Antonio Ortega:
Gradient-Weighted Class Activation Mapping for Spatio Temporal Graph Convolutional Network. 4043-4047 - Ehsaneddin Jalilian, Georg Wimmer, Andreas Uhl, Mahmut Karakaya:
Deep Learning Based Off-Angle Iris Recognition. 4048-4052 - Sajjad Amini, Shahrokh Ghaemmaghami:
Towards Robust Visual Transformer Networks via K-Sparse Attention. 4053-4057 - Wei Wang, Yimeng Chai, Yue Li:
A Global to Local Guiding Network for Missing Data Imputation. 4058-4062 - Çagkan Yapar, Ron Levie, Gitta Kutyniok, Giuseppe Caire:
LocUNet: Fast Urban Positioning Using Radio Maps and Deep Learning. 4063-4067 - Hojjat Salehinejad, Shahrokh Valaee:
LiteHAR: Lightweight Human Activity Recognition from WIFI Signals with Random Convolution Kernels. 4068-4072 - Jiajia Li, Ling Dai, Feng Tan, Hui Shen, Zikai Wang, Bin Sheng, Pengwei Hu:
CDX-NET: Cross-Domain Multi-Feature Fusion Modeling Via Deep Neural Networks for Multivariate Time Series Forecasting in AIOps. 4073-4077 - Li-Wei Liu, Yen-Chin Liao, Hsie-Chia Chang:
A Clustering-based ML Scheme for Capacity Approaching Soft Level Sensing in 3D TLC NAND. 4078-4082 - Claudio Battiloro, Mattia Merluzzi, Paolo Di Lorenzo, Sergio Barbarossa:
Dynamic Resource Optimization for Adaptive Federated Learning Empowered by Reconfigurable Intelligent Surfaces. 4083-4087 - Pourya Behmandpoor, Panagiotis Patrinos, Marc Moonen:
Learning-Based Resource Allocation with Dynamic Data Rate Constraints. 4088-4092 - Qihan Du, Li Yu, Huiyuan Li, Youfang Leng, Ningrui Ou:
Denoising-Oriented Deep Hierarchical Reinforcement Learning for Next-Basket Recommendation. 4093-4097 - DiJia Su, Jason D. Lee, John M. Mulvey, H. Vincent Poor:
Competitive Multi-Agent Reinforcement Learning with Self-Supervised Representation. 4098-4102 - Petteri Pulkkinen, Visa Koivunen:
Model-Based Online Learning for Resource Sharing in Joint Radar-Communication Systems. 4103-4107 - Siqi Shen, Jun Liu, Mengwei Qiu, Weiquan Liu, Cheng Wang, Yongquan Fu, Qinglin Wang, Peng Qiao:
Qrelation: an Agent Relation-Based Approach for Multi-Agent Reinforcement Learning Value Function Factorization. 4108-4112 - Qihan Du, Li Yu, Huiyuan Li, Youfang Leng, Ningrui Ou, Junyao Xiang:
Denoising-Guided Deep Reinforcement Learning For Social Recommendation. 4113-4117 - Christophe Dupuy, Radhika Arava, Rahul Gupta, Anna Rumshisky:
An Efficient DP-SGD Mechanism for Large Scale NLU Models. 4118-4122 - Zehan Chen, Xuan Jin, Yuan He, Hui Xue:
MAKD: Multiple Auxiliary Knowledge Distillation. 4123-4127 - Sari Saba-Sadiya, Tuka Waddah AlHanai, Mohammad M. Ghassemi:
Feature Imitating Networks. 4128-4132 - Ji Li, Chao Wang:
Over-Parameterized Network Solves Phase Retrieval Effectively. 4133-4137 - Jiangyuan Li, Mohammadreza Armandpour:
Deep Spatio-Temporal Wind Power Forecasting. 4138-4142 - Jitao Lu, Yihang Lu, Rong Wang, Feiping Nie, Xuelong Li:
Multiple Kernel K-Means Clustering with Simultaneous Spectral Rotation. 4143-4147 - Kai Chen, Twan van Laarhoven, Elena Marchiori, Feng Yin, Shuguang Cui:
Multitask Gaussian Process With Hierarchical Latent Interactions. 4148-4152 - Yihang Lu, Jitao Lu, Rong Wang, Feiping Nie:
Discrete Multi-Kernel K-Means with Diverse and Optimal Kernel Learning. 4153-4157 - Takayuki Nakachi, Yitu Wang:
Access Control for Privacy-Preserving Gaussian Process Regression. 4158-4162 - Farah Cherfaoui, Hachem Kadri, Liva Ralaivola:
Scalable Ridge Leverage Score Sampling for the Nyström Method. 4163-4167 - Hrusikesha Pradhan, Alec Koppel, Ketan Rajawat:
On Submodular Set Cover Problems for Near-Optimal Online Kernel Basis Selection. 4168-4172 - Dong Ma, Chi Ian Tang, Cecilia Mascolo:
Improving Feature Generalizability with Multitask Learning in Class Incremental Learning. 4173-4177 - Hou Lio, Shang-En Li, Jen-Tzung Chien:
Adversarial Mask Transformer for Sequential Learning. 4178-4182 - Pouya M. Ghari, Yanning Shen:
Online Learning with Probabilistic Feedback. 4183-4187 - Jen-Hao Rick Chang, Martin Bresler, Youssouf Chherawala, Adrien Delaye, Thomas Deselaers, Ryan S. Dixon, Oncel Tuzel:
Data Incubation - Synthesizing Missing Data for Handwriting Recognition. 4188-4192 - Yuhao Liu, Petar M. Djuric:
Tracking the Dimensions of Latent Spaces of Gaussian Process Latent Variable Models. 4193-4197 - Chen Zhong, Mustafa Cenk Gursoy, Senem Velipasalar:
Controlled Sensing and Anomaly Detection Via Soft Actor-Critic Reinforcement Learning. 4198-4202 - Yuzhang Shang, Bin Duan, Ziliang Zong, Liqiang Nie, Yan Yan:
Win The Lottery Ticket Via Fourier Analysis: Frequencies Guided Network Pruning. 4203-4207 - Kyungmi Lee, Anantha P. Chandrakasan:
SparseBFA: Attacking Sparse Deep Neural Networks with the Worst-Case Bit Flips on Coordinates. 4208-4212 - Sizhao Huang, Shuai Wang, Jian Chen, Guozhi Li, Wenyi Wang:
Adversarial Examples Detection Based on Error Level Analysis and Space Mapping. 1-5 - Ziyi Chen, Akihiro Sugimoto, Shang-Hong Lai:
Learning Monocular 3D Human Pose Estimation With Skeletal Interpolation. 4218-4222 - Juan Cerviño, Luana Ruiz, Alejandro Ribeiro:
Training Stable Graph Neural Networks Through Constrained Learning. 4223-4227 - Xun Xian, Mingyi Hong, Jie Ding:
Mismatched Supervised Learning. 4228-4232 - Mateusz Pabian, Dominik Rzepka, Miroslaw Pawlak:
Supervised Training of Siamese Spiking Neural Networks with Earth Mover's Distance. 4233-4237 - Xiaoyi Mai, Salman Avestimehr, Antonio Ortega, Mahdi Soltanolkotabi:
On The Effectiveness of Active Learning by Uncertainty Sampling in Classification of High-Dimensional Gaussian Mixture Data. 4238-4242 - Akshay Rangamani, Andrzej Banburski-Fahey:
Neural Collapse in Deep Homogeneous Classifiers and The Role of Weight Decay. 4243-4247 - Ismail R. Alkhouri, Alvaro Velasquez, George K. Atia:
Synthesis of Adversarial Samples in Two-Stage Classifiers. 4248-4252 - Chen Gong, Kong Bin, Eric J. Seibel, Xin Wang, Youbing Yin, Qi Song:
Synergistic Network Learning and Label Correction for Noise-Robust Image Classification. 4253-4257 - Jianan Chen, Qin Hu, Honglu Jiang:
Social Welfare Maximization in Cross-Silo Federated Learning. 4258-4262 - Qiongxiu Li, Jaron Skovsted Gundersen, Katrine Tjell, Rafal Wisniewski, Mads Græsbøll Christensen:
Privacy-Preserving Distributed Expectation Maximization for Gaussian Mixture Model Using Subspace Perturbation. 4263-4267 - Yichuan Li, Petros G. Voulgaris, Nikolaos M. Freris:
A Communication Efficient Quasi-Newton Method for Large-Scale Distributed Multi-Agent Optimization. 4268-4272 - Kun Yuan, Zhaoxian Wu, Qing Ling:
A Byzantine-Resilient Dual Subgradient Method for Vertical Federated Learning. 4273-4277 - Heng Zhu, Qing Ling:
Byzantine-Robust Aggregation with Gradient Difference Compression and Stochastic Variance Reduction for Federated Learning. 4278-4282 - Jie Peng, Weiyu Li, Qing Ling:
Variance Reduction-Boosted Byzantine Robustness in Decentralized Stochastic Optimization. 4283-4287 - Sehoon Kim, Amir Gholami, Zhewei Yao, Nicholas Lee, Patrick Wang, Aniruddha Nrusimha, Bohan Zhai, Tianren Gao, Michael W. Mahoney, Kurt Keutzer:
Integer-Only Zero-Shot Quantization for Efficient Speech Recognition. 4288-4292 - Botao Zhao, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao:
nnSpeech: Speaker-Guided Conditional Variational Autoencoder for Zero-Shot Multi-speaker text-to-speech. 4293-4297 - Chen Chen, Nana Hou, Yuchen Hu, Shashank Shirol, Eng Siong Chng:
Noise-Robust Speech Recognition With 10 Minutes Unparalleled In-Domain Data. 4298-4302 - Yuhao Dan, Jie Zhou, Qin Chen, Qingchun Bai, Liang He:
Enhancing Class Understanding Via Prompt-Tuning For Zero-Shot Text Classification. 4303-4307 - Hyeonuk Nam, Seong-Hu Kim, Yong-Hwa Park:
Filteraugment: An Acoustic Environmental Data Augmentation Method. 4308-4312 - Jhoan Keider Hoyos-Osorio, Oscar Skean, Austin J. Brockmeier, Luis Gonzalo Sánchez Giraldo:
The Representation Jensen-Rényi Divergence. 4313-4317 - Qi Zhang, Shujian Yu, Jingmin Xin, Badong Chen:
Multi-View Information Bottleneck Without Variational Approximation. 4318-4322 - Devansh Gupta, Vinayak Abrol:
Time-Frequency and Geometric Analysis of Task-Dependent Learning in Raw Waveform Based Acoustic Models. 4323-4327 - David Bonet, Antonio Ortega, Javier Ruiz Hidalgo, Sarath Shekkizhar:
Channel Redundancy and Overlap in Convolutional Neural Networks with Channel-Wise NNK Graphs. 4328-4332 - Abhishek Kumar, Vivek Khimani, Dimitris Chatzopoulos, Pan Hui:
FedClean: A Defense Mechanism against Parameter Poisoning Attacks in Federated Learning. 4333-4337 - Trung Dang, Om Thakkar, Swaroop Ramaswamy, Rajiv Mathews, Peter Chin, Françoise Beaufays:
A Method to Reveal Speaker Identity in Distributed ASR Training, and How to Counter It. 4338-4342 - Babak Barazandeh, Kristal Curtis, Chandrima Sarkar, Ram Sriharsha, George Michailidis:
On The Convergence of ADAM-Type Algorithms for Solving Structured Single Node and Decentralized Min-Max Saddle Point Games. 4343-4347 - Tien-Ju Yang, Dhruv Guliani, Françoise Beaufays, Giovanni Motta:
Partial Variable Training for Efficient on-Device Federated Learning. 4348-4352 - Haider Al-Lawati, Stark C. Draper:
Gradient Staleness in Asynchronous Optimization Under Random Communication Delays. 4353-4357 - Chen Ying, Baochun Li, Bo Li:
Tempo: Improving Training Performance in Cross-Silo Federated Learning. 4358-4362 - Xiaokang Yang, Jianguo Wei:
DMANET: Deep Learning-Based Differential Microphone Arrays for Multi-Channel Speech Separation. 4363-4367 - Naoya Takahashi, Yuki Mitsufuji:
Amicable Examples for Informed Source Separation. 4368-4372 - Kohei Saijo, Tetsuji Ogawa:
Remix-Cycle-Consistent Learning on Adversarially Learned Separator for Accurate and Stable Unsupervised Speech Separation. 4373-4377 - Alper T. Erdogan:
An Information Maximization Based Blind Source Separation Approach for Dependent and Independent Sources. 4378-4382 - Shahram Hosseini, Yannick Deville:
Blind Separation of Linear-Quadratic Mixtures of Mutually Independent and Autocorrelated Sources. 4383-4387 - Matthias Hermann, Georg Umlauf, Matthias O. Franz:
Large-Scale Independent Component Analysis By Speeding Up Lie Group Techniques. 4388-4392 - Vivek Sivaraman Narayanaswamy, Rushil Anirudh, Irene Kim, Yamen Mubarka, Andreas Spanias, Jayaraman J. Thiagarajan:
Predicting the Generalization Gap in Deep Models using Anchoring. 4393-4397 - Vardaan Taneja, Pin-Yu Chen, Yuguang Yao, Sijia Liu:
When Does Backdoor Attack Succeed in Image Reconstruction? A Study of Heuristics vs. Bi-Level Solution. 4398-4402 - Tianyu Chen, Zhixin Li, Jiahui Wei, Tiantao Xian:
Mixed Knowledge Relation Transformer for Image Captioning. 4403-4407 - Zheng Huo, Chong Wang, Weiwei Chen, Yuqi Li, Jun Wang, Jiafei Wu:
Balanced Stripe-Wise Pruning In The Filter. 4408-4412 - Shuang Liang, Yinan Zou, Yong Zhou:
GAN-Based Joint Activity Detection and Channel Estimation for Grant-Free Random Access. 4413-4417 - Kun Wang, Jing Dong, Baoxiang Wang, Shuai Li:
Cascading Bandit Under Differential Privacy. 4418-4422 - Angshul Majumdar:
Iterative Re-weighted Least Squares Algorithms for Non-negative Sparse and Group-sparse Recovery. 4423-4427 - Chuyang Ke, Jean Honorio:
Exact Partitioning of High-Order Planted Models with A Tensor Nuclear Norm Constraint. 4428-4432 - Ahmed Imtiaz Humayun, Randall Balestriero, Anastasios Kyrillidis, Richard G. Baraniuk:
No More Than 6ft Apart: Robust K-Means via Radius Upper Bounds. 4433-4437 - Ping Xu, Yue Wang, Xiang Chen, Zhi Tian:
Deep Kernel Learning Networks with Multiple Learning Paths. 4438-4442 - Adarsh Barik, Jean Honorio:
Provable Sample Complexity Guarantees For Learning Of Continuous-Action Graphical Games With Nonparametric Utilities. 4443-4447 - Jianyuan Ni, Raunak Sarbajna, Yang Liu, Anne H. H. Ngu, Yan Yan:
Cross-Modal Knowledge Distillation For Vision-To-Sensor Action Recognition. 4448-4452 - Hsuan-An Hsia, Che-Hsien Lin, Bo-Han Kung, Jhao-Ting Chen, Daniel Stanley Tan, Jun-Cheng Chen, Kai-Lung Hua:
CLIPCAM: A Simple Baseline For Zero-Shot Text-Guided Object And Action Localization. 4453-4457 - Tiantao Xian, Zhixin Li, Tianyu Chen, Huifang Ma:
Exploring Dual Stream Global Information For Image Captioning. 4458-4462 - Georgii Mikriukov, Mahdyar Ravanbakhsh, Begüm Demir:
Unsupervised Contrastive Hashing for Cross-Modal Retrieval in Remote Sensing. 4463-4467 - Sungjune Park, Dae Hwi Choi, Jung Uk Kim, Yong Man Ro:
Robust Thermal Infrared Pedestrian Detection By Associating Visible Pedestrian Knowledge. 4468-4472 - Joshua Vendrow, Jamie Haddock, Deanna Needell:
A Generalized Hierarchical Nonnegative Tensor Decomposition. 4473-4477 - Zongcai Du, Jie Liu, Jie Tang, Gangshan Wu:
Two Strategies Toward Lightweight Image Super-Resolution. 4478-4482 - Brian Kenji Iwana:
On Mini-Batch Training with Varying Length Time Series. 4483-4487 - Yuan Zhang, Yuan Yuan, Qi Wang:
ACP: Adaptive Channel Pruning for Efficient Neural Networks. 4488-4492 - Yang Guo, Jeanette Wen Jun Poh, Cheryl Sze Yin Wong, Savitha Ramasamy:
Bayesian Continual Imputation and Prediction For Irregularly Sampled Time Series Data. 4493-4497 - Hailin Zhang, Defang Chen, Can Wang:
Confidence-Aware Multi-Teacher Knowledge Distillation. 4498-4502 - Jiying Zhang, Yuzhao Chen, Xi Xiao, Runiu Lu, Shu-Tao Xia:
Learnable Hypergraph Laplacian for Hypergraph Learning. 4503-4507 - Jitendra K. Tugnait:
Graph Learning From Multivariate Dependent Time Series Via A Multi-Attribute Formulation. 4508-4512 - Soheil Kolouri, Kimia Nadjahi, Shahin Shahrampour, Umut Simsekli:
Generalized Sliced Probability Metrics. 4513-4517 - Eyal Fishel Ben-Knaan, Yonina C. Eldar, Nir Shlezinger:
Recovery of Noisy Pooled Tests via Learned Factor Graphs with Application to COVID-19 Testing. 4518-4522 - Jean-Christophe Gagnon-Audet, Soroosh Shahtalebi, Frank Rudzicz, Irina Rish:
A Remedy For Distributional Shifts Through Expected Domain Translation. 4523-4527 - Pol Grau Jurado, Xinyue Liang, Saikat Chatterjee:
Deterministic Transform Based Weight Matrices for Neural Networks. 4528-4532 - Mingzhou Fan, Byung-Jun Yoon, Francis J. Alexander, Edward R. Dougherty, Xiaoning Qian:
Adaptive Group Testing with Mismatched Models. 4533-4537 - Yonggang Zhu, Chao Tian, Zhuqing Jiang, Aidong Men, Haiying Wang, Qingchao Chen:
Mixed In Time And Modality: Curse Or Blessing? Cross-Instance Data Augmentation for Weakly Supervised Multimodal Temporal Fusion. 4538-4542 - Ningrui Ou, Li Yu, Huiyuan Li, Qihan Du, Junyao Xiang, Wei Gong:
MTAF: Shopping Guide Micro-Videos Popularity Prediction Using Multimodal and Temporal Attention Fusion Approach. 4543-4547 - Oliver Stromann, Seyed Alireza Razavi, Michael Felsberg:
Learning To Integrate Vision Data Into Road Network Data. 4548-4552 - Huajian Wu, Mingmin Chi:
Hierarchical Signal Fusion Network for Pulsar Detection with Phase-Correlation and Signal Attentions. 4553-4557 - Sahil Datta, Akuha Aondoakaa, Jorunn Jo Holmberg, Elena Antonova:
Recognition Of Silently Spoken Word From EEG Signals Using Dense Attention Network (DAN). 4558-4562 - Ho-Hsiang Wu, Prem Seetharaman, Kundan Kumar, Juan Pablo Bello:
Wav2CLIP: Learning Robust Audio Representations from CLIP. 4563-4567 - Gourav Datta, Tyler Etchart, Vivek Yadav, Varsha Hedau, Pradeep Natarajan, Shih-Fu Chang:
ASD-Transformer: Efficient Active Speaker Detection Using Self And Multimodal Transformers. 4568-4572 - Georgios Paraskevopoulos, Efthymios Georgiou, Alexandros Potamianos:
Mmlatch: Bottom-Up Top-Down Fusion For Multimodal Sentiment Analysis. 4573-4577 - Luwei Xiao, Xingjiao Wu, Wen Wu, Jing Yang, Liang He:
Multi-Channel Attentive Graph Convolutional Network with Sentiment Fusion for Multimodal Sentiment Analysis. 4578-4582 - Tianyu Chen, Yuan Xie, Shuai Zhang, Shaohan Huang, Haoyi Zhou, Jianxin Li:
Learning Music Sequence Representation From Text Supervision. 4583-4587 - Kleanthis Avramidis, Christos Garoufis, Athanasia Zlatintsi, Petros Maragos:
Enhancing Affective Representations Of Music-Induced EEG Through Multimodal Supervision And Latent Domain Adaptation. 4588-4592 - Luyu Wang, Pauline Luc, Yan Wu, Adrià Recasens, Lucas Smaira, Andrew Brock, Andrew Jaegle, Jean-Baptiste Alayrac, Sander Dieleman, João Carreira, Aäron van den Oord:
Towards Learning Universal Audio Representations. 4593-4597 - Siyuan Shan, Lamtharn Hantrakul, Jitong Chen, Matt Avent, David Trevelyan:
Differentiable Wavetable Synthesis. 4598-4602 - Víctor Arroyo, Jose J. Valero-Mas, Jorge Calvo-Zaragoza, Antonio Pertusa:
Neural Audio-To-Score Music Transcription For Unconstrained Polyphony Using Compact Output Representations. 4603-4607 - Junghyun Koo, Seungryeol Paik, Kyogu Lee:
End-To-End Music Remastering System Using Self-Supervised And Adversarial Training. 4608-4612 - Huaizhen Tang, Xulong Zhang, Jianzong Wang, Ning Cheng, Jing Xiao:
AVQVC: One-Shot Voice Conversion By Vector Quantization With Applying Contrastive Learning. 4613-4617 - Shijing Si, Jianzong Wang, Junqing Peng, Jing Xiao:
Towards Speaker Age Estimation With Label Distribution Learning. 4618-4622 - Penghong Wang, Jiahui Li, Mengyao Ma, Xiaopeng Fan:
Distributed Audio-Visual Parsing Based On Multimodal Transformer and Deep Joint Source Channel Coding. 4623-4627 - Sen Liang, Zhize Zhou, Rong Li, Juyong Zhang, Hujun Bao:
TalkingFlow: Talking Facial Landmark Generation with Multi-Scale Normalizing Flow Network. 4628-4632 - Yuning Qiu, Carlos Busso, Teruhisa Misu, Kumar Akash:
Incorporating Gaze Behavior Using Joint Embedding With Scene Context for Driver Takeover Detection. 4633-4637 - Masahiro Yasuda, Yasunori Ohishi, Shoichiro Saito, Noboru Harada:
Multi-View And Multi-Modal Event Detection Utilizing Transformer-Based Multi-Sensor Fusion. 4638-4642 - Koshi Watanabe, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama:
Distributed Label Dequantized Gaussian Process Latent Variable Model for Multi-View Data Integration. 4643-4647 - Go Irie, Takashi Shibata, Akisato Kimura:
Co-Attention-Guided Bilinear Model for Echo-Based Depth Estimation. 4648-4652 - Junwei Zhao, Zhaofei Yu, Lei Ma, Ziluo Ding, Shiliang Zhang, Yonghong Tian, Tiejun Huang:
Modeling The Detection Capability Of High-Speed Spiking Cameras. 4653-4657 - Zenghao Chai, Zhengzhuo Xu, Chun Yuan:
Modernn: Towards Fine-Grained Motion Details for Spatiotemporal Predictive Learning. 4658-4662 - Keisuke Nonaka, Ryosuke Watanabe, Haruhisa Kato, Tatsuya Kobayashi, Eduardo Pavez, Antonio Ortega:
Graph-Based Point Cloud Denoising Using Shape-Aware Consistency For Free-Viewpoint Video. 4663-4667 - Bor-Sheng Huang, Chih-Chung Hsu, Wo-Ting Liao, Han-Yi Kao, Xian-Yun Wang:
DCSN: Deformable Convolutional Semantic Segmentation Neural Network for Non-Rigid Scenes. 4668-4672 - Junwei Zhao, Shiliang Zhang, Tiejun Huang:
Transformer-Based Domain Adaptation for Event Data Classification. 4673-4677 - Ziqing Yang, Katherine Nayan, Zehao Fan, Houwei Cao:
Multimodal Emotion Recognition with Surgical and Fabric Masks. 4678-4682 - Yuya Moroto, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama:
Human Emotion Recognition Using Multi-Modal Biological Signals Based On Time Lag-Considered Correlation Maximization. 4683-4687 - Mixiao Hou, Zheng Zhang, Guangming Lu:
Multi-Modal Emotion Recognition with Self-Guided Modality Calibration. 4688-4692 - Vandana Rajan, Alessio Brutti, Andrea Cavallaro:
Is Cross-Attention Preferable to Self-Attention for Multi-Modal Emotion Recognition? 4693-4697 - Minh Tran, Mohammad Soleymani:
A Pre-Trained Audio-Visual Transformer for Emotion Recognition. 4698-4702 - Jinming Zhao, Ruichen Li, Qin Jin, Xinchao Wang, Haizhou Li:
Memobert: Pre-Training Model with Prompt-Based Learning for Multimodal Emotion Recognition. 4703-4707 - Kaikai Deng, Dong Zhao, Qiaoyue Han, Zihan Zhang, Shuyue Wang, Huadong Ma:
Global-Local Feature Enhancement Network for Robust Object Detection using mmWave Radar and Camera. 4708-4712 - Ying Wang, Chihui Zhuang, Haihui Ye, Yan Yan, Hanzi Wang:
Learning Correlation for Online Multiple Object Tracking. 4713-4717 - Chihui Zhuang, Yanjie Liang, Yan Yan, Yang Lu, Hanzi Wang:
Bounding Box Distribution Learning and Center Point Calibration for Robust Visual Tracking. 4718-4722 - Haihui Ye, Guangge Wang, Yang Lu, Yan Yan, Hanzi Wang:
Multi-Focus Guided Semantic Aggregation for Video Object Detection. 4723-4727 - Chandrashekhar Lavania, Shiva Sundaram, Sundararajan Srinivasan, Katrin Kirchhoff:
Enhancing Contrastive Learning with Temporal Cognizance for Audio-Visual Representation Generation. 4728-4732 - Zimian Wei, Hengyue Pan, Linbo Qiao, Xin Niu, Peijie Dong, Dongsheng Li:
Cross-Modal Knowledge Distillation in Multi-Modal Fake News Detection. 4733-4737 - Tao Qian, Jiatong Shi, Shuai Guo, Peter Wu, Qin Jin:
Training Strategies for Automatic Song Writing: A Unified Framework Perspective. 4738-4742 - Jianrong Wang, Zixuan Wang, Xiaosheng Hu, Xuewei Li, Qiang Fang, Li Liu:
Residual-Guided Personalized Speech Synthesis based on Face Image. 4743-4747 - Yucheng Zhou:
Sketch Storytelling. 4748-4752 - Xianbing Zhao, Yixin Chen, Wanting Li, Lei Gao, Buzhou Tang:
MAG+: An Extended Multimodal Adaptation Gate for Multimodal Sentiment Analysis. 4753-4757 - Wenrui Li, Xiaopeng Fan:
Image-Text Alignment and Retrieval Using Light-Weight Transformer. 4758-4762 - Mingyang Li, Shao-Lun Huang, Lin Zhang:
A General Framework For Incomplete Cross-Modal Retrieval With Missing Labels And Missing Modalities. 4763-4767 - Heeyoung Kwak, Hyunkyung Bae, Kyomin Jung:
Subgraph Representation Learning with Hard Negative Samples for Inductive Link Prediction. 4768-4772 - Abin Jose, Daniel Filbert, Christian Rohlfing, Jens-Rainer Ohm:
Deep Hashing with Hash Center Update for Efficient Image Retrieval. 4773-4777 - Lin Wang, Wanqian Zhang, Dayan Wu, Pingting Hong, Bo Li:
Prototype-Based Inter-Camera Learning for Person Re-Identification. 4778-4782 - Zeyu Ma, Yuhang Guo, Xiao Luo, Chong Chen, Minghua Deng, Wei Cheng, Guangming Lu:
DHWP: Learning High-Quality Short Hash Codes Via Weight Pruning. 4783-4787 - Fagui Liu, Xinjie Wu, Chao Li:
Node Slicing Broad Learning System for Text Classification. 4788-4792 - Siyu Lou, Xuenan Xu, Mengyue Wu, Kai Yu:
Audio-Text Retrieval in Context. 4793-4797 - Satwinder Singh, Ruili Wang, Feng Hou:
Improved Meta Learning for Low Resource Speech Recognition. 4798-4802 - Yiwu Yao, Chengyu Wang, Jun Huang:
Quantized Winograd Acceleration for CONV1D Equipped ASR Models on Mobile Devices. 4803-4807 - Jianrong Wang, Jinyu Liu, Longxuan Zhao, Shanyu Wang, Ruiguo Yu, Li Liu:
Acoustic-to-Articulatory Inversion Based on Speech Decomposition and Auxiliary Feature. 4808-4812 - Ya-Tse Wu, Jeng-Lin Li, Chi-Chun Lee:
An Audio-Saliency Masking Transformer for Audio Emotion Classification in Movies. 4813-4817 - Yuto Watanabe, Ren Togo, Keisuke Maeda, Takahiro Ogawa, Miki Haseyama:
Generative Adversarial Network Including Referring Image Segmentation For Text-Guided Image Manipulation. 4818-4822 - Chuhao Jin, Hongteng Xu, Ruihua Song, Zhiwu Lu:
Text2Poster: Laying Out Stylized Texts on Retrieved Images. 4823-4827 - Xiaoqing Liu, Huanqiang Zeng, Yifan Shi, Jianqing Zhu, Kai-Kuang Ma:
Deep Rank Cross-Modal Hashing with Semantic Consistent for Image-Text Retrieval. 4828-4832 - Mingrui Lao, Yanming Guo, Wei Chen, Nan Pu, Michael S. Lew:
VQA-BC: Robust Visual Question Answering Via Bidirectional Chaining. 4833-4837 - Anda Zhang, Wei Tao, Ziyan Li, Haofen Wang, Wenqiang Zhang:
Type-Aware Medical Visual Question Answering. 4838-4842 - Ron M. Hecht, Ke Liu, Noa Garnett, Ariel Telpaz, Omer Tsimhoni:
From Bottom-Up To Top-Down: Characterization Of Training Process In Gaze Modeling. 4843-4847 - Yuhan Zhang, Weihua He, Minglei Li, Kun Tian, Ziyang Zhang, Jie Cheng, Yaoyuan Wang, Jianxing Liao:
Meta Talk: Learning To Data-Efficiently Generate Audio-Driven Lip-Synchronized Talking Face With High Definition. 4848-4852 - Taeheon Kim, Hong Joo Lee, Yong Man Ro:
MAP: Multispectral Adversarial Patch to Attack Person Detection. 4853-4857 - Yuhang Huang, Junjie Zhang, Shuyan Liu, Qian Bao, Dan Zeng, Zhineng Chen, Wu Liu:
Genre-Conditioned Long-Term 3D Dance Generation Driven by Music. 4858-4862 - Arda Senocak, Hyeonggon Ryu, Junsik Kim, In So Kweon:
Learning Sound Localization Better from Semantically Similar Samples. 4863-4867 - Shuo Liu, Weize Quan, Yuan Liu, Dong-Ming Yan:
Bi-Directional Modality Fusion Network For Audio-Visual Event Localization. 4868-4872 - Yihao Luo, Xiang Cao, Juntao Zhang, Peng Cheng, Tianjiang Wang, Qi Feng:
Dynamic Multi-Scale Loss Balance for Object Detection. 4873-4877 - Nicolas Frank, Davi Lazzarotto, Touradj Ebrahimi:
Latent Space Slicing for Enhanced Entropy Modeling In Learning-Based Point Cloud Geometry Compression. 4878-4882 - Dongmin Cha, Daijin Kim:
DAM-GAN: Image Inpainting Using Dynamic Attention Map Based on Fake Texture Detection. 4883-4887 - Shukai Wu, Qingqin Wang, Shuchang Xu, Sanyuan Zhang:
Improving Reference-Based Image Colorization For Line Arts Via Feature Aggregation And Contrastive Learning. 4888-4892 - Jiawei Ma, Xu Zhang, Yue Wu, Varsha Hedau, Shih-Fu Chang:
Few-Shot Gaze Estimation with Model Offset Predictors. 4893-4897 - Masatomo Yoshida, Masahiro Okuda:
Adversarial Examples for Image Cropping in Social Media. 4898-4902 - Saeed Mohammadzadeh, Vítor H. Nascimento, Rodrigo C. de Lamare, Osman Kukrer:
Robust Adaptive Beamforming Based on Power Method Processing and Spatial Spectrum Matching. 4903-4907 - Jiangyan Han, Boon Poh Ng, Meng Hwa Er:
An Adaptive Orientational Beamforming Technique for Narrowband Interference Rejection. 4908-4912 - Syed A. Hamza, Moeness G. Amin, Batu K. Chalise:
Phase-Only Reconfigurable Sparse Array Beamforming Using Deep Learning. 4913-4917 - Yongwei Huang, Wenzheng Yang, Sergiy A. Vorobyov:
Robust Adaptive Beamforming Maximizing the Worst-Case SINR Over Distributional Uncertainty Sets for Random INC Matrix And Signal Steering Vector. 4918-4922 - Tuomas Aittomäki, Visa Koivunen:
Improved Beamforming Encoding for Joint Radar and Communication. 4923-4927 - Xuehan Wang, Israel Cohen, Jacob Benesty, Jingdong Chen:
Study of the Null Directions on The Performance of Differential Beamformers. 4928-4932 - Christoph F. Mecklenbräuker, Peter Gerstoft, Esa Ollila:
DOA M-Estimation Using Sparse Bayesian Learning. 4933-4937 - Yongsung Park, Florian Meyer, Peter Gerstoft:
Learning-Aided Initialization for Variational Bayesian DOA Estimation. 4938-4942 - Saidur R. Pavel, Yimin D. Zhang:
Neural Network-Based Compression Framework for DOA Estimation Exploiting Distributed Array. 4943-4947 - Long Chen, Weize Sun, Lei Huang, Guitong Chen:
T-SVD Based Broadband Non-Synchronous Measurements. 4948-4952 - Changheng Li, Jorge Martínez, Richard C. Hendriks:
Low Complex Accurate Multi-Source RTF Estimation. 4953-4957 - Luca Ferranti, Kalle Åström, Magnus Oskarsson, Jani Boutellier, Juho Kannala:
Multiple Offsets Multilateration: A New Paradigm for Sensor Network Calibration with Unsynchronized Reference Nodes. 4958-4962 - Xing-Yu Chen, Jie Zhang, Li-Rong Dai:
Reference Microphone Selection and Low-Rank Approximation Based Multichannel Wiener Filter with Application to Speech Recognition. 4963-4967 - Guy Gubnitsky, Yaakov Buchris, Israel Cohen:
Incoherent Synthesis of Sparse Broadband Arrays based on a Parameter-Free Subspace Clustering. 4968-4972 - Jake Millhiser, Pulak Sarangi, Piya Pal:
Initialization-Free Implicit-Focusing (IF2) for Wideband Direction-of-Arrival Estimation. 4973-4977 - Linlong Wu, Jisheng Dai, M. R. Bhavani Shankar, Ruizhi Hu, Björn E. Ottersten:
Recurrent Design of Probing Waveform for Sparse Bayesian Learning Based DOA Estimation. 4978-4982 - Yongzhe Li, Chunxuan Shi, Ran Tao:
Unimodular Waveform Design with Low Correlation Levels: A Fast Algorithm Development to Support Large-Scale Code Lengths. 4983-4987 - Zhihui Li, Junpeng Shi, Dongming Wu, Shujie Shi, Qingsong Zhou:
Airborne MIMO Radar Transmit-Receive Design Under Spectral Constraint in Signal-Dependent Clutter. 4988-4992 - Weitong Zhai, Xiangrong Wang, Maria S. Greco, Fulvio Gini:
Weak Target Detection in Massive MIMO Radar via an Improved Reinforcement Learning Approach. 4993-4997 - Stefano Buzzi, Emanuele Grossi, Marco Lops, Luca Venturino:
RIS-Aided Monostatic MIMO Radar with Co-Located Antennas. 4998-5002 - Po-Chih Chen, P. P. Vaidyanathan:
Convolutional Beamspace Using IIR Filters. 5003-5007 - Pranav Kulkarni, P. P. Vaidyanathan:
Rational Arrays for DOA Estimation. 5008-5012 - Xinyao Chen, Zai Yang:
Localizing More Sources than Sensors in Presence of Coherent Sources. 5013-5017 - Mohammad Bokaei, Saeed Razavikia, Arash Amini, Stefano Rini:
Two-Snapshot DOA Estimation Via Hankel-Structured Matrix Completion. 5018-5022 - Majdoddin Esfandiari, Sergiy A. Vorobyov:
A Novel Angular Estimation Method in the Presence of Nonuniform Noise. 5023-5027 - David Schenck, Katja Lübbe, Minh Trinh-Hoang, Marius Pesavento:
Partially Relaxed Orthogonal Least Squares Weighted Subspace Fitting Direction-of-Arrival Estimation. 5028-5032 - Nabil Mohsen, Ammar Hawbani, Monika Agrawal, Saeed H. Alsamhi, Liang Zhao:
A New Coprime-Array-based Configuration with Augmented Degrees of Freedom and Reduced Mutual Coupling. 5033-5037 - Shekhar Kumar Yadav, Nithin V. George:
Coarray Manifold Separation In The Spherical Harmonics Domain For Enhanced Source Localization. 5038-5042 - Chun-Lin Liu:
Sparse Array Source Enumeration Via Coarray Subspace Optimization. 5043-5047 - Ahmed M. A. Shaalan, Jun Du:
The Prototype Co-Prime Array with a Robust Difference Co-Array. 5048-5052 - Hang Zheng, Chengwei Zhou, André L. F. de Almeida, Yujie Gu, Zhiguo Shi:
DOA Estimation Via Coarray Tensor Completion with Missing Slices. 5053-5057 - Yuan-Pon Chen, Chun-Lin Liu:
Half Inverted Nested Arrays with Large Hole-Free Fourth-Order Difference Co-Arrays. 5058-5062 - Tianle Zhong, Israel Mendoza Velázquez, Yi Ren, Héctor Manuel Pérez Meana, Yoichi Haneda:
Spherical Convolutional Recurrent Neural Network for Real-Time Sound Source Tracking. 5063-5067 - Jinzheng Zhao, Peipei Wu, Xubo Liu, Yong Xu, Lyudmila Mihaylova, Simon J. Godsill, Wenwu Wang:
Audio-Visual Tracking of Multiple Speakers Via a PMBM Filter. 5068-5072 - Guozhen Zhu, Chenshu Wu, Beibei Wang, K. J. Ray Liu:
Floor Plan Reconstruction with High-Precision RF-Based Tracking. 5073-5077 - Peipei Wu, Jinzheng Zhao, Shidrokh Goudarzi, Wenwu Wang:
Partial Arithmetic Consensus based Distributed Intensity Particle Flow SMC-PHD Filter for Multi-Target Tracking. 5078-5082 - Jianyuan Yu, Pu Wang, Toshiaki Koike-Akino, Philip V. Orlik:
Multi-Modal Recurrent Fusion for Indoor Localization. 5083-5087 - Seyyede Fatemeh Seyyedsalehi, Hamid R. Rabiee:
Improving Joint Sparse Hyperspectral Unmixing by Simultaneously Clustering Pixels According To Their Mixtures. 5088-5092 - Xinzhe Geng, Tao Lei, Qi Chen, Jian Su, Xi He, Qi Wang, Asoke K. Nandi:
Global Evolution Neural Network for Segmentation of Remote Sensing Images. 5093-5097 - Jinping Wang, Jun Li, Xiaojun Tan:
Spectral-Spatial Symmetrical Aggregation Cross-Linking Multi-Modal Data Fusion Network. 5098-5102 - Mohammad Ali Vosoughi, Adora M. DSouza, Anas Z. Abidin, Axel Wismüller:
Relation Discovery in Nonlinearly Related Large-Scale Settings. 5103-5107 - Luca Bondi, Gabriel Chuang, Christopher Ick, Adarsh Dave, Charles Shelton, Brian Coltin, Trey Smith, Samarjit Das:
Acoustic Imaging Aboard The International Space Station (ISS): Challenges and Preliminary Results. 5108-5112 - Zhiwei Jiang, Hua Chen, Wei Liu, Ye Tian, Gang Wang:
Conjugate Augmented Spatial-Temporal Near-Field Sources Localization with Cross Array. 5113-5117 - Ruchi Pandey, Santosh Nannuru:
Parametric Models for DOA Trajectory Localization. 5118-5122 - Yuan Liu, Zhi-Wei Tan, Andy W. H. Khong, Hongwei Liu:
Joint Source Localization and Association Through Overcomplete Representation Under Multipath Propagation Environment. 5123-5127 - Ruichao Zheng, Gang Wang, K. C. Ho, Lei Huang:
Semidefinite Relaxation Method for Moving Object Localization Using a Stationary Transmitter at Unknown Position. 5128-5132 - Hantian Wu, Qing Shen, Wei Liu, Yibao Liang:
Underdetermined Two-Dimensional Localization for Wideband Sources Based on Distributed Sensor Array Networks. 5133-5137 - Shiva Akbari, Shahrokh Valaee:
Direct Localization: An Ising Model Approach. 5138-5142 - Andrew Robert Finelli, Peter Willett, Yaakov Bar-Shalom, Stefano Maranò:
Transient Detection with Unknown Statistics Via Source Coding. 5143-5147 - Meghna Kalra, Yoram Bresler, Kiryung Lee:
Identification of Pulse Streams Of Unknown Shape From Time Encoding Machine Samples. 5148-5152 - Hongqing Yu, Heng Qiao:
Exact Sparse Super-Resolution Via Model Aggregation. 5153-5157 - Wentao Shi, Baoqi Huang, Kai Sun:
A CRLB Analysis of AoA Estimation Using Bluetooth 5. 5158-5162 - Md. Waqeeb T. S. Chowdhury, Yimin D. Zhang:
Cramer-Rao Bound Analysis of Distributed DOA Estimation Exploiting Mixed-Precision Covariance Matrix. 5163-5167 - Zhaoyi Xu, Fan Liu, Athina P. Petropulu:
Cramér-Rao Bound and Antenna Selection Optimization for Dual Radar-Communication Design. 5168-5172 - Adarsh Barik, Jean Honorio:
Information Theoretic Limits For Standard and One-Bit Compressed Sensing with Graph-Structured Sparsity. 5173-5177 - Zachariah Sutton, Peter Willett, Stefano Maranò:
The Data/Identity Tradeoff with Censored Sensors. 5178-5182 - Khaled Ardah, Sepideh Gherekhloo, André L. F. de Almeida, Martin Haardt:
Double-RIS Versus Single-RIS Aided Systems: Tensor-Based MIMO Channel Estimation and Design Perspectives. 5183-5187 - Hyeonjin Chung, Sunwoo Kim:
Efficient Two-Stage Beam Training and Channel Estimation for RIS-Aided mmWave Systems Via Fast Alternating Least Squares. 5188-5192 - Mingyu Yang, Hun-Seok Kim:
Deep Joint Source-Channel Coding for Wireless Image Transmission with Adaptive Rate Control. 5193-5197 - Aditya Sant, Afshin Abdi, Joseph Soriaga:
Deep Sequential Beamformer Learning for Multipath Channels in mmWave Communication Systems. 5198-5202 - Elad Domanovitz, Daniel Severo, Ashish Khisti, Wei Yu:
Data-Driven Optimization for Zero-Delay Lossy Source Coding with Side Information. 5203-5207 - Sixian Wang, Ke Yang, Jincheng Dai, Kai Niu:
Distributed Image Transmission Using Deep Joint Source-Channel Coding. 5208-5212 - Navid Naderializadeh, Mark Eisen, Alejandro Ribeiro:
Adaptive Wireless Power Allocation with Graph Neural Networks. 5213-5217 - Tomer Gafni, Michal Yemini, Kobi Cohen:
Restless Multi-Armed Bandits under Exogenous Global Markov Process. 5218-5222 - Xuechao He, Heng Zhu, Qing Ling:
Byzantine-Robust and Communication-Efficient Distributed Non-Convex Learning Over Non-IID Data. 5223-5227 - Vinay Chakravarthi Gogineni, Stefan Werner, Yih-Fang Huang, Anthony Kuh:
Communication-Efficient Online Federated Learning Framework for Nonlinear Regression. 5228-5232 - Keshi Ge, Yongquan Fu, Yiming Zhang, Zhiquan Lai, Xiaoge Deng, Dongsheng Li:
S2 Reducer: High-Performance Sparse Communication to Accelerate Distributed Deep Learning. 5233-5237 - Sebastian Espinosa, Jorge F. Silva, Pablo Piantanida:
A Data-Driven Quantization Design for Distributed Testing Against Independence with Communication Constraints. 5238-5242 - Boning Li, Ananthram Swami, Santiago Segarra:
Power Allocation for Wireless Federated Learning Using Graph Neural Networks. 5243-5247 - Zirui Yan, Quan Xiao, Tianyi Chen, Ali Tajer:
Federated Multi-Armed Bandit Via Uncoordinated Exploration. 5248-5252 - Jian Xu, Shao-Lun Huang:
Byzantine-Resilient Decentralized Collaborative Learning. 5253-5257 - Maciej Niedzwiecki, Artur Gancza, Lu Shen, Yuriy V. Zakharov:
Adaptive Identification of Underwater Acoustic Channel with a Mix of Static and Time-Varying Parameters. 5258-5262 - Rabah Ouchikh, Abdeldjalil Aïssa-El-Bey, Thierry Chonavel, Mustapha Djeddou:
Iterative Channel Estimation and Data Detection Algorithm For OTFS Modulation. 5263-5267 - Michael Koller, Benedikt Fesl, Nurettin Turan, Wolfgang Utschick:
An Asymptotically Optimal Approximation of the Conditional Mean Channel Estimator Based on Gaussian Mixture Models. 5268-5272 - Ali Bemani, Nassar Ksairi, Marios Kountouris:
Low Complexity Equalization for AFDM In Doubly Dispersive Channels. 5273-5277 - Michael Baur, Michael Würth, Michael Koller, Vlad-Costin Andrei, Wolfgang Utschick:
CSI Clustering with Variational Autoencoding. 5278-5282 - Ramzi Ayachi, Mohamed Akrout, Volodymyr Shyianov, Faouzi Bellili, Amine Mezghani:
Massive Unsourced Random Access Based on Bilinear Vector Approximate Message Passing. 5283-5287 - Wei-Kun Chen, Ya-Feng Liu, Yu-Hong Dai, Zhi-Quan Luo:
Optimal QoS-Aware Network Slicing for Service-Oriented Networks with Flexible Routing. 5288-5292 - Runhua Wang, Yaohua Liu, Qing Ling:
Byzantine-Resilient Decentralized Resource Allocation. 5293-5297 - Arindam Chowdhury, Fernando Gama, Santiago Segarra:
Stability Analysis of Unfolded WMMSE for Power Allocation. 5298-5302 - Wenbo Wang, Amir Leshem:
Monotonic Generalized Nash Games with Application to the Management of Energy-Aware Aloha Networks. 5303-5307 - Zhongyuan Zhao, Ananthram Swami, Santiago Segarra:
Distributed Link Sparsification for Scalable Scheduling Using Graph Neural Networks. 5308-5312 - Farjam Karim, Bishmita Hazarika, Sandeep Kumar Singh, Keshav Singh:
A Performance Analysis for Multi-RIS-Assisted Full Duplex Wireless Communication System. 5313-5317 - Yang Liu, Yancheng Hou, Jiaxuan Wei, Yinghui Zhang, Junxing Zhang, Tiankui Zhang:
Joint Beam Selection and Precoding Based on Differential Evolution for Millimeter-Wave Massive MIMO Systems. 5318-5322 - Zheyu Wu, Bo Jiang, Ya-Feng Liu, Yu-Hong Dai:
A Novel Negative ℓ1 Penalty Approach for Multiuser One-Bit Massive MIMO Downlink with PSK Signaling. 5323-5327 - Jochen Fink, Renato L. G. Cavalcante, Zoran Utkovski, Slawomir Stanczak:
A Set-Theoretic Approach to MIMO Detection. 5328-5332 - Quan Zhang, Xuyang Zhao, Jiangtao Wang, Yongchao Wang:
Designing a QAM Signal Detector for Massive MIMO Systems via PS-ADMM Approach. 5333-5337 - Timur Zirtiloglu, Nir Shlezinger, Yonina C. Eldar, Rabia Tugce Yazicigil:
Power-Efficient Hybrid MIMO Receiver with Task-Specific Beamforming using Low-Resolution ADCs. 5338-5342 - Junbin Liu, Mingjie Shao, Wing-Kin Ma:
MIMO Detection by Variational Posterior Inference. 5343-5347 - Trinh Van Chien, Tu Lam Thanh, Tran Dinh Hieu, Hieu V. Nguyen, Symeon Chatzinotas, Marco Di Renzo, Björn E. Ottersten:
Controlling Smart Propagation Environments: Long-Term Versus Short-Term Phase Shift Optimization. 5348-5352 - Spilios Evmorfos, Athina P. Petropulu:
Deep Actor-Critic for Continuous 3D Motion Control in Mobile Relay Beamforming Networks. 5353-5357 - Daniel Romero, Pham Q. Viet, Geert Leus:
Aerial Base Station Placement Leveraging Radio Tomographic Maps. 5358-5362 - Jianxiu Li, Maxime Ferreira Da Costa, Urbashi Mitra:
Atomic Norm Based Localization and Orientation Estimation for Millimeter-Wave MIMO OFDM Systems. 5363-5367 - Michael Joham, Hangze Gao, Wolfgang Utschick:
Estimation Of Channels In Systems With Intelligent Reflecting Surfaces. 5368-5372 - Nuan Song, Tao Yang:
Distributed Hybrid Beamforming for mmWave Cell-Free Massive MIMO. 5373-5377 - Yasaman Khorsandmanesh, Emil Björnson, Joakim Jaldén:
Quantization-Aware Precoding For MU-MIMO With Limited-Capacity Fronthaul. 5378-5382 - Yanjie Dong, Haijun Zhang, Jianqiang Li, F. Richard Yu, Song Guo, Victor C. M. Leung:
An Online Throughput Maximization Algorithm for Green Coordinated Multi-Point Systems. 5383-5387 - Xilai Fan, Ya-Feng Liu, Liang Liu:
Efficiently and Globally Solving Joint Beamforming and Compression Problem in the Cooperative Cellular Network Via Lagrangian Duality. 5388-5392 - Juan Vidal Alegría, Jinliang Huang, Fredrik Rusek:
Cell-Free Massive MIMO: Exploiting The Wax Decomposition. 5393-5397 - Lei Jiang, Haijian Zhang, Lei Yu:
Learning Structured Sparsity For Time-Frequency Reconstruction. 5398-5402 - Ayush Bhandari:
Unlimited Sampling with Sparse Outliers: Experiments with Impulsive and Jump or Reset Noise. 5403-5407 - Haiyan Yu, Zhen Qin, Zhihui Zhu:
Learning Approach For Fast Approximate Matrix Factorizations. 5408-5412 - Pierre Barbault, Matthieu Kowalski, Charles Soussen:
Parameter Estimation in Sparse Inverse Problems Using Bernoulli-Gaussian Prior. 5413-5417 - Mohamed Mansour:
Sparse Recovery of Acoustic Waves. 5418-5422 - El-Hadji Samba Diop, Karl Skretting:
Nonlinear Signal Decomposition Based on Block Sparse Approximation. 5423-5427 - Minh N. Bùi, Patrick L. Combettes, Zev Woodstock:
Block-Activated Algorithms For Multicomponent Fully Nonsmooth Minimization. 5428-5432 - Takumi Fukunaga, Hiroyuki Kasai:
Block-Coordinate Frank-Wolfe Algorithm And Convergence Analysis For Semi-Relaxed Optimal Transport Problem. 5433-5437 - Ioannis C. Tsaknakis, Prashant Khanduri, Mingyi Hong:
An Implicit Gradient-Type Method for Linearly Constrained Bilevel Problems. 5438-5442 - Théo Guyard, Cédric Herzet, Clément Elvira:
Screen & Relax: Accelerating The Resolution Of Elastic-Net By Safe Identification of The Solution Support. 5443-5447 - Théo Guyard, Cédric Herzet, Clément Elvira:
Node-Screening Tests For The L0-Penalized Least-Squares Problem. 5448-5452 - Thomas Guilmeau, Emilie Chouzenoux, Víctor Elvira:
Proximal-Based Adaptive Simulated Annealing for Global Optimization. 5453-5457 - Ashish Tiwari, Richeek Das, Shanmuganathan Raman:
Exploring Deeper Graph Convolutions for Semi-Supervised Node Classification. 5463-5467 - Alvaro Arroyo, Bruno Scalzo, Ljubisa Stankovic, Danilo P. Mandic:
Dynamic Portfolio Cuts: A Spectral Approach to Graph-Theoretic Diversification. 5468-5472 - Zhiyang Wang, Luana Ruiz, Alejandro Ribeiro:
Stability of Neural Networks on Manifolds to Relative Perturbations. 5473-5477 - Jiawei Sun, Jie Li, Chentao Wu, Zili Tang, Celimuge Wu:
Ada-STNet: A Dynamic AdaBoost Spatio-Temporal Network for Traffic Flow Prediction. 5478-5482 - Artun Bayer, Arindam Chowdhury, Santiago Segarra:
Label Propagation Across Graphs: Node Classification Using Graph Neural Tangent Kernels. 5483-5487 - Claudio J. Bordin, Caio Gomes de Figueredo, Marcelo G. S. Bruno:
Distributed Particle Filters for State Tracking on the Stiefel Manifold Using Tangent Space Statistics. 5488-5492 - Baocheng Geng, Qunwei Li, Pramod K. Varshney:
Human Decision Making with Bounded Rationality. 5493-5497 - Fernando Gama, Nicolas Zilberstein, Richard G. Baraniuk, Santiago Segarra:
Unrolling Particles: Unsupervised Learning of Sampling Distributions. 5498-5502 - Qing Li, Jiaming Liang, Simon J. Godsill:
Scalable Data Association and Multi-Target Tracking Under a Poisson Mixture Measurement Process. 5503-5507 - Asher A. Hensley, Petar M. Djuric:
Online Learning for Latent Yule-Simon Processes. 5508-5512 - Charles-Gérard Lucas, Patrice Abry, Herwig Wendt, Gustavo Didier:
Counting the Number of Different Scaling Exponents in Multivariate Scale-Free Dynamics: Clustering by Bootstrap in the Wavelet Domain. 5513-5517 - Peter Neuhaus, Nir Shlezinger, Meik Dörpinghaus, Yonina C. Eldar, Gerhard P. Fettweis:
On the Acquisition of Stationary Signals Using Uniform ADCs. 5518-5522 - Yang Sun, Jonathan Scarlett:
Data-Driven Algorithms for Gaussian Measurement Matrix Design in Compressive Sensing. 5523-5527 - Michael Perlmutter, Jieqian He, Matthew J. Hirn:
Scattering Statistics of Generalized Spatial Poisson Point Processes. 5528-5532 - Ruturaj G. Gavaskar, Kunal N. Chaudhury:
Regularization Using Denoising: Exact and Robust Signal Recovery. 5533-5537 - Hiroki Kuroda, Daichi Kitahara:
Graph-Structured Sparse Regularization Via Convex Optimization. 5538-5542 - Songtao Lu, Xiaodong Cui, Mark S. Squillante, Brian Kingsbury, Lior Horesh:
Decentralized Bilevel Optimization for Personalized Client Learning. 5543-5547 - Mingjie Shao, Qi Dai, Wing-Kin Ma:
Extreme-Point Pursuit for Unit-Modulus Optimization. 5548-5552 - Sebastian Ament, Carla P. Gomes:
Generalized Matching Pursuits for the Sparse Optimization of Separable Objectives. 5553-5557 - Andrew D. McRae, Austin Xu, Jihui Jin, Namrata Nadagouda, Nauman Ahad, Peimeng Guan, Santhosh Karnik, Mark A. Davenport:
Delta Distancing: A Lifting Approach to Localizing Items from User Comparisons. 5558-5562 - Yunhe Li, Yaochen Hu, Yingxue Zhang:
Dual Path Graph Convolutional Networks. 5563-5567 - Hoang-Son Nguyen, Yiran He, Hoi-To Wai:
On the Stability of Low Pass Graph Filter with a Large Number of Edge Rewires. 5568-5572 - Zida Cheng, Siheng Chen, Ya Zhang:
Spatio-Temporal Graph Complementary Scattering Networks. 5573-5577 - Elvin Isufi, Maosheng Yang:
Convolutional Filtering in Simplicial Complexes. 5578-5582 - Arun Venkitaraman, Pascal Frossard:
Annihilation Filter Approach for Estimating Graph Dynamics from Diffusion Processes. 5583-5587 - Lili Zheng, Genevera I. Allen:
Learning Gaussian Graphical Models with Differing Pairwise Sample Sizes. 5588-5592 - Ahmed Ali Abbasi, Abiy Tasissa, Shuchin Aeron:
r-Local Unlabeled Sensing: Improved Algorithm and Applications. 5593-5597 - Praneeth Narayanamurthy, Namrata Vaswani, Aditya Ramamoorthy:
Federated Over-Air Robust Subspace Tracking from Missing Data. 5598-5602 - Rahul Parhi, Robert D. Nowak:
On Continuous-Domain Inverse Problems with Sparse Superpositions of Decaying Sinusoids as Solutions. 5603-5607 - Hongyi Pan, Diaa Badawi, Runxuan Miao, Erdem Koyuncu, Ahmet Enis Çetin:
Multiplication-Avoiding Variant of Power Iteration with Applications. 5608-5612 - Pol del Aguila Pla, Michael Unser:
Bona Fide Riesz Projections for Density Estimation. 5613-5616 - Amir Weiss, Everest W. Huang, Or Ordentlich, Gregory W. Wornell:
Blind Modulo Analog-to-Digital Conversion of Vector Processes. 5617-5621 - Edwin Vargas, Kumar Vijay Mishra, Roman Jacome, Brian M. Sadler, Henry Arguello:
Joint Radar-Communications Processing from A Dual-Blind Deconvolution Perspective. 5622-5626 - Sibylle Marcotte, Amélie Barbe, Rémi Gribonval, Titouan Vayer, Marc Sebban, Pierre Borgnat, Paulo Gonçalves:
Fast Multiscale Diffusion On Graphs. 5627-5631 - Hao Liang, Xinghao Ding, Andreas Jakobsson, Xiaotong Tu, Yue Huang:
Adaptive Variational Nonlinear Chirp Mode Decomposition. 5632-5636 - Abijith Jagannath Kamath, Chandra Sekhar Seelamantula:
Differentiate-and-Fire Time-Encoding of Finite-Rate-of-Innovation Signals. 5637-5641 - Koki Yamada, Yuichi Tanaka:
Graph Learning Information Criterion. 5642-5646 - Alexander Tong, Guillaume Huguet, Dennis L. Shung, Amine Natik, Manik Kuchroo, Guillaume Lajoie, Guy Wolf, Smita Krishnaswamy:
Embedding Signals on Graphs with Unbalanced Diffusion Earth Mover's Distance. 5647-5651 - Lukas Wielandner, Erik Leitinger, Florian Meyer, Bryan Teague, Klaus Witrisal:
Message Passing-Based Cooperative Localization with Embedded Particle Flow. 5652-5656 - Maxime Ferreira Da Costa, Urbashi Mitra:
A Framework for Private Communication with Secret Block Structure. 5657-5661 - Jacob Benesty, Constantin Paleologu, Silviu Ciochina, Eduardo Vinicius Kuhn, Khaled Jamal Bakri, Rui Seara:
LMS and NLMS Algorithms for the Identification of Impulse Responses with Intrinsic Symmetric or Antisymmetric Properties. 5662-5666 - Roula Nassif, Virginia Bordignon, Stefan Vlaski, Ali H. Sayed:
Decentralized Learning in the Presence of Low-Rank Noise. 5667-5671 - Marco Carpentiero, Vincenzo Matta, Ali H. Sayed:
Adaptive Diffusion with Compressed Communication. 5672-5676 - Yiran He, Hoi-To Wai:
Joint Centrality Estimation and Graph Identification from Mixture of Low Pass Graph Signals. 5677-5681 - Oyku Deniz Kose, Yanning Shen:
Fairness-Aware Selective Sampling on Attributed Graphs. 5682-5686 - Dingyi Zeng, Li Zhou, Wanlong Liu, Hong Qu, Wenyu Chen:
A Simple Graph Neural Network via Layer Sniffer. 5687-5691 - Prakash B. Gohain, Magnus Jansson:
New Improved Criterion for Model Selection in Sparse High-Dimensional Linear Regression Models. 5692-5696 - Antoine Collas, Florent Bouchard, Guillaume Ginolhac, Arnaud Breloy, Chengfang Ren, Jean Philippe Ovarlez:
On the Use of Geodesic Triangles between Gaussian Distributions for Classification Problems. 5697-5701 - Mewe-Hezoudah Kahanam, Laurent Le Brusquet, Ségolène Martin, Jean-Christophe Pesquet:
A Non-Convex Proximal Approach for Centroid-Based Classification. 5702-5706 - Zhenyu Wei, Raymond K. W. Wong, Thomas C. M. Lee:
Extending the Use of MDL for High-Dimensional Problems: Variable Selection, Robust Fitting, and Additive Modeling. 5707-5711 - Roberto Pereira, Xavier Mestre, David Gregoratti:
Clustering Complex Subspaces in Large Dimensions. 5712-5716 - Pierre Houdouin, Andrew Wang, Matthieu Jonckheere, Frédéric Pascal:
Robust Classification with Flexible Discriminant Analysis in Heterogeneous Data. 5717-5721 - Eyar Azar, Satish Mulleti, Yonina C. Eldar:
Residual Recovery Algorithm for Modulo Sampling. 5722-5726 - Adeem Aslam, Zubair Khalid:
Operator Formulation for Linear Transformations and Signal Estimation in the Joint Spatial-Slepian Domain. 5727-5731 - Junya Hara, Yuichi Tanaka:
Sampling Set Selection for Graph Signals under Arbitrary Signal Priors. 5732-5736 - David Svedberg, Filip Elvander, Andreas Jakobsson:
Determining Joint Periodicities in Multi-Time Data with Sampling Uncertainties. 5737-5741 - Dorian Florescu, Ayush Bhandari:
Unlimited Sampling with Local Averages. 5742-5746 - Dorian Florescu, Ayush Bhandari:
Modulo Event-Driven Sampling: System Identification and Hardware Experiments. 5747-5751 - Petr Tichavský, Ondrej Straka, Jindrich Duník:
Point-Mass Filter with Decomposition of Transient Density. 5752-5756 - Xinhui Rong, Victor Solo:
Cramer-Rao Bound for the Time-Varying Poisson. 5757-5761 - Nadav E. Rosenthal, Joseph Tabrikian:
Model Selection via Misspecified Cramér-Rao Bound Minimization. 5762-5766 - Yair Sorek, Koby Todros:
Robust Parameter Estimation Based on the K-Divergence. 5767-5771 - Nora Ouzir, Jean-Christophe Pesquet, Frédéric Pascal:
A Convex Formulation for the Robust Estimation of Multivariate Exponential Power Models. 5772-5776 - Runze Gan, Simon J. Godsill:
Conditionally Factorized Variational Bayes with Importance Sampling. 5777-5781 - Pierre Develter, Jonathan Bosse, Olivier Rabaste, Philippe Forster, Jean Philippe Ovarlez:
On the False Alarm Probability of the Normalized Matched Filter for Off-Grid Target Detection. 5782-5786 - Yuxing Yang, Zeyu Fu, Syed Mohsen Naqvi:
A Two-Stream Information Fusion Approach to Abnormal Event Detection in Video. 5787-5791 - Marc Vilà, Jaume Riba:
A Test for Conditional Correlation Between Random Vectors Based on Weighted U-Statistics. 5792-5796 - Sophia Sulis, David Mary, Lionel Bigot:
Semi-Supervised Standardized Detection of Periodic Signals with Application to Exoplanet Detection. 5797-5801 - Sara ElBouch, Olivier J. J. Michel, Pierre Comon:
Joint Normality Test Via Two-Dimensional Projection. 5802-5806 - Yuchen Liang, Venugopal V. Veeravalli:
Quickest Detection of Composite and Non-Stationary Changes with Application to Pandemic Monitoring. 5807-5811 - Payam Shahsavari Baboukani, Sergios Theodoridis, Jan Østergaard:
A Stimuli-Relevant Directed Dependency Index for Time Series. 5812-5816 - Samuel Rey, Andrei Buciulea, Madeline Navarro, Santiago Segarra, Antonio G. Marques:
Joint Inference of Multiple Graphs with Hidden Variables from Stationary Graph Signals. 5817-5821 - Jitendra K. Tugnait:
Sparse-Group Log-Sum Penalized Graphical Model Learning For Time Series. 5822-5826 - Xingchao Jian, Wee Peng Tay:
Wide-Sense Stationarity and Spectral Estimation for Generalized Graph Signal. 5827-5831 - Michael Scholkemper, Michael T. Schaub:
Blind Extraction of Equitable Partitions from Graph Signals. 5832-5836 - Sravanthi Gurugubelli, Sundeep Prabhakar Chepuri:
Learning Sparse Graphs with a Core-Periphery Structure. 5837-5841 - Ping Hu, Virginia Bordignon, Stefan Vlaski, Ali H. Sayed:
Optimal Combination Policies for Adaptive Social Learning. 5842-5846 - Patitapaban Palo, Aurobinda Routray:
Seismic Fault Identification Using Graph High-Frequency Components as Input to Graph Convolutional Network. 5847-5851 - Isabela Cunha Maia Nobre, Mireille El Gheche, Pascal Frossard:
Distributed Graph Learning With Smooth Data Priors. 5852-5856 - Jiayu Li, Tianyun Zhang, Shengmin Jin, Makan Fardad, Reza Zafarani:
AdverSparse: An Adversarial Attack Framework for Deep Spatial-Temporal Graph Neural Networks. 5857-5861 - Masatoshi Nagahama, Yuichi Tanaka:
Multimodal Graph Signal Denoising Via Twofold Graph Smoothness Regularization with Deep Algorithm Unrolling. 5862-5866 - Xiaolong Xu, Lingjuan Lyu, Hong Jin, Weiqiang Wang, Shuo Jia:
Heterogeneous Graph Node Classification With Multi-Hops Relation Features. 5867-5871 - Patrick L. Combettes, Zev Woodstock:
Signal Recovery from Inconsistent Nonlinear Observations. 5872-5876 - Renke Wang, Roxana Alexandru, Pier Luigi Dragotti:
Perfect Reconstruction of Classes of Non-Bandlimited Signals from Projections with Unknown Angles. 5877-5881 - Cheng Cheng, Wei Dai:
Short-and-Sparse Deconvolution Via Rank-One Constrained Optimization (ROCO). 5882-5886 - Arie Yeredor:
Blind Equalization of Moving Average Channels Over Galois Fields. 5887-5891 - Le Trung Thanh, Karim Abed-Meraim, Adel Hafiane, Nguyen Linh Trung:
Sparse Subspace Tracking in High Dimensions. 5892-5896 - Kunal Pattanayak, Vikram Krishnamurthy, Christopher Berry:
How Can a Cognitive Radar Mask its Cognition? 5897-5901 - Xiaoyong Ni, Guy Revach, Nir Shlezinger, Ruud J. G. van Sloun, Yonina C. Eldar:
RTSNet: Deep Learning Aided Kalman Smoothing. 5902-5906 - Ye'Ela Shalit, Ran Weber, Asaf Abas, Shay Kreymer, Tamir Bendory:
Generalized Autocorrelation Analysis for Multi-Target Detection. 5907-5911 - Kostas Tsampourakis, Víctor Elvira:
Approximating The Likelihood Ratio in Linear-Gaussian State-Space Models for Change Detection. 5912-5916 - Bishwadeep Das, Elvin Isufi:
Learning Expanding Graphs for Signal Interpolation. 5917-5921 - T. Mitchell Roddenberry, Florian Frantzen, Michael T. Schaub, Santiago Segarra:
Hodgelets: Localized Spectral Representations of Flows On Simplicial Complexes. 5922-5926 - Wenwei Liu, Hui Feng, Kaixuan Wang, Feng Ji, Bo Hu:
Recovery of Graph Signals From Sign Measurements. 5927-5931 - Kenta Yanagiya, Koki Yamada, Yasuo Katsuhara, Tomoya Takatani, Yuichi Tanaka:
Edge Sampling of Graphs Based on Edge Smoothness. 5932-5936 - Darukeesan Pakiyarajah, Chamira U. S. Edussooriya:
WLS Design of ARMA Graph Filters Using Iterative Second-Order Cone Programming. 5937-5941 - Chinthaka Dinesh, Saghar Bagheri, Gene Cheung, Ivan V. Bajic:
Linear-Time Sampling on Signed Graphs Via Gershgorin Disc Perfect Alignment. 5942-5946 - Harlin Lee, Andrea L. Bertozzi, Jelena Kovacevic, Yuejie Chi:
Privacy-Preserving Federated Multi-Task Linear Regression: A One-Shot Linear Mixing Approach Inspired By Graph Regularization. 5947-5951 - Sarit Khirirat, Sindri Magnússon, Mikael Johansson:
Eco-FedSplit: Federated Learning with Error-Compensated Compression. 5952-5956 - Karen Adam:
A Time Encoding Approach to Training Spiking Neural Networks. 5957-5961 - Wei Gao, Jie Chen, Cédric Richard, Wentao Shi, Qunfei Zhang:
Transient Analysis of Clustered Multitask Diffusion RLS Algorithm. 5962-5966 - Martin Gölz, Abdelhak M. Zoubir, Visa Koivunen:
Improving Inference for Spatial Signals by Contextual False Discovery Rates. 5967-5971 - Morad Halihal, Tirza Routtenberg:
Estimation of the Admittance Matrix in Power Systems Under Laplacian and Physical Constraints. 5972-5976 - Junjie Yang, Claude Delpha:
Incipient Fault Severity Estimation Using Local Mahalanobis Distance. 5977-5981 - Yifan Wu, Michael B. Wakin, Peter Gerstoft:
Gridless DOA Estimation Under the Multi-Frequency Model. 5982-5986 - Meiby Ortiz-Bouza, Selin Aviyente:
Orthogonal Nonnegative Matrix Tri-Factorization for Community Detection in Multiplex Networks. 5987-5991 - Éric Grivel:
Studying Three Families of Divergences to Compare Wide-Sense Stationary Gaussian ARMA Processes. 5992-5996 - Hongjian Xiao, Theerasak Chanwimalueang, Danilo P. Mandic:
Multivariate Multiscale Cosine Similarity Entropy. 5997-6001 - Erik Berglund, Sarit Khirirat, Xiaoyu Wang:
Zeroth-Order Randomized Subspace Newton Methods. 6002-6006 - Dionysios S. Kalogerias:
Fast and Stable Convergence of Online SGD for CV@R-Based Risk-Aware Learning. 6007-6011 - Amrutha Varshini Ramesh, Mojtaba Soltanalian:
Deep Initialization for Guaranteed Unimodular Quadratic Programming. 6012-6016 - Yixuan Zhang, Zhuo Chen, Jian Wu, Takuya Yoshioka, Peidong Wang, Zhong Meng, Jinyu Li:
Continuous Speech Separation with Recurrent Selective Attention Network. 6017-6021 - Thilo von Neumann, Keisuke Kinoshita, Christoph Böddeker, Marc Delcroix, Reinhold Haeb-Umbach:
SA-SDR: A Novel Loss Function for Separation of Meeting Style Data. 6022-6026 - Takuya Yoshioka, Xiaofei Wang, Dongmei Wang, Min Tang, Zirun Zhu, Zhuo Chen, Naoyuki Kanda:
VarArray: Array-Geometry-Agnostic Continuous Speech Separation. 6027-6031 - Zhuohuang Zhang, Takuya Yoshioka, Naoyuki Kanda, Zhuo Chen, Xiaofei Wang, Dongmei Wang, Sefik Emre Eskimez:
All-Neural Beamformer for Continuous Speech Separation. 6032-6036 - Kai Wang, Yizhou Peng, Hao Huang, Ying Hu, Sheng Li:
Mining Hard Samples Locally And Globally For Improved Speech Separation. 6037-6041 - Guinan Li, Jianwei Yu, Jiajun Deng, Xunying Liu, Helen Meng:
Audio-Visual Multi-Channel Speech Separation, Dereverberation and Recognition. 6042-6046 - Otavio Braga, Olivier Siohan:
Best of Both Worlds: Multi-Task Audio-Visual Automatic Speech Recognition and Active Speaker Detection. 6047-6051 - Junqi Chen, Mou Wang, Xiao-Lei Zhang, Zhiyong Huang, Susanto Rahardja:
End-To-End Multi-Modal Speech Recognition with Air and Bone Conducted Speech. 6052-6056 - Rohit Kumar, Anurenjan Purushothaman, Anirudh Sreeram, Sriram Ganapathy:
End-To-End Speech Recognition with Joint Dereverberation of Sub-Band Autoregressive Envelopes. 6057-6061 - Heming Wang, Yao Qian, Xiaofei Wang, Yiming Wang, Chengyi Wang, Shujie Liu, Takuya Yoshioka, Jinyu Li, DeLiang Wang:
Improving Noise Robustness of Contrastive Speech Representation Learning with Speech Reconstruction. 6062-6066 - Yiwen Shao, Shi-Xiong Zhang, Dong Yu:
Multi-Channel Multi-Speaker ASR Using 3D Spatial Feature. 6067-6071 - Jianhao Ye, Hongbin Zhou, Zhiba Su, Wendi He, Kaimeng Ren, Lin Li, Heng Lu:
Improving Cross-Lingual Speech Synthesis with Triplet Training Scheme. 6072-6076 - Manish Sharma, Yizhi Hong, Emily Kaplan, Siamak Tazari, Rob Clark:
Improving Phonetic Realizations in its by Using Phoneme-Aligned Graphemes. 6077-6081 - Tao Wang, Jiangyan Yi, Liqun Deng, Ruibo Fu, Jianhua Tao, Zhengqi Wen:
Context-Aware Mask Prediction Network for End-to-End Text-Based Speech Editing. 6082-6086 - Guangyan Zhang, Yichong Leng, Daxin Tan, Ying Qin, Kaitao Song, Xu Tan, Sheng Zhao, Tan Lee:
A Study on the Efficacy of Model Pre-Training In Developing Neural Text-to-Speech System. 6087-6091 - Rohan Badlani, Adrian Lancucki, Kevin J. Shih, Rafael Valle, Wei Ping, Bryan Catanzaro:
One TTS Alignment to Rule Them All. 6092-6096 - Hao Zhang, You-Chi Cheng, Shankar Kumar, W. Ronny Huang, Mingqing Chen, Rajiv Mathews:
Capitalization Normalization for Language Modeling with an Accurate and Efficient Hierarchical RNN Model. 6097-6101 - Minguang Song, Yunxin Zhao:
Enhance RNNLMs with Hierarchical Multi-Task Learning for ASR. 6102-6106 - Antoine Bruguier, Duc Le, Rohit Prabhavalkar, Dangna Li, Zhe Liu, Bo Wang, Eun Chang, Fuchun Peng, Ozlem Kalinli, Michael L. Seltzer:
Neural-FST Class Language Model for End-to-End Speech Recognition. 6107-6111 - Lingfeng Dai, Lu Chen, Zhikai Zhou, Kai Yu:
LatticeBART: Lattice-to-Lattice Pre-Training for Speech Recognition. 6112-6116 - Liyan Xu, Yile Gu, Jari Kolehmainen, Haidar Khan, Ankur Gandhe, Ariya Rastrow, Andreas Stolcke, Ivan Bulyko:
RescoreBERT: Discriminative Speech Recognition Rescoring With BERT. 6117-6121 - Sreeja Manghat, Sreeram Manghat, Tanja Schultz:
Hybrid Sub-Word Segmentation for Handling Long Tail in Morphologically Rich Low Resource Languages. 6122-6126 - Mufan Sang, Haoqi Li, Fang Liu, Andrew O. Arnold, Li Wan:
Self-Supervised Speaker Verification with Simple Siamese Network and Self-Supervised Regularization. 6127-6131 - Metehan Cekic, Ruirui Li, Zeya Chen, Yuguang Yang, Andreas Stolcke, Upamanyu Madhow:
Self-Supervised Speaker Recognition Training using Human-Machine Dialogues. 6132-6136 - Shehzeen Hussain, Van Nguyen, Shuhua Zhang, Erik Visser:
Multi-Task Voice Activated Framework Using Self-Supervised Learning. 6137-6141 - Ruijie Tao, Kong Aik Lee, Rohan Kumar Das, Ville Hautamäki, Haizhou Li:
Self-Supervised Speaker Recognition with Loss-Gated Learning. 6142-6146 - Zhengyang Chen, Sanyuan Chen, Yu Wu, Yao Qian, Chengyi Wang, Shujie Liu, Yanmin Qian, Michael Zeng:
Large-Scale Self-Supervised Speech Representation Learning for Automatic Speaker Verification. 6147-6151 - Sanyuan Chen, Yu Wu, Chengyi Wang, Zhengyang Chen, Zhuo Chen, Shujie Liu, Jian Wu, Yao Qian, Furu Wei, Jinyu Li, Xiangzhan Yu:
UniSpeech-SAT: Universal Speech Representation Learning With Speaker Aware Pre-Training. 6152-6156 - Hannes Kath, Simon Stone, Stefan Rapp, Peter Birkholz:
Carina - A Corpus of Aligned German Read Speech Including Annotations. 6157-6161 - Chunxi Liu, Michael Picheny, Leda Sari, Pooja Chitkara, Alex Xiao, Xiaohui Zhang, Mark Chou, Andres Alvarado, Caner Hazirbas, Yatharth Saraf:
Towards Measuring Fairness in Speech Recognition: Casual Conversations Dataset Transcriptions. 6162-6166 - Fan Yu, Shiliang Zhang, Yihui Fu, Lei Xie, Siqi Zheng, Zhihao Du, Weilong Huang, Pengcheng Guo, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
M2Met: The ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. 6167-6171 - Vikram Gupta, Rini A. Sharon, Ramit Sawhney, Debdoot Mukherjee:
ADIMA: Abuse Detection In Multilingual Audio. 6172-6176 - Zixiu Wu, Simone Balloccu, Vivek Kumar, Rim Helaoui, Ehud Reiter, Diego Reforgiato Recupero, Daniele Riboni:
Anno-MI: A Dataset of Expert-Annotated Counselling Dialogues. 6177-6181 - Binbin Zhang, Hang Lv, Pengcheng Guo, Qijie Shao, Chao Yang, Lei Xie, Xin Xu, Hui Bu, Xiaoyu Chen, Chenchen Zeng, Di Wu, Zhendong Peng:
WENETSPEECH: A 10000+ Hours Multi-Domain Mandarin Corpus for Speech Recognition. 6182-6186 - Gustavo Teodoro Döhler Beck, Ulme Wennberg, Zofia Malisz, Gustav Eje Henter:
Wavebender GAN: An Architecture for Phonetically Meaningful Speech Manipulation. 6187-6191 - Sang-Hoon Lee, Ji-Hoon Kim, Kangeun Lee, Seong-Whan Lee:
FRE-GAN 2: Fast and Efficient Frequency-Consistent Audio Synthesis. 6192-6196 - Chendong Zhao, Jianzong Wang, Xiaoyang Qu, Haoqian Wang, Jing Xiao:
r-G2P: Evaluating and Enhancing Robustness of Grapheme to Phoneme Conversion by Controlled Noise Introducing and Contextual Information Incorporation. 6197-6201 - Lu Dong, Zhiqiang Guo, Chao-Hong Tan, Ya-Jun Hu, Yuan Jiang, Zhen-Hua Ling:
Neural Grapheme-To-Phoneme Conversion with Pre-Trained Grapheme Models. 6202-6206 - Takuhiro Kaneko, Kou Tanaka, Hirokazu Kameoka, Shogo Seki:
ISTFTNET: Fast and Lightweight Mel-Spectrogram Vocoder Incorporating Inverse Short-Time Fourier Transform. 6207-6211 - Tomoki Kobayashi, Tomoro Tanaka, Kohei Yatabe, Yasuhiro Oikawa:
Acoustic Application of Phase Reconstruction Algorithms in Optics. 6212-6216 - Yukun Ma, Trung Hieu Nguyen, Bin Ma:
CPT: Cross-Modal Prefix-Tuning for Speech-To-Text Translation. 6217-6221 - Tu Anh Dinh, Danni Liu, Jan Niehues:
Tackling Data Scarcity in Speech Translation Using Zero-Shot Multilingual Machine Translation Techniques. 6222-6226 - Jeong-Uk Bang, Min-Kyu Lee, Seung Yun, Sang-Hun Kim:
Improving End-To-End Speech Translation Model with BERT-Based Contextual Information. 6227-6231 - Linlin Zhang, Zhirui Zhang, Boxing Chen, Weihua Luo, Luo Si:
Context-Adaptive Document-Level Neural Machine Translation. 6232-6236 - Takatomo Kano, Atsunori Ogawa, Marc Delcroix, Shinji Watanabe:
Integrating Multiple ASR Systems into NLP Backend with Attention Fusion. 6237-6241 - Surafel Melaku Lakew, Yogesh Virkar, Prashant Mathur, Marcello Federico:
ISOMETRIC MT: Neural Machine Translation for Automatic Dubbing. 6242-6246 - Ying Shen, Huiyu Yang, Lin Lin:
Automatic Depression Detection: An Emotional Audio-Textual Corpus and a GRU/BiLSTM-Based Model. 6247-6251 - Nadee Seneviratne, Carol Y. Espy-Wilson:
Multimodal Depression Classification using Articulatory Coordination Features and Hierarchical Attention Based Text Embeddings. 6252-6256 - Rawan Alsarrani, Anna Esposito, Alessandro Vinciarelli:
Thin Slices of Depression: Improving Depression Detection Performance Through Data Segmentation. 6257-6261 - Wen Wu, Mengyue Wu, Kai Yu:
Climate and Weather: Inspecting Depression Detection via Emotion Recognition. 6262-6266 - Vijay Ravi, Jinhan Wang, Jonathan Flint, Abeer Alwan:
Fraug: A Frame Rate Based Data Augmentation Method for Depression Detection from Speech Signals. 6267-6271 - Suhas BN, Saeed Abdullah:
Privacy Sensitive Speech Analysis Using Federated Learning to Assess Depression. 6272-6276 - Zhaoxu Nian, Jun Du, Yu Ting Yeung, Renyu Wang:
A Time Domain Progressive Learning Approach with SNR Constriction for Single-Channel Speech Enhancement and Recognition. 6277-6281 - Iuliia Nigmatulina, Juan Zuluaga-Gomez, Amrutha Prasad, Seyyed Saeed Sarfjoo, Petr Motlícek:
A Two-Step Approach to Leverage Contextual Data: Speech Recognition in Air-Traffic Communications. 6282-6286 - Hiroshi Sato, Tsubasa Ochiai, Marc Delcroix, Keisuke Kinoshita, Naoyuki Kamo, Takafumi Moriya:
Learning to Enhance or Not: Neural Network-Based Switching of Enhanced and Observed Signals for Overlapping Speech Recognition. 6287-6291 - Yuchen Hu, Nana Hou, Chen Chen, Eng Siong Chng:
Interactive Feature Fusion for End-to-End Noise-Robust Speech Recognition. 6292-6296 - Catalin Zorila, Rama Doddipatla:
Speaker Reinforcement Using Target Source Extraction for Robust Automatic Speech Recognition. 6297-6301 - Chao-Han Huck Yang, Zeeshan Ahmed, Yile Gu, Joseph Szurley, Roger Ren, Linda Liu, Andreas Stolcke, Ivan Bulyko:
Mitigating Closed-Model Adversarial Examples with Bayesian Neural Modeling for Enhanced End-to-End Speech Recognition. 6302-6306 - Songxiang Liu, Shan Yang, Dan Su, Dong Yu:
Referee: Towards Reference-Free Cross-Speaker Style Transfer with Low-Quality Data for Expressive Speech Synthesis. 6307-6311 - Ji-Hyun Lee, Sang-Hoon Lee, Ji-Hoon Kim, Seong-Whan Lee:
PVAE-TTS: Adaptive Text-to-Speech via Progressive Style Adaptation. 6312-6316 - Chae-Bin Im, Sang-Hoon Lee, Seung-Bin Kim, Seong-Whan Lee:
EMOQ-TTS: Emotion Intensity Quantization for Fine-Grained Controllable Emotional Text-to-Speech. 6317-6321 - Kaili Zhang, Cheng Gong, Wenhuan Lu, Longbiao Wang, Jianguo Wei, Dawei Liu:
Joint and Adversarial Training with ASR for Expressive Speech Synthesis. 6322-6326 - Qinghua Wu, Quanbo Shen, Jian Luan, Yujun Wang:
MSDTRON: A High-Capability Multi-Speaker Speech Synthesis System for Diverse Data Using Characteristic Information. 6327-6331 - Chak Ho Chan, Kaizhi Qian, Yang Zhang, Mark Hasegawa-Johnson:
SpeechSplit2.0: Unsupervised Speech Disentanglement for Voice Conversion without Tuning Autoencoder Bottlenecks. 6332-6336 - Shiyao Cui, Xin Cong, Bowen Yu, Tingwen Liu, Yucheng Wang, Jinqiao Shi:
Document-Level Event Extraction via Human-Like Reading Process. 6337-6341 - Jinghui Si, Xutan Peng, Chen Li, Haotian Xu, Jianxin Li:
Generating Disentangled Arguments with Prompts: A Simple Event Extraction Framework That Works. 6342-6346 - Jingcong Tao, Youcheng Pan, Xinyu Li, Baotian Hu, Weihua Peng, Cuiyun Han, Xiaolong Wang:
Multi-Role Event Argument Extraction as Machine Reading Comprehension with Argument Match Optimization. 6347-6351 - Jia Li, Yunyan Zhang, Yifan Yang, Zhicheng An, Yefeng Zheng:
BNU: A Balance-Normalization-Uncertainty Model for Incremental Event Detection. 6352-6356 - Yongxiu Xu, Chuan Zhou, Heyan Huang, Jing Yu, Yue Hu:
Wlinker: Modeling Relational Triplet Extraction As Word Linking. 6357-6361 - Xiaobin Zhang, Liangjun Zang, Peng Cheng, Yuqi Wang, Songlin Hu:
A Knowledge/Data Enhanced Method for Joint Event and Temporal Relation Extraction. 6362-6366 - Jee-weon Jung, Hee-Soo Heo, Hemlata Tak, Hye-jin Shim, Joon Son Chung, Bong-Jin Lee, Ha-Jin Yu, Nicholas W. D. Evans:
AASIST: Audio Anti-Spoofing Using Integrated Spectro-Temporal Graph Attention Networks. 6367-6371 - Xin Wang, Junichi Yamagishi:
Estimating the Confidence of Speech Spoofing Countermeasure. 6372-6376 - Zhenchun Lei, Hui Yang, Changhong Liu, Minglei Ma, Yingen Yang:
Two-Path GMM-ResNet and GMM-SENet for ASV Spoofing Detection. 6377-6381 - Hemlata Tak, Madhu R. Kamble, Jose Patino, Massimiliano Todisco, Nicholas W. D. Evans:
Rawboost: A Raw Data Boosting and Augmentation Method Applied to Automatic Speaker Verification Anti-Spoofing. 6382-6386 - Wanying Ge, Jose Patino, Massimiliano Todisco, Nicholas W. D. Evans:
Explaining Deep Learning Models for Spoofing and Deepfake Detection with Shapley Additive Explanations. 6387-6391 - Yichuan Mo, Shilin Wang:
Multi-Task Learning Improves Synthetic Speech Detection. 6392-6396 - Bo Li, Ruoming Pang, Yu Zhang, Tara N. Sainath, Trevor Strohman, Parisa Haghani, Yun Zhu, Brian Farris, Neeraj Gaur, Manasa Prasad:
Massively Multilingual ASR: A Lifelong Learning Solution. 6397-6401 - Junwen Bai, Bo Li, Yu Zhang, Ankur Bapna, Nikhil Siddhartha, Khe Chai Sim, Tara N. Sainath:
Joint Unsupervised and Supervised Training for Multilingual ASR. 6402-6406 - Neeraj Gaur, Tongzhou Chen, Ehsan Variani, Parisa Haghani, Bhuvana Ramabhadran, Pedro J. Moreno:
Multilingual Second-Pass Rescoring for Automatic Speech Recognition Systems. 6407-6411 - Brian Yan, Chunlei Zhang, Meng Yu, Shi-Xiong Zhang, Siddharth Dalmia, Dan Berrebbi, Chao Weng, Shinji Watanabe, Dong Yu:
Joint Modeling of Code-Switched and Monolingual ASR via Conditional Factorization. 6412-6416 - Liuhui Deng, Roger Hsiao, Arnab Ghoshal:
Bilingual End-to-End ASR with Byte-Level Subwords. 6417-6421 - Long Zhou, Jinyu Li, Eric Sun, Shujie Liu:
A Configurable Multilingual Model is All You Need to Recognize All Languages. 6422-6426 - Yuan Gao, Shogo Okada, Longbiao Wang, Jiaxing Liu, Jianwu Dang:
Domain-Invariant Feature Learning for Cross Corpus Speech Emotion Recognition. 6427-6431 - Yaodong Song, Jiaxing Liu, Longbiao Wang, Ruiguo Yu, Jianwu Dang:
Multi-Stage Graph Representation Learning for Dialogue-Level Speech Emotion Recognition. 6432-6436 - Wenjing Zhu, Xiang Li:
Speech Emotion Recognition with Global-Aware Fusion on Multi-Scale Feature Representation. 6437-6441 - Sundararajan Srinivasan, Zhaocheng Huang, Katrin Kirchhoff:
Representation Learning Through Cross-Modal Conditional Teacher-Student Training For Speech Emotion Recognition. 6442-6446 - Seong-Gyun Leem, Daniel Fulford, Jukka-Pekka Onnela, David Gard, Carlos Busso:
Not All Features are Equal: Selection of Robust Features for Speech Emotion Recognition in Noisy Environments. 6447-6451 - Sneha Das, Nicole Nadine Lønfeldt, Anne Katrine Pagsberg, Line H. Clemmensen:
Towards Transferable Speech Emotion Representation: On Loss Functions for Cross-Lingual Latent Representations. 6452-6456 - Zhengyan Sheng, Zhiqiang Guo, Xin Li, Yunxia Li, Zhenhua Ling:
Dementia Detection by Fusing Speech and Eye-Tracking Representation. 6457-6461 - Youxiang Zhu, Bang Tran, Xiaohui Liang, John A. Batsis, Robert M. Roth:
Towards Interpretability of Speech Pause in Dementia Detection Using Adversarial Learning. 6462-6466 - Mercedes Vetráb, José Vicente Egas López, Réka Balogh, Nóra Imre, Ildikó Hoffmann, László Tóth, Magdolna Pákáski, János Kálmán, Gábor Gosztolya:
Using Spectral Sequence-to-Sequence Autoencoders to Assess Mild Cognitive Impairment. 6467-6471 - Ayimnisagul Ablimit, Catarina Botelho, Alberto Abad, Tanja Schultz, Isabel Trancoso:
Exploring Dementia Detection from Speech: Cross Corpus Analysis. 6472-6476 - Parvaneh Janbakhshi, Ina Kodrasi:
Experimental Investigation on STFT Phase Representations for Deep Learning-Based Dysarthric Speech Detection. 6477-6481 - Mélanie Jouaiti, Kerstin Dautenhahn:
Dysfluency Classification in Stuttered Speech Using Deep Learning for Real-Time Applications. 6482-6486 - Andong Li, Wenzhe Liu, Chengshi Zheng, Xiaodong Li:
Embedding and Beamforming: All-Neural Causal Beamformer for Multichannel Speech Enhancement. 6487-6491 - Xinmeng Xu, Rongzhi Gu, Yuexian Zou:
Improving Dual-Microphone Speech Enhancement by Learning Cross-Channel Features with Multi-Head Attention. 6492-6496 - Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang:
TPARN: Triple-Path Attentive Recurrent Network for Time-Domain Multichannel Speech Enhancement. 6497-6501 - Ashutosh Pandey, Buye Xu, Anurag Kumar, Jacob Donley, Paul Calamia, DeLiang Wang:
Multichannel Speech Enhancement Without Beamforming. 6502-6506 - Samuele Cornell, Manuel Pariente, François Grondin, Stefano Squartini:
Learning Filterbanks for End-to-End Acoustic Beamforming. 6507-6511 - Minghui Hao, Jingjing Yu, Luyao Zhang:
Spatial-Temporal Graph Convolution Network for Multichannel Speech Enhancement. 6512-6516 - Atsunori Ogawa, Naohiro Tawara, Marc Delcroix, Shoko Araki:
Lattice Rescoring Based on Large Ensemble of Complementary Neural Language Models. 6517-6521 - Hossein Hadian, Arseniy Gorin:
Continual Learning Using Lattice-Free MMI for Speech Recognition. 6522-6526 - Chuan-Fei Zhang, Yan Liu, Tian-Hao Zhang, Song-Lu Chen, Feng Chen, Xu-Cheng Yin:
Non-Autoregressive Transformer with Unified Bidirectional Decoder for Automatic Speech Recognition. 6527-6531 - Zhe Liu, Irina-Elena Veliche, Fuchun Peng:
Model-Based Approach for Measuring the Fairness in ASR. 6532-6536 - Qiujia Li, Yu Zhang, David Qiu, Yanzhang He, Liangliang Cao, Philip C. Woodland:
Improving Confidence Estimation on Out-of-Domain Data for End-to-End Speech Recognition. 6537-6541 - Shubho Sengupta, Vineel Pratap, Awni Y. Hannun:
Parallel Composition of Weighted Finite-State Transducers. 6542-6546 - Ruitong Xiao, Haitong Zhang, Yue Lin:
DGC-Vector: A New Speaker Embedding for Zero-Shot Voice Conversion. 6547-6551 - Wen-Chin Huang, Shu-Wen Yang, Tomoki Hayashi, Hung-Yi Lee, Shinji Watanabe, Tomoki Toda:
S3PRL-VC: Open-Source Voice Conversion Framework with Self-Supervised Speech Representations. 6552-6556 - Trung Dang, Dung N. Tran, Peter Chin, Kazuhito Koishida:
Training Robust Zero-Shot Voice Conversion Models with Self-Supervised Features. 6557-6561 - Benjamin van Niekerk, Marc-André Carbonneau, Julian Zaïdi, Matthew Baas, Hugo Seuté, Herman Kamper:
A Comparison of Discrete and Soft Speech Units for Improved Voice Conversion. 6562-6566 - Haozhe Zhang, Zexin Cai, Xiaoyi Qin, Ming Li:
SIG-VC: A Speaker Information Guided Zero-Shot Voice Conversion System for Both Human Beings and Machines. 6567-6571 - Jiachen Lian, Chunlei Zhang, Dong Yu:
Robust Disentangled Variational Speech Representation Learning for Zero-Shot Voice Conversion. 6572-6576 - Xiangyu Zhao, Longbiao Wang, Jianwu Dang:
Improving Dialogue Generation via Proactively Querying Grounded Knowledge. 6577-6581 - Rongyi Sun, Borun Chen, Qingyu Zhou, Yinghui Li, Yunbo Cao, Hai-Tao Zheng:
A Non-Hierarchical Attention Network with Modality Dropout for Textual Response Generation in Multimodal Dialogue Systems. 6582-6586 - Qi Song, Sheng Li, Ping Wei, Ge Luo, Xinpeng Zhang, Zhenxing Qian:
Joint Learning for Addressee Selection and Response Generation in Multi-Party Conversation. 6587-6591 - Miaoxin Chen, Zibo Lin, Rongyi Sun, Kai Ouyang, Hai-Tao Zheng, Rui Xie, Wei Wu:
Retrieval Enhanced Segment Generation Neural Network for Task-Oriented Dialogue Systems. 6592-6596 - Xiuyi Chen, Feilong Chen, Shuang Xu, Bo Xu:
A Multi Domain Knowledge Enhanced Matching Network for Response Selection in Retrieval-Based Dialogue Systems. 6597-6601 - Yiping Song, Zheng Xie, Jianping Li, Luchen Liu, Ming Zhang, Zhiliang Tian:
Retrieval Bias Aware Ensemble Model for Conditional Sentence Generation. 6602-6606 - Patrick O'Reilly, Pranjal Awasthi, Aravindan Vijayaraghavan, Bryan Pardo:
Effective and Inconspicuous Over-the-Air Adversarial Examples with Adaptive Filtering. 6607-6611 - Ivan Yakovlev, Mikhail Melnikov, Nikita Bukhal, Rostislav Makarov, Alexander Alenin, Nikita Torgashov, Anton Okhotnikov:
LRPD: Large Replay Parallel Dataset. 6612-6616 - Woo Hyun Kang, Jahangir Alam, Abderrahim Fathan:
Robust Self-Supervised Speaker Representation Learning Via Instance Mix Regularization. 6617-6621 - Fuchuan Tong, Siqi Zheng, Min Zhang, Yafeng Chen, Hongbin Suo, Qingyang Hong, Lin Li:
Graph Convolutional Network Based Semi-Supervised Learning on Multi-Speaker Meeting Data. 6622-6626 - Dongseong Hwang, Ananya Misra, Zhouyuan Huo, Nikhil Siddhartha, Shefali Garg, David Qiu, Khe Chai Sim, Trevor Strohman, Françoise Beaufays, Yanzhang He:
Large-Scale ASR Domain Adaptation Using Self- and Semi-Supervised Learning. 6627-6631 - Tsendsuren Munkhdalai, Khe Chai Sim, Angad Chandorkar, Fan Gao, Mason Chua, Trevor Strohman, Françoise Beaufays:
Fast Contextual Adaptation with Neural Associative Memory for On-Device Personalized Speech Recognition. 6632-6636 - Jimmy Tobin, Katrin Tomanek:
Personalized Automatic Speech Recognition Trained on Small Disordered Speech Datasets. 6637-6641 - Namkyu Jung, Geonmin Kim, Joon Son Chung:
Spell My Name: Keyword Boosted Speech Recognition. 6642-6646 - Sameer Khurana, Antoine Laurent, James R. Glass:
Magic Dust for Cross-Lingual Adaptation of Monolingual Wav2vec-2.0. 6647-6651 - Yu Xi, Tian Tan, Wangyou Zhang, Baochen Yang, Kai Yu:
Text Adaptive Detection for Customizable Keyword Spotting. 6652-6656 - Haohan Guo, Zhiping Zhou, Fanbo Meng, Kai Liu:
Improving Adversarial Waveform Generation Based Singing Voice Conversion with Harmonic Signals. 6657-6661 - Ying Zhang, Peng Yang, Jinba Xiao, Ye Bai, Hao Che, Xiaorui Wang:
K-Converter: An Unsupervised Singing Voice Conversion System. 6662-6666 - Yong Zhou, Xiangju Lu:
HiFi-SVC: Fast High Fidelity Cross-Domain Singing Voice Conversion. 6667-6671 - Wen-Chin Huang, Bence Mark Halpern, Lester Phillip Violeta, Odette Scharenborg, Tomoki Toda:
Towards Identity Preserving Normal to Dysarthric Voice Conversion. 6672-6676 - Disong Wang, Songxiang Liu, Xixin Wu, Hui Lu, Lifa Sun, Xunying Liu, Helen Meng:
Speaker Identity Preservation in Dysarthric Speech Reconstruction by Adversarial Speaker Adaptation. 6677-6681 - Yunyun Wang, Jiaqi Su, Adam Finkelstein, Zeyu Jin:
Controllable Speech Representation Learning Via Voice Conversion and AIC Loss. 6682-6686 - Tong Ye, Shijing Si, Jianzong Wang, Rui Wang, Ning Cheng, Jing Xiao:
VU-BERT: A Unified Framework for Visual Dialog. 6687-6691 - Hongru Wang, Huimin Wang, Zezhong Wang, Kam-Fai Wong:
Integrating Pretrained Language Model for Dialogue Policy Evaluation. 6692-6696 - Jianshu Qi, Yuke Si, Longbiao Wang, Jianwu Dang:
Cache: Modeling Contribution-Aware Context Hierarchically for Long-Range Dialogue State Tracking. 6697-6701 - Yik-Cheung Tam, Jiacheng Xu, Jiakai Zou, Zecheng Wang, Tinglong Liao, Shuhan Yuan:
Robust Unstructured Knowledge Access in Conversational Dialogue with ASR Errors. 6702-6706 - Fuzhao Xue, Aixin Sun, Hao Zhang, Jinjie Ni, Eng Siong Chng:
An Embarrassingly Simple Model for Dialogue Relation Extraction. 6707-6711 - Qingqing Zhu, Pengfei Wu, Zhouxing Tan, Jiaxin Duan, Fengyu Lu, Junfei Liu:
A Gaussian Mixture Model for Dialogue Generation with Dynamic Parameter Sharing Strategy. 6712-6716 - Chang Zeng, Xin Wang, Erica Cooper, Xiaoxiao Miao, Junichi Yamagishi:
Attention Back-End for Automatic Speaker Verification with Multiple Enrollment Utterances. 6717-6721 - Xiaoyi Qin, Na Li, Chao Weng, Dan Su, Ming Li:
Simple Attention Module Based Speaker Verification with Iterative Noisy Label Detection. 6722-6726 - Bing Han, Zhengyang Chen, Yanmin Qian:
Local Information Modeling with Self-Attention for Speaker Verification. 6727-6731 - Rui Wang, Junyi Ao, Long Zhou, Shujie Liu, Zhihua Wei, Tom Ko, Qing Li, Yu Zhang:
Multi-View Self-Attention Based Transformer for Speaker Recognition. 6732-6736 - Miao Zhao, Yufeng Ma, Yiwei Ding, Yu Zheng, Min Liu, Minqiang Xu:
Multi-Query Multi-Head Attention Pooling and Inter-Topk Penalty for Speaker Verification. 6737-6741 - Seong-Hu Kim, Hyeonuk Nam, Yong-Hwa Park:
Temporal Dynamic Convolutional Neural Network for Text-Independent Speaker Verification and Phonemic Analysis. 6742-6746 - Shujie Hu, Shansong Liu, Xurong Xie, Mengzhe Geng, Tianzi Wang, Shoukang Hu, Mingyu Cui, Xunying Liu, Helen Meng:
Exploiting Cross Domain Acoustic-to-Articulatory Inverted Features for Disordered Speech Recognition. 6747-6751 - Kun Wei, Yike Zhang, Sining Sun, Lei Xie, Long Ma:
Conversational Speech Recognition by Learning Conversation-Level Characteristics. 6752-6756 - Fengpeng Yue, Yan Deng, Lei He, Tom Ko, Yu Zhang:
Exploring Machine Speech Chain For Domain Adaptation. 6757-6761 - Chhavi Choudhury, Ankur Gandhe, Xiaohan Ding, Ivan Bulyko:
A Likelihood Ratio Based Domain Adaptation Method for E2E Models. 6762-6766 - Salima Mdhaffar, Jean-François Bonastre, Marc Tommasi, Natalia A. Tomashenko, Yannick Estève:
Retrieving Speaker Information from Personalized Acoustic Models for Speech Recognition. 6767-6771 - Motoi Omachi, Yuya Fujita, Shinji Watanabe, Tianzi Wang:
Non-Autoregressive End-To-End Automatic Speech Recognition Incorporating Downstream Natural Language Processing. 6772-6776 - Chien-yu Huang, Kai-Wei Chang, Hung-Yi Lee:
Toward Degradation-Robust Voice Conversion. 6777-6781 - Thomas Merritt, Abdelhamid Ezzerg, Piotr Bilinski, Magdalena Proszewska, Kamil Pokora, Roberto Barra-Chicote, Daniel Korzekwa:
Text-Free Non-Parallel Many-To-Many Voice Conversion Using Normalising Flow. 6782-6786 - Chao Xie, Yi-Chiao Wu, Patrick Lumban Tobing, Wen-Chin Huang, Tomoki Toda:
Direct Noisy Speech Modeling for Noisy-To-Noisy Voice Conversion. 6787-6791 - Zhichao Wang, Qicong Xie, Tao Li, Hongqiang Du, Lei Xie, Pengcheng Zhu, Mengxiao Bi:
One-Shot Voice Conversion For Style Transfer Based On Speaker Adaptation. 6792-6796 - Manuel Sam Ribeiro, Julian Roth, Giulia Comini, Goeric Huybrechts, Adam Gabrys, Jaime Lorenzo-Trueba:
Cross-Speaker Style Transfer for Text-to-Speech Using Data Augmentation. 6797-6801 - Tomoki Hayashi, Kazuhiro Kobayashi, Tomoki Toda:
An Investigation of Streaming Non-Autoregressive sequence-to-sequence Voice Conversion. 6802-6806 - Shaoguang Mao, Frank K. Soong, Yan Xia, Jonathan Tien:
A Universal Ordinal Regression for Assessing Phoneme-Level Pronunciation. 6807-6811 - Marcelo Sancinetti, Jazmín Vidal, Cyntia Bonomi, Luciana Ferrer:
A Transfer Learning Approach for Pronunciation Scoring. 6812-6816 - Hsin-Wei Wang, Bi-Cheng Yan, Hsuan-Sheng Chiu, Yung-Chang Hsu, Berlin Chen:
Exploring Non-Autoregressive End-to-End Neural Modeling for English Mispronunciation Detection and Diagnosis. 6817-6821 - Binghuai Lin, Liyuan Wang:
Phoneme Mispronunciation Detection By Jointly Learning To Align. 6822-6826 - Wenxuan Ye, Shaoguang Mao, Frank K. Soong, Wenshan Wu, Yan Xia, Jonathan Tien, Zhiyong Wu:
An Approach to Mispronunciation Detection and Diagnosis with Acoustic, Phonetic and Linguistic (APL) Embeddings. 6827-6831 - Zhan Zhang, Yuehai Wang, Jianyi Yang:
Masked Acoustic Unit for Mispronunciation Detection and Correction. 6832-6836 - Zili Huang, Shinji Watanabe, Shu-Wen Yang, Paola García, Sanjeev Khudanpur:
Investigating Self-Supervised Learning for Speech Enhancement and Separation. 6837-6841 - Lei Yang, Wei Liu, Weiqin Wang:
TFPSNet: Time-Frequency Domain Path Scanning Network for Speech Separation. 6842-6846 - Shuang-qing Qian, Lijian Gao, Hongjie Jia, Qirong Mao:
Efficient Monaural Speech Separation with Multiscale Time-Delay Sampling. 6847-6851 - Muhammed Zahid Ozturk, Chenshu Wu, Beibei Wang, K. J. Ray Liu:
Toward mmWave-Based Sound Enhancement and Separation. 6852-6856 - Feng Dang, Hangting Chen, Pengyuan Zhang:
DPT-FSNet: Dual-Path Transformer Based Full-Band and Sub-Band Fusion Network for Speech Enhancement. 6857-6861 - Cem Subakan, Mirco Ravanelli, Samuele Cornell, François Grondin:
Real-M: Towards Speech Separation on Real Mixtures. 6862-6866 - Stanislaw Kacprzak, Magdalena Rybicka, Konrad Kowalczyk:
Spoken Language Recognition with Cluster-Based Modeling. 6867-6871 - David Romero, Luis Fernando D'Haro, Marcos Estecha-Garitagoitia, Christian Salamea:
Phonotactic Language Recognition Using A Universal Phoneme Recognizer and A Transformer Architecture. 6872-6876 - Andros Tjandra, Diptanu Gon Choudhury, Frank Zhang, Kritika Singh, Alexis Conneau, Alexei Baevski, Assaf Sela, Yatharth Saraf, Michael Auli:
Improved Language Identification Through Cross-Lingual Self-Supervised Learning. 6877-6881 - Yizhou Lu, Mingkun Huang, Xinghua Qu, Pengfei Wei, Zejun Ma:
Language Adaptive Cross-Lingual Speech Representation Learning with Sparse Sharing Sub-Networks. 6882-6886 - Pratik Kumar, Vrunda N. Sukhadia, Srinivasan Umesh:
Investigation of Robustness of HuBERT Features from Different Layers to Domain, Accent and Language Variations. 6887-6891 - Chak-Fai Li, Francis Keith, William Hartmann, Matthew Snover:
Combining Unsupervised and Text Augmented Semi-Supervised Learning For Low Resourced Autoregressive Speech Recognition. 6892-6896 - Weidong Chen, Xiaofeng Xing, Xiangmin Xu, Jichen Yang, Jianxin Pang:
Key-Sparse Transformer for Multimodal Speech Emotion Recognition. 6897-6901 - Xixin Wu, Shoukang Hu, Zhiyong Wu, Xunying Liu, Helen Meng:
Neural Architecture Search for Speech Emotion Recognition. 6902-6906 - Mayank Sharma:
Multi-Lingual Multi-Task Speech Emotion Recognition Using wav2vec 2.0. 6907-6911 - Arya Aftab, Alireza Morsali, Shahrokh Ghaemmaghami, Benoît Champagne:
LIGHT-SERNET: A Lightweight Fully Convolutional Neural Network for Speech Emotion Recognition. 6912-6916 - Soumya Dutta, Sriram Ganapathy:
Multimodal Transformer with Learnable Frontend and Self Attention for Emotion Recognition. 6917-6921 - Edmilson da Silva Morais, Ron Hoory, Weizhong Zhu, Itai Gat, Matheus Damasceno, Hagai Aronowitz:
Speech Emotion Recognition Using Self-Supervised Features. 6922-6926 - Gábor Gosztolya, László Tóth, Veronika Svindt, Judit Bóna, Ildikó Hoffmann:
Using Acoustic Deep Neural Network Embeddings to Detect Multiple Sclerosis From Speech. 6927-6931 - R'mani Haulcy, Katerina Placek, Brian Tracey, Adam P. Vogel, James R. Glass:
Repetition Assessment for Speech and Language Disorders: A Study of the Logopenic Variant of Primary Progressive Aphasia. 6932-6936 - Bang Tran, Youxiang Zhu, Xiaohui Liang, James W. Schwoebel, Lindsay A. Warrenburg:
Speech Tasks Relevant to Sleepiness Determined With Deep Transfer Learning. 6937-6941 - Doyeon Kim, Hyewon Han, Hyeon-Kyeong Shin, Soo-Whan Chung, Hong-Goo Kang:
Phase Continuity: Learning Derivatives of Phase Spectrum for Speech Enhancement. 6942-6946 - Efthymios Tzinis, Yossi Adi, Vamsi K. Ithapu, Buye Xu, Anurag Kumar:
Continual Self-Training With Bootstrapped Remixing For Speech Enhancement. 6947-6951 - Yang Yang, Hui Zhang, Xueliang Zhang, Huaiwen Zhang:
Alleviating the Loss-Metric Mismatch in Supervised Single-Channel Speech Enhancement. 6952-6956 - Tong Lei, Haoxin Ruan, Kai Chen, Jing Lu:
A Priori SNR Estimation for Speech Enhancement Based on PESQ-Induced Reinforcement Learning. 6957-6961 - Bahareh Tolooshams, Kazuhito Koishida:
A Training Framework for Stereo-Aware Speech Enhancement Using Deep Neural Networks. 6962-6966 - Guochen Yu, Andong Li, Yutian Wang, Yinuo Guo, Hui Wang, Chengshi Zheng:
Joint Magnitude Estimation and Phase Recovery Using Cycle-In-Cycle GAN for Non-Parallel Speech Enhancement. 6967-6971 - Natalia A. Tomashenko, Salima Mdhaffar, Marc Tommasi, Yannick Estève, Jean-François Bonastre:
Privacy Attacks for Automatic Speech Recognition Acoustic Models in A Federated Learning Framework. 6972-6976 - Jinhan Wang, Xiaosu Tong, Jinxi Guo, Di He, Roland Maas:
VADOI: Voice-Activity-Detection Overlapping Inference for End-To-End Long-Form Speech Recognition. 6977-6981 - Yao-Yuan Yang, Moto Hira, Zhaoheng Ni, Artyom Astafurov, Caroline Chen, Christian Puhrsch, David Pollack, Dmitriy Genzel, Donny Greenberg, Edward Z. Yang, Jason Lian, Jeff Hwang, Ji Chen, Peter Goldsborough, Sean Narenthiran, Shinji Watanabe, Soumith Chintala, Vincent Quenneville-Bélair:
Torchaudio: Building Blocks for Audio and Speech Processing. 6982-6986 - Ganesh Sivaraman, Ricardo Casal, Matt Garland, Elie Khoury:
Unsupervised Model Adaptation for End-to-End ASR. 6987-6991 - Thomas Bohnstingl, Ayush Garg, Stanislaw Wozniak, George Saon, Evangelos Eleftheriou, Angeliki Pantazi:
Speech Recognition Using Biologically-Inspired Neural Networks. 6992-6996 - Kang-wook Kim, Seung Won Park, Junhyeok Lee, Myun-chul Joe:
ASSEM-VC: Realistic Voice Conversion by Assembling Modern Speech Synthesis Techniques. 6997-7001 - Christopher Liberatore, Ricardo Gutierrez-Osuna:
Minimizing Residuals for Native-Nonnative Voice Conversion in a Sparse, Anchor-Based Representation of Speech. 7002-7006 - Yan-Nian Chen, Li-Juan Liu, Ya-Jun Hu, Yuan Jiang, Zhen-Hua Ling:
Improving Recognition-Synthesis Based any-to-one Voice Conversion with Cyclic Training. 7007-7011 - Bac Nguyen, Fabien Cardinaux:
NVC-Net: End-To-End Adversarial Voice Conversion. 7012-7016 - Sheng Shi, Jiahao Shao, Yifei Hao, Yangzhou Du, Jianping Fan:
U-GAT-VC: Unsupervised Generative Attentional Networks for Non-Parallel Voice Conversion. 7017-7021 - Xintao Zhao, Feng Liu, Changhe Song, Zhiyong Wu, Shiyin Kang, Deyi Tuo, Helen Meng:
Disentangling Content and Fine-Grained Prosody Information Via Hybrid ASR Bottleneck Features for Voice Conversion. 7022-7026 - Yunhe Xie, Chengjie Sun, Zhenzhou Ji:
A Commonsense Knowledge Enhanced Network with Retrospective Loss for Emotion Recognition in Spoken Dialog. 7027-7031 - Yu-Ping Ruan, Shu-Kai Zheng, Taihao Li, Fen Wang, Guanxiong Pei:
Hierarchical and Multi-View Dependency Modelling Network for Conversational Emotion Recognition. 7032-7036 - Dou Hu, Xiaolong Hou, Lingwei Wei, Lian-Xin Jiang, Yang Mo:
MM-DFN: Multimodal Dynamic Fusion Network for Emotion Recognition in Conversations. 7037-7041 - Wei Peng, Yue Hu, Luxi Xing, Yuqiang Xie, Xingsheng Zhang, Yajing Sun:
Modeling Intention, Emotion and External World in Dialogue Systems. 7042-7046 - Kai Wei, Dillon Knox, Martin Radfar, Thanh Tran, Markus Müller, Grant P. Strimel, Nathan Susanj, Athanasios Mouchtaris, Maurizio Omologo:
A Neural Prosody Encoder for End-to-End Dialogue Act Classification. 7047-7051 - Jing Yang Lee, Kong Aik Lee, Woon-Seng Gan:
Improving Contextual Coherence in Variational Personalized and Empathetic Dialogue Agents. 7052-7056 - Muhammad Saad Saeed, Muhammad Haris Khan, Shah Nawaz, Muhammad Haroon Yousaf, Alessio Del Bue:
Fusion and Orthogonal Projection for Improved Face-Voice Association. 7057-7061 - K. C. Kishan, Zhenning Tan, Long Chen, Minho Jin, Eunjung Han, Andreas Stolcke, Chul Lee:
OpenFEAT: Improving Speaker Identification by Open-Set Few-Shot Embedding Adaptation with Transformer. 7062-7066 - Qingjian Li, Lin Yang, Xuyang Wang, Xiaoyi Qin, Junjie Wang, Ming Li:
Towards Lightweight Applications: Asymmetric Enroll-Verify Structure for Speaker Verification. 7067-7071 - Tianxiang Chen, Elie Khoury:
Speaker Embedding Conversion for Backward and Cross-Channel Compatibility. 7072-7076 - Hua Shen, Yuguang Yang, Guoli Sun, Ryan Langman, Eunjung Han, Jasha Droppo, Andreas Stolcke:
Improving Fairness in Speaker Verification via Group-Adapted Fusion Network. 7077-7081 - Ruiteng Zhang, Jianguo Wei, Wenhuan Lu, Lin Zhang, Yantao Ji, Junhai Xu, Xugang Lu:
CS-REP: Making Speaker Verification Networks Embracing Re-Parameterization. 7082-7086 - Heng-Jui Chang, Shu-Wen Yang, Hung-yi Lee:
DistilHuBERT: Speech Representation Learning by Layer-Wise Distillation of Hidden-Unit BERT. 7087-7091 - Chengyi Wang, Yu Wu, Sanyuan Chen, Shujie Liu, Jinyu Li, Yao Qian, Zhenglu Yang:
Improving Self-Supervised Learning for Speech Recognition with Intermediate Layer Supervision. 7092-7096 - Yiming Wang, Jinyu Li, Heming Wang, Yao Qian, Chengyi Wang, Yu Wu:
Wav2vec-Switch: Contrastive Learning from Original-Noisy Speech Pairs for Robust Speech Recognition. 7097-7101 - Bethan Thomas, Samuel Kessler, Salah Karout:
Efficient Adapter Transfer of Self-Supervised Speech Models for Automatic Speech Recognition. 7102-7106 - Takashi Maekaku, Xuankai Chang, Yuya Fujita, Shinji Watanabe:
An Exploration of HuBERT with Large Number of Cluster Units and Model Assessment Using Bayesian Information Criterion. 7107-7111 - Liu Chen, Meysam Asgari, Hiroko H. Dodge:
Optimize Wav2vec2's Architecture for Small Training Set Through Analyzing its Pre-Trained Model's Attention Pattern. 7112-7116 - Marek Kubis, Maxime Méloux, Pawel Skórzewski, Marcin Lewandowski, Gunu Jho, Hyoungmin Park:
Part-of-Speech Models Compression Methods for on-Device Grapheme-to-Phoneme Conversion. 7117-7121 - Wenlin Dai, Changhe Song, Xiang Li, Zhiyong Wu, Huashan Pan, Xiulin Li, Helen Meng:
An End-to-End Chinese Text Normalization Model Based on Rule-Guided Flat-Lattice Transformer. 7122-7126 - Su Dong, Shan Liu, Sicen Liu, Buzhou Tang:
Chinese Spelling Text Generation of Mathematical Formulas. 7127-7131 - Rem Hida, Masaki Hamada, Chie Kamada, Emiru Tsunoo, Toshiyuki Sekiya, Toshiyuki Kumakura:
Polyphone Disambiguation and Accent Prediction Using Pre-Trained Language Models in Japanese TTS Front-End. 7132-7136 - Yang Zhang, Haitong Zhang, Yue Lin:
Data Augmentation for Long-Tailed and Imbalanced Polyphone Disambiguation in Mandarin. 7137-7141 - Dongsheng Chen, Zhiqi Huang, Yuexian Zou:
Leveraging Bilinear Attention to Improve Spoken Language Understanding. 7142-7146 - Zexun Wang, Yuquan Le, Yi Zhu, Yuming Zhao, Mingchao Feng, Meng Chen, Xiaodong He:
Building Robust Spoken Language Understanding by Cross Attention Between Phoneme Sequence and ASR Hypothesis. 7147-7151 - Seunghyun Seo, Donghyun Kwak, Bowon Lee:
Integration of Pre-Trained Networks with Continuous Token Interface for End-to-End Spoken Language Understanding. 7152-7156 - Bhuvan Agrawal, Markus Müller, Samridhi Choudhary, Martin Radfar, Athanasios Mouchtaris, Ross McGowan, Nathan Susanj, Siegfried Kunzmann:
Tie Your Embeddings Down: Cross-Modal Latent Spaces for End-to-end Spoken Language Understanding. 7157-7161 - Hong-Kwang Jeff Kuo, Zoltán Tüske, Samuel Thomas, Brian Kingsbury, George Saon:
Improving End-to-end Models for Set Prediction in Spoken Language Understanding. 7162-7166 - Siddhant Arora, Siddharth Dalmia, Pavel Denisov, Xuankai Chang, Yushi Ueda, Yifan Peng, Yuekai Zhang, Sujay Kumar, Karthik Ganesan, Brian Yan, Ngoc Thang Vu, Alan W. Black, Shinji Watanabe:
ESPnet-SLU: Advancing Spoken Language Understanding Through ESPnet. 7167-7171 - Rongjin Li, Weibin Zhang, Dongpeng Chen:
The Coral++ Algorithm for Unsupervised Domain Adaptation of Speaker Recognition. 7172-7176 - Hanyi Zhang, Longbiao Wang, Kong Aik Lee, Meng Liu, Jianwu Dang, Hui Chen:
Learning Domain-Invariant Transformation for Speaker Verification. 7177-7181 - Hang-Rui Hu, Yan Song, Ying Liu, Li-Rong Dai, Ian McLoughlin, Lin Liu:
Domain Robust Deep Embedding Learning for Speaker Recognition. 7182-7186 - Jenthe Thienpondt, Brecht Desplanques, Kris Demuynck:
Tackling the Score Shift in Cross-Lingual Speaker Verification by Exploiting Language Information. 7187-7191 - Anurag Chowdhury, Austin Cozzo, Arun Ross:
Domain Adaptation for Speaker Recognition in Singing and Spoken Voice. 7192-7196 - Jianchen Li, Jiqing Han, Hongwei Song:
CDMA: Cross-Domain Distance Metric Adaptation for Speaker Verification. 7197-7201 - Vineel Pratap, Qiantong Xu, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert:
Word Order does not Matter for Speech Recognition. 7202-7206 - Soheil Khorram, Jaeyoung Kim, Anshuman Tripathi, Han Lu, Qian Zhang, Hasim Sak:
Contrastive Siamese Network for Semi-Supervised Speech Recognition. 7207-7211 - Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Sequence Transduction with Graph-Based Supervision. 7212-7216 - Zhao You, Shulin Feng, Dan Su, Dong Yu:
Speechmoe2: Mixture-of-Experts Model with Improved Routing. 7217-7221 - Gene-Ping Yang, Hao Tang:
Supervised Attention in Sequence-to-Sequence Models for Speech Recognition. 7222-7226 - Yan Gao, Titouan Parcollet, Salah Zaiem, Javier Fernández-Marqués, Pedro P. B. de Gusmao, Daniel J. Beutel, Nicholas D. Lane:
End-to-End Speech Recognition from Federated Acoustic Models. 7227-7231 - Lichao Zhang, Yi Ren, Liqun Deng, Zhou Zhao:
HiFiDenoise: High-Fidelity Denoising Text to Speech with Adversarial Networks. 7232-7236 - Yongmao Zhang, Jian Cong, Heyang Xue, Lei Xie, Pengcheng Zhu, Mengxiao Bi:
VISinger: Variational Inference with Adversarial Learning for End-to-End Singing Voice Synthesis. 7237-7241 - Soonbeom Choi, Juhan Nam:
A Melody-Unsupervision Model for Singing Voice Synthesis. 7242-7246 - Liyang Chen, Zhiyong Wu, Jun Ling, Runnan Li, Xu Tan, Sheng Zhao:
Transformer-S2A: Robust and Efficient Speech-to-Animation. 7247-7251 - Disong Wang, Shan Yang, Dan Su, Xunying Liu, Dong Yu, Helen Meng:
VCVTS: Multi-Speaker Video-to-Speech Synthesis Via Cross-Modal Knowledge Transfer from Voice Conversion. 7252-7256 - Binghuai Lin, Liyuan Wang:
Fast Task-Specific Adaptation in Spoken Language Assessment with Meta-Learning. 7257-7261 - Yuan Gong, Ziyi Chen, Iek-Heng Chu, Peng Chang, James R. Glass:
Transformer-Based Multi-Aspect Multi-Granularity Non-Native English Speaker Pronunciation Assessment. 7262-7266 - Jose Antonio Lopez Saenz, Thomas Hain:
A Model for Assessor Bias in Automatic Pronunciation Assessment. 7267-7271 - Yaoming Zhu, Liwei Wu, Shanbo Cheng, Mingxuan Wang:
Unified Multimodal Punctuation Restoration Framework for Mixed-Modality Corpus. 7272-7276 - Zhikai Zhou, Tian Tan, Yanmin Qian:
Punctuation Prediction for Streaming On-Device Speech Recognition. 7277-7281 - Fan Zhang, Mei Tu, Song Liu, Jinyao Yan:
ASR Error Correction with Dual-Channel Self-Supervised Learning. 7282-7286 - Meng Ge, Chenglin Xu, Longbiao Wang, Eng Siong Chng, Jianwu Dang, Haizhou Li:
L-SpEx: Localized Target Speaker Extraction. 7287-7291 - Jiangyu Han, Yanhua Long, Lukás Burget, Jan Cernocký:
DPCCN: Densely-Connected Pyramid Complex Convolutional Network for Robust Speech Separation and Extraction. 7292-7296 - Junhao Xu, Jianwei Yu, Xunying Liu, Helen Meng:
Mixed Precision DNN Quantization for Overlapped Speech Separation and Recognition. 7297-7301 - Zhiqi Kang, Mostafa Sadeghi, Radu Horaud, Xavier Alameda-Pineda, Jacob Donley, Anurag Kumar:
The Impact of Removing Head Movements on Audio-Visual Speech Enhancement. 7302-7306 - Xinmeng Xu, Yang Wang, Dongxiang Xu, Yiyuan Peng, Cong Zhang, Jie Jia, Binbin Chen:
VSEGAN: Visual Speech Enhancement Generative Adversarial Network. 7308-7311 - Liang Lu, Jinyu Li, Yifan Gong:
Endpoint Detection for Streaming End-to-End Multi-Talker ASR. 7312-7316 - Desh Raj, Liang Lu, Zhuo Chen, Yashesh Gaur, Jinyu Li:
Continuous Streaming Multi-Talker ASR with Dual-Path Transducers. 7317-7321 - Xuankai Chang, Niko Moritz, Takaaki Hori, Shinji Watanabe, Jonathan Le Roux:
Extended Graph Temporal Classification for Multi-Speaker End-to-End ASR. 7322-7326 - Taesoo Kim, Jiho Chang, Jong Hwan Ko:
ADA-VAD: Unpaired Adversarial Domain Adaptation for Noise-Robust Voice Activity Detection. 7327-7331 - Shota Horiguchi, Yuki Takashima, Paola García, Shinji Watanabe, Yohei Kawaguchi:
Multi-Channel End-To-End Neural Diarization with Distributed Microphones. 7332-7336 - Naijun Zheng, Na Li, Jianwei Yu, Chao Weng, Dan Su, Xunying Liu, Helen Meng:
Multi-Channel Speaker Diarization Using Spatial Features for Meetings. 7337-7341 - Itai Gat, Hagai Aronowitz, Weizhong Zhu, Edmilson da Silva Morais, Ron Hoory:
Speaker Normalization for Self-Supervised Speech Emotion Recognition. 7342-7346 - Ayoub Ghriss, Bo Yang, Viktor Rozgic, Elizabeth Shriberg, Chao Wang:
Sentiment-Aware Automatic Speech Recognition Pre-Training for Enhanced Speech Emotion Recognition. 7347-7351 - Yang Li, Constantinos Papayiannis, Viktor Rozgic, Elizabeth Shriberg, Chao Wang:
Confidence Estimation for Speech Emotion Recognition Based on the Relationship Between Emotion Categories and Primitives. 7352-7356 - Lucas Goncalves, Carlos Busso:
AuxFormer: Robust Approach to Audiovisual Emotion Recognition. 7357-7361 - Yuanchao Li, Peter Bell, Catherine Lai:
Fusing ASR Outputs in Joint Training for Speech Emotion Recognition. 7362-7366 - Heqing Zou, Yuke Si, Chen Chen, Deepu Rajan, Eng Siong Chng:
Speech Emotion Recognition with Co-Attention Based Multi-Level Acoustic Information. 7367-7371 - Zhengjun Yue, Erfan Loweimi, Zoran Cvetkovic, Heidi Christensen, Jon Barker:
Multi-Modal Acoustic-Articulatory Feature Fusion For Dysarthric Speech Recognition. 7372-7376 - Zhengjun Yue, Erfan Loweimi, Zoran Cvetkovic:
Raw Source and Filter Modelling for Dysarthric Speech Recognition. 7377-7381 - Mohammad Soleymanpour, Michael T. Johnson, Rahim Soleymanpour, Jeffrey Berry:
Synthesizing Dysarthric Speech Using Multi-Speaker TTS For Dysarthric Speech Recognition. 7382-7386 - Sondes Abderrazek, Corinne Fredouille, Alain Ghio, Muriel Lalain, Christine Meunier, Virginie Woisard:
Towards Interpreting Deep Learning Models to Understand Loss of Speech Intelligibility in Speech Disorders Step 2: Contribution of the Emergence of Phonetic Traits. 7387-7391 - Hemant A. Patil, Ankur T. Patil, Aastha Kachhi:
Constant Q Cepstral Coefficients for Classification of Normal vs. Pathological Infant Cry. 7392-7396 - Colin Lea, Zifang Huang, Dhruv Jain, Lauren Tooley, Zeinab Liaghat, Shrinath Thelapurath, Leah Findlater, Jeffrey P. Bigham:
Nonverbal Sound Detection for Disordered Speech. 7397-7401 - Yen-Ju Lu, Zhong-Qiu Wang, Shinji Watanabe, Alexander Richard, Cheng Yu, Yu Tsao:
Conditional Diffusion Probabilistic Model for Speech Enhancement. 7402-7406 - Hendrik Schröter, Alberto N. Escalante-B., Tobias Rosenkranz, Andreas Maier:
Deepfilternet: A Low Complexity Speech Enhancement Framework for Full-Band Audio Based On Deep Filtering. 7407-7411 - Szu-Wei Fu, Cheng Yu, Kuo-Hsuan Hung, Mirco Ravanelli, Yu Tsao:
MetricGAN-U: Unsupervised Speech Enhancement/ Dereverberation Based Only on Noisy/ Reverberated Speech. 7412-7416 - Yihui Fu, Yun Liu, Jingdong Li, Dawei Luo, Shubo Lv, Yukai Jv, Lei Xie:
Uformer: A Unet Based Dilated Complex & Real Dual-Path Conformer Network for Simultaneous Speech Enhancement and Dereverberation. 7417-7421 - Tomer Rosenbaum, Israel Cohen, Emil Winebrand:
Attenuation Of Acoustic Early Reflections In Television Studios Using Pretrained Speech Synthesis Neural Network. 7422-7426 - Tatsuya Komatsu:
Non-Autoregressive ASR with Self-Conditioned Folded Encoders. 7427-7431 - Guoli Ye, Vadim Mazalov, Jinyu Li, Yifan Gong:
Have Best of Both Worlds: Two-Pass Hybrid and E2E Cascading Framework for Speech Recognition. 7432-7436 - Mohammad Zeineldeen, Jingjing Xu, Christoph Lüscher, Wilfried Michel, Alexander Gerstenberger, Ralf Schlüter, Hermann Ney:
Conformer-Based Hybrid ASR System For Switchboard Dataset. 7437-7441 - Tina Raissi, Eugen Beck, Ralf Schlüter, Hermann Ney:
Improving Factored Hybrid HMM Acoustic Modeling without State Tying. 7442-7446 - Zehai Tu, Jack Deadman, Ning Ma, Jon Barker:
Auditory-Based Data Augmentation for End-to-End Automatic Speech Recognition. 7447-7451 - Weiran Wang, Ke Hu, Tara N. Sainath:
Deliberation of Streaming RNN-Transducer by Non-Autoregressive Decoding. 7452-7456 - Shivam Mehta, Éva Székely, Jonas Beskow, Gustav Eje Henter:
Neural HMMs Are All You Need (For High-Quality Attention-Free TTS). 7457-7461 - Takato Fujimoto, Kei Hashimoto, Yoshihiko Nankaku, Keiichi Tokuda:
Autoregressive Variational Autoencoder with a Hidden Semi-Markov Model-Based Structured Attention for Speech Synthesis. 7462-7466 - Yunchao He, Jian Luan, Yujun Wang:
PAMA-TTS: Progression-Aware Monotonic Attention for Stable SEQ2SEQ TTS with Accurate Phoneme Duration Control. 7467-7471 - Yujia Xiao, Xi Wang, Lei He, Frank K. Soong:
Improving Fastspeech TTS with Efficient Self-Attention and Compact Feed-Forward Network. 7472-7476 - Yoonhyung Lee, Jinhyeok Yang, Kyomin Jung:
Varianceflow: High-Quality and Controllable Text-to-Speech using Variance Information via Normalizing Flow. 7477-7481 - Oktai Tatanov, Stanislav Beliaev, Boris Ginsburg:
Mixer-TTS: Non-Autoregressive, Fast and Compact Text-to-Speech Model Conditioned on Language Model Embeddings. 7482-7486 - Ting-Wei Wu, Biing-Hwang Juang:
Knowledge Augmented BERT Mutual Network in Multi-Turn Spoken Dialogues. 7487-7491 - Anastasios Alexandridis, Kanthashree Mysore Sathyendra, Grant P. Strimel, Pavel Kveton, Jon Webb, Athanasios Mouchtaris:
TINYS2I: A Small-Footprint Utterance Classification Model with Contextual Support for On-Device SLU. 7492-7496 - Vishal Sunder, Samuel Thomas, Hong-Kwang Jeff Kuo, Jatin Ganhotra, Brian Kingsbury, Eric Fosler-Lussier:
Towards End-to-End Integration of Dialog History for Improved Spoken Language Understanding. 7497-7501 - Thai Binh Nguyen:
Improving Spoken Language Understanding by Enhancing Text Representation. 7502-7506 - Xuandi Fu, Feng-Ju Chang, Martin Radfar, Kai Wei, Jing Liu, Grant P. Strimel, Kanthashree Mysore Sathyendra:
Multi-Task RNN-T with Semantic Decoder for Streamable Spoken Language Understanding. 7507-7511 - Wang Zhang, Lei Jiang, Shaokang Zhang, Shuo Wang, Jianlong Tan:
A BERT Based Joint Learning Model with Feature Gated Mechanism for Spoken Language Understanding. 7512-7516 - Tianchi Liu, Rohan Kumar Das, Kong Aik Lee, Haizhou Li:
MFA: TDNN with Multi-Scale Frequency-Channel Attention for Text-Independent Speaker Verification with Short Utterances. 7517-7521 - Bing Han, Zhengyang Chen, Bei Liu, Yanmin Qian:
MLP-SVNET: A Multi-Layer Perceptrons Based Network for Speaker Verification. 7522-7526 - Lantian Li, Ruiqian Nai, Dong Wang:
Real Additive Margin Softmax for Speaker Verification. 7527-7531 - Zi-Kai Wan, Qinghua Ren, You-cai Qin, Qirong Mao:
Statistical Pyramid Dense Time Delay Neural Network for Speaker Verification. 7532-7536 - Aiwen Deng, Shuai Wang, Wenxiong Kang, Feiqi Deng:
On the Importance of Different Frequency Bins for Speaker Verification. 7537-7541 - Bei Liu, Haoyu Wang, Zhengyang Chen, Shuai Wang, Yanmin Qian:
Self-Knowledge Distillation via Feature Enhancement for Speaker Verification. 7542-7546 - Yueyue Na, Ziteng Wang, Liang Wang, Qiang Fu:
Joint Ego-Noise Suppression and Keyword Spotting on Sweeping Robots. 7547-7551 - Yizheng Huang, Nana Hou, Nancy F. Chen:
Progressive Continual Learning for Spoken Keyword Spotting. 7552-7556 - Gengshen Fu, Thibaud Senechal, Aaron Challenner, Tao Zhang:
Unified Speculation, Detection, and Verification Keyword Spotting. 7557-7561 - Li Wang, Rongzhi Gu, Weiji Zhuang, Peng Gao, Yujun Wang, Yuexian Zou:
Learning Decoupling Features Through Orthogonality Regularization. 7562-7566 - Raphael Tang, Karun Kumar, Ji Xin, Piyush Vyas, Wenyan Li, Gefei Yang, Yajie Mao, G. Craig Murray, Jimmy Lin:
Temporal Early Exiting for Streaming Speech Commands Recognition. 7567-7571 - Hengshun Zhou, Jun Du, Chao-Han Huck Yang, Shifu Xiong, Chin-Hui Lee:
A Study of Designing Compact Audio-Visual Wake Word Spotting System Based on Iterative Fine-Tuning in Neural Network Pruning. 7572-7576 - Yi Ren, Ming Lei, Zhiying Huang, Shiliang Zhang, Qian Chen, Zhijie Yan, Zhou Zhao:
Prosospeech: Enhancing Prosody with Quantized Vector Pre-Training in Text-To-Speech. 7577-7581 - Yuanhao Yi, Lei He, Shifeng Pan, Xi Wang, Yujia Xiao:
Prosodyspeech: Towards Advanced Prosody Model for Neural Text-to-Speech. 7582-7586 - Tuomo Raitio, Jiangchuan Li, Shreyas Seshadri:
Hierarchical Prosody Modeling and Control in Non-Autoregressive Parallel Neural TTS. 7587-7591 - Ning-Qian Wu, Zhaoci Liu, Zhen-Hua Ling:
Discourse-Level Prosody Modeling with a Variational Autoencoder for Non-Autoregressive Expressive Speech Synthesis. 7592-7596 - Yiwei Guo, Chenpeng Du, Kai Yu:
Unsupervised Word-Level Prosody Tagging for Controllable Speech Synthesis. 7597-7601 - Xueyuan Chen, Changhe Song, Yixuan Zhou, Zhiyong Wu, Changbin Chen, Zhongqin Wu, Helen Meng:
A Character-Level Span-Based Model for Mandarin Prosodic Structure Prediction. 7602-7606 - Fengyu Cai, Wanhao Zhou, Fei Mi, Boi Faltings:
Slim: Explicit Slot-Intent Mapping with BERT for Joint Multi-Intent Detection and Slot Filling. 7607-7611 - Lisong Chen, Peilin Zhou, Yuexian Zou:
Joint Multiple Intent Detection and Slot Filling Via Self-Distillation. 7612-7616 - Zhanbiao Zhu, Peijie Huang, Haojing Huang, Shudong Liu, Leyi Lao:
A Graph Attention Interactive Refine Framework with Contextual Regularization for Jointing Intent Detection and Slot Filling. 7617-7621 - Jiabao Xu, Peijie Huang, Youming Peng, Jiande Ding, Boxi Huang, Simin Huang:
Adjacency Pairs-Aware Hierarchical Attention Networks for Dialogue Intent Classification. 7622-7626 - Nikhita Vedula, Rahul Gupta, Aman Alok, Mukund Sridhar, Shankar Ananthakrishnan:
Advin: Automatically Discovering Novel Domains and Intents from User Text Utterances. 7627-7631 - Zvi Kons, Aharon Satt, Hong-Kwang Kuo, Samuel Thomas, Boaz Carmeli, Ron Hoory, Brian Kingsbury:
A New Data Augmentation Method for Intent Classification Enhancement and its Application on Spoken Conversation Datasets. 7632-7636 - Kai Wang, Xiaolei Zhang, Miao Zhang, Yuguang Li, Jaeyun Lee, Kiho Cho, Sung-Un Park:
Robust Speaker Verification with Joint Self-Supervised and Supervised Learning. 7637-7641 - Weiwei Lin, Man-Wai Mak:
Robust Speaker Verification Using Population-Based Data Augmentation. 7642-7646 - Ju-ho Kim, Hye-Jin Shim, Jungwoo Heo, Ha-Jin Yu:
RawNeXt: Speaker Verification System For Variable-Duration Utterances With Deep Layer Aggregation And Extended Dynamic Scaling Policies. 7647-7651 - Xin Zhang, Minho Jin, Roger Cheng, Ruirui Li, Eunjung Han, Andreas Stolcke:
Contrastive-mixup Learning for Improved Speaker Verification. 7652-7656 - Ge Zhu, Frank Cwitkowitz, Zhiyao Duan:
A Study of The Robustness of Raw Waveform Based Speaker Embeddings Under Mismatched Conditions. 7657-7661 - Lu Yi, Man-Wai Mak:
Disentangled Speaker Embedding for Robust Speaker Verification. 7662-7666 - Felix Wu, Kwangyoun Kim, Jing Pan, Kyu Jeong Han, Kilian Q. Weinberger, Yoav Artzi:
Performance-Efficiency Trade-Offs in Unsupervised Pre-Training for Speech Recognition. 7667-7671 - Yosuke Higuchi, Niko Moritz, Jonathan Le Roux, Takaaki Hori:
Advancing Momentum Pseudo-Labeling with Conformer and Initialization Strategy. 7672-7676 - Zhehuai Chen, Yu Zhang, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro J. Moreno, Gary Wang:
Tts4pretrain 2.0: Advancing the use of Text and Speech in ASR Pretraining with Consistency and Contrastive Losses. 7677-7681 - Ting-Yao Hu, Mohammadreza Armandpour, Ashish Shrivastava, Jen-Hao Rick Chang, Hema Koppula, Oncel Tuzel:
SYNT++: Utilizing Imperfect Synthetic Data to Improve Speech Recognition. 7682-7686 - Loren Lugosch, Tatiana Likhomanenko, Gabriel Synnaeve, Ronan Collobert:
Pseudo-Labeling for Massively Multilingual Speech Recognition. 7687-7691 - Dongpeng Ma, Yiwen Wang, Liqiang He, Mingjie Jin, Dan Su, Dong Yu:
DP-DWA: Dual-Path Dynamic Weight Attention Network With Streaming DFSMN-SAN For Automatic Speech Recognition. 7692-7696 - Neil Scheidwasser-Clow, Mikolaj Kegler, Pierre Beckmann, Milos Cernak:
SERAB: A Multi-Lingual Benchmark for Speech Emotion Recognition. 7697-7701 - Tiantian Feng, Hanieh Hashemi, Murali Annavaram, Shrikanth S. Narayanan:
Enhancing Privacy Through Domain Adaptive Noise Injection For Speech Emotion Recognition. 7702-7706 - Heran Zhang, Masato Mimura, Tatsuya Kawahara, Kenkichi Ishizuka:
Selective Multi-Task Learning For Speech Emotion Recognition Using Corpora Of Different Styles. 7707-7711 - Yuxuan Xi, Yan Song, Li-Rong Dai, Ian McLoughlin, Lin Liu:
Frontend Attributes Disentanglement for Speech Emotion Recognition. 7712-7716 - Huang-Cheng Chou, Wei-Cheng Lin, Chi-Chun Lee, Carlos Busso:
Exploiting Annotators' Typed Description of Emotion Perception to Maximize Utilization of Ratings for Speech Emotion Recognition. 7717-7721 - Andrew Koh, Fuzhao Xue, Chng Eng Siong:
Automated Audio Captioning Using Transfer Learning and Reconstruction Latent Space Similarity Regularization. 7722-7726 - Puyuan Peng, David Harwath:
Fast-Slow Transformer for Visually Grounding Speech. 7727-7731 - Ankit P. Shah, Shijie Geng, Peng Gao, Anoop Cherian, Takaaki Hori, Tim K. Marks, Jonathan Le Roux, Chiori Hori:
Audio-Visual Scene-Aware Dialog and Reasoning Using Audio-Visual Transformers with Joint Student-Teacher Learning. 7732-7736 - Jijun Shi, Shanshe Wang, Ronggang Wang, Siwei Ma:
AIMNet: Adaptive Image-Tag Merging Network For Automatic Medical Report Generation. 7737-7741 - David Xu, David Harwath:
Adversarial Input Ablation for Audio-Visual Learning. 7742-7746 - Jiudong Yang, Peiying Wang, Yi Zhu, Mingchao Feng, Meng Chen, Xiaodong He:
Gated Multimodal Fusion with Contrastive Learning for Turn-Taking Prediction in Human-Robot Dialogue. 7747-7751 - Andreas Jonas Fuglsig, Jan Østergaard, Jesper Jensen, Lars Søndergaard Bertelsen, Peter Mariager, Zheng-Hua Tan:
Joint Far- and Near-End Speech Intelligibility Enhancement Based on the Approximated Speech Intelligibility Index. 7752-7756 - Heming Wang, Xueliang Zhang, DeLiang Wang:
Attention-Based Fusion for Bone-Conducted and Air-Conducted Speech Enhancement in the Complex Domain. 7757-7761 - Xu Zhang, Lianwu Chen, Xiguang Zheng, Xinlei Ren, Chen Zhang, Liang Guo, Bing Yu:
A Two-Step Backward Compatible Fullband Speech Enhancement System. 7762-7766 - Shubo Lv, Yihui Fu, Mengtao Xing, Jiayao Sun, Lei Xie, Jun Huang, Yannan Wang, Tao Yu:
S-DCCRN: Super Wide Band DCCRN with Learnable Complex Feature for Speech Enhancement. 7767-7771 - Reza Lotfidereshgi, Philippe Gournay:
Cognitive Coding Of Speech. 7772-7776 - Ju Lin, Kaustubh Kalgaonkar, Qing He, Xin Lei:
Speech Enhancement for Low Bit Rate Speech Codec. 7777-7781 - Jinchuan Tian, Jianwei Yu, Chao Weng, Shi-Xiong Zhang, Dan Su, Dong Yu, Yuexian Zou:
Consistent Training and Decoding for End-to-End Speech Recognition Using Lattice-Free MMI. 7782-7786 - Jahn Heymann, Egor Lakomkin, Leif Rädel:
Being Greedy Does Not Hurt: Sampling Strategies for End-To-End Speech Recognition. 7787-7791 - Zeyu Zhao, Peter Bell:
Investigating Sequence-Level Normalisation For CTC-Like End-to-End ASR. 7792-7796 - Yosuke Higuchi, Keita Karube, Tetsuji Ogawa, Tetsunori Kobayashi:
Hierarchical Conditional End-to-End ASR with CTC and Multi-Granular Subword Units. 7797-7801 - Wei Wang, Shuo Ren, Yao Qian, Shujie Liu, Yu Shi, Yanmin Qian, Michael Zeng:
Optimizing Alignment of Speech and Language Latent Spaces for End-To-End Speech Recognition and Understanding. 7802-7806 - Yizhou Peng, Jicheng Zhang, Haihua Xu, Hao Huang, Eng Siong Chng:
Minimum Word Error Training For Non-Autoregressive Transformer-Based Code-Switching ASR. 7807-7811 - Rui Liu, Zheng Lin, Peng Fu, Yuanxin Liu, Weiping Wang:
Connecting Targets via Latent Topics And Contrastive Learning: A Unified Framework For Robust Zero-Shot and Few-Shot Stance Detection. 7812-7816 - Cai Ke, Qingyu Xiong, Chao Wu, Zikai Liao, Hualing Yi:
Prior-BERT and Multi-Task Learning for Target-Aspect-Sentiment Joint Detection. 7817-7821 - Huishan Ji, Zheng Lin, Peng Fu, Weiping Wang:
Cross-Target Stance Detection Via Refined Meta-Learning. 7822-7826 - Xuefeng Li, Hao Lei, Liwen Wang, Guanting Dong, Jinzheng Zhao, Jiachi Liu, Weiran Xu, Chunyun Zhang:
A Robust Contrastive Alignment Method for Multi-Domain Text Classification. 7827-7831 - Ruixue Lian, Che-Wei Huang, Yuqing Tang, Qilong Gu, Chengyuan Ma, Chenlei Guo:
Incremental User Embedding Modeling for Personalized Text Classification. 7832-7836 - Sahar Sadrizadeh, Ljiljana Dolamic, Pascal Frossard:
Block-Sparse Adversarial Attack to Fool Transformer-Based Text Classifiers. 7837-7841 - Hyun Joon Park, Byung Ha Kang, Wooseok Shin, Jin Sob Kim, Sung Won Han:
MANNER: Multi-View Attention Network For Noise Erasure. 7842-7846 - Guochen Yu, Andong Li, Chengshi Zheng, Yinuo Guo, Yutian Wang, Hui Wang:
Dual-Branch Attention-In-Attention Transformer for Single-Channel Speech Enhancement. 7847-7851 - Qiquan Zhang, Qi Song, Zhaoheng Ni, Aaron Nicolson, Haizhou Li:
Time-Frequency Attention for Monaural Speech Enhancement. 7852-7856 - Jun Chen, Zilin Wang, Deyi Tuo, Zhiyong Wu, Shiyin Kang, Helen Meng:
FullSubNet+: Channel Attention Fullsubnet with Complex Spectrograms for Speech Enhancement. 7857-7861 - Heming Wang, DeLiang Wang:
Cross-Domain Speech Enhancement with a Neural Cascade Architecture. 7862-7866 - Zhifeng Kong, Wei Ping, Ambrish Dantrey, Bryan Catanzaro:
Speech Denoising in the Waveform Domain With Self-Attention. 7867-7871 - Jing Pan, Tao Lei, Kwangyoun Kim, Kyu Jeong Han, Shinji Watanabe:
SRU++: Pioneering Fast Recurrence with Attention for Speech Recognition. 7872-7876 - Prabhat Pandey, Sergio Duarte Torres, Ali Orkan Bayer, Ankur Gandhe, Volker Leutnant:
Lattention: Lattice-Attention in ASR Rescoring. 7877-7881 - Binghuai Lin, Liyuan Wang:
Learning Acoustic Frame Labeling for Phoneme Segmentation with Regularized Attention Mechanism. 7882-7886 - Nilaksh Das, Duen Horng Chau, Monica Sunkara, Sravan Bodapati, Dhanush Bekal, Katrin Kirchhoff:
Listen, Know and Spell: Knowledge-Infused Subword Modeling for Improving ASR Performance of OOV Named Entities. 7887-7891 - Chaitanya Narisetty, Emiru Tsunoo, Xuankai Chang, Yosuke Kashiwagi, Michael Hentschel, Shinji Watanabe:
Joint Speech Recognition and Audio Captioning. 7892-7896 - Daisy Stanton, Matt Shannon, Soroosh Mariooryad, R. J. Skerry-Ryan, Eric Battenberg, Tom Bagby, David Kao:
Speaker Generation. 7897-7901 - Adam Gabrys, Goeric Huybrechts, Manuel Sam Ribeiro, Chung-Ming Chien, Julian Roth, Giulia Comini, Roberto Barra-Chicote, Bartek Perz, Jaime Lorenzo-Trueba:
Voice Filter: Few-Shot Text-to-Speech Speaker Adaptation Using Voice Conversion as a Post-Processing Module. 7902-7906 - Li-Wei Chen, Alexander Rudnicky:
Fine-Grained Style Control In Transformer-Based Text-To-Speech Synthesis. 7907-7911 - Cheng Gong, Longbiao Wang, Zhenhua Ling, Ju Zhang, Jianwu Dang:
Using Multiple Reference Audios and Style Embedding Constraints for Speech Synthesis. 7912-7916 - Jingbei Li, Yi Meng, Chenyi Li, Zhiyong Wu, Helen Meng, Chao Weng, Dan Su:
Enhancing Speaking Styles in Conversational Text-to-Speech Synthesis with Graph-Based Multi-Modal Context Modeling. 7917-7921 - Shun Lei, Yixuan Zhou, Liyang Chen, Zhiyong Wu, Shiyin Kang, Helen Meng:
Towards Expressive Speaking Style Modelling with Hierarchical Context Information for Mandarin Speech Synthesis. 7922-7926 - Suwon Shon, Ankita Pasad, Felix Wu, Pablo Brusco, Yoav Artzi, Karen Livescu, Kyu Jeong Han:
SLUE: New Benchmark Tasks For Spoken Language Understanding Evaluation on Natural Speech. 7927-7931 - Samuel Thomas, Hong-Kwang Jeff Kuo, Brian Kingsbury, George Saon:
Towards Reducing the Need for Speech Training Data to Build Spoken Language Understanding Systems. 7932-7936 - Feilong Chen, Xiuyi Chen, Shuang Xu, Bo Xu:
Improving Cross-Modal Understanding in Visual Dialog Via Contrastive Learning. 7937-7941 - Rongyao Wang, Shoujin Wang, Wenpeng Lu, Xueping Peng:
News Recommendation Via Multi-Interest News Sequence Modelling. 7942-7946 - Beiduo Chen, Wu Guo, Bin Gu, Quan Liu, Yongchao Wang:
Multi-Level Contrastive Learning for Cross-Lingual Alignment. 7947-7951 - Chang-Ting Chu, Mahdin Rohmatillah, Ching-Hsien Lee, Jen-Tzung Chien:
Augmentation Strategy Optimization for Language Understanding. 7952-7956 - Sreekanth Sankala, B. Shaik Mohammad Rafi, K. Sri Rama Murty:
Multi-Feature Integration for Speaker Embedding Extraction. 7957-7961 - Xuechen Liu, Md. Sahidullah, Tomi Kinnunen:
Learnable Nonlinear Compression for Robust Speaker Verification. 7962-7966 - Nik Vaessen, David A. van Leeuwen:
Fine-Tuning Wav2Vec2 for Speaker Recognition. 7967-7971 - Hye-Jin Shim, Jungwoo Heo, Jae-Han Park, Ga-Hui Lee, Ha-Jin Yu:
Graph Attentive Feature Aggregation for Text-Independent Speaker Verification. 7972-7976 - Ladislav Mosner, Oldrich Plchot, Lukás Burget, Jan Honza Cernocký:
Multisv: Dataset for Far-Field Multi-Channel Speaker Verification. 7977-7981 - Ladislav Mosner, Oldrich Plchot, Lukás Burget, Jan Honza Cernocký:
Multi-Channel Speaker Verification with Conv-Tasnet Based Beamformer. 7982-7986 - Kevin Ding, Martin Zong, Jiakui Li, Baoxiang Li:
LETR: A Lightweight and Efficient Transformer for Keyword Spotting. 7987-7991 - Yongjie Lv, Longbiao Wang, Meng Ge, Sheng Li, Chenchen Ding, Lixin Pan, Yuguang Wang, Jianwu Dang, Kiyoshi Honda:
Compressing Transformer-Based ASR Model by Task-Driven Loss and Attention-Based Multi-Level Feature Distillation. 7992-7996 - Dushyant Sharma, Rong Gong, James Fosburgh, Stanislav Yu. Kruchinin, Patrick A. Naylor, Ljubomir Milanovic:
Spatial Processing Front-End for Distant ASR Exploiting Self-Attention Channel Combinator. 7997-8001 - Nils-Philipp Wynands, Wilfried Michel, Jan Rosendahl, Ralf Schlüter, Hermann Ney:
Efficient Sequence Training of Attention Models Using Approximative Recombination. 8002-8006 - Jingbei Li, Yi Meng, Zhiyong Wu, Helen Meng, Qiao Tian, Yuping Wang, Yuxuan Wang:
Neufa: Neural Network Based End-to-End Forced Alignment with Bidirectional Attention Mechanism. 8007-8011 - Lahiru Samarakoon, Tsun-Yat Leung:
Conformer-Based Speech Recognition with Linear Nyström Attention and Rotary Position Embedding. 8012-8016 - Jilong Wu, Adam Polyak, Yaniv Taigman, Jason Fong, Prabhav Agrawal, Qing He:
Multilingual Text-To-Speech Training Using Cross Language Voice Conversion And Self-Supervised Learning Of Speech Representations. 8017-8021 - Mu Yang, Shaojin Ding, Tianlong Chen, Tong Wang, Zhangyang Wang:
Towards Lifelong Learning of Multilingual Text-to-Speech Synthesis. 8022-8026 - Yibin Zheng, Zewang Zhang, Xinhui Li, Wenchao Su, Li Lu:
Zero-Shot Cross-Lingual Transfer Using Multi-Stream Encoder and Efficient Speaker Representation. 8027-8031 - Junchen Lu, Berrak Sisman, Rui Liu, Mingyang Zhang, Haizhou Li:
Visualtts: TTS with Accurate Lip-Speech Synchronization for Automatic Voice Over. 8032-8036 - Johanes Effendi, Yogesh Virkar, Roberto Barra-Chicote, Marcello Federico:
Duration Modeling of Neural TTS for Automatic Dubbing. 8037-8041 - Ravindra Yadav, Ashish Sardana, Vinay P. Namboodiri, Rajesh M. Hegde:
Learning to Predict Speech in Silent Videos Via Audiovisual Analogy. 8042-8046 - Yong Zhang, Zhitao Li, Jianzong Wang, Ning Cheng, Jing Xiao:
Self-Attention for Incomplete Utterance Rewriting. 8047-8051 - Wangjie Jiang, Siheng Li, Jiayi Li, Yujiu Yang:
Multi-Turn Incomplete Utterance Restoration As Object Detection. 8052-8056 - Yuqiang Xie, Yue Hu, Luxi Xing, Yunpeng Li, Wei Peng, Ping Guo:
CLseg: Contrastive Learning of Story Ending Generation. 8057-8061 - Qianren Mao, Jianxin Li, JiaZheng Wang, Xi Li, Peng Hao, Lihong Wang, Zheng Wang:
Explicitly Modeling Importance and Coherence for Timeline Summarization. 8062-8066 - Gianluca Vico, Jan Niehues:
TED Talk Teaser Generation with Pre-Trained Models. 8067-8071 - Roshan Sharma, Shruti Palaskar, Alan W. Black, Florian Metze:
End-to-End Speech Summarization Using Restricted Self-Attention. 8072-8076 - Wei Xia, Han Lu, Quan Wang, Anshuman Tripathi, Yiling Huang, Ignacio López-Moreno, Hasim Sak:
Turn-to-Diarize: Online Speaker Diarization Constrained by Transformer Transducer Speaker Turn Detection. 8077-8081 - Naoyuki Kanda, Xiong Xiao, Yashesh Gaur, Xiaofei Wang, Zhong Meng, Zhuo Chen, Takuya Yoshioka:
Transcribe-to-Diarize: Neural Speaker Diarization for Unlimited Number of Speakers Using End-to-End Speaker-Attributed ASR. 8082-8086 - Hang Su, Danyang Zhao, Long Dang, Minglei Li, Xixin Wu, Xunying Liu, Helen Meng:
A Multitask Learning Framework for Speaker Change Detection with Content Information from Unsupervised Speech Decomposition. 8087-8091 - Aparna Khare, Eunjung Han, Yuguang Yang, Andreas Stolcke:
ASR-Aware End-to-End Neural Diarization. 8092-8096 - Siqi Zheng, Hongbin Suo:
Reformulating Speaker Diarization As Community Detection With Emphasis On Topological Structure. 8097-8101 - Nithin Rao Koluguri, Taejin Park, Boris Ginsburg:
TitaNet: Neural Model for Speaker Representation with 1D Depth-Wise Separable Convolutions and Global Context. 8102-8106 - Ke Hu, Tara N. Sainath, Arun Narayanan, Ruoming Pang, Trevor Strohman:
Transducer-Based Streaming Deliberation for Cascaded Encoders. 8107-8111 - Tara N. Sainath, Yanzhang He, Arun Narayanan, Rami Botros, Weiran Wang, David Qiu, Chung-Cheng Chiu, Rohit Prabhavalkar, Alexander Gruenstein, Anmol Gulati, Bo Li, David Rybach, Emmanuel Guzman, Ian McGraw, James Qin, Krzysztof Choromanski, Qiao Liang, Robert David, Ruoming Pang, Shuo-Yiin Chang, Trevor Strohman, W. Ronny Huang, Wei Han, Yonghui Wu, Yu Zhang:
Improving The Latency And Quality Of Cascaded Encoders. 8112-8116 - Chao Zhang, Bo Li, Zhiyun Lu, Tara N. Sainath, Shuo-Yiin Chang:
Improving the Fusion of Acoustic and Text Representations in RNN-T. 8117-8121 - Vinit Unni, Shreya Khare, Ashish R. Mittal, Preethi Jyothi, Sunita Sarawagi, Samarth Bharadwaj:
Adaptive Discounting of Implicit Language Models in RNN-Transducers. 8122-8126 - Samuel Thomas, Brian Kingsbury, George Saon, Hong-Kwang Jeff Kuo:
Integrating Text Inputs for Training and Adapting RNN Transducer ASR Models. 8127-8131 - Xie Chen, Zhong Meng, Sarangarajan Parthasarathy, Jinyu Li:
Factorized Neural Transducer for Efficient Language Model Adaptation. 8132-8136 - Junhua Ma, Jiajun Li, Yuxuan Liu, Shangbo Zhou, Xue Li:
Integrating Dependency Tree into Self-Attention for Sentence Representation. 8137-8141 - Itzik Malkiel, Dvir Ginzburg, Oren Barkan, Avi Caciularu, Yoni Weill, Noam Koenigstein:
MetricBERT: Text Representation Learning Via Self-Supervised Triplet Training. 8142-8146 - Tuan Manh Lai, Trung Bui, Doo Soon Kim:
End-To-End Neural Coreference Resolution Revisited: A Simple Yet Effective Baseline. 8147-8151 - Junyu Lu, Pingjian Zhang:
Local Context Interaction-Aware Glyph-Vectors for Chinese Sequence Tagging. 8152-8156 - Mithilesh Vaidya, Kamini Sabu, Preeti Rao:
Deep Learning for Prominence Detection In Children's Read Speech. 8157-8161 - Hagai Aronowitz, Itai Gat, Edmilson da Silva Morais, Weizhong Zhu, Ron Hoory:
Towards A Common Speech Analysis Engine. 8162-8166 - Jian Zhu, Cong Zhang, David Jurgens:
Phone-to-Audio Alignment without Text: A Semi-Supervised Approach. 8167-8171 - Huda Alsofyani, Alessandro Vinciarelli:
Attachment Recognition in School-Age Children: A Multimodal Approach Based on Language and Paralanguage Analysis. 8172-8176 - Zhizhong Ma, Yuanhang Qiu, Feng Hou, Ruili Wang, Joanna Ting Wai Chu, Chris Bullen:
Determining the Best Acoustic Features for Smoker Identification. 8177-8181 - Ephrem Tibebe Mekonnen, Alessio Brutti, Daniele Falavigna:
End-to-End Low Resource Keyword Spotting Through Character Recognition and Beam-Search Re-Scoring. 8182-8186 - Anastasia Kuznetsova, Anurag Kumar, Jennifer Drexler Fox, Francis M. Tyers:
Curriculum Optimization for Low-Resource Speech Recognition. 8187-8191 - Zhikai Zhou, Wei Wang, Wangyou Zhang, Yanmin Qian:
Exploring Effective Data Utilization for Low-Resource Speech Recognition. 8192-8196 - Haichuan Yang, Yuan Shangguan, Dilin Wang, Meng Li, Pierce Chuang, Xiaohui Zhang, Ganesh Venkatesh, Ozlem Kalinli, Vikas Chandra:
Omni-Sparsity DNN: Fast Sparsity Optimization for On-Device Streaming E2E ASR Via Supernet. 8197-8201 - Guan-Ting Lin, Chan-Jan Hsu, Da-Rong Liu, Hung-Yi Lee, Yu Tsao:
Analyzing The Robustness of Unsupervised Speech Recognition. 8202-8206 - Gasper Begus, Alan Zhou:
Interpreting Intermediate Convolutional Layers In Unsupervised Acoustic Word Classification. 8207-8211 - Sicheng Yu, Hao Zhang, Wei Jing, Jing Jiang:
Context Modeling with Evidence Filter for Multiple Choice Question Answering. 8212-8216 - Zihao Zhu:
From Shallow to Deep: Compositional Reasoning over Graphs for Visual Question Answering. 8217-8221 - Liang Wen, Houfeng Wang, Dehong Ma, Jun Fan, Yingwei Luo, Xiaolin Wang, Daiting Shi, Zhicong Cheng, Dawei Yin:
A Question-Oriented Propagation Network for News Reading Comprehension. 8222-8226 - Lu Ma, Peng Zhang, Dan Luo, Xi Zhu, Meilin Zhou, Qi Liang, Bin Wang:
Syntax-Based Graph Matching for Knowledge Base Question Answering. 8227-8231 - Dan Su, Peng Xu, Pascale Fung:
QA4QG: Using Question Answering to Constrain Multi-Hop Question Generation. 8232-8236 - Shuang Li, Xuming Hu, Li Lin, Lijie Wen:
Pair-Level Supervised Contrastive Learning for Natural Language Inference. 8237-8241 - Peter Birkholz, P. Häsner, Steffen Kürbis:
Acoustic Comparison of Physical Vocal Tract Models with Hard and Soft Walls. 8242-8246 - Anwesha Roy, Varun Belagali, Prasanta Kumar Ghosh:
An Error Correction Scheme for Improved Air-Tissue Boundary in Real-Time MRI Video for Speech Production. 8247-8251 - Marc-Antoine Georges, Julien Diard, Laurent Girin, Jean-Luc Schwartz, Thomas Hueber:
Repeat after Me: Self-Supervised Learning of Acoustic-to-Articulatory Mapping by Vocal Imitation. 8252-8256 - Xiang Li, Yifan Sun, Xihong Wu, Jing Chen:
Multi-Speaker Pitch Tracking via Embodied Self-Supervised Learning. 8257-8261 - Yunsheng Xiong, Kele Xu, Meng Jiang, Liang Cheng, Yong Dou, Jinjia Wang:
Improving the Classification of Phonetic Segments from Raw Ultrasound Using Self-Supervised Learning and Hard Example Mining. 8262-8266 - Aravind Illa, Aanish Nair, Prasanta Kumar Ghosh:
The Impact of Cross Language on Acoustic-to-Articulatory Inversion and Its Influence on Articulatory Speech Synthesis. 8267-8271 - Mohan Li, Shucong Zhang, Catalin Zorila, Rama Doddipatla:
Transformer-Based Streaming ASR with Cumulative Attention. 8272-8276 - Yangyang Shi, Chunyang Wu, Dilin Wang, Alex Xiao, Jay Mahadeokar, Xiaohui Zhang, Chunxi Liu, Ke Li, Yuan Shangguan, Varun Nagaraja, Ozlem Kalinli, Mike Seltzer:
Streaming Transformer Transducer based Speech Recognition Using Non-Causal Convolution. 8277-8281 - Takafumi Moriya, Takanori Ashihara, Atsushi Ando, Hiroshi Sato, Tomohiro Tanaka, Kohei Matsuura, Ryo Masumura, Marc Delcroix, Takahiro Shinozaki:
Hybrid RNN-T/Attention-Based Streaming ASR with Triggered Chunkwise Attention and Dual Internal Language Model Integration. 8282-8286 - Emiru Tsunoo, Chaitanya Narisetty, Michael Hentschel, Yosuke Kashiwagi, Shinji Watanabe:
Run-and-Back Stitch Search: Novel Block Synchronous Decoding For Streaming Encoder-Decoder ASR. 8287-8291 - Yonghe Wang, Rui Liu, Feilong Bao, Hui Zhang, Guanglai Gao:
Alignment-Learning Based Single-Step Decoding for Accurate and Fast Non-Autoregressive Speech Recognition. 8292-8296 - Bolaji Yusuf, Ankur Gandhe, Alex Sokolov:
Usted: Improving ASR with a Unified Speech and Text Encoder-Decoder. 8297-8301 - Fengyu Yang, Jian Luan, Yujun Wang:
Improving Emotional Speech Synthesis by Using SUS-Constrained VAE and Text Encoder Aggregation. 8302-8306 - Mateusz Lajszczak, Animesh Prasad, Arent van Korlaar, Bajibabu Bollepalli, Antonio Bonafonte, Arnaud Joly, Marco Nicolis, Alexis Moinet, Thomas Drugman, Trevor Wood, Elena Sokolova:
Distribution Augmentation for Low-Resource Expressive Text-To-Speech. 8307-8311 - Tobias Cornille, Fengna Wang, Jessa Bekker:
Interactive Multi-Level Prosody Control for Expressive Speech Synthesis. 8312-8316 - Haitong Zhang, Yue Lin:
Improve Few-Shot Voice Cloning Using Multi-Modal Learning. 8317-8321 - Dongyang Dai, Yuanzhe Chen, Li Chen, Ming Tu, Lu Liu, Rui Xia, Qiao Tian, Yuping Wang, Yuxuan Wang:
Cloning One's Voice Using Very Limited Data in the Wild. 8322-8326 - Rui Li, Dong Pu, Minnie Huang, Bill Huang:
UNET-TTS: Improving Unseen Speaker and Style Transfer in One-Shot Voice Cloning. 8327-8331 - Yiqi Tong, Fuzhen Zhuang, Deqing Wang, Haochao Ying, Binling Wang:
Improving Biomedical Named Entity Recognition with a Unified Multi-Task MRC Framework. 8332-8336 - Xuhui Sui, Kehui Song, Baohang Zhou, Ying Zhang, Xiaojie Yuan:
A Multi-Task Learning Framework for Chinese Medical Procedure Entity Normalization. 8337-8341 - Rui Wang, Ricardo Henao:
Wasserstein Cross-Lingual Alignment For Named Entity Recognition. 8342-8346 - Luchen Liu, Xixun Lin, Peng Zhang, Lei Zhang, Bin Wang:
Learning Common Dependency Structure for Unsupervised Cross-Domain NER. 8347-8351 - Boli Chen, Guangwei Xu, Xiaobin Wang, Pengjun Xie, Meishan Zhang, Fei Huang:
AISHELL-NER: Named Entity Recognition from Chinese Speech. 8352-8356 - Alexander Blatt, Martin Kocour, Karel Veselý, Igor Szöke, Dietrich Klakow:
Call-Sign Recognition and Understanding for Noisy Air-Traffic Transcripts Using Surveillance Information. 8357-8361 - Weiqing Wang, Ming Li:
Incorporating End-to-End Framework Into Target-Speaker Voice Activity Detection. 8362-8366 - Youngki Kwon, Hee-Soo Heo, Jee-Weon Jung, You Jin Kim, Bong-Jin Lee, Joon Son Chung:
Multi-Scale Speaker Embedding-Based Graph Attention Networks For Speaker Diarisation. 8367-8371 - Chunlei Zhang, Jiatong Shi, Chao Weng, Meng Yu, Dong Yu:
Towards End-to-End Speaker Diarization with Generalized Neural Speaker Clustering. 8372-8376 - Yechan Yu, Dongkeon Park, Hong Kook Kim:
Auxiliary Loss of Transformer with Residual Connection for End-to-End Speaker Diarization. 8377-8381 - Keisuke Kinoshita, Marc Delcroix, Tomoharu Iwata:
Tight Integration Of Neural- And Clustering-Based Diarization Through Deep Unfolding Of Infinite Gaussian Mixture Model. 8382-8386 - Shutong Niu, Jun Du, Lei Sun, Chin-Hui Lee:
Improving Separation-Based Speaker Diarization Via Iterative Model Refinement And Speaker Embedding Based Post-Processing. 8387-8391 - Mun-Hak Lee, Joon-Hyuk Chang:
Knowledge Distillation from Language Model to Acoustic Model: A Hierarchical Multi-Task Learning Approach. 8392-8396 - Shaoshi Ling, Chen Shen, Meng Cai, Zejun Ma:
Improving Pseudo-Label Training For End-To-End Speech Recognition Using Gradient Mask. 8397-8401 - Ilya Sklyar, Anna Piunova, Xianrui Zheng, Yulan Liu:
Multi-Turn RNN-T for Streaming Recognition of Multi-Party Speech. 8402-8406 - Wei Zhou, Zuoyun Zheng, Ralf Schlüter, Hermann Ney:
On Language Model Integration for RNN Transducer Based Speech Recognition. 8407-8411 - Anastasios Alexandridis, Grant P. Strimel, Ariya Rastrow, Pavel Kveton, Jon Webb, Maurizio Omologo, Siegfried Kunzmann, Athanasios Mouchtaris:
Caching Networks: Capitalizing on Common Speech for ASR. 8412-8416 - Lucas Ondel, Léa-Marie Lam-Yee-Mui, Martin Kocour, Caio Filippo Corro, Lukás Burget:
GPU-Accelerated Forward-Backward Algorithm with Application to Lattice-Free MMI. 8417-8421 - Shoule Wu, Ziqiang Shi:
ItôWave: Itô Stochastic Differential Equation is all You Need for Wave Generation. 8422-8426 - Hiroki Kanagawa, Yusuke Ijima:
Multi-Sample Subband WaveRNN Via Multivariate Gaussian. 8427-8431 - Zehua Chen, Xu Tan, Ke Wang, Shifeng Pan, Danilo P. Mandic, Lei He, Sheng Zhao:
Infergrad: Improving Diffusion Models for Vocoder by Considering Inference in Training. 8432-8436 - Jean-Marc Valin, Umut Isik, Paris Smaragdis, Arvindh Krishnaswamy:
Neural Speech Synthesis on a Shoestring: Improving the Efficiency of Lpcnet. 8437-8441 - Erica Cooper, Wen-Chin Huang, Tomoki Toda, Junichi Yamagishi:
Generalization Ability of MOS Prediction Networks. 8442-8446 - Cheng-I Jeff Lai, Erica Cooper, Yang Zhang, Shiyu Chang, Kaizhi Qian, Yi-Lun Liao, Yung-Sung Chuang, Alexander H. Liu, Junichi Yamagishi, David D. Cox, James R. Glass:
On the Interplay between Sparsity, Naturalness, Intelligibility, and Prosody in Speech Synthesis. 8447-8451 - Federico Tavella, Aphrodite Galata, Angelo Cangelosi:
Phonology Recognition in American Sign Language. 8452-8456 - Maria Parelli, Katerina Papadimitriou, Gerasimos Potamianos, Georgios Pavlakos, Petros Maragos:
Spatio-Temporal Graph Convolutional Networks for Continuous Sign Language Recognition. 8457-8461 - Thomas Fouts, Ali Hindy, Chris Tanner:
Sensors to Sign Language: A Natural Approach to Equitable Communication. 8462-8466 - Alexandros Koumparoulis, Gerasimos Potamianos:
Accurate and Resource-Efficient Lipreading with Efficientnetv2 and Transformers. 8467-8471 - Pingchuan Ma, Yujiang Wang, Stavros Petridis, Jie Shen, Maja Pantic:
Training Strategies for Improved Lip-Reading. 8472-8476 - Sanjana Sankar, Denis Beautemps, Thomas Hueber:
Multistream Neural Architectures for Cued Speech Recognition Using a Pre-Trained Visual Feature Extractor and Constrained CTC Decoding. 8477-8481 - Zohreh Mostaani, RaviShankar Prasad, Bogdan Vlasenko, Mathew Magimai-Doss:
Modeling of Pre-Trained Neural Network Embeddings Learned From Raw Waveform for COVID-19 Infection Detection. 8482-8486 - Abinay Reddy Naini, Bhavuk Singhal, Prasanta Kumar Ghosh:
Dual Attention Pooling Network for Recording Device Classification Using Neutral and Whispered Speech. 8487-8491 - Keiko Ochi, Nobutaka Ono, Keiho Owada, Miho Kuroda, Shigeki Sagayama, Hidenori Yamasue:
Entrainment Analysis for Assessment of Autistic Speech Prosody Using Bottleneck Features of Deep Neural Network. 8492-8496 - Atsushi Ando, Yumiko Murata, Ryo Masumura, Satoshi Suzuki, Naoki Makishima, Takafumi Moriya, Takanori Ashihara, Hiroshi Sato:
Customer Satisfaction Estimation Using Unsupervised Representation Learning with Multi-Format Prediction Loss. 8497-8501 - José Vicente Egas López, Gábor Kiss, Dávid Sztahó, Gábor Gosztolya:
Automatic Assessment of the Degree of Clinical Depression from Speech Using X-Vectors. 8502-8506 - Ya Li, Mingyue Niu, Ziping Zhao, Jianhua Tao:
Automatic Depression Level Assessment from Speech By Long-Term Global Information Embedding. 8507-8511 - Yotaro Kubo, Shigeki Karita, Michiel Bacchiani:
Knowledge Transfer from Large-Scale Pretrained Language Models to End-To-End Speech Recognizers. 8512-8516 - Keqi Deng, Songjun Cao, Yike Zhang, Long Ma, Gaofeng Cheng, Ji Xu, Pengyuan Zhang:
Improving CTC-Based Speech Recognition Via Knowledge Transferring from Pre-Trained Language Models. 8517-8521 - Keqi Deng, Zehui Yang, Shinji Watanabe, Yosuke Higuchi, Gaofeng Cheng, Pengyuan Zhang:
Improving Non-Autoregressive End-to-End Speech Recognition with Pre-Trained Acoustic and Language Models. 8522-8526 - Xiaoyu Yang, Qiujia Li, Philip C. Woodland:
Knowledge Distillation for Neural Transducers from Large Self-Supervised Pre-Trained Models. 8527-8531 - Minglun Han, Linhao Dong, Zhenlin Liang, Meng Cai, Shiyu Zhou, Zejun Ma, Bo Xu:
Improving End-to-End Contextual Speech Recognition with Fine-Grained Contextual Knowledge Selection. 8532-8536 - Kanthashree Mysore Sathyendra, Thejaswi Muniyappa, Feng-Ju Chang, Jing Liu, Jinru Su, Grant P. Strimel, Athanasios Mouchtaris, Siegfried Kunzmann:
Contextual Adapters for Personalized Speech Recognition in Neural Transducers. 8537-8541 - Xiaohui Song, Liangjun Zang, Rong Zhang, Songlin Hu, Longtao Huang:
Emotionflow: Capture the Dialogue Level Emotion Transitions. 8542-8546 - Yukun Ma, Bin Ma:
Multimodal Sentiment Analysis on Unaligned Sequences Via Holographic Embedding. 8547-8551 - Amruta Saraf, Elie Khoury:
Distribution Learning for Age Estimation from Speech. 8552-8556 - Olivier Zhang, Nicolas Gengembre, Olivier Le Blouch, Damien Lolive:
Dispeech: A Synthetic Toy Dataset for Speech Disentangling. 8557-8561 - Jiancheng Gui, Yikai Li, Kai Chen, Joanna Siebert, Qingcai Chen:
End-to-End ASR-Enhanced Neural Network for Alzheimer's Disease Diagnosis. 8562-8566 - Jingyao Wu, Ting Dang, Vidhyasaharan Sethu, Eliathamby Ambikairajah:
A Novel Sequential Monte Carlo Framework for Predicting Ambiguous Emotion States. 8567-8571 - Sei Ueno, Tatsuya Kawahara:
Phone-Informed Refinement of Synthesized Mel Spectrogram for Data Augmentation in Speech Recognition. 8572-8576 - Alexander Johnson, Ruchao Fan, Robin Morris, Abeer Alwan:
LPC Augment: an LPC-based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects. 8577-8581 - Yunzheng Zhu, Ruchao Fan, Abeer Alwan:
Towards Better Meta-Initialization with Task Augmentation for Kindergarten-Aged Speech Recognition. 8582-8586 - Chanho Park, Rehan Ahmad, Thomas Hain:
Unsupervised Data Selection for Speech Recognition with Contrastive Loss Ratios. 8587-8591 - Viet Anh Trinh, Hassan Salami Kavaki, Michael I. Mandel:
Importantaug: A Data Augmentation Agent for Speech. 8592-8596 - Matthew Wiesner, Desh Raj, Sanjeev Khudanpur:
Injecting Text and Cross-Lingual Supervision in Few-Shot Learning from Self-Supervised Models. 8597-8601 - Chao-Han Huck Yang, Jun Qi, Samuel Yen-Chi Chen, Yu Tsao, Pin-Yu Chen:
When BERT Meets Quantum Temporal Convolution Learning for Text Classification in Heterogeneous Computing. 8602-8606 - Mohammadreza Noormandipour, Hanchen Wang:
Matching Point Sets with Quantum Circuit Learning. 8607-8611 - Riccardo Di Sipio, Jia-Hong Huang, Samuel Yen-Chi Chen, Stefano Mangini, Marcel Worring:
The Dawn of Quantum Natural Language Processing. 8612-8616 - Mahdi Chehimi, Walid Saad:
Quantum Federated Learning with Quantum Data. 8617-8621 - Samuel Yen-Chi Chen, Shinjae Yoo, Yao-Lung L. Fang:
Quantum Long Short-Term Memory. 8622-8626 - Jun Qi, Javier Tejedor:
Classical-To-Quantum Transfer Learning for Spoken Command Recognition Based on Quantum Neural Networks. 8627-8631 - Yumeng Zhang, Bruno Clerckx:
Waveform Optimization for Wireless Power Transfer with Power Amplifier and Energy Harvester Non-linearities. 8632-8636 - Zi Qin Liew, Yanyu Cheng, Wei Yang Bryan Lim, Dusit Niyato, Chunyan Miao, Sumei Sun:
Economics of Semantic Communication System in Wireless Powered Internet of Things. 8637-8641 - Nikita Shanin, Moritz Garkisch, Amelie Hagelauer, Robert Schober, Laura Cottatellucci:
Optimal Resource Allocation and Beamforming for Two-User Miso WPCNS for a Non-Linear Circuit-Based EH Model (Invited Paper). 8642-8646 - Mingzhe Chen, Yining Wang, H. Vincent Poor:
Performance Optimization for Wireless Semantic Communications over Energy Harvesting Networks. 8647-8651 - Sahar Idrees, Xiaolun Jia, Saud Khan, Salman Durrani, Xiangyun Zhou:
Deep Learning Based Passive Beamforming for IRS-Assisted Monostatic Backscatter Systems. 8652-8656 - Cong Shen, Jing Yang, Jie Xu:
On Federated Learning with Energy Harvesting Clients. 8657-8661 - Xuelu Li, Raja Bala, Vishal Monga:
Structural Prior Models for 3-D Deep Vessel Segmentation. 8662-8666 - Saurav K. Shastri, Rizwan Ahmad, Christopher A. Metzler, Philip Schniter:
Expectation Consistent Plug-and-Play for MRI. 8667-8671 - Thanh Van Nguyen, Gauri Jagatap, Chinmay Hegde:
Inverse Imaging with Generative Priors Via Langevin Dynamics. 8672-8676 - Bahareh Salafian, Eyal Fishel Ben-Knaan, Nir Shlezinger, Sandrine de Ribaupierre, Nariman Farsad:
CNN-Aided Factor Graphs with Estimated Mutual Information Features for Seizure Detection. 8677-8681 - Christopher Khan, Ruud J. G. van Sloun, Brett C. Byram:
Unfolding Model-Based Beamforming for High Quality Ultrasound Imaging. 8682-8686 - Wei Pu, Yonina C. Eldar, Miguel R. D. Rodrigues:
Optimization Guarantees for ISTA and ADMM Based Unfolded Networks. 8687-8691 - Chuang Shi, Mengjie Huang, Huitian Jiang, Huiyong Li:
Integration of Anomaly Machine Sound Detection into Active Noise Control to Shape the Residual Sound. 8692-8696 - Ryosuke Okajima, Yoshinobu Kajikawa, Kohei Oto:
Dual Active Noise Control with Common Sensors. 8697-8701 - Xiaoyi Shen, Dong-Yuan Shi, Woon-Seng Gan:
A Hybrid Approach to Combine Wireless and Earcup Microphones for ANC Headphones with Error Separation Module. 8702-8706 - Huiyuan Sun, Jihui Zhang, Thushara D. Abhayapala, Prasanga N. Samarasinghe:
Spatial Active Noise Control with the Remote Microphone Technique: an Approach with a Moving Higher Order Microphone. 8707-8711 - Junqing Zhang, Liming Shi, Mads Græsbøll Christensen, Wen Zhang, Lijun Zhang, Jingdong Chen:
Robust Pressure Matching with ATF Perturbation Constraints for Sound Field Control. 8712-8716 - Piero Rivera Benois, Reinhild Roden, Matthias Blau, Simon Doclo:
Optimization of a Fixed Virtual Sensing Feedback ANC Controller For In-Ear Headphones with Multiple Loudspeakers. 8717-8721 - Shuangyang Li, Weijie Yuan, Jinhong Yuan, Giuseppe Caire:
On the Potential of Spatially-Spread Orthogonal Time Frequency Space Modulation for ISAC Transmissions. 8722-8726 - Zhen Du, Fan Liu, Zenghui Zhang:
Sensing-Assisted Beam Tracking in V2I Networks: Extended Target Case. 8727-8731 - Xiang Liu, Tianyao Huang, Yimin Liu, Yonina C. Eldar:
Transmit Beamforming with Fixed Covariance for Integrated MIMO Radar and Multiuser Communications. 8732-8736 - Zhiqiang Wei, Fan Liu, Derrick Wing Kwan Ng, Robert Schober:
Safeguarding UAV Networks through Integrated Sensing, Jamming, and Communications. 8737-8741 - Sangeeta Bhattacharjee, Kumar Vijay Mishra, Ramesh Annavajjala, Chandra R. Murthy:
Evaluation of Orthogonal Chirp Division Multiplexing for Automotive Integrated Sensing and Communications. 8742-8746 - Yuanhao Cui, Xiaojun Jing, Junsheng Mu:
Integrated Sensing and Communications Via 5G NR Waveform: Performance Analysis. 8747-8751 - Jie Ding, Eric W. Tramel, Anit Kumar Sahu, Shuang Wu, Salman Avestimehr, Tao Zhang:
Federated Learning Challenges and Opportunities: An Outlook. 8752-8756 - Dhruv Guliani, Lillian Zhou, Changwan Ryu, Tien-Ju Yang, Harry Zhang, Yonghui Xiao, Françoise Beaufays, Giovanni Motta:
Enabling On-Device Training of Speech Recognition Models With Federated Dropout. 8757-8761 - Amirhossein Reisizadeh, Isidoros Tziotis, Hamed Hassani, Aryan Mokhtari, Ramtin Pedarsani:
Adaptive Node Participation for Straggler-Resilient Federated Learning. 8762-8766 - Christophe Dupuy, Tanya G. Roosta, Leo Long, Clement Chung, Rahul Gupta, Salman Avestimehr:
Learnings from Federated Learning in The Real World. 8767-8771 - Zhiyuan Zhao, Gauri Joshi:
A Dynamic Reweighting Strategy For Fair Federated Learning. 8772-8776 - Hasin Us Sami, Basak Güler:
Over-the-Air Personalized Federated Learning. 8777-8781 - Ningning Pan, Jingdong Chen, Jacob Benesty:
DNN Based Multiframe Single-Channel Noise Reduction Filters. 8782-8786 - Yicheng Hsu, Yonghan Lee, Mingsian R. Bai:
Learning-Based Personal Speech Enhancement for Teleconferencing by Exploiting Spatial-Spectral Features. 8787-8791 - Andreas Brendel, Johannes Zeitler, Walter Kellermann:
Manifold Learning-Supported Estimation of Relative Transfer Functions For Spatial Filtering. 8792-8796 - Hanan Beit-On, Moti Lugasi, Lior Madmoni, Anjali Menon, Anurag Kumar, Jacob Donley, Vladimir Tourbabin, Boaz Rafaely:
Audio Signal Processing for Telepresence Based on Wearable Array in Noisy and Dynamic Scenes. 8797-8801 - Sichen Liu, Feiran Yang, Fang Kang, Jun Yang:
A Multi-Task Learning Method for Weakly Supervised Sound Event Detection. 8802-8806 - Nir Raviv, Ofer Schwartz, Sharon Gannot:
Low Resources Online Single-Microphone Speech Enhancement with Harmonic Emphasis. 8807-8811 - Luc Le Magoarou, Taha Yassine, Stéphane Paquelet, Matthieu Crussière:
Deep Learning for Location Based Beamforming with Nlos Channels. 8812-8816 - Sangwoo Park, Osvaldo Simeone:
Predicting Flat-Fading Channels via Meta-Learned Closed-Form Linear Filters and Equilibrium Propagation. 8817-8821 - Kyriakos Stylianopoulos, Nir Shlezinger, Philipp del Hougne, George C. Alexandropoulos:
Deep-Learning-Assisted Configuration of Reconfigurable Intelligent Surfaces in Dynamic Rich-Scattering Environments. 8822-8826 - Dilin Dampahalage, K. B. Shashika Manosha, Nandana Rajatheva, Matti Latva-aho:
Supervised Learning Based Sparse Channel Estimation For RIS Aided Communications. 8827-8831 - Francesco Pezone, Sergio Barbarossa, Paolo Di Lorenzo:
Goal-Oriented Communication for Edge Learning Based On the Information Bottleneck. 8832-8836 - Yu Zhu, Boning Li, Santiago Segarra:
Hypergraphs with Edge-Dependent Vertex Weights: Spectral Clustering Based on the 1-Laplacian. 8837-8841 - Georg Essl:
Causal Linear Topological Filters Over A 2-Simplex. 8842-8846 - Maosheng Yang, Elvin Isufi, Geert Leus:
Simplicial Convolutional Neural Networks. 8847-8851 - T. Mitchell Roddenberry, Michael T. Schaub, Mustafa Hajij:
Signal Processing On Cell Complexes. 8852-8856 - Stefania Sardellitti, Sergio Barbarossa:
Robust Signal Processing Over Simplicial Complexes. 8857-8861 - Sangeeta Srivastava, Yun Wang, Andros Tjandra, Anurag Kumar, Chunxi Liu, Kritika Singh, Yatharth Saraf:
Conformer-Based Self-Supervised Learning For Non-Speech Audio Tasks. 8862-8866 - Huang Xie, Okko Räsänen, Konstantinos Drossos, Tuomas Virtanen:
Unsupervised Audio-Caption Aligning Learns Correspondences Between Individual Sound Events and Textual Phrases. 8867-8871 - Yuichiro Koyama, Kazuhide Shigemi, Masafumi Takahashi, Kazuki Shimada, Naoya Takahashi, Emiru Tsunoo, Shusuke Takahashi, Yuki Mitsufuji:
Spatial Data Augmentation with Simulated Room Impulse Responses for Sound Event Localization and Detection. 8872-8876 - Huy Phan, Thi Ngoc Tho Nguyen, Philipp Koch, Alfred Mertins:
Polyphonic Audio Event Detection: Multi-Label or Multi-Class Multi-Task Classification Problem? 8877-8881 - Xinhao Mei, Xubo Liu, Jianyuan Sun, Mark D. Plumbley, Wenwu Wang:
Diverse Audio Captioning Via Adversarial Training. 8882-8886 - Kenneth Ooi, Karn N. Watcharasupat, Bhan Lam, Zhen-Ting Ong, Woon-Seng Gan:
Probably Pleasant? A Neural-Probabilistic Approach to Automatic Masker Selection for Urban Soundscape Augmentation. 8887-8891 - Zhongze Zhang, Tao Jiang, Wei Yu:
User Scheduling Using Graph Neural Networks for Reconfigurable Intelligent Surface Assisted Multiuser Downlink Communications. 8892-8896 - Ron Aharon Finish, Yoav Cohen, Tomer Raviv, Nir Shlezinger:
Symbol-Level Online Channel Tracking for Deep Receivers. 8897-8901 - Zhongyuan Zhao, Gunjan Verma, Ananthram Swami, Santiago Segarra:
Delay-Oriented Distributed Scheduling Using Graph Neural Networks. 8902-8906 - Miquel Ferriol Galmés, Xiangle Cheng, Xiang Shi, Shihan Xiao, Pere Barlet-Ros, Albert Cabellos-Aparicio:
FlowDT: A Flow-Aware Digital Twin for Computer Networks. 8907-8911 - Zhiyang Wang, Luana Ruiz, Mark Eisen, Alejandro Ribeiro:
Stable and Transferable Wireless Resource Allocation Policies Via Manifold Neural Networks. 8912-8916 - Shuncheng Jia, Ruichen Zuo, Tielin Zhang, Hongxing Liu, Bo Xu:
Motif-Topology and Reward-Learning Improved Spiking Neural Network for Efficient Multi-Sensory Integration. 8917-8921 - Qianhui Liu, Dong Xing, Lang Feng, Huajin Tang, Gang Pan:
Event-Based Multimodal Spiking Neural Network with Attention Mechanism. 8922-8926 - Yi Chen, Silin Zhang, Shiyu Ren, Hong Qu:
Gradual Surrogate Gradient Learning in Deep Spiking Neural Networks. 8927-8931 - Pengfei Sun, Longwei Zhu, Dick Botteldooren:
Axonal Delay as a Short-Term Memory for Feed Forward Deep Spiking Neural Networks. 8932-8936 - Jyotibdha Acharya, Laxmi R. Iyer, Wenyu Jiang:
Low Precision Local Learning for Hardware-Friendly Neuromorphic Visual Recognition. 8937-8941 - Jiadong Wang, Jibin Wu, Malu Zhang, Qi Liu, Haizhou Li:
A Hybrid Learning Framework for Deep Spiking Neural Networks with One-Spike Temporal Coding. 8942-8946 - Yao Zhu, Xinyu Wang, Hong-Shuo Chen, Ronald Salloum, C.-C. Jay Kuo:
A-PixelHop: A Green, Robust and Explainable Fake-Image Detector. 8947-8951 - Jing Yang, Didier Augusto Vega-Oliveros, Tais Seibt, Anderson Rocha:
Explainable Fact-Checking Through Question Answering. 8952-8956 - Shujin Wei, Haodong Li, Jiwu Huang:
Deep Video Inpainting Localization Using Spatial and Temporal Traces. 8957-8961 - Emanuele Conti, Davide Salvi, Clara Borrelli, Brian C. Hosler, Paolo Bestagini, Fabio Antonacci, Augusto Sarti, Matthew C. Stamm, Stefano Tubaro:
Deepfake Speech Detection Through Emotion Recognition: A Semantic Approach. 8962-8966 - Mingzhen Huang, Shan Jia, Ming-Ching Chang, Siwei Lyu:
Text-Image De-Contextualization Detection Using Vision-Language Models. 8967-8971 - Pavel Korshunov, Anubhav Jain, Sébastien Marcel:
Custom Attribution Loss for Improving Generalization and Interpretability of Deepfake Detection. 8972-8976 - Tamir Bendory, Oscar Michelin, Amit Singer:
Sparse Multi-Reference Alignment: Sample Complexity and Computational Hardness. 8977-8981 - Yuval Haitman, Joseph M. Francos, Louis L. Scharf:
Grassmannian Dimensionality Reduction Using Triplet Margin Loss for Ume Classification of 3d Point Clouds. 8982-8986 - Matthew Fickus, Joseph W. Iverson, John Jasper, Dustin G. Mixon:
A Note on Totally Symmetric Equi-Isoclinic Tight Fusion Frames. 8987-8991 - Stephen D. Howard, Ali Pezeshki:
A Simple Formula for the Moments of Unitarily Invariant Matrix Distributions. 8992-8996 - Yi Zhu, Tiago H. Falk:
Fusion of Modulation Spectral and Spectral Features with Symptom Metadata for Improved Speech-Based Covid-19 Detection. 8997-9001 - Kun Qian, Tanja Schultz, Björn W. Schuller:
An Overview of the FIRST ICASSP Special Session on Computer Audition for Healthcare. 9002-9006 - Shuai Yu, Yiwei Ding, Kun Qian, Bin Hu, Wei Li, Björn W. Schuller:
A Glance-and-Gaze Network for Respiratory Sound Classification. 9007-9011 - Xi Chen, Yefei Mo, Kang Ouyang, Mingyue Shi, Huali Zhou, Yupeng Shi, Wei Xiao, Shidong Shang, Qinglin Meng, Nengheng Zheng:
Internet Streaming Audio Based Speech Reception Threshold Measurement in Cochlear Implant Users. 9012-9016 - Zijie Wang, Zhao Wang:
A Domain Transfer Based Data Augmentation Method for Automated Respiratory Classification. 9017-9021 - Zhongxiang Wei, Christos Masouros, Sumei Sun:
Physical Layer Anonymous Communications: An Anonymity Entropy Oriented Precoding Design (Invited Paper). 9022-9026 - Howard H. Yang, Zuozhu Liu, Yaru Fu, Tony Q. S. Quek, H. Vincent Poor:
Federated Stochastic Gradient Descent Begets Self-Induced Momentum. 9027-9031 - Lu Zhang, Sangarapillai Lambotharan, Gan Zheng:
Adversarial Learning in Transformer Based Neural Network in Radio Signal Classification. 9032-9036 - Kumar Vijay Mishra, Arpan Chattopadhyay, Siddharth Sankar Acharjee, Athina P. Petropulu:
Optm3sec: Optimizing Multicast Irs-Aided Multiantenna Dfrc Secrecy Channel With Multiple Eavesdroppers. 9037-9041 - Ramana R. Avula, Tobias J. Oechtering:
Privacy-Enhancing Appliance Filtering For Smart Meters. 9042-9046 - Chenglong Sun, Zuxing Li, Chao Wang:
Adversarial Linear Quadratic Regulator under Falsified Actions. 9047-9051 - Sagar Shrestha, Xiao Fu:
Communication-Efficient Distributed MAX-VAR Generalized CCA via Error Feedback-Assisted Quantization. 9052-9056 - Yuetian Luo, Qin Ma, Chi Zhang, Anru R. Zhang:
Provable Second-Order Riemannian Gauss-Newton Method for Low-Rank Tensor Estimation. 9057-9061 - Olivier Vu Thanh, Nicolas Gillis, Fabian Lecron:
Bounded Simplex-Structured Matrix Factorization. 9062-9066 - Eric Evert, Michiel Vandecappelle, Lieven De Lathauwer:
CPD Computation via Recursive Eigenspace Decompositions. 9067-9071 - Tian Tong, Cong Ma, Yuejie Chi:
Accelerating ILL-Conditioned Robust Low-Rank Tensor Regression. 9072-9076 - Sina Shahsavari, Pulak Sarangi, Mehmet Can Hücümenoglu, Piya Pal:
Ada-JSR: Sample Efficient Adaptive Joint Support Recovery From Extremely Compressed Measurement Vectors. 9077-9081 - Cong Cai, Bin Liu, Jianhua Tao, Zhengkun Tian, Jiahao Lu, Kexin Wang:
End-to-End Network Based on Transformer for Automatic Detection of Covid-19. 9082-9086 - Zhao Ren, Thanh Tam Nguyen, Wolfgang Nejdl:
Prototype Learning for Interpretable Respiratory Sound Analysis. 9087-9091 - Tianhao Yan, Hao Meng, Shuo Liu, Emilia Parada-Cabaleiro, Zhao Ren, Björn W. Schuller:
Convoluational Transformer With Adaptive Position Embedding For Covid-19 Detection From Cough Sounds. 9092-9096 - Venkata Srikanth Nallanthighal, Aki Härmä, Helmer Strik:
Detection of COPD Exacerbation from Speech: Comparison of Acoustic Features and Deep Learning Based Speech Breathing Models. 9097-9101 - Ziping Zhao, Zhen Gong, Mingyue Niu, Jiali Ma, Haishuai Wang, Zixing Zhang, Ya Li:
Automatic Respiratory Sound Classification Via Multi-Branch Temporal Convolutional Network. 9102-9106 - Ross Cutler, Ando Saabas, Tanel Pärnamaa, Marju Purin, Hannes Gamper, Sebastian Braun, Karsten Sørensen, Robert Aichner:
ICASSP 2022 Acoustic Echo Cancellation Challenge. 9107-9111 - Haoran Zhao, Nan Li, Runqiang Han, Lianwu Chen, Xiguang Zheng, Chen Zhang, Liang Guo, Bing Yu:
A Deep Hierarchical Fusion Network for Fullband Acoustic Echo Cancellation. 9112-9116 - Xingwei Sun, Chenbin Cao, Qinglong Li, Linzhang Wang, Fei Xiang:
Explore Relative and Context Information with Transformer for Joint Acoustic Echo Cancellation and Speech Enhancement. 9117-9121 - Guochang Zhang, Libiao Yu, Chunliang Wang, Jianqiang Wei:
Multi-Scale Temporal Frequency Convolutional Network With Axial Attention for Speech Enhancement. 9122-9126 - Shimin Zhang, Ziteng Wang, Jiayao Sun, Yihui Fu, Biao Tian, Qiang Fu, Lei Xie:
Multi-Task Deep Residual Echo Suppression with Echo-Aware Loss. 9127-9131 - Fan Cui, Liyong Guo, Wenfeng Li, Peng Gao, Yujun Wang:
Multi-Scale Refinement Network Based Acoustic Echo Cancellation. 9132-9136 - Alessio Xompero, Yik Lung Pang, T. Patten, A. Prabhakar, B. Calli, Andrea Cavallaro:
Audio-Visual Object Classification for Human-Robot Collaboration. 9137-9141 - Tomoya Matsubara, Seitaro Otsuki, Yuiga Wada, Haruka Matsuo, Takumi Komatsu, Yui Iioka, Komei Sugiura, Hideo Saito:
Shared Transformer Encoder with Mask-Based 3d Model Estimation for Container Mass Estimation. 9142-9146 - Hengyi Wang, Chaoran Zhu, Ziyin Ma, Changjae Oh:
Improving Generalization of Deep Networks for Estimating Physical Properties of Containers and Fillings. 9147-9151 - Tommaso Apicella, Giulia Slavic, Edoardo Ragusa, Paolo Gastaldo, Lucio Marcenaro:
Container Localisation and Mass Estimation with an RGB-D Camera. 9152-9155 - Fan Yu, Shiliang Zhang, Pengcheng Guo, Yihui Fu, Zhihao Du, Siqi Zheng, Weilong Huang, Lei Xie, Zheng-Hua Tan, DeLiang Wang, Yanmin Qian, Kong Aik Lee, Zhijie Yan, Bin Ma, Xin Xu, Hui Bu:
Summary on the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Grand Challenge. 9156-9160 - Naijun Zheng, Na Li, Xixin Wu, Lingwei Meng, Jiawen Kang, Haibin Wu, Chao Weng, Dan Su, Helen Meng:
The CUHK-Tencent Speaker Diarization System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. 9161-9165 - Maokui He, Xiang Lv, Weilin Zhou, Jingjing Yin, Xiaoqi Zhang, Yuxuan Wang, Shutong Niu, Yuhang Cao, Heng Lu, Jun Du, Chin-Hui Lee:
The USTC-Ximalaya System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription (M2met) Challenge. 9166-9170 - Weiqing Wang, Xiaoyi Qin, Ming Li:
Cross-Channel Attention-Based Target Speaker Voice Activity Detection: Experimental Results for the M2met Challenge. 9171-9175 - Chen Shen, Yi Liu, Wenzhi Fan, Bin Wang, Shixue Wen, Yao Tian, Jun Zhang, Jingsheng Yang, Zejun Ma:
The Volcspeech System for the ICASSP 2022 Multi-Channel Multi-Party Meeting Transcription Challenge. 9176-9180 - Shuaishuai Ye, Peiyao Wang, Shunfei Chen, Xinhui Hu, Xinkang Xu:
The Royalflush System of Speech Recognition for M2met Challenge. 9181-9185 - Eric Guizzo, Christian Marinoni, Marco Pennese, Xinlei Ren, Xiguang Zheng, Chen Zhang, Bruno S. Masiero, Aurelio Uncini, Danilo Comminiello:
L3DAS22 Challenge: Learning 3D Audio Sources in a Real Office Environment. 9186-9190 - Yongjian Mao, Ying Zeng, Hongqing Liu, Wenbin Zhu, Yi Zhou:
ICASSP 2022 L3DAS22 Challenge: Ensemble of Resnet-Conformers with Ambisonics Data Augmentation for Sound Event Localization and Detection. 9191-9195 - Jinbo Hu, Yin Cao, Ming Wu, Qiuqiang Kong, Feiran Yang, Mark D. Plumbley, Jun Yang:
A Track-Wise Ensemble Event Independent Network for Polyphonic Sound Event Localization and Detection. 9196-9200 - Yen-Ju Lu, Samuele Cornell, Xuankai Chang, Wangyou Zhang, Chenda Li, Zhaoheng Ni, Zhong-Qiu Wang, Shinji Watanabe:
Towards Low-Distortion Multi-Channel Speech Enhancement: The ESPNET-Se Submission to the L3DAS22 Challenge. 9201-9205 - Guochang Zhang, Chunliang Wang, Libiao Yu, Jianqiang Wei:
Multi-Scale Temporal Frequency Convolutional Network with Axial Attention for Multi-Channel Speech Enhancement. 9206-9210 - Jingdong Li, Yuanyuan Zhu, Dawei Luo, Yun Liu, Guohui Cui, Zhaoxia Li:
The PCG-AIID System for L3DAS22 Challenge: MIMO and MISO Convolutional Recurrent Network for Multi Channel Speech Enhancement and Speech Recognition. 9211-9215 - Jiangyan Yi, Ruibo Fu, Jianhua Tao, Shuai Nie, Haoxin Ma, Chenglong Wang, Tao Wang, Zhengkun Tian, Ye Bai, Cunhang Fan, Shan Liang, Shiming Wang, Shuai Zhang, Xinrui Yan, Le Xu, Zhengqi Wen, Haizhou Li:
ADD 2022: the first Audio Deep Synthesis Detection Challenge. 9216-9220 - Cheng Wen, Tingwei Guo, Xingjun Tan, Rui Yan, Shuran Zhou, Chuandong Xie, Wei Zou, Xiangang Li:
Time Domain Adversarial Voice Conversion for ADD 2022. 9221-9225 - Rui Yan, Cheng Wen, Shuran Zhou, Tingwei Guo, Wei Zou, Xiangang Li:
Audio Deepfake Detection System with Neural Stitching for ADD 2022. 9226-9230 - Zhiqiang Lv, Shanshan Zhang, Kai Tang, Pengfei Hu:
Fake Audio Detection Based On Unsupervised Pretraining Models. 9231-9235 - Haibin Wu, Heng-Cheng Kuo, Naijun Zheng, Kuo-Hsuan Hung, Hung-Yi Lee, Yu Tsao, Hsin-Min Wang, Helen Meng:
Partially Fake Audio Detection by Self-Attention-Based Fake Span Discovery. 9236-9240 - Juan M. Martín-Doñas, Aitor Álvarez:
The Vicomtech Audio Deepfake Detection System Based on Wav2vec2 for the 2022 ADD Challenge. 9241-9245 - Yanguang Xu, Jianwei Sun, Yang Han, Shuaijiang Zhao, Chaoyang Mei, Tingwei Guo, Shuran Zhou, Chuandong Xie, Wei Zou, Xiangang Li:
Audio-Visual Wake Word Spotting System for MISP Challenge 2021. 9246-9250 - Gaopeng Xu, Song Yang, Wei Li, Song Wang, Guo Wei, Junfeng Yuan, Jie Gao:
Channel-Wise AV-Fusion Attention for Multi-Channel Audio-Visual Speech Recognition. 9251-9255 - Ming Cheng, Haoxu Wang, Yechen Wang, Ming Li:
The DKU Audio-Visual Wake Word Spotting System for the 2021 MISP Challenge. 9256-9260 - Wei Wang, Xun Gong, Yifei Wu, Zhikai Zhou, Chenda Li, Wangyou Zhang, Bing Han, Yanmin Qian:
The Sjtu System For Multimodal Information Based Speech Processing Challenge 2021. 9261-9265 - Hang Chen, Hengshun Zhou, Jun Du, Chin-Hui Lee, Jingdong Chen, Shinji Watanabe, Sabato Marco Siniscalchi, Odette Scharenborg, Diyuan Liu, Bao-Cai Yin, Jia Pan, Jianqing Gao, Cong Liu:
The First Multimodal Information Based Speech Processing (Misp) Challenge: Data, Tasks, Baselines And Results. 9266-9270 - Harishchandra Dubey, Vishak Gopal, Ross Cutler, Ashkan Aazami, Sergiy Matusevych, Sebastian Braun, Sefik Emre Eskimez, Manthan Thakker, Takuya Yoshioka, Hannes Gamper, Robert Aichner:
Icassp 2022 Deep Noise Suppression Challenge. 9271-9275 - Zehua Zhang, Lu Zhang, Xuyi Zhuang, Yukun Qian, Heng Li, Mingjiang Wang:
FB-MSTCN: A Full-Band Single-Channel Speech Enhancement Method Based on Multi-Scale Temporal Convolutional Network. 9276-9280 - Shengkui Zhao, Bin Ma, Karn N. Watcharasupat, Woon-Seng Gan:
FRCRN: Boosting Feature Representation Using Frequency Recurrence for Monaural Speech Enhancement. 9281-9285 - Tianrui Wang, Weibin Zhu, Yingying Gao, Yanan Chen, Junlan Feng, Shilei Zhang:
Harmonic Gated Compensation Network Plus for ICASSP 2022 DNS Challenge. 9286-9290 - Yukai Ju, Wei Rao, Xiaopeng Yan, Yihui Fu, Shubo Lv, Luyao Cheng, Yannan Wang, Lei Xie, Shidong Shang:
TEA-PSE: Tencent-Ethereal-Audio-Lab Personalized Speech Enhancement System for ICASSP 2022 DNS Challenge. 9291-9295 - Lianwu Chen, Chenglin Xu, Xu Zhang, Xinlei Ren, Xiguang Zheng, Chen Zhang, Liang Guo, Bing Yu:
Multi-Stage and Multi-Loss Training for Fullband Non-Personalized and Personalized Speech Enhancement. 9296-9300 - Tianjian Zhang, Qian Chen, Yi Jiang, Dandan Miao, Feng Yin, Tao Quan, Qingjiang Shi, Zhi-Quan Luo:
ICASSP-SPGC 2022: Root Cause Analysis for Wireless Network Fault Localization. 9301-9305 - Xuan Zhang, Longxiang Xiong, Ningyuan Sun, Mingxia Wang, Hao Tang, Yanxing Zhao:
Accurate Inference of Unseen Combinations of Multiple Rootcauses with Classifier Ensemble. 9306-9310 - Yuequn Liu, Wenhui Zhu, Jie Qiao, Zhiyi Huang, Yu Xiang, Xuanzhi Chen, Wei Chen, Ruichu Cai:
Causal Alignment Based Fault Root Causes Localization for Wireless Network. 9311-9315 - Chaoli Zhang, Zhiqiang Zhou, Yingying Zhang, Linxiao Yang, Kai He, Qingsong Wen, Liang Sun:
Netrca: An Effective Network Fault Cause Localization Algorithm. 9316-9320