ACL 2024: Bangkok, Thailand
- Lun-Wei Ku, Andre Martins, Vivek Srikumar:
Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2024, Bangkok, Thailand, August 11-16, 2024. Association for Computational Linguistics 2024, ISBN 979-8-89176-094-3
- Frontmatter.
- Zhengxin Zhang, Dan Zhao, Xupeng Miao, Gabriele Oliaro, Zhihao Zhang, Qing Li, Yong Jiang, Zhihao Jia:
Quantized Side Tuning: Fast and Memory-Efficient Tuning of Quantized Large Language Models. 1-17
- Hanlei Zhang, Hua Xu, Fei Long, Xin Wang, Kai Gao:
Unsupervised Multimodal Clustering for Semantics Discovery in Multimodal Utterances. 18-35
- Yafu Li, Qintong Li, Leyang Cui, Wei Bi, Zhilin Wang, Longyue Wang, Linyi Yang, Shuming Shi, Yue Zhang:
MAGE: Machine-generated Text Detection in the Wild. 36-53
- Haoran Li, Dadi Guo, Donghao Li, Wei Fan, Qi Hu, Xin Liu, Chunkit Chan, Duanyi Yao, Yuan Yao, Yangqiu Song:
PrivLM-Bench: A Multi-level Privacy Evaluation Benchmark for Language Models. 54-73
- Yuchen Hu, Chen Chen, Chao-Han Huck Yang, Ruizhe Li, Dong Zhang, Zhehuai Chen, EngSiong Chng:
GenTranslate: Large Language Models are Generative Multilingual Speech and Machine Translators. 74-90
- Yanzhi Xu, Yueying Hua, Shichen Li, Zhongqing Wang:
Exploring Chain-of-Thought for Multi-modal Metaphor Detection. 91-101
- Dayou Du, Yijia Zhang, Shijie Cao, Jiaqi Guo, Ting Cao, Xiaowen Chu, Ningyi Xu:
BitDistiller: Unleashing the Potential of Sub-4-Bit LLMs via Self-Distillation. 102-116
- Kai Chen, Ye Wang, Yitong Li, Aiping Li, Han Yu, Xin Song:
A Unified Temporal Knowledge Graph Reasoning Model Towards Interpolation and Extrapolation. 117-132
- Shicheng Xu, Liang Pang, Mo Yu, Fandong Meng, Huawei Shen, Xueqi Cheng, Jie Zhou:
Unsupervised Information Refinement Training of Large Language Models for Retrieval-Augmented Generation. 133-145
- Yong Hu, Fandong Meng, Jie Zhou:
CSCD-NS: a Chinese Spelling Check Dataset for Native Speakers. 146-159
- Charu James, Mayank Nagda, Nooshin Haji Ghassemi, Marius Kloft, Sophie Fellenz:
Evaluating Dynamic Topic Models. 160-176
- Guanting Dong, Hongyi Yuan, Keming Lu, Chengpeng Li, Mingfeng Xue, Dayiheng Liu, Wei Wang, Zheng Yuan, Chang Zhou, Jingren Zhou:
How Abilities in Large Language Models are Affected by Supervised Fine-tuning Data Composition. 177-198
- Shanshan Xu, T. Y. S. S. Santosh, Oana Ichim, Barbara Plank, Matthias Grabmair:
Through the Lens of Split Vote: Exploring Disagreement, Difficulty and Calibration in Legal Case Outcome Classification. 199-216
- Dhairya Dalal, Marco Valentino, André Freitas, Paul Buitelaar:
Inference to the Best Explanation in Large Language Models. 217-235
- Eduard Poesina, Cornelia Caragea, Radu Tudor Ionescu:
A Novel Cartography-Based Curriculum Learning Method Applied on RoNLI: The First Romanian Natural Language Inference Corpus. 236-253
- Xiusi Chen, Jyun-Yu Jiang, Wei-Cheng Chang, Cho-Jui Hsieh, Hsiang-Fu Yu, Wei Wang:
MinPrompt: Graph-based Minimal Prompt Data Augmentation for Few-shot Question Answering. 254-266
- Yebowen Hu, Kaiqiang Song, Sangwoo Cho, Xiaoyang Wang, Hassan Foroosh, Dong Yu, Fei Liu:
SportsMetrics: Blending Text and Numerical Data to Understand Information Fusion in LLMs. 267-278
- Qingyun Wang, Doug Downey, Heng Ji, Tom Hope:
SciMON: Scientific Inspiration Machines Optimized for Novelty. 279-299
- Yiren Jian, Tingkai Liu, Yunzhe Tao, Chunhui Zhang, Soroush Vosoughi, Hongxia Yang:
Expedited Training of Visual Conditioned Language Generation via Redundancy Reduction. 300-314
- Abhishek Kumar, Robert Morabito, Sanzhar Umbet, Jad Kabbara, Ali Emami:
Confidence Under the Hood: An Investigation into the Confidence-Probability Alignment in Large Language Models. 315-334
- Weixuan Wang, Barry Haddow, Alexandra Birch:
Retrieval-Augmented Multilingual Knowledge Editing. 335-354
- Brendan Park, Madeline Janecek, Naser Ezzati-Jivan, Yifeng Li, Ali Emami:
Picturing Ambiguity: A Visual Twist on the Winograd Schema Challenge. 355-374
- Abhishek Kumar, Sarfaroz Yunusov, Ali Emami:
Subtle Biases Need Subtler Measures: Dual Metrics for Evaluating Representative and Affinity Bias in Large Language Models. 375-392
- Alexandria Leto, Elliot Pickens, Coen D. Needell, David Rothschild, Maria Leonor Pacheco:
Framing in the Presence of Supporting Data: A Case Study in U.S. Economic News. 393-415
- Xiyao Wang, Yuhang Zhou, Xiaoyu Liu, Hongjin Lu, Yuancheng Xu, Feihong He, Jaehong Yoon, Taixi Lu, Fuxiao Liu, Gedas Bertasius, Mohit Bansal, Huaxiu Yao, Furong Huang:
Mementos: A Comprehensive Benchmark for Multimodal Large Language Model Reasoning over Image Sequences. 416-442
- Chufan Gao, Xuan Wang, Jimeng Sun:
TTM-RE: Memory-Augmented Document-Level Relation Extraction. 443-458
- Letian Peng, Yuwei Zhang, Zilong Wang, Jayanth Srinivasa, Gaowen Liu, Zihan Wang, Jingbo Shang:
Answer is All You Need: Instruction-following Text Embedding via Answering the Question. 459-477
- Yuhang Zhou, Paiheng Xu, Xiaoyu Liu, Bang An, Wei Ai, Furong Huang:
Explore Spurious Correlations at the Concept Level in Language Models for Text Classification. 478-492
- Qi Cheng, Michael Boratko, Pranay Kumar Yelugam, Tim O'Gorman, Nalini Singh, Andrew McCallum, Xiang Li:
Every Answer Matters: Evaluating Commonsense with Probabilistic Measures. 493-506
- Yueqi Xie, Minghong Fang, Renjie Pi, Neil Gong:
GradSafe: Detecting Jailbreak Prompts for LLMs via Safety-Critical Gradient Analysis. 507-518
- Gyeongeun Lee, Christina Wong, Meghan Guo, Natalie Parde:
Pouring Your Heart Out: Investigating the Role of Figurative Language in Online Expressions of Empathy. 519-529
- Luran Wang, Mark J. F. Gales, Vatsal Raina:
An Information-Theoretic Approach to Analyze NLP Classification Tasks. 530-551
- Yuwei Zhang, Siffi Singh, Sailik Sengupta, Igor Shalyminov, Hang Su, Hwanjun Song, Saab Mansour:
Can Your Model Tell a Negation from an Implicature? Unravelling Challenges With Intent Encoders. 552-567
- Taiqi He, Kwanghee Choi, Lindia Tjuatja, Nathaniel Robinson, Jiatong Shi, Shinji Watanabe, Graham Neubig, David R. Mortensen, Lori S. Levin:
Wav2Gloss: Generating Interlinear Glossed Text from Speech. 568-582
- Yibo Hu, Erick Skorupa Parolin, Latifur Khan, Patrick T. Brandt, Javier Osorio, Vito D'Orazio:
Leveraging Codebook Knowledge with NLI and ChatGPT for Zero-Shot Political Relation Classification. 583-603
- Ziyao Xu, Houfeng Wang:
SPOR: A Comprehensive and Practical Evaluation Method for Compositional Generalization in Data-to-Text Generation. 604-621
- Haochen Shi, Zhiyuan Sun, Xingdi Yuan, Marc-Alexandre Côté, Bang Liu:
OPEx: A Component-Wise Analysis of LLM-Centric Agents in Embodied Instruction Following. 622-636
- Ying Shen, Zhiyang Xu, Qifan Wang, Yu Cheng, Wenpeng Yin, Lifu Huang:
Multimodal Instruction Tuning with Conditional Mixture of LoRA. 637-648
- Yiqing Xie, Sheng Zhang, Hao Cheng, Pengfei Liu, Zelalem Gero, Cliff Wong, Tristan Naumann, Hoifung Poon, Carolyn P. Rosé:
DocLens: Multi-aspect Fine-grained Medical Text Evaluation. 649-679
- Congying Xia, Chen Xing, Jiangshu Du, Xinyi Yang, Yihao Feng, Ran Xu, Wenpeng Yin, Caiming Xiong:
FOFO: A Benchmark to Evaluate LLMs' Format-Following Capability. 680-699
- Young Hyun Yoo, Jii Cha, Changhyeon Kim, Taeuk Kim:
Hyper-CL: Conditioning Sentence Representations with Hypernetworks. 700-711
- Seong Hoon Lim, Taejun Yun, Jinhyeon Kim, Jihun Choi, Taeuk Kim:
Analysis of Multi-Source Language Training in Cross-Lingual Transfer. 712-725
- Sreyan Ghosh, Utkarsh Tyagi, Sonal Kumar, Chandra Kiran Reddy Evuru, Ramaneswaran S., S. Sakshi, Dinesh Manocha:
ABEX: Data Augmentation for Low-Resource NLU via Expanding Abstract Descriptions. 726-748
- Lucas Bandarkar, Davis Liang, Benjamin Muller, Mikel Artetxe, Satya Narayan Shukla, Donald Husa, Naman Goyal, Abhinandan Krishnan, Luke Zettlemoyer, Madian Khabsa:
The Belebele Benchmark: a Parallel Reading Comprehension Dataset in 122 Language Variants. 749-775
- Chenyang An, Zhibo Chen, Qihao Ye, Emily First, Letian Peng, Jiayun Zhang, Zihan Wang, Sorin Lerner, Jingbo Shang:
Learn from Failure: Fine-tuning LLMs with Trial-and-Error Data for Intuitionistic Propositional Logic Proving. 776-790
- Saehyung Lee, Sangwon Yu, Junsung Park, Jihun Yi, Sungroh Yoon:
Interactive Text-to-Image Retrieval with Large Language Models: A Plug-and-Play Approach. 791-809
- Inna W. Lin, Ashish Sharma, Christopher Michael Rytting, Adam S. Miner, Jina Suh, Tim Althoff:
IMBUE: Improving Interpersonal Effectiveness through Simulation and Just-in-time Feedback with Human-Language Model Interaction. 810-840
- Huawei Lin, Jikai Long, Zhaozhuo Xu, Weijie Zhao:
Token-wise Influential Training Data Retrieval for Large Language Models. 841-860
- Maxwell A. Weinzierl, Sanda M. Harabagiu:
Tree-of-Counterfactual Prompting for Zero-Shot Stance Detection. 861-880
- Jing Yu Koh, Robert Lo, Lawrence Jang, Vikram Duvvur, Ming Chong Lim, Po-Yu Huang, Graham Neubig, Shuyan Zhou, Russ Salakhutdinov, Daniel Fried:
VisualWebArena: Evaluating Multimodal Agents on Realistic Visual Web Tasks. 881-905
- Hwanjun Song, Hang Su, Igor Shalyminov, Jason Cai, Saab Mansour:
FineSurE: Fine-grained Summarization Evaluation using LLMs. 906-922
- Daechul Ahn, Yura Choi, Youngjae Yu, Dongyeop Kang, Jonghyun Choi:
Tuning Large Multimodal Models for Videos using Reinforcement Learning from AI Feedback. 923-940
- Jingtao Zhan, Qingyao Ai, Yiqun Liu, Yingwei Pan, Ting Yao, Jiaxin Mao, Shaoping Ma, Tao Mei:
Prompt Refinement with Image Pivot for Text-to-Image Generation. 941-954
- Masato Mita, Soichiro Murakami, Akihiko Kato, Peinan Zhang:
Striking Gold in Advertising: Standardization and Exploration of Ad Text Generation. 955-972
- Zhaowei Wang, Wei Fan, Qing Zong, Hongming Zhang, Sehyun Choi, Tianqing Fang, Xin Liu, Yangqiu Song, Ginny Y. Wong, Simon See:
AbsInstruct: Eliciting Abstraction Ability from LLMs through Explanation Tuning with Plausibility Estimation. 973-994
- Runlong Zhou, Simon S. Du, Beibin Li:
Reflect-RL: Two-Player Online RL Fine-Tuning for LMs. 995-1015
- Cheng Yang, Puli Chen, Qingbao Huang:
Can ChatGPT's Performance be Improved on Verb Metaphor Detection Tasks? Bootstrapping and Combining Tacit Knowledge. 1016-1027
- Zhaorui Yang, Tianyu Pang, Haozhe Feng, Han Wang, Wei Chen, Minfeng Zhu, Qian Liu:
Self-Distillation Bridges Distribution Gap in Language Model Fine-Tuning. 1028-1043
- Kun Zhu, Xiaocheng Feng, Xiyuan Du, Yuxuan Gu, Weijiang Yu, Haotian Wang, Qianglong Chen, Zheng Chu, Jingchang Chen, Bing Qin:
An Information Bottleneck Perspective for Effective Noise Filtering on Retrieval-Augmented Generation. 1044-1069
- Zhengping Jiang, Yining Lu, Hanjie Chen, Daniel Khashabi, Benjamin Van Durme, Anqi Liu:
RORA: Robust Free-Text Rationale Evaluation. 1070-1087
- Cheng Qian, Bingxiang He, Zhong Zhuang, Jia Deng, Yujia Qin, Xin Cong, Zhong Zhang, Jie Zhou, Yankai Lin, Zhiyuan Liu, Maosong Sun:
Tell Me More! Towards Implicit User Intention Understanding of Language Model Driven Agents. 1088-1113
- Zeyuan Wang, Qiang Zhang, Keyan Ding, Ming Qin, Xiang Zhuang, Xiaotong Li, Huajun Chen:
InstructProtein: Aligning Human and Protein Language via Knowledge Instruction. 1114-1136
- Aparna Elangovan, Ling Liu, Lei Xu, Sravan Babu Bodapati, Dan Roth:
ConSiDERS-The-Human Evaluation Framework: Rethinking Human Evaluation for Generative Large Language Models. 1137-1160
- Jingxuan Tu, Keer Xu, Liulu Yue, Bingyang Ye, Kyeongmin Rim, James Pustejovsky:
Linguistically Conditioned Semantic Textual Similarity. 1161-1172
- Zheng Chu, Jingchang Chen, Qianglong Chen, Weijiang Yu, Tao He, Haotian Wang, Weihua Peng, Ming Liu, Bing Qin, Ting Liu:
Navigate through Enigmatic Labyrinth A Survey of Chain of Thought Reasoning: Advances, Frontiers and Future. 1173-1203
- Zheng Chu, Jingchang Chen, Qianglong Chen, Weijiang Yu, Haotian Wang, Ming Liu, Bing Qin:
TimeBench: A Comprehensive Evaluation of Temporal Reasoning Abilities in Large Language Models. 1204-1228
- Zheng Chu, Jingchang Chen, Qianglong Chen, Haotian Wang, Kun Zhu, Xiyuan Du, Weijiang Yu, Ming Liu, Bing Qin:
BeamAggR: Beam Aggregation Reasoning over Multi-source Knowledge for Multi-hop Question Answering. 1229-1248
- Siyu Yuan, Jiangjie Chen, Changzhi Sun, Jiaqing Liang, Yanghua Xiao, Deqing Yang:
ANALOGYKB: Unlocking Analogical Reasoning of Language Models with A Million-scale Knowledge Base. 1249-1265
- Yujie Feng, Xu Chu, Yongxin Xu, Guangyuan Shi, Bo Liu, Xiao-Ming Wu:
TaSL: Continual Dialog State Tracking via Task Skill Localization and Consolidation. 1266-1279
- Damai Dai, Chengqi Deng, Chenggang Zhao, R. X. Xu, Huazuo Gao, Deli Chen, Jiashi Li, Wangding Zeng, Xingkai Yu, Y. Wu, Zhenda Xie, Y. K. Li, Panpan Huang, Fuli Luo, Chong Ruan, Zhifang Sui, Wenfeng Liang:
DeepSeekMoE: Towards Ultimate Expert Specialization in Mixture-of-Experts Language Models. 1280-1297
- Hongjin Qian, Zheng Liu, Kelong Mao, Yujia Zhou, Zhicheng Dou:
Grounding Language Model with Chunking-Free In-Context Retrieval. 1298-1311
- Jiaxin Bai, Yicheng Wang, Tianshi Zheng, Yue Guo, Xin Liu, Yangqiu Song:
Advancing Abductive Reasoning in Knowledge Graphs through Complex Logical Hypothesis Generation. 1312-1329
- Shizhe Diao, Pengcheng Wang, Yong Lin, Rui Pan, Xiang Liu, Tong Zhang:
Active Prompting with Chain-of-Thought for Large Language Models. 1330-1350
- Xiangyu Zhao, Bo Liu, Qijiong Liu, Guangyuan Shi, Xiao-Ming Wu:
EasyGen: Easing Multimodal Generation with BiDiffuser and LLMs. 1351-1370
- Haochen Li, Xin Zhou, Zhiqi Shen:
Rewriting the Code: A Simple Method for Large Language Model Augmented Code Search. 1371-1389
- Naomi Baes, Nick Haslam, Ekaterina Vylomova:
A Multidimensional Framework for Evaluating Lexical Semantic Change with Social Science Applications. 1390-1415
- Jianheng Huang, Leyang Cui, Ante Wang, Chengyi Yang, Xinting Liao, Linfeng Song, Junfeng Yao, Jinsong Su:
Mitigating Catastrophic Forgetting in Large Language Models with Self-Synthesized Rehearsal. 1416-1428
- Baizhou Huang, Shuai Lu, Xiaojun Wan, Nan Duan:
Enhancing Large Language Models in Coding Through Multi-Perspective Self-Consistency. 1429-1450
- Weitao Li, Junkai Li, Weizhi Ma, Yang Liu:
Citation-Enhanced Generation for LLM-based Chatbots. 1451-1466
- Haoyang Wen, Eduard H. Hovy, Alexander Hauptmann:
Transitive Consistency Constrained Learning for Entity-to-Entity Stance Detection. 1467-1480
- Jiahao Li, Quan Wang, Licheng Zhang, Guoqing Jin, Zhendong Mao:
Feature-Adaptive and Data-Scalable In-Context Learning. 1481-1494
- Yizhe Zhang, Jiarui Lu, Navdeep Jaitly:
Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games. 1495-1516
- Shangqing Tu, Yuliang Sun, Yushi Bai, Jifan Yu, Lei Hou, Juanzi Li:
WaterBench: Towards Holistic Evaluation of Watermarks for Large Language Models. 1517-1542
- Yida Zhao, Chao Lou, Kewei Tu:
Dependency Transformer Grammars: Integrating Dependency Structures into Transformer Language Models. 1543-1556
- Zhengrui Ma, Qingkai Fang, Shaolei Zhang, Shoutao Guo, Yang Feng, Min Zhang:
A Non-autoregressive Generation Framework for End-to-End Simultaneous Speech-to-Any Translation. 1557-1575
- Zhenhua Liu, Tong Zhu, Chuanyuan Tan, Bing Liu, Haonan Lu, Wenliang Chen:
Probing Language Models for Pre-training Data Detection. 1576-1587
- Zhihan Zhang, Yixin Cao, Chenchen Ye, Yunshan Ma, Lizi Liao, Tat-Seng Chua:
Analyzing Temporal Complex Events with Large Language Models? A Benchmark towards Temporal, Long Context Understanding. 1588-1606
- Senyu Han, Lu Chen, Li-Min Lin, Zhengshan Xu, Kai Yu:
IBSEN: Director-Actor Agent Collaboration for Controllable and Interactive Drama Script Generation. 1607-1619
- Jiangxing Wang, Jiachen Li, Xiao Han, Deheng Ye, Zongqing Lu:
Language Model Adaption for Reinforcement Learning with Natural Language Action Space. 1620-1634
- Hiromasa Sakurai, Yusuke Miyao:
Evaluating Intention Detection Capability of Large Language Models in Persuasive Dialogues. 1635-1657
- Huiqiang Jiang, Qianhui Wu, Xufang Luo, Dongsheng Li, Chin-Yew Lin, Yuqing Yang, Lili Qiu:
LongLLMLingua: Accelerating and Enhancing LLMs in Long Context Scenarios via Prompt Compression. 1658-1677
- Chuhao Jin, Kening Ren, Lingzhen Kong, Xiting Wang, Ruihua Song, Huan Chen:
Persuading across Diverse Domains: a Dataset and Persuasion Large Language Model. 1678-1706
- Mengxi Xiao, Qianqian Xie, Ziyan Kuang, Zhicheng Liu, Kailai Yang, Min Peng, Weiguang Han, Jimin Huang:
HealMe: Harnessing Cognitive Reframing in Large Language Models for Psychotherapy. 1707-1725
- Zirun Guo, Tao Jin, Zhou Zhao:
Multimodal Prompt Learning with Missing Modalities for Sentiment Analysis and Emotion Recognition. 1726-1736
- Bi-Cheng Yan, Jiun-Ting Li, Yi-Cheng Wang, Hsin-Wei Wang, Tien-Hong Lo, Yung-Chang Hsu, Wei-Cheng Chao, Berlin Chen:
An Effective Pronunciation Assessment Approach Leveraging Hierarchical Transformers and Pre-training Strategies. 1737-1747
- Wei Li, Houfeng Wang:
Detection-Correction Structure via General Language Model for Grammatical Error Correction. 1748-1763
- Yongxin Zhu, Dan Su, Liqiang He, Linli Xu, Dong Yu:
Generative Pre-trained Speech Language Model with Efficient Hierarchical Transformer. 1764-1775
- Lichen Zhang, Shuai Lu, Nan Duan:
Selene: Pioneering Automated Proof in Software Verification. 1776-1789
- Junlong Li, Fan Zhou, Shichao Sun, Yikai Zhang, Hai Zhao, Pengfei Liu:
Dissecting Human and LLM Preferences. 1790-1811
- Tao Sun, Linzheng Chai, Jian Yang, Yuwei Yin, Hongcheng Guo, Jiaheng Liu, Bing Wang, Liqun Yang, Zhoujun Li:
UniCoder: Scaling Code Large Language Model via Universal Code. 1812-1824
- Xianming Li, Jing Li:
AoE: Angle-optimized Embeddings for Semantic Textual Similarity. 1825-1839
- Xintao Wang, Yunze Xiao, Jen-tse Huang, Siyu Yuan, Rui Xu, Haoran Guo, Quan Tu, Yaying Fei, Ziang Leng, Wei Wang, Jiangjie Chen, Cheng Li, Yanghua Xiao:
InCharacter: Evaluating Personality Fidelity in Role-Playing Agents through Psychological Interviews. 1840-1873
- Shengchao Liu, Xiaoming Liu, Yichen Wang, Zehua Cheng, Chengzhengxu Li, Zhaohan Zhang, Yu Lan, Chao Shen:
Does DetectGPT Fully Utilize Perturbation? Bridging Selective Perturbation to Fine-tuned Contrastive Learning Detector would be Better. 1874-1889
- Jingwei Ni, Minjing Shi, Dominik Stammbach, Mrinmaya Sachan, Elliott Ash, Markus Leippold:
AFaCTA: Assisting the Annotation of Factual Claim Detection with Reliable LLM Annotators. 1890-1912
- Tobias Schimanski, Jingwei Ni, Mathias Kraus, Elliott Ash, Markus Leippold:
Towards Faithful and Robust LLM Specialists for Evidence-Based Question-Answering. 1913-1931
- Shihan Dou, Enyu Zhou, Yan Liu, Songyang Gao, Wei Shen, Limao Xiong, Yuhao Zhou, Xiao Wang, Zhiheng Xi, Xiaoran Fan, Shiliang Pu, Jiang Zhu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang:
LoRAMoE: Alleviating World Knowledge Forgetting in Large Language Models via MoE-Style Plugin. 1932-1945
- Xiaoying Zhang, Baolin Peng, Ye Tian, Jingyan Zhou, Lifeng Jin, Linfeng Song, Haitao Mi, Helen Meng:
Self-Alignment for Factuality: Mitigating Hallucinations in LLMs via Self-Evaluation. 1946-1965
- Zheng Wang, Shu Xian Teo, Jieer Ouyang, Yongjun Xu, Wei Shi:
M-RAG: Reinforcing Large Language Model Performance through Retrieval-Augmented Generation with Multiple Partitions. 1966-1978
- Qian Yang, Jin Xu, Wenrui Liu, Yunfei Chu, Ziyue Jiang, Xiaohuan Zhou, Yichong Leng, Yuanjun Lv, Zhou Zhao, Chang Zhou, Jingren Zhou:
AIR-Bench: Benchmarking Large Audio-Language Models via Generative Comprehension. 1979-1998
- Tom Kocmi, Vilém Zouhar, Christian Federmann, Matt Post:
Navigating the Metrics Maze: Reconciling Score Magnitudes and Accuracies. 1999-2014
- Yuanyi Ren, Haoran Ye, Hanjun Fang, Xin Zhang, Guojie Song:
ValueBench: Towards Comprehensively Evaluating Value Orientations and Understanding of Large Language Models. 2015-2040
- Ling Hu, Yuemei Xu:
DM-BLI: Dynamic Multiple Subspaces Alignment for Unsupervised Bilingual Lexicon Induction. 2041-2052
- Jesus Solano, Mardhiyah Sanni, Oana-Maria Camburu, Pasquale Minervini:
SparseFit: Few-shot Prompting with Sparse Fine-tuning for Jointly Generating Predictions and Natural Language Explanations. 2053-2077
- Wen Wu, Bo Li, Chao Zhang, Chung-Cheng Chiu, Qiujia Li, Junwen Bai, Tara N. Sainath, Philip C. Woodland:
Handling Ambiguity in Emotion: From Out-of-Domain Detection to Distribution Estimation. 2078-2093
- Jinyuan Fang, Zaiqiao Meng, Craig MacDonald:
REANO: Optimising Retrieval-Augmented Reader Models through Knowledge Graph Generation. 2094-2112
- Yingji Zhang, Danilo Carvalho, André Freitas:
Learning Disentangled Semantic Spaces of Explanations via Invertible Neural Networks. 2113-2134
- Yan Ma, Yu Qiao, Pengfei Liu:
MoPS: Modular Story Premise Synthesis for Open-Ended Automatic Story Generation. 2135-2169
- Junfan Chen, Richong Zhang, Junchi Chen, Chunming Hu:
Open-Set Semi-Supervised Text Classification via Adversarial Disagreement Maximization. 2170-2180
- Junjie Ye, Sixian Li, Guanyu Li, Caishuang Huang, Songyang Gao, Yilong Wu, Qi Zhang, Tao Gui, Xuanjing Huang:
ToolSword: Unveiling Safety Issues of Large Language Models in Tool Learning Across Three Stages. 2181-2211
- Mohammad Javad Hosseini, Andrey Petrov, Alex Fabrikant, Annie Louis:
A synthetic data approach for domain generalization of NLI models. 2212-2226
- Ting Wu, Jingyi Liu, Rui Zheng, Tao Gui, Qi Zhang, Xuanjing Huang:
Enhancing Contrastive Learning with Noise-Guided Attack: Towards Continual Relation Extraction in the Wild. 2227-2239
- Jiaqi Zhao, Miao Zhang, Chao Zeng, Ming Wang, Xuebo Liu, Liqiang Nie:
LRQuant: Learnable and Robust Post-Training Quantization for Large Language Models. 2240-2255
- Leon Weber-Genzel, Siyao Peng, Marie-Catherine de Marneffe, Barbara Plank:
VariErr NLI: Separating Annotation Error from Human Label Variation. 2256-2269
- Xunjian Yin, Xu Zhang, Jie Ruan, Xiaojun Wan:
Benchmarking Knowledge Boundary for Large Language Models: A Different Perspective on Model Evaluation. 2270-2286
- Soyoung Yoon, Eunbi Choi, Jiyeon Kim, Hyeongu Yun, Yireun Kim, Seung-won Hwang:
ListT5: Listwise Reranking with Fusion-in-Decoder Improves Zero-shot Retrieval. 2287-2308
- Guizhen Chen, Liying Cheng, Anh Tuan Luu, Lidong Bing:
Exploring the Potential of Large Language Models in Computational Argumentation. 2309-2330
- Viktor Moskvoretskii, Ekaterina Neminova, Alina Lobanova, Alexander Panchenko, Irina Nikishina:
TaxoLLaMA: WordNet-based Model for Solving Multiple Lexical Semantic Tasks. 2331-2350
- Weiqi Wang, Tianqing Fang, Chunyang Li, Haochen Shi, Wenxuan Ding, Baixuan Xu, Zhaowei Wang, Jiaxin Bai, Xin Liu, Cheng Jiayang, Chunkit Chan, Yangqiu Song:
CANDLE: Iterative Conceptualization and Instantiation Distillation from Large Language Models for Commonsense Reasoning. 2351-2374
- Jitai Hao, Weiwei Sun, Xin Xin, Qi Meng, Zhumin Chen, Pengjie Ren, Zhaochun Ren:
MEFT: Memory-Efficient Fine-Tuning through Sparse Adapter. 2375-2388
- Arnav Chavan, Nahush Lele, Deepak K. Gupta:
Surgical Feature-Space Decomposition of LLMs: Why, When and How? 2389-2400
- Zhangyue Yin, Qiushi Sun, Qipeng Guo, Zhiyuan Zeng, Xiaonan Li, Junqi Dai, Qinyuan Cheng, Xuanjing Huang, Xipeng Qiu:
Reasoning in Flux: Enhancing Large Language Models Reasoning through Uncertainty-aware Adaptive Guidance. 2401-2416
- Junnan Dong, Qinggang Zhang, Huachi Zhou, Daochen Zha, Pai Zheng, Xiao Huang:
Modality-Aware Integration with Large Language Models for Knowledge-Based Visual Question Answering. 2417-2429
- Peiyu Liu, Ze-Feng Gao, Xin Zhao, Yipeng Ma, Tao Wang, Ji-Rong Wen:
Unlocking Data-free Low-bit Quantization with Matrix Decomposition for KV Cache Compression. 2430-2440
- Seoyeon Kim, Kwangwook Seo, Hyungjoo Chae, Jinyoung Yeo, Dongha Lee:
VerifiNER: Verification-augmented NER via Knowledge-grounded Reasoning with Large Language Models. 2441-2461
- Yanyang Li, Shuo Liang, Michael R. Lyu, Liwei Wang:
Making Long-Context Language Models Better Multi-Hop Reasoners. 2462-2475
- Yihong Liu, Chunlan Ma, Haotian Ye, Hinrich Schütze:
TransliCo: A Contrastive Learning Framework to Address the Script Barrier in Multilingual Pretrained Language Models. 2476-2499
- Vyas Raina, Samson Tan, Volkan Cevher, Aditya Rawal, Sheng Zha, George Karypis:
Extreme Miscalibration and the Illusion of Adversarial Robustness. 2500-2525
- Yongsen Zheng, Ruilin Xu, Ziliang Chen, Guohua Wang, Mingjie Qian, Jinghui Qin, Liang Lin:
HyCoRec: Hypergraph-Enhanced Multi-Preference Learning for Alleviating Matthew Effect in Conversational Recommendation. 2526-2537
- Mobashir Sadat, Cornelia Caragea:
Co-training for Low Resource Scientific Natural Language Inference. 2538-2550
- Jiongxiao Wang, Junlin Wu, Muhao Chen, Yevgeniy Vorobeychik, Chaowei Xiao:
RLHFPoison: Reward Poisoning Attack for Reinforcement Learning with Human Feedback in Large Language Models. 2551-2570
- Kai Nylund, Suchin Gururangan, Noah A. Smith:
Time is Encoded in the Weights of Finetuned Language Models. 2571-2587
- Howard Yen, Tianyu Gao, Danqi Chen:
Long-Context Language Modeling with Parallel Context Encoding. 2588-2610
- Yao Yao, Zuchao Li, Hai Zhao:
SirLLM: Streaming Infinite Retentive LLM. 2611-2624
- Tao Feng, Lizhen Qu, Zhuang Li, Haolan Zhan, Yuncheng Hua, Gholamreza Haffari:
IMO: Greedy Layer-Wise Sparse Representation Learning for Out-of-Distribution Text Classification with Pre-trained Models. 2625-2639
- Xiang Hu, Pengyu Ji, Qingyang Zhu, Wei Wu, Kewei Tu:
Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale. 2640-2657
- Ziyin Zhang, Yikang Liu, Weifang Huang, Junyu Mao, Rui Wang, Hai Hu:
MELA: Multilingual Evaluation of Linguistic Acceptability. 2658-2674
- Shilin Zhou, Zhenghua Li, Yu Hong, Min Zhang, Zhefeng Wang, Baoxing Huai:
CopyNE: Better Contextual ASR by Copying Named Entities. 2675-2686
- Peter Baile Chen, Yi Zhang, Dan Roth:
Is Table Retrieval a Solved Problem? Exploring Join-Aware Multi-Table Retrieval. 2687-2699
- Haonan Chen, Zhicheng Dou, Kelong Mao, Jiongnan Liu, Ziliang Zhao:
Generalizing Conversational Dense Retrieval via LLM-Cognition Data Augmentation. 2700-2718
- Wangtao Sun, Haotian Xu, Xuanqing Yu, Pei Chen, Shizhu He, Jun Zhao, Kang Liu:
ItD: Large Language Models Can Teach Themselves Induction through Deduction. 2719-2731
- Zimu Lu, Aojun Zhou, Houxing Ren, Ke Wang, Weikang Shi, Junting Pan, Mingjie Zhan, Hongsheng Li:
MathGenie: Generating Synthetic Data with Question Back-translation for Enhancing Mathematical Reasoning of LLMs. 2732-2747
- Heng-Da Xu, Xian-Ling Mao, Puhai Yang, Fanshu Sun, Heyan Huang:
Rethinking Task-Oriented Dialogue Systems: From Complex Modularity to Zero-Shot Autonomous Agent. 2748-2763
- Mathieu Ravaut, Aixin Sun, Nancy F. Chen, Shafiq Joty:
On Context Utilization in Summarization with Large Language Models. 2764-2781
- Yutao Zhu, Peitian Zhang, Chenghao Zhang, Yifei Chen, Binyu Xie, Zheng Liu, Ji-Rong Wen, Zhicheng Dou:
INTERS: Unlocking the Power of Large Language Models in Search with Instruction Tuning. 2782-2809
- Xiaoling Zhou, Wei Ye, Yidong Wang, Chaoya Jiang, Zhemg Lee, Rui Xie, Shikun Zhang:
Enhancing In-Context Learning via Implicit Demonstration Augmentation. 2810-2828
- Sheng Wang, Boyang Xue, Jiacheng Ye, Jiyue Jiang, Liheng Chen, Lingpeng Kong, Chuan Wu:
PRoLoRA: Partial Rotation Empowers More Parameter-Efficient LoRA. 2829-2841
- Zefan Cai, Po-Nien Kung, Ashima Suvarna, Mingyu Derek Ma, Hritik Bansal, Baobao Chang, P. Jeffrey Brantingham, Wei Wang, Nanyun Peng:
Improving Event Definition Following For Zero-Shot Event Detection. 2842-2863
- Xiao Wei, Qi Xu, Hang Yu, Qian Liu, Erik Cambria:
Through the MUD: A Multi-Defendant Charge Prediction Benchmark with Linked Crime Elements. 2864-2878
- Yiruo Cheng, Kelong Mao, Zhicheng Dou:
Interpreting Conversational Dense Retrieval by Rewriting-Enhanced Inversion of Session Embedding. 2879-2893
- Yichen Wang, Shangbin Feng, Abe Bohan Hou, Xiao Pu, Chao Shen, Xiaoming Liu, Yulia Tsvetkov, Tianxing He:
Stumbling Blocks: Stress Testing the Robustness of Machine-Generated Text Detectors Under Attacks. 2894-2925
- Chengyu Huang, Zeqiu Wu, Yushi Hu, Wenya Wang:
Training Language Models to Generate Text with Citations via Fine-grained Rewards. 2926-2949
- Qiwei Li, Zuchao Li, Ping Wang, Haojun Ai, Hai Zhao:
Hypergraph based Understanding for Document Semantic Entity Recognition. 2950-2960
- Qintong Li, Leyang Cui, Xueliang Zhao, Lingpeng Kong, Wei Bi:
GSM-Plus: A Comprehensive Benchmark for Evaluating the Robustness of LLMs as Mathematical Problem Solvers. 2961-2984
- Qingkai Min, Qipeng Guo, Xiangkun Hu, Songfang Huang, Zheng Zhang, Yue Zhang:
Synergetic Event Understanding: A Collaborative Approach to Cross-Document Event Coreference Resolution with Large Language Models. 2985-3002
- Shuofei Qiao, Ningyu Zhang, Runnan Fang, Yujie Luo, Wangchunshu Zhou, Yuchen Eleanor Jiang, Chengfei Lv, Huajun Chen:
AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning. 3003-3021
- T. Y. S. S. Santosh, Tuan-Quang Vuong, Matthias Grabmair:
ChronosLex: Time-aware Incremental Training for Temporal Generalization of Legal Classification Tasks. 3022-3039
- Zeyu Gao, Hao Wang, Yuanda Wang, Chao Zhang:
Virtual Compiler Is All You Need For Assembly Code Search. 3040-3051
- Pengjie Ren, Chengshun Shi, Shiguang Wu, Mengqi Zhang, Zhaochun Ren, Maarten de Rijke, Zhumin Chen, Jiahuan Pei:
MELoRA: Mini-Ensemble Low-Rank Adapters for Parameter-Efficient Fine-Tuning. 3052-3064
- Yongqi Tong, Dawei Li, Sizhe Wang, Yujia Wang, Fei Teng, Jingbo Shang:
Can LLMs Learn from Previous Mistakes? Investigating LLMs' Errors to Boost for Reasoning. 3065-3080
- Zhou Yang, Zhaochun Ren, Wang Yufeng, Haizhou Sun, Chao Chen, Xiaofei Zhu, Xiangwen Liao:
An Iterative Associative Memory Model for Empathetic Response Generation. 3081-3092
- Mengru Wang, Ningyu Zhang, Ziwen Xu, Zekun Xi, Shumin Deng, Yunzhi Yao, Qishen Zhang, Linyi Yang, Jindong Wang, Huajun Chen:
Detoxifying Large Language Models via Knowledge Editing. 3093-3118
- Yushi Bai, Xin Lv, Jiajie Zhang, Hongchang Lyu, Jiankai Tang, Zhidian Huang, Zhengxiao Du, Xiao Liu, Aohan Zeng, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li:
LongBench: A Bilingual, Multitask Benchmark for Long Context Understanding. 3119-3137
- Yuyan Chen, Songzhou Yan, Panjun Liu, Yanghua Xiao:
Dr.Academy: A Benchmark for Evaluating Questioning Capability in Education for Large Language Models. 3138-3167
- Trinh Pham, Khoi Le, Anh Tuan Luu:
UniBridge: A Unified Approach to Cross-Lingual Transfer Learning for Low-Resource Languages. 3168-3184
- Junjie Zhou, Zheng Liu, Shitao Xiao, Bo Zhao, Yongping Xiong:
VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval. 3185-3200
- Jiale Cheng, Xiao Liu, Kehan Zheng, Pei Ke, Hongning Wang, Yuxiao Dong, Jie Tang, Minlie Huang:
Black-Box Prompt Optimization: Aligning Large Language Models without Model Training. 3201-3219
- Chanjun Park, Hyeonwoo Kim, Dahyun Kim, Seonghwan Cho, Sanghoon Kim, Sukyung Lee, Yungi Kim, Hwalsuk Lee:
Open Ko-LLM Leaderboard: Evaluating Large Language Models in Korean with Ko-H5 Benchmark. 3220-3234
- Xiang Chen, Chenxi Wang, Yida Xue, Ningyu Zhang, Xiaoyan Yang, Qiang Li, Yue Shen, Lei Liang, Jinjie Gu, Huajun Chen:
Unified Hallucination Detection for Multimodal Large Language Models. 3235-3252
- Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Hongsheng Li:
Empowering Character-level Text Infilling by Eliminating Sub-Tokens. 3253-3267
- Kun Luo, Zheng Liu, Shitao Xiao, Tong Zhou, Yubo Chen, Jun Zhao, Kang Liu:
Landmark Embedding: A Chunking-Free Embedding Method For Retrieval Augmented Long-Context Large Language Models. 3268-3281
- Dayoon Ko, Jinyoung Kim, Hahyeon Choi, Gunhee Kim:
GrowOVER: How Can LLMs Adapt to Growing Real-World Knowledge? 3282-3308
- Aviv Slobodkin, Eran Hirsch, Arie Cattan, Tal Schuster, Ido Dagan:
Attribute First, then Generate: Locally-attributable Grounded Text Generation. 3309-3344
- Aoxiong Yin, Haoyuan Li, Kai Shen, Siliang Tang, Yueting Zhuang:
T2S-GPT: Dynamic Vector Quantization for Autoregressive Sign Language Production from Text. 3345-3356
- Zhen Bi, Ningyu Zhang, Yida Xue, Yixin Ou, Daxiong Ji, Guozhou Zheng, Huajun Chen:
OceanGPT: A Large Language Model for Ocean Science Tasks. 3357-3372
- Tongyao Zhu, Qian Liu, Liang Pang, Zhengbao Jiang, Min-Yen Kan, Min Lin:
Beyond Memorization: The Challenge of Random Memory Access in Language Models. 3373-3388
- Soonwoo Kwon, Sojung Kim, Minju Park, Seunghyun Lee, Kyuseok Kim:
BIPED: Pedagogically Informed Tutoring System for ESL Education. 3389-3414
- Jianhao Chen, Haoyuan Ouyang, Junyang Ren, Wentao Ding, Wei Hu, Yuzhong Qu:
Timeline-based Sentence Decomposition with In Context Learning for Temporal Fact Extraction. 3415-3432
- Will Aitken, Mohamed Abdalla, Karen Rudie, Catherine Stinson:
Collaboration or Corporate Capture? Quantifying NLP's Reliance on Industry Artifacts and Contributions. 3433-3448
- Siddhartha Datta, Alexander Ku, Deepak Ramachandran, Peter Anderson:
Prompt Expansion for Adaptive Text-to-Image Generation. 3449-3476
- Yani Huang, Xuefeng Zhang, Richong Zhang, Junfan Chen, Jaein Kim:
Progressively Modality Freezing for Multi-Modal Entity Alignment. 3477-3489
- Chaofan Li, Zheng Liu, Shitao Xiao, Yingxia Shao, Defu Lian:
Llama2Vec: Unsupervised Adaptation of Large Language Models for Dense Retrieval. 3490-3500
- Xuan-Phi Nguyen, Mahani Aljunied, Shafiq Joty, Lidong Bing:
Democratizing LLMs for Low-Resource Languages by Leveraging their English Dominant Abilities with Linguistically-Diverse Prompts. 3501-3516
- Xiaoyu Tong, Rochelle Choenni, Martha Lewis, Ekaterina Shutova:
Metaphor Understanding Challenge Dataset for LLMs. 3517-3536
- Peitian Zhang, Zheng Liu, Shitao Xiao, Zhicheng Dou, Jian-Yun Nie:
A Multi-Task Embedder For Retrieval Augmented LLMs. 3537-3553
- Bruce W. Lee, Jaehyuk Lim:
Language Models Don't Learn the Physical Manifestation of Language. 3554-3579
- Shangbin Feng, Herun Wan, Ningnan Wang, Zhaoxuan Tan, Minnan Luo, Yulia Tsvetkov:
What Does the Bot Say? Opportunities and Risks of Large Language Models in Social Media Bot Detection. 3580-3601
- Wenqi Zhang, Yongliang Shen, Linjuan Wu, Qiuying Peng, Jun Wang, Yueting Zhuang, Weiming Lu:
Self-Contrast: Better Reflection Through Inconsistent Solving Perspectives. 3602-3622 - Kaitlyn Zhou, Jena D. Hwang, Xiang Ren, Maarten Sap:
Relying on the Unreliable: The Impact of Language Models' Reluctance to Express Uncertainty. 3623-3643 - Xiaochen Wang, Junyu Luo, Jiaqi Wang, Yuan Zhong, Xiaokun Zhang, Yaqing Wang, Parminder Bhatia, Cao Xiao, Fenglong Ma:
Unity in Diversity: Collaborative Pre-training Across Multimodal Medical Sources. 3644-3656 - Sara Papi, Marco Gaido, Andrea Pilzer, Matteo Negri:
When Good and Reproducible Results are a Giant with Feet of Clay: The Importance of Software Quality in NLP. 3657-3672 - Marco Gaido, Sara Papi, Matteo Negri, Mauro Cettolo, Luisa Bentivogli:
SBAAM! Eliminating Transcript Dependency in Automatic Subtitling. 3673-3691 - Sara Papi, Marco Gaido, Matteo Negri, Luisa Bentivogli:
StreamAtt: Direct Streaming Speech-to-Text Translation with Attention-based Audio History Selection. 3692-3707 - Lingxi Zhang, Yue Yu, Kuan Wang, Chao Zhang:
ARL2: Aligning Retrievers with Black-box Large Language Models via Self-guided Adaptive Relevance Labeling. 3708-3719 - Jihwan Bang, Juntae Lee, Kyuhong Shim, Seunghan Yang, Simyung Chang:
Crayon: Customized On-Device LLM via Instant Adapter Blending and Edge-Server Hybrid Inference. 3720-3731 - Yebin Lee, Imseong Park, Myungjoo Kang:
FLEUR: An Explainable Reference-Free Evaluation Metric for Image Captioning Using a Large Multimodal Model. 3732-3746 - Yuxin Wang, Ivory Yang, Saeed Hassanpour, Soroush Vosoughi:
MentalManip: A Dataset For Fine-grained Analysis of Mental Manipulation in Conversations. 3747-3764 - Zhenlong Dai, Chang Yao, WenKang Han, Yuanying Yuanying, Zhipeng Gao, Jingyuan Chen:
MPCoder: Multi-user Personalized Code Generator with Explicit and Implicit Style Representation Learning. 3765-3780 - Ajay Patel, Colin Raffel, Chris Callison-Burch:
DataDreamer: A Tool for Synthetic Data Generation and Reproducible LLM Workflows. 3781-3799 - Chenze Shao, Fandong Meng, Jiali Zeng, Jie Zhou:
Understanding and Addressing the Under-Translation Problem from the Perspective of Decoding Objective. 3800-3814 - Cheng Liu, Wei Xiang, Bang Wang:
Identifying while Learning for Document Event Causality Identification. 3815-3827 - Chaoqun He, Renjie Luo, Yuzhuo Bai, Shengding Hu, Zhen Leng Thai, Junhao Shen, Jinyi Hu, Xu Han, Yujie Huang, Yuxiang Zhang, Jie Liu, Lei Qi, Zhiyuan Liu, Maosong Sun:
OlympiadBench: A Challenging Benchmark for Promoting AGI with Olympiad-Level Bilingual Multimodal Scientific Problems. 3828-3850 - Wei Xue, Yongliang Shen, Wenqi Ren, Jietian Guo, Shiliang Pu, Weiming Lu:
Insert or Attach: Taxonomy Completion via Box Embedding. 3851-3863 - Hyunji Lee, Doyoung Kim, Jihoon Jun, Se June Joo, Joel Jang, Kyoung-Woon On, Minjoon Seo:
Semiparametric Token-Sequence Co-Supervision. 3864-3882 - Weidong Guo, Jiuding Yang, Kaitong Yang, Xiangyang Li, Zhuwei Rao, Yu Xu, Di Niu:
Instruction Fusion: Advancing Prompt Evolution through Hybridization. 3883-3893 - Yikai Zhang, Siyu Yuan, Caiyu Hu, Kyle Richardson, Yanghua Xiao, Jiangjie Chen:
TimeArena: Shaping Efficient Multitasking Language Agents in a Time-Aware Simulation. 3894-3916 - Shenglai Zeng, Yaxin Li, Jie Ren, Yiding Liu, Han Xu, Pengfei He, Yue Xing, Shuaiqiang Wang, Jiliang Tang, Dawei Yin:
Exploring Memorization in Fine-tuned Language Models. 3917-3948 - Shun Zhang, Chaoran Yan, Jian Yang, Jiaheng Liu, Ying Mo, Jiaqi Bai, Tongliang Li, Zhoujun Li:
Towards Real-world Scenario: Imbalanced New Intent Discovery. 3949-3963 - Yuxia Wang, Jonibek Mansurov, Petar Ivanov, Jinyan Su, Artem Shelmanov, Akim Tsvigun, Osama Mohammed Afzal, Tarek Mahmoud, Giovanni Puccetti, Thomas Arnold, Alham Fikri Aji, Nizar Habash, Iryna Gurevych, Preslav Nakov:
M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection. 3964-3992 - Jian Wang, Chak Tou Leong, Jiashuo Wang, Dongding Lin, Wenjie Li, Xiaoyong Wei:
Instruct Once, Chat Consistently in Multiple Rounds: An Efficient Tuning Framework for Dialogue. 3993-4010 - Nan He, Weichen Xiong, Hanwen Liu, Yi Liao, Lei Ding, Kai Zhang, Guohua Tang, Xiao Han, Yang Wei:
SoftDedup: an Efficient Data Reweighting Method for Speeding Up Language Model Pre-training. 4011-4022 - Ning Bian, Xianpei Han, Hongyu Lin, Yaojie Lu, Ben He, Le Sun:
Rule or Story, Which is a Better Commonsense Expression for Talking with Large Language Models? 4023-4043 - Zeqi Tan, Yongliang Shen, Xiaoxia Cheng, Chang Zong, Wenqi Zhang, Jian Shao, Weiming Lu, Yueting Zhuang:
Learning Global Controller in Latent Space for Parameter-Efficient Fine-Tuning. 4044-4055 - Yixin Chen, Shuai Zhang, Boran Han, Tong He, Bo Li:
CaMML: Context-Aware Multimodal Learner for Large Models. 4056-4071 - Xiaozhi Wang, Hao Peng, Yong Guan, Kaisheng Zeng, Jianhui Chen, Lei Hou, Xu Han, Yankai Lin, Zhiyuan Liu, Ruobing Xie, Jie Zhou, Juanzi Li:
MAVEN-ARG: Completing the Puzzle of All-in-One Event Understanding Dataset with Event Argument Annotation. 4072-4091 - Lizhou Fan, Wenyue Hua, Lingyao Li, Haoyang Ling, Yongfeng Zhang:
NPHardEval: Dynamic Benchmark on Reasoning Ability of Large Language Models via Complexity Classes. 4092-4114 - Zhiwei He, Binglin Zhou, Hongkun Hao, Aiwei Liu, Xing Wang, Zhaopeng Tu, Zhuosheng Zhang, Rui Wang:
Can Watermarks Survive Translation? On the Cross-lingual Consistency of Text Watermark for Large Language Models. 4115-4129 - Alicja Chaszczewicz, Raj Sanjay Shah, Ryan Louie, Bruce A Arnow, Robert E. Kraut, Diyi Yang:
Multi-Level Feedback Generation with Large Language Models for Empowering Novice Peer Counselors. 4130-4161 - Bhavani Shankar, Preethi Jyothi, Pushpak Bhattacharyya:
In-context Mixing (ICM): Code-mixed Prompts for Multilingual LLMs. 4162-4176 - Liang Zhang, Qin Jin, Haoyang Huang, Dongdong Zhang, Furu Wei:
Respond in my Language: Mitigating Language Inconsistency in Response Generation based on Large Language Models. 4177-4192 - Yu-Hsiang Huang, Yu-Che Tsai, Hsiang Hsiao, Hong-Yi Lin, Shou-De Lin:
Transferable Embedding Inversion Attack: Uncovering Privacy Risks in Text Embeddings without Model Queries. 4193-4205 - Kuo Liao, Shuang Li, Meng Zhao, Liqun Liu, Mengge Xue, Zhenyu Hu, Honglin Han, Chengguo Yin:
Enhancing Reinforcement Learning with Label-Sensitive Reward for Natural Language Understanding. 4206-4220 - Jiahao Ying, Yixin Cao, Kai Xiong, Long Cui, Yidong He, Yongbin Liu:
Intuitive or Dependent? Investigating LLMs' Behavior Style to Conflicting Prompts. 4221-4246 - Shiyi Zhu, Jing Ye, Wei Jiang, Siqiao Xue, Qi Zhang, Yifan Wu, Jianguo Li:
CoCA: Fusing Position Embedding with Collinear Constrained Attention in Transformers for Long Context Window Extending. 4247-4262 - Jan Trienes, Sebastian Joseph, Jörg Schlötterer, Christin Seifert, Kyle Lo, Wei Xu, Byron C. Wallace, Junyi Jessy Li:
InfoLossQA: Characterizing and Recovering Information Loss in Text Simplification. 4263-4294 - Kaiyan Zhang, Jianyu Wang, Ermo Hua, Biqing Qi, Ning Ding, Bowen Zhou:
CoGenesis: A Framework Collaborating Large and Small Language Models for Secure Context-Aware Instruction Following. 4295-4312 - Kexin Wang, Nils Reimers, Iryna Gurevych:
DAPR: A Benchmark on Document-Aware Passage Retrieval. 4313-4330 - Mengge Xue, Zhenyu Hu, Liqun Liu, Kuo Liao, Shuang Li, Honglin Han, Meng Zhao, Chengguo Yin:
Strengthened Symbol Binding Makes Large Language Models Reliable Multiple-Choice Selectors. 4331-4344 - Hanzhu Chen, Xu Shen, Qitan Lv, Jie Wang, Xiaoqi Ni, Jieping Ye:
SAC-KG: Exploiting Large Language Models as Skilled Automatic Constructors for Domain Knowledge Graph. 4345-4360 - Chuanpeng Yang, Yaxin Liu, Fuqing Zhu, Jizhong Han, Songlin Hu:
Uncertainty-Guided Modal Rebalance for Hateful Memes Detection. 4361-4371 - Max Glockner, Yufang Hou, Preslav Nakov, Iryna Gurevych:
Missci: Reconstructing Fallacies in Misrepresented Science. 4372-4405 - Daniel Reich, Tanja Schultz:
Uncovering the Full Potential of Visual Grounding Methods in VQA. 4406-4419 - Jiejun Tan, Zhicheng Dou, Yutao Zhu, Peidong Guo, Kun Fang, Ji-Rong Wen:
Small Models, Big Insights: Leveraging Slim Proxy Models To Decide When and What to Retrieve for LLMs. 4420-4436 - Pius von Däniken, Jan Deriu, Don Tuggener, Mark Cieliebak:
Favi-Score: A Measure for Favoritism in Automated Preference Ratings for Generative AI Evaluation. 4437-4454 - Timon Ziegenbein, Gabriella Skitalinskaya, Alireza Bayat Makou, Henning Wachsmuth:
LLM-based Rewriting of Inappropriate Argumentation using Reinforcement Learning from Machine Feedback. 4455-4476 - Moritz Plenz, Anette Frank:
Graph Language Models. 4477-4494 - Francesco Periti, Pierluigi Cassotti, Haim Dubossarsky, Nina Tahmasebi:
Analyzing Semantic Change through Lexical Replacements. 4495-4510 - Zhe Xu, Kun Wei, Xu Yang, Cheng Deng:
Exploiting Intrinsic Multilateral Logical Rules for Weakly Supervised Natural Language Video Localization. 4511-4521 - Lucas Weber, Jaap Jumelet, Elia Bruni, Dieuwke Hupkes:
Interpretability of Language Models via Task Spaces. 4522-4538 - Pierluigi Cassotti, Stefano De Pascale, Nina Tahmasebi:
Using Synchronic Definitions and Semantic Relations to Classify Semantic Change Types. 4539-4553 - Matéo Mahaut, Laura Aina, Paula Czarnowska, Momchil Hardalov, Thomas Müller, Lluís Màrquez:
Factual Confidence of LLMs: on Reliability and Robustness of Current Estimators. 4554-4570 - Shihan Dou, Yan Liu, Haoxiang Jia, Enyu Zhou, Limao Xiong, Junjie Shan, Caishuang Huang, Xiao Wang, Xiaoran Fan, Zhiheng Xi, Yuhao Zhou, Tao Ji, Rui Zheng, Qi Zhang, Tao Gui, Xuanjing Huang:
StepCoder: Improving Code Generation with Reinforcement Learning from Compiler Feedback. 4571-4585 - Yunshui Li, Binyuan Hui, Xiaobo Xia, Jiaxi Yang, Min Yang, Lei Zhang, Shuzheng Si, Ling-Hao Chen, Junhao Liu, Tongliang Liu, Fei Huang, Yongbin Li:
One-Shot Learning as Instruction Data Prospector for Large Language Models. 4586-4601 - Chenyu Shi, Xiao Wang, Qiming Ge, Songyang Gao, Xianjun Yang, Tao Gui, Qi Zhang, Xuanjing Huang, Xun Zhao, Dahua Lin:
Navigating the OverKill in Large Language Models. 4602-4614 - Alon Jacovi, Yonatan Bitton, Bernd Bohnet, Jonathan Herzig, Or Honovich, Michael Tseng, Michael Collins, Roee Aharoni, Mor Geva:
A Chain-of-Thought Is as Strong as Its Weakest Link: A Benchmark for Verifiers of Reasoning Chains. 4615-4634 - Qian Ruan, Ilia Kuznetsov, Iryna Gurevych:
Re3: A Holistic Framework and Dataset for Modeling Collaborative Document Revision. 4635-4655 - Tamara Czinczoll, Christoph Hönes, Maximilian Schall, Gerard de Melo:
NextLevelBERT: Masked Language Modeling with Higher-Level Representations for Long Documents. 4656-4666 - Yuxin Jiang, Yufei Wang, Xingshan Zeng, Wanjun Zhong, Liangyou Li, Fei Mi, Lifeng Shang, Xin Jiang, Qun Liu, Wei Wang:
FollowBench: A Multi-level Fine-grained Constraints Following Benchmark for Large Language Models. 4667-4688 - Yuxin Jiang, Yufei Wang, Chuhan Wu, Wanjun Zhong, Xingshan Zeng, Jiahui Gao, Liangyou Li, Xin Jiang, Lifeng Shang, Ruiming Tang, Qun Liu, Wei Wang:
Learning to Edit: Aligning LLMs with Knowledge Editing. 4689-4705 - Yejie Wang, Keqing He, Guanting Dong, Pei Wang, Weihao Zeng, Muxi Diao, Weiran Xu, Jingang Wang, Mengdi Zhang, Xunliang Cai:
DolphCoder: Echo-Locating Code Large Language Models with Diverse and Multi-Objective Instruction Tuning. 4706-4721 - Brielen Madureira, Patrick Kahardipraja, David Schlangen:
When Only Time Will Tell: Interpreting How Transformers Process Local Ambiguities Through the Lens of Restart-Incrementality. 4722-4749 - Md Imbesat Hassan Rizvi, Xiaodan Zhu, Iryna Gurevych:
SpaRC and SpaRP: Spatial Reasoning Characterization and Path Generation for Understanding Spatial Reasoning Capability of Large Language Models. 4750-4767 - Tao He, Lizi Liao, Yixin Cao, Yuanxing Liu, Ming Liu, Zerui Chen, Bing Qin:
Planning Like Human: A Dual-process Framework for Dialogue Planning. 4768-4791 - Nicola Cancedda:
Spectral Filters, Dark Signals, and Attention Sinks. 4792-4808 - Silin Gao, Mete Ismayilzada, Mengjie Zhao, Hiromi Wakaki, Yuki Mitsufuji, Antoine Bosselut:
DiffuCOMET: Contextual Commonsense Knowledge Diffusion. 4809-4831 - Furkan Sahinuç, Ilia Kuznetsov, Yufang Hou, Iryna Gurevych:
Systematic Task Exploration with LLMs: A Study in Citation Text Generation. 4832-4855 - Matteo Bortoletto, Constantin Ruhdorfer, Adnen Abdessaied, Lei Shi, Andreas Bulling:
Limits of Theory of Mind Modelling in Dialogue-Based Collaborative Plan Acquisition. 4856-4871 - Ziyang Chen, Dongfang Li, Xiang Zhao, Baotian Hu, Min Zhang:
Temporal Knowledge Question Answering via Abstract Reasoning Induction. 4872-4889 - Taehyun Lee, Seokhee Hong, Jaewoo Ahn, Ilgee Hong, Hwaran Lee, Sangdoo Yun, Jamin Shin, Gunhee Kim:
Who Wrote this Code? Watermarking for Code Generation. 4890-4911 - Md. Ashraful Islam, Mohammed Eunus Ali, Md. Rizwan Parvez:
MapCoder: Multi-Agent Code Generation for Competitive Problem Solving. 4912-4944 - Lei Zhu, Xinjiang Wang, Wayne Zhang, Rynson W. H. Lau:
RelayAttention for Efficient Large Language Model Serving with Long System Prompts. 4945-4957 - Jianing Wang, Qiushi Sun, Xiang Li, Ming Gao:
Boosting Language Models Reasoning with Chain-of-Knowledge Prompting. 4958-4981 - Shiguang Guo, Ziliang Deng, Hongyu Lin, Yaojie Lu, Xianpei Han, Le Sun:
Open Grounded Planning: Challenges and Benchmark Construction. 4982-5003 - Chenghao Xu, Guangtao Lyu, Jiexi Yan, Muli Yang, Cheng Deng:
LLM Knows Body Language, Too: Translating Speech Voices into Human Gestures. 5004-5013 - Xiang Huang, Sitao Cheng, Shanshan Huang, Jiayu Shen, Yong Xu, Chaoyun Zhang, Yuzhong Qu:
QueryAgent: A Reliable and Efficient Reasoning Framework with Environmental Feedback based Self-Correction. 5014-5035 - Yang Sun, Muyi Wang, Jianzhu Bao, Bin Liang, Xiaoyan Zhao, Caihua Yang, Min Yang, Ruifeng Xu:
PITA: Prompting Task Interaction for Argumentation Mining. 5036-5049 - Jinhao Duan, Hao Cheng, Shiqi Wang, Alex Zavalny, Chenan Wang, Renjing Xu, Bhavya Kailkhura, Kaidi Xu:
Shifting Attention to Relevance: Towards the Predictive Uncertainty Quantification of Free-Form Large Language Models. 5050-5063 - Gregor Geigle, Radu Timofte, Goran Glavas:
Babel-ImageNet: Massively Multilingual Evaluation of Vision-and-Language Representations. 5064-5084 - Diya Li, Carolyn P. Rosé, Ao Yuan, Chunxiao Zhou:
Estimating Agreement by Chance for Sequence Annotation. 5085-5097 - Sheng Lu, Irina Bigoulaeva, Rachneet Sachdeva, Harish Tayyar Madabushi, Iryna Gurevych:
Are Emergent Abilities in Large Language Models just In-Context Learning? 5098-5139 - Zhaojian Yu, Xin Zhang, Ning Shang, Yangyu Huang, Can Xu, Yishujie Zhao, Wenxiang Hu, Qiufeng Yin:
WaveCoder: Widespread And Versatile Enhancement For Code Large Language Models By Instruction Tuning. 5140-5153 - Bryan Li, Tamer Alkhouli, Daniele Bonadiman, Nikolaos Pappas, Saab Mansour:
Eliciting Better Multilingual Structured Reasoning from LLMs through Code. 5154-5169 - Timothy Ossowski, Junjie Hu:
OLIVE: Object Level In-Context Visual Embeddings. 5170-5185 - Jiuhai Chen, Jonas Mueller:
Quantifying Uncertainty in Answers from any Language Model and Enhancing their Trustworthiness. 5186-5200 - Lei Zhang, Yunshui Li, Ziqiang Liu, Jiaxi Yang, Junhao Liu, Longze Chen, Run Luo, Min Yang:
Marathon: A Race Through the Realm of Long Context with Large Language Models. 5201-5217 - Xiaochen Gao, Feng Yao, Kewen Zhao, Beilei He, Animesh Kumar, Vish Krishnan, Jingbo Shang:
Beyond Scaling: Predicting Patent Approval with Domain-specific Fine-grained Claim Dependency Graph. 5218-5234 - Xianwei Zhuang, Xuxin Cheng, Liming Liang, Yuxin Xie, Zhichang Wang, Zhiqi Huang, Yuexian Zou:
PCAD: Towards ASR-Robust Spoken Language Understanding via Prototype Calibration and Asymmetric Decoupling. 5235-5246 - Tao Jin, Wang Lin, Ye Wang, Linjun Li, Xize Cheng, Zhou Zhao:
Rethinking the Multimodal Correlation of Multimodal Sequential Learning via Generalizable Attentional Results Alignment. 5247-5265 - Xun Liang, Shichao Song, Simin Niu, Zhiyu Li, Feiyu Xiong, Bo Tang, Yezhaohui Wang, Dawei He, Cheng Peng, Zhonghao Wang, Haiying Deng:
UHGEval: Benchmarking the Hallucination of Chinese Large Language Models via Unconstrained Generation. 5266-5293 - Weizhe Lin, Jingbiao Mei, Jinghong Chen, Bill Byrne:
PreFLMR: Scaling Up Fine-Grained Late-Interaction Multi-modal Retrievers. 5294-5316 - Justus-Jonas Erker, Florian Mai, Nils Reimers, Gerasimos Spanakis, Iryna Gurevych:
Triple-Encoders: Representations That Fire Together, Wire Together. 5317-5332 - Jingbiao Mei, Jinghong Chen, Weizhe Lin, Bill Byrne, Marcus Tomalin:
Improving Hateful Meme Detection through Retrieval-Guided Contrastive Learning. 5333-5347 - Wenqi Zhang, Ke Tang, Hai Wu, Mengna Wang, Yongliang Shen, Guiyang Hou, Zeqi Tan, Peng Li, Yueting Zhuang, Weiming Lu:
Agent-Pro: Learning to Evolve via Policy-Level Reflection and Optimization. 5348-5375 - Anton Razzhigaev, Matvey Mikhalchuk, Elizaveta Goncharova, Nikolai Gerasimenko, Ivan V. Oseledets, Denis Dimitrov, Andrey Kuznetsov:
Your Transformer is Secretly Linear. 5376-5384 - Uthman Jinadu, Yi Ding:
Noise Correction on Subjective Datasets. 5385-5395 - Lütfi Kerem Senel, Besnik Fetahu, Davis Yoshida, Zhiyu Chen, Giuseppe Castellucci, Nikhita Vedula, Jason Ingyu Choi, Shervin Malmasi:
Generative Explore-Exploit: Training-free Optimization of Generative Recommender Systems using LLM Optimizers. 5396-5420 - Zhengbao Jiang, Zhiqing Sun, Weijia Shi, Pedro Rodríguez, Chunting Zhou, Graham Neubig, Xi Victoria Lin, Wen-tau Yih, Srini Iyer:
Instruction-tuned Language Models are Better Knowledge Learners. 5421-5434 - Jerry Ngo, Yoon Kim:
What Do Language Models Hear? Probing for Auditory Representations in Language Models. 5435-5448 - Zae Myung Kim, Kwang Hee Lee, Preston Zhu, Vipul Raheja, Dongyeop Kang:
Threads of Subtlety: Detecting Machine-Generated Texts Through Discourse Motifs. 5449-5474 - Hangfan Zhang, Zhimeng Guo, Huaisheng Zhu, Bochuan Cao, Lu Lin, Jinyuan Jia, Jinghui Chen, Dinghao Wu:
Jailbreak Open-Sourced Large Language Models via Enforced Decoding. 5475-5493 - Pragya Srivastava, Satvik Golechha, Amit Deshpande, Amit Sharma:
NICE: To Optimize In-Context Examples or Not? 5494-5510 - Weixiang Yan, Haitian Liu, Yunkun Wang, Yunzhe Li, Qian Chen, Wen Wang, Tingyu Lin, Weishan Zhao, Li Zhu, Hari Sundaram, Shuiguang Deng:
CodeScope: An Execution-based Multilingual Multitask Multidimensional Benchmark for Evaluating LLMs on Code Understanding and Generation. 5511-5558 - Yuling Gu, Oyvind Tafjord, Peter Clark:
Digital Socrates: Evaluating LLMs through Explanation Critiques. 5559-5586 - Zhangchen Xu, Fengqing Jiang, Luyao Niu, Jinyuan Jia, Bill Yuchen Lin, Radha Poovendran:
SafeDecoding: Defending against Jailbreak Attacks via Safety-Aware Decoding. 5587-5605 - Guijin Son, Sangwon Baek, Sangdae Nam, Ilgyun Jeong, Seungone Kim:
Multi-Task Inference: Can Large Language Models Follow Multiple Instructions at Once? 5606-5627 - Chen Qian, Yufan Dang, Jiahao Li, Wei Liu, Zihao Xie, Yifei Wang, Weize Chen, Cheng Yang, Xin Cong, Xiaoyin Che, Zhiyuan Liu, Maosong Sun:
Experiential Co-Learning of Software-Developing Agents. 5628-5640 - Kai Tang, Junbo Zhao, Xiao Ding, Runze Wu, Lei Feng, Gang Chen, Haobo Wang:
Learning Geometry-Aware Representations for New Intent Discovery. 5641-5654 - Yizhe Yang, Palakorn Achananuparp, Heyan Huang, Jing Jiang, Ee-Peng Lim:
Speaker Verification in Agent-generated Conversations. 5655-5676 - Yuge Zhang, Qiyang Jiang, Xingyu Han, Nan Chen, Yuqing Yang, Kan Ren:
Benchmarking Data Science Agents. 5677-5700 - Tianyi Tang, Wenyang Luo, Haoyang Huang, Dongdong Zhang, Xiaolei Wang, Xin Zhao, Furu Wei, Ji-Rong Wen:
Language-Specific Neurons: The Key to Multilingual Capabilities in Large Language Models. 5701-5715 - Shiwen Ni, Dingwei Chen, Chengming Li, Xiping Hu, Ruifeng Xu, Min Yang:
Forgetting before Learning: Utilizing Parametric Arithmetic for Knowledge Updating in Large Language Models. 5716-5731 - Megh Thakkar, Quentin Fournier, Matthew Riemer, Pin-Yu Chen, Amal Zouaq, Payel Das, Sarath Chandar:
A Deep Dive into the Trade-Offs of Parameter-Efficient Preference Alignment Techniques. 5732-5745 - Xiang Luo, Zhiwen Tang, Jin Wang, Xuejie Zhang:
Zero-Shot Cross-Domain Dialogue State Tracking via Dual Low-Rank Adaptation. 5746-5765 - Jian Luo, Xuanang Chen, Ben He, Le Sun:
PRP-Graph: Pairwise Ranking Prompting to LLMs with Graph Aggregation for Effective Text Re-ranking. 5766-5776 - Zhichao Huang, Chutong Meng, Tom Ko:
RepCodec: A Speech Representation Codec for Speech Tokenization. 5777-5790 - Jiayi Fu, Xuandong Zhao, Ruihan Yang, Yuansen Zhang, Jiangjie Chen, Yanghua Xiao:
GumbelSoft: Diversified Language Model Watermarking via the GumbelMax-trick. 5791-5808 - Zihan Ma, Minnan Luo, Hao Guo, Zhi Zeng, Yiran Hao, Xiang Zhao:
Event-Radar: Event-driven Multi-View Learning for Multimodal Fake News Detection. 5809-5821 - Liyan Xu, Jiangnan Li, Mo Yu, Jie Zhou:
Fine-Grained Modeling of Narrative Context: A Coherence Perspective via Retrospective Questions. 5822-5838 - Jinghao Zhang, Yuting Liu, Qiang Liu, Shu Wu, Guibing Guo, Liang Wang:
Stealthy Attack on Large Language Model based Recommendation. 5839-5857 - Sangwon Ryu, Heejin Do, Yunsu Kim, Gary Lee, Jungseul Ok:
Multi-Dimensional Optimization for Text Summarization via Reinforcement Learning. 5858-5871 - Changyu Chen, Xiting Wang, Ting-En Lin, Ang Lv, Yuchuan Wu, Xin Gao, Ji-Rong Wen, Rui Yan, Yongbin Li:
Masked Thought: Simply Masking Partial Reasoning Steps Can Improve Mathematical Reasoning Learning of Language Models. 5872-5900 - Guoxin Chen, Kexin Tang, Chao Yang, Fuying Ye, Yu Qiao, Yiming Qian:
SEER: Facilitating Structured Reasoning and Explanation via Reinforcement Learning. 5901-5921 - Yeachan Kim, Junho Kim, SangKeun Lee:
Towards Robust and Generalized Parameter-Efficient Fine-Tuning for Noisy Label Learning. 5922-5936 - Yeachan Kim, SangKeun Lee:
SparseFlow: Accelerating Transformers by Sparsifying Information Flows. 5937-5948 - Zhiyuan Liu, An Zhang, Hao Fei, Enzhi Zhang, Xiang Wang, Kenji Kawaguchi, Tat-Seng Chua:
ProtT3: Protein-to-Text Generation for Text-based Protein Understanding. 5949-5966 - Zhuohao Yu, Chang Gao, Wenjin Yao, Yidong Wang, Wei Ye, Jindong Wang, Xing Xie, Yue Zhang, Shikun Zhang:
KIEval: A Knowledge-grounded Interactive Evaluation Framework for Large Language Models. 5967-5985 - Sahand Sabour, Siyang Liu, Zheyuan Zhang, June M. Liu, Jinfeng Zhou, Alvionna S. Sunaryo, Tatia M. C. Lee, Rada Mihalcea, Minlie Huang:
EmoBench: Evaluating the Emotional Intelligence of Large Language Models. 5986-6004 - Guanhua Huang, Yuchen Zhang, Zhe Li, Yongjian You, Mingze Wang, Zhouwang Yang:
Are AI-Generated Text Detectors Robust to Adversarial Perturbations? 6005-6024 - Jian Chen, Peilin Zhou, Yining Hua, Loh Xin, Kehui Chen, Ziyuan Li, Bing Zhu, Junwei Liang:
FinTextQA: A Dataset for Long-form Financial Question Answering. 6025-6047 - Letitia Parcalabescu, Anette Frank:
On Measuring Faithfulness or Self-consistency of Natural Language Explanations. 6048-6089 - Mengjie Ren, Boxi Cao, Hongyu Lin, Cao Liu, Xianpei Han, Ke Zeng, Guanglu Wan, Xunliang Cai, Le Sun:
Learning or Self-aligning? Rethinking Instruction Fine-tuning. 6090-6105 - Qineng Wang, Zihao Wang, Ying Su, Hanghang Tong, Yangqiu Song:
Rethinking the Bounds of LLM Reasoning: Are Multi-Agent Discussions the Key? 6106-6131 - Qunbo Wang, Ruyi Ji, Tianhao Peng, Wenjun Wu, Zechao Li, Jing Liu:
Soft Knowledge Prompt: Help External Knowledge Become a Better Teacher to Instruct LLM in Knowledge-based VQA. 6132-6143 - Yutong Wang, Jiali Zeng, Xuebo Liu, Fandong Meng, Jie Zhou, Min Zhang:
TasTe: Teaching Large Language Models to Translate through Self-Reflection. 6144-6158 - Xudong Lu, Qi Liu, Yuhui Xu, Aojun Zhou, Siyuan Huang, Bo Zhang, Junchi Yan, Hongsheng Li:
Not All Experts are Equal: Efficient Expert Pruning and Skipping for Mixture-of-Experts Large Language Models. 6159-6172 - Wei Li, Xue Xu, Jiachen Liu, Xinyan Xiao:
UNIMO-G: Unified Image Generation through Multimodal Conditional Diffusion. 6173-6188 - David Stap, Eva Hasler, Bill Byrne, Christof Monz, Ke Tran:
The Fine-Tuning Paradox: Boosting Translation Quality Without Sacrificing LLM Abilities. 6189-6206 - Hexiang Tan, Fei Sun, Wanli Yang, Yuanzhuo Wang, Qi Cao, Xueqi Cheng:
Blinded by Generated Contexts: How Language Models Merge Generated and Retrieved Contexts When Knowledge Conflicts? 6207-6227 - Zhihao Zhang, Jun Zhao, Qi Zhang, Tao Gui, Xuanjing Huang:
Unveiling Linguistic Regions in Large Language Models. 6228-6247 - Zhiqing Hong, Rongjie Huang, Xize Cheng, Yongqi Wang, Ruiqi Li, Fuming You, Zhou Zhao, Zhimeng Zhang:
Text-to-Song: Towards Controllable Music Generation Incorporating Vocal and Accompaniment. 6248-6261 - Yufei Huang, Xu Han, Maosong Sun:
FastFiD: Improve Inference Efficiency of Open Domain Question Answering via Sentence Selection. 6262-6276 - Yisong Miao, Hongfu Liu, Wenqiang Lei, Nancy F. Chen, Min-Yen Kan:
Discursive Socratic Questioning: Evaluating the Faithfulness of Language Models' Understanding of Discourse Relations. 6277-6295 - Mykola Trokhymovych, Indira Sen, Martin Gerlach:
An Open Multilingual System for Scoring Readability of Wikipedia. 6296-6311 - Masaru Isonuma, Ivan Titov:
Unlearning Traces the Influential Training Data of Language Models. 6312-6325 - Basel Mousi, Nadir Durrani, Fahim Dalvi, Majd Hawasly, Ahmed Abdelali:
Exploring Alignment in Shared Cross-lingual Spaces. 6326-6348 - Wenxuan Wang, Wenxiang Jiao, Jingyuan Huang, Ruyi Dai, Jen-tse Huang, Zhaopeng Tu, Michael R. Lyu:
Not All Countries Celebrate Thanksgiving: On the Cultural Dominance in Large Language Models. 6349-6384 - Jinglong Gao, Xiao Ding, Yiming Cui, Jianbai Zhao, Hepeng Wang, Ting Liu, Bing Qin:
Self-Evolving GPT: A Lifelong Autonomous Experiential Learner. 6385-6432 - Zhendong Tan, Xingjun Zhang, Zheng Wei:
WRP: Weight Recover Prune for Structured Sparsity. 6433-6443 - Janick Michot, Manuela Hürlimann, Jan Deriu, Luzia Sauer, Katsiaryna Mlynchyk, Mark Cieliebak:
Error-preserving Automatic Speech Recognition of Young English Learners' Language. 6444-6454 - Yuxiang Cai, Qiao Liu, Yanglei Gan, Run Lin, Changlin Li, Xueyi Liu, Da Luo, Jiaye Yang:
DiFiNet: Boundary-Aware Semantic Differentiation and Filtration Network for Nested Named Entity Recognition. 6455-6471 - Yi Feng, Chuanyi Li, Vincent Ng:
Legal Case Retrieval: A Survey of the State of the Art. 6472-6485 - Tianqi Zhong, Zhaoyi Li, Quan Wang, Linqi Song, Ying Wei, Defu Lian, Zhendong Mao:
Benchmarking and Improving Compositional Generalization of Multi-aspect Controllable Text Generation. 6486-6517 - Chengyue Wu, Yukang Gan, Yixiao Ge, Zeyu Lu, Jiahao Wang, Ye Feng, Ying Shan, Ping Luo:
LLaMA Pro: Progressive LLaMA with Block Expansion. 6518-6537 - Feiteng Mu, Wenjie Li:
Generating Contrastive Narratives Using the Brownian Bridge Process for Narrative Coherence Learning. 6538-6555 - Feiteng Mu, Wenjie Li:
A Causal Approach for Counterfactual Reasoning in Narratives. 6556-6569 - Matthias Lindemann, Alexander Koller, Ivan Titov:
SIP: Injecting a Structural Inductive Bias into a Seq2Seq Model by Simulation. 6570-6587 - Jesujoba Alabi, Marius Mosbach, Matan Eyal, Dietrich Klakow, Mor Geva:
The Hidden Space of Transformer Language Adapters. 6588-6607 - Nafis Irtiza Tripto, Saranya Venkatraman, Dominik Macko, Róbert Móro, Ivan Srba, Adaku Uchendu, Thai Le, Dongwon Lee:
A Ship of Theseus: Curious Cases of Paraphrasing in LLM-Generated Texts. 6608-6625 - Guan-Ting Lin, Cheng-Han Chiang, Hung-yi Lee:
Advancing Large Language Models to Capture Varied Speaking Styles and Respond Properly in Spoken Conversations. 6626-6642 - Prayushi Faldu, Indrajit Bhattacharya, Mausam:
RetinaQA: A Robust Knowledge Base Question Answering Model for both Answerable and Unanswerable Questions. 6643-6656 - Zhaowei Li, Qi Xu, Dong Zhang, Hang Song, Yiqing Cai, Qi Qi, Ran Zhou, Junting Pan, Zefeng Li, Vu Tu, Zhida Huang, Tao Wang:
GroundingGPT: Language Enhanced Multi-modal Grounding Model. 6657-6678 - Islam Eldifrawi, Shengrui Wang, Amine Trabelsi:
Automated Justification Production for Claim Veracity in Fact Checking: A Survey on Architectures and Approaches. 6679-6692 - Carlos Mullov, Ngoc-Quan Pham, Alexander Waibel:
Decoupled Vocabulary Learning Enables Zero-Shot Translation from Unseen Languages. 6693-6709 - Rui Kong, Yuanchun Li, Qingtian Feng, Weijun Wang, Xiaozhou Ye, Ye Ouyang, Linghe Kong, Yunxin Liu:
SwapMoE: Serving Off-the-shelf MoE-based Large Language Models with Tunable Memory Budget. 6710-6720 - Iñigo Alonso, Eneko Agirre, Mirella Lapata:
PixT3: Pixel-based Table-To-Text Generation. 6721-6736 - Gal Yona, Roee Aharoni, Mor Geva:
Narrowing the Knowledge Evaluation Gap: Open-Domain Question Answering with Multi-Granularity Answers. 6737-6751 - Enora Rice, Ali Marashian, Luke Gessler, Alexis Palmer, Katharina von der Wense:
TAMS: Translation-Assisted Morphological Segmentation. 6752-6765 - Mohammad Abdullah Matin Khan, M. Saiful Bari, Xuan Do Long, Weishi Wang, Md. Rizwan Parvez, Shafiq Joty:
XCodeEval: An Execution-based Large Scale Multilingual Multitask Benchmark for Code Understanding, Generation, Translation and Retrieval. 6766-6805 - Haochen Tan, Zhijiang Guo, Zhan Shi, Lu Xu, Zhili Liu, Yunlong Feng, Xiaoguang Li, Yasheng Wang, Lifeng Shang, Qun Liu, Linqi Song:
ProxyQA: An Alternative Framework for Evaluating Long-Form Text Generation with Large Language Models. 6806-6827 - Giovanni Monea, Maxime Peyrard, Martin Josifoski, Vishrav Chaudhary, Jason Eisner, Emre Kiciman, Hamid Palangi, Barun Patra, Robert West:
A Glitch in the Matrix? Locating and Detecting Language Model Grounding with Fakepedia. 6828-6844 - Yue Fan, Jing Gu, Kaiwen Zhou, Qianqi Yan, Shan Jiang, Ching-Chen Kuo, Yang Zhao, Xinze Guan, Xin Wang:
Muffin or Chihuahua? Challenging Multimodal Large Language Models with Multipanel VQA. 6845-6863 - Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Yong Dai, Hongming Zhang, Zhenzhong Lan, Dong Yu:
WebVoyager: Building an End-to-End Web Agent with Large Multimodal Models. 6864-6890 - Senyu Li, Bradley Hauer, Ning Shi, Grzegorz Kondrak:
Translation-based Lexicalization Generation and Lexical Gap Detection: Application to Kinship Terms. 6891-6900 - Ritam Dutt, Zhen Wu, Jiaxin Shi, Divyanshu Sheth, Prakhar Gupta, Carolyn P. Rosé:
Leveraging Machine-Generated Rationales to Facilitate Social Meaning Detection in Conversations. 6901-6929 - Jacob Daniel Devasier, Yogesh Gurjar, Chengkai Li:
Robust Frame-Semantic Models with Lexical Unit Trees and Negative Samples. 6930-6941 - Yuan Yang, Siheng Xiong, Ali Payani, Ehsan Shareghi, Faramarz Fekri:
Harnessing the Power of Large Language Models for Natural Language to First-Order Logic Translation. 6942-6959 - Siddhartha Jain, Xiaofei Ma, Anoop Deoras, Bing Xiang:
Lightweight reranking for language model generations. 6960-6984 - Mike D'Arcy, Alexis Ross, Erin Bransom, Bailey Kuehl, Jonathan Bragg, Tom Hope, Doug Downey:
ARIES: A Corpus of Scientific Paper Edits Made in Response to Peer Reviews. 6985-7001 - Peter Hase, Mohit Bansal, Peter Clark, Sarah Wiegreffe:
The Unreasonable Effectiveness of Easy Training Data for Hard Tasks. 7002-7024 - Zhihan Zhang, Dong-Ho Lee, Yuwei Fang, Wenhao Yu, Mengzhao Jia, Meng Jiang, Francesco Barbieri:
PLUG: Leveraging Pivot Language in Cross-Lingual Instruction Tuning. 7025-7046 - Inderjeet Nair, Lu Wang:
MIDGARD: Self-Consistency Using Minimum Description Length for Structured Commonsense Reasoning. 7047-7065 - Justin Chih-Yao Chen, Swarnadeep Saha, Mohit Bansal:
ReConcile: Round-Table Conference Improves Reasoning via Consensus among Diverse LLMs. 7066-7085 - Hanqi Yan, Qinglin Zhu, Xinyu Wang, Lin Gui, Yulan He:
Mirror: Multiple-perspective Self-Reflection Method for Knowledge-rich Reasoning. 7086-7103 - Maria Antoniak, Joel Mire, Maarten Sap, Elliott Ash, Andrew Piper:
Where Do People Tell Stories Online? Story Detection Across Online Communities. 7104-7130 - Yuanhe Tian, Fei Xia, Yan Song:
Large Language Models Are No Longer Shallow Parsers. 7131-7142 - Yuanhe Tian, Fei Xia, Yan Song:
Dialogue Summarization with Mixture of Experts based on Large Language Models. 7143-7155 - Yuanhe Tian, Ruyi Gan, Yan Song, Jiaxing Zhang, Yongdong Zhang:
ChiMed-GPT: A Chinese Medical Large Language Model with Full Training Regime and Better Alignment to Human Preferences. 7156-7173 - Daking Rai, Ziyu Yao:
An Investigation of Neuron Activation as a Unified Lens to Explain Chain-of-Thought Eliciting Arithmetic Reasoning of LLMs. 7174-7193 - Hang Jiang, Xiajie Zhang, Robert Mahari, Daniel Kessler, Eric Ma, Tal August, Irene Li, Alex Pentland, Yoon Kim, Deb Roy, Jad Kabbara:
Leveraging Large Language Models for Learning Complex Legal Concepts through Storytelling. 7194-7219 - Guanyi Chen, Fahime Same, Kees van Deemter:
Intrinsic Task-based Evaluation for Referring Expression Generation. 7220-7231 - Qisheng Hu, Geonsik Moon, Hwee Tou Ng:
From Moments to Milestones: Incremental Timeline Summarization Leveraging Large Language Models. 7232-7246 - Kunxun Qi, Jianfeng Du, Hai Wan:
End-to-end Learning of Logical Rules for Enhancing Document-level Relation Extraction. 7247-7263 - Qingkai Fang, Shaolei Zhang, Zhengrui Ma, Min Zhang, Yang Feng:
Can We Achieve High-quality Direct Speech-to-Speech Translation without Parallel Speech Data? 7264-7277 - Jiaqi Wang, Zhenxi Song, Zhengyu Ma, Xipeng Qiu, Min Zhang, Zhiguo Zhang:
Enhancing EEG-to-Text Decoding through Transferable Representations from Pre-trained Contrastive EEG-Text Masked Autoencoder. 7278-7292 - Longwei Zou, Qingyang Wang, Han Zhao, Jiangang Kong, Yi Yang, Yangdong Deng:
CQIL: Inference Latency Optimization with Concurrent Computation of Quasi-Independent Layers. 7293-7307 - Do Long, Yiran Zhao, Hannah Brown, Yuxi Xie, James Xu Zhao, Nancy F. Chen, Kenji Kawaguchi, Michael Shieh, Junxian He:
Prompt Optimization via Adversarial In-Context Learning. 7308-7327 - Zhichao Wang, Yuanzhe Chen, Xinsheng Wang, Lei Xie, Yuping Wang:
StreamVoice: Streamable Context-Aware Language Modeling for Real-time Zero-Shot Voice Conversion. 7328-7338 - Zhengliang Shi, Shuo Zhang, Weiwei Sun, Shen Gao, Pengjie Ren, Zhumin Chen, Zhaochun Ren:
Generate-then-Ground in Retrieval-Augmented Generation for Multi-hop Question Answering. 7339-7353 - Jordan Voas, David Harwath, Raymond Mooney:
Multimodal Contextualized Semantic Parsing from Speech. 7354-7369 - Alireza Salemi, Sheshera Mysore, Michael Bendersky, Hamed Zamani:
LaMP: When Large Language Models Meet Personalization. 7370-7392 - Li Lucy, Suchin Gururangan, Luca Soldaini, Emma Strubell, David Bamman, Lauren Klein, Jesse Dodge:
AboutMe: Using Self-Descriptions in Webpages to Document the Effects of English Pretraining Data Filters. 7393-7420 - Ge Bai, Jie Liu, Xingyuan Bu, Yancheng He, Jiaheng Liu, Zhanhui Zhou, Zhuoran Lin, Wenbo Su, Tiezheng Ge, Bo Zheng, Wanli Ouyang:
MT-Bench-101: A Fine-Grained Benchmark for Evaluating Large Language Models in Multi-Turn Dialogues. 7421-7454 - Tianyu Chen, Yiming Zhang, Guoxin Yu, Dapeng Zhang, Li Zeng, Qing He, Xiang Ao:
EFSA: Towards Event-Level Financial Sentiment Analysis. 7455-7467 - Alexander Wan, Eric Wallace, Dan Klein:
What Evidence Do Language Models Find Convincing? 7468-7484 - Qihang Ai, Jiafan Li, Jincheng Dai, Jianwu Zhou, Lemao Liu, Haiyun Jiang, Shuming Shi:
Advancement in Graph Understanding: A Multimodal Benchmark and Fine-Tuning of Vision-Language Models. 7485-7501 - Dongkeun Yoon, Joel Jang, Sungdong Kim, Seungone Kim, Sheikh Shafayat, Minjoon Seo:
LangBridge: Multilingual Reasoning Without Multilingual Supervision. 7502-7522 - Siyuan Wang, Zhongyu Wei, Yejin Choi, Xiang Ren:
Can LLMs Reason with Rules? Logic Scaffolding for Stress-Testing and Improving LLMs. 7523-7543 - Xueliang Zhao, Xinting Huang, Wei Bi, Lingpeng Kong:
SEGO: Sequential Subgoal Optimization for Mathematical Problem-Solving. 7544-7565 - Xuhui Jiang, Yinghan Shen, Zhichao Shi, Chengjin Xu, Wei Li, Zixuan Li, Jian Guo, Huawei Shen, Yuanzhuo Wang:
Unlocking the Power of Large Language Models for Entity Alignment. 7566-7583 - Yifan Song, Da Yin, Xiang Yue, Jie Huang, Sujian Li, Bill Yuchen Lin:
Trial and Error: Exploration-Based Trajectory Optimization of LLM Agents. 7584-7600 - Luong Quoc Trung, Xinbo Zhang, Zhanming Jie, Peng Sun, Xiaoran Jin, Hang Li:
ReFT: Reasoning with Reinforced Fine-Tuning. 7601-7614 - Yunxin Li, Xinyu Chen, Baotian Hu, Haoyuan Shi, Min Zhang:
Cognitive Visual-Language Mapper: Advancing Multimodal Comprehension with Enhanced Visual Knowledge Alignment. 7615-7626 - Zijian Feng, Hanzhang Zhou, Kezhi Mao, Zixiao Zhu:
FreeCtrl: Constructing Control Centers with Feedforward Layers for Learning-Free Controllable Text Generation. 7627-7640 - Yuxuan Liu, Tianchi Yang, Shaohan Huang, Zihan Zhang, Haizhen Huang, Furu Wei, Weiwei Deng, Feng Sun, Qi Zhang:
HD-Eval: Aligning Large Language Model Evaluators Through Hierarchical Criteria Decomposition. 7641-7660 - Shengjie Li, Vincent Ng:
Conundrums in Cross-Prompt Automated Essay Scoring: Making Sense of the State of the Art. 7661-7681 - Flor Miriam Plaza del Arco, Amanda Cercas Curry, Alba Cercas Curry, Gavin Abercrombie, Dirk Hovy:
Angry Men, Sad Women: Large Language Models Reflect Gendered Stereotypes in Emotion Attribution. 7682-7696 - Lorenzo Paletto, Valerio Basile, Roberto Esposito:
Label Augmentation for Zero-Shot Hierarchical Text Classification. 7697-7706 - Yiqun Zhang, Fanheng Kong, Peidong Wang, Shuang Sun, Lingshuai Wang, Shi Feng, Daling Wang, Yifei Zhang, Kaisong Song:
STICKERCONV: Generating Multimodal Empathetic Responses from Scratch. 7707-7733 - Tong Zheng, Bei Li, Huiwen Bao, Tong Xiao, JingBo Zhu:
EIT: Enhanced Interactive Transformer. 7734-7751 - Yavuz Faruk Bakman, Duygu Nur Yaldiz, Baturalp Buyukates, Chenyang Tao, Dimitrios Dimitriadis, Salman Avestimehr:
MARS: Meaning-Aware Response Scoring for Uncertainty Estimation in Generative LLMs. 7752-7767 - Rocktim Jyoti Das, Simeon Emilov Hristov, Haonan Li, Dimitar Dimitrov, Ivan Koychev, Preslav Nakov:
EXAMS-V: A Multi-Discipline Multilingual Multimodal Exam Benchmark for Evaluating Vision Language Models. 7768-7791 - Huiming Wang, Liying Cheng, Wenxuan Zhang, De Wen Soh, Lidong Bing:
Order-Agnostic Data Augmentation for Few-Shot Named Entity Recognition. 7792-7807 - Yiyi Chen, Heather C. Lent, Johannes Bjerva:
Text Embedding Inversion Security for Multilingual Language Models. 7808-7827 - Keming Lu, Bowen Yu, Chang Zhou, Jingren Zhou:
Large Language Models are Superpositions of All Characters: Attaining Arbitrary Role-play via Self-Alignment. 7828-7840 - Chuyi Kong, Yaxin Fan, Xiang Wan, Feng Jiang, Benyou Wang:
PlatoLM: Teaching LLMs in Multi-Round Dialogue via a User Simulator. 7841-7863 - Jiaxi Yang, Binyuan Hui, Min Yang, Jian Yang, Junyang Lin, Chang Zhou:
Synthesizing Text-to-SQL Data from Weak and Strong LLMs. 7864-7875 - Parag Jain, Andreea Marzoca, Francesco Piccinno:
STRUCTSUM Generation for Faster Text Comprehension. 7876-7896 - Yu Zhao, Yuanbin Qu, Konrad Staniszewski, Szymon Tworkowski, Wei Liu, Piotr Milos, Yuxiang Wu, Pasquale Minervini:
Analysing The Impact of Sequence Composition on Language Model Pre-Training. 7897-7912 - Yilong Chen, Guoxia Wang, Junyuan Shang, Shiyao Cui, Zhenyu Zhang, Tingwen Liu, Shuohuan Wang, Yu Sun, Dianhai Yu, Hua Wu:
NACL: A General and Effective KV Cache Eviction Framework for LLM at Inference Time. 7913-7926 - Kexin Wang, Jiahong Zhang, Yong Ren, Man Yao, Di Shang, Bo Xu, Guoqi Li:
SpikeVoice: High-Quality Text-to-Speech Via Efficient Spiking Neural Network. 7927-7940 - Yunbin Tu, Liang Li, Li Su, Zheng-Jun Zha, Chenggang Yan, Qingming Huang:
Context-aware Difference Distilling for Multi-change Captioning. 7941-7956 - Wei Cheng, Yuhan Wu, Wei Hu:
Dataflow-Guided Retrieval Augmentation for Repository-Level Code Completion. 7957-7977 - Haohao Luo, Yang Deng, Ying Shen, See-Kiong Ng, Tat-Seng Chua:
Chain-of-Exemplar: Enhancing Distractor Generation for Multimodal Educational Question Generation. 7978-7993 - Chun Liu, Hongguang Zhang, Kainan Zhao, Xinghai Ju, Lin Yang:
LLMEmbed: Rethinking Lightweight LLM's Genuine Function in Text Classification. 7994-8004 - Yilong Chen, Junyuan Shang, Zhenyu Zhang, Shiyao Cui, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu:
LEMON: Reviving Stronger and Smaller LMs from Larger LMs with Linear Parameter Fusion. 8005-8019 - Tengfei Yu, Xuebo Liu, Liang Ding, Kehai Chen, Dacheng Tao, Min Zhang:
Speech Sense Disambiguation: Tackling Homophone Ambiguity in End-to-End Speech Translation. 8020-8035 - Yiran Wang, Masao Utiyama:
To be Continuous, or to be Discrete, Those are Bits of Questions. 8036-8049 - Flavio Schneider, Ojasv Kamal, Zhijing Jin, Bernhard Schölkopf:
Moûsai: Efficient Text-to-Music Diffusion Models. 8050-8068 - Hengrui Gu, Kaixiong Zhou, Xiaotian Han, Ninghao Liu, Ruobing Wang, Xin Wang:
PokeMQA: Programmable knowledge editing for Multi-hop Question Answering. 8069-8083 - Prince Jha, Raghav Jain, Konika Mandal, Aman Chadha, Sriparna Saha, Pushpak Bhattacharyya:
MemeGuard: An LLM and VLM-based Framework for Advancing Content Moderation via Meme Intervention. 8084-8104 - Jacob Carlson, Tom Bryan, Melissa Dell:
Efficient OCR for Building a Diverse Digital History. 8105-8115 - Zongru Wu, Zhuosheng Zhang, Pengzhou Cheng, Gongshen Liu:
Acquiring Clean Language Models from Backdoor Poisoned Datasets by Downscaling Frequency Space. 8116-8134 - Ziwei Ji, Yuzhe Gu, Wenwei Zhang, Chengqi Lyu, Dahua Lin, Kai Chen:
ANAH: Analytical Annotation of Hallucinations in Large Language Models. 8135-8158 - Wensheng Lu, Jianxun Lian, Wei Zhang, Guanghua Li, Mingyang Zhou, Hao Liao, Xing Xie:
Aligning Large Language Models for Controllable Recommendations. 8159-8172 - Haeun Yu, Pepa Atanasova, Isabelle Augenstein:
Revealing the Parametric Knowledge of Language Models: A Unified Framework for Attribution Methods. 8173-8186 - Kai Lv, Yuqing Yang, Tengxiao Liu, Qipeng Guo, Xipeng Qiu:
Full Parameter Fine-tuning for Large Language Models with Limited Resources. 8187-8198 - Qiguang Chen, Libo Qin, Jin Zhang, Zhi Chen, Xiao Xu, Wanxiang Che:
M³CoT: A Novel Benchmark for Multi-Domain Multi-step Multi-modal Chain-of-Thought. 8199-8221 - Longze Chen, Ziqiang Liu, Wanwei He, Yinhe Zheng, Hao Sun, Yunshui Li, Run Luo, Min Yang:
Long Context is Not Long at All: A Prospector of Long-Dependency Data for Large Language Models. 8222-8234 - Keqi Deng, Philip C. Woodland:
Label-Synchronous Neural Transducer for E2E Simultaneous Speech Translation. 8235-8251 - Yunseon Choi, Sangmin Bae, Seonghyun Ban, Minchan Jeong, Chuheng Zhang, Lei Song, Li Zhao, Jiang Bian, Kee-Eung Kim:
Hard Prompts Made Interpretable: Sparse Entropy Regularization for Prompt Tuning with RL. 8252-8271 - Louis Mahon, Mirella Lapata:
A Modular Approach for Multimodal Summarization of TV Shows. 8272-8291 - Alex Wilf, Sihyun Shawn Lee, Paul Pu Liang, Louis-Philippe Morency:
Think Twice: Perspective-Taking Improves Large Language Models' Theory-of-Mind Capabilities. 8292-8308 - Michael Krumdick, Rik Koncel-Kedziorski, Viet Dac Lai, Varshini Reddy, Charles Lovering, Chris Tanner:
BizBench: A Quantitative Reasoning Benchmark for Business and Finance. 8309-8332 - Takumi Takada, Yuma Suzuki, Hiroki Takushima, Hayato Tanoue, Haruki Sato, Aiswariya Manoj Kumar, Hiroki Nishihara, Takayuki Hori, Kazuya Ueki:
Direct Metric Optimization for Image Captioning through Reward-Weighted Augmented Data Utilization. 8333-8346 - Eftekhar Hossain, Omar Sharif, Mohammed Moshiul Hoque, Sarah Masud Preum:
Deciphering Hate: Identifying Hateful Memes and Their Targets. 8347-8359 - Yichen Jiang, Xiang Zhou, Mohit Bansal:
Inducing Systematicity in Transformers by Attending to Structurally Quantized Embeddings. 8360-8383 - Shir Ashury-Tahan, Ariel Gera, Benjamin Sznajder, Leshem Choshen, Liat Ein-Dor, Eyal Shnarch:
Label-Efficient Model Selection for Text Generation. 8384-8402 - Jin Yao, Eli Chien, Minxin Du, Xinyao Niu, Tianhao Wang, Zezhou Cheng, Xiang Yue:
Machine Unlearning of Pre-trained Large Language Models. 8403-8419 - Francesco Ortu, Zhijing Jin, Diego Doimo, Mrinmaya Sachan, Alberto Cazzaniga, Bernhard Schölkopf:
Competition of Mechanisms: Tracing How Language Models Handle Facts and Counterfactuals. 8420-8436 - Sebastian Joseph, Lily Chen, Jan Trienes, Hannah Louisa Göke, Monika Coers, Wei Xu, Byron C. Wallace, Junyi Jessy Li:
FactPICO: Factuality Evaluation for Plain Language Summarization of Medical Evidence. 8437-8464 - Yinhao Bai, Yalan Xie, Xiaoyi Liu, Yuhua Zhao, Zhixin Han, Mengting Hu, Hang Gao, Renhong Cheng:
BvSP: Broad-view Soft Prompting for Few-Shot Aspect Sentiment Quad Prediction. 8465-8482 - Yu Fu, Yufei Li, Wen Xiao, Cong Liu, Yue Dong:
Safety Alignment in NLP Tasks: Weakly Aligned Summarization as an In-Context Attack. 8483-8502 - Subba Reddy Oota, Emin Çelik, Fatma Deniz, Mariya Toneva:
Speech language models lack important brain-relevant semantics. 8503-8528 - Dongsheng Wang, Natraj Raman, Mathieu Sibue, Zhiqiang Ma, Petr Babkin, Simerjot Kaur, Yulong Pei, Armineh Nourbakhsh, Xiaomo Liu:
DocLLM: A Layout-Aware Generative Language Model for Multimodal Document Understanding. 8529-8548 - Qilong Wu, Varun Chandrasekaran:
Bypassing LLM Watermarks with Color-Aware Substitutions. 8549-8581 - Yanda Chen, Chen Zhao, Zhou Yu, Kathleen R. McKeown, He He:
Parallel Structures in Pre-training Data Yield In-Context Learning. 8582-8592 - Hainiu Xu, Runcong Zhao, Lixing Zhu, Jinhua Du, Yulan He:
OpenToM: A Comprehensive Benchmark for Evaluating Theory-of-Mind Reasoning Capabilities of Large Language Models. 8593-8623 - Phillip Rust, Bowen Shi, Skyler Wang, Necati Cihan Camgöz, Jean Maillard:
Towards Privacy-Aware Sign Language Translation at Scale. 8624-8641 - Haoxiang Wang, Yong Lin, Wei Xiong, Rui Yang, Shizhe Diao, Shuang Qiu, Han Zhao, Tong Zhang:
Arithmetic Control of LLMs for Diverse User Preferences: Directional Preference Alignment with Multi-Objective Rewards. 8642-8655 - Yinghui Li, Zishan Xu, Shaoshen Chen, Haojing Huang, Yangning Li, Shirong Ma, Yong Jiang, Zhongli Li, Qingyu Zhou, Hai-Tao Zheng, Ying Shen:
Towards Real-World Writing Assistance: A Chinese Character Checking Benchmark with Faked and Misspelled Characters. 8656-8668 - Jing Huang, Zhengxuan Wu, Christopher Potts, Mor Geva, Atticus Geiger:
RAVEL: Evaluating Interpretability Methods on Disentangling Language Model Representations. 8669-8687 - Zekun Li, Zhiyu Chen, Mike Ross, Patrick Huber, Seungwhan Moon, Zhaojiang Lin, Xin Dong, Adithya Sagar, Xifeng Yan, Paul A. Crook:
Large Language Models as Zero-shot Dialogue State Tracker through Function Calling. 8688-8704 - Syrine Krichene, Francesco Piccinno, Fangyu Liu, Julian Eisenschlos:
Faithful Chart Summarization with ChaTS-Pi. 8705-8723 - Cheng Niu, Xingguang Wang, Xuxin Cheng, Juntong Song, Tong Zhang:
Enhancing Dialogue State Tracking Models through LLM-backed User-Agents Simulation. 8724-8741 - Ting-Chih Chen, Chia-Wei Tang, Chris Thomas:
MetaSumPerceiver: Multimodal Multi-Document Evidence Summarization for Fact-Checking. 8742-8757 - Zixuan Li, Yutao Zeng, Yuxin Zuo, Weicheng Ren, Wenxuan Liu, Miao Su, Yucan Guo, Yantao Liu, Xiang Li, Zhilei Hu, Long Bai, Wei Li, Yidan Liu, Pan Yang, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng:
KnowCoder: Coding Structured Knowledge into LLMs for Universal Information Extraction. 8758-8779 - Yanming Liu, Xinyue Peng, Tianyu Du, Jianwei Yin, Weihao Liu, Xuhong Zhang:
ERA-CoT: Improving Chain-of-Thought through Entity Relationship Analysis. 8780-8794 - Yang Deng, Xuan Zhang, Wenxuan Zhang, Yifei Yuan, See-Kiong Ng, Tat-Seng Chua:
On the Multi-turn Instruction Following for Conversational Web Agents. 8795-8812 - Shihan Deng, Weikai Xu, Hongda Sun, Wei Liu, Tao Tan, Jianfeng Liu, Ang Li, Jian Luan, Bin Wang, Rui Yan, Shuo Shang:
Mobile-Bench: An Evaluation Benchmark for LLM-based Mobile Agents. 8813-8831 - Chen Zhang, Mingxu Tao, Quzhe Huang, Jiuheng Lin, Zhibin Chen, Yansong Feng:
MC²: Towards Transparent and Culturally-Aware NLP for Minority Languages in China. 8832-8850 - Shoutao Guo, Shaolei Zhang, Yang Feng:
Decoder-only Streaming Transformer for Simultaneous Translation. 8851-8864 - Zhexin Zhang, Junxiao Yang, Pei Ke, Fei Mi, Hongning Wang, Minlie Huang:
Defending Large Language Models Against Jailbreaking Attacks Through Goal Prioritization. 8865-8887 - Tristan Thrush, Jared Moore, Miguel Monares, Christopher Potts, Douwe Kiela:
I am a Strange Dataset: Metalinguistic Tests for Language Models. 8888-8907 - Shaolei Zhang, Tian Yu, Yang Feng:
TruthX: Alleviating Hallucinations by Editing Large Language Models in Truthful Space. 8908-8949 - Le Zhuo, Zewen Chi, Minghao Xu, Heyan Huang, Jianan Zhao, Heqi Zheng, Conghui He, Xian-Ling Mao, Wentao Zhang:
ProtLLM: An Interleaved Protein-Language LLM with Protein-as-Word Pre-Training. 8950-8963 - Shaolei Zhang, Qingkai Fang, Shoutao Guo, Zhengrui Ma, Min Zhang, Yang Feng:
StreamSpeech: Simultaneous Speech-to-Speech Translation with Multi-task Learning. 8964-8986 - Tianjie Ju, Yijin Chen, Xinwei Yuan, Zhuosheng Zhang, Wei Du, Yubin Zheng, Gongshen Liu:
Investigating Multi-Hop Factual Shortcuts in Knowledge Editing of Large Language Models. 8987-9001 - Abdelrahman Zayed, Gonçalo Mordido, Ioana Baldini, Sarath Chandar:
Why Don't Prompt-Based Fairness Metrics Correlate? 9002-9019 - Manuel Tonneau, Pedro Vitor Quinta de Castro, Karim Lasri, Ibrahim Farouq, Lakshmi Subramanian, Víctor Orozco-Olvera, Samuel Fraiberger:
NaijaHate: Evaluating Hate Speech Detection on Nigerian Twitter Using Representative Data. 9020-9040 - Zhe Chen, Heyang Liu, Wenyi Yu, Guangzhi Sun, Hongcheng Liu, Ji Wu, Chao Zhang, Yu Wang, Yanfeng Wang:
M³AV: A Multimodal, Multigenre, and Multipurpose Audio-Visual Academic Lecture Dataset. 9041-9060 - Nakyeong Yang, Taegwan Kang, Stanley Jungkyu Choi, Honglak Lee, Kyomin Jung:
Mitigating Biases for Instruction-following Language Models via Bias Neurons Elimination. 9061-9073 - Yufeng Zhang, Jianxing Yu, Yanghui Rao, Libin Zheng, Qinliang Su, Huaijie Zhu, Jian Yin:
Domain Adaptation for Subjective Induction Questions Answering on Products by Adversarial Disentangled Learning. 9074-9089 - Keqin Peng, Liang Ding, Yancheng Yuan, Xuebo Liu, Min Zhang, Yuanxin Ouyang, Dacheng Tao:
Revisiting Demonstration Selection Strategies in In-Context Learning. 9090-9101 - Mingyu Zheng, Xinwei Feng, Qingyi Si, Qiaoqiao She, Zheng Lin, Wenbin Jiang, Weiping Wang:
Multimodal Table Understanding. 9102-9124 - Huang Lei, Jiaming Guo, Guanhua He, Xishan Zhang, Rui Zhang, Shaohui Peng, Shaoli Liu, Tianshi Chen:
Ex3: Automatic Novel Writing by Extracting, Excelsior and Expanding. 9125-9146 - Mayur Patidar, Riya Sawhney, Avinash Kumar Singh, Biswajit Chatterjee, Mausam, Indrajit Bhattacharya:
Few-shot Transfer Learning for Knowledge Base Question Answering: Fusing Supervised Models with In-Context Learning. 9147-9165 - Liang Chen, Yatao Bian, Yang Deng, Deng Cai, Shuaiyi Li, Peilin Zhao, Kam-Fai Wong:
WatME: Towards Lossless Watermarking Through Lexical Redundancy. 9166-9180 - Yang Zhang, Keqin Bao, Ming Yan, Wenjie Wang, Fuli Feng, Xiangnan He:
Text-like Encoding of Collaborative Information in Large Language Models for Recommendation. 9181-9191 - Yuhao Wang, Yusheng Liao, Heyang Liu, Hongcheng Liu, Yanfeng Wang, Yu Wang:
MM-SAP: A Comprehensive Benchmark for Assessing Self-Awareness of Multimodal Large Language Models in Perception. 9192-9205 - Jiachun Li, Pengfei Cao, Chenhao Wang, Zhuoran Jin, Yubo Chen, Daojian Zeng, Kang Liu, Jun Zhao:
Focus on Your Question! Interpreting and Mitigating Toxic CoT Problems in Commonsense Reasoning. 9206-9230 - Yi Liu, Xiangyu Liu, Xiangrong Zhu, Wei Hu:
Multi-Aspect Controllable Text Generation with Disentangled Counterfactual Augmentation. 9231-9253 - Byeonghu Na, Suhyeon Jo, Yeongmin Kim, Il-Chul Moon:
Reward-based Input Construction for Cross-document Relation Extraction. 9254-9270 - Guangjun Zhang, Hu Zhang, Yujie Wang, Ru Li, Hongye Tan, Jiye Liang:
Hyperspherical Multi-Prototype with Optimal Transport for Event Argument Extraction. 9271-9284 - Wenyan Li, Jiaang Li, Rita Ramos, Raphael Tang, Desmond Elliott:
Understanding Retrieval Robustness for Retrieval-augmented Image Captioning. 9285-9299 - Huijie Yao, Wengang Zhou, Hao Zhou, Houqiang Li:
Semi-Supervised Spoken Language Glossification. 9300-9312 - Kanzhi Cheng, Qiushi Sun, Yougang Chu, Fangzhi Xu, Yantao Li, Jianbing Zhang, Zhiyong Wu:
SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents. 9313-9332 - Yakir Yehuda, Itzik Malkiel, Oren Barkan, Jonathan Weill, Royi Ronen, Noam Koenigstein:
InterrogateLLM: Zero-Resource Hallucination Detection in LLM-Generated Answers. 9333-9347 - Yu Sun, Keyu Chen, Shujie Wang, Peiji Li, Qipeng Guo, Hang Yan, Xipeng Qiu, Xuanjing Huang, Dahua Lin:
F-Eval: Assessing Fundamental Abilities with Refined Evaluation Methods. 9348-9369 - Philipp Mondorf, Barbara Plank:
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning. 9370-9402 - Maria Lerner, Florian E. Dorner, Elliott Ash, Naman Goel:
Whose Preferences? Differences in Fairness Preferences and Their Impact on the Fairness of AI Utilizing Human Feedback. 9403-9425 - Peiyi Wang, Lei Li, Zhihong Shao, Runxin Xu, Damai Dai, Yifei Li, Deli Chen, Yu Wu, Zhifang Sui:
Math-Shepherd: Verify and Reinforce LLMs Step-by-step without Human Annotations. 9426-9439 - Peiyi Wang, Lei Li, Liang Chen, Zefan Cai, Dawei Zhu, Binghuai Lin, Yunbo Cao, Lingpeng Kong, Qi Liu, Tianyu Liu, Zhifang Sui:
Large Language Models are not Fair Evaluators. 9440-9450 - Meiqi Chen, Yubo Ma, Kaitao Song, Yixin Cao, Yan Zhang, Dongsheng Li:
Improving Large Language Models in Event Relation Logical Prediction. 9451-9478 - Dingyi Yang, Chunru Zhan, Ziheng Wang, Biao Wang, Tiezheng Ge, Bo Zheng, Qin Jin:
Synchronized Video Storytelling: Generating Video Narrations with Structured Storyline. 9479-9493 - Wenting Chen, Linlin Shen, Jingyang Lin, Jiebo Luo, Xiang Li, Yixuan Yuan:
Fine-Grained Image-Text Alignment in Medical Imaging Enables Explainable Cyclic Image-Report Generation. 9494-9509 - Zehui Chen, Weihua Du, Wenwei Zhang, Kuikun Liu, Jiangning Liu, Miao Zheng, Jingming Zhuo, Songyang Zhang, Dahua Lin, Kai Chen, Feng Zhao:
T-Eval: Evaluating the Tool Utilization Capability of Large Language Models Step by Step. 9510-9529 - Xinyu Hu, Mingqi Gao, Sen Hu, Yang Zhang, Yicheng Chen, Teng Xu, Xiaojun Wan:
Are LLM-based Evaluators Confusing NLG Quality Criteria? 9530-9570 - Jiazhan Feng, Chongyang Tao, Xiubo Geng, Tao Shen, Can Xu, Guodong Long, Dongyan Zhao, Daxin Jiang:
Synergistic Interplay between Search and Large Language Models for Information Retrieval. 9571-9583 - Yaroslav Aksenov, Nikita Balagansky, Sofia Maria Lo Cicero Vaina, Boris Shaposhnikov, Alexey Gorbatovski, Daniil Gavrilov:
Linear Transformers with Learnable Kernel Functions are Better In-Context Models. 9584-9597 - Tong Liu, Iza Skrjanec, Vera Demberg:
Temperature-scaling surprisal estimates improve fit to human reading times - but does it do so for the "right reasons"? 9598-9619 - Ameer Saadat-Yazdi, Nadin Kökciyan:
Beyond Recognising Entailment: Formalising Natural Language Inference from an Argumentative Perspective. 9620-9636 - Jun Zhan, Junqi Dai, Jiasheng Ye, Yunhua Zhou, Dong Zhang, Zhigeng Liu, Xin Zhang, Ruibin Yuan, Ge Zhang, Linyang Li, Hang Yan, Jie Fu, Tao Gui, Tianxiang Sun, Yu-Gang Jiang, Xipeng Qiu:
AnyGPT: Unified Multimodal LLM with Discrete Sequence Modeling. 9637-9662 - Zixin Chen, Hongzhan Lin, Ziyang Luo, Mingfei Cheng, Jing Ma, Guang Chen:
CofiPara: A Coarse-to-fine Paradigm for Multimodal Sarcasm Target Identification with Large Multimodal Models. 9663-9687 - Aiwei Liu, Haoping Bai, Zhiyun Lu, Xiang Kong, Xiaoming Wang, Jiulong Shan, Meng Cao, Lijie Wen:
Direct Large Language Model Alignment Through Self-Rewarding Contrastive Prompt Distillation. 9688-9712 - Michael Toker, Hadas Orgad, Mor Ventura, Dana Arad, Yonatan Belinkov:
Diffusion Lens: Interpreting Text Encoders in Text-to-Image Pipelines. 9713-9728 - Yuchong Sun, Che Liu, Kun Zhou, Jinwen Huang, Ruihua Song, Xin Zhao, Fuzheng Zhang, Di Zhang, Kun Gai:
Parrot: Enhancing Multi-Turn Instruction Following for Large Language Models. 9729-9750 - Ruiqi Li, Yu Zhang, Yongqi Wang, Zhiqing Hong, Rongjie Huang, Zhou Zhao:
Robust Singing Voice Transcription Serves Synthesis. 9751-9766 - Tianyu Chen, Lin Li, Liuchuan Zhu, Zongyang Li, Xueqing Liu, Guangtai Liang, Qianxiang Wang, Tao Xie:
VulLibGen: Generating Names of Vulnerability-Affected Packages via a Large Language Model. 9767-9780 - Donglei Yu, Xiaomian Kang, Yuchen Liu, Yu Zhou, Chengqing Zong:
Self-Modifying State Modeling for Simultaneous Machine Translation. 9781-9795 - Jiaqi Chen, Bingqian Lin, Ran Xu, Zhenhua Chai, Xiaodan Liang, Kwan-Yee Kenneth Wong:
MapGPT: Map-Guided Prompting with Adaptive Path Planning for Vision-and-Language Navigation. 9796-9810 - Yifei Wang, Dizhan Xue, Shengjie Zhang, Shengsheng Qian:
BadAgent: Inserting and Activating Backdoor Attacks in LLM Agents. 9811-9827 - Hongda Sun, Weikai Xu, Wei Liu, Jian Luan, Bin Wang, Shuo Shang, Ji-Rong Wen, Rui Yan:
DetermLR: Augmenting LLM-based Logical Reasoning from Indeterminacy to Determinacy. 9828-9862 - Robert Mahari, Dominik Stammbach, Elliott Ash, Alex Pentland:
LePaRD: A Large-Scale Dataset of Judicial Citations to Precedent. 9863-9877 - Giacomo Frisoni, Alessio Cocchieri, Alex Presepi, Gianluca Moro, Zaiqiao Meng:
To Generate or to Retrieve? On the Effectiveness of Artificial Contexts for Medical Open-Domain Question Answering. 9878-9919 - Alena Fenogenova, Artem Chervyakov, Nikita Martynov, Anastasia Kozlova, Maria Tikhonova, Albina Akhmetgareeva, Anton A. Emelyanov, Denis Shevelev, Pavel Lebedev, Leonid Sinev, Ulyana Isaeva, Katerina Kolomeytseva, Daniil Moskovskiy, Elizaveta Goncharova, Nikita Savushkin, Polina Mikhailova, Anastasia Minaeva, Denis Dimitrov, Alexander Panchenko, Sergey Markov:
MERA: A Comprehensive LLM Evaluation in Russian. 9920-9948 - Jie Zhao, Ziyu Guan, Cai Xu, Wei Zhao, Yue Jiang:
SC2: Towards Enhancing Content Preservation and Style Consistency in Long Text Style Transfer. 9949-9960 - Guanghui Qin, Corby Rosset, Ethan C. Chau, Nikhil Rao, Benjamin Van Durme:
Dodo: Dynamic Contextual Compression for Decoder-only LMs. 9961-9975 - Shilong Pan, Zhiliang Tian, Liang Ding, Haoqi Zheng, Zhen Huang, Zhihua Wen, Dongsheng Li:
POMP: Probability-driven Meta-graph Prompter for LLMs in Low-resource Unsupervised Neural Machine Translation. 9976-9992 - Miao Li, Ming-Bin Chen, Bo Tang, Shengbin Hou, Pengyu Wang, Haiying Deng, Zhiyu Li, Feiyu Xiong, Keming Mao, Cheng Peng, Yi Luo:
NewsBench: A Systematic Evaluation Framework for Assessing Editorial Capabilities of Large Language Models in Chinese Journalism. 9993-10014 - Shuaijie She, Wei Zou, Shujian Huang, Wenhao Zhu, Xiang Liu, Xiang Geng, Jiajun Chen:
MAPO: Advancing Multilingual Reasoning through Multilingual-Alignment-as-Preference Optimization. 10015-10027 - Feiteng Fang, Yuelin Bai, Shiwen Ni, Min Yang, Xiaojun Chen, Ruifeng Xu:
Enhancing Noise Robustness of Retrieval-Augmented Language Models with Adaptive Adversarial Training. 10028-10039 - Jing Nathan Yan, Tianqi Liu, Justin T. Chiu, Jiaming Shen, Zhen Qin, Yue Yu, Charumathi Lakshmanan, Yair Kurzion, Alexander M. Rush, Jialu Liu, Michael Bendersky:
Predicting Text Preference Via Structured Comparative Reasoning. 10040-10060 - Lvxiaowei Xu, Zhilin Gong, Jianhua Dai, Tianxiang Wang, Ming Cai, Jiawei Peng:
CoELM: Construction-Enhanced Language Modeling. 10061-10081 - Songju Lei, Xize Cheng, Mengjiao Lyu, Jianqiao Hu, Jintao Tan, Runlin Liu, Lingyu Xiong, Tao Jin, Xiandong Li, Zhou Zhao:
Uni-Dubbing: Zero-Shot Speech Synthesis from Visual Articulation. 10082-10099 - Miles Williams, Nikolaos Aletras:
On the Impact of Calibration Data in Post-training Quantization and Pruning. 10100-10118 - Prerna Agarwal, Nishant Kumar, Srikanta Bedathur:
SymKGQA: Few-Shot Knowledge Graph Question Answering via Symbolic Program Generation and Execution. 10119-10140 - Yibin Lei, Di Wu, Tianyi Zhou, Tao Shen, Yu Cao, Chongyang Tao, Andrew Yates:
Meta-Task Prompting Elicits Embeddings from Large Language Models. 10141-10157 - Miao Li, Jey Han Lau, Eduard H. Hovy:
A Sentiment Consolidation Framework for Meta-Review Generation. 10158-10177 - Chengjie Zhou, Bobo Li, Hao Fei, Fei Li, Chong Teng, Donghong Ji:
Revisiting Structured Sentiment Analysis as Latent Dependency Graph Parsing. 10178-10191 - Yifan Peng, Yui Sudo, Muhammad Shakeel, Shinji Watanabe:
OWSM-CTC: An Open Encoder-Only Speech Foundation Model for Speech Recognition, Translation, and Language Identification. 10192-10209 - Sohee Yang, Elena Gribovskaya, Nora Kassner, Mor Geva, Sebastian Riedel:
Do Large Language Models Latently Perform Multi-Hop Reasoning? 10210-10229 - Chengpeng Li, Zheng Yuan, Hongyi Yuan, Guanting Dong, Keming Lu, Jiancan Wu, Chuanqi Tan, Xiang Wang, Chang Zhou:
MuggleMath: Assessing the Impact of Query and Response Augmentation on Math Reasoning. 10230-10258 - Ankita Gupta, Ethan Zuckerman, Brendan T. O'Connor:
Harnessing Toulmin's theory for zero-shot argument explication. 10259-10276 - Gaetan Latouche, Marc-André Carbonneau, Benjamin Swanson:
BinaryAlign: Word Alignment as Binary Sequence Labeling. 10277-10288 - Tiancheng Hu, Nigel Collier:
Quantifying the Persona Effect in LLM Simulations. 10289-10307 - Nishant Balepur, Abhilasha Ravichander, Rachel Rudinger:
Artifacts or Abduction: How Do LLMs Answer Multiple-Choice Questions Without the Question? 10308-10330 - Zhenrui Yue, Huimin Zeng, Lanyu Shang, Yifan Liu, Yang Zhang, Dong Wang:
Retrieval Augmented Fact Verification by Synthesizing Contrastive Arguments. 10331-10343 - Nigel Fernandez, Alexander Scarlatos, Andrew S. Lan:
SyllabusQA: A Course Logistics Question Answering Dataset. 10344-10369 - Yilin Wen, Zifeng Wang, Jimeng Sun:
MindMap: Knowledge Graph Prompting Sparks Graph of Thoughts in Large Language Models. 10370-10388 - Daniel Braun, Florian Matthes:
AGB-DE: A Corpus for the Automated Legal Assessment of Clauses in German Consumer Contracts. 10389-10405 - Charlotte Siska, Katerina Marazopoulou, Melissa Ailem, James Bono:
Examining the robustness of LLM evaluation to the distributional assumptions of benchmarks. 10406-10421 - Eric Pasewark, Kyle Montgomery, Kefei Duan, Dawn Song, Chenguang Wang:
Re-Tuning: Overcoming the Compositionality Limits of Large Language Models with Recursive Tuning. 10422-10437 - Zixuan Ke, Weize Kong, Cheng Li, Mingyang Zhang, Qiaozhu Mei, Michael Bendersky:
Bridging the Preference Gap between Retrievers and LLMs. 10438-10451 - Siheng Xiong, Ali Payani, Ramana Kompella, Faramarz Fekri:
Large Language Models Can Learn Temporal Reasoning. 10452-10470 - Raphaël Mouravieff, Benjamin Piwowarski, Sylvain Lamprier:
Learning Relational Decomposition of Queries for Question Answering from Tables. 10471-10485 - Dun-Ming Huang, Pol van Rijn, Ilia Sucholutsky, Raja Marjieh, Nori Jacoby:
Characterizing Similarities and Divergences in Conversational Tones in Humans and LLMs by Sampling with People. 10486-10512 - Theodore Zhao, Mu Wei, Joseph Preston, Hoifung Poon:
Pareto Optimal Learning for Estimating Large Language Model Errors. 10513-10529 - Victor Agostinelli, Max Wild, Matthew Raffel, Kazi Ahmed Asif Fuad, Lizhong Chen:
Simul-LLM: A Framework for Exploring High-Quality Simultaneous Translation with Large Language Models. 10530-10541 - Bochuan Cao, Yuanpu Cao, Lu Lin, Jinghui Chen:
Defending Against Alignment-Breaking Attacks via Robustly Aligned LLM. 10542-10560 - Guanming Xiong, Junwei Bao, Wen Zhao:
Interactive-KBQA: Multi-Turn Interactions for Knowledge Base Question Answering with Large Language Models. 10561-10582 - Boshi Wang, Hao Fang, Jason Eisner, Benjamin Van Durme, Yu Su:
LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error. 10583-10604 - Hao Zhao, Zihan Qiu, Huijia Wu, Zili Wang, Zhaofeng He, Jie Fu:
HyperMoE: Towards Better Mixture of Experts via Transferring Among Experts. 10605-10618 - Wenhao Liu, Xiaohua Wang, Muling Wu, Tianlong Li, Changze Lv, Zixuan Ling, Jianhao Zhu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang:
Aligning Large Language Models with Human Preferences through Representation Engineering. 10619-10638 - Fuwen Luo, Chi Chen, Zihao Wan, Zhaolu Kang, Qidong Yan, Yingjie Li, Xiaolong Wang, Siyu Wang, Ziyue Wang, Xiaoyue Mi, Peng Li, Ning Ma, Maosong Sun, Yang Liu:
CODIS: Benchmarking Context-dependent Visual Comprehension for Multimodal Large Language Models. 10639-10659 - Chen Huang, Yiping Jin, Ilija Ilievski, Wenqiang Lei, Jiancheng Lv:
ARAIDA: Analogical Reasoning-Augmented Interactive Data Annotation. 10660-10675 - Qihao Yang, Yong Li, Xuelin Wang, Fu Lee Wang, Tianyong Hao:
PolCLIP: A Unified Image-Text Word Sense Disambiguation Model via Generating Multimodal Complementary Representations. 10676-10690 - An Tang, Xiuzhen Zhang, Minh Dinh, Erik Cambria:
Prompted Aspect Key Point Analysis for Quantitative Review Summarization. 10691-10708 - Qiming Xie, Zengzhi Wang, Yi Feng, Rui Xia:
Ask Again, Then Fail: Large Language Models' Vacillations in Judgment. 10709-10745 - Tong Zhang, Peixin Qin, Yang Deng, Chen Huang, Wenqiang Lei, Junhong Liu, Dingnan Jin, Hongru Liang, Tat-Seng Chua:
CLAMBER: A Benchmark of Identifying and Clarifying Ambiguous Information Needs in Large Language Models. 10746-10766 - Junlin Lee, Yequan Wang, Jing Li, Min Zhang:
Multimodal Reasoning with Multimodal Knowledge Graph. 10767-10782 - Rikui Huang, Wei Wei, Xiaoye Qu, Shengzhe Zhang, Dangyang Chen, Yu Cheng:
Confidence is not Timeless: Modeling Temporal Validity for Rule-based Temporal Knowledge Graph Forecasting. 10783-10794 - Weihong Du, Jia Liu, Zujie Wen, Dingnan Jin, Hongru Liang, Wenqiang Lei:
CARE: A Clue-guided Assistant for CSRs to Read User Manuals. 10795-10811 - Dingzirui Wang, Longxu Dou, Xuanliang Zhang, Qingfu Zhu, Wanxiang Che:
Enhancing Numerical Reasoning with the Guidance of Reliable Reasoning Processes. 10812-10828 - Weihong Du, Wenrui Liao, Hongru Liang, Wenqiang Lei:
PAGED: A Benchmark for Procedural Graphs Extraction from Documents. 10829-10846 - Ying Zhou, Ben He, Le Sun:
Navigating the Shadows: Unveiling Effective Disturbances for Modern AI Content Detectors. 10847-10861 - Cheng Niu, Yuanhao Wu, Juno Zhu, Siliang Xu, Kashun Shum, Randy Zhong, Juntong Song, Tong Zhang:
RAGTruth: A Hallucination Corpus for Developing Trustworthy Retrieval-Augmented Language Models. 10862-10878 - Junyi Li, Jie Chen, Ruiyang Ren, Xiaoxue Cheng, Xin Zhao, Jian-Yun Nie, Ji-Rong Wen:
The Dawn After the Dark: An Empirical Study on Factuality Hallucination in Large Language Models. 10879-10899 - Qihuang Zhong, Liang Ding, Li Shen, Juhua Liu, Bo Du, Dacheng Tao:
Revisiting Knowledge Distillation for Autoregressive Language Models. 10900-10913 - Yunlong Liang, Fandong Meng, Jiaan Wang, Jinan Xu, Yufeng Chen, Jie Zhou:
Continual Learning with Semi-supervised Contrastive Distillation for Incremental Neural Machine Translation. 10914-10928 - Rongjie Huang, Chunlei Zhang, Yongqi Wang, Dongchao Yang, Jinchuan Tian, Zhenhui Ye, Luping Liu, Zehan Wang, Ziyue Jiang, Xuankai Chang, Jiatong Shi, Chao Weng, Zhou Zhao, Dong Yu:
Make-A-Voice: Revisiting Voice Large Language Models as Scalable Multilingual and Multitask Learners. 10929-10942 - Shih-Cheng Huang, Pin-Zu Li, Yu-Chi Hsu, Kuang-Ming Chen, Yu-Tung Lin, Shih-Kai Hsiao, Richard Tzong-Han Tsai, Hung-yi Lee:
Chat Vector: A Simple Approach to Equip LLMs with Instruction Following and Model Alignment in New Languages. 10943-10959 - Neal Mangaokar, Ashish Hooda, Jihye Choi, Shreyas Chandrashekaran, Kassem Fawaz, Somesh Jha, Atul Prakash:
PRP: Propagating Universal Perturbations to Attack Large Language Model Guard-Rails. 10960-10976 - Bo Yuan, Yulin Chen, Yin Zhang, Wei Jiang:
Hide and Seek in Noise Labels: Noise-Robust Collaborative Active Learning with LLMs-Powered Assistance. 10977-11011 - Yinya Huang, Ruixin Hong, Hongming Zhang, Wei Shao, Zhicheng Yang, Dong Yu, Changshui Zhang, Xiaodan Liang, Linqi Song:
CLOMO: Counterfactual Logical Modification with Large Language Models. 11012-11034 - Qi Shi, Han Cui, Haofeng Wang, Qingfu Zhu, Wanxiang Che, Ting Liu:
Exploring Hybrid Question Answering via Program-based Prompting. 11035-11046 - Harman Singh, Nitish Gupta, Shikhar Bharadwaj, Dinesh Tewari, Partha Talukdar:
IndicGenBench: A Multilingual Benchmark to Evaluate Generation Capabilities of LLMs on Indic Languages. 11047-11073 - Rui Ying, Mengting Hu, Jianfeng Wu, Yalan Xie, Xiaoyi Liu, Zhunheng Wang, Ming Jiang, Hang Gao, Linlin Zhang, Renhong Cheng:
Simple but Effective Compound Geometric Operations for Temporal Knowledge Graph Completion. 11074-11086 - Yikun Wang, Rui Zheng, Liang Ding, Qi Zhang, Dahua Lin, Dacheng Tao:
Uncertainty Aware Learning for Language Model Alignment. 11087-11099 - Ying-Chun Lin, Jennifer Neville, Jack W. Stokes, Longqi Yang, Tara Safavi, Mengting Wan, Scott Counts, Siddharth Suri, Reid Andersen, Xiaofeng Xu, Deepak Gupta, Sujay Kumar Jauhar, Xia Song, Georg Buscher, Saurabh Tiwary, Brent J. Hecht, Jaime Teevan:
Interpretable User Satisfaction Estimation for Conversational Systems with Large Language Models. 11100-11115 - Jiawei Li, Yizhe Yang, Yu Bai, Xiaofeng Zhou, Yinghao Li, Huashan Sun, Yuhang Liu, Xingpeng Si, Yuhao Ye, Yixiao Wu, Yiguan Lin, Bin Xu, Ren Bowen, Chong Feng, Yang Gao, Heyan Huang:
Fundamental Capabilities of Large Language Models and their Applications in Domain Scenarios: A Survey. 11116-11141 - Yejin Bang, Delong Chen, Nayeon Lee, Pascale Fung:
Measuring Political Bias in Large Language Models: What Is Said and How It Is Said. 11142-11159 - Yuhan Chen, Ang Lv, Ting-En Lin, Changyu Chen, Yuchuan Wu, Fei Huang, Yongbin Li, Rui Yan:
Fortify the Shortest Stave in Attention: Enhancing Context Awareness of Large Language Models for Effective Tool Use. 11160-11174 - Haoyi Wu, Kewei Tu:
Layer-Condensed KV Cache for Efficient Inference of Large Language Models. 11175-11188 - Yuanchi Zhang, Yile Wang, Zijun Liu, Shuo Wang, Xiaolong Wang, Peng Li, Maosong Sun, Yang Liu:
Enhancing Multilingual Capabilities of Large Language Models through Self-Distillation from Resource-Rich Languages. 11189-11204 - Jiaxing Sun, Weiquan Huang, Jiang Wu, Chenya Gu, Wei Li, Songyang Zhang, Hang Yan, Conghui He:
Benchmarking Chinese Commonsense Reasoning of LLMs: From Chinese-Specifics to Reasoning-Memorization Correlations. 11205-11228 - Ziyue Wang, Chi Chen, Yiqi Zhu, Fuwen Luo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Maosong Sun, Yang Liu:
Browse and Concentrate: Comprehending Multimodal Content via Prior-LLM Context Fusion. 11229-11245 - Chi Chen, Yiyang Du, Zheng Fang, Ziyue Wang, Fuwen Luo, Peng Li, Ming Yan, Ji Zhang, Fei Huang, Maosong Sun, Yang Liu:
Model Composition for Multimodal Large Language Models. 11246-11262 - Jun Zhang, Jue Wang, Huan Li, Lidan Shou, Ke Chen, Gang Chen, Sharad Mehrotra:
Draft & Verify: Lossless Large Language Model Acceleration via Self-Speculative Decoding. 11263-11282 - Xuxin Cheng, Ziyu Yao, Yifei Xin, Hao An, Hongxiang Li, Yaowei Li, Yuexian Zou:
Soul-Mix: Enhancing Multimodal Machine Translation with Manifold Mixup. 11283-11294 - Changjiang Gao, Jixing Li, Jiajun Chen, Shujian Huang:
Measuring Meaning Composition in the Human Brain with Composition Scores from Large Language Models. 11295-11308 - Krissanee Kamthawee, Can Udomcharoenchaikit, Sarana Nutanong:
MIST: Mutual Information Maximization for Short Text Clustering. 11309-11324 - Zhonghua Zheng, Lizi Liao, Yang Deng, Libo Qin, Liqiang Nie:
Self-chats from Large Language Models Make Small Emotional Support Chatbot Better. 11325-11345 - Janghwan Lee, Seongmin Park, Sukjin Hong, Minsoo Kim, Du-Seong Chang, Jungwook Choi:
Improving Conversational Abilities of Quantized Large Language Models via Direct Preference Alignment. 11346-11364 - Tianqing Fang, Zeming Chen, Yangqiu Song, Antoine Bosselut:
Complex Reasoning over Logical Queries on Commonsense Knowledge Graphs. 11365-11384 - Ziwei Chai, Guoyin Wang, Jing Su, Tianjie Zhang, Xuanwen Huang, Xuwu Wang, Jingjing Xu, Jianbo Yuan, Hongxia Yang, Fei Wu, Yang Yang:
An Expert is Worth One Token: Synergizing Multiple Expert LLMs as Generalist via Expert Token Routing. 11385-11396 - Constanza Fierro, Reinald Kim Amplayo, Fantine Huot, Nicola De Cao, Joshua Maynez, Shashi Narayan, Mirella Lapata:
Learning to Plan and Generate Text with Citations. 11397-11417 - Florian Le Bronnec, Alexandre Verine, Benjamin Négrevergne, Yann Chevaleyre, Alexandre Allauzen:
Exploring Precision and Recall to assess the quality and diversity of LLMs. 11418-11441 - Sangkyu Lee, Sungdong Kim, Ashkan Yousefpour, Minjoon Seo, Kang Min Yoo, Youngjae Yu:
Aligning Large Language Models by On-Policy Self-Judgment. 11442-11459 - Abhinav Joshi, Shounak Paul, Akshat Sharma, Pawan Goyal, Saptarshi Ghosh, Ashutosh Modi:
IL-TUR: Benchmark for Indian Legal Text Understanding and Reasoning. 11460-11499 - Mouxiang Chen, Hao Tian, Zhongxin Liu, Xiaoxue Ren, Jianling Sun:
JumpCoder: Go Beyond Autoregressive Coder via Online Modification. 11500-11520 - Shivalika Singh, Freddie Vargus, Daniel D'souza, Börje Karlsson, Abinaya Mahendiran, Wei-Yin Ko, Herumb Shandilya, Jay Patel, Deividas Mataciunas, Laura O'Mahony, Mike Zhang, Ramith Hettiarachchi, Joseph Wilson, Marina Machado, Luisa Souza Moura, Dominik Krzeminski, Hakimeh Fadaei, Irem Ergün, Ifeoma Okoh, Aisha Alaagib, Oshan Mudannayake, Zaid Alyafeai, Minh Vu Chien, Sebastian Ruder, Surya Guthikonda, Emad A. Alghamdi, Sebastian Gehrmann, Niklas Muennighoff, Max Bartolo, Julia Kreutzer, Ahmet Üstün, Marzieh Fadaee, Sara Hooker:
Aya Dataset: An Open-Access Collection for Multilingual Instruction Tuning. 11521-11567 - Anwoy Chatterjee, Eshaan Tanwar, Subhabrata Dutta, Tanmoy Chakraborty:
Language Models can Exploit Cross-Task In-context Learning for Data-Scarce Novel Tasks. 11568-11587 - David Ponce, Thierry Etchegoyhen, Jesus Calleja, Harritxu Gete:
Split and Rephrase with Large Language Models. 11588-11607 - Lu Ye, Ze Tao, Yong Huang, Yang Li:
ChunkAttention: Efficient Self-Attention with Prefix-Aware KV Cache and Two-Phase Partition. 11608-11620 - Xiao Liu, Xuanyu Lei, Shengyuan Wang, Yue Huang, Andrew Feng, Bosi Wen, Jiale Cheng, Pei Ke, Yifan Xu, Weng Lam Tam, Xiaohan Zhang, Lichao Sun, Xiaotao Gu, Hongning Wang, Jing Zhang, Minlie Huang, Yuxiao Dong, Jie Tang:
AlignBench: Benchmarking Chinese Alignment of Large Language Models. 11621-11640 - Weixiang Zhao, Shilong Wang, Yulin Hu, Yanyan Zhao, Bing Qin, Xuanyu Zhang, Qing Yang, Dongliang Xu, Wanxiang Che:
SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models. 11641-11661 - Yulong Mao, Kaiyu Huang, Changhao Guan, Ganglin Bao, Fengran Mo, Jinan Xu:
DoRA: Enhancing Parameter-Efficient Fine-Tuning with Dynamic Rank Distribution. 11662-11675 - Jiaan Wang, Yunlong Liang, Zengkui Sun, Yuxuan Cao, Jiarong Xu, Fandong Meng:
Cross-Lingual Knowledge Editing in Large Language Models. 11676-11686 - Anar Yeginbergen, Maite Oronoz, Rodrigo Agerri:
Argument Mining in Data Scarce Settings: Cross-lingual Transfer and Few-shot Techniques. 11687-11699 - Jiaxin Wen, Ruiqi Zhong, Pei Ke, Zhihong Shao, Hongning Wang, Minlie Huang:
Learning Task Decomposition to Assist Humans in Competitive Programming. 11700-11723 - Yijian Lu, Aiwei Liu, Dianzhi Yu, Jingjing Li, Irwin King:
An Entropy-based Text Watermarking Detection Method. 11724-11735 - Huachi Zhou, Shuang Zhou, Hao Chen, Ninghao Liu, Fan Yang, Xiao Huang:
Enhancing Explainable Rating Prediction through Annotated Macro Concepts. 11736-11748 - Peng Cui, Vilém Zouhar, Xiaoyu Zhang, Mrinmaya Sachan:
How to Engage your Readers? Generating Guiding Questions to Promote Active Reading. 11749-11765 - Zihao Yue, Liang Zhang, Qin Jin:
Less is More: Mitigating Multimodal Hallucination from an EOS Decision Perspective. 11766-11781 - Xinglin Wang, Yiwei Li, Shaoxiong Feng, Peiwen Yuan, Boyuan Pan, Heda Wang, Yao Hu, Kan Li:
Integrate the Essence and Eliminate the Dross: Fine-Grained Self-Consistency for Free-Form Language Generation. 11782-11794 - Siyu Tao, Lucia Donatelli, Michael Hahn:
More frequent verbs are associated with more diverse valency frames: Efficient principles at the lexicon-grammar interface. 11795-11810 - Claudia Collacciani, Giulia Rambelli, Marianna Bolognesi:
Quantifying Generalizations: Exploring the Divide Between Human and LLMs' Sensitivity to Quantification. 11811-11822 - Giulia Rambelli, Emmanuele Chersoni, Claudia Collacciani, Marianna Bolognesi:
Can Large Language Models Interpret Noun-Noun Compounds? A Linguistically-Motivated Study on Lexicalized and Novel Compounds. 11823-11835 - Quan Tu, Shilong Fan, Zihang Tian, Tianhao Shen, Shuo Shang, Xin Gao, Rui Yan:
CharacterEval: A Chinese Benchmark for Role-Playing Conversational Agent Evaluation. 11836-11850 - Yongqi Li, Wenjie Wang, Leigang Qu, Liqiang Nie, Wenjie Li, Tat-Seng Chua:
Generative Cross-Modal Retrieval: Memorizing Images in Multimodal Language Models for Retrieval and Beyond. 11851-11861 - Yice Zhang, Jie Zeng, Weiming Hu, Ziyi Wang, Shiwei Chen, Ruifeng Xu:
Self-Training with Pseudo-Label Scorer for Aspect Sentiment Quad Prediction. 11862-11875 - Rami Aly, Zhiqiang Tang, Samson Tan, George Karypis:
Learning to Generate Answers with Citations via Factual Consistency Models. 11876-11896 - Liang Wang, Nan Yang, Xiaolong Huang, Linjun Yang, Rangan Majumder, Furu Wei:
Improving Text Embeddings with Large Language Models. 11897-11916 - Tianduo Wang, Shichen Li, Wei Lu:
Self-Training with Direct Preference Optimization Improves Chain-of-Thought Reasoning. 11917-11928 - Haoyu Wang, Shuo Wang, Yukun Yan, Xujia Wang, Zhiyu Yang, Yuzhuang Xu, Zhenghao Liu, Liner Yang, Ning Ding, Xu Han, Zhiyuan Liu, Maosong Sun:
UltraLink: An Open-Source Knowledge-Enhanced Multilingual Supervised Fine-tuning Dataset. 11929-11942 - Zhenyun Deng, Michael Schlichtkrull, Andreas Vlachos:
Document-level Claim Extraction and Decontextualisation for Fact-Checking. 11943-11954 - Xiaoqi Qiu, Yongjie Wang, Xu Guo, Zhiwei Zeng, Yu Yue, Yuhong Feng, Chunyan Miao:
PairCFR: Enhancing Model Training on Paired Counterfactually Augmented Data through Contrastive Learning. 11955-11971 - Hanzhang Zhou, Junlang Qian, Zijian Feng, Hui Lu, Zixiao Zhu, Kezhi Mao:
LLMs Learn Task Heuristics from Demonstrations: A Heuristic-Driven Prompting Strategy for Document-Level Event Argument Extraction. 11972-11990 - Weihong Zhong, Xiaocheng Feng, Liang Zhao, Qiming Li, Lei Huang, Yuxuan Gu, Weitao Ma, Yuan Xu, Bing Qin:
Investigating and Mitigating the Multimodal Hallucination Snowballing in Large Vision-Language Models. 11991-12011 - Huiyuan Lai, Malvina Nissim:
mCoT: Multilingual Instruction Tuning for Reasoning Consistency in Language Models. 12012-12026 - Nikesh Gyawali, Iustin Sirbu, Tiberiu Sosea, Sarthak Khanal, Doina Caragea, Traian Rebedea, Cornelia Caragea:
GunStance: Stance Detection for Gun Control and Gun Regulation. 12027-12044 - Zdenek Kasner, Ondrej Dusek:
Beyond Traditional Benchmarks: Analyzing Behaviors of Open LLMs on Data-to-Text Generation. 12045-12072 - Min Zhang, Jianfeng He, Taoran Ji, Chang-Tien Lu:
Don't Go To Extremes: Revealing the Excessive Sensitivity and Calibration Limitations of LLMs in Implicit Hate Speech Detection. 12073-12086 - Giorgos Vernikos, Andrei Popescu-Belis:
Don't Rank, Combine! Combining Machine Translation Hypotheses Using Quality Estimation. 12087-12105 - Antonio Di Mauro, Zhao Xu, Wiem Ben Rim, Timo Sztyler, Carolin Lawrence:
Generating and Evaluating Plausible Explanations for Knowledge Graph Completion. 12106-12118 - Tejpalsingh Siledar, Swaroop Nath, Sankara Sri Raghava Ravindra Muddu, Rupasai Rangaraju, Swaprava Nath, Pushpak Bhattacharyya, Suman Banerjee, Amey Patil, Sudhanshu Singh, Muthusamy Chelliah, Nikesh Garera:
One Prompt To Rule Them All: LLMs for Opinion Summary Evaluation. 12119-12134 - Shaolin Zhu, Leiyu Pan, Bo Li, Deyi Xiong:
LANDeRMT: Dectecting and Routing Language-Aware Neurons for Selectively Finetuning LLMs to Machine Translation. 12135-12148 - Hongjie Cai, Heqing Ma, Jianfei Yu, Rui Xia:
A Joint Coreference-Aware Approach to Document-Level Target Sentiment Analysis. 12149-12160 - Qingxing Cao, Junhao Cheng, Xiaodan Liang, Liang Lin:
VisDiaHalBench: A Visual Dialogue Benchmark For Diagnosing Hallucination in Large Vision-Language Models. 12161-12176 - Yu-Zhe Shi, Haofei Hou, Zhangqian Bi, Fanxu Meng, Xiang Wei, Lecheng Ruan, Qining Wang:
AutoDSL: Automated domain-specific language design for structural representation of procedures with constraints. 12177-12214 - Berta Franzluebbers, Donald Dunagan, Milos Stanojevic, Jan Buys, John T. Hale:
Multipath parsing in the brain. 12215-12229 - Jinsung Yoon, Yanfei Chen, Sercan Ö. Arik, Tomas Pfister:
Search-Adaptor: Embedding Customization for Information Retrieval. 12230-12247 - Arash Ahmadian, Chris Cremer, Matthias Gallé, Marzieh Fadaee, Julia Kreutzer, Olivier Pietquin, Ahmet Üstün, Sara Hooker:
Back to Basics: Revisiting REINFORCE-Style Optimization for Learning from Human Feedback in LLMs. 12248-12267 - Max Ku, Dongfu Jiang, Cong Wei, Xiang Yue, Wenhu Chen:
VIEScore: Towards Explainable Metrics for Conditional Image Synthesis Evaluation. 12268-12290 - Lingling Zhou, Suzan Verberne, Gijs Wijnholds:
Tree Transformer's Disambiguation Ability of Prepositional Phrase Attachment and Garden Path Effects. 12291-12301 - Elan Markowitz, Anil Ramakrishna, Jwala Dhamala, Ninareh Mehrabi, Charith Peris, Rahul Gupta, Kai-Wei Chang, Aram Galstyan:
Tree-of-Traversals: A Zero-Shot Reasoning Algorithm for Augmenting Black-box Language Models with Knowledge Graphs. 12302-12319 - Freda Shi, Kevin Gimpel, Karen Livescu:
Structured Tree Alignment for Evaluation of (Speech) Constituency Parsing. 12320-12332 - Akshita Jha, Vinodkumar Prabhakaran, Remi Denton, Sarah Laszlo, Shachi Dave, Rida Qadri, Chandan K. Reddy, Sunipa Dev:
ViSAGe: A Global-Scale Analysis of Visual Stereotypes in Text-to-Image Generation. 12333-12347 - Xiaokang Zhang, Zijun Yao, Jing Zhang, Kaifeng Yun, Jifan Yu, Juanzi Li, Jie Tang:
Transferable and Efficient Non-Factual Content Detection via Probe Training with Offline Consistency Checking. 12348-12364 - Jiaoda Li, Yifan Hou, Mrinmaya Sachan, Ryan Cotterell:
What Do Language Models Learn in Context? The Structured Task Hypothesis. 12365-12379 - Da Yin, Faeze Brahman, Abhilasha Ravichander, Khyathi Raghavi Chandu, Kai-Wei Chang, Yejin Choi, Bill Yuchen Lin:
Agent Lumos: Unified and Modular Training for Open-Source Language Agents. 12380-12403 - Badr AlKhamissi, Muhammad N. ElNokrashy, Mai Alkhamissi, Mona T. Diab:
Investigating Cultural Alignment of Large Language Models. 12404-12422 - Wichayaporn Wongkamjan, Feng Gu, Yanze Wang, Ulf Hermjakob, Jonathan May, Brandon M. Stewart, Jonathan K. Kummerfeld, Denis Peskoff, Jordan L. Boyd-Graber:
More Victories, Less Cooperation: Assessing Cicero's Diplomacy Play. 12423-12441 - Puyuan Peng, Po-Yao Huang, Shang-Wen Li, Abdelrahman Mohamed, David Harwath:
VoiceCraft: Zero-Shot Speech Editing and Text-to-Speech in the Wild. 12442-12462 - Liam Dugan, Alyssa Hwang, Filip Trhlík, Andrew Zhu, Josh Magnus Ludan, Hainiu Xu, Daphne Ippolito, Chris Callison-Burch:
RAID: A Shared Benchmark for Robust Evaluation of Machine-Generated Text Detectors. 12463-12492 - Julia Kruk, Michela Marchini, Rijul Magu, Caleb Ziems, David Muchlinski, Diyi Yang:
Silent Signals, Loud Impact: LLMs for Word-Sense Disambiguation of Coded Dog Whistles. 12493-12509 - Franz Nowak, Anej Svete, Alexandra Butoi, Ryan Cotterell:
On the Representational Capacity of Neural Language Models with Chain-of-Thought Reasoning. 12510-12548 - Sanjana Ramprasad, Elisa Ferracane, Zachary C. Lipton:
Analyzing LLM Behavior in Dialogue Summarization: Unveiling Circumstantial Hallucination Trends. 12549-12561 - Keivan Alizadeh, Seyed-Iman Mirzadeh, Dmitry Belenko, S. Khatamifard, Minsik Cho, Carlo C. del Mundo, Mohammad Rastegari, Mehrdad Farajtabar:
LLM in a flash: Efficient Large Language Model Inference with Limited Memory. 12562-12584 - Muhammad Maaz, Hanoona Abdul Rasheed, Salman Khan, Fahad Khan:
Video-ChatGPT: Towards Detailed Video Understanding via Large Vision and Language Models. 12585-12602 - Abdul Waheed, Karima Kadaoui, Muhammad Abdul-Mageed:
To Distill or Not to Distill? On the Robustness of Robust Knowledge Distillation. 12603-12621 - Mostafa Elhoushi, Akshat Shrivastava, Diana Liskovich, Basil Hosmer, Bram Wasti, Liangzhen Lai, Anas Mahmoud, Bilge Acun, Saurabh Agarwal, Ahmed Roman, Ahmed A Aly, Beidi Chen, Carole-Jean Wu:
LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding. 12622-12642 - Amanda Cercas Curry, Giuseppe Attanasio, Zeerak Talat, Dirk Hovy:
Classist Tools: Social Class Correlates with Performance in NLP. 12643-12655 - Xianrui Zhong, Yufeng Du, Siru Ouyang, Ming Zhong, Tingfeng Luo, Qirong Ho, Hao Peng, Heng Ji, Jiawei Han:
ActionIE: Action Extraction from Scientific Literature with Programming Languages. 12656-12671 - Gaurav Verma, Rynaa Grover, Jiawei Zhou, Binny Mathew, Jordan Kraemer, Munmun De Choudhury, Srijan Kumar:
A Community-Centric Perspective for Characterizing and Detecting Anti-Asian Violence-Provoking Speech. 12672-12684 - Zhiwei Cao, Qian Cao, Yu Lu, Ningxin Peng, Luyang Huang, Shanbo Cheng, Jinsong Su:
Retaining Key Information under High Compression Ratios: Query-Guided Compressor for LLMs. 12685-12695 - Maxime Darrin, Philippe Formont, Jackie Chi Kit Cheung, Pablo Piantanida:
COSMIC: Mutual Information for Task-Agnostic Summarization Evaluation. 12696-12717 - Olivier Salaün, Frédéric Piedboeuf, Guillaume Le Berre, David Alfonso-Hermelo, Philippe Langlais:
EUROPA: A Legal Multilingual Keyphrase Generation Dataset. 12718-12736 - Maxime Darrin, Ines Arous, Pablo Piantanida, Jackie Chi Kit Cheung:
GLIMPSE: Pragmatically Informative Multi-Document Summarization for Scholarly Reviews. 12737-12752 - Fakhraddin Alwajih, El Moatez Billah Nagoudi, Gagan Bhatia, Abdelrahman Mohamed, Muhammad Abdul-Mageed:
Peacock: A Family of Arabic Multimodal Large Language Models and Benchmarks. 12753-12776 - João Bordalo, Vasco Ramos, Rodrigo Valerio, Diogo Glória-Silva, Yonatan Bitton, Michal Yarom, Idan Szpektor, João Magalhães:
Generating Coherent Sequences of Visual Illustrations for Real-World Manual Tasks. 12777-12797 - Ife Adebara, AbdelRahim A. Elmadany, Muhammad Abdul-Mageed:
Cheetah: Natural Language Generation for 517 African Languages. 12798-12823 - Yilun Zhao, Lyuhao Chen, Arman Cohan, Chen Zhao:
TaPERA: Enhancing Faithfulness and Interpretability in Long-Form Table QA by Content Planning and Execution-based Reasoning. 12824-12840 - Yilun Zhao, Hongjun Liu, Yitao Long, Rui Zhang, Chen Zhao, Arman Cohan:
KnowledgeFMath: A Knowledge-Intensive Math Reasoning Dataset in Finance Domains. 12841-12858 - Kinjal Basu, Ibrahim Abdelaziz, Subhajit Chaudhury, Soham Dan, Maxwell Crouse, Asim Munawar, Vernon Austel, Sadhana Kumaravel, Vinod Muthusamy, Pavan Kapanipathi, Luis A. Lastras:
API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs. 12859-12870 - Hanqing Wang, Bowen Ping, Shuo Wang, Xu Han, Yun Chen, Zhiyuan Liu, Maosong Sun:
LoRA-Flow: Dynamic LoRA Fusion for Large Language Models in Generative Tasks. 12871-12882 - Quzhe Huang, Zhenwei An, Nan Zhuang, Mingxu Tao, Chen Zhang, Yang Jin, Kun Xu, Liwei Chen, Songfang Huang, Yansong Feng:
Harder Task Needs More Experts: Dynamic Routing in MoE Models. 12883-12895 - HyoJung Han, Mohamed Anwar, Juan Pino, Wei-Ning Hsu, Marine Carpuat, Bowen Shi, Changhan Wang:
XLAVS-R: Cross-Lingual Audio-Visual Speech Representation Learning for Noise-Robust Speech Perception. 12896-12911 - Ruiyi Wang, Haofei Yu, Wenxin Sharon Zhang, Zhengyang Qi, Maarten Sap, Yonatan Bisk, Graham Neubig, Hao Zhu:
SOTOPIA-π: Interactive Learning of Socially Intelligent Language Agents. 12912-12940 - Yifeng Ding, Jiawei Liu, Yuxiang Wei, Lingming Zhang:
𝒳FT: Unlocking the Power of Code Instruction Tuning by Simply Merging Upcycled Mixture-of-Experts. 12941-12955 - Tuc Nguyen, Thai Le:
Generalizability of Mixture of Domain-Specific Adapters from the Lens of Signed Weight Directions and its Application to Effective Model Pruning. 12956-12973 - Zejiang Shen, Hunter Lang, Bailin Wang, Yoon Kim, David A. Sontag:
Learning to Decode Collaboratively with Multiple Language Models. 12974-12990 - Weihang Su, Yichen Tang, Qingyao Ai, Zhijing Wu, Yiqun Liu:
DRAGIN: Dynamic Retrieval Augmented Generation based on the Real-time Information Needs of Large Language Models. 12991-13013 - Zhaochen Su, Juntao Li, Jun Zhang, Tong Zhu, Xiaoye Qu, Pan Zhou, Yan Bowen, Yu Cheng, Min Zhang:
Living in the Moment: Can Large Language Models Grasp Co-Temporal Reasoning? 13014-13033 - Pei Ke, Bosi Wen, Andrew Feng, Xiao Liu, Xuanyu Lei, Jiale Cheng, Shengyuan Wang, Aohan Zeng, Yuxiao Dong, Hongning Wang, Jie Tang, Minlie Huang:
CritiqueLLM: Towards an Informative Critique Generation Model for Evaluation of Large Language Model Generation. 13034-13054 - Junzhe Chen, Xuming Hu, Shuodi Liu, Shiyu Huang, Wei-Wei Tu, Zhaofeng He, Lijie Wen:
LLMArena: Assessing Capabilities of Large Language Models in Dynamic Multi-Agent Environments. 13055-13077 - Sahithya Ravi, Patrick Huber, Akshat Shrivastava, Vered Shwartz, Arash Einolghozati:
Small But Funny: A Feedback-Driven Approach to Humor Distillation. 13078-13090 - Fangzhi Xu, Zhiyong Wu, Qiushi Sun, Siyu Ren, Fei Yuan, Shuai Yuan, Qika Lin, Yu Qiao, Jun Liu:
Symbol-LLM: Towards Foundational Symbol-centric Interface For Large Language Models. 13091-13116 - Akash Ghosh, Mohit Tomar, Abhisek Tiwari, Sriparna Saha, Jatin Salve, Setu Sinha:
From Sights to Insights: Towards Summarization of Multimodal Clinical Documents. 13117-13129 - Jiaxin Wang, Lingling Zhang, Wee Sun Lee, Yujie Zhong, Liwei Kang, Jun Liu:
When Phrases Meet Probabilities: Enabling Open Relation Extraction with Cooperating Large Language Models. 13130-13147 - Ján Cegin, Branislav Pecher, Jakub Simko, Ivan Srba, Mária Bieliková, Peter Brusilovsky:
Effects of diversity incentives on sample diversity and downstream model performance in LLM-based text augmentation. 13148-13171 - Yassine El Kheir, Hamdy Mubarak, Ahmed Ali, Shammur Absar Chowdhury:
Beyond Orthography: Automatic Recovery of Short Vowels and Dialectal Sounds in Arabic. 13172-13184 - Proyag Pal, Alexandra Birch, Kenneth Heafield:
Document-Level Machine Translation with Large-Scale Public Parallel Corpora. 13185-13197 - Nur Lan, Emmanuel Chemla, Roni Katzir:
Bridging the Empirical-Theoretical Gap in Neural Network Formal Language Learning Using Minimum Description Length. 13198-13210 - Kevin Du, Vésteinn Snæbjarnarson, Niklas Stoehr, Jennifer C. White, Aaron Schein, Ryan Cotterell:
Context versus Prior Knowledge in Language Models. 13211-13235 - Yinghao Li, Siyu Miao, Heyan Huang, Yang Gao:
Word Matters: What Influences Domain Adaptation in Summarization? 13236-13249 - Xinhang Li, Jingbo Zhou, Wei Chen, Derong Xu, Tong Xu, Enhong Chen:
Visualization Recommendation with Prompt-based Reprogramming of Large Language Models. 13250-13262 - Pranoy Panda, Ankush Agarwal, Chaitanya Devaguptapu, Manohar Kaul, Prathosh A P:
HOLMES: Hyper-Relational Knowledge Graphs for Multi-hop Question Answering using LLMs. 13263-13282 - Alexis Ross, Jacob Andreas:
Toward In-Context Teaching: Adapting Examples to Students' Misconceptions. 13283-13310 - Yuan Tian, Ruike Zhang, Nan Xu, Wenji Mao:
Bridging Word-Pair and Token-Level Metaphor Detection with Explainable Domain Mining. 13311-13325 - Jundong Xu, Hao Fei, Liangming Pan, Qian Liu, Mong-Li Lee, Wynne Hsu:
Faithful Logical Reasoning via Symbolic Chain-of-Thought. 13326-13365 - Bingfeng Chen, Qihan Ouyang, Yongqi Luo, Boyan Xu, Ruichu Cai, Zhifeng Hao:
S²GSL: Incorporating Segment to Syntactic Enhanced Graph Structure Learning for Aspect-based Sentiment Analysis. 13366-13379 - Giuliano Martinelli, Edoardo Barba, Roberto Navigli:
Maverick: Efficient and Accurate Coreference Resolution Defying Recent Trends. 13380-13394 - Tenggan Zhang, Xinjie Zhang, Jinming Zhao, Li Zhou, Qin Jin:
ESCoT: Towards Interpretable Emotional Support Dialogue Systems. 13395-13412 - Fangzhi Xu, Qika Lin, Tianzhe Zhao, Jiawei Han, Jun Liu:
PathReasoner: Modeling Reasoning Path with Equivalent Extension for Logical Question Answering. 13413-13429 - Anudeex Shetty, Yue Teng, Ke He, Qiongkai Xu:
WARDEN: Multi-Directional Backdoor Watermarks for Embedding-as-a-Service Copyright Protection. 13430-13444 - Muling Wu, Wenhao Liu, Xiaohua Wang, Tianlong Li, Changze Lv, Zixuan Ling, Jianhao Zhu, Cenyuan Zhang, Xiaoqing Zheng, Xuanjing Huang:
Advancing Parameter Efficiency in Fine-tuning via Representation Editing. 13445-13464 - Meizhi Zhong, Lemao Liu, Kehai Chen, Mingming Yang, Min Zhang:
Context Consistency between Training and Inference in Simultaneous Machine Translation. 13465-13476 - Xuanli He, Yuxiang Wu, Oana-Maria Camburu, Pasquale Minervini, Pontus Stenetorp:
Using Natural Language Explanations to Improve Robustness of In-context Learning. 13477-13499 - Jiawen Xie, Pengyu Cheng, Xiao Liang, Yong Dai, Nan Du:
Chunk, Align, Select: A Simple Long-sequence Processing Method for Transformers. 13500-13519 - Hojae Han, Jaejin Kim, Jaeseok Yoo, Youngwon Lee, Seung-won Hwang:
ArchCode: Incorporating Software Requirements in Code Generation with Large Language Models. 13520-13552 - Zixia Jia, Junpeng Li, Shichuan Zhang, Anji Liu, Zilong Zheng:
Combining Supervised Learning and Reinforcement Learning for Multi-Label Classification Tasks with Partial Labels. 13553-13569 - Chenhao Wang, Pengfei Cao, Zhuoran Jin, Yubo Chen, Daojian Zeng, Kang Liu, Jun Zhao:
MULFE: A Multi-Level Benchmark for Free Text Model Editing. 13570-13587 - Shengpeng Ji, Ziyue Jiang, Hanting Wang, Jialong Zuo, Zhou Zhao:
MobileSpeech: A Fast and High-Fidelity Framework for Mobile Zero-Shot Text-to-Speech. 13588-13600 - Muraleekrishna Gopinathan, Martin Masek, Jumana Abu-Khalaf, David Suter:
Spatially-Aware Speaker for Vision-and-Language Navigation Instruction Generation. 13601-13614 - Kechi Zhang, Ge Li, Huangzhao Zhang, Zhi Jin:
HiRoPE: Length Extrapolation for Code Models Using Hierarchical Position. 13615-13627 - Junqing He, Kunhao Pan, Xiaoqun Dong, Zhuoyang Song, LiuYiBo LiuYiBo, Qianguosun Qianguosun, Yuxin Liang, Hao Wang, Enming Zhang, Jiaxing Zhang:
Never Lost in the Middle: Mastering Long-Context Question Answering with Position-Agnostic Decompositional Training. 13628-13642 - Kechi Zhang, Jia Li, Ge Li, Xianjie Shi, Zhi Jin:
CodeAgent: Enhancing Code Generation with Tool-Integrated Agent Systems for Real-World Repo-level Coding Challenges. 13643-13658 - Ziru Chen, Michael White, Raymond J. Mooney, Ali Payani, Yu Su, Huan Sun:
When is Tree Search Useful for LLM Planning? It Depends on the Discriminator. 13659-13678 - Mihir Parmar, Nisarg Patel, Neeraj Varshney, Mutsumi Nakamura, Man Luo, Santosh Mashetty, Arindam Mitra, Chitta Baral:
LogicBench: Towards Systematic Evaluation of Logical Reasoning Ability of Large Language Models. 13679-13707 - Ruohao Guo, Wei Xu, Alan Ritter:
Meta-Tuning LLMs to Leverage Lexical Knowledge for Generalizable Language Style Understanding. 13708-13731 - Yao Dou, Isadora Krsek, Tarek Naous, Anubha Kabra, Sauvik Das, Alan Ritter, Wei Xu:
Reducing Privacy Risks in Online Self-Disclosures with Language Models. 13732-13754 - Zihao Lin, Mohammad Beigi, Hongxuan Li, Yufan Zhou, Yuxiang Zhang, Qifan Wang, Wenpeng Yin, Lifu Huang:
Navigating the Dual Facets: A Comprehensive Evaluation of Sequential Memory Editing in Large Language Models. 13755-13772 - Vaidehi Patil, Leonardo F. R. Ribeiro, Mengwen Liu, Mohit Bansal, Markus Dreyer:
REFINESUMM: Self-Refining MLLM for Generating a Multimodal Summarization Dataset. 13773-13786 - Norah Alzahrani, Hisham Abdullah Alyahya, Yazeed Alnumay, Sultan Alrashed, Shaykhah Alsubaie, Yousef Almushayqih, Faisal Mirza, Nouf Alotaibi, Nora Al-Twairesh, Areeb Alowisheq, M. Saiful Bari, Haidar Khan:
When Benchmarks are Targets: Revealing the Sensitivity of Large Language Model Leaderboards. 13787-13805 - Helia Hashemi, Jason Eisner, Corby Rosset, Benjamin Van Durme, Chris Kedzie:
LLM-Rubric: A Multidimensional, Calibrated Approach to Automated Evaluation of Natural Language Texts. 13806-13834 - Xiaomeng Zhu, Robert Frank:
LIEDER: Linguistically-Informed Evaluation for Discourse Entity Recognition. 13835-13850 - Adyasha Maharana, Dong-Ho Lee, Sergey Tulyakov, Mohit Bansal, Francesco Barbieri, Yuwei Fang:
Evaluating Very Long-Term Conversational Memory of LLM Agents. 13851-13870 - Jinghan Zhang, Xiting Wang, Yiqiao Jin, Changyu Chen, Xinhao Zhang, Kunpeng Liu:
Prototypical Reward Network for Data-Efficient Model Alignment. 13871-13884 - Jonathan Zheng, Alan Ritter, Wei Xu:
NEO-BENCH: Evaluating Robustness of Large Language Models with Neologisms. 13885-13906 - Greg Hanneman, Natawut Monaikul, Taichi Nakatani:
Impacts of Misspelled Queries on Translation and Product Search. 13907-13920 - Bilgehan Sel, Priya Shanmugasundaram, Mohammad Kachuee, Kun Zhou, Ruoxi Jia, Ming Jin:
Skin-in-the-Game: Decision Making via Multi-Stakeholder Alignment in LLMs. 13921-13959 - Enshi Zhang, Rafael Trujillo, Christian Poellabauer:
The MERSA Dataset and a Transformer-Based Approach for Speech Emotion Recognition. 13960-13970 - Jerome Ramos, Hossein A. Rahmani, Xi Wang, Xiao Fu, Aldo Lipani:
Transparent and Scrutable Recommendations Using Natural Language User Profiles. 13971-13984 - Hope Schroeder, Deb Roy, Jad Kabbara:
Fora: A corpus and framework for the study of facilitated dialogue. 13985-14001 - Yue Yu, Jiaming Shen, Tianqi Liu, Zhen Qin, Jing Nathan Yan, Jialu Liu, Chao Zhang, Michael Bendersky:
Explanation-aware Soft Ensemble Empowers Large Language Model In-context Learning. 14002-14024 - Shanshan Wang, Derek F. Wong, Jingming Yao, Lidia S. Chao:
What is the Best Way for ChatGPT to Translate Poetry? 14025-14043 - Pratyush Maini, Skyler Seto, Richard He Bai, David Grangier, Yizhe Zhang, Navdeep Jaitly:
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling. 14044-14072 - Junda Wu, Tong Yu, Xiang Chen, Haoliang Wang, Ryan A. Rossi, Sungchul Kim, Anup B. Rao, Julian J. McAuley:
DeCoT: Debiasing Chain-of-Thought for Knowledge-Intensive Tasks in Large Language Models via Causal Intervention. 14073-14087 - Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu:
Representation Learning with Conditional Information Flow Maximization. 14088-14103 - Virginia K. Felkner, Jennifer A. Thompson, Jonathan May:
GPT is Not an Annotator: The Necessity of Human Annotation in Fairness Benchmark Construction. 14104-14115 - Martin Riddell, Ansong Ni, Arman Cohan:
Quantifying Contamination in Evaluating Code Generation Capabilities of Language Models. 14116-14137 - Rishabh Bhardwaj, Do Duc Anh, Soujanya Poria:
Language Models are Homer Simpson! Safety Re-Alignment of Fine-tuned Language Models through Task Arithmetic. 14138-14149 - Alexander Spangher, Serdar Tumgoren, Ben Welsh, Nanyun Peng, Emilio Ferrara, Jonathan May:
Tracking the Newsworthiness of Public Documents. 14150-14168 - Mohammad Dehghan, Mohammad Ali Alomrani, Sunyam Bagga, David Alfonso-Hermelo, Khalil Bibi, Abbas Ghaddar, Yingxue Zhang, Xiaoguang Li, Jianye Hao, Qun Liu, Jimmy Lin, Boxing Chen, Prasanna Parthasarathi, Mahdi Biparva, Mehdi Rezagholizadeh:
EWEK-QA: Enhanced Web and Efficient Knowledge Graph Retrieval for Citation-based Question Answering Systems. 14169-14187 - Shengzhi Li, Rongyu Lin, Shichao Pei:
Multi-modal Preference Alignment Remedies Degradation of Visual Instruction Tuning on Language Models. 14188-14200 - Jiachen Zhao, Wenlong Zhao, Andrew Drozdov, Benjamin Rozonoyer, Md. Arafat Sultan, Jay-Yoon Lee, Mohit Iyyer, Andrew McCallum:
Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence Generation. 14201-14214 - Sangwon Yu, Changmin Lee, Hojin Lee, Sungroh Yoon:
Controlled Text Generation for Black-box Language Models via Score-based Progressive Editor. 14215-14237 - Danlu Chen, Freda Shi, Aditi Agarwal, Jacobo Myerston, Taylor Berg-Kirkpatrick:
LogogramNLP: Comparing Visual and Textual Representations of Ancient Logographic Writing Systems for NLP. 14238-14254 - Ming Li, Yong Zhang, Shwai He, Zhitao Li, Hongyu Zhao, Jianzong Wang, Ning Cheng, Tianyi Zhou:
Superfiltering: Weak-to-Strong Data Filtering for Fast Instruction-Tuning. 14255-14273 - Peiqi Sui, Eamon Duede, Sophie Wu, Richard Jean So:
Confabulation: The Surprising Value of Large Language Model Hallucinations. 14274-14284 - Wei Zhu, Aaron Xuxiang Tian, Congrui Yin, Yuan Ni, Xiaoling Wang, Guotong Xie:
IAPT: Instance-Aware Prompt Tuning for Large Language Models. 14285-14304 - Tingkai Liu, Yunzhe Tao, Haogeng Liu, Qihang Fan, Ding Zhou, Huaibo Huang, Ran He, Hongxia Yang:
DeVAn: Dense Video Annotation for Video-Language Models. 14305-14321 - Yi Zeng, Hongpeng Lin, Jingwen Zhang, Diyi Yang, Ruoxi Jia, Weiyan Shi:
How Johnny Can Persuade LLMs to Jailbreak Them: Rethinking Persuasion to Challenge AI Safety by Humanizing LLMs. 14322-14350 - Adithya Bhaskar, Dan Friedman, Danqi Chen:
The Heuristic Core: Understanding Subnetwork Generalization in Pretrained Language Models. 14351-14368 - Lei Li, Yuqi Wang, Runxin Xu, Peiyi Wang, Xiachong Feng, Lingpeng Kong, Qi Liu:
Multimodal ArXiv: A Dataset for Improving Scientific Comprehension of Large Vision-Language Models. 14369-14387 - Chenxin An, Shansan Gong, Ming Zhong, Xingjian Zhao, Mukai Li, Jun Zhang, Lingpeng Kong, Xipeng Qiu:
L-Eval: Instituting Standardized Evaluation for Long Context Language Models. 14388-14411 - Fahim Faisal, Orevaoghene Ahia, Aarohi Srivastava, Kabir Ahuja, David Chiang, Yulia Tsvetkov, Antonios Anastasopoulos:
DIALECTBENCH: An NLP Benchmark for Dialects, Varieties, and Closely-Related Languages. 14412-14454 - Zhouhao Sun, Li Du, Xiao Ding, Yixuan Ma, Yang Zhao, Kaitao Qiu, Ting Liu, Bing Qin:
Causal-Guided Active Learning for Debiasing Large Language Models. 14455-14469 - Qisen Yang, Zekun Wang, Honghui Chen, Shenzhi Wang, Yifan Pu, Xin Gao, Wenhao Huang, Shiji Song, Gao Huang:
PsychoGAT: A Novel Psychological Measurement Paradigm through Interactive Fiction Games with LLM Agents. 14470-14505 - Mingxin Li, Richong Zhang, Zhijie Nie:
Towards Better Understanding of Contrastive Sentence Representation Learning: A Unified Paradigm for Gradient. 14506-14521 - Tatsuki Kuribayashi, Ryo Ueda, Ryo Yoshida, Yohei Oseki, Ted Briscoe, Timothy Baldwin:
Emergent Word Order Universals from Cognitively-Motivated Language Models. 14522-14543 - Jintian Zhang, Xin Xu, Ningyu Zhang, Ruibo Liu, Bryan Hooi, Shumin Deng:
Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View. 14544-14607 - Tianshuo Zhou, Sen Mei, Xinze Li, Zhenghao Liu, Chenyan Xiong, Zhiyuan Liu, Yu Gu, Ge Yu:
MARVEL: Unlocking the Multi-Modal Capability of Dense Retrieval via Visual Module Plugin. 14608-14624 - Chun Hei Lo, Wai Lam, Hong Cheng, Guy Emerson:
Distributional Inclusion Hypothesis and Quantifications: Probing for Hypernymy in Functional Distributional Semantics. 14625-14637 - Aryaman Arora, Dan Jurafsky, Christopher Potts:
CausalGym: Benchmarking causal interpretability methods on linguistic tasks. 14638-14663 - Shangbin Feng, Weijia Shi, Yike Wang, Wenxuan Ding, Vidhisha Balachandran, Yulia Tsvetkov:
Don't Hallucinate, Abstain: Identifying LLM Knowledge Gaps via Multi-LLM Collaboration. 14664-14690 - Julie Kallini, Isabel Papadimitriou, Richard Futrell, Kyle Mahowald, Christopher Potts:
Mission: Impossible Language Models. 14691-14714 - Liang Lu, Peirong Xie, David R. Mortensen:
Semisupervised Neural Proto-Language Reconstruction. 14715-14759 - Marco Gaido, Sara Papi, Matteo Negri, Luisa Bentivogli:
Speech Translation with Speech Foundation Models and Large Language Models: What is There and What is Missing? 14760-14778 - Roshan Sharma, Suwon Shon, Mark Lindsey, Hira Dhamyal, Bhiksha Raj:
Speech vs. Transcript: Does It Matter for Human Annotators in Speech Summarization? 14779-14797 - Zihan Liao, Hang Yu, Jianguo Li, Jun Wang, Wei Zhang:
D2LLM: Decomposed and Distilled Large Language Models for Semantic Search. 14798-14814 - Salman Elgamal, Ossama Obeid, Mhd Tameem Kabbani, Go Inoue, Nizar Habash:
Arabic Diacritics in the Wild: Exploiting Opportunities for Improved Diacritization. 14815-14829 - Ivan Vykopal, Matús Pikuliak, Ivan Srba, Róbert Móro, Dominik Macko, Mária Bieliková:
Disinformation Capabilities of Large Language Models. 14830-14847 - Junhao Zheng, Shengjie Qiu, Qianli Ma:
Learn or Recall? Revisiting Incremental Learning with Pre-trained Language Models. 14848-14877 - Andreas Waldis, Yufang Hou, Iryna Gurevych:
How to Handle Different Types of Out-of-Distribution Scenarios in Computational Argumentation? A Comprehensive and Fine-Grained Field Study. 14878-14898 - Samuel Cahyawijaya, Holy Lovenia, Fajri Koto, Rifki Afina Putri, Tjeng Wawan Cenggoro, Jhonson Lee, Salsabil Maulana Akbar, Emmanuel Dave, Nuur Shadieq, Muhammad Ihza Mahendra, Dea Annisayanti Putri, Bryan Wilie, Genta Indra Winata, Alham Fikri Aji, Ayu Purwarianti, Pascale Fung:
Cendol: Open Instruction-tuned Generative Large Language Models for Indonesian Languages. 14899-14914 - Steven Bird:
Must NLP be Extractive? 14915-14929 - Xiaoyang Chen, Ben He, Hongyu Lin, Xianpei Han, Tianshu Wang, Boxi Cao, Le Sun, Yingfei Sun:
Spiral of Silence: How is Large Language Model Killing Information Retrieval? - A Case Study on Open Domain Question Answering. 14930-14951 - Julen Etxaniz, Oscar Sainz, Naiara Miguel, Itziar Aldabe, German Rigau, Eneko Agirre, Aitor Ormazabal, Mikel Artetxe, Aitor Soroa:
Latxa: An Open Language Model and Evaluation Suite for Basque. 14952-14972 - Michael Hahn, Mark Rofin:
Why are Sensitive Functions Hard for Transformers? 14973-15008 - Haoqiu Yan, Yongxin Zhu, Kai Zheng, Bing Liu, Haoyu Cao, Deqiang Jiang, Linli Xu:
Talk With Human-like Agents: Empathetic Dialogue Through Perceptible Acoustic Reception and Reaction. 15009-15022 - Indraneil Paul, Goran Glavas, Iryna Gurevych:
IRCoder: Intermediate Representations Make Language Models Robust Multilingual Code Generators. 15023-15041 - Rochelle Choenni, Anne Lauscher, Ekaterina Shutova:
The Echoes of Multilinguality: Tracing Cultural Value Shifts during Language Model Fine-tuning. 15042-15058 - Tomasz Limisiewicz, Terra Blevins, Hila Gonen, Orevaoghene Ahia, Luke Zettlemoyer:
MYTE: Morphology-Driven Byte Encoding for Better and Fairer Multilingual Language Modeling. 15059-15076 - Joel Niklaus, Veton Matoshi, Matthias Stürmer, Ilias Chalkidis, Daniel E. Ho:
MultiLegalPile: A 689GB Multilingual Legal Corpus. 15077-15094 - Haolin Deng, Chang Wang, Xin Li, Dezhang Yuan, Junlang Zhan, Tianhua Zhou, Jin Ma, Jun Gao, Ruifeng Xu:
WebCiteS: Attributed Query-Focused Summarization on Chinese Web Search Results with Citations. 15095-15114 - Nadav Borenstein, Anej Svete, Robin Chan, Josef Valvoda, Franz Nowak, Isabelle Augenstein, Eleanor Chodroff, Ryan Cotterell:
What Languages are Easy to Language-Model? A Perspective from Learning Probabilistic Regular Languages. 15115-15134 - Behzad Shayegh, Yuqiao Wen, Lili Mou:
Tree-Averaging Algorithms for Ensemble-Based Unsupervised Discontinuous Constituency Parsing. 15135-15156 - Fengqing Jiang, Zhangchen Xu, Luyao Niu, Zhen Xiang, Bhaskar Ramasubramanian, Bo Li, Radha Poovendran:
ArtPrompt: ASCII Art-based Jailbreak Attacks against Aligned LLMs. 15157-15173 - Chen Qian, Wei Liu, Hongzhang Liu, Nuo Chen, Yufan Dang, Jiahao Li, Cheng Yang, Weize Chen, Yusheng Su, Xin Cong, Juyuan Xu, Dahai Li, Zhiyuan Liu, Maosong Sun:
ChatDev: Communicative Agents for Software Development. 15174-15186 - Jingxuan Han, Quan Wang, Zikang Guo, Benfeng Xu, Licheng Zhang, Zhendong Mao:
Disentangled Learning with Synthetic Parallel Data for Text Style Transfer. 15187-15201 - Zaibin Zhang, Yongting Zhang, Lijun Li, Jing Shao, Hongzhi Gao, Yu Qiao, Lijun Wang, Huchuan Lu, Feng Zhao:
PsySafe: A Comprehensive Framework for Psychological-based Attack, Defense, and Evaluation of Multi-agent System Safety. 15202-15231 - Dongjin Kang, Sunghwan Kim, Taeyoon Kwon, Seungjun Moon, Hyunsouk Cho, Youngjae Yu, Dongha Lee, Jinyoung Yeo:
Can Large Language Models be Good Emotional Supporter? Mitigating Preference Bias on Emotional Support Conversation. 15232-15261 - Xinrong Zhang, Yingfa Chen, Shengding Hu, Zihang Xu, Junhao Chen, Moo Khai Hao, Xu Han, Zhen Leng Thai, Shuo Wang, Zhiyuan Liu, Maosong Sun:
∞Bench: Extending Long Context Evaluation Beyond 100K Tokens. 15262-15277 - Tharindu Madusanka, Ian Pratt-Hartmann, Riza Batista-Navarro:
Natural Language Satisfiability: Exploring the Problem Distribution and Evaluating Transformer-based Language Models. 15278-15294 - Paul Röttger, Valentin Hofmann, Valentina Pyatkin, Musashi Hinck, Hannah Kirk, Hinrich Schütze, Dirk Hovy:
Political Compass or Spinning Arrow? Towards More Meaningful Evaluations for Values and Opinions in Large Language Models. 15295-15311 - Giovanni Puccetti, Anna Rogers, Chiara Alzetta, Felice Dell'Orletta, Andrea Esuli:
AI 'News' Content Farms Are Easy to Make and Hard to Detect: A Case Study in Italian. 15312-15338 - Mosh Levy, Alon Jacoby, Yoav Goldberg:
Same Task, More Tokens: the Impact of Input Length on the Reasoning Performance of Large Language Models. 15339-15353 - Yue Wang, Qiliang Liang, Yaqi Yin, Hansi Wang, Yang Liu:
Disambiguate Words like Composing Them: A Morphology-Informed Approach to Enhance Chinese Word Sense Disambiguation. 15354-15365 - Chris Wendler, Veniamin Veselovsky, Giovanni Monea, Robert West:
Do Llamas Work in English? On the Latent Language of Multilingual Transformers. 15366-15394 - Xingyuan Pan, Luyang Huang, Liyan Kang, Zhicheng Liu, Yu Lu, Shanbo Cheng:
G-DIG: Towards Gradient-based DIverse and hiGh-quality Instruction Data Selection for Machine Translation. 15395-15406 - Yulia Otmakhova, Shima Khanehzar, Lea Frermann:
Media Framing: A typology and Survey of Computational Approaches Across Disciplines. 15407-15428 - Fangfang Li, Cheng Huang, Puzhen Su, Jie Yin:
SPZ: A Semantic Perturbation-based Data Augmentation Method with Zonal-Mixing for Alzheimer's Disease Detection. 15429-15439 - Dennis Ulmer, Martin Gubri, Hwaran Lee, Sangdoo Yun, Seong Joon Oh:
Calibrating Large Language Models Using Their Generations Only. 15440-15459 - Jiaxi Yang, Binyuan Hui, Min Yang, Bailin Wang, Bowen Li, Binhua Li, Fei Huang, Yongbin Li:
Iterative Forward Tuning Boosts In-Context Learning in Language Models. 15460-15473 - Wenda Xu, Guanglei Zhu, Xuandong Zhao, Liangming Pan, Lei Li, William Wang:
Pride and Prejudice: LLM Amplifies Self-Bias in Self-Refinement. 15474-15492 - Chihiro Taguchi, David Chiang:
Language Complexity and Speech Recognition Accuracy: Orthographic Complexity Hurts, Phonological Complexity Doesn't. 15493-15503 - Nina Rimsky, Nick Gabrieli, Julian Schulz, Meg Tong, Evan Hubinger, Alexander Matt Turner:
Steering Llama 2 via Contrastive Activation Addition. 15504-15522 - Nian Li, Chen Gao, Mingyu Li, Yong Li, Qingmin Liao:
EconAgent: Large Language Model-Empowered Agents for Simulating Macroeconomic Activities. 15523-15536 - Zhexin Zhang, Leqi Lei, Lindong Wu, Rui Sun, Yongkang Huang, Chong Long, Xiao Liu, Xuanyu Lei, Jie Tang, Minlie Huang:
SafetyBench: Evaluating the Safety of Large Language Models. 15537-15553 - Haisu Guan, Huanxin Yang, Xinyu Wang, Shengwei Han, Yongge Liu, Lianwen Jin, Xiang Bai, Yuliang Liu:
Deciphering Oracle Bone Language with Diffusion Models. 15554-15567 - Wai-Chung Kwan, Xingshan Zeng, Yufei Wang, Yusen Sun, Liangyou Li, Yuxin Jiang, Lifeng Shang, Qun Liu, Kam-Fai Wong:
M4LE: A Multi-Ability Multi-Range Multi-Task Multi-Domain Long-Context Evaluation Benchmark for Large Language Models. 15568-15592 - Jaavid Aktar Husain, Raj Dabre, Aswanth M., Jay Gala, Thanmay Jayakumar, Ratish Puduppully, Anoop Kunchukuttan:
RomanSetu: Efficiently unlocking multilingual capabilities of Large Language Models via Romanization. 15593-15615 - Pietro Lesci, Clara Meister, Thomas Hofmann, Andreas Vlachos, Tiago Pimentel:
Causal Estimation of Memorisation Profiles. 15616-15635 - Jiasheng Si, Yibo Zhao, Yingjie Zhu, Haiyang Zhu, Wenpeng Lu, Deyu Zhou:
CHECKWHY: Causal Fact Verification via Argument Structure. 15636-15659 - Christian Tomani, David Vilar, Markus Freitag, Colin Cherry, Subhajit Naskar, Mara Finkelstein, Xavier Garcia, Daniel Cremers:
Quality-Aware Translation Models: Efficient Generation and Quality Estimation in a Single Model. 15660-15679 - Jan-Christoph Klie, Juan Haladjian, Marc Kirchner, Rahul Nair:
On Efficient and Statistical Quality Estimation for Data Annotation. 15680-15696 - Chenye Zhao, Cornelia Caragea:
EZ-STANCE: A Large Dataset for English Zero-Shot Stance Detection. 15697-15714 - Kayo Yin, Terry Regier, Dan Klein:
American Sign Language Handshapes Reflect Pressures for Communicative Efficiency. 15715-15724 - Luca Soldaini, Rodney Kinney, Akshita Bhagia, Dustin Schwenk, David Atkinson, Russell Authur, Ben Bogin, Khyathi Raghavi Chandu, Jennifer Dumas, Yanai Elazar, Valentin Hofmann, Ananya Harsh Jha, Sachin Kumar, Li Lucy, Xinxi Lyu, Nathan Lambert, Ian Magnusson, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Abhilasha Ravichander, Kyle Richardson, Zejiang Shen, Emma Strubell, Nishant Subramani, Oyvind Tafjord, Evan Pete Walsh, Luke Zettlemoyer, Noah A. Smith, Hannaneh Hajishirzi, Iz Beltagy, Dirk Groeneveld, Jesse Dodge, Kyle Lo:
Dolma: an Open Corpus of Three Trillion Tokens for Language Model Pretraining Research. 15725-15788 - Dirk Groeneveld, Iz Beltagy, Evan Pete Walsh, Akshita Bhagia, Rodney Kinney, Oyvind Tafjord, Ananya Harsh Jha, Hamish Ivison, Ian Magnusson, Yizhong Wang, Shane Arora, David Atkinson, Russell Authur, Khyathi Raghavi Chandu, Arman Cohan, Jennifer Dumas, Yanai Elazar, Yuling Gu, Jack Hessel, Tushar Khot, William Merrill, Jacob Morrison, Niklas Muennighoff, Aakanksha Naik, Crystal Nam, Matthew E. Peters, Valentina Pyatkin, Abhilasha Ravichander, Dustin Schwenk, Saurabh Shah, Will Smith, Emma Strubell, Nishant Subramani, Mitchell Wortsman, Pradeep Dasigi, Nathan Lambert, Kyle Richardson, Luke Zettlemoyer, Jesse Dodge, Kyle Lo, Luca Soldaini, Noah A. Smith, Hannaneh Hajishirzi:
OLMo: Accelerating the Science of Language Models. 15789-15809 - Zhanhui Zhou, Jie Liu, Zhichen Dong, Jiaheng Liu, Chao Yang, Wanli Ouyang, Yu Qiao:
Emulated Disalignment: Safety Alignment for Large Language Models May Backfire! 15810-15830 - Mohammed Safi Ur Rahman Khan, Priyam Mehta, Ananth Sankar, Umashankar Kumaravelan, Sumanth Doddapaneni, Suriyaprasaad B, Varun Balan G, Sparsh Jain, Anoop Kunchukuttan, Pratyush Kumar, Raj Dabre, Mitesh M. Khapra:
IndicLLMSuite: A Blueprint for Creating Pre-training and Fine-Tuning Datasets for Indian Languages. 15831-15879 - Xiaolong Wang, Yile Wang, Yuanchi Zhang, Fuwen Luo, Peng Li, Maosong Sun, Yang Liu:
Reasoning in Conversation: Solving Subjective Tasks through Dialogue Simulation for Large Language Models. 15880-15893 - Ahmet Üstün, Viraat Aryabumi, Zheng Xin Yong, Wei-Yin Ko, Daniel D'souza, Gbemileke Onilude, Neel Bhandari, Shivalika Singh, Hui-Lee Ooi, Amr Kayid, Freddie Vargus, Phil Blunsom, Shayne Longpre, Niklas Muennighoff, Marzieh Fadaee, Julia Kreutzer, Sara Hooker:
Aya Model: An Instruction Finetuned Open-Access Multilingual Language Model. 15894-15939 - Peiwen Yuan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Boyuan Pan, Heda Wang, Yao Hu, Kan Li:
BatchEval: Towards Human-like Text Evaluation. 15940-15958 - Zhuang Chen, Jincenzi Wu, Jinfeng Zhou, Bosi Wen, Guanqun Bi, Gongyao Jiang, Yaru Cao, Mengting Hu, Yunghwei Lai, Zexuan Xiong, Minlie Huang:
ToMBench: Benchmarking Theory of Mind in Large Language Models. 15959-15983 - Jincenzi Wu, Zhuang Chen, Jiawen Deng, Sahand Sabour, Helen Meng, Minlie Huang:
COKE: A Cognitive Knowledge Graph for Machine Theory of Mind. 15984-16007 - Silvia Casola, Simona Frenda, Soda Marem Lo, Erhan Sezerer, Antonio Uva, Valerio Basile, Cristina Bosco, Alessandro Pedrani, Chiara Rubagotti, Viviana Patti, Davide Bernardi:
MultiPICo: Multilingual Perspectivist Irony Corpus. 16008-16021 - Harsh Trivedi, Tushar Khot, Mareike Hartmann, Ruskin Manku, Vinty Dong, Edward Li, Shashank Gupta, Ashish Sabharwal, Niranjan Balasubramanian:
AppWorld: A Controllable World of Apps and People for Benchmarking Interactive Coding Agents. 16022-16076 - Chuanyang Jin, Yutong Wu, Jing Cao, Jiannan Xiang, Yen-Ling Kuo, Zhiting Hu, Tomer D. Ullman, Antonio Torralba, Joshua B. Tenenbaum, Tianmin Shu:
MMToM-QA: Multimodal Theory of Mind Question Answering. 16077-16102 - Yilun Zhao, Yitao Long, Hongjun Liu, Ryo Kamoi, Linyong Nan, Lyuhao Chen, Yixin Liu, Xiangru Tang, Rui Zhang, Arman Cohan:
DocMath-Eval: Evaluating Math Reasoning Capabilities of LLMs in Understanding Financial Documents. 16103-16120 - Michael J. Ryan, William Held, Diyi Yang:
Unintended Impacts of LLM Alignment on Global Representation. 16121-16140 - Arkadiy Saakyan, Smaranda Muresan:
ICLEF: In-Context Learning with Expert Feedback for Explainable Style Transfer. 16141-16163 - Davis Yoshida, Kartik Goyal, Kevin Gimpel:
MAP's not dead yet: Uncovering true language model modes by conditioning away degeneracy. 16164-16215 - Stefano Perrella, Lorenzo Proietti, Alessandro Scirè, Edoardo Barba, Roberto Navigli:
Guardians of the Machine Translation Meta-Evaluation: Sentinel Metrics Fall In! 16216-16244 - Rongwu Xu, Brian S. Lin, Shujian Yang, Tianqi Zhang, Weiyan Shi, Tianwei Zhang, Zhixuan Fang, Wei Xu, Han Qiu:
The Earth is Flat because...: Investigating LLMs' Belief towards Misinformation via Persuasive Conversation. 16259-16303 - Jiaqi Li, Mengmeng Wang, Zilong Zheng, Muhan Zhang:
LooGLE: Can Long-Context Language Models Understand Long Contexts? 16304-16333 - Se Jin Park, Chae Won Kim, Hyeongseop Rha, Minsu Kim, Joanna Hong, Jeong Hun Yeo, Yong Man Ro:
Let's Go Real Talk: Spoken Dialogue Model for Face-to-Face Conversation. 16334-16348 - Yu Lu Liu, Su Lin Blodgett, Jackie C. K. Cheung, Vera Liao, Alexandra Olteanu, Ziang Xiao:
ECBD: Evidence-Centered Benchmark Design for NLP. 16349-16365 - Tarek Naous, Michael J. Ryan, Alan Ritter, Wei Xu:
Having Beer after Prayer? Measuring Cultural Bias in Large Language Models. 16366-16393 - Paul Roit, Aviv Slobodkin, Eran Hirsch, Arie Cattan, Ayal Klein, Valentina Pyatkin, Ido Dagan:
Explicating the Implicit: Argument Detection Beyond Sentence Boundaries. 16394-16409 - Chi Han, Jialiang Xu, Manling Li, Yi Fung, Chenkai Sun, Nan Jiang, Tarek F. Abdelzaher, Heng Ji:
Word Embeddings Are Steers for Language Models. 16410-16430