63rd ACL 2025: Vienna, Austria - Long Papers
- Wanxiang Che, Joyce Nabende, Ekaterina Shutova, Mohammad Taher Pilehvar:
Proceedings of the 63rd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), ACL 2025, Vienna, Austria, July 27 - August 1, 2025. Association for Computational Linguistics 2025, ISBN 979-8-89176-251-0 - Frontmatter.
- Weiqi Wang, Limeng Cui, Xin Liu, Sreyashi Nag, Wenju Xu, Chen Luo, Sheikh Muhammad Sarwar, Yang Li, Hansu Gu, Hui Liu, Changlong Yu, Jiaxin Bai, Yifan Gao, Haiyang Zhang, Qi He, Shuiwang Ji, Yangqiu Song:
EcomScriptBench: A Multi-task Benchmark for E-commerce Script Planning via Step-wise Intention-Driven Product Association. 1-22 - Bo Pan, Zhen Xiong, Guanchen Wu, Zheng Zhang, Yifei Zhang, Yuntong Hu, Liang Zhao:
GraphNarrator: Generating Textual Explanations for Graph Neural Networks. 23-42 - Srishti Gureja, Lester James Validad Miranda, Shayekh Bin Islam, Rishabh Maheshwary, Drishti Sharma, Gusti Triandi Winata, Nathan Lambert, Sebastian Ruder, Sara Hooker, Marzieh Fadaee:
M-RewardBench: Evaluating Reward Models in Multilingual Settings. 43-58 - Xinwei Yang, Zhaofeng Liu, Chen Huang, Jiashuai Zhang, Tong Zhang, Yifan Zhang, Wenqiang Lei:
ELABORATION: A Comprehensive Benchmark on Human-LLM Competitive Programming. 59-104 - Jacy Reese Anthis, Kristian Lum, Michael D. Ekstrand, Avi Feller, Chenhao Tan:
The Impossibility of Fair LLMs. 105-120 - Ermo Hua, Biqing Qi, Kaiyan Zhang, Kai Tian, Xingtai Lv, Ning Ding, Bowen Zhou:
Intuitive Fine-Tuning: Towards Simplifying Alignment into a Single Process. 121-136 - Kristian Lum, Jacy Reese Anthis, Kevin Robinson, Chirag Nagpal, Alexander Nicholas D'Amour:
Bias in Language Models: Beyond Trick Tests and Towards RUTEd Evaluation. 137-161 - Wenhan Liu, Xinyu Ma, Yutao Zhu, Ziliang Zhao, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou:
Sliding Windows Are Not the End: Exploring Full Ranking with Long-Context Large Language Models. 162-176 - Aaron Nicolson, Shengyao Zhuang, Jason Dowling, Bevan Koopman:
The Impact of Auxiliary Patient Data on Automated Chest X-Ray Report Generation and How to Incorporate It. 177-203 - Jingheng Ye, Zishan Xu, Yinghui Li, Linlin Song, Qingyu Zhou, Hai-Tao Zheng, Ying Shen, Wenhao Jiang, Hong-Gee Kim, Ruitong Liu, Xin Su, Zifei Shan:
CLEME2.0: Towards Interpretable Evaluation by Disentangling Edits for Grammatical Error Correction. 204-222 - Zhouhong Gu, Haoning Ye, Xingzhou Chen, Zeyang Zhou, Hongwei Feng, Yanghua Xiao:
StrucText-Eval: Evaluating Large Language Model's Reasoning Ability in Structure-Rich Text. 223-244 - Haokun Liu, Yangqiaoyu Zhou, Mingxuan Li, Chenfei Yuan, Chenhao Tan:
Literature Meets Data: A Synergistic Approach to Hypothesis Generation. 245-281 - Zhouhong Gu, Xingzhou Chen, Xiaoran Shi, Tao Wang, Suhang Zheng, Tianyu Li, Hongwei Feng, Yanghua Xiao:
GAPO: Learning Preferential Prompt through Generative Adversarial Policy Optimization. 282-296 - Ziyang Luo, Kaixin Li, Hongzhan Lin, Yuchen Tian, Mohan S. Kankanhalli, Jing Ma:
Tree-of-Evolution: Tree-Structured Instruction Evolution for Code Generation in Large Language Models. 297-316 - Seunguk Yu, Juhwan Choi, YoungBin Kim:
Delving into Multilingual Ethical Bias: The MSQAD with Statistical Hypothesis Tests for Large Language Models. 317-340 - Dosung Lee, Wonjun Oh, Boyoung Kim, Minyoung Kim, Joonsuk Park, Paul Hongsuck Seo:
ReSCORE: Label-free Iterative Retriever Training for Multi-hop Question Answering with Relevance-Consistency Supervision. 341-359 - Hongzhan Lin, Yang Deng, Yuxuan Gu, Wenxuan Zhang, Jing Ma, See-Kiong Ng, Tat-Seng Chua:
FACT-AUDIT: An Adaptive Multi-Agent Framework for Dynamic Fact-Checking Evaluation of Large Language Models. 360-381 - Loïc Fosse, Frédéric Béchet, Benoît Favre, Géraldine Damnati, Gwénolé Lecorvé, Maxime Darrin, Philippe Formont, Pablo Piantanida:
Statistical Deficiency for Task Inclusion Estimation. 382-415 - Jabin Koo, Minwoo Jang, Jungseul Ok:
Towards Robust and Efficient Federated Low-Rank Adaptation with Heterogeneous Clients. 416-429 - Kaibo Liu, Zhenpeng Chen, Yiyang Liu, Jie M. Zhang, Mark Harman, Yudong Han, Yun Ma, Yihong Dong, Ge Li, Gang Huang:
LLM-Powered Test Case Generation for Detecting Bugs in Plausible Programs. 430-440 - Chengwei Dai, Kun Li, Wei Zhou, Songlin Hu:
Capture the Key in Reasoning to Enhance CoT Distillation Generalization. 441-465 - Chen Huang, Yang Deng, Wenqiang Lei, Jiancheng Lv, Tat-Seng Chua, Jimmy Huang:
How to Enable Effective Cooperation Between Humans and NLP Models: A Survey of Principles, Formalizations, and Beyond. 466-488 - Li Zheng, Sihang Wang, Hao Fei, Zuquan Peng, Fei Li, Jianming Fu, Chong Teng, Donghong Ji:
Enhancing Hyperbole and Metaphor Detection with Their Bidirectional Dynamic Interaction and Emotion Knowledge. 489-499 - Jun Gao, Qi Lv, Zili Wang, Tianxiang Wu, Ziqiang Cao, Wenjie Li:
UniICL: An Efficient ICL Framework Unifying Compression, Selection, and Generation. 500-510 - Maksim Aparovich, Volha Harytskaya, Vladislav Poritski, Oksana Volchek, Pavel Smrz:
BelarusianGLUE: Towards a Natural Language Understanding Benchmark for Belarusian. 511-527 - Fan Zhang, Hao Chen, Zhihong Zhu, Ziheng Zhang, Zhenxi Lin, Ziyue Qiao, Yefeng Zheng, Xian Wu:
A Survey on Foundation Language Models for Single-cell Biology. 528-549 - Ruiwen Zhou, Wenyue Hua, Liangming Pan, Sitao Cheng, Xiaobao Wu, En Yu, William Yang Wang:
RuleArena: A Benchmark for Rule-Guided Reasoning with LLMs in Real-World Scenarios. 550-572 - Xinhao Xu, Jiaxin Li, Hui Chen, Zijia Lin, Jungong Han, Guiguang Ding:
Extending LLM Context Window with Adaptive Grouped Positional Encoding: A Training-Free Method. 573-587 - Sungjae Lee, Hyejin Park, Jaechang Kim, Jungseul Ok:
Semantic Exploration with Adaptive Gating for Efficient Problem Solving with Language Models. 588-606 - Arian Askari, Emmanouil Stergiadis, Ilya Gusev, Moran Beladev:
HotelMatch-LLM: Joint Multi-Task Training of Small and Large Language Models for Efficient Multimodal Hotel Retrieval. 607-619 - Jingping Liu, Ziyan Liu, Zhedong Cen, Yan Zhou, Yinan Zou, Weiyan Zhang, Haiyun Jiang, Tong Ruan:
Can Multimodal Large Language Models Understand Spatial Relations? 620-632 - Márton Kardos, Jan Kostkan, Kenneth C. Enevoldsen, Arnault-Quentin Vermillet, Kristoffer L. Nielbo, Roberta Rocca:
S³ - Semantic Signal Separation. 633-666 - Lanxiang Hu, Tajana Rosing, Hao Zhang:
TrimLLM: Progressive Layer Dropping for Domain-Specific LLMs. 667-681 - Ariel Gera, Odellia Boni, Yotam Perlitz, Roy Bar-Haim, Lilach Eden, Asaf Yehudai:
JuStRank: Benchmarking LLM Judges for System Ranking. 682-712 - Zexuan Li, Hongliang Dai, Piji Li:
Generating Diverse Training Samples for Relation Extraction with Large Language Models. 713-726 - Dominik Macko, Jakub Kopal, Róbert Móro, Ivan Srba:
MultiSocial: Multilingual Benchmark of Machine-Generated Text Detection of Social-Media Texts. 727-752 - Cilin Yan, Jingyun Wang, Lin Zhang, Ruihui Zhao, Xiaopu Wu, Kai Xiong, Qingsong Liu, Guoliang Kang, Yangyang Kang:
Efficient and Accurate Prompt Optimization: the Benefit of Memory in Exemplar-Guided Reflection. 753-779 - Aneta Zugecova, Dominik Macko, Ivan Srba, Róbert Móro, Jakub Kopál, Katarina Marcincinova, Matús Mesarcík:
Evaluation of LLM Vulnerabilities to Being Misused for Personalized Disinformation Generation. 780-797 - Cheng Qian, Peixuan Han, Qinyu Luo, Bingxiang He, Xiusi Chen, Yuji Zhang, Hongyi Du, Jiarui Yao, Xiaocheng Yang, Denghui Zhang, Yunzhu Li, Heng Ji:
EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents. 798-820 - Teng Wang, Wing Yin Yu, Zhenqi He, Zehua Liu, HaileiGong HaileiGong, Han Wu, Xiongwei Han, Wei Shi, Ruifeng She, Fangzhou Zhu, Tao Zhong:
BPP-Search: Enhancing Tree of Thought Reasoning for Mathematical Modeling Problem Solving. 821-838 - Jakub Smíd, Pavel Pribán, Pavel Král:
LACA: Improving Cross-lingual Aspect-Based Sentiment Analysis with LLM Data Augmentation. 839-853 - Ning Ding, Yulin Chen, Ganqu Cui, Xingtai Lv, Weilin Zhao, Kaiyan Zhang, Ruobing Xie, Bowen Zhou, Zhiyuan Liu, Maosong Sun:
Fusing Highly Specialized Language Models for Comprehensive Expertise. 854-878 - Meng-Chieh Lee, Qi Zhu, Costas Mavromatis, Zhen Han, Soji Adeshina, Vassilis N. Ioannidis, Huzefa Rangwala, Christos Faloutsos:
HybGRAG: Hybrid Retrieval-Augmented Generation on Textual and Relational Knowledge Bases. 879-893 - Rajvardhan Oak, Muhammad Haroon, Claire Wonjeong Jo, Magdalena Wojcieszak, Anshuman Chhabra:
Re-ranking Using Large Language Models for Mitigating Exposure to Harmful Content on Social Media Platforms. 894-908 - Yidong Gan, Maciej Rybinski, Ben Hachey, Jonathan K. Kummerfeld:
Aligning AI Research with the Needs of Clinical Coding Workflows: Eight Recommendations Based on US Data Analysis and Critical Review. 909-922 - Ziyan Liu, Chunxiao Fan, Haoran Lou, Yuexin Wu, Kaiwei Deng:
MIND: A Multi-agent Framework for Zero-shot Harmful Meme Detection. 923-947 - Wei Tang, Yixin Cao, Yang Deng, Jiahao Ying, Bo Wang, Yizhe Yang, Yuyue Zhao, Qi Zhang, Xuanjing Huang, Yu-Gang Jiang, Yong Liao:
EvoWiki: Evaluating LLMs on Evolving Knowledge. 948-964 - Yihong Dong, Yuchen Liu, Xue Jiang, Bin Gu, Zhi Jin, Ge Li:
Rethinking Repetition Problems of LLMs in Code Generation. 965-985 - Kun Ouyang, Yuanxin Liu, Shicheng Li, Yi Liu, Hao Zhou, Fandong Meng, Jie Zhou, Xu Sun:
PunchBench: Benchmarking MLLMs in Multimodal Punchline Comprehension. 986-1008 - Chujie Zheng, Zhenru Zhang, Beichen Zhang, Runji Lin, Keming Lu, Bowen Yu, Dayiheng Liu, Jingren Zhou, Junyang Lin:
ProcessBench: Identifying Process Errors in Mathematical Reasoning. 1009-1024 - Chujie Zheng, Ziqi Wang, Heng Ji, Minlie Huang, Nanyun Peng:
Model Extrapolation Expedites Alignment. 1025-1041 - Yi Liu, Guoyin Wang, Shicheng Li, Feifan Song, Xu Sun:
ATLANTIS: Weak-to-Strong Learning via Importance Sampling. 1042-1052 - Zhaodan Zhang, Zhao Zhang, Jin Zhang, Hui Xu, Xueqi Cheng:
MPVStance: Mitigating Hallucinations in Stance Detection with Multi-Perspective Verification. 1053-1067 - Yaoqi Guo, Zhenpeng Chen, Jie M. Zhang, Yang Liu, Yun Ma:
Personality-Guided Code Generation Using Large Language Models. 1068-1080 - Haojie Xie, Yirong Chen, Xiaofen Xing, Jingkai Lin, Xiangmin Xu:
PsyDT: Using LLMs to Construct the Digital Twin of Psychological Counselor with Personalized Counseling Style for Psychological Counseling. 1081-1115 - Xu Zou:
BIPro: Zero-shot Chinese Poem Generation via Block Inverse Prompting Constrained Generation Framework. 1116-1134 - Chao Deng, Jiale Yuan, Pi Bu, Peijie Wang, Zhong-Zhi Li, Jian Xu, Xiao-Hui Li, Yuan Gao, Jun Song, Bo Zheng, Cheng-Lin Liu:
LongDocURL: a Comprehensive Multimodal Long Document Benchmark Integrating Understanding, Reasoning, and Locating. 1135-1159 - Yu Lin, Ruining Yang, Yunlong Mao, Qizhi Zhang, Jue Hong, Quanwei Cai, Ye Wu, Huiqi Liu, Zhiyu Chen, Bing Duan, Sheng Zhong:
ObfusLM: Privacy-preserving Language Model Service against Embedding Inversion Attacks. 1160-1174 - Federico Ruggeri, Gaetano Signorelli:
Interlocking-free Selective Rationalization Through Genetic-based Learning. 1175-1191 - Lucas Georges Gabriel Charpentier, Pierre Lison:
Re-identification of De-identified Documents with Autoregressive Infilling. 1192-1209 - Haomiao Tang, Jinpeng Wang, Yuang Peng, Guanghao Meng, Ruisheng Luo, Bin Chen, Long Chen, Yaowei Wang, Shutao Xia:
Modeling Uncertainty in Composed Image Retrieval via Probabilistic Embeddings. 1210-1222 - Junfeng Tian, Da Zheng, Yang Chen, Rui Wang, Colin Zhang, Debing Zhang:
Untie the Knots: An Efficient Data Augmentation Strategy for Long-Context Pre-Training in Language Models. 1223-1242 - Honghua Dong, Qidong Su, Yubo Gao, Zhaoyu Li, Yangjun Ruan, Gennady Pekhimenko, Chris J. Maddison, Xujie Si:
APPL: A Prompt Programming Language for Harmonious Integration of Programs and Large Language Model Prompts. 1243-1266 - Cristiano Ciaccio, Alessio Miaschi, Felice Dell'Orletta:
Evaluating Lexical Proficiency in Neural Language Models. 1267-1286 - Lingwei Meng, Long Zhou, Shujie Liu, Sanyuan Chen, Bing Han, Shujie Hu, Yanqing Liu, Jinyu Li, Sheng Zhao, Xixin Wu, Helen M. Meng, Furu Wei:
Autoregressive Speech Synthesis without Vector Quantization. 1287-1300 - Letian Peng, Zilong Wang, Feng Yao, Jingbo Shang:
Cuckoo: An IE Free Rider Hatched by Massive Nutrition in LLM's Nest. 1301-1315 - Raghav Singhal, Kaustubh Ponkshe, Praneeth Vepakomma:
FedEx-LoRA: Exact Aggregation for Federated and Efficient Fine-Tuning of Large Language Models. 1316-1336 - Rahul Zalkikar, Kanchan Chandra:
Measuring Social Biases in Masked Language Models by Proxy of Prediction Quality. 1337-1361 - Siddharth Mangalik, Adithya V. Ganesan, Abigail B. Wheeler, Nicholas Kerry, Jeremy D. W. Clifton, H. Andrew Schwartz, Ryan L. Boyd:
Capturing Author Self Beliefs in Social Media Language. 1362-1376 - Xiaohao Yang, He Zhao, Weijie Xu, Yuanyuan Qi, Jueqing Lu, Dinh Phung, Lan Du:
Neural Topic Modeling with Large Language Models in the Loop. 1377-1401 - Abhilasha Ravichander, Shrusti Ghela, David Wadden, Yejin Choi:
HALoGEN: Fantastic LLM Hallucinations and Where to Find Them. 1402-1425 - Shuguo Hu, Jun Hu, Huaiwen Zhang:
Synergizing LLMs with Global Label Propagation for Multimodal Fake News Detection. 1426-1440 - Zi Liang, Qingqing Ye, Yanyun Wang, Sen Zhang, Yaxin Xiao, Ronghua Li, Jianliang Xu, Haibo Hu:
"Yes, My LoRD." Guiding Language Model Extraction with Locality Reinforced Distillation. 1441-1465 - Yu Wang, Xiaofei Zhou, Yichen Wang, Geyuan Zhang, Tianxing He:
Jailbreak Large Vision-Language Models Through Multi-Modal Linkage. 1466-1494 - Gracjan Góral, Emilia Wisnios, Piotr Sankowski, Pawel Budzianowski:
Wait, that's not an option: LLMs Robustness with Incorrect Multiple-Choice Options. 1495-1515 - Ameen Ali, Itamar Zimerman, Lior Wolf:
The Hidden Attention of Mamba Models. 1516-1534 - Luohe Shi, Zuchao Li, Lefei Zhang, Baoyuan Qi, Guoming Liu, Hai Zhao:
KV-Latent: Dimensional-level KV Cache Reduction with Frequency-aware Rotary Positional Embedding. 1535-1550 - Yan Wang, Ling Ding, Tien N. Nguyen, Shaohua Wang, Yanan Zheng:
LEANCODE: Understanding Models Better for Code Simplification of Pre-trained Large Language Models. 1551-1567 - Weiqi Wang, Yangqiu Song:
MARS: Benchmarking the Metaphysical Reasoning Abilities of Language Models with a Multi-task Evaluation Dataset. 1568-1596 - Hang Li, Tianlong Xu, Kaiqi Yang, Yucheng Chu, Yanling Chen, Yichi Song, Qingsong Wen, Hui Liu:
Ask-Before-Detection: Identifying and Mitigating Conformity Bias in LLM-Powered Error Detector for Math Word Problem Solutions. 1597-1609 - Sanxing Chen, Yukun Huang, Bhuwan Dhingra:
Real-time Factuality Assessment from Adversarial Feedback. 1610-1630 - Ruohong Zhang, Bowen Zhang, Yanghao Li, Haotian Zhang, Zhiqing Sun, Zhe Gan, Yinfei Yang, Ruoming Pang, Yiming Yang:
Improve Vision Language Model Chain-of-thought Reasoning. 1631-1662 - Haozhe An, Connor Baumler, Abhilasha Sancheti, Rachel Rudinger:
On the Mutual Influence of Gender and Occupation in LLM Representations. 1663-1680 - Mingyu Jin, Weidi Luo, Sitao Cheng, Xinyi Wang, Wenyue Hua, Ruixiang Tang, William Yang Wang, Yongfeng Zhang:
Disentangling Memory and Reasoning Ability in Large Language Models. 1681-1701 - Jiaqi Li, Yanming Li, Xiaoli Shen, Chuanyi Zhang, Guilin Qi, Sheng Bi:
Open-World Attribute Mining for E-Commerce Products with Multimodal Self-Correction Instruction Tuning. 1702-1714 - Joakim Edin, Andreas Geert Motzfeldt, Casper L. Christensen, Tuukka Ruotsalo, Lars Maaløe, Maria Maistro:
Normalized AOPC: Fixing Misleading Faithfulness Metrics for Feature Attributions Explainability. 1715-1730 - Yuguang Yang, Yu Pan, Jixun Yao, Xiang Zhang, Jianhao Ye, Hongbin Zhou, Lei Xie, Lei Ma, Jianjun Zhao:
Takin-VC: Expressive Zero-Shot Voice Conversion via Adaptive Hybrid Content Encoding and Enhanced Timbre Modeling. 1731-1742 - Yihong Liu, Haotian Ye, Chunlan Ma, Mingyang Wang, Hinrich Schütze:
LangSAMP: Language-Script Aware Multilingual Pretraining. 1743-1770 - Haoyu Dong, Yue Hu, Huailiang Peng, Yanan Cao:
RelationalCoder: Rethinking Complex Tables via Programmatic Relational Transformation. 1771-1784 - Bolei Ma, Berk Yoztyurk, Anna-Carolina Haensch, Xinpeng Wang, Markus Herklotz, Frauke Kreuter, Barbara Plank, Matthias Aßenmacher:
Algorithmic Fidelity of Large Language Models in Generating Synthetic German Public Opinions: A Case Study. 1785-1809 - Fanheng Kong, Jingyuan Zhang, Hongzhi Zhang, Shi Feng, Daling Wang, Linhao Yu, Xingguang Ji, Yu Tian, Victoria W., Fuzheng Zhang:
TUNA: Comprehensive Fine-grained Temporal Understanding Evaluation on Dense Dynamic Videos. 1810-1839 - Zhuo Li, Yuhao Du, Jinpeng Hu, Xiang Wan, Anningzhe Gao:
Self-Instructed Derived Prompt Generation Meets In-Context Learning: Unlocking New Potential of Black-Box LLMs. 1840-1857 - Seungjae Jung, Gunsoo Han, Daniel Wontae Nam, Kyoung-Woon On:
Binary Classifier Optimization for Large Language Model Alignment. 1858-1872 - Md Nayem Uddin, Amir Saeidi, Divij Handa, Agastya Seth, Tran Cao Son, Eduardo Blanco, Steven R. Corman, Chitta Baral:
UnSeenTimeQA: Time-Sensitive Question-Answering Beyond LLMs' Memorization. 1873-1913 - Yang Zhong, Diane J. Litman:
From Information to Insight: Leveraging LLMs for Open Aspect-Based Educational Summarization. 1914-1947 - Charles Nimo, Tobi Olatunji, Abraham Toluwase Owodunni, Tassallah Abdullahi, Emmanuel Ayodele, Mardhiyah Sanni, Ezinwanne C. Aka, Folafunmi Omofoye, Foutse Yuehgoh, Timothy Faniran, Bonaventure F. P. Dossou, Moshood O. Yekini, Jonas Kemp, Katherine A. Heller, Jude Chidubem Omeke, Chidi Asuzu MD, Naome A. Etori, Aimérou Ndiaye, Ifeoma Okoh, Evans Doe Ocansey, Wendy Kinara, Michael L. Best, Irfan Essa, Stephen Edward Moore, Chris Fourie, Mercy Nyamewaa Asiedu:
AfriMed-QA: A Pan-African, Multi-Specialty, Medical Question-Answering Benchmark Dataset. 1948-1973 - Xinyi Zeng, Yuying Shang, Jiawei Chen, Jingyuan Zhang, Yu Tian:
Root Defense Strategies: Ensuring Safety of LLM at the Decoding Level. 1974-1988 - Tianrui Pan, Jie Liu, Zewen Huang, Jie Tang, Gangshan Wu:
In-the-wild Audio Spatialization with Flexible Text-guided Localization. 1989-2001 - Hyesung Jeon, Yulhwa Kim, Jae-Joon Kim:
L4Q: Parameter Efficient Quantization-Aware Fine-Tuning on Large Language Models. 2002-2024 - Jianqing Zhu, Huang Huang, Zhihang Lin, Juhao Liang, Zhengyang Tang, Khalid Almubarak, Mosen Alharthi, Bang An, Juncai He, Xiangbo Wu, Fei Yu, Junying Chen, Zhuoheng Ma, Yuhao Du, He Zhang, Saied Alshahrani, Emad A. Alghamdi, Lian Zhang, Ruoyu Sun, Haizhou Li, Benyou Wang, Jinchao Xu:
Second Language (Arabic) Acquisition of LLMs via Progressive Vocabulary Expansion. 2025-2042 - Sangyeop Kim, Yohan Lee, Yongwoo Song, Kimin Lee:
What Really Matters in Many-Shot Attacks? An Empirical Study of Long-Context Vulnerabilities in LLMs. 2043-2063 - Tao Zhang, Zhenhua Tan:
ECERC: Evidence-Cause Attention Network for Multi-Modal Emotion Recognition in Conversation. 2064-2077 - Li Hu, Guoqiang Chen, Xiuwei Shang, Shaoyin Cheng, Benlong Wu, LiGangyang LiGangyang, Xu Zhu, Weiming Zhang, Nenghai Yu:
CompileAgent: Automated Real-World Repo-Level Compilation with Tool-Integrated LLM-based Agent System. 2078-2091 - Matthias Orlikowski, Jiaxin Pei, Paul Röttger, Philipp Cimiano, David Jurgens, Dirk Hovy:
Beyond Demographics: Fine-tuning Large Language Models to Predict Individuals' Subjective Text Perceptions. 2092-2111 - Chonghua Liao, Ruobing Xie, Xingwu Sun, Haowen Sun, Zhanhui Kang:
Exploring Forgetting in Large Language Model Pre-Training. 2112-2127 - Virgile Rennard, Christos Xypolopoulos, Michalis Vazirgiannis:
Bias in the Mirror: Are LLMs opinions robust to their own adversarial attacks. 2128-2143 - Yifan Xu, Xiao Liu, Xueqiao Sun, Siyi Cheng, Hao Yu, Hanyu Lai, Shudan Zhang, Dan Zhang, Jie Tang, Yuxiao Dong:
AndroidLab: Training and Systematic Benchmarking of Android Autonomous Agents. 2144-2166 - Yongxin Huang, Kexin Wang, Goran Glavas, Iryna Gurevych:
Modular Sentence Encoders: Separating Language Specialization from Cross-Lingual Alignment. 2167-2187 - Yijie Jin, Junjie Peng, Xuanchao Lin, Haochen Yuan, Lan Wang, Cangzhi Zheng:
Multimodal Transformers are Hierarchical Modal-wise Heterogeneous Graphs. 2188-2209 - Yichi Zhang, Zhuo Chen, Lingbing Guo, Yajing Xu, Shaokai Chen, Mengshu Sun, Binbin Hu, Zhiqiang Zhang, Lei Liang, Wen Zhang, Huajun Chen:
Have We Designed Generalizable Structural Knowledge Promptings? Systematic Evaluation and Rethinking. 2210-2226 - Jan Pfister, Julia Wunderle, Andreas Hotho:
LLäMmlein: Transparent, Compact and Competitive German-Only Language Models from Scratch. 2227-2246 - Youngmin Kim, Jiwan Chung, Jisoo Kim, Sunghyun Lee, Sangkyu Lee, Junhyeok Kim, Cheoljong Yang, Youngjae Yu:
Speaking Beyond Language: A Large-Scale Multimodal Dataset for Learning Nonverbal Cues from Video-Grounded Dialogues. 2247-2265 - Simone Teglia, Simone Tedeschi, Roberto Navigli:
How Much Do Encoder Models Know About Word Senses? 2266-2277 - Huaizhi Ge, Yiming Li, Qifan Wang, Yongfeng Zhang, Ruixiang Tang:
When Backdoors Speak: Understanding LLM Backdoor Attacks Through Model-Generated Explanations. 2278-2296 - Manuel Tonneau, Diyi Liu, Niyati Malhotra, Scott A. Hale, Samuel Fraiberger, Víctor Orozco-Olvera, Paul Röttger:
HateDay: Insights from a Global Hate Speech Dataset Representative of a Day on Twitter. 2297-2321 - Haitao Li, Junjie Chen, Jingli Yang, Qingyao Ai, Wei Jia, Youfeng Liu, Kai Lin, Yueyue Wu, Guozhi Yuan, Yiran Hu, Wuyue Wang, Yiqun Liu, Minlie Huang:
LegalAgentBench: Evaluating LLM Agents in Legal Domain. 2322-2344 - Peiqi Wang, ShengYun Peng, Xuewen Zhang, Hanchao Yu, Yibo Yang, Lifu Huang, Fujun Liu, Qifan Wang:
Inference Compute-Optimal Video Vision Language Models. 2345-2374 - Anirudh Sundar, Sinead Williamson, Katherine Metcalf, Barry-John Theobald, Skyler Seto, Masha Fedzechkina:
Steering into New Embedding Spaces: Analyzing Cross-Lingual Alignment Induced by Model Interventions in Multilingual Language Models. 2375-2401 - Amrit Poudel, Yifan Ding, Tim Weninger, Jürgen Pfeffer:
Digital Gatekeepers: Google's Role in Curating Hashtags and Subreddits. 2402-2415 - Anna Kolos, Katarzyna Lorenc, Emilia Wisnios, Agnieszka Karlinska:
Behind Closed Words: Creating and Investigating the forePLay Annotated Dataset for Polish Erotic Discourse. 2416-2432 - Maor Reuben, Ortal Slobodin, Idan-Chaim Cohen, Aviad Elyashar, Orna Braun-Lewensohn, Odeya Cohen, Rami Puzis:
Assessment and manipulation of latent constructs in pre-trained language models using psychometric scales. 2433-2444 - Ben Peters, André F. T. Martins:
Did Translation Models Get More Robust Without Anyone Even Noticing? 2445-2458 - Dan Su, Kezhi Kong, Ying Lin, Joseph Jennings, Brandon Norick, Markus Kliegl, Mostofa Patwary, Mohammad Shoeybi, Bryan Catanzaro:
Nemotron-CC: Transforming Common Crawl into a Refined Long-Horizon Pretraining Dataset. 2459-2475 - Hans William Alexander Hanley, Zakir Durumeric:
Hierarchical Level-Wise News Article Clustering via Multilingual Matryoshka Embeddings. 2476-2492 - Tassilo Klein, Moin Nabi:
Contrastive Perplexity for Controlled Generation: An Application in Detoxifying Large Language Models. 2493-2508 - Haohang Li, Yupeng Cao, Yangyang Yu, Shashidhar Reddy Javaji, Zhiyang Deng, Yueru He, Yuechen Jiang, Zining Zhu, K. P. Subbalakshmi, Jimin Huang, Lingfei Qian, Xueqing Peng, Jordan W. Suchow, Qianqian Xie:
INVESTORBENCH: A Benchmark for Financial Decision-Making Tasks with LLM-based Agent. 2509-2525 - Benjamin Warner, Antoine Chaffin, Benjamin Clavié, Orion Weller, Oskar Hallström, Said Taghadouini, Alexis Gallagher, Raja Biswas, Faisal Ladhak, Tom Aarsen, Griffin Thomas Adams, Jeremy Howard, Iacopo Poli:
Smarter, Better, Faster, Longer: A Modern Bidirectional Encoder for Fast, Memory Efficient, and Long Context Finetuning and Inference. 2526-2547 - Zhengyang Shan, Emily Diana, Jiawei Zhou:
Gender Inclusivity Fairness Index (GIFI): A Multilevel Framework for Evaluating Gender Diversity in Large Language Models. 2548-2579 - Qi Zhang, Zhiqing Xiao, Ruixuan Xiao, Lirong Gao, Junbo Zhao:
D.Va: Validate Your Demonstration First Before You Use It. 2580-2594 - Jiwan Chung, Janghan Yoon, Junhyeong Park, Sangeyl Lee, Joowon Yang, Sooyeon Park, Youngjae Yu:
Are Any-to-Any Models More Consistent Across Modality Transfers Than Specialists? 2595-2606 - Chia-Yuan Chang, Zhimeng Jiang, Vineeth Rakesh, Menghai Pan, Chin-Chia Michael Yeh, Guanchu Wang, Mingzhi Hu, Zhichao Xu, Yan Zheng, Mahashweta Das, Na Zou:
MAIN-RAG: Multi-Agent Filtering Retrieval-Augmented Generation. 2607-2622 - Hui Liu, Wenya Wang, Hao Sun, Chris Xing Tian, Chenqi Kong, Xin Dong, Haoliang Li:
Unraveling the Mechanics of Learning-Based Demonstration Selection for In-Context Learning. 2623-2641 - Yangkun Wang, Zihan Wang, Jingbo Shang:
Direct Prompt Optimization with Continuous Representations. 2642-2652 - Aishik Nagar, Yutong Liu, Andy T. Liu, Viktor Schlegel, Vijay Prakash Dwivedi, Arun-Kumar Kaliya-Perumal, Guna Pratheep Kalanchiam, Yili Tang, Robby T. Tan:
uMedSum: A Unified Framework for Clinical Abstractive Summarization. 2653-2672 - Yifan Yang, Zheshu Song, Jianheng Zhuo, Mingyu Cui, Jinpeng Li, Bo Yang, Yexing Du, Ziyang Ma, Xunying Liu, Ziyuan Wang, Ke Li, Shuai Fan, Kai Yu, Wei-Qiang Zhang, Guoguo Chen, Xie Chen:
GigaSpeech 2: An Evolving, Large-Scale and Multi-domain ASR Corpus for Low-Resource Languages with Automated Crawling, Transcription and Refinement. 2673-2686 - Fanhang Man, Huandong Wang, Jianjie Fang, Zhaoyi Deng, Baining Zhao, Xinlei Chen, Yong Li:
Context-Aware Sentiment Forecasting via LLM-based Multi-Perspective Role-Playing Agents. 2687-2703 - Xiang Huang, Jiayu Shen, Shanshan Huang, Sitao Cheng, Xiaxia Wang, Yuzhong Qu:
TARGA: Targeted Synthetic Data Generation for Practical Reasoning over Structured Data. 2704-2726 - Hanyu Lai, Junjie Gao, Xiao Liu, Yifan Xu, Shudan Zhang, Yuxiao Dong, Jie Tang:
AndroidGen: Building an Android Language Agent under Data Scarcity. 2727-2749 - Mingxuan Xia, Haobo Wang, Yixuan Li, Zewei Yu, Jindong Wang, Junbo Zhao, Runze Wu:
Prompt Candidates, then Distill: A Teacher-Student Framework for LLM-driven Data Annotation. 2750-2770 - Hanyu Lai, Xiao Liu, Junjie Gao, Jiale Cheng, Zehan Qi, Yifan Xu, Shuntian Yao, Dan Zhang, Jinhua Du, Zhenyu Hou, Xin Lv, Minlie Huang, Yuxiao Dong, Jie Tang:
A Survey of Post-Training Scaling in Large Language Models. 2771-2791 - Tal Haklay, Hadas Orgad, David Bau, Aaron Mueller, Yonatan Belinkov:
Position-aware Automatic Circuit Discovery. 2792-2817 - Yuhuan Lu, Weijian Yu, Xin Jing, Dingqi Yang:
HyperFM: Fact-Centric Multimodal Fusion for Link Prediction over Hyper-Relational Knowledge Graphs. 2818-2830 - Gregor Geigle, Florian Schneider, Carolin Holtermann, Chris Biemann, Radu Timofte, Anne Lauscher, Goran Glavas:
Centurio: On Drivers of Multilingual Ability of Large Vision-Language Model. 2831-2881 - Dimitris Gkoumas, Maria Liakata:
Less for More: Enhanced Feedback-aligned Mixed LLMs for Molecule Caption Generation and Fine-Grained NLI Evaluation. 2882-2902 - Georg Niess, Roman Kern:
Ensemble Watermarks for Large Language Models. 2903-2916 - Jiahui Geng, Thy Thy Tran, Preslav Nakov, Iryna Gurevych:
Con Instruction: Universal Jailbreaking of Multimodal Large Language Models via Non-Textual Modalities. 2917-2933 - Cheng-Han Chiang, Hung-yi Lee, Michal Lukasik:
TRACT: Regression-Aware Fine-tuning Meets Chain-of-Thought Reasoning for LLM-as-a-Judge. 2934-2952 - Hanghui Guo, Jia Zhu, Shimin Di, Weijie Shi, Zhangze Chen, Jiajie Xu:
DioR: Adaptive Cognitive Detection and Contextual Retrieval Optimization for Dynamic Retrieval-Augmented Generation. 2953-2975 - Boxuan Lyu, Hidetaka Kamigaito, Kotaro Funakoshi, Manabu Okumura:
Unveiling the Power of Source: Source-based Minimum Bayes Risk Decoding for Neural Machine Translation. 2976-2994 - Junjie Ye, Zhengyin Du, Xuesong Yao, Weijian Lin, Yufei Xu, Zehui Chen, Zaiyuan Wang, Sining Zhu, Zhiheng Xi, Siyu Yuan, Tao Gui, Qi Zhang, Xuanjing Huang, Jiecao Chen:
ToolHop: A Query-Driven Benchmark for Evaluating Large Language Models in Multi-Hop Tool Use. 2995-3021 - Zhili Liu, Yunhao Gou, Kai Chen, Lanqing Hong, Jiahui Gao, Fei Mi, Yu Zhang, Zhenguo Li, Xin Jiang, Qun Liu, James T. Kwok:
Mixture of insighTful Experts (MoTE): The Synergy of Reasoning Chains and Expert Mixtures in Self-Alignment. 3022-3038 - Weicong Qin, Yi Xu, Weijie Yu, Chenglei Shen, Ming He, Jianping Fan, Xiao Zhang, Jun Xu:
MAPS: Motivation-Aware Personalized Search via LLM-Driven Consultation Alignment. 3039-3051 - Jundong Xu, Hao Fei, Meng Luo, Qian Liu, Liangming Pan, William Yang Wang, Preslav Nakov, Mong-Li Lee, Wynne Hsu:
Aristotle: Mastering Logical Reasoning with A Logic-Complete Decompose-Search-Resolve Framework. 3052-3075 - Jianghao Chen, Junhong Wu, Yangyifan Xu, Jiajun Zhang:
LADM: Long-context Training Data Selection with Attention-based Dependency Measurement for LLMs. 3076-3090 - Yuanfan Li, Zhaohan Zhang, Chengzhengxu Li, Chao Shen, Xiaoming Liu:
Iron Sharpens Iron: Defending Against Attacks in Machine-Generated Text Detection with Adversarial Training. 3091-3113 - Chen Cecilia Liu, Anna Korhonen, Iryna Gurevych:
Cultural Learning-Based Culture Adaptation of Language Models. 3114-3134 - Yuhan Zhou, Naoki Yoshinaga:
A-TASC: Asian TED-Based Automatic Subtitling Corpus. 3135-3148 - Youliang Yuan, Wenxiang Jiao, Wenxuan Wang, Jen-tse Huang, Jiahao Xu, Tian Liang, Pinjia He, Zhaopeng Tu:
Refuse Whenever You Feel Unsafe: Improving Safety in LLMs via Decoupled Refusal Training. 3149-3167 - Yuchen Fu, Zifeng Cheng, Zhiwei Jiang, Zhonghui Wang, Yafeng Yin, Zhengliang Li, Qing Gu:
Token Prepending: A Training-Free Approach for Eliciting Better Sentence Embeddings from LLMs. 3168-3181 - Neha Srikanth, Rachel Rudinger, Jordan Lee Boyd-Graber:
No Questions are Stupid, but some are Poorly Posed: Understanding Poorly-Posed Information-Seeking Questions. 3182-3199 - Rupak Sarkar, Neha Srikanth, Taylor Pellegrin, Rachel Rudinger, Claire Bonial, Philip Resnik:
Understanding Common Ground Misalignment in Goal-Oriented Dialog: A Case-Study with Ubuntu Chat Logs. 3200-3215 - Olga Loginova, Oleksandr Bezrukov, Ravi Shekhar, Alexey Kravets:
Addressing Blind Guessing: Calibration of Selection Bias in Multiple-Choice Question Answering by Video Language Models. 3216-3246 - Sheng Ouyang, Yulan Hu, Ge Chen, Qingyang Li, Fuzheng Zhang, Yong Liu:
Towards Reward Fairness in RLHF: From a Resource Allocation Perspective. 3247-3259 - Siyuan Li, Juanxi Tian, Zedong Wang, Xin Jin, Zicheng Liu, Wentao Zhang, Dan Xu:
Taming LLMs with Gradient Grouping. 3260-3279 - Sukannya Purkayastha, Zhuang Li, Anne Lauscher, Lizhen Qu, Iryna Gurevych:
LazyReview: A Dataset for Uncovering Lazy Thinking in NLP Peer Reviews. 3280-3308 - Amr Keleg, Sharon Goldwater, Walid Magdy:
Revisiting Common Assumptions about Arabic Dialects in NLP. 3309-3327 - Ravi Patel, Angus Brayne, Rogier Hintzen, Daniel Jaroslawicz, Georgiana Neculae, Dane S. Corneil:
Retrieve to Explain: Evidence-driven Predictions for Explainable Drug Target Identification. 3328-3370 - Nishant Balepur, Vishakh Padmakumar, Fumeng Yang, Shi Feng, Rachel Rudinger, Jordan Lee Boyd-Graber:
Whose Boat Does it Float? Improving Personalization in Preference Tuning via Inferred User Personas. 3371-3393 - Nishant Balepur, Rachel Rudinger, Jordan Lee Boyd-Graber:
Which of These Best Describes Multiple Choice Evaluation with LLMs? A) Forced B) Flawed C) Fixable D) All of the Above. 3394-3418 - Muhammad Zain Ali, Yuxia Wang, Bernhard Pfahringer, Tony C. Smith:
Detection of Human and Machine-Authored Fake News in Urdu. 3419-3428 - Yangyang Zhao, Ben Niu, Libo Qin, Shihan Wang:
An Efficient Task-Oriented Dialogue Policy: Evolutionary Reinforcement Learning Injected by Elite Individuals. 3429-3442 - Jiahuan Zhang, Tianheng Wang, Ziyi Huang, Yulong Wu, Hanqing Wu, DongbaiChen DongbaiChen, Linfeng Song, Yue Zhang, Guozheng Rao, Kaicheng Yu:
SR-LLM: Rethinking the Structured Representation in Large Language Model. 3443-3462 - Chuang Zhou, Zhu Wang, Shengyuan Chen, Jiahe Du, Qiyuan Zheng, Zhaozhuo Xu, Xiao Huang:
Taming Language Models for Text-attributed Graph Learning with Decoupled Aggregation. 3463-3474 - Zifeng Cheng, Zhonghui Wang, Yuchen Fu, Zhiwei Jiang, Yafeng Yin, Cong Wang, Qing Gu:
Contrastive Prompting Enhances Sentence Embeddings in LLMs through Inference-Time Steering. 3475-3487 - Jinghan He, Kuan Zhu, Haiyun Guo, Junfeng Fang, Zhenglin Hua, Yuheng Jia, Ming Tang, Tat-Seng Chua, Jinqiao Wang:
Cracking the Code of Hallucination in LVLMs with Vision-aware Head Divergence. 3488-3501 - Jiajie Jin, Xiaoxi Li, Guanting Dong, Yuyao Zhang, Yutao Zhu, Yongkang Wu, Zhonghua Li, Ye Qi, Zhicheng Dou:
Hierarchical Document Refinement for Long-context Retrieval-augmented Generation. 3502-3520 - Chaoyi Xiang, Chunhua Liu, Simon De Deyne, Lea Frermann:
Comparing Moral Values in Western English-speaking societies and LLMs with Word Associations. 3521-3536 - Yuting Wei, Qi Meng, Yuanxing Xu, Bin Wu:
TEACH: A Contrastive Knowledge Adaptive Distillation Framework for Classical Chinese Understanding. 3537-3550 - Guanting Dong, Jiajie Jin, Xiaoxi Li, Yutao Zhu, Zhicheng Dou, Ji-Rong Wen:
RAG-Critic: Leveraging Automated Critic-Guided Agentic Workflow for Retrieval Augmented Generation. 3551-3578 - Guanting Dong, Chenghao Zhang, Mengjie Deng, Yutao Zhu, Zhicheng Dou, Ji-Rong Wen:
Progressive Multimodal Reasoning via Active Retrieval. 3579-3602 - Hao Peng, Xin Lv, Yushi Bai, Zijun Yao, Jiajie Zhang, Lei Hou, Juanzi Li:
Pre-training Distillation for Large Language Models: A Design Space Exploration. 3603-3618 - Pu Jian, Donglei Yu, Wen Yang, Shuo Ren, Jiajun Zhang:
Teaching Vision-Language Models to Ask: Resolving Ambiguity in Visual Questions. 3619-3638 - Yushi Bai, Shangqing Tu, Jiajie Zhang, Hao Peng, Xiaozhi Wang, Xin Lv, Shulin Cao, Jiazheng Xu, Lei Hou, Yuxiao Dong, Jie Tang, Juanzi Li:
LongBench v2: Towards Deeper Understanding and Reasoning on Realistic Long-context Multitasks. 3639-3664 - Haiyang Wang, Zhiliang Tian, Yuchen Pan, Xin Song, Xin Niu, Minlie Huang, Bin Zhou:
Battling against Tough Resister: Strategy Planning with Adversarial Game for Non-collaborative Dialogues. 3665-3685 - Youcheng Huang, Chen Huang, Duanyu Feng, Wenqiang Lei, Jiancheng Lv:
Cross-model Transferability among Large Language Models on the Platonic Representations of Concepts. 3686-3704 - Guichao Zhu, Lintian Lei, Yuhao Qing, Yichao Fu, Fanxin Li, Dong Huang, Zekai Sun, Heming Cui:
FoldMoE: Efficient Long Sequence MoE Training via Attention-MoE Pipelining. 3705-3717 - Jiajie Zhang, Zhongni Hou, Xin Lv, Shulin Cao, Zhenyu Hou, Yilin Niu, Lei Hou, Yuxiao Dong, Ling Feng, Juanzi Li:
LongReward: Improving Long-context Large Language Models with AI Feedback. 3718-3739 - Yuxi Xia, Pedro Henrique Luz de Araujo, Klim Zaporojets, Benjamin Roth:
Influences on LLM Calibration: A Study of Response Agreement, Loss Functions, and Prompt Styles. 3740-3761 - Boxi Yu, Yuxuan Zhu, Pinjia He, Daniel Kang:
UTBoost: Rigorous Evaluation of Coding Agents on SWE-Bench. 3762-3774 - Lekang Jiang, Pascal A. Scherz, Stefan Goetz:
Towards Better Evaluation for Generated Patent Claims. 3775-3788 - Haritz Puerto, Tilek Chubakov, Xiaodan Zhu, Harish Tayyar Madabushi, Iryna Gurevych:
Fine-Tuning on Diverse Reasoning Chains Drives Within-Inference CoT Refinement in LLMs. 3789-3808 - Kejian Zhu, Shangqing Tu, Zhuoran Jin, Lei Hou, Juanzi Li, Jun Zhao:
Establishing Trustworthy LLM Evaluation via Shortcut Neuron Analysis. 3809-3822 - Yanzhu Guo, Simone Conia, Zelin Zhou, Min Li, Saloni Potdar, Henry Xiao:
Do Large Language Models have an English Accent? Evaluating and Improving the Naturalness of Multilingual LLMs. 3823-3838 - Zhu Xu, Zhiqiang Zhao, Zihan Zhang, Yuchi Liu, Quanwei Shen, Fei Liu, Yu Kuang, Jian He, Conglin Liu:
Enhancing Character-Level Understanding in LLMs through Token Internal Structure Learning. 3839-3853 - Xiaochen Zhu, Caiqi Zhang, Tom Stafford, Nigel Collier, Andreas Vlachos:
Conformity in Large Language Models. 3854-3872 - Chenghao Sun, Zhen Huang, Yonggang Zhang, Le Lu, Houqiang Li, Xinmei Tian, Xu Shen, Jieping Ye:
Interpret and Improve In-Context Learning via the Lens of Input-Label Mappings. 3873-3895 - Lukas Kinder, Lukas Edman, Alexander Fraser, Tobias Käfer:
Positional Overload: Positional Debiasing and Context Window Extension for Large Language Models using Set Encoding. 3896-3908 - Weilin Zhao, Tengyu Pan, Xu Han, Yudi Zhang, Sun Ao, Yuxiang Huang, Kaihuo Zhang, Weilun Zhao, Yuxuan Li, Jie Zhou, Hao Zhou, Jianyong Wang, Maosong Sun, Zhiyuan Liu:
FR-Spec: Accelerating Large-Vocabulary Language Models via Frequency-Ranked Speculative Sampling. 3909-3921 - Congzhi Zhang, Jiawei Peng, Zhenglin Wang, Yilong Lai, Haowen Sun, Heng Chang, Fei Ma, Weijiang Yu:
VReST: Enhancing Reasoning in Large Vision-Language Models through Tree Search and Self-Reward Mechanism. 3922-3941 - Nianqi Li, Siyu Yuan, Jiangjie Chen, Jiaqing Liang, Feng Wei, Zujie Liang, Deqing Yang, Yanghua Xiao:
Past Meets Present: Creating Historical Analogy with Large Language Models. 3942-3957 - Yaoke Wang, Yun Zhu, Xintong Bao, Wenqiao Zhang, Suyang Dai, Kehan Chen, Wenqiang Li, Gang Huang, Siliang Tang, Yueting Zhuang:
Meta-Reflection: A Feedback-Free Reflection Learning Framework. 3958-3976 - Chen Zhang, Jiuheng Lin, Xiao Liu, Zekai Zhang, Yansong Feng:
Read it in Two Steps: Translating Extremely Low-Resource Languages with Code-Augmented Grammar Books. 3977-3997 - Zhe Yang, Yichang Zhang, Yudong Wang, Ziyao Xu, Junyang Lin, Zhifang Sui:
Confidence v.s. Critique: A Decomposition of Self-Correction Capability for LLMs. 3998-4014 - Kangcheng Luo, Quzhe Huang, Cong Jiang, Yansong Feng:
Automating Legal Interpretation with LLMs: Retrieval, Generation, and Evaluation. 4015-4047 - Wei Li, Zhen Huang, Houqiang Li, Le Lu, Yang Lu, Xinmei Tian, Xu Shen, Jieping Ye:
Visual Evidence Prompting Mitigates Hallucinations in Large Vision-Language Models. 4048-4080 - Shao Zhang, Xihuai Wang, Wenhao Zhang, Chaoran Li, Junru Song, Tingyu Li, Lin Qiu, Xuezhi Cao, Xunliang Cai, Wen Yao, Weinan Zhang, Xinbing Wang, Ying Wen:
Leveraging Dual Process Theory in Language Agent Framework for Real-time Simultaneous Human-AI Collaboration. 4081-4108 - Chong Li, Jiajun Zhang, Chengqing Zong:
TokAlign: Efficient Vocabulary Adaptation via Token Alignment. 4109-4126 - Qi Li, Xiaowen Chu:
AdaEdit: Advancing Continuous Knowledge Editing For Large Language Models. 4127-4149 - Byung-Doh Oh, William Schuler:
The Impact of Token Granularity on the Predictive Power of Language Model Surprisal. 4150-4162 - Xiaochen Zhu, Georgi Karadzhov, Chenxi Whitehouse, Andreas Vlachos:
Segment-Level Diffusion: A Framework for Controllable Long-Form Generation with Diffusion Language Models. 4163-4183 - Taolin Zhang, Dongyang Li, Qizhou Chen, Chengyu Wang, Xiaofeng He:
BELLE: A Bi-Level Multi-Agent Reasoning Framework for Multi-Hop Question Answering. 4184-4202 - Zhangyue Yin, Qiushi Sun, Zhiyuan Zeng, Qinyuan Cheng, Xipeng Qiu, Xuanjing Huang:
Dynamic and Generalizable Process Reward Modeling. 4203-4233 - Zixin Chen, Hongzhan Lin, Kaixin Li, Ziyang Luo, Zhen Ye, Guang Chen, Zhiyong Huang, Jing Ma:
AdamMeme: Adaptively Probe the Reasoning Capacity of Multimodal Large Language Models on Harmfulness. 4234-4253 - Xin Zhang, Ziqi Dai, Yongqi Li, Yanzhao Zhang, Dingkun Long, Pengjun Xie, Meishan Zhang, Jun Yu, Wenjie Li, Min Zhang:
Towards Text-Image Interleaved Retrieval. 4254-4269 - Guangcheng Zhu, Ruixuan Xiao, Haobo Wang, Zhen Zhu, Gengyu Lyu, Junbo Zhao:
Large Margin Representation Learning for Robust Cross-lingual Named Entity Recognition. 4270-4291 - Wei Sun, Qianlong Du, Fuwei Cui, Jiajun Zhang:
An Efficient and Precise Training Data Construction Framework for Process-supervised Reward Model in Mathematical Reasoning. 4292-4305 - Zhengren Wang, Qinhan Yu, Shida Wei, Zhiyu Li, Feiyu Xiong, Xiaoxing Wang, Simin Niu, Hao Liang, Wentao Zhang:
QAEncoder: Towards Aligned Representation Learning in Question Answering Systems. 4306-4332 - Jiale Hong, Hongqiu Wu, Hai Zhao:
Game Development as Human-LLM Interaction. 4333-4354 - Rena Wei Gao, Xuetong Wu, Tatsuki Kuribayashi, Mingrui Ye, Siya Qi, Carsten Roever, Yuanxing Liu, Zheng Yuan, Jey Han Lau:
Can LLMs Simulate L2-English Dialogue? An Information-Theoretic Analysis of L1-Dependent Biases. 4355-4379 - Zhuoqun Li, Haiyang Yu, Xuanang Chen, Hongyu Lin, Yaojie Lu, Fei Huang, Xianpei Han, Yongbin Li, Le Sun:
DeepSolution: Boosting Complex Engineering Solution Design via Tree-based Exploration and Bi-point Thinking. 4380-4396 - Viet Thanh Pham, Lizhen Qu, Zhuang Li, Suraj Sharma, Gholamreza Haffari:
SurveyPilot: an Agentic Framework for Automated Human Opinion Collection from Social Media. 4397-4422 - Daoze Zhang, Yuze Zhao, Jintao Huang, Yingda Chen:
Sharper and Faster mean Better: Towards More Efficient Vision-Language Model for Hour-scale Long Video Understanding. 4423-4439 - Ruochen Zhao, Wenxuan Zhang, Yew Ken Chia, Weiwen Xu, Deli Zhao, Lidong Bing:
Auto-Arena: Automating LLM Evaluations with Agent Peer Battles and Committee Discussions. 4440-4463 - Andrea Pedrotti, Giulia Rambelli, Caterina Villani, Marianna Bolognesi:
How Humans and LLMs Organize Conceptual Knowledge: Exploring Subordinate Categories in Italian. 4464-4482 - Jiaqi Zhao, Miao Zhang, Ming Wang, Yuzhang Shang, Kaihao Zhang, Weili Guan, Yaowei Wang, Min Zhang:
PTQ1.61: Push the Real Limit of Extremely Low-Bit Post-Training Quantization Methods for Large Language Models. 4483-4502 - Bowen Wei, Ziwei Zhu:
ProtoLens: Advancing Prototype Learning for Fine-Grained Interpretability in Text Classification. 4503-4523 - Chaoqun Cui, Liangbin Huang, Shijing Wang, Zhe Tong, Zhaolong Huang, Xiao Zeng, Xiaofeng Liu:
Fine-grained Video Dubbing Duration Alignment with Segment Supervised Preference Optimization. 4524-4546 - Chunlei Xin, Shuheng Zhou, Huijia Zhu, Weiqiang Wang, Xuanang Chen, Xinyan Guan, Yaojie Lu, Hongyu Lin, Xianpei Han, Le Sun:
Sparse Latents Steer Retrieval-Augmented Generation. 4547-4562 - Boyi Deng, Yu Wan, Baosong Yang, Yidan Zhang, Fuli Feng:
Unveiling Language-Specific Features in Large Language Models via Sparse Autoencoders. 4563-4608 - Xun Liang, Simin Niu, Zhiyu Li, Sensen Zhang, Hanyu Wang, Feiyu Xiong, Jason Zhaoxin Fan, Bo Tang, Jihao Zhao, Jiawei Yang, Shichao Song, Mengwei Wang:
SafeRAG: Benchmarking Security in Retrieval-Augmented Generation of Large Language Model. 4609-4631 - Guo Tang, Zheng Chu, Wenxiang Zheng, Junjia Xiang, Yizhuo Li, Weihao Zhang, Ming Liu, Bing Qin:
AnRe: Analogical Replay for Temporal Knowledge Graph Forecasting. 4632-4650 - Zhiyuan Zeng, Qinyuan Cheng, Zhangyue Yin, Yunhua Zhou, Xipeng Qiu:
Revisiting the Test-Time Scaling of o1-like Models: Do they Truly Possess Test-Time Scaling Capabilities? 4651-4665 - Zitai Qiu, Congbo Ma, Jia Wu, Jian Yang:
Text is All You Need: LLM-enhanced Incremental Social Event Detection. 4666-4680 - Tong Liu, Zhixin Lai, Jiawen Wang, Gengyuan Zhang, Shuo Chen, Philip Torr, Vera Demberg, Volker Tresp, Jindong Gu:
Multimodal Pragmatic Jailbreak on Text-to-image Models. 4681-4720 - Xingcheng Xu, Zibo Zhao, Haipeng Zhang, Yanqing Yang:
Principled Understanding of Generalization for Generative Transformer Models in Arithmetic Reasoning Tasks. 4721-4747 - Wei Liu, Michael Strube:
Discourse Relation-Enhanced Neural Coherence Modeling. 4748-4762 - Kuofeng Gao, Shutao Xia, Ke Xu, Philip Torr, Jindong Gu:
Benchmarking Open-ended Audio Dialogue Understanding for Large Audio-Language Models. 4763-4784 - Yu Yan, Sheng Sun, Zenghao Duan, Teli Liu, Min Liu, Zhiyi Yin, LeiJingyu LeiJingyu, Qi Li:
from Benign import Toxic: Jailbreaking the Language Model via Adversarial Metaphors. 4785-4817 - Hengyuan Zhang, Chenming Shang, Sizhe Wang, Dongdong Zhang, Yiyao Yu, Feng Yao, Renliang Sun, Yujiu Yang, Furu Wei:
ShifCon: Enhancing Non-Dominant Language Capabilities with a Shift-based Multilingual Contrastive Framework. 4818-4841 - Zongqi Wang, Tianle Gu, Baoyuan Wu, Yujiu Yang:
MorphMark: Flexible Adaptive Watermarking for Large Language Models. 4842-4860 - Chenlong Deng, Zhisong Zhang, Kelong Mao, Shuaiyi Li, Xinting Huang, Dong Yu, Zhicheng Dou:
A Silver Bullet or a Compromise for Full Attention? A Comprehensive Study of Gist Token-based Context Compression. 4861-4879 - Cassie Huang, Li Zhang:
On the Limit of Language Models as Planning Formalizers. 4880-4904 - Yaxi Lu, Haolun Li, Xin Cong, Zhong Zhang, Yesai Wu, Yankai Lin, Zhiyuan Liu, Fangming Liu, Maosong Sun:
Learning to Generate Structured Output with Schema Reinforcement Learning. 4905-4918 - Peichao Lai, Zhengfeng Zhang, Wentao Zhang, Fangcheng Fu, Bin Cui:
Enhancing Unsupervised Sentence Embeddings via Knowledge-Driven Data Augmentation and Gaussian-Decayed Contrastive Learning. 4919-4940 - Peijian Gu, Quan Wang, Zhendong Mao:
Improve Safety Training of Large Language Models with Safety-Critical Singular Vectors Localization. 4941-4954 - Huawen Feng, Pu Zhao, Qingfeng Sun, Can Xu, Fangkai Yang, Lu Wang, Qianli Ma, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang:
WarriorCoder: Learning from Expert Battles to Augment Code Large Language Models. 4955-4969 - Junqing Gong, Binhan Yang, Wei Shen:
A Triple-View Framework for Fine-Grained Emotion Classification with Clustering-Guided Contrastive Learning. 4970-4984 - Sunbowen Lee, Junting Zhou, Chang Ao, Kaige Li, Xeron Du, Sirui He, Haihong Wu, Tianci Liu, Jiaheng Liu, Hamid Alinejad-Rokny, Min Yang, Yitao Liang, Zhoufutu Wen, Shiwen Ni:
Quantification of Large Language Model Distillation. 4985-5004 - Zihan Qiu, Zeyu Huang, Bo Zheng, Kaiyue Wen, Zekun Wang, Rui Men, Ivan Titov, Dayiheng Liu, Jingren Zhou, Junyang Lin:
Demons in the Detail: On Implementing Load Balancing Loss for Training Specialized Mixture-of-Expert Models. 5005-5018 - Jinyang Wu, Shuai Zhang, Feihu Che, Mingkuan Feng, Pengpeng Shao, Jianhua Tao:
Pandora's Box or Aladdin's Lamp: A Comprehensive Analysis Revealing the Role of RAG Noise in Large Language Models. 5019-5039 - Jingyu Peng, Maolin Wang, Xiangyu Zhao, Kai Zhang, Wanyu Wang, Pengyue Jia, Qidong Liu, Ruocheng Guo, Qi Liu:
Stepwise Reasoning Disruption Attack of LLMs. 5040-5058 - Qiyuan Zhang, Yufei Wang, Yuxin Jiang, Liangyou Li, Chuhan Wu, Yasheng Wang, Xin Jiang, Lifeng Shang, Ruiming Tang, Fuyuan Lyu, Chen Ma:
Crowd Comparative Reasoning: Unlocking Comprehensive Evaluations for LLM-as-a-Judge. 5059-5074 - Mingyang Wang, Heike Adel, Lukas Lange, Yihong Liu, Ercong Nie, Jannik Strötgen, Hinrich Schütze:
Lost in Multilinguality: Dissecting Cross-lingual Factual Inconsistency in Transformer Language Models. 5075-5094 - Yining Lu, Noah Ziems, Hy Dang, Meng Jiang:
Optimizing Decomposition for Optimal Claim Verification. 5095-5114 - Kai Yao, Zhaorui Tan, Penglei Gao, Lichun Li, Kaixin Wu, Yinggui Wang, Yuan Zhao, Yixin Ji, Jianke Zhu, Wei Wang:
GradOT: Training-free Gradient-preserving Offsite-tuning for Large Language Models. 5115-5130 - Moxin Li, Yong Zhao, Wenxuan Zhang, Shuaiyi Li, Wenya Xie, See-Kiong Ng, Tat-Seng Chua, Yang Deng:
Knowledge Boundary of Large Language Models: A Survey. 5131-5157 - Hai-Long Sun, Zhun Sun, Houwen Peng, Han-Jia Ye:
Mitigating Visual Forgetting via Take-along Visual Conditioning for Multi-modal Long CoT Reasoning. 5158-5171 - Jihao Zhao, Zhiyuan Ji, Zhaoxin Fan, Hanyu Wang, Simin Niu, Bo Tang, Feiyu Xiong, Zhiyu Li:
MoC: Mixtures of Text Chunking Learners for Retrieval-Augmented Generation System. 5172-5189 - Hyeong Kyu Choi, Weijie Xu, Chi Xue, Stephanie Eckman, Chandan K. Reddy:
Mitigating Selection Bias with Node Pruning and Auxiliary Options. 5190-5215 - Luhao Zhang, Xinyu Zhang, Linmei Hu, Dandan Song, Liqiang Nie:
Dually Self-Improved Counterfactual Data Augmentation Using Large Language Model. 5216-5227 - Shi-Qi Yan, Quan Liu, Zhen-Hua Ling:
RPO: Retrieval Preference Optimization for Robust Retrieval-Augmented Generation. 5228-5240 - Yanyang Li, Michael R. Lyu, Liwei Wang:
Learning to Reason from Feedback at Test-Time. 5241-5253 - Zecheng Tang, Keyan Zhou, Juntao Li, Baibei Ji, Jianye Hou, Min Zhang:
L-CiteEval: A Suite for Evaluating Fidelity of Long-context Models. 5254-5277 - Trisha Das, Afrah Shafquat, Mandis Beigi, Jacob Aptekar, Jimeng Sun:
SECRET: Semi-supervised Clinical Trial Document Similarity Search. 5278-5291 - Jin Hwa Lee, Thomas Jiralerspong, Lei Yu, Yoshua Bengio, Emily Cheng:
Geometric Signatures of Compositionality Across a Language Model's Lifetime. 5292-5320 - Maxime Griot, Jean Vanderdonckt, Demet Yüksel, Coralie Hemptinne:
Pattern Recognition or Medical Knowledge? The Problem with Multiple-Choice Questions in Medicine. 5321-5341 - Jenna Russell, Marzena Karpinska, Mohit Iyyer:
People who frequently use ChatGPT for writing tasks are accurate and robust detectors of AI-generated text. 5342-5373 - Yiwen Hu, Huatong Song, Jie Chen, Jia Deng, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Zican Dong, Yang Lu, Xu Miao, Xin Zhao, Ji-Rong Wen:
YuLan-Mini: Pushing the Limits of Open Data-efficient Language Model. 5374-5400 - Timothee Mickus, Aman Sinha, Raúl Vázquez:
Your Model is Overconfident, and Other Lies We Tell Ourselves. 5401-5417 - Weixuan Wang, Minghao Wu, Barry Haddow, Alexandra Birch:
Bridging the Language Gaps in Large Language Models with Inference-Time Cross-Lingual Intervention. 5418-5433 - Kyeonghyun Kim, Jinhee Jang, Juhwan Choi, Yoonji Lee, Kyohoon Jin, YoungBin Kim:
Plug-in and Fine-tuning: Bridging the Gap between Small Language Models and Large Language Models. 5434-5452 - Han Meng, Yancan Chen, Yunan Li, Yitian Yang, Jungup Lee, Renwen Zhang, Yi-Chieh Lee:
What is Stigma Attributed to? A Theory-Grounded, Expert-Annotated Interview Corpus for Demystifying Mental-Health Stigma. 5453-5490 - Yuguo Yin, Yuxin Xie, Wenyuan Yang, Dongchao Yang, Jinghan Ru, Xianwei Zhuang, Liming Liang, Yuexian Zou:
ATRI: Mitigating Multilingual Audio Text Retrieval Inconsistencies by Reducing Data Distribution Errors. 5491-5504 - Tianshi Zheng, Jiazheng Wang, Zihao Wang, Jiaxin Bai, Hang Yin, Zheye Deng, Yangqiu Song, Jianxin Li:
Enhancing Transformers for Generalizable First-Order Logical Entailment. 5505-5524 - Yufan Zhuang, Xiaodong Yu, Jialian Wu, Ximeng Sun, Ze Wang, Jiang Liu, Yusheng Su, Jingbo Shang, Zicheng Liu, Emad Barsoum:
Self-Taught Agentic Long Context Understanding. 5525-5537 - Shahrad Mohammadzadeh, Juan David Guerra, Marco Bonizzato, Reihaneh Rabbany, Golnoosh Farnadi:
Hallucination Detox: Sensitivity Dropout (SenD) for Large Language Model Training. 5538-5554 - Qiushi Sun, Kanzhi Cheng, Zichen Ding, Chuanyang Jin, Yian Wang, Fangzhi Xu, Zhenyu Wu, Chengyou Jia, Liheng Chen, Zhoumianze Liu, Ben Kao, Guohao Li, Junxian He, Yu Qiao, Zhiyong Wu:
OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis. 5555-5579 - Yepeng Weng, Dianwen Mei, Huishi Qiu, Xujie Chen, Li Liu, Jiang Tian, Zhongchao Shi:
CORAL: Learning Consistent Representations across Multi-step Training with Lighter Speculative Drafter. 5580-5593 - Antonin Poché, Alon Jacovi, Agustin Martin Picard, Victor Boutin, Fanny Jourdan:
ConSim: Measuring Concept-Based Explanations' Effectiveness with Automated Simulatability. 5594-5615 - Omer Shubi, Cfir Avraham Hadar, Yevgeni Berzak:
Decoding Reading Goals from Eye Movements. 5616-5637 - Si Wu, Sebastian Bruch:
Uncovering Visual-Semantic Psycholinguistic Properties from the Distributional Structure of Text Embedding Space. 5638-5649 - Bin Xie, Rui Shao, Gongwei Chen, Kaiwen Zhou, Yinchuan Li, Jie Liu, Min Zhang, Liqiang Nie:
GUI-explorer: Autonomous Exploration and Mining of Transition-aware Knowledge for GUI Agent. 5650-5667 - Xiaodong Chen, Yuxuan Hu, Xiaokang Zhang, Yanling Wang, Cuiping Li, Hong Chen, Jing Zhang:
P² Law: Scaling Law for Post-Training After Model Pruning. 5668-5686 - Kuleen Sasse, Carlos Alejandro Aguirre, Isabel Cachola, Sharon Levy, Mark Dredze:
Making FETCH! Happen: Finding Emergent Dog Whistles Through Common Habitats. 5687-5709 - Shihan Dou, Jiayi Chen, Chenhao Huang, Feng Chen, Wei Chengzhi, Huiyuan Zheng, Shichun Liu, Yan Liu, Chenxiao Liu, Chao Xin, Lin Yan, Zongzhang Zhang, Tao Gui, Qi Zhang, Xuanjing Huang:
Lost in the Context: Insufficient and Distracted Attention to Contexts in Preference Modeling. 5710-5728 - Jinu Lee, Qi Liu, Runzhi Ma, Vincent Han, Ziqi Wang, Heng Ji, Julia Hockenmaier:
Entailment-Preserving First-order Logic Representations in Natural Language Entailment. 5729-5742 - Duzhen Zhang, Yong Ren, Zhong-Zhi Li, Yahan Yu, Jiahua Dong, Chenxing Li, Zhilong Ji, Jinfeng Bai:
Enhancing Multimodal Continual Instruction Tuning with BranchLoRA. 5743-5756 - Yoav Gur-Arieh, Roy Mayan, Chen Agassy, Atticus Geiger, Mor Geva:
Enhancing Automated Interpretability with Output-Centric Feature Descriptions. 5757-5778 - Jie Chen, Zhipeng Chen, Jiapeng Wang, Kun Zhou, Yutao Zhu, Jinhao Jiang, Yingqian Min, Xin Zhao, Zhicheng Dou, Jiaxin Mao, Yankai Lin, Ruihua Song, Jun Xu, Xu Chen, Rui Yan, Zhewei Wei, Di Hu, Wenbing Huang, Ji-Rong Wen:
Towards Effective and Efficient Continual Pre-training of Large Language Models. 5779-5795 - Yihao Huang, Chong Wang, Xiaojun Jia, Qing Guo, Felix Juefei-Xu, Jian Zhang, Yang Liu, Geguang Pu:
Efficient Universal Goal Hijacking with Semantics-guided Prompt Organization. 5796-5816 - Anwen Hu, Haiyang Xu, Liang Zhang, Jiabo Ye, Ming Yan, Ji Zhang, Qin Jin, Fei Huang, Jingren Zhou:
mPLUG-DocOwl2: High-resolution Compressing for OCR-free Multi-page Document Understanding. 5817-5834 - Do Xuan Long, Duy Dinh, Ngoc-Hai Nguyen, Kenji Kawaguchi, Nancy F. Chen, Shafiq Joty, Min-Yen Kan:
What Makes a Good Natural Language Prompt? 5835-5873 - Weiqi Wu, Hongqiu Wu, Hai Zhao:
X-TURING: Towards an Enhanced and Efficient Turing Test for Long-Term Dialogue Agents. 5874-5889 - Shivani Kumar, David Jurgens:
Are Rules Meant to be Broken? Understanding Multilingual Moral Reasoning as a Computational Pipeline with UniMoral. 5890-5912 - Zheyuan Liu, Guangyao Dou, Xiangchi Yuan, Chunhui Zhang, Zhaoxuan Tan, Meng Jiang:
Modality-Aware Neuron Pruning for Unlearning in Multimodal Large Language Models. 5913-5933 - Zheyuan Zhang, Yiyang Li, Nhi Ha Lan Le, Zehong Wang, Tianyi Ma, Vincent Galassi, Keerthiram Murugesan, Nuno Moniz, Werner Geyer, Nitesh V. Chawla, Chuxu Zhang, Yanfang Ye:
NGQA: A Nutritional Graph Question Answering Benchmark for Personalized Health-aware Nutritional Reasoning. 5934-5966 - Haoming Xu, Ningyuan Zhao, Liming Yang, Sendong Zhao, Shumin Deng, Mengru Wang, Bryan Hooi, Nay Oo, Huajun Chen, Ningyu Zhang:
ReLearn: Unlearning via Learning for Large Language Models. 5967-5987 - Pritom Saha Akash, Kevin Chen-Chuan Chang:
Understanding Cross-Domain Adaptation in Low-Resource Topic Modeling. 5988-6001 - Boyang Xue, Fei Mi, Qi Zhu, Hongru Wang, Rui Wang, Sheng Wang, Erxin Yu, Xuming Hu, Kam-Fai Wong:
UAlign: Leveraging Uncertainty Estimations for Factuality Alignment on Large Language Models. 6002-6024 - Xinyin Ma, Guangnian Wan, Runpeng Yu, Gongfan Fang, Xinchao Wang:
CoT-Valve: Length-Compressible Chain-of-Thought Tuning. 6025-6035 - Jie Ouyang, Tingyue Pan, Mingyue Cheng, Ruiran Yan, Yucong Luo, Jiaying Lin, Qi Liu:
HoH: A Dynamic Benchmark for Evaluating the Impact of Outdated Information on Retrieval-Augmented Generation. 6036-6063 - Qiwei Zhao, Dong Li, Yanchi Liu, Wei Cheng, Yiyou Sun, Mika Oishi, Takao Osaki, Katsushi Matsuda, Huaxiu Yao, Chen Zhao, Haifeng Chen, Xujiang Zhao:
Uncertainty Propagation on LLM Agent. 6064-6073 - Valeria Ruscio, Umberto Nanni, Fabrizio Silvestri:
Beyond Position: the emergence of wavelet-like properties in Transformers. 6074-6088 - Giovanni Servedio, Alessandro De Bellis, Dario Di Palma, Vito Walter Anelli, Tommaso Di Noia:
Are the Hidden States Hiding Something? Testing the Limits of Factuality-Encoding Capabilities in LLMs. 6089-6104 - Zheyuan Liu, Suraj Maharjan, Fanyou Wu, Rahil Parikh, Belhassen Bayar, Srinivasan H. Sengamedu, Meng Jiang:
Disentangling Biased Knowledge from Reasoning in Large Language Models via Machine Unlearning. 6105-6123 - Dario Di Palma, Alessandro De Bellis, Giovanni Servedio, Vito Walter Anelli, Fedelucio Narducci, Tommaso Di Noia:
LLaMAs Have Feelings Too: Unveiling Sentiment and Emotion Representations in LLaMA Models Through Probing. 6124-6142 - Yayu Cao, Tianxiang Wang, Lvxiaowei Xu, Zhenyao Wang, Ming Cai:
CxGGEC: Construction-Guided Grammatical Error Correction. 6143-6156 - Xiangyu Zhang, Yu Zhou, Guang Yang, Wei Cheng, Taolue Chen:
Beyond Sequences: Two-dimensional Representation and Dependency Encoding for Code Generation. 6157-6172 - Qing Li, Jiahui Geng, Zongxiong Chen, Derui Zhu, Yuxia Wang, Congbo Ma, Chenyang Lyu, Fakhri Karray:
HD-NDEs: Neural Differential Equations for Hallucination Detection in LLMs. 6173-6186 - Dongqi Liu, Chenxi Whitehouse, Xi Yu, Louis Mahon, Rohit Saxena, Zheng Zhao, Yifu Qiu, Mirella Lapata, Vera Demberg:
What Is That Talk About? A Video-to-Text Summarization Dataset for Scientific Presentations. 6187-6210 - Ruisheng Cao, Hanchong Zhang, Tiancheng Huang, Zhangyi Kang, Yuxin Zhang, Liangtai Sun, Hanqi Li, Yuxun Miao, Shuai Fan, Lu Chen, Kai Yu:
NeuSym-RAG: Hybrid Neural Symbolic Retrieval with Multiview Structuring for PDF Question Answering. 6211-6239 - Xiuxuan Shen, Zhongyuan Jiang, Junsan Zhang, Junxiao Han, Yao Wan, Chengjie Guo, Bingcheng Liu, Jie Wu, Renxiang Li, Philip S. Yu:
ProvBench: A Benchmark of Legal Provision Recommendation for Contract Auto-Reviewing. 6240-6254 - Yushen Chen, Zhikang Niu, Ziyang Ma, Keqi Deng, Chunhui Wang, Jian Zhao, Kai Yu, Xie Chen:
F5-TTS: A Fairytaler that Fakes Fluent and Faithful Speech with Flow Matching. 6255-6271 - Xiechi Zhang, Zetian Ouyang, Linlin Wang, Gerard de Melo, Zhu Cao, Xiaoling Wang, Ya Zhang, Yanfeng Wang, Liang He:
AutoMedEval: Harnessing Language Models for Automatic Medical Capability Evaluation. 6272-6285 - Bohan Zhang, Xiaokang Zhang, Jing Zhang, Jifan Yu, Sijia Luo, Jie Tang:
CoT-based Synthesizer: Enhancing LLM Performance through Answer Synthesis. 6286-6303 - Xuandong Zhao, Chenwen Liao, Yuxiang Wang, Lei Li:
Efficiently Identifying Watermarked Segments in Mixed-Source Texts. 6304-6316 - Fangru Lin, Shaoguang Mao, Emanuele La Malfa, Valentin Hofmann, Adrian de Wynter, Xun Wang, Si-Qing Chen, Michael J. Wooldridge, Janet B. Pierrehumbert, Furu Wei:
Assessing Dialect Fairness and Robustness of Large Language Models in Reasoning Tasks. 6317-6342 - Qing Wang, Yuepei Li, Qiao Qiao, Kang Zhou, Qi Li:
Towards a More Generalized Approach in Open Relation Extraction. 6343-6354 - Viktor Moskvoretskii, Maria Marina, Mikhail Salnikov, Nikolay Ivanov, Sergey Pletenev, Daria Galimzianova, Nikita Krayko, Vasily Konovalov, Irina Nikishina, Alexander Panchenko:
Adaptive Retrieval Without Self-Knowledge? Bringing Uncertainty Back Home. 6355-6384 - Seungone Kim, Juyoung Suk, Xiang Yue, Vijay Viswanathan, Seongyun Lee, Yizhong Wang, Kiril Gashteovski, Carolin Lawrence, Sean Welleck, Graham Neubig:
Evaluating Language Models as Synthetic Data Generators. 6385-6403 - Yuyao Ge, Shenghua Liu, Baolong Bi, Yiwei Wang, Lingrui Mei, Wenjie Feng, Lizhe Chen, Xueqi Cheng:
Can Graph Descriptive Order Affect Solving Graph Problems with LLMs? 6404-6420 - Wei Hao, Ran Li, Weiliang Zhao, Junfeng Yang, Chengzhi Mao:
Learning to Rewrite: Generalized LLM-Generated Text Detection. 6421-6434 - Linhao Yu, Xingguang Ji, Yahui Liu, Fanheng Kong, Chenxi Sun, Jingyuan Zhang, Hongzhi Zhang, Victoria W., Fuzheng Zhang, Deyi Xiong:
Evaluating Multimodal Large Language Models on Video Captioning via Monte Carlo Tree Search. 6435-6462 - Maxim Zhelnin, Viktor Moskvoretskii, Egor Shvetsov, Mariya Krylova, Egor Venediktov, Aleksandr Zuev, Evgeny Burnaev:
GIFT-SW: Gaussian noise Injected Fine-Tuning of Salient Weights for LLMs. 6463-6480 - Hong Huang, Dapeng Wu:
Quaff: Quantized Parameter-Efficient Fine-Tuning under Outlier Spatial Stability Hypothesis. 6481-6496 - Atsuyuki Miyai, Jingkang Yang, Jingyang Zhang, Yifei Ming, Qing Yu, Go Irie, Yixuan Li, Hai Helen Li, Ziwei Liu, Kiyoharu Aizawa:
Unsolvable Problem Detection: Robust Understanding Evaluation for Large Multimodal Models. 6497-6540 - Yuhang Wu, Wenmeng Yu, Yean Cheng, Yan Wang, Xiaohan Zhang, Jiazheng Xu, Ming Ding, Yuxiao Dong:
AlignMMBench: Evaluating Chinese Multimodal Alignment in Large Vision-Language Models. 6541-6558 - Jillian Fisher, Shangbin Feng, Robert Aron, Thomas Richardson, Yejin Choi, Daniel W. Fisher, Jennifer Pan, Yulia Tsvetkov, Katharina Reinecke:
Biased LLMs can Influence Political Decision-Making. 6559-6607 - T. Y. S. S. Santosh, Tuan-Quang Vuong:
LexTempus: Enhancing Temporal Generalizability of Legal Language Models Through Dynamic Mixture of Experts. 6608-6624 - Soda Marem Lo, Oscar Araque, Rajesh Sharma, Marco Antonio Stranisci:
That is Unacceptable: the Moral Foundations of Canceling. 6625-6639 - Jun Yin, Pengyu Zeng, Haoyuan Sun, Yuqin Dai, Han Zheng, Miao Zhang, Yachao Zhang, Shuai Lu:
FloorPlan-LLaMa: Aligning Architects' Feedback and Domain Knowledge in Architectural Floor Plan Generation. 6640-6662 - Max Ku, Cheuk Hei Chong, Jonathan Leung, Krish Shah, Alvin Yu, Wenhu Chen:
TheoremExplainAgent: Towards Video-based Multimodal Explanations for LLM Theorem Understanding. 6663-6684 - Guizhen Chen, Weiwen Xu, Hao Zhang, Hou Pong Chan, Chaoqun Liu, Lidong Bing, Deli Zhao, Anh Tuan Luu, Yu Rong:
FineReason: Evaluating and Improving LLMs' Deliberate Reasoning through Reflective Puzzle Solving. 6685-6715 - Sergey Berezin, Reza Farahbakhsh, Noël Crespi:
The TIP of the Iceberg: Revealing a Hidden Class of Task-in-Prompt Adversarial Attacks on LLMs. 6716-6730 - Léane Jourdan, Nicolas Hernandez, Florian Boudin, Richard Dufour:
Identifying Reliable Evaluation Metrics for Scientific Text Revision. 6731-6756 - Liwei Jiang, Taylor Sorensen, Sydney Levine, Yejin Choi:
Can Language Models Reason about Individualistic Human Values and Preferences? 6757-6794 - Dmitry Morozov, Lizaveta Astapenka, Anna V. Glazkova, Timur Garipov, Olga Lyashevskaya:
BERT-like Models for Slavic Morpheme Segmentation. 6795-6815 - Xianzhen Luo, Yixuan Wang, Qingfu Zhu, Zhiming Zhang, Xuanyu Zhang, Qing Yang, Dongliang Xu:
Turning Trash into Treasure: Accelerating Inference of Large Language Models with Token Recycling. 6816-6831 - Xinyu Tang, Xiaolei Wang, Zhihao Lv, Yingqian Min, Xin Zhao, Binbin Hu, Ziqi Liu, Zhiqiang Zhang:
Unlocking General Long Chain-of-Thought Reasoning Capabilities of Large Language Models via Representation Engineering. 6832-6849 - Jiazheng Li, Hanqi Yan, Yulan He:
Drift: Enhancing LLM Faithfulness in Rationale Generation via Dual-Reward Probabilistic Inference. 6850-6866 - Angelina Wang, Michelle Phan, Daniel E. Ho, Sanmi Koyejo:
Fairness through Difference Awareness: Measuring Desired Group Discrimination in LLMs. 6867-6893 - Shojiro Yamabe, Futa Kai Waseda, Tsubasa Takahashi, Koki Wataoka:
MergePrint: Merge-Resistant Fingerprints for Robust Black-box Ownership Verification of Large Language Models. 6894-6916 - Zeyao Ma, Xiaokang Zhang, Jing Zhang, Jifan Yu, Sijia Luo, Jie Tang:
Dynamic Scaling of Unit Tests for Code Reward Modeling. 6917-6935 - Fengran Mo, Yifan Gao, Chuan Meng, Xin Liu, Zhuofeng Wu, Kelong Mao, Zhengyang Wang, Pei Chen, Zheng Li, Xian Li, Bing Yin, Meng Jiang:
UniConv: Unifying Retrieval and Response Generation for Large Language Models in Conversations. 6936-6949 - Minghao Lv, Siyuan Chen, Haoan Jin, Minghao Yuan, Qianqian Ju, Yujia Peng, Kenny Q. Zhu, Mengyue Wu:
Tracking Life's Ups and Downs: Mining Life Events from Social Media Posts for Mental Health Analysis. 6950-6965 - Shengpeng Ji, Qian Chen, Wen Wang, Jialong Zuo, Minghui Fang, Ziyue Jiang, Hai Huang, Zehan Wang, Xize Cheng, Siqi Zheng, Zhou Zhao:
ControlSpeech: Towards Simultaneous and Independent Zero-shot Speaker Cloning and Zero-shot Language Style Control. 6966-6981 - Haoran Que, Wenge Rong:
PIC: Unlocking Long-Form Text Generation Capabilities of Large Language Models via Position ID Compression. 6982-6995 - Dasha Metropolitansky, Jonathan Larson:
Towards Effective Extraction and Evaluation of Factual Claims. 6996-7045 - Yijie Hao, Haofei Yu, Jiaxuan You:
Beyond Facts: Evaluating Intent Hallucination in Large Language Models. 7046-7069 - Yida Zhao, Hao Xve, Xiang Hu, Kewei Tu:
A Systematic Study of Compositional Syntactic Transformer Language Models. 7070-7083 - Zhaopeng Feng, Jiayuan Su, Jiamei Zheng, Jiahan Ren, Yan Zhang, Jian Wu, Hongwei Wang, Zuozhu Liu:
M-MAD: Multidimensional Multi-Agent Debate for Advanced Machine Translation Evaluation. 7084-7107 - Shuangrui Ding, Zihan Liu, Xiaoyi Dong, Pan Zhang, Rui Qian, Junhao Huang, Conghui He, Dahua Lin, Jiaqi Wang:
SongComposer: A Large Language Model for Lyric and Melody Generation in Song Composition. 7108-7127 - Jinghao Zhang, Yuting Liu, Wenjie Wang, Qiang Liu, Shu Wu, Liang Wang, Tat-Seng Chua:
Personalized Text Generation with Contrastive Activation Steering. 7128-7141 - Siyuan Huang, Zhiyuan Ma, Jintao Du, Changhua Meng, Weiqiang Wang, Jingwen Leng, Minyi Guo, Zhouhan Lin:
Gumbel Reranking: Differentiable End-to-End Reranker Optimization. 7142-7161 - Lester James Validad Miranda, Yizhong Wang, Yanai Elazar, Sachin Kumar, Valentina Pyatkin, Faeze Brahman, Noah A. Smith, Hannaneh Hajishirzi, Pradeep Dasigi:
Hybrid Preferences: Learning to Route Instances for Human vs. AI Feedback. 7162-7200 - Yi-Fan Lu, Xian-Ling Mao, Tian Lan, Tong Zhang, Yu-Shi Zhu, Heyan Huang:
SEOE: A Scalable and Reliable Semantic Evaluation Framework for Open Domain Event Detection. 7201-7218 - Angelina Aspra Aquino, Lester James Validad Miranda, Elsie Marie T. Or:
The UD-NewsCrawl Treebank: Reflections and Challenges from a Large-scale Tagalog Syntactic Annotation Project. 7219-7239 - Jennifer Chen, Aidar Myrzakhan, Yaxin Luo, Hassaan Muhammad Khan, Sondos Mahmoud Bsharat, Zhiqiang Shen:
DRAG: Distilling RAG for SLMs from LLMs to Transfer Knowledge and Mitigate Hallucination via Evidence and Graph-based Distillation. 7240-7260 - Shilong Wang, Guibin Zhang, Miao Yu, Guancheng Wan, Fanci Meng, Chongye Guo, Kun Wang, Yang Wang:
G-Safeguard: A Topology-Guided Security Lens and Treatment on LLM-based Multi-agent Systems. 7261-7276 - Bumjin Park, Leejinsil Leejinsil, Jaesik Choi:
Deontological Keyword Bias: The Impact of Modal Expressions on Normative Judgments of Language Models. 7277-7296 - Weijie Shi, Han Zhu, Jiaming Ji, Mengze Li, Jipeng Zhang, Ruiyuan Zhang, Jia Zhu, Jiajie Xu, Sirui Han, Yike Guo:
LegalReasoner: Step-wised Verification-Correction for Legal Judgment Reasoning. 7297-7313 - Maggie Mi, Aline Villavicencio, Nafise Sadat Moosavi:
Rolling the DICE on Idiomaticity: How LLMs Fail to Grasp Context. 7314-7332 - Xuanle Zhao, Xianzhen Luo, Qi Shi, Chi Chen, Shuo Wang, Zhiyuan Liu, Maosong Sun:
ChartCoder: Advancing Multimodal Large Language Model for Chart-to-Code Generation. 7333-7348 - Nina Gregorio, Matteo Gay, Sharon Goldwater, Edoardo M. Ponti:
The Cross-linguistic Role of Animacy in Grammar Structures. 7349-7363 - Ayush Maheshwari, Atul Kumar Singh, N. J. Karthika, Krishnakant Bhatt, Preethi Jyothi, Ganesh Ramakrishnan:
LexGen: Domain-aware Multilingual Lexicon Generation. 7364-7375 - Tianyu Gao, Alexander Wettig, Howard Yen, Danqi Chen:
How to Train Long-Context Language Models (Effectively). 7376-7399 - Qizhi Pei, Lijun Wu, Zhuoshi Pan, Yu Li, Honglin Lin, Chenlin Ming, Xin Gao, Conghui He, Rui Yan:
MathFusion: Enhancing Mathematical Problem-solving of LLM through Instruction Fusion. 7400-7420 - Ramon Ruiz-Dolz, Zlata Kikteva, John Lawrence:
Mining Complex Patterns of Argumentative Reasoning in Natural Language Dialogue. 7421-7435 - Xueyu Hu, Tao Xiong, Biao Yi, Zishu Wei, Ruixuan Xiao, Yurun Chen, Jiasheng Ye, Meiling Tao, Xiangxin Zhou, Ziyu Zhao, Yuhuai Li, Shengze Xu, Shenzhi Wang, Xinchen Xu, Shuofei Qiao, Zhaokai Wang, Kun Kuang, Tieyong Zeng, Liang Wang, Jiwei Li, Yuchen Eleanor Jiang, Wangchunshu Zhou, Guoyin Wang, Keting Yin, Zhou Zhao, Hongxia Yang, Fan Wu, Shengyu Zhang, Fei Wu:
OS Agents: A Survey on MLLM-based Agents for Computer, Phone and Browser Use. 7436-7465 - Mingfei Lau, Qian Chen, Yeming Fang, Tingting Xu, Tongzhou Chen, Pavel Golik:
Data Quality Issues in Multilingual Speech Datasets: The Need for Sociolinguistic Awareness and Proactive Language Planning. 7466-7492 - Amr Mohamed, Mingmeng Geng, Michalis Vazirgiannis, Guokan Shang:
LLM as a Broken Telephone: Iterative Generation Distorts Information. 7493-7509 - Jianshu Zhang, Dongyu Yao, Renjie Pi, Paul Pu Liang, Yi R. Fung:
VLM2-Bench: A Closer Look at How Well VLMs Implicitly Link Explicit Matching Visual Cues. 7510-7545 - Xiang Geng, Zhejian Lai, Jiajun Chen, Hao Yang, Shujian Huang:
Alleviating Distribution Shift in Synthetic Data for Machine Translation Quality Estimation. 7546-7560 - Fan Zhang, Shulin Tian, Ziqi Huang, Yu Qiao, Ziwei Liu:
Evaluation Agent: Efficient and Promptable Evaluation Framework for Visual Generative Models. 7561-7582 - Zongxia Li, Lorena Calvo-Bartolomé, Alexander Miserlis Hoyle, Paiheng Xu, Daniel Kofi Stephens, Juan Francisco Fung, Alden Dima, Jordan Lee Boyd-Graber:
Large Language Models Struggle to Describe the Haystack without Human Help: A Social Science-Inspired Evaluation of Topic Models. 7583-7604 - Ziyue Wang, Chi Chen, Fuwen Luo, Yurui Dong, Yuanchi Zhang, Yuzhuang Xu, Xiaolong Wang, Peng Li, Yang Liu:
ActiView: Evaluating Active Perception Ability for Multimodal Large Language Models. 7605-7633 - Ritwik Gupta, Rodolfo Corona, Jiaxin Ge, Eric Wang, Dan Klein, Trevor Darrell, David M. Chan:
Enough Coin Flips Can Make LLMs Act Bayesian. 7634-7655 - Wenye Lin, Jonathan Roberts, Yunhan Yang, Samuel Albanie, Zongqing Lu, Kai Han:
GAMEBoT: Transparent Assessment of LLM Reasoning in Games. 7656-7682 - Zhijie Nie, Richong Zhang, Zhanyu Wu:
A Text is Worth Several Tokens: Text Embedding from LLMs Secretly Aligns Well with The Key Tokens. 7683-7694 - Abdelrahman Boda Sadallah, Junior Cedric Tonga, Khalid Almubarak, Saeed Almheiri, Farah Atif, Chatrine Qwaider, Karima Kadaoui, Sara Shatnawi, Yaser Alesh, Fajri Koto:
Commonsense Reasoning in Arab Culture. 7695-7710 - Junting Lu, Zhiyang Zhang, Fangkai Yang, Jue Zhang, Lu Wang, Chao Du, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang:
AXIS: Efficient Human-Agent-Computer Interaction with API-First LLM-Based Agents. 7711-7743 - Yang Chen, Vedaant Shah, Alan Ritter:
Translation and Fusion Improves Cross-lingual Information Extraction. 7744-7764 - Shaobo Cui, Wenqing Liu, Yiyang Feng, Jiawei Zhou, Boi Faltings:
Conditional Dichotomy Quantification via Geometric Embedding. 7765-7791 - Zhaoxuan Tan, Zheng Li, Tianyi Liu, Haodong Wang, Hyokun Yun, Ming Zeng, Pei Chen, Zhihan Zhang, Yifan Gao, Ruijie Wang, Priyanka Nigam, Bing Yin, Meng Jiang:
Aligning Large Language Models with Implicit Preferences from User-Generated Content. 7792-7820 - Yuyan Chen, Jiyuan Jia, Jiaxin Lu, Siyue Li, Yu Guan, Ming Yang, Qingpei Guo:
VQAGuider: Guiding Multimodal Large Language Models to Answer Complex Video Questions. 7821-7834 - Fang Wu, Vijay Prakash Dwivedi, Jure Leskovec:
Large Language Models are Good Relational Learners. 7835-7854 - Michael Ogezi, Freda Shi:
SpaRE: Enhancing Spatial Reasoning in Vision-Language Models with Synthetic Data. 7855-7875 - William Barr Held, Yanzhe Zhang, Weiyan Shi, Minzhi Li, Michael J. Ryan, Diyi Yang:
Distilling an End-to-End Voice Assistant Without Instruction Training Data. 7876-7891 - Shuhang Xu, Fangwei Zhong:
CoMet: Metaphor-Driven Covert Communication for Multi-Agent Language Games. 7892-7917 - Ali Razghandi, Seyed Mohammad Hadi Hosseini, Mahdieh Soleymani Baghshah:
CER: Confidence Enhanced Reasoning in LLMs. 7918-7938 - Minjia Mao, Dongjun Wei, Zeyu Chen, Xiao Fang, Michael Chau:
Watermarking Large Language Models: An Unbiased and Low-risk Method. 7939-7960 - Haoyang Wen, Jiang Guo, Yi Zhang, Jiarong Jiang, Zhiguo Wang:
On Synthetic Data Strategies for Domain-Specific Generative Retrieval. 7961-7976 - Ying Shen, Lifu Huang:
LLM Braces: Straightening Out LLM Predictions with Relevant Sub-Updates. 7977-7992 - Tamer Alkhouli, Katerina Margatina, James Gung, Raphael Shu, Claudia Zaghi, Monica Sunkara, Yi Zhang:
CONFETTI: Conversational Function-Calling Evaluation Through Turn-Level Interactions. 7993-8006 - Anthony B. Sicilia, Malihe Alikhani:
Evaluating Theory of (an uncertain) Mind: Predicting the Uncertain Beliefs of Others from Conversational Cues. 8007-8021 - Shaobo Cui, Luca Mouchel, Boi Faltings:
Uncertainty in Causality: A New Frontier. 8022-8044 - Michael J. Ryan, Omar Shaikh, Aditri Bhagirath, Daniel Frees, William Barr Held, Diyi Yang:
SynthesizeMe! Inducing Persona-Guided Prompts for Personalized Reward Models in LLMs. 8045-8078 - Julia Mendelsohn, Ceren Budak:
When People are Floods: Analyzing Dehumanizing Metaphors in Immigration Discourse with Large Language Models. 8079-8103 - Weidi Luo, Shenghong Dai, Xiaogeng Liu, Suman Banerjee, Huan Sun, Muhao Chen, Chaowei Xiao:
AGrail: A Lifelong Agent Guardrail with Effective and Adaptive Safety Detection. 8104-8139 - Yiqing Xie, Wenxuan Zhou, Pradyot Prakash, Di Jin, Yuning Mao, Quintin Fettes, Arya Talebzadeh, Sinong Wang, Han Fang, Carolyn P. Rosé, Daniel Fried, Hejia Zhang:
Improving Model Factuality with Fine-grained Critique-based Evaluator. 8140-8155 - Florencia Marotta-Wurgler, David Stein:
Building a Long Text Privacy Policy Corpus with Multi-Class Labels. 8156-8219 - Leonardo Ranaldi, Federico Ranaldi, Giulia Pucci:
R2-MultiOmnia: Leading Multilingual Multimodal Reasoning via Self-Training. 8220-8234 - Samuel Joseph Amouyal, Aya Meltzer-Asscher, Jonathan Berant:
When the LM misunderstood the human chuckled: Analyzing garden path effects in humans and language models. 8235-8253 - Zixiang Xu, Yanbo Wang, Yue Huang, Xiuying Chen, Jieyu Zhao, Meng Jiang, Xiangliang Zhang:
Cross-Lingual Pitfalls: Automatic Probing Cross-Lingual Weakness of Multilingual Large Language Models. 8254-8284 - Xuhao Hu, Dongrui Liu, Hao Li, Xuanjing Huang, Jing Shao:
VLSBench: Unveiling Visual Leakage in Multimodal Safety. 8285-8316 - Sky CH-Wang, Darshan Girish Deshpande, Smaranda Muresan, Anand Kannappan, Rebecca Qian:
Browsing Lost Unformed Recollections: A Benchmark for Tip-of-the-Tongue Search and Reasoning. 8317-8331 - Jonibek Mansurov, Akhmed Sakip, Alham Fikri Aji:
Data Laundering: Artificially Boosting Benchmark Results through Knowledge Distillation. 8332-8345 - Francesco Corso, Francesco Pierri, Gianmarco De Francisci Morales:
Conspiracy Theories and Where to Find Them on TikTok. 8346-8362 - Chunhui Zhang, Sirui Wang, Zhongyu Ouyang, Xiangchi Yuan, Soroush Vosoughi:
Growing Through Experience: Scaling Episodic Grounding in Language Models. 8363-8375 - Yuan Zhou, Zhuo Zhang, Xiangyu Zhang:
Exploiting the Shadows: Unveiling Privacy Leaks through Lower-Ranked Tokens in Large Language Models. 8376-8386 - Yanzhe Zhang, Tao Yu, Diyi Yang:
Attacking Vision-Language Computer Agents via Pop-ups. 8387-8401 - Congbo Ma, Yuxia Wang, Jia Wu, Jian Yang, Jing Du, Zitai Qiu, Qing Li, Hu Wang, Preslav Nakov:
Explicit and Implicit Data Augmentation for Social Event Detection. 8402-8415 - Zhen Tan, Jun Yan, I-Hung Hsu, Rujun Han, Zifeng Wang, Long T. Le, Yiwen Song, Yanfei Chen, Hamid Palangi, George Lee, Anand Rajan Iyer, Tianlong Chen, Huan Liu, Chen-Yu Lee, Tomas Pfister:
In Prospect and Retrospect: Reflective Memory Management for Long-term Personalized Dialogue Agents. 8416-8439 - Xiaoyi Bao, Zhongqing Wang, Jinghang Gu, Chu-Ren Huang:
Revisiting Classical Chinese Event Extraction with Ancient Literature Information. 8440-8451 - Xiangyu Peng, Prafulla Kumar Choubey, Caiming Xiong, Chien-Sheng Wu:
Unanswerability Evaluation for Retrieval Augmented Generation. 8452-8472 - Chengshuai Zhao, Zhen Tan, Chau-Wai Wong, Xinyan Zhao, Tianlong Chen, Huan Liu:
SCALE: Towards Collaborative Content Analysis in Social Science with Large Language Model Agents and Human Intervention. 8473-8503 - Erxin Yu, Jing Li, Ming Liao, Qi Zhu, Boyang Xue, Minghui Xu, Baojun Wang, Lanqing Hong, Fei Mi, Lifeng Shang:
Self-Error-Instruct: Generalizing from Errors for LLMs Mathematical Reasoning. 8504-8519 - Kunlun Zhu, Yifan Luo, Dingling Xu, Yukun Yan, Zhenghao Liu, Shi Yu, Ruobing Wang, Shuo Wang, Yishan Li, Nan Zhang, Xu Han, Zhiyuan Liu, Maosong Sun:
RAGEval: Scenario Specific RAG Evaluation Dataset Generation Framework. 8520-8544 - Homaira Huda Shomee, Zhu Wang, Sathya N. Ravi, Sourav Medya:
A Survey on Patent Analysis: From NLP to Multimodal AI. 8545-8561 - Chengye Wang, Yifei Shen, Zexi Kuang, Arman Cohan, Yilun Zhao:
SciVer: Evaluating Foundation Models for Multimodal Scientific Claim Verification. 8562-8579 - Kunlun Zhu, Hongyi Du, Zhaochen Hong, Xiaocheng Yang, Shuyi Guo, Zhe Wang, Zhenhailong Wang, Cheng Qian, Robert Tang, Heng Ji, Jiaxuan You:
MultiAgentBench: Evaluating the Collaboration and Competition of LLM agents. 8580-8622 - Tharindu Ranasinghe, Hansi Hettiarachchi, Nadeesha Chathurangi Naradde Vidana Pathirana, Damith Premasiri, Lasitha Uyangodage, Isuri Anuradha Nanomi Arachchige, Alistair Plum, Paul Rayson, Ruslan Mitkov:
Sinhala Encoder-only Language Models and Evaluation. 8623-8636 - Zhengxiang Wang, Veronika Makarova, Zhi Li, Jordan Kodner, Owen Rambow:
LLMs can Perform Multi-Dimensional Analytic Writing Assessments: A Case Study of L2 Graduate-Level Academic English Writing. 8637-8663 - Haomin Zhuang, Yihua Zhang, Kehan Guo, Jinghan Jia, Gaowen Liu, Sijia Liu, Xiangliang Zhang:
SEUF: Is Unlearning One Expert Enough for Mixture-of-Experts LLMs? 8664-8678 - Bolei Ma, Yuting Li, Wei Zhou, Ziwei Gong, Yang Janet Liu, Katja Jasinskaja, Annemarie Friedrich, Julia Hirschberg, Frauke Kreuter, Barbara Plank:
Pragmatics in the Era of Large Language Models: A Survey on Datasets, Evaluation, Opportunities and Challenges. 8679-8696 - Zhaoling Chen, Robert Tang, Gangda Deng, Fang Wu, Jialong Wu, Zhiwei Jiang, Viktor K. Prasanna, Arman Cohan, Xingyao Wang:
LocAgent: Graph-Guided LLM Agents for Code Localization. 8697-8727 - Raghvendra Kumar, Mohammed Salman S. A, Aryan Sahu, Tridib Nandi, Pragathi Y. P., Sriparna Saha, José G. Moreno:
COSMMIC: Comment-Sensitive Multimodal Multilingual Indian Corpus for Summarization and Headline Generation. 8728-8748 - Minzhi Li, William Barr Held, Michael J. Ryan, Kunat Pipatanakul, Potsawee Manakul, Hao Zhu, Diyi Yang:
Mind the Gap: Static and Interactive Evaluations of Large Audio Models. 8749-8766 - Renhao Pei, Yihong Liu, Peiqin Lin, François Yvon, Hinrich Schütze:
Understanding In-Context Machine Translation for Low-Resource Languages: A Case Study on Manchu. 8767-8788 - Jizhan Fang, Tianhe Lu, Yunzhi Yao, Ziyan Jiang, Xin Xu, Huajun Chen, Ningyu Zhang:
CKnowEdit: A New Chinese Knowledge Editing Dataset for Linguistics, Facts, and Logic Error Correction in LLMs. 8789-8807 - Cheng Xu, Nan Yan:
TripleFact: Defending Data Contamination in the Evaluation of LLM-driven Fake News Detection. 8808-8823 - Xiaomeng Zhu, Zhenghao Zhou, Simon Charlow, Robert Frank:
Meaning Beyond Truth Conditions: Evaluating Discourse Level Understanding via Anaphora Accessibility. 8824-8842 - Irtaza Khalid, Amir Masoud Nourollah, Steven Schockaert:
Large Language and Reasoning Models are Shallow Disjunctive Reasoners. 8843-8869 - Senyu Li, Zipeng Sun, Jiayi Wang, Xue Liu, Pontus Stenetorp, Siva Reddy, David Ifeoluwa Adelani:
Warmup Generations: A Task-Agnostic Approach for Guiding Sequence-to-Sequence Learning with Unsupervised Initial State Generation. 8870-8880 - Nedjma Ousidhoum, Meriem Beloucif, Saif M. Mohammad:
Building Better: Avoiding Pitfalls in Developing Language Resources when Data is Scarce. 8881-8894 - Shamsuddeen Hassan Muhammad, Nedjma Ousidhoum, Idris Abdulmumin, Jan Philip Wahle, Terry Ruas, Meriem Beloucif, Christine de Kock, Nirmal Surange, Daniela Teodorescu, Ibrahim Said Ahmad, David Ifeoluwa Adelani, Alham Fikri Aji, Felermino D. M. A. Ali, Ilseyar Alimova, Vladimir Araujo, Nikolay Babakov, Naomi Baes, Ana-Maria Bucur, Andiswa Bukula, Guanqun Cao, Rodrigo Tufiño Cardenas, Rendi Chevi, Chiamaka Ijeoma Chukwuneke, Alexandra Ciobotaru, Daryna Dementieva, Murja Sani Gadanya, Robert Geislinger, Bela Gipp, Oumaima Hourrane, Oana Ignat, Falalu Ibrahim Lawan, Rooweither Mabuya, Rahmad Mahendra, Vukosi Marivate, Alexander Panchenko, Andrew Piper, Charles Henrique Porto Ferreira, Vitaly Protasov, Samuel Rutunda, Manish Shrivastava, Aura Cristina Udrea, Lilian Diana Awuor Wanzare, Sophie Wu, Florian Valentin Wunderlich, Hanif Muhammad Zhafran, Tianhui Zhang, Yi Zhou, Saif M. Mohammad:
BRIGHTER: BRIdging the Gap in Human-Annotated Textual Emotion Recognition Datasets for 28 Languages. 8895-8916 - Yufei Tian, Jiao Sun, Nanyun Peng, Zizhao Zhang:
SkillVerse: Assessing and Enhancing LLMs with Tree Evaluation. 8917-8933 - Yanlin Feng, Simone Papicchio, Sajjadur Rahman:
CypherBench: Towards Precise Retrieval over Full-scale Modern Knowledge Graphs in the LLM Era. 8934-8958 - Francine Chen, Scott A. Carter, Tatiana Lau, Nayeli Suseth Bravo, Sumanta Bhattacharyya, Kate A. Sieck, Charlene C. Wu:
Empathy Prediction from Diverse Perspectives. 8959-8974 - Federico Ravenda, Seyed Ali Bahrainian, Andrea Raballo, Antonietta Mira, Noriko Kando:
Are LLMs effective psychological assessors? Leveraging adaptive RAG for interpretable mental health screening through psychometric practice. 8975-8991 - Aum Kendapadi, Kerem Zaman, Rakesh R. Menon, Shashank Srivastava:
INTERACT: Enabling Interactive, Question-Driven Learning in Large Language Models. 8992-9024 - Alan Sun:
Circuit Stability Characterizes Language Model Generalization. 9025-9040 - Olga Zamaraeva, Dan Flickinger, Francis Bond, Carlos Gómez-Rodríguez:
Comparing LLM-generated and human-authored news text using formal syntactic theory. 9041-9060 - Sharan Maiya, Yinhong Liu, Ramit Debnath, Anna Korhonen:
Improving Preference Extraction In LLMs By Identifying Latent Knowledge Through Classifying Probes. 9061-9081 - Yixin Wan, Kai-Wei Chang:
White Men Lead, Black Women Help? Benchmarking and Mitigating Language Agency Social Biases in LLMs. 9082-9108 - Adriana Eufrosina Bora, Akshatha Arodi, Duoyi Zhang, Jordan Bannister, Mirko Bronzi, Arsène Fansi Tchango, Md. Abul Bashar, Richi Nayak, Kerrie L. Mengersen:
AIMSCheck: Leveraging LLMs for AI-Assisted Review of Modern Slavery Statements Across Jurisdictions. 9109-9135 - Mohsen Fayyaz, Ali Modarressi, Hinrich Schütze, Nanyun Peng:
Collapse of Dense Retrievers: Short, Early, and Literal Biases Outranking Factual Evidence. 9136-9152 - Zhining Liu, Rana Ali Amjad, Ravinarayana Adkathimar, Tianxin Wei, Hanghang Tong:
SelfElicit: Your Language Model Secretly Knows Where is the Relevant Evidence. 9153-9173 - Yixin Wan, Kai-Wei Chang:
The Male CEO and the Female Assistant: Evaluation and Mitigation of Gender Biases in Text-To-Image Generation of Dual Subjects. 9174-9190 - Michalis Korakakis, Andreas Vlachos, Adrian Weller:
Mitigating Shortcut Learning with InterpoLated Learning. 9191-9206 - Theron S. Wang, Xingyuan Li, Hridayesh Lekhak, Tuan Minh Dang, Mengyue Wu, Kenny Q. Zhu:
Toward Automatic Discovery of a Canine Phonetic Alphabet. 9207-9219 - Haotian Zhou, Tingkai Liu, Qianli Ma, Yufeng Zhang, Jianbo Yuan, Pengfei Liu, Yang You, Hongxia Yang:
DavIR: Data Selection via Implicit Reward for Large Language Models. 9220-9237 - Artidoro Pagnoni, Ramakanth Pasunuru, Pedro Rodríguez, John Nguyen, Benjamin Muller, Margaret Li, Chunting Zhou, Lili Yu, Jason E. Weston, Luke Zettlemoyer, Gargi Ghosh, Mike Lewis, Ari Holtzman, Srini Iyer:
Byte Latent Transformer: Patches Scale Better Than Tokens. 9238-9258 - Zhenhao Li, Huichi Zhou, Marek Rei, Lucia Specia:
DiffuseDef: Improved Robustness to Adversarial Attacks via Iterative Denoising. 9259-9274 - Huanhuan Wei, Xiao Luo, Hongyi Yu, Jinping Liang, Luning Yang, Lixing Lin, Alexandra Popa, Xiting Yan:
Identifying Cellular Niches in Spatial Transcriptomics: An Investigation into the Capabilities of Large Language Models. 9275-9289 - Zahra Bokaei, Walid Magdy, Bonnie Webber:
Culture Matters in Toxic Language Detection in Persian. 9290-9304 - Jinheng Wang, Hansong Zhou, Ting Song, Shijie Cao, Yan Xia, Ting Cao, Jianyu Wei, Shuming Ma, Hongyu Wang, Furu Wei:
Bitnet.cpp: Efficient Edge Inference for Ternary LLMs. 9305-9322 - Guilherme Fonseca, Washington Cunha, Gabriel Prenassi, Marcos André Gonçalves, Leonardo Chaves Dutra da Rocha:
Instance-Selection-Inspired Undersampling Strategies for Bias Reduction in Small and Large Language Models for Binary Text Classification. 9323-9340 - Yeachan Kim, SangKeun Lee:
Forward Knows Efficient Backward Path: Saliency-Guided Memory-Efficient Fine-tuning of Large Language Models. 9341-9356 - Aofei Chang, Le Huang, Alex James Boyd, Parminder Bhatia, Taha A. Kass-Hout, Cao Xiao, Fenglong Ma:
Focus on What Matters: Enhancing Medical Vision-Language Models with Automatic Attention Alignment Tuning. 9357-9372 - Jiongnan Liu, Yutao Zhu, Shuting Wang, Xiaochi Wei, Erxue Min, Yu Lu, Shuaiqiang Wang, Dawei Yin, Zhicheng Dou:
LLMs + Persona-Plug = Personalized LLMs. 9373-9385 - Masato Mita, Ryo Yoshida, Yohei Oseki:
Developmentally-plausible Working Memory Shapes a Critical Period for Language Acquisition. 9386-9399 - Tao Feng, Lizhen Qu, Niket Tandon, Gholamreza Haffari:
IRIS: An Iterative and Integrated Framework for Verifiable Causal Discovery in the Absence of Tabular Data. 9400-9428 - Hao Yu, Jesujoba Oluwadara Alabi, Andiswa Bukula, Jian Yun Zhuang, En-Shiun Annie Lee, Tadesse Kebede Guge, Israel Abebe Azime, Happy Buzaaba, Blessing Kudzaishe Sibanda, Godson Koffi Kalipe, Jonathan Mukiibi, Salomon Kabongo Kabenamualu, Mmasibidi Setaka, Lolwethu Ndolela, Nkiruka Odu, Rooweither Mabuya, Shamsuddeen Hassan Muhammad, Salomey Osei, Sokhar Samb, Dietrich Klakow, David Ifeoluwa Adelani:
INJONGO: A Multicultural Intent Detection and Slot-filling Dataset for 16 African Languages. 9429-9452 - Hongjin Qian, Zheng Liu, Peitian Zhang, Zhicheng Dou, Defu Lian:
Boosting Long-Context Information Seeking via Query-Guided Activation Refilling. 9453-9464 - Tianyi Bai, Ling Yang, Zhen Hao Wong, Fupeng Sun, Xinlin Zhuang, Jiahui Peng, Chi Zhang, Lijun Wu, Jiantao Qiu, Wentao Zhang, Binhang Yuan, Conghui He:
Efficient Pretraining Data Selection for Language Models via Multi-Actor Collaboration. 9465-9491 - Han Liu, Changya Li, Xiaotong Zhang, Feng Zhang, Fenglong Ma, Wei Wang, Hong Yu:
AdaDHP: Fine-Grained Fine-Tuning via Dual Hadamard Product and Adaptive Parameter Selection. 9492-9504 - Jinhao Jiang, Kun Zhou, Xin Zhao, Yang Song, Chen Zhu, Hengshu Zhu, Ji-Rong Wen:
KG-Agent: An Efficient Autonomous Agent Framework for Complex Reasoning over Knowledge Graph. 9505-9523 - Mingyu Lee, Yeachan Kim, Wing-Lam Mok, SangKeun Lee:
Curriculum Debiasing: Toward Robust Parameter-Efficient Fine-Tuning Against Dataset Biases. 9524-9540 - Austin Xu, Srijan Bansal, Yifei Ming, Semih Yavuz, Shafiq Joty:
Does Context Matter? ContextualJudgeBench for Evaluating LLM-based Judges in Contextual Settings. 9541-9564 - Tao Feng, Lizhen Qu, Niket Tandon, Zhuang Li, Xiaoxi Kang, Gholamreza Haffari:
On the Reliability of Large Language Models for Causal Discovery. 9565-9590 - Jingxuan Li, Yuning Yang, Shengqi Yang, Linfan Zhang, Ying Nian Wu:
Value-Spectrum: Quantifying Preferences of Vision-Language Models via Value Decomposition in Social Media Contexts. 9591-9610 - Ziyang Liu, Chaokun Wang:
TeRDy: Temporal Relation Dynamics through Frequency Decomposition for Temporal Knowledge Graph Completion. 9611-9622 - Yerim Oh, Jun-Hyung Park, Junho Kim, SungHo Kim, SangKeun Lee:
Incorporating Domain Knowledge into Materials Tokenization. 9623-9644 - Yidan Wang, Yanan Cao, Yubing Ren, Fang Fang, Zheng Lin, Binxing Fang:
PIG: Privacy Jailbreak Attack on LLMs via Gradient-based Iterative In-Context Optimization. 9645-9660 - Rana Muhammad Shahroz, Zhen Tan, Sukwon Yun, Charles Fleming, Tianlong Chen:
Agents Under Siege: Breaking Pragmatic Multi-Agent LLM Systems with Optimized Prompt Attacks. 9661-9674 - Shusheng Li, Jiale Li, Yifei Qu, Xinwei Shi, Yanliang Guo, Ziyi He, Yubo Wang, Wenjun Tan:
Semantic-Eval: A Semantic Comprehension Evaluation Framework for Large Language Models Generation without Training. 9675-9690 - Michael Y. Hu, Jackson Petty, Chuan Shi, William Merrill, Tal Linzen:
Between Circuits and Chomsky: Pre-pretraining on Formal Languages Imparts Linguistic Biases. 9691-9709 - Hyuhng Joon Kim, Youna Kim, Sang-goo Lee, Taeuk Kim:
When to Speak, When to Abstain: Contrastive Decoding with Abstention. 9710-9730 - Herun Wan, Minnan Luo, Zhixiong Su, Guang Dai, Xiang Zhao:
On the Risk of Evidence Pollution for Malicious Social Text Detection in the Era of LLMs. 9731-9761 - Lei Wang, Zheqing Zhang, Xu Chen:
Investigating and Extending Homans' Social Exchange Theory with Large Language Model based Agents. 9762-9777 - Jiesong Liu, Brian Park, Xipeng Shen:
A Drop-In Solution for On-the-Fly Adaptation of Speculative Decoding in Large Language Models. 9778-9794 - Ryo Yoshida, Shinnosuke Isono, Kohei Kajikawa, Taiga Someya, Yushi Sugimoto, Yohei Oseki:
If Attention Serves as a Cognitive Model of Human Memory Retrieval, What is the Plausible Memory Representation? 9795-9812 - Yongqi Li, Shen Zhou, Xiaohu Li, Xin Miao, Jintao Wen, Mayi Xu, Jianhao Chen, Birong Pan, Hankun Kang, Yuanyuan Zhu, Ming Zhong, Tieyun Qian:
Aligning VLM Assistants with Personalized Situated Cognition. 9813-9839 - Zhisong Zhang, Yan Wang, Xinting Huang, Tianqing Fang, Hongming Zhang, Chenlong Deng, Shuaiyi Li, Dong Yu:
Attention Entropy is a Key Factor: An Analysis of Parallel Context Encoding with Full-attention-based Pre-trained Language Models. 9840-9855 - Huanran Zheng, Xiaoling Wang:
Faster Speculative Decoding via Effective Draft Decoder with Pruned Candidate Tree. 9856-9868 - Zhuojun Ding, Wei Wei, Chenghao Fan:
Selecting and Merging: Towards Adaptable and Scalable Named Entity Recognition with Large Language Models. 9869-9886 - Tao Wu, Jingyuan Chen, Wang Lin, Mengze Li, Yumeng Zhu, Ang Li, Kun Kuang, Fei Wu:
Embracing Imperfection: Simulating Students with Diverse Cognitive Levels Using LLM-based Agents. 9887-9908 - Jiali Chen, Xusen Hei, Hongfei Liu, Yuancheng Wei, Zikun Deng, Jiayuan Xie, Yi Cai, Qing Li:
CADReview: Automatically Reviewing CAD Programs with Error Detection and Correction. 9909-9927 - Junyi Li, Hwee Tou Ng:
Think&Cite: Improving Attributed Text Generation with Self-Guided Tree Search and Progress Reward Modeling. 9928-9942 - Dana R. Alsagheer, Abdulrahman Kamal, Mohammad Kamal, Cosmo Yang Wu, Weidong Shi:
The Lawyer That Never Thinks: Consistency and Fairness as Keys to Reliable AI. 9943-9954 - SungHo Kim, Nayeon Kim, Taehee Jeon, SangKeun Lee:
Polishing Every Facet of the GEM: Testing Linguistic Competence of LLMs and Humans in Korean. 9955-9984 - Wen Huang, Yanmei Gu, Zhiming Wang, Huijia Zhu, Yanmin Qian:
SpeechFake: A Large-Scale Multilingual Speech Deepfake Dataset Incorporating Cutting-Edge Generation Methods. 9985-9998 - Houxing Ren, Mingjie Zhan, Zhongyuan Wu, Aojun Zhou, Junting Pan, Hongsheng Li:
ReflectionCoder: Learning from Reflection Sequence for Enhanced One-off Code Generation. 9999-10020 - Huisheng Wang, Zhuoshi Pan, Hangjing Zhang, Mingxiao Liu, Hanqing Gao, H. Vicky Zhao:
InvestAlign: Overcoming Data Scarcity in Aligning Large Language Models with Investor Decision-Making Processes Under Herd Behavior. 10021-10052 - Abudurexiti Reheman, Hongyu Liu, Junhao Ruan, Abudukeyumu Abudula, Yingfeng Luo, Tong Xiao, JingBo Zhu:
Enhancing Neural Machine Translation Through Target Language Data: A kNN-LM Approach for Domain Adaptation. 10053-10065 - Fuwei Zhang, Xiaoyu Liu, Xinyu Jia, Yingfei Zhang, Shuai Zhang, Xiang Li, Fuzhen Zhuang, Wei Lin, Zhao Zhang:
Multi-level Relevance Document Identifier Learning for Generative Retrieval. 10066-10080 - Mengzhao Chen, Wenqi Shao, Peng Xu, Jiahao Wang, Peng Gao, Kaipeng Zhang, Ping Luo:
EfficientQAT: Efficient Quantization-Aware Training for Large Language Models. 10081-10100 - Siting Li, Pang Wei Koh, Simon Shaolei Du:
Exploring How Generative MLLMs Perceive More Than CLIP with the Same Vision Encoder. 10101-10119 - Hyuntak Kim, Byung-Hak Kim:
NexusSum: Hierarchical LLM Agents for Long-Form Narrative Summarization. 10120-10157 - Xiao Wang, Jingyun Hua, Weihong Lin, Yuanxing Zhang, Fuzheng Zhang, Jianlong Wu, Di Zhang, Liqiang Nie:
HAIC: Improving Human Action Understanding and Generation with Better Captions for Multi-modal Large Language Models. 10158-10181 - Yanhao Jia, Xinyi Wu, Li Hao, Qinglin Zhang, Yuxiao Hu, Shuai Zhao, Wenqi Fan:
Uni-Retrieval: A Multi-Style Retrieval Framework for STEM's Education. 10182-10197 - Lin Mu, Xiaoyu Wang, Li Ni, Yang Li, Zhize Wu, Peiquan Jin, Yiwen Zhang:
DenseLoRA: Dense Low-Rank Adaptation of Large Language Models. 10198-10211 - Jisoo Mok, Ik-hwan Kim, Sangkwon Park, Sungroh Yoon:
Exploring the Potential of LLMs as Personalized Assistants: Dataset, Evaluation, and Analysis. 10212-10239 - Yuheng Chen, Pengfei Cao, Yubo Chen, Yining Wang, Shengping Liu, Kang Liu, Jun Zhao:
Cracking Factual Knowledge: A Comprehensive Analysis of Degenerate Knowledge Neurons in Large Language Models. 10240-10261 - Shenglai Zeng, Pengfei He, Kai Guo, Tianqi Zheng, Hanqing Lu, Yue Xing, Hui Liu:
Towards Context-Robust LLMs: A Gated Representation Fine-tuning Approach. 10262-10276 - Yuqian Li, Yupei Du, Yufang Liu, Feifei Feng, Mou Xiao Feng, Yuanbin Wu:
On Support Samples of Next Word Prediction. 10277-10289 - Jialong Wu, Wenbiao Yin, Yong Jiang, Zhenglin Wang, Zekun Xi, Runnan Fang, Linhai Zhang, Yulan He, Deyu Zhou, Pengjun Xie, Fei Huang:
WebWalker: Benchmarking LLMs in Web Traversal. 10290-10305 - Yidan Wang, Yubing Ren, Yanan Cao, Binxing Fang:
From Trade-off to Synergy: A Versatile Symbiotic Watermarking Framework for Large Language Models. 10306-10322 - Hongxin Li, Jingfan Chen, Jingran Su, Yuntao Chen, Qing Li, Zhaoxiang Zhang:
AutoGUI: Scaling GUI Grounding with Automatic Functionality Annotations from LLMs. 10323-10358 - Jingwen Sun, Zhiyi Tian, Yu He, Jingwei Sun, Guangzhong Sun:
Introducing Graph Context into Language Models through Parameter-Efficient Fine-Tuning for Lexical Relation Mining. 10359-10374 - Zhirui Zeng, Jiamou Liu, Meng-Fen Chiang, Jialing He, Zijian Zhang:
S-RAG: A Novel Audit Framework for Detecting Unauthorized Use of Personal Data in RAG Systems. 10375-10385 - Yongqi Leng, Renren Jin, Yue Chen, Zhuowen Han, Ling Shi, Jianxiang Peng, Lei Yang, Juesi Xiao, Deyi Xiong:
Praetor: A Fine-Grained Generative LLM Evaluator with Instance-Level Customizable Evaluation Criteria. 10386-10418 - Zhecheng Sheng, Xiruo Ding, Brian Hur, Changye Li, Trevor Cohen, Serguei V. S. Pakhomov:
Mitigating Confounding in Speech-Based Dementia Detection through Weight Masking. 10419-10434 - Yang Liu, Jiahuan Cao, Hiuyi Cheng, Yongxin Shi, Kai Ding, Lianwen Jin:
MCS-Bench: A Comprehensive Benchmark for Evaluating Multimodal Large Language Models in Chinese Classical Studies. 10435-10492 - Yuheng Chen, Pengfei Cao, Kang Liu, Jun Zhao:
The Knowledge Microscope: Features as Better Analytical Lenses than Neurons. 10493-10515 - Chiwei Zhu, Benfeng Xu, Xiaorui Wang, Zhendong Mao:
From Real to Synthetic: Synthesizing Millions of Diversified and Complicated User Instructions with Attributed Grounding. 10516-10543 - Haoran Li, Wenbin Hu, Huihao Jing, Yulin Chen, Qi Hu, Sirui Han, Tianshu Chu, Peizhao Hu, Yangqiu Song:
PrivaCI-Bench: Evaluating Privacy with Contextual Integrity and Legal Compliance. 10544-10559 - Yanran Wu, Inez Hua, Yi Ding:
Unveiling Environmental Impacts of Large Language Model Serving: A Functional Unit View. 10560-10576 - Jinglong Gao, Xiao Ding, Lingxiao Zou, Bibo Cai, Bing Qin, Ting Liu:
ExpeTrans: LLMs Are Experiential Transfer Learners. 10577-10616 - Cong Liu, Xiaojun Quan, Yan Pan, Weigang Wu, Xu Chen, Liang Lin:
Cool-Fusion: Fuse Large Language Models without Training. 10617-10627 - Chuanyang Zheng, Yihang Gao, Han Shi, Jing Xiong, Jiankai Sun, Jingyao Li, Minbin Huang, Xiaozhe Ren, Michael Ng, Xin Jiang, Zhenguo Li, Yu Li:
DAPE V2: Process Attention Score as Feature Map for Length Extrapolation. 10628-10666 - Hui Huang, Jiaheng Liu, Yancheng He, Shilong Li, Bing Xu, Conghui Zhu, Muyun Yang, Tiejun Zhao:
MuSC: Improving Complex Instruction Following with Multi-granularity Self-Contrastive Training. 10667-10686 - Zican Dong, Junyi Li, Jinhao Jiang, Mingyu Xu, Xin Zhao, Bingning Wang, Weipeng Chen:
LongReD: Mitigating Short-Text Degradation of Long-Context Large Language Models via Restoration Distillation. 10687-10707 - Yuxiang Huang, Mingye Li, Xu Han, Chaojun Xiao, Weilin Zhao, Sun Ao, Hao Zhou, Jie Zhou, Zhiyuan Liu, Maosong Sun:
APB: Accelerating Distributed Long-Context Inference by Passing Compressed Context Blocks across GPUs. 10708-10727 - Yiyang Zhang, Nan Chen:
PPT: A Minor Language News Recommendation Model via Cross-Lingual Preference Pattern Transfer. 10728-10745 - Yi Jiang, Sendong Zhao, Jianbo Li, Haochun Wang, Bing Qin:
GainRAG: Preference Alignment in Retrieval-Augmented Generation through Gain Signal Synthesis. 10746-10757 - Chenxia Tang, Jianchun Liu, Hongli Xu, Liusheng Huang:
Top-nσ: Eliminating Noise in Logit Space for Robust Token Sampling of LLM. 10758-10774 - Jialong Wu, Zhenglin Wang, Linhai Zhang, Yilong Lai, Yulan He, Deyu Zhou:
SCOPE: Optimizing Key-Value Cache Compression in Long-context Generation. 10775-10790 - Thanh Duc Pham, Nam Le Hai, Linh Ngo Van, Nguyen Thi Ngoc Diep, Sang Dinh, Thien Huu Nguyen:
Mitigating Non-Representative Prototypes and Representation Bias in Few-Shot Continual Relation Extraction. 10791-10809 - Wei Tao, Haocheng Lu, Xiaoyang Qu, Bin Zhang, Kai Lu, Jiguang Wan, Jianzong Wang:
MoQAE: Mixed-Precision Quantization for Long-Context LLM Inference via Mixture of Quantization-Aware Experts. 10810-10820 - Ziqian Zeng, Jianwei Wang, Junyao Yang, Zhengdong Lu, Haoran Li, Huiping Zhuang, Cen Chen:
PrivacyRestore: Privacy-Preserving Inference in Large Language Models via Privacy Removal and Restoration. 10821-10855 - Xinlin Zhuang, Jiahui Peng, Ren Ma, Yinfan Wang, Tianyi Bai, Xingjian Wei, Jiantao Qiu, Chi Zhang, Ying Qian, Conghui He:
Meta-rater: A Multi-dimensional Data Selection Method for Pre-training Language Models. 10856-10896 - Qingchen Yu, Zifan Zheng, Ding Chen, Simin Niu, Bo Tang, Feiyu Xiong, Zhiyu Li:
GuessArena: Guess Who I Am? A Self-Adaptive Framework for Evaluating LLMs in Domain-Specific Knowledge and Reasoning. 10897-10912 - Kehua Feng, Keyan Ding, Hongzhi Tan, Kede Ma, Zhihua Wang, Shuangquan Guo, Yuzhou Cheng, Ge Sun, Guozhou Zheng, Qiang Zhang, Huajun Chen:
Sample-Efficient Human Evaluation of Large Language Models via Maximum Discrepancy Competition. 10913-10947 - Guanran Luo, Zhongquan Jian, Wentao Qiu, Meihong Wang, Qingqiang Wu:
DTCRS: Dynamic Tree Construction for Recursive Summarization. 10948-10963 - Zhiyu Zhang, Wei Chen, Youfang Lin, Huaiyu Wan:
A Generative Adaptive Replay Continual Learning Model for Temporal Knowledge Graph Reasoning. 10964-10977 - Yize Zhang, Tianshu Wang, Sirui Chen, Kun Wang, Xingyu Zeng, Hongyu Lin, Xianpei Han, Le Sun, Chaochao Lu:
ARise: Towards Knowledge-Augmented Reasoning via Risk-Adaptive Search. 10978-10995 - Ziyan Wang, Zhankun Xiong, Feng Huang, Wen Zhang:
PKAG-DDI: Pairwise Knowledge-Augmented Language Model for Drug-Drug Interaction Event Text Generation. 10996-11010 - Shuai Niu, Jing Ma, Hongzhan Lin, Liang Bai, Zhihua Wang, Richard Yi Da Xu, Yunya Song, Xian Yang:
Knowledge-Augmented Multimodal Clinical Rationale Generation for Disease Diagnosis with Small Language Models. 11011-11024 - Xindi Li, Zhe Liu, Tong Zhang, Jiahao Chen, Qingming Li, Jinbao Li, Shouling Ji:
TWIST: Text-encoder Weight-editing for Inserting Secret Trojans in Text-to-Image Models. 11025-11041 - Abhijnan Nath, Carine Graff, Andrei Bachinin, Nikhil Krishnaswamy:
Frictional Agent Alignment Framework: Slow Down and Don't Break Things. 11042-11089 - Dongjin Park, Eunsang Lee, Joon-Woo Lee:
Powerformer: Efficient and High-Accuracy Privacy-Preserving Language Model with Homomorphic Encryption. 11090-11111 - Weixiang Zhao, Yulin Hu, Yang Deng, Jiahe Guo, Xingyu Sui, Xinyang Han, An Zhang, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu:
Beware of Your Po! Measuring and Mitigating AI Safety Risks in Role-Play Fine-Tuning of LLMs. 11112-11137 - Zihao Li, Lecheng Zheng, Bowen Jin, Dongqi Fu, Baoyu Jing, Yikun Ban, Jingrui He, Jiawei Han:
Can Graph Neural Networks Learn Language with Extremely Weak Text Supervision? 11138-11165 - Hongqiu Wu, Weiqi Wu, Tianyang Xu, Jiameng Zhang, Hai Zhao:
Towards Enhanced Immersion and Agency for LLM-based Interactive Drama. 11166-11182 - Shun Inadumi, Nobuhiro Ueda, Koichiro Yoshino:
Disambiguating Reference in Visually Grounded Dialogues through Joint Modeling of Textual and Multimodal Semantic Structures. 11183-11198 - Mingda Chen, Yang Li, Karthik Padthe, Rulin Shao, Alicia Yi Sun, Luke Zettlemoyer, Gargi Ghosh, Wen-tau Yih:
Improving Factuality with Explicit Working Memory. 11199-11213 - Chengao Li, Hanyu Zhang, Yunkun Xu, Hongyan Xue, Xiang Ao, Qing He:
Gradient-Adaptive Policy Optimization: Towards Multi-Objective Alignment of Large Language Models. 11214-11232 - Yifu Ding, Wentao Jiang, Shunyu Liu, Yongcheng Jing, Jinyang Guo, Yingjie Wang, Jing Zhang, Zengmao Wang, Ziwei Liu, Bo Du, Xianglong Liu, Dacheng Tao:
Dynamic Parallel Tree Search for Efficient LLM Reasoning. 11233-11252 - Junyi Chen, Shihao Bai, Zaijun Wang, Siyu Wu, Chuheng Du, Hailong Yang, Ruihao Gong, Shengzhong Liu, Fan Wu, Guihai Chen:
Pre³: Enabling Deterministic Pushdown Automata for Faster Structured LLM Generation. 11253-11267 - Ge Qu, Jinyang Li, Bowen Qin, Xiaolong Li, Nan Huo, Chenhao Ma, Reynold Cheng:
SHARE: An SLM-based Hierarchical Action CorREction Assistant for Text-to-SQL. 11268-11292 - Tao Zhang, Ziqian Zeng, Yuxiang Xiao, Huiping Zhuang, Cen Chen, James R. Foulds, Shimei Pan:
GenderAlign: An Alignment Dataset for Mitigating Gender Bias in Large Language Models. 11293-11311 - Peng Zhou, Pengsen Ma, Jianmin Wang, Xibao Cai, Haitao Huang, Wei Liu, Longyue Wang, Lai Hou Tim, Xiangxiang Zeng:
Large Language and Protein Assistant for Protein-Protein Interactions Prediction. 11312-11327 - Jiaan Wang, Fandong Meng, Zengkui Sun, Yunlong Liang, Yuxuan Cao, Jiarong Xu, Haoxiang Shi, Jie Zhou:
An Empirical Study of Many-to-Many Summarization with Large Language Models. 11328-11344 - Suhang Wu, Jialong Tang, Chengyi Yang, Pei Zhang, Baosong Yang, Junhui Li, Junfeng Yao, Min Zhang, Jinsong Su:
Locate-and-Focus: Enhancing Terminology Translation in Speech Language Models. 11345-11360 - Lingxiao Diao, Xinyue Xu, Wanxuan Sun, Cheng Yang, Zhuosheng Zhang:
GuideBench: Benchmarking Domain-Oriented Guideline Following for LLM Agents. 11361-11399 - Xinke Jiang, Yue Fang, Rihong Qiu, Haoyu Zhang, Yongxin Xu, Hao Chen, Wentao Zhang, Ruizhe Zhang, Yuchen Fang, Xinyu Ma, Xu Chu, Junfeng Zhao, Yasha Wang:
TC-RAG: Turing-Complete RAG's Case study on Medical LLM Systems. 11400-11426 - Zexiong Ma, Chao Peng, Pengfei Gao, Xiangxin Meng, Yanzhen Zou, Bing Xie:
SoRFT: Issue Resolving with Subtask-oriented Reinforced Fine-Tuning. 11427-11441 - Zhongzhan Huang, Guoming Ling, Shanshan Zhong, Hefeng Wu, Liang Lin:
MiniLongBench: The Low-cost Long Context Understanding Benchmark for Large Language Models. 11442-11460 - Xin Sun, Jianan Xie, Zhongqi Chen, Qiang Liu, Shu Wu, Yuehe Chen, Bowen Song, Zilei Wang, Weiqiang Wang, Liang Wang:
Divide-Then-Align: Honest Alignment based on the Knowledge Boundary of RAG. 11461-11480 - Wanzong Peng, Lin Ye, Xuetao Du, Hongli Zhang, Dongyang Zhan, Yunting Zhang, Yicheng Guo, Chen Zhang:
PwnGPT: Automatic Exploit Generation Based on Large Language Models. 11481-11494 - Cuc Thi Bui, Nguyen Truong Son, Trang Van Truong, Viet Lam Phung, Pham Nhut Huy, Hoang Anh Le, Quoc Huu Van, Phong Nguyen-Thuan Do, Van Le Tran Truc, Duc Thanh Chau, Le-Minh Nguyen:
VMLU Benchmarks: A comprehensive benchmark toolkit for Vietnamese LLMs. 11495-11515 - Kai Liu, Jianfei Gao, Kai Chen:
Scaling up the State Size of RNN LLMs for Long-Context Scenarios. 11516-11529 - Bocheng Li, Zhujin Gao, Linli Xu:
Unifying Continuous and Discrete Text Diffusion with Non-simultaneous Diffusion Processes. 11530-11551 - Xin Gao, Qizhi Pei, Zinan Tang, Yu Li, Honglin Lin, Jiang Wu, Lijun Wu, Conghui He:
A Strategic Coordination Framework of Small LMs Matches Large LMs in Data Synthesis. 11552-11570 - Wenrui Xu, Dalin Lyu, Weihang Wang, Jie Feng, Chen Gao, Yong Li:
Defining and Evaluating Visual Language Models' Basic Spatial Abilities: A Perspective from Psychometrics. 11571-11590 - Wenyu Zhang, Wei En Ng, Lixin Ma, Yuwen Wang, Junqi Zhao, Allison Koenecke, Boyang Li, Lu Wang:
SPHERE: Unveiling Spatial Blind Spots in Vision-Language Models Through Hierarchical Evaluation. 11591-11609 - Qijun Miao, Zhixuan Fang:
User-side Model Consistency Monitoring for Open Source Large Language Models Inference Services. 11610-11622 - Weixiong Zheng, Peijian Zeng, Yiwei Li, Hongyan Wu, Nankai Lin, Junhao Chen, Aimin Yang, Yongmei Zhou:
Jailbreaking? One Step Is Enough! 11623-11642 - Yongxin Xu, Ruizhe Zhang, Xinke Jiang, Yujie Feng, Yuzhen Xiao, Xinyu Ma, Runchuan Zhu, Xu Chu, Junfeng Zhao, Yasha Wang:
Parenting: Optimizing Knowledge Selection of Retrieval-Augmented Language Models with Parameter Decoupling and Tailored Tuning. 11643-11662 - Yichen He, Guanhua Huang, Peiyuan Feng, Yuan Lin, Yuchen Zhang, Hang Li, Weinan E:
PaSa: An LLM Agent for Comprehensive Academic Paper Search. 11663-11679 - Abhilasha Sancheti, David Dale, Artyom Kozhevnikov, Maha Elbayad:
Less Mature is More Adaptable for Sentence-level Language Modeling. 11680-11695 - Subhajit Chaudhury, Payel Das, Sarathkrishna Swaminathan, Georgios Kollias, Elliot Nelson, Khushbu Pahwa, Tejaswini Pedapati, Igor Melnyk, Matthew Riemer:
EpMAN: Episodic Memory AttentioN for Generalizing to Longer Contexts. 11696-11708 - Xueyan Zhang, Jinman Zhao, Zhifei Yang, Yibo Zhong, Shuhao Guan, Linbo Cao, Yining Wang:
UORA: Uniform Orthogonal Reinitialization Adaptation in Parameter Efficient Fine-Tuning of Large Models. 11709-11728 - Haotian Wang, Yi Guan, Fanshu Meng, Chao Zhao, Lian Yan, Yang Yang, Jingchi Jiang:
Agri-CM³: A Chinese Massive Multi-modal, Multi-level Benchmark for Agricultural Understanding and Reasoning. 11729-11754 - Junnan Zhu, Min Xiao, Yining Wang, Feifei Zhai, Yu Zhou, Chengqing Zong:
TROVE: A Challenge for Fine-Grained Text Provenance via Source Sentence Tracing and Relationship Classification. 11755-11771 - Shane Arora, Marzena Karpinska, Hung-Ting Chen, Ipsita Bhattacharjee, Mohit Iyyer, Eunsol Choi:
CaLMQA: Exploring culturally specific long-form question answering across 23 languages. 11772-11817 - Yushan Zhu, Wen Zhang, Zhiqiang Liu, Mingyang Chen, Lei Liang, Huajun Chen:
Croppable Knowledge Graph Embedding. 11818-11835 - Xinke Jiang, Ruizhe Zhang, Yongxin Xu, Rihong Qiu, Yue Fang, Zhiyuan Wang, Jinyi Tang, Hongxin Ding, Xu Chu, Junfeng Zhao, Yasha Wang:
HyKGE: A Hypothesis Knowledge Graph Enhanced RAG Framework for Accurate and Reliable Medical LLMs Responses. 11836-11856 - Zhiyuan Hu, Yuliang Liu, Jinman Zhao, Suyuchen Wang, Wang Yan, Wei Shen, Qing Gu, Anh Tuan Luu, See-Kiong Ng, Zhiwei Jiang, Bryan Hooi:
LongRecipe: Recipe for Efficient Long Context Generalization in Large Language Models. 11857-11870 - Naibin Gu, Zhenyu Zhang, Xiyu Liu, Peng Fu, Zheng Lin, Shuohuan Wang, Yu Sun, Hua Wu, Weiping Wang, Haifeng Wang:
BeamLoRA: Beam-Constraint Low-Rank Adaptation. 11871-11883 - Yiming Lei, Chenkai Zhang, Zeming Liu, Haitao Leng, Shaoguo Liu, Tingting Gao, Qingjie Liu, Yunhong Wang:
GODBench: A Benchmark for Multimodal Large Language Models in Video Comment Art. 11884-11952 - Ang Li, Yiquan Wu, Yifei Liu, Ming Cai, Lizhi Qing, Shihang Wang, Yangyang Kang, Chengyuan Liu, Fei Wu, Kun Kuang:
UniLR: Unleashing the Power of LLMs on Multiple Legal Tasks with a Unified Legal Retriever. 11953-11967 - Haoran Ye, Tianze Zhang, Yuhang Xie, Liyuan Zhang, Yuanyi Ren, Xin Zhang, Guojie Song:
Generative Psycho-Lexical Approach for Constructing Value Systems in Large Language Models. 11968-11991 - Yeyong Yu, Runsheng Yu, Haojie Wei, Zhanqiu Zhang, Quan Qian:
Beyond Dialogue: A Profile-Dialogue Alignment Framework Towards General Role-Playing Language Model. 11992-12022 - Huaye Zeng, Dongfu Jiang, Haozhe Wang, Ping Nie, Xiaotong Chen, Wenhu Chen:
ACECODER: Acing Coder RL via Automated Test-Case Synthesis. 12023-12040 - Hang Chen, Xinyu Yang, Jiaying Zhu, Wenya Wang:
Quantifying Semantic Emergence in Language Models. 12041-12054 - Jizheng Chen, Kounianhua Du, Xinyi Dai, Weiming Zhang, Xihuai Wang, Yasheng Wang, Ruiming Tang, Weinan Zhang, Yong Yu:
DebateCoder: Towards Collective Intelligence of LLMs via Test Case Driven LLM Debate for Code Generation. 12055-12065 - Chen Qian, Dongrui Liu, Jie Zhang, Yong Liu, Jing Shao:
The Tug of War Within: Mitigating the Fairness-Privacy Conflicts in Large Language Models. 12066-12095 - Yukun Cao, Shuo Han, Zengyi Gao, Zezhong Ding, Xike Xie, S. Kevin Zhou:
GraphInsight: Unlocking Insights in Large Language Models for Graph Structure Understanding. 12096-12134 - Michael S. Yantosca, Albert M. K. Cheng:
Phonotomizer: A Compact, Unsupervised, Online Training Approach to Real-Time, Multilingual Phonetic Segmentation. 12135-12147 - Bojun Jin, Jianzhu Bao, Yufang Hou, Yang Sun, Yice Zhang, Huajie Wang, Bin Liang, Ruifeng Xu:
A Multi-persona Framework for Argument Quality Assessment. 12148-12170 - Chengwu Liu, Ye Yuan, Yichun Yin, Yan Xu, Xin Xu, Zaoyu Chen, Yasheng Wang, Lifeng Shang, Qun Liu, Ming Zhang:
Safe: Enhancing Mathematical Reasoning in Large Language Models via Retrospective Step-aware Formal Verification. 12171-12186 - Yuxuan Hu, Ke Wang, Xiaokang Zhang, Fanjin Zhang, Cuiping Li, Hong Chen, Jing Zhang:
SAM Decoding: Speculative Decoding via Suffix Automaton. 12187-12204 - Yuxin Hu, Danni Liu, Bo Liu, Yida Chen, Jiuxin Cao, Yan Liu:
PsyAdvisor: A Plug-and-Play Strategy Advice Planner with Proactive Questioning in Psychological Conversations. 12205-12229 - Silin Li, Yuhang Guo, Jiashu Yao, Zeming Liu, Haifeng Wang:
HomeBench: Evaluating LLMs in Smart Homes with Valid and Invalid Instructions Across Single and Multiple Devices. 12230-12250 - Xueyao Zhang, Yuancheng Wang, Chaoren Wang, Ziniu Li, Zhuo Chen, Zhizheng Wu:
Advancing Zero-shot Text-to-Speech Intelligibility across Diverse Domains via Preference Alignment. 12251-12270 - Haochen Li, Wanjin Feng, Xin Zhou, Zhiqi Shen:
GiFT: Gibbs Fine-Tuning for Code Generation. 12271-12284 - Yiwen Jiang, Deval Mehta, Wei Feng, Zongyuan Ge:
Enhancing Interpretable Image Classification Through LLM Agents and Conditional Concept Bottleneck Models. 12285-12297 - Xiaowei Zhu, Yubing Ren, Yanan Cao, Xixun Lin, Fang Fang, Yangxi Li:
Reliably Bounding False Positives: A Zero-Shot Machine-Generated Text Detection Framework via Multiscaled Conformal Prediction. 12298-12319 - Junsik Kim, Jinwook Park, Kangil Kim:
RSCF: Relation-Semantics Consistent Filter for Entity Embedding of Knowledge Graph. 12320-12336 - Pinyi Zhang, Siyu An, Lingfeng Qiao, Yifei Yu, Jingyang Chen, Jie Wang, Di Yin, Xing Sun, Kai Zhang:
RolePlot: A Systematic Framework for Evaluating and Enhancing the Plot-Progression Capabilities of Role-Playing Agents. 12337-12354 - Zhenyu Hou, Ziniu Hu, Yujiang Li, Rui Lu, Jie Tang, Yuxiao Dong:
TreeRL: LLM Reinforcement Learning with On-Policy Tree Search. 12355-12369 - Emre Can Acikgoz, Jeremiah Greer, Akul Datta, Ze Yang, William Zeng, Oussama Elachqar, Emmanouil Koukoumidis, Dilek Hakkani-Tür, Gokhan Tur:
Can a Single Model Master Both Multi-turn Conversations and Tool Use? CoALM: A Unified Conversational Agentic Language Model. 12370-12390 - Yupu Liang, Yaping Zhang, Zhiyang Zhang, Yang Zhao, Lu Xiang, Chengqing Zong, Yu Zhou:
Single-to-mix Modality Alignment with Multimodal Large Language Model for Document Image Machine Translation. 12391-12408 - Aobo Kong, Wentao Ma, Shiwan Zhao, Yongbin Li, Yuchuan Wu, Ke Wang, Xiaoqian Liu, Qicheng Li, Yong Qin, Fei Huang:
SDPO: Segment-Level Direct Preference Optimization for Social Agents. 12409-12423 - Zhiyang Qi, Takumasa Kaneko, Keiko Takamizo, Mariko Ukiyo, Michimasa Inaba:
KokoroChat: A Japanese Psychological Counseling Dialogue Dataset Collected via Role-Playing by Trained Counselors. 12424-12443 - Xiangchao Yan, Shiyang Feng, Jiakang Yuan, Renqiu Xia, Bin Wang, Lei Bai, Bo Zhang:
SURVEYFORGE: On the Outline Heuristics, Memory-Driven Generation, and Multi-dimensional Evaluation for Automated Survey Writing. 12444-12465 - Yexing Du, Youcheng Pan, Ziyang Ma, Bo Yang, Yifan Yang, Keqi Deng, Xie Chen, Yang Xiang, Ming Liu, Bing Qin:
Making LLMs Better Many-to-Many Speech-to-Text Translators with Curriculum Learning. 12466-12478 - Yilun Zhao, Weiyuan Chen, Zhijian Xu, Manasi Patwardhan, Chengye Wang, Yixin Liu, Lovekesh Vig, Arman Cohan:
AbGen: Evaluating Large Language Models in Ablation Study Design and Evaluation for Scientific Research. 12479-12491 - Zicheng Zhang, Xiangyu Zhao, Xinyu Fang, Chunyi Li, Xiaohong Liu, Xiongkuo Min, Haodong Duan, Kai Chen, Guangtao Zhai:
Redundancy Principles for MLLMs Benchmarks. 12492-12504 - Yifu Chen, Shengpeng Ji, Haoxiao Wang, Ziqing Wang, Siyu Chen, Jinzheng He, Jin Xu, Zhou Zhao:
WavRAG: Audio-Integrated Retrieval Augmented Generation for Spoken Dialogue Models. 12505-12523 - Jiaming Zhou, Shiyao Wang, Shiwan Zhao, Jiabei He, Haoqin Sun, Hui Wang, Cheng Liu, Aobo Kong, Yujie Guo, Xi Yang, Yequan Wang, Yonghua Lin, Yong Qin:
ChildMandarin: A Comprehensive Mandarin Speech Dataset for Young Children Aged 3-5. 12524-12537 - Yao Xiao, Hai Ye, Linyao Chen, Hwee Tou Ng, Lidong Bing, Xiaoli Li, Roy Ka-Wei Lee:
Finding the Sweet Spot: Preference Data Construction for Scaling Preference Optimization. 12538-12552 - Yuhao Wang, Keyan Ding, Kehua Feng, Zeyuan Wang, Ming Qin, Xiaotong Li, Qiang Zhang, Huajun Chen:
Enhancing Safe and Controllable Protein Generation via Knowledge Preference Optimization. 12553-12569 - Mingqing Zhang, Qiang Liu, Xiang Tao, Shu Wu, Liang Wang:
SINCon: Mitigate LLM-Generated Malicious Message Injection Attack for Rumor Detection. 12570-12581 - Jungwoo Park, Taewhoo Lee, Chanwoong Yoon, Hyeon Hwang, Jaewoo Kang:
Outlier-Safe Pre-Training for Robust 4-Bit Quantization of Large Language Models. 12582-12600 - Shuofei Qiao, Zhisong Qiu, Baochang Ren, Xiaobin Wang, Xiangyuan Ru, Ningyu Zhang, Xiang Chen, Yong Jiang, Pengjun Xie, Fei Huang, Huajun Chen:
Agentic Knowledgeable Self-awareness. 12601-12625 - Jifang Wang, Yangxue, Longyue Wang, Zhenran Xu, Yiyu Wang, Yaowei Wang, Weihua Luo, Kaifu Zhang, Baotian Hu, Min Zhang:
A Unified Agentic Framework for Evaluating Conditional Image Generation. 12626-12646 - Chao Lei, Yanchuan Chang, Nir Lipovetzky, Krista A. Ehinger:
Planning-Driven Programming: A Large Language Model Programming Workflow. 12647-12684 - Yuan Sui, Yufei He, Zifeng Ding, Bryan Hooi:
Can Knowledge Graphs Make Large Language Models More Trustworthy? An Empirical Study Over Open-ended Question Answering. 12685-12701 - Yu Fei, Yasaman Razeghi, Sameer Singh:
Nudging: Inference-time Alignment of LLMs via Guided Decoding. 12702-12739 - Zhilin Wang, Yafu Li, Jianhao Yan, Yu Cheng, Yue Zhang:
Unveiling Attractor Cycles in Large Language Models: A Dynamical Systems View of Successive Paraphrasing. 12740-12755 - Zhuang Li, Yuncheng Hua, Thuy-Trang Vu, Haolan Zhan, Lizhen Qu, Gholamreza Haffari:
SCAR: Data Selection via Style Consistency-Aware Response Ranking for Efficient Instruction-Tuning of Large Language Models. 12756-12790 - Tingfeng Hui, Zhenyu Zhang, Shuohuan Wang, Weiran Xu, Yu Sun, Hua Wu:
HFT: Half Fine-Tuning for Large Language Models. 12791-12819 - Huijun Lian, Zekai Sun, Keqi Chen, Yingming Gao, Ya Li:
Beyond Surface Simplicity: Revealing Hidden Reasoning Attributes for Precise Commonsense Diagnosis. 12820-12835 - Cheng Cheng, Zhenya Huang, Guanhao Zhao, Yuxiang Guo, Xin Lin, Jinze Wu, Xin Li, Shijin Wang:
From Objectives to Questions: A Planning-based Framework for Educational Mathematical Question Generation. 12836-12856 - Mingyan Wu, Zhenghao Liu, Yukun Yan, Xinze Li, Shi Yu, Zheni Zeng, Yu Gu, Ge Yu:
RankCoT: Refining Knowledge for Retrieval-Augmented Generation through Ranking Chain-of-Thoughts. 12857-12874 - Yafu Li, Ronghao Zhang, Zhilin Wang, Huajian Zhang, Leyang Cui, Yongjing Yin, Tong Xiao, Yue Zhang:
Lost in Literalism: How Supervised Training Shapes Translationese in LLMs. 12875-12894 - Yi Su, Yuechi Zhou, Quantong Qiu, Juntao Li, Qingrong Xia, Ping Li, Xinyu Duan, Zhefeng Wang, Min Zhang:
Accurate KV Cache Quantization with Outlier Tokens Tracing. 12895-12915 - Chen Huang, Junkai Luo, Xinzuo Wang, Wenqiang Lei, Jiancheng Lv:
Can Large Language Models Understand Internet Buzzwords Through User-Generated Content. 12916-12941 - Yuanteng Chen, Yuantian Shao, Peisong Wang, Jian Cheng:
EAC-MoE: Expert-Selection Aware Compressor for Mixture-of-Experts Large Language Models. 12942-12963 - Jingran Su, Jingfan Chen, Hongxin Li, Yuntao Chen, Li Qing, Zhaoxiang Zhang:
Activation Steering Decoding: Mitigating Hallucination in Large Vision-Language Models through Bidirectional Hidden State Intervention. 12964-12974 - Fangzhi Xu, Qiushi Sun, Kanzhi Cheng, Jun Liu, Yu Qiao, Zhiyong Wu:
Interactive Evolution: A Neural-Symbolic Self-Training Framework For Large Language Models. 12975-12993 - Yucheng Zhou, Lingran Song, Jianbing Shen:
Improving Medical Large Vision-Language Models with Abnormal-Aware Feedback. 12994-13011 - Tingfeng Hui, Zhenyu Zhang, Shuohuan Wang, Yu Sun, Hua Wu, Sen Su:
Upcycling Instruction Tuning from Dense to Mixture-of-Experts via Parameter Merging. 13012-13031 - Lingfeng Zhang, Xiaoshuai Hao, Qinwen Xu, Qiang Zhang, Xinyao Zhang, Pengwei Wang, Jing Zhang, Zhongyuan Wang, Shanghang Zhang, Renjing Xu:
MapNav: A Novel Memory Representation via Annotated Semantic Maps for VLM-based Vision-and-Language Navigation. 13032-13056 - Zhenyang Cai, Junying Chen, Rongsheng Wang, Weihong Wang, Yonglin Deng, Dingjie Song, Yize Chen, Zixu Zhang, Benyou Wang:
Exploring Compositional Generalization of Multimodal LLMs for Medical Imaging. 13057-13079 - Zekai Ye, Qiming Li, Xiaocheng Feng, Libo Qin, Yichong Huang, Baohang Li, Kui Jiang, Yang Xiang, Zhirui Zhang, Yunfei Lu, Duyu Tang, Dandan Tu, Bing Qin:
CLAIM: Mitigating Multilingual Object Hallucination in Large Vision-Language Models with Cross-Lingual Attention Intervention. 13080-13094 - Xiangci Li, Zhiyu Chen, Jason Ingyu Choi, Nikhita Vedula, Besnik Fetahu, Oleg Rokhlenko, Shervin Malmasi:
Wizard of Shopping: Target-Oriented E-commerce Dialogue Generation with Decision Tree Branching. 13095-13120 - Jian Yang, Wei Zhang, Yibo Miao, Shanghaoran Quan, Zhenhe Wu, Qiyao Peng, Liqun Yang, Tianyu Liu, Zeyu Cui, Binyuan Hui, Junyang Lin:
Qwen2.5-xCoder: Multi-Agent Collaboration for Multilingual Code Instruction Tuning. 13121-13131 - Wenxuan Lu, Jiangyang He, Zhanqiu Zhang, Steven Y. Guo, Tianning Zang:
Cultivating Gaming Sense for Yourself: Making VLMs Gaming Experts. 13132-13152 - Fangzhi Xu, Hang Yan, Chang Ma, Haiteng Zhao, Qiushi Sun, Kanzhi Cheng, Junxian He, Jun Liu, Zhiyong Wu:
Genius: A Generalizable and Purely Unsupervised Self-Training Framework For Advanced Reasoning. 13153-13167 - Weizhi Fei, Zihao Wang, Hang Yin, Yang Duan, Yangqiu Song:
Extending Complex Logical Queries on Uncertain Knowledge Graphs. 13168-13193 - Haoyu Xu, Pengxiang Lan, Enneng Yang, Guibing Guo, Jianzhe Zhao, Linying Jiang, Xingwei Wang:
Knowledge Decoupling via Orthogonal Projection for Lifelong Editing of Large Language Models. 13194-13213 - Fangzhi Xu, Hang Yan, Chang Ma, Haiteng Zhao, Jun Liu, Qika Lin, Zhiyong Wu:
φ-Decoding: Adaptive Foresight Sampling for Balanced Inference-Time Exploration and Exploitation. 13214-13227 - Leyi Pan, Aiwei Liu, Shiyu Huang, Yijian Lu, Xuming Hu, Lijie Wen, Irwin King, Philip S. Yu:
Can LLM Watermarks Robustly Prevent Unauthorized Knowledge Distillation? 13228-13251 - Sunghwan Kim, Dongjin Kang, Taeyoon Kwon, Hyungjoo Chae, Dongha Lee, Jinyoung Yeo:
Rethinking Reward Model Evaluation Through the Lens of Reward Overoptimization. 13252-13280 - Christine de Kock:
Inducing lexicons of in-group language with socio-temporal context. 13281-13291 - Boyi Kang, Xinfa Zhu, Zihan Zhang, Zhen Ye, Mingshuai Liu, Ziqian Wang, Yike Zhu, Guobin Ma, Jun Chen, Longshuai Xiao, Chao Weng, Wei Xue, Lei Xie:
LLaSE-G1: Incentivizing Generalization Capability for LLaMA-based Speech Enhancement. 13292-13305 - Kunxi Li, Zhonghua Jiang, Zhouzhou Shen, Zhaode Wang, Chengfei Lv, Shengyu Zhang, Fan Wu, Fei Wu:
MadaKV: Adaptive Modality-Perception KV Cache Eviction for Efficient Multimodal Long-Context Inference. 13306-13318 - Haoyuan Wu, Rui Ming, Haisheng Zheng, Zhuolun He, Bei Yu:
Efficient OpAmp Adaptation for Zoom Attention to Golden Contexts. 13319-13331 - Shengpeng Ji, Minghui Fang, Jialong Zuo, Ziyue Jiang, Dingdong Wang, Hanting Wang, Hai Huang, Zhou Zhao:
Language-Codec: Bridging Discrete Codec Representations and Speech Language Models. 13332-13345 - Wenjun Li, Dexun Li, Kuicai Dong, Cong Zhang, Hao Zhang, Weiwen Liu, Yasheng Wang, Ruiming Tang, Yong Liu:
Adaptive Tool Use in Large Language Models with Meta-Cognition Trigger. 13346-13370 - Qihao Zhao, Yangyu Huang, Tengchao Lv, Lei Cui, Qinzheng Sun, Shaoguang Mao, Xin Zhang, Ying Xin, Qiufeng Yin, Scarlett Li, Furu Wei:
MMLU-CF: A Contamination-free Multi-task Language Understanding Benchmark. 13371-13391 - Haneul Yoo, Yongjin Yang, Hwaran Lee:
Code-Switching Red-Teaming: LLM Evaluation for Safety and Multilingual Understanding. 13392-13413 - Yuyang Ding, Xinyu Shi, Xiaobo Liang, Juntao Li, Zhaopeng Tu, Qiaoming Zhu, Min Zhang:
Unleashing LLM Reasoning Capability via Scalable Question Synthesis from Scratch. 13414-13438 - Haneul Yoo, Jieun Han, So-Yeon Ahn, Alice Oh:
DREsS: Dataset for Rubric-based Essay Scoring on EFL Writing. 13439-13454 - Junfeng Kang, Rui Li, Qi Liu, Yanjiang Chen, Zheng Zhang, Junzhe Jiang, Heng Yu, Yu Su:
PQR: Improving Dense Retrieval via Potential Query Modeling. 13455-13469 - Frederick Riemenschneider, Anette Frank:
Cross-Lingual Generalization and Compression: From Language-Specific to Shared Neurons. 13470-13491 - Cheng Guo, Hu Kai, Shuxian Liang, Yiyang Jiang, Yi Gao, Xian-Sheng Hua, Wei Dong:
SDBench: A Survey-based Domain-specific LLM Benchmarking and Optimization Framework. 13492-13506 - Yusheng Liao, Shuyang Jiang, Yanfeng Wang, Yu Wang:
ReflecTool: Towards Reflection-Aware Tool-Augmented Clinical Agents. 13507-13531 - Henrike Beyer, Chris Reed:
Lexical Recall or Logical Reasoning: Probing the Limits of Reasoning Abilities in Large Language Models. 13532-13557 - Zilu Dong, Xiangqing Shen, Zinong Yang, Rui Xia:
ChainEdit: Propagating Ripple Effects in LLM Knowledge Editing through Logical Rule-Guided Chains. 13558-13571 - Haiyang Guo, Fanhu Zeng, Ziwei Xiang, Fei Zhu, Da-Han Wang, Xu-Yao Zhang, Cheng-Lin Liu:
HiDe-LLaVA: Hierarchical Decoupling for Continual Instruction Tuning of Multimodal Large Language Model. 13572-13586 - Qika Lin, Tianzhe Zhao, Kai He, Zhen Peng, Fangzhi Xu, Ling Huang, Jingying Ma, Mengling Feng:
Self-supervised Quantized Representation for Seamlessly Integrating Knowledge Graphs with Large Language Models. 13587-13602 - Yifan Zhang, Wenyu Du, Dongming Jin, Jie Fu, Zhi Jin:
Finite State Automata Inside Transformers with Chain-of-Thought: A Mechanistic Study on State Tracking. 13603-13621 - Tianwei Lin, Jiang Liu, Wenqiao Zhang, Yang Dai, Haoyuan Li, Zhelun Yu, Wanggui He, Juncheng Li, Jiannan Guo, Hao Jiang, Siliang Tang, Yueting Zhuang:
TeamLoRA: Boosting Low-Rank Adaptation with Expert Collaboration and Competition. 13622-13637 - Ling Shi, Deyi Xiong:
CRiskEval: A Chinese Multi-Level Risk Evaluation Benchmark Dataset for Large Language Models. 13638-13659 - Jaeseong Lee, Seung-won Hwang, Aurick Qiao, Daniel F. Campos, Zhewei Yao, Yuxiong He:
STUN: Structured-Then-Unstructured Pruning for Scalable MoE Pruning. 13660-13676 - Ziyou Jiang, Mingyang Li, Guowei Yang, Junjie Wang, Yuekai Huang, Zhiyuan Chang, Qing Wang:
Mimicking the Familiar: Dynamic Command Generation for Information Theft Attacks in LLM Tool-Learning System. 13677-13693 - Huadai Liu, Jialei Wang, Rongjie Huang, Yang Liu, Heng Lu, Zhou Zhao, Wei Xue:
FlashAudio: Rectified Flow for Fast and High-Fidelity Text-to-Audio Generation. 13694-13710 - Miao Peng, Nuo Chen, Jianheng Tang, Jia Li:
How does Misinformation Affect Large Language Model Behaviors and Preferences? 13711-13748 - Jennifer D'Souza, Hamed Babaei Giglou, Quentin Münch:
YESciEval: Robust LLM-as-a-Judge for Scientific Question Answering. 13749-13783 - Ziyin Zhang, Hang Yu, Sage Lee, Peng Di, Jianguo Li, Rui Wang:
GALLa: Graph Aligned Large Language Models for Improved Source Code Understanding. 13784-13802 - Daniel Philip Rose, Chia-Chien Hung, Marco Lepri, Israa Alqassem, Kiril Gashteovski, Carolin Lawrence:
MEDDxAgent: A Unified Modular Agent Framework for Explainable Automatic Differential Diagnosis. 13803-13826 - Houquan Zhou, Bo Zhang, Zhenghua Li, Ming Yan, Min Zhang:
A Training-free LLM-based Approach to General Chinese Character Error Correction. 13827-13852 - Songtao Jiang, Yan Zhang, Yeying Jin, Zhihang Tang, Yangyang Wu, Yang Feng, Jian Wu, Zuozhu Liu:
HSCR: Hierarchical Self-Contrastive Rewarding for Aligning Medical Vision Language Models. 13853-13868 - Jiawei Guo, Tianyu Zheng, Yizhi Li, Yuelin Bai, Bo Li, Yubo Wang, King Zhu, Graham Neubig, Wenhu Chen, Xiang Yue:
MAmmoTH-VL: Eliciting Multimodal Reasoning with Instruction Tuning at Scale. 13869-13920 - Prabhat Pandey, Rupak Vignesh Swaminathan, K. V. Vijay Girish, Arunasish Sen, Jian Xie, Grant P. Strimel, Andreas Schwarz:
SIFT-50M: A Large-Scale Multilingual Dataset for Speech Instruction Fine-Tuning. 13921-13942 - Wenqian Cui, Dianzhi Yu, Xiaoqi Jiao, Ziqiao Meng, Guangyan Zhang, Qichao Wang, Steven Y. Guo, Irwin King:
Recent Advances in Speech Language Models: A Survey. 13943-13970 - Rohit Upadhya, T. Y. S. S. Santosh:
LexCLiPR: Cross-Lingual Paragraph Retrieval from Legal Judgments. 13971-13993 - Wenqiang Wang, Yan Xiao, Hao Lin, Yangshijie Zhang, Xiaochun Cao:
Multi-task Adversarial Attacks against Black-box Model with Few-shot Queries. 13994-14014 - Nguyen-Khang Le, Truong Dinh Do, Le-Minh Nguyen:
SPECTRA: Faster Large Language Model Inference with Optimized Internal and External Speculation. 14015-14034 - Zeliang Tong, Wei Wei, Xiaoye Qu, Rikui Huang, Zhixin Chen, Xingyu Yan:
Multi-level Association Refinement Network for Dialogue Aspect-based Sentiment Quadruple Analysis. 14035-14057 - Qiwen Wang, Junqi Yang, Zhenghao Lin, Zhenzhe Ying, Weiqiang Wang, Chen Lin:
Innovative Image Fraud Detection with Cross-Sample Anomaly Analysis: The Power of LLMs. 14058-14078 - Xiaoye Qu, Zengqi Yu, Dongrui Liu, Wei Wei, Daizong Liu, Jianfeng Dong, Yu Cheng:
Cooperative or Competitive? Understanding the Interaction between Attention Heads From A Game Theory Perspective. 14079-14099 - Linzhuang Sun, Hao Liang, Jingxuan Wei, Bihui Yu, Tianpeng Li, Fan Yang, Zenan Zhou, Wentao Zhang:
MM-Verify: Enhancing Multimodal Reasoning with Chain-of-Thought Verification. 14100-14115 - Aitaro Yamamoto, Hiroyuki Otomo, Hiroki Ouchi, Shohei Higashiyama, Hiroki Teranishi, Hiroyuki Shindo, Taro Watanabe:
Graph-Structured Trajectory Extraction from Travelogues. 14116-14132 - Yang Sun, Guanrong Chen, Hamid Alinejad-Rokny, Jianzhu Bao, Yuqi Huang, Bin Liang, Kam-Fai Wong, Min Yang, Ruifeng Xu:
Learning First-Order Logic Rules for Argumentation Mining. 14133-14148 - Jiafeng Liang, Shixin Jiang, Xuan Dong, Ning Wang, Zheng Chu, Hui Su, Jinlan Fu, Ming Liu, See-Kiong Ng, Bing Qin:
Investigating and Enhancing the Robustness of Large Multimodal Models Against Temporal Inconsistency. 14149-14162 - Rui Li, Liyang He, Qi Liu, Zheng Zhang, Heng Yu, Yuyang Ye, Linbo Zhu, Yu Su:
UniRAG: Unified Query Understanding Method for Retrieval Augmented Generation. 14163-14178 - Yitao Liu, Chenglei Si, Karthik R. Narasimhan, Shunyu Yao:
Contextual Experience Replay for Self-Improvement of Language Agents. 14179-14198 - Qi Sun, Pengfei Hong, Pala Tej Deep, Vernon Toh, U-Xuan Tan, Deepanway Ghosal, Soujanya Poria:
Emma-X: An Embodied Multimodal Action Model with Grounded Chain of Thought and Look-ahead Spatial Reasoning. 14199-14214 - Yupei Ren, Xinyi Zhou, Ning Zhang, Shangqing Zhao, Man Lan, Xiaopeng Bai:
Towards Comprehensive Argument Analysis in Education: Dataset, Tasks, and Method. 14215-14231 - Haohao Luo, Jiayi Kuang, Wei Liu, Ying Shen, Jian Luan, Yang Deng:
Browsing Like Human: A Multimodal Web Agent with Experiential Fast-and-Slow Thinking. 14232-14251 - Yile Liu, Ziwei Ma, Xiu Jiang, Jinglu Hu, ChangJing ChangJing, Liang Li:
MaXIFE: Multilingual and Cross-lingual Instruction Following Evaluation. 14252-14332 - Guijin Son, Jiwoo Hong, Hyunwoo Ko, James Thorne:
Linguistic Generalizability of Test-Time Scaling in Mathematical Reasoning. 14333-14368 - Chenhao Zhang, Xi Feng, Yuelin Bai, Xeron Du, Jinchang Hou, Kaixin Deng, Guangzeng Han, Qinrui Li, Bingli Wang, Jiaheng Liu, Xingwei Qu, Yifei Zhang, Qixuan Zhao, Yiming Liang, Ziqiang Liu, Feiteng Fang, Min Yang, Wenhao Huang, Chenghua Lin, Ge Zhang, Shiwen Ni:
Can MLLMs Understand the Deep Implication Behind Chinese Images? 14369-14402 - Mukhammed Togmanov, Nurdaulet Mukhituly, Diana Turmakhan, Jonibek Mansurov, Maiya Goloburda, Akhmed Sakip, Zhuohan Xie, Yuxia Wang, Bekassyl Syzdykov, Nurkhan Laiyk, Alham Fikri Aji, Ekaterina Kochmar, Preslav Nakov, Fajri Koto:
KazMMLU: Evaluating Language Models on Kazakh, Russian, and Regional Knowledge of Kazakhstan. 14403-14416 - Hyangsuk Min, Yuho Lee, Minjeong Ban, Jiaqi Deng, Nicole Hee-Yeon Kim, Taewon Yun, Hang Su, Jason Cai, Hwanjun Song:
Towards Multi-dimensional Evaluation of LLM Summarization across Domains and Languages. 14417-14450 - Minwei Zhang, Haifeng Sun, Jingyu Wang, Shaolong Li, Wanyi Ning, Qi Qi, Zirui Zhuang, Jianxin Liao:
ClusterAttn: KV Cache Compression under Intrinsic Attention Clustering. 14451-14473 - Eunwon Kim, Chanho Park, Buru Chang:
SHARE: Shared Memory-Aware Open-Domain Long-Term Dialogue Dataset Constructed from Movie Script. 14474-14498 - Jiecheng Zhang, C. L. Philip Chen, Shuzhen Li, Tong Zhang:
Incongruity-aware Tension Field Network for Multi-modal Sarcasm Detection. 14499-14508 - Nurkhan Laiyk, Daniil Orel, Rituraj Joshi, Maiya Goloburda, Yuxia Wang, Preslav Nakov, Fajri Koto:
Instruction Tuning on Public Government and Cultural Data for Low-Resource Language: a Case Study in Kazakh. 14509-14538 - Chenxi Dai, Lin Lu, Pan Zhou:
Stealing Training Data from Large Language Models in Decentralized Training through Activation Inversion Attack. 14539-14551 - Yu Xia, Subhojyoti Mukherjee, Zhouhang Xie, Junda Wu, Xintong Li, Ryan Aponte, Hanjia Lyu, Joe Barrow, Hongjie Chen, Franck Dernoncourt, Branislav Kveton, Tong Yu, Ruiyi Zhang, Jiuxiang Gu, Nesreen K. Ahmed, Yu Wang, Xiang Chen, Hanieh Deilamsalehy, Sungchul Kim, Zhengmian Hu, Yue Zhao, Nedim Lipka, Seunghyun Yoon, Ting-Hao Kenneth Huang, Zichao Wang, Puneet Mathur, Soumyabrata Pal, Koyel Mukherjee, Zhehao Zhang, Namyong Park, Thien Huu Nguyen, Jiebo Luo, Ryan A. Rossi, Julian J. McAuley:
From Selection to Generation: A Survey of LLM-based Active Learning. 14552-14569 - Qinglin Zhang, Luyao Cheng, Chong Deng, Qian Chen, Wen Wang, Siqi Zheng, Jiaqing Liu, Hai Yu, Chao-Hong Tan, Zhihao Du, Shiliang Zhang:
OmniFlatten: An End-to-end GPT Model for Seamless Voice Conversation. 14570-14580 - Dohoon Kim, Donghun Kang, Taesup Moon:
DoMIX: An Efficient Framework for Exploiting Domain Knowledge in Fine-Tuning. 14581-14602 - Meidan Ding, Jipeng Zhang, Wenxuan Wang, Haiqin Zhong, Xiaoqin Wang, Xinheng Lyu, Wenting Chen, Linlin Shen:
EAGLE: Expert-Guided Self-Enhancement for Preference Alignment in Pathology Large Vision-Language Model. 14603-14619 - Vignesh Kothapalli, Hamed Firooz, Maziar Sanjabi:
CoT-ICL Lab: A Synthetic Framework for Studying Chain-of-Thought Learning from In-Context Demonstrations. 14620-14642 - Chenxing Wei, Yao Shu, Ying Tiffany He, Fei Yu:
Flexora: Flexible Low-Rank Adaptation for Large Language Models. 14643-14682 - Lei Wang, Ruobing Zuo, Gaolei He, Jianlin Wang, Zhengfeng Yang:
QDTSynth: Quality-Driven Formal Theorem Synthesis for Enhancing Proving Performance of LLMs. 14683-14698 - Yi Lu, Jiawang Cao, Yongliang Wu, Bozheng Li, Licheng Tang, Yangguang Ji, Chong Wu, Jay Wu, Wenbo Zhu:
RSVP: Reasoning Segmentation via Visual Prompting and Multi-modal Chain-of-Thought. 14699-14716 - Tan Yue, Rui Mao, Xuzhao Shi, Shuo Zhan, Zuhao Yang, Dongyan Zhao:
QAEval: Mixture of Evaluators for Question-Answering Task Evaluation. 14717-14730 - Daiying Zhao, Xinyu Yang, Hang Chen:
Debiasing the Fine-Grained Classification Task in LLMs with Bias-Aware PEFT. 14731-14746 - Zhenyan Lu, Xiang Li, Dongqi Cai, Rongjie Yi, Fangming Liu, Wei Liu, Jian Luan, Xiwen Zhang, Nicholas D. Lane, Mengwei Xu:
Demystifying Small Language Models for Edge Deployment. 14747-14764 - Naibin Gu, Peng Fu, Xiyu Liu, Ke Ma, Zheng Lin, Weiping Wang:
Adapt Once, Thrive with Updates: Transferable Parameter-Efficient Fine-Tuning on Evolving Base Models. 14765-14783 - Oikantik Nath, Hanani Bathina, Mohammed Safi Ur Rahman Khan, Mitesh M. Khapra:
Can Vision-Language Models Evaluate Handwritten Math? 14784-14814 - Chenxu Wang, Yilin Lyu, Zicheng Sun, Liping Jing:
Continual Gradient Low-Rank Projection Fine-Tuning for LLMs. 14815-14829 - Ziming Wang, Zeyu Shi, Haoyi Zhou, Shiqi Gao, Qingyun Sun, Jianxin Li:
Towards Objective Fine-tuning: How LLMs' Prior Knowledge Causes Potential Poor Calibration? 14830-14853 - Keane Ong, Rui Mao, Deeksha Varshney, Erik Cambria, Gianmarco Mengaldo:
Towards Robust ESG Analysis Against Greenwashing Risks: Aspect-Action Analysis with Cross-Category Generalization. 14854-14879 - Yilei Jiang, Xinyan Gao, Tianshuo Peng, Yingshui Tan, Xiaoyong Zhu, Bo Zheng, Xiangyu Yue:
HiddenDetect: Detecting Jailbreak Attacks against Multimodal Large Language Models via Monitoring Hidden States. 14880-14893 - Joel Niklaus, Jakob Merane, Luka Nenadic, Sina Ahmadi, Yingqiang Gao, Cyrill A. H. Chevalley, Claude Humbel, Christophe Gösken, Lorenzo Tanzi, Thomas Lüthi, Stefan Palombo, Spencer Poff, Boling Yang, Nan Wu, Matthew Guillod, Robin Mamié, Daniel Brunner, Julio Pereyra, Niko Grupen:
SwiLTra-Bench: The Swiss Legal Translation Benchmark. 14894-14916 - Yichen Dong, Xinglin Lyu, Junhui Li, Daimeng Wei, Min Zhang, Shimin Tao, Hao Yang:
Two Intermediate Translations Are Better Than One: Fine-tuning LLMs for Document-level Translation Refinement. 14917-14933 - Philipp Mondorf, Sondre Wold, Barbara Plank:
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models. 14934-14955 - Clara Lachenmaier, Judith Sieker, Sina Zarrieß:
Can LLMs Ground when they (Don't) Know: A Study on Direct and Loaded Political Questions. 14956-14975 - Yingjian Chen, Haoran Liu, Yinhong Liu, Jinxiang Xie, Rui Yang, Han Yuan, Yanran Fu, Peng Yuan Zhou, Qingyu Chen, James Caverlee, Irene Li:
GraphCheck: Breaking Long-Term Text Barriers with Extracted Knowledge Graph-Powered Fact-Checking. 14976-14995 - Shanu Kumar, Akhila Yesantarao Venkata, Shubhanshu Khandelwal, Bishal Santra, Parag Agrawal, Manish Gupta:
SCULPT: Systematic Tuning of Long Prompts. 14996-15029 - Kai He, Yucheng Huang, Wenqing Wang, Delong Ran, Dongming Sheng, Junxuan Huang, Qika Lin, Jiaxing Xu, Wenqiang Liu, Mengling Feng:
Crab: A Novel Configurable Role-Playing LLM with Assessing Benchmark. 15030-15052 - Yingshui Tan, Boren Zheng, Baihui Zheng, Kerui Cao, Huiyun Jing, Jincheng Wei, Jiaheng Liu, Yancheng He, Wenbo Su, Xiaoyong Zhu, Bo Zheng, Kaifu Zhang:
Chinese SafetyQA: A Safety Short-form Factuality Benchmark for Large Language Models. 15053-15076 - Xiaorui Wu, Xiaofeng Mao, Fei Li, Xin Zhang, Xuanhong Li, Chong Teng, Donghong Ji, Zhuang Li:
TRIDENT: Enhancing Large Language Model Safety with Tri-Dimensional Diversified Red-Teaming Data Synthesis. 15077-15099 - Jungseob Lee, Seongtae Hong, Hyeonseok Moon, Heuiseok Lim:
Cross-Lingual Optimization for Language Transfer in Large Language Models. 15100-15119 - Minghui Fang, Shengpeng Ji, Jialong Zuo, Hai Huang, Yan Xia, Jieming Zhu, Xize Cheng, Xiaoda Yang, Wenrui Liu, Gang Wang, Zhenhua Dong, Zhou Zhao:
CART: A Generative Cross-Modal Retrieval Framework With Coarse-To-Fine Semantic Modeling. 15120-15133 - Xiang Yue, Tianyu Zheng, Yuansheng Ni, Yubo Wang, Kai Zhang, Shengbang Tong, Yuxuan Sun, Botao Yu, Ge Zhang, Huan Sun, Yu Su, Wenhu Chen, Graham Neubig:
MMMU-Pro: A More Robust Multi-discipline Multimodal Understanding Benchmark. 15134-15186 - Xueru Wen, Jie Lou, Zichao Li, Yaojie Lu, XingYu, Yuqiu Ji, Guohai Xu, Hongyu Lin, Ben He, Xianpei Han, Le Sun, Debing Zhang:
Cheems: A Practical Guidance for Building and Evaluating Chinese Reward Models from Scratch. 15187-15211 - Chak Tou Leong, Qingyu Yin, Jian Wang, Wenjie Li:
Why Safeguarded Ships Run Aground? Aligned Large Language Models' Safety Mechanisms Tend to Be Anchored in The Template Region. 15212-15229 - Jinhe Bi, Yujun Wang, Haokun Chen, Xun Xiao, Artur Hecker, Volker Tresp, Yunpu Ma:
LLaVA Steering: Visual Instruction Tuning with 500x Fewer Parameters through Modality Linear Representation-Steering. 15230-15250 - Minju Seo, Jinheon Baek, Seongyun Lee, Sung Ju Hwang:
Efficient Long Context Language Model Retrieval with Compression. 15251-15268 - Runxuan Liu, Luobei Luobei, Jiaqi Li, Baoxin Wang, Ming Liu, Dayong Wu, Shijin Wang, Bing Qin:
Ontology-Guided Reverse Thinking Makes Large Language Models Stronger on Knowledge Graph Question Answering. 15269-15284 - Zhe Chen, Yusheng Liao, Shuyang Jiang, Pingjie Wang, Yiqiu Guo, Yanfeng Wang, Yu Wang:
Towards Omni-RAG: Comprehensive Retrieval-Augmented Generation for Large Language Models in Medical Applications. 15285-15309 - Yuxin Lin, Yinglin Zheng, Ming Zeng, Wangzheng Shi:
Predicting Turn-Taking and Backchannel in Human-Machine Conversations Using Linguistic, Acoustic, and Visual Signals. 15310-15322 - Ryo Nagata, Kumiko Tanaka-Ishii:
A New Formulation of Zipf's Meaning-Frequency Law through Contextual Diversity. 15323-15335 - Wanli Yang, Fei Sun, Jiajun Tan, Xinyu Ma, Qi Cao, Dawei Yin, Huawei Shen, Xueqi Cheng:
The Mirage of Model Editing: Revisiting Evaluation in the Wild. 15336-15354 - Eran Hirsch, Aviv Slobodkin, David Wan, Elias Stengel-Eskin, Mohit Bansal, Ido Dagan:
LAQuer: Localized Attribution Queries in Content-grounded Generation. 15355-15370 - Xiaoqian Liu, Ke Wang, Yongbin Li, Yuchuan Wu, Wentao Ma, Aobo Kong, Fei Huang, Jianbin Jiao, Junge Zhang:
EPO: Explicit Policy Optimization for Strategic Reasoning in LLMs via Reinforcement Learning. 15371-15396 - Jihyung Lee, Jin-Seop Lee, Jaehoon Lee, YunSeok Choi, Jee-Hyong Lee:
DCG-SQL: Enhancing In-Context Learning for Text-to-SQL with Deep Contextual Schema Link Graph. 15397-15412 - Shuhao Guan, Moule Lin, Cheng Xu, Xinyi Liu, Jinman Zhao, Jiexin Fan, Qi Xu, Derek Greene:
PreP-OCR: A Complete Pipeline for Document Image Restoration and Enhanced OCR Accuracy. 15413-15425 - Junhong Wan, Tao Yu, Kunyu Jiang, Yao Fu, Weihao Jiang, Jiang Zhu:
Digest the Knowledge: Large Language Models empowered Message Passing for Knowledge Graph Question Answering. 15426-15442 - Yangqin Jiang, Yuhao Yang, Lianghao Xia, Da Luo, Kangyi Lin, Chao Huang:
RecLM: Recommendation Instruction Tuning. 15443-15459 - Hongling Xu, Yice Zhang, Qianlong Wang, Ruifeng Xu:
DS²-ABSA: Dual-Stream Data Synthesis with Label Refinement for Few-Shot Aspect-Based Sentiment Analysis. 15460-15478 - HangChen HangChen, Chao-Han Huck Yang, Jia-Chen Gu, Sabato Marco Siniscalchi, Jun Du:
MISP-Meeting: A Real-World Dataset with Multimodal Cues for Long-form Meeting Transcription and Summarization. 15479-15492 - Sohan Patnaik, Milan Aggarwal, Sumit Bhatia, Balaji Krishnamurthy:
Learning Together to Perform Better: Teaching Small-Scale LLMs to Collaborate via Preferential Rationale Tuning. 15493-15512 - Ziting Xian, Jiawei Gu, Lingbo Li, Shangsong Liang:
MolRAG: Unlocking the Power of Large Language Models for Molecular Property Prediction. 15513-15531 - Guangzhi Sun, Anmol Kagrecha, Potsawee Manakul, Philip C. Woodland, Mark J. F. Gales:
SkillAggregation: Reference-free LLM-Dependent Aggregation. 15532-15548 - Yanwei Yue, Guibin Zhang, Boyang Liu, Guancheng Wan, Kun Wang, Dawei Cheng, Yiyan Qi:
MasRouter: Learning to Route LLMs for Multi-Agent Systems. 15549-15572 - Haozhe Xu, Xiaohua Wang, Changze Lv, Xiaoqing Zheng:
Beyond Single Labels: Improving Conversational Recommendation through LLM-Powered Data Augmentation. 15573-15590 - Peiwen Yuan, Yueqi Zhang, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Jiayi Shi, Chuyi Tan, Boyuan Pan, Yao Hu, Kan Li:
Beyond One-Size-Fits-All: Tailored Benchmarks for Efficient Evaluation. 15591-15615 - Shuai Wang, Yinan Yu:
iQUEST: An Iterative Question-Guided Framework for Knowledge Base Question Answering. 15616-15628 - Wei Song, Zhenya Huang, Cheng Cheng, Weibo Gao, Bihan Xu, Guanhao Zhao, Fei Wang, Runze Wu:
IRT-Router: Effective and Interpretable Multi-LLM Routing via Item Response Theory. 15629-15644 - Tianyu Dong, Bo Li, Jinsong Liu, Shaolin Zhu, Deyi Xiong:
MLAS-LoRA: Language-Aware Parameters Detection and LoRA-Based Knowledge Transfer for Multilingual Machine Translation. 15645-15660 - Jiaheng Liu, Ken Deng, Congnan Liu, Jian Yang, Shukai Liu, He Zhu, Peng Zhao, Linzheng Chai, Yanan Wu, Ke Jin, Ge Zhang, Zekun Moore Wang, Guoan Zhang, Yingshui Tan, Bangyu Xiang, Zhaoxiang Zhang, Wenbo Su, Bo Zheng:
M2RC-EVAL: Massively Multilingual Repository-level Code Completion Evaluation. 15661-15684 - Susanna Rücker, Alan Akbik:
Evaluating Design Decisions for Dual Encoder-based Entity Disambiguation. 15685-15701 - Irina Nikishina, Saba Anwar, Nikolay Dolgov, Maria Manina, Daria Ignatenko, Artem Shelmanov, Chris Biemann:
How to Compare Things Properly? A Study of Argument Relevance in Comparative Question Answering. 15702-15720 - Zichen Tang, Haihong E, Ziyan Ma, Haoyang He, Jiacheng Liu, Zhongjun Yang, Zihua Rong, Rongjin Li, Kun Ji, Qing Huang, Xinyang Hu, Yang Liu, Qianhe Zheng:
FinanceReasoning: Benchmarking Financial Numerical Reasoning More Credible, Comprehensive and Challenging. 15721-15749 - Weiqi Wang, Wengang Zhou, Zongmeng Zhang, Jie Zhao, Houqiang Li:
Controllable Style Arithmetic with Language Models. 15750-15799 - Peiyu Liu, Tianwen Wei, Bo Zhu, Xin Zhao, Shuicheng Yan:
Masks Can be Learned as an Alternative to Experts. 15800-15811 - Chao Wen, Jacqueline Staub, Adish Singla:
Program Synthesis Benchmark for Visual Programming in XLogoOnline Environment. 15812-15838 - Wentao Hu, Wengyu Zhang, Yiyang Jiang, Chen Jason Zhang, Xiaoyong Wei, Qing Li:
Removal of Hallucination on Hallucination: Debate-Augmented RAG. 15839-15853 - Kechi Zhang, Ge Li, Yihong Dong, Jingjing Xu, Jun Zhang, Jing Su, Yongfei Liu, Zhi Jin:
CodeDPO: Aligning Code Models with Self Generated and Verified Source Code. 15854-15871 - Alexander Miserlis Hoyle, Lorena Calvo-Bartolomé, Jordan Lee Boyd-Graber, Philip Resnik:
ProxAnn: Use-Oriented Evaluations of Topic Models and Document Clustering. 15872-15897 - Yiting Ran, Xintao Wang, Tian Qiu, Jiaqing Liang, Yanghua Xiao, Deqing Yang:
BOOKWORLD: From Novels to Interactive Agent Societies for Story Creation. 15898-15912 - Ryo Kishino, Hiroaki Yamagiwa, Ryo Nagata, Sho Yokoi, Hidetoshi Shimodaira:
Quantifying Lexical Semantic Shift via Unbalanced Optimal Transport. 15913-15933 - Hao Peng, Yunjia Qi, Xiaozhi Wang, Zijun Yao, Bin Xu, Lei Hou, Juanzi Li:
Agentic Reward Modeling: Integrating Human Preferences with Verifiable Correctness Signals for Reliable Reward Systems. 15934-15949 - Gengyuan Shi, Chaokun Wang, Yabin Liu, Jiawei Ren:
Adaptive and Robust Translation from Natural Language to Multi-model Query Languages. 15950-15965 - Marco Scialanga, Thibault Laugel, Vincent Grari, Marcin Detyniecki:
SAKE: Steering Activations for Knowledge Editing. 15966-15978 - Danni Liu, Jan Niehues:
Middle-Layer Representation Alignment for Cross-Lingual Transfer in Fine-Tuned LLMs. 15979-15996 - Arduin Findeis, Floris Weers, Guoli Yin, Ke Ye, Ruoming Pang, Tom Gunter:
Can External Validation Tools Improve Annotation Quality for LLM-as-a-Judge? 15997-16020 - Weitao Ma, Xiyuan Du, Xiaocheng Feng, Lei Huang, Yichong Huang, Huiyi Zhang, Xiaoliang Yang, Baohang Li, Xiachong Feng, Ting Liu, Bing Qin:
One for All: Update Parameterized Knowledge Across Multiple Models with Once Edit. 16021-16034 - Xiasi Wang, Tianliang Yao, Simin Chen, Runqi Wang, Lei Ye, Kuofeng Gao, Yi Huang, Yuan Yao:
VLMInferSlow: Evaluating the Efficiency Robustness of Large Vision-Language Models as a Service. 16035-16050 - Nitay Calderon, Roi Reichart, Rotem Dror:
The Alternative Annotator Test for LLM-as-a-Judge: How to Statistically Justify Replacing Human Annotators with LLMs. 16051-16081 - Romain Meunier, Farah Benamara, Véronique Moriceau, Zhongzheng Qiao, Savitha Ramasamy:
CrisisTS: Coupling Social Media Textual Data and Meteorological Time Series for Urgency Classification. 16082-16099 - Junhao Shi, Qinyuan Cheng, Zhaoye Fei, Yining Zheng, Qipeng Guo, Xipeng Qiu:
How to Mitigate Overfitting in Weak-to-strong Generalization? 16100-16118 - Kai Xiong, Xiao Ding, Yixin Cao, Yuxiong Yan, Li Du, Yufei Zhang, Jinglong Gao, Jiaqian Liu, Bing Qin, Ting Liu:
Com²: A Causal-Guided Benchmark for Exploring Complex Commonsense Reasoning in Large Language Models. 16119-16140 - Yang Hou, Zhenghua Li:
Dynamic Head Selection for Neural Lexicalized Constituency Parsing. 16141-16155 - Jian Liao, Yu Feng, Yujin Zheng, Jun Zhao, Suge Wang, Jianxing Zheng:
My Words Imply Your Opinion: Reader Agent-Based Propagation Enhancement for Personalized Implicit Emotion Analysis. 16156-16172 - Zhiyuan Zhu, Yusheng Liao, Zhe Chen, Yuhao Wang, Yunfeng Guan, Yanfeng Wang, Yu Wang:
EvolveBench: A Comprehensive Benchmark for Assessing Temporal Awareness in LLMs on Evolving Knowledge. 16173-16188 - Yujia Hu, Tuan-Phong Nguyen, Shrestha Ghosh, Simon Razniewski:
Enabling LLM Knowledge Analysis via Extensive Materialization. 16189-16202 - Jialong Zuo, Shengpeng Ji, Minghui Fang, Mingze Li, Ziyue Jiang, Xize Cheng, Xiaoda Yang, Feiyang Chen, Xinyu Duan, Zhou Zhao:
Rhythm Controllable and Efficient Zero-Shot Voice Conversion via Shortcut Flow Matching. 16203-16217 - Jingcheng Niu, Xingdi Yuan, Tong Wang, Hamidreza Saghir, Amir H. Abdi:
Llama See, Llama Do: A Mechanistic Perspective on Contextual Entrainment and Distraction in LLMs. 16218-16239 - Honglin Guo, Kai Lv, Qipeng Guo, Tianyi Liang, Zhiheng Xi, Demin Song, Qiuyinzhe Zhang, Yu Sun, Kai Chen, Xipeng Qiu, Tao Gui:
CritiQ: Mining Data Quality Criteria from Human Preferences. 16240-16261 - Yuki Ichihara, Yuu Jinnai, Kaito Ariu, Tetsuro Morimura, Eiji Uchibe:
Theoretical Guarantees for Minimum Bayes Risk Decoding. 16262-16284 - Tianyuan Shi, Canbin Huang, Fanqi Wan, Longguang Zhong, Ziyi Yang, Weizhou Shen, Xiaojun Quan, Ming Yan:
Mutual-Taught for Co-adapting Policy and Reward Models. 16285-16298 - Wenhao Zhuang, Yuan Sun, Xiaobing Zhao:
Enhancing Cross-Lingual Transfer through Reversible Transliteration: A Huffman-Based Approach for Low-Resource Languages. 16299-16313 - Jiaxu Zhao, Meng Fang, Kun Zhang, Mykola Pechenizkiy:
Unmasking Style Sensitivity: A Causal Analysis of Bias Evaluation Instability in Large Language Models. 16314-16338 - Dávid Javorský, Ondrej Bojar, François Yvon:
MockConf: A Student Interpretation Dataset: Analysis, Word- and Span-level Alignment and Baselines. 16339-16356 - Ercong Nie, Bo Shao, Mingyang Wang, Zifeng Ding, Helmut Schmid, Hinrich Schütze:
BMIKE-53: Investigating Cross-Lingual Knowledge Editing with In-Context Learning. 16357-16374 - Dingyi Yang, Qin Jin:
What Matters in Evaluating Book-Length Stories? A Systematic Study of Long Story Evaluation. 16375-16398 - Linhai Zhang, Jialong Wu, Deyu Zhou, Yulan He:
PROPER: A Progressive Learning Framework for Personalized Large Language Models with Group-Level Adaptation. 16399-16411 - Longyin Zhang, Bowei Zou, AiTi Aw:
Enhancing Event-centric News Cluster Summarization via Data Sharpening and Localization Insights. 16412-16426 - Zhitao He, Sandeep Polisetty, Zhiyuan Fan, Yuchen Huang, Shujin Wu, Yi R. Fung:
MMBoundary: Advancing MLLM Knowledge Boundary Awareness through Reasoning Step Confidence Calibration. 16427-16444 - Xiaodong Wu, Minhao Wang, Yichen Liu, Xiaoming Shi, He Yan, Xiangju Li, Junmin Zhu, Wei Zhang:
LIFBench: Evaluating the Instruction Following Performance and Stability of Large Language Models in Long-Context Scenarios. 16445-16468 - Shuzheng Si, Haozhe Zhao, Gang Chen, Cheng Gao, Yuzhuo Bai, Zhitong Wang, Kaikai An, Kangyang Luo, Chen Qian, Fanchao Qi, Baobao Chang, Maosong Sun:
Aligning Large Language Models to Follow Instructions and Hallucinate Less via Effective Data Filtering. 16469-16488 - Junwoo Ha, Hyunjun Kim, Sangyoon Yu, Haon Park, Ashkan Yousefpour, Yuna Park, Suhyun Kim:
One-Shot is Enough: Consolidating Multi-Turn Attacks into Efficient Single-Turn Prompts for LLMs. 16489-16507 - Zhiwei Liu, Kailai Yang, Qianqian Xie, Christine de Kock, Sophia Ananiadou, Eduard H. Hovy:
RAEmoLLM: Retrieval Augmented LLMs for Cross-Domain Misinformation Detection Using In-Context Learning Based on Emotional Information. 16508-16523 - Zhiyue Liu, Xinru Zhang, Jinyuan Liu:
Task-Specific Information Decomposition for End-to-End Dense Video Captioning. 16524-16536 - Haitao Li, Junjie Chen, Qingyao Ai, Zhumin Chu, Yujia Zhou, Qian Dong, Yiqun Liu:
CalibraEval: Calibrating Prediction Distribution to Mitigate Selection Bias in LLMs-as-Judges. 16537-16552 - Sahrish Khan, Arshad Jhumka, Gabriele Pergola:
Explaining Matters: Leveraging Definitions and Semantic Expansion for Sexism Detection. 16553-16571 - Elena Sofia Ruzzetti, Giancarlo A. Xompero, Davide Venditti, Fabio Massimo Zanzotto:
Private Memorization Editing: Turning Memorization into a Defense to Strengthen Data Privacy in Large Language Models. 16572-16592 - Xinyu Zhang, Yuxuan Dong, Yanrui Wu, Jiaxing Huang, Chengyou Jia, Basura Fernando, Mike Zheng Shou, Lingling Zhang, Jun Liu:
PhysReason: A Comprehensive Benchmark towards Physics-Based Reasoning. 16593-16615 - Yein Park, Chanwoong Yoon, Jungwoo Park, Minbyul Jeong, Jaewoo Kang:
Does Time Have Its Place? Temporal Heads: Where Language Models Recall Time-specific Information. 16616-16643 - Zheheng Luo, Xin Zhang, Xiao Liu, Haoling Li, Yeyun Gong, Qi Chen, Peng Cheng:
Velocitune: A Velocity-based Dynamic Domain Reweighting Method for Continual Pre-training. 16644-16656 - Jingjie Zeng, Liang Yang, Zekun Wang, Yuanyuan Sun, Hongfei Lin:
Sheep's Skin, Wolf's Deeds: Are LLMs Ready for Metaphorical Implicit Hate Speech? 16657-16677 - Houcheng Jiang, Junfeng Fang, Tianyu Zhang, Baolong Bi, An Zhang, Ruipeng Wang, Tao Liang, Xiang Wang:
Neuron-Level Sequential Editing for Large Language Models. 16678-16702 - Shengzhuang Chen, Ying Wei, Jonathan Richard Schwarz:
Automatic Expert Discovery in LLM Upcycling via Sparse Interpolated Mixture-of-Experts. 16703-16717 - Keqi Deng, Wenxi Chen, Xie Chen, Philip C. Woodland:
SimulS2S-LLM: Unlocking Simultaneous Inference of Speech LLMs for Speech-to-Speech Translation. 16718-16734 - Wenqian Cui, Xiaoqi Jiao, Ziqiao Meng, Irwin King:
VoxEval: Benchmarking the Knowledge Understanding Capabilities of End-to-End Spoken Language Models. 16735-16753 - Xiaoxi Li, Jiajie Jin, Yujia Zhou, Yongkang Wu, Zhonghua Li, Ye Qi, Zhicheng Dou:
RetroLLM: Empowering Large Language Models to Retrieve Fine-grained Evidence within Generation. 16754-16779 - Chengkun Cai, Xu Zhao, Haoliang Liu, Zhongyu Jiang, Tianfang Zhang, Zongkai Wu, Jenq-Neng Hwang, Lei Li:
The Role of Deductive and Inductive Reasoning in Large Language Models. 16780-16790 - Yupei Du, Yingjin Song, Hugh Mee Wong, Daniil Ignatev, Albert Gatt, Dong Nguyen:
Disentangling the Roles of Representation and Selection in Data Pruning. 16791-16809 - Yukti Makhija, Priyanka Agrawal, Rishi Saket, Aravindan Raghuveer:
FRACTAL: Fine-Grained Scoring from Aggregate Text Labels. 16810-16830 - Makoto Nakatsuji, Shuhei Tateishi, Yasuhiro Fujiwara, Ayaka Matsumoto, Narichika Nomoto, Yoshihide Sato:
ACT: Knowledgeable Agents to Design and Perform Complex Tasks. 16831-16861 - Yixuan Wang, Freda Shi:
Logical forms complement probability in understanding language model (and human) performance. 16862-16877 - Yuxuan Gu, Wenjie Wang, Xiaocheng Feng, Weihong Zhong, Kun Zhu, Lei Huang, Ting Liu, Bing Qin, Tat-Seng Chua:
Length Controlled Generation for Black-box LLMs. 16878-16895 - Lei Huang, Xiaocheng Feng, Weitao Ma, Yuchun Fan, Xiachong Feng, Yangfan Ye, Weihong Zhong, Yuxuan Gu, Baoxin Wang, Dayong Wu, Guoping Hu, Bing Qin:
Improving Contextual Faithfulness of Large Language Models via Retrieval Heads-Induced Optimization. 16896-16913 - Wenxuan Lu, Wei Liu, Jian Luan, Bin Wang, Songhao Jiang, Tianning Zang:
Global Eye: Breaking the "Fixed Thinking Pattern" during the Instruction Expansion Process. 16914-16928 - Gorjan Radevski, Kiril Gashteovski, Shahbaz Syed, Christopher Malon, Sebastien Nicolas, Chia-Chien Hung, Timo Sztyler, Verena Heußer, Wiem Ben Rim, Masafumi Enomoto, Kunihiro Takeoka, Masafumi Oyamada, Goran Glavas, Carolin Lawrence:
On Synthesizing Data for Context Attribution in Question Answering. 16929-16950 - Peiwen Jiang, Haitong Jiang, Ruhui Ma, Yvonne Jie Chen, Jinhua Cheng:
TST: A Schema-Based Top-Down and Dynamic-Aware Agent of Text-to-Table Tasks. 16951-16966 - Zairun Yang, Yilin Wang, Zhengyan Shi, Yuan Yao, Lei Liang, Keyan Ding, Emine Yilmaz, Huajun Chen, Qiang Zhang:
EventRAG: Enhancing LLM Generation with Event Knowledge Graphs. 16967-16979 - Yang Zhao, Li Du, Xiao Ding, Kai Xiong, Ting Liu, Bing Qin:
Analyzing the Rapid Generalization of SFT via the Perspective of Attention Head Activation Patterns. 16980-16992 - Wenxuan Wang, Xiaoyuan Liu, Kuiyi Gao, Jen-tse Huang, Youliang Yuan, Pinjia He, Shuai Wang, Zhaopeng Tu:
Can't See the Forest for the Trees: Benchmarking Multimodal Safety Awareness for Multimodal LLMs. 16993-17006 - Jiayi Zeng, Yizhe Feng, Mengliang He, Wenhui Lei, Wei Zhang, Zeming Liu, Xiaoming Shi, Aimin Zhou:
Mis-prompt: Benchmarking Large Language Models for Proactive Error Handling. 17007-17034 - Soumyabrata Chaudhuri, Pranav Purkar, Ritwik Raghav, Shubhojit Mallick, Manish Gupta, Abhik Jana, Shreya Ghosh:
TripCraft: A Benchmark for Spatio-Temporally Fine Grained Travel Planning. 17035-17064 - Zihan Liu, Yizhen Wang, Rui Wang, Sai Wu:
DualGuard: A Parameter Space Transformation Approach for Bidirectional Defense in Split-Based LLM Fine-Tuning. 17065-17080 - Zihao Yue, Yepeng Zhang, Ziheng Wang, Qin Jin:
Movie101v2: Improved Movie Narration Benchmark. 17081-17095 - Nan Hu, Jiaoyan Chen, Yike Wu, Guilin Qi, Hongru Wang, Sheng Bi, Yongrui Chen, Tongtong Wu, Jeff Z. Pan:
Can LLMs Evaluate Complex Attribution in QA? Automatic Benchmarking using Knowledge Graphs. 17096-17118 - Jongwook Han, Dongmin Choi, Woojung Song, Eun-Ju Lee, Yohan Jo:
Value Portrait: Assessing Language Models' Values through Psychometrically and Ecologically Valid Items. 17119-17159 - Wei Li, Xin Zhang, Zhongxin Guo, Shaoguang Mao, Wen Luo, Guangyue Peng, Yangyu Huang, Houfeng Wang, Scarlett Li:
FEA-Bench: A Benchmark for Evaluating Repository-Level Code Generation for Feature Implementation. 17160-17176 - Jingyu Liu, JingquanPeng JingquanPeng, Xiaopeng Wu, Xubin Li, Tiezheng Ge, Bo Zheng, Yong Liu:
Do not Abstain! Identify and Solve the Uncertainty. 17177-17197 - Baolong Bi, Shenghua Liu, Lingrui Mei, Yiwei Wang, Junfeng Fang, Pengliang Ji, Xueqi Cheng:
Decoding by Contrasting Knowledge: Enhancing Large Language Model Confidence on Edited Facts. 17198-17208 - Mohammad Zia Ur Rehman, Anukriti Bhatnagar, Omkar Kabde, Shubhi Bansal, Nagendra Kumar:
ImpliHateVid: A Benchmark Dataset and Two-stage Contrastive Learning Framework for Implicit Hate Speech Detection in Videos. 17209-17221 - Leonardo Ranaldi, Marco Valentino, André Freitas:
Improving Chain-of-Thought Reasoning via Quasi-Symbolic Abstractions. 17222-17240 - Aniket Bhattacharyya, Anurag Tripathi, Ujjal Das, Archan Karmakar, Amit Pathak, Maneesh Gupta:
Information Extraction from Visually Rich Documents using LLM-based Organization of Documents into Independent Textual Segments. 17241-17256 - Bohan Lyu, Xin Cong, Heyang Yu, Pan Yang, Cheng Qian, Zihe Wang, Yujia Qin, Yining Ye, Yaxi Lu, Chen Qian, Zhong Zhang, Yukun Yan, Yankai Lin, Zhiyuan Liu, Maosong Sun:
Enhancing Open-Domain Task-Solving Capability of LLMs via Autonomous Tool Integration from GitHub. 17257-17277 - Zhuoyun Du, Lujie Zheng, Renjun Hu, Yuyang Xu, Xiawei Li, Ying Sun, Wei Chen, Jian Wu, Haolei Cai, Haochao Ying:
LLMs Can Simulate Standardized Patients via Agent Coevolution. 17278-17306 - Christopher Bagdon, Aidan Combs, Carina Silberer, Roman Klinger:
Donate or Create? Comparing Data Collection Strategies for Emotion-labeled Multimodal Social Media Posts. 17307-17330 - Johannes Schäfer, Aidan Combs, Christopher Bagdon, Jiahui Li, Nadine Probol, Lynn Greschner, Sean Papay, Yarik Menchaca Resendiz, Aswathy Velutharambath, Amelie Wührl, Sabine Weber, Roman Klinger:
Which Demographics do LLMs Default to During Annotation? 17331-17348 - Yutao Mou, Xiao Deng, Yuxiao Luo, Shikun Zhang, Wei Ye:
Can You Really Trust Code Copilot? Evaluating Large Language Models from a Code Security Perspective. 17349-17369 - Peiwen Yuan, Chuyi Tan, Shaoxiong Feng, Yiwei Li, Xinglin Wang, Yueqi Zhang, Jiayi Shi, Boyuan Pan, Yao Hu, Kan Li:
From Sub-Ability Diagnosis to Human-Aligned Generation: Bridging the Gap for Text Length Control via MarkerGen. 17370-17390 - Shilong Pan, Zhiliang Tian, Zhen Huang, Wanlong Yu, Zhihua Wen, Xinwang Liu, Kai Lu, Minlie Huang, Dongsheng Li:
AGD: Adversarial Game Defense Against Jailbreak Attacks in Large Language Models. 17391-17406 - Yongjie Xiao, Hongru Liang, Peixin Qin, Yao Zhang, Wenqiang Lei:
SCOP: Evaluating the Comprehension Process of Large Language Models from a Cognitive View. 17407-17431 - Peiying Yu, Guoxin Chen, Jingjing Wang:
Table-Critic: A Multi-Agent Framework for Collaborative Criticism and Refinement in Table Reasoning. 17432-17451 - Laurie Burchell, Ona De Gibert Bonet, Nikolay Arefyev, Mikko Aulamo, Marta Bañón, Pinzhen Chen, Mariia Fedorova, Liane Guillou, Barry Haddow, Jan Hajic, Jindrich Helcl, Erik Henriksson, Mateusz Klimaszewski, Ville Komulainen, Andrey Kutuzov, Joona Kytöniemi, Veronika Laippala, Petter Mæhlum, Bhavitvya Malik, Farrokh Mehryary, Vladislav Mikhailov, Nikita Moghe, Amanda Myntti, Dayyán O'Brien, Stephan Oepen, Proyag Pal, Jousia Piha, Sampo Pyysalo, Gema Ramírez-Sánchez, David Samuel, Pavel Stepachev, Jörg Tiedemann, Dusan Varis, Tereza Vojtechová, Jaume Zaragoza-Bernabeu:
An Expanded Massive Multilingual Dataset for High-Performance Language Technologies (HPLT). 17452-17485 - Yue Yang, Ajay Patel, Matt Deitke, Tanmay Gupta, Luca Weihs, Andrew Head, Mark Yatskar, Chris Callison-Burch, Ranjay Krishna, Aniruddha Kembhavi, Christopher Clark:
Scaling Text-Rich Image Understanding via Code-Guided Synthetic Multimodal Data Generation. 17486-17505 - Jianlong Chen, Chao Li, Yang Yuan, Andrew C. Yao:
Hierarchical Attention Generates Better Proofs. 17506-17520 - Tianyi Men, Zhuoran Jin, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao:
Agent-RewardBench: Towards a Unified Benchmark for Reward Modeling across Perception, Planning, and Safety in Real-World Multimodal Agents. 17521-17541 - Jingjie Zeng, Huayang Li, Liang Yang, Yuanyuan Sun, Hongfei Lin:
It's Not Bragging If You Can Back It Up: Can LLMs Understand Braggings? 17542-17560 - Tianyi Men, Pengfei Cao, Zhuoran Jin, Yubo Chen, Kang Liu, Jun Zhao:
A Troublemaker with Contagious Jailbreak Makes Chaos in Honest Towns. 17561-17587 - Michael Eric Goodale, Salvador Mascarenhas, Yair Lakretz:
Meta-Learning Neural Mechanisms rather than Bayesian Priors. 17588-17605 - Dahyun Lee, Yongrae Jo, Haeju Park, Moontae Lee:
Shifting from Ranking to Set Selection for Retrieval Augmented Generation. 17606-17619 - Jiaxu Zhao, Meng Fang, Fanghua Ye, Ke Xu, Qin Zhang, Joey Tianyi Zhou, Mykola Pechenizkiy:
Understanding Large Language Model Vulnerabilities to Social Bias Attacks. 17620-17636 - Zhigen Li, Jianxiang Peng, Yanmeng Wang, Yong Cao, Tianhao Shen, Minghui Zhang, Linxi Su, Shang Wu, Yihang Wu, Yuqian Wang, Ye Wang, Wei Hu, Jianfeng Li, Shaojun Wang, Jing Xiao, Deyi Xiong:
ChatSOP: An SOP-Guided MCTS Planning Framework for Controllable LLM Dialogue Agents. 17637-17659 - Dexian Cai, Xiaocui Yang, Yongkang Liu, Daling Wang, Shi Feng, Yifei Zhang, Soujanya Poria:
Pixel-Level Reasoning Segmentation via Multi-turn Conversations. 17660-17679 - Rong Bao, Donglei Yu, Kai Fan, Minpeng Liao:
Fixing Distribution Shifts of LLM Self-Critique via On-Policy Self-Play Training. 17680-17700 - Amit Elhelo, Mor Geva:
Inferring Functionality of Attention Heads from their Parameters. 17701-17733 - Xin Quan, Marco Valentino, Louise A. Dennis, André Freitas:
Faithful and Robust LLM-Driven Theorem Proving for NLI Explanations. 17734-17755 - Jiakuan Xie, Pengfei Cao, Yubo Chen, Kang Liu, Jun Zhao:
Revealing the Deceptiveness of Knowledge Editing: A Mechanistic Analysis of Superficial Editing. 17756-17780 - Wenyu Huang, Pavlos Vougiouklis, Mirella Lapata, Jeff Z. Pan:
Masking in Multi-hop QA: An Analysis of How Language Models Perform with Context Permutation. 17781-17795 - Luca Dini, Lucia Domenichelli, Dominique Brunato, Felice Dell'Orletta:
From Human Reading to NLM Understanding: Evaluating the Role of Eye-Tracking Data in Encoder-Based Models. 17796-17813 - Linhao Ye, Lang Yu, Zhikai Lei, Qin Chen, Jie Zhou, Liang He:
Optimizing Question Semantic Space for Dynamic Retrieval-Augmented Multi-hop Question Answering. 17814-17824 - Xiaoyuan Liu, Wenxuan Wang, Youliang Yuan, Jen-tse Huang, Qiuzhi Liu, Pinjia He, Zhaopeng Tu:
Insight Over Sight: Exploring the Vision-Knowledge Conflicts in Multimodal LLMs. 17825-17846 - Xiao Xia, Dan Zhang, Zibo Liao, Zhenyu Hou, Tianrui Sun, Jing Li, Ling Fu, Yuxiao Dong:
SceneGenAgent: Precise Industrial Scene Generation with Coding Agent. 17847-17875 - Hanxing Ding, Shuchang Tao, Liang Pang, Zihao Wei, Jinyang Gao, Bolin Ding, Huawei Shen, Xueqi Cheng:
ToolCoder: A Systematic Code-Empowered Tool Learning Framework for Large Language Models. 17876-17891 - Bashar Alhafni, Nizar Habash:
Enhancing Text Editing for Grammatical Error Correction: Arabic as a Case Study. 17892-17914 - Frederic Blum, Steffen Herbold, Johann-Mattis List:
From Isolates to Families: Using Neural Networks for Automated Language Affiliation. 17915-17927 - Xuxu Liu, Siyuan Liang, Mengya Han, Yong Luo, Aishan Liu, Xiantao Cai, Zheng He, Dacheng Tao:
ELBA-Bench: An Efficient Learning Backdoor Attacks Benchmark for Large Language Models. 17928-17947 - Xue Zhang, Yunlong Liang, Fandong Meng, Songming Zhang, Yufeng Chen, Jinan Xu, Jie Zhou:
Less, but Better: Efficient Multilingual Expansion for LLMs via Layer-wise Mixture-of-Experts. 17948-17963 - Daniela Occhipinti, Marco Guerini, Malvina Nissim:
When Harry Meets Superman: The Role of The Interlocutor in Persona-Based Dialogue Generation. 17964-17985 - Zhenliang Zhang, Xinyu Hu, Huixuan Zhang, Junzhe Zhang, Xiaojun Wan:
ICR Probe: Tracking Hidden State Dynamics for Reliable Hallucination Detection in LLMs. 17986-18002 - Xiancai Chen, Zhengwei Tao, Kechi Zhang, Changzhi Zhou, Xinyu Zhang, Wanli Gu, Yuanpeng He, Mengdi Zhang, Xunliang Cai, Haiyan Zhao, Zhi Jin:
Revisit Self-Debugging with Self-Generated Tests for Code Generation. 18003-18023 - Dingdong Wang, Jin Xu, Ruihang Chu, Zhifang Guo, Xiong Wang, Jincenzi Wu, Dongchao Yang, Shengpeng Ji, Junyang Lin:
InSerter: Speech Instruction Following with Unsupervised Interleaved Pre-training. 18024-18046 - Candida Maria Greco, Lucio La Cava, Lorenzo Zangari, Andrea Tagarelli:
Exploring LLMs' Ability to Spontaneously and Conditionally Modify Moral Expressions through Text Manipulation. 18047-18070 - Po-Kai Chen, Bo-Wei Tsai, Shao-Kuan Wei, Chien-Yao Wang, Jia-Ching Wang, Yi-Ting Huang:
Mixture of Ordered Scoring Experts for Cross-prompt Essay Trait Scoring. 18071-18084 - Anshumann, Mohd Abbas Zaidi, Akhil Kedia, Jinwoo Ahn, Taehwak Kwon, Kangwook Lee, Haejun Lee, Joohyung Lee:
Sparse Logit Sampling: Accelerating Knowledge Distillation in LLMs. 18085-18108 - Varsha Suresh, Muhammad Hamza Mughal, Christian Theobalt, Vera Demberg:
Enhancing Spoken Discourse Modeling in Language Models Using Gestural Cues. 18109-18123 - Yunkun Wang, Yue Zhang, Zhen Qin, Chen Zhi, Binhua Li, Fei Huang, Yongbin Li, Shuiguang Deng:
ExploraCoder: Advancing Code Generation for Multiple Unseen APIs via Planning and Chained Exploration. 18124-18145 - Zihong Zhang, Liqi He, Zuchao Li, Lefei Zhang, Hai Zhao, Bo Du:
Segment First or Comprehend First? Explore the Limit of Unsupervised Word Segmentation with Large Language Models. 18146-18163 - Wenzhuo Zhao, Shuangyin Li:
RUBY: An Effective Framework for Multi-Constraint Multi-Hop Question Generation. 18164-18188 - Yulin Chen, Haoran Li, Yuan Sui, Yufei He, Yue Liu, Yangqiu Song, Bryan Hooi:
Can Indirect Prompt Injection Attacks Be Detected and Removed? 18189-18206 - Rob van der Goot:
Identifying Open Challenges in Language Identification. 18207-18227 - Chen Amiraz, Florin Cuconasu, Simone Filice, Zohar S. Karnin:
The Distracting Effect: Understanding Irrelevant Passages in RAG. 18228-18258 - Zeli Su, Ziyin Zhang, Guixian Xu, Jianing Liu, Xu Han, Ting Zhang, Yushuang Dong:
Multilingual Encoder Knows more than You Realize: Shared Weights Pretraining for Extremely Low-Resource Languages. 18259-18270 - Célia Nouri, Chloé Clavel, Jean-Philippe Cointet:
Graphically Speaking: Unmasking Abuse in Social Media with Conversation Insights. 18271-18286 - Yifei Lu, Fanghua Ye, Jian Li, Qiang Gao, Cheng Liu, Haibo Luo, Nan Du, Xiaolong Li, Feiliang Ren:
CodeTool: Enhancing Programmatic Tool Invocation of LLMs via Process Supervision. 18287-18304 - Hieu Tran, Zonghai Yao, Zhichao Yang, Junda Wang, Yifan Zhang, Shuo Han, Feiyun Ouyang, Hong Yu:
RARE: Retrieval-Augmented Reasoning Enhancement for Large Language Models. 18305-18330 - Yulin Chen, Haoran Li, Zihao Zheng, Dekai Wu, Yangqiu Song, Bryan Hooi:
Defense Against Prompt Injection Attack by Leveraging Attack Techniques. 18331-18347 - Ziyu Shang, Jianghan Liu, Zhizhao Luo, Peng Wang, Wenjun Ke, Jiajun Liu, Zijie Xu, Guozheng Li:
Acquisition and Application of Novel Knowledge in Large Language Models. 18348-18368 - Xianrui Zheng, Chao Zhang, Philip C. Woodland:
DNCASR: End-to-End Training for Speaker-Attributed ASR. 18369-18383 - Yonghyun Jun, Hwanhee Lee:
Exploring Persona Sentiment Sensitivity in Personalized Dialogue Generation. 18384-18402 - Xiaobao Wu, Liangming Pan, Yuxi Xie, Ruiwen Zhou, Shuai Zhao, Yubo Ma, Mingzhe Du, Rui Mao, Anh Tuan Luu, William Yang Wang:
AntiLeakBench: Preventing Data Contamination by Automatically Constructing Benchmarks with Updated Real-World Knowledge. 18403-18419 - Jianghan Liu, Ziyu Shang, Wenjun Ke, Peng Wang, Zhizhao Luo, Jiajun Liu, Guozheng Li, Yining Li:
LLM-Guided Semantic-Aware Clustering for Topic Modeling. 18420-18435 - Ana Ezquerro, David Vilares, Anssi Yli-Jyrä, Carlos Gómez-Rodríguez:
Hierarchical Bracketing Encodings for Dependency Parsing as Tagging. 18436-18450 - Zuchen Gao, Zizheng Zhan, Xianming Li, Erxin Yu, Haotian Zhang, Chenbin Chenbin, Yuqun Zhang, Jing Li:
OASIS: Order-Augmented Strategy for Improved Code Search. 18451-18467 - Yancheng He, Shilong Li, Jiaheng Liu, Weixun Wang, Xingyuan Bu, Ge Zhang, Z. Y. Peng, Zhaoxiang Zhang, Zhicheng Zheng, Wenbo Su, Bo Zheng:
Can Large Language Models Detect Errors in Long Chain-of-Thought Reasoning? 18468-18489 - Xiangyu Zhao, Shengyuan Ding, Zicheng Zhang, Haian Huang, Maosong Cao, Jiaqi Wang, Weiyun Wang, Xinyu Fang, Wenhai Wang, Guangtao Zhai, Hua Yang, Haodong Duan, Kai Chen:
OmniAlign-V: Towards Enhanced Alignment of MLLMs with Human Preference. 18490-18515 - Songjie Niu, Kaisen Yang, Rui Zhao, Yichao Liu, Zonglin Li, Hongning Wang, Wenguang Chen:
Tree-KG: An Expandable Knowledge Graph Construction Framework for Knowledge-intensive Domains. 18516-18529 - Yuming Yang, Yang Nan, Junjie Ye, Shihan Dou, Xiao Wang, Shuo Li, Huijie Lv, Tao Gui, Qi Zhang, Xuanjing Huang:
Measuring Data Diversity for Instruction Tuning: A Systematic Analysis and A Reliable Metric. 18530-18549 - Nan Huo, Jinyang Li, Bowen Qin, Ge Qu, Xiaolong Li, Xiaodong Li, Chenhao Ma, Reynold Cheng:
Micro-Act: Mitigate Knowledge Conflict in Question Answering via Actionable Self-Reasoning. 18550-18574 - Igor Sterner, Simone Teufel:
Minimal Pair-Based Evaluation of Code-Switching. 18575-18598 - Chuanqi Cheng, Hongda Sun, Bo Du, Shuo Shang, Xinrong Hu, Rui Yan:
DNASpeech: A Contextualized and Situated Text-to-Speech Dataset with Dialogues, Narratives and Actions. 18599-18616 - Qingkai Fang, Yan Zhou, Shoutao Guo, Shaolei Zhang, Yang Feng:
LLaMA-Omni 2: LLM-based Real-time Spoken Chatbot with Autoregressive Streaming Speech Synthesis. 18617-18629 - Qianlong Wang, Keyang Ding, Hengxin Gao, Hui Wang, Ruifeng Xu:
Error Comparison Optimization for Large Language Models on Aspect-Based Sentiment Analysis. 18630-18646 - Elisa Bassignana, Amanda Cercas Curry, Dirk Hovy:
The AI Gap: How Socioeconomic Status Affects Language Technology Interactions. 18647-18664 - Florian Eichin, Yang Janet Liu, Barbara Plank, Michael A. Hedderich:
Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set. 18665-18684 - Samuel Cahyawijaya, Holy Lovenia, Joel Ruben Antony Moniz, Tack Hwa Wong, Mohammad Rifqi Farhansyah, Thant Thiri Maung, Frederikus Hudi, David Anugraha, Muhammad Ravi Shulthan Habibi, Muhammad Reza Qorib, Amit Agarwal, Joseph Marvin Imperial, Hitesh Laxmichand Patel, Vicky Feliren, Bahrul Ilmi Nasution, Manuel Antonio Rufino, Genta Indra Winata, Rian Adam Rajagede, Carlos Rafael Catalan, Mohamed Fazli Mohamed Imam, Priyaranjan Pattnayak, Salsabila Zahirah Pranida, Kevin Pratama, Yeshil Bangera, Adisai Na-Thalang, Patricia Nicole Monderin, Yueqi Song, Christian Simon, Lynnette Hui Xian Ng, Richardy Lobo' Sapan, Taki Hasan Rafi, Bin Wang, Supryadi, Kanyakorn Veerakanjana, Piyalitt Ittichaiwong, Matthew Theodore Roque, Karissa Vincentio, Takdanai Kreangphet, Phakphum Artkaew, Kadek Hendrawan Palgunadi, Yanzhi Yu, Rochana Prih Hastuti, William Nixon, Mithil Bangera, Adrian Xuan Wei Lim, Aye Hninn Khine, Hanif Muhammad Zhafran, Teddy Ferdinan, Audra Aurora Izzani, Ayushman Singh, Evan, Jauza Akbar Krito, Michael Anugraha, Fenal Ashokbhai Ilasariya, Haochen Li, John Amadeo Daniswara, Filbert Aurelian Tjiaranata, Eryawan Presma Yulianrifat, Can Udomcharoenchaikit, Fadil Risdian Ansori, Mahardika Krisna Ihsani, Giang Nguyen, Anab Maulana Barik, Dan John Velasco, Rifo Ahmad Genadi, Saptarshi Saha, Chengwei Wei, Isaiah Edri W. Flores, Kenneth Ko Han Chen, Anjela Gail Santos, Wan Shen Lim, Kaung Si Phyo, Tim Santos, Meisyarah Dwiastuti, Jiayun Luo, Jan Christian Blaise Cruz, Ming Shan Hee, Ikhlasul Akmal Hanif, M. Alif Al Hakim, Muhammad Rizky Sya'ban, Kun Kerdthaisong, Lester James Validad Miranda, Fajri Koto, Tirana Noor Fatyanosa, Alham Fikri Aji, Jostin Jerico Rosal, Jun Kevin, Robert Wijaya, Onno P. Kampman, Ruochen Zhang, Börje F. Karlsson, Peerat Limkonchotiwat:
Crowdsource, Crawl, or Generate? Creating SEA-VL, a Multicultural Vision-Language Dataset for Southeast Asia. 18685-18717 - Yuhao Zhang, Zhiheng Liu, Fan Bu, Ruiyu Zhang, Benyou Wang, Haizhou Li:
Soundwave: Less is More for Speech-Text Alignment in LLMs. 18718-18738 - Soyoung Yoon, Dongha Ahn, Youngwon Lee, Minkyu Jung, HyungJoo Jang, Seung-won Hwang:
RoToR: Towards More Reliable Responses for Order-Invariant Inputs. 18739-18760 - Shivalika Singh, Angelika Romanou, Clémentine Fourrier, David Ifeoluwa Adelani, Jian Gang Ngui, Daniel Vila-Suero, Peerat Limkonchotiwat, Kelly Marchisio, Wei Qi Leong, Yosephine Susanto, Raymond Ng, Shayne Longpre, Sebastian Ruder, Wei-Yin Ko, Antoine Bosselut, Alice Oh, André F. T. Martins, Leshem Choshen, Daphne Ippolito, Enzo Ferrante, Marzieh Fadaee, Beyza Ermis, Sara Hooker:
Global MMLU: Understanding and Addressing Cultural and Linguistic Biases in Multilingual Evaluation. 18761-18799 - Yaxin Fan, Peifeng Li, Qiaoming Zhu:
Improving Dialogue Discourse Parsing through Discourse-aware Utterance Clarification. 18800-18816 - Yan Yang, Yixia Li, Hongru Wang, Xuetao Wei, James Jianqiao Yu, Yun Chen, Guanhua Chen:
ImPart: Importance-Aware Delta-Sparsification for Improved Model Compression and Merging in LLMs. 18817-18829 - Saif M. Mohammad:
Words of Warmth: Trust and Sociability Norms for over 26k English Words. 18830-18850 - Lindia Tjuatja, Graham Neubig:
BehaviorBox: Automated Discovery of Fine-Grained Performance Differences Between Language Models. 18851-18873 - Shujun Liu, Xiaoyu Shen, Yuhang Lai, Siyuan Wang, Shengbin Yue, Zengfeng Huang, Xuanjing Huang, Zhongyu Wei:
HAF-RM: A Hybrid Alignment Framework for Reward Model Training. 18874-18893 - Tadesse Destaw Belay, Ahmed Haj Ahmed, Alvin Grissom II, Iqra Ameer, Grigori Sidorov, Olga Kolesnikova, Seid Muhie Yimam:
CULEMO: Cultural Lenses on Emotion - Benchmarking LLMs for Cross-Cultural Emotion Understanding. 18894-18909 - Ruizhe Chen, Wenhao Chai, Zhifei Yang, Xiaotian Zhang, Ziyang Wang, Tony Q. S. Quek, Joey Tianyi Zhou, Soujanya Poria, Zuozhu Liu:
DiffPO: Diffusion-styled Preference Optimization for Inference Time Alignment of Large Language Models. 18910-18925 - Khoi P. N. Nguyen, Terrence Li, Derek Lou Zhou, Gabriel Xiong, Pranav Balu, Nandhan Alahari, Alan Huang, Tanush Chauhan, Harshavardhan Bala, Emre Guzelordu, Affan Kashfi, Aaron Xu, Suyesh Shrestha, Megan Kim Vu, Jerry Yining Wang, Vincent Ng:
MemeQA: Holistic Evaluation for Meme Understanding. 18926-18946 - Ruihan Yang, Caiqi Zhang, Zhisong Zhang, Xinting Huang, Sen Yang, Nigel Collier, Dong Yu, Deqing Yang:
LoGU: Long-form Generation with Uncertainty Expressions. 18947-18968 - Jinyuan Fang, Zaiqiao Meng, Craig MacDonald:
KiRAG: Knowledge-Driven Iterative Retriever for Enhancing Retrieval-Augmented Generation. 18969-18985 - Yibin Lei, Tao Shen, Yu Cao, Andrew Yates:
Enhancing Lexicon-Based Text Embeddings with Large Language Models. 18986-19001 - T. Y. S. S. Santosh, Youssef Tarek Elkhayat, Oana Ichim, Pranav Shetty, Dongsheng Wang, Zhiqiang Ma, Armineh Nourbakhsh, Xiaomo Liu:
CoCoLex: Confidence-guided Copy-based Decoding for Grounded Legal Text Generation. 19002-19018 - Itai Mondshine, Tzuf Paz-Argaman, Reut Tsarfaty:
Beyond N-Grams: Rethinking Evaluation Metrics and Strategies for Multilingual Abstractive Summarization. 19019-19035 - Yangfan Ye, Xiaocheng Feng, Zekun Yuan, Xiachong Feng, Libo Qin, Lei Huang, Weitao Ma, Yichong Huang, Zhirui Zhang, Yunfei Lu, Xiaohui Yan, Duyu Tang, Dandan Tu, Bing Qin:
CC-Tuning: A Cross-Lingual Connection Mechanism for Improving Joint Multilingual Supervised Fine-Tuning. 19036-19051 - Zhiyuan Wang, Qingni Wang, Yue Zhang, Tianlong Chen, Xiaofeng Zhu, Xiaoshuang Shi, Kaidi Xu:
SConU: Selective Conformal Uncertainty in Large Language Models. 19052-19075 - Junjie Zhou, Yongping Xiong, Zheng Liu, Ze Liu, Shitao Xiao, Yueze Wang, Bo Zhao, Chen Jason Zhang, Defu Lian:
MegaPairs: Massive Data Synthesis for Universal Multimodal Retrieval. 19076-19095 - Xinyue Shen, Yun Shen, Michael Backes, Yang Zhang:
When GPT Spills the Tea: Comprehensive Assessment of Knowledge File Leakage in GPTs. 19096-19111 - Yidi Jiang, Qian Chen, Shengpeng Ji, Yu Xi, Wen Wang, Chong Zhang, Xianghu Yue, Shiliang Zhang, Haizhou Li:
UniCodec: Unified Audio Codec with Single Domain-Adaptive Codebook. 19112-19124 - Fnu Mohbat, Mohammed J. Zaki:
KERL: Knowledge-Enhanced Personalized Recipe Recommendation using Large Language Models. 19125-19141 - Ayomide Odumakinde, Daniel D'souza, Pat Verga, Beyza Ermis, Sara Hooker:
Multilingual Arbitration: Optimizing Data Pools to Accelerate Multilingual Progress. 19142-19164 - Yuheng Lu, Bingshuo Qian, Caixia Yuan, Huixing Jiang, Xiaojie Wang:
Controlled Low-Rank Adaptation with Subspace Regularization for Continued Training on Large Language Models. 19165-19181 - Yancheng He, Shilong Li, Jiaheng Liu, Yingshui Tan, Weixun Wang, Hui Huang, Xingyuan Bu, Hangyu Guo, Chengwei Hu, Boren Zheng, Zhuoran Lin, Dekai Sun, Zhicheng Zheng, Wenbo Su, Bo Zheng:
Chinese SimpleQA: A Chinese Factuality Evaluation for Large Language Models. 19182-19208 - Junseo Kim, Jongwook Han, Dongmin Choi, Jongwook Yoon, Eun-Ju Lee, Yohan Jo:
PVP: An Image Dataset for Personalized Visual Persuasion with Persuasion Strategies, Viewer Characteristics, and Persuasiveness Ratings. 19209-19237 - Zheng Liu, Ze Liu, Zhengyang Liang, Junjie Zhou, Shitao Xiao, Chao Gao, Chen Jason Zhang, Defu Lian:
Any Information Is Just Worth One Single Screenshot: Unifying Search With Visualized Information Retrieval. 19238-19261 - Mingze Wang, Chongming Gao, Wenjie Wang, Yangyang Li, Fuli Feng:
Tunable LLM-based Proactive Recommendation Agent. 19262-19276 - Yu Xia, Jingru Fan, Weize Chen, Siyu Yan, Xin Cong, Zhong Zhang, Yaxi Lu, Yankai Lin, Zhiyuan Liu, Maosong Sun:
AgentRM: Enhancing Agent Generalization with Reward Modeling. 19277-19290 - Bin Xie, Bingbing Xu, Yige Yuan, Shengmao Zhu, Huawei Shen:
From Outcomes to Processes: Guiding PRM Learning from ORM for Inference-Time Alignment. 19291-19307 - Shahar Katz, Liran Ringel, Yaniv Romano, Lior Wolf:
Segment-Based Attention Masking for GPTs. 19308-19322 - Yuri Kuratov, Mikhail Arkhipov, Aydar Bulatov, Mikhail Burtsev:
Cramming 1568 Tokens into a Single Vector and Back Again: Exploring the Limits of Embedding Space Capacity. 19323-19339 - Xinyu Zhang, Linmei Hu, Luhao Zhang, Wentao Cheng, Yashen Wang, Ge Shi, Chong Feng, Liqiang Nie:
Bi-Tuning with Collaborative Information for Controllable LLM-based Sequential Recommendation. 19340-19351 - Jean-Philippe Corbeil, Amin Dada, Jean-Michel Attendu, Asma Ben Abacha, Alessandro Sordoni, Lucas Caccia, François Beaulieu, Thomas Lin, Jens Kleesiek, Paul Vozila:
A Modular Approach for Clinical SLMs Driven by Synthetic Data with Pre-Instruction Tuning, Model Merging, and Clinical-Tasks Alignment. 19352-19374 - Yuchen Feng, Bowen Shen, Naibin Gu, Jiaxuan Zhao, Peng Fu, Zheng Lin, Weiping Wang:
DIVE into MoE: Diversity-Enhanced Reconstruction of Large Language Models from Dense into Mixture-of-Experts. 19375-19394 - Yi Zhao, Zuchao Li, Hai Zhao, Baoyuan Qi, Guoming Liu:
DAC: A Dynamic Attention-aware Approach for Task-Agnostic Prompt Compression. 19395-19407 - Chi Han, Heng Ji:
Computation Mechanism Behind LLM Position Generalization. 19408-19424 - Shivank Garg, Ayush Singh, Shweta Singh, Paras Chopra:
IPO: Your Language Model is Secretly a Preference Classifier. 19425-19441 - Jiahao Yuan, Dehui Du, Hao Zhang, Zixiang Di, Usman Naseem:
Reversal of Thought: Enhancing Large Language Models with Preference-Guided Reverse Reasoning Warm-up. 19442-19459 - Yoav Meiri, Omer Shubi, Cfir Avraham Hadar, Ariel Kreisberg Nitzav, Yevgeni Berzak:
Déjà Vu? Decoding Repeated Reading from Eye Movements. 19460-19482 - Yerin Hwang, Yongil Kim, Jahyun Koo, Taegwan Kang, Hyunkyung Bae, Kyomin Jung:
LLMs can be easily Confused by Instructional Distractions. 19483-19496 - Hui Wei, Zihao Zhang, Shenghua He, Tian Xia, Shijia Pan, Fei Liu:
PlanGenLLMs: A Modern Survey of LLM Planning Capabilities. 19497-19521 - Yi Zhao, Zuchao Li, Hai Zhao:
IAM: Efficient Inference through Attention Mapping between Different-scale LLMs. 19522-19533 - Geliang Ouyang, Jingyao Chen, Zhihe Nie, Yi Gui, Yao Wan, Hongyu Zhang, Dongping Chen:
nvAgent: Automated Data Visualization from Natural Language via Collaborative Agent Workflow. 19534-19567 - Jian Zhu, Farhan Samir, Eleanor Chodroff, David R. Mortensen:
ZIPA: A family of efficient models for multilingual phone recognition. 19568-19585 - Yoo Yeon Sung, Eve Fleisig, Yu Hou, Ishan Upadhyay, Jordan Lee Boyd-Graber:
GRACE: A Granular Benchmark for Evaluating Model Calibration against Human Calibration. 19586-19587 - Lanxue Zhang, Yanan Cao, Yuqiang Xie, Fang Fang, Yangxi Li:
Dynamic Evaluation with Cognitive Reasoning for Multi-turn Safety of Large Language Models. 19588-19608 - Nathanaël Carraz Rakotonirina, Mohammed Hamdy, Jon Ander Campos, Lucas Weber, Alberto Testoni, Marzieh Fadaee, Sandro Pezzelle, Marco Del Tredici:
From Tools to Teammates: Evaluating LLMs in Multi-Session Coding Interactions. 19609-19642 - Junxiao Yang, Zhexin Zhang, Shiyao Cui, Hongning Wang, Minlie Huang:
Guiding not Forcing: Enhancing the Transferability of Jailbreaking Attacks on LLMs via Removing Superfluous Constraints. 19643-19655 - Felix Friedrich, Katharina Hämmerl, Patrick Schramowski, Manuel Brack, Jindrich Libovický, Alexander Fraser, Kristian Kersting:
Multilingual Text-to-Image Generation Magnifies Gender Stereotypes. 19656-19679 - Jun Sun, Xinxin Zhang, Simin Hong, Jian Zhu, Lingfang Zeng:
Adversarial Alignment with Anchor Dragging Drift (A³D²): Multimodal Domain Adaptation with Partially Shifted Modalities. 19680-19690 - Lovisa Hagström, Sara Vera Marjanovic, Haeun Yu, Arnav Arora, Christina Lioma, Maria Maistro, Pepa Atanasova, Isabelle Augenstein:
A Reality Check on Context Utilisation for Retrieval-Augmented Generation. 19691-19730 - Debela Gemechu, Chris Reed:
CU-MAM: Coherence-Driven Unified Macro-Structures for Argument Mining. 19731-19749 - Hongyu Chen, Seraphina Goldfarb-Tarrant:
Safer or Luckier? LLMs as Safety Evaluators Are Not Robust to Artifacts. 19750-19766 - Dongge Xue, Zhili Pu, Zhentao Xia, Hongli Sun, Ruihui Hou, Guangya Yu, Yupian Lin, Yongqi Fan, Jingping Liu, Tong Ruan:
Text-to-ES Bench: A Comprehensive Benchmark for Converting Natural Language to Elasticsearch Query. 19767-19790 - Songming Zhang, Xue Zhang, Tong Zhang, Bojie Hu, Yufeng Chen, Jinan Xu:
AlignDistil: Token-Level Language Model Alignment as Adaptive Policy Distillation. 19791-19807 - Vaibhav Aggarwal, Ojasv Kamal, Abhinav Japesh, Zhijing Jin, Bernhard Schölkopf:
DARS: Dynamic Action Re-Sampling to Enhance Coding Agent Performance by Adaptive Tree Traversal. 19808-19855 - Patrick Queiroz Da Silva, Hari Sethuraman, Dheeraj Rajagopal, Hannaneh Hajishirzi, Sachin Kumar:
Steering off Course: Reliability Challenges in Steering Language Models. 19856-19882 - Dou Hu, Lingwei Wei, Wei Zhou, Songlin Hu:
Impartial Multi-task Representation Learning via Variance-invariant Probabilistic Decoding. 19883-19897 - Adrian de Wynter:
If Eleanor Rigby Had Met ChatGPT: A Study on Loneliness in a Post-LLM World. 19898-19913 - Luyao Cheng, Hui Wang, Chong Deng, Siqi Zheng, Yafeng Chen, Rongjie Huang, Qinglin Zhang, Qian Chen, Xihao Li, Wen Wang:
Integrating Audio, Visual, and Semantic Information for Enhanced Multimodal Speaker Diarization on Multi-party Conversation. 19914-19928 - Zhecheng Li, Yiwei Wang, Bryan Hooi, Yujun Cai, Zhen Xiong, Nanyun Peng, Kai-Wei Chang:
Vulnerability of LLMs to Vertically Aligned Text Manipulations. 19929-19941 - Ernie Chang, Yang Li, Patrick Huber, Vish Vogeti, David Kant, Yangyang Shi, Vikas Chandra:
AutoMixer: Checkpoint Artifacts as Automatic Data Mixers. 19942-19953 - Behrooz Azarkhalili, Maxwell W. Libbrecht:
Generalized Attention Flow: Feature Attribution for Transformer Models via Maximum Flow. 19954-19974 - Zhanghao Hu, Hanqi Yan, Qinglin Zhu, Zhenyi Shen, Yulan He, Lin Gui:
Beyond Prompting: An Efficient Embedding Framework for Open-Domain Question Answering. 19975-19990 - Jianlyu Chen, Nan Wang, Chaofan Li, Bo Wang, Shitao Xiao, Han Xiao, Hao Liao, Defu Lian, Zheng Liu:
AIR-Bench: Automated Heterogeneous Information Retrieval Benchmark. 19991-20022 - Runqi Qiao, Qiuna Tan, Guanting Dong, Minhui Wu, Chong Sun, Xiaoshuai Song, Jiapeng Wang, Zhuoma Gongque, Shanglin Lei, Yifan Zhang, Zhe Wei, Miaoxuan Zhang, Runfeng Qiao, Xiao Zong, Yida Xu, Peiqing Yang, Zhimin Bao, Muxi Diao, Chen Li, Honggang Zhang:
We-Math: Does Your Large Multimodal Model Achieve Human-like Mathematical Reasoning? 20023-20070 - Filip Miletic, Sabine Schulte im Walde:
Modeling the Evolution of English Noun Compounds with Feature-Rich Diachronic Compositionality Prediction. 20071-20092 - Michael A. Hedderich, Anyi Wang, Raoyuan Zhao, Florian Eichin, Jonas Fischer, Barbara Plank:
What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns. 20093-20123 - Runqi Qiao, Qiuna Tan, Guanting Dong, Minhui Wu, Jiapeng Wang, Yifan Zhang, Zhuoma Gongque, Chong Sun, Yida Xu, Yadong Xue, Ye Tian, Zhimin Bao, Lan Yang, Chen Li, Honggang Zhang:
V-Oracle: Making Progressive Reasoning in Deciphering Oracle Bones for You and Me. 20124-20150 - Amir Hossein Yari, Fajri Koto:
Unveiling Cultural Blind Spots: Analyzing the Limitations of mLLMs in Procedural Text Comprehension. 20151-20170 - Ioannis Tsiamas, David Dale, Marta R. Costa-jussà:
Improving Language and Modality Transfer in Translation by Character-level Modeling. 20171-20187 - Niyati Bafna, Emily Chang, Nathaniel Romney Robinson, David R. Mortensen, Kenton Murray, David Yarowsky, Hale Sirin:
DialUp! Modeling the Language Continuum by Adapting Models to Dialects and Dialects to Models. 20188-20233 - Nicholas E. Corrado, Julian Katz-Samuels, Adithya M. Devraj, Hyokun Yun, Chao Zhang, Yi Xu, Yi Pan, Bing Yin, Trishul Chilimbi:
AutoMixAlign: Adaptive Data Mixing for Multi-Task Preference Optimization in LLMs. 20234-20258 - Naïm Es-Sebbani, Esteban Marquer, Zied Bouraoui:
Modeling Complex Semantics Relation with Contrastively Fine-Tuned Relational Encoders. 20259-20288 - Barry Menglong Yao, Qifan Wang, Lifu Huang:
Error-driven Data-efficient Large Multimodal Model Tuning. 20289-20306 - Hanwen Du, Bo Peng, Xia Ning:
Planning with Diffusion Models for Target-Oriented Dialogue Systems. 20307-20329 - Anthony Zhe Liu, Xinhe Wang, Jacob Sansom, Yao Fu, Jongwook Choi, Sungryull Sohn, Jaekyeom Kim, Honglak Lee:
Interactive and Expressive Code-Augmented Planning with Large Language Models. 20330-20354 - Yizhu Jiao, Xuchao Zhang, Zhaoyang Wang, Yubo Ma, Zhun Deng, Rujia Wang, Chetan Bansal, Saravan Rajmohan, Jiawei Han, Huaxiu Yao:
Synergistic Weak-Strong Collaboration by Aligning Preferences. 20355-20371 - Jeffrey Jian Ma, Hengzhi Pei, Leonard Lausen, George Karypis:
Understanding Silent Data Corruption in LLM Training. 20372-20394 - Guan-Ting Lin, Prashanth Gurunath Shivakumar, Aditya Gourav, Yile Gu, Ankur Gandhe, Hung-yi Lee, Ivan Bulyko:
Align-SLM: Textless Spoken Language Models with Reinforcement Learning from AI Feedback. 20395-20411 - Jungsoo Park, Junmo Kang, Gabriel Stanovsky, Alan Ritter:
Can LLMs Help Uncover Insights about LLMs? A Large-Scale, Evolving Literature Analysis of Frontier LLMs. 20412-20433 - Wenkai Li, Jiarui Liu, Andy Liu, Xuhui Zhou, Mona T. Diab, Maarten Sap:
BIG5-CHAT: Shaping LLM Personalities Through Training on Human-Grounded Data. 20434-20471 - Olga Loginova, Sofía Ortega Loguinova:
Deep Temporal Reasoning in Video Language Models: A Cross-Linguistic Evaluation of Action Duration and Completion through Perfect Times. 20472-20502 - Eddie L. Ungless, Sunipa Dev, Cynthia L. Bennett, Rebecca Gulotta, Jasmijn Bastings, Remi Denton:
Amplifying Trans and Nonbinary Voices: A Community-Centred Harm Taxonomy for LLMs. 20503-20535 - Yixiao Song, Parker Riley, Daniel Deutsch, Markus Freitag:
Enhancing Human Evaluation in Machine Translation with Comparative Judgement. 20536-20551 - Akash Ghosh, Aparna Garimella, Pritika Ramu, Sambaran Bandyopadhyay, Sriparna Saha:
Infogen: Generating Complex Statistical Infographics from Documents. 20552-20570 - Arne Rubehn, Johann-Mattis List:
Partial Colexifications Improve Concept Embeddings. 20571-20586 - Ruibo Chen, Yihan Wu, Junfeng Guo, Heng Huang:
Improved Unbiased Watermark for Large Language Models. 20587-20601 - Yixian Shen, Qi Bi, Jia-Hong Huang, Hongyi Zhu, Andy D. Pimentel, Anuj Pathania:
MaCP: Minimal yet Mighty Adaptation via Hierarchical Cosine Projection. 20602-20618 - Duy Nguyen, Archiki Prasad, Elias Stengel-Eskin, Mohit Bansal:
Multi-Attribute Steering of Language Models via Targeted Intervention. 20619-20634 - Gaurav Verma, Rachneet Kaur, Nishan Srishankar, Zhen Zeng, Tucker Balch, Manuela Veloso:
AdaptAgent: Adapting Multimodal Web Agents with Few-Shot Learning from Human Demonstrations. 20635-20651 - Zhijian Xu, Yilun Zhao, Manasi Patwardhan, Lovekesh Vig, Arman Cohan:
Can LLMs Identify Critical Limitations within Scientific Research? A Systematic Evaluation on AI Research Papers. 20652-20706 - Catherine Arnett, Tyler A. Chang, James A. Michaelov, Ben Bergen:
On the Acquisition of Shared Grammatical Representations in Bilingual Language Models. 20707-20726 - Divyansh Singhvi, Diganta Misra, Andrej Erkelens, Raghav Jain, Isabel Papadimitriou, Naomi Saphra:
Using Shapley interactions to understand how models use structure. 20727-20737 - Renato Lui Geh, Zilei Shao, Guy Van den Broeck:
Adversarial Tokenization. 20738-20765 - Anneliese Brei, Katharine Henry, Abhisheik Sharma, Shashank Srivastava, Snigdha Chaturvedi:
Classifying Unreliable Narrators with Large Language Models. 20766-20791 - Eylon Caplan, Dan Goldwasser:
ConceptCarve: Dynamic Realization of Evidence. 20792-20809 - An Quang Tang, Xiuzhen Zhang, Minh Ngoc Dinh, Zhuang Li:
QQSUM: A Novel Task and Model of Quantitative Query-Focused Summarization for Review-based Product Question Answering. 20810-20831 - Omar Shaikh, Hussein Mozannar, Gagan Bansal, Adam Fourney, Eric Horvitz:
Navigating Rifts in Human-LLM Grounding: Study and Benchmark. 20832-20847 - Vidya Srinivas, Xuhai Xu, Xin Liu, Kumar Ayush, Isaac R. Galatzer-Levy, Shwetak N. Patel, Daniel McDuff, Tim Althoff:
Substance over Style: Evaluating Proactive Conversational Coaching Agents. 20848-20880 - Xiaotian Liu, Ali Pesaranghader, Hanze Li, Punyaphat Sukcharoenchaikul, Jaehong Kim, Tanmana Sadhu, Hyejeong Jeon, Scott Sanner:
Open-World Planning via Lifted Regression with LLM-Inferred Affordances for Embodied Agents. 20881-20897 - Cesare Spinoso Di Piano, David Eric Austin, Pablo Piantanida, Jackie CK Cheung:
(RSA)²: A Rhetorical-Strategy-Aware Rational Speech Act Framework for Figurative Language Understanding. 20898-20938 - Hyeonjeong Ha, Xiaomeng Jin, Jeonghwan Kim, Jiateng Liu, Zhenhailong Wang, Khanh Duy Nguyen, Ansel Blume, Nanyun Peng, Kai-Wei Chang, Heng Ji:
SYNTHIA: Novel Concept Design with Affordance Composition. 20939-20958 - Yizhe Yang, Palakorn Achananuparp, Heyan Huang, Jing Jiang, Nicholas Gabriel Lim, Cameron Tan Shi Ern, Phey Ling Kit, Jenny Giam Xiuhui, John Pinto, Ee-Peng Lim:
Consistent Client Simulation for Motivational Interviewing-based Counseling. 20959-20998 - Naba Rizvi, Harper Strickland, Daniel Gitelman, Alexis Morales Flores, Tristan Cooper, Aekta Kallepalli, Akshat Alurkar, Haaset Owens, Saleha Ahmedi, Isha Khirwadkar, Imani N. S. Munyaka, Nedjma Ousidhoum:
AUTALIC: A Dataset for Anti-AUTistic Ableist Language In Context. 20999-21015 - Yunhui Jang, Jaehyung Kim, Sungsoo Ahn:
Structural Reasoning Improves Molecular Understanding of LLM. 21016-21036 - Yizhe Yang, Palakorn Achananuparp, Heyan Huang, Jing Jiang, Phey Ling Kit, Nicholas Gabriel Lim, Cameron Tan Shi Ern, Ee-Peng Lim:
CAMI: A Counselor Agent Supporting Motivational Interviewing through State Inference and Topic Exploration. 21037-21081 - Kuang Wang, Xianfei Li, Shenghao Yang, Li Zhou, Feng Jiang, Haizhou Li:
Know You First and Be You Better: Modeling Human-Like User Simulators via Implicit Profiles. 21082-21107 - Aomi Koyama, Masato Mita, Su-Youn Yoon, Yasufumi Takama, Mamoru Komachi:
Targeted Syntactic Evaluation for Grammatical Error Correction. 21108-21125 - Tingyu Song, Tongyan Hu, Guo Gan, Yilun Zhao:
VF-Eval: Evaluating Multimodal LLMs for Generating Feedback on AIGC Videos. 21126-21146 - Joseph Suh, Erfan Jahanparast, Suhong Moon, Minwoo Kang, Serina Chang:
Language Model Fine-Tuning on Scaled Survey Data for Predicting Distributions of Public Opinions. 21147-21170 - Jaesung Tae, Hamish Ivison, Sachin Kumar, Arman Cohan:
TESS 2: A Large-Scale Generalist Diffusion Language Model. 21171-21188 - Shinwoo Park, Shubin Kim, Do-Kyung Kim, Yo-Sub Han:
KatFishNet: Detecting LLM-Generated Korean Text through Linguistic Feature Analysis. 21189-21222 - Hanbing Liu, Haoyang Li, Xiaokang Zhang, Ruotong Chen, Haiyong Xu, Tian Tian, Qi Qi, Jing Zhang:
Uncovering the Impact of Chain-of-Thought Reasoning for Direct Preference Optimization: Lessons from Text-to-SQL. 21223-21261 - Minh Duc Bui, Kyung Eun Park, Goran Glavas, Fabian David Schmidt, Katharina von der Wense:
On Generalization across Measurement Systems: LLMs Entail More Test-Time Compute for Underrepresented Cultures. 21262-21276 - Aashish Anantha Ramakrishnan, Aadarsh Anantha Ramakrishnan, Dongwon Lee:
CORDIAL: Can Multimodal Large Language Models Effectively Understand Coherence Relationships? 21277-21297 - Yue Zhou, Barbara Di Eugenio:
Veracity Bias and Beyond: Uncovering LLMs' Hidden Beliefs in Problem-Solving Reasoning. 21298-21310 - Meng Li, Guangda Huzhang, Haibo Zhang, Xiting Wang, Anxiang Zeng:
Optimal Transport-Based Token Weighting Scheme for Enhanced Preference Optimization. 21311-21334 - Dongil Yang, Minjin Kim, Sunghwan Kim, Beong-woo Kwak, Minjun Park, Jinseok Hong, Woontack Woo, Jinyoung Yeo:
LLM Meets Scene Graph: Can Large Language Models Understand and Generate Scene Graphs? A Benchmark and Empirical Study. 21335-21360 - Haochun Wang, Sendong Zhao, Jingbo Wang, Zewen Qiang, Bing Qin, Ting Liu:
Beyond Frameworks: Unpacking Collaboration Strategies in Multi-Agent Systems. 21361-21375 - Xiaoyu Zhang, Juan Zhai, Shiqing Ma, Qingshuang Bao, Weipeng Jiang, Qian Wang, Chao Shen, Yang Liu:
The Invisible Hand: Unveiling Provider Bias in Large Language Models for Code Generation. 21376-21403 - Minkyeong Jeon, Hyemin Jeong, Yerang Kim, Jiyoung Kim, Jae Hyeon Cho, Byung-Jun Lee:
K/DA: Automated Data Generation Pipeline for Detoxifying Implicitly Offensive Language in Korean. 21404-21432 - Yunlong Liang, Fandong Meng, Jie Zhou:
THOR-MoE: Hierarchical Task-Guided and Context-Responsive Routing for Neural Machine Translation. 21433-21445 - Xin Zhao, Zehui Jiang, Naoki Yoshinaga:
Neuron Empirical Gradient: Discovering and Quantifying Neurons' Global Linear Controllability. 21446-21477 - Jiayi Li, Yingfan Zhou, Pranav Narayanan Venkit, Halima Binte Islam, Sneha Arya, Shomir Wilson, Sarah Rajtmajer:
Can Third Parties Read Our Emotions? 21478-21499 - Nghia-Huynh Nguyen-Hieu, Ngoc Son Nguyen, Huynh Nguyen Dang, Thieu Vo, Truong-Son Hy, Van Nguyen:
OZSpeech: One-step Zero-shot Speech Synthesis with Learned-Prior-Conditioned Flow Matching. 21500-21517 - Siyin Wang, Zhaoye Fei, Qinyuan Cheng, Shiduo Zhang, Panpan Cai, Jinlan Fu, Xipeng Qiu:
World Modeling Makes a Better Planner: Dual Preference Optimization for Embodied Task Planning. 21518-21537 - Junjie Chu, Yugeng Liu, Ziqing Yang, Xinyue Shen, Michael Backes, Yang Zhang:
JailbreakRadar: Comprehensive Assessment of Jailbreak Attacks Against LLMs. 21538-21566 - Xiaqiang Tang, Jian Li, Keyu Hu, Nan Du, Xiaolong Li, Xi Zhang, Weigao Sun, Sihong Xie:
CogniBench: A Legal-inspired Framework and Dataset for Assessing Cognitive Faithfulness of Large Language Models. 21567-21585 - Yuqiao Tan, Shizhu He, Kang Liu, Jun Zhao:
Neural Incompatibility: The Unbridgeable Gap of Cross-Scale Parametric Knowledge Transfer in Large Language Models. 21586-21601 - Zhenyu Wu, Qingkai Zeng, Zhihan Zhang, Zhaoxuan Tan, Chao Shen, Meng Jiang:
Enhancing Mathematical Reasoning in LLMs by Stepwise Correction. 21602-21623 - Huachuan Qiu, Zhenzhong Lan:
PsyDial: A Large-scale Long-term Conversational Dataset for Mental Health Support. 21624-21655 - Didi Zhang, Yaxin Fan, Peifeng Li, Qiaoming Zhu:
Enhancing Goal-oriented Proactive Dialogue Systems via Consistency Reflection and Correction. 21656-21672 - Qihang Fu, Yongbin Qin, Ruizhang Huang, Yanping Chen, Yulin Zhou, Lintao Long:
Exclusion of Thought: Mitigating Cognitive Load in Large Language Models for Enhanced Reasoning in Multiple-Choice Tasks. 21673-21686 - Zhi Qu, Yiran Wang, Jiannan Mao, Chenchen Ding, Hideki Tanaka, Masao Utiyama, Taro Watanabe:
Registering Source Tokens to Target Language Spaces in Multilingual Neural Machine Translation. 21687-21706 - Yikun Wang, Siyin Wang, Qinyuan Cheng, Zhaoye Fei, Liang Ding, Qipeng Guo, Dacheng Tao, Xipeng Qiu:
VisuoThink: Empowering LVLM Reasoning with Multimodal Tree Search. 21707-21719 - Jianxing Liao, Junyan Xu, Yatao Sun, Maowen Tang, Sicheng He, Jingxian Liao, Shui Yu, Yun Li, Xiaohong Guan:
Automated CAD Modeling Sequence Generation from Text Descriptions via Transformer-Based Large Language Models. 21720-21748 - Qianli Ma, Dongrui Liu, Qian Chen, Linfeng Zhang, Jing Shao:
LED-Merging: Mitigating Safety-Utility Conflicts in Model Merging with Location-Election-Disjoint. 21749-21767 - Jiakang Yuan, Xiangchao Yan, Bo Zhang, Tao Chen, Botian Shi, Wanli Ouyang, Yu Qiao, Lei Bai, Bowen Zhou:
Dolphin: Moving Towards Closed-loop Auto-research through Thinking, Practice, and Feedback. 21768-21789 - Yun Luo, Yingjie Li, Xiangkun Hu, Qinglin Qi, Fang Guo, Qipeng Guo, Zheng Zhang, Yue Zhang:
PerSphere: A Comprehensive Framework for Multi-Faceted Perspective Retrieval and Summarization. 21790-21805 - Fujie Zhang, Peiqi Yu, Biao Yi, Baolei Zhang, Tong Li, Zheli Liu:
Prompt-Guided Internal States for Hallucination Detection of Large Language Models. 21806-21818 - Ndapa Nakashole:
Typology-Guided Adaptation in Multilingual Models. 21819-21835 - Orfeas Menis-Mastromichalakis, Jason Liartis, Kristina Rose, Antoine Isaac, Giorgos Stamou:
Don't Erase, Inform! Detecting and Contextualizing Harmful Language in Cultural Heritage Collections. 21836-21850 - Shangjian Yin, Peijie Huang, Jiatian Chen, Haojing Huang, Yuhong Xu:
ECLM: Entity Level Language Model for Spoken Language Understanding with Chain of Intent. 21851-21862 - Qinggang Zhang, Zhishang Xiang, Yilin Xiao, Le Wang, Junhui Li, Xinrun Wang, Jinsong Su:
FaithfulRAG: Fact-Level Conflict Modeling for Context-Faithful Retrieval-Augmented Generation. 21863-21882 - Guanghui Ye, Huan Zhao, Zhixue Zhao, Xupeng Zha, Yang Liu, Zhihua Jiang:
Knowledge Image Matters: Improving Knowledge-Based Visual Reasoning with Multi-Image Large Language Models. 21883-21896 - Yupu Hao, Pengfei Cao, Zhuoran Jin, Huanxuan Liao, Yubo Chen, Kang Liu, Jun Zhao:
Evaluating Personalized Tool-Augmented LLMs from the Perspectives of Personalization and Proactivity. 21897-21935 - Wentong Chen, Junbo Cui, Jinyi Hu, Yujia Qin, Junjie Fang, Yue Zhao, Chongyi Wang, Jun Liu, Guirong Chen, Yupeng Huo, Yuan Yao, Yankai Lin, Zhiyuan Liu, Maosong Sun:
GUICourse: From General Vision Language Model to Versatile GUI Agent. 21936-21959 - ChaeHun Park, Yujin Baek, Jaeseok Kim, Yu-Jung Heo, Du-Seong Chang, Jaegul Choo:
Evaluating Visual and Cultural Interpretation: The K-Viscuit Benchmark with Human-VLM Collaboration. 21960-21974 - Wen-Shu Fan, Su Lu, Shangyu Xing, Xin-Chun Li, De-Chuan Zhan:
Maximizing the Effectiveness of Larger BERT Models for Compression. 21975-21990 - Thanh Le-Cong, Bach Le, Toby Murray:
Can LLMs Reason About Program Semantics? A Comprehensive Evaluation of LLMs on Formal Specification Inference. 21991-22014 - Zhixiong Su, Yichen Wang, Herun Wan, Zhaohan Zhang, Minnan Luo:
HACo-Det: A Study Towards Fine-Grained Machine-Generated Text Detection under Human-AI Coauthoring. 22015-22036 - Divya V. Sharma, Vijval Ekbote, Anubha Gupta:
IndicSynth: A Large-Scale Multilingual Synthetic Speech Dataset for Low-Resource Indian Languages. 22037-22060 - Chaofan Li, Jianlyu Chen, Yingxia Shao, Chaozhuo Li, Quanqing Xu, Defu Lian, Zheng Liu:
Reinforced IR: A Self-Boosting Framework For Domain-Adapted Information Retrieval. 22061-22073 - Xiangyang Li, Kuicai Dong, Yi Quan Lee, Wei Xia, Hao Zhang, Xinyi Dai, Yasheng Wang, Ruiming Tang:
CoIR: A Comprehensive Benchmark for Code Information Retrieval Models. 22074-22091 - Delong Zeng, Yuexiang Xie, Yaliang Li, Ying Shen:
Enhancing Multimodal Retrieval via Complementary Information Extraction and Alignment. 22092-22105 - Yurui Chang, Bochuan Cao, Yujia Wang, Jinghui Chen, Lu Lin:
JoPA: Explaining Large Language Model's Generation via Joint Prompt Attribution. 22106-22122 - Aoqiang Zhu, Min Hu, Xiaohua Wang, Jiaoyun Yang, Yiming Tang, Ning An:
Proxy-Driven Robust Multimodal Sentiment Analysis with Incomplete Data. 22123-22138 - Xinran Chen, Ben He, Xuanang Chen, Le Sun:
Not All Terms Matter: Recall-Oriented Adaptive Learning for PLM-aided Query Expansion in Open-Domain Question Answering. 22139-22151 - Jiang Li, Xiangdong Su, Zehua Duo, Tian Lan, Xiaotao Guo, Guanglai Gao:
A Mutual Information Perspective on Knowledge Graph Embedding. 22152-22166 - Lihao Sun, Chengzhi Mao, Valentin Hofmann, Xuechunzi Bai:
Aligned but Blind: Alignment Increases Implicit Bias by Reducing Awareness of Race. 22167-22184 - Xinghua Zhang, Haiyang Yu, Cheng Fu, Fei Huang, Yongbin Li:
IOPO: Empowering LLMs with Complex Instruction Following via Input-Output Preference Optimization. 22185-22200 - T. Y. S. S. Santosh, Mohamed Hesham Elganayni:
ProMALex: Progressive Modular Adapters for Multi-Jurisdictional Legal Language Modeling. 22201-22217 - Mingzhe Li, Jing Xiang, Qishen Zhang, Kaiyang Wan, Xiuying Chen:
Flipping Knowledge Distillation: Leveraging Small Models' Expertise to Enhance LLMs in Text Matching. 22218-22229 - Jiahao Ying, Wei Tang, Yiran Zhao, Yixin Cao, Yu Rong, Wenxuan Zhang:
Disentangling Language and Culture for Evaluating Multilingual Large Language Models. 22230-22251 - Luc Raszewski, Christine de Kock:
Detecting Sockpuppetry on Wikipedia Using Meta-Learning. 22252-22264 - Zaitian Wang, Jinghan Zhang, Xinhao Zhang, Kunpeng Liu, Pengfei Wang, Yuanchun Zhou:
Diversity-oriented Data Augmentation with Large Language Models. 22265-22283 - Jingqian Zhao, Bingbing Wang, Geng Tu, Yice Zhang, Qianlong Wang, Bin Liang, Jing Li, Ruifeng Xu:
CoreEval: Automatically Building Contamination-Resilient Datasets with Real-World Knowledge toward Reliable LLM Evaluation. 22284-22306 - Chenyi Zhou, Zhengyan Shi, Yuan Yao, Lei Liang, Huajun Chen, Qiang Zhang:
RiOT: Efficient Prompt Refinement with Residual Optimization Tree. 22307-22323 - Xinbei Ma, Yiting Wang, Yao Yao, Tongxin Yuan, Aston Zhang, Zhuosheng Zhang, Hai Zhao:
Caution for the Environment: Multimodal LLM Agents are Susceptible to Environmental Distractions. 22324-22339 - Rong-Cheng Tu, Zi-Ao Ma, Tian Lan, Yuehao Zhao, Heyan Huang, Xian-Ling Mao:
Automatic Evaluation for Text-to-image Generation: Task-decomposed Framework, Distilled Training, and Meta-evaluation Benchmark. 22340-22361 - Rongzhi Zhu, Xiangyu Liu, Zequn Sun, Yiwei Wang, Wei Hu:
Mitigating Lost-in-Retrieval Problems in Retrieval Augmented Multi-Hop Question Answering. 22362-22375 - Xinyi He, Yihao Liu, Mengyu Zhou, Yeye He, Haoyu Dong, Shi Han, Zejian Yuan, Dongmei Zhang:
TableLoRA: Low-rank Adaptation on Table Structure Understanding for Large Language Models. 22376-22391 - Maosong Cao, Taolin Zhang, Mo Li, Chuyu Zhang, Yunxin Liu, Conghui He, Haodong Duan, Songyang Zhang, Kai Chen:
Condor: Enhance LLM Alignment with Knowledge-Driven Data Synthesis and Refinement. 22392-22412 - Ruixiang Feng, Shen Gao, Xiuying Chen, Lisi Chen, Shuo Shang:
CulFiT: A Fine-grained Cultural-aware LLM Training Paradigm via Multilingual Critique Data Synthesis. 22413-22430 - Junzhuo Li, Bo Wang, Xiuze Zhou, Peijie Jiang, Jia Liu, Xuming Hu:
Decoding Knowledge Attribution in Mixture-of-Experts: A Framework of Basic-Refinement Collaboration and Efficiency Analysis. 22431-22446 - Manan Suri, Puneet Mathur, Nedim Lipka, Franck Dernoncourt, Ryan A. Rossi, Dinesh Manocha:
ChartLens: Fine-grained Visual Attribution in Charts. 22447-22462 - Yifei Yang, Zouying Cao, Xinbei Ma, Yao Yao, Zhi Chen, Libo Qin, Hai Zhao:
LESA: Learnable LLM Layer Scaling-Up. 22463-22476 - Haochen Xue, Feilong Tang, Ming Hu, Yexin Liu, Qidong Huang, Yulong Li, Chengzhi Liu, Zhongxing Xu, Chong Zhang, Chun-Mei Feng, Yutong Xie, Imran Razzak, Zongyuan Ge, Jionglong Su, Junjun He, Yu Qiao:
MMRC: A Large-Scale Benchmark for Understanding Multimodal Large Language Model in Real-World Conversation. 22477-22503 - Chen Zhang, Qiuchi Li, Dawei Song, Zheyu Ye, Yan Gao, Yao Hu:
Towards the Law of Capacity Gap in Distilling Language Models. 22504-22528 - Rajath Rao, Adithya V. Ganesan, Oscar N. E. Kjell, Jonah Luby, Akshay Raghavan, Scott M. Feltman, Whitney Ringwald, Ryan L. Boyd, Benjamin J. Luft, Camilo J. Ruggero, Neville Ryant, Roman Kotov, H. Andrew Schwartz:
WhiSPA: Semantically and Psychologically Aligned Whisper with Self-Supervised Contrastive and Student-Teacher Learning. 22529-22544 - Jianhao Yan, Futing Wang, Yun Luo, Yafu Li, Yue Zhang:
Keys to Robust Edits: From Theoretical Insights to Practical Advances. 22545-22560 - Xiang Zhuang, Bin Wu, Jiyu Cui, Kehua Feng, Xiaotong Li, Huabin Xing, Keyan Ding, Qiang Zhang, Huajun Chen:
Boosting LLM's Molecular Structure Elucidation with Knowledge Enhanced Tree Search Reasoning. 22561-22576 - María Andrea Cruz Blandón, Jayasimha Talur, Bruno Charron, Dong Liu, Saab Mansour, Marcello Federico:
MEMERAG: A Multilingual End-to-End Meta-Evaluation Benchmark for Retrieval Augmented Generation. 22577-22595 - Yufang Liu, Yao Du, Tao Ji, Jianing Wang, Yang Liu, Yuanbin Wu, Aimin Zhou, Mengdi Zhang, Xunliang Cai:
The Role of Visual Modality in Multimodal Mathematical Reasoning: Challenges and Insights. 22596-22611 - Chulun Zhou, Qiujing Wang, Mo Yu, Xiaoqian Yue, Rui Lu, Jiangnan Li, Yifan Zhou, Shunchi Zhang, Jie Zhou, Wai Lam:
The Essence of Contextual Understanding in Theory of Mind: A Study on Question Answering with Story Characters. 22612-22631 - Ruotian Ma, Peisong Wang, Cheng Liu, Xingyan Liu, Jiaqi Chen, Bang Zhang, Xin Zhou, Nan Du, Jia Li:
S²R: Teaching LLMs to Self-verify and Self-correct via Reinforcement Learning. 22632-22654 - Haoran Li, Ziyi Su, Yun Xue, Zhiliang Tian, Yiping Song, Minlie Huang:
Advancing Collaborative Debates with Role Differentiation through Multi-Agent Reinforcement Learning. 22655-22666 - Deokhyung Kang, Jeonghun Cho, Yejin Jeon, Sunbin Jang, Minsub Lee, Jawoon Cho, Gary Geunbae Lee:
Retrieval-Augmented Fine-Tuning With Preference Optimization For Visual Program Generation. 22667-22686 - Nils Dycke, Matej Zecevic, Ilia Kuznetsov, Beatrix Suess, Kristian Kersting, Iryna Gurevych:
STRICTA: Structured Reasoning in Critical Text Assessment for Peer Review and Beyond. 22687-22727 - Wooyoung Go, Hyoungshick Kim, Alice Oh, Yongdae Kim:
XDAC: XAI-Driven Detection and Attribution of LLM-Generated News Comments in Korean. 22728-22750 - Jinglong Luo, Guanzhong Chen, Yehong Zhang, Shiyu Liu, Hui Wang, Yue Yu, Xun Zhou, Yuan Qi, Zenglin Xu:
CENTAUR: Bridging the Impossible Trinity of Privacy, Efficiency, and Performance in Privacy-Preserving Transformer Inference. 22751-22770 - Prarabdh Shukla, Wei Yin Chong, Yash Patel, Brennan Schaffner, Danish Pruthi, Arjun Nitin Bhagoji:
Silencing Empowerment, Allowing Bigotry: Auditing the Moderation of Hate Speech on Twitch. 22771-22797 - Che Hyun Lee, Heeseung Kim, Jiheum Yeom, Sungroh Yoon:
EdiText: Controllable Coarse-to-Fine Text Editing with Diffusion Language Models. 22798-22815 - Jafar Isbarov, Arofat Akhundjanova, Mammad Hajili, Kavsar Huseynova, Dmitry Gaynullin, Anar Rzayev, Osman Tursun, Aizirek Turdubaeva, Ilshat Saetov, Rinat Kharisov, Saule Belginova, Ariana Kenbayeva, Amina Alisheva, Abdullatif Köksal, Samir Rustamov, Duygu Ataman:
TUMLU: A Unified and Native Language Understanding Benchmark for Turkic Languages. 22816-22838 - Ziyong Lin, Haoyi Wu, Shu Wang, Kewei Tu, Zilong Zheng, Zixia Jia:
Look Both Ways and No Sink: Converting LLMs into Text Encoders without Training. 22839-22853 - Bowen Chen, Namgi Han, Yusuke Miyao:
A Statistical and Multi-Perspective Revisiting of the Membership Inference Attack in Large Language Models. 22854-22874 - Carolin Holtermann, Paul Röttger, Anne Lauscher:
Around the World in 24 Hours: Probing LLM Knowledge of Time and Place. 22875-22897 - Neele Falk, Gabriella Lapesa:
Mining the uncertainty patterns of humans and models in the annotation of moral foundations and human values. 22898-22921 - Alessio Cocchieri, Luca Ragazzi, Paolo Italiani, Giuseppe Tagliavini, Gianluca Moro:
"What do you call a dog that is incontrovertibly true? Dogma": Testing LLM Generalization through Humor. 22922-22937 - Rui Li, Jing Long, Muge Qi, Heming Xia, Lei Sha, Peiyi Wang, Zhifang Sui:
Towards Harmonized Uncertainty Estimation for Large Language Models. 22938-22953 - Anudeex Shetty, Amin Beheshti, Mark Dras, Usman Naseem:
VITAL: A New Dataset for Benchmarking Pluralistic Alignment in Healthcare. 22954-22974 - Zhen Sun, Zongmin Zhang, Xinyue Shen, Ziyi Zhang, Yule Liu, Michael Backes, Yang Zhang, Xinlei He:
Are We in the AI-Generated Text World Already? Quantifying and Monitoring AIGT on Social Media. 22975-23005 - Linjuan Wu, Haoran Wei, Baosong Yang, Weiming Lu:
From English to Second Language Mastery: Enhancing LLMs with Cross-Lingual Continued Instruction Tuning. 23006-23023 - Anudeex Shetty, Qiongkai Xu, Jey Han Lau:
WET: Overcoming Paraphrasing Vulnerabilities in Embeddings-as-a-Service with Linear Transformation Watermarks. 23024-23043 - Yuhan Chen, Ang Lv, Jian Luan, Bin Wang, Wei Liu:
HoPE: A Novel Positional Encoding Without Long-Term Decay for Enhanced Context Awareness and Extrapolation. 23044-23056 - Ke Yi, Yuhui Xu, Heng Chang, Yuan Meng, Tong Zhang, Jia Li:
One QuantLLM for ALL: Fine-tuning Quantized LLMs Once for Efficient Deployments. 23057-23066 - Guoqiang Gong, Jiaxing Wang, Jin Xu, Deping Xiang, Zicheng Zhang, Leqi Shen, Yifeng Zhang, Junhua Shu, Zhaolong Xing, Zhen Chen, Pengzhang Liu, Ke Zhang:
Beyond Logits: Aligning Feature Dynamics for Effective Knowledge Distillation. 23067-23077 - Jingyang Yuan, Huazuo Gao, Damai Dai, Junyu Luo, Liang Zhao, Zhengyan Zhang, Zhenda Xie, Yuxing Wei, Lean Wang, Zhiping Xiao, Yuqing Wang, Chong Ruan, Ming Zhang, Wenfeng Liang, Wangding Zeng:
Native Sparse Attention: Hardware-Aligned and Natively Trainable Sparse Attention. 23078-23097 - Yayu Long, Kewei Chen, Long Jin, Mingsheng Shang:
DRAE: Dynamic Retrieval-Augmented Expert Networks for Lifelong Learning and Task Adaptation in Robotics. 23098-23141 - Kwangwook Seo, Donguk Kwon, Dongha Lee:
MT-RAIG: Novel Benchmark and Evaluation Framework for Retrieval-Augmented Insight Generation over Multiple Tables. 23142-23172 - Chenxi Huang, Shaotian Yan, Liang Xie, Binbin Lin, Sinan Fan, Yue Xin, Deng Cai, Chen Shen, Jieping Ye:
Enhancing Chain-of-Thought Reasoning with Critical Representation Fine-tuning. 23173-23195 - Jaewook Lee, Yeajin Jang, Oh-Woog Kwon, Harksoo Kim:
Does the Emotional Understanding of LVLMs Vary Under High-Stress Environments and Across Different Demographic Attributes? 23196-23210 - Suman Adhya, Debarshi Kumar Sanyal:
S2WTM: Spherical Sliced-Wasserstein Autoencoder for Topic Modeling. 23211-23225 - Zhaoxin Feng, Jianfei Ma, Emmanuele Chersoni, Xiaojing Zhao, Xiaoyi Bao:
Learning to Look at the Other Side: A Semantic Probing Study of Word Embeddings in LLMs with Enabled Bidirectional Attention. 23226-23245 - Yiqun Wang, Chaoqun Wan, Sile Hu, Yonggang Zhang, Xiang Tian, Yaowu Chen, Xu Shen, Jieping Ye:
Tracing and Dissecting How LLMs Recall Factual Knowledge for Real World Questions. 23246-23271 - Xinyu Chen, Peifeng Li, Qiaoming Zhu:
Employing Discourse Coherence Enhancement to Improve Cross-Document Event and Entity Coreference Resolution. 23272-23286 - Shaobo Wang, Xiangqi Jin, Ziming Wang, Jize Wang, Jiajun Zhang, Kaixin Li, Zichen Wen, Zhong Li, Conghui He, Xuming Hu, Linfeng Zhang:
Data Whisperer: Efficient Data Selection for Task-Specific LLM Fine-Tuning via Few-Shot In-Context Learning. 23287-23305 - Shuo Tang, Xianghe Pang, Zexi Liu, Bohan Tang, Rui Ye, Tian Jin, Xiaowen Dong, Yanfeng Wang, Siheng Chen:
Synthesizing Post-Training Data for LLMs through Multi-Agent Simulation. 23306-23335 - Yige Xu, Xu Guo, Zhiwei Zeng, Chunyan Miao:
SoftCoT: Soft Chain-of-Thought for Efficient Reasoning with LLMs. 23336-23351 - Seunghee Kim, Changhyeon Kim, Taeuk Kim:
FCMR: Robust Evaluation of Financial Cross-Modal Multi-Hop Reasoning. 23352-23380 - Mengru Wang, Ziwen Xu, Shengyu Mao, Shumin Deng, Zhaopeng Tu, Huajun Chen, Ningyu Zhang:
Beyond Prompt Engineering: Robust Behavior Control in LLMs via Steering Target Atoms. 23381-23399 - Borui Li, Yitao Wang, Haoran Ma, Ligeng Chen, Jun Xiao, Shuai Wang:
MobiLoRA: Accelerating LoRA-based LLM Inference on Mobile Devices via Context-aware KV Cache Optimization. 23400-23410 - Jiaming Ji, Kaile Wang, Tianyi Alex Qiu, Boyuan Chen, Jiayi Zhou, Changye Li, Hantao Lou, Josef Dai, Yunhuai Liu, Yaodong Yang:
Language Models Resist Alignment: Evidence From Data Compression. 23411-23432 - Qichuan Liu, Chentao Zhang, Chenfeng Zheng, Guosheng Hu, Xiaodong Li, Zhihong Zhang:
Beyond the Answer: Advancing Multi-Hop QA with Fine-Grained Graph Reasoning and Evaluation. 23433-23456 - Nir Endy, Idan Daniel Grosbard, Yuval Ran-Milo, Yonatan Slutzky, Itay Tshuva, Raja Giryes:
Mamba Knockout for Unraveling Factual Information Flow. 23457-23477 - Jaewook Lee, Junseo Jang, Oh-Woog Kwon, Harksoo Kim:
Small Changes, Big Impact: How Manipulating a Few Neurons Can Drastically Alter LLM Aggression. 23478-23505 - Huifeng Yin, Yu Zhao, Minghao Wu, Xuanfan Ni, Bo Zeng, Hao Wang, Tianqi Shi, Liangying Shao, Chenyang Lyu, Longyue Wang, Weihua Luo, Kaifu Zhang:
Marco-o1 v2: Towards Widening The Distillation Bottleneck for Reasoning Models. 23506-23516 - Haoran Sun, Yekun Chai, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang:
Curiosity-Driven Reinforcement Learning from Human Feedback. 23517-23534 - Zehan Wang, Ke Lei, Chen Zhu, Jiawei Huang, Sashuai Zhou, Luping Liu, Xize Cheng, Shengpeng Ji, Zhenhui Ye, Tao Jin, Zhou Zhao:
T2A-Feedback: Improving Basic Capabilities of Text-to-Audio Generation via Fine-grained AI Feedback. 23535-23547 - Zhiyu Shen, Yunhe Pang, Yanghui Rao, Jianxing Yu:
CoE: A Clue of Emotion Framework for Emotion Recognition in Conversations. 23548-23563 - Weixiang Zhao, Yulin Hu, Yang Deng, Tongtong Wu, Wenxuan Zhang, Jiahe Guo, An Zhang, Yanyan Zhao, Bing Qin, Tat-Seng Chua, Ting Liu:
MPO: Multilingual Safety Alignment via Reward Gap Optimization. 23564-23587 - Siyin Wang, Wenyi Yu, Xianzhao Chen, Xiaohai Tian, Jun Zhang, Lu Lu, Yu Tsao, Junichi Yamagishi, Yuxuan Wang, Chao Zhang:
QualiSpeech: A Speech Quality Assessment Dataset with Natural Language Reasoning and Descriptions. 23588-23609 - Deniz Ekin Yavas, Timothée Bernard, Benoît Crabbé, Laura Kallmeyer:
On the Relation Between Fine-Tuning, Topological Properties, and Task Performance in Sense-Enhanced Embeddings. 23610-23625 - Parth Thakkar, Ankush Agarwal, Prasad Kasu, Pulkit Bansal, Chaitanya Devaguptapu:
Finding Needles in Images: Can Multi-modal LLMs Locate Fine Details? 23626-23648 - Yongquan He, Wenyuan Zhang, Xuancheng Huang, Peng Zhang, Lingxun Meng, Xiang Zhou, Ke Zeng, Xunliang Cai:
Don't Half-listen: Capturing Key-part Information in Continual Instruction Tuning. 23649-23668 - Yooseop Lee, Suin Kim, Yohan Jo:
Generating Plausible Distractors for Multiple-Choice Questions via Student Choice Prediction. 23669-23692 - Ukyo Honda, Tatsushi Oka:
Exploring Explanations Improves the Robustness of In-Context Learning. 23693-23714 - Beatrix Miranda Ginn Nielsen, Iuri Macocco, Marco Baroni:
Prediction Hubs are Context-Informed Frequent Tokens in LLMs. 23715-23745 - Qiming Ge, Shuhao Xing, Songyang Gao, Yunhua Zhou, Yicheng Zou, Songyang Zhang, Zhi Chen, Hang Yan, Qi Zhang, Qipeng Guo, Kai Chen:
Capability Salience Vector: Fine-grained Alignment of Loss and Capabilities for Downstream Task Scaling Law. 23746-23761 - Ruiyang Xu, Jialun Cao, Yaojie Lu, Ming Wen, Hongyu Lin, Xianpei Han, Ben He, Shing-Chi Cheung, Le Sun:
CRUXEVAL-X: A Benchmark for Multilingual Code Reasoning, Understanding and Execution. 23762-23779 - Haozhen Zhang, Tao Feng, Jiaxuan You:
Graph of Records: Boosting Retrieval Augmented Generation for Long-context Summarization with Graphs. 23780-23799 - Diana Galván-Sosa, Gabrielle Gaudeau, Pride Kavumba, Yunmeng Li, Hongyi Gu, Zheng Yuan, Keisuke Sakaguchi, Paula Buttery:
Rubrik's Cube: Testing a New Rubric for Evaluating Explanations on the CUBE dataset. 23800-23839 - Yutong Liu, Lida Shi, Rui Song, Hao Xu:
A Dual-Mind Framework for Strategic and Expressive Negotiation Agent. 23840-23860 - Junjie Wu, Gefei Gu, Yanan Zheng, Dit-Yan Yeung, Arman Cohan:
Ref-Long: Benchmarking the Long-context Referencing Capability of Long-context Language Models. 23861-23880 - Zhengyu Chen, Siqi Wang, Teng Xiao, Yudong Wang, Shiqi Chen, Xunliang Cai, Junxian He, Jingang Wang:
Revisiting Scaling Laws for Language Models: The Role of Data Quality and Training Strategies. 23881-23899 - Marc Feger, Katarina Boland, Stefan Dietze:
Limited Generalizability in Argument Mining: State-Of-The-Art Models Learn Datasets, Not Arguments. 23900-23915 - Haoxiang Sun, Ruize Gao, Pei Zhang, Baosong Yang, Rui Wang:
Enhancing Machine Translation with Self-Supervised Preference Data. 23916-23934 - Hao Sun, Yingyan Hou, Jiayan Guo, Bo Wang, Chunyu Yang, Jinsong Ni, Yan Zhang:
Unveil: Unified Visual-Textual Integration and Distillation for Multi-modal Document Retrieval. 23935-23945 - Ante Wang, Linfeng Song, Ye Tian, Dian Yu, Haitao Mi, Xiangyu Duan, Zhaopeng Tu, Jinsong Su, Dong Yu:
Don't Get Lost in the Trees: Streamlining LLM Reasoning by Overcoming Tree Search Exploration Pitfalls. 23946-23959 - João Maria Janeiro, Benjamin Piwowarski, Patrick Gallinari, Loïc Barrault:
MEXMA: Token-level objectives improve sentence representations. 23960-23995 - Lei Li, Hehuan Liu, Yaxin Zhou, ZhaoYang Gui, Xudong Weng, Yi Yuan, Zheng Wei, Zang Li:
Uncertainty-Aware Iterative Preference Optimization for Enhanced LLM Reasoning. 23996-24012 - Zhexuan Wang, Yutong Wang, Xuebo Liu, Liang Ding, Miao Zhang, Jie Liu, Min Zhang:
AgentDropout: Dynamic Agent Elimination for Token-Efficient and High-Performance LLM-Based Multi-Agent Collaboration. 24013-24035 - Yang Xiao, Jiashuo Wang, Qiancheng Xu, Changhe Song, Chunpu Xu, Yi Cheng, Wenjie Li, Pengfei Liu:
Towards Dynamic Theory of Mind: Evaluating LLM Adaptation to Temporal Evolution of Human States. 24036-24057 - Bo Zeng, Chenyang Lyu, Sinuo Liu, Mingyan Zeng, Minghao Wu, Xuanfan Ni, Tianqi Shi, Yu Zhao, Yefeng Liu, Chenyu Zhu, Ruizhe Li, Jiahui Geng, Qing Li, Yu Tong, Longyue Wang, Weihua Luo, Kaifu Zhang:
Marco-Bench-MIF: On Multilingual Instruction-Following Capability of Large Language Models. 24058-24072 - Ashkan Yousefpour, Taeheon Kim, Ryan Sungmo Kwon, Seungbeen Lee, Wonje Jeung, Seungju Han, Alvin Wan, Harrison Ngan, Youngjae Yu, Jonghyun Choi:
Representation Bending for Large Language Model Safety. 24073-24098 - Chenghao Xiao, Hou Pong Chan, Hao Zhang, Mahani Aljunied, Lidong Bing, Noura Al Moubayed, Yu Rong:
Analyzing LLMs' Knowledge Boundary Cognition Across Languages Through the Lens of Internal Representations. 24099-24115 - Hao Sun, Hengyi Cai, Yuchen Li, Xuanbo Fan, Xiaochi Wei, Shuaiqiang Wang, Yan Zhang, Dawei Yin:
Enhancing Retrieval-Augmented Generation via Evidence Tree Search. 24116-24127 - Yejin Bang, Ziwei Ji, Alan Schelten, Anthony Hartshorn, Tara Fowler, Cheng Zhang, Nicola Cancedda, Pascale Fung:
HalluLens: LLM Hallucination Benchmark. 24128-24156 - Aili Chen, Chengyu Du, Jiangjie Chen, Jinghan Xu, Yikai Zhang, Siyu Yuan, Zulong Chen, Liangyue Li, Yanghua Xiao:
DEEPER Insight into Your User: Directed Persona Refinement for Dynamic Persona Modeling. 24157-24180 - Jie Liu, Wenxuan Wang, Yihang Su, Jingyuan Huang, Yudi Zhang, Cheng-Yi Li, Wenting Chen, Xiaohan Xing, Kao-Jung Chang, Linlin Shen, Michael R. Lyu:
Asclepius: A Spectrum Evaluation Benchmark for Medical Multi-Modal Large Language Models. 24181-24201 - Zifu Wan, Yaqi Xie, Ce Zhang, Zhiqiu Lin, Zihan Wang, Simon Stepputtis, Deva Ramanan, Katia P. Sycara:
InstructPart: Task-Oriented Part Segmentation with Instruction Reasoning. 24202-24227 - Thomas Bauwens, David Kaczér, Miryam de Lhoneux:
GRaMPa: Subword Regularisation by Skewing Uniform Segmentation Distributions with an Efficient Path-counting Markov Model. 24228-24257 - Tianhui Zhang, Bei Peng, Danushka Bollegala:
Evaluating the Evaluation of Diversity in Commonsense Generation. 24258-24275 - Zhao Tong, Yimeng Gu, Huidong Liu, Qiang Liu, Shu Wu, Haichao Shi, Xiao-Yu Zhang:
Generate First, Then Sample: Enhancing Fake News Detection with LLM-Augmented Reinforced Sampling. 24276-24290 - Yu Zhang, Ruijie Yu, Jidong Tian, Feng Zhu, Jiapeng Liu, Xiaokang Yang, Yaohui Jin, Yanyan Xu:
ChemActor: Enhancing Automated Extraction of Chemical Synthesis Actions with LLM-Generated Data. 24291-24314 - Shiyu Ni, Keping Bi, Jiafeng Guo, Lulu Yu, Baolong Bi, Xueqi Cheng:
Towards Fully Exploiting LLM Internal States to Enhance Knowledge Boundary Perception. 24315-24329 - Yiyi Chen, Qiongkai Xu, Johannes Bjerva:
ALGEN: Few-shot Inversion Attacks on Textual Embeddings via Cross-Model Alignment and Generation. 24330-24348 - Kun Li, Tianhua Zhang, Xixin Wu, Hongyin Luo, James R. Glass, Helen M. Meng:
Decoding on Graphs: Faithful and Sound Reasoning on Knowledge Graphs through Generation of Well-Formed Chains. 24349-24364 - Mingqian He, Yongliang Shen, Wenqi Zhang, Qiuying Peng, Jun Wang, Weiming Lu:
STaR-SQL: Self-Taught Reasoner for Text-to-SQL. 24365-24375 - T. Y. S. S. Santosh, Irtiza Chowdhury:
Fairness Beyond Performance: Revealing Reliability Disparities Across Groups in Legal NLP. 24376-24390 - Yang Zhao, Li Du, Xiao Ding, Yangou Ouyang, Hepeng Wang, Kai Xiong, Jinglong Gao, Zhouhao Sun, Dongliang Xu, Qing Yang, Dongchen Li, Bing Qin, Ting Liu:
Beyond Similarity: A Gradient-based Graph Method for Instruction Tuning Data Selection. 24391-24404 - Peiji Li, Kai Lv, Yunfan Shao, Yichuan Ma, Linyang Li, Xiaoqing Zheng, Xipeng Qiu, Qipeng Guo:
FastMCTS: A Simple Sampling Strategy for Data Synthesis. 24405-24422 - Qiwei Li, Teng Xiao, Zuchao Li, Ping Wang, Mengjia Shen, Hai Zhao:
Dialogue-RAG: Enhancing Retrieval for LLMs via Node-Linking Utterance Rewriting. 24423-24438 - Ethan Wilcox, Cui Ding, Giovanni Acampa, Tiago Pimentel, Alex Warstadt, Tamar I. Regev:
Using Information Theory to Characterize Prosodic Typology: The Case of Tone, Pitch-Accent and Stress-Accent. 24439-24451 - Arthur Mariano Rocha De Azevedo Scalercio, Elvis A. de Souza, Maria José Bocorny Finatto, Aline Paes:
Evaluating LLMs for Portuguese Sentence Simplification with Linguistic Insights. 24452-24477 - Hugo Pitorro, Marcos Vinícius Treviso:
LaTIM: Measuring Latent Token-to-Token Interactions in Mamba Models. 24478-24493 - Adam Wiemerslage, Katharina von der Wense:
Improving Low-Resource Morphological Inflection via Self-Supervised Objectives. 24494-24510 - Yingchaojie Feng, Yiqun Sun, Yandong Sun, Minfeng Zhu, Qiang Huang, Anthony Kum Hoe Tung, Wei Chen:
Don't Reinvent the Wheel: Efficient Instruction-Following Text Embedding based on Guided Space Transformation. 24511-24525 - Giuliano Martinelli, Tommaso Bonomo, Pere-Lluís Huguet Cabot, Roberto Navigli:
BOOKCOREF: Coreference Resolution at Book Scale. 24526-24544 - Wei Yang, Jingjing Fu, Rui Wang, Jinyu Wang, Lei Song, Jiang Bian:
OMGM: Orchestrate Multiple Granularities and Modalities for Efficient Multimodal Retrieval. 24545-24563 - Lei Huang, Xiaocheng Feng, Weitao Ma, Yuchun Fan, Xiachong Feng, Yuxuan Gu, Yangfan Ye, Liang Zhao, Weihong Zhong, Baoxin Wang, Dayong Wu, Guoping Hu, Lingpeng Kong, Tong Xiao, Ting Liu, Bing Qin:
Alleviating Hallucinations from Knowledge Misalignment in Large Language Models via Selective Abstention Learning. 24564-24579 - Zizhao Chen, Mustafa Omer Gul, Yiwei Chen, Gloria Geng, Anne Wu, Yoav Artzi:
Retrospective Learning from Interactions. 24580-24606 - Yiyan Xu, Jinghao Zhang, Alireza Salemi, Xinting Hu, Wenjie Wang, Fuli Feng, Hamed Zamani, Xiangnan He, Tat-Seng Chua:
Personalized Generation In Large Model Era: A Survey. 24607-24649 - Junqi Gao, Xiang Zou, Ying Ai, Dong Li, Yichen Niu, Biqing Qi, Jianxing Liu:
Graph Counselor: Adaptive Graph Exploration via Multi-Agent Synergy to Enhance LLM Reasoning. 24650-24668 - Wenyuan Zhang, Tianyun Liu, Mengxiao Song, Xiaodong Li, Tingwen Liu:
SOTOPIA-Ω: Dynamic Strategy Injection Learning and Social Instruction Following Evaluation for Social Agents. 24669-24697 - Shanchao Liang, Nan Jiang, Yiran Hu, Lin Tan:
Can Language Models Replace Programmers for Coding? REPOCOD Says 'Not Yet'. 24698-24717 - Patrick Haller, Jannis Vamvas, Rico Sennrich, Lena Ann Jäger:
Leveraging In-Context Learning for Political Bias Testing of LLMs. 24718-24738 - Steven H. Wang, Maksim Zubkov, Kexin Fan, Sarah Harrell, Yuyang Sun, Wei Chen, Andreas Plesner, Roger Wattenhofer:
ACORD: An Expert-Annotated Retrieval Dataset for Legal Contract Drafting. 24739-24762 - Qibing Ren, Hao Li, Dongrui Liu, Zhanxu Xie, Xiaoya Lu, Yu Qiao, Lei Sha, Junchi Yan, Lizhuang Ma, Jing Shao:
LLMs know their vulnerabilities: Uncover Safety Gaps through Natural Distribution Shifts. 24763-24785 - Shanchao Liang, Nan Jiang, Shangshu Qian, Lin Tan:
WAFFLE: Fine-tuning Multi-Modal Model for Automated Front-End Development. 24786-24802 - Bryan R. Christ, Zachary Gottesman, Jonathan Kropko, Thomas Hartvigsen:
Math Neurosurgery: Isolating Language Models' Math Reasoning Abilities Using Only Forward Passes. 24803-24840 - Dayeon Ki, Rachel Rudinger, Tianyi Zhou, Marine Carpuat:
Multiple LLM Agents Debate for Equitable Cultural Alignment. 24841-24877 - Fangyuan Xu, Tanya Goyal, Eunsol Choi:
RefreshKV: Updating Small KV Cache During Long-form Generation. 24878-24893 - Weikai Lu, Hao Peng, Huiping Zhuang, Cen Chen, Ziqian Zeng:
SEA: Low-Resource Safety Alignment for Multimodal Large Language Models via Synthetic Embeddings. 24894-24913 - Yiyao Yu, Yuxiang Zhang, Dongdong Zhang, Xiao Liang, Hengyuan Zhang, Xingxing Zhang, Mahmoud Khademi, Hany Hassan Awadalla, Junjie Wang, Yujiu Yang, Furu Wei:
Chain-of-Reasoning: Towards Unified Mathematical Reasoning in Large Language Models via a Multi-Paradigm Perspective. 24914-24937 - Tatsuya Aoyama, Ethan Wilcox:
Language Models Grow Less Humanlike beyond Phase Transition. 24938-24958 - Arkadiusz Modzelewski, Witold Sosnowski, Tiziano Labruna, Adam Wierzbicki, Giovanni Da San Martino:
PCoT: Persuasion-Augmented Chain of Thought for Detecting Fake News and Social Media Disinformation. 24959-24983 - Benjamin Roger Litterer, David Jurgens, Dallas Card:
Coordinating Chaos: A Structured Review of Linguistic Coordination Methodologies. 24984-24999 - Tiancheng Hu, Nigel Collier:
iNews: A Multimodal Dataset for Modeling Personalized Affective Responses to News. 25000-25040 - Akhila Yerukola, Saadia Gabriel, Nanyun Peng, Maarten Sap:
Mind the Gesture: Evaluating AI Sensitivity to Culturally Offensive Non-Verbal Gestures. 25041-25080 - Zongqian Li, Yixuan Su, Nigel Collier:
500xCompressor: Generalized Prompt Compression for Large Language Models. 25081-25091 - James Flemings, Bo Jiang, Wanrong Zhang, Zafar Takhirov, Murali Annavaram:
Estimating Privacy Leakage of Augmented Contextual Knowledge in Language Models. 25092-25108 - Joseph Gatto, Omar Sharif, Parker Seegmiller, Sarah Masud Preum:
Document-Level Event-Argument Data Augmentation for Challenging Role Types. 25109-25131 - Benjamin Roger Litterer, David Jurgens, Dallas Card:
Mapping the Podcast Ecosystem with the Structured Podcast Research Corpus. 25132-25154 - Tharindu Madusanka, Marco Valentino, Iqra Zahid, Ian Pratt-Hartmann, Riza Batista-Navarro:
Unravelling the Logic: Investigating the Generalisation of Transformers in Numerical Satisfiability Problems. 25155-25168 - Aniket Pramanick, Yufang Hou, Saif M. Mohammad, Iryna Gurevych:
The Nature of NLP: Analyzing Contributions in NLP Papers. 25169-25191 - Vishal Dey, Xiao Hu, Xia Ning:
GeLLM³O: Generalizing Large Language Models for Multi-property Molecule Optimization. 25192-25221 - Joseph Gatto, Parker Seegmiller, Timothy E. Burdick, Inas S. Khayal, Sarah DeLozier, Sarah Masud Preum:
Follow-up Question Generation For Enhanced Patient-Provider Conversations. 25222-25240 - Bo Wang, Weiyi He, Shenglai Zeng, Zhen Xiang, Yue Xing, Jiliang Tang, Pengfei He:
Unveiling Privacy Risks in LLM Agent Memory. 25241-25260 - Emmanouil Zaranis, Giuseppe Attanasio, Sweta Agrawal, André F. T. Martins:
Watching the Watchers: Exposing Gender Disparities in Machine Translation Quality Estimation. 25261-25284 - Nayu Liu, Fanglong Yao, Haoran Luo, Yong Yang, Chen Tang, Bo Lv:
Language Constrained Multimodal Hyper Adapter For Many-to-Many Multimodal Summarization. 25285-25298 - Mingyang Song, Zhaochen Su, Xiaoye Qu, Jiawei Zhou, Yu Cheng:
PRMBench: A Fine-grained and Challenging Benchmark for Process-Level Reward Models. 25299-25346 - Dongyue Li, Ziniu Zhang, Lu Wang, Hongyang R. Zhang:
Efficient Ensemble for Fine-tuning Language Models on Multiple Datasets. 25347-25364 - Munachiso Nwadike, Zangir Iklassov, Toluwani Aremu, Tatsuya Hiraoka, Benjamin Heinzerling, Velibor Bojkovic, Hilal AlQuabeh, Martin Takác, Kentaro Inui:
Library-Like Behavior In Language Models is Enhanced by Self-Referencing Causal Cycles. 25365-25377 - Lang Gao, Jiahui Geng, Xiangliang Zhang, Preslav Nakov, Xiuying Chen:
Shaping the Safety Boundaries: Understanding and Defending Against Jailbreaks in Large Language Models. 25378-25398 - Alexandru Coca, Mark Gaynor, Zhenxing Zhang, Jianpeng Cheng, Bo-Hsiang Tseng, Peter Boothroyd, Héctor Martínez Alonso, Diarmuid Ó Séaghdha, Anders Johannsen:
ASPERA: A Simulated Environment to Evaluate Planning for Complex Action Execution. 25399-25434 - Jiahao Yuan, Zixiang Di, Zhiqing Cui, Guisong Yang, Usman Naseem:
ReflectDiffu: Reflect between Emotion-intent Contagion and Mimicry for Empathetic Response Generation via a RL-Diffusion Framework. 25435-25449 - Nayu Liu, Junnan Zhu, Yiming Ma, Zhicong Lu, Wenlei Xu, Yong Yang, Jiang Zhong, Kaiwen Wei:
SARA: Salience-Aware Reinforced Adaptive Decoding for Large Language Models in Abstractive Summarization. 25450-25463 - Jinsung Yoon, Sercan Ö. Arik:
Embedding-Converter: A Unified Framework for Cross-Model Embedding Transformation. 25464-25482 - Md. Tahmid Rahman Laskar, Israt Jahan, Elham Dolatabadi, Chun Peng, Enamul Hoque, Jimmy Huang:
Improving Automatic Evaluation of Large Language Models (LLMs) in Biomedical Relation Extraction via LLMs-as-the-Judge. 25483-25497 - Fan Li, Jianxing Yu, Jielong Tang, Wenqing Chen, Hanjiang Lai, Yanghui Rao, Jian Yin:
Answering Complex Geographic Questions by Adaptive Reasoning with Visual Context and External Commonsense Knowledge. 25498-25514 - Zesheng Shi, Yucheng Zhou, Jing Li, Yuxin Jin, Yu Li, Daojing He, Fangming Liu, Saleh Alharbi, Jun Yu, Min Zhang:
Safety Alignment via Constrained Knowledge Unlearning. 25515-25529 - Shivam Chandhok, Wan-Cyuan Fan, Vered Shwartz, Vineeth N. Balasubramanian, Leonid Sigal:
Response Wide Shut? Surprising Observations in Basic Vision Language Model Capabilities. 25530-25545 - Zekun Wang, Minghua Ma, Zexin Wang, Rongchuan Mu, Liping Shan, Ming Liu, Bing Qin:
EffiVLM-BENCH: A Comprehensive Benchmark for Evaluating Training-Free Acceleration in Large Vision-Language Models. 25546-25572 - Ansar Aynetdinov, Alan Akbik:
Pre-Training Curriculum for Multi-Token Prediction in Language Models. 25573-25588 - Xingxuan Li, Weiwen Xu, Ruochen Zhao, Fangkai Jiao, Shafiq Joty, Lidong Bing:
Can We Further Elicit Reasoning in LLMs? Critic-Guided Planning with Retrieval-Augmentation for Solving Challenging Tasks. 25589-25604 - Kaijian Zou, Muhammad Khalifa, Lu Wang:
On Many-Shot In-Context Learning for Long-Context Evaluation. 25605-25639 - Zhilin Wang, Jiaqi Zeng, Olivier Delalleau, Daniel Egert, Ellie Evans, Hoo-Chang Shin, Felipe Soares, Yi Dong, Oleksii Kuchaiev:
HelpSteer3: Human-Annotated Feedback and Edit Data to Empower Inference-Time Scaling in Open-Ended General-Domain Tasks. 25640-25662 - Yu Ying Chiu, Liwei Jiang, Bill Yuchen Lin, Chan Young Park, Shuyue Stella Li, Sahithya Ravi, Mehar Bhatia, Maria Antoniak, Yulia Tsvetkov, Vered Shwartz, Yejin Choi:
CulturalBench: A Robust, Diverse and Challenging Benchmark for Measuring LMs' Cultural Knowledge Through Human-AI Red-Teaming. 25663-25701 - Mohit Raghavendra, Junmo Kang, Alan Ritter:
Balancing the Budget: Understanding Trade-offs Between Supervised and Preference-Based Finetuning. 25702-25720 - Tarun Gupta, Danish Pruthi:
All That Glitters is Not Novel: Plagiarism in AI Generated Research. 25721-25738 - Yuxiang Liu, Kevin Chen-Chuan Chang:
Writing Like the Best: Exemplar-Based Expository Text Generation. 25739-25764 - Rochana Chaturvedi, Peyman Baghershahi, Sourav Medya, Barbara Di Eugenio:
Temporal Relation Extraction in Clinical Texts: A Span-based Graph Transformer Approach. 25765-25788 - Sarah E. Finch, Ellie S. Paek, Ikseon Choi, Jinho D. Choi:
Finding A Voice: Exploring the Potential of African American Dialect and Voice Generation for Chatbots. 25789-25806 - Chuyuan Li, Raymond Li, Thalia Shoshana Field, Giuseppe Carenini:
Delta-KNN: Improving Demonstration Selection in In-Context Learning for Alzheimer's Disease Detection. 25807-25826 - Hannah Rashkin, Elizabeth Clark, Fantine Huot, Mirella Lapata:
Help Me Write a Story: Evaluating LLMs' Ability to Generate Writing Feedback. 25827-25847 - Philipp Borchert, Ivan Vulic, Marie-Francine Moens, Jochen De Weerdt:
Language Fusion for Parameter-Efficient Cross-lingual Transfer. 25848-25868 - Naitian Zhou, David Bamman, Isaac L. Bleaman:
Culture is Not Trivia: Sociocultural Theory for Cultural NLP. 25869-25886 - Xilin Jiang, Sukru Samet Dindar, Vishal Choudhari, Stephan Bickel, Ashesh D. Mehta, Guy M. McKhann II, Daniel Friedman, Adeen Flinker, Nima Mesgarani:
AAD-LLM: Neural Attention-Driven Auditory Scene Understanding. 25887-25909 - Anders Søgaard:
Do Language Models Have Semantics? On the Five Standard Positions. 25910-25922 - Myra Cheng, Su Lin Blodgett, Alicia DeVrio, Lisa Egede, Alexandra Olteanu:
Dehumanizing Machines: Mitigating Anthropomorphic Behaviors in Text Generation Systems. 25923-25948 - Antonia Karamolegkou, Malvina Nikandrou, Georgios Pantazopoulos, Danae Sanchez Villegas, Phillip Rust, Ruchira Dhar, Daniel Hershcovich, Anders Søgaard:
Evaluating Multimodal Language Models as Visual Assistants for Visually Impaired Users. 25949-25982 - Myra Cheng, Sunny Yu, Dan Jurafsky:
HumT DumT: Measuring and controlling human-like language in LLMs. 25983-26008 - Serina Chang, Ashton Anderson, Jake M. Hofman:
ChatBench: From Static Benchmarks to Human-AI Evaluation. 26009-26038 - Mohammad Saqib Hasan, Saikat Chakraborty, Santu Karmaker, Niranjan Balasubramanian:
Teaching an Old LLM Secure Coding: Localized Preference Optimization on Distilled Preferences. 26039-26057 - Xiulin Yang, Tatsuya Aoyama, Yuekun Yao, Ethan Wilcox:
Anything Goes? A Crosslinguistic Study of (Im)possible Language Learning in LMs. 26058-26077 - Roland Daynauth, Christopher Clarke, Krisztián Flautner, Lingjia Tang, Jason Mars:
Ranking Unraveled: Recipes for LLM Rankings in Head-to-Head AI Combat. 26078-26091 - Georg Wölflein, Dyke Ferber, Daniel Truhn, Ognjen Arandjelovic, Jakob Nikolas Kather:
LLM Agents Making Agent Tools. 26092-26130 - Zoya Volovikova, Gregory Gorbov, Petr Kuderov, Aleksandr Panov, Alexey Skrynnik:
CrafText Benchmark: Advancing Instruction Following in Complex Multimodal Open-Ended World. 26131-26151 - Bang Nguyen, Tingting Du, Mengxia Yu, Lawrence Angrave, Meng Jiang:
QG-SMS: Enhancing Test Item Analysis via Student Modeling and Simulation. 26152-26168 - Mahnaz Koupaee, Xueying Bai, Mudan Chen, Greg Durrett, Nathanael Chambers, Niranjan Balasubramanian:
Causal Graph based Event Reasoning using Semantic Relation Experts. 26169-26199 - Jin Jiang, Yuchen Yan, Yang Liu, Jianing Wang, Shuai Peng, Xunliang Cai, Yixin Cao, Mengdi Zhang, Liangcai Gao:
LogicPro: Improving Complex Logical Reasoning via Program-Guided Learning. 26200-26218 - Ayesha Qamar, Jonathan Tong, Ruihong Huang:
Do LLMs Understand Dialogues? A Case Study on Dialogue Acts. 26219-26237 - Shaily Bhatt, Tal August, Maria Antoniak:
Research Borderlands: Analysing Writing Across Research Cultures. 26238-26266 - Xia Li, Wenjing Pan:
CEAES: Bidirectional Reinforcement Learning Optimization for Consistent and Explainable Essay Assessment. 26267-26279 - James Y. Huang, Sailik Sengupta, Daniele Bonadiman, Yi-An Lai, Arshit Gupta, Nikolaos Pappas, Saab Mansour, Katrin Kirchhoff, Dan Roth:
DeAL: Decoding-time Alignment for Large Language Models. 26280-26300 - Senqi Yang, Dongyu Zhang, Jing Ren, Ziqi Xu, Xiuzhen Zhang, Yiliao Song, Hongfei Lin, Feng Xia:
Cultural Bias Matters: A Cross-Cultural Benchmark Dataset and Sentiment-Enriched Model for Understanding Multimodal Metaphors. 26301-26317 - Haonan Zhang, Run Luo, Xiong Liu, Yuchuan Wu, Ting-En Lin, Pengpeng Zeng, Qiang Qu, Feiteng Fang, Min Yang, Lianli Gao, Jingkuan Song, Fei Huang, Yongbin Li:
OmniCharacter: Towards Immersive Role-Playing Agents with Seamless Speech-Language Personality Interaction. 26318-26331 - Giwon Hong, Emile van Krieken, Edoardo Maria Ponti, Nikolay Malkin, Pasquale Minervini:
Mixtures of In-Context Learners. 26332-26351 - Yuxuan Zhou, Margret Keuper, Mario Fritz:
Balancing Diversity and Risk in LLM Sampling: How to Select Your Method and Parameter for Open-Ended Text Generation. 26352-26365 - Wenjun Hou, Yi Cheng, Kaishuai Xu, Heng Li, Yan Hu, Wenjie Li, Jiang Liu:
RADAR: Enhancing Radiology Report Generation with Supplementary Knowledge Injection. 26366-26381 - Jaewoo Ahn, Heeseung Yun, Dayoon Ko, Gunhee Kim:
Can LLMs Deceive CLIP? Benchmarking Adversarial Compositionality of Pre-trained Multimodal Representation via Text Updates. 26382-26402 - Rishabh Adiga, Besmira Nushi, Varun Chandrasekaran:
Attention Speaks Volumes: Localizing and Mitigating Bias in Language Models. 26403-26423 - Weiyang Guo, Jing Li, Wenya Wang, Yu Li, Daojing He, Jun Yu, Min Zhang:
MTSA: Multi-turn Safety Alignment for LLMs through Multi-round Red-teaming. 26424-26442 - Huixue Zhou, Hengrui Gu, Zaifu Zhan, Xi Liu, Kaixiong Zhou, Yongkang Xiao, Mingfu Liang, Srinivas Prasad Govindan, Piyush Chawla, Jiyan Yang, Xiangfei Meng, Huayu Li, Buyun Zhang, Liang Luo, Wen-Yen Chen, Yiping Han, Bo Long, Rui Zhang, Tianlong Chen:
The Efficiency vs. Accuracy Trade-off: Optimizing RAG-Enhanced LLM Recommender Systems Using Multi-Head Early Exit. 26443-26458 - Haobo Zhang, Jiayu Zhou:
Unraveling LoRA Interference: Orthogonal Subspaces for Robust Model Merging. 26459-26472 - Mehran Kazemi, Bahare Fatemi, Hritik Bansal, John Palowitch, Chrysovalantis Anastasiou, Sanket Vaibhav Mehta, Lalit K. Jain, Virginia Aglietti, Disha Jindal, Peter Chen, Nishanth Dikkala, Gladys Tyen, Xin Liu, Uri Shalit, Silvia Chiappa, Kate Olszewska, Yi Tay, Vinh Q. Tran, Quoc V. Le, Orhan Firat:
BIG-Bench Extra Hard. 26473-26501 - Zhaowen Wang, Xiang Wei, Kangshao Du, Yiting Zhang, Libo Qin, Yingjie Xia, Li Kuang:
CSTree-SRI: Introspection-Driven Cognitive Semantic Tree for Multi-Turn Question Answering over Extra-Long Contexts. 26502-26525 - Wenyue Hua, Tyler Wong, Fei Sun, Liangming Pan, Adam Jardine, William Yang Wang:
InductionBench: LLMs Fail in the Simplest Complexity Class. 26526-26546 - Dongwei Jiang, Guoxuan Wang, Yining Lu, Andrew Wang, Jingyu Zhang, Chuyu Liu, Benjamin Van Durme, Daniel Khashabi:
RATIONALYST: Pre-training Process-Supervision for Improving Reasoning. 26547-26566 - Andong Chen, Yuchen Song, Kehai Chen, Xuefeng Bai, Muyun Yang, Liqiang Nie, Jie Liu, Tiejun Zhao, Min Zhang:
Make Imagination Clearer! Stable Diffusion-based Visual Imagination for Multimodal Machine Translation. 26567-26583 - Liang Zhang, Ziyao Lu, Fandong Meng, Hui Li, Jie Zhou, Jinsong Su:
Advancing SMoE for Continuous Domain Adaptation of MLLMs: Adaptive Router and Domain-Specific Loss. 26584-26602 - Yuanyuan Lei, Ruihong Huang:
Multi-document Summarization through Multi-document Event Relation Graph Reasoning in LLMs: a case study in Framing Bias Mitigation. 26603-26619 - Jiatao Li, Xiaojun Wan:
Who Writes What: Unveiling the Impact of Author Roles on AI-generated Text Detection. 26620-26658 - Md. Kowsher, Tara Esmaeilbeig, Chun-Nam Yu, Chen Chen, Mojtaba Soltanalian, Niloofar Yousefi:
RoCoFT: Efficient Finetuning of Large Language Models with Row-Column Updates. 26659-26678 - Tejas Vaidhya, Ayush Kaushal, Vineet Jain, Francis Couture Harpin, Prashant Shishodia, Majid Behbahani, Yuriy Nevmyvaka, Irina Rish:
Scaling Laws and Efficient Inference for Ternary Language Models. 26679-26710 - Kyubeen Han, Junseo Jang, Hongjin Kim, Geunyeong Jeong, Harksoo Kim:
Exploring the Impact of Instruction-Tuning on LLM's Susceptibility to Misinformation. 26711-26731 - Mohammad Rifqi Farhansyah, Iwan Darmawan, Adryan Kusumawardhana, Genta Indra Winata, Alham Fikri Aji, Derry Tanti Wijaya:
Do Language Models Understand Honorific Systems in Javanese? 26732-26754 - Xiaobo Liang, Haoke Zhang, Juntao Li, Kehai Chen, Qiaoming Zhu, Min Zhang:
Generative Reward Modeling via Synthetic Criteria Preference Learning. 26755-26769 - Xinyu Zhang, Aibo Song, Jingyi Qiu, Jiahui Jin, Tianbo Zhang, Xiaolin Fang:
Exploring Multimodal Relation Extraction of Hierarchical Tabular Data with Multi-task Learning. 26770-26781 - Liang Zhang, Yang Zhang, Ziyao Lu, Fandong Meng, Jie Zhou, Jinsong Su:
A Self-Denoising Model for Robust Few-Shot Relation Extraction. 26782-26797 - WeiJie Liu, Yibin Zheng, Fang Kong:
QuASAR: A Question-Driven Structure-Aware Approach for Table-to-Text Generation. 26798-26812 - Jean-Benoit Delbrouck, Justin Xu, Johannes Moll, Alois Thomas, Zhihong Chen, Sophie Ostmeier, Asfandyar Azhar, Kelvin Zhenghao Li, Andrew Johnston, Christian Bluethgen, Eduardo Pontes Reis, Mohamed S. Muneer, Maya Varma, Curtis P. Langlotz:
Automated Structured Radiology Report Generation. 26813-26829 - Fatemeh Pesaran Zadeh, Yoojin Oh, Gunhee Kim:
LPOI: Listwise Preference Optimization for Vision Language Models. 26830-26844 - Md. Kowsher, Nusrat Jahan Prottasha, Prakash Bhat, Chun-Nam Yu, Mojtaba Soltanalian, Ivan Garibay, Ozlem O. Garibay, Chen Chen, Niloofar Yousefi:
Predicting Through Generation: Why Generation Is Better for Prediction. 26845-26871 - Eldar Kurtic, Alexandre Noll Marques, Shubhra Pandit, Mark Kurtz, Dan Alistarh:
"Give Me BF16 or Give Me Death"? Accuracy-Performance Trade-Offs in LLM Quantization. 26872-26886 - Bodun Hu, Shuozhe Li, Saurabh Agarwal, Myungjin Lee, Akshay Jajoo, Jiamin Li, Le Xu, Geon-Woo Kim, Donghyun Kim, Hong Xu, Amy Zhang, Aditya Akella:
StitchLLM: Serving LLMs, One Block at a Time. 26887-26903 - Yuqi Bu, Xin Wu, Zirui Zhao, Yi Cai, David Hsu, Qiong Liu:
Walk in Others' Shoes with a Single Glance: Human-Centric Visual Grounding with Top-View Perspective Transformation. 26904-26923 - Ray Groshan, Michael Ginn, Alexis Palmer:
Is linguistically-motivated data augmentation worth it? 26924-26939 - Xuanchang Zhang, Wei Xiong, Lichang Chen, Tianyi Zhou, Heng Huang, Tong Zhang:
From Lists to Emojis: How Format Bias Affects Model Alignment. 26940-26961 - Jinggui Liang, Dung Vo, Yap Hong Xian, Hai Leong Chieu, Kian Ming Adam Chai, Jing Jiang, Lizi Liao:
Colloquial Singaporean English Style Transfer with Fine-Grained Explainable Control. 26962-26983 - Jialun Cao, Yaojie Lu, Meiziniu Li, Haoyang Ma, Haokun Li, Mengda He, Cheng Wen, Le Sun, Hongyu Zhang, Shengchao Qin, Shing-Chi Cheung, Cong Tian:
From Informal to Formal - Incorporating and Evaluating LLMs on Natural Language Requirements to Verifiable Formal Proofs. 26984-27003 - Yusuke Ide, Joshua Tanner, Adam Nohejl, Jacob Hoffman, Justin Vasselli, Hidetaka Kamigaito, Taro Watanabe:
CoAM: Corpus of All-Type Multiword Expressions. 27004-27021 - Zijun Yao, Weijian Qi, Liangming Pan, Shulin Cao, Linmei Hu, Weichuan Liu, Lei Hou, Juanzi Li:
SeaKR: Self-aware Knowledge Retrieval for Adaptive Retrieval Augmented Generation. 27022-27043 - Joykirat Singh, Akshay Uttama Nambi, Vibhav Vineet:
Exposing the Achilles' Heel: Evaluating LLMs Ability to Handle Mistakes in Mathematical Reasoning. 27044-27065 - Qingjie Zhang, Di Wang, Haoting Qian, Yiming Li, Tianwei Zhang, Minlie Huang, Ke Xu, Hewu Li, Liu Yan, Han Qiu:
Understanding the Dark Side of LLMs' Intrinsic Self-Correction. 27066-27101 - Xinyu Chen, Yunxin Li, Haoyuan Shi, Baotian Hu, Wenhan Luo, Yaowei Wang, Min Zhang:
VideoVista-CulturalLingo: 360° Horizons-Bridging Cultures, Languages, and Domains in Video Comprehension. 27102-27128 - Zhi Chen, Qiguang Chen, Libo Qin, Qipeng Guo, Haijun Lv, Yicheng Zou, Hang Yan, Kai Chen, Dahua Lin:
What are the Essential Factors in Crafting Effective Long Context Multi-Hop Instruction Datasets? Insights and Best Practices. 27129-27151 - Shijie Wang, Wenqi Fan, Yue Feng, Shanru Lin, Xinyu Ma, Shuaiqiang Wang, Dawei Yin:
Knowledge Graph Retrieval-Augmented Generation for LLM-based Recommendation. 27152-27168 - Qin Liu, Fei Wang, Chaowei Xiao, Muhao Chen:
SudoLM: Learning Access Control of Parametric Knowledge with Authorization Alignment. 27169-27181 - Na Min An, Eunki Kim, James Thorne, Hyunjung Shim:
I0T: Embedding Standardization Method Towards Zero Modality Gap. 27182-27199 - Wen Luo, Feifan Song, Wei Li, Guangyue Peng, Shaohang Wei, Houfeng Wang:
Odysseus Navigates the Sirens' Song: Dynamic Focus Decoding for Factual and Diverse Open-Ended Text Generation. 27200-27218 - Felix Stollenwerk, Tobias Stollenwerk:
Better Embeddings with Coupled Adam. 27219-27236 - Guofu Xie, Xiao Zhang, Ting Yao, Yunsheng Shi:
Bone Soups: A Seek-and-Soup Model Merging Approach for Controllable Multi-Objective Generation. 27237-27263 - Harshit Joshi, Shicheng Liu, James Chen, Larsen Weigle, Monica S. Lam:
Controllable and Reliable Knowledge-Intensive Task-Oriented Conversational Agents with Declarative Genie Worksheets. 27264-27308 - Jia Li, Xuyuan Guo, Lei Li, Kechi Zhang, Ge Li, Zhengwei Tao, Fang Liu, Chongyang Tao, Yuqi Zhu, Zhi Jin:
Benchmarking Long-Context Language Models on Long Code Understanding. 27309-27327 - Savya Khosla, Aditi Tiwari, Kushal Kafle, Simon Jenni, Handong Zhao, John P. Collomosse, Jing Shi:
MAGNET: Augmenting Generative Decoders with Representation Learning and Infilling Capabilities. 27328-27346 - Haoran Jin, Meng Li, Xiting Wang, Zhihao Xu, Minlie Huang, Yantao Jia, Defu Lian:
Internal Value Alignment in Large Language Models through Controlled Value Vector Activation. 27347-27371 - Xinyu Hu, Mingqi Gao, Li Lin, Zhenghan Yu, Xiaojun Wan:
A Dual-Perspective NLG Meta-Evaluation Framework with Automatic Benchmark and Better Interpretability. 27372-27395 - Yujie Feng, Xujia Wang, Zexin Lu, Shenghong Fu, Guangyuan Shi, Yongxin Xu, Yasha Wang, Philip S. Yu, Xu Chu, Xiao-Ming Wu:
Recurrent Knowledge Identification and Fusion for Language Model Continual Learning. 27396-27413 - Thomas Vakili, Aron Henriksson, Hercules Dalianis:
Data-Constrained Synthesis of Training Data for De-Identification. 27414-27427 - Soumitra Ghosh, Gopendra Vikram Singh, Shambhavi, Sabarna Choudhury, Asif Ekbal:
Just a Scratch: Enhancing LLM Capabilities for Self-harm Detection through Intent Differentiation and Emoji Interpretation. 27428-27445 - Peiming Guo, Meishan Zhang, Jianling Li, Min Zhang, Yue Zhang:
Contrastive Learning on LLM Back Generation Treebank for Cross-domain Constituency Parsing. 27446-27458 - Kexin Wang, Yuhong Chou, Di Shang, Shijie Mei, Jiahong Zhang, Yanbin Huang, Man Yao, Bo Xu, Guoqi Li:
MMDEND: Dendrite-Inspired Multi-Branch Multi-Compartment Parallel Spiking Neuron for Sequence Modeling. 27459-27470 - Taywon Min, Haeone Lee, Yongchan Kwon, Kimin Lee:
Understanding Impact of Human Feedback via Influence Functions. 27471-27500 - Ziwei Huang, Wanggui He, Quanyu Long, Yandi Wang, Haoyuan Li, Zhelun Yu, Fangxun Shu, Weilong Dai, Hao Jiang, Fei Wu, Leilei Gan:
T2I-FactualBench: Benchmarking the Factuality of Text-to-Image Models with Knowledge-Intensive Concepts. 27501-27524 - Fuyu Wang, Jiangtong Li, Kun Zhu, Changjun Jiang:
InspireDebate: Multi-Dimensional Subjective-Objective Evaluation-Guided Reasoning and Optimization for Debating. 27525-27544 - Hongliang He, Wenlin Yao, Kaixin Ma, Wenhao Yu, Hongming Zhang, Tianqing Fang, Zhenzhong Lan, Dong Yu:
OpenWebVoyager: Building Multimodal Web Agents via Iterative Real-World Exploration, Feedback and Optimization. 27545-27564 - Kankan Zhou, Eason Lai, Kyriakos Mouratidis, Jing Jiang:
FOCUS: Evaluating Pre-trained Vision-Language Models on Underspecification Reasoning. 27565-27584 - Wan Ju Kang, Eunki Kim, Na Min An, Sangryul Kim, Haemin Choi, Ki Hoon Kwak, James Thorne:
Sightation Counts: Leveraging Sighted User Feedback in Building a BLV-aligned Dataset of Diagram Descriptions. 27585-27621 - Zijian Shao, Jiancan Wu, Weijian Chen, Xiang Wang:
Personal Travel Solver: A Preference-Driven LLM-Solver System for Travel Planning. 27622-27642 - Aswini Kumar Padhi, Anil Bandhakavi, Tanmoy Chakraborty:
Counterspeech the ultimate shield! Multi-Conditioned Counterspeech Generation through Attributed Prefix Learning. 27643-27663 - Zihan Zhou, Chong Li, Xinyi Chen, Shuo Wang, Yu Chao, Zhili Li, Haoyu Wang, Qi Shi, Zhixing Tan, Xu Han, Xiaodong Shi, Zhiyuan Liu, Maosong Sun:
LLM×MapReduce: Simplified Long-Sequence Processing using Large Language Models. 27664-27678 - Dennis Hein, Zhihong Chen, Sophie Ostmeier, Justin Xu, Maya Varma, Eduardo Pontes Reis, Arne Edward Michalson, Christian Bluethgen, Hyun Joo Shin, Curtis P. Langlotz, Akshay S. Chaudhari:
CheXalign: Preference fine-tuning in chest X-ray interpretation models without human feedback. 27679-27702 - Doyoun Kim, Suin Kim, Yohan Jo:
Knowledge Tracing in Programming Education Integrating Students' Questions. 27703-27718 - Yiqun Sun, Qiang Huang, Anthony Kum Hoe Tung, Jun Yu:
PRISM: A Framework for Producing Interpretable Political Bias Embeddings with Political-Aware Cross-Encoder. 27719-27733 - Meng Li, Michael Vrazitulis, David Schlangen:
Representations of Fact, Fiction and Forecast in Large Language Models: Epistemics and Attitudes. 27734-27757 - Zhange Zhang, Yuqing Ma, Yulong Wang, Shan He, Tianbo Wang, Siqi He, Jiakai Wang, Xianglong Liu:
Lexical Diversity-aware Relevance Assessment for Retrieval-Augmented Generation. 27758-27781 - Juntian Zhang, Chuanqi Cheng, Yuhan Liu, Wei Liu, Jian Luan, Rui Yan:
Weaving Context Across Images: Improving Vision-Language Models through Focus-Centric Visual Chains. 27782-27798 - Ting Xiao, Lei Shi, Yang Zhang, HaoFeng Yang, Zhe Wang, Chenjia Bai:
Online Iterative Self-Alignment for Radiology Report Generation. 27799-27814 - Yifeng Wang, Yi Zhao:
Chinese Inertial GAN for Handwriting Signal Generation and Recognition. 27815-27832 - Haoyang Li, Huan Gao, Zhiyuan Zhao, Zhiyu Lin, Junyu Gao, Xuelong Li:
LLMs Caught in the Crossfire: Malware Requests and Jailbreak Challenges. 27833-27848 - Enrique Amigó, Elena Álvarez Mellado, Julio Gonzalo, Jorge Carrillo-de-Albornoz:
Evaluating Sequence Labeling on the basis of Information Theory. 27849-27860 - Xianshu Peng, Wei Wei:
GRAT: Guiding Retrieval-Augmented Reasoning through Process Rewards Tree Search. 27861-27875 - Wenxuan Zhou, Shujian Zhang, Lingxiao Zhao, Tao Meng:
T-REG: Preference Optimization with Token-Level Reward Regularization. 27876-27889 - Xunjian Yin, Xinyi Wang, Liangming Pan, Li Lin, Xiaojun Wan, William Yang Wang:
Gödel Agent: A Self-Referential Agent Framework for Recursively Self-Improvement. 27890-27913 - Zhiheng Xi, Yiwen Ding, Wenxiang Chen, Boyang Hong, Honglin Guo, Junzhe Wang, Xin Guo, Dingwen Yang, Chenyang Liao, Wei He, Songyang Gao, Lu Chen, Rui Zheng, Yicheng Zou, Tao Gui, Qi Zhang, Xipeng Qiu, Xuanjing Huang, Zuxuan Wu, Yu-Gang Jiang:
AgentGym: Evaluating and Training Large Language Model-based Agents across Diverse Environments. 27914-27961 - Yexiang Liu, Zekun Li, Zhi Fang, Nan Xu, Ran He, Tieniu Tan:
Rethinking the Role of Prompting Strategies in LLM Test-Time Scaling: A Perspective of Probability Theory. 27962-27994 - Taiga Someya, Anej Svete, Brian DuSell, Timothy J. O'Donnell, Mario Giulianelli, Ryan Cotterell:
Information Locality as an Inductive Bias for Neural Language Models. 27995-28013 - Adrián Bazaga, Rexhina Blloshmi, Bill Byrne, Adrià de Gispert:
Learning to Reason Over Time: Timeline Self-Reflection for Improved Temporal Reasoning in Language Models. 28014-28033 - Massimiliano Pronesti, Joao H. Bettencourt-Silva, Paul Flanagan, Alessandra Pascale, Oisin Redmond, Anya Belz, Yufang Hou:
Query-driven Document-level Scientific Evidence Extraction from Biomedical Studies. 28034-28051 - Jizhao Zhu, Akang Shi, Zixuan Li, Long Bai, Xiaolong Jin, Jiafeng Guo, Xueqi Cheng:
Towards Robust Universal Information Extraction: Dataset, Evaluation, and Solution. 28052-28070 - Huiyuan Lai, Esther Ploeger, Rik van Noord, Antonio Toral:
Multi-perspective Alignment for Increasing Naturalness in Neural Machine Translation. 28071-28084 - Jiayu Song, Mahmud Elahi Akhter, Dana Atzil-Slonim, Maria Liakata:
Temporal reasoning for timeline summarisation in social media. 28085-28101 - Tina Lommel, Elisabeth Eder, Josef Ruppenhofer, Michael Wiegand:
Beyond Negative Stereotypes - Non-Negative Abusive Utterances about Identity Groups and Their Semantic Variants. 28102-28120 - Manuel D. S. Hopp, Vincent Labatut, Arthur Amalvy, Richard Dufour, Hannah Stone, Hayley K. Jach, Kou Murayama:
Persistent Homology of Topic Networks for the Prediction of Reader Curiosity. 28121-28132 - Philip Whittington, Gregor Bachmann, Tiago Pimentel:
Tokenisation is NP-Complete. 28133-28153 - Andrei Mircea, Supriyo Chakraborty, Nima Chitsazan, Irina Rish, Ekaterina Lobacheva:
Training Dynamics Underlying Language Model Scaling Laws: Loss Deceleration and Zero-Sum Learning. 28154-28188 - Songlin Zhai, Yuan Meng, Yuxin Zhang, Guilin Qi:
Parameter-Aware Contrastive Knowledge Editing: Tracing and Rectifying based on Critical Transmission Paths. 28189-28200 - Haoyang Su, Renqi Chen, Shixiang Tang, Zhenfei Yin, Xinzhe Zheng, Jinzhe Li, Biqing Qi, Qi Wu, Hui Li, Wanli Ouyang, Philip Torr, Bowen Zhou, Nanqing Dong:
Many Heads Are Better Than One: Improved Scientific Idea Generation by A LLM-Based Multi-Agent System. 28201-28240 - Yilong Chen, Junyuan Shang, Zhenyu Zhang, Yanxi Xie, Jiawei Sheng, Tingwen Liu, Shuohuan Wang, Yu Sun, Hua Wu, Haifeng Wang:
Inner Thinking Transformer: Leveraging Dynamic Depth Scaling to Foster Adaptive Internal Thinking. 28241-28259 - Yuu Jinnai:
Document-Level Text Generation with Minimum Bayes Risk Decoding using Optimal Transport. 28260-28279 - Minseok Choi, Daniel Rim, Dohyun Lee, Jaegul Choo:
Opt-Out: Investigating Entity-Level Unlearning for Large Language Models via Optimal Transport. 28280-28297 - Ziheng Qiao, Houquan Zhou, Zhenghua Li:
Mixture of Small and Large Models for Chinese Spelling Check. 28298-28311 - Ziheng Qiao, Houquan Zhou, Yumeng Liu, Zhenghua Li, Min Zhang, Bo Zhang, Chen Li, Ji Zhang, Fei Huang:
DISC: Plug-and-Play Decoding Intervention with Similarity of Characters for Chinese Spelling Check. 28312-28324 - Pietro Lesci, Clara Meister, Thomas Hofmann, Andreas Vlachos, Tiago Pimentel:
Causal Estimation of Tokenisation Bias. 28325-28340 - Zhanchao Zhou, Tianyi Wu, Zhiyun Jiang, Fares Obeid, Zhenzhong Lan:
Value Residual Learning. 28341-28356 - Guanhua Chen, Yutong Yao, Lidia S. Chao, Xuebo Liu, Derek F. Wong:
SGIC: A Self-Guided Iterative Calibration Framework for RAG. 28357-28370 - Muhammad Farid Adilazuarda, Musa Izzanardi Wijanarko, Lucky Susanto, Khumaisa Nur'aini, Derry Tanti Wijaya, Alham Fikri Aji:
NusaAksara: A Multimodal and Multilingual Benchmark for Preserving Indonesian Indigenous Scripts. 28371-28401 - Zhiliang Tian, Jingyuan Huang, Zejiang He, Zhen Huang, Menglong Lu, Linbo Qiao, Songzhu Mei, Yijie Wang, Dongsheng Li:
LLM-based Rumor Detection via Influence Guided Sample Selection and Game-based Perspective Analysis. 28402-28414 - Ziqi Jia, Anmin Wang, Xiaoyang Qu, Xiaowen Yang, Jianzong Wang:
Hierarchical-Task-Aware Multi-modal Mixture of Incremental LoRA Experts for Embodied Continual Learning. 28415-28427 - Zicong Tang, Luohe Shi, Zuchao Li, Baoyuan Qi, Guoming Liu, Lefei Zhang, Ping Wang:
SpindleKV: A Novel KV Cache Reduction Method Balancing Both Shallow and Deep Layers. 28428-28442 - Junde Wu, Jiayuan Zhu, Yunli Qi, Jingkun Chen, Min Xu, Filippo Menolascina, Yueming Jin, Vicente Grau:
Medical Graph RAG: Evidence-based Medical Large Language Model via Graph Retrieval-Augmented Generation. 28443-28467 - Seungcheol Park, Jeongin Bae, Beomseok Kwon, Minjun Kim, Byeongwook Kim, Se Jung Kwon, U Kang, Dongsoo Lee:
Unifying Uniform and Binary-coding Quantization for Accurate Compression of Large Language Models. 28468-28488 - Junde Wu, Jiayuan Zhu, Yuyuan Liu, Min Xu, Yueming Jin:
Agentic Reasoning: A Streamlined Framework for Enhancing LLM Reasoning with Agentic Tools. 28489-28503 - Chenxiao Li, Jingwei Cheng, Qiang Tong, Fu Zhang, Cairui Wang:
Probing Relative Interaction and Dynamic Calibration in Multi-modal Entity Alignment. 28504-28516 - Guangyue Peng, Tao Ge, Wen Luo, Wei Li, Houfeng Wang:
Learn to Memorize: Scalable Continual Learning in Semiparametric Models with Mixture-of-Neighbors Induction Memory. 28517-28531 - Imane Guellil, Salomé Andres, Atul Anand, Bruce Guthrie, Huayu Zhang, Abul Hasan, Honghan Wu, Beatrice Alex:
Adverse Event Extraction from Discharge Summaries: A New Dataset, Annotation Scheme, and Initial Findings. 28532-28562 - Longhui Zhang, Jiahao Wang, Meishan Zhang, GaoXiong Cao, Ensheng Shi, Mayuchi Mayuchi, Jun Yu, Honghai Liu, Jing Li, Min Zhang:
Speed Up Your Code: Progressive Code Acceleration Through Bidirectional Tree Editing. 28563-28576 - Heejin Do, Sangwon Ryu, Jonghwi Kim, Gary Lee:
Multi-Facet Blending for Faceted Query-by-Example Retrieval. 28577-28590 - Zhicong Lu, Changyuan Tian, Peiguang Li, Li Jin, Sirui Wang, Wei Jia, Ying Shen, Guangluan Xu:
PIPER: Benchmarking and Prompting Event Reasoning Boundary of LLMs via Debiasing-Distillation Enhanced Tuning. 28591-28613 - Aniketh Garikaparthi, Manasi Patwardhan, Aditya Sanjiv Kanade, Aman Hassan, Lovekesh Vig, Arman Cohan:
MIR: Methodology Inspiration Retrieval for Scientific Research Problems. 28614-28659 - Kexin Chen, Dongxia Wang, Yi Liu, Haonan Zhang, Wenhai Wang:
Sticking to the Mean: Detecting Sticky Tokens in Text Embedding Models. 28660-28681 - Ruoxi Xu, Yunjie Ji, Boxi Cao, Yaojie Lu, Hongyu Lin, Xianpei Han, Ben He, Yingfei Sun, Xiangang Li, Le Sun:
Memorizing is Not Enough: Deep Knowledge Injection Through Reasoning. 28682-28693 - Haesung Pyun, Yoonah Park, Yohan Jo:
Improving Dialogue State Tracking through Combinatorial Search for In-Context Examples. 28694-28714 - Yuhong Dai, Jianxun Lian, Yitian Huang, Wei Zhang, Mingyang Zhou, Mingqi Wu, Xing Xie, Hao Liao:
Pretraining Context Compressor for Large Language Models with Embedding-Based Memory. 28715-28732 - Juhee Kim, Chunghu Mok, Jisun Lee, Hyang Sook Kim, Yohan Jo:
Dialogue Systems for Emotional Support via Value Reinforcement. 28733-28766 - Yuqi Zhou, Sunhao Dai, Zhanshuo Cao, Xiao Zhang, Jun Xu:
Length-Induced Embedding Collapse in PLM-based Models. 28767-28791 - Shester Gueuwou, Xiaodan Du, Greg Shakhnarovich, Karen Livescu, Alexander H. Liu:
SHuBERT: Self-Supervised Sign Language Representation Learning via Multi-Stream Cluster Prediction. 28792-28810 - Lam Thanh Do, Aaditya Bodke, Pritom Saha Akash, Kevin Chen-Chuan Chang:
ERU-KG: Efficient Reference-aligned Unsupervised Keyphrase Generation. 28811-28829 - Suvodip Dey, Yi-Jyun Sun, Gokhan Tur, Dilek Hakkani-Tür:
Know Your Mistakes: Towards Preventing Overreliance on Task-Oriented Conversational AI Through Accountability Modeling. 28830-28843 - Yuxuan Li, Xinwei Guo, Jiashi Gao, Guanhua Chen, Xiangyu Zhao, Jiaxin Zhang, Quanying Liu, Haiyan Wu, Xin Yao, Xuetao Wei:
LLMs Trust Humans More, That's a Problem! Unveiling and Mitigating the Authority Bias in Retrieval-Augmented Generation. 28844-28858 - Dongsheng Zhu, Weixian Shi, Zhengliang Shi, Zhaochun Ren, Shuaiqiang Wang, Lingyong Yan, Dawei Yin:
Divide-Then-Aggregate: An Efficient Tool Learning Method via Parallel Tool Invocation. 28859-28875 - Yuyi Zhang, Peirong Zhang, Zhenhua Yang, Pengyu Yan, Yongxin Shi, Pengwei Liu, Fengjun Guo, Lianwen Jin:
Reviving Cultural Heritage: A Novel Approach for Comprehensive Historical Document Restoration. 28876-28892 - Zekun Moore Wang, Shenzhi Wang, King Zhu, Jiaheng Liu, Ke Xu, Jie Fu, Wangchunshu Zhou, Wenhao Huang:
PopAlign: Diversifying Contrasting Patterns for a More Comprehensive Alignment. 28893-28921 - Tianyu Yang, Xiaodan Zhu, Iryna Gurevych:
Robust Utility-Preserving Text Anonymization Based on Large Language Models. 28922-28941 - Changhun Lee, Minsang Seok, Jungyu Jin, Younghyun Cho, Eunhyeok Park:
SEAL: Scaling to Emphasize Attention for Long-Context Retrieval. 28942-28955 - Chongxuan Huang, Yongshi Ye, Biao Fu, Qifeng Su, Xiaodong Shi:
From Neurons to Semantics: Evaluating Cross-Linguistic Alignment Capabilities of Large Language Models via Neurons Alignment. 28956-28974 - Yue Wang, Haoke Zhang, Juntao Li, Jinxiong Chang, Min Zhang:
𝒜³: Automatic Alignment Framework for Attributed Text Generation. 28975-28990 - Bingbing Xu, Jing Yao, Xiaoyuan Yi, Aishan Maoliniyazi, Xing Xie, Xiaofeng Meng:
Towards Better Value Principles for Large Language Model Alignment: A Systematic Evaluation and Enhancement. 28991-29010 - Arvid Frydenlund:
Language Models, Graph Searching, and Supervision Adulteration: When More Supervision is Less and How to Make More More. 29011-29059 - Hidetaka Kamigaito, Hiroyuki Deguchi, Yusuke Sakai, Katsuhiko Hayashi, Taro Watanabe:
Diversity Explains Inference Scaling Laws: Through a Case Study of Minimum Bayes Risk Decoding. 29060-29094 - Ido Cohen, Daniela Gottesman, Mor Geva, Raja Giryes:
Performance Gap in Entity Knowledge Extraction Across Modalities in Vision Language Models. 29095-29108 - Zixuan Chen, Weikai Lu, Xin Lin, Ziqian Zeng:
SDD: Self-Degraded Defense against Malicious Fine-tuning. 29109-29125 - Wei-Hsin Yeh, Yu-An Su, Chih-Ning Chen, Yi-Hsueh Lin, Calvin Ku, Wenhsin Chiu, Min-Chun Hu, Lun-Wei Ku:
CoachMe: Decoding Sport Elements with a Reference-Based Coaching Instruction Generation Model. 29126-29151 - Hexuan Deng, Wenxiang Jiao, Xuebo Liu, Jing Li, Min Zhang, Zhaopeng Tu:
DRPruning: Efficient Large Language Model Pruning through Distributionally Robust Optimization. 29152-29173 - Karin de Langis, Jong Inn Park, Andreas Schramm, Bin Hu, Khanh Chi Le, Dongyeop Kang:
How LLMs Comprehend Temporal Meaning in Narratives: A Case Study in Cognitive Evaluation of LLMs. 29174-29191 - Nicholas Deas, Blake Vente, Amith Ananthram, Jessica Grieser, Desmond Upton Patton, Shana Kleiner, James R. Shepard III, Kathleen McKeown:
Data Caricatures: On the Representation of African American Language in Pretraining Corpora. 29192-29217 - Charles Lovering, Michael Krumdick, Viet Dac Lai, Varshini Reddy, Seth Ebner, Nilesh Kumar, Rik Koncel-Kedziorski, Chris Tanner:
Language Model Probabilities are Not Calibrated in Numeric Contexts. 29218-29257 - Gabrielle Kaili-May Liu, Bowen Shi, Avi Caciularu, Idan Szpektor, Arman Cohan:
MDCure: A Scalable Pipeline for Multi-Document Instruction-Following. 29258-29296 - Sumanth Doddapaneni, Mohammed Safi Ur Rahman Khan, Dilip Venkatesh, Raj Dabre, Anoop Kunchukuttan, Mitesh M. Khapra:
Cross-Lingual Auto Evaluation for Assessing Multilingual LLMs. 29297-29329 - Minjun Zhu, Yixuan Weng, Linyi Yang, Yue Zhang:
DeepReview: Improving LLM-based Paper Review with Human-like Deep Thinking Process. 29330-29355 - Yuan Gao, Zujing Liu, Weizhong Zhang, Bo Du, Gui-Song Xia:
Bypass Back-propagation: Optimization-based Structural Pruning for Large Language Models via Policy Gradient. 29356-29377 - Priyanka Kargupta, Ishika Agarwal, Tal August, Jiawei Han:
Tree-of-Debate: Multi-Persona Debate Trees Elicit Critical Thinking for Scientific Comparative Analysis. 29378-29403 - Eugene J. Yu, Dawei Zhu, Yifan Song, Xiangyu Wong, Jiebin Zhang, Wenxuan Shi, Xiaoguang Li, Qun Liu, Sujian Li:
Hierarchical Memory Organization for Wikipedia Generation. 29404-29427 - Chenlu Wang, Weimin Lyu, Ritwik Banerjee:
Class Distillation with Mahalanobis Contrast: An Efficient Training Paradigm for Pragmatic Language Understanding Tasks. 29428-29442 - Kai Liu, Ze Chen, Zhihang Fu, Wei Zhang, Rongxin Jiang, Fan Zhou, Yaowu Chen, Yue Wu, Jieping Ye:
Structure-aware Domain Knowledge Injection for Large Language Models. 29443-29464 - Junyu Luo, Zhizhuo Kou, Liming Yang, Xiao Luo, Jinsheng Huang, Zhiping Xiao, Jingshu Peng, Chengzhong Liu, Jiaming Ji, Xuanzhe Liu, Sirui Han, Ming Zhang, Yike Guo:
FinMME: Benchmark Dataset for Financial Multi-Modal Reasoning Evaluation. 29465-29489 - Amirbek Djanibekov, Hawau Olamide Toyin, Raghad Alshalan, Abdullah Alatir, Hanan Aldarmaki:
Dialectal Coverage And Generalization in Arabic Speech Recognition. 29490-29502 - Ron Yosef, Yonatan Bitton, Dani Lischinski, Moran Yanuka:
EditInspector: A Benchmark for Evaluation of Text-Guided Image Edits. 29503-29530 - Yavuz Faruk Bakman, Duygu Nur Yaldiz, Sungmin Kang, Tuo Zhang, Baturalp Buyukates, Salman Avestimehr, Sai Praneeth Karimireddy:
Reconsidering LLM Uncertainty Estimation Methods in the Wild. 29531-29556 - Caio Corro, Mathieu Lacroix, Joseph Le Roux:
Bregman Conditional Random Fields: Sequence Labeling with Parallelizable Inference Algorithms. 29557-29574 - Wendi Cui, Jiaxin Zhang, Zhuohang Li, Hao Sun, Damien Lopez, Kamalika Das, Bradley A. Malin, Kumar Sricharan:
SEE: Strategic Exploration and Exploitation for Cohesive In-Context Prompt Optimization. 29575-29627 - Atharva Naik, Darsh Agrawal, Hong Sng, Clayton Marr, Kexun Zhang, Nathaniel Romney Robinson, Kalvin Chang, Rebecca Byrnes, Aravind Mysore, Carolyn P. Rosé, David R. Mortensen:
Programming by Example meets Historical Linguistics: A Large Language Model Based Approach to Sound Law Induction. 29628-29647 - Priyanka Kargupta, Yunyi Zhang, Yizhu Jiao, Siru Ouyang, Jiawei Han:
Synergizing Unsupervised Episode Detection with LLMs for Large-Scale News Events. 29648-29663 - Priyanka Kargupta, Runchu Tian, Jiawei Han:
Beyond True or False: Retrieval-Augmented Hierarchical Analysis of Nuanced Claims. 29664-29679 - Feiran Jia, Tong Wu, Xin Qin, Anna Cinzia Squicciarini:
The Task Shield: Enforcing Task Alignment to Defend Against Indirect Prompt Injection in LLM Agents. 29680-29697 - Fabrice Harel-Canada, Boran Erol, Connor Choi, Jason Liu, Gary Jiarui Song, Nanyun Peng, Amit Sahai:
Sandcastles in the Storm: Revisiting the (Im)possibility of Strong Watermarking. 29698-29735 - Yaxuan Kong, Yiyuan Yang, Yoontae Hwang, Wenjie Du, Stefan Zohren, Zhangyang Wang, Ming Jin, Qingsong Wen:
Time-MQA: Time Series Multi-Task Question Answering with Context Enhancement. 29736-29753 - Ruxiao Chen, Chenguang Wang, Yuran Sun, Xilei Zhao, Susu Xu:
From Perceptions to Decisions: Wildfire Evacuation Decision Prediction with Behavioral Theory-informed LLMs. 29754-29778 - Shikhhar Siingh, Abhinav Rawat, Chitta Baral, Vivek Gupta:
GETReason: Enhancing Image Context Extraction through Hierarchical Multi-Agent Reasoning. 29779-29800 - Vivian Nguyen, Lillian Lee, Cristian Danescu-Niculescu-Mizil:
Hanging in the Balance: Pivotal Moments in Crisis Counseling Conversations. 29801-29817 - Yisheng Xiao, Juntao Li, Wenpeng Hu, Zhunchen Luo, Min Zhang:
Unveiling the Potential of BERT-family: A New Recipe for Building Scalable, General and Competitive Large Language Models. 29818-29833 - Priyanka Kargupta, Nan Zhang, Yunyi Zhang, Rui Zhang, Prasenjit Mitra, Jiawei Han:
TaxoAdapt: Aligning LLM-Based Multidimensional Taxonomy Construction to Evolving Research Corpora. 29834-29850 - Yisheng Xiao, Pei Guo, Zechen Sun, Juntao Li, Kai Song, Min Zhang:
An Empirical Study of Iterative Refinements for Non-autoregressive Translation. 29851-29865 - Darius Feher, Ivan Vulic, Benjamin Minixhofer:
Retrofitting Large Language Models with Dynamic Tokenization. 29866-29883 - Vishakh Padmakumar, Zichao Wang, David Arbour, Jennifer Healey:
Principled Content Selection to Generate Diverse and Personalized Multi-Document Summaries. 29884-29899 - Chenye Zhao, Cornelia Caragea:
Bilingual Zero-Shot Stance Detection. 29900-29919 - Rita Ramos, Everlyn Asiko Chimoto, Maartje ter Hoeve, Natalie Schluter:
GrammaMT: Improving Machine Translation with Grammar-Informed In-Context Learning. 29920-29940 - Joshua Ong Jun Leang, Giwon Hong, Wenda Li, Shay B. Cohen:
Theorem Prover as a Judge for Synthetic Data Generation. 29941-29977 - Ori Shapira, Shlomo E. Chazan, Amir David Nissan Cohen:
Measuring the Effect of Transcription Noise on Downstream Language Understanding Tasks. 29978-30004 - Reto Gubelmann, Ghassen Karray:
Assessing Reliability and Political Bias In LLMs' Judgements of Formal and Material Inferences With Partisan Conclusions. 30005-30031 - Sina Ahmadi, Rico Sennrich, Erfan Karami, Ako Marani, Parviz Fekrazad, Gholamreza Akbarzadeh Baghban, Hanah Hadi, Semko Heidari, Mahîr Dogan, Pedram Asadi, Dashne Bashir, Mohammad Amin Ghodrati, Kourosh Amini, Zeynab Ashourinezhad, Mana Baladi, Farshid Ezzati, Alireza Ghasemifar, Daryoush Hosseinpour, Behrooz Abbaszadeh, Amin Hassanpour, Bahaddin Jalal Hamaamin, Saya Kamal Hama, Ardeshir Mousavi, Sarko Nazir Hussein, Isar Nejadgholi, Mehmet Ölmez, Horam Osmanpour, Rashid Roshan Ramezani, Aryan Sediq Aziz, Ali Salehi, Mohammadreza Yadegari, Kewyar Yadegari, Sedighe Zamani Roodsari:
PARME: Parallel Corpora for Low-Resourced Middle Eastern Languages. 30032-30053 - Bingxuan Li, Yiwei Wang, Jiuxiang Gu, Kai-Wei Chang, Nanyun Peng:
METAL: A Multi-Agent Framework for Chart Generation with Test-Time Scaling. 30054-30069 - Sina Ahmadi, Micha David Hess, Elena Álvarez Mellado, Alessia Battisti, Cui Ding, Anne Göhring, Yingqiang Gao, Zifan Jiang, Andrianos Michail, Peshmerge Morad, Joel Niklaus, Maria Christina Panagiotopoulou, Stefano Perrella, Juri Opitz, Anastassia Shaitarova, Rico Sennrich:
ConLoan: A Contrastive Multilingual Dataset for Evaluating Loanwords. 30070-30090 - Sarath Sivaprasad, Pramod Kaushik, Sahar Abdelnabi, Mario Fritz:
A Theory of Response Sampling in LLMs: Part Descriptive and Part Prescriptive. 30091-30135 - Jingxuan Zhang, Zhenhua Xu, Rui Hu, Wenpeng Xing, Xuhong Zhang, Meng Han:
MEraser: An Effective Fingerprint Erasure Approach for Large Language Models. 30136-30153 - Xueguang Ma, Shengyao Zhuang, Bevan Koopman, Guido Zuccon, Wenhu Chen, Jimmy Lin:
VISA: Retrieval Augmented Generation with Visual Source Attribution. 30154-30169 - Xueguang Ma, Xi Victoria Lin, Barlas Oguz, Jimmy Lin, Wen-tau Yih, Xilun Chen:
DRAMA: Diverse Augmentation from Large Language Models to Smaller Dense Retrievers. 30170-30186 - Ziling Cheng, Meng Cao, Marc-Antoine Rondeau, Jackie CK Cheung:
Stochastic Chameleons: Irrelevant Context Hallucinations Reveal Class-Based (Mis)Generalization in LLMs. 30187-30214 - Chanwoo Park, Seungju Han, Xingzhi Guo, Asuman E. Ozdaglar, Kaiqing Zhang, Joo-Kyung Kim:
MAPoRL: Multi-Agent Post-Co-Training for Collaborative Large Language Models with Reinforcement Learning. 30215-30248 - Naman Ahuja, Fenil Denish Bardoliya, Chitta Baral, Vivek Gupta:
Map&Make: Schema Guided Text to Table Generation. 30249-30262 - Fengnan Li, Elliot D. Hill, Jiang Shu, Jiaxin Gao, Matthew M. Engelhard:
IRIS: Interpretable Retrieval-Augmented Classification for Long Interspersed Document Sequences. 30263-30283 - Shengguang Wu, Fan-Yun Sun, Kaiyue Wen, Nick Haber:
Symmetrical Visual Contrastive Optimization: Aligning Vision-Language Models with Minimal Contrastive Images. 30284-30297 - Peter Baile Chen, Yi Zhang, Mike Cafarella, Dan Roth:
Can we Retrieve Everything All at Once? ARM: An Alignment-Oriented LLM-based Retrieval Method. 30298-30317 - Tenghao Huang, Kinjal Basu, Ibrahim Abdelaziz, Pavan Kapanipathi, Jonathan May, Muhao Chen:
R2D2: Remembering, Replaying and Dynamic Decision Making with a Reflective Agentic Memory. 30318-30330 - Janki Atul Nawale, Mohammed Safi Ur Rahman Khan, Janani D, Mansi Gupta, Danish Pruthi, Mitesh M. Khapra:
FairI Tales: Evaluation of Fairness in Indian Contexts with a Focus on Bias and Stereotypes. 30331-30380 - Zhen Wan, Chao-Han Huck Yang, Yahan Yu, Jinchuan Tian, Sheng Li, Ke Hu, Zhehuai Chen, Shinji Watanabe, Fei Cheng, Chenhui Chu, Sadao Kurohashi:
SpeechIQ: Speech-Agentic Intelligence Quotient Across Cognitive Levels in Voice Understanding by Large Language Models. 30381-30398 - Anil Batra, Laura Sevilla-Lara, Marcus Rohrbach, Frank Keller:
Predicting Implicit Arguments in Procedural Video Instructions. 30399-30419 - Hao Li, Xiaogeng Liu, Ning Zhang, Chaowei Xiao:
PIGuard: Prompt Injection Guardrail via Mitigating Overdefense for Free. 30420-30437 - Tianyu Yang, Lisen Dai, Xiangqi Wang, Minhao Cheng, Yapeng Tian, Xiangliang Zhang:
CLIPErase: Efficient Unlearning of Visual-Textual Associations in CLIP. 30438-30452 - Austin T. Wang, ZeMing Gong, Angel X. Chang:
ViGiL3D: A Linguistically Diverse Dataset for 3D Visual Grounding. 30453-30475 - Tamar I. Regev, Chiebuka Ohams, Shaylee Xie, Lukas Wolf, Evelina Fedorenko, Alex Warstadt, Ethan Wilcox, Tiago Pimentel:
The time scale of redundancy between prosody and linguistic context. 30476-30488 - Zhi Zhou, Sirui Miao, Xiangyu Duan, Hao Yang, Min Zhang:
Basic Reading Distillation. 30489-30502 - Mingyu Zhong, Guanchu Wang, Yu-Neng Chuang, Na Zou:
Quantized Can Still Be Calibrated: A Unified Framework to Calibration in Quantized Large Language Models. 30503-30517 - Francesco Ignazio Re, Andreas Opedal, Glib Manaiev, Mario Giulianelli, Ryan Cotterell:
A Spatio-Temporal Point Process for Fine-Grained Modeling of Reading Behavior. 30518-30538 - Xiaoqing Zhang, Ang Lv, Yuhan Liu, Flood Sung, Wei Liu, Jian Luan, Shuo Shang, Xiuying Chen, Rui Yan:
More is not always better? Enhancing Many-Shot In-Context Learning with Differentiated and Reweighting Objectives. 30539-30552 - Fei Wang, Xingchen Wan, Ruoxi Sun, Jiefeng Chen, Sercan Ö. Arik:
Astute RAG: Overcoming Imperfect Retrieval Augmentation and Knowledge Conflicts for Large Language Models. 30553-30571 - Gayathri Saranathan, Cong Xu, Mahammad Parwez Alam, Tarun Kumar, Martin Foltin, Soon Yee Wong, Suparna Bhattacharya:
SubLIME: Subset Selection via Rank Correlation Prediction for Data-Efficient LLM Evaluation. 30572-30593 - Boci Peng, Yongchao Liu, Xiaohe Bo, Jiaxin Guo, Yun Zhu, Xuanbo Fan, Chuntao Hong, Yan Zhang:
M³GQA: A Multi-Entity Multi-Hop Multi-Setting Graph Question Answering Benchmark. 30594-30620 - Guanghao Zhou, Panjia Qiu, Cen Chen, Hongyu Li, Jason Chu, Xin Zhang, Jun Zhou:
LSSF: Safety Alignment for Large Language Models through Low-Rank Safety Subspace Fusion. 30621-30638 - Kishan Maharaj, Vitobha Munigala, Srikanth G. Tamilselvam, Prince Kumar, Sayandeep Sen, Palani Kodeswaran, Abhijit Mishra, Pushpak Bhattacharyya:
ETF: An Entity Tracing Framework for Hallucination Detection in Code Summaries. 30639-30652 - Shengqian Qin, Yakun Zhu, Linjie Mu, Shaoting Zhang, Xiaofan Zhang:
Meta-Tool: Unleash Open-World Function Calling Capabilities of General-Purpose Large Language Models. 30653-30677 - Yingjie Zhu, Xuefeng Bai, Kehai Chen, Yang Xiang, Jun Yu, Min Zhang:
Benchmarking and Improving Large Vision-Language Models for Fundamental Visual Graph Understanding and Reasoning. 30678-30701 - Zhuocheng Yu, Bingchan Zhao, Yifan Song, Sujian Li, Zhonghui He:
ISR: Self-Refining Referring Expressions for Entity Grounding. 30702-30714 - Siyuan Wang, Dianyi Wang, Chengxing Zhou, Zejun Li, Zhihao Fan, Xuanjing Huang, Zhongyu Wei:
Activating Distributed Visual Region within LLMs for Efficient and Effective Vision-Language Training and Inference. 30715-30727 - Yongheng Zhang, Xu Liu, Ruoxi Zhou, Qiguang Chen, Hao Fei, Wenpeng Lu, Libo Qin:
CCHall: A Novel Benchmark for Joint Cross-Lingual and Cross-Modal Hallucinations Detection in Large Language Models. 30728-30749 - Henry Peng Zou, Zhengyao Gu, Yue Zhou, Yankai Chen, Weizhi Zhang, Liancheng Fang, Yibo Wang, Yangning Li, Kay Liu, Philip S. Yu:
TestNUC: Enhancing Test-Time Computing Approaches and Scaling through Neighboring Unlabeled Data Consistency. 30750-30762 - Jenalea Rajab, Anuoluwapo Aremu, Everlyn Asiko Chimoto, Dale Dunbar, Graham Morrissey, Fadel Thior, Luandrie Potgieter, Jessica Ojo, Atnafu Lambebo Tonja, Wilhelmina Ndapewa Onyothi Nekoto, Pelonomi Moiloa, Jade Z. Abbott, Vukosi Marivate, Benjamin Rosman:
The Esethu Framework: Reimagining Sustainable Dataset Governance and Curation for Low-Resource Languages. 30763-30776 - Daichi Hayakawa, Issei Sato:
Theoretical Analysis of Hierarchical Language Recognition and Generation by Transformers without Positional Encoding. 30777-30834 - James C. Douglas, Yidong Gan, Ben Hachey, Jonathan K. Kummerfeld:
Less is More: Explainable and Efficient ICD Code Prediction with Clinical Entities. 30835-30847 - Alperen Yildiz, Sin G. Teo, Yiling Lou, Yebo Feng, Chong Wang, Dinil Mon Divakaran:
Benchmarking LLMs and LLM-based Agents in Practical Vulnerability Detection for Code Repositories. 30848-30865 - Junlin Li, Guodong Du, Jing Li, Sim Kuan Goh, Wenya Wang, Yequan Wang, Fangming Liu, Ho-Kin Tang, Saleh Alharbi, Daojing He, Min Zhang:
Multi-Modality Expansion and Retention for LLMs through Parameter Merging and Decoupling. 30866-30887 - YuJu Cheng, Yu-Chu Yu, Kai-Po Chang, Yu-Chiang Frank Wang:
Serial Lifelong Editing via Mixture of Knowledge Experts. 30888-30903 - Junyu Luo, Bohan Wu, Xiao Luo, Zhiping Xiao, Yiqiao Jin, Rong-Cheng Tu, Nan Yin, Yifan Wang, Jingyang Yuan, Wei Ju, Ming Zhang:
A Survey on Efficient Large Language Model Training: From Data-centric Perspectives. 30904-30920 - Zhi Zeng, Jiaying Wu, Minnan Luo, Herun Wan, Xiangzheng Kong, Zihan Ma, Guang Dai, Qinghua Zheng:
IMOL: Incomplete-Modality-Tolerant Learning for Multi-Domain Fake News Video Detection. 30921-30933 - Qian Wu, Zheyao Gao, Longfei Gou, Qi Dou:
DDxTutor: Clinical Reasoning Tutoring System with Differential Diagnosis-Based Structured Reasoning. 30934-30957 - Jinfeng Zhou, Yuxuan Chen, Yihan Shi, Xuanming Zhang, Leqi Lei, Yi Feng, Zexuan Xiong, Miao Yan, Xunzhi Wang, Yaru Cao, Jianing Yin, Shuai Wang, Quanyu Dai, Zhenhua Dong, Hongning Wang, Minlie Huang:
SocialEval: Evaluating Social Intelligence of Large Language Models. 30958-31012 - Md Messal Monem Miah, Adrita Anika, Xi Shi, Ruihong Huang:
Hidden in Plain Sight: Evaluation of the Deception Detection Capabilities of LLMs in Multimodal Settings. 31013-31034 - Wenrui Liu, Zhifang Guo, Jin Xu, Yuanjun Lv, Yunfei Chu, Zemin Liu, Junyang Lin:
Analyzing and Mitigating Inconsistency in Discrete Speech Tokens for Neural Codec Language Models. 31035-31046 - Zihan Zheng, Tianle Cui, Chuwen Xie, Jiahui Pan, Qianglong Chen, Lewei He:
PlanningArena: A Modular Benchmark for Multidimensional Evaluation of Planning and Tool Learning. 31047-31086 - Zhenyu Li, Yike Zhang, Tengyu Pan, Yutao Sun, Zhichao Duan, Junjie Fang, Rong Han, Zixuan Wang, Jianyong Wang:
FocusLLM: Precise Understanding of Long Context by Dynamic Condensing. 31087-31101 - Tengyu Pan, Zhichao Duan, Zhenyu Li, Bowen Dong, Ning Liu, Xiuxing Li, Jianyong Wang:
Negative Matters: Multi-Granularity Hard-Negative Synthesis and Anchor-Token-Aware Pooling for Enhanced Text Embeddings. 31102-31118 - Alessandro Vanzo, Sankalan Pal Chowdhury, Mrinmaya Sachan:
GPT-4 as a Homework Tutor Can Improve Student Engagement and Learning Outcomes. 31119-31136 - Zahra Bayramli, Ayhan Suleymanzade, Na Min An, Huzama Ahmad, Eunsu Kim, Junyeong Park, James Thorne, Alice Oh:
Diffusion Models Through a Global Lens: Are They Culturally Inclusive? 31137-31155 - Qiyuan Deng, Xuefeng Bai, Kehai Chen, Yaowei Wang, Liqiang Nie, Min Zhang:
Efficient Safety Alignment of Large Language Models via Preference Re-ranking and Representation-based Reward Modeling. 31156-31171 - Sam Passmore, Lila San Roque, Kirsty Gillespie, Saurabh Nath, Kira Davey, Keira Mullan, Tim Cawley, Jennifer Biggs, Rosey Billington, Bethwyn Evans, Nick Thieberger, Danielle Barth:
English-based acoustic models perform well in the forced alignment of two English-based Pacific Creoles. 31172-31183 - Kaishuai Xu, Tiezheng Yu, Wenjun Hou, Yi Cheng, Chak Tou Leong, Liangyou Li, Xin Jiang, Lifeng Shang, Qun Liu, Wenjie Li:
Subtle Errors in Reasoning: Preference Learning via Error-injected Self-editing. 31184-31203 - Blanca Calvo Figueras, Eneko Sagarzazu, Julen Etxaniz, Jeremy Barnes, Pablo Gamallo, Iria de-Dios-Flores, Rodrigo Agerri:
Truth Knows No Language: Evaluating Truthfulness Beyond English. 31204-31218 - Yusuke Sakai, Hidetaka Kamigaito, Taro Watanabe:
Revisiting Compositional Generalization Capability of Large Language Models Considering Instruction Following Ability. 31219-31238 - Jann Railey Montalan, Jimson Paulo Layacan, David Demitri Africa, Richell Isaiah Flores, Michael Tuscano Lopez II, Theresa Denise Magsajo, Anjanette Cayabyab, William-Chandra Tjhi:
Batayan: A Filipino NLP benchmark for evaluating Large Language Models. 31239-31273 - Michiel van der Meer, Pavel Korshunov, Sébastien Marcel, Lonneke van der Plas:
HintsOfTruth: A Multimodal Checkworthiness Detection Dataset with Real and Synthetic Claims. 31274-31291 - Weichen Zhang, Chen Gao, Shiquan Yu, Ruiying Peng, Baining Zhao, Qian Zhang, Jinqiang Cui, Xinlei Chen, Yong Li:
CityNavAgent: Aerial Vision-and-Language Navigation with Hierarchical Semantic Planning and Global Memory. 31292-31309 - Iuliia Zaitova, Badr M. Abdullah, Wei Xue, Dietrich Klakow, Bernd Möbius, Tania Avgustinova:
It's Not a Walk in the Park! Challenges of Idiom Translation in Speech-to-text Systems. 31310-31322 - Nikolaos Nikolaidis, Nicolas Stefanovitch, Purificação Silvano, Dimitar Iliyanov Dimitrov, Roman Yangarber, Nuno Guimarães, Elisa Sartori, Ion Androutsopoulos, Preslav Nakov, Giovanni Da San Martino, Jakub Piskorski:
PolyNarrative: A Multilingual, Multilabel, Multi-domain Dataset for Narrative Extraction from News Articles. 31323-31345 - Yongbin Guo, Shuzhen Li, Zhulin Liu, Tong Zhang, C. L. Philip Chen:
A Parameter-Efficient and Fine-Grained Prompt Learning for Vision-Language Models. 31346-31359 - Seungwon Lim, Seungbeen Lee, Dongjun Min, Youngjae Yu:
Persona Dynamics: Unveiling the Impact of Persona Traits on Agents in Text-Based Games. 31360-31394 - Jie Ying, Zihong Chen, Zhefan Wang, Wanli Jiang, Chenyang Wang, Zhonghang Yuan, Haoyang Su, Huanjun Kong, Fan Yang, Nanqing Dong:
SeedBench: A Multi-task Benchmark for Evaluating Large Language Models in Seed Science. 31395-31449 - Ankita Gupta, Douglas Rice, Brendan T. O'Connor:
-Stance: A Large-Scale Real World Dataset of Stances in Legal Argumentation. 31450-31467 - Zhiyang Zhang, Ziqiang Liu, Huiming Wang, Renke Shan, Li Kuang, Lu Wang, De Wen Soh:
Re³Syn: A Dependency-Based Data Synthesis Framework for Long-Context Post-training. 31468-31480 - Jihyoung Jang, Minwook Bae, Minji Kim, Dilek Hakkani-Tür, Hyounghun Kim:
Enabling Chatbots with Eyes and Ears: An Immersive Multimodal Conversation System for Dynamic Interactions. 31481-31512 - Xingyu Li, Chen Gong, Guohong Fu:
Multimodal Coreference Resolution for Chinese Social Media Dialogues: Dataset and Benchmark Approach. 31513-31525 - Yindu Su, Huike Zou, Lin Sun, Ting Zhang, Haiyang Yang, Chen Li Yu, David Lo, Qingheng Zhang, Shuguang Han, Jufeng Chen:
TACLR: A Scalable and Efficient Retrieval-based Method for Industrial Product Attribute Value Identification. 31526-31538 - Ruirui Chen, Weifeng Jiang, Chengwei Qin, Cheston Tan:
Theory of Mind in Large Language Models: Assessment and Enhancement. 31539-31558 - Rui Qiu, Shijie Chen, Yu Su, Po-Yin Yen, Han-Wei Shen:
Completing A Systematic Review in Hours instead of Months with Interactive AI Agents. 31559-31593 - Guohua Wang, Shengping Song, Wuchun He, Yongsen Zheng:
CMHKF: Cross-Modality Heterogeneous Knowledge Fusion for Weakly Supervised Video Anomaly Detection. 31594-31607 - Longze Chen, Renke Shan, Huiming Wang, Lu Wang, Ziqiang Liu, Run Luo, Jiawei Wang, Hamid Alinejad-Rokny, Min Yang:
CLaSp: In-Context Layer Skip for Self-Speculative Decoding. 31608-31618 - Canasai Kruengkrai, Koichiro Yoshino:
Teaching Text Agents to Learn Sequential Decision Making from Failure. 31619-31635 - Eleftheria Tsipidi, Samuel Kiegeland, Franz Nowak, Tianyang Xu, Ethan Wilcox, Alex Warstadt, Ryan Cotterell, Mario Giulianelli:
The Harmonic Structure of Information Contours. 31636-31659 - Navve Wasserman, Roi Pony, Oshri Naparstek, Adi Raz Goldfarb, Eli Schwartz, Udi Barzelay, Leonid Karlinsky:
REAL-MM-RAG: A Real-World Multi-Modal Retrieval Benchmark. 31660-31683 - Mats Faulborn, Indira Sen, Max Pellert, Andreas Spitz, David García:
Only a Little to the Left: A Theory-grounded Measure of Political Bias in Large Language Models. 31684-31704 - Yida Lu, Jiale Cheng, Zhexin Zhang, Shiyao Cui, Cunxiang Wang, Xiaotao Gu, Yuxiao Dong, Jie Tang, Hongning Wang, Minlie Huang:
LongSafety: Evaluating Long-Context Safety of Large Language Models. 31705-31725 - Xiaowei Yuan, Zhao Yang, Ziyang Huang, Yequan Wang, Siqi Fan, Yiming Ju, Jun Zhao, Kang Liu:
Exploiting Contextual Knowledge in LLMs through V-usable Information based Layer Enhancement. 31726-31741 - Sooyung Choi, Jaehyeok Lee, Xiaoyuan Yi, Jing Yao, Xing Xie, JinYeong Bak:
Unintended Harms of Value-Aligned LLMs: Psychological and Empirical Insights. 31742-31768 - Hani Alomari, Anushka Sivakumar, Andrew Zhang, Chris Thomas:
Maximal Matching Matters: Preventing Representation Collapse for Robust Cross-Modal Retrieval. 31769-31785 - Hong Chen, Misha Teplitskiy, David Jurgens:
The Noisy Path from Source to Citation: Measuring How Scholars Engage with Past Research. 31786-31802 - Ching-Wen Yang, Zhi-Quan Feng, Ying-Jia Lin, Che Wei Chen, Kun-da Wu, Hao Xu, Jui-Feng Yao, Hung-Yu Kao:
MAPLE: Enhancing Review Generation with Multi-Aspect Prompt LEarning in Explainable Recommendation. 31803-31821 - Clément Dumas, Chris Wendler, Veniamin Veselovsky, Giovanni Monea, Robert West:
Separating Tongue from Thought: Activation Patching Reveals Language-Agnostic Concept Representations in Transformers. 31822-31841 - Ivan Vegner, Sydelle de Souza, Valentin Forch, Martha Lewis, Leonidas A. A. Doumas:
Behavioural vs. Representational Systematicity in End-to-End Models: An Opinionated Survey. 31842-31856 - Boheng Sheng, Jiacheng Yao, Meicong Zhang, Guoxiu He:
Dynamic Chunking and Selection for Reading Comprehension of Ultra-Long Context in Large Language Models. 31857-31876 - Rong Cheng, Jinyi Liu, Yan Zheng, Fei Ni, Jiazhen Du, Hangyu Mao, Fuzheng Zhang, Bo Wang, Jianye Hao:
DualRAG: A Dual-Process Approach to Integrate Reasoning and Retrieval for Multi-Hop Question Answering. 31877-31899 - Siheng Xiong, Ali Payani, Yuan Yang, Faramarz Fekri:
Deliberate Reasoning in Language Models as Structure-Aware Planning with an Accurate World Model. 31900-31931 - Xinxin Liu, Aaron Thomas, Cheng Zhang, Jianyi Cheng, Yiren Zhao, Xitong Gao:
Refining Salience-Aware Sparse Fine-Tuning Strategies for Language Models. 31932-31945 - Emily Xiao, Chin-Jou Li, Yilin Zhang, Graham Neubig, Amanda Bertsch:
Efficient Many-Shot In-Context Learning with Dynamic Block-Sparse Attention. 31946-31958 - Rui Pan, Dylan Zhang, Hanning Zhang, Xingyuan Pan, Minrui Xu, Jipeng Zhang, Renjie Pi, Xiaoyu Wang, Tong Zhang:
ScaleBiO: Scalable Bilevel Optimization for LLM Data Reweighting. 31959-31982 - Jiaming Ji, Donghai Hong, Borong Zhang, Boyuan Chen, Josef Dai, Boren Zheng, Tianyi Alex Qiu, Jiayi Zhou, Kaile Wang, Boxun Li, Sirui Han, Yike Guo, Yaodong Yang:
PKU-SafeRLHF: Towards Multi-Level Safety Alignment for LLMs with Human Preference. 31983-32016 - Ming Li, Yanhong Li, Tianyi Zhou:
What Happened in LLMs Layers when Trained for Fast vs. Slow Thinking: A Gradient Perspective. 32017-32154 - Jonas F. Lotz, António Vilarinho Lopes, Stephan Peitz, Hendra Setiawan, Leonardo Emili:
Beyond Text Compression: Evaluating Tokenizers Across Scales. 32155-32173 - Ahmed Elhady, Eneko Agirre, Mikel Artetxe:
Emergent Abilities of Large Language Models under Continued Pre-training for Language Adaptation. 32174-32186 - Lorenzo Balzotti, Donatella Firmani, Jerin George Mathew, Riccardo Torlone, Sihem Amer-Yahia:
R-Fairness: Assessing Fairness of Ranking in Subjective Data. 32187-32199 - Atoosa Malemir Chegini, Keivan Rezaei, Hamid Eghbalzadeh, Soheil Feizi:
RePanda: Pandas-powered Tabular Verification and Reasoning. 32200-32212 - Shreya Havaldar, Adam Stein, Eric Wong, Lyle H. Ungar:
Towards Style Alignment in Cross-Cultural Translation. 32213-32230 - Jeffrey Li, Mohammadreza Armandpour, Iman Mirzadeh, Sachin Mehta, Vaishaal Shankar, Raviteja Vemulapalli, Samy Bengio, Oncel Tuzel, Mehrdad Farajtabar, Hadi Pouransari, Fartash Faghri:
TiC-LM: A Web-Scale Benchmark for Time-Continual LLM Pretraining. 32231-32273 - Shreya Havaldar, Hamidreza Alvari, John Palowitch, Mohammad Javad Hosseini, Senaka Buthpitiya, Alex Fabrikant:
Entailed Between the Lines: Incorporating Implication into NLI. 32274-32290 - Lucas Monteiro Paes, Dennis Wei, Hyo Jin Do, Hendrik Strobelt, Ronny Luss, Amit Dhurandhar, Manish Nagireddy, Karthikeyan Natesan Ramamurthy, Prasanna Sattigeri, Werner Geyer, Soumya Ghosh:
Multi-Level Explanations for Generative Language Models. 32291-32317 - Dorde Klisura, Astrid R. Bernaga Torres, Anna Karen Gárate-Escamilla, Rajesh Roshan Biswal, Ke Yang, Hilal Pataci, Anthony Rios:
A Multi-Agent Framework for Mitigating Dialect Biases in Privacy Policy Question-Answering Systems. 32318-32337 - Xu Ouyang, Tao Ge, Thomas Hartvigsen, Zhisong Zhang, Haitao Mi, Dong Yu:
Low-Bit Quantization Favors Undertrained LLMs. 32338-32348 - Rachneet Kaur, Zhen Zeng, Tucker Balch, Manuela Veloso:
LETS-C: Leveraging Text Embedding for Time Series Classification. 32365-32399 - Baining Zhao, Jianjie Fang, Zichao Dai, Ziyou Wang, Jirong Zha, Weichen Zhang, Chen Gao, Yue Wang, Jinqiang Cui, Xinlei Chen, Yong Li:
UrbanVideo-Bench: Benchmarking Vision-Language Models on Embodied Intelligence with Video Data in Urban Spaces. 32400-32423 - Sungho Park, Joohyung Yun, Jongwuk Lee, Wook-Shin Han:
HELIOS: Harmonizing Early Fusion, Late Fusion, and LLM Reasoning for Multi-Granular Table-Text Retrieval. 32424-32444 - Adhiraj Ghosh, Sebastian Dziadzio, Ameya Prabhu, Vishaal Udandarao, Samuel Albanie, Matthias Bethge:
ONEBench to Test Them All: Sample-Level Benchmarking Over Open-Ended Capabilities. 32445-32481 - María Grandury, Javier Aula-Blasco, Júlia Falcão, Clémentine Fourrier, Miguel González Saiz, Gonzalo Martínez, Gonzalo Santamaría Gómez, Rodrigo Agerri, Nuria Aldama-García, Luis Chiruzzo, Javier Conde, Helena Gómez-Adorno, Marta Guerrero Nieto, Guido Ivetta, Natàlia López Fuertes, Flor Miriam Plaza del Arco, María Teresa Martín-Valdivia, Helena Montoro Zamorano, Carmen Muñoz Sanz, Pedro Reviriego, Leire Rosado Plaza, Alejandro Vaca Serrano, María Estrella Vallecillo Rodríguez, Jorge Vallego, Irune Zubiaga:
La Leaderboard: A Large Language Model Leaderboard for Spanish Varieties and Languages of Spain and Latin America. 32482-32524 - Xiang Zhang, Juntai Cao, Chenyu You, Dujian Ding:
Why Prompt Design Matters and Works: A Complexity Analysis of Prompt Search Space in LLMs. 32525-32555 - Jared Fernandez, Clara Na, Vashisth Tiwari, Yonatan Bisk, Sasha Luccioni, Emma Strubell:
Energy Considerations of Large Language Model Inference and Efficiency Optimizations. 32556-32569 - Lior Belenki, Alekh Agarwal, Tianze Shi, Kristina Toutanova:
Optimizing Pre-Training Data Mixtures with Mixtures of Data Expert Models. 32570-32587 - Ran Xin, Chenguang Xi, Jie Yang, Feng Chen, Hang Wu, Xia Xiao, Yifan Sun, Shen Zheng, Ming Ding:
BFS-Prover: Scalable Best-First Tree Search for LLM-based Automatic Theorem Proving. 32588-32599 - Fan Yin, Zifeng Wang, I-Hung Hsu, Jun Yan, Ke Jiang, Yanfei Chen, Jindong Gu, Long T. Le, Kai-Wei Chang, Chen-Yu Lee, Hamid Palangi, Tomas Pfister:
Magnet: Multi-turn Tool-use Data Synthesis and Distillation via Graph Translation. 32600-32616 - Xinyu Wang, Changzhi Sun, Lian Cheng, Yuanbin Wu, Dell Zhang, Xiaoling Wang, Xuelong Li:
Logic-Regularized Verifier Elicits Reasoning from LLMs. 32617-32630 - Coleman Richard Charles Hooper, Sehoon Kim, Hiva Mohammadzadeh, Monishwaran Maheswaran, Sebastian Zhao, June Paik, Michael W. Mahoney, Kurt Keutzer, Amir Gholami:
Squeezed Attention: Accelerating Long Context Length LLM Inference. 32631-32652 - Diego Velazquez, Mikaela Grace, Konstantinos Karageorgos, Lawrence Carin, Aaron Schliem, Dimitrios Zaikis, Roger Wechsler:
LangMark: A Multilingual Dataset for Automatic Post-Editing. 32653-32667 - Guodong Du, Zitao Fang, Jing Li, Junlin Li, Runhua Jiang, Shuyang Yu, Yifei Guo, Yangneng Chen, Sim Kuan Goh, Ho-Kin Tang, Daojing He, Honghai Liu, Min Zhang:
Neural Parameter Search for Slimmer Fine-Tuned Models and Better Transfer. 32668-32687 - Zenghui Yuan, Yangming Xu, Jiawen Shi, Pan Zhou, Lichao Sun:
Merge Hijacking: Backdoor Attacks to Model Merging of Large Language Models. 32688-32703 - Ife Adebara, Hawau Olamide Toyin, Nahom Tesfu Ghebremichael, AbdelRahim A. Elmadany, Muhammad Abdul-Mageed:
Where Are We? Evaluating LLM Performance on African Languages. 32704-32731 - Chengwei Qin, Wenhan Xia, Fangkai Jiao, Chen Chen, Yuchen Hu, Bosheng Ding, Ruirui Chen, Shafiq Joty:
Beyond Output Matching: Bidirectional Alignment for Enhanced In-Context Learning. 32732-32758 - Yumo Xu, Peng Qi, Jifan Chen, Kunlun Liu, Rujun Han, Lan Liu, Bonan Min, Vittorio Castelli, Arshit Gupta, Zhiguo Wang:
CiteEval: Principle-Driven Citation Evaluation for Source Attribution. 32759-32778 - Mengkang Hu, Tianxing Chen, Qiguang Chen, Yao Mu, Wenqi Shao, Ping Luo:
HiAgent: Hierarchical Working Memory Management for Solving Long-Horizon Agent Tasks with Large Language Model. 32779-32798 - Yao Shi, Rongkeng Liang, Yong Xu:
EducationQ: Evaluating LLMs' Teaching Capabilities Through Multi-Agent Dialogue Framework. 32799-32828 - Peiqi Sui, Juan Diego Rodriguez, Philippe Laban, Dean Murphy, Joseph P. Dexter, Richard Jean So, Samuel Baker, Pramit Chaudhuri:
KRISTEVA: Close Reading as a Novel Task for Benchmarking Interpretive Reasoning. 32829-32849 - Yiduo Guo, Jie Fu, Huishuai Zhang, Dongyan Zhao:
Efficient Domain Continual pretraining by Mitigating the Stability Gap. 32850-32870 - Fakhraddin Alwajih, Abdellah El Mekki, Samar Mohamed Magdy, AbdelRahim A. Elmadany, Omer Nacar, El Moatez Billah Nagoudi, Reem Abdel-Salam, Hanin Atwany, Youssef Nafea, Abdulfattah Mohammed Yahya, Rahaf Alhamouri, Hamzah A. Alsayadi, Hiba Zayed, Sara Shatnawi, Serry Sibaee, Yasir Ech-Chammakhy, Walid Al-Dhabyani, Marwa Mohamed Ali, Imen Jarraya, Ahmed Oumar El-Shangiti, Aisha Alraeesi, Mohammed Anwar Al-Ghrawi, Abdulrahman S. Al-Batati, Elgizouli Mohamed, Noha Taha Elgindi, Muhammed Saeed, Houdaifa Atou, Issam Ait Yahia, Abdelhak Bouayad, Mohammed Machrouh, Amal Makouar, Dania Alkawi, Mukhtar Mohamed, Safaa Taher Abdelfadil, Amine Ziad Ounnoughene, Rouabhia Anfel, Rwaa Assi, Ahmed Sorkatti, Mohamedou Cheikh Tourad, Anis Koubaa, Ismail Berrada, Mustafa Jarrar, Shady Shehata, Muhammad Abdul-Mageed:
Palm: A Culturally Inclusive and Linguistically Diverse Dataset for Arabic LLMs. 32871-32894 - Alexander Spangher, Michael Lu, Sriya Kalyan, Hyundong Justin Cho, Tenghao Huang, Weiyan Shi, Jonathan May:
NewsInterview: a Dataset and a Playground to Evaluate LLMs' Grounding Gap via Informational Interviews. 32895-32925 - Tao Zhang, Chenglin Zhu, Yanjun Shen, Wenjing Luo, Yan Zhang, Hao Liang, Fan Yang, Mingan Lin, Yujing Qiao, Weipeng Chen, Bin Cui, Wentao Zhang, Zenan Zhou:
CFBench: A Comprehensive Constraints-Following Benchmark for LLMs. 32926-32944 - Ashwin Sankar, Sparsh Jain, Nikhil Narasimhan, Devilal Choudhary, Dhairya Suman, Mohammed Safi Ur Rahman Khan, Anoop Kunchukuttan, Mitesh M. Khapra, Raj Dabre:
Towards Building Large Scale Datasets and State-of-the-Art Automatic Speech Translation Systems for 14 Indian Languages. 32945-32966 - Yang Tian, Fan Liu, Jingyuan Zhang, Victoria W., Yupeng Hu, Liqiang Nie:
CoRe-MMRAG: Cross-Source Knowledge Reconciliation for Multimodal RAG. 32967-32982 - Momose Oyama, Hiroaki Yamagiwa, Yusuke Takase, Hidetoshi Shimodaira:
Mapping 1,000+ Language Models via the Log-Likelihood Vector. 32983-33038 - Zhaochen Hong, Haofei Yu, Jiaxuan You:
ConsistencyChecker: Tree-based Evaluation of LLM Generalization Capabilities. 33039-33075 - Alejandro Benito-Santos, Adrián Ghajari, Víctor Fresno:
Robust Estimation of Population-Level Effects in Repeated-Measures NLP Experimental Designs. 33076-33089 - Farima Fatahi Bayat, Lechen Zhang, Sheza Munir, Lu Wang:
FactBench: A Dynamic Benchmark for In-the-Wild Language Model Factuality Evaluation. 33090-33110 - Zichuan Fu, Xian Wu, Yejing Wang, Wanyu Wang, Shanshan Ye, Hongzhi Yin, Yi Chang, Yefeng Zheng, Xiangyu Zhao:
Training-free LLM Merging for Multi-task Learning. 33111-33124 - Mingyu Derek Ma, Yanna Ding, Zijie Huang, Jianxi Gao, Yizhou Sun, Wei Wang:
Inferring from Logits: Exploring Best Practices for Decoding-Free Generative Candidate Selection. 33125-33144 - Minhyeon Oh, Seungjoon Lee, Jungseul Ok:
Comparison-based Active Preference Learning for Multi-dimensional Personalization. 33145-33166 - Siming Huang, Tianhao Cheng, Jason Klein Liu, Weidi Xu, Jiaran Hao, Liuyihan Song, Yang Xu, Jian Yang, Jiaheng Liu, Chenchen Zhang, Linzheng Chai, Ruifeng Yuan, Xianzhen Luo, Qiufeng Wang, YuanTao Fan, Qingfu Zhu, Zhaoxiang Zhang, Yang Gao, Jie Fu, Qian Liu, Houyi Li, Ge Zhang, Yuan Qi, Yinghui Xu, Wei Chu, Zili Wang:
OpenCoder: The Open Cookbook for Top-Tier Code Large Language Models. 33167-33193 - Chansung Park, Juyong Jiang, Fan Wang, Sayak Paul, Jing Tang:
LlamaDuo: LLMOps Pipeline for Seamless Migration from Service LLMs to Small-Scale Local LLMs. 33194-33215 - Anastasiia Ivanova, Eva Bakaeva, Zoya Volovikova, Alexey K. Kovalev, Aleksandr Panov:
AmbiK: Dataset of Ambiguous Tasks in Kitchen Environment. 33216-33241 - Jincenzi Wu, Jianxun Lian, Dingdong Wang, Helen M. Meng:
SocialCC: Interactive Evaluation for Cultural Competence in Language Agents. 33242-33271 - Hongyuan Dong, Zijian Kang, Weijie Yin, LiangXiao LiangXiao, ChaoFeng ChaoFeng, Ran Jiao:
Scalable Vision Language Model Training via High Quality Data Curation. 33272-33293 - Sunkyung Lee, Minjin Choi, Eunseong Choi, Hye-young Kim, Jongwuk Lee:
GRAM: Generative Recommendation via Semantic-aware Multi-granular Late Fusion. 33294-33312 - Tao Ji, Bin Guo, Yuanbin Wu, Qipeng Guo, Shenlixing Shenlixing, Chenzhan Chenzhan, Xipeng Qiu, Qi Zhang, Tao Gui:
Towards Economical Inference: Enabling DeepSeek's Multi-Head Latent Attention in Any Transformer-based LLMs. 33313-33328 - Zhaoxuan Wu, Zijian Zhou, Arun Verma, Alok Prakash, Daniela Rus, Bryan Kian Hsiang Low:
TETRIS: Optimal Draft Token Selection for Batch Speculative Decoding. 33329-33345 - Mooho Song, Hye Ryung Son, Jay-Yoon Lee:
Introducing Verification Task of Set Consistency with Set-Consistency Energy Networks. 33346-33366 - Atharvan Dogra, Krishna Pillutla, Ameet Deshpande, Ananya B. Sai, John J. Nay, Tanmay Rajpurohit, Ashwin Kalyan, Balaraman Ravindran:
Language Models can Subtly Deceive Without Lying: A Case Study on Strategic Phrasing in Legislation. 33367-33390 - Kayode Olaleye, Arturo Oncevay, Mathieu Sibue, Nombuyiselo Zondi, Michelle Terblanche, Sibongile Mapikitla, Richard Lastrucci, Charese Smiley, Vukosi Marivate:
AfroCS-xs: Creating a Compact, High-Quality, Human-Validated Code-Switched Dataset for African Languages. 33391-33410 - Muhammad Reza Qorib, Junyi Li, Hwee Tou Ng:
Just Go Parallel: Improving the Multilingual Capabilities of Large Language Models. 33411-33424 - Mukai Li, Lei Li, Shansan Gong, Qi Liu:
Design Choices for Extending the Context Length of Visual Language Models. 33425-33438