Cited Text Spans for Scientific Citation Text Generation

Xiangci Li; Yi-Hui Lee; Jessica Ouyang

Abstract

An automatic citation generation system aims to concisely and accurately describe the relationship between two scientific articles. To do so, such a system must ground its outputs to the content of the cited paper to avoid non-factual hallucinations. Due to the length of scientific documents, existing abstractive approaches have conditioned only on cited paper abstracts. We demonstrate empirically that the abstract is not always the most appropriate input for citation generation and that models trained in this way learn to hallucinate. We propose to condition instead on the cited text span (CTS) as an alternative to the abstract. Because manual CTS annotation is extremely time- and labor-intensive, we experiment with distant labeling of candidate CTS sentences, achieving sufficiently strong performance to substitute for expensive human annotations in model training, and we propose a human-in-the-loop, keyword-based CTS retrieval approach that makes generating citation texts grounded in the full text of cited papers both promising and practical.

Anthology ID: 2024.sdp-1.9
Volume: Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024)
Month: August
Year: 2024
Address: Bangkok, Thailand
Editors: Tirthankar Ghosal, Amanpreet Singh, Anita Waard, Philipp Mayr, Aakanksha Naik, Orion Weller, Yoonjoo Lee, Shannon Shen, Yanxia Qin
Venues: sdp | WS
Publisher: Association for Computational Linguistics
Pages: 90–104
URL: https://aclanthology.org/2024.sdp-1.9/
Cite (ACL): Xiangci Li, Yi-Hui Lee, and Jessica Ouyang. 2024. Cited Text Spans for Scientific Citation Text Generation. In Proceedings of the Fourth Workshop on Scholarly Document Processing (SDP 2024), pages 90–104, Bangkok, Thailand. Association for Computational Linguistics.
Cite (Informal): Cited Text Spans for Scientific Citation Text Generation (Li et al., sdp 2024)
PDF: https://aclanthology.org/2024.sdp-1.9.pdf
