Research Article
DOI: 10.1145/3626772.3657855

TriviaHG: A Dataset for Automatic Hint Generation from Factoid Questions

Published: 11 July 2024

Abstract

Nowadays, individuals increasingly turn to Large Language Models for answers to their questions. At a time when such answers are readily available to anyone, stimulating and preserving humans' cognitive abilities and ensuring that humans maintain good reasoning skills become crucial. This study addresses these needs by proposing hints (instead of final answers, or before giving answers) as a viable solution. We introduce a framework for automatic hint generation for factoid questions and employ it to construct TriviaHG, a novel large-scale dataset featuring 160,230 hints corresponding to 16,645 questions from the TriviaQA dataset. Additionally, we present an automatic evaluation method that measures the Convergence and Familiarity quality attributes of hints. To evaluate the TriviaHG dataset and the proposed evaluation method, we enlisted 10 individuals to annotate 2,791 hints and tasked 6 humans with answering questions using the provided hints. The effectiveness of hints varied, with success rates of 96%, 78%, and 36% for questions with easy, medium, and hard answers, respectively. Moreover, the proposed automatic evaluation method showed a robust correlation with the annotators' results. In conclusion, the findings highlight three key insights: the facilitative role of hints in resolving unknown questions, the dependence of hint quality on answer difficulty, and the feasibility of automatic evaluation methods for hint assessment.
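
The abstract above describes the dataset and two automatic quality measures for hints, Convergence and Familiarity. As a rough, non-authoritative illustration only, the Python sketch below shows what a TriviaHG-style record could look like and a toy keyword-overlap stand-in for a Convergence-like score. The field names, candidate answers, and scoring rule are illustrative assumptions, not the authors' implementation or the dataset's actual schema.

# A minimal sketch, assuming hypothetical field names and a toy scoring rule.
# It is NOT the paper's evaluation method; it only illustrates the idea that a
# good hint should narrow the space of plausible answers toward the gold answer.
from dataclasses import dataclass, field


@dataclass
class HintedQuestion:
    question: str
    answer: str
    hints: list[str] = field(default_factory=list)


def toy_convergence(hint: str, answer: str, candidates: list[str]) -> float:
    """Fraction of wrong candidates sharing no token with the hint -- a crude,
    purely illustrative stand-in for a Convergence-style score."""
    hint_terms = set(hint.lower().split())
    wrong = [c for c in candidates if c.lower() != answer.lower()]
    eliminated = [c for c in wrong if not (set(c.lower().split()) & hint_terms)]
    return len(eliminated) / len(wrong) if wrong else 0.0


if __name__ == "__main__":
    record = HintedQuestion(
        question="Which planet is known as the Red Planet?",
        answer="Mars",
        hints=["This planet is named after the Roman god of war."],
    )
    candidates = ["Mars", "Venus", "Jupiter", "Mercury"]
    print(toy_convergence(record.hints[0], record.answer, candidates))  # 1.0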




      Published In

SIGIR '24: Proceedings of the 47th International ACM SIGIR Conference on Research and Development in Information Retrieval
July 2024, 3164 pages
ISBN: 9798400704314
DOI: 10.1145/3626772

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. hint generation
      2. large language models
      3. question answering

      Conference

      SIGIR 2024

      Acceptance Rates

Overall acceptance rate: 792 of 3,983 submissions (20%)
