SUMMARY : Session P10-S

 

Title The Ritel Corpus - An annotated Human-Machine open-domain question answering spoken dialog corpus
Authors S. Rosset, S. Petel
Abstract In this paper we present a real (as opposed to Wizard-of-Oz) Human-Computer QA-oriented spoken dialog corpus collected with our Ritel platform. The corpus has been orthographically transcribed and annotated in terms of Specific Entities and Topics. Twelve main topics were chosen and refined into 22 sub-topics. The Specific Entities fall into five categories: Named Entities, linguistic entities, topic-defining entities, general entities and extended entities. The corpus contains 582 dialogs totaling 6 hours of user speech.
Keywords
Full paper The Ritel Corpus - An annotated Human-Machine open-domain question answering spoken dialog corpus