This repository was archived by the owner on Nov 3, 2025. It is now read-only.

[Question] Regarding the TriviaQA and TriviaQA+CF-TriviaQA datasets in the paper #1

@nooobodynose

Description

Hello!

I am interested in your paper "Hallucination Augmented Recitations for Language Models" and would like to conduct similar experiments.

I have a few questions for you, and I would be incredibly grateful if you could answer them :)

Table 3: "Token-level F1 scores of T5-3B models finetuned with TriviaQA, CF-TriviaQA, and their combination. Combining our CF-TriviaQA dataset with TriviaQA achieves good out-of-domain performance while having similar in-domain performance to the model finetuned with TriviaQA alone."
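Just so we are talking about the same metric: I assume "token-level F1" here means the standard SQuAD-style token-overlap F1. A minimal sketch of how I would compute it for one prediction/reference pair (the function name token_f1 is mine, not from the paper):

```python
from collections import Counter

def token_f1(prediction: str, reference: str) -> float:
    """SQuAD-style token-level F1 between a predicted and a gold answer string."""
    pred_tokens = prediction.lower().split()
    ref_tokens = reference.lower().split()
    # Multiset intersection counts tokens shared by prediction and reference
    common = Counter(pred_tokens) & Counter(ref_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(ref_tokens)
    return 2 * precision * recall / (precision + recall)

print(token_f1("the eiffel tower", "eiffel tower"))  # 0.8
```

If you used a different normalization (articles, punctuation, etc.), please correct me.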

  1. Are you referring here to the whole TriviaQA dataset (61,688 examples, link) or just a subset? If it is a subset, how many training examples were included?

  2. (Similar to the previous question) For TriviaQA + CF-TriviaQA, does that mean the whole TriviaQA dataset plus the 19,327 CF examples you generated, i.e. a simple concatenation (see the sketch after this list)?
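To make my reading of question 2 concrete, here is a minimal sketch of the combination I have in mind, assuming the Hugging Face datasets library; the file name cf_triviaqa.json is hypothetical, standing in for your generated CF examples:

```python
from datasets import load_dataset, concatenate_datasets

# Full TriviaQA training split (reading-comprehension configuration)
triviaqa = load_dataset("trivia_qa", "rc", split="train")

# Hypothetical local file holding the generated CF-TriviaQA examples;
# concatenate_datasets requires both datasets to share the same schema
cf_triviaqa = load_dataset("json", data_files="cf_triviaqa.json", split="train")

# My interpretation of "TriviaQA + CF-TriviaQA": plain concatenation
combined = concatenate_datasets([triviaqa, cf_triviaqa])
print(len(triviaqa), len(cf_triviaqa), len(combined))
```

If the combined training set was built differently (e.g. if TriviaQA was subsampled first), that is exactly what I am hoping to learn.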

Looking forward to your responses!
