hiaoxui/nugget-data

This repo contains the synthetic data for the Nugget paper.

The file docs.txt contains all the documents. Each line holds a doc id and the doc text, separated by a tab character (\t).
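A minimal sketch of reading this file in Python may help. It assumes the file is UTF-8 encoded and that the first tab on each line separates the id from the text; the function names `parse_docs` and `load_docs` are hypothetical and not part of this repo.

```python
def parse_docs(lines):
    """Build a dict mapping doc id -> doc text from 'id<TAB>text' lines."""
    docs = {}
    for line in lines:
        line = line.rstrip("\r\n")
        if not line:
            continue  # skip blank lines, e.g. a trailing newline at end of file
        # Split on the first tab only, so tabs inside the text are preserved.
        doc_id, text = line.split("\t", 1)
        docs[doc_id] = text
    return docs


def load_docs(path="docs.txt"):
    """Read the documents file from disk (UTF-8 encoding is an assumption)."""
    with open(path, encoding="utf-8") as f:
        return parse_docs(f)
```

Ids are kept as strings here, exactly as they appear in the file.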

The file task.jsonl contains the metadata of the task. Each line is a JSON object describing one data point, where source is the id of the source doc, candidates is the list of candidate doc ids, and answer is the index of the target document in the candidate list.
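The data points could be read along the following lines. This is a sketch under stated assumptions: the JSON keys are taken to be literally `source`, `candidates`, and `answer`, and the index is taken to be 0-based, which the description above does not state. The helper names `parse_tasks`, `load_tasks`, and `target_id` are hypothetical.

```python
import json


def parse_tasks(lines):
    """Parse JSON Lines input: one JSON object per non-empty line."""
    tasks = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines
        tasks.append(json.loads(line))
    return tasks


def load_tasks(path="task.jsonl"):
    """Read the task file from disk (UTF-8 encoding is an assumption)."""
    with open(path, encoding="utf-8") as f:
        return parse_tasks(f)


def target_id(task):
    """Return the id of the target document for one data point.

    Assumption: 'answer' is a 0-based index into 'candidates'.
    """
    return task["candidates"][task["answer"]]
```

If the index turns out to be 1-based, only `target_id` needs to change.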

Note that the paraphrase identification dataset contains 3 splits, but we only experiment with the dev set.
