Modified

I (StellarDragon) modified and enhanced this dataset for my undergraduate thesis at Nanjing University. The modifications include, but are not limited to: correcting texts that do not conform to the formatting requirements; replacing full-width letters, characters, and numbers with half-width ones; and correcting some questions with obvious logical mistakes. After repairing the Chinese dataset, I used tools such as ChatGPT to produce a more accurate translation. In practical testing, the same model achieved up to 5% higher accuracy on the modified dataset than on the original dataset.
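The exact cleaning procedure is not published here, so the following is only a minimal sketch of how the full-width to half-width replacement could be done. It assumes that only Latin letters and digits are converted and that full-width punctuation (which is legitimate in Chinese text) is left alone; the function name `to_half_width` is hypothetical.

```python
# Full-width digits (U+FF10-U+FF19) and Latin letters (U+FF21-U+FF3A,
# U+FF41-U+FF5A) sit exactly 0xFEE0 above their ASCII counterparts.
# Assumption: punctuation is intentionally excluded from the mapping.
_FULL_TO_HALF = {
    code: code - 0xFEE0
    for start, end in ((0xFF10, 0xFF19), (0xFF21, 0xFF3A), (0xFF41, 0xFF5A))
    for code in range(start, end + 1)
}


def to_half_width(text: str) -> str:
    """Replace full-width letters and digits with their half-width forms."""
    return text.translate(_FULL_TO_HALF)


# Letters and digits are converted; Chinese characters and the
# full-width colon are left unchanged.
print(to_half_width("选项Ａ：２０２３年"))
```

If full-width punctuation should also be normalized, the range U+FF01 to U+FF5E could be mapped the same way, at the cost of altering Chinese punctuation.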

LogiQA

This dataset consists of 8,678 QA instances (Train: 7,376; Eval: 651; Test: 651).

The files are divided into an English version (Train.txt, Eval.txt, Test.txt) and a Chinese version (zh_train.txt, zh_eval.txt, zh_test.txt).

Every 8 lines constitute one example of a problem, so each language version contains 8,678 * 8 = 69,424 lines in total.

Within each group of 8 lines:

             The first line is a blank line;

             The second line is the correct choice;

             The third line is the context;

             The fourth line is the question;

             The remaining four lines are the four options.
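The 8-line layout described above can be read with a few lines of Python. This is a minimal sketch, not an official loader: the `Example` class and `parse_examples` function are hypothetical names, the answer is kept as the raw string found in the file, and the sample record uses placeholder text rather than real dataset content.

```python
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class Example:
    answer: str         # line 2: the correct choice, kept as the raw string
    context: str        # line 3
    question: str       # line 4
    options: List[str]  # lines 5-8


def parse_examples(lines: Iterable[str]) -> List[Example]:
    """Group lines into 8-line records following the layout described above."""
    rows = [line.rstrip("\n") for line in lines]
    if len(rows) % 8 != 0:
        raise ValueError(f"expected a multiple of 8 lines, got {len(rows)}")
    examples = []
    for start in range(0, len(rows), 8):
        blank, answer, context, question, *options = rows[start:start + 8]
        if blank.strip():
            raise ValueError(f"line {start + 1} should be blank")
        examples.append(
            Example(answer.strip(), context.strip(), question.strip(),
                    [option.strip() for option in options])
        )
    return examples


# Placeholder record for illustration only (not taken from the dataset).
sample = [
    "",
    "b",
    "Placeholder context.",
    "Placeholder question?",
    "A. first option",
    "B. second option",
    "C. third option",
    "D. fourth option",
]
examples = parse_examples(sample)
print(examples[0].answer, len(examples[0].options))
```

To load a real split, pass an open file instead of the sample list, for example `parse_examples(open("Train.txt", encoding="utf-8"))`. The length check doubles as a sanity test of the 8-lines-per-example rule.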

