How are the claim-level metrics (accuracy, precision, recall, etc.) calculated? The claims extracted by the model will inevitably differ from run to run, so it seems difficult to compute these metrics without manual evaluation.
Is the metric calculation based on the ground-truth annotations in the dataset? That is: the claims are given by default, and the correctness of each claim is known (corresponding to its label in the dataset); evidence is retrieved for each given claim, the claim is verified against that evidence, and the predicted verdict is then compared with the ground-truth label to perform factual verification. I'm not sure whether my understanding is correct. Also, is this part of the code missing from the repo?
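To make my understanding concrete, here is a minimal sketch of how I assume the metrics would be computed if the claims and labels come from the dataset: predicted verdicts and gold labels align one-to-one, so standard scoring applies directly (sklearn here is just for illustration, not necessarily what the repo uses).

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score

# Gold labels taken straight from the dataset annotation of the given claims
# (assumption: True = supported, False = refuted).
gold = [True, False, True, True]
# Predicted verdicts from the retrieve-evidence-then-verify step, one per claim.
pred = [True, False, False, True]

print("accuracy :", accuracy_score(gold, pred))   # 0.75
print("precision:", precision_score(gold, pred))  # 1.0
print("recall   :", recall_score(gold, pred))     # ~0.667
```

If this is roughly what happens, no manual evaluation is needed, since the claim set is fixed by the dataset rather than extracted by the model.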
