KonTest

Replication Package for Knowledge-based Consistency Testing of Large Language Models

KonTest is an automated testing framework for evaluating the consistency of large language models (LLMs). It leverages an external knowledge base to craft queries to a target LLM and identifies both metamorphic and ontological errors in the LLM's output.

Requirements

Python Packages Required

time
ast
bigtree
pickle
timeit
random
pprint
itertools
llm
math
networkx
copy
pathlib
textwrap

LLM Specific Packages

Google Gemini

import google.generativeai as genai

Note: Credentials cannot be embedded in code. A service key is required. It can be created using GCP by following the linked procedure.

Link: Gemini Credentials

OpenAI GPT3.5

Note: Credentials cannot be embedded in code. A service key is required. It can be created at the OpenAI website.

Link: OpenAI Website

Falcon and Llama2

llm install llm-gpt4all

Note: The models need to be installed prior to usage. Full documentation is available below.

Link: LLM Documentation

Usage

The code has two stages: knowledge base generation and LLM querying. kgConstruct.py builds the knowledge base. The LLM query stage is split into three scripts: nodeSelection.py selects the set of paths used in subsequent steps, llmGen.py queries the chosen LLM and stores its responses, and errorFinder.py reads the stored responses and reports the number of errors for each error type.
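The order of the scripts can be sketched as a small driver. This is only a sketch under assumptions: this README does not state how the scripts are launched, so the code assumes each one runs without arguments from the repository root and reads its settings from CONFIG.txt. The names STAGES, pipeline_commands, and run_pipeline are hypothetical and not part of the repository.

```python
import subprocess
import sys

# Script order taken from the description above. Running each script
# without arguments is an assumption, not documented behaviour.
STAGES = ["kgConstruct.py", "nodeSelection.py", "llmGen.py", "errorFinder.py"]


def pipeline_commands(python=sys.executable):
    """Return one command per stage, in execution order."""
    return [[python, script] for script in STAGES]


def run_pipeline():
    """Run every stage in order and stop at the first failing script."""
    for command in pipeline_commands():
        subprocess.run(command, check=True)
```

Calling run_pipeline() from the repository root would then execute the four scripts one after another. Because check=True raises on a non-zero exit code, a failure in an early stage prevents later stages from running on stale data.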

CONFIG.txt

The file contains four editable parameters: chosenDomain, chosenLLM, initList, and selNodes. chosenDomain specifies the domain to explore. chosenLLM specifies the LLM to test. initList specifies the initial list of Wikidata entities to explore. selNodes specifies the randomly selected set of paths and nodes that KonTest used in the paper. If no list of nodes is specified, KonTest randomly generates one and appends it to the CONFIG file.
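The exact syntax of CONFIG.txt is not described here, so the following is a sketch under a loud assumption: each parameter sits on its own `name = value` line and the value is a Python literal. The helper names (parse_config, missing_keys) and all sample values are hypothetical placeholders.

```python
import ast

# The four parameter names come from the description above.
EXPECTED_KEYS = ("chosenDomain", "chosenLLM", "initList", "selNodes")


def parse_config(text):
    """Parse `name = value` lines into a dict (assumed file format)."""
    config = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue  # skip blank or unrecognised lines
        name, raw = line.split("=", 1)
        raw = raw.strip()
        try:
            value = ast.literal_eval(raw)
        except (ValueError, SyntaxError):
            value = raw  # keep unquoted text as a plain string
        config[name.strip()] = value
    return config


def missing_keys(config):
    """List the expected parameters that the config does not define."""
    return [key for key in EXPECTED_KEYS if key not in config]


# Placeholder values only; they are not taken from the repository.
SAMPLE = """
chosenDomain = "exampleDomain"
chosenLLM = "exampleLLM"
initList = ["EntityA", "EntityB"]
"""

config = parse_config(SAMPLE)
print(config["initList"])    # ['EntityA', 'EntityB']
print(missing_keys(config))  # ['selNodes']
```

The sample deliberately omits selNodes, which corresponds to the case where KonTest generates the list of nodes itself.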

kgConstruct.py

It constructs the knowledge base for the selected list of entities (nodes) and domain.

nodeSelection.py

Takes the knowledge graphs (KGs) associated with the initial set of nodes and generates the list of knowledge paths that KonTest will use for this iteration.
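As a rough illustration of path selection, the sketch below enumerates root-to-leaf paths in a toy hierarchy and samples some of them at random. It is not the logic of nodeSelection.py: the graph layout (a dict mapping each node to its children), the node names, and the functions root_to_leaf_paths and select_paths are assumptions made for the example, and it uses only the standard library rather than bigtree or networkx.

```python
import random

# Toy hierarchy: each key maps a node to its children (placeholder names).
TOY_GRAPH = {
    "root": ["a", "b"],
    "a": ["a1", "a2"],
    "b": ["b1"],
}


def root_to_leaf_paths(graph, node):
    """Return every path from `node` down to a leaf, as lists of node names."""
    children = graph.get(node, [])
    if not children:
        return [[node]]
    paths = []
    for child in children:
        for tail in root_to_leaf_paths(graph, child):
            paths.append([node] + tail)
    return paths


def select_paths(graph, root, count, seed=None):
    """Randomly pick up to `count` distinct root-to-leaf paths."""
    paths = root_to_leaf_paths(graph, root)
    rng = random.Random(seed)  # a fixed seed makes the selection repeatable
    return rng.sample(paths, min(count, len(paths)))
```

For example, select_paths(TOY_GRAPH, "root", 2, seed=0) returns two of the three available paths. Passing a seed is one way to make a random selection reproducible; storing the chosen paths, as selNodes does in CONFIG.txt, is another.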

llmGen.py

Takes the paths found and generates queries according to the specified template. It then sends the queries to the chosen LLM and stores the responses.
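The template-then-query flow might look roughly like the sketch below. The template text is hypothetical (the template KonTest actually uses is not shown in this README), the path is assumed to run from the most general node to the most specific, and stub_llm stands in for a real model call so that the example runs offline.

```python
# Hypothetical template; not the one used by llmGen.py.
TEMPLATE = "Is {child} a type of {parent}? Answer yes or no."


def build_queries(path):
    """Create one query per consecutive (parent, child) pair along a path."""
    return [
        TEMPLATE.format(parent=parent, child=child)
        for parent, child in zip(path, path[1:])
    ]


def collect_responses(queries, ask):
    """Send each query through `ask` and pair it with the stripped reply."""
    return [(query, ask(query).strip()) for query in queries]


def stub_llm(prompt):
    """Stand-in for a real model call (assumption: returns plain text)."""
    return " yes "


queries = build_queries(["animal", "mammal", "dog"])
records = collect_responses(queries, stub_llm)
print(records[0])  # ('Is mammal a type of animal? Answer yes or no.', 'yes')
```

Passing the model as a plain callable keeps the query logic independent of the backend, so the same loop could wrap a Gemini, OpenAI, or locally installed model.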

errorFinder.py

Takes the queries and the associated responses and outputs both the number of valid tests and the number of errors found for each error type for the chosen LLM.
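The counting step can be sketched as follows. The definitions used here are assumptions, since this README does not spell them out: a test counts as valid when every reply can be read as yes or no, a metamorphic error is recorded when replies to queries assumed to be equivalent disagree with each other, and an ontological error is recorded when any reply contradicts the knowledge base. The function names and the sample data are hypothetical.

```python
def normalise(answer):
    """Map a free-text reply to 'yes', 'no', or None when it is neither."""
    words = answer.strip().lower().split()
    first = words[0].strip(".,!") if words else ""
    return first if first in ("yes", "no") else None


def summarise(tests):
    """Count valid tests and errors of each type.

    Each test is (expected, answers): the knowledge-base answer and the
    model's replies to queries assumed to be semantically equivalent.
    """
    counts = {"valid": 0, "metamorphic": 0, "ontological": 0}
    for expected, answers in tests:
        labels = [normalise(a) for a in answers]
        if not labels or None in labels:
            continue  # unreadable reply: treated as an invalid test here
        counts["valid"] += 1
        if len(set(labels)) > 1:
            counts["metamorphic"] += 1
        if any(label != expected for label in labels):
            counts["ontological"] += 1
    return counts


tests = [
    ("yes", ["Yes.", "yes"]),           # consistent and correct
    ("yes", ["Yes", "No"]),             # replies disagree; one contradicts the KB
    ("no", ["yes", "yes"]),             # consistent but contradicts the KB
    ("yes", ["I am not sure", "yes"]),  # unreadable reply, test is skipped
]
print(summarise(tests))  # {'valid': 3, 'metamorphic': 1, 'ontological': 2}
```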

License

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Acknowledgment

This work was partially funded by grant number SMU-SUTD 2023_02_04 and the Singapore Ministry of Education (MOE) President’s Graduate Fellowship. Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not reflect the views of the respective funding agencies.

Citing KonTest

@inproceedings{rajan2024knowledge,
  title={Knowledge-based Consistency Testing of Large Language Models},
  author={Rajan, Sai Sathiesh and Soremekun, Ezekiel and Chattopadhyay, Sudipta},
  booktitle={Findings of the Association for Computational Linguistics: EMNLP 2024},
  year={2024}
}
