Official repo of the ACL 2025 Findings paper: Safeguarding RAG Pipelines with GMTP: A Gradient-based Masked Token Probability Method for Poisoned Document Detection.
We also provide 200 poisoned test samples generated with PoisonedRAG, Phantom, and Adversarial Decoding.
Download the required BEIR datasets (NQ, HotpotQA, MS MARCO).
python download_datasets.py
Build and run the Docker image.
docker build -t gmtp .
docker run --rm --gpus all -it -v $(pwd):/app -w /app gmtp
We recommend using Docker because Pyserini depends on Java 21, but you may instead follow the steps below.
conda create -n gmtp python=3.10
conda activate gmtp
pip install -r requirements.txt
pip install -e ./beir
Convert the BEIR datasets into a fixed format.
bash scripts/run_convert_dataset.sh
Convert the poisoned datasets into the same fixed format (an example invocation is sketched after the command below).
- `attack`: Attack method (`poisonedrag` / `phantom` / `advdecoding`).
- `total`: Total number of poisoned documents. For the current setting, keep it at 200, since only 200 poisoned documents are provided.
bash scripts/run_convert_poisoned_dataset.sh
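For illustration only, a hypothetical way to convert all three provided attack sets in one go, assuming the script picks up `attack` and `total` from the environment (the scripts may instead define these variables internally; check `scripts/run_convert_poisoned_dataset.sh` for its actual interface):

```bash
# Hypothetical sketch: passing parameters as environment variables is an
# assumption, not the repository's documented interface.
for attack in poisonedrag phantom advdecoding; do
  attack="$attack" total=200 bash scripts/run_convert_poisoned_dataset.sh
done
```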
Run Pyserini to index the clean / attacked documents (a sketch of the required run combinations follows the command below).
- `is_attack`: Whether to index clean or poisoned documents. Run `is_attack=False` once for each `dataset`, and run `is_attack=True` once for every combination of `dataset` and `attack` to blend the poisoned documents into the clean ones.
- `dataset`: Target dataset (`nq` / `hotpotqa` / `msmarco`).
- `encoder_class`: Retriever (`dpr` / `contriever`).
bash scripts/run_faiss_indexing.sh
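The required clean/poisoned combinations described above can be sketched as nested loops. This is only an illustration under the assumption that parameters are passed via environment variables; adapt it to however `scripts/run_faiss_indexing.sh` actually sets `is_attack`, `dataset`, `attack`, and `encoder_class`:

```bash
# Sketch only: the environment-variable passing style is an assumption;
# edit the script's own variables if that is how it is configured.
for dataset in nq hotpotqa msmarco; do
  # Clean corpus: run is_attack=False once per dataset.
  is_attack=False dataset="$dataset" encoder_class=dpr \
    bash scripts/run_faiss_indexing.sh
  # Poisoned corpus: run is_attack=True once per (dataset, attack) combination.
  for attack in poisonedrag phantom advdecoding; do
    is_attack=True dataset="$dataset" attack="$attack" encoder_class=dpr \
      bash scripts/run_faiss_indexing.sh
  done
done
```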
Now compute the average masked token probability of the knowledge base (an example invocation follows the command below).
- `retriever`: Retriever (`dpr` / `contriever`).
- `N`: Maximum number of potential cheating tokens.
- `M`: Number of tokens actually considered.
- `total_samples`: Number of documents used for the average gradient calculation ($K$ in the paper).
- `use_random_doc`: Whether to use random documents for a single query or the relevant documents.
- `include_poison`: Whether to sample the `total_samples` documents from the poisoned knowledge base.
bash scripts/run_get_avg_mask_probs.sh
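As an illustration, assuming the same environment-variable style, one might run it as below; the concrete values are placeholders, not the settings from the paper:

```bash
# Placeholder values and a hypothetical parameter mechanism; check
# scripts/run_get_avg_mask_probs.sh for its actual configuration.
retriever=contriever N=10 M=5 total_samples=1000 \
use_random_doc=False include_poison=False \
  bash scripts/run_get_avg_mask_probs.sh
```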
Now we are ready to run GMTP against various attacks! Run the command below to reproduce the results; an example invocation with explicit parameters is sketched after the parameter descriptions. Alternatively, you may simply use the code in `src/defenses/method/GMTP` for your own work.
bash scripts/run_main.sh
- `retrieval_only`: Whether to run only the retrieval phase or continue through the generation phase.
- `latency_check`: Whether to measure latency.
- `debug`: Debug mode, using only 10 test samples.
- `api_key`: Your OpenAI API key.
- `defense_method`: Defense baselines (`GMTP`, `PPL`, `L2`).
- `reranker`: MLM used as the reranker. Can be either `bert` or `roberta`.
- `remove_threshold`: Default removal threshold of GMTP. If higher than 0, this fixed threshold is used instead of the one calculated by `run_get_avg_mask_probs.sh`.
- `use_random_doc`: Whether to use the threshold calculated from a random corpus or from the related corpus.
- `retrieve_k`: top-2k.
- `rerank_k`: top-k.
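For example, a hypothetical run of the retrieval phase with GMTP and a BERT reranker (placeholder values; the environment-variable style is an assumption, so check `scripts/run_main.sh` for its actual interface):

```bash
# Placeholder values and a hypothetical parameter mechanism.
defense_method=GMTP reranker=bert retrieval_only=True debug=True \
retrieve_k=100 rerank_k=50 \
  bash scripts/run_main.sh
```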
- Our code uses and contains the BEIR benchmark, with the required library modifications.
- Our code uses Pyserini.
- The base template originated from monologg.
If you use this code, please cite our paper:
@inproceedings{kim2025safeguarding,
title={Safeguarding RAG Pipelines with GMTP: A Gradient-based Masked Token Probability Method for Poisoned Document Detection},
author={Kim, San and Kim, Jonghwi and Jeon, Yejin and Lee, Gary},
booktitle={Findings of the Association for Computational Linguistics: ACL 2025},
pages={24597--24614},
year={2025}
}