Genii

Source code for our paper ''Mitigating Judgment Preference Bias in Large Language Models through Group-Based Polling''

arXiv HuggingFace-Model HuggingFace-Dataset

Overview

Genii is an unsupervised multi-agent collaborative optimization framework that encourages multiple LLM-based judgment models to interact with each other to mitigate their own judgment preference bias.
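As a simplified illustration of the polling intuition (not the optimization procedure implemented in this repository), several judge models can give a verdict on the same candidate answer and their verdicts can be aggregated, so that no single model's self-preference dominates. The judge names and verdicts below are placeholders.

# Toy sketch of the group-polling intuition: several judges give a verdict on
# the same candidate answer and the verdicts are aggregated by majority vote.
# This is an illustration only, not the procedure used by Genii.
from collections import Counter

def poll_judges(judgments: dict) -> str:
    """Aggregate per-judge verdicts ('win' / 'lose' / 'tie') by majority vote."""
    verdict, _ = Counter(judgments.values()).most_common(1)[0]
    return verdict

# Hypothetical verdicts from three judgment models on one candidate answer.
print(poll_judges({"llama3.1-8b": "win", "gemma2-9b": "win", "qwen2.5-7b": "tie"}))  # -> win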

Set Up

1. Use git clone to download this project and install the dependencies.

conda create -n Genii python=3.10.14
conda activate Genii
git clone https://github.com/NEUIR/Genii.git
cd Genii
pip install -r requirements.txt

2. Install LLaMA-Factory.

Refer to https://github.com/hiyouga/LLaMA-Factory for detailed instructions. We recommend creating a new environment for installing LLaMA-Factory, as some packages may be incompatible with your existing training environment.

conda create -n LLamafactory python=3.10.14
conda activate LLamafactory
git clone https://github.com/hiyouga/LLaMA-Factory.git
cd LLaMA-Factory
pip install -e ".[torch,metrics]"

Train Judgment Model

1. Data Preparation.

To construct the training dataset for the judgment model, follow Table 4 in the paper to collect the corresponding datasets, process the data into the following JSONL format, and put it into the data/raw_data folder (an example record is shown after the schema below). You can also collect additional datasets on your own and process them for training.

{
  "question": str,  # The question of the data.
  "answer": str     # The ground truth of the data.
}
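
For reference, a record in this format could be written as follows; the file name and contents are purely illustrative.

# Append one example record in the expected raw-data format.
# File name and field contents are illustrative, not part of the released data.
import json

record = {
    "question": "What is the capital of France?",  # the question of the data
    "answer": "Paris",                             # the ground truth of the data
}

with open("data/raw_data/example.jsonl", "a", encoding="utf-8") as f:
    f.write(json.dumps(record, ensure_ascii=False) + "\n")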

After this, you can use the following scripts to synthesize training data; a sketch of the underlying sampling step is shown after the commands. You will need to download the Llama3.1-8B-Instruct, Gemma2-9B-Instruct, and Qwen2.5-7B-Instruct models in advance. Our dataset contains two types of data, QA data and instruction data, which need to be processed separately.

cd script
bash gen_answer_ins.sh
bash gen_answer_qa.sh
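
The two scripts above prompt the downloaded instruct models to sample candidate answers for each question. A minimal sketch of that sampling step with Hugging Face transformers is shown below; the model path, prompts, and decoding settings used by gen_answer_ins.sh and gen_answer_qa.sh may differ.

# Minimal sketch of sampling a candidate answer with an instruct model via
# Hugging Face transformers. The released scripts may use different model
# paths, prompts, and decoding settings.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "meta-llama/Llama-3.1-8B-Instruct"  # or a local download path
tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path, torch_dtype=torch.bfloat16, device_map="auto"
)

question = "What is the capital of France?"  # one record from data/raw_data
messages = [{"role": "user", "content": question}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)

outputs = model.generate(inputs, max_new_tokens=256, do_sample=True, temperature=0.7)
answer = tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True)
print(answer)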

Then, merge the data and compute consistency scores with the MiniCPM-Embedding model to construct the DPO training data; a sketch of the consistency-score computation follows the commands below. You can also download the processed data directly from here.

cd ..
cd src

# for Instruction data
python merge.py
python embedding.py
python revert_dpo_data_ins.py

# for QA data
python merge.py
python embedding.py
python revert_dpo_data_qa.py

# merge Instruction and QA data
python dpo_data_merge.py
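
In this pipeline, embedding.py encodes the sampled answers and a consistency score is derived from how similar each answer is to the others; these scores are then used by the revert_dpo_data scripts to build DPO preference pairs. The sketch below shows one generic way to compute such a cosine-similarity consistency score with mean pooling; the loading and pooling recommended on the MiniCPM-Embedding model card, and the exact scoring in the released scripts, may differ.

# Generic sketch of a cosine-similarity "consistency" score between answers,
# using a text-embedding model with mean pooling. The pooling and scoring used
# by embedding.py / revert_dpo_data_*.py may differ.
import torch
import torch.nn.functional as F
from transformers import AutoModel, AutoTokenizer

model_path = "openbmb/MiniCPM-Embedding"  # assumed model id or local path
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModel.from_pretrained(model_path, trust_remote_code=True).eval()

def embed(texts: list[str]) -> torch.Tensor:
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**batch).last_hidden_state          # (B, T, H)
    mask = batch["attention_mask"].unsqueeze(-1).float()   # (B, T, 1)
    pooled = (hidden * mask).sum(1) / mask.sum(1)          # mean pooling
    return F.normalize(pooled, dim=-1)

answers = ["Paris is the capital of France.",
           "The capital of France is Paris.",
           "France's capital city is Lyon."]
emb = embed(answers)
sim = emb @ emb.T                                     # pairwise cosine similarities
consistency = (sim.sum(1) - 1) / (len(answers) - 1)   # mean similarity to the other answers
print(consistency)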

2. Train Model.

You can quickly train the judgment model with the LLaMA-Factory framework; we provide the YAML configuration files. Please refer to LLaMA-Factory for the relevant environment installation and configuration.

cd LLaMA-Factory

You can also download the checkpoints of Gemma2-9B, Qwen2.5-7B, and Llama3.1-8B directly.

Evaluation

1. Prepare the test data.

You can download the test data directly from here and put it into the data/eval folder.

2. Evaluation.

To evaluate judgment accuracy, you can use the following code (a minimal sketch of the underlying accuracy computation follows the commands):

cd script
bash eval_acc.sh
cd ..
cd src
python calculate_acc.py
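
In essence, calculate_acc.py compares each predicted verdict against the gold label over the evaluation outputs. A minimal sketch of that comparison is shown below; the file path and the prediction/label field names are assumptions, and the released script may use a different schema.

# Minimal sketch of computing judgment accuracy from a JSONL file of results.
# The file path and the "prediction" / "label" field names are assumptions.
import json

correct = total = 0
with open("data/eval/judgment_results.jsonl", encoding="utf-8") as f:
    for line in f:
        record = json.loads(line)
        total += 1
        correct += int(record["prediction"] == record["label"])

print(f"judgment accuracy: {correct / total:.4f}")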

To evaluate Harmful Self-Preference Propensity (HSPP), you can use the following code:

cd script
bash eval_hspp.sh
cd ..
cd src
python calculate_hspp.py

Contact

If you find this work useful, please cite our paper and give us a shining star 🌟

@article{Liu2025Mitigating,
      title={Mitigating Judgment Preference Bias in Large Language Models through Group-Based Polling}, 
      author={Shuliang Liu and Zhipeng Xu and Zhenghao Liu and Yukun Yan and Minghe Yu and Yu Gu and Chong Chen and Huiyuan Xie and Ge Yu},
      year={2025},
      eprint={2510.08145},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2510.08145}, 
}

If you have questions, suggestions, or bug reports, please email:

2472026@stu.neu.edu.cn
