Official implementation for BiasScope: Towards Automated Detection of Bias in LLM-as-a-Judge Evaluation (ICLR 2026).
LLM-as-a-judge is widely used, but evaluation bias undermines its reliability. BiasScope is an LLM-driven framework that automatically discovers potential biases at scale, moving bias discovery from manual, predefined lists toward active, automated exploration. The method is validated on JudgeBench; the paper also introduces JudgeBench-Pro, a harder benchmark for judge robustness under controlled bias interference.
├── attack_judge_and_analysis.py # Main discovery pipeline
├── synthesis_bias_verification.py # Per-bias error rates & library updates
├── bias_detector.py # Bias classification + merge
├── prompts.py # All LLM prompts
├── utils.py # CLI, vLLM batching, data helpers
├── run_biasscope.sh # Example end-to-end driver
├── data/ # Bias JSON + example parquet layout
├── requirements.txt
└── README.md
Create a virtual environment, activate it, then install dependencies.
venv
python3 -m venv .venv
source .venv/bin/activate # Windows: .venv\Scripts\activate
pip install -r requirements.txt

Conda
conda create -n biasscope python=3.11
conda activate biasscope
pip install -r requirements.txt

If torch / vLLM wheels fail, use the vLLM install guide and a matching PyTorch CUDA index.
From anywhere:
bash run_biasscope.sh

The script changes to the repository root, then for each entry in ANALYSIS_DATASETS × JUDGE_MODELS (and optional repeats) runs Stage 1 and Stage 2.
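The dataset × judge loop can be sketched in Python as a command builder. This is a minimal sketch, not the contents of run_biasscope.sh: the function name `build_commands`, its parameters, and the loop nesting (datasets outermost, then repeats, then judges) are assumptions made for illustration. Only the script names and flags are taken from the commands in this README. It builds argument lists without running anything and does not model any bias library updates between stages.

```python
import shlex


def build_commands(analysis_datasets, judge_models, *, bias_json, test_data,
                   teacher_model_path, tp_size=2, batch_size=64, repeat=1,
                   python="python3", extra_args=()):
    """Return Stage 1 / Stage 2 argv lists for every dataset x judge pair.

    Hypothetical helper: the real driver script may order or group runs differently.
    """
    # Flags shared by both stages; extra_args is appended to both scripts.
    shared_tail = [
        "--teacher-model-path", teacher_model_path,
        "--self-defined-tp-size", str(tp_size),
        "--batch-size", str(batch_size),
        *extra_args,
    ]
    commands = []
    for dataset in analysis_datasets:
        for _ in range(repeat):          # assumed: repeats wrap the judge loop
            for judge in judge_models:
                head = ["--bias-json", bias_json, "--analysis-data-path", dataset]
                stage1 = [python, "attack_judge_and_analysis.py", *head,
                          "--model-path", judge, *shared_tail]
                stage2 = [python, "synthesis_bias_verification.py", *head,
                          "--test-data-path", test_data,
                          "--model-path", judge, *shared_tail]
                commands += [stage1, stage2]
    return commands


# Print what would run for one dataset and two judges (paths are placeholders).
for argv in build_commands(
    ["data/rewardbench/rewardbench_filtered.parquet"],
    ["/path/to/judge_a", "/path/to/judge_b"],
    bias_json="data/bias/basic_biases.json",
    test_data="data/judgeBench/judge_bench.parquet",
    teacher_model_path="/path/to/teacher",
):
    print(shlex.join(argv))
```

With D datasets, J judges, and R repeats, the sketch yields 2 × D × J × R commands, which gives a quick estimate of how long a full sweep may take.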
Edit run_biasscope.sh (# --- User config ---):
| Variable | Purpose |
|---|---|
| CUDA_VISIBLE_DEVICES | GPU ids visible to the run (default 0 if unset). |
| PYTHON | Python executable (default python3). |
| BATCH_SIZE | vLLM batch size for generation. |
| TP_SIZE | Tensor parallel for teacher and judge (--self-defined-tp-size). |
| BIAS_JSON | Seed bias library path. |
| ANALYSIS_DATASETS | Bash array of Stage 1 parquet paths (relative to repo root or absolute). |
| TEST_DATA | Stage 2 verification parquet. |
| TEACHER_MODEL_PATH | Teacher model (HF id or local path); can be set in the script or exported in the shell. |
| JUDGE_MODELS | Bash array of judge checkpoints (one full pipeline per model). |
| REPEAT | Repeat the inner judge loop on the same data (1 = single pass). |
| DRY_RUN | Set to 1 to print commands without running Python. |
| EXTRA_ARGS | Optional extra CLI flags passed to both Python scripts (e.g. --teacher-backend api and API credentials). |
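When the driver is launched from a scheduler or notebook instead of an interactive shell, the scalar variables can be supplied through the process environment. The sketch below is a hedged illustration: `biasscope_env` and `launch` are hypothetical helper names, and it assumes the script honors CUDA_VISIBLE_DEVICES, TEACHER_MODEL_PATH, and DRY_RUN from the environment, as the shell override examples in this README suggest. Bash arrays cannot be exported through the environment, so ANALYSIS_DATASETS, JUDGE_MODELS, and EXTRA_ARGS still have to be edited in the script.

```python
import os
import subprocess


def biasscope_env(overrides, base_env=None):
    """Copy the environment and apply overrides, converting values to strings."""
    env = dict(os.environ if base_env is None else base_env)
    env.update({key: str(value) for key, value in overrides.items()})
    return env


def launch(overrides, script="run_biasscope.sh"):
    """Run the driver with the given overrides; raises if it exits non-zero."""
    return subprocess.run(["bash", script], env=biasscope_env(overrides), check=True)


# Example call (not executed here, since it needs the repository checkout):
# launch({"CUDA_VISIBLE_DEVICES": "0,1",
#         "TEACHER_MODEL_PATH": "/path/to/teacher",
#         "DRY_RUN": 1})
```

Starting with DRY_RUN set to 1 is a cheap way to confirm that the overrides reach the script before committing GPU time.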
You can override several options without editing the file:
CUDA_VISIBLE_DEVICES=0,1 TEACHER_MODEL_PATH=/path/to/teacher bash run_biasscope.sh
DRY_RUN=1 bash run_biasscope.sh # print-only dry run

python attack_judge_and_analysis.py \
--bias-json data/bias/basic_biases.json \
--analysis-data-path data/rewardbench/rewardbench_filtered.parquet \
--model-path /path/to/judge \
--teacher-model-path /path/to/teacher \
--self-defined-tp-size 2 \
--batch-size 64 \
--detection-mode 2

Use a held-out parquet for --test-data-path (e.g. local JudgeBench / MMLU-style exports, or JudgeBench-Pro saved as Parquet).
python synthesis_bias_verification.py \
--bias-json data/bias/basic_biases.json \
--analysis-data-path data/rewardbench/rewardbench_filtered.parquet \
--test-data-path data/judgeBench/judge_bench.parquet \
--model-path /path/to/judge \
--teacher-model-path /path/to/teacher \
--self-defined-tp-size 2 \
--batch-size 64

Use an OpenAI-compatible server for the teacher only (judge remains vLLM). From the shell:
python attack_judge_and_analysis.py \
--teacher-backend api \
--api-key "$OPENAI_API_KEY" \
--base-url "https://api.openai.com/v1" \
--teacher-model "gpt-4o" \
...other flags...

To use the same flags in run_biasscope.sh, set the EXTRA_ARGS array (see the commented example in that file).
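If the pipeline is driven from Python rather than the shell, the teacher API flags can be assembled in one place. This is a small sketch under stated assumptions: `teacher_api_args` is a hypothetical helper, not part of the repository, and it uses only the four flags shown in the command above. It reads the key from an environment variable and fails early when the variable is missing.

```python
import os


def teacher_api_args(model, base_url, key_env="OPENAI_API_KEY"):
    """Build the teacher-backend flags, reading the API key from the environment."""
    key = os.environ.get(key_env)
    if not key:
        # Fail before any model is loaded instead of partway through a run.
        raise RuntimeError(f"{key_env} is not set")
    return [
        "--teacher-backend", "api",
        "--api-key", key,
        "--base-url", base_url,
        "--teacher-model", model,
    ]
```

The returned list can be appended to either script's argument list, for example `["python3", "attack_judge_and_analysis.py", *teacher_api_args("gpt-4o", "https://api.openai.com/v1")]` followed by the remaining flags.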
See python attack_judge_and_analysis.py --help for all options.
If you use this code or build on BiasScope, please cite:
@article{lai2026biasscope,
title={BiasScope: Towards Automated Detection of Bias in LLM-as-a-Judge Evaluation},
author={Lai, Peng and Ou, Zhihao and Wang, Yong and Wang, Longyue and Yang, Jian and Chen, Yun and Chen, Guanhua},
journal={arXiv preprint arXiv:2602.09383},
year={2026}
}