BiasScope

Official implementation for BiasScope: Towards Automated Detection of Bias in LLM-as-a-Judge Evaluation (ICLR 2026).

LLM-as-a-judge evaluation is widely used, but biases in the judge undermine its reliability. BiasScope is an LLM-driven framework that automatically discovers potential biases at scale, moving bias discovery from manual, predefined lists toward active, automated exploration. The method is validated on JudgeBench, and the paper also introduces JudgeBench-Pro, a harder benchmark for testing judge robustness under controlled bias interference.

Repository structure

├── attack_judge_and_analysis.py   # Main discovery pipeline
├── synthesis_bias_verification.py # Per-bias error rates & library updates
├── bias_detector.py               # Bias classification + merge
├── prompts.py                     # All LLM prompts
├── utils.py                       # CLI, vLLM batching, data helpers
├── run_biasscope.sh               # Example end-to-end driver
├── data/                          # Bias JSON + example parquet layout
├── requirements.txt               # Python dependencies
└── README.md

Requirements

Create a virtual environment, activate it, then install dependencies.

venv

python3 -m venv .venv
source .venv/bin/activate   # Windows: .venv\Scripts\activate
pip install -r requirements.txt

Conda

conda create -n biasscope python=3.11
conda activate biasscope
pip install -r requirements.txt

If the torch or vLLM wheels fail to install, follow the vLLM installation guide and install PyTorch from the index that matches your CUDA version.

Quick start

The driver can be invoked from any directory (give the path to the script if you are not in the repository root):

bash run_biasscope.sh

The script first changes to the repository root. Then, for each combination in ANALYSIS_DATASETS × JUDGE_MODELS (repeated REPEAT times if set), it runs Stage 1 and Stage 2.
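The grid the driver walks can be pictured with a short Python sketch. This is an illustration only, not the script's actual code: the helper name `plan_runs`, the loop nesting (datasets outermost, then repeats, then judges), and the reduced flag set are assumptions based on the variable descriptions and the Stage 1 and Stage 2 commands in this README.

```python
def plan_runs(analysis_datasets, judge_models, test_data, repeat=1, python="python3"):
    """Return (stage, argv) pairs for the dataset x repeat x judge grid.

    Hypothetical helper: run_biasscope.sh may order or build its commands
    differently. Flags such as --bias-json and --teacher-model-path are
    omitted here for brevity.
    """
    runs = []
    for dataset in analysis_datasets:
        for _ in range(repeat):          # REPEAT re-runs the inner judge loop
            for judge in judge_models:   # one full pipeline per judge model
                common = ["--analysis-data-path", dataset, "--model-path", judge]
                runs.append(("stage1", [python, "attack_judge_and_analysis.py", *common]))
                runs.append(("stage2", [python, "synthesis_bias_verification.py", *common,
                                        "--test-data-path", test_data]))
    return runs


if __name__ == "__main__":
    for stage, argv in plan_runs(["data/a.parquet"], ["/path/to/judge"], "data/test.parquet"):
        print(stage, " ".join(argv))
```

With 2 datasets, 3 judges, and `repeat=2`, this plan contains 12 pipeline runs, that is, 24 Python invocations, which is a quick way to estimate GPU time before launching.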

Edit the "# --- User config ---" section of run_biasscope.sh:

Variable	Purpose
`CUDA_VISIBLE_DEVICES`	GPU ids visible to the run (default `0` if unset).
`PYTHON`	Python executable (default `python3`).
`BATCH_SIZE`	vLLM batch size for generation.
`TP_SIZE`	Tensor-parallel size for both teacher and judge (`--self-defined-tp-size`).
`BIAS_JSON`	Seed bias library path.
`ANALYSIS_DATASETS`	Bash array of Stage 1 parquet paths (relative to repo root or absolute).
`TEST_DATA`	Stage 2 verification parquet.
`TEACHER_MODEL_PATH`	Teacher model (HF id or local path); can be set in the script or exported in the shell.
`JUDGE_MODELS`	Bash array of judge checkpoints (one full pipeline per model).
`REPEAT`	Number of times to repeat the inner judge loop on the same data (`1` = single pass).
`DRY_RUN`	Set to `1` to print commands without running Python.
`EXTRA_ARGS`	Optional extra CLI flags passed to both Python scripts (e.g. `--teacher-backend api` and API credentials).

You can override several options from the environment without editing the file:

CUDA_VISIBLE_DEVICES=0,1 TEACHER_MODEL_PATH=/path/to/teacher bash run_biasscope.sh
DRY_RUN=1 bash run_biasscope.sh   # print-only dry run
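The override behaviour amounts to "use the environment value if present, else a default", plus a print-only switch. The Python sketch below mirrors that idea under stated assumptions: `resolve_config` and `run_or_print` are hypothetical names, the defaults for `CUDA_VISIBLE_DEVICES` and `PYTHON` come from the table above, and treating an unset `DRY_RUN` as "run for real" is an assumption about the script.

```python
import shlex
import subprocess


def resolve_config(env):
    """Pick up overrides from an environment mapping, falling back to defaults."""
    return {
        "CUDA_VISIBLE_DEVICES": env.get("CUDA_VISIBLE_DEVICES", "0"),
        "PYTHON": env.get("PYTHON", "python3"),
        # Assumption: anything other than "1" means commands are executed.
        "DRY_RUN": env.get("DRY_RUN", "0") == "1",
    }


def run_or_print(argv, dry_run):
    """Print the command in dry-run mode, otherwise execute it."""
    if dry_run:
        print(shlex.join(argv))
        return None
    return subprocess.run(argv, check=True).returncode
```

Calling `resolve_config(os.environ)` after `import os` would read the real shell environment; passing a plain dict keeps the sketch easy to test.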

Stage 1 — Attack, judge, bias analysis

python attack_judge_and_analysis.py \
  --bias-json data/bias/basic_biases.json \
  --analysis-data-path data/rewardbench/rewardbench_filtered.parquet \
  --model-path /path/to/judge \
  --teacher-model-path /path/to/teacher \
  --self-defined-tp-size 2 \
  --batch-size 64 \
  --detection-mode 2

Stage 2 — Synthesis & verification (error rates per bias)

Use a held-out parquet for --test-data-path (e.g. local JudgeBench / MMLU-style exports, or JudgeBench-Pro saved as Parquet).

python synthesis_bias_verification.py \
  --bias-json data/bias/basic_biases.json \
  --analysis-data-path data/rewardbench/rewardbench_filtered.parquet \
  --test-data-path data/judgeBench/judge_bench.parquet \
  --model-path /path/to/judge \
  --teacher-model-path /path/to/teacher \
  --self-defined-tp-size 2 \
  --batch-size 64
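Stage 2 reports an error rate for each bias. As a rough illustration of that quantity, and not the repository's implementation, the sketch below assumes each verification record names the bias that was injected and whether the judge still chose the correct response. The field names `bias` and `judge_correct` and the example bias names are hypothetical.

```python
from collections import defaultdict


def per_bias_error_rates(records):
    """Return {bias name: fraction of records where the judge was wrong}."""
    totals = defaultdict(int)
    errors = defaultdict(int)
    for rec in records:
        totals[rec["bias"]] += 1
        if not rec["judge_correct"]:
            errors[rec["bias"]] += 1
    return {bias: errors[bias] / totals[bias] for bias in totals}


if __name__ == "__main__":
    # Illustrative records only; real outputs will have their own schema.
    demo = [
        {"bias": "length", "judge_correct": True},
        {"bias": "length", "judge_correct": False},
        {"bias": "authority", "judge_correct": True},
    ]
    print(per_bias_error_rates(demo))
```

A bias whose error rate on the held-out data is clearly above the judge's baseline error rate is the kind of candidate one would expect to be kept in the library; the exact acceptance rule is defined by the scripts, not by this sketch.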

API teacher (optional)

Use an OpenAI-compatible server for the teacher only (judge remains vLLM). From the shell:

python attack_judge_and_analysis.py \
  --teacher-backend api \
  --api-key "$OPENAI_API_KEY" \
  --base-url "https://api.openai.com/v1" \
  --teacher-model "gpt-4o" \
  ...other flags...

To use the same flags in run_biasscope.sh, set the EXTRA_ARGS array (see the commented example in that file).

See python attack_judge_and_analysis.py --help for all options.

Citation

If you use this code or build on BiasScope, please cite:

@article{lai2026biasscope,
  title={BiasScope: Towards Automated Detection of Bias in LLM-as-a-Judge Evaluation},
  author={Lai, Peng and Ou, Zhihao and Wang, Yong and Wang, Longyue and Yang, Jian and Chen, Yun and Chen, Guanhua},
  journal={arXiv preprint arXiv:2602.09383},
  year={2026}
}
