This repository provides code and datasets for the paper "Neuron-Level Analysis of Cultural Understanding in Large Language Models". We implement CULNIG (Culture Neuron Identification Pipeline with Gradient-based Scoring), a method to identify culture-general and culture-specific neurons in LLMs via gradient-based scoring.
- Install uv (see: https://docs.astral.sh/uv/getting-started/installation/)
- Create a virtual environment and install dependencies:
  ```
  uv venv
  uv sync
  ```
- Adjust the `pyproject.toml` file to match your environment if necessary.
- Prepare the datasets used by `dataset.py`:
  - BLEnD: Download `US_questions.csv` from the BLEnD repo and place it under the `data/BLEnD/` directory (e.g., `data/BLEnD/US_questions.csv`).
  - WorldValuesBench: Follow the instructions in the WorldValuesBench repo, then place `question_metadata.json`, `full_demographic_qa.tsv`, and `full_value_qa.tsv` under `data/WorldValuesBench/`.
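To sanity-check the layout before running the pipeline, a small helper like the following can list any expected dataset files that are still missing. This script is not part of the repo; the paths mirror the instructions above.

```python
from pathlib import Path

# Hypothetical helper (not part of the repo): verify the expected
# dataset files are in place under the project root.
REQUIRED = {
    "data/BLEnD": ["US_questions.csv"],
    "data/WorldValuesBench": ["question_metadata.json",
                              "full_demographic_qa.tsv",
                              "full_value_qa.tsv"],
}

def missing_files(root="."):
    """Return the relative paths of required dataset files not found under root."""
    root = Path(root)
    return [f"{d}/{f}" for d, files in REQUIRED.items()
            for f in files if not (root / d / f).exists()]
```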
- `CULNIG/calc_neuron_score.py`: Compute neuron scores on a target dataset using gradient-based scoring.
  - Example: `uv run python CULNIG/calc_neuron_score.py --model_name <model_name> --dataset_names blend`
  - Available models: `google/gemma-3-12b-it`, `google/gemma-3-27b-it`, `Qwen/Qwen3-14B`, `meta-llama/Llama-3.1-8B-Instruct`, `microsoft/phi-4`, `tiiuae/Falcon3-10B-Instruct`. You can add other models by extending the code.
  - To run the full pipeline, compute neuron scores for both `blend` and `blendcontrol`. Neuron scores for CountryRC are computed every time.
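For intuition, here is a minimal toy sketch of gradient-based neuron scoring: each hidden unit of a tiny one-layer network is scored by |activation × gradient of the loss with respect to that activation|, a common attribution rule assumed here for illustration. The scoring CULNIG actually uses lives in `CULNIG/calc_neuron_score.py` and may differ.

```python
# Toy one-hidden-layer network; its 4 hidden units stand in for "neurons".
# Assumed scoring rule: |activation * dL/d(activation)| (illustrative only).
W1 = [[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3], [0.4, 0.4]]  # input -> hidden
W2 = [0.6, -0.5, 0.9, 0.2]                               # hidden -> output

def neuron_scores(x, y):
    z = [sum(w * xi for w, xi in zip(row, x)) for row in W1]
    h = [max(v, 0.0) for v in z]                 # ReLU activations
    y_hat = sum(w * hi for w, hi in zip(W2, h))  # scalar prediction
    dL_dyhat = y_hat - y                         # from L = 0.5*(y_hat - y)**2
    dL_dh = [w * dL_dyhat for w in W2]           # backprop one step to h
    return [abs(hi * g) for hi, g in zip(h, dL_dh)]

scores = neuron_scores([1.0, 1.0], 1.0)
top = sorted(range(4), key=scores.__getitem__, reverse=True)  # ranked neurons
```

Note that a neuron whose ReLU activation is zero necessarily gets score zero under this rule, which is one reason control datasets (like `blendcontrol`) matter for separating genuinely culture-related neurons from generally active ones.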
- `CULNIG/decide_culture_general_neurons.py`: Identify culture-general neurons based on the computed scores.
  - Example: `uv run python CULNIG/decide_culture_general_neurons.py --model_name <model_name> --dataset_names blend`
  - Run this script after `CULNIG/calc_neuron_score.py` has been run on both `blend` and `blendcontrol`, as it uses scores from both datasets to identify culture-general neurons.
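As a rough illustration of how scores from the two datasets might be combined (the script's actual criterion may differ; check `decide_culture_general_neurons.py`), one simple rule keeps neurons that rank highly on `blend` but not on `blendcontrol`:

```python
def top_ids(scores, k):
    """Indices of the k highest-scoring neurons."""
    return set(sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k])

def culture_general(blend_scores, control_scores, k):
    """Illustrative rule only: neurons in the top-k on blend
    but not in the top-k on the control dataset."""
    return sorted(top_ids(blend_scores, k) - top_ids(control_scores, k))
```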
- `CULNIG/decide_culture_specific_neuron.py`: Identify culture-specific neurons based on the computed scores.
  - Example: `uv run python CULNIG/decide_culture_specific_neuron.py --model_name <model_name> --dataset_names blend`
  - Run this script after `CULNIG/calc_neuron_score.py` has been run on both `blend` and `blendcontrol`, as it uses scores from both datasets to identify culture-specific neurons.
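Culture-specific selection can likewise be pictured with per-country score tables: a neuron is specific to a country if it ranks highly for that country and for no other. This is an illustrative rule, not necessarily the script's exact criterion.

```python
def top_ids(scores, k):
    """Indices of the k highest-scoring neurons."""
    return set(sorted(range(len(scores)), key=scores.__getitem__, reverse=True)[:k])

def culture_specific(per_country_scores, country, k):
    """Illustrative rule only: neurons in the top-k for `country`
    and in no other country's top-k."""
    target = top_ids(per_country_scores[country], k)
    others = set()
    for c, scores in per_country_scores.items():
        if c != country:
            others |= top_ids(scores, k)
    return sorted(target - others)
```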
- `CULNIG/decide_random_neuron.py`: Select random neurons as a baseline.
  - Example: `uv run python CULNIG/decide_random_neuron.py --model_name <model_name> --mlp_neuron_num <num> --attention_neuron_num 0`
  - Note: This script currently does not treat MLP and attention neurons separately; attention neurons are counted among the MLP neurons (to match CULNIG). You can modify the code to separate them if desired.
  - Run `CULNIG/calc_neuron_score.py` beforehand to populate the model architecture info used here.
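The random baseline amounts to uniform sampling over all (layer, neuron) positions. A minimal sketch, assuming `layer_sizes` describes the per-layer neuron counts taken from the model architecture info:

```python
import random

def sample_random_neurons(layer_sizes, n, seed=0):
    """Sample n distinct (layer, neuron_index) pairs uniformly across all
    positions; layer_sizes[l] is the number of neurons in layer l."""
    pool = [(l, i) for l, size in enumerate(layer_sizes) for i in range(size)]
    return random.Random(seed).sample(pool, n)
```

Fixing the seed makes the baseline reproducible across evaluation runs.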
- `eval/evaluate.py`: Evaluate a model on a dataset, with optional neuron manipulation.
  - Example: `uv run python eval/evaluate.py --model_name <model_name> --dataset_name <dataset_name> --neuron_file <path_to_neuron_file> --operation suppress`
  - To evaluate without neuron manipulation, omit `--neuron_file`.
  - Available datasets: `blend`, `culturalbench`, `normad`, `worldvaluesbench`, `countryrc`, `commonsenseqa`, `qnli`, `mrpc`
  - Operations: `suppress`, `enhance`
  - Results are saved under `outputs/`.
  - For BLEnD, the script evaluates all questions (both BLEnD_neur and BLEnD_test). To evaluate only BLEnD_test, modify the code to load test questions only (change `target_data='all'` to `'non_neuron'`).
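Conceptually, the two operations boil down to zeroing or scaling the selected neurons' activations during the forward pass. A minimal sketch on a plain list (in the repo this presumably happens inside the model, e.g., via forward hooks, and the enhancement factor here is illustrative):

```python
def manipulate(activations, neuron_ids, operation, factor=2.0):
    """Zero out (suppress) or scale up (enhance) the selected activations.
    `factor` is an illustrative choice, not the repo's actual value."""
    out = list(activations)
    for i in neuron_ids:
        out[i] = 0.0 if operation == "suppress" else out[i] * factor
    return out
```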
- `eval/evaluate_blend_saq.py`: Evaluate a model on BLEnD SAQs, with optional neuron manipulation.
  - Example: `uv run python eval/evaluate_blend_saq.py --model_name <model_name> --neuron_file <path_to_neuron_file> --operation suppress`
  - To evaluate without neuron manipulation, omit `--neuron_file`.
  - Operations: `suppress`, `enhance`
  - Results are saved under `outputs/blend_sqa`.
  - To judge correctness, we use the lemmatizers/stemmers/tokenizers of each language, following the original BLEnD paper and repo. The scripts we use for evaluation are in `eval/lemmatizers/` and can be run as `uv run python lemma.py --input_file <input_file>`; see the comments in each script for detailed usage.
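The correctness judgment can be pictured as normalize-then-compare. Below is a toy English-only stand-in for the per-language lemmatizers (the real scripts in `eval/lemmatizers/` are language-specific and more careful; `normalize` and `is_correct` are hypothetical names):

```python
import re

def normalize(text):
    """Toy English-only normalizer standing in for a real lemmatizer:
    lowercase, keep letter runs, strip a trailing plural 's'."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return {t[:-1] if len(t) > 3 and t.endswith("s") else t for t in tokens}

def is_correct(prediction, gold_answers):
    """Correct if some gold answer's normalized tokens all appear
    in the normalized prediction."""
    pred = normalize(prediction)
    return any(normalize(g) <= pred for g in gold_answers)
```

Normalizing both sides is what lets surface variants like "dumplings" match a gold answer of "dumpling".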
- `train/train.py`: Fine-tune a model on a dataset by updating specified modules.
  - Example: `uv run python train/train.py --config train/config.yaml`
  - An example config is provided at `train/config.yaml`; edit it as needed.
  - To monitor training with Weights & Biases, set these environment variables beforehand:
    ```
    export WANDB_API_KEY=<your_wandb_api_key>
    export WANDB_PROJECT=<your_wandb_project_name>
    ```
  - Trained models and training logs are saved in `model_outputs/` (or the directory configured in your settings).
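Updating only specified modules typically means filtering the model's parameter list by module name and training just those parameters. A sketch of that selection step (the parameter names and patterns below are illustrative; the actual module names to train come from `train/config.yaml`):

```python
def trainable_param_names(all_names, target_modules):
    """Select the parameters to update: those whose dotted name contains
    any target module pattern. Names here are illustrative examples."""
    return [n for n in all_names if any(t in n for t in target_modules)]
```

All other parameters would be frozen (e.g., `requires_grad = False` in PyTorch) before training starts.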
- Issues and contributions are welcome via GitHub Issues and PRs.