EvoGM addresses evolutionary model merging as a learnable search problem. Instead of relying on hand-crafted mutation or crossover operators, it treats validation performance as feedback and learns where high-quality merging coefficients are likely to lie.
Accepted at ICML 2026
- Generative model merging. We propose EvoGM, which reformulates evolutionary model merging as a learnable generative optimization problem for adaptive search in the merging space.
- Preference-aware dual generation. We introduce a cycle-consistent dual-generator with a winner–loser preference strategy to better leverage sparse validation feedback and synthesize effective merged models for unseen tasks.
- Strong empirical performance. Extensive experiments across diverse benchmarks and model families show that EvoGM consistently outperforms state-of-the-art model merging baselines.
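To make the term "merging coefficients" concrete, here is a minimal sketch of plain linear merging, where each expert's parameters are scaled by one coefficient and summed. This is only an illustration of what a coefficient vector does; it is not EvoGM's generator or search procedure, and the function name `merge_linear` and the toy parameter dictionaries are hypothetical.

```python
def merge_linear(experts, coeffs):
    """Combine expert parameter dicts as sum_i coeffs[i] * experts[i], key by key.

    `experts` is a list of dicts mapping a parameter name to a flat list of
    floats. All experts are assumed to share the same keys and lengths.
    """
    if len(experts) != len(coeffs):
        raise ValueError("need exactly one coefficient per expert")
    merged = {}
    for name in experts[0]:
        length = len(experts[0][name])
        merged[name] = [
            sum(c * e[name][j] for c, e in zip(coeffs, experts))
            for j in range(length)
        ]
    return merged


if __name__ == "__main__":
    # Two toy "experts" with a single two-element parameter each.
    toy_experts = [{"w": [1.0, 2.0]}, {"w": [3.0, 4.0]}]
    print(merge_linear(toy_experts, [0.5, 0.5]))
```

An evolutionary or learned search, as described above, would propose many such coefficient vectors and keep those whose merged model scores well on validation data.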
Use Python 3.10; a fresh environment is recommended. The requirements files pin the core packages used by the released experiments rather than only a minimal import-only set.
GPU:
python -m venv .venv
source .venv/bin/activate
pip install -U pip
pip install -r requirements-gpu.txt
pip install -e .
NPU:
conda create -n evogm_npu python=3.10 -y
conda activate evogm_npu
pip install -U pip
# Install torch and torch_npu for your CANN version first, then:
pip install -r requirements-npu.txt
pip install -e .
NPU runs require a working Ascend driver, CANN runtime, npu-smi, and a torch_npu build matching your PyTorch/CANN stack.
The NPU experiments were checked against the project environment named evogm_npu on Ascend machines with CANN 8.1.RC2, PyTorch 2.5.1, and torch-npu 2.5.1. If your cluster uses another CANN release, install the matching PyTorch/torch_npu pair first and then install the pinned Python packages from requirements-npu.txt. Do not let a generic PyPI torch wheel replace the Ascend-compatible build.
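If you want to confirm that a later install step did not replace the Ascend-compatible build, a small version comparison can help. The sketch below reads installed versions with the standard-library `importlib.metadata` and compares them to the versions quoted above. It is independent of the repository's own setup check, whose internals are not shown here; the helper names are hypothetical, and the distribution name `torch-npu` is assumed to match how the package was installed on your machine.

```python
from importlib import metadata

# Versions reported for the NPU experiments in this README.
EXPECTED = {"torch": "2.5.1", "torch-npu": "2.5.1"}


def installed_version(dist_name):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


def compare_versions(expected, lookup=installed_version):
    """Map each distribution name to a tuple (expected, found, matches)."""
    report = {}
    for name, want in expected.items():
        found = lookup(name)
        # A local build tag such as "2.5.1+cpu" still counts as the same release.
        matches = found is not None and found.split("+")[0] == want
        report[name] = (want, found, matches)
    return report


if __name__ == "__main__":
    for name, (want, found, ok) in compare_versions(EXPECTED).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {want}, found {found} [{status}]")
```

Any version string other than the expected release is flagged as a mismatch so that you can inspect it by hand; on a different CANN release, edit `EXPECTED` to the pair you installed.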
Model weights are not included in this repository. Download the Qwen2.5-1.5B base model plus the 10 released Tulu v2 LoRA expert adapters from:
https://huggingface.co/TaoJiangCN/qwen2.5-1.5b-tulu-v2-lora-experts
With huggingface-hub installed, one direct way to fetch the weights is:
huggingface-cli download TaoJiangCN/qwen2.5-1.5b-tulu-v2-lora-experts \
  --local-dir models/qwen25-1.5b-lora-experts
The downloaded files should be arranged like this:
models/qwen25-1.5b-lora-experts/
  base/
    config.json
    model.safetensors or model.safetensors.index.json
    tokenizer files...
  experts/
    tulu_code_alpaca/adapter_config.json
    tulu_cot/adapter_config.json
    tulu_flan_v2/adapter_config.json
    tulu_gpt4_alpaca/adapter_config.json
    tulu_lima/adapter_config.json
    tulu_oasst1/adapter_config.json
    tulu_open_orca/adapter_config.json
    tulu_science/adapter_config.json
    tulu_sharegpt/adapter_config.json
    tulu_wizardlm/adapter_config.json
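The layout above can be checked quickly after downloading. The following is a minimal sketch, assuming only the files named in the listing (a `base/config.json`, one of the two weight file names, and ten `experts/*/adapter_config.json` files). The function `check_model_layout` is a hypothetical helper, not part of the repository, and it does not inspect tokenizer files or adapter weights.

```python
from pathlib import Path


def check_model_layout(root):
    """Return a list of human-readable problems; an empty list means the layout looks right."""
    root = Path(root)
    problems = []

    base = root / "base"
    if not (base / "config.json").is_file():
        problems.append("missing base/config.json")
    # Either a single weight file or a sharded index is acceptable.
    weight_names = ("model.safetensors", "model.safetensors.index.json")
    if not any((base / name).is_file() for name in weight_names):
        problems.append("missing base/model.safetensors (or its index file)")

    experts = root / "experts"
    adapters = []
    if experts.is_dir():
        adapters = sorted(p.parent.name for p in experts.glob("*/adapter_config.json"))
    if len(adapters) != 10:
        problems.append(f"expected 10 expert adapters, found {len(adapters)}")
    return problems


if __name__ == "__main__":
    issues = check_model_layout("models/qwen25-1.5b-lora-experts")
    print("layout ok" if not issues else "\n".join(issues))
```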
The default configs read from models/qwen25-1.5b-lora-experts. You can override the location:
export EVOGM_MODEL_DIR=/path/to/qwen25-1.5b-lora-experts
You can also override the dataset directory:
export EVOGM_DATA_DIR=/path/to/swarm_eval
Run the setup check before launching experiments:
bash scripts/check_setup.sh
For NPU machines, use the stricter NPU check:
bash scripts/check_setup.sh npu
The check validates imports, key package versions, Hydra config composition, and bundled task data, then prints model layout guidance. In NPU mode it also verifies that torch_npu is installed and torch.npu is available. Missing model weights are reported clearly because weights are expected to be downloaded separately.
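The `EVOGM_MODEL_DIR` and `EVOGM_DATA_DIR` overrides follow the usual override-with-fallback pattern: use the environment variable when it is set, otherwise the default path. How the repository's configs actually read these variables is not shown here, so the sketch below is only an illustration of that pattern; `resolve_dir` is a hypothetical helper, and only the default model path is taken from this README.

```python
import os
from pathlib import Path

# Default model location stated in this README.
DEFAULT_MODEL_DIR = "models/qwen25-1.5b-lora-experts"


def resolve_dir(env_var, default, environ=None):
    """Prefer the environment variable when it is set and non-empty, else the default."""
    environ = os.environ if environ is None else environ
    value = environ.get(env_var, "").strip()
    return Path(value) if value else Path(default)


if __name__ == "__main__":
    print(resolve_dir("EVOGM_MODEL_DIR", DEFAULT_MODEL_DIR))
```

Treating an empty string as unset avoids resolving to the current directory when a variable is exported with no value.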
After model weights are in place, run a minimal smoke test. It creates a temporary one-example dataset from the bundled JSON files and uses tiny EvoGM search settings.
GPU:
bash scripts/smoke_test.sh gpu
NPU:
bash scripts/smoke_test.sh npu
GPU multi-task:
bash scripts/run_gpu_multi.sh
GPU single-task over all 8 task entries:
bash scripts/run_gpu_single.sh
GPU single-task for one task:
bash scripts/run_gpu_single.sh 'method.target_tasks=[gsm8k]'
NPU multi-task:
bash scripts/run_npu_multi.sh
NPU single-task:
bash scripts/run_npu_single.sh
NPU single-task for one task:
bash scripts/run_npu_single.sh 'method.target_tasks=[gsm8k]'
You can override device visibility in the usual way:
CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash scripts/run_gpu_multi.sh
ASCEND_RT_VISIBLE_DEVICES=0,1,2,3,4,5,6,7 bash scripts/run_npu_multi.sh
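When scripting several runs, it can be convenient to assemble the command and environment in Python rather than typing them by hand. The sketch below builds the same invocations as the commands above without executing anything. `build_run` is a hypothetical helper, and it assumes the launcher scripts accept the `method.target_tasks=[...]` override exactly as in the single-task examples.

```python
import os


def build_run(script, task=None, devices=None,
              device_var="CUDA_VISIBLE_DEVICES", base_env=None):
    """Return (argv, env) for one launcher invocation without executing it."""
    argv = ["bash", script]
    if task is not None:
        # Same override syntax as the single-task examples above.
        argv.append(f"method.target_tasks=[{task}]")
    # Copy the environment so the caller's os.environ is never mutated.
    env = dict(os.environ if base_env is None else base_env)
    if devices is not None:
        env[device_var] = ",".join(str(d) for d in devices)
    return argv, env


if __name__ == "__main__":
    argv, env = build_run("scripts/run_gpu_single.sh", task="gsm8k", devices=[0, 1])
    print(argv, env["CUDA_VISIBLE_DEVICES"])
    # To actually launch: subprocess.run(argv, env=env, check=True)
```

For Ascend machines, pass `device_var="ASCEND_RT_VISIBLE_DEVICES"` together with one of the NPU scripts. Passing the arguments as a list avoids shell quoting issues with the square brackets in the override.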