SAGA is a generalist agentic framework for scientific discovery that automates the iterative process of objective design and hypothesis optimization. Rather than assuming a fixed set of objectives is known upfront, SAGA dynamically discovers and refines optimization objectives through a bi-level procedure: an outer loop that plans and evolves objectives, and an inner loop that optimizes candidate hypotheses against those objectives.
The framework comprises four core agentic modules:
- Planner decomposes the scientific goal into concrete, measurable objectives at each iteration
- Implementer converts proposed objectives into executable scoring functions
- Optimizer searches for candidate hypotheses that maximize the current objectives
- Analyzer evaluates optimization progress and provides actionable suggestions for the next iteration
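The interplay of the four modules can be illustrated with a toy bi-level loop. This is only a minimal sketch under stated assumptions: every function name, the string candidates, and the scoring rules are hypothetical stand-ins chosen for illustration, not SAGA's actual API or objectives.

```python
from typing import Callable, List, Tuple

# All names below are hypothetical stand-ins, not SAGA's actual API.
Scorer = Callable[[str], float]


def plan_objectives(goal: str, feedback: List[str]) -> List[str]:
    """Planner: propose objective names for this iteration (toy rule)."""
    objectives = ["length"]
    if "add vowel objective" in feedback:
        objectives.append("vowels")
    return objectives


def implement_scorers(objectives: List[str]) -> List[Scorer]:
    """Implementer: turn objective names into executable scoring functions."""
    table = {
        "length": lambda s: float(len(s)),
        "vowels": lambda s: float(sum(c in "aeiou" for c in s)),
    }
    return [table[name] for name in objectives]


def optimize(candidates: List[str], scorers: List[Scorer], top_k: int = 2) -> List[str]:
    """Optimizer (inner loop): keep the candidates with the highest total score."""
    ranked = sorted(candidates, key=lambda c: sum(f(c) for f in scorers), reverse=True)
    return ranked[:top_k]


def analyze(objectives: List[str]) -> List[str]:
    """Analyzer: suggest changes for the next iteration (toy rule)."""
    return [] if "vowels" in objectives else ["add vowel objective"]


def run(goal: str, candidates: List[str], iterations: int = 2) -> List[Tuple[List[str], List[str]]]:
    """Outer loop: objectives evolve between iterations based on analyzer feedback."""
    feedback: List[str] = []
    history = []
    for _ in range(iterations):
        objectives = plan_objectives(goal, feedback)
        scorers = implement_scorers(objectives)
        best = optimize(candidates, scorers)
        feedback = analyze(objectives)
        history.append((objectives, best))
    return history


history = run("toy goal", ["aaaa", "bcdfg", "ae", "xyzxyz"])
```

In this toy run the second iteration adds an objective suggested by the analyzer, so the top candidates change even though the candidate pool does not. That is the point of evolving objectives in the outer loop rather than fixing them upfront.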
SAGA supports three levels of human involvement:
- co-pilot: human collaborates with both planner and analyzer.
- semi-pilot: human reviews analyzer outputs only.
- autopilot: fully autonomous.
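One way to picture the three modes is as a lookup from mode to the modules whose output a human reviews. The sketch below is an illustration only: the `Autonomy` enum, `HUMAN_REVIEWED` table, and `needs_human` helper are hypothetical names, not part of SAGA's codebase.

```python
from enum import Enum


class Autonomy(Enum):
    """Hypothetical enum for the three levels of human involvement."""
    CO_PILOT = "co-pilot"
    SEMI_PILOT = "semi-pilot"
    AUTOPILOT = "autopilot"


# Modules with a human in the loop for each mode, following the list above.
HUMAN_REVIEWED = {
    Autonomy.CO_PILOT: {"planner", "analyzer"},
    Autonomy.SEMI_PILOT: {"analyzer"},
    Autonomy.AUTOPILOT: set(),
}


def needs_human(mode: Autonomy, module: str) -> bool:
    """Return True if a human is involved with `module` in the given mode."""
    return module in HUMAN_REVIEWED[mode]
```

For example, `needs_human(Autonomy.SEMI_PILOT, "planner")` is false because the planner runs without human input in semi-pilot mode.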
SAGA/
├── scileo_agent/ # Core SAGA framework
│
├── modules/ # Task-specific and shared module implementations
│ ├── shared/ # Domain-agnostic implementations reusable across tasks
│ │ ├── scorer_creator/ # Implementer
│ │ ├── analyzer/ # Analyzer
│ │ ├── planner/ # Planner
│ │ ├── selector/ # Candidate selection utilities
│ │ ├── serializer/ # Candidate serialization/deserialization
│ │ └── knowledge_manager/ # Knowledge management
│ ├── dna_design/ # DNA sequence design modules
│ └── small_molecule_drug_design/ # Small molecule drug design module
│
├── llm_configs/ # LLM model and credential configuration
│ ├── models.template.yaml # Template for defining available LLM models
│ ├── claude_code.template.yaml # Template for Claude Code model configuration
│ └── credentials.template.yaml # Template for API keys and credentials
│
├── exps/ # Experiment entry point scripts
│ ├── dna_design/ # DNA design experiment
│ └── small_molecule_drug_design/ # Antibiotic design experiment
│
└── runs/ # Run logs and results (auto-created at runtime)
SAGA requires a Linux server with one or more GPUs and Docker installed. Other platforms and configurations have not been thoroughly tested and are not guaranteed to work.
Create and activate a conda environment:
conda create -n saga python=3.13
conda activate saga
Install Python dependencies:
pip install -r requirements.txt
Install and start Docker:
Ensure Docker is installed and the Docker daemon is running:
# Verify Docker is available
docker info
Install Claude Code:
Follow the official installation instructions at https://code.claude.com/docs/en/overview to install the Claude Code CLI.
Pull required Docker images:
SAGA uses Docker to run scoring functions and Claude Code. Pull the required images in advance to avoid long waits during experiments:
docker pull btyu24/scileo:v5-claude
docker pull btyu24/scileo:v4
docker pull btyu24/scileo:claude-agent-runner-251117
Copy all template config files to create your local configuration:
for f in llm_configs/*.template.yaml; do cp "$f" "${f/.template/}"; done
This creates three config files that you need to fill in:
- llm_configs/models.yaml: define which LLM models SAGA should use and how to call them
- llm_configs/claude_code.yaml: configure which Claude model is used by the Claude Code agent (in the Implementer and Analyzer)
- llm_configs/credentials.yaml: add your API keys for the providers you want to use (OpenAI, Anthropic, AWS Bedrock, etc.)
Open each file and follow the inline instructions to fill in your model settings and credentials.
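Before launching an experiment, it can help to confirm that all three config files exist. The following is a small convenience sketch, not a script shipped with SAGA: the `missing_configs` helper is a hypothetical name, and it only checks that the files are present, not that their contents are valid.

```python
from pathlib import Path
from typing import List

# File names taken from the configuration step; the helper itself is hypothetical.
REQUIRED_CONFIGS = ["models.yaml", "claude_code.yaml", "credentials.yaml"]


def missing_configs(config_dir: str = "llm_configs") -> List[str]:
    """Return the required config files that are not present in config_dir."""
    root = Path(config_dir)
    return [name for name in REQUIRED_CONFIGS if not (root / name).is_file()]
```

Calling `missing_configs()` from the repository root returns an empty list once all three templates have been copied.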
To run the DNA design experiment, open and run the experiment notebook:
exps/dna_design/exp_dna_design.ipynb
Launch Jupyter and execute all cells in the notebook. The experiment designs cell-type-specific enhancer sequences for the HepG2 cell line.
To run the small molecule drug design experiment, navigate to the experiment directory and run the script:
cd exps/small_molecule_drug_design
python exp_kp_drug.py --level <LEVEL>
The --level argument sets the autonomy level:
| Level | Mode | Description |
|---|---|---|
| 1 | Co-pilot | Human collaborates with both the planner and analyzer at each iteration |
| 2 | Semi-pilot | Human reviews analyzer outputs and provides feedback; planner runs autonomously |
| 3 | Autopilot | All four modules run autonomously without human intervention |
For example, to run in autopilot mode:
python exp_kp_drug.py --level 3
Run logs and results will be saved automatically to the runs/ directory.
If you use SAGA in your research, please cite our paper:
@article{du2025saga,
title={Accelerating Scientific Discovery with Autonomous Goal-evolving Agents},
author={Du, Yuanqi and Yu, Botao and Liu, Tianyu and Shen, Tony and Chen, Junwu and Rittig, Jan G. and Sun, Kunyang and Zhang, Yikun and Krishnan, Aarti and Zhang, Yu and Rosen, Daniel and Pirone, Rosali and Song, Zhangde and Zhou, Bo and Masschelein, Cassandra and Wang, Yingze and Wang, Haorui and Jia, Haojun and Zhang, Chao and Zhao, Hongyu and Ester, Martin and Hacohen, Nir and Head-Gordon, Teresa and Gomes, Carla P. and Sun, Huan and Duan, Chenru and Schwaller, Philippe and Jin, Wengong},
journal={arXiv preprint arXiv:2512.21782},
year={2025}
}