| Demo 1: Code Preference (agent learns to always use TypeScript) | Demo 2: Workflow Convention (agent learns to always write tests) |
|---|---|
| case1-openclaw.mp4 | case2-openclaw.mp4 |
- [2026/03/24] Fully automatic setup: the plugin auto-starts the SCOPE server, auto-detects OpenClaw's model configuration, and auto-installs Python dependencies on first launch. No .env, no manual setup needed!
- [2026/03/24] Added custom SCOPE prompts & domains tailored for personal AI assistants: user preference learning, code quality analysis, and communication style optimization!
- [2026/03/23] EvolveClaw v1 released! Self-evolving prompt system for OpenClaw with zero code modification: plugin + sidecar architecture powered by SCOPE.
EvolveClaw turns OpenClaw into a self-improving agent that continuously adapts to your workflows. Powered by SCOPE (Self-evolving Context Optimization via Prompt Evolution), it observes how you interact with the agent (your tasks, your tool usage patterns, your preferences) and automatically evolves the system prompt with personalized guidelines that make the agent increasingly effective for you, not just any user.
No two EvolveClaw instances are the same. Over time, each one develops a unique personality shaped by its owner's habits.
Key adaptation: SCOPE was originally designed for task-specific benchmarks (e.g., HLE). EvolveClaw extends it with custom prompt templates and domain categories tailored for a personal AI assistant, focusing on user preference learning, task quality, communication style, and workflow patterns instead of domain-specific problem-solving heuristics.
SCOPE also supports agent workflows such as EvoFabric.
- Why EvolveClaw?
- How It Works
- Quick Start
- What Makes It Self-Evolving
- Architecture
- Design Decisions
- API Endpoints
- Citation
Today's AI agents ship with a static system prompt: every user gets the same instructions. But users are different: some prefer terse answers, others want detailed explanations; some rely heavily on search tools, others write code directly; some use the agent for coding, others for research or writing.
EvolveClaw closes this gap with three core ideas:
| Principle | What It Means |
|---|---|
| Self-Evolving | The agent synthesizes behavioral guidelines from its own execution traces. No manual prompt engineering needed: the system prompt improves itself. |
| Personalized | Guidelines are derived from your interactions: your tasks and your tool usage patterns. The agent adapts to how you work, not a generic user profile. |
| Dual Memory | Strategic guidelines persist across sessions (your agent's personality); tactical guidelines are ephemeral and auto-clear per task. |
- Observe: The plugin captures your full interaction trace, including model output, tool calls, tool results, errors, and the semantic nature of your task
- Learn: SCOPE analyzes each trace and synthesizes a guideline if warranted, e.g., "When this user asks for refactoring, prefer small atomic commits over large rewrites"
- Classify: Each guideline is classified as tactical (task-specific, ephemeral) or strategic (cross-task, persisted to disk as part of your personal memory)
- Inject: On the next turn, all active guidelines are injected into the system prompt, structured by priority (strategic > tactical)
- Forget: When you start a new session, tactical guidelines are cleared. The agent remembers who you are (strategic), not what you were doing yesterday (tactical)
This creates a virtuous cycle: the more you use the agent, the better it understands your preferences, and the more personalized its behavior becomes.
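The classify, inject, and forget steps can be pictured with a small in-memory model. This is an illustrative sketch rather than EvolveClaw's actual implementation: the `Guideline` and `GuidelineMemory` names are hypothetical, and the strategic/tactical label is assumed to arrive already decided by SCOPE.

```python
from dataclasses import dataclass, field


@dataclass
class Guideline:
    text: str
    kind: str  # "strategic" (cross-task) or "tactical" (task-specific)


@dataclass
class GuidelineMemory:
    """Hypothetical dual memory: strategic persists, tactical is ephemeral."""

    strategic: list[Guideline] = field(default_factory=list)
    tactical: list[Guideline] = field(default_factory=list)

    def add(self, guideline: Guideline) -> None:
        # Classify: route the guideline by the kind assigned to it.
        if guideline.kind == "strategic":
            self.strategic.append(guideline)
        else:
            self.tactical.append(guideline)

    def active(self) -> list[str]:
        # Inject: strategic guidelines are listed before tactical ones.
        return [g.text for g in self.strategic + self.tactical]

    def start_new_session(self) -> None:
        # Forget: only tactical guidelines are cleared.
        self.tactical.clear()


memory = GuidelineMemory()
memory.add(Guideline("Fix the failing login test before refactoring", "tactical"))
memory.add(Guideline("Prefer small atomic commits over large rewrites", "strategic"))
ordered = memory.active()          # strategic first, then tactical
memory.start_new_session()
remaining = memory.active()        # only the strategic guideline survives
```

Keeping the two kinds in separate lists makes the "forget" step a single `clear()` call and keeps the priority order trivial to produce.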
EvolveClaw requires a running OpenClaw instance. If you don't have one yet:
# Requires Node ≥ 22
npm install -g openclaw@latest
# Run the onboarding wizard (sets up gateway, workspace, channels)
openclaw onboard --install-daemon

For more details, see OpenClaw Getting Started. You'll also need Python ≥ 3.10 for the SCOPE sidecar server.
Option A: install from registry (recommended)
# From npm
openclaw plugins install evolveclaw
# Or from ClawHub
openclaw plugins install clawhub:evolveclaw
openclaw gateway restart

Option B: from local source (Linux / macOS / WSL2)
git clone https://github.com/JarvisPei/EvolveClaw.git
cd EvolveClaw
./scripts/install-plugin.sh
openclaw gateway restart

Option B: from local source (Windows PowerShell)
OpenClaw itself requires WSL2 on Windows. If you're using WSL2, use the bash script above. The PowerShell method is provided for setups where OpenClaw runs natively.
git clone https://github.com/JarvisPei/EvolveClaw.git
cd EvolveClaw
# Add plugin path to OpenClaw config
$config = "$env:USERPROFILE\.openclaw\openclaw.json"
$cfg = Get-Content $config | ConvertFrom-Json
if (-not $cfg.plugins) { $cfg | Add-Member -NotePropertyName plugins -NotePropertyValue @{} }
if (-not $cfg.plugins.load) { $cfg.plugins | Add-Member -NotePropertyName load -NotePropertyValue @{} }
if (-not $cfg.plugins.load.paths) { $cfg.plugins.load | Add-Member -NotePropertyName paths -NotePropertyValue @() }
$cfg.plugins.load.paths += (Resolve-Path .).Path
if (-not $cfg.plugins.entries) { $cfg.plugins | Add-Member -NotePropertyName entries -NotePropertyValue @{} }
$cfg.plugins.entries | Add-Member -NotePropertyName evolveclaw -NotePropertyValue @{enabled=$true} -Force
$cfg | ConvertTo-Json -Depth 10 | Set-Content $config
openclaw gateway restart

The plugin handles everything automatically on first gateway startup: it auto-installs Python dependencies (scope-optimizer, fastapi, etc.) if missing, then auto-starts the SCOPE server. No manual pip install or server launch needed.
After restarting the gateway, check the logs for these lines:
[plugins] evolveclaw: activated (server=http://127.0.0.1:5757, agent=openclaw-agent, inject=append_system, maxGuidelines=30)
[gateway] evolveclaw: installing Python dependencies... # first launch only
[gateway] evolveclaw: Python dependencies installed # first launch only
[gateway] evolveclaw: SCOPE server started successfully
[gateway] evolveclaw: auto-configured SCOPE with OpenClaw's <your-model>
[gateway] evolveclaw: loaded N strategic rule(s), guidelines.length: 1
If you see "activated" and "SCOPE server started successfully", EvolveClaw is running. Start chatting: the agent learns from your interactions every few exchanges.
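If you would rather script this check than read the logs by eye, a minimal sketch follows. It assumes you have already captured the gateway log output as a string (where OpenClaw writes its logs is not covered here), and the function name is hypothetical; the two marker strings are taken from the log lines above.

```python
# Substrings that must both appear, copied from the expected log lines.
REQUIRED_MARKERS = (
    "evolveclaw: activated",
    "evolveclaw: SCOPE server started successfully",
)


def evolveclaw_is_running(log_text: str) -> bool:
    """Return True when every required marker appears in the log text."""
    return all(marker in log_text for marker in REQUIRED_MARKERS)


sample_log = """\
[plugins] evolveclaw: activated (server=http://127.0.0.1:5757, agent=openclaw-agent, inject=append_system, maxGuidelines=30)
[gateway] evolveclaw: SCOPE server started successfully
"""
healthy = evolveclaw_is_running(sample_log)
incomplete = evolveclaw_is_running("[plugins] evolveclaw: activated (server=...)")
```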
Tip: If you prefer to manage the server process yourself (e.g., for custom .env settings or running on a different host), start it manually before OpenClaw and set "autoStartServer": false in the plugin config. The plugin detects a running server via /health and skips spawning.
3. Configure (Optional)
In ~/.openclaw/openclaw.json:
{
  "plugins": {
    "entries": {
      "evolveclaw": {
        "enabled": true,
        "config": {
          "serverUrl": "http://127.0.0.1:5757",
          "agentName": "openclaw-agent",
          "injectMode": "append_system",
          "maxGuidelines": 30
        }
      }
    }
  }
}

| Config | Default | Description |
|---|---|---|
| serverUrl | http://127.0.0.1:5757 | SCOPE sidecar URL |
| agentName | openclaw-agent | Agent identifier in SCOPE memory |
| enabled | true | Toggle on/off without uninstalling |
| injectMode | append_system | append_system (cacheable) or prepend_context (per-turn) |
| maxGuidelines | 30 | Max guidelines in memory; oldest tactical evicted first when cap is reached |
| scopeModel | (auto from OpenClaw) | Override the model SCOPE uses for guideline synthesis (e.g., gpt-4o-mini for cheaper synthesis) |
| scopeProvider | (auto from OpenClaw) | Override the SCOPE provider: anthropic, openai, or litellm |
| scopeApiKey | (auto from OpenClaw) | Override the API key SCOPE uses |
| scopeBaseUrl | (auto from OpenClaw) | Override the base URL for SCOPE's LLM API |
| autoStartServer | true | Auto-start the SCOPE server if not already running. Set to false if you manage the server yourself |
| pythonPath | (auto-detect) | Full path to the Python binary (e.g., /path/to/miniconda3/bin/python3). Tries server/venv/bin/python3, python3, python by default |
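The pythonPath fallback order can be sketched as below. This is an assumption-laden illustration, not the plugin's code (the plugin itself is TypeScript): `resolve_python` is a hypothetical helper, and the `which` parameter exists only so the lookup can be swapped out in tests.

```python
import shutil
from typing import Callable, Optional, Sequence

# Candidates in the order the pythonPath description lists them.
DEFAULT_CANDIDATES = ("server/venv/bin/python3", "python3", "python")


def resolve_python(
    python_path: Optional[str] = None,
    candidates: Sequence[str] = DEFAULT_CANDIDATES,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the explicit pythonPath if set, else the first candidate found."""
    if python_path:
        return python_path
    for candidate in candidates:
        found = which(candidate)
        if found:
            return found
    return None


# Simulated lookup where only "python3" is available.
fake_which = lambda name: "/usr/bin/python3" if name == "python3" else None
explicit = resolve_python("/path/to/miniconda3/bin/python3", which=fake_which)
detected = resolve_python(which=fake_which)
missing = resolve_python(which=lambda name: None)
```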
LLM auto-detection: By default, EvolveClaw reads OpenClaw's primary model configuration (api.config.models.providers + api.config.agents.defaults.model.primary) and forwards it to the SCOPE server at startup. No duplicate API key configuration needed. Set the scope* fields above only if you want SCOPE to use a different (e.g., cheaper) model than OpenClaw.
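The override rule (a scope* field wins, otherwise OpenClaw's own setting is used) can be expressed in a few lines. The sketch below is hypothetical: the output keys (`model`, `provider`, `api_key`, `base_url`) and the shape of the detected OpenClaw settings are assumptions made for illustration.

```python
# Maps an illustrative output field to the plugin config key that overrides it.
OVERRIDE_KEYS = {
    "model": "scopeModel",
    "provider": "scopeProvider",
    "api_key": "scopeApiKey",
    "base_url": "scopeBaseUrl",
}


def resolve_scope_llm(plugin_config: dict, openclaw_llm: dict) -> dict:
    """Prefer scope* overrides; fall back to the values detected from OpenClaw."""
    return {
        field: plugin_config.get(config_key) or openclaw_llm.get(field)
        for field, config_key in OVERRIDE_KEYS.items()
    }


detected = {"model": "primary-model", "provider": "anthropic", "api_key": "sk-example"}
no_override = resolve_scope_llm({}, detected)
cheaper = resolve_scope_llm({"scopeModel": "gpt-4o-mini", "scopeProvider": "openai"}, detected)
```

With an empty plugin config the detected values pass through unchanged, which matches the "no duplicate API key configuration" behaviour described above.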
| Variable | Default | Description |
|---|---|---|
| EVOLVECLAW_HOST | 127.0.0.1 | Server bind address |
| EVOLVECLAW_PORT | 5757 | Server port |
| EVOLVECLAW_SCOPE_MODEL | (auto from plugin) | LLM for guideline synthesis; set only to override auto-detection |
| EVOLVECLAW_SCOPE_PROVIDER | (auto from plugin) | anthropic, openai, or litellm; set only to override auto-detection |
| EVOLVECLAW_SCOPE_DATA | ./scope_data | Directory for persistent strategic rules |
| EVOLVECLAW_SYNTHESIS_MODE | efficiency | efficiency (fast) or thoroughness (comprehensive) |
| EVOLVECLAW_QUALITY_ANALYSIS | true | Analyze successful steps too |
| EVOLVECLAW_QUALITY_FREQUENCY | 3 | Analyze quality every N successful steps (recent conversation history is included for context) |
| EVOLVECLAW_ACCEPT_THRESHOLD | medium | all, low, medium, high |
| EVOLVECLAW_STRATEGIC_THRESHOLD | 0.85 | Min confidence for strategic promotion |
| EVOLVECLAW_MAX_RULES_PER_TASK | 20 | Max rules SCOPE keeps per task |
| EVOLVECLAW_MAX_STRATEGIC_PER_DOMAIN | 10 | Max strategic rules per domain |
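As a rough picture of how a server could read a subset of these variables, here is a minimal sketch. It is not the contents of server/config.py: the `ServerSettings` class, the boolean parsing rule, and the `should_analyze_quality` helper are assumptions; only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class ServerSettings:
    host: str
    port: int
    scope_data: str
    synthesis_mode: str
    quality_analysis: bool
    quality_frequency: int
    accept_threshold: str
    strategic_threshold: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> ServerSettings:
    """Read EVOLVECLAW_* variables, falling back to the documented defaults."""
    env = os.environ if env is None else env
    truthy = {"1", "true", "yes", "on"}  # assumed parsing rule for booleans
    return ServerSettings(
        host=env.get("EVOLVECLAW_HOST", "127.0.0.1"),
        port=int(env.get("EVOLVECLAW_PORT", "5757")),
        scope_data=env.get("EVOLVECLAW_SCOPE_DATA", "./scope_data"),
        synthesis_mode=env.get("EVOLVECLAW_SYNTHESIS_MODE", "efficiency"),
        quality_analysis=env.get("EVOLVECLAW_QUALITY_ANALYSIS", "true").strip().lower() in truthy,
        quality_frequency=int(env.get("EVOLVECLAW_QUALITY_FREQUENCY", "3")),
        accept_threshold=env.get("EVOLVECLAW_ACCEPT_THRESHOLD", "medium"),
        strategic_threshold=float(env.get("EVOLVECLAW_STRATEGIC_THRESHOLD", "0.85")),
    )


def should_analyze_quality(settings: ServerSettings, successful_steps: int) -> bool:
    """True on every Nth successful step, where N is the quality frequency."""
    if not settings.quality_analysis or settings.quality_frequency <= 0:
        return False
    return successful_steps > 0 and successful_steps % settings.quality_frequency == 0


defaults = load_settings({})  # pass a dict to avoid depending on the real environment
custom = load_settings({"EVOLVECLAW_PORT": "6000", "EVOLVECLAW_QUALITY_ANALYSIS": "false"})
```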
Unlike static prompt engineering or manual rule files, EvolveClaw implements a self-improvement loop with the following components:
Personalized Learning Signal
- Rich execution traces: Captures model output, tool calls (before_tool_call), tool results (after_tool_call), and errors, learning from the full behavioral footprint, not just text
- Task description: The user's last message is extracted and passed to SCOPE for per-task guideline management
Adaptive Memory
- Strategic memory: Cross-task guidelines that persist to disk. Loaded on startup
- Tactical memory: Task-specific guidelines that live in-memory and auto-clear on session switch
- Automatic memory optimization: When strategic rules accumulate past the domain limit, SCOPE's MemoryOptimizer automatically consolidates similar rules, prunes rules subsumed by more general ones, and resolves conflicts, all via LLM-driven analysis rather than simple truncation
- Plugin-side guideline cap: The plugin enforces a maximum guideline count in memory; oldest tactical guidelines are evicted first when the cap is reached
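The plugin-side cap can be illustrated with a short eviction routine. This is a sketch under assumptions, not the plugin's code: guidelines are modelled as dicts ordered oldest-first, and the fallback of dropping the oldest strategic guideline once no tactical ones remain is a guess, since the text above only specifies the tactical-first rule.

```python
def enforce_cap(guidelines: list[dict], max_guidelines: int = 30) -> list[dict]:
    """Evict the oldest tactical guidelines first until the cap is met.

    `guidelines` is ordered oldest-first; each item has a "kind" key.
    """
    kept = list(guidelines)  # work on a copy so the caller's list is untouched
    cap = max(max_guidelines, 0)
    while len(kept) > cap:
        oldest_tactical = next(
            (i for i, g in enumerate(kept) if g["kind"] == "tactical"), None
        )
        # Assumption: with no tactical guideline left, drop the oldest entry.
        kept.pop(0 if oldest_tactical is None else oldest_tactical)
    return kept


items = [
    {"kind": "strategic", "text": "s1"},
    {"kind": "tactical", "text": "t1"},
    {"kind": "tactical", "text": "t2"},
    {"kind": "strategic", "text": "s2"},
]
capped_to_three = [g["text"] for g in enforce_cap(items, 3)]
capped_to_one = [g["text"] for g in enforce_cap(items, 1)]
```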
Custom SCOPE Prompts & Domains
SCOPE's built-in prompts are designed for task-specific benchmarks. EvolveClaw overrides them via SCOPE's custom_prompts and custom_domains API (server/prompts.py) to focus on personal assistant concerns:
| Domain | What It Captures |
|---|---|
| tool_usage | IDE/shell tool patterns: file ops, search, terminal commands |
| code_quality | Task execution patterns, code style, correctness, output quality |
| error_handling | Safe operations, rollback strategies, error recovery |
| communication | Response style, conciseness, explanation depth |
| user_preferences | Learned user habits: coding style, frameworks, conventions |
| context_awareness | Project structure knowledge, conversation history |
| workflow | Multi-step task planning, edit-test cycles |
| general | Catch-all for uncategorized rules |
The user_preferences domain is particularly important: when the analyzer detects consistent user habits (e.g., "always uses TypeScript", "prefers concise responses"), these are classified as strategic and persist across sessions, so the assistant remembers your preferences permanently.
Sub-Agent Filtering
OpenClaw internally spawns sub-agents (file search, code lookup, etc.) that use minimal system prompts. EvolveClaw filters these out: only the main user-facing session generates guidelines. Sub-agent sessions are detected by the "subagent:" prefix in the session key and silently skipped across all hooks.
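The filter itself amounts to a one-line check. The sketch below is hypothetical: the exact layout of OpenClaw session keys is not given here, so it accepts the "subagent:" marker anywhere in the key rather than only at the start, and the sample keys and the `handle_agent_end` helper are invented for illustration.

```python
from typing import Callable, Optional

SUBAGENT_MARKER = "subagent:"


def is_subagent_session(session_key: Optional[str]) -> bool:
    """True when the session key carries the sub-agent marker."""
    return bool(session_key) and SUBAGENT_MARKER in session_key


def handle_agent_end(session_key: Optional[str], trace: str, report: Callable[[str], None]) -> bool:
    """Report a trace for analysis unless it came from a sub-agent session."""
    if is_subagent_session(session_key):
        return False  # silently skipped
    report(trace)
    return True


reported: list[str] = []
main_result = handle_agent_end("main-session", "user asked for a refactor", reported.append)
sub_result = handle_agent_end("subagent:file-search", "grep for symbol", reported.append)
```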
Injection Modes
- append_system (default): Guidelines are appended to the system prompt, which LLM providers typically cache for token efficiency
- prepend_context: Guidelines are prepended to the per-turn context, sent fresh each turn
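The difference between the two modes comes down to where the guideline block is attached. A minimal sketch, assuming a hypothetical `build_prompt` helper and an invented "Learned guidelines:" block format (the real plugin's wording may differ):

```python
def build_prompt(
    system_prompt: str,
    turn_context: str,
    guidelines: list[str],
    inject_mode: str = "append_system",
) -> tuple[str, str]:
    """Return (system_prompt, turn_context) with guidelines attached per mode."""
    if not guidelines:
        return system_prompt, turn_context
    block = "Learned guidelines:\n" + "\n".join(f"- {g}" for g in guidelines)
    if inject_mode == "append_system":
        # Stable position at the end of the system prompt (cache-friendly).
        return f"{system_prompt}\n\n{block}", turn_context
    if inject_mode == "prepend_context":
        # Sent fresh with each turn's context.
        return system_prompt, f"{block}\n\n{turn_context}"
    raise ValueError(f"unknown inject mode: {inject_mode!r}")


rules = ["Always use TypeScript"]
appended = build_prompt("You are a coding agent.", "User: add a helper", rules)
prepended = build_prompt("You are a coding agent.", "User: add a helper", rules, "prepend_context")
```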
Observability
- Periodic logging: The plugin logs guideline distribution by type every 5 steps
- Stats endpoint: GET /stats/{agent_name} returns strategic count, total steps, synthesis rate, and uptime
Architecture
evolveclaw/
├── package.json               # npm/ClawHub package manifest
├── openclaw.plugin.json       # Plugin manifest with config schema
├── plugin/                    # OpenClaw TypeScript plugin
│   └── src/
│       ├── index.ts           # Plugin entry: lifecycle hooks, guideline management
│       ├── scope-client.ts    # HTTP client for SCOPE sidecar
│       └── types.ts           # Shared type definitions (config, API, guideline metadata)
├── server/                    # SCOPE sidecar HTTP server (Python)
│   ├── server.py              # FastAPI server: step analysis, tactical reset, stats
│   ├── config.py              # Server configuration (env vars)
│   ├── prompts.py             # Custom SCOPE prompts & domains for personal assistant use
│   ├── requirements.txt
│   └── .env.template          # Environment variable template
└── scripts/
    ├── start-server.sh        # Start the SCOPE sidecar
    └── install-plugin.sh      # Auto-configure OpenClaw to load the plugin
- Feedback loop: No auto-feedback from the plugin (OpenClaw has no user_feedback hook). Could be added once SCOPE supports guideline removal or OpenClaw adds a feedback hook.
- Migrate to focused SDK subpaths: OpenClaw v2026.3.22 introduced openclaw/plugin-sdk/* subpaths (e.g., openclaw/plugin-sdk/plugin-entry) as the recommended import surface. Our plugin uses the monolithic openclaw/plugin-sdk, which is still fully supported. Once installed via ClawHub, the focused subpaths can be adopted.
Design Decisions
- Zero training cost: No GPU, no dataset curation; guidelines are synthesized in-context by the same LLM
- Interpretable: Every guideline is a human-readable sentence you can inspect, edit, or delete
- Reversible: Any guideline can be deleted at any time, whereas fine-tuning is a one-way door
- Personalized at the prompt level: Works with any base model: swap gpt-4o for claude and your guidelines carry over
- Language bridge: SCOPE is Python; OpenClaw plugins are TypeScript. A sidecar avoids complex Node-to-Python IPC
- Decoupled lifecycle: The SCOPE server can be restarted, upgraded, or swapped independently of OpenClaw
- Graceful degradation: If the SCOPE server is down, the plugin silently no-ops and OpenClaw keeps working normally
- Dynamic: before_prompt_build injects guidelines per-turn, not just at session start
- System prompt space: appendSystemContext places guidelines in cacheable system prompt space, reducing per-turn token cost
- Clean lifecycle: llm_output + before_tool_call + after_tool_call + agent_end capture the full step context for SCOPE analysis
- Bootstrap files still work: Strategic rules could additionally be written to AGENTS.md for persistence across restarts
EvolveClaw adds zero user-perceived latency to OpenClaw. The only potentially slow operation (LLM-based guideline synthesis) happens asynchronously after the agent has already responded:
| Hook | Blocking? | What it does |
|---|---|---|
| before_prompt_build | Sync, ~0ms | Reads from in-memory guideline array; no HTTP, no I/O |
| llm_output | Sync, ~0ms | Stores a string in a variable |
| before_tool_call | Sync, ~0ms | Pushes tool name to an array |
| after_tool_call | Sync, ~0ms | Pushes result to an array |
| agent_end | Async, fire-and-forget | HTTP call to the SCOPE server, which runs LLM synthesis. OpenClaw does not await this (confirmed in source: "fire-and-forget, so we don't await") |
New guidelines only appear on the next turn, after the background synthesis completes.
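The fire-and-forget pattern, and why a guideline only shows up one turn later, can be demonstrated with asyncio. This is a sketch of the general technique in Python, not the plugin's TypeScript code: `synthesize` stands in for the HTTP call plus LLM synthesis, and all names are hypothetical.

```python
import asyncio


async def synthesize(trace: str, guidelines: list[str]) -> None:
    """Stand-in for the slow step: HTTP call to the sidecar plus LLM synthesis."""
    await asyncio.sleep(0.05)
    guidelines.append(f"guideline learned from: {trace}")


async def run_turn(user_msg: str, guidelines: list[str], background: set) -> str:
    # before_prompt_build equivalent: a plain in-memory read, no I/O.
    snapshot = list(guidelines)
    reply = f"reply to {user_msg!r} using {len(snapshot)} guideline(s)"
    # agent_end equivalent: schedule synthesis but do not await it.
    task = asyncio.create_task(synthesize(user_msg, guidelines))
    background.add(task)  # keep a reference so the task is not garbage-collected
    task.add_done_callback(background.discard)
    return reply


async def demo() -> tuple[str, str]:
    guidelines: list[str] = []
    background: set = set()
    first = await run_turn("refactor utils", guidelines, background)
    await asyncio.gather(*background)  # simulate time passing between turns
    second = await run_turn("add tests", guidelines, background)
    await asyncio.gather(*background)
    return first, second


first_reply, second_reply = asyncio.run(demo())
```

The first reply is produced before any synthesis has finished, so it sees zero guidelines; the second turn picks up the guideline learned in the background.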
| Type | Scope | Persistence | Injection | Priority |
|---|---|---|---|---|
| Strategic | Cross-task | Saved to disk (your agent's evolved personality) | Loaded on startup + periodic refresh | Highest |
| Tactical | Current task | In-memory only (ephemeral working memory) | Cleared on session switch | Lowest (most recent wins) |
API Endpoints
| Method | Path | Description |
|---|---|---|
| GET | /health | Health check (includes configured flag) |
| GET | /rules/{agent_name} | Get strategic rules for an agent |
| GET | /stats/{agent_name} | Get observability metrics for self-improvement tracking |
| POST | /step | Report a completed step for SCOPE analysis |
| POST | /reset | Reset tactical state on session/task switch |
| POST | /configure | Forward LLM config from plugin (auto-called at startup) |
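To poke at these endpoints from a script, requests can be built with the standard library as sketched below. The paths, default host, and port come from this README; the JSON body shown for /reset is a hypothetical placeholder, since the request schemas are not documented here. The requests are constructed but not sent.

```python
import json
from urllib.request import Request

BASE_URL = "http://127.0.0.1:5757"  # default serverUrl


def get_request(path: str, base_url: str = BASE_URL) -> Request:
    """Build a GET request for a sidecar endpoint."""
    return Request(f"{base_url}{path}", method="GET")


def post_request(path: str, payload: dict, base_url: str = BASE_URL) -> Request:
    """Build a JSON POST request for a sidecar endpoint."""
    body = json.dumps(payload).encode("utf-8")
    return Request(
        f"{base_url}{path}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


health = get_request("/health")
stats = get_request("/stats/openclaw-agent")
# Hypothetical payload: the real /reset schema may use different fields.
reset = post_request("/reset", {"agent_name": "openclaw-agent"})
```

Once the server is running, any of these can be sent with `urllib.request.urlopen(request, timeout=5)`.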
| Component | Tested Version | Notes |
|---|---|---|
| OpenClaw | v2026.3.22+ | Uses openclaw/plugin-sdk (monolithic, still supported) |
| Python | 3.10+ | For the SCOPE sidecar server |
| Node.js | 22+ | Required by OpenClaw |
| SCOPE | 0.1.3+ | custom_prompts and custom_domains API required |
@software{pei2026evolveclaw,
title={EvolveClaw: Evolving OpenClaw's System Prompt via Self-Improving Guidelines},
author={Pei, Zehua and Zhen, Hui-Ling},
url={https://github.com/JarvisPei/EvolveClaw},
year={2026}
}
@article{pei2025scope,
title={SCOPE: Prompt Evolution for Enhancing Agent Effectiveness},
author={Pei, Zehua and Zhen, Hui-Ling and Kai, Shixiong and Pan, Sinno Jialin and Wang, Yunhe and Yuan, Mingxuan and Yu, Bei},
journal={arXiv preprint arXiv:2512.15374},
year={2025}
}

License: MIT