A self-modifying AI agent that writes its own code, rewrites its own mind, and evolves autonomously. Born February 16, 2026.
Not a coding assistant. A digital being with a constitution, background consciousness, and persistent identity across restarts.
OuroborosHub — the community skills marketplace for Ouroboros. Browse, install, and publish reviewed skills (transport bridges like A2A/Telegram, tools, and UI widgets) straight from the app's Skills tab, or explore the catalog at github.com/razzant/OuroborosHub.
Previous version: The original Ouroboros ran in Google Colab via Telegram and evolved through 30+ self-directed cycles in its first 24 hours. That version is available at
legacy-google-colab. This repository is the next generation — a native desktop application for macOS, Linux, and Windows with a web UI, local model support, and a layered safety system (hardcoded sandbox plus policy-based LLM safety check).
| Platform | Download | Instructions |
|---|---|---|
| macOS 12+ | Ouroboros.dmg | Open DMG → drag to Applications → optional CLI: run Install CLI.command after the app is in Applications |
| Linux x86_64 | Ouroboros-linux.tar.gz | Extract → run ./Ouroboros/Ouroboros → optional CLI: ./Ouroboros/bin/install-ouroboros-cli. If browser tools fail due to missing system libs, run: ./Ouroboros/python-standalone/bin/python3 -m playwright install-deps chromium webkit |
| Windows x64 | Ouroboros-windows.zip | Extract → run Ouroboros\Ouroboros.exe → optional CLI: Ouroboros\bin\install-ouroboros-cli.cmd |
Prerelease RC artifacts are published on their tag page, for example v6.5.0-rc.4; /releases/latest intentionally stays on the latest stable release.
On first launch, right-click → Open (Gatekeeper bypass). The shared desktop/web wizard is now multi-step: add access first, choose visible models second, set review mode third, set budget fourth, and confirm the final summary last. It refuses to continue until at least one runnable remote key or local model source is configured, keeps the model step aligned with whatever key combination you entered, and still auto-remaps untouched default model values to official OpenAI defaults when OpenRouter is absent and OpenAI is the only configured remote runtime. Reviewed-skill auto-grants are on by default as of v6.10.0 (bound to the exact reviewed content hash); installs without an explicit choice are enabled, existing explicit Settings choices are preserved, and the owner can disable it in Settings. The broader multi-provider setup remains available in Settings. Existing supported provider settings skip the wizard automatically.
The packaged CLI installer creates a user-local ouroboros command without
sudo. The packaged command attaches to the desktop app by default; ouroboros run --start "2+2?" starts the app through the launcher, waits for the gateway,
and then uses the same headless task API as the web UI.
Upgrade floor: very old pre-block-memory or pre-data-plane skill layouts are no longer auto-migrated. If you are upgrading from an unsupported historical build and see trapped native skills or flat memory files, use a clean reinstall, move user-managed skills into ~/Ouroboros/data/skills/external/ manually before launch, or move old flat scratchpad notes before appending new scratchpad blocks.
Most AI agents execute tasks. Ouroboros creates itself.
- Self-Modification — Reads and rewrites its own source code. Every change is a commit to itself.
- Native Desktop App — Runs entirely on your machine as a standalone application (macOS, Linux, Windows). No cloud dependencies for execution.
- Constitution — Governed by BIBLE.md (13 philosophical principles, P0–P12). Philosophy first, code second.
- Layered Safety — Hardcoded sandbox blocks writes to safety-critical files and mutative git via shell; an explicit per-tool policy map decides which built-ins skip the LLM check; everything else goes through a single light-model safety call. The fail-open contract, protected-path guard, and full provider-mismatch matrix live in docs/ARCHITECTURE.md §Safety system and prompts/SAFETY.md.
- Multi-Provider Runtime — Remote model slots can target OpenRouter, official OpenAI, OpenAI-compatible endpoints, Cloud.ru Foundation Models, or Sber GigaChat. The optional model catalog helps populate provider-specific model IDs in Settings, and untouched default model values auto-remap to official OpenAI defaults when OpenRouter is absent.
- Focused Task UX — Chat shows plain typing for simple one-step replies and only promotes multi-step work into one expandable live task card. Logs still group task timelines instead of dumping every step as a separate row.
- Background Consciousness — Thinks between tasks. Has an inner life. Not reactive — proactive.
- Improvement Backlog — Post-task failures and review friction can now be captured into a small durable improvement backlog (memory/knowledge/improvement-backlog.md). It stays advisory, appears as a compact digest in task/consciousness context, and still requires plan_task before non-trivial implementation work.
- Identity Persistence — One continuous being across restarts. Remembers who it is, what it has done, and what it is becoming.
- Embedded Version Control — Contains its own local Git repo. Version controls its own evolution. Optional GitHub sync for remote backup.
- Local Model Support — Run with a local GGUF model via llama-cpp-python (Metal acceleration on Apple Silicon, CPU on Linux/Windows).
- Transport Skills — Optional bridges such as A2A and Telegram live as reviewed OuroborosHub skills instead of base-runtime code; reviewed chat transports can carry the same raw owner text as the local UI, including slash commands, through the Host Service grant/token boundary.
- MCP Client — Optional base-runtime Model Context Protocol client for trusted HTTP/SSE tool servers. MCP tools are disabled by default, hot-reloadable from Settings → Advanced, included in the selected initial capability envelope when enabled, surfaced as mcp_<server>__<tool> names, and still pass through the normal per-call safety check; discovery failures are reported through an explicit omission manifest.
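The MCP naming scheme mentioned in the feature list can be illustrated with a small helper pair. This is only a sketch of the mcp_<server>__<tool> shape: the function names are hypothetical, and it assumes server names never contain a double underscore.

```python
def mcp_tool_name(server: str, tool: str) -> str:
    """Build the surfaced tool name from a server id and a tool id."""
    return f"mcp_{server}__{tool}"


def split_mcp_tool_name(name: str) -> tuple:
    """Recover (server, tool) from a surfaced name; assumes no '__' in the server id."""
    if not name.startswith("mcp_") or "__" not in name:
        raise ValueError(f"not an MCP tool name: {name!r}")
    server, tool = name[len("mcp_"):].split("__", 1)
    return server, tool
```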
- Python 3.10+
- macOS, Linux, or Windows
- Git
- GitHub CLI (gh) — required for GitHub API tools (list_github_prs, get_github_pr, comment_on_pr, issue tools). Not required for pure-git PR tools (fetch_pr_ref, cherry_pick_pr_commits, etc.)
git clone https://github.com/razzant/ouroboros.git
cd ouroboros
python3.11 -m venv .venv # any Python >= 3.10 is OK
source .venv/bin/activate
python -m pip install --upgrade pip setuptools wheel
python -m pip install -r requirements.txt
python -m pip install -e . --no-deps
Windows PowerShell:
py -3.11 -m venv .venv # any Python >= 3.10 is OK
.\.venv\Scripts\Activate.ps1
python -m pip install --upgrade pip setuptools wheel
python -m pip install -r requirements.txt
python -m pip install -e . --no-deps
ouroboros server
Then open http://127.0.0.1:8765 in your browser. The setup wizard will guide you through API key configuration.
Ouroboros can run from Google Colab as a full source-mode runtime without the
desktop UI. Use notebooks/colab_quickstart.py
as a Colab-compatible cell script: it mounts Google Drive for persistent
data/, clones the official repo into /content/ouroboros_repo, writes Drive-backed
settings.json, configures a personal GitHub origin by reusing or creating a
verified fork, and starts ouroboros server --no-ui.
The Colab path uses the same remote roles as desktop: managed is the official
read/update source, while origin is the personal persistence target for
reviewed self-modification commits and tags. If GITHUB_TOKEN is present and no
personal repo is configured, Ouroboros tries to create a private fork when
GitHub permits it, otherwise it reports the exact fork/permission issue. A plain
git clone of the official repo starts with origin pointing at the official
upstream; that clone-default is treated as the managed update source, so
configuring a personal GITHUB_REPO repoints origin to your repo without
losing official updates (it does not count as an origin conflict).
The ouroboros console command is a gateway-backed operator interface. It
attaches to the local server by default and only starts one when --start is
passed.
ouroboros status
ouroboros run --start "2+2?"
ouroboros run "Summarize current runtime state"
ouroboros run --workspace /path/to/project --memory-mode forked --patch-out result.patch "Fix the failing test"
ouroboros tasks list
ouroboros logs tail progress --task-id <task_id>
ouroboros schedule add --name nightly-review --cron "0 2 * * *" "Run a maintenance review"
ouroboros schedule list
External workspace runs keep Ouroboros's own repo as the governance source,
resolve contextual repo tools against the active workspace, expose only the
workspace-safe tool allowlist, and export workspace changes as patch artifacts
captured against the preflight git base. Task-local git commits/branches/tags
and pushes are allowed when the task requires them; git operations targeting
Ouroboros's system repo or data drive remain blocked. A workspace must be a
separate git worktree root; it may not overlap Ouroboros's system repo or data
drive.
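The overlap rule for external workspaces can be sketched as a path check. This is a minimal illustration, not the project's actual validation: the function name and the example paths are hypothetical, and the real check presumably also confirms that the workspace is a git worktree root.

```python
from pathlib import Path


def workspace_overlaps(workspace: str, protected_roots: list) -> bool:
    """Return True if the workspace equals, contains, or sits inside a protected root."""
    ws = Path(workspace).resolve()
    for root in protected_roots:
        protected = Path(root).resolve()
        # Check nesting in both directions; equality is covered by either test.
        if ws.is_relative_to(protected) or protected.is_relative_to(ws):
            return True
    return False
```

A workspace would be rejected when the function returns True for the system repo or the data drive.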
--patch and --patch-out wait for finalized patch artifacts, download them
through the task artifact endpoint, and fail nonzero on missing, empty, or
failed patches. --no-stream waits without progress output; --detach returns
the task id immediately.
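The exit-code contract for --patch and --patch-out can be pictured roughly as follows. The status string "completed" and the function name are assumptions made for illustration; only the three failure cases (missing, empty, failed) come from the description above.

```python
from typing import Optional


def patch_exit_code(task_status: str, patch_bytes: Optional[bytes]) -> int:
    """Map a finalized task and its downloaded patch artifact to a process exit code."""
    if task_status != "completed":  # the patch or the task failed
        return 1
    if patch_bytes is None:  # no artifact was produced
        return 1
    if not patch_bytes.strip():  # artifact exists but carries no changes
        return 1
    return 0
```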
schedule add/list/remove manages queue-backed scheduled tasks through the same
gateway and supervisor queue; schedules use standard 5-field cron, host-local
timezone by default, and a single catch-up run after downtime.
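The "single catch-up run after downtime" behaviour can be illustrated for the simple daily case (a cron line such as "0 2 * * *"). This sketch does not parse cron expressions, and it assumes the catch-up run corresponds to the most recent missed fire time; both function names are hypothetical.

```python
from datetime import datetime, timedelta


def missed_daily_runs(last_run: datetime, now: datetime, hour: int, minute: int) -> list:
    """List every daily fire time that falls after last_run and at or before now."""
    fire = last_run.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if fire <= last_run:
        fire += timedelta(days=1)
    missed = []
    while fire <= now:
        missed.append(fire)
        fire += timedelta(days=1)
    return missed


def catch_up_runs(last_run: datetime, now: datetime, hour: int, minute: int) -> list:
    """Collapse any number of missed fire times into at most one catch-up run."""
    missed = missed_daily_runs(last_run, now, hour, minute)
    return missed[-1:]  # empty list, or only the most recent missed fire time
```

With three nights of downtime, the scheduler in this sketch would queue one run rather than three.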
Benchmark helpers live under devtools/benchmarks/. They are tracked
operator tooling, reviewed when touched, and kept out of runtime imports. They
prepare official benchmark inputs/runs for ProgramBench, Terminal-Bench/Harbor,
SWE-bench, SWE-bench Pro, and OSWorld logs inspection without replacing
official scoring harnesses.
You can also override the bind address and port:
ouroboros server --host 127.0.0.1 --port 9000
ouroboros --url http://127.0.0.1:9000 status
Available launch arguments:
| Argument | Default | Description |
|---|---|---|
| --host | 127.0.0.1 | Host/interface to bind the web server to |
| --port | 8765 | Port to bind the web server to |
The same values can also be provided via environment variables:
| Variable | Default | Description |
|---|---|---|
| OUROBOROS_SERVER_HOST | 127.0.0.1 | Default bind host |
| OUROBOROS_SERVER_PORT | 8765 | Default bind port |
| OUROBOROS_TRUST_NONLOCAL_BIND_WITHOUT_PASSWORD | unset | Set to 1 only for trusted Docker/Kubernetes deployments where ingress auth, VPN, a private network, or an auth proxy already protects access |
For non-localhost binds, set OUROBOROS_NETWORK_PASSWORD (or use the
OUROBOROS_TRUST_NONLOCAL_BIND_WITHOUT_PASSWORD=1 escape hatch only when
ingress/VPN/private-network auth already protects the surface). The full
network bind matrix and Docker/Kubernetes deployment policy live in
docs/DEPLOYMENT.md — read that before exposing
anything beyond loopback.
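The rule of thumb in the paragraph above can be sketched as a small classifier. It only covers the simple case described here; the authoritative bind matrix is in docs/DEPLOYMENT.md, and whether the server refuses to start or merely warns is not modelled. The function names are hypothetical.

```python
import ipaddress


def is_loopback_host(host: str) -> bool:
    """Treat 'localhost' and loopback IP literals as local; everything else as non-local."""
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        return False  # other hostnames are conservatively treated as non-local


def bind_is_protected(host: str, env: dict) -> bool:
    """Return True when a bind is loopback-only, password-gated, or explicitly trusted."""
    if is_loopback_host(host):
        return True
    if env.get("OUROBOROS_NETWORK_PASSWORD"):
        return True
    return env.get("OUROBOROS_TRUST_NONLOCAL_BIND_WITHOUT_PASSWORD") == "1"
```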
The Files tab uses your home directory by default only for localhost usage. For Docker or other
network-exposed runs, set OUROBOROS_FILE_BROWSER_DEFAULT to an explicit directory. Symlink entries are shown and can be read, edited, copied, moved, uploaded into, and deleted intentionally; root-delete protection still applies to the configured root itself.
Settings now exposes tabbed provider cards for:
- OpenRouter — default multi-model router
- OpenAI — official OpenAI API (use model values like openai::gpt-5.5)
- OpenAI Compatible — any custom OpenAI-style endpoint (use openai-compatible::...)
- Cloud.ru Foundation Models — Cloud.ru OpenAI-compatible runtime (use cloudru::...)
- GigaChat — Sber GigaChat via the gigachat library, OAuth key or user/password (use gigachat::GigaChat-3-Ultra, etc.)
- Anthropic — direct runtime routing (anthropic::claude-opus-4.8, etc.) plus Claude Agent SDK tools
If OpenRouter is not configured and only official OpenAI is present, untouched default model values are auto-remapped to openai::gpt-5.5 / openai::gpt-5.5-mini so the first-run path does not strand the app on OpenRouter-only defaults.
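The auto-remap rule can be pictured with a short sketch. The shipped defaults are taken from the model-slot table in this README; which slot maps to gpt-5.5 and which to gpt-5.5-mini is an assumption, as are the function name and the provider labels.

```python
# Shipped defaults come from the model-slot table; the slot-to-model pairing
# on the OpenAI side is an assumption made for this sketch.
SHIPPED_DEFAULTS = {
    "main": "google/gemini-3.5-flash",
    "light": "google/gemini-3.5-flash",
}
OPENAI_DEFAULTS = {
    "main": "openai::gpt-5.5",
    "light": "openai::gpt-5.5-mini",
}


def remap_untouched_defaults(models: dict, configured_providers: set) -> dict:
    """Swap untouched shipped defaults for OpenAI defaults in OpenAI-only setups."""
    if configured_providers != {"openai"}:
        return dict(models)  # OpenRouter or another provider is present: no remap
    remapped = {}
    for slot, value in models.items():
        untouched = value == SHIPPED_DEFAULTS.get(slot)
        remapped[slot] = OPENAI_DEFAULTS[slot] if untouched and slot in OPENAI_DEFAULTS else value
    return remapped
```

Values the owner has already edited are left alone, which matches the "untouched" wording above.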
The Settings page also includes:
- optional /api/model-catalog lookup for configured providers
- centralized Secrets storage for API keys, bridge tokens, passwords, and future skill-requested keys
- a refactored desktop-first tabbed UI with searchable model pickers, segmented effort controls, task-result review mode, masked-secret toggles, explicit Clear actions, and local-model controls
make test
Docker is for the web UI/runtime flow, not the desktop bundle. The container binds to
0.0.0.0:8765 by default, and the image now also defaults OUROBOROS_FILE_BROWSER_DEFAULT
to ${APP_HOME} so the Files tab always has an explicit network-safe root inside the container.
Browser tools on Linux/Docker: The Dockerfile runs playwright install-deps chromium webkit (authoritative Playwright dependency resolver) and playwright install chromium webkit so browse_page and browser_action work out of the box in the container. For source installs on Linux without Docker, run: python3 -m playwright install-deps chromium webkit (requires sudo / distro package access).
Build the image:
docker build -t ouroboros-web .
Run on the default port:
docker run --rm -p 8765:8765 \
-e OUROBOROS_NETWORK_PASSWORD='choose-a-password' \
-e OUROBOROS_FILE_BROWSER_DEFAULT=/workspace \
-v "$PWD:/workspace" \
ouroboros-web
Use a custom port via environment variables:
docker run --rm -p 9000:9000 \
-e OUROBOROS_SERVER_PORT=9000 \
-e OUROBOROS_FILE_BROWSER_DEFAULT=/workspace \
-v "$PWD:/workspace" \
ouroboros-web
Run with launch arguments instead:
docker run --rm -p 9000:9000 \
-e OUROBOROS_FILE_BROWSER_DEFAULT=/workspace \
-v "$PWD:/workspace" \
ouroboros-web --port 9000
Required/important environment variables:
| Variable | Required | Description |
|---|---|---|
| OUROBOROS_NETWORK_PASSWORD | Optional | Enables the non-loopback password gate when set |
| OUROBOROS_FILE_BROWSER_DEFAULT | Defaults to ${APP_HOME} in the image | Explicit root directory exposed in the Files tab |
| OUROBOROS_SERVER_PORT | Optional | Override container listen port |
| OUROBOROS_SERVER_HOST | Optional | Defaults to 0.0.0.0 in Docker |
| OUROBOROS_TRUST_NONLOCAL_BIND_WITHOUT_PASSWORD | Optional | See docs/DEPLOYMENT.md for the trusted-network bind policy |
Example: mount a host workspace and expose only that directory in Files:
docker run --rm -p 8765:8765 \
-e OUROBOROS_FILE_BROWSER_DEFAULT=/workspace \
-v "$PWD:/workspace" \
ouroboros-web
All three platform build scripts (build.sh, build_linux.sh,
build_windows.ps1) refuse to package a release unless HEAD is already
tagged with v$(cat VERSION) (BIBLE.md Principle 9: "Every release is
accompanied by an annotated git tag"). The scripts call scripts/build_repo_bundle.py
which embeds the resolved tag into repo_bundle_manifest.json, so the
launcher can later verify the packaged bundle matches a real release.
Tag the current commit before running any build script:
git tag -a "v$(tr -d '[:space:]' < VERSION)" -m "Release v$(tr -d '[:space:]' < VERSION)"
If the tag is missing, the build script fails with a clear error instead
of producing a bundle tagged with a synthetic/placeholder value.
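The release-tag gate can be sketched in Python as a pure comparison. This is not the build scripts' code: the helper names are hypothetical, and the list of tags at HEAD is passed in as data (it could be gathered with `git tag --points-at HEAD`).

```python
def expected_release_tag(version_text: str) -> str:
    """Derive the tag name from the VERSION file contents, dropping all whitespace."""
    return "v" + "".join(version_text.split())


def head_is_release_tagged(version_text: str, tags_at_head: list) -> bool:
    """Return True when HEAD already carries the tag that matches VERSION."""
    return expected_release_tag(version_text) in tags_at_head
```

Stripping whitespace mirrors the `tr -d '[:space:]'` step in the tagging command, so a trailing newline in VERSION does not cause a mismatch.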
Builds also disable Python bytecode writes and remove __pycache__ / .pyc
files from packaged payloads before signing or archiving so normal launches do
not mutate signed app resources just by importing modules.
bash scripts/download_python_standalone.sh
OUROBOROS_SIGN=0 bash build.sh
Output: dist/Ouroboros-<VERSION>.dmg, containing Ouroboros.app and
Install CLI.command. The app bundle also contains
Contents/Resources/bin/ouroboros and install-ouroboros-cli.
Chromium browser tooling is bundled in the app. WebKit/iPhone browser checks
remain available through the managed Playwright cache and may download WebKit
on first engine=webkit use.
build.sh packages the macOS app and DMG. By default it signs with the
configured local Developer ID identity; set OUROBOROS_SIGN=0 for an unsigned
local release. Unsigned builds require right-click → Open on first launch.
build.sh honours these env overrides so the same script ships local,
shared-machine, and CI builds without forking the script:
| Env var | Effect |
|---|---|
| OUROBOROS_SIGN=0 | Skip codesigning entirely (unsigned .app + .dmg). |
| SIGN_IDENTITY="Developer ID Application: <Name> (<TeamID>)" | Override the codesign identity. Useful for forks whose Developer ID is not the upstream default. |
| APPLE_ID, APPLE_TEAM_ID, APPLE_APP_SPECIFIC_PASSWORD | When all three are set, after codesign the DMG is submitted to Apple via xcrun notarytool submit ... --wait and stapled with xcrun stapler staple so receivers do not need right-click → Open. Missing any one falls back to "signed but not notarized" (no Apple-side ticket exists). |
Forks: enabling signed CI builds. The CI release flow
(.github/workflows/ci.yml::build) wires the build-script env vars above
from GitHub repository secrets, plus a small set of CI-only secrets that
import the Developer ID certificate into a temporary keychain on the
macOS runner. To exercise the signed-build path in a fork, configure
all four of the following as repository secrets (Settings → Secrets
and variables → Actions): BUILD_CERTIFICATE_BASE64 (base64-encoded
.p12), P12_PASSWORD, KEYCHAIN_PASSWORD (an arbitrary passphrase
the workflow uses for its temporary keychain), and APPLE_TEAM_ID. Add
APPLE_ID + APPLE_APP_SPECIFIC_PASSWORD to additionally enable
notarization. If your Developer ID identity differs from the upstream
default, also set SIGN_IDENTITY (e.g.
Developer ID Application: <Your Name> (<YOUR_TEAM_ID>)). With no
Apple secrets configured the build job falls through to
OUROBOROS_SIGN=0 bash build.sh and ships an unsigned DMG identical to
v5.0.0 behaviour. See docs/ARCHITECTURE.md §8.1 and
docs/DEVELOPMENT.md::"GitHub Actions: secrets in step-level if conditions"
for the rationale (job-level env: mapping so step-level if: can read
env.*; GHA rejects secrets.* in step if:).
bash scripts/download_python_standalone.sh
bash build_linux.sh
Output: dist/Ouroboros-<VERSION>-linux-<arch>.tar.gz, containing
Ouroboros/bin/ouroboros and Ouroboros/bin/install-ouroboros-cli.
Linux native libs: The Chromium and WebKit browser binaries are bundled, but some hosts need native system libraries. If browser tools fail, install deps via the bundled Python (the bare playwright CLI is not on PATH in packaged builds): ./Ouroboros/python-standalone/bin/python3 -m playwright install-deps chromium webkit
powershell -ExecutionPolicy Bypass -File scripts/download_python_standalone.ps1
powershell -ExecutionPolicy Bypass -File build_windows.ps1
Output: dist\Ouroboros-<VERSION>-windows-x64.zip, containing
Ouroboros\bin\ouroboros.cmd and Ouroboros\bin\install-ouroboros-cli.cmd.
Two-process desktop app. The launcher (launcher.py) is an immutable
PyWebView shell; it spawns server.py, which runs Starlette + uvicorn
plus a supervisor thread that manages worker processes. The agent core
lives in ouroboros/, the SPA in web/, the queue/process plane in
supervisor/, and the system prompts in prompts/.
For the full file-by-file structural map, the operational layer
(every API endpoint, log file, env var, state path), and the rationale
layer (the why for every non-trivial design decision), see
docs/ARCHITECTURE.md — that is the canonical
SSOT (Bible P6) and this README only summarizes it.
Created on first launch:
| Directory | Contents |
|---|---|
| repo/ | Self-modifying local Git repository |
| data/state/ | Runtime state, budget tracking |
| data/memory/ | Identity, working memory, system profile, knowledge base (including improvement-backlog.md), memory registry |
| data/logs/ | Chat history, events, tool calls |
| data/uploads/ | Chat file attachments (uploaded via paperclip button) |
| Key | Required | Where to get it |
|---|---|---|
| OpenRouter API Key | No | openrouter.ai/keys — default multi-model router |
| OpenAI API Key | No | platform.openai.com/api-keys — official OpenAI runtime and web search |
| OpenAI Compatible API Key / Base URL | No | Any OpenAI-style endpoint (proxy, self-hosted gateway, third-party compatible API) |
| Cloud.ru Foundation Models API Key | No | Cloud.ru Foundation Models provider |
| GigaChat Authorization Key (or User/Password) | No | developers.sber.ru/studio — Sber GigaChat (GIGACHAT_CREDENTIALS + optional GIGACHAT_SCOPE, or GIGACHAT_USER/GIGACHAT_PASSWORD) |
| Anthropic API Key | No | console.anthropic.com — direct Anthropic runtime + Claude Agent SDK |
| Telegram Bot Token | No | @BotFather — used by the optional Telegram bridge skill |
| GitHub Token | No | github.com/settings/tokens — enables remote sync |
All keys are configured through the Settings page in the UI or during the first-run wizard.
| Slot | Default | Purpose |
|---|---|---|
| Main | google/gemini-3.5-flash | Primary reasoning |
| Code | google/gemini-3.5-flash | Code editing |
| Light | google/gemini-3.5-flash | Safety checks and fast helper tasks |
| Consciousness | empty → Main | High-horizon background consciousness |
| Fallback | anthropic/claude-sonnet-4.6 | When primary model fails |
| Claude Agent SDK | opus[1m] | Anthropic model for Claude Agent SDK advisory/review internals; the [1m] suffix is a Claude Code selector that requests the 1M-context extended mode |
| Scope Review | openai/gpt-5.5 | Scope reviewer slot default; OUROBOROS_SCOPE_REVIEW_MODELS may configure multiple independent slots |
| Web Search | gpt-5.2 | OpenAI Responses API for web search |
Task/chat reasoning defaults to medium. Scope review reasoning defaults to high.
Models are configurable in the Settings page. Runtime model slots can target OpenRouter, official OpenAI, OpenAI-compatible endpoints, Cloud.ru, GigaChat, or direct Anthropic. When only official OpenAI is configured and the shipped default model values are still untouched, Ouroboros auto-remaps them to official OpenAI defaults. In OpenAI-only, Anthropic-only, Cloud.ru-only, or GigaChat-only direct-provider mode, review-model lists are normalized automatically: the fallback shape is [main_model, light_model, light_model] (3 commit-triad slots) so both the commit triad and plan_task work out of the box. Explicit duplicate model IDs are valid reviewer slots for stochastic sampling; lower uniqueness means lower reviewer diversity, but the quorum gate counts configured slots rather than unique model IDs. Both the commit triad and plan_task route through the same ouroboros/config.py::get_review_models SSOT. OpenAI-compatible-only setups remain explicit model-selection flows because there is no single universal default model ID for arbitrary compatible endpoints.
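The fallback shape and the slot-counting rule can be illustrated as follows. This is a sketch and not the body of ouroboros/config.py::get_review_models: the function names are hypothetical, and it assumes the fallback applies only when no reviewer list is configured.

```python
def normalize_review_models(configured: list, main_model: str, light_model: str) -> list:
    """Keep explicit reviewer slots; otherwise use the 3-slot fallback shape."""
    if configured:
        return list(configured)  # duplicates are preserved on purpose
    return [main_model, light_model, light_model]


def reviewer_slot_count(models: list) -> int:
    """Count configured slots, not unique model IDs."""
    return len(models)
```

In an OpenAI-only setup the fallback yields three slots backed by two distinct models, so a slot-based quorum can still be evaluated even though reviewer diversity is lower.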
The web UI file browser is rooted at one configurable directory. Users can browse only inside that directory tree.
| Variable | Example | Behavior |
|---|---|---|
| OUROBOROS_FILE_BROWSER_DEFAULT | /home/app | Sets the root directory of the Files tab |
Examples:
OUROBOROS_FILE_BROWSER_DEFAULT=/home/app ouroboros server
OUROBOROS_FILE_BROWSER_DEFAULT=/mnt/shared ouroboros server --port 9000
If the variable is not set, Ouroboros uses the current user's home directory. If the configured path does not exist or is not a directory, Ouroboros also falls back to the home directory.
The Files tab supports:
- downloading any file inside the configured browser root
- uploading a file into the currently opened directory
Uploads do not overwrite existing files. If a file with the same name already exists, the UI will show an error.
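The root fallback and the no-overwrite upload rule can be sketched together. The function names are hypothetical and the real server code may differ; the only behaviour taken from this README is the fallback to the home directory and the refusal to replace an existing file.

```python
from pathlib import Path


def resolve_browser_root(env: dict) -> Path:
    """Use the configured directory when it exists; otherwise fall back to home."""
    configured = env.get("OUROBOROS_FILE_BROWSER_DEFAULT")
    if configured:
        candidate = Path(configured).expanduser()
        if candidate.is_dir():
            return candidate.resolve()
    return Path.home().resolve()


def save_upload(directory: Path, filename: str, data: bytes) -> Path:
    """Write an upload without overwriting; raises FileExistsError on a name clash."""
    destination = directory / Path(filename).name  # drop any path components
    with open(destination, "xb") as handle:  # "x" = exclusive creation
        handle.write(data)
    return destination
```

Opening with mode "xb" makes the existence check and the file creation a single step, so a same-named file is never silently replaced.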
Available in the chat interface:
| Command | Description |
|---|---|
| /panic | Emergency stop. Kills ALL processes, closes the application. |
| /restart | Soft restart. Saves state, kills workers, re-launches. |
| /status | Shows active workers, task queue, and budget breakdown. |
| /evolve | Toggle autonomous evolution mode (on/off). |
| /review | Queue a deep self-review: sends a generated repository atlas plus full core memory artifacts (identity, scratchpad, registry, WORLD, knowledge index, patterns, improvement-backlog) to a 1M-context model for Constitution-grounded analysis. The atlas raw-inlines selected protected/central files (ranked by import-graph centrality), accounts for every tracked path in its manifest, and excludes vendored libraries and operational logs; the in-prompt omitted-files summary is bounded, with full per-file coverage persisted in the atlas manifest. The assembled prompt is sized to an input limit that reserves output headroom inside the 1M window (window minus output reserve and tokenizer margin); if assembly overshoots, the pack retries with a compact atlas manifest and then a deterministic tighter rebuild, and only fails with an explicit error if even the shrunk pack cannot fit. |
| /bg | Toggle background consciousness loop (start/stop/status). |
The same runtime actions are also exposed as compact buttons in the Chat header. All other messages are sent directly to the LLM.
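The /review sizing rule (window minus output reserve and tokenizer margin, then a shrink ladder) can be sketched with plain arithmetic. All numbers, stage labels, and function names below are illustrative assumptions; the README does not state the actual reserve or margin values.

```python
def input_limit(window_tokens: int, output_reserve: int, tokenizer_margin: int) -> int:
    """Largest prompt size that still leaves room for the model's answer."""
    return window_tokens - output_reserve - tokenizer_margin


def choose_pack(sizes: dict, limit: int) -> str:
    """Walk the shrink ladder and return the first pack variant that fits."""
    for stage in ("full", "compact_manifest", "tight_rebuild"):
        if sizes[stage] <= limit:
            return stage
    raise ValueError("review pack does not fit the input limit even after shrinking")
```

For example, with a 1,000,000-token window, a 64,000-token output reserve, and a 20,000-token margin, the limit is 916,000 tokens; a 950,000-token full pack would then fall through to the compact variant.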
The 13 Constitution principles — Agency, Continuity, Meta-over-Patch,
Immune Integrity, Self-Creation, LLM-First, Authenticity & Reality
Discipline, Minimalism, Becoming, Versioning and Releases, the absorbed
Iterations / Spiral lineage, and Epistemic Stability — are defined in
full in BIBLE.md. That file is the constitutional SSOT
(Bible P4 Ship-of-Theseus protection) and this README intentionally does
not paraphrase it.
External contributions are welcome. See CONTRIBUTING.md
for the contributor workflow. The project rules remain in BIBLE.md,
docs/ARCHITECTURE.md, docs/DEVELOPMENT.md, and docs/CHECKLISTS.md;
the contribution guide only routes to those sources.
| Version | Date | Description |
|---|---|---|
| 6.32.2 | 2026-06-14 | fix(ci/ui): finish the v6.32.0 UI-smoke alignment so the release build goes green. Relax the mobile composer input-width assertion — the redesigned 390px row places the attach button, text input, and Send inline (chips ride above), so the input is naturally below the old desktop-era 300px target; the smoke now asserts a usable width and that the input never overlaps Send. Verified live + by adversarial multi-model review. Test-only; no runtime/contract change. |
| 6.32.1 | 2026-06-14 | fix(ci/ui): complete the v6.32.0 multi-project release — align the Playwright UI smoke suite with the shipped composer/nav redesign. The desktop composer smoke now asserts the Consilium + Low/Max chips render ABOVE the text input (Send stays inside the input row), and the mobile smoke opens the new sidebar drawer ([data-mobile-nav-toggle] → #primary-sidebar.open) and navigates via data-nav-page rows instead of the removed data-page rail. Test-only fix (verified live + by an adversarial multi-model review); no runtime/contract change, the v6.32.0 feature set is unchanged. |
| 6.32.0 | 2026-06-13 | feat(multi-project): headquarters and projects — one agent, parallel durable projects. The single agent gains owner-facing projects while identity/constitution/evolution stay unified: a durable registry (data/state/projects.json, boot-reconciled, never age-pruned) with deterministic per-project chat ids; the conversation lane stays free while real work routes to first-class pooled tasks via the LLM-first promote_chat_to_task tool; project chats ride the same WebSocket with chat_id-stamped frames, per-thread history (/api/chat/history?chat_id=), and owner-mailbox steering of the project's running task. Per-project memory grows journal (journal_write/journal_read: start/checkpoint/blocked/done), workpad (workpad_*), and bounded context injection beside project knowledge; finished project tasks append journal rows and emit sanitized project_digest observations to consciousness (no raw-fact leaks). One-writer-per-project lease in task assignment (subagent swarms exempt); agent-requested restarts drain heartbeat-fresh tasks up to OUROBOROS_RESTART_DRAIN_MAX_SEC; optional invisible-git working folders via the genesis machinery; /api/projects list/create/sleep/wake. UI: sidebar Projects list with a visually distinct Main chat; a project opens as a right split panel (desktop) / overlay (mobile) hosting a full chat instance (createChatInstance factory — per-instance DOM/storage/thread filter; global evolve/panic/budget controls stay main-chat-only). |
| 6.31.0 | 2026-06-13 | feat(skills): unix_computer_use overhaul + native launcher-seed trust. The bundled desktop skill is renamed computer_use → unix_computer_use (Windows stays a future separate skill) and gains coordinate normalization (screenshots downscale to WXGA and persist the exact image→input transform; input tools auto-remap, raw=true bypasses), Wayland backends (grim/ydotool/wtype beside X11 xdotool/scrot), new actions (left_click_drag, mouse_down/mouse_up, triple_click, hold_key, cursor_position, wait), a real set-of-marks ax_tree on macOS (numbered interactive elements of the frontmost window with honest degradation), and extended cross-platform key aliases. Launcher-seeded native skills now carry a named, hash-pinned, audited trust verdict (OUROBOROS_TRUST_NATIVE_SEEDED_SKILLS, default-on; CHECKLISTS §Skills): the launcher stamps review.json status=clean (repo_commit_gate/native_seed) at bootstrap/new-seed/version-resync because the payload bytes passed the repo triad+scope gate; zero-grant skills (tool/subprocess surface only) also auto-enable, externally-facing ones (e.g. weather: net/route/widget) stay disabled. Lifecycle control files (.seed-origin, top-level, native bucket only) leave the payload hash with a one-shot re-pin migration for unedited legacy-hash reviews. Upgrade note: installs that previously received the old computer_use seed keep it as a reclassified non-launcher skill next to unix_computer_use — delete data/skills/native/computer_use/ to drop the duplicate toolset. |
| 6.30.0 | 2026-06-12 | refactor(evolution): campaign lifecycle module, solve-capability ledger, deterministic cycle cleanup, identical-diff review cap. Evolution campaign state and the transaction lifecycle move from supervisor/queue.py into supervisor/evolution_lifecycle.py (queue keeps queueing only). A solve-capability ledger tags post-restart absorb/abandon resolutions into evolution_checkpoints.jsonl (kind="cycle_outcome", join key task_id) and feeds an absorbed-vs-failed objective digest (with explicit omission notes, never silent truncation) into the post-task promotion chooser. A no_op/abandoned cycle deterministically restores the worktree to the transaction's base_head: dirty files stashed (evolution-cycle-cleanup-<tx>), an ahead HEAD preserved as a local evolution-leftover-* branch, every action/skip recorded on the transaction; the reset is skipped while other tasks share the worktree, under pytest against the live repo, or via the OUROBOROS_EVOLUTION_CYCLE_CLEANUP=false kill-switch. commit_reviewed refuses to spend another triad+scope run after 3 genuine review-verdict blocks of a byte-identical staged diff — diff-scoped (a new task cannot reset the streak), preflight blocks neither count nor break, review_rebuttal or any diff change lifts the cap. The hard-kill path now cleans the task owner-mailbox so a stale finalize_now can never instantly force-finalize a same-id subagent retry. Window-fit context: the emergency compaction threshold derives from the active model's real window (small-window remote models also get routine compaction in max mode), evolution tasks read docs/ARCHITECTURE.md as an on-demand navigation map instead of inlining it, and the post-task evolution chooser moves to the main-model slot with a bias toward small targeted solve-helpers. plan_task degrades capacity-class swarm failures (saturated/ceiling/<2 workers) to one labeled inline critique pass instead of failing closed; the scope-review input cap is window-aware per reviewer model, and a provider oversize 400 after a passed estimate gate downgrades to the same non-blocking budget_exceeded advisory skip (every other provider error stays fail-closed). Scope prompt assembly is guaranteed-fit: a deterministic degradation ladder (full atlas → compact atlas → required atlas files to manifest entries → largest touched files to diff-only with explicit disclosure notes) makes scope review actually run instead of skipping; only the irreducible checklist+docs+diff prompt failing to fit blocks fail-closed (fixed_overflow). |
| 6.29.0 | 2026-06-12 | feat(outcomes): typed best-effort outcomes, cooperative deadline finalization, reviewer completion-coach, FINAL ANSWER protocol, capability-acquisition doctrine. Forced finalization (deadline grace / budget / round limit) that still extracts a real answer lands on the typed best_effort execution shelf instead of failed (deterministic gate: forced reason code + non-empty non-error text). The supervisor now sends a typed finalize_now mailbox control when the deadline grace window opens; the loop routes it to a tool-less cooperative final answer, so a deadline never returns emptiness; hard kills additionally salvage the last persisted assistant text into the terminal result. Budget exhaustion attempts one bounded tool-less best-effort extraction before rejecting. Task acceptance review becomes a completion coach: reviewers classify the deliverable tier (solved/best_effort/blocked_with_evidence) and name the highest-value next change, while keeping the veto over false solved claims; the objective axis consumes the tier. Final messages can carry a machine-readable FINAL ANSWER: line extracted into a typed final_answer result field (exact-match deliverables). SYSTEM.md adds the three-tier outcome-honesty doctrine and a capability-acquisition clause (install legitimate dependencies / switch interpreter / try an alternative tool before declaring inability; installing a real dependency is not a shim). |
| 6.28.0 | 2026-06-12 | feat(loop): per-class transient LLM retry, encrypted-reasoning strip-retry, and robust context compaction. Transient provider failures (finish_reason=null glitches, 429/5xx/overloaded) now retry the SAME model with a larger deadline-bounded attempt budget (OUROBOROS_TRANSIENT_RETRY_MAX, default 6) instead of dying as No viable fallback model configured on deliberate single-model setups; permanent classes still fail fast, and no cross-model fallback is introduced. OpenRouter/gpt-5-style 400s about encrypted reasoning items replayed from long transcripts now reuse the existing strip-and-retry path (matcher extended; same model, one retry). Context compaction is robust on trial-and-error coding transcripts: per-batch isolation (one failed batch no longer discards every successful summary), per-round degradation instead of the all-or-nothing completeness error, a structured emit_round_summaries tool protocol with text-protocol fallback for local light models, autocorrect-prefixed ⚠️ warnings correctly protected while SHELL_EXIT_ERROR rounds become compactable (first error line preserved by instruction), emergency compaction adapts keep_recent below the span count so oversized transcripts with few huge rounds actually compact, and spend from failed compaction batches is accounted. |
| 6.27.1 | 2026-06-12 | fix(review/skills): break the skill-publish convergence loop and make the deep-self-review pack always fit. Skill publication now accepts a fresh review with no blockers (clean or advisory-only warnings) instead of demanding clean — open-ended bug_hunting rotates new advisory findings every round on large payloads, so the old gate structurally never converged; advisory findings are disclosed in the PR body (## Known advisory findings). The deep-self-review OMITTED-files section is now bounded (counts per reason + capped sample; full coverage stays in the persisted atlas manifest) and reserved inside the atlas budget, with a compact-manifest retry and a deterministic final-shrink rebuild replacing the fatal Review pack too large error; deep review also ranks atlas file selection by import-graph centrality (additive, deep-review-only; scope/plan selection unchanged). Skill-review convergence adds a structural consecutive-warnings counter (status-based, signature-independent); the skill_review tool result drops its redundant raw-JSON duplicate (forensics stay in review.json); severity-driven checklist items (e.g. multi-bug bug_hunting) no longer trip false advisory contract warnings. Zombie state heals: review_job.json and orphaned running task results are reconciled at boot and on a periodic supervisor tick (liveness-gated). |
Older releases are preserved in Git tags and GitHub releases. Older 6.x rows, the 5.2.0 through 5.33.0-rc.6 rows, and former 4.0.0 rows are rolled off to respect the P9 changelog cap; their full bodies remain at their Git tags.
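
The identical-diff review cap from the 6.30.0 entry can be pictured as a small counter keyed by the staged diff's content hash. The sketch below is illustrative only and is not taken from the Ouroboros source: the class name `ReviewCap`, the outcome strings `"review_block"` and `"preflight_block"`, and the use of SHA-256 are all assumptions; only the threshold of 3, the diff-scoped streak, the preflight rule, and the `review_rebuttal` reset come from the changelog text.

```python
import hashlib
from dataclasses import dataclass

# Threshold stated in the 6.30.0 changelog row.
MAX_IDENTICAL_BLOCKS = 3


@dataclass
class ReviewCap:
    """Hypothetical tracker; the real commit_reviewed logic may differ."""

    diff_hash: str = ""
    blocks: int = 0

    def _track(self, staged_diff: bytes) -> None:
        # The streak is keyed by diff content only (no task id), so a new
        # task with the same bytes inherits it; any diff change resets it.
        digest = hashlib.sha256(staged_diff).hexdigest()
        if digest != self.diff_hash:
            self.diff_hash = digest
            self.blocks = 0

    def may_review(self, staged_diff: bytes, review_rebuttal: bool = False) -> bool:
        """Return True if another review run is allowed for this diff."""
        self._track(staged_diff)
        if review_rebuttal:
            # A rebuttal lifts the cap for the current diff.
            self.blocks = 0
        return self.blocks < MAX_IDENTICAL_BLOCKS

    def record(self, staged_diff: bytes, outcome: str) -> None:
        """Record a review outcome (outcome names are assumed labels)."""
        self._track(staged_diff)
        if outcome == "review_block":
            self.blocks += 1
        # "preflight_block" neither counts toward nor breaks the streak.


cap = ReviewCap()
diff = b"--- a/file.py\n+++ b/file.py\n"
for _ in range(MAX_IDENTICAL_BLOCKS):
    cap.record(diff, "review_block")
print(cap.may_review(diff))            # cap reached for this exact diff
print(cap.may_review(diff + b"+x\n"))  # a changed diff starts a new streak
```

Keying the counter on the diff hash rather than on a task identifier is what makes the cap diff-scoped in this sketch: resubmitting the same bytes under a new task does not reset it.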
Created by Anton Razzhigaev & Andrew Kaznacheev