Private-by-default memory and context orchestration for AI agents.
ContextLattice provides a single memory contract for agentic systems:
- Unified write/read contract for memory and context.
- Durable fanout across retrieval/storage lanes.
- Staged retrieval (fast now, deep continuation when needed).
- Agent sessions that turn prior work, graph touches, skills, checkpoints, and handoffs into prompt-ready packages and exportable run cards.
- Go/Rust runtime ownership for the active application path.
- Legacy Python runtime archived under archive/services/orchestrator_legacy_python for tooling/test compatibility only.
- Local-first deployment with optional hosted surfaces.
v3.4.2 is the public agent runtime contract baseline: universal adapter lifecycle, native agent sessions, objective runtime state, scoped recall, checkpoints, handoffs, completion flow, runtime telemetry, one-command runtime proof, storage-governance hardening, and local session-store diagnostics behind one local contract.
v4 remains the private tuning lane for experiments that still need benchmark, recall, and soak gates before public promotion.
- Ingress: gateway-go.
- Core memory + retrieval lanes: Go + Rust services.
- Degradation policy: fail-open retrieval with continuation lifecycle.
- Tooling compatibility: MCP + HTTP clients.
- Single-container lite builds (Dockerfile.hf-lite) also run gateway-go (no Python runtime dependency).
- Public single-container lite vector default: topic_rollups only.
- Public local lite core default: topic_rollups + qdrant; pgvector and memory-bank spike adapters are not started by default.
- Public local lite advanced: opt-in adapter lab via gmake mem-up-lite-advanced.
- Full/operator stacks: Qdrant remains the primary vector-native lane; pgvector stays supported for SQL-co-located vector workloads.
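
The fail-open policy and continuation lifecycle can be pictured with a small sketch. This is not the gateway-go implementation; the `Lane` signature, the `StagedResult` type, the split of lanes into "fast" and "deep", and the continuation shape are all hypothetical and only illustrate the idea that a failing lane is recorded rather than allowed to fail the request.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional

Lane = Callable[[str], List[str]]  # hypothetical lane signature: query -> hits


@dataclass
class StagedResult:
    hits: List[str] = field(default_factory=list)
    failed_lanes: List[str] = field(default_factory=list)
    continuation: Optional[Dict[str, object]] = None  # set when deep work remains


def staged_retrieve(query: str, fast: Dict[str, Lane], deep: Dict[str, Lane]) -> StagedResult:
    """Run the fast lanes now and defer the deep lanes behind a continuation."""
    result = StagedResult()
    for name, lane in fast.items():
        try:
            result.hits.extend(lane(query))
        except Exception:
            # Fail-open: note the broken lane and keep serving the other lanes.
            result.failed_lanes.append(name)
    if deep:
        # Hypothetical continuation handle a caller could use to ask for more.
        result.continuation = {"query": query, "pending_lanes": sorted(deep)}
    return result
```

A caller would return `hits` immediately and only follow `continuation` when the fast answer is not enough.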
git clone git@github.com:sheawinkler/ContextLattice.git
cd ContextLattice
cp .env.example .env
gmake quickstart

gmake quickstart prompts for the runtime profile and then launches the selected stack.

curl -fsS http://127.0.0.1:8075/health | jq
scripts/agent/agent-runtime-proof-pack --pretty
scripts/agent/agent-adoption-proof-matrix --skip-provider-smoke --progress --pretty

Expected:
- /health returns {"ok": true, ...}
- agent-runtime-proof-pack completes bootstrap, scoped recall, checkpoint, handoff, completion, status, prompt context package, and runtime telemetry phases.
- agent-adoption-proof-matrix verifies configured agent profiles and reports the skills, context, session, graph, and handoff evidence shaping each run, with trace commands for run-card export.
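
For scripted checks, the same health probe can be done without curl and jq. The sketch below assumes only what the quickstart shows: the URL and an `ok` field in the JSON body. The function names are illustrative, and any other fields in the response are ignored.

```python
import json
import urllib.request

HEALTH_URL = "http://127.0.0.1:8075/health"  # URL taken from the quickstart


def is_healthy(payload: object) -> bool:
    """True only when the payload is an object with a literal JSON true under "ok"."""
    return isinstance(payload, dict) and payload.get("ok") is True


def check_health(url: str = HEALTH_URL, timeout: float = 5.0) -> bool:
    """Fetch the health endpoint; network and decode errors count as unhealthy."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return is_healthy(json.loads(resp.read().decode("utf-8")))
    except (OSError, ValueError):
        return False
```

`check_health()` returns a plain boolean, so it can gate a later step such as running the proof pack.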
Task inference defaults to ORCH_INFER_PROVIDER=auto. gateway-go detects the host profile and probes local backends before selecting a route.
- Apple Silicon default priority: mlx, vllm-metal, ane_sidecar, llama-cpp, ollama.
- CUDA/ROCm default priority: sglang, vllm, openai-compatible, llama-cpp, lmstudio, ollama.
- Generic CPU default priority: openai-compatible, llama-cpp, lmstudio, ollama.
- Supported provider ids include sglang, vllm, vllm-metal, mlx, mtplx (alias for MLX), openai-compatible, lmstudio, llama-cpp, tgi, tensorrt-llm, ane_sidecar, and ollama.
- /v1/inference/runtime-policy returns live provider health plus resource-aware model guidance. If host memory/VRAM is not identifiable, it falls back to generic local advice: start with Q4/IQ4 7B-9B models, benchmark, then scale up.
- Large Qwen3.6 Dream Mode models are opt-in only; ContextLattice does not bundle or pull them by default. The default GGUF recommendation is mudler/Qwen3.6-35B-A3B-Claude-4.7-Opus-Reasoning-Distilled-APEX-MTP-GGUF for llama.cpp-compatible advanced users. Abliterated variants are private-eval only behind CONTEXTLATTICE_DREAM_ALLOW_PRIVATE_EVAL_MODELS=true (GO_DREAM_ALLOW_UNCENSORED_MODELS=true remains a legacy alias).
- Inference runtimes must emit final assistant content through their API. Reasoning-only responses fail with repair instructions instead of being accepted. For MLX Qwen thinking templates, use scripts/inference_mlx_server.sh --model /path/to/mlx/model --template-profile qwen-final-content, then verify with scripts/inference_template_conformance.sh --provider mlx --model /path/to/mlx/model.
- Dream Mode reflects on generated hypotheses by default and performs one bounded deepening pass when the best output misses the sigma target (GO_DREAM_REFLECT_ENABLED=true, GO_DREAM_DEEPEN_ON_WEAK_OUTPUT=true, GO_DREAM_REFLECTION_MIN_SCORE=0.74).
- Ollama remains a compatibility fallback, not the preferred always-on embedding path.
- Local helpers enforce one active LLM backend by default (CONTEXTLATTICE_SINGLE_ACTIVE_INFER_BACKEND=true).
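
The priority lists above amount to "first healthy provider wins". The following sketch shows that rule in isolation. It is not gateway-go's routing code: the profile keys, the `select_provider` function, and the way probe results are passed in are assumptions, while the provider ids and their order are copied from the lists above.

```python
from typing import Iterable, Optional

# Default priorities copied from the README, keyed by hypothetical profile names.
DEFAULT_PRIORITY = {
    "apple_silicon": ["mlx", "vllm-metal", "ane_sidecar", "llama-cpp", "ollama"],
    "cuda_rocm": ["sglang", "vllm", "openai-compatible", "llama-cpp", "lmstudio", "ollama"],
    "cpu": ["openai-compatible", "llama-cpp", "lmstudio", "ollama"],
}
ALIASES = {"mtplx": "mlx"}  # the README lists mtplx as an alias for MLX


def select_provider(profile: str, healthy: Iterable[str], override: str = "") -> Optional[str]:
    """Return the first provider in priority order that answered its probe."""
    # A comma-separated override stands in for ORCH_INFER_PROVIDER_PRIORITY.
    order = [p.strip() for p in override.split(",") if p.strip()] or DEFAULT_PRIORITY[profile]
    alive = {ALIASES.get(p, p) for p in healthy}
    for provider in order:
        canonical = ALIASES.get(provider, provider)
        if canonical in alive:
            return canonical
    return None  # nothing healthy: the caller decides how to degrade
```

For example, on an Apple Silicon host where only llama-cpp and ollama respond, the sketch picks llama-cpp because it appears earlier in that profile's list.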
Inspect live routing and benchmark configured backends:
scripts/inference_runtime_policy.sh
scripts/benchmark_inference_backends.sh
scripts/inference_template_conformance.sh --provider mlx --model /path/to/mlx/model

Embedding defaults to the Rust fastembed-rs sidecar. Ollama stays available as an explicit compatibility fallback, not the preferred embedding path.
Useful model runtime knobs:
ORCH_INFER_PROVIDER=auto
ORCH_INFER_PROVIDER_PRIORITY=mlx,vllm-metal,ane_sidecar,sglang,vllm,openai-compatible,llama-cpp,ollama
ORCH_INFER_AUTO_PROBE_ENABLED=true
SGLANG_BASE_URL=http://127.0.0.1:30000
VLLM_BASE_URL=http://127.0.0.1:8000
VLLM_METAL_BASE_URL=http://127.0.0.1:8000
MLX_API_BASE=http://127.0.0.1:18087/v1
LLAMA_CPP_BASE_URL=http://127.0.0.1:8080

Installer and quickstart paths install agent helpers under ~/.contextlattice/bin.
contextlattice_agent_adapter profiles
contextlattice_adopt status --pretty
contextlattice_agent_start --soft --compact
contextlattice_agent_trace --session-id <session-id> --tree
contextlattice_pack "what should the next agent know?" --project my-project --pretty
contextlattice_search -h
contextlattice_write -h
contextlattice_checkpoint -h
contextlattice_source_backfill --source jsonl --path data.jsonl --project my-project --pretty

- contextlattice_agent_adapter is the first-class lifecycle helper for bootstrap, context-pack, checkpoint, handoff, event, and completion flows.
- contextlattice_adopt is the zero-friction front door for local readiness, install repair, lifecycle proof, no-secrets agent packs, and new agent profile templates.
- contextlattice_agent_start runs the lightweight startup guard for agents.
- contextlattice_agent_trace renders the bounded run-shaping trail as a terminal tree, JSON, or Markdown run card.
- contextlattice_pack compiles a bounded prompt-ready packet with ranked evidence, files to inspect, risks, checks, source coverage, and a reference_prompt.
- contextlattice_checkpoint writes a checkpoint and verifies readback.
- contextlattice_source_backfill imports bounded files, JSONL, JSON, CSV, SQLite, DuckDB/Parquet, or Postgres data through the same memory write contract.
- Hook pack details: docs/agent-hooks.md.
ContextLattice tracks live agent work as first-class sessions, independent of the runner or model provider.
- Start/list/read sessions through GET|POST /v1/agents/sessions and GET /v1/agents/sessions/{session_id}.
- Emit normalized events through POST /v1/agents/sessions/event or POST /v1/agents/sessions/{session_id}/events.
- Inspect a bounded run trace through GET /v1/agents/sessions/{session_id}/trace; the trace reports context, skills that may be helpful, source coverage, graph touches, handoffs, checkpoints, and timeline events without raw provider payloads.
- Read live runtime telemetry from GET /telemetry/agents/runtime.
- Compile task context through POST /memory/context-pack, POST /tools/context_pack, or global contextlattice_pack; responses include context_compiler, ranked evidence, prompt sections, and a bounded reference_prompt.
- Preflight, context-pack, and Dream Mode return objective_runtime_state.v1 with objective_state, action_executed, evidence, objective_delta, risk_or_blocker, and next_action.
- Use scripts/agent/contextlattice-agent-adapter or global contextlattice_agent_adapter as the first-class product path for agent bootstrap, context-pack, checkpoint, handoff, event, and completion flows.
- Use scripts/agent/contextlattice-adopt or global contextlattice_adopt before handing ContextLattice to a new agent/account; it wraps gateway health, helper install state, shell PATH, storage posture, session store, profile coverage, and runtime-doctor checks into one bounded report.
- Run scripts/agent/agent-runtime-proof-pack --pretty or global contextlattice_agent_runtime_proof --pretty for a one-command live proof that bootstrap, scoped recall, checkpoint, handoff, completion, status, and runtime telemetry are wired end to end.
- Use scripts/agent/contextlattice-session for CLI start/event/complete/fail/status/runtime/trace flows.
- Use scripts/agent/agent-run-trace --session-id <id> --tree or global contextlattice_agent_trace --session-id <id> --tree to see the terminal trace, then --markdown to export the run card.
- Use scripts/agent/contextlattice-session sweep-stale-audits --all-projects --pretty for dry-run-first cleanup of stale objective-runtime audit/preflight sessions; add --confirm only after reviewing matches.
- scripts/agent/contextlattice-pack, scripts/agent/contextlattice-dream, scripts/agent/writeback, and compaction hooks auto-start or recover a session when CONTEXTLATTICE_SESSION_ID is absent.
- Pass --session-id or CONTEXTLATTICE_SESSION_ID to force a specific session. Set CONTEXTLATTICE_AUTO_SESSION_DISABLED=1 to disable automatic session creation.
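
The session rules in the last two bullets can be read as a short decision chain. The sketch below is one plausible reading, not the helpers' actual code: the function name and the `auto_start` callback are hypothetical, and letting the `--session-id` flag win over the environment variable is an assumption the README does not state.

```python
from typing import Callable, Mapping, Optional


def resolve_session_id(
    flag_value: Optional[str],
    env: Mapping[str, str],
    auto_start: Callable[[], str],
) -> Optional[str]:
    """Pick the session a helper should attach to, or None if it should run without one."""
    if flag_value:
        return flag_value  # explicit --session-id (assumed to take precedence)
    env_value = env.get("CONTEXTLATTICE_SESSION_ID", "").strip()
    if env_value:
        return env_value  # forced through the environment
    if env.get("CONTEXTLATTICE_AUTO_SESSION_DISABLED") == "1":
        return None  # automatic session creation is switched off
    return auto_start()  # auto-start or recover a session
```

Passing `os.environ` as `env` gives the real behaviour of the sketch; a plain dict keeps it easy to test.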
Canonical event families include session.started, context_pack.completed, dream.completed, graph.neighbors_returned, graph.edge_touched, decision.made, test.ran, handoff.created, writeback.completed, and session.completed.
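
As a rough illustration of emitting one of these events over HTTP, the sketch below builds a request for the per-session events route listed earlier. The JSON field names (`type`, `detail`) are hypothetical, since the README does not document the event body, and any API-key header the route requires is left out. Treating the listed families as a strict allowlist is also an assumption; the text only says the canonical set includes them.

```python
import json
import urllib.request

# Event families named in the README; used here as an allowlist by assumption.
CANONICAL_EVENTS = frozenset({
    "session.started", "context_pack.completed", "dream.completed",
    "graph.neighbors_returned", "graph.edge_touched", "decision.made",
    "test.ran", "handoff.created", "writeback.completed", "session.completed",
})


def build_event_request(base_url: str, session_id: str, event_type: str,
                        detail: dict) -> urllib.request.Request:
    """Build (but do not send) a POST for /v1/agents/sessions/{session_id}/events."""
    if event_type not in CANONICAL_EVENTS:
        raise ValueError(f"not a listed event family: {event_type}")
    # Hypothetical body shape; adjust to the real event schema.
    body = json.dumps({"type": event_type, "detail": detail}).encode("utf-8")
    url = f"{base_url.rstrip('/')}/v1/agents/sessions/{session_id}/events"
    return urllib.request.Request(
        url, data=body, method="POST",
        headers={"Content-Type": "application/json"},
    )
```

The returned request can be sent with `urllib.request.urlopen(req)` once the body shape and authentication match the running gateway.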
- macOS DMG: https://github.com/sheawinkler/ContextLattice/releases/latest/download/ContextLattice-macOS-universal.dmg
- Homebrew cask: brew tap sheawinkler/contextlattice && brew install --cask contextlattice
- Windows MSI: https://github.com/sheawinkler/ContextLattice/releases/latest/download/ContextLattice-windows-x64.msi
- Linux bundle: https://github.com/sheawinkler/ContextLattice/releases/latest/download/ContextLattice-linux-bootstrap.tar.gz
| Profile | CPU | RAM | Storage |
|---|---|---|---|
| Lite core | 2-4 vCPU | 8-12 GB | 25-80 GB |
| Lite advanced | 4-6 vCPU | 12-16 GB | 80-140 GB |
| Full | 6-8 vCPU | 12-20 GB | 100-180 GB |
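
One way to use the sizing table is to pick the largest profile a host can satisfy. The sketch below treats the lower bound of each range as a minimum requirement, which is an interpretation of the table rather than something the README states; the function name is illustrative.

```python
from typing import Optional

# (profile, min vCPU, min RAM GB, min storage GB), using the low end of each range.
PROFILES = [
    ("Full", 6, 12, 100),
    ("Lite advanced", 4, 12, 80),
    ("Lite core", 2, 8, 25),
]


def largest_profile(vcpu: int, ram_gb: float, disk_gb: float) -> Optional[str]:
    """Return the biggest profile whose assumed minimums the host meets."""
    for name, min_cpu, min_ram, min_disk in PROFILES:
        if vcpu >= min_cpu and ram_gb >= min_ram and disk_gb >= min_disk:
            return name
    return None  # below the Lite core range
```

A host with 4 vCPU, 16 GB RAM, and 90 GB of free disk would land on Lite advanced under these assumptions.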
- GET|POST /v1/memory/edges persists explicit typed relationships.
- POST /v1/memory/edges/backfill audits or applies deterministic retroactive edges and opt-in same-project inferred_related scoring. It is dry-run by default.
- POST /v1/memory/neighbors returns explicit/inferred edge neighbors merged with semantic/topic neighbors.
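
To make the "merged neighbors" idea concrete, here is a minimal sketch of combining edge neighbors with semantic neighbors. The record fields (`id`, `kind`, `confidence`), the rule that edge neighbors take precedence over semantic duplicates, and the function itself are assumptions for illustration; the 0.90 default only mirrors the --min-confidence value used in the backfill commands.

```python
from typing import Dict, Iterable, List


def merge_neighbors(edge_neighbors: Iterable[dict], semantic_neighbors: Iterable[dict],
                    min_confidence: float = 0.90) -> List[dict]:
    """Merge edge and semantic neighbors, dropping weak inferred edges and duplicates."""
    merged: Dict[str, dict] = {}
    for n in edge_neighbors:
        # Explicit edges always pass; inferred edges must clear the threshold.
        if n["kind"] == "inferred" and n["confidence"] < min_confidence:
            continue
        merged[n["id"]] = n
    for n in semantic_neighbors:
        # Keep an edge neighbor if the same memory also shows up semantically.
        merged.setdefault(n["id"], n)
    return list(merged.values())
```

The result preserves insertion order, so edge neighbors come first and semantic-only neighbors follow.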
./scripts/agent/memory-edge-backfill
./scripts/agent/memory-edge-backfill --include-inferred --min-confidence 0.90
./scripts/agent/memory-edge-backfill --write
./scripts/agent/memory-edge-inferred-retrofill --all-projects
./scripts/agent/memory-edge-inferred-retrofill --all-projects --profile exploratory
./scripts/agent/memory-edge-inferred-retrofill --all-projects --profile exploratory --write --confirm-retrofill ALL_PROJECTS
./scripts/agent/memory-edge-inferred-retrofill --project hermes-agent-ultra --corpus disk --profile exploratory

Bring existing data into ContextLattice without changing the ingest boundary.
Backfill is dry-run by default, writes go through /memory/write, and writes
require --write --confirm-write <project>.
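
The dry-run-by-default rule can be expressed as a two-key gate: nothing is written unless both flags are present. The sketch below is not the source-backfill-memory script; it reuses the flag names from the text, and requiring the --confirm-write value to equal --project is an inference from the `<project>` placeholder.

```python
import argparse
from typing import List


def parse_backfill_args(argv: List[str]) -> argparse.Namespace:
    """Parse only the flags that matter for the write gate."""
    parser = argparse.ArgumentParser(prog="source-backfill-sketch")
    parser.add_argument("--project", required=True)
    parser.add_argument("--write", action="store_true")
    parser.add_argument("--confirm-write", default="")
    return parser.parse_args(argv)


def should_write(args: argparse.Namespace) -> bool:
    """Dry-run unless --write is set and --confirm-write names the same project."""
    return bool(args.write) and args.confirm_write == args.project
```

With this gate, a mistyped project name in --confirm-write leaves the run in dry-run mode instead of writing to the wrong place.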
./scripts/agent/source-backfill-memory --source jsonl --path exports/tasks.jsonl --project my-project --pretty
./scripts/agent/source-backfill-memory --source sqlite --path app.db --table notes --project my-project --pretty
./scripts/agent/source-backfill-memory --source parquet --path warehouse/events.parquet --project my-project --pretty
./scripts/agent/source-backfill-memory --source postgres --dsn "$DATABASE_URL" --query "select id,title,body from notes limit 100" --project my-project --pretty
./scripts/agent/source-backfill-memory --source jsonl --path exports/tasks.jsonl --project my-project --write --confirm-write my-project --apply-edges --pretty

Supported adapters: files/directories, JSONL, JSON, CSV, SQLite, DuckDB, Parquet
via DuckDB, and Postgres via optional psycopg. Import caps cover records, row
bytes, document bytes, total bytes, and structured-list items. Secret-like
fields are redacted by default, and graph edge repair is optional and bounded.
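
A minimal sketch of what "caps plus redaction" can look like for an import stream follows. It covers only two of the caps named above (records and row bytes); the key pattern used to spot secret-like fields, the `[REDACTED]` marker, and the choice to skip oversized rows are all assumptions, not the importer's documented behaviour.

```python
import json
import re
from typing import Iterable, Iterator

# Hypothetical pattern for secret-like field names.
SECRET_KEY = re.compile(r"(password|passwd|secret|token|api[_-]?key)", re.IGNORECASE)


def redact(record: dict) -> dict:
    """Replace values of secret-like top-level fields with a marker."""
    return {k: ("[REDACTED]" if SECRET_KEY.search(k) else v) for k, v in record.items()}


def bounded_records(records: Iterable[dict], max_records: int,
                    max_row_bytes: int) -> Iterator[dict]:
    """Yield redacted records until the record cap is hit, skipping oversized rows."""
    emitted = 0
    for record in records:
        if emitted >= max_records:
            break
        clean = redact(record)
        if len(json.dumps(clean).encode("utf-8")) > max_row_bytes:
            continue  # row exceeds the byte cap; skip rather than truncate
        emitted += 1
        yield clean
```

Redacting before measuring means the byte cap applies to what would actually be written.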
ContextLattice exposes active skills as a native Go Skills Index so agents can discover relevant capabilities without loading every SKILL.md into prompt context. In local installs, the active index mounts ${HOME}/.codex/skills read-only by default. Quarantined/vendor skill discovery remains a separate read-only lane and does not auto-load quarantined skills.
- Active index endpoint: GET|POST /v1/skills/index/search
- Active index tool: GET|POST /tools/skills_index_search
- Active index status/reindex endpoint: POST /v1/skills/index/reindex (live native scan; no prompt loading)
- Search endpoint: GET|POST /v1/skills/quarantine/search
- Tool alias: GET|POST /tools/skills_quarantine_search
- Reindex endpoint: POST /v1/skills/quarantine/reindex (off by default; enable explicitly)
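
The quarantine search limits are governed by the ORCH_SKILLS_QUARANTINE_DEFAULT_LIMIT and ORCH_SKILLS_QUARANTINE_MAX_LIMIT knobs in this README. The sketch below shows one plausible way such a pair is applied to a requested result count; the clamping semantics and the function name are assumptions, and only the knob names and their defaults (20 and 100) come from the document.

```python
from typing import Mapping, Optional


def effective_limit(requested: Optional[int], env: Mapping[str, str]) -> int:
    """Apply the default when no limit is given and never exceed the maximum."""
    default = int(env.get("ORCH_SKILLS_QUARANTINE_DEFAULT_LIMIT", "20"))
    maximum = int(env.get("ORCH_SKILLS_QUARANTINE_MAX_LIMIT", "100"))
    if requested is None or requested <= 0:
        return min(default, maximum)  # a misconfigured default is still capped
    return min(requested, maximum)
```

Under these assumptions a request for 500 results is served as 100, and a request with no limit gets 20.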
Runtime knobs:
ORCH_SKILLS_QUARANTINE_ENABLED=true
ORCH_SKILLS_QUARANTINE_HOST_BIN_DIR=${HOME}/.local/bin
ORCH_SKILLS_INDEX_HOST_ACTIVE_ROOT_DIR=${HOME}/.codex/skills
ORCH_SKILLS_INDEX_HOST_SYSTEM_ROOT_DIR=${HOME}/.codex/skills/.system
ORCH_SKILLS_INDEX_ROOTS=/opt/contextlattice/skills_active:/opt/contextlattice/skills_system
ORCH_SKILLS_QUARANTINE_HOST_ROOT_DIR=${HOME}/.codex/skills_quarantine
ORCH_SKILLS_QUARANTINE_SEARCH_CMD=/opt/contextlattice/skills/bin/codex-skills-quarantine-search
ORCH_SKILLS_QUARANTINE_REINDEX_CMD=/opt/contextlattice/skills/bin/codex-skills-quarantine-reindex
ORCH_SKILLS_QUARANTINE_TIMEOUT_SECS=8
ORCH_SKILLS_QUARANTINE_DEFAULT_LIMIT=20
ORCH_SKILLS_QUARANTINE_MAX_LIMIT=100
ORCH_SKILLS_QUARANTINE_REINDEX_ENABLED=false
CODEX_SKILLS_QUARANTINE_ROOT=/opt/contextlattice/skills_quarantine
CODEX_SKILLS_QUARANTINE_INDEX_DIR=/opt/contextlattice/skills_quarantine/index
CODEX_SKILLS_QUARANTINE_INDEX=/opt/contextlattice/skills_quarantine/index/skills_index.jsonl

- Local-first by default.
- API-key protected operational routes.
- Secret-like content redaction controls.
- Premium billing/provider route maps are intentionally kept out of public docs.
- Overview: https://contextlattice.io/
- Architecture: https://contextlattice.io/architecture.html
- Local AI workspace comparison: https://contextlattice.io/local-ai-workspaces.html
- Scaling memory: https://contextlattice.io/scaling-memory.html
- Wiki: https://contextlattice.io/wiki.html
- Installation: https://contextlattice.io/installation.html
- Integrations: https://contextlattice.io/integration.html
- Troubleshooting: https://contextlattice.io/troubleshooting.html
- Updates: https://contextlattice.io/updates.html
- Release notes: docs/releases/v3.4.13.md, docs/releases/v3.4.12.md, docs/releases/v3.4.11.md, docs/releases/v3.4.10.md, docs/releases/v3.4.5.md, docs/releases/v3.4.2.md, docs/releases/v3.4.1.md
Business Source License 1.1 (LICENSE).