Tags · reh3376/mdemg

edge

dev: reh3376_dev01 -> main (#464)

* chore(submodule): bump homebrew-mdemg to v0.10.1 formula

Point the parent at the manually-published v0.10.1 homebrew formula
(reh3376/homebrew-mdemg@10c1843). The release artifacts published cleanly;
the formula update was manual because the CI HOMEBREW_TAP_TOKEN expired
(follow-up: rotate the secret so future releases auto-publish).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(eventgraph-003): sprint plan — reinforcement coverage for other Hebbian paths

Wire the 3 remaining Hebbian write paths (CoactivateSession,
ApplySymbolCoactivation, ApplyNegativeFeedback weaken-only) into the existing
reinforcement_events writer via distinct trigger_path values. No schema/writer/
wiring change (V0022 already has trigger_path + signed delta_weight +
created_new_edge; writer already injected). Contradict path deferred (CONTRADICTS
edges aren't traversed by the federation walk). RETURN-only Cypher edits; Tier-2
asserts unchanged weights. 5 epics, 3 tiers, live Tier-3.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(eventgraph-003): wire CoactivateSession into reinforcement_events (Epic 1)

CoactivateSession (session-internal conversation-observation co-activation, full
Hebbian formula) now emits per-pair reinforcement events with
trigger_path=coactivate_session. RETURN-only Cypher change: replaced the
discarded `count(*)` with the standard 17-field per-pair RETURN (one row per
forward edge; reverse is a mirror). Weight SET untouched → update behavior
provably unchanged. Mirrors the proven ApplyCoactivation record loop; writer
already injected. EXPLAIN-validated (compiles, all RETURN vars in scope, no
writes); build + lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(eventgraph-003): wire ApplySymbolCoactivation into reinforcement_events (Epic 2)

SymbolNode-pair co-activation now emits trigger_path=apply_symbol_coactivation
rows. Split the weight update out of the ON MATCH clause into a separate SET so
the pre-update weight (w) can be captured for prev/new/delta — createdNew
(evidence_count=1) keeps a fresh edge at 0.1 and increments matches by +0.05,
preserving the original ON-clause weight behavior exactly. eta/surprise/
activation/path_sim are NULL (N/A for symbols); roles default 'symbol_node'.
EXPLAIN-validated; build + lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* feat(eventgraph-003): wire ApplyNegativeFeedback weaken path → reinforcement_events (Epic 3)

The weaken path (existing CO_ACTIVATED_WITH edge weakened by negWeight) now emits
trigger_path=apply_negative_feedback rows with a NEGATIVE delta_weight and
created_new_edge=false. The FOREACH writes (weaken SET + contradict MERGE) are
untouched; only the RETURN changed from aggregated `action,count(*)` to per-pair
rows — the Go side counts rows (sum = grouped count, NegativeFeedbackResult
preserved) and emits reinforcement events for weaken rows only. prevWeight is
captured before the FOREACH SET. Contradict path deliberately not emitted
(CONTRADICTS isn't traversed by the federation walk; deferred). EXPLAIN-validated;
build + lint clean.
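
A minimal Go sketch of the row-counting step described above. The names `pairRow`, `event`, and `tally` are hypothetical (the real types are not shown in this log); the sketch only illustrates how per-pair rows can reproduce the grouped counts while emitting events for weaken rows only.

```go
package main

import "fmt"

// pairRow is a hypothetical stand-in for one per-pair row returned by the query.
type pairRow struct {
	Action     string  // "weaken" or "contradict"
	PrevWeight float64 // weight captured before the update
	NewWeight  float64 // weight after the update
}

// event is a hypothetical reinforcement event.
type event struct {
	TriggerPath    string
	DeltaWeight    float64
	CreatedNewEdge bool
}

// tally counts rows per action (equivalent to the old grouped count) and
// builds reinforcement events for weaken rows only.
func tally(rows []pairRow) (map[string]int, []event) {
	counts := make(map[string]int)
	var events []event
	for _, r := range rows {
		counts[r.Action]++
		if r.Action != "weaken" {
			continue // contradict rows are counted but not emitted here
		}
		events = append(events, event{
			TriggerPath:    "apply_negative_feedback",
			DeltaWeight:    r.NewWeight - r.PrevWeight, // negative when weakened
			CreatedNewEdge: false,
		})
	}
	return counts, events
}

func main() {
	rows := []pairRow{
		{Action: "weaken", PrevWeight: 0.5, NewWeight: 0.25},
		{Action: "contradict"},
	}
	counts, events := tally(rows)
	fmt.Println(counts["weaken"], counts["contradict"], len(events), events[0].DeltaWeight)
}
```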

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* fix(conversation): inject learning service so CoactivateSession actually runs

Discovered via EVENTGRAPH-003 live smoke: session co-activation
(CO_ACTIVATED_WITH edges between same-session conversation observations) had
NEVER fired — 0 such edges ever in mdemg-dev across 5495 conversation
observations. Root cause: conversation.NewServiceWithConfig sets
learningService=nil ("set via SetLearningService to avoid circular dependency"),
but SetLearningService had NO caller, so the `if s.learningService != nil` guard
in Observe() always skipped CoactivateSession. The function + its Cypher were
correct (verified by running it directly: 3 pairs, proper Hebbian weights) —
it was just never invoked.

Fix: convSvc.SetLearningService(lea) at construction. Live-verified: 3 distinct
observations in a session now create 6 CO_ACTIVATED_WITH edges + emit
coactivate_session reinforcement events. Standalone fix-commit per the
live-smoke precedent (surprise bugs don't get rolled into the sprint commit).
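
The failure mode above can be reproduced in a few lines of Go. This is a sketch under assumptions: `learner`, `fakeLearner`, and `conversationService` are hypothetical stand-ins, not the project's actual types. It shows how a setter-injected dependency with no caller turns a nil guard into a permanent skip.

```go
package main

import "fmt"

// learner is a hypothetical interface for the learning service.
type learner interface {
	CoactivateSession(sessionID string) int
}

// fakeLearner records how often it was invoked.
type fakeLearner struct{ calls int }

func (f *fakeLearner) CoactivateSession(sessionID string) int {
	f.calls++
	return 3 // pretend three pairs were reinforced
}

// conversationService mirrors the shape described above: the dependency
// starts nil and is supplied later through a setter.
type conversationService struct {
	learningService learner
}

func (s *conversationService) SetLearningService(l learner) { s.learningService = l }

// Observe reports whether co-activation ran. Without an injected learning
// service the guard skips it silently.
func (s *conversationService) Observe(sessionID string) bool {
	if s.learningService == nil {
		return false
	}
	s.learningService.CoactivateSession(sessionID)
	return true
}

func main() {
	svc := &conversationService{}
	fmt.Println(svc.Observe("s1")) // guard skips: nothing was injected
	svc.SetLearningService(&fakeLearner{})
	fmt.Println(svc.Observe("s1")) // co-activation now runs
}
```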

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(eventgraph-003): Tier 3 verification + feature doc + CHANGELOG + close (Epic 4)

All four trigger_paths live-verified (apply_coactivation 50,
apply_symbol_coactivation 1000, apply_negative_feedback 1 negative-delta,
coactivate_session 4 after the dormancy fix); federation CLI surfaces them.
Feature doc updated to
all-four-paths + the trigger_path table; CHANGELOG Added (EVENTGRAPH-003) + Fixed
(CoactivateSession never-invoked); CLAUDE.md note + correction (CoactivateSession
was dead, not "writing via sidecar paths"); verification.md + post.md.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

* docs(eventgraph-004): sprint plan + CoactivateSession post-revival health review (Epic 0)

EVENTGRAPH-004 federates the last unfederated Hebbian write — the
ApplyNegativeFeedback contradict action — into reinforcement_events
(trigger_path=apply_negative_feedback_contradict). Data-decided scope:
reuse the existing V0022 sink (zero CONTRADICTS edges exist anywhere;
no producer calls /v1/learning/negative-feedback — instrument before
the producer arrives, the inverse of the dormancy pattern).

Also closes the EVENTGRAPH-003 follow-up: 30h post-fix health review of
the revived CoactivateSession path — no tuning needed, textbook session
cliques, pre-fix orphans stay as historical record (operator decision).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(eventgraph-004): wire ApplyNegativeFeedback contradict path → reinforcement_events (Epic 1)

The contradict action (no co-activation edge → MERGE CONTRADICTS) was the
last unfederated Hebbian write. The CONTRADICTS MERGE lived inside a
FOREACH, where the edge variable is invisible to RETURN — so the original
single statement is split into two statements in the SAME ExecuteWrite
transaction: (a) weaken (EVENTGRAPH-003 telemetry, RETURN unchanged) and
(b) contradict with a per-pair RETURN. Classification is identical: weaken
never deletes edges, so contradict's NOT EXISTS sees the same edge set the
original OPTIONAL MATCH did.

Contradict rows land with trigger_path=apply_negative_feedback_contradict.
created_new_edge detected via `c.updated_at IS NULL` (ON MATCH always sets
it; ON CREATE never does — invariant pinned by comment). delta_weight is
the CONTRADICTS edge's OWN weight delta (+negWeight on create, 0 on
re-match); negative-feedback semantics are carried by trigger_path, not
the sign.
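
As an illustration of the delta semantics above, a small Go sketch. `contradictEvent` and `contradictEventFor` are hypothetical names, and the boolean input stands in for the `c.updated_at IS NULL` check; the point is that the delta is positive or zero and the negative-feedback meaning lives in the trigger path.

```go
package main

import "fmt"

// contradictEvent is a hypothetical reinforcement event for the contradict path.
type contradictEvent struct {
	TriggerPath    string
	CreatedNewEdge bool
	DeltaWeight    float64
}

// contradictEventFor derives the event fields from whether updated_at was
// NULL (edge just created) and the configured negative weight.
func contradictEventFor(updatedAtIsNull bool, negWeight float64) contradictEvent {
	ev := contradictEvent{
		TriggerPath:    "apply_negative_feedback_contradict",
		CreatedNewEdge: updatedAtIsNull,
	}
	if updatedAtIsNull {
		ev.DeltaWeight = negWeight // the edge's own weight grows on create
	}
	return ev // on re-match the delta stays 0
}

func main() {
	fmt.Printf("%+v\n", contradictEventFor(true, 0.15))
	fmt.Printf("%+v\n", contradictEventFor(false, 0.15))
}
```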

Both statements EXPLAIN-validated against live Neo4j. Tier 1: 2 new parser
tests (create/re-match branches); learning suite green; lint clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(eventgraph-004): Tier 3 live verification — contradict create/re-match + weaken unchanged (Epic 2)

Live against the restarted Epic-1 binary: contradict create row
(+0.15, created_new_edge=true), re-match row (delta=0, evidence=2),
weaken row byte-equivalent to pre-split behavior (negative delta,
floor at 0). Federation CLI surfaces the new trigger_path with no
read-side change. UATS learning_negative_feedback 5/5 PASS.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(eventgraph-004): feature doc + CHANGELOG + UATS pin + close (Epic 3)

Feature doc: 5-path trigger_path table + delta-semantics consumer
warning (contradict delta is the CONTRADICTS edge's own weight delta —
semantics live in trigger_path, not the sign). UATS spec extended:
zero-count equals assertions on nonexistent nodes (hash refreshed,
5/5 live). CLAUDE.md architecture note + producer-gap disclosure.
Sprint close in post.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* ci: auto-sync dev branch with main after each squash-merged PR

Squash merges never advance the dev branch's merge-base, so every
sprint touching CHANGELOG.md/CLAUDE.md hit CONFLICTING on its next PR
(first bitten: PR #419). New sync-dev-after-merge.yml merges main back
into the source *_dev* branch after each merged PR; the GITHUB_TOKEN
push triggers no other workflows, so it can never spawn an empty
auto-PR (the PR #420 failure mode). Conflicts fail loudly for manual
resolution; workflow_dispatch enables manual runs/live testing.

auto-pr.yml additionally skips PR creation when branch content is
identical to main — guards MANUAL sync pushes, verified against the
live repo state (current dev01 ≡ main → empty=true → skip).

actionlint clean (untrusted refs passed via env, not inline).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(roadmap): Q3 2026 vision-derived roadmap from 26-agent codebase deep-dive

Full-codebase review vs MDEMG's purpose (cognitive substrate / connection
layer): 19 map agents (3 vision + 16 subsystem), 3 cross-cutting assessors,
synthesizer + adversarial completeness critic (19 revisions applied).

Verdict: server-side substrate is mature, but the system is not currently
functioning as the assistant's internal dialogue — the per-prompt delivery
channel silently no-ops (hook reads .user_prompt, Claude Code sends
.prompt), 100% of GENERALIZES edges have NULL weight (22,170/22,170,
live-verified), scheduled decay/prune has been a permanent dry-run, RSIC
validates 16/17 actions vacuously, and supervision covers 3 of ~14
background loops. Every defect is the same disease: wired-looking seams
with no caller, wrong contract, or no reader.

4 phases ≈ 75 days committed: (1) reconnect the loop ends, (2) close the
learning loops, (3) survivability + class-ending forcing functions,
(4) FT frontier + release hygiene. Top-10 ranked; deferrals explicit.
Orchestrator spot-verification annex included (5 claims re-verified live).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(hookwire-001): sprint plan — fix hook stdin contract, reconnect per-prompt channel (Epic 0)

Roadmap Q3 Phase 1 rank #1. Audit of all 6 hooks vs the actual Claude
Code stdin schemas: prompt-context.sh reads .user_prompt (CC sends
.prompt) → channel exits silently on every prompt; post-tool-observe.py
reads tool_output (CC sends tool_response) → false "Build/test
succeeded" observations with empty output; guidance wrongly coupled to
RESULT_COUNT>0; and a minor jq bug in pre-compact transcript extraction.
session-start / pre-bash-check / pre-write-check verified correct.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(hookwire-001): prompt-context.sh reads .prompt — revive the per-prompt channel (Epic 1)

Claude Code's UserPromptSubmit stdin field is `prompt`; the hook read
`.user_prompt`, which is always empty → exit 0 → per-prompt CMS recall,
Jiminy guidance, /strict reformulation, the warm trigger, and the
retrieve-time Hebbian reinforcement have NEVER fired in any session.
Now reads `.prompt // .user_prompt` (legacy fallback kept).
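
The hook itself is a shell script using jq; the same field-preference logic can be sketched in Go for illustration. `promptPayload` and `extractPrompt` are hypothetical names. One difference is an assumption of this sketch: jq's `//` falls back only on null or false, while this version also falls back when `prompt` is an empty string.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// promptPayload models only the two fields discussed above.
type promptPayload struct {
	Prompt     string `json:"prompt"`
	UserPrompt string `json:"user_prompt"`
}

// extractPrompt prefers "prompt" and falls back to the legacy "user_prompt".
// Malformed input yields an empty string so the caller can exit silently.
func extractPrompt(stdin []byte) string {
	var p promptPayload
	if err := json.Unmarshal(stdin, &p); err != nil {
		return ""
	}
	if p.Prompt != "" {
		return p.Prompt
	}
	return p.UserPrompt
}

func main() {
	fmt.Println(extractPrompt([]byte(`{"prompt":"what changed in the hook?"}`)))
	fmt.Println(extractPrompt([]byte(`{"user_prompt":"legacy payload"}`)))
}
```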

Also decouples guidance from recall: the RESULT_COUNT=0 branch no longer
exits. Previously it printed its notice and exited, skipping guidance,
warm, and retrieval reinforcement, which coupled independent deliveries.

Both copies (live + installer template). Tier 1 simulated stdin: real
.prompt payload → first-ever guidance delivery (J17 T1 bootstrap + DICT,
5363 guidance bytes vs 0 forever); legacy fallback works; short/empty/
malformed payloads exit silently (fail-open preserved).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(hookwire-001): post-tool-observe reads tool_response — end blind "succeeded" observations (Epic 2)

Claude Code's PostToolUse stdin field is `tool_response` (string or
object); the hook read `tool_output`, which is always absent → output_str
empty → error indicators never matched → every go build/go test/pytest
Bash call was recorded as "Build/test succeeded" sight-unseen, and real
errors were never observed.

Now reads tool_response (fallback tool_output) via _response_text(),
normalizing string|dict|list (stdout/stderr join). Success classification
requires NON-EMPTY clean output — a silent success records nothing rather
than fabricating; failures land as error observations with real stderr.
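
The real hook is Python; the normalization and classification idea can be sketched in Go as follows. `responseText`, `flatten`, `classify`, and the `errorIndicators` list are all hypothetical and only approximate the behavior described above (string, object, or list in; joined text out; empty output records nothing).

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// errorIndicators is an assumed, illustrative list of failure markers.
var errorIndicators = []string{"error", "fail"}

// responseText normalizes a tool_response that may be a JSON string,
// object (stdout/stderr), or list into plain text.
func responseText(raw []byte) string {
	var v any
	if err := json.Unmarshal(raw, &v); err != nil {
		return ""
	}
	return flatten(v)
}

func flatten(v any) string {
	switch t := v.(type) {
	case string:
		return t
	case map[string]any:
		var parts []string
		for _, k := range []string{"stdout", "stderr"} {
			if s, ok := t[k].(string); ok && s != "" {
				parts = append(parts, s)
			}
		}
		return strings.Join(parts, "\n")
	case []any:
		var parts []string
		for _, e := range t {
			if s := flatten(e); s != "" {
				parts = append(parts, s)
			}
		}
		return strings.Join(parts, "\n")
	}
	return ""
}

// classify returns "none" for empty output, "error" when an indicator
// matches, and "progress" otherwise.
func classify(output string) string {
	if strings.TrimSpace(output) == "" {
		return "none"
	}
	lower := strings.ToLower(output)
	for _, ind := range errorIndicators {
		if strings.Contains(lower, ind) {
			return "error"
		}
	}
	return "progress"
}

func main() {
	out := responseText([]byte(`{"stdout":"","stderr":"main.go:3: undefined: x"}`))
	fmt.Println(classify(out), classify(""), classify("ok  pkg 0.1s"))
}
```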

Both copies (template regenerated from fixed live, {{SPACE_ID}}
placeholder preserved, verified identical modulo placeholder). Tier 1
against real CMS: failing build → error obs with stderr; passing →
progress; empty → no record.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(hookwire-001): pre-compact transcript extraction reads the real line shape (Epic 3)

Transcript lines are {type, message:{content:[{type, text|name, ...}]}};
the old top-level `.content` read always yielded empty, so pre-compaction
snapshots never carried recent-activity context. New jq walks
.message.content[] extracting .text/.name. Verified against this
session's real transcript (old: nothing; new: real activity). Both
copies, placeholders preserved.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(hookwire-001): Tier 3 verification + CHANGELOG + CLAUDE.md contract pin + close (Epics 4-5)

Live in the real session: first-ever guidance delivery (J17 T1 bootstrap
+ DICT, 5363 bytes vs 0 forever); real failing build → error observation
with actual compiler output in CMS. PostToolUse success-only firing
documented as a limitation. Hook stdin contract pinned in CLAUDE.md.
Drift + clique-semantics findings logged for HOOKSYNC-001 / Phase 2.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(hooksync-001): sprint plan — drift-proof + self-monitoring hook channel (Epic 0)

Roadmap Q3 Phase 1 rank #2. Investigation grounded all five findings:
template→live drift severed alert delivery (50-entry file actively
rotating today, never shown); no Cleared lifecycle (nothing sets the
field; no /v1/alert* endpoints); no absence detection for the channel
that just had a months-long silent outage; compose publishes 9999 on
0.0.0.0; neural sidecar binds 0.0.0.0:8101 via a 39-day-old process
serving pre-J17-fix code. 8 epics: reconcile, CI parity gate, clear
lifecycle, hook_events absence rule (reuses V0024 via jobhealth),
hooks doctor, PORT-TRUTH rider, Tier 3, docs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(hooksync-001): reconcile bidirectional hook drift — alert delivery restored to live (Epic 1)

Live hooks adopted from templates (SPACE_ID substituted): restores the
alert-display blocks (all-pending per prompt; critical/high + degraded
healthz at session start) that the live copies lacked — the NOSILENT
last mile. Reverse drift caught during reconcile: the live hook's T1/T2
bootstrap-detection block (MAX_TIER → /v1/jiminy/bootstrap → ACTIVE
CONSTRAINTS header) never existed in the template and was nearly lost —
now single-sourced in the template and regenerated into live.

Live-verified: one prompt now renders alerts (50 pending incl. live
CRITICALs) + recall + J17:INIT bootstrap + guidance + synergy footer,
coexisting. All 6 hooks byte-identical to templates modulo {{SPACE_ID}}.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* ci(hooksync-001): hook-template parity gate — live hooks must match templates (Epic 2)

Mirrors the compose/launchd parity pattern: every *.sh/*.py template
must byte-match .claude/hooks/ modulo the {{SPACE_ID}} placeholder.
Proven locally: passes clean, fails (with a bounded diff dump) on
deliberate drift. Ends the bidirectional-drift class that severed
alert delivery and nearly lost the T1 bootstrap block.
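
One way such a parity check could work, sketched in Go. This assumes the comparison is done by substituting the placeholder and comparing the result; `matchesTemplate` is a hypothetical helper, and the actual CI step may instead mask the placeholder on the live side.

```go
package main

import (
	"fmt"
	"strings"
)

// placeholder is the template token named in the commit message.
const placeholder = "{{SPACE_ID}}"

// matchesTemplate renders the template with the given space id and reports
// whether the live hook is identical to the rendered result.
func matchesTemplate(template, live, spaceID string) bool {
	return strings.ReplaceAll(template, placeholder, spaceID) == live
}

func main() {
	tmpl := "#!/bin/sh\nSPACE_ID=\"{{SPACE_ID}}\"\n"
	live := "#!/bin/sh\nSPACE_ID=\"mdemg-dev\"\n"
	fmt.Println(matchesTemplate(tmpl, live, "mdemg-dev"))
	fmt.Println(matchesTemplate(tmpl, live+"# local edit\n", "mdemg-dev"))
}
```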

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(hooksync-001): alert Cleared lifecycle — display once, then delivered (Epic 3)

Alert.Cleared existed but nothing ever set it: once hooks rendered the
file, the same entries would re-render every prompt forever. New:
FileBackend.Clear (ids and/or all_before cutoff, idempotent, under the
existing lock) → Dispatcher.ClearAlerts → POST /v1/alerts/clear. Hooks
now clear exactly what they displayed (fire-and-forget, fail-open);
cleared = delivered-to-operator, not resolved — persisting conditions
re-fire via the evaluator. Alert IDs now CUIDv2 per the identifier
standard (was UnixNano; old ids remain valid opaque strings).
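
A minimal in-memory sketch of the clear semantics described above (ids and/or a cutoff, idempotent, under a lock). `memBackend` and its fields are hypothetical; the real FileBackend persists to a file, which this sketch omits, and timestamps are plain Unix seconds here for brevity.

```go
package main

import (
	"fmt"
	"sync"
)

// alert is a hypothetical, reduced alert record.
type alert struct {
	ID        string
	CreatedAt int64 // Unix seconds
	Cleared   bool
}

// memBackend is an in-memory stand-in for a file-backed alert store.
type memBackend struct {
	mu     sync.Mutex
	alerts []alert
}

// Clear marks alerts as cleared when their id is listed or when they were
// created before allBefore (0 disables the cutoff). It returns the number
// of alerts newly cleared, so repeating a call returns 0.
func (b *memBackend) Clear(ids []string, allBefore int64) int {
	b.mu.Lock()
	defer b.mu.Unlock()

	want := make(map[string]bool, len(ids))
	for _, id := range ids {
		want[id] = true
	}
	cleared := 0
	for i := range b.alerts {
		a := &b.alerts[i]
		if a.Cleared {
			continue
		}
		if want[a.ID] || (allBefore != 0 && a.CreatedAt < allBefore) {
			a.Cleared = true
			cleared++
		}
	}
	return cleared
}

func main() {
	b := &memBackend{alerts: []alert{{ID: "a", CreatedAt: 10}, {ID: "b", CreatedAt: 20}, {ID: "c", CreatedAt: 30}}}
	fmt.Println(b.Clear([]string{"a"}, 0), b.Clear([]string{"a"}, 0), b.Clear(nil, 25))
}
```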

Live-verified lifecycle: prompt 1 → "50 pending, showing 10" + 10
cleared; prompt 2 → "40 pending, showing 10" (next batch, no re-render)
→ 20 cleared. Tier 1: Clear by-id/by-time/idempotent/no-backend. UATS
alerts_clear 3/3 live (runner falsy-body inheritance discovered:
variant bodies must be non-empty objects).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(hooksync-001): hook-channel absence detection — the channel now self-reports outages (Epic 4)

POST /v1/hooks/event records heartbeats into V0024 scheduled_job_events
via the jobhealth policy point (job_name hook:<name>; no new sink).
Two independent heartbeats: prompt-context fires per delivery (the
monitored channel); post-tool-observe fires throttled
(HOOK_HEARTBEAT_COOLDOWN_SEC, default 300; proves sessions ACTIVE).
Evaluator rule hook_channel_silent (distinct service per the NOSILENT
cooldown rule): sessions active + zero prompt-context fires in
HOOK_SILENT_LOOKBACK_HOURS (24) → high alert. This is the "job never ran"
guarantee applied
to the channel whose months-long outage HOOKWIRE-001 found only by
manual audit — the next contract drift self-reports.

Config: HOOK_HEALTH_ALERT_ENABLED (true), HOOK_SILENT_LOOKBACK_HOURS
(24), HOOK_ACTIVITY_MIN_EVENTS (5). Live-verified: real hook fires land
rows (session metadata, latency); throttle holds; rule SQL positive +
negative branches proven against the real table; UATS hooks_event 3/3.
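
The rule's predicate can be sketched in Go. This is an interpretation, not the project's SQL: it assumes "sessions active" means at least HOOK_ACTIVITY_MIN_EVENTS post-tool-observe heartbeats inside the lookback window, and `heartbeat` and `channelSilent` are hypothetical names.

```go
package main

import "fmt"

// heartbeat is a hypothetical reduced row from the job-events table.
type heartbeat struct {
	JobName string // e.g. "hook:prompt-context"
	AtUnix  int64  // event time in Unix seconds
}

// channelSilent reports whether sessions look active (enough
// post-tool-observe heartbeats) while the prompt-context channel recorded
// nothing inside the lookback window.
func channelSilent(events []heartbeat, nowUnix int64, lookbackHours, minActivity int) bool {
	cutoff := nowUnix - int64(lookbackHours)*3600
	var prompts, activity int
	for _, e := range events {
		if e.AtUnix < cutoff {
			continue // outside the lookback window
		}
		switch e.JobName {
		case "hook:prompt-context":
			prompts++
		case "hook:post-tool-observe":
			activity++
		}
	}
	return activity >= minActivity && prompts == 0
}

func main() {
	now := int64(1_000_000)
	var evs []heartbeat
	for i := 0; i < 5; i++ {
		evs = append(evs, heartbeat{"hook:post-tool-observe", now - int64(i*600)})
	}
	fmt.Println(channelSilent(evs, now, 24, 5))
}
```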

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(hooksync-001): mdemg hooks doctor — one-shot hook-channel triage (Epic 5)

11 checks: per-hook template parity (the CI gate's local twin),
settings registration, server healthz, a stdin-contract self-test
piping a real-shape UserPromptSubmit payload through the installed
hook (asserts the always-present synergy footer), alert-file state
(pending/total), and the last hook:prompt-context heartbeat age from
scheduled_job_events (SKIP when TSDB unreachable). Table or --json;
non-zero exit on any FAIL.

Live: 11/11 PASS on this machine ("last fire 5s ago" — fed by the
doctor's own self-test); correctly fails (exit 1) on deliberate drift.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(hooksync-001): PORT-TRUTH — loopback bind defaults + sidecar zombie replaced (Epic 6)

Compose published the API on 0.0.0.0 (unauthenticated admin/destructive
routes exposed off-host): now
"${MDEMG_BIND_ADDR:-127.0.0.1}:${MDEMG_PORT}:9999"; wide bind is an explicit
opt-in (both compose copies, CI-synced).
Neural sidecar bound 0.0.0.0:8101 via config.py default AND the plist
arg: both now 127.0.0.1 (both plist copies, CI-synced; SIDECAR_HOST env
overrides). Operational: the 39-day-old sidecar process (started
2026-05-02, serving pre-J17-fix code) replaced — fresh process verified
on 127.0.0.1:8101, both models loaded, health 200.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(hooksync-001): Tier 3 verification + feature doc + CHANGELOG + close (Epics 7-8)

Live-verified across the sprint: alert backlog drained 50→2 on real
prompts (display-then-clear); evaluator rules 15→16 (hook_channel_silent
loaded); doctor 11/11 + correct failure mode; sidecar fresh on
127.0.0.1:8101 (NLI 234ms). Feature doc
docs/features/hook-channel-health.md (config table incl. MDEMG_BIND_ADDR +
SIDECAR_HOST). Findings:
packaging plists are templates (raw copy → launchd exit 78; service
install is canonical); UATS falsy-variant-body inheritance pinned.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(uats): jiminy_guide_sanitized timeout 30s → 90s — stale vs synthesis latency

Caught in the HOOKSYNC-001 full-suite regression: the synchronous
/v1/jiminy/guide includes local-model synthesis (~43s observed quiet,
~50s typical per GUIDANCE-SYNTH-001) — the spec's 30s timeout has been
silently erroring since synthesis latency grew. Aligned with the
JIMINY_WARM_COMPUTE_TIMEOUT_MS budget (90s); hash refreshed; passes
live. Pre-existing — not a HOOKSYNC regression (Guide path untouched).
The other 3 suite errors were load-induced flakes (pass individually):
suite-vs-llama-server slot contention, noted for UXTS-CI-001.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(ci): track .claude/hooks/pre-write-check.py so hook-parity check passes

Root cause: the new 'Verify live hooks match hook templates' CI step
(HOOKSYNC-001) diffs every internal/cli/hook_templates/*.{sh,py} against
.claude/hooks/<name>, but the .gitignore allowlist only un-ignored the
5 original hooks. pre-write-check.py gained a template in this sprint
while its live counterpart stayed gitignored, so CI checked out a tree
without it and failed with 'MISSING live hook:
.claude/hooks/pre-write-check.py'.

Fix: add '!.claude/hooks/pre-write-check.py' to the allowlist and commit
the live hook (already byte-identical to its template modulo SPACE_ID),
preserving the full parity guarantee instead of weakening the CI step.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(hidden-weight-001): sprint plan — real weights on the abstraction hierarchy (Epic 0)

Roadmap Q3 Phase 1 rank #3. Live investigation: point.distance() returns
NULL on embedding lists (proven: NULL where vector.similarity.cosine
returns 0.627 on the same pair); 3 creation sites affected incl. an
ABSTRACTS_TO site the audit missed. Scale worse than audited and
growing: 28,332/28,332 GENERALIZES + 36,110/37,996 ABSTRACTS_TO = 64,442
NULL-weight abstraction edges. Neo4j cosine returns [0,1] directly —
drop-in. Plan: fix sites (+ CUIDv2 edge ids), LIMIT-5-then-batched
backfill, null-weight gauge + alert rule via the existing graph-stats →
metric_samples path, UVTS-quick regression guard.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(hidden-weight-001): abstraction-edge weights — vector.similarity.cosine replaces point.distance (Epic 1)

point.distance() is a spatial-Point function: on embedding lists it
returns NULL, so every weight at the 3 abstraction-edge creation sites
was never set (100% of GENERALIZES + 95% of ABSTRACTS_TO weightless;
the CASE guards passed on good embeddings, then the THEN expr evaluated
NULL — edges with good embeddings got nothing while embedding-less ones
got the 0.5 fallback). vector.similarity.cosine returns [0,1] directly
(live-verified: identical=1.0, orthogonal=0.5, opposite=0.0). Site 1
(theme GENERALIZES) gains the null-guard it never had.
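
For reference, a Go sketch of a cosine similarity rescaled to [0,1] as (1 + cos θ) / 2, which is consistent with the three live-verified values above. `normalizedCosine` and `edgeWeight` are hypothetical helpers; the 0.5 fallback for missing embeddings follows the creation sites' fallback described in this log.

```go
package main

import (
	"fmt"
	"math"
)

// normalizedCosine returns (1 + cos(a, b)) / 2 and false when the inputs
// are empty, mismatched in length, or zero vectors.
func normalizedCosine(a, b []float64) (float64, bool) {
	if len(a) == 0 || len(a) != len(b) {
		return 0, false
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0, false
	}
	return (1 + dot/math.Sqrt(na*nb)) / 2, true
}

// edgeWeight applies the null-guard: usable embeddings give the
// similarity, anything else gives the 0.5 fallback.
func edgeWeight(a, b []float64) float64 {
	if w, ok := normalizedCosine(a, b); ok {
		return w
	}
	return 0.5
}

func main() {
	fmt.Println(edgeWeight([]float64{1, 0}, []float64{1, 0}))  // identical
	fmt.Println(edgeWeight([]float64{1, 0}, []float64{0, 1}))  // orthogonal
	fmt.Println(edgeWeight([]float64{1, 0}, []float64{-1, 0})) // opposite
	fmt.Println(edgeWeight(nil, []float64{1, 0}))              // missing embedding
}
```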

Also: edge_id randomUUID() → CUIDv2 per the identifier standard, minted
Go-side via memberEdgePairs (Cypher can't generate CUIDv2) and zipped
with member ids for UNWIND. All 3 statements EXPLAIN-validated live.
Tier 1: pair-builder tests (uniqueness, CUID format, empty input).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(hidden-weight-001): mdemg graph backfill-weights — heal 56k NULL abstraction weights (Epic 2)

Standalone subcommand (deliberately NOT folded into `graph repair`,
whose orphan sweep would delete the pre-fix orphan observations the
operator chose to keep). Weight = vector.similarity.cosine(endpoint
embeddings) when both exist, else 0.5 (the creation sites' fallback);
similarity_score set alongside; idempotent (pure function of
embeddings); batched (default 1000/txn) with --limit for trials.

Executed per the small-batch-first rule: dry-run count → LIMIT-5 live
trial → hand-verified (stored ≡ independently recomputed to 6dp) →
distribution preview over 2000 (min 0.704, mean 0.96; the ~50% near-1.0
mass is single-member-cluster degeneracy — centroid ≡ member embedding,
HIDDEN-CHURN-001 territory, faithfully encoded) → full runs. Mid-run
the count GREW: the running server predated Epic 1 and kept minting
NULL edges — restarted on the fixed binary, swept stragglers, then
whk-wms (8,755) + linear (199). Final: 0 NULL / 57,395 edges globally.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(hidden-weight-001): null-weight gauge + regression alert rule (Epic 3)

Query 4 in the graph-stats collector counts NULL-weight GENERALIZES/
ABSTRACTS_TO edges per space → new gauge
mdemg_neo4j_graph_null_weight_edges → metric_samples → evaluator rule
null_weight_abstraction_edges (service graph-weight-integrity, distinct
per the cooldown rule; NULL_WEIGHT_EDGE_ALERT_THRESHOLD default 100,
ForDuration 10m). Steady state post-backfill is 0; sustained
reappearance = the point.distance bug class regressed at a creation
site — it self-reports instead of waiting for the next audit.

Live: evaluator rules 16→17; gauge rows persisting at value 0 across
all spaces.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(ingest): config-driven consolidation timeout — was sharing the 300s batch budget

Caught live during the HIDDEN-WEIGHT-001 corpus reingest: the post-ingest
/v1/memory/consolidate call used the shared batch-ingest client
(--timeout, 300s); consolidating a ~10k-node space exceeds that, so the
client reported failure while the server completed the work — the
GUIDANCE-SYNTH-001 bug class (long graph/LLM work needs its own budget).
New --consolidate-timeout flag / INGEST_CONSOLIDATE_TIMEOUT_SEC env
(default 1800s) with a dedicated client. Live-verified: "running
consolidation timeout_sec=1800" → complete.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(hidden-weight-001): Tier 3 verification + corpus restoration + UVTS harness audit + close (Epics 4-5)

Tier 3: real consolidation minted edges with varied cosine weights
(0.83-0.94) + CUIDv2 ids; at-scale via the corpus reingest (9,500 edges,
0 NULL, mean 0.923); gauge holds 0; evaluator rules 16→17.

UVTS harness: corpus space lnl-demo-whk had been deleted with zero trace
(no UVTS run since 2026-05-04 measured anything real); restored by
operator-directed full reingest. A fresh baseline NUMBER remains blocked
by further live-found harness rot — grader/persist breakage, expected-
path format drift, vector post-filter dilution (service.go:1137 global
top-K then space filter) amplified by the duplicate whk-wms space —
complete defect inventory handed to UXTS-CI-001. Retrieval ranking on
the restored corpus verified correct (expected files at ranks 1-4).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(maint-live-001): sprint plan — scheduled maintenance actually runs (Epic 0)

Roadmap Q3 Phase 1 rank #4. Weekly decay+prune has never executed
(--dry-run defaults true; plist passes no override) while reporting
success — NOSILENT's blind spot. Tonight's Memory Bloat alerts (79k+
nodes) are the accumulated backlog. Safety verified in code before
planning: nodes are tombstoned (never deleted) with abstraction-chain/
degree/recency protections; edge deletion is the designed near-zero-
weight lifecycle, meaningful now that HIDDEN-WEIGHT made weights real.
Plan: live-by-default plist (+installed refresh), dry_run in job-event
metadata (no schema change — disclosed), maintenance_no_live_run
evaluator rule, darwin upgrade refreshes plists/hooks, first-ever live
run with preview-first protocol.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(maint-live-001): scheduled maintenance runs live — plist passes --dry-run=false (Epic 1)

The weekly LaunchAgent ran `mdemg maintenance` with no dry-run override;
the CLI defaults --dry-run=true, so every scheduled cycle previewed and
reported success — decay+prune NEVER executed (the 79k-node Memory
Bloat backlog). Both plist copies now pass --dry-run=false (the CLI
default stays true for safe manual previews — the SCHEDULE is what must
not silently no-op); installed plist refreshed + agent reloaded.

reportScheduledJobMeta threads job metadata into V0024; maintenance
records dry_run so the only-ever-dry-runs pattern is queryable
(metadata JSONB — no schema change, disclosed in the plan).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(maint-live-001): maintenance_no_live_run evaluator rule (Epic 2)

Fires when maintenance rows exist in MAINT_LIVE_LOOKBACK_DAYS (default
8) but none ran live (success + metadata dry_run=false) — the only-
ever-dry-runs pattern self-reports instead of hiding inside "the job
ran". Distinct service maintenance-liveness per the cooldown rule.
Config: MAINT_LIVE_ALERT_ENABLED (true), MAINT_LIVE_LOOKBACK_DAYS (8).
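
The firing condition can be sketched in Go. `jobEvent` and `maintenanceNoLiveRun` are hypothetical names, and the rows are assumed to be pre-filtered to the lookback window; the real rule runs as a query against the job-events table.

```go
package main

import "fmt"

// jobEvent is a hypothetical reduced maintenance row: outcome plus the
// dry_run flag recorded in metadata.
type jobEvent struct {
	Success bool
	DryRun  bool
}

// maintenanceNoLiveRun fires when maintenance rows exist in the window but
// none of them is a successful live (dry_run=false) run.
func maintenanceNoLiveRun(rows []jobEvent) bool {
	if len(rows) == 0 {
		return false // no rows at all is a different condition, not this rule
	}
	for _, r := range rows {
		if r.Success && !r.DryRun {
			return false
		}
	}
	return true
}

func main() {
	onlyPreviews := []jobEvent{{Success: true, DryRun: true}, {Success: true, DryRun: true}}
	withLiveRun := append(onlyPreviews, jobEvent{Success: true, DryRun: false})
	fmt.Println(maintenanceNoLiveRun(onlyPreviews), maintenanceNoLiveRun(withLiveRun))
}
```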

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(maint-live-001): mdemg upgrade refreshes installed LaunchAgents + hooks (darwin) (Epic 3)

Plist/hook fixes shipped in releases but never reached installed
machines — the maintenance dry-run override would have sat unreachable
next to upgraded binaries forever. Upgrade now re-renders ALREADY-
INSTALLED mdemg LaunchAgents from the new binary's embedded templates
(refresh-only — never installs new services) + re-syncs mdemg-managed
Claude hooks in the current project (marker-checked). Substitution
logic single-sourced into renderLaunchdTemplate (Install + Refresh —
the drift class that exit-78'd the sidecar during HOOKSYNC live smoke).
Mirrors the existing Linux systemd-unit refresh.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(maint-live-001): context-dependent orphan policy — --exclude-role-types (Epic 4a)

Orphan disposition is context-dependent (operator, 2026-06-11): a
uniform degree/age rule conflates governance constraints, conversation
history, test junk, and hierarchy debris. New --exclude-role-types on
prune + maintenance (env PRUNE_EXCLUDE_ROLE_TYPES) makes the policy
expressible; the scheduled plist ships
constraint,conversation_observation excluded per the operator's call
(constraints are load-bearing governance rules at any degree;
conversation observations differ by SESSION which the knob can't
express yet). Aged hierarchy debris stays eligible — that's the
lifecycle working. Candidate census that drove the decision: 5,388
conv-obs (9 eligible tonight under the 90d shield), 11 constraints,
238 hierarchy nodes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(prune): orphan sweeps use implicit transactions for batched deletes

Caught by the FIRST-EVER live maintenance run (MAINT-LIVE-001 Tier 3):
Neo4j raises TransactionStartFailed when a batched CALL-IN-TRANSACTIONS
statement executes inside an explicit transaction. Both orphan sweeps
(SymbolNode + Observation) ran their batched delete via ExecuteWrite;
the dry-run path never executes the deleting statement, so no preview
or unit test could surface it — only live execution. Switched to
session.Run (implicit tx). The failure ALSO proved the NOSILENT chain
live: the run fired "Scheduled job failed: maintenance" before exiting.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(maint-live-001): first live run verification + feature doc + CHANGELOG + close (Epics 4b-5)

First live maintenance in MDEMG history: 20,236 orphan SymbolNodes
deleted; all 5,010 tombstone candidates protected (recency + operator
exclusions); liveness rule born-firing → silenced by the real run; the
3-row job-event story (preview/true → failure/false alerted →
success/false) proves the dry_run plumbing through every path.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs: CLAUDE.md architecture note for MAINT-LIVE-001

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(embed-wire-001): sprint plan — embedder wiring + ingest exec resolution (Epic 0)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(embed-wire-001): breaker + recorder reach the real embedder through the wrapper chain (Epic 1)

The embedding circuit breaker was NEVER wired in any default deployment:
embeddings.New returns *CachedEmbedder when EMBEDDING_CACHE_ENABLED=true
(the default), so the server's emb.(*embeddings.OpenAI)/(*Ollama)
assertions on the OUTERMOST value failed silently (no else branch). The
recorder assertion had the inverse fragility (cache off → training-data
recording silently dies).

New: Unwrap() chain (CachedEmbedder joins RateLimitedEmbedder's existing
one) + embeddings.Base() / FindCached() interface-driven walkers — any
future wrapper joins by adding Unwrap(), no type lists. Wiring now walks
to the base for the breaker and to the cache layer for the recorder,
with LOUD warns when nothing matches. Tier 1 pins the production shape
(ratelimit(cache(provider))) plus cache-off and bare chains.
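
A minimal sketch of the Unwrap() walk described above. The `Embedder` method set, the `provider`, `cached` and `rateLimited` types, and the exact signatures of `Base` and `FindCached` are assumptions for illustration, not the repository's code.

```go
package main

// Embedder is a stand-in for the embedding interface (method set assumed).
type Embedder interface {
	Embed(text string) []float32
}

// unwrapper is implemented by any wrapper that can expose what it wraps.
type unwrapper interface {
	Unwrap() Embedder
}

// provider is a hypothetical base embedder: it has no Unwrap method.
type provider struct{ name string }

func (p *provider) Embed(string) []float32 { return nil }

// cached and rateLimited are hypothetical wrappers that join the chain
// simply by exposing Unwrap().
type cached struct{ inner Embedder }

func (c *cached) Embed(t string) []float32 { return c.inner.Embed(t) }
func (c *cached) Unwrap() Embedder         { return c.inner }

type rateLimited struct{ inner Embedder }

func (r *rateLimited) Embed(t string) []float32 { return r.inner.Embed(t) }
func (r *rateLimited) Unwrap() Embedder         { return r.inner }

// Base walks Unwrap() until it reaches an embedder that wraps nothing.
func Base(e Embedder) Embedder {
	for {
		u, ok := e.(unwrapper)
		if !ok {
			return e
		}
		inner := u.Unwrap()
		if inner == nil {
			return e
		}
		e = inner
	}
}

// FindCached returns the first cache layer found while walking the chain.
func FindCached(e Embedder) (*cached, bool) {
	for e != nil {
		if c, ok := e.(*cached); ok {
			return c, true
		}
		u, ok := e.(unwrapper)
		if !ok {
			return nil, false
		}
		e = u.Unwrap()
	}
	return nil, false
}
```

With this shape, a type assertion on the outermost value is never needed: the breaker asks `Base` for the provider and the recorder asks `FindCached` for the cache layer, whatever the wrapping order.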

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(ingest-exec-001): server-triggered ingest resolves the mdemg binary — was hardcoded ./bin/mdemg (Epic 2)

Both ingest-job exec sites ran a relative "./bin/mdemg": broken in
Docker (the documented-primary deployment — binary at /usr/local/bin,
no repo checkout) and any CWD other than the repo root. New
resolveMdemgBin(): MDEMG_BIN env → os.Executable() (the server IS the
binary) → PATH → ./bin/mdemg legacy fallback; cached; Tier 1 pins the
order. Scheduled-sync jobs now report outcomes to scheduled_job_events
via jobhealth (job_name codebase-sync) — an unattended sync that keeps
failing is never silent; manual API jobs stay queue-visible only.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(embed-wire-001): live verification + CHANGELOG + CLAUDE.md + close (Epics 3-4)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(doc-truth-001): sprint plan — documentation matches reality (Epic 0)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(doc-truth-001): CLAUDE.md FT section rewritten to post-pivot reality (Epic 1)

The section presented the abandoned Qwen3.6-35B-A3B MoE target, two-tier
MoE-Sieve strategy, and Sprint A→E critical path as CURRENT — all
superseded by the 2026-04-22 MoE→dense pivot; this stale text seeded the
Q3 roadmap audit with a dead architecture. Rewritten: shipped state
(dense Qwen3-14B mdemg-llm-v1, 0.8389, llama-server runtime), superseded
plan documented with the pivot rationale (never deleted — supersede-with-
pointer), guardrail llmclient exception marked CLOSED (re-verified in
code), memo-07 provenance break disclosed (the file never existed;
00_README_v2.md is canonical), open FT work named (FT-CLASSIFY-002 +
recursive-retraining trigger). Adapter env-name drift fixed
(MDEMG_ADAPTER_BASE, not MDEMG_MODEL_ADAPTER_BASE).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(doc-truth-001): operator-facing text matches the Phase 13.5 reality (Epic 2)

preflight errors directed operators to start the DECOMMISSIONED
mlx_lm.server on :8101 — following them reintroduces the crash-looping
stack Phase 13.5 replaced. Now: llama-server :8102 guidance (managed
service install + manual command), backend-agnostic wording. model.go
help text dropped three stale "deferred to MODEL-DIST-002" mentions
(shipped 2026-05-25). Operationally (untracked .env): removed the
J17_SIDECAR_TIMEOUT_MS=200 override that re-pinned the exact value
DH-004 remediated — the 1000ms default now applies; server restarted.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(doc-truth-001): 00_README STATUS block + AGENT_HANDOFF retired (Epic 3)

00_README_v2.md gains a top-of-file STATUS block: shipped-through-
cutover state, superseded MoE plan (FT-2 skip + FT-3 supersession +
R-LT-4 prototype-discipline adjudication recorded), the NOT-STARTED
recursive-retraining loop with its FT-CLASSIFY-002 trigger, and
provenance notes (memo-07 never existed; the spec is untracked pending
FG-2). AGENT_HANDOFF.md (stale since 2026-05-06) retired to a pointer
stub — handoff state lives in CLAUDE.md/roadmap/CHANGELOG/CMS.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(doc-truth-001): grep-sweep proof + CHANGELOG + close (Epic 4)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(doc-truth-001): last stale --adapter help string (sweep straggler)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(rsic-validate-001): sprint plan — fail-closed self-improvement (Epic 0)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(rsic-validate-001): honest criteria evaluation — populated keys + fail-closed mutations (Epics 1-2)

The cycle baseline populated 10 metric keys while task criteria
referenced ~15 others (only volatile_count + correction_rate
intersected) → missing_data → skip → ~16/17 actions validated
vacuously; criteria-driven rollback was unreachable. The
SelfAssessmentReport already carried nearly every needed key — they
were never copied into the maps.

New single source reportMetricsMap() feeds BOTH MetricsBefore and
MetricsAfter (the mismatch class cannot recur), resolving
edges_below_threshold, total_edges, consolidation_age_sec,
avg_edge_weight, guidance_health, protocol_health + 13 more. Fail-
closed rule: for MUTATING actions (15-entry registry) a criterion with
missing evidence counts as NOT met ("missing_data_failclosed") — an
unverifiable mutation must never be recorded as success; observational
actions keep advisory semantics. The prior test pinned the vacuous
pass as the contract — updated to the honest one + advisory companion.
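
A sketch of the fail-closed rule, assuming a simple "metric must not exceed a maximum" criterion. The `Criterion` type and the `evaluate` signature are hypothetical; the reason strings `missing_data` and `missing_data_failclosed` are the ones named in the commit.

```go
package main

// Criterion is a hypothetical check: the metric at Key must be <= Max.
type Criterion struct {
	Key string
	Max float64
}

// evaluate reports whether the criterion is met and why.
// For mutating actions, missing evidence counts as NOT met (fail closed);
// observational actions keep the advisory skip.
func evaluate(c Criterion, metrics map[string]float64, mutating bool) (bool, string) {
	v, ok := metrics[c.Key]
	if !ok {
		if mutating {
			return false, "missing_data_failclosed"
		}
		return true, "missing_data"
	}
	if v <= c.Max {
		return true, "met"
	}
	return false, "not_met"
}
```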

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(rsic-validate-001): tombstone_stale scoped to correction-linked nodes; refresh_stale_edges decays for real (Epic 3)

tombstone_stale archived 50 ARBITRARY older observations whenever ANY
correction existed in the 7-day window — no relationship between
correction and target. Now requires linkage: same session as the
correction OR its 1-hop CO_ACTIVATED_WITH neighborhood. Live check:
0 corrections in the current 7-day window, so both old and new scopes
are 0 RIGHT NOW — the hazard was conditional (any future correction
re-armed the old query against thousands of unrelated observations;
the new query bounds it to genuinely related nodes).

refresh_stale_edges bumped last_activated BEFORE the weight expression
read it → staleness=0 → the decay term vanished → every refresh was a
pure +0.1·log(count+1) boost. Staleness now captured via WITH before
SET; weights can genuinely decay. Both statements EXPLAIN-validated.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(rsic-validate-001): counter-free confidence calibration — RSIC stops polluting its own signal (Epic 4)

RSIC-SK1 injected synthetic "followed"/"ignored" outcomes through
UpdateConfidence, incrementing total_surfaced/total_followed/
total_ignored — the exact counters GetConstraintEffectiveness reads
next cycle: measured effectiveness drove synthetic outcomes which drove
measured effectiveness (circular self-reinforcement). New
AdjustConfidenceDirect applies the clamp+archive confidence delta with
ZERO counter writes; the outcome counters now belong exclusively to
real guidance feedback. Provider interface + adapter + dispatcher use
the direct path with the configured boost/decay magnitudes; test mock
maps deltas back to outcome labels so existing assertions keep meaning.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(rsic-validate-001): Tier 3 verification + CHANGELOG + close (Epics 5-6)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* test(rsic-validate-001): integration seeds carry session linkage for the scoped tombstone contract

CI's TombstoneStaleEndToEnd + MultiActionDispatchAndMetrics failed
because the seeded observations had NO relationship to the seeded
corrections — under the old behavior they were archived anyway (the
memory-eroding bug the sprint removed); under the new correction-
linkage contract they are correctly spared. SeedObservationNodes now
stamps a per-space test session shared by corrections and their stale
peers, so the tests exercise the new contract. Query-level proof
against the exact seeded shape: 10/10 linked observations match the
scoped Cypher. (Local integration runs hit the 30s client timeout —
the loaded local stack's cycles take ~6 min; the CI run arbitrates.)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(rrf-scale-002): sprint plan — finish the score-scale contract (Epic 0)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(rrf-scale-002): persistent rerank clients — failure alerting re-armed on the hottest LLM path (Epic 1)

doRerankWithOpenAI/doRerankWithOllama constructed a fresh llmclient per
call: the consecutive-failure counter reset every time, so
LLM_CONSECUTIVE_FAILURE_THRESHOLD could NEVER fire for
retrieval.rerank_cross / rerank_nli (a north-star distill task), and the
HTTP transport was discarded per call. Per-provider base clients now
init once (sync.Once); WithContext() shallow-copies and SHARES the
*atomic counter + breaker, so per-call contexts keep failure accounting.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(rrf-scale-002): config-driven score thresholds — suggest revival, MCP tiers, guardrail floor (Epic 2)

Three score-literal leftovers from the RRF-SCALE-001 audit instruction:
(1) /v1/memory/suggest's hardcoded 0.5 min-confidence default filtered
nearly everything on a scale topping out ~0.58 → CONSULTING_SUGGEST_
MIN_CONFIDENCE (default 0.45, RRF-calibrated); (2) MCP memory_reflect
tiers 0.7/0.4 (high tier unreachable) → MCP_REFLECT_SCORE_HIGH/_MEDIUM
(0.45/0.25); (3) guardrail constraint-retrieval Cypher's hardcoded
sim > 0.3 → GUARDRAIL_CONSTRAINT_SIM_FLOOR via GuardrailConfig
(cosine-stable today but inside the class).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(rrf-scale-002): CacheKey covers ALL result-affecting fields + two forcing functions (Epics 3-4)

CACHE-KEY-002: the key omitted result-affecting RetrieveRequest fields —
the audit named 5 (include/exclude_extensions, temporal_after/before,
policy_context); the new reflection forcing-function caught 8 MORE on
its first run: sparse-gate per-call overrides (SparseEnabled/
SparsePercentile/SparseOverridePresent/Category — the ?sparse= URL
params), pagination (Cursor/Limit), and the context-fingerprint params
(QueryContextFingerprint/StrictContextMode). All now keyed, plus a
caller-supplied query-embedding hash. Two requests differing in any of
these no longer collide on one cache entry.

Forcing functions: (1) reflection test — every RetrieveRequest field
must be in CacheKey or explicitly classified result-neutral with
justification (new fields fail until classified); (2) score-literal
scan — flags `.Score/score <op> 0.x` comparisons repo-wide outside a
justified allowlist (first run triaged 3 scale-local sites; clamp
guards excluded by pattern). The RRF-SCALE bug class is now CI-caught.
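
The reflection forcing function (1) could be sketched like this. The trimmed-down `RetrieveRequest`, the `RequestID` field, and the two classification tables are assumptions for illustration; the real struct has many more fields.

```go
package main

import (
	"reflect"
	"sort"
)

// RetrieveRequest is a hypothetical, trimmed-down request struct.
type RetrieveRequest struct {
	Query             string
	Limit             int
	Cursor            string
	StrictContextMode bool
	RequestID         string // hypothetical result-neutral field
}

// keyedFields lists fields that feed the cache key.
var keyedFields = map[string]bool{
	"Query": true, "Limit": true, "Cursor": true, "StrictContextMode": true,
}

// neutralFields lists fields excluded from the key, each with a justification.
var neutralFields = map[string]string{
	"RequestID": "tracing only; does not change results",
}

// unclassifiedFields returns the struct fields of v that are neither keyed
// nor justified as result-neutral. v must be a struct value.
func unclassifiedFields(v any, keyed map[string]bool, neutral map[string]string) []string {
	t := reflect.TypeOf(v)
	var missing []string
	for i := 0; i < t.NumField(); i++ {
		name := t.Field(i).Name
		if keyed[name] || neutral[name] != "" {
			continue
		}
		missing = append(missing, name)
	}
	sort.Strings(missing)
	return missing
}
```

A test that asserts the returned slice is empty fails as soon as someone adds a field without classifying it, which is the forcing behavior the commit describes.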

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(rrf-scale-002): live-calibrated suggest floor + CHANGELOG + close (Epics 5-6)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(hidden-churn-001): sprint plan — stable concept identity, two-PR delivery (Epic 0)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(hidden-churn-001): automated consolidation no longer skips LLM emergence (Epic A1)

dynamicEmergenceStep registers at phase 22, but RunConsolidation ran
hardcoded ranges (10,20) + (25,30) — phase 22 fell in the gap, so with
EMERGENCE_ENABLED=true the AUTOMATED path silently skipped LLM concept
emergence while the manual path (RunNodeCreationPipeline, 10–22 with an
emergence gate) ran it. RunConsolidation now delegates to
RunNodeCreationPipeline(cfg.EmergenceEnabled) — single range source; a
pin test fails if the step's phase ever leaves the range.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(hidden-churn-001): stable theme identity — centroid match-or-create replaces the 5-minute churn (Epic A2)

ClusterConversations detached EVERY observation→theme edge, deleted
childless themes, and recreated all themes from scratch each ~5-min
cycle: new node_ids every run, evidence chains destroyed continuously,
recall flooded with stacks of near-identical concepts (observed live in
this session's own prompt headers).

New flow: cluster first → match each cluster to an EXISTING theme by
centroid cosine (HIDDEN_THEME_IDENTITY_SIM_THRESHOLD, default 0.90,
greedy with per-run claiming) → matched themes UPDATE in place
(props + theme-scoped member-edge rewire; node_id and all inbound
references survive) → unmatched clusters create as before → only themes
claimed by NO cluster are deleted. The global detach is gone.
ThemesUpdated added to the result. Tier 1: match/threshold/claimed/
best-of selection.
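
A sketch of the match step, assuming centroids are plain float slices of equal length and that clusters are processed in input order. `cosine` and `matchThemes` are hypothetical names; the 0.90 default comes from HIDDEN_THEME_IDENTITY_SIM_THRESHOLD above.

```go
package main

import "math"

// cosine returns the cosine similarity of two equal-length vectors.
func cosine(a, b []float64) float64 {
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// matchThemes returns, for each cluster centroid, the index of the existing
// theme it claims, or -1 when no unclaimed theme reaches the threshold.
// Greedy: each cluster takes its best unclaimed match, and a theme can be
// claimed only once per run.
func matchThemes(clusters, themes [][]float64, threshold float64) []int {
	claimed := make([]bool, len(themes))
	out := make([]int, len(clusters))
	for ci, c := range clusters {
		best, bestSim := -1, threshold
		for ti, t := range themes {
			if claimed[ti] {
				continue
			}
			if s := cosine(c, t); s >= bestSim {
				best, bestSim = ti, s
			}
		}
		out[ci] = best
		if best >= 0 {
			claimed[best] = true
		}
	}
	return out
}
```

Clusters mapped to an index update that theme in place, clusters mapped to -1 create a new theme, and themes left unclaimed are the only deletion candidates.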

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* chore(hidden-churn-001): remove the dead global-detach helper — the churn mechanism itself

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(hidden-churn-001): PR-A verification + CHANGELOG (Epics A3-A4)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(hidden-churn-001): PR-B coverage retune — config ratio, density assignment, gauge + rule (Epic B1)

maxThemes was an inline ceil(n/10) equation → HIDDEN_THEME_TARGET_RATIO
(default preserves it). NOISE observations (previously dropped from the
hierarchy forever — the 94% coverage gap's mechanism) now density-assign
to their nearest theme when cosine ≥ HIDDEN_THEME_ASSIGN_SIM_THRESHOLD
(default 0.70; edges only, no new themes; below-floor stays unthemed
honestly). New per-space coverage gauge
mdemg_neo4j_conversation_coverage_ratio (collector Query 5) + evaluator
rule low_conversation_coverage (CONVERSATION_COVERAGE_ALERT_FLOOR 0.2,
6h ForDuration for convergence).

Audit bonus: caught WeightIntegrityRules querying metric_samples with
recorded_at — the column is `time`; the null-weight rule had been
silently erroring every evaluation since it shipped (Debug-only logging
— the SUPERVISOR-002 finding in action). Both rules fixed + a pin test
bans recorded_at against metric_samples.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(hidden-churn-001): mdemg concepts repair + trace — grounding audit CLI (Epic B2)

repair: tombstones childless layer>=2 abstraction nodes (no inbound
ABSTRACTS_TO|GENERALIZES|GENERALIZES_TO — 10,395 live in mdemg-dev,
churn-era debris). Recoverable (is_archived=true + archived_reason),
batched, dry-run default, --limit for small-batch-first verification.
trace: per-node grounding audit — direct children, transitive per-layer
census, grounded/ungrounded verdict, sample path to L0.

Live data note: GENERALIZES alone over-counts (19,147) — ABSTRACTS_TO
is the hidden layer's actual child edge; pin test guards the predicate.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(hidden-churn-001): surface themes_updated + noise_assigned in consolidate API + periodic log (Epic B3)

/v1/conversation/consolidate now reports themes_updated and
noise_assigned alongside themes_created. The periodic-consolidation
log condition also gains both — with stable theme identity (PR-A),
created is usually 0 on healthy cycles, which would have silenced the
success log entirely (the silent-success bug class).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(hidden-churn-001): live-smoke fixes — noise pool was structurally empty, clustering included archived debris, coverage gauge gated on min-obs

Three defects only the live run surfaced (Tier 3 forcing function):
1. KMeans never emits label -1, so the density-assignment hook received
   an always-empty noise list; the min-samples/max-themes/nil-centroid
   drops now feed their members into the noise pool instead of silently
   excluding them from the hierarchy.
2. fetchClusterableConversationObservations had no is_archived filter —
   it clustered 4,838 observations of which only 183 were live (MAINT-LIVE
   tombstones), building themes on archived debris. Both fetch variants
   now exclude archived. Live effect: 24 debris themes swept to 5 clean
   ones; second cycle themes_updated=5/created=0 (stable identity on
   real data).
3. Coverage gauge gated on CONVERSATION_COVERAGE_MIN_OBS (default 50,
   DH-005 confidence-threshold pattern) — tiny scratch/test spaces
   (2-13 observations) emitted 0.000 and would have alarmed forever
   (born-firing alert hazard). Sentinel -1 skips emission.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(hidden-churn-001): PR-B verification + CHANGELOG + CLAUDE.md — sprint complete (Epic B5)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(supervisor-002): sprint plan + background loop inventory (Epic 0)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(supervisor-002): sliding-window restart budget + late registration (Epic 1)

The restart counter only ever incremented — a once-a-week transient
permanently killed a worker after 3 weeks. Budget is now a sliding
window (restarts older than the window are forgotten); permanent
failure requires >SUPERVISOR_MAX_RESTARTS within
SUPERVISOR_RESTART_WINDOW_MIN. New Go() registers+launches workers
after Start (the API server starts its loops late); nil return without
ctx cancellation now means intentional completion, not a restart.
Start() outlives dead workers so late workers stay supervised.

Config: SUPERVISOR_MAX_RESTARTS (3), SUPERVISOR_RESTART_WINDOW_MIN
(60), SUPERVISOR_BACKOFF_BASE_SEC (5).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(supervisor-002): register the 12 unsupervised background loops (Epic 2)

Every scheduler/loop goroutine now runs under the goroutine supervisor
(panic recovery + sliding-window restart budget) instead of as a bare
go func() whose panic silently killed the subsystem forever:

- api.Server (6): periodic-consolidation, context-cooler,
  space-prune-scheduler, weekly-gap-interviews, scheduled-sync,
  rsic-macro-cron — via injected SetSupervisor(sup.Go) + goSupervised
  helper (bgWg brackets each run; stop channels remain the graceful
  path and return nil = no restart)
- ape (3): rsic-watchdog, rsic-store-flush, signal-learner-flush
- backup schedulers (2): neo4j-backup-scheduler, tsdb-backup-scheduler
  — their NewServer construction-time Start() moved to
  StartSupervisedBackground() (serve.go) so the hook exists first
- serve.go (1): llm-fastfail-burst-flush via sup.Go

All owners keep a nil-hook fallback (legacy bare goroutine), so tests
and non-server callers are unchanged. Buffered TSDB writers are
explicitly out of scope (TSDB-CONSUME-001 owns flush observability).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(supervisor-002): rule-health meta-alert on evaluator query failures (Epic 3)

A rule whose SQL errors was a silently-disabled alert: failures were
logged at Debug and nothing watched the watcher — bitten twice in one
week (HIDDEN-WEIGHT-001 null-weight rule + the recorded_at column bug,
both found by accident in later sprints). Query failures now log at
Warn, and after ALERT_RULE_FAILURE_THRESHOLD (default 3) consecutive
failures a high-severity meta-alert fires directly via the dispatcher
(not via a rule — the meta-channel must not depend on the failing
mechanism). Service label is rule-health-<rule-id> so concurrent
failing rules don't cooldown-suppress each other; success re-arms.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(supervisor-002): recency-gate the RSIC llm_error_rate_spike insight (Epic 4)

Insight 26 computes the error rate over a 24h window with no recency
requirement, so a 35-min jiminy.synthesize timeout burst at 02:00 UTC
kept re-firing HIGH 'LLM error rate spike' (and escalating to CRITICAL
'Jiminy Pipeline Critical') every RSIC micro-cycle for 12+ hours after
the incident self-resolved (live, 2026-06-11). LLMPerformanceSummary
now carries LastErrorAt (MAX(time) over errored rows); the spike
insight fires only when the most recent error is within
RSIC_LLM_ERROR_RECENCY_MIN (default 60; 0 disables the gate). A zero
LastErrorAt (older data source) keeps legacy behavior — the gate never
widens silently.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(supervisor-002): nolint G118 on legacy-fallback loop launches

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(supervisor-002): aggregate evaluator-degraded alert for global outages

A TSDB-level outage fails every rule at once; per-rule meta-alerts
would storm ~19 alerts duplicating the health prober's signal. At the
failure threshold the evaluator now distinguishes: other rules
succeeding recently → per-rule rule-health alert (broken SQL class);
nothing succeeding within threshold×interval → ONE
alert-evaluator-degraded alert per outage. Success re-arms both.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(jiminy): detach feedback outcome processing from the hook's connection lifetime

Live-smoke surprise during SUPERVISOR-002 Epic 5 (own fix-commit per
policy): jiminy.evaluate_llm was failing at 94.9% (657 'context
canceled' rows/24h). The post-tool-observe hook POSTs
/v1/jiminy/feedback with curl --max-time 5, but per-item Tier-2
outcome classification routinely outlives the connection — the request
ctx then cancelled every in-flight LLM call and outcomes silently
degraded to the keyword heuristic. Same defect class as
GUIDANCE-SYNTH-001's warm-path budget.

handleJiminyFeedback now uses context.WithoutCancel(r.Context()) with
its own server-side budget JIMINY_FEEDBACK_TIMEOUT_MS (default 60000,
0 = unbounded). The hook keeps its fire-and-forget 5s curl.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(supervisor-002): streak-relative global-outage discriminator (drill-caught)

The Epic 5 TSDB-stop drill caught the freshness-window heuristic
misclassifying outage ONSET: rules were succeeding seconds before the
stop, so lastAnySuccess was fresh when the first rules hit threshold —
2 per-rule alerts leaked before the aggregate fired. The discriminator
is now streak-relative: per-rule only when some other rule succeeded
AFTER this rule's failure streak began; otherwise global, once per
outage. Unit-pinned with the drill scenario
(TestRuleFailureStreak_OutageOnsetIsGlobal).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(supervisor-002): feature doc + verification + CHANGELOG + CLAUDE.md — sprint complete (Epic 6)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(backup-restore-verify-001): sprint plan (Epic 0)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(backup-restore-verify-001): harden the restore path (Epics 1-4)

Epics 1-4 land together — they share the restore-path signatures.

1. Checksum gate (Epic 1): the manifest SHA-256 was written at backup
   time and never read; a corrupted .mdemg restored silently. The gate
   now fails closed before import; legacy manifests without a checksum
   warn and proceed.
2. Snapshot completion polling (Epic 2): the pre-restore safety
   snapshot was raced with time.Sleep(2s) against an async backup
   goroutine. waitForBackupJob now polls the jobs queue until
   completed, failing closed on failure/cancel/vanish/timeout
   (BACKUP_SNAPSHOT_WAIT_TIMEOUT_SEC, default 300).
3. Count validation (Epic 3): manifest NodeCount/EdgeCount are
   whole-database counts and cannot validate file contents (they
   diverge on partial backups) — new additive file_node_count/
   file_edge_count/file_observation_count manifest fields are counted
   from the exported chunks; restore re-counts the file and hard-fails
   on mismatch (truncation class). Importer accounting divergence
   under CONFLICT_SKIP is warn-only, surfaced in a job-result
   validation block.
4. dockerbin routing (Epic 4): the legacy .dump restore shelled out to
   bare "docker" (the launchd-minimal-PATH class NOSILENT-001 fixed
   for TSDB); now routes via dockerbin unless the operator set a
   non-default FullCmd.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(backup-restore-verify-001): neo4j-backup jobhealth + generalized staleness rules (Epic 5)

The default-ON Neo4j backup scheduler had zero jobhealth coverage —
the inverse of NOSILENT-001 (which wired only tsdb-backup). The
scheduler now waits on each triggered job (its Trigger is queue-async;
a fire-and-forget report would always claim success) and reports
outcome via SetResultHook → jobhealth.Report with
job_name='neo4j-backup' (wired in SetTSDBClient next to the tsdb
hook). The staleness rule is generalized into a jobStalenessRule
factory; Neo4jBackupStalenessRule (neo4j_backup_no_recent_success,
Service scheduled-job-staleness-neo4j, window = partial interval × 2
unless BACKUP_JOB_STALENESS_HOURS overrides) registers when
BACKUP_ENABLED. The existing tsdb rule is pinned unchanged through the
refactor.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* feat(backup-restore-verify-001): initial backup on start (rule honesty)

With the neo4j_backup_no_recent_success rule registered, a fresh
install would alarm honestly-but-noisily for up to 24h (the scheduler's
first tick). The scheduler now runs an initial partial backup
BACKUP_INITIAL_DELAY_MIN (default 5) minutes after start, so every
install has a backup — and a quiet staleness rule — within minutes.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(backup-restore-verify-001): retention was deleting every backup it just made (drill-caught)

The Tier 3 round-trip exposed that the backup system was a no-op for
this database: BACKUP_RETENTION_MAX_STORAGE_GB had a comment/code
default drift (documented 50, code read 2), and with RunAfter=true the
quota pass deleted each 3-4 GB whole-database backup ~80 ms after it
completed (log: 'backup completed' → 'retention cleaned backups
deleted_count=1 freed_bytes=<exactly the new backup>'). Three fixes:

1. Quota retention NEVER deletes the newest backup of each type — a
   quota smaller than one backup degrades to 'over quota, keep it'
   with a loud warning, not 'delete the only backup'. Sparse-file unit
   tests pin both the two-backup and only-backup-oversize shapes.
2. Default quota raised to the documented 50 GB.
3. BACKUP_SNAPSHOT_WAIT_TIMEOUT_SEC default 300 → 3600: the live
   whole-database export runs ~15 min; the 5-min wait made the initial
   scheduled run report failure (jobhealth correctly recorded it —
   the wiring works) while the backup actually completed later.
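
Fix 1 can be sketched as a quota pass that skips the newest backup of each type. The `backup` struct and `pruneToQuota` are hypothetical; sizes and timestamps are plain integers to keep the example small.

```go
package main

import "sort"

// backup is a hypothetical record of one backup file.
type backup struct {
	Name    string
	Type    string
	Size    int64
	Created int64 // unix seconds
}

// pruneToQuota returns the names to delete, oldest first, until the total
// fits the quota. The newest backup of each type is never deleted; if the
// quota still cannot be met, overQuota is true and the caller should warn.
func pruneToQuota(backups []backup, quota int64) (deleted []string, overQuota bool) {
	newest := map[string]backup{}
	var total int64
	for _, b := range backups {
		total += b.Size
		if cur, ok := newest[b.Type]; !ok || b.Created > cur.Created {
			newest[b.Type] = b
		}
	}
	sorted := append([]backup(nil), backups...)
	sort.Slice(sorted, func(i, j int) bool { return sorted[i].Created < sorted[j].Created })
	for _, b := range sorted {
		if total <= quota {
			break
		}
		if newest[b.Type].Name == b.Name {
			continue // protected: newest of its type
		}
		deleted = append(deleted, b.Name)
		total -= b.Size
	}
	return deleted, total > quota
}
```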

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(transfer): omit empty path/name on import — restores with observations always failed (drill-caught)

The Tier 3 round-trip's real restore failed with
ConstraintValidationFailed: conversation observations carry path=NULL
in Neo4j (which memorynode_path_unique (space_id, path) ignores), but
the exporter serializes NULL as the proto default "" and the importer
wrote the literal empty string unconditionally — so the second
observation node in any restore collided. Every restore containing 2+
observation nodes had always been broken; this was invisible because
no backup had ever been restore-tested (the sprint's premise,
demonstrated). nodeProps now omits empty path/name (null fidelity);
unit-pinned.
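
A sketch of the null-fidelity rule. The `nodeProps` name comes from the commit, but this signature and the `node_id` key are assumptions.

```go
package main

// nodeProps builds the property map for an imported node. Empty path/name
// are omitted so the property stays NULL in the database instead of becoming
// the literal empty string, which would collide on a (space_id, path)
// uniqueness constraint for the second such node.
func nodeProps(id, path, name string) map[string]any {
	props := map[string]any{"node_id": id}
	if path != "" {
		props["path"] = path
	}
	if name != "" {
		props["name"] = name
	}
	return props
}
```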

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(backup-restore-verify-001): feature doc + verification + CHANGELOG + CLAUDE.md — sprint complete (Epic 7)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(rsic-storm-001): sprint plan with corrected burst attribution (Epic 0)

Triage correction baked in: the 5,397-node burst was the Context
Cooler via the session-start hook's /v1/conversation/graduate (uncapped
backlog sweep of pre-DH-004 graduation-bug victims), NOT RSIC —
mis-attributed because tombstone_stale stamps no metadata and two
archive-reason property names coexist. RSIC's own issues stand:
trigger-race cycle storm (~20-30k/day) and snapshot/executor predicate
drift (rollback restores nothing).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(rsic-storm-001): atomic trigger admission — reserve-on-allow (Epic A)

EvaluateTrigger checked activeCycles/lastTrigger, but both were written
only by RecordTrigger — which callers invoke AFTER RunCycle completes.
For a cycle's entire multi-second duration every concurrent trigger
passed every gate: ~20-30k micro cycles/day live (4 spawning within
50ms of each tool-use burst), the 300s cooldown effectively
nonexistent, llama-server saturated (the recurring synthesize/
evaluate_llm/intent_translate timeout cascades), and RSIC actions
dispatched at storm frequency.

Admission now reserves the active + cooldown (+dedupe) records under
the same lock that performs the checks; RecordTrigger updates the
reservation with the real cycle ID; CompleteCycle clears the active
slot; a failed cycle still cools down. Unit-pinned: 50 concurrent
triggers admit exactly one; cooldown holds from admission through
completion and failure.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(rsic-storm-001): attributable archival + unified tombstone predicate (Epics B+C)

Epic B — every archival is now attributable:
- tombstone_stale stamps archived_at + archive_reason
  ('rsic_tombstone_stale') + archived_cycle_id (bare is_archived made
  the 2026-06-11 burst forensics mis-attribute the Context Cooler's
  5,397-node sweep to RSIC for hours).
- Canonical property name is archive_reason; concepts.go (the one
  archived_reason writer) migrates; historical rows keep the old name
  (readers coalesce; no data migration).
- Context Cooler tombstone step capped per run
  (COOLER_TOMBSTONE_MAX_PER_RUN, default 500; 0=unlimited) with a loud
  cap-reached warning — the incident sweep was a single uncapped run
  over the pre-DH-004 volatile backlog via the session-start hook's
  graduate call.

Epic C — rollback restores the right nodes:
- The executor and the rollback snapshot now share ONE candidate
  predicate (tombstoneStaleCandidates const). RSIC-VALIDATE-001 had
  updated only the executor; the snapshot captured the old unlinked
  set, so rollback restored nodes that were never archived
  (restored_count=0 live). Drift class eliminated, pin-tested.
- Rollback also clears the new attribution fields on restore.

Epics share the tombstone Cypher — combined commit (disclosed).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(rsic-storm-001): feature doc + verification + rollback drill test + CHANGELOG + CLAUDE.md — sprint complete (Epic F)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(rsic-storm-001): commit ExecuteTombstoneStaleForTest wrapper missed from Epic F

The rollback drill test (committed in 2534a28) references this
test-support wrapper, but the Epic F git-add listed the test file and
not internal/ape/task_dispatch.go — CI's integration build failed on
the already-merged PR #435 while local builds passed (the method
existed uncommitted in the working tree). 7-line addition, no behavior
change.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(tsdb): initial backup on start — restart-resetting ticker meant zero backups ever

Alert triage on the (correctly firing) 'No Successful TSDB Backup In
Window' staleness alert found scheduled_job_events has ZERO tsdb-backup
rows: the scheduler's 24h ticker resets on every restart, and a server
restarted more often than the interval never backs up (8 restarts
today alone). Same gap BACKUP-RESTORE-VERIFY-001 fixed for the Neo4j
scheduler. The shared runOnce() now also fires
TSDB_BACKUP_INITIAL_DELAY_MIN (default 10; 0 disables) after start —
the staleness rule's 'never ran' guarantee did its job catching this.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* docs(uats-gap-001): sprint plan (Epic 0)

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(jiminy): reformulate returns 400 (not 500) for missing context

Live-caught while probing for the UATS-GAP-001 contract spec: the
service rejects an empty context but the handler surfaced that
request-validation failure as 500 'internal error'. Validate at the
edge like space_id. The /strict reformulation channel (prompt-context
hook in strict mode) was otherwise contract-clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

* fix(transfer): nil-guard edge identity assertions — one bad edge panicked the whole server (UATS-caught P0)

The UATS suite's backup_trigger spec ran a live export that hit an
edge whose endpoint returned nil fromId/toId — the unchecked
fromID.(string)/toID.(string)/relType.(string) assertions at
exporter.go:641-643 panicked, taking the entire server down mid-run
(launchd restarted it; 167 connection errors in the suite were the
visible symptom). An unexportable edge is now skipped with a warning;
the two parentVal assertion sites hardened with comma-ok for the same
class. An HTTP-triggerable request must never be able to kill the
process.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
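
The comma-ok hardening described in this commit might look roughly like the sketch below. The `Edge` type, the row shape, and both function names are hypothetical, since the real exporter.go is not shown here. The point is only that `v.(string)` panics on a nil or non-string value, while `s, ok := v.(string)` reports the failure instead.

```go
package main

import (
	"fmt"
	"log"
)

// Edge is a hypothetical export record.
type Edge struct {
	From, To, RelType string
}

// edgeFromRow converts one untyped result row into an Edge. The comma-ok
// assertions return ok=false for nil or non-string values instead of
// panicking the way an unchecked row["fromId"].(string) would.
func edgeFromRow(row map[string]any) (Edge, bool) {
	from, ok := row["fromId"].(string)
	if !ok {
		return Edge{}, false
	}
	to, ok := row["toId"].(string)
	if !ok {
		return Edge{}, false
	}
	rel, ok := row["relType"].(string)
	if !ok {
		return Edge{}, false
	}
	return Edge{From: from, To: to, RelType: rel}, true
}

// exportEdges skips unexportable rows with a warning and keeps the rest.
func exportEdges(rows []map[string]any) []Edge {
	edges := make([]Edge, 0, len(rows))
	for i, row := range rows {
		e, ok := edgeFromRow(row)
		if !ok {
			log.Printf("warning: skipping unexportable edge at row %d", i)
			continue
		}
		edges = append(edges, e)
	}
	return edges
}

func main() {
	rows := []map[string]any{
		{"fromId": "a", "toId": "b", "relType": "RELATED"}, // placeholder relationship type
		{"fromId": nil, "toId": nil, "relType": "RELATED"}, // the failing shape from the commit
	}
	fmt.Println(len(exportEdges(rows)))
}
```

Skipping the row trades a complete export for availability, which matches the commit's stated rule that an HTTP-triggerable request must never be able to kill the process.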

* feat(uats-gap-001): 8 contract specs for the revived channels + suite hygiene (Epics 1-6)

New specs (27 cases, 100% live pass, hash-stamped): jiminy_strict,
jiminy_reformulate, jiminy_classify (incl. the fail-open contract),
jiminy_warm (202 warming|debounced union), jiminy_latest (warmth-state
union — the Follow-up C strict-JSON surface), admin_breakers_list,
admin_…

Jun 15, 2026
4fdfe48

v0.10.1

v0.10.1

Event Graph federation (EVENTGRAPH-001/002 + CLI), guidance-loop revival
(RRF-SCALE-001, JIMINY-OUTCOME-001, GUIDANCE-SYNTH-001), NOSILENT-001
fail-loud scheduled jobs, docker-PATH fix, jiminy-governance skill +
per-conversation SessionID, MODEL-DIST-002 adapter distribution.

See CHANGELOG.md [0.10.1].

Jun 9, 2026
5daf71d

v0.10.0

release: v0.10.0 — local LoRA distribution via Ollama Library + mdemg model pull CLI

May 11, 2026
a83f87f

v0.9.0

release: v0.9.0 — Phase 13.5 LLM runtime cutover, Phase 14.x retrieval defaults, UBENCH framework

May 6, 2026
550e705

v0.8.5

v0.8.5: DH-004/DH-005 + ACA-BFC + DD-P1P2 + DOC-UPDATE-01

See CHANGELOG.md [0.8.5] for full notes.

Highlights:
- DH-005: confidence-adaptive ComputeOverallHealth + 7 RSIC_HEALTH_WEIGHT_* knobs
- DH-004: J17 dashboard remediation, admin breakers endpoints, deadline-aware LLM retry
- /strict Mode + UAITS Framework (10th UxTS)
- Jiminy semantic dedup, temporal correction decay, bounded ticket LRU
- Context Cooler stability reinforcement (99.7% volatile observations now graduate)
- RSIC 32-finding hardening + Neo4j signal learner persistence
- DOC-UPDATE-01: docs aligned with runtime defaults

Apr 20, 2026
fe8dc49

v0.8.1

v0.8.1: /strict mode + T1/T2 comprehension fixes (STRICT-P0P1 sprint)

- fix(jiminy): T1/T2 comprehension regression — bootstrap + decoding instruction + gate
- feat(jiminy): persist escalation state to Neo4j (write-behind)
- feat(jiminy): strict mode toggle + state file
- feat(jiminy): /strict prompt reformulation (context-neutral directives)
- feat(jiminy): response classification + PreToolUse enforcement
- fix(test): update J17 metrics test for POST reset endpoint

Apr 12, 2026
7e284d6

v0.8.0

v0.8.0 — UAITS Framework: spec-driven multi-paradigm training data curation (SFT/DPO/RAFT/curriculum)

Apr 10, 2026
6c99cca

v0.7.4

v0.7.4: DD-P1P2 deep dive bug fix campaign (30+ fixes, live-validated)

Apr 8, 2026
b80f1e9

v0.7.3

Merge pull request #296 from reh3376/reh3376_dev01

dev: reh3376_dev01 -> main

Apr 7, 2026
3e5458a

v0.7.2

v0.7.2

feat: add PHP language parser with Laravel and WordPress support (#278)
deps: bump actions/download-artifact 4→8, upload-artifact 4→7 (#279, #282)
deps: bump docker/login-action 3→4, metadata-action 5→6, build-push-action 6→7 (#280, #284, #285)
deps: bump mcp-go 0.45→0.47, x/sys 0.41→0.42, grpc 1.79.3→1.80.0, validator 10.30.1→10.30.2 (#281, #283, #286, #288)
fix(jiminy): skip J8 synthesis at T1 trust, preserve J17 encoded form (#277)
fix(jiminy): trust accrual for partial_compliance + WarmStore upward invalidation (#277)

Apr 7, 2026
6ad0e6b
