dev: reh3376_dev01 -> main#450
Epic 7 (Documentation Update, never cut).
- docs/features/local-model-distribution.md: adapter section flipped from "deferred to MODEL-DIST-002" to "shipped 2026-05-25"; status header updated; Configurability Contract table adds --adapter flag row.
- CHANGELOG.md: Unreleased gains "Sprint MODEL-DIST-002 — Adapter-only distribution path shipped" entry with full pipeline + verification + SHA + Ollama manifest digest.
- CLAUDE.md Model Distribution architecture note: replaces "adapter-only deferred to MODEL-DIST-002+" with the operator-facing recipe and the pinned-toolchain pointer.
- docs/development/model-dist-002/post.md: sprint close with epic-by-epic outcomes, acceptance criteria check-off, surprise log, and forward-looking notes.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint EVENTGRAPH-001 — Reinforcement-Event TSDB Hypertable + Graph Federation. First implementation of Pattern Y1 from the TypeDB-inspired topology discussion: federate events into TSDB rather than reify them in the Neo4j graph, preserve graph traversal via a Go orchestration layer. 12-section v1.0 format; 8 sequential epics; ~1.5-2 dev-days; $0 LLM; low-medium risk (touches the Hebbian hot write path so the new writer must be fully non-blocking + the Cypher RETURN-shape change must be backwards-compatible at the Go call site). Targets ApplyCoactivation only for v1. Other Hebbian entry points (ApplySymbolCoactivation, CoactivateSession, ApplyNegativeFeedback) deferred to EVENTGRAPH-003 once the pattern proves out under production traffic. Pattern Y2 (link-node promotion in Neo4j) explicitly deferred until a query proves federation-in-Go insufficient. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…c 1) One row per Hebbian co-activation pair update. Captures prev/new weight (plus signed delta), evidence_count_after, eta_effective, surprise_factor, activation_product, path_sim, role/obs_type of both endpoints, session_id, direction (forward/reverse/bidirectional), and a created_new_edge flag that distinguishes "new connection formed" from "existing connection strengthened" at analysis time. trigger_path column will distinguish ApplyCoactivation from EVENTGRAPH-003's other Hebbian entry points. 7-day chunks (same as V0017-V0021). 4 indexes: per-space time-series, src+time, dst+time, partial index on (space_id, session_id, time) where session_id is set. Federation API (Epic 5) needs src + dst lookups for the graph-neighborhood join. Buffered + flushed via CopyFrom on TSDB_FLUSH_INTERVAL_SEC cadence (default 30s). Pattern matches V0019 (sparse_gate_metrics) buffered writer, NOT V0021 (model_install_events) sync writer — Hebbian writes are per-retrieve, far higher volume than CLI-driven model install events. Config: TSDB_REQUIRED_SCHEMA_VERSION default bumped 21 -> 22. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
internal/tsdb/reinforcement_writer.go — buffered CopyFrom writer mirroring the V0019 SparseGateMetricsWriter pattern. 30s auto-flush ticker, Close() drains buffer + flushes final batch, idempotent across multiple Close calls. FIFO eviction on buffer-full matches the LLMInteractionWriter precedent; eviction counted in droppedRows for Epic 6 Prometheus surfacing. ReinforcementEventRow serializes optional float / string fields via nullableFloat / nullableString helpers — zero-valued inputs land as DB NULL rather than 0 / '', so analytic queries can distinguish "no data" from "actually zero." Required fields (prev/new/delta weight, evidence_count_after, created_new_edge, trigger_path) are never nullable. Tier 1 unit tests (9 green): - Record + Flush writes all rows with correct table + column shape. - Empty buffer Flush is a no-op (no CopyFrom call). - Buffer-full evicts oldest, increments droppedRows counter. - Unlimited buffer (maxBufferSize=0) never drops. - Nullable serialization: zero-valued optionals → DB NULL. - Flush error increments FailureCount; SuccessCount/TotalRows unchanged. - Close drains buffer (final flush triggered). - Close is idempotent (Close × 2 does not double-flush). - Auto-flush ticker fires within deadline. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
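A minimal sketch of the buffered-writer behavior described in the commit above: non-blocking Record, FIFO eviction with a drop counter, a no-op Flush on an empty buffer, and zero-valued optionals serialized as NULL. The type and function names (`Row`, `BufferedWriter`, the `sink` callback standing in for CopyFrom) are hypothetical and are not the contents of reinforcement_writer.go.

```go
package main

import (
	"fmt"
	"sync"
)

// Row is a hypothetical, trimmed stand-in for tsdb.ReinforcementEventRow.
type Row struct {
	SrcID, DstID string
	NewWeight    float64
	PathSim      float64 // optional: zero means "not recorded"
}

// nullableFloat maps a zero-valued optional to nil, which a DB driver
// would write as NULL rather than 0.
func nullableFloat(v float64) any {
	if v == 0 {
		return nil
	}
	return v
}

// BufferedWriter holds rows until Flush; when full it evicts the oldest row.
type BufferedWriter struct {
	mu      sync.Mutex
	buf     []Row
	max     int // 0 = unlimited
	dropped int64
}

func NewBufferedWriter(max int) *BufferedWriter { return &BufferedWriter{max: max} }

// Record only appends under a short mutex, so callers never wait on I/O.
func (w *BufferedWriter) Record(r Row) {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.max > 0 && len(w.buf) >= w.max {
		w.buf = w.buf[1:] // FIFO eviction of the oldest row
		w.dropped++
	}
	w.buf = append(w.buf, r)
}

// Flush hands buffered rows to sink and empties the buffer. An empty buffer
// is a no-op. This sketch discards rows if sink fails; what the real writer
// does on flush error is not stated in the commit.
func (w *BufferedWriter) Flush(sink func([][]any) error) error {
	w.mu.Lock()
	rows := w.buf
	w.buf = nil
	w.mu.Unlock()
	if len(rows) == 0 {
		return nil
	}
	out := make([][]any, 0, len(rows))
	for _, r := range rows {
		out = append(out, []any{r.SrcID, r.DstID, r.NewWeight, nullableFloat(r.PathSim)})
	}
	return sink(out)
}

// Dropped reports how many rows were evicted because the buffer was full.
func (w *BufferedWriter) Dropped() int64 {
	w.mu.Lock()
	defer w.mu.Unlock()
	return w.dropped
}

func main() {
	w := NewBufferedWriter(2)
	w.Record(Row{SrcID: "a", DstID: "b", NewWeight: 0.1})
	w.Record(Row{SrcID: "b", DstID: "c", NewWeight: 0.2, PathSim: 0.9})
	w.Record(Row{SrcID: "c", DstID: "d", NewWeight: 0.3})
	_ = w.Flush(func(rows [][]any) error {
		fmt.Println(len(rows), "rows flushed;", w.Dropped(), "dropped")
		return nil
	})
}
```

The auto-flush ticker and idempotent Close are left out to keep the sketch short; they would wrap Flush in a goroutine and a sync.Once respectively.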
…ENTGRAPH-001 Epic 3)
ApplyCoactivation Cypher RETURN clause extended from "count(*) AS updated" to 17 per-pair columns: src/dst node IDs, prev/new/delta weight, evidence_count_after, eta_effective (cfg.LearningEta × etaMult), surprise_factor, activation_product, path_sim, role_a/b, obs_type_a/b, session_id, direction (forward/reverse/bidirectional), created_new_edge.
created_new_edge is derived from (r.evidence_count = 1): the ON CREATE branch sets evidence_count to 1; ON MATCH increments. Reliable proxy for "new connection formed" vs "existing connection strengthened" at analysis time.
Plan-deviation disclosure (per feedback_plan_options_pattern.md): the plan called for 2 rows per pair in asymmetric mode (forward + reverse). The Cypher mirrors rr.weight = r.weight at all times, so forward and reverse edges carry identical weights. Emitting 2 rows would double-count without adding signal. Final choice: 1 row per logical pair regardless of mode, with the direction column carrying the forward/reverse/bidirectional distinction. Revisit if EVENTGRAPH-003 introduces a Hebbian path where forward/reverse weights diverge.
New helper internal/learning/reinforcement_parser.go translates a neo4j.Record (or any (key) → (any, bool) getter) into a tsdb.ReinforcementEventRow. Lives in its own file so service.go doesn't grow. Defensive against missing keys (zero values), nil values (zero/empty), wrong types (fallback to zero); no panics.
Tier 1 unit tests (6 green) cover:
- Symmetric bidirectional + ON CREATE branch
- Asymmetric forward + ON MATCH branch (evidence > 1)
- Missing optional fields → zero values (nullable* writer helpers serialize as DB NULL)
- Neo4j int64 → Go int coercion
- nil values → zero/empty
- Wrong-typed values → graceful fallback
Reinforcement rows are captured locally in ApplyCoactivation but not yet forwarded to TSDB; Epic 4 wires the writer.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
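The defensive parsing described in the commit above can be sketched as follows. This is an illustration only: `PairRow`, the helper names, and the exact column keys other than evidence_count_after and created_new_edge are assumptions, not the contents of reinforcement_parser.go.

```go
package main

import "fmt"

// Getter mirrors the (key) -> (any, bool) shape mentioned in the commit.
type Getter func(key string) (any, bool)

// PairRow is a hypothetical, trimmed stand-in for tsdb.ReinforcementEventRow.
type PairRow struct {
	SrcNodeID          string
	NewWeight          float64
	EvidenceCountAfter int
	CreatedNewEdge     bool
}

// str returns "" for missing keys, nil values, and non-string values.
func str(g Getter, key string) string {
	v, _ := g(key)
	s, _ := v.(string)
	return s
}

// num accepts the numeric types a driver is likely to return and falls
// back to 0 for anything else, so a bad value never panics.
func num(g Getter, key string) float64 {
	v, _ := g(key)
	switch n := v.(type) {
	case float64:
		return n
	case int64:
		return float64(n)
	case int:
		return float64(n)
	}
	return 0
}

// flag returns false for missing keys, nil values, and non-bool values.
func flag(g Getter, key string) bool {
	v, _ := g(key)
	b, _ := v.(bool)
	return b
}

// ParsePairRow builds a row from any getter without ever panicking.
// "src_node_id" and "new_weight" are assumed column names.
func ParsePairRow(g Getter) PairRow {
	return PairRow{
		SrcNodeID:          str(g, "src_node_id"),
		NewWeight:          num(g, "new_weight"),
		EvidenceCountAfter: int(num(g, "evidence_count_after")),
		CreatedNewEdge:     flag(g, "created_new_edge"),
	}
}

// fromMap adapts a plain map to the Getter shape for tests and demos.
func fromMap(m map[string]any) Getter {
	return func(k string) (any, bool) {
		v, ok := m[k]
		return v, ok
	}
}

func main() {
	row := ParsePairRow(fromMap(map[string]any{
		"src_node_id":          "n1",
		"new_weight":           0.42,
		"evidence_count_after": int64(1),
		"created_new_edge":     true,
	}))
	fmt.Printf("%+v\n", row)
}
```

Taking a getter rather than a concrete neo4j.Record keeps the parser testable without a driver, which matches the commit's "any getter" phrasing.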
…pic 4) learning.Service grows a reinforcementWriter field + SetReinforcementWriter setter (mirrors the SetStabilityReinforcer back-compat pattern). After ExecuteWrite returns from ApplyCoactivation, each captured per-pair row gets the spaceID stamped on it and is enqueued via writer.Record. The writer is non-blocking; the Hebbian hot path never waits on TSDB. Configurability Contract — 7 new env vars (no-hardcoding rule): - EVENTGRAPH_ENABLED (bool, default true) - EVENTGRAPH_WRITER_FLUSH_INTERVAL_SEC (int, default 30, floor 5) - EVENTGRAPH_WRITER_BUFFER_SIZE (int, default 1000, 0 = unlimited) - EVENTGRAPH_MAX_PAIRS_PER_EVENT_BATCH (int, default 200) - EVENTGRAPH_MAX_EVENTS_PER_QUERY (int, default 500, Epic 5 ceiling) - EVENTGRAPH_FEDERATION_DEFAULT_HOPS (int, default 2) - EVENTGRAPH_FEDERATION_DEFAULT_LOOKBACK_HOURS (int, default 24) api/server.go wires the writer's lifecycle: - Constructed after TSDB client is ready, gated by cfg.EventGraphEnabled so EVENTGRAPH_ENABLED=false cleanly skips construction; learner's reinforcementWriter stays nil and the Hebbian path short-circuits. - Closed alongside the other TSDB writers in graceful-shutdown — buffer drains before the process exits. Tier 2 integration tests (against real TSDB, build tag integration): - TestEventGraph_Writer_RoundTrip: 3 rows recorded → flush-window elapses → SELECT count(*) returns 3. - TestEventGraph_Writer_DrainOnClose: 5 rows recorded with 1-hour flush interval → Close() drains → SELECT returns 5 (verifies the server shutdown invariant). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
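As a rough illustration of the Configurability Contract above, the sketch below loads three of the seven variables with their stated defaults and the flush-interval floor of 5. The `Lookup` indirection, the struct, and the choice to fall back to the default on an unparsable value are assumptions; the real config loader may behave differently.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
)

// Lookup matches the signature of os.LookupEnv so tests can inject a map.
type Lookup func(key string) (string, bool)

// intSetting returns def when the key is unset or unparsable (an assumed
// policy), and clamps values below floor up to floor.
func intSetting(lookup Lookup, key string, def, floor int) int {
	raw, ok := lookup(key)
	if !ok {
		return def
	}
	n, err := strconv.Atoi(raw)
	if err != nil {
		return def
	}
	if n < floor {
		return floor
	}
	return n
}

// boolSetting returns def when the key is unset or unparsable.
func boolSetting(lookup Lookup, key string, def bool) bool {
	raw, ok := lookup(key)
	if !ok {
		return def
	}
	b, err := strconv.ParseBool(raw)
	if err != nil {
		return def
	}
	return b
}

// EventGraphConfig is a hypothetical holder for a subset of the env vars.
type EventGraphConfig struct {
	Enabled          bool
	FlushIntervalSec int
	BufferSize       int // 0 = unlimited
}

// LoadEventGraphConfig applies the defaults listed in the commit message.
func LoadEventGraphConfig(lookup Lookup) EventGraphConfig {
	return EventGraphConfig{
		Enabled:          boolSetting(lookup, "EVENTGRAPH_ENABLED", true),
		FlushIntervalSec: intSetting(lookup, "EVENTGRAPH_WRITER_FLUSH_INTERVAL_SEC", 30, 5),
		BufferSize:       intSetting(lookup, "EVENTGRAPH_WRITER_BUFFER_SIZE", 1000, 0),
	}
}

func main() {
	cfg := LoadEventGraphConfig(os.LookupEnv)
	fmt.Printf("%+v\n", cfg)
}
```

With EVENTGRAPH_ENABLED=false the caller would skip writer construction entirely, leaving the learner's writer nil so the Hebbian path short-circuits as described.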
…001 Epic 5)
internal/eventgraph/query.go — Pattern Y1 federation helper.
EventsInGraphNeighborhood orchestrates a three-step query:
1. Cypher graph walk from a seed node — variable-length path over
CO_ACTIVATED_WITH | GENERALIZES at depth 0..Hops. Returns the
N-hop neighborhood (DISTINCT node_ids, includes the seed).
2. TSDB query against reinforcement_events for events where src OR
dst is in the neighborhood, within the lookback window, ordered
newest-first, capped at the configured limit.
3. Go-side join — annotates events with SrcInNeighborhood /
DstInNeighborhood so the consumer can distinguish "both endpoints
in the subgraph" from "one endpoint outside the seed's N-hop
reach but the event still touches our subgraph."
Empty neighborhood (no seed match, hops=0) short-circuits before the
TSDB call. Sub-1-second Since values clamp to 1s. Hops < 0 is rejected
upfront. The handler enforces an additional ceiling of 2 ×
EVENTGRAPH_FEDERATION_DEFAULT_HOPS for runaway-walk protection.
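The Go-side join and the hop validation described above can be sketched like this. The types and function names are hypothetical; the in-memory filter stands in for the TSDB WHERE clause (src OR dst in the neighborhood), and the error messages are illustrative.

```go
package main

import (
	"errors"
	"fmt"
)

// Event is a hypothetical, trimmed reinforcement event.
type Event struct{ SrcID, DstID string }

// AnnotatedEvent carries the two in-neighborhood flags from the commit.
type AnnotatedEvent struct {
	Event
	SrcInNeighborhood bool
	DstInNeighborhood bool
}

// ValidateHops rejects negative hops and anything above 2 x the default.
func ValidateHops(hops, defaultHops int) error {
	if hops < 0 {
		return errors.New("hops must be >= 0")
	}
	if hops > 2*defaultHops {
		return errors.New("hops exceeds ceiling")
	}
	return nil
}

// Annotate keeps events that touch the neighborhood and marks which
// endpoints are inside it. An empty neighborhood short-circuits.
func Annotate(neighborhood []string, events []Event) []AnnotatedEvent {
	if len(neighborhood) == 0 {
		return nil
	}
	in := make(map[string]struct{}, len(neighborhood))
	for _, id := range neighborhood {
		in[id] = struct{}{}
	}
	out := make([]AnnotatedEvent, 0, len(events))
	for _, e := range events {
		_, s := in[e.SrcID]
		_, d := in[e.DstID]
		if !s && !d {
			continue // mimics the TSDB filter: neither endpoint in the subgraph
		}
		out = append(out, AnnotatedEvent{Event: e, SrcInNeighborhood: s, DstInNeighborhood: d})
	}
	return out
}

func main() {
	// hops=0 from "seed": the neighborhood is just the seed itself.
	events := []Event{{"seed", "mid"}, {"mid", "leaf"}}
	for _, a := range Annotate([]string{"seed"}, events) {
		fmt.Printf("%s->%s src_in=%v dst_in=%v\n", a.SrcID, a.DstID, a.SrcInNeighborhood, a.DstInNeighborhood)
	}
}
```

In this example the mid->leaf event is dropped at hops=0 because it touches neither the seed nor any 0-hop neighbor.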
internal/api/eventgraph_handler.go — POST /v1/eventgraph/reinforcement-
neighborhood. Same auth convention as /v1/admin/breakers. 503 when
EVENTGRAPH_ENABLED=false or when eventgraphService is nil (TSDB-down at
boot). 400 on missing space_id / seed_node_id / negative hops / hops >
ceiling. Defaults applied from config when fields omitted from request.
Plan-decision disclosure (per feedback_plan_options_pattern.md): plan
proposed Option A (single endpoint with event_type query param) vs
Option B (endpoint per event class). Final choice: A. v1 has one event
class (reinforcement); the endpoint URL is explicit about that.
EVENTGRAPH-002 can either add a query param or split the URL when a
second event class arrives — no breaking change either way.
Tests:
- Tier 1 (internal/eventgraph/query_test.go, 7 green): request
validation rejects empty space_id, empty seed, negative hops; interval
formatting roundtrips; join annotation handles both-inside,
one-outside, and empty-neighborhood cases.
- Tier 1 (internal/api/eventgraph_handler_test.go, 4 green + 2 skipped):
method-not-allowed, feature-disabled 503, nil-service 503, invalid-
JSON short-circuit. Two validation paths skipped — they require a
non-nil eventgraphService which can't be constructed without a real
driver; Tier 2 exercises them.
- Tier 2 (tests/integration/eventgraph_federation_test.go, 1 green):
builds seed--mid--leaf graph + off-node, emits 3 reinforcement
events touching all four nodes, calls federation at hops=0 and
hops=1, asserts neighborhood + in-neighborhood flags. The hops=0
test confirms that mid↔leaf (touching neither seed nor any 0-hop
neighbor) is correctly excluded.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ement events (EVENTGRAPH-001 Epic 6) Three new Prometheus counters mirror the V0022 writer's internal atomic counters: - mdemg_eventgraph_writer_rows_enqueued_total — rows successfully CopyFrom'd - mdemg_eventgraph_writer_rows_dropped_total — rows FIFO-evicted (buffer full) - mdemg_eventgraph_writer_flush_failure_total — flush errors Wiring: the writer accepts a narrow PrometheusCounter interface (Add(int64)) so internal/tsdb does not import internal/metrics (which would cycle). api/server.go calls SetPrometheusCounters after the writer is constructed, passing the three counters from the global StandardMetrics struct. Nil-safe. Dashboard: mdemg-graph-topology.json gains a new collapsed row "Reinforcement Events (EVENTGRAPH-001)" with a single time-series panel "Reinforcement Event Rate (events/min)" showing all three rates (enqueued / dropped / flush failures) over the last 24h. Dropped is colored orange, flush failures red, enqueued the default palette. Tied to the prometheus datasource. The existing GRAFANA-AUDIT-001 harness (scripts/grafana_panel_audit.py) only evaluates SQL-target panels — the new panel uses Prometheus queries, so it lands on the SKIP pile, same as the other 8 Cypher / Prometheus panels on this dashboard. Audit JSON refreshed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
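The narrow-interface wiring described above might look roughly like the sketch below. Only the `Add(int64)` method shape and the `SetPrometheusCounters` name come from the commit; the `Writer` fields, the hook methods, and `memCounter` are hypothetical stand-ins.

```go
package main

import (
	"errors"
	"fmt"
	"sync/atomic"
)

// PrometheusCounter is the narrow interface: the writer package depends on
// this one method instead of importing the metrics package.
type PrometheusCounter interface{ Add(n int64) }

// Writer is a hypothetical slice of the real writer, counters only.
type Writer struct {
	enqueued, dropped, flushFailures PrometheusCounter
}

// SetPrometheusCounters is called by the server after construction.
func (w *Writer) SetPrometheusCounters(enqueued, dropped, flushFailures PrometheusCounter) {
	w.enqueued, w.dropped, w.flushFailures = enqueued, dropped, flushFailures
}

// bump is nil-safe: an unwired counter is silently skipped.
func bump(c PrometheusCounter, n int64) {
	if c != nil {
		c.Add(n)
	}
}

func (w *Writer) onEvict() { bump(w.dropped, 1) }

func (w *Writer) onFlush(rows int, err error) {
	if err != nil {
		bump(w.flushFailures, 1)
		return
	}
	bump(w.enqueued, int64(rows))
}

// memCounter is an in-memory counter used here in place of a real metric.
type memCounter struct{ n int64 }

func (c *memCounter) Add(n int64)  { atomic.AddInt64(&c.n, n) }
func (c *memCounter) Value() int64 { return atomic.LoadInt64(&c.n) }

var errFlush = errors.New("flush failed")

func main() {
	enq, drop, fail := &memCounter{}, &memCounter{}, &memCounter{}
	w := &Writer{}
	w.onFlush(3, nil) // counters not wired yet: no panic, nothing counted
	w.SetPrometheusCounters(enq, drop, fail)
	w.onFlush(5, nil)
	w.onEvict()
	w.onFlush(0, errFlush)
	fmt.Println(enq.Value(), drop.Value(), fail.Value())
}
```

One caveat worth keeping in mind with this pattern: passing a typed nil pointer as the interface value defeats the `c != nil` check, so callers should pass either a real counter or an untyped nil.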
Epic 6's targeted audit run (scripts/grafana_panel_audit.py --dashboard mdemg-graph-topology.json) overwrote the full multi-dashboard audit results from GRAFANA-AUDIT-001 with the single-dashboard subset (9 SKIPs only). Restoring the full snapshot from commit 0a1e8e1 — that audit covered all 8 dashboards and is the canonical baseline the GRAFANA-AUDIT-001 post.md references. EVENTGRAPH-001 did not need to regenerate it; the new panel uses Prometheus queries, which the audit harness SKIPs regardless of dashboard. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…fix-commit)
ScoreAndRankRRF's ConsensusResult → RetrieveResult conversion was
silently dropping the Activation field. The legacy ScoreAndRank path at
scoring.go:883 sets Activation: a (where a := act[c.NodeID] is the
spreading-activation map value). The RRF path constructed
models.RetrieveResult{...} with no Activation key, leaving the field
zero-valued.
Net effect: since Phase 13.1 default-on (2026-05-03),
learning.Service.ApplyCoactivation has filtered out every L0 candidate
on the retrieve hot path. The filter is r.Activation >=
LearningMinActivation (default 0.20). With Activation=0, no pair makes
it to the Hebbian Cypher; the function returns nil without writing.
Hebbian learning has therefore been a silent no-op on the production
retrieve goroutine for ~24 days. CO_ACTIVATED_WITH edges still exist in
the graph because sidecar paths (CoactivateSession,
ApplySymbolCoactivation, consolidation walks) and pre-Phase-13.1
retrieves wrote them; only the retrieve-time path stopped writing.
Discovered during EVENTGRAPH-001 Epic 7 live e2e. Three retrieves
produced 0 rows in reinforcement_events. Investigation traced the gap
to the missing Activation field.
Fix: one-line addition in scoring_rrf.go — Activation: act[c.NodeID].
Brings the RRF path to parity with the legacy scorer.
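To make the failure mode concrete, here is a simplified sketch of the conversion with and without the Activation field, followed by the minimum-activation filter. The struct shapes and function names are trimmed, hypothetical versions of the real types; the `propagate` flag exists only to show the before and after side by side.

```go
package main

import "fmt"

// ConsensusResult and RetrieveResult are simplified stand-ins.
type ConsensusResult struct {
	NodeID string
	Score  float64
}

type RetrieveResult struct {
	NodeID     string
	Score      float64
	Activation float64
}

// toRetrieveResults converts RRF results. With propagate=false it
// reproduces the bug: Activation is left at its zero value.
func toRetrieveResults(cs []ConsensusResult, act map[string]float64, propagate bool) []RetrieveResult {
	out := make([]RetrieveResult, 0, len(cs))
	for _, c := range cs {
		r := RetrieveResult{NodeID: c.NodeID, Score: c.Score}
		if propagate {
			r.Activation = act[c.NodeID] // the one-line fix
		}
		out = append(out, r)
	}
	return out
}

// eligibleForHebbian applies the r.Activation >= min filter.
func eligibleForHebbian(rs []RetrieveResult, minActivation float64) []RetrieveResult {
	var out []RetrieveResult
	for _, r := range rs {
		if r.Activation >= minActivation {
			out = append(out, r)
		}
	}
	return out
}

func main() {
	cs := []ConsensusResult{{"n1", 0.53}, {"n2", 0.49}}
	act := map[string]float64{"n1": 0.72, "n2": 0.10}
	before := eligibleForHebbian(toRetrieveResults(cs, act, false), 0.20)
	after := eligibleForHebbian(toRetrieveResults(cs, act, true), 0.20)
	fmt.Println(len(before), len(after)) // prints "0 1"
}
```

Because a missing struct field is a valid zero value in Go, nothing fails at compile time or run time; the filter simply rejects everything, which is why the gap went unnoticed until a live check counted rows.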
Post-fix verification: rebuilt, restarted server, re-issued 3 retrieves
→ 10 reinforcement events landed in TSDB. Federation API at hops=1
correctly returned all 10 with src_in_neighborhood=true,
dst_in_neighborhood=true. Documented in
docs/development/eventgraph-001/verification.md.
Per CLAUDE.md "Testing — Live System Testing Is Required":
"surprise bugs caught during live smoke get their own follow-up
fix-commit — do not silently roll them into the sprint commit." This
is the precedent-aligned separate commit.
Forward-only: existing graph state is preserved; new retrieves now
correctly emit Hebbian updates. EVENTGRAPH-002 may revisit whether to
backfill the missing 24-day window.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Real /v1/memory/retrieve × 3 against mdemg-dev → 10 reinforcement events landed in TSDB within the flush window. Federation API at hops=1 from a seed node returned 5-node neighborhood + 10 in-neighborhood events. Documents the surprise-bug discovery + fix that preceded this transcript (see fix-commit for scoring_rrf.go::ScoreAndRankRRF Activation propagation). Acceptance criteria from sprint plan §"Acceptance Criteria" all PASS. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ose (Epic 8)
Final epic: Documentation Update (never cut, per feedback_per_feature_docs_required.md and the standardized v1.0 sprint plan format).
New: docs/features/event-graph-federation.md (~240 lines, Why / Choices / How it works / How to use / Forward-looking). Documents:
- Pattern Y1 vs Y2 trade-off (why federation-in-Go now, link-node reification deferred until a query forces it)
- Why V0019 buffered-CopyFrom over V0021 sync-INSERT (per-retrieve volume)
- Why ApplyCoactivation first (other 3 Hebbian entry points deferred to EVENTGRAPH-003)
- Why forward-only (no source to backfill from)
- Federation pipeline (Cypher walk → TSDB query → Go-side join with src/dst_in_neighborhood annotation)
- TSDB schema, API request/response shape, 7 env vars + defaults
- Observability (3 Prometheus counters + Grafana panel)
- Forward-looking sprints
New: docs/development/eventgraph-001/post.md: epic-by-epic outcomes, acceptance criteria check-off, surprise log (RRF Activation drop + audit-JSON overwrite + orphan-process port collision), plan deviations disclosed (1-row-per-pair regardless of asymmetric mode; single-endpoint over endpoint-per-class), forward-looking.
CHANGELOG.md Unreleased gains the EVENTGRAPH-001 entry: 11 bullet points covering V0022 migration, buffered writer, Cypher RETURN-shape change, Configurability Contract, federation helper + API, Prometheus + Grafana, Tier 2 + Tier 3 verification, the surprise-bug RRF Activation fix-commit, and the audit-JSON restore.
CLAUDE.md Architecture Notes gains a new "Event Graph Federation" entry above the Model Distribution section. Documents the pattern, surface, deferrals, and the load-bearing fix-commit f307f55 that surfaced 24 days of silent Hebbian no-op on the retrieve hot path.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…Prometheus datasource
The Epic 6 panel used datasource {type: prometheus, uid: prometheus} but
this Grafana instance has no Prometheus datasource configured — mdemg
exposes counters as JSON via /v1/metrics/snapshot, not a /metrics scrape
endpoint. Configured datasources: mdemg-nodegraph, neo4j, timescaledb
only. The panel rendered "No data" in the live Grafana.
Rewritten panel queries the reinforcement_events hypertable directly via
the timescaledb postgres datasource. Two targets:
1. count(*) over 1-minute time_buckets → overall events/min
2. count(*) FILTER (WHERE created_new_edge) vs WHERE NOT created_new_edge
→ split between new connections formed and existing connections
strengthened (the operational dimension the analytic queries
actually need)
Both targets templated on $space_id (existing dashboard variable). The
Prometheus counters (mdemg_eventgraph_writer_rows_{enqueued,dropped,
flush_failure}_total) remain wired and incrementing — they surface via
/v1/metrics/snapshot for ops scripts. The Grafana panel now actually
displays data instead of relying on a scrape path that doesn't exist
in this deployment.
Discovered during post-merge live verification (2026-05-29). Verified
fix: reloaded dashboard via Grafana API → /api/ds/query against same
SQL returns 1-minute buckets matching TSDB direct count. Audit harness
now reports 2 PASS for the new panel (previously SKIP — no SQL target).
verification.md updated with the post-merge transcript.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
# Conflicts: # deploy/docker/grafana/dashboards/mdemg-graph-topology.json # docs/development/eventgraph-001/verification.md
P0 fix. The Jiminy guidance->feedback->outcome loop has been dormant ~9 weeks: consulting/service.go gates constraint/suggestion extraction on hardcoded legacy-scale score thresholds (r.Score < 0.55 et al.). Phase 13.1 RRF (default-on May 3) dropped the score scale so strong matches top out ~0.53 -> 0/10 results clear the gates -> empty guidance -> dead loop. Third instance of the RRF-score-contract bug class (after the EVENTGRAPH-001 Activation drop). 12-section format; 6 epics; config-driven percentile-gate fix + sigmoid recalibration; live-verify the revived loop end-to-end. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Full-repo sweep of post-RRF score/activation/confidence consumers + live score-distribution sampling. Findings: - HIGH (4): consulting constraint gates (1005/1081/1087) + confidence sigmoid midpoint 1.5 (35-36) — the loop-killer cluster. - MED (5): consulting conflict gates (931/944/957/981) + minConfidence pre-filter (619, already config-driven). - LOW (3): retrieval/jiminy.go Activation display gates (45/155/192) — explanation text only, no guidance gating. - NONE (2): jiminy trial score (0-10 scale), trust-score clamp. Live distribution: RRF strong-match top scores cluster 0.49-0.58; the 0.55 gate sits mid-band, rejecting the most-relevant constraint half the time. NormalizedConfidence is positional rank (spreads 100->0 even on uniform-score sets) -> rules out plan Option A (percentile) as sole gate. Remediation: config-driven RRF-calibrated absolute thresholds (Option B), constraint floor default 0.45, sigmoid midpoint ->0.45. Disclosed deviation per feedback_plan_options_pattern. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…SCALE-001 Epic 2)
Revives the dormant Jiminy guidance loop. Replaces 7 hardcoded legacy-
scale score gates in consulting/service.go + the score->confidence
sigmoid (both copies) with config-driven, RRF-calibrated values.
Gates (all default 0.45, RRF strong-match band is 0.49-0.58):
- constraint extraction (was <0.55) -> CONSULTING_CONSTRAINT_SCORE_FLOOR
- keyword/name authority inner gate (0.55/0.6) -> CONSULTING_AUTHORITY_SCORE_FLOOR
- conflict/contradiction detection (0.6-0.7) -> CONSULTING_CONFLICT_SCORE_FLOOR
Key Epic-2 finding: keywordClassifyConstraint has an INNER authority
gate that binds tighter than the outer constraint gate. If authority
floor > constraint floor, the binding gate re-rejects the strong-match
band and the loop stays dormant -> all three default to 0.45. The RRF
band is too compressed to subdivide into tiers; knobs stay separate so
operators can raise any one independently.
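The binding-gate effect described above reduces to a two-stage check, sketched here under the assumption that a result must clear the outer constraint floor and then the inner authority floor. The function name is hypothetical and the real keywordClassifyConstraint does more than compare scores.

```go
package main

import "fmt"

// surfacesConstraint models the outer constraint gate followed by the
// inner authority gate; the stricter of the two floors is the one that binds.
func surfacesConstraint(score, constraintFloor, authorityFloor float64) bool {
	if score < constraintFloor {
		return false // outer gate
	}
	return score >= authorityFloor // inner gate
}

func main() {
	// A strong RRF match at 0.50 with only the outer gate recalibrated:
	fmt.Println(surfacesConstraint(0.50, 0.45, 0.55)) // false: inner gate re-rejects
	// Both floors at the 0.45 default:
	fmt.Println(surfacesConstraint(0.50, 0.45, 0.45)) // true
}
```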
Sigmoid (score->confidence), both consulting/service.go and
jiminy/retrieval_source.go (they MUST stay in sync per their own
comments): midpoint 1.5 -> 0.45, steepness 1.5 -> 8.0, config-driven via
RETRIEVAL_CONFIDENCE_SIGMOID_{MIDPOINT,STEEPNESS}. Legacy crushed a
strong 0.5 match to 0.18 confidence; recalibrated maps it to 0.60
(0.1->0.06, 0.58->0.74). normalizeRetrievalConfidence is now a Service
method reading cfg with zero-value fallback; mapRetrievalToGuidance
takes the sigmoid params from its caller's cfg.
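The quoted confidence values are consistent with a standard logistic curve, 1 / (1 + exp(-steepness * (score - midpoint))). The sketch below assumes that form and a zero-value fallback to the new defaults; the function name and signature are illustrative, not those of normalizeRetrievalConfidence.

```go
package main

import (
	"fmt"
	"math"
)

const (
	defaultMidpoint  = 0.45 // RETRIEVAL_CONFIDENCE_SIGMOID_MIDPOINT default
	defaultSteepness = 8.0  // RETRIEVAL_CONFIDENCE_SIGMOID_STEEPNESS default
)

// Confidence maps a retrieval score to a 0..1 confidence with a logistic
// curve. Zero-valued parameters fall back to the RRF-calibrated defaults.
func Confidence(score, midpoint, steepness float64) float64 {
	if midpoint == 0 {
		midpoint = defaultMidpoint
	}
	if steepness == 0 {
		steepness = defaultSteepness
	}
	return 1 / (1 + math.Exp(-steepness*(score-midpoint)))
}

func main() {
	// Legacy parameters crush a strong RRF match.
	fmt.Printf("legacy 0.50 -> %.2f\n", Confidence(0.50, 1.5, 1.5)) // 0.18
	// Recalibrated defaults spread the RRF band.
	for _, s := range []float64{0.10, 0.50, 0.58} {
		fmt.Printf("recalibrated %.2f -> %.2f\n", s, Confidence(s, 0, 0)) // 0.06, 0.60, 0.74
	}
}
```

One consequence of the zero-value fallback is that a midpoint of exactly 0 cannot be configured, which seems acceptable here given the score band under discussion.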
5 new config knobs, all with RRF-calibrated defaults + zero-value
guards (no-hardcoding rule; the bug WAS a hardcoded value).
Tier 1 tests: updated 2 legacy-scale boundary tests to the new
thresholds + added RRFStrongMatchBand regression (0.50 must surface),
ConstraintFloor_ConfigDriven (override honored), and
NormalizeRetrievalConfidence_RRFCalibration (band mapping). Full
consulting + jiminy + config suites green; lint clean.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
retrieval/jiminy.go Activation display gates (45/155/192 + LearningEdge siblings) traced live: they're in the explainability renderer, not the guidance-surfacing path; always-additive at RRF scale (live activation ~0.723 >> thresholds), no misbehavior. Intentionally left unchanged with rationale — config-ifying display verbosity is out of proportion to zero functional impact. Every High/Med remediated (Epic 2), every Low decided. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…pic 4)
Tier 3 live e2e (verification.md): the score-gate fix revives the dormant guidance loop on the live stack:
- /v1/jiminy/guide guidance items 0 -> 10, source_counts.constraints 0 -> 2, patterns 0 -> 3 (acceptance #1 MET).
- Full loop warm->latest->feedback->outcome: TSDB constraint_outcomes sink REVIVED, with fresh rows dated 2026-06-03 (table was dead since May 1). Constraint-effectiveness Grafana sink is live again.
Three adjacent issues surfaced during live smoke, documented as distinct follow-ups (NOT score-scale, not bolted on):
- A: Neo4j GUIDANCE_OUTCOME edges still dormant. Guidance SourceNodes point at emergent_concept nodes; PersistGuidanceOutcome only writes edges for constraint/correction/pattern/learning or role_type=constraint targets. Node-type-targeting bug, independent of RRF. Candidate sprint JIMINY-OUTCOME-001.
- B: LLM guidance synthesis timeout (now that synthesis runs).
- C: /v1/jiminy/latest unescaped control chars break jq/json parsers. The hook uses jq, so this may compound dormancy. Low-effort follow-up.
Tier 2 (rrf_scale_guidance_test.go, integration tag, 2 green):
- SuggestSurfacesGuidance: constraint-matching context surfaces 7 suggestions (was 0 before fix) against live mdemg-dev.
- SuggestRejectsNoise: gibberish does not flood constraints (no over-correction).
Cold-start note: first guide call post-restart returned constraints:0 (LLM classifier cold-model timeout -> keyword fallback); after one warm-up call, constraints surface. Model-warmth artifact, not a fix defect.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t.md (Epic 5) Final epic. CHANGELOG Unreleased gains the RRF-SCALE-001 Fixed entry. CLAUDE.md gains a 'score-scale contract' architecture note — the structural defense against a 4th instance: downstream consumers MUST NOT hardcode absolute thresholds against RetrieveResult.Score (the scorer scale is not a stable contract); gate via config or a scale-invariant signal, and re-audit on any scorer change. Notes that NormalizedConfidence is positional (not a safe sole gate) and records the three open follow-ups. post.md: epic-by-epic, acceptance check-off (honest: #2 partial — TSDB sink revived, Neo4j edge is distinct Follow-up A), scope note separating the score-scale fix (done) from the 3 adjacent surfaced issues (documented follow-ups), discipline notes (cold-start mask, inner authority gate). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ment (CI fix) CI failure on PR 404: TestRRFScale_SuggestSurfacesGuidance failed in 0.02s. Root cause: the test assumed the populated local mdemg-dev space (111 constraint nodes), but CI boots a FRESH EMPTY Neo4j with stub embeddings (and RETRIEVAL_COLUMN_VOTING_ENABLED=false / legacy scorer). With no data, /v1/memory/suggest returns 0 candidates, so the 'total == 0' assertion fired. Other integration tests self-seed data or skip when prerequisites are absent; mine relied on ambient data — wrong for a reproducible CI run. Fix: skip when debug.retrieved_count == 0 (no retrievable data → the score-gate fix isn't exercisable; there's nothing for the gate to admit or reject). The test stays meaningful against a populated stack (local: 9 suggestions from 15 retrieved → PASS) and skips cleanly in CI's empty-DB environment. Verified both paths live: populated → PASS, empty space → retrieved_count 0 → SKIP. The gate fix itself is validated by Tier 1 unit tests + the live Tier 3 e2e (docs/development/rrf-scale-001/verification.md); this integration test is a bonus live-stack assertion, not the primary proof. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… sink Follow-up A from RRF-SCALE-001: the Neo4j GUIDANCE_OUTCOME edge sink has been dormant since Apr 12. Root cause: matchConstraintCode links guidance items to constraint codes by keyword overlap (>=3 shared words), but retrieval surfaces emergent_concept abstractions whose content does not share 3+ literal words with raw constraint text -> no constraint_code -> PersistGuidanceOutcome falls back to the concept SourceNode -> the role_type=constraint filter rejects it -> no edge. Live-proven: all 17 recent outcome rows had constraint_code=(none). Fix (Option 1): switch the matcher to embedding cosine similarity (content already normalized to natural language ~0.70 cosine; Service has an embedder; cosineSimilarity + embed->cosine pattern already exist in-package via OutcomeClassifier). Existing PersistGuidanceOutcome + findConstraintNodeID then create edges on the correct constraint nodes. Keyword matching stays as fallback -- never regresses. 4 epics; ~1-1.5 dev-days; config-driven threshold; acceptance bar = a fresh Neo4j GUIDANCE_OUTCOME edge on a real role_type=constraint node dated today, reflected in GetConstraintEffectiveness. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…UTCOME-001 Epic 1) Revives the Neo4j GUIDANCE_OUTCOME edge sink (dormant since Apr 12). Root cause (RRF-SCALE-001 Follow-up A): matchConstraintCode links guidance items to constraint codes by keyword overlap (>=3 shared words), but retrieval surfaces emergent_concept abstractions whose content rarely shares 3+ literal words with raw constraint text -> no code -> PersistGuidanceOutcome falls back to the concept SourceNode -> the role_type=constraint filter rejects it -> no edge. Fix: new matchConstraintCodeByEmbedding queries the constraint vector index (db.index.vector.queryNodes, role_type=constraint, sim >= threshold) and returns the closest constraint's code. Guide() tries this first, falling back to the keyword matcher when the embedder is unavailable, content is empty, or nothing clears the threshold — never regresses. The existing PersistGuidanceOutcome + findConstraintNodeID then create the edge on the correct constraint node. Implementation refinement vs plan: uses Neo4j's vector index server-side (mirrors the proven Evaluator.findMatchingConstraints pattern) rather than loading all constraint embeddings into Go and computing cosine in a loop — cleaner, no constraintCodeEntry.Embedding needed. Same Option-1 outcome. Config: JIMINY_CONSTRAINT_CODE_SIM_THRESHOLD (default 0.55, zero-value fallback) — provisional; tuned against the live similarity distribution in Epic 2. Tier 1 (4 tests): nil-driver/empty-embedding guards, threshold default resolution, keyword-fallback non-regression. Full jiminy + config suites green; lint clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
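The embedding-first, keyword-fallback order described above can be illustrated with the sketch below. Note the substitution: the commit uses Neo4j's server-side vector index, whereas this sketch computes cosine similarity in a Go loop purely so the example is self-contained. `Constraint`, `MatchCode`, and the sample embeddings are hypothetical; the 0.55 default and the >=3 shared-word rule come from the commit text.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// Constraint is a hypothetical in-memory stand-in for a constraint node.
type Constraint struct {
	Code      string
	Text      string
	Embedding []float64
}

// cosine returns 0 for empty or mismatched vectors instead of panicking.
func cosine(a, b []float64) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// sharedWords counts distinct lowercase words present in both strings.
func sharedWords(a, b string) int {
	seen := map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(a)) {
		seen[w] = true
	}
	n := 0
	for _, w := range strings.Fields(strings.ToLower(b)) {
		if seen[w] {
			n++
			seen[w] = false // count each shared word once
		}
	}
	return n
}

// MatchCode tries embedding similarity first and falls back to keyword
// overlap when no embedding is available or nothing clears the threshold.
func MatchCode(content string, emb []float64, cs []Constraint, threshold float64) string {
	if threshold == 0 {
		threshold = 0.55 // zero-value fallback to the provisional default
	}
	best, bestSim := "", threshold
	for _, c := range cs {
		if sim := cosine(emb, c.Embedding); sim >= bestSim {
			best, bestSim = c.Code, sim
		}
	}
	if best != "" {
		return best
	}
	for _, c := range cs {
		if sharedWords(content, c.Text) >= 3 {
			return c.Code
		}
	}
	return ""
}

func main() {
	cs := []Constraint{{
		Code:      "no-direct-main-commits",
		Text:      "NEVER commit directly to main",
		Embedding: []float64{1, 0},
	}}
	// Abstract wording shares few literal words but is close in embedding space.
	fmt.Println(MatchCode("avoid pushing straight to the trunk branch", []float64{0.9, 0.1}, cs, 0))
}
```

The keyword pass only runs when the embedding pass returns nothing, so adding the embedding path cannot remove a match the keyword matcher would have found.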
…on (Epic 2)
Tier 3 live e2e (verification.md) — acceptance bar MET:
- /v1/jiminy/guide now yields guidance items carrying constraint_codes
(10 items, 6 coded; was 0). Matched code 'no-direct-main-commits' is
semantically exact for the 'commit to main' context.
- Full warm->latest->feedback loop: Neo4j GUIDANCE_OUTCOME 893 -> 899
(+6), latest today. All 6 new edges land on REAL role_type=constraint
nodes ('CONSTRAINT: NEVER commit directly to main') — not
emergent_concept. The sink dormant since Apr 12 is revived on the
correct nodes.
- /v1/constraints/effectiveness reflects it: 'NEVER commit directly to
main | surfaced: 30 followed: 28 rate: 0.93'.
- Both sinks now revived: TSDB (RRF-SCALE-001) + Neo4j (here). The
constraint-effectiveness loop is fully restored.
Threshold 0.55 validated live: correct matches, no false positives.
Tier 2 (jiminy_outcome_test.go, integration tag, skip-on-empty): PASSES
on a populated stack with an idle LLM (7/10 items coded). The guide path
is LLM-latency-dependent (per-node classifier ~31s/call, serialized; a
call fired while the LLM is busy fast-fails empty), so the test
warm-retries and SKIPS (never false-fails) when the LLM path can't
produce items. Bonus check; Tier 3 is the definitive proof. The LLM
serialization/synthesis-timeout is RRF-SCALE-001 Follow-up B, tracked
separately.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Final epic. CHANGELOG Unreleased gains the JIMINY-OUTCOME-001 Fixed entry. CLAUDE.md gains a guidance-outcome constraint-code-matching note (embedding-first via vector index, keyword fallback; both outcome sinks now live). post.md: epic-by-epic, acceptance check-off, the loop-revival completion (TSDB from RRF-SCALE-001 + Neo4j here), discipline notes (LLM serialization is the test-flakiness source), forward-looking (Follow-up B now the most operationally-visible remaining issue). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t (Follow-up B) Synthesis fails on every production warm call (6/6 jiminy.synthesize errored). Root cause: the hook's /warm path runs background Guide() with a hardcoded 30s timeout (handlers_jiminy.go:302), inside which the per-node constraint classifier runs SERIALLY (~1.5s x ~10 nodes ~= 15s), leaving only ~15s for synthesis which needs 8-27s -> deadline exceeded. JIMINY_TIMEOUT_MS=240s is configured but the 30s hardcode caps it. Fix (both): (1) parallelize the per-node classifier with bounded concurrency (CONSULTING_CLASSIFY_CONCURRENCY, default 4 matching llama-server --parallel 4); (2) config-drive the warm timeout (JIMINY_WARM_COMPUTE_TIMEOUT_MS, default 90s). Acceptance: synthesis succeeds live (no synthesis_error), measured latency drop, no constraint-surfacing regression. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
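A minimal sketch of the two-part fix, assuming a semaphore-style cap on in-flight classifier calls and a caller-supplied timeout in milliseconds. All names here are hypothetical, and `upperClassifier` is a toy stand-in for the LLM call. As a back-of-envelope figure, 10 nodes at ~1.5s each would take roughly 3 waves (about 4.5s) at concurrency 4 instead of ~15s serially, ignoring server-side contention.

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"sync"
	"time"
)

// ClassifyFunc is the per-node classifier; it must honor ctx cancellation.
type ClassifyFunc func(ctx context.Context, node string) (string, error)

// ClassifyAll runs classify over nodes with at most `concurrency` calls in
// flight. Results keep input order; failed or cancelled nodes stay "".
func ClassifyAll(ctx context.Context, nodes []string, concurrency int, classify ClassifyFunc) []string {
	if concurrency <= 0 {
		concurrency = 1
	}
	out := make([]string, len(nodes))
	sem := make(chan struct{}, concurrency)
	var wg sync.WaitGroup
	for i, n := range nodes {
		wg.Add(1)
		go func(i int, n string) {
			defer wg.Done()
			select {
			case sem <- struct{}{}: // acquire a slot
				defer func() { <-sem }()
			case <-ctx.Done():
				return
			}
			label, err := classify(ctx, n)
			if err != nil {
				return
			}
			out[i] = label // each goroutine writes only its own index
		}(i, n)
	}
	wg.Wait()
	return out
}

// ClassifyWithTimeout derives the deadline from a configured value rather
// than a hardcoded constant.
func ClassifyWithTimeout(nodes []string, concurrency, timeoutMS int, classify ClassifyFunc) []string {
	ctx, cancel := context.WithTimeout(context.Background(), time.Duration(timeoutMS)*time.Millisecond)
	defer cancel()
	return ClassifyAll(ctx, nodes, concurrency, classify)
}

// upperClassifier simulates a slow classifier call.
func upperClassifier(ctx context.Context, node string) (string, error) {
	select {
	case <-time.After(10 * time.Millisecond):
		return strings.ToUpper(node), nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

func main() {
	nodes := []string{"a", "b", "c", "d", "e", "f", "g", "h"}
	start := time.Now()
	out := ClassifyWithTimeout(nodes, 4, 90000, upperClassifier)
	fmt.Println(out, time.Since(start).Round(10*time.Millisecond))
}
```

Matching the concurrency to the model server's parallel slot count, as the commit proposes, avoids queueing more requests than the backend can actually run at once.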
UATS-style Go contracts over the previously zero-test 1,635-line file: each handler invoked against an httptest MDEMG backend, asserting the HTTP mapping (method/path/body), the space-resolution precedence (explicit > env default > ide-agent), single-space association invariant, validation-before-HTTP, and backend-500 → tool-error (never a Go error). 7 contracts green. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
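The space-resolution precedence asserted by these contracts (explicit > env default > ide-agent) amounts to a small fallback chain. The sketch below is illustrative; the function name is hypothetical and the way the env default is read is left to the caller.

```go
package main

import "fmt"

// ResolveSpace picks the explicit space if given, then the environment
// default, then the "ide-agent" fallback named in the commit.
func ResolveSpace(explicit, envDefault string) string {
	if explicit != "" {
		return explicit
	}
	if envDefault != "" {
		return envDefault
	}
	return "ide-agent"
}

func main() {
	fmt.Println(ResolveSpace("mdemg-dev", "team-space")) // mdemg-dev
	fmt.Println(ResolveSpace("", "team-space"))          // team-space
	fmt.Println(ResolveSpace("", ""))                    // ide-agent
}
```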
…aper (Epics 3+4) Three new tools: eventgraph_reinforcement_neighborhood + eventgraph_guidance_outcome_neighborhood (seed_node_id OR query-resolved seed, per the EVENTGRAPH-CLI-001 precedent; hops/since/limit OMITTED when unset so server config stays the single source of truth) and jiminy_strict (the /strict toggle). 23 tools total, all contract-covered (12 contracts incl. the omit-when-unset pin and the seed-by-query two-call chain). Plugin-orphan reaper: launchctl kickstart -k kills the server without Manager.Stop, orphaning plugin children (3 stale generations observed live: Apr 30 / May 1 / May 7). startModuleInstance now pgrep-reaps any prior-generation process holding the module's socket path before spawning — surgical (full socket path match), loud per-kill warn. Test decoy lesson: sh -c exec-optimizes argv away; tail -f <path> carries it like a real plugin. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
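The omit-when-unset rule for hops/since/limit can be sketched with pointer fields and `omitempty`. This is an illustration only: the struct name `neighborhoodRequest` and the JSON key layout are assumptions, not the tool's actual wire format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// neighborhoodRequest is a hypothetical request body. Pointer fields let the
// encoder tell "unset" (nil, omitted) apart from an explicit zero value.
type neighborhoodRequest struct {
	SeedNodeID string  `json:"seed_node_id"`
	Hops       *int    `json:"hops,omitempty"`
	Since      *string `json:"since,omitempty"`
	Limit      *int    `json:"limit,omitempty"`
}

// encode marshals the request; it returns "" on a marshalling error.
func encode(req neighborhoodRequest) string {
	b, err := json.Marshal(req)
	if err != nil {
		return ""
	}
	return string(b)
}

func main() {
	hops := 2
	fmt.Println(encode(neighborhoodRequest{SeedNodeID: "n1"}))              // {"seed_node_id":"n1"}
	fmt.Println(encode(neighborhoodRequest{SeedNodeID: "n1", Hops: &hops})) // {"seed_node_id":"n1","hops":2}
}
```

Because an unset field never reaches the server, the server-side default applies without the client duplicating it.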
…c 5) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…istorical window Live-caught regression (RSIC schema-drift alerts, correctly firing): V0014's post-migration check asserted ALL-TIME prompt-hash membership against a frozen hash set, and migrations re-run on every auto-migrate startup — so the first legitimate prompt evolution (JIMINY-OUTCOME-002's not_applicable classifier prompt, 8 rows with the new hash) made V0014 RAISE forever, aborting every migration run at 013 and pinning schema_version=13 while all 26-era objects continued to exist. V0014 is a historical repair (the Phase 11.6.x task_name swap); its UPDATEs and the Step-4 assertion now scope to time < 2026-05-02 — the migration asserts exactly the work it did, immune to future prompt evolution. Verified live: full chain runs clean, schema_version restored to 26, drift alert cause removed. Lesson for migration authors: in a run-every-startup migration model, integrity checks over LIVE tables must be scoped to the migration's own target window — an open-ended assertion is a time bomb armed by normal system evolution. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…classify distillation Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… reh3376_dev01
….5d root cause (Epic 1)
The 11.5d class skew was NOT (only) sampling: summary_quality penalized
empty summaries, but a correct 'none' verdict REQUIRES an empty summary
by spec — so the reward>=0.8 distill filter silently rejected every
correct-none teacher answer (proof: run 1 with perfect 82%-none input
stratification kept 37/200 pairs with ZERO none). Fixed: spec-compliant
{type:none, summary:''} scores 1.0 (98 reward tests green). Capture
gains --stratify-classify (production-distribution bucketed sampling,
measured live over 4,028 rows). Run 2: 200/200 kept, reward mean 0.981,
train dist none 82/must 12/must_not 3/should 3/should_not 1 — within
±2pp of production on every class, 0-leak vs all 10 sources.
NOTE for the gate epic: stored classify baselines (0.668) were computed
under the biased reward — gates recompute baselines fresh.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… production fused model (Epic 2) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…d mlx port 8101 Phase 13.5 cutover updated serving to llama-server :8102 but benchmark_phase10.yaml kept mlx_port: 8101 — any benchmark run without an explicit --mlx-base-url override made ZERO model calls and reported aggregate 0.0000 (caught live launching the FT-CLASSIFY-002 fresh baseline). UBENCH config sha re-pinned; lint green. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…input) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…-approved) Adopts the operator's doc-audit orchestrator prompt with the team's binding amendments: push-based survival loop (commit-without-push lost state across resets), git ls-files enumerator (find swept ~700 non-repo files incl. .claude agent memory into a FIX_IN_PLACE authorization), dev02 branch strategy with CLAUDE.md/CHANGELOG operator carve-outs, pinned-snapshot convergence, claim-budget batching, JSONL ledger, CI discipline, and the age-based drift model (96% of docs/architecture untouched since pre-April is the real hotspot; the MoE-marker class is already closed). Phased: 001a after FT-CLASSIFY-002, 001b after DORMANT-CENSUS-001, 001c as a standing mechanism. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… closed (Epic 6) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… reh3376_dev01
1. compose LLM_ENDPOINT default 8101→8102 + stale Phase-11.6 comment (triple-confirmed audit finding; un-overridden Docker deployments pointed at the port decommissioned 2026-05-03; both copies, parity ok)
2. 00_README_v2.md version ledger unfrozen: v5.13 catch-up entry (13.5 cutover, MODEL-DIST-001/002, FT-RECURSIVE-000, FT-CLASSIFY-002) — append-only, R-LT-4-clean
3. README dashboard tab count 8→10
4. pre-campaign-checklist schema v8+→26 (cites config.go as authority)
5. beta-testing version-under-test marker → v0.10.1
6. live-validation F11 framing verified already-correct (no change)
+ CHANGELOG entry for 001a (rides dev01 per charter carve-out)
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…that can fire Live baselines escalate the roadmap's diagnosis: the surprise chain is flat DEAD, not noisy — all 221,504 reinforcement events ever carry surprise_factor=1.0; node surprise_score avg 0.023 (max 0.503, n=5,808) vs hardcoded 0.4/0.7 thresholds. Scope: vector-index top-K novelty + config-driven thresholds recalibrated to the new scale in the same sprint (RRF-SCALE lesson) + CoactivateSession surprise-CASE audit. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ty (Epic 1) Replaces the unordered LIMIT 50 sample (no ORDER BY) whose comparison set was dominated by EMPTY-ARRAY embeddings — 4,564 of 5,810 conversation observations (78%) carry size(embedding)=0, passing the old IS NOT NULL guard while cosine() yields NULL; this is why node surprise_score averaged 0.023 and every reinforcement event ever carried surprise_factor=1.0. New: exact ORDER BY cosine scan over real-embedding (size=dims), non-archived, space-scoped conversation observations; config SURPRISE_EMBEDDING_NOVELTY_TOPK (50) + _SIM_FLOOR (0=off). The db.index.vector.queryNodes route was live-REJECTED: the label-wide index is crowded by ~100k non-conversation nodes (top-200 near a conversation seed were ALL emergent_concept centroids — the HIDDEN-CHURN degeneracy), pruning role-filtered hits to zero; the exact scan over ~1.2k real rows is deterministic and ~ms (revisit at ~50k). Live-verified: avg sim 0.680 / count 50 over the true nearest set. Follow-up recorded: the 78% empty-embedding backlog (embeddings backfill) is its own item. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… at both Cypher sites (Epic 2) CoactivateSession DOES compute the surprise CASE (audit result) — the chain was dead purely from broken scores. The hardcoded 0.7/0.4 thresholds (unreachable: live max score 0.503, avg 0.023 under the old noise) become SURPRISE_FACTOR_HIGH_THRESHOLD (0.5) / SURPRISE_FACTOR_MEDIUM_THRESHOLD (0.3) — defaults calibrated to the exact-top-K novelty scale, parameterized into BOTH ApplyCoactivation and CoactivateSession Cypher (never recalibrate against the old scale — the RRF-SCALE lesson). Non-positive config falls back to defaults. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
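A small sketch of the threshold logic described above, with loud assumptions: the multiplier values passed in as `highMult` and `medMult` are placeholders (the commit does not state them), the `>=` comparison is assumed, and `resolve` and `surpriseFactor` are hypothetical names. Only the default thresholds (0.5 / 0.3), the non-positive fallback, and the flat 1.0 baseline come from the commit text.

```go
package main

import "fmt"

const (
	defaultHigh   = 0.5 // SURPRISE_FACTOR_HIGH_THRESHOLD default
	defaultMedium = 0.3 // SURPRISE_FACTOR_MEDIUM_THRESHOLD default
)

type thresholds struct{ High, Medium float64 }

// resolve applies the "non-positive config falls back to defaults" rule.
func resolve(high, medium float64) thresholds {
	if high <= 0 {
		high = defaultHigh
	}
	if medium <= 0 {
		medium = defaultMedium
	}
	return thresholds{High: high, Medium: medium}
}

// surpriseFactor mirrors a three-way CASE: high, medium, or the 1.0 baseline.
// highMult and medMult are caller-supplied placeholders, not documented values.
func surpriseFactor(score float64, t thresholds, highMult, medMult float64) float64 {
	switch {
	case score >= t.High:
		return highMult
	case score >= t.Medium:
		return medMult
	default:
		return 1.0
	}
}

func main() {
	oldScale := thresholds{High: 0.7, Medium: 0.4}
	newScale := resolve(0, 0)
	// 0.023 was the live average score under the old sampling.
	fmt.Println(surpriseFactor(0.023, oldScale, 2.0, 1.5))
	// 0.503 was the live maximum score.
	fmt.Println(surpriseFactor(0.503, newScale, 2.0, 1.5))
}
```

Passing the thresholds as parameters to both call sites keeps ApplyCoactivation and CoactivateSession on one calibration.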
…loud error path (Epics 3-4) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sprint SURPRISE-TOPK-001 — Honest Novelty + a Multiplier That Fires (sprint complete)
Roadmap Q3 next-in-line; HEBB-ETA-001's named prerequisite.
The diagnosis escalated on live evidence
The roadmap said the novelty input was noise. The baselines said worse: the entire surprise chain was flat dead — all 221,504 reinforcement events ever carried
What shipped
Live Tier 3
First non-1.0 surprise factors in system history: smoke-session distribution
Follow-ups recorded
Empty-embedding backfill (4,564 rows); weak non-correction discrimination (term-novelty component — SURPRISE-TERMS-001 candidate); exact-scan→index revisit at ~50k role population.
Sprint: |
Summary
Development branch changes from reh3376_dev01.
Commits
mdemg model CLI + pluggable Fetcher interface
Auto-generated PR from reh3376_dev01 push