dev: reh3376_dev01 -> main #456
…t.md (Epic 5) Final epic. CHANGELOG Unreleased gains the RRF-SCALE-001 Fixed entry. CLAUDE.md gains a 'score-scale contract' architecture note — the structural defense against a 4th instance: downstream consumers MUST NOT hardcode absolute thresholds against RetrieveResult.Score (the scorer scale is not a stable contract); gate via config or a scale-invariant signal, and re-audit on any scorer change. Notes that NormalizedConfidence is positional (not a safe sole gate) and records the three open follow-ups. post.md: epic-by-epic, acceptance check-off (honest: #2 partial — TSDB sink revived, Neo4j edge is distinct Follow-up A), scope note separating the score-scale fix (done) from the 3 adjacent surfaced issues (documented follow-ups), discipline notes (cold-start mask, inner authority gate). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ment (CI fix) CI failure on PR 404: TestRRFScale_SuggestSurfacesGuidance failed in 0.02s. Root cause: the test assumed the populated local mdemg-dev space (111 constraint nodes), but CI boots a FRESH EMPTY Neo4j with stub embeddings (and RETRIEVAL_COLUMN_VOTING_ENABLED=false / legacy scorer). With no data, /v1/memory/suggest returns 0 candidates, so the 'total == 0' assertion fired. Other integration tests self-seed data or skip when prerequisites are absent; mine relied on ambient data — wrong for a reproducible CI run. Fix: skip when debug.retrieved_count == 0 (no retrievable data → the score-gate fix isn't exercisable; there's nothing for the gate to admit or reject). The test stays meaningful against a populated stack (local: 9 suggestions from 15 retrieved → PASS) and skips cleanly in CI's empty-DB environment. Verified both paths live: populated → PASS, empty space → retrieved_count 0 → SKIP. The gate fix itself is validated by Tier 1 unit tests + the live Tier 3 e2e (docs/development/rrf-scale-001/verification.md); this integration test is a bonus live-stack assertion, not the primary proof. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… sink Follow-up A from RRF-SCALE-001: the Neo4j GUIDANCE_OUTCOME edge sink has been dormant since Apr 12. Root cause: matchConstraintCode links guidance items to constraint codes by keyword overlap (>=3 shared words), but retrieval surfaces emergent_concept abstractions whose content does not share 3+ literal words with raw constraint text -> no constraint_code -> PersistGuidanceOutcome falls back to the concept SourceNode -> the role_type=constraint filter rejects it -> no edge. Live-proven: all 17 recent outcome rows had constraint_code=(none). Fix (Option 1): switch the matcher to embedding cosine similarity (content already normalized to natural language ~0.70 cosine; Service has an embedder; cosineSimilarity + embed->cosine pattern already exist in-package via OutcomeClassifier). Existing PersistGuidanceOutcome + findConstraintNodeID then create edges on the correct constraint nodes. Keyword matching stays as fallback -- never regresses. 4 epics; ~1-1.5 dev-days; config-driven threshold; acceptance bar = a fresh Neo4j GUIDANCE_OUTCOME edge on a real role_type=constraint node dated today, reflected in GetConstraintEffectiveness. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…UTCOME-001 Epic 1) Revives the Neo4j GUIDANCE_OUTCOME edge sink (dormant since Apr 12). Root cause (RRF-SCALE-001 Follow-up A): matchConstraintCode links guidance items to constraint codes by keyword overlap (>=3 shared words), but retrieval surfaces emergent_concept abstractions whose content rarely shares 3+ literal words with raw constraint text -> no code -> PersistGuidanceOutcome falls back to the concept SourceNode -> the role_type=constraint filter rejects it -> no edge. Fix: new matchConstraintCodeByEmbedding queries the constraint vector index (db.index.vector.queryNodes, role_type=constraint, sim >= threshold) and returns the closest constraint's code. Guide() tries this first, falling back to the keyword matcher when the embedder is unavailable, content is empty, or nothing clears the threshold — never regresses. The existing PersistGuidanceOutcome + findConstraintNodeID then create the edge on the correct constraint node. Implementation refinement vs plan: uses Neo4j's vector index server-side (mirrors the proven Evaluator.findMatchingConstraints pattern) rather than loading all constraint embeddings into Go and computing cosine in a loop — cleaner, no constraintCodeEntry.Embedding needed. Same Option-1 outcome. Config: JIMINY_CONSTRAINT_CODE_SIM_THRESHOLD (default 0.55, zero-value fallback) — provisional; tuned against the live similarity distribution in Epic 2. Tier 1 (4 tests): nil-driver/empty-embedding guards, threshold default resolution, keyword-fallback non-regression. Full jiminy + config suites green; lint clean. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
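A minimal sketch of the embedding-first, keyword-fallback order described in the commit above. This is not the project's code: the commit says the real matcher queries the Neo4j vector index server-side, whereas this sketch computes cosine in Go over a hypothetical in-memory `constraint` slice purely to show the control flow. All type and function names here are assumptions; only the 0.55 threshold and the ">=3 shared words" rule come from the commit text.

```go
package main

import (
	"fmt"
	"math"
	"strings"
)

// constraint is a hypothetical stand-in for a role_type=constraint node.
type constraint struct {
	Code      string
	Text      string
	Embedding []float64
}

// cosine returns the cosine similarity of two vectors, or 0 when either is
// empty, the lengths differ, or a magnitude is zero.
func cosine(a, b []float64) float64 {
	if len(a) == 0 || len(a) != len(b) {
		return 0
	}
	var dot, na, nb float64
	for i := range a {
		dot += a[i] * b[i]
		na += a[i] * a[i]
		nb += b[i] * b[i]
	}
	if na == 0 || nb == 0 {
		return 0
	}
	return dot / (math.Sqrt(na) * math.Sqrt(nb))
}

// matchByEmbedding returns the code of the most similar constraint whose
// similarity is at or above threshold.
func matchByEmbedding(query []float64, cs []constraint, threshold float64) (string, bool) {
	best, bestSim := "", -1.0
	for _, c := range cs {
		if sim := cosine(query, c.Embedding); sim >= threshold && sim > bestSim {
			best, bestSim = c.Code, sim
		}
	}
	return best, best != ""
}

// matchByKeyword is the fallback: the first constraint sharing at least
// minShared distinct words with the content wins.
func matchByKeyword(content string, cs []constraint, minShared int) (string, bool) {
	words := map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(content)) {
		words[w] = true
	}
	for _, c := range cs {
		seen := map[string]bool{}
		for _, w := range strings.Fields(strings.ToLower(c.Text)) {
			if words[w] {
				seen[w] = true
			}
		}
		if len(seen) >= minShared {
			return c.Code, true
		}
	}
	return "", false
}

// matchCode tries the embedding matcher first and falls back to keywords
// when there is no query embedding or nothing clears the threshold.
func matchCode(content string, query []float64, cs []constraint, threshold float64) string {
	if len(query) > 0 {
		if code, ok := matchByEmbedding(query, cs, threshold); ok {
			return code
		}
	}
	code, _ := matchByKeyword(content, cs, 3)
	return code
}

func main() {
	cs := []constraint{{
		Code:      "no-direct-main-commits",
		Text:      "NEVER commit directly to main",
		Embedding: []float64{1, 0},
	}}
	fmt.Println(matchCode("avoid pushing straight to the default branch", []float64{0.9, 0.1}, cs, 0.55))
}
```

Because the keyword path is untouched and only runs when the embedding path yields nothing, the sketch preserves the "never regresses" property the commit claims.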
…on (Epic 2)
Tier 3 live e2e (verification.md) — acceptance bar MET:
- /v1/jiminy/guide now yields guidance items carrying constraint_codes
(10 items, 6 coded; was 0). Matched code 'no-direct-main-commits' is
semantically exact for the 'commit to main' context.
- Full warm->latest->feedback loop: Neo4j GUIDANCE_OUTCOME 893 -> 899
(+6), latest today. All 6 new edges land on REAL role_type=constraint
nodes ('CONSTRAINT: NEVER commit directly to main') — not
emergent_concept. The sink dormant since Apr 12 is revived on the
correct nodes.
- /v1/constraints/effectiveness reflects it: 'NEVER commit directly to
main | surfaced: 30 followed: 28 rate: 0.93'.
- Both sinks now revived: TSDB (RRF-SCALE-001) + Neo4j (here). The
constraint-effectiveness loop is fully restored.
Threshold 0.55 validated live: correct matches, no false positives.
Tier 2 (jiminy_outcome_test.go, integration tag, skip-on-empty): PASSES
on a populated stack with an idle LLM (7/10 items coded). The guide path
is LLM-latency-dependent (per-node classifier ~31s/call, serialized; a
call fired while the LLM is busy fast-fails empty), so the test
warm-retries and SKIPS (never false-fails) when the LLM path can't
produce items. Bonus check; Tier 3 is the definitive proof. The LLM
serialization/synthesis-timeout is RRF-SCALE-001 Follow-up B, tracked
separately.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Final epic. CHANGELOG Unreleased gains the JIMINY-OUTCOME-001 Fixed entry. CLAUDE.md gains a guidance-outcome constraint-code-matching note (embedding-first via vector index, keyword fallback; both outcome sinks now live). post.md: epic-by-epic, acceptance check-off, the loop-revival completion (TSDB from RRF-SCALE-001 + Neo4j here), discipline notes (LLM serialization is the test-flakiness source), forward-looking (Follow-up B now the most operationally-visible remaining issue). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t (Follow-up B) Synthesis fails on every production warm call (6/6 jiminy.synthesize errored). Root cause: the hook's /warm path runs background Guide() with a hardcoded 30s timeout (handlers_jiminy.go:302), inside which the per-node constraint classifier runs SERIALLY (~1.5s x ~10 nodes ~= 15s), leaving only ~15s for synthesis which needs 8-27s -> deadline exceeded. JIMINY_TIMEOUT_MS=240s is configured but the 30s hardcode caps it. Fix (both): (1) parallelize the per-node classifier with bounded concurrency (CONSULTING_CLASSIFY_CONCURRENCY, default 4 matching llama-server --parallel 4); (2) config-drive the warm timeout (JIMINY_WARM_COMPUTE_TIMEOUT_MS, default 90s). Acceptance: synthesis succeeds live (no synthesis_error), measured latency drop, no constraint-surfacing regression. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…E-SYNTH-001 Epic 1) The per-node LLM constraint classifier in findApplicableConstraints ran serially (~1.5s/node x ~10 nodes ~= 15s), starving guidance synthesis of its time budget (synthesis 6/6 errored on the warm path). Now classifies with bounded concurrency. - Gate-first (RRF-SCALE-001 score floor) to fix a stable candidate order, then classify each candidate into a position-indexed slot via a semaphore-bounded worker pool, then collect-in-order + dedup-by-name — output is identical to the serial path (determinism). Keyword-only (no LLM) or cap=1 runs serially (no LLM latency to hide). - Config: CONSULTING_CLASSIFY_CONCURRENCY (default 4, matching llama-server --parallel 4; floor 1 = serial rollback). Zero-value fallback to 4. - Extracted constraintClassifierIface (minimal Classify surface) so the concurrent path is unit-testable with a fake; *ConstraintClassifier satisfies it; SetConstraintClassifier guards against a typed-nil interface. Tier 1 (5 new, -race clean): ParallelEqualsSerial (determinism + order), ParallelIsFaster (concurrency overlaps latency), ErrorFallsBackToKeyword (fallback intact), ScoreGateStillApplies (RRF-SCALE-001 gate preserved), ConcurrencyDefaultFallback. Existing findApplicableConstraints tests unchanged — no regression. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
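The position-indexed slot plus semaphore pattern from the commit above can be sketched as follows. This is an illustration under assumptions, not the project's findApplicableConstraints: `classifyFunc`, `classifyBounded`, and the string-based candidates are hypothetical names chosen for the example. The point it shows is that writing each result into its own slot, then collecting in index order, makes the concurrent output identical to the serial one.

```go
package main

import (
	"fmt"
	"sync"
)

// classifyFunc is a hypothetical stand-in for the per-node classifier call.
// It returns the constraint name and whether the candidate applies.
type classifyFunc func(candidate string) (name string, ok bool)

// classifyBounded classifies candidates with at most limit calls in flight.
// Results land in position-indexed slots, so the collected output order
// matches the (already gated) input order regardless of completion order.
func classifyBounded(candidates []string, limit int, classify classifyFunc) []string {
	if limit < 1 {
		limit = 1 // floor 1 behaves like the serial path
	}
	type slot struct {
		name string
		ok   bool
	}
	slots := make([]slot, len(candidates))
	sem := make(chan struct{}, limit)
	var wg sync.WaitGroup
	for i, c := range candidates {
		wg.Add(1)
		sem <- struct{}{} // blocks while limit workers are busy
		go func(i int, c string) {
			defer wg.Done()
			defer func() { <-sem }()
			name, ok := classify(c)
			slots[i] = slot{name: name, ok: ok} // each goroutine owns one index
		}(i, c)
	}
	wg.Wait()

	// Collect in order and dedup by name.
	seen := make(map[string]bool)
	out := make([]string, 0, len(candidates))
	for _, s := range slots {
		if !s.ok || seen[s.name] {
			continue
		}
		seen[s.name] = true
		out = append(out, s.name)
	}
	return out
}

func main() {
	classify := func(c string) (string, bool) { return c, c != "skip" }
	fmt.Println(classifyBounded([]string{"a", "b", "a", "skip", "c"}, 4, classify))
}
```

Setting the limit to 1 in this sketch degenerates to serial execution, which mirrors the rollback path the commit describes for CONSULTING_CLASSIFY_CONCURRENCY.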
…pic 2) The warm-path background Guide() ran with a hardcoded 30s timeout (handlers_jiminy.go:302) even though JIMINY_TIMEOUT_MS=240s is configured. 30s was too tight for the per-node classifier (~15s) + synthesis (8-27s) -> synthesis deadline-exceeded every warm call. Replaced with JIMINY_WARM_COMPUTE_TIMEOUT_MS (default 90000, zero-value fallback 90000) — headroom for the now-parallel classifier (~7.5s) + a slow 27s synthesis. No-hardcoding rule. Rollback: set to 30000. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
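A small sketch of a config-driven timeout with a zero-value fallback, as the commit above describes. The environment variable name and the 90000 ms default come from the commit; the function name, the injected `getenv` parameter, and treating negative or unparsable values like zero are assumptions made for the example.

```go
package main

import (
	"fmt"
	"os"
	"strconv"
	"time"
)

// defaultWarmComputeTimeoutMS is the single place the default lives in this sketch.
const defaultWarmComputeTimeoutMS = 90000

// warmComputeTimeout resolves JIMINY_WARM_COMPUTE_TIMEOUT_MS. An unset,
// unparsable, zero, or negative value falls back to the default
// (the negative and unparsable cases are assumptions of this sketch).
func warmComputeTimeout(getenv func(string) string) time.Duration {
	ms, err := strconv.Atoi(getenv("JIMINY_WARM_COMPUTE_TIMEOUT_MS"))
	if err != nil || ms <= 0 {
		ms = defaultWarmComputeTimeoutMS
	}
	return time.Duration(ms) * time.Millisecond
}

func main() {
	fmt.Println(warmComputeTimeout(os.Getenv))
}
```

Passing `getenv` as a parameter keeps the resolver testable without mutating the process environment; setting the variable to 30000 reproduces the rollback value named in the commit.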
…Epic 3)
Tier 3 live e2e (verification.md): the warm production path now produces
a synthesized narrative — synthesis_used=true, no synthesis_error,
1892-char augmentation. Fresh jiminy.synthesize succeeded at 50.7s
latency (fit the new 90s budget; would die at the old 30s — validates
the default). Both fixes needed.
Tier 2 (guidance_synth_test.go, integration, skip-on-empty + LLM-
tolerant): PASS — warm path produces guidance without synthesis_error.
Docs: CHANGELOG Fixed entry; CLAUDE.md guidance-synthesis-budget note
('when adding LLM calls to the guidance hot path: respect the
warm-compute budget and prefer bounded concurrency over serial loops');
post.md with the data-driven diagnosis + forward-looking (Follow-up C
now the last open item).
Closes Follow-up B. The guidance pipeline (surfacing + codes +
synthesis) is fully functional end-to-end.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Follow-up C (the last open item from RRF-SCALE-001 triage) investigated and closed with evidence. NO code change, because there is no bug to fix.
The earlier /v1/jiminy/latest parse failures were client-side shell artifacts (co-occurring with the session's 'failed to change group ID' errors + ad-hoc variable-capture piping), not server bytes:
- writeJSON uses json.NewEncoder().Encode (encoding/json always escapes control chars U+0000-U+001F); no raw-write bypass; no custom MarshalJSON.
- The synthesized narrative is double-StripControlChars'd (synthesizer.go:127 + service.go:1116).
- prompt-context.sh already strips control chars via perl before jq, with 2>/dev/null + // empty fallbacks.
Live-verified: the hook's exact jq returns guidance_id correctly; 5 rapid /latest fetches all parse as strict-valid JSON; 0 raw control chars.
Per 'don't fix a non-problem', shipping a fix would invent a bug that doesn't exist. Closure documented in docs/development/followup-c-closure.md.
This closes the entire RRF-SCALE-001 follow-up triage: A (JIMINY-OUTCOME-001), B (GUIDANCE-SYNTH-001), C (non-issue). The guidance->feedback->outcome loop is fully functional end-to-end.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ts (GUIDANCE-SYNTH-001 fix-commit) Two things, both from a live e2e of the full loop through the real production hook path (run per user directive: standard tests don't find live problems). 1. Sibling bug: the /v1/jiminy/guide handler had the SAME hardcoded 30s cap as the warm path (handlers_jiminy.go). GUIDANCE-SYNTH-001 fixed warm; /guide still deadline-exceeded synthesis at exactly 30.003s (this is what made prior sprints' /guide integration tests flaky). Now uses the config-driven budget. Live-verified: a 50.05s synthesis completed (synthesis_used=true) — would die at 30s. 2. Single source of truth for config defaults (user directive: 'single place to change all instances'). The 90s budget was duplicated as a literal in 3 sites; prior sprints similarly duplicated each default (the sigmoid 0.45/8.0 was in 3 places). Now each default is one exported config.Default* const, referenced by FromEnv and aliased by consuming-package fallbacks + a Config.JiminyWarmComputeTimeout() method. Consolidated: warm-compute timeout, the 3 consulting score floors, sigmoid midpoint/steepness, constraint-code sim threshold, classify concurrency. Zero behavior change (compile-time aliases); -race + full suites green. Live e2e also re-confirmed: real hook captures guidance_id -> feedback -> +7 Neo4j GUIDANCE_OUTCOME edges on real constraint nodes + 10 TSDB rows (whole loop closes through the actual hook; re-confirms Follow-up C non-issue). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…S backfill Builds the first consumer for EVENTGRAPH-001's reinforcement-neighborhood federation API (which has no consumer): a 'mdemg eventgraph' CLI command. Validates the Pattern Y1 bet + becomes the live-testing harness for EVENTGRAPH-002/003 (user directive: build the consumer first). Per the UxTS directive: maps the work to the frameworks. UATS applies to the federation HTTP API -> add eventgraph_reinforcement_neighborhood.uats.json (backfilling the -001 gap; the endpoint shipped with no UATS), which replaces an ad-hoc Go integration test as the Tier 2 contract test. UVTS/UBENCH N/A. UOTS panel-spec gap noted as a follow-up (out of scope). CLI rendering -> Tier 1 Go units. 4 epics; CLI (--seed/--query/--hops/--since/--limit/--json) renders summary + events table or JSON; server-driven defaults (no re-hardcoding); read-only. ~1-1.5 dev-days. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… (Epic 1) First consumer of the EVENTGRAPH-001 federation API — POSTs to /v1/eventgraph/reinforcement-neighborhood, renders a summary + events table (or --json). Supports --seed, --query (resolves seed via /v1/memory/retrieve top-1), --hops, --since, --limit. Unset flags are omitted from the request so the server applies its config defaults (no re-hardcoding of hops/since/limit in the CLI). Registered under the "advanced" command group. Tier 1 (httptest, -race clean): request-mapping omit-when-unset + conversion, --query seed resolution, no-results + invalid --since + surfaced-503 errors, render (empty + table), helpers. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…y neighborhood
Caught in EVENTGRAPH-CLI-001 live contract testing (standard code tests
missed it; the live UATS happy-path against the running server did not):
walkNeighborhood returns a nil slice when the seed has no neighborhood
(e.g. an unknown seed), which JSON-marshals to `null`, while Events is
defensively initialized to []. Both are array fields and must serialize
consistently — null breaks any consumer asserting an array type (incl. the
new UATS contract's `type_is array` on $.neighbor_node_ids).
EventsInGraphNeighborhood now coalesces the nil slice to []string{}.
Tier 1 TestFederationResult_EmptyArraysNotNull pins the JSON contract.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
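The nil-versus-empty slice behavior behind the fix above is standard encoding/json behavior and can be reproduced in isolation. In this sketch the struct, its string-typed fields, and the helper names are hypothetical; only the JSON field name neighbor_node_ids and the coalesce-to-`[]string{}` idea come from the commit.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// federationResult is a hypothetical, trimmed-down response shape.
type federationResult struct {
	Events          []string `json:"events"`
	NeighborNodeIDs []string `json:"neighbor_node_ids"`
}

// coalesce turns a nil slice into an empty, non-nil one so it marshals as [].
func coalesce(ids []string) []string {
	if ids == nil {
		return []string{}
	}
	return ids
}

// marshalResult encodes a result whose Events is always initialized and whose
// neighbor IDs are optionally coalesced.
func marshalResult(ids []string, fix bool) string {
	if fix {
		ids = coalesce(ids)
	}
	b, err := json.Marshal(federationResult{Events: []string{}, NeighborNodeIDs: ids})
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(marshalResult(nil, false)) // {"events":[],"neighbor_node_ids":null}
	fmt.Println(marshalResult(nil, true))  // {"events":[],"neighbor_node_ids":[]}
}
```

A consumer asserting an array type on `$.neighbor_node_ids` accepts the second output and rejects the first, which is the contract the Tier 1 test in the commit pins.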
Backfills the UATS gap EVENTGRAPH-001 left (no contract test for /v1/eventgraph/reinforcement-neighborhood). 6 cases, validated 6/6 live against the running server: - happy 200: asserts the response contract shape (events/neighbor_node_ids arrays, graph_hops/tsdb_rows_scanned numbers, truncated boolean) — robust to data, works even with an unknown seed (empty neighborhood is valid 200) - missing_space_id / missing_seed_node_id → 400 (empty-string override, since the runner deep-merges variant body over base — key omission can't unset) - negative_hops → 400, hops_over_ceiling (999 > 2×default) → 400 - method_not_allowed (GET) → 405 sha256 integrity hash added + verified. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
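The "empty-string override" note above follows from how a deep merge works: a key that is absent from the variant simply inherits the base value. The sketch below illustrates that property only. It is not the UATS runner's code (whose language and implementation are not shown here); `deepMerge` and the sample body values are hypothetical.

```go
package main

import "fmt"

// deepMerge returns a copy of base with variant merged over it. Nested maps
// merge recursively; any other variant value replaces the base value.
// A key missing from variant keeps its base value, so omission cannot unset.
func deepMerge(base, variant map[string]any) map[string]any {
	out := make(map[string]any, len(base)+len(variant))
	for k, v := range base {
		out[k] = v
	}
	for k, v := range variant {
		if vm, ok := v.(map[string]any); ok {
			if bm, ok := out[k].(map[string]any); ok {
				out[k] = deepMerge(bm, vm)
				continue
			}
		}
		out[k] = v
	}
	return out
}

func main() {
	base := map[string]any{"space_id": "s1", "seed_node_id": "n1"}

	// Omitting space_id in the variant leaves the base value in place.
	fmt.Println(deepMerge(base, map[string]any{})["space_id"])

	// Overriding with an empty string is what actually exercises the 400 path.
	fmt.Println(deepMerge(base, map[string]any{"space_id": ""})["space_id"] == "")
}
```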
… + close (Epic 3) Tier 3 live e2e verified the real binary against the real stack: --query surfaced 20 reinforcement events in a 5-node neighborhood (demonstrating the Hebbian-write → federation-read loop closing in one command); --seed/--json/--limit/unknown-seed/no-arg paths all verified live. Feature doc gains the CLI consumer section; CHANGELOG Added + Fixed entries; CLAUDE.md architecture note; verification.md + post.md (UxTS mapping: UATS done, UOTS follow-up carried over). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…TSDB CI Test failed: the UATS contract step boots a minimal server without TSDB, so the eventgraph service is nil and every POST returns 503 "service not initialized" instead of the expected 200/400 (only GET→405 passed, since the method check precedes the service check). Same class as PR #404. The federation endpoint genuinely requires TSDB (it queries reinforcement_events; the service is nil without TSDB at boot), and CI's UATS step already excludes `tsdb`-tagged specs (ci.yml --exclude-tag ...,tsdb). Added "tsdb" to api.tags (matching metrics_snapshot/readyz_tsdb); re-hashed. Verified locally: the spec now reports Status: skip under the exact CI exclude filter, and still 6/6 live against the full stack via explicit --spec. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Federate the guidance-outcome event stream (Pattern Y1, second event class): walk a constraint's Neo4j neighborhood, surface time-windowed constraint_outcomes (followed/ignored/contradicted) for the constraint + its graph-related constraints. Data-decided architecture: reuse the existing constraint_outcomes table (no new hypertable/writer/enqueue site — RRF-SCALE-001 already populates it, 1176 live rows); join graph↔events on constraint_code (TSDB constraint_id UUID ≠ Neo4j node_id CUID — code is the only viable key). One additive migration (V0023: constraint_code index, schema 22→23). 8 epics, 3 testing tiers, live Tier 3. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…mes (Epic 1) Adds idx_constraint_outcomes_code (space_id, constraint_code, time DESC) — the guidance-outcome federation joins graph↔events on constraint_code (TSDB constraint_id is a UUID that doesn't match the Neo4j node_id CUID; code is the only viable key), and migration 011 indexed only space/constraint_id/outcome. Partial index (constraint_code NOT NULL AND <> '') skips uncoded outcomes. Bumps TSDB_REQUIRED_SCHEMA_VERSION default 22→23 (config.go) to match the migration count — CI schema-version validator gates on this. Additive, no data change, idempotent. Live-verified: migration applies (schema 22→23), idx present, re-apply is a no-op, config/tsdb tests green, CI schema check 23=23. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…d (Epic 2) Second Pattern Y1 federation: walk a constraint's Neo4j neighborhood, collect each neighbor's constraint_code, and join constraint_outcomes on those codes (backed by the V0023 index). walkNeighborhoodWithCodes returns the neighborhood node IDs + a code→node map; queryGuidanceOutcomes pulls coded outcomes in the window; Go-side join resolves each outcome's code → its neighborhood constraint node. Non-nil slices from the start (EVENTGRAPH-CLI-001 lesson). Reuses the existing constraint_outcomes sink — no new table/writer. Tier 1 (-race): validation guards, empty-arrays-not-null, sortedKeys determinism, join resolution. Tier 2 integration (live Neo4j+TSDB): full round-trip — hops=1 (seed+related codes, off-neighborhood excluded), hops=0 (seed code only), unknown-seed (empty non-nil). PASS. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
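The Go-side join described above (outcome rows keyed by constraint_code, resolved through a code→node map built during the graph walk) could look roughly like this. The struct shapes and function names are assumptions for illustration; the commit only names walkNeighborhoodWithCodes, queryGuidanceOutcomes, and sortedKeys without showing their signatures.

```go
package main

import (
	"fmt"
	"sort"
)

// outcome is a hypothetical row from the constraint_outcomes table.
type outcome struct {
	Code    string
	Outcome string // followed / ignored / contradicted
}

// resolvedOutcome is an outcome joined to its neighborhood constraint node.
type resolvedOutcome struct {
	outcome
	ConstraintNodeID string
}

// joinOutcomes keeps only coded outcomes whose code belongs to the
// neighborhood and attaches the matching node ID. It always returns a
// non-nil slice so the result marshals as [] rather than null.
func joinOutcomes(codeToNode map[string]string, rows []outcome) []resolvedOutcome {
	out := make([]resolvedOutcome, 0, len(rows))
	for _, r := range rows {
		if r.Code == "" {
			continue // uncoded outcomes cannot be joined
		}
		nodeID, ok := codeToNode[r.Code]
		if !ok {
			continue // code is outside the walked neighborhood
		}
		out = append(out, resolvedOutcome{outcome: r, ConstraintNodeID: nodeID})
	}
	return out
}

// sortedKeys returns the map keys in a deterministic order.
func sortedKeys(m map[string]string) []string {
	keys := make([]string, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

func main() {
	codeToNode := map[string]string{"no-direct-main-commits": "node-1"}
	rows := []outcome{
		{Code: "no-direct-main-commits", Outcome: "followed"},
		{Code: "", Outcome: "ignored"},
		{Code: "off-neighborhood", Outcome: "followed"},
	}
	fmt.Println(sortedKeys(codeToNode), len(joinOutcomes(codeToNode, rows)))
}
```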
…ic 3) POST /v1/eventgraph/guidance-outcome-neighborhood — walk a constraint's neighborhood, surface constraint_outcomes whose code is in the neighborhood. Same gating/auth/default convention as the reinforcement endpoint. Single-source refactor (per the dynamic-variables directive): extracted the shared gate (method/enabled/service → eventgraphGate) and default-resolution (hops/since/limit + ceiling → resolveFederationDefaults) into helpers used by BOTH handlers, so the federation rules live in exactly one place. The reinforcement handler now calls them too — verified no regression (reinforcement UATS still 6/6 live, unit tests green). Live-verified: seeding from the real 'no-direct-main-commits' constraint node surfaced real 'followed' outcomes with constraint_node_id resolved to the seed and in_neighborhood=true. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…CLI (Epic 4) Sibling subcommand consuming POST /v1/eventgraph/guidance-outcome-neighborhood. Walks a constraint's neighborhood and renders guidance outcomes (followed/ignored split + table: code · outcome · sim · g_type · guidance_id · recorded) or --json. Seed via --seed/--query (--constraint-code seeding deferred: needs server-side code→node resolution; --query covers discovery). Unset hops/since/limit omitted so the server applies config defaults (single source of truth). Tier 1 (-race): request-mapping omit-when-unset + conversion, --query seed resolution, surfaced-503 error, render (empty + followed/ignored table), truncStr. Help renders. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ion (Epic 5) 6 cases, validated 6/6 live: happy-200 response shape (outcomes/neighbor_node_ids/neighbor_constraint_codes arrays, graph_hops/tsdb_rows_scanned numbers, truncated boolean), missing space_id/seed → 400 (empty-string override under deep-merge), negative_hops → 400, hops_over_ceiling → 400, GET → 405. Tagged 'tsdb' so CI skips it without TSDB (the EVENTGRAPH-CLI-001 lesson). sha256 hashed + verified. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Real binary against the real stack. Key assertion: CLI --json output matches direct constraint_outcomes SQL exactly (11 outcomes = 11, all followed) for the no-direct-main-commits constraint. --seed/--query/--limit/--json/unknown-seed/no-arg all verified live. The --query "0 outcomes" result was traced to SQL ground truth: the 5 neighborhood codes genuinely have no feedback, so it's correct (federation distinguishes "code in neighborhood" from "code has outcomes"), not a join bug. Reinforcement endpoint un-regressed by the shared-helper refactor (UATS 6/6). Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…that can fire Live baselines escalate the roadmap's diagnosis: the surprise chain is flat DEAD, not noisy — all 221,504 reinforcement events ever carry surprise_factor=1.0; node surprise_score avg 0.023 (max 0.503, n=5,808) vs hardcoded 0.4/0.7 thresholds. Scope: vector-index top-K novelty + config-driven thresholds recalibrated to the new scale in the same sprint (RRF-SCALE lesson) + CoactivateSession surprise-CASE audit. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ty (Epic 1) Replaces the unordered LIMIT 50 sample (no ORDER BY) whose comparison set was dominated by EMPTY-ARRAY embeddings — 4,564 of 5,810 conversation observations (78%) carry size(embedding)=0, passing the old IS NOT NULL guard while cosine() yields NULL; this is why node surprise_score averaged 0.023 and every reinforcement event ever carried surprise_factor=1.0. New: exact ORDER BY cosine scan over real-embedding (size=dims), non-archived, space-scoped conversation observations; config SURPRISE_EMBEDDING_NOVELTY_TOPK (50) + _SIM_FLOOR (0=off). The db.index.vector.queryNodes route was live-REJECTED: the label-wide index is crowded by ~100k non-conversation nodes (top-200 near a conversation seed were ALL emergent_concept centroids — the HIDDEN-CHURN degeneracy), pruning role-filtered hits to zero; the exact scan over ~1.2k real rows is deterministic and ~ms (revisit at ~50k). Live-verified: avg sim 0.680 / count 50 over the true nearest set. Follow-up recorded: the 78% empty-embedding backlog (embeddings backfill) is its own item. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… at both Cypher sites (Epic 2) CoactivateSession DOES compute the surprise CASE (audit result) — the chain was dead purely from broken scores. The hardcoded 0.7/0.4 thresholds (unreachable: live max score 0.503, avg 0.023 under the old noise) become SURPRISE_FACTOR_HIGH_THRESHOLD (0.5) / SURPRISE_FACTOR_MEDIUM_THRESHOLD (0.3) — defaults calibrated to the exact-top-K novelty scale, parameterized into BOTH ApplyCoactivation and CoactivateSession Cypher (never recalibrate against the old scale — the RRF-SCALE lesson). Non-positive config falls back to defaults. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…loud error path (Epics 3-4) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ace/outcome split, token floors (Epics 1-3)
Budgets: JIMINY_TIMEOUT_MS default 15s→0 (=derive from the 90s
warm-compute budget — both paths run the same synthesis work; the
independent 15s starved every fresh install, the GUIDANCE-SYNTH-001
class); /reformulate's hardcoded 10s → JIMINY_REFORMULATE_TIMEOUT_MS
(0=derive); config.Validate() warns on explicit budget incoherence.
Attribution: PersistGuidanceOutcome now receives feedbackSessionID
(was literal "" — every GUIDANCE_OUTCOME edge ever has null
session_id; forward-only fix). Surface-vs-outcome split:
mdemg_jiminy_guidance_surfaced_total{space_id} (the honest denominator
— TotalGuidanceIssued only counts guidance that RECEIVED feedback) +
mdemg_jiminy_feedback_dropped_total{space_id} on tracker-expiry drops.
Floors: JIMINY_OUTCOME_LLM_MAX_TOKENS 100→3000 (truncation risk on the
classifier reasoning field → parse fail → heuristic fallback, the
JIMINY-OUTCOME-002 artifact class), SYNTHESIS 2000→3000, EVALUATE
2000→3000 (standing ≥3000 rule; completion stops at JSON end — floors
are free insurance). Stale TTL comment fixed (1800→86400 actual).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…keleton (Epic 1a) scripts/verify_route_consumers.py extracts the live route table and fails on bidirectional drift (unlisted route / stale entry) and on any UNREVIEWED disposition — the bootstrap marker. Gate verified failing on the 187 fresh UNREVIEWED entries; adjudication next. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…cking CI gate (Epic 1b) All 187 routes carry evidence-based dispositions: 109 ACTIVE, 60 OPERATOR_SURFACE, 13 INTERNAL, 4 PRUNE_CANDIDATE, 1 DEFERRED. 171/187 matched to UATS specs. Orchestrator re-verified every PRUNE_CANDIDATE and known false positive independently. Census reversals vs recon (the false-positive class this gate exists to catch): /viz/topology + /api/graph/* are LIVE Grafana consumers (topology iframe + nodegraph datasource) — removed from the prune list; /v1/conversation/snapshot* is NOT called by pre-compact.sh (saves via /v1/conversation/observe) — OPERATOR_SURFACE. Hidden consumer surfaced: embedded /ui/ dashboard consumes ~35 routes that look dormant from hooks/CLI/scripts alone. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… ordering (Epic 2) The Hebbian signal learner (V0024 SignalState, supervised flush, startup hydration, live emission/response stream since HOOKWIRE-001) had a read side with ZERO production callers. Guide() now orders within equal priority by (1-w)·confidence + w·GetStrength(code), w = JIMINY_SIGNAL_STRENGTH_WEIGHT (default 0.2; 0 = off, pre-census behavior; clamped to 1). Ordering only — selection/filtering untouched. Unknown codes blend the learner's 0.5 neutral default. 6 Tier 1 tests pin the contract: weight-0/nil-learner pure confidence, blend formula, neutral default, clamp, priority dominance + within-priority strength overtake. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
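The blend formula above, (1-w)·confidence + w·strength with w clamped and unknown codes defaulting to 0.5, can be sketched as a stable sort. Names, the item shape, and the assumption that a larger Priority value sorts first are hypothetical; the formula, the clamp, the 0.5 neutral default, and "ordering only within equal priority" come from the commit.

```go
package main

import (
	"fmt"
	"sort"
)

// item is a hypothetical guidance item.
type item struct {
	Code       string
	Priority   int // assumption: larger value sorts first
	Confidence float64
}

// clampWeight limits w to [0, 1]; 0 reproduces pure-confidence ordering.
func clampWeight(w float64) float64 {
	if w < 0 {
		return 0
	}
	if w > 1 {
		return 1
	}
	return w
}

// orderItems sorts by priority, then by the blended score within a priority.
// A nil strength map or an unknown code uses the neutral 0.5 strength.
func orderItems(items []item, strength map[string]float64, w float64) {
	w = clampWeight(w)
	score := func(it item) float64 {
		s, ok := strength[it.Code]
		if !ok {
			s = 0.5
		}
		return (1-w)*it.Confidence + w*s
	}
	sort.SliceStable(items, func(i, j int) bool {
		if items[i].Priority != items[j].Priority {
			return items[i].Priority > items[j].Priority
		}
		return score(items[i]) > score(items[j])
	})
}

func main() {
	items := []item{
		{Code: "a", Priority: 1, Confidence: 0.8},
		{Code: "b", Priority: 1, Confidence: 0.7},
		{Code: "c", Priority: 2, Confidence: 0.1},
	}
	strength := map[string]float64{"a": 0.1, "b": 1.0}
	orderItems(items, strength, 0.2)
	for _, it := range items {
		fmt.Println(it.Code)
	}
}
```

With w = 0.2 in this example, "b" (blended 0.76) overtakes "a" (blended 0.66) inside priority 1, while "c" still leads because priority dominates. With w = 0 the order inside priority 1 reverts to confidence alone.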
…ic 3) Pruned (each independently re-verified zero producers + named successor): - /v1/feedback — gap-feedback intake; live channel is /v1/jiminy/feedback. Handler + isolated gaps.ProcessFeedback/Feedback removed. - /v1/memory/ingest-codebase[/] — deprecated since Phase 94 (Deprecation header); successor /v1/memory/ingest/trigger. Whole handler file removed. - POST /v1/alerts/grafana — superseded by the native alert evaluator; only ref was a commented-out contactpoint. Compose env MDEMG_GRAFANA_ALERT_WEBHOOK_URL removed from both compose files. NOT pruned (census reversals): /viz/topology + /api/graph/* are live Grafana consumers; PREDICTS/FORESHADOWS exist nowhere in code (recon claim was wrong — no-op). 6 UATS specs removed + capability_gaps_full /v1/feedback variants dropped; UXTS matrix 220→214; inventory entries retained as PRUNED with removed_in. Gate: 183 live / 187 inventoried, OK. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…bserves Live-smoke catch (DORMANT-CENSUS-001 Tier 3, own fix commit per precedent): ConstraintCodeGenerator.GenerateCode's collision branch called fallbackCode while holding g.mu; fallbackCode locks g.mu again. sync.Mutex is not reentrant — the first LLM-returned code that collided with a registered code deadlocked the generator permanently, and every later constraint-typed /v1/conversation/observe queued behind it forever (UATS conversation_observe_pinned hung 45-90s+ deterministically; goroutine dump showed the holder 18 min wedged at codegen.go:121 with N waiters at :47). Fix: fallbackCodeLocked (caller holds g.mu) used by the collision branch; fallbackCode wraps it. Regression test drives the real collision path through a fake OpenAI-compat endpoint with a 10s deadlock tripwire + post-call usability check. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
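Since sync.Mutex is not reentrant, the usual remedy is the "Locked" helper convention the commit above adopts: one variant assumes the caller already holds the lock, and a thin wrapper acquires it. The sketch below shows that shape with a hypothetical `generator`; the counter-based fallback naming is invented for the example and is not how the project generates codes.

```go
package main

import (
	"fmt"
	"sync"
)

// generator is a hypothetical code generator guarded by a single mutex.
type generator struct {
	mu   sync.Mutex
	used map[string]bool
	seq  int
}

func newGenerator() *generator {
	return &generator{used: make(map[string]bool)}
}

// fallbackCodeLocked allocates a fresh code. The caller MUST hold g.mu.
func (g *generator) fallbackCodeLocked() string {
	for {
		g.seq++
		code := fmt.Sprintf("constraint-%d", g.seq)
		if !g.used[code] {
			g.used[code] = true
			return code
		}
	}
}

// fallbackCode is the public wrapper: it takes the lock, then delegates.
func (g *generator) fallbackCode() string {
	g.mu.Lock()
	defer g.mu.Unlock()
	return g.fallbackCodeLocked()
}

// generateCode registers proposed, or falls back on collision. It already
// holds g.mu, so it must call the Locked variant; calling fallbackCode here
// would lock g.mu a second time and block forever.
func (g *generator) generateCode(proposed string) string {
	g.mu.Lock()
	defer g.mu.Unlock()
	if proposed != "" && !g.used[proposed] {
		g.used[proposed] = true
		return proposed
	}
	return g.fallbackCodeLocked()
}

func main() {
	g := newGenerator()
	fmt.Println(g.generateCode("no-direct-main-commits"))
	fmt.Println(g.generateCode("no-direct-main-commits")) // collision path
	fmt.Println(g.fallbackCode())                         // still usable afterwards
}
```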
…variant removal The Epic 3 prune removed the spec's two /v1/feedback variants but did not re-pin its integrity sha256 — the merge-blocking UNTS hash-verify step correctly caught the drift on PR #452. 214/214 valid locally. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…(Epic 1) (1) Cross-version fingerprint guard: deriveQueryFingerprint returns the active catalog version; it rides RetrieveRequest/ColumnQuery into ContextColumn and StrictContextMode, where candidates fingerprinted against a different catalog version score 0 — bit positions reallocate per build, so cross-version Jaccard is noise (mdemg-dev: 76,906 v1 nodes were silently compared against v3 query bits). Version 0 = unknown (explicit-fp callers) keeps legacy behavior. (2) Consensus semantics: columns STRUCTURALLY unable to vote (disabled by config, or context with no query fingerprint) are no longer appended, so they stop deflating the denominator — the always-empty live context column had every live query's consensus hard-capped at 0.8. Errored/timed-out columns still count (documented intent kept). The column_context.go comment that claimed exclusion now matches code. (3) Cache integrity: QueryContextFingerprintVersion joins CacheKey (the reflection forcing-function caught it unprompted); scorerVersion bumps to v2 (semantics change) and gains a deterministic hash of the per-category context-weight + sparse-override maps — operator edits to either JSON now flip the namespace (pointer fields dereferenced; %+v would have hashed addresses). Pins: TestContextColumn_VersionGuard, TestConsensus_DenominatorExcludesAbsentColumns, TestScorerVersion_FlipsOnCategoryMapChanges. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
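A sketch of the cross-version guard from point (1) above: Jaccard similarity over bit fingerprints, forced to 0 when the candidate was fingerprinted against a different catalog version, with query version 0 meaning "unknown, keep legacy behavior". The `[]uint64` bitset representation and the function names are assumptions; the commit does not show how fingerprints are stored.

```go
package main

import (
	"fmt"
	"math/bits"
)

// jaccard computes |a AND b| / |a OR b| over two bitsets stored as words.
// It returns 0 when both sets are empty.
func jaccard(a, b []uint64) float64 {
	n := len(a)
	if len(b) < n {
		n = len(b)
	}
	var inter, union int
	for i := 0; i < n; i++ {
		inter += bits.OnesCount64(a[i] & b[i])
		union += bits.OnesCount64(a[i] | b[i])
	}
	// Words present in only one set contribute to the union alone.
	for _, w := range a[n:] {
		union += bits.OnesCount64(w)
	}
	for _, w := range b[n:] {
		union += bits.OnesCount64(w)
	}
	if union == 0 {
		return 0
	}
	return float64(inter) / float64(union)
}

// contextScore applies the version guard before comparing bits. Bit positions
// are reallocated per catalog build, so a cross-version comparison is noise
// and scores 0. queryVersion 0 (unknown) skips the guard.
func contextScore(queryFP []uint64, queryVersion int, candFP []uint64, candVersion int) float64 {
	if queryVersion != 0 && candVersion != queryVersion {
		return 0
	}
	return jaccard(queryFP, candFP)
}

func main() {
	q := []uint64{0b1011}
	c := []uint64{0b0011}
	fmt.Println(contextScore(q, 3, c, 3)) // same version: real Jaccard
	fmt.Println(contextScore(q, 3, c, 1)) // v1 node vs v3 query: 0
}
```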
…ed (Epic 2) Disclosed deviation from the plan sketch: RefineWithCoactivations is NOT the skew healer — it merges old-catalog bits and bumps the version, which would relabel v1-semantics bits as current. The healer is recomputation: new conversation.RecomputeStaleFingerprints (the backfill CLI's core as a budget-bounded, resumable library call). Stage 6 now runs on EVERY invocation (not just rebuilds): catalog freshness/build as before, then (1) heal pass — recompute up to CONTEXT_FINGERPRINT_HEAL_MAX_PER_CYCLE (2000) stale-version nodes under the existing 60s budget, flushing partial batches on ctx expiry; (2) Phase-B refine — RefineWithCoactivations (previously ZERO callers) over up to CONTEXT_FINGERPRINT_REFINE_MAX_PER_CYCLE (200) current-version observations with co-activations, marked via context_fingerprint_refined_version so cycles never re-walk. Driver injected via SetFingerprintDriver (server.go, next to SetContextCatalog); nil driver = old rebuild-only behavior. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…(Epic 3) Live traffic never passes ?category=, so SPARSE_GATE_CATEGORY_OVERRIDES and RETRIEVAL_CONTEXT_COLUMN_CATEGORY_WEIGHTS only ever fired on benchmark calls. Retrieve now derives req.Category from the QueryClassifier's types via QUERY_CLASSIFY_CATEGORY_MAP (JSON; default data_flow→data_flow_integration, architecture→architecture_structure, relationship→relationship). Explicit body/param category always wins; first mapped type wins for multi-label; empty map disables. Runs pre-CacheKey (Category already in the key — cache-safe; classifier already ran before the key was computed). Vocabulary gap disclosed: service_relationships/business_logic_constraints have no classifier equivalent and stay benchmark-only. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
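The precedence rules above (explicit category wins, first mapped classifier type wins, empty map disables) are easy to state as code. The sketch below is an assumption-labeled illustration: `parseCategoryMap` and `deriveCategory` are hypothetical names, and only the default mapping values are taken from the commit text.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// defaultCategoryMapJSON mirrors the default mapping quoted in the commit.
const defaultCategoryMapJSON = `{
  "data_flow": "data_flow_integration",
  "architecture": "architecture_structure",
  "relationship": "relationship"
}`

// parseCategoryMap decodes a JSON object of classifier type -> category.
func parseCategoryMap(raw string) (map[string]string, error) {
	m := map[string]string{}
	if err := json.Unmarshal([]byte(raw), &m); err != nil {
		return nil, err
	}
	return m, nil
}

// deriveCategory returns the explicit category when set; otherwise the
// category of the first classifier type present in the map. An empty map
// never matches, which disables derivation.
func deriveCategory(explicit string, types []string, categoryMap map[string]string) string {
	if explicit != "" {
		return explicit
	}
	for _, t := range types {
		if c, ok := categoryMap[t]; ok && c != "" {
			return c
		}
	}
	return ""
}

func main() {
	m, err := parseCategoryMap(defaultCategoryMapJSON)
	if err != nil {
		fmt.Println("bad map:", err)
		return
	}
	fmt.Println(deriveCategory("", []string{"unmapped", "architecture", "data_flow"}, m))
}
```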
…ult-on (Epic 4) CONTEXT_QUERY_AUTO_DEFAULT (default true) makes derivation fire for every retrieve with query text; per-call opt-out ?context=off|false|0; explicit ?context=auto still forces it when the config gate is off. Sequenced after the skew heal + version guard, so the newly-active column scores healed current-version fingerprints, never cross-version noise. Cost: reuses the per-(space,version) catalog-vector cache — one cosine top-K over ≤256 refs per call. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
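The gate resolution described above has three inputs: the per-call ?context= value, the CONTEXT_QUERY_AUTO_DEFAULT config flag, and whether the request carries query text. A minimal sketch of that decision, with a hypothetical function name and the assumption that parameter values are matched exactly as written in the commit:

```go
package main

import "fmt"

// contextDerivationEnabled decides whether to derive a context fingerprint
// for one retrieve call.
//   - no query text: nothing to derive from (assumption: applies to all modes)
//   - ?context=off|false|0: per-call opt-out
//   - ?context=auto: forces derivation even when the config gate is off
//   - anything else: follow the config default
func contextDerivationEnabled(param string, autoDefault bool, queryText string) bool {
	if queryText == "" {
		return false
	}
	switch param {
	case "off", "false", "0":
		return false
	case "auto":
		return true
	default:
		return autoDefault
	}
}

func main() {
	fmt.Println(contextDerivationEnabled("", true, "how does retrieval work"))
	fmt.Println(contextDerivationEnabled("off", true, "how does retrieval work"))
	fmt.Println(contextDerivationEnabled("auto", false, "how does retrieval work"))
}
```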
…+ post (Epics 5-6) Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Sprint CONTEXT-LIVE-001: Context Fingerprinting Goes Live
Q3 stretch tier, first pick. The 5th RRF column existed since Phase 14.2 but was benchmark-only in practice; this sprint makes it real on live traffic. All roadmap recon claims confirmed live (sharper than stated).
What was broken
Shipped
The 120q UVTS A/B gate: full disclosure
Strict verdict: FAIL at mean −0.004 (0.4070→0.4030). Forensics: all 13 changed questions moved by exactly ±0.100, exclusively in …
Side-note: the deliberate baseline/candidate binary swaps live-fired the scorer-drift tripwire (TSDB-CONSUME-001); the alert storm during the runs was the tripwire working as designed.
Follow-ups recorded
Catalogs carry no symbol bits (Phase-B refine structurally no-op until the builder allocates them); …
Summary
Development branch changes from reh3376_dev01.
Commits
mdemg model CLI + pluggable Fetcher interface
Auto-generated PR from reh3376_dev01 push