dev: reh3376_dev01 -> main #466

Merged
reh3376 merged 360 commits into main from reh3376_dev01, Jun 15, 2026

@github-actions

Contributor

Summary

Development branch changes from reh3376_dev01.

Commits

  • feat(baseline-recompute-001): honest baseline 0.8338->0.8655 via fixed harness (Epic 2-4)
  • docs+fix(baseline-recompute-001): sprint plan (Epic 0) + rl_phase11 port 8101->8102 (Epic 1)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(reward-correctness-002): CHANGELOG + post (Epic 5 close)
  • fix(reward-correctness-002): schema/reward-mismatch fixes (Epics 1-3) + live validation
  • docs(reward-correctness-002): sprint plan — schema/reward-mismatch fixes (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(dataprune): Neo4j space hygiene record + CHANGELOG
  • fix(space): delete pre-check + list panic on non-MemoryNode / null-space data
  • docs(dataprune-audit-001): Category B+C prune execution record
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(dataprune-audit-001): record Category A prune execution + CHANGELOG (Epic 2 close)
  • docs(dataprune-audit-001): non-destructive audit of non-conforming data (Epic 0-1)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • fix(ape-prompt-budget-001): re-pin ape_reflect ULTS source line after import shift
  • docs(ape-prompt-budget-001): feature doc + CHANGELOG + post + CLAUDE.md (Epic 5)
  • feat(ape-prompt-budget-001): bound ape.reflect prompt to protect output (Epic 1-2)
  • docs(ape-prompt-budget-001): sprint plan + recon (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • Merge remote-tracking branch 'origin/reh3376_dev01' into reh3376_dev01
  • docs(reward-correctness-001): CHANGELOG + sprint post (Epic 6 close)
  • feat(reward-correctness-001): per-task inclusion thresholds + live findings (Epic 2-3)
  • feat(reward-correctness-001): length-neutral correctness rewards (Epic 1)
  • docs(reward-correctness-001): sprint plan (Epic 0)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • fix(eval-integrity-001): restore ape_reflect system_prompt_source to clean file:line
  • Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
  • docs(eval-integrity-001): feature doc + CHANGELOG + post (Epic 7)
  • feat(eval-integrity-001): leak-audit gate target (Epic 2)
  • fix(eval-integrity-001): wire LLM recorder in ingest process — capture summarize.generate (Epic: capture-gap)
  • feat(eval-integrity-001): hard-fail on zero successful calls (Epic 3)
  • feat(eval-integrity-001): wire gate to leak-free 12-task valid_clean (Epic: gate-set)
  • fix(eval-integrity-001): respect dynamic_prompt — recover enum-templated tasks' production rows (Epic 1)
  • docs(eval-integrity-001): sprint plan + recon (Epic 0)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
  • docs(sidecar-loop-001): feature doc + CHANGELOG; archive mislabeled pre-fix data
  • docs(training-audit): root-cause the 3 discarded retrains — corpus filter + eval + DPO all mislabeled
  • fix(sidecar-loop-001): collector logs aligned reranked candidates + guard (Epic 1)
  • docs(sidecar-loop-001): sprint plan + recon (Epic 0) — fix-but-defer
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
  • docs(negfeed-cooler-001): feature doc + CHANGELOG + post (Epic 6)
  • feat(cooler-001): graduated-incident edges resist decay (Epic 4)
  • feat(cooler-001): unify graduation onto the Context Cooler (Epic 3)
  • feat(negfeed-001): anti-Hebbian producer — Jiminy contradicted bridge + MCP memory_reject (Epic 2)
  • feat(negfeed-001): CoactivateSession off-request + delta emission (Epic 1)
  • docs(negfeed-cooler-001): sprint plan + live-verified recon (Epic 0)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
  • docs(readme): coverage-gap additions per 3-lane team review (operator-approved)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(context-live-001): UVTS A/B artifacts + feature doc + CHANGELOG + post (Epics 5-6)
  • feat(context-live-001): server-side query-fingerprint derivation default-on (Epic 4)
  • feat(context-live-001): classifier→category dispatch on live traffic (Epic 3)
  • feat(context-live-001): stage-6 fingerprint heal + Phase-B refine wired (Epic 2)
  • feat(context-live-001): version guard + honest consensus denominator (Epic 1)
  • docs(context-live-001): sprint plan + recon findings (Epic 0)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • fix(dormant-census-001): re-pin capability_gaps_full spec hash after variant removal
  • docs(dormant-census-001): feature doc + CHANGELOG + sprint post (Epic 5)
  • fix(jiminy): codegen collision-path self-deadlock wedged constraint observes
  • feat(dormant-census-001): prune 4 verified-dead routes + handlers (Epic 3)
  • feat(dormant-census-001): wire SignalLearner.GetStrength into Guide() ordering (Epic 2)
  • feat(dormant-census-001): adjudicated 187-route inventory + merge-blocking CI gate (Epic 1b)
  • feat(dormant-census-001): route↔consumer gate + 187-route inventory skeleton (Epic 1a)
  • docs(dormant-census-001): sprint plan — the standing dormancy guarantee
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(jiminy-budget-001): CHANGELOG + post (Epic 4)
  • feat(jiminy-budget-001): budget derivation, session attribution, surface/outcome split, token floors (Epics 1-3)
  • docs(jiminy-budget-001): sprint plan
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(surprise-topk-001): post + CHANGELOG; threshold recalibration + loud error path (Epics 3-4)
  • feat(surprise-topk-001): config-driven surprise multiplier thresholds at both Cypher sites (Epic 2)
  • feat(surprise-topk-001): exact top-K nearest-neighbor embedding novelty (Epic 1)
  • docs(surprise-topk-001): sprint plan — honest novelty + a multiplier that can fire
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • fix(doc-audit-001a): execute the approved fix batch
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
  • docs(ft-classify-002): CHANGELOG + post — no-promote accepted, sprint closed (Epic 6)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(ft-classify-002): gate results — 2/3 PASS, no-promote recommended
  • docs(doc-audit-001): amended charter per 3-lens review team (operator-approved)
  • docs(ft-classify-002): stage-by-stage run record (6a instrumentation input)
  • fix(ft-classify-002): benchmark config still pointed at decommissioned mlx port 8101
  • feat(ft-classify-002): training config — fresh task-delta LoRA on the production fused model (Epic 2)
  • feat(ft-classify-002): class-stratified distill capture + the real 11.5d root cause (Epic 1)
  • Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(ft-classify-002): sprint plan — distribution-matched consulting.classify distillation
  • fix(tsdb): time-scope V0014's backfill + integrity assertion to its historical window
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(mcp-revive-001): feature doc + CHANGELOG + CLAUDE.md + post (Epic 5)
  • feat(mcp-revive-001): eventgraph + strict MCP tools, plugin-orphan reaper (Epics 3+4)
  • test(mcp-revive-001): contract suite for the MCP tool surface (Epic 2)
  • feat(mcp-revive-001): space resolution chain across all MCP memory tools (Epic 1)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • fix(jiminy-outcome-002): Tier-2 not_applicable + verdict provenance
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(ft-recursive-000): recursive-retraining loop as-built audit + buildable spec
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(tsdb-consume-001): record the Epic-5 live-smoke catch in post.md
  • fix(tsdb-consume-001): emergence-cycle gauge writers actually land in hidden/service.go
  • docs(tsdb-consume-001): feature doc + CHANGELOG + CLAUDE.md + post (Epic 8)
  • ci: reclaim runner disk before docker-publish builds
  • fix(tsdb-consume-001): remove the 4 reader-without-writer ft_* dashboard panels (Epic 7)
  • feat(tsdb-consume-001): V0020 writer-or-drop — decided: writer (Epic 6)
  • feat(tsdb-consume-001): guidance_conflicts counter + emergence-cycle duration tripwire (Epic 5)
  • feat(tsdb-consume-001): scorer-drift tripwires over retrieval_audit (Epic 4)
  • feat(tsdb-consume-001): writer flush stats → metrics + flush-failure rule (Epic 3)
  • fix(tsdb-consume-001): honest metrics plane — windowed percentiles, real pool gauges, latency rules over real wall-time (Epic 2)
  • feat(tsdb-consume-001): V0025 retention+compression for unbounded hypertables (Epic 1)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(config-deadflag-001): CHANGELOG + post — sprint complete
  • feat(config-deadflag-001): full triage — 28 wired, 29 deleted, 0 allowlisted (Epic 2) + CI guard (Epic 4a)
  • feat(config-deadflag-001): enforce EVENTGRAPH_MAX_PAIRS_PER_EVENT_BATCH (first triage wire)
  • docs(config-deadflag-001): sprint plan (Epic 0)
  • feat(config-deadflag-001): strict getBool + un-swallow LoadYAMLConfig + consumer scanner (Epics 1+3)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(doc-truth-002): MoE-as-current residual cleanup (modified per review)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • fix(uxts-ci-001): UBENCH CI gate → lint mode; holdout absence tolerated, mismatch still fatal
  • fix(uxts-ci-001): add tiktoken + pyyaml to the neural CI job deps
  • docs(uxts-ci-001): CHANGELOG + CLAUDE.md CI-gate inventory + post — sprint complete (Epics 5-6)
  • feat(uxts-ci-001): drift checker covers all 15 frameworks, matrix refreshed, gate un-zombied (Epic 4)
  • feat(uxts-ci-001): neural pytest+ruff CI job — first run caught two silent failures (Epic 3b)
  • feat(uxts-ci-001): TSDB in CI, un-zombie UOTS, delete UVTS step, ULTS hash gate live (Epics 1-3a)
  • docs(uxts-ci-001): sprint plan (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • fix(uats-gap-001): tag breaker reset round-trip llm_required — CI has no openai-embeddings breaker
  • feat(uats-gap-001): 8 contract specs for the revived channels + suite hygiene (Epics 1-6)
  • fix(transfer): nil-guard edge identity assertions — one bad edge panicked the whole server (UATS-caught P0)
  • fix(jiminy): reformulate returns 400 (not 500) for missing context
  • docs(uats-gap-001): sprint plan (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • fix(tsdb): initial backup on start — restart-resetting ticker meant zero backups ever
  • fix(rsic-storm-001): commit ExecuteTombstoneStaleForTest wrapper missed from Epic F
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(rsic-storm-001): feature doc + verification + rollback drill test + CHANGELOG + CLAUDE.md — sprint complete (Epic F)
  • fix(rsic-storm-001): attributable archival + unified tombstone predicate (Epics B+C)
  • fix(rsic-storm-001): atomic trigger admission — reserve-on-allow (Epic A)
  • docs(rsic-storm-001): sprint plan with corrected burst attribution (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(backup-restore-verify-001): feature doc + verification + CHANGELOG + CLAUDE.md — sprint complete (Epic 7)
  • fix(transfer): omit empty path/name on import — restores with observations always failed (drill-caught)
  • fix(backup-restore-verify-001): retention was deleting every backup it just made (drill-caught)
  • feat(backup-restore-verify-001): initial backup on start (rule honesty)
  • feat(backup-restore-verify-001): neo4j-backup jobhealth + generalized staleness rules (Epic 5)
  • feat(backup-restore-verify-001): harden the restore path (Epics 1-4)
  • docs(backup-restore-verify-001): sprint plan (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(supervisor-002): feature doc + verification + CHANGELOG + CLAUDE.md — sprint complete (Epic 6)
  • fix(supervisor-002): streak-relative global-outage discriminator (drill-caught)
  • fix(jiminy): detach feedback outcome processing from the hook's connection lifetime
  • feat(supervisor-002): aggregate evaluator-degraded alert for global outages
  • fix(supervisor-002): nolint G118 on legacy-fallback loop launches
  • fix(supervisor-002): recency-gate the RSIC llm_error_rate_spike insight (Epic 4)
  • feat(supervisor-002): rule-health meta-alert on evaluator query failures (Epic 3)
  • feat(supervisor-002): register the 12 unsupervised background loops (Epic 2)
  • feat(supervisor-002): sliding-window restart budget + late registration (Epic 1)
  • docs(supervisor-002): sprint plan + background loop inventory (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(hidden-churn-001): PR-B verification + CHANGELOG + CLAUDE.md — sprint complete (Epic B5)
  • fix(hidden-churn-001): live-smoke fixes — noise pool was structurally empty, clustering included archived debris, coverage gauge gated on min-obs
  • feat(hidden-churn-001): surface themes_updated + noise_assigned in consolidate API + periodic log (Epic B3)
  • feat(hidden-churn-001): mdemg concepts repair + trace — grounding audit CLI (Epic B2)
  • feat(hidden-churn-001): PR-B coverage retune — config ratio, density assignment, gauge + rule (Epic B1)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(hidden-churn-001): PR-A verification + CHANGELOG (Epics A3-A4)
  • chore(hidden-churn-001): remove the dead global-detach helper — the churn mechanism itself
  • feat(hidden-churn-001): stable theme identity — centroid match-or-create replaces the 5-minute churn (Epic A2)
  • fix(hidden-churn-001): automated consolidation no longer skips LLM emergence (Epic A1)
  • docs(hidden-churn-001): sprint plan — stable concept identity, two-PR delivery (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(rrf-scale-002): live-calibrated suggest floor + CHANGELOG + close (Epics 5-6)
  • fix(rrf-scale-002): CacheKey covers ALL result-affecting fields + two forcing functions (Epics 3-4)
  • fix(rrf-scale-002): config-driven score thresholds — suggest revival, MCP tiers, guardrail floor (Epic 2)
  • fix(rrf-scale-002): persistent rerank clients — failure alerting re-armed on the hottest LLM path (Epic 1)
  • docs(rrf-scale-002): sprint plan — finish the score-scale contract (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • test(rsic-validate-001): integration seeds carry session linkage for the scoped tombstone contract
  • docs(rsic-validate-001): Tier 3 verification + CHANGELOG + close (Epics 5-6)
  • fix(rsic-validate-001): counter-free confidence calibration — RSIC stops polluting its own signal (Epic 4)
  • fix(rsic-validate-001): tombstone_stale scoped to correction-linked nodes; refresh_stale_edges decays for real (Epic 3)
  • fix(rsic-validate-001): honest criteria evaluation — populated keys + fail-closed mutations (Epics 1-2)
  • docs(rsic-validate-001): sprint plan — fail-closed self-improvement (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(doc-truth-001): last stale --adapter help string (sweep straggler)
  • docs(doc-truth-001): grep-sweep proof + CHANGELOG + close (Epic 4)
  • docs(doc-truth-001): 00_README STATUS block + AGENT_HANDOFF retired (Epic 3)
  • fix(doc-truth-001): operator-facing text matches the Phase 13.5 reality (Epic 2)
  • docs(doc-truth-001): CLAUDE.md FT section rewritten to post-pivot reality (Epic 1)
  • docs(doc-truth-001): sprint plan — documentation matches reality (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(embed-wire-001): live verification + CHANGELOG + CLAUDE.md + close (Epics 3-4)
  • fix(ingest-exec-001): server-triggered ingest resolves the mdemg binary — was hardcoded ./bin/mdemg (Epic 2)
  • fix(embed-wire-001): breaker + recorder reach the real embedder through the wrapper chain (Epic 1)
  • docs(embed-wire-001): sprint plan — embedder wiring + ingest exec resolution (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs: CLAUDE.md architecture note for MAINT-LIVE-001
  • docs(maint-live-001): first live run verification + feature doc + CHANGELOG + close (Epics 4b-5)
  • fix(prune): orphan sweeps use implicit transactions for batched deletes
  • feat(maint-live-001): context-dependent orphan policy — --exclude-role-types (Epic 4a)
  • feat(maint-live-001): mdemg upgrade refreshes installed LaunchAgents + hooks (darwin) (Epic 3)
  • feat(maint-live-001): maintenance_no_live_run evaluator rule (Epic 2)
  • fix(maint-live-001): scheduled maintenance runs live — plist passes --dry-run=false (Epic 1)
  • docs(maint-live-001): sprint plan — scheduled maintenance actually runs (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • docs(hidden-weight-001): Tier 3 verification + corpus restoration + UVTS harness audit + close (Epics 4-5)
  • fix(ingest): config-driven consolidation timeout — was sharing the 300s batch budget
  • feat(hidden-weight-001): null-weight gauge + regression alert rule (Epic 3)
  • feat(hidden-weight-001): mdemg graph backfill-weights — heal 56k NULL abstraction weights (Epic 2)
  • fix(hidden-weight-001): abstraction-edge weights — vector.similarity.cosine replaces point.distance (Epic 1)
  • docs(hidden-weight-001): sprint plan — real weights on the abstraction hierarchy (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • fix(ci): track .claude/hooks/pre-write-check.py so hook-parity check passes
  • fix(uats): jiminy_guide_sanitized timeout 30s → 90s — stale vs synthesis latency
  • docs(hooksync-001): Tier 3 verification + feature doc + CHANGELOG + close (Epics 7-8)
  • fix(hooksync-001): PORT-TRUTH — loopback bind defaults + sidecar zombie replaced (Epic 6)
  • feat(hooksync-001): mdemg hooks doctor — one-shot hook-channel triage (Epic 5)
  • feat(hooksync-001): hook-channel absence detection — the channel now self-reports outages (Epic 4)
  • feat(hooksync-001): alert Cleared lifecycle — display once, then delivered (Epic 3)
  • ci(hooksync-001): hook-template parity gate — live hooks must match templates (Epic 2)
  • fix(hooksync-001): reconcile bidirectional hook drift — alert delivery restored to live (Epic 1)
  • docs(hooksync-001): sprint plan — drift-proof + self-monitoring hook channel (Epic 0)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • Merge remote-tracking branch 'origin/reh3376_dev01' into reh3376_dev01
  • docs(hookwire-001): Tier 3 verification + CHANGELOG + CLAUDE.md contract pin + close (Epics 4-5)
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • fix(hookwire-001): pre-compact transcript extraction reads the real line shape (Epic 3)
  • fix(hookwire-001): post-tool-observe reads tool_response — end blind "succeeded" observations (Epic 2)
  • fix(hookwire-001): prompt-context.sh reads .prompt — revive the per-prompt channel (Epic 1)
  • docs(hookwire-001): sprint plan — fix hook stdin contract, reconnect per-prompt channel (Epic 0)
  • docs(roadmap): Q3 2026 vision-derived roadmap from 26-agent codebase deep-dive
  • chore: sync reh3376_dev01 with main (post-merge base advance)
  • ci: auto-sync dev branch with main after each squash-merged PR
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • Merge origin/main into reh3376_dev01 (resolve PR #419 conflicts)
  • docs(eventgraph-004): feature doc + CHANGELOG + UATS pin + close (Epic 3)
  • docs(eventgraph-004): Tier 3 live verification — contradict create/re-match + weaken unchanged (Epic 2)
  • feat(eventgraph-004): wire ApplyNegativeFeedback contradict path → reinforcement_events (Epic 1)
  • docs(eventgraph-004): sprint plan + CoactivateSession post-revival health review (Epic 0)
  • docs(eventgraph-003): Tier 3 verification + feature doc + CHANGELOG + close (Epic 4)
  • fix(conversation): inject learning service so CoactivateSession actually runs
  • feat(eventgraph-003): wire ApplyNegativeFeedback weaken path → reinforcement_events (Epic 3)
  • feat(eventgraph-003): wire ApplySymbolCoactivation into reinforcement_events (Epic 2)
  • feat(eventgraph-003): wire CoactivateSession into reinforcement_events (Epic 1)
  • docs(eventgraph-003): sprint plan — reinforcement coverage for other Hebbian paths
  • chore(submodule): bump homebrew-mdemg to v0.10.1 formula
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs: governance system doc + bring cli/api references current
  • release: cut v0.10.1
  • feat(hooks): add pre-write-check.py to the tracked installer
  • feat(hooks): per-conversation SessionID — installed hook copies
  • feat(hooks): per-conversation SessionID across hooks + skill
  • docs(jiminy-governance): commit install-ready skill + install README
  • feat(jiminy-governance): ship the J17 governance skill + register MDEMG MCP
  • docs(jiminy-governance): resolve skill wire-up against the real instance
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs(roadmap): add jiminy-governance skill build-out (Workstream C, Action 7)
  • fix(nosilent-001): sync embedded launchd server plist with source (CI)
  • docs(nosilent-001): feature doc + CHANGELOG + CLAUDE.md + close (Epic 4)
  • fix(nosilent-001): distinct services for job rules so neither masks the other
  • feat(nosilent-001): scheduled-job staleness + failure alert rules (Epic 3)
  • feat(nosilent-001): record + alert on scheduled-job outcomes (Epic 2)
  • feat(nosilent-001): V0024 scheduled_job_events + writer (Epic 1)
  • docs(nosilent-001): sprint plan — fail-loud scheduled jobs (Epic 0)
  • fix(metrics,backup): resolve docker binary robustly under minimal launchd PATH
  • docs(eventgraph-002): feature doc + CHANGELOG + CLAUDE.md + close (Epic 7)
  • docs(eventgraph-002): Tier 3 live verification (Epic 6)
  • test(eventgraph-002): UATS contract spec for guidance-outcome federation (Epic 5)
  • feat(eventgraph-002): mdemg eventgraph guidance-outcome-neighborhood CLI (Epic 4)
  • feat(eventgraph-002): guidance-outcome federation handler + route (Epic 3)
  • feat(eventgraph-002): GuidanceOutcomesInNeighborhood federation method (Epic 2)
  • feat(eventgraph-002): V0023 constraint_code index on constraint_outcomes (Epic 1)
  • docs(eventgraph-002): sprint plan — guidance-outcome federation (Epic 0)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • fix(eventgraph-cli-001): tag UATS spec 'tsdb' so CI skips it without TSDB
  • docs(eventgraph-cli-001): live verification + feature doc + CHANGELOG + close (Epic 3)
  • test(eventgraph-cli-001): UATS contract spec for federation API (Epic 2)
  • fix(eventgraph): neighbor_node_ids serializes as [] not null for empty neighborhood
  • feat(eventgraph-cli-001): mdemg eventgraph reinforcement-neighborhood (Epic 1)
  • docs(eventgraph-cli-001): sprint plan — federation consumer CLI + UATS backfill
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • fix(jiminy): /guide 30s timeout sibling + single-source config defaults (GUIDANCE-SYNTH-001 fix-commit)
  • docs(followup-c): close JSON control-char escaping as NON-ISSUE (no fix)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • test+docs(guidance-synth-001): Tier 2/3 verification + docs + close (Epic 3)
  • feat(jiminy): config-drive warm-compute timeout (GUIDANCE-SYNTH-001 Epic 2)
  • feat(consulting): parallelize per-node constraint classifier (GUIDANCE-SYNTH-001 Epic 1)
  • docs(guidance-synth-001): sprint plan — fix guidance synthesis timeout (Follow-up B)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs(jiminy-outcome-001): CHANGELOG + CLAUDE.md + post.md (Epic 3)
  • test(jiminy-outcome-001): Tier 2 integration + Tier 3 live verification (Epic 2)
  • feat(jiminy): embedding-similarity constraint-code matching (JIMINY-OUTCOME-001 Epic 1)
  • docs(jiminy-outcome-001): sprint plan — revive Neo4j GUIDANCE_OUTCOME sink
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • test(rrf-scale-001): skip guidance integration tests on empty environment (CI fix)
  • docs(rrf-scale-001): CHANGELOG + CLAUDE.md score-scale contract + post.md (Epic 5)
  • test(rrf-scale-001): Tier 2 integration + Tier 3 live verification (Epic 4)
  • docs(rrf-scale-001): Epic 3 — remaining LOW findings reviewed + decided
  • fix(consulting): RRF-calibrate score gates + confidence sigmoid (RRF-SCALE-001 Epic 2)
  • docs(rrf-scale-001): Epic 1 audit findings — 12 sites cataloged
  • docs(rrf-scale-001): sprint plan — RRF score-scale consumer remediation
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • fix(eventgraph-001): Grafana panel uses TSDB instead of unconfigured Prometheus datasource
  • Merge branch 'main' into reh3376_dev01
  • docs(eventgraph-001): feature doc + CHANGELOG + CLAUDE.md + sprint close (Epic 8)
  • docs(eventgraph-001): Tier 3 live e2e verification transcript (Epic 7)
  • fix(retrieval): set Activation on RRF RetrieveResult (EVENTGRAPH-001 fix-commit)
  • fix(eventgraph-001): restore full GRAFANA-AUDIT-001 audit_results.json
  • feat(observability): Grafana panel + Prometheus counters for reinforcement events (EVENTGRAPH-001 Epic 6)
  • feat(eventgraph): federation query helper + API endpoint (EVENTGRAPH-001 Epic 5)
  • feat(learning): record reinforcement events to TSDB (EVENTGRAPH-001 Epic 4)
  • refactor(learning): expose per-pair telemetry from Hebbian Cypher (EVENTGRAPH-001 Epic 3)
  • feat(tsdb): buffered reinforcement_events writer (EVENTGRAPH-001 Epic 2)
  • feat(tsdb): V0022 reinforcement_events hypertable (EVENTGRAPH-001 Epic 1)
  • docs(eventgraph-001): sprint plan (Pattern Y1 TSDB-federation)
  • docs(model-dist-002): flip adapter section to shipped + sprint close
  • feat(cli): enable mdemg model pull --adapter (MODEL-DIST-002 Epic 5+6)
  • feat(model-dist-002): Epic 4 local — Modelfile.adapter + ollama create
  • feat(model-dist-002): Epic 1-3 — MLX adapter → PEFT → GGUF LoRA + live verify
  • feat(model-dist-002): Epic 0 — sprint plan + workspace prep
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs(grafana-audit): Epic 4 + 7 — feature doc + sprint close
  • fix(grafana): Epic 3 — 5 panels recovered (3 FAIL + 2 schema-drift)
  • feat(grafana-audit): Epic 1 + 2 — full audit + findings
  • feat(grafana-audit): Epic 0 — sprint plan + audit harness
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs(api): document 19 previously-undocumented endpoints (follow-up #2)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • feat(cli): add mdemg model run wrapper (follow-up #1 to MODEL-DIST-001)
  • chore(submodule + docs): bump homebrew-mdemg to v0.10.0 + cli-reference Model Distribution section
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs(release): promote Unreleased -> v0.10.0
  • merge: resolve quant_manifest.json conflicts (Epic 3 closeout vs squashed main)
  • docs(model-dist-001): sprint close — post.md
  • feat(model-dist-001): Epic 3 closeout — Ollama Library push complete
  • docs(model-dist-001): Epic 8 — Documentation Update (main repo)
  • docs(model-dist-001): Epic 7 — local-model-distribution feature doc
  • feat(model-dist-001): Epic 5 — V0021 model_install_events hypertable + writer
  • feat(model-dist-001): Epic 4 — mdemg model CLI + pluggable Fetcher interface
  • feat(model-dist-001): Epic 3 — 3 Modelfiles + local ollama create (push pending)
  • docs(model-dist-001): Epic 2 — defer adapter to MODEL-DIST-002
  • feat(model-dist-001): Epic 1 — built Q4_K_M + Q8_0 fused GGUFs
  • docs(sprint): MODEL-DIST-001 sprint plan + quant manifest skeleton
  • fix(service): replace decommissioned mlx-server LaunchAgent with llama-server
  • fix(api): /healthz returns build-time version, not stale literal "0.6.0"
  • chore(submodule): bump homebrew-mdemg to v0.9.0 formula + docs
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs(release): promote Unreleased -> v0.9.0

Auto-generated PR from reh3376_dev01 push

rhenley1958 and others added 30 commits June 10, 2026 20:13
…alth review (Epic 0)

EVENTGRAPH-004 federates the last unfederated Hebbian write — the
ApplyNegativeFeedback contradict action — into reinforcement_events
(trigger_path=apply_negative_feedback_contradict). Data-decided scope:
reuse the existing V0022 sink (zero CONTRADICTS edges exist anywhere;
no producer calls /v1/learning/negative-feedback — instrument before
the producer arrives, the inverse of the dormancy pattern).

Also closes the EVENTGRAPH-003 follow-up: 30h post-fix health review of
the revived CoactivateSession path — no tuning needed, textbook session
cliques, pre-fix orphans stay as historical record (operator decision).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…inforcement_events (Epic 1)

The contradict action (no co-activation edge → MERGE CONTRADICTS) was the
last unfederated Hebbian write. The CONTRADICTS MERGE lived inside a
FOREACH, where the edge variable is invisible to RETURN — so the original
single statement is split into two statements in the SAME ExecuteWrite
transaction: (a) weaken (EVENTGRAPH-003 telemetry, RETURN unchanged) and
(b) contradict with a per-pair RETURN. Classification is identical: weaken
never deletes edges, so contradict's NOT EXISTS sees the same edge set the
original OPTIONAL MATCH did.
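
The equivalence claim above can be checked on a toy model. This is a minimal Python sketch, not the project's Cypher: the in-memory edge store, the `weaken_step` value, and both function names are hypothetical. The only rule taken from the commit messages is that weaken lowers a weight (floored at 0) and never deletes the edge.

```python
# Toy model: does splitting one pass into weaken-then-contradict change
# which pairs end up classified as "contradict"? All names are hypothetical.

def classify_single_pass(edges, pairs):
    """Original shape: one pass decides weaken vs. contradict per pair."""
    return {pair: ("weaken" if pair in edges else "contradict") for pair in pairs}


def classify_two_statements(edges, pairs, weaken_step=0.1):
    """Split shape: statement (a) weakens, then statement (b) contradicts."""
    edges = dict(edges)  # work on a copy, standing in for one transaction's view
    result = {}
    # (a) weaken: lower the weight, floor at 0, never delete the edge
    for pair in pairs:
        if pair in edges:
            edges[pair] = max(0.0, edges[pair] - weaken_step)
            result[pair] = "weaken"
    # (b) contradict: only pairs that still have no co-activation edge
    for pair in pairs:
        if pair not in edges:
            result[pair] = "contradict"
    return result


edges = {("a", "b"): 0.05, ("b", "c"): 0.8}
pairs = [("a", "b"), ("b", "c"), ("c", "d")]
same = classify_single_pass(edges, pairs) == classify_two_statements(edges, pairs)
```

Because step (a) only rewrites weights, the key set that step (b) tests is unchanged, so a pair whose weight was floored to 0 is still treated as having an edge.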

Contradict rows land with trigger_path=apply_negative_feedback_contradict.
created_new_edge detected via `c.updated_at IS NULL` (ON MATCH always sets
it; ON CREATE never does — invariant pinned by comment). delta_weight is
the CONTRADICTS edge's OWN weight delta (+negWeight on create, 0 on
re-match); negative-feedback semantics are carried by trigger_path, not
the sign.
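
As an illustration of the row semantics just described, here is a small Python sketch of how a consumer might derive the telemetry fields from the values the contradict statement returns. The function name and dict layout are hypothetical; the `trigger_path` value, the `updated_at IS NULL` rule, and the create/re-match deltas come from the commit messages, and 0.15 is the create delta reported in the Epic 2 live verification.

```python
from datetime import datetime, timezone
from typing import Optional

TRIGGER_PATH = "apply_negative_feedback_contradict"  # value quoted in the commit message


def contradict_row(updated_at: Optional[datetime], neg_weight: float) -> dict:
    """Build one telemetry row (hypothetical layout) for a contradict pair.

    updated_at is the CONTRADICTS edge's updated_at as returned after MERGE:
    None means the edge was just created, since only ON MATCH sets it.
    """
    created = updated_at is None
    return {
        "trigger_path": TRIGGER_PATH,
        "created_new_edge": created,
        # the edge's own weight delta: +neg_weight on create, 0 on re-match
        "delta_weight": neg_weight if created else 0.0,
    }


first = contradict_row(None, 0.15)                        # create branch
again = contradict_row(datetime.now(timezone.utc), 0.15)  # re-match branch
```

Note that `delta_weight` is never negative here; a reader of these rows has to look at `trigger_path` to know they represent negative feedback.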

Both statements EXPLAIN-validated against live Neo4j. Tier 1: 2 new parser
tests (create/re-match branches); learning suite green; lint clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…-match + weaken unchanged (Epic 2)

Live against the restarted Epic-1 binary: contradict create row
(+0.15, created_new_edge=true), re-match row (delta=0, evidence=2),
weaken row byte-equivalent to pre-split behavior (negative delta,
floor at 0). Federation CLI surfaces the new trigger_path with no
read-side change. UATS learning_negative_feedback 5/5 PASS.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…c 3)

Feature doc: 5-path trigger_path table + delta-semantics consumer
warning (contradict delta is the CONTRADICTS edge's own weight delta —
semantics live in trigger_path, not the sign). UATS spec extended:
zero-count equals assertions on nonexistent nodes (hash refreshed,
5/5 live). CLAUDE.md architecture note + producer-gap disclosure.
Sprint close in post.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Squash-merge workflow leaves a stale merge-base: PR #418's squash
(b408bbc) rewrote the same CHANGELOG/CLAUDE.md/service.go regions this
branch then extended in EVENTGRAPH-004. Verified before resolving:
main == dev01@36377a2 + .github/workflows/codeql.yml exactly (git diff
b408bbc 36377a2 shows only codeql.yml), so taking this branch's side
of every content conflict is lossless; codeql.yml comes in from main.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Squash merges never advance the dev branch's merge-base, so every
sprint touching CHANGELOG.md/CLAUDE.md hit CONFLICTING on its next PR
(first bitten: PR #419). New sync-dev-after-merge.yml merges main back
into the source *_dev* branch after each merged PR; the GITHUB_TOKEN
push triggers no other workflows, so it can never spawn an empty
auto-PR (the PR #420 failure mode). Conflicts fail loudly for manual
resolution; workflow_dispatch enables manual runs/live testing.

auto-pr.yml additionally skips PR creation when branch content is
identical to main — guards MANUAL sync pushes, verified against the
live repo state (current dev01 ≡ main → empty=true → skip).

actionlint clean (untrusted refs passed via env, not inline).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…deep-dive

Full-codebase review vs MDEMG's purpose (cognitive substrate / connection
layer): 19 map agents (3 vision + 16 subsystem), 3 cross-cutting assessors,
synthesizer + adversarial completeness critic (19 revisions applied).

Verdict: server-side substrate is mature, but the system is not currently
functioning as the assistant's internal dialogue — the per-prompt delivery
channel silently no-ops (hook reads .user_prompt, Claude Code sends
.prompt), 100% of GENERALIZES edges have NULL weight (22,170/22,170,
live-verified), scheduled decay/prune has been a permanent dry-run, RSIC
validates 16/17 actions vacuously, and supervision covers 3 of ~14
background loops. Every defect is the same disease: wired-looking seams
with no caller, wrong contract, or no reader.

4 phases ≈ 75 days committed: (1) reconnect the loop ends, (2) close the
learning loops, (3) survivability + class-ending forcing functions,
(4) FT frontier + release hygiene. Top-10 ranked; deferrals explicit.
Orchestrator spot-verification annex included (5 claims re-verified live).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…per-prompt channel (Epic 0)

Roadmap Q3 Phase 1 rank #1. Audit of all 6 hooks vs the actual Claude
Code stdin schemas: prompt-context.sh reads .user_prompt (CC sends
.prompt) → channel exits silently on every prompt; post-tool-observe.py
reads tool_output (CC sends tool_response) → false "Build/test
succeeded" observations with empty output; guidance wrongly coupled to
RESULT_COUNT>0; a minor pre-compact transcript jq bug. session-start /
pre-bash-check / pre-write-check verified correct.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…rompt channel (Epic 1)

Claude Code's UserPromptSubmit stdin field is `prompt`; the hook read
`.user_prompt`, which is always empty → exit 0 → per-prompt CMS recall,
Jiminy guidance, /strict reformulation, the warm trigger, and the
retrieve-time Hebbian reinforcement have NEVER fired in any session.
Now reads `.prompt // .user_prompt` (legacy fallback kept).

Also decouples guidance from recall: the RESULT_COUNT=0 branch no longer
exits — it printed its notice then skipped guidance + warm + retrieval
reinforcement, coupling independent deliveries.

Both copies (live + installer template). Tier 1 simulated stdin: real
.prompt payload → first-ever guidance delivery (J17 T1 bootstrap + DICT,
5363 guidance bytes vs 0 forever); legacy fallback works; short/empty/
malformed payloads exit silently (fail-open preserved).
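
The field fallback and fail-open behavior above can be sketched in Python. This is an illustration only: the real hook is a shell script using jq, and `extract_prompt` is a hypothetical name. Python's `or` also treats an empty string as missing, which only approximates jq's `//` operator.

```python
import json


def extract_prompt(raw_stdin: str) -> str:
    """Return the prompt text, or "" so the caller can exit silently."""
    try:
        payload = json.loads(raw_stdin)
    except (json.JSONDecodeError, TypeError):
        return ""  # malformed stdin: fail open
    if not isinstance(payload, dict):
        return ""
    # Current field first, legacy field as the fallback.
    value = payload.get("prompt") or payload.get("user_prompt") or ""
    return value.strip() if isinstance(value, str) else ""
```

A caller would treat an empty return value as "nothing to do" and exit 0, so a contract mismatch never blocks the prompt.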

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…"succeeded" observations (Epic 2)

Claude Code's PostToolUse stdin field is `tool_response` (string or
object); the hook read `tool_output`, which is always absent → output_str
empty → error indicators never matched → every go build/go test/pytest
Bash call was recorded as "Build/test succeeded" sight-unseen, and real
errors were never observed.

Now reads tool_response (fallback tool_output) via _response_text(),
normalizing string|dict|list (stdout/stderr join). Success classification
requires NON-EMPTY clean output — a silent success records nothing rather
than fabricating; failures land as error observations with real stderr.
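
A minimal sketch of the normalization and the "no output, no record" rule described above. It is not the hook's actual `_response_text()`: the `stdout`/`stderr` keys follow the commit message, while `ERROR_MARKERS` and `classify` are hypothetical stand-ins for the hook's real error indicators.

```python
ERROR_MARKERS = ("error", "FAIL", "panic")  # hypothetical indicator list


def _flatten(value) -> str:
    """Normalize a string, dict, or list response into plain text."""
    if value is None:
        return ""
    if isinstance(value, str):
        return value
    if isinstance(value, dict):
        parts = [value.get("stdout"), value.get("stderr")]
        return "\n".join(p for p in parts if isinstance(p, str) and p)
    if isinstance(value, list):
        return "\n".join(t for t in (_flatten(v) for v in value) if t)
    return str(value)


def response_text(payload: dict) -> str:
    """Prefer tool_response, fall back to the legacy tool_output field."""
    raw = payload.get("tool_response")
    if raw is None:
        raw = payload.get("tool_output")
    return _flatten(raw)


def classify(payload: dict):
    """Return "error", "progress", or None when there is nothing to record."""
    text = response_text(payload).strip()
    if not text:
        return None  # silent success: record nothing rather than fabricate
    if any(marker in text for marker in ERROR_MARKERS):
        return "error"
    return "progress"
```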

Both copies (template regenerated from fixed live, {{SPACE_ID}}
placeholder preserved, verified identical modulo placeholder). Tier 1
against real CMS: failing build → error obs with stderr; passing →
progress; empty → no record.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ine shape (Epic 3)

Transcript lines are {type, message:{content:[{type, text|name, ...}]}};
the old top-level `.content` read always yielded empty, so pre-compaction
snapshots never carried recent-activity context. New jq walks
.message.content[] extracting .text/.name. Verified against this
session's real transcript (old: nothing; new: real activity). Both
copies, placeholders preserved.
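
For readers who do not use jq, the same walk over `.message.content[]` can be sketched in Python. The function name `recent_activity` is hypothetical; the line shape is the one given above.

```python
import json


def recent_activity(transcript_lines):
    """Collect .text / .name values from each line's message.content list."""
    items = []
    for line in transcript_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip unparseable lines
        message = entry.get("message") if isinstance(entry, dict) else None
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # a top-level .content alone yields nothing
        for block in content:
            if not isinstance(block, dict):
                continue
            label = block.get("text") or block.get("name")
            if isinstance(label, str) and label:
                items.append(label)
    return items
```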

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…act pin + close (Epics 4-5)

Live in the real session: first-ever guidance delivery (J17 T1 bootstrap
+ DICT, 5363 bytes vs 0 forever); real failing build → error observation
with actual compiler output in CMS. PostToolUse success-only firing
documented as a limitation. Hook stdin contract pinned in CLAUDE.md.
Drift + clique-semantics findings logged for HOOKSYNC-001 / Phase 2.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…channel (Epic 0)

Roadmap Q3 Phase 1 rank #2. Investigation grounded all five findings:
template→live drift severed alert delivery (50-entry file actively
rotating today, never shown); no Cleared lifecycle (nothing sets the
field; no /v1/alert* endpoints); no absence detection for the channel
that just had a months-long silent outage; compose publishes 9999 on
0.0.0.0; neural sidecar binds 0.0.0.0:8101 via a 39-day-old process
serving pre-J17-fix code. 8 epics: reconcile, CI parity gate, clear
lifecycle, hook_events absence rule (reuses V0024 via jobhealth),
hooks doctor, PORT-TRUTH rider, Tier 3, docs.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…y restored to live (Epic 1)

Live hooks adopted from templates (SPACE_ID substituted): restores the
alert-display blocks (all-pending per prompt; critical/high + degraded
healthz at session start) that the live copies lacked — the NOSILENT
last mile. Reverse drift caught during reconcile: the live hook's T1/T2
bootstrap-detection block (MAX_TIER → /v1/jiminy/bootstrap → ACTIVE
CONSTRAINTS header) never existed in the template and was nearly lost —
now single-sourced in the template and regenerated into live.

Live-verified: one prompt now renders alerts (50 pending incl. live
CRITICALs) + recall + J17:INIT bootstrap + guidance + synergy footer,
coexisting. All 6 hooks byte-identical to templates modulo {{SPACE_ID}}.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…emplates (Epic 2)

Mirrors the compose/launchd parity pattern: every *.sh/*.py template
must byte-match .claude/hooks/ modulo the {{SPACE_ID}} placeholder.
Proven locally: passes clean, fails (with a bounded diff dump) on
deliberate drift. Ends the bidirectional-drift class that severed
alert delivery and nearly lost the T1 bootstrap block.
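
A sketch of "byte-match modulo the placeholder", assuming the checker knows the space id that was substituted into the live copies. The names `matches_template` and `drifted` are hypothetical, and the actual CI step may compare differently.

```python
PLACEHOLDER = b"{{SPACE_ID}}"


def matches_template(template: bytes, live: bytes, space_id: str) -> bool:
    """True when the live hook equals the template after substitution."""
    return template.replace(PLACEHOLDER, space_id.encode()) == live


def drifted(templates: dict, live: dict, space_id: str) -> list:
    """Names of templates whose live copy is missing or differs."""
    bad = []
    for name in sorted(templates):
        if name not in live or not matches_template(
            templates[name], live[name], space_id
        ):
            bad.append(name)
    return bad
```

Treating a missing live copy as drift is what makes the gate bidirectional: a template with no live counterpart fails just like a modified one.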

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…vered (Epic 3)

Alert.Cleared existed but nothing ever set it: once hooks rendered the
file, the same entries would re-render every prompt forever. New:
FileBackend.Clear (ids and/or all_before cutoff, idempotent, under the
existing lock) → Dispatcher.ClearAlerts → POST /v1/alerts/clear. Hooks
now clear exactly what they displayed (fire-and-forget, fail-open);
cleared = delivered-to-operator, not resolved — persisting conditions
re-fire via the evaluator. Alert IDs now CUIDv2 per the identifier
standard (was UnixNano; old ids remain valid opaque strings).

Live-verified lifecycle: prompt 1 → "50 pending, showing 10" + 10
cleared; prompt 2 → "40 pending, showing 10" (next batch, no re-render)
→ 20 cleared. Tier 1: Clear by-id/by-time/idempotent/no-backend. UATS
alerts_clear 3/3 live (runner falsy-body inheritance discovered:
variant bodies must be non-empty objects).
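
The clear semantics (by id and/or by cutoff, idempotent, under a lock) can be sketched as follows. This is an in-memory illustration in Python rather than the Go `FileBackend`; `AlertStore` and its fields are hypothetical names.

```python
import threading
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Alert:
    id: str
    created_at: datetime
    cleared: bool = False


class AlertStore:
    def __init__(self, alerts):
        self._alerts = list(alerts)
        self._lock = threading.Lock()

    def clear(self, ids=(), all_before=None) -> int:
        """Mark matching alerts cleared; return how many changed state."""
        wanted = set(ids)
        changed = 0
        with self._lock:
            for alert in self._alerts:
                hit = alert.id in wanted or (
                    all_before is not None and alert.created_at < all_before
                )
                if hit and not alert.cleared:
                    alert.cleared = True
                    changed += 1
        return changed

    def pending(self):
        with self._lock:
            return [a for a in self._alerts if not a.cleared]
```

Because a repeated clear changes nothing and returns 0, a hook can fire the call without checking whether an earlier attempt succeeded.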

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…self-reports outages (Epic 4)

POST /v1/hooks/event records heartbeats into V0024 scheduled_job_events
via the jobhealth policy point (job_name hook:<name>; no new sink).
Two independent heartbeats: prompt-context fires per delivery (the
monitored channel); post-tool-observe fires throttled (HOOK_HEARTBEAT_
COOLDOWN_SEC, default 300 — proves sessions ACTIVE). Evaluator rule
hook_channel_silent (distinct service per the NOSILENT cooldown rule):
sessions active + zero prompt-context fires in HOOK_SILENT_LOOKBACK_
HOURS (24) → high alert. This is the "job never ran" guarantee applied
to the channel whose months-long outage HOOKWIRE-001 found only by
manual audit — the next contract drift self-reports.

Config: HOOK_HEALTH_ALERT_ENABLED (true), HOOK_SILENT_LOOKBACK_HOURS
(24), HOOK_ACTIVITY_MIN_EVENTS (5). Live-verified: real hook fires land
rows (session metadata, latency); throttle holds; rule SQL positive +
negative branches proven against the real table; UATS hooks_event 3/3.
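
A sketch of the absence rule's logic, written in Python over an in-memory event list rather than the real SQL. It assumes "sessions active" means at least HOOK_ACTIVITY_MIN_EVENTS `hook:post-tool-observe` heartbeats inside the lookback window; that reading of the config is an assumption, as is the function signature.

```python
from datetime import datetime, timedelta, timezone


def hook_channel_silent(events, now, lookback_hours=24, min_activity_events=5):
    """events: iterable of (job_name, fired_at) heartbeat rows."""
    cutoff = now - timedelta(hours=lookback_hours)
    recent = [name for name, fired_at in events if fired_at >= cutoff]
    activity = sum(1 for name in recent if name == "hook:post-tool-observe")
    deliveries = sum(1 for name in recent if name == "hook:prompt-context")
    # Alert only when sessions are demonstrably active yet nothing was delivered.
    return activity >= min_activity_events and deliveries == 0
```

The second heartbeat is what separates "channel broken" from "nobody is using the machine"; without it an idle weekend would raise the alert.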

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… (Epic 5)

11 checks: per-hook template parity (the CI gate's local twin),
settings registration, server healthz, a stdin-contract self-test
piping a real-shape UserPromptSubmit payload through the installed
hook (asserts the always-present synergy footer), alert-file state
(pending/total), and the last hook:prompt-context heartbeat age from
scheduled_job_events (SKIP when TSDB unreachable). Table or --json;
non-zero exit on any FAIL.

Live: 11/11 PASS on this machine ("last fire 5s ago" — fed by the
doctor's own self-test); correctly fails (exit 1) on deliberate drift.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ie replaced (Epic 6)

Compose published the API on 0.0.0.0 (unauthenticated admin/destructive
routes exposed off-host): now "${MDEMG_BIND_ADDR:-127.0.0.1}:${MDEMG_PORT}
:9999" — wide bind is an explicit opt-in (both compose copies, CI-synced).
Neural sidecar bound 0.0.0.0:8101 via config.py default AND the plist
arg: both now 127.0.0.1 (both plist copies, CI-synced; SIDECAR_HOST env
overrides). Operational: the 39-day-old sidecar process (started
2026-05-02, serving pre-J17-fix code) replaced — fresh process verified
on 127.0.0.1:8101, both models loaded, health 200.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…lose (Epics 7-8)

Live-verified across the sprint: alert backlog drained 50→2 on real
prompts (display-then-clear); evaluator rules 15→16 (hook_channel_silent
loaded); doctor 11/11 + correct failure mode; sidecar fresh on
127.0.0.1:8101 (NLI 234ms). Feature doc docs/features/hook-channel-
health.md (config table incl. MDEMG_BIND_ADDR + SIDECAR_HOST). Findings:
packaging plists are templates (raw copy → launchd exit 78; service
install is canonical); UATS falsy-variant-body inheritance pinned.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…sis latency

Caught in the HOOKSYNC-001 full-suite regression: the synchronous
/v1/jiminy/guide includes local-model synthesis (~43s observed quiet,
~50s typical per GUIDANCE-SYNTH-001) — the spec's 30s timeout has been
silently erroring since synthesis latency grew. Aligned with the
JIMINY_WARM_COMPUTE_TIMEOUT_MS budget (90s); hash refreshed; passes
live. Pre-existing — not a HOOKSYNC regression (Guide path untouched).
The other 3 suite errors were load-induced flakes (pass individually):
suite-vs-llama-server slot contention, noted for UXTS-CI-001.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…passes

Root cause: the new 'Verify live hooks match hook templates' CI step
(HOOKSYNC-001) diffs every internal/cli/hook_templates/*.{sh,py} against
.claude/hooks/<name>, but the .gitignore allowlist only un-ignored the
5 original hooks. pre-write-check.py gained a template in this sprint
while its live counterpart stayed gitignored, so CI checked out a tree
without it and failed with 'MISSING live hook:
.claude/hooks/pre-write-check.py'.

Fix: add '!.claude/hooks/pre-write-check.py' to the allowlist and commit
the live hook (already byte-identical to its template modulo SPACE_ID),
preserving the full parity guarantee instead of weakening the CI step.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…n hierarchy (Epic 0)

Roadmap Q3 Phase 1 rank #3. Live investigation: point.distance() returns
NULL on embedding lists (proven: NULL where vector.similarity.cosine
returns 0.627 on the same pair); 3 creation sites affected incl. an
ABSTRACTS_TO site the audit missed. Scale worse than audited and
growing: 28,332/28,332 GENERALIZES + 36,110/37,996 ABSTRACTS_TO = 64,442
NULL-weight abstraction edges. Neo4j cosine returns [0,1] directly —
drop-in. Plan: fix sites (+ CUIDv2 edge ids), LIMIT-5-then-batched
backfill, null-weight gauge + alert rule via the existing graph-stats →
metric_samples path, UVTS-quick regression guard.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…cosine replaces point.distance (Epic 1)

point.distance() is a spatial-Point function: on embedding lists it
returns NULL, so every weight at the 3 abstraction-edge creation sites
was never set (100% of GENERALIZES + 95% of ABSTRACTS_TO weightless;
the CASE guards passed on good embeddings, then the THEN expr evaluated
NULL — edges with good embeddings got nothing while embedding-less ones
got the 0.5 fallback). vector.similarity.cosine returns [0,1] directly
(live-verified: identical=1.0, orthogonal=0.5, opposite=0.0). Site 1
(theme GENERALIZES) gains the null-guard it never had.
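
The three live-verified points above (identical 1.0, orthogonal 0.5, opposite 0.0) are consistent with a raw cosine rescaled as (1 + cos) / 2. The Python sketch below assumes that mapping and adds a null-guard with the 0.5 fallback; `normalized_cosine` and `edge_weight` are hypothetical names, not the Cypher used at the creation sites.

```python
import math


def normalized_cosine(a, b):
    """Cosine similarity rescaled from [-1, 1] to [0, 1]; None if undefined."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    if norm == 0.0:
        return None
    return (1.0 + dot / norm) / 2.0


def edge_weight(a, b, fallback=0.5):
    """Weight for an abstraction edge, guarded against missing embeddings."""
    if not a or not b:
        return fallback
    score = normalized_cosine(a, b)
    return fallback if score is None else score
```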

Also: edge_id randomUUID() → CUIDv2 per the identifier standard, minted
Go-side via memberEdgePairs (Cypher can't generate CUIDv2) and zipped
with member ids for UNWIND. All 3 statements EXPLAIN-validated live.
Tier 1: pair-builder tests (uniqueness, CUID format, empty input).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
rhenley1958 and others added 25 commits June 13, 2026 13:47
2nd training-integrity remediation sprint. The distill gate
(x9_distill_capture_v2.py:360: kept = mean(reward_vector) >= 0.8,
global) drops spec-correct-but-terse answers because coverage_score/
explanation_quality/coherence_score reward length over correctness —
gutting ape.reflect (largest target) + summarize + synthesize, then
balanced_sampler amplifies the verbose-skew. Principle: inclusion
selects for CORRECTNESS not length. Fix: length-neutral correctness
rewards + per-task inclusion thresholds + a forcing-function test (the
known-correct golden rows of each of the 12 covered tasks must clear their
gate) + distribution check. Closes with the eval-integrity-deferred GGUF
serving + honest baseline recompute.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…c 1)

The distill inclusion gate (mean(reward_vector) >= 0.8) selected for LENGTH,
not correctness — the corpus-skew mechanism behind 3 discarded retrains. Four
reward functions used length/count ladders that dropped spec-correct-but-terse
answers below the gate and rewarded verbosity upward:

- coverage_score: <20 words→0.4 / <50→0.7 / then rising → now substantive
  content scores 0.9 flat (length-neutral); empty→0.0, pure-repetition→0.3.
- explanation_quality: <20 words→0.6 cliff → now substantive→0.9.
- coherence_score: required >=2 sentences + 10 words → now any coherent
  non-repetitive response→0.9; pure repetition→0.4.
- insight_count: rewarded bullet COUNT (>=5→1.0) → now >=1 genuine insight→0.9
  (no upward count reward; stops bullet-spam, stops dropping single-insight
  reflections to 0.5 — ape.reflect, the largest target).

Verified: terse-correct now clears the 0.8 gate; verbosity/bullet-count no
longer rewarded above concise; varied detailed content unaffected (0.9);
empty/repetition still rejected. Tests rewritten to pin the new semantics
(78 pass). Subtler keyword-bag functions (specificity/actionability) left for
the continuation — they reward content signals, not raw length.
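
To make "length-neutral" concrete, here is a sketch of the new coverage_score shape using the three values listed above (empty 0.0, pure repetition 0.3, substantive 0.9). The repetition test used here, a single distinct word repeated, is a hypothetical stand-in for whatever the real function checks.

```python
def coverage_score(text: str) -> float:
    """Score content presence without rewarding length."""
    words = text.split()
    if not words:
        return 0.0  # empty response
    distinct = {w.lower() for w in words}
    if len(words) > 1 and len(distinct) == 1:
        return 0.3  # pure repetition (assumed heuristic)
    return 0.9  # any substantive content scores the same, terse or verbose
```

Under the old ladder a correct 15-word answer scored 0.4 and could not clear a 0.8 mean gate; here it scores the same as a 500-word one.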

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ndings (Epic 2-3)

Epic 2: --reward-threshold-map JSON ({"task": float}) overrides the global
--reward-threshold per task in x9_distill_capture_v2.py, so tasks whose reward
arrays have a different natural ceiling can gate at the right bar. Records the
per-task gate in each row + the manifest. Live-verified end-to-end: real
OpenAI + TSDB run with {"consulting.classify": 0.6} applied the override
(3/3 captured, manifest per_task reward_threshold=0.6).
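
The per-task override reduces to a dictionary lookup with the global threshold as the default. A minimal sketch, assuming the map has already been parsed from the `--reward-threshold-map` JSON; `keep_row` is a hypothetical name and the real script may structure this differently.

```python
import json
from statistics import mean


def keep_row(task, reward_vector, global_threshold=0.8, threshold_map=None):
    """Return (kept, threshold_used) for one captured row."""
    threshold = (threshold_map or {}).get(task, global_threshold)
    kept = bool(reward_vector) and mean(reward_vector) >= threshold
    return kept, threshold


# Example map in the shape described above.
overrides = json.loads('{"consulting.classify": 0.6}')
```

Returning the threshold alongside the decision is what lets each row and the manifest record which gate was actually applied.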

Epic 3 (live Tier 3, docs/development/reward-correctness-001/live_findings.md):
scored REAL production llm_interactions at the 0.8 gate, old vs new rewards.
Validated Epic 1: hidden.summarize recovered 69/72 real concise summaries the
old length ladder dropped. Surfaced three larger correctness issues the
length fix does NOT close (the real dominant suppressors for the big tasks):
  1. ape.reflect (54k, largest target): json_valid mean 0.133 — ~87% of recent
     responses TRUNCATED mid-JSON (prompt ~5800 + ~3000 output > 8192 per-slot
     KV bound). Production serving/capture defect; gate correctly rejects.
     Recommended own-sprint follow-up (raise output budget, re-capture).
  2. jiminy.evaluate: explanation_quality=0.0 on correct {violations,warnings}
     responses — wrong reward for the schema (no top-level explanation key).
     Reward-array fix, operator-gated (changes a ULTS array + re-grades).
  3. jiminy.synthesize: keyword-bag follow_rate/specificity just below gate —
     the deferred Epic 1 continuation.

Also fixed 2 pre-existing lint nits in the touched file (F541, E741).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Closes REWARD-CORRECTNESS-001 at Epic 1+2+3. Epic 4 (baseline recompute)
explicitly deferred behind the ape.reflect truncation fix per operator
sequencing — recomputing over a known-truncated corpus would bake in the
corruption. Next sprint: ape.reflect truncation.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Root-caused the ape.reflect ~87% truncation (largest training target) to a
structurally-unbounded prompt: live-measured 7489 tokens (Current Assessment
~3895 + 5-cycle history ~2693), leaving only ~700 of the 8192 per-slot KV
budget for output — 191/200 invalid responses cluster at 490-520 tokens_out,
truncating mid-JSON at the ceiling. Compression already on; not a max_tokens
cap. Plan: bound the prompt to a configurable token budget (gate verbose TSDB
dataset fields, cap history cycles, final drop-oldest guard) so output always
has ~4000-token headroom, with an optional serving-slot increase as the safety
margin. Lever A (structural prompt budget) + Lever B (KV slot) proposed, picked
at execution. Tier 3 proof: fresh ape.reflect json_valid recovers 0.13 → ~1.0.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ut (Epic 1-2)

Implements the structural fix for the ape.reflect ~87% mid-JSON cutoff. The
prompt was unbounded (live 7489 tok), leaving only ~700 of the 8192 per-slot
KV budget for output, so the largest training target's responses were cut off
mid-array.

buildUserPrompt now enforces a token budget:
- gate the verbose TSDB dataset fields (LLMPerformance x17 / Retrieval /
  Embedding / TrainingReadiness, ~3895 of 7489 prompt tokens) behind
  RSIC_LLM_REFLECT_INCLUDE_DATASETS (default false); scalar health metrics
  the detectors use are always kept;
- cap history cycles via RSIC_LLM_REFLECT_HISTORY_CYCLES (default 3, was
  hardcoded 5);
- final budget guard (RSIC_LLM_REFLECT_PROMPT_BUDGET_TOKENS default 3500, 0
  disables): drops history oldest-first, then trims the assessment tail,
  logging loudly what was dropped (never silent). estimateTokens calibrated
  to the measured 2.3 chars/tok ratio, slightly conservative.
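
The three steps above (cap cycles, drop history oldest-first, trim the assessment tail) can be sketched in Python. This is not the Go `buildUserPrompt`; `fit_to_budget` is a hypothetical name, history is assumed to be ordered oldest-first, and the estimator uses integer arithmetic for the 2.3 chars/token ratio so rounding always errs high.

```python
def estimate_tokens(text: str) -> int:
    """Ceiling of len(text) / 2.3, computed with integers."""
    return -(-len(text) * 10 // 23)


def fit_to_budget(assessment, history, budget_tokens=3500, max_cycles=3):
    """Return (assessment, history, dropped) fitted to the token budget."""
    history = list(history[-max_cycles:])  # keep only the newest cycles
    dropped = []
    if budget_tokens <= 0:
        return assessment, history, dropped  # 0 disables the guard

    def total():
        return estimate_tokens(assessment) + sum(
            estimate_tokens(h) for h in history
        )

    while history and total() > budget_tokens:
        history.pop(0)  # oldest cycle goes first
        dropped.append("history cycle")
    if total() > budget_tokens:
        keep_chars = budget_tokens * 23 // 10
        dropped.append(f"assessment tail ({len(assessment) - keep_chars} chars)")
        assessment = assessment[:keep_chars]
    return assessment, history, dropped
```

The `dropped` list is there so the caller can log what was removed, matching the "never silent" requirement.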

3 config fields (range-validated, no hardcoding) wired config ->
LLMReflectorConfig -> server.go. 6 Tier-1 tests: dataset gating, history cap,
drop-history-under-budget, trim-assessment-under-budget, under-budget-unchanged,
estimator. Full ape suite + lint + config scanner (687/687) clean.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…md (Epic 5)

Live Tier 3 result documented: 3/3 fresh post-restart ape.reflect rows valid
JSON (100%, up from ~13%), tokens_in ~2575 (from ~7489). Corrected the stale
CLAUDE.md ape.reflect prompt-size figure (~5800 -> ~7489 live-measured) and
added the per-slot KV "prompt+output share the budget" guidance. Closes
APE-PROMPT-BUDGET-001.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… import shift

The Epic 1-2 import (log/slog) + struct fields shifted llmReflectSystemPrompt
from line 74 to 80. The ULTS hash verifier reads from line-2 and grabs the
first backtick string; at the stale :74 the search region now included the
`"` backtick in `quoted[i] = `"` + a + `"`` → wrong hash. The system prompt
TEXT is unchanged (hash 39b2bc… still matches at :80). Updated
system_prompt_source :74 → :80. Local glob verify: ape.reflect 11/11 PASS.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ta (Epic 0-1)

Operator-chosen audit-first prune phase. Read-only enumeration of every
non-conforming TSDB/file target with exact counts + a backup/small-batch/verify
prune plan (each category operator-gated).

Findings — PRUNE TARGETS: (A) 2,111 invalid-JSON rows in object/array tasks
(ape.reflect 1890 in the 06-11..06-13 truncation window, rerank_cross 184,
evaluate_llm 18, query_classify 18, classify 1); (B) 21,135 error/empty rows
(mdemg data clean target); (C) rerank mislabeled archive 6,894 events/21M +
valid_golden 108 leaked + ~14 stale April baselines. ~23,246 TSDB rows total
(~22.7%).

NOT prune targets (schema/reward mismatch, data is fine, fix the definition):
hidden.summarize 72 (prose vs object schema), string-schema tasks the jsonb
check false-flags (intent_translate/codegen/synthesize emit valid bare
strings), jiminy.evaluate. Audit pitfall recorded: never run a jsonb-validity
prune predicate against string-schema tasks.

Corrects the "87% of 54k" assumption: ape.reflect corruption is 1,890 rows in
the recent truncation window (forward-fixed by APE-PROMPT-BUDGET-001), not
corpus-wide.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…LOG (Epic 2 close)

Pruned 1,898 genuinely-corrupt invalid-JSON rows from llm_interactions
(ape.reflect 1,879 / jiminy.evaluate_llm 18 / consulting.classify 1),
backup-first to .mdemg-backup-20260613_195431/dataprune/ (reversible).
102,415 -> 100,517, remaining_corrupt=0, live healthy, recent ape.reflect
14/14 valid.

Small-batch verify caught that the raw pg_input_is_valid predicate over-counted
by 213 (valid JSON behind markdown fences / think-tags that production
SanitizeResponse strips); validated all candidates through a faithful replica
of llmclient.SanitizeResponse and spared the recoverable 213. Categories B
(error rows) + C (file artifacts) deferred. The backup dir is untracked
(not committed).
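
The over-count came from validating raw text instead of sanitized text. A rough Python sketch of "sanitize, then validate" follows. It is not a replica of `llmclient.SanitizeResponse`, whose exact rules are not shown here; the `<think>` tag name and the fence pattern are assumptions.

```python
import json
import re

TICKS = "`" * 3  # a Markdown code-fence marker
_THINK = re.compile(r"<think>.*?</think>", re.DOTALL)
_FENCE = re.compile(
    r"^" + TICKS + r"[\w-]*[ \t]*\n(.*?)\n?" + TICKS + r"\s*$", re.DOTALL
)


def sanitize(raw: str) -> str:
    """Strip think-tags and one enclosing Markdown fence."""
    text = _THINK.sub("", raw).strip()
    match = _FENCE.match(text)
    return match.group(1).strip() if match else text


def is_genuinely_corrupt(raw: str) -> bool:
    """True only when the sanitized text still fails to parse as JSON."""
    try:
        json.loads(sanitize(raw))
    except json.JSONDecodeError:
        return True
    return False
```

Only rows for which this returns True would belong in a delete set; fenced or tagged rows that parse after sanitizing are recoverable.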

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
B: 21,254 error/silent-failure rows removed via mdemg data clean (4 spaces),
backed up first. C: rerank prefix-archive (6,894 events/21M, no refs) moved to
backup; valid_golden + ~14 baselines RETAINED (load-bearing — leak source +
regression harness; retire during baseline recompute). Final: llm_interactions
79,461 rows, 0 non-conforming. Verification catch documented: data clean
dry-run per-task table is surviving-rows, not the delete set.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ace data

Two bugs surfaced during the space-hygiene cleanup (removing 24 junk/test/demo
spaces for live testing):

1. `mdemg space delete` gated its pre-check on `count(MemoryNode {space_id})`
   but the delete itself is label-agnostic (`MATCH (n {space_id})`). A space
   holding only SymbolNodes/Observations (e.g. e2e-test = 10,918 SymbolNodes)
   reported "no nodes. Nothing to delete." and silently survived. Pre-check now
   counts all labels, matching the delete.

2. `ListSpaces` (`mdemg space list`) panicked — `interface conversion:
   interface {} is nil, not string` at the `sid.(string)` assertion — when any
   MemoryNode had a null space_id (orphaned/infra artifacts). The query now
   excludes null/empty space_id (such nodes are not a "space"), and the
   assertion is nil-guarded defensively.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
24 junk/test/demo spaces removed (~143k nodes, backed up); blank-space resolved
(global infra kept null, 155 test MemoryNodes staged for delete); 2 space-tool
bugs fixed (delete pre-check, list panic). Record in space_cleanup.md.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…xes (Epic 0)

The REWARD-CORRECTNESS-001 follow-ups: (1) hidden.summarize schema object->string
(production emits prose; 72 rows mis-flagged invalid-JSON); (2) explanation_quality
schema-aware for nested violations[].reasoning (fixes jiminy.evaluate +
evaluate_llm scoring correct responses 0.0); (3) keyword-bag specificity/
actionability substantive-floored (jiminy.synthesize valid guidance dropped for
lacking magic words). Makes the 4 tasks' grading correct before the baseline
recompute.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
… + live validation

Three reward/schema mismatches that scored CORRECT responses wrong:

1. hidden.summarize ULTS schema object->string. Production emits bare prose
   (cluster_summarizer.go), so the object schema mis-flagged 72 valid summaries
   as invalid-JSON. (Reward already fixed in RC-001; this corrects the spec.)

2. explanation_quality made schema-aware: jiminy.evaluate / evaluate_llm nest
   reasoning in violations[].reasoning, not a top-level field, so the flat
   lookup scored every correct response 0.0. Now credits nested reasoning and
   treats a valid no-violation verdict as a correct "no issues" answer (nothing
   to explain). Falls back to the flat path.

3. specificity_score / actionability_score substantive-floored (0.7 floor,
   keyword presence a bounded bonus, hedging/empty/repetition low) — the
   keyword-bag dropped valid concise guidance below the gate for lacking ~6
   magic words. follow_rate inherits it.
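
A sketch of the schema-aware lookup from item 2, in Python. The `violations[].reasoning` path and the flat fallback come from the description above; the key name `explanation`, the 0.9/0.0 score values, and the all-or-nothing rule for partially explained violations are assumptions for illustration.

```python
def explanation_quality(response: dict) -> float:
    """Credit reasoning wherever the task's schema actually puts it."""
    violations = response.get("violations")
    if isinstance(violations, list):
        if not violations:
            return 0.9  # valid "no issues" verdict: nothing to explain
        reasons = [v.get("reasoning") for v in violations if isinstance(v, dict)]
        explained = [r for r in reasons if isinstance(r, str) and r.strip()]
        return 0.9 if len(explained) == len(violations) else 0.0
    # Flat path for tasks that carry a top-level explanation.
    flat = response.get("explanation")
    return 0.9 if isinstance(flat, str) and flat.strip() else 0.0
```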

Live Tier 3 (real production rows, old->new kept@0.8): jiminy.evaluate 0/60->60/60
(mean 0.667->0.967), jiminy.synthesize 3/60->59/60 (0.725->0.879), ape.reflect
47/60->60/60 (0.848->0.956); evaluate_llm unchanged 60/60. New means 0.88-0.97 =
correct production output scoring correctly, no over-inflation. 87 unit tests +
609 neural tests + ruff green.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…ort 8101->8102 (Epic 1)

The capstone of the training-integrity arc: recompute the frozen 0.8338 baseline
through the fixed harness (valid_clean + RC-001/002 rewards + GGUF :8102). Epic 1
fixes the stale rl_phase11.yaml mlx_port (8101 mlx_lm.server decommissioned →
8102 llama-server GGUF), flagged by EVAL-INTEGRITY-001.

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
…d harness (Epic 2-4)

Recomputed the adapter-promotion baseline through the fixed harness (valid_clean
leak-free eval + RC-001/002 corrected rewards + GGUF llama-server :8102) = 0.8655,
replacing the stale frozen 0.8338 (valid_golden-leaked, old length-biased rewards,
decommissioned MLX serving — not comparable). evaluate_gate_5a now derives the
target from the loaded baseline REPORT (single source of truth); the constant is
retained only as a >5pp drift tripwire. status ok, 12 tasks, 50 samples/task,
0 zero-call. ape.reflect 0.696 is an eval-harness artifact (stored ~7.5k-token
prompts bypass the runtime prompt budget and get cut off mid-JSON), not a model
regression.
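
The "report is the source of truth, constant is only a tripwire" arrangement can be sketched as follows. This is not the real `evaluate_gate_5a`; the report key `mean_reward`, the function name, and the choice to raise on drift are assumptions.

```python
DRIFT_TRIPWIRE_PP = 5.0  # percentage points, per the description above


def gate_target(report: dict, frozen_constant: float) -> float:
    """Take the target from the loaded report; use the constant only to
    detect implausible drift."""
    target = float(report["mean_reward"])  # hypothetical report key
    drift_pp = abs(target - frozen_constant) * 100.0
    if drift_pp > DRIFT_TRIPWIRE_PP:
        raise ValueError(
            f"baseline report drifted {drift_pp:.1f}pp from the frozen constant"
        )
    return target
```

With this shape a recompute updates the gate by replacing the report alone, and a corrupted or mismatched report is caught instead of silently becoming the new bar.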

Closes the training-integrity arc: trustworthy gate (EVAL-INTEGRITY-001),
correct rewards (REWARD-CORRECTNESS-001/002), sound corpus (APE-PROMPT-BUDGET-001
+ DATAPRUNE), honest baseline (this sprint).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
@github-actions github-actions Bot requested a review from reh3376 as a code owner June 15, 2026 16:05

reh3376 commented Jun 15, 2026

Owner

BASELINE-RECOMPUTE-001 — honest adapter-promotion baseline (training-integrity capstone)

The promotion gate's baseline was a stale frozen constant 0.8338 (99%-leaked valid_golden eval, old length-biased rewards, decommissioned mlx_lm.server). Recomputed through the fixed harness — leak-free valid_clean + RC-001/002 corrected rewards + GGUF llama-server :8102 — the honest baseline is 0.8655.

Not comparable to 0.8338 (different eval + rewards + serving); future retrains compare against 0.8655. evaluate_gate_5a now derives the target from the loaded baseline report (single source of truth); the constant is retained only as a >5pp drift tripwire. Also fixed the stale rl_phase11.yaml mlx_port 8101→8102.

Live recompute (Tier 3): status: ok, GGUF :8102, 12 tasks, 50 samples/task, 0 zero-call. Per-task: jiminy.codegen 1.00 · jiminy.evaluate 0.967 · hidden.name_emergence 0.95 · consulting.classify 0.91 · rerank_cross 0.90 · hidden.summarize 0.90 · jiminy.synthesize 0.88 · intent_translate 0.874 · query_classify 0.82 · evaluate_llm 0.80 · ape.reflect 0.696 (dragged by json_valid=0.24 — eval-harness artifact: benchmark replays stored ~7.5k-token prompts bypassing the runtime APE-PROMPT-BUDGET-001 bound; fresh ape.reflect is 100% valid live — not a model regression).

Closes the training-integrity arc: trustworthy gate → correct rewards → sound corpus → honest baseline.

Disclosed follow-up: run_benchmark.py is single-threaded against a 4-slot llama-server; client-side concurrency would cut wall-time ~4×.
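
A minimal sketch of what client-side concurrency could look like, assuming each sample is an independent request and the server has 4 slots. `run_samples` and `call_model` are hypothetical names; this is not code from run_benchmark.py.

```python
from concurrent.futures import ThreadPoolExecutor


def run_samples(samples, call_model, slots=4):
    """Issue up to `slots` requests at once; results keep input order."""
    with ThreadPoolExecutor(max_workers=slots) as pool:
        return list(pool.map(call_model, samples))
```

Matching `max_workers` to the server's slot count avoids queueing more requests than the server can process in parallel.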

(Note: an identical summary was accidentally posted to the now-merged #465 first — this #466 is the correct PR.)

@reh3376 reh3376 self-assigned this Jun 15, 2026
@reh3376 reh3376 merged commit 504facd into main Jun 15, 2026
7 checks passed