dev: reh3376_dev01 -> main by github-actions[bot] · Pull Request #466 · reh3376/mdemg

github-actions · 2026-06-15T16:05:44Z

Summary

Development branch changes from reh3376_dev01.

Commits

feat(baseline-recompute-001): honest baseline 0.8338->0.8655 via fixed harness (Epic 2-4)
docs+fix(baseline-recompute-001): sprint plan (Epic 0) + rl_phase11 port 8101->8102 (Epic 1)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(reward-correctness-002): CHANGELOG + post (Epic 5 close)
fix(reward-correctness-002): schema/reward-mismatch fixes (Epics 1-3) + live validation
docs(reward-correctness-002): sprint plan — schema/reward-mismatch fixes (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(dataprune): Neo4j space hygiene record + CHANGELOG
fix(space): delete pre-check + list panic on non-MemoryNode / null-space data
docs(dataprune-audit-001): Category B+C prune execution record
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(dataprune-audit-001): record Category A prune execution + CHANGELOG (Epic 2 close)
docs(dataprune-audit-001): non-destructive audit of non-conforming data (Epic 0-1)
chore: sync reh3376_dev01 with main (post-merge base advance)
fix(ape-prompt-budget-001): re-pin ape_reflect ULTS source line after import shift
docs(ape-prompt-budget-001): feature doc + CHANGELOG + post + CLAUDE.md (Epic 5)
feat(ape-prompt-budget-001): bound ape.reflect prompt to protect output (Epic 1-2)
docs(ape-prompt-budget-001): sprint plan + recon (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
Merge remote-tracking branch 'origin/reh3376_dev01' into reh3376_dev01
docs(reward-correctness-001): CHANGELOG + sprint post (Epic 6 close)
feat(reward-correctness-001): per-task inclusion thresholds + live findings (Epic 2-3)
feat(reward-correctness-001): length-neutral correctness rewards (Epic 1)
docs(reward-correctness-001): sprint plan (Epic 0)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
chore: sync reh3376_dev01 with main (post-merge base advance)
fix(eval-integrity-001): restore ape_reflect system_prompt_source to clean file:line
Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
docs(eval-integrity-001): feature doc + CHANGELOG + post (Epic 7)
feat(eval-integrity-001): leak-audit gate target (Epic 2)
fix(eval-integrity-001): wire LLM recorder in ingest process — capture summarize.generate (Epic: capture-gap)
feat(eval-integrity-001): hard-fail on zero successful calls (Epic 3)
feat(eval-integrity-001): wire gate to leak-free 12-task valid_clean (Epic: gate-set)
fix(eval-integrity-001): respect dynamic_prompt — recover enum-templated tasks' production rows (Epic 1)
docs(eval-integrity-001): sprint plan + recon (Epic 0)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
chore: sync reh3376_dev01 with main (post-merge base advance)
Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
docs(sidecar-loop-001): feature doc + CHANGELOG; archive mislabeled pre-fix data
docs(training-audit): root-cause the 3 discarded retrains — corpus filter + eval + DPO all mislabeled
fix(sidecar-loop-001): collector logs aligned reranked candidates + guard (Epic 1)
docs(sidecar-loop-001): sprint plan + recon (Epic 0) — fix-but-defer
chore: sync reh3376_dev01 with main (post-merge base advance)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
docs(negfeed-cooler-001): feature doc + CHANGELOG + post (Epic 6)
feat(cooler-001): graduated-incident edges resist decay (Epic 4)
feat(cooler-001): unify graduation onto the Context Cooler (Epic 3)
feat(negfeed-001): anti-Hebbian producer — Jiminy contradicted bridge + MCP memory_reject (Epic 2)
feat(negfeed-001): CoactivateSession off-request + delta emission (Epic 1)
docs(negfeed-cooler-001): sprint plan + live-verified recon (Epic 0)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
chore: sync reh3376_dev01 with main (post-merge base advance)
Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
docs(readme): coverage-gap additions per 3-lane team review (operator-approved)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(context-live-001): UVTS A/B artifacts + feature doc + CHANGELOG + post (Epics 5-6)
feat(context-live-001): server-side query-fingerprint derivation default-on (Epic 4)
feat(context-live-001): classifier→category dispatch on live traffic (Epic 3)
feat(context-live-001): stage-6 fingerprint heal + Phase-B refine wired (Epic 2)
feat(context-live-001): version guard + honest consensus denominator (Epic 1)
docs(context-live-001): sprint plan + recon findings (Epic 0)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
chore: sync reh3376_dev01 with main (post-merge base advance)
fix(dormant-census-001): re-pin capability_gaps_full spec hash after variant removal
docs(dormant-census-001): feature doc + CHANGELOG + sprint post (Epic 5)
fix(jiminy): codegen collision-path self-deadlock wedged constraint observes
feat(dormant-census-001): prune 4 verified-dead routes + handlers (Epic 3)
feat(dormant-census-001): wire SignalLearner.GetStrength into Guide() ordering (Epic 2)
feat(dormant-census-001): adjudicated 187-route inventory + merge-blocking CI gate (Epic 1b)
feat(dormant-census-001): route↔consumer gate + 187-route inventory skeleton (Epic 1a)
docs(dormant-census-001): sprint plan — the standing dormancy guarantee
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(jiminy-budget-001): CHANGELOG + post (Epic 4)
feat(jiminy-budget-001): budget derivation, session attribution, surface/outcome split, token floors (Epics 1-3)
docs(jiminy-budget-001): sprint plan
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(surprise-topk-001): post + CHANGELOG; threshold recalibration + loud error path (Epics 3-4)
feat(surprise-topk-001): config-driven surprise multiplier thresholds at both Cypher sites (Epic 2)
feat(surprise-topk-001): exact top-K nearest-neighbor embedding novelty (Epic 1)
docs(surprise-topk-001): sprint plan — honest novelty + a multiplier that can fire
chore: sync reh3376_dev01 with main (post-merge base advance)
fix(doc-audit-001a): execute the approved fix batch
Merge remote-tracking branch 'origin/main' into reh3376_dev01
chore: sync reh3376_dev01 with main (post-merge base advance)
Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
docs(ft-classify-002): CHANGELOG + post — no-promote accepted, sprint closed (Epic 6)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(ft-classify-002): gate results — 2/3 PASS, no-promote recommended
docs(doc-audit-001): amended charter per 3-lens review team (operator-approved)
docs(ft-classify-002): stage-by-stage run record (6a instrumentation input)
fix(ft-classify-002): benchmark config still pointed at decommissioned mlx port 8101
feat(ft-classify-002): training config — fresh task-delta LoRA on the production fused model (Epic 2)
feat(ft-classify-002): class-stratified distill capture + the real 11.5d root cause (Epic 1)
Merge branch 'reh3376_dev01' of https://github.com/reh3376/mdemg into reh3376_dev01
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(ft-classify-002): sprint plan — distribution-matched consulting.classify distillation
fix(tsdb): time-scope V0014's backfill + integrity assertion to its historical window
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(mcp-revive-001): feature doc + CHANGELOG + CLAUDE.md + post (Epic 5)
feat(mcp-revive-001): eventgraph + strict MCP tools, plugin-orphan reaper (Epics 3+4)
test(mcp-revive-001): contract suite for the MCP tool surface (Epic 2)
feat(mcp-revive-001): space resolution chain across all MCP memory tools (Epic 1)
chore: sync reh3376_dev01 with main (post-merge base advance)
fix(jiminy-outcome-002): Tier-2 not_applicable + verdict provenance
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(ft-recursive-000): recursive-retraining loop as-built audit + buildable spec
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(tsdb-consume-001): record the Epic-5 live-smoke catch in post.md
fix(tsdb-consume-001): emergence-cycle gauge writers actually land in hidden/service.go
docs(tsdb-consume-001): feature doc + CHANGELOG + CLAUDE.md + post (Epic 8)
ci: reclaim runner disk before docker-publish builds
fix(tsdb-consume-001): remove the 4 reader-without-writer ft_* dashboard panels (Epic 7)
feat(tsdb-consume-001): V0020 writer-or-drop — decided: writer (Epic 6)
feat(tsdb-consume-001): guidance_conflicts counter + emergence-cycle duration tripwire (Epic 5)
feat(tsdb-consume-001): scorer-drift tripwires over retrieval_audit (Epic 4)
feat(tsdb-consume-001): writer flush stats → metrics + flush-failure rule (Epic 3)
fix(tsdb-consume-001): honest metrics plane — windowed percentiles, real pool gauges, latency rules over real wall-time (Epic 2)
feat(tsdb-consume-001): V0025 retention+compression for unbounded hypertables (Epic 1)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(config-deadflag-001): CHANGELOG + post — sprint complete
feat(config-deadflag-001): full triage — 28 wired, 29 deleted, 0 allowlisted (Epic 2) + CI guard (Epic 4a)
feat(config-deadflag-001): enforce EVENTGRAPH_MAX_PAIRS_PER_EVENT_BATCH (first triage wire)
docs(config-deadflag-001): sprint plan (Epic 0)
feat(config-deadflag-001): strict getBool + un-swallow LoadYAMLConfig + consumer scanner (Epics 1+3)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(doc-truth-002): MoE-as-current residual cleanup (modified per review)
chore: sync reh3376_dev01 with main (post-merge base advance)
fix(uxts-ci-001): UBENCH CI gate → lint mode; holdout absence tolerated, mismatch still fatal
fix(uxts-ci-001): add tiktoken + pyyaml to the neural CI job deps
docs(uxts-ci-001): CHANGELOG + CLAUDE.md CI-gate inventory + post — sprint complete (Epics 5-6)
feat(uxts-ci-001): drift checker covers all 15 frameworks, matrix refreshed, gate un-zombied (Epic 4)
feat(uxts-ci-001): neural pytest+ruff CI job — first run caught two silent failures (Epic 3b)
feat(uxts-ci-001): TSDB in CI, un-zombie UOTS, delete UVTS step, ULTS hash gate live (Epics 1-3a)
docs(uxts-ci-001): sprint plan (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
fix(uats-gap-001): tag breaker reset round-trip llm_required — CI has no openai-embeddings breaker
feat(uats-gap-001): 8 contract specs for the revived channels + suite hygiene (Epics 1-6)
fix(transfer): nil-guard edge identity assertions — one bad edge panicked the whole server (UATS-caught P0)
fix(jiminy): reformulate returns 400 (not 500) for missing context
docs(uats-gap-001): sprint plan (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
fix(tsdb): initial backup on start — restart-resetting ticker meant zero backups ever
fix(rsic-storm-001): commit ExecuteTombstoneStaleForTest wrapper missed from Epic F
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(rsic-storm-001): feature doc + verification + rollback drill test + CHANGELOG + CLAUDE.md — sprint complete (Epic F)
fix(rsic-storm-001): attributable archival + unified tombstone predicate (Epics B+C)
fix(rsic-storm-001): atomic trigger admission — reserve-on-allow (Epic A)
docs(rsic-storm-001): sprint plan with corrected burst attribution (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(backup-restore-verify-001): feature doc + verification + CHANGELOG + CLAUDE.md — sprint complete (Epic 7)
fix(transfer): omit empty path/name on import — restores with observations always failed (drill-caught)
fix(backup-restore-verify-001): retention was deleting every backup it just made (drill-caught)
feat(backup-restore-verify-001): initial backup on start (rule honesty)
feat(backup-restore-verify-001): neo4j-backup jobhealth + generalized staleness rules (Epic 5)
feat(backup-restore-verify-001): harden the restore path (Epics 1-4)
docs(backup-restore-verify-001): sprint plan (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(supervisor-002): feature doc + verification + CHANGELOG + CLAUDE.md — sprint complete (Epic 6)
fix(supervisor-002): streak-relative global-outage discriminator (drill-caught)
fix(jiminy): detach feedback outcome processing from the hook's connection lifetime
feat(supervisor-002): aggregate evaluator-degraded alert for global outages
fix(supervisor-002): nolint G118 on legacy-fallback loop launches
fix(supervisor-002): recency-gate the RSIC llm_error_rate_spike insight (Epic 4)
feat(supervisor-002): rule-health meta-alert on evaluator query failures (Epic 3)
feat(supervisor-002): register the 12 unsupervised background loops (Epic 2)
feat(supervisor-002): sliding-window restart budget + late registration (Epic 1)
docs(supervisor-002): sprint plan + background loop inventory (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(hidden-churn-001): PR-B verification + CHANGELOG + CLAUDE.md — sprint complete (Epic B5)
fix(hidden-churn-001): live-smoke fixes — noise pool was structurally empty, clustering included archived debris, coverage gauge gated on min-obs
feat(hidden-churn-001): surface themes_updated + noise_assigned in consolidate API + periodic log (Epic B3)
feat(hidden-churn-001): mdemg concepts repair + trace — grounding audit CLI (Epic B2)
feat(hidden-churn-001): PR-B coverage retune — config ratio, density assignment, gauge + rule (Epic B1)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(hidden-churn-001): PR-A verification + CHANGELOG (Epics A3-A4)
chore(hidden-churn-001): remove the dead global-detach helper — the churn mechanism itself
feat(hidden-churn-001): stable theme identity — centroid match-or-create replaces the 5-minute churn (Epic A2)
fix(hidden-churn-001): automated consolidation no longer skips LLM emergence (Epic A1)
docs(hidden-churn-001): sprint plan — stable concept identity, two-PR delivery (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(rrf-scale-002): live-calibrated suggest floor + CHANGELOG + close (Epics 5-6)
fix(rrf-scale-002): CacheKey covers ALL result-affecting fields + two forcing functions (Epics 3-4)
fix(rrf-scale-002): config-driven score thresholds — suggest revival, MCP tiers, guardrail floor (Epic 2)
fix(rrf-scale-002): persistent rerank clients — failure alerting re-armed on the hottest LLM path (Epic 1)
docs(rrf-scale-002): sprint plan — finish the score-scale contract (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
test(rsic-validate-001): integration seeds carry session linkage for the scoped tombstone contract
docs(rsic-validate-001): Tier 3 verification + CHANGELOG + close (Epics 5-6)
fix(rsic-validate-001): counter-free confidence calibration — RSIC stops polluting its own signal (Epic 4)
fix(rsic-validate-001): tombstone_stale scoped to correction-linked nodes; refresh_stale_edges decays for real (Epic 3)
fix(rsic-validate-001): honest criteria evaluation — populated keys + fail-closed mutations (Epics 1-2)
docs(rsic-validate-001): sprint plan — fail-closed self-improvement (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(doc-truth-001): last stale --adapter help string (sweep straggler)
docs(doc-truth-001): grep-sweep proof + CHANGELOG + close (Epic 4)
docs(doc-truth-001): 00_README STATUS block + AGENT_HANDOFF retired (Epic 3)
fix(doc-truth-001): operator-facing text matches the Phase 13.5 reality (Epic 2)
docs(doc-truth-001): CLAUDE.md FT section rewritten to post-pivot reality (Epic 1)
docs(doc-truth-001): sprint plan — documentation matches reality (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(embed-wire-001): live verification + CHANGELOG + CLAUDE.md + close (Epics 3-4)
fix(ingest-exec-001): server-triggered ingest resolves the mdemg binary — was hardcoded ./bin/mdemg (Epic 2)
fix(embed-wire-001): breaker + recorder reach the real embedder through the wrapper chain (Epic 1)
docs(embed-wire-001): sprint plan — embedder wiring + ingest exec resolution (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs: CLAUDE.md architecture note for MAINT-LIVE-001
docs(maint-live-001): first live run verification + feature doc + CHANGELOG + close (Epics 4b-5)
fix(prune): orphan sweeps use implicit transactions for batched deletes
feat(maint-live-001): context-dependent orphan policy — --exclude-role-types (Epic 4a)
feat(maint-live-001): mdemg upgrade refreshes installed LaunchAgents + hooks (darwin) (Epic 3)
feat(maint-live-001): maintenance_no_live_run evaluator rule (Epic 2)
fix(maint-live-001): scheduled maintenance runs live — plist passes --dry-run=false (Epic 1)
docs(maint-live-001): sprint plan — scheduled maintenance actually runs (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
docs(hidden-weight-001): Tier 3 verification + corpus restoration + UVTS harness audit + close (Epics 4-5)
fix(ingest): config-driven consolidation timeout — was sharing the 300s batch budget
feat(hidden-weight-001): null-weight gauge + regression alert rule (Epic 3)
feat(hidden-weight-001): mdemg graph backfill-weights — heal 56k NULL abstraction weights (Epic 2)
fix(hidden-weight-001): abstraction-edge weights — vector.similarity.cosine replaces point.distance (Epic 1)
docs(hidden-weight-001): sprint plan — real weights on the abstraction hierarchy (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
fix(ci): track .claude/hooks/pre-write-check.py so hook-parity check passes
fix(uats): jiminy_guide_sanitized timeout 30s → 90s — stale vs synthesis latency
docs(hooksync-001): Tier 3 verification + feature doc + CHANGELOG + close (Epics 7-8)
fix(hooksync-001): PORT-TRUTH — loopback bind defaults + sidecar zombie replaced (Epic 6)
feat(hooksync-001): mdemg hooks doctor — one-shot hook-channel triage (Epic 5)
feat(hooksync-001): hook-channel absence detection — the channel now self-reports outages (Epic 4)
feat(hooksync-001): alert Cleared lifecycle — display once, then delivered (Epic 3)
ci(hooksync-001): hook-template parity gate — live hooks must match templates (Epic 2)
fix(hooksync-001): reconcile bidirectional hook drift — alert delivery restored to live (Epic 1)
docs(hooksync-001): sprint plan — drift-proof + self-monitoring hook channel (Epic 0)
chore: sync reh3376_dev01 with main (post-merge base advance)
Merge remote-tracking branch 'origin/reh3376_dev01' into reh3376_dev01
docs(hookwire-001): Tier 3 verification + CHANGELOG + CLAUDE.md contract pin + close (Epics 4-5)
chore: sync reh3376_dev01 with main (post-merge base advance)
fix(hookwire-001): pre-compact transcript extraction reads the real line shape (Epic 3)
fix(hookwire-001): post-tool-observe reads tool_response — end blind "succeeded" observations (Epic 2)
fix(hookwire-001): prompt-context.sh reads .prompt — revive the per-prompt channel (Epic 1)
docs(hookwire-001): sprint plan — fix hook stdin contract, reconnect per-prompt channel (Epic 0)
docs(roadmap): Q3 2026 vision-derived roadmap from 26-agent codebase deep-dive
chore: sync reh3376_dev01 with main (post-merge base advance)
ci: auto-sync dev branch with main after each squash-merged PR
Merge remote-tracking branch 'origin/main' into reh3376_dev01
Merge origin/main into reh3376_dev01 (resolve PR #419 conflicts)
docs(eventgraph-004): feature doc + CHANGELOG + UATS pin + close (Epic 3)
docs(eventgraph-004): Tier 3 live verification — contradict create/re-match + weaken unchanged (Epic 2)
feat(eventgraph-004): wire ApplyNegativeFeedback contradict path → reinforcement_events (Epic 1)
docs(eventgraph-004): sprint plan + CoactivateSession post-revival health review (Epic 0)
docs(eventgraph-003): Tier 3 verification + feature doc + CHANGELOG + close (Epic 4)
fix(conversation): inject learning service so CoactivateSession actually runs
feat(eventgraph-003): wire ApplyNegativeFeedback weaken path → reinforcement_events (Epic 3)
feat(eventgraph-003): wire ApplySymbolCoactivation into reinforcement_events (Epic 2)
feat(eventgraph-003): wire CoactivateSession into reinforcement_events (Epic 1)
docs(eventgraph-003): sprint plan — reinforcement coverage for other Hebbian paths
chore(submodule): bump homebrew-mdemg to v0.10.1 formula
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs: governance system doc + bring cli/api references current
release: cut v0.10.1
feat(hooks): add pre-write-check.py to the tracked installer
feat(hooks): per-conversation SessionID — installed hook copies
feat(hooks): per-conversation SessionID across hooks + skill
docs(jiminy-governance): commit install-ready skill + install README
feat(jiminy-governance): ship the J17 governance skill + register MDEMG MCP
docs(jiminy-governance): resolve skill wire-up against the real instance
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs(roadmap): add jiminy-governance skill build-out (Workstream C, Action 7)
fix(nosilent-001): sync embedded launchd server plist with source (CI)
docs(nosilent-001): feature doc + CHANGELOG + CLAUDE.md + close (Epic 4)
fix(nosilent-001): distinct services for job rules so neither masks the other
feat(nosilent-001): scheduled-job staleness + failure alert rules (Epic 3)
feat(nosilent-001): record + alert on scheduled-job outcomes (Epic 2)
feat(nosilent-001): V0024 scheduled_job_events + writer (Epic 1)
docs(nosilent-001): sprint plan — fail-loud scheduled jobs (Epic 0)
fix(metrics,backup): resolve docker binary robustly under minimal launchd PATH
docs(eventgraph-002): feature doc + CHANGELOG + CLAUDE.md + close (Epic 7)
docs(eventgraph-002): Tier 3 live verification (Epic 6)
test(eventgraph-002): UATS contract spec for guidance-outcome federation (Epic 5)
feat(eventgraph-002): mdemg eventgraph guidance-outcome-neighborhood CLI (Epic 4)
feat(eventgraph-002): guidance-outcome federation handler + route (Epic 3)
feat(eventgraph-002): GuidanceOutcomesInNeighborhood federation method (Epic 2)
feat(eventgraph-002): V0023 constraint_code index on constraint_outcomes (Epic 1)
docs(eventgraph-002): sprint plan — guidance-outcome federation (Epic 0)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
fix(eventgraph-cli-001): tag UATS spec 'tsdb' so CI skips it without TSDB
docs(eventgraph-cli-001): live verification + feature doc + CHANGELOG + close (Epic 3)
test(eventgraph-cli-001): UATS contract spec for federation API (Epic 2)
fix(eventgraph): neighbor_node_ids serializes as [] not null for empty neighborhood
feat(eventgraph-cli-001): mdemg eventgraph reinforcement-neighborhood (Epic 1)
docs(eventgraph-cli-001): sprint plan — federation consumer CLI + UATS backfill
Merge remote-tracking branch 'origin/main' into reh3376_dev01
fix(jiminy): /guide 30s timeout sibling + single-source config defaults (GUIDANCE-SYNTH-001 fix-commit)
docs(followup-c): close JSON control-char escaping as NON-ISSUE (no fix)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
test+docs(guidance-synth-001): Tier 2/3 verification + docs + close (Epic 3)
feat(jiminy): config-drive warm-compute timeout (GUIDANCE-SYNTH-001 Epic 2)
feat(consulting): parallelize per-node constraint classifier (GUIDANCE-SYNTH-001 Epic 1)
docs(guidance-synth-001): sprint plan — fix guidance synthesis timeout (Follow-up B)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs(jiminy-outcome-001): CHANGELOG + CLAUDE.md + post.md (Epic 3)
test(jiminy-outcome-001): Tier 2 integration + Tier 3 live verification (Epic 2)
feat(jiminy): embedding-similarity constraint-code matching (JIMINY-OUTCOME-001 Epic 1)
docs(jiminy-outcome-001): sprint plan — revive Neo4j GUIDANCE_OUTCOME sink
Merge remote-tracking branch 'origin/main' into reh3376_dev01
test(rrf-scale-001): skip guidance integration tests on empty environment (CI fix)
docs(rrf-scale-001): CHANGELOG + CLAUDE.md score-scale contract + post.md (Epic 5)
test(rrf-scale-001): Tier 2 integration + Tier 3 live verification (Epic 4)
docs(rrf-scale-001): Epic 3 — remaining LOW findings reviewed + decided
fix(consulting): RRF-calibrate score gates + confidence sigmoid (RRF-SCALE-001 Epic 2)
docs(rrf-scale-001): Epic 1 audit findings — 12 sites cataloged
docs(rrf-scale-001): sprint plan — RRF score-scale consumer remediation
Merge remote-tracking branch 'origin/main' into reh3376_dev01
Merge remote-tracking branch 'origin/main' into reh3376_dev01
fix(eventgraph-001): Grafana panel uses TSDB instead of unconfigured Prometheus datasource
Merge branch 'main' into reh3376_dev01
docs(eventgraph-001): feature doc + CHANGELOG + CLAUDE.md + sprint close (Epic 8)
docs(eventgraph-001): Tier 3 live e2e verification transcript (Epic 7)
fix(retrieval): set Activation on RRF RetrieveResult (EVENTGRAPH-001 fix-commit)
fix(eventgraph-001): restore full GRAFANA-AUDIT-001 audit_results.json
feat(observability): Grafana panel + Prometheus counters for reinforcement events (EVENTGRAPH-001 Epic 6)
feat(eventgraph): federation query helper + API endpoint (EVENTGRAPH-001 Epic 5)
feat(learning): record reinforcement events to TSDB (EVENTGRAPH-001 Epic 4)
refactor(learning): expose per-pair telemetry from Hebbian Cypher (EVENTGRAPH-001 Epic 3)
feat(tsdb): buffered reinforcement_events writer (EVENTGRAPH-001 Epic 2)
feat(tsdb): V0022 reinforcement_events hypertable (EVENTGRAPH-001 Epic 1)
docs(eventgraph-001): sprint plan (Pattern Y1 TSDB-federation)
docs(model-dist-002): flip adapter section to shipped + sprint close
feat(cli): enable mdemg model pull --adapter (MODEL-DIST-002 Epic 5+6)
feat(model-dist-002): Epic 4 local — Modelfile.adapter + ollama create
feat(model-dist-002): Epic 1-3 — MLX adapter → PEFT → GGUF LoRA + live verify
feat(model-dist-002): Epic 0 — sprint plan + workspace prep
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs(grafana-audit): Epic 4 + 7 — feature doc + sprint close
fix(grafana): Epic 3 — 5 panels recovered (3 FAIL + 2 schema-drift)
feat(grafana-audit): Epic 1 + 2 — full audit + findings
feat(grafana-audit): Epic 0 — sprint plan + audit harness
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs(api): document 19 previously-undocumented endpoints (follow-up #2)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
feat(cli): add mdemg model run wrapper (follow-up #1 to MODEL-DIST-001)
chore(submodule + docs): bump homebrew-mdemg to v0.10.0 + cli-reference Model Distribution section
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs(release): promote Unreleased -> v0.10.0
merge: resolve quant_manifest.json conflicts (Epic 3 closeout vs squashed main)
docs(model-dist-001): sprint close — post.md
feat(model-dist-001): Epic 3 closeout — Ollama Library push complete
docs(model-dist-001): Epic 8 — Documentation Update (main repo)
docs(model-dist-001): Epic 7 — local-model-distribution feature doc
feat(model-dist-001): Epic 5 — V0021 model_install_events hypertable + writer
feat(model-dist-001): Epic 4 — mdemg model CLI + pluggable Fetcher interface
feat(model-dist-001): Epic 3 — 3 Modelfiles + local ollama create (push pending)
docs(model-dist-001): Epic 2 — defer adapter to MODEL-DIST-002
feat(model-dist-001): Epic 1 — built Q4_K_M + Q8_0 fused GGUFs
docs(sprint): MODEL-DIST-001 sprint plan + quant manifest skeleton
fix(service): replace decommissioned mlx-server LaunchAgent with llama-server
fix(api): /healthz returns build-time version, not stale literal "0.6.0"
chore(submodule): bump homebrew-mdemg to v0.9.0 formula + docs
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs(release): promote Unreleased -> v0.9.0

Auto-generated PR from reh3376_dev01 push

…alth review (Epic 0) EVENTGRAPH-004 federates the last unfederated Hebbian write — the ApplyNegativeFeedback contradict action — into reinforcement_events (trigger_path=apply_negative_feedback_contradict). Data-decided scope: reuse the existing V0022 sink (zero CONTRADICTS edges exist anywhere; no producer calls /v1/learning/negative-feedback — instrument before the producer arrives, the inverse of the dormancy pattern). Also closes the EVENTGRAPH-003 follow-up: 30h post-fix health review of the revived CoactivateSession path — no tuning needed, textbook session cliques, pre-fix orphans stay as historical record (operator decision). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…inforcement_events (Epic 1) The contradict action (no co-activation edge → MERGE CONTRADICTS) was the last unfederated Hebbian write. The CONTRADICTS MERGE lived inside a FOREACH, where the edge variable is invisible to RETURN — so the original single statement is split into two statements in the SAME ExecuteWrite transaction: (a) weaken (EVENTGRAPH-003 telemetry, RETURN unchanged) and (b) contradict with a per-pair RETURN. Classification is identical: weaken never deletes edges, so contradict's NOT EXISTS sees the same edge set the original OPTIONAL MATCH did. Contradict rows land with trigger_path=apply_negative_feedback_contradict. created_new_edge detected via `c.updated_at IS NULL` (ON MATCH always sets it; ON CREATE never does — invariant pinned by comment). delta_weight is the CONTRADICTS edge's OWN weight delta (+negWeight on create, 0 on re-match); negative-feedback semantics are carried by trigger_path, not the sign. Both statements EXPLAIN-validated against live Neo4j. Tier 1: 2 new parser tests (create/re-match branches); learning suite green; lint clean. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
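The contradict-row semantics described above can be sketched in a few lines. This is an illustrative Python model only, not the project's Go or Cypher code: `ContradictRow`, `contradict_row`, and the `neg_weight` parameter are hypothetical names, and the only facts taken from the commit message are the `trigger_path` value, the `updated_at IS NULL` create test, and the delta rule (+negWeight on create, 0 on re-match).

```python
from dataclasses import dataclass
from typing import Optional

# trigger_path value quoted in the commit message.
TRIGGER_PATH = "apply_negative_feedback_contradict"


@dataclass
class ContradictRow:
    """Hypothetical shape of one per-pair row; field names follow the commit text."""
    trigger_path: str
    created_new_edge: bool
    delta_weight: float


def contradict_row(updated_at: Optional[str], neg_weight: float) -> ContradictRow:
    """Classify one CONTRADICTS merge result.

    Assumes the invariant pinned in the commit: ON MATCH always sets
    updated_at and ON CREATE never does, so a missing value means the
    edge was just created.
    """
    created = updated_at is None
    # delta_weight is the CONTRADICTS edge's own weight change, so it is
    # never negative; the negative-feedback meaning lives in trigger_path.
    delta = neg_weight if created else 0.0
    return ContradictRow(TRIGGER_PATH, created, delta)
```

A consumer that sums `delta_weight` across trigger paths would therefore need to branch on `trigger_path` rather than on the sign of the delta.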

…-match + weaken unchanged (Epic 2) Live against the restarted Epic-1 binary: contradict create row (+0.15, created_new_edge=true), re-match row (delta=0, evidence=2), weaken row byte-equivalent to pre-split behavior (negative delta, floor at 0). Federation CLI surfaces the new trigger_path with no read-side change. UATS learning_negative_feedback 5/5 PASS. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…c 3) Feature doc: 5-path trigger_path table + delta-semantics consumer warning (contradict delta is the CONTRADICTS edge's own weight delta — semantics live in trigger_path, not the sign). UATS spec extended: zero-count equals assertions on nonexistent nodes (hash refreshed, 5/5 live). CLAUDE.md architecture note + producer-gap disclosure. Sprint close in post.md. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Squash-merge workflow leaves a stale merge-base: PR #418's squash (b408bbc) rewrote the same CHANGELOG/CLAUDE.md/service.go regions this branch then extended in EVENTGRAPH-004. Verified before resolving: main == dev01@36377a2 + .github/workflows/codeql.yml exactly (git diff b408bbc 36377a2 shows only codeql.yml), so taking this branch's side of every content conflict is lossless; codeql.yml comes in from main. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Squash merges never advance the dev branch's merge-base, so every sprint touching CHANGELOG.md/CLAUDE.md hit CONFLICTING on its next PR (first bitten: PR #419). New sync-dev-after-merge.yml merges main back into the source *_dev* branch after each merged PR; the GITHUB_TOKEN push triggers no other workflows, so it can never spawn an empty auto-PR (the PR #420 failure mode). Conflicts fail loudly for manual resolution; workflow_dispatch enables manual runs/live testing. auto-pr.yml additionally skips PR creation when branch content is identical to main — guards MANUAL sync pushes, verified against the live repo state (current dev01 ≡ main → empty=true → skip). actionlint clean (untrusted refs passed via env, not inline). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
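The "skip PR creation when branch content is identical to main" guard can be illustrated as below. This is a sketch under assumptions, not the contents of auto-pr.yml: the function names are hypothetical, and the check relies only on the documented behaviour of `git diff --quiet`, which exits 0 when two refs have no differences and 1 when they differ.

```python
import subprocess
from typing import Callable


def branch_matches_base(base: str, head: str, run: Callable = subprocess.run) -> bool:
    """Return True when `head` has exactly the same content as `base`.

    `run` is injectable so the decision logic can be exercised without a
    real repository; by default it shells out to git.
    """
    result = run(["git", "diff", "--quiet", base, head])
    if result.returncode == 0:
        return True   # no differences: the branch is "empty" relative to base
    if result.returncode == 1:
        return False  # differences exist
    # Any other exit code is a git error (bad ref, not a repository, ...).
    raise RuntimeError(f"git diff failed with exit code {result.returncode}")


def should_create_pr(base: str, head: str, run: Callable = subprocess.run) -> bool:
    """Open a PR only when the head branch actually differs from base."""
    return not branch_matches_base(base, head, run)
```

Raising on unexpected exit codes keeps the sketch in line with the "conflicts fail loudly" stance of the commit: an error is never silently treated as "nothing to do".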

…deep-dive Full-codebase review vs MDEMG's purpose (cognitive substrate / connection layer): 19 map agents (3 vision + 16 subsystem), 3 cross-cutting assessors, synthesizer + adversarial completeness critic (19 revisions applied). Verdict: server-side substrate is mature, but the system is not currently functioning as the assistant's internal dialogue — the per-prompt delivery channel silently no-ops (hook reads .user_prompt, Claude Code sends .prompt), 100% of GENERALIZES edges have NULL weight (22,170/22,170, live-verified), scheduled decay/prune has been a permanent dry-run, RSIC validates 16/17 actions vacuously, and supervision covers 3 of ~14 background loops. Every defect is the same disease: wired-looking seams with no caller, wrong contract, or no reader. 4 phases ≈ 75 days committed: (1) reconnect the loop ends, (2) close the learning loops, (3) survivability + class-ending forcing functions, (4) FT frontier + release hygiene. Top-10 ranked; deferrals explicit. Orchestrator spot-verification annex included (5 claims re-verified live). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…per-prompt channel (Epic 0) Roadmap Q3 Phase 1 rank #1. Audit of all 6 hooks vs the actual Claude Code stdin schemas: prompt-context.sh reads .user_prompt (CC sends .prompt) → channel exits silently on every prompt; post-tool-observe.py reads tool_output (CC sends tool_response) → false "Build/test succeeded" observations with empty output; guidance wrongly coupled to RESULT_COUNT>0; minor pre-compact transcript jq. session-start / pre-bash-check / pre-write-check verified correct. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…rompt channel (Epic 1) Claude Code's UserPromptSubmit stdin field is `prompt`; the hook read `.user_prompt`, which is always empty → exit 0 → per-prompt CMS recall, Jiminy guidance, /strict reformulation, the warm trigger, and the retrieve-time Hebbian reinforcement have NEVER fired in any session. Now reads `.prompt // .user_prompt` (legacy fallback kept). Also decouples guidance from recall: the RESULT_COUNT=0 branch no longer exits; it printed its notice then skipped guidance + warm + retrieval reinforcement, coupling independent deliveries. Both copies (live + installer template). Tier 1 simulated stdin: real .prompt payload → first-ever guidance delivery (J17 T1 bootstrap + DICT, 5363 guidance bytes vs 0 forever); legacy fallback works; short/empty/malformed payloads exit silently (fail-open preserved). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
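
A minimal sketch of the fallback read, written in Python for illustration only (the real hook is a shell script using jq). The function name `extract_prompt` is hypothetical; the field names `prompt` and `user_prompt` come from the commit message. It mirrors jq's `//` operator, which falls through only on null or false, and returns an empty string on malformed input so the caller can exit silently.

```python
import json


def extract_prompt(stdin_text):
    """Return the submitted prompt, or "" so the caller can fail open."""
    try:
        payload = json.loads(stdin_text)
    except json.JSONDecodeError:
        return ""
    if not isinstance(payload, dict):
        return ""
    value = payload.get("prompt")
    if value is None or value is False:  # jq's `//` falls through on null/false only
        value = payload.get("user_prompt")  # legacy fallback field
    return value if isinstance(value, str) else ""
```

An empty return value corresponds to the "exit silently" path; any non-empty string would go on to recall and guidance.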

…"succeeded" observations (Epic 2) Claude Code's PostToolUse stdin field is `tool_response` (string or object); the hook read `tool_output`, which is always absent → output_str empty → error indicators never matched → every go build/go test/pytest Bash call was recorded as "Build/test succeeded" sight-unseen, and real errors were never observed. Now reads tool_response (fallback tool_output) via _response_text(), normalizing string|dict|list (stdout/stderr join). Success classification requires NON-EMPTY clean output — a silent success records nothing rather than fabricating; failures land as error observations with real stderr. Both copies (template regenerated from fixed live, {{SPACE_ID}} placeholder preserved, verified identical modulo placeholder). Tier 1 against real CMS: failing build → error obs with stderr; passing → progress; empty → no record. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ine shape (Epic 3) Transcript lines are {type, message:{content:[{type, text|name, ...}]}}; the old top-level `.content` read always yielded empty, so pre-compaction snapshots never carried recent-activity context. New jq walks .message.content[] extracting .text/.name. Verified against this session's real transcript (old: nothing; new: real activity). Both copies, placeholders preserved. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
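
For illustration, the same walk over `.message.content[]` expressed in Python rather than jq. This is a sketch assuming one JSON object per transcript line, as described above; `recent_activity` is a hypothetical name and the handling of non-list content is an assumption.

```python
import json


def recent_activity(transcript_lines):
    """Collect .text / .name values from each line's message.content blocks."""
    out = []
    for raw in transcript_lines:
        try:
            entry = json.loads(raw)
        except json.JSONDecodeError:
            continue  # skip malformed lines
        message = entry.get("message") if isinstance(entry, dict) else None
        content = message.get("content") if isinstance(message, dict) else None
        if not isinstance(content, list):
            continue  # the old top-level `.content` shape lands here and yields nothing
        for block in content:
            if not isinstance(block, dict):
                continue
            value = block.get("text") or block.get("name")
            if isinstance(value, str) and value:
                out.append(value)
    return out
```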

…act pin + close (Epics 4-5) Live in the real session: first-ever guidance delivery (J17 T1 bootstrap + DICT, 5363 bytes vs 0 forever); real failing build → error observation with actual compiler output in CMS. PostToolUse success-only firing documented as a limitation. Hook stdin contract pinned in CLAUDE.md. Drift + clique-semantics findings logged for HOOKSYNC-001 / Phase 2. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…channel (Epic 0) Roadmap Q3 Phase 1 rank #2. Investigation grounded all five findings: template→live drift severed alert delivery (50-entry file actively rotating today, never shown); no Cleared lifecycle (nothing sets the field; no /v1/alert* endpoints); no absence detection for the channel that just had a months-long silent outage; compose publishes 9999 on 0.0.0.0; neural sidecar binds 0.0.0.0:8101 via a 39-day-old process serving pre-J17-fix code. 8 epics: reconcile, CI parity gate, clear lifecycle, hook_events absence rule (reuses V0024 via jobhealth), hooks doctor, PORT-TRUTH rider, Tier 3, docs. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…y restored to live (Epic 1) Live hooks adopted from templates (SPACE_ID substituted): restores the alert-display blocks (all-pending per prompt; critical/high + degraded healthz at session start) that the live copies lacked — the NOSILENT last mile. Reverse drift caught during reconcile: the live hook's T1/T2 bootstrap-detection block (MAX_TIER → /v1/jiminy/bootstrap → ACTIVE CONSTRAINTS header) never existed in the template and was nearly lost — now single-sourced in the template and regenerated into live. Live-verified: one prompt now renders alerts (50 pending incl. live CRITICALs) + recall + J17:INIT bootstrap + guidance + synergy footer, coexisting. All 6 hooks byte-identical to templates modulo {{SPACE_ID}}. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…emplates (Epic 2) Mirrors the compose/launchd parity pattern: every *.sh/*.py template must byte-match .claude/hooks/ modulo the {{SPACE_ID}} placeholder. Proven locally: passes clean, fails (with a bounded diff dump) on deliberate drift. Ends the bidirectional-drift class that severed alert delivery and nearly lost the T1 bootstrap block. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…vered (Epic 3) Alert.Cleared existed but nothing ever set it: once hooks rendered the file, the same entries would re-render every prompt forever. New: FileBackend.Clear (ids and/or all_before cutoff, idempotent, under the existing lock) → Dispatcher.ClearAlerts → POST /v1/alerts/clear. Hooks now clear exactly what they displayed (fire-and-forget, fail-open); cleared = delivered-to-operator, not resolved — persisting conditions re-fire via the evaluator. Alert IDs now CUIDv2 per the identifier standard (was UnixNano; old ids remain valid opaque strings). Live-verified lifecycle: prompt 1 → "50 pending, showing 10" + 10 cleared; prompt 2 → "40 pending, showing 10" (next batch, no re-render) → 20 cleared. Tier 1: Clear by-id/by-time/idempotent/no-backend. UATS alerts_clear 3/3 live (runner falsy-body inheritance discovered: variant bodies must be non-empty objects). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
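
The clear semantics (by id and/or by cutoff, idempotent) can be sketched as below. The real implementation is Go (`FileBackend.Clear`); this Python version is illustrative only, and the `Alert` fields, the union treatment of `ids` and `all_before`, and the returned count are assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass
class Alert:
    id: str
    created_at: datetime
    cleared: bool = False


def clear_alerts(alerts, ids=(), all_before=None):
    """Mark matching alerts cleared; return how many changed state."""
    wanted = set(ids)
    changed = 0
    for alert in alerts:
        by_id = alert.id in wanted
        by_time = all_before is not None and alert.created_at < all_before
        if (by_id or by_time) and not alert.cleared:
            alert.cleared = True
            changed += 1
    return changed
```

Calling it twice with the same arguments changes nothing the second time, which is what makes a fire-and-forget clear from the hook safe to retry.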

…self-reports outages (Epic 4) POST /v1/hooks/event records heartbeats into V0024 scheduled_job_events via the jobhealth policy point (job_name hook:<name>; no new sink). Two independent heartbeats: prompt-context fires per delivery (the monitored channel); post-tool-observe fires throttled (HOOK_HEARTBEAT_COOLDOWN_SEC, default 300; proves sessions ACTIVE). Evaluator rule hook_channel_silent (distinct service per the NOSILENT cooldown rule): sessions active + zero prompt-context fires in HOOK_SILENT_LOOKBACK_HOURS (24) → high alert. This is the "job never ran" guarantee applied to the channel whose months-long outage HOOKWIRE-001 found only by manual audit; the next contract drift self-reports. Config: HOOK_HEALTH_ALERT_ENABLED (true), HOOK_SILENT_LOOKBACK_HOURS (24), HOOK_ACTIVITY_MIN_EVENTS (5). Live-verified: real hook fires land rows (session metadata, latency); throttle holds; rule SQL positive + negative branches proven against the real table; UATS hooks_event 3/3. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
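
The absence rule can be sketched as a pure function over heartbeat rows. The real rule is SQL against scheduled_job_events; this Python version is an approximation, and the reading of "sessions active" as "at least HOOK_ACTIVITY_MIN_EVENTS post-tool-observe heartbeats in the window" is an assumption, as is the function name.

```python
from datetime import datetime, timedelta, timezone


def hook_channel_silent(events, now, lookback_hours=24, min_activity=5):
    """events: iterable of (job_name, timestamp). True means raise the alert."""
    cutoff = now - timedelta(hours=lookback_hours)
    recent = [name for name, ts in events if ts >= cutoff]
    activity = sum(1 for name in recent if name == "hook:post-tool-observe")
    fires = sum(1 for name in recent if name == "hook:prompt-context")
    # Alert only when sessions are demonstrably active yet the channel never fired.
    return activity >= min_activity and fires == 0
```

Requiring the activity heartbeat keeps an idle machine from alerting: no sessions means no expectation that the prompt channel fired.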

… (Epic 5) 11 checks: per-hook template parity (the CI gate's local twin), settings registration, server healthz, a stdin-contract self-test piping a real-shape UserPromptSubmit payload through the installed hook (asserts the always-present synergy footer), alert-file state (pending/total), and the last hook:prompt-context heartbeat age from scheduled_job_events (SKIP when TSDB unreachable). Table or --json; non-zero exit on any FAIL. Live: 11/11 PASS on this machine ("last fire 5s ago" — fed by the doctor's own self-test); correctly fails (exit 1) on deliberate drift. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ie replaced (Epic 6) Compose published the API on 0.0.0.0 (unauthenticated admin/destructive routes exposed off-host): now "${MDEMG_BIND_ADDR:-127.0.0.1}:${MDEMG_PORT}:9999"; wide bind is an explicit opt-in (both compose copies, CI-synced). Neural sidecar bound 0.0.0.0:8101 via config.py default AND the plist arg: both now 127.0.0.1 (both plist copies, CI-synced; SIDECAR_HOST env overrides). Operational: the 39-day-old sidecar process (started 2026-05-02, serving pre-J17-fix code) replaced; fresh process verified on 127.0.0.1:8101, both models loaded, health 200. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…lose (Epics 7-8) Live-verified across the sprint: alert backlog drained 50→2 on real prompts (display-then-clear); evaluator rules 15→16 (hook_channel_silent loaded); doctor 11/11 + correct failure mode; sidecar fresh on 127.0.0.1:8101 (NLI 234ms). Feature doc docs/features/hook-channel-health.md (config table incl. MDEMG_BIND_ADDR + SIDECAR_HOST). Findings: packaging plists are templates (raw copy → launchd exit 78; service install is canonical); UATS falsy-variant-body inheritance pinned. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…sis latency Caught in the HOOKSYNC-001 full-suite regression: the synchronous /v1/jiminy/guide includes local-model synthesis (~43s observed quiet, ~50s typical per GUIDANCE-SYNTH-001) — the spec's 30s timeout has been silently erroring since synthesis latency grew. Aligned with the JIMINY_WARM_COMPUTE_TIMEOUT_MS budget (90s); hash refreshed; passes live. Pre-existing — not a HOOKSYNC regression (Guide path untouched). The other 3 suite errors were load-induced flakes (pass individually): suite-vs-llama-server slot contention, noted for UXTS-CI-001. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…passes Root cause: the new 'Verify live hooks match hook templates' CI step (HOOKSYNC-001) diffs every internal/cli/hook_templates/*.{sh,py} against .claude/hooks/<name>, but the .gitignore allowlist only un-ignored the 5 original hooks. pre-write-check.py gained a template in this sprint while its live counterpart stayed gitignored, so CI checked out a tree without it and failed with 'MISSING live hook: .claude/hooks/pre-write-check.py'. Fix: add '!.claude/hooks/pre-write-check.py' to the allowlist and commit the live hook (already byte-identical to its template modulo SPACE_ID), preserving the full parity guarantee instead of weakening the CI step. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…n hierarchy (Epic 0) Roadmap Q3 Phase 1 rank #3. Live investigation: point.distance() returns NULL on embedding lists (proven: NULL where vector.similarity.cosine returns 0.627 on the same pair); 3 creation sites affected incl. an ABSTRACTS_TO site the audit missed. Scale worse than audited and growing: 28,332/28,332 GENERALIZES + 36,110/37,996 ABSTRACTS_TO = 64,442 NULL-weight abstraction edges. Neo4j cosine returns [0,1] directly — drop-in. Plan: fix sites (+ CUIDv2 edge ids), LIMIT-5-then-batched backfill, null-weight gauge + alert rule via the existing graph-stats → metric_samples path, UVTS-quick regression guard. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…cosine replaces point.distance (Epic 1) point.distance() is a spatial-Point function: on embedding lists it returns NULL, so every weight at the 3 abstraction-edge creation sites was never set (100% of GENERALIZES + 95% of ABSTRACTS_TO weightless; the CASE guards passed on good embeddings, then the THEN expr evaluated NULL — edges with good embeddings got nothing while embedding-less ones got the 0.5 fallback). vector.similarity.cosine returns [0,1] directly (live-verified: identical=1.0, orthogonal=0.5, opposite=0.0). Site 1 (theme GENERALIZES) gains the null-guard it never had. Also: edge_id randomUUID() → CUIDv2 per the identifier standard, minted Go-side via memberEdgePairs (Cypher can't generate CUIDv2) and zipped with member ids for UNWIND. All 3 statements EXPLAIN-validated live. Tier 1: pair-builder tests (uniqueness, CUID format, empty input). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
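
One mapping consistent with the three live-verified values (identical=1.0, orthogonal=0.5, opposite=0.0) is (1 + cos θ) / 2. The pure-Python sketch below illustrates that mapping; it is an assumption about the normalization rather than a copy of Neo4j's implementation, and `normalized_cosine` is a hypothetical name.

```python
import math


def normalized_cosine(a, b):
    """Cosine similarity rescaled from [-1, 1] to [0, 1]."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return (1.0 + dot / (norm_a * norm_b)) / 2.0
```

Under this mapping a weight of 0.5 means "unrelated", not "half similar", which matters to any consumer that thresholds on edge weight.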

2nd training-integrity remediation sprint. The distill gate (x9_distill_capture_v2.py:360: kept = mean(reward_vector) >= 0.8, global) drops spec-correct-but-terse answers because coverage_score/explanation_quality/coherence_score reward length over correctness, gutting ape.reflect (largest target) + summarize + synthesize, then balanced_sampler amplifies the verbose-skew. Principle: inclusion selects for CORRECTNESS not length. Fix: length-neutral correctness rewards + per-task inclusion thresholds + a forcing-function test (each of the 12 covered tasks' known-correct golden rows clear their gate) + distribution check. Closes with the eval-integrity-deferred GGUF serving + honest baseline recompute. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…c 1) The distill inclusion gate (mean(reward_vector) >= 0.8) selected for LENGTH, not correctness: the corpus-skew mechanism behind 3 discarded retrains. Four reward functions used length/count ladders that dropped spec-correct-but-terse answers below the gate and rewarded verbosity upward:
- coverage_score: <20 words→0.4 / <50→0.7 / then rising → now substantive content scores 0.9 flat (length-neutral); empty→0.0, pure-repetition→0.3.
- explanation_quality: <20 words→0.6 cliff → now substantive→0.9.
- coherence_score: required >=2 sentences + 10 words → now any coherent non-repetitive response→0.9; pure repetition→0.4.
- insight_count: rewarded bullet COUNT (>=5→1.0) → now >=1 genuine insight→0.9 (no upward count reward; stops bullet-spam, stops dropping single-insight reflections to 0.5, which hit ape.reflect, the largest target).
Verified: terse-correct now clears the 0.8 gate; verbosity/bullet-count no longer rewarded above concise; varied detailed content unaffected (0.9); empty/repetition still rejected. Tests rewritten to pin the new semantics (78 pass). Subtler keyword-bag functions (specificity/actionability) left for the continuation: they reward content signals, not raw length.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
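
A sketch of what a length-neutral `coverage_score` might look like, using the three score levels quoted above (0.0 / 0.3 / 0.9). The repetition test used here (every word identical) is an assumption for illustration; the source does not say how the real function detects pure repetition.

```python
def coverage_score(text):
    """Length-neutral: any substantive answer scores the same flat 0.9."""
    words = text.split()
    if not words:
        return 0.0  # empty response
    if len(words) > 1 and len({w.lower() for w in words}) == 1:
        return 0.3  # pure repetition (assumed heuristic)
    return 0.9  # substantive content, regardless of word count
```

With this shape a five-word correct answer and a five-hundred-word one contribute the same value to `mean(reward_vector)`, so the 0.8 gate no longer separates them by length.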

…ndings (Epic 2-3)
Epic 2: --reward-threshold-map JSON ({"task": float}) overrides the global --reward-threshold per task in x9_distill_capture_v2.py, so tasks whose reward arrays have a different natural ceiling can gate at the right bar. Records the per-task gate in each row + the manifest. Live-verified end-to-end: real OpenAI + TSDB run with {"consulting.classify": 0.6} applied the override (3/3 captured, manifest per_task reward_threshold=0.6).
Epic 3 (live Tier 3, docs/development/reward-correctness-001/live_findings.md): scored REAL production llm_interactions at the 0.8 gate, old vs new rewards. Validated Epic 1: hidden.summarize recovered 69/72 real concise summaries the old length ladder dropped. Surfaced three larger correctness issues the length fix does NOT close (the real dominant suppressors for the big tasks):
1. ape.reflect (54k, largest target): json_valid mean 0.133: ~87% of recent responses TRUNCATED mid-JSON (prompt ~5800 + ~3000 output > 8192 per-slot KV bound). Production serving/capture defect; gate correctly rejects. Recommended own-sprint follow-up (raise output budget, re-capture).
2. jiminy.evaluate: explanation_quality=0.0 on correct {violations,warnings} responses: wrong reward for the schema (no top-level explanation key). Reward-array fix, operator-gated (changes a ULTS array + re-grades).
3. jiminy.synthesize: keyword-bag follow_rate/specificity just below gate: the deferred Epic 1 continuation.
Also fixed 2 pre-existing lint nits in the touched file (F541, E741).
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
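
The per-task override can be sketched in a few lines. This is illustrative, not the code in x9_distill_capture_v2.py: `resolve_threshold` and `keep_row` are hypothetical names, and returning the applied gate alongside the decision is an assumption modeled on "records the per-task gate in each row".

```python
import json
from statistics import mean


def resolve_threshold(task, global_threshold, threshold_map):
    """Per-task value wins; otherwise fall back to the global gate."""
    return threshold_map.get(task, global_threshold)


def keep_row(task, reward_vector, global_threshold=0.8, threshold_map=None):
    """Return (kept, gate) so the applied gate can be recorded with the row."""
    gate = resolve_threshold(task, global_threshold, threshold_map or {})
    return mean(reward_vector) >= gate, gate


# Example map in the shape passed to --reward-threshold-map.
overrides = json.loads('{"consulting.classify": 0.6}')
```

With `overrides`, a consulting.classify row averaging 0.65 is kept at the 0.6 gate, while the same vector for any other task is still rejected at 0.8.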

Closes REWARD-CORRECTNESS-001 at Epic 1+2+3. Epic 4 (baseline recompute) explicitly deferred behind the ape.reflect truncation fix per operator sequencing — recomputing over a known-truncated corpus would bake in the corruption. Next sprint: ape.reflect truncation. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

Root-caused the ape.reflect ~87% truncation (largest training target) to a structurally-unbounded prompt: live-measured 7489 tokens (Current Assessment ~3895 + 5-cycle history ~2693), leaving only ~700 of the 8192 per-slot KV budget for output — 191/200 invalid responses cluster at 490-520 tokens_out, truncating mid-JSON at the ceiling. Compression already on; not a max_tokens cap. Plan: bound the prompt to a configurable token budget (gate verbose TSDB dataset fields, cap history cycles, final drop-oldest guard) so output always has ~4000-token headroom, with an optional serving-slot increase as the safety margin. Lever A (structural prompt budget) + Lever B (KV slot) proposed, picked at execution. Tier 3 proof: fresh ape.reflect json_valid recovers 0.13 → ~1.0. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ut (Epic 1-2) Implements the structural fix for the ape.reflect ~87% mid-JSON cutoff. The prompt was unbounded (live 7489 tok), leaving only ~700 of the 8192 per-slot KV budget for output, so the largest training target's responses were cut off mid-array. buildUserPrompt now enforces a token budget:
- gate the verbose TSDB dataset fields (LLMPerformance x17 / Retrieval / Embedding / TrainingReadiness, ~3895 of 7489 prompt tokens) behind RSIC_LLM_REFLECT_INCLUDE_DATASETS (default false); scalar health metrics the detectors use are always kept;
- cap history cycles via RSIC_LLM_REFLECT_HISTORY_CYCLES (default 3, was hardcoded 5);
- final budget guard (RSIC_LLM_REFLECT_PROMPT_BUDGET_TOKENS default 3500, 0 disables): drops history oldest-first, then trims the assessment tail, logging loudly what was dropped (never silent).
estimateTokens calibrated to the measured 2.3 chars/tok ratio, slightly conservative. 3 config fields (range-validated, no hardcoding) wired config -> LLMReflectorConfig -> server.go. 6 Tier-1 tests: dataset gating, history cap, drop-history-under-budget, trim-assessment-under-budget, under-budget-unchanged, estimator. Full ape suite + lint + config scanner (687/687) clean.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
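
The final budget guard can be sketched as follows. The real code is Go (`buildUserPrompt` / `estimateTokens`); this Python version only illustrates the order of operations (drop history oldest-first, then trim the assessment tail) under the 2.3 chars/token ratio quoted above. Function names, the return shape, and the character-level trim are assumptions. For context, the headroom arithmetic behind the fix is 8192 - 7489 = 703 tokens left for output.

```python
def estimate_tokens(text):
    """ceil(len / 2.3), done in integer math to avoid float rounding."""
    return -(-len(text) * 10 // 23)


def fit_prompt(assessment, history, budget_tokens=3500):
    """history is oldest-first. Returns (assessment, kept_history, dropped)."""
    kept = list(history)
    dropped = []
    if budget_tokens == 0:  # 0 disables the guard
        return assessment, kept, dropped
    # Step 1: drop history cycles oldest-first until the prompt fits.
    while kept and estimate_tokens(assessment) + sum(map(estimate_tokens, kept)) > budget_tokens:
        dropped.append(kept.pop(0))
    # Step 2: if the assessment alone is still over budget, cut its tail.
    if estimate_tokens(assessment) > budget_tokens:
        assessment = assessment[: budget_tokens * 23 // 10]
    return assessment, kept, dropped
```

A caller would log `dropped` and any trimmed length, matching the "never silent" requirement in the commit.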

…md (Epic 5) Live Tier 3 result documented: 3/3 fresh post-restart ape.reflect rows valid JSON (100%, up from ~13%), tokens_in ~2575 (from ~7489). Corrected the stale CLAUDE.md ape.reflect prompt-size figure (~5800 -> ~7489 live-measured) and added the per-slot KV "prompt+output share the budget" guidance. Closes APE-PROMPT-BUDGET-001. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

… import shift The Epic 1-2 import (log/slog) + struct fields shifted llmReflectSystemPrompt from line 74 to 80. The ULTS hash verifier reads from line-2 and grabs the first backtick string; at the stale :74 the search region now included the `"` backtick in `quoted[i] = `"` + a + `"`` → wrong hash. The system prompt TEXT is unchanged (hash 39b2bc… still matches at :80). Updated system_prompt_source :74 → :80. Local glob verify: ape.reflect 11/11 PASS. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ta (Epic 0-1) Operator-chosen audit-first prune phase. Read-only enumeration of every non-conforming TSDB/file target with exact counts + a backup/small-batch/verify prune plan (each category operator-gated). Findings — PRUNE TARGETS: (A) 2,111 invalid-JSON rows in object/array tasks (ape.reflect 1890 in the 06-11..06-13 truncation window, rerank_cross 184, evaluate_llm 18, query_classify 18, classify 1); (B) 21,135 error/empty rows (mdemg data clean target); (C) rerank mislabeled archive 6,894 events/21M + valid_golden 108 leaked + ~14 stale April baselines. ~23,246 TSDB rows total (~22.7%). NOT prune targets (schema/reward mismatch, data is fine, fix the definition): hidden.summarize 72 (prose vs object schema), string-schema tasks the jsonb check false-flags (intent_translate/codegen/synthesize emit valid bare strings), jiminy.evaluate. Audit pitfall recorded: never run a jsonb-validity prune predicate against string-schema tasks. Corrects the "87% of 54k" assumption: ape.reflect corruption is 1,890 rows in the recent truncation window (forward-fixed by APE-PROMPT-BUDGET-001), not corpus-wide. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…LOG (Epic 2 close) Pruned 1,898 genuinely-corrupt invalid-JSON rows from llm_interactions (ape.reflect 1,879 / jiminy.evaluate_llm 18 / consulting.classify 1), backup-first to .mdemg-backup-20260613_195431/dataprune/ (reversible). 102,415 -> 100,517, remaining_corrupt=0, live healthy, recent ape.reflect 14/14 valid. Small-batch verify caught that the raw pg_input_is_valid predicate over-counted by 213 (valid JSON behind markdown fences / think-tags that production SanitizeResponse strips); validated all candidates through a faithful replica of llmclient.SanitizeResponse and spared the recoverable 213. Categories B (error rows) + C (file artifacts) deferred. The backup dir is untracked (not committed). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
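
The over-count arises because a raw validity check sees the wrapper, not the payload. The sketch below shows the general idea of sanitizing before validating; it is not a replica of `llmclient.SanitizeResponse`, and the exact fence and `<think>` tag patterns are assumptions.

```python
import json
import re

FENCE = "`" * 3  # markdown code fence marker
FENCE_RE = re.compile(
    r"^" + FENCE + r"[\w-]*[ \t]*\n(.*?)\n?" + FENCE + r"\s*$", re.DOTALL
)
THINK_RE = re.compile(r"<think>.*?</think>", re.DOTALL)  # assumed tag shape


def sanitize(raw):
    """Strip think-tags and one surrounding markdown fence, if present."""
    text = THINK_RE.sub("", raw).strip()
    match = FENCE_RE.match(text)
    return match.group(1).strip() if match else text


def is_recoverable_json(raw):
    """True when the response parses as JSON after sanitizing."""
    try:
        json.loads(sanitize(raw))
        return True
    except json.JSONDecodeError:
        return False
```

A prune predicate built on `is_recoverable_json` spares fenced-but-valid rows while still flagging responses truncated mid-JSON.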

B: 21,254 error/silent-failure rows removed via mdemg data clean (4 spaces), backed up first. C: rerank prefix-archive (6,894 events/21M, no refs) moved to backup; valid_golden + ~14 baselines RETAINED (load-bearing — leak source + regression harness; retire during baseline recompute). Final: llm_interactions 79,461 rows, 0 non-conforming. Verification catch documented: data clean dry-run per-task table is surviving-rows, not the delete set. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ace data Two bugs surfaced during the space-hygiene cleanup (removing 24 junk/test/demo spaces for live testing):
1. `mdemg space delete` gated its pre-check on `count(MemoryNode {space_id})` but the delete itself is label-agnostic (`MATCH (n {space_id})`). A space holding only SymbolNodes/Observations (e.g. e2e-test = 10,918 SymbolNodes) reported "no nodes. Nothing to delete." and silently survived. Pre-check now counts all labels, matching the delete.
2. `ListSpaces` (`mdemg space list`) panicked (`interface conversion: interface {} is nil, not string` at the `sid.(string)` assertion) when any MemoryNode had a null space_id (orphaned/infra artifacts). The query now excludes null/empty space_id (such nodes are not a "space"), and the assertion is nil-guarded defensively.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

24 junk/test/demo spaces removed (~143k nodes, backed up); blank-space resolved (global infra kept null, 155 test MemoryNodes staged for delete); 2 space-tool bugs fixed (delete pre-check, list panic). Record in space_cleanup.md. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…xes (Epic 0) The REWARD-CORRECTNESS-001 follow-ups: (1) hidden.summarize schema object->string (production emits prose; 72 rows mis-flagged invalid-JSON); (2) explanation_quality schema-aware for nested violations[].reasoning (fixes jiminy.evaluate + evaluate_llm scoring correct responses 0.0); (3) keyword-bag specificity/actionability substantive-floored (jiminy.synthesize valid guidance dropped for lacking magic words). Makes the 4 tasks' grading correct before the baseline recompute. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

… + live validation Three reward/schema mismatches that scored CORRECT responses wrong:
1. hidden.summarize ULTS schema object->string. Production emits bare prose (cluster_summarizer.go), so the object schema mis-flagged 72 valid summaries as invalid-JSON. (Reward already fixed in RC-001; this corrects the spec.)
2. explanation_quality made schema-aware: jiminy.evaluate / evaluate_llm nest reasoning in violations[].reasoning, not a top-level field, so the flat lookup scored every correct response 0.0. Now credits nested reasoning and treats a valid no-violation verdict as a correct "no issues" answer (nothing to explain). Falls back to the flat path.
3. specificity_score / actionability_score substantive-floored (0.7 floor, keyword presence a bounded bonus, hedging/empty/repetition low); the keyword-bag dropped valid concise guidance below the gate for lacking ~6 magic words. follow_rate inherits it.
Live Tier 3 (real production rows, old->new kept@0.8): jiminy.evaluate 0/60->60/60 (mean 0.667->0.967), jiminy.synthesize 3/60->59/60 (0.725->0.879), ape.reflect 47/60->60/60 (0.848->0.956); evaluate_llm unchanged 60/60. New means 0.88-0.97 = correct production output scoring correctly, no over-inflation. 87 unit tests + 609 neural tests + ruff green.
Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
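
A sketch of the schema-aware lookup described in item 2. The branch order follows the commit message (nested reasoning, empty-verdict credit, flat fallback); the 0.9 credit value, the all-violations-must-be-explained rule, and the flat key name `explanation` are assumptions for illustration.

```python
CREDIT = 0.9  # assumed score for a correctly explained (or correctly empty) verdict


def explanation_quality(response):
    """Score reasoning wherever the task's schema actually puts it."""
    violations = response.get("violations")
    if isinstance(violations, list):
        if not violations:
            return CREDIT  # valid "no issues" verdict: nothing to explain
        explained = [
            v for v in violations
            if isinstance(v, dict)
            and isinstance(v.get("reasoning"), str)
            and v["reasoning"].strip()
        ]
        return CREDIT if len(explained) == len(violations) else 0.0
    # Flat fallback for tasks with a top-level explanation field.
    flat = response.get("explanation")
    return CREDIT if isinstance(flat, str) and flat.strip() else 0.0
```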

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…ort 8101->8102 (Epic 1) The capstone of the training-integrity arc: recompute the frozen 0.8338 baseline through the fixed harness (valid_clean + RC-001/002 rewards + GGUF :8102). Epic 1 fixes the stale rl_phase11.yaml mlx_port (8101 mlx_lm.server decommissioned → 8102 llama-server GGUF), flagged by EVAL-INTEGRITY-001. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

…d harness (Epic 2-4) Recomputed the adapter-promotion baseline through the fixed harness (valid_clean leak-free eval + RC-001/002 corrected rewards + GGUF llama-server :8102) = 0.8655, replacing the stale frozen 0.8338 (valid_golden-leaked, old length-biased rewards, decommissioned MLX serving — not comparable). evaluate_gate_5a now derives the target from the loaded baseline REPORT (single source of truth); the constant is retained only as a >5pp drift tripwire. status ok, 12 tasks, 50 samples/task, 0 zero-call. ape.reflect 0.696 is an eval-harness artifact (stored ~7.5k-token prompts bypass the runtime prompt budget and get cut off mid-JSON), not a model regression. Closes the training-integrity arc: trustworthy gate (EVAL-INTEGRITY-001), correct rewards (REWARD-CORRECTNESS-001/002), sound corpus (APE-PROMPT-BUDGET-001 + DATAPRUNE), honest baseline (this sprint). Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

reh3376 · 2026-06-15T16:06:21Z

BASELINE-RECOMPUTE-001 — honest adapter-promotion baseline (training-integrity capstone)

The promotion gate's baseline was a stale frozen constant 0.8338 (99%-leaked valid_golden eval, old length-biased rewards, decommissioned mlx_lm.server). Recomputed through the fixed harness — leak-free valid_clean + RC-001/002 corrected rewards + GGUF llama-server :8102 — the honest baseline is 0.8655.

Not comparable to 0.8338 (different eval + rewards + serving); future retrains compare against 0.8655. evaluate_gate_5a now derives the target from the loaded baseline report (single source of truth); the constant is retained only as a >5pp drift tripwire. Also fixed the stale rl_phase11.yaml mlx_port 8101→8102.
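
The "report is the source of truth, constant is only a tripwire" arrangement can be sketched like this. It is not the code of `evaluate_gate_5a`: the report key `overall_score`, the function name, and the choice to return a drift flag rather than raise are all assumptions.

```python
def gate_target(report, tripwire_constant, max_drift=0.05):
    """Return (target, drifted): target comes from the loaded baseline report."""
    target = float(report["overall_score"])  # hypothetical report key
    # The constant never sets the target; it only flags a >5pp disagreement.
    drifted = abs(target - tripwire_constant) > max_drift
    return target, drifted
```

For example, a report score of 0.8655 checked against a constant of 0.8338 differs by about 3.2pp and would not trip; against 0.80 it would.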

Live recompute (Tier 3): status: ok, GGUF :8102, 12 tasks, 50 samples/task, 0 zero-call. Per-task: jiminy.codegen 1.00 · jiminy.evaluate 0.967 · hidden.name_emergence 0.95 · consulting.classify 0.91 · rerank_cross 0.90 · hidden.summarize 0.90 · jiminy.synthesize 0.88 · intent_translate 0.874 · query_classify 0.82 · evaluate_llm 0.80 · ape.reflect 0.696 (dragged by json_valid=0.24 — eval-harness artifact: benchmark replays stored ~7.5k-token prompts bypassing the runtime APE-PROMPT-BUDGET-001 bound; fresh ape.reflect is 100% valid live — not a model regression).

Closes the training-integrity arc: trustworthy gate → correct rewards → sound corpus → honest baseline.

Disclosed follow-up: run_benchmark.py is single-threaded against a 4-slot llama-server; client-side concurrency would cut wall-time ~4×.

(Note: an identical summary was accidentally posted to the now-merged #465 first — this #466 is the correct PR.)

rhenley1958 and others added 30 commits June 10, 2026 20:13

Merge remote-tracking branch 'origin/main' into reh3376_dev01

4ec42a4

chore: sync reh3376_dev01 with main (post-merge base advance)

10432e7

chore: sync reh3376_dev01 with main (post-merge base advance)

51d6e31

Merge remote-tracking branch 'origin/reh3376_dev01' into reh3376_dev01

4003a11

chore: sync reh3376_dev01 with main (post-merge base advance)

c02353a

chore: sync reh3376_dev01 with main (post-merge base advance)

07fbb79

rhenley1958 and others added 25 commits June 13, 2026 13:47

Merge remote-tracking branch 'origin/main' into reh3376_dev01

89cc913

Merge remote-tracking branch 'origin/reh3376_dev01' into reh3376_dev01

2dc7890

chore: sync reh3376_dev01 with main (post-merge base advance)

66a21cc

chore: sync reh3376_dev01 with main (post-merge base advance)

29dc4ed

chore: sync reh3376_dev01 with main (post-merge base advance)

5db82e9

chore: sync reh3376_dev01 with main (post-merge base advance)

0425f35

docs(reward-correctness-002): CHANGELOG + post (Epic 5 close)

0fb4e59

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>

chore: sync reh3376_dev01 with main (post-merge base advance)

f829c37

github-actions Bot requested a review from reh3376 as a code owner June 15, 2026 16:05

reh3376 approved these changes Jun 15, 2026

View reviewed changes

reh3376 self-assigned this Jun 15, 2026

reh3376 merged commit 504facd into main Jun 15, 2026
7 checks passed

reh3376 merged 360 commits into main from reh3376_dev01
