
dev: reh3376_dev01 -> main #408

Merged
reh3376 merged 60 commits into main from reh3376_dev01
Jun 8, 2026
Conversation

github-actions bot commented Jun 4, 2026

Summary

Development branch changes from reh3376_dev01.

Commits

  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • test(rrf-scale-001): skip guidance integration tests on empty environment (CI fix)
  • docs(rrf-scale-001): CHANGELOG + CLAUDE.md score-scale contract + post.md (Epic 5)
  • test(rrf-scale-001): Tier 2 integration + Tier 3 live verification (Epic 4)
  • docs(rrf-scale-001): Epic 3 — remaining LOW findings reviewed + decided
  • fix(consulting): RRF-calibrate score gates + confidence sigmoid (RRF-SCALE-001 Epic 2)
  • docs(rrf-scale-001): Epic 1 audit findings — 12 sites cataloged
  • docs(rrf-scale-001): sprint plan — RRF score-scale consumer remediation
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • fix(eventgraph-001): Grafana panel uses TSDB instead of unconfigured Prometheus datasource
  • Merge branch 'main' into reh3376_dev01
  • docs(eventgraph-001): feature doc + CHANGELOG + CLAUDE.md + sprint close (Epic 8)
  • docs(eventgraph-001): Tier 3 live e2e verification transcript (Epic 7)
  • fix(retrieval): set Activation on RRF RetrieveResult (EVENTGRAPH-001 fix-commit)
  • fix(eventgraph-001): restore full GRAFANA-AUDIT-001 audit_results.json
  • feat(observability): Grafana panel + Prometheus counters for reinforcement events (EVENTGRAPH-001 Epic 6)
  • feat(eventgraph): federation query helper + API endpoint (EVENTGRAPH-001 Epic 5)
  • feat(learning): record reinforcement events to TSDB (EVENTGRAPH-001 Epic 4)
  • refactor(learning): expose per-pair telemetry from Hebbian Cypher (EVENTGRAPH-001 Epic 3)
  • feat(tsdb): buffered reinforcement_events writer (EVENTGRAPH-001 Epic 2)
  • feat(tsdb): V0022 reinforcement_events hypertable (EVENTGRAPH-001 Epic 1)
  • docs(eventgraph-001): sprint plan (Pattern Y1 TSDB-federation)
  • docs(model-dist-002): flip adapter section to shipped + sprint close
  • feat(cli): enable mdemg model pull --adapter (MODEL-DIST-002 Epic 5+6)
  • feat(model-dist-002): Epic 4 local — Modelfile.adapter + ollama create
  • feat(model-dist-002): Epic 1-3 — MLX adapter → PEFT → GGUF LoRA + live verify
  • feat(model-dist-002): Epic 0 — sprint plan + workspace prep
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs(grafana-audit): Epic 4 + 7 — feature doc + sprint close
  • fix(grafana): Epic 3 — 5 panels recovered (3 FAIL + 2 schema-drift)
  • feat(grafana-audit): Epic 1 + 2 — full audit + findings
  • feat(grafana-audit): Epic 0 — sprint plan + audit harness
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs(api): document 19 previously-undocumented endpoints (follow-up #2)
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • feat(cli): add mdemg model run wrapper (follow-up #1 to MODEL-DIST-001)
  • chore(submodule + docs): bump homebrew-mdemg to v0.10.0 + cli-reference Model Distribution section
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs(release): promote Unreleased -> v0.10.0
  • merge: resolve quant_manifest.json conflicts (Epic 3 closeout vs squashed main)
  • docs(model-dist-001): sprint close — post.md
  • feat(model-dist-001): Epic 3 closeout — Ollama Library push complete
  • docs(model-dist-001): Epic 8 — Documentation Update (main repo)
  • docs(model-dist-001): Epic 7 — local-model-distribution feature doc
  • feat(model-dist-001): Epic 5 — V0021 model_install_events hypertable + writer
  • feat(model-dist-001): Epic 4 — mdemg model CLI + pluggable Fetcher interface
  • feat(model-dist-001): Epic 3 — 3 Modelfiles + local ollama create (push pending)
  • docs(model-dist-001): Epic 2 — defer adapter to MODEL-DIST-002
  • feat(model-dist-001): Epic 1 — built Q4_K_M + Q8_0 fused GGUFs
  • docs(sprint): MODEL-DIST-001 sprint plan + quant manifest skeleton
  • fix(service): replace decommissioned mlx-server LaunchAgent with llama-server
  • fix(api): /healthz returns build-time version, not stale literal "0.6.0"
  • chore(submodule): bump homebrew-mdemg to v0.9.0 formula + docs
  • Merge remote-tracking branch 'origin/main' into reh3376_dev01
  • docs(release): promote Unreleased -> v0.9.0

Auto-generated PR from reh3376_dev01 push

rhenley1958 and others added 30 commits May 6, 2026 12:04
Promote the Unreleased CHANGELOG block to v0.9.0 (2026-05-06) ahead of
release.yml / goreleaser tag push.

New ### Breaking subsection captures two operator-visible cutovers since
v0.8.5: (1) Phase 13.5 LLM runtime port 8101 -> 8102 + .env migration
required; (2) Phase 13.6 MLX_* -> LLM_* env-var rename (legacy aliases
retained for >= 1 release cycle).

New ### Added entries: Phase 10.5 closeout (UBENCH framework promotion,
commit 0389b49) and Claude Code GitHub App workflows (PRs #378, #379).

All previously-Unreleased entries (Phase 14.2.3, 14.2.x, 14.1.x, 14, 13.6,
13.5, 13.2, 13.1) carried forward unchanged into the v0.9.0 block. Fresh
empty Unreleased section seeded above.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Bumps packaging/homebrew-mdemg pointer a235977 -> 6077097, which incorporates:
- f9358cd  Brew formula update for mdemg version v0.8.5 (goreleaser, prior)
- b4a0d2c  Brew formula update for mdemg version v0.9.0 (goreleaser, this release)
- 6077097  docs: v0.9.0 -- CHANGELOG, README What's New, beta-testing version pin

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`config.FromEnv()` defaulted MdemgVersion/MdemgCommit to literal "0.6.0"/
"unknown" when MDEMG_VERSION/MDEMG_COMMIT envs were unset. Both /healthz
and /readyz serialize cfg.MdemgVersion, so they reported "0.6.0" forever
regardless of the actual binary's ldflags-injected cli.Version.

Fix: defaults to "" in config; cli/config_loader.go injects cli.Version /
cli.Commit (the build-time vars set by goreleaser ldflags) when the env
override is unset. Operators can still pin via MDEMG_VERSION env.

Live-verified: dev build (no ldflags) now reports {"version":"dev"} on
/healthz instead of the lying "0.6.0". Production builds via goreleaser
will report the real semver tag.

TestHandleHealthz unaffected (sets cfg.MdemgVersion directly).
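
A minimal sketch of the resolution order described above, assuming a helper named resolveVersion (hypothetical; the real logic lives in cli/config_loader.go and may be shaped differently). The env override wins when set; otherwise the build-time value is used, which is "dev" for builds without ldflags.

```go
package main

import "fmt"

// Version stands in for the build-time variable that goreleaser sets via
// -ldflags; "dev" is the value a local build without ldflags reports.
var Version = "dev"

// resolveVersion prefers an explicit operator override (for example the value
// of MDEMG_VERSION) and otherwise falls back to the build-time value.
// The function name and signature are illustrative assumptions.
func resolveVersion(envOverride, buildTime string) string {
	if envOverride != "" {
		return envOverride
	}
	return buildTime
}

func main() {
	fmt.Println(resolveVersion("", Version))      // no override: build-time value
	fmt.Println(resolveVersion("1.2.3", Version)) // operator pin wins
}
```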

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…a-server

Phase 13.5 cutover (2026-05-03) replaced mlx_lm.server (port 8101) with
llama.cpp llama-server (port 8102) as the production LLM runtime, but the
embedded launchd plist template + service install code paths were never
updated. Any operator running 'mdemg service install' from a fresh checkout
got the decommissioned mlx_lm.server agent — mdemg's startup preflight then
failed because LLM_ENDPOINT=http://127.0.0.1:8102/v1 wasn't reachable.

Changes:
- New packaging/launchd/com.mdemg.llama-server.plist with the Phase 13.5
  production flags (--ctx-size 32768 --parallel 4 --cont-batching --metrics
  --jinja). Byte-identical mirror at internal/cli/launchd_templates/ for
  the embed.FS (CI sync-check enforced).
- Removed packaging/launchd/com.mdemg.mlx-server.plist + embed.FS mirror.
  mlx_lm.server is decommissioned and known-broken on M5 + macOS 26.3.x;
  keeping the template would just risk re-deploying it.
- internal/cli/service_darwin.go: launchdServices entry replaced with
  com.mdemg.llama-server. resolveMLXLMBin renamed to resolveLlamaServerBin
  with primary env MDEMG_LLAMA_SERVER_BIN, deprecation alias for
  MDEMG_MLX_LM_BIN (slog.Warn at boot, retained ≥1 release cycle per the
  Phase 13.6 deprecation pattern), PATH lookup of `llama-server`.
  resolveMDEMGModelPath default updated to the canonical Phase 13.5 GGUF
  filepath (.local-models/mdemg-llm-v1-gguf/mdemg-llm-v1.Q5_K_M.gguf) since
  llama-server takes a `.gguf` filepath, not an HF-format directory like
  mlx_lm.server. Install error message updated for the new env var name +
  remediation steps (`brew install llama.cpp`).
- migrateLegacyMLXServerPlist() added: if a pre-cutover com.mdemg.mlx-server
  plist is bootstrapped on the operator's machine, Install() boots it out
  and renames the file to .disabled-phase13_5 (matches the manual operator
  convention from Phase 13.5 rollout). Best-effort: failures don't block
  the install.
- internal/cli/service_darwin_test.go fully rewritten:
    * TestLaunchdServicesIncludesLlamaServer asserts the new entry exists
      and is Optional=false (production matches Hotfix 11.6.3.1; the old
      test asserted Optional=true, a latent lie since 2026-05-02 that
      Linux CI never caught because of //go:build darwin)
    * TestLlamaServerPlistEmbedded replaces TestMLXServerPlistEmbedded;
      additionally asserts mlx-server.plist is NOT in embed.FS
    * Two resolver tests for the primary env var
    * New TestResolveLlamaServerBinFallsBackToMLXAlias proves the
      Phase 13.6 deprecation alias path works
    * resolveMDEMGModelPath tests updated for the new GGUF default
- internal/cli/watchdog.go: help text references com.mdemg.llama-server
  (instead of com.mdemg.mlx-server) and llama-server (instead of
  mlx_lm.server). Notes that mdemg_mlx_health_state metric name is
  retained for dashboard compatibility.
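
The resolver order in the service_darwin.go bullet (primary env var, deprecated alias with a warning, then PATH lookup) could look roughly like the sketch below. The function name resolveBin, the injected getenv/lookPath parameters, and the error text are assumptions made so the order can be tested in isolation; only the env var names and the `llama-server` binary name come from the commit message.

```go
package main

import (
	"errors"
	"fmt"
	"log/slog"
	"os"
	"os/exec"
)

// errNotFound is a hypothetical sentinel for "no binary could be resolved".
var errNotFound = errors.New("llama-server not found: set MDEMG_LLAMA_SERVER_BIN or run `brew install llama.cpp`")

// resolveBin checks the primary env var, then the deprecated alias (logging a
// warning), then falls back to a PATH lookup of llama-server.
func resolveBin(getenv func(string) string, lookPath func(string) (string, error)) (string, error) {
	if p := getenv("MDEMG_LLAMA_SERVER_BIN"); p != "" {
		return p, nil
	}
	if p := getenv("MDEMG_MLX_LM_BIN"); p != "" {
		slog.Warn("MDEMG_MLX_LM_BIN is deprecated; use MDEMG_LLAMA_SERVER_BIN")
		return p, nil
	}
	p, err := lookPath("llama-server")
	if err != nil {
		return "", fmt.Errorf("%w (%v)", errNotFound, err)
	}
	return p, nil
}

func main() {
	// Real environment and PATH; the result depends on the machine.
	p, err := resolveBin(os.Getenv, exec.LookPath)
	fmt.Println(p, err)
}
```

Injecting the lookups keeps the precedence testable without mutating the process environment, which is one way to cover the alias path on CI runners that lack the binary.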

Tested:
- Tier 1 unit: 7/7 new tests pass; full ./internal/cli/... suite green
  (61s wall-clock).
- Tier 2 integration: golangci-lint run ./internal/cli/ — 0 issues.
  CI plist sync-check (diff -q packaging/launchd/*.plist
  internal/cli/launchd_templates/) — 6/6 byte-identical.
- Tier 3 live e2e: deferred. Running mdemg service install on the
  operator's currently-serving machine would briefly boot out the running
  llama-server LaunchAgent (PID 20527 actively serving production
  inference). The hand-installed llama-server plist on the operator's
  machine is byte-equivalent (modulo template substitutions) to what
  this commit will install via `mdemg service install` on a fresh
  operator setup, so the operator can verify on next planned redeploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Epic 0 of Sprint MODEL-DIST-001 — Local LoRA Distribution via Ollama Library.

Sprint plan in 12-section v1.0 format. Supersedes parts of the speculative spec
at docs/research/mdemg_sprint_ideas/MDEMG_FT_LORA_PACKAGING_SPEC.md (HF Hub vs
Ollama Library; adapter-only vs both-fused-and-adapter; Apple Silicon scope vs
cross-platform).

Configurability Contract — every operator-visible value is dynamic per the
framework's no-hardcoding rule. 12 env vars + flag overrides + sensible
defaults. ModelFetcher interface decouples CLI from Ollama-specific knowledge;
v1 ships OllamaFetcher only, future backends (HF / S3 / GitHub Release / file)
plug in via factory dispatch on MDEMG_MODEL_BACKEND without touching the CLI
surface.

Forensic from Epic 0:
- adapters/tier1/adapters.safetensors verified present (514 MB MLX, Phase 5
  SFT Iter 2400 best output)
- mdemg-llm-v1.Q5_K_M.gguf SHA256 captured (9.8 GB; 144ad7231...)
- f16 GGUF intermediate NOT on disk; Epic 1 will regenerate via
  convert_hf_to_gguf.py from the MLX merged model (~5 min)
- qwen3:14b model-layer digest captured from Ollama registry; manifest digest
  to be computed at Epic 3 for Modelfile FROM @sha256: pinning

quant_manifest.json skeleton with Q5_K_M SHA pre-populated; Q4_K_M / Q8_0 /
adapter SHAs filled in during Epics 1+2.

Estimated effort 5–7 dev-days. OpenAI spend $0. Risk medium (Ollama publish
one-way; MLX→PEFT→GGUF LoRA conversion is the riskiest engineering item with
documented contingency to defer to MODEL-DIST-002 if blocked).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Pipeline (CLAUDE.md Phase 13.5 documented path):
  1. mlx_lm.fuse --dequantize: mlx-community/Qwen3-14B-4bit + adapters/tier1/
     -> 29.6 GB bf16 HF safetensors at .local-models/qwen3-14b-mdemg-v1-bf16/
  2. convert_hf_to_gguf.py --outtype f16 -> 30 GB f16 GGUF (required
     neural/.venv interpreter with torch + transformers + gguf installed;
     /opt/homebrew/bin/convert_hf_to_gguf.py uses system python which lacks
     these — installed gguf/sentencepiece/protobuf into neural/.venv)
  3. llama-quantize Q4_K_M -> 9.0 GB (4.87 BPW; 40s wall on M5)
  4. llama-quantize Q8_0 -> 16 GB (8.50 BPW; 11s wall on M5)
  5. Live smoke per new quant via llama-server on port 18102 — both serve
     /v1/models cleanly with embedded chat_template

SHAs captured in quant_manifest.json:
  Q4_K_M: 401161710c22f0ae...411d42ea
  Q5_K_M: 144ad723101d688f...d5f5d54 (matches Epic 0 baseline)
  Q8_0:   fc14dcb40af1bb58...8db6089
  f16:    436cd6f41a684805...3217bd (intermediate, retained for Epic 2)

Resource matrix updated with empirical sizes (Q4_K_M is 9.0 GB vs estimated
6.5 GB; min RAM revised 8 -> 12 GB to cover ~3 GB working memory above
weights). Sanity check: 14B params x 4.87 bits per weight / 8 bits per byte
≈ 8.5 GB, in line with the measured size.
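
The size estimate above is simple arithmetic: parameters times bits per weight, divided by 8 bits per byte. A small sketch, assuming the nominal 14e9 parameter count quoted in the message (the model's exact count may be somewhat higher, which would narrow the gap to the measured file size); estimateBytes is a hypothetical helper name.

```go
package main

import "fmt"

// estimateBytes returns the approximate file size in bytes for a model with
// the given parameter count quantized at bpw bits per weight.
func estimateBytes(params, bpw float64) float64 {
	return params * bpw / 8 // 8 bits per byte
}

func main() {
	quants := []struct {
		name string
		bpw  float64
	}{
		{"Q4_K_M", 4.87},
		{"Q8_0", 8.50},
	}
	for _, q := range quants {
		gb := estimateBytes(14e9, q.bpw) / 1e9 // decimal gigabytes
		fmt.Printf("%s: ~%.1f GB\n", q.name, gb)
	}
}
```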

GGUF binary artifacts stay local — .local-models/ gitignored per
.gitignore:70. Sprint deliverable in git is just the manifest update.

Production llama-server (PID 20527 on port 8102) undisturbed throughout
Epic 1; live smokes used port 18102.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adapter (LoRA-only Modelfile via ADAPTER directive) deferred per the sprint
plan's documented contingency clause. Fused-only path (Epics 1, 3, 4, 5)
continues — that's the primary operator value.

Forensic findings (epic_2_forensic.md):
- MLX adapter is well-formed: 560 tensors, 40 layers x 7 target_modules,
  rank 32, alpha 64, scale 20.0.
- convert_lora_to_gguf.py is NOT in brew install llama.cpp; would need
  manual fetch from llama.cpp source.
- MLX -> PEFT requires tensor transposition: MLX lora_a is (in, rank);
  PEFT expects (rank, in). Same for lora_b.
- Estimated 80-95 min to complete vs ~30 min budget remaining for Epic 2.
- Hit the contingency criterion: "MLX -> PEFT conversion blocked by
  tooling gaps."
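
The layout difference noted in the findings is a plain matrix transpose. The toy sketch below uses nested slices in place of real tensors to show the (in, rank) to (rank, in) flip; it is an illustration of the shape change only, not the conversion tooling, and it assumes a rectangular input.

```go
package main

import "fmt"

// transpose returns the transpose of a rectangular row-major matrix:
// an (r x c) input becomes a (c x r) output.
func transpose(m [][]float32) [][]float32 {
	if len(m) == 0 {
		return nil
	}
	out := make([][]float32, len(m[0]))
	for j := range out {
		out[j] = make([]float32, len(m))
		for i := range m {
			out[j][i] = m[i][j]
		}
	}
	return out
}

func main() {
	// Toy lora_a in the MLX layout: in=3, rank=2.
	loraA := [][]float32{{1, 2}, {3, 4}, {5, 6}}
	peft := transpose(loraA) // PEFT layout: rank=2, in=3
	fmt.Println(len(peft), len(peft[0]), peft)
}
```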

Decision: defer adapter scope to MODEL-DIST-002 (new follow-up sprint, to
be planned separately). Fused-only ships this sprint.

Knock-on changes (in-flight to subsequent epics):
- Epic 3: drop Modelfile.adapter; publish only 3 fused quants.
- Epic 4 CLI: --adapter flag accepted at parse-time but errors with
  "lands in MODEL-DIST-002"; machinery preserved for forward-compat.
- Epic 6 e2e: drop adapter-pull step.
- Epic 7 feature doc: adapter section notes "coming in MODEL-DIST-002".

Artifacts preserved on disk for MODEL-DIST-002 pickup:
- adapters/tier1/adapters.safetensors (MLX, 514 MB)
- .local-models/mdemg-llm-v1-gguf/mdemg-llm-v1.f16.gguf (30 GB,
  retained as base for llama-server --lora verification later)

quant_manifest.json adapter block updated with status=deferred + reason.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…sh pending)

Authored 3 Ollama Modelfiles in packaging/ollama/:
  Modelfile.Q4_K_M — 9.0 GB, 12 GB min RAM, 16 GB recommended
  Modelfile.Q5_K_M — 11 GB, 14 GB min RAM, 24 GB recommended (production canonical)
  Modelfile.Q8_0   — 16 GB, 20 GB min RAM, 32 GB recommended

Common shape: FROM ./../../.local-models/mdemg-llm-v1-gguf/...gguf relative
path (operator-machine local); num_ctx 32768, num_predict 4096, stop tokens
<|im_end|>/<|im_start|>; Apache-2.0 LICENSE; SYSTEM positioning block.
No TEMPLATE directive — chat template baked into GGUF metadata (Qwen3
chat_template.jinja preserved through mlx_lm.fuse --dequantize → convert_hf
→ llama-quantize pipeline).

packaging/ollama/README.md documents the publish workflow including the
fork-customization path (operators publishing under a different namespace
follow MDEMG_MODEL_NAMESPACE per the Configurability Contract).

Local ollama create completed for all 3:
  reh3376/mdemg-llm-v1:Q4_K_M  ID 5c3a7252c295
  reh3376/mdemg-llm-v1:Q5_K_M  ID 08c13b480864
  reh3376/mdemg-llm-v1:Q8_0    ID 6b1006facd36

Layers de-duplicated: config + params + system layers (3 layers) are
identical across all 3 quants; only the model blob (GGUF) differs.

** ollama push deferred ** — one-way action gated on operator confirmation
per Sprint Plan §10 Risk #8. Operator must claim reh3376 namespace on
ollama.com and generate API token before push proceeds. Local-create proves
the Modelfiles are well-formed; push is a separate decision.

Once pushed, manifest digests will be captured into quant_manifest.json
(ollama_manifest_digest field per quant) for mdemg model verify.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…interface

Sprint MODEL-DIST-001 Epic 4 — the bulk of the operator-facing surface.

New CLI subcommand group:
  mdemg model pull       # fetch + symlink + SHA verify
  mdemg model list       # show pulled models
  mdemg model verify     # re-check SHAs vs quant manifest
  mdemg model remove     # destructive (requires --yes)
  mdemg model where      # print resolved path for shell scripting

Pluggable backend (internal/cli/model_fetcher.go):
  type Fetcher interface { Name, Fetch, Verify, Remove }
  NewFetcher dispatches on cfg.ModelBackend (env: MDEMG_MODEL_BACKEND)
  v1 ships OllamaFetcher only; future backends (hf, s3, github-release,
  file) plug in via factory branch — CLI surface unchanged.
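
A rough sketch of the factory dispatch described above. Only the four method names and the NewFetcher name come from the commit message; the method signatures, the stub type, and treating an empty backend as "ollama" are assumptions for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// Fetcher mirrors the four method names from the commit message.
// The signatures are illustrative assumptions.
type Fetcher interface {
	Name() string
	Fetch(tag string) (path string, err error)
	Verify(tag string) error
	Remove(tag string) error
}

// ollamaFetcher is a stub standing in for the real OllamaFetcher.
type ollamaFetcher struct{}

func (ollamaFetcher) Name() string { return "ollama" }
func (ollamaFetcher) Fetch(tag string) (string, error) {
	return "", fmt.Errorf("fetch %q: not implemented in this sketch", tag)
}
func (ollamaFetcher) Verify(tag string) error { return nil }
func (ollamaFetcher) Remove(tag string) error { return nil }

// NewFetcher dispatches on the backend name, case-insensitively. An empty
// name selects the default backend; unknown names are an error.
func NewFetcher(backend string) (Fetcher, error) {
	switch strings.ToLower(strings.TrimSpace(backend)) {
	case "", "ollama":
		return ollamaFetcher{}, nil
	default:
		return nil, fmt.Errorf("unknown model backend %q", backend)
	}
}

func main() {
	for _, b := range []string{"", "Ollama", "s3"} {
		f, err := NewFetcher(b)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Println("backend:", f.Name())
	}
}
```

Adding a backend in this shape means one new type plus one new case in the switch, which is what keeps the CLI surface unchanged.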

OllamaFetcher (internal/cli/model_fetcher_ollama.go):
  Encapsulates ALL Ollama-specific concepts: `ollama pull` invocation,
  manifest path under <OLLAMA_MODELS>/manifests/<OLLAMA_HOST>/<ns>/<n>/<tag>,
  mediaType=application/vnd.ollama.image.model layer filtering,
  blob path under <OLLAMA_MODELS>/blobs/sha256-<digest>, symlink under
  <MDEMG_MODEL_DIR>, idempotent.
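
The two on-disk paths above can be composed as in the sketch below. The helper names and the example root are hypothetical; the directory layout follows the commit message, and accepting the digest with or without a "sha256:" prefix is an assumption based on the "digest prefix handling" test case listed further down.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// manifestPath composes <root>/manifests/<host>/<ns>/<name>/<tag>.
func manifestPath(root, host, ns, name, tag string) string {
	return filepath.Join(root, "manifests", host, ns, name, tag)
}

// blobPath composes <root>/blobs/sha256-<digest>, accepting the digest with
// or without a leading "sha256:" prefix.
func blobPath(root, digest string) string {
	hex := strings.TrimPrefix(digest, "sha256:")
	return filepath.Join(root, "blobs", "sha256-"+hex)
}

func main() {
	root := "/tmp/ollama-models" // hypothetical OLLAMA_MODELS root
	fmt.Println(manifestPath(root, "registry.ollama.ai", "reh3376", "mdemg-llm-v1", "Q5_K_M"))
	fmt.Println(blobPath(root, "sha256:abc123"))
}
```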

Configurability Contract (no hardcoding; memory: feedback_no_hardcoded_values.md):
  12 env vars + flag overrides, each with v1-production-tuned defaults so
  `mdemg model pull` with no flags Just Works. See sprint plan §3.
  Live-verified all 3 resolution paths:
    `--quant Q5_K_M`                          → namespace=reh3376
    `--namespace acme --name custom-model`    → namespace=acme name=custom
    `MDEMG_MODEL_NAMESPACE=acme env`          → env overrides applied
  Added to internal/config/config.go: ModelBackend, ModelNamespace,
  ModelName, ModelQuants, ModelRamTiers, ModelQuant, AdapterBase,
  ModelDir, OllamaModelsRoot, OllamaRegistryHost, ModelManifestPath.

Embedded quant manifest (internal/cli/quant_manifest.json via embed.FS):
  Runtime source-of-truth for SHA verification. Operator override via
  MDEMG_MODEL_MANIFEST_PATH for air-gapped deployments. Mirrors
  docs/development/model-dist-001/quant_manifest.json.

RAM-tier auto-pick:
  Default JSON `{"<16":"Q4_K_M","<24":"Q5_K_M","default":"Q8_0"}` maps
  host RAM (sysctl on darwin, /proc/meminfo on linux) to quant. Operator
  override via MDEMG_MODEL_RAM_TIERS.
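
One plausible reading of the tier JSON is "pick the first tier whose upper bound exceeds host RAM in GB, else the default". The sketch below implements that reading; parseTiers and pickQuant are hypothetical names, and requiring a "default" key and rejecting keys without a "<" prefix are assumptions, not confirmed behavior.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

type tier struct {
	belowGB float64
	quant   string
}

// parseTiers turns the tier JSON into ascending thresholds plus a default.
func parseTiers(raw string) ([]tier, string, error) {
	var m map[string]string
	if err := json.Unmarshal([]byte(raw), &m); err != nil {
		return nil, "", fmt.Errorf("parse RAM tiers: %w", err)
	}
	def, ok := m["default"]
	if !ok {
		return nil, "", fmt.Errorf("RAM tiers: missing default key")
	}
	var tiers []tier
	for k, v := range m {
		if k == "default" {
			continue
		}
		if !strings.HasPrefix(k, "<") {
			return nil, "", fmt.Errorf("RAM tiers: unsupported key %q", k)
		}
		gb, err := strconv.ParseFloat(strings.TrimPrefix(k, "<"), 64)
		if err != nil {
			return nil, "", fmt.Errorf("RAM tiers: bad threshold %q: %w", k, err)
		}
		tiers = append(tiers, tier{belowGB: gb, quant: v})
	}
	sort.Slice(tiers, func(i, j int) bool { return tiers[i].belowGB < tiers[j].belowGB })
	return tiers, def, nil
}

// pickQuant returns the quant of the first tier whose threshold exceeds ramGB.
func pickQuant(tiers []tier, def string, ramGB float64) string {
	for _, t := range tiers {
		if ramGB < t.belowGB {
			return t.quant
		}
	}
	return def
}

func main() {
	tiers, def, err := parseTiers(`{"<16":"Q4_K_M","<24":"Q5_K_M","default":"Q8_0"}`)
	if err != nil {
		panic(err)
	}
	for _, ram := range []float64{8, 16, 24, 64} {
		fmt.Printf("%v GB -> %s\n", ram, pickQuant(tiers, def, ram))
	}
}
```

Under this reading a host with exactly 16 GB falls into the "<24" tier, since 16 is not strictly below 16.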

Adapter path (--adapter flag) returns ErrAdapterDeferred per Epic 2's
contingency exit — adapter publication lands in MODEL-DIST-002. Flag
machinery preserved for forward compatibility.

Tests (22, all green) in internal/cli/model_test.go:
- Backend factory dispatch (5 cases incl. case-insensitive, default, error)
- Quant allowlist parsing (5 cases incl. whitespace + empty entries)
- RAM-tier JSON parsing (default + operator override + malformed)
- PickQuantForRAM (7 boundary cases)
- ResolveQuant across paths (auto, explicit, rejection, operator-custom)
- QuantManifest load (embedded + file override + missing-file error)
- Ollama tag composition (fused + adapter forms)
- Manifest path composition under custom OLLAMA_MODELS/OLLAMA_HOST
- Blob path digest prefix handling
- Adapter deferred error
- Manifest JSON parser (mediaType filtering + malformed + no-model-layer)
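
The mediaType filtering covered by the last test case could be sketched as follows. The struct shape assumes the registry-style manifest layout (a `layers` array with `mediaType`, `digest`, and `size` fields); modelLayerDigest and errNoModelLayer are hypothetical names, and the sample manifest is invented for the demo.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

const modelLayerMediaType = "application/vnd.ollama.image.model"

var errNoModelLayer = errors.New("manifest has no model layer")

// manifest captures only the fields this sketch needs.
type manifest struct {
	Layers []struct {
		MediaType string `json:"mediaType"`
		Digest    string `json:"digest"`
		Size      int64  `json:"size"`
	} `json:"layers"`
}

// modelLayerDigest returns the digest of the first layer whose mediaType
// marks it as the model blob.
func modelLayerDigest(raw []byte) (string, error) {
	var m manifest
	if err := json.Unmarshal(raw, &m); err != nil {
		return "", fmt.Errorf("parse manifest: %w", err)
	}
	for _, l := range m.Layers {
		if l.MediaType == modelLayerMediaType {
			return l.Digest, nil
		}
	}
	return "", errNoModelLayer
}

func main() {
	// Invented sample: one params layer, one model layer.
	sample := []byte(`{"layers":[
		{"mediaType":"application/vnd.ollama.image.params","digest":"sha256:aaa","size":10},
		{"mediaType":"application/vnd.ollama.image.model","digest":"sha256:bbb","size":20}
	]}`)
	fmt.Println(modelLayerDigest(sample))
}
```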

Grep audit (verification checklist):
  grep on internal/cli/model*.go for hardcoded values found only in help
  text Long/example strings documenting defaults to operators — not in
  logic. Behavior values all flow through cfg.Model* fields.

Build + lint clean. Full cli test suite (61s wall) green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…+ writer

Sprint MODEL-DIST-001 Epic 5 — observability for `mdemg model` operations.
Grafana panels deferred to Sprint B (Grafana audit).

New migration:
  internal/tsdb/migrations/021_model_install_events.sql
  Hypertable on recorded_at, 7-day chunks, 3 indexes (quant-time,
  failed-events partial, backend-event-time). Columns: event_id CUIDv2
  PK + recorded_at, event_type (pull/verify/remove), backend_name,
  namespace, model_name, quant, adapter bool, success bool, latency_ms,
  sha256, size_bytes, err_message (1 KB cap).

New writer:
  internal/tsdb/model_install_writer.go
  Synchronous single-row INSERT (not buffered + CopyFrom — CLI is
  one-shot, writes are infrequent vs the V0017/V0018/V0019/V0020 retrieval-
  path writers that fire per-request). Nil-pool no-op for degraded mode.
  errMessageMaxLen=1024 truncation at write time. New modelInstallPool
  interface (Exec-shaped) avoids touching the existing CopyFrom-shaped
  poolIface used by buffered writers.
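
A pared-down sketch of the writer behavior described above: nil-pool no-op plus truncation at write time. The execer interface, the writer type, and the two-column INSERT are simplifications assumed for illustration (the real table has more columns and the real pool interface presumably wraps a database driver). Truncation here backs off to a UTF-8 rune boundary, which is an added nicety rather than something the commit states.

```go
package main

import (
	"context"
	"fmt"
	"strings"
	"unicode/utf8"
)

const errMessageMaxLen = 1024

// execer is an Exec-shaped stand-in for the pool interface.
type execer interface {
	Exec(ctx context.Context, sql string, args ...any) error
}

type writer struct{ pool execer }

// truncate caps s at limit bytes without splitting a multi-byte rune.
func truncate(s string, limit int) string {
	if len(s) <= limit {
		return s
	}
	for limit > 0 && !utf8.RuneStart(s[limit]) {
		limit--
	}
	return s[:limit]
}

// record inserts one row synchronously; a nil writer or nil pool is a no-op.
func (w *writer) record(ctx context.Context, eventType, errMsg string) error {
	if w == nil || w.pool == nil {
		return nil
	}
	return w.pool.Exec(ctx,
		"INSERT INTO model_install_events (event_type, err_message) VALUES ($1, $2)",
		eventType, truncate(errMsg, errMessageMaxLen))
}

// fakePool records the last arguments so the demo needs no database.
type fakePool struct{ lastArgs []any }

func (f *fakePool) Exec(_ context.Context, _ string, args ...any) error {
	f.lastArgs = args
	return nil
}

func main() {
	fp := &fakePool{}
	w := &writer{pool: fp}
	_ = w.record(context.Background(), "pull", strings.Repeat("x", 5000))
	fmt.Println("stored err_message bytes:", len(fp.lastArgs[1].(string)))

	var degraded *writer // degraded mode: no pool configured
	fmt.Println("degraded write error:", degraded.record(context.Background(), "verify", ""))
}
```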

Wiring:
  internal/cli/model.go gets recordModelEvent(parent, cfg, row) helper:
  - Returns immediately if !cfg.TSDBEnabled || cfg.TSDBHost==""
  - 2s timeout on connect (TSDB unreachable doesn't block CLI exit)
  - Logs warning + degrades gracefully on any TSDB error
  Called from runModelPull (success + failure paths), runModelVerify
  (single sweep row), runModelRemove (success + failure paths).
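
The three wiring rules above (early return when TSDB is off, bounded wait, warn and carry on) fit in a small wrapper. This is a sketch under assumptions: recordEvent, tsdbConfig, and the injected write callback are hypothetical stand-ins for recordModelEvent and its config, and the demo uses a shortened timeout instead of the 2s from the commit.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"log/slog"
	"time"
)

// tsdbConfig holds the two fields the gate checks.
type tsdbConfig struct {
	Enabled bool
	Host    string
}

// recordEvent runs write under a deadline and never propagates an error:
// telemetry failures are logged and swallowed so the CLI can still exit
// cleanly. It reports whether a row was written.
func recordEvent(parent context.Context, cfg tsdbConfig, timeout time.Duration, write func(context.Context) error) bool {
	if !cfg.Enabled || cfg.Host == "" {
		return false
	}
	ctx, cancel := context.WithTimeout(parent, timeout)
	defer cancel()
	if err := write(ctx); err != nil {
		slog.Warn("model event not recorded", "err", err)
		return false
	}
	return true
}

func main() {
	cfg := tsdbConfig{Enabled: true, Host: "127.0.0.1"}
	// Simulate an unreachable TSDB: the write blocks until the deadline fires.
	ok := recordEvent(context.Background(), cfg, 50*time.Millisecond, func(ctx context.Context) error {
		<-ctx.Done()
		return errors.New("tsdb unreachable: " + ctx.Err().Error())
	})
	fmt.Println("recorded:", ok)
}
```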

Schema version bump:
  internal/config/config.go: TSDB_REQUIRED_SCHEMA_VERSION default 20→21.
  CI validator at .github/workflows/ci.yml:60-65 counts SQL files in
  internal/tsdb/migrations/ and asserts equality; now 21 files = 21
  in config = passes.

Build + lint clean. Existing tsdb / cli test suites green; no new tests
added for the writer itself (single INSERT mirrors V0017/V0018/V0019
patterns already covered; integration is operational verification at
Epic 6 once tsdb is up in the dev stack).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint MODEL-DIST-001 Epic 7 — operator-facing feature documentation
following the standard Why / Choices / How / How-to-use shape (memory:
feedback_per_feature_docs_required.md).

Contents:
- Why: gap between brew install and a working local LLM after Phase 13.5
- Choices: backend matrix (Ollama vs HF vs GitHub vs S3 vs file://),
  artifact form (fused vs adapter), Apple Silicon scope, "Ollama runtime
  rejected (broken on M5+macOS 26.3.x), Ollama distribution only"
- How it works: ASCII flow diagram covering CLI dispatch -> Fetcher
  interface -> OllamaFetcher (preflight, ollama pull, manifest discovery,
  blob resolve, symlink, SHA verify) -> V0021 observability row
- How to use:
    * Quick start (3 commands: brew install ollama, mdemg model pull,
      curl /v1/models)
    * Explicit quant selection
    * Managing pulled models (list / verify / where / remove)
    * Forks + enterprise (MDEMG_MODEL_NAMESPACE override)
    * Air-gapped (MDEMG_MODEL_MANIFEST_PATH override)
    * Resource matrix per quant (disk, min RAM, recommended RAM, BPW)
    * Full Configurability Contract table (11 env vars + flags + defaults)
    * V0021 observability schema
- Troubleshooting: ollama missing, SHA mismatch, quant allowlist
  rejection, RAM auto-detection failure, out-of-disk, symlink permission
- Forward-looking: MODEL-DIST-002 adapter, Sprint B Grafana panels,
  future backends, cross-platform
- References: all source-of-truth files cross-linked

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint MODEL-DIST-001 Epic 8 — final epic, never cut (memory:
feedback_sequential_epics.md).

This commit lands the main-repo doc updates. The packaging/homebrew-mdemg/
submodule docs (README, CHANGELOG, formula caveats text) update at
v0.10.0 release-tag time per the v0.9.0 release flow precedent — that's
when goreleaser auto-regenerates mdemg.rb from .goreleaser.yaml's caveats
template, and the tap-side README/CHANGELOG get edited in lockstep.

Changes:
- CHANGELOG.md: comprehensive Unreleased entry documenting Epics 0-5 + 7
  landed in this sprint. Epic 3 ollama push and Epic 6 Tier 3 e2e marked
  as gated on operator confirmation. Adapter path explicitly deferred to
  MODEL-DIST-002 with epic_2_forensic.md cross-reference. Captures the
  Configurability Contract enumeration, the 3 quant SHAs, the Fetcher
  interface design, the V0021 hypertable, and the explicit out-of-scope
  list.
- CLAUDE.md: new "Model Distribution (Sprint MODEL-DIST-001)" subsection
  in Architecture Notes, slotted ABOVE the existing Compose embed entry
  for visibility. Captures the pluggable-backend design, the Ollama-as-
  distribution-only constraint, the on-disk symlink + manifest discovery
  flow, the 11-knob Configurability Contract surface, the no-hardcoding
  enforcement, the TSDB V0021 hookup, and the Apple Silicon v1 scope.
- README.md: new "Step 2b (optional): Pull the local LLM" section
  between Step 2 (Initialize/Start) and Open the Dashboard. 3-command
  quick start (brew install ollama -> mdemg model pull -> set
  MDEMG_MODEL_PATH). Cross-references the feature doc for the full
  Configurability Contract.
- .goreleaser.yaml: caveats template updated to include `mdemg model pull`
  instructions. Goreleaser regenerates the homebrew formula's caveats
  block from this on the next v* tag push, so v0.10.0 will ship the new
  text to brew users automatically.

Deferred to v0.10.0 release-tag time (handled per v0.9.0 precedent):
- packaging/homebrew-mdemg/README.md update
- packaging/homebrew-mdemg/CHANGELOG.md update
- packaging/homebrew-mdemg/mdemg.rb regeneration (automatic via
  goreleaser from the .goreleaser.yaml change in this commit)
- Submodule pointer bump in main repo

Deferred to Epic 6 close (after operator does ollama push):
- post.md sprint-close document
- Capture of remote Ollama manifest digests into quant_manifest.json

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
All 3 fused quants now live on Ollama Library:
  https://ollama.com/reh3376/mdemg-llm-v1:Q4_K_M
  https://ollama.com/reh3376/mdemg-llm-v1:Q5_K_M
  https://ollama.com/reh3376/mdemg-llm-v1:Q8_0

End-to-end integrity verified: remote model-layer digests captured via
GET https://registry.ollama.ai/v2/reh3376/mdemg-llm-v1/manifests/<quant>
match the local Epic 1 SHAs exactly:
  Q4_K_M  401161710c22f0ae...411d42ea  (matches Epic 1)
  Q5_K_M  144ad723101d688f...d5f5d54  (matches Epic 1)
  Q8_0    fc14dcb40af1bb58...8db6089  (matches Epic 1)

Captured into quant_manifest.json (both docs canonical + internal/cli
embed.FS mirror, byte-synced):
- ollama_manifest_digest per quant (computed from the manifest body):
    Q4_K_M  sha256:a210cccb12311773fd70bfa81f221ca0f7940a315bef87b84608caf894533b1b
    Q5_K_M  sha256:ae6e54fe1ee0b487ae41260687ed14c46c30d1ffb0fece936282418b5bcb78e1
    Q8_0    sha256:93df4d64bfa751506f7afba8bf08b891ea828575b838adec17b9399ad85be718
- Corrected size_bytes (Epic 1 used approximate values; replaced with
  registry-reported exact bytes for each tag; the GB figures below are
  binary gigabytes, i.e. bytes / 2^30):
    Q4_K_M   9.0 GB ->  8.4 GB (9001753408 B; was 9658404096)
    Q5_K_M  11 GB   ->  9.8 GB (10514569568 B; was 11811160064)
    Q8_0    16 GB   -> 14.6 GB (15698534208 B; was 17179869184)
- Status flipped from "local-create done; push pending" to "published".
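
The human-readable sizes in the list above correspond to the byte counts divided by 2^30. A short sketch that prints both binary and decimal gigabytes for the three registry-reported values (gib and gb are hypothetical helper names):

```go
package main

import "fmt"

// gib converts a byte count to binary gigabytes (2^30 bytes).
func gib(b int64) float64 { return float64(b) / (1 << 30) }

// gb converts a byte count to decimal gigabytes (10^9 bytes).
func gb(b int64) float64 { return float64(b) / 1e9 }

func main() {
	sizes := []struct {
		quant string
		bytes int64
	}{
		{"Q4_K_M", 9001753408},
		{"Q5_K_M", 10514569568},
		{"Q8_0", 15698534208},
	}
	for _, s := range sizes {
		fmt.Printf("%-7s %11d B = %.1f GiB = %.1f GB\n", s.quant, s.bytes, gib(s.bytes), gb(s.bytes))
	}
}
```

In decimal units the same files come to roughly 9.0, 10.5, and 15.7 GB, which is worth keeping in mind when comparing against disk-usage tools that report powers of ten.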

Embedded runtime manifest (internal/cli/quant_manifest.json) re-built into
the binary via embed.FS. TestLoadQuantManifest_EmbeddedFallback green
with new values.

Epic 3 of Sprint MODEL-DIST-001 now COMPLETE. Epic 6 (Tier 3 live e2e —
`mdemg model pull` against the published tags + llama-server load on
port 18102 + sanity inference) is now unblocked.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint MODEL-DIST-001 close-out per memory rule
(feedback_sprint_plan_format.md §11 — sprint plans live in
docs/development/<sprint-line>/ with the standard post.md companion).

Sections (CLAUDE.md sprint-plan section guidance):
- Outcome: 3 quants live on Ollama Library, mdemg model pull is the
  canonical install path
- Process: how the plan held under reality (operator-surfaced no-
  hardcoding rule revised the plan in-place to add the Configurability
  Contract before code was written)
- Findings: 5 smooth parts + 5 friction items, both honest:
    * convert_hf_to_gguf.py python deps gap (silent ModuleNotFoundError)
    * mlx_lm.fuse adapter-path requirement
    * convert_lora_to_gguf.py missing from brew install llama.cpp
      (proximate Epic 2 deferral trigger)
    * mdemg tsdb migrate CWD-aware .env loader quirk
    * Epic 1 size estimates off vs registry-reported exact bytes
- Current state: per-layer state matrix
- Testing & benchmarking: all 3 tiers documented (Tier 3 e2e captured
  V0021 rows for both pull + verify event_types — live-verified)
- Risks & opportunities (forward): MODEL-DIST-002 adapter scope, Sprint
  B Grafana, cross-platform, HFFetcher slot, CWD-aware .env loader QoL
- Sprint commits: 9 commits on dev01, mapped to their epics

Closes Sprint MODEL-DIST-001 functionally. Operational sprint close
(v0.10.0 release tag + tap-repo doc updates) is a separate motion.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…shed main)

PR #385 squash-merged the original Epic 3 quant_manifest values (estimated
sizes from llama-quantize wall output, null ollama_manifest_digest because
the push hadn't happened yet) into main as commit f1d029a. Meanwhile on
dev01, commit 87293f8 (Epic 3 closeout) corrected those values to the
registry-canonical state after the ollama push completed:

- size_bytes: replaced Epic 1 approximations with registry-reported exact
  bytes (Q4_K_M 9001753408 / Q5_K_M 10514569568 / Q8_0 15698534208)
- size_human: 9.0/11/16 GB -> 8.4/9.8/14.6 GB (more accurate)
- ollama_manifest_digest: null -> sha256:a210cccb...|ae6e54fe...|93df4d64...
- status: "local-create done; push pending" -> "published (...)"

Conflict resolution: keep dev01 (HEAD) values for both files — those are
the registry-canonical post-push state. JSON validity verified for both
files; TestLoadQuantManifest_{EmbeddedFallback,OperatorOverride,OverrideMissingFile}
all green against the resolved embedded manifest.

The non-conflicting fast-forwarded changes from main (claude workflow
edits + dependabot go.mod/go.sum bumps) are folded in by this merge
unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Promote the Sprint MODEL-DIST-001 entry from Unreleased to v0.10.0
(2026-05-11) ahead of release.yml / goreleaser tag push. Fresh empty
Unreleased section seeded above.

v0.10.0 ships:
- mdemg model pull|list|verify|remove|where — one-command path from
  brew install mdemg to a working local LLM
- Pluggable ModelFetcher interface (Ollama in v1, slots for HF/S3/GHR/file)
- 3 fused GGUF quants live on Ollama Library at reh3376/mdemg-llm-v1
  (:Q4_K_M 8.4 GB / :Q5_K_M 9.8 GB / :Q8_0 14.6 GB)
- 11-knob Configurability Contract (every operator-visible value dynamic)
- TSDB V0021 model_install_events hypertable + writer
- docs/features/local-model-distribution.md

Adapter (LoRA-only) path deferred to MODEL-DIST-002 per the sprint plan's
documented contingency (epic_2_forensic.md).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ce Model Distribution section

Stage 4 + Stage 5 of v0.10.0 release.

Submodule pointer bump:
  packaging/homebrew-mdemg 6077097 -> c3aa68b
incorporates:
- 42d7390 — goreleaser auto-bumped mdemg.rb to version "0.10.0" + new
  caveats text on v0.10.0 tag push
- c3aa68b — manual docs round-trip: CHANGELOG v0.10.0 entry,
  README Optional Pull-the-local-LLM section in Quick Start (full
  Ollama Library doc with quant matrix, list/verify/where/remove
  subcommands, fork variants via MDEMG_MODEL_NAMESPACE, architecture
  note "Ollama is distribution-only"), Upgrading to v0.10.0 +
  What's New in v0.10.0 blocks, default-LLM rotation history extended,
  mdemg_beta_testing.md version pin v0.9.0 -> v0.10.0

docs/user/cli-reference.md (per Stage 5 user request to align refs
with current codebase):
- New ## Model Distribution top-level section before ## Synergy
  Optimization (model command group is GroupID="config" in root.go
  but a top-level cli-ref section is cleaner for discoverability).
  Documents all 5 subcommands (pull, list, verify, remove, where) with
  flag tables, usage examples, the full Configurability Contract (11
  knobs), the architecture note (Ollama is distribution-only).
- Updated Environment Variable Reference with new "Model Distribution
  (Sprint MODEL-DIST-001, v0.10.0)" subsection — 11 env vars +
  defaults table.
- Updated Command Tree Summary with the new model subcommand group
  slotted between Configuration and Advanced.

docs/user/api-reference.md unchanged: Sprint MODEL-DIST-001 added zero
HTTP endpoints (CLI-only sprint; observability via TSDB V0021 row
writer is server-side internal). Audit also surfaced ~25 routes of
pre-existing drift between code and docs (mostly path-parameter
notation: `/v1/backup/` in code vs `/v1/backup/{id}` in docs — same
routes — plus 3 undocumented /api/graph/* endpoints and 2
undocumented /v1/admin/features/{restart,stop} actions). That drift
is out-of-scope for v0.10.0 and belongs in its own follow-up sprint.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
One-shot or interactive REPL chat against the configured LLM endpoint
(default: llama-server at port 8102 per Phase 13.5). Closes the gap
operators noted between `ollama run` and the mdemg framework.

Two modes:
- One-shot: `mdemg model run -p "hello"` or positional arg after `--`
- Interactive REPL: no prompt; reads stdin line-by-line, accumulates
  conversation history across turns

Pure stdlib HTTP (no llmclient retries/breakers/recording). CLI
invocations are intentionally NOT recorded to llm_interactions: this
is an ad-hoc exploration tool, not a production code path, and skipping
the recording keeps the training-data corpus clean.

Every operator-visible value is dynamic per the no-hardcoding rule:
  --endpoint   override cfg.EffectiveLLMEndpoint
  --model      override cfg.LLMModel (final fallback: mdemg-llm-v1)
  --prompt/-p  one-shot prompt (omit for REPL)
  --system/-s  system message
  --temperature (default 0.7)
  --max-tokens (default 1024)
  --timeout    (default 60s)

Live-verified end-to-end on the operator's running llama-server on
port 8102 with mdemg-llm-v1: one-shot worked; system+prompt with
--model override worked.

13 unit tests in model_run_test.go covering: message composition
(system first, no-system skip, history preservation), config
resolution (flag > cfg > final fallback), OpenAI-compat HTTP shape,
error paths (HTTP error, inline error object, no choices, timeout),
trailing-slash endpoint normalization, body-bounding helper. All green.

Renamed local body-bounding helper to `truncateRunBody` to avoid name
collision with a same-named helper in internal/cli/data.go.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Audit of internal/api/server.go (167 routes) vs docs/user/api-reference.md
surfaced 19 genuinely missing endpoints. v0.10.0 commit noted this as
out-of-scope; this commit resolves the gap.

Audit method: extract mux.HandleFunc registrations from server.go, extract
documented "VERB /path" headings from api-reference.md, normalize both to
strip path parameters and trailing prefix slashes, diff. Of the initial
24-entry code-only set, 5 are false positives (combined headers like
"POST /v1/admin/features/start|stop|restart" cover the individual verbs;
"GET|POST /v1/jiminy/protocol/metrics" covers both methods on one route).

Added sections:

Jiminy / J17 (10 endpoints, all under "## Jiminy Inner-Voice"):
  GET|POST /v1/jiminy/protocol/metrics    # snapshot + reset
  GET /v1/jiminy/protocol/status          # per-session J17 state
  POST /v1/jiminy/checkpoint              # tier-transition checkpoint
  POST /v1/jiminy/resume-protocol         # restore from checkpoint
  POST /v1/jiminy/extension               # operator-driven tier hold
  POST /v1/jiminy/strict                  # toggle strict mode per session
  POST /v1/jiminy/reformulate             # advisory -> imperative rewrite
  POST /v1/jiminy/classify                # pre-Write/Edit pass/deny gate
  GET /v1/jiminy/latest                   # most recent guidance (warm store)
  POST /v1/jiminy/warm                    # eager cache warmup

Memory / Graph (3 endpoints, under "## Memory Operations"):
  GET /v1/memory/graph/topology           # node/edge counts per layer
  GET /v1/memory/graph/neighborhood       # local 1-3 hop walk
  GET /v1/memory/spaces                   # root listing of all spaces

Observability (2 endpoints, under "## Metrics & Monitoring"):
  GET /v1/metrics/trends                  # TSDB time-series query
  GET /v1/prometheus                      # Prometheus scrape endpoint

Dashboard / Viz (4 endpoints, new "## Dashboard / Visualization (internal)"
section before MCP Server Tools — operator-internal endpoints backing the
browser dashboard at /ui/):
  GET /api/graph/data                     # force-directed graph data
  GET /api/graph/fields                   # schema field catalog
  GET /api/graph/health                   # explorer health
  GET /viz/topology                       # standalone HTML topology view

Each entry has handler-signature-derived request/response shape, query
parameter table, sample curl/JSON examples following the existing
api-reference convention. TOC updated with new "Dashboard / Visualization
(internal)" entry and renumbered tail.

Out of scope (deliberate, deferred):
- 28 "docs-only" entries from the audit are confirmed false positives
  from prefix-matching path normalization (code registers /v1/memory/nodes/
  with trailing slash and routes the suffix; docs spell out the full
  /v1/memory/nodes/{node_id}/archive form correctly)
- /v1/symbols root path is partially covered by /v1/symbols/relationships
  + /v1/symbols/{id}/relationships in docs; root listing endpoint
  documentation can land later if/when its handler grows specific shape
- /v1/conversation/observations covered indirectly by the flag-for-org
  endpoint documentation
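
The audit method described in this commit (normalize both sides, then diff) can be sketched in Python as below. This is an illustration, not the script used for the audit: the names `normalize` and `diff_routes` are hypothetical, and the sketch assumes the comparison is made on the path alone, ignoring the HTTP verb. It also reproduces the prefix-matching false positive noted above for /v1/memory/nodes/.

```python
import re


def normalize(entry: str) -> str:
    """Reduce a route or a 'VERB /path' heading to a comparable path key."""
    path = entry.split()[-1]                # drop a leading verb such as GET or GET|POST
    path = re.sub(r"\{[^}]*\}", "", path)   # remove {param} placeholders
    path = re.sub(r"/+", "/", path)         # collapse slashes left behind by the removal
    return path.rstrip("/") or "/"          # drop the trailing prefix slash


def diff_routes(code_routes, doc_headings):
    """Return (code_only, docs_only) after normalizing both sides."""
    code = {normalize(r) for r in code_routes}
    docs = {normalize(h) for h in doc_headings}
    return sorted(code - docs), sorted(docs - code)


code_only, docs_only = diff_routes(
    ["/v1/backup/", "/v1/memory/nodes/", "/api/graph/data"],
    ["/v1/backup/{id}", "/v1/memory/nodes/{node_id}/archive"],
)
print(code_only)   # /api/graph/data is genuinely undocumented in this toy input
print(docs_only)   # the archive route shows up only because of prefix registration
```

Both /v1/backup forms collapse to the same key, while the prefix-registered /v1/memory/nodes/ route appears on both sides of the diff even though code and docs describe the same handler.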

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint GRAFANA-AUDIT-001 Epic 0. Builds the per-panel audit harness:
walks every panel in deploy/docker/grafana/dashboards/*.json, extracts
rawSql/sql targets, substitutes Grafana macros ($__timeFilter,
$__timeFrom/To, $__interval, $__unixEpoch*) + template variables
($space_id, $instance + multi-value variants like ${space_id:raw}),
executes via docker exec mdemg-timescaledb-1 psql, classifies each
panel target as PASS / EMPTY / FAIL / SKIP.
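
A minimal Python sketch of the substitute-then-classify flow described above. It is not the harness itself: `substitute` and `classify` are hypothetical names, only the $__timeFilter macro and the three template-variable syntaxes are handled, and the table name and timestamps in the example are made up for illustration.

```python
import re


def substitute(sql, variables, time_from, time_to):
    """Expand $__timeFilter(col) and template variables into plain SQL."""
    sql = re.sub(
        r"\$__timeFilter\(([^)]+)\)",
        lambda m: f"{m.group(1)} BETWEEN '{time_from}' AND '{time_to}'",
        sql,
    )
    for name, value in variables.items():
        # Matches ${name:format}, ${name}, and bare $name.
        pattern = (r"\$\{" + re.escape(name) + r"(?::\w+)?\}"
                   + r"|\$" + re.escape(name) + r"\b")
        sql = re.sub(pattern, lambda m, v=value: v, sql)
    return sql


def classify(sql, error, row_count):
    """Map one target execution to PASS / EMPTY / FAIL / SKIP."""
    if not sql:
        return "SKIP"      # non-SQL panel target
    if error is not None:
        return "FAIL"      # SQL error
    return "PASS" if row_count > 0 else "EMPTY"


rendered = substitute(
    "SELECT count(*) FROM metrics WHERE $__timeFilter(time) "
    "AND space_id = '${space_id:raw}'",
    {"space_id": "mdemg-dev"},
    "2026-05-20T00:00:00Z",
    "2026-05-21T00:00:00Z",
)
print(rendered)
```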

Tier 1 unit tests (17 tests, all green):
- Template-variable substitution: time_filter / from-to / unix epoch /
  interval / interval_ms / space_id (3 syntaxes) / instance (3
  syntaxes) / multi-macro composite query
- Table extraction (FROM/JOIN with alias, case-insensitive, no-table)
- Panel walking (flat, nested rows, targets-with-sql vs no-sql)

Smoke test against mdemg-overview.json IMMEDIATELY validated the
operator's "diminished observability" report — 5 of 13 panels FAIL,
1 EMPTY, 7 PASS on the front-page dashboard:
  FAIL  Request Rate
  FAIL  Error Rate
  FAIL  Circuit Breakers
  FAIL  Requests by Status
  FAIL  Rate Limit Rejections
  EMPTY Request Latency Distribution (t0; t1/t2 PASS)

The original 11-panel sample missed these because it sampled different
panels. Lesson: trust the rigorous audit, not the sample. Sprint
proceeds to Epic 1 (full audit across all 146 panels) immediately.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint GRAFANA-AUDIT-001 Epics 1 + 2. Per-panel rigorous audit of all
165 target executions across 146 panels in 8 dashboards.

Headline:
  PASS  125 (76%) — executes, returns rows in 24h window
  EMPTY  19 (12%) — executes, 0 rows
  FAIL    3 (2%)  — SQL error
  SKIP   18 (11%) — non-SQL panel types

Harness fix mid-Epic-1: $__interval substitution was wrapping the
value in quotes, but Grafana convention has panel SQL provide its own
outer quotes — producing doubled quotes and 18 false-positive FAILs.
Fixed: substitute bare value. Verified by re-run: 20→3 FAILs.
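
The doubled-quote failure is easy to reproduce in isolation. The sketch below assumes a panel query of the form time_bucket('$__interval', time), which is an illustrative guess at the panel SQL rather than a quote from the dashboards; `substitute_interval` is a hypothetical helper.

```python
import re


def substitute_interval(sql, interval, quote=False):
    """Replace $__interval, optionally wrapping the value in quotes (the old bug)."""
    value = f"'{interval}'" if quote else interval
    # \b keeps $__interval_ms from being partially rewritten.
    return re.sub(r"\$__interval\b", lambda m: value, sql)


panel_sql = "SELECT time_bucket('$__interval', time) FROM metrics"

buggy = substitute_interval(panel_sql, "1m", quote=True)
fixed = substitute_interval(panel_sql, "1m")
print(buggy)   # the panel's own quotes plus the harness quotes
print(fixed)   # bare value inside the panel's quotes
```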

Real failures (Epic 2 findings):

(a) 3 SQL bugs on mdemg-llm-routing.json — all three panels hardcoded
    `mdemg-dev` (unquoted) in WHERE clauses instead of the '$space_id'
    template variable. PG parses `mdemg-dev` as subtraction.

(b) 5 schema-drift EMPTYs — panel filter expects metric_type or labels
    shape that doesn't match server emission:
    - mdemg_j17_events_total: panel 'counter', server 'gauge'
    - mdemg_rsic_action_total: panel status='success', server status='completed'
    - 2 more suspected pending full-SQL inspection.

(c) 2 missing-server-side metrics — mdemg_rate_limit_rejected_total
    and mdemg_http_request_duration_seconds_p50 not emitted. Will be
    documented; server emission is follow-up.

(d) ~11 sparse-data EMPTYs — panel SQL correct, no rows in 24h window.
    Widening time-range in Epic 4.

Projected post-Epic-3/4: 133 PASS, ≤11 EMPTY, 0 FAIL.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint GRAFANA-AUDIT-001 Epic 3. Minimum-change JSON edits to fix
category (a) SQL bugs and category (b) schema-drift EMPTYs identified
in Epic 1/2.

mdemg-llm-routing.json (3 panels, all category-a SQL bugs):
  - LLM call distribution by model_name (24h)
  - LLM latency p50 / p95 / p99 by task × model
  - LLM error rate % by task_name (selected range)
  Bug: WHERE clause was `($space_id = '' OR space_id = '$space_id')`.
  The first $space_id was unquoted, so after substitution PG parsed
  `mdemg-dev = ''` as the expression `mdemg - dev = ''` and failed on
  the nonexistent column `mdemg`. Also breached the
  no-hardcoding rule (memory: feedback_no_hardcoded_values.md).
  Fix: wrap the first variable reference in quotes → `('$space_id' =
  '' OR space_id = '$space_id')`, a proper string-literal comparison
  that also serves as the All-spaces guard the panel author intended.
  Verdict: 3 FAIL -> 3 PASS. mdemg-llm-routing is now 4/4 PASS.

mdemg-j17.json :: Total Events (1 panel, category-b drift):
  Panel filtered `metric_type = 'counter'` (Prometheus naming
  convention because metric is `mdemg_j17_events_total`). Server
  actually emits `metric_type = 'gauge'`. 6,393 rows in 7d; 0 panel
  matches. Fix: align panel filter to `'gauge'`.
  Verdict: EMPTY -> PASS.

mdemg-rsic.json :: Action Success Rate t0 (1 panel target, category-b
drift):
  Panel filtered `labels->>'status' = 'success'`. Server actually
  emits `'completed'` (181 rows in 24h; 0 panel matches). Fix: align
  panel filter to `'completed'`. The t1 'failed' target is retained
  unchanged; its EMPTY result is now an accurate observation (server
  emits no `'failed'` actions; 0 = legitimate zero).
  Verdict: 1/2 EMPTY -> PASS, 1/2 EMPTY accurate-zero.

Audit verdict counts:
  Before: 125 PASS, 19 EMPTY, 3 FAIL, 18 SKIP
  After:  130 PASS, 17 EMPTY, 0 FAIL, 18 SKIP

Remaining 17 EMPTYs (Epic 4 disposition):
  - 5 category-c emission regression — 4 rsic metrics stopped at
    2026-05-07/08 (server-side investigation queued as follow-up)
  - 2 category-c never-emitted — Rate Limit Rejections, p50 latency
  - 8 category-d sparse-data on ft-training — widen time-range
  - 1 mdemg-jiminy :: Effectiveness Trends — CTE pending inspection
  - 1 mdemg-rsic :: Action Success Rate t1 (accurate-zero)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint GRAFANA-AUDIT-001 closeout (Epics 4 + 5 + 6 + 7 combined as a
single doc-only commit; Epic 5 deferred and Epic 6 deferred-to-operator
as documented in post.md).

New: docs/features/observability-dashboards.md (286 lines) — full
operator-facing inventory of the 8 dashboards with:
- Per-dashboard purpose + panel count + primary tables
- Audit verdict table (130/17/0/18 post-Epic-3)
- Epic 3 fix log: 3 SQL bugs + 2 schema-drift filters
- Known gaps in 3 buckets: (c) emission regression (4 May-7-8 metrics,
  current codebase has zero refs — server removed emission), (c)
  never-emitted (mdemg_rate_limit_rejected_total +
  mdemg_http_request_duration_seconds_p50), (d) sparse/zero data on
  this dev TSDB (ft-training tables)
- Refresh expectations per table
- Operator playbook for re-running scripts/grafana_panel_audit.py
- Forward-looking: CI integration, coverage expansion, server-side
  emission restore

New: docs/development/grafana-audit-001/post.md — sprint close per
memory rule, covers process / smooth-parts / friction / sprint-plan
vs reality / current state / risks-opportunities / commits.

Epic deferrals (documented in post.md):
- Epic 5 (coverage expansion for 11 unused TSDB tables): deferred
  because most target tables are zero on this dev TSDB. Adding panels
  would create more EMPTYs, defeating the goal.
- Epic 6 (Tier 3 browser e2e): deferred to operator; not blocking.

CHANGELOG Unreleased entry covers the sprint at high level + cross-
references the feature doc.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Sprint MODEL-DIST-002 picks up the adapter-only path deferred from
MODEL-DIST-001 Epic 2. Resolves the tooling gap documented in
epic_2_forensic.md.

Workspace prep:
- Vendored convert_lora_to_gguf.py from llama.cpp source (master, pinned
  2026-05-21) into scripts/vendor/llama_cpp/ with MIT LICENSE attribution
  and a README documenting refresh policy. brew install llama.cpp ships
  convert_hf_to_gguf.py but NOT convert_lora_to_gguf.py; vendoring is the
  cleanest path (vs requiring operators to clone llama.cpp source).
- pip install peft==0.19.1 + accelerate==1.13.0 + psutil==7.2.2 into
  neural/.venv (the same venv that has torch + transformers + gguf from
  MODEL-DIST-001 Epic 1). PEFT is needed for PEFT-format schema validation
  + as a dependency of convert_lora_to_gguf.py.
- Inspected convert_lora_to_gguf.py — expects directory with
  adapter_config.json + adapter_model.safetensors in PEFT layout. Confirms
  the MLX → PEFT direction is `lora_A: (rank, input)` and
  `lora_B: (output, rank)` (script line 41-42 docstring).

Sprint plan in 12-section v1.0 format. 7 epics, 1-2 dev-day estimate.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…e verify

Sprint MODEL-DIST-002 Epics 1, 2, 3 (combined commit).

Epic 1 — MLX → PEFT converter (scripts/mlx_adapter_to_peft.py + 14 unit tests):
  Reads adapters/tier1/adapters.safetensors (514 MB MLX format, 560 tensors,
  Phase 5 SFT Iter 2400 best). Per the analysis in MODEL-DIST-001
  epic_2_forensic.md:
    Key rename: model.layers.<N>.<module>.lora_a
                -> base_model.model.model.layers.<N>.<module>.lora_A.weight
    Tensor transpose: lora_a (input,rank) -> (rank,input)
                     lora_b (rank,output) -> (output,rank)
  Emits PEFT-format adapter_config.json + adapter_model.safetensors.
  Single-adapter PEFT layout (.lora_A.weight, not .lora_A.default.weight)
  required by convert_lora_to_gguf.py.
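
The key rename and transpose can be sketched in plain Python. This is an illustration, not scripts/mlx_adapter_to_peft.py: nested lists stand in for tensors, the helper names are hypothetical, the module name self_attn.q_proj is only an example, and the lora_b rename is assumed to mirror the lora_a rename shown above.

```python
import re

# MLX key shape taken from the commit message: model.layers.<N>.<module>.lora_a
_MLX_KEY = re.compile(r"^(model\.layers\.\d+\..+)\.lora_([ab])$")


def peft_key(mlx_key):
    """Rename an MLX LoRA key to the single-adapter PEFT layout."""
    match = _MLX_KEY.match(mlx_key)
    if match is None:
        raise ValueError(f"not an MLX LoRA key: {mlx_key}")
    prefix, which = match.groups()
    return f"base_model.model.{prefix}.lora_{which.upper()}.weight"


def transpose(matrix):
    """Swap the two axes of a 2-D tensor given as a list of rows."""
    return [list(column) for column in zip(*matrix)]


def convert(mlx_tensors):
    """Apply the rename and the transpose to every LoRA tensor."""
    return {peft_key(key): transpose(value) for key, value in mlx_tensors.items()}


# lora_a with shape (input=3, rank=2) becomes lora_A with shape (rank=2, input=3).
converted = convert({
    "model.layers.0.self_attn.q_proj.lora_a": [[1, 2], [3, 4], [5, 6]],
})
print(converted)
```

The same arithmetic explains the expected tensor count quoted later in this log: 40 layers x 7 target modules x 2 matrices = 560.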

Epic 2 — PEFT → GGUF LoRA (scripts/vendor/llama_cpp/convert_lora_to_gguf.py):
  Pinned to llama.cpp release b9000 (self-contained version; upstream master
  refactored to a conversion/ Python package with 30+ model files, excessive
  vendoring scope). README documents refresh policy.
  Output: .local-models/mdemg-llm-v1-adapter.gguf
    SHA256: 0cfaf4bae3215a4aea664a8d28ae9a41d73ee740cbcce5c2eef950232cfe1de5
    Size: 257 MB (vs ~9 GB fused Q5_K_M; ~35x smaller download)
    Tensor count: 560 (matches expected 40 layers x 7 target_modules x 2)

Epic 3 — Live verification (docs/development/model-dist-002/verification.md):
  Side-port llama-server on 127.0.0.1:18103 with f16 base + adapter; sanity
  prompt vs production 8102 fused model returns semantically-aligned outputs
  on the same prompt — both describe MDEMG as a knowledge-graph memory
  system. Confirms the MLX-PEFT-GGUF chain is structurally correct.

Iteration during Epic 2 (worth noting):
  - Initial vendored convert_lora_to_gguf.py from upstream master failed
    with ImportError (refactored to use conversion/ package). Pinned to
    b9000 release which is self-contained.
  - Initial PEFT keys used .default.weight suffix (multi-adapter layout);
    convert_lora_to_gguf.py rejected with "Not a lora_A or lora_B tensor."
    Switched to single-adapter layout (.weight) which the script accepts.

Test results: 14/14 Tier 1 tests green; PEFT output loads via
peft.PeftConfig.from_pretrained; GGUF emission completes with all 560
tensors; runtime adapter application produces coherent outputs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
rhenley1958 and others added 22 commits May 27, 2026 12:37
…ENTGRAPH-001 Epic 3)

ApplyCoactivation Cypher RETURN clause extended from "count(*) AS updated"
to 17 per-pair columns: src/dst node IDs, prev/new/delta weight,
evidence_count_after, eta_effective (cfg.LearningEta × etaMult),
surprise_factor, activation_product, path_sim, role_a/b, obs_type_a/b,
session_id, direction (forward/reverse/bidirectional), created_new_edge.

created_new_edge derived from (r.evidence_count = 1) — the ON CREATE
branch sets evidence_count to 1; ON MATCH increments. Reliable proxy
for "new connection formed" vs "existing connection strengthened" at
analysis time.

Plan-deviation disclosure (per feedback_plan_options_pattern.md): the
plan called for 2 rows per pair in asymmetric mode (forward + reverse).
The Cypher mirrors rr.weight = r.weight at all times — forward and
reverse edges carry identical weights. Emitting 2 rows would double-
count without adding signal. Final choice: 1 row per logical pair
regardless of mode, with the direction column carrying the
forward/reverse/bidirectional distinction. Revisit if EVENTGRAPH-003
introduces a Hebbian path where forward/reverse weights diverge.

New helper internal/learning/reinforcement_parser.go translates a
neo4j.Record (or any (key) → (any, bool) getter) into a
tsdb.ReinforcementEventRow. Lives in its own file so service.go
doesn't grow. Defensive against missing keys (zero values), nil values
(zero/empty), wrong types (fallback to zero) — no panics.
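
The defensive behaviour described above (missing key, nil value, or wrong type all fall back to a zero value) looks roughly like this in Python. The real helper is Go and its field set is larger; the key names src_node_id, dst_node_id and delta_weight are hypothetical, and the getter is modelled as a function returning a (value, ok) pair.

```python
def dict_getter(record):
    """Adapt a dict to the (key) -> (value, ok) getter shape."""
    return lambda key: (record.get(key), key in record)


def _number(get, key):
    value, ok = get(key)
    # bool is a subclass of int in Python, so it is excluded explicitly.
    if not ok or isinstance(value, bool) or not isinstance(value, (int, float)):
        return 0
    return value


def _text(get, key):
    value, ok = get(key)
    return value if ok and isinstance(value, str) else ""


def _flag(get, key):
    value, ok = get(key)
    return ok and value is True


def parse_row(get):
    """Build a row dict; never raises on missing, None, or wrong-typed fields."""
    return {
        "src_node_id": _text(get, "src_node_id"),          # hypothetical key name
        "dst_node_id": _text(get, "dst_node_id"),          # hypothetical key name
        "delta_weight": float(_number(get, "delta_weight")),  # hypothetical key name
        "evidence_count_after": int(_number(get, "evidence_count_after")),
        "direction": _text(get, "direction"),
        "created_new_edge": _flag(get, "created_new_edge"),
    }


good = parse_row(dict_getter({
    "src_node_id": "a", "dst_node_id": "b", "delta_weight": 0.05,
    "evidence_count_after": 1, "direction": "forward", "created_new_edge": True,
}))
bad = parse_row(dict_getter({"src_node_id": None, "evidence_count_after": "seven"}))
print(good)
print(bad)
```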

Tier 1 unit tests (6 green) cover:
- Symmetric bidirectional + ON CREATE branch
- Asymmetric forward + ON MATCH branch (evidence > 1)
- Missing optional fields → zero values (nullable* writer helpers
  serialize as DB NULL)
- Neo4j int64 → Go int coercion
- nil values → zero/empty
- Wrong-typed values → graceful fallback

Reinforcement rows are captured locally in ApplyCoactivation but not
yet forwarded to TSDB — Epic 4 wires the writer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…pic 4)

learning.Service grows a reinforcementWriter field + SetReinforcementWriter
setter (mirrors the SetStabilityReinforcer back-compat pattern). After
ExecuteWrite returns from ApplyCoactivation, each captured per-pair row
gets the spaceID stamped on it and is enqueued via writer.Record. The
writer is non-blocking; the Hebbian hot path never waits on TSDB.

Configurability Contract — 7 new env vars (no-hardcoding rule):
  - EVENTGRAPH_ENABLED (bool, default true)
  - EVENTGRAPH_WRITER_FLUSH_INTERVAL_SEC (int, default 30, floor 5)
  - EVENTGRAPH_WRITER_BUFFER_SIZE (int, default 1000, 0 = unlimited)
  - EVENTGRAPH_MAX_PAIRS_PER_EVENT_BATCH (int, default 200)
  - EVENTGRAPH_MAX_EVENTS_PER_QUERY (int, default 500, Epic 5 ceiling)
  - EVENTGRAPH_FEDERATION_DEFAULT_HOPS (int, default 2)
  - EVENTGRAPH_FEDERATION_DEFAULT_LOOKBACK_HOURS (int, default 24)
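
As a rough illustration of the "default plus floor" semantics in the list above, the Python sketch below reads an integer knob from the environment. The server's config loader is Go; the helper name `env_int` is hypothetical, and clamping a below-floor value up to the floor (rather than rejecting it) is an assumption.

```python
import os


def env_int(name, default, floor=None, env=None):
    """Read an int env var, falling back to default and clamping to floor."""
    env = os.environ if env is None else env
    try:
        value = int(env.get(name, ""))
    except ValueError:
        value = default          # unset or unparsable: use the default
    if floor is not None and value < floor:
        value = floor            # assumption: below-floor values are clamped
    return value


knob = "EVENTGRAPH_WRITER_FLUSH_INTERVAL_SEC"
print(env_int(knob, 30, floor=5, env={}))             # unset
print(env_int(knob, 30, floor=5, env={knob: "2"}))    # below the floor
print(env_int(knob, 30, floor=5, env={knob: "120"}))  # operator override
```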

api/server.go wires the writer's lifecycle:
- Constructed after TSDB client is ready, gated by cfg.EventGraphEnabled
  so EVENTGRAPH_ENABLED=false cleanly skips construction; learner's
  reinforcementWriter stays nil and the Hebbian path short-circuits.
- Closed alongside the other TSDB writers in graceful-shutdown — buffer
  drains before the process exits.

Tier 2 integration tests (against real TSDB, build tag integration):
- TestEventGraph_Writer_RoundTrip: 3 rows recorded → flush-window
  elapses → SELECT count(*) returns 3.
- TestEventGraph_Writer_DrainOnClose: 5 rows recorded with 1-hour flush
  interval → Close() drains → SELECT returns 5 (verifies the server
  shutdown invariant).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…001 Epic 5)

internal/eventgraph/query.go — Pattern Y1 federation helper.
EventsInGraphNeighborhood orchestrates a three-step query:

  1. Cypher graph walk from a seed node — variable-length path over
     CO_ACTIVATED_WITH | GENERALIZES at depth 0..Hops. Returns the
     N-hop neighborhood (DISTINCT node_ids, includes the seed).
  2. TSDB query against reinforcement_events for events where src OR
     dst is in the neighborhood, within the lookback window, ordered
     newest-first, capped at the configured limit.
  3. Go-side join — annotates events with SrcInNeighborhood /
     DstInNeighborhood so the consumer can distinguish "both endpoints
     in the subgraph" from "one endpoint outside the seed's N-hop
     reach but the event still touches our subgraph."

Empty neighborhood (no seed match, hops=0) short-circuits before the
TSDB call. Sub-1-second Since values clamp to 1s. Hops < 0 is rejected
upfront. The handler enforces an additional ceiling of 2 ×
EVENTGRAPH_FEDERATION_DEFAULT_HOPS for runaway-walk protection.
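
The validation rules and the join step can be sketched in Python as follows. This is not the Go helper: the function names and the src/dst event keys are hypothetical, the neighborhood is passed in directly instead of coming from a Cypher walk, and the in-Python filter stands in for the TSDB WHERE clause.

```python
def validate_request(space_id, seed_node_id, hops, default_hops):
    """Reject requests the handler is described as refusing with a 400."""
    if not space_id:
        raise ValueError("space_id is required")
    if not seed_node_id:
        raise ValueError("seed_node_id is required")
    if hops < 0:
        raise ValueError("hops must be >= 0")
    if hops > 2 * default_hops:
        raise ValueError("hops exceeds the runaway-walk ceiling")


def annotate_events(events, neighborhood):
    """Keep events touching the neighborhood and flag each endpoint."""
    hood = set(neighborhood)
    if not hood:
        return []  # empty neighborhood short-circuits
    annotated = []
    for event in events:
        src_in = event["src"] in hood
        dst_in = event["dst"] in hood
        if src_in or dst_in:
            annotated.append({**event,
                              "src_in_neighborhood": src_in,
                              "dst_in_neighborhood": dst_in})
    return annotated


# seed--mid--leaf plus an off-graph node, as in the Tier 2 test description.
events = [
    {"src": "seed", "dst": "mid"},
    {"src": "mid", "dst": "leaf"},
    {"src": "leaf", "dst": "off"},
]
hops0 = annotate_events(events, ["seed"])          # 0-hop neighborhood
hops1 = annotate_events(events, ["seed", "mid"])   # 1-hop neighborhood
print(hops0)
print(hops1)
```

At hops=0 only the seed-mid event survives, with dst_in_neighborhood false; at hops=1 the mid-leaf event joins it because mid is now inside the neighborhood.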

internal/api/eventgraph_handler.go — POST /v1/eventgraph/reinforcement-
neighborhood. Same auth convention as /v1/admin/breakers. 503 when
EVENTGRAPH_ENABLED=false or when eventgraphService is nil (TSDB-down at
boot). 400 on missing space_id / seed_node_id / negative hops / hops >
ceiling. Defaults applied from config when fields omitted from request.

Plan-decision disclosure (per feedback_plan_options_pattern.md): plan
proposed Option A (single endpoint with event_type query param) vs
Option B (endpoint per event class). Final choice: A. v1 has one event
class (reinforcement); the endpoint URL is explicit about that.
EVENTGRAPH-002 can either add a query param or split the URL when a
second event class arrives — no breaking change either way.

Tests:
- Tier 1 (internal/eventgraph/query_test.go, 7 green): request
  validation rejects empty space_id, empty seed, negative hops; interval
  formatting roundtrips; join annotation handles both-inside,
  one-outside, and empty-neighborhood cases.
- Tier 1 (internal/api/eventgraph_handler_test.go, 4 green + 2 skipped):
  method-not-allowed, feature-disabled 503, nil-service 503, invalid-
  JSON short-circuit. Two validation paths skipped — they require a
  non-nil eventgraphService which can't be constructed without a real
  driver; Tier 2 exercises them.
- Tier 2 (tests/integration/eventgraph_federation_test.go, 1 green):
  builds seed--mid--leaf graph + off-node, emits 3 reinforcement
  events touching all four nodes, calls federation at hops=0 and
  hops=1, asserts neighborhood + in-neighborhood flags. The hops=0
  test confirms that mid↔leaf (touching neither seed nor any 0-hop
  neighbor) is correctly excluded.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ement events (EVENTGRAPH-001 Epic 6)

Three new Prometheus counters mirror the V0022 writer's internal atomic
counters:

- mdemg_eventgraph_writer_rows_enqueued_total — rows successfully CopyFrom'd
- mdemg_eventgraph_writer_rows_dropped_total — rows FIFO-evicted (buffer full)
- mdemg_eventgraph_writer_flush_failure_total — flush errors

Wiring: the writer accepts a narrow PrometheusCounter interface
(Add(int64)) so internal/tsdb does not import internal/metrics (which
would cycle). api/server.go calls SetPrometheusCounters after the
writer is constructed, passing the three counters from the global
StandardMetrics struct. Nil-safe.
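
A single-threaded Python sketch of the wiring described above: a bounded buffer with FIFO eviction, and counters injected through a narrow add-only interface so the writer does not depend on the metrics package. The real writer is Go and flushes on a timer; the class names here are hypothetical, and discarding rows after a failed flush is an assumption of this sketch.

```python
from collections import deque


class Counter:
    """Stand-in for the narrow counter interface: a single add method."""

    def __init__(self):
        self.value = 0

    def add(self, n):
        self.value += n


class BufferedWriter:
    def __init__(self, sink, capacity=1000):
        self._sink = sink
        self._capacity = capacity      # 0 means unlimited
        self._buffer = deque()
        self._enqueued = self._dropped = self._flush_failure = None

    def set_counters(self, enqueued, dropped, flush_failure):
        self._enqueued = enqueued
        self._dropped = dropped
        self._flush_failure = flush_failure

    @staticmethod
    def _bump(counter, n):
        if counter is not None:        # nil-safe: counters are optional
            counter.add(n)

    def record(self, row):
        """Enqueue without blocking; evict the oldest row when full."""
        if self._capacity > 0 and len(self._buffer) >= self._capacity:
            self._buffer.popleft()
            self._bump(self._dropped, 1)
        self._buffer.append(row)

    def flush(self):
        rows = list(self._buffer)
        self._buffer.clear()
        if not rows:
            return
        try:
            self._sink(rows)
        except Exception:
            self._bump(self._flush_failure, 1)
            return
        self._bump(self._enqueued, len(rows))


written = []
enqueued, dropped, failures = Counter(), Counter(), Counter()
writer = BufferedWriter(written.extend, capacity=2)
writer.set_counters(enqueued, dropped, failures)
for row in (1, 2, 3):
    writer.record(row)
writer.flush()
print(written, enqueued.value, dropped.value, failures.value)
```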

Dashboard: mdemg-graph-topology.json gains a new collapsed row
"Reinforcement Events (EVENTGRAPH-001)" with a single time-series
panel "Reinforcement Event Rate (events/min)" showing all three rates
(enqueued / dropped / flush failures) over the last 24h. Dropped is
colored orange, flush failures red, enqueued the default palette. Tied
to the prometheus datasource.

The existing GRAFANA-AUDIT-001 harness (scripts/grafana_panel_audit.py)
only evaluates SQL-target panels — the new panel uses Prometheus
queries, so it lands on the SKIP pile, same as the other 8 Cypher /
Prometheus panels on this dashboard. Audit JSON refreshed.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Epic 6's targeted audit run (scripts/grafana_panel_audit.py --dashboard
mdemg-graph-topology.json) overwrote the full multi-dashboard audit
results from GRAFANA-AUDIT-001 with the single-dashboard subset (9
SKIPs only). Restoring the full snapshot from commit 0a1e8e1 — that
audit covered all 8 dashboards and is the canonical baseline the
GRAFANA-AUDIT-001 post.md references. EVENTGRAPH-001 did not need to
regenerate it; the new panel uses Prometheus queries, which the audit
harness SKIPs regardless of dashboard.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…fix-commit)

ScoreAndRankRRF's ConsensusResult → RetrieveResult conversion was
silently dropping the Activation field. The legacy ScoreAndRank path at
scoring.go:883 sets Activation: a (where a := act[c.NodeID] is the
spreading-activation map value). The RRF path constructed
models.RetrieveResult{...} with no Activation key, leaving the field
zero-valued.

Net effect: since Phase 13.1 default-on (2026-05-03),
learning.Service.ApplyCoactivation has filtered out every L0 candidate
on the retrieve hot path. The filter is r.Activation >=
LearningMinActivation (default 0.20). With Activation=0, no pair makes
it to the Hebbian Cypher; the function returns nil without writing.
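
The effect of the zero-valued field is easy to see in a toy version of the filter. This is a Python sketch of the comparison only; the dict shape and the helper name are hypothetical, and the real filter lives in the Go learning service.

```python
def hebbian_candidates(results, min_activation=0.20):
    """Keep only results whose activation clears the learning threshold."""
    return [r for r in results if r.get("activation", 0.0) >= min_activation]


# Legacy path: Activation is populated from the spreading-activation map.
legacy = [{"node_id": "a", "activation": 0.62}, {"node_id": "b", "activation": 0.31}]
# RRF path before the fix: the field is never set, so it reads as zero.
rrf_before_fix = [{"node_id": "a"}, {"node_id": "b"}]

print(len(hebbian_candidates(legacy)))
print(len(hebbian_candidates(rrf_before_fix)))
```

With no candidates left, there are no pairs to co-activate, which matches the "returns nil without writing" behaviour described above.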

Hebbian learning has therefore been a silent no-op on the production
retrieve goroutine for ~24 days. CO_ACTIVATED_WITH edges still exist in
the graph because sidecar paths (CoactivateSession,
ApplySymbolCoactivation, consolidation walks) and pre-Phase-13.1
retrieves wrote them.

Discovered during EVENTGRAPH-001 Epic 7 live e2e. Three retrieves
produced 0 rows in reinforcement_events. Investigation traced the gap
to the missing Activation field.

Fix: one-line addition in scoring_rrf.go — Activation: act[c.NodeID].
Brings the RRF path to parity with the legacy scorer.

Post-fix verification: rebuilt, restarted server, re-issued 3 retrieves
→ 10 reinforcement events landed in TSDB. Federation API at hops=1
correctly returned all 10 with src_in_neighborhood=true,
dst_in_neighborhood=true. Documented in
docs/development/eventgraph-001/verification.md.

Per CLAUDE.md "Testing — Live System Testing Is Required":
"surprise bugs caught during live smoke get their own follow-up
fix-commit — do not silently roll them into the sprint commit." This
is the precedent-aligned separate commit.

Forward-only: existing graph state is preserved; new retrieves now
correctly emit Hebbian updates. EVENTGRAPH-002 may revisit whether to
backfill the missing 24-day window.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Real /v1/memory/retrieve × 3 against mdemg-dev → 10 reinforcement events
landed in TSDB within the flush window. Federation API at hops=1 from a
seed node returned 5-node neighborhood + 10 in-neighborhood events.
Documents the surprise-bug discovery + fix that preceded this transcript
(see fix-commit for scoring_rrf.go::ScoreAndRankRRF Activation
propagation).

Acceptance criteria from sprint plan §"Acceptance Criteria" all PASS.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…ose (Epic 8)

Final epic — Documentation Update (never cut, per feedback_per_feature_docs_required.md
and the standardized v1.0 sprint plan format).

New: docs/features/event-graph-federation.md (~240 lines, Why / Choices /
How it works / How to use / Forward-looking). Documents:
- Pattern Y1 vs Y2 trade-off (why federation-in-Go now, link-node
  reification deferred until a query forces it)
- Why V0019 buffered-CopyFrom over V0021 sync-INSERT (per-retrieve volume)
- Why ApplyCoactivation first (other 3 Hebbian entry points deferred to
  EVENTGRAPH-003)
- Why forward-only (no source to backfill from)
- Federation pipeline (Cypher walk → TSDB query → Go-side join with
  src/dst_in_neighborhood annotation)
- TSDB schema, API request/response shape, 7 env vars + defaults
- Observability (3 Prometheus counters + Grafana panel)
- Forward-looking sprints

New: docs/development/eventgraph-001/post.md — epic-by-epic outcomes,
acceptance criteria check-off, surprise log (RRF Activation drop +
audit-JSON overwrite + orphan-process port collision), plan deviations
disclosed (1-row-per-pair regardless of asymmetric mode; single-
endpoint over endpoint-per-class), forward-looking.

CHANGELOG.md Unreleased gains the EVENTGRAPH-001 entry — 11 bullet
points covering V0022 migration, buffered writer, Cypher RETURN-shape
change, Configurability Contract, federation helper + API, Prometheus
+ Grafana, Tier 2 + Tier 3 verification, the surprise-bug RRF
Activation fix-commit, and the audit-JSON restore.

CLAUDE.md Architecture Notes gains a new "Event Graph Federation" entry
above the Model Distribution section. Documents the pattern, surface,
deferrals, and the load-bearing fix-commit f307f55 that surfaced 24
days of silent Hebbian no-op on the retrieve hot path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…Prometheus datasource

The Epic 6 panel used datasource {type: prometheus, uid: prometheus} but
this Grafana instance has no Prometheus datasource configured — mdemg
exposes counters as JSON via /v1/metrics/snapshot, not a /metrics scrape
endpoint. Configured datasources: mdemg-nodegraph, neo4j, timescaledb
only. The panel rendered "No data" in the live Grafana.

Rewritten panel queries the reinforcement_events hypertable directly via
the timescaledb postgres datasource. Two targets:

  1. count(*) over 1-minute time_buckets → overall events/min
  2. count(*) FILTER (WHERE created_new_edge) vs WHERE NOT created_new_edge
     → split between new connections formed and existing connections
     strengthened (the operational dimension the analytic queries
     actually need)

Both targets templated on $space_id (existing dashboard variable). The
Prometheus counters (mdemg_eventgraph_writer_rows_{enqueued,dropped,
flush_failure}_total) remain wired and incrementing — they surface via
/v1/metrics/snapshot for ops scripts. The Grafana panel now actually
displays data instead of relying on a scrape path that doesn't exist
in this deployment.

Discovered during post-merge live verification (2026-05-29). Verified
fix: reloaded dashboard via Grafana API → /api/ds/query against same
SQL returns 1-minute buckets matching TSDB direct count. Audit harness
now reports 2 PASS for the new panel (previously SKIP — no SQL target).

verification.md updated with the post-merge transcript.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
# Conflicts:
#	deploy/docker/grafana/dashboards/mdemg-graph-topology.json
#	docs/development/eventgraph-001/verification.md
P0 fix. The Jiminy guidance->feedback->outcome loop has been dormant
~9 weeks: consulting/service.go gates constraint/suggestion extraction
on hardcoded legacy-scale score thresholds (r.Score < 0.55 et al.).
Phase 13.1 RRF (default-on May 3) dropped the score scale so strong
matches top out ~0.53 -> 0/10 results clear the gates -> empty guidance
-> dead loop. Third instance of the RRF-score-contract bug class (after
the EVENTGRAPH-001 Activation drop).

12-section format; 6 epics; config-driven percentile-gate fix +
sigmoid recalibration; live-verify the revived loop end-to-end.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Full-repo sweep of post-RRF score/activation/confidence consumers +
live score-distribution sampling. Findings:

- HIGH (4): consulting constraint gates (1005/1081/1087) + confidence
  sigmoid midpoint 1.5 (35-36) — the loop-killer cluster.
- MED (5): consulting conflict gates (931/944/957/981) + minConfidence
  pre-filter (619, already config-driven).
- LOW (3): retrieval/jiminy.go Activation display gates (45/155/192) —
  explanation text only, no guidance gating.
- NONE (2): jiminy trial score (0-10 scale), trust-score clamp.

Live distribution: RRF strong-match top scores cluster 0.49-0.58; the
0.55 gate sits mid-band, rejecting the most-relevant constraint half
the time. NormalizedConfidence is positional rank (spreads 100->0 even
on uniform-score sets) -> rules out plan Option A (percentile) as sole
gate. Remediation: config-driven RRF-calibrated absolute thresholds
(Option B), constraint floor default 0.45, sigmoid midpoint ->0.45.
Disclosed deviation per feedback_plan_options_pattern.
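
To illustrate why a positional signal cannot serve as the sole gate, here is a minimal sketch. The linear rank-to-confidence formula is an assumption made for illustration (the message above says only that NormalizedConfidence is positional and spreads 100->0), and `positionalConfidence` is a hypothetical helper, not a function from the repo.

```go
package main

import "fmt"

// positionalConfidence assigns a confidence purely from rank position:
// the first result gets 100, the last gets 0, linearly in between.
// ASSUMPTION: this linear formula is illustrative only; the real
// NormalizedConfidence computation is not shown in the source.
func positionalConfidence(n int) []float64 {
	out := make([]float64, n)
	if n == 1 {
		out[0] = 100
		return out
	}
	for i := range out {
		out[i] = 100 * float64(n-1-i) / float64(n-1)
	}
	return out
}

func main() {
	// Five results with identical raw scores still spread across 100..0,
	// so a percentile cut-off would reject results that are equally relevant.
	scores := []float64{0.52, 0.52, 0.52, 0.52, 0.52}
	for i, c := range positionalConfidence(len(scores)) {
		fmt.Printf("rank %d  score %.2f  positional confidence %.0f\n", i+1, scores[i], c)
	}
}
```

Because the spread depends only on rank, it says nothing about whether any result is actually a strong match, which is the stated reason for pairing it with an absolute, config-driven floor.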

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…SCALE-001 Epic 2)

Revives the dormant Jiminy guidance loop. Replaces 7 hardcoded legacy-
scale score gates in consulting/service.go + the score->confidence
sigmoid (both copies) with config-driven, RRF-calibrated values.

Gates (all default 0.45, RRF strong-match band is 0.49-0.58):
- constraint extraction (was <0.55)        -> CONSULTING_CONSTRAINT_SCORE_FLOOR
- keyword/name authority inner gate (0.55/0.6) -> CONSULTING_AUTHORITY_SCORE_FLOOR
- conflict/contradiction detection (0.6-0.7)   -> CONSULTING_CONFLICT_SCORE_FLOOR

Key Epic-2 finding: keywordClassifyConstraint has an INNER authority
gate that binds tighter than the outer constraint gate. If authority
floor > constraint floor, the binding gate re-rejects the strong-match
band and the loop stays dormant -> all three default to 0.45. The RRF
band is too compressed to subdivide into tiers; knobs stay separate so
operators can raise any one independently.
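
The interaction between the outer constraint gate and the inner authority gate can be sketched as follows. This is a simplified illustration, not the code in consulting/service.go: the struct, field names, and helper functions are hypothetical, and only the three environment variable names and the 0.45 default come from the message above.

```go
package main

import "fmt"

// ConsultingConfig is a hypothetical stand-in for the real config struct.
// Comments name the environment variables listed in the commit message.
type ConsultingConfig struct {
	ConstraintScoreFloor float64 // CONSULTING_CONSTRAINT_SCORE_FLOOR
	AuthorityScoreFloor  float64 // CONSULTING_AUTHORITY_SCORE_FLOOR
	ConflictScoreFloor   float64 // CONSULTING_CONFLICT_SCORE_FLOOR
}

// defaultRRFFloor is the RRF-calibrated default quoted in the commit message.
const defaultRRFFloor = 0.45

// floorOrDefault is the zero-value guard: an unset knob falls back to the default.
func floorOrDefault(v float64) float64 {
	if v <= 0 {
		return defaultRRFFloor
	}
	return v
}

// passesConstraintGates models the two nested gates: a result must clear the
// outer constraint floor and then the inner authority floor. Whichever floor
// is higher is the binding one.
func passesConstraintGates(cfg ConsultingConfig, score float64) bool {
	if score < floorOrDefault(cfg.ConstraintScoreFloor) {
		return false
	}
	return score >= floorOrDefault(cfg.AuthorityScoreFloor)
}

func main() {
	var defaults ConsultingConfig // all knobs unset, so every floor is 0.45
	for _, s := range []float64{0.40, 0.50, 0.58} {
		fmt.Printf("score %.2f passes with defaults: %v\n", s, passesConstraintGates(defaults, s))
	}

	// Raising only the inner floor re-rejects the strong-match band.
	strict := ConsultingConfig{AuthorityScoreFloor: 0.60}
	fmt.Printf("score 0.50 passes with authority floor 0.60: %v\n", passesConstraintGates(strict, 0.50))
}
```

The last case shows why all three defaults were set to the same value: with an authority floor above the strong-match band, lowering the constraint floor alone would change nothing.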

Sigmoid (score->confidence), both consulting/service.go and
jiminy/retrieval_source.go (they MUST stay in sync per their own
comments): midpoint 1.5 -> 0.45, steepness 1.5 -> 8.0, config-driven via
RETRIEVAL_CONFIDENCE_SIGMOID_{MIDPOINT,STEEPNESS}. Legacy crushed a
strong 0.5 match to 0.18 confidence; recalibrated maps it to 0.60
(0.1->0.06, 0.58->0.74). normalizeRetrievalConfidence is now a Service
method reading cfg with zero-value fallback; mapRetrievalToGuidance
takes the sigmoid params from its caller's cfg.
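
The quoted mappings are consistent with a standard logistic curve, confidence = 1 / (1 + exp(-steepness * (score - midpoint))). The sketch below assumes that form; the type and function names are hypothetical, and the real normalizeRetrievalConfidence may differ in detail.

```go
package main

import (
	"fmt"
	"math"
)

// SigmoidParams is a hypothetical holder for the two knobs named in the
// commit message: RETRIEVAL_CONFIDENCE_SIGMOID_MIDPOINT and _STEEPNESS.
type SigmoidParams struct {
	Midpoint  float64
	Steepness float64
}

// withDefaults applies the zero-value fallback to the RRF-calibrated values.
func (p SigmoidParams) withDefaults() SigmoidParams {
	if p.Midpoint == 0 {
		p.Midpoint = 0.45
	}
	if p.Steepness == 0 {
		p.Steepness = 8.0
	}
	return p
}

// confidence maps a retrieval score to (0, 1) with a logistic curve.
// ASSUMPTION: the logistic form is inferred from the numbers in the message.
func confidence(score float64, p SigmoidParams) float64 {
	p = p.withDefaults()
	return 1.0 / (1.0 + math.Exp(-p.Steepness*(score-p.Midpoint)))
}

func main() {
	legacy := SigmoidParams{Midpoint: 1.5, Steepness: 1.5}
	var recalibrated SigmoidParams // zero value resolves to 0.45 / 8.0

	fmt.Printf("legacy       0.50 -> %.2f\n", confidence(0.50, legacy))       // 0.18
	fmt.Printf("recalibrated 0.10 -> %.2f\n", confidence(0.10, recalibrated)) // 0.06
	fmt.Printf("recalibrated 0.50 -> %.2f\n", confidence(0.50, recalibrated)) // 0.60
	fmt.Printf("recalibrated 0.58 -> %.2f\n", confidence(0.58, recalibrated)) // 0.74
}
```

Under this form the legacy midpoint of 1.5 sits far above any score RRF can produce, so every result lands on the flat lower tail of the curve; moving the midpoint to 0.45 centers the curve on the band where scores actually fall.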

5 new config knobs, all with RRF-calibrated defaults + zero-value
guards (no-hardcoding rule; the bug WAS a hardcoded value).

Tier 1 tests: updated 2 legacy-scale boundary tests to the new
thresholds + added RRFStrongMatchBand regression (0.50 must surface),
ConstraintFloor_ConfigDriven (override honored), and
NormalizeRetrievalConfidence_RRFCalibration (band mapping). Full
consulting + jiminy + config suites green; lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
retrieval/jiminy.go Activation display gates (45/155/192 + LearningEdge
siblings) traced live: they're in the explainability renderer, not the
guidance-surfacing path; always-additive at RRF scale (live activation
~0.723 >> thresholds), no misbehavior. Intentionally left unchanged with
rationale — config-ifying display verbosity is out of proportion to zero
functional impact. Every High/Med remediated (Epic 2), every Low decided.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…pic 4)

Tier 3 live e2e (verification.md): the score-gate fix revives the
dormant guidance loop on the live stack —
- /v1/jiminy/guide guidance items 0 -> 10, source_counts.constraints
  0 -> 2, patterns 0 -> 3 (acceptance #1 MET).
- Full loop warm->latest->feedback->outcome: TSDB constraint_outcomes
  sink REVIVED — fresh rows dated 2026-06-03 (table was dead since
  May 1). Constraint-effectiveness Grafana sink is live again.

Three adjacent issues surfaced during live smoke, documented as distinct
follow-ups (NOT score-scale, not bolted on):
- A: Neo4j GUIDANCE_OUTCOME edges still dormant — guidance SourceNodes
  point at emergent_concept nodes; PersistGuidanceOutcome only writes
  edges for constraint/correction/pattern/learning or role_type=
  constraint targets. Node-type-targeting bug, independent of RRF.
  Candidate sprint JIMINY-OUTCOME-001.
- B: LLM guidance synthesis timeout (now that synthesis runs).
- C: /v1/jiminy/latest unescaped control chars break jq/json parsers —
  the hook uses jq, so may compound dormancy. Low-effort follow-up.

Tier 2 (rrf_scale_guidance_test.go, integration tag, 2 green):
- SuggestSurfacesGuidance: constraint-matching context surfaces 7
  suggestions (was 0 before fix) against live mdemg-dev.
- SuggestRejectsNoise: gibberish does not flood constraints (no
  over-correction).

Cold-start note: first guide call post-restart returned constraints:0
(LLM classifier cold-model timeout -> keyword fallback); after one
warm-up call, constraints surface. Model-warmth artifact, not a fix
defect.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…t.md (Epic 5)

Final epic. CHANGELOG Unreleased gains the RRF-SCALE-001 Fixed entry.

CLAUDE.md gains a 'score-scale contract' architecture note — the
structural defense against a 4th instance: downstream consumers MUST
NOT hardcode absolute thresholds against RetrieveResult.Score (the
scorer scale is not a stable contract); gate via config or a
scale-invariant signal, and re-audit on any scorer change. Notes that
NormalizedConfidence is positional (not a safe sole gate) and records
the three open follow-ups.

post.md: epic-by-epic, acceptance check-off (honest: #2 partial — TSDB
sink revived, Neo4j edge is distinct Follow-up A), scope note
separating the score-scale fix (done) from the 3 adjacent surfaced
issues (documented follow-ups), discipline notes (cold-start mask,
inner authority gate).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ment (CI fix)

CI failure on PR 404: TestRRFScale_SuggestSurfacesGuidance failed in
0.02s. Root cause: the test assumed the populated local mdemg-dev space
(111 constraint nodes), but CI boots a FRESH EMPTY Neo4j with stub
embeddings (and RETRIEVAL_COLUMN_VOTING_ENABLED=false / legacy scorer).
With no data, /v1/memory/suggest returns 0 candidates, so the
'total == 0' assertion fired.

Other integration tests self-seed data or skip when prerequisites are
absent; mine relied on ambient data — wrong for a reproducible CI run.

Fix: skip when debug.retrieved_count == 0 (no retrievable data → the
score-gate fix isn't exercisable; there's nothing for the gate to admit
or reject). The test stays meaningful against a populated stack (local:
9 suggestions from 15 retrieved → PASS) and skips cleanly in CI's
empty-DB environment. Verified both paths live: populated → PASS,
empty space → retrieved_count 0 → SKIP.
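
A minimal sketch of the skip decision, assuming the suggest response carries a `debug.retrieved_count` field shaped as below. The response struct and the `skipReason` helper are hypothetical; in the real test the returned reason would be passed to `t.Skip`.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// suggestResponse models only the field the commit message names.
// ASSUMPTION: the real /v1/memory/suggest payload has more fields and
// may nest them differently.
type suggestResponse struct {
	Debug struct {
		RetrievedCount int `json:"retrieved_count"`
	} `json:"debug"`
}

// skipReason returns a non-empty reason when the environment has no
// retrievable data, in which case the score gate has nothing to admit
// or reject and the assertion would be meaningless.
func skipReason(body []byte) (string, error) {
	var resp suggestResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	if resp.Debug.RetrievedCount == 0 {
		return "retrieved_count == 0: no retrievable data in this environment", nil
	}
	return "", nil
}

func main() {
	bodies := []string{
		`{"debug":{"retrieved_count":0}}`,  // CI: fresh empty database
		`{"debug":{"retrieved_count":15}}`, // populated local stack
	}
	for _, body := range bodies {
		reason, err := skipReason([]byte(body))
		switch {
		case err != nil:
			fmt.Println("decode error:", err)
		case reason != "":
			fmt.Println("SKIP:", reason) // in a test: t.Skip(reason)
		default:
			fmt.Println("run assertions")
		}
	}
}
```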

The gate fix itself is validated by Tier 1 unit tests + the live Tier 3
e2e (docs/development/rrf-scale-001/verification.md); this integration
test is a bonus live-stack assertion, not the primary proof.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… sink

Follow-up A from RRF-SCALE-001: the Neo4j GUIDANCE_OUTCOME edge sink has
been dormant since Apr 12. Root cause: matchConstraintCode links guidance
items to constraint codes by keyword overlap (>=3 shared words), but
retrieval surfaces emergent_concept abstractions whose content does not
share 3+ literal words with raw constraint text -> no constraint_code ->
PersistGuidanceOutcome falls back to the concept SourceNode -> the
role_type=constraint filter rejects it -> no edge. Live-proven: all 17
recent outcome rows had constraint_code=(none).

Fix (Option 1): switch the matcher to embedding cosine similarity
(content already normalized to natural language ~0.70 cosine; Service
has an embedder; cosineSimilarity + embed->cosine pattern already exist
in-package via OutcomeClassifier). Existing PersistGuidanceOutcome +
findConstraintNodeID then create edges on the correct constraint nodes.
Keyword matching stays as fallback -- never regresses.

4 epics; ~1-1.5 dev-days; config-driven threshold; acceptance bar = a
fresh Neo4j GUIDANCE_OUTCOME edge on a real role_type=constraint node
dated today, reflected in GetConstraintEffectiveness.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…UTCOME-001 Epic 1)

Revives the Neo4j GUIDANCE_OUTCOME edge sink (dormant since Apr 12).

Root cause (RRF-SCALE-001 Follow-up A): matchConstraintCode links
guidance items to constraint codes by keyword overlap (>=3 shared
words), but retrieval surfaces emergent_concept abstractions whose
content rarely shares 3+ literal words with raw constraint text -> no
code -> PersistGuidanceOutcome falls back to the concept SourceNode ->
the role_type=constraint filter rejects it -> no edge.

Fix: new matchConstraintCodeByEmbedding queries the constraint vector
index (db.index.vector.queryNodes, role_type=constraint, sim >=
threshold) and returns the closest constraint's code. Guide() tries
this first, falling back to the keyword matcher when the embedder is
unavailable, content is empty, or nothing clears the threshold — never
regresses. The existing PersistGuidanceOutcome + findConstraintNodeID
then create the edge on the correct constraint node.
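
The embedding-first, keyword-fallback order can be sketched as below. This is an illustration under stated assumptions: the `matcher` type, `matchCode`, `sharedWords`, and `keywordMatcher` are hypothetical names, the embedding matcher is a stub standing in for the vector-index query, and the sample texts are made up for the demo.

```go
package main

import (
	"fmt"
	"strings"
)

// matcher returns a constraint code and whether a match was found.
type matcher func(content string) (code string, ok bool)

// matchCode tries the embedding matcher first and falls back to the keyword
// matcher when the embedder is unavailable (nil), content is empty, or
// nothing clears the similarity threshold.
func matchCode(content string, byEmbedding, byKeyword matcher) string {
	if byEmbedding != nil && content != "" {
		if code, ok := byEmbedding(content); ok {
			return code
		}
	}
	if byKeyword != nil {
		if code, ok := byKeyword(content); ok {
			return code
		}
	}
	return ""
}

// sharedWords counts distinct lower-cased words present in both strings.
func sharedWords(a, b string) int {
	seen := map[string]bool{}
	for _, w := range strings.Fields(strings.ToLower(a)) {
		seen[w] = true
	}
	n := 0
	for _, w := range strings.Fields(strings.ToLower(b)) {
		if seen[w] {
			n++
			delete(seen, w) // count each shared word once
		}
	}
	return n
}

// keywordMatcher builds the legacy-style matcher: at least 3 shared words.
func keywordMatcher(constraints map[string]string) matcher {
	return func(content string) (string, bool) {
		for code, text := range constraints {
			if sharedWords(content, text) >= 3 {
				return code, true
			}
		}
		return "", false
	}
}

func main() {
	constraints := map[string]string{"no-direct-main-commits": "NEVER commit directly to main"}
	byKeyword := keywordMatcher(constraints)
	// Stub: pretends the vector index found a constraint above the threshold.
	byEmbedding := func(string) (string, bool) { return "no-direct-main-commits", true }

	abstract := "avoid pushing straight to the primary branch"
	fmt.Printf("keyword only:    %q\n", matchCode(abstract, nil, byKeyword))
	fmt.Printf("embedding first: %q\n", matchCode(abstract, byEmbedding, byKeyword))
}
```

The abstract phrasing shares only one literal word with the constraint text, so the keyword matcher alone returns no code, while the embedding path can still resolve it; when the embedding path yields nothing, behavior is identical to the keyword-only matcher.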

Implementation refinement vs plan: uses Neo4j's vector index server-side
(mirrors the proven Evaluator.findMatchingConstraints pattern) rather
than loading all constraint embeddings into Go and computing cosine in a
loop — cleaner, no constraintCodeEntry.Embedding needed. Same Option-1
outcome.

Config: JIMINY_CONSTRAINT_CODE_SIM_THRESHOLD (default 0.55, zero-value
fallback) — provisional; tuned against the live similarity distribution
in Epic 2.

Tier 1 (4 tests): nil-driver/empty-embedding guards, threshold default
resolution, keyword-fallback non-regression. Full jiminy + config
suites green; lint clean.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
reh3376
reh3376 previously approved these changes Jun 8, 2026
@reh3376 reh3376 self-assigned this Jun 8, 2026
rhenley1958 and others added 2 commits June 8, 2026 10:20
…on (Epic 2)

Tier 3 live e2e (verification.md) — acceptance bar MET:
- /v1/jiminy/guide now yields guidance items carrying constraint_codes
  (10 items, 6 coded; was 0). Matched code 'no-direct-main-commits' is
  semantically exact for the 'commit to main' context.
- Full warm->latest->feedback loop: Neo4j GUIDANCE_OUTCOME 893 -> 899
  (+6), latest today. All 6 new edges land on REAL role_type=constraint
  nodes ('CONSTRAINT: NEVER commit directly to main') — not
  emergent_concept. The sink dormant since Apr 12 is revived on the
  correct nodes.
- /v1/constraints/effectiveness reflects it: 'NEVER commit directly to
  main | surfaced: 30 followed: 28 rate: 0.93'.
- Both sinks now revived: TSDB (RRF-SCALE-001) + Neo4j (here). The
  constraint-effectiveness loop is fully restored.

Threshold 0.55 validated live: correct matches, no false positives.

Tier 2 (jiminy_outcome_test.go, integration tag, skip-on-empty): PASSES
on a populated stack with an idle LLM (7/10 items coded). The guide path
is LLM-latency-dependent (per-node classifier ~31s/call, serialized; a
call fired while the LLM is busy fast-fails empty), so the test
warm-retries and SKIPS (never false-fails) when the LLM path can't
produce items. Bonus check; Tier 3 is the definitive proof. The LLM
serialization/synthesis-timeout is RRF-SCALE-001 Follow-up B, tracked
separately.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Final epic. CHANGELOG Unreleased gains the JIMINY-OUTCOME-001 Fixed
entry. CLAUDE.md gains a guidance-outcome constraint-code-matching note
(embedding-first via vector index, keyword fallback; both outcome sinks
now live). post.md: epic-by-epic, acceptance check-off, the loop-revival
completion (TSDB from RRF-SCALE-001 + Neo4j here), discipline notes (LLM
serialization is the test-flakiness source), forward-looking (Follow-up
B now the most operationally-visible remaining issue).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

reh3376 commented Jun 8, 2026

Owner

Sprint JIMINY-OUTCOME-001 — Summary

Completes the guidance-loop revival (Follow-up A from RRF-SCALE-001): the Neo4j GUIDANCE_OUTCOME edge sink, dormant since Apr 12, is revived — with edges on the correct constraint nodes.

Root cause

matchConstraintCode linked guidance items to constraint codes by keyword overlap (≥3 shared words). Retrieval surfaces emergent_concept abstractions whose content rarely shares 3+ literal words with raw constraint text → no constraint_code → PersistGuidanceOutcome fell back to the concept SourceNode → its role_type='constraint' filter rejected it → no edge. (All 17 recent outcome rows had constraint_code=(none).) Distinct from RRF score scale.

Fix

New matchConstraintCodeByEmbedding queries the constraint vector index (db.index.vector.queryNodes, role_type='constraint', sim ≥ threshold) — mirroring the proven Evaluator.findMatchingConstraints. Guide() tries embedding match first, keyword fallback second (never regresses). The existing PersistGuidanceOutcome + findConstraintNodeID then create the edge on the real constraint node.

Implementation refinement vs plan: server-side vector index instead of in-Go cosine — cleaner, same Option-1 outcome.

Config: JIMINY_CONSTRAINT_CODE_SIM_THRESHOLD (default 0.55, validated live).

Live verification (Tier 3 — the acceptance bar)

  • Guidance items carrying a constraint_code: 0 → 6/10; matched code no-direct-main-commits — semantically exact for "commit to main".
  • Full loop → 6 fresh GUIDANCE_OUTCOME edges on real role_type='constraint' nodes (893 → 899, dated today).
  • /v1/constraints/effectiveness: "NEVER commit directly to main | surfaced: 30 followed: 28 rate: 0.93" — updated.
  • Both sinks now revived: TSDB (RRF-SCALE-001) + Neo4j (here). Constraint-effectiveness loop fully restored.

Tests

  • Tier 1 (4): guards, threshold fallback, keyword non-regression.
  • Tier 2 integration (skip-on-empty + LLM-warmth-tolerant): passes on idle stack (7/10 coded), skips under LLM contention; never false-fails (the CI lesson from PR #404 applied up front).

Still-open follow-ups (unchanged)

  • B (now most operationally-visible): LLM guidance synthesis timeout — the per-node constraint classifier serializes (~31s/guide call); a call fired while the LLM is busy fast-fails empty (synthesis_error: context deadline exceeded). Candidate next sprint.
  • C: /v1/jiminy/latest JSON control-char escaping (verify hook impact).

Details: docs/development/jiminy-outcome-001/ (plan · verification · post).

@reh3376 reh3376 merged commit 01cb9bc into main Jun 8, 2026
11 checks passed

2 participants