dev: reh3376_dev01 -> main by github-actions[bot] · Pull Request #403 · reh3376/mdemg

github-actions · 2026-05-29T19:34:23Z

Summary

Development branch changes from reh3376_dev01.

Commits

Merge remote-tracking branch 'origin/main' into reh3376_dev01
Merge remote-tracking branch 'origin/main' into reh3376_dev01
fix(eventgraph-001): Grafana panel uses TSDB instead of unconfigured Prometheus datasource
Merge branch 'main' into reh3376_dev01
docs(eventgraph-001): feature doc + CHANGELOG + CLAUDE.md + sprint close (Epic 8)
docs(eventgraph-001): Tier 3 live e2e verification transcript (Epic 7)
fix(retrieval): set Activation on RRF RetrieveResult (EVENTGRAPH-001 fix-commit)
fix(eventgraph-001): restore full GRAFANA-AUDIT-001 audit_results.json
feat(observability): Grafana panel + Prometheus counters for reinforcement events (EVENTGRAPH-001 Epic 6)
feat(eventgraph): federation query helper + API endpoint (EVENTGRAPH-001 Epic 5)
feat(learning): record reinforcement events to TSDB (EVENTGRAPH-001 Epic 4)
refactor(learning): expose per-pair telemetry from Hebbian Cypher (EVENTGRAPH-001 Epic 3)
feat(tsdb): buffered reinforcement_events writer (EVENTGRAPH-001 Epic 2)
feat(tsdb): V0022 reinforcement_events hypertable (EVENTGRAPH-001 Epic 1)
docs(eventgraph-001): sprint plan (Pattern Y1 TSDB-federation)
docs(model-dist-002): flip adapter section to shipped + sprint close
feat(cli): enable mdemg model pull --adapter (MODEL-DIST-002 Epic 5+6)
feat(model-dist-002): Epic 4 local — Modelfile.adapter + ollama create
feat(model-dist-002): Epic 1-3 — MLX adapter → PEFT → GGUF LoRA + live verify
feat(model-dist-002): Epic 0 — sprint plan + workspace prep
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs(grafana-audit): Epic 4 + 7 — feature doc + sprint close
fix(grafana): Epic 3 — 5 panels recovered (3 FAIL + 2 schema-drift)
feat(grafana-audit): Epic 1 + 2 — full audit + findings
feat(grafana-audit): Epic 0 — sprint plan + audit harness
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs(api): document 19 previously-undocumented endpoints (follow-up #2)
Merge remote-tracking branch 'origin/main' into reh3376_dev01
feat(cli): add mdemg model run wrapper (follow-up #1 to MODEL-DIST-001)
chore(submodule + docs): bump homebrew-mdemg to v0.10.0 + cli-reference Model Distribution section
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs(release): promote Unreleased -> v0.10.0
merge: resolve quant_manifest.json conflicts (Epic 3 closeout vs squashed main)
docs(model-dist-001): sprint close — post.md
feat(model-dist-001): Epic 3 closeout — Ollama Library push complete
docs(model-dist-001): Epic 8 — Documentation Update (main repo)
docs(model-dist-001): Epic 7 — local-model-distribution feature doc
feat(model-dist-001): Epic 5 — V0021 model_install_events hypertable + writer
feat(model-dist-001): Epic 4 — mdemg model CLI + pluggable Fetcher interface
feat(model-dist-001): Epic 3 — 3 Modelfiles + local ollama create (push pending)
docs(model-dist-001): Epic 2 — defer adapter to MODEL-DIST-002
feat(model-dist-001): Epic 1 — built Q4_K_M + Q8_0 fused GGUFs
docs(sprint): MODEL-DIST-001 sprint plan + quant manifest skeleton
fix(service): replace decommissioned mlx-server LaunchAgent with llama-server
fix(api): /healthz returns build-time version, not stale literal "0.6.0"
chore(submodule): bump homebrew-mdemg to v0.9.0 formula + docs
Merge remote-tracking branch 'origin/main' into reh3376_dev01
docs(release): promote Unreleased -> v0.9.0

Auto-generated PR from reh3376_dev01 push

Promote the Unreleased CHANGELOG block to v0.9.0 (2026-05-06) ahead of release.yml / goreleaser tag push. New ### Breaking subsection captures two operator-visible cutovers since v0.8.5: (1) Phase 13.5 LLM runtime port 8101 -> 8102 + .env migration required; (2) Phase 13.6 MLX_* -> LLM_* env-var rename (legacy aliases retained for >= 1 release cycle). New ### Added entries: Phase 10.5 closeout (UBENCH framework promotion, commit 0389b49) and Claude Code GitHub App workflows (PRs #378, #379). All previously-Unreleased entries (Phase 14.2.3, 14.2.x, 14.1.x, 14, 13.6, 13.5, 13.2, 13.1) carried forward unchanged into the v0.9.0 block. Fresh empty Unreleased section seeded above. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Bumps packaging/homebrew-mdemg pointer a235977 -> 6077097, which incorporates: - f9358cd Brew formula update for mdemg version v0.8.5 (goreleaser, prior) - b4a0d2c Brew formula update for mdemg version v0.9.0 (goreleaser, this release) - 6077097 docs: v0.9.0 -- CHANGELOG, README What's New, beta-testing version pin Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

`config.FromEnv()` defaulted MdemgVersion/MdemgCommit to literal "0.6.0"/ "unknown" when MDEMG_VERSION/MDEMG_COMMIT envs were unset. Both /healthz and /readyz serialize cfg.MdemgVersion, so they reported "0.6.0" forever regardless of the actual binary's ldflags-injected cli.Version. Fix: defaults to "" in config; cli/config_loader.go injects cli.Version / cli.Commit (the build-time vars set by goreleaser ldflags) when the env override is unset. Operators can still pin via MDEMG_VERSION env. Live-verified: dev build (no ldflags) now reports {"version":"dev"} on /healthz instead of the lying "0.6.0". Production builds via goreleaser will report the real semver tag. TestHandleHealthz unaffected (sets cfg.MdemgVersion directly). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
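The override order described above (env pin first, build-time value otherwise) can be sketched as follows. This is a minimal illustration, not the repository's code: `resolveVersion` and `buildVersion` are hypothetical names standing in for the logic in cli/config_loader.go and the ldflags-injected cli.Version.

```go
package main

import "fmt"

// buildVersion stands in for the ldflags-injected cli.Version. "dev" is the
// value an unstamped build reports, per the commit message above.
var buildVersion = "dev"

// resolveVersion returns the operator's env override when it is set and
// falls back to the build-time version otherwise. Hypothetical helper.
func resolveVersion(envOverride, build string) string {
	if envOverride != "" {
		return envOverride
	}
	return build
}

func main() {
	// MDEMG_VERSION unset: the build-time value is reported.
	fmt.Println(resolveVersion("", buildVersion))
	// MDEMG_VERSION pinned by the operator: the pin is reported.
	fmt.Println(resolveVersion("1.2.3", buildVersion))
}
```

Keeping the config default empty is what lets the loader tell "operator did not set it" apart from "operator set it to the old literal".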

…a-server

Phase 13.5 cutover (2026-05-03) replaced mlx_lm.server (port 8101) with llama.cpp llama-server (port 8102) as the production LLM runtime, but the embedded launchd plist template + service install code paths were never updated. Any operator running 'mdemg service install' from a fresh checkout got the decommissioned mlx_lm.server agent; mdemg's startup preflight then failed because LLM_ENDPOINT=http://127.0.0.1:8102/v1 wasn't reachable.

Changes:
- New packaging/launchd/com.mdemg.llama-server.plist with the Phase 13.5 production flags (--ctx-size 32768 --parallel 4 --cont-batching --metrics --jinja). Byte-identical mirror at internal/cli/launchd_templates/ for the embed.FS (CI sync-check enforced).
- Removed packaging/launchd/com.mdemg.mlx-server.plist + embed.FS mirror. mlx_lm.server is decommissioned and known-broken on M5 + macOS 26.3.x; keeping the template would just risk re-deploying it.
- internal/cli/service_darwin.go: launchdServices entry replaced with com.mdemg.llama-server. resolveMLXLMBin renamed to resolveLlamaServerBin with primary env MDEMG_LLAMA_SERVER_BIN, deprecation alias for MDEMG_MLX_LM_BIN (slog.Warn at boot, retained ≥1 release cycle per the Phase 13.6 deprecation pattern), PATH lookup of `llama-server`. resolveMDEMGModelPath default updated to the canonical Phase 13.5 GGUF filepath (.local-models/mdemg-llm-v1-gguf/mdemg-llm-v1.Q5_K_M.gguf) since llama-server takes a `.gguf` filepath, not an HF-format directory like mlx_lm.server. Install error message updated for the new env var name + remediation steps (`brew install llama.cpp`).
- migrateLegacyMLXServerPlist() added: if a pre-cutover com.mdemg.mlx-server plist is bootstrapped on the operator's machine, Install() boots it out and renames the file to .disabled-phase13_5 (matches the manual operator convention from Phase 13.5 rollout). Best-effort: failures don't block the install.
- internal/cli/service_darwin_test.go fully rewritten:
  * TestLaunchdServicesIncludesLlamaServer asserts the new entry exists and is Optional=false (production matches Hotfix 11.6.3.1; the old test asserted Optional=true, a latent lie since 2026-05-02 that Linux CI never caught because of //go:build darwin)
  * TestLlamaServerPlistEmbedded replaces TestMLXServerPlistEmbedded; additionally asserts mlx-server.plist is NOT in embed.FS
  * Two resolver tests for the primary env var
  * New TestResolveLlamaServerBinFallsBackToMLXAlias proves the Phase 13.6 deprecation alias path works
  * resolveMDEMGModelPath tests updated for the new GGUF default
- internal/cli/watchdog.go: help text references com.mdemg.llama-server (instead of com.mdemg.mlx-server) and llama-server (instead of mlx_lm.server). Notes that mdemg_mlx_health_state metric name is retained for dashboard compatibility.

Tested:
- Tier 1 unit: 7/7 new tests pass; full ./internal/cli/... suite green (61s wall-clock).
- Tier 2 integration: golangci-lint run ./internal/cli/: 0 issues. CI plist sync-check (diff -q packaging/launchd/*.plist internal/cli/launchd_templates/): 6/6 byte-identical.
- Tier 3 live e2e: deferred. Running mdemg service install on the operator's currently-serving machine would briefly bootout the running llama-server LaunchAgent (PID 20527 actively serving production inference). The hand-installed llama-server plist on the operator's machine is byte-equivalent (modulo template substitutions) to what this commit will install via `mdemg service install` on a fresh operator setup, so the operator can verify on next planned redeploy.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
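The resolution order named for resolveLlamaServerBin (primary env var, then the deprecated alias with a warning, then a PATH lookup) might look roughly like the sketch below. The function name `resolveBin`, the injected `getenv`/`lookPath`/`warn` parameters, and the error text are assumptions made so the example runs without touching the host. The real function presumably uses os.Getenv, exec.LookPath, and slog.Warn.

```go
package main

import (
	"errors"
	"fmt"
)

// resolveBin is a hypothetical stand-in for resolveLlamaServerBin. The env
// var names come from the commit message; everything else is illustrative.
func resolveBin(
	getenv func(string) string,
	lookPath func(string) (string, error),
	warn func(string),
) (string, error) {
	// 1. Primary env var wins outright.
	if p := getenv("MDEMG_LLAMA_SERVER_BIN"); p != "" {
		return p, nil
	}
	// 2. Deprecated alias still works, but emits a warning.
	if p := getenv("MDEMG_MLX_LM_BIN"); p != "" {
		warn("MDEMG_MLX_LM_BIN is deprecated; use MDEMG_LLAMA_SERVER_BIN")
		return p, nil
	}
	// 3. Fall back to a PATH lookup of the binary name.
	if p, err := lookPath("llama-server"); err == nil {
		return p, nil
	}
	return "", errors.New("llama-server not found: set MDEMG_LLAMA_SERVER_BIN or run `brew install llama.cpp`")
}

func main() {
	env := map[string]string{"MDEMG_MLX_LM_BIN": "/opt/legacy/bin/server"}
	getenv := func(k string) string { return env[k] }
	lookPath := func(string) (string, error) { return "", errors.New("not on PATH") }
	warn := func(msg string) { fmt.Println("WARN:", msg) }

	path, err := resolveBin(getenv, lookPath, warn)
	fmt.Println(path, err)
}
```

Injecting the lookups is also what makes a test like TestResolveLlamaServerBinFallsBackToMLXAlias cheap to write, although the repository's tests may be structured differently.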

Epic 0 of Sprint MODEL-DIST-001 — Local LoRA Distribution via Ollama Library. Sprint plan in 12-section v1.0 format. Supersedes parts of the speculative spec at docs/research/mdemg_sprint_ideas/MDEMG_FT_LORA_PACKAGING_SPEC.md (HF Hub vs Ollama Library; adapter-only vs both-fused-and-adapter; Apple Silicon scope vs cross-platform). Configurability Contract — every operator-visible value is dynamic per the framework's no-hardcoding rule. 12 env vars + flag overrides + sensible defaults. ModelFetcher interface decouples CLI from Ollama-specific knowledge; v1 ships OllamaFetcher only, future backends (HF / S3 / GitHub Release / file) plug in via factory dispatch on MDEMG_MODEL_BACKEND without touching the CLI surface. Forensic from Epic 0: - adapters/tier1/adapters.safetensors verified present (514 MB MLX, Phase 5 SFT Iter 2400 best output) - mdemg-llm-v1.Q5_K_M.gguf SHA256 captured (9.8 GB; 144ad7231...) - f16 GGUF intermediate NOT on disk; Epic 1 will regenerate via convert_hf_to_gguf.py from the MLX merged model (~5 min) - qwen3:14b model-layer digest captured from Ollama registry; manifest digest to be computed at Epic 3 for Modelfile FROM @sha256: pinning quant_manifest.json skeleton with Q5_K_M SHA pre-populated; Q4_K_M / Q8_0 / adapter SHAs filled in during Epics 1+2. Estimated effort 5–7 dev-days. OpenAI spend $0. Risk medium (Ollama publish one-way; MLX→PEFT→GGUF LoRA conversion is the riskiest engineering item with documented contingency to defer to MODEL-DIST-002 if blocked). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Pipeline (CLAUDE.md Phase 13.5 documented path):
1. mlx_lm.fuse --dequantize: mlx-community/Qwen3-14B-4bit + adapters/tier1/ -> 29.6 GB bf16 HF safetensors at .local-models/qwen3-14b-mdemg-v1-bf16/
2. convert_hf_to_gguf.py --outtype f16 -> 30 GB f16 GGUF (required neural/.venv interpreter with torch + transformers + gguf installed; /opt/homebrew/bin/convert_hf_to_gguf.py uses system python which lacks these, so gguf/sentencepiece/protobuf were installed into neural/.venv)
3. llama-quantize Q4_K_M -> 9.0 GB (4.87 BPW; 40s wall on M5)
4. llama-quantize Q8_0 -> 16 GB (8.50 BPW; 11s wall on M5)
5. Live smoke per new quant via llama-server on port 18102: both serve /v1/models cleanly with embedded chat_template

SHAs captured in quant_manifest.json:
  Q4_K_M: 401161710c22f0ae...411d42ea
  Q5_K_M: 144ad723101d688f...d5f5d54 (matches Epic 0 baseline)
  Q8_0:   fc14dcb40af1bb58...8db6089
  f16:    436cd6f41a684805...3217bd (intermediate, retained for Epic 2)

Resource matrix updated with empirical sizes (Q4_K_M is 9.0 GB vs estimated 6.5 GB; min RAM revised 8 -> 12 GB to cover ~3 GB working memory above weights). 14B params x 4.87 BPW ≈ 8.5 GB matches the formula.

GGUF binary artifacts stay local: .local-models/ is gitignored per .gitignore:70. Sprint deliverable in git is just the manifest update.

Production llama-server (PID 20527 on port 8102) undisturbed throughout Epic 1; live smokes used port 18102.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
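The size check quoted above ("14B params x 4.87 BPW ≈ 8.5 GB") is the usual weights-only estimate: parameters times bits per weight, divided by 8 bits per byte. A small sketch, assuming a nominal 14e9 parameter count and decimal gigabytes. `estimateBytes` is a hypothetical helper, and real GGUF files also carry metadata and mixed-precision tensors, so measured sizes differ somewhat.

```go
package main

import "fmt"

// estimateBytes applies size ≈ params × bits-per-weight / 8.
// It covers weights only and ignores file metadata.
func estimateBytes(params, bpw float64) float64 {
	return params * bpw / 8
}

func main() {
	const params = 14e9 // nominal "14B" from the commit message

	quants := []struct {
		name string
		bpw  float64
	}{
		{"Q4_K_M", 4.87},
		{"Q8_0", 8.50},
	}
	for _, q := range quants {
		gb := estimateBytes(params, q.bpw) / 1e9 // decimal GB
		fmt.Printf("%s: about %.1f GB\n", q.name, gb)
	}
}
```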

Adapter (LoRA-only Modelfile via ADAPTER directive) deferred per the sprint plan's documented contingency clause. Fused-only path (Epics 1, 3, 4, 5) continues — that's the primary operator value. Forensic findings (epic_2_forensic.md): - MLX adapter is well-formed: 560 tensors, 40 layers x 7 target_modules, rank 32, alpha 64, scale 20.0. - convert_lora_to_gguf.py is NOT in brew install llama.cpp; would need manual fetch from llama.cpp source. - MLX -> PEFT requires tensor transposition: MLX lora_a is (in, rank); PEFT expects (rank, in). Same for lora_b. - Estimated 80-95 min to complete vs ~30 min budget remaining for Epic 2. - Hit the contingency criterion: "MLX -> PEFT conversion blocked by tooling gaps." Decision: defer adapter scope to MODEL-DIST-002 (new follow-up sprint, to be planned separately). Fused-only ships this sprint. Knock-on changes (in-flight to subsequent epics): - Epic 3: drop Modelfile.adapter; publish only 3 fused quants. - Epic 4 CLI: --adapter flag accepted at parse-time but errors with "lands in MODEL-DIST-002"; machinery preserved for forward-compat. - Epic 6 e2e: drop adapter-pull step. - Epic 7 feature doc: adapter section notes "coming in MODEL-DIST-002". Artifacts preserved on disk for MODEL-DIST-002 pickup: - adapters/tier1/adapters.safetensors (MLX, 514 MB) - .local-models/mdemg-llm-v1-gguf/mdemg-llm-v1.f16.gguf (30 GB, retained as base for llama-server --lora verification later) quant_manifest.json adapter block updated with status=deferred + reason. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
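The layout mismatch noted in the forensic findings (MLX stores lora_a as (in, rank), PEFT expects (rank, in)) comes down to a per-tensor transpose. The sketch below shows only that step on a toy matrix. It is not the conversion tooling, and a real converter would also have to rename tensors and handle dtypes and scaling, which the source does not detail.

```go
package main

import "fmt"

// transpose returns the transpose of a rectangular row-major matrix, turning
// a tensor of shape (in, rank) into one of shape (rank, in).
func transpose(m [][]float32) [][]float32 {
	if len(m) == 0 {
		return nil
	}
	rows, cols := len(m), len(m[0])
	out := make([][]float32, cols)
	for j := range out {
		out[j] = make([]float32, rows)
		for i := 0; i < rows; i++ {
			out[j][i] = m[i][j]
		}
	}
	return out
}

func main() {
	// Toy stand-in for an MLX lora_a tensor with in=3, rank=2.
	loraA := [][]float32{{1, 2}, {3, 4}, {5, 6}}
	peft := transpose(loraA)
	fmt.Println("shape:", len(peft), "x", len(peft[0]))
	fmt.Println(peft)
}
```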

…sh pending) Authored 3 Ollama Modelfiles in packaging/ollama/: Modelfile.Q4_K_M — 9.0 GB, 12 GB min RAM, 16 GB recommended Modelfile.Q5_K_M — 11 GB, 14 GB min RAM, 24 GB recommended (production canonical) Modelfile.Q8_0 — 16 GB, 20 GB min RAM, 32 GB recommended Common shape: FROM ./../../.local-models/mdemg-llm-v1-gguf/...gguf relative path (operator-machine local); num_ctx 32768, num_predict 4096, stop tokens <|im_end|>/<|im_start|>; Apache-2.0 LICENSE; SYSTEM positioning block. No TEMPLATE directive — chat template baked into GGUF metadata (Qwen3 chat_template.jinja preserved through mlx_lm.fuse --dequantize → convert_hf → llama-quantize pipeline). packaging/ollama/README.md documents the publish workflow including the fork-customization path (operators publishing under a different namespace follow MDEMG_MODEL_NAMESPACE per the Configurability Contract). Local ollama create completed for all 3: reh3376/mdemg-llm-v1:Q4_K_M ID 5c3a7252c295 reh3376/mdemg-llm-v1:Q5_K_M ID 08c13b480864 reh3376/mdemg-llm-v1:Q8_0 ID 6b1006facd36 Layers de-duplicated: config + params + system layers (3 layers) are identical across all 3 quants; only the model blob (GGUF) differs. ** ollama push deferred ** — one-way action gated on operator confirmation per Sprint Plan §10 Risk #8. Operator must claim reh3376 namespace on ollama.com and generate API token before push proceeds. Local-create proves the Modelfiles are well-formed; push is a separate decision. Once pushed, manifest digests captured into quant_manifest.json (ollama_manifest_digest field per quant) for mdemg model verify. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…interface

Sprint MODEL-DIST-001 Epic 4: the bulk of the operator-facing surface.

New CLI subcommand group:
  mdemg model pull     # fetch + symlink + SHA verify
  mdemg model list     # show pulled models
  mdemg model verify   # re-check SHAs vs quant manifest
  mdemg model remove   # destructive (requires --yes)
  mdemg model where    # print resolved path for shell scripting

Pluggable backend (internal/cli/model_fetcher.go):
  type Fetcher interface { Name, Fetch, Verify, Remove }
  NewFetcher dispatches on cfg.ModelBackend (env: MDEMG_MODEL_BACKEND)
  v1 ships OllamaFetcher only; future backends (hf, s3, github-release, file) plug in via factory branch, with the CLI surface unchanged.

OllamaFetcher (internal/cli/model_fetcher_ollama.go):
  Encapsulates ALL Ollama-specific concepts: `ollama pull` invocation, manifest path under <OLLAMA_MODELS>/manifests/<OLLAMA_HOST>/<ns>/<n>/<tag>, mediaType=application/vnd.ollama.image.model layer filtering, blob path under <OLLAMA_MODELS>/blobs/sha256-<digest>, symlink under <MDEMG_MODEL_DIR>, idempotent.

Configurability Contract (no hardcoding; memory: feedback_no_hardcoded_values.md):
  12 env vars + flag overrides, each with v1-production-tuned defaults so `mdemg model pull` with no flags Just Works. See sprint plan §3.

Live-verified all 3 resolution paths:
  `--quant Q5_K_M` → namespace=reh3376
  `--namespace acme --name custom-model` → namespace=acme name=custom
  `MDEMG_MODEL_NAMESPACE=acme env` → env overrides applied

Added to internal/config/config.go: ModelBackend, ModelNamespace, ModelName, ModelQuants, ModelRamTiers, ModelQuant, AdapterBase, ModelDir, OllamaModelsRoot, OllamaRegistryHost, ModelManifestPath.

Embedded quant manifest (internal/cli/quant_manifest.json via embed.FS):
  Runtime source-of-truth for SHA verification. Operator override via MDEMG_MODEL_MANIFEST_PATH for air-gapped deployments. Mirrors docs/development/model-dist-001/quant_manifest.json.

RAM-tier auto-pick:
  Default JSON `{"<16":"Q4_K_M","<24":"Q5_K_M","default":"Q8_0"}` maps host RAM (sysctl on darwin, /proc/meminfo on linux) to quant. Operator override via MDEMG_MODEL_RAM_TIERS.

Adapter path (--adapter flag) returns ErrAdapterDeferred per Epic 2's contingency exit; adapter publication lands in MODEL-DIST-002. Flag machinery preserved for forward compatibility.

Tests (22, all green) in internal/cli/model_test.go:
- Backend factory dispatch (5 cases incl. case-insensitive, default, error)
- Quant allowlist parsing (5 cases incl. whitespace + empty entries)
- RAM-tier JSON parsing (default + operator override + malformed)
- PickQuantForRAM (7 boundary cases)
- ResolveQuant across paths (auto, explicit, rejection, operator-custom)
- QuantManifest load (embedded + file override + missing-file error)
- Ollama tag composition (fused + adapter forms)
- Manifest path composition under custom OLLAMA_MODELS/OLLAMA_HOST
- Blob path digest prefix handling
- Adapter deferred error
- Manifest JSON parser (mediaType filtering + malformed + no-model-layer)

Grep audit (verification checklist): grep on internal/cli/model*.go for hardcoded values found only help-text Long/example strings documenting defaults to operators, none in logic. Behavior values all flow through cfg.Model* fields.

Build + lint clean. Full cli test suite (61s wall) green.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
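One way to read the RAM-tier JSON is as an ordered list of strict upper bounds with a fallback: pick the quant for the smallest "<N" threshold that host RAM is below, else use "default". The sketch below implements that reading. `pickQuant` is a hypothetical name (the source names PickQuantForRAM but not its signature), and the strict less-than boundary and gigabyte units are assumptions.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// pickQuant maps host RAM (in GB) to a quant name using a tiers document of
// the form {"<16":"Q4_K_M","<24":"Q5_K_M","default":"Q8_0"}.
// Assumption: "<N" means strictly less than N GB.
func pickQuant(tiersJSON string, ramGB float64) (string, error) {
	var tiers map[string]string
	if err := json.Unmarshal([]byte(tiersJSON), &tiers); err != nil {
		return "", fmt.Errorf("parse RAM tiers: %w", err)
	}

	type tier struct {
		below float64
		quant string
	}
	var ordered []tier
	for key, quant := range tiers {
		if key == "default" {
			continue
		}
		if !strings.HasPrefix(key, "<") {
			return "", fmt.Errorf("unrecognized tier key %q", key)
		}
		limit, err := strconv.ParseFloat(strings.TrimPrefix(key, "<"), 64)
		if err != nil {
			return "", fmt.Errorf("unrecognized tier key %q", key)
		}
		ordered = append(ordered, tier{below: limit, quant: quant})
	}
	// Map iteration order is random, so sort thresholds ascending first.
	sort.Slice(ordered, func(i, j int) bool { return ordered[i].below < ordered[j].below })

	for _, t := range ordered {
		if ramGB < t.below {
			return t.quant, nil
		}
	}
	if q, ok := tiers["default"]; ok {
		return q, nil
	}
	return "", fmt.Errorf("no tier matched %.0f GB and no default is set", ramGB)
}

func main() {
	const defaults = `{"<16":"Q4_K_M","<24":"Q5_K_M","default":"Q8_0"}`
	for _, ram := range []float64{8, 16, 36} {
		q, err := pickQuant(defaults, ram)
		fmt.Println(ram, q, err)
	}
}
```

Sorting the thresholds before matching matters because Go randomizes map iteration order. Without it a 8 GB host could match "<24" before "<16".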

…+ writer

Sprint MODEL-DIST-001 Epic 5: observability for `mdemg model` operations. Grafana panels deferred to Sprint B (Grafana audit).

New migration: internal/tsdb/migrations/021_model_install_events.sql
  Hypertable on recorded_at, 7-day chunks, 3 indexes (quant-time, failed-events partial, backend-event-time). Columns: event_id CUIDv2 PK + recorded_at, event_type (pull/verify/remove), backend_name, namespace, model_name, quant, adapter bool, success bool, latency_ms, sha256, size_bytes, err_message (1 KB cap).

New writer: internal/tsdb/model_install_writer.go
  Synchronous single-row INSERT (not buffered + CopyFrom: the CLI is one-shot and writes are infrequent vs the V0017/V0018/V0019/V0020 retrieval-path writers that fire per-request). Nil-pool no-op for degraded mode. errMessageMaxLen=1024 truncation at write time. New modelInstallPool interface (Exec-shaped) avoids touching the existing CopyFrom-shaped poolIface used by buffered writers.

Wiring: internal/cli/model.go gets recordModelEvent(parent, cfg, row) helper:
- Returns immediately if !cfg.TSDBEnabled || cfg.TSDBHost==""
- 2s timeout on connect (TSDB unreachable doesn't block CLI exit)
- Logs warning + degrades gracefully on any TSDB error
Called from runModelPull (success + failure paths), runModelVerify (single sweep row), runModelRemove (success + failure paths).

Schema version bump: internal/config/config.go: TSDB_REQUIRED_SCHEMA_VERSION default 20→21. CI validator at .github/workflows/ci.yml:60-65 counts SQL files in internal/tsdb/migrations/ and asserts equality; now 21 files = 21 in config = passes.

Build + lint clean. Existing tsdb / cli test suites green; no new tests added for the writer itself (single INSERT mirrors V0017/V0018/V0019 patterns already covered; integration is operational verification at Epic 6 once tsdb is up in the dev stack).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
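The three behaviors listed for recordModelEvent (early return when TSDB is off, a 2s bound, warn-and-continue on failure) plus the 1 KB err_message cap can be sketched as below. All names except errMessageMaxLen are hypothetical, the write function is injected in place of a real database pool, and for simplicity the timeout here bounds the whole write rather than only the connect step the source describes.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// errMessageMaxLen is the cap named in the commit message.
const errMessageMaxLen = 1024

// eventRow is a trimmed-down, hypothetical stand-in for the V0021 row.
type eventRow struct {
	EventType  string
	ErrMessage string
}

// truncateErr caps a message at errMessageMaxLen bytes. Byte slicing can
// split a multi-byte UTF-8 character; a real writer may want to avoid that.
func truncateErr(msg string) string {
	if len(msg) <= errMessageMaxLen {
		return msg
	}
	return msg[:errMessageMaxLen]
}

// recordEvent never returns an error: telemetry failures are logged and
// swallowed so they cannot block or fail the CLI command.
func recordEvent(parent context.Context, enabled bool, host string, row eventRow,
	write func(context.Context, eventRow) error) {
	if !enabled || host == "" {
		return // TSDB disabled or unconfigured: silently skip
	}
	ctx, cancel := context.WithTimeout(parent, 2*time.Second)
	defer cancel()

	row.ErrMessage = truncateErr(row.ErrMessage)
	if err := write(ctx, row); err != nil {
		fmt.Println("warning: model event not recorded:", err)
	}
}

func main() {
	failing := func(context.Context, eventRow) error { return errors.New("tsdb unreachable") }
	recordEvent(context.Background(), true, "localhost", eventRow{EventType: "pull"}, failing)
	recordEvent(context.Background(), false, "", eventRow{EventType: "pull"}, failing) // no-op
}
```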

Sprint MODEL-DIST-001 Epic 7 — operator-facing feature documentation following the standard Why / Choices / How / How-to-use shape (memory: feedback_per_feature_docs_required.md). Contents: - Why: gap between brew install and a working local LLM after Phase 13.5 - Choices: backend matrix (Ollama vs HF vs GitHub vs S3 vs file://), artifact form (fused vs adapter), Apple Silicon scope, "Ollama runtime rejected (broken on M5+macOS 26.3.x), Ollama distribution only" - How it works: ASCII flow diagram covering CLI dispatch -> Fetcher interface -> OllamaFetcher (preflight, ollama pull, manifest discovery, blob resolve, symlink, SHA verify) -> V0021 observability row - How to use: * Quick start (3 commands: brew install ollama, mdemg model pull, curl /v1/models) * Explicit quant selection * Managing pulled models (list / verify / where / remove) * Forks + enterprise (MDEMG_MODEL_NAMESPACE override) * Air-gapped (MDEMG_MODEL_MANIFEST_PATH override) * Resource matrix per quant (disk, min RAM, recommended RAM, BPW) * Full Configurability Contract table (11 env vars + flags + defaults) * V0021 observability schema - Troubleshooting: ollama missing, SHA mismatch, quant allowlist rejection, RAM auto-detection failure, out-of-disk, symlink permission - Forward-looking: MODEL-DIST-002 adapter, Sprint B Grafana panels, future backends, cross-platform - References: all source-of-truth files cross-linked Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Sprint MODEL-DIST-001 Epic 8 — final epic, never cut (memory: feedback_sequential_epics.md). This commit lands the main-repo doc updates. The packaging/homebrew-mdemg/ submodule docs (README, CHANGELOG, formula caveats text) update at v0.10.0 release-tag time per the v0.9.0 release flow precedent — that's when goreleaser auto-regenerates mdemg.rb from .goreleaser.yaml's caveats template, and the tap-side README/CHANGELOG get edited in lockstep. Changes: - CHANGELOG.md: comprehensive Unreleased entry documenting Epics 0-5 + 7 landed in this sprint. Epic 3 ollama push and Epic 6 Tier 3 e2e marked as gated on operator confirmation. Adapter path explicitly deferred to MODEL-DIST-002 with epic_2_forensic.md cross-reference. Captures the Configurability Contract enumeration, the 3 quant SHAs, the Fetcher interface design, the V0021 hypertable, and the explicit out-of-scope list. - CLAUDE.md: new "Model Distribution (Sprint MODEL-DIST-001)" subsection in Architecture Notes, slotted ABOVE the existing Compose embed entry for visibility. Captures the pluggable-backend design, the Ollama-as- distribution-only constraint, the on-disk symlink + manifest discovery flow, the 11-knob Configurability Contract surface, the no-hardcoding enforcement, the TSDB V0021 hookup, and the Apple Silicon v1 scope. - README.md: new "Step 2b (optional): Pull the local LLM" section between Step 2 (Initialize/Start) and Open the Dashboard. 3-command quick start (brew install ollama -> mdemg model pull -> set MDEMG_MODEL_PATH). Cross-references the feature doc for the full Configurability Contract. - .goreleaser.yaml: caveats template updated to include `mdemg model pull` instructions. Goreleaser regenerates the homebrew formula's caveats block from this on the next v* tag push, so v0.10.0 will ship the new text to brew users automatically. 
Deferred to v0.10.0 release-tag time (handled per v0.9.0 precedent): - packaging/homebrew-mdemg/README.md update - packaging/homebrew-mdemg/CHANGELOG.md update - packaging/homebrew-mdemg/mdemg.rb regeneration (automatic via goreleaser from the .goreleaser.yaml change in this commit) - Submodule pointer bump in main repo Deferred to Epic 6 close (after operator does ollama push): - post.md sprint-close document - Capture of remote Ollama manifest digests into quant_manifest.json Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

All 3 fused quants now live on Ollama Library:
  https://ollama.com/reh3376/mdemg-llm-v1:Q4_K_M
  https://ollama.com/reh3376/mdemg-llm-v1:Q5_K_M
  https://ollama.com/reh3376/mdemg-llm-v1:Q8_0

End-to-end integrity verified: remote model-layer digests captured via GET https://registry.ollama.ai/v2/reh3376/mdemg-llm-v1/manifests/<quant> match the local Epic 1 SHAs exactly:
  Q4_K_M 401161710c22f0ae...411d42ea (matches Epic 1)
  Q5_K_M 144ad723101d688f...d5f5d54 (matches Epic 1)
  Q8_0   fc14dcb40af1bb58...8db6089 (matches Epic 1)

Captured into quant_manifest.json (both docs canonical + internal/cli embed.FS mirror, byte-synced):
- ollama_manifest_digest per quant (computed from the manifest body):
  Q4_K_M sha256:a210cccb12311773fd70bfa81f221ca0f7940a315bef87b84608caf894533b1b
  Q5_K_M sha256:ae6e54fe1ee0b487ae41260687ed14c46c30d1ffb0fece936282418b5bcb78e1
  Q8_0   sha256:93df4d64bfa751506f7afba8bf08b891ea828575b838adec17b9399ad85be718
- Corrected size_bytes (Epic 1 used approximate values; replaced with registry-reported exact bytes for each tag):
  Q4_K_M 9.0 GB -> 8.4 GB (9001753408 B; was 9658404096)
  Q5_K_M 11 GB -> 9.8 GB (10514569568 B; was 11811160064)
  Q8_0   16 GB -> 14.6 GB (15698534208 B; was 17179869184)
- Status flipped from "local-create done; push pending" to "published".

Embedded runtime manifest (internal/cli/quant_manifest.json) re-built into the binary via embed.FS. TestLoadQuantManifest_EmbeddedFallback green with new values.

Epic 3 of Sprint MODEL-DIST-001 now COMPLETE. Epic 6 (Tier 3 live e2e: `mdemg model pull` against the published tags + llama-server load on port 18102 + sanity inference) is now unblocked.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
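A digest "computed from the manifest body" is, under the usual registry convention, the SHA-256 of the exact response bytes, rendered as `sha256:<hex>`. A minimal sketch under that assumption: `manifestDigest` is a hypothetical helper and the body below is a placeholder, not a real Ollama manifest.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// manifestDigest hashes the raw manifest bytes. The bytes must be hashed
// exactly as received; re-serializing the JSON first would change the digest.
func manifestDigest(body []byte) string {
	sum := sha256.Sum256(body)
	return "sha256:" + hex.EncodeToString(sum[:])
}

func main() {
	body := []byte(`{"schemaVersion":2}`) // placeholder manifest body
	fmt.Println(manifestDigest(body))
}
```

A verify step would then compare this string against the ollama_manifest_digest field stored for the quant.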

Sprint MODEL-DIST-001 close-out per memory rule (feedback_sprint_plan_format.md §11 — sprint plans live in docs/development/<sprint-line>/ with the standard post.md companion). Sections (CLAUDE.md sprint-plan section guidance): - Outcome: 3 quants live on Ollama Library, mdemg model pull is the canonical install path - Process: how the plan held under reality (operator-surfaced no- hardcoding rule revised the plan in-place to add the Configurability Contract before code was written) - Findings: 5 smooth parts + 5 friction items, both honest: * convert_hf_to_gguf.py python deps gap (silent ModuleNotFoundError) * mlx_lm.fuse adapter-path requirement * convert_lora_to_gguf.py missing from brew install llama.cpp (proximate Epic 2 deferral trigger) * mdemg tsdb migrate CWD-aware .env loader quirk * Epic 1 size estimates off vs registry-reported exact bytes - Current state: per-layer state matrix - Testing & benchmarking: all 3 tiers documented (Tier 3 e2e captured V0021 rows for both pull + verify event_types — live-verified) - Risks & opportunities (forward): MODEL-DIST-002 adapter scope, Sprint B Grafana, cross-platform, HFFetcher slot, CWD-aware .env loader QoL - Sprint commits: 9 commits on dev01, mapped to their epics Closes Sprint MODEL-DIST-001 functionally. Operational sprint close (v0.10.0 release tag + tap-repo doc updates) is a separate motion. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…shed main) PR #385 squash-merged the original Epic 3 quant_manifest values (estimated sizes from llama-quantize wall output, null ollama_manifest_digest because the push hadn't happened yet) into main as commit f1d029a. Meanwhile on dev01, commit 87293f8 (Epic 3 closeout) corrected those values to the registry-canonical state after the ollama push completed: - size_bytes: replaced Epic 1 approximations with registry-reported exact bytes (Q4_K_M 9001753408 / Q5_K_M 10514569568 / Q8_0 15698534208) - size_human: 9.0/11/16 GB -> 8.4/9.8/14.6 GB (more accurate) - ollama_manifest_digest: null -> sha256:a210cccb...|ae6e54fe...|93df4d64... - status: "local-create done; push pending" -> "published (...)" Conflict resolution: keep dev01 (HEAD) values for both files — those are the registry-canonical post-push state. JSON validity verified for both files; TestLoadQuantManifest_{EmbeddedFallback,OperatorOverride,OverrideMissingFile} all green against the resolved embedded manifest. The non-conflicting fast-forwarded changes from main (claude workflow edits + dependabot go.mod/go.sum bumps) are folded in by this merge unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Promote the Sprint MODEL-DIST-001 entry from Unreleased to v0.10.0 (2026-05-11) ahead of release.yml / goreleaser tag push. Fresh empty Unreleased section seeded above. v0.10.0 ships: - mdemg model pull|list|verify|remove|where — one-command path from brew install mdemg to a working local LLM - Pluggable ModelFetcher interface (Ollama in v1, slots for HF/S3/GHR/file) - 3 fused GGUF quants live on Ollama Library at reh3376/mdemg-llm-v1 (:Q4_K_M 8.4 GB / :Q5_K_M 9.8 GB / :Q8_0 14.6 GB) - 11-knob Configurability Contract (every operator-visible value dynamic) - TSDB V0021 model_install_events hypertable + writer - docs/features/local-model-distribution.md Adapter (LoRA-only) path deferred to MODEL-DIST-002 per the sprint plan's documented contingency (epic_2_forensic.md). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ce Model Distribution section Stage 4 + Stage 5 of v0.10.0 release. Submodule pointer bump: packaging/homebrew-mdemg 6077097 -> c3aa68b incorporates: - 42d7390 — goreleaser auto-bumped mdemg.rb to version "0.10.0" + new caveats text on v0.10.0 tag push - c3aa68b — manual docs round-trip: CHANGELOG v0.10.0 entry, README Optional Pull-the-local-LLM section in Quick Start (full Ollama Library doc with quant matrix, list/verify/where/remove subcommands, fork variants via MDEMG_MODEL_NAMESPACE, architecture note "Ollama is distribution-only"), Upgrading to v0.10.0 + What's New in v0.10.0 blocks, default-LLM rotation history extended, mdemg_beta_testing.md version pin v0.9.0 -> v0.10.0 docs/user/cli-reference.md (per Stage 5 user request to align refs with current codebase): - New ## Model Distribution top-level section before ## Synergy Optimization (model command group is GroupID="config" in root.go but a top-level cli-ref section is cleaner for discoverability). Documents all 5 subcommands (pull, list, verify, remove, where) with flag tables, usage examples, the full Configurability Contract (11 knobs), the architecture note (Ollama is distribution-only). - Updated Environment Variable Reference with new "Model Distribution (Sprint MODEL-DIST-001, v0.10.0)" subsection — 11 env vars + defaults table. - Updated Command Tree Summary with the new model subcommand group slotted between Configuration and Advanced. docs/user/api-reference.md unchanged: Sprint MODEL-DIST-001 added zero HTTP endpoints (CLI-only sprint; observability via TSDB V0021 row writer is server-side internal). Audit also surfaced ~25 routes of pre-existing drift between code and docs (mostly path-parameter notation: `/v1/backup/` in code vs `/v1/backup/{id}` in docs — same routes — plus 3 undocumented /api/graph/* endpoints and 2 undocumented /v1/admin/features/{restart,stop} actions). That drift is out-of-scope for v0.10.0 and belongs in its own follow-up sprint. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

One-shot or interactive REPL chat against the configured LLM endpoint (default: llama-server at port 8102 per Phase 13.5). Closes the gap operators noted between `ollama run` and the mdemg framework.

Two modes:
- One-shot: `mdemg model run -p "hello"` or positional arg after `--`
- Interactive REPL: no prompt; reads stdin line-by-line, accumulates conversation history across turns

Pure stdlib HTTP (no llmclient retries/breakers/recording). CLI invocations are intentionally NOT recorded to llm_interactions: this is an ad-hoc exploration tool, not a production code path, and skipping the recording keeps the training-data corpus clean.

Every operator-visible value is dynamic per the no-hardcoding rule:
  --endpoint      override cfg.EffectiveLLMEndpoint
  --model         override cfg.LLMModel (final fallback: mdemg-llm-v1)
  --prompt/-p     one-shot prompt (omit for REPL)
  --system/-s     system message
  --temperature   (default 0.7)
  --max-tokens    (default 1024)
  --timeout       (default 60s)

Live-verified end-to-end on the operator's running llama-server on port 8102 with mdemg-llm-v1: one-shot worked; system+prompt with --model override worked.

13 unit tests in model_run_test.go covering: message composition (system first, no-system skip, history preservation), config resolution (flag > cfg > final fallback), OpenAI-compat HTTP shape, error paths (HTTP error, inline error object, no choices, timeout), trailing-slash endpoint normalization, body-bounding helper. All green.

Renamed local body-bounding helper to `truncateRunBody` to avoid name collision with a same-named helper in internal/cli/data.go.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
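Three of the behaviors the unit tests cover (system message first with history preserved, flag > cfg > fallback model resolution, trailing-slash normalization) are small enough to sketch. The helper names here are hypothetical, and the `/chat/completions` suffix assumes the OpenAI-compatible route relative to an endpoint that already ends in `/v1`.

```go
package main

import (
	"fmt"
	"strings"
)

// message mirrors the OpenAI-compatible chat message shape.
type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// buildMessages puts the optional system message first, then prior turns,
// then the new user prompt.
func buildMessages(system string, history []message, prompt string) []message {
	msgs := make([]message, 0, len(history)+2)
	if system != "" {
		msgs = append(msgs, message{Role: "system", Content: system})
	}
	msgs = append(msgs, history...)
	return append(msgs, message{Role: "user", Content: prompt})
}

// resolveModel applies flag > config > final fallback.
func resolveModel(flagValue, cfgValue string) string {
	switch {
	case flagValue != "":
		return flagValue
	case cfgValue != "":
		return cfgValue
	default:
		return "mdemg-llm-v1"
	}
}

// chatURL strips trailing slashes so "…/v1" and "…/v1/" yield the same URL.
func chatURL(endpoint string) string {
	return strings.TrimRight(endpoint, "/") + "/chat/completions"
}

func main() {
	history := []message{
		{Role: "user", Content: "hi"},
		{Role: "assistant", Content: "hello"},
	}
	for _, m := range buildMessages("be brief", history, "what is mdemg?") {
		fmt.Printf("%-9s %s\n", m.Role, m.Content)
	}
	fmt.Println(resolveModel("", ""))
	fmt.Println(chatURL("http://127.0.0.1:8102/v1/"))
}
```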

Audit of internal/api/server.go (167 routes) vs docs/user/api-reference.md surfaced 19 genuinely missing endpoints. v0.10.0 commit noted this as out-of-scope; this commit resolves the gap.

Audit method: extract mux.HandleFunc registrations from server.go, extract documented "VERB /path" headings from api-reference.md, normalize both to strip path parameters and trailing prefix slashes, diff. Of the initial 24-entry code-only set, 5 are false positives (combined headers like "POST /v1/admin/features/start|stop|restart" cover the individual verbs; "GET|POST /v1/jiminy/protocol/metrics" covers both methods on one route).

Added sections:

Jiminy / J17 (10 endpoints, all under "## Jiminy Inner-Voice"):
  GET|POST /v1/jiminy/protocol/metrics   # snapshot + reset
  GET  /v1/jiminy/protocol/status        # per-session J17 state
  POST /v1/jiminy/checkpoint             # tier-transition checkpoint
  POST /v1/jiminy/resume-protocol        # restore from checkpoint
  POST /v1/jiminy/extension              # operator-driven tier hold
  POST /v1/jiminy/strict                 # toggle strict mode per session
  POST /v1/jiminy/reformulate            # advisory -> imperative rewrite
  POST /v1/jiminy/classify               # pre-Write/Edit pass/deny gate
  GET  /v1/jiminy/latest                 # most recent guidance (warm store)
  POST /v1/jiminy/warm                   # eager cache warmup

Memory / Graph (3 endpoints, under "## Memory Operations"):
  GET /v1/memory/graph/topology          # node/edge counts per layer
  GET /v1/memory/graph/neighborhood      # local 1-3 hop walk
  GET /v1/memory/spaces                  # root listing of all spaces

Observability (2 endpoints, under "## Metrics & Monitoring"):
  GET /v1/metrics/trends                 # TSDB time-series query
  GET /v1/prometheus                     # Prometheus scrape endpoint

Dashboard / Viz (4 endpoints, new "## Dashboard / Visualization (internal)" section before MCP Server Tools; operator-internal endpoints backing the browser dashboard at /ui/):
  GET /api/graph/data                    # force-directed graph data
  GET /api/graph/fields                  # schema field catalog
  GET /api/graph/health                  # explorer health
  GET /viz/topology                      # standalone HTML topology view

Each entry has handler-signature-derived request/response shape, query parameter table, sample curl/JSON examples following the existing api-reference convention. TOC updated with new "Dashboard / Visualization (internal)" entry and renumbered tail.

Out of scope (deliberate, deferred):
- 28 "docs-only" entries from the audit are confirmed false positives from prefix-matching path normalization (code registers /v1/memory/nodes/ with trailing slash and routes the suffix; docs spell out the full /v1/memory/nodes/{node_id}/archive form correctly)
- /v1/symbols root path is partially covered by /v1/symbols/relationships + /v1/symbols/{id}/relationships in docs; root listing endpoint documentation can land later if/when its handler grows specific shape
- /v1/conversation/observations covered indirectly by the flag-for-org endpoint documentation

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
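The audit method (normalize, expand combined headings, diff) can be sketched roughly like this. It is a hypothetical Python reconstruction, not the script actually used: it assumes every route is available as a "VERB /path" string and it truncates at the first path parameter, which is only one possible normalization.

```python
import re


def normalize(route: str) -> str:
    """Drop the first {param} segment (and anything after it) plus trailing slashes."""
    verb, path = route.split(" ", 1)
    path = re.sub(r"/\{[^}]+\}.*$", "", path)
    return f"{verb} {path.rstrip('/')}"


def expand(heading: str) -> set:
    """Expand combined headings such as 'GET|POST /a/b' or 'POST /a/start|stop'."""
    verbs, path = heading.split(" ", 1)
    head, _, tail = path.rpartition("/")
    return {
        normalize(f"{verb} {head}/{alt}")
        for verb in verbs.split("|")
        for alt in tail.split("|")
    }


def missing_from_docs(code_routes: list, doc_headings: list) -> set:
    """Routes registered in code that no documented heading covers."""
    code = set().union(*(expand(r) for r in code_routes)) if code_routes else set()
    docs = set().union(*(expand(h) for h in doc_headings)) if doc_headings else set()
    return code - docs
```

With this normalization a code-side prefix route such as `/v1/backup/` and a documented `/v1/backup/{id}` compare equal, which is the kind of false positive the commit message describes filtering out.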

Sprint GRAFANA-AUDIT-001 Epic 0.

Builds the per-panel audit harness: walks every panel in deploy/docker/grafana/dashboards/*.json, extracts rawSql/sql targets, substitutes Grafana macros ($__timeFilter, $__timeFrom/To, $__interval, $__unixEpoch*) + template variables ($space_id, $instance + multi-value variants like ${space_id:raw}), executes via docker exec mdemg-timescaledb-1 psql, classifies each panel target as PASS / EMPTY / FAIL / SKIP.

Tier 1 unit tests (17 tests, all green):
- Template-variable substitution: time_filter / from-to / unix epoch / interval / interval_ms / space_id (3 syntaxes) / instance (3 syntaxes) / multi-macro composite query
- Table extraction (FROM/JOIN with alias, case-insensitive, no-table)
- Panel walking (flat, nested rows, targets-with-sql vs no-sql)

Smoke test against mdemg-overview.json IMMEDIATELY validated the operator's "diminished observability" report: 5 of 13 panels FAIL, 1 EMPTY, 7 PASS on the front-page dashboard:
  FAIL   Request Rate
  FAIL   Error Rate
  FAIL   Circuit Breakers
  FAIL   Requests by Status
  FAIL   Rate Limit Rejections
  EMPTY  Request Latency Distribution (t0; t1/t2 PASS)

The original 11-panel sample missed these because it sampled different panels. Lesson: trust the rigorous audit, not the sample. Sprint proceeds to Epic 1 (full audit across all 146 panels) immediately.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
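A rough sketch of the substitute-then-classify flow the harness performs. This is not the code in scripts/grafana_panel_audit.py: the function names and signature are hypothetical, only a subset of macros is handled, and the "3 syntaxes" are assumed to be `$name`, `${name}` and `${name:raw}`.

```python
import re
from typing import Optional


def substitute(sql: str, *, time_from: str, time_to: str,
               interval: str, interval_ms: str, variables: dict) -> str:
    """Expand a few Grafana macros and template variables into plain SQL."""
    # $__timeFilter(col) becomes an explicit BETWEEN on that column.
    sql = re.sub(
        r"\$__timeFilter\((\w+)\)",
        lambda m: f"{m.group(1)} BETWEEN '{time_from}' AND '{time_to}'",
        sql,
    )
    sql = sql.replace("$__timeFrom()", f"'{time_from}'")
    sql = sql.replace("$__timeTo()", f"'{time_to}'")
    # Longer macro first so $__interval does not clobber $__interval_ms.
    sql = sql.replace("$__interval_ms", interval_ms)
    # Substituted bare: the panel SQL is expected to supply its own quotes.
    sql = sql.replace("$__interval", interval)
    for name, value in variables.items():
        sql = sql.replace("${%s:raw}" % name, value)
        sql = sql.replace("${%s}" % name, value)
        sql = re.sub(r"\$%s\b" % re.escape(name), lambda m: value, sql)
    return sql


def classify(has_sql: bool, error: Optional[str], row_count: int) -> str:
    """Map one panel target's execution outcome to an audit verdict."""
    if not has_sql:
        return "SKIP"   # non-SQL panel type
    if error:
        return "FAIL"   # SQL error
    return "PASS" if row_count > 0 else "EMPTY"
```

In the real harness the substituted SQL is run through psql inside the TimescaleDB container; here the execution step is left out and only its result (error text, row count) feeds `classify`.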

Sprint GRAFANA-AUDIT-001 Epics 1 + 2.

Per-panel rigorous audit of all 165 target executions across 146 panels in 8 dashboards. Headline:
  PASS  125 (76%)  executes, returns rows in 24h window
  EMPTY  19 (12%)  executes, 0 rows
  FAIL    3 (2%)   SQL error
  SKIP   18 (11%)  non-SQL panel types

Harness fix mid-Epic-1: $__interval substitution was wrapping the value in quotes, but Grafana convention has panel SQL provide its own outer quotes, producing doubled quotes and 18 false-positive FAILs. Fixed: substitute bare value. Verified by re-run: 20→3 FAILs.

Real failures (Epic 2 findings):
(a) 3 SQL bugs on mdemg-llm-routing.json: all three panels hardcoded `mdemg-dev` (unquoted) in WHERE clauses instead of '$space_id' template variable. PG parses `mdemg-dev` as subtraction.
(b) 5 schema-drift EMPTYs: panel filter expects metric_type or labels shape that doesn't match server emission:
  - mdemg_j17_events_total: panel 'counter', server 'gauge'
  - mdemg_rsic_action_total: panel status='success', server status='completed'
  - 2 more suspected pending full-SQL inspection.
(c) 2 missing-server-side metrics: mdemg_rate_limit_rejected_total and mdemg_http_request_duration_seconds_p50 not emitted. Will be documented; server emission is follow-up.
(d) ~11 sparse-data EMPTYs: panel SQL correct, no rows in 24h window. Widening time-range in Epic 4.

Projected post-Epic-3/4: 133 PASS, ≤11 EMPTY, 0 FAIL.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Sprint GRAFANA-AUDIT-001 Epic 3.

Minimum-change JSON edits to fix category (a) SQL bugs and category (b) schema-drift EMPTYs identified in Epic 1/2.

mdemg-llm-routing.json (3 panels, all category-a SQL bugs):
- LLM call distribution by model_name (24h)
- LLM latency p50 / p95 / p99 by task × model
- LLM error rate % by task_name (selected range)

Bug: WHERE clause was `($space_id = '' OR space_id = '$space_id')`. The first $space_id was unquoted, so after substitution PG parsed `mdemg-dev = ''` as the arithmetic expression `mdemg - dev` over columns that don't exist. Also breached the no-hardcoding rule (memory: feedback_no_hardcoded_values.md).

Fix: wrap the first variable reference in quotes → `('$space_id' = '' OR space_id = '$space_id')`, a proper string-literal comparison that also serves as the All-spaces guard the panel author intended.

Verdict: 3 FAIL -> 3 PASS. mdemg-llm-routing is now 4/4 PASS.

mdemg-j17.json :: Total Events (1 panel, category-b drift):
Panel filtered `metric_type = 'counter'` (Prometheus naming convention because metric is `mdemg_j17_events_total`). Server actually emits `metric_type = 'gauge'`. 6,393 rows in 7d; 0 panel matches.
Fix: align panel filter to `'gauge'`.
Verdict: EMPTY -> PASS.

mdemg-rsic.json :: Action Success Rate t0 (1 panel target, category-b drift):
Panel filtered `labels->>'status' = 'success'`. Server actually emits `'completed'` (181 rows in 24h; 0 panel matches).
Fix: align panel filter to `'completed'`. The t1 'failed' target is retained unchanged; its EMPTY result is now an accurate observation (server emits no `'failed'` actions; 0 = legitimate zero).
Verdict: 1/2 EMPTY -> PASS, 1/2 EMPTY accurate-zero.

Audit verdict counts:
  Before: 125 PASS, 19 EMPTY, 3 FAIL, 18 SKIP
  After:  130 PASS, 17 EMPTY, 0 FAIL, 18 SKIP

Remaining 17 EMPTYs (Epic 4 disposition):
- 5 category-c emission regression: 4 rsic metrics stopped at 2026-05-07/08 (server-side investigation queued as follow-up)
- 2 category-c never-emitted: Rate Limit Rejections, p50 latency
- 8 category-d sparse-data on ft-training: widen time-range
- 1 mdemg-jiminy :: Effectiveness Trends: CTE pending inspection
- 1 mdemg-rsic :: Action Success Rate t1 (accurate-zero)

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
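The category-a quoting bug is easy to reproduce with plain text substitution. The snippet below is only an illustration of why the unquoted reference breaks and why the quoted form doubles as an all-spaces guard; `render` is a hypothetical stand-in for Grafana's variable interpolation.

```python
def render(template: str, space_id: str) -> str:
    """Naive template-variable substitution: the value is pasted in verbatim."""
    return template.replace("$space_id", space_id)


# WHERE clause before and after the Epic 3 fix, as quoted in the commit message.
BUGGY = "($space_id = '' OR space_id = '$space_id')"
FIXED = "('$space_id' = '' OR space_id = '$space_id')"

# Unquoted: the value lands in the SQL as bare identifiers joined by a minus sign.
print(render(BUGGY, "mdemg-dev"))

# Quoted: both sides are string literals, so the clause is valid SQL.
print(render(FIXED, "mdemg-dev"))

# Empty selection: '' = '' is true for every row, i.e. the "all spaces" case.
print(render(FIXED, ""))
```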

Sprint GRAFANA-AUDIT-001 closeout (Epics 4 + 5 + 6 + 7 combined as a single doc-only commit; Epic 5 deferred and Epic 6 deferred-to-operator as documented in post.md).

New: docs/features/observability-dashboards.md (286 lines), a full operator-facing inventory of the 8 dashboards with:
- Per-dashboard purpose + panel count + primary tables
- Audit verdict table (130/17/0/18 post-Epic-3)
- Epic 3 fix log: 3 SQL bugs + 2 schema-drift filters
- Known gaps in 3 buckets: (c) emission regression (4 May-7-8 metrics, current codebase has zero refs, i.e. server removed emission), (c) never-emitted (mdemg_rate_limit_rejected_total + mdemg_http_request_duration_seconds_p50), (d) sparse/zero data on this dev TSDB (ft-training tables)
- Refresh expectations per table
- Operator playbook for re-running scripts/grafana_panel_audit.py
- Forward-looking: CI integration, coverage expansion, server-side emission restore

New: docs/development/grafana-audit-001/post.md, the sprint close per memory rule; covers process / smooth-parts / friction / sprint-plan vs reality / current state / risks-opportunities / commits.

Epic deferrals (documented in post.md):
- Epic 5 (coverage expansion for 11 unused TSDB tables): deferred because most target tables are zero on this dev TSDB. Adding panels would create more EMPTYs, defeating the goal.
- Epic 6 (Tier 3 browser e2e): deferred to operator; not blocking.

CHANGELOG Unreleased entry covers the sprint at high level + cross-references the feature doc.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Sprint MODEL-DIST-002 picks up the adapter-only path deferred from MODEL-DIST-001 Epic 2. Resolves the tooling gap documented in epic_2_forensic.md. Workspace prep: - Vendored convert_lora_to_gguf.py from llama.cpp source (master, pinned 2026-05-21) into scripts/vendor/llama_cpp/ with MIT LICENSE attribution and a README documenting refresh policy. brew install llama.cpp ships convert_hf_to_gguf.py but NOT convert_lora_to_gguf.py; vendoring is the cleanest path (vs requiring operators to clone llama.cpp source). - pip install peft==0.19.1 + accelerate==1.13.0 + psutil==7.2.2 into neural/.venv (the same venv that has torch + transformers + gguf from MODEL-DIST-001 Epic 1). PEFT is needed for PEFT-format schema validation + as a dependency of convert_lora_to_gguf.py. - Inspected convert_lora_to_gguf.py — expects directory with adapter_config.json + adapter_model.safetensors in PEFT layout. Confirms the MLX → PEFT direction is `lora_A: (rank, input)` and `lora_B: (output, rank)` (script line 41-42 docstring). Sprint plan in 12-section v1.0 format. 7 epics, 1-2 dev-day estimate. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…e verify Sprint MODEL-DIST-002 Epics 1, 2, 3 (combined commit). Epic 1 — MLX → PEFT converter (scripts/mlx_adapter_to_peft.py + 14 unit tests): Reads adapters/tier1/adapters.safetensors (514 MB MLX format, 560 tensors, Phase 5 SFT Iter 2400 best). Per the analysis in MODEL-DIST-001 epic_2_forensic.md: Key rename: model.layers.<N>.<module>.lora_a -> base_model.model.model.layers.<N>.<module>.lora_A.weight Tensor transpose: lora_a (input,rank) -> (rank,input) lora_b (rank,output) -> (output,rank) Emits PEFT-format adapter_config.json + adapter_model.safetensors. Single-adapter PEFT layout (.lora_A.weight, not .lora_A.default.weight) required by convert_lora_to_gguf.py. Epic 2 — PEFT → GGUF LoRA (scripts/vendor/llama_cpp/convert_lora_to_gguf.py): Pinned to llama.cpp release b9000 (self-contained version; upstream master refactored to a conversion/ Python package with 30+ model files, excessive vendoring scope). README documents refresh policy. Output: .local-models/mdemg-llm-v1-adapter.gguf SHA256: 0cfaf4bae3215a4aea664a8d28ae9a41d73ee740cbcce5c2eef950232cfe1de5 Size: 257 MB (vs ~9 GB fused Q5_K_M; ~35x smaller download) Tensor count: 560 (matches expected 40 layers x 7 target_modules x 2) Epic 3 — Live verification (docs/development/model-dist-002/verification.md): Side-port llama-server on 127.0.0.1:18103 with f16 base + adapter; sanity prompt vs production 8102 fused model returns semantically-aligned outputs on the same prompt — both describe MDEMG as a knowledge-graph memory system. Confirms the MLX-PEFT-GGUF chain is structurally correct. Iteration during Epic 2 (worth noting): - Initial vendored convert_lora_to_gguf.py from upstream master failed with ImportError (refactored to use conversion/ package). Pinned to b9000 release which is self-contained. 
- Initial PEFT keys used .default.weight suffix (multi-adapter layout); convert_lora_to_gguf.py rejected with "Not a lora_A or lora_B tensor." Switched to single-adapter layout (.weight) which the script accepts.

Test results: 14/14 Tier 1 tests green; PEFT output loads via peft.PeftConfig.from_pretrained; GGUF emission completes with all 560 tensors; runtime adapter application produces coherent outputs.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
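The Epic 1 key rename and transpose can be sketched in a few lines. This is an illustrative reconstruction rather than scripts/mlx_adapter_to_peft.py itself: nested lists stand in for real tensors, the function names are hypothetical, and the `self_attn.q_proj` module name in the usage note is just an example of a target module.

```python
import re

# MLX key layout as given in the commit: model.layers.<N>.<module>.lora_a / lora_b
MLX_KEY = re.compile(r"^model\.layers\.(\d+)\.(.+)\.lora_(a|b)$")


def rename_key(mlx_key: str) -> str:
    """Map an MLX LoRA key to the single-adapter PEFT key (.lora_A.weight form)."""
    match = MLX_KEY.match(mlx_key)
    if match is None:
        raise ValueError(f"not an MLX LoRA key: {mlx_key}")
    layer, module, which = match.groups()
    return f"base_model.model.model.layers.{layer}.{module}.lora_{which.upper()}.weight"


def transpose(matrix: list) -> list:
    """(input, rank) -> (rank, input) for lora_a; (rank, output) -> (output, rank) for lora_b."""
    return [list(column) for column in zip(*matrix)]


def convert(mlx_tensors: dict) -> dict:
    """Rename every key and transpose every 2-D tensor."""
    return {rename_key(key): transpose(value) for key, value in mlx_tensors.items()}


# Expected tensor count from the commit: 40 layers x 7 target modules x 2 (A and B).
EXPECTED_TENSORS = 40 * 7 * 2
```

For example, `convert({"model.layers.0.self_attn.q_proj.lora_a": [[1, 2], [3, 4], [5, 6]]})` turns a 3x2 (input, rank) matrix into a 2x3 (rank, input) matrix under the PEFT key. The `.default.weight` multi-adapter suffix is deliberately not produced, matching the layout the converter accepted.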

Authored packaging/ollama/Modelfile.adapter:
  FROM qwen3:14b
  ADAPTER ../../.local-models/mdemg-llm-v1-adapter.gguf
  PARAMETER num_ctx 32768
            num_predict 4096
            stop "<|im_end|>"
            stop "<|im_start|>"
  SYSTEM (Qwen3-14B mdemg fine-tune positioning)
  LICENSE Apache 2.0 (inherits from base)

Local ollama create succeeded:
  reh3376/mdemg-llm-v1-adapter:latest
  Local ID dda290492091
  Layers: qwen3:14b base (a8cc1361...) + adapter blob (0cfaf4ba...) + template + license + params + system

quant_manifest.json adapter block updated:
  status: "deferred to MODEL-DIST-002" -> "local-create done; push pending"
  sha256, size_bytes, ollama_local_id captured
  pipeline field added (MLX -> PEFT -> GGUF LoRA chain)

Push is operator-gated per MODEL-DIST-001 pattern (one-way action). After push, ollama_manifest_digest will be captured and embedded quant_manifest.json will be updated alongside.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Lifts the ErrAdapterDeferred guard from MODEL-DIST-001's deferred adapter path now that reh3376/mdemg-llm-v1-adapter:latest is published. CLI changes: - model_fetcher_ollama.go: removed deferral guard from Fetch; switched readModelBlobDigest to target application/vnd.ollama.image.adapter mediaType for adapter pulls; added destFilename() helper so adapter symlinks land at <name>-adapter.gguf (no quant suffix). - model.go: SHA verify in runModelPull now branches on req.Adapter to look up mf.Adapter when pulling the adapter form; tag printout shows <ns>/<name>-adapter:latest for adapter pulls instead of the resolved fused quant. - model_fetcher.go: ErrAdapterDeferred sentinel retained for future non-Ollama backends that ship fused-only first; not currently returned. QuantManifest gained Adapter *QuantRecord field. Manifest updates (both embedded + canonical): - adapter SHA256 0cfaf4bae3215a4aea664a8d28ae9a41d73ee740cbcce5c2eef950232cfe1de5 - Ollama manifest digest sha256:57b98b97ede0e340e8c530aabf579136616ba670281fe04b14777164e655c278 - ollama_media_type application/vnd.ollama.image.adapter Tests: - Removed TestOllamaFetcher_AdapterDeferred. - Added TestDestFilename_FusedQuantAndAdapter (6 cases). - Added TestOllamaFetcher_ReadAdapterBlobDigest_FiltersOnAdapterMediaType. Tier 3 live e2e: mdemg model pull --adapter completed in 987 ms, SHA verify ok, symlink at ~/.mdemg/models/mdemg-llm-v1-adapter.gguf, and llama-server --lora produced coherent inference against the symlinked adapter ("MDEMG is a knowledge graph memory system..."). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Epic 7 (Documentation Update, never cut).

- docs/features/local-model-distribution.md: adapter section flipped from "deferred to MODEL-DIST-002" to "shipped 2026-05-25"; status header updated; Configurability Contract table adds --adapter flag row.
- CHANGELOG.md: Unreleased gains "Sprint MODEL-DIST-002 — Adapter-only distribution path shipped" entry with full pipeline + verification + SHA + Ollama manifest digest.
- CLAUDE.md Model Distribution architecture note: replaces "adapter-only deferred to MODEL-DIST-002+" with the operator-facing recipe and the pinned-toolchain pointer.
- docs/development/model-dist-002/post.md: sprint close with epic-by-epic outcomes, acceptance criteria check-off, surprise log, and forward-looking notes.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Sprint EVENTGRAPH-001 — Reinforcement-Event TSDB Hypertable + Graph Federation. First implementation of Pattern Y1 from the TypeDB-inspired topology discussion: federate events into TSDB rather than reify them in the Neo4j graph, preserve graph traversal via a Go orchestration layer. 12-section v1.0 format; 8 sequential epics; ~1.5-2 dev-days; $0 LLM; low-medium risk (touches the Hebbian hot write path so the new writer must be fully non-blocking + the Cypher RETURN-shape change must be backwards-compatible at the Go call site). Targets ApplyCoactivation only for v1. Other Hebbian entry points (ApplySymbolCoactivation, CoactivateSession, ApplyNegativeFeedback) deferred to EVENTGRAPH-003 once the pattern proves out under production traffic. Pattern Y2 (link-node promotion in Neo4j) explicitly deferred until a query proves federation-in-Go insufficient. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…c 1) One row per Hebbian co-activation pair update. Captures prev/new weight (plus signed delta), evidence_count_after, eta_effective, surprise_factor, activation_product, path_sim, role/obs_type of both endpoints, session_id, direction (forward/reverse/bidirectional), and a created_new_edge flag that distinguishes "new connection formed" from "existing connection strengthened" at analysis time. trigger_path column will distinguish ApplyCoactivation from EVENTGRAPH-003's other Hebbian entry points. 7-day chunks (same as V0017-V0021). 4 indexes: per-space time-series, src+time, dst+time, partial index on (space_id, session_id, time) where session_id is set. Federation API (Epic 5) needs src + dst lookups for the graph-neighborhood join. Buffered + flushed via CopyFrom on TSDB_FLUSH_INTERVAL_SEC cadence (default 30s). Pattern matches V0019 (sparse_gate_metrics) buffered writer, NOT V0021 (model_install_events) sync writer — Hebbian writes are per-retrieve, far higher volume than CLI-driven model install events. Config: TSDB_REQUIRED_SCHEMA_VERSION default bumped 21 -> 22. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

internal/tsdb/reinforcement_writer.go — buffered CopyFrom writer mirroring the V0019 SparseGateMetricsWriter pattern. 30s auto-flush ticker, Close() drains buffer + flushes final batch, idempotent across multiple Close calls. FIFO eviction on buffer-full matches the LLMInteractionWriter precedent; eviction counted in droppedRows for Epic 6 Prometheus surfacing. ReinforcementEventRow serializes optional float / string fields via nullableFloat / nullableString helpers — zero-valued inputs land as DB NULL rather than 0 / '', so analytic queries can distinguish "no data" from "actually zero." Required fields (prev/new/delta weight, evidence_count_after, created_new_edge, trigger_path) are never nullable. Tier 1 unit tests (9 green): - Record + Flush writes all rows with correct table + column shape. - Empty buffer Flush is a no-op (no CopyFrom call). - Buffer-full evicts oldest, increments droppedRows counter. - Unlimited buffer (maxBufferSize=0) never drops. - Nullable serialization: zero-valued optionals → DB NULL. - Flush error increments FailureCount; SuccessCount/TotalRows unchanged. - Close drains buffer (final flush triggered). - Close is idempotent (Close × 2 does not double-flush). - Auto-flush ticker fires within deadline. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
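The writer semantics listed in the tests (FIFO eviction, unlimited buffer at 0, no-op empty flush, idempotent Close, nullable optionals) can be modeled compactly. This is a single-threaded Python sketch of the behavior, not the Go writer: the class and attribute names are hypothetical, the ticker and locking are omitted, and discarding a batch on flush failure is an assumption the commit message does not confirm.

```python
from collections import deque
from typing import Callable


def nullable(value):
    """Zero-valued optionals (0, 0.0, '') become None, which maps to DB NULL."""
    return value if value else None


class BufferedWriter:
    def __init__(self, copy_from: Callable[[list], None], max_buffer_size: int = 1000):
        self._copy_from = copy_from      # stand-in for the CopyFrom call
        self._max = max_buffer_size      # 0 means unlimited
        self._buf = deque()
        self._closed = False
        self.dropped_rows = 0
        self.success_count = 0
        self.failure_count = 0

    def record(self, row) -> None:
        """Enqueue a row without blocking; evict the oldest row when full."""
        if self._max and len(self._buf) >= self._max:
            self._buf.popleft()
            self.dropped_rows += 1
        self._buf.append(row)

    def flush(self) -> None:
        """Write the buffered rows as one batch; an empty buffer is a no-op."""
        if not self._buf:
            return
        batch = list(self._buf)
        self._buf.clear()
        try:
            self._copy_from(batch)
            self.success_count += 1
        except Exception:
            # Assumption: a failed batch is counted and dropped, not retried.
            self.failure_count += 1

    def close(self) -> None:
        """Drain the buffer once; later calls do nothing."""
        if self._closed:
            return
        self._closed = True
        self.flush()
```

A periodic timer calling `flush()` every 30 seconds would complete the picture; it is left out here to keep the sketch deterministic.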

…ENTGRAPH-001 Epic 3)

ApplyCoactivation Cypher RETURN clause extended from "count(*) AS updated" to 17 per-pair columns: src/dst node IDs, prev/new/delta weight, evidence_count_after, eta_effective (cfg.LearningEta × etaMult), surprise_factor, activation_product, path_sim, role_a/b, obs_type_a/b, session_id, direction (forward/reverse/bidirectional), created_new_edge.

created_new_edge derived from (r.evidence_count = 1): the ON CREATE branch sets evidence_count to 1; ON MATCH increments. Reliable proxy for "new connection formed" vs "existing connection strengthened" at analysis time.

Plan-deviation disclosure (per feedback_plan_options_pattern.md): the plan called for 2 rows per pair in asymmetric mode (forward + reverse). The Cypher mirrors rr.weight = r.weight at all times, so forward and reverse edges carry identical weights. Emitting 2 rows would double-count without adding signal. Final choice: 1 row per logical pair regardless of mode, with the direction column carrying the forward/reverse/bidirectional distinction. Revisit if EVENTGRAPH-003 introduces a Hebbian path where forward/reverse weights diverge.

New helper internal/learning/reinforcement_parser.go translates a neo4j.Record (or any (key) → (any, bool) getter) into a tsdb.ReinforcementEventRow. Lives in its own file so service.go doesn't grow. Defensive against missing keys (zero values), nil values (zero/empty), wrong types (fallback to zero); no panics.

Tier 1 unit tests (6 green) cover:
- Symmetric bidirectional + ON CREATE branch
- Asymmetric forward + ON MATCH branch (evidence > 1)
- Missing optional fields → zero values (nullable* writer helpers serialize as DB NULL)
- Neo4j int64 → Go int coercion
- nil values → zero/empty
- Wrong-typed values → graceful fallback

Reinforcement rows are captured locally in ApplyCoactivation but not yet forwarded to TSDB; Epic 4 wires the writer.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
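The defensive-getter idea behind the parser can be shown in miniature. The sketch below is Python rather than the Go helper, covers only a handful of the 17 columns, and uses assumed column names; it also recomputes `created_new_edge` from the evidence count purely for illustration, whereas the commit derives it inside the Cypher.

```python
def parse_row(get) -> dict:
    """Build a row from a getter with the shape get(key) -> (value, found).

    Missing keys, None values, and wrong types all fall back to zero values.
    """
    def as_float(key: str) -> float:
        value, found = get(key)
        if found and isinstance(value, (int, float)) and not isinstance(value, bool):
            return float(value)
        return 0.0

    def as_int(key: str) -> int:
        value, found = get(key)
        if found and isinstance(value, int) and not isinstance(value, bool):
            return value
        return 0

    def as_str(key: str) -> str:
        value, found = get(key)
        return value if found and isinstance(value, str) else ""

    evidence = as_int("evidence_count_after")
    return {
        "prev_weight": as_float("prev_weight"),
        "new_weight": as_float("new_weight"),
        "delta_weight": as_float("delta_weight"),
        "evidence_count_after": evidence,
        "direction": as_str("direction"),
        # evidence_count == 1 means the edge was just created (ON CREATE branch).
        "created_new_edge": evidence == 1,
    }
```

A plain dict can serve as the record in tests via `lambda key: (record.get(key), key in record)`, which mirrors the "(key) → (any, bool) getter" shape mentioned above.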

…pic 4)

learning.Service grows a reinforcementWriter field + SetReinforcementWriter setter (mirrors the SetStabilityReinforcer back-compat pattern). After ExecuteWrite returns from ApplyCoactivation, each captured per-pair row gets the spaceID stamped on it and is enqueued via writer.Record. The writer is non-blocking; the Hebbian hot path never waits on TSDB.

Configurability Contract: 7 new env vars (no-hardcoding rule):
- EVENTGRAPH_ENABLED (bool, default true)
- EVENTGRAPH_WRITER_FLUSH_INTERVAL_SEC (int, default 30, floor 5)
- EVENTGRAPH_WRITER_BUFFER_SIZE (int, default 1000, 0 = unlimited)
- EVENTGRAPH_MAX_PAIRS_PER_EVENT_BATCH (int, default 200)
- EVENTGRAPH_MAX_EVENTS_PER_QUERY (int, default 500, Epic 5 ceiling)
- EVENTGRAPH_FEDERATION_DEFAULT_HOPS (int, default 2)
- EVENTGRAPH_FEDERATION_DEFAULT_LOOKBACK_HOURS (int, default 24)

api/server.go wires the writer's lifecycle:
- Constructed after TSDB client is ready, gated by cfg.EventGraphEnabled so EVENTGRAPH_ENABLED=false cleanly skips construction; learner's reinforcementWriter stays nil and the Hebbian path short-circuits.
- Closed alongside the other TSDB writers in graceful-shutdown, so the buffer drains before the process exits.

Tier 2 integration tests (against real TSDB, build tag integration):
- TestEventGraph_Writer_RoundTrip: 3 rows recorded → flush-window elapses → SELECT count(*) returns 3.
- TestEventGraph_Writer_DrainOnClose: 5 rows recorded with 1-hour flush interval → Close() drains → SELECT returns 5 (verifies the server shutdown invariant).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
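The seven knobs and their defaults can be captured in a small loader. This Python sketch only mirrors the table above; the helper names are hypothetical, and both the clamp-to-floor behavior and the fallback to defaults on unparsable values are assumptions about how the Go config layer treats bad input.

```python
import os


def env_int(environ, name: str, default: int, floor=None) -> int:
    """Read an int, falling back to the default when unset or unparsable."""
    try:
        value = int(environ.get(name, ""))
    except ValueError:
        value = default
    if floor is not None and value < floor:
        value = floor  # assumed: values under the floor are clamped, not rejected
    return value


def env_bool(environ, name: str, default: bool) -> bool:
    raw = environ.get(name, "").strip().lower()
    if raw in ("1", "true", "yes", "on"):
        return True
    if raw in ("0", "false", "no", "off"):
        return False
    return default


def load_eventgraph_config(environ=os.environ) -> dict:
    """Defaults and floor taken from the Configurability Contract list."""
    return {
        "enabled": env_bool(environ, "EVENTGRAPH_ENABLED", True),
        "flush_interval_sec": env_int(environ, "EVENTGRAPH_WRITER_FLUSH_INTERVAL_SEC", 30, floor=5),
        "buffer_size": env_int(environ, "EVENTGRAPH_WRITER_BUFFER_SIZE", 1000),
        "max_pairs_per_event_batch": env_int(environ, "EVENTGRAPH_MAX_PAIRS_PER_EVENT_BATCH", 200),
        "max_events_per_query": env_int(environ, "EVENTGRAPH_MAX_EVENTS_PER_QUERY", 500),
        "federation_default_hops": env_int(environ, "EVENTGRAPH_FEDERATION_DEFAULT_HOPS", 2),
        "federation_default_lookback_hours": env_int(
            environ, "EVENTGRAPH_FEDERATION_DEFAULT_LOOKBACK_HOURS", 24),
    }
```

Passing a plain dict instead of `os.environ` keeps the loader testable without touching the process environment.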

…001 Epic 5)

internal/eventgraph/query.go: Pattern Y1 federation helper. EventsInGraphNeighborhood orchestrates two queries followed by a Go-side join:
1. Cypher graph walk from a seed node: variable-length path over CO_ACTIVATED_WITH | GENERALIZES at depth 0..Hops. Returns the N-hop neighborhood (DISTINCT node_ids, includes the seed).
2. TSDB query against reinforcement_events for events where src OR dst is in the neighborhood, within the lookback window, ordered newest-first, capped at the configured limit.
3. Go-side join: annotates events with SrcInNeighborhood / DstInNeighborhood so the consumer can distinguish "both endpoints in the subgraph" from "one endpoint outside the seed's N-hop reach but the event still touches our subgraph."

Empty neighborhood (no seed match, hops=0) short-circuits before the TSDB call. Sub-1-second Since values clamp to 1s. Hops < 0 is rejected upfront. The handler enforces an additional ceiling of 2 × EVENTGRAPH_FEDERATION_DEFAULT_HOPS for runaway-walk protection.

internal/api/eventgraph_handler.go: POST /v1/eventgraph/reinforcement-neighborhood. Same auth convention as /v1/admin/breakers. 503 when EVENTGRAPH_ENABLED=false or when eventgraphService is nil (TSDB-down at boot). 400 on missing space_id / seed_node_id / negative hops / hops > ceiling. Defaults applied from config when fields omitted from request.

Plan-decision disclosure (per feedback_plan_options_pattern.md): plan proposed Option A (single endpoint with event_type query param) vs Option B (endpoint per event class). Final choice: A. v1 has one event class (reinforcement); the endpoint URL is explicit about that. EVENTGRAPH-002 can either add a query param or split the URL when a second event class arrives, with no breaking change either way.

Tests:
- Tier 1 (internal/eventgraph/query_test.go, 7 green): request validation rejects empty space_id, empty seed, negative hops; interval formatting roundtrips; join annotation handles both-inside, one-outside, and empty-neighborhood cases.
- Tier 1 (internal/api/eventgraph_handler_test.go, 4 green + 2 skipped): method-not-allowed, feature-disabled 503, nil-service 503, invalid-JSON short-circuit. Two validation paths skipped: they require a non-nil eventgraphService which can't be constructed without a real driver; Tier 2 exercises them.
- Tier 2 (tests/integration/eventgraph_federation_test.go, 1 green): builds seed--mid--leaf graph + off-node, emits 3 reinforcement events touching all four nodes, calls federation at hops=0 and hops=1, asserts neighborhood + in-neighborhood flags. The hops=0 test confirms that mid↔leaf (touching neither seed nor any 0-hop neighbor) is correctly excluded.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
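The join annotation and the hop ceiling can be expressed in a few lines. This is an in-memory Python sketch under stated assumptions: the real helper runs the src-OR-dst filter inside the TSDB query and is written in Go, the function names and event dict shape are hypothetical, and the three events in the usage note are an invented stand-in for the Tier 2 fixture.

```python
def validate_hops(hops: int, default_hops: int = 2) -> int:
    """Reject negative hops and anything above 2 x the configured default."""
    ceiling = 2 * default_hops
    if hops < 0:
        raise ValueError("hops must be >= 0")
    if hops > ceiling:
        raise ValueError(f"hops must be <= {ceiling}")
    return hops


def annotate(events: list, neighborhood: set) -> list:
    """Keep events touching the neighborhood and flag which endpoints are inside it."""
    hood = set(neighborhood)
    if not hood:
        return []  # empty neighborhood short-circuits before any event lookup
    annotated = []
    for event in events:
        src_in = event["src"] in hood
        dst_in = event["dst"] in hood
        if src_in or dst_in:
            annotated.append({
                **event,
                "src_in_neighborhood": src_in,
                "dst_in_neighborhood": dst_in,
            })
    return annotated
```

With events seed→mid, mid→leaf and leaf→off, a hops=0 neighborhood of `{"seed"}` keeps only seed→mid (source inside, destination outside), while a hops=1 neighborhood of `{"seed", "mid"}` also admits mid→leaf with its destination flagged as outside.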

…ement events (EVENTGRAPH-001 Epic 6) Three new Prometheus counters mirror the V0022 writer's internal atomic counters: - mdemg_eventgraph_writer_rows_enqueued_total — rows successfully CopyFrom'd - mdemg_eventgraph_writer_rows_dropped_total — rows FIFO-evicted (buffer full) - mdemg_eventgraph_writer_flush_failure_total — flush errors Wiring: the writer accepts a narrow PrometheusCounter interface (Add(int64)) so internal/tsdb does not import internal/metrics (which would cycle). api/server.go calls SetPrometheusCounters after the writer is constructed, passing the three counters from the global StandardMetrics struct. Nil-safe. Dashboard: mdemg-graph-topology.json gains a new collapsed row "Reinforcement Events (EVENTGRAPH-001)" with a single time-series panel "Reinforcement Event Rate (events/min)" showing all three rates (enqueued / dropped / flush failures) over the last 24h. Dropped is colored orange, flush failures red, enqueued the default palette. Tied to the prometheus datasource. The existing GRAFANA-AUDIT-001 harness (scripts/grafana_panel_audit.py) only evaluates SQL-target panels — the new panel uses Prometheus queries, so it lands on the SKIP pile, same as the other 8 Cypher / Prometheus panels on this dashboard. Audit JSON refreshed. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

Epic 6's targeted audit run (scripts/grafana_panel_audit.py --dashboard mdemg-graph-topology.json) overwrote the full multi-dashboard audit results from GRAFANA-AUDIT-001 with the single-dashboard subset (9 SKIPs only). Restoring the full snapshot from commit 0a1e8e1 — that audit covered all 8 dashboards and is the canonical baseline the GRAFANA-AUDIT-001 post.md references. EVENTGRAPH-001 did not need to regenerate it; the new panel uses Prometheus queries, which the audit harness SKIPs regardless of dashboard. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…fix-commit) ScoreAndRankRRF's ConsensusResult → RetrieveResult conversion was silently dropping the Activation field. The legacy ScoreAndRank path at scoring.go:883 sets Activation: a (where a := act[c.NodeID] is the spreading-activation map value). The RRF path constructed models.RetrieveResult{...} with no Activation key, leaving the field zero-valued. Net effect: since Phase 13.1 default-on (2026-05-03), learning.Service.ApplyCoactivation has filtered out every L0 candidate on the retrieve hot path. The filter is r.Activation >= LearningMinActivation (default 0.20). With Activation=0, no pair makes it to the Hebbian Cypher; the function returns nil without writing. Hebbian learning has been silently no-op on the production retrieve goroutine for ~24 days. CO_ACTIVATED_WITH edges still exist in the graph — sidecar paths (CoactivateSession, ApplySymbolCoactivation, consolidation walks) and pre-Phase-13.1 retrieves wrote them — but the retrieve-time goroutine has been a silent no-op. Discovered during EVENTGRAPH-001 Epic 7 live e2e. Three retrieves produced 0 rows in reinforcement_events. Investigation traced the gap to the missing Activation field. Fix: one-line addition in scoring_rrf.go — Activation: act[c.NodeID]. Brings the RRF path to parity with the legacy scorer. Post-fix verification: rebuilt, restarted server, re-issued 3 retrieves → 10 reinforcement events landed in TSDB. Federation API at hops=1 correctly returned all 10 with src_in_neighborhood=true, dst_in_neighborhood=true. Documented in docs/development/eventgraph-001/verification.md. Per CLAUDE.md "Testing — Live System Testing Is Required": "surprise bugs caught during live smoke get their own follow-up fix-commit — do not silently roll them into the sprint commit." This is the precedent-aligned separate commit. Forward-only: existing graph state is preserved; new retrieves now correctly emit Hebbian updates. EVENTGRAPH-002 may revisit whether to backfill the missing 24-day window. 
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
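The failure mode is worth seeing in miniature: a field left at its zero value silently fails a threshold filter. The Python model below is a sketch of that mechanism only; the dataclass and function names are hypothetical stand-ins for the Go types, and the 0.20 threshold is the LearningMinActivation default quoted in the commit.

```python
from dataclasses import dataclass

LEARNING_MIN_ACTIVATION = 0.20  # default quoted in the commit message


@dataclass
class RetrieveResult:
    node_id: str
    score: float
    activation: float = 0.0  # zero value when the constructor omits the field


def to_results_buggy(candidates: list, act: dict) -> list:
    """RRF conversion before the fix: activation is never set."""
    return [RetrieveResult(node_id, score) for node_id, score in candidates]


def to_results_fixed(candidates: list, act: dict) -> list:
    """After the fix: activation is copied from the spreading-activation map."""
    return [
        RetrieveResult(node_id, score, activation=act.get(node_id, 0.0))
        for node_id, score in candidates
    ]


def hebbian_candidates(results: list, min_activation: float = LEARNING_MIN_ACTIVATION) -> list:
    """The filter applied before any co-activation pair is written."""
    return [r for r in results if r.activation >= min_activation]
```

With the buggy conversion every result fails the filter, so no pairs reach the Hebbian write and no error is raised, which matches the silent no-op described in the fix-commit.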

Real /v1/memory/retrieve × 3 against mdemg-dev → 10 reinforcement events landed in TSDB within the flush window. Federation API at hops=1 from a seed node returned 5-node neighborhood + 10 in-neighborhood events. Documents the surprise-bug discovery + fix that preceded this transcript (see fix-commit for scoring_rrf.go::ScoreAndRankRRF Activation propagation). Acceptance criteria from sprint plan §"Acceptance Criteria" all PASS. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…ose (Epic 8)

Final epic: Documentation Update (never cut, per feedback_per_feature_docs_required.md and the standardized v1.0 sprint plan format).

New: docs/features/event-graph-federation.md (~240 lines, Why / Choices / How it works / How to use / Forward-looking). Documents:
- Pattern Y1 vs Y2 trade-off (why federation-in-Go now, link-node reification deferred until a query forces it)
- Why V0019 buffered-CopyFrom over V0021 sync-INSERT (per-retrieve volume)
- Why ApplyCoactivation first (other 3 Hebbian entry points deferred to EVENTGRAPH-003)
- Why forward-only (no source to backfill from)
- Federation pipeline (Cypher walk → TSDB query → Go-side join with src/dst_in_neighborhood annotation)
- TSDB schema, API request/response shape, 7 env vars + defaults
- Observability (3 Prometheus counters + Grafana panel)
- Forward-looking sprints

New: docs/development/eventgraph-001/post.md: epic-by-epic outcomes, acceptance criteria check-off, surprise log (RRF Activation drop + audit-JSON overwrite + orphan-process port collision), plan deviations disclosed (1-row-per-pair regardless of asymmetric mode; single-endpoint over endpoint-per-class), forward-looking.

CHANGELOG.md Unreleased gains the EVENTGRAPH-001 entry: 11 bullet points covering V0022 migration, buffered writer, Cypher RETURN-shape change, Configurability Contract, federation helper + API, Prometheus + Grafana, Tier 2 + Tier 3 verification, the surprise-bug RRF Activation fix-commit, and the audit-JSON restore.

CLAUDE.md Architecture Notes gains a new "Event Graph Federation" entry above the Model Distribution section. Documents the pattern, surface, deferrals, and the load-bearing fix-commit f307f55 that surfaced 24 days of silent Hebbian no-op on the retrieve hot path.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

…Prometheus datasource

The Epic 6 panel used datasource {type: prometheus, uid: prometheus}, but this Grafana instance has no Prometheus datasource configured: mdemg exposes counters as JSON via /v1/metrics/snapshot, not a /metrics scrape endpoint. The only configured datasources are mdemg-nodegraph, neo4j, and timescaledb, so the panel rendered "No data" in the live Grafana.

The rewritten panel queries the reinforcement_events hypertable directly via the timescaledb postgres datasource. Two targets:

1. count(*) over 1-minute time_buckets → overall events/min
2. count(*) FILTER (WHERE created_new_edge) vs WHERE NOT created_new_edge → split between new connections formed and existing connections strengthened (the operational dimension the analytic queries actually need)

Both targets are templated on $space_id (an existing dashboard variable).

The Prometheus counters (mdemg_eventgraph_writer_rows_{enqueued,dropped,flush_failure}_total) remain wired and incrementing; they surface via /v1/metrics/snapshot for ops scripts. The Grafana panel now displays data instead of relying on a scrape path that doesn't exist in this deployment.

Discovered during post-merge live verification (2026-05-29).

Verified fix: after reloading the dashboard via the Grafana API, /api/ds/query against the same SQL returns 1-minute buckets matching the direct TSDB count. The audit harness now reports 2 PASS for the new panel (previously SKIP, no SQL target). verification.md updated with the post-merge transcript.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>

# Conflicts:
#   deploy/docker/grafana/dashboards/mdemg-graph-topology.json
#   docs/development/eventgraph-001/verification.md

rhenley1958 and others added 30 commits May 6, 2026 12:04

Merge remote-tracking branch 'origin/main' into reh3376_dev01

e738208

Merge remote-tracking branch 'origin/main' into reh3376_dev01

eb81291

Merge remote-tracking branch 'origin/main' into reh3376_dev01

f37add8

Merge remote-tracking branch 'origin/main' into reh3376_dev01

c895316

Merge remote-tracking branch 'origin/main' into reh3376_dev01

1a3238c

rhenley1958 and others added 18 commits May 25, 2026 16:32

Merge branch 'main' into reh3376_dev01

e6e40f3

Merge remote-tracking branch 'origin/main' into reh3376_dev01

55b976f

# Conflicts:
#   deploy/docker/grafana/dashboards/mdemg-graph-topology.json
#   docs/development/eventgraph-001/verification.md

Merge remote-tracking branch 'origin/main' into reh3376_dev01

67a7169

reh3376 self-assigned this May 30, 2026

reh3376 approved these changes May 30, 2026

reh3376 merged commit 3809a33 into main May 30, 2026
6 checks passed

reh3376 merged 48 commits into main from reh3376_dev01

2 participants
