[Dream Cycle 2026-06-09] swarm: RL orchestration 5-decision gap (no stopping-RL in any framework) + ruview-integration,ruvector-integration scan #2332

## Tonight's Rotation

| Field | Value |
|-------|-------|
| **SLOT** | 4 |
| **DEEP surface** | swarm |
| **SCAN surfaces** | ruview-integration, ruvector-integration |
| **Session commit** | `cc8830d798152e9ee6647db11eaaf014759ac2ff` |
| **Date** | 2026-06-09 |

---

## Drift Check

- **Prior dream-cycle issues (last 7):** #2324 (meta, 2026-06-08), #2316 (memory, 2026-06-08), #2309 (intelligence, 2026-06-07), #2303 (security, 2026-06-06), #2294 (performance, 2026-06-05), #2289 (swarm, 2026-06-04), #2277 (memory, 2026-06-03)
- **Swarm surface DEEP count:** 2 prior DEEP=swarm issues (#2289 on 2026-06-04, #2223 on 2026-05-29) — below the ≥3 repetition threshold. No substitution needed.
- **⚠️ needs-merge TRIGGERED:** `needs-merge` label added tonight. Meta-issue #2324 confirms 0 dream-cycle PRs merged in 14 nights (earliest open: #2149 from 2026-05-26), which meets the 14-night threshold.
- **⚠️ ADR-147 collision:** 6 open PRs (#2278, #2290, #2295, #2304, #2310, #2317) all claim ADR-147; none merged. See meta-issue #2324. Tonight's ADR uses **153** (147 + 6 in-flight) to avoid further pollution. Human review should renumber 147–153 once PRs land.
- **Self-score of last night's gist (#2316, memory — multi-signal retrieval gap):**
  - Grade A benchmark (Mem0 94.4% LongMemEval, arXiv:2603.19935): ✅ 2 pts
  - ≥4 competitor rows: ✅ 2 pts
  - Specific actions (wire fts5.ts + entity-tagger, add RRF fusion): ✅ 2 pts
  - Witness present: ✅ 2 pts
  - <1500 words: ✅ 1 pt
  - Novel finding (FTS5 implemented but not wired in): ✅ 1 pt
  - **Score: 10/10**
- **Three-night running score:** 10/10, 10/10, 10/10. Note: per meta-issue #2324 this rubric measures shape, not truth (merge rate = 0/14). No narrow-surface trigger by the rules, but the rubric signal is known-weak.

---

## Deep Dive Findings — Swarm SOTA 2026

### SOTA Summary

| Finding | Source | Confidence |
|---------|--------|------------|
| RL for orchestration traces: 5 sub-decisions (spawn, delegate, communicate, aggregate, **stop**); **no RL method exists for the stopping decision** across all surveyed frameworks | arXiv:2605.02801, May 2026, reproducible artifact (84 papers + JSON schema) | **A** |
| Reward design decomposes into 8 families (parallelism speedup, split correctness, aggregation quality, …); credit signal spans token-to-team | arXiv:2605.02801 | **A** |
| SGTO-MAS: security-aware adaptive swarm selection via Gorilla Troops Optimization; consensus 0.8764, risk 0.3000 | arXiv:2606.07940, Jun 2026 | **B** — single source |
| SPIN: tensor-network factorization reduces O(n^m) → O(m·n·χ²); targets resource-constrained edge swarms | arXiv:2606.07557, Jun 2026 | **C** — single source, no cross-check |
| Framework latency (2K-instance independent test): LangGraph fastest across all 5 tasks; CrewAI 30-60% faster than AutoGen on simple tasks | Independent 2026 comparison (tensoria.fr) | **B** |

### Gap vs Current Ruflo

| Orchestration Decision | Ruflo v3.6.10 | SOTA (arXiv:2605.02801) | Gap |
|-----------------------|--------------|------------------------|-----|
| When to spawn | Hard-coded `maxAgents=8` | RL-trainable | Missing RL |
| Who to delegate | 3-tier model routing (rules) | RL-trainable | Missing RL |
| How to communicate | SendMessage protocol | RL-trainable | Missing RL |
| How to aggregate | Task orchestrator collects | RL-trainable | Missing RL |
| **When to stop** | **Hard-coded completion check** | **No RL method exists anywhere** | **Open frontier** |
| Orchestration trace recording | `hooks post-task` → ReasoningBank | Replayable JSON schema | Missing replay schema |
| RL training on orchestration data | SONA (agent-level only) | Policy learning over traces | Gap |

SONA learns agent-level behavior (0.0043ms/adapt). No component in Ruflo learns the *orchestration* decisions (spawn/delegate/communicate/aggregate/stop). Learning those would sit one layer above SONA.

### Recommended Action

**ADR-153 filed** (see PR) for gated RuVector backend adoption. The higher-priority recommendation, however, is the swarm trace work:

1. Adopt the replayable JSON schema from arXiv:2605.02801 in `v3/@claude-flow/hooks/src/orchestration-trace.ts` (~120 LOC). Records all 5 sub-decisions per swarm run.
2. Emit `StoppingDecisionTrace` events from `unified-coordinator.ts` completion handler (~80 LOC). This creates the training dataset for the open stopping-RL gap.
3. Together, these two steps would make Ruflo the only framework among those surveyed here with a replayable orchestration trace store, which is a prerequisite for any future RL policy.

No existing ADR (142–146) covers orchestration-level RL. **No ADR filed for items 1–2** — implementation-level, ~200 LOC, no architectural decision required.
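
A minimal sketch of what items 1 and 2 could look like, assuming a simple append-only log serialized as JSON Lines. The type and class names (`DecisionTrace`, `OrchestrationTraceLog`) and every field name are hypothetical; they are not taken from the arXiv:2605.02801 schema or from the Ruflo codebase, and the real schema should be adopted in their place.

```typescript
// Hypothetical trace types. Field names are illustrative; they are NOT taken
// from the arXiv:2605.02801 schema or from the Ruflo codebase.
type SubDecision = "spawn" | "delegate" | "communicate" | "aggregate" | "stop";

interface DecisionTrace {
  runId: string;
  step: number; // position of the event within the swarm run
  decision: SubDecision;
  timestampMs: number;
  context: Record<string, unknown>; // what the orchestrator observed
  outcome: Record<string, unknown>; // what it decided
}

class OrchestrationTraceLog {
  private readonly events: DecisionTrace[] = [];

  // Appends an event and assigns its step index.
  record(event: Omit<DecisionTrace, "step">): DecisionTrace {
    const stored: DecisionTrace = { ...event, step: this.events.length };
    this.events.push(stored);
    return stored;
  }

  // Stop decisions only: the subset a stopping policy would train on.
  stoppingDecisions(): DecisionTrace[] {
    return this.events.filter((e) => e.decision === "stop");
  }

  // One JSON object per line, so a run can be replayed event by event.
  toJsonLines(): string {
    return this.events.map((e) => JSON.stringify(e)).join("\n");
  }

  static fromJsonLines(text: string): DecisionTrace[] {
    return text
      .split("\n")
      .filter((line) => line.trim() !== "")
      .map((line) => JSON.parse(line) as DecisionTrace);
  }
}

// Usage: record a spawn and a stop decision, then replay from the serialized form.
const log = new OrchestrationTraceLog();
log.record({
  runId: "run-1",
  decision: "spawn",
  timestampMs: 0,
  context: { pendingTasks: 3 },
  outcome: { spawned: 2 },
});
log.record({
  runId: "run-1",
  decision: "stop",
  timestampMs: 10,
  context: { pendingTasks: 0 },
  outcome: { stopped: true, reason: "all tasks complete" },
});
const replayed = OrchestrationTraceLog.fromJsonLines(log.toJsonLines());
```

Keeping the stop events in the same log as the other four sub-decisions means a later stopping policy can be trained against the full run context rather than the completion check alone.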

---

## Scan Findings — ruview-integration

- **Source:** github.com/ruvnet/RuView; github.com/ruvnet/RuView/blob/main/docs/adr/ADR-069-cognitum-seed-csi-pipeline.md
- **Competitive signal:** WiFi-CSI vital monitoring (breathing rate 6-30 BPM, heart rate 40-120 BPM, 17-keypoint pose) via 65 WASM edge modules. Swarm node roles: sensor → coordinator → gateway with epoch-based delta replication between peers.
- **Ruflo position:** `@claude-flow/plugin-iot-cognitum` already exists in the repo and bridges Cognitum Seed devices as Ruflo agents. RuView has its own ADR-069 (cognitum-seed-csi-pipeline) for feeding CSI into Cognitum hardware.
- **Finding:** The IoT plugin bridge already exists. The gap is that RuView's CSI data plane is not wired into AgentDB as time-series memory entries — swarm agents cannot currently react to physical presence/vital-sign changes from RuView. This is implementation-level (no ADR needed): add a `ruview-csi` ingestion worker in `plugin-iot-cognitum/src/workers/` that converts CSI frames to AgentDB memory entries tagged with `source: ruview-csi`. **(B — GitHub repo crosschecked)**
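
The proposed `ruview-csi` worker could be sketched roughly as follows. This report specifies neither RuView's CSI frame format nor AgentDB's write API, so `CsiFrame`, `MemoryEntry`, `MemoryStore`, and the in-memory store below are hypothetical stand-ins. Only the `source: ruview-csi` tag comes from the finding above.

```typescript
// Hypothetical shapes: neither RuView's CSI frame format nor AgentDB's
// write API is specified in this report.
interface CsiFrame {
  deviceId: string;
  capturedAtMs: number;
  presence: boolean;
  breathingBpm?: number;
  heartBpm?: number;
}

interface MemoryEntry {
  key: string;
  timestampMs: number;
  tags: Record<string, string>;
  value: Record<string, unknown>;
}

// Stand-in for whatever write interface AgentDB actually exposes.
interface MemoryStore {
  put(entry: MemoryEntry): void;
}

function csiFrameToMemoryEntry(frame: CsiFrame): MemoryEntry {
  return {
    key: `ruview-csi/${frame.deviceId}/${frame.capturedAtMs}`,
    timestampMs: frame.capturedAtMs,
    tags: { source: "ruview-csi", device: frame.deviceId },
    value: {
      presence: frame.presence,
      breathingBpm: frame.breathingBpm ?? null,
      heartBpm: frame.heartBpm ?? null,
    },
  };
}

// Writes frames in capture order so the store receives a time series.
function ingestCsiFrames(frames: CsiFrame[], store: MemoryStore): number {
  const ordered = [...frames].sort((a, b) => a.capturedAtMs - b.capturedAtMs);
  for (const frame of ordered) {
    store.put(csiFrameToMemoryEntry(frame));
  }
  return ordered.length;
}

// Usage with an in-memory store (for illustration only).
const written: MemoryEntry[] = [];
const memoryStore: MemoryStore = { put: (entry) => { written.push(entry); } };
const ingested = ingestCsiFrames(
  [
    { deviceId: "seed-1", capturedAtMs: 2000, presence: true, heartBpm: 72 },
    { deviceId: "seed-1", capturedAtMs: 1000, presence: false },
  ],
  memoryStore,
);
```

Tagging each entry at write time lets swarm agents filter on `source` without knowing anything about the CSI format itself.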

---

## Scan Findings — ruvector-integration

- **Source:** gist.github.com/ruvnet/f9b631bae8303cb114bd7bf3a8e39217; dasroot.net/posts/2026/04/vector-databases-rag-qdrant-milvus-weaviate-comparison-2026/
- **Competitive signal (2026):** Qdrant <100ms p99 at 100M vectors 95% recall (Grade B); Milvus 100K+ QPS at 1B vectors (Grade A — vendor docs); Weaviate ~150ms at 500M vectors 92% recall (Grade B). All three tested at 100M+ vectors.
- **RuVector vendor claim:** 50K QPS (1-thread), 100K QPS (8-thread) at 1M × 128D; p99 <5ms; AgenticDB-compatible API (Grade C — vendor gist only).
- **Finding:** Ruflo's AgentDB has been benchmarked only at 5K–20K vectors, while competitors publish results at 100M+. Its behavior above 20K is unknown. **ADR-153 filed**: adds `ruvector` as an optional, feature-flagged backend; any change of default is gated on an independent benchmark at N ≥ 100K. Do not flip the default until `scripts/benchmark-intelligence.mjs` validates the Grade-C claim. **(C: vendor gist, no peer-reviewed benchmark)**
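
The gating rule in the finding above can be sketched as two separate checks: the feature flag decides what a given deployment uses, while the default may only be reconsidered after an independent benchmark at N ≥ 100K passes. All names here (`BenchmarkResult`, `canFlipDefault`, `resolveBackend`) are hypothetical and do not describe ADR-153's actual interface or the output format of `scripts/benchmark-intelligence.mjs`.

```typescript
// Hypothetical gate; names and shapes are illustrative, not ADR-153's API.
type VectorBackend = "agentdb" | "ruvector";

interface BenchmarkResult {
  backend: VectorBackend;
  vectorCount: number;
  independent: boolean; // false for vendor-published numbers
  passed: boolean;
}

const MIN_VALIDATED_VECTORS = 100_000;

// True only when an independent, passing ruvector run exists at N >= 100K.
// A true result permits a human to change the default; it does not change it.
function canFlipDefault(results: BenchmarkResult[]): boolean {
  return results.some(
    (r) =>
      r.backend === "ruvector" &&
      r.independent &&
      r.passed &&
      r.vectorCount >= MIN_VALIDATED_VECTORS,
  );
}

// The feature flag is an explicit opt-in; without it the default stays agentdb.
function resolveBackend(ruvectorFlag: boolean): VectorBackend {
  return ruvectorFlag ? "ruvector" : "agentdb";
}

// A vendor-only claim at 1M vectors does not open the gate...
const vendorOnly: BenchmarkResult[] = [
  { backend: "ruvector", vectorCount: 1_000_000, independent: false, passed: true },
];
// ...but an independent passing run at 100K does.
const independentRun: BenchmarkResult[] = [
  { backend: "ruvector", vectorCount: 100_000, independent: true, passed: true },
];
const gateWithVendorClaim = canFlipDefault(vendorOnly);
const gateWithIndependentRun = canFlipDefault(independentRun);
```

Separating the flag from the default keeps the Grade-C claim from influencing anyone who has not opted in.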

---

## Competitors Reviewed

| Framework | Stopping Decision | Orchestration RL | Scale Tested | Notable 2026 Change |
|-----------|------------------|-----------------|-------------|---------------------|
| **Ruflo (claude-flow 3.6.10)** | Hard-coded completion check | SONA (agent-level, not orchestration) | ~8 agents | Federation hub, comms-first |
| **LangGraph v0.4** | Graph terminal node | None | Production | Fastest latency in 2K-task benchmark (B) |
| **AutoGen AG2 1.0** | `ConversationTerminated` signal | None | Production | Event-driven rearchitecture |
| **CrewAI 0.105** | `max_iter` + task_output check | None | Production | 30-60% faster than AutoGen simple tasks (B) |
| **OpenAI Agents SDK** | Loop exit when no handoff | None | Production | Sandbox + approval callbacks |

**No framework surveyed implements RL-trained stopping.** This is the open frontier identified by arXiv:2605.02801.

---

## Gist Link

Gist not published: `gh gist create` is unavailable in this remote environment. The report is committed to branch `dream/2026-06-09-swarm` at `v3/docs/dream/dream-gist-2026-06-09.md`.

---

## Witness

| Field | Value |
|-------|-------|
| **Session commit** | `cc8830d798152e9ee6647db11eaaf014759ac2ff` |
| **Report SHA-256** | `e0dbcf415b98968c823c10fbd9c64094603c1d23868db126f06546cd1269f1b7` |
| **Witness stamp** | `d82a838fcd3f2cfce1bd8cd94d729508e666f4418394b1d8f74f7333ec147526` |

Verifier: fetch raw `v3/docs/dream/dream-gist-2026-06-09.md` from branch → `sha256sum` → concat `cc8830d798152e9ee6647db11eaaf014759ac2ff` → `sha256sum` → must equal witness stamp.
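
The verifier steps above can be sketched with Node's built-in `crypto` module. This assumes the stamp is the SHA-256 of the report's lowercase hex digest concatenated directly with the session commit, with no separator. The real `sha256sum` CLI also prints a filename and a trailing newline, so if the stamp was produced by piping its raw output, the concatenation here would need adjusting. The function names are illustrative.

```typescript
import { createHash } from "node:crypto";

// Lowercase hex SHA-256 of a string or byte array.
function sha256Hex(data: string | Uint8Array): string {
  return createHash("sha256").update(data).digest("hex");
}

// Assumption: stamp = sha256(hexDigest(report) + sessionCommit), no separator.
function computeWitnessStamp(reportBytes: Uint8Array, sessionCommit: string): string {
  const reportDigest = sha256Hex(reportBytes);
  return sha256Hex(reportDigest + sessionCommit);
}

function verifyWitness(
  reportBytes: Uint8Array,
  sessionCommit: string,
  expectedStamp: string,
): boolean {
  return computeWitnessStamp(reportBytes, sessionCommit) === expectedStamp.toLowerCase();
}

// Usage with placeholder report bytes; substitute the fetched raw file contents.
const sampleReport = new TextEncoder().encode("placeholder report body");
const sampleCommit = "cc8830d798152e9ee6647db11eaaf014759ac2ff";
const sampleStamp = computeWitnessStamp(sampleReport, sampleCommit);
const sampleVerified = verifyWitness(sampleReport, sampleCommit, sampleStamp);
```

To check the real report, read the raw file into a byte array and pass it with the session commit and the witness stamp from the table above.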

**ADR filed:** ADR-153 (RuVector production-scale backend). Branch: `dream/2026-06-09-swarm`.
