h5p: Add Vitest test harness (63 tests) by rjcorwin · Pull Request #21 · rjcorwin/cook

rjcorwin · 2026-03-18T01:46:11Z

Full RPI cycle complete. Infrastructure is solid and non-invasive. Coverage is honest v1 — the harness exists and adding tests is now friction-free, but the original goal of full SPEC.md surface isn't there yet.

Status

63 tests passing across 5 files. npm test runs in ~300ms with no real LLM calls, no Docker.

✓ src/parser.test.ts     (16 tests)
✓ src/loop.test.ts       (17 tests)
✓ src/executor.test.ts   (10 tests)
✓ src/race.test.ts       (15 tests)
✓ src/template.test.ts   (5 tests)

npm run build passes with no TypeScript errors.

What changed

src/executor.ts — ExecutionContext gains one optional field:

poolFactory?: (worktreePath: string, config: CookConfig, runAgents: AgentName[]) => RunnerPool

Seven createRunnerPool(...) calls replaced with (ctx.poolFactory ?? createRunnerPool)(...). parseRalphVerdict exported. In production nothing changes — the field is absent and behavior is identical.

src/testing/test-runner.ts — TestRunner (queue-driven AgentRunner), makeTestPool, testPoolFactory helpers.

vitest.config.ts + "test": "vitest run" in package.json.

Five test files covering: all AST node types and flags (parser), parseGateVerdict table tests and full agentLoop loop (loop), parseRalphVerdict direct + executor integration with mocked Ink (executor), parseJudgeVerdict/buildJudgePrompt/sessionId (race), renderTemplate/loadCookMD (template).

Architecture

graph TD
    A[ExecutionContext.poolFactory?] -->|set in tests| B[TestRunner — queue-driven mock]
    A -->|absent in production| C[createRunnerPool → NativeRunner / Docker / BareRunner]
    B --> D[responses queue controls LLM output in tests]
    E[execute / agentLoop] --> A

Key decisions

poolFactory on ExecutionContext over vi.mock or a fake CLI binary — explicit, typed, propagates through all execution paths including composition branches.
vi.mock('ink', ...) in executor.test.ts — every path touching a work or review node calls Ink's render(); mocked to no-op to prevent TTY noise or hangs.
Composition integration tests deferred — full worktree tests need temp git repo + coordinated mock sequences; pure function coverage of resolver logic is in race.test.ts.

Bug surfaced

parseRalphVerdict / APPROVE inconsistency. parseGateVerdict explicitly handles APPROVE as a DONE keyword. parseRalphVerdict does not — RALPH_DONE_KEYWORDS is ['DONE', 'COMPLETE', 'FINISHED']. Passing 'APPROVE' to parseRalphVerdict returns 'DONE' via the fail-safe default (no keyword matched), not explicit handling. The behaviour is correct today but the two functions are inconsistent. If logic is ever reordered, APPROVE through ralph breaks silently. Tracked in TODO.md.

Coverage gaps (not blocking merge, but honest accounting)

Parser: complex compositions untested. v3, compare, vs with review branches, nested operators (work x3 review v2 pick). The parser handles all of this — none of it is tested.
agentLoop with iteratePrompt — when i > 1 && config.iteratePrompt, the loop uses the 'iterate' step instead of 'work'. Code path not exercised.
agentLoop error handling — when runner.runAgent throws, the loop catches and returns { verdict: 'ITERATE' }. Not tested.
executeReview with compound inner — the branch where inner.type !== 'work' (e.g. repeat-then-review) is untested. Only the simple inner.type === 'work' path is covered.
Composition entirely absent — by design (deferred), but was in the original requirement.

Deferred (see TODO.md)

Composition integration tests (git worktree + poolFactory, infrastructure ready)
Resolver integration tests (resolvePick/resolveMerge/resolveCompare)
UI tests (App.tsx, RaceApp.tsx, LogStream.tsx)
loopEvents singleton refactor to injected EventEmitter

Testing instructions

npm install
npm test       # 63 passed (63), ~300ms
npm run build  # no TypeScript errors

🤖 Generated with Claude Code

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- Add poolFactory? to ExecutionContext for TestRunner injection - Export parseRalphVerdict for direct unit testing - Replace 7 createRunnerPool call sites with (ctx.poolFactory ?? createRunnerPool) - Add src/testing/test-runner.ts with TestRunner, makeTestPool, testPoolFactory - Add vitest.config.ts and "test": "vitest run" script - Write parser, loop, executor, race, template test files Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

rjcorwin and others added 4 commits March 17, 2026 21:45

h5p: Add test harness research

82cae96

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

h5p: Add TODO items from test harness run

ea334f6

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

h5p: Add devlog-002 for RPI orchestration session

67baf2f

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

rjcorwin changed the title ~~h5p: Add test harness~~ h5p: Add Vitest test harness (63 tests) Mar 18, 2026

oleksiimazurenko mentioned this pull request May 15, 2026

Add agentArgs config field for pass-through CLI flags #44

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

h5p: Add Vitest test harness (63 tests)#21

h5p: Add Vitest test harness (63 tests)#21
rjcorwin wants to merge 4 commits into
mainfrom
add-test-harness

rjcorwin commented Mar 18, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

rjcorwin commented Mar 18, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Status

What changed

Architecture

Key decisions

Bug surfaced

Coverage gaps (not blocking merge, but honest accounting)

Deferred (see TODO.md)

Testing instructions

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

rjcorwin commented Mar 18, 2026 •

edited

Loading