Tags: jahwag/clem

v0.18.1

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
fix(runner): keep _backend assignment inside the python MCP-config block (#220)

## Bug

The generated `clem-runner.sh` placed the backend assignment on a
**bash** line directly above the inline Python MCP-config script:

```sh
_backend = '{{.CoordinationBackend}}'
python3 -c "
import json, os
...
if _backend != 'github' and os.environ.get('DISCORD_TOKEN'):
```

So the shell executed `_backend = 'github'` as a command invocation rather than an assignment, and both bash and Python failed:

```
/home/<user>/.local/bin/clem-runner.sh: line 36: _backend: command not found
Traceback (most recent call last):
  File "<string>", line 12, in <module>
NameError: name '_backend' is not defined
```

The Python block (which gates Discord/Slack MCP registration) never had
`_backend` defined. On the **github** backend it's non-fatal — no
coordination MCP is needed — but it errors on every iteration and aborts
the `.mcp.json` write.

## Fix

Move `_backend = '{{.CoordinationBackend}}'` **inside** the `python3 -c
"` script, right after `import json, os`. Applied to both runner
templates (claude-code and opencode), which had the identical layout.

## Test

`TestGenerate_BackendAssignedInsidePython` (table-driven over both
runtimes) asserts the assignment renders after `python3 -c "` and not on
a standalone bash line. Verified it **fails on the pre-fix code** and
passes after. `go fmt` / `go vet` / `go test ./...` all clean.
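
The ordering check behind that test can be sketched as follows. This is an illustrative Python sketch, not the project's Go test: the helper name `backend_assigned_inside_python` and the two sample scripts are assumptions made up for the example.

```python
def backend_assigned_inside_python(script: str) -> bool:
    """True if the first `_backend =` line comes after the `python3 -c "` opener."""
    opener = assign = None
    for i, line in enumerate(script.splitlines()):
        stripped = line.strip()
        if opener is None and stripped.startswith('python3 -c "'):
            opener = i
        if assign is None and stripped.startswith("_backend ="):
            assign = i
    return opener is not None and assign is not None and assign > opener


# Hypothetical renderings of the runner template for the github backend.
PRE_FIX = """\
_backend = 'github'
python3 -c "
import json, os
print(_backend)
"
"""

POST_FIX = """\
python3 -c "
import json, os
_backend = 'github'
print(_backend)
"
"""
```

On `PRE_FIX` the helper returns `False` because the assignment sits on a bash line; on `POST_FIX` it returns `True`.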

v0.18.0

feat(init): have agents maintain their own open PRs (#219)

## What

The generated agent contract (`clem init` → `CLAUDE.shared.md`) ends the
task lifecycle at **"open a PR"**. Nothing brings an agent back to a PR
it already opened, so a PR that later:

- becomes **unmergeable** because its base branch moved on (conflicts),
- has **failing CI checks**, or
- receives **operator review feedback / change requests**

is never revisited — it's delivered work that silently never lands, and
the operator is left to babysit stale PRs.

## Change

Add a **"Your open PRs"** section to both shared templates (Discord and
GitHub backends). Each iteration, before claiming new work, the agent
lists its own open PRs (`gh pr list --author @me --state open`) and
keeps them mergeable:

- **Conflicts** → rebase onto the latest base, resolve, push.
- **Red checks** → fix the cause and push to the same branch.
- **Review feedback** → address it, **but only from trusted operators**
(consistent with the existing Trust section; all other review content
stays data, not instructions).

Merging remains the operator's job (unchanged Security rule).
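
The per-PR triage described above could look roughly like this. It is a sketch under stated assumptions, not the template's wording. It assumes the input dicts come from something like `gh pr list --author @me --state open --json number,mergeable,statusCheckRollup,reviews`, and the field names and values used here (`CONFLICTING`, `FAILURE`, `CHANGES_REQUESTED`) should be checked against the actual `gh` output. `TRUSTED_OPERATORS` and the action names are hypothetical.

```python
# Hypothetical trust list; in clem the trusted operators come from the Trust section.
TRUSTED_OPERATORS = {"operator-login"}


def maintenance_actions(pr: dict) -> list:
    """Map one open PR (parsed JSON) to the maintenance steps it needs."""
    actions = []
    # Base branch moved on and the PR no longer merges cleanly.
    if pr.get("mergeable") == "CONFLICTING":
        actions.append("rebase")
    # Any failed check run or commit status counts as a red check.
    checks = pr.get("statusCheckRollup") or []
    if any(c.get("conclusion") == "FAILURE" or c.get("state") == "FAILURE" for c in checks):
        actions.append("fix-checks")
    # Only change requests from trusted operators are treated as instructions.
    for review in pr.get("reviews") or []:
        login = (review.get("author") or {}).get("login")
        if review.get("state") == "CHANGES_REQUESTED" and login in TRUSTED_OPERATORS:
            actions.append("address-review")
            break
    return actions
```

Note that nothing in the sketch merges a PR; it only decides what to push to the existing branch.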

## Notes

- No CLI/flag or `clem.yaml` schema changes — template content only.
- Both backends covered; `{{coordination.github_repo}}` is substituted
at generation time as usual.
- Added `TestInitTemplateContainsOpenPRMaintenance` (covers both
backends). `go fmt`, `go vet`, and `go test ./...` all clean.

v0.17.0

feat(coordination): add GitHub Issues as coordination backend (#173)

## Summary

Adds `github` as a third `coordination.backend` option alongside Discord
(default) and Slack. Engineering fleets can coordinate tasks with native
GitHub Issues primitives — labels, assignees, comments, and PR linkage —
instead of reconstructing state from chat messages.

This does not replace Discord or Slack. It completes the existing
swappable-backend abstraction with an **issue-first** mode for teams
whose work naturally ends in pull requests.

## Why add GitHub Issues as a coordination backend?

`clem` already models coordination as a swappable backend selected
through `coordination.backend`. Discord and Slack are good defaults for
conversational fleets, but they are not the only useful coordination
surface.

For engineering-focused fleets, the work naturally starts and ends in
GitHub:

```text
task → claim → implementation → progress updates → PR → review → merge
```

GitHub Issues already provides the native primitives required to
represent this workflow:

| clem concept | GitHub primitive |
|--------------|------------------|
| Task queue | Open issues with a configured label |
| Task state | `clem:todo`, `clem:in-progress`, `clem:done`, `clem:blocked` labels |
| Claim | Self-assignment (`gh issue edit N --add-assignee @me`) |
| Progress updates | Issue comments |
| Delivery | Pull request with `Closes #N` |
| Alerts | Comments on a configured alerts issue |
| Lessons and post-mortems | Comments on a configured lessons issue |

The GitHub backend is useful when users want:

1. **A durable task queue.** Each task has a persistent object with
explicit state, assignee, history, and linked delivery.
2. **Traceability from task to code.** A task can be followed from issue
creation to claim, implementation, PR review, and merge.
3. **Lower operational overhead.** Teams that already use GitHub do not
need an additional chat platform or coordination MCP server.
4. **Human-in-the-loop governance.** Humans can inspect, reprioritize,
block, or unblock tasks using familiar GitHub workflows.
5. **Asynchronous operation.** Coordination does not depend on
reconstructing state from a stream of chat messages.

### Field validation (pre-merge integration run)

The backend was exercised against a real shared repository in a
pre-merge integration run — not mocks, not unit tests alone.

A provisioned fleet autonomously processed six issues and produced six
pull requests:

| Evidence observed | Result |
|-------------------|--------|
| Tasks coordinated | 6 issues |
| Claims | 6 issues with exactly 1 assignee each |
| Double-claim race | None observed |
| Deliveries | 6 PRs |
| Task → delivery link | 6 PRs with correct `Closes #N` |
| Mergeability | 6 PRs reported as `MERGEABLE` |
| CI | Green; docs-only jobs correctly skipped by path filters |
| Operational memory | 24 comments on the lessons issue |
| Alerts and post-mortems | Recorded as comments on dedicated issues |

The run also surfaced meaningful engineering findings rather than only
generating code: divergence between design sketches and merged code,
shallow acceptance criteria hiding blocking defects, a script with no
effective entrypoint, and CI behaviour requiring human attention.

This should not be described as a production test: no generated PR was
merged and no production workload was executed. It is a real pre-merge
integration validation of the coordination loop.

**Claim semantics:** the pilot did not observe any double claims; each task
ended with exactly one assignee. The protocol is adequate for
cooperative coordination between agents, but should not be described as
a formal mutual-exclusion guarantee. Stronger arbitration can be a
follow-up.

**Identity note:** the pilot used a single GitHub identity for all
events. Per-task and per-PR audit worked, but per-agent attribution
appeared only in comment content. The clem model already supports
per-agent Linux users, git identity, and PR authorship; separate tokens
per agent would improve this dimension and are the recommended
production configuration.

## What changed

### Coordination backend (`internal/coordination`)

- Register `github` in `Known()` with `AlertTemplate` posting to
`api.github.com/repos/{repo}/issues/{n}/comments` via `GITHUB_TOKEN`.
- New `RenderAlert()` + `AlertParams` — unified alert rendering for
Discord, Slack, and GitHub (watchdog and runner now share this path).
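
For the GitHub case, the rendered alert amounts to posting an issue comment. The sketch below builds that request in Python for illustration only: the function name `github_alert_request` is hypothetical, and clem itself renders a `curl` command from a Go template rather than calling Python.

```python
import json
import urllib.request


def github_alert_request(repo: str, issue: int, message: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST that adds an alert comment to an issue."""
    url = f"https://api.github.com/repos/{repo}/issues/{issue}/comments"
    return urllib.request.Request(
        url,
        data=json.dumps({"body": message}).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {token}",  # value of GITHUB_TOKEN
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the result to `urllib.request.urlopen` would send the comment. Encoding the message with `json.dumps` keeps quotes and newlines in the alert text from breaking the payload.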

### Configuration (`internal/config`)

- `coordination.github_repo` (`owner/name`, required when `backend:
github`).
- Validation for GitHub channels: `tasks` = label (e.g. `clem:todo`);
`alerts` / `lessons` = issue numbers.
- Helpers: `UsesGitHubCoordination()`, `GitHubWatchServiceName()`,
`BackendOrDefault()`.
- `api.github.com` added to default egress allowlist.
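
The validation rules above can be sketched as a small checker. This is an illustration in Python, not the Go code in `internal/config`: the dict layout (`backend`, `github_repo`, `channels`), the repo regex, and the error strings are all assumptions.

```python
import re

# Assumed shape of an owner/name slug; the real check may be stricter or looser.
REPO_RE = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def validate_github_coordination(coord: dict) -> list:
    """Return a list of problems; empty means the section looks valid."""
    if coord.get("backend") != "github":
        return []  # rules only apply to the github backend
    errors = []
    if not REPO_RE.match(coord.get("github_repo") or ""):
        errors.append("github_repo must be owner/name")
    channels = coord.get("channels") or {}
    # tasks is a label name such as clem:todo
    if not str(channels.get("tasks") or "").strip():
        errors.append("channels.tasks must be a label")
    # alerts and lessons are issue numbers
    for key in ("alerts", "lessons"):
        value = str(channels.get(key, ""))
        if not (value.isascii() and value.isdigit() and int(value) > 0):
            errors.append(f"channels.{key} must be an issue number")
    return errors
```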

### Issue watcher sidecar (`internal/githubwatch`, new)

- `clem provision` writes `~/.local/bin/clem-github-watch.sh` per agent.
- Polls `GET /repos/{repo}/issues?labels=…&state=open` every 60s with
**conditional requests** (`ETag` / `If-None-Match`) per [GitHub REST API best practices](https://docs.github.com/rest/guides/best-practices-for-using-the-rest-api).
- Detects new unassigned issues and wakes the tmux session (`tmux
send-keys`).
- Installs `clem-github-watch-{project}-{agent}.service` with
`JoinsNamespaceOf` the agent unit.
- Respects egress containment (loopback proxy export + `IPAddressDeny`
when enabled).

**Why polling, not webhooks?** Deliberately simple and compatible with
clem's self-hosted model: no public endpoint, no webhook receiver
service, no extra infrastructure. Webhooks may be a future option for
installations that already have inbound HTTP infrastructure.
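
One polling step with a conditional request might look like the sketch below. It is written in Python for illustration, while the real watcher is a generated shell script. The function names, the `seen` set, and the injectable `opener` parameter are assumptions. The sketch relies on `urllib` raising `HTTPError` for a `304 Not Modified` response.

```python
import json
import urllib.error
import urllib.request
from urllib.parse import quote


def new_unassigned(issues: list, seen: set) -> list:
    """Numbers of open, unassigned issues not reported before (PRs are skipped)."""
    fresh = [
        i["number"]
        for i in issues
        if not i.get("assignees") and "pull_request" not in i and i["number"] not in seen
    ]
    seen.update(fresh)
    return fresh


def poll_once(repo, label, token, etag, seen, opener=urllib.request.urlopen):
    """Return (new issue numbers, etag to send on the next poll)."""
    url = f"https://api.github.com/repos/{repo}/issues?labels={quote(label)}&state=open"
    headers = {"Authorization": f"Bearer {token}", "Accept": "application/vnd.github+json"}
    if etag:
        headers["If-None-Match"] = etag  # ask GitHub to answer 304 if nothing changed
    try:
        with opener(urllib.request.Request(url, headers=headers)) as resp:
            return new_unassigned(json.load(resp), seen), resp.headers.get("ETag")
    except urllib.error.HTTPError as err:
        if err.code == 304:
            return [], etag  # unchanged since last poll; keep the old ETag
        raise
```

A caller would loop over `poll_once` with a 60 second sleep and wake the agent's tmux session whenever the returned list is non-empty.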

### Runner (`internal/runner`)

- Skips Discord/Slack MCP registration when `backend: github` (agents
use `gh` CLI).
- Agent unit gains `Wants=clem-github-watch-…` when GitHub coordination
is active.
- Alert curl rendered via `coordination.RenderAlert()`.

### Provision / init

- `cmd/provision`: installs watcher script + systemd unit when
`UsesGitHubCoordination()`.
- `clem init --backend github`: scaffolds `clem.yaml` and
`CLAUDE.shared.md` with GitHub task-board semantics and claim protocol.

### Watchdog (`internal/watchdog`)

- `send_alert` uses `coordination.RenderAlert()` with repo + issue
number for GitHub backends.

### Agent docs (`internal/agentdoc`)

- `{{coordination.github_repo}}` placeholder for templates.

### Samples and docs

- `samples/github-tasks/` — reference `clem.yaml` and setup guide.
- README: coordination backends table, GitHub coordination section,
updated `clem.yaml` reference.
- `docs/index.html`: multi-backend copy updated.

### CI

- New `github-coordination` e2e job: provisions with `backend: github`,
asserts watcher script syntax, API polling, systemd wiring.

## Design scope (intentionally narrow)

- Opt-in via `coordination.backend: github`; Discord and Slack
unchanged.
- Reuses existing `GITHUB_TOKEN` and standard `gh` CLI — no coordination
MCP.
- Lightweight polling watcher; preserves egress-containment model.
- Quality gates and closed-loop verification are **out of scope** for
this PR (separate future work).

## Test plan

- [x] `go test ./...` — 238 tests pass
- [x] `go vet ./...` — clean
- [x] `go fmt ./...` — no diff
- [x] Unit tests: `coordination`, `config`, `runner`, `watchdog`,
`agentdoc`, `githubwatch`
- [x] e2e job `github-coordination` in `.github/workflows/e2e.yml`
- [x] Pre-merge fleet integration (6 issues → 6 PRs, described above)
- [ ] After merge: operator smoke test with `clem init --backend github`
→ `clem provision` → verify watcher service active and agent wakes on
new `clem:todo` issue

## Notes

- GitHub coordination is **not** chat emulation. Alerts and lessons use
issue comments because they are durable and auditable; free-form
conversation remains better suited to Discord or Slack.
- Optimistic `clem:done` labels with pending concerns observed in the
pilot reflect a **quality-policy** gap, not a coordination-transport
failure. Quality gates are a separate concern from task state, wake-up,
and traceability.
- Recommended follow-ups: per-agent GitHub tokens for attribution,
stronger claim arbitration, optional webhook mode for high-throughput
installations.

v0.16.0

feat(watchdog): daily transcript prune + night-aware stale threshold (#211)

- prune_transcripts: session JSONLs + UUID sidecar dirs older than 30d
  deleted once daily (observed ~1.5 GB/agent-month unbounded growth; a
  production host hit 88% disk). memory/ and other non-UUID dirs at the
  same depth are deliberately not matched.
- stale threshold now derives from max(iteration, iteration_night):
  sizing on the day value made every healthy 30m night sleep look stale
  and would have hard-restarted agents all night.
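
Both rules can be sketched in a few lines. This is an illustrative Python sketch, not the watchdog's shell code: the function names, the `factor` multiplier, and the `(name, is_dir, mtime)` entry shape are assumptions. Only the 30-day cutoff, the UUID-directory match, and the `max(iteration, iteration_night)` basis come from the commit message.

```python
import re

UUID_RE = re.compile(r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$")
MAX_AGE_SECONDS = 30 * 24 * 3600  # 30 days


def stale_threshold(iteration_s: int, iteration_night_s: int, factor: int = 2) -> int:
    """Size the threshold on the longer sleep so a healthy night sleep is not stale.

    `factor` is a hypothetical safety multiplier, not a value from the source.
    """
    return factor * max(iteration_s, iteration_night_s)


def prune_candidates(entries, now: float) -> list:
    """entries: iterable of (name, is_dir, mtime). Returns the names to delete."""
    doomed = []
    for name, is_dir, mtime in entries:
        if now - mtime <= MAX_AGE_SECONDS:
            continue  # still within the retention window
        if is_dir and UUID_RE.match(name):
            doomed.append(name)  # UUID sidecar directory
        elif not is_dir and name.endswith(".jsonl"):
            doomed.append(name)  # session transcript
        # memory/ and other non-UUID directories fall through untouched
    return doomed
```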

v0.15.1

feat(watchdog): daily transcript prune + night-aware stale threshold (#211)

- prune_transcripts: session JSONLs + UUID sidecar dirs older than 30d
  deleted once daily (observed ~1.5 GB/agent-month unbounded growth; a
  production host hit 88% disk). memory/ and other non-UUID dirs at the
  same depth are deliberately not matched.
- stale threshold now derives from max(iteration, iteration_night):
  sizing on the day value made every healthy 30m night sleep look stale
  and would have hard-restarted agents all night.

v0.15.0

feat(runner): iteration_night, next-effort handshake, runner warnings, quota snapshot (#210)

Four runner/config features driven by a production cache+quota audit
(2026-06-13, consultant.dev team host):

- iteration_night: separate night-hours (22-07) sleep. The hardcoded
  night doubler was removed when the prompt-cache TTL was believed to be
  5 min; subscription Claude Code actually gets the 1h TTL refreshed on
  access (verified from session-log usage fields), so night intervals up
  to ~45m still start warm. Default: match iteration.
- next-effort handshake: agent writes low|medium|high|xhigh to
  ~/.claude/next-effort; runner validates, exports session-scoped
  CLAUDE_CODE_EFFORT_LEVEL for the next launch, deletes the file. No
  reset bookkeeping, no drift.
- runner warnings: sync-skills failures and <1h-to-expiry OAuth tokens
  are prepended to the injected prompt so the agent itself escalates.
  Also fixes sync-skills failure detection: 'sync | tee || log' tested
  tee's exit status (no pipefail), so failures never logged -- a dirty
  clone silently blocked one production agent's skill sync for 3 weeks.
  Now uses PIPESTATUS[0].
- quota snapshot: runner refreshes ~/.claude/quota.json from the OAuth
  usage endpoint at most every 25m; agents read the file instead of
  polling per-iteration (which 429s with multiple agents per host).
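
The next-effort handshake from the list above can be sketched like this. It is an illustration in Python, while the runner itself is a generated shell script. The function name and the `env` dict are hypothetical, and deleting the file even when its content is invalid is an assumption.

```python
from pathlib import Path
from typing import Optional

VALID_EFFORTS = {"low", "medium", "high", "xhigh"}


def consume_next_effort(path: Path, env: dict) -> Optional[str]:
    """Read the agent's requested effort, apply it for the next launch, remove the file."""
    try:
        value = path.read_text().strip()
    except FileNotFoundError:
        return None  # agent made no request; keep the default effort
    path.unlink()  # one-shot: the request never carries over to later launches
    if value not in VALID_EFFORTS:
        return None  # ignore anything outside the allowed set
    env["CLAUDE_CODE_EFFORT_LEVEL"] = value
    return value
```

Because the file is consumed on every read, there is no stored state to reset and nothing that can drift between iterations.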

claude-code runtime gets all four; opencode gets the warnings prepend
plus the sync-skills fix (effort/quota are Claude-specific).
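
The sync-skills detection bug is easy to reproduce. The sketch below drives `bash` from Python and assumes `bash` is on `PATH`. `false` stands in for a failing sync, and the `sync failed` message is hypothetical.

```python
import subprocess


def bash_stdout(snippet: str) -> str:
    """Run a snippet under bash and return its trimmed stdout."""
    result = subprocess.run(["bash", "-c", snippet], capture_output=True, text=True)
    return result.stdout.strip()


# Old shape: `||` sees tee's exit status (0), so the failure branch never runs.
OLD = 'false | tee /dev/null || echo "sync failed"'

# Fixed shape: inspect the first command of the pipeline explicitly.
NEW = 'false | tee /dev/null; [ "${PIPESTATUS[0]}" -ne 0 ] && echo "sync failed"'
```

Running `bash_stdout(OLD)` yields an empty string, while `bash_stdout(NEW)` yields `sync failed`.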

v0.14.0

Partially verified

This commit was created on GitHub.com and signed with GitHub’s verified signature.
We cannot verify signatures from co-authors, and some of the co-authors attributed to this commit require their commits to be signed.
feat(skills): team skills repo sync (provision seed + per-iteration refresh) (#205)

Rebases the skills feature (059e5bf + 822c997, previously only on the
v0.10.0-snapshot.1 channel) onto current main, restoring per-provision
and per-iteration team-skills sync that went dormant after the box moved
to mainline v0.13.0. Closes #204.

## What
- Top-level skills_repo config key: clem provision clones the repo per
agent and symlinks shared/<skill> and <agentKey>/<skill> into
~/.claude/skills/; idempotent re-runs git pull --ff-only, stale symlinks
pruned.
- clem sync-skills subcommand + runner hook: skills refresh at the top
of every iteration, no operator round-trip after a skills PR merges.
- clem update --snapshot flag: opt-in prerelease channel (goreleaser
prerelease: auto keeps snapshot tags off stable hosts).
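
The symlink step of the sync can be sketched as follows. This is a Python illustration of the idea, not clem's Go implementation. The function name is hypothetical, and two behaviours are assumptions: an agent-specific skill wins over a shared skill of the same name, and only symlinks that point into the skills clone are pruned.

```python
import os
from pathlib import Path


def link_skills(repo: Path, agent_key: str, skills_dir: Path) -> list:
    """Symlink shared/<skill> and <agent_key>/<skill> into skills_dir; prune stale links."""
    skills_dir.mkdir(parents=True, exist_ok=True)
    wanted = {}
    for group in ("shared", agent_key):  # later group overrides earlier (assumption)
        base = repo / group
        if base.is_dir():
            for skill in sorted(p for p in base.iterdir() if p.is_dir()):
                wanted[skill.name] = skill
    # Remove links into the clone that are no longer wanted or point elsewhere.
    for entry in list(skills_dir.iterdir()):
        if not entry.is_symlink():
            continue
        target = Path(os.readlink(entry))
        if repo in target.parents and wanted.get(entry.name) != target:
            entry.unlink()
    # Create whatever is missing; existing correct links are left alone (idempotent).
    for name, target in wanted.items():
        link = skills_dir / name
        if not link.is_symlink() and not link.exists():
            link.symlink_to(target, target_is_directory=True)
    return sorted(wanted)
```

Running it again after a `git pull --ff-only` in the clone picks up added and removed skills without operator involvement.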

## Rebase conflict resolutions (vs v0.13.0-era main)
- config.go: SkillsRepo registered as a real struct field, so it passes
the new strict unknown-key validation; isPlausibleGitURL check runs in
Load().
- IsValidExtensionName moved to extensions.go next to extensionNameRe
(file was split since the original commits).
- update.go: kept main's exact-name selectBinaryAsset (#201) and
test-overridable URL vars; added Prerelease/Draft fields +
allReleasesURL for the snapshot channel.
- runner.go/provision.go: skills hooks re-inserted into the refactored
provisionAgent / Params paths alongside ProxyExport/SidecarServers.

## Verification
- go build ./... clean, gofmt clean, go vet clean
- go test ./... all packages ok, including restored skills tests:
TestLoad_SkillsRepoAccepted/Rejected,
TestGenerate_SkillsSyncInjectedWhenRepoSet/AbsentWhenRepoUnset,
SyncSkillsRepo manager tests
- Pre-push secret-scan flag is a false positive: neither commit diff
contains a GH_TOKEN read (grep of both diffs is empty; provision.go:46
is pre-existing main code), pushed with CLEM_HOOK_SKIP_CODE_SCAN=1

Release plan per jahwag: merge, then tag v0.14.0 (new feature = minor
bump rather than v0.13.1).

---------

Co-authored-by: jahwag <540380+jahwag@users.noreply.github.com>

v0.13.0

fix(agent): make pre-push unicode-trap scan locale-independent (#203)

Fable audit

---------

Signed-off-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>

v0.12.2

fix(config): reject control characters in agent name/role at Load() (#198)

Closes #124.

## Problem

`AgentConfig.Name` and `AgentConfig.Role` are free-form strings from
`clem.yaml` with no validation at `Load()`. `Name` is interpolated into
systemd unit `Description=` lines via `serviceTemplate` and
`ttydServiceTemplate` in `internal/runner/runner.go`. systemd unit files
are newline-delimited, so a name containing a literal newline terminates
the `Description=` directive and injects arbitrary subsequent directives
— including a second `[Service]` section with a crafted `ExecStart` that
systemd merges and runs at service start.

## Fix

Reject all ASCII control characters (`[\x00-\x1f\x7f]`) in `name` and
`role` at `Load()`, following the same pattern as the `git_email`
validation (#183). Spaces remain legal — display names like "Lead
Software Engineer" are the common case in every sample config.

Scope notes:
- systemd splits unit files only on ASCII newline, so unicode separators
(U+2028, NEL) are not line breaks in that sink — the ASCII control-char
class matches the sink parser (verified empirically).
- Shell metacharacter escaping in the runner bash templates is
deliberately out of scope; that's #112.
- The `ac.Name` JSON-injection vector in the alert message is #115, also
untouched here.
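
A minimal sketch of the check, in Python rather than the project's Go, using the same character class quoted above. The function names and the error text are hypothetical.

```python
import re

# ASCII control characters: 0x00-0x1f plus DEL (0x7f). Space (0x20) stays legal.
CONTROL_CHARS = re.compile(r"[\x00-\x1f\x7f]")


def has_control_chars(value: str) -> bool:
    return CONTROL_CHARS.search(value) is not None


def validate_agent_field(field: str, value: str) -> str:
    """Reject values that could break out of a single-line sink such as Description=."""
    if has_control_chars(value):
        raise ValueError(f"agent {field} must not contain control characters")
    return value
```

A name such as `"x\n[Service]\nExecStart=/bin/true"` is rejected, `"Lead Software Engineer"` passes, and a name containing U+2028 also passes, consistent with the scope note that only ASCII control characters matter for this sink.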

## Testing

- New `TestLoad_AgentNameRoleRejectControlCharacters` covers newline /
CR / tab / \x01 / \x7f across both fields (fixtures pass through the
Go-string → YAML double-quoted-scalar decode chain, so real control
bytes reach `Load()`).
- New `TestLoad_AgentNameRoleAllowSpaces` pins that ordinary multi-word
names/roles still load.
- Full `go test ./...` passes.

Adversarial review ran before this PR: independent reviewers confirmed
the validation sits on the only path to the unit-file templates (single
`Load()` call site, no post-Load mutation of `Name`/`Role` anywhere),
confirmed no existing sample/doc/test config would be rejected, and
verified the regex class against the systemd sink parser. Review caught
an initially-incomplete sink list in the regex comment (now also names
the runner bash sinks) and an unpinned tab-rejection behavior (now
tested).

v0.12.1

fix(cmd): honour [agent...] args in clem login (#182)

Closes #152.

`clem login` advertised `[agent...]` positional args in its Use string
but `runLogin` never read them — every invocation looped all configured
agents, so selective login was silently ignored.

## Changes
- New `selectAgents` helper filters `cfg.Agents` to the keys given on
the command line; unknown keys return `unknown agent: <key>` (same
convention as `clem logs`). No args keeps current behaviour (all
agents).
- Agents are now iterated in sorted key order for deterministic
interactive prompts, matching the sorted-output convention in `clem
status`.
- Combining agent args with `--remote` now errors instead of silently
dropping the selection: `remote.Login` only takes a host and cannot
forward agent filtering, so an honest error beats logging in every
remote agent against the operator's intent.
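
The selection rules can be sketched as below. This is an illustrative Python version of the behaviour described, not the Go `selectAgents` helper itself. The signature is an assumption, as is collapsing repeated keys to one entry.

```python
def select_agents(configured: dict, args: list) -> list:
    """Return the agent keys to log in, in sorted order."""
    if not args:
        return sorted(configured)  # no args: keep the old behaviour, all agents
    for key in args:
        if key not in configured:
            raise ValueError(f"unknown agent: {key}")
    return sorted(set(args))  # deterministic order for interactive prompts
```

For example, with agents `lead`, `dev`, and `qa` configured, `select_agents(cfg, ["qa", "dev"])` returns `["dev", "qa"]`, and an unrecognised key raises `unknown agent: <key>`.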

## Testing
- `go build ./...`, `go vet ./cmd/`, `go test ./cmd/` green.
- New `TestSelectAgents` covers no-args (all), single key, multiple
keys, and unknown-key error.

Adversarial review ran before this PR (multi-angle finders + verifiers).
It caught two confirmed issues that are fixed in this diff: the
`--remote` + agent-args silent ignore, and nondeterministic
map-iteration order for login prompts.