Tags · jahwag/clem

v0.18.1

fix(runner): keep _backend assignment inside the python MCP-config block (#220)

## Bug

The generated `clem-runner.sh` placed the backend assignment on a
**bash** line directly above the Python MCP-config heredoc:

```sh
_backend = '{{.CoordinationBackend}}'
python3 -c "
import json, os
...
if _backend != 'github' and os.environ.get('DISCORD_TOKEN'):
```

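So the shell treated `_backend` as a command name, with `=` and `'github'` as its arguments, and the Python script then ran without the variable: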

```
/home/<user>/.local/bin/clem-runner.sh: line 36: _backend: command not found
Traceback (most recent call last):
  File "<string>", line 12, in <module>
NameError: name '_backend' is not defined
```

The Python block (which gates Discord/Slack MCP registration) never had
`_backend` defined. On the **github** backend it's non-fatal — no
coordination MCP is needed — but it errors on every iteration and aborts
the `.mcp.json` write.

## Fix

Move `_backend = '{{.CoordinationBackend}}'` **inside** the `python3 -c "`
script, right after `import json, os`. Applied to both runner
templates (claude-code and opencode), which had the identical layout.

## Test

`TestGenerate_BackendAssignedInsidePython` (table-driven over both
runtimes) asserts the assignment renders after `python3 -c "` and not on
a standalone bash line. Verified it **fails on the pre-fix code** and
passes after. `go fmt` / `go vet` / `go test ./...` all clean.

Jun 14, 2026
7265b55

v0.18.0

feat(init): have agents maintain their own open PRs (#219)

## What

The generated agent contract (`clem init` → `CLAUDE.shared.md`) ends the
task lifecycle at **"open a PR"**. Nothing brings an agent back to a PR
it already opened, so a PR that later:

- becomes **unmergeable** because its base branch moved on (conflicts),
- has **failing CI checks**, or
- receives **operator review feedback / change requests**

is never revisited — it's delivered work that silently never lands, and
the operator is left to babysit stale PRs.

## Change

Add a **"Your open PRs"** section to both shared templates (Discord and
GitHub backends). Each iteration, before claiming new work, the agent
lists its own open PRs (`gh pr list --author @me --state open`) and
keeps them mergeable:

- **Conflicts** → rebase onto the latest base, resolve, push.
- **Red checks** → fix the cause and push to the same branch.
- **Review feedback** → address it, **but only from trusted operators**
(consistent with the existing Trust section; all other review content
stays data, not instructions).

Merging remains the operator's job (unchanged Security rule).
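
As a rough illustration of the per-iteration triage, the sketch below maps one PR record to the three maintenance actions. It assumes the JSON field names that `gh pr list --json` is expected to expose (`mergeable`, `statusCheckRollup`, `reviews`); the function names, the action labels, and the `trusted` set are hypothetical and not part of the generated template.

```python
import json
import subprocess


def list_my_open_prs() -> list:
    """Fetch the agent's own open PRs via the gh CLI (field list is an assumption)."""
    result = subprocess.run(
        ["gh", "pr", "list", "--author", "@me", "--state", "open",
         "--json", "number,mergeable,statusCheckRollup,reviews"],
        check=True, capture_output=True, text=True,
    )
    return json.loads(result.stdout)


def maintenance_actions(pr: dict, trusted: set) -> list:
    """Decide what a PR needs; merging itself is never one of the actions."""
    actions = []
    if pr.get("mergeable") == "CONFLICTING":
        actions.append("rebase")
    checks = pr.get("statusCheckRollup") or []
    if any("FAILURE" in (c.get("conclusion"), c.get("state")) for c in checks):
        actions.append("fix-checks")
    # Only change requests from trusted operators count as instructions;
    # every other review stays data.
    for review in pr.get("reviews") or []:
        author = (review.get("author") or {}).get("login")
        if review.get("state") == "CHANGES_REQUESTED" and author in trusted:
            actions.append("address-review")
            break
    return actions
```

Keeping the decision in a pure function separates it from the `gh` call, so the trust rule can be exercised without network access.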

## Notes

- No CLI/flag or `clem.yaml` schema changes — template content only.
- Both backends covered; `{{coordination.github_repo}}` is substituted
at generation time as usual.
- Added `TestInitTemplateContainsOpenPRMaintenance` (covers both
backends). `go fmt`, `go vet`, and `go test ./...` all clean.

Jun 14, 2026
1966d52

v0.17.0

feat(coordination): add GitHub Issues as coordination backend (#173)

## Summary

Adds `github` as a third `coordination.backend` option alongside Discord
(default) and Slack. Engineering fleets can coordinate tasks with native
GitHub Issues primitives — labels, assignees, comments, and PR linkage —
instead of reconstructing state from chat messages.

This does not replace Discord or Slack. It completes the existing
swappable-backend abstraction with an **issue-first** mode for teams
whose work naturally ends in pull requests.

## Why add GitHub Issues as a coordination backend?

`clem` already models coordination as a swappable backend selected
through `coordination.backend`. Discord and Slack are good defaults for
conversational fleets, but they are not the only useful coordination
surface.

For engineering-focused fleets, the work naturally starts and ends in
GitHub:

```text
task → claim → implementation → progress updates → PR → review → merge
```

GitHub Issues already provides the native primitives required to
represent this workflow:

| clem concept | GitHub primitive |
|--------------|------------------|
| Task queue | Open issues with a configured label |
| Task state | `clem:todo`, `clem:in-progress`, `clem:done`, `clem:blocked` labels |
| Claim | Self-assignment (`gh issue edit N --add-assignee @me`) |
| Progress updates | Issue comments |
| Delivery | Pull request with `Closes #N` |
| Alerts | Comments on a configured alerts issue |
| Lessons and post-mortems | Comments on a configured lessons issue |
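
To make the mapping concrete, here is a minimal sketch of a claim expressed as `gh` invocations. The self-assignment command comes from the table; the label swap, the ordering of the two steps, and the follow-up ownership check are assumptions for illustration, and both function names are hypothetical.

```python
def claim_commands(issue: int) -> list:
    """Build the gh argv lists for claiming an issue (nothing is executed here)."""
    n = str(issue)
    return [
        # Claim: self-assignment, as in the table above.
        ["gh", "issue", "edit", n, "--add-assignee", "@me"],
        # Task state: move the label from todo to in-progress (assumed step).
        ["gh", "issue", "edit", n,
         "--remove-label", "clem:todo", "--add-label", "clem:in-progress"],
    ]


def claim_held(assignees: list, me: str) -> bool:
    """Cooperative check after claiming: the agent is the only assignee."""
    return assignees == [me]
```

Re-reading the assignee list after claiming is a cooperative convention, not a lock; two agents can still race between the read and the write.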

The GitHub backend is useful when users want:

1. **A durable task queue.** Each task has a persistent object with
explicit state, assignee, history, and linked delivery.
2. **Traceability from task to code.** A task can be followed from issue
creation to claim, implementation, PR review, and merge.
3. **Lower operational overhead.** Teams that already use GitHub do not
need an additional chat platform or coordination MCP server.
4. **Human-in-the-loop governance.** Humans can inspect, reprioritize,
block, or unblock tasks using familiar GitHub workflows.
5. **Asynchronous operation.** Coordination does not depend on
reconstructing state from a stream of chat messages.

### Field validation (pre-merge integration run)

The backend was exercised against a real shared repository in a
pre-merge integration run — not mocks, not unit tests alone.

A provisioned fleet autonomously processed six issues and produced six
pull requests:

| Evidence observed | Result |
|-------------------|--------|
| Tasks coordinated | 6 issues |
| Claims | 6 issues with exactly 1 assignee each |
| Double-claim race | None observed |
| Deliveries | 6 PRs |
| Task → delivery link | 6 PRs with correct `Closes #N` |
| Mergeability | 6 PRs reported as `MERGEABLE` |
| CI | Green; docs-only jobs correctly skipped by path filters |
| Operational memory | 24 comments on the lessons issue |
| Alerts and post-mortems | Recorded as comments on dedicated issues |

The run also surfaced meaningful engineering findings rather than only
generating code: divergence between design sketches and merged code,
shallow acceptance criteria hiding blocking defects, a script with no
effective entrypoint, and CI behaviour requiring human attention.

This should not be described as a production test: no generated PR was
merged and no production workload was executed. It is a real pre-merge
integration validation of the coordination loop.

**Claim semantics:** the pilot did not observe a double claim — each task
ended with exactly one assignee. The protocol is adequate for
cooperative coordination between agents, but should not be described as
a formal mutual-exclusion guarantee. Stronger arbitration can be a
follow-up.

**Identity note:** the pilot used a single GitHub identity for all
events. Per-task and per-PR audit worked, but per-agent attribution
appeared only in comment content. The clem model already supports
per-agent Linux users, git identity, and PR authorship; separate tokens
per agent would improve this dimension and are the recommended
production configuration.

## What changed

### Coordination backend (`internal/coordination`)

- Register `github` in `Known()` with `AlertTemplate` posting to
`api.github.com/repos/{repo}/issues/{n}/comments` via `GITHUB_TOKEN`.
- New `RenderAlert()` + `AlertParams` — unified alert rendering for
Discord, Slack, and GitHub (watchdog and runner now share this path).
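
For orientation, a GitHub alert of this kind amounts to a single authenticated POST that creates an issue comment. The sketch below builds such a request in Python without sending it; it is not the `AlertTemplate` the project renders (which is a curl command), and the function name and header choices are assumptions.

```python
import json
import urllib.request


def build_alert_request(repo: str, issue: int, message: str, token: str) -> urllib.request.Request:
    """Build (but do not send) a POST that adds a comment to the alerts issue."""
    url = f"https://api.github.com/repos/{repo}/issues/{issue}/comments"
    # json.dumps escapes quotes and newlines in the message body.
    payload = json.dumps({"body": message}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=payload,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/vnd.github+json",
            "Content-Type": "application/json",
        },
    )
```

Sending it would be `urllib.request.urlopen(req)`, with the token taken from `GITHUB_TOKEN` in the environment.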

### Configuration (`internal/config`)

- `coordination.github_repo` (`owner/name`, required when
`backend: github`).
- Validation for GitHub channels: `tasks` = label (e.g. `clem:todo`);
`alerts` / `lessons` = issue numbers.
- Helpers: `UsesGitHubCoordination()`, `GitHubWatchServiceName()`,
`BackendOrDefault()`.
- `api.github.com` added to default egress allowlist.
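
The validation rules above can be sketched roughly as below. This is a Python illustration rather than the Go code in `internal/config`; the dictionary layout (a `channels` mapping next to `backend` and `github_repo`), the repository regex, and the error wording are all assumptions.

```python
import re

# Loose owner/name shape; the real check may be stricter or looser.
REPO_RE = re.compile(r"^[A-Za-z0-9_.-]+/[A-Za-z0-9_.-]+$")


def validate_github_coordination(cfg: dict) -> list:
    """Return a list of error strings; an empty list means the config is acceptable."""
    if cfg.get("backend") != "github":
        return []  # the rules below only apply to the GitHub backend
    errors = []
    if not REPO_RE.match(str(cfg.get("github_repo", ""))):
        errors.append("coordination.github_repo must be owner/name")
    channels = cfg.get("channels") or {}
    tasks = channels.get("tasks")
    if not (isinstance(tasks, str) and tasks.strip()):
        errors.append("tasks must be a label such as clem:todo")
    for key in ("alerts", "lessons"):
        value = str(channels.get(key, ""))
        if not (value.isdigit() and int(value) > 0):
            errors.append(f"{key} must be an issue number")
    return errors
```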

### Issue watcher sidecar (`internal/githubwatch`, new)

- `clem provision` writes `~/.local/bin/clem-github-watch.sh` per agent.
- Polls `GET /repos/{repo}/issues?labels=…&state=open` every 60s with
**conditional requests** (`ETag` / `If-None-Match`) per [GitHub REST API best practices](https://docs.github.com/rest/guides/best-practices-for-using-the-rest-api).
- Detects new unassigned issues and wakes the tmux session
(`tmux send-keys`).
- Installs `clem-github-watch-{project}-{agent}.service` with
`JoinsNamespaceOf` the agent unit.
- Respects egress containment (loopback proxy export + `IPAddressDeny`
when enabled).
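
The polling loop's core step might look like the sketch below. The actual watcher is a generated shell script; this Python version only illustrates the conditional-request pattern. `fetch` is a hypothetical injected callable that takes request headers and returns `(status, response_headers, issues)`.

```python
def poll_once(fetch, etag, seen: set):
    """One poll: send If-None-Match when an ETag is known, report new unassigned issues.

    Returns (etag_to_use_next_time, list_of_new_issue_numbers).
    """
    headers = {"If-None-Match": etag} if etag else {}
    status, resp_headers, issues = fetch(headers)
    if status == 304:
        # Nothing changed since the last poll; keep the same ETag.
        return etag, []
    fresh = []
    for issue in issues:
        if "pull_request" in issue:
            continue  # the issues endpoint also lists pull requests
        if issue.get("assignees"):
            continue  # already claimed
        if issue["number"] not in seen:
            fresh.append(issue["number"])
    seen.update(fresh)
    return resp_headers.get("ETag", etag), fresh
```

A caller would run this every 60 seconds and wake the agent's tmux session whenever the returned list is non-empty.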

**Why polling, not webhooks?** Deliberately simple and compatible with
clem's self-hosted model: no public endpoint, no webhook receiver
service, no extra infrastructure. Webhooks may be a future option for
installations that already have inbound HTTP infrastructure.

### Runner (`internal/runner`)

- Skips Discord/Slack MCP registration when `backend: github` (agents
use `gh` CLI).
- Agent unit gains `Wants=clem-github-watch-…` when GitHub coordination
is active.
- Alert curl rendered via `coordination.RenderAlert()`.

### Provision / init

- `cmd/provision`: installs watcher script + systemd unit when
`UsesGitHubCoordination()`.
- `clem init --backend github`: scaffolds `clem.yaml` and
`CLAUDE.shared.md` with GitHub task-board semantics and claim protocol.

### Watchdog (`internal/watchdog`)

- `send_alert` uses `coordination.RenderAlert()` with repo + issue
number for GitHub backends.

### Agent docs (`internal/agentdoc`)

- `{{coordination.github_repo}}` placeholder for templates.

### Samples and docs

- `samples/github-tasks/` — reference `clem.yaml` and setup guide.
- README: coordination backends table, GitHub coordination section,
updated `clem.yaml` reference.
- `docs/index.html`: multi-backend copy updated.

### CI

- New `github-coordination` e2e job: provisions with `backend: github`,
asserts watcher script syntax, API polling, systemd wiring.

## Design scope (intentionally narrow)

- Opt-in via `coordination.backend: github`; Discord and Slack
unchanged.
- Reuses existing `GITHUB_TOKEN` and standard `gh` CLI — no coordination
MCP.
- Lightweight polling watcher; preserves egress-containment model.
- Quality gates and closed-loop verification are **out of scope** for
this PR (separate future work).

## Test plan

- [x] `go test ./...` — 238 tests pass
- [x] `go vet ./...` — clean
- [x] `go fmt ./...` — no diff
- [x] Unit tests: `coordination`, `config`, `runner`, `watchdog`,
`agentdoc`, `githubwatch`
- [x] e2e job `github-coordination` in `.github/workflows/e2e.yml`
- [x] Pre-merge fleet integration (6 issues → 6 PRs, described above)
- [ ] After merge: operator smoke test with `clem init --backend github`
→ `clem provision` → verify watcher service active and agent wakes on
new `clem:todo` issue

## Notes

- GitHub coordination is **not** chat emulation. Alerts and lessons use
issue comments because they are durable and auditable; free-form
conversation remains better suited to Discord or Slack.
- Optimistic `clem:done` labels with pending concerns observed in the
pilot reflect a **quality-policy** gap, not a coordination-transport
failure. Quality gates are a separate concern from task state, wake-up,
and traceability.
- Recommended follow-ups: per-agent GitHub tokens for attribution,
stronger claim arbitration, optional webhook mode for high-throughput
installations.

Jun 14, 2026
416240f

v0.16.0

feat(watchdog): daily transcript prune + night-aware stale threshold (#211)

- prune_transcripts: session JSONLs + UUID sidecar dirs older than 30d
  deleted once daily (observed ~1.5 GB/agent-month unbounded growth; a
  production host hit 88% disk). memory/ and other non-UUID dirs at the
  same depth are deliberately not matched.
- stale threshold now derives from max(iteration, iteration_night):
  sizing on the day value made every healthy 30m night sleep look stale
  and would have hard-restarted agents all night.
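
A minimal sketch of the two rules, in Python rather than the watchdog's own script. The UUID pattern, the assumption that session files are named `<uuid>.jsonl`, and both function names are illustrative guesses, not taken from the commit.

```python
import re

# Matches a bare UUID (sidecar dir) or <uuid>.jsonl (session transcript).
UUID_RE = re.compile(
    r"^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}(\.jsonl)?$"
)
MAX_AGE_SECONDS = 30 * 24 * 3600  # 30 days


def prune_candidates(entries, now: float) -> list:
    """Pick names to delete from (name, mtime) pairs; non-UUID names never match."""
    return [
        name for name, mtime in entries
        if UUID_RE.match(name) and now - mtime > MAX_AGE_SECONDS
    ]


def stale_basis(iteration_s: int, iteration_night_s: int) -> int:
    """The stale threshold is derived from the longer of the two sleep intervals."""
    return max(iteration_s, iteration_night_s)
```

With a 10-minute day interval and a 30-minute night interval, `stale_basis(600, 1800)` yields 1800, so a healthy night sleep no longer looks stale.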

Jun 12, 2026
c40d236

v0.15.1

feat(watchdog): daily transcript prune + night-aware stale threshold (#211)

- prune_transcripts: session JSONLs + UUID sidecar dirs older than 30d
  deleted once daily (observed ~1.5 GB/agent-month unbounded growth; a
  production host hit 88% disk). memory/ and other non-UUID dirs at the
  same depth are deliberately not matched.
- stale threshold now derives from max(iteration, iteration_night):
  sizing on the day value made every healthy 30m night sleep look stale
  and would have hard-restarted agents all night.

Jun 12, 2026
c40d236

v0.15.0

feat(runner): iteration_night, next-effort handshake, runner warnings, quota snapshot (#210)

Four runner/config features driven by a production cache+quota audit
(2026-06-13, consultant.dev team host):

- iteration_night: separate night-hours (22-07) sleep. The hardcoded
  night doubler was removed when the prompt-cache TTL was believed to be
  5 min; subscription Claude Code actually gets the 1h TTL refreshed on
  access (verified from session-log usage fields), so night intervals up
  to ~45m still start warm. Default: match iteration.
- next-effort handshake: agent writes low|medium|high|xhigh to
  ~/.claude/next-effort; runner validates, exports session-scoped
  CLAUDE_CODE_EFFORT_LEVEL for the next launch, deletes the file. No
  reset bookkeeping, no drift.
- runner warnings: sync-skills failures and <1h-to-expiry OAuth tokens
  are prepended to the injected prompt so the agent itself escalates.
  Also fixes sync-skills failure detection: 'sync | tee || log' tested
  tee's exit status (no pipefail), so failures never logged -- a dirty
  clone silently blocked one production agent's skill sync for 3 weeks.
  Now uses PIPESTATUS[0].
- quota snapshot: runner refreshes ~/.claude/quota.json from the OAuth
  usage endpoint at most every 25m; agents read the file instead of
  polling per-iteration (which 429s with multiple agents per host).

claude-code runtime gets all four; opencode gets the warnings prepend
plus the sync-skills fix (effort/quota are Claude-specific).
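
The next-effort handshake can be pictured with the sketch below. The real runner is a generated bash script; this Python version is only an illustration, and it assumes the file is deleted even when its content is invalid, which the notes above do not state. Function names are hypothetical.

```python
from pathlib import Path

VALID_LEVELS = {"low", "medium", "high", "xhigh"}


def consume_next_effort(path: Path):
    """Read, validate, and delete the handshake file; return the level or None."""
    try:
        level = path.read_text().strip()
    except FileNotFoundError:
        return None  # the agent requested nothing for the next launch
    path.unlink()  # one-shot: the request never carries over to later launches
    return level if level in VALID_LEVELS else None


def launch_env(base_env: dict, level) -> dict:
    """Return the environment for the next launch only; base_env is left untouched."""
    env = dict(base_env)
    if level is not None:
        env["CLAUDE_CODE_EFFORT_LEVEL"] = level
    return env
```

Because the file is consumed on read and the variable is set only in the copied environment, there is nothing to reset afterwards, which is the "no reset bookkeeping, no drift" property.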

Jun 12, 2026
90f9fbd

v0.14.0

feat(skills): team skills repo sync (provision seed + per-iteration refresh) (#205)

Rebases the skills feature (059e5bf + 822c997, previously only on the
v0.10.0-snapshot.1 channel) onto current main, restoring per-provision
and per-iteration team-skills sync that went dormant after the box moved
to mainline v0.13.0. Closes #204.

## What
- Top-level skills_repo config key: clem provision clones the repo per
agent and symlinks shared/<skill> and <agentKey>/<skill> into
~/.claude/skills/; idempotent re-runs git pull --ff-only, stale symlinks
pruned.
- clem sync-skills subcommand + runner hook: skills refresh at the top
of every iteration, no operator round-trip after a skills PR merges.
- clem update --snapshot flag: opt-in prerelease channel (goreleaser
prerelease: auto keeps snapshot tags off stable hosts).
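
The symlink step can be sketched roughly as follows. This is a Python illustration, not the Go implementation; it assumes an agent-specific skill overrides a shared skill of the same name and that only symlinks pointing into the skills repo are pruned. The function name and parameters are hypothetical.

```python
import os
from pathlib import Path


def link_skills(repo: Path, agent_key: str, skills_dir: Path) -> list:
    """Symlink shared/<skill> and <agent_key>/<skill> into skills_dir; prune stale links."""
    skills_dir.mkdir(parents=True, exist_ok=True)
    wanted = {}
    for group in ("shared", agent_key):  # later group wins on a name clash (assumption)
        base = repo / group
        if base.is_dir():
            for skill in sorted(base.iterdir()):
                if skill.is_dir():
                    wanted[skill.name] = skill
    # Prune links that point into the repo but no longer correspond to a skill.
    for link in list(skills_dir.iterdir()):
        if link.is_symlink():
            target = Path(os.readlink(link))
            if repo in target.parents and link.name not in wanted:
                link.unlink()
    # Create or repair links; re-running with no changes is a no-op.
    for name, target in wanted.items():
        link = skills_dir / name
        if link.is_symlink():
            if Path(os.readlink(link)) == target:
                continue
            link.unlink()
        elif link.exists():
            continue  # never clobber a real file or directory
        link.symlink_to(target, target_is_directory=True)
    return sorted(wanted)
```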

## Rebase conflict resolutions (vs v0.13.0-era main)
- config.go: SkillsRepo registered as a real struct field, so it passes
the new strict unknown-key validation; isPlausibleGitURL check runs in
Load().
- IsValidExtensionName moved to extensions.go next to extensionNameRe
(file was split since the original commits).
- update.go: kept main's exact-name selectBinaryAsset (#201) and
test-overridable URL vars; added Prerelease/Draft fields +
allReleasesURL for the snapshot channel.
- runner.go/provision.go: skills hooks re-inserted into the refactored
provisionAgent / Params paths alongside ProxyExport/SidecarServers.

## Verification
- go build ./... clean, gofmt clean, go vet clean
- go test ./... all packages ok, including restored skills tests:
TestLoad_SkillsRepoAccepted/Rejected,
TestGenerate_SkillsSyncInjectedWhenRepoSet/AbsentWhenRepoUnset,
SyncSkillsRepo manager tests
- Pre-push secret-scan flag is a false positive: neither commit diff
contains a GH_TOKEN read (grep of both diffs is empty; provision.go:46
is pre-existing main code), pushed with CLEM_HOOK_SKIP_CODE_SCAN=1

Release plan per jahwag: merge, then tag v0.14.0 (new feature = minor
bump rather than v0.13.1).

---------

Co-authored-by: jahwag <540380+jahwag@users.noreply.github.com>

Jun 10, 2026
ec690ab

v0.13.0

fix(agent): make pre-push unicode-trap scan locale-independent (#203)

Fable audit

---------

Signed-off-by: Claude <noreply@anthropic.com>
Co-authored-by: Claude <noreply@anthropic.com>

Jun 10, 2026
b218ee2

v0.12.2

fix(config): reject control characters in agent name/role at Load() (#198)

Closes #124.

## Problem

`AgentConfig.Name` and `AgentConfig.Role` are free-form strings from
`clem.yaml` with no validation at `Load()`. `Name` is interpolated into
systemd unit `Description=` lines via `serviceTemplate` and
`ttydServiceTemplate` in `internal/runner/runner.go`. systemd unit files
are newline-delimited, so a name containing a literal newline terminates
the `Description=` directive and injects arbitrary subsequent directives
— including a second `[Service]` section with a crafted `ExecStart` that
systemd merges and runs at service start.

## Fix

Reject all ASCII control characters (`[\x00-\x1f\x7f]`) in `name` and
`role` at `Load()`, following the same pattern as the `git_email`
validation (#183). Spaces remain legal — display names like "Lead
Software Engineer" are the common case in every sample config.

Scope notes:
- systemd splits unit files only on ASCII newline, so unicode separators
(U+2028, NEL) are not line breaks in that sink — the ASCII control-char
class matches the sink parser (verified empirically).
- Shell metacharacter escaping in the runner bash templates is
deliberately out of scope; that's #112.
- The `ac.Name` JSON-injection vector in the alert message is #115, also
untouched here.
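
The character class is small enough to show directly. The sketch below uses Python's `re` module as an illustration of the rule, not the Go code in `Load()`; the function name is hypothetical.

```python
import re

# Same class as in the fix: C0 control characters plus DEL.
CONTROL_RE = re.compile(r"[\x00-\x1f\x7f]")


def has_control_chars(value: str) -> bool:
    """True when value contains a character that should make Load() reject it."""
    return CONTROL_RE.search(value) is not None
```

Under this class `"Lead Software Engineer"` is accepted, a name containing `"\n"` is rejected, and a name containing U+2028 is accepted, which matches the scope notes above.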

## Testing

- New `TestLoad_AgentNameRoleRejectControlCharacters` covers newline /
CR / tab / \x01 / \x7f across both fields (fixtures pass through the
Go-string → YAML double-quoted-scalar decode chain, so real control
bytes reach `Load()`).
- New `TestLoad_AgentNameRoleAllowSpaces` pins that ordinary multi-word
names/roles still load.
- Full `go test ./...` passes.

Adversarial review ran before this PR: independent reviewers confirmed
the validation sits on the only path to the unit-file templates (single
`Load()` call site, no post-Load mutation of `Name`/`Role` anywhere),
confirmed no existing sample/doc/test config would be rejected, and
verified the regex class against the systemd sink parser. Review caught
an initially-incomplete sink list in the regex comment (now also names
the runner bash sinks) and an unpinned tab-rejection behavior (now
tested).

Jun 10, 2026
aeb4968

v0.12.1

fix(cmd): honour [agent...] args in clem login (#182)

Closes #152.

`clem login` advertised `[agent...]` positional args in its Use string
but `runLogin` never read them — every invocation looped all configured
agents, so selective login was silently ignored.

## Changes
- New `selectAgents` helper filters `cfg.Agents` to the keys given on
the command line; unknown keys return `unknown agent: <key>` (same
convention as `clem logs`). No args keeps current behaviour (all
agents).
- Agents are now iterated in sorted key order for deterministic
interactive prompts, matching the sorted-output convention in
`clem status`.
- Combining agent args with `--remote` now errors instead of silently
dropping the selection: `remote.Login` only takes a host and cannot
forward agent filtering, so an honest error beats logging in every
remote agent against the operator's intent.
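
The selection rules can be summarised in a short sketch. It is written in Python for illustration, whereas the real `selectAgents` helper is Go; the signature, the `remote` parameter, and the wording of the `--remote` error are assumptions. Only the `unknown agent: <key>` message comes from the description above.

```python
def select_agents(agents: dict, keys: list, remote=None) -> list:
    """Return the agent keys to log in, in sorted order."""
    if remote and keys:
        # Remote login cannot forward a selection, so fail instead of ignoring it.
        raise ValueError("agent arguments cannot be combined with --remote")
    for key in keys:
        if key not in agents:
            raise ValueError(f"unknown agent: {key}")
    chosen = keys if keys else list(agents)
    return sorted(set(chosen))  # deterministic order for interactive prompts
```

Calling it with no keys returns every configured agent, so existing invocations keep their behaviour.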

## Testing
- `go build ./...`, `go vet ./cmd/`, `go test ./cmd/` green.
- New `TestSelectAgents` covers no-args (all), single key, multiple
keys, and unknown-key error.

Adversarial review ran before this PR (multi-angle finders + verifiers).
It caught two confirmed issues that are fixed in this diff: the
`--remote` + agent-args silent ignore, and nondeterministic
map-iteration order for login prompts.

Jun 10, 2026
521299c
