
feat(workflows): add reusable secret-leak check for per-PR scanning#24

Merged: dkastl merged 3 commits into main from feat/reusable-secret-leak-check on May 14, 2026.

dkastl commented May 14, 2026


Summary

Adds reusable-secret-leak-check.yml, an org-wide reusable workflow that any Geolonia repo can call from pull_request to block new secret commits before merge.

Pairs with the weekly org-wide audit in geolonia-operations/.github/workflows/secret-leak-audit.yml:

| Workflow | Where | Catches |
|---|---|---|
| Org-wide audit (cron) | geolonia-operations | Secrets already in main history of any org repo |
| This reusable | Any consumer repo, pull_request trigger | New secret commits before they merge |

Tracking issue: geolonia/geolonia-operations#60.

How it works

Scans only the commits the PR adds: betterleaks git . --log-opts="<base-sha>..<head-sha>". Runs the upstream betterleaks container pinned by immutable digest (v1.2.0).
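To make the commit-range scoping concrete, here is a minimal Python sketch of how the scan arguments could be assembled. It is not the workflow's actual shell code: the helper name `build_scan_args` and the SHA sanity check are assumptions for illustration, and only the `git .` subcommand and the `--log-opts` range shown above are taken from this PR.

```python
import re

# Abbreviated or full hexadecimal commit SHA (illustrative sanity check only).
SHA_RE = re.compile(r"[0-9a-f]{7,40}")


def build_scan_args(base_sha: str, head_sha: str) -> list[str]:
    """Return an argv list that limits the scan to commits in base..head."""
    for name, sha in (("base", base_sha), ("head", head_sha)):
        if not SHA_RE.fullmatch(sha):
            raise ValueError(f"{name} SHA looks malformed: {sha!r}")
    # Two-dot range: commits reachable from head but not from base,
    # i.e. the commits the PR adds.
    commit_range = f"{base_sha}..{head_sha}"
    return ["betterleaks", "git", ".", f"--log-opts={commit_range}"]
```

Passing the arguments as a list rather than a single shell string avoids quoting problems if this were handed to a process runner.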

Outputs:

  • Inline workflow annotations on the PR's "Files changed" view (one per finding)
  • An idempotent PR comment (updated across re-pushes via a hidden marker line, so the PR doesn't accumulate one comment per push)
  • A workflow artifact with the redacted JSON (30-day retention)
  • Optional job failure when findings exist (the merge-block gate)
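The idempotent comment behavior can be sketched as follows. This is an illustrative Python model, not the workflow's `gh`-based implementation: the marker string and the `plan_comment` helper are hypothetical, and comments are represented as plain dicts with `id` and `body` keys.

```python
# Hypothetical marker; the real workflow's hidden marker text is not shown in this PR.
MARKER = "<!-- secret-leak-check -->"


def plan_comment(existing: list[dict], body: str, marker: str = MARKER) -> dict:
    """Decide whether to update the previously posted comment or create a new one."""
    full_body = f"{marker}\n{body}"
    for comment in existing:
        # A comment body may be missing or None, so fall back to an empty string.
        if marker in (comment.get("body") or ""):
            return {"action": "update", "id": comment["id"], "body": full_body}
    return {"action": "create", "body": full_body}
```

Because the marker sits inside an HTML comment, it stays invisible in the rendered PR comment while still being searchable on the next push.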

Inputs

| Name | Default | Purpose |
|---|---|---|
| betterleaks-image | digest of v1.2.0 | Override only for testing newer versions. |
| fail-on-findings | true | Block by default. Set to false for warn-only mode while a repo cleans up legacy false positives. |
| config-path | '' | Optional .betterleaks.toml config path. |
| base-ref | '' | Defaults to PR base SHA; override only for non-pull_request triggers. |
| validate-secrets | false | Active-secret HTTP validation. Off by default for PR triggers (don't make outbound HTTP from contributor work). |

Permissions

permissions:
  contents: read
  pull-requests: write

No org-level token. GITHUB_TOKEN is sufficient since the scan is scoped to the caller repo.

Consumer pattern (after tagging v1)

# .github/workflows/secret-leak-check.yml in any consumer repo
name: Secret Leak Check
on:
  pull_request:
    branches: [main]
jobs:
  scan:
    uses: geolonia/.github/.github/workflows/reusable-secret-leak-check.yml@v1

Override defaults per repo if needed:

    with:
      fail-on-findings: false      # warn-only during cleanup
      config-path: .betterleaks.toml

Failure semantics + body cap (inherited from secret-leak-audit.yml)

  • Schema-validates the JSON: accepts null (clean) or top-level array (findings). Anything else triggers scan_failed=true and explicit failure.
  • --redact on by default. Detected secret values never appear in logs, annotations, or PR comments.
  • 60,000-char inline comment cap; full report in the artifact.
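The schema check in the first bullet amounts to a three-way classification of the report. A minimal Python sketch, assuming the report is available as a raw string (the workflow itself does this with jq; `classify_report` is a hypothetical name):

```python
import json


def classify_report(raw: str) -> tuple[bool, int]:
    """Return (scan_failed, total_findings) for the raw report text."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        # Unreadable or empty output counts as a scan failure, not a clean result.
        return True, 0
    if data is None:
        return False, 0  # `null` means the scan ran and found nothing
    if isinstance(data, list):
        return False, len(data)  # top-level array: one element per finding
    return True, 0  # any other JSON shape is treated as a malformed report
```

Treating unexpected shapes as failures rather than as zero findings keeps a broken scan from silently passing the gate.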

Test plan

  • CodeRabbit review passes
  • After merge, tag v1 so consumers can reference the workflow with uses: ...@v1
  • Follow-up: add a consumer wrapper in geolonia-operations first (self-dogfood); confirm the comment + annotation behavior on a test PR
  • Then roll out wrappers to active submodules one by one

Out of scope

  • Consumer wrappers in individual repos (each adopts the reusable on its own schedule)
  • Branch protection automation (turning the check into a required status)

Summary by CodeRabbit

  • New Features

    • Added a reusable per-PR secret-leak scanning workflow with configurable scan image, fail-on-findings, config path, base ref override, and optional live validation.
    • Added a workflow template to enable the per-PR secret leak check easily.
  • Chores

    • Scans only PR-introduced changes, posts inline annotations and an updatable PR comment, uploads a findings artifact, and can optionally fail the check on findings.


Adds `reusable-secret-leak-check.yml` to the org-wide reusable
workflow library. Pairs with the weekly org-wide audit in
geolonia-operations (`secret-leak-audit.yml`): the audit catches what
is already in main history; this reusable prevents new secret commits
from getting in.

## Behavior

Scans only the commits added by the PR (the diff against the base
ref), via `betterleaks git . --log-opts="<base>..<head>"`. Runs the
upstream betterleaks container pinned by immutable digest. Emits:

- Inline workflow annotations on the PR's "Files changed" view
- An idempotent PR comment (updated across re-pushes via a hidden
  marker line)
- A workflow artifact with the redacted JSON report (30-day retention)
- Optional job failure when findings exist (the gate that blocks merge)

## Inputs

| Name | Default | Purpose |
|---|---|---|
| `betterleaks-image` | digest of v1.2.0 | Override only for testing newer versions. |
| `fail-on-findings` | `true` | Block-by-default. Set to `false` for warn-only mode during cleanup of legacy false positives. |
| `config-path` | `''` | Optional path to a `.betterleaks.toml` config in the caller repo. |
| `base-ref` | `''` | Defaults to the PR base; override only for non-pull_request triggers. |
| `validate-secrets` | `false` | Active-secret HTTP validation. Off by default for PR triggers. |

## Permissions

`contents: read` + `pull-requests: write` (for the comment). No
org-level token needed; `GITHUB_TOKEN` is sufficient since the scan is
scoped to the caller repo.

## Consumer pattern

```yaml
name: Secret Leak Check
on:
  pull_request:
    branches: [main]
jobs:
  scan:
    uses: geolonia/.github/.github/workflows/reusable-secret-leak-check.yml@v1
```

Override defaults per repo if needed:

```yaml
    with:
      fail-on-findings: false
      config-path: .betterleaks.toml
```

## Failure semantics

- Schema-validates that the JSON output is `null` (clean) or a
  top-level array (findings). Anything else triggers
  `scan_failed=true` and an explicit job failure.
- `--redact` on by default: detected values never appear in logs,
  annotations, or PR comments.
- 60,000-char cap on the inline comment body; full report in the
  artifact.
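As an illustration of the body cap, a small Python sketch follows. The truncation notice text and the `cap_body` helper are assumptions; only the 60,000-character limit comes from this commit, and the sketch counts Python characters, which may differ from how the workflow measures length.

```python
COMMENT_CAP = 60_000  # cap stated in this commit message

# Hypothetical notice text; the real workflow's wording is not shown here.
NOTICE = "\n\n_Output truncated; see the workflow artifact for the full report._"


def cap_body(body: str, cap: int = COMMENT_CAP, notice: str = NOTICE) -> str:
    """Return body unchanged if it fits, else a truncated body ending in notice."""
    if len(body) <= cap:
        return body
    # Reserve room for the notice so the final text never exceeds the cap.
    keep = max(cap - len(notice), 0)
    return (body[:keep] + notice)[:cap]
```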

## Related

- Tracking issue: geolonia/geolonia-operations#60
- Pair workflow: `secret-leak-audit.yml` in geolonia-operations

coderabbitai Bot commented May 14, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 68fe1eae-de79-49e7-b57d-bb664e91236c

📥 Commits

Reviewing files that changed from the base of the PR and between 6ad0f7d and d8125b5.

📒 Files selected for processing (1)
  • .github/workflows/reusable-secret-leak-check.yml

Walkthrough

This pull request adds a reusable GitHub Actions workflow that scans PR-introduced commits for secret leaks using betterleaks, validates and counts JSON findings, emits inline annotations, posts or updates an idempotent PR comment, uploads findings as an artifact, and optionally fails the job; it also adds workflow-template metadata for the new workflow.

Changes

Secret Leak Detection Workflow

| Layer / File(s) | Summary |
|---|---|
| Workflow definition and environment setup (`.github/workflows/reusable-secret-leak-check.yml`) | Defines the reusable workflow interface and inputs, sets job permissions/timeout, checks out full repo history, and resolves base/head SHAs for PR-only diffing. |
| Image pull and scan execution (`.github/workflows/reusable-secret-leak-check.yml`) | Pulls/verifies the configured betterleaks container image and runs betterleaks in Docker limited to PR-introduced commits, always writing JSON output as a runner-readable file. |
| Findings JSON validation and counting (`.github/workflows/reusable-secret-leak-check.yml`) | Validates findings.json (accepts null or top-level array), computes total findings with jq, and sets scan_failed when JSON is unreadable or malformed. |
| Inline PR annotations for findings (`.github/workflows/reusable-secret-leak-check.yml`) | Emits one sanitized inline annotation per finding when readable JSON exists. |
| Idempotent PR comment construction (`.github/workflows/reusable-secret-leak-check.yml`) | Builds an idempotent PR comment body with a hidden marker for scan-failed / zero-findings / findings-present branches and truncates inline details. |
| Post/update PR comment via gh (`.github/workflows/reusable-secret-leak-check.yml`) | Searches for existing marker comment and patches or posts a new comment; gracefully degrades on forks/read-only tokens. |
| Upload findings artifact (`.github/workflows/reusable-secret-leak-check.yml`) | Uploads findings.json as a workflow artifact keyed by PR number or run id. |
| Job completion and conditional failure (`.github/workflows/reusable-secret-leak-check.yml`) | Fails when findings exist and fail-on-findings is enabled; fails separately when scan-level JSON/reporting is not produced. |
| Workflow template metadata (`workflow-templates/secret-leak-check.properties.json`, `workflow-templates/secret-leak-check.yml`) | Adds template metadata and a consumer wrapper workflow template that triggers the reusable workflow on pull requests. |
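The "Job completion and conditional failure" row reduces to a small decision function. The Python sketch below is an illustrative model of that gate, not the workflow's shell step; `job_outcome` and its return strings are hypothetical.

```python
def job_outcome(scan_failed: bool, total_findings: int, fail_on_findings: bool) -> str:
    """Model the final gate: scan errors always fail, findings fail only if enabled."""
    if scan_failed:
        # No trustworthy report was produced, so the job cannot pass.
        return "fail: scan error"
    if total_findings > 0 and fail_on_findings:
        return "fail: findings"
    # Either the scan is clean or the caller opted into warn-only mode.
    return "pass"
```

With `fail_on_findings` set to false, findings still show up in annotations and the comment, but the job result stays green.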

Sequence Diagram

sequenceDiagram
  participant Workflow
  participant Git
  participant Container as betterleaks
  participant GitHubAPI
  participant Artifacts

  Workflow->>Git: Checkout repo (full history)
  Git-->>Workflow: Repository ready

  Workflow->>Workflow: Resolve base & head SHAs

  Workflow->>Container: Pull & verify image
  Container-->>Workflow: Image ready

  Workflow->>Container: Run scan (PR commits only)
  Container-->>Workflow: findings.json + exit code

  Workflow->>Workflow: Validate findings.json

  alt Scan succeeded & findings found
    Workflow->>GitHubAPI: Emit inline annotations
    GitHubAPI-->>Workflow: Annotations posted
  end

  Workflow->>Workflow: Build idempotent PR comment (hidden marker)

  Workflow->>GitHubAPI: Post/update PR comment
  GitHubAPI-->>Workflow: Comment posted

  Workflow->>Artifacts: Upload findings.json
  Artifacts-->>Workflow: Artifact stored

  alt Findings non-zero & fail-on-findings enabled
    Workflow->>Workflow: ❌ Fail job
  else Scan failed before JSON
    Workflow->>Workflow: ❌ Fail job (scan error)
  else Success
    Workflow->>Workflow: ✓ Job passes
  end

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related issues

  • geolonia/geolonia-operations#60 — Adds the same reusable betterleaks workflow (matching inputs and PR-diff scanning/annotation/comment behavior).
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately describes the main change: adding a reusable GitHub Actions workflow for secret leak detection that runs per PR. |
| Description check | ✅ Passed | The description comprehensively covers the summary, how it works, inputs, permissions, consumer pattern, failure semantics, test plan, and out-of-scope items, exceeding the template's basic requirements. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Makes the reusable secret-leak workflow discoverable via GitHub's
"New workflow" UI. When someone clicks "New workflow" in any
geolonia-org repo, they will see a "Secret Leak Check (per PR)" tile
under Security; selecting it copies the wrapper into their repo.

Files:

- `workflow-templates/secret-leak-check.yml` - thin wrapper that
  calls the reusable `reusable-secret-leak-check.yml@v1` (added in
  the same PR). Per-repo overrides exposed as commented-out lines
  the contributor can uncomment.
- `workflow-templates/secret-leak-check.properties.json` - GitHub UI
  metadata (name, description, category=Security, iconName=shield).

This mirrors the existing template pattern (see
`publish-techdocs.yml` / `cdk-deploy-monitor.yml`).

coderabbitai Bot left a comment


Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In @.github/workflows/reusable-secret-leak-check.yml:
- Around line 163-174: The jq snippet that emits GitHub workflow annotations is
interpolating untrusted fields (.Attributes.path, .RuleID,
.Attributes["git.sha"]) directly into the ::error command and must strip CR/LF
characters to prevent workflow-command injection; update the jq expression used
on findings.json (the jq pipeline calling .[] | "::error file=" + ...) to
sanitize each interpolated value by removing \r and \n (e.g., use jq's gsub to
replace "\\r" and "\\n" with empty strings) for .Attributes.path, .RuleID and
the git SHA slice before concatenation so no newline/carriage-return can break
the annotation protocol.
- Around line 230-248: The PR-comment step can fail for fork/Dependabot PRs due
to GITHUB_TOKEN write restrictions; make it best-effort by either skipping it
for forked PRs or ignoring failures: update the "Post / update PR comment" step
to include a conditional skip (e.g. extend the existing if to require
github.event.pull_request.head.repo.fork == false) or wrap the shell commands
(the gh api and gh pr comment invocations that set/read EXISTING and call gh api
-X PATCH / gh pr comment) with non-fatal error handling (use set +e and ensure
gh commands end with || true so failures don’t fail the job) so gh pr comment
and gh api failures become informational only.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: de0ab9b2-b9d1-4e61-89e0-a13eab27f3c3

📥 Commits

Reviewing files that changed from the base of the PR and between 73c555b and 612d9e3.

📒 Files selected for processing (1)
  • .github/workflows/reusable-secret-leak-check.yml

Two findings, both real:

1. **Major** - workflow-command annotation injection. The `::error
   file=...,line=...::msg` lines built from JSON could be split or
   spoofed if any interpolated value (path, rule id, commit SHA)
   contained a CR/LF or a literal `::` sequence. betterleaks' own
   output is unlikely to do that for legitimate findings, but a
   crafted path in a malicious repo could. Added a jq `safe`
   function that strips CR/LF and rewrites `::` to a non-parsable
   placeholder, applied to every interpolated field.

2. **Major** - the PR-comment step is unreachable from fork PRs and
   Dependabot PRs because `GITHUB_TOKEN` is read-only in those
   contexts regardless of `permissions:` (GitHub override). Previous
   code would hard-fail the step with a 403. Now:

   - `continue-on-error: true` so the step doesn't fail the job
   - Each `gh` call wrapped to detect failure and emit a clean
     `::warning::` instead of a 403 stack trace
   - The actual pass/fail of the job is decided by the dedicated
     "Fail if findings" step below, not by whether commenting worked

   Inline annotations + workflow artifact still surface findings on
   fork / Dependabot PRs - the comment is a UX nicety, not the gate.
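The sanitization in finding 1 can be sketched in Python as below. The actual fix is a jq `safe` function; this is an equivalent model under assumptions: the placeholder text, the `annotation` helper, and the message layout are hypothetical, while the field names (`.Attributes.path`, `.RuleID`, `.Attributes["git.sha"]`) are the ones named in the review.

```python
# Hypothetical placeholder; it must not start or end with ":" so that
# neighbouring colons cannot re-form a "::" sequence.
PLACEHOLDER = "(colon-colon)"


def safe(value: object) -> str:
    """Strip CR/LF and neutralize '::' so a value cannot start a new workflow command."""
    text = str(value).replace("\r", "").replace("\n", "")
    return text.replace("::", PLACEHOLDER)


def annotation(finding: dict) -> str:
    """Build one ::error annotation line from sanitized finding fields."""
    attrs = finding.get("Attributes") or {}
    path = safe(attrs.get("path", ""))
    rule = safe(finding.get("RuleID", ""))
    sha = safe(attrs.get("git.sha", ""))[:7]  # short SHA for readability
    return f"::error file={path}::{rule} (commit {sha})"
```

Every interpolated field passes through `safe`, so the only `::` sequences left in the output are the two delimiters the annotation format itself requires.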