
feat(betterleaks): add placeholder allowlists to org config #31

Merged
dkastl merged 2 commits into main from feat/betterleaks-placeholder-allowlist
May 19, 2026

Conversation

@dkastl dkastl commented May 19, 2026


Summary

Adds five `[[allowlists]]` blocks to `betterleaks/default.toml` so common docs-placeholder patterns no longer trip the targeted rules (mainly `curl-auth-header` and `curl-auth-user`).

Sample analysis on the May 2026 baseline

Out of 10 `curl-auth-header` findings sampled across distinct repos, 8 were unmistakable placeholders:

| Pattern | Count in sample |
|---|---|
| `YOUR_API_KEY` / `YOUR-API-KEY` | 5 (kagawa-pref, nasushiobara, nobeoka, gsi-3d, smartcity-geospatial-platform-api) |
| `<エンドポイントURL>` angle-bracket (Japanese for "endpoint URL") | (same files as above) |
| Famous example UUID `550e8400-e29b-41d4-a716-446655440000` | 1 (geonicdb-docs) |
| Other docs placeholder (`gdb_a1b2c3...`) | 1 (geonicdb) |
| Already-scrubbed dev token | 1 (geolonia-backstage) |
| Possibly real | 2 (tokyo-road-manager-api Devise token, yoshioka-town-2025 work log) |

The 2 real-looking ones remain detected; they'll be in the per-repo issues we open next.

Patterns added

Conservative — every regex is constrained so it cannot match a real credential:

  • `YOUR-_` — only the literal "YOUR_*" placeholder family
  • `<[A-Za-z\p{Han}\p{Hiragana}\p{Katakana}][^>]{1,60}>` — angle-bracket placeholders incl. Japanese
  • Three specific famous example UUIDs
  • `(REPLACE|CHANGE|PLACEHOLDER|EXAMPLE|FAKE|SAMPLE)[_-]?(ME|KEY|TOKEN|SECRET)?` — standard placeholder words
  • Explicit dev-scaffold patterns (`dev-local-token-please-change`, `please-(change|rotate)`)
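
As a rough illustration of how patterns of this shape separate placeholders from real-looking values, here is a minimal Go sketch. It assumes the patterns are applied as plain Go regular expressions against a candidate value; `isPlaceholder` is a hypothetical helper, the random-looking sample value is made up, and the exact pattern strings in `betterleaks/default.toml` may differ from the ones copied here from this thread.

```go
package main

import (
	"fmt"
	"regexp"
)

// placeholderPatterns mirrors pattern shapes quoted in this PR thread.
// Assumption: the strings in betterleaks/default.toml may differ in detail.
var placeholderPatterns = []*regexp.Regexp{
	// Angle-bracket placeholders, including Japanese text.
	regexp.MustCompile(`<[A-Za-z\p{Han}\p{Hiragana}\p{Katakana}][^>]{1,60}>`),
	// One of the famous example UUIDs named in the sample analysis.
	regexp.MustCompile(`550e8400-e29b-41d4-a716-446655440000`),
	// Generic placeholder words, in the form quoted in the review.
	regexp.MustCompile(`(?i)\b(REPLACE[-_]?ME|CHANGE[-_]?ME|PLACEHOLDER|EXAMPLE[-_]?(KEY|TOKEN|SECRET)?|FAKE[-_]?(KEY|TOKEN|SECRET)?|SAMPLE[-_]?(KEY|TOKEN|SECRET)?)\b`),
}

// isPlaceholder is a hypothetical helper: it reports whether any of the
// patterns matches somewhere inside the candidate value.
func isPlaceholder(value string) bool {
	for _, re := range placeholderPatterns {
		if re.MatchString(value) {
			return true
		}
	}
	return false
}

func main() {
	candidates := []string{
		"<your-token>",
		"<エンドポイントURL>",
		"550e8400-e29b-41d4-a716-446655440000",
		"REPLACE_ME",
		"k3J9xQ2mVb7TzR8wLp4N", // made-up random-looking value
	}
	for _, v := range candidates {
		fmt.Printf("%-42q placeholder=%v\n", v, isPlaceholder(v))
	}
}
```

Only the last candidate should fall through to the scanner; the others match one of the placeholder shapes.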

Test plan

  • CodeRabbit clean
  • After merge, re-run the local baseline. `curl-auth-header` findings should drop substantially (estimated from 329 to roughly 60-100); the 2 real-looking findings (tokyo-road-manager-api Devise token + yoshioka-town-2025 work-log key) should still appear.
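
The estimate can be sanity-checked with a small extrapolation. The Go sketch below assumes the 8-in-10 noise rate from the sample holds across all 329 baseline findings, which a sample of 10 cannot guarantee; `estimateRemaining` is a hypothetical helper name.

```go
package main

import (
	"fmt"
	"math"
)

// estimateRemaining extrapolates a sampled noise rate to the full baseline
// and returns the number of findings expected to remain after allowlisting.
func estimateRemaining(total, sampled, noise int) int {
	keepRate := float64(sampled-noise) / float64(sampled)
	return int(math.Round(float64(total) * keepRate))
}

func main() {
	// 329 baseline findings, 10 sampled, 8 of them placeholders.
	fmt.Println(estimateRemaining(329, 10, 8))
}
```

The point estimate lands inside the 60-100 range quoted above; the width of that range reflects how small the sample is.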

Summary by CodeRabbit

  • New Features
    • Added an org-wide allowlist for placeholder values so common documentation/test placeholders (API key templates, angle-bracket tokens, example UUIDs, generic keywords like REPLACE_ME/EXAMPLE/PLACEHOLDER, and obvious dev/test scaffold token phrases) are recognized and excluded from alerts or scans.


The curl-auth-header rule (329 hits in the May 2026 baseline) is
dominated by docs examples using stable placeholder values:
YOUR_API_KEY (across city-project API-spec templates), angle-bracket
placeholders <your-token>, famous example UUIDs (550e8400-e29b-...),
and dev scaffold tokens. Sampled noise rate: ~80%.

Allowlists cover only patterns that cannot accidentally match a real
credential:

- YOUR_API_KEY / YOUR-API-KEY / YOUR_TOKEN family
- Angle-bracket placeholders <foo>, <エンドポイントURL>
- Famous example UUIDs (550e8400-e29b-..., 00000000-..., deadbeef-...)
- REPLACE_ME / CHANGEME / PLACEHOLDER / EXAMPLE / FAKE / SAMPLE
- Explicit dev tokens (dev-local-token-please-change, please-change-X)

Real-looking values continue to trip the rule. Detection on the
Devise/Rails token format (eyJhY...) and high-entropy random keys is
unchanged.

coderabbitai Bot commented May 19, 2026


No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: 67d9b4d9-2721-49cc-9e5e-bc174858411f

📥 Commits

Reviewing files that changed from the base of the PR and between eff8cfe and e558d9f.

📒 Files selected for processing (1)
  • betterleaks/default.toml

Walkthrough

Added org-wide allowlist configuration to betterleaks/default.toml with regex patterns suppressing detections of known placeholder strings used in documentation and test code, including YOUR_API_KEY templates, angle-bracket placeholders, example UUIDs, generic placeholder keywords, and dev/test scaffold token phrases.

Changes

Allowlist configuration for placeholder strings

| Layer / File(s) | Summary |
|---|---|
| Allowlist rules and documentation<br>`betterleaks/default.toml` | Documentation comments clarify allowlist matching scope and constraints. Multiple `[[allowlists]]` entries define regex patterns to suppress detections of placeholder strings: YOUR_API_KEY/YOUR_TOKEN templates, angle-bracket placeholders, well-known example UUIDs, generic placeholder words (REPLACE_ME, EXAMPLE, PLACEHOLDER and variants), and dev/test scaffold phrases (e.g., dev-local-token-please-change, test/sample/dummy/stub-style wording). |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

  • geolonia/.github#26: Introduced the org-wide betterleaks default config file and its extend.disabledRules structure, which this PR extends with allowlist rules.
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately summarizes the main change: adding placeholder allowlists to the betterleaks organization configuration file. |
| Description check | ✅ Passed | The description provides a comprehensive summary with patterns added, analysis data, and test plan, though the Checklist section items are marked incomplete rather than confirmed. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 3

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@betterleaks/default.toml`:
- Line 77: The allowlist entry containing the regex pattern
`(?i)\bplease[-_]?(change|update|replace|rotate)\b` is too broad and may
suppress real credentials; either remove this pattern entirely or tighten it to
require credential context (e.g., require "token|key|secret|password" or
explicit words like "token" alongside the please-change phrase) so it only
matches placeholder scaffold text; update the pattern in the TOML (the literal
pattern string shown) to the constrained form or delete the line and keep the
surrounding credential-specific patterns.
- Line 70: Update the regex entry in betterleaks/default.toml so the SAMPLE
group mirrors EXAMPLE and FAKE by making the suffix group optional: change the
part matching "SAMPLE[-_]?(KEY|TOKEN|SECRET)" to include a trailing ? after the
suffix group so it will match standalone "SAMPLE" as well as
"SAMPLE_KEY"/"SAMPLE_TOKEN"/"SAMPLE_SECRET"; locate the regex string containing
'''(?i)\b(REPLACE[-_]?ME|CHANGE[-_]?ME|PLACEHOLDER|EXAMPLE[-_]?(KEY|TOKEN|SECRET)?|FAKE[-_]?(KEY|TOKEN|SECRET)?|SAMPLE[-_]?(KEY|TOKEN|SECRET))\b'''
and add the missing ? for the SAMPLE suffix group.
- Line 56: The regex in the angle-bracket allowlist uses Unicode property
escapes (\p{Han}\p{Hiragana}\p{Katakana}) which Go's regexp engine doesn't
support; replace those property classes in the pattern string
'''<[A-Za-z\p{Han}\p{Hiragana}\p{Katakana}][^>]{1,60}>''' with explicit Unicode
escape ranges for Hiragana, Katakana and Han (e.g., use \u3040-\u309F,
\u30A0-\u30FF, \u4E00-\u9FFF) so the pattern compiles and matches the intended
characters in Betterleaks' Go regex engine.
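
The first two findings can be reproduced with a short Go sketch. The value in `suspicious` is a made-up string standing in for a secret that happens to contain the phrase, and the SAMPLE patterns are reduced to just that alternative for readability, so this is an illustration of the regex behavior rather than of the full config entries.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// Standalone phrase pattern flagged as too broad.
	pleaseBroad = regexp.MustCompile(`(?i)\bplease[-_]?(change|update|replace|rotate)\b`)
	// SAMPLE alternative without and with the trailing "?" on the suffix group.
	sampleStrict   = regexp.MustCompile(`(?i)\bSAMPLE[-_]?(KEY|TOKEN|SECRET)\b`)
	sampleOptional = regexp.MustCompile(`(?i)\bSAMPLE[-_]?(KEY|TOKEN|SECRET)?\b`)
)

func main() {
	// Hypothetical value: random-looking material prefixed by the phrase.
	suspicious := "please-rotate-9fK2xQ7mVb3TzR8w"
	fmt.Println("broad pattern matches suspicious value:", pleaseBroad.MatchString(suspicious))

	fmt.Println("strict SAMPLE matches bare SAMPLE:", sampleStrict.MatchString("SAMPLE"))
	fmt.Println("optional SAMPLE matches bare SAMPLE:", sampleOptional.MatchString("SAMPLE"))
	fmt.Println("optional SAMPLE matches SAMPLE_KEY:", sampleOptional.MatchString("SAMPLE_KEY"))
}
```

Because the phrase pattern needs no credential word around it, any value carrying `please-rotate` would be suppressed, which is the risk the finding describes.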

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: ASSERTIVE

Plan: Pro

Run ID: e5fc6c10-80e3-4c61-b4e2-221b78438b6d

📥 Commits

Reviewing files that changed from the base of the PR and between 60fa5aa and eff8cfe.

📒 Files selected for processing (1)
  • betterleaks/default.toml

Comment thread betterleaks/default.toml Outdated
Three fixes from #31 review:

1. Replace `\p{Han}\p{Hiragana}\p{Katakana}` with explicit `\x{...}`
   Unicode escape ranges. While Go's RE2 does accept `\p{...}` for
   Unicode scripts, explicit ranges are clearer for readers and
   match betterleaks' lower-bound regex feature set.

2. Add the missing `?` after `SAMPLE[-_]?(KEY|TOKEN|SECRET)` so the
   group is consistent with EXAMPLE and FAKE (matches bare "SAMPLE").

3. Drop the standalone `please[-_]?(change|update|replace|rotate)`
   pattern from the dev-scaffold allowlist. It lacked credential
   context and could plausibly suppress real secrets whose value
   contained the phrase. The remaining patterns on lines 76 and 78
   already cover `dev-local-token-please-change` and the
   `(test|sample|dummy|dev|stub)[-_]?(token|key|secret|password)`
   shapes.
@dkastl dkastl merged commit 4281434 into main May 19, 2026
1 check passed
@dkastl dkastl deleted the feat/betterleaks-placeholder-allowlist branch May 19, 2026 05:17