Security: Dragoon0x/terse

Security Policy

Scope

terse is a text-transformation tool. its inputs are strings, its outputs are strings. the security surface is small but real.

What we consider a security issue

  • input that crashes the engine. any deterministic string that causes compress() to raise an uncaught exception, hang, or consume unbounded memory.
  • preservation bypass. any string where code blocks, inline code, URLs, file paths, or error messages in the input are mutated (not preserved verbatim) in the output.
  • marker leak. any case where the internal preservation markers appear in the final output.
  • catastrophic regex backtracking. any input that makes the engine take longer than ~2 seconds on modern hardware for reasonable-sized text (under 500KB).
  • supply-chain. the repo has zero runtime dependencies. any PR that adds one should be reviewed with care.
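before filing a report, it can help to check a candidate input against the first three criteria above. the sketch below is one rough way to do that, not the project's own test harness. the function name check_candidate, the simplified patterns, and the "control character that is not whitespace" heuristic for markers are all assumptions made for illustration; the real engine may define code blocks, URLs, and markers differently. file paths and error messages are left out because no simple pattern identifies them reliably.

```python
import re
import unicodedata
from typing import Callable, List

# Hypothetical, simplified patterns for spans that should survive verbatim.
# The real engine's definitions may be broader or stricter.
PRESERVED_PATTERNS = [
    re.compile(r"```.*?```", re.DOTALL),  # fenced code blocks
    re.compile(r"`[^`\n]+`"),             # inline code
    re.compile(r"https?://[^\s)>\]]+"),   # URLs
]


def is_marker_like(ch: str) -> bool:
    """True for control characters that are not ordinary whitespace."""
    return unicodedata.category(ch) == "Cc" and not ch.isspace()


def check_candidate(compress_fn: Callable[[str], str], text: str) -> List[str]:
    """Run compress_fn on text and list anything that looks reportable."""
    try:
        output = compress_fn(text)
    except Exception as exc:
        # Note: ValueError on inputs over the documented size cap is expected
        # behaviour, so keep test inputs under that cap.
        return [f"uncaught exception: {exc!r}"]

    findings: List[str] = []

    # Preservation check: every protected span must appear unchanged.
    for pattern in PRESERVED_PATTERNS:
        for match in pattern.finditer(text):
            if match.group(0) not in output:
                findings.append(f"preservation bypass: {match.group(0)!r}")

    # Marker-leak heuristic: control characters present only in the output.
    leaked = {ch for ch in output if is_marker_like(ch)} - set(text)
    if leaked:
        findings.append(f"possible marker leak: {sorted(leaked)!r}")

    return findings
```

pass the real compress() (wrapped in a lambda if it needs mode and level arguments) as compress_fn. an empty list means none of these three checks fired; it does not prove the input is safe. hangs and slow inputs need a separate timing check.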

What we don't consider a security issue

  • compression that's suboptimal (not bugs, just tuning).
  • compression that's more aggressive than desired (that's the mode/level configuration).
  • Claude producing a different output than the deterministic engine (the engine is a conservative reference, not a Claude simulator).

Reporting

please file a GitHub issue with the security label, or email the maintainer (see the repo profile). include:

  • the exact input that triggered the issue
  • the mode and level
  • the output (if any)
  • the expected behavior

for serious issues (crashes on adversarial input, preservation bypass), please allow 7 days before public disclosure so a fix can ship.

Engineered protections

the engine includes these specific protections, verified by the test suite:

  • input size cap: compress() raises ValueError on inputs over 500,000 characters.
  • regex-complete-in-bounded-time — every pattern is a constant from the source; none is built from user input.
  • deterministic markers with fallback — preservation markers use non-whitespace control characters; if the input contains the primary marker pattern, the engine escalates to a fallback.
  • no eval, no exec, no shell-out — the engine is pure string and regex operations.
  • bounded regex patterns — no unbounded .* across unbounded text; all patterns have natural boundaries.
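the size cap and the marker fallback can be pictured with a small sketch. this is an illustration under stated assumptions, not the engine's source: the names protect, restore, and pick_marker, the three candidate marker characters, and the simplified inline-code pattern are all hypothetical. only the 500,000-character limit and the ValueError come from the list above.

```python
import re
from typing import List, Tuple

MAX_INPUT_CHARS = 500_000  # cap stated in this policy

# Hypothetical marker candidates: non-whitespace control characters,
# tried in order until one is absent from the input.
MARKER_CANDIDATES = ("\x01", "\x02", "\x03")

# Simplified, bounded pattern: inline code never spans a newline.
INLINE_CODE = re.compile(r"`[^`\n]+`")


def pick_marker(text: str) -> str:
    """Return the first candidate marker that does not occur in text."""
    for candidate in MARKER_CANDIDATES:
        if candidate not in text:
            return candidate
    raise ValueError("input contains every candidate marker")


def protect(text: str) -> Tuple[str, List[str], str]:
    """Swap inline code for numbered placeholders before rewriting prose."""
    if len(text) > MAX_INPUT_CHARS:
        raise ValueError(f"input exceeds {MAX_INPUT_CHARS} characters")
    marker = pick_marker(text)
    saved: List[str] = []

    def stash(match: "re.Match[str]") -> str:
        saved.append(match.group(0))
        return f"{marker}{len(saved) - 1}{marker}"

    return INLINE_CODE.sub(stash, text), saved, marker


def restore(text: str, saved: List[str], marker: str) -> str:
    """Put the saved spans back and refuse to return a leaked marker."""
    placeholder = re.compile(re.escape(marker) + r"(\d+)" + re.escape(marker))
    restored = placeholder.sub(lambda m: saved[int(m.group(1))], text)
    if marker in restored:
        raise RuntimeError("marker leak: placeholder survived restoration")
    return restored
```

in this sketch the placeholder pattern is assembled from a constant candidate marker, never from the input itself, which keeps it in line with the "no pattern built from user input" property. choosing the marker only after scanning the input is what makes the fallback safe: a string that already contains \x01 simply gets \x02 instead.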

the regex safety property is tested in tests/test_compress.py::TestSecurity::test_no_regex_catastrophic_backtracking, which asserts that the engine completes in under 2 seconds on adversarial inputs (10,000 repeated characters, 500 code-fence starts, etc.).
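a timing check of that kind might look roughly like the sketch below. it is not a copy of the test in the repo. the compress() defined here is a stand-in so the snippet runs on its own, and the payload shapes are guesses based on the two examples named above; swap in the real compress() and the real adversarial inputs when reproducing an issue.

```python
import time
from typing import Callable, Dict


def compress(text: str) -> str:
    """Stand-in so the sketch is runnable; replace with the real compress()."""
    return " ".join(text.split())


# Payload shapes assumed from the examples named in this policy.
ADVERSARIAL_INPUTS = {
    "repeated_chars": "a" * 10_000,
    "code_fence_starts": "```\n" * 500,
}

TIME_LIMIT_SECONDS = 2.0  # threshold stated in this policy


def run_timing_checks(
    compress_fn: Callable[[str], str] = compress,
) -> Dict[str, float]:
    """Time compress_fn on each payload and fail if any exceeds the limit."""
    timings: Dict[str, float] = {}
    for name, payload in ADVERSARIAL_INPUTS.items():
        start = time.perf_counter()
        compress_fn(payload)
        elapsed = time.perf_counter() - start
        assert elapsed < TIME_LIMIT_SECONDS, f"{name} took {elapsed:.2f}s"
        timings[name] = elapsed
    return timings
```

a check like this only measures a call that eventually returns. an input that makes the engine hang outright needs an external timeout, for example running the call in a subprocess that can be killed.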

What's out of scope for terse

  • claims about LLM output safety — terse rewrites prose; it does not influence the content of Claude's answers.
  • compliance claims (SOC 2, HIPAA, etc.) — terse is an MIT-licensed experimental tool. use in regulated environments at your own discretion and responsibility.
  • protecting against prompt injection — terse is not a safety tool. the references/anti-patterns.md document lists contexts where compression should not be applied, but that's a style guide, not a security guarantee.
