ReDoS detection has false positives for patterns with literal boundaries

## Description

The ReDoS detection plugin (`regex_redos`) produces false positives for patterns that have nested quantifiers but are actually safe due to literal boundary characters.

## Example False Positive

Pattern: `^/([_0-9a-zA-Z-]+/)?core/cache/busting/1/core/(.*)`

This is flagged as vulnerable with "Nested quantifier in group - causes exponential O(2^n) backtracking", but it's actually **safe**.

## Why It's Safe

The pattern `([_0-9a-zA-Z-]+/)?` contains:
- Inner quantifier: `[_0-9a-zA-Z-]+` (matches word chars and hyphens)
- Literal boundary: `/` (NOT in the character set)

The `/` creates a clear boundary that prevents ambiguous matching. For input like `abc/`, there's only ONE way to match it: `abc` + `/`. No backtracking explosion is possible.

## Comparison with Truly Vulnerable Pattern

```python
import re
import time

# SAFE pattern (has boundary)
safe = re.compile(r'^/([_0-9a-zA-Z-]+/)?core/')

# VULNERABLE pattern (no boundary)  
vuln = re.compile(r'^/(a+)+$')

# Test with increasing input lengths
for n in [20, 25, 28]:
    test = 'a' * n + 'X'
    
    start = time.perf_counter()
    safe.match('/' + test)
    safe_time = (time.perf_counter() - start) * 1000
    
    start = time.perf_counter()
    vuln.match('/' + test)
    vuln_time = (time.perf_counter() - start) * 1000
    
    print(f"n={n}: safe={safe_time:.2f}ms, vulnerable={vuln_time:.2f}ms")
```

Output:
```
n=20: safe=0.01ms, vulnerable=38.0ms
n=25: safe=0.01ms, vulnerable=1244.3ms
n=28: safe=0.01ms, vulnerable=9738.5ms
```

The safe pattern runs in constant time, while the vulnerable pattern shows exponential growth.

## Root Cause

In `_check_nested_quantifiers`, the code calls `_contains_quantifier(subpattern)` which returns `True` for any nested quantifier without checking if there's a safe literal boundary.

## Proposed Fix

When detecting nested quantifiers, check if the inner quantifier is followed by a literal character that is NOT in the quantified character set. If so, it's safe:

```python
def _has_safe_boundary(self, parsed):
    """Check if quantifier is followed by a literal not in its charset."""
    if len(parsed) < 2:
        return False
    
    first_op, first_av = parsed[0]
    second_op, second_av = parsed[1]
    
    if first_op not in QUANTIFIERS or second_op != LITERAL:
        return False
    
    boundary_char = second_av
    _, _, quant_content = first_av
    char_set = self._get_quantified_chars(quant_content)
    
    return char_set is not None and boundary_char not in char_set
```

## Impact

This false positive affects real-world configs. In one user's nginx config, 158 out of 173 HIGH severity issues were false positives from this bug.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

ReDoS detection has false positives for patterns with literal boundaries #99

Description

Example False Positive

Why It's Safe

Comparison with Truly Vulnerable Pattern

Root Cause

Proposed Fix

Impact

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Uh oh!

ReDoS detection has false positives for patterns with literal boundaries #99

Description

Description

Example False Positive

Why It's Safe

Comparison with Truly Vulnerable Pattern

Root Cause

Proposed Fix

Impact

Metadata

Metadata

Assignees

Labels

Projects

Milestone

Relationships

Development

Issue actions