Using a newline character ("\n") in a pattern with the PCRE2 engine (-P) outside multiline mode fails silently.
If this is a bug, what are the steps to reproduce the behavior?
Create the following corpus.txt:
testing
RIP
GREP
again
First, try the search with Rust's engine:
rg "RIP\nGREP" corpus.txt
Output:
the literal '"\n"' is not allowed in a regex
Consider enabling multiline mode with the --multiline flag (or -U for short).
When multiline mode is enabled, new line characters can be matched.
As expected (from the documentation), we see an error.
Now try searching using the PCRE2 engine:
rg -P "RIP\nGREP" corpus.txt
Output: (none)
The pattern uses a newline character without multiline mode, yet no error is reported. The search fails silently.
N.B. Searching with the PCRE2 engine in multiline mode (-U) works as expected:
rg -PU "RIP\nGREP" corpus.txt
Output:
2:RIP
3:GREP
If this is a bug, what is the actual behavior?
See output above
If this is a bug, what is the expected behavior?
I would expect ripgrep to throw an error, warning the user that newlines cannot be matched outside multiline mode (-U) when using PCRE2. This would yield the same behaviour as when Rust's engine is used (#1055)
This can't be fixed because PCRE doesn't expose anything to parse its syntax to detect the newline characters in the pattern. It might be possible to detect some simple cases without parsing the regex, but I don't know how far down that road I want to go.
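To illustrate what "detecting some simple cases without parsing the regex" could look like, here is a minimal sketch of a user-side shell wrapper. It is not ripgrep's code: `has_newline_escape` and `rg_pcre2_checked` are hypothetical names, and the check is a plain substring test, so it gives a false positive on `\\n` (an escaped backslash followed by `n`) and misses other spellings of a newline such as `\x0A`. Those gaps are the reason a full fix would need a real parser.

```shell
#!/bin/sh
# Hypothetical helper (not part of ripgrep): succeeds if the pattern contains
# the two-character escape \n or an actual newline character.
# Naive on purpose: it does not parse the regex.
has_newline_escape() {
    nl='
'
    case "$1" in
        *'\n'*) return 0 ;;    # backslash followed by the letter n
        *"$nl"*) return 0 ;;   # literal newline character
        *) return 1 ;;
    esac
}

# Hypothetical wrapper: refuse to run a PCRE2 search when the pattern looks
# like it needs multiline mode. The first argument must be the pattern.
# Exit status 2 is chosen here to mirror ripgrep's own error status.
rg_pcre2_checked() {
    if has_newline_escape "$1"; then
        echo "error: pattern appears to contain a newline; consider -U" >&2
        return 2
    fi
    rg -P "$@"
}
```

With this in place, `rg_pcre2_checked 'RIP\nGREP' corpus.txt` would print the error instead of silently returning no matches, while patterns without a newline are passed through to `rg -P` unchanged.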
That's a shame. I thought the reason might be something like that.
Maybe it would be worth tweaking the documentation to make it clear that an error won't be thrown when using PCRE2?
Current text:
For example, when multiline mode is not enabled (the default), then the regex \p{any} will match any Unicode codepoint other than \n. Similarly, the regex \n is explicitly forbidden, and if you try to use it, ripgrep will return an error.
What version of ripgrep are you using?
ripgrep 11.0.1 (rev e7829c0)
-SIMD -AVX (compiled)
+SIMD -AVX (runtime)
How did you install ripgrep?
Compiled from source
What operating system are you using ripgrep on?
Arch Linux