Description
This issue was already mentioned in #90, but I think the problem deserves a specific issue.
Currently, for matching suspicious keywords, there is no attempt to distinguish a regular line of code from a comment:
e.g. (line 2201 in 168a92d):
match = re.search(r'(?i)\b' + re.escape(keyword) + r'\b', vba_code)
I think there are a few ways we could avoid false positives related to comments. One of them would be to edit the pattern to look like this:
r'(?i)^(?:[^']|\b).*\b' + re.escape(keyword) + r'\b'
The key here is that ^(?:[^']|\b).* will not match if the line starts with an apostrophe ('), assuming ^ anchors at the start of each line (multiline flag). The |\b alternative is necessary; otherwise the pattern would not match when the keyword is at the start of the line:
https://regex101.com/r/CUI2V3/1
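As a rough sketch of how the proposed pattern behaves on the sample from this issue: the helper name `keyword_in_code` is hypothetical (it is not part of olevba), and the `m` flag is my assumption so that `^` anchors at the start of every line of the module.

```python
import re

def keyword_in_code(keyword, vba_code):
    """Hypothetical helper: True if keyword matches using the proposed pattern."""
    # (?i) = case-insensitive, (?m) = ^ matches at the start of each line (assumed)
    pattern = r"(?im)^(?:[^']|\b).*\b" + re.escape(keyword) + r"\b"
    return re.search(pattern, vba_code) is not None

sample = "Sub test()\n'I love to create\nMsgBox \"Hello world\"\nEnd Sub"

print(keyword_in_code("create", sample))               # False: comment line is skipped
print(keyword_in_code("create", "x = 1\ncreate foo"))  # True: keyword at start of a line
```

Note that this only skips comments whose apostrophe is the very first character of the line. An indented comment such as `  'I love to create` would still match, because `[^']` consumes the leading space.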
Alternatively, another option would be to remove all comment lines from vba_code before running the regex.
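A minimal sketch of that second option, under some assumptions: `strip_comment_lines` and `find_keyword` are hypothetical names, not existing olevba functions, and only whole-line comments (starting with `'` or `Rem`, optionally indented) are removed. Trailing comments after code are left alone, since an apostrophe can also appear inside a string literal.

```python
import re

def strip_comment_lines(vba_code):
    """Hypothetical helper: drop lines that consist only of a comment."""
    kept = []
    for line in vba_code.splitlines():
        stripped = line.lstrip()
        # Whole-line comment: leading apostrophe, or the Rem keyword on its own
        if stripped.startswith("'") or re.match(r"(?i)rem(\s|$)", stripped):
            continue
        kept.append(line)
    return "\n".join(kept)

def find_keyword(keyword, vba_code):
    """Run the existing keyword regex on the code with comment lines removed."""
    code_only = strip_comment_lines(vba_code)
    return re.search(r"(?i)\b" + re.escape(keyword) + r"\b", code_only)

sample = "Sub test()\n'I love to create\nMsgBox \"Hello world\"\nEnd Sub"
print(find_keyword("create", sample))  # None: the only occurrence was in a comment
```

This keeps the current regex untouched and handles indented comments, at the cost of an extra pass over the code.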
Affected tool:
olevba and mraptor (maybe others as well that I haven't used)
Describe the bug
Suspicious keywords (e.g. "create") in comments cause false positives.
File/Malware sample to reproduce the bug
Sub test()
'I love to create
MsgBox "Hello world"
End Sub
How To Reproduce the bug
Run olevba on the sample.
Expected behavior
No threat detected