
Comments are causing false positives #817

@DecimalTurn


This issue was already mentioned in #90, but I think the problem deserves a specific issue.

Currently, when matching suspicious keywords, no attempt is made to distinguish a regular line of code from a comment, e.g.:

match = re.search(r'(?i)\b' + re.escape(keyword) + r'\b', vba_code)

I think there are a few ways we could avoid false positives related to comments. One of them would be to edit the pattern to look like this:

r'(?i)^(?:[^']|\b).*\b' + re.escape(keyword) + r'\b'

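The key here is that ^(?:[^']|\b).* will not match if the line starts with an apostrophe ('). This assumes ^ anchors at the start of each line, which in Python requires the re.MULTILINE flag. The |\b alternative is necessary; otherwise the pattern would not match when the keyword is at the start of the line: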
https://regex101.com/r/CUI2V3/1
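As a rough sketch of the same idea, here is a variant that uses a negative lookahead instead of the [^'] alternative. It is not the pattern from the regex101 link: the pattern above still matches comment lines that are indented (a leading space satisfies [^']), which is the case in the sample below, so this variant skips any line whose first non-blank character is an apostrophe. The function name is hypothetical and not part of oletools, and trailing comments after code on the same line are still scanned.

```python
import re

def find_keyword_outside_comment_lines(keyword, vba_code):
    """Return the first match of `keyword` on a line that is not a
    full-line apostrophe comment, or None if there is no such match."""
    pattern = re.compile(
        # (?![ \t]*') rejects lines whose first non-blank character is an apostrophe.
        r"^(?![ \t]*').*\b" + re.escape(keyword) + r"\b",
        # MULTILINE makes ^ anchor at the start of every line, not just the string.
        re.IGNORECASE | re.MULTILINE,
    )
    return pattern.search(vba_code)

# Sample from this issue: the keyword only appears in an indented comment.
sample = "Sub test()\n    'I love to create\n    MsgBox \"Hello world\"\nEnd Sub"
print(find_keyword_outside_comment_lines("create", sample))

# A line of real code that uses the keyword is still reported.
print(find_keyword_outside_comment_lines("create", 'obj.Create "calc.exe"'))
```

The lookahead keeps the keyword part of the pattern unchanged, so the existing word-boundary behaviour is preserved.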

Alternatively, another option would be to remove all comment lines from vba_code before running the regex.
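A minimal sketch of that second option, assuming only full-line apostrophe comments need to be dropped (Rem comments and trailing comments after code are out of scope here). Both helper names are hypothetical and not existing oletools functions; the keyword regex is the current one quoted above.

```python
import re

def strip_comment_lines(vba_code):
    """Drop every line whose first non-blank character is an apostrophe."""
    kept = [
        line for line in vba_code.splitlines()
        if not line.lstrip().startswith("'")
    ]
    return "\n".join(kept)

def contains_keyword(keyword, vba_code):
    """Run the existing keyword regex on the code with comment lines removed."""
    cleaned = strip_comment_lines(vba_code)
    pattern = r'(?i)\b' + re.escape(keyword) + r'\b'
    return re.search(pattern, cleaned) is not None

# Sample from this issue: the keyword only appears in a comment line.
sample = "Sub test()\n    'I love to create\n    MsgBox \"Hello world\"\nEnd Sub"
print(contains_keyword("create", sample))
print(contains_keyword("create", 'obj.Create "calc.exe"'))
```

This leaves the existing pattern untouched, at the cost of building a filtered copy of the code before scanning.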


Affected tool:
olevba and mraptor (maybe others as well that I haven't used)

Describe the bug
Suspicious keywords (e.g. "create") that appear in comments are causing false positives.

File/Malware sample to reproduce the bug

Sub test()
    'I love to create
    MsgBox "Hello world"
End Sub

How To Reproduce the bug
Run olevba on the sample.

Expected behavior
No threat detected
