Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

C#: Restrict dataflow node creation to source and source-referenced entities #17482

Conversation

smowton
Copy link
Contributor

@smowton smowton commented Sep 16, 2024

This addresses a problem observed when libraries define a very large number of types with large class hierarchies, for example as part of generated code. Since user code is very unlikely to refer to all or even most of them, by excluding such code from dataflow node generation and therefore from the C# dataflow hooks' idea of a "relevant" type, we can make many of the type predicates defined in DataFlowPrivate.qll very much cheaper.

#17483 is a variation on this PR that additionally considers virtual dispatch targets and accessed (but not called) callables as being used in source, and therefore meriting dataflow node creation.

Comment on lines +1134 to +1135
TExplicitParameterNode(Parameter p, DataFlowCallable c) {
p = c.asCallable(_).(CallableUsedInSource).getAParameter()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Java simply uses the constraint exists(p.getCallable().getBody()). Would that be enough in this case?

Suggested change
TExplicitParameterNode(Parameter p, DataFlowCallable c) {
p = c.asCallable(_).(CallableUsedInSource).getAParameter()
TExplicitParameterNode(Parameter p, DataFlowCallable c) {
exists(Callable c0 | c0 = c.asCallable(_) and p = c0.getAParameter() and exists(c0.getBody()))

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've refined this further at #17483 -- my observations so far:

There are methods with no body but which are from source where we probably ought to include parameter nodes, e.g. abstract and interface methods

There are also methods that have a body but for which fromSource doesn't hold -- synthetic methods, so far as I can tell, such as default constructors.

Therefore I've included both criteria for now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are methods with no body but which are from source where we probably ought to include parameter nodes, e.g. abstract and interface methods

Why? What use are they? Those parameter nodes cannot flow to anything.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They indeed cannot; my concern was whether interface or abstract method parameter nodes might be relevant to models, either MaD-written or implemented in QL? The short answer is I don't know, but the cost of keeping them is probably small since they're bounded by the size of the source code, and giving nodes as-is to everything present in source is the least surprising route for any third-party code we don't get to exercise via DCA and QA.

@smowton
Copy link
Contributor Author

smowton commented Sep 17, 2024

I'm leaning towards variant #17483 since it plays things safer (creating more nodes than this variant) but still seems to have significantly beneficial performance consequences. I've now started QA, linked from that PR.

@smowton
Copy link
Contributor Author

smowton commented Sep 18, 2024

Variant #17483 has now passed DCA and QA as noted on that PR -- I therefore propose we should merge that variant and invite reviews there.

@smowton smowton closed this Sep 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants