-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Java Tracking From Exception Construction to Catch Clause #17632
Comments
I have been working on this problem more and I believe I am getting closer with some of the additions I have made. Here is my latest query.
I added to the
With this, I am starting to see results that I am looking for. For example,
However, I am noticing that there are some results that are detected that don't actually have sensitive information such as this one,
In this, You can see that some have the Thank you |
The following code will flag up many more variables as sink than you need: // Ensure it's the caught exception variable
mc.getQualifier().(VarAccess).getVariable() = sink.asExpr().(VarAccess).getVariable() It says that the qualifier of a call to Exception e = new Exception(someTaintedValue);
e.getMessage(); // a sink, because `e` is accessed here
e.toString(); // another sink, because `e` is accessed again
throw e; // yet another sink; I think you could model what you want as additional taint steps:
|
Hello @aibaars Thank you for your response. I updated my query based on how I interpreted your suggestions. This is what I have so far
The good news is that with this, all of the false positives have gone away. However, I am now trying to figure out how to get the full path for my results. Right, now it's just giving me the stopping point. As seen by the From reading about this, it seems like the flow is not connecting somewhere. I am suspecting that the issue is in the I tried to include more steps to it with this, but I am getting the same result.
I am not sure if this is too much, but I wanted to make sure I wasn't missing a step in the flow. |
Your String message = e.getMessage();
sendError(message); Something like the following predicate isSink(DataFlow::Node sink) {
exists(MethodCall mcSink |
// The sink methods (println, sendError, write)
mcSink.getMethod().getName() in ["println", "sendError", "write"] and
mcSink.getAnArgument() = sink.asExpr()
)
} The For additional flow steps it is usually better to keep each step "small". It's the task of the dataflow analyzer to combine all the steps into a complete flow path. It can do better job when has many small steps that is can combine than an overly specific big step. You did this right for |
Hi @aibaars, Thank you again for your help. I appreciate it. I have gone back and implemented the changes that you suggested. However, this leads to a lot of false positives. For example, this code has a detection
This doesn't have anything to do with what is defined in the Here is my current query
|
Ah indeed, now the query should flag up any sensitive data that is exposed to a sink. That is of course a useful analysis, but you want to restrict the results to exposure via exception messages.
The dataflow analyzer has a lot of default steps, for example variable writes to variable reads, method call arguments to method parameters, etc. It uses the default steps in combination with any additional steps you provide. If it can find a path without using the additional steps, that is fine too. Instead of any path from a sensitive data source into print/write sink you want the dataflow analyzer to restrict those paths to the ones that flow via an exception. I think you can achieve that using flow labels: https://codeql.github.com/docs/codeql-language-guides/using-flow-labels-for-precise-data-flow-analysis/ . The idea is to start at the source with a default label, but make the |
Hi @aibaars, That might help. However, is |
The shared dataflow library (which JS/TS is currently not using) uses the concept of flow states instead. |
Hi @intrigus-lgtm , Thank you for pointing me towards flow states. This is a bit out of my depth so I am struggling to understand exactly how to convert my query to one that uses this functionally. Looking at the solution that you sent it looks like it's used to combine multiple configurations into one query, where each config is a state. With that said, the next question would be how many states should I use? I found this discussion that uses 3 states. In my current query I have
With this mapped out I would think these are my states State 1. Sensitive variable -> Runtime Exception Does this seem like too many states? I am not sure if the default dataflow will take care of some of these or not? Let's assume that these are the correct states, I am still struggling to understand how to implement them as I have seen example that are more theoretical. I have this example as well but it's going over my head. I would appreciate any help as this is new to me. Thank you |
Here is my current attempt at this. Since I believe getMessage() -> println() will be done automatically I removed that one to include the first 3 states. Using this query I am not getting any results when I run the full thing. However, I am getting the expected results when doing the quick eval for the isSource and isSink predicates.
Do you see any obvious issues with how I have implemented this? |
This looks sensible. As a matter of style I'd recommend naming your states for their meaning, not just numbering them. Note by the way that while this is fine for prototyping, the old DataFlow::Configuration and TaintTracking::Configuration are deprecated -- the new ConfigSig style (https://github.blog/changelog/2023-08-14-new-dataflow-api-for-writing-custom-codeql-queries/) is now preferred. For stateful queries you'd need to use For an example, check out |
This issue is stale because it has been open 14 days with no activity. Comment or remove the |
This issue was closed because it has been inactive for 7 days. |
Hello,
I am trying to detect a scenario where sensitive information is exposed via an error message. For example,
Here we have
apiKey
as the sensitive information. It is then used in the constructor of the Exception. Which is then caught and used ine.getMessage()
. I am finding it difficult to track this flow all the way to the sinksendError
.The challenge I'm facing is that CodeQL doesn't seem to automatically capture the flow from the thrown exception into the catch block, where it is accessed through
e.getMessage()
. I'm manually modeling this with additional flow steps but am finding it difficult to track the full flow from source to sink.Here is my current query for reference
I have run quick evals for all of the steps.
For the source I get
For the additional flow step, I get
However, I don't get anything for the additional flow step.
If I simplify it to just be
I get
So, I can track it going into the constructor, but not the constructor going to the catch clause.
Any advice on how to improve the tracking of the flow between the throw and catch block, or any suggestions on alternative approaches, would be greatly appreciated!
The text was updated successfully, but these errors were encountered: