Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Java Tracking From Exception Construction to Catch Clause #17632

Closed
KylerKatzUH opened this issue Sep 30, 2024 · 13 comments
Closed

Java Tracking From Exception Construction to Catch Clause #17632

KylerKatzUH opened this issue Sep 30, 2024 · 13 comments
Labels
awaiting-response The CodeQL team is awaiting further input or clarification from the original reporter of this issue. question Further information is requested Stale

Comments

@KylerKatzUH
Copy link

Hello,

I am trying to detect a scenario where sensitive information is exposed via an error message. For example,

    protected void doGet(HttpServletRequest request, HttpServletResponse response) throws ServletException, IOException {
            if (apiKey == "apiKey" || apiKey.isEmpty()) {
                throw new Exception("Invalid API key " + apiKey + " provided.");
            }
        }
        catch(Exception e){
            response.sendError(HttpServletResponse.SC_INTERNAL_SERVER_ERROR,  e.getMessage());
        }
    }

Here we have apiKey as the sensitive information. It is then used in the constructor of the Exception. Which is then caught and used in e.getMessage(). I am finding it difficult to track this flow all the way to the sink sendError.

The challenge I'm facing is that CodeQL doesn't seem to automatically capture the flow from the thrown exception into the catch block, where it is accessed through e.getMessage(). I'm manually modeling this with additional flow steps but am finding it difficult to track the full flow from source to sink.

Here is my current query for reference

/**
 * @name CWE-537: Java Runtime Error Message Containing Sensitive Information
 * @description Detects sensitive information (e.g., apiKey) being added to an exception message and then exposed via getMessage in HTTP responses.
 * @kind path-problem
 * @problem.severity error
 * @id java/runtime-error-message-exposure/537
 * @tags security
 *       external/cwe/cwe-537
 * @cwe CWE-537
 */

 import java
 import semmle.code.java.dataflow.TaintTracking
 import semmle.code.java.dataflow.DataFlow
 
 module Flow = TaintTracking::Global<SensitiveInfoInExceptionConfig>;
 
 import Flow::PathGraph
 
 /**
  * Defines the taint configuration for tracking sensitive data in exception messages.
  */
 module SensitiveInfoInExceptionConfig implements DataFlow::ConfigSig {
 
   // Sensitive variables (e.g., apiKey)
   predicate isSource(DataFlow::Node source) {
     exists(VarAccess var |
       var.getVariable().getName() = "apiKey" and
       source.asExpr() = var
     )
   }
 
   // Combined logic for detecting sensitive data flowing into exception constructors and linking throw to catch
   predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
     // Track flow into exception constructors
     exists(ThrowStmt t |
       t.getExpr() = node2.asExpr() and
       node1.asExpr() instanceof BinaryExpr // Detect concatenation into exception message
       and
     // Track flow from throw statement to catch block
     exists(CatchClause c |
       c.getACaughtType().hasQualifiedName("java.lang", "Throwable") and
      //  // Ensure that the catch block handles the thrown exception
       c.getBlock().getAStmt() = node2.asExpr().getEnclosingStmt()
     )
     )
     
   }
 
   // Sink: Tracks `e.getMessage()` where sensitive information may be exposed
   predicate isSink(DataFlow::Node sink) {
     exists(MethodCall mc |
       mc.getMethod().getName() = "getMessage" and
       mc.getQualifier() instanceof VarAccess and
       // Ensure it's the caught exception variable
       mc.getQualifier().(VarAccess).getVariable() = sink.asExpr().(VarAccess).getVariable()
     )
   }
 }
 
 // Perform taint tracking from source to sink
 from Flow::PathNode source, Flow::PathNode sink
 where Flow::flowPath(source, sink)
 select sink.getNode(), source, sink,
   "Sensitive information might be exposed through an exception message."

I have run quick evals for all of the steps.

For the source I get
image

For the additional flow step, I get
image

However, I don't get anything for the additional flow step.
If I simplify it to just be

   predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
     // Track flow into exception constructors
     exists(ThrowStmt t |
       t.getExpr() = node2.asExpr() and
       node1.asExpr() instanceof BinaryExpr // Detect concatenation into exception message
     )

I get
image

So, I can track it going into the constructor, but not the constructor going to the catch clause.

Any advice on how to improve the tracking of the flow between the throw and catch block, or any suggestions on alternative approaches, would be greatly appreciated!

@KylerKatzUH KylerKatzUH added the question Further information is requested label Sep 30, 2024
@KylerKatzUH
Copy link
Author

I have been working on this problem more and I believe I am getting closer with some of the additions I have made.

Here is my latest query.

/**
 * @name CWE-537: Java Runtime Error Message Containing Sensitive Information
 * @description Detects sensitive information (e.g., apiKey) being added to an exception message and then exposed via getMessage in HTTP responses.
 * @kind path-problem
 * @problem.severity error
 * @id java/runtime-error-message-exposure/537
 * @tags security
 *       external/cwe/cwe-537
 * @cwe CWE-537
 */

 import java
 import semmle.code.java.dataflow.TaintTracking
 import semmle.code.java.dataflow.DataFlow
 import SensitiveInfo.SensitiveInfo
 import CommonSinks.CommonSinks
 
 module Flow = TaintTracking::Global<SensitiveInfoInExceptionConfig>;
 
 import Flow::PathGraph
 
 /**
  * Defines the taint configuration for tracking sensitive data in exception messages.
  */
 module SensitiveInfoInExceptionConfig implements DataFlow::ConfigSig {
 
   // Sensitive variables (e.g., apiKey)
   predicate isSource(DataFlow::Node source) {
    //  exists(VarAccess var |
    //    var.getVariable().getName() = "apiKey" and
    //    source.asExpr() = var
    //  )
    exists(SensitiveVariableExpr sve | source.asExpr() = sve)
   }
 
   // Combined logic for detecting sensitive data flowing into exception constructors and linking throw to catch
   predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
    // Ensure that sensitive data flows into a RuntimeException or its subclass
    exists(ThrowStmt t |
        t.getEnclosingCallable() = node1.asExpr().getEnclosingCallable() and
        t.getEnclosingCallable() = node2.asExpr().getEnclosingCallable() and
        t.getExpr() = node2.asExpr() and
        node1.asExpr() instanceof BinaryExpr and
        t.getExpr().getType().(RefType).getASupertype+().hasQualifiedName("java.lang", "RuntimeException")
    ) or
    // Flow from throw to catch within the same method
    exists(CatchClause c |
        c.getEnclosingCallable() = node1.asExpr().getEnclosingCallable() and
        c.getEnclosingCallable() = node2.asExpr().getEnclosingCallable() and
        node2.asExpr() = c.getVariable().getAnAccess() and
        exists(ThrowStmt t |
            t.getEnclosingCallable() = c.getEnclosingCallable() and
            node1.asExpr() = t.getExpr() and
            t.getExpr().getType().(RefType).getASupertype+().hasQualifiedName("java.lang", "RuntimeException")
        )
    ) or
    // Track method A calling method B and propagating the exception back to A
    exists(MethodCall call |
        call.getCallee() = node1.asExpr().getEnclosingCallable() and
        call.getEnclosingCallable() = node2.asExpr().getEnclosingCallable() and
        exists(ThrowStmt t |
            t.getEnclosingCallable() = call.getCallee() and
            node1.asExpr() = t.getExpr() and
            t.getExpr().getType().(RefType).getASupertype+().hasQualifiedName("java.lang", "RuntimeException")
        )
    ) and
    // Ensure node2 is within Method A’s control flow, handling the exception
    exists(CatchClause catchStmt |
        catchStmt.getEnclosingCallable() = node2.asExpr().getEnclosingCallable() and
        node2.asExpr() = catchStmt.getVariable().getAnAccess()
    )
}



 
   // Sink: Tracks `e.getMessage()` where sensitive information may be exposed
   predicate isSink(DataFlow::Node sink) {
     exists(MethodCall mc |
       mc.getMethod().getName() = "getMessage" and
       mc.getQualifier() instanceof VarAccess and
       // Ensure it's the caught exception variable
       mc.getQualifier().(VarAccess).getVariable() = sink.asExpr().(VarAccess).getVariable()
     )
     or

     exists(MethodCall mc, CatchClause cc |
      cc.getACaughtType().getASupertype*().hasQualifiedName("java.lang", "RuntimeException") and
      mc.getEnclosingStmt().getEnclosingStmt*() = cc.getBlock() and
      (
        CommonSinks::isPrintSink(sink) or
        CommonSinks::isErrorSink(sink) or
        CommonSinks::isServletSink(sink) or
        getSinkAny(sink)
      ) and
      sink.asExpr() = mc.getAnArgument()
    )
   }
 }
 
 // Perform taint tracking from source to sink
 from Flow::PathNode source, Flow::PathNode sink
 where Flow::flowPath(source, sink)
 select sink.getNode(), source, sink,
   "Sensitive information might be exposed through an exception message."

I added to the isAdditionalFlowStep predicate two cases.

  1. When the exception throw and catch are in the same method.
  2. When a method calls a method that throws the exception and it is caught in the calling method.

With this, I am starting to see results that I am looking for.

For example,

public class BAD_AccessControlError {
    public void validateUserAccess(String userName, int accessLevel) {
        try {
            if (accessLevel < 1 || accessLevel > 5) {
                throw new IllegalArgumentException("Access level " + accessLevel + " is out of valid range for " + userName);
            }
            // Access validation logic
        } catch (IllegalArgumentException e) {
            System.err.println(e.getMessage());
        }
    }

    public static void main(String[] args) {
        new BAD_AccessControlError().validateUserAccess("adminUser", 0);
    }
}

Has this detection
image

However, I am noticing that there are some results that are detected that don't actually have sensitive information such as this one,

public class BAD_EmailHeaderExposure {
    public void sendEmailWithSensitiveHeader(String recipient, String sessionToken) {
        Properties properties = System.getProperties();
        properties.setProperty("mail.smtp.host", "smtp.internal.books.com");
        Session session = Session.getDefaultInstance(properties, null);

        try {
            MimeMessage message = new MimeMessage(session);
            message.setRecipient(Message.RecipientType.TO, new InternetAddress(recipient));
            message.setSubject("Here is your session Info");
            message.addHeader("X-Session-Token", sessionToken);
            Transport.send(message);
        } catch (MessagingException e) {
            System.out.println("Failed to send email: " + e.getMessage());
        }
    }
}

In this, e.getMessage() doesn't have any sensitive information so it is a false positive. I believe what is happening here is that because I have mc.getMethod().getName() = "getMessage" in my isSink predicate that it is picking up on all uses of this method in my code even if there isn't a path. This is reinforced by the format of some of the results.

image

You can see that some have the ! while others have the path. The ones that have the ! seem to be false positives. Does anyone know what might be causing this, and how to fix it?

Thank you

@aibaars
Copy link
Contributor

aibaars commented Oct 1, 2024

The following code will flag up many more variables as sink than you need:

       // Ensure it's the caught exception variable
       mc.getQualifier().(VarAccess).getVariable() = sink.asExpr().(VarAccess).getVariable()

It says that the qualifier of a call to getMessage() is a variable, and all accesses to that variable are to be considered a sink. That would include things like

Exception e = new Exception(someTaintedValue);
e.getMessage(); // a sink, because `e` is accessed here
e.toString(); // another sink, because `e` is accessed again
throw e; // yet another sink;

I think you could model what you want as additional taint steps:

  • step 1: from argument of the constructor of an Exception to the constructed Exception
  • step 2: from the qualifier of Exception::getMessage() to a call of the getMessage() method.
    The above step steps should ensure that taint flowing into a constructor flows out of a getMessage() call.

@KylerKatzUH
Copy link
Author

Hello @aibaars

Thank you for your response. I updated my query based on how I interpreted your suggestions.

This is what I have so far

/**
 * @name CWE-537: Java Runtime Error Message Containing Sensitive Information
 * @description Detects sensitive information (e.g., apiKey) being added to an exception message and then exposed via getMessage in HTTP responses.
 * @kind path-problem
 * @problem.severity error
 * @id java/runtime-error-message-exposure/537
 * @tags security
 *       external/cwe/cwe-537
 * @cwe CWE-537
 */

 import java
 import semmle.code.java.dataflow.TaintTracking
 import semmle.code.java.dataflow.DataFlow
 import SensitiveInfo.SensitiveInfo
 
 module Flow = TaintTracking::Global<SensitiveInfoInExceptionConfig>;
 
 import Flow::PathGraph
 
 /**
  * Defines the taint configuration for tracking sensitive data in exception messages.
  */
 module SensitiveInfoInExceptionConfig implements DataFlow::ConfigSig {
   
   // Sensitive variables (e.g., apiKey)
   predicate isSource(DataFlow::Node source) {
     exists(SensitiveVariableExpr sve | source.asExpr() = sve)
   }
 
   // Track step 1: From argument of the constructor of an Exception to the constructed Exception
   predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
     // Step 1: Flow from sensitive data passed as an argument to the Exception constructor
     exists(ConstructorCall c |
       c.getConstructor().getDeclaringType().(RefType).getASupertype+().hasQualifiedName("java.lang", "RuntimeException") and
       c.getAnArgument() = node1.asExpr() and
       c = node2.asExpr()
     )
     or
     // Step 2: Flow from the qualifier of Exception::getMessage() to a call of getMessage
     exists(MethodCall mc |
       mc.getMethod().getName() = "getMessage" and
       mc.getQualifier() = node1.asExpr() and
       node2.asExpr() = mc
     )
   }
 
   // Sink: Tracks `e.getMessage()` where sensitive information may be exposed
   predicate isSink(DataFlow::Node sink) {
     exists(MethodCall mcSink, MethodCall mcGetMessage |
      // The argument passed is a call to getMessage()
      mcGetMessage.getMethod().getName() = "getMessage" and
      mcGetMessage.getQualifier() = sink.asExpr()
      and 
      // The sink methods (println, sendError, write)
       mcSink.getMethod().getName() in ["println", "sendError", "write"] and
       mcSink.getAnArgument() = mcGetMessage 
     )
   }
 
 }
 
 // Perform taint tracking from source to sink
 from Flow::PathNode source, Flow::PathNode sink
 where Flow::flowPath(source, sink)
 select sink.getNode(), source, sink,
   "Sensitive information might be exposed through an exception message."

The good news is that with this, all of the false positives have gone away. However, I am now trying to figure out how to get the full path for my results. Right, now it's just giving me the stopping point. As seen by the ! next to each result.

image

From reading about this, it seems like the flow is not connecting somewhere. I am suspecting that the issue is in the isAdditionalFlowStep predicate. I am just not entirely sure where.

I tried to include more steps to it with this, but I am getting the same result.

predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
    // Track flow from a throw statement to a catch block (ignore throws inside the catch block itself)
    exists(ThrowStmt t |
      // Ensure the throw statement is outside any catch block
      not exists(CatchClause catchInside |
        // Check if the throw statement is inside the body (block) of the catch clause
        t = catchInside.getBlock().getAStmt()
      ) and
      // Ensure that node1 is the thrown exception (e.g., new Exception(...))
      t.getExpr() = node1.asExpr() and
      // Ensure that both the throw and the catch block are in the same method or constructor
      t.getEnclosingCallable() = node1.asExpr().getEnclosingCallable() and
      t.getEnclosingCallable() = node2.asExpr().getEnclosingCallable() and
      // Ensure the thrown exception is a RuntimeException or its subclass
      t.getExpr().getType().(RefType).getASupertype+().hasQualifiedName("java.lang", "RuntimeException") and
      // Ensure the exception is caught by a catch block
      exists(CatchClause cc |
        cc.getEnclosingCallable() = t.getEnclosingCallable() and
        // Track flow from the caught exception to a getMessage() call
        exists(MethodCall mc |
          mc.getMethod().getName() = "getMessage" and
          mc.getQualifier() = cc.getVariable().getAnAccess() and
          node2.asExpr() = mc
        )
      )
    )
  }

I am not sure if this is too much, but I wanted to make sure I wasn't missing a step in the flow.

@aibaars
Copy link
Contributor

aibaars commented Oct 2, 2024

Your isSink predicate doesn't seem right. I think all you need to do is to flag the argument to a println, sendError, and write method call as a sink. The flow steps should take care of tracking the flow through e.getMesage(), no need to repeat that in the isSink definition. If you make the isSink too specific, you would miss out on code like:

String message = e.getMessage();
sendError(message);

Something like the following isSink definition should work better.

   predicate isSink(DataFlow::Node sink) {
     exists(MethodCall mcSink |
      // The sink methods (println, sendError, write)
       mcSink.getMethod().getName() in ["println", "sendError", "write"] and
       mcSink.getAnArgument() = sink.asExpr() 
     )
   }

The isSource and isSink predicates should define the points where flow starts and ends, nothing more and nothing less. If you make them to broad, you'll get false positives, if you make them too specific, you'll miss out on results (or flow steps)

For additional flow steps it is usually better to keep each step "small". It's the task of the dataflow analyzer to combine all the steps into a complete flow path. It can do better job when has many small steps that is can combine than an overly specific big step. You did this right for Step 1 and Step 2 , but the throw to catch clause step is too specific. Just define it as a step from the expression of a throw to the variable of the catch clause. No need to mention getMessage() again. The getMessage() method call should be handled by Step 2.

@KylerKatzUH
Copy link
Author

Hi @aibaars,

Thank you again for your help. I appreciate it.

I have gone back and implemented the changes that you suggested. However, this leads to a lot of false positives.

For example, this code has a detection

public class BAD_ExposeSessionIDInHeaders {
    public void exposeSessionID(HttpServletResponse response, String sessionID) {
        response.setHeader("X-Session-ID", sessionID);
        response.setContentType("text/html");
        try {
            response.getWriter().println("Session details " + sessionID + " are set in headers.");
        } catch (IOException e) {
            e.printStackTrace();
        }
    }
}

image
With this being the sink
response.getWriter().println("Session details " + sessionID + " are set in headers.");

This doesn't have anything to do with what is defined in the isAdditionalFlowStep it's almost like it's completely ignoring this predicate during the dataflow analysis. The good news is that this gives me a path for the results.

Here is my current query

/**
 * @name CWE-537: Java Runtime Error Message Containing Sensitive Information
 * @description Detects sensitive information (e.g., apiKey) being added to an exception message and then exposed via getMessage in HTTP responses.
 * @kind path-problem
 * @problem.severity error
 * @id java/runtime-error-message-exposure/537
 * @tags security
 *       external/cwe/cwe-537
 * @cwe CWE-537
 */

 import java
 import semmle.code.java.dataflow.TaintTracking
 import semmle.code.java.dataflow.DataFlow
 import SensitiveInfo.SensitiveInfo
 
 module Flow = TaintTracking::Global<SensitiveInfoInExceptionConfig>;
 
 import Flow::PathGraph
 
 /**
  * Defines the taint configuration for tracking sensitive data in exception messages.
  */
 module SensitiveInfoInExceptionConfig implements DataFlow::ConfigSig {
   
   // Sensitive variables (e.g., apiKey)
   predicate isSource(DataFlow::Node source) {
     exists(SensitiveVariableExpr sve | source.asExpr() = sve)
   }
 
   // Track step 1: From argument of the constructor of an Exception to the constructed Exception
   predicate isAdditionalFlowStep(DataFlow::Node node1, DataFlow::Node node2) {
     // Step 1: Flow from sensitive data passed as an argument to the Exception constructor
     exists(ConstructorCall c |
       c.getConstructor().getDeclaringType().(RefType).getASupertype+().hasQualifiedName("java.lang", "RuntimeException") and
       c.getAnArgument() = node1.asExpr() and
       c = node2.asExpr()
     )
     or
     // Step 2: Flow from the qualifier of Exception::getMessage() to a call of getMessage
     exists(MethodCall mc |
       mc.getMethod().getName() = "getMessage" and
       mc.getQualifier() = node1.asExpr() and
       node2.asExpr() = mc
     )
   }
 
   // Sink: Tracks `e.getMessage()` where sensitive information may be exposed
   predicate isSink(DataFlow::Node sink) {
    exists(MethodCall mcSink |
     // The sink methods (println, sendError, write)
      mcSink.getMethod().getName() in ["println", "sendError", "write"] and
      mcSink.getAnArgument() = sink.asExpr() 
    )
  }
 }
 
 // Perform taint tracking from source to sink
 from Flow::PathNode source, Flow::PathNode sink
 where Flow::flowPath(source, sink)
 select sink.getNode(), source, sink,
   "Sensitive information might be exposed through an exception message."

@aibaars
Copy link
Contributor

aibaars commented Oct 3, 2024

Ah indeed, now the query should flag up any sensitive data that is exposed to a sink. That is of course a useful analysis, but you want to restrict the results to exposure via exception messages.

This doesn't have anything to do with what is defined in the isAdditionalFlowStep it's almost like it's completely ignoring this predicate during the dataflow analysis. The good news is that this gives me a path for the results.

The dataflow analyzer has a lot of default steps, for example variable writes to variable reads, method call arguments to method parameters, etc. It uses the default steps in combination with any additional steps you provide. If it can find a path without using the additional steps, that is fine too.

Instead of any path from a sensitive data source into print/write sink you want the dataflow analyzer to restrict those paths to the ones that flow via an exception. I think you can achieve that using flow labels: https://codeql.github.com/docs/codeql-language-guides/using-flow-labels-for-precise-data-flow-analysis/ . The idea is to start at the source with a default label, but make the isSink predicate require that the label is exception. You also need to make one of the additional steps (for example the exception construction step) a label transforming step that allows the incoming flow to have any label, but makes the outgoing label exception. With these changes the dataflow analyzer should be able to construct paths that start with a default label in the source, take any number of label-preserving steps, reach an exception creation, which changes the label from default to exception, take any number of label-preserving steps and reach the sink with the exception label.

@KylerKatzUH
Copy link
Author

Hi @aibaars,

That might help. However, is FlowLabel supported for Java? I can only find Javascript/Typescript examples.

@intrigus-lgtm
Copy link
Contributor

Hi @aibaars,

That might help. However, is FlowLabel supported for Java? I can only find Javascript/Typescript examples.

The shared dataflow library (which JS/TS is currently not using) uses the concept of flow states instead.
As an example, see here:
#12797 (comment)

@KylerKatzUH
Copy link
Author

Hi @intrigus-lgtm ,

Thank you for pointing me towards flow states. This is a bit out of my depth so I am struggling to understand exactly how to convert my query to one that uses this functionally.

Looking at the solution that you sent it looks like it's used to combine multiple configurations into one query, where each config is a state. With that said, the next question would be how many states should I use? I found this discussion that uses 3 states.

In my current query I have

  1. Start at a variable marked as sensitive.
  2. Check to see if that variable is used in the construction of a new Runtime Exception.
  3. Find where this exception is caught
  4. Check to see if getMessage() is used.
  5. Look to see if it flows to a sink (E.x. println())

With this mapped out I would think these are my states

State 1. Sensitive variable -> Runtime Exception
State 2. Runtime Exception -> Catch Clause
State 3. Catch Clause -> getMessage()
State 4. getMessage() -> println()

Does this seem like too many states? I am not sure if the default dataflow will take care of some of these or not?

Let's assume that these are the correct states, I am still struggling to understand how to implement them as I have seen example that are more theoretical. I have this example as well but it's going over my head.

I would appreciate any help as this is new to me.

Thank you

@KylerKatzUH
Copy link
Author

Here is my current attempt at this. Since I believe getMessage() -> println() will be done automatically I removed that one to include the first 3 states. Using this query I am not getting any results when I run the full thing. However, I am getting the expected results when doing the quick eval for the isSource and isSink predicates.

/**
 * @name Sensitive information exposed via exception messages
 * @description Tracks sensitive information flowing into exception constructors and exposed via getMessage in sinks.
 * @kind path-problem
 * @id java/sensitive-info-exception-message
 * @problem.severity warning
 */

 import java
 import semmle.code.java.dataflow.TaintTracking
 import semmle.code.java.dataflow.DataFlow
 import DataFlow::PathGraph
 
 // Define states for the flow
 class State1 extends DataFlow::FlowState { State1() { this = "State1" } }
 class State2 extends DataFlow::FlowState { State2() { this = "State2" } }
 class State3 extends DataFlow::FlowState { State3() { this = "State3" } }
 
 // Define sensitive variables
 class SensitiveVariable extends VarAccess {
   SensitiveVariable() {
     this.getVariable().getName() = "username" or
     this.getVariable().getName() = "email" or
     this.getVariable().getName() = "password"
   }
 }
 
 // Dataflow configuration using flow states
 class ExceptionDataFlowConfig extends TaintTracking::Configuration {
   ExceptionDataFlowConfig() { this = "ExceptionDataFlowConfig" }
 
   // Track sensitive variables as the source in State1
   override predicate isSource(DataFlow::Node source, DataFlow::FlowState state) {
     state instanceof State1 and
     exists(SensitiveVariable var |
       source.asExpr() = var
     )
   }
 
   // Track sinks like `println`, `sendError`, etc. in State3
   override predicate isSink(DataFlow::Node sink,DataFlow::FlowState state) {
     state instanceof State3 and
     exists(MethodCall mcSink |
       mcSink.getMethod().getName() in ["println", "sendError", "write"] and
       mcSink.getAnArgument() = sink.asExpr()
     )
   }
 
   // Define transitions between flow states
   override predicate isAdditionalTaintStep(
     DataFlow::Node node1, DataFlow::FlowState state1,
     DataFlow::Node node2, DataFlow::FlowState state2
   ) {
     // Transition from State1 to State2: sensitive data flows into an exception constructor
     state1 instanceof State1 and
     state2 instanceof State2 and
     exists(ConstructorCall cc |
       cc.getAnArgument() = node1.asExpr() and
       cc.getConstructor().getDeclaringType().(RefType).getASupertype+().hasQualifiedName("java.lang", "RuntimeException") and
       node2.asExpr() = cc
     ) 
     or
 
     // Transition from State2 to State3: exception is caught, and getMessage() is called
     state1 instanceof State2 and
     state2 instanceof State3 and
     exists(CatchClause catchClause, MethodCall mcGetMessage |
       mcGetMessage.getMethod().getName() = "getMessage" and
       mcGetMessage.getQualifier() = catchClause.getVariable().getAnAccess() and
       node1.asExpr() = catchClause.getVariable().getAnAccess() and
       node2.asExpr() = mcGetMessage
     )
   }
 }
 
 // Query for sensitive information flow from source to sink
 from DataFlow::PathNode source, DataFlow::PathNode sink, ExceptionDataFlowConfig cfg
 where cfg.hasFlowPath(source, sink)
 select source, sink, source, "Sensitive information flows into exception and is exposed via getMessage."

Do you see any obvious issues with how I have implemented this?

@smowton
Copy link
Contributor

smowton commented Oct 11, 2024

This looks sensible. As a matter of style I'd recommend naming your states for their meaning, not just numbering them. Note by the way that while this is fine for prototyping, the old DataFlow::Configuration and TaintTracking::Configuration are deprecated -- the new ConfigSig style (https://github.blog/changelog/2023-08-14-new-dataflow-api-for-writing-custom-codeql-queries/) is now preferred. For stateful queries you'd need to use DataFlow::StateConfigSig, and can actually define a non-string FlowState, like newtype TExceptionTrackingState = TStart() or TAfterExnConstructor() or TAfterGetMessage() ... class FlowState extends TExceptionTrackingState { ... }

For an example, check out java/ql/lib/semmle/code/java/security/ArbitraryApkInstallationQuery.qll

@rvermeulen rvermeulen added the awaiting-response The CodeQL team is awaiting further input or clarification from the original reporter of this issue. label Oct 14, 2024
Copy link
Contributor

This issue is stale because it has been open 14 days with no activity. Comment or remove the Stale label in order to avoid having this issue closed in 7 days.

@github-actions github-actions bot added the Stale label Oct 29, 2024
Copy link
Contributor

github-actions bot commented Nov 6, 2024

This issue was closed because it has been inactive for 7 days.

@github-actions github-actions bot closed this as not planned Won't fix, can't repro, duplicate, stale Nov 6, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
awaiting-response The CodeQL team is awaiting further input or clarification from the original reporter of this issue. question Further information is requested Stale
Projects
None yet
Development

No branches or pull requests

5 participants