Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Javascript: How can I filter some dataflow results? #6920

Open
greatyy opened this issue Oct 20, 2021 · 13 comments
Open

Javascript: How can I filter some dataflow results? #6920

greatyy opened this issue Oct 20, 2021 · 13 comments
Labels
JS question Further information is requested

Comments

@greatyy
Copy link

greatyy commented Oct 20, 2021

Example:

var a = request.body;
var obj = {
    "body": a.a;
    "headers": a.b
}
foo(obj)

while I query the above code, there are two flow path:

1. a -> obj.body -> foo
2. a -> obj.headers -> foo

And I would like to show result 1 only, and filter result 2.
How can I solve the problem? Thanks

@greatyy greatyy added the question Further information is requested label Oct 20, 2021
@esbena
Copy link
Contributor

esbena commented Oct 20, 2021

Can you provide some additional context for the query? Like the kind of problem are you trying to identify, or some additional nearby source code.

Generally speaking from the security angle, if request is some kind of (HTTP) request object from an attacker, then request.body.a is just as attacker-controlled as request.body.b, so obj.body and obj.headers should be treated the same by a security query.

@greatyy
Copy link
Author

greatyy commented Oct 21, 2021

For example, the foo function is in third party, and it concats obj.body to a sql string, and just log obj.headers, like

function foo(obj){
   let sql = "select * from db where body=" + obj.body
   exec(sql)
   logger.debug(obj.headers)
}

Because foo is invisible, I make the arguement of foo as sink, request.body as source.
And flows via a.b is false positive, I want to filter this kind of results. @esbena

@greatyy
Copy link
Author

greatyy commented Oct 21, 2021

Can FlowLabel do this work? And if so, how to use it in this example? Thanks

@esbena
Copy link
Contributor

esbena commented Oct 21, 2021

https://codeql.github.com/docs/codeql-language-guides/using-flow-labels-for-precise-data-flow-analysis/ is the documentation for FlowLabel. This is indeed the proper way to encode the problem, but there is a simpler, and common, solution that usually produces good results in practice. Can you take a look and let me know if that is a solution for you?

Instead of modelling:

sink = any(CallNode n | n.getCalleeName() = "foo").getArgument(0)

we can model:

sink = any(CallNode n | n.getCalleeName() = "foo").getOptionArgument(0, "body")`

That is: we essentially stop the dataflow tracking one step earlier.

This approach works for all queries without introducing the complex FlowLabel, and it works for your example since obj.body is defined relatively close to the foo(obj) call.

@greatyy
Copy link
Author

greatyy commented Oct 21, 2021

It works in the above example. But in my real world code, the example is more like :

var a = request.body;
var obj = {
    "body": a.a,
    "headers": a.b
}
function buildMsg(obj){
   let result = new SomeKindOfClass();
   result.body = obj.body;
   result.headers = obj.headers
   return result
}
foo(buildMsg(obj))

The arguement flows into other function. Is there a simple way to solve it? @esbena

@esbena
Copy link
Contributor

esbena commented Oct 21, 2021

OK. Then I'll refer to https://codeql.github.com/docs/codeql-language-guides/using-flow-labels-for-precise-data-flow-analysis/ again. The JsonTrackingConfig has the form you want. I'll give you some pointers, please provide QL code for further assistance.

Like in that example you need the following:

  • two flow labels: the builtin FlowLabel::taint() and a custom one: MyFlowLabel::objectWithTaintedBody().
  • an isAdditionalFlowStep that taints a base object with MyFlowLabel::objectWithTaintedBody() for .body writes with an FlowLabel::taint() RHS label.
  • an isSink predicate that only holds for MyFlowLabel::objectWithTaintedBody()

@greatyy
Copy link
Author

greatyy commented Oct 21, 2021

The origin query:

import javascript
import DataFlow::PathGraph

class Configuration extends TaintTracking::Configuration {
    Configuration() { this = "" }

    override predicate isSource(DataFlow::Node source) {
        DataFlow::globalVarRef("request").getAPropertyRead("body")=source
    }

    override predicate isSink(DataFlow::Node sink) {
        exists(DataFlow::CallNode process | process.getCalleeName()="foo" and sink=process.getArgument(0))
    }
    
    override predicate isAdditionalTaintStep(DataFlow::Node node1, DataFlow::Node node2) {
        node2.(DataFlow::SourceNode).getAPropertyWrite().getRhs() = node1 
    }
}

from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink, source, sink, "vuln"

@esbena

@esbena
Copy link
Contributor

esbena commented Oct 21, 2021

You need to change TaintTracking::Configuration to DataFlow::Configuration, add DataFlow::FlowLabel parameters to all of the predicates, and then make of of those parameters. It is probably easiest to copy your bodies into the linked JsonTrackingConfig.

@greatyy
Copy link
Author

greatyy commented Oct 21, 2021

It seems that the DataFlow::Configuration is a local flow...
I change the ql code as below:

import javascript
import DataFlow::PathGraph

class JsonLabel extends DataFlow::FlowLabel {
    JsonLabel() {
      this = "json"
    }
  }
  
  class MaybeNullLabel extends DataFlow::FlowLabel {
    MaybeNullLabel() {
      this = "maybe-null"
    }
  }

class JsonTrackingConfig extends DataFlow::Configuration {
    JsonTrackingConfig() { this = "JsonTrackingConfig" }
  
    override predicate isSource(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
        DataFlow::globalVarRef("request").getAPropertyRead("body")=nd
        and
        (lbl instanceof JsonLabel or lbl instanceof MaybeNullLabel)
      
    }
  
    override predicate isSink(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
        exists(DataFlow::CallNode process | process.getCalleeName()="foo" and nd=process.getArgument(0))
        and 
        lbl instanceof MaybeNullLabel
      
    }
  
    override predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ,
                               DataFlow::FlowLabel predlbl, DataFlow::FlowLabel succlbl) {
      succ.(DataFlow::SourceNode).getAPropertyWrite().getRhs() = pred 
      and
      predlbl instanceof JsonLabel and
      (succlbl instanceof JsonLabel or succlbl instanceof MaybeNullLabel)
    }
  
  }
  
  from JsonTrackingConfig cfg, DataFlow::PathNode source, DataFlow::PathNode sink
  where cfg.hasFlowPath(source, sink)
  select sink, source, sink, "Property access on JSON value originating $@.", source, "here"

It turns out no result. @esbena

@esbena
Copy link
Contributor

esbena commented Oct 21, 2021

The above code looks right to me, but it is confusing to see the JsonLabel and MaybeNullLabel usage in this context, so perhaps I missed a subtlety.

Some suggestions towards spotting a bug:

  • the isSource label should only be JsonLabel
  • the isAdditionalFlowStep succLbl should be MaybeNullLabel
  • rename the labels appropriately to make the query less confusing to read (it has nothing to do with json/null anymore...)
  • in vscode, use quickeval on each predicate and confirm their results in isolation

@greatyy
Copy link
Author

greatyy commented Oct 22, 2021

I change my code to:

import javascript
import DataFlow::PathGraph

class FilterLabel extends DataFlow::FlowLabel {
    FilterLabel() {
      this = "filter"
    }
  }
  
  class PassLabel extends DataFlow::FlowLabel {
    PassLabel() {
      this = "pass"
    }
  }

class JsonTrackingConfig extends DataFlow::Configuration {
    JsonTrackingConfig() { this = "JsonTrackingConfig" }
  
    override predicate isSource(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
        DataFlow::globalVarRef("request").getAPropertyRead("body")=nd
        and
        (lbl instanceof PassLabel)
      
    }
  
    override predicate isSink(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
        exists(DataFlow::CallNode process | process.getCalleeName()="foo" and nd=process.getArgument(0))
        and 
        lbl instanceof PassLabel
      
    }
  
    override predicate isAdditionalFlowStep(DataFlow::Node pred, DataFlow::Node succ,
                               DataFlow::FlowLabel predlbl, DataFlow::FlowLabel succlbl) {
      succ.(DataFlow::SourceNode).getAPropertyWrite().getRhs() = pred 
      and
      predlbl instanceof PassLabel and
      (succlbl instanceof PassLabel)
    }
  
  }
  
  from JsonTrackingConfig cfg, DataFlow::PathNode source, DataFlow::PathNode sink
  where cfg.hasFlowPath(source, sink)
  select sink, source, sink, "Property access on JSON value originating $@.", source, "here"

which FilterLabel means property I would like to filter, and PassLabel means that property I would like to pass.
It still has no results.
And I am confused about :

  1. It seems that DataFlow::Configuration is a funcion scope local Configuration, is it proper using here?
  2. I can not understand the principle of FlowLabel using here. And why it still has not results?

@esbena

@greatyy
Copy link
Author

greatyy commented Oct 22, 2021

For convenience, I put the demo code here: https://github.com/greatyy/test_codeql_json.git

@esbena
Copy link
Contributor

esbena commented Oct 22, 2021

Your example ql code above only uses the PassLabel. Please use Quick Eval in vscode to confirm that your predicates are correct.

  • I have fixed that, along with some label renaming and query formatting (codeql query format -i query.ql).
  • I have introduced the use of TaintedObject since any nested property of request.body is dangerous to add to the headers object. It is also dangerous to pass any nested property of request.body to the sink since that property in turn could have a .headers property.
  • I have added the @kind path-problem
/**
 * @kind path-problem
 */

import javascript
import DataFlow::PathGraph
import semmle.javascript.security.TaintedObject

class TaintedObjectInHeaders extends DataFlow::FlowLabel {
  TaintedObjectInHeaders() { this = "tainted-object-in-headers" }
}

class JsonTrackingConfig extends DataFlow::Configuration {
  JsonTrackingConfig() { this = "JsonTrackingConfig" }

  override predicate isSource(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
    DataFlow::globalVarRef("request").getAPropertyRead("body") = nd and
    lbl = TaintedObject::label()
  }

  override predicate isSink(DataFlow::Node nd, DataFlow::FlowLabel lbl) {
    exists(DataFlow::CallNode process |
      process.getCalleeName() = "foo" and nd = process.getArgument(0)
    ) and
    (lbl instanceof TaintedObjectInHeaders or lbl = TaintedObject::label())
  }

  override predicate isAdditionalFlowStep(
    DataFlow::Node pred, DataFlow::Node succ, DataFlow::FlowLabel predlbl,
    DataFlow::FlowLabel succlbl
  ) {
    succ.(DataFlow::SourceNode).getAPropertyWrite("headers").getRhs() = pred and
    predlbl = TaintedObject::label() and
    succlbl instanceof TaintedObjectInHeaders
    or
    TaintedObject::step(pred, succ, predlbl, succlbl)
  }
}

from JsonTrackingConfig cfg, DataFlow::PathNode source, DataFlow::PathNode sink
where cfg.hasFlowPath(source, sink)
select sink.getNode(), source, sink, "Property access on JSON value originating $@.",
  source.getNode(), "here"

@sidshank sidshank added the JS label Oct 19, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
JS question Further information is requested
Projects
None yet
Development

No branches or pull requests

3 participants