Integrating with Code Scanning
Problem statement
Many of GitHub’s Technology Partners that offer security products in the form of static analysis tooling wish to surface their tools’ security findings directly in GitHub’s UI. Doing so makes it easier for developers to adopt their tooling, and adds value to the development workflow by identifying potential vulnerabilities before they reach production. This kind of developer workflow is often associated with DevSecOps and the concept of shifting left, as security analyses are performed frequently and earlier in the development process.
Solution
A paved path exists that is tailored for this type of integration in the form of GitHub code scanning, a developer-first, GitHub-native approach to easily find security vulnerabilities before they reach production.
Technology Partners can integrate their tooling with code scanning by submitting analyses to GitHub in the Static Analysis Results Interchange Format (SARIF), version 2.1.0. The format is formally specified in the SARIF specification (see Resources for learning SARIF); however, GitHub code scanning supports only a subset of its properties, which are listed in the SARIF support for code scanning documentation (see Further documentation).
The analysis is typically triggered by events originating from GitHub, such as developers pushing code (the push event), opening a pull request (the pull_request event), or on some pre-determined automated schedule (e.g. once per week).
Two implementation approaches are available, via GitHub Actions or via GitHub Apps, each of which is explored further below.
Before diving into the implementation detail, it is worth designing how your tool should structure its output using the SARIF format, with consideration for the SARIF properties that are supported by GitHub code scanning.
Commonly, this will be an iterative process:
- Generate your SARIF report (potentially by hand, at least initially)
- Validate your SARIF report, using the online SARIF validator at sarifweb.azurewebsites.net/Validation
  - Important: It is recommended to enable GitHub ingestion rules, for additional code scanning compatibility validation
- Upload your SARIF report to GitHub code scanning for visual verification. Note: Code scanning is available for all public repositories and for private repositories owned by organizations where GitHub Advanced Security is enabled. For more information, see About GitHub Advanced Security.
  - Uploading may be done using the REST API, via a curl command. Note that the SARIF report must be gzipped and base64-encoded prior to being uploaded to GitHub.
  - Alternatively, commit the SARIF report directly to a GitHub repo and upload it to code scanning using the github/codeql-action/upload-sarif action.
- Repeat.
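The gzip and base64 step for a REST API upload can be sketched in Python as follows. This is a minimal sketch, not a definitive implementation: it assumes the code scanning SARIF upload endpoint (POST /repos/{owner}/{repo}/code-scanning/sarifs) with the commit_sha, ref, and sarif body fields, and the function names encode_sarif and build_upload_request are hypothetical names introduced here for illustration. The request is only built, not sent.

```python
import base64
import gzip
import json
import urllib.request


def encode_sarif(sarif_text: str) -> str:
    """Gzip the SARIF text, then base64-encode the compressed bytes."""
    compressed = gzip.compress(sarif_text.encode("utf-8"))
    return base64.b64encode(compressed).decode("ascii")


def build_upload_request(owner, repo, token, commit_sha, ref, sarif_text):
    """Build (but do not send) a POST request for the SARIF upload endpoint.

    Assumption: the endpoint path and the commit_sha / ref / sarif fields
    follow the code scanning REST API; verify against the current reference.
    """
    payload = {
        "commit_sha": commit_sha,            # full SHA of the analyzed commit
        "ref": ref,                          # e.g. "refs/heads/main"
        "sarif": encode_sarif(sarif_text),   # gzipped, base64-encoded report
    }
    return urllib.request.Request(
        url=f"https://api.github.com/repos/{owner}/{repo}/code-scanning/sarifs",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Accept": "application/vnd.github+json",
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )
```

To send the request, pass the returned object to urllib.request.urlopen. The token needs permission to write code scanning results for the target repository.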
Example SARIF report
An example SARIF report (generated by the Brakeman tool for an intentionally vulnerable Ruby on Rails application), whose structure was designed by following the process outlined above, is available:
The following points warrant special mention:
- The output conforms to version 2.1.0 of the SARIF spec, as indicated by the top-level version and schema properties, and confirmed by the online SARIF validator
- The top-level runs object is an array containing a single element, an object representing the tool, rules, and results of the run.
  - The tool’s semanticVersion is useful to include; it helps ingestion systems know, run-over-run, whether a tool has been updated.
- The rules array represents the set of vulnerabilities that the tool scans for. Each rule is represented by an id, name, fullDescription, helpUri, help text, and an additional properties bag
  - Each rule’s id uses a prefix that is representative of the tool name, BRAKE in this instance, followed by a numeric identifier. This helps with filtering of rules in the GitHub code scanning UI
  - Each rule’s name is a hierarchical property. This makes sense for this particular tool, and others may also adopt this pattern where it makes sense
  - Each rule’s fullDescription ends with a period, which helps facilitate a consistent user experience when the rule is rendered by GitHub code scanning
  - Each rule’s help references an external article via a URL. Generally it is preferred to include the help text inline, within the SARIF report, but for this implementation this was not straightforward, and will hopefully be addressed in a subsequent iteration.
- The results array captures the results of the analysis, with each violation of a rule being captured in a single result entry. For example, rule BRAKE0014 is violated five times, as indicated by results on lines 437, 458, 479, 500, and 521
  - Each result’s entry references the rule being violated, via the rule’s id and position in the rules array
  - Each result’s entry maps onto a source file via the locations array. For portability across systems, the uri is expressed as a path relative to %SRCROOT%
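The structure described in the points above can be sketched as a small Python builder. This is an illustrative sketch under stated assumptions, not the Brakeman implementation: the tool name ExampleScanner, the EXMPL rule prefix, the example.com URLs, and the helper names build_report and check_rule_references are all hypothetical, and the schema URL is one commonly used location for the SARIF 2.1.0 JSON schema. Note that in SARIF 2.1.0 the rules array is nested under the run's tool.driver object.

```python
import json

# Assumption: a commonly used location for the SARIF 2.1.0 JSON schema.
SARIF_SCHEMA = "https://json.schemastore.org/sarif-2.1.0.json"


def build_report(tool_name, semantic_version, rules, findings):
    """Assemble a single-run SARIF 2.1.0 report as a plain dict."""
    # Map each rule id to its position, so results can carry ruleIndex.
    rule_index = {rule["id"]: i for i, rule in enumerate(rules)}
    return {
        "version": "2.1.0",
        "$schema": SARIF_SCHEMA,
        "runs": [{
            "tool": {"driver": {
                "name": tool_name,
                "semanticVersion": semantic_version,
                "rules": [{
                    "id": r["id"],
                    "name": r["name"],
                    "fullDescription": {"text": r["description"]},
                    "helpUri": r["help_uri"],
                    "help": {"text": r["help_text"]},
                    "properties": {"tags": r.get("tags", [])},
                } for r in rules],
            }},
            "results": [{
                "ruleId": f["rule_id"],
                "ruleIndex": rule_index[f["rule_id"]],
                "message": {"text": f["message"]},
                "locations": [{"physicalLocation": {
                    # Relative path, resolved against the source root.
                    "artifactLocation": {"uri": f["path"], "uriBaseId": "%SRCROOT%"},
                    "region": {"startLine": f["line"]},
                }}],
            } for f in findings],
        }],
    }


def check_rule_references(report):
    """Return True if every result's ruleId matches the rule at its ruleIndex."""
    for run in report["runs"]:
        rules = run["tool"]["driver"]["rules"]
        for result in run["results"]:
            if rules[result["ruleIndex"]]["id"] != result["ruleId"]:
                return False
    return True


# Hypothetical rule and finding, for illustration only.
example_rules = [{
    "id": "EXMPL0001",
    "name": "Example/Category/Rule",
    "description": "Describes what the example rule detects.",
    "help_uri": "https://example.com/rules/EXMPL0001",
    "help_text": "More info: https://example.com/rules/EXMPL0001.",
}]
example_findings = [{
    "rule_id": "EXMPL0001",
    "message": "Example finding message.",
    "path": "app/models/user.rb",
    "line": 3,
}]
example_report = build_report("ExampleScanner", "1.0.0", example_rules, example_findings)
print(json.dumps(example_report, indent=2))
```

A report built this way is still worth running through the online SARIF validator, since this sketch only checks that rule references are internally consistent.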
The following screenshot shows the GitHub code scanning representation of a violation of rule BRAKE0014, derived from the corresponding result object on lines 520 through 540.
Note that this is a relatively straightforward SARIF report; more sophisticated constructs are possible. To learn more, it is recommended to follow the SARIF tutorials and review the specification.
Implementation detail
Once you are satisfied with the structure of the SARIF produced by your tool, there are two primary approaches when integrating it with code scanning:
- Via GitHub Actions
- Via GitHub Apps
The former is generally applicable where:
- The tooling is installable as a CLI tool that can easily execute on GitHub’s compute (e.g. Brakeman, detekt), -or-
- The tooling may be easily invoked via public or authenticated API calls. Tokens for authentication may be held in GitHub as encrypted secrets.
The latter is more suitable for solutions that have unique compute requirements, or that have user-facing elements (such as configuration controls or dashboards), potentially via a dedicated web UI or control panel.
GitHub Actions and GitHub Apps are both covered in more detail in the Platform Integration 101 presentation.
Additional resources are also available for both GitHub Actions and GitHub Apps.
Onboarding your integration into the GitHub code scanning UI
When complete, the onboarding of your integration into the GitHub code scanning UI can be initiated by opening a new pull request in the actions/starter-workflows repo. Additional instructions are located in the pull request template.
Publication to GitHub Marketplace
In addition to onboarding into the code scanning UI, we highly recommend publishing your integration to Marketplace for increased visibility.
Additional information is available for both GitHub Actions and GitHub Apps.
Examples
Existing implementations and examples are available:
- Brakeman SARIF implementation: github.com/presidentbeef/brakeman/pull/1500 (Brakeman is an open source static analysis tool, popular in the Ruby on Rails community)
- Code Scanning playground: github.com/swinton/code-scanning-playground (a forkable template repo, showing a simple code scanning workflow leveraging ESLint)
Related
Resources for learning SARIF
Useful resources for learning SARIF are available:
- SARIF tutorials from Microsoft: github.com/microsoft/sarif-tutorials
- SARIF Validator web-based tool: sarifweb.azurewebsites.net/Validation
- SARIF specification, v2.1.0: docs.oasis-open.org/sarif/sarif/v2.1.0/sarif-v2.1.0.html
Further documentation
Further documentation is available on GitHub.com, including:
- SARIF support for code scanning: docs.github.com/github/finding-security-vulnerabilities-and-errors-in-your-code/sarif-support-for-code-scanning
- Code scanning REST API: docs.github.com/rest/reference/code-scanning