Skip to content

e2toe8/iris

 
 

Repository files navigation

IRIS logo

Read the Docs ]


⚠️ Code and data for the ICLR 2025 Paper can be found in the v1 branch, license and citation below.

📰 News

  • [Jul. 10, 2025]: IRIS v2 released, added support for 7 new CWEs.

👋 Overview

IRIS

IRIS is a neurosymbolic framework that combines LLMs with static analysis for security vulnerability detection. IRIS uses LLMs to generate source and sink specifications and to filter false positive vulnerable paths. At a high level, IRIS takes a project and a CWE (vulnerability class, such as path traversal vulnerability or CWE-22) as input, statically analyzes the project, and outputs a set of potential vulnerabilities (of type CWE) in the project.

iris workflow

CWE-Bench-Java

This repository also contains the dataset CWE-Bench-Java, presented in the paper LLM-Assisted Static Analysis for Detecting Security Vulnerabilities. At a high level, this dataset contains 120 CVEs spanning 4 CWEs, namely path-traversal, OS-command injection, cross-site scripting, and code-injection. Each CVE includes the buggy and fixed source code of the project, along with the information of the fixed files and functions. We provide the seed information in this repository, and we provide scripts for fetching, patching, and building the repositories. The dataset collection process is illustrated in the figure below:

cwe-bench graphic

🚀 Set Up

Using Docker (Recommended)

docker build -f Dockerfile --platform linux/x86_64 -t iris:latest .
docker run --platform=linux/amd64 -it iris:latest

If you intend to configure build tools (Java, Maven, or Gradle) or CodeQL, follow the native setup instructions below.

Native (Mac/ Linux)

Step 1: Setup Conda environment

conda env create -f environment.yml
conda activate iris

If you have a CUDA-capable GPU and want to enable hardware acceleration, install the appropriate CUDA toolkit, for example:

$ conda install pytorch-cuda=12.1 -c nvidia -c pytorch

Replace 12.1 with the CUDA version compatible with your GPU and drivers, if needed.

Step 2: Configure Java build tools

To apply IRIS to Java projects, you need to specify the paths to your Java build tools (JDK, Maven, Gradle) in the dep_configs.json file in the project root.

The versions of these tools required by each project are specified in data/build_info.csv. For instance, perwendel__spark_CVE-2018-9159_2.7.1 requires JDK 8 and Maven 3.5.0. You can install and manage these tools easily using SDKMAN!.

# Install SDKMAN!
curl -s "https://get.sdkman.io" | bash
source "$HOME/.sdkman/bin/sdkman-init.sh"

# Install Java 8 and Maven 3.5.0
sdk install java 8.0.452-amzn
sdk install maven 3.5.0

Step 3: Configure CodeQL

IRIS relies on the CodeQL Action bundle, which includes CLI utilities and pre-defined queries for various CWEs and languages ("QL packs").

If you already have CodeQL installed, specify its location via the CODEQL_DIR environment variable in src/config.py. Otherwise, download an appropriate version of the CodeQL Action bundle from the CodeQL Action releases page.

  • For the latest version: Visit the latest release and download the appropriate bundle for your OS:

    • codeql-bundle-osx64.tar.gz for macOS
    • codeql-bundle-linux64.tar.gz for Linux
  • For a specific version (e.g., 2.15.0): Go to the CodeQL Action releases page, find the release tagged codeql-bundle-v2.15.0, and download the appropriate bundle for your platform.

After downloading, extract the archive in the project root directory:

tar -xzf codeql-bundle-<platform>.tar.gz

This should create a sub-directory codeql/ with the executable codeql inside.

Lastly, add the path of this executable to your PATH environment variable:

export PATH="$PWD/codeql:$PATH"

Note: Also adjust the environment variable CODEQL_QUERY_VERSION in src/config.py according to the instructions therein. For instance, for CodeQL v2.15.0, this should be 0.8.0.

Visualizer

IRIS comes with a visualizer to view the SARIF output files. More detailed instructions can be found in the docs.

iris visualizer

Usage:

  1. Configure paths: Edit config.json to point to your outputs and source directories
  2. Start the server: Run python3 server.py
  3. Open in browser: Navigate to http://localhost:8000
  4. Select a project: Choose a project from the dropdown to load its analysis results
  5. Filter and explore: Use the CWE and model filters to explore specific vulnerabilities

💫 Contributions

We welcome any contributions, pull requests, or issues! If you would like to contribute, please either file a new pull request or issue. We'll be sure to follow up shortly!

🤝 Our Team

IRIS is a collaborative effort between researchers at Cornell University and the University of Pennsylvania. Please reach out to us if you have questions about IRIS.

Students

Claire Wang, University of Pennsylvania

Amartya Das, Ward Melville High School

Derin Gezgin, Connecticut College

Zhengdong (Forest) Huang, Southern University of Science and Technology

Nevena Stojkovic, Massachusetts Institute of Technology

Faculty

Ziyang Li, Johns Hopkins University, previously PhD student at the University of Pennsylvania

Saikat Dutta, Cornell University

Mayur Naik, University of Pennsylvania

Cornell University University of Pennsylvania

✍️ Citation & license

MIT license. Check LICENSE.md.

If you find our work helpful, please consider citing our ICLR'25 paper:

@inproceedings{li2025iris,
title={LLM-Assisted Static Analysis for Detecting Security Vulnerabilities},
author={Ziyang Li and Saikat Dutta and Mayur Naik},
booktitle={International Conference on Learning Representations},
year={2025},
url={https://arxiv.org/abs/2405.17238}
}

Arxiv Link

About

A neurosymbolic framework for vulnerability detection in code

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages

  • Python 58.3%
  • CodeQL 28.0%
  • JavaScript 7.5%
  • CSS 2.7%
  • HTML 2.0%
  • Shell 0.8%
  • Dockerfile 0.7%