SoK-Vul4C/SoK

SoK: Automated Vulnerability Repair: Methods, Tools, and Assessments

This repository contains the benchmark dataset Vul4C and the experimental results for automated vulnerability repair (AVR) tools for C/C++.

The conference version of our paper is available at conference-version-link, and the full version is at full-version-link.

Table of Contents

  1. Repository Structure
  2. Benchmark Dataset Vul4C
  3. Experimental Tools and Results
  4. Cite
  5. Contact

1. Repository Structure

This repository is structured as follows:

|----- Vul4C-Benchmark
    |----- [Software]
        |----- [CVE ID]
            |----- [CVE ID]_[CWE ID]_[filename].diff 
            |----- [CVE ID]_[CWE ID]_[filename]_NEW.c
            |----- [CVE ID]_[CWE ID]_[filename]_OLD.c
            |----- README.txt 
            |----- exploit
            |----- setup.sh
|----- Vul4C_Src: Source code for command line tool.
|----- Results
    |----- Results.xlsx: All experimental results.
    |----- [Vulnerability Repair Tools]
        |----- [Software]
            |----- [CVE ID]
                |----- 50-Candidates: This folder contains all 50 candidates generated by models. (Only for learning-based methods.)
                |----- Candidate Patches: This folder contains all patches generated by vulnerability repair tools. 
                                          (For learning-based methods, this folder contains all successfully restored patches within original 50 generated candidates.)
                |----- Compilable Patches: This folder contains all successfully compiled patches within all candidate patches.
                |----- Plausible Patches: This folder contains all patches that successfully pass vulnerability exploit test within all compilable patches.
                |----- Correct Patches: This folder contains all correct patches assessed by humans.
|----- test
    |----- [Software]
        |----- test.sh: Script used for compiling and testing the software.
|----- train_valid_data
    |----- train.csv: Training data for learning-based tools.
    |----- valid.csv: Validation data for learning-based tools.
|----- setup.py: setup script for command line tool.
|----- README.md
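
The file naming convention in the tree above can be decoded programmatically. The following is a minimal sketch, not part of the Vul4C tooling: the function name `parse_benchmark_filename` and the `BenchmarkFile` type are hypothetical, and the regular expression assumes that neither the CVE ID nor the CWE ID contains an underscore and that source files end in `_NEW.c` or `_OLD.c` as shown in the tree.

```python
import re
from typing import NamedTuple, Optional


class BenchmarkFile(NamedTuple):
    """Hypothetical record for one file under Vul4C-Benchmark/[Software]/[CVE ID]/."""
    cve_id: str
    cwe_id: str
    filename: str  # original source file name, e.g. "readelf.c"
    kind: str      # "diff", "new", or "old"


# Assumed pattern: [CVE ID]_[CWE ID]_[filename] followed by .diff, _NEW.c, or _OLD.c.
_PATTERN = re.compile(
    r"^(?P<cve>CVE-\d{4}-\d+)_(?P<cwe>[^_]+)_(?P<name>.+?)"
    r"(?P<suffix>\.diff|_NEW\.c|_OLD\.c)$"
)

_KINDS = {".diff": "diff", "_NEW.c": "new", "_OLD.c": "old"}


def parse_benchmark_filename(name: str) -> Optional[BenchmarkFile]:
    """Split a benchmark file name into its parts, or return None if it does not match."""
    match = _PATTERN.match(name)
    if match is None:
        return None
    return BenchmarkFile(
        cve_id=match.group("cve"),
        cwe_id=match.group("cwe"),
        filename=match.group("name"),
        kind=_KINDS[match.group("suffix")],
    )
```

Files that do not follow the convention, such as `README.txt` or `setup.sh`, yield `None`, so the function can be used as a filter while walking a CVE directory.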

2. Benchmark Dataset Vul4C

2.1 Details about Vul4C

Our benchmark dataset Vul4C contains 144 vulnerabilities spanning 19 CWE types across 23 software projects.

The statistics of Vul4C are as follows.

SH-SL = Single-Hunk, Single-Line; SH-ML = Single-Hunk, Multiple-Line;

MH-SF = Multiple-Hunk, Single-File; MH-MF = Multiple-Hunk, Multiple-File.

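| CWE Type | Total | SH-SL | SH-ML | MH-SF | MH-MF |
|----------|-------|-------|-------|-------|-------|
| CWE-119  | 35    | 7     | 11    | 11    | 6     |
| CWE-125  | 29    | 2     | 5     | 12    | 10    |
| CWE-476  | 16    | 1     | 4     | 10    | 1     |
| CWE-369  | 11    | 1     | 8     | 2     | 0     |
| CWE-190  | 9     | 0     | 0     | 6     | 3     |
| CWE-787  | 14    | 4     | 1     | 5     | 4     |
| CWE-20   | 6     | 0     | 1     | 2     | 3     |
| CWE-416  | 4     | 0     | 0     | 4     | 0     |
| CWE-835  | 4     | 0     | 1     | 2     | 1     |
| CWE-189  | 2     | 1     | 0     | 1     | 0     |
| CWE-617  | 2     | 0     | 1     | 1     | 0     |
| CWE-120  | 1     | 0     | 0     | 0     | 1     |
| CWE-415  | 1     | 0     | 1     | 0     | 0     |
| CWE-704  | 1     | 0     | 0     | 1     | 0     |
| CWE-770  | 1     | 0     | 1     | 0     | 0     |
| CWE-191  | 1     | 1     | 0     | 0     | 0     |
| CWE-682  | 1     | 0     | 0     | 0     | 1     |
| CWE-843  | 1     | 0     | 0     | 1     | 0     |
| N/A      | 5     | 0     | 1     | 3     | 1     |
| Total    | 144   | 17    | 35    | 61    | 31    |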
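
As a quick sanity check, the statistics above can be recomputed from the per-CWE rows. The sketch below simply copies the numbers from the table; the helper names `column_totals` and `inconsistent_rows` are illustrative and not part of the repository.

```python
# Rows copied from the table above:
# (CWE type, Total, SH-SL, SH-ML, MH-SF, MH-MF)
ROWS = [
    ("CWE-119", 35, 7, 11, 11, 6),
    ("CWE-125", 29, 2, 5, 12, 10),
    ("CWE-476", 16, 1, 4, 10, 1),
    ("CWE-369", 11, 1, 8, 2, 0),
    ("CWE-190", 9, 0, 0, 6, 3),
    ("CWE-787", 14, 4, 1, 5, 4),
    ("CWE-20", 6, 0, 1, 2, 3),
    ("CWE-416", 4, 0, 0, 4, 0),
    ("CWE-835", 4, 0, 1, 2, 1),
    ("CWE-189", 2, 1, 0, 1, 0),
    ("CWE-617", 2, 0, 1, 1, 0),
    ("CWE-120", 1, 0, 0, 0, 1),
    ("CWE-415", 1, 0, 1, 0, 0),
    ("CWE-704", 1, 0, 0, 1, 0),
    ("CWE-770", 1, 0, 1, 0, 0),
    ("CWE-191", 1, 1, 0, 0, 0),
    ("CWE-682", 1, 0, 0, 0, 1),
    ("CWE-843", 1, 0, 0, 1, 0),
    ("N/A", 5, 0, 1, 3, 1),
]


def column_totals(rows):
    """Sum each numeric column: (Total, SH-SL, SH-ML, MH-SF, MH-MF)."""
    return tuple(sum(row[i] for row in rows) for i in range(1, 6))


def inconsistent_rows(rows):
    """Return the CWE types whose Total differs from the sum of the four categories."""
    return [row[0] for row in rows if row[1] != sum(row[2:])]


total, sh_sl, sh_ml, mh_sf, mh_mf = column_totals(ROWS)
multi_hunk_share = (mh_sf + mh_mf) / total  # fraction of fixes touching more than one hunk
print(column_totals(ROWS), inconsistent_rows(ROWS), f"{multi_hunk_share:.1%}")
```

The recomputed column totals match the Total row (144, 17, 35, 61, 31), and 92 of the 144 vulnerabilities, roughly 64%, have multi-hunk fixes.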

2.2 Usage

2.2.1 Start

Since our benchmark consists of multiple projects, each with distinct compilation environment requirements, we highly recommend using our pre-configured Docker image: vul4c/vul4c:4.0.

2.2.2 Command

We have built a command-line tool for Vul4C. The available commands are:

   vul4c info -i <CVE ID>                           # Print information about a vulnerability  
   vul4c status                                     # List vul4c requirements and their availability  
   vul4c checkout -i <CVE ID> -d <checkout dir>     # Checkout the vulnerability
   vul4c compile -d <checkout dir>                  # Compile the checked out vulnerability
   vul4c reproduce -d <checkout dir>                # Exploit the checked out vulnerability
   vul4c test -d <checkout dir>                     # Test the checked out vulnerability with test suite
   vul4c apply -d <checkout dir> -p <patch file>    # Apply the patch to the vulnerability
   vul4c validate -d <checkout dir> -p <patch file> # Exploit and test the checked out vulnerability before and after patching

2.2.3 Example

Taking CVE-2017-7607 as an example, the workflow is as follows:

  1. First, you need to install the command-line tool by running python3 setup.py install in the SoK root directory. After installation, you can enter vul4c in the command line to check whether the installation was successful.

  2. To checkout a vulnerability into the specified directory, use the command:
    vul4c checkout -i CVE-2017-7607 -d /root/test/CVE-2017-7607

  3. To compile the checked out vulnerability, use the command:
    vul4c compile -d /root/test/CVE-2017-7607
    By default, compilation adds sanitizer options so that the exploit can be run; these options may cause errors when executing the test suite. To run the test suite, add the --no-flags option:

    vul4c compile -d /root/test/CVE-2017-7607 --no-flags

  4. To exploit the checked out vulnerability, use the command:
    vul4c reproduce -d /root/test/CVE-2017-7607

  5. To apply a patch to the checked out vulnerability, use the command:
    vul4c apply -d /root/test/CVE-2017-7607 -p patch.diff
    Here, the patch file should conform to the standard diff format. We recommend using the following command to generate the patch file:
    diff -u OLD.c NEW.c > patch.diff
    The format of the patch file is similar to the following:

     --- Vul4C-Benchmark/elfutils/CVE-2017-7607/CVE-2017-7607_CWE-125_readelf.c_OLD.c        2025-06-06 16:26:00.000000000 +0000
     +++ Vul4C-Benchmark/elfutils/CVE-2017-7607/CVE-2017-7607_CWE-125_readelf.c_NEW.c        2025-06-06 16:26:00.000000000 +0000
     @@ -3262,7 +3262,7 @@
                 ++nsyms;
                 if (maxlength < ++lengths[cnt])
                 ++maxlength;
     -           if (inner > max_nsyms)
     +           if (inner >= max_nsyms)
                 goto invalid_data;
             }
             while ((chain[inner++] & 1) == 0);
    

    Patches can also be applied with git apply by adding the -g option. In that case, make sure the patch file conforms to the standard git patch format:

    vul4c apply -d /root/test/CVE-2017-7607 -p patch.diff -g

  6. To compare the exploit and test results of the checked out vulnerability before and after patching, use the command:

    vul4c validate -d /root/test/CVE-2017-7607 -p /home/SoK-main/Vul4C-Benchmark/elfutils/CVE-2017-7607/CVE-2017-7607_CWE-125_readelf.c.diff

    The exploit and test outputs for vulnerabilities before and after patching will be saved to the following files respectively:

    /root/test/CVE-2017-7607/VUL4C/reproducing_old.log
    /root/test/CVE-2017-7607/VUL4C/reproducing_new.log
    /root/test/CVE-2017-7607/VUL4C/testing_old.log
    /root/test/CVE-2017-7607/VUL4C/testing_new.log
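
Step 5 above recommends `diff -u` for generating patch files. When candidate patches are produced inside a Python pipeline, a similar unified diff can be generated with the standard library's `difflib`. This is a sketch under assumptions: the helper name `make_unified_patch` is hypothetical, the output omits the timestamps that `diff -u` writes in the header, and whether `vul4c apply` accepts such output has not been verified here.

```python
import difflib


def make_unified_patch(old_text: str, new_text: str,
                       old_label: str = "OLD.c", new_label: str = "NEW.c") -> str:
    """Return a unified diff (3 lines of context) between two source texts.

    Returns an empty string when the texts are identical.
    """
    diff_lines = difflib.unified_diff(
        old_text.splitlines(keepends=True),
        new_text.splitlines(keepends=True),
        fromfile=old_label,
        tofile=new_label,
    )
    return "".join(diff_lines)


# Toy input modelled on the CVE-2017-7607 fix shown in step 5.
old_src = "if (inner > max_nsyms)\n  goto invalid_data;\n"
new_src = "if (inner >= max_nsyms)\n  goto invalid_data;\n"
print(make_unified_patch(old_src, new_src), end="")
```

Unlike `diff -u`, `difflib` does not emit a "\ No newline at end of file" marker, so inputs should end with a newline to keep the result well-formed.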
    

3. Experimental Tools and Results

The following tools were used in the evaluation in our paper. Some of them can be found in the Docker repository vul4c. We strongly recommend using the original Docker images from the authors to ensure you have the latest features and fixes (if any).

3.1 Automated Vulnerability Repair Tools

| Tool       | Venue       | Repository                                                     |
|------------|-------------|----------------------------------------------------------------|
| VulRepair  | ESEC/FSE'22 | https://github.com/awsm-research/VulRepair                     |
| VRepair    | TSE'23      | https://github.com/ASSERT-KTH/VRepair                          |
| VQM        | TOSEM'24    | https://github.com/awsm-research/VQM                           |
| VulMaster  | ICSE'24     | https://github.com/soarsmu/VulMaster_                          |
| ExtractFix | TOSEM'20    | https://extractfix.github.io/                                  |
| VulnFix    | ISSTA'22    | https://github.com/yuntongzhang/vulnfix                        |
| Senx       | S&P'19      | Not open source; we requested the artifacts from the authors   |
| Seader     | ICPC'22     | https://github.com/NiSE-Virginia-Tech/ying-ICPC-2022           |
| SeqTrans   | TSE'23      | https://github.com/chijianlei/SeqTrans                         |

3.2 Automated Program Repair Tools

| Tool        | Venue    | Repository                                            |
|-------------|----------|-------------------------------------------------------|
| CquenceR    | ISSRE'21 | https://github.com/epicosy/CquenceR                   |
| NTR         | ICSE'25  | https://sites.google.com/view/neuraltemplaterepair    |
| ThinkRepair | ISSTA'24 | https://github.com/vinci-grape/ThinkRepair            |
| SRepair     | arXiv'24 | https://github.com/GhabiX/SRepair                     |

The experimental results can be found in Tables 14 and 15 of the full version of the paper. The validation results for all patches generated in our experiments can be found in the Results directory of this repository (see Repository Structure for more information).
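
The per-CVE folders in the Results directory form a funnel from candidate to correct patches, which can be tallied per tool. The sketch below is illustrative only: the function names are hypothetical, the stage folder names are taken from the Repository Structure section, and it assumes that each patch is stored as exactly one file inside its stage folder.

```python
from pathlib import Path
from typing import Dict

# Stage folder names as listed in the Repository Structure section.
STAGES = [
    "Candidate Patches",
    "Compilable Patches",
    "Plausible Patches",
    "Correct Patches",
]


def count_stage_files(cve_dir: Path) -> Dict[str, int]:
    """Count the files in each stage folder of one Results/[Tool]/[Software]/[CVE ID] directory."""
    counts = {}
    for stage in STAGES:
        stage_dir = cve_dir / stage
        if stage_dir.is_dir():
            counts[stage] = sum(1 for p in stage_dir.rglob("*") if p.is_file())
        else:
            counts[stage] = 0  # a missing folder is treated as "no patches at this stage"
    return counts


def summarize_tool(tool_dir: Path) -> Dict[str, int]:
    """Aggregate stage counts over every [Software]/[CVE ID] directory of one tool."""
    totals = {stage: 0 for stage in STAGES}
    for software_dir in sorted(p for p in tool_dir.iterdir() if p.is_dir()):
        for cve_dir in sorted(p for p in software_dir.iterdir() if p.is_dir()):
            for stage, count in count_stage_files(cve_dir).items():
                totals[stage] += count
    return totals
```

For example, `summarize_tool(Path("Results") / "VulRepair")` would return the four stage counts for that tool, provided the directory follows the layout described above. The authoritative numbers remain those in Results.xlsx and the paper.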

4. Cite

@inproceedings{hu2025sok,
author = {Hu, Yiwei and Li, Zhen and Shu, Kedie and Guan, Shenghua and Zou, Deqing and Xu, Shouhuai and Yuan, Bin and Jin, Hai},
title = {SoK: automated vulnerability repair: methods, tools, and assessments},
year = {2025},
booktitle = {Proceedings of the 34th USENIX Security Symposium (USENIX Security 25)}
}

5. Contact

If you have any questions about this research, please contact the first author ([firstName][lastName]@hust.edu.cn) or the corresponding author (zh_li@hust.edu.cn). We will reply within one week.
