Fuzzing Crypto

Code release for

Fenzi, G., Gilcher, J., Virdia, F. (2026). Finding Bugs and Features Using Cryptographically-Informed Functional Testing. IACR Transactions on Cryptographic Hardware and Embedded Systems, 2026(1). https://eprint.iacr.org/2024/1122

In this repository, we include instructions to run our tests, instructions to run a baseline of generic fuzzing, and an explanation of how our source code is structured.

Dependencies

Docker
(Soft dependency): a server with 16+ cores to run the tests in parallel. See Table 2 of the paper for wall times for test completion on 31-core and 75-core machines.

Instructions to run

bash run.sh

A terminal within the container will open.

To run experiments on liboqs

To run experiments on version:

0.14.0, replace ver_liboqs with ches_liboqs below.
0.8.0, replace ver_liboqs with cur_liboqs below.
0.4.0, replace ver_liboqs with mid_liboqs below.
2018-11, replace ver_liboqs with old_liboqs below.

Within the container, run

bash reproduce.sh ver_liboqs

Comment out the values in BLACKLIST on lines 35-42 of fuzz_liboqs.py if you want a full run (it will take significantly longer).
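The version-to-target substitution described above can be sketched as a small lookup. This is only an illustration: the dictionary is transcribed from the list in this section, while the helper name reproduce_command is hypothetical and not part of the repository.

```python
import shlex

# Version labels and target names are copied from the list in this section.
LIBOQS_TARGETS = {
    "0.14.0": "ches_liboqs",
    "0.8.0": "cur_liboqs",
    "0.4.0": "mid_liboqs",
    "2018-11": "old_liboqs",
}


def reproduce_command(version: str) -> str:
    """Return the command to run inside the container for a liboqs version.

    Hypothetical helper: it only builds the command string, it does not run it.
    """
    try:
        target = LIBOQS_TARGETS[version]
    except KeyError:
        raise ValueError(f"unknown liboqs version: {version!r}") from None
    return shlex.join(["bash", "reproduce.sh", target])


if __name__ == "__main__":
    for version in LIBOQS_TARGETS:
        print(f"{version}: {reproduce_command(version)}")
```

Building the string rather than executing it keeps the sketch safe to run outside the container.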

To run experiments on supercop 20240107

Within the container, run

bash reproduce.sh supercop

Reports

The reports generated by the code above can be found inside /reports/, which is mounted as a volume within the Docker container. The reports are generated in three formats: a SQLite database, an Excel file, and a LaTeX table.

The reports from the experiments described in the paper can be found in /paper_reports/. The SQLite format was omitted due to the size of the databases.
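Since the SQLite schema is not documented here, a freshly generated database can be explored without assuming any table names. The following is a minimal sketch using only Python's standard sqlite3 module; the function name list_tables is hypothetical, and the database is opened read-only so that a mistyped path does not create an empty file.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_tables(db_path: str) -> dict:
    """Map every table name found in the database to its row count.

    No schema is assumed: table names are read from sqlite_master.
    """
    # mode=ro opens the file read-only and fails if it does not exist.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    with closing(sqlite3.connect(uri, uri=True)) as conn:
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            # Quote the identifier, doubling any embedded double quotes.
            quoted = '"' + name.replace('"', '""') + '"'
            counts[name] = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
```

Calling list_tables on one of the database files under /reports/ returns the tables it contains; from there, individual tables can be queried with ordinary SELECT statements.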

Reading the reports

The reports generated by our code refer to the tests by name rather than by the numbers used in the paper. The mapping between the two is the following:

Paper test number	Paper test name	Report test name
Test 1	Hash(Maul(x))	this is the test performed on SUPERCOP
Test 2	Gen(; Maul(r))	KEM/Keygen/badrng and SIGN/Keygen/badrng
Test 3	Encaps(Maul(pk); r)	KEM/Encaps/pk-0
Test 4	Decaps(sk, Encaps(Maul(pk); r))	KEM/Encaps/pk
Test 5	Encaps(pk; Maul(r))	KEM/Encaps/badrng
Test 6	Decaps(Maul(sk), c)	KEM/Decaps/sk
Test 7	Decaps(sk, Maul(c))	KEM/Decaps/c
Test 8	Sign(Maul(sk), m; r)	SIGN/Sign/sk
Test 9	Sign(sk, Maul(m); r)	SIGN/Sign/m
Test 10	Sign(sk, m; Maul(r))	SIGN/Sign/badrng
Test 11	Verify(Maul(pk), m, sigma)	SIGN/Verify/pk
Test 12	Verify(pk, Maul(m), sigma)	SIGN/Verify/m
Test 13	Verify(pk, m, Maul(sigma))	SIGN/Verify/sig
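When post-processing reports, the table above can be encoded as a lookup. The sketch below transcribes the table as-is; the names REPORT_TO_PAPER and paper_test are hypothetical. Test 1 has no entry because the table gives no report test name for it (it is the test performed on SUPERCOP).

```python
# Report test name -> paper test number, transcribed from the table above.
# Test 1 (Hash(Maul(x))) is the SUPERCOP test and has no report test name.
REPORT_TO_PAPER = {
    "KEM/Keygen/badrng": 2,
    "SIGN/Keygen/badrng": 2,
    "KEM/Encaps/pk-0": 3,
    "KEM/Encaps/pk": 4,
    "KEM/Encaps/badrng": 5,
    "KEM/Decaps/sk": 6,
    "KEM/Decaps/c": 7,
    "SIGN/Sign/sk": 8,
    "SIGN/Sign/m": 9,
    "SIGN/Sign/badrng": 10,
    "SIGN/Verify/pk": 11,
    "SIGN/Verify/m": 12,
    "SIGN/Verify/sig": 13,
}


def paper_test(report_name: str) -> str:
    """Translate a report test name into the paper's 'Test N' label."""
    return f"Test {REPORT_TO_PAPER[report_name]}"


if __name__ == "__main__":
    for name in REPORT_TO_PAPER:
        print(f"{name} -> {paper_test(name)}")
```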

Expected deviations

Some observed "software" crashes are partially probabilistic in nature. For example, hangs are measured by wall time, meaning that running the same tests on a slower CPU could result in more hangs being reported. Similarly, out-of-bounds memory writes may not cause segmentation faults if the memory they write to is not currently allocated to a different process.

This may result in slightly different numbers if reproducing our experiments on the same libraries but different hardware.
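To see why a wall-time threshold makes hang counts hardware dependent, consider the toy classifier below. It is not how the repository or AFL++ detects hangs; it is a sketch, using Python's subprocess module, in which the function name classify and the threshold values are assumptions chosen for illustration.

```python
import subprocess
import sys


def classify(cmd, timeout_s):
    """Run cmd and label the outcome as 'hang', 'crash', or 'exited'.

    'hang' only means the process exceeded timeout_s seconds of wall time,
    so the same program can be labelled differently on a slower machine.
    """
    try:
        proc = subprocess.run(
            cmd,
            timeout=timeout_s,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except subprocess.TimeoutExpired:
        return "hang"
    # On POSIX, a negative return code means the child was killed by a signal.
    if proc.returncode < 0:
        return "crash"
    return "exited"


if __name__ == "__main__":
    # Stand-in workload: a child process that sleeps for one second.
    slow = [sys.executable, "-c", "import time; time.sleep(1)"]
    print(classify(slow, timeout_s=30))   # generous threshold
    print(classify(slow, timeout_s=0.2))  # tight threshold
```

The same workload is reported differently depending only on the threshold, which is the effect a slower CPU has when the threshold is fixed.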

Baseline

The instructions in this section allow reproducing the experimental results from section "5.1.1 Baseline" in the paper.

First, create and run the same container as for the above experiments by running:

bash run.sh

A terminal within the container will open.

To run the baseline on liboqs

To run experiments on version:

0.14.0, replace ver_liboqs with ches_liboqs_afl below.
0.8.0, replace ver_liboqs with cur_liboqs_afl below.
0.4.0, replace ver_liboqs with mid_liboqs_afl below.
2018-11, replace ver_liboqs with old_liboqs_afl below.

Within the container, run

bash reproduce.sh ver_liboqs baseline

To run the baseline on supercop 20240107

Within the container, run

bash reproduce.sh supercop baseline

Source code structure

Dockerfile: configuration to build an environment that can reproduce results
Makefile: configuration to build dependencies for the experiments; takes care of cloning the correct snapshots of LibOQS/SUPERCOP, installing dependencies and the correct versions of the compiler, and fetching, patching and installing AFL++
build.sh / run.sh: scripts for creating Docker container to reproduce results in
reproduce.sh: wrapper script that consolidates the various steps to reproduce our experiments within the Docker container
fuzz_liboqs.py: starts the parallel testing of every implementation provided by LibOQS. For each scheme, it runs the relevant tests by internally calling AFL++ on a harness implementing the metamorphic test
fuzz_liboqs_baseline.py: similarly to fuzz_liboqs.py, this script runs the baseline fuzzing campaign.
report.py: collects crashes generated by AFL++ within fuzz_liboqs.py, generating reports in three formats: an Excel table, a SQLite database, and a less detailed LaTeX table. The Excel and SQLite reports contain every crash found, including all inputs to the algorithm being tested that cause the crash/security notion violation, and the diff between the original input and the mauled input that caused the crash/security notion violation
report_baseline.py: similar to report.py, collects results from the baseline fuzzing campaign and reports them in Excel and SQLite formats
supercop_report.py/supercop_report_baseline.py: similar to report.py/report_baseline.py, they collect crashes and build Excel and SQLite reports for SUPERCOP experiments and baseline fuzzing
paper_reports/: directory containing Excel-format reports from the experiments reported in the paper, generated with the various reporting scripts mentioned above.
/tech/paper_fuzzing/liboqs: contains C and Python code implementing our testing framework for LibOQS. Test harnesses for KEM (resp. SIGN) tests can be found in the KEM (resp. SIGN) subdirectory. For example, consider the KEM.Decaps(sk, Maul(c)) test (source files within /tech/paper_fuzzing/liboqs/KEM/Decaps/c):
- Call.c: implements the Call function from Definition 7
- GenInput.c: implements GenInput from Definition 7
- ParseInput.c: implements a program that, given a crash dump generated by AFL++ as input, prints the inputs and outputs of the Call function evaluated on that dump
- CodeGen.py: given a crash dump as input, it outputs a C source file that replicates the crashing Call evaluation as a standalone binary (useful when inspecting the cause of a crash)
- Makefile: contains the necessary commands to run AFL++ on a desired KEM and library version, using our custom mutator.
- Note: Match and Maul from Definition 7 do not appear in this directory, since they are not specific to the KEM/Decaps/c test. Instead, these are shared by most tests and can be found in /tech/paper_fuzzing/liboqs/
/tech/paper_fuzzing/supercop/crypto_hash: Implements our testing specification from Definition 7 (Call, GenInput, Maul, Match). The testing loop is implemented in supercop.sh, which tries to follow the structure of the testing scripts provided by SUPERCOP such as do-part or data-run
/tech/paper_fuzzing/utilities: contains small utility C libraries to perform operations on buffers, and an implementation of a custom PRNG for LibOQS tests where we maul the randomness source
/tech/paper_fuzzing/vanilla: structured like /tech/paper_fuzzing, it contains the source code needed to run the baseline fuzzing campaign to compare against.
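
The four roles named in Definition 7 (GenInput, Maul, Call, Match) can be illustrated with a toy version of the Hash(Maul(x)) test. This is only a sketch: the function bodies are illustrative assumptions, not the definitions from the paper. In particular, it assumes that Maul flips a single bit and that Match flags a violation when the mauled input hashes to the same digest as the original. SHA-256 from Python's hashlib stands in for the hash under test. In the repository, the harnesses are written in C and the mutation is driven by AFL++ through a custom mutator.

```python
import hashlib
import random


def gen_input(rng, length=32):
    """GenInput (assumed): produce a random byte string to be hashed."""
    return bytes(rng.getrandbits(8) for _ in range(length))


def maul(x, rng):
    """Maul (assumed): flip one randomly chosen bit of x."""
    bit = rng.randrange(len(x) * 8)
    out = bytearray(x)
    out[bit // 8] ^= 1 << (bit % 8)
    return bytes(out)


def call(x):
    """Call (assumed): evaluate the function under test, here SHA-256."""
    return hashlib.sha256(x).digest()


def match(out_original, out_mauled):
    """Match (assumed): a violation is two different inputs with equal digests."""
    return out_original == out_mauled


def run(trials, seed=0):
    """Repeat the test and collect every (original, mauled) pair that violates it."""
    rng = random.Random(seed)
    violations = []
    for _ in range(trials):
        x = gen_input(rng)
        y = maul(x, rng)
        if match(call(x), call(y)):
            violations.append((x, y))
    return violations


if __name__ == "__main__":
    print(f"violations found: {len(run(1000))}")
```

A correct hash implementation should yield no violations here; an implementation that ignores part of its input would be flagged as soon as the flipped bit falls in the ignored region.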

License

This software is distributed under the GNU General Public License version 3. See LICENSE for more details.

Contributors

Code was contributed by

Jan Gilcher
Fernando Virdia
Giacomo Fenzi
