Skip to content

Symex-rs/Symex

Repository files navigation

SYMEX

Symex is a symbolic execution engine, producing cycle accurate Worst Case Execution Time estimates for certain architectures under certain conditions. This is achieved by operating on compile binaries for one of the following targets at the users discretion:

  • ARMv6-M
  • ARMv7-M/ARMv7-EM
  • RISC-V (only RV32I base integer instruction set is currently supported), for the Hippomenes architecture.

As the engine operates on compiled binaries it is language agnostic at the core. However, the testing has been focused on Rust applications, more specifically RTIC-v1 applications.

Getting started

The easiest way to use Symex is by the cargo-symex tool.

Symex has support for a number of SMT solver backends, the default, and recommended, choice is Bitwuzla. Installing cargo-symex using Bitwuzla as backend, there are two options, either using a standalone, system-wide installation of Bitwuzla, or building Bitwuzla directly as part of building symex.

Installing using standalone Bitwuzla

This requires Bitwuzla is installed on your system, this is typically available from your system package manager. Once Bitwuzla has been installed, one can install cargo-symex by running:

cargo install --path cargo-symex

Installing without standalone Bitwuzla

If installing Bitwuzla shows to be problematic, both symex and cargo-symex include a feature build-bitwuzla which builds bitwuzla instead of using the system version, this doesn't require a standalone installation of Bitwuzla, although it does require Cadical, which is also typically available from your system package manager.

Once Cadical has been installed, one can install cargo-symex by running:

cargo install --path cargo-symex --feature=build-bitwuzla
If the compiler does not find Cadical If you are building with bitwuzla as the solver you need to install all of bitwuzlas dependencies, this includes a SAT solver called Cadical which likely can be installed with your system package manager. Please install this and try again.
For Mac users If installing Cadical via e.g. Homebrew, `rustc` does not by default look for libraries in the Homebrew lib dir, you will need to point it in the right direction.

For instance,

RUSTFLAGS='-L /opt/homebrew/lib' cargo install --path=cargo-symex --features=build-bitwuzla

adds the Homebrew lib directory to the library search path and builds and installs cargo-symex.

This can then be used to compile examples or binaries and executing the (mangled) name of the function like so:

cargo symex --example [example name] --function [function name] (--release) # for examples
cargo symex --bin [example name] --function [function name] (--release) # for binaries

Or it can be used to run a pre-compiled binary by calling it like so

cargo symex --path [path to .elf file] --function [function name] (--release)

Application notes

Analysis of compiled binaries has a few caveats, namely:

  • To analyze a function it must have an entry in the .symtab section of the elf file. All symbols in an elf file can be shown using the readelf -s [path to elf file] command. To tell rustc to not mangle the function name the attribute #[no_mangle] can be used.
  • When using symex-lib functions or to be able to detect panic the debug-data must be included in the elf file.
  • Symex can be directly used as a library see wcet-analasis-example directory for examples of how to do that.

Notes on the max cycle count on ARMv6-M

The max cycle count for each path is calculated by counting the number of cycles for each instruction according to this document. It assumes a core without wait-states.

Notes on cycle counting for ARMv7-(E)M

The cycle counting model does not contain any model of a branch predictor. This means that the branching model always flushes the pipeline thus incurring a lot more cycles estimated as soon as a branch can be predicted by the hardware. This could be improved greatly by adding a branch prediction model. The cycle counting model also assumes that the code encounters zero wait states, i.e. the code is running from RAM. The Cycle counts for each instruction are based on the cortex-m4 documentation.

Note on cycle counting for RISC-V

The cycle counts are based on the single-cycle, non-pipelined Hippomenes architecture.

Limitations for ARMv7-(E)M

The ARMv7 support has partial implementations for DSP and the floating point extension. The DSP extension is parsable by the disarmv7 but is not implemented in the decoder. Armv7 has support for hardware semaphores, at the time of writing these are not implemented in Symex. Finally, the ARMv7 ISA defines floating point operations, most of these are partially supported. However, as outlined in 2

Future work planned or unplanned

Improve testing suite

The assembly to Symex GA translators needs further testing.

  • The ARMv7 translator is only $\approx 26%$ tested, due to the nature of this project we should strive to improve this.
  • The ARMv6 translator lacks direct testing and should also be improved.
Test percentage

The 26 percent comes from llvm-test-cov which and corresponds to the lines covered. The reasoning behind including this number is simply that it shows that more testing is needed.

Include DSP instructions

The current (v7) implementation lacks support for DSP instructions, most of these can be implemented without large changes.

Include support for hardware semaphores

This is nontrivial as it extensive modeling of the system if it is to be useful. However, we could implement the baseline definition from the data sheet if we simply added a hashmap to keep track of which memory addresses are subject to a semaphore.

Support more RISC-V instruction sets

Currently limited to RV32I base integer instruction set (Hippomenes architecture).

Building

Dependencies

  • bitwuzla, Bitwuzla is a Satisfiability Modulo Theories (SMT) solver for the theories of fixed-size bit-vectors, arrays and uninterpreted functions.
  • (Optional) boolector, Boolector is a Satisfiability Modulo Theories (SMT) solver for the theories of fixed-size bit-vectors, arrays and uninterpreted functions.

Debug output from SYMEX

The implementation uses the Rust log framework. You can set the logging level to the environment variable RUST_LOG. See below example (assumes the cargo-sub command symex).

> RUST_LOG=DEBUG cargo symex ...

If you want to narrow down the scope of logging you can give a list of modules to log.

> RUST_LOG="symex=debug" cargo symex ...

Symex uses different logging levels:

  • info, high level logging of steps taken.
  • debug, general logging of important actions.
  • trace, further information on internal operations and data structures.

You can also narrow down the scope to specific modules, e.g. the executor.

> RUST_LOG="symex::executor=trace" cargo symex ...

License

Licensed under either of

at your option.

Contribution

Unless you explicitly state otherwise, any contribution intentionally submitted for inclusion in the work by you, as defined in the Apache-2.0 license, shall be dual licensed as above, without any additional terms or conditions.

Acknowledgement

This tool is the product three thesis projects and other course work at Luleå University Of Technology. Two of these thesis projects have been performed with financial support by Grepit AB.

The thesis projects, in chronological order, are as follows

About

Symbolic execution of compiled binaries

Resources

License

Unknown, MIT licenses found

Licenses found

Unknown
LICENSE-APACHE
MIT
LICENSE-MIT

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Contributors 8

Languages