CFGFast appears to truncate some functions #6299

@NikhilC2209

Description

While building a CFG for an ARM binary, I noticed that angr seems to truncate some functions and miss some of their blocks, even when resolve_indirect_jumps is set to True.

I first saw this bug on a large ARM-based OpenMV binary, but I was also able to reproduce it on smaller ARM and x86 binaries.

ARM binary: arm_binary.tar.gz
x86 binary: x86_64_binary.tar.gz

Minimal script to load a binary and print the disassembly of a given function:

#!/usr/bin/env python3
import sys

import angr


if len(sys.argv) < 3:
    print(f"Usage: python {sys.argv[0]} <binary_path> <function_address_hex>")
    sys.exit(1)

binary_path = sys.argv[1]
function_address = int(sys.argv[2], 16)

p = angr.Project(binary_path, auto_load_libs=False, load_debug_info=True)
print(f"Analyzing CFG for {binary_path}...")
cfg = p.analyses.CFGFast(
    normalize=True,
    resolve_indirect_jumps=True,
    detect_tail_calls=True,
    data_references=True,
    force_smart_scan=True,
    start_at_entry=True,
)
#cfg = p.analyses.CFGFast()
print("CFG analysis complete.")

try:
    target_function = cfg.functions[function_address]
except KeyError:
    target_function = None

if target_function is None:
    print(f"Function not found at address {hex(function_address)}")
    sys.exit(1)

total_size = 0

print(f"\n== Function Disassembly ==\nname: {target_function.name}\naddr: {hex(target_function.addr)}")
# Sum the sizes of all recovered basic blocks to get the function size as angr sees it.
for block in target_function.blocks:
    total_size += block.size
    print(f"\nBLOCK {hex(block.addr)} size={block.size}")
    print(block.disassembly)

print(f"Total func size: {total_size}")
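
To see which bytes are missing rather than only the total, the block list printed above can be compared against the address range covered by the symbol. The helper below is a rough sketch, not part of angr: `uncovered_ranges` is a hypothetical name, and it assumes the symbol size is obtained separately (for example from readelf).

```python
def uncovered_ranges(func_addr, symbol_size, blocks):
    """Return [start, end) byte ranges inside the symbol that no block covers.

    blocks is an iterable of (block_addr, block_size) pairs.
    """
    end = func_addr + symbol_size
    gaps = []
    cursor = func_addr  # first byte not yet known to be covered
    for addr, size in sorted(blocks):
        if addr >= end:
            break  # block lies past the end of the symbol
        if addr > cursor:
            gaps.append((cursor, addr))
        cursor = max(cursor, addr + size)
    if cursor < end:
        gaps.append((cursor, end))
    return gaps


# Made-up addresses, only to show the shape of the result.
for start, stop in uncovered_ranges(0x1000, 0x40, [(0x1000, 0x10), (0x1020, 0x8)]):
    print(f"uncovered: {start:#x}-{stop:#x} ({stop - start} bytes)")
```

With the script above, the block list would be `[(b.addr, b.size) for b in target_function.blocks]`. Some gaps may be legitimate non-code bytes such as alignment padding or inline data, so the reported ranges are a starting point for inspection rather than proof of a missed block.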

Steps to reproduce the bug

  • Run the script on the binary and pass a function address.
  • For the ARM binary, use main at 0x11374, which is affected by this. For the x86_64 binary, use iocop_device at 0x401f70.
  • For the ARM binary, angr reports a function size of 820 bytes.
  • The ELF symbol table (read with readelf) gives a size of 972 bytes for the same function.
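
The comparison in the last step can be scripted. The sketch below parses the text printed by `readelf -s -W` and looks up a function's symbol size. The sample row is illustrative: the address and size for main come from this report, while the symbol index and section number are placeholders, and `function_sizes` is a hypothetical helper name.

```python
import re

# One row of `readelf -s -W` output looks like:
#   "     1: 00011374   972 FUNC    GLOBAL DEFAULT   13 main"
SYMBOL_ROW = re.compile(
    r"^\s*\d+:\s+(?P<value>[0-9a-fA-F]+)\s+(?P<size>\S+)\s+(?P<type>\S+)\s+"
    r"\S+\s+\S+\s+\S+\s+(?P<name>\S+)"
)


def function_sizes(readelf_output):
    """Map function name -> (address, size) from `readelf -s` text."""
    sizes = {}
    for line in readelf_output.splitlines():
        match = SYMBOL_ROW.match(line)
        if match is None or match["type"] != "FUNC":
            continue
        raw_size = match["size"]
        # readelf prints large sizes as 0x-prefixed hex, small ones in decimal.
        size = int(raw_size, 16) if raw_size.startswith("0x") else int(raw_size)
        sizes[match["name"]] = (int(match["value"], 16), size)
    return sizes


# Illustrative input; index 1 and section 13 are placeholders.
SAMPLE = """\
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00011374   972 FUNC    GLOBAL DEFAULT   13 main
"""

addr, symbol_size = function_sizes(SAMPLE)["main"]
recovered_size = 820  # total block size angr reported for main in this issue
print(f"main @ {addr:#x}: symbol={symbol_size} recovered={recovered_size} "
      f"missing={symbol_size - recovered_size}")
```

In practice the input text would come from running readelf on the binary, for example through `subprocess.run(["readelf", "-s", "-W", binary_path], capture_output=True, text=True).stdout`.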

Environment

  • angr: 9.2.207
  • Python: 3.10.17
  • OS: Arch Linux
  • Architecture tested:
    • ARM32 ELF: cpio-2.12_clang-4.0_arm_32_O0_rmt
    • x86_64 ELF: cpio-2.12_clang-8.0_x86_64_O0_rmt

Additional context

The issue is much more prevalent on the ARM binary: even with resolve_indirect_jumps set to True, around 54 out of 147 functions were truncated, whereas on the x86 binary only one function, iocop_device, is truncated.
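
A count like this can be produced by comparing every function symbol against the size recovered by the CFG. The following is a minimal sketch of one way to do that, not necessarily how the numbers above were obtained. `find_truncated` and `collect_sizes` are hypothetical helper names, and `collect_sizes` assumes that cle exposes function symbols through `project.loader.main_object.symbols` with `is_function`, `rebased_addr`, `size` and `name` attributes.

```python
def find_truncated(symbol_sizes, recovered_sizes):
    """List functions whose recovered size is smaller than the symbol size.

    symbol_sizes:    {address: (name, size_from_symbol_table)}
    recovered_sizes: {address: total_block_size_from_cfg}
    Returns (name, address, symbol_size, recovered_size) tuples sorted by address.
    """
    truncated = []
    for addr, (name, sym_size) in sorted(symbol_sizes.items()):
        recovered = recovered_sizes.get(addr)
        if recovered is not None and recovered < sym_size:
            truncated.append((name, addr, sym_size, recovered))
    return truncated


def collect_sizes(binary_path):
    """Gather both size maps with angr (imported lazily; assumes angr is installed)."""
    import angr

    project = angr.Project(binary_path, auto_load_libs=False)
    cfg = project.analyses.CFGFast(normalize=True, resolve_indirect_jumps=True)
    symbol_sizes = {
        sym.rebased_addr: (sym.name, sym.size)
        for sym in project.loader.main_object.symbols
        if sym.is_function and sym.size  # skip imports and zero-sized symbols
    }
    recovered_sizes = {
        func.addr: sum(block.size for block in func.blocks)
        for func in cfg.functions.values()
    }
    return symbol_sizes, recovered_sizes


# The values for main are the ones from this report.
print(find_truncated({0x11374: ("main", 972)}, {0x11374: 820}))
```

Calling `find_truncated(*collect_sizes(binary_path))` would then list every function whose blocks add up to less than its symbol size. Functions that the CFG did not recover at all are skipped here rather than counted as truncated.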

Labels

bug: Something is broken
