Structor is a Hex-Rays plugin for two closely related jobs: recovering structure layouts from pointer arithmetic and fixing decompiler variable types inside functions.
Reverse engineering stripped binaries usually means working against missing type information. Hex-Rays can often recover control flow and basic expressions, but object layouts and variable types still degrade into pointer arithmetic, vague scalars, register-backed temporaries, and overlapped locals.
Structor exists to push that decompilation back toward meaningful C types. It started as a structure synthesis plugin: when Hex-Rays shows raw pointer arithmetic such as *(int *)((char *)ptr + 8), Structor can recover a structure layout, create the corresponding IDA type, apply it to the variable, and refresh pseudocode into member accesses.
The project now also includes a function type-fixing pipeline that runs on decompiled functions. That pipeline can:
- infer more specific local and argument types from observed usage
- recover types for exact-storage overlapped locals by borrowing from sibling variables
- report likely undeclared register-backed inputs that callers clearly populate, without mutating function signatures yet
- README.md: current features, APIs, configuration, build, installation, and testing
- docs/Z3_SYNTHESIS_PLAN.md: historical design plan for the Z3 synthesis path
- docs/Z3_TYPE_INFERENCE.md: research notes for richer Z3-driven type inference
- docs/CROSS_FUNCTION_SIBLING_DISCOVERY.md: focused note on one implemented cross-function analysis improvement
Structor has two public workflows.
- Collect constant-offset dereferences from a selected variable
- Merge evidence across callers and callees when enabled
- Solve a layout with Z3 when available, with heuristic fallback
- Create IDA struct and optional companion vtable types
- Propagate the resulting type to related functions and locals
- Refresh pseudocode so pointer arithmetic becomes field access
- Analyze variables in a decompiled function
- Compare current decompiler types with inferred types
- Apply significant, confident fixes to locals and arguments
- Emit warnings and diagnostics for unresolved or report-only cases
- Expose the results through IDC getters for automation and testing
Before Structor, Hex-Rays may show code like this:
void process_object(void *ptr) {
    int type = *(int *)ptr;
    void *data = *(void **)((char *)ptr + 8);
    void (*callback)(void) = *(void (**)(void))((char *)ptr + 0x10);
    if (type == 1) {
        callback();
    }
}

After synthesis, the same function can decompile as typed member access:
struct synth_process_object_0 {
    int field_0;
    int _pad_4;
    void *field_8;
    void (*field_10)(void);
};

void process_object(struct synth_process_object_0 *ptr) {
    int type = ptr->field_0;
    void *data = ptr->field_8;
    void (*callback)(void) = ptr->field_10;
    if (type == 1) {
        callback();
    }
}

Generated names vary by naming heuristics and the detected field roles.
| Capability | Notes |
|---|---|
| Z3-backed layout synthesis | Preferred by default, with fallback to heuristic synthesis |
| Cross-function analysis | Tracks pointer flow across callers and callees |
| Pointer-delta normalization | Handles subobject-style ptr + const flows |
| Array detection | Detects regular strided access patterns |
| Union creation | Represents irreducible overlapping fields |
| Nested sub-struct emission | Controlled by emit_substructs |
| Vtable detection | Can create a companion vtable struct |
| Type propagation | Propagates inferred struct types to related sites |
| Capability | Notes |
|---|---|
| Automatic decompilation-time analysis | Controlled by auto_fix_types |
| Local and argument type comparison | Uses significant-difference thresholds before applying fixes |
| Overlapped-local recovery | Recovers exact-storage siblings from better-typed peers |
| Missing register-input reporting | Warns when callers populate a non-argument register-backed input |
| IDC-accessible warnings and diagnostics | Query last-run warnings and diagnostics programmatically |
- IDA Pro with the Hex-Rays decompiler
- A valid IDA license
- 64-bit plugin build by default (IDA_EA64=ON)
- macOS, Linux, or Windows
- Z3
Z3 is detected from the system first. If no system Z3 is found, the build can fetch and build Z3 from source automatically.
export IDA_SDK_DIR=/path/to/idasdk
make

The repository Makefile also accepts IDASDK and supports:

- make
- make debug
- make install
- make clean
- make rebuild
mkdir -p build
cmake -S . -B build -DIDA_SDK_DIR=/path/to/idasdk
cmake --build build --parallel

Useful CMake options:
| Option | Meaning |
|---|---|
| IDA_SDK_DIR | Path to the IDA SDK |
| IDA_INSTALL_DIR | Optional install destination for the plugin |
| IDA_EA64 | Build for 64-bit IDA, default ON |
| BUILD_TESTS | Build unit and integration test targets |
| Z3_USE_CUSTOM | Use a specific Z3 library/include pair instead of system Z3 |
The built plugin is named after the project itself:
- macOS: structor.dylib
- Linux: structor.so
- Windows: structor.dll
Example install on macOS (on Linux, copy structor.so instead):
mkdir -p ~/.idapro/plugins
cp build/structor.dylib ~/.idapro/plugins/structor.dylib

On macOS, sign the plugin after copying it into the IDA plugin directory:

codesign -s - -f ~/.idapro/plugins/structor.dylib

If you use the Makefile install target:

make install
codesign -s - -f ~/.idapro/plugins/structor.dylib

- Open a function in the Hex-Rays pseudocode view.
- Place the cursor on a pointer-like variable.
- Press Shift+S or use the context menu action Synthesize Structure.
Structor will collect accesses, synthesize a layout, persist the result in IDA's type system, apply the type, and refresh pseudocode.
When auto_fix_types=true, Structor also analyzes functions as Hex-Rays prints them.
That pipeline can:
- apply significant type upgrades automatically
- print warning messages for report-only cases
- print per-fix details when auto_fix_verbose=true
- emit overlap-recovery diagnostics when debug_mode=true
Important: missing undeclared register-backed arguments are currently reported only. Structor does not rewrite the function signature for that case yet.
| Function | Returns | Meaning |
|---|---|---|
| structor_synthesize(func_ea, var_idx) | tid_t | Synthesize a structure from a local variable index |
| structor_synthesize_by_name(func_ea, var_name) | tid_t | Same, but by decompiler variable name |
| structor_get_error() | string | Error from the last synthesis or type-fix action |
| structor_get_field_count() | long | Field count from the last synthesis |
| structor_get_vtable_tid() | tid_t | Companion vtable TID, or BADADDR |
| Function | Returns | Meaning |
|---|---|---|
| structor_fix_function_types(func_ea) | long | Apply significant fixes in a function |
| structor_fix_variable_type(func_ea, var_idx) | long | Fix a single variable by index |
| structor_fix_variable_by_name(func_ea, var_name) | long | Fix a single variable by name |
| structor_analyze_function_types(func_ea) | long | Dry-run analysis without applying changes |
| structor_get_fix_count() | long | Variables analyzed in the last type-fix run |
| structor_get_fixes_applied() | long | Fixes applied, or differences found for dry-run analysis |
| structor_get_fixes_skipped() | long | Fixes skipped in the last type-fix run |
| structor_get_fix_warning_count() | long | Warning count from the last type-fix run |
| structor_get_fix_warning(idx) | string | Warning text by index |
| structor_get_fix_diagnostic_count() | long | Diagnostic count from the last type-fix run |
| structor_get_fix_diagnostic(idx) | string | Diagnostic text by index |
import ida_expr
import idc
result = ida_expr.idc_value_t()
ida_expr.eval_idc_expr(result, idc.BADADDR, "structor_synthesize(0x100000460, 0)")
struct_tid = result.i64
if struct_tid != idc.BADADDR:
    count = ida_expr.idc_value_t()
    ida_expr.eval_idc_expr(count, idc.BADADDR, "structor_get_field_count()")
    print(f"Created structure {struct_tid:#x} with {count.num} fields")
else:
    err = ida_expr.idc_value_t()
    ida_expr.eval_idc_expr(err, idc.BADADDR, "structor_get_error()")
    print(err.c_str())

import ida_expr
import idc
result = ida_expr.idc_value_t()
ida_expr.eval_idc_expr(result, idc.BADADDR, "structor_analyze_function_types(0x100000548)")
warn_count = ida_expr.idc_value_t()
ida_expr.eval_idc_expr(warn_count, idc.BADADDR, "structor_get_fix_warning_count()")
for i in range(warn_count.num):
    warn = ida_expr.idc_value_t()
    ida_expr.eval_idc_expr(warn, idc.BADADDR, f"structor_get_fix_warning({i})")
    print(warn.c_str())

StructorAPI exposes the public programmatic entry points.
#include <structor/api.hpp>
structor::SynthOptions opts = structor::Config::instance().options();
opts.min_accesses = 2;
opts.vtable_detection = true;
opts.z3.cross_function = true;
opts.z3.detect_arrays = true;
structor::SynthResult result =
    structor::StructorAPI::instance().synthesize_structure(func_ea, var_idx, &opts);
if (result.success()) {
    msg("Created struct tid=%llx with %d fields\n",
        static_cast<unsigned long long>(result.struct_tid),
        result.fields_created);
}

#include <structor/api.hpp>
structor::TypeFixResult result =
    structor::StructorAPI::instance().analyze_function_types(func_ea);
for (const auto &warning : result.warnings) {
    msg("Structor warning: %s\n", warning.c_str());
}
for (const auto &diagnostic : result.diagnostics) {
    msg("Structor diagnostic: %s\n", diagnostic.c_str());
}

Structor can now be consumed as a normal CMake subproject without going through Structor's plugin init path.
set(STRUCTOR_BUILD_PLUGIN OFF CACHE BOOL "" FORCE)
add_subdirectory(path/to/structor)
target_link_libraries(my_plugin PRIVATE structor::core)

#include <structor/api.hpp>
#include <structor/host_integration.hpp>
structor::HostIntegration host;
// Optional: if your plugin wants Structor's callback-driven behavior,
// either install Structor's Hex-Rays hooks...
host.install_hexrays_hooks();
// ...or forward events from your own Hex-Rays callback.
host.handle_ctree_maturity(cfunc, maturity);
host.handle_func_printed(cfunc);
auto preview = structor::StructorAPI::instance().synthesize_structure(
    func_ea,
    var_idx,
    structor::MaterializationMode::Preview,
    &opts);

Use structor::core when another plugin wants to orchestrate Structor directly.
The structor plugin target remains the UI/IDC/plugin-wrapper build.
Structor stores configuration in ~/.idapro/structor.cfg and creates the file automatically on first run.
Current keys written by the codebase:
# Structor Configuration
[General]
hotkey=Shift+S
interactive_mode=false
auto_open_struct=true
debug_mode=false
[TypeFix]
auto_fix_types=true
auto_fix_verbose=false
[Synthesis]
min_accesses=2
alignment=8
vtable_detection=true
emit_substructs=true
[Propagation]
auto_propagate=true
propagate_to_callers=true
propagate_to_callees=true
max_propagation_depth=3
[UI]
highlight_changes=true
highlight_duration_ms=2000
generate_comments=true
[Z3]
z3_mode=preferred
z3_timeout_ms=5000
z3_memory_limit_mb=256
z3_enable_maxsmt=true
z3_enable_unsat_core=true
z3_detect_arrays=true
z3_min_array_elements=3
z3_cross_function=true
z3_max_candidates=1000
z3_allow_unions=true
z3_min_confidence=20
z3_relax_on_unsat=true
z3_max_relax_iterations=5
z3_weight_minimize_padding=1
z3_weight_prefer_non_union=2

cmake -S . -B build -DIDA_SDK_DIR=/path/to/idasdk -DBUILD_TESTS=ON
cmake --build build --parallel
ctest --test-dir build --output-on-failure

With BUILD_TESTS=ON, the test CMake project also provides:

- check for the full suite
- check_z3 for the Z3-focused subset
On supported Apple arm64 hosts with idump available and a valid local IDA license, the CTest suite also runs live plugin regressions against real binaries. Those regressions currently exercise:
- missing register-backed argument reporting with test_missing_regarg
- overlap-based local type recovery with test_overlap_scope
You can also run the live regression script directly:
python3 integration_tests/check_type_fixer_regressions.py \
    --repo-root /path/to/structor \
    --plugin /path/to/structor/build/structor.dylib \
    --idump idump

sh integration_tests/build_fixtures.sh

Or build only specific fixtures:

sh integration_tests/build_fixtures.sh test_missing_regarg test_overlap_scope

Representative fixture coverage includes:
- simple structs
- nested structs and arrays
- packed structs and packed overlaps
- callback tables and function pointers
- negative offsets and shifted windows
- cross-function subobject deltas
- overlap-based local type recovery
- missing undeclared register-backed arguments
Structor is not doing one monolithic "infer everything" pass. It uses a staged pipeline with different algorithms for synthesis, propagation, and intra-function type repair.
The first stage is a Hex-Rays ctree visitor.
- AccessPatternVisitor walks decompiler expressions and looks for cot_ptr, cot_memptr, cot_idx, assignments, comparisons, masked bitfield-style loads, and indirect calls.
- For each relevant expression it tries to reduce the expression to a base variable plus a constant offset.
- It records more than just offset and size: semantic intent, inferred decompiler type, access direction, array stride hints, base-indirection depth, and observed constants from comparisons.
- Local aliases are forwarded through assignments when the right-hand side can be reduced to an access on the target variable.
That means Structor is not limited to raw *(base + off) loads. It also learns from patterns like:
- aliased temporaries
- ptr->field forms already emitted by Hex-Rays
- masked-and-shifted bitfield reads
- constant comparisons that help refine enum/flag-like fields
- indirect calls that indicate function-pointer or vtable usage
When cross-function analysis is enabled, Structor does not treat each function in isolation.
- CrossFunctionAnalyzer traces both forward and backward through the call graph up to configured limits.
- At each call site, ArgDeltaExtractor looks for direct passing, casts, ptr + const, ptr - const, and by-reference forwarding.
- Offsets are normalized into a common coordinate system by accumulating pointer deltas across calls.
- Merged accesses are deduplicated by location and merged by semantic specificity, not just by first-seen order.
This is what lets Structor combine evidence such as:
caller(ptr); // accesses offset 0x0 and 0x8
callee(ptr + 0x10); // accesses offset 0x0 and 0x8 relative to the shifted base

into one recovered layout with offsets 0x0, 0x8, 0x10, and 0x18.
The implementation also contains extra normalization for tricky shifted-window cases, including rebasing negative offsets and pruning intermediate positive-delta patterns when they would distort the merged coordinate system.
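As a rough illustration of the delta accumulation described above, the following Python sketch merges per-function accesses into one coordinate system. It is not Structor's implementation: `normalize_accesses` and the `(offset, size)` tuples are hypothetical, "keep the widest access" stands in for the real semantic-specificity merge, and the negative-offset rebasing is reduced to a single shift.

```python
from typing import Dict, Iterable, List, Tuple

# (offset relative to the function's own view of the pointer, access size in bytes)
Access = Tuple[int, int]


def normalize_accesses(per_function: Iterable[Tuple[int, List[Access]]]) -> List[Access]:
    """Merge accesses from several functions into one coordinate system.

    Each entry is (accumulated_delta, accesses), where accumulated_delta is the
    sum of the ptr +/- const adjustments on the call path from the root function.
    """
    merged: Dict[int, int] = {}
    for delta, accesses in per_function:
        for offset, size in accesses:
            absolute = offset + delta
            # Deduplicate by location; keeping the widest access is an
            # assumption standing in for a semantic-specificity merge.
            merged[absolute] = max(size, merged.get(absolute, 0))

    if not merged:
        return []

    # Rebase so the lowest offset becomes 0 when negative offsets appear.
    lowest = min(merged)
    shift = -lowest if lowest < 0 else 0
    return sorted((off + shift, size) for off, size in merged.items())


caller = (0x0, [(0x0, 4), (0x8, 8)])    # caller(ptr)
callee = (0x10, [(0x0, 4), (0x8, 8)])   # callee(ptr + 0x10)
layout = normalize_accesses([caller, callee])
print([hex(off) for off, _ in layout])  # ['0x0', '0x8', '0x10', '0x18']
```

The printed offsets match the caller/callee example: the callee's accesses land at 0x10 and 0x18 once its delta is added.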
The primary synthesis path is candidate generation plus constraint solving.
- FieldCandidateGenerator expands the merged access pattern into plausible field candidates.
- LayoutConstraintBuilder encodes the layout problem for Z3.
- The solver prefers a layout that covers all accesses while minimizing ambiguity and unnecessary padding.
At a practical level, the Z3 path is optimizing for things like:
- coverage of every observed access
- non-overlap where possible
- unions only where overlap is irreducible
- alignment and packing consistency
- array recognition for regular stride patterns
If the solve succeeds, Structor extracts a SynthStruct directly from the Z3 model and records solver statistics such as arrays detected, unions created, relaxed constraints, and solve time.
If the constraints are unsatisfiable or the solve times out, Structor does not simply fail. It records the failure reason and drops into the fallback path unless Z3 was configured as required.
The fallback path is deliberately simpler and faster.
- Sort accesses by offset.
- Group overlapping accesses into offset groups.
- Mark groups as union candidates when same-offset accesses disagree on size or layout.
- Turn groups into fields.
- Insert padding between gaps.
- Infer types heuristically from semantic usage and access width.
- Generate names and compute final structure size.
This is less globally optimal than the Z3 path, but it keeps Structor productive on awkward decompilations and solver-failure cases.
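The fallback steps above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not the plugin's code: `Field` and `heuristic_layout` are hypothetical names, accesses are plain `(offset, size)` pairs, any group with more than one distinct access is treated as a union candidate, and type inference and naming are omitted.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Field:
    offset: int
    size: int
    kind: str  # "field", "union", or "padding"


def heuristic_layout(accesses: List[Tuple[int, int]]) -> List[Field]:
    """Sort accesses, group overlaps, emit fields, and pad the gaps."""
    groups: List[List[Tuple[int, int]]] = []
    for off, size in sorted(set(accesses)):
        # An access joins the current group if it starts before the group ends.
        if groups and off < max(o + s for o, s in groups[-1]):
            groups[-1].append((off, size))
        else:
            groups.append([(off, size)])

    fields: List[Field] = []
    cursor = 0  # first byte not yet covered by an emitted field
    for group in groups:
        start = group[0][0]
        end = max(o + s for o, s in group)
        if start > cursor:
            fields.append(Field(cursor, start - cursor, "padding"))
        # Distinct overlapping views of one region become a union candidate.
        kind = "union" if len(group) > 1 else "field"
        fields.append(Field(start, end - start, kind))
        cursor = end
    return fields


for f in heuristic_layout([(0x0, 4), (0x8, 8), (0x8, 4), (0x10, 8)]):
    print(f"{f.offset:#04x} size={f.size} {f.kind}")
```

The final structure size in this model is simply the end of the last emitted field (0x18 for the sample input).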
Recovered layouts are not just flat lists of scalars.
- Regular stride patterns are lifted into array fields.
- Conflicting same-region views become unions.
- emit_substructs enables nested aggregate emission when a field is better represented as a sub-structure than as a scalar blob.
- Negative-offset layouts are rebased into shifted-view types so the emitted IDA type remains structurally valid.
That combination is why Structor can model packed overlaps, embedded windows into larger objects, and array-of-struct style patterns without collapsing everything into bytes.
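To make the "regular stride" idea concrete, here is a minimal Python sketch of stride detection over observed offsets. `detect_array` is a hypothetical helper, not part of Structor; the default of three elements mirrors the z3_min_array_elements=3 setting shown in the configuration, and the real detector presumably weighs more evidence than offsets alone.

```python
from typing import List, Optional, Tuple


def detect_array(offsets: List[int], elem_size: int,
                 min_elements: int = 3) -> Optional[Tuple[int, int, int]]:
    """Return (base_offset, stride, count) if the offsets form one regular run."""
    unique = sorted(set(offsets))
    if len(unique) < min_elements:
        return None
    stride = unique[1] - unique[0]
    # A stride smaller than the element would mean overlapping elements.
    if stride < elem_size:
        return None
    for prev, cur in zip(unique, unique[1:]):
        if cur - prev != stride:
            return None
    return unique[0], stride, len(unique)


print(detect_array([0x20, 0x28, 0x30, 0x38], elem_size=8))  # (32, 8, 4)
print(detect_array([0x0, 0x8, 0x20], elem_size=4))          # None
```

A stride larger than the element size (for example 4-byte reads every 16 bytes) is one hint that the run is an array of structs rather than an array of scalars.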
Vtable recovery is not a separate manual pass; it is driven by access evidence and call-pattern matching.
- VTableDetector looks for call shapes of the form (*(*(var + vtable_offset) + slot_offset))(args).
- Slot indices are derived from the slot offset divided by pointer size.
- Slot signatures are reconstructed from the call site by inspecting argument types and how the call result is used.
If enough evidence is present, Structor creates a companion vtable type and links it to the parent object layout.
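The slot-index rule above is plain integer division. The sketch below shows it in Python, assuming 8-byte pointers by default; `vtable_slot_index`, `collect_slots`, and the `slot_N` naming are hypothetical and only illustrate the arithmetic.

```python
from typing import Dict, Iterable


def vtable_slot_index(slot_offset: int, pointer_size: int = 8) -> int:
    """Map a byte offset inside a vtable to a slot index."""
    if slot_offset < 0 or slot_offset % pointer_size != 0:
        raise ValueError(f"offset {slot_offset:#x} is not on a slot boundary")
    return slot_offset // pointer_size


def collect_slots(slot_offsets: Iterable[int], pointer_size: int = 8) -> Dict[int, str]:
    """Build {slot index: placeholder member name} from observed call offsets."""
    slots: Dict[int, str] = {}
    for offset in sorted(set(slot_offsets)):
        index = vtable_slot_index(offset, pointer_size)
        slots[index] = f"slot_{index}"  # hypothetical naming scheme
    return slots


print(collect_slots([0x0, 0x10, 0x18, 0x10]))
# {0: 'slot_0', 2: 'slot_2', 3: 'slot_3'}
```

Slot 1 is absent from the output because no call through offset 0x8 was observed; a companion vtable type would still need a placeholder member there to keep later slots at the right offsets.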
After synthesis, Structor attempts to apply the recovered type beyond the original variable.
- It propagates to callers and callees based on how the variable flows through arguments and return values.
- When propagation crosses pointer deltas, TypePropagator can construct shifted window or tail views rather than blindly applying the unshifted parent type.
- This is what makes subobject-style propagation viable when one function sees the object at base 0 and another only sees ptr + delta.
The type-fixing pipeline is separate from structure synthesis, but it reuses the same general evidence-first philosophy.
For each variable, TypeFixer does three main things.
- Direct inference from local usage. It analyzes the variable's own access patterns and tries to infer a better type from observed dereferences and semantics.
- Exact-storage overlap recovery. If the variable is overlapped, Structor looks for sibling lvars that share the exact same storage location and width. That means same stack slot plus width, or same register storage plus width. It then borrows the most specific compatible type from the better-typed peer and records a diagnostic describing the recovery.
- Missing register-backed input reporting. For register locals that look like undeclared arguments, Structor uses two complementary strategies. It first tries ABI-style parameter-position inference from register families. It then has a non-ABI register-handoff path that scans callers, decodes instructions backward from the call site, and checks whether the callee's input register was populated immediately before the call.
That second path is intentionally conservative.
- It rejects obvious false positives where the register write is just copying a previous call's return register into a callee-saved register.
- It reports the case to the output window and IDC warning surface instead of mutating the function signature.
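The exact-storage overlap step described earlier (same storage plus same width, then borrow the more specific type) can be modeled as follows. Everything here is an assumption made for illustration: the `LocalVar` record, the string encoding of storage, and the two-level `specificity` ranking are hypothetical and much cruder than comparing real decompiler types.

```python
from dataclasses import dataclass
from typing import Iterable, Optional

# Hypothetical set of "vague" type names; anything else counts as specific.
GENERIC_TYPES = {"_BYTE", "_WORD", "_DWORD", "_QWORD", "__int64", "int", "void *"}


@dataclass(frozen=True)
class LocalVar:
    name: str
    storage: str    # e.g. "stack:-0x18" or "reg:rbx" (illustrative encoding)
    width: int      # size in bytes
    type_name: str


def specificity(type_name: str) -> int:
    """Rank a type name: 0 for generic scalars, 1 for anything more specific."""
    return 0 if type_name in GENERIC_TYPES else 1


def borrow_sibling_type(target: LocalVar, siblings: Iterable[LocalVar]) -> Optional[str]:
    """Return a better type from a sibling with identical storage and width."""
    best: Optional[LocalVar] = None
    for sib in siblings:
        if sib.name == target.name:
            continue
        # Exact-storage match only: same location and same width.
        if (sib.storage, sib.width) != (target.storage, target.width):
            continue
        if specificity(sib.type_name) <= specificity(target.type_name):
            continue
        if best is None or specificity(sib.type_name) > specificity(best.type_name):
            best = sib
    return best.type_name if best else None


target = LocalVar("v3", "stack:-0x18", 8, "__int64")
peers = [
    LocalVar("node", "stack:-0x18", 8, "struct node *"),   # exact match
    LocalVar("v7", "stack:-0x18", 4, "int"),               # width differs
    LocalVar("v9", "reg:rbx", 8, "struct other *"),        # storage differs
]
print(borrow_sibling_type(target, peers))  # struct node *
```

Requiring both storage and width to match is what keeps the borrowing conservative: a narrower view of the same slot or a different register never donates its type.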
Structor does not apply every inferred type blindly.
- Synthesis requires a minimum amount of access evidence.
- Type fixing compares current and inferred types and only applies significant improvements.
- Warnings and diagnostics are preserved separately from applied fixes so automation can distinguish "changed", "report-only", and "interesting but unresolved" outcomes.
- Structure synthesis still requires at least min_accesses observed accesses for a variable.
- Purely computed array indices such as ptr[i * 4] do not provide a constant field offset.
- Some aliasing patterns are still opaque to the synthesis path, especially when the interesting accesses happen only through a different local.
- Missing register-backed argument recovery is currently report-only; it does not rewrite function signatures.
The files under docs/ go deeper on Z3, cross-function analysis, and related experiments. Treat them as design and research material. Use this README for the current supported workflow and API surface.
Structor borrows several design ideas from suture, especially around access modeling, conflict resolution, and debug-oriented analysis ergonomics. The difference is that Structor is implemented as a native C++ plugin with a structure-synthesis pipeline and an increasingly capable function type-fixing subsystem.