
Disassemble backwards in variable width instruction sets #3898

Draft
OBarronCS wants to merge 5 commits into pwndbg:dev from OBarronCS:disassemble-backwards-variable-width


@OBarronCS

Member

This PR adds backwards disassembly for variable-width instruction architectures, implementing #3784.

Prior to this PR, if you hit a breakpoint or ran `nearpc random_address` and the disassembly system had not already encountered the surrounding sequence of instructions, we were unable to display the "previous" instructions behind the instruction pointer.

This implements a method to "guess" the valid sequence of instructions that leads to the current instruction, which allows disassembling backwards. Since ISAs like x86 are not self-synchronizing, this is a best-effort guess at the true instruction boundaries.

The method is simple: start a certain number of bytes before the current instruction and disassemble forward toward it. We assume that after a certain number of disassembled instructions the decoded sequence will "self-align" with the real sequence (which works in practice).
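To illustrate the idea, here is a minimal sketch on a made-up toy ISA. It is not the code in this PR: `toy_length`, `disassemble_backwards`, the `count` parameter, and the retry over later start offsets when a decode fails or overshoots are all assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

# Hypothetical decoder type: given the buffer and an offset, return the
# instruction length in bytes, or None if no valid instruction starts there.
LengthDecoder = Callable[[bytes, int], Optional[int]]


def toy_length(data: bytes, offset: int) -> Optional[int]:
    """Toy ISA (invented for this sketch): opcode N in {1, 2, 3} is N bytes long."""
    length = {0x01: 1, 0x02: 2, 0x03: 3}.get(data[offset])
    if length is None or offset + length > len(data):
        return None
    return length


def disassemble_backwards(
    data: bytes,
    target: int,
    lookback: int,
    decode_len: LengthDecoder,
    count: int,
) -> List[Tuple[int, int]]:
    """Guess up to `count` (offset, length) pairs that end exactly at `target`."""
    first = max(0, target - lookback)
    # Try the furthest start first; move one byte closer whenever the linear
    # decode hits an invalid instruction or overshoots the target.
    for start in range(first, target):
        offset = start
        insns: List[Tuple[int, int]] = []
        while offset < target:
            length = decode_len(data, offset)
            if length is None:
                break
            insns.append((offset, length))
            offset += length
        if offset == target:
            # Only the trailing instructions are trusted: the early ones may
            # have been decoded before the stream re-aligned.
            return insns[-count:] if count > 0 else []
    return []


# Real boundaries are at offsets 0, 3, 5, 6, 7 and 9.
DATA = bytes([0x03, 0x01, 0x02, 0x02, 0x01, 0x01, 0x01, 0x02, 0x03, 0x01])
result = disassemble_backwards(DATA, target=9, lookback=8, decode_len=toy_length, count=3)
```

Here the decode starts at offset 1, in the middle of the first real instruction, produces bogus instructions at offsets 1, 2 and 4, and lands back on a real boundary at offset 5. Keeping only the last three decoded instructions yields the real ones at offsets 5, 6 and 7.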

There is one case where this might not be desirable (i.e. it changes longstanding pwndbg behavior): stepping into a function call. Previously, when you stepped into a call, the first instruction of the callee would be the first one displayed. Now, because we can disassemble backwards, the display also shows the instructions that sit linearly before that instruction.

This can get confusing, given that we are now mixing emulation (displaying the true sequence of instructions) with sometimes displaying instructions linearly behind the address.


@OBarronCS

OBarronCS commented May 7, 2026

Member Author

The test failures come from disassembling backwards in places where we previously couldn't. I'm considering adding a setting like `disable_backwards_heuristic_disassembly` for the tests, rather than updating the 75+ tests whose output has changed.

@k4lizen

k4lizen commented May 7, 2026

Contributor

> The test failures come from disassembling backwards in places where we previously couldn't. I'm considering adding a setting like `disable_backwards_heuristic_disassembly` for the tests, rather than updating the 75+ tests whose output has changed.

I think that's fine, though would be nice to then have some tests that cover this behavior specifically.

@k4lizen

k4lizen commented May 7, 2026

Contributor

I'm wondering whether we should only disassemble backwards until we hit a new symbol, or whether it makes sense to do it this way, where you can see what sits before the function you call into.
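For illustration only, stopping at a symbol could look roughly like the sketch below. `clamp_lookback_start` and the sorted `symbol_starts` list are hypothetical names, not pwndbg APIs, and the sketch assumes symbol start addresses are true instruction boundaries.

```python
import bisect
from typing import Sequence


def clamp_lookback_start(target: int, lookback: int, symbol_starts: Sequence[int]) -> int:
    """Return the address where the forward decode toward `target` should begin.

    `symbol_starts` must be sorted in ascending order (assumption of this sketch).
    """
    naive_start = max(0, target - lookback)
    # Index of the first symbol start strictly greater than `target`.
    idx = bisect.bisect_right(symbol_starts, target)
    if idx == 0:
        # No symbol at or before the target: fall back to the plain lookback.
        return naive_start
    # Never start before the symbol that contains the target.
    return max(naive_start, symbol_starts[idx - 1])


SYMBOLS = [0x1000, 0x1040, 0x1100]
inside_function = clamp_lookback_start(0x1048, 32, SYMBOLS)
at_function_entry = clamp_lookback_start(0x1040, 32, SYMBOLS)
```

With this clamp, `at_function_entry` equals the target itself, so nothing would be shown before the first instruction of a function that was just stepped into, which matches the old behavior for that case.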

@k4lizen

k4lizen commented May 7, 2026

Contributor

> This can get confusing, given that we are now mixing emulation (displaying the true sequence of instructions) with sometimes displaying instructions linearly behind the address.

We could color code them (greying them out seems quite intuitive) or set a marker right above the PC that denotes this maybe (something like ? to signal "everything above this is linear").
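A rough sketch of both options, assuming the guessed lines and the known lines are already available as plain strings. `render_listing`, `MARKER`, and the marker text are made up for this example and are not part of pwndbg.

```python
from typing import List

GREY = "\x1b[90m"   # ANSI "bright black" foreground, usually rendered as grey
RESET = "\x1b[0m"   # ANSI reset
MARKER = "--- linear disassembly above ---"  # hypothetical marker text


def render_listing(guessed: List[str], known: List[str], use_color: bool = True) -> List[str]:
    """Put heuristically guessed lines above a marker, followed by the known lines."""
    lines = []
    for text in guessed:
        # Grey out guessed instructions so they read as lower confidence.
        lines.append(f"{GREY}{text}{RESET}" if use_color else text)
    if guessed:
        # The marker is only needed when something was guessed.
        lines.append(MARKER)
    lines.extend(known)
    return lines


listing = render_listing(["nop", "leave"], ["push rbp", "mov rbp, rsp"], use_color=False)
```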

@OBarronCS OBarronCS marked this pull request as draft May 8, 2026 02:59
@k4lizen

k4lizen commented May 8, 2026

Contributor

btw I didn't check the source yet, but do you use function beginnings as a heuristic to help with disassembling backwards?

@OBarronCS

Member Author

It currently doesn't check function boundaries/symbols.


@OBarronCS OBarronCS force-pushed the disassemble-backwards-variable-width branch from b64efe8 to 5ccbcb4 Compare May 14, 2026 20:13
