Field-report follow-ups: target.open lazy load + §1 CLI + §2 socket daemon + §6 chained-fixups phases 1-3 #20

Merged: zachgenius merged 34 commits into master from release/field-report-followups on May 16, 2026

@zachgenius

Owner

Bundled merge of four reviewed feature branches that close the entire RE-engineer field report.

Field report → status

| # | Item | Resolution |
|---|------|------------|
| 1 | target.open pathologically slow on 503 MB iOS Mach-O (36 GB RSS, never returned) | target.preload-symbols=false daemon-wide |
| 2 | target.open response was 2.2 MB / 587 inline sections | summary-by-default; view.include_sections=true to opt back in |
| 3 | target_id doesn't survive across CLI invocations | ldbd --listen unix:PATH + ldb --socket PATH (phase 1: single-client serial) |
| 4 | `--repl < cmds.txt` only ran first command | incidental; it was a symptom of #1 hanging recv |
| 5 | --ldbd PATH had to be passed on every in-tree dev call | CLI sibling lookup via `__file__` |
| 6 | ARM64e chained fixups silently produced wrong iOS xrefs | parser + Mach-O loader + ADRP-pair resolver with function-boundary reset, AAPCS64 + PAC call-clobber, ADD/SUB/MOV write tracking, pre/post-indexed LDR writeback, STR-family xref support, FAT slice selection, provenance.warnings field |

Constituent merges (in order)

  1. fix/target-open-lazy-load — perf fix + design doc for the §1/§2/§3 follow-ups
  2. fix/cli-sibling-lookup — §1
  3. fix/socket-daemon-phase1 — §2 phase 1
  4. fix/chained-fixups-phase3 — §6 phases 1-3 (parser → indexer wire-up → ADRP correctness)

Each constituent branch went through:

  • An implementation agent (worktree-isolated)
  • A linus-code-reviewer (and security-auditor for §2; opus-model linus for §6 phase 2 and phase 3)
  • A cleanup agent that applied every reviewer-flagged blocker + nit

Test plan

  • All 4 constituent branches PR-ready independently, each with its own green ctest run
  • ctest --test-dir build --output-on-failure on this merged tip → 81/81 PASS (Apple silicon arm64, 182s wall-clock)
  • Build warning-clean
  • Token-budget gate green on Darwin-arm64 baseline (regenerated alongside target.open shape change)
  • Adversarial smoke tests (SUB / PAC / writeback / STR / phase-2 patterns) all green; each demonstrated to fail against its pre-fix base
  • Linux-x86_64 token-budget baseline regen — predicted net drift < 1% but unverified on Linux runner. One-line follow-up if CI flags it.

Deferred to phase 4

Captured in docs/35-field-report-followups.md and the worklog:

  • §2 phase 2: multi-client socket, notification-sink refactor, auto-spawn-if-no-daemon, in-flight RPC interruption, idle timeout
  • §6 phase 4: real iOS .ipa validation, conditional-branch boundary handling, fat_arch_64 triple-aware slice selection, stripped-binary function-boundary detection, register-offset LDR provenance

🤖 Generated with Claude Code

zachgenius and others added 30 commits May 16, 2026 11:47
Field report from a RE engineer driving LDB against a 503 MB iOS arm64
Mach-O (WeChat): target.open never returned. RSS grew linearly at
~5 MB/s past 36 GB; CPU pinned at 100%; killed at 14 min. Even on
smaller binaries the response was 2.2 MB / 587 inline sections per
call — agent-unfriendly cost for a "what is this binary" question.

Two root causes, one branch:

1. LLDB's target.preload-symbols=true default forces an eager DWARF +
   symbol-table parse on CreateTarget. For a stripped Obj-C binary
   that's the linear-growth phase — LLDB building an in-memory index
   over __objc_methname, function symbols, runtime metadata, etc.
   nobody had asked for yet. Flip it off daemon-wide in the
   LldbBackend ctor; symbol queries still work, they just trigger the
   parse lazily, scoped to the module they actually need.

2. convert_module() recursively walked every (sub)section, and
   module_to_json inlined the lot — the source of the 2.2 MB blob.
   New OpenOptions{ include_sections=false } makes the section walk
   opt-in. Default target.open returns {path, uuid, triple, load_addr,
   section_count} per module; pass params.view.include_sections=true
   for the old full shape. module.list and load_core retain the inline
   behaviour (their callers already paid the cost).

TDD: tests/smoke/test_target_open_view.sh pins the new default and
opt-in shapes; failing-first then green.

Token-budget gate caught the (intentional) drift and was the only
test that flagged the wire-shape change — Darwin-arm64 baseline
regenerated: target.open 21913 → 1712 tokens (12.8× drop), total
workflow 51280 → 32627 (−36%). Linux entry left in place; predicted
net drift < 1% (target.open shrink ≈ describe.endpoints schema
growth), CI will confirm or a one-line regen fixes it.

68/68 smoke + unit tests pass; build warning-clean.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Field-report follow-up: a user assumed --ldbd's "$PATH then
./build/bin/ldbd" lookup implied daemon discovery, then was surprised
that target_id from one `ldb` invocation didn't survive to the next
one. The behaviour is correct (each subcommand spawns a fresh ldbd
child and reaps it on exit), it just wasn't called out anywhere a
user would look.

Add a "Daemon lifecycle" stanza to the top-level help: one-shot is
ephemeral; --repl is the persistent-state path and also handles
piped stdin (the same case where the user expected batch execution
to work).

No code change — just documentation.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…64e fixups

The 2026-05-16 field report had six items; the killer two
(target.open runtime + response shape) landed in commits 8cb6915
and 191b049. The remaining three (#5 in-tree CLI lookup, #3 phase 2
persistent daemon, #6 chained-fixup decoding for iOS xrefs) each
needed enough design to be worth writing down before code.

Why one doc instead of three:

- They came from a single report and share fixture / sequencing
  context. Splitting them would force three cross-references and
  duplicate the "where this came from" preamble.
- Sequencing matters: §1 unblocks the dev flow; §2 phase 1 is a
  prereq for any agent-driven multi-call workflow; §3 needs the
  symbol-index cache (docs/23) to mature first so it has something
  to plug into.
- The risks differ by item (low / medium / high) but the design
  surface is small enough — ~200 lines per item — that a single
  doc beats three small ones.

No code change yet. Captures effort estimates, file paths, test
plans, and explicit out-of-scope lists so a follow-up agent can
implement each item without re-deriving the design.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The in-tree dev flow forced --ldbd /abs/path/to/build/bin/ldbd on
every invocation: neither tools/ldb/ldb nor build/bin/ldbd is on
$PATH, and the existing CWD-relative fallback only fired when the
user was standing in the repo root. Anchor the lookup on __file__
instead — tools/ldb/ldb → tools/ldb → tools → <repo> → build/bin/ldbd
— so the CLI works from any CWD without a flag.

Precedence: --ldbd PATH > shutil.which("ldbd") > sibling-from-__file__
> ./build/bin/ldbd. The CWD-relative fallback is preserved for the
case where ldb is vendored into a non-matching repo layout.
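
The precedence above can be sketched roughly as follows. This is an illustrative reconstruction, not the shipped code: the function name `resolve_ldbd` and its parameters are hypothetical, and the up-front executable check on `--ldbd` is taken from the later review note in this PR.

```python
import os
import shutil
from typing import Optional


def _is_executable(path: str) -> bool:
    return os.path.isfile(path) and os.access(path, os.X_OK)


def resolve_ldbd(explicit: Optional[str], cli_file: str, cwd: str) -> str:
    """Hypothetical helper: pick an ldbd binary using the documented order."""
    # 1. An explicit --ldbd PATH always wins; fail fast if it is unusable.
    if explicit is not None:
        if not _is_executable(explicit):
            raise FileNotFoundError(f"--ldbd path is not executable: {explicit}")
        return explicit

    # 2. Anything named ldbd on $PATH.
    on_path = shutil.which("ldbd")
    if on_path:
        return on_path

    # 3. Sibling lookup anchored on the CLI file, one dirname per step.
    here = os.path.realpath(cli_file)      # <repo>/tools/ldb/ldb
    tools_ldb = os.path.dirname(here)      # <repo>/tools/ldb
    tools = os.path.dirname(tools_ldb)     # <repo>/tools
    repo = os.path.dirname(tools)          # <repo>
    sibling = os.path.join(repo, "build", "bin", "ldbd")
    if _is_executable(sibling):
        return sibling

    # 4. CWD-relative fallback for vendored, non-matching layouts.
    fallback = os.path.join(cwd, "build", "bin", "ldbd")
    if _is_executable(fallback):
        return fallback

    raise FileNotFoundError("ldbd not found; pass --ldbd PATH")
```

Passing `cli_file` and `cwd` as arguments (rather than reading `__file__` and `os.getcwd()` inside) is only a choice made here so the sketch can be exercised from a tempdir.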

Failing test first: tests/smoke/test_cli_sibling_lookup.py runs the
CLI from an unrelated tempdir with $PATH scrubbed of ldbd, asserts
`ldb hello` succeeds, then pins precedence by passing a
non-executable --ldbd path and asserting fail-fast. See
docs/35-field-report-followups.md §1.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The one-shot `ldbd --stdio` model forces every `ldb <subcommand>` to
re-open the binary because target_id dies with the daemon process.
The field report in `docs/35-field-report-followups.md §2` walks
through the agent UX cost of this — a single shell script that does
target.open then symbol.find then disasm.function pays three full
target opens and gets three different target_ids.

This commit adds the alternate `ldbd --listen unix:PATH` mode (phase
1: single-client persistent socket). The daemon binds a unix socket,
accepts one connection at a time, serves it through the existing
Dispatcher + JSON-RPC framing, and accepts the next when the client
disconnects. target_id and dispatcher-side state persist across
disconnects because the dispatcher persists.

Lifecycle decisions match the §2 table:

  Socket path     — caller-provided via `unix:PATH`. Default-path
                    policy is documented in --help but resolved on
                    the client side; the daemon always wants an
                    explicit argument.
  Exit            — SIGTERM / SIGINT. No idle timeout, no
                    daemon.shutdown RPC yet (phase 2).
  Race / lock     — LOCK_EX|LOCK_NB on ${PATH}.lock. The second
                    daemon exits 1 with a stderr line naming the
                    holder pid.
  Auth            — filesystem-only. Socket inode 0600 (umask trick
                    + defensive chmod); auto-created parent dir
                    0700.
  Concurrency     — single-client, serial. The dispatcher's
                    notification sink is re-pointed at the
                    per-connection OutputChannel inside accept(),
                    which is race-free because at most one
                    connection is alive.

stdio_loop's request-from-json / response-to-json / dispatch body
is extracted into `serve_one_connection` so listen mode and stdio
mode share identical framing-error and notification semantics. A
self-contained `FdStreambuf` wraps a connection fd as a
std::istream/ostream pair so `protocol::read_message` /
`write_message` work unchanged over a socket.

The dispatcher's per-target mutability assumptions (probes,
sessions, breakpoints — see the §2 risks list) are not exercised
under phase 1 because all RPCs are still serial. Phase 2 needs
either per-target mutexes inside the dispatcher or per-connection
sinks plumbed through the NonStopRuntime. Flagged for the phase-2
author.

Tests:

  - smoke_socket_collision: two `ldbd --listen unix:$same_path` race
    → second exits non-zero with stderr explaining the collision.
  - smoke_socket_perms: socket inode is 0600 (existing-parent and
    auto-created-parent both verified); auto-created parent is 0700.

The smoke_socket_lifecycle test (cross-cuts the client side) lands
in the matching client-side commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Pair-commit to the daemon-side `--listen unix:PATH` work. The CLI
gains `--socket PATH` (env override `LDB_SOCKET`), mutually exclusive
with `--ssh` and `--ldbd`. When set, `spawn_daemon` returns a
`_SocketProc` adapter — a Popen-shaped object whose `stdin`/`stdout`
are `socket.makefile()` over a connected unix socket — so the
existing `JsonTransport`/`CborTransport` + `fetch_catalog` +
`do_rpc` plumbing works unchanged.

Why bother: the one-shot CLI spawns a fresh `ldbd --stdio` per
invocation, so `ldb target.open ... && ldb symbol.find target_id=$N`
fails with -32000 on the second call. With a persistent daemon at
`unix:PATH`, two CLI calls share dispatcher state and target_id
remains valid. This is the same persistence the existing `--repl`
gives, but addressed from the shell-pipeline side: each `ldb`
invocation is still one process, just talking to a long-lived
daemon instead of spawning its own.

Path-policy default (documented in `--help`):
  $XDG_RUNTIME_DIR/ldbd.sock  if XDG_RUNTIME_DIR is set
  $TMPDIR/ldbd-$UID.sock      else
  /tmp/ldbd-$UID.sock         last resort

Helper is exposed as `default_socket_path()` for future call-sites
(auto-spawn fallback is phase 2 and intentionally not wired here).
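
A minimal sketch of that path policy, assuming the three-step fallback is exactly as listed. The `env` and `uid` parameters are hypothetical additions for testability; the real `default_socket_path()` signature is not shown in this PR.

```python
import os
from typing import Mapping, Optional


def default_socket_path(env: Optional[Mapping[str, str]] = None,
                        uid: Optional[int] = None) -> str:
    """Sketch of the documented default: XDG_RUNTIME_DIR, then TMPDIR, then /tmp."""
    env = os.environ if env is None else env
    uid = os.getuid() if uid is None else uid

    xdg = env.get("XDG_RUNTIME_DIR")
    if xdg:
        # The runtime dir is already per-user, so no uid suffix is needed.
        return os.path.join(xdg, "ldbd.sock")

    tmp = env.get("TMPDIR")
    if tmp:
        return os.path.join(tmp, f"ldbd-{uid}.sock")

    # Last resort: a uid-suffixed name in the shared /tmp.
    return f"/tmp/ldbd-{uid}.sock"
```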

Connect-failure (daemon not running, bad path, permission denied)
becomes a clean `ldb: could not connect to ldbd socket 'X': ...`
diagnostic instead of an unhandled Python traceback — caught at the
`__main__` IOError boundary.

Test:

  - smoke_socket_lifecycle: starts `ldbd --listen unix:$sock`, runs
    `ldb --socket $sock target.open path=$fixture` (capturing
    target_id), then `ldb --socket $sock module.list target_id=$N`
    over a fresh connection. Both succeed — the load-bearing
    correctness claim of §2 phase 1.

Worklog entry on top of `docs/WORKLOG.md` records the lifecycle
decisions, the deferred phase-2 work (multi-client +
notification-sink refactor + auto-spawn), and the Dispatcher
thread-safety note for the phase-2 author.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sync

Pure docs / comment changes from the linus-code-reviewer pass on the
sibling-lookup change. Reviewer's observation: the previous single-line
"tools/ldb/ldb → tools/ldb → tools → <repo>" arrow comment was easy to
miscount; expand to one line per dirname call so the count is
unambiguous. Also reconcile docs/35 §1 with what actually shipped: the
implementation rejects a non-executable --ldbd argument up front (the
sketch in the doc didn't), which is strictly better — catches a bad
path at the right point rather than dying with a confusing EACCES deep
inside subprocess.Popen.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…se 1)

Two H-severity findings from the post-merge review on
`worktree-agent-ae73d824f609b6e86` and four medium-severity nits.
Phase-1's "uid is a single trust domain" assumption is now spelled
out in the doc and enforced by defense-in-depth checks against
accidental misconfiguration (or a same-uid attacker exploiting a
pre-staged symlink).

H1 lock path opened with O_NOFOLLOW. Without this, an attacker who
pre-creates `${PATH}.lock` as a symlink to e.g.
`~/.ssh/authorized_keys` has our `ftruncate(0) + pwrite(pid)`
silently overwrite the symlink target. ELOOP is fatal — refuse to
start rather than try to disambiguate. `O_EXCL` deliberately NOT
added: legitimate stale lockfiles from a crashed daemon must still
be reusable (flock releases on process death; ftruncate clears
stale pid).
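
For illustration, the same open/lock/stamp sequence in Python. The daemon itself is C++; `acquire_lock` is a hypothetical name and this is only a sketch of the pattern described above, not the shipped code.

```python
import errno
import fcntl
import os


def acquire_lock(lock_path: str) -> int:
    """Open, lock, and pid-stamp a lockfile; returns the held fd."""
    # O_NOFOLLOW: open() fails (ELOOP on Linux/macOS) if lock_path is a
    # symlink, so a pre-staged link can never redirect the truncate+write.
    # No O_EXCL: a stale file left by a crashed daemon must stay reusable.
    fd = os.open(lock_path, os.O_RDWR | os.O_CREAT | os.O_NOFOLLOW, 0o600)
    try:
        # Non-blocking exclusive lock; raises BlockingIOError if held.
        fcntl.flock(fd, fcntl.LOCK_EX | fcntl.LOCK_NB)
    except OSError:
        os.close(fd)
        raise
    # Only after the lock is held: clear any stale pid and stamp ours.
    os.ftruncate(fd, 0)
    os.pwrite(fd, f"{os.getpid()}\n".encode(), 0)
    return fd
```

The ordering matters: truncating before `flock` succeeds would wipe the pid of a live holder.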

H2 `ensure_parent_dir` now uses `lstat()` instead of
`std::filesystem::exists/is_directory` (which follow symlinks).
Refuses a pre-existing parent that is a symlink, owned by another
uid, or has any group/other permission bits set. New dirs are
still created via the `umask(0077) + mkdir(0700)` atomic pattern.
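
The checks in H2 might look like this in Python. It is a sketch under the assumption that the three refusal conditions are exactly those listed; the exception types are arbitrary choices made here, and the real C++ `ensure_parent_dir` may differ in detail.

```python
import os
import stat


def ensure_parent_dir(parent: str) -> None:
    """Create parent as 0700, or validate an existing one without following links."""
    try:
        st = os.lstat(parent)  # lstat: inspects the link itself, not its target
    except FileNotFoundError:
        # umask + mkdir so the directory is never visible with looser bits.
        # Assumes single-threaded startup, since umask is process-wide.
        old = os.umask(0o077)
        try:
            os.mkdir(parent, 0o700)
        finally:
            os.umask(old)
        return

    if stat.S_ISLNK(st.st_mode):
        raise PermissionError(f"{parent}: parent is a symlink")
    if not stat.S_ISDIR(st.st_mode):
        raise NotADirectoryError(parent)
    if st.st_uid != os.getuid():
        raise PermissionError(f"{parent}: owned by another uid")
    if st.st_mode & 0o077:
        raise PermissionError(f"{parent}: group/other permission bits set")
```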

M1 peer-credential check via `getpeereid()` immediately after
`accept()`. Even though the inode is 0600, the defense-in-depth
check rejects any peer whose uid differs from the daemon's before
the first byte is read. `getpeereid()` is portable across macOS
(wraps LOCAL_PEERCRED) and Linux (wraps SO_PEERCRED).

M2 `SO_RCVTIMEO = 300s` on every accepted connection. A stalled
peer no longer pins the dispatcher thread forever; the existing
`FdStreambuf::underflow` already maps `read()` -1 to EOF, which
cleanly closes the connection in `serve_one_connection`.

M3 reject relative `--listen unix:PATH`. The symlinked-parent
guard needs a fixed reference point, and the default-path
policies (both daemon-side --help text and client-side
`default_socket_path()`) always produce absolute paths. Exit 2.

M6 `fchmod(fd, 0600)` instead of `chmod(path, 0600)` to close a
tiny TOCTOU window where an FS filter could swap the inode
between `bind()` and a path-based `chmod`. macOS rejects fchmod
on AF_UNIX sockets with EINVAL — fall back to path-based chmod
on that platform; the `umask(0077)` trick around bind() already
makes the inode land 0600 atomically, so the fallback is purely
belt-and-suspenders for filesystems that ignore umask.

Correctness #2: `memcpy(addr.sun_path, ..., size+1)` so the
trailing NUL is copied explicitly. addr is brace-initialized
to zero so the byte was already 0, but the new form is
impossible to silently break in a future refactor.

Doc: new "Trust model" subsection in §2 spells out the
single-uid-trust-domain assumption and the out-of-scope cases
(shared-uid hosts, NFS-homed uid, in-uid LLM sandboxes — all
deferred to phase 2 + token auth). `ldbd --help` summarises
the same.

Tests: `tests/smoke/test_socket_perms.py` grew three new
scenarios that fail before this commit and pass after:
  - relative `--listen unix:PATH` is refused with stderr
    mentioning "absolute",
  - symlinked parent dir is refused,
  - pre-staged symlinked lockfile is refused AND the symlink
    target is left untouched.

All 72 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ts-env

Four correctness items flagged in the post-merge review. The
common thread: a closed peer (mid-write, mid-recv, mid-RPC) used
to be observable only via timing — the daemon would silently
retry writes, or the client would hang until the OS killed it.
Now every failure mode produces a typed error and tears the
connection down on the next attempt.

#1 — FdStreambuf::sync() latches write failures.

Before: when ::write() returned -1 (EPIPE, ECONNRESET), sync()
scrubbed the buffer and returned -1, but the NEXT flush() saw
an empty buffer and "succeeded." That left the daemon in a
write-to-dead-peer loop with no error path. Now FdStreambuf
keeps a sticky `write_failed_` flag; once latched,
sync/overflow/xsputn short-circuit. FdOstream::flush() forwards
the flag to the ostream's badbit so callers observe failure on
the very next attempt. transport.cpp's write_json_line /
write_cbor_frame now throw protocol::Error when the stream is
!good() after flush — serve_one_connection's existing catch
block tears the connection down cleanly.
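
The latch idea can be shown in a few lines of Python. `LatchingWriter` is a hypothetical stand-in for the C++ `FdStreambuf`, written only to illustrate why the flag must be sticky: without it, a flush after a failed flush sees an empty buffer and reports success.

```python
import os


class LatchingWriter:
    """Buffered fd writer whose first write failure is permanent."""

    def __init__(self, fd: int) -> None:
        self._fd = fd
        self._buf = bytearray()
        self.write_failed = False  # sticky: never reset once set

    def write(self, data: bytes) -> None:
        if self.write_failed:
            return  # short-circuit: do not buffer for a dead peer
        self._buf += data

    def flush(self) -> bool:
        if self.write_failed:
            return False  # still failed, even though the buffer is empty
        try:
            while self._buf:
                n = os.write(self._fd, self._buf)
                del self._buf[:n]
        except OSError:  # e.g. EPIPE or ECONNRESET
            self._buf.clear()
            self.write_failed = True
            return False
        return True
```

A caller that checks `flush()` after every message then observes the failure on the very next attempt and can tear the connection down.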

#3 — _SocketProc.__init__ sets settimeout(300).

Matches the daemon-side SO_RCVTIMEO (M2). A hung daemon
(deadlocked dispatcher, runaway target.open) no longer pins
fetch_catalog or do_rpc forever. socket.timeout is caught at
each tr.recv() and translated into a clean "daemon socket
timeout after 300s" error.

#4 — fetch_catalog / do_rpc close proc.stdout in finally.

For _SocketProc each of `stdin` and `stdout` is an independent
makefile() handle on the underlying socket fd. Closing only
stdin left a dup of the fd open, so the daemon's read() didn't
see EOF until Python's GC collected the handle or the process exited. Closing
both halves makes the daemon's accept() loop move on to the
next caller promptly. Verified the same close() pattern is
harmless for subprocess.Popen-shaped local/ssh daemons.

M7 — --socket / --ssh flag wins over LDB_SOCKET / LDB_SSH_TARGET.

Flipped from env-beats-flag to flag-beats-env for the two
transport selectors. Rationale: an operator who typed
`--socket /custom/path` meant "this socket, right now"; if a
stale LDB_SOCKET from a previous shell silently overrode it
they'd spend a long time debugging. The daemon-side env knobs
(LDB_STORE_ROOT, LDB_OBSERVER_EXEC_ALLOWLIST) keep env-beats-
flag because those are launcher-style knobs whose env should
pin policy across argv rewrites — distinct purpose, distinct
precedence.

All 72 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Final commit of the post-review punch list. Code changes here are
all comment-only or test ergonomics; no behaviour change in the
daemon. Plus the worklog entry and a phase-2 follow-ups
subsection in §2 so the deferred items don't get rediscovered
later.

N1 — alphabetised `LDBD_SOURCES` so daemon/* sources read in
order (dispatcher, socket_loop, stdio_loop).

N2 — explicit `static` on `g_shutdown` and `on_term_signal`.
Anon-namespace already gives internal linkage; the duplicated
keyword is for the human reader and for a future refactor that
flattens the namespace.

N3 — `bind_listener` umask/bind atomicity comment now states the
single-threaded-startup precondition. Same clarification was
added to `ensure_parent_dir` in the security commit.

N4 — lockfile comment reworded. Says what the code actually
does: flock is the exclusivity mechanism, the stamped pid is
best-effort diagnostic only. Old comment said "the lock file is
not the pid file" but the file IS used as a pid file.

N5 — replaced three blocking `proc.stderr.read1(4096)` calls in
the socket smoke tests with a `read_stderr_nonblocking()` helper
that uses select() with a 200ms timeout. A healthy daemon's
quiet stderr no longer stalls the test runner.

§2 grew a "Phase-2 follow-ups" subsection capturing the
deferred items the reviewers flagged: in-flight RPC interruption
on SIGTERM (today only takes effect between connections), token
auth for shared-uid environments, per-connection notification
sinks, dispatcher mutability audit. None of these block the
phase-1 merge; recording them so they show up in the next
session's WORKLOG read.

Worklog entry on top summarises all three commits, the
decisions taken (O_NOFOLLOW-not-O_EXCL, fchmod fallback on
macOS, flag-beats-env only for transport selectors), and the
one runtime surprise (macOS fchmod-on-AF_UNIX EINVAL).

All 72 tests pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Lands the standalone Mach-O LC_DYLD_CHAINED_FIXUPS parser called
out in docs/35-field-report-followups.md §3 phase 1. iOS 13+ /
macOS 11+ ARM64e binaries replace literal 64-bit pointer slots
in __objc_selrefs / __got / __auth_got / __const / __data with
chained-fixup descriptors; LDB's xref pipeline silently
produces wrong results on those binaries today. The parser is
the prerequisite for phase 2 (indexer wire-up + bind
resolution + cache format) — landing it standalone lets the
larger phase-2 wiring change be focused on integration rather
than format decoding.

Parser supports pointer formats 1 (ARM64E), 2 (PTR_64), 6
(PTR_64_OFFSET), 9 (ARM64E_USERLAND), 12 (ARM64E_USERLAND24).
Everything else throws backend::Error with a "phase 2" message
instead of pretending to support it. Binds resolve to 0 in
phase 1 (membership in the resolved map still signals "this
slot is a fixup"); phase 2 will wire in the imports table.

Tests cover three vectors hand-derived from the SDK struct
layouts in <mach-o/fixup-chains.h>: ARM64E single-page chain,
PTR_64 multi-page chain (exercises per-page page_start[]
dispatch), and an unsupported-format rejection. The format-6
path was additionally cross-checked against a real arm64
binary built with `clang -Wl,-fixup_chains` and the output of
`dyld_info --fixups`.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…ename keys, anchor phase-2 work items

Post-review cleanup of the phase-1 chained-fixup parser landed in
1741662. Four reviewer-flagged items, all clarity-oriented (no
behaviour change):

1. SegmentInfo::file_offset was dead — every caller set it, the
   parser never read it. YAGNI: deleted rather than documented.
   Phase 2 can re-add if it actually needs an on-disk-offset map.

2. ChainedFixupMap keys were documented as "file_addr (image-base-
   relative, NOT load-time VA)". "file_addr" conventionally means
   on-disk offset, so the name actively misled. Renamed the concept
   to "rva" (image-base-relative VM offset) in the header docstring
   and in the impl's local variable. Map type unchanged.

3. Two phase-2 throw sites now carry FIXME(phase 2):
   docs/35-field-report-followups.md §3 comments so they're greppable
   when phase 2 starts: unsupported_format() and the multi-start
   page rejection.

4. Reordered the starts_in_segment bounds check so the intent is
   obvious from reading the code: require_range for the 22-byte
   fixed header, then validate size >= 22, only then use size to
   range-check the body. The previous ordering was safe but read
   as "trust size before we've validated it can be that large".
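
The check ordering from item 4, sketched in Python against the `dyld_chained_starts_in_segment` layout from `<mach-o/fixup-chains.h>`. The function name and the returned dict are hypothetical; the real parser is C++ and throws `backend::Error` rather than `ValueError`.

```python
import struct

# size:u32, page_size:u16, pointer_format:u16, segment_offset:u64,
# max_valid_pointer:u32, page_count:u16  -> 22-byte fixed header.
HEADER = struct.Struct("<IHHQIH")


def parse_starts_in_segment(buf: bytes, off: int) -> dict:
    # 1. The fixed header must be in range before any field is read.
    if off < 0 or off + HEADER.size > len(buf):
        raise ValueError("starts_in_segment header out of range")
    (size, page_size, pointer_format,
     segment_offset, max_valid_pointer, page_count) = HEADER.unpack_from(buf, off)

    # 2. Validate size itself before trusting it for anything.
    if size < HEADER.size:
        raise ValueError("starts_in_segment size smaller than fixed header")

    # 3. Only now use size to range-check the body.
    if off + size > len(buf):
        raise ValueError("starts_in_segment body out of range")
    if HEADER.size + 2 * page_count > size:
        raise ValueError("page_start[] does not fit in declared size")

    page_starts = struct.unpack_from(f"<{page_count}H", buf, off + HEADER.size)
    return {
        "page_size": page_size,
        "pointer_format": pointer_format,
        "segment_offset": segment_offset,
        "page_starts": list(page_starts),
    }
```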

All 3 existing chained_fixups tests still pass; no other test
surfaces touched.

Refs: docs/35-field-report-followups.md §3 reviewer punch list
items #1, #2, #3, #6.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Phase-1 chained_fixups landed (1741662) with coverage on the
ARM64E unauth-rebase + PTR_64_OFFSET paths only. Reviewer flagged
two coverage gaps:

- decode_arm64e()'s `auth=1, bind=0` branch had no test. Vector D
  is a minimal single-slot ARM64E (format 1) auth-rebase. Raw
  u64 derivation is documented inline against the SDK bit layout
  so a future reviewer can verify each field without rebuilding
  fixtures. Phase 1 drops PAC key/diversity metadata; the
  testable observable is the resolved pointer value (image_base
  + target).

- DYLD_CHAINED_PTR_ARM64E_USERLAND (format 9) shares decode with
  format 12 (USERLAND24) but is the wire format used on pre-
  macOS-12 / pre-iOS-15 arm64e builds. Without an explicit
  test, a regression on the format-9 dispatch (in particular the
  `target_is_runtime_offset()` table) could hide behind format-12
  coverage. Vector E is a minimal format-9 single-slot rebase.

After: 5 test cases / 14 assertions in the [chained_fixups]
filter; full unit_tests ctest target green.

Worklog entry records the phase-2 integration hazards the
reviewer asked us to remember: file_offset-vs-rva confusion at
indexer wire-up time, the segment_offset == (vm_addr -
image_base) assumption, and the still-open real-binary
validation work.

Refs: docs/35-field-report-followups.md §3 reviewer punch list
items #4, #5; phase-2 follow-ups documented in worklog.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The phase-1 parser took raw payload + segments; that wasn't useful
to the xref pipeline because the daemon doesn't carry the LC payload
or LC_SEGMENT_64 list in any reachable form. Phase 2's xref wire-up
needs a one-call entry point that handles the byte-level Mach-O
walk, so add `extract_chained_fixups_from_macho(bytes, size)`:

- Reads `mach_header_64`, iterates load commands.
- Collects all LC_SEGMENT_64s into the SegmentInfo[] the existing
  parser expects (binding the segment's on-disk bytes when filesize>0).
- Locates LC_DYLD_CHAINED_FIXUPS, dispatches to parse_chained_fixups.
- Returns empty map on non-Mach-O / FAT / classic-LC_DYLD_INFO_ONLY
  inputs so callers (Linux ELF, older Mach-O) can wire it
  unconditionally without their own format sniff. Malformed
  chained-fixup payloads still propagate backend::Error from the
  underlying parser.
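
The load-command walk can be sketched in Python as below. This is a simplified illustration, not `extract_chained_fixups_from_macho` itself: it assumes a little-endian thin 64-bit image, returns `None` where the real function returns an empty map, and stops at locating the payload instead of parsing it. The name `find_chained_fixups` is hypothetical.

```python
import struct

MH_MAGIC_64 = 0xFEEDFACF
LC_SEGMENT_64 = 0x19
LC_DYLD_CHAINED_FIXUPS = 0x80000034
MACH_HEADER_64_SIZE = 32
SEGMENT_COMMAND_64_SIZE = 72


def find_chained_fixups(data: bytes):
    """Return (segments, (dataoff, datasize)), or None when not applicable."""
    if len(data) < MACH_HEADER_64_SIZE:
        return None
    magic, _cpu, _sub, _ftype, ncmds, sizeofcmds, _flags, _rsv = \
        struct.unpack_from("<8I", data, 0)
    if magic != MH_MAGIC_64:
        return None  # ELF, FAT, 32-bit Mach-O: caller needs no format sniff

    end = MACH_HEADER_64_SIZE + sizeofcmds
    if end > len(data):
        raise ValueError("load commands run past end of file")

    segments, fixups = [], None
    off = MACH_HEADER_64_SIZE
    for _ in range(ncmds):
        if off + 8 > end:
            raise ValueError("truncated load command")
        cmd, cmdsize = struct.unpack_from("<II", data, off)
        if cmdsize < 8 or off + cmdsize > end:
            raise ValueError("bad cmdsize")
        if cmd == LC_SEGMENT_64:
            if cmdsize < SEGMENT_COMMAND_64_SIZE:
                raise ValueError("LC_SEGMENT_64 shorter than segment_command_64")
            name = data[off + 8:off + 24].rstrip(b"\0").decode("ascii", "replace")
            vmaddr, vmsize, fileoff, filesize = struct.unpack_from("<4Q", data, off + 24)
            segments.append((name, vmaddr, vmsize, fileoff, filesize))
        elif cmd == LC_DYLD_CHAINED_FIXUPS:
            if cmdsize < 16:
                raise ValueError("short linkedit_data_command")
            fixups = struct.unpack_from("<II", data, off + 8)  # dataoff, datasize
        off += cmdsize

    return None if fixups is None else (segments, fixups)
```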

Two new unit tests pin the empty-map behaviour (null + ELF magic)
and a minimal arm64 Mach-O round-trip through Vector A's payload.

Anchors docs/35-field-report-followups.md §3 phase 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
`xref.address` (and therefore `string.xref`) silently returned empty
on iOS/macOS arm64 Mach-Os built with -Wl,-fixup_chains. Two reasons,
both rooted in how LLDB renders ARM64 ADRP:

1. LLDB renders an ADRP operand as a 21-bit page count (`"x8, 4"`),
   not as the absolute page address. The literal-hex scan in
   string_references_address couldn't see the target page from
   ADRP+ADD/LDR pairs at all.
2. The 64-bit pointer slots that ADRP+LDR pairs consume on
   chained-fixup binaries (__got, __auth_got, __objc_selrefs,
   __data globals) carry chained-fixup descriptors at rest, not
   the values dyld will write at load time. Even if literal-hex
   matched the slot address, the resolved string address never
   appears in any operand or section byte sequence.

Wire-up:

- LldbBackend::Impl gains chained_fixup_maps + chained_fixup_loaded
  per-target caches. Lazy on first xref query; reaped in
  close_target. In-memory only — on-disk caching is phase 3.
- xref_address adds an ARM64 ADRP-pair resolver alongside the
  existing literal-operand + RIP-relative scans:
    * Track `adrp xN, imm` → absolute target page per dst register.
    * For each subsequent `add xN, src, #imm` /
      `ldr xN, [src, #imm]` consumer, compute page+imm and match
      against (a) the caller's target_addr directly, and (b) the
      chained-fixup map's `slot_rva → resolved_value` table for
      LDR-style loads (slot-indirection case).
    * Single-pass, last-ADRP-per-register heuristic — no liveness
      analysis. Matches what compiler-emitted ADRP+immediate-use
      code looks like in practice.
- Results from all paths deduped by instruction address.
- `find_string_xrefs` inherits the fix because it calls xref_address.
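
A toy version of the single-pass, last-ADRP-per-register heuristic, operating on pre-decoded instructions rather than LLDB operand strings. Everything here (the `Insn` tuple, `find_adrp_xrefs`, the `image_base` parameter) is hypothetical scaffolding for illustration; like the phase-2 resolver it does no liveness analysis and no clobber tracking.

```python
from typing import Dict, Iterable, List, NamedTuple

MASK64 = (1 << 64) - 1


class Insn(NamedTuple):
    addr: int
    mnemonic: str  # "adrp", "add", "ldr", or anything else
    dst: str
    src: str       # base register for add/ldr; unused for adrp
    imm: int       # page count for adrp, byte offset for add/ldr


def find_adrp_xrefs(insns: Iterable[Insn], target_addr: int,
                    slot_values: Dict[int, int], image_base: int = 0) -> List[int]:
    """Return addresses of instructions that reference target_addr."""
    pages: Dict[str, int] = {}  # register -> page from its most recent ADRP
    hits = set()
    for insn in insns:
        if insn.mnemonic == "adrp":
            pages[insn.dst] = ((insn.addr & ~0xFFF) + (insn.imm << 12)) & MASK64
            continue
        if insn.mnemonic not in ("add", "ldr") or insn.src not in pages:
            continue
        ea = (pages[insn.src] + insn.imm) & MASK64
        if ea == target_addr:
            hits.add(insn.addr)  # (a) direct page+imm match
        elif insn.mnemonic == "ldr" and \
                slot_values.get(ea - image_base) == target_addr:
            hits.add(insn.addr)  # (b) slot indirection via the fixup map
    return sorted(hits)          # deduped by instruction address
```

Here `slot_values` plays the role of the `slot_rva → resolved_value` table, keyed by image-base-relative offset.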

Apple-silicon-arm64-gated synthetic fixture (chain_slot.c) +
smoke test pins the slot-load case end to end. Without this
wire-up the new smoke test fails empty; with it, the LDR inside
reference_string is surfaced as an xref to the string.

§3's "Implementation sketch" + "Phase 2 — what shipped" updated in
the design doc to match. Worklog entry on top.

docs/35-field-report-followups.md §3 phase 2.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
LLDB renders compiler-emitted `adrp xN, -K` (page below the PC's page)
literally with a leading '-'. The old anonymous-namespace
`parse_uint_at` rejected the '-', so the ADRP-recording block silently
dropped the entry. The next LDR/ADD consumer either found nothing (xref
missed) or — worse — found a stale prior ADRP for the same register
(xref bound to the wrong page, silent wrong answer).

Add `parse_int_at` alongside `parse_uint_at`, both lifted out of
lldb_backend.cpp into src/backend/xref_arm64_parsers.{h,cpp} so unit
tests can exercise the tokeniser without a live LLDB target. ADRP page
math becomes `(pc & ~0xfff) + (static_cast<uint64_t>(imm) << 12)`; the
two's-complement wrap on the unsigned addition is exactly the
page-below-PC case we want.
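
The same page math in Python, where integers do not wrap on their own, so the 64-bit truncation is written out explicitly. `adrp_target_page` is a hypothetical name used only for this sketch.

```python
MASK64 = (1 << 64) - 1


def adrp_target_page(pc: int, imm: int) -> int:
    """Mirror (pc & ~0xfff) + (uint64_t(imm) << 12) with 64-bit wraparound."""
    as_u64 = imm & MASK64  # the static_cast<uint64_t>(imm) step
    return ((pc & ~0xFFF) + (as_u64 << 12)) & MASK64


# A negative page count lands on a page below the PC's page.
assert adrp_target_page(0x100004ABC, -2) == 0x100002000
# A positive page count behaves as before.
assert adrp_target_page(0x100004ABC, 4) == 0x100008000
```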

TDD: tests/unit/test_xref_arm64_parsers.cpp pins negative decimal,
negative hex, positive decimal, '#'-prefixed negative, missing-digit
ok=false, and the canonical w→x register normalisation. Verified
parse_uint_at still refuses '-' so a future regression that quietly
deletes parse_int_at would surface.

Review punch-list item I1 from the phase-2 chained-fixups review;
docs/35-field-report-followups.md §3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The xref_address path was holding impl_->mu while reading the entire
Mach-O off disk (up to 500+ MiB on real iOS apps) and parsing
LC_DYLD_CHAINED_FIXUPS. Every other RPC and the listener-thread
breakpoint dispatch path blocked for that duration on first xref of a
target. The stdio dispatcher tolerates it today; the §2 socket-daemon
multi-client path would surface it as cross-client stalls.

Refactor to check-flag-under-lock → read+parse-outside-lock →
double-check-and-publish-under-lock. The benign race (two callers both
load, only the first publishes) wastes CPU on collision but preserves
correctness; phase-1 stdio rarely triggers it. The read-back uses
find() rather than operator[] so a concurrent close_target can't leave
a stale default-constructed entry in the cache after eviction.
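
The three-step shape (check under lock, load outside lock, publish under lock) can be sketched in Python. `FixupCache` and its loader callback are hypothetical; the sketch also omits whatever target-liveness check the real code performs before publishing.

```python
import threading
from typing import Callable, Dict


class FixupCache:
    """Per-target lazy cache that never holds the lock across the slow load."""

    def __init__(self, loader: Callable[[int], dict]) -> None:
        self._mu = threading.Lock()
        self._maps: Dict[int, dict] = {}
        self._loader = loader

    def get(self, target_id: int) -> dict:
        # 1. Check under the lock.
        with self._mu:
            cached = self._maps.get(target_id)
            if cached is not None:
                return cached
        # 2. Slow read + parse with the lock released; other RPCs proceed.
        loaded = self._loader(target_id)
        # 3. Re-check and publish under the lock. If another caller won the
        #    race, its map is kept and ours is discarded (wasted CPU only).
        with self._mu:
            return self._maps.setdefault(target_id, loaded)

    def close_target(self, target_id: int) -> None:
        with self._mu:
            self._maps.pop(target_id, None)
```

Lookups go through `dict.get`, the analogue of `find()`, so a miss never inserts an empty entry the way C++ `operator[]` would.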

Review punch-list item I2 from the phase-2 chained-fixups review;
docs/35-field-report-followups.md §3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…p shape

Four review nits from the phase-2 chained-fixups punch-list, all
tightening invariants that were structurally correct but loosely
expressed:

N1: resolve_adrp_consumer set is_slot_load=true for ldr/ldrsw/ldrh/ldrb
indiscriminately. Halfword and byte loads aren't 8-byte chained-fixup
slot loads; ldrsw is a 32→64-bit sign-extend; only `ldr xN, [...]` is.
Narrow is_slot_load to that case. The other widths keep their direct
ADRP-target match (useful for ADRP+LDRB string-byte loads); they just
no longer trigger the chained-fixup map lookup that would always miss.

N2: extract_chained_fixups_from_macho rejected LC_SEGMENT_64 commands
with cmdsize<56. The actual segment_command_64 struct is 72 bytes
(maxprot/initprot/nsects/flags after filesize). The reads at 24/32/
40/48 stayed in-range so no OOB today, but the bound was misleading.
Update to <72 and explain why. Updated test fixture's kSegCmdSize to
match.

N5: xref_address re-derived image_base from mod.GetSectionAtIndex(i).
GetFileAddress() in a "lowest non-zero across top-level sections"
loop — duplicating logic the parser already ran when it computed
(segment[0].vm_addr - segment_offset). Add `image_base` to
ChainedFixupMap, populate from parse_chained_fixups, drop the local
recompute. Zero when no chains present, which the slot-load path
short-circuits on anyway via empty `resolved`.

N4: dedupe `std::sort` → `std::stable_sort`. The dedupe step keeps the
first survivor of each address group; an unstable sort lets ordering
nondeterminism pick the winner on rebuilds. The extra cost of
std::stable_sort is negligible at the small XrefMatch counts
xref_address returns, and it gives reproducible dedupe output across runs.
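
A Python analogue of the keep-first dedupe, for illustration. Python's `sorted` is guaranteed stable, so it plays the role of `std::stable_sort` here; the `address` and `kind` keys are hypothetical stand-ins for XrefMatch fields.

```python
def dedupe_by_address(matches):
    """Keep the first match per address, in a reproducible order."""
    # Stable sort: matches with equal addresses keep their input order,
    # so the survivor of each group is the same on every run.
    ordered = sorted(matches, key=lambda m: m["address"])
    out = []
    for m in ordered:
        if not out or out[-1]["address"] != m["address"]:
            out.append(m)
    return out
```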

Review punch-list items N1, N2, N4, N5; docs/35-field-report-followups.md §3.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Capture phase-2 review punch-list closure in WORKLOG (I1 + I2 + N1 +
N2 + N4 + N5 with the commit SHAs they landed in) and write the phase-3
acceptance gates into docs/35-field-report-followups.md §3 so the next
session has a concrete bar rather than vague aspirations:

- AAPCS64 call-clobber set cleared on BL/BLR.
- All adrp_regs cleared on RET / unconditional B / BR Xn.
- ADD/MOV destination-register clears (no dataflow propagation through
  arithmetic in phase 3 — conservative answer is the clear).
- FAT (universal) Mach-O slice selection.
- Adversarial smoke tests with a zero-false-positive regression bar.
- provenance.warnings field on xref.address counting skipped
  resolutions, so the agent can tell when the heuristic isn't
  authoritative on a given binary.

WORKLOG also captures three phase-3 follow-ups (FAT no-op, pre-/post-
indexed LDR writeback, register-offset LDR) the punch-list flagged but
deferred.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/35-field-report-followups.md §3's phase-2 ADRP-pair resolver
has three silently-wrong modes that the existing fixture
(chain_slot.c) can't catch — it only exercises the happy path. Ship
hand-assembled fixtures that pin each false-positive concretely so
the regression bar for phase 3 can be "these xref queries return
zero hits in the right function":

- xref_addclobber: adrp x8 → add x8, x8, #0x100 → ldr x0, [x8, #0x10]
  reproduces the "ADD writes back to a tracked register but adrp_regs
  still has the old page" bug. Phase 2 resolves the LDR's target to
  page+0x10 (wrong; real effective is page+0x110).
- xref_fnleak: ADRP+LDR in fn_a, bare LDR x0, [x8, #0x10] in fn_b.
  Reproduces the "function boundary doesn't clear adrp_regs" bug;
  fn_b's stale x8 makes the scanner emit a false-positive xref.
- xref_callclobber: adrp x0, ...; bl helper; ldr x1, [x0, #0x10].
  Reproduces "AAPCS64 says x0 is caller-saved but phase 2 doesn't
  invalidate it on BL." Phase 2 emits an xref through the dead x0.

All three smoke tests assert the false-positive xref returns zero
hits attributed to the offending function. Against 25f35de they
FAIL with the expected single-match output (proof of phase-2 bug);
phase-3 commits turn them green.

Apple-silicon-arm64 gated identically to the phase-2 chain_slot
fixture — Linux + non-arm64 macOS skip the targets and tests
cleanly because the fixture binaries don't exist there.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/35-field-report-followups.md §3 phase 2 shipped a "last ADRP
wins for this register" map that was silently wrong across three
patterns (covered by tests/smoke/test_xref_{addclobber,fnleak,
callclobber}.py, all of which failed against 25f35de). Phase 3
closes them deterministically — no dataflow analysis, just per-
instruction register-state mutations layered on the existing
single-pass scan.

Gates landed (numbered against the design doc's acceptance list):

  1. Function-boundary reset. `function_name_at` is consulted only
     when adrp_regs is non-empty (free on no-ADRP code); a name
     change clears the map. RET / unconditional B / BR additionally
     reset the map and the cached current_function. The combination
     catches the cross-function leak both at the RET-of-A side
     (clear on the way out) and at the first-tracked-insn-in-B side
     (clear on entry if A's adrp_regs survived a stub).

  2. AAPCS64 caller-saved clobber on BL / BLR. Erases x0..x18 and
     x30 (LR — BL writes the return address there). x19..x28 are
     callee-saved per AAPCS64; we leave them alone. No leaf-function
     special-casing — the scanner can't know what the callee did.

  3. ADD clobber. resolve_adrp_consumer still emits the ADRP+ADD
     match (page+imm is a legitimate target), then
     clobber_add_destination erases adrp_regs[dst] regardless of
     ADD shape (xN, xN, #imm — self-write; xN, xM, #imm —
     cross-write; xN, xM, xL — register-register). The dst now
     holds an arithmetic result, not a page; a subsequent
     LDR through dst must not resolve through the stale page.

  4. MOV propagate-or-clobber. mov xN, xM with xM tracked copies
     the AdrpPair into xN. Any other MOV form (immediate, MOVZ /
     MOVK / MOVN, mov xN, sp / lr / xzr / wN) erases adrp_regs[xN].
     The simple xN←xM propagation is the only shape worth modelling
     — real dataflow analysis (shifts, conditional MOVs) stays out
     of scope per the design doc.
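
The four gates above can be sketched as a toy single-pass scanner. This is a Python illustration under simplifying assumptions, not the C++ implementation: instructions are pre-decoded tuples, immediates are ints, and the function name comes with each instruction instead of a lazy `function_name_at` lookup.

```python
CALLER_SAVED = {f"x{i}" for i in range(19)} | {"x30"}  # x0..x18 plus LR


def scan(insns):
    """Each insn is (function_name, mnemonic, *operands).
    Returns (function_name, resolved_address) pairs for ADRP-based matches."""
    adrp_regs = {}      # register name -> ADRP page address
    current_fn = None
    hits = []
    for fn, mnem, *ops in insns:
        # Gate 1: a function-name change drops all tracked state.
        if adrp_regs and fn != current_fn:
            adrp_regs.clear()
        current_fn = fn
        if mnem == "adrp":                       # adrp dst, page
            adrp_regs[ops[0]] = ops[1]
        elif mnem == "ldr":                      # ldr dst, [base, #imm]
            _dst, base, imm = ops
            if base in adrp_regs:
                hits.append((fn, adrp_regs[base] + imm))
        elif mnem == "add":                      # add dst, src, imm-or-register
            dst, src, rhs = ops
            if src in adrp_regs and isinstance(rhs, int):
                hits.append((fn, adrp_regs[src] + rhs))   # ADRP+ADD match
            adrp_regs.pop(dst, None)             # Gate 3: dst no longer holds a page
        elif mnem == "mov":                      # mov dst, src
            dst, src = ops
            if src in adrp_regs:
                adrp_regs[dst] = adrp_regs[src]  # Gate 4: propagate xN <- xM
            else:
                adrp_regs.pop(dst, None)         # Gate 4: any other source clobbers
        elif mnem in ("bl", "blr"):
            for reg in CALLER_SAVED:             # Gate 2: caller-saved clobber
                adrp_regs.pop(reg, None)
        elif mnem in ("ret", "b", "br"):
            adrp_regs.clear()                    # Gate 1: clear on the way out
            current_fn = None
    return hits
```

Feeding it the shape of the xref_addclobber fixture (`adrp x8`, `add x8, x8, #0x100`, `ldr x0, [x8, #0x10]`) yields only the page+0x100 ADD match; the trailing LDR no longer resolves through the stale page.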

The three adversarial smoke tests now pass; the existing 70-test
matrix is intact (73/73 ctest green, including the phase-2
chain_slot fixture).

Performance: `function_name_at` is only invoked when adrp_regs is
non-empty, so non-ARM64 code paths and ARM64 code regions with no
ADRP pay nothing extra. On dense ADRP code (every few instructions)
the per-iter symbol-context lookup dominates; defer profiling to
real iOS smoke (phase 4) where we'll have a worth-measuring target.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/35-field-report-followups.md §3 phase 3 gate 5. Phase 2's
extract_chained_fixups_from_macho silently returned an empty map on
FAT binaries (magic CA FE BA BE, or CA FE BA BF for FAT64). That is fine
for the single-arch synthetic fixture but wrong on any real Apple-shipped binary, which
ships as a fat universal2 (arm64 + arm64e + sometimes x86_64).

Implementation:

- Refactored the existing thin-Mach-O walk into a static helper
  (extract_chained_fixups_from_thin_macho) so the FAT dispatch can
  re-enter it with the picked slice's (offset, size) as the new
  buffer base.
- New static helper extract_chained_fixups_from_fat reads the FAT
  preamble (big-endian, hence dedicated read_u32_be / read_u64_be).
  Handles fat_header + fat_arch (20 bytes) and fat_arch_64 (32
  bytes). nfat_arch is capped at 16; anything larger is treated as
  malformed and returns empty.
- Slice preference order (per the design doc): arm64e cpusubtype=2
  first, then arm64-all, then no-op. x86_64 doesn't carry chained
  fixups (LC_DYLD_INFO_ONLY) so it would no-op in the thin parser
  anyway; explicit ordering keeps the picker compact.
- Out-of-bounds slice (offset > file_size or offset+size > file_size)
  is treated as a malformed individual entry — skipped, not thrown.
  The whole-file return is empty when no arm64 slice is recoverable.
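
The slice picker described above can be sketched in Python as follows. This is an illustration, not the C++ helper: `pick_arm64_slice` is a hypothetical name, the constants are the usual values from Apple's Mach-O headers, and treating a truncated arch table as malformed is an assumption of this sketch.

```python
import struct

FAT_MAGIC, FAT_MAGIC_64 = 0xCAFEBABE, 0xCAFEBABF
CPU_TYPE_ARM64 = 0x0100000C
CPU_SUBTYPE_ARM64_ALL, CPU_SUBTYPE_ARM64E = 0, 2
MAX_FAT_ARCH = 16


def pick_arm64_slice(data: bytes):
    """Return (offset, size) of the preferred arm64-family slice, or None."""
    if len(data) < 8:
        return None
    magic, nfat_arch = struct.unpack_from(">II", data, 0)   # big-endian preamble
    if magic == FAT_MAGIC:
        entry = struct.Struct(">IIIII")     # fat_arch, 20 bytes
    elif magic == FAT_MAGIC_64:
        entry = struct.Struct(">IIQQII")    # fat_arch_64, 32 bytes
    else:
        return None
    if nfat_arch > MAX_FAT_ARCH:
        return None                          # treated as malformed
    found = {}
    for i in range(nfat_arch):
        pos = 8 + i * entry.size
        if pos + entry.size > len(data):
            return None                      # truncated table (assumption)
        cputype, cpusubtype, off, size = entry.unpack_from(data, pos)[:4]
        if off > len(data) or off + size > len(data):
            continue                         # out-of-bounds slice: skip, don't throw
        if cputype != CPU_TYPE_ARM64:
            continue
        # Drop the capability bits in the top byte of cpusubtype.
        found.setdefault(cpusubtype & 0x00FFFFFF, (off, size))
    for subtype in (CPU_SUBTYPE_ARM64E, CPU_SUBTYPE_ARM64_ALL):
        if subtype in found:
            return found[subtype]
    return None
```

The caller would then re-enter the thin-Mach-O parser with `data[off:off + size]` as the new buffer.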

Tests (tests/unit/test_chained_fixups.cpp):

- "FAT picks arm64 slice" — universal2 with x86_64 + arm64 picks
  arm64, returns Vector A's chain.
- "FAT prefers arm64e over arm64" — both slices are arm64-family;
  the arm64e one has a distinct image_base so the assertion proves
  preference order is respected.
- "malformed FAT is a no-op" — nfat_arch=17 rejected; OOB slice
  offset rejected; neither throws.

The phase-3 acceptance criterion says "match the SBTarget's triple";
this implementation approximates by preferring arm64e over arm64,
which is the relevant ordering for any xref consumer that produces
chained-fixup output (arm64e and arm64 are the only archs that
encode pointers in chains). A full triple-match would need to
thread SBTarget's GetTriple through the extractor's signature —
deferred until we see a real binary where the preference ordering
isn't enough.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
docs/35-field-report-followups.md §3 phase 3 gate 7. The ADRP-pair
resolver is a heuristic, not a dataflow analysis — there are
patterns it can identify as ambiguous but can't resolve. Surface
those as a `provenance` block on the xref.address response so the
agent can decide whether the result set is authoritative or should
be cross-checked against symbol-index correlate.

The natural first case is "register-offset LDR with an ADRP-tracked
base" — `ldr xN, [xM, xK]` or `ldr xN, [xM, xK, lsl #imm]`. The
runtime address depends on xK, which a single-pass scanner can't
statically evaluate, but if xM is in adrp_regs the load IS a
potential xref. Phase 3 counts these in `adrp_pair_skipped` and
emits a single human-readable warning when the count is non-zero.

Shape:
  data.provenance = {
    adrp_pair_skipped: <uint>,
    warnings: [<string>, ...]
  }

Only attached when adrp_pair_skipped > 0 (or warnings non-empty);
empty provenance is the common case and would cost bytes per
response with no signal.
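
A small Python sketch of the attach-only-when-non-empty rule. The real dispatcher is C++; `build_xref_data` and the `matches` key are hypothetical, while `provenance`, `adrp_pair_skipped`, and `warnings` follow the shape above.

```python
def build_xref_data(matches, adrp_pair_skipped=0, warnings=()):
    """Assemble the response payload, omitting empty provenance."""
    data = {"matches": list(matches)}
    if adrp_pair_skipped > 0 or warnings:
        data["provenance"] = {
            "adrp_pair_skipped": adrp_pair_skipped,
            "warnings": list(warnings),
        }
    return data
```

A client can therefore treat a missing `provenance` key as "nothing was skipped".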

Interface changes:
- New backend::XrefProvenance struct in src/backend/debugger_backend.h
- DebuggerBackend::xref_address gains optional `XrefProvenance*` out
  param (default nullptr; callers that don't care pay nothing).
- Eight test stubs implementing the virtual updated for the new
  override signature.
- describe.endpoints schema for xref.addr documents the new field
  (with descriptions for both adrp_pair_skipped and warnings) so a
  client generating types picks up the optional shape.

Phase 4 follow-ups (not in scope here): more skip cases — pre/post-
indexed LDRs with tracked bases (`ldr xN, [xM], #imm` / `ldr xN,
[xM, #imm]!`), auth-rebase semantics filtering, multi-start-page
diagnostics. The phase-3 implementation populates exactly one case
to prove the wire-shape works and to ship a non-trivial signal for
the agent.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Closes the phase-3 ADRP-pair work in `docs/35-field-report-followups.md
§3` — the seven acceptance gates listed in the design doc are all
shipped across commits adc083a / 7419945 / eebebca / c01fa47. The
phase-3 criteria list stays intact (phase 4 references them); a new
"Phase 3 — what shipped" subsection summarises which commit closed
each gate.

The worklog entry on top of docs/WORKLOG.md captures the decisions
that aren't obvious from the commits — lazy `function_name_at` is
the perf optimisation that keeps non-ARM64 scans free; no leaf-
function BL specialisation because the scanner can't statically know
the callee; `provenance` attached only on non-empty so the wire cost
stays at zero for the common case; FAT arm64e>arm64 preference
approximates the design doc's SBTarget-triple-match rule.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The phase-3 ADD-clobber rule (docs/35-field-report-followups.md §3
gate 3) only fired on mnemonic "add", but SUB has identical
destination-write semantics — `sub xN, xN, #imm` overwrites the
ADRP-tracked register exactly as `add xN, xN, #imm` does. Reviewer
flagged it as a silent false-positive vector of the same class the
spec was meant to close.

Rename clobber_add_destination → clobber_arith_destination and extend
the post-emit switch to fire on add / sub / adds / subs. SUB has no
match-emit half (compilers don't compute targets as page - imm); only
the clobber is wired. Comment documents the asymmetry.

TDD: xref_subclobber.s fixture + test_xref_subclobber.py smoke
proven RED against worktree-phase3-adrp HEAD before the fix
(returned 1 false-positive LDR in pattern_subclobber); GREEN after.
All four phase-3 adversarial smokes still pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The phase-3 register-state mutation switch matched only the bare
"bl"/"blr" / "br" / "ret" mnemonics. ARMv8.3-A PAC adds authenticated
siblings (BLRAA/BLRAB/BLRAAZ/BLRABZ for calls, BRAA/BRAB/BRAAZ/BRABZ
for indirect branches, RETAA/RETAB for returns) that have identical
semantics under AAPCS64 — PAC is an integrity check, not a calling-
convention change — and the spec's gate 2 ("BL/BLR clears AAPCS64
caller-saved") targets the calling convention, not the mnemonic
spelling.

Reviewer flagged this as the highest-impact silent false-positive
vector: every arm64e binary in the App Store and macOS system
frameworks emits PAC calls, and phase-3 was treating them as no-ops
for register-state mutation.

Refactor the post-emit switch to use named bool flags (is_call,
is_indirect_branch, is_return) that fold in the PAC variants. No
behaviour change for arm64-only binaries.
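
For illustration, the flag folding might look like this in Python. The mnemonic sets are the ones listed above; `classify_control_flow` is a hypothetical helper, not the C++ switch.

```python
CALLS = {"bl", "blr", "blraa", "blrab", "blraaz", "blrabz"}
INDIRECT_BRANCHES = {"br", "braa", "brab", "braaz", "brabz"}
RETURNS = {"ret", "retaa", "retab"}


def classify_control_flow(mnemonic: str) -> dict:
    """Map a mnemonic, plain or PAC-authenticated, to the three flags."""
    m = mnemonic.lower()
    return {
        "is_call": m in CALLS,
        "is_indirect_branch": m in INDIRECT_BRANCHES,
        "is_return": m in RETURNS,
    }
```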

TDD: xref_pac_callclobber.s fixture (deliberate ADRP → BLRAAZ → LDR
shape, mirroring xref_callclobber.s with BL swapped for BLRAAZ) +
test_xref_pac_callclobber.py smoke. Fixture is gated on
LDB_CAN_COMPILE_ARM64E (CheckCSourceCompiles probe) — clang refuses
PAC mnemonics with -arch arm64. Built thin via OSX_ARCHITECTURES
override (the default fat slice picker would otherwise downcast to
arm64 and silently emit the wrong slice).

Proven RED against pre-fix HEAD (1 false-positive LDR returned in
pattern_pac_callclobber); GREEN after. All seven xref smokes pass.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The phase-3 ADRP-pair resolver emitted a match for the LDR's
effective address but never modelled the base-register rewrite
that pre- and post-indexed forms perform. Pre-indexed
`ldr xN, [xM, #imm]!` and post-indexed `ldr xN, [xM], #imm` both
rewrite `xM ← xM + imm` as a side effect, so a subsequent load
through xM must NOT resolve against the now-stale ADRP page.

Reviewer flagged this as the same family of false-positive as
ADD-clobber: the base register is written, the old page is gone,
the next consumer that reads the register sees an obsolete
tracking entry.

Resolver: extend AdrpResolved with has_writeback + writeback_base.
The address-operand parser now recognises three new shapes
(`, #imm]!`, `]`, `], #imm`) in addition to the existing
`, #imm]` / `]`. Post-indexed effective address is the bare page
(writeback after load); pre-indexed is page+imm (writeback before
load). Both surface via has_writeback=true.
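
A Python sketch of the operand shapes and the effective-address rule, assuming textual operands like those a disassembler prints. `parse_addr_operand` and `effective_and_writeback` are hypothetical helpers for this note, not the C++ parser.

```python
import re

_IMM = r"-?(?:0x[0-9a-fA-F]+|\d+)"
_ADDR = re.compile(
    r"\[\s*(?P<base>\w+)\s*(?:,\s*#(?P<pre>" + _IMM + r")\s*)?\]"
    r"(?P<bang>!)?(?:\s*,\s*#(?P<post>" + _IMM + r"))?"
)


def parse_addr_operand(text: str):
    """Parse [base], [base, #imm], [base, #imm]! or [base], #imm."""
    m = _ADDR.search(text)
    if m is None:
        return None
    pre_imm, post_imm = m.group("pre"), m.group("post")
    pre_indexed = m.group("bang") is not None
    post_indexed = post_imm is not None
    # Reject mixed or incomplete forms such as "[x8]!" or "[x8, #8], #8".
    if pre_indexed and (post_indexed or pre_imm is None):
        return None
    if post_indexed and pre_imm is not None:
        return None
    imm = int(post_imm if post_indexed else (pre_imm or "0"), 0)
    return {"base": m.group("base"), "imm": imm,
            "pre_indexed": pre_indexed, "post_indexed": post_indexed}


def effective_and_writeback(page: int, op: dict):
    """Post-indexed loads read the bare page; every other form reads page+imm."""
    effective = page if op["post_indexed"] else page + op["imm"]
    has_writeback = op["pre_indexed"] or op["post_indexed"]
    return effective, has_writeback
```

When `has_writeback` is true the scanner would emit the match and then drop the tracking entry for `base`.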

Scanner: capture the resolved info, emit the match, then clear
adrp_regs[writeback_base] and bump
provenance.adrp_pair_writeback_cleared with a human-readable
warning. Provenance is exposed on the wire — dispatcher attaches
the block when any of the three counters is non-zero;
describe.endpoints schema documents the new field.

TDD: xref_writeback_ldr.s fixture with both pre- and post-indexed
forms (using imm9-range #0x40 to satisfy the assembler). Smoke
test asserts zero false-positive matches inside pattern_writeback_*
when querying page+0x10 (the stale target). Proven RED against
pre-fix HEAD (returned 2 false-positive LDRs); GREEN after.
76/76 ctest.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
…sumers

The phase-3 resolver recognised only LDR-family loads, so stores
through an ADRP-tracked base register were invisible to xref.addr —
a user asking "what writes to this global?" got an empty answer
even though the binary clearly had STR instructions referencing
the address. Reviewer flagged this as a real field-report trust
gap and asked us to implement (not doc-defer).

Refactor: extract the `[base{, #imm}]` / `[base], #imm` / `[base,
#imm]!` / `[base]` address-operand parsing into a reusable helper
(`parse_adrp_addr_operand`) so the LDR-family and STR-family
branches share one source of truth. The helper returns the base
register, immediate, and pre/post-indexed writeback flags.

Resolver: collapse the per-mnemonic branches into three buckets:
- is_load:       ldr / ldrsw / ldrh / ldrb / ldur
- is_store_one_reg: str / stur / strh / strb
- is_store_pair: stp (two source registers before the address operand)

All three share the same address-operand parser; STP just skips a
second register first. Effective address and writeback semantics
are identical to the LDR path; only is_slot_load remains LDR-
specific (a chained-fixup slot is only read, never written, through
the ADRP-pair pattern).

Provenance: extend the LDR-skipped sniff to the new store
mnemonics. STP intentionally omitted from the sniff — its two-
register prefix makes the [base] detection more involved and the
precision loss from a false counter bump is worse than the
recall loss from silently skipping. Writeback warning now names
the actual memop mnemonic instead of hard-coding "LDR".

TDD: xref_str.s fixture with three patterns (STR w0, STP x0/x1,
STRB w0 — three different operand shapes including the STP pair
prefix). test_xref_str.py asserts xref.addr against the target
returns at least one match per pattern with the right mnemonic.
Proven RED against pre-fix HEAD (returned zero matches);
77/77 ctest GREEN after.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
N5: resolve_adrp_consumer now short-circuits on MOV variants at the
top of the function rather than parsing operands that can never be
an ADRP-pair consumer. apply_mov_state's bool return is still
useful for callers that want to know whether it handled the
mnemonic; document the discard at the call site.

N6: clobber_aapcs64_caller_saved replaces the 20-entry table loop
(each iteration constructed a std::string for the unordered_map
key conversion — 20 allocations per BL/BLR/PAC-call) with an
iterate-and-erase-by-predicate. Adds an is_aapcs64_caller_saved
helper that classifies via a small switch on the first character +
numeric suffix, hashless and allocation-free.
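
The predicate-based erase from N6 can be sketched in Python like this. It is an analogue of the C++ helper, not a copy: the function names follow the commit text, and the dict stands in for the unordered_map.

```python
def is_aapcs64_caller_saved(reg: str) -> bool:
    """True for x0..x18 and x30, decided from the name alone."""
    if len(reg) < 2 or reg[0] != "x" or not reg[1:].isdecimal():
        return False
    n = int(reg[1:])
    return n <= 18 or n == 30


def clobber_aapcs64_caller_saved(adrp_regs: dict) -> None:
    """Erase tracked entries by predicate instead of probing a 20-entry table."""
    for reg in [r for r in adrp_regs if is_aapcs64_caller_saved(r)]:
        del adrp_regs[reg]
```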

N7: apply_mov_state's `wN` prefix heuristic was a brittle first-
character check that would have misclassified e.g. a future
mnemonic operand starting with 'w' but not actually a w-register.
Replace with a classify_mov_source(token) enum that compares
explicitly against the canonical alias spellings (xzr, wzr, sp,
wsp, lr) and validates the rest of a register token is digits.
Order-of-normalisation documented in the comment.
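
A Python sketch of the explicit classification, assuming lower-cased disassembler tokens. The `MovSource` member names are hypothetical; the alias list is the one given above.

```python
from enum import Enum


class MovSource(Enum):
    X_REGISTER = "x-register"
    W_REGISTER = "w-register"
    ALIAS = "alias"
    OTHER = "other"


_ALIASES = {"xzr", "wzr", "sp", "wsp", "lr"}


def classify_mov_source(token: str) -> MovSource:
    t = token.strip().lower()
    # Aliases first, so "wzr" and "wsp" are never mistaken for w-registers.
    if t in _ALIASES:
        return MovSource.ALIAS
    # A register token is one prefix letter followed only by digits.
    if len(t) >= 2 and t[1:].isdecimal():
        if t[0] == "x":
            return MovSource.X_REGISTER
        if t[0] == "w":
            return MovSource.W_REGISTER
    return MovSource.OTHER
```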

N8: add a FAT64 (cafebabf magic + 32-byte fat_arch_64 entries) unit
test. Exercises the 64-bit offset path the prior FAT tests left
uncovered. Synthesised binary with one arm64 slice at offset
0x4000 (read from the 64-bit offset field of fat_arch_64);
asserts the picker reads it correctly.

N9: comment the FAT slice picker's empty-resolved-map fallthrough
behaviour. Empty map is treated as "no chained fixups, try next
slice"; the hazard is that a FAT binary with chained fixups in
both arm64 and arm64e slices but different image_bases lets the
arm64e slice win even when LLDB loaded the arm64 slice. Phase-4
follow-up.

N10: x29/x30 comment in the AAPCS64 clobber set replaced. x29 is
callee-saved per AAPCS64 (the callee must restore it), not "clang
preserves it when -fno-omit-frame-pointer."

77/77 ctest. No behaviour change for the existing adversarial
fixtures; the nits are precision/allocation/clarity improvements.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Update docs/35-field-report-followups.md §3 with two new
subsections:

1. "Phase 3 — post-review cleanup (this branch, post-`96d079b`)"
   summarises the five fix commits (SUB clobber, PAC family,
   writeback, STR-family, nits N5–N10) and which reviewer item
   each addressed.

2. "Phase 4 — carried forward" enumerates the items reviewer
   flagged as out-of-phase-3-scope: two-adjacent-stripped-
   functions boundary fallback, conditional branches, MOV from
   XZR/WZR (now handled correctly but worth tracking),
   FAT slice picker triple plumbing, auth-rebase key classes,
   real iOS smoke validation, bind resolution, on-disk cache,
   correlate.* wire-up, multi-module support.

Worklog entry documents the goal, the five fix commits with
their commit SHAs, the decisions (PAC fixture gating, SUB
asymmetry, STP provenance precision, per-instance writeback
warning), the surprises (PAC fixture initially passed TDD-red
because of fat-slice silent failure; first PAC fixture had ADD
clobbering x0 before BLRAAZ; writeback imm initially out of
imm9 range), and pointers to next-session phase-4 work.

77/77 ctest. No code changes in this commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Field-report follow-ups: target.open lazy load (preload-symbols off +
summary modules), CLI lifecycle doc, and the design doc for the §1/§2/§3
follow-ups that ship on dependent branches.

See WORKLOG 2026-05-16 entry and docs/35-field-report-followups.md.
§1: in-tree dev flow no longer needs --ldbd on every invocation.
tools/ldb/ldb derives <repo>/build/bin/ldbd from __file__ when neither
$PATH nor CWD turns up the daemon.
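
A hedged Python sketch of the lookup order. `find_ldbd` and its parameters are hypothetical, and the actual tools/ldb/ldb code may differ; only the `<repo>/build/bin/ldbd` fallback derived from `__file__` comes from the description above.

```python
import os
import shutil
from pathlib import Path


def find_ldbd(script_path, cwd=None, search_path=None):
    """Locate ldbd via $PATH, then CWD, then the in-tree build directory."""
    found = shutil.which("ldbd", path=search_path)   # None means use $PATH
    if found:
        return Path(found)
    local = Path(cwd or os.getcwd()) / "ldbd"
    if local.is_file():
        return local
    # script_path is <repo>/tools/ldb/ldb, so parents[2] is <repo>.
    sibling = Path(script_path).resolve().parents[2] / "build" / "bin" / "ldbd"
    return sibling if sibling.is_file() else None
```

The CLI would call it as `find_ldbd(__file__)`.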
§2 phase 1: ldbd --listen unix:PATH for persistent daemon-side state
across CLI invocations. Single-client serial; phase 2 will add
multi-client + auto-spawn. Hardened: O_NOFOLLOW, lstat parent-dir,
getpeereid peer-cred, SO_RCVTIMEO, absolute-path requirement.

# Conflicts:
#	docs/WORKLOG.md
§6 phases 1-3: ARM64e chained-fixup parser, indexer wire-up, and
ADRP-pair correctness (function-boundary reset, AAPCS64 + PAC call
clobber, ADD/SUB/MOV write tracking, pre/post-indexed LDR writeback,
STR-family xref support, FAT Mach-O slice selection, provenance.warnings
field). Closes the silent-wrong-xref class on iOS ARM64e binaries.

# Conflicts:
#	docs/WORKLOG.md
@zachgenius zachgenius merged commit 15808a2 into master May 16, 2026
1 of 4 checks passed
@zachgenius zachgenius deleted the release/field-report-followups branch May 16, 2026 08:38
zachgenius added a commit that referenced this pull request May 16, 2026
Bundles PRs #20 + #21 — the full RE-engineer field report and its
phase-3/phase-4 hardening cycle. Original 6-item report is closed;
phase-5 work is enhancement scope.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>