[WIP]: FFI support#65
Draft
aleksisch wants to merge 12 commits into
Draft
Conversation
f7ca673 to
c9769f7
Compare
Add one more helper to convert from fp to int and use this helper instead handwriting formulas.
On buggy LuaJIT the FOLD rules `x ^ 0.5 ==> sqrt(x)` and `2.0 ^ i ==> ldexp(1.0, i)` rewrite the optimised trace, while the interpreter and the unoptimised trace keep pow(). pow() is not interchangeable with sqrt/ldexp -- the result is libm-dependent -- so the optimised trace diverges. The buggy build must still contain the fold, so LUAJIT_BUGGY_TAG moves back to the parent of the pow fix. That commit is an ancestor of the previous tag, so its bug set is a superset -- every existing buggy test still reproduces (verified on both builds, 0 failures). Fills in the previously-skipped pow() stub.
Narrowing can be supported by reading and writing int as a num in non DUALNUM mode. It requires SLOAD support, and support from SNAP to treat all int as a num. Closes #34
Add the i64 SUB variant and a cdt ASTORE (a cdata box is referenced like a table, so it mirrors IRNodeASTORETab). With both in place ljopt reports sat, witnessed by a huge |s| >> 2^53. The bug is still OPEN upstream, so the divergence is present on every build; reproduce_bug_using_smt gains a `force_sat` flag for such open bugs and the test asserts sat on both builds.
c9769f7 to
5456c07
Compare
Implements `XLOAD int` / `XLOAD i64` via a shared
`(Array (_ BitVec 64) MemCell)` declared per trace pair.
Both unopt and opt traces select from the same `shared_xmem`
array, so equivalent FFI dereferences match by congruence.
Auxiliary plumbing required to keep the pointer arithmetic
chain non-NYI for the test snippet:
* SLOAD cdt now also stores the cdata's mem-slot id in
op_stack as i64, mirroring SLOAD tab. This gives the
pointer SSA a concrete 64-bit value that p64 ADD can
consume.
* ADD p64 and MUL i64 are added as BinOpI64 aliases, so
the `cdata + i*4 + 8` chain emits real bvadd/bvmul
instead of dummies.
Test (`tests/tests.lua`): hot-call `f(p) return p[0] + 1`
on `ffi.new("int[1]", 42)`. UNSAT verified.
Implements `XSTORE int` / `XSTORE i64` via per-trace versioned `xmem_<suffix>_v<N>` arrays. v0 = `shared_xmem` (both unopt and opt traces start from the same base); each XSTORE allocates a new version and asserts `new = (store prev ptr (int-val val))`. XLOAD now reads from `ctx.xmem_cur`, the latest version for the trace, so a write-then-read pattern forwards through the array theory without solver work. Test (`tests/tests.lua`): hot-call `f(p, v) p[0] = v; return p[0]`. Each iteration writes a different `v` and reads it back — UNSAT confirmed.
Models `CALLXS` as an uninterpreted SMT function `callxs_<arity>` (declared in `LJOPT_SMTLIB` for arities 1..4). Both unopt and opt traces apply the same function to congruent arguments, so equivalent FFI calls match by UF congruence even though ljopt has no semantic model of the underlying C routine. CARG was already routed to ir_node_dummy (Stage 3); the chain is walked here via `op:get_carg()` to gather argument SSAs. Test (`tests/tests.lua`): hot loop calling `ffi.C.abs(-i)`. Result is discarded — keeps the formula free of FP↔BV roundtrip chains that otherwise drive Z3 into timeout (see prior bisection on snap-slot reasoning).
Two pre-existing bugs in CNEW.lua surfaced when running FFI reproducers (lj_299) against this branch: 1. `maybe_size:get_num()` asserted when size came from an SSA (e.g. variable-length array `vla_t(n)` where n is loaded at runtime). Dispatch by is_num/is_ssa: load i64 from op_stack for runtime size, keep const path for compile-time size. 2. `mem_stack:store_index(idx, k, v)` was called without the required 5th `type` argument, hitting the "attempt to concatenate local 'type' (a nil value)" assert in MemoryStack.store_index. Pass 'i64' explicitly for both cdata.ctypeid and cdata.size stores. After the fix, lj_299 produces UNSAT (1 formula). Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`ljopt_init_new_trace` asserted on the second LJ trace whose bytecode-hash + SLOAD-types signature collided with an existing record. Real reproducers (e.g. tests/reproducers/lj_311.lua) record many small traces from many call sites; a fingerprint collision crashed the whole recording pass with "Trace with exactly this bytecode already exists ...". Skip the duplicate (nil out traces_num[tr] so subsequent ljopt_savetrace / ljopt_savesnap drop their data) and continue. The first trace with that fingerprint stays; only the colliding ones are dropped. No false-positive risk: an uncomparable trace just isn't paired across opt levels. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The first XLOAD/XSTORE commits only declared Int and I64
variants; FFI structs containing `double` fields produce
`num XLOAD` / `num XSTORE` (e.g. tests/reproducers/lj_651.lua
reads/writes a struct {double x, y}). Without a Num impl those
nodes were dummied, breaking the dependency chain into the loop
guards.
Add IRNodeXLOADNum extracting via `get-fp` (num MemCells store
native FP after the fp-val refactor) and IRNodeXSTORENum
wrapping with `fp-val`. Refactor make_xload to take the
extractor and make_xstore to dispatch on op_type.NUM via
retrieve_num_op.
lj_651 still SAT due to unrelated guard-set asymmetry between
unopt/opt traces; XLOAD/XSTORE Num themselves now translate
correctly.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
canonicalise ctype id
Two issues converged on lj_651, both producing SAT on otherwise
equivalent traces:
1. `mem_stack:allocate(ssa_ref)` — the single-arg form takes
the inherit branch and ties the freshly-allocated cdata
slot to `shared_mem[ssa_ref]`. CNEW's ssa_ref differs
between unopt (e.g. 30) and opt (e.g. 18) traces, so the
two slots inherited from different shared cells, leaving
their default array values free to diverge — solver picks
different values, OR-clause fires.
CNEW allocates *fresh* memory; nothing to inherit. Call
`allocate()` with no arg → zero-init branch.
2. CNEW's left_op is the LJ ctype id, an opaque per-process
counter. Anonymous `ffi.typeof("struct {...}")` returns a
fresh ctype on each call, so two record() runs see
different ids (lj_651: 106 in unopt, 166 in opt). The id
is never read back from mem_stack downstream; store a
stable canonical 0 instead so traces match.
After both fixes lj_651 (and similar FFI cdata reproducers)
report UNSAT on the FIXED LJ build.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a 0-arg `nil-val` constructor to the MemCell datatype and uses it for the `nil` HSTORE variant (HSTORE on `nil`-typed right operand). `zero_pointer` default is now `nil-val`, so unwritten cells read as nil too. HLOAD num emits a real type guard: te[ssa] = (not ((_ is nil-val) raw_cell)) instead of the previous unconditional `true`. This is what "guarded HLOAD" means at the IR level — a `>` flag fires the side exit when the loaded slot turned out to be nil. Plumbing: MemoryStack.store_index_raw / load_index_raw expose pre-wrapped MemCell expressions so HSTORE-nil can write `nil-val` directly and HLOAD's type-guard can run `(_ is nil-val)` on the raw cell. Note: alone this still doesn't catch lj_1133 — that test hinges on t1/t2 aliasing through different VM-stack slots, which ljopt currently models as independent memory regions. A separate aliasing layer is needed to surface the bug. No regressions on detected reproducers (lj_783, lj_1079, lj_299, lj_524, lj_651, lj_1086, lj_fix-fold-simplify-conv- sext). Existing tests unchanged. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
1df1ba4 to
a913b9f
Compare
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Full FFI support
Closed #56