Skip to content

[WIP]: FFI support#65

Draft
aleksisch wants to merge 12 commits into
develfrom
aleksisch/support-ffi
Draft

[WIP]: FFI support#65
aleksisch wants to merge 12 commits into
develfrom
aleksisch/support-ffi

Conversation

@aleksisch

@aleksisch aleksisch commented Apr 28, 2026

Copy link
Copy Markdown
Collaborator

Full FFI support

Closed #56

@aleksisch aleksisch marked this pull request as draft April 28, 2026 14:25
@aleksisch aleksisch force-pushed the aleksisch/support-ffi branch from f7ca673 to c9769f7 Compare June 9, 2026 18:45
Add one more helper to convert from fp to int and use this helper
instead handwriting formulas.
On buggy LuaJIT the FOLD rules `x ^ 0.5 ==> sqrt(x)` and
`2.0 ^ i ==> ldexp(1.0, i)` rewrite the optimised trace, while the
interpreter and the unoptimised trace keep pow(). pow() is not
interchangeable with sqrt/ldexp -- the result is libm-dependent --
so the optimised trace diverges.

The buggy build must still contain the fold, so LUAJIT_BUGGY_TAG
moves back to the parent of the pow fix. That commit is an
ancestor of the previous tag, so its bug set is a superset --
every existing buggy test still reproduces (verified
on both builds, 0 failures). Fills in the previously-skipped
pow() stub.
Narrowing can be supported by reading and writing int as a num
in non DUALNUM mode. It requires SLOAD support, and support from
SNAP to treat all int as a num.

Closes #34
Add the i64 SUB variant and a cdt ASTORE (a cdata box is
referenced like a table, so it mirrors IRNodeASTORETab). With both
in place ljopt reports sat, witnessed by a huge |s| >> 2^53.

The bug is still OPEN upstream, so the divergence is present on
every build; reproduce_bug_using_smt gains a `force_sat` flag for
such open bugs and the test asserts sat on both builds.
@aleksisch aleksisch force-pushed the aleksisch/support-ffi branch from c9769f7 to 5456c07 Compare June 10, 2026 08:33
aleksisch and others added 8 commits June 10, 2026 11:58
Implements `XLOAD int` / `XLOAD i64` via a shared
`(Array (_ BitVec 64) MemCell)` declared per trace pair.
Both unopt and opt traces select from the same `shared_xmem`
array, so equivalent FFI dereferences match by congruence.

Auxiliary plumbing required to keep the pointer arithmetic
chain non-NYI for the test snippet:

  * SLOAD cdt now also stores the cdata's mem-slot id in
    op_stack as i64, mirroring SLOAD tab. This gives the
    pointer SSA a concrete 64-bit value that p64 ADD can
    consume.
  * ADD p64 and MUL i64 are added as BinOpI64 aliases, so
    the `cdata + i*4 + 8` chain emits real bvadd/bvmul
    instead of dummies.

Test (`tests/tests.lua`): hot-call `f(p) return p[0] + 1`
on `ffi.new("int[1]", 42)`. UNSAT verified.
Implements `XSTORE int` / `XSTORE i64` via per-trace
versioned `xmem_<suffix>_v<N>` arrays. v0 = `shared_xmem`
(both unopt and opt traces start from the same base);
each XSTORE allocates a new version and asserts
`new = (store prev ptr (int-val val))`.

XLOAD now reads from `ctx.xmem_cur`, the latest version
for the trace, so a write-then-read pattern forwards
through the array theory without solver work.

Test (`tests/tests.lua`): hot-call `f(p, v) p[0] = v;
return p[0]`. Each iteration writes a different `v` and
reads it back — UNSAT confirmed.
Models `CALLXS` as an uninterpreted SMT function `callxs_<arity>`
(declared in `LJOPT_SMTLIB` for arities 1..4). Both unopt and
opt traces apply the same function to congruent arguments,
so equivalent FFI calls match by UF congruence even though
ljopt has no semantic model of the underlying C routine.

CARG was already routed to ir_node_dummy (Stage 3); the
chain is walked here via `op:get_carg()` to gather argument
SSAs.

Test (`tests/tests.lua`): hot loop calling `ffi.C.abs(-i)`.
Result is discarded — keeps the formula free of FP↔BV
roundtrip chains that otherwise drive Z3 into timeout
(see prior bisection on snap-slot reasoning).
Two pre-existing bugs in CNEW.lua surfaced when running FFI
reproducers (lj_299) against this branch:

1. `maybe_size:get_num()` asserted when size came from an SSA
   (e.g. variable-length array `vla_t(n)` where n is loaded at
   runtime). Dispatch by is_num/is_ssa: load i64 from op_stack
   for runtime size, keep const path for compile-time size.

2. `mem_stack:store_index(idx, k, v)` was called without the
   required 5th `type` argument, hitting the
   "attempt to concatenate local 'type' (a nil value)" assert
   in MemoryStack.store_index. Pass 'i64' explicitly for both
   cdata.ctypeid and cdata.size stores.

After the fix, lj_299 produces UNSAT (1 formula).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
`ljopt_init_new_trace` asserted on the second LJ trace whose
bytecode-hash + SLOAD-types signature collided with an existing
record. Real reproducers (e.g. tests/reproducers/lj_311.lua)
record many small traces from many call sites; a fingerprint
collision crashed the whole recording pass with
"Trace with exactly this bytecode already exists ...".

Skip the duplicate (nil out traces_num[tr] so subsequent
ljopt_savetrace / ljopt_savesnap drop their data) and continue.
The first trace with that fingerprint stays; only the colliding
ones are dropped. No false-positive risk: an uncomparable trace
just isn't paired across opt levels.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
The first XLOAD/XSTORE commits only declared Int and I64
variants; FFI structs containing `double` fields produce
`num XLOAD` / `num XSTORE` (e.g. tests/reproducers/lj_651.lua
reads/writes a struct {double x, y}). Without a Num impl those
nodes were dummied, breaking the dependency chain into the loop
guards.

Add IRNodeXLOADNum extracting via `get-fp` (num MemCells store
native FP after the fp-val refactor) and IRNodeXSTORENum
wrapping with `fp-val`. Refactor make_xload to take the
extractor and make_xstore to dispatch on op_type.NUM via
retrieve_num_op.

lj_651 still SAT due to unrelated guard-set asymmetry between
unopt/opt traces; XLOAD/XSTORE Num themselves now translate
correctly.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
canonicalise ctype id

Two issues converged on lj_651, both producing SAT on otherwise
equivalent traces:

1. `mem_stack:allocate(ssa_ref)` — the single-arg form takes
   the inherit branch and ties the freshly-allocated cdata
   slot to `shared_mem[ssa_ref]`. CNEW's ssa_ref differs
   between unopt (e.g. 30) and opt (e.g. 18) traces, so the
   two slots inherited from different shared cells, leaving
   their default array values free to diverge — solver picks
   different values, OR-clause fires.

   CNEW allocates *fresh* memory; nothing to inherit. Call
   `allocate()` with no arg → zero-init branch.

2. CNEW's left_op is the LJ ctype id, an opaque per-process
   counter. Anonymous `ffi.typeof("struct {...}")` returns a
   fresh ctype on each call, so two record() runs see
   different ids (lj_651: 106 in unopt, 166 in opt). The id
   is never read back from mem_stack downstream; store a
   stable canonical 0 instead so traces match.

After both fixes lj_651 (and similar FFI cdata reproducers)
report UNSAT on the FIXED LJ build.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Adds a 0-arg `nil-val` constructor to the MemCell datatype
and uses it for the `nil` HSTORE variant (HSTORE on `nil`-typed
right operand). `zero_pointer` default is now `nil-val`, so
unwritten cells read as nil too.

HLOAD num emits a real type guard:
  te[ssa] = (not ((_ is nil-val) raw_cell))
instead of the previous unconditional `true`. This is what
"guarded HLOAD" means at the IR level — a `>` flag fires the
side exit when the loaded slot turned out to be nil.

Plumbing: MemoryStack.store_index_raw / load_index_raw
expose pre-wrapped MemCell expressions so HSTORE-nil can
write `nil-val` directly and HLOAD's type-guard can run
`(_ is nil-val)` on the raw cell.

Note: alone this still doesn't catch lj_1133 — that test
hinges on t1/t2 aliasing through different VM-stack slots,
which ljopt currently models as independent memory regions.
A separate aliasing layer is needed to surface the bug.

No regressions on detected reproducers (lj_783, lj_1079,
lj_299, lj_524, lj_651, lj_1086, lj_fix-fold-simplify-conv-
sext). Existing tests unchanged.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
@aleksisch aleksisch force-pushed the aleksisch/support-ffi branch from 1df1ba4 to a913b9f Compare June 10, 2026 08:58
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Support various FFI types

1 participant