chore: Configure Renovate #1
Closed
renovate[bot] wants to merge 2 commits into
2e84ccc to b992aa2
6643bcf to acf8995
47ab09e to b8ee43e
9742a4d to f83c525
d2891a0 to fdb89c1
d26316d to f7e9553
3e04b59 to bb8587f
siyul-park added a commit that referenced this pull request on May 30, 2026
…c name

The earlier split surfaced two types, scanner (linear-scan policy) and the rewrite/rewriteOp/resolveVReg trio (operand substitution), when both serve a single transformation: rewrite the instruction list so its operands reference physical registers. Both became the rewriter.

The work still happens in two passes internally (assign first, then rewriteAll) because pin preemption can evict a vreg from the pool at the same instruction that still references it (e.g. `ADDI vSPOut, vSPE, #1` where both vregs share scratch slot rSP). Splitting into "build the persistent vreg → preg map, then substitute" lets the rewrite pass see the assignment even after the pool has reclaimed the slot.

The merge is in the public surface: rewriter.run is the only entry point; assign / rewriteAll / ensure / resolve are private steps.

regPool reverts to its prior name regAlloc: the rewriter handles the "linear scan register allocation" framing, so the underlying pool can keep its established label without confusing the layering.

Files reshuffle to one concept each:
- asm/regalloc.go: the register pool with both-direction bindings.
- asm/rewriter.go: assign + rewriteAll + ensure + resolve + lastUses.
- asm/emit.go: encode pipeline (renamed from encode.go to avoid visual clash with encoder.go).

Assembler.Build now delegates to a single rewriter.run call instead of calling scanner + rewrite separately.
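The two-pass shape described in this commit message can be illustrated with a minimal sketch. Everything here is hypothetical: `VReg`, `PReg`, `Inst`, `assign`, and `rewrite` are toy stand-ins, not the project's types, and the sketch swaps pin preemption for a simpler trigger (a register reclaimed at its vreg's last use) to reach the same situation, where the pool has given a slot away while an operand on the same instruction still refers to it.

```go
package main

import "fmt"

// VReg and PReg are hypothetical stand-ins for virtual and physical registers.
type (
	VReg int
	PReg int
)

// Inst is a toy instruction: one defined vreg plus the vregs it reads.
type Inst struct {
	Op   string
	Def  VReg
	Uses []VReg
}

// assign is pass one. It walks the list once and records a persistent
// vreg -> preg map. A register goes back to the free list at its vreg's last
// use, but the map entry stays, so pass two can still resolve that vreg.
func assign(insts []Inst, regs []PReg) (map[VReg]PReg, error) {
	lastUse := map[VReg]int{}
	for i, in := range insts {
		for _, u := range in.Uses {
			lastUse[u] = i
		}
	}
	free := append([]PReg(nil), regs...)
	assigned := map[VReg]PReg{}
	released := map[VReg]bool{}
	for i, in := range insts {
		// Reclaim dying uses before binding the def, so the def may take
		// the very slot that a use on this instruction still refers to.
		for _, u := range in.Uses {
			if p, ok := assigned[u]; ok && lastUse[u] == i && !released[u] {
				released[u] = true
				free = append(free, p)
			}
		}
		if len(free) == 0 {
			return nil, fmt.Errorf("no registers available at instruction %d", i)
		}
		assigned[in.Def] = free[len(free)-1]
		free = free[:len(free)-1]
	}
	return assigned, nil
}

// rewrite is pass two: pure substitution through the persistent map.
func rewrite(insts []Inst, assigned map[VReg]PReg) []string {
	out := make([]string, 0, len(insts))
	for _, in := range insts {
		s := fmt.Sprintf("%s r%d", in.Op, assigned[in.Def])
		for _, u := range in.Uses {
			s += fmt.Sprintf(", r%d", assigned[u])
		}
		out = append(out, s)
	}
	return out
}

func main() {
	insts := []Inst{
		{Op: "LDI", Def: 0},
		{Op: "ADDI", Def: 1, Uses: []VReg{0}}, // v1 and v0 share one slot
	}
	assigned, err := assign(insts, []PReg{7})
	if err != nil {
		panic(err)
	}
	for _, line := range rewrite(insts, assigned) {
		fmt.Println(line)
	}
}
```

With a single physical register, both vregs of the second instruction resolve to the same slot. A single-pass substitution that consulted only the live pool would have lost the binding for `v0` by the time it rewrote that operand.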
siyul-park added a commit that referenced this pull request on Jun 4, 2026
* refactor(asm,jit): rebuild asm + introduce top-level jit skeleton

asm is now arch-neutral and has zero minivm dependencies. The public surface is asmjit-inspired: Arch is an interface returned by per-arch factory functions; Assembler.Build produces a Code (bytes + label table + relocations); Link writes Codes into a Buffer, patches external relocations against a Resolver, and binds Callables via the ABI. Signature is now data-only (Args, Returns, Scratch); multi-exit semantics move out of asm/. Buffer manages W^X internally; the new Data type owns the always-writable region used for runtime-patched slots. Callable exposes Addr() so consumers can emit direct branches without re-entering Go.

asm/arm64 wraps the existing AArch64 encoder and trampoline behind the new Arch interface; asm/amd64 is a compile-clean stub returning ErrNotImplemented from every Encode/NewCallable call.

A new top-level jit/ package owns the compilation driver, the Slots indirection table, a Layout struct that consumers register via Bind so jit never imports them, and a Lowerer interface dispatched by runtime.GOARCH. jit/arm64 and jit/amd64 contain Lowerer scaffolding; real opcode lowering follows in the next change.

interp loses its asm.Buffer field, takes a *jit.Compiler instead, and falls back to threaded execution for every opcode until Phase A lowering lands. JIT-specific tests continue to gate on jit.Active() and stay skipped while the Lowerer is unregistered.

* feat(jit): wire compile pipeline + NOP segment proof-of-concept

The Lowerer interface gains an Arch() method so each backend declares its own asm.Arch, plus an Exit hook that emits the segment terminator (write next IP into the agreed scratch slot, then RET). Compiler.Compile now walks the function bytecode from IP 0, dispatches per opcode through the registered Lowerer, and produces an asm.Code linked into the compiler's buffer; the resulting Callable lands in Module.Segments.
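As a rough illustration of the walk-and-dispatch loop described above, the sketch below lowers opcodes through an interface until one is rejected, then emits the terminator. The `Opcode` values, the reduced `Lowerer` interface, the string pseudo-instructions, and the choice to end the segment at the first rejected opcode are all assumptions made for the example; the real interface works against asm types rather than strings.

```go
package main

import (
	"errors"
	"fmt"
)

// Opcode is a hypothetical one-byte bytecode opcode.
type Opcode byte

const (
	NOP Opcode = iota
	HALT
)

// Lowerer is a reduced, hypothetical version of the interface in the text:
// Lower emits code for one opcode (or rejects it), Exit emits the terminator.
type Lowerer interface {
	Lower(op Opcode, out *[]string) bool
	Exit(nextIP int, out *[]string)
}

// nopLowerer only knows how to lower NOP.
type nopLowerer struct{}

func (nopLowerer) Lower(op Opcode, out *[]string) bool {
	if op != NOP {
		return false
	}
	*out = append(*out, "NOP")
	return true
}

// Exit writes the next IP into scratch slot 4 and returns.
func (nopLowerer) Exit(nextIP int, out *[]string) {
	*out = append(*out, fmt.Sprintf("STORE scratch[4], #%d", nextIP), "RET")
}

// compile walks the bytecode from IP 0 and stops at the first opcode the
// lowerer rejects, leaving that IP for the interpreter to resume from.
func compile(code []Opcode, l Lowerer) ([]string, error) {
	if l == nil {
		return nil, errors.New("no lowerer registered")
	}
	var out []string
	ip := 0
	for ip < len(code) && l.Lower(code[ip], &out) {
		ip++ // every toy opcode is one byte wide
	}
	l.Exit(ip, &out)
	return out, nil
}

func main() {
	out, err := compile([]Opcode{NOP, NOP, HALT}, nopLowerer{})
	if err != nil {
		panic(err)
	}
	for _, line := range out {
		fmt.Println(line)
	}
}
```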
The arm64 Lowerer wires up scratch-slot conventions used by every Phase A segment (stack base, sp in/out, globals, constants, next IP at slot 4) and implements NOP as the first lowered opcode. A direct asm.Callable invocation test confirms the round trip: a chain of NOPs compiles, links, executes, and writes the correct exit IP into scratch[4].

interp.adaptSegment is now a real boxing closure: it packs i.stack, i.sp, i.globals, and i.constants base addresses into the scratch buffer before calling the segment and reads sp + next-IP back afterwards. The arm64 Lowerer is intentionally still unregistered, so JIT-related interp tests continue to skip while Phase A opcode coverage is incomplete.

asm.regAlloc grows an `excluded` mask separate from `blocked`: ABI scratch registers are now withheld from auto-allocation while staying reservable via Pin, which is what Lowerer.Exit needs to write scratch[4].

* feat(jit/arm64): lower DROP, DUP, and the *_CONST family

Each Phase A segment now tracks a logical VM-stack shadow in Context.Stack: pushes append a fresh VReg, pops drop the trailing entry, and DROP reuses the cheaper logical removal instead of touching memory. Exit walks the shadow at the end of the segment, computes the byte address of i.stack[sp] from the scratch register pair, and writes each retained VReg back at its (sp + i) slot before adjusting sp by the net delta.

The four *_CONST opcodes lower to a pre-computed NaN-boxed 64-bit immediate so the constant is materialized exactly once at compile time. I64_CONST rejects unboxable magnitudes that the threaded path would otherwise heap-promote, preserving correctness without dragging the heap into the segment ABI.

A direct asm.Callable test exercises three patterns end-to-end: const-only spill, const + DROP (top discarded without a store), and const + DUP (MOV plus dual spill). All three confirm the resulting stack memory and sp delta.
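The stack-shadow idea in this commit can be sketched as follows. The `segment` type and its string pseudo-instructions are hypothetical, and the sketch assumes each segment starts with an empty shadow, so the net sp delta at exit is just the number of retained entries.

```go
package main

import (
	"errors"
	"fmt"
)

// VReg is a hypothetical virtual register id.
type VReg int

// segment is a reduced, hypothetical lowering context.
type segment struct {
	stack []VReg   // logical VM-stack shadow
	next  VReg     // next fresh virtual register
	code  []string // emitted pseudo-instructions
}

var errUnderflow = errors.New("shadow stack underflow")

// push appends a fresh VReg to the shadow and returns it.
func (s *segment) push() VReg {
	v := s.next
	s.next++
	s.stack = append(s.stack, v)
	return v
}

// Const materializes an already-boxed 64-bit immediate into a fresh VReg.
func (s *segment) Const(bits uint64) {
	v := s.push()
	s.code = append(s.code, fmt.Sprintf("LDI v%d, #%#x", v, bits))
}

// Drop is purely logical: it emits no code and touches no memory.
func (s *segment) Drop() error {
	if len(s.stack) == 0 {
		return errUnderflow
	}
	s.stack = s.stack[:len(s.stack)-1]
	return nil
}

// Dup copies the top entry into a fresh VReg with a single MOV.
func (s *segment) Dup() error {
	if len(s.stack) == 0 {
		return errUnderflow
	}
	top := s.stack[len(s.stack)-1]
	v := s.push()
	s.code = append(s.code, fmt.Sprintf("MOV v%d, v%d", v, top))
	return nil
}

// Exit spills each retained VReg to slot sp+i, then adjusts sp by the delta.
func (s *segment) Exit() {
	for i, v := range s.stack {
		s.code = append(s.code, fmt.Sprintf("STR v%d, [stack, sp+%d]", v, i))
	}
	s.code = append(s.code, fmt.Sprintf("ADD sp, sp, #%d", len(s.stack)))
}

func main() {
	var dup segment
	dup.Const(42)
	if err := dup.Dup(); err != nil {
		panic(err)
	}
	dup.Exit()
	for _, line := range dup.code {
		fmt.Println(line)
	}
}
```

The const + DROP pattern in this model emits one LDI and no store at all, which matches the "top discarded without a store" case the commit message tests.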
The arm64 Lowerer stays unregistered with jit; interp-side JIT tests continue to skip until enough Phase A opcodes land for an end-to-end program to compile cleanly.

* feat(jit/arm64): lower SWAP, CONST_GET, LOCAL_*, and GLOBAL_*

Compile now takes a jit.Snapshot of the consumer's tables (constants, globals, and local kinds) so kind-sensitive opcodes can check at compile time whether they have to bail out for the runtime retain that Phase A does not yet model. The interp side builds the snapshot from i.constants, i.globals, and fn.Typ.Params ++ fn.Locals before each i.jit() attempt.

SWAP is a pure stack-shadow reorder. CONST_GET emits the boxed value directly as a 64-bit immediate so the constants table never needs a runtime base pointer, freeing the third scratch slot for the frame bp. GLOBAL_GET/SET issue plain LDR/STR off the globals scratch base, rejecting any idx whose current value is a ref. LOCAL_GET/SET compute their effective address as rStack + rBP*8 plus a fixed idx*8 displacement, again rejecting refs by kind.

The Lowerer remains unregistered with jit, so interp-level JIT tests continue to skip. Direct asm.Callable tests cover all five new opcodes end-to-end, including a GLOBAL_SET → GLOBAL_GET round trip and a LOCAL_SET → LOCAL_GET round trip with a non-zero bp.

* refactor(asm,jit,interp): align with project file layout conventions

Errors move from the centralized asm/errors.go to the file that owns each sentinel's producer (memory.go for the mmap family, buffer.go for ErrBufferFull, assembler.go for ErrConflictingPin, encoder.go for ErrInvalidOperand, link.go for ErrUnresolvedLabel, abi.go for the arity sentinels, regalloc.go for ErrNoRegistersAvailable, etc.). This matches how the rest of the project keeps errors near the code that returns them.

Files now follow the slot order from docs/coding-patterns.md §2.4: public types first, then private types, consts, vars (including interface compliance assertions), and finally functions and methods.
Constructors live right under their type. asm/reg.go and asm/operand.go move every public function above the methods that use them; jit/compiler.go puts Option next to Compiler instead of after the private config; and jit/lowerer.go orders Lowerer → Context → Snapshot so each declaration sits above the types it references.

Interface compliance assertions are now declared for every concrete asm.Arch / asm.ABI / asm.Encoder / jit.Lowerer implementation so the type system catches accidental signature drift. The unused ErrTooManyScratch sentinel in asm/arm64/abi.go is removed. stackBase in interp/interp.go moves to the package-level helper section at the bottom of the file alongside unboxRef.

* refactor(jit): drop speculative accessors, rename emptyModule, tidy docs

Helper audit per docs/coding-patterns.md:
- Compiler.Buffer() and Compiler.Data() had zero callers and no near-term use (Step 4 reaches the slot table via Compiler.Slots / SetSlots, which stay). Removed both accessors.
- Compiler config carried a separate arch field even though arch is always derived from cfg.lowerer.Arch(). Inlined and dropped the field.
- jit.Bound() was a speculative "is layout set" probe with no caller and no upcoming consumer. Removed alongside the matching `bound` flag.
- emptyModule misled: it returns a Module that already carries ParamKinds / ReturnKinds. Renamed to newModule, swapped the param order to (fn, addr) to match other constructors, and folded the two emptyModule(...) call sites in Compile into a single shared variable.

Layout / Slots / Module.ParamKinds / ReturnKinds and the WithBuffer / WithData options stay as-is because Steps 3–4 (whole-function Entry and direct-BL CALL) already need them.

* refactor(asm): regAlloc → regPool, introduce scanner, simplify build.go

The build pipeline used to ship state to a 6-argument package function that owned four closures over local maps.
The rewrite splits the mechanism (the register pool) from the policy (the linear-scan walk over the instruction list).

regAlloc is renamed to regPool, its real role. It now tracks both directions of every binding (vreg → preg via bindings, preg → vreg via owners) so pin-conflict detection can ask "who currently owns this slot?" without a parallel reverse map upstream. The scratch-vs-blocked distinction stays as before.

A new scanner type wraps the pool and owns the scan-time policy: pin preemption, lifetime-driven free, persistent assigned map, and the width back-fill that widthMap previously handled in a separate pass. The four closures (keyOf/bind/free/ensure) become explicit methods.

Instruction grows Def() and Uses() helpers so the scanner reads its operands through a clean API instead of three package-private query functions. memBase moves with them as a private helper on the same file.

rewrite/rewriteOp dedupe their VReg / MemOperand branches via a small resolveVReg helper. encode splits its two passes into named helpers (encodeWithPlaceholders + emitFinal) so the placeholder/patch dance is visible at the top of encode itself.

Assembler.Build no longer routes through a six-argument build() function; it orchestrates scanner.run → rewrite → encode inline.

Net effect: build.go drops the standalone build()/assignRegs/widthMap trio (≈100 LOC of state plumbing), scan-time state becomes addressable fields instead of closure captures, and the pool/policy split is the public design.
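A pool that tracks both directions of each binding might look like the sketch below. The `pool` type and its `bind`, `release`, `owner`, and `pin` methods are hypothetical names chosen for the example, not the project's regPool API; the point is only that a pin conflict is answered from the `owners` map without scanning `bindings`.

```go
package main

import "fmt"

// VReg and PReg are hypothetical virtual and physical register ids.
type (
	VReg int
	PReg int
)

// pool keeps both directions of every binding in sync.
type pool struct {
	free     []PReg
	bindings map[VReg]PReg // vreg -> preg
	owners   map[PReg]VReg // preg -> vreg
}

func newPool(regs ...PReg) *pool {
	return &pool{
		free:     append([]PReg(nil), regs...),
		bindings: map[VReg]PReg{},
		owners:   map[PReg]VReg{},
	}
}

// bind returns v's register, taking one from the free list if needed.
func (p *pool) bind(v VReg) (PReg, bool) {
	if r, ok := p.bindings[v]; ok {
		return r, true
	}
	if len(p.free) == 0 {
		return 0, false
	}
	r := p.free[0]
	p.free = p.free[1:]
	p.bindings[v] = r
	p.owners[r] = v
	return r, true
}

// release drops v's binding in both maps and frees its register.
func (p *pool) release(v VReg) {
	r, ok := p.bindings[v]
	if !ok {
		return
	}
	delete(p.bindings, v)
	delete(p.owners, r)
	p.free = append(p.free, r)
}

// owner answers "who currently owns this slot?" in O(1).
func (p *pool) owner(r PReg) (VReg, bool) {
	v, ok := p.owners[r]
	return v, ok
}

// pin forces v onto r and reports the vreg it preempted, if any.
func (p *pool) pin(v VReg, r PReg) (VReg, bool) {
	if cur, ok := p.bindings[v]; ok {
		if cur == r {
			return 0, false
		}
		p.release(v)
	}
	old, had := p.owners[r]
	if had {
		delete(p.bindings, old) // evict the previous owner
	} else {
		for i, f := range p.free { // r must leave the free list
			if f == r {
				p.free = append(p.free[:i], p.free[i+1:]...)
				break
			}
		}
	}
	p.bindings[v] = r
	p.owners[r] = v
	return old, had
}

func main() {
	p := newPool(1, 2)
	r, _ := p.bind(10)
	evicted, had := p.pin(11, r)
	fmt.Println("pinned v11 to", r, "evicted:", evicted, had)
}
```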
* feat(asm): implement linear-scan register allocation with scanner and encoding functions

* refactor(asm): simplify scanner methods and unify backfill functionality

* refactor(asm): move lastUses function to scanner for better organization

* refactor(asm): merge scanner + rewrite into rewriter, restore regAlloc name

The earlier split surfaced two types, scanner (linear-scan policy) and the rewrite/rewriteOp/resolveVReg trio (operand substitution), when both serve a single transformation: rewrite the instruction list so its operands reference physical registers. Both became the rewriter.

The work still happens in two passes internally (assign first, then rewriteAll) because pin preemption can evict a vreg from the pool at the same instruction that still references it (e.g. `ADDI vSPOut, vSPE, #1` where both vregs share scratch slot rSP). Splitting into "build the persistent vreg → preg map, then substitute" lets the rewrite pass see the assignment even after the pool has reclaimed the slot.

The merge is in the public surface: rewriter.run is the only entry point; assign / rewriteAll / ensure / resolve are private steps.

regPool reverts to its prior name regAlloc: the rewriter handles the "linear scan register allocation" framing, so the underlying pool can keep its established label without confusing the layering.

Files reshuffle to one concept each:
- asm/regalloc.go: the register pool with both-direction bindings.
- asm/rewriter.go: assign + rewriteAll + ensure + resolve + lastUses.
- asm/emit.go: encode pipeline (renamed from encode.go to avoid visual clash with encoder.go).

Assembler.Build now delegates to a single rewriter.run call instead of calling scanner + rewrite separately.

* refactor(asm): emit pipeline becomes Assembler methods, drop emit.go

emit / emitDraft / emitFinal now hang off *Assembler so they pick up the encoder through a.arch.Encoder() instead of receiving it on every call. Build delegates to a.emit(rewritten) in one line.
emit.go disappears: all three methods live alongside Build in assembler.go.

Naming sweep at the same time:
- encode → emit (the byte-stream pipeline; "encode" was confusable with Encoder.Encode, which handles only one instruction).
- encodeWithPlaceholders → emitDraft (draft = preliminary; pairs with emitFinal).
- rewriter: rewriteAll → rewrite (slice form), the old per-inst rewrite → rewriteInst, and backfillPinWidths inlines into assign since it is five lines used in one place.

* feat(jit/arm64): lower I32 arithmetic, bitwise, and EQZ

Six i32 opcodes lower to native code via the segment-stack shadow:
- I32_ADD / I32_SUB / I32_MUL fold through a single helper that runs the op on the 64-bit boxed inputs, then masks the result to 32 bits and re-tags it as a fresh Boxed.
- I32_AND / I32_OR take the fast path because ANDing or ORing two identically tagged values preserves the tag, so no re-box is required.
- I32_XOR cancels the tag bits, so the lowering ORs the tag back in after the EOR.
- I32_EQZ masks the operand to 32 bits, compares to zero, and uses CSET + the existing boxI32 helper to produce a boxed boolean i32.

A shared boxI32 helper threads the ANDI mask + LDI tag + ORR sequence behind a single call so each opcode keeps its body short. The encoded tag constant lives next to the helper as boxedI32Tag for clarity.

Two arm64 lowerer tests cover the new opcodes end to end (CONST + ADD gives a boxed sum, CONST + EQZ gives boxed 1/0).

* feat(jit/arm64): lower I32 comparisons and shifts

Thirteen more i32 opcodes lower to native code:
- I32_SHL / I32_SHR_S / I32_SHR_U share a lowerI32Shift helper. The shift count is ANDed with 0x1F so ARM64's wider register-shift semantics match WebAssembly's i32 shift modulo 32. The value lane is prepped per opcode: zero-extend for logical shifts, sign-extend for the arithmetic right shift.
- I32_EQ / I32_NE compare boxed inputs directly (tags match so a 64-bit CMP is correct without prep).
- I32_LT_S / I32_LE_S / I32_GT_S / I32_GE_S sign-extend each operand via SXTW before the 64-bit CMP so the chosen LT/LE/GT/GE condition code reads the proper N/V flags.
- I32_LT_U / I32_LE_U / I32_GT_U / I32_GE_U zero-extend via an ANDI mask and use the unsigned condition codes (CC/LS/HI/CS).

Two new helpers, signExtendI32 and zeroExtendI32, keep the prep shapes consistent across compares and shifts. Five lowerer tests cover the cases where signed/unsigned bit patterns diverge.

* feat(jit/arm64): lower I64 arithmetic, comparisons, and shifts

Fourteen i64 opcodes lower to native code, modeled after the i32 set with adjustments for the 49-bit value lane:
- I64_ADD / I64_SUB / I64_MUL sign-extend both inputs to 64 bits, run the op, then mask back to 49 bits and re-tag. Results that exceed the boxable range silently wrap; full heap-promote handling lands in a later phase.
- I64_EQ / I64_NE compare boxed inputs directly (tags match).
- I64_EQZ masks off the tag and compares the value lane to zero, pushing a boxed i32 0/1 per WebAssembly semantics.
- I64_LT_S / LE_S / GT_S / GE_S sign-extend via shift-pair and use the signed condition codes; LT_U / LE_U / GT_U / GE_U zero-extend with the value-lane mask and use the unsigned codes.
- I64_SHL / SHR_S / SHR_U mirror the i32 shifts but mask the count to 6 bits (the i64 shift modulo).

Three helpers carry the new shape: signExtendI64 / zeroExtendI64 / boxI64. Five lowerer tests cover ADD, signed compare with a negative input, EQZ for zero and non-zero, and SHL.

WebAssembly i64 does not have AND / OR / XOR opcodes in this VM, so the bitwise family is intentionally absent from the dispatch table.

* refactor(jit,asm): return segment stack via ABI returns

Segments now expose their stack shadow through asm.Callable returns instead of writing back through a scratch slot.
The jit package owns the shared scratch layout (jit/segment.go) used by lowerers and the interpreter adapter, and asm extracts its emit pipeline into a dedicated emitter struct (asm/emitter.go).
- jit/compiler: cap segment stack at ABI MaxReturns, plumb Returns into the Code signature
- jit/lowerer: add Context.Returns; Exit maps stack shadow to returns
- jit/arm64: rewrite lowerer around the new ABI; inline BoxReturn at callsites as types.Boxed(v.Bits())
- interp/adaptSegment: read returns off the Callable and push onto the interpreter stack via jit.Scratch* indices
- docs/jit-internals: sync prose with the new ABI

* feat(jit): enhance segment handling with input args and stack management

* feat(jit): compile profiled segments

* feat(jit): expose internal segment entries

* feat(jit/arm64): lower F32/F64 arithmetic, comparisons, and conversions

Add float scalar lowering to the arm64 JIT backend:
- F32/F64 binary ops (ADD, SUB, MUL, DIV) via FADD/FSUB/FMUL/FDIV
- F32/F64 comparisons (EQ, NE, LT, GT, LE, GE) via FCMP + CSET
- Int-to-float conversions (I32/I64 → F32/F64, signed and unsigned) via SCVTF/UCVTF
- F32↔F64 widening/narrowing via FCVT
- Extend FMOV encoder to accept cross-width Int64↔Float32 operand pairs
- 7 new TestLowerer_Compile subtests covering all new paths

* refactor: JIT Compiler and Lowerer for Improved Function Compilation

- Updated the `lowerer_test.go` to enhance test coverage for CONST_GET and function compilation scenarios, including immediate calls and entry point validations.
- Refactored the `compiler.go` to streamline the compilation process, introducing a whole-function compilation strategy that allows for direct callable entries when all opcodes lower successfully.
- Enhanced the `lowerer.go` to support whole-function compilation and improved context management during opcode lowering.
- Modified the `module.go` to clarify the structure of the Module, separating Params and Returns for better clarity and usability.
- Adjusted the `slots.go` to improve slot management and ensure thread safety with mutex protection.

* feat: implement JIT support for reference operations: REF_NULL, REF_IS_NULL, and REF_EQ

- Added handling for REF_NULL in the ARM64 lowerer to push a null reference constant onto the shadow stack.
- Implemented REF_IS_NULL to check if a boxed reference is null, returning BoxI32(1) for null and BoxI32(0) otherwise.
- Implemented REF_EQ to compare two boxed references, returning BoxI32(1) if they are the same and BoxI32(0) otherwise.
- Updated tests to verify the correct behavior of REF_NULL, REF_IS_NULL, and REF_EQ operations in the JIT context.

* feat(jit): implement BR_TABLE lowering and corresponding tests

* feat(jit): implement multi-block entry and intra-function branching for BR_IF and BR instructions

* fix: prevent use-after-free in Buffer/Data grow and tighten JIT correctness

- asm/data.go: retire old mmap regions to `old []memory` instead of munmap-ing them; Alloc pointers baked into compiled native code remain valid until Data.Free() releases all regions at once
- asm/buffer.go: same lifetime fix for Buffer; grow seals the current region before archiving it, writeAt now searches both current and archived regions so LinkAll relocation patching works across a grow
- jit/arm64/lowerer.go: guard globalGet/globalSet against idx > 4095 to prevent int16 overflow in 12-bit LDR/STR unsigned-offset encoding
- jit/register.go: promote registryMu to sync.RWMutex; Lookup uses RLock to avoid write contention on every Active() call in the hot path
- jit/compiler.go: check emit result in blocks(); bail instead of installing malformed native code when a lowerer rejects during emit
- jit/arm64/lowerer_test.go: add NaN comparison tests (f32_lt/gt/ge/ne) documenting that AArch64 FCMP NaN flags (N=0,Z=0,C=1,V=1) produce correct Wasm results for all six ordered-comparison conditions

* refactor: simplify jit asm boundaries

* feat(jit): enhance JIT for recursive functions and improve entry handling

* perf(jit): direct self-recursive calls

* feat(jit): carry block stack state

* perf(interp): inline jitted fused calls

* Refactor ARM64 Lowerer and Compiler for Improved Clarity and Functionality

- Updated the Lowerer interface and its Context structure to enhance clarity and maintainability.
- Replaced direct references to sign and zero extension functions with instance methods in the Lowerer struct.
- Modified the Compiler struct to reorder fields for better organization and readability.
- Streamlined the Compile method to handle whole-function and multi-block entries more efficiently.
- Introduced a new Invoke function to encapsulate the invocation logic for callables, improving separation of concerns.
- Enhanced the handling of stack and globals in the Invoke function, ensuring proper memory management.
- Removed redundant functions and comments to clean up the codebase and improve overall readability.

* docs: update coding patterns and agent guidelines with frequent style traps

* refactor(jit,asm,interp): collapse boilerplate across compile pipeline

Driver-side IP advance: jit.Compiler.walk now bumps c.IP by opcode width after Lower returns true; arm64 handlers no longer carry the `c.IP += instr.Instruction(c.Code[c.IP:]).Width()` postlude. Stop-mode handlers (br/brIf/brTable/ret) still own their IP. Lower contract updated in docs/jit-internals.md.

jit/compiler.go: whole() and blocks() share function() + walkFunction() + prepare(); pin/pins parallel maps replaced by a single assign() over basis + per-candidate local maps in entries().

jit/lowerer.go: Context fields layered per docs/coding-patterns.md §2.5 (infra → program data → runtime state → flow → flags).

arm64/lowerer.go: pinReturns() method replaces a 6-line block copied across Exit/brIf/brTable. global() drops its width return now that the driver advances IP. imm() drops its width parameter.

interp/interp.go: jit() split into ensureCompiler + install.
snapshot() and New() share types.(*Function).LocalKinds() via the new types.Kinds helper.

asm/buffer.go: writeAt() consolidates current/archived region paths through buffer.locate() + memory.within().

Tests + race detector green across all packages.

* refactor(jit,arm64): rename pinReturns to returns for clarity

* fix(jit/amd64): drop stub auto-registration so x86 reports no JIT

The amd64 backend is a skeletal Lowerer that rejects every opcode. While it was registered via init(), jit.Active() returned a non-nil stub on x86_64, which made interp_test.go's requireJIT helper see "JIT available" and run JIT-counter tests that can never pass on the stub.

Removing the init registration lets jit.Lookup("amd64") return nil, so Active() is nil on x86_64 and requireJIT skips the affected tests cleanly. The interpreter already short-circuits when Active() is nil, so the threaded fallback path is preserved. The Lowerer type stays in place as a drop-in for a future real codegen pass.

* refactor(asm,jit,interp): drop redundant wrappers and white-box test peek

- asm: remove Link() wrapper around LinkAll(); the only call site was one test, all production callers already use LinkAll directly
- asm: inline single-use helpers (rewriter.recordWidth, Instruction.memBase, regAlloc.keyOf) per project counter-rule for one-shot extraction
- asm: move region.mu to bottom of struct so sync primitives follow policy/infrastructure/runtime/config fields
- interp: drop white-box assertion on unexported i.jitted map; downstream Run output and profiler stats already cover the behavior

* refactor(asm): rename LinkAll to Link

The All suffix was needed only while the old Link() wrapper still existed. With that wrapper removed, the simpler name is free.
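The registration behaviour described in the fix(jit/amd64) entry, together with the earlier switch of registryMu to sync.RWMutex, can be sketched roughly as below. The reduced `Lowerer` interface and the `Register` function are assumptions for the example; only the names `Lookup`, `Active`, and `registryMu` come from the commit messages, and their real signatures may differ.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// Lowerer is a reduced, hypothetical backend interface.
type Lowerer interface{ Arch() string }

var (
	registryMu sync.RWMutex
	registry   = map[string]Lowerer{}
)

// Register installs a backend; it takes the write lock.
func Register(arch string, l Lowerer) {
	registryMu.Lock()
	defer registryMu.Unlock()
	registry[arch] = l
}

// Lookup takes only the read lock, so concurrent hot-path callers do not
// serialize on each other. A missing entry yields a nil interface.
func Lookup(arch string) Lowerer {
	registryMu.RLock()
	defer registryMu.RUnlock()
	return registry[arch]
}

// Active returns the backend for the running architecture, or nil.
func Active() Lowerer { return Lookup(runtime.GOARCH) }

// arm64Lowerer stands in for a real backend.
type arm64Lowerer struct{}

func (arm64Lowerer) Arch() string { return "arm64" }

func main() {
	// Only arm64 registers itself; there is no amd64 stub registration.
	Register("arm64", arm64Lowerer{})
	fmt.Println("amd64 registered:", Lookup("amd64") != nil)
	fmt.Println("arm64 registered:", Lookup("arm64") != nil)
	fmt.Println("JIT active on "+runtime.GOARCH+":", Active() != nil)
}
```

Because an unregistered architecture yields a plain nil interface, a caller can gate on `Active() != nil` and fall back to the interpreter without knowing which backends exist.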
* refactor(jit): tighten Compiler method names

- walkFunction -> walkBlocks: describes the per-block dispatch loop, less generic than "Function" and complements the inner walk over opcodes
- context -> newContext: standard Go constructor-style name for a method that allocates and returns a fresh Context
This was referenced Jun 13, 2026
Merged
Welcome to Renovate! This is an onboarding PR to help you understand and configure settings before regular Pull Requests begin.
🚦 To activate Renovate, merge this Pull Request. To disable Renovate, simply close this Pull Request unmerged.
Detected Package Files
- go.mod (gomod)

Configuration Summary
Based on the default config's presets, Renovate will:
- Use semantic commit type `fix` for dependencies and `chore` for all others if semantic commits are in use.
- Ignore `node_modules`, `bower_components`, `vendor` and various test/tests (except for nuget) directories.

🔡 Do you want to change how Renovate upgrades your dependencies? Add your custom config to `renovate.json` in this branch. Renovate will update the Pull Request description the next time it runs.

What to Expect
It looks like your repository dependencies are already up-to-date and no Pull Requests will be necessary right away.
❓ Got questions? Check out Renovate's Docs, particularly the Getting Started section.
If you need any further assistance then you can also request help here.
This PR was generated by Mend Renovate. View the repository job log.