Conversation

@fitzgen
Member

fitzgen commented Oct 8, 2025

This commit adds the extremely unsafe
`wasmtime::CodeBuilder::expose_unsafe_intrinsics` method. When enabled, the Wasm being compiled is given access to special imports that correspond to direct, unchecked and unsandboxed, native load and store operations. These intrinsics are intended to be used for implementing fast, inline-able versions of WASI interfaces that are special-cased to a particular host embedding, for example.

Compile-time builtins, as originally described in the RFC (bytecodealliance/rfcs#43), are basically made up of three parts:

  1. A function inliner
  2. Unsafe intrinsics
  3. Component composition to encapsulate the usage of unsafe intrinsics in a safe interface

Part (1) has been implemented in Wasmtime and Cranelift for a little while now (see `wasmtime::Config::compiler_inlining`). This commit is part (2). After this commit lands, part (3) can be done with `wac` and `wasm-compose`, although follow-up work is required to make the developer experience nicer and more integrated into Wasmtime so that the APIs can look like those proposed in the RFC.
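
To make the shape of this concrete from an embedder's point of view, here is a minimal sketch of how the pieces could fit together. Only `Config::compiler_inlining` and `CodeBuilder::expose_unsafe_intrinsics` are named by this PR; the exact signature of `expose_unsafe_intrinsics` (and whether it is an `unsafe fn`) is an assumption in this sketch, not the final API.

```rust
use wasmtime::{CodeBuilder, Config, Engine, Result};

fn compile_trusted_builtin(wasm: &[u8]) -> Result<()> {
    // Part (1): turn on the function inliner so that calls into the
    // compile-time builtin can be inlined into its callers.
    let mut config = Config::new();
    config.compiler_inlining(true);
    let engine = Engine::new(&config)?;

    // Part (2): expose the unsafe intrinsics, but only to this trusted,
    // embedder-authored component. The method name comes from this PR;
    // its exact signature is assumed here.
    let mut builder = CodeBuilder::new(&engine);
    builder.wasm_binary(wasm, None)?;
    builder.expose_unsafe_intrinsics(true);
    let _component = builder.compile_component()?;

    // Part (3): compose the resulting trusted component with untrusted
    // guest components (e.g. via `wac` or `wasm-compose`) so that the
    // unsafe intrinsics are encapsulated behind a safe interface and are
    // never exported to untrusted code.
    Ok(())
}
```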


I still have a little bit of doc comments and examples to fill out, but I thought it would be worth opening this PR up so that folks can start taking a look now, especially as I am taking Friday off and have a super-packed day tomorrow and probably won't have time to cross all the Ts and dot all the Is before next week.

One thing that no one brought up during the RFC but which started bugging me during this implementation is whether we can expose tools for compile-time builtin authors to do spectre mitigations. Basically expose an intrinsic that lowers to `spectre_select_guard` or something? Seems possible but I haven't explored the design space too much yet. Also seems like it is probably something we can do in an additive fashion, without needing to figure everything out before landing any intrinsics. Interested in folks' thoughts!

@fitzgen fitzgen requested review from a team as code owners October 8, 2025 23:33
@fitzgen fitzgen requested review from alexcrichton and cfallin and removed request for a team October 8, 2025 23:33
@github-actions github-actions bot added the wasmtime:api and winch labels Oct 9, 2025
@github-actions

github-actions bot commented Oct 9, 2025

Subscribe to Label Action

cc @saulecabrera

This issue or pull request has been labeled: "wasmtime:api", "winch"

Thus the following users have been cc'd because of the following labels:

  • saulecabrera: winch

To subscribe or unsubscribe from this label, edit the .github/subscribe-to-label.json configuration file.


@fitzgen fitzgen force-pushed the new-compile-time-builtins branch 2 times, most recently from 14489f4 to 60daa62 October 9, 2025 17:36
Comment on lines +166 to +168
/// A map from a `UnsafeIntrinsic::index()` to that intrinsic's
/// module-interned type.
pub unsafe_intrinsics: [PackedOption<ModuleInternedTypeIndex>; UnsafeIntrinsic::len() as usize],
Member

If it's expected that the list of intrinsics is going to grow over time, should this perhaps be PrimaryMap<SomethingIndex, (ModuleInternedTypeIndex, UnsafeIntrinsic)>? Basically ignoring unused intrinsics.

Member Author

My thinking has been that we can cross that bridge if we get to it.

Right now, there are very few intrinsics, and I'm not concerned about the size of little arrays like this. However, it definitely is intended that a component that doesn't use these intrinsics doesn't have them compiled into its text section and doesn't have any additional space reserved for their VMFuncRefs and whatnot in its vmctx layout.
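
For readers skimming the thread, here is a self-contained sketch of the two data shapes being weighed above. `ModuleInternedTypeIndex` and `UnsafeIntrinsic` are stand-ins for the real `wasmtime-environ` types, and the index type for the `PrimaryMap` variant is hypothetical.

```rust
use cranelift_entity::packed_option::PackedOption;
use cranelift_entity::{entity_impl, PrimaryMap};

// Stand-in for the real `wasmtime_environ::ModuleInternedTypeIndex`.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct ModuleInternedTypeIndex(u32);
entity_impl!(ModuleInternedTypeIndex);

// Hypothetical index type for the `PrimaryMap` alternative.
#[derive(Clone, Copy, PartialEq, Eq, Hash)]
pub struct UnsafeIntrinsicIndex(u32);
entity_impl!(UnsafeIntrinsicIndex);

// Stand-in for the enum added by this PR; the real variants are the
// native load/store intrinsics, these names are hypothetical.
#[derive(Clone, Copy)]
pub enum UnsafeIntrinsic {
    LoadU32,
    StoreU32,
}

impl UnsafeIntrinsic {
    pub const fn len() -> u32 {
        2
    }
}

// Shape used in the PR as written: one slot per known intrinsic,
// indexed by `UnsafeIntrinsic::index()`, empty for unused intrinsics.
pub struct PerIntrinsicSlots {
    pub unsafe_intrinsics:
        [PackedOption<ModuleInternedTypeIndex>; UnsafeIntrinsic::len() as usize],
}

// Shape suggested in the review above: a dense map holding only the
// intrinsics a component actually uses.
pub struct UsedIntrinsicsOnly {
    pub unsafe_intrinsics:
        PrimaryMap<UnsafeIntrinsicIndex, (ModuleInternedTypeIndex, UnsafeIntrinsic)>,
}
```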

Comment on lines +690 to +713
// Initialize the unsafe intrinsics used by this component, if any.
for (i, module_ty) in env_component
    .unsafe_intrinsics
    .iter()
    .enumerate()
    .filter_map(|(i, ty)| ty.expand().map(|ty| (i, ty)))
{
    let i = u32::try_from(i).unwrap();
    let intrinsic = UnsafeIntrinsic::from_u32(i);
    let ptrs = self.component.unsafe_intrinsic_ptrs(intrinsic).expect(
        "should have intrinsic pointers given that we assigned the intrinsic a type",
    );
    let shared_ty = self
        .component
        .signatures()
        .shared_type(module_ty)
        .expect("should have a shared type");
    self.instance_mut(store.0).set_intrinsic(
        intrinsic,
        ptrs.wasm_call,
        ptrs.array_call,
        shared_ty,
    );
}
Member

Do you see a viable path to eventually omitting this? For example, if we were to implement a DCE pass for functions, then in the case where no intrinsics are actually imported anywhere and they were inlined everywhere, all of these should get emptied out. "Just DCE" wouldn't be sufficient because of loops like this, however; in theory we'd have to prune the list of intrinsics after function optimization.

Member Author

Do you see a viable path to eventually omitting this?

When you say "this" what exactly are you referring to?

Already it should be the case that when unsafe intrinsics are not exposed to a component, this loop performs zero iterations and there should[1] be zero space reserved for intrinsics' VMFuncRefs in the vmctx.

For example, if we were to implement a DCE pass for functions, then in the case where no intrinsics are actually imported anywhere and they were inlined everywhere, all of these should get emptied out. "Just DCE" wouldn't be sufficient because of loops like this, however; in theory we'd have to prune the list of intrinsics after function optimization.

Yes, we would need to update the env_component.unsafe_intrinsics field after doing gc-sections/DCE during linking, same as we would need to do the moral equivalent for the VMFuncRefs of defined Wasm functions that are imported/exported within a component but are ultimately never called and are dead code.

Footnotes

  1. I think there may be a bug where the space is reserved unconditionally right now, looking at CI. But that is definitely unintentional.

Member

For "this" I mean this entire loop in the context when unsafe intrinsics are used. This loop exists for the vanishingly rare case that intrinsics are used but also turned into funcref values one way or another, but in practice they'll basically never get used.

In some sense I'm not saying much here: it's pretty clear that DCE alone won't make this loop go away, but a wasmtime-aware DCE pass which updated the unsafe_intrinsics list would. So all I'm really saying is that I think we should strive to make this loop go away in most situations when unsafe intrinsics are used, but that'll require fancy DCE.

@alexcrichton
Member

Oh, also, I'd recommend using prtest:full on this PR as this seems at high risk of passing on x64 and failing elsewhere.

@fitzgen fitzgen force-pushed the new-compile-time-builtins branch from 38f0dc1 to 3a34fe4 October 14, 2025 21:03
@fitzgen fitzgen requested a review from alexcrichton October 16, 2025 00:36
@fitzgen
Member Author

fitzgen commented Oct 16, 2025

@alexcrichton I think this should be ready for another review pass

Member

@alexcrichton alexcrichton left a comment

Thanks for slogging through all the CI bits and handling the miri bits, it's looking good!

To expand a bit on some of the unresolved comments from the previous review -- It feels a bit weird that there's different ways of managing the list of intrinsics for a component. The VMComponentContext either has 0 or all of them, compilation either compiles 0 or all of them, info::Component tracks a full list of intrinsics but has None for unneeded intrinsics, VMComponentContext initialization "nulls out" all intrinsics, and instantiation only fills in used intrinsics. To me this feels like a random mish-mash of different strategies to manage everything. I get your point about crossing the bridge when we get there, but I also feel like this PR is moving us to a state where it's pretty inconsistent how the intrinsics are handled. Some contexts are "all or nothing" and some contexts are "only used intrinsics".

Ideally I'd prefer a system where intrinsics were compacted/compiled on-demand as opposed to ever doing an "all or nothing" approach. My read of this is that this is basically a function of the initial analysis phase of a component and how clever it is. I would naively expect that fitting into the GlobalInitializer infrastructure would make the implementation "just fall out" by adding a new FooIndex type of some kind. Basically all the hash maps and helpers and such are all there, so I would naively expect the implementation to not be all that much work.

I'm perpetually worried about the quantity of work we defer given that the rate of burning down this deferred work is often much smaller than the rate of deferring, but this is a topic reasonable folks can disagree on. In that sense I'll lay out my concerns here, but I'll also leave it up to you whether to merge or not before addressing. If this merges as-is, though, mind opening an issue about these future improvements?

@fitzgen
Member Author

fitzgen commented Oct 16, 2025

I'm perpetually worried about the quantity of work we defer given that the rate of burning down this deferred work is often much smaller than the rate of deferring, but this is a topic reasonable folks can disagree on.

I hear you. I think it is important that we balance incrementalism and doing things the Right Way. Personally, I think that the lever we use to strike that balance is not accepting that things will be in a fairly suboptimal state "temporarily" (which, as you note, is often not temporary) in service of shipping something a little sooner, but instead dropping functionality and features that we don't have time to implement well. This way, we might not have everything we ideally want, but what we do have is rock solid. So when we look at some work being left for "follow-up PRs", we should ask whether what is landing now is rock solid, and the follow-ups are "just" optimizations/features/functionality.[1]

But of course all these generalizations are very high-level and vague. We can still come to reasonable disagreements on what is considered "rock solid" or which bits of functionality can even be left out while preserving what remains.

To make things in this PR a little more concrete when viewed through the above lens, I have been accepting the compromise that we will compile, link, and instantiate VMFuncRefs for all intrinsics when the flag to expose unsafe intrinsics is enabled. I think the Right Way to fix that, so that only the intrinsics that are actually used are compiled+linked+instantiated, is to do gc-sections/DCE during linking, but also that doing all that is something that can be delayed (perhaps for a very long time!) and things will be Fine in the meantime because what is implemented should be rock solid (albeit lacking the DCE optimizations we would have in an ideal world).

Ignoring for a second the implementation details in this PR around how intrinsics are tracked through translation and the like, do you agree that the above compromise is an acceptable cut to make?

Because I do agree somewhat with the following:

It feels a bit weird that there's different ways of managing the list of intrinsics for a component. The VMComponentContext either has 0 or all of them, compilation either compiles 0 or all of them, info::Component tracks a full list of intrinsics but has None for unneeded intrinsics, VMComponentContext initialization "nulls out" all intrinsics, and instantiation only fills in used intrinsics. To me this feels like a random mish-mash of different strategies to manage everything.

How the intrinsics are tracked through compilation and in the wasmtime-environ metadata structures is a little messy, and I'd like to improve it. But I don't want to expand the scope of those improvements to only compiling+linking+instantiating the intrinsics that are actually used. Doing that correctly (IMHO via DCE/gc-sections in linking) is too much for me to bite off immediately, and (unless I am missing something) the other ways of doing it require adding new special-cased code paths (which we have generally been trying to remove) and so would only bring us to local maxima, which doesn't seem worth spending the effort on.


And thanks as always for the thorough review! Will dig in some more momentarily.

Footnotes

  1. This is all assuming we are talking about the implementation of new features. Obviously improving existing suboptimal code, even if there are still more follow ups to be done before it is fully optimal, is worth landing right away.

@alexcrichton
Member

At a high level I definitely agree with everything you say, and I also want to strive for a balance of perfection and pragmatism. I'm happy to defer things to a follow-up PR to make progress effectively whenever, and the moment I pause is when the conclusion is to open an issue. Filing an issue means that it'll likely be years, if ever, before something changes. So in that sense "let's file an issue" is equated to me as "let's just ignore this", which is partly where I come from. I basically feel that there's no balance to filing an issue as it effectively means that someone else, who's likely to be less suited to the task, will have to take care of it.

For this PR specifically I agree the DCE/VMFuncRef bits should not happen here. There's definitely no need to entangle all that and it's a huge project for not a whole lot of benefit right now. What I was mostly referring to was the compilation of the unsafe intrinsics, where it's a mish-mash of everything vs the subset used. That, to me, feels like the perfect candidate for something that, if an issue is filed, will just be forgotten and never addressed. It happens frequently that small improvements like this are never quite justified to spend time on, which means they just never get time spent on them.

I agree with your rock-solid quality bar though, and to that end I have no concerns about this PR. I see no bugs in this PR and it's just a bit sub-optimal in a few areas. Given that, I'm fine to see this merged. My personal preference would be to have a follow-up adjusting the compilation side of things to change the list-of-all-intrinsics to list-of-only-the-used-intrinsics, but I'm ok settling for an issue too.

@alexcrichton
Member

That was not the button I wanted...

@alexcrichton alexcrichton reopened this Oct 16, 2025
Turns out we don't need to add phases, we already have the info available to do
this.
@fitzgen
Member Author

fitzgen commented Oct 16, 2025

@alexcrichton the latest commit (9ca326b) cleans this up a little bit, and also happens to give us demand-based compilation of intrinsics, since it turns out we know which ones were canon lowered by the time we are compiling functions. Mind taking another look? Does this help alleviate some of your concerns?

@alexcrichton
Member

Agreed! I think everything now at least only does O(intrinsics_used) work, and data-representation-wise I'd still prefer PrimaryMap<SomethingIndex, ...>, but that's fine to defer to a future refactoring if necessary.

@fitzgen fitzgen added this pull request to the merge queue Oct 17, 2025
Merged via the queue into bytecodealliance:main with commit ad56ff9 Oct 17, 2025
172 checks passed
@fitzgen fitzgen deleted the new-compile-time-builtins branch October 17, 2025 00:25
bongjunj pushed a commit to prosyslab/wasmtime that referenced this pull request Oct 20, 2025
…ce#11825)

* Implement unsafe intrinsics for compile-time builtins

* fill out some more docs

* fix non component model builds

* start filling out the doc example

* Factor abi params/returns out; truncate/extend pointers

* Compile unsafe intrinsics on winch as well

* prtest:full

* have the macro define the signature

* ignore tests in MIRI because MIRI can't compile Wasm

* juggle pointer provenance in `Store::data[_mut]`

* add a test for store data provenance and also fix it

* use `VmPtr` for the store data pointer

* finish writing unsafe intrinsics example

* fix up docs and rules around only accessing data from `T` in a `Store<T>`

* Only reserve space for the intrinsics' `VMFuncRef`s if they are in use

* use dangling pointers instead of options

* Rename `StoreInner::data` to `data_no_provenance` and fix some accesses to use the method accessors

* Add comments about the provenance juggling inside `StoreInner::data[_mut]`

* only compile intrinsics that are used

Turns out we don't need to add phases, we already have the info available to do
this.

* fix duplicate symbol names
