@irh irh commented Dec 19, 2024

I decided to take a look at some potential runtime performance improvements. Overall, this PR produces benchmark improvements of 11-20% on my machine.

irh added 15 commits December 19, 2024 17:28
`Ptr<String>` is now used as the standard string type instead of
`Ptr<str>`, which gives enough space in KString for a non-allocated
slice that uses u32 bounds.

StringSlice with usize bounds is still available for strings larger than
4GB.

StringSlice is used heavily by the runtime (all access ops pull a
StringSlice out of the constant pool) so this results in a significant
speedup (5-6% in benchmarks).

Inlined strings are now unnecessary (and created overhead), and can be
removed.
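
As a rough illustration of why the bounds type matters, here is a minimal sketch that uses `std::rc::Rc` as a stand-in for `Ptr`. The names `Slice32` and `Slice64` are hypothetical and are not the types used in this PR. `Rc<String>` is a thin pointer, while `Rc<str>` is a fat pointer that also carries a length, so on a 64-bit target a thin pointer plus two `u32` bounds should come to 16 bytes.

```rust
use std::mem::size_of;
use std::rc::Rc;

/// Hypothetical slice with u32 bounds: one thin pointer plus two u32 values.
#[derive(Clone)]
pub struct Slice32 {
    data: Rc<String>,
    start: u32,
    end: u32,
}

/// Hypothetical slice with usize bounds, for strings too large for u32 bounds.
/// Only used here to compare sizes.
#[derive(Clone)]
pub struct Slice64 {
    data: Rc<String>,
    start: usize,
    end: usize,
}

impl Slice32 {
    /// Covers the whole string, or returns None if its length does not fit in a u32.
    pub fn new(data: Rc<String>) -> Option<Self> {
        let end = u32::try_from(data.len()).ok()?;
        Some(Self { data, start: 0, end })
    }

    /// Makes a sub-slice relative to the current bounds.
    /// The string data is shared, so no new string is allocated.
    pub fn sub_slice(&self, start: u32, end: u32) -> Option<Self> {
        let new_start = self.start.checked_add(start)?;
        let new_end = self.start.checked_add(end)?;
        if new_start > new_end || new_end > self.end {
            return None;
        }
        // Reject bounds that would split a UTF-8 character.
        if !self.data.is_char_boundary(new_start as usize)
            || !self.data.is_char_boundary(new_end as usize)
        {
            return None;
        }
        Some(Self {
            data: Rc::clone(&self.data),
            start: new_start,
            end: new_end,
        })
    }

    pub fn as_str(&self) -> &str {
        &self.data[self.start as usize..self.end as usize]
    }
}

/// Reports the sizes of the two slice layouts on the current target.
pub fn slice_sizes() -> (usize, usize) {
    (size_of::<Slice32>(), size_of::<Slice64>())
}
```

Calling `Slice32::new(Rc::new(String::from("hello, world")))` and then `sub_slice(7, 12)` yields a slice that reads as `"world"` while sharing the original allocation.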

String slices
This improves some benchmarks, the extra dereference doesn't seem to add
significant cost, and it's better to move a Vec into the Tuple rather
than having to copy its data into a new location, e.g. `.to_tuple()`.
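
The difference between moving and copying can be sketched with the standard library, again assuming `std::rc::Rc` in place of `Ptr` and `i64` in place of the runtime's value type. Both function names are hypothetical.

```rust
use std::rc::Rc;

/// Moves the Vec behind the pointer. The Vec's element buffer is reused,
/// at the cost of one extra dereference when reading elements.
pub fn tuple_from_vec_moved(values: Vec<i64>) -> Rc<Vec<i64>> {
    Rc::new(values)
}

/// Converts the Vec into a reference-counted slice. The elements are moved
/// into a new allocation that also holds the reference counts.
pub fn tuple_from_vec_copied(values: Vec<i64>) -> Rc<[i64]> {
    Rc::from(values)
}
```

In the first function the pointer returned by `as_ptr()` on the Vec is the same before and after the move, which is a quick way to confirm that the element data stays where it was.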
- Remove temporary iterators from the register stack to ensure they have
  a reference count of one, allowing pop_front to mutate the inner slice
  without allocating.
- Rework TupleIterator to work with indices instead of using
  pop_front/pop_back.
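
Both points in the list above can be sketched in a few lines. This is a simplified model, not the runtime's code: `TupleIter` and `advance_shared` are hypothetical names, `std::rc::Rc` stands in for `Ptr`, and `i64` stands in for the value type. The iterator narrows a pair of indices instead of popping from the data, and `Rc::get_mut` shows how a reference count of one permits mutation in place.

```rust
use std::rc::Rc;

/// Hypothetical tuple iterator that tracks its remaining window with indices.
#[derive(Clone)]
pub struct TupleIter {
    data: Rc<Vec<i64>>,
    start: usize,
    end: usize,
}

impl TupleIter {
    pub fn new(data: Rc<Vec<i64>>) -> Self {
        let end = data.len();
        Self { data, start: 0, end }
    }
}

impl Iterator for TupleIter {
    type Item = i64;

    fn next(&mut self) -> Option<i64> {
        if self.start < self.end {
            let value = self.data[self.start];
            self.start += 1;
            Some(value)
        } else {
            None
        }
    }
}

impl DoubleEndedIterator for TupleIter {
    fn next_back(&mut self) -> Option<i64> {
        if self.start < self.end {
            self.end -= 1;
            Some(self.data[self.end])
        } else {
            None
        }
    }
}

/// Advances a reference-counted iterator.
/// Returns the next value and whether the iterator was mutated in place.
pub fn advance_shared(iter: &mut Rc<TupleIter>) -> (Option<i64>, bool) {
    if let Some(unique) = Rc::get_mut(iter) {
        // Reference count is one: mutate in place, no allocation.
        (unique.next(), true)
    } else {
        // Another owner exists: copy the iterator state before advancing.
        let mut copy = (**iter).clone();
        let value = copy.next();
        *iter = Rc::new(copy);
        (value, false)
    }
}
```

If a second `Rc` to the same iterator is still held somewhere (for example in a leftover register), `advance_shared` has to take the slower copying path, which is the cost the first list item is avoiding.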
The NewFrame op is run at the start of each frame, and tells the VM the
number of registers required by the frame's bytecode.
This allows the stack size check in set_register to be removed.

The VM may use additional temporary registers but will ensure that at
least the required registers are present in the stack.
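
A minimal model of the idea, assuming a register stack stored in a `Vec` and `i64` registers. The `Vm` type and its methods are hypothetical and much simpler than the real VM. Once the frame's registers are reserved up front, writing a register no longer needs to check whether the stack must grow.

```rust
/// Hypothetical, heavily simplified VM with a flat register stack.
pub struct Vm {
    registers: Vec<i64>,
    frame_base: usize,
}

impl Vm {
    pub fn new() -> Self {
        Self {
            registers: Vec::new(),
            frame_base: 0,
        }
    }

    /// Models the NewFrame op: ensures that at least `register_count`
    /// registers exist above the frame base. The stack is never shrunk here,
    /// so additional temporary registers may already be present.
    pub fn new_frame(&mut self, frame_base: usize, register_count: u8) {
        self.frame_base = frame_base;
        let required = frame_base + register_count as usize;
        if self.registers.len() < required {
            self.registers.resize(required, 0);
        }
    }

    /// Writes a register without a "grow the stack if needed" branch,
    /// relying on new_frame having reserved the frame's registers.
    pub fn set_register(&mut self, register: u8, value: i64) {
        let index = self.frame_base + register as usize;
        debug_assert!(index < self.registers.len());
        self.registers[index] = value;
    }

    pub fn get_register(&self, register: u8) -> i64 {
        self.registers[self.frame_base + register as usize]
    }

    pub fn stack_len(&self) -> usize {
        self.registers.len()
    }
}
```

Note that plain indexing in Rust still performs a bounds check. What this sketch removes is the resize branch on every register write, which is moved to a single check per frame.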

A fair amount of reworking of the VM's call semantics has gone into
this, hopefully simplifying the logic and clarifying how frame setup
should be performed.
This is made possible by reducing the size of KString to 16, which
allows KFunction to increase in size to 24 (only one variant can have
a size of 24, assuming it has a niche that can be shared by KValue).
This allows it to include the capture list, making KCaptureFunction
redundant.
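
A rough sketch of the shape this describes, with hypothetical `Function` and `Value` types standing in for KFunction and KValue, and `std::rc::Rc` standing in for `Ptr`. An optional capture list lets a single function variant cover both plain and capturing functions. Enum layout is decided by the compiler and target, so the sketch reports the sizes rather than assuming particular numbers.

```rust
use std::mem::size_of;
use std::rc::Rc;

/// Hypothetical function value that carries an optional capture list,
/// so a separate capturing-function type is not needed.
pub struct Function {
    chunk: Rc<Vec<u8>>,
    ip: u32,
    arg_count: u8,
    captures: Option<Rc<Vec<i64>>>,
}

/// Hypothetical value enum with a single function variant.
pub enum Value {
    Null,
    Number(f64),
    Function(Function),
}

impl Function {
    pub fn new(
        chunk: Rc<Vec<u8>>,
        ip: u32,
        arg_count: u8,
        captures: Option<Rc<Vec<i64>>>,
    ) -> Self {
        Self { chunk, ip, arg_count, captures }
    }

    /// Entry point (bytecode length, instruction offset) and argument count.
    pub fn signature(&self) -> (usize, u32, u8) {
        (self.chunk.len(), self.ip, self.arg_count)
    }

    /// Reads a captured value. Returns None for functions without captures
    /// or for an out-of-range index.
    pub fn capture(&self, index: usize) -> Option<i64> {
        self.captures.as_ref()?.get(index).copied()
    }
}

/// Reports (size of Function, size of Value) on the current target.
pub fn value_sizes() -> (usize, usize) {
    (size_of::<Function>(), size_of::<Value>())
}
```

Printing the result of `value_sizes()` when the types change is a cheap way to notice that a variant has grown the whole enum.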
@irh irh merged commit 10fa7d1 into main Dec 19, 2024
8 checks passed
@irh irh deleted the performance-improvements branch December 19, 2024 16:51