MDBValue Optimizations #192

CoreyKaylor · 2025-12-16T17:14:03Z

Optimizations suggested by @sebastienros in #191
I've only compared against v0.20.0, not the previous comparer implementations. The current branch outperforms the v0.20.0 baseline by 5% to 20%.

Separately, it feels like there might be a more efficient route for sorted Guid keys as well (which I assume is a common key scenario). I'll do a little digging, but if anyone knows something I don't, feel free to share.

Memory

Allocation-free comparers confirmed - constant 56B overhead regardless of operation count (100 ops → 56B, 10000 ops → 56-59B)

Read Performance - All Comparers (1000 ops, 64B values)

Comparer	Time	vs Native
LengthOnly	24.9 μs	-75%
ReverseSignedInt	99.6 μs	-1%
Default (Native)	100.8 μs	baseline
SignedInt	103.0 μs	+2%
ReverseUnsignedInt	105.8 μs	+5%
UnsignedInt	107.3 μs	+6%
ReverseBitwise	110.8 μs	+10%
Bitwise	113.2 μs	+12%
Utf8String	114.1 μs	+13%
ReverseLength	118.4 μs	+17%
Length	119.4 μs	+18%
ReverseUtf8String	163.9 μs	+63%
HashCode	172.1 μs	+71%

Write Performance - All Comparers (1000 ops, 64B values)

Comparer	Time	vs Native
LengthOnly	82.3 μs	-65%
UnsignedInt	200.4 μs	-15%
SignedInt	202.5 μs	-14%
ReverseLength	219.0 μs	-7%
ReverseSignedInt	221.7 μs	-6%
ReverseUnsignedInt	224.8 μs	-5%
ReverseBitwise	228.6 μs	-3%
Default (Native)	236.5 μs	baseline
Length	237.6 μs	+0.5%
Utf8String	237.8 μs	+0.5%
Bitwise	294.5 μs	+25%
ReverseUtf8String	301.4 μs	+27%
HashCode	314.7 μs	+33%

Integer Keys (10000 ops, 4-byte keys)

Comparer	Time	vs Native
SignedInt	307 μs	-81%
UnsignedInt	312 μs	-81%
ReverseSignedInt	1,012 μs	-38%
ReverseUnsignedInt	1,007 μs	-38%
Default (Native)	1,633 μs	baseline

Notes

LengthOnly is fastest but only compares by length (no content comparison)
SignedInt/UnsignedInt provide major gains for integer keys
HashCode is the slowest comparer in both the read and write benchmarks
Most custom comparers perform within ±15% of native for general byte data
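
As a rough illustration of the integer-key note above, the sketch below contrasts comparing a 4-byte key as an integer with comparing it byte by byte. It is not the LightningDB implementation: `KeyCompareSketch`, `CompareAsInt32`, and `CompareBytewise` are hypothetical names, and little-endian keys are assumed so the result is deterministic.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helpers for illustration only; not part of LightningDB.
public static class KeyCompareSketch
{
    // Interprets both 4-byte keys as little-endian signed integers
    // (an assumption) and compares them numerically.
    public static int CompareAsInt32(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right)
    {
        int l = BinaryPrimitives.ReadInt32LittleEndian(left);
        int r = BinaryPrimitives.ReadInt32LittleEndian(right);
        return l.CompareTo(r);
    }

    // Lexicographic byte-by-byte comparison, similar in spirit to memcmp.
    public static int CompareBytewise(ReadOnlySpan<byte> left, ReadOnlySpan<byte> right)
        => left.SequenceCompareTo(right);
}
```

With little-endian keys for 1 (`01 00 00 00`) and 256 (`00 01 00 00`), the two helpers disagree: numerically 1 sorts first, while bytewise it sorts after 256. An integer comparer can therefore change iteration order as well as speed.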

sebastienros · 2025-12-16T18:37:52Z

src/LightningDB/MDBValue.cs

 /// </remarks>
+#if NET5_0_OR_GREATER
+[SkipLocalsInit]
+#endif


Is MDBValue read-only?

If so, we could leverage that: mark it readonly and add the in keyword to some method parameters so that the value is passed by reference automatically.

Here is the ChatGPT explanation:

Short answer:
They reduce copies, prevent accidental mutation, and enable better compiler optimizations—especially for larger structs.

Details:

1) readonly struct

Marking a struct as readonly guarantees it’s immutable after construction.

Advantages

No defensive copies: The compiler knows instance methods won’t mutate fields, so it doesn’t create hidden copies when the struct is accessed through in, readonly fields, or properties.

Clear intent & safety: Prevents accidental field mutation and enforces immutability at compile time.

Better optimizations: The JIT can make stronger assumptions, sometimes improving inlining and register usage.

Thread-safety by design: Immutable value types are naturally safer to share.

Cost

You must ensure all instance fields are readonly and methods don’t mutate state.
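
A minimal sketch of the "defensive copy" behavior described above, using hypothetical types (`MutableCounter`, `CounterHolder`) that are not taken from LightningDB. When a non-readonly struct sits in a readonly field, the compiler calls its methods on a hidden copy, so the mutation is silently lost.

```csharp
using System;

// Hypothetical types for illustration; not taken from LightningDB.
public struct MutableCounter
{
    public int Value;
    public void Increment() => Value++;   // mutates the struct in place
}

public sealed class CounterHolder
{
    public readonly MutableCounter Fixed;  // readonly field of a mutable struct
    public MutableCounter Free;            // ordinary field

    public CounterHolder()
    {
        Fixed = new MutableCounter();
        Free = new MutableCounter();
    }

    public int IncrementFixed()
    {
        // Outside the constructor, Fixed is not a variable, so Increment()
        // runs on a hidden copy and the field itself is unchanged.
        Fixed.Increment();
        return Fixed.Value;
    }

    public int IncrementFree()
    {
        Free.Increment();   // operates directly on the field
        return Free.Value;
    }
}
```

Declaring the struct as `readonly struct` tells the compiler that no method can mutate it, so no such copy is needed before a call.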

2) in parameters

in passes a struct by readonly reference instead of by value.

Advantages

Avoids copying large structs: Especially useful when structs exceed ~16 bytes or are passed frequently.

Expresses intent: Signals “this method will not modify the argument.”

Interoperates with readonly structs: No defensive copies when calling methods on a readonly struct.

Cost

Indirection: Very small structs can be slower due to pointer dereferencing.

Readonly rules: Attempting mutation causes compile errors or hidden copies if the struct isn’t readonly.

3) Using both together (best case)

This is where the real benefit shows up.

readonly struct + in parameters ⇒ zero copies, no defensive cloning, maximum safety.

Methods called on the struct don’t trigger hidden temporaries.

Ideal for math types, vectors, coordinates, timestamps, and domain value objects.

Practical guidance

Use readonly struct when:

The type is logically immutable

It’s used frequently or passed around a lot

Use in when:

The struct is medium-to-large

The method is hot-path or allocation/copy sensitive

Don’t bother for tiny structs (e.g., two ints).

Bottom line:
You get immutability guarantees, fewer copies, and better performance—when used selectively and intentionally.

I overlooked an aspect of the question.

P/Invoke signatures must use ref - Functions like mdb_cursor_get write back values (native code sets size/data pointers). Can't use in.

CompareFunction delegate must use ref - This is called FROM native code. The marshalling requires ref, not in:
delegate int CompareFunction(ref MDBValue left, ref MDBValue right);

IComparer<T> passes by value - The standard interface signature is Compare(T x, T y), not Compare(in T x, in T y)

That said, there are probably several interop methods that could benefit from changing from ref to 'in'.

ref is fine; in just behaves like ref without requiring a keyword at the call site.
It's fine if not all methods use in or ref, like Compare. It's not a requirement; it's just that readonly is what lets us use in without hidden copies. We can still use ref when it's required.
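
To make the call-site difference concrete, here is a small sketch. `ValueSketch` is a hypothetical 16-byte stand-in, not the real MDBValue, and `PassingSketch` is an invented helper class.

```csharp
using System;

// Hypothetical stand-in for a 16-byte value struct; not the real MDBValue.
public readonly struct ValueSketch
{
    public readonly long Size;
    public readonly long Address;

    public ValueSketch(long size, long address)
    {
        Size = size;
        Address = address;
    }
}

public static class PassingSketch
{
    // 'in' passes a readonly reference; the callee cannot assign to 'value'.
    public static long ReadSize(in ValueSketch value) => value.Size;

    // 'ref' passes a writable reference; the callee may replace the whole value.
    public static void Overwrite(ref ValueSketch value, long size)
        => value = new ValueSketch(size, value.Address);

    public static long Demo()
    {
        var v = new ValueSketch(8, 0);
        Overwrite(ref v, 64);   // the caller must write 'ref'
        return ReadSize(v);     // the caller may omit 'in'
    }
}
```

Note that a readonly struct can still be replaced wholesale through a ref parameter; readonly only prevents its members from mutating individual fields.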

mdb_cursor_get write back values...

I haven't checked the code, but maybe a custom mutable, reusable struct (or class) could be passed to these functions, and the library could then create immutable values from it.

Just brainstorming in case we can find patterns.
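
One way the brainstormed pattern could look, as a sketch only. Every name here is hypothetical (`ScratchValue`, `FrozenValue`, `ScratchSketch`), and `FakeNativeGet` merely simulates a native call that writes size and pointer back; it is not an LMDB function.

```csharp
using System;

// All names are hypothetical; FakeNativeGet stands in for a P/Invoke call.
public struct ScratchValue
{
    public long Size;
    public long Address;
}

public readonly struct FrozenValue
{
    public readonly long Size;
    public readonly long Address;

    public FrozenValue(in ScratchValue scratch)
    {
        Size = scratch.Size;
        Address = scratch.Address;
    }
}

public static class ScratchSketch
{
    // Simulates native code writing size/pointer back through ref parameters.
    private static int FakeNativeGet(ref ScratchValue key, ref ScratchValue value)
    {
        key.Size = 4;
        key.Address = 0x1000;
        value.Size = 64;
        value.Address = 0x2000;
        return 0; // success
    }

    public static (int Code, FrozenValue Key, FrozenValue Value) Get()
    {
        ScratchValue key = default, value = default;   // mutable, local to the call
        int code = FakeNativeGet(ref key, ref value);
        // Only immutable snapshots leave the method.
        return (code, new FrozenValue(in key), new FrozenValue(in value));
    }
}
```

The trade-off is one extra 16-byte copy per value in exchange for keeping the public type immutable; whether that matters would need measuring.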

Even with a readonly struct, we can still pass values to native code to be updated, as long as we know when that happens, ideally only during initialization:

    private (MDBResultCode resultCode, MDBValue key, MDBValue value) Get(CursorOperation operation)
    {
        MDBValue mdbKey = default;
        MDBValue mdbValue = default;
        unsafe
        {
            var result = mdb_cursor_get(
                _handle,
                ref Unsafe.AsRef<MDBValue>(in mdbKey),
                ref Unsafe.AsRef<MDBValue>(in mdbValue),
                operation);
            return (result, mdbKey, mdbValue);
        }
    }
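
The same idea can be tried without LMDB. In the sketch below, `ReadonlyPair` and `AsRefSketch` are hypothetical names, and `FakeNativeFill` simulates the native write-back; only `Unsafe.AsRef` is a real .NET API.

```csharp
using System;
using System.Runtime.CompilerServices;

// Hypothetical stand-ins; FakeNativeFill simulates a native call that writes back.
public readonly struct ReadonlyPair
{
    public readonly long Size;
    public readonly long Address;

    public ReadonlyPair(long size, long address)
    {
        Size = size;
        Address = address;
    }
}

public static class AsRefSketch
{
    private static void FakeNativeFill(ref ReadonlyPair target)
        => target = new ReadonlyPair(64, 0x2000);

    public static ReadonlyPair Get()
    {
        ReadonlyPair result = default;
        // Unsafe.AsRef turns the readonly reference back into a writable one.
        // The local is written only here, before it is returned to the caller.
        FakeNativeFill(ref Unsafe.AsRef<ReadonlyPair>(in result));
        return result;
    }
}
```

For a plain local like this, passing `ref result` directly would also compile; `Unsafe.AsRef` only becomes necessary when the storage itself is readonly, such as an in parameter or a readonly field.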

And this PR is fine as is. If you want to make these changes, maybe isolate them in separate PRs; no need to deadlock PRs.

CoreyKaylor added 2 commits December 12, 2025 13:52

Incorporate Claude / Seb feedback for MDBValue optimizations

741f3c3

Adding comparer benchmarks

fc78d12

sebastienros reviewed Dec 16, 2025
