Add samply-markers crate #710

nordzilla · 2025-11-11T22:06:20Z

Overview

This is a minimal initial implementation of #349 based off of @blp's commit feldera/feldera@1b448b7.

I've pulled the commit directly into my pull request, so as to preserve the original attributions, and built the crate around the Unix mmap capabilities. Windows is stubbed out, but not yet implemented.

Examples

The README and generated rust docs are full of examples to look at.

Crate Name Reservation

I've also reserved the crate name in good faith that I'd like to release this eventually.

https://crates.io/crates/samply-markers

Though, if for some reason this falls through, I'd be happy to relinquish ownership of the crate. I'm not a fan of name squatting without substance or genuine intent.

Current Status

That being said, I'm not ready to release the crate as 0.1.0 yet. This is just the first pull request.

This implementation does the bare minimum that is (mostly) compatible with the current samply implementation (I did have to make one small change that I would prefer to revert in time).

What's implemented:

The core public API that I'm fairly committed to, though welcome to feedback.
Full Unix support (Linux and macOS).
Documentation with links to Firefox Profiler examples.
Runnable examples demonstrating synchronous, asynchronous profiling.

Future Work

The following are all considerations that I would like to handle before a 0.1.0 crate release, but aren't necessarily critical functionality.

Windows Support

Windows is currently stubbed out with unimplemented!(). The implementation will need platform-specific equivalents for the Unix mmap-based approach.

Note

Of all the topics, I'm most okay with releasing 0.1.0 without this.
I don't have a Windows dev machine to implement or test this on.

Runtime Detection

I would like to have runtime detection for emitting markers if possible. Perhaps samply itself could set SAMPLY_MARKERS=1 for the process to run, or if the user already has SAMPLY_MARKERS=0 in the current session, then Samply could preserve that.

That would be a lot more ergonomic to use, since markers would either emit if samply is recording, or not if samply is not, and likely not cost much performance compared to the currently implemented compile-time disablement.

Extended Marker Data

I would like to allow samply-markers to output more optional data that can take advantage of runtime marker schemas, but this will require more changes to Samply itself.

Ideally, I'd like to redo the format and parsing of the marker files (which would be a breaking change).

I could do this in a backward-compatible, or perhaps I could make my own format that outputs to samply-marker-{PID}-{TID}.txt.

I'm not sure how important it is that the current format remains intact.

Known Issues

The following are all considerations that will need to be handled before a 0.1.0 crate release. But I don't think they should block merging this PR, creating these as sub-issues, and getting this crate hooked up to CI for testing.

Buffer Flushing

Currently, the marker-file buffers use thread-local storage (TLS) and ensure flushing on drop. This approach doesn't work for long-lived threads, such as those in Rayon's global thread pool. As a result, buffered marker data may never be written out for long-lived threads.

I really don't want to have to mutex-guard my writes to the buffers, but at this point, I can't think of a safe solution that isn't a centralized flush mechanism on the main thread.

Buffer Size

The current implementation has a 64 KiB buffer per thread to write markers, because it performed well in the single not-very-principled hyperfine benchmark that I ran. But 64 KiB feels like a lot. I'll definitely need to do more research and benchmarking into what is an ideal buffer size to have for each thread, both in terms of size and speed.

Ctrl+C Handling

This issue is related to buffer flushing. Whatever solution is developed for the flushing problem will likely be used here as well, but explicit handling of Ctrl+C will still be needed to ensure that buffers are properly flushed before program termination.

Attaching to a Process with Markers

Markers are emitted from the process that is being recorded, and this can make things look akward in the profiler UI when samply is attached to an already-running process. It means that the marker start time for some intervals could be earlier than the profiling start time.

I think the cleanest solution here would be to filter markers as they are added to the JSON, to ensure that any interval markers' start times are equal to or after the profiling start time.

This would require Samply to know the start time of the PID it just attached to. There may be system calls to find this information (I haven't looked too deeply yet), or if we revise the marker file format, we could ensure that every marker file includes a global first time-since-epoch-style timestamp that is guaranteed to be earlier than the first marker. This would allow Samply to reason about the marker timestamp data more easily.

@mstange, perhaps we could collaborate on this in a new issue once this stack is reviewed and (hopefully) landed.

blp · 2025-11-11T22:11:57Z

What a gift! Thank you!

(But I haven't looked beyond the commit message.)

nordzilla · 2025-11-11T22:17:42Z

What a gift! Thank you!

(But I haven't looked beyond the commit message.)

Please give it a try. I'd appreciate your feedback.

It's slightly different than what you implemented, and I've removed the dependency on tracing.

But I'm hoping the documentation generated by cargo doc, as well as in the README, is easy to follow, and that the design/intent is intuitive.

This commit adds a new workspace for the `samply-markers` crate starting at version `0.0.0`. Since this is my first PR to this project, introducing a new component, I will bump the version to 0.1.0 once it is fully reviewed, accepted, and ready.

Source: feldera/feldera@1b448b7 Original author: Ben Pfaff <blp@feldera.com> File renamed from crates/adapters/src/samply.rs to ./samply-markers/src/unix.rs

This commit runs cargo-fmt on the ported commit from feldera/feldera@1b448b7 to ensure that the code style is up to date with samply's repository.

This commit adds all of the required dependencies in order to get the ported code from feldera/feldera@1b448b7 to build. Some of these dependencies may be removed as I update the API to my vision for the project.

This commit removes all of the code from unix.rs that is not specific to actually writing to the `mmap` files on unix systems. This preserves the original author's attributions.

This commit adds the `SamplyTimestamp` implementation with a default provider that always returns zero.

This commit adds the `SamplyMarker` implementation with a default provider that intentionally does not write the markers to a file.

This commit updates the `unix.rs` implementation to be compatible with the uniform internal provider APIs that markers require in order to function.

This commit adds unimplemented stubs for a Windows provider that would conform to the uniform internal provider APIs.

This commit adds a `SamplyTimer` abstraction which is a scoped RAII object that records the start time it was created and emits an interval marker when it goes out of scope.

This commit adds a declarative macro wrapper around the `SamplyMarker` implementation with the intent to be the primary way of emitting markers.

This commit adds a declarative macro wrapper around the `SamplyTimer` implementation with the intent to be the primary way of constructing timer objects to emit intervals.

This commit adds a declarative macro wrapper that will automatically measure the runtime of a provided sync or async code block by emitting an intterval marker when the code block has finished running.

This commit adds a `prelude` module to the `samply-markers` crate so that users have an easy way to glob import everything that they need to instrument their code with markers.

This commit adds a top-level doc comment to the `samply-markers` crate.

This commit updates the way Samply adds markers to use instant marker timing, instead of a span marker timing, if the start time and end time of the span are exactly the same. Admittedly, I don't like this change, and I would like to revert it once a different marker file format is potentially supported. I think that a zero-length interval should be possible if that is actually what was measured. But as of right now, I have no way to emit an actual instant marker given the current file-format restrictions.

This commit adds integreation tests. The tests for Unix systems are fully fleshed out, ensuring that markers are correctly written to the expected files. The for Windows systems currently expect to panic.

This commit adds a project-level README file to the samply-markers crate.

This commit ensures that .truncate(true) is set when creating marker files to clarify that existing files should be overwritten. This is unlikely to happen with the temp directory file format containing the PID and TID, but if it were to happen, it's better to be safe than sorry. This also fixes the clippy::suspicious_open_options warning.

This commit ensures that the `unix.rs` file correctly handles emitting markers on macOS. I hadn't realized in previous commits that the functionality was only working on Linux. It should work on both systems now.

This commit adds a safety comment to every unsafe block in the `unix.rs` file, explaining why I blieve the call is justified.

This commit adds an examples directory with instructions on how to build, run, and profile the examples shown in the main README file.

This commit improves the performance of samply-markers when enabled, based on benchmarks of the Fibonacci example increased to n = 30 and measured with hyperfine. The Fibonacci example at n = 30 makes 2,692,536 recursive calls, emitting the same number of markers. This provides a great stress test for marker emission overhead. The configurations are as follows: - disabled: samply-markers disabled - enabled-baseline: samply-markers enabled before any performance improvements - buffered-writes: adds the buffered writes improvement - itoa-buffers: adds the itoa improvement on top of buffered writes Linux results: Benchmark 1: ./fib30-disabled Time (mean ± σ): 128.2 ms ± 0.6 ms [User: 127.2 ms, System: 0.9 ms] Range (min … max): 127.1 ms … 129.6 ms 23 runs Benchmark 2: ./fib30-enabled-baseline Time (mean ± σ): 3.144 s ± 0.009 s [User: 0.924 s, System: 2.220 s] Range (min … max): 3.134 s … 3.161 s 10 runs Benchmark 3: ./fib30-enabled-buffered-writes Time (mean ± σ): 522.9 ms ± 1.1 ms [User: 478.9 ms, System: 43.7 ms] Range (min … max): 520.2 ms … 524.1 ms 10 runs Benchmark 4: ./fib30-enabled-itoa-buffers Time (mean ± σ): 423.1 ms ± 1.8 ms [User: 377.8 ms, System: 45.1 ms] Range (min … max): 419.6 ms … 426.3 ms 10 runs Summary ./fib30-disabled ran 3.30 ± 0.02 times faster than ./fib30-enabled-itoa-buffers 4.08 ± 0.02 times faster than ./fib30-enabled-buffered-writes 24.53 ± 0.14 times faster than ./fib30-enabled-baseline macOS results: Benchmark 1: ./fib30-disabled Time (mean ± σ): 90.7 ms ± 0.4 ms [User: 89.4 ms, System: 0.9 ms] Range (min … max): 90.0 ms … 91.6 ms 31 runs Benchmark 2: ./fib30-enabled-baseline Time (mean ± σ): 4.384 s ± 0.053 s [User: 0.578 s, System: 3.783 s] Range (min … max): 4.295 s … 4.495 s 10 runs Benchmark 3: ./fib30-enabled-buffered-writes Time (mean ± σ): 257.4 ms ± 1.8 ms [User: 237.6 ms, System: 16.9 ms] Range (min … max): 255.1 ms … 261.8 ms 11 runs Benchmark 4: ./fib30-enabled-itoa-buffers Time (mean ± σ): 187.6 ms ± 1.1 ms [User: 169.9 ms, System: 16.1 ms] Range (min … max): 186.3 ms … 190.1 ms 15 runs Summary ./fib30-disabled ran 2.07 ± 0.01 times faster than ./fib30-enabled-itoa-buffers 2.84 ± 0.02 times faster than ./fib30-enabled-buffered-writes 48.32 ± 0.62 times faster than ./fib30-enabled-baseline Profiles: Ubuntu: * fib30-disabled: https://share.firefox.dev/48id64U * fib30-enabled-baseline: https://share.firefox.dev/47K8Taf * fib30-enabled-buffered-writes: https://share.firefox.dev/4oK5Uog * fib30-enabled-itoa-buffers: https://share.firefox.dev/47ZtGVQ macOS: * fib30-disabled: https://share.firefox.dev/3LMmaX2 * fib30-enabled-baseline: https://share.firefox.dev/49TPODF * fib30-enabled-buffered-writes: https://share.firefox.dev/4oFo1LP * fib30-enabled-itoa-buffers: https://share.firefox.dev/3K6rLqG Overall: - Using buffered writes dramatically reduces overhead. - Using itoa provides an additional ~19% improvement on Linux and ~27% on macOS.

nordzilla marked this pull request as ready for review November 11, 2025 22:08

nordzilla force-pushed the samply-markers branch 27 times, most recently from 1a6295a to 2ca0131 Compare November 14, 2025 12:22

nordzilla force-pushed the samply-markers branch 5 times, most recently from 1ba0c8d to 3337731 Compare November 18, 2025 11:45

nordzilla and others added 23 commits November 18, 2025 05:53

Add samply-markers crate to workspace

1a4dbd2

This commit adds a new workspace for the `samply-markers` crate starting at version `0.0.0`. Since this is my first PR to this project, introducing a new component, I will bump the version to 0.1.0 once it is fully reviewed, accepted, and ready.

Import samply mmap impl from feldera/feldera@1b448b7

18a3dfc

Source: feldera/feldera@1b448b7 Original author: Ben Pfaff <blp@feldera.com> File renamed from crates/adapters/src/samply.rs to ./samply-markers/src/unix.rs

Run cargo-fmt on feldera/feldera@1b448b7

79d5392

This commit runs cargo-fmt on the ported commit from feldera/feldera@1b448b7 to ensure that the code style is up to date with samply's repository.

Add required dependencies from feldera/feldera@1b448b7

eb7a64e

This commit adds all of the required dependencies in order to get the ported code from feldera/feldera@1b448b7 to build. Some of these dependencies may be removed as I update the API to my vision for the project.

Remove non-unix-specific code from unix.rs

4b7b49c

This commit removes all of the code from unix.rs that is not specific to actually writing to the `mmap` files on unix systems. This preserves the original author's attributions.

Add SamplyTimestamp with default provider

ea591b8

This commit adds the `SamplyTimestamp` implementation with a default provider that always returns zero.

Add SamplyMarker with default provider

3fb6cca

This commit adds the `SamplyMarker` implementation with a default provider that intentionally does not write the markers to a file.

Update unix.rs to conform to provider API

8c0c48e

This commit updates the `unix.rs` implementation to be compatible with the uniform internal provider APIs that markers require in order to function.

Add unimplemented Windows provider

fd0d043

This commit adds unimplemented stubs for a Windows provider that would conform to the uniform internal provider APIs.

Add SamplyTimer implementation

11191c7

This commit adds a `SamplyTimer` abstraction which is a scoped RAII object that records the start time it was created and emits an interval marker when it goes out of scope.

Add samply_marker declarative macro

453ff5c

This commit adds a declarative macro wrapper around the `SamplyMarker` implementation with the intent to be the primary way of emitting markers.

Add samply_timer declarative macro

305e59b

This commit adds a declarative macro wrapper around the `SamplyTimer` implementation with the intent to be the primary way of constructing timer objects to emit intervals.

Add samply_measure declarative macro

0a3300d

This commit adds a declarative macro wrapper that will automatically measure the runtime of a provided sync or async code block by emitting an intterval marker when the code block has finished running.

Add prelude module

cc01f01

This commit adds a `prelude` module to the `samply-markers` crate so that users have an easy way to glob import everything that they need to instrument their code with markers.

Add top-level doc comment

05062fa

This commit adds a top-level doc comment to the `samply-markers` crate.

Add integration tests

2743e38

This commit adds integreation tests. The tests for Unix systems are fully fleshed out, ensuring that markers are correctly written to the expected files. The for Windows systems currently expect to panic.

Add README.md

844283b

This commit adds a project-level README file to the samply-markers crate.

Support samply-markers on macOS

19a3ea5

This commit ensures that the `unix.rs` file correctly handles emitting markers on macOS. I hadn't realized in previous commits that the functionality was only working on Linux. It should work on both systems now.

Add safety comments to unsafe blocks

47007ce

This commit adds a safety comment to every unsafe block in the `unix.rs` file, explaining why I blieve the call is justified.

Add examples directory to samply-markers

ab39197

This commit adds an examples directory with instructions on how to build, run, and profile the examples shown in the main README file.

nordzilla force-pushed the samply-markers branch from 3337731 to 8a5e032 Compare November 18, 2025 11:53

nordzilla mentioned this pull request Nov 18, 2025

Compute profiling time range from samples and markers #715

Draft

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Add samply-markers crate #710

Add samply-markers crate #710

Uh oh!

nordzilla commented Nov 11, 2025 •

edited

Loading

Uh oh!

blp commented Nov 11, 2025

Uh oh!

nordzilla commented Nov 11, 2025 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Add samply-markers crate #710

Are you sure you want to change the base?

Add samply-markers crate #710

Uh oh!

Conversation

nordzilla commented Nov 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Overview

Examples

Crate Name Reservation

Current Status

Future Work

Windows Support

Runtime Detection

Extended Marker Data

Known Issues

Buffer Flushing

Buffer Size

Ctrl+C Handling

Attaching to a Process with Markers

Uh oh!

blp commented Nov 11, 2025

Uh oh!

nordzilla commented Nov 11, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

nordzilla commented Nov 11, 2025 •

edited

Loading

nordzilla commented Nov 11, 2025 •

edited

Loading