fix flamegraph by turbocrime · Pull Request #524 · gungraun/gungraun

turbocrime · 2026-02-07T08:18:30Z

Addresses #523

Summary

The current flamegraph output stacks all functions linearly by descending
inclusive cost — a design described as a quick
cost overview using inferno as a rendering backend.

This works as a sorted bar chart, but the flamegraph y-axis (call depth)
carries no meaning — sibling calls appear nested rather than branching. Since
callgrind output already contains caller-callee edges (cfn=, calls=,
per-call-site costs), we can do better.
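As a rough illustration of what those edge records look like, here is a minimal sketch that pulls caller-callee edges out of callgrind text. It is not the parser in this PR: `Edge` and `parse_edges` are hypothetical names, and it assumes uncompressed function names, a single event column (Ir), and that the line following `calls=` carries the inclusive cost of the call.

```rust
/// One caller->callee edge with the inclusive cost of the call.
/// Hypothetical type for illustration only.
#[derive(Debug, PartialEq, Eq)]
pub struct Edge {
    pub caller: String,
    pub callee: String,
    pub calls: u64,
    pub inclusive_cost: u64,
}

/// Extract edges from callgrind-style text.
/// Assumes no name compression and one event column.
pub fn parse_edges(input: &str) -> Vec<Edge> {
    let mut edges = Vec::new();
    let mut current_fn: Option<String> = None;
    let mut pending_callee: Option<String> = None;
    let mut lines = input.lines();

    while let Some(line) = lines.next() {
        let line = line.trim();
        if let Some(name) = line.strip_prefix("fn=") {
            // Start of a new function record: later costs belong to it.
            current_fn = Some(name.to_string());
        } else if let Some(name) = line.strip_prefix("cfn=") {
            // Callee for the next `calls=` line.
            pending_callee = Some(name.to_string());
        } else if let Some(rest) = line.strip_prefix("calls=") {
            // `calls=<count> <target position>`, followed by `<position> <cost>`.
            let calls = rest
                .split_whitespace()
                .next()
                .and_then(|c| c.parse::<u64>().ok())
                .unwrap_or(0);
            let inclusive_cost = lines
                .next()
                .and_then(|l| l.split_whitespace().nth(1))
                .and_then(|c| c.parse::<u64>().ok())
                .unwrap_or(0);
            if let (Some(caller), Some(callee)) = (current_fn.clone(), pending_callee.take()) {
                edges.push(Edge { caller, callee, calls, inclusive_cost });
            }
        }
    }
    edges
}
```

Plain cost lines (self cost) are skipped here on purpose; only the `cfn=` / `calls=` pairs are collected.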

This PR restores caller-callee edge tracking (present in 190dfd6, removed in
1f2e0bd) and uses DFS from the sentinel/root to emit folded stacks that
reflect actual call-graph structure.

Before / After

Given parent with a total cost of 90 Ir, made up of inline_me (10 Ir), work_a (60 Ir), and work_b (20 Ir):

fn parent() {
    inline_me(); // 10 Ir
    work_a(); // 60 Ir
    work_b(); // 20 Ir
}

Before (linear staircase):

above_parent 12345
above_parent;parent 90
above_parent;parent;work_a 60
above_parent;parent;work_a;work_b 20

After (branching call tree):

parent 10
parent;work_a 60
parent;work_b 20

Now, parent is annotated with its own (self) cost in the output. The renderer sums the costs of a frame and everything stacked on it, so parent's total cost (90 Ir) is still visible when viewing the graph.
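The DFS emission can be sketched roughly as follows. This is a simplified stand-in, not the `CallGraph` from the diff: the struct layout and method names are assumptions, costs are stored as one self cost per function, and each function is assumed to be reachable by a single path (a callee shared by several callers would need its cost split across paths, which this sketch does not attempt).

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Hypothetical call graph: self cost per function plus caller->callee edges.
#[derive(Default)]
pub struct CallGraph {
    self_cost: BTreeMap<String, u64>,
    children: BTreeMap<String, BTreeSet<String>>,
}

impl CallGraph {
    pub fn add_self_cost(&mut self, name: &str, cost: u64) {
        *self.self_cost.entry(name.to_string()).or_insert(0) += cost;
    }

    pub fn add_edge(&mut self, caller: &str, callee: &str) {
        self.children
            .entry(caller.to_string())
            .or_default()
            .insert(callee.to_string());
    }

    /// Walk the graph depth-first from `root` and emit one folded-stack
    /// line (`a;b;c <cost>`) per path that has a non-zero self cost.
    pub fn folded_stacks(&self, root: &str) -> Vec<String> {
        let mut out = Vec::new();
        let mut path = Vec::new();
        self.visit(root, &mut path, &mut out);
        out
    }

    fn visit<'a>(&'a self, name: &'a str, path: &mut Vec<&'a str>, out: &mut Vec<String>) {
        // Recursion guard: a function already on the current path is not re-entered.
        if path.contains(&name) {
            return;
        }
        path.push(name);
        let cost = self.self_cost.get(name).copied().unwrap_or(0);
        if cost > 0 {
            out.push(format!("{} {}", path.join(";"), cost));
        }
        if let Some(children) = self.children.get(name) {
            // BTreeSet iteration is sorted, so the output order is deterministic.
            for child in children {
                self.visit(child, path, out);
            }
        }
        path.pop();
    }
}
```

Feeding in the example above (parent with self cost 10, calling work_a at 60 and work_b at 20) yields one line for parent's self cost and one branching line per child, rather than a staircase.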

Full example: https://github.com/turbocrime/gungraun-flamegraph-example

GitHub disables the fancier features of the SVG (interactivity, etc.) when embedded, so please run the repro to get the full idea.

What changed

hashmap_parser.rs: parse_with_edges() exposes caller-callee edges via callback during parsing
flamegraph_parser.rs: replaces BinaryHeap + .windows(2) with a CallGraph struct and DFS stack emission; sentinel is used as DFS root when present
New test fixtures for branching and recursive call graphs

Limitations

Inferno sorts sibling frames lexically when rendering the graph, so left-to-right order reflects function names rather than call order or cost.

turbocrime · 2026-02-07T11:18:46Z

-                    continue;
-                }
-            }
+        let roots: Vec<&Id> = if let Some(sentinel_key) = &self.costs.sentinel_key {


using sentinel as dfs root as in 190dfd6

turbocrime · 2026-02-07T11:20:10Z


 /// The unique `Id` identifying a function uniquely
-#[derive(Debug, Hash, PartialEq, Eq, Clone, Serialize, Deserialize)]
+#[derive(Debug, Hash, PartialEq, Eq, PartialOrd, Ord, Clone, Serialize, Deserialize)]


PartialOrd, Ord added, replacing HeapElem's Ord to maintain deterministic output expected by tests
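A small sketch of why the derive matters: hash-based containers iterate in an unspecified order, so anything that ends up in test-compared output has to be sorted first, and sorting needs `Ord`. The single-field `Id` below is a hypothetical stand-in for the real type, and `sorted_children` is an illustrative helper, not code from the diff.

```rust
use std::collections::HashMap;

/// Hypothetical, reduced version of `Id`; the real type may have other fields.
#[derive(Debug, Hash, PartialEq, Eq, PartialOrd, Ord, Clone)]
pub struct Id {
    pub func: String,
}

/// Return the callees of `parent` in a stable, sorted order with duplicates
/// removed. `sort` is only available because `Id` derives `Ord`.
pub fn sorted_children(edges: &HashMap<Id, Vec<Id>>, parent: &Id) -> Vec<Id> {
    let mut children = edges.get(parent).cloned().unwrap_or_default();
    children.sort();
    children.dedup();
    children
}
```

With derived `Ord` the comparison follows field order, which for this one-field stand-in is just the function name.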

turbocrime · 2026-02-07T11:21:03Z

sentinel is now dfs root, so output is the reachable subtree instead of all roots filtered by cost ceiling

turbocrime · 2026-02-07T11:21:54Z

the old search produced one entry per function. dfs produces one entry per call-graph path

turbocrime · 2026-02-07T11:22:40Z

new test case: parent calling multiple children

turbocrime · 2026-02-07T11:43:19Z

 impl CallgrindParser for HashMapParser {
    type Output = CallgrindMap;

-    #[allow(clippy::too_many_lines)]
    fn parse_single(&self, path: &Path) -> Result<(CallgrindProperties, Self::Output)> {
+        self.parse_with_edges(path, |_, _, _| {})
+    }
+}
+
+impl HashMapParser {
+    /// Like [`CallgrindParser::parse_single`] but invokes `on_call_edge(caller, callee, cost)`
+    /// for each caller→callee edge encountered.
+    #[allow(clippy::too_many_lines)]
+    pub fn parse_with_edges<F>(
+        &self,
+        path: &Path,
+        mut on_call_edge: F,
+    ) -> Result<(CallgrindProperties, CallgrindMap)>
+    where
+        F: FnMut(&Id, &Id, &Metrics),
+    {


only FlamegraphParser needs edges, so the parse_single signature is unchanged and it now delegates to parse_with_edges with a no-op callback. i don't like this choice but i wanted to avoid a larger diff.
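The delegation pattern in isolation looks roughly like this. Everything here is a hypothetical reduction: `Parser`, the pre-tokenized `(caller, callee, cost)` input, and the `HashMap<String, u64>` output stand in for the real parser, file input, and `CallgrindMap`.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for the real parser.
pub struct Parser;

impl Parser {
    /// Existing entry point: signature unchanged, edges are ignored
    /// by passing a no-op callback.
    pub fn parse(&self, input: &[(&str, &str, u64)]) -> HashMap<String, u64> {
        self.parse_with_edges(input, |_, _, _| {})
    }

    /// Same parsing work, but reports each caller->callee edge to the caller.
    pub fn parse_with_edges<F>(
        &self,
        input: &[(&str, &str, u64)],
        mut on_call_edge: F,
    ) -> HashMap<String, u64>
    where
        F: FnMut(&str, &str, u64),
    {
        let mut totals = HashMap::new();
        for &(caller, callee, cost) in input {
            on_call_edge(caller, callee, cost);
            // Accumulate inclusive cost per callee, as the plain parse would.
            *totals.entry(callee.to_string()).or_insert(0) += cost;
        }
        totals
    }
}
```

A caller that wants edges passes a closure that pushes into its own collection; callers of the old entry point see no difference in the returned map.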

turbocrime · 2026-02-07T22:58:59Z

okay, i see total_flamegraph_map_from_parsed merges all per-thread/per-part
output files into a single map before generating stacks.

with the prior format this just produced imprecise bar widths; with the
call-graph approach, this might graph caller-callee relationships that don't
actually exist

there's no test case that specifically expresses this. the merging behavior is
about combining separate CallGraphs from different output files, and every
existing test asserts result.len() == 1, so it's explicitly not covered

data to handle multi-threaded/multi-part benchmarks should be available.
callgrind produces separate output files per thread/part, and the parser
extracts thread/part identity into CallgrindProperties, but the merging step
discards it.

so back to draft and i'm planning to follow up on this.
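To make the concern concrete, here is a sketch of how merging edge sets from separate output files can produce paths that exist in none of them. The types and function names are hypothetical and costs are left out; it only compares the set of root-to-node paths before and after merging.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Caller -> set of callees, for one output file (hypothetical representation).
type Edges = BTreeMap<String, BTreeSet<String>>;

pub fn edges_from(pairs: &[(&str, &str)]) -> Edges {
    let mut edges = Edges::new();
    for &(caller, callee) in pairs {
        edges.entry(caller.to_string()).or_default().insert(callee.to_string());
    }
    edges
}

/// Union of all edges, discarding which file each edge came from.
pub fn merge(graphs: &[Edges]) -> Edges {
    let mut merged = Edges::new();
    for graph in graphs {
        for (caller, callees) in graph {
            merged.entry(caller.clone()).or_default().extend(callees.iter().cloned());
        }
    }
    merged
}

/// All acyclic paths starting at `root`, rendered as `a;b;c`.
pub fn paths(edges: &Edges, root: &str) -> BTreeSet<String> {
    let mut out = BTreeSet::new();
    let mut stack = vec![vec![root.to_string()]];
    while let Some(path) = stack.pop() {
        out.insert(path.join(";"));
        let last = &path[path.len() - 1];
        if let Some(children) = edges.get(last) {
            for child in children {
                if !path.contains(child) {
                    let mut next = path.clone();
                    next.push(child.clone());
                    stack.push(next);
                }
            }
        }
    }
    out
}

/// Paths present in the merged graph but in none of the individual graphs.
pub fn phantom_paths(graphs: &[Edges], root: &str) -> BTreeSet<String> {
    let real: BTreeSet<String> = graphs.iter().flat_map(|g| paths(g, root)).collect();
    let merged_paths = paths(&merge(graphs), root);
    merged_paths.difference(&real).cloned().collect()
}
```

For example, if one thread records entry calling shared, and another records other_entry calling shared and shared calling slow, the merged graph contains entry;shared;slow even though the first thread never reached slow. Keeping graphs separate per thread/part would avoid this.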

fix flamegraph

40833f1

turbocrime marked this pull request as draft February 7, 2026 09:22

turbocrime force-pushed the fix/flamegraph branch from 7a43ed5 to 56f84a2 Compare February 7, 2026 09:27

restore sentinel behavior

c945cf5

turbocrime force-pushed the fix/flamegraph branch from 56f84a2 to c945cf5 Compare February 7, 2026 10:03

turbocrime added 2 commits February 7, 2026 02:52

reduce diff

f6d4c38

correct recursion measurement

20a1722


turbocrime marked this pull request as ready for review February 7, 2026 11:45

turbocrime marked this pull request as draft February 7, 2026 22:45

turbocrime wants to merge 4 commits into gungraun:main from turbocrime:fix/flamegraph
