fix flamegraph#524
Conversation
7a43ed5 to
56f84a2
Compare
56f84a2 to
c945cf5
Compare
| continue; | ||
| } | ||
| } | ||
| let roots: Vec<&Id> = if let Some(sentinel_key) = &self.costs.sentinel_key { |
|
|
||
| /// The unique `Id` identifying a function uniquely | ||
| #[derive(Debug, Hash, PartialEq, Eq, Clone, Serialize, Deserialize)] | ||
| #[derive(Debug, Hash, PartialEq, Eq, PartialOrd, Ord, Clone, Serialize, Deserialize)] |
There was a problem hiding this comment.
PartialOrd, Ord added, replacing HeapElem's Ord to maintain deterministic output expected by tests
There was a problem hiding this comment.
sentinel is now dfs root, so output is the reachable subtree instead of all roots filtered by cost ceiling
There was a problem hiding this comment.
the old search produced one entry per function. dfs produces one entry per call-graph path
There was a problem hiding this comment.
new test case: parent calling multiple children
| impl CallgrindParser for HashMapParser { | ||
| type Output = CallgrindMap; | ||
|
|
||
| #[allow(clippy::too_many_lines)] | ||
| fn parse_single(&self, path: &Path) -> Result<(CallgrindProperties, Self::Output)> { | ||
| self.parse_with_edges(path, |_, _, _| {}) | ||
| } | ||
| } | ||
|
|
||
| impl HashMapParser { | ||
| /// Like [`CallgrindParser::parse_single`] but invokes `on_call_edge(caller, callee, cost)` | ||
| /// for each caller→callee edge encountered. | ||
| #[allow(clippy::too_many_lines)] | ||
| pub fn parse_with_edges<F>( | ||
| &self, | ||
| path: &Path, | ||
| mut on_call_edge: F, | ||
| ) -> Result<(CallgrindProperties, CallgrindMap)> | ||
| where | ||
| F: FnMut(&Id, &Id, &Metrics), | ||
| { |
There was a problem hiding this comment.
only FlamegraphParser needs edges, so the parse_single signature is not modified, and it now delegates to parse_with_edges. i don't like this choice but i wanted to avoid larger diff.
|
okay, i see with the prior format this just produced imprecise bar widths; with the there's no test case that specifically expresses this. the merging behavior is data to handle multi-threaded/multi-part benchmarks should be available. so back to draft and i'm planning to follow up on this. |
Addresses #523
Summary
The current flamegraph output stacks all functions linearly by descending
inclusive cost — a design described as a quick
cost overview using inferno as a rendering backend.
This works as a sorted bar chart, but the flamegraph y-axis (call depth)
carries no meaning — sibling calls appear nested rather than branching. Since
callgrind output already contains caller-callee edges (
cfn=,calls=,per-call-site costs), we can do better.
This PR restores caller-callee edge tracking (present in 190dfd6, removed in
1f2e0bd) and uses DFS from the sentinel/root to emit folded stacks that
reflect actual call-graph structure.
Before / After
Given
parentof total cost 90 Ir from callinginline_me(10 Ir),work_a(60 Ir),work_b(20 Ir):Before (linear staircase):
After (branching call tree):
Now, parent is annotated with its own cost in the output graph. Rendering the graph compiles its total cost, so that remains available when viewing the graph.
Full example: https://github.com/turbocrime/gungraun-flamegraph-example
Github disables the fancier features of the SVG (interactivity, etc) when embedded, so please run the repro to get the full idea.
What changed
hashmap_parser.rs:parse_with_edges()exposes caller-callee edges viacallback during parsing
flamegraph_parser.rs: replacesBinaryHeap+.windows(2)with aCallGraphstruct and DFS stack emission; sentinel is used as DFS root whenpresent
Limitations
Inferno forces lexical sorting when rendering the graph.