feature request: subtree aggregate, graphframe subtree

I have data in a custom CSV format that I have imported into Hatchet, and I would like a way to aggregate over subtrees that have different top-level names. In particular, I have one subtree per Runge-Kutta stage, `eRK_stage_[1-4]`, and I want to produce a new tree that holds the sum of those subtrees. The subtrees have identical timer/node names but different timer/metric values. So if each of `eRK_stage_[1-4]` has a sub-node `add_nonlinear`, then the new aggregate subtree, let's call it `eRK_all`, would also have a sub-node `add_nonlinear`, whose value is the sum of the `add_nonlinear` values for stages 1-4.
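To make the request concrete, here is a minimal sketch of the aggregation in plain Python. It does not use Hatchet's API: the tree is assumed to be a dict mapping path tuples to a single metric value, and the function name `aggregate_subtrees` and the sample numbers are made up for illustration.

```python
import re
from collections import defaultdict


def aggregate_subtrees(timers, pattern, new_name):
    """Sum a metric over subtrees whose root name matches `pattern`.

    `timers` maps a path tuple (root, ..., node) to a metric value.
    This layout is an assumption for illustration, not Hatchet's GraphFrame.
    Returns only the new aggregate entries, rooted at `new_name`.
    """
    regex = re.compile(pattern)
    merged = defaultdict(float)
    for path, value in timers.items():
        for i, name in enumerate(path):
            if regex.fullmatch(name):
                # Rename the matching component so that equivalent nodes
                # from every stage collapse onto the same path.
                new_path = path[:i] + (new_name,) + path[i + 1:]
                merged[new_path] += value
                break
    return dict(merged)


# Made-up timings: each stage has the same node names, different values.
timers = {
    ("t_loop", "eRK_stage_1"): 10.0,
    ("t_loop", "eRK_stage_1", "add_nonlinear"): 1.0,
    ("t_loop", "eRK_stage_2"): 20.0,
    ("t_loop", "eRK_stage_2", "add_nonlinear"): 2.0,
    ("t_loop", "eRK_stage_3"): 30.0,
    ("t_loop", "eRK_stage_3", "add_nonlinear"): 3.0,
    ("t_loop", "eRK_stage_4"): 40.0,
    ("t_loop", "eRK_stage_4", "add_nonlinear"): 4.0,
}

eRK_all = aggregate_subtrees(timers, r"eRK_stage_[1-4]", "eRK_all")
```

Here `eRK_all` contains one entry for `("t_loop", "eRK_all")` and one for `("t_loop", "eRK_all", "add_nonlinear")`, each holding the sum over the four stages. Merging it back into `timers` would give the tree described above.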

I can do this as a pre-processing step in my custom reader implementation, but this sort of tree operation seems like it could be useful to other people as well. It's not at all obvious to me how to do this with the existing APIs, although perhaps a clever `groupby_aggregate`, or a `filter` followed by `squash`, could do something like this?

I would also love a convenience function for creating a sub-graphframe. For example, I have initialization and timeloop regions, and most of the time I am only interested in the timeloop. I'd like to be able to say something like `gf.tree(root='GENE.gsub.timeloop.t_loop')`, and have it print only the subtree rooted at node `t_loop`, whose ancestors are `timeloop`, `gsub`, and `GENE`. Or `gf.subtree('GENE.gsub.timeloop.t_loop').tree()` would work nicely too.
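The intended behavior of the subtree selection can be sketched the same way. Again this is plain Python on a dict of path tuples, not Hatchet code: `subtree` and `format_tree` are hypothetical helpers, the timings are made up, and the dotted-path syntax assumes node names contain no `.` characters.

```python
def subtree(timers, root):
    """Keep only entries at or below the dotted path `root`, re-rooted there.

    `timers` maps a path tuple (root, ..., node) to a metric value
    (an assumed layout for illustration, not Hatchet's GraphFrame).
    """
    prefix = tuple(root.split("."))
    keep = len(prefix) - 1  # drop the ancestors, keep the new root itself
    return {
        path[keep:]: value
        for path, value in timers.items()
        if path[:len(prefix)] == prefix
    }


def format_tree(timers):
    """Render the entries as indented lines, parents before children."""
    lines = []
    for path in sorted(timers):
        indent = "  " * (len(path) - 1)
        lines.append(f"{indent}{path[-1]} {timers[path]}")
    return "\n".join(lines)


# Made-up timings for an initialization region and a timeloop region.
timers = {
    ("GENE",): 100.0,
    ("GENE", "gsub"): 90.0,
    ("GENE", "gsub", "initialization"): 30.0,
    ("GENE", "gsub", "timeloop"): 60.0,
    ("GENE", "gsub", "timeloop", "t_loop"): 55.0,
    ("GENE", "gsub", "timeloop", "t_loop", "eRK_stage_1"): 12.0,
}

loop = subtree(timers, "GENE.gsub.timeloop.t_loop")
print(format_tree(loop))
```

Only `t_loop` and its descendants survive, with `t_loop` as the new root, which is roughly what I would expect `gf.subtree(...).tree()` to show.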

If this is not possible with existing APIs, and there is interest, I could work on a PR once I get it working. I guess one complication is that these operations may not work well with non-tree graphs.


feature request: subtree aggregate, graphframe subtree #431
