Work in progress: APIs for more substantial graph modifications#123
Open
dcdehaas wants to merge 3 commits into
Open
Work in progress: APIs for more substantial graph modifications#123dcdehaas wants to merge 3 commits into
dcdehaas wants to merge 3 commits into
Conversation
b4ecc73 to
a970cdd
Compare
* ImmutableGRG has new API: "set_samples()" * Takes an (ordered) list of NodeIDs that will become the new samples in the GRG * The user is responsible for ensuring that no pair of nodes in the list have a ancestor/descendant relationship * Compatible with get_sample_nodes() and is_sample() * You can change the number of samples by giving more nodes * New property samples_are_ordered tells you sample nodes are`0...(numSamples-1)` * ImmutableGRG has new concept of "negative node" * "Negative" nodes are just regular nodes in the GRG that have been created with `grg.make_node(negative=True)` * Negative nodes are considered "topologically beneath" all regular nodes * The actual NodeID returned by make_node() is still a positive value. * When you connect a negative node to another node you must put a negative sign in front of it * Property `nodes_are_ordered` tells you that you can iterate nodes from `0...(numNodes-1)` * Otherwise, you must use `pygrgl.get_topo_order()`, but now you can do so with `seed_list=[]` which will much more efficiently let you traverse the entire graph. * `matmul` respects negative nodes and new sample nodes. * `save_grg` respects negative nodes and new sample nodes. * `save_subset` IS UNTESTED! And likely not yet correct. * When you save a GRG with negative nodes and/or new sample nodes, everything gets renumbered. So when you reload the GRG from disk, all nodes are "regular", and the samples will once again be numbered `0...(numSamples-1)`, _in the order that you provided to set_samples_. * FIXME: right now, if you truly break the topological order, then get_topo_order() will throw an exception * This will be improved later; for now, you can use this to detect when you have made a mistake
a970cdd to
2fe4896
Compare
Previously there were two performance problems with modifying GRG: 1. The Mutations were hard deleted from a std::vector, which resulted in an O(N) copy of data every remote_mutation() 2. The adding of a new Mutation flagged the mutation-to-node mapping to be resorted again on the next read access. This is overly aggressive IF AND ONLY IF you are adding Mutations to newly added nodes, then it is trivial to keep the mapping in the right sorted order by just appending to it. The above issues were fixed, and now there is also a (branch-only?) option to some of the GRG APIs to force no sorting of the mutation- to-node mapping. This lets users ensure that they are getting the best performance - if they make a mistake in their code that causes GRG to need to re-sort the mapping then it will throw an exception instead, so they can see where the problem is.
* Add (failing) test for bad GRG modification scenario The topo order is incorrect when a mixture of positive and negative nodes is added to a graph. * Fixed bug where DOWN matmul wouldn't recognize nodes added to the grg (#139) * Correct issues with regression test; simplify getOrderedNodes Regression test had the above/below node checks backwards. For getOrderedNodes(), simplify things by just generating the UP order and then reversing hte result if we need DOWN. Also, make the checking for negative nodes more efficient: we know they are in ascending order, so we can just check them as we iterate the (also ascending order) NodeID range. --------- Co-authored-by: Akshay Anand <aksand01@gmail.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
0...(numSamples-1)grg.make_node(negative=True)nodes_are_orderedtells you that you can iterate nodes from0...(numNodes-1)pygrgl.get_topo_order(), but now you can do so withseed_list=[]which will much more efficiently let you traverse the entire graph.matmulrespects negative nodes and new sample nodes.save_grgrespects negative nodes and new sample nodes.save_subsetIS UNTESTED! And likely not yet correct.0...(numSamples-1), in the order that you provided to set_samples.