Skip to content

Work in progress: APIs for more substantial graph modifications#123

Open
dcdehaas wants to merge 3 commits into
mainfrom
grg_modify_improve
Open

Work in progress: APIs for more substantial graph modifications#123
dcdehaas wants to merge 3 commits into
mainfrom
grg_modify_improve

Conversation

@dcdehaas

@dcdehaas dcdehaas commented Feb 23, 2026

Copy link
Copy Markdown
Member
  • ImmutableGRG has new API: "set_samples()"
    • Takes an (ordered) list of NodeIDs that will become the new samples in the GRG
    • The user is responsible for ensuring that no pair of nodes in the list have a ancestor/descendant relationship
    • Compatible with get_sample_nodes() and is_sample()
    • You can change the number of samples by giving more nodes
    • New property samples_are_ordered tells you sample nodes are0...(numSamples-1)
  • ImmutableGRG has new concept of "negative node"
    • "Negative" nodes are just regular nodes in the GRG that have been created with grg.make_node(negative=True)
    • Negative nodes are considered "topologically beneath" all regular nodes
    • The actual NodeID returned by make_node() is still a positive value.
    • When you connect a negative node to another node you must put a negative sign in front of it
    • Property nodes_are_ordered tells you that you can iterate nodes from 0...(numNodes-1)
    • Otherwise, you must use pygrgl.get_topo_order(), but now you can do so with seed_list=[] which will much more efficiently let you traverse the entire graph.
  • matmul respects negative nodes and new sample nodes.
  • save_grg respects negative nodes and new sample nodes.
  • save_subset IS UNTESTED! And likely not yet correct.
  • When you save a GRG with negative nodes and/or new sample nodes, everything gets renumbered. So when you reload the GRG from disk, all nodes are "regular", and the samples will once again be numbered 0...(numSamples-1), in the order that you provided to set_samples.

* ImmutableGRG has new API: "set_samples()"
  * Takes an (ordered) list of NodeIDs that will become the new samples in the GRG
  * The user is responsible for ensuring that no pair of nodes in the list have a ancestor/descendant relationship
  * Compatible with get_sample_nodes() and is_sample()
  * You can change the number of samples by giving more nodes
  * New property samples_are_ordered tells you sample nodes are`0...(numSamples-1)`
* ImmutableGRG has new concept of "negative node"
  * "Negative" nodes are just regular nodes in the GRG that have been created with `grg.make_node(negative=True)`
  * Negative nodes are considered "topologically beneath" all regular nodes
  * The actual NodeID returned by make_node() is still a positive value.
  * When you connect a negative node to another node you must put a negative sign in front of it
  * Property `nodes_are_ordered` tells you that you can iterate nodes from `0...(numNodes-1)`
  * Otherwise, you must use `pygrgl.get_topo_order()`, but now you can do so with `seed_list=[]` which will much more efficiently let you traverse the entire graph.
* `matmul` respects negative nodes and new sample nodes.
* `save_grg` respects negative nodes and new sample nodes.
* `save_subset` IS UNTESTED!  And likely not yet correct.
* When you save a GRG with negative nodes and/or new sample nodes, everything gets renumbered. So when you reload the GRG from disk, all nodes are "regular", and the samples will once again be numbered `0...(numSamples-1)`, _in the order that you provided to set_samples_.
* FIXME: right now, if you truly break the topological order, then get_topo_order() will throw an exception
  * This will be improved later; for now, you can use this to detect when you have made a mistake
dcdehaas and others added 2 commits May 5, 2026 16:07
Previously there were two performance problems with modifying GRG:
1. The Mutations were hard deleted from a std::vector, which
   resulted in an O(N) copy of data every remote_mutation()
2. The adding of a new Mutation flagged the mutation-to-node
   mapping to be resorted again on the next read access. This is
   overly aggressive IF AND ONLY IF you are adding Mutations to
   newly added nodes, then it is trivial to keep the mapping in
   the right sorted order by just appending to it.

The above issues were fixed, and now there is also a (branch-only?)
option to some of the GRG APIs to force no sorting of the mutation-
to-node mapping. This lets users ensure that they are getting the
best performance - if they make a mistake in their code that
causes GRG to need to re-sort the mapping then it will throw an
exception instead, so they can see where the problem is.
* Add (failing) test for bad GRG modification scenario

The topo order is incorrect when a mixture of positive and negative
nodes is added to a graph.

* Fixed bug where DOWN matmul wouldn't recognize nodes added to the grg (#139)

* Correct issues with regression test; simplify getOrderedNodes

Regression test had the above/below node checks backwards.

For getOrderedNodes(), simplify things by just generating the UP
order and then reversing hte result if we need DOWN. Also, make
the checking for negative nodes more efficient: we know they are
in ascending order, so we can just check them as we iterate the
(also ascending order) NodeID range.

---------

Co-authored-by: Akshay Anand <aksand01@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant