[NetKAT] Intern decision nodes via per-field unique tables keyed by node index#100
Draft
smolkaj wants to merge 1 commit into
Draft
[NetKAT] Intern decision nodes via per-field unique tables keyed by node index#100smolkaj wants to merge 1 commit into
smolkaj wants to merge 1 commit into
Conversation
…ode index. Previously, the unique tables (packet_by_node_, transformer_by_node_) stored a full copy of every decision node as the map key, duplicating all node storage. This was especially costly for PacketTransformerManager, whose nodes own heap-allocated btree maps. Now the tables store 4-byte node indices and hash/compare by dereferencing into nodes_, so each node is stored exactly once. Splitting the tables by packet field keeps each table - and thus its probe sequences - small, and lays the groundwork for storing the nodes themselves by field/level (standard practice in BDD packages, where it improves locality and enables variable reordering). A follow-up change builds on this. Implementation notes: * Interning probes the table with the candidate node via heterogeneous lookup (no node is added to nodes_ unless it is new), keeping the common already-interned case copy- and allocation-free. * The table functors reference nodes_ through a heap-allocated location slot so that they survive manager moves; move operations repoint the slot. Benchmarks are flat overall: first-time compilation is unchanged, recompile microbenchmarks regress by ~8% (~50ns) due to the indirection in the unique tables, in exchange for halving node-related memory. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
|
Thanks for your pull request! It looks like this may be your first contribution to a Google open source project. Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA). View this failed invocation of the CLA check for more information. For the most up to date status, view the checks section at the bottom of the pull request. |
This was referenced Jun 10, 2026
Contributor
Author
|
Measured this PR stack on the new scale benchmark from #103 (realistically large inputs; medians of 3,
Two takeaways:
|
Contributor
Author
Contributor
Author
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
What
Cuts node-related memory roughly in half and reorganizes the unique tables (hash-cons tables) of
PacketSetManagerandPacketTransformerManagerby packet field, as a first step toward level-based node storage (the standard organization in BDD packages, where it improves locality and enables variable reordering).Before:
packet_by_node_/transformer_by_node_wereflat_hash_map<DecisionNode, Handle>— every interned node was stored twice: once innodes_, once as the map key. ForPacketTransformerManagerthe key copy was especially costly, since each node owns heap-allocated btree maps.After: one
flat_hash_set<uint32_t>per packet field, storing node indices; hashing and equality dereference intonodes_. Each node is stored exactly once. Per-field tables also keep probe sequences short and drop the field from hashing entirely (it's implied by the table).Design notes
NodeToPacket/NodeToTransformerlook up the candidate node via heterogeneous lookup (is_transparentfunctors), so the common already-interned case does no copies and no allocations. Only genuinely new nodes touchnodes_.nodes_, which moves with the manager. They do so through a heap-allocated location slot owned by the manager; the (now manual) move operations repoint the slot. This keeps moves O(1) — the alternative of rebuilding tables on move would be O(#nodes).Benchmarks
Flat overall: first-time compilation unchanged, recompile microbenchmarks +~50ns (~8%) from the added indirection in the unique tables — in exchange for the memory halving.
Testing
All 17 bazel test targets pass, including the golden diff tests (node numbering is unchanged by this PR) and the fuzz tests comparing against reference implementations.
Two related PRs: the arena page-size fix is split out as an independent one-line win, and the originally-stacked level-packed-handles work (#102) is deferred until operation memoization lands, per the scaling data on that PR.
🤖 Generated with Claude Code