feat(consensus): add prefix_hash to data tx and fold into ledger tx_root #1450

Open
DanMacDonald wants to merge 2 commits into master from dmac/prefix-hash-tx-root

@DanMacDonald (Collaborator) commented Jun 16, 2026

Summary

Adds a per-transaction prefix_hash (the SHA-256 of the first prefix_size bytes of the transaction's data) and folds it into each data ledger's tx_root. An indexer or light client holding only the block-signature-sealed tx_root can then trust every transaction's prefix_hash, and the canonical tags it commits to, without verifying any individual transaction's signature. A verifier reconstructs the folded root from each tx's (data_root, prefix_hash) and compares it to the signed tx_root.

This is a softfork: no block-header field is added and the signed block preimage is unchanged. The only changes are (a) the data-transaction format — header_size is renamed to prefix_size and a signed prefix_hash: H256 field is added — and (b) how tx_root is derived. Empty ledgers still fold to H256::zero(), so no existing block's tx_root changes meaning. The leaf formula, hash_all_sha256([data_root, prefix_hash]), matches the gateway's, so cross-implementation tx_root reconstruction holds.
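
As a rough illustration of the leaf formula above, the fold can be sketched as follows. This is a minimal sketch, not the PR's code: `H256` is modelled as a plain byte array, `fold_leaf` and `folded_leaves` are hypothetical names, and the hash is passed in as a closure standing in for `hash_all_sha256`, whose internals are treated as opaque here. Building the Merkle root over the leaves is left out, since the tree layout is not described in this summary.

```rust
/// 32-byte hash; stands in for the PR's `H256` type (assumption).
type H256 = [u8; 32];

/// Fold one transaction's (data_root, prefix_hash) into its tx_root leaf.
/// `hash_all` is a caller-supplied stand-in for `hash_all_sha256`; this
/// sketch only fixes the argument order: data_root first, then prefix_hash.
fn fold_leaf<F>(hash_all: &F, data_root: &H256, prefix_hash: &H256) -> H256
where
    F: Fn(&[&[u8]]) -> H256,
{
    hash_all(&[data_root.as_slice(), prefix_hash.as_slice()])
}

/// Indexer side: rebuild every leaf of a ledger from (data_root, prefix_hash)
/// pairs, in ledger order. The resulting leaves would then be merklized and
/// the root compared against the signed tx_root.
fn folded_leaves<F>(hash_all: &F, txs: &[(H256, H256)]) -> Vec<H256>
where
    F: Fn(&[&[u8]]) -> H256,
{
    txs.iter()
        .map(|(data_root, prefix_hash)| fold_leaf(hash_all, data_root, prefix_hash))
        .collect()
}
```

Passing the hash in keeps the sketch independent of any particular SHA-256 crate; a real verifier would need the exact function the node and gateway use, since the roots only match if the leaf bytes match.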

Design notes

Folding changes each tx_root leaf from data_root to hash_all_sha256([data_root, prefix_hash]). Because validate_path returns the leaf's stored hash, a tx_path proof no longer yields the raw data_root — and the node previously read data_root straight out of that leaf in three places. Each now recovers the real data_root from an independent source and re-verifies it against the proof leaf via the same fold, so the two can never silently diverge:

  • tx_root enforcement (prevalidate_block): recomputes each ledger's tx_root from the included txs' folded leaves and rejects any mismatch with TxRootMismatch. This is what makes the block signature transitively authenticate every prefix_hash.
  • PoA data-ledger branch (poa_is_valid): recovers the recall chunk's owning transaction (in-memory block_tree for tip blocks, else the DB), binds the tx_path leaf to fold(data_root, prefix_hash), and validates the data_path against the real data_root.
  • Storage retrieval (get_chunk_by_offset / get_chunk_metadata): recovers data_root from a new per-submodule table TxLeafBindingByTxPathHash (tx_path_hash → {data_root, prefix_hash}), cross-checked against the proof leaf via the fold. (The table is additive: created empty on open, no data migration.)

The fold lives in one place — DataTransactionLedger::fold_tx_root_leaf — and every consumer (block production, validation recompute, the PoA binding, and the storage check) delegates to it.
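
The "recover, then re-verify against the proof leaf" pattern shared by the three call sites can be sketched roughly like this. It is an illustration under assumptions, not the PR's implementation: `TxLeafBinding` is reduced to the two fields named above, `LeafMismatch` and `recover_data_root` are hypothetical names, and `fold` is a closure standing in for `DataTransactionLedger::fold_tx_root_leaf`.

```rust
/// 32-byte hash; stands in for the PR's `H256` type (assumption).
type H256 = [u8; 32];

/// Reduced model of the stored binding: {data_root, prefix_hash}.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TxLeafBinding {
    data_root: H256,
    prefix_hash: H256,
}

/// Hypothetical error for this sketch: the independently recovered binding
/// does not fold to the leaf that the tx_path proof yielded.
#[derive(Debug, PartialEq, Eq)]
struct LeafMismatch {
    expected: H256,
    found: H256,
}

/// Return the real data_root only if folding the binding reproduces the
/// proof leaf; otherwise refuse, so the two sources cannot silently diverge.
fn recover_data_root<F>(
    fold: F,
    binding: &TxLeafBinding,
    proof_leaf: &H256,
) -> Result<H256, LeafMismatch>
where
    F: Fn(&H256, &H256) -> H256,
{
    let expected = fold(&binding.data_root, &binding.prefix_hash);
    if &expected == proof_leaf {
        Ok(binding.data_root)
    } else {
        Err(LeafMismatch { expected, found: *proof_leaf })
    }
}
```

Where the binding comes from differs per call site (the block's included txs, the owning tx looked up for PoA, or the submodule table), but the check itself is the same in each case.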

Test plan

  • cargo clippy --workspace --tests and cargo fmt --all clean.
  • types — fold is load-bearing (changing a tx's prefix_hash changes tx_root); compute_tx_root == merklize_tx_root root; an indexer can reconstruct tx_root from (data_root, prefix_hash); the fold is byte-identical to hash_all_sha256 (gateway-compat guard); prefix_size/prefix_hash are covered by the tx signature.
  • block validation — a block whose tx_root doesn't match its txs is rejected with TxRootMismatch; valid blocks pass.
  • PoA — existing data-ledger PoA suites pass, including the block_tree fallback (data_poa_at_tip_validates_via_block_tree_fallback) and migration-depth-2 (spiky_heavy_mine_ten_blocks_with_migration_depth_two); a tampered leaf is rejected.
  • chunk serving / wire — end-to-end retrieval (spiky_heavy_api_end_to_end_test_32b) and the full p2p gossip-fixture suite pass.

🤖 Generated with Claude Code

Adds a signed per-tx `prefix_hash` and folds `hash_all_sha256([data_root,
prefix_hash])` into each tx_root leaf, so an indexer holding only the
block-signature-sealed tx_root can trust every tx's prefix_hash without
verifying individual tx signatures. Softfork (no block-header change;
empty ledgers still fold to H256::zero()).

- rename header_size -> prefix_size; add prefix_hash: H256
- block validation recomputes tx_root and rejects TxRootMismatch
- PoA data-ledger branch recovers the owning tx's data_root (the folded
  leaf no longer yields it) and binds it via the fold
- storage retrieval recovers data_root from a new submodule binding
  table (tx_path_hash -> {data_root, prefix_hash}), verified by the fold
- block.rs: fold load-bearing, compute_tx_root == merklize root,
  indexer reconstruction, and fold == hash_all_sha256 (gateway-compat guard)
- signature.rs: prefix_size/prefix_hash are covered by the tx signature
- chain-tests: a block whose tx_root doesn't match its txs is rejected
  with TxRootMismatch

@coderabbitai (Bot, Contributor) commented Jun 16, 2026

Walkthrough

Replaces DataTransactionHeaderV1.header_size with prefix_size and prefix_hash fields across the type system, wire format, fixtures, migration, and tooling. Introduces a TxLeafBinding DB table storing (data_root, prefix_hash) per tx-path, refactors chunk retrieval to recover data_root via a fold-based helper, and strengthens block pre-validation and PoA validation to enforce tx-root leaf binding with two new consensus-rejection error variants.

Changes

prefix_hash field + PoA tx-root/data-root binding

| Layer | File(s) | Summary |
|---|---|---|
| DataTransactionHeaderV1 field rename and tx_root leaf folding | crates/types/src/transaction.rs, crates/types/src/block.rs, crates/types/src/signature.rs | Removes header_size, adds prefix_size and prefix_hash to DataTransactionHeaderV1. Introduces fold_tx_root_leaf and tx_root_leaf_value helpers that fold (data_root, prefix_hash) into Merkle leaves for merklize_tx_root/compute_tx_root. Adds property tests for fold correctness and signature coverage of prefix fields. |
| Wire format, P2P fixtures, migration, and tooling propagation | crates/p2p/src/wire_types/data_transaction.rs, crates/p2p/src/wire_types/test_helpers.rs, fixtures/gossip_fixtures.json, crates/database/src/migration.rs, crates/tooling/multiversion-tests/src/... | Updates DataTransactionHeaderV1Inner wire fields, golden gossip fixtures, and v2→v3 migration (sets prefix_hash=zero for legacy records). Updates multiversion-test tooling to recognize prefix_size as a keep-default sentinel. |
| TxLeafBinding DB schema, helpers, and compression | crates/database/src/submodule/tables.rs, crates/database/src/submodule/db.rs, crates/database/src/tables.rs | Adds TxLeafBinding struct and TxLeafBindingByTxPathHash table to the submodule schema. Adds get_tx_leaf_binding/add_tx_leaf_binding DB helpers, registers compact compression, and clears the table on database reset. |
| StorageModule data_root recovery and chunk retrieval refactor | crates/domain/src/models/storage_module.rs | Adds recover_tx_path_data_root that reads the stored TxLeafBinding, verifies the leaf fold, and returns (data_root, data_path_hash). Extends index_transaction_data to persist the binding on indexing. Refactors generate_full_chunk and get_chunk_metadata to use the recovery helper. |
| Block pre-validation and PoA binding enforcement | crates/actors/src/block_validation.rs, crates/chain-tests/src/block_production/block_validation.rs, crates/actors/src/mempool_service.rs, crates/chain-tests/src/validation/... | Adds TxRootMismatch and PoaTxRootLeafMismatch consensus-rejection error variants. prevalidate_block recomputes data ledger tx_roots and fails on mismatch. PoA validation resolves the owning tx via new helpers, binds the tx_path leaf to the folded tx_root leaf, and validates the data chunk proof against owning_tx.data_root. Test scaffolding persists owning block/tx headers into DB. |

Sequence Diagram(s)

sequenceDiagram
  participant prevalidate_block
  participant poa_is_valid
  participant load_owning_tx_for_poa
  participant DataTransactionLedger
  participant DB

  rect rgba(135, 206, 250, 0.5)
    Note over prevalidate_block: tx_root recomputation check
    prevalidate_block->>DataTransactionLedger: compute_tx_root(included_txs)
    DataTransactionLedger-->>prevalidate_block: recomputed_root
    prevalidate_block->>prevalidate_block: compare vs header.tx_root → TxRootMismatch?
  end

  rect rgba(144, 238, 144, 0.5)
    Note over poa_is_valid: PoA tx-path leaf binding
    poa_is_valid->>DB: get_data_poa_bounds (returns BlockBounds + owning_block_hash)
    poa_is_valid->>load_owning_tx_for_poa: owning_block_hash, byte_range
    load_owning_tx_for_poa->>DB: fetch DataTransactionHeader
    DB-->>poa_is_valid: owning_tx
    poa_is_valid->>DataTransactionLedger: tx_root_leaf_value(owning_tx)
    DataTransactionLedger-->>poa_is_valid: expected_leaf (fold data_root+prefix_hash)
    poa_is_valid->>poa_is_valid: compare vs tx_path leaf → PoaTxRootLeafMismatch?
    poa_is_valid->>poa_is_valid: verify data chunk proof against owning_tx.data_root
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • Irys-xyz/irys#1425: Modifies anchor hash/height selection for auto stake/pledge to use tip - block_migration_depth, touching the same block validation and ingress inclusion window logic that this PR extends with PoA binding enforcement.

Suggested reviewers

  • JesseTheRobot
  • antouhou
  • roberts-pumpurs

🚥 Pre-merge checks: ✅ 5 passed

| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title accurately and concisely summarizes the main change: adding prefix_hash to data transactions and folding it into the ledger tx_root, which aligns with the PR objectives. |
| Docstring Coverage | ✅ Passed | Docstring coverage is 100.00%, which is sufficient. The required threshold is 80.00%. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |

@coderabbitai (Bot, Contributor) left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/actors/src/block_validation.rs`:
- Around line 3231-3236: The function ledger_tx_ids_in currently uses map()
which collapses a missing owning ledger into None, and callers use
unwrap_or_default() to silently treat this as an empty tx list, causing the
error to be misattributed as a PoAChunkOffsetOutOfTxBounds error instead of
BlockBoundsLookupError. Change the return type from Option<Vec<H256>> to a
Result that returns BlockBoundsLookupError when the ledger is not found, and
return a borrowed slice instead of cloning the full tx-id list to avoid
unnecessary cloning on every migrated PoA lookup. Update all call sites that
currently use unwrap_or_default() to properly handle the Result type and
propagate the BlockBoundsLookupError accordingly.
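
The shape of the suggested change might look like the sketch below. Everything except the name `ledger_tx_ids_in` and the `BlockBoundsLookupError` variant is a hypothetical stand-in: the real block type, ledger key, and error enum in block_validation.rs are not shown in this review, so `Block`, `PoaError`, and the `u32` ledger id are assumptions made for illustration.

```rust
use std::collections::HashMap;

/// 32-byte hash; stands in for `H256` (assumption).
type H256 = [u8; 32];

/// Hypothetical, simplified error enum for this sketch.
#[derive(Debug, PartialEq, Eq)]
enum PoaError {
    BlockBoundsLookupError(String),
}

/// Hypothetical stand-in for a block: ledger id -> tx ids in that ledger.
struct Block {
    ledgers: HashMap<u32, Vec<H256>>,
}

/// A missing ledger is an error, not an empty list, and the tx ids are
/// borrowed from the block instead of being cloned on every lookup.
fn ledger_tx_ids_in(block: &Block, ledger_id: u32) -> Result<&[H256], PoaError> {
    block
        .ledgers
        .get(&ledger_id)
        .map(Vec::as_slice)
        .ok_or_else(|| {
            PoaError::BlockBoundsLookupError(format!(
                "ledger {} not found in owning block",
                ledger_id
            ))
        })
}
```

With this shape a caller can use `?` to propagate the lookup failure, and an existing-but-empty ledger (`Ok(&[])`) stays distinguishable from a ledger that is not there at all.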

In `@crates/tooling/multiversion-tests/src/data_tx.rs`:
- Around line 183-184: The issue is that when tx.header.prefix_size is assigned
to 64 (when tx_build.keep_default does not include "prefix_size"), the
tx.header.prefix_hash field is never recomputed to match. Since prefix_hash must
be SHA-256(first prefix_size bytes), modifying prefix_size without updating
prefix_hash creates an inconsistency between the two fields that both get signed
in the subsequent sign_transaction call, causing compat failures. To fix this,
after assigning tx.header.prefix_size = 64, recompute tx.header.prefix_hash by
computing the SHA-256 hash of the first 64 bytes of the transaction data before
calling sign_transaction, or alternatively prevent the post-creation mutation by
computing both prefix_size and prefix_hash at transaction creation time instead.
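
One way to keep the two fields from drifting apart is to only ever set them together. The sketch below assumes a reduced header with just the two fields, a hypothetical helper `set_prefix`, and a caller-supplied `sha256` closure in place of the real hashing call. Clamping `prefix_size` to the data length is also an assumption of this sketch; the actual rule for data shorter than the prefix is not stated in the review.

```rust
use std::convert::TryFrom;

/// 32-byte hash; stands in for `H256` (assumption).
type H256 = [u8; 32];

/// Reduced, hypothetical model of the header: only the two prefix fields.
#[derive(Debug)]
struct Header {
    prefix_size: u64,
    prefix_hash: H256,
}

/// Set prefix_size and recompute prefix_hash in one step, before signing.
/// `sha256` is a stand-in for the real SHA-256 function.
fn set_prefix<F>(header: &mut Header, data: &[u8], prefix_size: u64, sha256: F)
where
    F: Fn(&[u8]) -> H256,
{
    // Assumption: hash at most `data.len()` bytes when the data is shorter
    // than the requested prefix.
    let len = usize::try_from(prefix_size)
        .unwrap_or(usize::MAX)
        .min(data.len());
    header.prefix_size = prefix_size;
    header.prefix_hash = sha256(&data[..len]);
}
```

In the tooling case described above, this would replace the bare `tx.header.prefix_size = 64` assignment, with `sign_transaction` called only afterwards.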

In `@crates/types/src/block.rs`:
- Around line 1927-1940: The test function
fold_tx_root_leaf_matches_hash_all_sha256 uses a manual loop to iterate through
test cases instead of using the rstest framework. Convert this to a
parameterized test by applying the #[rstest] macro to the function, adding each
test case as a #[case] attribute with the data_root and prefix_hash parameters,
and replacing the loop body with a single assertion that uses the parameterized
inputs. This will provide per-case reporting and align with the repository's
test conventions.
- Around line 728-736: The folded_leaves function uses an unchecked cast of
h.data_size (a u64) to usize with the as keyword, which silently truncates
values above usize::MAX on 32-bit platforms. Replace the unchecked as usize
cast in the tx_size field assignment with
usize::try_from(h.data_size).expect(...) to perform a checked conversion that
will panic if the value cannot be represented as a usize, preventing silent
data loss.
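
For reference, the checked conversion looks like this in isolation. The helper names are hypothetical and the `expect` message is only an example; `usize::try_from` is the standard-library call the comment refers to.

```rust
use std::convert::TryFrom;

/// Checked u64 -> usize conversion: panics with a clear message instead of
/// silently truncating when the value does not fit (possible on 32-bit targets).
fn data_size_as_usize(data_size: u64) -> usize {
    usize::try_from(data_size).expect("data_size does not fit in usize on this platform")
}

/// Non-panicking variant, for callers that would rather return an error.
fn checked_data_size(data_size: u64) -> Option<usize> {
    usize::try_from(data_size).ok()
}
```

Whether a panic or a returned error is the better failure mode in a consensus path is a judgment call for the authors; the point is only that `as usize` gives neither.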
ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: a0f2589e-64c1-44fc-8adb-e777c8a4c407

📥 Commits

Reviewing files that changed from the base of the PR and between 8dee75f and ca54efb.

📒 Files selected for processing (18)
  • crates/actors/src/block_validation.rs
  • crates/actors/src/mempool_service.rs
  • crates/chain-tests/src/block_production/block_validation.rs
  • crates/chain-tests/src/validation/mempool_ingress_proof_dedup.rs
  • crates/chain-tests/src/validation/mod.rs
  • crates/database/src/migration.rs
  • crates/database/src/submodule/db.rs
  • crates/database/src/submodule/tables.rs
  • crates/database/src/tables.rs
  • crates/domain/src/models/storage_module.rs
  • crates/p2p/src/wire_types/data_transaction.rs
  • crates/p2p/src/wire_types/test_helpers.rs
  • crates/tooling/multiversion-tests/src/data_tx.rs
  • crates/tooling/multiversion-tests/src/run_config.rs
  • crates/types/src/block.rs
  • crates/types/src/signature.rs
  • crates/types/src/transaction.rs
  • fixtures/gossip_fixtures.json
