refactor: simplify block index - remove service layer, atomic persistence, accept SealedBlock#1151
Conversation
📝 WalkthroughWalkthroughThis PR removes the async Tokio-based BlockIndexService and consolidates block indexing and migration logic into BlockMigrationService and BlockTreeService. Block indexing now occurs synchronously within transactional DB operations using updated BlockIndex APIs, eliminating the service's message-passing channels and lifecycle management. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/actors/src/block_migration_service.rs`:
- Around line 30-40: The constructor new currently sets last_received_block to
None which allows the first write to skip ordering checks; instead seed
last_received_block from the current index head when present by querying the
provided block_index (e.g., call the index's head/get_head method) and setting
last_received_block = Some(head) if the index is non-empty, otherwise None;
update the initialization in new to compute this value so subsequent calls
enforce correct first‑write ordering (refer to new, last_received_block, and
block_index).
In `@crates/chain/src/chain.rs`:
- Around line 1345-1349: Add BlockIndexWriter to the existing irys_actors import
block alongside BlockMigrationService and replace the fully-qualified usage
irys_actors::block_migration_service::BlockIndexWriter with the newly imported
BlockIndexWriter; update the import statement that currently brings in
BlockMigrationService to also include BlockIndexWriter so the construction call
let block_index_writer = BlockIndexWriter::new(...) uses the local symbol
instead of the full path.
---
Duplicate comments:
In `@crates/actors/src/block_validation.rs`:
- Around line 3560-3604: The block being constructed sets previous_block_hash to
H256::zero(), breaking chain continuity before calling
block_index_writer.migrate_block; update IrysBlockHeaderV1's previous_block_hash
to the real previous block hash (e.g. obtain the prior block header/hash from
the existing block variable or from context.block_index and set
previous_block_hash = previous_block.header.hash()/previous_hash) before
wrapping it in Arc and calling migrate_block so the BlockIndex sees a continuous
chain; adjust the irys_block construction to use that computed previous hash and
keep the rest of the migrate_block call the same.
7856a3f to
8fe7e59
Compare
- Extract BlockIndexWriter into block_migration_service submodule - Merge two-stage prepare_migration/process_migration into single migrate_blocks - BlockIndexWriter.write_block_index() accepts a DB transaction, enabling atomic block persistence: block data, tx metadata, and block index all commit together - Construct BlockTree cache and BlockMigrationService in chain.rs alongside other services - Remove BlockIndexService actor, mpsc channel, and ServiceSenders/Receivers entries - Add BlockIndex::push_block_with_tx for transaction-aware block index writes - Add SealedBlock::new_unchecked for test-only construction
8fe7e59 to
e69ef80
Compare
Merge push_block and push_block_with_tx into a single push_block that takes a caller-provided DB transaction. Genesis block index write is now atomic with block header and commitment persistence.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/actors/src/block_migration_service/mod.rs (1)
238-295:⚠️ Potential issue | 🟠 MajorEnsure
BlockIndexWriterstate is only advanced after the DB transaction successfully commits.
write_block_indexmutateslast_received_blockinside theupdate_eyretransaction closure. If the transaction fails to commit afterwrite_block_indexreturns successfully, the in-memory writer state will be ahead of the database. On the next block, ordering checks would pass incorrectly, potentially skipping blocks or creating inconsistencies.Move the
last_received_blockmutation outside the transaction closure, or restructure to only advance writer state after the transaction commits.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/actors/src/block_migration_service/mod.rs` around lines 238 - 295, persist_block currently calls writer.write_block_index inside the self.db.update_eyre closure which mutates BlockIndexWriter state (last_received_block) before the DB transaction commits; if the transaction later fails the in-memory writer will be advanced incorrectly. Refactor so the DB closure only performs database modifications (insert_block_header, batch_set_*, insert_commitment_tx, insert_tx_header) and does not mutate writer state: have write_block_index either return the new last_received_block (or a small struct/flag) without mutating state, or defer calling the method that mutates BlockIndexWriter until after update_eyre returns successfully, then advance last_received_block (or call writer.advance_* outside persist_block’s DB closure). Ensure references to writer.write_block_index and BlockIndexWriter.last_received_block are the points of change and keep transaction mutation-free.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/actors/src/block_migration_service/block_index_writer.rs`:
- Around line 59-63: The collected transaction order is reversed: you build
all_txs as [Publish..., Submit...] but BlockIndex::push_block (called with
all_txs and self.chunk_size) expects Submit entries first then Publish; fix by
changing the collection order to extend
transactions.get_ledger_txs(DataLedger::Submit) first and then
transactions.get_ledger_txs(DataLedger::Publish) so all_txs becomes [Submit...,
Publish...] before calling BlockIndex::push_block.
In `@crates/domain/src/models/block_index.rs`:
- Around line 156-161: BlockIndex::push_block expects all_txs ordered as
[Submit..., Publish...] but BlockIndexWriter::write_block_index builds all_txs
as [Publish..., Submit...]; update write_block_index to append Submit ledger
transactions first (using transactions.get_ledger_txs(DataLedger::Submit)) and
then Publish transactions so the all_txs vector ordering matches the slicing
logic in BlockIndex::push_block and chunk counts (submit_tx_count) remain
correct.
---
Outside diff comments:
In `@crates/actors/src/block_migration_service/mod.rs`:
- Around line 238-295: persist_block currently calls writer.write_block_index
inside the self.db.update_eyre closure which mutates BlockIndexWriter state
(last_received_block) before the DB transaction commits; if the transaction
later fails the in-memory writer will be advanced incorrectly. Refactor so the
DB closure only performs database modifications (insert_block_header,
batch_set_*, insert_commitment_tx, insert_tx_header) and does not mutate writer
state: have write_block_index either return the new last_received_block (or a
small struct/flag) without mutating state, or defer calling the method that
mutates BlockIndexWriter until after update_eyre returns successfully, then
advance last_received_block (or call writer.advance_* outside persist_block’s DB
closure). Ensure references to writer.write_block_index and
BlockIndexWriter.last_received_block are the points of change and keep
transaction mutation-free.
Inline BlockIndexWriter logic directly into BlockMigrationService, removing the unnecessary abstraction that bundled unrelated concerns (block index writes and supply state tracking). Fix correctness bug: the old writer built the all_txs array as [Publish, Submit] but BlockIndex::push_block splits at submit_tx_count assuming Submit comes first, causing ledger chunk calculations to be swapped.
| }; | ||
|
|
||
| self.push_item(&block_index_item, block.height) | ||
| insert_block_index_item(tx, block.height, &block_index_item) |
There was a problem hiding this comment.
todo: does it make sense to pass in the sealed block here, and that the blockindexitem gets created internally
…rage_submodules_config Address PR #1151 review comments: - push_block now takes &SealedBlock instead of raw header + tx slice, extracting Submit/Publish txs internally via BlockTransactions - Add SealedBlock::genesis() constructor for genesis block callers - Simplify persist_block by removing manual tx ordering vector - Remove unused storage_submodules_config from BlockTreeServiceInner
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
crates/actors/src/block_validation.rs (2)
3266-3278: 🧹 Nitpick | 🔵 Trivial
chunk_sizeparameter diverges fromcontext.consensus_config.chunk_sizeinsidepush_blockThe entropy computation uses the
chunk_size: usizeparameter, while thepush_blockcall (line 3361) usescontext.consensus_config.chunk_sizedirectly. All callers currently passcontext.consensus_config.chunk_size as usize, so the values agree today, but the two are decoupled — if a future caller ever passes a different value, the block index would be built with a different chunk size than the entropy was computed with, producing a silently incorrect PoA validation result.Removing the
chunk_sizeparameter and deriving it internally from context eliminates this divergence point.♻️ Proposed refactor
fn poa_test( context: &mut TestContext, txs: &[DataTransaction], #[expect( clippy::ptr_arg, reason = "we need to clone this so it needs to be a Vec" )] poa_chunk: &mut Vec<u8>, poa_tx_num: usize, poa_chunk_num: usize, total_chunks_in_tx: usize, - chunk_size: usize, ) { let height = context.block_index.num_blocks(); + let chunk_size = context.consensus_config.chunk_size as usize;Apply the same change to
test_poa_with_malicious_merkle_data, and remove thechunk_sizeargument from all call sites.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/actors/src/block_validation.rs` around lines 3266 - 3278, The poa_test function currently accepts a chunk_size parameter that can diverge from context.consensus_config.chunk_size used by push_block; remove the chunk_size parameter from poa_test (and from test_poa_with_malicious_merkle_data), compute chunk_size inside poa_test as let chunk_size = context.consensus_config.chunk_size as usize, and update all call sites to stop passing chunk_size so both entropy computation and push_block use the same context.consensus_config.chunk_size.
3455-3466: 🧹 Nitpick | 🔵 TrivialSame
chunk_sizeparameter issue aspoa_test
test_poa_with_malicious_merkle_datahas the same decoupling between thechunk_size: usizeparameter (used for entropy) andcontext.consensus_config.chunk_sizeused insidepush_block(line 3605). Apply the same refactor suggested above.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/actors/src/block_validation.rs` around lines 3455 - 3466, test_poa_with_malicious_merkle_data suffers the same mismatch between the chunk_size parameter and the value used inside push_block (context.consensus_config.chunk_size); update the test so entropy/chunk calculations use the actual consensus chunk size rather than an independent chunk_size parameter—either remove the chunk_size parameter and read context.consensus_config.chunk_size where needed, or set context.consensus_config.chunk_size = chunk_size before calling push_block and any entropy logic; modify references in test_poa_with_malicious_merkle_data and ensure push_block still reads context.consensus_config.chunk_size consistently.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/database/src/database.rs`:
- Around line 658-703: The genesis branch in insert_block_index_for_sealed_block
incorrectly sets max_publish_chunks to 0, dropping any Publish transactions in
the genesis block; update that branch so both max_publish_chunks and
max_submit_chunks are initialized from the calculated pub_chunks_added and
sub_chunks_added respectively (use pub_chunks_added for max_publish_chunks
instead of 0) so cumulative ledger totals reflect genesis transactions; ensure
calculate_chunks_added is used as shown and no other logic (e.g.,
block_index_num_blocks or prev_block checks) is changed.
---
Outside diff comments:
In `@crates/actors/src/block_validation.rs`:
- Around line 3266-3278: The poa_test function currently accepts a chunk_size
parameter that can diverge from context.consensus_config.chunk_size used by
push_block; remove the chunk_size parameter from poa_test (and from
test_poa_with_malicious_merkle_data), compute chunk_size inside poa_test as let
chunk_size = context.consensus_config.chunk_size as usize, and update all call
sites to stop passing chunk_size so both entropy computation and push_block use
the same context.consensus_config.chunk_size.
- Around line 3455-3466: test_poa_with_malicious_merkle_data suffers the same
mismatch between the chunk_size parameter and the value used inside push_block
(context.consensus_config.chunk_size); update the test so entropy/chunk
calculations use the actual consensus chunk size rather than an independent
chunk_size parameter—either remove the chunk_size parameter and read
context.consensus_config.chunk_size where needed, or set
context.consensus_config.chunk_size = chunk_size before calling push_block and
any entropy logic; modify references in test_poa_with_malicious_merkle_data and
ensure push_block still reads context.consensus_config.chunk_size consistently.
JesseTheRobot
left a comment
There was a problem hiding this comment.
Looks good overall, two blocking bits of feedback (1. locking/strict event ordering, 2. block index domain logic in the database)
Move BlockIndexItem computation logic from database.rs insert_block_index_for_sealed_block() back into BlockIndex::push_block() in the domain model. The database layer should be thin wrappers around MDBX; computing cumulative chunk counts and assembling BlockIndexItems is domain logic.
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/domain/src/models/block_index.rs (1)
497-712:⚠️ Potential issue | 🟠 MajorMissing unit tests for
BlockIndex::push_block.
push_blockis the central new function introduced by this refactor, but the test module has no test for it. The following behaviours have zero coverage:
- Genesis block (
num_blocks == 0 && height == 0) — correct initialisation ofmax_publish_chunks/max_submit_chunks.- Non-genesis block with correct
previous_block_hash— cumulative chunk accumulation is correct.- Non-genesis block with wrong
previous_block_hash—eyre::bail!is triggered.- Transactional write —
insert_block_index_itemis called with the right height.Consider adding a test like:
#[test] fn push_block_genesis_and_chained() -> eyre::Result<()> { use irys_types::SealedBlock; // test constructor let tmp = setup_tracing_and_temp_dir(Some("push_block_test"), false); let db = create_test_db(&tmp.path().join("db")); let chunk_size = 256 * 1024; // build minimal SealedBlock for height 0 with no txs, call BlockIndex::push_block // assert num_blocks == 1, item.ledgers[Publish].total_chunks == 0 // build height 1 block pointing to height 0 hash // assert cumulative chunks are correct // build height 1 block with wrong prev_hash, assert Err Ok(()) }🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/domain/src/models/block_index.rs` around lines 497 - 712, Missing unit tests for BlockIndex::push_block — add a new test (e.g., push_block_genesis_and_chained) that uses setup_tracing_and_temp_dir and create_test_db to construct a DB and then builds minimal irys_types::SealedBlock instances to exercise: (1) genesis push (num_blocks == 1 and max_publish_chunks/max_submit_chunks or item.ledgers[*].total_chunks initialized correctly), (2) a chained block with correct previous_block_hash that accumulates chunks as expected (verify BlockIndex::num_blocks and BlockIndex::get_item height/ledgers), (3) a block with an incorrect previous_block_hash where BlockIndex::push_block returns an Err (eyre::bail!), and (4) that a successful push calls the DB insertion path by asserting insert_block_index_item behavior via the produced height (i.e., confirm the persisted item height matches expected). Locate and call BlockIndex::push_block, use irys_types::SealedBlock to build blocks, and assert outcomes via BlockIndex::num_blocks and get_item to validate transactional write and error paths.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/domain/src/models/block_index.rs`:
- Around line 155-160: The function calculate_chunks_added currently multiplies
each tx.data_size.div_ceil(chunk_size) by chunk_size and then divides the total
by chunk_size, which is redundant; change calculate_chunks_added to sum the
per-transaction chunk counts directly (i.e., sum
tx.data_size.div_ceil(chunk_size) for each DataTransactionHeader) instead of
computing bytes_added and dividing by chunk_size — locate calculate_chunks_added
and replace the fold accumulation with a direct sum of div_ceil results (refer
to DataTransactionHeader, its data_size field, and the chunk_size parameter).
---
Outside diff comments:
In `@crates/domain/src/models/block_index.rs`:
- Around line 497-712: Missing unit tests for BlockIndex::push_block — add a new
test (e.g., push_block_genesis_and_chained) that uses setup_tracing_and_temp_dir
and create_test_db to construct a DB and then builds minimal
irys_types::SealedBlock instances to exercise: (1) genesis push (num_blocks == 1
and max_publish_chunks/max_submit_chunks or item.ledgers[*].total_chunks
initialized correctly), (2) a chained block with correct previous_block_hash
that accumulates chunks as expected (verify BlockIndex::num_blocks and
BlockIndex::get_item height/ledgers), (3) a block with an incorrect
previous_block_hash where BlockIndex::push_block returns an Err (eyre::bail!),
and (4) that a successful push calls the DB insertion path by asserting
insert_block_index_item behavior via the produced height (i.e., confirm the
persisted item height matches expected). Locate and call BlockIndex::push_block,
use irys_types::SealedBlock to build blocks, and assert outcomes via
BlockIndex::num_blocks and get_item to validate transactional write and error
paths.
| fn calculate_chunks_added(txs: &[DataTransactionHeader], chunk_size: u64) -> u64 { | ||
| let bytes_added = txs.iter().fold(0, |acc, tx| { | ||
| acc + tx.data_size.div_ceil(chunk_size) * chunk_size | ||
| }); | ||
|
|
||
| bytes_added / chunk_size | ||
| } |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
calculate_chunks_added — redundant multiply-then-divide.
Each term tx.data_size.div_ceil(chunk_size) * chunk_size is already an exact multiple of chunk_size, so the outer / chunk_size simply cancels the multiplication. The function can be expressed directly as:
fn calculate_chunks_added(txs: &[DataTransactionHeader], chunk_size: u64) -> u64 {
txs.iter()
.map(|tx| tx.data_size.div_ceil(chunk_size))
.sum()
}♻️ Proposed simplification
fn calculate_chunks_added(txs: &[DataTransactionHeader], chunk_size: u64) -> u64 {
- let bytes_added = txs.iter().fold(0, |acc, tx| {
- acc + tx.data_size.div_ceil(chunk_size) * chunk_size
- });
- bytes_added / chunk_size
+ txs.iter().map(|tx| tx.data_size.div_ceil(chunk_size)).sum()
}🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@crates/domain/src/models/block_index.rs` around lines 155 - 160, The function
calculate_chunks_added currently multiplies each
tx.data_size.div_ceil(chunk_size) by chunk_size and then divides the total by
chunk_size, which is redundant; change calculate_chunks_added to sum the
per-transaction chunk counts directly (i.e., sum
tx.data_size.div_ceil(chunk_size) for each DataTransactionHeader) instead of
computing bytes_added and dividing by chunk_size — locate calculate_chunks_added
and replace the fold accumulation with a direct sum of div_ceil results (refer
to DataTransactionHeader, its data_size field, and the chunk_size parameter).
Summary
Simplifies the block index subsystem by removing the async service layer and tightening the API surface. Net -411 lines.
Service removal
BlockIndexServiceactor entirely (async message loop + mpsc channel) — MDBX handles its own concurrency, so the service wrapper was pure overheadblock_indexfromServiceSenders/ServiceReceiversAtomic persistence
prepare_migration()/process_migration()into a single synchronousmigrate_blocks()method onBlockMigrationServicepush_blockacceptsSealedBlockBlockIndex::push_blocknow takes&SealedBlockinstead of raw(&IrysBlockHeader, &[DataTransactionHeader])— extracts Submit/Publish txs internally viaBlockTransactions::get_ledger_txs()Summary by CodeRabbit