fix: mempool performs a dedup on ingress proofs#1131
Conversation
📝 WalkthroughWalkthroughDeduplicates ingress proofs by signer address in the mempool service, excludes expired ingress proofs when collecting proofs per data root, and adds an integration test that seeds multiple proofs (including duplicate signers across anchors) to verify signer-level deduplication after publishing to the mempool. Changes
Sequence Diagram(s)mermaid Estimated code review effort🎯 3 (Moderate) | ⏱️ ~22 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 3✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/chain/tests/validation/mempool_ingress_proof_dedup.rs`:
- Around line 46-63: The test constructs a DataTransactionHeader::V1 using
DataTransactionHeaderV1WithMetadata with metadata created by
irys_types::DataTransactionMetadata::new(), which leaves promoted_height empty
and causes the publish path in mempool_service.rs to use the DB fallback; make
this explicit for readers by either setting promoted_height explicitly (e.g.,
construct metadata with promoted_height = None) or adding a short clarifying
comment next to irys_types::DataTransactionMetadata::new() explaining that
promoted_height is intentionally unset so the test exercises the DB lookup via
tx_header_by_txid rather than the promoted-path.
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/chain/tests/validation/mempool_ingress_proof_dedup.rs`:
- Around line 138-157: The assertion expecting 3 ingress proofs for data_root_1
is wrong because store_external_ingress_proof_checked deletes existing proofs
per address before inserting, so storing proof_1a1 then proof_1a2 (both from
signer_a) results in only proof_1a2 and proof_1b remaining; update the assertion
on stored_1.len() to expect 2 instead of 3 (refer to
store_external_ingress_proof_checked, ingress_proofs_by_data_root, data_root_1,
proof_1a1/proof_1a2/proof_1b).
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In `@crates/chain/tests/validation/mempool_ingress_proof_dedup.rs`:
- Around line 138-169: Add an explicit check that proof_1a1 and proof_1a2 are
different before seeding the DB: insert an assert_ne!(proof_1a1, proof_1a2)
right before the loop that calls tx.put::<IngressProofs>(...) so the test will
fail fast if those two proofs ever serialize identically (protecting the
subsequent stored_1.len() == 3 assertion); reference the symbols proof_1a1,
proof_1a2, tx.put::<IngressProofs>, and stored_1 in the same test block.
| // Store both data txs and all proofs in the DB | ||
| genesis_node.node_ctx.db.update(|tx| { | ||
| use irys_database::tables::{ | ||
| CompactCachedIngressProof, CompactTxHeader, IngressProofs, IrysDataTxHeaders, | ||
| }; | ||
| use irys_types::ingress::CachedIngressProof; | ||
|
|
||
| // Data root 1: tx + 3 proofs (2 from signer A, 1 from signer B) | ||
| // Insert directly into the DB table to bypass the dedup logic in | ||
| // store_external_ingress_proof_checked (which removes prior proofs | ||
| // from the same signer for the same data root). | ||
| tx.put::<IrysDataTxHeaders>(data_tx_1.id, CompactTxHeader(data_tx_1.clone()))?; | ||
| irys_database::cache_data_root(tx, &data_tx_1, None)?; | ||
| for (proof, address) in [ | ||
| (&proof_1a1, signer_a.address()), | ||
| (&proof_1a2, signer_a.address()), | ||
| (&proof_1b, signer_b.address()), | ||
| ] { | ||
| tx.put::<IngressProofs>( | ||
| proof.data_root, | ||
| CompactCachedIngressProof(CachedIngressProof { | ||
| address, | ||
| proof: proof.clone(), | ||
| }), | ||
| )?; | ||
| } | ||
| let stored_1 = irys_database::ingress_proofs_by_data_root(tx, data_root_1)?; | ||
| assert_eq!( | ||
| stored_1.len(), | ||
| 3, | ||
| "expected 3 ingress proofs for data_root_1" | ||
| ); |
There was a problem hiding this comment.
🧹 Nitpick | 🔵 Trivial
🧩 Analysis chain
🏁 Script executed:
# First, find the IngressProofs table definition and its properties
rg "IngressProofs" --type rust -B 5 -A 5 | head -100Repository: Irys-xyz/irys
Length of output: 5845
🏁 Script executed:
# Look for table! macro definition to understand DUPSORT behavior
fd -e rs | xargs rg "table!\s*\{" -A 30 | grep -A 30 "IngressProofs"Repository: Irys-xyz/irys
Length of output: 39
🏁 Script executed:
# Check the CompactCachedIngressProof structure and how it implements comparison
rg "CompactCachedIngressProof\|CachedIngressProof" --type rust -B 3 -A 8 | head -150Repository: Irys-xyz/irys
Length of output: 39
🏁 Script executed:
# Verify the actual proofs used in the test (proof_1a1, proof_1a2) and their definitions
rg "proof_1a1|proof_1a2" --type rust -B 5 -A 5Repository: Irys-xyz/irys
Length of output: 2968
🏁 Script executed:
# Find the database table definitions
fd -e rs -path "*/irys_database/*" | xargs rg "IngressProofs" -B 3 -A 10 | head -200Repository: Irys-xyz/irys
Length of output: 15263
🏁 Script executed:
# Look for table definitions more broadly
fd -e rs | xargs rg "IngressProofs.*table\!|table\!.*IngressProofs" -B 5 -A 10Repository: Irys-xyz/irys
Length of output: 190
🏁 Script executed:
# Search for DUPSORT in database context
rg "DUPSORT|DupSort|dupsort" --type rust -C 5 | grep -C 5 "IngressProofs\|ingress"Repository: Irys-xyz/irys
Length of output: 5868
🏁 Script executed:
# Check the CompactCachedIngressProof definition and its Encode/Compress impl
rg "struct CompactCachedIngressProof|impl.*CompactCachedIngressProof" --type rust -B 3 -A 15Repository: Irys-xyz/irys
Length of output: 39
🏁 Script executed:
# Verify if anchors are included in the encoded form by checking Proof/IngressProof structure
rg "struct IngressProof|struct.*IngressProof\|pub anchor" --type rust -B 2 -A 10 | head -150Repository: Irys-xyz/irys
Length of output: 2267
🏁 Script executed:
# Find CompactCachedIngressProof structure
rg "struct CompactCachedIngressProof" --type rust -B 5 -A 20Repository: Irys-xyz/irys
Length of output: 39
🏁 Script executed:
# Find CachedIngressProof structure
rg "struct CachedIngressProof" --type rust -B 5 -A 20Repository: Irys-xyz/irys
Length of output: 1266
🏁 Script executed:
# Check Compress trait implementations
rg "impl.*Compress.*IngressProof\|impl.*Encode.*IngressProof" --type rust -B 3 -A 15Repository: Irys-xyz/irys
Length of output: 39
🏁 Script executed:
# Look for how IngressProof is compressed/encoded
rg "impl.*Compress\|impl.*Codable" --type rust | grep -i "ingress\|proof"Repository: Irys-xyz/irys
Length of output: 39
🏁 Script executed:
# Find the IngressProofs table definition in database tables
fd -path "*/database/src/*" -name "*.rs" | xargs rg "IngressProofs" -B 5 -A 5 | head -150Repository: Irys-xyz/irys
Length of output: 10627
🏁 Script executed:
# Look for table macro usage
rg "table!\(" --type rust -A 50 | grep -A 50 "IngressProofs"Repository: Irys-xyz/irys
Length of output: 39
🏁 Script executed:
# Check the Compact trait and what it encodes
fd -path "*/types/src/*" -name "*.rs" | xargs rg "derive.*Compact\|pub fn compress\|impl.*Compact" -B 2 -A 10 | head -200Repository: Irys-xyz/irys
Length of output: 225
🏁 Script executed:
# Look for the actual table definition
fd -e rs | xargs grep -l "IngressProofs" | xargs grep -l "table"Repository: Irys-xyz/irys
Length of output: 929
🏁 Script executed:
# Look at the tables.rs file for IngressProofs definition
cat crates/database/src/tables.rs | head -500Repository: Irys-xyz/irys
Length of output: 5531
🏁 Script executed:
# Search for IngressProofs in tables.rs specifically
rg "IngressProofs" crates/database/src/tables.rs -B 10 -A 10Repository: Irys-xyz/irys
Length of output: 740
🏁 Script executed:
# Now let's check how CompactCachedIngressProof is related to the encoding
rg "CompactCachedIngressProof" --type rust -B 5 -A 15Repository: Irys-xyz/irys
Length of output: 13570
Good fix: direct put correctly bypasses store_external_ingress_proof_checked dedup to seed the pre-dedup state.
The comment on lines 146–148 clearly explains why the test uses raw tx.put::<IngressProofs> instead of the checked function, and the assertion of 3 stored proofs is correct. The IngressProofs table is a DUPSORT table with address as the subkey; since proof_1a1 and proof_1a2 have different anchors, their serialized forms differ and both are stored.
Consider adding assert_ne!(proof_1a1, proof_1a2) before insertion as a safeguard: this makes explicit that the two proofs differ at the Compact-encoding level, protecting against future changes that might inadvertently strip the anchor field from the serialized form, which would cause the 3-proof assertion to fail silently.
🤖 Prompt for AI Agents
In `@crates/chain/tests/validation/mempool_ingress_proof_dedup.rs` around lines
138 - 169, Add an explicit check that proof_1a1 and proof_1a2 are different
before seeding the DB: insert an assert_ne!(proof_1a1, proof_1a2) right before
the loop that calls tx.put::<IngressProofs>(...) so the test will fail fast if
those two proofs ever serialize identically (protecting the subsequent
stored_1.len() == 3 assertion); reference the symbols proof_1a1, proof_1a2,
tx.put::<IngressProofs>, and stored_1 in the same test block.
Describe the changes
Mempool will detect and perform deduplication per (data_root, signer) combo. it shouldn't really be an issue after #1128 gets merged, but better safe than sorry
Checklist
Summary by CodeRabbit
Bug Fixes
Tests