Skip to content

fix: mempool performs a dedup on ingress proofs#1131

Merged
roberts-pumpurs merged 5 commits into
masterfrom
rob/mempool-ingress-proof-dedup
Feb 10, 2026
Merged

fix: mempool performs a dedup on ingress proofs#1131
roberts-pumpurs merged 5 commits into
masterfrom
rob/mempool-ingress-proof-dedup

Conversation

@roberts-pumpurs

@roberts-pumpurs roberts-pumpurs commented Feb 10, 2026

Copy link
Copy Markdown
Contributor

Describe the changes
Mempool will detect and perform deduplication per (data_root, signer) combo. it shouldn't really be an issue after #1128 gets merged, but better safe than sorry

Checklist

  • Tests have been added/updated for the changes.
  • Documentation has been updated for the changes (if applicable).
  • The code follows Rust's style guidelines.

Summary by CodeRabbit

  • Bug Fixes

    • Deduplicate ingress proofs by signer (keep first proof per signer) before promotion; logs report before/after counts when deduplication occurs.
    • Exclude expired ingress proofs so only valid, non-expired proofs are considered for mempool promotion.
  • Tests

    • Added a test validating mempool ingress-proof deduplication and signer distribution across multiple data roots.

@roberts-pumpurs roberts-pumpurs self-assigned this Feb 10, 2026
@coderabbitai

coderabbitai Bot commented Feb 10, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Deduplicates ingress proofs by signer address in the mempool service, excludes expired ingress proofs when collecting proofs per data root, and adds an integration test that seeds multiple proofs (including duplicate signers across anchors) to verify signer-level deduplication after publishing to the mempool.

Changes

Cohort / File(s) Summary
Mempool Proof Handling
crates/actors/src/mempool_service.rs
Made all_proofs mutable; filter out expired ingress proofs when collecting proofs per data root; perform in-place deduplication of ingress proofs by signer address (keep first per signer) and log before/after counts; dedupe now runs prior to minimum-proof validation.
Tests — Deduplication
crates/chain/tests/validation/mempool_ingress_proof_dedup.rs
Added heavy_mempool_dedup_ingress_proof_signers async test that seeds DB with multiple proofs (including duplicate signers across anchors and data roots), publishes via mempool, and asserts signer-based deduplication and per-signer distribution.
Test Module Registration
crates/chain/tests/validation/mod.rs
Registered new test module mempool_ingress_proof_dedup.

Sequence Diagram(s)

mermaid
sequenceDiagram
participant Client as Client/Publisher
participant Mempool as MempoolService
participant DB as Database
participant Miner as Miner/Promoter
Client->>Mempool: Publish(ingress proofs, data roots)
Mempool->>DB: Query proofs by data_root
DB-->>Mempool: Return all proofs (exclude expired)
Mempool->>Mempool: In-place dedupe by signer (keep first)
Mempool->>Mempool: Check minimum-proof thresholds
Mempool->>Miner: Promote eligible data roots to miner queue
Miner-->>Mempool: Acknowledgement

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~22 minutes

Possibly related PRs

  • fix: duplicate ingress proof signers #1128: Addresses duplicate ingress-proof signer handling and related adjustments to proof generation/storage — strongly related to signer-level deduplication in mempool_service.

Suggested reviewers

  • JesseTheRobot
🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Title check ✅ Passed The pull request title accurately and concisely summarizes the main change: deduplication of ingress proofs in the mempool service, which is the core functionality added across all modified files.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing touches
  • 📝 Generate docstrings
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch rob/mempool-ingress-proof-dedup

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/chain/tests/validation/mempool_ingress_proof_dedup.rs`:
- Around line 46-63: The test constructs a DataTransactionHeader::V1 using
DataTransactionHeaderV1WithMetadata with metadata created by
irys_types::DataTransactionMetadata::new(), which leaves promoted_height empty
and causes the publish path in mempool_service.rs to use the DB fallback; make
this explicit for readers by either setting promoted_height explicitly (e.g.,
construct metadata with promoted_height = None) or adding a short clarifying
comment next to irys_types::DataTransactionMetadata::new() explaining that
promoted_height is intentionally unset so the test exercises the DB lookup via
tx_header_by_txid rather than the promoted-path.

Comment thread crates/chain/tests/validation/mempool_ingress_proof_dedup.rs Outdated

@JesseTheRobot JesseTheRobot left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/chain/tests/validation/mempool_ingress_proof_dedup.rs`:
- Around line 138-157: The assertion expecting 3 ingress proofs for data_root_1
is wrong because store_external_ingress_proof_checked deletes existing proofs
per address before inserting, so storing proof_1a1 then proof_1a2 (both from
signer_a) results in only proof_1a2 and proof_1b remaining; update the assertion
on stored_1.len() to expect 2 instead of 3 (refer to
store_external_ingress_proof_checked, ingress_proofs_by_data_root, data_root_1,
proof_1a1/proof_1a2/proof_1b).

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@crates/chain/tests/validation/mempool_ingress_proof_dedup.rs`:
- Around line 138-169: Add an explicit check that proof_1a1 and proof_1a2 are
different before seeding the DB: insert an assert_ne!(proof_1a1, proof_1a2)
right before the loop that calls tx.put::<IngressProofs>(...) so the test will
fail fast if those two proofs ever serialize identically (protecting the
subsequent stored_1.len() == 3 assertion); reference the symbols proof_1a1,
proof_1a2, tx.put::<IngressProofs>, and stored_1 in the same test block.

Comment on lines +138 to +169
// Store both data txs and all proofs in the DB
genesis_node.node_ctx.db.update(|tx| {
use irys_database::tables::{
CompactCachedIngressProof, CompactTxHeader, IngressProofs, IrysDataTxHeaders,
};
use irys_types::ingress::CachedIngressProof;

// Data root 1: tx + 3 proofs (2 from signer A, 1 from signer B)
// Insert directly into the DB table to bypass the dedup logic in
// store_external_ingress_proof_checked (which removes prior proofs
// from the same signer for the same data root).
tx.put::<IrysDataTxHeaders>(data_tx_1.id, CompactTxHeader(data_tx_1.clone()))?;
irys_database::cache_data_root(tx, &data_tx_1, None)?;
for (proof, address) in [
(&proof_1a1, signer_a.address()),
(&proof_1a2, signer_a.address()),
(&proof_1b, signer_b.address()),
] {
tx.put::<IngressProofs>(
proof.data_root,
CompactCachedIngressProof(CachedIngressProof {
address,
proof: proof.clone(),
}),
)?;
}
let stored_1 = irys_database::ingress_proofs_by_data_root(tx, data_root_1)?;
assert_eq!(
stored_1.len(),
3,
"expected 3 ingress proofs for data_root_1"
);

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick | 🔵 Trivial

🧩 Analysis chain

🏁 Script executed:

# First, find the IngressProofs table definition and its properties
rg "IngressProofs" --type rust -B 5 -A 5 | head -100

Repository: Irys-xyz/irys

Length of output: 5845


🏁 Script executed:

# Look for table! macro definition to understand DUPSORT behavior
fd -e rs | xargs rg "table!\s*\{" -A 30 | grep -A 30 "IngressProofs"

Repository: Irys-xyz/irys

Length of output: 39


🏁 Script executed:

# Check the CompactCachedIngressProof structure and how it implements comparison
rg "CompactCachedIngressProof\|CachedIngressProof" --type rust -B 3 -A 8 | head -150

Repository: Irys-xyz/irys

Length of output: 39


🏁 Script executed:

# Verify the actual proofs used in the test (proof_1a1, proof_1a2) and their definitions
rg "proof_1a1|proof_1a2" --type rust -B 5 -A 5

Repository: Irys-xyz/irys

Length of output: 2968


🏁 Script executed:

# Find the database table definitions
fd -e rs -path "*/irys_database/*" | xargs rg "IngressProofs" -B 3 -A 10 | head -200

Repository: Irys-xyz/irys

Length of output: 15263


🏁 Script executed:

# Look for table definitions more broadly
fd -e rs | xargs rg "IngressProofs.*table\!|table\!.*IngressProofs" -B 5 -A 10

Repository: Irys-xyz/irys

Length of output: 190


🏁 Script executed:

# Search for DUPSORT in database context
rg "DUPSORT|DupSort|dupsort" --type rust -C 5 | grep -C 5 "IngressProofs\|ingress"

Repository: Irys-xyz/irys

Length of output: 5868


🏁 Script executed:

# Check the CompactCachedIngressProof definition and its Encode/Compress impl
rg "struct CompactCachedIngressProof|impl.*CompactCachedIngressProof" --type rust -B 3 -A 15

Repository: Irys-xyz/irys

Length of output: 39


🏁 Script executed:

# Verify if anchors are included in the encoded form by checking Proof/IngressProof structure
rg "struct IngressProof|struct.*IngressProof\|pub anchor" --type rust -B 2 -A 10 | head -150

Repository: Irys-xyz/irys

Length of output: 2267


🏁 Script executed:

# Find CompactCachedIngressProof structure
rg "struct CompactCachedIngressProof" --type rust -B 5 -A 20

Repository: Irys-xyz/irys

Length of output: 39


🏁 Script executed:

# Find CachedIngressProof structure
rg "struct CachedIngressProof" --type rust -B 5 -A 20

Repository: Irys-xyz/irys

Length of output: 1266


🏁 Script executed:

# Check Compress trait implementations
rg "impl.*Compress.*IngressProof\|impl.*Encode.*IngressProof" --type rust -B 3 -A 15

Repository: Irys-xyz/irys

Length of output: 39


🏁 Script executed:

# Look for how IngressProof is compressed/encoded
rg "impl.*Compress\|impl.*Codable" --type rust | grep -i "ingress\|proof"

Repository: Irys-xyz/irys

Length of output: 39


🏁 Script executed:

# Find the IngressProofs table definition in database tables
fd -path "*/database/src/*" -name "*.rs" | xargs rg "IngressProofs" -B 5 -A 5 | head -150

Repository: Irys-xyz/irys

Length of output: 10627


🏁 Script executed:

# Look for table macro usage
rg "table!\(" --type rust -A 50 | grep -A 50 "IngressProofs"

Repository: Irys-xyz/irys

Length of output: 39


🏁 Script executed:

# Check the Compact trait and what it encodes
fd -path "*/types/src/*" -name "*.rs" | xargs rg "derive.*Compact\|pub fn compress\|impl.*Compact" -B 2 -A 10 | head -200

Repository: Irys-xyz/irys

Length of output: 225


🏁 Script executed:

# Look for the actual table definition
fd -e rs | xargs grep -l "IngressProofs" | xargs grep -l "table"

Repository: Irys-xyz/irys

Length of output: 929


🏁 Script executed:

# Look at the tables.rs file for IngressProofs definition
cat crates/database/src/tables.rs | head -500

Repository: Irys-xyz/irys

Length of output: 5531


🏁 Script executed:

# Search for IngressProofs in tables.rs specifically
rg "IngressProofs" crates/database/src/tables.rs -B 10 -A 10

Repository: Irys-xyz/irys

Length of output: 740


🏁 Script executed:

# Now let's check how CompactCachedIngressProof is related to the encoding
rg "CompactCachedIngressProof" --type rust -B 5 -A 15

Repository: Irys-xyz/irys

Length of output: 13570


Good fix: direct put correctly bypasses store_external_ingress_proof_checked dedup to seed the pre-dedup state.

The comment on lines 146–148 clearly explains why the test uses raw tx.put::<IngressProofs> instead of the checked function, and the assertion of 3 stored proofs is correct. The IngressProofs table is a DUPSORT table with address as the subkey; since proof_1a1 and proof_1a2 have different anchors, their serialized forms differ and both are stored.

Consider adding assert_ne!(proof_1a1, proof_1a2) before insertion as a safeguard: this makes explicit that the two proofs differ at the Compact-encoding level, protecting against future changes that might inadvertently strip the anchor field from the serialized form, which would cause the 3-proof assertion to fail silently.

🤖 Prompt for AI Agents
In `@crates/chain/tests/validation/mempool_ingress_proof_dedup.rs` around lines
138 - 169, Add an explicit check that proof_1a1 and proof_1a2 are different
before seeding the DB: insert an assert_ne!(proof_1a1, proof_1a2) right before
the loop that calls tx.put::<IngressProofs>(...) so the test will fail fast if
those two proofs ever serialize identically (protecting the subsequent
stored_1.len() == 3 assertion); reference the symbols proof_1a1, proof_1a2,
tx.put::<IngressProofs>, and stored_1 in the same test block.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants