feat(p2p): evict peers on chain_id (network) mismatch handshake rejection by JesseTheRobot · Pull Request #1437 · Irys-xyz/irys

JesseTheRobot · 2026-06-02T13:17:16Z

Problem

PR #1435 hard-rejects handshakes from a different chain_id, but a node configured for a different chain still kept communicating on devnet. Root cause: the chain_id check lives only in the handshake, while the gossip data plane (check_peer_v*) authorizes purely on cache membership + source IP — it never consults handshake outcome.

When our outbound announce is rejected for a chain mismatch (ChainIdMismatch → NetworkMismatch), the rejection was only recorded in failed_announcements, a map that is written but never read. The peer stayed in the cache and remained fully trusted, so gossip kept flowing.

Fix

A NetworkMismatch handshake rejection now evicts the peer from the in-memory cache (all lookup maps) and deletes it from the persistent peer DB. A node re-announcing to its cached peers while on the wrong chain is therefore isolated after the startup announce round (every upgraded peer rejects it → each is evicted → empty peer set).

This relies on peers enforcing #1435, which is acceptable since the target devnet is fully upgraded. Self-enforcing variants and the larger "always require a session handshake + authenticated handshake response" work are scoped in the included design doc for a follow-up.

Changes

domain (peer_list.rs): PeerList::remove_peer_by_api_address — removes a peer from the persistent cache/purgatory and every index map, emitting PeerRemoved (mirrors the existing purgatory-eviction cleanup).
database (database.rs): delete_peer_list_item — removes a peer from the PeerListItems table (mirror of insert_peer_list_item).
p2p (peer_network_service.rs): evict_peer_on_network_mismatch wired into the outbound-announce Rejected arm; no-op for any non-network rejection reason.
docs: design doc capturing the diagnosis, this fix, and the follow-up plan.
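
The eviction primitive listed above can be pictured with a small std-only sketch. This is not the code from peer_list.rs: `PeerCache`, `Peer`, `PeerId`, the map names, and the `removed_events` vector (a stand-in for the PeerRemoved event channel) are all hypothetical. It only illustrates removing one peer from every lookup map in a single call, so that no index keeps resolving the evicted peer.

```rust
use std::collections::HashMap;
use std::net::SocketAddr;

/// Hypothetical stand-in for the real peer id type.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct PeerId(pub u64);

/// Hypothetical peer record with the two addresses used as lookup keys.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Peer {
    pub id: PeerId,
    pub api: SocketAddr,
    pub gossip: SocketAddr,
}

/// Hypothetical cache with one primary map and two secondary indexes.
#[derive(Default)]
pub struct PeerCache {
    by_id: HashMap<PeerId, Peer>,
    api_to_id: HashMap<SocketAddr, PeerId>,
    gossip_to_id: HashMap<SocketAddr, PeerId>,
    /// Assumption: stands in for emitting a PeerRemoved event.
    pub removed_events: Vec<PeerId>,
}

impl PeerCache {
    /// Registers a peer in the primary map and in both indexes.
    pub fn insert(&mut self, peer: Peer) {
        self.api_to_id.insert(peer.api, peer.id);
        self.gossip_to_id.insert(peer.gossip, peer.id);
        self.by_id.insert(peer.id, peer);
    }

    /// Looks a peer up through the gossip-address index.
    pub fn by_gossip(&self, addr: &SocketAddr) -> Option<&Peer> {
        self.gossip_to_id.get(addr).and_then(|id| self.by_id.get(id))
    }

    /// Removes the peer from every map and records a removal event.
    /// Returns the removed peer, or None if the address is unknown.
    pub fn remove_by_api_address(&mut self, api: &SocketAddr) -> Option<Peer> {
        let id = self.api_to_id.remove(api)?;
        let peer = self.by_id.remove(&id)?;
        self.gossip_to_id.remove(&peer.gossip);
        self.removed_events.push(id);
        Some(peer)
    }
}
```

Returning the removed record lets the caller pass the peer id on to whatever handles durable deletion, and a second call for the same address is a no-op that returns None.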

Test plan

Added (TDD, red→green):

remove_peer_by_api_address_clears_all_lookups (domain)
delete_peer_list_item_removes_it (database)
network_mismatch_rejection_evicts_peer_from_cache_and_db (p2p)
non_network_rejection_retains_peer (p2p)

cargo fmt --all clean; cargo clippy -p irys-p2p -p irys-domain -p irys-database --tests --all-targets clean. Existing rejection-coverage tests still pass. Full-workspace cargo xtask test not yet run.

🤖 Generated with Claude Code

Summary by CodeRabbit

New Features
- Evict peers by API address and fully clear them from all lookup paths and caches.
- Added an internal DB helper to remove peer-list entries so deletions can be confirmed durably.
Bug Fixes
- Peers rejected for network/chain mismatches are evicted from memory and staged for durable removal during DB flush to avoid resurrection.
Documentation
- Added design spec for gossip handshake enforcement and network isolation.
Tests
- Added tests covering eviction, staged persistence, and non-network rejection behavior.

coderabbitai · 2026-06-02T13:17:28Z

Note

Reviews paused

It looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the reviews.auto_review.auto_pause_after_reviewed_commits setting.

Use the following commands to manage reviews:

@coderabbitai resume to resume automatic reviews.
@coderabbitai review to trigger a single review.


📝 Walkthrough

Walkthrough

Evicts peers on outbound NetworkMismatch: adds a DB deletion helper, in-memory eviction by API address, stages deletions in the peer service and applies them in flush(), with tests and a design document.

Changes

Peer eviction on network mismatch

Layer / File(s)	Summary
Database peer deletion helper `crates/database/src/database.rs`	Adds `delete_peer_list_item` to remove a `PeerListItems` row by `IrysPeerId` and a test that inserts, deletes, and verifies removal.
In-memory cache eviction by API address `crates/domain/src/models/peer_list.rs`	Adds `remove_peer_by_api_address` and `PeerListDataInner::remove_peer_by_api_address` to evict a peer found by API address, clear all lookup/index maps and `known_peers_cache`, emit `PeerEvent::PeerRemoved`, and return the removed item; includes a regression test.
Service integration: staged DB removals and flush `crates/p2p/src/peer_network_service.rs`	Imports `delete_peer_list_item`, adds `pending_db_removals` to state, `evict_peer_on_network_mismatch` to remove from `PeerList` on `NetworkMismatch`, stages peer IDs for deletion, updates `flush()` to skip reinserting staged peers and to call `delete_peer_list_item` for staged deletions within the same DB transaction, and adds tests plus a `build_peer_list` test helper.
Design specification and roadmap `docs/superpowers/specs/2026-06-02-gossip-handshake-enforcement-design.md`	Documents the asymmetric `chain_id` rejection problem, the near-term evict-on-rejection remediation, expected behavior and limitations, and follow-up plans (session-scoped handshake gate, authenticated HandshakeResponseV2), plus tier-3 proposals and deferred items.

Sequence Diagram

sequenceDiagram
  participant Initiator
  participant announce_yourself_to_address
  participant Responder
  participant PeerList
  participant Database

  Initiator->>announce_yourself_to_address: send handshake
  announce_yourself_to_address->>Responder: forward handshake
  Responder->>announce_yourself_to_address: PeerResponse::Rejected(NetworkMismatch)

  rect rgba(255, 100, 100, 0.5)
  announce_yourself_to_address->>PeerList: remove_peer_by_api_address(api_address)
  PeerList->>PeerList: clear lookup and index maps
  PeerList->>announce_yourself_to_address: return Option<IrysPeerId>
  announce_yourself_to_address->>announce_yourself_to_address: add id to pending_db_removals
  announce_yourself_to_address->>Database: flush applies delete_peer_list_item(id)
  Database->>announce_yourself_to_address: Result<bool>
  end

  announce_yourself_to_address->>Initiator: return PeerHandshakeRejected

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Irys-xyz/irys#1435: Introduces the chain-id/ChainIdMismatch → NetworkMismatch rejection path this PR uses to trigger evictions.

Suggested reviewers

glottologist

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main change: evicting peers when chain_id (network) mismatch is detected during handshake rejection, which directly aligns with the PR's primary objective to fix the bug where such peers remained in cache.
Docstring Coverage	✅ Passed	Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/p2p/src/peer_network_service.rs`:
- Around line 1105-1134: evict_peer_on_network_mismatch currently performs an
independent DB delete (db.update_scoped + delete_peer_list_item) which races
with the periodic flush (persistable_peers_with_mining_addr / flush), allowing
the flushed snapshot to resurrect the peer; change this so eviction is
serialized with the flush writer: do not run a standalone delete in
evict_peer_on_network_mismatch; instead mark/remove the peer in PeerList and
either (a) enqueue the peer_id for removal to the same flush/DB writer (or call
a new PeerList method that acquires the same lock used by flush and performs the
delete inside the flush's DB transaction), or (b) have flush consult a removal
set (updated by evict_peer_on_network_mismatch) and perform
delete_peer_list_item within its transaction; reference
evict_peer_on_network_mismatch, PeerList,
persistable_peers_with_mining_addr/flush, delete_peer_list_item and
db.update_scoped when making the change.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 06c910cf-cf5d-45bf-830d-63992fd7b58c

📥 Commits

Reviewing files that changed from the base of the PR and between 12614c1 and 9dd1ecf.

📒 Files selected for processing (4)

crates/database/src/database.rs
crates/domain/src/models/peer_list.rs
crates/p2p/src/peer_network_service.rs
docs/superpowers/specs/2026-06-02-gossip-handshake-enforcement-design.md

coderabbitai · 2026-06-02T13:23:38Z

+~50–70 LOC (eviction primitive + DB delete + one wiring arm) + tests. No wire
+change, no persistence.


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Clarify “no persistence” wording to avoid contradiction with DB eviction.

This line reads as if persistence is untouched, but the near-term fix explicitly deletes peers from PeerListItems in DB. Consider rewording to “no wire change, no new persisted state/schema” for precision.

github-actions

⚠️ Performance Alert ⚠️

Possible performance regression was detected for benchmark.
Benchmark result of this commit is worse than the previous benchmark result exceeding threshold 1.20.

Benchmark suite	Current: `9dd1ecf`	Previous: `12614c1`	Ratio
`get_recall_range/100`	`0.015291` ms/iter (`± 0.00114`)	`0.012592` ms/iter (`± 0.000486`)	`1.21`
`get_recall_range/1000`	`0.159634` ms/iter (`± 0.006973`)	`0.120043` ms/iter (`± 0.002043`)	`1.33`

This comment was automatically generated by workflow using github-action-benchmark.

github-actions · 2026-06-02T13:37:47Z

Benchmark results: https://irys-xyz.github.io/irys/dev/bench/feat%2Fp2p-evict-chain-mismatch-peers/index.html

JesseTheRobot · 2026-06-02T13:59:31Z

@coderabbitai full review

coderabbitai · 2026-06-02T13:59:38Z

✅ Actions performed

Full review triggered.

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/p2p/src/peer_network_service.rs`:
- Around line 237-261: The code drains pending_db_removals (std::mem::take)
while holding the lock before calling db.update_scoped, which loses staged
removals if the transaction fails; instead, while holding the lock capture a
clone/copy of pending_db_removals (do not std::mem::take) and release the lock,
run update_scoped using that cloned set (removals_clone), then if update_scoped
succeeds reacquire self.state.lock().await and remove those successfully-applied
IDs from pending_db_removals (or clear them) so removals are only dropped on
successful transaction; refer to pending_db_removals, state.lock().await,
update_scoped, and the local removals variable to locate where to change the
logic.

In `@docs/superpowers/specs/2026-06-02-gossip-handshake-enforcement-design.md`:
- Around line 97-103: Update the docs to reflect that on outbound-announce
rejection (the PeerResponse::Rejected arm in announce_yourself_to_address) you
should still evict the peer from the in-memory cache using
remove_peer_by_api_address, but do NOT perform immediate DB deletion; instead
record the peer in pending_db_removals and let the DatabaseProvider-only writer
apply deletes during flush via db.update_scoped, preserving the existing error
return when rejected_response.reason ==
version::RejectionReason::NetworkMismatch.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: ad594948-d365-46ba-ac93-f7bc02bd9fd0

📥 Commits

Reviewing files that changed from the base of the PR and between 12614c1 and 58ceb14.

📒 Files selected for processing (4)

crates/database/src/database.rs
crates/domain/src/models/peer_list.rs
crates/p2p/src/peer_network_service.rs
docs/superpowers/specs/2026-06-02-gossip-handshake-enforcement-design.md

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (2)

crates/p2p/src/peer_network_service.rs (2)

246-270: ⚠️ Potential issue | 🟠 Major | 🏗️ Heavy lift

Close the remaining snapshot/staging race.

flush() still has a crash window: it takes pending_db_removals, then snapshots peer_list, but a concurrent NetworkMismatch eviction can land after that take and before this transaction commits. In that case this flush can write the stale snapshot back to PeerListItems, and if the node exits before the next flush the peer is reloaded on restart.

This needs one atomic view of both the peer snapshot and the staged removals. The simplest shape here is to serialize peer_list.remove_peer_by_api_address(...) + pending_db_removals.insert(...) with the same state mutex that guards the flush snapshot/take path.

Possible fix shape

 async fn flush(&self) -> Result<(), PeerListServiceError> {
-    let (db, removals) = {
+    let (db, removals, persistable_peers) = {
         let mut state = self.state.lock().await;
         (
             state.db.clone(),
             std::mem::take(&mut state.pending_db_removals),
+            self.peer_list.persistable_peers_with_mining_addr(),
         )
     };
-
-    let persistable_peers = self.peer_list.persistable_peers_with_mining_addr();
     let result = db
         .update_scoped(|tx| {
             for (peer_id, peer) in persistable_peers.iter() {
                 if removals.contains(peer_id) {
                     continue;
@@
-                if let Some(peer_id) =
-                    evict_peer_on_network_mismatch(&peer_list, api_address, &rejected_response)
-                {
-                    inner.state.lock().await.pending_db_removals.insert(peer_id);
+                let mut state = inner.state.lock().await;
+                if let Some(peer_id) =
+                    evict_peer_on_network_mismatch(&peer_list, api_address, &rejected_response)
+                {
+                    state.pending_db_removals.insert(peer_id);
                 }

Also applies to: 1119-1123

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/p2p/src/peer_network_service.rs` around lines 246 - 270, The flush()
path can race with concurrent NetworkMismatch evictions because the peer
snapshot (persistable_peers) and staged removals (pending_db_removals /
removals) are taken separately; to fix, make the snapshot and the staging of
evictions atomic under the same state mutex so you cannot re-persist a peer that
was evicted after the snapshot: acquire self.state.lock() around the sequence
that calls peer_list.remove_peer_by_api_address(...) /
pending_db_removals.insert(...) and the code that creates persistable_peers and
removals, then release the lock before calling db.update_scoped(...) (leaving
transaction logic unchanged); apply the same locking pattern where peer removals
are staged (the peer_list.remove_peer_by_api_address and
pending_db_removals.insert sites) so the update_scoped(tx ->
insert_peer_list_item / delete_peer_list_item) always operates on a consistent
snapshot.

1236-1236: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Use #[tokio::test] on these async tests instead of importing tokio::test and using #[test].

In crates/p2p/src/peer_network_service.rs, the async fn tests are currently annotated with #[test] after use tokio::test;, but the repo guideline requires using the explicit #[tokio::test] attribute for async tests:

network_mismatch_rejection_evicts_from_cache_and_returns_id
non_network_rejection_retains_peer
flush_deletes_pending_removals_without_resurrecting

As per coding guidelines: “Use #[tokio::test] for async tests”.

Also applies to: 1327-1328, 1363-1364, 1396-1397

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/p2p/src/peer_network_service.rs` at line 1236, The async tests
network_mismatch_rejection_evicts_from_cache_and_returns_id,
non_network_rejection_retains_peer, and
flush_deletes_pending_removals_without_resurrecting (and the other occurrences
around 1327-1328, 1363-1364, 1396-1397) should use the #[tokio::test] attribute
instead of importing tokio::test and annotating with #[test]; update each test
function to replace #[test] with #[tokio::test], remove the standalone use
tokio::test; import (or any unused import) if present, and ensure the async fn
signatures remain unchanged so the tests run under the Tokio runtime.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: cad070b9-92b2-49e9-9c8b-d26c081ba44b

📥 Commits

Reviewing files that changed from the base of the PR and between 58ceb14 and 0aedc1a.

📒 Files selected for processing (1)

crates/p2p/src/peer_network_service.rs

coderabbitai

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

crates/p2p/src/peer_network_service.rs (1)
1120-1138: ⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Don't mark rejected handshakes as successful announcements.

By this branch, Line 1050 has already sent AnnouncementFinished { success: true }, so a rejected handshake still lands in successful_announcements. Line 526 then suppresses fresh non-forced handshakes to the same API address for SUCCESSFUL_ANNOUNCEMENT_CACHE_TTL, even though this path just evicted it. Only emit the success notification after PeerResponse::Accepted; send a non-retry failure for PeerResponse::Rejected instead.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@crates/p2p/src/peer_network_service.rs` around lines 1120 - 1138, The code
currently emits AnnouncementFinished { success: true } before handling
PeerResponse::Rejected, causing rejected handshakes to be recorded in
successful_announcements; instead, only emit the success announcement after
matching PeerResponse::Accepted and for PeerResponse::Rejected emit a failure
(non-retry) announcement so the successful_announcements cache is not populated
for evicted peers. Concretely, move the AnnouncementFinished { success: true }
emission out of the common pre-response path and into the PeerResponse::Accepted
branch, and in the PeerResponse::Rejected branch send a failure announcement
(with success: false or a distinct non-retry event) before calling
evict_peer_on_network_mismatch and inserting into state.pending_db_removals;
update any handlers that expect the announcement shape accordingly (references:
AnnouncementFinished, successful_announcements, PeerResponse::Accepted,
PeerResponse::Rejected, evict_peer_on_network_mismatch, pending_db_removals).

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: b943b519-7745-4db1-9080-4c8ea1e6c072

📥 Commits

Reviewing files that changed from the base of the PR and between 0aedc1a and 281ebbb.

📒 Files selected for processing (1)

crates/p2p/src/peer_network_service.rs

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@crates/domain/src/models/peer_list.rs`:
- Around line 1440-1480: Update the test
remove_peer_by_api_address_clears_all_lookups to also verify the gossip lookup
and emitted removal event: after calling
peer_list.remove_peer_by_api_address(&api), assert that
peer_list.peer_by_gossip_address(&peer.address.gossip) (or the appropriate
gossip address field) returns None, and assert that the PeerRemoved event was
emitted (check the event stream/queue or peer_list.recent_events()/last_event()
for a PeerRemoved containing the removed peer_id); reference the
functions/methods peer_by_gossip_address, remove_peer_by_api_address and the
PeerRemoved event type to locate where to add these assertions.

In `@crates/p2p/src/peer_network_service.rs`:
- Around line 1120-1142: The code currently treats any HTTP-level
Ok(peer_response) as an announcement success, which lets NetworkMismatch
rejections remain recorded in successful_announcements; instead, only emit the
AnnouncementFinished { success: true, retry: false } notification from the
PeerResponse::Accepted arm and for PeerResponse::Rejected return/report an
announcement with success: false, retry: false (so handle_handshake_request()
won't cache rejected peers in successful_announcements). Update the
PeerResponse::Rejected branch (where evict_peer_on_network_mismatch(...) and
pending_db_removals.insert(...) are done) to report failure (success:false,
retry:false) and keep returning
PeerListServiceError::PeerHandshakeRejected(rejected_response), and move the
success notification logic into the PeerResponse::Accepted branch; ensure
references to successful_announcements handling and any AnnouncementFinished
emission are adjusted accordingly.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 58c48eb6-1246-4a22-bfda-c8397c8d3670

📥 Commits

Reviewing files that changed from the base of the PR and between 281ebbb and b5c80d6.

📒 Files selected for processing (4)

crates/database/src/database.rs
crates/domain/src/models/peer_list.rs
crates/p2p/src/peer_network_service.rs
docs/superpowers/specs/2026-06-02-gossip-handshake-enforcement-design.md

…tion

A handshake rejected for a chain_id mismatch (mapped to NetworkMismatch) only recorded a failed announcement and left the peer in the cache. The gossip data plane (check_peer_v*) trusts cache membership rather than handshake outcome, so the rejected peer stayed fully trusted and the node kept exchanging gossip with a different network.

Now a NetworkMismatch handshake rejection evicts the peer from the in-memory cache (all lookup maps) and deletes it from the persistent peer DB, so a node re-announcing to its cached peers while on the wrong chain is isolated after the startup announce round.

- domain: PeerList::remove_peer_by_api_address removes a peer from the cache and every index map, emitting PeerRemoved.
- database: delete_peer_list_item removes a peer from the PeerListItems table.
- p2p: evict_peer_on_network_mismatch wired into the outbound-announce rejection path; no-op for any non-network rejection reason.

Includes a design doc covering the diagnosis and the larger follow-up (session-scoped handshake + authenticated handshake response).

… resurrect race

evict_peer_on_network_mismatch deleted the peer in its own db.update_scoped transaction, which raced with the periodic flush: if flush had already snapshotted persistable_peers before the eviction removed the peer from the cache, flush's insert re-inserted the just-deleted peer, resurrecting it in the DB (reloaded on the next restart).

Make flush the sole peer-DB writer. Eviction now removes the peer from the in-memory cache and stages its peer_id in PeerNetworkServiceState.pending_db_removals; flush drains the set and, within its single transaction, skips re-inserting any staged peer and deletes them. The in-memory eviction stays immediate, so the gossip data plane stops trusting the peer at once.

flush() discarded the inner transaction Result via `let _ =`, silently dropping insert/delete failures (it could even report success after a failed delete), and it `mem::take`d pending_db_removals before the write so a failed flush lost the staged chain-mismatch eviction deletes entirely. Flatten the nested update_scoped result so inner PeerListServiceError values propagate, and re-stage removals (merge, to preserve evictions staged during the lock-free write window) when the flush fails so the next flush retries. Deletes are idempotent, so the retry is safe.

flush() snapshotted persistable_peers (peer-list lock) and drained pending_db_removals (state lock) separately, so a concurrent NetworkMismatch eviction could be observed half-applied and a just-evicted peer re-persisted. Take both under the same state lock, and perform the eviction's cache removal and removal staging under that same lock, so flush always sees a consistent (snapshot, removals) pair. Lock ordering is state -> peer_list on both paths (peer-list methods never acquire the state lock), so no deadlock. The lock is released before the DB write; an eviction during that lock-free write can still re-persist a peer for one flush cycle, which is benign (the in-memory eviction is already in effect and the next flush deletes it).
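
The staging scheme described in the flush-related commit notes above could look roughly like the following. This is a minimal sketch under stated assumptions, not the code in peer_network_service.rs: it uses std::sync::Mutex instead of an async lock, a hypothetical `FakeDb` with a `fail_next` flag in place of db.update_scoped, and a plain HashMap as the peer cache. `Service`, `State`, `evict`, and `apply` are illustrative names only.

```rust
use std::collections::{HashMap, HashSet};
use std::sync::Mutex;

/// Hypothetical peer id; the real type differs.
type PeerId = u64;

#[derive(Default)]
pub struct State {
    /// In-memory peer cache (stand-in for the peer list).
    pub cache: HashMap<PeerId, String>,
    /// Evicted ids waiting for the next flush to delete them durably.
    pub pending_db_removals: HashSet<PeerId>,
}

/// Fake durable store; `fail_next` simulates one failed transaction.
#[derive(Default)]
pub struct FakeDb {
    pub rows: HashMap<PeerId, String>,
    pub fail_next: bool,
}

impl FakeDb {
    /// One "transaction": upsert the snapshot minus staged removals,
    /// then delete the staged ids. Fails before writing anything.
    fn apply(
        &mut self,
        snapshot: &HashMap<PeerId, String>,
        removals: &HashSet<PeerId>,
    ) -> Result<(), String> {
        if self.fail_next {
            self.fail_next = false;
            return Err("simulated transaction failure".to_string());
        }
        for (id, peer) in snapshot {
            if !removals.contains(id) {
                self.rows.insert(*id, peer.clone());
            }
        }
        for id in removals {
            self.rows.remove(id);
        }
        Ok(())
    }
}

pub struct Service {
    pub state: Mutex<State>,
    pub db: Mutex<FakeDb>,
}

impl Service {
    /// Cache removal and staging happen under the same state lock.
    pub fn evict(&self, id: PeerId) {
        let mut state = self.state.lock().unwrap();
        if state.cache.remove(&id).is_some() {
            state.pending_db_removals.insert(id);
        }
    }

    /// Takes snapshot and removals under one lock, writes without the
    /// lock, and re-stages the removals if the write fails.
    pub fn flush(&self) -> Result<(), String> {
        let (snapshot, removals) = {
            let mut state = self.state.lock().unwrap();
            (
                state.cache.clone(),
                std::mem::take(&mut state.pending_db_removals),
            )
        };
        let result = self.db.lock().unwrap().apply(&snapshot, &removals);
        if result.is_err() {
            // Merge rather than overwrite, so evictions staged during
            // the lock-free write window are preserved.
            self.state.lock().unwrap().pending_db_removals.extend(removals);
        }
        result
    }
}
```

Because the delete is idempotent, retrying a re-staged removal on the next flush is harmless in this model, which matches the reasoning given in the commit notes.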

The near-term design's step 3 described an inline DB delete in the eviction path. The implementation defers the delete: eviction removes the peer from the cache and stages its id in pending_db_removals (under the same state lock flush takes), and flush — the sole peer-DB writer — applies delete_peer_list_item in its transaction. Update step 3 to match.

…uncements

announce_yourself_to_address sent AnnouncementFinished{success:true} for any transport-level Ok response, including PeerResponse::Rejected, so a rejection (e.g. NetworkMismatch) was cached in successful_announcements and suppressed future announce attempts as if it had succeeded.

Move the success notification into the Accepted arm; the Rejected arm now reports success:false, retry:false (terminal: re-announcing won't change the peer's mind) before evicting on NetworkMismatch and returning PeerHandshakeRejected.
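
The outcome mapping described in this commit note can be summarized as a small decision function. The sketch below is illustrative only: the enum and struct shapes are simplified, `RejectionReason::Other`, `Outcome`, and `classify` are hypothetical names, and the real code performs these steps inline in announce_yourself_to_address rather than through a helper like this.

```rust
/// Simplified rejection reasons; `Other` is a hypothetical catch-all.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RejectionReason {
    NetworkMismatch,
    Other,
}

/// Simplified handshake response.
pub enum PeerResponse {
    Accepted,
    Rejected(RejectionReason),
}

/// Simplified shape of the announcement notification.
#[derive(Debug, PartialEq, Eq)]
pub struct AnnouncementFinished {
    pub success: bool,
    pub retry: bool,
}

/// Hypothetical summary of what the caller should do next.
#[derive(Debug, PartialEq, Eq)]
pub struct Outcome {
    pub notification: AnnouncementFinished,
    pub evict: bool,
}

/// Success is reported only for Accepted. Any rejection is a terminal
/// failure (no retry), and only NetworkMismatch triggers eviction.
pub fn classify(response: &PeerResponse) -> Outcome {
    match response {
        PeerResponse::Accepted => Outcome {
            notification: AnnouncementFinished { success: true, retry: false },
            evict: false,
        },
        PeerResponse::Rejected(reason) => Outcome {
            notification: AnnouncementFinished { success: false, retry: false },
            evict: *reason == RejectionReason::NetworkMismatch,
        },
    }
}
```

Keeping the success notification inside the Accepted arm means a transport-level Ok can no longer populate the successful-announcement cache on its own.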

…viction

remove_peer_by_api_address_clears_all_lookups now also asserts the gossip-address index returns None after eviction and that a PeerRemoved event carrying the evicted peer_id is emitted (subscribed before the removal).

coderabbitai Bot reviewed Jun 2, 2026

github-actions Bot reviewed Jun 2, 2026

coderabbitai Bot reviewed Jun 2, 2026

Comment thread crates/p2p/src/peer_network_service.rs Outdated

Comment thread docs/superpowers/specs/2026-06-02-gossip-handshake-enforcement-design.md

coderabbitai Bot reviewed Jun 2, 2026

JesseTheRobot force-pushed the feat/p2p-evict-chain-mismatch-peers branch from 281ebbb to b5c80d6 on June 2, 2026 17:15

coderabbitai Bot reviewed Jun 2, 2026

Comment thread crates/domain/src/models/peer_list.rs

Comment thread crates/p2p/src/peer_network_service.rs

JesseTheRobot added 7 commits June 2, 2026 17:28

JesseTheRobot force-pushed the feat/p2p-evict-chain-mismatch-peers branch from b5c80d6 to 9a2f23a on June 2, 2026 17:43

JesseTheRobot merged commit 6090870 into master Jun 2, 2026
18 checks passed

JesseTheRobot deleted the feat/p2p-evict-chain-mismatch-peers branch June 2, 2026 18:03
