Skip to content

feat: simplify the client#1153

Merged
antouhou merged 11 commits into
masterfrom
feat-simplify-the-client
Feb 24, 2026
Merged

feat: simplify the client#1153
antouhou merged 11 commits into
masterfrom
feat-simplify-the-client

Conversation

@antouhou

@antouhou antouhou commented Feb 18, 2026

Copy link
Copy Markdown
Contributor

Describe the changes
A clear and concise description of what the bug fix, feature, or improvement is.

Related Issue(s)
Please link to the issue(s) that will be closed with this PR.

Checklist

  • Tests have been added/updated for the changes.
  • Documentation has been updated for the changes (if applicable).
  • The code follows Rust's style guidelines.

Additional Context
Add any other context about the pull request here.

Summary by CodeRabbit

  • Refactor
    • Centralized peer-to-peer fetch and retry logic to improve reliability and introduce a single automatic retry during peer negotiation.
    • No changes to public APIs or exported signatures.
  • Bug Fixes
    • More consistent handling of missing, rejected, or transient peer responses to reduce persistent failures.
  • Tests
    • Added tests covering retry, negotiation, offline, and error paths.
  • Chores
    • Added a dev/test dependency to support expanded testing.

@coderabbitai

coderabbitai Bot commented Feb 18, 2026

Copy link
Copy Markdown
Contributor
📝 Walkthrough

Walkthrough

Refactors per-peer data-pull logic in GossipClient by adding an internal PeerPullOutcome<T> enum and unified helpers that perform a single handshake-retry cycle and centralize response handling. No public APIs or exported signatures were changed. (≈34 words)

Changes

Cohort / File(s) Summary
Gossip Client Retry Refactor
crates/p2p/src/gossip_client.rs
Adds private enum PeerPullOutcome<T> and helpers: pull_with_handshake_retry, pull_data_from_peer_with_retry, pull_primitive_data_from_peer_with_retry, handle_peer_pull_response, block_body. Replaces many per-peer retry loops with unified two-attempt (handshake retry) flow; adds unit tests and internal docs.
Dev-test dependency
crates/p2p/Cargo.toml
Adds tokio as a dev-dependency with workspace = true and features = ["test-util"] for tests.

Sequence Diagram(s)

sequenceDiagram
  participant GossipClient as GossipClient
  participant Peer as RemotePeer
  participant Network as PeerNetwork/HandshakeManager
  rect rgba(200,200,255,0.5)
    GossipClient->>Peer: Send data pull request (V2/V1)
    alt Accepted(Some)
      Peer-->>GossipClient: Accepted(Some data)
    else Accepted(None)
      Peer-->>GossipClient: Accepted(None)
    else Rejected(HandshakeNeeded)
      Peer-->>GossipClient: Rejected(HandshakeRequired)
      GossipClient->>Network: Initiate handshake with Peer
      Network-->>Peer: Perform handshake
      Network-->>GossipClient: Handshake result (ok/err)
      alt Handshake ok
        GossipClient->>Peer: Retry data pull request
        Peer-->>GossipClient: Accepted(Some/None) or Rejected
      else Handshake err
        Network-->>GossipClient: Handshake failed -> propagate error
      end
    end
  end
Loading

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested reviewers

  • DanMacDonald
  • JesseTheRobot
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)

Check name Status Explanation Resolution
Title check ❓ Inconclusive The title 'feat: simplify the client' is vague and generic, using broad language that doesn't clearly describe the specific refactoring work (internal retry logic consolidation, handshake retry helpers, response handling centralization). Provide a more specific title that describes the actual change, e.g., 'refactor: consolidate peer retry logic with handshake retry helpers' or 'refactor: centralize gossip client response handling'.
✅ Passed checks (2 passed)
Check name Status Explanation
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.
Docstring Coverage ✅ Passed Docstring coverage is 100.00% which is sufficient. The required threshold is 80.00%.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
  • 📝 Generate docstrings (stacked PR)
  • 📝 Generate docstrings (commit on current branch)
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Post copyable unit tests in a comment
  • Commit unit tests in branch feat-simplify-the-client

Tip

Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord.


Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

Comment @coderabbitai help to get the list of available commands and usage tips.

@antouhou antouhou marked this pull request as ready for review February 19, 2026 22:05

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
crates/p2p/src/gossip_client.rs (1)

1409-1425: ⚠️ Potential issue | 🟠 Major

Circuit breaker blocking check is bypassed in the transaction pull path.

pull_primitive_data_from_peer_with_retry (used by pull_transaction_from_peer) does not invoke check_circuit_breaker before attempting the request. In contrast, pull_data_from_peer_with_retry (lines 1470-1471 comment: "includes circuit breaker checks") does check it at the start of pull_data_and_update_the_score (line 354).

This means transactions will still be requested from peers with an open circuit breaker, even though the circuit breaker state is still recorded post-request via handle_data_retrieval_score. While this avoids the recursion cycle through pull_block_body_from_v1_peer → pull_transaction_from_peer, it removes early-request protection against known-bad peers.

Consider either: adding self.check_circuit_breaker(&peer.0)?; to pull_primitive_data_and_update_the_score, or documenting why bypassing this check is acceptable for transactions.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/p2p/src/gossip_client.rs` around lines 1409 - 1425,
pull_transaction_from_peer currently calls
pull_primitive_data_from_peer_with_retry which ends up in
pull_primitive_data_and_update_the_score without calling check_circuit_breaker,
allowing requests to peers with open circuit breakers; add a pre-request
circuit-breaker check by invoking self.check_circuit_breaker(&peer.0)?; at the
start of pull_primitive_data_and_update_the_score (or, if you prefer a narrower
change, add the same call inside pull_transaction_from_peer immediately before
calling pull_primitive_data_from_peer_with_retry) so circuit state is honored
while still preserving the retry/recursion behavior handled by
pull_primitive_data_from_peer_with_retry and handle_data_retrieval_score.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/p2p/src/gossip_client.rs`:
- Around line 1468-1614: Add unit tests for the new retry helpers: write tests
that call pull_data_from_peer_with_retry,
pull_primitive_data_from_peer_with_retry and indirectly exercise
handle_peer_pull_response; cover (1) immediate success returning Ok, (2) first
attempt returning RejectionReason::HandshakeRequired which triggers
peer_list.initiate_handshake and then succeeds on retry, and (3) both attempts
failing to produce an Err. Use mocks/stubs for GossipClient internals (simulate
pull_data_and_update_the_score / pull_primitive_data_and_update_the_score
returning GossipResponse::Accepted(Some(_)),
GossipResponse::Rejected(HandshakeRequired(...)) then Accepted, and persistent
errors) and assert the returned PeerNetworkError or (IrysPeerId, mapped) values
and that peer_list.initiate_handshake and set_is_online_by_peer_id are invoked
where applicable.
- Around line 1468-1545: Both functions contain the same retry loop; extract
that loop into a single helper (e.g., pull_with_handshake_retry) that accepts
the common parameters (data_request: GossipDataRequestV2, peer: &(IrysPeerId,
PeerListItem), peer_list: &PeerList) plus an async fetch closure/factory which
returns the same Result type used today; inside the helper run the for attempt
in 0..2 loop, call the provided fetch closure to get the Result, then call
Self::handle_peer_pull_response(...) exactly as currently done and return
PeerPullOutcome results; then replace pull_data_from_peer_with_retry to call the
new helper with a closure that invokes self.pull_data_and_update_the_score(...,
fallback_header, ...) and replace pull_primitive_data_from_peer_with_retry to
call it with a closure that invokes
self.pull_primitive_data_and_update_the_score(...), keeping the existing mapping
and error message behavior.

---

Outside diff comments:
In `@crates/p2p/src/gossip_client.rs`:
- Around line 1409-1425: pull_transaction_from_peer currently calls
pull_primitive_data_from_peer_with_retry which ends up in
pull_primitive_data_and_update_the_score without calling check_circuit_breaker,
allowing requests to peers with open circuit breakers; add a pre-request
circuit-breaker check by invoking self.check_circuit_breaker(&peer.0)?; at the
start of pull_primitive_data_and_update_the_score (or, if you prefer a narrower
change, add the same call inside pull_transaction_from_peer immediately before
calling pull_primitive_data_from_peer_with_retry) so circuit state is honored
while still preserving the retry/recursion behavior handled by
pull_primitive_data_from_peer_with_retry and handle_data_retrieval_score.

Comment thread crates/p2p/src/gossip_client.rs
Comment thread crates/p2p/src/gossip_client.rs

@glottologist glottologist left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Non blocking nits

Comment thread crates/p2p/src/gossip_client.rs Outdated
/// Uses `pull_data_and_update_the_score` which includes circuit breaker
/// checks and V1 BlockBody compatibility handling.
/// Used by `pull_block_body_from_peer` and `pull_block_header_from_peer`.
async fn pull_data_from_peer_with_retry<T>(

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: These are near dups, we should condense/extract

Comment thread crates/p2p/src/gossip_client.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/p2p/src/gossip_client.rs`:
- Around line 49-54: The enum PeerPullOutcome<T> is missing a Debug impl; add
#[derive(Debug)] above the enum declaration for PeerPullOutcome<T> so the
compiler generates a Debug implementation (which will introduce the required T:
Debug bound where used) to improve test diagnostics and logging, then update any
tests that currently use std::mem::discriminant to instead rely on the enum's
Debug output if desired.
- Around line 1486-1507: The final Err(...) after the for loop is unreachable
because the loop over attempt in 0..2 will always return on the second iteration
(handle_peer_pull_response only yields RetryAfterHandshake when attempt == 0);
replace the Err(PeerNetworkError::FailedToRequestData(...)) tail with
unreachable!() to document the invariant and cause a panic if future changes
violate it. Update the code in the function containing the for attempt in 0..2
loop that calls Self::handle_peer_pull_response and use unreachable!()
(optionally with a brief message) instead of returning the PeerNetworkError to
make the unreachable control-flow explicit.
- Around line 1571-1581: The outer warn! that logs all rejections is redundant
for the RejectionReason::HandshakeRequired branch because a more specific warn!
follows; update the handling in the match so that either (a) the outer log is
demoted to debug! before matching RejectionReason, or (b) skip logging in the
outer path when the reason is RejectionReason::HandshakeRequired and let the
inner warn! (inside the RejectionReason::HandshakeRequired arm) be the only log;
locate the outer warn! and the match on RejectionReason::HandshakeRequired in
gossip_client.rs and adjust the logging accordingly (references:
RejectionReason::HandshakeRequired, peer_list.initiate_handshake, the outer
warn! that logs "Peer {:?} rejected {:?} request: {:?}").
- Around line 3075-3137: The two tests calling pull_with_handshake_retry
(handshake_required_then_success_on_retry and
handshake_required_both_attempts_returns_err) trigger a real sleep on
HANDSHAKE_WAIT_TIMEOUT; update both test annotations to
#[tokio::test(start_paused = true)] so the tokio clock is paused and timers
auto-advance (removing the ~1s wait per test) while keeping the same async
behavior and assertions against pull_with_handshake_retry and its handshake
retry logic.
- Around line 1570-1606: The match currently consumes reason (RejectionReason)
inside the GossipResponse::Rejected arm which relies on RejectionReason being
Copy; change the match to match &reason { ... } and update each pattern to match
references (e.g., RejectionReason::HandshakeRequired(_) ->
RejectionReason::HandshakeRequired(inner) becomes
RejectionReason::HandshakeRequired(inner) with an ampersand) so the variable
reason remains usable for the later warn! and the
PeerPullOutcome::Err(PeerNetworkError::FailedToRequestData(...)) format call;
ensure the initiate_handshake call on peer_list.initiate_handshake(...) and
other arms (GossipDisabled, InvalidCredentials, ProtocolMismatch) use reference
patterns accordingly so no ownership is moved.

Comment thread crates/p2p/src/gossip_client.rs
Comment thread crates/p2p/src/gossip_client.rs
Comment thread crates/p2p/src/gossip_client.rs Outdated
Comment thread crates/p2p/src/gossip_client.rs

@coderabbitai coderabbitai Bot left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

♻️ Duplicate comments (1)
crates/p2p/src/gossip_client.rs (1)

2896-2899: 🧹 Nitpick | 🔵 Trivial

Nit: PeerPullOutcome now derives Debug, so std::mem::discriminant is no longer necessary.

Since #[derive(Debug)] was added to PeerPullOutcome<T>, you can print the variant directly:

♻️ Proposed simplification
-                other => panic!("expected Ok, got {:?}", std::mem::discriminant(&other)),
+                other => panic!("expected Ok, got {:?}", other),
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@crates/p2p/src/gossip_client.rs` around lines 2896 - 2899, The match's panic
uses std::mem::discriminant unnecessarily; since PeerPullOutcome now implements
Debug, change the panic arm to print the variant directly (e.g.,
panic!("expected Ok, got {:?}", other)) so the full enum variant/value is shown;
update the match handling that binds outcome (PeerPullOutcome::Ok((id, _block))
=> ...) to use the Debug-printable `other` in the panic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@crates/p2p/src/gossip_client.rs`:
- Around line 1568-1604: The inner match arm for GossipResponse::Rejected
currently shadows the outer variable named reason when matching
RejectionReason::HandshakeRequired; rename the inner binding to handshake_reason
(or similar) in the RejectionReason::HandshakeRequired arm where you call
peer_list.initiate_handshake and log the handshake wait so the outer
RejectionReason remains unshadowed for the final PeerPullOutcome::Err call that
uses reason in the format! message; update any references inside that arm (e.g.,
the warn! call and any uses of the inner Option) to use handshake_reason
instead.

---

Duplicate comments:
In `@crates/p2p/src/gossip_client.rs`:
- Around line 2896-2899: The match's panic uses std::mem::discriminant
unnecessarily; since PeerPullOutcome now implements Debug, change the panic arm
to print the variant directly (e.g., panic!("expected Ok, got {:?}", other)) so
the full enum variant/value is shown; update the match handling that binds
outcome (PeerPullOutcome::Ok((id, _block)) => ...) to use the Debug-printable
`other` in the panic.

ℹ️ Review info

Configuration used: Repository UI

Review profile: ASSERTIVE

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 425f425 and 1e1476e.

📒 Files selected for processing (2)
  • crates/p2p/Cargo.toml
  • crates/p2p/src/gossip_client.rs

Comment thread crates/p2p/src/gossip_client.rs
@antouhou antouhou merged commit a100a92 into master Feb 24, 2026
17 checks passed
@antouhou antouhou deleted the feat-simplify-the-client branch February 24, 2026 22:49
@coderabbitai coderabbitai Bot mentioned this pull request Feb 25, 2026
3 tasks
@coderabbitai coderabbitai Bot mentioned this pull request Mar 11, 2026
3 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants