feat: simplify the client#1153
Conversation
# Conflicts: # crates/p2p/src/gossip_client.rs
📝 WalkthroughWalkthroughRefactors per-peer data-pull logic in GossipClient by adding an internal Changes
Sequence Diagram(s)sequenceDiagram
participant GossipClient as GossipClient
participant Peer as RemotePeer
participant Network as PeerNetwork/HandshakeManager
rect rgba(200,200,255,0.5)
GossipClient->>Peer: Send data pull request (V2/V1)
alt Accepted(Some)
Peer-->>GossipClient: Accepted(Some data)
else Accepted(None)
Peer-->>GossipClient: Accepted(None)
else Rejected(HandshakeNeeded)
Peer-->>GossipClient: Rejected(HandshakeRequired)
GossipClient->>Network: Initiate handshake with Peer
Network-->>Peer: Perform handshake
Network-->>GossipClient: Handshake result (ok/err)
alt Handshake ok
GossipClient->>Peer: Retry data pull request
Peer-->>GossipClient: Accepted(Some/None) or Rejected
else Handshake err
Network-->>GossipClient: Handshake failed -> propagate error
end
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Suggested reviewers
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches
🧪 Generate unit tests (beta)
Tip Issue Planner is now in beta. Read the docs and try it out! Share your feedback on Discord. Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
crates/p2p/src/gossip_client.rs (1)
1409-1425:⚠️ Potential issue | 🟠 MajorCircuit breaker blocking check is bypassed in the transaction pull path.
pull_primitive_data_from_peer_with_retry(used bypull_transaction_from_peer) does not invokecheck_circuit_breakerbefore attempting the request. In contrast,pull_data_from_peer_with_retry(lines 1470-1471 comment: "includes circuit breaker checks") does check it at the start ofpull_data_and_update_the_score(line 354).This means transactions will still be requested from peers with an open circuit breaker, even though the circuit breaker state is still recorded post-request via
handle_data_retrieval_score. While this avoids the recursion cycle throughpull_block_body_from_v1_peer → pull_transaction_from_peer, it removes early-request protection against known-bad peers.Consider either: adding
self.check_circuit_breaker(&peer.0)?;topull_primitive_data_and_update_the_score, or documenting why bypassing this check is acceptable for transactions.🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/p2p/src/gossip_client.rs` around lines 1409 - 1425, pull_transaction_from_peer currently calls pull_primitive_data_from_peer_with_retry which ends up in pull_primitive_data_and_update_the_score without calling check_circuit_breaker, allowing requests to peers with open circuit breakers; add a pre-request circuit-breaker check by invoking self.check_circuit_breaker(&peer.0)?; at the start of pull_primitive_data_and_update_the_score (or, if you prefer a narrower change, add the same call inside pull_transaction_from_peer immediately before calling pull_primitive_data_from_peer_with_retry) so circuit state is honored while still preserving the retry/recursion behavior handled by pull_primitive_data_from_peer_with_retry and handle_data_retrieval_score.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/p2p/src/gossip_client.rs`:
- Around line 1468-1614: Add unit tests for the new retry helpers: write tests
that call pull_data_from_peer_with_retry,
pull_primitive_data_from_peer_with_retry and indirectly exercise
handle_peer_pull_response; cover (1) immediate success returning Ok, (2) first
attempt returning RejectionReason::HandshakeRequired which triggers
peer_list.initiate_handshake and then succeeds on retry, and (3) both attempts
failing to produce an Err. Use mocks/stubs for GossipClient internals (simulate
pull_data_and_update_the_score / pull_primitive_data_and_update_the_score
returning GossipResponse::Accepted(Some(_)),
GossipResponse::Rejected(HandshakeRequired(...)) then Accepted, and persistent
errors) and assert the returned PeerNetworkError or (IrysPeerId, mapped) values
and that peer_list.initiate_handshake and set_is_online_by_peer_id are invoked
where applicable.
- Around line 1468-1545: Both functions contain the same retry loop; extract
that loop into a single helper (e.g., pull_with_handshake_retry) that accepts
the common parameters (data_request: GossipDataRequestV2, peer: &(IrysPeerId,
PeerListItem), peer_list: &PeerList) plus an async fetch closure/factory which
returns the same Result type used today; inside the helper run the for attempt
in 0..2 loop, call the provided fetch closure to get the Result, then call
Self::handle_peer_pull_response(...) exactly as currently done and return
PeerPullOutcome results; then replace pull_data_from_peer_with_retry to call the
new helper with a closure that invokes self.pull_data_and_update_the_score(...,
fallback_header, ...) and replace pull_primitive_data_from_peer_with_retry to
call it with a closure that invokes
self.pull_primitive_data_and_update_the_score(...), keeping the existing mapping
and error message behavior.
---
Outside diff comments:
In `@crates/p2p/src/gossip_client.rs`:
- Around line 1409-1425: pull_transaction_from_peer currently calls
pull_primitive_data_from_peer_with_retry which ends up in
pull_primitive_data_and_update_the_score without calling check_circuit_breaker,
allowing requests to peers with open circuit breakers; add a pre-request
circuit-breaker check by invoking self.check_circuit_breaker(&peer.0)?; at the
start of pull_primitive_data_and_update_the_score (or, if you prefer a narrower
change, add the same call inside pull_transaction_from_peer immediately before
calling pull_primitive_data_from_peer_with_retry) so circuit state is honored
while still preserving the retry/recursion behavior handled by
pull_primitive_data_from_peer_with_retry and handle_data_retrieval_score.
| /// Uses `pull_data_and_update_the_score` which includes circuit breaker | ||
| /// checks and V1 BlockBody compatibility handling. | ||
| /// Used by `pull_block_body_from_peer` and `pull_block_header_from_peer`. | ||
| async fn pull_data_from_peer_with_retry<T>( |
There was a problem hiding this comment.
Nit: These are near dups, we should condense/extract
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/p2p/src/gossip_client.rs`:
- Around line 49-54: The enum PeerPullOutcome<T> is missing a Debug impl; add
#[derive(Debug)] above the enum declaration for PeerPullOutcome<T> so the
compiler generates a Debug implementation (which will introduce the required T:
Debug bound where used) to improve test diagnostics and logging, then update any
tests that currently use std::mem::discriminant to instead rely on the enum's
Debug output if desired.
- Around line 1486-1507: The final Err(...) after the for loop is unreachable
because the loop over attempt in 0..2 will always return on the second iteration
(handle_peer_pull_response only yields RetryAfterHandshake when attempt == 0);
replace the Err(PeerNetworkError::FailedToRequestData(...)) tail with
unreachable!() to document the invariant and cause a panic if future changes
violate it. Update the code in the function containing the for attempt in 0..2
loop that calls Self::handle_peer_pull_response and use unreachable!()
(optionally with a brief message) instead of returning the PeerNetworkError to
make the unreachable control-flow explicit.
- Around line 1571-1581: The outer warn! that logs all rejections is redundant
for the RejectionReason::HandshakeRequired branch because a more specific warn!
follows; update the handling in the match so that either (a) the outer log is
demoted to debug! before matching RejectionReason, or (b) skip logging in the
outer path when the reason is RejectionReason::HandshakeRequired and let the
inner warn! (inside the RejectionReason::HandshakeRequired arm) be the only log;
locate the outer warn! and the match on RejectionReason::HandshakeRequired in
gossip_client.rs and adjust the logging accordingly (references:
RejectionReason::HandshakeRequired, peer_list.initiate_handshake, the outer
warn! that logs "Peer {:?} rejected {:?} request: {:?}").
- Around line 3075-3137: The two tests calling pull_with_handshake_retry
(handshake_required_then_success_on_retry and
handshake_required_both_attempts_returns_err) trigger a real sleep on
HANDSHAKE_WAIT_TIMEOUT; update both test annotations to
#[tokio::test(start_paused = true)] so the tokio clock is paused and timers
auto-advance (removing the ~1s wait per test) while keeping the same async
behavior and assertions against pull_with_handshake_retry and its handshake
retry logic.
- Around line 1570-1606: The match currently consumes reason (RejectionReason)
inside the GossipResponse::Rejected arm which relies on RejectionReason being
Copy; change the match to match &reason { ... } and update each pattern to match
references (e.g., RejectionReason::HandshakeRequired(_) ->
RejectionReason::HandshakeRequired(inner) becomes
RejectionReason::HandshakeRequired(inner) with an ampersand) so the variable
reason remains usable for the later warn! and the
PeerPullOutcome::Err(PeerNetworkError::FailedToRequestData(...)) format call;
ensure the initiate_handshake call on peer_list.initiate_handshake(...) and
other arms (GossipDisabled, InvalidCredentials, ProtocolMismatch) use reference
patterns accordingly so no ownership is moved.
There was a problem hiding this comment.
Actionable comments posted: 1
♻️ Duplicate comments (1)
crates/p2p/src/gossip_client.rs (1)
2896-2899: 🧹 Nitpick | 🔵 TrivialNit:
PeerPullOutcomenow derivesDebug, sostd::mem::discriminantis no longer necessary.Since
#[derive(Debug)]was added toPeerPullOutcome<T>, you can print the variant directly:♻️ Proposed simplification
- other => panic!("expected Ok, got {:?}", std::mem::discriminant(&other)), + other => panic!("expected Ok, got {:?}", other),🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@crates/p2p/src/gossip_client.rs` around lines 2896 - 2899, The match's panic uses std::mem::discriminant unnecessarily; since PeerPullOutcome now implements Debug, change the panic arm to print the variant directly (e.g., panic!("expected Ok, got {:?}", other)) so the full enum variant/value is shown; update the match handling that binds outcome (PeerPullOutcome::Ok((id, _block)) => ...) to use the Debug-printable `other` in the panic.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@crates/p2p/src/gossip_client.rs`:
- Around line 1568-1604: The inner match arm for GossipResponse::Rejected
currently shadows the outer variable named reason when matching
RejectionReason::HandshakeRequired; rename the inner binding to handshake_reason
(or similar) in the RejectionReason::HandshakeRequired arm where you call
peer_list.initiate_handshake and log the handshake wait so the outer
RejectionReason remains unshadowed for the final PeerPullOutcome::Err call that
uses reason in the format! message; update any references inside that arm (e.g.,
the warn! call and any uses of the inner Option) to use handshake_reason
instead.
---
Duplicate comments:
In `@crates/p2p/src/gossip_client.rs`:
- Around line 2896-2899: The match's panic uses std::mem::discriminant
unnecessarily; since PeerPullOutcome now implements Debug, change the panic arm
to print the variant directly (e.g., panic!("expected Ok, got {:?}", other)) so
the full enum variant/value is shown; update the match handling that binds
outcome (PeerPullOutcome::Ok((id, _block)) => ...) to use the Debug-printable
`other` in the panic.
Describe the changes
A clear and concise description of what the bug fix, feature, or improvement is.
Related Issue(s)
Please link to the issue(s) that will be closed with this PR.
Checklist
Additional Context
Add any other context about the pull request here.
Summary by CodeRabbit