PIBD peers fix #3823
… stale segments, disconnecting only outbound; force requests for output and rangeproof segments to avoid getting stuck in this case
b41f0bd to 769da6d
wiesche89 left a comment
Thanks for working on this. I found three points that I think should be checked before merging.
    }

    /// Drop all tracked PIBD requests, returning how many entries were removed.
    pub fn clear_pibd_requests(&self) -> usize {
As written, this clears all pending PIBD requests, so the node loses the mapping of which segments it currently expects from which peers.
I have effectively fixed this in my branch: pending PIBD requests are no longer cleared globally. Instead, retryable segments are reconsidered via retryable_pibd_segments(), and when one is requested again, its existing request state is updated via refresh_pibd_segment().
So the question is: should we wait for my change here, or handle this separately in a new PR?
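To make the difference from a global clear concrete, here is a minimal sketch of the refresh approach. Only the names `retryable_pibd_segments` and `refresh_pibd_segment` come from the comment above; their signatures, the `PibdRequests` tracker, `SegmentKind`, and `PendingRequest` are assumptions for illustration and are not the code in the branch.

```rust
use std::collections::HashMap;
use std::time::{Duration, Instant};

/// Hypothetical segment kinds, used only to key the pending map.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub enum SegmentKind {
    Output,
    RangeProof,
    Kernel,
}

/// Hypothetical segment identifier: (kind, segment index).
pub type SegmentId = (SegmentKind, u64);

/// Hypothetical record of one in-flight segment request.
#[derive(Clone, Debug)]
pub struct PendingRequest {
    pub peer: String,
    pub sent_at: Instant,
    pub attempts: u32,
}

/// Hypothetical tracker of which segment is expected from which peer.
#[derive(Default)]
pub struct PibdRequests {
    pending: HashMap<SegmentId, PendingRequest>,
}

impl PibdRequests {
    /// Record a first request for `id` sent to `peer`.
    pub fn track(&mut self, id: SegmentId, peer: &str, now: Instant) {
        let req = PendingRequest { peer: peer.to_string(), sent_at: now, attempts: 1 };
        self.pending.insert(id, req);
    }

    /// Segments whose request has been pending for at least `timeout`,
    /// ordered by segment index.
    pub fn retryable_pibd_segments(&self, now: Instant, timeout: Duration) -> Vec<SegmentId> {
        let mut ids: Vec<SegmentId> = self
            .pending
            .iter()
            .filter(|(_, r)| now.saturating_duration_since(r.sent_at) >= timeout)
            .map(|(id, _)| *id)
            .collect();
        ids.sort_by_key(|id| id.1);
        ids
    }

    /// Re-issue a request in place: the entry is kept, only the peer,
    /// the timestamp, and the attempt counter change.
    pub fn refresh_pibd_segment(&mut self, id: SegmentId, peer: &str, now: Instant) -> bool {
        match self.pending.get_mut(&id) {
            Some(r) => {
                r.peer = peer.to_string();
                r.sent_at = now;
                r.attempts += 1;
                true
            }
            None => false,
        }
    }

    /// Number of attempts made so far for `id`, if it is tracked.
    pub fn attempts(&self, id: SegmentId) -> Option<u32> {
        self.pending.get(&id).map(|r| r.attempts)
    }

    /// Number of tracked requests.
    pub fn pending_count(&self) -> usize {
        self.pending.len()
    }
}
```

In this sketch a timed-out request never leaves the map, so a late response can still be matched to its segment, and the attempt counter survives across retries.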
Let's wait for your change here, I guess.
    let mut elems_added = 0;
    if let Some(mut next_output_idx) = self.next_required_output_segment_index() {
        while (next_output_idx as usize) < total_output_segments {
            if elems_added == max_elements / 3 {
I intentionally kept the max_cached_segments checks in place. This prevents creating additional normal requests for output, rangeproof, and kernel segments once the respective segment cache is full.
That avoids requesting too many extra segments under bad peer ordering or malicious peers, which would otherwise increase retry pressure.
Only the next required kernel segment is still explicitly forced, to avoid the known PIBD stall case.
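The interaction between the cache cap and a forced request can be sketched for a single segment type as follows. The function `next_segments_to_request` and all of its parameters are hypothetical and do not mirror the actual desegmenter code; `max_cached_segments` is borrowed from the comment above only as a parameter name.

```rust
/// Hypothetical selection of segment indices to request for one segment type.
///
/// The next required segment is always requested (unless it is already in
/// flight), even when the cache is full. Every further segment is a "normal"
/// request and is only added while the cache still has room.
pub fn next_segments_to_request(
    next_required: Option<u64>,
    total_segments: u64,
    cached: usize,
    max_cached_segments: usize,
    budget: usize,
    in_flight: &[u64],
) -> Vec<u64> {
    let mut picked = Vec::new();
    let first = match next_required {
        Some(idx) if idx < total_segments => idx,
        _ => return picked,
    };
    if budget == 0 {
        return picked;
    }

    // Forced request: bypasses the cache cap to avoid the stall case.
    if !in_flight.contains(&first) {
        picked.push(first);
    }

    // Normal requests: respect both the per-call budget and the cache cap.
    let mut idx = first + 1;
    while idx < total_segments
        && picked.len() < budget
        && cached + picked.len() < max_cached_segments
    {
        if !in_flight.contains(&idx) {
            picked.push(idx);
        }
        idx += 1;
    }
    picked
}
```

With a full cache, the sketch still returns the next required index and nothing else, which is the behaviour described above: no extra retry pressure, but no stall either.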
- force request of next required output/rangeproof/kernel segments
- add PIBD peer height slack filtering
- temporarily block peers after PIBD segment timeouts
- disconnect timed-out outbound PIBD peers
- use escalating temporary block durations
- keep existing retry/refresh logic instead of clearing pending requests
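One way to read "escalating temporary block durations" is a capped doubling schedule, sketched below. The commit message only says the durations escalate; the doubling rule, the function name `pibd_block_duration`, and the `base` and `cap` parameters are assumptions.

```rust
use std::time::Duration;

/// Hypothetical escalating block duration: `base` for the first timeout,
/// doubling with each further timeout, never exceeding `cap`.
pub fn pibd_block_duration(strikes: u32, base: Duration, cap: Duration) -> Duration {
    if strikes == 0 {
        return Duration::ZERO;
    }
    // Clamp the exponent so the shift below cannot overflow a u32.
    let shift = (strikes - 1).min(16);
    let factor = 1u32 << shift;
    // On arithmetic overflow fall back to the cap; otherwise clamp to it.
    base.checked_mul(factor).unwrap_or(cap).min(cap)
}
```

For example, with a 30 second base and a 10 minute cap, three consecutive timeouts would yield blocks of 30, 60, and 120 seconds under this assumed schedule.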
…locked peers to use fallback .zip download
Choose peers based on a minimal height (a minimum of 2 blocks behind the max tip).
Temporarily block peers for stale segments, disconnecting only outbound peers
(the block is relevant only for inbound peers; outbound peers reset the counter after reconnecting).
Force requests for output and rangeproof segments to avoid getting stuck.
Do not check for max cached segments when selecting the next desired segment to request (the stall happened when a peer got disconnected and we had reached the max segment cache size, so no further request was attempted).
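The peer height filter from the first item can be sketched as below, reading it as "accept peers whose height is no more than a fixed slack below the highest known tip". That reading, the `PeerInfo` struct, and the `eligible_pibd_peers` function are assumptions, not the code in this PR.

```rust
/// Hypothetical view of a connected peer.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct PeerInfo {
    pub addr: String,
    pub height: u64,
    /// True while the peer is temporarily blocked for PIBD.
    pub blocked: bool,
}

/// Return the peers eligible for PIBD segment requests: not blocked, and at
/// most `height_slack` blocks behind the highest tip among unblocked peers.
pub fn eligible_pibd_peers(peers: &[PeerInfo], height_slack: u64) -> Vec<&PeerInfo> {
    let max_tip = match peers.iter().filter(|p| !p.blocked).map(|p| p.height).max() {
        Some(h) => h,
        None => return Vec::new(),
    };
    let min_height = max_tip.saturating_sub(height_slack);
    peers
        .iter()
        .filter(|p| !p.blocked && p.height >= min_height)
        .collect()
}
```

If this returns an empty list, for instance because every candidate is temporarily blocked, the caller would need some fallback path; what that path looks like in the PR is not shown here.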