Skip to content

feat(net): optimize disconnectRandom by tracking block receive time#6704

Open
xxo1shine wants to merge 1 commit intotronprotocol:developfrom
xxo1shine:feature/random-disconnect-optimization
Open

feat(net): optimize disconnectRandom by tracking block receive time#6704
xxo1shine wants to merge 1 commit intotronprotocol:developfrom
xxo1shine:feature/random-disconnect-optimization

Conversation

@xxo1shine
Copy link
Copy Markdown
Collaborator

What does this PR do?

Optimize disconnectRandom by tracking block receive time peer

  • Add blockRcvTime/blockRcvTimeCmp fields to PeerConnection to track when a peer last delivered a valid block
  • Set blockRcvTime in BlockMsgHandler after each block is received
  • Fix lastInteractiveTime update in InventoryMsgHandler: only update for block inventories above current head block num, preventing attackers from forging activity via stale block hashes
  • Add getRandomDisconnectionPeers() to ResilienceService: narrows the disconnect candidate pool to the oldest half by blockRcvTime, so peers that recently delivered blocks are protected from random eviction

Fixes #6504

Why are these changes required?

This PR has been tested by:

  • Unit Tests
  • Manual Testing

Follow up

Extra details

@github-actions github-actions Bot requested a review from 317787106 April 24, 2026 03:59
@xxo1shine xxo1shine changed the title feat(net): optimize disconnectRandom by tracking block receive time per peer feat(net): optimize disconnectRandom by tracking block receive time Apr 24, 2026
@xxo1shine xxo1shine requested a review from lvs0075 April 24, 2026 04:02
@halibobo1205 halibobo1205 added the topic:net p2p net work, synchronization label Apr 24, 2026
@halibobo1205 halibobo1205 added this to the GreatVoyage-v4.8.2 milestone Apr 24, 2026

@Setter
@Getter
private volatile long blockRcvTimeCmp;
Copy link
Copy Markdown

@warku123 warku123 Apr 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[QUESTION]Why does blockRcvTimeCmp live in PeerConnection?

Its only use is inside ResilienceService.getRandomDisconnectionPeers() to snapshot blockRcvTime before sorting, so the comparator reads a stable value even if another thread updates blockRcvTime mid-sort. That's a valid concern, but it's an implementation detail of one method in one service — it doesn't describe any property of the peer itself.

A local snapshot map achieves the same safety without polluting the peer domain object:

private List<PeerConnection> getRandomDisconnectionPeers(List<PeerConnection> peers) {
    Map<PeerConnection, Long> snapshot = new IdentityHashMap<>();
    peers.forEach(p -> snapshot.put(p, p.getBlockRcvTime()));
    List<PeerConnection> sorted = new ArrayList<>(peers);
    sorted.sort(Comparator.comparingLong(snapshot::get));
    return sorted.subList(0, sorted.size() / 2);
}

Is there a reason this needs to be a field rather than a method-local variable?

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason blockRcvTimeCmp was introduced as a field was to solve a concrete problem: blockRcvTime is updated concurrently by network threads during the sort, which causes the Comparator to observe inconsistent values across comparisons and triggers IllegalArgumentException: Comparison method violates its general contract — failing the sort. Making it a field on the peer was the way I picked at the time to give the comparator a stable snapshot value to read.

That said, your direction is right — a stable snapshot is purely an implementation detail of getRandomDisconnectionPeers and shouldn't leak onto PeerConnection. Switching to a method-local IdentityHashMap<PeerConnection, Long> populated once before the sort and read-only during the sort gives the same stability guarantee without polluting the domain model.

Copy link
Copy Markdown
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've optimized it according to your suggestions.

@lvs0075 lvs0075 requested a review from halibobo1205 April 27, 2026 04:58
…er peer

- Add blockRcvTime/blockRcvTimeCmp fields to PeerConnection to track when
  a peer last delivered a valid block
- Set blockRcvTime in BlockMsgHandler after each block is received
- Fix lastInteractiveTime update in InventoryMsgHandler: only update for
  block inventories above current head block num, preventing attackers from
  forging activity via stale block hashes
- Add getRandomDisconnectionPeers() to ResilienceService: narrows the
  disconnect candidate pool to the oldest half by blockRcvTime, so peers
  that recently delivered blocks are protected from random eviction

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@xxo1shine xxo1shine force-pushed the feature/random-disconnect-optimization branch from cdf1047 to a912a49 Compare April 27, 2026 06:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

topic:net p2p net work, synchronization

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Node fails to synchronize blocks after a peer is randomly disconnected

3 participants