Skip to content

Conversation

@ziggie1984
Copy link
Collaborator

@ziggie1984 ziggie1984 commented Dec 21, 2025

Fixes a common seen flake see also here: https://github.com/lightningnetwork/lnd/actions/runs/20314774459/job/58355291404?pr=10049

Did catch our attention because the retribution test failed, however everything is fine with this logic. There core issue was that we would not wait for all HTLCs to properly lockin and then after capturing the state the channel would still try to progress the channel state to commit all inflight HTLCs.

@gemini-code-assist
Copy link

Summary of Changes

Hello @ziggie1984, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request refines the CompletePaymentRequestsNoWait test utility to prevent a race condition that caused test flakiness. The function's waiting mechanism has been updated to accurately reflect the completion of payment commitments, ensuring that tests capture a stable and fully updated channel state. This change specifically targets and resolves issues where backups would capture an incorrect channel height due to HTLCs still being in-flight.

Highlights

  • Flaky Test Fix: This pull request resolves a flaky test, revokedCloseRetributionRemoteHodlCase, by addressing a race condition in the CompletePaymentRequestsNoWait helper function.
  • Improved Payment Completion Logic: The CompletePaymentRequestsNoWait function now waits for all outgoing HTLCs to be fully locked in and appear in PendingHtlcs on the sender's side, rather than relying solely on the channel's NumUpdates count.
  • Enhanced Test Stability: By ensuring all payments have completed their commitment exchange before returning, the change prevents tests from capturing an inconsistent channel state, thus improving overall test reliability.

🧠 New Feature in Public Preview: You can now enable Memory to help Gemini Code Assist learn from your team's feedback. This makes future code reviews more consistent and personalized to your project's style. Click here to enable Memory in your admin console.

Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

@ziggie1984 ziggie1984 self-assigned this Dec 21, 2025
@ziggie1984 ziggie1984 added no-changelog flake fix itests Issues related to integration tests. labels Dec 21, 2025
@ziggie1984 ziggie1984 force-pushed the fix-flake-revoked_uncooperative_close_retribution branch from b409c7d to 9b62490 Compare December 21, 2025 19:34
@ziggie1984
Copy link
Collaborator Author

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request addresses a flaky test by changing the waiting logic in CompletePaymentRequestsNoWait to count pending HTLCs instead of relying on NumUpdates. The approach is a clear improvement. My review includes suggestions to make the waiting condition even more robust by checking if the HTLCs are fully 'locked in', which would also align the implementation more closely with the guarantee mentioned in the pull request description. This will further improve the reliability of tests that depend on this function.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request effectively addresses a race condition in CompletePaymentRequestsNoWait by shifting the waiting mechanism from monitoring NumUpdates to counting the number of newly added outgoing HTLCs. This change provides a more reliable way to ensure all payments are committed before proceeding, fixing the described test flakiness. The implementation is clear and correct. I have one minor suggestion to refactor a small piece of duplicated logic for better maintainability.

@ziggie1984 ziggie1984 force-pushed the fix-flake-revoked_uncooperative_close_retribution branch from 9b62490 to e849989 Compare December 21, 2025 19:41
@ziggie1984 ziggie1984 marked this pull request as ready for review December 21, 2025 19:43
Before this change, CompletePaymentRequestsNoWait would return as
soon as the channel's NumUpdates increased by at least one. When
sending multiple payments, this meant the function could return
while some HTLCs were still in-flight and not yet committed to the
channel state.

The problem occurred when tests captured the channel state
immediately after calling this function. Even though we read the
current NumUpdates from the channel, HTLCs could still be in the
process of being committed. This led to a race where the channel
would progress to a new state after we thought we had correctly
captured it, causing tests to see unexpected commitment heights.

Fix this by waiting for all outgoing HTLCs to appear in
PendingHtlcs before returning. We count outgoing HTLCs before
sending, then wait until exactly len(paymentRequests) new HTLCs
are present. This guarantees all payments have fully completed
their commitment exchange and are locked in on both sides.

Fixes the flaky revokedCloseRetributionRemoteHodlCase test where
backups would capture state at height N+1 instead of the expected
  height N.
@ziggie1984 ziggie1984 force-pushed the fix-flake-revoked_uncooperative_close_retribution branch from e849989 to ece8469 Compare December 21, 2025 20:52
Copy link
Member

@yyforyongyu yyforyongyu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🦺

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

flake fix itests Issues related to integration tests. no-changelog

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants