fix: resolve payment on RemoveTlc(Fulfill) for force-closed channels (#1222)#1254
fix: resolve payment on RemoveTlc(Fulfill) for force-closed channels (#1222)#1254doitian wants to merge 15 commits into
Conversation
|
The current test has passed, but whether the results of list_channel and get_invoice need to be synchronizedly modified ? lnd0->lnd1->fiber1->fiber2
|
4fc682d to
85ec855
Compare
| ) | ||
| }) | ||
| { | ||
| self.store |
There was a problem hiding this comment.
TODO: check the invoice is fully paid.
There was a problem hiding this comment.
Pull request overview
Fixes a long-standing edge case in Fiber’s payment lifecycle where a peer’s RemoveTlc(Fulfill) can arrive after a channel has been force-closed and the channel actor is already gone, leaving the sender (and downstream CCH tracking) stuck in Inflight. The PR adds network-level handling to recover fulfillment from persisted channel state, plus tests to validate one-hop and two-hop on-chain settlement flows and a new watchtower Bruno e2e scenario.
Changes:
- Intercept
RemoveTlc(Fulfill)for force-closed channels inNetworkActor::handle_peer_message, recover TLC info from persisted channel state, and propagate fulfillment upstream / to the payment actor. - Ensure invoices are marked
Paidwhen a fulfilled received-TLC is locally removed but the commitment round-trip never completes (common around force-closes). - Add unit tests and register a new Bruno e2e watchtower scenario in CI.
Reviewed changes
Copilot reviewed 28 out of 28 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
crates/fiber-lib/src/fiber/network.rs |
Adds fallback handling for fulfill removes on closed channels and additional invoice-status reconciliation in periodic channel checks. |
crates/fiber-lib/src/store/store_impl/mod.rs |
Emits StoreChange::PutPreimage when inserting watchtower preimages so downstream watchers (e.g. CCH) can react. |
crates/fiber-lib/src/fiber/tests/payment.rs |
Adds one-hop and two-hop unit tests covering on-chain settlement after force-close. |
.github/workflows/e2e.yml |
Registers the new Bruno e2e scenario in CI. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/01-node1-connect-node2.bru |
New e2e step: connect Node1↔Node2. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/02-node2-connect-node3.bru |
New e2e step: connect Node2↔Node3. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/03-node1-node2-open-channel.bru |
New e2e step: open Node1→Node2 channel. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/04-node2-get-auto-accepted-channel.bru |
New e2e step: discover N1–N2 channel id. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/05-ckb-generate-blocks.bru |
New e2e step: mine epochs for N1–N2 channel confirmation. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/06-node2-node3-open-channel.bru |
New e2e step: open Node2→Node3 channel. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/07-node3-get-auto-accepted-channel.bru |
New e2e step: discover N2–N3 channel id. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/08-ckb-generate-blocks.bru |
New e2e step: mine epochs for N2–N3 channel confirmation. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/09-get-node1-funding-script.bru |
New e2e step: read Node1 funding lock script. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/10-get-node3-funding-script.bru |
New e2e step: read Node3 funding lock script. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/11-get-node1-balance.bru |
New e2e step: snapshot Node1 chain balance. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/12-get-node3-balance.bru |
New e2e step: snapshot Node3 chain balance. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/13-node3-gen-invoice.bru |
New e2e step: create hold invoice / preimage pair for the scenario. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/14-node1-send-payment-with-invoice.bru |
New e2e step: Node1 initiates payment via invoice. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/15-node2-force-close-channel.bru |
New e2e step: Node2 force-closes N2–N3 to force on-chain resolution path. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/16-node2-disconnect-node3.bru |
New e2e step: disconnect Node2 from Node3. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/17-ckb-generate-blocks-for-force-close-tx.bru |
New e2e step: mine epochs for force-close tx confirmation. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/18-node3-remove-tlc.bru |
New e2e step: Node3 removes TLC with preimage (fulfill). |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/19-ckb-generate-blocks-for-settlement-tx-preimage.bru |
New e2e step: mine epochs for settlement tx/preimage discovery. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/20-ckb-generate-blocks-for-final-settlement-tx.bru |
New e2e step: mine epochs to reach final settlement. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/21-ckb-generate-blocks-for-final-settlement-tx-commit.bru |
New e2e step: mine epochs to commit final settlement. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/22-check-channel1-balance.bru |
New e2e assertion: N1–N2 channel balance reflects claimed amount. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/23-check-payment-status.bru |
New e2e assertion: payment status becomes Success. |
tests/bruno/e2e/watchtower/force-close-preimage-settled-by-recipient/24-disconnect.bru |
New e2e teardown: disconnect peers. |
85ec855 to
0cb440d
Compare
2f3c6b7 to
87fbe73
Compare
When a channel is force-closed the channel actor stops, so any RemoveTlc(Fulfill) arriving from the peer is silently dropped. This leaves the sender's payment session stuck in Inflight forever. Handle this in handle_peer_message: when a RemoveTlc(Fulfill) arrives for a channel whose actor is gone, look up the TLC from persisted state. If the TLC has a forwarding_tlc (intermediate node), relay the fulfillment upstream. Otherwise (payment sender), emit TlcRemoveReceived so the payment actor can finalise. Also emit StoreChange::PutPreimage from insert_watch_preimage so watchtower preimage discoveries are observable by store watchers. Made-with: Cursor
Add two tests that verify payment session resolution after a force close when the payee settles the TLC by revealing the preimage: - test_one_hop_payment_payee_settles_onchain: A force-closes, B settles invoice, A's payment should reach Success. - test_two_hop_payment_payee_settles_onchain: B force-closes B→C, C settles invoice, the fulfill relays through B back to A. Made-with: Cursor
Bruno e2e scenario: A→B→C payment where B force-closes the B→C channel and C settles the TLC on-chain with the preimage. Verifies that A's payment status reaches Success and B's channel balance reflects the settled amount. Made-with: Cursor
…lement After a force-closed channel's TLC is settled on-chain: - Payee: CheckChannels now scans received TLCs with LocalRemoved + Fulfill reason on both ChannelReady and Closed(UNCOOPERATIVE) channels and updates the invoice status to Paid. - Payer: the RemoveTlc(Fulfill) handler for force-closed channels now calls set_offered_tlc_removed() to transition the TLC to RemoteRemoved and persists the updated channel state. - Tests: added invoice Paid and TLC RemoteRemoved assertions to both one-hop and two-hop on-chain settlement tests.
Gate the persisted-state fallback path to channels that are actually closed and whose persisted remote pubkey matches the incoming peer, so a reconnect race or unrelated peer cannot drive fulfillment handling for a channel that does not belong to it. Also re-validate the preimage against the TLC payment hash and verify the offered TLC is in Committed status before calling `set_offered_tlc_removed`, which asserts on the status and would otherwise panic the network actor on duplicate or out-of-order removes.
Replace fixed 3s sleeps with `wait_until_success` so the assertions wait as long as needed up to the helper's timeout, reducing CI flakiness under load and producing clearer diagnostics when settlement takes longer than expected.
The payer reaching PaymentStatus::Success and the payee marking its invoice as Paid (and the offered TLC transitioning to RemoteRemoved) are independent events, so asserting them synchronously after wait_until_success races with their propagation and produces flaky CI failures. Wrap the post-settlement invoice and TLC checks in wait_until_async_timeout so they converge within the helper's bound, matching the polling pattern already used elsewhere in the test.
Force-closed channels can persist later offered TLCs before they reach Committed, so accepting only Committed caused valid on-chain fulfillments to be dropped. Allow the fallback path to mark those TLCs as fulfilled after preimage validation and cover the stuck-payment case with a regression test.
Closed channels can learn multiple preimages from on-chain settlement without receiving peer RemoveTlc messages. Reconcile every matching offered TLC so each affected payment attempt can reach success.
Settling each preimage in a separate scan delays multi-TLC force-close recovery. Build a single settlement transaction for all currently unlockable preimage TLCs while preserving conservative timeout handling.
Multiple held payments on the same force-closed channel should all reach success when their preimages are discovered on-chain. Add regression coverage for the closed-channel reconciliation path.
429002c to
ff39903
Compare
Flatten the deeply nested fallback path in handle_peer_message by moving the TLC lookup and validation into load_closed_channel_offered_tlc_for_fulfill and the surrounding orchestration into try_handle_closed_channel_remove_tlc_fulfill, so the call site becomes a single flat branch.
|
|
||
| for (_pubkey, channel_id, channel_state) in self.store.get_channel_states(None) { | ||
| if matches!( | ||
| if matches!(channel_state, ChannelState::ChannelReady) { |
There was a problem hiding this comment.
I'm not sure is it too heavy to run these in CheckChannels. Also the code contains many logic similar to existing code blocks with nuance.
| state.store.insert_channel_actor_state(actor_state); | ||
| if let Some(actor) = state.channels.get(&channel_id) { | ||
| if let Err(err) = actor.send_message(ChannelActorMessage::Command( | ||
| ChannelCommand::ReloadState(ReloadParams { |
There was a problem hiding this comment.
This is a test/bench command before and is promoted to release environment here to synchronize the state in case the actor is still alive.
Split the large CheckChannels match arm into per-channel-state helpers and remove duplicated boilerplate for sending RemoveTlc / Shutdown commands. - Add check_ready_channel_tlcs and check_force_closed_channel_tlcs as the per-channel dispatchers. - Extract fulfill_committed_received_tlcs_with_preimages, fail_expiring_received_tlcs, and force_close_if_offered_tlcs_expired for the three logical passes on ChannelReady channels. - Add force_shutdown_channel sibling to send_remove_tlc_to_channel and reuse the existing helper for all RemoveTlc call sites. - Factor the repeated ExpiryTooSoon RemoveTlcCommand construction into expiry_too_soon_remove_command. No behavioral change.
|
Turn to draft since there are complex edge cases not fixed yet. See test cases post by @gpBlockchain above. |
Closes #1222
Summary
RemoveTlc(Fulfill)from the peer is silently dropped, leaving the sender's payment session stuck inInflightforever. Fixhandle_peer_messageto intercept these messages: look up the TLC from persisted channel state, relay the fulfillment upstream for intermediate nodes, or emitTlcRemoveReceivedfor the payment sender.force-close-preimage-settled-by-recipientBruno e2e test and register it in CI.Test plan
cargo nextest run -p fnn -p fiber-bin— 767 tests pass, 0 failurestest_one_hop_payment_payee_settles_onchainpassestest_two_hop_payment_payee_settles_onchainpasseswatchtower/force-close-preimage-settled-by-recipientpassesPaidRemoteRemoved