Conversation

@elnosh (Contributor) commented Oct 3, 2025

Resolves #4048

    Previously `should_broadcast_holder_commitment_txn` would FC a
    channel if an outbound HTLC that hadn't been resolved was
    `LATENCY_GRACE_PERIOD_BLOCKS` past expiry. For outgoing payments,
    we can avoid FCing the channel since we are not in a race to claim
    an inbound HTLC. For nodes that have been offline for a while, this
    helps fail the HTLC on reconnection instead of causing an FC.
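
For context, the outgoing-vs-forwarded distinction hinges on the HTLC's `HTLCSource`. Below is a minimal sketch of the predicate; `is_forwarded` is a hypothetical helper name, and the real check lives inside a macro in `channelmonitor.rs` (shown in the review thread further down) that matches on the source directly:

```rust
// Hypothetical helper illustrating the distinction this PR relies on: an
// outbound HTLC whose source is `OutboundRoute` is our own payment, so we can
// wait for the peer to fail it back off-chain; one whose source is
// `PreviousHopData` is a forward, where we must go on-chain in time to claim
// the corresponding inbound HTLC.
fn is_forwarded(source: &HTLCSource) -> bool {
	match source {
		HTLCSource::OutboundRoute { .. } => false,
		HTLCSource::PreviousHopData(_) => true,
	}
}
```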

ldk-reviews-bot commented Oct 3, 2025

I've assigned @tankyleo as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.

```rust
	true
};
if (htlc_outbound && has_incoming && htlc.0.cltv_expiry + LATENCY_GRACE_PERIOD_BLOCKS <= height) ||
   (htlc_outbound && !has_incoming && htlc.0.cltv_expiry + 2016 <= height) ||
```
A reviewer (Contributor) commented:

I think per the issue, we shouldn't FC at all for these timed-out outbound payments. I can't really see a downside to doing that, since the user can FC manually or contact their counterparty out-of-band, though I could be missing something. It should be really rare/weird for the counterparty to not just fail the HTLC back on reconnect anyway.

Asked @wpaulino offline and he seemed to confirm this approach
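
For reference, the manual escape hatch here is `ChannelManager::force_close_broadcasting_latest_txn`; a rough sketch follows (the exact signature has varied across LDK releases, and `channel_id` / `counterparty_node_id` are placeholders):

```rust
// Sketch: manually force-close a channel whose peer never failed back our
// timed-out outbound payment. Recent LDK versions also take an error message
// to send to the peer.
let err_msg = "outbound HTLC expired and peer never failed it back".to_string();
channel_manager
	.force_close_broadcasting_latest_txn(&channel_id, &counterparty_node_id, err_msg)
	.expect("channel should exist");
```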

@elnosh (author) replied:

Ok, will remove the arbitrary 2016 I added. Yeah, I thought about not FCing at all but wasn't sure, in case something weird happens and the counterparty never fails it back and we end up with the HTLC stuck there. But yeah, the user can FC manually.

codecov bot commented Oct 12, 2025

Codecov Report

❌ Patch coverage is 98.31933% with 4 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.80%. Comparing base (81495c6) to head (efd46d1).
⚠️ Report is 43 commits behind head on main.

Files with missing lines                   Patch %   Lines
lightning/src/ln/async_payments_tests.rs   95.00%    1 Missing and 2 partials ⚠️
lightning/src/chain/channelmonitor.rs      91.66%    1 Missing ⚠️
Additional details and impacted files
```
@@            Coverage Diff             @@
##             main    #4140      +/-   ##
==========================================
+ Coverage   88.64%   88.80%   +0.15%
==========================================
  Files         180      180
  Lines      135230   136721    +1491
  Branches   135230   136721    +1491
==========================================
+ Hits       119874   121414    +1540
+ Misses      12591    12504      -87
- Partials     2765     2803      +38
```

Flag      Coverage Δ
fuzzing   21.02% <100.00%> (-0.74%) ⬇️
tests     88.64% <98.31%> (+0.16%) ⬆️


@elnosh (author) commented Oct 12, 2025

I think this is ready now. I removed the arbitrary 2016-block grace period I had for outbound payments; instead, we don't FC at all.

@valentinewallace Added a test for the async payment version.

Addressed all the tests that were failing because they expected an FC on an outbound HTLC. This was a bit tricky for tests that depended on the FC after LATENCY_GRACE_PERIOD_BLOCKS. For some tests (like async_signer_tests) I just triggered the broadcast manually, since the test isn't testing this specifically. The other tests that actually test the FC on HTLC-Timeout mostly had HTLCs going from A -> B and expected a timeout on the outbound HTLC from A. That won't cause an FC anymore, so I changed them to test the FC on timeout when the node is forwarding an HTLC, since that is the behavior that is kept. In those cases, I added an extra node and channel to route a payment and expect the FC on the node that is forwarding the HTLC (roughly the shape sketched below).
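
A rough sketch of the reshaped tests, using the usual `functional_test_utils` helpers (exact helper signatures, constants, and block counts are approximate):

```rust
// Sketch: route A -> B -> C so B, the forwarding node, holds the outbound
// HTLC. After the grace period, B must still FC (it has an inbound HTLC to
// claim), while A, the payment origin, must not.
let chanmon_cfgs = create_chanmon_cfgs(3);
let node_cfgs = create_node_cfgs(3, &chanmon_cfgs);
let node_chanmgrs = create_node_chanmgrs(3, &node_cfgs, &[None, None, None]);
let nodes = create_network(3, &node_cfgs, &node_chanmgrs);
create_announced_chan_between_nodes(&nodes, 0, 1);
create_announced_chan_between_nodes(&nodes, 1, 2);

// C withholds resolution of the HTLC forwarded by B.
route_payment(&nodes[0], &[&nodes[1], &nodes[2]], 100_000);

// Connect enough blocks to pass expiry plus the grace period: the forwarding
// node B force-closes; the origin A would not.
connect_blocks(&nodes[1], TEST_FINAL_CLTV + LATENCY_GRACE_PERIOD_BLOCKS + 1);
check_closed_broadcast(&nodes[1], 1, true);
check_added_monitors(&nodes[1], 1);
```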

@elnosh elnosh marked this pull request as ready for review October 12, 2025 19:05
@elnosh elnosh changed the title Delay FC for outgoing payments Do not FC for outgoing payments Oct 12, 2025
@valentinewallace (Contributor) left a comment:

Basically LGTM

```rust
// can still claim the corresponding HTLC. Thus, to avoid needlessly hitting the
// chain when our counterparty is waiting for expiration to off-chain fail an HTLC
// we give ourselves a few blocks of headroom after expiration before going
// on-chain for an expired HTLC.
```
valentinewallace:

I think we should update this comment for the new behavior

```rust
chain_mon
	.chain_monitor
	.block_connected(&create_dummy_block(BlockHash::all_zeros(), 42, Vec::new()), 200);
chain_mon.chain_monitor.get_monitor(chan.2).unwrap().broadcast_latest_holder_commitment_txn(
```
valentinewallace:

nit: add a comment

```diff
 };
 let block = create_dummy_block(BlockHash::all_zeros(), 42, Vec::new());
-watchtower_bob.chain_monitor.block_connected(&block, HTLC_TIMEOUT_BROADCAST - 1);
+watchtower_bob.chain_monitor.block_connected(&block, htlc_timeout - 1);
```
valentinewallace:

I think we're connecting an absolute expiry amount of blocks here when we should be connecting a relative number of blocks?
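
To spell out the concern: `Listen::block_connected` takes an absolute best-block height as its second argument, so passing an expiry-derived value connects a single block claimed to be at that absolute height rather than advancing the chain by that many blocks. A sketch of the difference (`cur_height` and `prev_hash` are placeholders):

```rust
// What the diff does: connects ONE block, claiming its height is the
// absolute value `htlc_timeout - 1`.
watchtower_bob.chain_monitor.block_connected(&block, htlc_timeout - 1);

// A relative advance would instead connect one block per height from the
// current tip up to the target:
for height in (cur_height + 1)..htlc_timeout {
	let block = create_dummy_block(prev_hash, 42, Vec::new());
	watchtower_bob.chain_monitor.block_connected(&block, height);
	prev_hash = block.block_hash();
}
```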

```rust
let reason = ClosureReason::HTLCsTimedOut { payment_hash: Some(payment_hash) };
check_closed_broadcast(&nodes[0], 1, true);
check_added_monitors(&nodes[0], 1);
let reason = ClosureReason::HTLCsTimedOut { payment_hash: Some(payment_hash) };
```
valentinewallace:

nit: unnecessary diff, and below?

@tankyleo (Contributor) left a comment:

Thank you. I've only reviewed channelmonitor.rs so far; I'll take a look at the tests next.

Comment on lines +5902 to +5918
```rust
let htlc_outbound = $holder_tx == htlc.0.offered;
let has_incoming = if htlc_outbound {
	if let Some(source) = htlc.1.as_deref() {
		match *source {
			HTLCSource::OutboundRoute { .. } => false,
			HTLCSource::PreviousHopData(_) => true,
		}
	} else {
		panic!("Every offered non-dust HTLC should have a corresponding source");
	}
} else {
	true
};
if (htlc_outbound && has_incoming && htlc.0.cltv_expiry + LATENCY_GRACE_PERIOD_BLOCKS <= height) ||
   (!htlc_outbound && htlc.0.cltv_expiry <= height + CLTV_CLAIM_BUFFER && self.payment_preimages.contains_key(&htlc.0.payment_hash)) {
	log_info!(logger, "Force-closing channel due to {} HTLC timeout - HTLC with payment hash {} expires at {}", if htlc_outbound { "outbound" } else { "inbound"}, htlc.0.payment_hash, htlc.0.cltv_expiry);
	return Some(htlc.0.payment_hash);
```
tankyleo:

I've got the diff below for this section; it seems to me it reads better? Feel free to edit.

```diff
diff --git a/lightning/src/chain/channelmonitor.rs b/lightning/src/chain/channelmonitor.rs
index 5b9d7435d1..f364b25f0c 100644
--- a/lightning/src/chain/channelmonitor.rs
+++ b/lightning/src/chain/channelmonitor.rs
@@ -5887,7 +5887,7 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitorImpl<Signer> {
 		let height = self.best_block.height;
 		macro_rules! scan_commitment {
 			($htlcs: expr, $holder_tx: expr) => {
-				for ref htlc in $htlcs {
+				for (ref htlc, ref source) in $htlcs {
 					// For inbound HTLCs which we know the preimage for, we have to ensure we hit the
 					// chain with enough room to claim the HTLC without our counterparty being able to
 					// time out the HTLC first.
@@ -5899,23 +5899,13 @@ impl<Signer: EcdsaChannelSigner> ChannelMonitorImpl<Signer> {
 					// chain when our counterparty is waiting for expiration to off-chain fail an HTLC
 					// we give ourselves a few blocks of headroom after expiration before going
 					// on-chain for an expired HTLC.
-					let htlc_outbound = $holder_tx == htlc.0.offered;
-					let has_incoming = if htlc_outbound {
-						if let Some(source) = htlc.1.as_deref() {
-							match *source {
-								HTLCSource::OutboundRoute { .. } => false,
-								HTLCSource::PreviousHopData(_) => true,
-							}
-						} else {
-							panic!("Every offered non-dust HTLC should have a corresponding source");
-						}
-					} else {
-						true
-					};
-					if (htlc_outbound && has_incoming && htlc.0.cltv_expiry + LATENCY_GRACE_PERIOD_BLOCKS <= height) ||
-					   (!htlc_outbound && htlc.0.cltv_expiry <= height + CLTV_CLAIM_BUFFER && self.payment_preimages.contains_key(&htlc.0.payment_hash)) {
-						log_info!(logger, "Force-closing channel due to {} HTLC timeout - HTLC with payment hash {} expires at {}", if htlc_outbound { "outbound" } else { "inbound"}, htlc.0.payment_hash, htlc.0.cltv_expiry);
-						return Some(htlc.0.payment_hash);
+					let htlc_outbound = $holder_tx == htlc.offered;
+					let is_outbound_and_has_incoming = htlc_outbound
+						&& matches!(source.as_deref().expect("Every outbound HTLC should have a corresponding source"), HTLCSource::PreviousHopData(_));
+					if (is_outbound_and_has_incoming && htlc.cltv_expiry + LATENCY_GRACE_PERIOD_BLOCKS <= height) ||
+					   (!htlc_outbound && htlc.cltv_expiry <= height + CLTV_CLAIM_BUFFER && self.payment_preimages.contains_key(&htlc.payment_hash)) {
+						log_info!(logger, "Force-closing channel due to {} HTLC timeout - HTLC with payment hash {} expires at {}", if htlc_outbound { "outbound" } else { "inbound"}, htlc.payment_hash, htlc.cltv_expiry);
+						return Some(htlc.payment_hash);
 					}
 				}
 			}
```

```rust
			HTLCSource::PreviousHopData(_) => true,
		}
	} else {
		panic!("Every offered non-dust HTLC should have a corresponding source");
```
tankyleo:

For this particular spot, dust offered HTLCs on holder_tx also have sources, as do dust and non-dust accepted HTLCs on !holder_tx, i.e. the counterparty's tx.

So this would include all these cases too: "Every outbound HTLC should have a corresponding source"

@TheBlueMatt (Collaborator) left a comment:

I don't think we can simply never FC a channel for holding outbound payments forever, though I admit I'm not entirely sure what the right answer is. For someone who isn't a mobile client, we still want to FC "on time" since they want their money back and their peer is simply misbehaving. For a mobile client, too, we do eventually want to FC if the peer is simply not failing the HTLC back even though we're connected (if we're not connected because the peer is gone, the user will presumably eventually FC manually anyway). Maybe there's a way to gate this on connectivity or uptime?
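
One purely illustrative shape for that gating; none of these fields exist in LDK today:

```rust
// Entirely hypothetical sketch of "gate on connectivity": skip the FC for an
// expired outbound payment only while the peer has NOT been seen since the
// HTLC expired; once they reconnect and still don't fail it back, FC as
// before. `last_peer_connected_height` is an invented field, not an LDK API.
let peer_seen_since_expiry =
	self.last_peer_connected_height.map_or(false, |h| h >= htlc.cltv_expiry);
let fc_outbound_payment = htlc_outbound && !has_incoming
	&& htlc.cltv_expiry + LATENCY_GRACE_PERIOD_BLOCKS <= height
	&& peer_seen_since_expiry;
```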


Development: merging this pull request may close #4048, "Don't FC during startup chain relay if we might be able to salvage the channel by connecting to a peer".