-
Notifications
You must be signed in to change notification settings - Fork 966
Fix a bug where an HTTP/1 request could fail with ClosedSessionException when sending requests concurrently.
#6229
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
…eption` when send concurrently. Motivation: `ClosedSessionException` occurred intermittently when sending multiple HTTP/1 requests concurrently. TCP dump showed that the server sent a response, but Nett's `LoggingHandler` did not log a READ event. This may indicate the Channel's internal state might be broken. The root cause appears to be related to the connection acquisition logic. Because when `H1C` was used explicitly, the `ClosedSessionException` wasn't raised. While acquiring a connection, if a pending connection is in progress does not support multiplexing, a new connection is created. Creating a new connection occurs while the acquired connection is still being initialized. https://github.com/netty/netty/blob/4.2/transport-classes-kqueue/src/main/java/io/netty/channel/kqueue/AbstractKQueueChannel.java#L673-L684 ``` at com.linecorp.armeria.client.HttpChannelPool$ChannelAcquisitionFuture.handlePiggyback(HttpChannelPool.java:852) at com.linecorp.armeria.client.HttpChannelPool$ChannelAcquisitionFuture.lambda$piggyback$0(HttpChannelPool.java:770) at com.linecorp.armeria.client.HttpChannelPool$ChannelAcquisitionFuture.handlePendingPiggybacks(HttpChannelPool.java:906) at com.linecorp.armeria.client.HttpChannelPool$ChannelAcquisitionFuture.complete(HttpChannelPool.java:870) at com.linecorp.armeria.client.HttpChannelPool.notifyConnect(HttpChannelPool.java:514) at com.linecorp.armeria.client.HttpChannelPool.lambda$connect$2(HttpChannelPool.java:374) at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590) at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:557) at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492) at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636) at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:625) at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:105) at com.linecorp.armeria.client.HttpSessionHandler.tryCompleteSessionPromise(HttpSessionHandler.java:517) at com.linecorp.armeria.client.HttpSessionHandler.userEventTriggered(HttpSessionHandler.java:447) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:398) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:376) at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:368) at com.linecorp.armeria.client.Http1ResponseDecoder.userEventTriggered(Http1ResponseDecoder.java:328) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:400) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:376) at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:368) at io.netty.channel.CombinedChannelDuplexHandler$DelegatingChannelHandlerContext.fireUserEventTriggered(CombinedChannelDuplexHandler.java:430) at io.netty.channel.ChannelInboundHandlerAdapter.userEventTriggered(ChannelInboundHandlerAdapter.java:117) at io.netty.handler.codec.ByteToMessageDecoder.userEventTriggered(ByteToMessageDecoder.java:388) at io.netty.handler.codec.http.HttpObjectDecoder.userEventTriggered(HttpObjectDecoder.java:610) at io.netty.channel.CombinedChannelDuplexHandler.userEventTriggered(CombinedChannelDuplexHandler.java:241) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:398) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:376) at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:368) at io.netty.handler.logging.LoggingHandler.userEventTriggered(LoggingHandler.java:222) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:398) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:376) at io.netty.channel.AbstractChannelHandlerContext.fireUserEventTriggered(AbstractChannelHandlerContext.java:368) at io.netty.channel.DefaultChannelPipeline$HeadContext.userEventTriggered(DefaultChannelPipeline.java:1375) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:396) at io.netty.channel.AbstractChannelHandlerContext.invokeUserEventTriggered(AbstractChannelHandlerContext.java:376) at io.netty.channel.DefaultChannelPipeline.fireUserEventTriggered(DefaultChannelPipeline.java:862) at com.linecorp.armeria.client.HttpClientPipelineConfigurator.finishSuccessfully(HttpClientPipelineConfigurator.java:435) at com.linecorp.armeria.client.HttpClientPipelineConfigurator.lambda$configureAsHttp$1(HttpClientPipelineConfigurator.java:414) at io.netty.util.concurrent.DefaultPromise.notifyListener0(DefaultPromise.java:590) at io.netty.util.concurrent.DefaultPromise.notifyListeners0(DefaultPromise.java:583) at io.netty.util.concurrent.DefaultPromise.notifyListenersNow(DefaultPromise.java:559) at io.netty.util.concurrent.DefaultPromise.notifyListeners(DefaultPromise.java:492) at io.netty.util.concurrent.DefaultPromise.setValue0(DefaultPromise.java:636) at io.netty.util.concurrent.DefaultPromise.setSuccess0(DefaultPromise.java:625) at io.netty.util.concurrent.DefaultPromise.trySuccess(DefaultPromise.java:105) at io.netty.channel.DefaultChannelPromise.trySuccess(DefaultChannelPromise.java:84) at io.netty.channel.kqueue.AbstractKQueueChannel$AbstractKQueueUnsafe.fulfillConnectPromise(AbstractKQueueChannel.java:616) at io.netty.channel.kqueue.AbstractKQueueChannel$AbstractKQueueUnsafe.finishConnect(AbstractKQueueChannel.java:654) at io.netty.channel.kqueue.AbstractKQueueChannel$AbstractKQueueUnsafe.writeReady(AbstractKQueueChannel.java:438) at io.netty.channel.kqueue.KQueueEventLoop.processReady(KQueueEventLoop.java:215) at io.netty.channel.kqueue.KQueueEventLoop.run(KQueueEventLoop.java:296) at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:998) at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) at java.base/java.lang.Thread.run(Thread.java:1589) ``` It was difficult to figure out the exact reason of the root cause in Netty. However, given the investigation, Netty's internal Channel status seems broken if a connection is created recursively while initializing a new connection. Related issue: #6625 Modifications: - Reschedule the connection attempt to avoid creating a new connection in the middle of initializing the acquired channel. Result: You no longer see an unexpected `ClosedSessionException` when send HTTP/1 concurrently.
ClosedSessionException when send concurrently.ClosedSessionException when sending concurrently.
ClosedSessionException when sending concurrently.ClosedSessionException when sending requests concurrently.
🔍 Build Scan® (commit: 7a5c7d5) |
| connect(desiredProtocol, serializationFormat, key, childPromise, timingsBuilder); | ||
| // Reschedule the connection attempt to avoid creating a new connection in the middle of | ||
| // initializing the acquired channel. | ||
| eventLoop.execute(() -> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes me wonder is there a way to check in which state the channel is? We reproduced this issue in a production and forked this fix and it helped but not fully,
// In HttpChannelPool.java we added
public static long NEW_CONNECT_DELAY_MILLIS = 0L;
// And on this line we added this
if (NEW_CONNECT_DELAY_MILLIS <= 0) {
eventLoop.execute(() -> connect(desiredProtocol, serializationFormat, key, childPromise, timingsBuilder));
} else {
eventLoop.schedule(() -> connect(desiredProtocol, serializationFormat, key, childPromise, timingsBuilder), NEW_CONNECT_DELAY_MILLIS, TimeUnit.MILLISECONDS);
}The bigger the delay was the less this reproduced, so it makes me believe that correct way would be to reschedule but check in callback if the state is suitable and if not reschedule again. The best way would have been to react to some event that channel is initialized and we can proceed with connect.
Motivation:
ClosedSessionExceptionoccurred intermittently when sending multiple HTTP/1 requests concurrently. TCP dump showed that the server sent a response, but Nett'sLoggingHandlerdid not log a READ event. This may indicate that the Channel's internal state might be broken.The root cause is related to the connection acquisition logic. When
H1Cwas used explicitly, theClosedSessionExceptionwasn't raised. While acquiring a connection, if a pending connection in progress does not support multiplexing, a new connection is created. Creating a new connection occurs while the acquired connection is still being initialized.https://github.com/netty/netty/blob/4.2/transport-classes-kqueue/src/main/java/io/netty/channel/kqueue/AbstractKQueueChannel.java#L673-L684
It was difficult to pinpoint the exact root cause within Netty. However, given the investigation, it appears that Netty’s internal Channel state becomes corrupted when a new connection is created recursively during the initialization of another connection.
Modifications:
Result:
ClosedSessionExceptionwhen sending HTTP/1 concurrently.