Tags: Netflix/zuul
Fix for connection leak issue with HTTP 1xx responses (#2137)

We were returning the origin connection to the pool after getting a 1xx HTTP response, which was incorrect: those responses indicate that a further response is coming. In some cases, the origin's eventual (real) response could go to the wrong client.

This also fixes an issue where the client -> proxy connection explicitly handled HTTP 100 (Continue), but not any other 1xx response. The others aren't common, but we should handle them anyway.

IMPORTANT NOTE: with this change, we're still not passing 1xx HTTP responses from origin back to clients; they're swallowed by Zuul. I originally started on a larger change that did pass them back, but it would require more changes and carry more risk. It messes with backpressure, and probably requires injecting the interim responses directly to clients, bypassing the outbound filter chain, which seems ... potentially problematic? Meanwhile, we have no actual use case for returning interim responses to the client right now (and we never handled them correctly before, so while it's technically a behavior change, it's not one anyone would be relying on).

Some Claude code in here, fwiw, but I've reviewed enough that I'm to blame for mistakes :)

---------

Co-authored-by: Matt Hoffman <matthoffman@netflix.com>
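To make the 1xx rule concrete, here is a minimal sketch of the decision described in this entry. It is not Zuul's code: `InterimResponseSketch`, `Action`, and both method names are hypothetical, and the real change lives in Zuul's Netty handlers rather than in a standalone helper.

```java
/** Hypothetical helper (not part of Zuul): classifies an origin response by status code. */
final class InterimResponseSketch {

    /** What the proxy does with a response it just read from the origin. */
    enum Action {
        /** 1xx: drop it and keep the origin connection checked out; the final response is still coming. */
        SWALLOW_AND_KEEP_WAITING,
        /** Anything else: this is the final response, so forward it; the connection may be pooled once it completes. */
        FORWARD_FINAL_RESPONSE
    }

    /** Informational (interim) responses occupy the 100-199 range. */
    static boolean isInterim(int statusCode) {
        return statusCode >= 100 && statusCode < 200;
    }

    static Action onOriginResponse(int statusCode) {
        // Treat every 1xx the same way, not only 100 (Continue).
        return isInterim(statusCode) ? Action.SWALLOW_AND_KEEP_WAITING : Action.FORWARD_FINAL_RESPONSE;
    }
}
```

For example, `onOriginResponse(103)` yields `SWALLOW_AND_KEEP_WAITING`, while `onOriginResponse(200)` yields `FORWARD_FINAL_RESPONSE`.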
Add `getServers` to `Resolver` and `ClientChannelManager` to expose the origin pool (#2143)

`Resolver` and `ClientChannelManager` only expose `resolve`, which load-balances and hands back a single server, so there's no way to read the full set of origins they currently know about without picking one. That's fine for routing, but not for callers that want to inspect per-server discovery metadata across the whole pool.

This PR adds `getServers()`, which returns a read-only snapshot of all known servers without picking one or acquiring a connection.
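The difference between picking one server and reading the whole pool can be sketched roughly as below. This is an illustration only: `SnapshotResolverSketch` is a hypothetical class, its no-argument `resolve()` and round-robin policy are assumptions, and the actual `Resolver` and `ClientChannelManager` signatures may differ.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical resolver (not Zuul's Resolver): one method picks a server, the other only observes. */
final class SnapshotResolverSketch<T> {
    private final List<T> servers = new CopyOnWriteArrayList<>();
    private final AtomicInteger next = new AtomicInteger();

    void add(T server) {
        servers.add(server);
    }

    /** Round-robin pick, standing in for whatever load balancing the real resolve performs. */
    T resolve() {
        List<T> current = List.copyOf(servers);
        if (current.isEmpty()) {
            throw new IllegalStateException("no servers known");
        }
        return current.get(Math.floorMod(next.getAndIncrement(), current.size()));
    }

    /** Immutable copy of the pool at call time; later pool changes do not affect it. */
    List<T> getServers() {
        return List.copyOf(servers);
    }
}
```

Returning a copy rather than a live view means a caller iterating over the result never observes the pool changing underneath it, at the cost of one allocation per call.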
Fix reentrant double responses on origin errors (#2139)

When an origin read times out while a response is still buffered inside a body-buffering response filter, `ClientResponseWriter.exceptionCaught` writes an error response and flushes it. That flush can complete synchronously and fire a `CompleteEvent`, which causes `ZuulFilterChainHandler.finishResponseFilters` to re-deliver the still-buffered response into `channelRead`, resulting in a _second_ `HttpResponse` on the same stream. `HttpContentEncoder` then throws `IllegalStateException: cannot send more responses than requests`.

The reentrant `channelRead` runs before `exceptionCaught` sets `startedSendingResponseToClient = true` (both run synchronously on the same event loop), so the existing guard in `channelRead` doesn't catch it. This PR sets the flag before the `writeAndFlush` so the reentrant delivery takes the dispose-and-close path instead of writing a second response.

```
java.lang.IllegalStateException: cannot send more responses than requests
    at io.netty.handler.codec.http.HttpContentEncoder.encode(HttpContentEncoder.java:130)  // (2) second response rejected
    ...
    at com.netflix.zuul.netty.server.ClientResponseWriter.channelRead(ClientResponseWriter.java:133)  // re-delivered buffered response -> 2nd write
    at com.netflix.zuul.netty.filter.ZuulFilterChainRunner.runFilters(ZuulFilterChainRunner.java:170)
    at com.netflix.zuul.netty.filter.ZuulFilterChainHandler.finishResponseFilters(ZuulFilterChainHandler.java:148)
    at com.netflix.zuul.netty.filter.ZuulFilterChainHandler.fireEndpointFinish(ZuulFilterChainHandler.java:132)
    at com.netflix.zuul.netty.filter.ZuulFilterChainHandler.userEventTriggered(ZuulFilterChainHandler.java:91)  // CompleteEvent fires back up the pipeline
    ...
    at com.netflix.netty.common.HttpLifecycleChannelHandler.fireCompleteEventIfNotAlready(HttpLifecycleChannelHandler.java:96)
    at com.netflix.netty.common.HttpServerLifecycleChannelHandler$HttpServerLifecycleOutboundChannelHandler.lambda$write$0(HttpServerLifecycleChannelHandler.java:88)
    ...
    at io.netty.channel.ChannelOutboundBuffer.remove(ChannelOutboundBuffer.java:302)  // flushed error response completes synchronously
    ...
    at io.netty.channel.AbstractChannelHandlerContext.writeAndFlush(AbstractChannelHandlerContext.java:776)
    at com.netflix.zuul.netty.server.ClientResponseWriter.exceptionCaught(ClientResponseWriter.java:307)  // (1) error response written + flushed
```
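The ordering problem can be reproduced in miniature without Netty. The toy model below is an assumption-laden sketch, not the real `ClientResponseWriter`: `ReentrantWriteSketch`, its string "responses", and the synchronous callback inside `writeAndFlush` are all hypothetical stand-ins for the pipeline behavior described above.

```java
import java.util.ArrayList;
import java.util.List;

/** Toy model (not Zuul code) of a write whose completion callback re-enters channelRead. */
final class ReentrantWriteSketch {
    final List<String> written = new ArrayList<>();
    private final boolean setFlagBeforeWrite;
    private boolean startedSendingResponse;
    private String bufferedResponse = "buffered";

    ReentrantWriteSketch(boolean setFlagBeforeWrite) {
        this.setFlagBeforeWrite = setFlagBeforeWrite;
    }

    void channelRead(String response) {
        if (startedSendingResponse) {
            return; // guard: a response is already going out, so dispose of this one
        }
        startedSendingResponse = true;
        writeAndFlush(response);
    }

    void exceptionCaught() {
        if (setFlagBeforeWrite) {
            startedSendingResponse = true; // fixed ordering: flag first, then write
        }
        writeAndFlush("error");
        startedSendingResponse = true; // buggy ordering relies on this line, which runs too late
    }

    private void writeAndFlush(String response) {
        written.add(response);
        // Simulate a flush that completes synchronously and re-delivers the buffered response.
        String pending = bufferedResponse;
        bufferedResponse = null;
        if (pending != null) {
            channelRead(pending);
        }
    }
}
```

With `new ReentrantWriteSketch(false)`, calling `exceptionCaught()` records two writes (the error and the re-delivered buffered response); with `new ReentrantWriteSketch(true)`, only the error response is written.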
Preserve failure nf status set by recordFinalError in buildZuulHttpResponse (#2131)

When origin.recordFinalError() sets a specific failure StatusCategory (e.g. from an origin subclass override), the subsequent unconditional setStatusCategory call was overwriting it with the generic status derived from the response code. This PR switches to storeStatusCategoryIfNotAlreadyFailure so that a failure status set by recordFinalError is not overwritten.

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
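The "store only if not already a failure" semantics can be sketched as follows. Everything here is hypothetical: `StatusCategorySketch`, its `Category` values, and the method bodies are illustrative stand-ins, and Zuul's real `StatusCategory` types and helper may behave differently in detail.

```java
/** Hypothetical holder (not Zuul's API) contrasting unconditional and conditional status writes. */
final class StatusCategorySketch {

    /** Illustrative categories; the boolean marks which ones count as failures. */
    enum Category {
        SUCCESS(false),
        FAILURE_GENERIC(true),
        FAILURE_SPECIFIC(true);

        final boolean failure;

        Category(boolean failure) {
            this.failure = failure;
        }
    }

    private Category current;

    /** Unconditional write: always replaces whatever was recorded before. */
    void set(Category category) {
        current = category;
    }

    /** Conditional write: keeps an existing failure, otherwise stores the new value. */
    void storeIfNotAlreadyFailure(Category category) {
        if (current == null || !current.failure) {
            current = category;
        }
    }

    Category get() {
        return current;
    }
}
```

If a specific failure is recorded first, a later `storeIfNotAlreadyFailure(Category.FAILURE_GENERIC)` leaves it in place, whereas `set(Category.FAILURE_GENERIC)` would replace it.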
Return bad request for invalid URI (#2127)

- Reject encoded slashes (%2F) in paths: mirrors Envoy's REJECT_REQUEST behavior but a bit stricter; any request with %2F in the path gets a 400 Bad Request response
- Decode %2E before normalization: per RFC 3986 §2.4, percent-encoded unreserved characters (like .) should be decoded before processing, so dot-segments like %2E%2E are properly collapsed by normalize()
- Propagate URISyntaxException from parsePath: instead of silently swallowing parse errors and falling back to a regex, invalid URIs now set a BAD_URI flag on the context and are rejected with a 400
- Remove the fallback regex: deleted URL_REGEX and the manual path-parsing fallback; invalid URIs are now a hard reject rather than a best-effort parse
- Remove opaque URI handling
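The first three rules above can be approximated with `java.net.URI` alone. This is a minimal sketch, not the PR's implementation: `PathValidationSketch` and `normalizePath` are hypothetical names, and signalling the encoded-slash rejection with a `URISyntaxException` is an assumption made to keep the example small (the actual change sets a BAD_URI flag on the context).

```java
import java.net.URI;
import java.net.URISyntaxException;
import java.util.regex.Pattern;

/** Hypothetical path check (not Zuul's parsePath): reject %2F, decode %2E, then normalize. */
final class PathValidationSketch {
    private static final Pattern ENCODED_SLASH = Pattern.compile("%2[fF]");
    private static final Pattern ENCODED_DOT = Pattern.compile("%2[eE]");

    /**
     * Returns the normalized raw path, or throws if the path should be rejected.
     * A caller would map the exception to a 400 Bad Request.
     */
    static String normalizePath(String rawPath) throws URISyntaxException {
        if (ENCODED_SLASH.matcher(rawPath).find()) {
            throw new URISyntaxException(rawPath, "encoded slash in path");
        }
        // Decode the unreserved '.' first so that %2E%2E becomes a real dot-segment.
        String withDots = ENCODED_DOT.matcher(rawPath).replaceAll(".");
        // new URI(...) throws URISyntaxException on illegal characters; no regex fallback.
        return new URI(withDots).normalize().getRawPath();
    }
}
```

Under these assumptions, `normalizePath("/a/%2E%2E/b")` collapses to `/b`, while `/a%2Fb` and a path containing a raw space are both rejected with an exception.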