refactored unified breaker with retry after#4566

Open
adleong wants to merge 61 commits into
mainfrom
alex/refactored-unified-breaker-with-retry-after
Conversation

@adleong

@adleong adleong commented Jun 9, 2026

Member

This PR implements my feedback from #4561, simplifying retry-after hinting to only look at the most recent failure. In order to accomplish this, we store the retry-after hint on the classification.

Tests have not been updated.

unleashed and others added 30 commits May 21, 2026 20:25
Introduce linkerd-ewma, a general-purpose exponentially-weighted moving
average crate. The crate provides five public methods on an Ewma struct:
new (initializes with INFINITY sentinel), get (returns stored value),
add (blends a new sample using exponential decay), add_peak (replaces
stored value when the new sample exceeds it), and add_rate (derives a
rate from the inverse of the elapsed interval and feeds it through add).

This is being added in spite of tower::PeakEwma because this is not
limited to middleware-based RTT computing. We specifically plan to
use this implementation for a load biasing feature and a
success-rate circuit breaker policy, which would otherwise not be
possible.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Extend linkerd-ewma with the API surface needed for success-rate circuit
breaking. A MIN_DECAY constant (1 ms) is now applied in both constructors
so that a zero-duration decay never produces division-by-zero or NaN
results in downstream arithmetic.

New methods: new_with_value sets an explicit initial sample instead of the
INFINITY sentinel, reset overwrites both value and timestamp for breaker
recovery, and get_at projects the stored value forward through exponential
decay without mutating internal state.

Also add_peak is now decay-aware: it projects the stored value to the
candidate timestamp before deciding whether to replace it, and it
unconditionally replaces INFINITY so that the first real sample always
takes effect even at the construction timestamp.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
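
A minimal sketch of the decay-aware peak logic described in the two commits above, not the crate's actual code. The field names, the `f64` value type, and the choice to project toward zero in `get_at` are assumptions made for illustration; only the method names, the INFINITY sentinel, and the 1 ms `MIN_DECAY` floor come from the commit messages.

```rust
use std::time::{Duration, Instant};

/// Floor applied to the decay so a zero duration never divides by zero.
const MIN_DECAY: Duration = Duration::from_millis(1);

pub struct Ewma {
    value: f64,
    decay_secs: f64,
    updated: Instant,
}

impl Ewma {
    /// Starts at the INFINITY sentinel until a real sample arrives.
    pub fn new(decay: Duration, now: Instant) -> Self {
        Self {
            value: f64::INFINITY,
            decay_secs: decay.max(MIN_DECAY).as_secs_f64(),
            updated: now,
        }
    }

    pub fn get(&self) -> f64 {
        self.value
    }

    /// Weight the stored value keeps after the time elapsed up to `now`.
    fn weight(&self, now: Instant) -> f64 {
        let elapsed = now.saturating_duration_since(self.updated).as_secs_f64();
        (-elapsed / self.decay_secs).exp()
    }

    /// Projects the stored value to `now` without mutating state.
    /// Assumption: the projection decays toward zero.
    pub fn get_at(&self, now: Instant) -> f64 {
        if self.value.is_infinite() {
            return self.value;
        }
        self.value * self.weight(now)
    }

    /// Blends a sample in. Assumption: the sentinel is replaced outright,
    /// and a sample at or before the stored timestamp is discarded.
    pub fn add(&mut self, sample: f64, now: Instant) {
        if self.value.is_infinite() {
            self.value = sample;
        } else if now > self.updated {
            let w = self.weight(now);
            self.value = self.value * w + sample * (1.0 - w);
        } else {
            return;
        }
        self.updated = self.updated.max(now);
    }

    /// Replaces the value when the sample exceeds the projected peak,
    /// otherwise blends it in like `add`.
    pub fn add_peak(&mut self, sample: f64, now: Instant) {
        if self.value.is_infinite() || sample > self.get_at(now) {
            self.value = sample;
            self.updated = self.updated.max(now);
        } else {
            self.add(sample, now);
        }
    }
}
```

With a 10 s decay, a peak of 100 projects to roughly 36.8 one decay period later, so a sample of 10 at that point is blended in rather than replacing the peak.
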
Add a retry_after module to linkerd-http-classify with shared parsing
functions for extracting backoff hints from HTTP and gRPC responses.

parse_retry_after handles 429/503 responses with both delay-seconds and
HTTP-date formats per RFC 7231, capping the returned duration at a
caller-specified maximum. parse_grpc_retry_pushback reads the
grpc-retry-pushback-ms header per the gRPC A6 spec, rejecting negative
values and capping positive ones.

We use the httpdate crate for the actual RFC 7231 HTTP-date parsing.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
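
The capping rules above can be sketched with the standard library alone. The signatures here are hypothetical (the real functions presumably take response headers rather than a status and a string), and the HTTP-date branch is deliberately left out because it relies on the httpdate crate; an HTTP-date value simply yields `None` in this sketch.

```rust
use std::time::Duration;

/// Delay-seconds form of Retry-After, honored only on 429 and 503.
/// Hypothetical signature; HTTP-date values are not handled here.
pub fn parse_retry_after_secs(status: u16, value: &str, max: Duration) -> Option<Duration> {
    if status != 429 && status != 503 {
        return None;
    }
    let secs: u64 = value.trim().parse().ok()?;
    // Cap at the caller-specified maximum.
    Some(Duration::from_secs(secs).min(max))
}

/// grpc-retry-pushback-ms value: negative values are rejected,
/// positive ones are capped.
pub fn parse_grpc_retry_pushback(value: &str, max: Duration) -> Option<Duration> {
    let ms: i32 = value.trim().parse().ok()?;
    if ms < 0 {
        return None;
    }
    Some(Duration::from_millis(ms as u64).min(max))
}
```
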
…re penalties

Introduce the linkerd-load-biaser crate, which wraps any tower::Service to
provide per-endpoint load metrics for P2C balancing. The crate tracks request
latency via EWMA and injects penalties when failure responses are detected,
steering traffic away from unhealthy endpoints.

Penalty injection covers HTTP 429/503/5xx and gRPC RESOURCE_EXHAUSTED/UNAVAILABLE
trailers-only responses (not streaming gRPC failures since we can only
access headers here). For responses with backoff hints, Retry-After on
HTTP 429/503 or grpc-retry-pushback-ms on gRPC trailers-only errors, the
penalty is amplified so that the EWMA value remains meaningful through
the server-requested backoff window. The amplification is clamped to
prevent infinity from permanently disabling the endpoint.

The load metric is computed as `max(rtt * (pending + 1), penalty)`, where
`rtt` is the peak-EWMA latency, and `pending` is the number of in-flight
requests. This is returned via tower::load::Load for direct P2C
integration.

The load biaser is disabled by default, preserving RTT-only behavior
(PeakEwma equivalent), unless explicitly activated.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
These cover the complete load biasing lifecycle, including penalty
injection, hint parsing, cancellation safety via PinnedDrop, and
backwards-compatible behavior when disabled (i.e., RTT-only behavior
equivalent to PeakEwma).

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Co-authored-by: katelyn martin <kate@buoyant.io>
Co-authored-by: katelyn martin <kate@buoyant.io>
…_rate_limit_hint

The _max parameter was accepted for API symmetry with rate_limit_hint(max) but
intentionally unused: the method always caches the uncapped raw value so each
consumer can apply its own cap via rate_limit_hint(max). Remove the parameter
for now since we probably won't need it; if we do, we can always put it
back in place.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
…or and accessor

Make the inner Duration field private and provide CachedRateLimitHint::new() for
construction and duration_capped(max) for reads. This prevents consumers from
bypassing the per-caller cap that rate_limit_hint(max) enforces, since the cached
value is intentionally uncapped.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Explain why a standalone EWMA crate exists instead of using Tower's
RttEstimate: it is private, mutates on read, and cannot support the
penalty dimension that failure-aware load balancing requires.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
The crate only uses tokio::time, so disable the default feature set to
avoid pulling unnecessary features into the dependency declaration.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
The cancellation test uses tokio::sync::oneshot which requires the sync
feature. This compiled only because workspace feature unification pulled
it in from other crates.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Replace raw string literals with the module-level constant for
consistency with how HTTP tests use http::header::RETRY_AFTER.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Consistent with Ewma::new which already has this attribute.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Inspect the grpc-status header only on HTTP 200 responses whose
content-type starts with application/grpc. Without this a non-gRPC
upstream that happens to include a grpc-status header would be
considered a gRPC failure and penalized by the load biaser.

The same check is applied to the gRPC retry-pushback-ms parsing in
the ResponseFailureHint trait implementation.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Up until now we mapped every non-zero gRPC status code to
FailureHint::InternalError, penalizing client errors like CANCELLED,
INVALID_ARGUMENT, NOT_FOUND, etc. These don't indicate server
health issues and should not steer traffic away from the endpoint.

Restrict penalty injection to server-side error codes that indicate
endpoint problems: UNKNOWN (2), DEADLINE_EXCEEDED (4), INTERNAL (13),
and DATA_LOSS (15), alongside the existing RESOURCE_EXHAUSTED (8)
and UNAVAILABLE (14) statuses.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
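
The gating and status selection from the two commits above can be sketched as two small predicates. The function names and the plain integer arguments are assumptions for illustration; the status codes and the 200 plus `application/grpc` gate come from the commit messages.

```rust
/// A response counts as gRPC only when it is a 200 whose content-type
/// starts with application/grpc.
pub fn is_grpc_response(status: u16, content_type: Option<&str>) -> bool {
    status == 200 && content_type.map_or(false, |ct| ct.starts_with("application/grpc"))
}

/// gRPC status codes that inject a penalty: UNKNOWN (2),
/// DEADLINE_EXCEEDED (4), RESOURCE_EXHAUSTED (8), INTERNAL (13),
/// UNAVAILABLE (14), and DATA_LOSS (15).
pub fn grpc_code_is_penalized(code: u32) -> bool {
    matches!(code, 2 | 4 | 8 | 13 | 14 | 15)
}
```

Client-side codes such as CANCELLED (1), INVALID_ARGUMENT (3), and NOT_FOUND (5) fall through to `false` and leave the endpoint's load untouched.
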
Ensure only those gRPC status codes indicating server-side errors
inject penalties.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Verify that consecutive 429 responses at 1s intervals keep the
penalty at the configured level, confirming the EWMA peak resets
the decayed value rather than accumulating.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Add a `last_update()` getter that returns the timestamp of the most
recent EWMA update. Callers that need to detect staleness (i.e., idle
periods where the EWMA has decayed to the point that a single sample
dominates) can compare this against the current time and, for example,
require more samples before making decisions.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Co-authored-by: katelyn martin <git@katelyn.world>
- Drop unused add_rate, last_update
- Correct MIN_DECAY enforcement comment
- Note on ignoring negative do-not-retry pushbacks

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
…A RTT

We now keep a single RTT EWMA and a load of `rtt * (pending + 1)`,
exactly like Tower's PeakEwma.

A success records its measured RTT, while a failure now records a
computed effective RTT through the same peak-EWMA logic, using the
Retry-After or grpc-retry-pushback hint when present, or otherwise
penalizing the RTT with a base value.

In-flight requests are now counted the way Tower's PeakEwma counts them,
using Arc's strong count and measuring on cancellation.

Finally an explicit completion tracker can use `PendingUntilFirstData`
for measurement to more closely match previous behavior.

`linkerd-ewma` is still a separate crate because we feed it a penalty
value rather than a measured RTT, and since Tower's `RttEstimate` is
private (at the moment) and advances its decay clock on read, it can't
accept an injected observation nor be read under a shared lock.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
…fault

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
The gRPC A6 spec defines grpc-retry-pushback-ms as an i32.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
unleashed and others added 24 commits June 8, 2026 15:24
Conveys meaning without coupling type nor constant.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Add a test exercising the case of a sample at or below the still
undecayed peak. It should not replace the peak, but compute the
value blending in the new sample.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Ensure that add() discards a sample whose timestamp is at or before
the stored one.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
A hint below the base penalty such as 0 records a low effective RTT
and can make a failing endpoint look healthier than it should.

Ensure a failure's recorded measurement is at least the base penalty,
so that retry hints take effect only when they exceed that penalty.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
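
A small sketch of the floor described above, together with the `rtt * (pending + 1)` load from the earlier commit. The function names and parameters are hypothetical; the hint cap is assumed to be applied before the floor.

```rust
use std::time::Duration;

/// Effective RTT recorded for a failure: the capped hint when it exceeds
/// the base penalty, otherwise the base penalty itself.
pub fn failure_measurement(
    base_penalty: Duration,
    hint: Option<Duration>,
    max_hint: Duration,
) -> Duration {
    let hinted = hint.map(|h| h.min(max_hint)).unwrap_or(Duration::ZERO);
    hinted.max(base_penalty)
}

/// Load reported to P2C, in seconds: rtt * (pending + 1).
pub fn load(rtt: Duration, pending: u32) -> f64 {
    rtt.as_secs_f64() * (f64::from(pending) + 1.0)
}
```

A `Retry-After: 0` therefore records the base penalty rather than a zero RTT, so the failing endpoint cannot look healthier than one that sent no hint at all.
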
test_rtt_tracked_after_request resolved instantly under paused time and
only checked that the RTT moved minimally. Drive a request that takes
a measurable delay and assert the recorded RTT reflects it.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Existing tests raised the pending count with disabled handles or a
single request. Try now with two concurrent requests and assert
the strong count reports two pending, then assert the count falls
when they resolve.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Version 0.20.0 adds the proto surface the load-balancing and circuit
breaker work builds on: the penalty peak-EWMA load variant and the
unified failure-accrual kind.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
The `ExponentialBackoff` strategy keeps its minimum and maximum
durations in private fields, so callers that hold a backoff have no way
to read the window it covers. Add public accessors that return those
two durations. They let a caller clamp an externally supplied delay to
the configured backoff window.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
The broadcast classification channel reported every dropped send the
same way, which conflated two unrelated conditions. When no breaker
consumes classifications, the receiver is dropped and the channel
closes for good. On the default path that is the steady state and not a
fault, so logging it only adds noise. A full channel differs. A
consumer exists yet does not drain fast enough, and that backpressure
is worth surfacing.

Route every send site through a single helper that inspects the
try-send error. A closed channel now emits a quiet trace line noting
there is no consumer, while a full channel emits a debug line flagging
the backpressure. Telling the two apart keeps the common no-consumer
case silent without hiding the one operators care about.

Replace the derived Debug on the channel state with a hand-written impl
so the formatting no longer demands that the class type itself be
Debug, which it need not be at every use site.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
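
The closed-versus-full distinction can be illustrated with the standard library's bounded channel, swapped in here for whatever channel type the proxy actually uses. The helper name and the `SendOutcome` enum are hypothetical; the real helper logs at trace or debug level instead of returning a value.

```rust
use std::sync::mpsc::{SyncSender, TrySendError};

#[derive(Debug, PartialEq, Eq)]
pub enum SendOutcome {
    Sent,
    /// Receiver dropped: the steady state when no breaker consumes
    /// classifications, worth only a quiet trace line.
    NoConsumer,
    /// A consumer exists but is not draining fast enough.
    Backpressure,
}

/// Single helper every send site goes through; it inspects the
/// try-send error instead of treating all drops alike.
pub fn send_class<T>(tx: &SyncSender<T>, class: T) -> SendOutcome {
    match tx.try_send(class) {
        Ok(()) => SendOutcome::Sent,
        Err(TrySendError::Disconnected(_)) => SendOutcome::NoConsumer,
        Err(TrySendError::Full(_)) => SendOutcome::Backpressure,
    }
}
```
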
Introduce a SuccessRateWindow primitive the unified circuit breaker will
consult to tell whether the recent fraction of healthy responses has
fallen below a threshold. It tracks the ratio with a ring of ten
fixed-duration buckets that span a configurable decay. Recording a
response advances the ring to now, zeroes any bucket aged out of the
window, then tallies the sample into the live bucket. A check sums the
live buckets and trips when the window holds a minimum count of requests
and the ratio is below the threshold. An idle gap past the window clears
every bucket so a quiet endpoint starts cold.

The breaker needs a rate-independent measure here. A decaying moving
average weights each sample by how long since the previous one, so a
burst of failures in a row gives each sample almost no weight and can
hide a complete outage, while the same failures spread out trip sooner.
Exact counts over a fixed window have no such blind spot. A given
fraction of failures reaches the same decision regardless of arrival
rate, the property an operator expects from a success-rate threshold.

The module takes plain parameters rather than a configuration type, so
it stays independent of how the breaker is configured. It is not yet
wired into the breaker. A later change connects it.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
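
The bucket ring described above can be sketched as follows. This is an illustration under assumptions, not the module's code: the field names, the `is_below` method name, the `f64` threshold, and the 1 ms floor on the bucket length are all invented here.

```rust
use std::time::{Duration, Instant};

const BUCKETS: usize = 10;

#[derive(Clone, Copy, Default)]
struct Bucket {
    ok: u32,
    total: u32,
}

pub struct SuccessRateWindow {
    buckets: [Bucket; BUCKETS],
    bucket_len: Duration,
    live: usize,
    live_start: Instant,
}

impl SuccessRateWindow {
    pub fn new(window: Duration, now: Instant) -> Self {
        Self {
            buckets: [Bucket::default(); BUCKETS],
            // Assumption: floor the bucket length so it is never zero.
            bucket_len: (window / BUCKETS as u32).max(Duration::from_millis(1)),
            live: 0,
            live_start: now,
        }
    }

    /// Advances the ring to `now`, zeroing buckets that aged out.
    fn advance(&mut self, now: Instant) {
        let elapsed = now.saturating_duration_since(self.live_start);
        let steps = elapsed.as_nanos() / self.bucket_len.as_nanos();
        if steps >= BUCKETS as u128 {
            // Idle gap past the window: start cold.
            self.buckets = [Bucket::default(); BUCKETS];
            self.live_start = now;
            return;
        }
        for _ in 0..steps {
            self.live = (self.live + 1) % BUCKETS;
            self.buckets[self.live] = Bucket::default();
        }
        self.live_start += self.bucket_len * steps as u32;
    }

    pub fn record(&mut self, ok: bool, now: Instant) {
        self.advance(now);
        let b = &mut self.buckets[self.live];
        b.total += 1;
        if ok {
            b.ok += 1;
        }
    }

    /// Trips when the window holds at least `min_requests` samples and
    /// the success ratio is below `threshold`.
    pub fn is_below(&mut self, threshold: f64, min_requests: u32, now: Instant) -> bool {
        self.advance(now);
        let (ok, total) = self
            .buckets
            .iter()
            .fold((0u32, 0u32), |(o, t), b| (o + b.ok, t + b.total));
        total > 0 && total >= min_requests && f64::from(ok) / f64::from(total) < threshold
    }
}
```

Because the check sums exact counts, ten failures produce the same verdict whether they arrive within one bucket or across several, as long as they all sit inside the window.
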
A server under load can tell a client how long to wait before trying
again, through an HTTP Retry-After header or a gRPC
grpc-retry-pushback-ms trailer on a RESOURCE_EXHAUSTED response. The
circuit breaker should honor that signal rather than relying only on its
own escalating backoff, so a backend that asks for a longer pause gets
one and a backend that recovers fast is not punished beyond what it
requested.

This adds the per-endpoint plumbing that captures those hints. A
duration hint store keeps the latest hint, last value wins, stamped with
the instant it was recorded so an old value can be detected and dropped
rather than replayed into a later backoff cycle. Keeping the freshest
hint lets a recovering server that lowers its pushback be honored at the
lower value. Taking a hint subtracts the time already spent waiting, so
the breaker sees only the remaining delay, and an exhausted or overshot
hint clears the slot.

HTTP and gRPC hints live in separate stores kept strictly by source, and
gRPC is parsed only on a genuine gRPC response, gated on a 200 OK so a
429 with a spurious grpc-status cannot leak across. A free function
drains both stores and clamps each hint into the probe backoff's minimum
and maximum before taking the larger, so one response can neither
shorten the base backoff nor push past the ceiling the breaker escalates
toward. The classifiers wrap an inner one and record hints as a side
effect, leaving every decision unchanged. Nothing consumes the stores
yet, and the core gRPC status accessor becomes public for that next
step.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
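
A sketch of the hint store and the drain-and-clamp function described above. `HintStore`, `take_remaining`, and `combined_hint` are hypothetical names, and the sketch assumes that taking a hint always empties the slot. Note that `Duration::clamp` panics if `min` exceeds `max`, so a caller would need to guarantee that ordering.

```rust
use std::time::{Duration, Instant};

/// Holds the latest hint, stamped with the instant it was recorded.
#[derive(Default)]
pub struct HintStore {
    slot: Option<(Duration, Instant)>,
}

impl HintStore {
    /// Last value wins.
    pub fn record(&mut self, hint: Duration, now: Instant) {
        self.slot = Some((hint, now));
    }

    /// Returns the delay still remaining, subtracting time already
    /// waited. An exhausted or overshot hint yields `None`.
    pub fn take_remaining(&mut self, now: Instant) -> Option<Duration> {
        let (hint, recorded_at) = self.slot.take()?;
        let waited = now.saturating_duration_since(recorded_at);
        hint.checked_sub(waited).filter(|d| !d.is_zero())
    }
}

/// Drains both stores, clamps each hint into [min, max], and returns
/// the larger of the two.
pub fn combined_hint(
    http: &mut HintStore,
    grpc: &mut HintStore,
    min: Duration,
    max: Duration,
    now: Instant,
) -> Option<Duration> {
    let h = http.take_remaining(now).map(|d| d.clamp(min, max));
    let g = grpc.take_remaining(now).map(|d| d.clamp(min, max));
    match (h, g) {
        (Some(a), Some(b)) => Some(a.max(b)),
        (a, b) => a.or(b),
    }
}
```
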
Introduce a circuit breaker that watches two failure signals at once. A
consecutive-failure count reacts the instant an endpoint hard-fails but
stays silent under a partial outage, where requests fail steadily yet
never string together a long enough run to trip. A windowed success
ratio catches that partial degradation, though it lags on a total
outage since it must first gather its minimum sample. Running both in
one breaker pairs the fast reaction of the count with the partial
coverage of the ratio, so either kind of failure opens the circuit.

The ratio comes from a time-windowed counter rather than a time-decayed
average, so the trip decision tracks the failure fraction and not the
request rate. A tight burst and the same burst spread out reach the
same verdict. The ratio dimension is protected at first start and
cannot trip until enough samples sit in its window, and a threshold of
zero disables it. The engine takes only primitive parameters and the
threshold is a plain fraction the caller supplies, so it never sees how
the policy is represented at the configuration layer.

The state machine has three states. Open accepts traffic while tracking
both signals. Shut rejects traffic and runs the backoff, re-reading the
combined Retry-After and gRPC hint each iteration so a fresh, longer
hint from a later probe raises the floor for the waits that follow.
Probation admits one probe, bounded by the backoff ceiling rather than
the window just waited. The probe is strict. An HTTP probe must be
non-5xx and non-429, and a gRPC probe passes for any class other than
RESOURCE_EXHAUSTED, since a 429 there means the endpoint is still rate
limiting.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
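
The two trip conditions and the three states can be sketched as pure functions. Every type and name below is hypothetical, the threshold is expressed in basis points to keep the comparison in integers, and the backoff timing is reduced to a single `BackoffElapsed` event.

```rust
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum State {
    Open,      // accepts traffic, tracks both signals
    Shut,      // rejects traffic while the backoff runs
    Probation, // admits one probe
}

#[derive(Clone, Copy)]
pub struct Policy {
    pub max_failures: u32,
    pub min_requests: u32,
    pub threshold_bps: u32, // 0 disables the ratio dimension
}

#[derive(Clone, Copy)]
pub struct Signals {
    pub consecutive_failures: u32,
    pub window_ok: u32,
    pub window_total: u32,
}

pub enum Event {
    Response(Signals),
    BackoffElapsed,
    Probe { passed: bool },
}

/// Either signal opens the circuit.
pub fn should_trip(p: Policy, s: Signals) -> bool {
    let by_count = s.consecutive_failures >= p.max_failures;
    let by_ratio = s.window_total >= p.min_requests
        && u64::from(s.window_ok) * 10_000
            < u64::from(p.threshold_bps) * u64::from(s.window_total);
    by_count || by_ratio
}

/// Strict probe verdicts: HTTP must be non-5xx and non-429; gRPC passes
/// for any code other than RESOURCE_EXHAUSTED (8).
pub fn http_probe_passes(status: u16) -> bool {
    status < 500 && status != 429
}

pub fn grpc_probe_passes(code: u32) -> bool {
    code != 8
}

pub fn next(state: State, policy: Policy, event: Event) -> State {
    match (state, event) {
        (State::Open, Event::Response(s)) if should_trip(policy, s) => State::Shut,
        (State::Shut, Event::BackoffElapsed) => State::Probation,
        (State::Probation, Event::Probe { passed: true }) => State::Open,
        (State::Probation, Event::Probe { passed: false }) => State::Shut,
        (unchanged, _) => unchanged,
    }
}
```
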
Add the per-endpoint gating layer that connects each balancer
endpoint's response classification stream to its own breaker. The gate
set reads the target's failure accrual policy and, for each endpoint,
builds a gate whose readiness a breaker task controls and a classifier
that broadcasts each classification over that breaker's channel.

Isolation is the point. The stores that hold Retry-After and gRPC
pushback hints, and the breaker task that reads them, are built per
endpoint, so a hint observed on one endpoint extends only that
endpoint's backoff and a run of failures on one never gates another.
When the policy respects server hints, the classifier is wrapped so the
inner broadcaster records those hints into the endpoint's stores. A
policy that leaves the hint flag off never reads the stores and skips
that wrapper.

A policy that can never trip costs the same as none at all, so a
missing or inert policy resolves to a no-op path: no breaker task and
no hint stores get allocated, and the gate simply never shuts. Hints
are clamped to the chosen policy's maximum backoff, the same ceiling
the breaker applies, so an oversized header is never held in a store
until the breaker would discard it.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Failure accrual was a closed enum of None or ConsecutiveFailures that
every HTTP and gRPC protocol config held by value. That left no room for
the new success-rate breaker and forced the disabled state to be its own
variant. Model it as an opt-in oneof instead. Each protocol config now
holds an Option, where None means accrual is off, so the disabled case
lives at the config level. The enum becomes Consecutive or Unified. The
consecutive policy counts failures in a row, while the unified policy
adds a windowed success-rate threshold on top of a consecutive-failure
ceiling. Both variants also hold a Retry-After preference.

Store the success-rate threshold as basis points in a small newtype
rather than a float. A float is neither Eq nor Hash, which a backend's
cache identity requires, while an integer in the range zero to ten
thousand is both and stays finer than any meaningful success-rate
target. This keeps the protocol configs on their plain derived equality
and hashing. The proto conversion validates the incoming fraction for
range and NaN before quantizing.

The proto conversion dispatches on the oneof kind, reads the Retry-After
preference off the backoff message, and enforces ceilings on the
cold-start request floor and the success-rate window. That window floor
binds only under a nonzero threshold. A zero threshold disables the
success-rate dimension, so any decay is accepted and the
consecutive-failure ceiling stands alone. The outbound breaker wiring
then maps an absent policy to a gate that never closes, a consecutive
policy to the consecutive breaker, and a unified policy to the
success-rate engine. The per-endpoint gate set threads each endpoint's
Retry-After stores to its own breaker, so a hint on one endpoint never
extends another's.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
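
The basis-points newtype can be sketched as below. The type and method names are assumptions; what the commit message specifies is the zero to ten thousand integer range, the derived `Eq` and `Hash`, and validation for range and NaN before quantizing.

```rust
/// Success-rate threshold in basis points (0..=10_000), so the value
/// can derive Eq and Hash where a float could not.
#[derive(Clone, Copy, Debug, PartialEq, Eq, Hash)]
pub struct BasisPoints(u16);

impl BasisPoints {
    /// Validates the wire fraction, then quantizes it. NaN fails the
    /// range check because every comparison against NaN is false.
    pub fn from_fraction(fraction: f64) -> Option<Self> {
        if !(0.0..=1.0).contains(&fraction) {
            return None;
        }
        Some(Self((fraction * 10_000.0).round() as u16))
    }

    pub fn get(self) -> u16 {
        self.0
    }
}
```
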
Move the backoff-wait loop, which optionally floors the wait on a
server Retry-After or gRPC pushback hint, out of the unified breaker
and into the retry-after module beside the hint stores. The unified
breaker calls the shared helper; the consecutive-failures breaker
keeps its own hint-free exponential backoff, so only the unified
policy ever floors a wait on a server hint.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
The balancer load model now describes a second estimator. Alongside
plain peak-EWMA, a backend may select a variant that penalizes endpoints
that signal rate limiting, biasing P2C toward healthier peers. Extend
the client-policy Load enum with a PenaltyPeakEwma variant and hold the
strategy's fields verbatim from the control-plane policy: the RTT seed
and decay window, the penalty magnitude and its own decay, and the cap
on how far a Retry-After hint may extend that penalty. The policy layer
only records them as part of the backend's identity. The balancer maps
them onto its estimator later.

Every field is a duration, so the new struct and the widened enum keep
deriving Eq and Hash and remain part of a backend's cache key, matching
the existing peak-EWMA strategy. The proto conversion gains the matching
branch. Since each field is optional on the wire, an absent value takes
a documented default while a value that is present but invalid still
surfaces an error rather than being silently discarded. A small helper
expresses that decoding next to the existing required-field decoder.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Pin the failure-accrual proto conversion across both branches of the
discovery oneof and the rules that guard the success-rate path. On the
consecutive branch the failure ceiling comes through, and the
Retry-After preference is read off the backoff message both when the
hint is requested and when it is left unset. A branch with no backoff is
treated as missing, and an accrual with no kind at all is rejected too.

On the unified branch the wire fraction lands in the basis-points
threshold rather than a raw float, so the threshold can stay part of a
backend's cache identity. The measurement window defaults to ten seconds
when absent. A zero threshold disables the success-rate dimension, so a
sub-floor window is accepted there and the consecutive-failure ceiling
stands on its own. Thresholds outside the unit interval, NaN among them,
are rejected at the boundary before any rounding, and a populated
ejection field is ignored since the conversion reads only the discovery
kind.

This boundary is the single place where untrusted control-plane numbers
become typed policy. A regression here would either crash a backend or
admit a breaker configuration that can never trip.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Give the P2C balancer a second load estimator alongside its Tower
peak-EWMA wrapper. The new one tracks load with the response-aware
biaser, so an endpoint returning rate-limit signals is de-prioritized
for a penalty window. NewPenaltyPeakEwmaBalance builds each endpoint
through NewLoadBiaser and serves only backends whose policy opts into
penalties. A shared helper gives both paths the same queue, metrics, and
endpoint setup, so only the per-endpoint load tracker and pool differ.
Nothing selects the penalty path yet, so the change is additive.

A backend's PenaltyPeakEwma policy maps onto the biaser configuration
field for field, with one exception. The policy has a separate
penalty_decay, yet the biaser records a penalized response as a raised
effective RTT that fades through the same RTT EWMA window. No second
decay remains to drive, so penalty_decay folds into rtt_decay and drops
from the mapping. Both paths also floor the seed RTT at MIN_DEFAULT_RTT,
one millisecond, since the estimate scales with the seed and a near-zero
seed would let P2C over-select a fresh endpoint. RTT is sampled at the
first response data frame.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
The HTTP balance layer now chooses its load estimator from the backend's
policy. NewBalance reads the load policy from the target and dispatches
to svc::Either: the PeakEwma branch builds the existing peak-EWMA
balancer, and the PenaltyPeakEwma branch builds the response-aware
penalty estimator that de-prioritizes endpoints returning rate-limit
signals. The default stays peak-EWMA, so a backend that does not opt in
behaves exactly as before, and RTT is still sampled at the first
response data frame on either path.

Since the two pool service types differ, each branch boxes its response
body so the two unify to a single response type. Boxing does not change
when first-data load completion fires, as the underlying handle still
observes the boxed body when polled and the balancer's response body is
boxed downstream regardless.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Hold the backend's load policy on the balance target so the HTTP
balancer can pick its endpoint-load estimator per backend. The concrete
balance dispatch now holds the client-policy load oneof in place of a
bare EWMA configuration, and the target gains what the selector reads:
the load itself, an EWMA configuration derived from the named estimator,
and a penalty configuration that defaults to penalty-free for peak-EWMA
so the parameter stays total. A Load::peak_ewma_rtt() helper reads the
decay and seed RTT out of either estimator for the EWMA call sites.

The logical, profile, and policy routers pick the load policy that flows
into the dispatch. Profiles have no penalty configuration, so that path
uses a peak-EWMA default whose RTT settings match the prior balancer,
and a backend opting into nothing behaves as before. Opaque and TLS
routing lack HTTP response classification, so they take only the RTT
settings, and when an operator did set a real penalty the router warns
it ignores that estimator so the drop stays visible. The control-plane
balancer is likewise RTT-only and uses peak-EWMA directly.

The per-backend selector boxes each branch's response body, relaxing its
bounds so the penalty branch may use a distinct, independently boxable
body. That branch samples penalty load at the first response data frame
and wraps the endpoint body in the biaser's completion type, which
differs from the peak-EWMA wrapper. Both bodies box to one response type
regardless.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Add integration tests that drive the unified circuit breaker through the
dispatch the outbound stack uses to spawn a policy per endpoint, rather
than poking the engine internals directly. The unit tests already pin
the state machine, so these pin the wiring that turns a failure-accrual
policy into a running breaker. Both trip conditions are covered: a run
of consecutive failures opens the circuit at once with no cold-start
guard, and a low windowed success rate opens it when no such run forms.

Recovery goes through bounded probation. A clean probe reopens the
circuit, while a failed or silent probe re-shuts it and advances the
backoff so a still-broken endpoint stays ejected. The two probe verdicts
are covered on both breaker branches, since the consecutive branch keeps
judging a probe by the default classifier while the unified branch
treats a rate-limit signal during probation as a failure.

Server pushback is covered as an opt-in backoff floor clamped to the
ceiling, and isolated per endpoint so one peer's hint cannot stretch
another endpoint's recovery.

Signed-off-by: Alejandro Martinez Ruiz <amr@buoyant.io>
Signed-off-by: Alex Leong <alex@buoyant.io>
Signed-off-by: Alex Leong <alex@buoyant.io>
Signed-off-by: Alex Leong <alex@buoyant.io>
@adleong adleong requested a review from a team as a code owner June 9, 2026 22:33
Signed-off-by: Alex Leong <alex@buoyant.io>
@unleashed unleashed force-pushed the amr/refactored-unified-breaker branch 2 times, most recently from 8544696 to 713e33d Compare June 11, 2026 18:06
Base automatically changed from amr/refactored-unified-breaker to main June 11, 2026 18:20
@cratelyn

Member

my understanding is that this can be closed, because we opted to take #4565. is that correct @unleashed?


3 participants