feat(server): meter bytes transferred on forward deliveries#6240
Conversation
There was a problem hiding this comment.
3 issues found across 10 files
Confidence score: 3/5
- There is concrete regression risk in
packages/webhooks/lib/utils.ts: unhandledonByteserrors can convert an otherwise successful webhook delivery into a retry/failure path, which may resend the same event. packages/webhooks/lib/utils.unit.test.tsintroduces a real flake risk because fixed port 4103 conflicts withsync.unit.test.tswhen Vitest runs in parallel.packages/webhooks/lib/forward.tsappears to drop accumulatedbyteson theresult.isErr()path, so failed single-hop forwards miss the new size metrics; this is moderate observability impact rather than a hard blocker.- Pay close attention to
packages/webhooks/lib/utils.ts,packages/webhooks/lib/utils.unit.test.ts, andpackages/webhooks/lib/forward.ts- delivery retry behavior, parallel test port contention, and error-path metric emission need validation.
Reply with feedback, questions, or to request a fix.
Fix all with cubic | Re-trigger cubic
f5d1b5d to
c2bbc0c
Compare
| logContextGetter, | ||
| onBytes: (bytes) => { | ||
| metrics.increment(metrics.Types.WEBHOOK_REQUEST_SIZE_IN_BYTES, bytes.sent, { callsite: 'forward' }); | ||
| metrics.increment(metrics.Types.WEBHOOK_RESPONSE_SIZE_IN_BYTES, bytes.received, { callsite: 'forward' }); |
There was a problem hiding this comment.
Why does bytes.sent = request size and bytes.received = response size?
There was a problem hiding this comment.
For context: we're only metering data transfer from within our subnets to the internet (as those are charged as DTO and NAT). So with that in mind, bytes.sent is the request size because it captures the payload size forwarded to the webhook server (as a request originating from us) and bytes.received is the response size because it captures the response we got back from the webhook server.
| return Ok({ forwarded }); | ||
| } finally { | ||
| try { | ||
| onBytes?.(totalBytes); |
There was a problem hiding this comment.
We already increase metrics from within functions here. forwardWebhook is only called from 1 place. What's the value of adding a callback function to it instead of just increasing the metrics inline in the finally block?
There was a problem hiding this comment.
For context: sending metrics to DD is phase 1; phase 2 is to send it over pubsub and from there fork to DD (observability) and ClickHouse (billing).
I exposed a callback in order to contain the dependency only in the caller packages. It also comes with the minor perk of making it a bit simpler to test, as I don't have to monkeypatch the metrics module to verify it aggregates bytes correctly.
But, to be frank, I don't see a problem in bringing pubsub as a dependency on webhooks - both it's callers, jobs and server, already depend on it - and ditching away the callback approach, if you feel strongly about this.
There was a problem hiding this comment.
jobs and server are app packages. They are therefore not imported. On the other end, webhooks is an internal package. Adding pubsub as a dependency means it becomes a transient dependency of everything that import webhooks itself. For now it happens that pubsub is also a dep of all the parents of webhooks but if webhooks is ever needed somewhere else, the new parent would have to bring pubsub even if it doesn't need the functionalities that depend on it.
I think we should be careful when adding dependencies between packages, especially when they are not core capabilities of the package itself
There was a problem hiding this comment.
Ok it makes sense to keep it as is to avoid dependency issues.
f4379a4 to
e3b6948
Compare
024ef74 to
f5a130e
Compare
Track HTTP request/response bytes per webhook delivery hop via a
metering transport. Accumulate across all URLs (primary + secondary)
and connection fan-out, then emit WEBHOOK_REQUEST_SIZE_IN_BYTES and
WEBHOOK_RESPONSE_SIZE_IN_BYTES metrics from the server.