Skip to content

udp: add ECN, PMTUD, pktinfo, and GSO/GRO extensions to uv_udp_t#5116

Open
jasnell wants to merge 2 commits into
libuv:masterfrom
jasnell:jasnell/udp2
Open

udp: add ECN, PMTUD, pktinfo, and GSO/GRO extensions to uv_udp_t#5116
jasnell wants to merge 2 commits into
libuv:masterfrom
jasnell:jasnell/udp2

Conversation

@jasnell

@jasnell jasnell commented Apr 11, 2026

Copy link
Copy Markdown
Contributor

Extends the existing uv_udp_t handle with per-packet metadata delivery
and advanced UDP features, without introducing a new handle type.

uv_udp_recv_start2() is an enhanced receive path that delivers metadata
through a uv_udp_recv_t struct: ECN codepoint, local destination
address, interface index, and GRO segment stride. Features are enabled
at bind time via new UV_UDP_* flags:

RECVECN IP_RECVTOS / IPV6_RECVTCLASS; uv_udp_set_ecn() for send
PMTUD IP_MTU_DISCOVER / IP_DONTFRAG; uv_udp_set_pmtud()
RECVPKTINFO IP_PKTINFO / IPV6_RECVPKTINFO (IP_RECVDSTADDR on BSD)
GRO UDP GRO with transparent segment splitting (Linux 5.0+)
GRO_RAW UDP GRO without splitting; app splits by segment_size
TXTIME kernel packet pacing via SO_TXTIME (Linux 4.19+)

uv_udp_try_send_batch() sends batches with optional UDP_SEGMENT GSO
(Linux 4.18+) and per-message SCM_TXTIME timestamps.
uv_udp_getmtu() queries the cached path MTU (Linux and Windows).
uv_udp_set_cpu_affinity() sets SO_INCOMING_CPU (Linux 3.9+).
uv_udp_configure() enables features post-bind.

The enhanced recv callback is stored in the existing recv_cb field
(same-sized function pointer) with a UV_HANDLE_UDP_RECV2 flag
controlling dispatch. The uv_udp_t struct layout is unchanged.

On Windows, WSARecvMsg is used when RECVECN or RECVPKTINFO is
requested; otherwise falls back to WSARecvFrom. Where a bind flag
is unsupported on the current platform it is silently ignored.

Some discussion of the motivation behind this: https://www.jasnell.me/posts/libuv-udp-extensions

Fixes: #3077

@saghul

saghul commented Apr 11, 2026

Copy link
Copy Markdown
Member

I just glanced at the API in uv.h, some thoughts:

  • We are re-thinking about a libuv v2, this might be a large enough change to target v2
  • I don't see any fundamental reason why this can't work with the current udp handle. We'd of course need a uv_udp_recv_start2 and all the extra APIs, but we have reserved fields which could be used to allocate an auxiliary structure rather than duplicate a lot of logic. Did you consider that approach?

@jasnell

jasnell commented Apr 11, 2026

Copy link
Copy Markdown
Contributor Author

I did consider adding to the existing udp handle but did not want to risk an abi breaking change, even with the reserved fields. If that's not as much of a risk tho, I can switch it.

I also considered targeting V2 but as this is intended to support quic in node sooner rather than later, and since that is still on 1.x, that's where I wanted to start.

@jasnell jasnell force-pushed the jasnell/udp2 branch 2 times, most recently from f7b269a to e821f58 Compare April 11, 2026 22:19
@bnoordhuis

Copy link
Copy Markdown
Member

Like @saghul, I don't think this needs a new handle type.

Also, with a 4500 line diff, you shouldn't realistically expect anyone to review this in depth anytime soon. We're unpaid volunteers and there's funner stuff to do.

@jasnell

jasnell commented Apr 13, 2026

Copy link
Copy Markdown
Contributor Author

I don't think this needs a new handle type.

Ok, I can update it to use the existing uv_udp_t reserved fields them

@jasnell

jasnell commented Apr 16, 2026

Copy link
Copy Markdown
Contributor Author

Ok, after chatting a bit with @santigimeno at the node.js collab summit in london this week about this and the updated plans to try to get v2 across the line, I think it does make sense to target v2 here. So I'll move this back to draft and will work on rebasing it both on v2 and using the uv_udp_t handle.

@jasnell jasnell marked this pull request as draft April 16, 2026 02:56
@santigimeno

Copy link
Copy Markdown
Member

Thanks. Will take a look when it's ready.

@jasnell jasnell force-pushed the jasnell/udp2 branch 2 times, most recently from a3a054f to 664a25a Compare May 7, 2026 20:25
@jasnell jasnell changed the base branch from v1.x to master May 7, 2026 20:25
@jasnell jasnell changed the title udp: add uv_udp2_t handle type udp: add ECN, PMTUD, pktinfo, and GSO/GRO extensions to uv_udp_t May 7, 2026
@jasnell jasnell marked this pull request as ready for review May 7, 2026 20:26
@jasnell

jasnell commented May 7, 2026

Copy link
Copy Markdown
Contributor Author

@santigimeno @saghul ... ok, the implementation has been updated to (a) use the existing uv_udp_t handle instead of introducing a new handle and (b) targeting master/v2.x instead of v1.x. Managed to make the change set a bit smaller in the process ;-)

@bnoordhuis

Copy link
Copy Markdown
Member

It's still over 3,100 lines of.... I don't want to use the phrase "AI slop" but I can't think of anything more appropriate.

I won't speak for the others but I myself have a severe distaste for reviewing such code: a code monkey can now throw out in 30 seconds what takes me 1 hour to review. Worse, I end up having it to maintain it afterwards.

Long story short: I'm not going to invest my own free time in this, and neither should other maintainers. If this is something you're doing for CloudFlare or OpenJS, then let them get in touch for a support contract.

@jasnell

jasnell commented May 10, 2026

Copy link
Copy Markdown
Contributor Author

@bnoordhuis ... I appreciate your input. You've already indicated that you would not consider reviewing this PR to be "fun" and have indicated that you've "moved past caring" about what Node.js needs from libuv. Given that this is in support of a significant new Node.js feature, I wouldn't expect you personally to invest any amount of time on reviewing this PR or maintaining the changes.

Implementing QUIC for Node.js is a long-time personal project that has absolutely nothing to do with Cloudflare and is not something I'm doing in any official capacity on behalf of the OpenJS foundation. I'm more than happy to help maintain these changes long term.

I appreciate that @santigimeno has indicated to me that they intend to review the changes. @santigimeno, if there's anything I can do to help make the review process easier please do let me know. Roughly 50% of the LOC changed are implementation, which is reasonable split between the Unix and Windows portions (roughly 800-900 lines each). Over half the PR is tests and docs. If it would make the review easier I can split into multiple commits but I've tried to keep everything together.

@saghul and @santigimeno ... one concern here worth noting.... if you look at the additional flags added in uv_common.h (lines 117-121)... they fully exhaust the remaining space in the 32-bit unsigned int flags. I wouldn't expect new flags to be added any time soon but it's worth mentioning.

@jasnell

jasnell commented May 10, 2026

Copy link
Copy Markdown
Contributor Author

FWIW, the CI-docs failure (https://github.com/libuv/libuv/actions/runs/25630392082/job/75232922375?pr=5116) appear to be unrelated to this change.

#5136 should fix a couple of the broken links. Not sure what's up the third one.

jasnell added 2 commits May 11, 2026 06:53
Extends the existing uv_udp_t handle with per-packet metadata delivery
and advanced UDP features, without introducing a new handle type.

uv_udp_recv_start2() is an enhanced receive path that delivers metadata
through a uv_udp_recv_t struct: ECN codepoint, local destination
address, interface index, and GRO segment stride. Features are enabled
at bind time via new UV_UDP_* flags:

  RECVECN       IP_RECVTOS / IPV6_RECVTCLASS; uv_udp_set_ecn() for send
  PMTUD         IP_MTU_DISCOVER / IP_DONTFRAG; uv_udp_set_pmtud()
  RECVPKTINFO   IP_PKTINFO / IPV6_RECVPKTINFO (IP_RECVDSTADDR on BSD)
  GRO           UDP GRO with transparent segment splitting (Linux 5.0+)
  GRO_RAW       UDP GRO without splitting; app splits by segment_size
  TXTIME        kernel packet pacing via SO_TXTIME (Linux 4.19+)

uv_udp_try_send_batch() sends batches with optional UDP_SEGMENT GSO
(Linux 4.18+) and per-message SCM_TXTIME timestamps.
uv_udp_getmtu() queries the cached path MTU (Linux and Windows).
uv_udp_set_cpu_affinity() sets SO_INCOMING_CPU (Linux 3.9+).
uv_udp_configure() enables features post-bind.

The enhanced recv callback is stored in the existing recv_cb field
(same-sized function pointer) with a UV_HANDLE_UDP_RECV2 flag
controlling dispatch. The uv_udp_t struct layout is unchanged.

On Windows, WSARecvMsg is used when RECVECN or RECVPKTINFO is
requested; otherwise falls back to WSARecvFrom. Where a bind flag
is unsupported on the current platform it is silently ignored.

@santigimeno santigimeno left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass focused on API shape and unix implementation only.

@jasnell yes, in principle, splitting the PR on single-feature commits would be nice.

Comment thread docs/src/udp.rst
Comment on lines +94 to +95
/* Indicates data from the Linux MSG_ERRQUEUE error queue. */
UV_UDP_RECV_LINUX_RECVERR = 32768

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's already UV_UDP_LINUX_RECVERR. Never used either

Comment thread include/uv.h
Comment on lines +784 to +787
/* Indicates data from the Linux MSG_ERRQUEUE error queue.
* Set in uv_udp_recv_t.flags when the received message came from the
* socket error queue (IP_RECVERR / IPV6_RECVERR). */
UV_UDP_RECV_LINUX_RECVERR = 32768

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Remove

Comment thread docs/src/udp.rst
Comment on lines +826 to +849
* - GSO (``uv_udp_try_send_batch``)
- Yes (4.18+)
- no-op
- no-op
- no-op
- no-op
* - GRO (``UV_UDP_GRO`` / ``UV_UDP_GRO_RAW``)
- Yes (5.0+)
- no-op
- no-op
- no-op
- no-op
* - SO_TXTIME (``UV_UDP_TXTIME``)
- Yes (4.19+)
- no-op
- no-op
- no-op
- no-op
* - SO_INCOMING_CPU (``uv_udp_set_cpu_affinity``)
- Yes (3.9+)
- ENOTSUP
- ENOTSUP
- ENOTSUP
- ENOTSUP

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We usually don't support features supported on a single platform.
Most of these (the ones related to the transmission) seem something that could be implemented easily on userland, right?
GRO would require more thought: maybe a way to expose the metadata coming via ancillary data.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, I was hoping to come back in a follow up and add an approximated implementation for other runtimes that would make the API operational even if it needed to fallback to a not-actually-GSO/GRO impl under the covers.

The UDP_TXTIME is a kernel pacing feature. Not possible to implement at the user-space level with the same performance.

The cpu_affinity one is certainly not as critical.

Comment thread docs/src/udp.rst
.. note::
On **macOS**, ``IPV6_RECVPKTINFO`` is not available. IPv6 pktinfo
(``UV_UDP_RECVPKTINFO``) is silently ignored; ``recv->local.ss_family``
will be 0.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can use it if we define -D__APPLE_USE_RFC_3542. See for example: python/cpython#19526

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also, is nodejs/node-v0.x-archive#6589 (comment) still a problem in macos? I assume not anymore?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh! I'd completely missed that APPLE_USE_RFC_3542 was a thing. Nice.

Comment thread docs/src/udp.rst
``UV_UDP_RECVPKTINFO``.

:returns: 0 on success, ``UV_EBADF`` if no socket exists, ``UV_EINVAL``
for unrecognized flags.

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In which cases this function would be used?
For now I would probably drop this new method from the PR. It can be added, if we consider it useful, in a follow up PR. Some problems I see now: increases API surface. Some the options already have a public API added in this same PR. Also it sets a default uv_udp_pmtud value.

Comment thread src/unix/udp.c
#elif defined(IP_RECVDSTADDR)
setsockopt(fd, IPPROTO_IP, IP_RECVDSTADDR, &yes, sizeof(yes));
#if defined(IP_RECVIF)
setsockopt(fd, IPPROTO_IP, IP_RECVIF, &yes, sizeof(yes));

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why nesting these 2 options? Aren't they independent? Or enforcing that IP_RECVIF is only set when IP_RECVDSTADDR is intentional?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That was the intention but it's worth revisiting if it's appropriate.

Comment thread src/unix/udp.c
Comment on lines +468 to +471
if (flags & UV_UDP_PMTUD) {
/* Default to PROBE mode. */
uv__udp_set_pmtud_opt(fd, addr->sa_family, 2);
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Return error if uv__udp_set_pmtud_opt() fails?

Comment thread src/unix/udp.c
Comment on lines +463 to +466
if (flags & UV_UDP_RECVECN) {
uv__udp_set_recvecn(fd, addr->sa_family);
handle->flags |= UV_HANDLE_UDP_RECVECN;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

check return value and don't set the flag on error.

Comment thread src/unix/udp.c
Comment on lines +460 to +461
if (flags & UV_UDP_RECVMMSG)
handle->flags |= UV_HANDLE_UDP_RECVMMSG;

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why adding this?

Comment thread src/unix/udp.c
Comment on lines +1542 to +1554
static int uv__udp_set_recvecn(int fd, sa_family_t family) {
int yes = 1;
#if defined(IP_RECVTOS)
if (family == AF_INET)
setsockopt(fd, IPPROTO_IP, IP_RECVTOS, &yes, sizeof(yes));
#endif
#if defined(IPV6_RECVTCLASS)
if (family == AF_INET6)
setsockopt(fd, IPPROTO_IPV6, IPV6_RECVTCLASS, &yes, sizeof(yes));
#endif
return 0;
}

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we handle setsockopt() errors here?
Also returning UV_ENOTSUP if no option set?

@jasnell

jasnell commented May 14, 2026

Copy link
Copy Markdown
Contributor Author

@santigimeno ... really appreciate the initial review. I'll take another update pass hopefully by this time next week

jasnell added a commit to jasnell/node that referenced this pull request May 14, 2026
Use a uv_check_t on BindingData to process outbound
pending packet send, and use TrySend for actually sending
packets when possible. Results in an 8% improvement in
req/s and ~24% improvement in p95 latency.

Also sets us up better for future improvements in
libuv if the changes proposed in libuv/libuv#5116
are accepted.

Signed-off-by: James M Snell <jasnell@gmail.com>
Assisted-by: Opencode:Opus 4.6
jasnell added a commit to jasnell/node that referenced this pull request May 15, 2026
Use a uv_check_t on BindingData to process outbound
pending packet send, and use TrySend for actually sending
packets when possible. Results in an 8% improvement in
req/s and ~24% improvement in p95 latency.

Also sets us up better for future improvements in
libuv if the changes proposed in libuv/libuv#5116
are accepted.

Signed-off-by: James M Snell <jasnell@gmail.com>
Assisted-by: Opencode:Opus 4.6
jasnell added a commit to jasnell/node that referenced this pull request May 15, 2026
Use a uv_check_t on BindingData to process outbound
pending packet send, and use TrySend for actually sending
packets when possible. Results in an 8% improvement in
req/s and ~24% improvement in p95 latency.

Also sets us up better for future improvements in
libuv if the changes proposed in libuv/libuv#5116
are accepted.

Signed-off-by: James M Snell <jasnell@gmail.com>
Assisted-by: Opencode:Opus 4.6
jasnell added a commit to jasnell/node that referenced this pull request May 15, 2026
Use a uv_check_t on BindingData to process outbound
pending packet send, and use TrySend for actually sending
packets when possible. Results in an 8% improvement in
req/s and ~24% improvement in p95 latency.

Also sets us up better for future improvements in
libuv if the changes proposed in libuv/libuv#5116
are accepted.

Signed-off-by: James M Snell <jasnell@gmail.com>
Assisted-by: Opencode:Opus 4.6
jasnell added a commit to jasnell/node that referenced this pull request May 15, 2026
Use a uv_check_t on BindingData to process outbound
pending packet send, and use TrySend for actually sending
packets when possible. Results in an 8% improvement in
req/s and ~24% improvement in p95 latency.

Also sets us up better for future improvements in
libuv if the changes proposed in libuv/libuv#5116
are accepted.

Signed-off-by: James M Snell <jasnell@gmail.com>
Assisted-by: Opencode:Opus 4.6
jasnell added a commit to nodejs/node that referenced this pull request May 21, 2026
Use a uv_check_t on BindingData to process outbound
pending packet send, and use TrySend for actually sending
packets when possible. Results in an 8% improvement in
req/s and ~24% improvement in p95 latency.

Also sets us up better for future improvements in
libuv if the changes proposed in libuv/libuv#5116
are accepted.

Signed-off-by: James M Snell <jasnell@gmail.com>
Assisted-by: Opencode:Opus 4.6
PR-URL: #63267
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
aduh95 pushed a commit to nodejs/node that referenced this pull request May 22, 2026
Use a uv_check_t on BindingData to process outbound
pending packet send, and use TrySend for actually sending
packets when possible. Results in an 8% improvement in
req/s and ~24% improvement in p95 latency.

Also sets us up better for future improvements in
libuv if the changes proposed in libuv/libuv#5116
are accepted.

Signed-off-by: James M Snell <jasnell@gmail.com>
Assisted-by: Opencode:Opus 4.6
PR-URL: #63267
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
araujogui pushed a commit to araujogui/node that referenced this pull request May 26, 2026
Use a uv_check_t on BindingData to process outbound
pending packet send, and use TrySend for actually sending
packets when possible. Results in an 8% improvement in
req/s and ~24% improvement in p95 latency.

Also sets us up better for future improvements in
libuv if the changes proposed in libuv/libuv#5116
are accepted.

Signed-off-by: James M Snell <jasnell@gmail.com>
Assisted-by: Opencode:Opus 4.6
PR-URL: nodejs#63267
Reviewed-By: Matteo Collina <matteo.collina@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants