Skip to content

fix(voice): nil-guard state.ChannelID in gateway open path#551

Open
sealbro wants to merge 2 commits into
disgoorg:masterfrom
sealbro:fix/voice-gateway-channelid-nil
Open

fix(voice): nil-guard state.ChannelID in gateway open path#551
sealbro wants to merge 2 commits into
disgoorg:masterfrom
sealbro:fix/voice-gateway-channelid-nil

Conversation

@sealbro

@sealbro sealbro commented Jun 7, 2026

Copy link
Copy Markdown
Contributor

Summary

gateway.open at voice/gateway.go:183 bare-derefs *state.ChannelID. It SIGSEGVs when VOICE_SERVER_UPDATE arrives before VOICE_STATE_UPDATE (no strict ordering on reconnect/resume), or when a kick nils ChannelID while a VSU goroutine is in flight.

Design

Required inputs are split across two events — VSU carries Token + Endpoint, VStateU carries ChannelID — and the order isn't guaranteed. Both handlers now call a small tryOpenGateway helper that checks "all three present" and spawns one open goroutine with a value-captured state. Whichever event arrives second triggers the open.

If both events arrive close enough to spawn two opens, the second goroutine gets ErrGatewayAlreadyConnected from gateway.Open and logs once.

gateway.open also returns a new ErrGatewayNoChannelID sentinel when state.ChannelID == nil, as defense-in-depth for external Gateway consumers that bypass Conn. doReconnect treats it as terminal alongside ErrGatewayAlreadyConnected.

Effect on each ordering

Scenario Old This PR
VStateU → VSU works works
VSU → VStateU nil-deref works
VSU with stale nil ChannelID nil-deref no-op

Follow-up to #550 ((*udpConnImpl).Close nil-deref).

Comment thread voice/gateway.go Outdated
Comment thread voice/conn.go Outdated

c.state.Token = update.Token
c.state.Endpoint = *update.Endpoint
if c.state.ChannelID == nil {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

returning here means if the voice server update arrives before the voice state update it will never connect

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, missed it
I am thinking of adding a method like tryOpenGateway that spawns the gateway open goroutine once Token, Endpoint,
and ChannelID are all populated and no other open is in flight.

I will push draft changes for discussion and care about tests

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added tryOpenGateway implementation, which should protect from nil ChannelID

Updated PR description.

@sealbro sealbro force-pushed the fix/voice-gateway-channelid-nil branch from 129c151 to 838ba98 Compare June 7, 2026 14:12
The open needs Token + Endpoint (from VOICE_SERVER_UPDATE) and ChannelID
(from VOICE_STATE_UPDATE). HandleVoiceServerUpdate was the sole trigger
and assumed VOICE_STATE_UPDATE had already populated ChannelID, so it
panicked on reconnect/resume orderings and on kick races (ChannelID
nilled while the open goroutine was in flight).

Introduce tryOpenGateway, called from both handlers, that spawns the
open goroutine only when all three fields are populated (canOpen
predicate), no open is in flight (opening flag), and the gateway is in
Unconnected/Disconnected. The goroutine captures state by value and
loops with a 2*maximumConnectDelay parent context: each iteration that
finds state changed mid-flight re-opens with the fresh state; otherwise
it exits. This handles the reconnect race where stale events trigger an
open for the wrong channel while the correct VStateU/VSU pair arrive
during the open.

gateway.open now also returns a new ErrGatewayNoChannelID sentinel
when state.ChannelID is nil instead of bare-dereferencing. Internal
callers won't trigger this (conn.go's helper enforces the precondition)
but it makes the failure mode loud and clear for external Gateway
consumers. doReconnect treats it as terminal alongside
ErrGatewayAlreadyConnected.
@sealbro sealbro force-pushed the fix/voice-gateway-channelid-nil branch from 838ba98 to d3a4a48 Compare June 7, 2026 15:22
Comment thread voice/conn.go Outdated
c.stateMu.Unlock()
parentCancel()
}()
for {

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not a fan of this at all, this just spams connects...
you can just call open in either voice server or voice state update
after either all data is there after all

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplified and dropped the loop. Both handlers now call tryOpenGateway, which just checks "all three present" and spawns a single open goroutine.
I will add a check on my side.

@sealbro sealbro force-pushed the fix/voice-gateway-channelid-nil branch from 0a10a06 to e9f148b Compare June 8, 2026 16:34
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Development

Successfully merging this pull request may close these issues.

2 participants