AudioSender spams encrypt errors while the DAVE epoch isn't ready (on channel join/move) #555

@thomas-vilte

Describe the bug
With DAVE on, whenever the bot joins or moves to a new voice channel there's a short window where it can't encrypt yet — the MLS epoch hasn't been set up. The problem is the audio sender doesn't know that, so it just keeps grabbing frames and calling Encrypt, which fails the whole time. Each of those failures goes through handleErr and gets logged as an ERROR, so you get a little burst of errors on every channel switch (~5 from the silence frames, more if the epoch takes longer or has to recover). And since it's still pulling frames off the provider while it fails, you lose a bit of audio each time too.

Error

level=ERROR msg="failed to send audio" name=bot name=voice name=voice_conn err="failed to encrypt packet: session: session: no active epoch" 

(that error string is from my own dave impl — the point is it hits handleErr once per frame while the epoch is coming up)

To Reproduce
Steps to reproduce the behavior:

  1. Turn on DAVE (a real godave.Session, not the noop one)
  2. Start playing audio
  3. Move the bot between channels, or have it follow someone hopping channels
  4. Watch the logs during the switch

conn := client.VoiceManager.CreateConn(guildID)
conn.Open(ctx, channelA, false, true)
conn.SetOpusFrameProvider(provider)
// playing...
// now move — new MLS handshake kicks off:
conn.Open(ctx, channelB, false, true)
// during the handshake send() keeps calling Encrypt -> one ERROR per frame

Expected behavior
I'd expect the sender to just sit quiet while the session can't encrypt, instead of hammering Encrypt and logging an error per frame. Once the epoch's ready it picks back up — and ideally without eating frames in the meantime so the audio doesn't skip.

Screenshots
(Lun's got videos/screenshots of it happening on a channel switch — will attach)

Disgo Version:

  • v0.19.6

Additional context
Heads up: I'm on my own dave impl (dave-go), not golibdave, so I can't say for sure golibdave throws the exact same error. But the gap is in send(), which doesn't care which session you're using, so I'd be surprised if golibdave didn't hit the same window during epoch setup.

I've got a fix running locally: added a Ready() bool to godave.Session (noop just returns true), kept the session on connImpl and exposed it on Conn, then bail at the top of send() when it's not ready. 0 errors across 14 channel moves vs ~5 each before. Glad to put up a PR for the disgo + godave side if you're good with the approach.
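Roughly what the approach looks like, as a standalone sketch rather than the actual patch. The `Session` interface here is heavily simplified (the real godave.Session has more methods and a different Encrypt signature), and `epochSession`, `countingProvider` and `sender` are hypothetical names used only to make the example run.

```go
package main

import (
	"errors"
	"fmt"
)

// Session is a simplified, hypothetical stand-in for godave.Session.
// Ready is the method proposed in this issue.
type Session interface {
	Encrypt(frame []byte) ([]byte, error)
	Ready() bool
}

// noopSession models the DAVE-off case: frames pass through untouched and
// the session is always ready, so the gate in send never blocks it.
type noopSession struct{}

func (noopSession) Encrypt(frame []byte) ([]byte, error) { return frame, nil }
func (noopSession) Ready() bool                          { return true }

// epochSession is a hypothetical session that is ready only once it has an epoch.
type epochSession struct{ hasEpoch bool }

func (s *epochSession) Ready() bool { return s.hasEpoch }

func (s *epochSession) Encrypt(frame []byte) ([]byte, error) {
	if !s.hasEpoch {
		return nil, errors.New("no active epoch")
	}
	return frame, nil
}

// countingProvider hands out one-byte frames and counts how often it is asked.
type countingProvider struct{ calls int }

func (p *countingProvider) ProvideOpusFrame() ([]byte, error) {
	p.calls++
	return []byte{byte(p.calls)}, nil
}

// sender is a hypothetical, stripped-down audio sender.
type sender struct {
	session  Session
	provider *countingProvider
	sent     [][]byte
}

// send handles one tick. When the session cannot encrypt yet it returns nil
// before touching the provider, so nothing is logged and no frame is consumed.
func (s *sender) send() error {
	if !s.session.Ready() {
		return nil
	}
	frame, err := s.provider.ProvideOpusFrame()
	if err != nil {
		return fmt.Errorf("failed to provide frame: %w", err)
	}
	packet, err := s.session.Encrypt(frame)
	if err != nil {
		return fmt.Errorf("failed to encrypt packet: %w", err)
	}
	s.sent = append(s.sent, packet)
	return nil
}

func main() {
	sess := &epochSession{}
	s := &sender{session: sess, provider: &countingProvider{}}
	for i := 0; i < 5; i++ {
		if err := s.send(); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("before epoch: provider calls =", s.provider.calls, "sent =", len(s.sent))

	sess.hasEpoch = true // handshake finished
	for i := 0; i < 3; i++ {
		if err := s.send(); err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Println("after epoch: provider calls =", s.provider.calls, "sent =", len(s.sent))
}
```

The check sits before the provider call on purpose: gating after the frame is pulled would silence the errors but still drop audio during the handshake.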