dmsg client fails to accept stream with error="dmsg error 401 - listener accept chan maxed" #1803
Reproducing the issue
I've devised a simple way to reproduce the issue here. Create a transport to the same visor via dmsg
Copy the transport ID in place of the existing transport id in the following shell command
Before all the loops have completed, the visor is no longer able to accept transports. It may take over an hour to reach this state. In one such test, after creating and removing the transport 2834 times, it was no longer possible for the visor to accept transports. In another test it happened in fewer than 300 loops, and in a third in fewer than 700. The failure does not occur at a consistent number of loops, but it does always happen eventually.
Running the same test again after having increased the accept buffer size to 50 from 20. The
After re-building to develop branch, the error is now consistently here is the visor debug logging of that
It seems in this instance the errors begin with failure to register the transport with transport discovery, instead of failure to accept the transport
I will recommence the tests with a brief sleep between transport creation attempts.
When a 2 second sleep is added to the loop, the issue seems to no longer manifest. The test is ongoing, ~3560 loops in without the error. However, this does not prevent any given visor from being inundated with transport requests which might trigger the
Update
The visor did stop being able to accept transports after the 4427th loop with
I don't have the visor debug logging of the exact moment this occurred because it's beyond the scrollback history of the terminal. The visor is currently unable to accept transports, though the test ended hours ago.
The non-transportability aspect of this issue was addressed in a temporary way in #1807.
When performing the following test of the transport setup-node functionality (which also performs a /health check of the visor over dmsghttp using dmsgweb):
Long before the conclusion of the test, the visor enters a state where it can no longer accept transports.
Oddly, it is still possible to create transports to other visors; it's simply not possible to accept them, or for remote visors to create a transport to the local visor.
Here I check a public key from the end of the above test to see if it's online (it is)
I add and remove a transport to that visor from the local visor as a test - it works
Then I attempt to create a transport from the remote visor to my local visor with the transport setup-node
It fails with the following visor debug logging
Conclusion
It's clear there is some issue with either too many open connections or connections not getting closed properly.