term: clamp zero window size from Docker PTY #959
When Docker allocates a PTY (tty: true + stdin_open: true) but nobody is attached, TIOCGWINSZ succeeds but reports 0x0. This causes unsigned underflow in row_l - 1 calculations throughout term.c and a decrement-underflow crash in Arvo's drum when it receives %blew [0 0]. Clamp to 80x24 defaults when ioctl returns zero dimensions, matching the existing fallback when ioctl fails. Also fix the initial default row_l from 0 to 24 to match col_l. Resolves urbit#159.
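As a rough illustration of the description above (not the actual term.c code), the underflow and the clamp can be sketched as follows. The names `win_size`, `last_row`, `clamp_size`, and `get_size` are hypothetical, the dimensions are assumed to be 32-bit unsigned values, and clamping each dimension independently is an assumption of this sketch.

```c
#include <stdint.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical container for terminal dimensions (not a vere type). */
typedef struct {
  uint32_t col_l;
  uint32_t row_l;
} win_size;

/* The kind of arithmetic that misbehaves: with row_l == 0 the unsigned
 * subtraction wraps around to UINT32_MAX instead of going negative. */
uint32_t
last_row(uint32_t row_l)
{
  return row_l - 1;
}

/* Replace zero dimensions with the 80x24 defaults. Assumption: each
 * dimension is clamped on its own. */
win_size
clamp_size(uint32_t col_l, uint32_t row_l)
{
  win_size siz = { col_l, row_l };

  if ( 0 == siz.col_l ) siz.col_l = 80;
  if ( 0 == siz.row_l ) siz.row_l = 24;
  return siz;
}

/* Query the window size of fd. A failed ioctl and a successful ioctl
 * that reports 0x0 both end up at the same 80x24 fallback. */
win_size
get_size(int fd)
{
  struct winsize ws;

  if ( 0 != ioctl(fd, TIOCGWINSZ, &ws) ) {
    return clamp_size(0, 0);
  }
  return clamp_size(ws.ws_col, ws.ws_row);
}
```

With a detached Docker PTY, `get_size` in this sketch would take the second path: the ioctl succeeds, both fields are zero, and the clamp substitutes the defaults before anything computes `row_l - 1`.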
Fang-
left a comment
Regardless of where the dimensions come from, zeroes in %blew are bogus, so clamping like this is good. Arguably 0 == col_l instead of !col_l would be semantically clearer, but it's C, so readers know.
What version of vere are you running when you blow up with your reproduction commands? We've been unable to reproduce on linux-x86_64 and linux-aarch64.
The issue was observed with FWIW, for a long time I've been operating with this workaround. I tested the fix in this PR by cross-compiling from this branch for aarch64-linux-musl and booting a comet in Docker (via OrbStack on macOS) with Worth mentioning that the reproduction is timing-sensitive: once anyone
Tested on a live x86_64 VPS (Ubuntu, Docker 28.1.1) — the same environment where the issue was originally observed. Steps:
# 1. Cross-compile
zig build -Dtarget=x86_64-linux-musl -Doptimize=ReleaseFast
# 2. Build Docker image
cp zig-out/x86_64-linux-musl/urbit docker/
docker build --platform linux/amd64 -t vere-test-x86 docker/
# 3. Transfer to VPS
docker save vere-test-x86 | gzip > /tmp/vere-test-x86.tar.gz
scp /tmp/vere-test-x86.tar.gz user@vps:/tmp/
# on VPS:
docker load < /tmp/vere-test-x86.tar.gz
# 4. Add to docker-compose.yml under services:
#   test-comet:
#     image: vere-test-x86
#     volumes:
#       - ~/urbit/test-comet:/urbit
#     stdin_open: true
#     tty: true
mkdir -p ~/urbit/test-comet && touch ~/urbit/test-comet/test.comet
# 5. Start without attaching
docker compose up -d test-comet

Result: Comet booted successfully and has been running for 8+ hours with no crash or attach. Dojo prompt is active, gall apps loaded normally.
FYI I reverted this because of a regression in terminal behavior on at least macos-aarch64.
[screenshot: look at all the strange empty lines]
[screenshot: this is normal]
This needs to be fixed if we want this to go in.
Thanks, will investigate |
When Docker allocates a PTY (tty: true + stdin_open: true) but nobody is attached, TIOCGWINSZ succeeds but reports 0x0. This causes a decrement-underflow crash in drum when it receives %blew [0 0]. Clamp to 80x24 defaults when ioctl returns zero dimensions, matching the existing fallback when ioctl fails entirely. Unlike the prior attempt (urbit#959), this does not change the initial row_l from 0 — that value serves as a sentinel throughout term.c to skip cursor positioning during early boot (checked by hija/loja and the spinner). Changing it to 24 caused blank lines in terminal output on macOS. Resolves urbit#159.
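To make the sentinel argument concrete, here is a minimal model of the behavior described above. It is not the code in term.c: `term_state` and `format_log` are hypothetical names, and the only thing the sketch assumes from the source is that `row_l == 0` means "window size not known yet, skip cursor positioning".

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical terminal state; row_l == 0 is the "size unknown" sentinel. */
typedef struct {
  uint32_t col_l;
  uint32_t row_l;
} term_state;

/* Format one log line into buf and return the snprintf length.
 * While the sentinel is set, emit plain text. Once a real size is known,
 * prefix the line with the ANSI sequence ESC[<row>;1H, which moves the
 * cursor to column 1 of the bottom row before printing. */
int
format_log(const term_state* sat, const char* msg, char* buf, size_t len)
{
  if ( 0 == sat->row_l ) {
    return snprintf(buf, len, "%s\n", msg);
  }
  return snprintf(buf, len, "\033[%u;1H%s\n", (unsigned)sat->row_l, msg);
}
```

In this model, initializing `row_l` to 24 instead of 0 would send every early log line down the cursor-positioning branch, which is the kind of change that produced the blank lines in #959. Clamping only at the point where the size is queried leaves the sentinel intact.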
## Summary

- When Docker allocates a PTY (`tty: true` + `stdin_open: true`) but nobody is attached, `TIOCGWINSZ` succeeds but reports 0x0
- This causes a `decrement-underflow` crash in drum when it receives `%blew [0 0]`
- Clamp to 80x24 defaults when ioctl returns zero dimensions (matching existing fallback when ioctl fails)

This is a v2 of #959, which was merged then reverted due to a regression. The prior version also changed the initial `row_l` from 0 to 24, which broke terminal output on macOS: `row_l = 0` is a sentinel used by `hija`/`loja` and the spinner to skip cursor positioning during early boot. Setting it to 24 caused every `u3l_log()` call before the first `%blew` event to reposition the cursor, producing blank lines. This version keeps `row_l = 0` and only adds the clamping in `u3_term_get_blew`.

Resolves #159.

See also [urbit/urbit#4750](urbit/urbit#4750)

## Test results

**Docker on x86_64 Linux VPS** (Ubuntu, `tty: true` + `stdin_open: true`, no attach):

- Cross-compiled for `x86_64-linux-musl`, built Docker image, transferred to VPS
- Fresh comet booted successfully, ran 8+ minutes with no crash
- Container config matches the exact scenario that triggers the bug

**macOS aarch64** (native build, PTY via `script`):

- Fresh comet booted, full bootstrap completed
- No spurious blank lines in boot output (the regression from #959 is fixed)
- Early boot log lines use plain text (no cursor positioning), confirming `row_l = 0` sentinel works correctly
Summary
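- When Docker allocates a PTY (`tty: true` + `stdin_open: true`) but nobody is attached, `TIOCGWINSZ` succeeds but reports 0x0
- This causes unsigned underflow in `row_l - 1` calculations throughout term.c and a `decrement-underflow` crash in drum when it receives `%blew [0 0]`
- Clamp to 80x24 defaults when ioctl returns zero dimensions, matching the existing fallback when ioctl fails
- Also fix the initial default `row_l` from 0 to 24

Resolves #159.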
See also urbit/urbit#4750
Disclaimer
I made this change using a coding agent. Please pardon any slop.
Test plan
- `zig build`
- `zig build -Dtarget=aarch64-linux-musl -Doptimize=ReleaseFast`
- `tty: true` + `stdin_open: true` and no `docker attach`: ship boots without crashing
- `docker attach` / detach still works after boot

Reproduction steps