term: clamp zero window size from Docker PTY #959

Merged
pkova merged 2 commits into urbit:develop from tomholford:th/i/159/fix-docker-tty-zero-winsize
Feb 17, 2026

@tomholford tomholford commented Feb 6, 2026

Contributor

Summary

  • When Docker allocates a PTY (tty: true + stdin_open: true) but nobody is attached, TIOCGWINSZ succeeds but reports 0x0
  • This causes unsigned underflow in row_l - 1 calculations throughout term.c and a decrement-underflow crash in drum when it receives %blew [0 0]
  • Clamp to 80x24 defaults when ioctl returns zero dimensions (matching existing fallback when ioctl fails)
  • Fix initial default row_l from 0 to 24

Resolves #159.
See also urbit/urbit#4750
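
The underflow and the clamp described in the summary can be sketched in a few lines of C. This is a minimal illustration, not the code from the PR: the `term_size` struct, the helper names, and the choice to clamp each dimension independently are assumptions made for the example; only the field names `col_l`/`row_l` and the 80x24 defaults come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the terminal size state; the real layout in
   term.c is not shown in this PR, so this struct is an assumption. */
typedef struct {
  uint32_t col_l;  /* columns */
  uint32_t row_l;  /* rows */
} term_size;

/* With unsigned arithmetic, 0 - 1 wraps around to UINT32_MAX rather than
   producing -1, which is the underflow the summary refers to. */
static uint32_t last_row(uint32_t row_l)
{
  return row_l - 1;
}

/* Replace any zero dimension with the 80x24 default (hypothetical helper;
   clamping each dimension separately is an assumption). */
static void clamp_size(term_size* siz_u)
{
  assert(siz_u != NULL);
  if ( 0 == siz_u->col_l ) siz_u->col_l = 80;
  if ( 0 == siz_u->row_l ) siz_u->row_l = 24;
}
```

Calling `clamp_size` on a `{0, 0}` size before it is used in any `row_l - 1` expression, or reported in `%blew`, keeps both consumers away from the zero case.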

Disclaimer

I made this change using a coding agent. Please pardon any slop.

Test plan

  • Build with zig build
  • Cross-compile with zig build -Dtarget=aarch64-linux-musl -Doptimize=ReleaseFast
  • Run in Docker with tty: true + stdin_open: true and no docker attach — ship boots without crashing
  • Verify docker attach / detach still works after boot

Reproduction steps

# Cross-compile for Linux
zig build -Dtarget=aarch64-linux-musl -Doptimize=ReleaseFast

# Build Docker image
cp zig-out/aarch64-linux-musl/urbit docker/
docker build -t vere-test docker/

# Create volume with comet file
docker volume create test-comet
docker run --rm -v test-comet:/urbit alpine touch /urbit/test-ship.comet

# Run with tty + stdin_open, no attach (previously crashed)
docker run -d --name test-vere -t -i -v test-comet:/urbit vere-test

# Verify ship boots
docker logs -f test-vere

# Cleanup
docker stop test-vere && docker rm test-vere
docker volume rm test-comet

When Docker allocates a PTY (tty: true + stdin_open: true) but nobody
is attached, TIOCGWINSZ succeeds but reports 0x0. This causes unsigned
underflow in row_l - 1 calculations throughout term.c and a
decrement-underflow crash in Arvo's drum when it receives %blew [0 0].

Clamp to 80x24 defaults when ioctl returns zero dimensions, matching
the existing fallback when ioctl fails. Also fix the initial default
row_l from 0 to 24 to match col_l.

Resolves urbit#159.
@tomholford tomholford marked this pull request as ready for review February 8, 2026 01:40
@tomholford tomholford requested a review from a team as a code owner February 8, 2026 01:40
@Fang- Fang- self-requested a review February 9, 2026 16:29
Fang- previously approved these changes Feb 9, 2026

@Fang- Fang- left a comment

Member

Regardless of where the dimensions come from, zeroes in %blew are bogus, so clamping like this is good. Arguably 0 == col_l instead of !col_l would be semantically clearer, but it's C, so readers know.

pkova commented Feb 9, 2026

Collaborator

What version of vere are you running when you blow up with your reproduction commands? We've been unable to reproduce on linux-x86_64 and linux-aarch64.

@tomholford

Contributor Author

> What version of vere are you running when you blow up with your reproduction commands? We've been unable to reproduce on linux-x86_64 and linux-aarch64.

@pkova:

The issue was observed with tloncorp/vere:latest (v4.2, and every version before since #159 and urbit/urbit#4750 were opened) on an x86_64 VPS (Ubuntu 24.04, Docker 29.2.1), running with tty: true + stdin_open: true in docker-compose. The container crashes on initial boot when nobody has attached — TIOCGWINSZ succeeds but returns 0x0, and drum crashes on %blew [0 0].

FWIW, for a long time I've been operating with this workaround.

I tested the fix in this PR by cross-compiling from this branch for aarch64-linux-musl and booting a comet in Docker (via OrbStack on macOS) with -t -i and no attach. Apologies for the lack of clarity, but I didn't reproduce locally on an unpatched build first. Happy to do so if that would help.

Worth mentioning that the reproduction is timing-sensitive: once anyone runs docker attach (even briefly), Docker sets real dimensions on the PTY via TIOCSWINSZ, and the PTY retains those values even after detach.
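
The retention behavior can be checked outside Docker with a bare PTY. The sketch below is an illustration under assumptions: it opens `/dev/ptmx` directly (as available on Linux), the helper name `winsize_roundtrip` is hypothetical, and it does not involve Docker or vere at all. It reads the window size of a fresh PTY, sets one with `TIOCSWINSZ`, and reads it back.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical helper: open a fresh PTY master, record its initial window
   size in *before, set it to cols x rows, and record the size read back in
   *after. Returns 0 on success, -1 if the PTY could not be opened or an
   ioctl failed. */
static int winsize_roundtrip(unsigned short cols, unsigned short rows,
                             struct winsize* before, struct winsize* after)
{
  assert(before != NULL && after != NULL);

  /* Assumption: /dev/ptmx exists and is accessible (typical on Linux). */
  int fd = open("/dev/ptmx", O_RDWR | O_NOCTTY);
  if ( fd < 0 ) return -1;

  struct winsize want = { .ws_row = rows, .ws_col = cols };
  int ret = -1;

  if (  (0 == ioctl(fd, TIOCGWINSZ, before))   /* size before anyone sets it */
     && (0 == ioctl(fd, TIOCSWINSZ, &want))    /* what an attaching client does */
     && (0 == ioctl(fd, TIOCGWINSZ, after)) )  /* size the PTY now retains */
  {
    ret = 0;
  }
  close(fd);
  return ret;
}
```

If the call succeeds, `after` should hold the dimensions that were set, while `before` shows what a process sees when it queries a PTY that no client has sized yet.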

@tomholford

Contributor Author

Addressed @Fang-'s feedback (explicit zero comparison) in 467ad68.

@tomholford

Contributor Author

Tested on a live x86_64 VPS (Ubuntu, Docker 28.1.1) — the same environment where the issue was originally observed.

Steps:

  1. Cross-compiled from this branch for x86_64-linux-musl
  2. Built Docker image from docker/Dockerfile with the custom binary
  3. Transferred image to VPS via docker save/docker load
  4. Added a throwaway comet service to the existing docker-compose with tty: true + stdin_open: true (same config as the affected ships)
  5. Started the container without attaching
# 1. Cross-compile
zig build -Dtarget=x86_64-linux-musl -Doptimize=ReleaseFast

# 2. Build Docker image
cp zig-out/x86_64-linux-musl/urbit docker/
docker build --platform linux/amd64 -t vere-test-x86 docker/

# 3. Transfer to VPS
docker save vere-test-x86 | gzip > /tmp/vere-test-x86.tar.gz
scp /tmp/vere-test-x86.tar.gz user@vps:/tmp/
# on VPS:
docker load < /tmp/vere-test-x86.tar.gz

# 4. Add to docker-compose.yml under services:
#   test-comet:
#     image: vere-test-x86
#     volumes:
#       - ~/urbit/test-comet:/urbit
#     stdin_open: true
#     tty: true
mkdir -p ~/urbit/test-comet && touch ~/urbit/test-comet/test.comet

# 5. Start without attaching
docker compose up -d test-comet

Result: Comet booted successfully and has been running for 8+ hours with no crash or attach. Dojo prompt is active, gall apps loaded normally.

@pkova pkova merged commit 4e2a283 into urbit:develop Feb 17, 2026
2 checks passed
pkova added a commit that referenced this pull request Feb 18, 2026
This reverts commit 4e2a283, reversing
changes made to de22bf0.
pkova commented Feb 18, 2026

Collaborator

FYI I reverted this because of a regression in terminal behavior on at least macos-aarch64.

look at all the strange empty lines
urbit 4.3-245ecf8
boot: home is pub
loom: mapped 2048MB
lite: arvo formula 4ce68411
lite: core 641296f
lite: final state 641296f




loom: mapped 2048MB
boot: protected loom
live: logical boot
boot: installed 1770 jets
boot: parsing %brass pill






























1-b
1-c (compiling compiler, wait a few minutes)
took µs/2
1-d
took µs/7
1-e
ride: parsing
ride: compiling
ride: compiled
took ms/26.195
1-f
took µs/3
1-g
took ms/2.014
lull: 0vc.3u2io
zuse: 0ve.tug7l
vane: %ames: 0vj.05pdc
vane: %behn: 0v1s.oesbr
vane: %clay: 0v4.aismq
vane: %dill: 0v1l.85ao1
vane: %eyre: 0vg.feekf
vane: %gall: 0vk.ie9q7
vane: %iris: 0v1p.u1u45
vane: %jael: 0v18.cmvih
vane: %khan: 0vi.g4115
vane: %lick: 0vt.n7tn5
arvo: metamorphosis
ames: metamorphosis on %call
clay: kernel updated
clay: rebuilding %base after kernel update
gall: installing %acme
gall: installing %azimuth
gall: installing %dbug
gall: installing %dojo
gall: installing %eth-watcher
gall: installing %hood
drum: link [~pub %dojo]
kiln: boot
gall: installing %herm
gall: installing %lens
gall: installing %ping
gall: installing %spider
gall: installing %docket
gall: installing %hark
gall: installing %settings
gall: installing %storage
gall: installing %treaty
gall: installing %vitals
docket: fetching %http glob for %webterm desk
docket: fetching %http glob for %landscape desk
docket: fetching %http glob for %webterm desk
docket: fetching %http glob for %landscape desk
docket: fetching %http glob for %groups desk
gall: installing %logs
gall: installing %groups
gall: installing %chat
gall: installing %notify
gall: installing %grouper
gall: installing %groups-ui
gall: installing %channels
gall: installing %channels-server
gall: installing %profile
gall: installing %activity
gall: installing %contacts
gall: installing %reel
gall: installing %bait
gall: installing %growl
gall: installing %bark
gall: installing %genuine
gall: installing %lanyard
gall: installing %dumb-proxy
gall: installing %expose
gall: installing %metagrab
clay: base is always essential
--------------- bootstrap complete ----------------
dock: pace (once): configured at pub/.bin/pace
vere: binary copy succeeded
disk: loaded epoch 0i0
loom: mapped 2048MB
boot: protected loom
live: mapped: MB/762.478.592
boot: installed 1770 jets
vere: checking version compatibility
lick: init mkdir pub/.urb/dev
mesa: INIT
mesa: forwarding enabled
ames: skipping port: 31503
ames: live on 0 (localhost only)
mesa: live on 31503 (localhost only)
conn: listening on pub/.urb/conn.sock
ames: unix-duct received on %born
http: web interface live on http://localhost:80
http: loopback live on http://localhost:12321
pier (34): live
docket: fetching %http glob for %landscape desk
mdns: fake-pub registered on all interfaces
> 
> 
~pub:dojo>
this is normal
urbit 4.3-cf4d8cf
boot: home is pub
loom: mapped 2048MB
lite: arvo formula 4ce68411
lite: core 641296f
lite: final state 641296f
boot: downloading pill https://bootstrap.urbit.org/urbit-v4.3.pill
disk: loaded epoch 0i0
loom: mapped 2048MB
boot: protected loom
live: logical boot
boot: installed 1770 jets
boot: parsing %brass pill
--------------- bootstrap starting ----------------
boot: 1-21
1-b
1-c (compiling compiler, wait a few minutes)
took µs/3
1-d
took µs/7
1-e
ride: parsing
ride: compiling
ride: compiled
took ms/26.775
1-f
took µs/2
1-g
took ms/1.965
lull: 0vc.3u2io
zuse: 0ve.tug7l
vane: %ames: 0vj.05pdc
vane: %behn: 0v1s.oesbr
vane: %clay: 0v4.aismq
vane: %dill: 0v1l.85ao1
vane: %eyre: 0vg.feekf
vane: %gall: 0vk.ie9q7
vane: %iris: 0v1p.u1u45
vane: %jael: 0v18.cmvih
vane: %khan: 0vi.g4115
vane: %lick: 0vt.n7tn5
arvo: metamorphosis
ames: metamorphosis on %call
clay: kernel updated
clay: rebuilding %base after kernel update
gall: installing %acme
gall: installing %azimuth
gall: installing %dbug
gall: installing %dojo
gall: installing %eth-watcher
gall: installing %hood
drum: link [~pub %dojo]
kiln: boot
gall: installing %herm
gall: installing %lens
gall: installing %ping
gall: installing %spider
gall: installing %docket
gall: installing %hark
gall: installing %settings
gall: installing %storage
gall: installing %treaty
gall: installing %vitals
docket: fetching %http glob for %webterm desk
docket: fetching %http glob for %landscape desk
docket: fetching %http glob for %webterm desk
docket: fetching %http glob for %landscape desk
docket: fetching %http glob for %groups desk
gall: installing %logs
gall: installing %groups
gall: installing %chat
gall: installing %notify
gall: installing %grouper
gall: installing %groups-ui
gall: installing %channels
gall: installing %channels-server
gall: installing %profile
gall: installing %activity
gall: installing %contacts
gall: installing %reel
gall: installing %bait
gall: installing %growl
gall: installing %bark
gall: installing %genuine
gall: installing %lanyard
gall: installing %dumb-proxy
gall: installing %expose
gall: installing %metagrab
clay: base is always essential
--------------- bootstrap complete ----------------
dock: pace (once): configured at pub/.bin/pace
vere: binary copy succeeded
disk: loaded epoch 0i0
loom: mapped 2048MB
boot: protected loom
live: mapped: MB/762.478.592
boot: installed 1770 jets
vere: checking version compatibility
lick: init mkdir pub/.urb/dev
mesa: INIT
mesa: forwarding enabled
ames: skipping port: 31503
ames: live on 0 (localhost only)
mesa: live on 31503 (localhost only)
conn: listening on pub/.urb/conn.sock
ames: unix-duct received on %born
http: web interface live on http://localhost:80
http: loopback live on http://localhost:12321
pier (34): live
docket: fetching %http glob for %landscape desk
mdns: fake-pub registered on all interfaces
~pub:dojo>

This needs to be fixed if we want this to go in.

@tomholford

Contributor Author

> FYI I reverted this because of a regression in terminal behavior on at least macos-aarch64.
>
> look at all the strange empty lines
>
> This needs to be fixed if we want this to go in.

Thanks, will investigate

tomholford added a commit to tomholford/vere that referenced this pull request Feb 18, 2026
When Docker allocates a PTY (tty: true + stdin_open: true) but nobody
is attached, TIOCGWINSZ succeeds but reports 0x0. This causes a
decrement-underflow crash in drum when it receives %blew [0 0].

Clamp to 80x24 defaults when ioctl returns zero dimensions, matching
the existing fallback when ioctl fails entirely.

Unlike the prior attempt (urbit#959), this does not change the initial
row_l from 0 — that value serves as a sentinel throughout term.c
to skip cursor positioning during early boot (checked by hija/loja
and the spinner). Changing it to 24 caused blank lines in terminal
output on macOS.

Resolves urbit#159.
pkova added a commit that referenced this pull request Mar 13, 2026
## Summary

- When Docker allocates a PTY (`tty: true` + `stdin_open: true`) but
nobody is attached, `TIOCGWINSZ` succeeds but reports 0x0
- This causes a `decrement-underflow` crash in drum when it receives
`%blew [0 0]`
- Clamp to 80x24 defaults when ioctl returns zero dimensions (matching
existing fallback when ioctl fails)

This is a v2 of #959, which was merged then reverted due to a
regression. The prior version also changed the initial `row_l` from 0 to
24, which broke terminal output on macOS — `row_l = 0` is a sentinel
used by `hija`/`loja` and the spinner to skip cursor positioning during
early boot. Setting it to 24 caused every `u3l_log()` call before the
first `%blew` event to reposition the cursor, producing blank lines.
This version keeps `row_l = 0` and only adds the clamping in
`u3_term_get_blew`.
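
The sentinel idea can be sketched as follows. This is an illustration only: `format_log` is a hypothetical helper, not a function in `term.c`, and the escape sequence it emits (cursor move to the start of a given row) is an assumption about what "cursor positioning" involves. It only shows why a zero `row_l` yields plain text while a nonzero value adds cursor movement to every line.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: format one log line into buf.
   row_l == 0 is treated as "window size not known yet", so the line is
   written as plain text. Any other value prefixes a cursor-position
   sequence (ESC [ row ; 1 H). Returns snprintf's result, i.e. the
   number of bytes the full line needs, excluding the terminator. */
static int format_log(char* buf, size_t len, uint32_t row_l, const char* msg)
{
  assert(buf != NULL && msg != NULL);

  if ( 0 == row_l ) {
    /* sentinel: skip cursor positioning during early boot */
    return snprintf(buf, len, "%s\n", msg);
  }
  return snprintf(buf, len, "\033[%u;1H%s\n", (unsigned)row_l, msg);
}
```

With the initial value changed to 24, every early call would take the second branch before any real size is known, which is consistent with the blank lines reported on macOS.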

Resolves #159.
See also [urbit/urbit#4750](urbit/urbit#4750)

## Test results

**Docker on x86_64 Linux VPS** (Ubuntu, `tty: true` + `stdin_open:
true`, no attach):
- Cross-compiled for `x86_64-linux-musl`, built Docker image,
transferred to VPS
- Fresh comet booted successfully, ran 8+ minutes with no crash
- Container config matches the exact scenario that triggers the bug

**macOS aarch64** (native build, PTY via `script`):
- Fresh comet booted, full bootstrap completed
- No spurious blank lines in boot output (the regression from #959 is
fixed)
- Early boot log lines use plain text (no cursor positioning),
confirming `row_l = 0` sentinel works correctly

Development

Successfully merging this pull request may close these issues.

Docker: tloncorp/vere bails during launch

3 participants