FIX 179: Static binaries can't exit cleanly#207

Open
megastallman wants to merge 44 commits into
jart:masterfrom
megastallman:master
@megastallman

Co-authored-by: Antigravity/Gemini3pro
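The issue was that Blink set RDX to the program name at process entry. Under the x86-64 System V ABI, RDX at entry holds a function pointer that startup code registers with atexit(), so static binaries treated the program name as an exit-time destructor and crashed when calling it. I changed Blink to set RDX to 0 for non-Cosmo binaries, which fixes the crash on exit.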
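
A minimal sketch of the decision described above, not the actual Blink code: `InitialRdx` and its parameters are hypothetical names, and the only behavior taken from the description is "program name for Cosmo binaries, zero otherwise".

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Hypothetical helper: picks the value the emulator places in RDX before
// jumping to the guest entry point.
uint64_t InitialRdx(bool is_cosmo_binary, uint64_t progname_addr) {
  // A nonzero RDX at entry is taken by SysV startup code as a function
  // pointer to register with atexit(); zero means nothing to register.
  return is_cosmo_binary ? progname_addr : 0;
}
```

With this shape, a static non-Cosmo binary sees RDX == 0 and has no bogus exit handler to call.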

Co-authored-by: Antigravity/Gemini3pro
megastallman and others added 28 commits January 15, 2026 13:21
- FreeBSD signal frame: build proper FreeBSD-shaped ucontext_t/mcontext_t/
  siginfo_t in DeliverSignal/SigRestore instead of Linux layouts, so libthr
  doesn't clobber the saved RIP on handler exit.
- SignalActor: add the sigtramp intercept that ExecuteInstruction already
  has. Recursive signal delivery (CheckInterrupt on EINTR) was bypassing it.
- Auto-generate /var/run/ld-elf.so.hints when missing so pkg-installed
  binaries find /usr/local/lib without a manual ldconfig.
- /dev/null & co: always satisfy standard char devices from the host. The
  ENOENT fallback didn't catch EACCES from O_CREAT shell redirects against
  a read-only chroot /dev — broke pkg PRE-INSTALL scripts (nginx).
- FixupShellEnv: rewrite SHELL=/bin/bash to a chroot-resident shell at
  load time so mc's subshell doesn't fall through to host bash.
- Syscall mappings: pathconf (191), fpathconf (192), getsid (310),
  setresuid (311), setresgid (312), sigsuspend (341).

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… audit) + sysctls

su now works. Adds:
- getdtablesize (89), setpriority (96), getpriority (100) — handlers in
  blink that return sensible defaults rather than ENOSYS.
- setegid (182), seteuid (183) — reshape rdi into Linux setresuid/setresgid
  arg layout (rdi=-1, rsi=euid/egid, rdx=-1).
- audit family (445-453) — shared stub returning 0.
- sysctlbyname kern.securelevel → -1 (permissive), kern.console → empty.
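
The seteuid/setegid reshaping above can be sketched as follows. `struct GuestArgs` and `ReshapeSetEffectiveId` are hypothetical names for illustration; the register layout (rdi=-1, rsi=id, rdx=-1) is the one stated in the list.

```c
#include <assert.h>
#include <stdint.h>

// Hypothetical view of the three argument registers of a guest syscall.
struct GuestArgs {
  uint64_t rdi, rsi, rdx;
};

// FreeBSD seteuid(id)/setegid(id) pass the id in rdi. Linux
// setresuid/setresgid take (real, effective, saved), where -1 means
// "leave unchanged".
void ReshapeSetEffectiveId(struct GuestArgs *a) {
  a->rsi = a->rdi;        // effective id moves to the second argument
  a->rdi = (uint64_t)-1;  // real id unchanged
  a->rdx = (uint64_t)-1;  // saved id unchanged
}
```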

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
…_waiters==0

The previous impl returned 0 immediately when ucond->c_has_waiters was 0.
The FreeBSD kernel ALWAYS sets c_has_waiters = 1 then sleeps for CV_WAIT,
regardless of the prior value — libthr counts on that to wake correctly.

Returning early looked like a spurious wake to libthr: the thread stayed
on its userspace sleepq, the caller looped back into pthread_cond_wait,
and the second sleepq_add panic'd with "thread %p was already on queue"
at thr_cond.c:285. LibreOffice's Qt thread pool reproducibly hit this.

Now mirror the kernel: atomically store 1 into c_has_waiters, then sleep
on the futex with val=1.

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
… accept4)

FreeBSD/amd64 Go binaries crashed during startup/runtime through a chain of
sequential bugs; fix each so cpuburner and an HTTP server/client over the
kqueue netpoller run to completion and exit cleanly.

- sigaltstack: translate FreeBSD flag *values* (SS_DISABLE=0x4) to Linux
  (0x2), not just the struct field order. Go disabling its altstack passed
  flags=0x4 which hit the unsupported-flags EINVAL path, so Go's
  runtime.sigaltstack crash-on-failure stub SIGSEGV'd.
- kqueue: the kqueue->epoll layer is gated on HAVE_EPOLL_PWAIT1, but the
  configure probe passed NULL events and tripped -Werror=nonnull on modern
  glibc, leaving the define off -> Go netpoll did kqueue()->ENOSYS->fatal.
  Fix tool/config/epoll_pwait{1,2}.c to pass a real events pointer.
- sched_yield: map FreeBSD 331 -> Linux 0x18 (was unmapped -> busy spin).
- _umtx_op WAIT (2/11/15): return 0 on value mismatch like FreeBSD (blink
  returned EAGAIN, Linux futex semantics), which crashed Go's futexsleep1.
- exit: FreeBSD exit(2) terminates the whole process (thr_exit is per-thread);
  map syscall 1 -> Linux exit_group (0xe7) not exit (0x3c), else Go hung
  forever after main() returned because only the calling thread exited.
- accept4: map FreeBSD 541 with SOCK_CLOEXEC/NONBLOCK flag translation; the
  netpoll TCP server otherwise couldn't accept (ECONNREFUSED).
- sysctls: add kern.smp.maxcpus, kern.conftxt, kern.ipc.soacceptqueue.
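
As an illustration of the sigaltstack item, here is a sketch of translating FreeBSD flag values to Linux ones. The function and macro names are hypothetical; the SS_DISABLE values (0x4 on FreeBSD, 0x2 on Linux) come from the list above, and SS_ONSTACK is 0x1 on both systems.

```c
#include <assert.h>

#define FBSD_SS_ONSTACK 0x1
#define FBSD_SS_DISABLE 0x4
#define LINUX_SS_ONSTACK 0x1
#define LINUX_SS_DISABLE 0x2

// Returns the Linux flag word, or -1 if the guest passed a bit this
// sketch does not know how to translate.
int XlatFreeBSDSsFlags(int flags) {
  int out = 0;
  if (flags & ~(FBSD_SS_ONSTACK | FBSD_SS_DISABLE)) return -1;
  if (flags & FBSD_SS_ONSTACK) out |= LINUX_SS_ONSTACK;
  if (flags & FBSD_SS_DISABLE) out |= LINUX_SS_DISABLE;
  return out;
}
```

Passing the untranslated 0x4 to Linux is what hit the unsupported-flags path; after translation it becomes 0x2.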

blink's multi-threaded SMP emulation intermittently corrupts the Go runtime
once it schedules across >1 P (fatal error: schedule: holding locks), so pin
guests to a single CPU: kern.smp.maxcpus=1, hw.ncpu=1, and cpuset_getaffinity
reports only CPU0.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
… data, FIOASYNC, sendfile

nginx accepted connections but never responded; curl hung until blink was
killed. Four bugs in the FreeBSD network path:

- kevent EV_CLEAR was ignored. nginx registers connection sockets edge-
  triggered (EV_CLEAR), but the epoll registration stayed level-triggered, so
  epoll_wait re-reported the same fd readable forever (~500k spins) and nginx
  made no progress. Map EV_CLEAR -> EPOLLET (track per-watch, OR into the
  combined epoll events).
- kevent never filled the data field. nginx uses kev.data (bytes available)
  under kqueue to size reads / track ready state, so it stopped after the
  first read. Populate it: FIONREAD for EVFILT_READ, SO_SNDBUF for
  EVFILT_WRITE.
- ioctl(FIOASYNC) (FreeBSD 0x8004667d) returned EINVAL. nginx sets it on the
  master<->worker channel socket while spawning workers and treats failure as
  fatal, so no worker ever started. Translate to O_ASYNC via F_SETFL.
- sendfile (FreeBSD syscall 393) was unmapped, so static files never
  transferred. FreeBSD's sendfile differs from Linux's (file/socket args
  swapped, sf_hdtr header/trailer iovecs, byte count via *sbytes, returns
  0/-1). Add SysFreeBSDSendfile with proper EAGAIN/partial-write semantics so
  nginx resumes via EVFILT_WRITE on a non-blocking socket.
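
The EV_CLEAR mapping from the first item can be sketched like this. `KeventToEpollEvents` and the `FBSD_` constants are illustrative names (the numeric values are FreeBSD's); the real code also has to combine the read and write watches of one fd into a single epoll registration, which this sketch leaves out.

```c
#include <assert.h>
#include <stdint.h>
#include <sys/epoll.h>

#define FBSD_EVFILT_READ (-1)
#define FBSD_EVFILT_WRITE (-2)
#define FBSD_EV_CLEAR 0x0020

// Converts one kevent filter/flags pair into epoll event bits.
uint32_t KeventToEpollEvents(int16_t filter, uint16_t flags) {
  uint32_t ev = 0;
  if (filter == FBSD_EVFILT_READ) {
    ev |= EPOLLIN;
  } else if (filter == FBSD_EVFILT_WRITE) {
    ev |= EPOLLOUT;
  }
  // EV_CLEAR asks for edge-triggered delivery, which is EPOLLET on Linux.
  if (flags & FBSD_EV_CLEAR) ev |= EPOLLET;
  return ev;
}
```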

Verified: nginx (default master+worker fork mode) serves sequential, parallel,
small, and 2 MB requests with byte-exact md5; Go net test (kqueue + accept4)
and cpuburner still pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…E_IN_ROOT)

blink confines guests with overlays (openat relative to the overlay dir) rather
than a real chroot(), so the host kernel resolved an absolute symlink TARGET
against the host root and escaped the overlay. A guest symlink like
/usr/local/www/nginx -> /usr/local/www/nginx-dist therefore couldn't be
followed (ENOENT), e.g. nginx's default doc root 404'd.

Use openat2(RESOLVE_IN_ROOT) — which makes "/" and absolute symlinks resolve
relative to the dirfd, like chroot — as a fallback when the plain openat path
fails with ENOENT/ENOTDIR, so the common case keeps its existing behavior and
its errno (the overlay search loop is unchanged). OverlaysOpen retries the open
in-root; OverlaysGeneric (stat/access/chmod/chown/utime/unlink/readlink/...)
resolves the path's parent in-root and retries on the resulting host path,
preserving each op's follow/create semantics on the final component.

Guarded by __linux__ && SYS_openat2 with a cached-ENOSYS flag, so older kernels
and non-Linux hosts simply fall through to prior behavior.
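
A hedged sketch of the fallback open described above. `OpenInRoot` and `g_openat2_missing` are hypothetical names; `openat2`, `struct open_how`, and `RESOLVE_IN_ROOT` are the real Linux interfaces. The caller is assumed to try plain openat first and call this only on ENOENT/ENOTDIR.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdbool.h>
#include <sys/syscall.h>
#include <unistd.h>
#if defined(__linux__) && defined(SYS_openat2)
#include <linux/openat2.h>
#endif

static bool g_openat2_missing;  // cached ENOSYS result

// Opens path with "/" and absolute symlinks resolved relative to rootfd.
// Returns a file descriptor, or -1 with errno set.
int OpenInRoot(int rootfd, const char *path, int flags) {
#if defined(__linux__) && defined(SYS_openat2)
  if (!g_openat2_missing) {
    struct open_how how = {0};
    how.flags = (unsigned long long)flags;
    how.resolve = RESOLVE_IN_ROOT;
    long rc = syscall(SYS_openat2, rootfd, path, &how, sizeof(how));
    if (rc != -1 || errno != ENOSYS) return (int)rc;
    g_openat2_missing = true;  // old kernel: stop trying
  }
#endif
  errno = ENOSYS;
  return -1;
}
```

On hosts without openat2 the function reports ENOSYS, so the caller keeps the errno from its original openat attempt.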

Verified: nginx serves its default page through the symlinked doc root (200);
cat/ls/readlink through single and chained absolute symlinks work; lstat and
readlink still operate on the link itself; nonexistent paths still ENOENT; Go
programs, md5, and shell pipes unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ption)

SysFreeBSDThrNew wrote the 11-byte thr_exit thunk at thunk_addr = sp + 8,
where sp = ((stack_base + stack_size) & ~15) - 8 — i.e. at the 16-aligned
stack top (stack.hi). That overran the child g0 stack by up to ~11 bytes on
every thread creation, silently clobbering whatever guest memory happened to
be adjacent (often Go scheduler structures or a sudog).

This was the dominant cause of the long-standing FreeBSD-only SMP corruption
("schedule: holding locks", "sudog with non-nil next"). It is FreeBSD-specific
because the Linux clone() path writes no such thunk, which is exactly why
Linux Go binaries were immune — the key discriminator while tracking it down.
It reproduced even on a single physical core with async preemption disabled,
confirming a stray write rather than a coherence/atomics/ordering issue.

Place the thunk at top-16 and the return-address slot at top-24, both inside
[stack_base, stack_base+stack_size), preserving the RSP%16==8 entry ABI.
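
The corrected placement can be checked with a little arithmetic. This is a sketch with hypothetical names (`struct ThunkLayout`, `PlaceThrExitThunk`); the offsets top-16 and top-24 and the 11-byte thunk size are from the text above.

```c
#include <assert.h>
#include <stdint.h>

struct ThunkLayout {
  uint64_t thunk;    // where the 11-byte thr_exit thunk is written
  uint64_t retslot;  // return-address slot; also the initial RSP
};

struct ThunkLayout PlaceThrExitThunk(uint64_t stack_base,
                                     uint64_t stack_size) {
  uint64_t top = (stack_base + stack_size) & ~(uint64_t)15;
  assert(top >= stack_base + 24);  // room for thunk and return slot
  struct ThunkLayout l = {top - 16, top - 24};
  return l;
}
```

Since top is 16-aligned, top-24 leaves RSP % 16 == 8 on entry, and the thunk ends at top-5, inside the stack.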

After the fix, the GOMAXPROCS=4 atomic+mutex and channel/goroutine-churn
stress tests are 100% clean (previously ~30-67% crash) across interpreter,
JIT, linear and single-core configurations.

The single-CPU pin (kern.smp.maxcpus / hw.ncpu / cpuset_getaffinity = 1) is
kept for now: a separate residual corruption still appears under allocation/
GC-heavy multicore workloads (SIGSEGV in runtime heap-bitmap init), to be
fixed before multicore is enabled.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…ifiers)

GCC 15 produced 387 warnings on a clean build; older GCC was silent.

- 370x -Wcast-align: blink reinterprets byte-addressed guest memory (u8 *)
  as wider atomic/vector types throughout the SSE and atomic-op fast paths
  (~230 sites). These are intentional and safe on the x86_64/aarch64 hosts we
  target. builtin.h already carried `#pragma GCC diagnostic error
  "-Wcast-align=strict"`, but that pragma form was silently ignored by older
  GCC (so the tree always built clean); GCC 15 honors it. Switch it to
  `ignored "-Wcast-align"` with a comment explaining why.

- 17x -Wdiscarded-qualifiers: all from passing string literals / const data
  to CopyToUser / CopyToUserWrite, whose `src` was non-const `void *` even
  though the to-user direction only reads it. Make `src` `const void *`
  (declarations in machine.h, definitions in memory.c).

Clean build now emits zero warnings; no behavior change.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…CPU pin)

Two -m-mode (software MMU) data races corrupted the guest under multiple
threads; these were the residual SMP failures left after the thr_new thunk fix.
Both only manifest with >1 CPU, which is why the single-CPU pin masked them.

1. g_hostpages race. In -m mode TrackHostPage() records each committed host
   page in a global growable array and returns its index, which is stored in
   the PTE's PAGE_TA bits; FindHostPage() reads the array on every guest
   memory access to translate a PTE back to a host pointer. TrackHostPage()
   mutated the array (n++, realloc, p[entry]=ptr) with no lock, while page
   faults on other threads called it concurrently and FindHostPage() read it
   lock-free. Two faults racing n++ got the same index, aliasing two guest
   pages onto one host page; a concurrent realloc moved the array out from
   under a reader. Guard writes with g_hostpages_lock, publish the (grown)
   array pointer with release ordering, and read it with an acquire load.
   Old arrays are not freed (a reader may hold one); growth is geometric so
   this leak is negligible.

2. Page-fault CAS-loss double-free. When two threads fault the same anonymous
   page, the CAS loser freed (u8 *)(page & PAGE_TA). In -m mode those bits are
   the g_hostpages index, not the host pointer, so the allocator free list got
   a bogus entry and later handed out a broken page. Free FindHostPage(page).

With these and the earlier thr_new thunk fix, multicore is correct, so drop
the pin: cpuset_getaffinity / kern.smp.maxcpus / hw.ncpu report GetCpuCount()
again (guest NumCPU = host count).
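
The locking scheme from item 1 can be sketched as below. `struct HostPages` is a hypothetical stand-in for the real table; the sketch assumes the caller publishes the returned index (via the PTE) with release ordering, so a reader that sees the index also sees the entry.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

struct HostPages {
  pthread_mutex_t lock;  // serializes writers
  size_t n, c;           // used / capacity, guarded by lock
  _Atomic(void **) p;    // array pointer read by lock-free readers
};

size_t TrackHostPage(struct HostPages *h, void *page) {
  pthread_mutex_lock(&h->lock);
  void **p = atomic_load_explicit(&h->p, memory_order_relaxed);
  if (h->n == h->c) {
    size_t c2 = h->c ? h->c * 2 : 16;  // geometric growth
    void **q = malloc(c2 * sizeof(*q));
    assert(q != NULL);
    if (h->n) memcpy(q, p, h->n * sizeof(*q));
    // Publish the copied array. The old one is deliberately not freed,
    // because a concurrent reader may still be indexing it.
    atomic_store_explicit(&h->p, q, memory_order_release);
    p = q;
    h->c = c2;
  }
  size_t i = h->n++;  // unique index: the increment is under the lock
  p[i] = page;
  pthread_mutex_unlock(&h->lock);
  return i;
}

void *FindHostPage(struct HostPages *h, size_t i) {
  void **p = atomic_load_explicit(&h->p, memory_order_acquire);
  return p[i];
}
```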

Verified -m multicore, 20 runs each: GOMAXPROCS=4 atomic+mutex, channel,
goroutine-churn, alloc/GC-heavy, and net/http stress all clean (gc and httptest
were ~15-30% crash before); nginx serves; cpuburner saturates 4 CPUs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
megastallman and others added 15 commits May 31, 2026 13:08
In software-MMU (-m) mode TrackHostPage() appends a host-page pointer to the
global g_hostpages table and returns its index, stored in a PTE's PAGE_TA bits.
It was called from AllocateAnonymousPage() for *every* returned page, including
pages popped off the allocator free list, so a freed-then-reallocated page got
a brand-new table entry each cycle. The table therefore grew ~8 bytes per page
commit without bound — a slow leak in long-running -m guests.

Track each host page exactly once, when AllocateBig() first maps it, and cache
its cookie (the PAGE_TA bits) on the free-list node so reallocation reuses the
existing slot instead of minting a new one. FreeAnonymousPage() and the
internal FreePageTable() take the cookie (callers already hold the PTE, so they
pass entry & PAGE_TA); FindHostPage() still yields the host pointer.

g_hostpages.n now tracks peak committed pages rather than cumulative
allocations: the GC-churn stress test settles at ~2700 entries instead of
growing past ~160000. Verified gc + atomic/mutex stress 12/12 clean in both -m
and linear modes.

(The PAGE_MUG file-mapping path in ReserveVirtual still mints entries that are
munmap'd rather than returned to the free list; that is bounded by mmap count,
not per-fault, and is left as-is.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The previous commit bounded g_hostpages for anonymous pages by caching each
page's slot on the allocator free list. File/shared mmaps (PAGE_MUG) take a
different free path: FreePage() munmap()s their backing memory rather than
returning it to the allocator, so their g_hostpages slot was orphaned and the
table still grew one entry per munmap'd MUG page.

Add a free-index stack (freeidx/nfree/cfree, guarded by g_hostpages_lock).
ReleaseHostPage() pushes a slot when its PAGE_MUG backing is unmapped, and
TrackHostPage() pops a reusable slot before growing the table. This is safe:
the freed page's PTE has been cleared and all TLBs are invalidated at the end
of the enclosing FreeVirtual() before the slot can be handed out again, so no
live PTE/TLB entry resolves to a reused slot (same ordering the anonymous-page
reuse already relies on).
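
A simplified sketch of slot reuse through a free-index stack. The fixed capacity and the names `SlotTable`, `TrackSlot`, and `ReleaseSlot` are assumptions for illustration; locking (g_hostpages_lock in the text) and TLB invalidation are omitted.

```c
#include <assert.h>
#include <stddef.h>

#define kMaxSlots 1024  // hypothetical fixed capacity

struct SlotTable {
  void *slot[kMaxSlots];
  size_t n;                   // high-water mark of slots handed out
  size_t freeidx[kMaxSlots];  // stack of released slot indices
  size_t nfree;
};

// Pops a released slot if one exists; otherwise grows the table.
size_t TrackSlot(struct SlotTable *t, void *page) {
  size_t i;
  if (t->nfree) {
    i = t->freeidx[--t->nfree];
  } else {
    assert(t->n < kMaxSlots);
    i = t->n++;
  }
  t->slot[i] = page;
  return i;
}

// Called once the backing memory of slot i has been unmapped.
void ReleaseSlot(struct SlotTable *t, size_t i) {
  assert(i < t->n && t->nfree < kMaxSlots);
  t->slot[i] = NULL;
  t->freeidx[t->nfree++] = i;
}
```

After a release, the next TrackSlot call returns the released index and `n` stays flat, which is the bounded-growth property the commit verifies.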

Verified: 5000 shared-mmap/munmap cycles (~80k MUG page commits) hold
g_hostpages.n flat at ~1290 instead of growing to ~80k; gc + atomic/mutex
stress and nginx still clean under -m multicore.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Micro-op stitching leaves provably-redundant reg-reg moves in the final
instruction stream, especially res0<->arg0 round-trips that straddle the
Jitter-glue / inlined-micro-op boundary: the Jitter emits `mov %rdi,%rax` to
pass a result as the next op's argument, and the micro-op body immediately
begins `mov %rax,%rdi`. For a tight integer ALU loop the JIT output was ~9x
the size of the equivalent native code, and these cancelling moves sat on the
critical dependency chain.

Add a per-op peephole (PeepholeOp), run from AddPath_EndOp over the straight-
line host code emitted for a single guest op. It decodes that byte range with
blink's own decoder (for exact instruction boundaries) and removes:
  - `mov A,A`             no-op
  - `mov A,B ; mov B,A`   the second is a no-op (after the first, A==B, so the
                          second leaves its destination unchanged) -> drop the second
Both are provably semantics-preserving (no effect on flags or any register).
The pass bails out for the whole op if it emitted any control transfer, so
relative-branch displacements and recorded jump fixups are never perturbed;
the path's terminating jump is appended afterwards on post-peephole offsets.
Only adjacent moves are touched, and it runs before commit on the writable
staging buffer (gated by CanJitForImmediateEffect).
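
The two rewrite rules can be illustrated on an abstract instruction list. This is a sketch only: the real pass works on decoded x86 bytes and bails out on control transfers, while `struct Op` and `PeepholeMoves` here are hypothetical and model just register moves and opaque other ops.

```c
#include <assert.h>
#include <stddef.h>

enum OpKind { OP_MOV, OP_OTHER };

struct Op {
  enum OpKind kind;
  int src, dst;  // register numbers; meaningful only for OP_MOV
};

// Compacts ops[0..n) in place and returns the new length.
size_t PeepholeMoves(struct Op *ops, size_t n) {
  size_t out = 0;
  for (size_t i = 0; i < n; i++) {
    struct Op op = ops[i];
    if (op.kind == OP_MOV) {
      // Rule 1: mov A,A does nothing.
      if (op.src == op.dst) continue;
      // Rule 2: mov A,B followed directly by mov B,A; the second copy
      // writes a value its destination already holds.
      if (out && ops[out - 1].kind == OP_MOV &&
          ops[out - 1].src == op.dst && ops[out - 1].dst == op.src) {
        continue;
      }
    }
    ops[out++] = op;
  }
  return out;
}
```

Any non-move op is kept and breaks adjacency, so only back-to-back moves are ever cancelled.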

Speedup: integer ALU benchmark 4.90s -> ~3.0s (~38% faster; native 0.64s).

Validated: guest output under the JIT is byte-identical to the interpreter (-j)
across integer, floating-point, string and coreutils workloads; FreeBSD Go suite
(atomic+mutex, GC-churn) 15/15 each and a Linux Go binary that JITs all threads
20/20 clean under -m multicore; nginx serves HTTP 200. Clean -Werror build.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The peephole (02237b7) was reverted in db8338f on suspicion it caused thunar
to crash under JIT. A proper rate-based bisect (25 runs/config) disproved that:

  thunar JIT-SIGSEGV rate:  pre-session 7ff84a2 = 3/25,
                            HEAD single-core = 7/25, HEAD multicore = 0/25

The crash is PRE-EXISTING (present before this session's first commit), occurs
with the peephole reverted, and is rate-neutral w.r.t. the peephole (peephole-on
1/12 & 6/20 sit in the same noise band). It's a long-standing intermittent JIT
bug in heavily-threaded GUI apps (varying signatures -> state corruption;
`-j` avoids it), unrelated to this peephole. The earlier revert was a
misdiagnosis from bisecting an intermittent bug on too few runs.

Restoring the +44% codegen win. The pre-existing JIT bug is tracked separately
(workaround: run such apps with -j).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
blink/path.c: the -m (software-MMU) stash-commit code emitted
`cmpq $0,stashaddr(%rdi); jz +5; <call CommitStash>`, where the `jz +5`
hardcoded a 5-byte skip. AppendJitCall() only emits a 5-byte `call rel32`
when the target is within +/-2GB; when farther it emits a ~12-byte
`movabs $addr,%rax; call %rax`. If ASLR places the JIT mmap >2GB from the
blink binary, `jz +5` then lands mid-instruction and executes garbage ->
guest-heap corruption -> SIGSEGV. Backpatch the jz to skip the ACTUAL emitted
call length instead. Proven by forcing the far-call path: thunar under -m
crashed 4/5 without this fix and 0/8 with it. (Latent in runs where the JIT
mmap lands within 2GB, so it is real but not the sole cause of the -m GUI
crash, which still reproduces intermittently and is under investigation. The
interpreter and linear-mapping JIT are unaffected -- under linear, the stash
path is never emitted.)
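
The backpatching idea can be sketched as follows. `EmitCall` and `EmitSkippableCall` are hypothetical helpers, not blink's AppendJitCall(); the encodings used are the standard x86-64 ones (E8 rel32, 48 B8 imm64, FF D0), and the preceding cmpq is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Emits a call to target into buf, which will execute at address pc.
// Returns 5 for `call rel32` or 12 for `movabs $target,%rax; call *%rax`.
size_t EmitCall(uint8_t *buf, uint64_t pc, uint64_t target) {
  int64_t disp = (int64_t)(target - (pc + 5));
  if (disp >= INT32_MIN && disp <= INT32_MAX) {
    buf[0] = 0xE8;
    for (int i = 0; i < 4; i++) {
      buf[1 + i] = (uint8_t)((uint32_t)disp >> (8 * i));
    }
    return 5;
  }
  buf[0] = 0x48;  // movabs $target,%rax
  buf[1] = 0xB8;
  for (int i = 0; i < 8; i++) buf[2 + i] = (uint8_t)(target >> (8 * i));
  buf[10] = 0xFF;  // call *%rax
  buf[11] = 0xD0;
  return 12;
}

// Emits `jz <over the call>; call target`, patching the jz displacement
// with the length that was actually emitted instead of assuming 5.
size_t EmitSkippableCall(uint8_t *buf, uint64_t pc, uint64_t target) {
  buf[0] = 0x74;  // jz rel8
  buf[1] = 0;     // placeholder, patched below
  size_t n = EmitCall(buf + 2, pc + 2, target);
  assert(n <= 127);
  buf[1] = (uint8_t)n;
  return 2 + n;
}
```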

Also fixes two genuine, separate JIT block-retirement concurrency bugs found
along the way, plus a JIT memory-size bump:

- building flag (jit.c/jit.h): ForceJitBlocksToRetire() could retire a block
  another thread had leased and was appending into (leased blocks stay on
  agedblocks), double-unlinking it from jit->blocks and letting a second
  thread reuse the same storage. Skip blocks flagged building (set under
  jit->lock in StartJit, cleared in ReinsertJitBlock_).

- QSBR reclamation (jit.c, machine.c/.h, memorymalloc.c): a committed block
  could be retired and reused while a thread was still executing inside it
  (or blocked in a syscall called from it). Retired blocks now park on a
  draining list stamped with a reclaim epoch; each thread publishes the epoch
  it last saw at the Actor() quiescent point (m->jitqso) and DrainReclaimable_
  only releases a block once every thread has passed it. The quiescence floor
  is computed under machines_lock BEFORE jit->lock, matching the existing
  outer->inner lock order (fork() takes machines_lock then jit.lock), so no
  inversion.

- kJitMemorySize 31MB -> 128MB (lazily committed; costs only address space).
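
The QSBR rule from the second item can be modeled in a few lines. This is a single-threaded sketch with hypothetical names (`struct Qsbr`, `RetireStamp`, `Quiesce`, `CanReclaim`); the real code uses atomics and locks, and its exact epoch comparison may differ from the assumption made here (a block is reclaimable once every thread has published an epoch at or after the block's stamp).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define kMaxThreads 8  // hypothetical bound

struct Qsbr {
  uint64_t epoch;              // global reclaim epoch
  uint64_t seen[kMaxThreads];  // epoch each thread last published
  size_t nthreads;
};

// Retiring a block advances the epoch and stamps the block with it.
uint64_t RetireStamp(struct Qsbr *q) {
  return ++q->epoch;
}

// Called by a thread at its quiescent point, outside any JIT block.
void Quiesce(struct Qsbr *q, size_t tid) {
  assert(tid < q->nthreads);
  q->seen[tid] = q->epoch;
}

// True once no thread can still be executing inside the stamped block.
bool CanReclaim(const struct Qsbr *q, uint64_t stamp) {
  for (size_t i = 0; i < q->nthreads; i++) {
    if (q->seen[i] < stamp) return false;
  }
  return true;
}
```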

README.md: document running without -m and the thunar/xeyes examples.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
FreeBSD sigpending(sigset_t *set) had no translation entry and fell through
to the UNMAPPED-syscall path. Add a SysFreeBSDSigpending handler that mirrors
the Linux rt_sigpending handler (SysSigpending) but emits FreeBSD's 16-byte
sigset_t: blink's pending mask m->signals in the low 8 bytes, upper words
zeroed. Mask bits are treated as bit-compatible with Linux, consistent with
the sibling SysFreeBSDSigprocmask. Wire it up: translate FreeBSD 343 to the
synthetic ordinal 0x251 and register the dispatch entry.
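
A sketch of the 16-byte encoding described above. `EncodeFreeBSDSigset` is a hypothetical name; the layout (pending mask in the low 8 bytes, upper 8 bytes zeroed, little-endian as on amd64) follows the text, with bit N-1 standing for signal N.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

// Writes a 64-bit pending mask as a 16-byte FreeBSD sigset_t image.
void EncodeFreeBSDSigset(uint64_t pending, uint8_t out[16]) {
  memset(out, 0, 16);  // upper words stay zero
  for (int i = 0; i < 8; i++) {
    out[i] = (uint8_t)(pending >> (8 * i));  // little-endian low 8 bytes
  }
}
```

For example, SIGTERM (15) pending sets bit 14, which lands in byte 1 of the output.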

Tested: a FreeBSD program that blocks+raises SIGTERM then calls sigpending
reports rc=0 with SIGTERM pending and SIGINT not, under both -m and linear.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Under -m (software MMU) a memory access that straddles a page boundary can't be
served by a single native locked instruction on the real memory, so blink copies
both pages into a private per-machine stash (m->opcache->stash), the op does its
read-modify-write there, and CommitStash() writes it back at end-of-op. For
LOCKed/atomic ops this silently broke atomicity: the op's own LockBus() keys on
the stash buffer (per-machine, never contended) and the write-back happens later
outside any lock, so concurrent crossing atomics to the same address lose updates
(e.g. a GObject refcount that straddles a page -> lost decrement -> premature
free). Affected both the interpreter and the JIT; linear mode is fine because x86
LOCK is atomic across pages in hardware.

Fix: in ReserveAddress()'s page-overlap path, after both pages resolve, take a
bus lock keyed by the thread-shared guest address and set m->crosslocked;
CommitStash() does the write-back then releases it, so the whole crossing
read-modify-write is serialized vs other threads. Belt-and-suspenders release in
Blink()'s halt path in case a fault unwinds past CommitStash(). Only crossing
accesses pay this, and they are rare.
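
Only accesses that straddle a page take the new lock. A sketch of that test, assuming 4096-byte pages and a hypothetical `CrossesPage` helper (not blink's actual check):

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

// True if [addr, addr+size) spans two 4096-byte pages.
bool CrossesPage(uint64_t addr, size_t size) {
  assert(size >= 1 && size <= 4096);
  return (addr & 4095) + size > 4096;
}
```

An 8-byte counter at offset 0xff8 ends exactly at the page boundary and stays on the fast path; the same counter at 0xffc crosses and would be serialized.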

Verified with a multithreaded stress test that places an atomic counter across a
page boundary: before, the crossing counter ended up ~12M of 25M; after, it is
exact under -m JIT and -m interpreter, while aligned counters and linear mode
were always exact. Single-thread page-overlap, md5sum, go, shell all still pass.

(Note: this is a real, separate -m correctness bug; it is NOT the intermittent
thunar -m crash, which is still under investigation -- localized to infinite
recursion in cairo's Bezier spline decomposition.)

README.md: typo/usage doc tweaks.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
whereis(1) and other tools call confstr(_CS_PATH), which does
sysctl({CTL_USER, USER_CS_PATH}) (MIB 8.1). blink's FreeBSD sysctl handler
had no CTL_USER branch, so it returned ENOSYS:

  whereis: sysctl("user.cs_path"): Function not implemented

Add the case to SysFreeBSDSysctl, returning a standard utility search path
that includes /usr/local so pkg-installed binaries are findable:
/usr/bin:/bin:/usr/sbin:/sbin:/usr/local/bin:/usr/local/sbin. Callers size
the buffer via a first oldaddr=0 query, matching the other string sysctls.
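
The two-call sizing protocol mentioned above can be sketched like this. `ReturnStringSysctl` is a hypothetical helper working on host pointers; it simplifies the short-buffer case to a plain ENOMEM failure without a partial copy.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

// Implements "old == NULL means report the required size" for a string
// sysctl. Returns 0 on success, or -1 with errno set.
int ReturnStringSysctl(const char *value, char *old, size_t *oldlenp) {
  size_t need = strlen(value) + 1;  // include the terminating NUL
  assert(oldlenp != NULL);
  if (old == NULL) {
    *oldlenp = need;  // size query
    return 0;
  }
  if (*oldlenp < need) {
    errno = ENOMEM;
    return -1;
  }
  memcpy(old, value, need);
  *oldlenp = need;
  return 0;
}
```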

Verified: `whereis mc` -> "mc: /usr/local/bin/mc" (was erroring);
whereis ls/sh resolve; uname unaffected.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
blink backs guest ptys with real host ptys and runs guest processes as real
host processes, but never handled TIOCSCTTY. FreeBSD TIOCSCTTY (0x20007461)
fell through to the default einval(), so after a shell's setsid() it could not
acquire the pty as its controlling terminal. Subsequent /dev/tty opens returned
ENXIO and TIOCGPGRP/TIOCSPGRP returned ENOTTY, so shells printed
"cannot set terminal process group" / "no job control in this shell".

Route FreeBSD TIOCSCTTY (and Linux TIOCSCTTY 0x540e) to the host TIOCSCTTY on
the pty fd. After the guest setsid(), the host call makes the pty the
controlling terminal so tcgetpgrp/tcsetpgrp (TIOCGPGRP/TIOCSPGRP) work and job
control comes up. Verified: the job-control errors are gone, `jobs` works in an
interactive bash under a pty, and md5sum/shell pipes are unaffected.

(This removes mc's "no job control" subshell warnings but does NOT by itself
fix mc's ~10s subshell-init stall, which is a separate getcwd-in-subshell VFS
issue still under investigation.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
getdents/getdirentries wrongly gated on the fd having been opened with
O_DIRECTORY ((fd->oflags & O_DIRECTORY) != O_DIRECTORY -> ENOTDIR). But
O_DIRECTORY is only an open-time hint; on FreeBSD/Linux you may open a directory
with plain O_RDONLY and read it. FreeBSD libc's physical getcwd opens ".." with
O_RDONLY and walks up; blink returned ENOTDIR, so getcwd failed ("cannot access
parent directories"). That broke mc's concurrent-subshell init: bash's
PROMPT_COMMAND `pwd >&pipe` errored instead of writing the cwd, so mc's sync
select timed out ~10s every startup. Gate on the fd's ACTUAL type (S_ISDIR via
the fstat already performed) instead; non-directories still get ENOTDIR. Fixes
getcwd for any program that opens a dir without O_DIRECTORY, and removes mc's
~10s subshell stall.

Also revert the TIOCSCTTY change from af88aed. Making TIOCSCTTY succeed let
shells believe they had job control, but blink can't deliver the rest of it
(process-group SIGSTOP/SIGCONT + waitpid(WUNTRACED) across guest processes).
With the getdents fix the subshell sync now *succeeds*, so mc proceeded into the
full job-control handshake and deadlocked (hung forever, needed kill -9). Leave
TIOCSCTTY unimplemented (default einval) so shells stay in the honest "no job
control" mode, where mc's subshell works and starts fast. Revisit if real job
control is ever implemented.

Net: `mc` (with subshell) now starts quickly under -m instead of stalling 10-15s
or hanging. Verified: getcwd in O_RDONLY-opened dirs works, ls/find/md5sum/go
unaffected, mc no longer stalls or hangs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
31a3a2e relaxed getdents/getdirentries to read directories opened without
O_DIRECTORY (correct in general, and it removed mc's ~10s subshell-init stall by
letting bash's physical getcwd succeed). But making the subshell sync *succeed*
caused mc to proceed into the full job-control handshake: its PROMPT_COMMAND runs
`pwd>&pipe; kill -STOP $$`, then mc blocks in sigsuspend waiting for the subshell
to stop. blink forwards SIGSTOP to the (real host) subshell process so it does
stop, but mc is never woken with a child-STOPPED SIGCHLD, so mc hangs forever
(needs kill -9). That is strictly worse than the prior 10-15s-slow-but-working
behavior, so restore the O_DIRECTORY requirement for now.

The clean fix is to deliver child-stop notifications (SIGCHLD on WUNTRACED-style
stop) to the guest parent so mc's sigsuspend/wait wakes; until that exists, mc's
concurrent subshell can't be driven and must run degraded (or use_subshell=0).
The general getdents/getcwd relaxation can return together with that work.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
The FreeBSD->Linux syscall shim mapped kill(37) to the Linux kill ordinal but
left its signal argument untranslated. FreeBSD and Linux signal numbers diverge
for many signals, including the job-control ones (FreeBSD SIGSTOP=17/SIGTSTP=18/
SIGCONT=19/SIGCHLD=20/SIGUSR1=30 vs Linux 19/20/18/17/10), so e.g. `kill -STOP`
actually sent SIGCHLD, `kill -CONT` sent SIGSTOP, and kill(SIGUSR1) delivered
the wrong signal. Run the signal arg through XlatFreeBSDSignal() (as thr_kill
already does).
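
For illustration, a partial translation table covering the signals named above. `XlatFreeBSDSignalSketch` is a hypothetical function, not blink's XlatFreeBSDSignal(); it maps the five listed signals, passes through the low-numbered signals that share a number on both systems, and returns -1 for everything else.

```c
#include <assert.h>

// Maps a FreeBSD signal number to its Linux x86-64 equivalent.
int XlatFreeBSDSignalSketch(int sig) {
  switch (sig) {
    case 17: return 19;  // SIGSTOP
    case 18: return 20;  // SIGTSTP
    case 19: return 18;  // SIGCONT
    case 20: return 17;  // SIGCHLD
    case 30: return 10;  // SIGUSR1
    // SIGHUP, SIGINT, SIGQUIT, SIGILL, SIGTRAP, SIGABRT, SIGFPE,
    // SIGKILL, SIGSEGV, SIGPIPE, SIGALRM, SIGTERM: same number on both.
    case 1: case 2: case 3: case 4: case 5: case 6:
    case 8: case 9: case 11: case 13: case 14: case 15:
      return sig;
    default:
      return -1;  // not covered by this sketch
  }
}
```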

Concretely this makes job-control signals work: a shell's `kill -STOP $$` now
really stops the process and the parent gets a child-stop SIGCHLD. Verified: a
program that catches SIGUSR1 via kill(getpid(),SIGUSR1) now sees SIGUSR1 (was a
different/no signal); regressions (md5sum/go/ls/whereis) unaffected.

NB: this is necessary but not sufficient for mc's concurrent subshell to run
fast -- with the separate getcwd/getdents relaxation it gets further (the init
stop/cont handshake now completes) but still hangs later in the job-control
loop, so that getdents relaxation stays reverted and mc remains slow-but-working
(use mc -u / use_subshell=0 for fast startup). Full concurrent-subshell support
is a larger job-control effort.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
blink didn't handle the FreeBSD line-discipline ioctls, so TIOCGETD
(0x4004741a) fell through to the default einval(). stty(1) queries TIOCGETD
during init, so it failed outright ("stty: TIOCGETD: Invalid argument") and
couldn't set any terminal modes. Route TIOCGETD/TIOCSETD to the host ioctls
(line discipline 0 = TTYDISC/N_TTY for a normal tty). Verified: `stty -a` now
reports the terminal settings instead of erroring.

(Found while diagnosing an mc panel key issue; unrelated to that, but a real
standalone fix for stty and other line-discipline users.)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
…y keylog

getdents/getdirentries rejected any fd not opened with O_DIRECTORY, but
O_DIRECTORY is only an open-time hint -- a directory may be opened O_RDONLY
and read. FreeBSD libc's physical getcwd() opens ".." with plain O_RDONLY
then getdirentries, so blink returned ENOTDIR and getcwd() failed in every
non-root directory. Gate on the fd's ACTUAL type (S_ISDIR via the fstat
already done) instead of the open flag.
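
The gate on the descriptor's actual type can be sketched with fstat and S_ISDIR. `OpenPlain` and `CheckDirectoryFd` are hypothetical helpers; the real check lives inside blink's getdents/getdirentries handlers.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>

// Opens a path the way the text says FreeBSD libc's getcwd() does:
// plain O_RDONLY, without O_DIRECTORY.
int OpenPlain(const char *path) {
  return open(path, O_RDONLY);
}

// Returns 0 if fd refers to a directory, else -1 with errno set
// (ENOTDIR for a valid descriptor that is not a directory).
int CheckDirectoryFd(int fd) {
  struct stat st;
  if (fstat(fd, &st) == -1) return -1;
  if (!S_ISDIR(st.st_mode)) {
    errno = ENOTDIR;
    return -1;
  }
  return 0;
}
```

A directory opened with plain O_RDONLY passes, while a regular file, pipe, or terminal still fails with ENOTDIR.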

This was the real cause of mc -u "Enter doesn't enter directories": pressing
Enter descends into the dir, mc canonicalizes the new cwd via getcwd, that
failed, mc bailed and reloaded the panel with the cursor reset to the top --
which looked like Enter acting as PageUp. The Enter byte (0d) was always
correct. Verified: pwd -P / realpath now resolve in subdirs; ls still works;
reading a non-dir as a dir still ENOTDIRs.

(Earlier this relaxation was reverted because it unblocked mc's concurrent
subshell into a job-control hang. Safe to re-land now: use_subshell=0 / mc -u
means no subshell, so that path is never engaged.)

Also add an env-gated tty keylog debug aid: when BLINK_KEYLOG=<path> is set,
SysRead appends a hex line for every byte read from a tty (isatty fd, rc>0).
The log fd is dup'd into the blink-reserved high range to avoid guest fd
collision. Zero overhead when unset. Used to capture exactly what bytes a TUI
receives per keypress.
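
A sketch of how one read could be rendered as a hex line. The exact log format is not given above, so the layout here (space-separated lowercase hex bytes, newline-terminated) and the name `FormatKeylogLine` are assumptions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

// Formats n bytes as "xx xx ... xx\n" into out.
// Returns the string length, or 0 if out is too small.
size_t FormatKeylogLine(const uint8_t *buf, size_t n, char *out,
                        size_t outsz) {
  assert(n >= 1);                    // only called for reads with rc > 0
  if (outsz < n * 3 + 1) return 0;   // 3 chars per byte plus NUL
  for (size_t i = 0; i < n; i++) {
    char sep = (i + 1 < n) ? ' ' : '\n';
    snprintf(out + 3 * i, 4, "%02x%c", (unsigned)buf[i], sep);
  }
  return n * 3;
}
```

A cursor-up keypress (ESC [ A) would be logged as `1b 5b 41`, and Enter as `0d`.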

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>