fix race condition in sandbox pack lock #8183
Overall package size

Self size: 5.67 MB

Dependency sizes

| name | version | self size | total size |
|------|---------|-----------|------------|
| import-in-the-middle | 3.0.1 | 82.56 kB | 817.39 kB |
| dc-polyfill | 0.1.10 | 26.73 kB | 26.73 kB |

🤖 This report was automatically generated by heaviest-objects-in-the-universe
🎉 All green!

❄️ No new flaky tests detected

🔗 Commit SHA: 9e0dd47
@watson Did you consider alternatives? In my experience, adding locks is how you end up with an infinite stream of problems (which we already have with integration tests, so I'm not a fan of adding locks to the mix).
Benchmarks

Benchmark execution time: 2026-04-30 04:16:49

Comparing candidate commit 9e0dd47 in PR branch

Found 0 performance improvements and 0 performance regressions! Performance is the same for 1349 metrics, 95 unstable metrics.
What does this PR do?
Fix race condition in sandbox pack lock.
Motivation
The race condition causes the process to hang.
Explanation from Claude:
The hang is the 60-second retry backoff in execHelperAsync (line 408):

1. bun pm pack fails on the first try (transient bun issue or resource contention).
2. execHelperAsync catches the error, waits 60 seconds, then retries.
3. The retry usually succeeds → "then go through the retry so it can finish".
That 60-second sleep is what you're experiencing as the hang. Since bun pm pack runs without a timeout and no output during the wait, it looks completely frozen.
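The retry pattern described above can be sketched roughly as follows. This is a minimal illustration, not the repository's execHelperAsync: the name `runWithRetry`, its `delayMs` and `log` options, and the log message are all hypothetical. The only point it makes is that a single failed attempt followed by a silent fixed sleep looks like a hang unless something is printed before the wait.

```javascript
'use strict'

const { setTimeout: sleep } = require('node:timers/promises')

// Hypothetical helper (an assumption, not the real execHelperAsync):
// run `fn`; if it fails, report the failure, wait `delayMs`, then try once more.
async function runWithRetry (fn, { delayMs = 60_000, log = console.error } = {}) {
  try {
    return await fn()
  } catch (err) {
    // Without this message the wait below produces no output at all,
    // which is what makes a 60-second backoff look like a frozen process.
    log(`first attempt failed (${err.message}); retrying in ${delayMs} ms`)
    await sleep(delayMs)
    // A second failure is not caught here, so it propagates to the caller.
    return fn()
  }
}
```

If the second attempt also fails, the error reaches the caller, which is the case the locking discussion below is concerned with.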
There's also a latent bug in the locking: if both attempts fail, packTarball throws, and the finally block at line 488 only closes the file descriptor — it never deletes the lock file. Any subsequent call to packTarballWithLock then sees EEXIST and enters the while (!existsSync(tarballPath)) polling loop at line 480 forever, because the tarball will never appear. That would be a true infinite hang rather than just a 60-second delay.
The bug has two parts: let me show both clearly, then fix them.
Bug 1: When packTarball fails, the finally block closes the file descriptor but never deletes the lock file. So on a subsequent call, fs.open(lockFile, 'wx') throws EEXIST and the code enters the while (!existsSync(tarballPath)) polling loop — but since no one holds the lock anymore, the tarball will never appear, and the loop runs forever.
Bug 2: The polling loop only exits when the tarball appears. It has no way to detect that the lock holder gave up.
The fix: delete the lock file in finally, and in the polling loop, check if the lock disappeared without a tarball and retry from scratch.
The fix in two parts:

- finally now deletes the lock file, so if packing fails, the lock is released and the next call can acquire it and try again. Previously the comment "no need to clean up" was wrong: the temp directory is per-process (created fresh each run via id() on line 21), so the lock file would persist across all the createSandbox calls within the same test run.
- Polling loop checks for a vanished lock: after each 100ms sleep it checks whether the lock file is gone but the tarball still isn't there. If so, the lock holder failed, and this waiter retries from scratch rather than looping forever.
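The two parts of the fix can be sketched together as below. This is an illustrative sketch under stated assumptions, not the code in this PR: `packWithLock`, its parameters, and the `pack` callback are hypothetical names, and `pack` is assumed to be an async function that writes `tarballPath` on success and throws on failure.

```javascript
'use strict'

const { existsSync } = require('node:fs')
const fs = require('node:fs/promises')
const { setTimeout: sleep } = require('node:timers/promises')

// Hypothetical sketch of a lock-protected pack step.
// Assumption: `pack` writes `tarballPath` on success and throws on failure.
async function packWithLock (lockFile, tarballPath, pack, pollMs = 100) {
  while (!existsSync(tarballPath)) {
    let handle
    try {
      // 'wx' creates the file exclusively and fails with EEXIST if it exists.
      handle = await fs.open(lockFile, 'wx')
    } catch (err) {
      if (err.code !== 'EEXIST') throw err
      // Another caller holds the lock. Wait until either the tarball appears
      // or the lock vanishes (the holder gave up), then start over.
      while (existsSync(lockFile) && !existsSync(tarballPath)) {
        await sleep(pollMs)
      }
      continue
    }
    try {
      // Re-check: the tarball may have appeared just before we got the lock.
      if (!existsSync(tarballPath)) await pack()
    } finally {
      // Always release the lock, including when pack() throws, so later
      // callers and current waiters are never stuck behind a stale lock file.
      await handle.close()
      await fs.rm(lockFile, { force: true })
    }
  }
}
```

In this shape a failed `pack` still rejects for the caller that held the lock, while a waiter that sees the lock disappear without a tarball goes back to the top of the outer loop and tries to acquire the lock itself.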
Additional Notes