Describe the bug
On Nix 2.34.7 I intermittently get hard SIGABRT crashes from nix flake show. The core dumps all share the same signature: the Boehm GC OOM handler throws std::bad_alloc while it is being called from inside Value::mkFailed, which is declared noexcept. Since an exception escapes a noexcept function, the runtime calls std::terminate → abort → SIGABRT.
The crash originates from the "failed value" / recoverable-error path introduced in 2.34.0 (#15286), so it is specific to evaluations that go through handleEvalExceptionForThunk — which nix flake show does heavily, since it deliberately keeps evaluating attributes that throw.
Stack trace (from a real core dump, nix 2.34.7)
#2 abort (libc.so.6)
#3 nix::(anonymous namespace)::onTerminate() [std::set_terminate handler]
#4 __cxxabiv1::__terminate(...)
#5 __cxa_call_terminate
#6 __gxx_personality_v0
#7 _Unwind_RaiseException_Phase2
#8 _Unwind_RaiseException
#9 __cxa_throw
#10 nix::oomHandler(unsigned long) [throws std::bad_alloc]
#11 GC_register_finalizer_inner [Boehm GC]
#12 nix::Value::mkFailed(std::exception_ptr, nix::Value*) [declared noexcept]
#13 nix::EvalState::handleEvalExceptionForThunk(...)
#14 nix::ExprVar::eval(...)
#15 nix::ExprOpHasAttr::eval(...)
...
The unwinder reaches onTerminate directly from __cxa_throw in oomHandler — i.e. the exception is being thrown from a context that cannot propagate it (a noexcept frame), so it terminates instead of unwinding to a handler.
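To make the mechanism concrete, here is a minimal sketch that reproduces the same abort outside of Nix. It is not Nix code: fakeOomHandler, noexceptCaller, throwingCaller and terminationSignalOf are hypothetical names standing in for oomHandler, mkFailed and a test harness, and it assumes a POSIX system (fork/waitpid).

```cpp
#include <csignal>
#include <cstddef>
#include <new>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Hypothetical stand-in for nix::oomHandler: reports OOM by throwing.
static void * fakeOomHandler(std::size_t)
{
    throw std::bad_alloc();
}

// Hypothetical stand-in for Value::mkFailed: declared noexcept, yet it
// reaches a throwing path.
static void noexceptCaller() noexcept
{
    fakeOomHandler(64);
}

// The same call without the noexcept boundary: the exception can propagate.
static void throwingCaller()
{
    fakeOomHandler(64);
}

// Runs fn in a forked child. Returns the signal that killed the child,
// 0 if it exited normally, or -1 if fork/waitpid failed.
static int terminationSignalOf(void (*fn)())
{
    pid_t pid = fork();
    if (pid < 0)
        return -1;
    if (pid == 0) {
        try {
            fn();
        } catch (const std::bad_alloc &) {
            _exit(0); // the exception reached a handler
        }
        _exit(0);
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFSIGNALED(status) ? WTERMSIG(status) : 0;
}
```

With this sketch, terminationSignalOf(noexceptCaller) should report SIGABRT, while terminationSignalOf(throwingCaller) should report 0 because the bad_alloc is caught normally. The child also prints the usual std::terminate message to stderr.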
Root cause
The chain in 2.34.7 is:
- oomHandler is registered as the Boehm GC out-of-memory callback and throws a C++ exception:
https://github.com/NixOS/nix/blob/2.34.7/src/libexpr/eval-gc.cc#L43
static void * oomHandler(size_t requested)
{
    /* Convert this to a proper C++ exception. */
    throw std::bad_alloc();
}
- Value::mkFailed is noexcept but allocates a Value::Failed on the GC heap:
https://github.com/NixOS/nix/blob/2.34.7/src/libexpr/include/nix/expr/value.hh#L1288
inline void mkFailed(std::exception_ptr e, Value * recovery) noexcept
{
    setStorage(new Value::Failed(e, recovery));
}
- Value::Failed derives from gc_cleanup, so constructing it implicitly calls GC_register_finalizer (GC_register_finalizer_inner), which itself allocates from the GC heap:
https://github.com/NixOS/nix/blob/2.34.7/src/libexpr/include/nix/expr/value.hh#L431
struct Failed : gc_cleanup
{
    std::exception_ptr ex;
    Value * recoveryValue;
    ...
};
- EvalState::handleEvalExceptionForThunk calls mkFailed for every thunk that threw during evaluation:
https://github.com/NixOS/nix/blob/2.34.7/src/libexpr/eval.cc#L2188
So when the GC heap is exhausted at the moment mkFailed registers the finalizer, oomHandler throws std::bad_alloc through the noexcept boundary and the process aborts instead of surfacing a normal "out of memory" eval error.
The core problem
Value::mkFailed is marked noexcept, but it transitively performs GC allocations (new Value::Failed → gc_cleanup ctor → GC_register_finalizer → internal GC malloc) that can invoke oomHandler, which throws. A function that can reach a throwing allocation path should not be noexcept — or it must not allocate in a way that can call oomHandler. Either the noexcept is wrong, or the allocation must be made non-throwing/pre-reserved.
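As an illustration of why the compiler does not catch this, the sketch below uses the noexcept operator on two hypothetical stand-ins (gcAllocOrThrow for a GC allocation whose OOM callback throws, markFailed for Value::mkFailed). It only shows that the specifier is an unchecked promise; it is not the actual Nix code.

```cpp
#include <cstddef>
#include <new>

// Hypothetical stand-in for a GC allocation whose OOM callback throws.
inline void * gcAllocOrThrow(std::size_t size)
{
    if (size == 0)
        throw std::bad_alloc(); // placeholder failure condition
    return ::operator new(size);
}

// Hypothetical stand-in for Value::mkFailed: the noexcept specifier is
// accepted even though the body calls a potentially-throwing function.
inline void markFailed() noexcept
{
    ::operator delete(gcAllocOrThrow(sizeof(int)));
}

// The noexcept operator only reports what the declarations promise.
constexpr bool allocationMayThrow = !noexcept(gcAllocOrThrow(sizeof(int)));
constexpr bool callerPromisesNoThrow = noexcept(markFailed());

static_assert(allocationMayThrow, "the allocation is declared potentially throwing");
static_assert(callerPromisesNoThrow, "the caller still promises not to throw");
```

Both assertions hold at the same time, which is exactly the mismatch described above. A similar static_assert(noexcept(...)) on each call made inside a noexcept function could serve as a cheap audit, assuming the callee declarations are accurate.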
Reproduction
This is intermittent by nature — it requires the GC OOM to land specifically inside the finalizer registration within mkFailed. I was not able to construct a deterministic minimal flake that reliably hits this exact instant (capping GC_MAXIMUM_HEAP_SIZE and forcing many throwing attributes through nix flake show produces normal exit-1 eval errors, not the abort).
In the wild it recurs every couple of days. Every one of the ~22 core dumps I have collected over the past ~10 days comes from the same kind of command — nix flake show --experimental-features 'nix-command flakes' --json -v --legacy path:/nix/store/...-source against various large flakes — and they all show the trace above. nix flake show is a natural trigger because it intentionally keeps evaluating attributes that throw, exercising the handleEvalExceptionForThunk → mkFailed path heavily.
If a maintainer can suggest a way to deterministically force an OOM inside GC_register_finalizer_inner, I'm happy to provide a self-contained reproducer.
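One generic technique that might help here is allocation fault injection: fail the Nth allocation and sweep N so that every allocation site gets hit once. The sketch below shows only the idea. FailAtNth, twoStepOperation and countFailurePoints are hypothetical names, and it does not touch Boehm GC or Nix; wiring something similar into the real allocator is left open.

```cpp
#include <cstddef>
#include <new>

// Hypothetical allocation hook that fails on the Nth call (1-based).
struct FailAtNth
{
    std::size_t failAt;
    std::size_t calls = 0;

    void * allocate(std::size_t size)
    {
        if (++calls == failAt)
            throw std::bad_alloc();
        return ::operator new(size);
    }
};

// Hypothetical operation under test: two allocations in a row, loosely
// modelling "allocate the object, then allocate its finalizer record".
static void twoStepOperation(FailAtNth & alloc)
{
    void * object = alloc.allocate(32);
    void * finalizerRecord = nullptr;
    try {
        finalizerRecord = alloc.allocate(16);
    } catch (...) {
        ::operator delete(object);
        throw;
    }
    ::operator delete(finalizerRecord);
    ::operator delete(object);
}

// Sweeps the failure index upwards until the operation completes without
// reaching the injected failure; returns how many distinct points failed.
static std::size_t countFailurePoints()
{
    std::size_t failures = 0;
    for (std::size_t n = 1;; ++n) {
        FailAtNth alloc{n};
        try {
            twoStepOperation(alloc);
            return failures;
        } catch (const std::bad_alloc &) {
            ++failures;
        }
    }
}
```

For this toy operation the sweep finds two failure points, one per allocation. The second one corresponds to the "OOM during finalizer registration" case that is hard to hit by only capping the heap size.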
Relationship to #15990
#15990 ("Don't memoise Interrupted errors", merged 2026-06-08) reworks exactly this area: it stops struct Failed from inheriting gc_cleanup and moves the exception_ptr into a separate ExceptionRef, explicitly to avoid finalizer cycles that Boehm warns about. That PR is not in 2.34.7 (tagged 2026-05-04) and, as of writing, has not yet landed on the 2.34-maintenance branch despite carrying the backport 2.34-maintenance label. It is unclear to me whether the #15990 refactor fully removes the noexcept-throwing-allocation hazard (the replacement ExceptionRef still derives from gc_cleanup), so I'm filing this so the abort-on-OOM path is tracked explicitly rather than as a side effect of the Interrupted-memoisation fix.
Versions / environment
- Nix 2.34.7 (upstream CppNix, as shipped by NixOS 26.05)
- NixOS 26.05, x86_64-linux
- 94 GiB RAM, no ulimit -v and no systemd MemoryMax limit (the abort happens well below physical memory)
Possible fixes
- Remove noexcept from Value::mkFailed (and audit callers), or
- Make the Failed/ExceptionRef allocation not go through a path that can call the throwing oomHandler (e.g. pre-reserve, or use a non-throwing allocation here), or
- Have oomHandler not throw when invoked from within finalizer registration.
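For discussion, here is a rough sketch of what the second option could look like: the noexcept function catches the allocation failure itself and falls back to a failure object reserved at startup. All names (Failed, Slot, markFailed, reservedOomFailure, allocOk, allocOom) are hypothetical simplifications rather than the real Nix types, and plain new stands in for the GC heap. Whether replacing the original error with an out-of-memory error is acceptable is a design question for the maintainers.

```cpp
#include <cstddef>
#include <exception>
#include <new>
#include <stdexcept>
#include <utility>

// Simplified, hypothetical versions of Value::Failed and a value slot.
struct Failed
{
    std::exception_ptr ex;
};

struct Slot
{
    const Failed * failed = nullptr;
};

// Reserved at startup so the noexcept path never has to allocate it.
static const Failed reservedOomFailure{std::make_exception_ptr(std::bad_alloc())};

using AllocFailed = Failed * (*)(std::exception_ptr);

// Stand-in for a successful GC allocation (never freed here, as on a GC heap).
static Failed * allocOk(std::exception_ptr ex)
{
    return new Failed{std::move(ex)};
}

// Stand-in for an allocation whose OOM handler throws.
static Failed * allocOom(std::exception_ptr)
{
    throw std::bad_alloc();
}

// noexcept is now honest: the only throwing call is wrapped, and the
// fallback path performs no allocation.
static void markFailed(Slot & slot, std::exception_ptr ex, AllocFailed alloc) noexcept
{
    try {
        slot.failed = alloc(std::move(ex));
    } catch (const std::bad_alloc &) {
        slot.failed = &reservedOomFailure;
    }
}

// Helper for inspecting the result: true if the slot records a bad_alloc.
static bool holdsBadAlloc(const Slot & slot)
{
    if (!slot.failed || !slot.failed->ex)
        return false;
    try {
        std::rethrow_exception(slot.failed->ex);
    } catch (const std::bad_alloc &) {
        return true;
    } catch (...) {
        return false;
    }
}
```

With allocOk the slot records the original error; with allocOom it records the reserved out-of-memory failure instead of aborting, so the caller can surface a normal eval error.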