Context
- Workflow: Release portable Linux PyTorch Wheels
- Workflow file: .github/workflows/release_portable_linux_pytorch_wheels.yml
- Failing run: ↗ View run
- Platform: Linux
- Impacted Arch: gfx1151
- PyTorch Versions: release/2.9, release/2.10, release/2.12, nightly
- Python Versions: all (3.10, 3.11, 3.12, 3.13, 3.14)
- ROCm nightly: 7.14.0a20260611
Summary
All 20 test jobs for gfx1151 fail in the Run rocm-sdk sanity tests step. The testConsoleScripts test in rocm_sdk.tests.core_test.ROCmCoreTest calls the rocminfo console-script wrapper, which crashes with SIGSEGV (exit signal 11). The failure is consistent across multiple distinct runners (linux-strix-halo-gpu-rocm-3, -6, -7, -9, CS-RORDMZ-DT239, CS-RORDMZ-DT241), ruling out a single-runner issue.
Error
ERROR: testConsoleScripts (rocm_sdk.tests.core_test.ROCmCoreTest) [Check console-script rocminfo]
File ".../.venv/lib/python3.10/site-packages/rocm_sdk/tests/core_test.py", line 141, in testConsoleScripts
output_text = subprocess.check_output(...)
subprocess.CalledProcessError: Command '[PosixPath('/.../rocminfo')]' died with <Signals.SIGSEGV: 11>.
FAILED (errors=1)
##[error]Process completed with exit code 1.
##[error]Executing the custom container implementation failed. Please contact your self hosted runner administrator.
Counts: 1 ERROR / 19 run — across 20 jobs
Root Cause
The rocminfo binary from the rocm-core==7.14.0a20260611 package segfaults on gfx1151 (Strix Halo) hardware. Because every runner and every PyTorch/Python combination fails the same way, the regression appears to be in the nightly ROCm SDK package for this build date rather than in the PyTorch wheels. It needs investigation in the rocminfo component or its runtime dependencies (HSA-Runtime, hsa-rocr).
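To help triage outside the test harness, the sketch below runs a command the way testConsoleScripts does (through subprocess) and reports whether it exited normally or was killed by a signal. It is a minimal illustration, not part of rocm_sdk: the helper names describe_exit and run_and_classify are hypothetical, and it assumes rocminfo is on PATH in the activated venv where rocm-core is installed.

```python
import shutil
import signal
import subprocess


def describe_exit(returncode: int) -> str:
    """Translate a subprocess return code into a readable status."""
    if returncode < 0:
        # On POSIX, subprocess reports death-by-signal as a negative code.
        try:
            return f"killed by {signal.Signals(-returncode).name}"
        except ValueError:
            return f"killed by signal {-returncode}"
    return "ok" if returncode == 0 else f"exited with code {returncode}"


def run_and_classify(cmd: list[str]) -> str:
    """Run cmd, discard its output, and classify how it ended."""
    proc = subprocess.run(
        cmd, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL
    )
    return describe_exit(proc.returncode)


if __name__ == "__main__":
    # Assumption: the rocminfo console script is on PATH in this environment.
    exe = shutil.which("rocminfo")
    if exe is None:
        print("rocminfo not found on PATH")
    else:
        print(f"rocminfo: {run_and_classify([exe])}")
```

On an affected gfx1151 runner this would be expected to report a signal-based exit, matching the CalledProcessError in the log. Running the same script against an earlier nightly in a separate venv is one way to narrow down when the regression was introduced.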
Full Logs
py 3.10, torch release/2.9
py 3.11, torch release/2.9
py 3.10, torch nightly
py 3.11, torch nightly
py 3.12, torch release/2.9
py 3.10, torch release/2.10
py 3.14, torch release/2.9
py 3.12, torch release/2.10
py 3.14, torch release/2.10
py 3.10, torch release/2.12
py 3.12, torch nightly
py 3.13, torch release/2.10
py 3.14, torch nightly
py 3.13, torch nightly
py 3.11, torch release/2.12
py 3.13, torch release/2.12
py 3.13, torch release/2.9
py 3.14, torch release/2.12
py 3.11, torch release/2.10
py 3.12, torch release/2.12