Description
Using #1311 as the playground (as of commit 4011bb8, with CI logs at https://github.com/NVIDIA/cuda-python/actions/runs/19951181011), I verified that nv-gha-runners no longer makes containers a hard requirement for running jobs on GPU runners. We can now run GPU jobs just fine on the bare, ephemeral VM, which would help reduce job start time.
The current test blocker is #1307. We recently added xfail markers to tests that we did not think were runnable in the CI. However, those tests do run in the bare VM setup, so they turned from xfail to xpass, which counts as a failure because we set strict mode. This can be easily fixed.
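For illustration, one possible way to handle the xfail/xpass mismatch is to make the marker conditional on the environment. The sketch below is not the actual fix for #1307. It assumes the affected tests fail only inside a container, and the `CI_RUNS_IN_CONTAINER` variable, the `runs_in_container` helper, and the test name are all hypothetical.

```python
import os

import pytest

# Hypothetical environment variable; the real CI may expose this differently.
CONTAINER_ENV_VAR = "CI_RUNS_IN_CONTAINER"


def runs_in_container(environ=None):
    """Return True when the (hypothetical) CI flag marks a container job."""
    environ = os.environ if environ is None else environ
    return environ.get(CONTAINER_ENV_VAR, "0") == "1"


# With strict=True, an unexpected pass (XPASS) is reported as a failure.
# A conditional marker keeps the expectation accurate in both setups:
# in a container the test is expected to fail, on a bare VM it must pass.
@pytest.mark.xfail(
    runs_in_container(),
    reason="assumed not runnable inside the CI container",
    strict=True,
)
def test_feature_needing_bare_vm():
    assert True  # placeholder for the real test body
```

If the tests turn out to pass in both environments, simply removing the marker would be the simpler option.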
In the internal discussion we concluded that we don't need to test against a set of containers. But it is nice to test both container and containerless (i.e. bare VM) environments. We currently have two test workflows:
- test-wheel-linux.yml: Needed because Linux runners required a container (this is no longer the case)
- test-wheel-windows.yml: Needed because Windows runners do not require any container
I suggest we rename and repurpose the two workflows as follows:
- test-wheel-container.yml: This runs all existing Linux tests
- test-wheel-containerless.yml: This runs all existing Linux + Windows tests
- Piggybacking on this refactoring, we can probably get rid of the PowerShell usage in the workflow, since our CI relies heavily on Bash and Git Bash.