-
Notifications
You must be signed in to change notification settings - Fork 1.1k
Better validation of gpu schedules #8068
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
This means we actually print error messages when using exceptions and the makefile
GPU loop constraints were checked in two different places. Checking them
in ScheduleFunctions was incorrect because it didn't consider update
definitions and specializations. Checking them in FuseGPUThreadLoops was
too late, because the Var names have gone (they've been renamed to
things like __thread_id_x). Furthermore, some problems were internal
errors or runtime errors when they should have been user errors. We
allowed 4d thread and block dimensions, but then hit an internal error.
This PR centralizes checking of GPU loop structure in
CanonicalizeGPUVars and adds more helpful error messages that print the
problematic loop structure. E.g:
```
Error: GPU thread loop over f$8.s0.v0 is inside three other GPU thread
loops. The maximum number of nested GPU thread loops is 3. The loop nest
is:
compute_at for g$8:
for g$8.s0.v7:
for g$8.s0.v6:
for g$8.s0.v5:
for g$8.s0.v4:
gpu_block g$8.s0.v3:
gpu_block g$8.s0.v2:
gpu_thread g$8.s0.v1:
gpu_thread g$8.s0.v0:
store_at for f$8:
compute_at for f$8:
gpu_thread f$8.s0.v1:
gpu_thread f$8.s0.v0:
```
Fixes the bug found in #7946
src/ScheduleFunctions.cpp
Outdated
|
|
||
| std::ostringstream err; | ||
|
|
||
| /* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is meant to left here in commented-out form, it is essential you add a comment explaining why. (Otherwise, just delete it.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oops, deleted.
…alide/Halide into abadams/validate_gpu_schedules
|
ready to land? |
* Update makefile to use test/common/terminate_handler.cpp
This means we actually print error messages when using exceptions and
the makefile
* Better validate of GPU schedules
GPU loop constraints were checked in two different places. Checking them
in ScheduleFunctions was incorrect because it didn't consider update
definitions and specializations. Checking them in FuseGPUThreadLoops was
too late, because the Var names have gone (they've been renamed to
things like __thread_id_x). Furthermore, some problems were internal
errors or runtime errors when they should have been user errors. We
allowed 4d thread and block dimensions, but then hit an internal error.
This PR centralizes checking of GPU loop structure in
CanonicalizeGPUVars and adds more helpful error messages that print the
problematic loop structure. E.g:
```
Error: GPU thread loop over f$8.s0.v0 is inside three other GPU thread
loops. The maximum number of nested GPU thread loops is 3. The loop nest
is:
compute_at for g$8:
for g$8.s0.v7:
for g$8.s0.v6:
for g$8.s0.v5:
for g$8.s0.v4:
gpu_block g$8.s0.v3:
gpu_block g$8.s0.v2:
gpu_thread g$8.s0.v1:
gpu_thread g$8.s0.v0:
store_at for f$8:
compute_at for f$8:
gpu_thread f$8.s0.v1:
gpu_thread f$8.s0.v0:
```
Fixes the bug found in halide#7946
* Delete dead code
* Actually clear the ostringstream
GPU loop constraints were checked in two different places. Checking them
in ScheduleFunctions was incorrect because it didn't consider update
definitions and specializations. Checking them in FuseGPUThreadLoops was
too late, because the Var names have gone (they've been renamed to
things like __thread_id_x). Furthermore, some problems were internal
errors or runtime errors when they should have been user errors. We
allowed 4d thread and block dimensions, but then hit an internal error.
This PR centralizes checking of GPU loop structure in
CanonicalizeGPUVars and adds more helpful error messages that print the
problematic loop structure. E.g:
Fixes the bug found in #7946