Outsmart the LLVM optimizer #8073

steven-johnson · 2024-02-07T01:51:00Z

The old definitions of bool_1, bool_2, and bool_3 in simd_op_check_x86 (etc.) all referred to the same entry in in_f32. As of llvm/llvm-project#76367, the LLVM optimizer is smart enough to realize that (e.g.) bool_1 != bool_2 by construction, and it optimizes away the code that tests their conditions, such as the checks for andps and orps. Initializing them from different locations is enough to outsmart the compiler.

(The bug was only noticed in the x86 test, but I updated the other tests to guard against future improvements there too.)
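For readers unfamiliar with the failure mode, here is a rough plain-C++ sketch of the idea. It is not the Halide test code: the function names and the 0.3f thresholds are made up for illustration, and the real test builds Halide expressions rather than C++ bools. It only shows why two conditions derived from the same input can be folded away, while conditions derived from different inputs cannot.

```cpp
#include <cassert>

// Hypothetical stand-ins for the test's boolean inputs. The thresholds are
// illustrative assumptions, not copied from simd_op_check_x86.

// Both conditions read in[0]. They can never hold at the same time, so an
// optimizer is free to fold the AND to a constant false and emit no
// logical instruction for it.
bool and_same_source(const float *in) {
    bool b1 = in[0] > 0.3f;
    bool b2 = in[0] < -0.3f;
    return b1 && b2;
}

// Each condition reads its own element, so nothing relates b1 to b2 at
// compile time and the AND has to be computed from the data.
bool and_distinct_sources(const float *in) {
    bool b1 = in[0] > 0.3f;
    bool b2 = in[1] < -0.3f;
    return b1 && b2;
}

// Small self-check of the behavior described above.
void run_checks() {
    const float data[2] = {0.5f, -0.5f};
    assert(!and_same_source(data));      // contradictory by construction
    assert(and_distinct_sources(data));  // depends on both inputs
}
```

Whether a given compiler actually folds the first function depends on its version and flags; the point is only that the second form gives it no relationship between the two conditions to exploit.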

steven-johnson requested a review from zvookin February 7, 2024 01:51

abadams approved these changes Feb 7, 2024

steven-johnson merged commit 84fe565 into main Feb 7, 2024

steven-johnson deleted the srj/llvm-fp-fix branch February 7, 2024 17:41

BrewTestBot mentioned this pull request Jul 17, 2024: halide 18.0.0 Homebrew/homebrew-core#177657 (Closed)
