@rootjalex
Member

@rootjalex rootjalex commented Oct 26, 2022

This PR adds support for generating saturating_(add | sub) and pmulh(rs) on Skylake and Cannonlake (i.e. for AVX512BW). It also increases simd_op_check test coverage of fixed-point operations on those archs.

I also did a bit of clean-up on the way:

I did not add abs to codegen because it doesn't appear that LLVM currently exposes non-masked versions of AVX512 abs variants.

Fixes #7002
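
For readers unfamiliar with the operations named in the description, here is a minimal scalar sketch of what one 16-bit lane of each does. It models the x86 instruction semantics (`paddsw`, `psubsw`, `pmulhw`, `pmulhrsw`), not Halide's codegen. All function names are hypothetical and chosen for illustration only. It assumes arithmetic right shift on negative values and wrapping narrowing conversions, which C++20 guarantees and mainstream compilers provide in earlier standards.

```cpp
#include <algorithm>
#include <cstdint>

// Clamp a 32-bit intermediate into the int16_t range.
inline int16_t clamp_to_i16(int32_t v) {
    return static_cast<int16_t>(
        std::min<int32_t>(std::max<int32_t>(v, INT16_MIN), INT16_MAX));
}

// One lane of a signed saturating add (paddsw): widen, add, clamp.
inline int16_t saturating_add_i16(int16_t a, int16_t b) {
    return clamp_to_i16(int32_t(a) + int32_t(b));
}

// One lane of a signed saturating subtract (psubsw).
inline int16_t saturating_sub_i16(int16_t a, int16_t b) {
    return clamp_to_i16(int32_t(a) - int32_t(b));
}

// One lane of pmulhw: the high 16 bits of the 32-bit product.
inline int16_t mulhi_i16(int16_t a, int16_t b) {
    return static_cast<int16_t>((int32_t(a) * int32_t(b)) >> 16);
}

// One lane of pmulhrsw: rounded Q15 multiply, ((a * b >> 14) + 1) >> 1.
// The narrowing conversion wraps when a == b == INT16_MIN (assumption:
// modular conversion, see the note above).
inline int16_t mulhrs_i16(int16_t a, int16_t b) {
    int32_t t = ((int32_t(a) * int32_t(b)) >> 14) + 1;
    return static_cast<int16_t>(t >> 1);
}

// Saturating variant of the same rounded multiply: clamp instead of wrap.
inline int16_t saturating_mulhrs_i16(int16_t a, int16_t b) {
    int32_t t = ((int32_t(a) * int32_t(b)) >> 14) + 1;
    return clamp_to_i16(t >> 1);
}
```

The only input where the wrapping and saturating rounded multiplies differ is `a == b == -32768`, which is presumably why a saturating form needs extra handling on top of the raw instruction.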

@rootjalex rootjalex requested a review from abadams October 26, 2022 22:49
@steven-johnson
Contributor

Several legit failures here

@rootjalex
Member Author

Can't quite figure out why the JIT doesn't like ssse3.pabs instructions. I see them used in LLVM tests (e.g. here). Gonna revert the use of those for now, but will still change the .ll to use llvm.abs.

@rootjalex
Member Author

Ugh, same deal with the avx2.pabs instructions (despite showing up in LLVM tests here). I will revert that change and add a comment, but I don't know why these intrinsics in particular are an issue.

@rootjalex
Member Author

Just updated the AVX512_Skylake pabs generation, with a fix to complete_x86_target thanks to @abadams

@rootjalex
Member Author

Only test failure appears unrelated

Contributor

@steven-johnson steven-johnson left a comment

LGTM, tests pass on my AVX512 Linux box

@rootjalex rootjalex merged commit 5da5dfd into main Oct 31, 2022
@rootjalex rootjalex deleted the rootjalex/x86-fp-cleanup branch October 31, 2022 18:36
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
* clean-up abs and saturating_pmulhrs, fix AVX512 saturating_ ops

* add test coverage for AVX512 fp ops

* generate vpabs on AVX512

* faster AVX2 lowering of saturating_pmulhrs

Development

Successfully merging this pull request may close these issues.

Saturating instructions not generated on AVX512