@G-071 G-071 commented Feb 3, 2022

This PR improves the performance of the Kokkos reconstruct kernel. The automatic tiling was suboptimal for this kernel, so we now use manual tiles with 64 work items each. This brings the kernel up to the performance of its CUDA counterpart.
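
For readers unfamiliar with the term, here is a minimal plain C++ sketch of what a manual tile of 64 work items means for a 3D iteration space. It is not the code from this PR: the 4 x 4 x 4 tile shape, the grid sizes, and the names `kTile`, `for_each_tiled`, and `covers_grid_once` are all assumptions made for illustration. In Kokkos itself, explicit tile dimensions can be passed as the third argument of a `Kokkos::MDRangePolicy`, and on the CUDA backend a tile roughly corresponds to one thread block.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical tile shape (an assumption, not taken from the PR):
// 4 * 4 * 4 = 64 work items per tile.
constexpr std::array<std::size_t, 3> kTile = {4, 4, 4};

// Visit every index of an n x n x n grid tile by tile and call f(i, j, k).
// The three outer loops walk over tile origins, the three inner loops walk
// over the work items inside one tile (clipped at the grid boundary).
template <typename F>
void for_each_tiled(std::size_t n, const std::array<std::size_t, 3>& tile, F&& f) {
    assert(tile[0] > 0 && tile[1] > 0 && tile[2] > 0);
    for (std::size_t i0 = 0; i0 < n; i0 += tile[0])
        for (std::size_t j0 = 0; j0 < n; j0 += tile[1])
            for (std::size_t k0 = 0; k0 < n; k0 += tile[2])
                for (std::size_t i = i0; i < std::min(i0 + tile[0], n); ++i)
                    for (std::size_t j = j0; j < std::min(j0 + tile[1], n); ++j)
                        for (std::size_t k = k0; k < std::min(k0 + tile[2], n); ++k)
                            f(i, j, k);
}

// Returns true if the tiled traversal touches every grid cell exactly once.
inline bool covers_grid_once(std::size_t n, const std::array<std::size_t, 3>& tile) {
    std::vector<int> visits(n * n * n, 0);
    for_each_tiled(n, tile, [&](std::size_t i, std::size_t j, std::size_t k) {
        ++visits[(i * n + j) * n + k];
    });
    return std::all_of(visits.begin(), visits.end(), [](int v) { return v == 1; });
}
```

Calling `covers_grid_once(8, kTile)` checks a grid that divides evenly into tiles, while `covers_grid_once(10, kTile)` exercises the clipped partial tiles at the boundary. Which tile shape performs best depends on the kernel and the hardware, which is presumably why a hand-picked size can beat the automatic choice.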

@G-071 G-071 enabled auto-merge February 3, 2022 22:29
@G-071 G-071 requested a review from diehlpk February 3, 2022 22:38
@diehlpk diehlpk left a comment

LGTM!

@G-071 G-071 merged commit f1be291 into master Feb 4, 2022
@G-071 G-071 deleted the fix_kokkos_reconstruct_tiling branch February 4, 2022 03:25
3 participants