@laurentcau

  • Add simd aligned_vec3 (and sse aligned_dvec3 - 2 x xmm)
  • Fast packed_vec3 <=> aligned_vec3 and packed_vec4 <=> aligned_vec4 conversion
  • Fast aligned_vec3 <=> aligned_vec4 conversion
  • Optimized aligned_mat x aligned_mat and aligned_mat x aligned_vec
  • Inverse aligned_mat3 SIMD version (actually slower than SISD on my computer, even though it has 30% fewer instructions?)
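
To illustrate what the packed <=> aligned and vec3 <=> vec4 conversions listed above involve, here is a minimal sketch in plain C++. It is not the code from this PR and does not use GLM: `PackedVec3`, `AlignedVec3`, `AlignedVec4` and the four helper functions are hypothetical stand-ins, and the assumption that the aligned vec3 occupies one 16-byte register-sized slot with an unused fourth lane is mine.

```cpp
#include <cstddef>

// Hypothetical stand-ins for illustration only; these are NOT GLM's types.
// A packed vec3 is three tightly packed floats.
struct PackedVec3 { float x, y, z; };

// Assumption: an aligned vec3 is padded to 16 bytes so it fits one SIMD register.
struct alignas(16) AlignedVec3 { float x, y, z, pad; };
struct alignas(16) AlignedVec4 { float x, y, z, w; };

static_assert(sizeof(PackedVec3) == 12, "packed vec3 has no padding");
static_assert(sizeof(AlignedVec3) == 16 && alignof(AlignedVec3) == 16,
              "aligned vec3 fills one 16-byte slot");

// packed -> aligned: copy three lanes and zero the unused fourth lane.
inline AlignedVec3 to_aligned(const PackedVec3& p) {
    return AlignedVec3{p.x, p.y, p.z, 0.0f};
}

// aligned -> packed: drop the padding lane.
inline PackedVec3 to_packed(const AlignedVec3& a) {
    return PackedVec3{a.x, a.y, a.z};
}

// aligned vec3 -> aligned vec4: same layout, only the fourth lane is set.
inline AlignedVec4 to_vec4(const AlignedVec3& a, float w) {
    return AlignedVec4{a.x, a.y, a.z, w};
}

// aligned vec4 -> aligned vec3: keep xyz and clear the fourth lane.
inline AlignedVec3 to_vec3(const AlignedVec4& v) {
    return AlignedVec3{v.x, v.y, v.z, 0.0f};
}
```

Because both aligned types share the same 16-byte layout in this sketch, a vec3 <=> vec4 conversion only touches one lane, which is why such conversions can be cheap when the data already sits in a register.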

@laurentcau
Author

Note: I fixed the integer division issue reported in #1255.
I also added a test to guard against future regressions.
As for the template issues with GLM_FORCE_NEON, I can't fix them without more details.
All tests compile fine on my side with clang + NEON, so the problem must be something the tests don't cover, but what? @dimitre

@laurentcau laurentcau force-pushed the b7 branch 3 times, most recently from 5a553eb to 64038f9 Compare March 18, 2024 10:11

@christophe-lunarg

Considering you didn't receive any answer on the ARM NEON issue after multiple queries, I think it's fair to break it and let whoever it matters to submit a PR.

@christophe-lunarg

Thanks for contributing!
