Add functionality inconsistently missing from types #264
All missing `Cmp...` trait implementations were added. Some tests for existing functions were missing. There was previously a bug in `u64x8::simd_lt`.
Previously `Not` was only implemented for references of 128-bit types.
The `Sum` trait was already implemented via a macro for all types but `f32x8`. I do not know if `f32x8` was left out intentionally or by mistake, since I cannot think of any optimization that can be made to this implementation.
For now 8-bit and 16-bit types do not have optimized implementations.
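To make the `Not` and `Sum` additions concrete, here is a minimal sketch on a toy type. `Lanes4` is a hypothetical stand-in and not a type from the crate; the sketch only assumes that a by-reference `Not` forwards to the by-value one and that `Sum` is a lane-wise fold starting from zero.

```rust
use std::iter::Sum;
use std::ops::{Add, Not};

/// Toy 4-lane vector (hypothetical, not a type from the crate).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct Lanes4(pub [u32; 4]);

impl Not for Lanes4 {
    type Output = Lanes4;
    fn not(self) -> Lanes4 {
        // Flip every bit of every lane.
        Lanes4(self.0.map(|lane| !lane))
    }
}

// Forwarding impl so that `!&value` works without an explicit copy.
impl Not for &Lanes4 {
    type Output = Lanes4;
    fn not(self) -> Lanes4 {
        !*self
    }
}

impl Add for Lanes4 {
    type Output = Lanes4;
    fn add(self, rhs: Lanes4) -> Lanes4 {
        let mut out = [0u32; 4];
        for i in 0..4 {
            out[i] = self.0[i].wrapping_add(rhs.0[i]);
        }
        Lanes4(out)
    }
}

// Lane-wise sum over an iterator of vectors, starting from all zeros.
impl Sum for Lanes4 {
    fn sum<I: Iterator<Item = Lanes4>>(iter: I) -> Lanes4 {
        iter.fold(Lanes4([0; 4]), |acc, v| acc + v)
    }
}
```

With these impls, `!&v` and `!v` give the same result, and `iter.sum::<Lanes4>()` adds the vectors lane by lane.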
As another
I don't have time to check closely on this today, but I will probably have time tomorrow. I like what I see so far. I do like the macro test idea.
It's great to hear this PR is helpful! I am working on a crate that uses
About

```rust
impl ... {
    #[expect(private_bounds)]
    pub fn simd_eq<Rhs>(self, rhs: Rhs) -> Self
    where
        Self: SimdEq<Rhs>,
    {
        SimdEq::simd_eq(self, rhs)
    }
}
```

This solution would still support both SIMD values and scalars for the second argument. This seems like a good solution, unless there is another use for these traits that I am missing?
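For readers who want to try the idea, the following is a self-contained sketch of it on a toy type. `I32x4`, its array representation, and the all-ones mask convention are assumptions made for illustration; `#[allow(private_bounds)]` is used here instead of `#[expect(...)]` only so that the snippet also builds in contexts where the lint does not fire.

```rust
/// Toy 4-lane integer vector (hypothetical stand-in, not the crate's type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct I32x4(pub [i32; 4]);

// Private helper trait: callers never name it, they only call the inherent method.
trait SimdEq<Rhs> {
    fn simd_eq(self, rhs: Rhs) -> Self;
}

impl SimdEq<I32x4> for I32x4 {
    fn simd_eq(self, rhs: I32x4) -> I32x4 {
        let mut out = [0i32; 4];
        for i in 0..4 {
            // Assumed mask convention: all bits set (-1) when equal, 0 otherwise.
            out[i] = if self.0[i] == rhs.0[i] { -1 } else { 0 };
        }
        I32x4(out)
    }
}

impl SimdEq<i32> for I32x4 {
    fn simd_eq(self, rhs: i32) -> I32x4 {
        // Splat the scalar into every lane, then reuse the vector comparison.
        SimdEq::simd_eq(self, I32x4([rhs; 4]))
    }
}

impl I32x4 {
    // One inherent method that accepts every `Rhs` the private trait supports.
    #[allow(private_bounds)]
    pub fn simd_eq<Rhs>(self, rhs: Rhs) -> Self
    where
        Self: SimdEq<Rhs>,
    {
        SimdEq::simd_eq(self, rhs)
    }
}
```

In this sketch both `v.simd_eq(other_vector)` and `v.simd_eq(3)` go through the same inherent method, so nothing is shadowed.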
About For
The result of
All functions added (unless I made a mistake) already existed for some types but were missing from others. For example, Documentation is missing from the entire crate so maybe fixing that should be a separate PR? About traits, these functions should probably only be in traits if the entire API is moved into traits. That is possible, but personally I dislike the idea.
Hmm. Indeed I use Maybe it would then be better to have a |
Possibility: If we change the existing inherent methods to take Into instead of Self, that wouldn't be a breaking change I think? Then we can have a From impl for scalar values into wide values. I think those From impls mostly already exist, but might not for all types. I would like it to be totally uniform where possible, but unfortunately I would also like to avoid breaking changes. Maybe we can just do as much as possible for the 1.0 version, and leave an overhaul for 2.0 (which is, ideally, some time after the stdlib simd types become stable, and |
Rust lists this change as minor: https://doc.rust-lang.org/cargo/reference/semver.html#fn-generalize-compatible. Inherent functions could be added using I agree that it is a bad idea to make breaking changes especially because of
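The `Into`-based generalization discussed in this thread could look roughly like the sketch below. The type `I32x4`, the method `simd_gt`, and the splat `From<i32>` impl are hypothetical illustrations, not the crate's actual API.

```rust
/// Toy 4-lane integer vector (hypothetical stand-in, not the crate's type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct I32x4(pub [i32; 4]);

// Splat conversion: a scalar becomes a vector with that value in every lane.
impl From<i32> for I32x4 {
    fn from(value: i32) -> Self {
        I32x4([value; 4])
    }
}

impl I32x4 {
    // Before the generalization this parameter would have been `rhs: Self`.
    // Existing callers that pass `Self` keep compiling, because `Self: Into<Self>`.
    pub fn simd_gt(self, rhs: impl Into<Self>) -> Self {
        let rhs: Self = rhs.into();
        let mut out = [0i32; 4];
        for i in 0..4 {
            // Assumed mask convention: all bits set (-1) when true, 0 otherwise.
            out[i] = if self.0[i] > rhs.0[i] { -1 } else { 0 };
        }
        I32x4(out)
    }
}
```

The linked SemVer section notes that generalizing a signature like this can occasionally affect type inference at call sites, so it may be worth checking existing callers.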
Two more notes:
Let's put the macro tests into the PR now, let's put the method adjustment and trait thing on hold until a later PR. I'd rather just merge the stuff that we know is an improvement, and save the pondering.
This is the Rust equivalent of SSE2's
This is the Rust equivalent of SSE2's Hope it clarifies these functions 😄. |
This adds functionality that already exists for some types but is inconsistently missing from others. I did my best to find all missing functionality and either fix it or list it below.

Because a ton of functionality is missing, this PR is quite big. I am not sure if this should have been split into multiple PRs?

This also fixes a bug I found in `u64x8::simd_lt` and optimizes `f64xN::round_int`.

Added missing trait implementations:

- `Cmp...` traits for many integer types (including `Cmp...<T>`).
- `impl Not for &Wide` for all types but `f32x4`, which already had it.
- `impl Sum<...> for f32x8`
- `impl From<&[T]>` for types where it is missing.

Added functions missing from `f64xN`:

- `recip`
- `recip_sqrt`
- `trunc_int`
- `fast_trunc_int`
- `fast_round_int`

Added functions missing from some integer types:

- `abs`
- `unsigned_abs`
- `any`
- `all`
- `none`
- `transpose`
- `is_negative`
- `reduce_add`
- `reduce_max`
- `reduce_min`
- `saturating_add`
- `saturating_sub`

Notes
The `Sum` trait was already implemented via a macro for all types but `f32x8`. I do not know if this was intentional or a mistake, but I cannot think of any optimization that can be made for `f32x8`.

Currently `transpose` is left non-optimized for 8-bit and 16-bit integers.

Questions
For `f32x8` (maybe for other types as well?) the existing implementations for `recip` and `recip_sqrt` use the intrinsics `_mm256_rcp_ps` and `_mm256_rsqrt_ps`, which are documented as having "relative error". These functions should probably be deterministic across platforms, so would it be better to replace those intrinsics with `1 / self`?

Some integer types have `simd_...` inherent functions in addition to `Cmp...` trait implementations (e.g., `i32x16`). This is bad because it stops you from writing `wide_value.simd_eq(scalar_value)` (e.g., `i32x16::ZERO.simd_eq(0)`), which the trait implementations support but the inherent function that shadows the trait method does not. What should be done about this? (Removing functions is a breaking change.)
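The shadowing problem in the second question can be reproduced on a toy type. In the sketch below, `I32x4` and this `CmpEq` trait are simplified, hypothetical stand-ins written for illustration; they are not copied from the crate.

```rust
/// Toy 4-lane integer vector (hypothetical stand-in, not the crate's type).
#[derive(Clone, Copy, Debug, PartialEq)]
pub struct I32x4(pub [i32; 4]);

/// Simplified comparison trait, generic over the right-hand side.
pub trait CmpEq<Rhs = Self> {
    type Output;
    fn simd_eq(self, rhs: Rhs) -> Self::Output;
}

impl CmpEq for I32x4 {
    type Output = I32x4;
    fn simd_eq(self, rhs: I32x4) -> I32x4 {
        let mut out = [0i32; 4];
        for i in 0..4 {
            out[i] = if self.0[i] == rhs.0[i] { -1 } else { 0 };
        }
        I32x4(out)
    }
}

impl CmpEq<i32> for I32x4 {
    type Output = I32x4;
    fn simd_eq(self, rhs: i32) -> I32x4 {
        CmpEq::simd_eq(self, I32x4([rhs; 4]))
    }
}

impl I32x4 {
    // Inherent method with the same name as the trait method.
    // Method-call syntax (`v.simd_eq(..)`) always resolves to this one.
    pub fn simd_eq(self, rhs: Self) -> Self {
        CmpEq::simd_eq(self, rhs)
    }
}

pub fn eq_scalar(v: I32x4, s: i32) -> I32x4 {
    // `v.simd_eq(s)` would be a type error here: the inherent method expects `I32x4`.
    // Fully qualified syntax still reaches the scalar trait impl.
    <I32x4 as CmpEq<i32>>::simd_eq(v, s)
}
```

The scalar impl stays reachable, but only through the fully qualified form, which is the ergonomic cost the question describes.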
Inconsistencies that were not fixed
These inconsistencies were not fixed, either because I was not sure what the correct solution is or because they are extremely small and are listed anyway. This includes all inconsistencies that I found and did not fix:
- Functions `unpack_lo/hi` have unclear names which mean different things depending on the type. `u8x16` has similarly named functions `unpack_low/high`.
- `mul_widen` and `mul_keep_high`
- `dot`
- `mul_scale_round` and `mul_scale_round_n`. These are tricky because existing implementations use intrinsics specific to `i16`.
- Float `sign_bit` (does not count because it is deprecated).
- `i8xN` functions `swizzle`, `swizzle_relaxed`, `swizzle_half`, `swizzle_half_relaxed`. These use intrinsics specific to `i8`.
- Weird inconsistent conversions: `i8x16::from_i16x16_saturate/truncate`, `i/u16x8::from_u8x16_low/high`, `i16x8::from_i8x16`, `f64xN::from_i32xN`, `f32xN::from_i32xN` (missing for `f64` and are similar to `i32xN::round_float`), `f64x2::from_i32x4_lower2`, `u8x16::narrow_i16x8`.
- `i8x16/i16x8::from_slice_unaligned` is missing from other types and behaves differently from `Self::from(&[...])`.
- `i32x8` implements `From<&[i8]>`, which is not done for any other type.
- `impl From<i32x4> for f64x2`, which is a non-obvious conversion that is not implemented for any other type.

Tests
Adding tests separately for each type leads to bugs and takes a long time.
How do you feel about changing tests to use macros? This could make it easier to add functionality consistently in the future.
Possible syntax:
I hope refactoring existing tests using this won't take too long, but doing so will probably find bugs.
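As an illustration of the macro-based approach, here is one possible shape for such a test. This is a sketch under stated assumptions and not the syntax proposed in the PR: the macros `define_lanes!` and `check_saturating_add!` and the toy types `U8x4` and `U16x4` are all hypothetical.

```rust
// Hypothetical helper that defines a toy 4-lane unsigned vector type.
macro_rules! define_lanes {
    ($name:ident, $lane:ty) => {
        #[derive(Clone, Copy, Debug, PartialEq)]
        pub struct $name(pub [$lane; 4]);

        impl $name {
            pub fn saturating_add(self, rhs: Self) -> Self {
                let mut out = self.0;
                for i in 0..4 {
                    // Unsigned overflow clamps to the maximum lane value.
                    out[i] = self.0[i].checked_add(rhs.0[i]).unwrap_or(<$lane>::MAX);
                }
                $name(out)
            }
        }
    };
}

define_lanes!(U8x4, u8);
define_lanes!(U16x4, u16);

// Hypothetical test macro: one body, checked against the scalar reference
// implementation for every listed type.
macro_rules! check_saturating_add {
    ($($ty:ident : $lane:ty),+ $(,)?) => {{
        $({
            let a: [$lane; 4] = [0, 1, <$lane>::MAX - 1, <$lane>::MAX];
            let b: [$lane; 4] = [<$lane>::MAX, 1, 5, 1];
            let got = $ty(a).saturating_add($ty(b));
            for i in 0..4 {
                assert_eq!(
                    got.0[i],
                    a[i].saturating_add(b[i]),
                    "{} lane {}",
                    stringify!($ty),
                    i
                );
            }
        })+
    }};
}

pub fn run_checks() {
    check_saturating_add!(U8x4: u8, U16x4: u16);
}
```

Adding a new type to the list then exercises the same edge cases automatically, which is the consistency benefit argued for above.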