Optimize crc64_rocksoft for aarch64#327
Conversation
pablodelara
left a comment
There was a problem hiding this comment.
Hi @tipabu. What do you mean with this phrase? "I'm not how worried to be about them." It looks like this needs some rewording, thanks.
e21f296 to
b417007
Compare
|
Fixed up -- sorry about that! Was just missing a "sure":
|
|
FWIW, running locally (M4 MacBook Air) with master I see results like vs with this. Also did some extra equivalency testing against the base implementation with https://paste.opendev.org/show/bG3ZmKo1m1D2MZqb7B8Q/ |
|
Thanks @tipabu. Could you also update Release notes with this change? |
|
Should it go under |
|
Is someone going to add the rock soft polynomial to the X86 code base? |
|
It's already there and optimized (see |
Somehow I missed that. Is there a program to generate the rk constants used in the .asm files. similar to what I did when I ported an older version to build with Visual Studio, such as crc64fg.cpp or crc64rg.cpp ? |
Under 3. CHANGE LOG..., in the v2.32 section. Thanks! |
Don't forget to rebase :) |
|
@jeffareid Your same programs work; just need to @pablodelara Done. |
That difference of 1 only occurs with refl CRC, and there is no advantage to not have the right most bit of rk8 set to 1. For reflected CRC, rk8 = the right most 64 bits of a 65 bit reversed polynomial. The missing 65th bit represents x^0, while the right most bit in rk8 represents x^64. Since the CRC is 64 bits, that right most x^64 bit can be ignored (set to 0), but there is no advantage to not having it set to 1. If interested, I can explain this in more detail. |
|
Thanks for the detailed explanation, @jeffareid! I think it finally clicked for me with
|
If fully implemented, then the bits for x^64 to x^127 will all end up equal to zero, but that calculation is not needed since the returned CRC is only for x^0 to x^63. Still there's no advantage of leaving out the x^64 bit in rk8, so I don't know why that was done. |
| .size crc64_tab, 2048 | ||
| #endif | ||
|
|
||
| crc64_tab: |
There was a problem hiding this comment.
is this table used in the pmull implementation or elsewhere?
There was a problem hiding this comment.
My understanding is that it's being referenced (as .lanchor_crc_tab above) in crc/aarch64/crc64_norm_common_pmull.h / crc/aarch64/crc64_refl_common_pmull.h
| .equ br_high_b2, 0xd235 | ||
| .equ br_high_b3, 0xad93 | ||
|
|
||
| .text |
There was a problem hiding this comment.
It seems like this .text directive shouldn't be here, as this section is declaring read only data, not executable code.
There was a problem hiding this comment.
This was cribbed from the other crc64 optimizations -- I can drop it if you'd like, but I'd prefer to drop it from all of them at once.
There was a problem hiding this comment.
Dropped, here and from the others.
This makes it easier to compare the constants used for crc/crc64_*_by8.asm, crc/crc64_*_by16_10.asm, and crc/aarch64/crc64_*_pmull.h Note that this revealed some discrepancies: ecma_refl: br_high != rk8 (92d8af2baf0e1e85 vs 92d8af2baf0e1e84) iso_refl: br_high != rk8 (b000000000000001 vs b000000000000000) jones_refl: br_high != rk8 (2b5926535897936b vs 2b5926535897936a) but they should be innocuous. Signed-off-by: Tim Burke <tim.burke@gmail.com>
Signed-off-by: Tim Burke <tim.burke@gmail.com>
Closes intel#326 Signed-off-by: Tim Burke <tim.burke@gmail.com>
|
PR merged, thanks! |
Closes #326