-
Notifications
You must be signed in to change notification settings - Fork 415
Fix computation of cache-line size on Power #6515
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
d538b0a to
df47457
Compare
port/unix/omrcpu.c
Outdated
| :"r"((void *) &buf[512])); | ||
| memset(buf, 255, sizeof(buf)); | ||
|
|
||
| __asm__ __volatile__( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect the __volatile__ isn't needed. The port library copy doesn't use it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah nm, I see you just added it here. I guess it doesn't hurt.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was used in both places in j9memclr.cpp and won't hurt. In fact, it may save us from a future compiler optimization.
|
jenkins build all |
|
The changes LGTM, but can the cache line size detection be factored out into a single place? |
I'll look into whether there's a clear path to that. |
df47457 to
18814e6
Compare
|
Updated as suggested, reusing |
|
The description of the functional fix (i.e. the memory clobber) seems to have gone missing from the commit message. It's probably more important now than it was before, since there are more changes |
The compiler needs to be told that our uses of dcbz clobber memory, otherwise the optimizer can reasonably assume in getCacheLineSize(), that each element of buf still has the value (255) assigned by memset(). * reuse getCacheLineSize() in omrcpu_startup() * use uint32_t consistently for cache line size Signed-off-by: Keith W. Campbell <keithc@ca.ibm.com>
18814e6 to
8b3bf5b
Compare
|
That description was never in the commit message itself (only in the description here); it's now also in the commit message. |
|
Jenkins build all |
This allows the compiler to do the widening operation once, rather than repeatedly in loops, fixing a performance regression introduced in eclipse-omr#6515. Signed-off-by: Keith W. Campbell <keithc@ca.ibm.com>
This allows the compiler to do the widening operation once, rather than repeatedly in loops, fixing a performance regression introduced in eclipse-omr#6515. Signed-off-by: Keith W. Campbell <keithc@ca.ibm.com>
We need to inform the compiler that our uses of
dcbzclobber memory, otherwise the optimizer can reasonably assume, e.g. inomrcpu_startup(),thatbuf[0] == 255at the beginning of the for loop.