RISC-V integration and JIT compiler #3725
|
@Slayingripper please test on your board (with and without |
|
Can you also build https://github.com/tevador/RandomX/ and run the benchmark there? I need this to check if I didn't break anything while porting the code. |
|
All good |
|
And the last line (after the reference result) - performance (hashes per second)? |
Performance: 86.8955 hashes per second |
|
All good then. XMRig has a better soft AES implementation, so it should be a bit faster even with the same JIT compiler. |
|
One more question: RxDataset_riscv.h and the whole src/crypto/riscv folder don't seem to be used anywhere. What is the purpose of these files? |
I had forgotten about this. I was thinking that RISC-V had some crypto extension, but until this happens, these files won't get utilised. I was hoping to figure out a way to implement this, but maybe it's too early for RISC-V. You could remove them.
I was digging into the Orange Pi's documentation in the hopes I could figure out a way to utilise the "AI" features. But I guess it was just marketing hype. |
|
@Slayingripper can you also run |
|
Unfortunately (or fortunately) nothing unusual there, so auto-config should already create 8 threads for it. |
|
@KiritakeKumi That hashrate is too low for SG2042. Did you test the latest
This is very suspicious since the same chip is used in the XMR MINER X5 if I'm not mistaken. The low hashrate is quite interesting |
|
X5 uses SG2042R - a custom version of SG2042. I suspect it has hardware AES + vector instructions + some extra instructions specifically for common RandomX code sequences. But the regular SG2042 shouldn't be more than 10x slower anyway. |
Yes, I agree, but at least it's working. Thinking about this for a second, the result may be plausible, although this is not a 1:1 comparison: the RV2 gets around 100 H/s on 8 cores, so scaling linearly to 64 cores gives (100*64)/8 = 800 H/s, and on the surface the benchmark does make sense. I will also add that both of these chips have relatively low L1 cache, which, if I'm not mistaken, also impacts performance. |
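The back-of-the-envelope estimate in this comment can be written out as a small sketch. It assumes hashrate scales linearly with core count, which ignores cache, memory bandwidth and NUMA effects; the function name `scale_hashrate` is made up for illustration and is not part of XMRig.

```python
def scale_hashrate(measured_hs: float, measured_cores: int, target_cores: int) -> float:
    """Naive linear extrapolation: per-core hashrate times the target core count."""
    if measured_cores <= 0:
        raise ValueError("measured_cores must be positive")
    return measured_hs / measured_cores * target_cores


# Figures quoted in the comment: ~100 H/s on 8 cores, extrapolated to 64 cores.
estimate = scale_hashrate(100.0, 8, 64)
print(f"{estimate:.0f} H/s")
```

This is only a rough plausibility check, not a prediction of what a given board will reach.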
|
RV2 does 37 H/s on a single core though, even with only 512 KB cache. SG2042 has 64 MB cache, so it can run 32 threads at full speed - it should be much faster per thread.
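The "32 threads" figure is consistent with one 2 MiB scratchpad per mining thread, which is RandomX's default scratchpad size (that size is not stated in this thread, so treat it as an assumption here). A minimal sketch of the arithmetic, with a hypothetical helper name:

```python
MIB = 1024 * 1024

# Assumption: RandomX's default scratchpad of 2 MiB per mining thread.
RANDOMX_SCRATCHPAD_BYTES = 2 * MIB


def max_full_speed_threads(shared_cache_bytes: int,
                           scratchpad_bytes: int = RANDOMX_SCRATCHPAD_BYTES) -> int:
    """Number of scratchpads that fit entirely in the given cache."""
    return shared_cache_bytes // scratchpad_bytes


sg2042_threads = max_full_speed_threads(64 * MIB)   # 64 MB cache quoted above
rv2_threads = max_full_speed_threads(512 * 1024)    # 512 KB cache quoted above
print(sg2042_threads, rv2_threads)
```

With 512 KB of cache not even one scratchpad fits, which is why the RV2's 37 H/s on a single core is a useful lower bound for what a core with enough cache should manage.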
Yes, I'm using the latest dev branch. Regarding the core count, I found that 20 cores is optimal; could the lack of gains at higher core counts be due to NUMA optimization issues? Compared to the SG2042, the SG2042R has presumably added a RandomX hardware accelerator.
Thread affinity can also be important. When you set thread count manually, threads are not fixed to specific cores. Can you run |
topology.xml |
|
Weird, it doesn't show cache size per core. But it does show that this CPU is split into 4 NUMA nodes, so running XMRig in auto mode is important. Try to use |
|
After setting it up, I can now achieve a speed of around 2100 H/s. Without RVV acceleration, I think this speed is quite reasonable now. |
-DARCH=native for RISC-V builds - it will test and enable zba and zbb extensions for RandomX
Closes #1924
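To see independently whether a board advertises zba and zbb, one option is to look at the ISA string the Linux kernel reports on RISC-V (the `isa` line in /proc/cpuinfo). The sketch below only parses such a string; it is not how the build performs its own test, the sample string is hypothetical, and older kernels may not list multi-letter extensions at all.

```python
def isa_extensions(isa: str) -> set:
    """Split a RISC-V ISA string into its single- and multi-letter extensions."""
    parts = isa.strip().lower().split("_")
    base = parts[0]                      # e.g. "rv64imafdcv"
    exts = set(parts[1:])                # multi-letter extensions such as "zba"
    if base.startswith(("rv32", "rv64")):
        exts.update(base[4:])            # single-letter extensions after the prefix
    return exts


# Hypothetical ISA string, for illustration only.
sample = "rv64imafdcv_zicsr_zifencei_zba_zbb"
exts = isa_extensions(sample)
has_zba_zbb = {"zba", "zbb"} <= exts
print(has_zba_zbb)
```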