As part of my testing of the System76 Thelio Astra (see geerlingguy/sbc-reviews#53), I wanted to test a few different GPUs and models on the system... note that the unit I was shipped came with the base (and very basic—only 4GB of VRAM!) Nvidia RTX A400 workstation GPU. I am also testing a few others...
| Device | CPU/GPU | Model | Speed | Power (Peak) |
|---|---|---|---|---|
| Nvidia A400 | GPU | llama3.2:3b | 35.51 Tokens/s | 167 W |
| Nvidia A400 | CPU/GPU | llama3.1:8b | 2.79 Tokens/s | 190 W |
| Nvidia A400 | CPU/GPU | llama2:13b | 7.93 Tokens/s | 223 W |
| Nvidia A4000 | GPU | llama3.2:3b | 90.92 Tokens/s | 244 W |
| Nvidia A4000 | GPU | llama3.1:8b | 59.11 Tokens/s | 250 W |
| Nvidia A4000 | GPU | llama2:13b | 44.00 Tokens/s | 254 W |
| AMD Pro W7700¹ | GPU | llama3.2:3b | 89.31 Tokens/s | 261 W |
| AMD Pro W7700¹ | GPU | llama3.1:8b | 56.92 Tokens/s | 278 W |
| AMD Pro W7700¹ | CPU/GPU² | llama2:13b | 8.41 Tokens/s | 187 W |
| Nvidia 4080 Super | GPU | llama3.2:3b | 232.21 Tokens/s | TODO |
¹ These GPUs were tested using llama.cpp with Vulkan support — see comments below.
² This model didn't seem to run entirely on the GPU; I'm not quite sure why.
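
For anyone wanting to sanity-check the Tokens/s figures on their own hardware, the same metric can be read straight from Ollama's REST API, which reports `eval_count` (generated tokens) and `eval_duration` (nanoseconds) for each generation. The sketch below is only a rough cross-check, not the exact methodology used for the table above; it assumes a default local Ollama install on port 11434, and the prompt and model list are placeholders:

```python
# Minimal sketch: measure generation speed via Ollama's HTTP API.
# Assumes Ollama is running locally on the default port and the models
# have already been pulled (`ollama pull llama3.2:3b`, etc.).
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"

def benchmark(model: str, prompt: str = "Why is the sky blue?") -> float:
    """Run one non-streaming generation and return tokens per second."""
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        result = json.load(resp)
    # eval_count = generated tokens; eval_duration = time in nanoseconds.
    return result["eval_count"] / result["eval_duration"] * 1e9

if __name__ == "__main__":
    for model in ["llama3.2:3b", "llama3.1:8b", "llama2:13b"]:
        print(f"{model}: {benchmark(model):.2f} Tokens/s")
```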