Skip to content

Latest commit

 

History

History
 
 

Benchmarking

Here we collect prefill and generation speed obtained with different hardware.

Run ds4-bench as:

./ds4-bench \
  -m ds4flash.gguf \
  --prompt-file speed-bench/promessi_sposi.txt \
  --ctx-start 2048 \
  --ctx-max 65536 \
  --step-incr 2048 \
  --gen-tokens 128

Provide PR including your numbers if your hardware was not already tested. Call the benchmark csv file something like m3_max.csv or alike, so that it is clear what hardware was used for the benchmark.

To generate an SVG graph from a CSV file:

python3 speed-bench/plot_speed.py speed-bench/m3_max.csv --title "M3 Max t/s"

The script uses only the Python standard library. By default it writes a file next to the CSV using the _ts.svg suffix, such as speed-bench/m3_max_ts.svg.