[BOUNTY - $500] llama.cpp inference engine

- It should automatically detect the best device to run on.
- We should require zero manual configuration from the user. By default, llama.cpp, for example, requires the user to specify the device.
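
To make the "best device" requirement concrete, here is a rough sketch of what the selection step could look like. Everything in it is an assumption for discussion, not llama.cpp's API: the `Device` record, the `BACKEND_PRIORITY` order, and the `pick_best_device` helper are all hypothetical names, and real device enumeration (which backends exist, how much memory is free) would have to come from the engine itself.

```python
from dataclasses import dataclass

# Hypothetical preference order (earlier is preferred). This is an assumption
# for illustration, not something defined by llama.cpp.
BACKEND_PRIORITY = ("cuda", "metal", "vulkan", "cpu")


@dataclass(frozen=True)
class Device:
    backend: str          # one of BACKEND_PRIORITY
    name: str             # human-readable label
    free_memory_mb: int   # free memory reported for this device


def pick_best_device(devices, model_size_mb):
    """Pick a device with no user input.

    Accelerators are considered only if the model fits in their free memory.
    CPU devices are always considered, so there is a fallback whenever one is listed.
    Ties within a backend go to the device with more free memory.
    """
    candidates = [
        d for d in devices
        if d.backend in BACKEND_PRIORITY
        and (d.backend == "cpu" or d.free_memory_mb >= model_size_mb)
    ]
    if not candidates:
        # Nothing usable was detected: fall back to a placeholder CPU device.
        return Device("cpu", "fallback-cpu", 0)
    return min(
        candidates,
        key=lambda d: (BACKEND_PRIORITY.index(d.backend), -d.free_memory_mb),
    )


if __name__ == "__main__":
    detected = [
        Device("cpu", "host", 32000),
        Device("cuda", "gpu0", 8000),
        Device("cuda", "gpu1", 24000),
    ]
    print(pick_best_device(detected, model_size_mb=16000).name)
```

The point of the sketch is only that the user never passes a device flag: the engine enumerates what is available, filters by what can actually hold the model, and falls back to CPU otherwise. The actual ranking policy is open for whoever picks up the bounty.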