English|中文
v_llama_cpp is a V language binding for llama.cpp, letting you use llama.cpp functionality directly in V projects.
llama.cpp is an LLM (Large Language Model) inference framework implemented in C++, with the following main features:
- Pure CPU Inference: Run large models without a GPU
- Quantization Support: Supports INT4, INT5, INT8 and other quantization formats, significantly reducing memory requirements
- Cross-Platform: Works on Windows, Linux, macOS, and even mobile devices
- Efficient Performance: Optimized for ordinary hardware; runs well even on regular laptops
Simply put, llama.cpp lets you run large models such as DeepSeek, Qwen, and ChatGLM locally on consumer-grade hardware.
It is recommended to download the source code using git:
```bash
# Download from GitHub
git clone https://github.com/sakana-ctf/v_llama_cpp

# For users in China, download from Gitee
git clone https://gitee.com/sakana_ctf/v_llama_cpp
```

Build and check the llama.cpp environment; if the llama.cpp environment does not exist, the script will attempt to install it:
```bash
v install.vsh
```

Note: Installing llama.cpp with vlang may require root privileges; in that case use `sudo v install.vsh`.
A convenient method is now provided to uninstall the current repository:
```bash
v unstall.vsh
```
If you had v_llama_cpp configured before updating, the installation process will uninstall the old version first and then reinstall it.
Direct installation using the following command is planned for the future:
```bash
v install v_llama_cpp
```

Several basic examples are provided in the ./examples/ folder. Below is the simplest calling method, ./examples/ez_simple.v:

```v
module main
import os
import v_llama_cpp {
	ModelUrl,
}
fn main() {
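	// download mirrors for the model file and its expected SHA-256 checksum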
	model_url := ModelUrl{
		url: [
			'https://www.modelscope.cn/models/unsloth/DeepSeek-R1-Distill-Qwen-1.5B-GGUF/resolve/master/DeepSeek-R1-Distill-Qwen-1.5B-Q2_K.gguf',
			'https://huggingface.co/unsloth/DeepSeek-R1-Distill-Qwen-1.5B-GGUF/resolve/main/DeepSeek-R1-Distill-Qwen-1.5B-Q2_K.gguf',
		]
		sha256: '6b01273c847100f7e594c34869670430fc3597b3897f839664ed4ba4588f5c54'
	}
	model_path := './DeepSeek-R1-Distill-Qwen-1.5B-Q2_K.gguf'
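	// download the GGUF file to model_path if it is not already present, then load it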
	mut ctx := ModelUrl(model_url).ez_load_model(model_path, -1, 2048, 512) or {
		println('load model failed.')
		return
	}
	input_buffer := os.input('>')
	prompt := '<|User|>${input_buffer}<|Assistant|><think>\n'
	print('deepseek: ')
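	// generate a reply for the prompt, streaming each token to print_token as it is produced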
	ctx.ez_response(prompt, 512, 256, print_token) or { println('response failed.') }
}
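
// callback passed to ez_response; prints each generated token as it arrives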
fn print_token(token string) {
	print(token)
}
```

The model file will be automatically downloaded as './DeepSeek-R1-Distill-Qwen-1.5B-Q2_K.gguf' in the directory where the program is located. It is recommended to obtain model files from the following sources: