libqllm

This library is focused on making LLM usage easy and portable. The idea is that you don't have to worry about CUDA or anything like that: you install it via your favorite package manager and then use it to run inference and generate embeddings. It is a wrapper around llama.cpp with a simple interface and the build complexity hidden, using Vulkan on Linux and Metal on macOS to keep things portable.

This project also ships a few ease-of-use tools: a service program for chat sessions (qllmd), a client program (qllm-chat), bash completion, plus qllm-list for listing your gguf models and qllm-path for getting the real path to one.
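
As a quick sketch of how the helper tools might be used together (the exact arguments are assumptions, not documented here; the model pattern mirrors the qllmd example further down):

qllm-list         # list the gguf models qllm can find
qllm-path gemma*  # print the real path of a matching model (argument form assumed)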

Installation

Check out these instructions, and use "libqllm" as the package name.

Chat usage

Follow these instructions to install huggingface-cli so you can download models to run.

Download a model, for example:

huggingface-cli download reedmayhew/Grok-3-gemma3-4B-distilled gemma-3-finetune.Q8_0.gguf
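
If qllm-list picks up models downloaded this way (an assumption, since the search locations are not documented here), you can check that the file is visible before starting the service:

qllm-list # the downloaded gguf should show up in this list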

Run:

qllmd -d -p 4242 gemma* # To start the service
qllm-chat # To talk to it
