libqllm

This library is focused on making LLM usage easy and portable. The idea is that you don't have to worry about CUDA or anything like that: you install it via your favorite package manager and then use it to run inference and generate embeddings. It is a wrapper around llama.cpp with a simple interface and the build complexity hidden, using Vulkan on Linux and Metal on macOS to keep things portable.

This project also ships a few ease-of-use tools: a service program for chat sessions (qllmd), a client program (qllm-chat), bash completion, plus qllm-list for listing your gguf models and qllm-path for getting the real path to one.
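
As a quick sketch of how the helper tools might be used together (the exact arguments are assumptions, not documented here; the model pattern mirrors the qllmd example further down):

qllm-list         # list the gguf models qllm can find
qllm-path gemma*  # print the real path of a matching model (argument form assumed)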

Installation

Check out these instructions, and use "libqllm" as the package name.

Chat usage

Follow these instructions to install huggingface-cli so you can download models to run.

Download a model, for example:

huggingface-cli download reedmayhew/Grok-3-gemma3-4B-distilled gemma-3-finetune.Q8_0.gguf
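
If qllm-list picks up models downloaded this way (an assumption, since the search locations are not documented here), you can check that the file is visible before starting the service:

qllm-list # the downloaded gguf should show up in this list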

Run:

qllmd -d -p 4242 gemma* # To start the service
qllm-chat # To talk to it
