A basic implementation of a locally-running LLM with RAG.
The code is copyright (c) 2024 AlertAvert.com. All rights reserved.
The code is released under the Apache 2.0 License, see LICENSE for details.
- PDF Ingestion and Chunking: Use `unstructured.io` to extract text and split it into manageable chunks (a sketch of this and the embedding step follows this list).
- Embedding Generation: Use a suitable embedding model (from `sentence-transformers`).
- Vector Store (Qdrant): Store embeddings and associated metadata in a local Qdrant instance (running via Docker).
- Query Execution: Retrieve relevant chunks from Qdrant based on user input.
- LLM Querying (Ollama): Use Ollama to generate context-aware answers using retrieved chunks.
- Frontend (Streamlit): Provide a user interface for querying the knowledge base and displaying results.
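As an illustration of the ingestion and embedding steps, here is a minimal sketch assuming `unstructured`'s PDF partitioning and a `sentence-transformers` model; the file path and the `all-MiniLM-L6-v2` model name are placeholders rather than the exact choices made in this repo:

```python
from unstructured.partition.pdf import partition_pdf
from unstructured.chunking.title import chunk_by_title
from sentence_transformers import SentenceTransformer

# Extract the PDF into elements, then group them into coherent chunks.
elements = partition_pdf(filename="docs/sample.pdf")  # placeholder path
chunks = chunk_by_title(elements)
texts = [chunk.text for chunk in chunks]

# Encode each chunk into a dense vector (all-MiniLM-L6-v2 yields 384-dim embeddings).
encoder = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = encoder.encode(texts)

print(f"{len(texts)} chunks, embedding dimension {embeddings.shape[1]}")
```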
We run a local copy of Meta's Llama 3.2 3B model using `ollama`:

```shell
ollama run llama3.2:latest
```

However, even if the LLM is not running (use `ollama ps` to confirm), it will be started automatically when calling the `ollama.generate()` function.
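For reference, the corresponding call through the `ollama` Python client looks roughly like this; the prompt shown is only illustrative, and the actual prompt template lives in the app code:

```python
import ollama

# If the model is not already loaded, Ollama starts it on demand.
response = ollama.generate(
    model="llama3.2",
    prompt=(
        "Using only the context below, answer the question.\n\n"
        "Context: <retrieved chunks>\n\nQuestion: <user question>"
    ),
)
print(response["response"])
```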
See also this course on YouTube for more details.
Update
Given all the furore around the DeepSeek R1 model, I have also added support for it, using the Ollama model name `deepseek-r1`.
Simply change the value of `LLM_MODEL` in `constants/__init__.py` to `deepseek-r1`:

```python
LLM_MODEL: str = "deepseek-r1"
```

This will use the 7B model, but you can also use `deepseek-r1:14b` if your hardware supports it.
Qdrant is a vector database designed for real-time, high-performance similarity search. It offers a simple API and runs efficiently on local machines.

Key Features:
- Lightweight and fast.
- REST API with gRPC support.
- Compatible with embeddings from most AI frameworks.

Installation: use Docker, or pip for the Python bindings.

```shell
docker run -p 6333:6333 qdrant/qdrant
```

Why Qdrant?
- Ease of Use: simple setup with REST and Python APIs.
- Local Storage: supports persistent storage, so you don't need to worry about data loss.
- Low Complexity: lightweight, no complex dependencies.
- Incremental Updates: adding new embeddings (from additional PDFs) is straightforward.
- Performance: optimized for local environments and fast similarity searches.

Recommended Setup:
- Run Qdrant locally via Docker or as a standalone binary.
- Use Qdrant's Python client for embedding storage and querying (sketched below).
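Here is a rough, self-contained sketch of how the Python client (`qdrant-client`) can be used for storage and querying; the collection name, vector size, payload fields, and example texts are illustrative assumptions, not necessarily the choices made in this repo:

```python
from qdrant_client import QdrantClient
from qdrant_client.models import Distance, PointStruct, VectorParams
from sentence_transformers import SentenceTransformer

encoder = SentenceTransformer("all-MiniLM-L6-v2")  # placeholder model, 384-dim vectors
texts = ["Example chunk one.", "Example chunk two."]  # chunks from the ingestion step
vectors = encoder.encode(texts)

client = QdrantClient(url="http://localhost:6333")

# Create (or re-create) a collection sized to the embedding dimension.
client.recreate_collection(
    collection_name="knowledge_base",
    vectors_config=VectorParams(size=vectors.shape[1], distance=Distance.COSINE),
)

# Store one point per chunk, keeping the original text as payload metadata.
client.upsert(
    collection_name="knowledge_base",
    points=[
        PointStruct(id=i, vector=vec.tolist(), payload={"text": text})
        for i, (vec, text) in enumerate(zip(vectors, texts))
    ],
)

# Retrieve the chunks most similar to a user query.
hits = client.search(
    collection_name="knowledge_base",
    query_vector=encoder.encode("example query").tolist(),
    limit=3,
)
for hit in hits:
    print(hit.score, hit.payload["text"])
```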
```shell
docker run -d -p 6333:6333 qdrant/qdrant
```

Created the `bertie` virtualenv (see `requirements.txt`).
As with every Python project, the recommended way is to use `virtualenvwrapper` to create a new venv, then install the dependencies there:

```shell
mkvirtualenv -p $(which python3) bertie
pip install -r requirements.txt
```

This will also install, and put on the `PATH`, the necessary scripts (`ollama` and `streamlit`).
The easiest way to run the app is to use docker-compose:
```shell
docker compose up -d
```

Until integrated with compose, the Streamlit app needs to be run manually¹:
```shell
streamlit run app.py [debug]
```

Adding the `debug` flag will generate DEBUG logs.
This should open a browser window showing the UI, at http://localhost:8501.
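For context, a Streamlit front-end for this kind of flow can be as small as the sketch below; this is illustrative only, and the actual `app.py` in this repo will differ:

```python
import streamlit as st

def answer_question(question: str) -> str:
    # Placeholder for the RAG pipeline: embed the question, retrieve chunks
    # from Qdrant, then generate an answer via ollama.generate() (see above).
    return f"(answer for: {question})"

st.title("Knowledge Base")
question = st.text_input("Ask a question about the ingested documents")
if question:
    st.write(answer_question(question))
```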
¹ See Issue #2