A RAG chat bot for the FINKI Hub Discord server, powered by LangChain and FastAPI. It stores documents in PostgreSQL with pgvector and supports multiple LLMs.
It currently answers questions using a retrieval pipeline over an FAQ dataset (the question table), and separately manages a collection of links. Retrieval over additional document types is planned.
This project comes as a monorepo of microservices:
- API (/api) for managing documents, links and chatting (default port: 8880)
- GPU API (/gpu-api) for locally executing GPU accelerated tasks like embeddings generation and reranking (default port: 8888)
- Front-end (planned, not yet part of this repository)
- Database (PostgreSQL + pgvector) for keeping documents and embeddings
The API Docker image is available as ghcr.io/finki-hub/chat-bot-api, while the GPU API Docker image is available as ghcr.io/finki-hub/chat-bot-gpu-api.
It's highly recommended to run the chat bot in Docker.
To run the chat bot:
- Download compose.prod.yaml
- Download .env.sample, rename it to .env and change it to your liking
- Run docker compose -f compose.prod.yaml up -d
The API will be running on port 8880. This also brings up a pgAdmin instance, which you may use to view or create documents. It's accessible on port 5555 by default.
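Once the containers are up, one way to confirm that the API is reachable is to poll its /health endpoint. The following is a minimal sketch, assuming the API listens on localhost:8880 and that /health answers a plain GET with a 2xx status when healthy. The function names are illustrative and not part of this project.

```python
import time
import urllib.error
import urllib.request


def health_url(host: str = "localhost", port: int = 8880) -> str:
    """Build the health check URL (host and port are assumptions)."""
    return f"http://{host}:{port}/health"


def is_healthy(url: str, timeout: float = 3.0) -> bool:
    """Return True if a GET on `url` answers with a 2xx status."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return 200 <= response.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status.
        return False


def wait_until_healthy(url: str, attempts: int = 10, delay: float = 2.0) -> bool:
    """Poll the health endpoint until it responds or attempts run out."""
    for attempt in range(attempts):
        if is_healthy(url):
            return True
        if attempt < attempts - 1:
            time.sleep(delay)  # wait only between attempts
    return False
```

Calling `wait_until_healthy(health_url())` right after `docker compose ... up -d` gives the database and API a few seconds to start before anything else talks to them.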
Requires Python 3.14 (>=3.14,<3.15) and uv.
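The version constraint above can be checked before installing anything. This is a small sketch, assuming only that uv is expected to be on PATH; the helper names are hypothetical.

```python
import shutil
import sys


def python_supported(version: tuple[int, ...] = tuple(sys.version_info[:3])) -> bool:
    """Check the >=3.14,<3.15 constraint against a (major, minor, ...) tuple."""
    return (3, 14) <= version[:2] < (3, 15)


def uv_available() -> bool:
    """Return True if a `uv` executable is found on PATH."""
    return shutil.which("uv") is not None


if __name__ == "__main__":
    print("Python version OK:", python_supported())
    print("uv found:", uv_available())
```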
- Clone the repository: git clone https://github.com/finki-hub/chat-bot.git
- Install dependencies: in each directory (api and gpu-api), run uv sync
- Prepare the environment variables by copying .env.sample to .env. The minimum setup requires the database configuration, which can be left as is
- Run it: docker compose up -d. Unlike production, the dev compose builds the api and gpu-api images locally from source (it does not pull from ghcr), so the first run builds the containers. The per-directory uv sync from step 2 is for local/IDE tooling only; the containers build their own environment.
This also brings up the FastAPI Swagger UI (OpenAPI docs) at localhost:8880/docs.
This is an incomplete list. You may view all available endpoints in the OpenAPI documentation (/docs).
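The full endpoint list can also be read programmatically. The sketch below assumes the API keeps FastAPI's default schema location (/openapi.json) and runs on localhost:8880; `list_endpoints` and `fetch_schema` are illustrative helpers, not part of this project.

```python
import json
import urllib.request

HTTP_METHODS = {"get", "post", "put", "patch", "delete", "head", "options", "trace"}


def list_endpoints(schema: dict) -> list[tuple[str, str]]:
    """Return (METHOD, path) pairs from an OpenAPI schema, sorted by path."""
    pairs = []
    for path, operations in schema.get("paths", {}).items():
        for method in operations:
            if method in HTTP_METHODS:  # skip keys such as "parameters"
                pairs.append((method.upper(), path))
    return sorted(pairs, key=lambda pair: (pair[1], pair[0]))


def fetch_schema(base_url: str = "http://localhost:8880") -> dict:
    """Download the OpenAPI schema (assumes the default /openapi.json path)."""
    with urllib.request.urlopen(f"{base_url}/openapi.json", timeout=5) as response:
        return json.load(response)


if __name__ == "__main__":
    for method, path in list_endpoints(fetch_schema()):
        print(f"{method:7} {path}")
```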
API (/api):
- /questions/list - get all questions
- /questions/name/<name> - get a question by its name
- /questions/fill - generate (fill) embeddings for stored questions for a given model (streams progress via SSE)
- /links/list - get all links
- /chat - chat with the bot (streaming response); /chat/models lists available chat models
- /health - detailed health check
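Since /questions/fill reports progress as server-sent events (SSE), a client has to split the stream into events. The payload of each event is not described here, so the following sketch only parses the generic SSE wire format and makes no assumption about what the data fields contain. The function name is hypothetical.

```python
from collections.abc import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict[str, str]]:
    """Yield one dict per server-sent event from decoded text lines."""
    event: dict[str, str] = {}
    data: list[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if not line:
            # A blank line terminates the current event.
            if data:
                event["data"] = "\n".join(data)
                yield event
            event, data = {}, []
            continue
        if line.startswith(":"):
            continue  # comment or keep-alive line
        field, _, value = line.partition(":")
        value = value.removeprefix(" ")  # one optional space after the colon
        if field == "data":
            data.append(value)  # multiple data lines are joined with "\n"
        else:
            event[field] = value
    # An unterminated trailing event is dropped, as the SSE format prescribes.
```

It can be fed decoded lines from any streaming HTTP response, for example `(line.decode("utf-8") for line in response)` when reading with `urllib.request.urlopen`.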
GPU API (/gpu-api):
- /embeddings/embed - generate embedding vectors for given input text(s)
- /rerank - re-rank documents by relevance to a query
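As a rough illustration of calling the GPU API, the sketch below builds JSON requests for both endpoints without sending them. It assumes the service runs on localhost:8888 and accepts JSON over POST. The payload field names (`input`, `query`, `documents`) are placeholders chosen for illustration, not the real schema, which should be checked in the OpenAPI documentation.

```python
import json
import urllib.request


def json_post(url: str, payload: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request."""
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


GPU_API = "http://localhost:8888"  # default port from this README; host is an assumption

# HYPOTHETICAL payloads: the field names are placeholders, not the real schema.
embed_request = json_post(
    f"{GPU_API}/embeddings/embed",
    {"input": ["What is FINKI Hub?"]},
)
rerank_request = json_post(
    f"{GPU_API}/rerank",
    {"query": "enrollment deadline", "documents": ["doc A", "doc B"]},
)
```

Either request can then be sent with `urllib.request.urlopen(embed_request)` once the payload matches the actual schema.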
This project is licensed under the terms of the MIT license.