Overview

A simple collection of LLM snippets and utilities.

How to Use

Download or clone the repo: git clone git@github.com:pavdwest/llm_docsearch.git
Change into the repo directory: cd llm_docsearch
Create a virtual env: python -m venv .venv
Activate the venv: source .venv/bin/activate
Install requirements: pip install -r requirements.txt
Copy .env.example to .env and fill in your OpenAI key
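
Before running anything, it can help to confirm that the .env file actually contains a key. The snippet below is a minimal sketch of such a check, not part of the repo: the helper names are hypothetical, and the variable name OPENAI_API_KEY is an assumption (check .env.example for the name the project really uses).

```python
from pathlib import Path


def read_env(path):
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    env_file = Path(path)
    if not env_file.is_file():
        return {}
    values = {}
    for raw in env_file.read_text().splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def has_key(path, name="OPENAI_API_KEY"):  # variable name is an assumption
    """Return True if the file defines a non-empty value for `name`."""
    return bool(read_env(path).get(name))
```

Calling has_key(".env") from the repo directory returns False if the file is missing or the value is still empty.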

Running the Example

The docs folder already contains some example data saved as PDFs and raw text, attributed to the following sources:

Run 'training'

python ./train.py

It should output something like the following:

Delete existing db...
Loading 4 documents...
Creating vector db...
Done!

You can rerun the training at any time. Each run deletes the existing db and reloads only the files currently in the docs dir.
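
The delete-then-reload behaviour can be sketched as follows. This is an illustration only and not the code in train.py: the function name rebuild_db is hypothetical, and the embedding step is replaced by a placeholder that just records file names, since the real vector store is not described here.

```python
import shutil
from pathlib import Path


def rebuild_db(docs_dir, db_dir):
    """Delete db_dir, then index every file currently in docs_dir."""
    docs, db = Path(docs_dir), Path(db_dir)

    print("Delete existing db...")
    shutil.rmtree(db, ignore_errors=True)  # a missing db is not an error
    db.mkdir(parents=True)

    files = sorted(p for p in docs.iterdir() if p.is_file())
    print(f"Loading {len(files)} documents...")

    print("Creating vector db...")
    # Placeholder for the real embedding step: record file names only.
    (db / "index.txt").write_text("\n".join(p.name for p in files))

    print("Done!")
    return [p.name for p in files]
```

Because the db is rebuilt from scratch each time, a file removed from docs no longer appears after the next run.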

Run the query

python ./run_query.py

It should output something like the following:

Running query: 'When was the Cologne Cathedral built and how tall is it?'
Response: ' The Cologne Cathedral was built in 1248 and is 157 metres tall. This information can be found in the Riviera Travel Blog and Touropia sources.'

Running with your own data

(Optional) Back Up the Example Data

It might be worth making a backup of the example docs if you'd like to use them again in the future.

Delete the Example Data

Delete everything in the docs directory.
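
If you prefer to script the backup and delete steps, something like the following would do. It is a sketch using only the standard library; the function names and the docs_backup folder name are hypothetical, not part of the repo.

```python
import shutil
from pathlib import Path


def backup_docs(docs_dir="docs", backup_dir="docs_backup"):
    """Copy docs_dir to backup_dir, refusing to overwrite an existing backup."""
    src, dst = Path(docs_dir), Path(backup_dir)
    if dst.exists():
        raise FileExistsError(f"{dst} already exists")
    shutil.copytree(src, dst)
    return dst


def clear_docs(docs_dir="docs"):
    """Remove every entry inside docs_dir but keep the folder itself."""
    removed = 0
    for entry in Path(docs_dir).iterdir():
        if entry.is_dir() and not entry.is_symlink():
            shutil.rmtree(entry)
        else:
            entry.unlink()
        removed += 1
    return removed
```

Run backup_docs() first and clear_docs() second, so the examples can be copied back later.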

Load your own data

Copy all of your source documents into the docs folder. Explicitly supported file types are *.html and *.pdf. Other file types are loaded as plain text on a best-effort basis, so results may vary.
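
To preview how your files would be treated under that rule, the extension check can be sketched like this. The names below are hypothetical and the real loader selection in train.py may differ; matching extensions case-insensitively is also an assumption.

```python
from pathlib import Path

# Only these two extensions are named as explicitly supported.
EXPLICIT_TYPES = {".html": "html", ".pdf": "pdf"}


def loader_kind(path):
    """Return 'html', 'pdf', or the 'text' fallback for a file path."""
    # Lower-casing the suffix is an assumption, not documented behaviour.
    return EXPLICIT_TYPES.get(Path(path).suffix.lower(), "text")


def group_by_loader(docs_dir):
    """Group the file names in docs_dir by the loader kind they would get."""
    groups = {"html": [], "pdf": [], "text": []}
    for p in sorted(Path(docs_dir).iterdir()):
        if p.is_file():
            groups[loader_kind(p)].append(p.name)
    return groups
```

Anything that lands in the "text" group, such as a .docx file, is worth checking by hand, since reading a binary format as text is unlikely to give useful results.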

Run 'training'

Rebuild the db from your own documents:

python ./train.py

Run a query

Pass your query as plain text on the command line:

python ./run_query.py Find all the details about 'SomeTopic' in my documents
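
Note that the shell splits the command above into separate words and strips the single quotes before Python sees them. The sketch below shows what a script would receive, assuming it simply joins its arguments back together; build_query is a hypothetical helper and run_query.py may parse its arguments differently.

```python
import shlex


def build_query(argv):
    """Join everything after the script name into one query string."""
    return " ".join(argv[1:])


# shlex.split mimics POSIX shell word splitting and quote removal.
argv = ["./run_query.py"] + shlex.split(
    "Find all the details about 'SomeTopic' in my documents"
)
print(build_query(argv))
# prints: Find all the details about SomeTopic in my documents
```

If the quotes or other special characters matter to your query, wrap the whole query in double quotes so the shell passes it through as a single argument.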
