This is an app that lets you ask questions about any data source by leveraging embeddings, vector databases, large language models and, last but not least, LangChain
- Upload any `file` or enter any `path` or `url`
- The data source is detected and loaded into text documents
- The text documents are embedded using OpenAI embeddings
- The embeddings are stored as a vector dataset in a data lake
- A LangChain chain is created, consisting of an LLM (`gpt-3.5-turbo` by default) and the embedding database index as the retriever
- When you send questions to the bot, this chain provides the context used to answer them
- Finally, the chat history is cached locally to enable a ChatGPT-like Q&A conversation
- By default, this git repository is used as the context, so you can start asking questions about its functionality right away without choosing your own data source.
- To run locally or deploy somewhere, execute `cp .env.template .env` and set the necessary keys in the newly created secrets file. Alternatively, set the environment variables manually
- Yes, Chad in `DataChad` refers to the well-known meme
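
As a rough illustration of the embed, store, and retrieve steps listed above, here is a minimal, self-contained Python sketch. It is not this app's code: `embed`, `ToyVectorStore`, and `build_prompt` are hypothetical names, a bag-of-words count stands in for OpenAI embeddings, and an in-memory list stands in for the vector dataset in the data lake. No LLM is called; the sketch only shows how retrieved context and cached chat history could be assembled into a prompt.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class ToyVectorStore:
    """In-memory stand-in (assumption) for the stored vector dataset."""

    def __init__(self):
        self.rows = []  # list of (embedding, original text) pairs

    def add(self, texts):
        for text in texts:
            self.rows.append((embed(text), text))

    def retrieve(self, query, k=2):
        """Return the k stored texts most similar to the query."""
        q = embed(query)
        ranked = sorted(self.rows, key=lambda row: cosine(q, row[0]), reverse=True)
        return [text for _, text in ranked[:k]]


def build_prompt(store, question, history, k=2):
    """Combine retrieved context, earlier chat turns, and the new question."""
    context = "\n".join(store.retrieve(question, k=k))
    past = "\n".join(f"Q: {q}\nA: {a}" for q, a in history)
    return f"Context:\n{context}\n\nHistory:\n{past}\n\nQuestion: {question}"


store = ToyVectorStore()
store.add([
    "Embeddings are stored as a vector dataset.",
    "The chat history is cached locally.",
    "The default model is gpt-3.5-turbo.",
])
history = []  # would hold (question, answer) pairs from earlier turns
prompt = build_prompt(store, "Which model is used by default?", history, k=1)
print(prompt)
```

In a real setup the prompt would be sent to the LLM, and the resulting `(question, answer)` pair would be appended to `history` so that follow-up questions keep the conversational context.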