This repo is an implementation of a locally hosted chatbot specifically focused on question answering over the LangChain documentation. Built with LangChain and Next.js.
Deployed version: [chatjs.langchain.com](https://chatjs.langchain.com)

Looking for the Python version? See [chat-langchain](https://github.com/langchain-ai/chat-langchain).
The app leverages LangChain's streaming API to update the page in real time for multiple users.
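As a rough illustration of that pattern, here is a minimal streaming sketch in LangChain.js. The model choice and prompt are placeholder assumptions, not the app's actual chain:

```typescript
import { ChatOpenAI } from "@langchain/openai";

// Minimal sketch of token streaming: chunks arrive incrementally,
// so the frontend can append them to the page as they are generated.
const model = new ChatOpenAI({ modelName: "gpt-3.5-turbo" });

const stream = await model.stream("Tell me about LangChain.");
for await (const chunk of stream) {
  // Each chunk is a partial AIMessageChunk; print its text as it lands.
  process.stdout.write(String(chunk.content));
}
```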
- Install dependencies via `yarn install`.
- Set the required environment variables listed inside `backend/.env.example` for the backend, and `frontend/.env.example` for the frontend.
- Build the backend via `yarn build --filter=backend` (from the repo root).
- Run the ingestion script by navigating into `./backend` and running `yarn ingest`.
- Navigate into `./frontend` and run `yarn dev` to start the frontend.
- Open [localhost:3000](http://localhost:3000) in your browser.
There are two components: ingestion and question-answering.
Ingestion has the following steps:
- Pull HTML from the documentation site as well as the GitHub codebase
- Load the HTML with LangChain's RecursiveUrlLoader and SitemapLoader
- Split documents with LangChain's RecursiveCharacterTextSplitter
- Create a vectorstore of embeddings using LangChain's Weaviate vectorstore wrapper (with OpenAI's embeddings); see the sketch after this list.
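A condensed sketch of what such an ingestion pipeline can look like in LangChain.js. The URL, depth, chunk sizes, index name, and environment variable names below are illustrative assumptions, not the repo's exact values:

```typescript
import weaviate, { ApiKey } from "weaviate-ts-client";
import { RecursiveUrlLoader } from "@langchain/community/document_loaders/web/recursive_url";
import { RecursiveCharacterTextSplitter } from "@langchain/textsplitters";
import { WeaviateStore } from "@langchain/weaviate";
import { OpenAIEmbeddings } from "@langchain/openai";

// 1. Pull HTML from the docs site. URL and depth are illustrative;
//    a real pipeline would also pass an `extractor` to turn HTML into
//    plain text, and use SitemapLoader for sitemap-listed pages.
const loader = new RecursiveUrlLoader("https://js.langchain.com/docs/", {
  maxDepth: 3,
});
const rawDocs = await loader.load();

// 2. Split documents into overlapping chunks for retrieval.
const splitter = new RecursiveCharacterTextSplitter({
  chunkSize: 1000,
  chunkOverlap: 200,
});
const docs = await splitter.splitDocuments(rawDocs);

// 3. Embed the chunks with OpenAI and store them in Weaviate
//    (connection details and index name are placeholders).
const client = weaviate.client({
  scheme: "https",
  host: process.env.WEAVIATE_HOST!,
  apiKey: new ApiKey(process.env.WEAVIATE_API_KEY!),
});
await WeaviateStore.fromDocuments(docs, new OpenAIEmbeddings(), {
  client,
  indexName: "LangChainDocs",
  textKey: "text",
});
```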
Question-Answering has the following steps:
- Given the chat history and new user input, determine what a standalone question would be using GPT-3.5.
- Given that standalone question, look up relevant documents from the vectorstore.
- Pass the standalone question and relevant documents to the model to generate and stream the final answer.
- Generate a trace URL for the current chat session, as well as the endpoint to collect feedback; a sketch of this flow follows the list.
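A minimal sketch of the condense-then-answer flow, assuming the Weaviate index built during ingestion. The prompts, index name, and retrieved-document count are illustrative, and the trace-URL/feedback step is omitted:

```typescript
import weaviate, { ApiKey } from "weaviate-ts-client";
import { ChatOpenAI, OpenAIEmbeddings } from "@langchain/openai";
import { WeaviateStore } from "@langchain/weaviate";
import { ChatPromptTemplate } from "@langchain/core/prompts";
import { StringOutputParser } from "@langchain/core/output_parsers";

const model = new ChatOpenAI({ modelName: "gpt-3.5-turbo" });

// 1. Condense chat history + follow-up into a standalone question.
const condenseChain = ChatPromptTemplate.fromTemplate(
  `Given the conversation below, rephrase the follow-up into a standalone question.

Chat history:
{chatHistory}

Follow-up question: {question}
Standalone question:`
).pipe(model).pipe(new StringOutputParser());

const standalone = await condenseChain.invoke({
  chatHistory: "Human: What is LangChain?\nAI: A framework for building LLM apps.",
  question: "Does it support streaming?",
});

// 2. Look up relevant documents in the Weaviate index built at ingestion
//    (connection details and index name are placeholders).
const client = weaviate.client({
  scheme: "https",
  host: process.env.WEAVIATE_HOST!,
  apiKey: new ApiKey(process.env.WEAVIATE_API_KEY!),
});
const store = await WeaviateStore.fromExistingIndex(new OpenAIEmbeddings(), {
  client,
  indexName: "LangChainDocs",
  textKey: "text",
});
const docs = await store.asRetriever(6).getRelevantDocuments(standalone);
const context = docs.map((d) => d.pageContent).join("\n\n");

// 3. Generate and stream the final answer grounded in the retrieved context.
const answerChain = ChatPromptTemplate.fromTemplate(
  "Answer the question using only this context:\n{context}\n\nQuestion: {question}"
).pipe(model).pipe(new StringOutputParser());

for await (const chunk of await answerChain.stream({ context, question: standalone })) {
  process.stdout.write(chunk); // tokens reach the UI as they are produced
}
```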
Deploy the frontend Next.js app as a serverless Edge function on Vercel.
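With the Next.js App Router, opting a route into the Edge runtime is a one-line export. A sketch, with a hypothetical route path and handler body:

```typescript
// app/api/chat/route.ts (hypothetical path)
export const runtime = "edge"; // run this route as a Vercel Edge function

export async function POST(req: Request): Promise<Response> {
  const { question } = await req.json();
  // A real handler would invoke the chain and stream tokens back.
  return new Response(`You asked: ${question}`);
}
```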