A RAG-based AI Legal Assistant designed to provide tailored legal information from a private knowledge base. It features an adaptive user interface, a sophisticated retrieval-scoring-rewriting loop for accuracy, dynamic question suggestions, and a self-improving FAQ system.
- Interactive Chat Interface: A user-friendly chat application built with Streamlit.
- Adaptive AI Persona: The assistant adjusts its communication style and response depth for Legal Professionals, Law Students, and the General Public.
- Document Management Dashboard: An interface to upload PDF documents, which are then processed, chunked, and stored in a vector knowledge base (see the ingestion sketch after this feature list).
- Agentic RAG: The system uses a LangGraph-powered agent that:
  - Retrieves relevant document chunks from a Pinecone vector store.
  - Scores the relevance of the retrieved context against the user's query.
  - Rewrites the query and re-retrieves if the initial results are not relevant enough.
- Dynamic "Related Questions": After each response, the assistant suggests similar questions from a vector database, displayed in the sidebar to guide user exploration.
- Conversation-Driven Content Generation: When a user ends a chat, a unified background process is triggered to:
- Generate FAQs: Analyzes the conversation to create and store detailed Q&A pairs in a local SQLite database.
- Generate Suggested Questions: Creates new, concise questions and adds them to Pinecone to improve future suggestions.
- FAQ Page: A dedicated page to browse all generated FAQs, categorized for easy access.
- Dual Database System:
- Pinecone: For the primary knowledge base and for storing suggested questions.
- SQLite: For persisting structured FAQs.
- Modular & Extensible: The codebase is organized into distinct modules for configuration, application logic, and core AI components.
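To make the upload path concrete, here is a minimal, hypothetical sketch of a PDF ingestion flow using LangChain components. The loader choice, chunk sizes, and the `ingest_pdf` function name are illustrative assumptions, not the actual contents of `src/document_processor.py`:

```python
# Hypothetical sketch of the upload -> chunk -> store flow; the real logic
# lives in src/document_processor.py and src/vector_store.py.
from langchain_community.document_loaders import PyPDFLoader
from langchain_openai import OpenAIEmbeddings
from langchain_pinecone import PineconeVectorStore
from langchain_text_splitters import RecursiveCharacterTextSplitter

def ingest_pdf(path: str, index_name: str) -> None:
    # Load the PDF into one Document per page.
    pages = PyPDFLoader(path).load()

    # Split pages into overlapping chunks (sizes are illustrative).
    splitter = RecursiveCharacterTextSplitter(chunk_size=1000, chunk_overlap=200)
    chunks = splitter.split_documents(pages)

    # Embed the chunks and upsert them into the Pinecone knowledge base.
    PineconeVectorStore.from_documents(
        chunks, OpenAIEmbeddings(), index_name=index_name
    )
```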
- AI Frameworks: LangChain, LangGraph
- LLM Provider: OpenAI
- Vector Database: Pinecone
- Structured Database: SQLite
- Web Framework: Streamlit
- Dependencies: See `pyproject.toml` for the full list.
```
└── legal_assistant/
    ├── apps/                     # Streamlit applications
    │   ├── assistant.py          # Main chat interface
    │   └── dashboard.py          # Document management dashboard
    ├── config/                   # Configuration files
    │   ├── database.py           # Manages SQLite FAQ database
    │   └── settings.py           # Project settings and API keys
    ├── src/                      # Core source code for the RAG pipeline
    │   ├── document_processor.py # Handles PDF loading and chunking
    │   ├── faq_generator.py      # Logic for generating FAQs from conversations
    │   ├── graph.py              # LangGraph agent definition
    │   ├── nodes.py              # Agent nodes (assistant, RAG loop)
    │   ├── prompts.py            # All system and task prompts
    │   ├── question_generator.py # Logic for generating suggested questions
    │   ├── tools.py              # Custom tools for the agent (e.g., knowledge base search)
    │   └── vector_store.py       # Manages interaction with Pinecone
    ├── generate_content_task.py  # Unified background script for all content generation
    ├── pyproject.toml            # Project metadata and dependencies
    ├── README.md                 # You are here!
    └── .python-version           # Specifies Python version (3.11)
```
Follow these instructions to set up and run the project locally.
- Python 3.11
- An active OpenAI API key.
- An active Pinecone API key.
- Set up a virtual environment:

  ```sh
  python -m venv venv
  source venv/bin/activate  # On Windows, use `venv\Scripts\activate`
  ```
- Install the dependencies:

  ```sh
  pip install -r requirements.txt
  ```
- Configure Environment Variables: Create a `.env` file in the project root directory:

  ```sh
  touch .env
  ```

  Add your API keys and Pinecone index name to the `.env` file:

  ```sh
  OPENAI_API_KEY="sk-..."
  PINECONE_API_KEY="..."
  PINECONE_INDEX_NAME="your-pinecone-index-name"
  ```
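For reference, here is a minimal sketch of how `config/settings.py` might expose these values, assuming the common `python-dotenv` pattern; the project's actual settings module may be structured differently:

```python
# Hypothetical sketch of config/settings.py; check the real module for the
# project's actual settings object.
import os

from dotenv import load_dotenv

load_dotenv()  # reads the .env file from the project root

OPENAI_API_KEY = os.environ["OPENAI_API_KEY"]
PINECONE_API_KEY = os.environ["PINECONE_API_KEY"]
PINECONE_INDEX_NAME = os.environ["PINECONE_INDEX_NAME"]
```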
You can run two separate Streamlit applications. It's recommended to run them in separate terminal tabs.
1. Run the Document Management Dashboard:

   ```sh
   python -m streamlit run apps/dashboard.py
   ```

   Navigate to the URL provided by Streamlit (usually http://localhost:8501) to upload your documents. This step is crucial for populating the knowledge base.
2. Run the Legal Assistant Chat App:

   ```sh
   python -m streamlit run apps/assistant.py
   ```

   Navigate to the URL provided (usually http://localhost:8502 if the dashboard is still running) to interact with the assistant.
The chat flow is managed by a LangGraph agent defined in `src/graph.py`.
- Initial Call: The user's query is sent to the `assistant_node`. The model, armed with the `search_knowledge_base` tool, determines that it needs to retrieve information and makes a tool call.
- RAG Loop (`rag_node`):
  - Retrieve: The `search_knowledge_base` tool (`src/tools.py`) is invoked, performing a similarity search in Pinecone.
  - Score: The retrieved documents are scored for relevance against the user's query using a dedicated model and prompt.
  - Rewrite (if needed): If the score is below `RELEVANCE_THRESHOLD`, the system uses another model to rewrite the query for better results, then re-runs the retrieval. This loop can run up to `MAX_RETRIEVAL_ATTEMPTS` times.
- Generation: Once relevant context is retrieved, it's passed back to the `assistant_node`. The main model then generates a final, comprehensive answer based on this context, tailored to the selected user type.
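The sketch below shows how such a loop can be wired with LangGraph's `StateGraph`. It is a rough approximation rather than the project's actual code: the state fields, the stubbed node bodies, and the threshold/attempt values are all assumptions.

```python
# Illustrative sketch of the assistant/RAG graph; the real definition lives
# in src/graph.py and src/nodes.py. Node bodies are stubs.
from typing import TypedDict

from langgraph.graph import END, START, StateGraph

RELEVANCE_THRESHOLD = 0.7   # assumed value; see the project's settings
MAX_RETRIEVAL_ATTEMPTS = 3  # assumed value; see the project's settings

class AgentState(TypedDict):
    query: str
    context: str
    relevance: float
    attempts: int
    answer: str

def assistant_node(state: AgentState) -> dict:
    # Real node calls the LLM with the search_knowledge_base tool bound;
    # stubbed here to produce an answer once context has been gathered.
    if state["context"]:
        return {"answer": f"Answer grounded in: {state['context']}"}
    return {}

def rag_node(state: AgentState) -> dict:
    # Real node retrieves from Pinecone, scores the chunks, and rewrites the
    # query when the score is low; stubbed with fixed values here.
    return {
        "context": "retrieved chunks...",
        "relevance": 0.9,
        "attempts": state["attempts"] + 1,
    }

def route_assistant(state: AgentState) -> str:
    # The real graph branches on whether the model emitted a tool call;
    # branching on missing context is a simplification.
    return "rag" if not state["context"] else "end"

def route_rag(state: AgentState) -> str:
    # Loop back into retrieval until the context is relevant enough or the
    # attempt budget is exhausted, then return control to the assistant.
    if state["relevance"] >= RELEVANCE_THRESHOLD:
        return "assistant"
    if state["attempts"] >= MAX_RETRIEVAL_ATTEMPTS:
        return "assistant"
    return "rag"

builder = StateGraph(AgentState)
builder.add_node("assistant", assistant_node)
builder.add_node("rag", rag_node)
builder.add_edge(START, "assistant")
builder.add_conditional_edges("assistant", route_assistant, {"rag": "rag", "end": END})
builder.add_conditional_edges("rag", route_rag, {"assistant": "assistant", "rag": "rag"})
graph = builder.compile()
```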
The assistant improves over time by learning from user conversations.
- Trigger: When a user clicks "End Chat", `assistant.py` saves the conversation history to a temporary file and launches the `generate_content_task.py` script in a non-blocking background process.
- Dual Generation: This unified script performs two tasks:
  - Suggested Questions: It uses the `QuestionGenerator` to create a list of concise, related questions. These are vectorized and stored in a dedicated `faq-questions` namespace in Pinecone, making them available for the "Related Questions" feature in the sidebar.
  - FAQ Generation: It uses the `FAQGenerator` to create detailed Question/Answer pairs based on the conversation. These are stored in a local SQLite database and can be viewed on the "Frequently Asked Questions" page.
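A minimal sketch of that non-blocking hand-off is shown below, assuming the conversation is serialized to JSON; the file format and the arguments passed to `generate_content_task.py` are assumptions:

```python
# Hypothetical sketch of the "End Chat" trigger in apps/assistant.py; the
# real code may serialize the history and invoke the script differently.
import json
import subprocess
import sys
import tempfile

def launch_content_generation(messages: list[dict]) -> None:
    # Persist the conversation so the background script can read it.
    with tempfile.NamedTemporaryFile(
        mode="w", suffix=".json", delete=False
    ) as f:
        json.dump(messages, f)
        history_path = f.name

    # Fire-and-forget: Popen returns immediately, so the Streamlit app stays
    # responsive while FAQs and suggested questions are generated.
    subprocess.Popen([sys.executable, "generate_content_task.py", history_path])
```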