Product Requirements Document (PRD)
Project Title:
QueryTalk – Natural Language to SQL Conversational Interface
1. Objective
To build an AI-powered interface that allows users to interact with databases using natural
language. The system translates user questions into SQL queries, executes them against the
connected database, and converts the results back into a natural language response.
2. Problem Statement
Non-technical users often struggle to access or query structured data, because SQL
knowledge is required to extract meaningful insights. This system aims to bridge that gap by
enabling natural language interactions with databases.
3. Goals & Success Metrics
Goals:
Convert natural language questions to accurate SQL queries.
Execute the generated SQL queries on a connected database.
Return results as human-readable, context-aware natural language responses.
Success Metrics:
≥ 90% accuracy in SQL generation.
Response time < 3 seconds.
4. Key Features
User Input Module
Accepts typed natural language questions.
NL to SQL Converter
Uses an NLP model (e.g., OpenAI Codex, Google T5, or a fine-tuned LLM) to generate
SQL from input.
Ensures schema-aware query generation.
Query Executor
Executes generated SQL against a connected database (e.g., PostgreSQL, MySQL).
Handles errors (e.g., malformed queries, missing tables).
SQL Result to Natural Language Converter
Converts raw tabular results into coherent summaries using templating or LLMs.
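The four modules above compose a simple end-to-end pipeline. A minimal sketch (function and table names are illustrative, a canned rule stands in for the LLM, and stdlib sqlite3 stands in for the production database):

```python
import sqlite3

def nl_to_sql(question: str) -> str:
    """Placeholder NL-to-SQL step: in the real system an LLM generates
    the query; here a single canned rule stands in."""
    if "total sales" in question.lower():
        return "SELECT SUM(amount) FROM sales"
    raise ValueError("unsupported question")

def execute(conn: sqlite3.Connection, sql: str):
    """Query Executor: run the generated SQL and fetch all rows."""
    return conn.execute(sql).fetchall()

def to_natural_language(rows) -> str:
    """Result-to-NL step: templated summary of the result set."""
    return f"Total sales is ${rows[0][0]}."

# Demo with an in-memory table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?)", [(100,), (250,)])
answer = to_natural_language(execute(conn, nl_to_sql("What are the total sales?")))
print(answer)  # Total sales is $350.0.
```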
5. Architecture Overview
User Input Module → NL to SQL Converter → Query Executor → SQL Result to NL
Converter → response returned to the user.
6. Technical Requirements
Languages & Frameworks: Python (FastAPI/Flask), Streamlit (frontend), SQLAlchemy
Model APIs: OpenAI/GPT, Hugging Face Transformers, LangChain
Database Support: PostgreSQL, MySQL.
7. User Roles
End User: Queries the database using natural language.
Admin: Manages database connections, schema mappings, and logs.
8. Milestones & Timeline
Milestone 1: Requirement Analysis (Time estimate: 2 days)
Objective: Thoroughly analyze the product requirements and gather information.
Steps involved:
1. Gather information on the database schema, typical user queries, and any existing
systems that might interact with the product.
2. Confirm the essential features for the MVP.
3. Prioritize which features should be implemented first.
4. Identify the best tools and technologies for the subsequent steps.
5. Determine integration points with external services (e.g., LLM APIs).
6. Identify potential risks such as schema changes, security concerns (SQL injection),
and inaccurate translations.
Outcomes:
1. A clear project vision and actionable, agreed-upon tasks.
2. A roadmap to avoid potential obstacles in development.
3. Tech stack decisions finalized.
Milestone 2: MVP Design & Schema Mapping (Time estimate: 1 week)
Objective: Design the architecture for the system and map the database schema to the
natural language queries that users will input. This is a crucial step because it aligns the
natural language understanding with the underlying data structure.
Steps involved:
1. Define system architecture: design the high-level architecture for the NL to SQL
module, query execution, and frontend interface; select appropriate backend
frameworks (e.g., FastAPI, Flask).
2. Schema mapping: create mappings between natural language queries and database
tables/columns; use predefined keywords (e.g., "total sales" → SUM(amount),
"sales" → sales table); establish rules for common queries (date filters,
aggregations, etc.).
3. Design database connection logic: identify and establish the database connection;
define ORM (Object-Relational Mapping) models if necessary (e.g., using
SQLAlchemy in Python).
4. Define user input interface: design a simple interface (web or CLI) where users can
type their queries; outline how the system will process these queries.
Outcomes:
1. Finalized system architecture diagrams.
2. Schema mappings and keyword ruleset for the NL2SQL module.
3. ORM models for the connected database.
4. MVP architecture design complete.
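The keyword ruleset described in step 2 could be sketched as a simple lookup table (the trigger phrases, table, and columns below are illustrative for a hypothetical sales schema):

```python
# Illustrative keyword → SQL-fragment mapping.
KEYWORD_RULES = {
    "total sales": "SUM(amount)",
    "average sales": "AVG(amount)",
    "sales": "sales",  # table name
    "last month": "date >= date('now', '-1 month')",  # SQLite-style date filter
}

def map_keywords(question: str) -> list[str]:
    """Return SQL fragments whose trigger phrase appears in the question,
    longest phrase first so 'total sales' wins over plain 'sales'."""
    q = question.lower()
    hits = []
    for phrase in sorted(KEYWORD_RULES, key=len, reverse=True):
        if phrase in q:
            hits.append(KEYWORD_RULES[phrase])
            q = q.replace(phrase, "")  # avoid double-matching substrings
    return hits

print(map_keywords("What were the total sales last month?"))
# ['SUM(amount)', "date >= date('now', '-1 month')"]
```

A rule table like this also serves as the regex/template fallback for milestone 3 when the LLM output fails validation.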
Milestone 3: NL to SQL Module Development (Time estimate: 2 days)
Objective: Build and fine-tune the core module that accurately translates natural language
into SQL queries, using schema-aware techniques.
Steps involved:
1. Select or fine-tune an LLM (e.g., Codex, T5, or other Hugging Face models).
2. Provide schema context to the model.
3. Implement fallback rules or regex-based templates for common queries.
4. Validate generated SQL syntax before execution.
5. Test accuracy across various query types (aggregations, filters, joins, etc.).
Outcomes:
1. A functional NL-to-SQL model integrated with schema context.
2. Query validation logic to catch errors early.
3. Evaluation metrics (accuracy, coverage, etc.) established.
Milestone 4: Query Execution & Error Handling (Time estimate: 2 days)
Objective: Safely execute generated SQL on the connected database and handle runtime
and query-execution errors gracefully.
Steps involved:
1. Establish a secure DB connection with SQLAlchemy.
2. Use parameterized queries to prevent SQL injection.
3. Implement error-catching logic (invalid SQL, timeouts, missing tables).
4. Log queries and results for future analytics.
Outcomes:
1. A stable query execution pipeline with retry/error handling.
2. Enhanced security via parameterized statements.
3. Logs for auditability and future improvements.
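Steps 2–4 can be sketched as a single execution wrapper. The milestone names SQLAlchemy; this sketch uses stdlib sqlite3 to stay self-contained, but the same principle — bind user values as parameters, never interpolate them into the SQL string — carries over:

```python
import logging
import sqlite3

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("querytalk.executor")  # logger name is illustrative

def run_query(conn: sqlite3.Connection, sql: str, params: tuple = ()):
    """Execute SQL with bound parameters and log the outcome.
    Binding parameters (the ? placeholders) is what prevents injection."""
    try:
        rows = conn.execute(sql, params).fetchall()
        log.info("ok: %s params=%r rows=%d", sql, params, len(rows))
        return rows
    except sqlite3.Error as exc:  # invalid SQL, missing tables, etc.
        log.error("failed: %s (%s)", sql, exc)
        raise

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany("INSERT INTO sales VALUES (?, ?)",
                 [("east", 100.0), ("west", 250.0)])

# The user-supplied value is passed as a bound parameter, not spliced into SQL.
rows = run_query(conn, "SELECT SUM(amount) FROM sales WHERE region = ?", ("west",))
print(rows)  # [(250.0,)]
```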
Milestone 5: Result to NL Response Generator (Time estimate: 2 days)
Objective: Convert SQL result sets into user-friendly natural language summaries to
enhance usability.
Steps involved:
1. Design templated responses (e.g., "Total sales for March is $X").
2. Integrate an LLM to summarize complex outputs.
3. Handle edge cases (no results, large result sets).
4. Add contextual metadata (e.g., date range, filters used).
Outcomes:
1. Human-readable summaries generated for queries.
2. Template fallback for structured responses.
3. Seamless integration with the frontend.
Milestone 6: Frontend Integration (Time estimate: 2 days)
Objective: Create a user-friendly interface that enables users to input natural language
queries and receive answers in a conversational format.
Steps involved:
1. Build the UI in Streamlit with a conversational chat layout.
2. Integrate API endpoints (NL2SQL, query execution, response generation).
3. Add feedback options for users to rate responses.
4. Optimize for responsiveness and accessibility.
Outcomes:
1. A functional, intuitive frontend interface.
2. Fully integrated backend with real-time response flow.
9. Future Enhancements
Voice input support
Multi-database querying
Data visualization of results
User personalization and query history
Integration with BI tools (e.g., Power BI, Tableau)