Backend that lets users ask questions in plain English; questions are converted to SQL, run on a MySQL database, and answered with an LLM. Built for retail/product data with few-shot examples and semantic example selection.
I posted about this project on LinkedIn:
https://www.linkedin.com/posts/md-sohail-230141205_github-sohail3080querymind-querymind-activity-7430337701012246528-9Slq?utm_source=share&utm_medium=member_desktop&rcm=ACoAADQq6AQBraEbiT14ztiatVqIq2dy3pG4Gus
You can test the API using the Postman collection below:
Postman Collection:
https://www.postman.com/myselfmdsohail-1533277/querymind/request/52316234-72d3d142-83e7-42c1-881b-73c53ce7d4ad/?action=share&creator=52316234&ctx=documentation
- Create a virtual environment (recommended):

  ```
  python -m venv .venv
  ```

- Activate the virtual environment:
  - On Windows:

    ```
    .venv\Scripts\activate
    ```

  - On Linux/macOS:

    ```
    source .venv/bin/activate
    ```

- Install the packages listed in `requirements.txt` (ensure your venv is activated so they install into the virtual environment).
- Start the FastAPI server with Uvicorn:

  ```
  uvicorn main:app --reload
  ```
| Route | Method | Use |
|---|---|---|
| `/v1/api/query` | POST | Send a natural language question; the backend generates SQL, runs it on MySQL, and returns an LLM answer. |

Request body:

```json
{
  "query": "What electronics items cost less than 50 dollars?"
}
```

The backend uses few-shot examples (selected by semantic similarity from a Chroma vector store), the MySQL schema, and an OpenAI-compatible LLM to produce the SQL, execute it, and format the answer.
| Variable | Description |
|---|---|
| `OPENAI_API_KEY` | API key for the LLM (OpenAI or compatible endpoint). |
| `OPENAI_API_BASE` | Base URL for the API (e.g. `https://api.openai.com/v1` or your custom endpoint). |
| `LLM` | Model name/id (e.g. `gpt-4`, `gpt-3.5-turbo`). |
| `EMBEDDING_MODEL` | HuggingFace model for few-shot example embeddings (no default in code; set in `.env`; example in `.env.example`: `sentence-transformers/all-MiniLM-L6-v2`). |
| `db_user` | MySQL user (read-only recommended). |
| `db_password` | MySQL password. |
| `db_host` | MySQL host. |
| `db_name` | Database name. |
| `db_port` | MySQL port (e.g. 3306). |
Copy .env.example to .env and fill in your values.
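A filled-in `.env` might look like this (all values below are placeholders; the exact keys are the ones listed in the table above):

```
OPENAI_API_KEY=your-api-key
OPENAI_API_BASE=https://api.openai.com/v1
LLM=gpt-3.5-turbo
EMBEDDING_MODEL=sentence-transformers/all-MiniLM-L6-v2
db_user=readonly_user
db_password=your-password
db_host=localhost
db_name=your_database
db_port=3306
```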
Testing: This project was tested using OpenRouter with the model `arcee-ai/trinity-large-preview:free`. Use OpenRouter's API URL for `OPENAI_API_BASE` and your OpenRouter API key for `OPENAI_API_KEY`, and set `LLM` to `arcee-ai/trinity-large-preview:free` (or another OpenRouter model).
The following table was used during development. You can create it in your MySQL database to try the app as-is:
```sql
CREATE TABLE product_inventory (
    id INT AUTO_INCREMENT PRIMARY KEY,
    sku VARCHAR(64) NOT NULL UNIQUE,
    product_name VARCHAR(255) NOT NULL,
    category VARCHAR(255),
    price DECIMAL(10,2) NOT NULL,
    stock INT NOT NULL DEFAULT 0,
    supplier VARCHAR(255)
);
```

Adapt to your needs: change the few-shot examples in `config/few_shots.py` and the system prompt in `config/prompts.py` to match your schema and use case. Queries and examples should reflect your own tables, columns, and business logic.
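When adapting the few-shot examples, an entry might be shaped roughly like this (the dictionary keys and variable name here are illustrative assumptions; check `config/few_shots.py` for the structure the project actually uses):

```python
# Hypothetical shape of an entry in config/few_shots.py.
# The real keys in the project may differ; match them to your own code.
FEW_SHOTS = [
    {
        "question": "What electronics items cost less than 50 dollars?",
        "sql": (
            "SELECT product_name, price FROM product_inventory "
            "WHERE category = 'Electronics' AND price < 50;"
        ),
        "answer": "These electronics items cost less than 50 dollars: ...",
    },
]

# Each entry pairs a natural-language question with the SQL it should map to,
# so the LLM can imitate the pattern for similar questions.
for entry in FEW_SHOTS:
    assert {"question", "sql", "answer"} <= set(entry)
```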
- Chain-of-thought (CoT): The system prompt in `config/prompts.py` asks the LLM to reason step by step (understand the question → identify tables/columns → construct filters → ordering/limits) and output that reasoning before writing the SQL. This improves accuracy and debuggability.
- Few shots: Example question–SQL–answer pairs live in `config/few_shots.py`. The two examples most semantically similar to the user's question are retrieved from Chroma and injected into the prompt so the model can mimic style and patterns (e.g. price ranges, categories, suppliers).
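The selection step above can be sketched with plain cosine similarity. The project itself uses Chroma with HuggingFace sentence-transformer embeddings; the toy three-dimensional vectors below just stand in for real embeddings to show the ranking logic:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k_examples(query_vec, examples, k=2):
    """Return the k examples whose embedding is most similar to the query."""
    ranked = sorted(
        examples,
        key=lambda e: cosine(query_vec, e["embedding"]),
        reverse=True,
    )
    return ranked[:k]

# Toy embeddings standing in for sentence-transformer vectors.
examples = [
    {"question": "Cheap electronics?", "embedding": [0.9, 0.1, 0.0]},
    {"question": "Who supplies laptops?", "embedding": [0.1, 0.9, 0.0]},
    {"question": "Items under 50 dollars?", "embedding": [0.8, 0.2, 0.1]},
]
query = [0.85, 0.15, 0.05]  # embedding of the user's question

picked = top_k_examples(query, examples, k=2)
print([e["question"] for e in picked])
```

In the real app, Chroma performs this nearest-neighbor search over the stored example embeddings and the winners are injected into the prompt.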
- Text-to-SQL with few-shot learning: Natural language → SQL via LangChain `SQLDatabaseChain` and a MySQL-focused system prompt.
- Semantic example selection: Few-shot examples are stored in Chroma; the two examples most similar to the user question are selected (HuggingFace embeddings).
- Retail-oriented prompts: Prompts and examples in `config/prompts.py` and `config/few_shots.py` target product-name search, categories, price ranges, stock, and suppliers.
- Stateless FastAPI backend: Single query endpoint; no auth in the current code (add middleware if needed).
- Database access: Use a read-only MySQL user for safety. A read-only user can only run SELECT (and similar read-only operations), which keeps SQL injection from turning into data changes and prevents accidental UPDATE/DELETE/INSERT. You can configure and verify user privileges in MySQL Workbench (see the sources below for details on managing user privileges).
- LLM dependency: Requires an OpenAI-compatible API (OpenAI or a custom base URL). The app has no built-in API key auth; secure the endpoint (e.g. reverse proxy or FastAPI middleware) if exposed.
- Few-shot scope: The example set is fixed in code; best results when questions align with the retail schema and example types (prices, categories, stock, suppliers, etc.).
- No streaming: The response is returned in one shot after SQL execution and the LLM reply.
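The read-only MySQL user recommended above can be created with standard GRANT statements. The user name, password, and database name here are placeholders; substitute your own:

```sql
-- Placeholder names: replace 'querymind_ro', the password, and your_database.
CREATE USER 'querymind_ro'@'%' IDENTIFIED BY 'change-me';
GRANT SELECT ON your_database.* TO 'querymind_ro'@'%';
FLUSH PRIVILEGES;
```

Granting only SELECT means the connection used by the app cannot modify data even if a generated query goes wrong.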
- FastAPI Documentation
- Uvicorn
- Virtual environments | FastAPI
- YouTube video
- Managing user permissions in MySQL Workbench
- Agents
- LangChain SQL Chain
- ChatGoogleGenerativeAI integration
- Chroma Documentation
- HuggingFace Sentence Transformers
- SQLDatabase | LangChain Community
- Few-shot prompting
- Chain-of-thought prompting
This is a learning project to combine natural language Q&A, Text-to-SQL, FastAPI, LangChain, Chroma (for few-shot retrieval), and MySQL in a retail context. It is intended for experimentation and understanding.
If you find a bug, have a suggestion, or want to report an issue, please open an issue or reach out; feedback is welcome.
LinkedIn: My profile