A production-ready Retrieval-Augmented Generation (RAG) chatbot built for Global AI Hub. It indexes Q&A content, retrieves the most relevant chunks with Chroma, and generates helpful answers using Google Gemini.
- Simple RAG pipeline: chunk → embed → store → retrieve → generate
- Google Gemini 2.0 Flash for response generation
- `text-embedding-004` for high-quality embeddings
- Flask UI, Markdown rendering, persistent Chroma vector store
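The chunk → embed → store → retrieve flow above can be sketched end-to-end without any API key. This toy, dependency-free version is purely illustrative: the real project uses Chroma for storage, `text-embedding-004` for embeddings, and Gemini for generation, whereas here the "embeddings" are fake word-count vectors.

```python
# Toy sketch of the embed -> store -> retrieve steps. All names illustrative;
# the real pipeline lives in create_database.py and app.py.
import math
from collections import Counter

def embed(text: str) -> Counter:
    """Stand-in 'embedding': a bag-of-words count (not a real vector model)."""
    return Counter(text.lower().split())

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

store = []  # stand-in for the Chroma vector store: (embedding, chunk) pairs

def index_chunks(chunks):
    for c in chunks:
        store.append((embed(c), c))

def retrieve(question: str, k: int = 2):
    q = embed(question)
    ranked = sorted(store, key=lambda pair: cosine(q, pair[0]), reverse=True)
    return [c for _, c in ranked[:k]]

index_chunks([
    "Chroma persists the embedded chunks on disk.",
    "Gemini generates the final answer from retrieved context.",
    "The Flask app renders responses as Markdown.",
])
top = retrieve("which component persists embedded chunks?", k=1)
```

In the real pipeline the last step would pass `top` to Gemini as context for generation; the toy stops at retrieval.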
- Backend: Flask
- RAG: LangChain, Chroma
- LLM: Google Gemini (`gemini-2.0-flash`)
- Embeddings: Google `models/text-embedding-004`
- Data: Markdown Q&A (`data/soru_cevap.md`)
- Python 3.10+
- A valid Google API key with access to Gemini models
- Clone the repository

  ```bash
  git clone https://github.com/enesmanan/gaih-chatbot.git
  cd gaih-chatbot
  ```

- Install dependencies

  ```bash
  pip install -r requirements.txt
  ```
- Configure the environment

  Create a `.env` file in the project root:

  ```
  GOOGLE_API_KEY=your_google_api_key_here
  ```

- Create the vector database (on first run, or whenever the data or embedding model changes)

  ```bash
  python create_database.py
  ```
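`create_database.py` splits the Markdown into overlapping chunks before embedding them. The repo likely uses a LangChain text splitter for this; the sketch below only illustrates the overlap arithmetic with the configured sizes (2000-character chunks, 300-character overlap).

```python
# Illustrative character chunker matching the configured sizes.
# Not the project's actual splitter; it only shows how overlap works.
def split_with_overlap(text: str, chunk_size: int = 2000, overlap: int = 300):
    if chunk_size <= overlap:
        raise ValueError("chunk_size must exceed overlap")
    step = chunk_size - overlap  # each new chunk starts 1700 chars later
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break  # last chunk reached the end of the text
    return chunks

docs = split_with_overlap("x" * 5000)  # chunks start at 0, 1700, 3400
```

The overlap means the last 300 characters of each chunk reappear at the start of the next one, so an answer that straddles a chunk boundary is still retrievable in one piece.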
- Start the app

  ```bash
  python app.py
  ```

  Then open your browser at `http://localhost:5000`.
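`app.py` reads `GOOGLE_API_KEY` from the environment (populated from `.env`, typically via python-dotenv's `load_dotenv()`). If the app starts but answers fail, a quick preflight check like this one (illustrative, not part of the repo) confirms the key is actually visible to Python:

```python
import os

# Illustrative preflight: confirm GOOGLE_API_KEY is set and non-blank.
# Accepts a mapping so it can be checked against any environment dict.
def api_key_present(env=os.environ) -> bool:
    return bool(env.get("GOOGLE_API_KEY", "").strip())
```

A key made of whitespace counts as missing, which catches a common copy-paste mistake in `.env` files.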
- Generative model: `gemini-2.0-flash` (temperature=0.57)
- Embedding model: `models/text-embedding-004`
- Chunking: 2000-character chunks with 300-character overlap
- Retrieval: top k = 10 chunks per query
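At query time the app retrieves the top k = 10 chunks and places them in the prompt sent to Gemini. The actual template lives in `app.py` and is not shown in this README; a hypothetical assembly step might look like:

```python
# Hypothetical prompt assembly: retrieved chunks become a numbered context
# block ahead of the user's question. The real template in app.py may differ.
def build_prompt(question: str, chunks: list[str]) -> str:
    context = "\n\n".join(f"[{i + 1}] {c}" for i, c in enumerate(chunks))
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\n"
        "Answer:"
    )

prompt = build_prompt("What is RAG?", ["RAG combines retrieval with generation."])
```

Numbering the chunks makes it easy to ask the model to cite which context item supported its answer.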
```
gaih-chatbot/
├── data/
│   └── soru_cevap.md     # Q&A dataset (Markdown)
├── app.py                # Flask app + RAG query flow
├── create_database.py    # Chunk + embed + persist to Chroma
├── chroma_db/            # Persisted vector store (auto-created)
├── static/               # CSS and images
├── templates/            # HTML templates
├── requirements.txt      # Dependencies
├── .env                  # GOOGLE_API_KEY (not committed)
└── README.md             # This file
```