Entha is a proof-of-concept (PoC) implementation of multi-hop document retrieval that enhances traditional Retrieval-Augmented Generation (RAG) pipelines. It improves contextual understanding by not only retrieving the top-k documents most relevant to a query, but also expanding the context with the top-n documents most semantically similar to each of those initial results.
Traditional RAG systems retrieve a fixed number of documents (top-k) based solely on query relevance. However, this can miss related but indirectly connected context. Entha addresses this by:
- Retrieving the top-k documents relevant to the user's query.
- For each of those documents, retrieving top-n additional documents based on embedding similarity.
- Combining all results to create a richer and more connected context for the LLM to generate responses.
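The steps above can be sketched in plain Python. This is an illustrative toy, not Entha's actual code: it uses hand-rolled cosine similarity over in-memory vectors in place of Google Generative AI embeddings and ChromaDB, and the function names (`top_k`, `multi_hop_retrieve`) are hypothetical.

```python
from math import sqrt

def cosine(a, b):
    # Cosine similarity between two embedding vectors.
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b)))

def top_k(query, docs, k):
    # Indices of the k documents most similar to `query`.
    return sorted(range(len(docs)), key=lambda i: -cosine(query, docs[i]))[:k]

def multi_hop_retrieve(query, docs, k=2, n=1):
    """Hop 1: top-k docs for the query. Hop 2: top-n neighbors of each hit."""
    hits = top_k(query, docs, k)
    context = list(hits)
    for idx in hits:
        # Ask for n+1 neighbors so we can drop the document itself.
        neighbors = [j for j in top_k(docs[idx], docs, n + 1) if j != idx][:n]
        for j in neighbors:
            if j not in context:  # de-duplicate before building the context
                context.append(j)
    return context
```

In a real deployment the two `top_k` calls would be vector-store queries (e.g. ChromaDB similarity search), but the hop-and-deduplicate logic stays the same.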
- ✅ Multi-hop context retrieval (top-k + neighbors of top-k)
- ✅ Google Generative AI Embeddings for semantic similarity
- ✅ ChromaDB for vector storage and retrieval
- ✅ Streamlit-based interface for easy interaction
- ✅ Modular design for plug-and-play LLMs
- LangChain
- ChromaDB
- Google Generative AI
- Streamlit
```
User Query
   │
   ├──▶ Top-K Document Retrieval (based on embeddings)
   │        │
   │        └──▶ Top-N Neighbors (per document)
   │                 │
   └─────────────────┴──▶ Combined Context
                               │
                               ▼
                        Sent to LLM (Gemini)
```
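The final step merges the retrieved documents and the user query into a single prompt for the LLM. A minimal sketch of that assembly is below; the template layout and the `build_prompt` name are assumptions, not Entha's actual prompt format.

```python
def build_prompt(query, context_docs):
    # Hypothetical prompt template; Entha's real template may differ.
    context = "\n\n".join(
        f"[Doc {i + 1}] {doc}" for i, doc in enumerate(context_docs)
    )
    return (
        "Answer the question using only the context below.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {query}"
    )
```

Numbering the documents makes it easy for the LLM to cite which retrieved chunk supported its answer.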