linkhut

28 Dec 25

ai-engineering-hub/fastest-rag-milvus-groq at main · patchy631/ai-engineering-hub

https://github.com/patchy631/ai-engineering-hub/tree/main/fastest-rag-milvus-groq

In-depth tutorials on LLMs, RAGs and real-world AI agent applications.

This project puts together what it describes as the fastest stack for a RAG application, with retrieval latency under 15 ms.

It combines binary quantization, which keeps retrieval efficient, with Groq’s fast inference.
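
To make the binary quantization idea concrete, here is a minimal sketch in plain NumPy. It is not the repository's code and does not use Milvus or Groq; the function names `binarize` and `hamming_search`, the 128-dimensional random vectors, and the sign-based thresholding are all assumptions chosen for illustration. Each float vector is reduced to one bit per dimension, and candidates are ranked by Hamming distance between the packed bit codes.

```python
import numpy as np


def binarize(vectors: np.ndarray) -> np.ndarray:
    """Keep one bit per dimension (1 if the value is positive) and pack 8 bits per byte."""
    bits = (vectors > 0).astype(np.uint8)
    return np.packbits(bits, axis=1)


def hamming_search(query_code: np.ndarray, codes: np.ndarray, k: int):
    """Return the indices and Hamming distances of the k closest codes."""
    # XOR marks the bits that differ; unpacking and summing counts them per row.
    diff = np.bitwise_xor(codes, query_code)
    dists = np.unpackbits(diff, axis=1).sum(axis=1)
    order = np.argsort(dists, kind="stable")[:k]
    return order, dists[order]


# Hypothetical data: 1000 random "embeddings" of dimension 128.
rng = np.random.default_rng(0)
docs = rng.standard_normal((1000, 128)).astype(np.float32)
codes = binarize(docs)  # shape (1000, 16): 128 bits stored in 16 bytes per vector

# A query that is a slightly perturbed copy of document 42.
query = docs[42] + 0.05 * rng.standard_normal(128).astype(np.float32)
top_ids, top_dists = hamming_search(binarize(query[None, :])[0], codes, k=5)
print(top_ids, top_dists)
```

In this sketch each 128-dimensional float32 vector (512 bytes) shrinks to 16 bytes, and the comparison uses only XOR and bit counting, which is the general reason binary codes can speed up retrieval. The price is some loss of ranking precision, so systems that use binary codes often rescore a shortlist with the original vectors; whether this project does so is not stated here.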

by tmfnk 6 months ago
