🦛 CHONK your texts with Chonkie ✨ - The no-nonsense RAG chunking library
Updated Feb 17, 2025 - Python
MTEB: Massive Text Embedding Benchmark
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
Study guides for MIT's 15.003 Data Science Tools
Fast, accurate, lightweight Python library for generating state-of-the-art embeddings
A Heterogeneous Benchmark for Information Retrieval. Easy to use, evaluate your models across 15+ diverse IR datasets.
The official implementation of RAPTOR: Recursive Abstractive Processing for Tree-Organized Retrieval
Fast lexical search implementing BM25 in Python using NumPy, Numba, and SciPy
A realtime serving engine for Data-Intensive Generative AI Applications
Superlinked is a Python framework for AI Engineers building high-performance search & recommendation applications that combine structured and unstructured data.
An official implementation for "CLIP4Clip: An Empirical Study of CLIP for End to End Video Clip Retrieval"
SGPT: GPT Sentence Embeddings for Semantic Search
Implementation of RETRO, DeepMind's retrieval-based attention net, in PyTorch
Epsilla is a high-performance Vector Database Management System
Grounded search engine (i.e., with source references) based on LLM / ChatGPT / OpenAI API. It supports web search, file content search, etc.
My personal notes on local and global descriptors
Implementation of Memorizing Transformers (ICLR 2022), an attention net augmented with indexing and retrieval of memories using approximate nearest neighbors, in PyTorch