Stars
A modular graph-based Retrieval-Augmented Generation (RAG) system
A list of papers in the field of approximate nearest neighbor search on high-dimensional vectors.
A fast, distributed, high performance gradient boosting (GBT, GBDT, GBRT, GBM or MART) framework based on decision tree algorithms, used for ranking, classification and many other machine learning …
Source code for SIGMOD 2020 paper "Improving Approximate Nearest Neighbor Search through Learned Adaptive Early Termination"
Python code applying GBDT+LR to CTR prediction
[SIGMOD 2026] DARTH: Declarative Recall Through Early Termination for Approximate Nearest Neighbor Search.
📚 A from-scratch tutorial on vector database principles and practice. Read online: https://datawhalechina.github.io/easy-vectordb/
A low-latency, billion-scale, and updatable graph-based vector store on SSD.
kioxia-jp / aisaq-diskann
Forked from microsoft/DiskANN
All-in-Storage Solution based on DiskANN for DRAM-free Approximate Nearest Neighbor Search
Collection of various algorithms in mathematics, machine learning, computer science and physics implemented in C++ for educational purposes.
vsag is a vector indexing library used for similarity search.
Source code for the method proposed in the paper "Subspace Collision: An Efficient and Accurate Framework for High-dimensional Approximate Nearest Neighbor Search" (accepted by SIGMOD 2025).
Billion-scale Semantic Search dataset derived from Microsoft SpaceV for Vector Search benchmarks with smaller subsets
Exploration of vector database indexes for fast approximate nearest neighbor search.
Visualize HNSW, Faiss, and other ANNS indexes
A library of algorithms for approximate nearest neighbor search in high dimensions, along with a set of useful tools for designing such algorithms.
[SIGMOD '25] A fast parallel kd-tree implementation
WebGL point cloud viewer for large datasets
The Streamlit app uses cosine similarity to semantically match your query with Airbnb listings and find matching properties in our database using HNSW vs DiskANN.
Official implementation of the paper "OctreeOcc: Efficient and Multi-Granularity Occupancy Prediction Using Octree Queries"