patw/TinyANN


TinyDiskANN - Minimal Disk-Based Vector Search

TinyDiskANN is a lightweight vector search library designed for scenarios where:

  • You have billions of vectors but limited RAM
  • You need extreme simplicity without complex dependencies
  • Query latency of 100ms-1s is acceptable
  • You want zero configuration and minimal setup

Research Paper

Original Spec Doc

How It Works

Instead of building complex in-memory indexes like graphs or trees, TinyDiskANN:

  1. Compresses vectors using Product Quantization (PQ)
  2. Stores them sequentially on disk
  3. Scans through them linearly during queries
  4. Keeps only the current batch and top results in RAM

This brute-force approach is surprisingly effective when combined with:

  • Modern SSD throughput (3-7GB/s)
  • Smart batching and vectorization
  • Heavy compression (4-8 bits per dimension)
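The scan described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not TinyDiskANN's actual internals: it assumes a hypothetical layout where each vector is stored as one uint8 centroid index per subspace, and distances are computed with precomputed asymmetric-distance (ADC) lookup tables so the batch loop never touches the full-precision vectors:

```python
import numpy as np

def adc_scan(query, codebooks, codes, top_k=5, batch_size=100_000):
    """Linear PQ scan (sketch). Assumed shapes:
    codebooks: (n_sub, 256, sub_dim) float32 - one codebook per subspace
    codes:     (n, n_sub) uint8           - one centroid index per subspace
    """
    n_sub, n_centroids, sub_dim = codebooks.shape
    # Precompute squared distance from each query sub-vector to every
    # centroid: a (n_sub, 256) table, computed once per query.
    q_sub = query.reshape(n_sub, sub_dim)
    tables = ((codebooks - q_sub[:, None, :]) ** 2).sum(axis=2)

    dists = np.empty(len(codes), dtype=np.float32)
    for start in range(0, len(codes), batch_size):
        batch = codes[start:start + batch_size]
        # For each stored code, look up and sum its per-subspace distances.
        dists[start:start + len(batch)] = tables[np.arange(n_sub), batch].sum(axis=1)

    top = np.argsort(dists)[:top_k]
    return top, dists[top]
```

Because the table lookup replaces any per-vector arithmetic, the cost per query is dominated by reading the codes off disk, which is exactly where SSD throughput and batching pay off.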

Comparison to Other Methods

| Technique    | RAM Use | Query Speed | Accuracy  | Setup Complexity | Best For                         |
|--------------|---------|-------------|-----------|------------------|----------------------------------|
| TinyDiskANN  | Minimal | 100ms-1s    | Good      | None             | Memory-constrained billion-scale |
| HNSW         | High    | 1-10ms      | Excellent | Moderate         | Low-latency in-memory search     |
| FAISS (Flat) | High    | 1-100ms     | Perfect   | None             | Small datasets, exact search     |
| DiskANN      | Medium  | 10-100ms    | Excellent | High             | Balanced disk/memory use         |
| SPANN        | Medium  | 10-100ms    | Good      | High             | Very large distributed datasets  |

Key differences:

  • No graph construction - TinyDiskANN skips the hours-long index build phase
  • No memory spikes - RAM usage stays constant regardless of dataset size
  • Portable - Just Python + NumPy, no C++/CUDA dependencies
  • Predictable performance - Linear scan means consistent, if slow, queries

When To Use TinyDiskANN

Best for:

  • "Cold" archival data where queries are rare
  • Memory-constrained environments (even <1GB RAM)
  • Simple deployments where complex systems won't fit
  • Educational purposes to understand vector search basics

Not for:

  • Real-time applications needing <100ms latency
  • Frequently updated datasets (rebuilds are expensive)
  • Cases where 95%+ recall is mandatory

Quickstart

import numpy as np
from tinydiskann import TinyDiskANN

# 1. Create some random vectors
data = np.random.randn(10000, 128).astype(np.float32)

# 2. Train and store the index
codebooks = TinyDiskANN.train_codebooks(data)
pq_codes = TinyDiskANN.quantize_vectors(data, codebooks)
TinyDiskANN.store_pq_codes(pq_codes, codebooks, "my_index")

# 3. Search!
query = np.random.randn(128).astype(np.float32)
results = TinyDiskANN.search(query, codebooks, "my_index", top_k=5)

Benchmarks

On a standard laptop with SSD (1M 128-dim vectors):

  • Build time: ~2 minutes
  • Query latency: ~300ms
  • RAM usage: <50MB (vs 2GB+ for in-memory indexes)
  • Accuracy: ~80% recall@10 vs exact search
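The recall@10 figure above is the standard metric for approximate search: the fraction of each query's true 10 nearest neighbors (from exact search) that the approximate index also returns, averaged over queries. A minimal way to compute it, assuming you already have the two sets of result IDs:

```python
import numpy as np

def recall_at_k(approx_ids, exact_ids, k=10):
    """Mean fraction of the exact top-k neighbors that the approximate
    search also returned. Both arguments are per-query ID lists."""
    hits = [len(set(a[:k]) & set(e[:k])) for a, e in zip(approx_ids, exact_ids)]
    return sum(hits) / (k * len(exact_ids))
```

So "~80% recall@10" means that on average 8 of the 10 true nearest neighbors appear in TinyDiskANN's top 10, which is the accuracy cost of the heavy PQ compression.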

Limitations

This is purposefully minimal software. For production systems, consider:

  • Adding multi-threading for batch queries
  • Implementing memory-mapped files for zero-copy reads
  • Adding support for incremental updates

But sometimes, simple is exactly what you need!

About

An experiment in AI collaboration and automated research.
