#rag #vector-search #embedding #llm

oxibonsai-rag

Pure Rust RAG pipeline for OxiBonsai

9 releases

new 0.2.2 Jun 8, 2026
0.2.1 Jun 6, 2026
0.1.5 Jun 2, 2026
0.1.4 May 16, 2026
0.1.2 Apr 19, 2026

#154 in Science

76 downloads per month
Used in 5 crates (3 directly)

Apache-2.0

470KB
9K SLoC

Pure Rust Retrieval-Augmented Generation (RAG) pipeline for OxiBonsai.

Self-contained RAG stack: document chunking (character, sentence, paragraph, recursive, semantic, hierarchical, sliding window, markdown), pure Rust embedders (identity, TF-IDF), an in-memory vector store with cosine similarity, top-k retrieval, and an end-to-end prompt-building pipeline.
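To make the chunking stage concrete, here is a minimal sketch of a character sliding-window chunker in plain Rust. The function name sliding_window_chunks and its size and overlap parameters are hypothetical and are not taken from the oxibonsai-rag API; the crate's own chunkers may behave differently.

```rust
/// Split `text` into overlapping windows of `size` characters,
/// advancing by `size - overlap` characters per step.
/// Hypothetical helper for illustration only; not the oxibonsai-rag API.
fn sliding_window_chunks(text: &str, size: usize, overlap: usize) -> Vec<String> {
    assert!(size > 0, "window size must be positive");
    assert!(overlap < size, "overlap must be smaller than the window size");

    // Work on chars rather than bytes so multi-byte UTF-8 is never split mid-character.
    let chars: Vec<char> = text.chars().collect();
    let step = size - overlap;
    let mut chunks: Vec<String> = Vec::new();
    let mut start = 0;

    while start < chars.len() {
        let end = (start + size).min(chars.len());
        chunks.push(chars[start..end].iter().collect::<String>());
        if end == chars.len() {
            break; // this window already reached the end of the text
        }
        start += step;
    }
    chunks
}
```

With a window of 4 and an overlap of 2, the input "abcdefghij" yields the chunks abcd, cdef, efgh, and ghij. The overlap means a sentence that straddles a window boundary still appears whole in at least one chunk, at the cost of indexing some text twice.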

Part of the OxiBonsai project.

Status

Stable as of version 0.2.2, with 871 tests passing (cargo nextest run -p oxibonsai-rag). Promoted from Alpha in 0.1.2.

Features

  • RagPipeline — end-to-end index + query pipeline
  • VectorStore — in-memory L2-normalized cosine similarity search
  • Retriever — document indexing and top-k chunk retrieval
  • Embedder trait — pluggable embedding backends
  • IdentityEmbedder — hash-based embedder for testing
  • TfIdfEmbedder — bag-of-words TF-IDF embedding
  • Chunking strategies: character window, sentence, paragraph, recursive, sliding window, markdown, semantic (cosine boundary), hierarchical
  • ChunkerRegistry — dynamic dispatch for pluggable chunking backends
  • Zero external API calls — fully self-contained
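The interplay between a pluggable embedder, L2 normalization, and top-k cosine retrieval can be sketched as follows. This is an illustrative stand-in, not the crate's code: the Embedder trait shape, VocabCountEmbedder, MiniStore, and all method names here are assumptions, and the real Embedder, VectorStore, and Retriever types may have different signatures.

```rust
/// Hypothetical embedding interface; the crate's real `Embedder` trait may differ.
trait Embedder {
    fn embed(&self, text: &str) -> Vec<f32>;
}

/// Toy bag-of-words embedder: one dimension per vocabulary word, valued by its count.
struct VocabCountEmbedder {
    vocab: Vec<&'static str>,
}

impl Embedder for VocabCountEmbedder {
    fn embed(&self, text: &str) -> Vec<f32> {
        let lower = text.to_lowercase();
        let tokens: Vec<&str> = lower
            .split(|c: char| !c.is_alphanumeric())
            .filter(|t| !t.is_empty())
            .collect();
        self.vocab
            .iter()
            .map(|w| tokens.iter().filter(|t| *t == w).count() as f32)
            .collect()
    }
}

/// Scale `v` to unit L2 length; a zero vector is returned unchanged.
fn l2_normalize(mut v: Vec<f32>) -> Vec<f32> {
    let norm = v.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm > 0.0 {
        for x in v.iter_mut() {
            *x /= norm;
        }
    }
    v
}

/// Minimal in-memory store (hypothetical). Vectors are normalized on insert,
/// so cosine similarity reduces to a plain dot product at query time.
struct MiniStore {
    entries: Vec<(String, Vec<f32>)>,
}

impl MiniStore {
    fn new() -> Self {
        MiniStore { entries: Vec::new() }
    }

    fn add(&mut self, text: &str, embedding: Vec<f32>) {
        self.entries.push((text.to_string(), l2_normalize(embedding)));
    }

    /// Return the `k` stored texts most similar to `query`, best match first.
    fn top_k(&self, query: Vec<f32>, k: usize) -> Vec<(&str, f32)> {
        let q = l2_normalize(query);
        let mut scored: Vec<(&str, f32)> = self
            .entries
            .iter()
            .map(|(text, v)| {
                let dot: f32 = v.iter().zip(&q).map(|(a, b)| a * b).sum();
                (text.as_str(), dot)
            })
            .collect();
        scored.sort_by(|a, b| b.1.total_cmp(&a.1)); // descending by similarity
        scored.truncate(k);
        scored
    }
}
```

Normalizing once at insert time is a common design choice for cosine search: each query then costs one dot product per stored vector, with no per-comparison norm computation. The linear scan shown here is only reasonable for small in-memory collections.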

Usage

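# Cargo.toml
[dependencies]
oxibonsai-rag = "0.2.2"

// src/main.rs (wrapped in main so that `?` compiles; assumes the crate's error type implements std::error::Error)
use oxibonsai_rag::RagPipeline;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let mut pipeline = RagPipeline::default();
    pipeline.index_document("Rust is a systems programming language.")?;
    let prompt = pipeline.build_prompt("What is Rust?")?;
    println!("{prompt}"); // assumes the returned prompt is a String
    Ok(())
}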
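The format of the prompt returned by build_prompt is not described here. As a rough illustration of what the prompt-building step in a RAG pipeline typically does, the sketch below joins retrieved chunks into a numbered context section followed by the question. The function assemble_prompt and its template text are hypothetical and may not match what RagPipeline produces.

```rust
/// Join retrieved chunks into a numbered context section followed by the question.
/// Hypothetical template for illustration; the output of
/// `RagPipeline::build_prompt` may be formatted differently.
fn assemble_prompt(chunks: &[&str], question: &str) -> String {
    let mut prompt =
        String::from("Use the context below to answer the question.\n\nContext:\n");
    for (i, chunk) in chunks.iter().enumerate() {
        // Number each chunk so a model's answer can refer back to its source.
        prompt.push_str(&format!("[{}] {}\n", i + 1, chunk));
    }
    prompt.push_str(&format!("\nQuestion: {}\nAnswer:", question));
    prompt
}
```

In a full pipeline, the chunks argument would be the top-k results from the retrieval step for the same question.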

License

Apache-2.0 — COOLJAPAN OU

Dependencies

~5.5–9MB
~87K SLoC