ARPAHLS/vic_aisaq_demo

vic_aisaq_demo

VIC-style tiered retrieval + native KIOXIA AiSAQ backend.

[Figure: Latency vs Dataset Size]

This repository is a live demonstration of storage-aware AI retrieval for edge and controller-style environments:

  • L0 metadata filtering over filesystem records
  • L1 native AiSAQ ANN search (--use_aisaq) over flash-first index artifacts
  • L2 selective deep parsing for ranked evidence snippets

The idea is straightforward: keep data on disk, narrow candidates cheaply, and only spend memory where it adds value. VIC contributes query planning and tier orchestration, while AiSAQ contributes flash-oriented ANN search for low-DRAM operation. The result is a local-first reference flow that is auditable, reproducible, and easy to present live.

For broader context on why this direction matters, see the Computational Storage Landscape report.

Overview

vic_aisaq_demo combines two proven open-source building blocks:

  • lc0_vic: a tiered retrieval controller that plans and orchestrates search in layers.
  • aisaq-diskann: a flash-oriented ANN backend optimized for low-DRAM environments.

In plain terms: metadata narrows the search space first, vector retrieval finds semantic candidates second, and deep parsing is reserved for a small shortlist.

Execution Flow

  1. Librarian / plan: turn a natural-language question into retrieval intent.
  2. L0 metadata filter: reduce candidate files by extension/size/time/path hints.
  3. L1 vector search: run native AiSAQ ANN search over embeddings.
  4. L2 deep read: parse only top files and attach snippets/evidence.
  5. Ranked response: return paths, scores, tiers, and run metrics.

The funnel below shows how the controller keeps deep parsing affordable by reducing the workload at each tier.

Tier Funnel: L0 to L2

Starting from a large directory, L0 and L1 progressively reduce candidates so L2 parses only a tiny subset before final ranking.
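
The tiered flow above can be illustrated with a small, self-contained Python sketch. It is not the repository's implementation: `FileRecord`, `l0_filter`, `l1_search`, `l2_deep_read`, and `run_funnel` are hypothetical names, the two-dimensional vectors are toy data, and the L1 step uses brute-force cosine ranking as a stand-in for the native AiSAQ ANN search.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class FileRecord:
    path: str
    size_bytes: int
    embedding: list[float]  # toy vector; a real run would use model embeddings
    text: str


def l0_filter(records, extensions, max_bytes):
    """L0: cheap metadata filter on extension and size (no file contents read)."""
    return [
        r for r in records
        if r.path.lower().endswith(tuple(extensions)) and r.size_bytes <= max_bytes
    ]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def l1_search(records, query_vec, k):
    """L1: brute-force cosine ranking, standing in for the AiSAQ ANN search."""
    scored = [(cosine(r.embedding, query_vec), r) for r in records]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:k]


def l2_deep_read(scored, keyword, width=40):
    """L2: inspect only the shortlist and attach a snippet around the keyword."""
    results = []
    for score, rec in scored:
        pos = rec.text.lower().find(keyword.lower())
        snippet = rec.text[max(0, pos - width): pos + width] if pos >= 0 else ""
        results.append({"path": rec.path, "score": round(score, 3), "snippet": snippet})
    return results


def run_funnel(records, query_vec, keyword, k=2):
    """Run L0 -> L1 -> L2 and report how many candidates survive each tier."""
    l0 = l0_filter(records, extensions=(".txt", ".md"), max_bytes=1_000_000)
    l1 = l1_search(l0, query_vec, k)
    l2 = l2_deep_read(l1, keyword)
    metrics = {"l0": len(l0), "l1": len(l1), "l2": len(l2)}
    return l2, metrics
```

Calling `run_funnel(records, query_vec, "penalty", k=1)` on a handful of `FileRecord` objects returns the ranked shortlist plus per-tier counts, which mirrors the shrinking candidate set described for each tier.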

Quick Start

1) Build AiSAQ binaries in WSL (assumes aisaq-diskann is already cloned to ~/aisaq-diskann)

cd ~/aisaq-diskann
git checkout aisaq_release
git submodule update --init --recursive
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j2
ls -l ~/aisaq-diskann/build/apps/build_disk_index ~/aisaq-diskann/build/apps/search_disk_index

2) Prepare WSL Python runtime for this repo

python3 -m venv ~/venvs/vic_aisaq_demo
source ~/venvs/vic_aisaq_demo/bin/activate
cd <repo-root>
python3 -m pip install -r requirements.txt

3) Set environment

GW=$(ip route | awk '/default/ {print $3; exit}')
export VIC_OLLAMA_BASE_URL="http://$GW:11434"
export VIC_OLLAMA_EMBED_MODEL="embeddinggemma"
export VIC_OLLAMA_MODEL="qwen2.5:0.5b"
export VIC_PRIVACY_MODE="cloud_reasoning_ok"

Ollama

  • VIC_OLLAMA_MODEL is the planner model used by the Librarian step to translate natural-language requests into retrieval intent (filters, semantic focus, tier strategy).
  • VIC_OLLAMA_EMBED_MODEL is the embedding model used to convert text into vectors for L1 ANN search.
  • VIC_OLLAMA_BASE_URL is the Ollama API endpoint used by both planner and embed calls.

Recommended baseline profile for this demo:

  • Planner: qwen2.5:0.5b (fast, lightweight for local planning)
  • Embeddings: embeddinggemma (stable general-purpose embedding model)

To try alternative models, set the same environment variables before running scripts. For practical model-selection guidance, see docs/RUNTIME.md.
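
As a rough illustration of how the three variables fit together, the sketch below reads them from the environment and prepares an embedding request. It is a minimal sketch, not the repository's client code: the helper names (`ollama_settings`, `build_embed_request`, `embed`) are hypothetical, the fallback defaults are assumptions based on the recommended baseline above, and it assumes an Ollama version that exposes the `/api/embed` endpoint.

```python
import json
import os
import urllib.request


def ollama_settings(env=os.environ):
    """Read the three VIC_OLLAMA_* variables; the fallback values are assumptions."""
    return {
        "base_url": env.get("VIC_OLLAMA_BASE_URL", "http://localhost:11434").rstrip("/"),
        "planner_model": env.get("VIC_OLLAMA_MODEL", "qwen2.5:0.5b"),
        "embed_model": env.get("VIC_OLLAMA_EMBED_MODEL", "embeddinggemma"),
    }


def build_embed_request(text, settings):
    """Build (but do not send) a POST request for Ollama's /api/embed endpoint."""
    payload = json.dumps({"model": settings["embed_model"], "input": text}).encode()
    return urllib.request.Request(
        settings["base_url"] + "/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def embed(text, settings, timeout=30):
    """Send the request and return the first embedding vector from the response."""
    request = build_embed_request(text, settings)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return json.load(resp)["embeddings"][0]
```

Keeping request construction separate from sending makes it possible to check which endpoint and model a run would use before any network call is made, which can help when the WSL gateway address changes.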

4) Build index and run query

mkdir -p audits
ts=$(date +%Y%m%d_%H%M%S)
python3 scripts/build_aisaq_index.py --root data/sample_drive --aisaq-root /home/$USER/aisaq-diskann --index-dir data/aisaq_index 2>&1 | tee "audits/demo_index_${ts}.log"
python3 scripts/run_query.py "Find the Q3 2025 contract that mentions penalty clauses" --aisaq-root /home/$USER/aisaq-diskann 2>&1 | tee "audits/demo_smoke_${ts}.log"

5) Benchmark

ts=$(date +%Y%m%d_%H%M%S)
python3 scripts/compare_backends.py --aisaq-root /home/$USER/aisaq-diskann --out-json docs/benchmarks.json 2>&1 | tee "audits/demo_bench_${ts}.log"

Example Query

python3 scripts/run_query.py \
  "Find the Q3 2025 contract that mentions penalty clauses" \
  --aisaq-root /home/$USER/aisaq-diskann

Expected output includes:

  • query summary line
  • top ranked files with tier labels
  • per-run metrics (l0, l1, l2, latency, index footprint)

The composition view below explains why this pipeline surfaces relevant files even when file names are not obvious keyword matches.

Result Composition: direct vs in-file vs semantic

Compared with direct-matching baselines, VIC + AiSAQ shifts more results toward in-file and semantic evidence rather than relying only on path/name hits.

Systems Rationale

  • It keeps retrieval local-first and auditable.
  • It separates inexpensive filtering from expensive deep parsing.
  • It makes the core systems trade-off explicit: lower DRAM footprint in exchange for some latency overhead.
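
To make the DRAM side of that trade-off concrete, here is a back-of-the-envelope calculation in Python. All inputs are assumptions chosen for illustration (one million vectors, 768 float32 dimensions, 32-byte compressed codes); they are not measurements from this repository or from AiSAQ.

```python
def vectors_in_dram_bytes(num_vectors, dim, bytes_per_value=4):
    """Footprint when full-precision (float32) vectors are held in memory."""
    return num_vectors * dim * bytes_per_value


def compressed_codes_bytes(num_vectors, code_bytes):
    """Footprint of per-vector compressed codes (for example, product quantization)."""
    return num_vectors * code_bytes


def to_gib(n_bytes):
    return n_bytes / 2**30


n, dim = 1_000_000, 768  # assumed corpus size and embedding width
full = vectors_in_dram_bytes(n, dim)
codes = compressed_codes_bytes(n, code_bytes=32)  # assumed 32-byte codes
print(f"full-precision vectors: {to_gib(full):.2f} GiB")
print(f"32-byte codes:          {to_gib(codes):.3f} GiB")
```

Under these assumptions the full-precision vectors need roughly 2.86 GiB while the compressed codes need about 0.03 GiB. A flash-first design aims to keep as much of this data as possible on storage, at the cost of extra reads per query.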

The two charts below summarize that systems trade-off and scaling behavior.

Query Latency vs Dataset Size

Traditional search can be very fast on tiny directories, while tiered retrieval latency stays more stable as the dataset grows.

DRAM Footprint by Retrieval Method

AiSAQ-oriented retrieval is designed for low-DRAM environments where memory budget is often the hard constraint.

For strategic background, see:

What is tracked vs generated

  • Tracked: source code, scripts, docs, notebook, sample input corpus.
  • Generated (ignored): audits/, data/aisaq_index/, local virtual envs, benchmark JSON outputs.

This keeps the repo portable while preserving reproducible command flows.

Documentation Map

  • docs/ARCHITECTURE.md — runtime/data-flow details
  • docs/RUNTIME.md — environment assumptions, network notes, and model selection guide
  • docs/BENCHMARKS.md — benchmark interpretation
  • docs/KIOXIA_CONTEXT.md — integration framing for storage audiences
  • docs/RESULTS_SUMMARY.md — shareable results checklist/template
  • notebooks/demo_walkthrough.ipynb — guided runbook

Script Index

  • scripts/build_aisaq_index.py — build native AiSAQ index and row/path map
  • scripts/run_query.py — execute one natural-language query through L0 -> L1 -> L2
  • scripts/compare_backends.py — benchmark baseline vs native AiSAQ path
  • scripts/metrics.py — print compact summary from docs/benchmarks.json
  • scripts/demo_session.py — run a two-query demo sequence for live walkthroughs
  • scripts/sanitize_audits.py — scrub local absolute paths from logs before sharing
  • scripts/setup_wsl_aisaq.sh — WSL helper to build AiSAQ binaries from source
  • scripts/run_option_a.ps1 — optional Windows helper that wraps WSL build/index/smoke steps

Shareable Audit Sanitization

To sanitize local absolute paths in logs before sharing:

python3 scripts/sanitize_audits.py --input-dir audits --output-dir audits/sanitized --glob "*.log"
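
For readers curious what this kind of scrubbing can look like, the following is a minimal sketch and not the contents of scripts/sanitize_audits.py. The patterns, the `scrub_line` helper, and the `<HOME>` placeholder are all assumptions made for illustration.

```python
import re

# Assumed patterns: POSIX home directories, WSL-mounted Windows profiles,
# and native Windows user profiles.
_HOME_PATTERNS = [
    re.compile(r"/home/[^/\s]+"),
    re.compile(r"/mnt/[a-z]/Users/[^/\s]+"),
    re.compile(r"[A-Za-z]:\\Users\\[^\\\s]+"),
]


def scrub_line(line, placeholder="<HOME>"):
    """Replace user-specific home-directory prefixes with a placeholder."""
    for pattern in _HOME_PATTERNS:
        line = pattern.sub(placeholder, line)
    return line
```

Applying `scrub_line` to each line of a log keeps relative structure such as `<HOME>/aisaq-diskann/build` readable while dropping the username.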

Built & Maintained by ARPA Hellenic Logical Systems & the Community