VIC-style tiered retrieval + native KIOXIA AiSAQ backend.
This repository is a live demonstration of storage-aware AI retrieval for edge and controller-style environments:
- L0 metadata filtering over filesystem records
- L1 native AiSAQ ANN search (--use_aisaq) over flash-first index artifacts
- L2 selective deep parsing for ranked evidence snippets
The idea is straightforward: keep data on disk, narrow candidates cheaply, and only spend memory where it adds value. VIC contributes query planning and tier orchestration, while AiSAQ contributes flash-oriented ANN search for low-DRAM operation. The result is a local-first reference flow that is auditable, reproducible, and easy to present live.
For broader context on why this direction matters, see the Computational Storage Landscape report.
vic_aisaq_demo combines two proven open-source building blocks:
- lc0_vic: a tiered retrieval controller that plans and orchestrates search in layers.
- aisaq-diskann: a flash-oriented ANN backend optimized for low-DRAM environments.
In plain terms: metadata narrows the search space first, vector retrieval finds semantic candidates second, and deep parsing is reserved for a small shortlist.
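The metadata-first step can be sketched in a few lines of Python. This is a minimal illustration, not the controller's actual code: `RetrievalPlan`, its fields, and `l0_metadata_filter` are hypothetical names chosen for this example. The point it shows is that L0 decides from `stat()` metadata alone and never opens a file.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path


@dataclass
class RetrievalPlan:
    """Hypothetical plan shape; the real planner output may differ."""
    extensions: tuple = (".txt", ".md")
    max_bytes: int = 1_000_000
    modified_after: float = 0.0  # Unix timestamp
    path_hint: str = ""          # substring expected in the relative path


def l0_metadata_filter(root: Path, plan: RetrievalPlan) -> list:
    """Keep files whose metadata matches the plan, without reading contents."""
    kept = []
    for path in sorted(root.rglob("*")):
        if not path.is_file():
            continue
        stat = path.stat()
        rel = path.relative_to(root).as_posix()
        if path.suffix.lower() not in plan.extensions:
            continue
        if stat.st_size > plan.max_bytes:
            continue
        if stat.st_mtime < plan.modified_after:
            continue
        if plan.path_hint and plan.path_hint not in rel:
            continue
        kept.append(rel)
    return kept


def demo() -> list:
    # Build a throwaway directory so the example is self-contained.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        (root / "contracts").mkdir()
        (root / "contracts" / "q3_2025.txt").write_text("penalty clauses")
        (root / "contracts" / "scan.png").write_bytes(b"\x89PNG")
        (root / "notes.md").write_text("misc")
        plan = RetrievalPlan(path_hint="contracts")
        return l0_metadata_filter(root, plan)


print(demo())
```

Only the text file under contracts/ survives: the PNG fails the extension check and notes.md fails the path hint.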
- Librarian / plan: turn a natural-language question into retrieval intent.
- L0 metadata filter: reduce candidate files by extension/size/time/path hints.
- L1 vector search: run native AiSAQ ANN search over embeddings.
- L2 deep read: parse only top files and attach snippets/evidence.
- Ranked response: return paths, scores, tiers, and run metrics.
The funnel below shows how the controller keeps deep parsing affordable by reducing the workload at each tier.
Starting from a large directory, L0 and L1 progressively reduce candidates so L2 parses only a tiny subset before final ranking.
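The narrowing behaviour can be illustrated with a toy, in-memory version of the three tiers. Everything here is an assumption made for the sketch: the corpus, the three-dimensional vectors, and the function names are invented, and a brute-force cosine ranking stands in for the native AiSAQ ANN search.

```python
import math

# Toy corpus: every path, vector, and text below is invented for illustration.
CORPUS = [
    {"path": "contracts/q3_2025_supply.txt", "ext": ".txt",
     "vec": [0.9, 0.1, 0.0], "text": "Late delivery triggers penalty clauses."},
    {"path": "contracts/q1_2024_lease.txt", "ext": ".txt",
     "vec": [0.2, 0.8, 0.1], "text": "Lease terms and renewal options."},
    {"path": "photos/team.jpg", "ext": ".jpg",
     "vec": [0.0, 0.1, 0.9], "text": ""},
    {"path": "notes/penalty_review.txt", "ext": ".txt",
     "vec": [0.7, 0.3, 0.1], "text": "Review of penalty wording in Q3 drafts."},
]


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def l0_filter(records, allowed_exts):
    """L0: cheap metadata check (extension only in this toy)."""
    return [r for r in records if r["ext"] in allowed_exts]


def l1_search(records, query_vec, k):
    """L1 stand-in: brute-force top-k by cosine similarity (not AiSAQ)."""
    ranked = sorted(records, key=lambda r: cosine(r["vec"], query_vec), reverse=True)
    return ranked[:k]


def l2_deep_read(records, keyword):
    """L2: read only the shortlist and attach a snippet as evidence."""
    return [{"path": r["path"], "snippet": r["text"]}
            for r in records if keyword.lower() in r["text"].lower()]


def run_funnel(query_vec, keyword, allowed_exts=(".txt",), k=2):
    stage0 = l0_filter(CORPUS, allowed_exts)
    stage1 = l1_search(stage0, query_vec, k)
    stage2 = l2_deep_read(stage1, keyword)
    counts = {"total": len(CORPUS), "l0": len(stage0),
              "l1": len(stage1), "l2": len(stage2)}
    return stage2, counts


results, counts = run_funnel([1.0, 0.0, 0.0], "penalty")
print(counts)
for hit in results:
    print(hit["path"], "->", hit["snippet"])
```

The counts shrink at each tier, so the expensive text read in `l2_deep_read` touches only the `k` files that survived the first two stages.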
cd ~/aisaq-diskann
git checkout aisaq_release
git submodule update --init --recursive
mkdir -p build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
make -j2
ls -l ~/aisaq-diskann/build/apps/build_disk_index ~/aisaq-diskann/build/apps/search_disk_index
python3 -m venv ~/venvs/vix_aisaq_demo
source ~/venvs/vix_aisaq_demo/bin/activate
cd <repo-root>
python3 -m pip install -r requirements.txt
GW=$(ip route | awk '/default/ {print $3; exit}')
export VIC_OLLAMA_BASE_URL="http://$GW:11434"
export VIC_OLLAMA_EMBED_MODEL="embeddinggemma"
export VIC_OLLAMA_MODEL="qwen2.5:0.5b"
export VIC_PRIVACY_MODE="cloud_reasoning_ok"
- VIC_OLLAMA_MODEL is the planner model used by the Librarian step to translate natural-language requests into retrieval intent (filters, semantic focus, tier strategy).
- VIC_OLLAMA_EMBED_MODEL is the embedding model used to convert text into vectors for L1 ANN search.
- VIC_OLLAMA_BASE_URL is the Ollama API endpoint used by both planner and embed calls.
Recommended baseline profile for this demo:
- Planner: qwen2.5:0.5b (fast, lightweight for local planning)
- Embeddings: embeddinggemma (stable general-purpose embedding model)
To try alternative models, set the same environment variables before running scripts. For practical model-selection guidance, see docs/RUNTIME.md.
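As a rough sketch of how a script might consume these variables, the snippet below reads them into a small settings object and prepares (but does not send) an embedding request. The `OllamaSettings` class, the helper names, and the `http://localhost:11434` fallback are assumptions for illustration. The request targets Ollama's `/api/embed` endpoint; whether the repository's scripts use that endpoint or a client library is not stated here.

```python
import json
import os
import urllib.request
from dataclasses import dataclass


@dataclass(frozen=True)
class OllamaSettings:
    """Hypothetical container for the three VIC_OLLAMA_* variables."""
    base_url: str
    planner_model: str
    embed_model: str


def load_settings(env=None) -> OllamaSettings:
    """Read settings from a mapping (defaults to os.environ)."""
    env = os.environ if env is None else env
    return OllamaSettings(
        # Fallback URL is an assumption; the demo sets it from the gateway IP.
        base_url=env.get("VIC_OLLAMA_BASE_URL", "http://localhost:11434").rstrip("/"),
        planner_model=env.get("VIC_OLLAMA_MODEL", "qwen2.5:0.5b"),
        embed_model=env.get("VIC_OLLAMA_EMBED_MODEL", "embeddinggemma"),
    )


def build_embed_request(settings: OllamaSettings, text: str) -> urllib.request.Request:
    """Prepare a POST to /api/embed; the caller decides when to send it."""
    payload = json.dumps({"model": settings.embed_model, "input": text}).encode("utf-8")
    return urllib.request.Request(
        f"{settings.base_url}/api/embed",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


settings = load_settings({"VIC_OLLAMA_BASE_URL": "http://10.0.0.1:11434/"})
request = build_embed_request(settings, "penalty clauses")
print(request.full_url)
print(settings.planner_model, settings.embed_model)
```

Passing the environment as a mapping keeps the function easy to test, and swapping models means changing only the exported variables, not the code.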
ts=$(date +%Y%m%d_%H%M%S)
python3 scripts/build_aisaq_index.py --root data/sample_drive --aisaq-root /home/$USER/aisaq-diskann --index-dir data/aisaq_index 2>&1 | tee "audits/demo_index_${ts}.log"
python3 scripts/run_query.py "Find the Q3 2025 contract that mentions penalty clauses" --aisaq-root /home/$USER/aisaq-diskann 2>&1 | tee "audits/demo_smoke_${ts}.log"
ts=$(date +%Y%m%d_%H%M%S)
python3 scripts/compare_backends.py --aisaq-root /home/$USER/aisaq-diskann --out-json docs/benchmarks.json 2>&1 | tee "audits/demo_bench_${ts}.log"
python3 scripts/run_query.py \
"Find the Q3 2025 contract that mentions penalty clauses" \
--aisaq-root /home/$USER/aisaq-diskann
Expected output includes:
- query summary line
- top ranked files with tier labels
- per-run metrics (l0, l1, l2, latency, index footprint)
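One way to read the per-run metrics is as a funnel: each tier count expressed as a share of the starting corpus. The sketch below assumes a hypothetical metrics dictionary; the key names (`total_files`, `latency_ms`, `index_bytes`) and all numbers are invented for illustration and are not output from the demo scripts.

```python
def summarize_run(metrics: dict) -> str:
    """Format tier counts as shares of the corpus, plus latency and index size."""
    total = metrics["total_files"]
    parts = []
    for tier in ("l0", "l1", "l2"):
        count = metrics[tier]
        share = 100.0 * count / total if total else 0.0
        parts.append(f"{tier}={count} ({share:.1f}%)")
    parts.append(f"latency={metrics['latency_ms']:.0f}ms")
    parts.append(f"index={metrics['index_bytes'] / 2**20:.1f}MiB")
    return " ".join(parts)


# Made-up values, chosen only to show the formatting.
example = {
    "total_files": 2000,
    "l0": 400,
    "l1": 20,
    "l2": 4,
    "latency_ms": 183.4,
    "index_bytes": 52_428_800,
}
summary = summarize_run(example)
print(summary)
```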
The composition view below explains why this pipeline surfaces relevant files even when file names are not obvious keyword matches.
Compared with direct matching baselines, VIC + AiSAQ shifts more results toward in-file and semantic evidence, not only path/name hits.
- It keeps retrieval local-first and auditable.
- It separates inexpensive filtering from expensive deep parsing.
- It makes the core systems trade-off explicit: lower DRAM footprint in exchange for some latency overhead.
The two charts below summarize that systems trade-off and scaling behavior.
Traditional search can be very fast on tiny directories, while tiered retrieval behavior is more stable as dataset size grows.
AiSAQ-oriented retrieval is designed for low-DRAM environments where memory budget is often the hard constraint.
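A back-of-envelope calculation shows why the memory budget becomes the hard constraint. The sketch below only estimates what it would cost to hold raw float32 vectors in RAM; it does not model AiSAQ's actual footprint. The 768-dimension figure is an assumption about the embedding model's output size and should be checked against the model in use.

```python
def raw_vector_bytes(num_vectors: int, dim: int, bytes_per_value: int = 4) -> int:
    """Bytes needed to keep every vector resident as float32 (4 bytes/value)."""
    return num_vectors * dim * bytes_per_value


def to_gib(num_bytes: int) -> float:
    return num_bytes / 2**30


DIM = 768  # assumed embedding dimension

for n in (10_000, 1_000_000, 100_000_000):
    size = raw_vector_bytes(n, DIM)
    print(f"{n:>11,} vectors -> {to_gib(size):8.2f} GiB of raw float32")
```

Under these assumptions, one million vectors already need roughly 2.86 GiB before any index structure is added, which is more than many edge or controller-class devices can dedicate to search.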
For strategic background, see:
- Tracked: source code, scripts, docs, notebook, sample input corpus.
- Generated (ignored): audits/, data/aisaq_index/, local virtual envs, benchmark JSON outputs.
This keeps the repo portable while preserving reproducible command flows.
- docs/ARCHITECTURE.md: runtime/data-flow details
- docs/RUNTIME.md: environment assumptions, network notes, and model selection guide
- docs/BENCHMARKS.md: benchmark interpretation
- docs/KIOXIA_CONTEXT.md: integration framing for storage audiences
- docs/RESULTS_SUMMARY.md: shareable results checklist/template
- notebooks/demo_walkthrough.ipynb: guided runbook
- scripts/build_aisaq_index.py: build native AiSAQ index and row/path map
- scripts/run_query.py: execute one natural-language query through L0 -> L1 -> L2
- scripts/compare_backends.py: benchmark baseline vs native AiSAQ path
- scripts/metrics.py: print compact summary from docs/benchmarks.json
- scripts/demo_session.py: run a two-query demo sequence for live walkthroughs
- scripts/sanitize_audits.py: scrub local absolute paths from logs before sharing
- scripts/setup_wsl_aisaq.sh: WSL helper to build AiSAQ binaries from source
- scripts/run_option_a.ps1: optional Windows helper that wraps WSL build/index/smoke steps
To sanitize local absolute paths in logs before sharing:
python3 scripts/sanitize_audits.py --input-dir audits --output-dir audits/sanitized --glob "*.log"
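For readers curious what this kind of scrubbing involves, here is a minimal sketch of the idea. It is not the implementation of scripts/sanitize_audits.py: the pattern, the `<user>` placeholder, and the function names are assumptions for illustration. It replaces the user-name segment of home-directory paths and leaves the rest of each line intact.

```python
import re

# Matches "/home/<name>" or "/Users/<name>" (the latter also covers
# WSL-style paths such as /mnt/c/Users/<name>).
HOME_PATTERN = re.compile(r"(/home/|/Users/)[^/\s]+")


def scrub_line(line: str) -> str:
    """Replace the user-name segment of a home path with a placeholder."""
    return HOME_PATTERN.sub(r"\1<user>", line)


def scrub_text(text: str) -> str:
    """Scrub every line of a log while preserving line breaks."""
    return "\n".join(scrub_line(line) for line in text.split("\n"))


sample = (
    "index built at /home/alice/aisaq-diskann/build\n"
    "source dir /mnt/c/Users/alice/docs\n"
    "no paths on this line"
)
cleaned = scrub_text(sample)
print(cleaned)
```

A real sanitizer would likely need more patterns (hostnames, Windows drive paths, tokens), so treat this as a starting point and review the output before sharing logs.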