Releases: vlwkaos/ir
v0.15.0
Features
- Linux GPU backends (Cargo.toml, src/llm/mod.rs): added llama-cuda, llama-rocm, and llama-vulkan Cargo features. Build with --features llama-cuda (NVIDIA), llama-rocm (AMD), or llama-vulkan (cross-platform) to enable GPU acceleration on Linux. IR_GPU_LAYERS now defaults to 99 when any GPU backend feature is compiled in (was macOS-only). Runtime backend is detected via ggml_backend_dev_* enumeration and reported in the daemon startup log (e.g. loading models (CUDA)...).
- Nix flake (flake.nix): added flake.nix with packages for all four platforms. Linux default package uses OpenMP/CPU; named outputs #cuda, #rocm, #vulkan enable the respective GPU backends with correct buildInputs. macOS package links Metal/Foundation/Accelerate frameworks automatically. Thanks to @bglgwyngag for the initial implementation (#15).
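The IR_GPU_LAYERS default described above can be sketched roughly as follows. This is a minimal illustration, not the code in src/llm/mod.rs: the function name, the boolean standing in for a compile-time feature check, the CPU-only default of 0, and the fallback on unparseable values are all assumptions.

```rust
/// Default offload depth when a GPU backend is compiled in (value from the release note).
const GPU_DEFAULT_LAYERS: u32 = 99;

/// Resolve how many layers to offload.
/// `env_value` is the raw IR_GPU_LAYERS string, if set.
/// `gpu_backend_compiled` stands in for a compile-time feature check (hypothetical).
fn resolve_gpu_layers(env_value: Option<&str>, gpu_backend_compiled: bool) -> u32 {
    // Assumption: an explicit, parseable value always wins over the default.
    if let Some(n) = env_value.and_then(|v| v.trim().parse::<u32>().ok()) {
        return n;
    }
    // Assumption: CPU-only builds default to 0 offloaded layers.
    if gpu_backend_compiled {
        GPU_DEFAULT_LAYERS
    } else {
        0
    }
}
```

With these assumptions, a GPU build with the variable unset resolves to 99, while setting IR_GPU_LAYERS=12 resolves to 12.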
v0.14.1
Features
- --cors flag for ir mcp --http (src/mcp.rs, src/cli/mod.rs): adds Access-Control-Allow-Origin support for browser-hosted MCP clients. --cors '*' allows any origin and disables rmcp's DNS-rebinding host check; --cors 'https://...' sets an exact-match origin. Omitting --cors leaves existing behavior unchanged. Exposes the mcp-session-id header and allows last-event-id on SSE reconnect.
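The three --cors modes can be pictured with a small origin-matching sketch. The type and function names here are hypothetical and do not come from src/mcp.rs; the sketch only covers the choice of the Access-Control-Allow-Origin value, not the rmcp host check or the exposed headers.

```rust
/// How a hypothetical parser might interpret the --cors value.
#[derive(Debug, PartialEq)]
enum CorsPolicy {
    Disabled,      // flag omitted: existing behavior
    AnyOrigin,     // --cors '*'
    Exact(String), // --cors 'https://...'
}

fn parse_cors(flag: Option<&str>) -> CorsPolicy {
    match flag {
        None => CorsPolicy::Disabled,
        Some("*") => CorsPolicy::AnyOrigin,
        Some(origin) => CorsPolicy::Exact(origin.to_string()),
    }
}

/// Value for Access-Control-Allow-Origin, or None when no header should be sent.
fn allow_origin_header(policy: &CorsPolicy, request_origin: &str) -> Option<String> {
    match policy {
        CorsPolicy::Disabled => None,
        CorsPolicy::AnyOrigin => Some("*".to_string()),
        // Exact-match mode: only the configured origin is echoed back.
        CorsPolicy::Exact(allowed) if allowed.as_str() == request_origin => Some(allowed.clone()),
        CorsPolicy::Exact(_) => None,
    }
}
```

In this sketch a request from an origin other than the configured one simply gets no CORS header, which a browser treats as a refusal.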
Chores
- hf-hub upgraded to 0.5.0 (Cargo.toml): switched from hf-hub 0.3 (native-tls / OpenSSL) to hf-hub 0.5.0 with default-features = false, features = ["ureq"]. Removes OpenSSL and native-tls from the dependency tree; TLS is handled by ureq 3's rustls default. The sync api::sync::ApiBuilder path is unchanged.
v0.14.0
Fixes
- Portable SQLite extension ABI (src/db/mod.rs): sqlite-vec auto-extension init function now uses *mut *mut c_char (from std::os::raw) instead of *mut *mut i8 for the error message pointer, matching the stable C ABI. Previous code was technically UB on targets where c_char is unsigned.
- Preprocessor arg expansion (src/preprocess.rs): PreprocessHandle::spawn now applies expand_path to all command args, not just the binary path. Fixes $IR_DIR/preprocessors/jieba being passed literally to lindera, which caused zh (and any post-v0.12.0 fresh ko/ja install) indexing to fail with "dictionary path does not exist".
Dev / Benchmark Tooling
- GitHub Actions CI/Release: added .github/workflows/ci.yml (build + test on push/PR) and .github/workflows/release.yml (cross-platform binary build, GitHub release, Homebrew tap update on tag push). TAP_TOKEN secret required in repo settings for the Homebrew step.
- Chinese (zh) benchmark infrastructure: scripts/download-miracl-zh.sh downloads the MIRACL-ZH corpus (~132K passages) from HuggingFace; scripts/bench.sh miracl-zh runs the benchmark; scripts/generate-fixtures.sh generates test-data/fixtures/miracl-zh-mini/ (2000-doc sample, seed=42); placeholder expected.json with _uncalibrated flags added. Calibrate with scripts/calibrate-fixtures.sh miracl-zh-mini after download.
- Chinese synthetic fixture (test-data/fixtures/synthetic-zh/): 20-doc Chinese corpus with BM25 fingerprint terms (玄武/朱雀/青龙/白虎/勾陈), semantic synonym pairs, distractors, and edge cases including punctuation-only lines. Exercises the zh preprocessor pipeline end-to-end via scripts/preship.sh --fixture synthetic-zh. Calibrate with scripts/calibrate-fixtures.sh synthetic-zh after ir preprocessor install zh.
- zh integration test (src/preprocess.rs): #[ignore] test zh_preprocessor_tokenizes_chinese verifies jieba segmentation, ASCII pass-through, and punctuation handling. Run with cargo test -- --ignored zh_preprocessor after ir preprocessor install zh.
README.zh.md: Full Simplified Chinese translation of README.md.
- Benchmark isolation preprocessor symlink (scripts/bench-env.sh): bench_env_init now symlinks the live preprocessors/ directory into each isolated bench state dir. Previously, bench scripts using $IR_DIR/preprocessors/... references would fail in isolated state; only old absolute-path configs worked by accident.
v0.13.0
Breaking
- Combined-model default removal (src/daemon.rs, src/llm/combined.rs): dedicated expander + reranker is now the only default tier-2 path. A local Qwen combined GGUF is no longer auto-activated from model search dirs; combined mode is opt-in via IR_COMBINED_MODEL only and is intended for explicit testing or experiments.
Features
- Per-collection routing overrides (config.yml, src/types.rs, src/search/hybrid.rs): collections can now carry optional BM25/fused strong-signal threshold overrides under a routing: block. This gives runtime threshold tuning a stable config surface without changing the global defaults. Overrides apply only when all searched collections agree; mixed searches with conflicting values fall back to the global defaults.

      collections:
        - name: wiki-ko
          path: ~/wiki
          preprocessor: [ko]
          routing:
            fused_strong_product: 0.05

  Fields: fused_strong_floor, fused_strong_product, bm25_strong_floor, bm25_strong_gap.
- Built-in Korean bind default (src/main.rs): binding the built-in ko preprocessor alias now also writes the current Korean fused routing default (routing.fused_strong_product: 0.05) for that collection when no explicit routing override exists yet.
  - Existing ko-bound collections are not auto-migrated. Add the routing: block manually to config.yml or unbind/rebind ko to apply.
  - Rationale: on the sampled Korean holdout miracl-ko-s50000-p42, fused 0.05 outperforms 0.06 on both quality and latency (nDCG@10=0.9650, R@10=0.9813, med=431.5ms vs nDCG@10=0.9603, R@10=0.9766, med=440.4ms).
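The "all searched collections agree" rule for routing overrides could look something like the sketch below. The function name is hypothetical, and one detail is an assumption the release note does not settle: a collection with no override is treated here as disagreeing with one that has an override.

```rust
/// Pick the effective value of one routing field (e.g. fused_strong_product)
/// across the collections taking part in a search.
/// `overrides[i]` is the optional per-collection value from config.yml.
fn effective_threshold(overrides: &[Option<f64>], global_default: f64) -> f64 {
    let mut agreed: Option<f64> = None;
    for o in overrides {
        match (*o, agreed) {
            // Assumption: a collection without an override breaks agreement.
            (None, _) => return global_default,
            (Some(v), None) => agreed = Some(v),
            (Some(v), Some(a)) if v == a => {}
            // Conflicting overrides: fall back to the global default.
            (Some(_), Some(_)) => return global_default,
        }
    }
    // No collections at all also means the global default applies.
    agreed.unwrap_or(global_default)
}
```

Under this reading, searching two collections that both set 0.05 uses 0.05, while mixing 0.05 with 0.07 uses whatever the global default is.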
Improvements
- Cold-search daemon warmup (src/main.rs): ir search now kicks off daemon startup immediately, but a cold first query no longer waits for model download/load if BM25 already found usable results. The daemon continues warming in the background so follow-up queries land on the hotter path.
- Large-corpus embed stall fix (src/index/embed.rs): ir embed no longer loads every pending document body into memory before starting work. Pending docs are now counted once and streamed in small batches by content hash, which makes progress visible immediately and avoids the full-corpus RAM spike that could make large MIRACL-Ko resumes look hung for an hour before the first progress update.
- Indexer progress bar now shows docs/sec and counts from the hash phase (previously only showed apply phase progress).
Dev / Benchmark Tooling
- Pre-ship regression gate (scripts/preship.sh): three-axis regression check (stability, speed, performance) across test fixtures before release. Catches hang/crash/timeout (stability), throughput and latency regressions (speed), and nDCG/Recall regressions (performance). Portable: works with macOS bash 3.2, uses perl alarm fallback when GNU timeout is unavailable.
- Committed test fixtures (test-data/fixtures/synthetic-en/): 20-doc English corpus with discriminative BM25 fingerprint terms, semantic synonym pairs, distractors, and edge cases. Calibrated expected.json with 10% buffer floors. Catches pipeline regressions in ~30s without any model download.
- Korean canary fixture (test-data/fixtures/miracl-ko-mini/expected.json): placeholder for issue #13-class deadlock detection. Populate with scripts/generate-fixtures.sh after downloading miracl-ko.
- Fixture calibration (scripts/calibrate-fixtures.sh): measures actual metrics per mode, writes calibrated baselines to expected.json with 10% buffer floors. Per-mode _uncalibrated flag prevents uncalibrated modes from failing the gate.
- Pool-size variance study (scripts/pool-size-study.sh, scripts/pool-size-aggregate.py): sweeps miracl-ko at multiple corpus sizes × multiple seeds to find the smallest pool with stable nDCG stddev < 0.005. Writes research/pool-size-study.md. Current minimum stable pool: 10000 docs; active research default: 50000 docs because 10000 often saturates ranking metrics. Pools at or below the 503 mandatory qrel-linked docs are treated as deterministic floors, not variance evidence.
- Progress reporting: timestamped _log() in bench.sh and signal-sweep.sh; stall detector in signal-sweep.sh (STALL DETECTED at 120s no-output with issue context); tqdm progress bars in beir-eval.py (materialize, run, signals, sample).
- Benchmark summary output (scripts/bench.sh): prints one row per available mode (bm25, vector, hybrid) instead of collapsing a run to a single best-mode row. Makes baseline capture usable directly from the benchmark command.
- Benchmark resume (scripts/bench.sh, scripts/beir-eval.py run): rerunning the same dataset after a crash now reuses a prepared collection when it is already ready for the requested mode, instead of redoing the entire prepare stage. Query scoring also resumes from per-query sidecar progress when --output is set, so a crash mid-run does not force the query loop to restart from zero. Cache validation distinguishes bm25-only results from full all-mode results.
- Benchmark pipeline pinning (scripts/bench.sh, scripts/bench-env.sh): non-BM25 benchmark runs now force the dedicated expander + reranker path and restart the benchmark daemon before scoring. This prevents local combined-model auto-detect or stale daemon state from silently changing the benchmark pipeline. Benchmark state now also exports IR_CONFIG_DIR, eliminating the deprecated XDG_CONFIG_HOME warning during benchmark runs.
- Sampled benchmarks (scripts/bench.sh --size N --seed N): large BEIR corpora can now be benchmarked directly on a sampled pool without hand-running beir-eval.py sample first. Sampled runs get distinct dataset labels, collections, and result-cache directories such as miracl-ko-s10000-p42, so they do not overwrite full-corpus baselines.
- Research harness (scripts/research-harness.sh, scripts/signal-sweep.sh --sample-only, scripts/threshold-validate.py, research/experiment.md): the benchmark workflow is now consolidated around a maintainer entrypoint with baseline, signals, thresholds, and validate-thresholds subcommands. Korean threshold research now defaults to sampled miracl-ko --size 50000 pools for better metric headroom, while 10000 remains the fast stable floor.
- Benchmark safety watchdog (scripts/bench.sh, scripts/bench-env.sh): macOS benchmark runs now keep Metal enabled but watch system free memory, swapouts, and runaway ir CPU usage during long prepare/run phases. The wrapper aborts unsafe runs before they drag the machine into swap-heavy or CPU-fallback territory. Thresholds are tunable via IR_BENCH_MIN_FREE_PCT, IR_BENCH_MAX_IR_CPU_PCT, IR_BENCH_CPU_STRIKES, or can be disabled with IR_BENCH_GUARD=0.
- Threshold override env vars (src/search/hybrid.rs): strong-signal and BM25 shortcut thresholds can be overridden via env vars during research runs. This keeps the shipped defaults unchanged while the harness validates candidate thresholds against a holdout collection.
- Tier-2 router dataset export (scripts/export-tier2-router-data.py, research/experiment.md): signal sweeps can now be converted directly into smoltrain JSONL for a tiny skip_tier2 vs run_tier2 classifier trained on real benchmark behavior instead of hand labels.
- Tier-2 router smoltrain prep (scripts/prepare-tier2-router-smoltrain.py, research/experiment.md): exported router datasets can now be turned into a self-contained smoltrain workspace with train.jsonl, train_balanced.jsonl, eval.jsonl, taxonomy.yaml, and world.json.
- Router bundle wrapper (scripts/router-data.sh): router bundle prep now has a separate entrypoint for Korean-only, FiQA-only, or mixed training data, so router research stays isolated from the working scripts/research-harness.sh baseline / threshold flow.
- Tier-1 router benchmark path (src/search/hybrid.rs, scripts/beir-eval.py, scripts/router-bench.py): research runs can force hybrid to return tier-1 fused results only, collect tier1.jsonl alongside normal signal files, and benchmark a trained router offline against a holdout without changing the shipped runtime path.
- Research conclusions documented (research/experiment.md, READMEs, benchmark skill): current maintainer guidance explicitly prefers stricter threshold gating before any runtime router integration. FiQA stays on the current fused threshold, Korean threshold research uses sampled miracl-ko --size 50000, and router work remains offline until it clearly beats simple gating on holdout.
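The fixture gate with "10% buffer floors" and a per-mode _uncalibrated flag can be illustrated as below. The actual scripts are shell; this Rust sketch only shows the arithmetic, and it assumes the buffer means floor = 0.9 × measured value. The struct and function names are hypothetical.

```rust
/// Floor derived from a calibrated metric.
/// Assumption: a "10% buffer" means the floor sits 10% below the measured value.
fn buffered_floor(measured: f64) -> f64 {
    measured * 0.9
}

/// Expectation for one mode (bm25, vector, or hybrid) in a hypothetical expected.json.
struct ModeExpectation {
    floor: f64,
    uncalibrated: bool,
}

/// A mode passes when it is flagged uncalibrated or meets its floor.
fn gate_passes(expect: &ModeExpectation, observed: f64) -> bool {
    expect.uncalibrated || observed >= expect.floor
}
```

For example, a mode calibrated at nDCG 0.80 would get a floor near 0.72, so a later run at 0.75 passes and a run at 0.70 fails, while a mode still marked uncalibrated never fails the gate.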
v0.12.0
New Features
- IR_CONFIG_DIR env var: override the config/data directory with ~ and $VAR expansion.
  Safe to use in MCP configs (.mcp.json) synced across machines with different usernames.
  Precedence: IR_CONFIG_DIR > XDG_CONFIG_HOME/ir > ~/.config/ir.
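The precedence order can be sketched as a small resolver. This is an illustration only: the function name is hypothetical, values are passed in rather than read from the environment so the logic is easy to test, empty values are treated as unset (an assumption), and ~/$VAR expansion is left out.

```rust
use std::path::PathBuf;

/// Resolve the config directory following
/// IR_CONFIG_DIR > XDG_CONFIG_HOME/ir > ~/.config/ir.
/// Returns the path and whether the XDG deprecation warning applies.
fn resolve_config_dir(
    ir_config_dir: Option<&str>,
    xdg_config_home: Option<&str>,
    home: &str,
) -> (PathBuf, bool) {
    // Assumption: an empty value counts as unset.
    if let Some(dir) = ir_config_dir.filter(|s| !s.is_empty()) {
        return (PathBuf::from(dir), false);
    }
    if let Some(xdg) = xdg_config_home.filter(|s| !s.is_empty()) {
        // Still honored, but flagged so the caller can print a warning.
        return (PathBuf::from(xdg).join("ir"), true);
    }
    (PathBuf::from(home).join(".config").join("ir"), false)
}
```

A caller would typically feed it std::env::var("IR_CONFIG_DIR").ok().as_deref() and the matching XDG_CONFIG_HOME lookup.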
Deprecations
- XDG_CONFIG_HOME is deprecated as the config dir override for ir. It still works but
  prints a warning. Use IR_CONFIG_DIR instead.
Improvements
- All path env vars (IR_*_MODEL, IR_MODEL_DIRS, IR_CONFIG_DIR) now support ~ and
  $VAR/${VAR} expansion. Previously, ~ in these vars was treated literally and silently
  failed to resolve.
- Collection paths in config.yml now support ~ and $VAR notation. Portable paths are
  preserved on write; expansion happens at load time.
- Preprocessor commands installed via ir preprocessor install are now stored as
  $IR_DIR/preprocessors/... instead of absolute paths, making them portable across machines.
  Existing absolute-path commands continue to work. Re-run ir preprocessor install <lang>
  to migrate to the portable format.
- When BM25 returns no results (semantic query with no keyword overlap), the daemon wait
  timeout is extended from 3s to 10s: the daemon is the only source of results in this
  case, so a slow cold start no longer silently returns nothing. Diagnostic hints are printed
  at all daemon fallback sites (start failure, timeout, query error) to guide follow-up
  (ir embed <collection>, model path check).
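The ~ and $VAR/${VAR} expansion described in this list might be approximated as follows. This is a sketch under stated assumptions, not the project's expand_path: the signature, the injected lookup closure, and the choice to leave unknown variables untouched are all hypothetical.

```rust
/// Expand a leading "~" and any $VAR / ${VAR} references.
/// `lookup` supplies variable values (e.g. a wrapper around std::env::var).
fn expand_path(input: &str, home: &str, lookup: &dyn Fn(&str) -> Option<String>) -> String {
    // Only a leading "~" or "~/" is treated as the home directory.
    let rest: String = if input == "~" {
        home.to_string()
    } else if let Some(tail) = input.strip_prefix("~/") {
        format!("{}/{}", home, tail)
    } else {
        input.to_string()
    };

    let mut out = String::new();
    let mut chars = rest.chars().peekable();
    while let Some(c) = chars.next() {
        if c != '$' {
            out.push(c);
            continue;
        }
        // Remember the literal text so unknown variables can be kept as written.
        let mut literal = String::from("$");
        let braced = chars.peek() == Some(&'{');
        if braced {
            literal.push('{');
            chars.next();
        }
        let mut name = String::new();
        while let Some(&n) = chars.peek() {
            if n.is_ascii_alphanumeric() || n == '_' {
                name.push(n);
                literal.push(n);
                chars.next();
            } else {
                break;
            }
        }
        if braced && chars.peek() == Some(&'}') {
            literal.push('}');
            chars.next();
        }
        match lookup(&name) {
            Some(value) if !name.is_empty() => out.push_str(&value),
            // Assumption: unresolved variables are left untouched.
            _ => out.push_str(&literal),
        }
    }
    out
}
```

Keeping the unexpanded form in config and expanding only at load time, as the notes describe, is what lets the same config.yml work across machines with different home directories.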
v0.11.2
v0.11.1
Bug Fixes
- Preprocessor pipe deadlock on large Korean collections (#13): process_line now uses
  a sentinel-based protocol to handle lines where the official lindera CLI produces no output
  (e.g. punctuation-only lines where all tokens are filtered by --token-filter). Previously,
  read_line() would block forever waiting for output that never arrived, deadlocking both
  processes at 0% CPU. Manifested as a hang at ~60k docs when indexing MIRACL-Ko or other large
  Korean corpora. (2084e4e)
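The release note does not spell out the sentinel protocol, so the following is only one plausible shape of the reader side: after each input line the caller also sends a marker line, then reads output until the marker comes back, so an empty reply no longer blocks. The marker value and function name are hypothetical, and the sketch assumes the marker passes through the tokenizer unchanged and never occurs in real input.

```rust
use std::io::{self, BufRead};

/// Hypothetical marker line echoed back by the subprocess after each reply.
const SENTINEL: &str = "IRSENTINEL";

/// Read output lines until the sentinel appears.
/// An empty Vec means the subprocess produced no output for the input line.
fn read_reply<R: BufRead>(reader: &mut R) -> io::Result<Vec<String>> {
    let mut reply = Vec::new();
    let mut buf = String::new();
    loop {
        buf.clear();
        // read_line returns 0 at EOF: the subprocess went away mid-reply.
        if reader.read_line(&mut buf)? == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "subprocess closed its output before the sentinel",
            ));
        }
        let line = buf.trim_end_matches(|c: char| c == '\n' || c == '\r');
        if line == SENTINEL {
            return Ok(reply);
        }
        reply.push(line.to_string());
    }
}
```

The key property is that every request is guaranteed at least one line of output (the marker), so the blocking read always has something to return.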
v0.11.0
Bug Fixes
- BM25 now uses OR semantics for natural-language queries (>3 terms): stop words are
  stripped and remaining terms are ORed. Short keyword queries (≤3 non-stop terms) keep
  AND semantics. Fixes near-zero recall on question-format queries (e.g.
  ir search --mode bm25 "what are the symptoms of diabetes" previously returned almost
  nothing due to AND forcing all stop words to match).
- Preprocessor subprocess: process_line now skips empty/whitespace-only lines instead of
  sending them to the subprocess. Fixes a deadlock when indexing markdown with blank lines
  using official lindera CLI (which emits no output for empty input when --token-filter is
  active).
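The AND/OR switch for BM25 queries can be sketched as below. This follows one reading of the note: the length check is applied to the raw whitespace-split term count, and stop words are stripped only on the OR path. The stop-word list is a tiny illustrative sample and the names are hypothetical; none of this is taken from the project's source.

```rust
#[derive(Debug, PartialEq)]
enum Bm25Query {
    And(Vec<String>),
    Or(Vec<String>),
}

/// Illustrative stop words only; the project's real list is not shown in the notes.
const STOP_WORDS: &[&str] = &[
    "a", "an", "are", "do", "does", "how", "in", "is", "of", "the", "to", "what", "which", "who",
];

fn plan_bm25_query(query: &str) -> Bm25Query {
    let terms: Vec<String> = query.split_whitespace().map(|t| t.to_lowercase()).collect();
    if terms.len() > 3 {
        // Natural-language query: drop stop words, OR whatever remains.
        let kept: Vec<String> = terms
            .iter()
            .filter(|t| !STOP_WORDS.contains(&t.as_str()))
            .cloned()
            .collect();
        // Assumption: if every term was a stop word, keep the original terms.
        Bm25Query::Or(if kept.is_empty() { terms } else { kept })
    } else {
        // Short keyword query: all terms must match.
        Bm25Query::And(terms)
    }
}
```

With this reading, the diabetes question from the note reduces to an OR over "symptoms" and "diabetes", while a three-word keyword query still requires every term.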
Breaking
- ir preprocessor install ko/ja/zh now uses official lindera releases. Previously ir
  bundled its own preprocessor binaries in GitHub release artifacts (unreliable: artifacts
  were occasionally missing). Starting v0.11.0, ir preprocessor install downloads the
  official lindera CLI binary and per-language dictionary directly from lindera's GitHub
  releases. Chinese (zh) switches from a custom bigram tokenizer to lindera + jieba.

  Migration required if you used a preprocessor before v0.11.0:

      ir preprocessor install ko   # (or ja / zh)
      ir update <collection> --force

  Existing config entries pointing to old bundled binaries are stale. Search silently degrades
  (BM25 without tokenization) until reinstalled.
v0.10.0
Features
- -f/--filter "FIELD OP VALUE" on ir search: general structured filter supporting built-in fields (path, modified_at, created_at) and YAML frontmatter fields (meta.<name>). Operators: =, !=, >, >=, <, <=, ~ (contains), !~ (not-contains). Multiple -f flags are ANDed. Date values are normalized to UTC RFC3339. Applied at all three search pipeline tiers so each exit point returns correctly filtered results.
- MCP search tool gains a structured filter array ([{field, op, value}]) with full JSON schema; LLM clients see typed enum choices for operators.
- Frontmatter metadata indexed into a new document_metadata table at ir update time; supports all scalar values, tag arrays (one row per element), and nested keys.
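A parser for the "FIELD OP VALUE" filter syntax could be sketched like this. The types and function are hypothetical and only cover splitting an expression into field, operator, and value; date normalization and field validation are out of scope.

```rust
#[derive(Debug, PartialEq, Clone, Copy)]
enum Op {
    Eq,
    Ne,
    Gt,
    Ge,
    Lt,
    Le,
    Contains,
    NotContains,
}

#[derive(Debug, PartialEq)]
struct Filter {
    field: String,
    op: Op,
    value: String,
}

fn parse_filter(expr: &str) -> Result<Filter, String> {
    // Two-character operators come first so ">=" is not read as ">".
    const OPS: [(&str, Op); 8] = [
        ("!=", Op::Ne),
        ("!~", Op::NotContains),
        (">=", Op::Ge),
        ("<=", Op::Le),
        ("=", Op::Eq),
        (">", Op::Gt),
        ("<", Op::Lt),
        ("~", Op::Contains),
    ];
    // The operator starts at the first operator character in the expression.
    let start = expr
        .find(|c: char| "=!<>~".contains(c))
        .ok_or_else(|| format!("no operator in filter: {}", expr))?;
    let (field, rest) = expr.split_at(start);
    for &(token, op) in OPS.iter() {
        if let Some(value) = rest.strip_prefix(token) {
            let (field, value) = (field.trim(), value.trim());
            if field.is_empty() || value.is_empty() {
                return Err(format!("filter needs a field and a value: {}", expr));
            }
            return Ok(Filter { field: field.to_string(), op, value: value.to_string() });
        }
    }
    Err(format!("unknown operator in filter: {}", expr))
}
```

Because whitespace around the operator is trimmed, both "modified_at >= 2024-01-01" and "modified_at>=2024-01-01" parse to the same filter in this sketch.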
Bug Fixes
- Daemon tier-2: reranker without expander now correctly reranks tier-1 fused results (reranking is useful without expansion; expansion alone is harmful, -0.53% nDCG on NFCorpus).
- Daemon tier-2: IR_COMBINED_MODEL load failure now falls back to dedicated models with an explicit warning instead of silently disabling tier-2.
- Daemon tier-2: conflict between IR_COMBINED_MODEL and dedicated model env vars now warns before loading (combined wins).
- Daemon tier-2: QMD_EXPANDER_MODEL / QMD_RERANKER_MODEL legacy aliases now correctly trigger dedicated mode instead of falling through to auto-detect.
Breaking
- --modified-after / --modified-before CLI flags removed (were unreleased). Use -f "modified_at>=YYYY-MM-DD" and -f "modified_at<=YYYY-MM-DD".
- MCP SearchInput.modified_after / modified_before fields removed. Use filter: [{field: "modified_at", op: ">=", value: "YYYY-MM-DD"}].
- Collection DBs upgrade to schema version 2 on first use. A one-time backfill populates document_metadata from existing frontmatter (sub-second for <10k docs). No manual migration required.
v0.9.0
[0.9.0] - 2026-04-15
Features
- ir search --quiet / -q: suppress all stderr output (progress indicators, daemon log lines). Useful for scripting. Conflicts with --verbose.
- IR_COMBINED_MODEL: new canonical env var for the unified Qwen3.5 GGUF (replaces both expander + reranker roles). IR_QWEN_MODEL still accepted but emits a deprecation warning on load.
- IR_*_MODEL env vars now accept HuggingFace repo IDs (owner/name) in addition to local file and directory paths. Setting e.g. IR_EMBEDDING_MODEL=ggml-org/bge-m3-Q8_0-GGUF downloads and caches the model automatically on first use.
- BGE-M3 added to the auto-download registry (ggml-org/bge-m3-Q8_0-GGUF). Download progress shown in foreground terminal; daemon loads from cache instantly.
- Download UX improved: structured message before HF progress bar shows model name, size hint, source URL, cache location, and offline tip.
- Download errors now include actionable fixes: retry, HF_HUB_OFFLINE=1, manual download URL, cache path to clear on corruption.
Breaking
- Unrecognized IR_*_MODEL values (not a file, directory, or known HF repo ID) now error immediately instead of silently falling through to the default model. Users with leftover garbage env vars will see an error with an "Accepted forms" list. Unset the env var to restore default behavior.
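The fail-fast handling of IR_*_MODEL values can be pictured with the sketch below. Names and the error wording are hypothetical. One simplification is an assumption: the notes speak of a "known" HF repo ID, whereas this sketch only checks the owner/name shape and does not consult any registry.

```rust
use std::path::Path;

#[derive(Debug, PartialEq)]
enum ModelSource {
    File(String),
    Dir(String),
    HfRepo(String),
}

/// Shape check only: exactly "owner/name" with a conservative character set.
fn looks_like_repo_id(v: &str) -> bool {
    let mut parts = v.split('/');
    match (parts.next(), parts.next(), parts.next()) {
        (Some(owner), Some(name), None) => {
            let ok = |s: &str| {
                !s.is_empty()
                    && s.chars().all(|c| c.is_ascii_alphanumeric() || "-_.".contains(c))
            };
            ok(owner) && ok(name)
        }
        _ => false,
    }
}

/// Classify a model env var value, erroring instead of falling back to a default.
fn classify_model_value(v: &str) -> Result<ModelSource, String> {
    let p = Path::new(v);
    // Existing local paths take priority over the repo-ID interpretation.
    if p.is_file() {
        return Ok(ModelSource::File(v.to_string()));
    }
    if p.is_dir() {
        return Ok(ModelSource::Dir(v.to_string()));
    }
    if looks_like_repo_id(v) {
        return Ok(ModelSource::HfRepo(v.to_string()));
    }
    Err(format!(
        "unrecognized model value '{}'. Accepted forms: file path, directory path, HuggingFace repo ID (owner/name)",
        v
    ))
}
```

Returning an error for anything else is what turns a stale or mistyped variable into an immediate, visible failure rather than a silent switch to the default model.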
Full changelog: https://github.com/vlwkaos/ir/blob/main/CHANGELOG.md