Releases · vlwkaos/ir

v0.15.0

Features

Linux GPU backends (Cargo.toml, src/llm/mod.rs): added llama-cuda, llama-rocm, and llama-vulkan Cargo features. Build with --features llama-cuda (NVIDIA), llama-rocm (AMD), or llama-vulkan (cross-platform) to enable GPU acceleration on Linux. IR_GPU_LAYERS now defaults to 99 when any GPU backend feature is compiled in (was macOS-only). Runtime backend is detected via ggml_backend_dev_* enumeration and reported in daemon startup log (e.g. loading models (CUDA)...).
Nix flake (flake.nix): added flake.nix with packages for all four platforms. Linux default package uses OpenMP/CPU; named outputs #cuda, #rocm, #vulkan enable the respective GPU backends with correct buildInputs. macOS package links Metal/Foundation/Accelerate frameworks automatically. Thanks to @bglgwyngag for the initial implementation (#15).

v0.14.1

Features

--cors flag for ir mcp --http (src/mcp.rs, src/cli/mod.rs): adds Access-Control-Allow-Origin support for browser-hosted MCP clients. --cors '*' allows any origin and disables rmcp's DNS-rebinding host check; --cors 'https://...' sets an exact-match origin. Omitting --cors leaves existing behavior unchanged. Exposes mcp-session-id header and allows last-event-id on SSE reconnect.
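
The origin handling described above can be sketched as a small decision function. This is a minimal illustration, not the code in src/mcp.rs: the `CorsPolicy` type and both function names are hypothetical, and returning a literal `*` for the wildcard case (rather than echoing the request origin) is an assumption.

```rust
/// Hypothetical model of the `--cors` argument; not the types used in src/mcp.rs.
#[derive(Debug, Clone, PartialEq)]
pub enum CorsPolicy {
    /// `--cors` omitted: no CORS headers are added.
    Disabled,
    /// `--cors '*'`: any origin is allowed.
    AnyOrigin,
    /// `--cors 'https://...'`: only this exact origin is allowed.
    Exact(String),
}

/// Turn the optional flag value into a policy.
pub fn parse_cors(flag: Option<&str>) -> CorsPolicy {
    match flag {
        None => CorsPolicy::Disabled,
        Some("*") => CorsPolicy::AnyOrigin,
        Some(origin) => CorsPolicy::Exact(origin.to_string()),
    }
}

/// Value for the Access-Control-Allow-Origin response header,
/// or None when the header should be left out for this request.
pub fn allow_origin(policy: &CorsPolicy, request_origin: Option<&str>) -> Option<String> {
    match (policy, request_origin) {
        (CorsPolicy::Disabled, _) => None,
        // Assumption: the wildcard policy answers with a literal "*".
        (CorsPolicy::AnyOrigin, _) => Some("*".to_string()),
        // Exact match only: no prefix, suffix, or case-insensitive matching.
        (CorsPolicy::Exact(allowed), Some(origin)) if allowed.as_str() == origin => {
            Some(allowed.clone())
        }
        (CorsPolicy::Exact(_), _) => None,
    }
}
```

With an exact-match policy, a request from any other origin gets no Access-Control-Allow-Origin header, so the browser blocks the response.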

Chores

hf-hub upgraded to 0.5.0 (Cargo.toml): switched from hf-hub 0.3 (native-tls / OpenSSL) to hf-hub 0.5.0 with default-features = false, features = ["ureq"]. Removes OpenSSL and native-tls from the dependency tree; TLS handled by ureq 3's rustls default. The sync api::sync::ApiBuilder path is unchanged.

v0.14.0

Fixes

Portable SQLite extension ABI (src/db/mod.rs): sqlite-vec auto-extension init function now uses *mut *mut c_char (from std::os::raw) instead of *mut *mut i8 for the error message pointer, matching the stable C ABI. Previous code was technically UB on targets where c_char is unsigned.
Preprocessor arg expansion (src/preprocess.rs): PreprocessHandle::spawn now applies expand_path to all command args, not just the binary path. Previously $IR_DIR/preprocessors/jieba was passed literally to lindera, which caused zh indexing (and indexing on any fresh ko/ja install made after v0.12.0) to fail with "dictionary path does not exist".
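
For the SQLite extension ABI fix above, the portable shape of an extension entry point looks roughly like the sketch below. The opaque `Sqlite3` and `Sqlite3ApiRoutines` types and the `dummy_init` function are placeholders for illustration; the real code in src/db/mod.rs registers sqlite-vec's own init function.

```rust
use std::os::raw::{c_char, c_int};

/// Opaque stand-ins for the SQLite C types (hypothetical names for this sketch).
#[repr(C)]
pub struct Sqlite3 {
    _private: [u8; 0],
}

#[repr(C)]
pub struct Sqlite3ApiRoutines {
    _private: [u8; 0],
}

/// Entry-point type whose error-message parameter is `char **` on the C side.
/// `c_char` is i8 on some targets and u8 on others, so spelling it as
/// `c_char` keeps the Rust signature aligned with C everywhere.
pub type ExtensionInit = unsafe extern "C" fn(
    db: *mut Sqlite3,
    pz_err_msg: *mut *mut c_char,
    p_api: *const Sqlite3ApiRoutines,
) -> c_int;

/// Dummy init function that reports success (0 is SQLITE_OK)
/// without touching any of its arguments.
pub unsafe extern "C" fn dummy_init(
    _db: *mut Sqlite3,
    _pz_err_msg: *mut *mut c_char,
    _p_api: *const Sqlite3ApiRoutines,
) -> c_int {
    0
}

/// Coerce the function to the entry-point type and call it with null pointers.
pub fn call_dummy() -> c_int {
    let init: ExtensionInit = dummy_init;
    unsafe { init(std::ptr::null_mut(), std::ptr::null_mut(), std::ptr::null()) }
}
```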

Dev / Benchmark Tooling

GitHub Actions CI/Release: added .github/workflows/ci.yml (build + test on push/PR) and .github/workflows/release.yml (cross-platform binary build, GitHub release, Homebrew tap update on tag push). TAP_TOKEN secret required in repo settings for the Homebrew step.
Chinese (zh) benchmark infrastructure: scripts/download-miracl-zh.sh downloads the MIRACL-ZH corpus (~132K passages) from HuggingFace; scripts/bench.sh miracl-zh runs the benchmark; scripts/generate-fixtures.sh generates test-data/fixtures/miracl-zh-mini/ (2000-doc sample, seed=42); placeholder expected.json with _uncalibrated flags added. Calibrate with scripts/calibrate-fixtures.sh miracl-zh-mini after download.
Chinese synthetic fixture (test-data/fixtures/synthetic-zh/): 20-doc Chinese corpus with BM25 fingerprint terms (玄武/朱雀/青龙/白虎/勾陈), semantic synonym pairs, distractors, and edge cases including punctuation-only lines. Exercises zh preprocessor pipeline end-to-end via scripts/preship.sh --fixture synthetic-zh. Calibrate with scripts/calibrate-fixtures.sh synthetic-zh after ir preprocessor install zh.
zh integration test (src/preprocess.rs): #[ignore] test zh_preprocessor_tokenizes_chinese verifies jieba segmentation, ASCII pass-through, and punctuation handling. Run with cargo test -- --ignored zh_preprocessor after ir preprocessor install zh.
README.zh.md: Full Simplified Chinese translation of README.md.
Benchmark isolation preprocessor symlink (scripts/bench-env.sh): bench_env_init now symlinks the live preprocessors/ directory into each isolated bench state dir. Previously, bench scripts using $IR_DIR/preprocessors/... references would fail in isolated state; only old absolute-path configs worked by accident.

v0.13.0

Breaking

Combined-model default removal (src/daemon.rs, src/llm/combined.rs): dedicated expander + reranker is now the only default tier-2 path. A local Qwen combined GGUF is no longer auto-activated from model search dirs; combined mode is opt-in via IR_COMBINED_MODEL only and is intended for explicit testing or experiments.

Features

Per-collection routing overrides (config.yml, src/types.rs, src/search/hybrid.rs): collections can now carry optional BM25/fused strong-signal threshold overrides under a routing: block. This gives runtime threshold tuning a stable config surface without changing the global defaults. Overrides apply only when all searched collections agree; mixed searches with conflicting values fall back to the global defaults.
```
collections:
  - name: wiki-ko
    path: ~/wiki
    preprocessor: [ko]
    routing:
      fused_strong_product: 0.05
```
Fields: fused_strong_floor, fused_strong_product, bm25_strong_floor, bm25_strong_gap.
Built-in Korean bind default (src/main.rs): binding the built-in ko preprocessor alias now also writes the current Korean fused routing default (routing.fused_strong_product: 0.05) for that collection when no explicit routing override exists yet.
- Existing ko-bound collections are not auto-migrated. Add the routing: block manually to config.yml or unbind/rebind ko to apply.
- Rationale: on the sampled Korean holdout miracl-ko-s50000-p42, fused 0.05 outperforms 0.06 on both quality and latency (nDCG@10=0.9650, R@10=0.9813, med=431.5ms vs nDCG@10=0.9603, R@10=0.9766, med=440.4ms).
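
The "all searched collections agree" rule for routing overrides can be sketched per field as follows. The function name is hypothetical, and treating a collection without an override as disagreeing with one that has it is an assumption; the notes only say that conflicting values fall back to the global defaults.

```rust
/// Pick the threshold for one routing field (for example fused_strong_product).
/// `overrides` holds that field's optional override for each searched collection.
pub fn effective_threshold(global_default: f64, overrides: &[Option<f64>]) -> f64 {
    match overrides.split_first() {
        // Every collection carries the same override value: use it.
        Some((Some(first), rest)) if rest.iter().all(|o| *o == Some(*first)) => *first,
        // No collections, no override, or any disagreement: keep the global default.
        _ => global_default,
    }
}
```

For example, with an arbitrary global default of 0.06, searching two collections that both set 0.05 yields 0.05, while searching one collection with 0.05 and one without any override yields 0.06.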

Improvements

Cold-search daemon warmup (src/main.rs): ir search now kicks off daemon startup immediately, but a cold first query no longer waits for model download/load if BM25 already found usable results. The daemon continues warming in the background so follow-up queries land on the hotter path.
Large-corpus embed stall fix (src/index/embed.rs): ir embed no longer loads every pending document body into memory before starting work. Pending docs are now counted once and streamed in small batches by content hash, which makes progress visible immediately and avoids the full-corpus RAM spike that could make large MIRACL-Ko resumes look hung for an hour before the first progress update.
Indexer progress bar now shows docs/sec and counts from the hash phase (previously only showed apply phase progress).

Dev / Benchmark Tooling

Pre-ship regression gate (scripts/preship.sh): three-axis regression check (stability, speed, performance) across test fixtures before release. Catches hang/crash/timeout (stability), throughput and latency regressions (speed), and nDCG/Recall regressions (performance). Portable: works with macOS bash 3.2, uses perl alarm fallback when GNU timeout is unavailable.
Committed test fixtures (test-data/fixtures/synthetic-en/): 20-doc English corpus with discriminative BM25 fingerprint terms, semantic synonym pairs, distractors, and edge cases. Calibrated expected.json with 10% buffer floors. Catches pipeline regressions in ~30s without any model download.
Korean canary fixture (test-data/fixtures/miracl-ko-mini/expected.json): placeholder for issue #13-class deadlock detection. Populate with scripts/generate-fixtures.sh after downloading miracl-ko.
Fixture calibration (scripts/calibrate-fixtures.sh): measures actual metrics per mode, writes calibrated baselines to expected.json with 10% buffer floors. Per-mode _uncalibrated flag prevents uncalibrated modes from failing the gate.
Pool-size variance study (scripts/pool-size-study.sh, scripts/pool-size-aggregate.py): sweeps miracl-ko at multiple corpus sizes × multiple seeds to find the smallest pool with stable nDCG stddev < 0.005. Writes research/pool-size-study.md. Current minimum stable pool: 10000 docs; active research default: 50000 docs because 10000 often saturates ranking metrics. Pools at or below the 503 mandatory qrel-linked docs are treated as deterministic floors, not variance evidence.
Progress reporting: timestamped _log() in bench.sh and signal-sweep.sh; stall detector in signal-sweep.sh (STALL DETECTED at 120s no-output with issue context); tqdm progress bars in beir-eval.py (materialize, run, signals, sample).
Benchmark summary output (scripts/bench.sh): prints one row per available mode (bm25, vector, hybrid) instead of collapsing a run to a single best-mode row. Makes baseline capture usable directly from the benchmark command.
Benchmark resume (scripts/bench.sh, scripts/beir-eval.py run): rerunning the same dataset after a crash now reuses a prepared collection when it is already ready for the requested mode, instead of redoing the entire prepare stage. Query scoring also resumes from per-query sidecar progress when --output is set, so a crash mid-run does not force the query loop to restart from zero. Cache validation distinguishes bm25-only results from full all-mode results.
Benchmark pipeline pinning (scripts/bench.sh, scripts/bench-env.sh): non-BM25 benchmark runs now force the dedicated expander + reranker path and restart the benchmark daemon before scoring. This prevents local combined-model auto-detect or stale daemon state from silently changing the benchmark pipeline. Benchmark state now also exports IR_CONFIG_DIR, eliminating the deprecated XDG_CONFIG_HOME warning during benchmark runs.
Sampled benchmarks (scripts/bench.sh --size N --seed N): large BEIR corpora can now be benchmarked directly on a sampled pool without hand-running beir-eval.py sample first. Sampled runs get distinct dataset labels, collections, and result-cache directories such as miracl-ko-s10000-p42, so they do not overwrite full-corpus baselines.
Research harness (scripts/research-harness.sh, scripts/signal-sweep.sh --sample-only, scripts/threshold-validate.py, research/experiment.md): the benchmark workflow is now consolidated around a maintainer entrypoint with baseline, signals, thresholds, and validate-thresholds subcommands. Korean threshold research now defaults to sampled miracl-ko --size 50000 pools for better metric headroom, while 10000 remains the fast stable floor.
Benchmark safety watchdog (scripts/bench.sh, scripts/bench-env.sh): macOS benchmark runs now keep Metal enabled but watch system free memory, swapouts, and runaway ir CPU usage during long prepare / run phases. The wrapper aborts unsafe runs before they drag the machine into swap-heavy or CPU-fallback territory. Thresholds are tunable via IR_BENCH_MIN_FREE_PCT, IR_BENCH_MAX_IR_CPU_PCT, IR_BENCH_CPU_STRIKES, or can be disabled with IR_BENCH_GUARD=0.
Threshold override env vars (src/search/hybrid.rs): strong-signal and BM25 shortcut thresholds can be overridden via env vars during research runs. This keeps the shipped defaults unchanged while the harness validates candidate thresholds against a holdout collection.
Tier-2 router dataset export (scripts/export-tier2-router-data.py, research/experiment.md): signal sweeps can now be converted directly into smoltrain JSONL for a tiny skip_tier2 vs run_tier2 classifier trained on real benchmark behavior instead of hand labels.
Tier-2 router smoltrain prep (scripts/prepare-tier2-router-smoltrain.py, research/experiment.md): exported router datasets can now be turned into a self-contained smoltrain workspace with train.jsonl, train_balanced.jsonl, eval.jsonl, taxonomy.yaml, and world.json.
Router bundle wrapper (scripts/router-data.sh): router bundle prep now has a separate entrypoint for Korean-only, FiQA-only, or mixed training data, so router research stays isolated from the working scripts/research-harness.sh baseline / threshold flow.
Tier-1 router benchmark path (src/search/hybrid.rs, scripts/beir-eval.py, scripts/router-bench.py): research runs can force hybrid to return tier-1 fused results only, collect tier1.jsonl alongside normal signal files, and benchmark a trained router offline against a holdout without changing the shipped runtime path.
Research conclusions documented (research/experiment.md, READMEs, benchmark skill): current maintainer guidance explicitly prefers stricter threshold gating before any runtime router integration. FiQA stays on the current fused threshold, Korean threshold research uses sampled miracl-ko --size 50000, and router work remains offline until it clearly beats simple gating on holdout.
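
The stability criterion used by the pool-size variance study above (nDCG stddev < 0.005 across seeds) can be sketched as below. The study itself is implemented in scripts/pool-size-aggregate.py; this Rust version is only an illustration, and the use of the sample standard deviation (n - 1 denominator) is an assumption, since the notes do not say which estimator is used.

```rust
/// Sample standard deviation (n - 1 denominator); None for fewer than two values.
pub fn sample_stddev(values: &[f64]) -> Option<f64> {
    if values.len() < 2 {
        return None;
    }
    let n = values.len() as f64;
    let mean = values.iter().sum::<f64>() / n;
    let variance = values.iter().map(|v| (v - mean).powi(2)).sum::<f64>() / (n - 1.0);
    Some(variance.sqrt())
}

/// True when nDCG measured across seeds varies by less than `max_stddev`.
/// A single seed gives no variance evidence, so it is never reported as stable.
pub fn is_stable(ndcg_per_seed: &[f64], max_stddev: f64) -> bool {
    matches!(sample_stddev(ndcg_per_seed), Some(sd) if sd < max_stddev)
}
```

The smallest pool size whose per-seed nDCG values pass `is_stable(values, 0.005)` would then be the "minimum stable pool" in the sense used above.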

v0.12.0

New Features

IR_CONFIG_DIR env var: override the config/data directory with ~ and $VAR expansion.
Safe to use in MCP configs (.mcp.json) synced across machines with different usernames.
Precedence: IR_CONFIG_DIR > XDG_CONFIG_HOME/ir > ~/.config/ir.
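
The precedence rule can be sketched as a pure function over the three inputs. The function name is hypothetical, the inputs are assumed to be already expanded, and treating an empty value as unset is an assumption of this sketch rather than documented behavior.

```rust
use std::path::PathBuf;

/// Resolve the config directory: IR_CONFIG_DIR > XDG_CONFIG_HOME/ir > ~/.config/ir.
pub fn config_dir(
    ir_config_dir: Option<&str>,
    xdg_config_home: Option<&str>,
    home: &str,
) -> PathBuf {
    // IR_CONFIG_DIR is used as-is; it already names the ir directory.
    if let Some(dir) = ir_config_dir.filter(|s| !s.is_empty()) {
        return PathBuf::from(dir);
    }
    // XDG_CONFIG_HOME names the parent directory, so "ir" is appended.
    if let Some(xdg) = xdg_config_home.filter(|s| !s.is_empty()) {
        return PathBuf::from(xdg).join("ir");
    }
    PathBuf::from(home).join(".config").join("ir")
}
```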

Deprecations

XDG_CONFIG_HOME is deprecated as the config dir override for ir. It still works but
prints a warning. Use IR_CONFIG_DIR instead.

Improvements

All path env vars (IR_*_MODEL, IR_MODEL_DIRS, IR_CONFIG_DIR) now support ~ and
$VAR/${VAR} expansion. Previously, ~ in these vars was treated literally and silently
failed to resolve.
Collection paths in config.yml now support ~ and $VAR notation. Portable paths are
preserved on write; expansion happens at load time.
Preprocessor commands installed via ir preprocessor install are now stored as
$IR_DIR/preprocessors/... instead of absolute paths, making them portable across machines.
Existing absolute-path commands continue to work. Re-run ir preprocessor install <lang>
to migrate to the portable format.
When BM25 returns no results (semantic query with no keyword overlap), the daemon wait
timeout is extended from 3s to 10s — the daemon is the only source of results in this
case, so a slow cold start no longer silently returns nothing. Diagnostic hints are printed
at all daemon fallback sites (start failure, timeout, query error) to guide follow-up
(ir embed <collection>, model path check).
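
The ~ and $VAR/${VAR} expansion described in this section can be approximated by the sketch below. It is a stand-in, not ir's expand_path: the function name and the `lookup` callback are hypothetical, `~user` forms are not handled, and leaving unresolved variables untouched is an assumption.

```rust
/// Expand a leading `~` and any `$VAR` / `${VAR}` references.
/// `lookup` resolves a variable name; unresolved references are copied unchanged.
pub fn expand_path_sketch(
    input: &str,
    home: &str,
    lookup: &dyn Fn(&str) -> Option<String>,
) -> String {
    // Only a bare "~" or a leading "~/" refers to the home directory.
    let tilde_expanded = if input == "~" {
        home.to_string()
    } else if let Some(rest) = input.strip_prefix("~/") {
        format!("{home}/{rest}")
    } else {
        input.to_string()
    };

    let chars: Vec<char> = tilde_expanded.chars().collect();
    let mut out = String::new();
    let mut i = 0;
    while i < chars.len() {
        if chars[i] != '$' {
            out.push(chars[i]);
            i += 1;
            continue;
        }
        // Work out the variable name and where the reference ends.
        let braced = chars.get(i + 1) == Some(&'{');
        let start = if braced { i + 2 } else { i + 1 };
        let mut end = start;
        while end < chars.len() && (chars[end].is_ascii_alphanumeric() || chars[end] == '_') {
            end += 1;
        }
        let closed = !braced || chars.get(end) == Some(&'}');
        let name: String = chars[start..end].iter().collect();
        let next = if braced && closed { end + 1 } else { end };
        match lookup(&name) {
            Some(value) if !name.is_empty() && closed => out.push_str(&value),
            // Not a resolvable reference: keep the original text.
            _ => out.extend(&chars[i..next]),
        }
        i = next;
    }
    out
}
```

Passing the lookup as a callback keeps the sketch independent of the process environment; a caller could supply `&|name| std::env::var(name).ok()` to read real environment variables.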

v0.11.2

Bug Fixes

Preprocessor install: pin lindera download to v3.0.5 instead of resolving /releases/latest
at install time. Prevents silent breakage if lindera ships a major version with changed CLI
flags or tokenizer output format. (2cdbd78)

v0.11.1

Bug Fixes

Preprocessor pipe deadlock on large Korean collections (#13): process_line now uses
a sentinel-based protocol to handle lines where the official lindera CLI produces no output
(e.g. punctuation-only lines where all tokens are filtered by --token-filter). Previously,
read_line() would block forever waiting for output that never arrived, deadlocking both
processes at 0% CPU. Manifested as a hang at ~60k docs when indexing MIRACL-Ko or other large
Korean corpora. (2084e4e)
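
One way a sentinel-based protocol can avoid this kind of deadlock is sketched below. The notes do not describe ir's actual protocol, so everything here is an assumption: the sentinel value, the function name, and the premise that the subprocess echoes the sentinel line back unchanged. The idea is that each input line is followed by a marker line that always produces output, so the reader knows when the real line's output (possibly none) has ended.

```rust
use std::io::{self, BufRead, Write};

/// Hypothetical sentinel; assumed to pass through the tokenizer unchanged.
pub const SENTINEL: &str = "IRSENTINELTOKEN";

/// Send one line to the subprocess and collect its output up to the sentinel.
/// Returns an empty string when the subprocess emitted nothing for the line.
pub fn process_line_sketch<W: Write, R: BufRead>(
    child_stdin: &mut W,
    child_stdout: &mut R,
    line: &str,
) -> io::Result<String> {
    // Write the real line, then the sentinel, and flush so the child sees both.
    writeln!(child_stdin, "{}", line)?;
    writeln!(child_stdin, "{}", SENTINEL)?;
    child_stdin.flush()?;

    let mut collected: Vec<String> = Vec::new();
    let mut buf = String::new();
    loop {
        buf.clear();
        if child_stdout.read_line(&mut buf)? == 0 {
            return Err(io::Error::new(
                io::ErrorKind::UnexpectedEof,
                "preprocessor closed its output",
            ));
        }
        let out = buf.trim_end();
        if out == SENTINEL {
            // Everything before the sentinel belongs to the real line.
            return Ok(collected.join(" "));
        }
        collected.push(out.to_string());
    }
}
```

Because the reader always has a sentinel line to wait for, a punctuation-only input that yields no tokens returns an empty result instead of blocking on a read that never completes.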

v0.11.0

Bug Fixes

BM25 now uses OR semantics for natural-language queries (>3 terms): stop words are
stripped and remaining terms are ORed. Short keyword queries (≤3 non-stop terms) keep
AND semantics. Fixes near-zero recall on question-format queries (e.g. ir search --mode bm25 "what are the symptoms of diabetes" previously returned almost nothing due to AND forcing
all stop words to match).
Preprocessor subprocess: process_line now skips empty/whitespace-only lines instead of
sending them to the subprocess. Fixes a deadlock when indexing markdown with blank lines
using official lindera CLI (which emits no output for empty input when --token-filter is active).
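
The AND/OR switch for BM25 queries can be illustrated with the sketch below. Several details are assumptions: the stop-word list is a tiny made-up sample, the length threshold is applied to the raw whitespace-separated term count, and the ` AND ` / ` OR ` joiners stand in for whatever query syntax the index actually uses.

```rust
/// Tiny illustrative stop-word list; the list ir actually uses is not given here.
const STOP_WORDS: &[&str] = &["a", "an", "the", "of", "is", "are", "what", "how", "to", "in"];

/// Build a boolean query string from free text.
pub fn build_match_query(query: &str) -> String {
    let terms: Vec<String> = query
        .split_whitespace()
        .map(|t| t.to_lowercase())
        .collect();

    if terms.len() > 3 {
        // Natural-language query: drop stop words and OR the remaining terms.
        let kept: Vec<String> = terms
            .iter()
            .filter(|t| !STOP_WORDS.contains(&t.as_str()))
            .cloned()
            .collect();
        // If every term was a stop word, fall back to the raw terms.
        let chosen = if kept.is_empty() { terms } else { kept };
        chosen.join(" OR ")
    } else {
        // Short keyword query: every term must match.
        terms.join(" AND ")
    }
}
```

Under these assumptions, "what are the symptoms of diabetes" becomes `symptoms OR diabetes`, while "rust async runtime" stays `rust AND async AND runtime`.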

Breaking

ir preprocessor install ko/ja/zh now uses official lindera releases. Previously ir
bundled its own preprocessor binaries in GitHub release artifacts (unreliable — artifacts
were occasionally missing). Starting v0.11.0, ir preprocessor install downloads the
official lindera CLI binary and per-language dictionary directly from lindera's GitHub
releases. Chinese (zh) switches from a custom bigram tokenizer to lindera + jieba.

Migration required if you used a preprocessor before v0.11.0:
```
ir preprocessor install ko   # (or ja / zh)
ir update <collection> --force
```
Existing config entries pointing to old bundled binaries are stale. Search silently degrades
(BM25 without tokenization) until reinstalled.

v0.10.0

Features

-f/--filter "FIELD OP VALUE" on ir search: general structured filter supporting built-in fields (path, modified_at, created_at) and YAML frontmatter fields (meta.<name>). Operators: =, !=, >, >=, <, <=, ~ (contains), !~ (not-contains). Multiple -f flags are ANDed. Date values are normalized to UTC RFC3339. Applied at all three search pipeline tiers so each exit point returns correctly filtered results.
MCP search tool gains a structured filter array ([{field, op, value}]) with full JSON schema — LLM clients see typed enum choices for operators.
Frontmatter metadata indexed into a new document_metadata table at ir update time; supports all scalar values, tag arrays (one row per element), and nested keys.
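
Splitting a -f "FIELD OP VALUE" expression into its three parts can be sketched as follows. The `Filter` struct and `parse_filter` function are hypothetical and do not mirror ir's internal types; the sketch covers only the split and does no field validation or date normalization.

```rust
/// One parsed filter expression (hypothetical shape for this sketch).
#[derive(Debug, PartialEq)]
pub struct Filter {
    pub field: String,
    pub op: String,
    pub value: String,
}

/// Supported operators, with two-character forms listed before the
/// one-character operators they contain.
const OPS: &[&str] = &["!=", ">=", "<=", "!~", "=", ">", "<", "~"];

/// Parse "FIELD OP VALUE", with or without spaces around the operator.
pub fn parse_filter(expr: &str) -> Option<Filter> {
    // Take the earliest operator occurrence; when two operators start at the
    // same position, min_by_key keeps the first one, which is the longer form.
    let (pos, op) = OPS
        .iter()
        .filter_map(|op| expr.find(*op).map(|pos| (pos, *op)))
        .min_by_key(|(pos, _)| *pos)?;

    let field = expr[..pos].trim();
    let value = expr[pos + op.len()..].trim();
    if field.is_empty() || value.is_empty() {
        return None;
    }
    Some(Filter {
        field: field.to_string(),
        op: op.to_string(),
        value: value.to_string(),
    })
}
```

Both `modified_at>=2026-01-01` and `meta.status != draft` parse with this approach; `meta.status` is just an example of a frontmatter field name.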

Bug Fixes

Daemon tier-2: reranker without expander now correctly reranks tier-1 fused results (reranking is useful without expansion; expansion alone is harmful, -0.53% nDCG on NFCorpus).
Daemon tier-2: IR_COMBINED_MODEL load failure now falls back to dedicated models with an explicit warning instead of silently disabling tier-2.
Daemon tier-2: conflict between IR_COMBINED_MODEL and dedicated model env vars now warns before loading (combined wins).
Daemon tier-2: QMD_EXPANDER_MODEL / QMD_RERANKER_MODEL legacy aliases now correctly trigger dedicated mode instead of falling through to auto-detect.

Breaking

--modified-after / --modified-before CLI flags removed (were unreleased). Use -f "modified_at>=YYYY-MM-DD" and -f "modified_at<=YYYY-MM-DD".
MCP SearchInput.modified_after / modified_before fields removed. Use filter: [{field: "modified_at", op: ">=", value: "YYYY-MM-DD"}].
Collection DBs upgrade to schema version 2 on first use. A one-time backfill populates document_metadata from existing frontmatter (sub-second for <10k docs). No manual migration required.

[0.9.0] - 2026-04-15

Features

ir search --quiet / -q: suppress all stderr output (progress indicators, daemon log lines). Useful for scripting. Conflicts with --verbose.
IR_COMBINED_MODEL: new canonical env var for the unified Qwen3.5 GGUF (replaces both expander + reranker roles). IR_QWEN_MODEL still accepted but emits a deprecation warning on load.
IR_*_MODEL env vars now accept HuggingFace repo IDs (owner/name) in addition to local file and directory paths. Setting e.g. IR_EMBEDDING_MODEL=ggml-org/bge-m3-Q8_0-GGUF downloads and caches the model automatically on first use.
BGE-M3 added to the auto-download registry (ggml-org/bge-m3-Q8_0-GGUF). Download progress shown in foreground terminal; daemon loads from cache instantly.
Download UX improved: structured message before HF progress bar shows model name, size hint, source URL, cache location, and offline tip.
Download errors now include actionable fixes: retry, HF_HUB_OFFLINE=1, manual download URL, cache path to clear on corruption.

Breaking

Unrecognized IR_*_MODEL values (not a file, directory, or known HF repo ID) now error immediately instead of silently falling through to the default model. Users with leftover garbage env vars will see an error with an "Accepted forms" list. Unset the env var to restore default behavior.

Full changelog: https://github.com/vlwkaos/ir/blob/main/CHANGELOG.md

Releases: vlwkaos/ir

v0.15.0: Features
v0.14.1: Features, Chores
v0.14.0: Fixes, Dev / Benchmark Tooling
v0.13.0: Breaking, Features, Improvements, Dev / Benchmark Tooling
v0.12.0: New Features, Deprecations, Improvements
v0.11.2: Bug Fixes
v0.11.1: Bug Fixes
v0.11.0: Bug Fixes, Breaking
v0.10.0: Features, Bug Fixes, Breaking
v0.9.0 (2026-04-15): Features, Breaking