`tao` is a Rust-first knowledge engine for markdown vaults: a JSON-first CLI over SDK services, an internal bridge adapter for daemon/runtime flows, and deterministic fixture and benchmark tooling.
- Primary references: `Cargo.toml`, `package.json`, `config.toml`, `crates/tao-cli/README.md`, `crates/tao-sdk-service/README.md`, `crates/tao-sdk-bridge/README.md`, `crates/tao-bench/README.md`
- Operational scripts are authoritative for release, path guards, fixtures, and benchmarks: `scripts/release.sh`, `scripts/path-guards.sh`, `scripts/fixtures.sh`, `scripts/bench.sh`, `scripts/budgets.sh`
- Fixture semantics live in `vault/README.md` and `vault/fixtures/README.md`
- External docs used by this repo: Rust, Bun
- There is no tracked `.github/workflows/` directory in the current repository; treat local scripts, hooks, and crate tests as the real enforcement surface
.
├── crates/
│ ├── tao-cli/ JSON-first CLI surface, daemon client/server, contract tests
│ ├── tao-sdk-*/ Core SDK crates: config, vault scan, storage, service, internal bridge, search
│ ├── tao-bench/ Deterministic benchmark harness
│ └── tao-tui/ Placeholder TUI shell
├── scripts/ Path guards, fixtures, benchmarks, release, cleanup
├── vault/ Shipped QA/conformance fixture vault plus parity fixtures
└── AGENTS.md Canonical repo-level agent instructions; `README.md` and `CLAUDE.md` symlink here
- Start behavior changes in `crates/tao-cli/src/cli_impl/commands/` for CLI routing, `crates/tao-sdk-service/src/` for orchestration, and `crates/tao-sdk-storage/src/` for SQLite schema/repository work
- `vault/fixtures/graph-parity/expected/` holds golden JSON snapshots for CLI graph contracts
- `dist/`, `.benchmarks/`, `target/`, and `vault/generated/` are generated runtime/build outputs
| Layer | Choice | Notes |
|---|---|---|
| Core engine | Rust 2024 workspace | `unsafe_code = "forbid"` at workspace level, strict clippy |
| Storage | SQLite via `rusqlite` | schema and migrations owned by `tao-sdk-storage` |
| Vault FS | `tao-sdk-vault` | canonical path safety, NFC normalization, case-policy matching |
| CLI | `clap` + JSON/Toon envelopes | default JSON output, optional `--toon`, optional daemon forwarding |
| Native bridge | `tao-sdk-bridge` | internal Rust adapter shared by CLI warm-runtime flows and bridge benchmarks |
| Tooling | Bun + Husky + Biome | JS tooling only; core product/runtime is Rust |
| Benchmarks | `tao-bench` + `hyperfine` | timestamped reports under `.benchmarks/reports/` |
- `bun install` installs JS tooling and activates Husky hooks
- `cargo run -p tao-cli -- --help` iterates on the CLI without requiring a prior release build
- `bun run util:check` is the full completion gate: path-guard tests, Biome, `cargo fmt --check`, clippy, release `cargo check`, release tests, `cargo audit`, and `bun run build`
- `bun run build` packages release CLI artifacts via `scripts/release.sh`
- `bun run bench`, `bun run bench:smoke`, and `bun run bench:budget` are the package benchmark entrypoints; pass suite flags through `bun run bench -- --suite live` or `bun run bench -- --suite cli`
- `./scripts/fixtures.sh --profile parity` refreshes compact parity fixtures; generated synthetic benchmark fixtures are limited to `1k` and `5k`
- `tao validate <path>` validates markdown frontmatter, `.base` files, or a non-recursive folder window; add `--recursive` for nested folders
- `crates/tao-cli/src/cli_impl/commands/` is an adapter layer only; keep business rules in SDK crates and keep envelope/CLI formatting out of service code
- `crates/tao-sdk-service/src/` orchestrates indexing, reconcile, graph diagnostics, base execution, task/property operations, and health snapshots over storage and vault primitives
- `crates/tao-sdk-storage/src/` owns SQLite migrations, repositories, and transaction helpers
- `crates/tao-sdk-vault/src/` enforces vault boundaries and deterministic scan/fingerprint behavior; scans skip `.git`, `.obsidian`, `.tao`, and root `.taoignore`, and honor root `.taoignore` patterns for Tao indexing exclusions without reading `.gitignore`
- `crates/tao-sdk-bridge/src/` exposes `BridgeKernel` and envelope types used by CLI runtime caches and retained benchmark flows
- `vault reindex` is not a blind full rebuild: it prefers incremental reconcile and only escalates to full rebuild when link-resolution version state or indexed file-path consistency is stale
- `tao search` reads a derived unified search corpus (`search_segments`, `search_segments_fts`, `search_aliases`) built from the canonical file, doc FTS, property, task, graph, and base tables. `vault reindex`, incremental reconcile, daemon first-observation repair, and one-shot search stale checks keep that corpus in sync with the core index.
- Public graph help is centered on `graph links`, `graph audit`, `graph path`, and `graph walk`; older graph-specific subcommands remain callable as compatibility wrappers, are omitted from default `tao tools`, and can still be inspected with `tao tools <name>`
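
The `vault reindex` escalation rule described above can be sketched as a small decision function. This is a minimal illustration, not the code in `tao-sdk-service`: the `IndexState` struct, its field names, the `ReindexPlan` enum, and `plan_reindex` are all hypothetical names chosen for this example.

```rust
/// Hypothetical summary of the index state that a reindex would inspect.
/// Field names are illustrative and are not taken from the tao crates.
#[derive(Debug, Clone, Copy)]
pub struct IndexState {
    /// True when the stored link-resolution version state is stale.
    pub link_resolution_version_stale: bool,
    /// True when indexed file paths are no longer consistent with the vault.
    pub file_paths_inconsistent: bool,
}

/// The two outcomes named in the prose: reconcile incrementally or rebuild fully.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ReindexPlan {
    IncrementalReconcile,
    FullRebuild,
}

/// Prefer incremental reconcile; escalate to a full rebuild only when one of
/// the two staleness conditions holds.
pub fn plan_reindex(state: IndexState) -> ReindexPlan {
    if state.link_resolution_version_stale || state.file_paths_inconsistent {
        ReindexPlan::FullRebuild
    } else {
        ReindexPlan::IncrementalReconcile
    }
}
```

Calling `plan_reindex` with both flags set to `false` yields `ReindexPlan::IncrementalReconcile`; setting either flag yields `ReindexPlan::FullRebuild`.
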
- Vault root resolution is separate from other settings: `--vault-root` -> `TAO_VAULT_ROOT` -> `[vault].root` from repo/root `config.toml` discovered from cwd -> `[vault].root` from global `~/.tools/tao/config.toml`; once the vault is known, runtime/storage/security values resolve as explicit overrides -> `TAO_*` env vars -> vault `config.toml` -> repo/root config -> global config -> built-in defaults
- Relevant env vars: `TAO_VAULT_ROOT`, `TAO_CONFIG_PATH`, `TAO_DATA_DIR`, `TAO_DB_PATH`, `TAO_CASE_POLICY`, `TAO_TRACING_ENABLED`, `TAO_FEATURE_FLAGS`, `TAO_READ_ONLY`; `TAO_CONFIG_PATH` overrides the global config file location, and release/cleanup also honor `TAO_HOME`, `TOOLS_HOME`, and legacy `OBS_HOME`
- Probe-only config behavior is intentional: root and vault `config.toml` files are read when present but are not auto-created during normal config resolution
- Effective runtime defaults when config is absent are repo-local or vault-local: data dir `<vault>/.tao`, db path `<vault>/.tao/index.sqlite`, case-sensitive matching, tracing enabled, read-only enabled
- `config show` reports effective config values, per-field source labels, source inputs, and precedence without opening or migrating SQLite state
- Normal vault-facing CLI commands may auto-forward through a background daemon; hidden `vault daemon *` commands remain lifecycle/inspection escape hatches, not the normal user workflow
- Daemon sockets are Unix-only and default to `~/.tools/tao/daemons/vault-<hash>.sock`; when `HOME` is missing the fallback is `<cwd>/.tao/daemons/`
- Daemon first observation may reconcile or fully rebuild before serving cached reads; later change-monitor generations invalidate cached results for the affected runtime
- Generated and local state to expect: `dist/`, `.benchmarks/reports/`, `vault/generated/`, and local vault metadata directories like `vault/.tao/`
- `scripts/budgets.sh` optionally reads `plan/perf-budgets.json`, but that file is absent in the current repo; the script falls back to `profile=5k` and `10ms` default p50 budgets
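
The vault-root precedence chain and the default state paths listed above can be sketched as follows. This is only an illustration under stated assumptions: `RootInputs`, `RootSource`, `resolve_vault_root`, and `default_state_paths` are hypothetical names, and the sketch assumes each candidate has already been read from its source (flag, env var, or parsed config file) rather than showing how tao actually loads them.

```rust
use std::path::{Path, PathBuf};

/// Where a resolved vault root came from (hypothetical labels).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum RootSource {
    CliFlag,      // --vault-root
    EnvVar,       // TAO_VAULT_ROOT
    RepoConfig,   // [vault].root in repo/root config.toml
    GlobalConfig, // [vault].root in the global config.toml
}

/// Candidate values, already extracted from each source (assumption).
#[derive(Debug, Default, Clone)]
pub struct RootInputs {
    pub cli_flag: Option<PathBuf>,
    pub env_var: Option<PathBuf>,
    pub repo_config: Option<PathBuf>,
    pub global_config: Option<PathBuf>,
}

/// Return the first candidate that is set, in documented precedence order.
pub fn resolve_vault_root(inputs: &RootInputs) -> Option<(PathBuf, RootSource)> {
    let candidates = [
        (inputs.cli_flag.as_ref(), RootSource::CliFlag),
        (inputs.env_var.as_ref(), RootSource::EnvVar),
        (inputs.repo_config.as_ref(), RootSource::RepoConfig),
        (inputs.global_config.as_ref(), RootSource::GlobalConfig),
    ];
    candidates
        .iter()
        .find_map(|(path, source)| path.map(|p| (p.clone(), *source)))
}

/// Defaults when no config overrides them: data dir `<vault>/.tao`
/// and db path `<vault>/.tao/index.sqlite`.
pub fn default_state_paths(vault_root: &Path) -> (PathBuf, PathBuf) {
    let data_dir = vault_root.join(".tao");
    let db_path = data_dir.join("index.sqlite");
    (data_dir, db_path)
}
```

Returning the source alongside the path mirrors the per-field source labels that `config show` is described as reporting, though the real label format is not shown here.
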
- `README.md` and `CLAUDE.md` are symlink mirrors of `AGENTS.md`; edit the root file only
- Non-interactive CLI commands emit one JSON envelope to stdout by default; bare `tao` and help/version flows use native clap output.
- `--toon` emits the normal public CLI envelope as Toon instead of default JSON.
- `--json-stream` is a narrow projected JSON envelope path: it only applies to `query --from docs` without `--where` or `--sort`, and remains JSON-only.
- `query --from graph` without `--path` maps to the unresolved-link window; with `--path` it returns outgoing and backlink panels
- Public vault-content operations are read-only. `doc write`, `task set-state`, global `--allow-writes`, and public `--text` output are not part of the CLI surface.
- Internal state writes for `vault open`, `vault reindex`, daemon/cache/index maintenance, watch reconciliation, search-corpus repair, and health synchronization remain allowed; `vault reindex --dry-run` inspects planned index work.
- `tao search <query>` is the primary graph-aware exploration entrypoint across indexed markdown docs, the indexed file inventory, bases/frontmatter properties, tasks, graph links, and context expansion. Use `rg` for raw grep; use `tao search` when index metadata, exact aliases, normalized spaces/underscores/hyphens, canonical ranking, and relationships matter.
- `tao validate <path>` is the public validation surface for markdown frontmatter and `.base` files; `tao base validate` is not part of the public command surface.
- If you change command names, parameters, or examples, update `crates/tao-cli/src/cli_impl/registry.rs` and the contract tests that assert the public surface
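
The `--json-stream` and `query --from graph` rules above are easy to misread, so here is a minimal sketch of both as predicates. The types `QuerySource`, `QueryArgs`, and `GraphView` and the two functions are hypothetical names for illustration only; they are not the argument types used by `tao-cli`.

```rust
/// Hypothetical model of the `--from` value of a `query` invocation.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QuerySource {
    Docs,
    Graph,
    Other,
}

/// Hypothetical subset of parsed `query` arguments relevant to the rules.
#[derive(Debug, Clone, Copy)]
pub struct QueryArgs {
    pub from: QuerySource,
    pub has_where: bool,
    pub has_sort: bool,
}

/// `--json-stream` only applies to `query --from docs`
/// when neither `--where` nor `--sort` is present.
pub fn json_stream_applies(args: &QueryArgs) -> bool {
    args.from == QuerySource::Docs && !args.has_where && !args.has_sort
}

/// The two shapes a `query --from graph` result can take.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum GraphView {
    UnresolvedLinkWindow,
    OutgoingAndBacklinkPanels,
}

/// Without `--path` the query maps to the unresolved-link window;
/// with `--path` it returns outgoing and backlink panels.
pub fn graph_view(path: Option<&str>) -> GraphView {
    match path {
        None => GraphView::UnresolvedLinkWindow,
        Some(_) => GraphView::OutgoingAndBacklinkPanels,
    }
}
```

A contract test for the public surface could assert these same cases against the real CLI; the sketch only encodes the rules as stated in this document.
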
- Do not run general automated QA or fixture generation against personal vaults or paths outside this repository. Use `vault/`, `vault/generated/`, or repo-local temporary directories for tests and generated fixtures.
- Live-vault smoke checks and live-vault benchmarks are allowed because the public CLI is vault-content read-only. Pass live paths at runtime with `--live-vault` or `TAO_BENCH_LIVE_VAULT`; keep private benchmark probes in gitignored `.benchmarks/live-commands.txt`, never in tracked files.
- Treat `crates/tao-sdk-storage/`, `crates/tao-sdk-bridge/`, `crates/tao-cli/src/cli_impl/contract.rs`, `crates/tao-cli/src/cli_impl/registry.rs`, and `scripts/` as high-risk boundaries for migrations, contract stability, packaging, and path/output guardrails
- `scripts/clean.sh` removes `dist`, `TAO_HOME`, and the legacy `${OBS_HOME:-${TOOLS_HOME}/obs}` install directory; do not run it casually if those env vars point somewhere unexpected
- CLI/daemon/budget benchmark flows use repository-local generated fixtures by default; `bun run bench -- --suite live` uses a runtime-provided live vault. Daemon, live, fixture-generation, and budget suites require Unix sockets and `hyperfine`, while raw `tao-bench` scenarios (`bridge`, `startup`, `parse`, `resolve`, `search`, `graph-walk`, `unified-query`) do not.
- Required gate: `bun run util:check`
- CLI and JSON contract changes: `cargo test -p tao-cli --release`
- Service, bridge, or indexing changes: `cargo test -p tao-sdk-service --release` and `cargo test -p tao-sdk-bridge --release`
- Fixture or graph/base parity changes: use `vault/fixtures/README.md`, rerun the parity refresh flow, and keep `vault/fixtures/graph-parity/expected/` in sync with CLI snapshot tests
- Benchmark or performance changes: rerun the relevant suites from `scripts/bench.sh` and `scripts/budgets.sh`; reports land under `.benchmarks/reports/` with a `latest` symlink
- There is no tracked CI workflow directory at the repo root today, so local script/test output is the completion bar
- `scripts/tests/path_guards_test.sh` for the generic repository-local output and live-vault path guard expectations the repo actively tests