A home for AI & Shared Knowledge
Current release: Sky Omega 1.8.3 · Mercury substrate production-validated across the 1.7 line · DrHook substrate-independence reached at 1.8.2 (netcoredbg retired), with the lifecycle triad + console-I/O isolation + 22-tool MCP surface landing at 1.8.3 · three paired Mercury measurements on the same substrate generation:
- cycle 10 Phase 3 r4 — 21.3 B full Wikidata, 23 h 57 m end-to-end (2026-05-13)
- truthy r1 — 8.17 B truthy Wikidata, 14 h 13 m end-to-end (2026-05-14)
- WGPB step C — ~150 M 2018 reduced-truthy Wikidata, 4 m 30 s end-to-end + 849/850 WGPB queries in 4 m 43 s (2026-05-16)
Truthy is the apples-to-apples companion vs published WDBench / QLever / Virtuoso numbers. WGPB enables comparison vs MillenniumDB's published systematic-graph-pattern benchmarks. Cumulative discipline: 0 substrate failures across 8,564 unique query × substrate executions (cycle 8 + cycle 9 + cycle 10 r4 + truthy r1 + WGPB step C). Note Mercury Reference includes a built-in full-text trigram index — like-for-like vs systems without text indexes: 15 h 26 m / 6 h 49 m / n/a (trigram is a +8 h 30 m / +7 h 24 m / ~negligible feature cost that buys SPARQL text:match out of the box).
Your AI assistants are brilliant and homeless. Every conversation starts from nothing. Every insight evaporates when the window closes. They can reason, but they can't remember. They can help, but they can't grow.
Sky Omega gives them a place to stay.
Not a platform. Not a cloud service. A home — on your machine, under your control, queryable by any agent you trust. What your AI learns today, it knows tomorrow. What it knows on your laptop, it can share with your team through the tools you already use.
Whose home?
Yours. The AI lives there. You hold the keys. The knowledge it accumulates is stored locally in open standards — RDF triples, queryable via SPARQL, portable as Turtle files. No vendor lock-in. No proprietary memory systems. No platform deciding what your AI remembers or forgets.
Switch models. Switch providers. The knowledge stays.
What grows there?
Understanding. Not conversation logs — structured meaning. Decisions and why they were made. Patterns that work and approaches that failed. The vocabulary your project actually uses. The constraints that took three sessions to discover. All of it queryable, traceable, version-controlled.
Code travels via git. Now knowledge does too.
Why now?
For fifteen years, RDF was the right answer to a question nobody was asking. Structured knowledge representation needed an interface layer — something that could read, write, and reason over triples naturally. LLMs are that interface. They were the missing piece.
Sky Omega is what becomes possible when you stop building better travelers and start building them a home.
Measurements on v1.7 — Three paired measurements: cycle 10 Phase 3 r4 + truthy r1 + WGPB step C complete. Substrate at 23 h 57 m / 14 h 13 m / 4 m 30 s end-to-end at 21.3 B full / 8.17 B truthy / ~150 M WGPB-filtered (with full-text trigram index). Like-for-like (no trigram, comparing vs published QLever / Virtuoso / WDBench / MillenniumDB numbers): 15 h 26 m / 6 h 49 m / 4 m 30 s. WGPB queries: 849/850 in 4 m 43 s (99.88 % completion, 0 timeouts). Four measured 21.3 B Wikidata production runs across the trajectory; substrate now 3.5× faster than its first incarnation, all on a single laptop, BCL-only .NET.
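The like-for-like figures follow from subtracting the trigram-rebuild phase from each end-to-end time. A minimal Python sketch of that subtraction, using only durations quoted in this README. The helper names `like_for_like` and `fmt` are illustrative and not part of Mercury, and the full-Wikidata inputs use the second-level timings from the cycle 10 r4 breakdown:

```python
from datetime import timedelta

def like_for_like(end_to_end: timedelta, trigram: timedelta) -> timedelta:
    """End-to-end time with the trigram-index phase removed (illustrative helper)."""
    return end_to_end - trigram

def fmt(d: timedelta) -> str:
    """Render a duration as 'H h MM m', truncating seconds."""
    minutes = int(d.total_seconds()) // 60
    return f"{minutes // 60} h {minutes % 60:02d} m"

# Durations as quoted in this README.
full = like_for_like(timedelta(hours=23, minutes=56, seconds=50),
                     timedelta(hours=8, minutes=30, seconds=30))
truthy = like_for_like(timedelta(hours=14, minutes=13),
                       timedelta(hours=7, minutes=24))

print(fmt(full))    # 15 h 26 m
print(fmt(truthy))  # 6 h 49 m
```

The WGPB run is left out because its trigram cost is reported as negligible, so its like-for-like time stays at 4 m 30 s.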
Cumulative trajectory (measured-vs-measured, four completed full-Wikidata runs)
- Phase 6 (2026-04-25, 1.7 pre-Sorted) — 85 h end-to-end. First successful 21.3 B Reference end-to-end on a single M5 Max.
- Cycle 8 (2026-05-06, 1.7) — 46 h with intervention. ADR-034 SortedAtomStore for Reference closed Phase 1; algorithmic switch from Hash → Sorted atom store; ~42 % atoms.atoms reduction via prefix compression; cleanup-class FD fixes.
- Cycle 9 (2026-05-09, 1.7) — 35 h 35 m clean. ADR-037 pipelined spill (parser 14 h 15 m → 9 h 18 m, measured); cleanup hook (3.96 TB reclaimed at end-of-merge, manual intervention requirement eliminated).
- Cycle 10 r4 (2026-05-13, 1.7) — 23 h 57 m clean. ADR-038 merge-phase read-side (prefix-compress intermediate chunks + frontier readahead + sidecar offset table); ADR-039 BBHash MPHF over sealed atom set with `MaxLevels=40` + dense final-level fallback; MPHF instrumentation surface (per-level events + dense-fallback + start/complete summary); listener wire-through fix at `QuadStore.RebuildMphf`.
- Truthy r1 (2026-05-14, 1.7) — 14 h 13 m end-to-end on the same substrate. 8,171,214,990 truthy-Wikidata triples (vs full's 21.3 B). Apples-to-apples companion to cycle 10 r4 for comparison vs published WDBench / QLever / Virtuoso numbers. Key finding: trigram entries 90.7 % of full at 38.3 % triple-count ratio = ~2.4× more literal-density per triple in truthy → trigram-phase prediction needs literal-volume scaling, not triple-count scaling (dump-date confounder noted in the validation doc).
- WGPB step C (2026-05-16, 1.7) — 4 m 30 s end-to-end on a 2018 reduced-truthy Wikidata substrate (~150 M triples). MillenniumDB's Wikidata Graph Pattern Benchmark: 849/850 queries completed in 4 m 43 s (99.88 %, 0 timeouts; 1 query rejected as malformed source SPARQL — Mercury's parser correctly identifying real defects in published benchmark data). Aggregate p50 53 ms, p95 1.8 s, p99 4.3 s. Apples-to-apples vs published MillenniumDB / Virtuoso / Blazegraph WGPB numbers. See validation doc.
- Cumulative: 85 h → 24 h, −71.8 % wall-clock reduction across the substrate's evolution.
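The cumulative figure can be reproduced from the first and latest full-Wikidata wall-clock times listed above. A small sketch of that arithmetic; the function name `wall_clock_reduction` is hypothetical and the inputs are the Phase 6 and cycle 10 r4 times from this list:

```python
def wall_clock_reduction(first_hours: float, latest_hours: float) -> tuple[float, float]:
    """Return (percent reduction, speed-up factor) between two wall-clock times."""
    reduction = (first_hours - latest_hours) / first_hours * 100
    speedup = first_hours / latest_hours
    return reduction, speedup

phase6 = 85.0                # Phase 6: 85 h
cycle10_r4 = 23 + 57 / 60    # Cycle 10 r4: 23 h 57 m

reduction, speedup = wall_clock_reduction(phase6, cycle10_r4)
print(f"{reduction:.1f} % reduction, {speedup:.2f}x faster")
# 71.8 % reduction, 3.55x faster
```

The 3.55x result is the same ratio that is quoted elsewhere in this README as roughly 3.5×.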
Cycle 10 r4 — production validation of ADR-038 + ADR-039 + MPHF instrumentation (2026-05-13)
- 21,316,531,403 triples ingested from full Wikidata (`latest-all.ttl.bz2`) + sealed in 23 h 56 m 50 s end-to-end (parse 9 h 17 m + merge 2 h 41 m + MPHF 54 m 29 s + GSPO drain ~1 h 38 m + GPOS rebuild 55 m 27 s + Trigram rebuild 8 h 30 m 30 s)
- Dataset note: every Mercury measurement runs against full Wikidata, not the truthy subset (`latest-truthy.nt.bz2`) that most published QLever/Virtuoso/WDBench numbers use. Truthy is ~1.5–1.8× smaller and excludes statement-level qualifiers, references, and sitelinks. See the comparison-plane memo for the honest-comparison framing.
- MPHF construction characterized at production scale (4.005 B atoms): 25 levels, 0 dense fallback engaged, placement_ratio held at 0.6065 across all levels — exact match to BBHash theoretical `e^(−1/γ)` for γ=2.0. Total 54 m 29 s, within 1 % of the cycle 10 plan's "+~55 min MPHF" budget.
- Substrate output identity: 4,005,235,528 atoms, 17,029,283,265 GPOS entries, 7,472,855,623 trigram entries — bit-for-bit identical to cycle 9's measurements (same input, deterministic substrate). MPHF surface is purely additive: 1.75 GB `atoms.mphf` blob + 16.0 GB `atoms.idx` translation table.
- FD trajectory peaked at 8,325 during trigram rebuild (8,192 simultaneously-open chunks) vs the launchd ~10K effective ceiling = 17 % headroom held for 8 h+. Initial framing as `ExternalSorter` pool-bypass retracted 2026-05-16 as false alarm — code review confirms the pool IS engaged on the trigram-drain path via `ExternalSorter.ChunkReader.RefillBuffer → _pool.Get(_path)`; 8,192 was the pool running at its 8K cap with LRU eviction as designed. Eviction-overhead concern tracked in `docs/limits/trigram-drain-cap-eviction.md`.

Substrate components shipped (cumulative)
- ADR-034 SortedAtomStore for Reference — Completed (1.7)
- ADR-035 Phase 7a metrics infrastructure — Completed
- ADR-036 BCL-only bz2 streaming decompression — Completed
- ADR-037 Pipelined spill in `SortedAtomBulkBuilder` — Completed (1.7, production-validated cycle 9)
- ADR-038 Merge-phase read-side optimization — Completed (1.7, production-validated cycle 10 r4)
- ADR-039 MPHF over sealed atom set — Completed (1.7, production-validated cycle 10 r4)
- Reference-profile measurements per ADR-008. Cognitive-profile validation drought persists — see docs/limits/cognitive-profile-validation-drought.md.
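The MPHF figures reported for cycle 10 r4 can be sanity-checked against the usual BBHash model, in which a key is placed at a level when no other key hashes to its slot, with probability of about e^(−1/γ). The sketch below is a rough expected-value model under that assumption. The function names are illustrative, it is not Mercury's construction code, and it ignores the randomness of real hash collisions:

```python
import math

def placement_ratio(gamma: float) -> float:
    """Expected fraction of keys placed per level: approximately e^(-1/gamma)."""
    return math.exp(-1.0 / gamma)

def expected_levels(keys: int, gamma: float) -> int:
    """Levels until the expected number of unplaced keys drops below one."""
    remaining = float(keys)
    levels = 0
    while remaining >= 1.0:
        # Keys that collide at this level carry over to the next one.
        remaining *= 1.0 - placement_ratio(gamma)
        levels += 1
    return levels

print(round(placement_ratio(2.0), 4))       # 0.6065
print(expected_levels(4_005_235_528, 2.0))  # 24 under this model
```

For γ = 2.0 the model reproduces the reported placement_ratio of 0.6065 and predicts 24 levels for 4,005,235,528 atoms, one fewer than the 25 measured. A gap of that size is plausible for an expectation-only model, though this sketch does not establish its cause.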
Read more
- 21.3 Billion Triples on a Laptop, in .NET — the Phase 6 article
- What Compounds — Sky Omega's first four months, the recipe
- Cycle 10 Phase 3 r4 production validation — most recent measurement (1.7)
- Cycle 9 21.3 B production validation — the comparison baseline
- CHANGELOG.md · Roadmap · Validations · Limits register
If you're an AI assistant, start with AI.md.
git clone --recurse-submodules <repo-url> && cd sky-omega
dotnet build SkyOmega.sln
dotnet test
./tools/install-tools.sh # macOS/Linux
mercury -m # Start an in-memory session — REPL + SPARQL HTTP endpoint at http://localhost:3031/sparql
mercury <store> --bulk-load data.ttl.bz2 # Bulk-load Turtle (or .nt, .nq, .trig, .rdf, .jsonld; .bz2 / plain)

Already cloned without submodules? Run `./tools/update-submodules.sh` to fetch the W3C conformance test data needed by `dotnet test`.
New here? Follow the Getting Started tutorial.
Want to give Claude persistent memory? See Mercury MCP tutorial.
- sky-omega-public — Conceptual documentation, EEE methodology, architectural narratives
- grammar-meta-standard — EBNF grammars enabling grammar-aware reasoning
| Document | Purpose |
|---|---|
| AI.md | Start here if you're an AI assistant |
| CLAUDE.md | Operational guidance for AI-assisted development |
| MERCURY.md | Semantic memory discipline — when, why, how |
| STATISTICS.md | Codebase metrics and conformance tracking |
| Getting Started | 30-minute onboarding tutorial |
| Mercury CLI | CLI REPL deep dive |
| Mercury MCP | Claude integration and persistent memory |
| API Reference | Detailed code examples for all APIs |
| The Collected Poems of Kjell Silverstein | Sky Omega explained without a single line of code |
Mercury is a complete SPARQL 1.1 engine with zero external runtime dependencies (BCL-only core), zero-GC hot paths, and 100% W3C conformance across all core specifications. It gives AI assistants persistent, queryable memory on your machine — what your AI learns today, it knows tomorrow.
Scope of "BCL-only": Mercury core (`src/Mercury/`) and its 21-public-type embeddable surface have no `PackageReference` entries. Adjacent surfaces — `Mercury.Mcp` (depends on `ModelContextProtocol`), `DrHook.Engine` (admitted: `Microsoft.Diagnostics.NETCore.Client` managed + `Microsoft.Diagnostics.DbgShim.<rid>` native per-RID + `Microsoft.Diagnostics.Tracing.TraceEvent` for EventPipe parsing, all per ADR-009; netcoredbg retired at 1.8.2) — package the substrate for tooling and runtime observation. The substrate-independence claim applies to the core; the tooling layer is honest about its dependencies.
The broader Sky Omega vision is a stand-alone cognitive agent built on this foundation, combining:
- Structured memory via a temporal RDF knowledge substrate (Mercury — built)
- Grammar-driven reasoning (syntax, behavior, and intent grammars)
- Local LLM inference (Minerva — BCL-only, in development)
- Explainable, traceable logic — a foundation for hybrid AGI
Mercury is a full SPARQL 1.1 + RDF stack. Every standard listed below is implemented in BCL-only C#, validated against the W3C conformance test suite, and exposed via CLI, HTTP endpoint, and embeddable .NET API.
| Format | W3C Conformance | Use |
|---|---|---|
| Turtle 1.2 | 309/309 (100%) | Human-friendly, prefix support, the de-facto interchange format |
| TriG 1.2 | 352/352 (100%) | Turtle with named graphs |
| N-Triples 1.2 | 70/70 (100%) | Line-oriented, the Wikidata dump format |
| N-Quads 1.2 | 87/87 (100%) | N-Triples with named graphs |
| RDF/XML 1.1 | 166/166 (100%) | Legacy interop, still required by many vocabularies |
| JSON-LD 1.1 | 461/467 (99%, 6 intentional skips) | JSON-native RDF for web/API surfaces |
| Spec | W3C Conformance |
|---|---|
| SPARQL 1.1 Query (SELECT, ASK, CONSTRUCT, DESCRIBE, all aggregates, property paths, federated SERVICE) | 421/421 (100%) |
| SPARQL 1.1 Update (INSERT, DELETE, LOAD, CLEAR, CREATE, DROP, COPY, MOVE, ADD) | 94/94 (100%) |
| SPARQL 1.1 Syntax | 103/103 (100%) |
| SPARQL 1.1 Federated Query (SERVICE clause, remote endpoints) | included in Query 421 |
- SPARQL Protocol over HTTP — `mercury` CLI ships with a built-in HTTP endpoint at `http://localhost:3031/sparql`. Standard query/update content negotiation, JSON/XML/CSV/TSV result serialization. Use `SERVICE <http://localhost:3030/sparql>` to federate across local Mercury instances.
- W3C Solid Protocol server (`Mercury.Solid`) — WAC + ACP access control, N3 Patch updates, full HTTP handlers.
- Model Context Protocol (MCP) — `mercury-mcp` exposes Mercury as a Claude semantic-memory tool with persistent store survival across sessions.
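As an illustration of the HTTP endpoint, here is a minimal Python sketch that builds a SPARQL Protocol query request using only the standard library. It assumes the endpoint accepts a GET request with a `query` parameter and honours the `application/sparql-results+json` media type, as the W3C SPARQL 1.1 Protocol describes; the helper name `build_sparql_request` is hypothetical, and the exact behaviour should be checked against a running `mercury -m` session:

```python
import urllib.parse
import urllib.request

ENDPOINT = "http://localhost:3031/sparql"  # endpoint started by `mercury -m`

def build_sparql_request(query: str, endpoint: str = ENDPOINT) -> urllib.request.Request:
    """Build a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": query})
    return urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"})

request = build_sparql_request("SELECT * WHERE { ?s ?p ?o } LIMIT 5")
print(request.full_url)

# To actually send it (requires a running `mercury -m` session):
# import json
# with urllib.request.urlopen(request) as response:
#     print(json.load(response)["results"]["bindings"])
```

Building the request separately from sending it keeps the example runnable without a live store; uncomment the last lines once a session is listening.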
- Valid-time + transaction-time stored as implicit dimensions on every triple
- `AS OF`, `DURING`, `ALL VERSIONS` query forms for time-travel
- Versioning, soft-delete, audit trails — all queryable through standard SPARQL with temporal extensions
- Reference profile drops temporal columns by design — sealed canonical snapshots have no time dimension. See ADR-008 for the workload-profile distinction.
| Claim | Evidence | Command to Verify |
|---|---|---|
| 100% W3C SPARQL 1.1 Query | 421 passing tests | dotnet test --filter "W3C.Sparql.Query" |
| 100% W3C SPARQL 1.1 Update | 94 passing tests | dotnet test --filter "W3C.Sparql.Update" |
| 100% W3C SPARQL 1.1 Syntax | 103 passing tests | dotnet test --filter "W3C.Sparql.Syntax" |
| 100% W3C Turtle / TriG / N-Triples / N-Quads / RDF-XML | 984 passing tests | dotnet test --filter "W3C" |
| 100% W3C JSON-LD 1.1 | 461 passing tests (6 intentional skips: legacy 1.0, generalized RDF) | dotnet test --filter "W3C.JsonLd" |
| SPARQL HTTP endpoint | `mercury` CLI | `mercury -m` then visit `http://localhost:3031/sparql` |
| Zero external runtime deps | Mercury.csproj | grep PackageReference src/Mercury/*.csproj |
| 4,463 Mercury tests passing | Test suite | dotnet test |
| AI-assisted development | Git history | `git log --oneline --grep="Co-Authored-By"` |
| Development velocity | ~197K lines | See STATISTICS.md |
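The 984 figure in the table above is the sum of the per-format counts for the five non-JSON-LD RDF formats listed earlier. A short sketch of that tally, with every number copied from this README's conformance tables and the dictionary names chosen purely for illustration:

```python
# Per-format W3C test counts from the format table in this README.
rdf_formats = {
    "Turtle 1.2": 309,
    "TriG 1.2": 352,
    "N-Triples 1.2": 70,
    "N-Quads 1.2": 87,
    "RDF/XML 1.1": 166,
}
# Per-suite SPARQL 1.1 counts from the spec table.
sparql_suites = {"Query": 421, "Update": 94, "Syntax": 103}

rdf_total = sum(rdf_formats.values())
sparql_total = sum(sparql_suites.values())
print(rdf_total)     # 984
print(sparql_total)  # 618
```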
Sky Omega - On Emergence, Epistemics, and the Patience Required to Build What Matters
Everything below has code in src/, tests, and benchmarks.
| Component | Description |
|---|---|
| Mercury | Temporal RDF substrate — 88,534 lines, BCL-only. SPARQL 1.1 Query + Update + Syntax (100% W3C). RDF parsing/writing for Turtle, TriG, N-Triples, N-Quads, RDF/XML, JSON-LD. Built-in SPARQL HTTP endpoint (http://localhost:3031/sparql) with standard content negotiation. Four storage profiles per the no-behavior-flags rule (ADR-029, closed 1.7): Cognitive (bitemporal, versioned), Graph (versioned with soft-delete + un-delete-on-add), Reference (immutable, Wikidata-shaped), Minimal (single P→S→T sort order). Bitemporal extensions for time-travel queries. Zero-GC hot paths. |
| Mercury.Solid | W3C Solid Protocol server — WAC + ACP access control, N3 Patch updates, full HTTP surface |
| Mercury.Pruning | Dual-instance pruning with copy-and-switch pattern |
| Mercury MCP | Claude integration with persistent semantic memory |
| Mercury CLI | Interactive REPL with persistent store, global tool install |
| DrHook.Engine | Runtime observation substrate — BCL + P/Invoke + source-gen COM, 6,231 lines. ICorDebug via per-RID libdbgshim; EventPipe for process listing + thread/stack snapshots. Substrate-independence reached at 1.8.2 (netcoredbg retired). PoC probes through 63 with 81 dated findings validating each capability on macOS/arm64 (single-platform so far). Production-suitability sequenced under ADR-007 (Accepted). |
| DrHook MCP | MCP server for .NET runtime inspection (peer to Mercury MCP) — 22 tools backed by DrHook.Engine: session lifecycle (launch / attach / stop / detach / kill), execution control (continue / pause / step over / into / out), breakpoints (source / function / exception with subclass filtering + condition / hit-count / logpoint, list / remove / clear), locals (field + array inspection), processes, snapshot, and console / log / anomaly drains. Tool names follow IDE-debugger convention (ADR-010); every state-changing or state-reading tool takes a hypothesis. |
The agent architecture that Mercury and DrHook are being built to support. The Sky Omega 1.8 line opens the cognitive-layers entry point per the amended version-line model (2026-04-26); in practice the early 1.8.x releases (1.8.1 → 1.8.3) first completed DrHook's substrate-independence and lifecycle hardening, so the cognitive layers below begin landing later in the 1.8.x line.
| Component | Role |
|---|---|
| Sky | Language layer — pruned reasoning, reflection, and short-term memory |
| James | Cognitive orchestration — tail recursive orchestration loop |
| Lucy | Long-term memory — epistemic and semantic, queryable, precise (powered by Mercury) |
| Mira | Integration layer — expression and sensory capabilities, UX/UI |
| Minerva | LLM inference substrate — BCL-only, zero-GC, local-first |
| Behavior & Intent Grammars | Define what Sky knows, intends, and verifies |
- Intent before implementation
- Transparency before complexity
- Semantics before scale
- Recursion as structure
- Code as a mirror of cognition
Modern IDEs (Visual Studio, Rider, VS Code) offer two views of this repository:
| View | What you see | Best for |
|---|---|---|
| Solution View | Virtual folders from `SkyOmega.sln` | Browsing by component (Mercury ADRs under Mercury, etc.) |
| Filesystem View | Actual directory structure | Finding files by path, understanding repository layout |
Both are valid. The solution file organizes content logically for architects and developers, while the filesystem maintains consistent paths for documentation links.