NamiDB embeds like DuckDB, runs as a standalone HTTP server, or sits on our hosted cloud. Same engine in all three, and the bucket is always the source of truth.
NamiDB is a graph database engine built around object storage. You write Cypher, it lays your nodes and edges out as columnar files in a bucket, and that bucket is the only source of truth. There's nothing else to run and nothing to coordinate outside the bucket itself. The same engine ships three ways: embedded as a library, as an HTTP server, or on our hosted cloud.
A few things had to line up before this made sense.
S3 finally got conditional writes. In 2024, S3 shipped If-Match and If-None-Match, which was the last primitive we were missing. With compare-and-swap on the bucket you can build a coordinated, durable system where object storage is the database. There's no Raft, no ZooKeeper, no etcd in the picture. People had already pulled this off for vectors, for queues, for analytics. Nobody had done it for graphs.
The best columnar graph engine left the field. Apple bought Kùzu in October 2025 and archived the repo. It was the most carefully thought-out columnar graph engine anyone had published, and it just went quiet. That left a hole.
Agents need graphs. Vector search is necessary but it isn't enough. Knowledge graphs are what agent memory, deep retrieval, and reasoning under uncertainty actually sit on once you're past the demo. A lot of the interesting AI work this decade is going to be about relationships, not just embeddings.
So that's what we're building.
NamiDB started inside LESAI as the graph database behind a hosted product we're building. We've been at it for about a year now, and every Cypher query, every manifest CAS, every CSR adjacency table in here has been run against real workloads, not just unit tests.
We're open-sourcing the engine now because two things finally lined up:
- Apple archived Kùzu in October 2025, so the columnar property-graph space lost its one maintained option. We'd independently landed on more or less the design Kùzu pioneered, so putting NamiDB out there felt like the most useful thing we could do about that gap.
- Our own roadmap moved to a hosted product, NamiDB Cloud, which is multi-tenant and scales to zero per namespace. The engine doesn't need to be a competitive secret anymore. The engine is open, the cloud is the business.
NamiDB writes Cypher to your S3 bucket.
There's no control plane to provision, no Raft to tune, no etcd to babysit. Conditional writes (If-Match / If-None-Match) on the bucket take the place of a consensus tier, so the bucket itself holds the truth. Your graph database is just files: durability is whatever S3, R2, GCS or Azure already give you, cost drops to zero when nobody is querying, a backup is aws s3 sync, and a tenant is a folder.
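To make the conditional-write idea concrete, here is a minimal sketch of compare-and-swap over an in-memory stand-in for a bucket. `FakeBucket`, `PreconditionFailed` and `cas_update` are hypothetical names for illustration, not NamiDB or S3 client APIs; only the `If-Match` / `If-None-Match` semantics are taken from the text above.

```python
import threading


class PreconditionFailed(Exception):
    """Raised when an If-Match / If-None-Match precondition does not hold."""


class FakeBucket:
    """In-memory stand-in for a bucket that supports conditional writes."""

    def __init__(self):
        self._objects = {}  # key -> (etag, body)
        self._lock = threading.Lock()
        self._next_etag = 0

    def get(self, key):
        """Return (etag, body), or None if the key does not exist."""
        with self._lock:
            return self._objects.get(key)

    def put(self, key, body, if_match=None, if_none_match=None):
        with self._lock:
            current = self._objects.get(key)
            # If-None-Match: * means "create only if the key is absent".
            if if_none_match == "*" and current is not None:
                raise PreconditionFailed(key)
            # If-Match means "overwrite only the exact version I read".
            if if_match is not None and (current is None or current[0] != if_match):
                raise PreconditionFailed(key)
            self._next_etag += 1
            etag = f"etag-{self._next_etag}"
            self._objects[key] = (etag, body)
            return etag


def cas_update(bucket, key, update):
    """Read-modify-write loop: retry until the write lands on the version we read."""
    while True:
        current = bucket.get(key)
        try:
            if current is None:
                return bucket.put(key, update(None), if_none_match="*")
            etag, body = current
            return bucket.put(key, update(body), if_match=etag)
        except PreconditionFailed:
            continue  # someone else won the race; re-read and try again
```

Two writers racing on the same key can both call `cas_update`; exactly one `put` wins per round and the loser re-reads before retrying, which is the property a consensus tier would otherwise provide.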
The engine is the same whether you run it as a library inside your app, as a Rust daemon over HTTP, or on our hosted cloud. It works just as well against AWS S3, Cloudflare R2, GCS, Azure Blob, MinIO, or your local disk.
| Mode | Status | Best for | How it ships |
|---|---|---|---|
| Server | 🧪 alpha (v0.1) | Self-hosted over your own S3 / R2 / GCS / Azure bucket | namidb-server binary + Dockerfile |
| Embedded | 🧪 alpha (v0.1) | Notebooks, single-process apps, local dev, CI fixtures | pip install namidb, talks to a bucket from inside your process |
| Cloud | 🔒 closed beta | Multi-tenant SaaS, agent memory, scale-to-zero per namespace | Managed by LESAI on namidb.com, request access |
It's the same engine across all three. Server and Embedded write to an identical bucket layout, so you can open an embedded notebook against the exact s3://... URI a namidb-server daemon is serving.
NamiDB is pre-1.0 and alpha: the engine has run inside LESAI for about a year, but it has no external production users yet, several production concerns are still in progress (broad authz, backup/restore), and it is not yet a drop-in for critical data. Run it over a bucket you own, keep backups (aws s3 sync), and treat 0.x as the moving target it is.
- Cypher and GQL parsing. A strict subset of GQL (ISO/IEC 39075:2024) plus openCypher 9. All 12 in-scope LDBC SNB Interactive Complex Read queries (IC01 through IC12) parse, plan and run end to end.
- Writes through Cypher. CREATE, MERGE, SET, DELETE, DETACH DELETE, REMOVE. Durable on commit_batch (WAL append plus manifest CAS).
- Cost-based optimizer. Predicate pushdown, projection pushdown, join reorder, hash-join conversion, hash semi-join (EXISTS decorrelation), Parquet row-group pruning. EXPLAIN VERBOSE prints the chosen plan with selectivity and cost annotations.
- Vectorized execution. A morsel-driven executor with an optional factorized intermediate representation (RFC-017) for path-heavy queries.
- Columnar storage on object storage. Parquet node SSTs, a custom edge-SST format with CSR adjacency (RFC-002), zstd compression, bloom filters, fence-pointer indices.
- Coordination-free correctness. One writer per namespace, with epoch fencing via manifest CAS. Conditional writes (If-Match, If-None-Match) stand in for external consensus.
- Tiered caches. A process-wide AdjacencyCache (CSR), a NodeViewCache, and an SstCache (decoded body, edge property streams, and the reader). Cross-snapshot reuse, Arc-shared and byte-budgeted.
- Six storage backends. memory://, file:// (with flock-based CAS), s3:// (AWS S3, R2, MinIO, Tigris, LocalStack), gs://, az://.
- Python bindings. pip install namidb. abi3 wheels for Linux (x86_64 and aarch64), macOS (arm64) and Windows (x86_64), with an sdist fallback everywhere else. Sync and async (acypher). Arrow, pandas and polars output.
- CLI. namidb parse, namidb explain --verbose, namidb run --store <uri> for ad-hoc query work against any backend.
- HTTP server. The namidb-server binary, with bearer-token auth, a periodic flush loop, a lock-free /v0/livez liveness probe, Prometheus metrics at /v0/metrics plus a slow-query log, and a small REST API (/v0/cypher, /v0/health, /v0/admin/flush). Optional TLS on both the HTTP and Bolt listeners via --tls-cert / --tls-key (rustls).
- Bolt protocol. Same namidb-server binary speaks Bolt 4.4 / 5.0 / 5.4 on an opt-in TCP listener (default 7687). Neo4j drivers connect over bolt://host:7687 and run Cypher, verified end-to-end with the Python driver. The other language drivers (Java, JavaScript, .NET, Go, Rust) speak the same protocol but are not all exercised yet. The Cypher parser does not implement CALL / SHOW as general clauses, but the server answers the specific schema-introspection procedures Neo4j and Memgraph GUIs fire on connect (db.labels, db.relationshipTypes, apoc.meta.*, meta_util.schema, and the rest) over Bolt, so their schema panels populate, verified with G.V()/gdotv. Running those procedures over the HTTP or embedded APIs still hits the parser limitation. See RFC-022.
- Vector search and embeddings. cosine_similarity, dot_product and euclidean_distance builtins rank stored f32 vectors through the normal scan + ORDER BY + LIMIT path, so K-nearest-neighbour search is just Cypher. Loading a markdown vault (namidb load-vault --embed, or the MCP server) embeds each note so semantic search works over it. The default embedder is local and lexical: a dependency-free hashing embedder that matches shared vocabulary, not meaning. For meaning-level search, build with --features remote-embedder and set the NAMIDB_EMBED_* variables to use OpenAI, Voyage, Cohere, Gemini or Jina. A namespace must be embedded consistently; switching embedders means a re-embed. The scan is still flat (no ANN index yet) and vectors are stored uncompressed.
- Bench harness. A synthetic, deterministic LDBC SNB Interactive harness under bench/.
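The vector-search bullet above describes a lexical hashing embedder plus a flat scan ranked by cosine similarity. The sketch below shows that general shape in plain Python. It is an illustration under assumptions, not NamiDB's embedder: `hash_embed`, the 64-dimension default and the SHA-256 bucketing are all hypothetical choices.

```python
import hashlib
import math


def hash_embed(text, dim=64):
    """Toy hashing embedder: each token bumps one bucket of a fixed-size vector."""
    vec = [0.0] * dim
    for token in text.lower().split():
        digest = hashlib.sha256(token.encode("utf-8")).digest()
        vec[int.from_bytes(digest[:4], "big") % dim] += 1.0
    return vec


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def knn(query, notes, k=2):
    """Flat scan: score every note, sort by score, keep the top k titles."""
    q = hash_embed(query)
    scored = [(cosine_similarity(q, hash_embed(body)), title)
              for title, body in notes.items()]
    scored.sort(reverse=True)
    return [title for _, title in scored[:k]]


notes = {
    "soup": "recipe for tomato soup",
    "engine": "columnar graph storage engine",
    "bucket": "graph database on object storage",
}
print(knn("graph storage", notes, k=2))
```

Because the vector only records which tokens appear, two notes score highly when they share words, not when they share meaning, which is the limitation the bullet calls out. The scan cost grows linearly with the number of stored vectors, as it would for any flat scan without an ANN index.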
Two ways in. Same engine behind both.
This is the headline use case. Point it at a bucket, write Cypher, and durability is whatever S3 already gives you.
pip install namidb
export AWS_ACCESS_KEY_ID=AKIA...
export AWS_SECRET_ACCESS_KEY=...

import namidb
# Open (or bootstrap) the `prod` namespace on your bucket.
client = namidb.Client("s3://my-bucket/data?ns=prod&region=us-east-1")
client.cypher("CREATE (a:Person {name: 'Alice', age: 30})")
client.cypher("CREATE (b:Person {name: 'Bob', age: 25})")
client.cypher(
"MATCH (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'}) "
"CREATE (a)-[:KNOWS {since: 2020}]->(b)"
)
result = client.cypher(
"MATCH (p:Person) WHERE p.age >= $min RETURN p.name AS name, p.age AS age",
params={"min": 18},
)
print(result.to_pandas())

Kill the process and start it again. Open a notebook on another machine pointed at the same URI. The graph is still there, because the bucket is the database.
For when you just want to poke at the engine before wiring up a bucket. In-process, ephemeral, zero setup:
import namidb
client = namidb.Client("memory://acme")
client.cypher("CREATE (a:Person {name: 'Alice'})")
print(client.cypher("MATCH (p:Person) RETURN p.name").rows())

The same handful of lines work against file://, gs://, az://, or any S3-compatible endpoint. Only the URI changes.
The URI tells the client which bucket and which namespace to use.
| Scheme | Backend |
|---|---|
| s3://<bucket>[/<prefix>]?ns=<ns> | AWS S3, Cloudflare R2, MinIO, Tigris, LocalStack, anything S3-compatible |
| gs://<bucket>?ns=<ns> | Google Cloud Storage |
| az://<account>/<container>?ns=<ns> | Azure Blob Storage |
| file:///abs/dir?ns=<ns> | Local filesystem (CAS via flock + atomic rename) |
| memory://<ns> | In-process and ephemeral, for testing only |
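The URI shapes in the table can be taken apart with nothing but the standard library, which is handy when you want to validate a connection string before handing it to the client. This is a rough sketch: `describe_store_uri` and the keys of the dict it returns are hypothetical, and NamiDB's own parser may treat edge cases differently.

```python
from urllib.parse import parse_qs, urlsplit


def describe_store_uri(uri):
    """Split a store URI into scheme, container, path, namespace and options."""
    parts = urlsplit(uri)
    options = {key: values[0] for key, values in parse_qs(parts.query).items()}
    if parts.scheme == "memory":
        # memory://<ns> carries the namespace in the authority position.
        return {"scheme": "memory", "namespace": parts.netloc, "options": options}
    namespace = options.pop("ns", None)
    # file:// keeps its absolute path; bucket schemes treat the path as a prefix.
    path = parts.path if parts.scheme == "file" else parts.path.lstrip("/")
    return {
        "scheme": parts.scheme,
        "container": parts.netloc,  # bucket, Azure account, or "" for file://
        "path": path,
        "namespace": namespace,
        "options": options,
    }


print(describe_store_uri("s3://my-bucket/data?ns=prod&region=us-east-1"))
```

Everything after `ns` (for example `region`, `endpoint` or `allow_http`) ends up in `options`, so the same helper covers R2, MinIO and LocalStack URIs.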
Every backend speaks the same Cypher, exposes the same Python, Rust and HTTP APIs, and gives you the same snapshot-isolated reads.
Durability. A write is acknowledged once commit_batch has written its WAL segment and swung the manifest pointer on the backend. On the object stores (s3://, gs://, az://, and S3-compatibles like R2 or MinIO) that inherits the backend's own durability: once the PUT is acked, the write is on durable storage. The file:// backend writes through the OS page cache and does not fsync, so a committed write survives a process crash but a kernel panic or power loss can lose the most recent un-flushed writes. Treat file:// as a development and single-node store, not as the only copy of data you cannot lose; point production at an object store. memory:// is not durable at all. Backup and restore a consistent snapshot of any namespace with namidb backup / namidb restore.
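To see what the missing fsync means in practice, the sketch below writes a file through a temp file plus atomic rename and makes the fsync step optional. It is a generic illustration of the pattern, not the file:// backend's code; `durable_replace` is a hypothetical helper.

```python
import os
import tempfile


def durable_replace(path, data, fsync=True):
    """Atomically replace `path` with `data`, optionally forcing it to disk."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
            f.flush()  # user-space buffer -> OS page cache
            if fsync:
                os.fsync(f.fileno())  # page cache -> storage device
        os.replace(tmp, path)  # atomic swap: readers see old or new, never half
        if fsync and hasattr(os, "O_DIRECTORY"):
            # Persist the rename itself by syncing the containing directory.
            dir_fd = os.open(directory, os.O_RDONLY | os.O_DIRECTORY)
            try:
                os.fsync(dir_fd)
            finally:
                os.close(dir_fd)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise
```

With `fsync=False` the rename is still atomic and the data survives a process crash, because it already sits in the page cache. Only with `fsync=True` is it expected to survive a kernel panic or power loss, which is the gap the paragraph above describes.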
import os
os.environ["AWS_ACCESS_KEY_ID"] = "AKIA..."
os.environ["AWS_SECRET_ACCESS_KEY"] = "..."
client = namidb.Client(
    "s3://my-bucket/data?ns=prod"
    "&region=us-west-2"
)

Credentials come from the standard AWS env vars (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY, AWS_SESSION_TOKEN, AWS_DEFAULT_REGION). IAM roles on EC2, EKS, Lambda and ECS just work, with no NamiDB-specific auth to wire up.
The only IAM permissions NamiDB needs on the bucket are s3:GetObject, s3:PutObject, s3:DeleteObject and s3:ListBucket. That's it. No DynamoDB lock table, no separate metadata service.
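As a starting point, those four permissions can be written as an IAM policy document. The sketch below builds one from a bucket name and an optional key prefix. `namidb_bucket_policy` is a hypothetical helper, and scoping the object actions to a prefix is an assumption about your layout; `s3:ListBucket` has to target the bucket ARN because it is a bucket-level action.

```python
import json


def namidb_bucket_policy(bucket, prefix=""):
    """Build a least-privilege policy covering the four actions listed above."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Object-level actions apply to object ARNs under the prefix.
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject", "s3:DeleteObject"],
                "Resource": f"arn:aws:s3:::{bucket}/{prefix}*",
            },
            {
                # Listing is a bucket-level action, so it targets the bucket ARN.
                "Effect": "Allow",
                "Action": "s3:ListBucket",
                "Resource": f"arn:aws:s3:::{bucket}",
            },
        ],
    }


print(json.dumps(namidb_bucket_policy("my-bucket", "data/"), indent=2))
```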
R2 charges nothing for egress and has full S3-compatible conditional writes. Same scheme, just point at the R2 endpoint with region=auto:
import os
os.environ["AWS_ACCESS_KEY_ID"] = "<R2 access key>"
os.environ["AWS_SECRET_ACCESS_KEY"] = "<R2 secret>"
client = namidb.Client(
    "s3://my-bucket?ns=prod"
    "&endpoint=https://<ACCOUNT_ID>.r2.cloudflarestorage.com"
    "&region=auto"
)

If you're running NamiDB anywhere outside AWS (Cloudflare Workers, Fly.io, your own VPS, your laptop), R2 is almost always the right call.
Same namidb.Client(...) call, just a different URI. Copy-paste credentials for each backend follow.
Google Cloud Storage (gs://)
import os
os.environ["GOOGLE_APPLICATION_CREDENTIALS"] = "/etc/gcs-key.json"
client = namidb.Client("gs://my-bucket/data?ns=prod")

You can also pass the service-account path in the URI:
gs://my-bucket?ns=prod&service_account=/etc/gcs-key.json.
Azure Blob Storage (az://)
import os
os.environ["AZURE_STORAGE_ACCOUNT_NAME"] = "myacct"
os.environ["AZURE_STORAGE_ACCESS_KEY"] = "..."
client = namidb.Client("az://myacct/mycontainer?ns=prod")

For Azurite (the local emulator) tack on &use_emulator=true.
MinIO (self-hosted S3), s3:// with an endpoint=...
docker run -d --rm -p 9000:9000 -p 9001:9001 \
-e MINIO_ROOT_USER=minioadmin -e MINIO_ROOT_PASSWORD=minioadmin \
--name minio minio/minio server /data --console-address ":9001"
docker exec minio mc alias set local http://127.0.0.1:9000 minioadmin minioadmin
docker exec minio mc mb local/namidb

import os
os.environ["AWS_ACCESS_KEY_ID"] = "minioadmin"
os.environ["AWS_SECRET_ACCESS_KEY"] = "minioadmin"
client = namidb.Client(
    "s3://namidb?ns=dev"
    "&endpoint=http://127.0.0.1:9000"
    "&region=us-east-1"
    "&allow_http=true"
)

For the production-style MinIO plus namidb-server plus docker-compose stack,
see Self-host as a database below.
LocalStack (S3 mock for tests), s3:// with an endpoint=...
docker run -p 4566:4566 -e SERVICES=s3 localstack/localstack
aws --endpoint-url=http://localhost:4566 s3 mb s3://namidb-dev
export AWS_ACCESS_KEY_ID=test AWS_SECRET_ACCESS_KEY=test

client = namidb.Client(
    "s3://namidb-dev?ns=local"
    "&endpoint=http://localhost:4566"
    "&allow_http=true"
    "&region=us-east-1"
)

Local filesystem (file://)
For CI fixtures or single-machine dev when you want durability without
a bucket. Full manifest CAS via per-namespace flock plus atomic
rename(2).
client = namidb.Client("file:///var/lib/namidb?ns=prod")
# relative paths work too:
client = namidb.Client("file://./data?ns=dev")

There are two ways to run NamiDB as a database you fully own. Pick whichever matches how your app wants to talk to it.
Your app (Python or Rust) imports NamiDB directly and points at a bucket you control. Lowest latency, no extra hop, no network boundary, nothing to authenticate against. This is the "DuckDB for graphs" mode.
# Python service
import namidb
client = namidb.Client("s3://your-bucket/data?ns=prod&region=us-east-1")
result = client.cypher("MATCH (n:Person) RETURN count(n) AS n")

// Rust service
use namidb::{
core::id::NamespaceId,
storage::{parse_uri, WriterSession},
};
let (store, paths) = parse_uri("s3://your-bucket/data?ns=prod")?;
let mut writer = WriterSession::open(store, paths).await?;
// upserts, commit_batch, snapshot reads...

Reach for this when your read fan-out fits in a single process and you don't want any network overhead. Because object storage is the source of truth, two replicas of your service can open the same namespace independently, and NamiDB's epoch-CAS protocol fences out the stale writer for you.
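The fencing behaviour can be pictured with a toy model: opening a writer bumps an epoch in the manifest through compare-and-swap, and every later commit checks that the epoch is still its own. This is a simplified sketch of the general technique, assuming a single versioned manifest per namespace. `ManifestStore`, `Writer` and `StaleWriter` are hypothetical names, and the real protocol in the RFCs may differ in detail.

```python
class StaleWriter(Exception):
    """Raised when a newer writer has taken over the namespace."""


class ManifestStore:
    """One manifest per namespace, guarded by a version tag (think ETag)."""

    def __init__(self):
        self.version = 0
        self.manifest = {"epoch": 0, "segments": []}

    def read(self):
        # Hand out a copy so callers cannot mutate the stored manifest directly.
        copy = dict(self.manifest, segments=list(self.manifest["segments"]))
        return self.version, copy

    def compare_and_swap(self, expected_version, new_manifest):
        if expected_version != self.version:
            return False  # someone else wrote since we read
        self.version += 1
        self.manifest = new_manifest
        return True


class Writer:
    def __init__(self, store):
        self.store = store
        while True:  # claim the namespace by bumping the epoch
            version, manifest = store.read()
            manifest["epoch"] += 1
            if store.compare_and_swap(version, manifest):
                self.epoch = manifest["epoch"]
                break

    def commit(self, segment):
        while True:
            version, manifest = self.store.read()
            if manifest["epoch"] != self.epoch:
                raise StaleWriter(f"epoch {self.epoch} was superseded")
            manifest["segments"].append(segment)
            if self.store.compare_and_swap(version, manifest):
                return


store = ManifestStore()
old = Writer(store)
old.commit("wal-1")
new = Writer(store)  # a second replica opens the same namespace
new.commit("wal-2")
try:
    old.commit("wal-3")  # the first replica is now fenced out
except StaleWriter as err:
    print("fenced:", err)
```

The stale writer never needs to be told it lost. It finds out the next time it tries to commit, because the manifest it reads no longer carries its epoch.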
A single Rust binary (or container image) opens a namespace and serves it over HTTP. This is the one for when the database lives on a different machine than the app, or when you want a network boundary with bearer-token auth.
# Install from source
cargo install --path crates/namidb-server
# Or build the Docker image (from the repo root)
docker build -t namidb-server:0.1 -f crates/namidb-server/Dockerfile .

namidb-server \
  --store "s3://your-bucket/data?ns=prod&region=us-east-1" \
  --listen 0.0.0.0:8080 \
  --auth-token "$NAMIDB_AUTH_TOKEN"

curl -X POST http://your-host:8080/v0/cypher \
-H "Authorization: Bearer $NAMIDB_AUTH_TOKEN" \
-H 'Content-Type: application/json' \
-d '{"query": "MATCH (n:Person) RETURN count(n) AS n"}'
# {"columns":["n"],"rows":[{"n": 42}]}

See crates/namidb-server/README.md
for the full route reference (/v0/cypher, /v0/health,
/v0/version, /v0/metrics, /v0/admin/flush), the metrics and
slow-query log, the JSON to Cypher type mapping, and the concurrency
model.
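The same call works from Python without any client library. The sketch below mirrors the curl example using only the standard library. `build_cypher_request` and `run_cypher` are hypothetical helper names; the route, bearer header and `{"query": ...}` body come from the example above, and anything beyond that (such as how parameters are passed over HTTP) is left out because it is covered in the server README.

```python
import json
import urllib.request


def build_cypher_request(base_url, token, query):
    """Build the POST /v0/cypher request without sending it."""
    body = json.dumps({"query": query}).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/v0/cypher",
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


def run_cypher(base_url, token, query, timeout=10.0):
    """Send the query and decode the JSON response body."""
    request = build_cypher_request(base_url, token, query)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling `run_cypher("http://your-host:8080", token, "MATCH (n:Person) RETURN count(n) AS n")` against a running server should return the decoded form of the JSON shown in the curl example. Keeping request construction separate from sending makes the helper easy to unit-test without a server.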
A complete self-hosted database in one file. Bring your own auth token, everything else is wired up:
# docker-compose.yml
services:
  minio:
    image: minio/minio
    command: server /data --console-address ":9001"
    environment:
      MINIO_ROOT_USER: minioadmin
      MINIO_ROOT_PASSWORD: minioadmin
    volumes:
      - minio-data:/data
    healthcheck:
      test: ["CMD", "mc", "ready", "local"]
      interval: 3s
      retries: 30
  bucket-init:
    image: minio/mc
    depends_on:
      minio:
        condition: service_healthy
    entrypoint: >
      sh -c "
      mc alias set local http://minio:9000 minioadmin minioadmin &&
      mc mb --ignore-existing local/namidb
      "
  namidb-server:
    image: namidb-server:0.1 # built from crates/namidb-server/Dockerfile
    depends_on:
      bucket-init:
        condition: service_completed_successfully
    environment:
      NAMIDB_STORE: "s3://namidb?ns=prod&endpoint=http://minio:9000&region=us-east-1&allow_http=true"
      NAMIDB_LISTEN: "0.0.0.0:8080"
      NAMIDB_AUTH_TOKEN: "${NAMIDB_AUTH_TOKEN:?set NAMIDB_AUTH_TOKEN in your env}"
      NAMIDB_FLUSH_INTERVAL: "30s"
      AWS_ACCESS_KEY_ID: "minioadmin"
      AWS_SECRET_ACCESS_KEY: "minioadmin"
    ports:
      - "8080:8080"
volumes:
  minio-data: {}

export NAMIDB_AUTH_TOKEN=$(openssl rand -hex 32)
docker compose up -d
curl -s http://localhost:8080/v0/health | jq .

That's it. A graph database, your data sitting in MinIO, and an
authenticated REST API on :8080. Swap the NAMIDB_STORE URI and the
same setup moves to AWS S3, R2, GCS or Azure without touching anything
else.
# Ephemeral in-memory namespace, same as the quickstart.
namidb run "CREATE (a:Person {name: 'Alice'}), (b:Person {name: 'Bob'})"
namidb run "MATCH (p:Person) RETURN p.name"
# Persistent. Any URI scheme works.
namidb run --store "file:///var/lib/namidb?ns=prod" \
"CREATE (a:Person {name: 'Alice'})"
namidb run --store "file:///var/lib/namidb?ns=prod" \
"MATCH (p:Person) RETURN p.name"
namidb run --store "s3://my-bucket/data?ns=prod&region=us-west-2" \
"MATCH (p:Person) RETURN count(*) AS n"
# Plan inspection. Doesn't touch storage.
namidb explain --verbose \
  "MATCH (a:Person)-[:KNOWS]->(b) RETURN b ORDER BY b.id LIMIT 20"

See crates/namidb-cli/README.md
for every subcommand.
use std::sync::Arc;
use namidb_core::id::NamespaceId;
use namidb_query::{execute, lower, parse, Params};
use namidb_storage::{parse_uri, WriterSession};
#[tokio::main]
async fn main() -> anyhow::Result<()> {
// Any supported URI scheme: memory://, file://, s3://, gs://, az://.
let (store, paths) = parse_uri("memory://demo")?;
let mut writer = WriterSession::open(store, paths).await?;
// ... upsert nodes / edges, then commit_batch + flush ...
let snap = writer.snapshot();
let query = parse("MATCH (a:Person) RETURN count(*) AS n")?;
let plan = lower(&query)?;
let rows = execute(&plan, &snap, &Params::new()).await?;
println!("{rows:?}");
Ok(())
}

The umbrella crate (crates/namidb/) re-exports
the stable surface, so a downstream Cargo.toml only needs the one
line.
┌─────────────────────────────────────────────────────────────────────┐
│ Cypher · GQL (ISO/IEC 39075:2024) │
│ Cost-based optimizer · Morsel-driven executor · Factorization │
├─────────────────────────────────────────────────────────────────────┤
│ Property graph · CSR adjacency · Columnar SSTs │
├─────────────────────────────────────────────────────────────────────┤
│ LSM tree · WAL · Memtable · SST · Manifest CAS │
│ Hybrid buffer pool (memory + NVMe) │
├─────────────────────────────────────────────────────────────────────┤
│ S3 · R2 · GCS · Azure Blob · MinIO · Tigris · Local FS │
└─────────────────────────────────────────────────────────────────────┘
Design proposals live in docs/rfc/. Start with
RFC-001 for the storage engine and
RFC-002 for the SST format.
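If CSR adjacency is new to you, the idea fits in a few lines: all neighbour lists are packed into one flat array, and a second array of offsets says where each node's slice starts and ends. The sketch below is a generic in-memory illustration with hypothetical names (`build_csr`, `neighbors`). It says nothing about the on-disk edge-SST layout, which RFC-002 covers.

```python
def build_csr(num_nodes, edges):
    """Turn (src, dst) pairs into CSR form: an offsets array and a targets array."""
    offsets = [0] * (num_nodes + 1)
    for src, _ in edges:
        offsets[src + 1] += 1  # count the out-degree of each source
    for i in range(num_nodes):
        offsets[i + 1] += offsets[i]  # prefix sum turns counts into slice bounds
    targets = [0] * len(edges)
    cursor = offsets[:-1].copy()  # next free slot inside each node's slice
    for src, dst in edges:
        targets[cursor[src]] = dst
        cursor[src] += 1
    return offsets, targets


def neighbors(offsets, targets, node):
    """Out-neighbours of `node` are one contiguous slice of `targets`."""
    return targets[offsets[node]:offsets[node + 1]]


offsets, targets = build_csr(3, [(0, 1), (0, 2), (2, 0), (1, 2)])
print(offsets)                         # [0, 2, 3, 4]
print(targets)                         # [1, 2, 2, 0]
print(neighbors(offsets, targets, 0))  # [1, 2]
```

Expanding a node's neighbours is a single contiguous read with no pointer chasing, which is why the layout pairs well with columnar files and caches.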
A handful of env vars you can tune. The defaults are fine for almost everything; you mostly reach for these when you're chasing down a performance or memory problem.
| Env var | Default | What it does |
|---|---|---|
| NAMIDB_ADJACENCY | ON | CSR adjacency in RAM, shared across snapshots (RFC-018). |
| NAMIDB_NODE_CACHE | ON | Cross-snapshot NodeView lookup cache (RFC-019). |
| NAMIDB_SST_CACHE | ON | SST body, decoded edge property streams, and the parsed EdgeSstReader (RFC-020). |
| NAMIDB_FACTORIZE | OFF | Factorized intermediate results in the executor (RFC-017). |
| NAMIDB_PROFILE_DUMP | OFF | Dump per-stage profile counters to stderr after each query. |
namidb-server adds a few of its own:
| Env var | Default | What it does |
|---|---|---|
| NAMIDB_STORE | (required) | Storage URI, e.g. s3://bucket?ns=prod. |
| NAMIDB_LISTEN | 0.0.0.0:8080 | TCP bind address. |
| NAMIDB_AUTH_TOKEN | unset (open) | Bearer token. When it's unset the server warns and accepts every request. |
| NAMIDB_FLUSH_INTERVAL | 30s | Background memtable -> L0 flush cadence. 0s disables it. |
| NAMIDB_SLOW_QUERY_THRESHOLD | 1s | Log any query at or above this wall-clock at WARN. 0s disables the slow-query log. |
.
├── Cargo.toml # Workspace manifest
├── rust-toolchain.toml # Pinned toolchain
├── LICENSE # BSL 1.1 (auto-converts to Apache 2.0)
├── README.md
├── CONTRIBUTING.md
├── docs/
│ └── rfc/ # Design proposals (RFC-001 to RFC-020)
├── crates/
│ ├── namidb-core/ # Common types, errors, schema
│ ├── namidb-storage/ # LSM, WAL, manifest, SST, memtable, URI parser, file:// CAS
│ ├── namidb-graph/ # Property columns + CSR adjacency
│ ├── namidb-query/ # Cypher / GQL parser, optimizer, executor
│ ├── namidb-cli/ # `namidb` command-line tool
│ ├── namidb-py/ # Python bindings (PyO3 + maturin)
│ ├── namidb-server/ # `namidb-server` HTTP daemon + Dockerfile
│ ├── namidb-bench/ # LDBC-shaped synthetic bench harness
│ └── namidb/ # Public façade crate
├── bench/ # LDBC SNB Interactive bench harness
└── tests/ # Integration helpers (LocalStack, R2 wrapper)
| Resource | Where |
|---|---|
| Website | namidb.com |
| Reference docs & guides | docs.namidb.com |
| Design RFCs | docs/rfc/ |
| Python bindings | crates/namidb-py/README.md |
| HTTP server | crates/namidb-server/README.md |
| CLI | crates/namidb-cli/README.md |
| Benchmark harness | bench/README.md |
- Cloud (closed beta). Multi-tenant SaaS on namidb.com with per-namespace scale-to-zero, encrypted-at-rest tenants, and a hosted control plane. Request access.
- Streaming responses. /v0/cypher/stream (NDJSON) and /v0/cypher/arrow (Arrow IPC) for zero-copy DataFrame ingestion.
- Concurrent reads. RFC-021 takes the single-writer mutex off the read path so a namidb-server can fan reads out across every core.
We develop in the open. Have a look at CONTRIBUTING.md
and the RFCs in docs/rfc/ before you start. Anything
non-trivial goes through an RFC first.
NamiDB is licensed under the Business Source License 1.1.
- Free for development, testing, internal production use, and anything that doesn't compete with a hosted NamiDB offering from the Licensor.
- Converts automatically to the Apache License 2.0 three years after each release.
- A separate commercial license is available if you need to embed or
redistribute NamiDB outside what BSL allows, including running it as
a hosted database service. Reach us at
info@namidb.com.
A few projects this leans on, directly or for ideas:
- Kùzu, for showing that columnar storage, CSR adjacency and factorization are the right model for property graphs.
- SlateDB, for the canonical recipe for LSM trees on object storage.
- turbopuffer, for proving that namespace-per-tenant on S3 is a viable SaaS architecture.
- Apache Arrow, Parquet and DataFusion, for the columnar foundation.
- foyer-rs, for the hybrid memory and disk cache.
NamiDB is a product of LESAI, Corp., Delaware, USA.
© 2026 LESAI, Corp. All rights reserved.