wd-fuse is a read-only Wikidata FUSE prototype built on libfuse's high-level API.
The mount is intentionally hybrid:
- FUSE serves a read-only namespace.
- A Python materializer fetches `Special:EntityData` on first access.
- Fetched entities are frozen into a mount-private backing tree for snapshot stability.
- Reverse edges come from a local prebuilt index, never from remote on-demand queries.
- Default graph view: `truthy/`
- Full statement view: `full/`
- Entity-to-entity values: symlinks into `/entities/<id>`
- Literal values: `.txt` or tiny `.json`
- Ranks, qualifiers, references: nested under full statement directories
- Raw source views: `raw.json`, `raw.ttl`
- Large value sets: paginated under `pages/<nnnn>/`
- Writes: rejected
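
The pagination rule in the list above can be pictured with a small sketch. The function name `paginate`, the six-digit entry names, and the exact threshold behaviour (flat up to the page size, paged beyond it) are assumptions made for illustration; they are not taken from the wd-fuse source.

```python
def paginate(values, page_size=256):
    """Map relative paths to values, paging only when the set is large.

    Assumption: sets with at most `page_size` entries stay flat; larger
    sets are split into pages/<nnnn>/ directories of `page_size` entries.
    """
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    layout = {}
    paged = len(values) > page_size
    for index, value in enumerate(values):
        name = f"{index:06d}"  # hypothetical entry name, e.g. 000000
        if paged:
            page = index // page_size
            name = f"pages/{page:04d}/{name}"
        layout[name] = value
    return layout
```

With `page_size=2`, five values would land in `pages/0000/` through `pages/0002/` under these assumptions, while a single value would stay at the top level.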

Mount layout:

    /
    ├── README.txt
    ├── snapshot.json
    └── entities/
        └── Q42/
            ├── id.txt
            ├── type.txt
            ├── modified.txt
            ├── revision.txt
            ├── labels/
            ├── descriptions/
            ├── aliases/
            ├── truthy/
            │   └── by-property/
            │       └── P31/
            │           ├── property -> ../../../../../entities/P31
            │           └── values/
            ├── full/
            │   └── by-property/
            │       └── P31/
            │           ├── property -> ../../../../../entities/P31
            │           └── statements/
            ├── incoming/
            │   └── by-property/
            ├── raw.json
            └── raw.ttl

Truthy values use the best-rank Wikidata projection: preferred statements if any exist for a property, otherwise all non-deprecated statements.
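
The best-rank rule can be written down in a few lines. This is a minimal sketch, assuming each statement is a dict with a `"rank"` key holding `"preferred"`, `"normal"`, or `"deprecated"` as in Wikidata's entity JSON; the function name `truthy_statements` is hypothetical and not part of wd-fuse.

```python
def truthy_statements(statements):
    """Return the best-rank subset of one property's statements.

    Preferred statements win if any exist; otherwise every
    non-deprecated statement is kept.
    """
    preferred = [s for s in statements if s.get("rank") == "preferred"]
    if preferred:
        return preferred
    return [s for s in statements if s.get("rank") != "deprecated"]
```

The rule is applied per property, so a preferred statement on one property does not hide normal-rank statements on another.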
`snapshot.json` is written once at mount time. Fields:
| Field | Type | Description |
|---|---|---|
| `kind` | string | Always `"wd-fuse-snapshot"` |
| `cache_root` | string | Absolute path to the backing tree |
| `created_at` | string | ISO 8601 UTC timestamp of mount creation |
| `page_size` | number | Pagination threshold used for this generation |
| `revision_pin` | string \| null | Revision pinned via `--revision-pin`, or null |
| `incoming_index` | string \| null | Path passed via `--incoming-index`, or null |
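
A script that consumes the mount may want to check these fields before trusting the tree. The sketch below is one way to do that; `load_snapshot` is a hypothetical helper, and it assumes `snapshot.json` sits at the mount root as a single JSON object.

```python
import json
from pathlib import Path

EXPECTED_KIND = "wd-fuse-snapshot"
FIELDS = ("kind", "cache_root", "created_at", "page_size",
          "revision_pin", "incoming_index")

def load_snapshot(mountpoint):
    """Read <mountpoint>/snapshot.json and check the documented fields."""
    text = Path(mountpoint, "snapshot.json").read_text(encoding="utf-8")
    data = json.loads(text)
    missing = [name for name in FIELDS if name not in data]
    if missing:
        raise ValueError(f"snapshot.json is missing fields: {missing}")
    if data["kind"] != EXPECTED_KIND:
        raise ValueError(f"unexpected kind: {data['kind']!r}")
    return data
```

For example, `load_snapshot("/tmp/wd-mount")["page_size"]` would return the page size in effect for the current mount generation.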
Runtime:

    # Debian / Ubuntu
    sudo apt install fuse3 libfuse3-3 python3
    # Fedora / RHEL
    sudo dnf install fuse3 fuse3-libs python3

Build:

    # Debian / Ubuntu
    sudo apt install cmake gcc libfuse3-dev
    # Fedora / RHEL
    sudo dnf install cmake gcc fuse3-devel

Python 3.10 or newer is required for the materializer (`str | None` union syntax).

The repo vendors the public libfuse headers because this environment only exposes the runtime library. You still need the shared library and `fusermount3` on the host.

Build with CMake:

    cmake -S . -B build
    cmake --build build

Mount with an incoming index and an explicit page size:

    mkdir -p /tmp/wd-mount
    ./build/wd-fuse \
      --incoming-index examples/incoming.jsonl \
      --page-size 256 \
      /tmp/wd-mount

Mount with an explicit cache root:

    mkdir -p /tmp/wd-cache /tmp/wd-mount
    ./build/wd-fuse \
      --cache-root /tmp/wd-cache \
      --incoming-index examples/incoming.jsonl \
      /tmp/wd-mount

Run in the foreground to keep error output visible (recommended for first-time use and debugging):

    ./build/wd-fuse --incoming-index examples/incoming.jsonl /tmp/wd-mount -f

Enable FUSE-level debug output:

    ./build/wd-fuse --incoming-index examples/incoming.jsonl /tmp/wd-mount -d

Then browse:

    ls /tmp/wd-mount/entities/Q42
    readlink /tmp/wd-mount/entities/Q42/truthy/by-property/P31/values/000000
    cat /tmp/wd-mount/entities/Q42/full/by-property/P31/statements/count.txt

Unmount with:

    fusermount3 -u /tmp/wd-mount

`--incoming-index` accepts either:
- a JSON Lines file with records like `{"target":"Q42","property":"P50","source":"Q25169"}`
- a directory of per-target `.json` or `.jsonl` shard files

A single file where every line is one reverse-edge record:

    {"target":"Q42","property":"P50","source":"Q25169"}
    {"target":"Q42","property":"P31","source":"Q463035"}
    {"target":"Q5","property":"P31","source":"Q42"}

For large indexes, split into per-entity files under one of these layouts (tried in order):

    index/
    ├── Q42.jsonl          # flat: index/Q42.jsonl
    ├── Q/
    │   └── Q42.jsonl      # one-char prefix: index/Q/<id>.jsonl
    └── QA/
        └── Q42.jsonl      # two-char prefix: index/QA/<id>.jsonl

Each shard file may be `.jsonl` (one record per line) or `.json` (array or `{"edges":[...]}` object). The `"target"` field may be omitted in per-entity shards; it is inferred from the filename.
The prototype reads only local index data for reverse edges — no remote queries are issued.
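
To make the shard formats concrete, here is a minimal sketch of reading one per-entity shard and grouping its edges by property, roughly mirroring `incoming/by-property/`. The helper names `load_shard` and `group_by_property` are hypothetical, and the sketch does not reproduce the layout lookup order; it only assumes the three file shapes described above.

```python
import json
from collections import defaultdict
from pathlib import Path

def load_shard(path):
    """Parse one per-entity shard into a list of edge records."""
    path = Path(path)
    text = path.read_text(encoding="utf-8")
    if path.suffix == ".jsonl":
        edges = [json.loads(line) for line in text.splitlines() if line.strip()]
    else:
        data = json.loads(text)
        # A .json shard is either a bare array or an {"edges": [...]} object.
        edges = data["edges"] if isinstance(data, dict) else data
    for edge in edges:
        # The target may be omitted; infer it from the filename (Q42.jsonl -> Q42).
        edge.setdefault("target", path.stem)
    return edges

def group_by_property(edges):
    """Group source ids by property id."""
    grouped = defaultdict(list)
    for edge in edges:
        grouped[edge["property"]].append(edge["source"])
    return dict(grouped)
```

Blank lines in `.jsonl` shards are skipped here; whether wd-fuse tolerates them is not stated in this document.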
- Supported entity ids in v0: `Q...` and `P...`
- `raw.ttl` uses `Special:EntityData/<id>.ttl?flavor=dump`
- Without `--revision-pin`, each entity is fixed at the revision first fetched during the mount generation
- Entity materialization may take 1–3 seconds per entity (two HTTP requests to `www.wikidata.org`). Transient errors are retried up to three times with exponential back-off.

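
The retry behaviour in the last item can be sketched as follows. This is not the materializer's code: `fetch_with_retry`, the 0.5 s base delay, the choice to treat only HTTP 429 and 5xx as transient, and reading "three times" as three attempts in total are all assumptions for illustration.

```python
import time
import urllib.error
import urllib.request

def fetch_with_retry(url, attempts=3, base_delay=0.5,
                     opener=urllib.request.urlopen, sleep=time.sleep):
    """Fetch `url`, retrying transient failures with exponential back-off."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            with opener(url, timeout=10) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            # Assumption: only 429 and 5xx responses are worth retrying.
            if err.code != 429 and err.code < 500:
                raise
            last_error = err
        except (urllib.error.URLError, TimeoutError) as err:
            last_error = err
        if attempt + 1 < attempts:
            sleep(base_delay * 2 ** attempt)  # delay doubles after each failure
    raise last_error
```

A call such as `fetch_with_retry("https://www.wikidata.org/wiki/Special:EntityData/Q42.json")` would return the raw response bytes or re-raise the last error once the attempts are exhausted.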
`Transport endpoint is not connected`: the mount process exited without cleanly unmounting. Force-unmount with:

    fusermount3 -uz /tmp/wd-mount

`Input/output error` on a specific entity: the materializer failed (network error, unexpected API response, etc.). Run with `-f` and check stderr. The entity directory is not created, so the next access attempts materialization again.

Nothing appears at the mountpoint: the FUSE process daemonized and may have exited immediately. Run with `-f` to keep output in the terminal and see any startup errors.
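
When scripting around the mount, the first and last cases above can be told apart programmatically. The sketch below is a hypothetical helper (`mount_state` is not part of wd-fuse); it relies on `os.stat` failing with `ENOTCONN` on a stale FUSE mount and on `os.path.ismount` for the rest.

```python
import errno
import os

def mount_state(mountpoint):
    """Classify a mountpoint as 'mounted', 'stale', or 'not-mounted'."""
    try:
        os.stat(mountpoint)
    except OSError as err:
        if err.errno == errno.ENOTCONN:
            # "Transport endpoint is not connected": the daemon has exited.
            return "stale"
        raise
    return "mounted" if os.path.ismount(mountpoint) else "not-mounted"
```

A `stale` result would call for `fusermount3 -uz`, while `not-mounted` suggests the daemon exited at startup and should be rerun with `-f`.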