Yaz

A sub-1M-parameter, byte-level language model whose individual facts you can create, read, update, and delete one at a time — with provable per-edit locality — and that abstains when it isn't confident which fact you mean, instead of guessing. Runs on CPU, offline.

Status: research prototype. Everything here is small-scale and honestly scoped (see Caveats). It is a clean, reproducible demonstration — not a production system and not a state-of-the-art result.

Idea

Each fact lives in its own addressable atom (one column of an additive decoder). A prompt is routed to a fact by a frozen sentence embedding (so paraphrases reach the same fact), and the routed atom contributes the answer. Because facts are disjoint columns:

UPDATE a fact = swap one decoder column (no retraining).
DELETE a fact = zero its atom (others provably untouched).
CREATE a fact = allocate a fresh atom.
Locality is structural: editing fact A cannot change fact B's output (given no routing collision).
Abstention: the routing confidence margin (top-1 minus top-2) is a calibrated "I don't know which fact you mean" signal — the model refuses low-confidence queries.
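
The operations listed above can be illustrated with a small, dependency-free sketch. It is not the repository's implementation: the `AtomStore` class, its method names, the 2-D toy key vectors (standing in for frozen sentence embeddings), and the `min_margin` value are all assumptions made for the example. Each fact owns one 256-way byte-logit column, so an edit touches exactly one column, and a read abstains when the top-1 minus top-2 routing margin is small.

```python
import math


class AtomStore:
    """Toy fact store: one routing key and one byte-logit column per fact.

    Hypothetical illustration; names, shapes, and threshold are assumptions.
    """

    def __init__(self, min_margin=0.1):
        self.keys = {}     # fact id -> unit-norm key (stand-in for an embedding)
        self.columns = {}  # fact id -> 256 byte logits (the fact's atom)
        self.min_margin = min_margin

    @staticmethod
    def _unit(v):
        norm = math.sqrt(sum(x * x for x in v))
        return [x / norm for x in v]

    @staticmethod
    def _column(answer):
        col = [0.0] * 256
        col[answer.encode("utf-8")[0]] = 1.0  # only the first byte is encoded
        return col

    def create(self, fact_id, key, answer):
        """CREATE: allocate a fresh atom."""
        self.keys[fact_id] = self._unit(key)
        self.columns[fact_id] = self._column(answer)

    def update(self, fact_id, answer):
        """UPDATE: swap one column; no other column is read or written."""
        self.columns[fact_id] = self._column(answer)

    def delete(self, fact_id):
        """DELETE: zero the atom but keep its routing slot."""
        self.columns[fact_id] = [0.0] * 256

    def route(self, query):
        """Return (best fact id, top-1 minus top-2 cosine margin)."""
        q = self._unit(query)
        scored = sorted(
            ((sum(a * b for a, b in zip(q, k)), fid) for fid, k in self.keys.items()),
            reverse=True,
        )
        if not scored:
            return None, 0.0
        runner_up = scored[1][0] if len(scored) > 1 else -1.0
        return scored[0][1], scored[0][0] - runner_up

    def read(self, query):
        """READ: first answer byte, or None to abstain / report a deleted fact."""
        fid, margin = self.route(query)
        if fid is None or margin < self.min_margin:
            return None
        col = self.columns[fid]
        if not any(col):
            return None
        return bytes([max(range(256), key=col.__getitem__)])


store = AtomStore()
store.create("France", [1.0, 0.0], "Paris")
store.create("Japan", [0.0, 1.0], "Tokyo")
japan_before = store.read([0.1, 1.0])
store.update("France", "Lima")                    # edit one fact
france_after = store.read([1.0, 0.1])
japan_unchanged = store.read([0.1, 1.0]) == japan_before
ambiguous = store.read([1.0, 1.0])                # equal scores, margin 0
```

In this toy, locality holds by construction because `update` and `delete` index a single dictionary entry. The guarantee disappears if two facts share a key, which is the routing-collision condition mentioned above.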

Quick start

# 1) deps (CPU-only; sentence-transformers pulls all-MiniLM-L6-v2 on first run)
pip install -r requirements.txt

# 2) try the demo (routes a prompt, answers, or abstains; edits/deletes are live)
python demo.py --demo
python demo.py --prompt "the country of the Eiffel Tower, its capital is "
python demo.py --prompt "The capital of France is " --edit France=Lima
python demo.py --prompt "best pizza topping?"        # -> ABSTAIN (out of scope)

The router uses sentence-transformers/all-MiniLM-L6-v2 out of the box — no local paths or private packages required. Two optional environment variables exist for advanced use: YAZ_EMBEDDER_PATH (point at an alternative Embedder package; the bundled MiniLM is used if unset) and YAZ_TINYSTORIES_DIR (a TinyStories corpus, only needed for the optional bits-per-character side-checks).
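
As a rough sketch of the fallback behaviour described above, the two variables could be resolved as follows. The `resolve_settings` helper and the shape of its return value are hypothetical; only the variable names and the default model name come from this README, and the repository's own loading code may differ.

```python
import os

# Default router model named in this README.
DEFAULT_EMBEDDER = "sentence-transformers/all-MiniLM-L6-v2"


def resolve_settings(env=None):
    """Read the two optional variables; unset or empty values count as absent.

    Hypothetical helper for illustration, not part of the Yaz package.
    """
    env = os.environ if env is None else env
    return {
        # Alternative embedder location, else the bundled MiniLM default.
        "embedder": env.get("YAZ_EMBEDDER_PATH") or DEFAULT_EMBEDDER,
        # Only needed for the optional bits-per-character side-checks.
        "tinystories_dir": env.get("YAZ_TINYSTORIES_DIR") or None,
    }


settings = resolve_settings({"YAZ_TINYSTORIES_DIR": "/data/tinystories"})
```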

A trained checkpoint (checkpoints/yaz_gen_semantic_v2.pt) ships with the repo; retrain with python scripts/train_gen.py configs/semantic_v2.json.

Reproduce the results

python scripts/scaling/s3_route_abstain.py     # abstention risk-coverage (AURC)
python scripts/scaling/s4_create_primitive.py  # the CREATE 4-condition battery
pytest -q                                       # smoke test asserting a headline number

All runs are deterministic (seed 2026) and CPU-only. Results are written to results/.

What it can do (measured)

capability	result
Edit a fact, no retraining (UPDATE)	in-distribution reliability 1.000; edits land 8/8 (first byte)
Delete a fact	fact gone, 0 collateral on others
Create a new fact	passes the 4-condition battery (monosemantic / local / readable / deletable)
Provable per-edit locality	0/10 collateral, bits per character (bpc) +0.000% across 40 sequential edits
No sequential-edit collapse	retention flat 1.000 over 40 edits
Paraphrase-robust routing	held-out reach 0.696 (vs. 0.216 for surface routing)
Abstain when unsure	near-oracle: risk-coverage AURC 0.004 (oracle 0.003)
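
For readers unfamiliar with the AURC figure in the last row, here is a minimal sketch of one common way to compute the area under the risk-coverage curve. It may not match what scripts/scaling/s3_route_abstain.py does exactly. The function names and the toy margins are assumptions; the numbers in the example are illustrative and are not Yaz results.

```python
def aurc(confidences, correct):
    """Mean selective risk over coverage levels 1/n, 2/n, ..., n/n.

    Queries are answered in order of decreasing confidence; at each
    coverage level the risk is the error rate among answered queries.
    """
    if not confidences or len(confidences) != len(correct):
        raise ValueError("need two non-empty sequences of equal length")
    order = sorted(range(len(confidences)), key=lambda i: confidences[i], reverse=True)
    errors = 0
    total_risk = 0.0
    for answered, i in enumerate(order, start=1):
        errors += 0 if correct[i] else 1
        total_risk += errors / answered
    return total_risk / len(order)


def oracle_aurc(correct):
    """Best achievable AURC: every correct answer ranked above every wrong one."""
    return aurc([1.0 if c else 0.0 for c in correct], correct)


# Toy data: routing margins used as confidences, plus per-query correctness.
margins = [0.9, 0.8, 0.3, 0.1]
is_correct = [True, True, False, True]
toy_aurc = aurc(margins, is_correct)
toy_oracle = oracle_aurc(is_correct)
```

Lower is better. The gap between a model's AURC and the oracle value shows how much is lost because the confidence signal ranks some wrong answers above correct ones.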

Caveats

First-byte editor. Edits reliably set the answer's first byte; multi-byte generation is not faithful (full-word transfer ≈ 0.05). Treat the first character as the signal.
Routing degrades on hard clues (≈0.85 on famous entities → ≈0.50 on oblique, name-free ones).
Locality is structural — it holds while no two facts route to the same atom; collisions can occur at larger fact counts.
Tiny, synthetic scope — 50 country→capital facts, single seed, CPU. Not validated at scale or on open-vocabulary knowledge.
No novelty claim. The mechanisms (sentence-embedding-keyed editing, selective prediction) each already appear in the published literature. Yaz is a clean, reproducible prototype, not a unique capability.
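
Since the locality caveat depends on facts not colliding in the router, one simple way to look for at-risk pairs is to compare fact keys pairwise. The sketch below is an assumption-laden illustration, not something the repository is known to ship: `find_collision_risks`, the 0.9 threshold, and the 3-D toy keys are all hypothetical.

```python
import math
from itertools import combinations


def cosine(u, v):
    """Cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def find_collision_risks(keys, threshold=0.9):
    """Return (id_a, id_b, similarity) for key pairs at or above threshold.

    Pairs are sorted from most to least similar. The comparison is
    quadratic in the number of facts, which is fine at toy scale.
    """
    risky = []
    for (id_a, key_a), (id_b, key_b) in combinations(keys.items(), 2):
        similarity = cosine(key_a, key_b)
        if similarity >= threshold:
            risky.append((id_a, id_b, similarity))
    return sorted(risky, key=lambda item: item[2], reverse=True)


# Toy keys: two facts with nearly identical routing keys, one distinct fact.
toy_keys = {
    "France": [1.0, 0.0, 0.0],
    "Peru": [0.0, 1.0, 0.0],
    "Peru (duplicate)": [0.05, 1.0, 0.0],
}
at_risk = find_collision_risks(toy_keys)
```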

Layout

yaz/            model + semantic router
scripts/        training, eval, and reproduction scripts
data/           synthetic country→capital facts + paraphrase probes
checkpoints/    trained model(s)
results/        result JSON written by the scripts
demo.py         the CLI demo
tests/          smoke test

License

MIT — see LICENSE.
