TOON-LD

Token-Oriented Object Notation for Linked Data — A lossless Knowledge Graph Compression format for LLM Context Windows.

TOON-LD reduces token usage by 40-60% compared to JSON-LD, so roughly twice as much structured data fits into your prompts for RAG (Retrieval-Augmented Generation) applications.

It works by extending standard TOON syntax with Linked Data semantics, meaning every valid TOON-LD document is also a valid TOON document. Base TOON parsers can process it natively, while TOON-LD processors unlock the full semantic graph.

Why TOON-LD?

The Problem: Knowledge graphs serialized as JSON-LD are verbose, because every node repeats the same keys. Using them in RAG pipelines burns through token budgets and hits context limits quickly.

The Solution: TOON-LD acts as a compression layer. It combines the semantic expressiveness of RDF with radical token efficiency through tabular arrays. By eliminating repetitive keys and using CSV-like rows for uniform data, TOON-LD fits significantly more information into LLM context windows without losing structure.

Features

Pure TOON Extension: Every TOON-LD document is valid TOON (just as every JSON-LD document is valid JSON)
Tabular Arrays: Serialize arrays of objects as CSV-like rows with shared headers
40-60% Token Reduction: Fewer tokens means lower costs and more data in context
Full JSON-LD Compatibility: Round-trip conversion without data loss
All JSON-LD 1.1 Keywords: Complete support for @context, @graph, @id, @type, value nodes, etc.
Cross-Platform: Rust, WebAssembly (npm), and Python (PyPI) implementations
High Performance: Optimized serialization with automatic tabular array detection

Benchmarks

Real-world token savings across different dataset sizes:

Records	JSON-LD Size	TOON-LD Size	Size Saved	Tokens Saved
10	862 B	518 B	39.9%	54.2%
100	8,782 B	5,109 B	41.8%	56.3%
1,000	90,682 B	53,710 B	40.8%	56.5%
10,000	936,682 B	566,711 B	39.5%	53.4%

Key takeaway: Token savings stay between roughly 53% and 57% from 10 to 10,000 records, and consistently exceed the byte savings of about 40%.
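
The Size Saved column follows directly from the two byte counts. As a quick sanity check, the sketch below recomputes it from the table above. The helper name `percent_saved` is purely illustrative and is not part of TOON-LD. Token counts depend on the tokenizer, so the Tokens Saved column cannot be rederived from byte sizes alone.

```python
# Byte sizes copied from the benchmark table: records -> (JSON-LD, TOON-LD).
SIZES = {
    10: (862, 518),
    100: (8_782, 5_109),
    1_000: (90_682, 53_710),
    10_000: (936_682, 566_711),
}


def percent_saved(before: int, after: int) -> float:
    """Relative reduction from `before` to `after`, in percent (one decimal)."""
    return round(100 * (before - after) / before, 1)


size_saved = {n: percent_saved(j, t) for n, (j, t) in SIZES.items()}
for n, pct in size_saved.items():
    print(f"{n:>6} records: {pct}% smaller")
```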

Sparsity Analysis

TOON-LD's efficiency depends on data sparsity. Shape-based partitioning (enabled by default) ensures TOON-LD remains efficient even for highly heterogeneous data.

Low Sparsity (0-30%): Both Union and Partition approaches save ~40-50% tokens.
High Sparsity (60%+): Partitioning significantly outperforms the Union schema, maintaining efficiency where standard tabular formats fail.
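
To make the percentages concrete, here is a minimal sketch of one way to measure sparsity. It assumes sparsity means the fraction of empty cells when every record is laid out under the union of all field names. That definition and the function name `sparsity` are assumptions for illustration; the benchmark may measure it differently.

```python
def sparsity(records: list[dict]) -> float:
    """Fraction of empty cells when all records share one union-of-keys header."""
    if not records:
        return 0.0
    union = set().union(*(r.keys() for r in records))
    total_cells = len(records) * len(union)
    filled = sum(len(r) for r in records)
    return (total_cells - filled) / total_cells if total_cells else 0.0


# Three sample records: two of the nine cells would be null in a union table.
people = [
    {"foaf:name": "Alice", "foaf:age": 30},
    {"foaf:name": "Bob", "vcard:locality": "Portland"},
    {"foaf:name": "Carol", "foaf:age": 28, "vcard:locality": "Seattle"},
]
print(f"{sparsity(people):.0%}")
```

Under this definition the sample sits in the low-sparsity band, where a single union table is still a reasonable layout.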

Token Cost Analysis

Union Schema: High cost when null_count is large (sparse data).
Partitioned Schema: Low cost when partitions have dense, non-overlapping fields.

Break-even point: at around 30% sparsity, the Union and Partitioned schemas cost roughly the same.

Partitioning excels with:

High field diversity (heterogeneous graphs)
Large datasets
Mixed entity types
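
The trade-off can be illustrated with a small sketch that groups records by their exact key set (their "shape") and compares how many cells each layout needs. This is an assumption-laden illustration, not the serializer's actual algorithm: the function names and the sample graph are hypothetical, and cell counts are only a rough proxy for tokens because they ignore the extra header line each partition adds.

```python
from collections import defaultdict


def partition_by_shape(records: list[dict]) -> dict[tuple, list[dict]]:
    """Group records by their exact set of keys."""
    groups: dict[tuple, list[dict]] = defaultdict(list)
    for record in records:
        groups[tuple(sorted(record))].append(record)
    return dict(groups)


def union_cells(records: list[dict]) -> int:
    """Cells in one table whose header is the union of all keys (nulls included)."""
    header = set().union(*(r.keys() for r in records)) if records else set()
    return len(records) * len(header)


def partitioned_cells(records: list[dict]) -> int:
    """Cells across one dense table per shape (no nulls by construction)."""
    groups = partition_by_shape(records)
    return sum(len(shape) * len(rows) for shape, rows in groups.items())


# Hypothetical mixed graph: people and books share no field except @id.
graph = [
    {"@id": "ex:1", "foaf:name": "Alice", "foaf:age": 30},
    {"@id": "ex:2", "foaf:name": "Bob", "foaf:age": 25},
    {"@id": "ex:3", "dc:title": "The Hobbit", "dc:creator": "Tolkien"},
    {"@id": "ex:4", "dc:title": "Dune", "dc:creator": "Herbert"},
]
print(union_cells(graph), partitioned_cells(graph))
```

In this sample the union table needs 20 cells, 8 of them null, while the two partitions need 12 cells in total.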

Quick Example

JSON-LD:

{
  "@context": {
    "foaf": "http://xmlns.com/foaf/0.1/"
  },
  "@graph": [
    {"@id": "ex:1", "@type": "foaf:Person", "foaf:name": "Alice", "foaf:age": 30},
    {"@id": "ex:2", "@type": "foaf:Person", "foaf:name": "Bob", "foaf:age": 25}
  ]
}

TOON-LD:

@context:
  foaf: http://xmlns.com/foaf/0.1/
@graph[2]{@id,@type,foaf:age,foaf:name}:
  ex:1, foaf:Person, 30, Alice
  ex:2, foaf:Person, 25, Bob

Notice how object keys appear once in the header instead of repeating for each object.
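
The transformation in the Quick Example can be sketched in a few lines of Python. This is a simplified illustration rather than the library's implementation: `to_tabular` is a hypothetical helper, it assumes every object has exactly the same keys, it sorts the header alphabetically to match the example, and it does no quoting or escaping of values that contain the delimiter.

```python
def to_tabular(key: str, objects: list[dict], sep: str = ", ") -> str:
    """Render uniform objects as one header line plus one row per object."""
    if not objects:
        raise ValueError("expected at least one object")
    fields = sorted(objects[0])
    if any(sorted(obj) != fields for obj in objects):
        raise ValueError("objects do not share the same keys")
    # Header: key[count]{field1,field2,...}:
    header = f"{key}[{len(objects)}]{{{','.join(fields)}}}:"
    rows = ["  " + sep.join(str(obj[f]) for f in fields) for obj in objects]
    return "\n".join([header, *rows])


graph = [
    {"@id": "ex:1", "@type": "foaf:Person", "foaf:name": "Alice", "foaf:age": 30},
    {"@id": "ex:2", "@type": "foaf:Person", "foaf:name": "Bob", "foaf:age": 25},
]
toon = to_tabular("@graph", graph)
print(toon)
```

Each key is written once in the header, so the per-object cost drops to the values and delimiters.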

How TOON-LD Extends TOON

Just as JSON-LD extends JSON by adding semantic meaning to certain key names (those starting with @), TOON-LD extends TOON the same way:

No new syntax: TOON-LD uses only standard TOON syntax (objects, arrays, tabular format)
Semantic interpretation: Keys like @context, @id, @type have special JSON-LD meaning
Full compatibility: Any TOON parser can parse TOON-LD documents
Value nodes: Language tags and datatypes use tabular format for efficiency

Example value node with language tag:

title[2]{@value,@language}:
  The Hobbit,en
  Der Hobbit,de

This is standard TOON tabular syntax that base TOON parsers handle natively, while TOON-LD processors interpret it as JSON-LD value nodes.
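
To show what that interpretation step might look like, here is a minimal sketch that reads one tabular block back into JSON-LD value nodes. `parse_tabular` and the `HEADER` pattern are hypothetical names, not part of any TOON or TOON-LD API. The sketch assumes unquoted, comma-separated cells and leaves every value as a string.

```python
import re

# Matches headers such as: title[2]{@value,@language}:
HEADER = re.compile(r"^(?P<key>[^\[\s]+)\[(?P<count>\d+)\]\{(?P<fields>[^}]*)\}:$")


def parse_tabular(block: str) -> tuple[str, list[dict]]:
    """Parse one tabular block into (key, list of row objects)."""
    lines = [line.strip() for line in block.splitlines() if line.strip()]
    match = HEADER.match(lines[0]) if lines else None
    if match is None:
        raise ValueError("block does not start with a tabular header")
    fields = match["fields"].split(",")
    rows = [[cell.strip() for cell in line.split(",")] for line in lines[1:]]
    if len(rows) != int(match["count"]):
        raise ValueError("row count does not match the declared length")
    return match["key"], [dict(zip(fields, row)) for row in rows]


doc = """\
title[2]{@value,@language}:
  The Hobbit,en
  Der Hobbit,de
"""
key, nodes = parse_tabular(doc)
print(key, nodes)
```

The resulting dictionaries have the same shape as JSON-LD value objects with `@value` and `@language` entries.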

Installation

Rust

[dependencies]
toon-ld = "0.2"

CLI

cargo install toon-cli

Python

pip install toon-ld

JavaScript/TypeScript

npm install toon-ld

Quick Start

CLI

# Convert JSON-LD to TOON-LD
toon-ld convert -i data.jsonld -o data.toon

# Convert back to JSON-LD
toon-ld convert -i data.toon -o data.jsonld

# Run benchmark
toon-ld benchmark --max-records 10000

Rust

use toon_ld::{jsonld_to_toonld, toonld_to_jsonld};

let json_ld = r#"{"@context": {"foaf": "http://xmlns.com/foaf/0.1/"}, "foaf:name": "Alice"}"#;
let toon = jsonld_to_toonld(json_ld)?;
let back = toonld_to_jsonld(&toon)?;

Python

import toon_ld

json_ld = '{"@context": {"foaf": "http://xmlns.com/foaf/0.1/"}, "foaf:name": "Alice"}'
toon_str = toon_ld.convert_jsonld_to_toonld(json_ld)
json_str = toon_ld.convert_toonld_to_jsonld(toon_str)

JavaScript

import { convert_jsonld_to_toonld, convert_toonld_to_jsonld } from 'toon-ld';

const jsonLd = '{"@context": {"foaf": "http://xmlns.com/foaf/0.1/"}, "foaf:name": "Alice"}';
const toon = convert_jsonld_to_toonld(jsonLd);
const json = convert_toonld_to_jsonld(toon);

Key Concepts

Tabular Arrays

Arrays of objects share a header with field names, followed by CSV-like rows:

@context:
  foaf: http://xmlns.com/foaf/0.1/
  vcard: http://www.w3.org/2006/vcard/ns#
foaf:knows[3]{foaf:name,foaf:age,vcard:locality}:
  Alice, 30, null
  Bob, null, Portland
  Carol, 28, Seattle

Value Nodes

Language tags and datatypes use standard TOON object or tabular syntax:

@context:
  dc: http://purl.org/dc/terms/
  schema: http://schema.org/
  xsd: http://www.w3.org/2001/XMLSchema#
dc:title:
  @value: Bonjour
  @language: fr
schema:datePublished:
  @value: "2024-01-15"
  @type: xsd:date

Or using tabular format for multiple values:

dc:titles[2]{@value,@language}:
  Bonjour,fr
  Hello,en

Context Support

Automatic URI compaction using @context:

@context:
  foaf: http://xmlns.com/foaf/0.1/
foaf:name: Alice
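
As a rough illustration of prefix-based compaction, the sketch below shortens a full IRI with the longest matching namespace from a context and expands it again. The helper names are hypothetical, and this is far simpler than the JSON-LD 1.1 compaction algorithm, which also handles term definitions, @vocab, and more.

```python
def compact_iri(iri: str, context: dict[str, str]) -> str:
    """Replace the longest matching namespace IRI with its prefix."""
    best_prefix, best_ns = None, ""
    for prefix, namespace in context.items():
        if iri.startswith(namespace) and len(namespace) > len(best_ns):
            best_prefix, best_ns = prefix, namespace
    if best_prefix is None:
        return iri  # no namespace matched, keep the full IRI
    return f"{best_prefix}:{iri[len(best_ns):]}"


def expand_iri(term: str, context: dict[str, str]) -> str:
    """Inverse of compact_iri for terms of the form prefix:suffix."""
    prefix, sep, suffix = term.partition(":")
    if sep and prefix in context:
        return context[prefix] + suffix
    return term


context = {"foaf": "http://xmlns.com/foaf/0.1/"}
compact = compact_iri("http://xmlns.com/foaf/0.1/name", context)
print(compact)
```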

Project Structure

toon-core/ - Core Rust implementation
toon-cli/ - Command-line tool
toon-wasm/ - WebAssembly bindings (npm)
toon-py/ - Python bindings (PyPI)

Building from Source

# Build all workspace members
cargo build --release

# Run tests
cargo test --workspace

# Build WASM package
cd toon-wasm && wasm-pack build --target web

# Build Python wheel
cd toon-py && maturin build --release

License

MIT License - See LICENSE for details.
