A PHP port of toon-format/toon - a compact data format designed to reduce token consumption when sending structured data to Large Language Models.
TOON is a compact, human-readable format for structured data optimized for LLM contexts. For format details and efficiency analysis, see the TOON Specification.
Install via Composer:
composer require helgesverre/toon
Requirements:
- PHP 8.1 or higher
use HelgeSverre\Toon\Toon;
// Encode data
echo Toon::encode(['user' => 'Alice', 'score' => 95]);
// user: Alice
// score: 95
// Decode back to PHP
$data = Toon::decode("user: Alice\nscore: 95");
// ['user' => 'Alice', 'score' => 95]
Try it online at ArrayAlchemy.
use HelgeSverre\Toon\Toon;
// Simple values
echo Toon::encode('hello'); // hello
echo Toon::encode(42); // 42
echo Toon::encode(true); // true
echo Toon::encode(null); // null
// Arrays
echo Toon::encode(['a', 'b', 'c']);
// [3]: a,b,c
// Objects
echo Toon::encode([
    'id' => 123,
    'name' => 'Ada',
    'active' => true
]);
// id: 123
// name: Ada
// active: true
TOON supports bidirectional conversion: you can decode TOON strings back to PHP arrays:
use HelgeSverre\Toon\Toon;
// Decode simple values
$result = Toon::decode('42'); // 42
$result = Toon::decode('hello'); // "hello"
$result = Toon::decode('true'); // true
// Decode arrays
$result = Toon::decode('[3]: a,b,c');
// ['a', 'b', 'c']
// Decode objects (returned as associative arrays)
$toon = <<<TOON
id: 123
name: Ada
active: true
TOON;
$result = Toon::decode($toon);
// ['id' => 123, 'name' => 'Ada', 'active' => true]
// Decode nested structures
$toon = <<<TOON
user:
  id: 123
  email: ada@example.com
  metadata:
    active: true
    score: 9.5
TOON;
$result = Toon::decode($toon);
// ['user' => ['id' => 123, 'email' => 'ada@example.com', 'metadata' => ['active' => true, 'score' => 9.5]]]
Note: TOON objects are decoded as PHP associative arrays, not objects.
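If downstream code expects objects rather than associative arrays, one option is a JSON round trip. The following is a minimal sketch in plain PHP; the literal array stands in for the result of Toon::decode() so that it runs without the library, and it is not a feature of the library itself.

```php
<?php
// Stand-in for the result of Toon::decode(); a literal array is used here
// so the sketch runs without the library installed.
$decoded = [
    'user' => [
        'id' => 123,
        'email' => 'ada@example.com',
        'metadata' => ['active' => true, 'score' => 9.5],
    ],
];

// Round-trip through JSON: string-keyed arrays become stdClass objects,
// while list-style arrays stay arrays.
$object = json_decode(
    json_encode($decoded, JSON_THROW_ON_ERROR),
    false,
    512,
    JSON_THROW_ON_ERROR
);

echo $object->user->email, PHP_EOL;        // ada@example.com
var_dump($object->user->metadata->active); // bool(true)
```

The round trip copies the data, so for large payloads it may be cheaper to keep working with the arrays directly.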
TOON is most efficient for arrays of uniform objects:
echo Toon::encode([
    'users' => [
        ['id' => 1, 'name' => 'Alice', 'role' => 'admin'],
        ['id' => 2, 'name' => 'Bob', 'role' => 'user'],
    ]
]);
Output:
users[2]{id,name,role}:
  1,Alice,admin
  2,Bob,user
Field names are declared once in the header, then each row contains only values. This is where TOON achieves the largest token savings compared to JSON.
See docs/EXAMPLES.md for more encoding examples.
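To make the header-plus-rows layout concrete, here is a minimal sketch that produces it in plain PHP. The function name sketch_tabular is hypothetical and this is not the library's implementation: it assumes every row has the same keys in the same order and that values are integers or plain strings that need no quoting.

```php
<?php
/**
 * Hypothetical helper (not part of the library): render a list of uniform
 * rows as a TOON-style tabular block. Assumes values are ints or plain
 * strings that need no quoting, and that all rows share the same keys.
 */
function sketch_tabular(string $key, array $rows, int $indent = 2): string
{
    if ($rows === []) {
        throw new InvalidArgumentException('This sketch needs at least one row.');
    }

    // Field names are declared once, in the header line.
    $fields = array_keys($rows[0]);
    $lines = [sprintf('%s[%d]{%s}:', $key, count($rows), implode(',', $fields))];

    // Each following line carries only the values, indented one level.
    foreach ($rows as $row) {
        $lines[] = str_repeat(' ', $indent) . implode(',', $row);
    }

    // Joined without a trailing newline.
    return implode("\n", $lines);
}

echo sketch_tabular('users', [
    ['id' => 1, 'name' => 'Alice', 'role' => 'admin'],
    ['id' => 2, 'name' => 'Bob', 'role' => 'user'],
]);
// users[2]{id,name,role}:
//   1,Alice,admin
//   2,Bob,user
```

The saving comes from the loop body: each row repeats none of the key names that JSON would repeat for every object.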
Customize encoding behavior with EncodeOptions:
use HelgeSverre\Toon\EncodeOptions;
// Custom indentation (default: 2)
$options = new EncodeOptions(indent: 4);
echo Toon::encode(['a' => ['b' => 'c']], $options);
// a:
//     b: c
// Tab delimiter instead of comma (default: ',')
$options = new EncodeOptions(delimiter: "\t");
echo Toon::encode(['tags' => ['a', 'b', 'c']], $options);
// tags[3\t]: a b c
// Pipe delimiter
$options = new EncodeOptions(delimiter: '|');
echo Toon::encode(['tags' => ['a', 'b', 'c']], $options);
// tags[3|]: a|b|c
TOON only quotes strings when necessary:
echo Toon::encode('hello'); // hello (no quotes)
echo Toon::encode('true'); // "true" (quoted - looks like boolean)
echo Toon::encode('42'); // "42" (quoted - looks like number)
echo Toon::encode('a:b'); // "a:b" (quoted - contains colon)
echo Toon::encode(''); // "" (quoted - empty string)
echo Toon::encode("line1\nline2"); // "line1\nline2" (quoted - control chars)
DateTime objects are automatically converted to ISO 8601 format:
$date = new DateTime('2025-01-01T00:00:00+00:00');
echo Toon::encode($date);
// "2025-01-01T00:00:00+00:00"
PHP enums are normalized automatically. BackedEnum cases are encoded as their backing values and UnitEnum cases as their names:
enum Status: string {
    case ACTIVE = 'active';
    case INACTIVE = 'inactive';
}
enum Priority: int {
    case LOW = 1;
    case HIGH = 10;
}
enum Color {
    case RED;
    case GREEN;
    case BLUE;
}
// BackedEnum with string value
echo Toon::encode(Status::ACTIVE);
// active
// BackedEnum with int value
echo Toon::encode(Priority::HIGH);
// 10
// UnitEnum (no backing value)
echo Toon::encode(Color::BLUE);
// BLUE
// Array of enum cases
echo Toon::encode(Priority::cases());
// [2]: 1,10
Non-finite numbers are converted to null:
echo Toon::encode(INF); // null
echo Toon::encode(-INF); // null
echo Toon::encode(NAN); // null
TOON provides global helper functions for convenience:
// Basic encoding
$toon = toon($data);
// Decoding
$data = toon_decode($toonString);
// Lenient decoding (forgiving parsing)
$data = toon_decode_lenient($toonString);
// Compact (minimal indentation)
$compact = toon_compact($data);
// Readable (generous indentation)
$readable = toon_readable($data);
// Tabular (tab-delimited)
$tabular = toon_tabular($data);
// Compare with JSON
$stats = toon_compare($data);
// Returns: ['toon' => 450, 'json' => 800, 'savings' => 350, 'savings_percent' => '43.8%']
// Get size estimate
$size = toon_size($data);
// Estimate token count (4 chars/token heuristic)
$tokens = toon_estimate_tokens($data);
Step-by-step guides for integrating TOON with LLM providers:
- Getting Started with TOON (10-15 min): Learn the basics: installation, encoding, configuration, and your first LLM integration.
- OpenAI PHP Client Integration (15-20 min): Integrate TOON with OpenAI's official PHP client. Covers messages, function calling, and streaming.
- Laravel + Prism AI Application (20-30 min): Build a complete Laravel AI chatbot using TOON and Prism for multi-provider support.
- Anthropic/Claude Integration (20-25 min): Leverage Claude's 200K context window with TOON optimization. Process large datasets efficiently.
- Token Optimization Strategies (20-25 min): Deep dive into token economics, RAG optimization, and cost reduction strategies.
- Building a RAG System with TOON and Ollama (30-40 min): Create a production-ready RAG pipeline with TOON, Ollama embeddings, and vector similarity search.
See the tutorials/ directory for all tutorials and learning paths.
This library tracks the TOON Specification. Major versions align with spec versions.
| Library | Spec | Key Changes |
|---|---|---|
| v3.1.0 | v3.0 | toJSON() method support, negative leading zeros fix |
| v3.0.0 | v3.0 | List-item objects with tabular first field use depth +2 for rows |
| v2.0.0 | v2.0 | Removed [#N] length marker; decoder rejects legacy format |
| v1.4.0 | v1.3 | Full decoder, strict mode |
| v1.3.0 | v1.3 | PHP enum support |
| v1.2.0 | v1.3 | Empty array fix |
| v1.1.0 | v1.3 | Benchmarks, justfile |
| v1.0.0 | v1.3 | Initial release |
For format details and token efficiency analysis, see the TOON Specification.
- Key-value pairs with colons
- Indentation-based nesting (2 spaces by default)
- Empty objects shown as key:
- Primitives: Inline format with length, e.g. tags[3]: a,b,c
- Uniform objects: Tabular format with headers, e.g. items[2]{sku,qty}: A1,2
- Mixed/non-uniform: List format with hyphens
- 2 spaces per level (configurable)
- No trailing spaces
- No final newline
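The object and formatting rules above can be illustrated with a short sketch in plain PHP. The function sketch_encode_object is hypothetical and much simpler than the library's encoder: it treats every PHP array as an object, handles only integer and plain string values, and does no quoting. How the library tells an empty array from an empty object is not covered here.

```php
<?php
/**
 * Hypothetical helper (not the library's encoder): encode nested
 * string-keyed arrays as indented "key: value" lines. Handles only ints
 * and plain strings; every array is treated as an object.
 */
function sketch_encode_object(array $data, int $indent = 2, int $depth = 0): string
{
    $pad = str_repeat(' ', $indent * $depth);
    $lines = [];

    foreach ($data as $key => $value) {
        if (is_array($value)) {
            // Nested object: the key alone on its line, children one level deeper.
            $lines[] = $pad . $key . ':';
            if ($value !== []) {
                $lines[] = sketch_encode_object($value, $indent, $depth + 1);
            }
        } else {
            $lines[] = $pad . $key . ': ' . $value;
        }
    }

    // implode() adds no trailing spaces and no final newline.
    return implode("\n", $lines);
}

echo sketch_encode_object([
    'user' => ['id' => 123, 'name' => 'Ada'],
    'meta' => [],
]);
// user:
//   id: 123
//   name: Ada
// meta:
```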
PHP automatically converts numeric string keys to integers in arrays:
// PHP automatically converts numeric keys
$data = ['123' => 'value']; // Key becomes integer 123
echo Toon::encode($data); // "123": value (quoted as string)
The library handles this by quoting numeric keys when encoding.
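The key conversion is plain PHP behavior and can be observed without the library. In the sketch below, sketch_format_key is a hypothetical helper showing one way an encoder could quote keys that PHP has cast to integers; the library's exact rules for other number-like keys are not shown here.

```php
<?php
// PHP casts array keys that are canonical decimal integer strings to int.
// Keys with leading zeros or a decimal point stay strings.
$data = ['123' => 'a', '007' => 'b', '1.5' => 'c'];

foreach (array_keys($data) as $key) {
    printf("%s => %s\n", var_export($key, true), get_debug_type($key));
}
// 123 => int
// '007' => string
// '1.5' => string

/**
 * Hypothetical helper: wrap integer keys in double quotes so they are
 * written out as strings; leave string keys untouched.
 */
function sketch_format_key(int|string $key): string
{
    return is_int($key) ? '"' . $key . '"' : $key;
}

echo sketch_format_key(array_key_first($data)), PHP_EOL; // "123"
```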
TOON is ideal for:
- Sending structured data in LLM prompts
- Reducing token costs in API calls to language models
- Improving context window utilization
- Making data more human-readable in AI conversations
Note: TOON is optimized for LLM contexts and is not intended as a replacement for JSON in APIs or data storage.
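Whether TOON pays off for a particular payload can be checked before sending it. The sketch below compares a JSON encoding with the equivalent TOON text using the rough four-characters-per-token heuristic mentioned for toon_estimate_tokens. The function name sketch_estimate_tokens and the choice to round up are assumptions, and real tokenizers will give different counts.

```php
<?php
/**
 * Hypothetical helper: rough token estimate at about 4 characters per
 * token. Rounding up is an assumption. strlen() counts bytes, so
 * multibyte text would need mb_strlen() instead.
 */
function sketch_estimate_tokens(string $text): int
{
    return (int) ceil(strlen($text) / 4);
}

$rows = [
    ['id' => 1, 'name' => 'Alice', 'role' => 'admin'],
    ['id' => 2, 'name' => 'Bob', 'role' => 'user'],
];

$json = json_encode(['users' => $rows], JSON_THROW_ON_ERROR);
// TOON text for the same data, written out by hand so the sketch
// runs without the library installed.
$toon = "users[2]{id,name,role}:\n  1,Alice,admin\n  2,Bob,user";

printf("JSON: %d chars, ~%d tokens\n", strlen($json), sketch_estimate_tokens($json));
printf("TOON: %d chars, ~%d tokens\n", strlen($toon), sketch_estimate_tokens($toon));
printf("Savings: %.1f%%\n", 100 * (1 - strlen($toon) / strlen($json)));
```

For small or deeply nested payloads the difference may be negligible, so measuring on representative data is more reliable than assuming a fixed saving.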
TOON is not a strict superset or subset of JSON. Key differences:
- Bidirectional encoding and decoding (objects decode as associative arrays)
- Optimized for readability and token efficiency in LLM contexts
- Uses whitespace-significant formatting (indentation-based nesting)
- Includes metadata like array lengths and field headers for better LLM comprehension
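The length and field metadata also gives a consumer something to check against, for example to detect truncated output. The following is a minimal sketch, not the library's decoder: sketch_parse_tabular is a hypothetical function that handles only comma-delimited tabular blocks, leaves all values as strings, and ignores quoting.

```php
<?php
/**
 * Hypothetical helper (not the library's decoder): parse a comma-delimited
 * tabular block and verify the declared row count. Values stay strings.
 */
function sketch_parse_tabular(string $block): array
{
    $lines = explode("\n", $block);
    $header = array_shift($lines);

    // Header shape: key[N]{field1,field2,...}:
    if (!preg_match('/^(\w+)\[(\d+)\]\{([^}]*)\}:$/', $header, $m)) {
        throw new InvalidArgumentException('Not a tabular header: ' . $header);
    }
    [, $key, $declared, $fieldList] = $m;
    $fields = explode(',', $fieldList);

    // The declared length lets us detect missing or extra rows.
    if (count($lines) !== (int) $declared) {
        throw new UnexpectedValueException(
            sprintf('Header declares %d rows, found %d', $declared, count($lines))
        );
    }

    $rows = [];
    foreach ($lines as $line) {
        // array_combine() throws a ValueError if a row has the wrong field count.
        $rows[] = array_combine($fields, explode(',', trim($line)));
    }

    return [$key => $rows];
}

try {
    sketch_parse_tabular("users[3]{id,name,role}:\n  1,Alice,admin");
} catch (UnexpectedValueException $e) {
    echo $e->getMessage(), PHP_EOL; // Header declares 3 rows, found 1
}
```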
- Original TypeScript implementation: toon-format/toon
- Specification: toon-format/spec
- PHP port: HelgeSverre
composer test # Run tests
composer test:coverage # Generate coverage report
composer analyse # Static analysis
Keep the library aligned with upstream spec changes:
just sync-spec # Download latest SPEC.md from upstream
just diff-spec # Show diff after download
just autofix # Sync spec and launch Claude Code for compliance review
The autofix command downloads the latest specification, then launches Claude Code in plan mode with the /spec-review prompt to analyze changes and propose implementation updates.
cd benchmarks && composer install && composer run benchmark
See benchmarks/README.md for details.