11 releases
Uses new Rust 2024
| 0.2.11 |
|
|---|---|
| 0.2.10 |
|
| 0.1.1 | Mar 2, 2026 |
| 0.1.0 | Feb 25, 2026 |
#138 in Procedural macros
Used in 5 crates
(4 directly)
465KB
10K
SLoC
bbnf-lang
Better Backus-Naur Form—a monorepo for the BBNF grammar ecosystem.
BBNF extends EBNF for defining context-free grammars, used by the
parse-that parser combinator library.
Structure
rust/ Rust workspace
core/ BBNF grammar framework, IR lowering, codegen (lib)
ir/ Canonical grammar IR, bytecode compiler, interpreter
derive/ Proc-macro derive for BBNF
analysis/ LSP analysis engine (DocumentState, feature providers)
lsp/ Language Server Protocol server
wasm/ bbnf-wasm crate (wasm-pack → playground, bytecode VM)
typescript/ TS library (@mkbabb/bbnf-lang)
prettier-plugin-bbnf/ Prettier plugin for .bbnf files
playground/ Vue 3 + Monaco playground (uses bbnf-wasm)
extension/ VS Code extension (LSP client)
grammar/ Example grammars + language specification
css/ CSS grammar family (value-unit, color, values, selectors, keyframes, stylesheet, css-tokens, css-stylesheet-pretty)
lang/ Language/format grammars (JSON, CSV, math, regex, EBNF, Google Sheets, etc.)
docs/ Documentation (markdown, rendered by playground)
scripts/ Build automation scripts
data/ Benchmark datasets
server/ Compiled LSP binary (copied by Makefile)
.github/workflows/ CI + release pipeline
.vscode/ Launch configs, tasks, settings
Language
BBNF extends EBNF with features for practical parser generation: regex terminals, skip/next operators, mapping functions, and an import system. See grammar/BBNF.md for the full specification.
Quick orientation:
(* Rules: name = expression ; *)
value = object | array | string | number | "true" | "false" | "null" ;
(* Regex terminals *)
number = /\-?(0|[1-9][0-9]*)(\.[0-9]+)?([eE][+-]?[0-9]+)?/ ;
(* EBNF operators: [] optional, {} repetition, () grouping *)
array = "[", [ value, { ",", value } ], "]" ;
(* Imports—selective imports auto-unfurl transitive dependencies *)
@import { number, integer } from "css-value-unit.bbnf" ;
(* Recovery—per-rule sync expression for multi-error parsing *)
@recover declaration /[;}]/ ;
(* Custom whitespace—overrides ?w to use comment-aware scanning *)
@ws /(?s)(?:\s|\/\*.*?\*\/)*/ ;
declaration = propertyName ?w ":" ?w valueSpan ;
(* Force-inline—splice rule body at every call site, no enum variant *)
@inline optSemicolon ;
(* Lexical token—fusion-inlined + span eligible, enum variant preserved *)
@token selectorSpan ;
Recovery
@recover rule syncExpr ; annotates a rule with a synchronization expression used
for multi-error parsing. When the rule fails, the parser advances past input until
syncExpr matches, records a diagnostic, and continues. Any valid BBNF expression
(regex, alternation, concatenation, etc.) is valid as the sync expression.
The LSP provides semantic tokens, hover, completion, and diagnostics (undefined target
rule) for @recover directives.
Imports
BBNF supports @import directives for composing grammars:
@import "other.bbnf" ; (* import all rules *)
@import { number, integer } from "lib.bbnf" ; (* selective import *)
Import directives may appear at any position in a file (before, between, or after rules).
Selective imports automatically bring transitive dependencies—importing percentage
also brings number and percentageUnit if percentage references them.
Circular imports are handled via Python-style partial initialization (a module's rules
are registered before recursing into its own imports).
Cmd+Click on import paths opens the referenced file. Diagnostics are import-aware—imported rule names suppress "undefined rule" warnings.
Whitespace
@ws /regex/ ; overrides the ?w (optional whitespace) operator grammar-wide. By default, ?w trims ASCII whitespace (\s). With @ws, it compiles to the given regex instead, enabling comment-aware whitespace without allocating a named rule's enum variant at each call site. CSS grammars use this to route ?w through a SIMD-accelerated comment scanner.
Inlining
@inline ruleName ; force-inlines a rule at every call site during IR compilation. The rule's body is substituted directly—no enum variant generated, no function emitted. Useful for small helper rules (optional semicolons, opaque spans) where the abstraction aids readability but the codegen overhead doesn't.
Debugging
@debug ruleName ; instruments a rule for trace output across all codegen paths. @debug * ; instruments every rule. Compiled paths gate behind #[cfg(feature = "parser-trace")]; the bytecode VM uses Op::DebugBreak opcodes with stepping and breakpoint support. The VS Code extension supports grammar-level debugging via DAP (bbnf-lsp --dap)—breakpoints on rules, step through parse execution, inspect call stack and parse state.
Token
@token ruleName ; marks a rule as a lexical token. Token rules are forced span-eligible and use fusion-style inlining (body inlined at call sites, but the enum variant is preserved). Unlike @inline, which eliminates the variant entirely, @token is compatible with @pretty directives that need to reference the variant for formatting.
No-Collapse
@no_collapse ruleName ; preserves a rule's structural identity in the generated AST, preventing Span compression from collapsing it.
Formatting
BBNF's @pretty directives drive pretty-printing in the playground and Prettier plugin.
Slab Parsing
#[parser(slab)] on the derive macro generates a second set of parser methods that use BumpSlab for allocation instead of Box<T>. The slab path emits monolithic recursive functions—direct match dispatch on the first byte, inlined rule bodies, zero combinator construction overhead. Fresh slab and parser per parse; bulk deallocation on drop.
#[derive(Parser)]
#[parser(path = "json.bbnf", slab)]
struct JsonParser;
let slab = BumpSlab::with_capacity(input.len() / 32 * 32);
let ast = JsonParser::value_slab()
.parse_with_context(&input, &slab)
.unwrap();
Span-Only Parsing
#[parser(span)] generates zero-allocation __rule_span functions returning Option<Span<'a>>. No enum variants, no slab, no Vec. Validation-only parsing at maximum throughput.
Playground
Live at grammar.babb.dev.
Editor
Monaco with full BBNF language support via WASM: hover, completion, go-to-definition, semantic tokens, inlay hints (FIRST sets + nullable), code lens, code actions, document symbols, folding, and selection ranges. Diagnostics update on every keystroke.
Four panes show different views of the grammar and input:
| Pane | What it shows |
|---|---|
| Grammar | BBNF grammar editor with live diagnostics |
| Input | Source text parsed against the grammar |
| Parsed AST | JSON AST produced by the parser |
| Formatted | Pretty-printed output via @pretty directives |
| Debug | Step through parse execution—breakpoints, call stack, parse state |
Formatting uses gorgeous (WASM)—AOT-generated formatters for built-in languages, a bytecode VM for custom grammars.
Docs
Built-in documentation covering parse-that, BBNF, pprint, gorgeous, and
performance—rendered from Markdown with a sidebar nav grouped by section and syntax-highlighted code blocks.
Sources, acknowledgements, &c.
- Extended Backus-Naur form — ISO 14977. BBNF's ancestor.
- Wheeler, D. A. Don't Use ISO 14977 EBNF. — Motivation for BBNF's syntactic deviations.
- Aho, A. V., Lam, M. S., Sethi, R., & Ullman, J. D. (2006). Compilers: Principles, Techniques, and Tools (2nd ed.). Addison-Wesley. — Left recursion, left factoring, FIRST/FOLLOW sets.
- Tarjan, R. E. (1972). Depth-first search and linear graph algorithms. SIAM Journal on Computing. — SCC detection used for cycle analysis, FIRST-set propagation, and build ordering.
- Language Server Protocol. Microsoft. — The protocol implemented by
bbnf-lsp. parse-that— The parser combinator library that consumes BBNF grammars.
Dependencies
~5–8MB
~142K SLoC