blake3_audit
Audit of the leviathan-crypto WebAssembly BLAKE3 implementation
(AssemblyScript) against the BLAKE3 specification, covering the
v128-internal compress, the v128-external compress4, the §2.4
chunk machine, the §2.5 tree assembly + root finalize, the §2.6 XOF
squeeze, and the §2.3 keyed_hash and derive_key modes. Every
checkbox is falsifiable by reading the cited file and confirming the
invariant against the spec reference (and, where noted, against the
RustCrypto blake3 crate, which is consulted only after the
round-trip gate passes per AGENTS.md §Ground Rules #4).
- Buffer Layout (`src/asm/blake3/buffers.ts`)
- Flags (`src/asm/blake3/flags.ts`)
- Compression, v128-internal (`src/asm/blake3/compress.ts`)
- Compression, v128-external (`src/asm/blake3/compress_simd.ts`)
- Chunk Machine (`src/asm/blake3/chunk.ts`)
- Subtree Stack and Root (`src/asm/blake3/tree.ts`)
- Top-level Entries (`src/asm/blake3/index.ts`)
- TS Validation (`src/ts/blake3/validate.ts`)
- TS Public Surface (`src/ts/blake3/index.ts`)
- Memory Hygiene
- Constant-Time Considerations
- Side Channels
- Test Coverage
- Open Audit Items
- Cross-References

| Meta | Description |
|---|---|
| Target: | `leviathan-crypto` WebAssembly implementation (AssemblyScript) |
| Spec: | BLAKE3, one function, fast everywhere (O'Connor / Aumasson / Neves / Wilcox-O'Hearn, 2020-01-09) |
| Modes: | `hash`, `keyed_hash`, `derive_key` (BLAKE3 §2.3 Modes); XOF output via §2.6 Extendable Output |
| Test vectors: | Upstream 35-case corpus (pin ae3e8e6b3a5ae3190ca5d62820789b17886a0038) + RustCrypto blake3 = "=1.8.5" oracle |
| Independence: | Implemented from the BLAKE3 specification directly, no port from the reference implementation |

## Buffer Layout (`src/asm/blake3/buffers.ts`)

- `MUTABLE_START = 4096` reserves the AS data segment for SIGMA-style read-only tables.
- `BUFFER_END = 26328` matches the running total of every region declared in the file. Fits inside 1 page (65536 bytes); module is sized at 2 pages (131072 bytes) for slack.
- `INPUT_STAGING` (4096..8191) is at least `4 × 1024` for compress4 batching (kept vestigial; chunk_simd reads directly from caller scratch).
- `OUTPUT_STAGING` (8192..9215) is at least 1024 bytes for one-shot XOF reads.
- Working compress slots are pinned: `CV` at 9216 (32 B), `MSG` at 9248 (64 B), `COUNTER` at 9312 (8 B), `BLOCK_LEN` at 9320 (4 B), `FLAGS` at 9324 (4 B), `COMPRESS_OUT` at 9328 (64 B).
- `KEYED_KEY` (9392) holds the §2.3 keyed_hash 32-byte key; `DERIVE_CV` (9424) is a derive_key staging slot.
- `LEVEL_QUEUES` (9456..25007) is 54 × 9 × 32 bytes, matching the BLAKE3 §5.1.2 depth bound (54 levels) with a 9-entry width per level absorbing the finalize-time carry transient. Queue for level L starts at `LEVEL_QUEUES_OFFSET + L * LEVEL_QUEUE_STRIDE` with `LEVEL_QUEUE_STRIDE = 288` bytes.
- `LEVEL_COUNTS` (25008..25223) is 54 × 4 bytes, one i32 count per level (number of pending CVs currently held in that level's queue).
- `COMPRESS4_*` slots (25224..25915) are 4-wide arrays matching the lane-parallel layout (used by both `chunkBatch4` and `parentBatch4`).
- `MODE_CV` (25916, 32 B) is the per-mode starting CV slot read by `chunkInit` and parent compresses.
- Chunk-state slots `CHUNK_INDEX` / `CHUNK_BLOCKS` / `CHUNK_PENDING_LEN` / `CHUNK_PENDING_BLOCK` (25948..26027) hold the §2.4 chunk-machine state.
- Tree-state slots `TREE_PARENT_BLOCK` / `CHUNK_CV_SCRATCH` / `ROOT_OUT_SCRATCH` (26028..26187) hold the §2.5 single-pair finalize staging and the §2.5 root output.
- `MODE_FLAGS` (26188, 4 B) holds the OR-onto-every-compress mode bits; zero for hash, `FLAG_KEYED_HASH` for keyed_hash, `FLAG_DERIVE_KEY_CONTEXT` / `FLAG_DERIVE_KEY_MATERIAL` for the two derive_key passes.
- `CONTEXT_CV` (26192, 32 B) holds the §2.3 derive_key pass-1 output between the two passes.
- `ROOT_STATE_*` (26224..26327) capture the root-compress input snapshot for §2.6 XOF squeezes.
- `getModuleId()` returns 4 (distinct from every other module).
- `getMemoryPages()` returns 2.
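
The running-total claim for `BUFFER_END` can be re-derived from the region sizes listed above. The following is a minimal sketch, not the contents of `buffers.ts`: the `layout` helper is hypothetical, and the grouped entries (`COMPRESS4_*`, `CHUNK_*`, `TREE_*`, `ROOT_STATE_*`) use the byte spans implied by the offset ranges in the checklist rather than their individual sub-slot sizes.

```typescript
// Region sizes in bytes, in declaration order, taken from the checklist above.
// Grouped entries use the span implied by their offset range (assumption).
const MUTABLE_START = 4096;

const REGIONS: ReadonlyArray<readonly [string, number]> = [
  ["INPUT_STAGING", 4 * 1024],
  ["OUTPUT_STAGING", 1024],
  ["CV", 32],
  ["MSG", 64],
  ["COUNTER", 8],
  ["BLOCK_LEN", 4],
  ["FLAGS", 4],
  ["COMPRESS_OUT", 64],
  ["KEYED_KEY", 32],
  ["DERIVE_CV", 32],
  ["LEVEL_QUEUES", 54 * 9 * 32],
  ["LEVEL_COUNTS", 54 * 4],
  ["COMPRESS4_*", 25916 - 25224],
  ["MODE_CV", 32],
  ["CHUNK_*", 26028 - 25948],
  ["TREE_*", 26188 - 26028],
  ["MODE_FLAGS", 4],
  ["CONTEXT_CV", 32],
  ["ROOT_STATE_*", 26328 - 26224],
];

// Hypothetical helper: assigns each region the next free offset and returns
// the per-region start offsets plus the first byte past the last region.
function layout(start: number, regions: ReadonlyArray<readonly [string, number]>) {
  const offsets = new Map<string, number>();
  let cursor = start;
  for (const [name, size] of regions) {
    offsets.set(name, cursor);
    cursor += size;
  }
  return { offsets, end: cursor };
}

const bufferLayout = layout(MUTABLE_START, REGIONS);
```

Comparing `bufferLayout.end` with 26328 and spot-checking a few entries of `bufferLayout.offsets` against the pinned offsets (for example `CV` at 9216 or `LEVEL_QUEUES` at 9456) is one way to falsify the first two checkboxes by hand.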

## Flags (`src/asm/blake3/flags.ts`)

- `FLAG_CHUNK_START = 1` (2^0), BLAKE3 §2.2 Table 3.
- `FLAG_CHUNK_END = 2` (2^1).
- `FLAG_PARENT = 4` (2^2).
- `FLAG_ROOT = 8` (2^3).
- `FLAG_KEYED_HASH = 16` (2^4).
- `FLAG_DERIVE_KEY_CONTEXT = 32` (2^5).
- `FLAG_DERIVE_KEY_MATERIAL = 64` (2^6).
- All flags are powers of two; they may be ORed into the §2.2 `d` field without overlap.
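
The no-overlap property can be checked mechanically. A small sketch follows; the table literal mirrors the values above, while `isPowerOfTwo`, `flagUnion`, and `flagSum` are illustrative names, not exports of `flags.ts`.

```typescript
// Flag values as listed in the checklist above.
const FLAG_TABLE: ReadonlyArray<readonly [string, number]> = [
  ["FLAG_CHUNK_START", 1 << 0],
  ["FLAG_CHUNK_END", 1 << 1],
  ["FLAG_PARENT", 1 << 2],
  ["FLAG_ROOT", 1 << 3],
  ["FLAG_KEYED_HASH", 1 << 4],
  ["FLAG_DERIVE_KEY_CONTEXT", 1 << 5],
  ["FLAG_DERIVE_KEY_MATERIAL", 1 << 6],
];

// A positive integer is a power of two iff it has exactly one bit set.
const isPowerOfTwo = (x: number): boolean => x > 0 && (x & (x - 1)) === 0;

// If no two flags share a bit, OR-ing them all equals adding them all.
const flagUnion = FLAG_TABLE.reduce((acc, [, bit]) => acc | bit, 0);
const flagSum = FLAG_TABLE.reduce((acc, [, bit]) => acc + bit, 0);
const flagsAreDisjoint = flagUnion === flagSum;
```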

## Compression, v128-internal (`src/asm/blake3/compress.ts`)

- `BLAKE3_IV0..7` match BLAKE3 §2.2 Table 1 (identical to FIPS 180-4 §5.3.3 SHA-256 IV).
- `SIGMA` permutation `[2, 6, 3, 10, 7, 0, 4, 13, 1, 11, 12, 5, 9, 14, 15, 8]` matches BLAKE3 §2.2 Table 2.
- Compress runs exactly 7 rounds (§2.2 round count).
- G function rotations are R1=16, R2=12, R3=8, R4=7 in that order, matching BLAKE3 §2.2.
- Initial state layout: `row0 = h0..h3`, `row1 = h4..h7`, `row2 = IV0..IV3`, `row3 = (counterLo, counterHi, blockLen, flags)`. Matches §2.2 `v0..v15`.
- Column G runs four G calls in parallel via one v128 op per state-update step; diagonal G is made columnar by per-row lane rotations, then the rotations are undone after the G calls fire.
- Between rounds 0..5 the message is permuted via `m'[i] = m[SIGMA[i]]`; the permutation is NOT applied after round 6 (the final round), per BLAKE3 §2.2.
- Feed-forward output writes the full 64 bytes: bytes 0..31 are `(v_0..v_7) XOR (v_8..v_15)`, and bytes 32..63 are `(v_8..v_15) XOR (h_0..h_7)`. Matches BLAKE3 §2.2.
- When `blockOff != MSG_OFFSET` the caller's block is staged into `MSG_OFFSET` so the round-wise permutation does not mutate the caller's buffer.
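
For readers who want to check these invariants against something executable, here is a scalar TypeScript sketch of the compression function as the items above describe it. It is not the v128 kernel in `compress.ts`: it works on plain `Uint32Array` words, and the names `g`, `compress`, and `toHexLE` are chosen for this example only.

```typescript
// Scalar sketch of BLAKE3 compress: 7 rounds, SIGMA between rounds, 64-byte feed-forward.
const IV = new Uint32Array([
  0x6a09e667, 0xbb67ae85, 0x3c6ef372, 0xa54ff53a,
  0x510e527f, 0x9b05688c, 0x1f83d9ab, 0x5be0cd19,
]);
const SIGMA = [2, 6, 3, 10, 7, 0, 4, 13, 1, 11, 12, 5, 9, 14, 15, 8];

const rotr = (x: number, n: number): number => ((x >>> n) | (x << (32 - n))) >>> 0;

// Quarter-round G with rotations 16, 12, 8, 7; Uint32Array stores wrap mod 2^32.
function g(v: Uint32Array, a: number, b: number, c: number, d: number, mx: number, my: number): void {
  v[a] = v[a] + v[b] + mx;
  v[d] = rotr(v[d] ^ v[a], 16);
  v[c] = v[c] + v[d];
  v[b] = rotr(v[b] ^ v[c], 12);
  v[a] = v[a] + v[b] + my;
  v[d] = rotr(v[d] ^ v[a], 8);
  v[c] = v[c] + v[d];
  v[b] = rotr(v[b] ^ v[c], 7);
}

function compress(
  h: Uint32Array, block: Uint32Array,
  counterLo: number, counterHi: number, blockLen: number, flags: number,
): Uint32Array {
  const v = new Uint32Array(16);
  v.set(h.subarray(0, 8), 0);       // rows 0..1: chaining value
  v.set(IV.subarray(0, 4), 8);      // row 2: IV0..IV3
  v[12] = counterLo; v[13] = counterHi; v[14] = blockLen; v[15] = flags;
  let m = Uint32Array.from(block);  // private copy, so the caller's block is not permuted
  for (let round = 0; round < 7; round++) {
    g(v, 0, 4, 8, 12, m[0], m[1]);  // columns
    g(v, 1, 5, 9, 13, m[2], m[3]);
    g(v, 2, 6, 10, 14, m[4], m[5]);
    g(v, 3, 7, 11, 15, m[6], m[7]);
    g(v, 0, 5, 10, 15, m[8], m[9]); // diagonals
    g(v, 1, 6, 11, 12, m[10], m[11]);
    g(v, 2, 7, 8, 13, m[12], m[13]);
    g(v, 3, 4, 9, 14, m[14], m[15]);
    if (round < 6) m = Uint32Array.from(SIGMA, (i) => m[i]); // not after the final round
  }
  const out = new Uint32Array(16);
  for (let i = 0; i < 8; i++) {
    out[i] = v[i] ^ v[i + 8];       // bytes 0..31
    out[i + 8] = v[i + 8] ^ h[i];   // bytes 32..63
  }
  return out;
}

// Hex of the first `count` words, each serialized little-endian.
function toHexLE(words: Uint32Array, count: number): string {
  let hex = "";
  for (let i = 0; i < count; i++) {
    for (let b = 0; b < 4; b++) hex += ((words[i] >>> (8 * b)) & 0xff).toString(16).padStart(2, "0");
  }
  return hex;
}
```

Calling `compress(IV, new Uint32Array(16), 0, 0, 0, 1 | 2 | 8)` corresponds to the empty-input case (a single zero block with `CHUNK_START | CHUNK_END | ROOT`), and `toHexLE(out, 8)` of the result can be compared against the first record of the upstream KAT corpus.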

## Compression, v128-external (`src/asm/blake3/compress_simd.ts`)

- `compress4` reads four CVs / msg blocks / counters / blens from the `COMPRESS4_*` staging buffers and writes four 64-byte outputs to `COMPRESS4_OUT` at lane-deinterleaved offsets (lane K at K × 64).
- Lane K's CV occupies `COMPRESS4_CV_IN + K × 32` for 32 bytes; the lane-K state words v0..v7 are gathered from the K-th 32-byte slot.
- Lane K's message block occupies `COMPRESS4_MSG_IN + K × 64` for 64 bytes; the lane-K message words m0..m15 are gathered from the K-th 64-byte slot.
- Lane K's counter occupies `COMPRESS4_CTR_IN + K × 8` (lo at +0, hi at +4); `v12`, `v13` are gathered as `[ctrLo_0, ctrLo_1, ctrLo_2, ctrLo_3]` and `[ctrHi_0, ctrHi_1, ctrHi_2, ctrHi_3]`.
- Lane K's block_len occupies `COMPRESS4_BLEN_IN + K × 4`; `v14` is gathered as `[blen_0, blen_1, blen_2, blen_3]`.
- The flags word at `COMPRESS4_FLAGS_IN` is splatted across all four lanes of `v15`; all four lanes share the same flags value per BLAKE3 §2.2 (the `d` input) lane-parallelized per §5.3.
- The 7-round permutation E(m, v) runs identically per-lane to `compress`, with G rotations R1=16, R2=12, R3=8, R4=7.
- SIGMA permutation is applied as whole-register renames between rounds (no within-register shuffles); the rename schedule encodes `[2, 6, 3, 10, 7, 0, 4, 13, 1, 11, 12, 5, 9, 14, 15, 8]`, identical to `compress`.
- Feed-forward output matches `compress`: lane K's bytes K × 64 .. K × 64 + 63 are bit-identical to a `compress` call against lane K's inputs.
- `compress4` output is bit-equivalent to 4 × `compress` over the same inputs (asserted by `test/unit/blake3/blake3-compress4-equiv.test.ts`).
- Chunk-level dispatch to `compress4` is wired into `hashCore` (`src/asm/blake3/index.ts`): for multi-chunk inputs the largest multiple of 4 full chunks runs through `chunkBatch4` (`src/asm/blake3/chunk_simd.ts`), with trailing full chunks and the partial last chunk falling back to the §2.4 single-chunk path. Asserted by `test/unit/blake3/blake3-compress4-dispatch.test.ts` (test-only WASM counter on `chunkBatch4` invocations).
- Parent-level dispatch to `compress4` is wired into `treePushChunk` via the queue-per-level discipline (BLAKE3 §2.5): each tree level maintains a queue of pending CVs and batches 4 parent merges via `parentBatch4` (`src/asm/blake3/tree_simd.ts`) when the queue reaches 8 entries. Finalize uses `compress` (single-pair) for residual merges and the ROOT compress. Asserted by `test/unit/blake3/blake3-parent-dispatch.test.ts` (test-only WASM counter on `parentBatch4` invocations).
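
The lane gather for the last state row (`v12`..`v15`) can be pictured with ordinary typed-array reads. This is only a sketch of the addressing described above: the offsets `CTR_IN`, `BLEN_IN`, and `FLAGS_IN` are placeholder values for this example (the real ones live in `buffers.ts`), and plain four-element tuples stand in for v128 registers.

```typescript
// Placeholder offsets (assumptions) standing in for COMPRESS4_CTR_IN,
// COMPRESS4_BLEN_IN and COMPRESS4_FLAGS_IN.
const CTR_IN = 0;    // 4 lanes × 8 bytes: lo at +0, hi at +4
const BLEN_IN = 32;  // 4 lanes × 4 bytes
const FLAGS_IN = 48; // one u32 shared by all lanes

type Lanes = [number, number, number, number];

// Reads one little-endian u32 per lane: lane K lives at base + K × stride.
function gather(mem: DataView, base: number, stride: number, wordOff: number): Lanes {
  const lane = (k: number): number => mem.getUint32(base + k * stride + wordOff, true);
  return [lane(0), lane(1), lane(2), lane(3)];
}

function gatherRow3(mem: DataView): { v12: Lanes; v13: Lanes; v14: Lanes; v15: Lanes } {
  const flags = mem.getUint32(FLAGS_IN, true);
  return {
    v12: gather(mem, CTR_IN, 8, 0),      // [ctrLo_0 .. ctrLo_3]
    v13: gather(mem, CTR_IN, 8, 4),      // [ctrHi_0 .. ctrHi_3]
    v14: gather(mem, BLEN_IN, 4, 0),     // [blen_0 .. blen_3]
    v15: [flags, flags, flags, flags],   // splat: every lane shares the flags word
  };
}
```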

## Chunk Machine (`src/asm/blake3/chunk.ts`)

- `chunkInit(chunkIndex)` copies `MODE_CV` into `CV` (the working chunk CV starts from the mode CV per BLAKE3 §2.4).
- `chunkInit` writes `chunkIndex` to `CHUNK_INDEX` as a u64; every compress within the chunk uses this value as the counter.
- `chunkInit` resets `CHUNK_BLOCKS`, `CHUNK_PENDING_LEN`, and `CHUNK_PENDING_BLOCK` so an empty-input chunk (no `chunkUpdate` before `chunkFinalize`) compresses the canonical 64-zero block with `block_len = 0` (§2.4 single-chunk root case).
- `chunkUpdate(blockOff, blockLen)` first compresses the previously buffered block as a non-final block when one is pending, then stashes the new block in `CHUNK_PENDING_BLOCK` for the next call or for `chunkFinalize` to consume.
- When `blockLen < 64` the remainder of `CHUNK_PENDING_BLOCK` is zero-padded; the final block byte count is preserved in `CHUNK_PENDING_LEN` for the eventual finalize.
- `chunkFinalize(outCv, isRootSoloChunk)` compresses the buffered block with `CHUNK_END`; when `isRootSoloChunk` is true, the compress also carries `ROOT` (§2.4 single-chunk root case).
- `compressPending` ORs `MODE_FLAGS` onto every compress so keyed_hash (`FLAG_KEYED_HASH`) and derive_key (`FLAG_DERIVE_KEY_CONTEXT` / `FLAG_DERIVE_KEY_MATERIAL`) flag bits are carried on every chunk compress.
- `CHUNK_START` is set on the compress when `CHUNK_BLOCKS == 0`, i.e. the first compress of the chunk.
- When `isRootSoloChunk` is true, the §2.4 single-chunk root compress input (`CV`, `CHUNK_PENDING_BLOCK`, `pendingLen`, `flags`) is snapshotted into `ROOT_STATE_*` immediately before the compress fires, enabling §2.6 XOF squeezes to re-fire from the snapshot.
- After every compress, the first 32 bytes of `COMPRESS_OUT` are copied into `CV` as the new chunk CV; `CHUNK_BLOCKS` is incremented; `CHUNK_PENDING_LEN` is reset to 0.
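
The flag bookkeeping above determines, for a chunk of a given length, exactly which compresses fire and with which `block_len` and flags. The sketch below enumerates that sequence without performing any hashing; `chunkCompressPlan` and `BlockCompress` are hypothetical names for this example, not functions in `chunk.ts`.

```typescript
const CHUNK_START = 1;
const CHUNK_END = 2;
const ROOT = 8;

interface BlockCompress {
  blockLen: number; // value placed in the block_len input of this compress
  flags: number;    // flag word for this compress
}

// Lists the compresses a single chunk of `chunkLen` bytes would fire, in order.
function chunkCompressPlan(chunkLen: number, modeFlags: number, isRootSoloChunk: boolean): BlockCompress[] {
  if (!Number.isInteger(chunkLen) || chunkLen < 0 || chunkLen > 1024) {
    throw new RangeError("a chunk holds 0..1024 bytes");
  }
  // An empty chunk still compresses one all-zero block with block_len = 0.
  const blocks = Math.max(1, Math.ceil(chunkLen / 64));
  const plan: BlockCompress[] = [];
  for (let i = 0; i < blocks; i++) {
    const isLast = i === blocks - 1;
    let flags = modeFlags;                                        // mode bits ride on every compress
    if (i === 0) flags |= CHUNK_START;                            // first compress of the chunk
    if (isLast) flags |= CHUNK_END | (isRootSoloChunk ? ROOT : 0); // final compress of the chunk
    plan.push({ blockLen: isLast ? chunkLen - 64 * i : 64, flags });
  }
  return plan;
}
```

For example, a 65-byte non-root chunk in default mode yields two entries: a 64-byte block carrying only `CHUNK_START`, then a 1-byte block carrying only `CHUNK_END`.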

## Subtree Stack and Root (`src/asm/blake3/tree.ts`)

The §2.5 tree-assembly machinery uses a queue-per-level discipline:
each of the BLAKE3 §5.1.2 54 tree levels maintains a small queue of
pending CVs in `LEVEL_QUEUES`. At push time, when a level's queue
reaches 8 entries, `parentBatch4` (`src/asm/blake3/tree_simd.ts`)
batches 4 parent merges in parallel through the v128-external
`compress4` kernel and the 4 outputs propagate to the next level's
queue. Finalize walks the queues bottom-up using single-pair
`compress` calls for residual merges and the §2.5 ROOT compress.
ROOT is exclusive (§2.5) and lives on a single compress invocation,
which is why finalize never batches: ROOT-flag bookkeeping is
simplest with single-pair semantics.

- `treeInit()` zeroes all 54 entries of `LEVEL_COUNTS`. Queue contents themselves do not need zeroing here: unread slots past a level's count are never consumed, and `wipeBuffers()` covers the queues on dispose.
- `treePushChunk(chunkCv)` appends the new chunk CV to level-0's queue and cascades upward: while `count[L] >= 8`, fires `parentBatch4(queue[L], queue[L+1] + count[L+1] * 32)`, zeroes `count[L]`, and adds 4 to `count[L+1]`. The chunk counter is consumed inside chunk.ts for the §2.4 `t` field; the queue-per-level discipline only needs per-level counts.
- Each `parentBatch4` invocation runs four §2.5 parent compresses in parallel via `compress4`: CV input = `MODE_CV` splatted across all 4 lanes, msg input = left || right (64 bytes per lane), counter = 0 (shared), block_len = 64 (shared), flags = `FLAG_PARENT | MODE_FLAGS` (shared). Per BLAKE3 §2.5 each parent compress at one level is independent of the others, so the batched output is bit-equivalent to four sequential `compress` calls.
- Push-time batches fire when a level's count reaches exactly 8 (never higher); after batching the count resets to 0 and the next level's count grows by 4. The cascade continues as long as upper levels also reach 8 pending. Loop terminates naturally at `MAX_LEVEL - 1`; the BLAKE3 §5.1.2 input-size bound (≤ 2^64 bytes) ensures the cascade does not exceed 54 levels with valid inputs.
- `treeFinalizeRoot(outOff)` computes `totalCvs` as the sum of all `count[L]`, sets `remainingMerges = totalCvs - 1`, and walks levels 0..MAX_LEVEL-1. At each level it pair-compresses while `count[L] >= 2` (emit to `queue[L+1]` tail; the FINAL merge with `remainingMerges == 1` writes to `outOff` and carries the §2.5 ROOT flag). A residual single CV (`count[L] == 1` after the pair loop) carries up to `queue[L+1]` with no merge consumed.
- The 9-entry-per-level queue width (`LEVEL_QUEUE_STRIDE = 288`) covers the transient peak of 8 entries (post-push 4 from a prior batch plus 4 finalize-time emissions: 3 pair-emits plus 1 carry from the level below) with 1 slot of headroom for alignment and future tightening. Levels L ≥ 1 reach count 0 or 4 at push-end (pushes add +4 and batches fire at exactly 8), so the peak write offset during finalize is index 7.
- During the root merge (`remainingMerges == 1`) `ROOT_STATE_*` is populated with `MODE_CV`, `TREE_PARENT_BLOCK`, `blockLen = 64`, and the final flags (`PARENT | ROOT | MODE_FLAGS`). Subsequent `squeezeXofBlock` calls re-fire the root compress from this snapshot per BLAKE3 §2.6.
- The root merge writes the full 64 bytes directly to the caller-supplied `outOff`; non-root merges write 32-byte parent CVs into the destination queue tail.
- After finalize, every `LEVEL_COUNTS` slot is reset to 0 so a follow-up hash on the same module instance starts clean (the dispose-time `wipeBuffers()` is the broader sweep).
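
The counting claims in this section (batches fire at exactly 8, upper levels end a push at 0 or 4, finalize never needs more than 8 queue slots) can be exercised with a counts-only simulation. The sketch below tracks per-level counts and never touches CV bytes; `simulatePushes` and `simulateFinalize` are hypothetical helpers written from the description above, not code from `tree.ts`.

```typescript
const MAX_LEVEL = 54;

// Push `totalChunks` chunk CVs, cascading a 4-wide batch whenever a level holds 8.
function simulatePushes(totalChunks: number): { counts: number[]; batches: number } {
  const counts = new Array<number>(MAX_LEVEL).fill(0);
  let batches = 0;
  for (let i = 0; i < totalChunks; i++) {
    counts[0] += 1;
    for (let L = 0; L < MAX_LEVEL - 1 && counts[L] >= 8; L++) {
      batches += 1;        // one parentBatch4: 8 CVs in, 4 parent CVs out
      counts[L] = 0;
      counts[L + 1] += 4;
    }
  }
  return { counts, batches };
}

// Walk the levels bottom-up with single-pair merges; track the largest queue count seen.
function simulateFinalize(pushCounts: number[]): { merges: number; peak: number } {
  const c = pushCounts.slice();
  let remaining = c.reduce((a, b) => a + b, 0) - 1; // remainingMerges = totalCvs - 1
  let merges = 0;
  let peak = Math.max(...c);
  for (let L = 0; L < MAX_LEVEL - 1 && remaining > 0; L++) {
    while (c[L] >= 2 && remaining > 0) {
      c[L] -= 2;
      merges += 1;
      remaining -= 1;
      if (remaining > 0) c[L + 1] += 1;  // the final (ROOT) merge goes to outOff instead
      peak = Math.max(peak, c[L + 1]);
    }
    if (c[L] === 1 && remaining > 0) {   // residual single CV carries up unmerged
      c[L] = 0;
      c[L + 1] += 1;
      peak = Math.max(peak, c[L + 1]);
    }
  }
  return { merges, peak };
}
```

Under this model a tree over n chunks performs n - 1 parent compresses in total, so `4 * batches + merges` should equal `totalChunks - 1` for every multi-chunk input, and `peak` should stay at or below 8.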

## Top-level Entries (`src/asm/blake3/index.ts`)

- `hash(inputOff, inputLen, outOff, outLen)` loads the BLAKE3 IV into `MODE_CV` (BLAKE3 §2.2 Table 1) and sets `MODE_FLAGS = 0` before invoking `hashCore`.
- `hashKeyed(keyOff, ...)` copies 32 bytes from `keyOff` into `MODE_CV` (the §2.3 keyed_hash starting CV is the key, byte-for-byte; WASM little-endian matches the §2.3 u32 LE word interpretation) and sets `MODE_FLAGS = FLAG_KEYED_HASH`.
- `deriveKey(contextOff, contextLen, materialOff, materialLen, outOff, outLen)` runs the two §2.3 derive_key passes:
  - Pass 1: `MODE_CV = IV`, `MODE_FLAGS = FLAG_DERIVE_KEY_CONTEXT`, `hashCore(context, ..., CONTEXT_CV, 32)`.
  - Pass 2: `MODE_CV = CONTEXT_CV`, `MODE_FLAGS = FLAG_DERIVE_KEY_MATERIAL`, `hashCore(material, ..., outOff, outLen)`.
- After pass 2 `deriveKey` zeros `CONTEXT_CV` (32 B) so the derived intermediate does not linger between successive `deriveKey` invocations on the same module instance.
- `hashCore(inputOff, inputLen, outOff, writeLen)` takes the single-chunk path when `inputLen ≤ 1024` (§2.4 single-chunk root): one chunk init, sequential `chunkUpdate` blocks of up to 64 bytes, `chunkFinalize(..., isRootSoloChunk = true)`. ROOT lives on the chunk's final compress.
- The single-chunk path mirrors `COMPRESS_OUT[0..63]` to `ROOT_OUT_SCRATCH` so the first-block emit path is uniform across single-chunk and multi-chunk inputs.
- `hashCore` takes the multi-chunk path when `inputLen > 1024`: `treeInit`, then loop emitting `chunkInit` / `chunkUpdate` × N / `chunkFinalize(..., false)` followed by `treePushChunk(...)`. The chunk counter is advanced from 0; the §2.4 `t` for compress K is `chunkIdx` at the K-th chunk.
- After the chunk loop the multi-chunk path fires `treeFinalizeRoot(ROOT_OUT_SCRATCH)`, which writes the 64-byte root output to scratch.
- For `writeLen > 64` the loop fires `squeezeRootBlock(counter, ROOT_OUT_SCRATCH)` with counter = 1, 2, ... and copies up to 64 bytes per iteration into the output region. Per BLAKE3 §2.6 the root counter increments for each additional output block; counter 0 is consumed by the initial root compress.
- `squeezeXofBlock(counterLo, counterHi, outOff)` re-fires the root compress from `ROOT_STATE_*` and writes 64 bytes to `outOff`. The export is gated for the TS `BLAKE3OutputReader` and exercised through it.
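
The output loop for `writeLen > 64` amounts to a simple schedule of root counters and copy lengths. A sketch of that schedule follows; `outputPlan` is a hypothetical helper for illustration and does not model how the scratch buffers are reused.

```typescript
interface OutputStep {
  counter: number; // root compress counter: 0 is the initial root compress, 1.. are squeezes
  take: number;    // bytes copied from that 64-byte block into the output region
}

// Lists which root-counter blocks contribute to an output of `writeLen` bytes.
function outputPlan(writeLen: number): OutputStep[] {
  if (!Number.isInteger(writeLen) || writeLen < 1) {
    throw new RangeError("writeLen must be a positive integer");
  }
  const plan: OutputStep[] = [];
  let written = 0;
  for (let counter = 0; written < writeLen; counter++) {
    const take = Math.min(64, writeLen - written);
    plan.push({ counter, take });
    written += take;
  }
  return plan;
}
```

A default 32-byte digest needs only the counter-0 block, while the 131-byte extended outputs in the upstream corpus span counters 0, 1, and 2 (64 + 64 + 3 bytes).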

## TS Validation (`src/ts/blake3/validate.ts`)

- `validateKey(key)` throws `TypeError` when `key` is not a `Uint8Array`.
- `validateKey(key)` throws `RangeError` when `key.length !== 32`; the error message names the actual length received.
- `validateContext(context)` accepts `string` (UTF-8 encoded here) or `Uint8Array` (passed through). Other types throw `TypeError`.
- `validateContext(context)` throws `RangeError` on an empty context (string `''` or zero-length `Uint8Array`); per BLAKE3 §2.3 an empty context defeats the domain-separation property.
- `validateContext(context)` does NOT impose an upper cap on context length. Caller-trust without hard caps matches xero's substrate-code preference; a long context is a design smell but not a spec violation.
- `validateOutputLen(outLen)` throws `RangeError` when `outLen` is non-number, non-finite, non-integer, NaN, or `Infinity`.
- `validateOutputLen(outLen)` throws `RangeError` when `outLen < 1`.
- `validateOutputLen(outLen)` does NOT cap the per-call output. The caller-facing one-shot ceiling of 1024 bytes is enforced separately in `src/ts/blake3/index.ts` so the validator stays usable for the streaming-XOF read path too.
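
A compact way to read these checkboxes is as a behavioural specification. The sketch below implements that specification under assumptions: the `sketch*` functions are stand-ins written for this example, and their return values and error messages are illustrative rather than taken from `validate.ts`.

```typescript
// Stand-in validators following the behaviour listed above (names and messages are assumptions).
function sketchValidateKey(key: unknown): Uint8Array {
  if (!(key instanceof Uint8Array)) throw new TypeError("key must be a Uint8Array");
  if (key.length !== 32) throw new RangeError(`key must be 32 bytes (got ${key.length})`);
  return key;
}

function sketchValidateContext(context: unknown): Uint8Array {
  let bytes: Uint8Array;
  if (typeof context === "string") {
    bytes = new TextEncoder().encode(context); // strings are UTF-8 encoded here
  } else if (context instanceof Uint8Array) {
    bytes = context;                           // byte arrays pass through
  } else {
    throw new TypeError("context must be a string or Uint8Array");
  }
  // An empty context would defeat domain separation; no upper cap is imposed.
  if (bytes.length === 0) throw new RangeError("context must not be empty");
  return bytes;
}

function sketchValidateOutputLen(outLen: unknown): number {
  // Number.isInteger rejects NaN, ±Infinity and fractional values in one test.
  if (typeof outLen !== "number" || !Number.isInteger(outLen)) {
    throw new RangeError("outLen must be a finite integer");
  }
  if (outLen < 1) throw new RangeError("outLen must be at least 1");
  return outLen; // deliberately no per-call cap at this layer
}
```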

## TS Public Surface (`src/ts/blake3/index.ts`)

- `BLAKE3.hash(msg, outLen?)`, `BLAKE3KeyedHash.hash(key, msg, outLen?)`, and `BLAKE3DeriveKey.derive(context, material, outLen?)` each call `_assertNotOwned('blake3')` before any WASM access.
- Each one-shot method validates its caller inputs (type / size) before staging anything to WASM memory.
- Each one-shot method enforces the 1024-byte one-shot output ceiling (`OUTPUT_STAGING_SIZE`) and throws `tooBigForOneShotError` for larger requests; the error message routes the caller to `finalizeXof()` and `BLAKE3OutputReader.read(n)`.
- Each one-shot method enforces the per-call input ceiling of 114688 bytes (`INPUT_SCRATCH_MAX`) via `stageInput`; the error message routes the caller to the streaming surface.
- `oneShotHash` wipes the input scratch region in its `finally` and calls `x.wipeBuffers()` on the way out.
- `oneShotKeyedHash` zeros the 32-byte `KEYED_KEY` slot in `finally` in addition to the input wipe and `wipeBuffers()`.
- `oneShotDeriveKey` zeros the contiguous `[ctxOff, matOff + materialLen)` region in `finally` (covers both the context and material staging) plus `wipeBuffers()`.
- `BLAKE3Stream`, `BLAKE3KeyedHashStream`, and `BLAKE3DeriveKeyStream` all acquire `_acquireModule('blake3')` in their constructor and release it on `dispose()` / `finalize()` / `finalizeXof()` (via the transfer path).
- Each streaming class throws on `update()` after `finalize()` / `finalizeXof()` and on any method call against a disposed instance.
- Each streaming class enforces the running-length cap (`INPUT_SCRATCH_MAX = 114688`) via `StreamState.pushChunk`. The error names the per-call WASM input scratch size and routes the caller.
- `BLAKE3KeyedHashStream` keeps a defensive 32-byte copy of the key; the caller's key buffer is untouched and the instance-owned copy is wiped on `dispose()` / `finalize()` / on transfer to a `BLAKE3OutputReader`.
- `BLAKE3DeriveKeyStream` encodes the context once at construction (via `validateContext`) and reuses it for every subsequent operation; finalizing the stream re-stages `ctx || material` into the input scratch in that order, matching the §2.3 derive_key two-pass driver in `deriveKey`.
- `finalizeXof()` transfers the module exclusivity token from the streaming instance to the new `BLAKE3OutputReader` without releasing-then-reacquiring, so no race opens between streams.
- `BLAKE3OutputReader.read(n)` validates `n` via `validateOutputLen`, populates the WASM-side `ROOT_STATE_*` snapshot on its first call by running the underlying hash entry, and squeezes additional 64-byte blocks via `squeezeXofBlock` with an incrementing counter (starting at 1, since counter 0 was consumed by the snapshot-population call).
- `BLAKE3OutputReader._populate` does NOT call `wipeBuffers()` (which would clobber `ROOT_STATE_*`); it wipes the input scratch and the output staging slot for that one block, and the reader owns module exclusivity for its lifetime so no other instance can clobber the snapshot.
- `BLAKE3OutputReader.dispose()` wipes `_blockBuf`, the instance key copy if present, calls `x.wipeBuffers()`, and releases module exclusivity.
- `BLAKE3Hash.digest(msg)` calls `_assertNotOwned('blake3')`, runs a one-shot 32-byte hash, and returns the bytes. No state, no exclusivity hold.
- All `dispose()` paths are idempotent and never throw (the inner `wipeBuffers()` call is wrapped in `try / catch {}` where the lifecycle is teardown-safe).
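
The sequential-read behaviour of the output reader (64-byte blocks, an incrementing counter, reads that may straddle a block boundary) can be modelled independently of WASM. In the sketch below, `SketchOutputReader` is a hypothetical class and its injected `squeeze` callback stands in for whatever produces the 64-byte block for a given counter; it makes no claim about the real reader's internals beyond the buffering pattern.

```typescript
// Hypothetical model of a forward-only XOF reader over 64-byte blocks.
class SketchOutputReader {
  private readonly squeeze: (counter: number) => Uint8Array;
  private block: Uint8Array = new Uint8Array(0); // current block (empty until first read)
  private pos = 0;                               // read position inside `block`
  private counter = 0;                           // next block counter to request

  constructor(squeeze: (counter: number) => Uint8Array) {
    this.squeeze = squeeze;
  }

  read(n: number): Uint8Array {
    if (!Number.isInteger(n) || n < 1) throw new RangeError("n must be a positive integer");
    const out = new Uint8Array(n);
    let written = 0;
    while (written < n) {
      if (this.pos === this.block.length) {
        // Current block exhausted: fetch the next one and advance the counter.
        this.block = this.squeeze(this.counter);
        this.counter += 1;
        this.pos = 0;
      }
      const take = Math.min(n - written, this.block.length - this.pos);
      out.set(this.block.subarray(this.pos, this.pos + take), written);
      this.pos += take;
      written += take;
    }
    return out;
  }
}
```

With this shape, `read(10)` followed by `read(100)` returns the same 110 bytes as a single `read(110)`, which is the property the cross-boundary XOF tests rely on.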

## Memory Hygiene

`wipeBuffers()` zeroes every mutable region of the BLAKE3 WASM module
in one pass (`memory.fill(MUTABLE_START, 0, BUFFER_END - MUTABLE_START)`),
covering the regions below. The TS wrapper's `dispose()` paths call
`wipeBuffers()`, and one-shot methods call it on every `finally`.

- `INPUT_STAGING` (caller input residue) is zeroed.
- `OUTPUT_STAGING` (XOF / one-shot output staging) is zeroed.
- Working compress slots (`CV` / `MSG` / `COUNTER` / `BLOCK_LEN` / `FLAGS` / `COMPRESS_OUT`) are zeroed.
- `KEYED_KEY` (32-byte §2.3 keyed_hash key) is zeroed.
- `DERIVE_CV` (derive_key intermediate stage) is zeroed.
- `LEVEL_QUEUES` (54 × 9 × 32 bytes of per-level pending CVs) is zeroed.
- `LEVEL_COUNTS` (54 × 4 bytes of per-level i32 counts) is zeroed.
- `COMPRESS4_*` staging buffers (CV / MSG / CTR / OUT / BLEN / FLAGS) are zeroed.
- Chunk-state slots (`CHUNK_INDEX` / `CHUNK_BLOCKS` / `CHUNK_PENDING_LEN` / `CHUNK_PENDING_BLOCK`) are zeroed.
- Tree-state slots (`TREE_PARENT_BLOCK` / `CHUNK_CV_SCRATCH` / `ROOT_OUT_SCRATCH`) are zeroed.
- `MODE_CV` (per-mode starting CV; holds key bytes in keyed_hash mode and context CV in derive_key pass 2) is zeroed.
- `MODE_FLAGS` is zeroed.
- `CONTEXT_CV` (derive_key pass-1 output) is zeroed; additionally, `deriveKey` explicitly zeroes this slot between successive invocations even when `wipeBuffers()` is not called between them.
- `ROOT_STATE_*` (XOF snapshot) is zeroed.
- The AS data segment (offsets 0..MUTABLE_START-1) is NOT wiped. It holds the SIGMA-style read-only tables.
- Per-class `dispose()` wipe coverage is asserted in `test/unit/blake3/blake3-wipe.test.ts`.
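
The single-pass wipe and its boundaries can be mirrored on a plain byte array. This sketch is a TypeScript analogue, not the AssemblyScript export: `sketchWipe` is a hypothetical name, and `Uint8Array.prototype.fill(value, start, end)` takes an exclusive end index where the `memory.fill` call quoted above takes a length.

```typescript
const MUTABLE_START = 4096;
const BUFFER_END = 26328;

// Zero [MUTABLE_START, BUFFER_END) and leave everything else untouched.
function sketchWipe(mem: Uint8Array): void {
  mem.fill(0, MUTABLE_START, BUFFER_END);
}

// Returns true when every byte of the mutable range is zero.
function isWiped(mem: Uint8Array): boolean {
  for (let i = MUTABLE_START; i < BUFFER_END; i++) {
    if (mem[i] !== 0) return false;
  }
  return true;
}
```

A wipe test in this style fills memory with a sentinel, runs the wipe, and then checks both that the mutable range is zero and that the byte just below `MUTABLE_START` (the data segment) still holds the sentinel.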

## Constant-Time Considerations

- BLAKE3's compress is straight-line ARX over a fixed schedule. No conditional branches inside the compression rounds; no key-indexed memory accesses; no secret-dependent loop bounds.
- `keyed_hash` mode loads the 32-byte key directly into the chunk machine's starting CV (`MODE_CV`). The key bytes are XOR-mixed and added into the state on every round but never select a code path or memory index.
- `derive_key` mode loads the context bytes through the same chunk pipeline as ordinary input bytes; no branch reads the context as a secret.
- The chunk machine's one-block lookahead branches on `pendingLen > 0`, which is purely a function of public structure (how many `chunkUpdate` calls have fired and with what `blockLen`). Not secret-derived.
- The tree-mode queue-per-level cascade branches on `count[L] >= 8` (push) and `count[L] >= 2` (finalize), which are functions of `totalChunks` (the public input length divided by 1024, rounded up) only. Not secret-derived.
- The §2.6 XOF squeeze loop branches on the requested output length, which is a public, caller-specified value.

## Side Channels

- Timing side channels. BLAKE3 has no key-dependent branches and no key-indexed table lookups by algorithm design. Timing equalization at the CPU level is out of scope; see architecture.md §Where defense ends. The published BLAKE3 / BLAKE2 cryptanalysis literature does not report timing-side-channel weaknesses in the ARX round structure.
- Cache side channels. No data-dependent table lookups exist in the compress kernel. The SIGMA table is indexed by the round number (a public loop counter), not by secret data. The `BLAKE3_IV*` constants are inlined into the source.
- Power and EM. Out of scope per architecture.md §Where defense ends.
- Fault attacks. Out of scope for v3 hashing modules.

## Test Coverage

- `test/unit/blake3/blake3-compress.test.ts` covers the v128-internal `compress` against a single-block KAT (the gate, BLAKE3 §2.2 empty-input compression).
- `test/unit/blake3/blake3-kat.test.ts` covers all 35 records of the upstream BLAKE3 KAT corpus for default-mode `hash` (asserts first 32 bytes of the upstream `hashHex`).
- `test/unit/blake3/blake3-compress4-equiv.test.ts` cross-checks `compress4` output against 4 × `compress` over randomized inputs (64+ iterations) and asserts byte-for-byte equality across all four lanes.
- `test/unit/blake3/blake3-compress4-dispatch.test.ts` asserts `hashCore` actually dispatches multi-chunk inputs (≥ 4096 bytes) through `chunkBatch4`, with exact-count assertions for representative sizes and KAT regression on every upstream record with `inputLen >= 4096`.
- `test/unit/blake3/blake3-parent-dispatch.test.ts` asserts `treePushChunk` actually dispatches parent merges through `parentBatch4` for inputs producing ≥ 8 chunks (queue-per-level cascade), with exact-count assertions at 4096 / 7168 / 8192 / 16384 / 32768 / 65536 byte inputs and KAT regression on every upstream record with `inputLen >= 8192`.
- `test/unit/blake3/blake3-keyed-hash.test.ts` covers `hashKeyed` against all 35 upstream KAT records using the upstream test key, asserting the first 32 bytes of `keyedHashHex`.
- `test/unit/blake3/blake3-derive-key.test.ts` covers `deriveKey` against all 35 upstream KAT records using the upstream test context string, asserting the first 32 bytes of `deriveKeyHex`.
- `test/unit/blake3/blake3-surface.test.ts` covers the TS public surface for the three one-shot classes (`BLAKE3`, `BLAKE3KeyedHash`, `BLAKE3DeriveKey`) including the `BLAKE3Hash` Fortuna const round-trip.
- `test/unit/blake3/blake3-tree-internals.test.ts` drives the `_testChunkCV`, `_testParentCV`, and `_testDeriveContextCV` exports against a curated corpus of single-chunk / multi-chunk / power-of-2 / non-power-of-2 inputs across all three modes.
- `test/unit/blake3/blake3-streaming.test.ts` covers streaming-vs-one-shot equivalence across 10+ size regimes for `BLAKE3Stream`, `BLAKE3KeyedHashStream`, and `BLAKE3DeriveKeyStream`, plus the streaming lifecycle (update-after-finalize, double-dispose, dispose-while-reader-live).
- `test/unit/blake3/blake3-xof.test.ts` covers full 131-byte XOF assertions across all 105 corpus records (35 × 3 modes) via `BLAKE3OutputReader`, including reads that cross the 64-byte block boundary.
- `test/unit/blake3/blake3-large-input.test.ts` cross-checks `BLAKE3.hash`, `BLAKE3KeyedHash.hash`, and `BLAKE3DeriveKey.derive` against a Rust oracle (RustCrypto `blake3 = "=1.8.5"`) for input sizes spanning 1 KiB to 16 MiB; the expected hex values are precomputed by `scripts/verify-vectors/` and pinned in `test/vectors/`.
- `test/unit/blake3/blake3-wipe.test.ts` asserts that every public `dispose()` and one-shot `finally` path zeroes the regions covered in the Memory Hygiene section.
- `test/unit/blake3/blake3-validation.test.ts` covers every validate.ts throw path: bad key length / type, empty context, bad context type, bad outLen (negative, zero, NaN, Infinity, non-integer), oversize input, oversize one-shot output, post-finalize update, double-dispose, exclusivity guard violations.

## Open Audit Items

- Verified streaming output. Bao verified streaming (a BLAKE3 §6 proof-system extension) is not implemented. Deferred to a future log-proof substrate layered on top of `src/ts/merkle/blake3-tree.ts`, which would consume `_testChunkCV` / `_testParentCV`.
- OutputReader seek. The `BLAKE3OutputReader` reads sequentially forward. BLAKE3 §2.6 allows arbitrary seek into the XOF stream by re-firing the root compress with a target counter, but the reader's public surface does not expose seek. Deferred; consumers that need random-access XOF can dispose the reader and re-finalize the underlying stream.
- `compress8`. Deferred. WebAssembly SIMD is fixed at 128-bit vectors; a `compress8` over two physical registers per state word would double the register pressure without doubling the parallelism. Revisit when wide-SIMD WebAssembly lands.

## Cross-References

| Document | Description |
|---|---|
| blake3 | BLAKE3 TypeScript API reference. |
| asm_blake3 | BLAKE3 WASM module reference: buffer layout, exports, SIMD dispatch. |
| architecture | Repository structure, build and CI, WASM modules, public API, test suite, and security posture. |
| vector_audit | Test-vector tier classification and verifier coverage. |
| audits | Project audit index. |
| BLAKE3 paper | The BLAKE3 specification. |