Releases: zerfoo/zerfoo

v1.48.0

13 Apr 13:29

Features

  • crossasset: extract to feza-ai/wolf
  • parity: add MIMOMambaBlock and HModule structural parity tests

The crossasset/ package has been moved to github.com/feza-ai/wolf/crossasset.
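Downstream users can migrate by swapping the import path; a minimal sketch, assuming the old path was `github.com/zerfoo/zerfoo/crossasset` (the standard module path for a package in this repo):

```diff
-import "github.com/zerfoo/zerfoo/crossasset"
+import "github.com/feza-ai/wolf/crossasset"
```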

v1.47.0

13 Apr 13:29

Features

  • crossasset: convert model from float64 to float32
  • crossasset: replace forward slice math with Engine[T] ops and wire TrainGPU
  • parity: add GPU parity test Containerfile and Spark manifest
  • parity: add GPU vs CPU parity tests for activations, normalization, and RoPE
  • parity: add GPU vs CPU parity tests for core ops, attention, and backward

Bug Fixes

  • attention: recompute attention weights in SDPA backward after flash forward
  • crossasset: relax GPU parity test tolerance for flash attention divergence
  • cuda: add /opt/zerfoo/lib to libkernels.so dlopen search path
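The parity tests above compare GPU output against the CPU reference element-wise within a tolerance, relaxed where flash attention reorders floating-point accumulation. A self-contained sketch of such a check; `allClose`, `rtol`, and `atol` are illustrative names, not zerfoo API:

```go
package main

import (
	"fmt"
	"math"
)

// allClose reports whether got and want agree element-wise within
// atol + rtol*|want|, the usual combined absolute/relative tolerance.
func allClose(got, want []float32, rtol, atol float32) bool {
	if len(got) != len(want) {
		return false
	}
	for i := range got {
		diff := math.Abs(float64(got[i] - want[i]))
		bound := float64(atol) + float64(rtol)*math.Abs(float64(want[i]))
		if diff > bound {
			return false
		}
	}
	return true
}

func main() {
	cpu := []float32{1.0, 2.0, 3.0}
	gpu := []float32{1.0, 2.0000005, 2.9999995}
	// Flash-attention kernels accumulate in a different order than the
	// CPU path, so small element-wise divergence is expected and allowed.
	fmt.Println(allClose(gpu, cpu, 1e-5, 1e-6))
}
```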

v1.46.0

12 Apr 03:07

1.46.0 (2026-04-12)

Features

  • parity: add BlockAttnRes golden parity test (11e17fc)
  • parity: add CfC golden-file parity test (3490dac)
  • parity: add FreTS golden-file parity test (b0c63c8)
  • parity: add GRN golden-file parity test (18c6964)
  • parity: add TimeMixer golden-file parity test (d3b5ebe)
  • parity: upgrade PatchTST, N-BEATS, ITransformer to golden-file parity (92249e4)
  • parity: wire MambaBlock golden parity test (fdea55b)
  • timeseries: migrate CfC to Engine[T] compliance (ff971af)
  • timeseries: migrate DLinear and TimeMixer to Engine[T] compliance (d37ee31)
  • timeseries: migrate FreTS to Engine[T] compliance (37ece62)
  • timeseries: migrate ITransformer to Engine[T] (e8f4e0a)

v1.45.0

11 Apr 08:12

1.45.0 (2026-04-11)

Features

  • parity: add 11 more layer parity tests (E86.1 remaining) (462f24e)
  • parity: add 22 new layer parity tests (E86.1 + E86.3) (092e3bf)
  • parity: add GQA and MoE golden-file parity tests (2e043e9)
  • parity: add PyTorch golden file parity tests for 32 layers (6200355)
  • parity: Wave 2 - backward parity + model architectures (E86.2, E86.4) (9351180)
  • parity: wire Go tests for 10 existing golden files (E86.0) (e8b9815)

Bug Fixes

  • core: add missing transposes in MatMul backward (fefdcba)
  • loss: add 2/N scaling factor to MSE backward for mean reduction (54be887)
  • loss: add batch normalization to CrossEntropy backward (7733c08)
  • normalization: correct ReduceSum axis in LayerNorm backward (5c300c9)

v1.44.0

11 Apr 01:21

1.44.0 (2026-04-11)

Features

  • crossasset: add Save/Load for trained model weights (c1e7ab1), closes #378

v1.43.0

10 Apr 23:45

1.43.0 (2026-04-10)

Features

  • bench: add bench-spark.sh helper for Spark submission (0321a18)
  • bench: add PatchTST training benchmark tool (d847238)
  • bench: add Spark pod manifest for PatchTST training (0e05d43)
  • timeseries: activate fused encoder forward path (8aa526d)
  • timeseries: add weight-hash debug helper for GPU training diagnosis (c5a34c5)
  • timeseries: wire fused encoder kernel into PatchTST training (bafdad0)

Bug Fixes

  • bench: mount /opt/zerfoo/lib so libkernels.so is reachable (aa6331a)
  • bench: post YAML (not JSON) and parse Spark status shape (9d20746)
  • ci: make govulncheck non-blocking for unfixed bbolt vuln (b6b38a6)
  • mlstm: use paper's stabilized exponential-gating formulation (46b7b86)
  • slstm: use paper's stabilized exponential-gating formulation (e47e4a4)
  • timeseries: compare Storage identity in gradTs sentinel (a67063a)
  • timeseries: GPU training convergence — rebuild paramTs/gradTs per batch, strengthen sentinel, remove dead machinery (168a938)
  • timeseries: GPU training writes back optimizer step to device (f29c93b)
  • timeseries: skip flaky TimeMixer gradient check + add WithTimeMixerRNG (4f96d99)
  • timeseries: use return value of GPU Reshape in PatchTST backward (d61cbab)

Performance Improvements

  • timeseries: pre-allocate PatchTST GPU train loop buffers (E85 T85.2.1-3,5) (09a318c)

v1.42.1

06 Apr 23:07

1.42.1 (2026-04-06)

Bug Fixes

  • modeldsl: replace .Data() bias loop with engine.Add (4fd8d63)

v1.42.0

05 Apr 09:08

1.42.0 (2026-04-05)

Features

  • inference: add builder_helpers with newTensorLookup and newParamWrapper (adfb334)

Bug Fixes

  • generate: remove unused compute import after merge (b7511c3)

v1.41.0

04 Apr 01:15

1.41.0 (2026-04-04)

Features

  • cmd: add --pjrt flag for PJRT backend selection (66fb945)
  • crossasset: replace SGD with AdamW in CPU Train() (#315) (4d6664c)
  • functional: add GELUBackward for gradient computation (0e89305)
  • functional: add LayerNormBackward for gradient computation (1e51b9e)
  • functional: add LinearBackward for gradient computation (534127d)
  • functional: add MLPBackward for 2-layer MLP gradient computation (8624a1e)
  • functional: add MultiHeadAttentionBackward (2d91fa3)
  • functional: add SoftmaxBackward for gradient computation (1c2c486)
  • generate: wire PJRTPlan into decode loop (ca6bab6)
  • inference: add PJRT compilation path (9cde667)
  • layers: add functional activation wrappers (GELU, Softmax, ReLU, SiLU, Sigmoid) (962b36d)
  • layers: add functional LayerNorm and RMSNorm wrappers (08c7ac9)
  • layers: add functional Linear and MultiHeadAttention wrappers (e5449e8)

Bug Fixes

  • architecture: add crossasset/backward.go to privateLayer allowlist (5c01ccf)
  • architecture: add layernorm_ops.go backward to dataAbuse allowlist (34fe067)
  • crossasset: call Train() once with all epochs to preserve AdamW state (834b8f3)
  • crossasset: delegate TrainGPU to CPU full-backprop with AdamW (#317) (b345932)
  • crossasset: snapshot GPU tensors to CPU before backward reads (#317) (4de925e)
  • timeseries: resolve warmupLR merge conflict with scheduler.WarmupLR (9f573cf)
  • timeseries: update nhits_test weight shape check for transposed layout (f090509)
  • training: fix QuantileLoss generic type assertions (a282e9d)

Performance Improvements

  • training: replace guardAndClipGradients .Data() loops with Engine ops (92e1218)
  • training: replace SGD broadcast allocation with engine.MulScalar (aad4deb)

v1.40.1

02 Apr 19:11

1.40.1 (2026-04-02)

Bug Fixes

  • crossasset: prevent CUDA illegal memory access in TrainGPU backward (#317) (8db043e)