feat(attention): add fused SDPA graph.Node (T3.1a) #838

Merged
dndungu merged 1 commit into main from wave-22-task-T3.1a on Apr 29, 2026

Conversation


dndungu (Contributor) commented on Apr 29, 2026

Summary

Wraps the existing SDPA from layers/attention/scaled_dot_product_attention.go in a graph.Node[T] (FusedSDPA) so that consumers (Wolf cross-attention, T3.1b) can compose it via graph.Builder without duplicating the math.

  • New layers/attention/fused_sdpa_node.go: FusedSDPA[T tensor.Numeric] implementing graph.Node[T]. OpType="FusedSDPA", attributes head_dim + causal, no parameters, cached OutputShape().
  • Forward accepts (Q,K,V) or (Q,K,V,mask); delegates to inner SDPA.
  • Backward delegates to inner SDPA and appends a nil grad slot for the mask input when present, so input/grad indexing stays aligned.
  • Options: WithFusedSDPABidirectional, WithFusedSDPAHeadCounts. The causal default mirrors the existing SDPA convention (causal-on unless bidirectional). A rough sketch of the node is shown after this list.
  • New layers/attention/fused_sdpa_node_test.go covers fp32/fp64 x {causal, bidirectional, masked} forward+backward equivalence against the unfused ScaledDotProductAttention chain. Tolerances: 1e-6 (fp32 forward), 1e-12 (fp64 forward), 1e-5 (fp32 backward), 1e-10 (fp64 backward). Stdlib testing only; a sketch of the check style follows the test plan below.
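
For orientation, here is a minimal sketch of what such a node could look like. The graph.Node[T] method set, the tensor import path, the inner SDPA signatures, and the option shape are assumptions inferred from the bullets above, not the repository's actual code.

```go
// Sketch only: graph.Node[T] contract, tensor package path, and the inner
// SDPA method signatures are assumptions, not the repo's real API.
package attention

import (
	"fmt"

	"github.com/example/project/tensor" // hypothetical import path
)

// FusedSDPA wraps the existing scaled dot-product attention so it can be
// composed as a single graph node. It holds no trainable parameters.
type FusedSDPA[T tensor.Numeric] struct {
	inner    *ScaledDotProductAttention[T] // existing unfused op (assumed type name)
	headDim  int
	causal   bool  // causal-on by default, per the SDPA convention
	outShape []int // cached and returned by OutputShape()
}

func (n *FusedSDPA[T]) OpType() string { return "FusedSDPA" }

func (n *FusedSDPA[T]) Attributes() map[string]any {
	return map[string]any{"head_dim": n.headDim, "causal": n.causal}
}

// Parameters returns nil: the node is a pure function of its inputs.
func (n *FusedSDPA[T]) Parameters() []*tensor.Tensor[T] { return nil }

func (n *FusedSDPA[T]) OutputShape() []int { return n.outShape }

// Forward accepts (Q, K, V) or (Q, K, V, mask) and delegates to the inner SDPA.
func (n *FusedSDPA[T]) Forward(inputs ...*tensor.Tensor[T]) (*tensor.Tensor[T], error) {
	switch len(inputs) {
	case 3:
		return n.inner.Forward(inputs[0], inputs[1], inputs[2], nil)
	case 4:
		return n.inner.Forward(inputs[0], inputs[1], inputs[2], inputs[3])
	default:
		return nil, fmt.Errorf("FusedSDPA: expected 3 or 4 inputs, got %d", len(inputs))
	}
}

// Backward delegates to the inner SDPA (which yields dQ, dK, dV) and appends
// a nil gradient slot for the mask when it was supplied, so the gradient
// slice stays index-aligned with the input slice.
func (n *FusedSDPA[T]) Backward(upstream *tensor.Tensor[T], inputs ...*tensor.Tensor[T]) ([]*tensor.Tensor[T], error) {
	grads, err := n.inner.Backward(upstream, inputs...)
	if err != nil {
		return nil, err
	}
	if len(inputs) == 4 {
		grads = append(grads, nil) // mask receives no gradient
	}
	return grads, nil
}

// WithFusedSDPABidirectional disables the default causal masking; the real
// option signature may differ.
func WithFusedSDPABidirectional[T tensor.Numeric]() func(*FusedSDPA[T]) {
	return func(n *FusedSDPA[T]) { n.causal = false }
}
```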

Downstream: T3.1b (Wolf swap) consumes this node.

Test plan

  • go build ./...
  • go vet ./...
  • go test ./layers/attention/... -race -count=1 (1.29 s, all green, including the new FusedSDPA cases)
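
For reference, the new equivalence tests read roughly like the sketch below. randomQKV, the constructors, and the Data() accessor are hypothetical stand-ins; only the 1e-6 fp32 forward tolerance comes from the summary above.

```go
// Sketch of the fused-vs-unfused forward equivalence check; helper names,
// constructor signatures, and Data() are illustrative, not the real test file.
package attention

import (
	"math"
	"testing"
)

func TestFusedSDPAForwardMatchesUnfused(t *testing.T) {
	q, k, v := randomQKV[float32](2, 4, 8, 16) // batch, heads, seq, headDim (hypothetical helper)

	fused := NewFusedSDPA[float32](16)                         // hypothetical constructor, causal by default
	unfused := NewScaledDotProductAttention[float32](16, true) // hypothetical constructor, causal

	got, err := fused.Forward(q, k, v)
	if err != nil {
		t.Fatalf("fused forward: %v", err)
	}
	want, err := unfused.Forward(q, k, v, nil)
	if err != nil {
		t.Fatalf("unfused forward: %v", err)
	}
	if len(got.Data()) != len(want.Data()) {
		t.Fatalf("length mismatch: %d vs %d", len(got.Data()), len(want.Data()))
	}

	const tol = 1e-6 // fp32 forward tolerance from the summary
	for i := range got.Data() {
		diff := math.Abs(float64(got.Data()[i] - want.Data()[i]))
		if diff > tol {
			t.Fatalf("element %d: fused=%v unfused=%v (|diff|=%v > %v)",
				i, got.Data()[i], want.Data()[i], diff, tol)
		}
	}
}
```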

dndungu merged commit bfcd7a3 into main on Apr 29, 2026
7 checks passed
dndungu deleted the wave-22-task-T3.1a branch on April 29, 2026 at 03:58