Skip to content

ggml : assert b->ne[3] == 1 in ggml_conv_transpose_2d_p0 (#1448)#1520

Open
devYRPauli wants to merge 1 commit into
ggml-org:masterfrom
devYRPauli:fix-conv-transpose-2d-batch-assert
Open

ggml : assert b->ne[3] == 1 in ggml_conv_transpose_2d_p0 (#1448)#1520
devYRPauli wants to merge 1 commit into
ggml-org:masterfrom
devYRPauli:fix-conv-transpose-2d-batch-assert

Conversation

@devYRPauli

Copy link
Copy Markdown
Contributor

Addresses #1448 via Option A from @janblumenkamp's analysis (assert at the API level, no kernel changes).

Bug

ggml_conv_transpose_2d_p0 accepts inputs with b->ne[3] > 1 and returns a tensor with the correct 4D output shape, but the CPU compute kernel (ggml_compute_forward_conv_transpose_2d_impl in ggml-cpu/ops.cpp) only iterates batch index 0:

  • the src1 permutation loop iterates ne12, ne11, ne10 but never ne13;
  • the compute loop offsets the output by i2*nb2 (channel) only, not by i3*nb3 (batch);
  • the work-buffer sizing omits a *ne13 factor.

Net effect: batch 0 is correct, batches 1..N-1 are silently zero. Caught only by explicitly comparing across batch elements. The bug has been present since 8da5be2 (Aug 2023) — every existing caller (SAM example, test-conv-transpose.c, test-backend-ops.cpp parameterizations) uses ne[3] == 1, so it has never surfaced in CI.

Fix

Add GGML_ASSERT(b->ne[3] == 1) to ggml_conv_transpose_2d_p0. Matches the existing GGML_ASSERT(a->ne[3] == 1) precedent in ggml_conv_transpose_1d. Turns silent data corruption into an immediate, debuggable abort.

Why not implement batch support instead?

Option B (implementing the batch loop in CPU + Metal + Vulkan kernels) is genuinely useful but adds capability for a code path with no current caller. Option A solves the correctness problem and matches existing precedent with minimal risk. If a caller later needs batched conv_transpose_2d_p0, removing the assert and implementing batch support across backends is a clean follow-up.

Verification

  • Reporter's reproducer (ggml_conv_transpose_2d_p0 silently ignores batch dimension (ne[3] > 1) #1448 body, 3-batch input with identity-like kernel) — before this change: prints batch 1: sum_abs = 0.0 <-- BUG. After this change: aborts with GGML_ASSERT(b->ne[3] == 1) failed at ggml.c:4882.
  • Full test-backend-ops test (Metal + CPU) — no regressions. All CONV_TRANSPOSE_2D parameterizations use ne[3] == 1 and continue to pass.

No test added

The natural regression test ("assert fires on ne[3] > 1") requires a death-test pattern (fork + check exit on SIGABRT) since GGML_ASSERT aborts the process — test-backend-ops's compute-and-compare model can't host it. Happy to add a separate death-test executable if maintainers prefer; let me know.

Credit to @janblumenkamp for the original bug analysis, complete reproducer, and historical research.

The CPU compute kernel for ggml_conv_transpose_2d_p0
(ggml_compute_forward_conv_transpose_2d_impl in ggml-cpu/ops.cpp) only
iterates over batch index 0:

* the src1 permutation loop iterates ne12 (channels), ne11 (height),
  ne10 (width) but never ne13 (batch);
* the compute loop offsets the output by i2*nb2 (channel) only, not by
  i3*nb3 (batch);
* the work-buffer sizing omits a *ne13 factor.

The output 4D shape is computed correctly, so callers see plausible
batched output where batch 0 is correct and batches 1..N-1 are silently
zero. All existing callers (SAM example, test-conv-transpose.c,
test-backend-ops.cpp) pass ne[3] == 1, which is why this has not been
caught since the op was introduced in 8da5be2 (Aug 2023).

Reject batch > 1 at the API level. Matches the precedent set by
ggml_conv_transpose_1d which already enforces a->ne[3] == 1. Implementing
batch support in the CPU/Metal/Vulkan kernels is a sensible follow-up if
a caller needs it.

Credit to @janblumenkamp for the bug analysis, repro, and historical
research in ggml-org#1448.
@devYRPauli

Copy link
Copy Markdown
Contributor Author

Trimming down to one open PR to respect the new-contributor guideline — closing this for now and I'll resubmit after my first one lands. Thanks!

@devYRPauli devYRPauli closed this May 29, 2026
@devYRPauli devYRPauli reopened this Jun 10, 2026
@devYRPauli

Copy link
Copy Markdown
Contributor Author

Reopening now that my first PR has landed, as promised when I trimmed down to a single open PR. Thanks for your patience!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant