fix: split asm_topksoftmax into separate module to fix ctypes JIT build (#2548) #2603

Merged: valarLip merged 3 commits into main from fix/moe-asm-split-module on Apr 3, 2026

Conversation

@zufayu zufayu commented Apr 3, 2026

Summary

  • Split asm_topksoftmax.cu out of module_moe_asm into a new module_moe_topksoftmax_asm module
  • When topk_softmax_asm (ffi_type="ctypes") triggers JIT compilation, torch_exclude=True is forced, but other .cu files in module_moe_asm depend on torch, causing undefined symbol errors
  • The fix separates the ctypes ASM kernel into its own module so it compiles torch-free independently

Changes (2 files, ~10 lines):

  • aiter/jit/optCompilerConfig.json — remove asm_topksoftmax.cu from module_moe_asm, add new module_moe_topksoftmax_asm entry
  • aiter/ops/moe_op.py — point topk_softmax_asm decorator to new module name
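
The config change described above can be sketched in Python as moving one source entry into a new module. This is only an illustration under stated assumptions: the `srcs` key, the placeholder file names, and the `split_source` helper are hypothetical and do not reproduce the real contents of optCompilerConfig.json.

```python
import copy


def split_source(config, src_suffix, old_module, new_module):
    """Return a copy of config where sources ending in src_suffix
    are moved from old_module into a new entry named new_module."""
    cfg = copy.deepcopy(config)
    old_srcs = cfg[old_module]["srcs"]
    moved = [s for s in old_srcs if s.endswith(src_suffix)]
    if not moved:
        raise ValueError(f"{src_suffix} not found in {old_module}")
    cfg[old_module]["srcs"] = [s for s in old_srcs if not s.endswith(src_suffix)]
    cfg[new_module] = {"srcs": moved}
    return cfg


# Hypothetical, simplified stand-in for the JIT build config.
config = {
    "module_moe_asm": {
        "srcs": [
            "torch_dependent_a.cu",  # placeholder name
            "torch_dependent_b.cu",  # placeholder name
            "asm_topksoftmax.cu",
        ]
    }
}

new_config = split_source(
    config,
    "asm_topksoftmax.cu",
    "module_moe_asm",
    "module_moe_topksoftmax_asm",
)
```

After the split, the torch-dependent sources stay in module_moe_asm and the new module contains only asm_topksoftmax.cu, so each module can be built with its own torch setting.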

No .cu files modified.

Fixes #2548

Test plan

  • Deleted stale .so files and ran python op_tests/test_moeTopkSoftmax.py — all tests pass
  • Verified module_moe_topksoftmax_asm.so has zero torch dependencies (ldd | grep torch prints no matching lines)
  • Verified module_moe_asm.so builds and loads normally with pybind + torch
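
The dependency check in the test plan could be scripted roughly as follows. This is a sketch, not the script used for this PR: the helper names are hypothetical, the sample `ldd` output is made up for illustration, and `torch_deps` assumes `ldd` is available on the machine.

```python
import subprocess


def torch_deps_from_ldd(ldd_output):
    """Return the lines of ldd output that mention torch (same idea as `grep torch`)."""
    return [line.strip() for line in ldd_output.splitlines() if "torch" in line]


def torch_deps(so_path):
    """Run ldd on a shared library and return its torch-related dependency lines."""
    result = subprocess.run(
        ["ldd", so_path], capture_output=True, text=True, check=True
    )
    return torch_deps_from_ldd(result.stdout)


# Made-up ldd output for a library that still links against torch.
sample = (
    "\tlinux-vdso.so.1 (0x00007ffc0a5f2000)\n"
    "\tlibtorch.so => /opt/example/lib/libtorch.so (0x00007f1a00000000)\n"
    "\tlibc.so.6 => /lib/x86_64-linux-gnu/libc.so.6 (0x00007f19ffc00000)\n"
)

found = torch_deps_from_ldd(sample)
```

An empty list from `torch_deps(path_to_so)` would correspond to the "zero torch dependencies" result reported above.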

🤖 Generated with Claude Code

When topk_softmax_asm (ffi_type="ctypes") triggers JIT compilation,
torch_exclude=True is forced. But module_moe_asm also contains pybind
.cu files that depend on torch, causing undefined symbol errors.

Split asm_topksoftmax.cu into its own module_moe_topksoftmax_asm so it
compiles torch-free independently. No .cu files modified.

Fixes #2548
@zufayu zufayu requested review from a team and Copilot April 3, 2026 02:36

github-actions Bot commented Apr 3, 2026

🏷️ CI Guide

Runs automatically on every PR:

  • ✅ Pre-checks (submodule verification, code formatting)
  • ✅ Aiter op tests (gfx942 + gfx950)
  • ✅ Triton tests (only when aiter/ops/triton/** or related paths are changed)

Extended tests (opt-in via labels):

| Label | Tests |
|---|---|
| ci:triton-355 | Run Triton tests on MI355 in addition to MI325 |
| ci:sglang | SGLang integration tests |
| ci:atom | ATOM benchmark (DeepSeek-R1 + GPT-OSS) |
| ci:vllm | vLLM benchmark |
| ci:all | All of the above |

Add labels via the sidebar or gh pr edit 2603 --add-label <label>

Copilot AI left a comment

Pull request overview

This PR resolves a runtime JIT-loading failure for ffi_type="ctypes" MOE TopK+Softmax ASM by splitting the asm_topksoftmax.cu build target out of the existing torch-linked module_moe_asm into its own dedicated, torch-free JIT module, avoiding undefined torch symbol errors when torch_exclude=True is forced for ctypes modules.
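
As a rough illustration of how such a loading failure surfaces on the Python side: `ctypes.CDLL` raises `OSError` when the dynamic loader cannot load a library, and an unresolved symbol would be expected to appear in that error message. The `try_load` helper and the path below are hypothetical and are not part of aiter.

```python
import ctypes


def try_load(so_path):
    """Return (handle, None) on success, or (None, message) if loading fails."""
    try:
        return ctypes.CDLL(so_path), None
    except OSError as exc:
        # The loader's message (for example a missing file or an
        # undefined symbol) is carried in the exception text.
        return None, str(exc)


# Hypothetical path; it does not exist, so loading is expected to fail.
lib, err = try_load("/nonexistent/module_moe_topksoftmax_asm.so")
```

Checking `err` for the substring `undefined symbol` is one simple way to tell a link problem apart from a missing file.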

Changes:

  • Add new JIT build module module_moe_topksoftmax_asm for asm_topksoftmax.cu (with its blob/codegen step).
  • Remove asm_topksoftmax.cu from module_moe_asm sources so that module_moe_asm remains torch/pybind-linked without impacting ctypes builds.
  • Update topk_softmax_asm decorator to compile/load from the new module.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

| File | Description |
|---|---|
| aiter/ops/moe_op.py | Points topk_softmax_asm (ctypes) at the new torch-free JIT module name. |
| aiter/jit/optCompilerConfig.json | Splits build config: module_moe_asm no longer includes asm_topksoftmax.cu; new module_moe_topksoftmax_asm builds it independently (including blob_gen_cmd). |

@zufayu zufayu requested a review from amd-ruitang3 April 3, 2026 02:46
asm_topksoftmax.cu does not use ck_tile headers.
@valarLip valarLip merged commit 45defe6 into main Apr 3, 2026
38 of 39 checks passed
@valarLip valarLip deleted the fix/moe-asm-split-module branch April 3, 2026 11:40
yzhou103 pushed a commit that referenced this pull request Apr 8, 2026
…ld (#2548) (#2603)

* fix: split asm_topksoftmax into separate module to fix ctypes JIT build

When topk_softmax_asm (ffi_type="ctypes") triggers JIT compilation,
torch_exclude=True is forced. But module_moe_asm also contains pybind
.cu files that depend on torch, causing undefined symbol errors.

Split asm_topksoftmax.cu into its own module_moe_topksoftmax_asm so it
compiles torch-free independently. No .cu files modified.

Fixes #2548

* style: black formatting for compile_ops decorator line

* fix: remove unnecessary ck_tile include from module_moe_topksoftmax_asm

asm_topksoftmax.cu does not use ck_tile headers.

---------

Co-authored-by: root <root@hjbog-srdc-39.amd.com>

Development

Successfully merging this pull request may close these issues.

[Bug] ctypes ffi_type modules fail to load: undefined symbol _ZTVN5torch8autograd12AutogradMetaE