fix(python): initialize and expose ICICLE GPU runtime #1028

Open
peter941221 wants to merge 1 commit into zkonduit:main from peter941221:issue/ezkl-882

Summary

This fixes the Linux Python GPU path for ezkl.

Before this patch:

  • the CLI path initialized the ICICLE backend/device through run()
  • Python gen_srs, setup, and prove bypassed that path
  • with ICICLE_BACKEND_INSTALL_DIR set, Python could still stay on CPU
  • once Python was forced onto the CUDA path, backend loading could still fail because the ICICLE frontend shared libraries were not globally visible to later dlopen calls

After this patch:

  • gen_srs, setup, and prove all initialize the GPU backend/device through the shared execution path
  • Python module initialization promotes the required ICICLE frontend shared libraries to RTLD_GLOBAL on Linux GPU builds
  • CUDA backend loading can resolve ICICLE symbols correctly

Root Cause

There were two linked issues.

  1. Control-flow gap
  • set_device() existed, but only the CLI run() path called it
  • Python reached GPU-sensitive halo2 / ICICLE code directly
  2. Dynamic-linking gap
  • after fixing the control-flow issue, CUDA backend loading still depended on symbols exported by:
    • libicicle_device.so
    • libicicle_hash.so
    • libicicle_field_bn254.so
    • libicicle_curve_bn254.so
  • in Python, those frontend libraries were not globally visible by default
  • later backend dlopen calls could fail to resolve symbols like register_deviceAPI

Changes

src/execute.rs

  • make set_device() idempotent with Once
  • call set_device() from:
    • gen_srs_cmd
    • setup
    • prove
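The idempotency change can be illustrated with a minimal sketch. This is not the code from this patch: `init_device_backend` and the `INIT_RUNS` counter are hypothetical stand-ins, and the real `set_device()`, `gen_srs_cmd`, `setup`, and `prove` take arguments and return values that are omitted here. It only shows how `std::sync::Once` lets every entrypoint call the initializer while the body runs a single time.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::Once;

/// Guards the one-time device initialization.
static DEVICE_INIT: Once = Once::new();

/// Counts how often the stand-in initialization body actually ran
/// (illustration only, not part of the patch).
static INIT_RUNS: AtomicUsize = AtomicUsize::new(0);

/// Hypothetical stand-in for the real backend/device registration.
fn init_device_backend() {
    INIT_RUNS.fetch_add(1, Ordering::SeqCst);
}

/// Idempotent wrapper: safe to call from every entrypoint, in any order.
pub fn set_device() {
    DEVICE_INIT.call_once(init_device_backend);
}

// Simplified entrypoints; the real functions do their actual work
// after the device has been initialized.
pub fn gen_srs_cmd() {
    set_device();
}

pub fn setup() {
    set_device();
}

pub fn prove() {
    set_device();
}
```

With this shape, a Python session that calls the three entrypoints in sequence triggers device registration once, and the CLI `run()` path can keep its own call without registering twice.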

src/bindings/python.rs

  • on Linux GPU builds, promote the ICICLE frontend shared libraries to RTLD_GLOBAL during module initialization
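One common way to promote an already-loaded shared library is to reopen it with `dlopen` using `RTLD_NOLOAD | RTLD_GLOBAL`, which the Linux `dlopen(3)` man page describes for exactly this purpose. The sketch below assumes that technique; it is not taken from this patch, and the helper names `promote_to_global` and `promote_icicle_frontend` are hypothetical. The flag values are the Linux `<dlfcn.h>` values for common targets such as x86_64 and aarch64, hard-coded here to avoid an extra crate.

```rust
use std::ffi::CString;
use std::os::raw::{c_char, c_int, c_void};

extern "C" {
    // POSIX dlopen, provided by the C library that Rust's std links against.
    fn dlopen(filename: *const c_char, flags: c_int) -> *mut c_void;
}

// Assumption: Linux <dlfcn.h> values (x86_64 / aarch64).
const RTLD_NOW: c_int = 0x2;
const RTLD_NOLOAD: c_int = 0x4;
const RTLD_GLOBAL: c_int = 0x100;

/// Frontend libraries named in this PR description.
const FRONTEND_LIBS: [&str; 4] = [
    "libicicle_device.so",
    "libicicle_hash.so",
    "libicicle_field_bn254.so",
    "libicicle_curve_bn254.so",
];

/// Reopens an already-loaded library with RTLD_GLOBAL so that its symbols
/// become visible to later dlopen calls. Returns false if the library is
/// not currently loaded (RTLD_NOLOAD never loads it) or the name is invalid.
pub fn promote_to_global(soname: &str) -> bool {
    let name = match CString::new(soname) {
        Ok(name) => name,
        Err(_) => return false, // name contained an interior NUL byte
    };
    // SAFETY: `name` is a valid NUL-terminated string that outlives the call.
    let handle = unsafe { dlopen(name.as_ptr(), RTLD_NOW | RTLD_NOLOAD | RTLD_GLOBAL) };
    !handle.is_null()
}

/// Tries to promote every frontend library and returns those that could
/// not be promoted, so the caller can log or report them.
pub fn promote_icicle_frontend() -> Vec<&'static str> {
    FRONTEND_LIBS
        .iter()
        .copied()
        .filter(|lib| !promote_to_global(lib))
        .collect()
}
```

Because `RTLD_NOLOAD` only touches libraries that are already resident, a helper like this is harmless in a process where the ICICLE frontend was never loaded: it reports all four names as not promoted and changes nothing.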

Local Validation

Validated locally on 2026-05-17 in WSL on an RTX 5090.

Build

cargo check --lib --no-default-features --features python-bindings,gpu-accelerated,ezkl
maturin develop --no-default-features --features python-bindings,gpu-accelerated,ezkl

Python repro

Minimal Python chain:

  1. gen_settings
  2. compile_circuit
  3. gen_witness
  4. setup
  5. prove

Observed behavior:

  • before patch:
    • Python logged only Registering DEVICE: device=CPU
  • after only the execution-path fix:
    • Python entered CUDA setup, but backend loading could still fail because the ICICLE frontend symbols were not globally visible
  • after the full patch:
    • Python logs Registering DEVICE: device=CUDA
    • backend loads successfully
    • setup and prove both succeed

Scope

This is intentionally narrow:

  • no algorithmic proving changes
  • no non-Linux Python linker changes
  • no unrelated build-system or dependency changes

Python GPU entrypoints were bypassing the CLI device setup path, so gen_srs, setup, and prove could silently stay on CPU even when ICICLE_BACKEND_INSTALL_DIR was set.

This patch makes GPU initialization run from the shared execution entrypoints and makes it idempotent with Once.

It also fixes Linux Python GPU loading by promoting the required ICICLE frontend shared libraries to RTLD_GLOBAL, so later backend dlopen calls can resolve symbols such as register_deviceAPI.