bls12-381

Here are 76 public repositories matching this topic...

GPU-accelerated Number-Theoretic Transform for ZK-Proof generation. Targets the NTT bottleneck (91% of Groth16 prover time) via two CUDA optimizations: async double-buffered pipeline eliminating CPU-GPU transfer overhead, and IADD3-path Montgomery multiplication reducing finite-field instruction latency. BLS12-381, Ampere sm_86, Nsight-profiled.

  • Updated Mar 16, 2026
  • Cuda
