Zero-knowledge template library
Updated Jan 28, 2026 - CUDA
GPU-accelerated Number-Theoretic Transform (NTT) for zero-knowledge proof generation. Targets the NTT bottleneck (91% of Groth16 prover time) with two CUDA optimizations: an asynchronous double-buffered pipeline that eliminates CPU-GPU transfer overhead, and an IADD3-path Montgomery multiplication that reduces finite-field instruction latency. Built for BLS12-381 on Ampere (sm_86) and profiled with Nsight.
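For readers unfamiliar with the Montgomery multiplication mentioned above, here is a minimal host-side C++ sketch of the idea. It is not the repository's code. It assumes a toy single-limb modulus (the 30-bit NTT-friendly prime 998244353) instead of the multi-limb BLS12-381 field, and every name in it (`P`, `neg_inv`, `mont_reduce`, `mont_mul`, `to_mont`, `from_mont`) is hypothetical. The IADD3 instruction scheduling is a GPU-level detail and is not shown.

```cpp
#include <cassert>
#include <cstdint>

// ASSUMPTION: toy 30-bit NTT-friendly prime (119 * 2^23 + 1),
// not the BLS12-381 field, which needs multi-limb arithmetic.
constexpr uint32_t P = 998244353u;

// Compute -p^{-1} mod 2^32 by Newton iteration (p must be odd).
constexpr uint32_t neg_inv(uint32_t p) {
    uint32_t inv = p;                  // p * p == 1 (mod 8): 3 correct bits
    for (int i = 0; i < 5; ++i) {
        inv *= 2u - p * inv;           // each step doubles the correct bits
    }
    return 0u - inv;                   // negate modulo 2^32
}

constexpr uint32_t P_NEG_INV = neg_inv(P);

// R^2 mod P with R = 2^32, used to convert into Montgomery form.
constexpr uint32_t R2 = static_cast<uint32_t>(
    ((1ull << 32) % P) * ((1ull << 32) % P) % P);

// Montgomery reduction: returns t * R^{-1} mod P for t < P * 2^32.
inline uint32_t mont_reduce(uint64_t t) {
    assert(t < (static_cast<uint64_t>(P) << 32));
    // Choose m so that t + m * P is divisible by 2^32.
    uint32_t m = static_cast<uint32_t>(t) * P_NEG_INV;
    uint64_t u = (t + static_cast<uint64_t>(m) * P) >> 32;
    // u < 2P, so one conditional subtraction finishes the reduction.
    return static_cast<uint32_t>(u >= P ? u - P : u);
}

// Multiply two values that are already in Montgomery form.
inline uint32_t mont_mul(uint32_t a, uint32_t b) {
    return mont_reduce(static_cast<uint64_t>(a) * b);
}

// Convert a value in [0, P) into Montgomery form (a * R mod P).
inline uint32_t to_mont(uint32_t a) { return mont_mul(a, R2); }

// Convert a Montgomery-form value back to a plain residue.
inline uint32_t from_mont(uint32_t a) { return mont_reduce(a); }
```

The point of the technique is that `mont_reduce` replaces a division by `P` with multiplications, a shift, and one conditional subtraction. This is why it suits the inner loop of an NTT, where each butterfly performs one modular multiplication by a twiddle factor.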