#ntt

Here are 49 public repositories matching this topic...

GPU-accelerated Number-Theoretic Transform for ZK-Proof generation. Targets the NTT bottleneck (91% of Groth16 prover time) via two CUDA optimizations: async double-buffered pipeline eliminating CPU-GPU transfer overhead, and IADD3-path Montgomery multiplication reducing finite-field instruction latency. BLS12-381, Ampere sm_86, Nsight-profiled.

  • Updated Mar 16, 2026
  • Cuda
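
For readers unfamiliar with the two terms in the description above, here is a minimal CPU-side sketch of a radix-2 NTT whose butterflies use Montgomery multiplication. It is not the repository's code. It assumes a toy 32-bit NTT-friendly prime (998244353, primitive root 3) instead of the 255-bit BLS12-381 scalar field, and it does not model the CUDA double-buffered pipeline or the IADD3 instruction path. All names (`redc`, `to_mont`, `mont_mul`, `ntt`) are hypothetical.

```python
# Toy parameters (assumptions, not taken from the repository).
P = 998244353                      # NTT-friendly prime: 119 * 2**23 + 1
G = 3                              # a primitive root modulo P
R_BITS = 32
R = 1 << R_BITS                    # Montgomery radix, R > P and gcd(R, P) = 1
R_MASK = R - 1
P_NEG_INV = (-pow(P, -1, R)) % R   # -P^{-1} mod R (needs Python 3.8+)


def redc(t):
    """Montgomery reduction: return t * R^{-1} mod P for 0 <= t < P * R."""
    m = ((t & R_MASK) * P_NEG_INV) & R_MASK
    u = (t + m * P) >> R_BITS      # exact division: t + m*P is divisible by R
    return u - P if u >= P else u


def to_mont(x):
    """Map x to Montgomery form x * R mod P."""
    return (x % P) * R % P


def from_mont(x):
    """Map a Montgomery-form value back to the ordinary residue."""
    return redc(x)


def mont_mul(a, b):
    """Multiply two Montgomery-form values; the result stays in Montgomery form."""
    return redc(a * b)


def ntt(values, inverse=False):
    """Iterative radix-2 NTT over Z_P; len(values) must be a power of two <= 2**23."""
    n = len(values)
    assert n > 0 and n & (n - 1) == 0, "length must be a power of two"
    a = [to_mont(v) for v in values]

    # Bit-reversal permutation so the butterflies can run in place.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]

    one = to_mont(1)
    length = 2
    while length <= n:
        root = pow(G, (P - 1) // length, P)   # primitive length-th root of unity
        if inverse:
            root = pow(root, P - 2, P)        # modular inverse via Fermat
        w_len = to_mont(root)
        half = length // 2
        for start in range(0, n, length):
            w = one
            for k in range(half):
                u = a[start + k]
                v = mont_mul(a[start + k + half], w)
                a[start + k] = (u + v) % P    # addition is unchanged in Montgomery form
                a[start + k + half] = (u - v) % P
                w = mont_mul(w, w_len)
        length <<= 1

    if inverse:
        n_inv = to_mont(pow(n, P - 2, P))
        a = [mont_mul(x, n_inv) for x in a]
    return [from_mont(x) for x in a]


if __name__ == "__main__":
    # Multiply (1 + 2x) by (3 + 4x) via pointwise products in the NTT domain.
    fa, fb = ntt([1, 2, 0, 0]), ntt([3, 4, 0, 0])
    product = ntt([x * y % P for x, y in zip(fa, fb)], inverse=True)
    print(product)
```

The point of keeping values in Montgomery form is that each butterfly multiplication needs only shifts, masks, and multiplies rather than a division by the modulus. That is the property a GPU implementation would build on; a multi-limb field such as the BLS12-381 scalar field applies the same reduction limb by limb.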
