High-performance CUDA kernels for CSR sparse x dense matmul with full benchmarks and PyTorch support.
Updated Dec 9, 2025 - Cuda
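For readers unfamiliar with the operation named in the description above, here is a minimal pure-Python reference sketch of a CSR sparse x dense matrix product. It is not taken from the repository: the function name `csr_spmm` and its argument names (`indptr`, `indices`, `data`, `dense`) are assumptions chosen for illustration, and a real CUDA kernel would parallelize the row loop across threads or warps rather than run it serially.

```python
def csr_spmm(indptr, indices, data, dense):
    """Reference CSR (sparse) x dense matrix product.

    indptr  : row pointer array, length n_rows + 1
    indices : column index of each stored nonzero
    data    : value of each stored nonzero
    dense   : dense right-hand matrix as a list of rows
    (All names here are hypothetical, for illustration only.)
    """
    n_rows = len(indptr) - 1
    n_out = len(dense[0]) if dense else 0
    out = [[0.0] * n_out for _ in range(n_rows)]
    for i in range(n_rows):
        row_out = out[i]
        # Nonzeros of sparse row i occupy data[indptr[i]:indptr[i + 1]].
        for k in range(indptr[i], indptr[i + 1]):
            a = data[k]
            b_row = dense[indices[k]]
            # Accumulate a * (row of the dense matrix) into the output row.
            for j in range(n_out):
                row_out[j] += a * b_row[j]
    return out


# A = [[1, 0, 2],
#      [0, 0, 3],
#      [4, 5, 0]] stored in CSR form
indptr = [0, 2, 3, 5]
indices = [0, 2, 2, 0, 1]
data = [1, 2, 3, 4, 5]
B = [[1, 2], [3, 4], [5, 6]]
result = csr_spmm(indptr, indices, data, B)
```

Each output row depends only on one sparse row, which is why the outer loop is the natural unit to hand out in parallel on a GPU.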
High-performance GPU-accelerated linear algebra library for scientific computing. Custom kernels outperform cuBLAS+cuSPARSE by 2.4x in iterative solvers. Built for circuit simulation workloads.