#cuda

  1. cudarc

    Safe and minimal CUDA bindings

    v0.19.2 365K #cuda #nvidia-gpu #nvidia #cudnn #cu-blas #gpu
  2. neptune

    Poseidon hashing over BLS12-381 for Filecoin

    v13.0.0 83K #zero-knowledge-proofs #poseidon-hash #bls12-381 #prime-field #filecoin #opencl #cuda #hashing #compile-time #gpu
  3. bindgen_cuda

    Bindgen-like interface for building CUDA kernels and interacting with them from Rust

    v0.1.6 111K #cuda #bindgen #kernel-interface #build #interact
  4. hvm

    A massively parallel, optimal functional runtime in Rust

    v2.0.22 110 #parallel #massively #functional #cuda #run-time #high-level-language #higher-order
  5. iron_learn

    ML library with GPU-accelerated gradient descent. Supports tensors, complex numbers, linear/logistic regression, and CUDA optimization.

    v0.6.5 #cuda #gradient-descent #gpu-accelerated #neural-network #machine-learning #iron #linear-regression #logistic-regression #complex-numbers #numerical-computation
  6. candle-kernels

    CUDA kernels for Candle

    v0.9.2 23K #cuda #machine-learning #tensor
  7. mwa_hyperbeam

    Primary beam code for the Murchison Widefield Array (MWA) radio telescope

    v0.10.4 #murchison-widefield-array #beam #telescope #python #primary #hdf5 #cuda #hip #radio-astronomy #env-vars
  8. getenv

    Getenv.rs

    v0.1.2 950 #env-var #python #conda #ssh #ruby #cuda #nodejs #java #xdg #docker
  9. infernum

    CLI - From the depths, intelligence rises

    v0.2.0-rc.2 #inference #chat #cache #chat-history #depths #hugging-face #cuda #metal #gpu #cli-for-running
  10. async-cuda

    Async CUDA for Rust

    v0.6.1 490 #cuda #async #npp #nvidia #nvidia-gpu #gpu
  11. ringkernel-cuda

    CUDA backend for RingKernel - NVIDIA GPU support via cudarc

    v0.4.2 #cuda #nvidia #gpu
  12. llama-cpp-sys-2

    Low Level Bindings to llama.cpp

    v0.1.132 26K #llama-cpp #low-level #cuda
  13. llmux

    Zero-reload model switching for vLLM - manages multiple models on shared GPU

    v0.5.0 #vllm #model #gpu #host #suspend #checkpoint #cuda #logging #llm #multiplexer
  14. ec-gpu

    Traits for field and elliptic curve operations on GPUs

    v0.2.0 11K #elliptic-curve #finite-field-arithmetic #opencl #finite-fields #gpu #limbs #cuda
  15. torch_poetry_bootstrap

    A command-line tool to detect CUDA version and install the appropriate PyTorch wheel via Poetry

    v0.1.18 190 #pytorch #poetry #cuda #bootstrap
  16. zfp-sys

    Raw Rust bindings to ZFP (https://github.com/LLNL/zfp)

    v0.4.2 310 #github #llnl #bindings #cuda #older-versions
  17. cudaforge

    Advanced CUDA kernel builder for Rust with incremental builds, auto-detection, and external dependency support

    v0.1.4 260 #gpu-kernel #cuda #gpu #nvcc
  18. async-tensorrt

    Async TensorRT for Rust

    v0.9.1 160 #tensor-rt #cuda #async #nvidia #gpu
  19. cuda-rust-wasm

    CUDA to Rust transpiler with WebGPU/WASM support

    v0.1.7 #cuda #web-gpu #transpiler #wasm #gpu
  20. pylate-rs

    WebAssembly library for late interaction models

    v1.0.4 220 #interaction-model #inference-engine #late #edge-computing #python-bindings #cuda #gpu #mkl #metal #candle
  21. perdix

    High-performance GPU-accelerated ring buffer for AI terminal multiplexing

    v0.1.1 #ring-buffer #cuda #web-gpu #terminal
  22. xdl-amp

    Multi-backend GPU/ML acceleration for XDL

    v0.1.1 #gpu #gpu-acceleration #ml #cuda #multi-backend #amp #xdl #directx #opencl #metal
  23. mwa_hyperdrive

    Calibration software for the Murchison Widefield Array (MWA) radio telescope

    v0.7.0 #murchison-widefield-array #radio #telescope #calibration #hyperdrive #cuda #astronomy #radio-astronomy
  24. gpu-scatter-gather

    World's fastest wordlist generator using GPU acceleration with multi-GPU support

    v1.8.0 #cuda #word-list #security
  25. cuvs

    RAPIDS vector search library

    v26.2.0 #vector-search #nearest-neighbors-search #gpu #cuda #rapids #approximate-nearest-neighbor
  26. ec-gpu-gen

    Code generator for field and elliptic curve operations on the GPUs

    v0.7.1 9.3K #elliptic-curve #cuda #codegen #opencl #finite-fields #gpu #finite-field-arithmetic #compile-time #source-builder #ec-gpu
  27. nam-ec-gpu-gen

    Code generator for field and elliptic curve operations on the GPUs

    v0.7.2-nam.0 170 #elliptic-curve #cuda #codegen #opencl #finite-fields #finite-field-arithmetic #gpu #ec-gpu
  28. ringkernel-cuda-codegen

    CUDA code generation from Rust DSL for RingKernel stencil kernels

    v0.4.2 #cuda #codegen #transpiler #stencil #gpu
  29. cuda-device-query

    CUDA deviceQuery.cpp port written in Rust with cudarc

    v0.1.0 #cpp #port #cuda #cudarc #device-query
  30. autd3-backend-cuda

    CUDA Backend for AUTD3

    v35.0.0 3.9K #autd #autd3 #cuda
  31. with-gpu

    Intelligent GPU selection wrapper for CUDA commands

    v0.3.0 #gpu #cuda #automation #nvidia-gpu #nvidia #cli-automation
  32. kitsune-stt

    Speech-to-Text tool using Candle and Voxtral

    v0.1.0 #text-to-speech #candle #audio #audio-processing #cuda #cudnn #hugging-face #gpu #transcribe #audio-format
  33. tensor_frame

    A PyTorch-like tensor library for Rust with CPU, WGPU, and CUDA backends

    v0.0.3-alpha 120 #wgpu #cuda #tensor #machine-learning #gpu
  34. sbv2_core

    Inference library for Style-Bert-VITS

    v0.2.0-alpha8 370 #bert #style #text-to-speech #cuda #tensor-rt #coreml #vits #onnx-runtime
  35. abaddon

    LLM inference engine - The Destroyer renders judgment

    v0.2.0-rc.2 #cuda #inference-engine #web-gpu #quantization #kv-cache #llm #llm-inference #metal #fused #flash-attention
  36. sass-assembler

    SASS (NVIDIA GPU) assembler for Gaia project

    v0.0.5 #sass #assembly #gpu #nvidia-gpu #kernel #gaia #cubin #cuda
  37. trueno-gpu

    Pure Rust PTX generation for NVIDIA CUDA - no LLVM, no nvcc

    v0.4.17 300 #cuda #cuda-ptx #gpu #nvidia #simd
  38. optirs-gpu

    OptiRS GPU acceleration and multi-GPU optimization

    v0.1.0 #cuda #gpu #metal #optimization #opencl
  39. supraseal-c2

    CUDA Groth16 proof generator for Filecoin

    v0.1.2 #groth16 #generator #filecoin #proof #cuda #zk-snarks #benchmark
  40. haagenti-cuda

    CUDA GPU decompression kernels for Haagenti tensor compression

    v0.1.0 #cuda #lz4 #compression #gpu #gpu-compression #decompression
  41. burn-cuda

    CUDA backend for the Burn framework

    v0.21.0-pre.1 59K #deep-learning #machine-learning #cuda #gpu
  42. hive-gpu

    High-performance GPU acceleration for vector operations with Device Info API (Metal, CUDA, ROCm)

    v0.1.7 #vector-search #cuda #gpu #hnsw #metal #hnsw-vector-search
  43. crown

    A cryptographic library

    v0.12.0 270 #cuda #encryption #cryptography #hashing
  44. pasta-msm

    Optimized multiscalar multiplication for Pasta moduli for x86_64 and aarch64

    v0.1.5 440 #pasta #arm64 #multiscalar #cuda #x86-64 #multi-scalar #moduli #multiplicaton
  45. cuda-driver-sys

    Rust binding to CUDA Driver APIs

    v0.3.0 43K #gpgpu #cuda #ffi
  46. nvidia-video-codec-sdk

    Bindings for NVIDIA Video Codec SDK

    v0.4.0 #video-codec #cuda #nvidia
  47. torsh-profiler

    Performance profiling and monitoring for ToRSh

    v0.1.0-rc.1 #profiling #torsh #profiler-performance #artificial-intelligence #memory-profiling #deep-learning #cuda #profiling-monitoring #memory-leaks #memory-leak-detection
  48. ringkernel-graph

    GPU-accelerated graph algorithm primitives

    v0.4.2 #graph #bfs #cuda #parallel
  49. kn-cuda-sys

    A wrapper around the CUDA APIs

    v0.7.3 750 #inference #cuda #neural-network #graph #cu-blas #nvidia-gpu #cudnn #onnx #llama #happen
  50. torsh-backend

    Backend abstraction layer for ToRSh

    v0.1.0-rc.1 #cuda #deep-learning #sci-rs2 #system #devices #torsh #web-gpu #neural-network #machine-learning #metal
  51. cuda-runtime-sys

    Rust binding to CUDA Runtime APIs

    v0.3.0-alpha.1 75K #gpgpu #cuda #ffi
  52. icicle-core

    GPU ZK acceleration by Ingonyama

    v1.3.0 #zero-knowledge-proofs #ntt #gpu-acceleration #msm #cuda #golang #hardware-acceleration #privacy-preserving
  53. piper-tts-rs

    Piper-TTS implementation in Rust

    v0.1.4 #text-to-speech #piper #audio #chunks #cuda #cargo-run
  54. tesser-cortex

    High-performance, hardware-agnostic AI inference engine for Tesser

    v0.9.3 #artificial-intelligence #inference-engine #onnx #tesser #cuda #tensor-rt #zero-copy #trading #quantitative-trading
  55. luminal_cudarc

    Safe wrappers around CUDA apis

    v0.10.0 #cuda #nvidia #cu-blas #nvidia-gpu #nvrtc #gpu
  56. ringkernel-montecarlo

    GPU-accelerated Monte Carlo primitives for variance reduction

    v0.4.2 #monte-carlo #cuda #variance-reduction #simulation
  57. RayBNN_DataLoader

    Read CSV, numpy, and binary files to Rust vectors of f16, f32, f64, u8, u16, u32, u64, i8, i16, i32, i64

    v2.0.3 480 #raybnn_dataloader #numpy #csv #opencl #cuda
  58. crown-bin

    A cryptographic library

    v0.12.0 #hashing #encryption #cryptography #cuda #encryption-hashing
  59. RayBNN_Raytrace

    Ray tracing library using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

    v2.0.3 500 #raybnn_raytrace #ray-tracer #opencl #cuda #ray-tracing
  60. rcudnn

    safe Rust wrapper for CUDA's cuDNN

    v1.8.0 #cudnn #neural-network #cuda #nvidia
  61. jawe-cuvs-iii

    RAPIDS vector search library

    v25.4.0 #vector-search #nearest-neighbors-search #cuvs #cuda #cluster-analysis #machine-learning #similarity-search #gpu #information-retrieval #rapids
  62. hpt-cudakernels

    implements cuda kernels for hpt

    v0.1.3 1.0K #hpt #deep-learning #cuda #compile #onnx #memory-layout
  63. jawe-cuvs-iv

    RAPIDS vector search library

    v25.4.0 #vector-search #nearest-neighbors-search #cuvs #cuda #cluster-analysis #similarity-search #gpu #machine-learning #information-retrieval #rapids
  64. RayBNN_Cell

    Cell Position Generator for RayBNN

    v2.0.3 350 #raybnn_cell #ray-tracer #opencl #cuda
  65. tropical-gemm-cuda

    CUDA backend for tropical matrix multiplication

    v0.2.0 #cuda #tropical #gemm #gpu #matrix
  66. rcublas

    safe Rust wrapper for CUDA's cuBLAS

    v0.6.0 #cu-blas #nvidia #cuda #blas
  67. cublas

    safe Rust wrapper for CUDA's cuBLAS

    v0.2.0 110 #cuda #nvidia #blas
  68. scir-gpu

    SciR GPU foundations: device arrays and CUDA (feature-gated) elementwise/FIR kernels with CPU parity

    v0.3.2 #cuda #wgpu #gpu #scipy #scir
  69. hodu_cuda_kernels

    hodu cuda kernels

    v0.2.4 #cuda #matrix #tensor #hodu #gpu #cu-blas #row-major #non-contiguous #deep-learning
  70. cuda-config

    Helper crate for finding CUDA libraries

    v0.1.0 84K #gpgpu #cuda #ffi
  71. rcudnn-sys

    FFI bindings to cuDNN

    v0.5.0 #cudnn #nvidia #cuda #sys
  72. llama-cpp-sys-4

    Low Level Bindings to llama.cpp

    v0.1.94 1.3K #llama-cpp #sampler #low-level #cuda #safe-api
  73. icicle-cuda-runtime

    Ingonyama's Rust wrapper of CUDA runtime

    v1.3.0 #zero-knowledge-proofs #icicle #cuda #run-time #golang #gpu-acceleration #ntt #hardware-acceleration #msm #privacy-preserving
  74. cudnn

    safe Rust wrapper for CUDA's cuDNN

    v1.3.1 600 #neural-network #cuda #nvidia
  75. nam-supraseal-c2

    CUDA Groth16 proof generator for Filecoin

    v0.1.2-nam.0 #proof #groth16 #generator #filecoin #cuda #zk-snarks #benchmark
  76. crseo-sys

    Cuda Engined Optics Rust Interface

    v1.3.1 750 #astronomy #cuda #telescope
  77. jawe-cuvs-sys-ii

    Low-level rust bindings to libcuvs

    v25.4.0 #nearest-neighbors-search #cuvs #vector-search #cuda #machine-learning #similarity-search #cluster-analysis #information-retrieval #sparse-vector #gpu
  78. emixai

    Feature-gated AI helpers (audio, imaging, language, vision) for EssentialMix

    v0.6.0 #artificial-intelligence #helper #cuda #language #chatgpt #audio #feature-gated #metal #mkl #computer-vision
  79. luminal_cuda

    Cuda compiler for luminal

    v0.2.0 #deep-learning #cuda #compiler #luminal
  80. cmake-init

    Initialize CMake project at speed

    v0.1.0 #cuda #hip #cpp #openmpi
  81. zenu-cuda

    CUDA bindings for Rust

    v0.1.0 #deep-learning #cuda #cudnn #framework #bindings #gpu #cu-blas #classification
  82. fellhorn-llama-cpp-sys-2

    Low Level Bindings to llama.cpp

    v0.1.124 #llama-cpp #bindings #low-level #cuda #safe-api #llama-cpp-2
  83. accel

    GPGPU Framework for Rust

    v0.3.1 #gpgpu #cuda
  84. cuda

    CUDA bindings

    v0.4.0-pre.2 #bindings #random #cuda-driver #run-time #bindgen
  85. bevy_cuda

    CUDA integration for Bevy game engine

    v0.1.0 #cuda #gpu #bevy #gamedev #graphics
  87. RayBNN_Optimizer

    Gradient Descent Optimizers and Genetic Algorithms using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

    v2.0.1 140 #raybnn_optimizer #gradient-descent #opencl #cuda #math
  88. shimmy-llama-cpp-sys-2

    Low Level Bindings to llama.cpp with MoE CPU offloading support

    v0.1.123 410 #llama-cpp #bindings #low-level #cuda #moe #offloading #safe-api #llama-cpp-2
  89. cuda_bindgen

    Bindgen-like interface for building CUDA kernels and interacting with them from Rust

    v0.2.0 #cuda #bindgen #build #interact #interface #gpu
  90. whisper-cpp-plus-sys

    Low-level FFI bindings for whisper.cpp

    v0.1.3 #whisper-cpp #cuda #open-blas #quantization #metal #bindings-for-whisper
  91. async-cuda-npp

    Async NVIDIA Performance Primitives for Rust

    v0.4.0 #cuda #npp #nvidia #async #gpu #nvidia-gpu
  92. candle_embed

    Text embeddings with Candle. Fast and configurable. Use any model from Hugging Face. CUDA or CPU powered.

    v0.1.4 220 #hugging-face #cuda #vector-embedding #search #embeddings
  93. cudarse-driver

    Bindings to the CUDA Driver API that tries to stay faithful to the original

    v0.1.0 #cuda #cuda-driver #hardware-acceleration #turbo-metrics #video-processing #npp #ssimulacra2 #amf #cudarse #faithful
  94. cudnn-sys

    FFI bindings to cuDNN

    v0.0.3 #cuda #nvidia #sys
  95. silero-vad-rs

    Silero Voice Activity Detection

    v0.1.2 #voice #onnx #model #detect #voice-activity #audio-processing #vad #cuda #silero #model-inference
  96. jawe-cuvs-sys-iv

    Low-level rust bindings to libcuvs

    v25.4.0 #cuvs #nearest-neighbors-search #vector-search #machine-learning #cluster-analysis #similarity-search #gpu #cuda #information-retrieval #libcuvs
  97. torsh-core

    Core types and traits for ToRSh deep learning framework

    v0.1.0-rc.1 240 #deep-learning #devices #web-gpu #cuda #operation #stride #broadcasting #pytorch #sci-rs2 #neural-network
  98. jawe-cuvs-sys-iii

    Low-level rust bindings to libcuvs

    v25.4.0 #vector-search #cuvs #nearest-neighbors-search #machine-learning #similarity-search #cluster-analysis #information-retrieval #cuda #gpu #libcuvs
  99. cuda-oxide

    A high-level, Rusty wrapper over CUDA. It provides the best safety one can get when working with hardware.

    v0.4.0 #cuda #gpu #parallel
  100. darknet-sys

    -sys crate for Rust darknet wrapper

    v0.4.0 150 #cuda #neural-network #update #dynamic #default #git-submodule
  101. crown-jsasm

    A cryptographic library

    v0.12.0 #encryption #hashing #cuda #cryptography
  102. easy-tensorrt-core

    Rust wrapper for NVIDIA TensorRT

    v0.3.1 #tensor-rt #cuda #nvidia
  103. cufile

    Safe Rust bindings for NVIDIA CuFile library

    v0.2.0 #nvidia #cuda #storage #gpu #direct
  104. oxidized-transformers

    Transformers library (not functional yet)

    v0.1.1 #llama #transformer-models #hugging-face #bert #oxidized #cuda #albert #float16
  105. memonitor

    Query CPU and GPU memory information in a portable way

    v0.2.4 430 #system-information #gpu #cpu-memory #cuda #local #cpu-and-gpu #devices-information #vulkan
  106. rcublas-sys

    FFI bindings to cuBLAS

    v0.5.0 #cu-blas #cuda #nvidia #sys
  107. RayBNN_Neural

    Neural Networks with Sparse Weights in Rust using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

    v2.0.3 440 #raybnn_neural #deep-learning #neural-network #opencl #cuda #machine-learning
  108. easy-tensorrt-sys

    Rust binding to NVIDIA TensorRT, forked from tensorrt-rs-sys

    v0.2.1 #tensor-rt #cuda #nvidia #ffi
  109. libdebayer

    debayer images with CUDA

    v0.3.0 210 #cuda #image #debayer #debayering #benchmark
  110. tensorgraph-sys

    backbone for tensorgraph, providing memory management across devices

    v0.1.11 #cuda #blas #neural-network #numeric #machine-learning
  111. cuda11-cudart-sys

    cuda ffi

    v0.3.0 #cuda #deep-learning #neural-network #machine-learning
  112. cuda_d3d11_interop_bindings

    Register and map D3D11 buffers with CUDA

    v0.1.1 #cuda #graphics #direct3d11
  113. cuda-colorspace-kernel

    Colorspace handling on CUDA (device code)

    v0.1.0 #cuda #color-space #video #turbo-metrics #hardware-acceleration #npp #ssimulacra2 #amf #nvdec #video-codec
  114. cuda11-cuda-sys

    cuda ffi

    v0.2.0 #cuda #deep-learning #neural-network #machine-learning
  115. RayBNN_Graph

    Graph Manipulation Library For GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

    v2.0.3 140 #raybnn_graph #opencl #cuda #graph #sparse #math
  116. ptoxide

    A virtual machine to execute CUDA PTX without a GPU

    v0.1.0 #cuda #cuda-ptx #vm #execute #model #gpu
  117. zenu-cuda-config

    CUDA configuration for Zenu

    v0.1.0 #deep-learning #cuda #zenu #framework #cudnn
  118. simt_cuda_sys

    part of simt. cuda driver api bindings

    v0.2.0 #cuda #cuda-driver #api #bindings #compute #part-of-simt #compute-shader #amd-gpu #hip
  119. ug-llama

    Micro compiler for tensor operations

    v0.4.0 #cuda #tensor #machine-learning
  120. tensorgraph-math

    backbone for tensorgraph, providing math primitives

    v0.1.11 #cuda #neural-network #machine-learning #numeric
  121. cuda_dnn

    cuDNN API bindings

    v0.1.1 #cuda #cudnn
  122. ulib

    Universal data storage library for CPU/GPU heterogeneous applications

    v0.3.3 #gpu #storage #cuda #universal #devices
  123. babichjacob-llama-cpp-sys-2

    Low Level Bindings to llama.cpp

    v0.1.85 340 #llama-cpp #low-level #cuda #safe-api
  124. cudi

    A small tool for displaying CUDA device properties

    v0.1.0 #cuda #cli
  125. cufile-sys

    Raw FFI bindings for NVIDIA CuFile library

    v0.1.1 #nvidia #cuda #storage #ffi
  126. nvrtc

    Bindings for NVIDIA® CUDA™ NVRTC in Rust

    v0.1.3 #cuda #gpu #bindings #api-bindings
  127. torch-build

    link libtorch FFI interface

    v0.1.0 #link #interface #libtorch #cuda #torch #cc #build-dependencies #cpp #cargo-subcommand
  128. tensorrt-rs-sys

    Rust binding to NVIDIA TensorRT

    v0.1.2 #tensor-rt #cuda #nvidia #ffi
  129. del-msh-cudarc

    2D/3D Mesh processing using Cuda for scientific prototyping

    v0.1.39 #mesh #del-msh #3d-mesh #graphics #prototyping #cuda #2d
  130. zenu-cudnn-sys

    Rust bindings for cuDNN

    v0.1.0 #cudnn #deep-learning #framework #cuda #bindings #classification
  131. zenu-cuda-driver-sys

    Rust bindings for CUDA Driver API

    v0.1.0 #cuda #deep-learning #framework #cudnn #cuda-driver #classification
  132. zenu-cublas-sys

    Rust bindings for cuBLAS

    v0.1.0 #deep-learning #cu-blas #framework #cuda #cudnn #classification
  133. zenu-cuda-kernel-sys

    CUDA kernel bindings for Rust

    v0.1.0 #cuda #deep-learning #framework #cudnn #bindings #classification
  134. zenu-cuda-runtime-sys

    CUDA runtime bindings for Rust

    v0.1.0 #cuda #deep-learning #framework #cudnn #bindings #run-time-bindings #classification
  135. galois-kernels

    galois cuda kernels

    v0.1.0 #cuda #tensor #artificial-intelligence #wwml #galois #inference-engine
  136. bullet

    Supersonic Math

    v0.1.2 #cuda #shared-memory #parameters #math
  137. grumpkin-msm

    Optimized multiscalar multiplication for the Grumpkin curve cycle

    v0.1.0 #curve #multiscalar #grumpkin #cycle #cuda #multiplicaton