#cuda

  1. cudarc

    Safe and minimal CUDA bindings

    v0.19.7 719K #cuda #nvidia #cudnn #cu-blas #nvidia-gpu
  2. whisper-rs

    Rust bindings for whisper.cpp

    v0.16.0 79K #whisper-cpp #logging #gpu #vulkan #hook #open-blas #cuda #metal #sampling-strategy #audio-transcription
  3. neptune

    Poseidon hashing over BLS12-381 for Filecoin

    v13.0.0 122K #zero-knowledge-proofs #poseidon-hash #bls12-381 #prime-field #filecoin #opencl #cuda #hashing #compile-time #gpu
  4. bindgen_cuda

    Bindgen-like interface for building CUDA kernels to interact with from Rust

    v0.1.6 115K #cuda #bindgen #interface #build #interact #cu
  5. dlpark

    dlpack Rust binding for Python

    v0.7.0 230K #python-bindings #deep-learning #dlpack #pyo3 #devices #cuda #deleter #abi
  6. cuvs

    RAPIDS vector search library

    v26.4.0 750 #vector-search #nearest-neighbors-search #similarity-search #k-means #machine-learning #cluster-analysis #information-retrieval #vector-similarity #gpu #cuda
  7. hvm

    A massively parallel, optimal functional runtime in Rust

    v2.0.22 750 #parallel #functional #massively #cuda #optimal #high-level-language #higher-order
  8. axonml-autograd

    Automatic differentiation engine for Axonml ML framework

    v0.6.2 #automatic-differentiation #inference #graphviz #gradient-checkpointing #axonml #convolution #forward-pass #ml #cuda #gpu
  9. ai-hwaccel

    Universal AI hardware accelerator detection, capability querying, and workload planning for Rust

    v1.2.0 130 #npu #tpu #cuda #gpu
  10. candle-kernels

    CUDA kernels for Candle

    v0.10.2 148K #cuda #tensor #machine-learning
  11. ringkernel-cuda

    CUDA backend for RingKernel - NVIDIA GPU support via cudarc

    v1.1.0 #cuda #nvidia #nvidia-gpu #gpu #actor
  12. shiguredo_nvcodec

    Rust bindings for NVIDIA Video Codec SDK

    v2026.2.0-canary.0 #video-codec #shiguredo #sdk #bindings-for-nvidia #cuda #デコーダー #エンコーダー
  13. mwa_hyperbeam

    Primary beam code for the Murchison Widefield Array (MWA) radio telescope

    v0.10.4 #murchison-widefield-array #beam #telescope #python #primary #hdf5 #cuda #hip #radio-astronomy #env-vars
  14. cudaforge

    Advanced CUDA kernel builder for Rust with incremental builds, auto-detection, and external dependency support

    v0.1.5 44K #gpu-kernel #cuda #gpu #nvcc
  15. mamba-rs

    Mamba SSM and Mamba-3 SISO in Rust with optional CUDA GPU acceleration. Inference and training (BPTT through SSM state, AdamW), CPU + GPU paths, custom CUDA kernels, CUDA Graph capture…

    v0.3.1 #cuda #ssm #deep-learning #state-space-model
  16. oxicuda-ptx

    OxiCUDA PTX - PTX code generation DSL and IR for GPU kernel development

    v0.1.7 240 #cuda #codegen #cuda-ptx #gpu-kernel #gpu
  17. oxicuda-blas

    OxiCUDA BLAS - GPU-accelerated BLAS operations (cuBLAS equivalent)

    v0.1.7 130 #cuda #gemm #linear-algebra
  18. tl_backend

    GPU Backend Trait Definitions for TL

    v0.4.11 #back-end #gpu #tl #back-end-traits #define #metal #cuda #top-k #matmul #tensor-logic
  19. ferrum-kernels

    Unified compute kernels (CUDA/Metal/CPU) and model runner for Ferrum inference

    v0.7.3 #cuda #llama #inference-engine #apple-silicon #moe #ferrum #llm-inference #metal #open-ai-compatible #cpu-model
  20. oxicuda-driver

    OxiCUDA Driver - Dynamic CUDA driver API wrapper via libloading (zero SDK dependency)

    v0.1.7 280 #cuda-driver #gpu-compute #cuda #gpu-driver #nvidia #gpu
  21. iron_learn

    ML library with GPU-accelerated gradient descent. Supports tensors, complex numbers, linear/logistic regression, and CUDA optimization.

    v0.6.5 #cuda #gradient-descent #gpu-accelerated #neural-network #tensor #machine-learning #linear-regression #iron #logistic-regression #complex-numbers
  22. oxicuda-launch

    OxiCUDA Launch - Type-safe GPU kernel launch infrastructure

    v0.1.7 210 #cuda #gpu-compute #launch #gpu
  23. cudaclaw

    CUDA Rust bindings for GPU programming in the Cocapn fleet

    v0.1.0 #cuda #gpu #lock-free-queue #agent #benchmark #cuda-bindings #lamports #performance-monitoring #zero-copy #claw
  24. async-cuda

    Async CUDA for Rust

    v0.6.1 450 #cuda #nvidia #gpu #async #npp
  25. xlog-prob

    Probabilistic inference engines for XLOG

    v0.5.0 #xlog #inference-engine #edge #predicate #cuda #gpu #cache #knowledge-graph #fact #probability
  26. omicsx

    SIMD-accelerated sequence alignment and bioinformatics analysis for petabyte-scale genomic data

    v1.0.2 #sequence-alignment #genomics #cuda #bioinformatics #simd-alignment
  27. kn-cuda-sys

    A wrapper around the CUDA APIs

    v0.7.4 #inference #cuda #graph #cu-blas #cudnn #neural-network #onnx #llama #happen #intermediate-representation
  28. ec-gpu

    Traits for field and elliptic curve operations on GPUs

    v0.2.0 26K #prime-field #opencl #finite-fields #curve #codegen #gpu #finite-field-arithmetic #limbs #cuda #elliptic-curve
  29. flash-map

    GPU-native concurrent hash map with bulk-only API. Robin Hood hashing, SoA layout, CUDA kernels. Designed for blockchain state, HFT, and batch-parallel workloads.

    v0.6.0 #hash-map #cuda #gpu #high-performance #concurrency
  30. zfp-sys

    Raw Rust bindings to ZFP (https://github.com/LLNL/zfp)

    v0.4.3 1.2K #github #llnl #bindings #cuda #older-versions
  31. bend-lang

    A high-level, massively parallel programming language

    v0.2.38 #parallel #bend #language #massively #recursion #cuda
  32. llama-cpp-sys-2

    Low Level Bindings to llama.cpp

    v0.1.146 81K #llama-cpp #low-level #cuda
  33. oxideav-nvidia

    Linux NVIDIA NVDEC/NVENC hardware decode/encode bridge for the oxideav framework — runtime-loaded via libloading, no compile-time CUDA SDK dep

    v0.0.2 #nvdec #nvenc #oxideav #cuda #nvidia #multimedia-encoding
  34. async-tensorrt

    Async TensorRT for Rust

    v0.9.1 210 #tensor-rt #cuda #async #nvidia #gpu
  35. tensor-crab

    Rust-native ML library. No Python. No GIL. Just speed.

    v0.1.0 #machine-learning #cuda #tensor #automatic-differentiation #python #neural-network #rust-native #save-load #single-binary #gpu
  36. crown

    A cryptographic library

    v0.19.0 850 #cuda #cryptography #encryption #hashing
  37. rf-detr-ort

    High-performance RF-DETR object detection inference via ONNX Runtime (TensorRT / CUDA / CPU)

    v0.2.0 #tensor-rt #cuda #object-detection #inference #computer-vision
  38. ringkernel-graph

    GPU-accelerated graph algorithm primitives for RingKernel (CSR, BFS, SCC, Union-Find, SpMV)

    v1.1.0 #graph #cuda #bfs #parallel
  39. cuda-rust-wasm

    CUDA to Rust transpiler with WebGPU/WASM support

    v0.1.7 #cuda #web-gpu #transpiler #wasm #gpu
  40. object_detector

    Object detection using ORT and the yoloe-26-seg model. The model can detect multiple objects per image, each with a tag, a pixel-level mask, and a bounding box. It is pretrained and has a vocabulary of 4000+ objects.

    v0.5.0 #object-detection #detect #bounding-box #vocabulary #ort #hugging-face #cuda #promptable #tensor-rt #onnx
  41. xlog-cuda

    CUDA kernel provider, buffers, and interop for XLOG

    v0.5.0 #cuda #xlog #gpu #edge #query #cache #ilp #dlpack #radix-sorting #credits
  42. ringkernel-cuda-codegen

    CUDA code generation from Rust DSL for RingKernel stencil kernels

    v1.1.0 #cuda #codegen #stencil #transpiler #gpu
  43. wax-llm

    Command-line LLM inference with Candle, safetensors, GGUF, and Metal support

    v0.1.0 #gguf #safetensors #llm-inference #model #metal #candle #wax #top-p #top-k #cuda
  44. mwa_hyperdrive

    Calibration software for the Murchison Widefield Array (MWA) radio telescope

    v0.7.0 #murchison-widefield-array #radio #telescope #calibration #hyperdrive #cuda #astronomy #radio-astronomy
  45. cuda-async

    Safe Async CUDA support via Async Rust

    v0.1.0 #cuda #stream #scheduling-policy #multiple-devices #tensor #async-runtime #gpu #cutile #conventions #tokio-runtime
  46. gpu-scatter-gather

    World's fastest wordlist generator using GPU acceleration with multi-GPU support

    v1.8.2 #cuda #word-list #security
  47. guerks_image_processing

    CUDA image processing

    v0.2.0 #image-processing #cuda #command-line-arguments #load-image #grayscale #bat
  48. cudf

    Safe Rust bindings for NVIDIA libcudf -- GPU-accelerated DataFrame operations

    v0.3.1 #arrow #dataframe #cuda #rapids
  49. whisper-mcp-server

    Speech-to-text MCP server powered by whisper.cpp

    v0.1.1 #mcp #mcp-server #json-rpc #web-server #cpp #whisper-cpp #cuda #authentication #transcribe #stdio-transport
  50. pasta-msm

    Optimized multiscalar multiplication for Pasta moduli for x86_64 and aarch64

    v0.1.5 3.2K #pasta #multiscalar #cuda #multi-scalar #x86-64 #arm64 #moduli #multiplicaton #break-down #zk-snarks
  51. axonml-optim

    Optimizers and learning rate schedulers for the Axonml ML framework

    v0.6.2 #learning-rate #optimization #adam #training #neural-network #builder-pattern #sgd #gradient-descent #cuda #decay
  52. moe-gpu-dsp

    MoE-routed GPU signal processing framework — batch cuFFT, kernel dispatch, zero-copy pipelines

    v0.1.1 #cuda #signal-processing #fft #dsp #gpu
  53. ndrs

    A tensor library with GPU support

    v0.5.0 #cuda #gpu #tensor #transfer #dtype #npy #strided #arc #strides
  54. flodl-cli

    libtorch manager and GPU diagnostic tool for Rust deep learning

    v0.5.3 #libtorch #pytorch #cuda #deep-learning #gpu
  55. ringkernel-montecarlo

    GPU-accelerated Monte Carlo primitives for RingKernel (Philox RNG, variance reduction)

    v1.1.0 #monte-carlo #cuda #variance-reduction #simulation
  56. supraseal-c2

    CUDA Groth16 proof generator for Filecoin

    v0.1.2 #groth16 #generator #filecoin #proof #cuda #zk-snarks #benchmark
  57. nove

    lightweight deep learning library wrapped around Candle Tensor

    v0.1.2 #deep-learning #candle #model #metrics #optimization #data-loader #cuda #metal #learner #classification
  58. xlog-cli

    Command-line interface for deterministic and probabilistic XLOG execution

    v0.5.0 #statistics #deterministic #probabilistic #execution #command-line-interface #knowledge-graph #query-engine #python-bindings #cuda #pytorch
  59. oxicuda-backend

    OxiCUDA Backend - Abstract compute backend trait for GPU dispatch

    v0.1.7 100 #gpu-compute #cuda #pure-rust #back-end #gpu
  60. burn-cuda

    CUDA backend for the Burn framework

    v0.21.0 62K #deep-learning #cuda #machine-learning #gpu
  61. infernum

    CLI - From the depths, intelligence rises

    v0.2.0-rc.2 #inference #chat #cache #chat-history #depths #hugging-face #cuda #metal #gpu #local-llm
  62. xlog-gpu

    High-level Rust API for running XLOG programs on NVIDIA GPUs

    v0.5.0 #xlog #statistics #query #programs #nvidia-gpu #rust-api #knowledge-graph #profiling #python-bindings #cuda
  63. ferrum-cuda-kernels

    Custom CUDA kernels and decode runner for Ferrum inference

    v0.6.0 #llama #cuda #inference-engine #ferrum #llm-inference #openai #open-ai-compatible #apple-silicon #metal #embedding
  64. perdix

    High-performance GPU-accelerated ring buffer for AI terminal multiplexing

    v0.1.1 #ring-buffer #cuda #web-gpu #gpu-buffer #terminal #gpu
  65. torsh-backend

    Backend abstraction layer for ToRSh

    v0.1.2 #deep-learning #cuda #metal #gpu
  66. xdl-amp

    Multi-backend GPU/ML acceleration for XDL

    v0.1.1 #gpu #gpu-acceleration #xdl #ml #cuda #multi-backend #amp #opencl #directx #metal
  67. pylate-rs

    WebAssembly library for late interaction models

    v1.0.4 #interaction-model #inference-engine #late #edge-computing #python-bindings #cuda #gpu #mkl #metal #candle
  68. lumen-engine-ffmpeg

    FFmpeg integration for media decode, encode, muxing, and GPU interop in Lumen

    v0.1.0 #vulkan #lumen #interop #ffmpeg #muxing #gpu #cuda #video-frame #metal #gpl
  69. optirs-gpu

    OptiRS GPU acceleration and multi-GPU optimization

    v0.3.1 #cuda #gpu #metal #opencl #optimization
  70. torch_poetry_bootstrap

    A command-line tool to detect CUDA version and install the appropriate PyTorch wheel via Poetry

    v0.1.18 190 #pytorch #poetry #cuda #bootstrap
  71. ec-gpu-gen

    Code generator for field and elliptic curve operations on the GPUs

    v0.7.1 29K #elliptic-curve #cuda #codegen #opencl #finite-fields #gpu #finite-field-arithmetic #compile-time #source-builder #ec-gpu
  72. nvidia-video-codec-sdk

    Bindings for NVIDIA Video Codec SDK

    v0.4.0 270 #video-codec #cuda #nvidia #cuda-bindings
  73. sass-assembler

    SASS (NVIDIA GPU) assembler for Gaia project

    v0.1.1 #sass #gpu #assembly #nvidia-gpu #gaia #cubin #cuda #x86-64 #elf
  74. llama-cpp-bindings

    llama.cpp bindings for Rust

    v0.4.2 160 #llama-cpp #cuda #sampler #context #sampling #gpu
  75. atomr-accel-cuda

    GPU acceleration via the actor model. Wraps NVIDIA CUDA libraries (cuBLAS, cuDNN, cuFFT, cuRAND, cuSOLVER, cuSPARSE, cuTENSOR, cuBLASLt, NVRTC, NCCL) as supervised atomr actors with…

    v0.10.0 #cuda #atomr #ml #gpu
  76. xlog-core

    Core types, traits, and error surfaces shared across XLOG

    v0.5.0 #hash #cuda #symbols #arrow #traits #knowledge-graph #profiling #dedup #gpu #memory-tracking
  77. burn_dragon_kernel

    Fused GPU kernel crate for burn_dragon execution paths

    v0.21.0 #cuda #recurrent #cuda-kernel #burn #neuroscience
  78. cutile

    lets programmers safely author and execute tile kernels directly in Rust

    v0.1.0 #cuda #async #mlir #gpu
  79. cuda-driver-sys

    Rust binding to CUDA Driver APIs

    v0.3.0 33K #cuda #gpgpu
  80. singe-ptx

    CUDA PTX parser, AST, and instruction metadata utilities

    v0.1.0-alpha.2 #cuda #cuda-ptx #parser #nvidia
  81. autd3-backend-cuda

    CUDA Backend for AUTD3

    v35.0.0 3.9K #autd #autd3 #cuda
  82. docbert-pylate

    late interaction (ColBERT) models, vendored into the docbert workspace

    v0.9.0 #colbert #model #document #workspace #cuda #metal #hierarchical #mkl #accelerate #bert
  83. with-gpu

    Intelligent GPU selection wrapper for CUDA commands

    v0.3.0 #gpu #cuda #cli-automation #nvidia-gpu #nvidia #automation
  84. nam-ec-gpu-gen

    Code generator for field and elliptic curve operations on the GPUs

    v0.7.2-nam.0 220 #elliptic-curve #cuda #opencl #codegen #finite-fields #finite-field-arithmetic #gpu #ec-gpu
  85. oxicuda-sparse

    OxiCUDA Sparse - GPU-accelerated sparse matrix operations (cuSPARSE equivalent)

    v0.1.7 #sparse-matrix #cuda #matrix
  86. rlkit

    deep reinforcement learning library based on Rust and Candle, providing complete implementations of Q-Learning and DQN algorithms, supporting custom environments, various policy choices…

    v0.0.3 #reinforcement-learning #deep-learning #dqn #algorithm #q-learning #candle #learning-algorithm #ppo #policies #cuda
  87. car-memgine

    Memgine — graph-based memory engine for Common Agent Runtime

    v0.14.0 190 #graph-node #fact #graph-based #skill #memory-engine #car #conversation #success #cuda #distillation
  88. sbv2_core

    Inference library for Style-Bert-VITS

    v0.2.0-alpha8 370 #bert #style #text-to-speech #cuda #coreml #tensor-rt #directml #vits #場合 #onnx-runtime
  89. signinum-cuda-runtime

    CUDA Driver API runtime helpers for signinum device adapters

    v0.4.1 #cuda #cuda-driver #jpeg #run-time #adapter #rgb8
  90. tensor_frame

    A PyTorch-like tensor library for Rust with CPU, WGPU, and CUDA backends

    v0.0.3-alpha 120 #wgpu #cuda #tensor #machine-learning #gpu
  91. morok-runtime

    Kernel execution runtime for the Morok ML compiler

    v0.1.0-alpha.2 #cuda #devices #parallel-execution #runtime-execution #morok #jit #llvm #gpu #ml
  92. singe-cuda-find

    CUDA toolkit discovery and library path resolution utilities

    v0.1.0-alpha.2 #build-script #cuda #nvidia
  93. baracuda-forge

    Build-time CUDA kernel compiler for the baracuda ecosystem: nvcc-driven incremental builds, parallel compilation, GPU auto-detection, and CUTLASS / custom git dependency support

    v0.0.1-alpha.13 #gpu-kernel #cuda #gpu #nvcc
  94. atomr-accel-flashattn

    FlashAttention v2 + v3 kernel templates for atomr-accel — fp16/bf16/fp8, causal, varlen, ALiBi, sliding window, sink tokens, MQA/GQA, paged KV-cache, and chunked prefill, dispatched through NVRTC + Phase 0…

    v0.10.0 #cuda #atomr #ml #gpu
  95. baracuda-types

    Shared type vocabulary for the baracuda CUDA stack (Half/BFloat16/Complex, DeviceRepr, CudaVersion, Feature, CudaStatus)

    v0.0.1-alpha.13 200 #cuda #run-time #gpu #nvidia #driver
  96. haagenti-cuda

    CUDA GPU decompression kernels for Haagenti tensor compression

    v0.1.0 #cuda #lz4 #compression #gpu #gpu-compression #decompression
  97. tensorrt-infer

    Safe Rust wrappers for NVIDIA TensorRT inference

    v0.1.0 210 #tensor-rt #gpu #inference #cuda #deep-learning
  98. cyanea-gpu

    GPU compute abstraction (CUDA/Metal) for the Cyanea bioinformatics ecosystem

    v0.1.0 #bioinformatics #metal #gpu-compute #cuda #gpu
  99. tesser-cortex

    High-performance, hardware-agnostic AI inference engine for Tesser

    v0.9.3 #artificial-intelligence #inference-engine #onnx #tesser #hardware-agnostic #cuda #tensor-rt #zero-copy #trading #quantitative-trading
  100. cocapn-glue-core

    Cross-tier wire protocol unifying all FLUX ISA packages for the Cocapn fleet

    v0.1.0 #wire-protocols #plato #fleet #flux #isa #cocapn #tier #cuda #unifying #cache
  101. icicle-core

    GPU ZK acceleration by Ingonyama

    v1.3.0 #gpu-acceleration #ntt #msm #hardware-acceleration #cuda #golang #zero-knowledge-proofs
  102. nove_tensor

    lightweight deep learning library wrapped around Candle Tensor

    v0.1.2 #deep-learning #nove #candle #cuda #metal #data-loading #classification #neural-network #cargo-add #macro-derive
  103. cuda-runtime-sys

    Rust binding to CUDA Runtime APIs

    v0.3.0-alpha.1 44K #gpgpu #cuda #ffi
  104. luminal_cudarc

    Safe wrappers around CUDA apis

    v0.10.0 #cuda #nvidia #cu-blas #nvidia-gpu #nvrtc #gpu
  105. ct-cuda-prep

    GPU-ready CUDA snap kernels — compile-verified, CPU fallback, PTX analysis

    v0.1.0 #cuda #snap #gpu #pythagorean
  106. baracuda-cuda-sys

    Raw FFI bindings and dynamic loader for the CUDA Driver and Runtime APIs (libcuda / libcudart)

    v0.0.1-alpha.13 #cuda #run-time #cuda-driver #nvidia #gpu
  107. kaio-runtime

    KAIO runtime — CUDA driver API wrapper, kernel launch, and device memory management. Part of the KAIO GPU kernel authoring framework.

    v0.4.1 #cuda #cuda-ptx #gpu #run-time #gpu-kernel
  108. atomr-accel-cutlass

    CUTLASS kernel-template instantiation via NVRTC for atomr-accel. Provides GEMM, grouped GEMM, implicit-GEMM convolution, and EVT (epilogue visitor tree) actors that JIT CUTLASS C++…

    v0.10.0 #cuda #atomr #ml #gpu
  109. whisper-rs-sys

    Rust bindings for whisper.cpp (FFI bindings)

    v0.15.0 74K #cpp #whisper-rs #whisper-cpp #vulkan #gpu #open-blas #cuda #metal #logging #audio
  110. fatbinary

    manipulate CUDA fatbinary format

    v0.3.0 #cuda #format #manipulate #entries #elf
  111. cudf-cxx

    cxx-based FFI bridge between Rust and NVIDIA libcudf C++ API

    v0.3.1 #cpp #cuda #rapids #gpu #cxx
  112. piper-tts-rs

    Piper-TTS implementation in Rust

    v0.1.4 #piper #text-to-speech #audio #chunks #cuda #cargo-run
  113. oxicuda-runtime

    OxiCUDA Runtime - CUDA Runtime API wrapper (cudaMalloc/cudaMemcpy/cudaLaunchKernel) built on the driver API

    v0.1.7 #gpu-compute #cuda #run-time #nvidia #gpu
  114. baracuda-runtime

    Safe Rust wrappers for the CUDA Runtime API (devices, streams, events, managed memory, kernel launch via the library API)

    v0.0.1-alpha.13 #cuda #run-time #driver #nvidia #gpu
  115. cuvs-sys

    Low-level rust bindings to libcuvs

    v26.4.0 850 #vector-search #nearest-neighbors-search #machine-learning #cluster-analysis #information-retrieval #vector-similarity #similarity-search #gpu #cuda
  116. xndarray

    CPU and CUDA-backed ndarray

    v0.1.0 #cuda #rust #cuda-oxide
  117. infraqueue-ai-server

    AI model server for INFRAQUEUE

    v0.1.0 #artificial-intelligence #infraqueue #ai-model #server #model-server #candle #system-prompt #openai #cuda #local-model
  118. abaddon

    LLM inference engine - The Destroyer renders judgment

    v0.2.0-rc.2 #inference-engine #llm-inference #cuda #web-gpu #model #int8 #fused #metal #speedup #int4
  119. rcudnn

    safe Rust wrapper for CUDA's cuDNN

    v1.8.0 #cudnn #neural-network #cuda #nvidia
  120. RayBNN_Raytrace

    Ray tracing library using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

    v2.0.3 500 #raybnn_raytrace #ray-tracer #opencl #cuda
  121. faiss-next-sys

    Raw FFI bindings to Faiss (Facebook AI Similarity Search)

    v0.6.0 #faiss #search #search-path #cuda #bindings #similarity-search
  122. baracuda-cublas

    Safe Rust wrappers for NVIDIA cuBLAS (classic BLAS, Lt, Xt)

    v0.0.1-alpha.13 #nvidia #cuda #driver #run-time #gpu
  123. tensorlogic-oxicuda-solver

    OxiCUDA linear solver wrapper for TensorLogic (GPU + CPU fallback)

    v0.1.0 #cuda #solver #linear-solver #decomposition #linear-algebra #gpu
  124. baracuda-cutensor-sys

    Raw FFI bindings and dynamic loader for NVIDIA cuTENSOR (tensor contraction)

    v0.0.1-alpha.13 #cuda #run-time #nvidia #gpu #driver
  125. atomr-accel-agents

    Agentic / LLM GPU actor blueprints on atomr-accel-cuda: RagPipeline, EmbeddingCache, CpuVectorIndex, SharedGpuStateCoordinator, LangGraphGpuActor

    v0.10.0 #gpu #llm #rag #cuda #actor
  126. RayBNN_Sparse

    Sparse Matrix Library for GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

    v2.0.2 430 #raybnn_sparse #opencl #cuda #math
  127. baracuda-cvcuda-sys

    Raw FFI bindings and dynamic loader for NVIDIA CV-CUDA (computer-vision operators)

    v0.0.1-alpha.13 #cuda #nvidia #run-time #gpu #driver
  128. baracuda-cutensor

    Safe Rust wrappers for NVIDIA cuTENSOR. Scaffolding at v0.1.

    v0.0.1-alpha.13 #cuda #nvidia #run-time #gpu #driver
  129. dandelion-cuda

    NVIDIA CUDA backend for dandelion LLM inference engine

    v0.1.0 #inference-engine #llm-inference #dandelion #cuda #back-end #nvidia
  130. jawe-cuvs-iv

    RAPIDS vector search library

    v25.4.0 #vector-search #nearest-neighbors-search #cuvs #similarity-search #cluster-analysis #gpu #machine-learning #information-retrieval #cuda #rapids
  131. jawe-cuvs-iii

    RAPIDS vector search library

    v25.4.0 #vector-search #nearest-neighbors-search #cuvs #similarity-search #cluster-analysis #machine-learning #information-retrieval #cuda #rapids #vector-similarity
  132. RayBNN_DataLoader

    Read CSV, numpy, and binary files to Rust vectors of f16, f32, f64, u8, u16, u32, u64, i8, i16, i32, i64

    v2.0.3 480 #raybnn_dataloader #numpy #csv #opencl #cuda
  133. baracuda-nvcomp-sys

    Raw FFI bindings and dynamic loader for NVIDIA nvCOMP (GPU compression)

    v0.0.1-alpha.13 #cuda #nvidia #run-time #driver #gpu
  134. singe-nccl-sys

    Low-level FFI bindings for the NVIDIA Collective Communications Library (NCCL)

    v0.1.0-alpha.2 #nccl #nvidia #cuda #ffi
  135. kitsune-stt

    Speech-to-Text tool using Candle and Voxtral

    v0.1.0 #text-to-speech #candle #audio #audio-processing #voxtral #cuda #cudnn #hugging-face #gpu #transcribe
  136. baracuda-cusolver

    Safe Rust wrappers for NVIDIA cuSOLVER (dense LU factorization at v0.1)

    v0.0.1-alpha.13 #cuda #driver #nvidia #run-time #gpu
  137. blazen-llm-llamacpp

    Local LLM backend for Blazen using llama.cpp inference engine

    v0.5.3 #inference-engine #llama-cpp #blazen #local-llm #llm-inference #cuda #vulkan #gpu #metal #rocm
  138. baracuda-cufile-sys

    Raw FFI bindings and dynamic loader for NVIDIA cuFile (GPUDirect Storage, Linux-only)

    v0.0.1-alpha.13 #cuda #driver #nvidia #run-time #gpu
  139. blazen-llm-mistralrs

    Local LLM backend for Blazen using mistral.rs inference engine

    v0.5.3 #inference-engine #blazen #llm-inference #local-llm #back-end #cuda #mistralrs #mistral #metal #accelerate
  140. atomr-accel-train

    Distributed training blueprints on atomr-accel-cuda: DataParallelTrainer, PipelineParallelTrainer, TensorParallelTrainer, AsyncParameterServer, optimizer + loss enums

    v0.10.0 #training #cuda #gpu #ml #actor
  141. baracuda-nvml

    Safe Rust wrappers for the NVIDIA Management Library (NVML) — driver-bundled GPU monitoring

    v0.0.1-alpha.13 #driver #cuda #run-time #nvidia #api-bindings #gpu
  142. baracuda-cvcuda

    Safe Rust wrappers for NVIDIA CV-CUDA. Scaffolding at v0.1.

    v0.0.1-alpha.13 #cuda #nvidia #run-time #gpu #driver
  143. baracuda-cusparse

    Safe Rust wrappers for NVIDIA cuSPARSE (generic-API SpMV at v0.1)

    v0.0.1-alpha.13 #cuda #nvidia #driver #run-time #gpu #nvidia-gpu
  144. rakka-accel-train

    Distributed training blueprints on rakka-accel-cuda: DataParallelTrainer, PipelineParallelTrainer, TensorParallelTrainer, AsyncParameterServer, optimizer + loss enums

    v0.2.9 #training #cuda #actor #gpu #ml
  145. baracuda-nccl

    Safe Rust wrappers for NVIDIA NCCL (multi-GPU collective communication)

    v0.0.1-alpha.13 #nvidia #cuda #driver #run-time #gpu
  146. iro-cuda-ffi

    IRO CUDA FFI - A minimal, rigid ABI boundary for Rust to orchestrate nvcc-compiled CUDA kernels

    v0.2.1 #cuda #nvidia #gpu #api-bindings
  147. crseo-sys

    Cuda Engined Optics Rust Interface

    v1.3.2 #astronomy #cuda #telescope
  148. rakka-accel-agents

    Agentic / LLM GPU actor blueprints on rakka-accel-cuda: RagPipeline, EmbeddingCache, CpuVectorIndex, SharedGpuStateCoordinator, LangGraphGpuActor

    v0.2.9 #gpu #llm #rag #cuda #actor
  149. singe

    A machine learning framework that sets tensors ablaze

    v0.1.0-alpha.1 #deep-learning #machine-learning #cuda #tensor
  150. hpt-cudakernels

    implements cuda kernels for hpt

    v0.1.3 1.0K #hpt #deep-learning #cuda #compile #onnx #memory-layout
  151. singe-kernel

    Custom CUDA kernel development framework

    v0.1.0-alpha.1 #cuda #deep-learning #machine-learning
  152. singe-onnx

    ONNX model loading and execution utilities

    v0.1.0-alpha.1 #deep-learning #cuda #machine-learning
  153. baracuda-nvcomp

    Safe Rust wrappers for NVIDIA nvCOMP (GPU compression). Scaffolding at v0.1.

    v0.0.1-alpha.13 #nvidia #cuda #run-time #driver #gpu
  154. crown-bin

    A cryptographic library

    v0.19.0 #hashing #encryption #cuda #cryptography #encryption-hashing
  155. rcublas

    safe Rust wrapper for CUDA's cuBLAS

    v0.6.0 #cu-blas #nvidia #cuda #blas
  156. singe-cuda-sys

    Low-level FFI bindings for CUDA driver, runtime, NVRTC, and related NVIDIA APIs

    v0.1.0-alpha.2 #cuda #nvidia #nvrtc
  157. blazen-audio-whispercpp

    Local speech-to-text backend for Blazen using whisper.cpp

    v0.5.3 #blazen #text-to-speech #whisper-cpp #local #back-end #cuda #transcription #metal #llm #coreml
  158. cublas

    safe Rust wrapper for CUDA's cuBLAS

    v0.2.0 #cuda #nvidia #blas
  159. baracuda-tensorrt

    Safe Rust API for NVIDIA TensorRT runtime inference

    v0.0.1-alpha.13 #nvidia #cuda #gpu #driver #run-time
  160. baracuda-nvjitlink

    Safe Rust wrappers for NVIDIA nvJitLink (CUDA 12.0+ JIT linker)

    v0.0.1-alpha.7 #cuda #driver #nvidia #run-time #gpu
  161. baracuda-curand

    Safe Rust wrappers for NVIDIA cuRAND (pseudo- and quasi-random number generation)

    v0.0.1-alpha.13 #nvidia #cuda #gpu #driver #run-time
  162. blazen-image-diffusion

    Local image generation backend for Blazen using diffusion-rs (pure Rust Stable Diffusion)

    v0.5.3 #image-generation #blazen #inference-engine #local #diffusion-rs #stable-diffusion #cuda #metal #llm
  163. codemem-embeddings

    Candle-based embedding service for Codemem using BAAI/bge-base-en-v1.5

    v0.16.1 #artificial-intelligence #codemem #embedding-model #candle #ollama #cuda #metal #model-provider #gpu #openai
  164. baracuda-cudf

    Safe Rust API skeleton for NVIDIA RAPIDS cuDF (GPU DataFrames)

    v0.0.1-alpha.13 #gpu #cuda #nvidia #driver #run-time
  165. blazen-embed-candle

    Local embedding backend for Blazen using HuggingFace candle

    v0.5.3 #hugging-face #blazen #inference #candle #local #embedding-model #cuda #metal #text-embedding #stub
  166. cuda-config

    Helper crate for finding CUDA libraries

    v0.1.0 65K #gpgpu #cuda #ffi
  167. baracuda-npp

    Safe Rust wrappers for NVIDIA NPP (Performance Primitives). Core + signal subset at v0.1.

    v0.0.1-alpha.13 #cuda #gpu #nvidia #run-time #driver
  168. mnemefusion-llama-cpp-sys-2

    Low Level Bindings to llama.cpp (MnemeFusion fork with build fixes)

    v0.1.139 #llama-cpp #build #bindings #low-level #cuda #safe-api #llama-cpp-2
  169. ptx-builder

    NVPTX build helper

    v0.5.3 #nvptx #cuda #gpgpu
  170. icicle-cuda-runtime

    Ingonyama's Rust wrapper of CUDA runtime

    v1.3.0 #zero-knowledge-proofs #icicle #cuda #run-time #golang #gpu-acceleration #ntt #hardware-acceleration #msm #privacy-preserving
  171. tropical-gemm-cuda

    CUDA backend for tropical matrix multiplication

    v0.2.0 #cuda #tropical #gemm #gpu
  172. jawe-cuvs-sys-ii

    Low-level rust bindings to libcuvs

    v25.4.0 #nearest-neighbors-search #vector-search #cuvs #machine-learning #cluster-analysis #similarity-search #vector-similarity #information-retrieval #cuda #sparse-vector
  173. hodu_cuda_kernels

    hodu cuda kernels

    v0.2.4 #cuda #tensor #matrix #hodu #gpu #cu-blas #row-major #non-contiguous #deep-learning
  174. RayBNN_Cell

    Cell Position Generator for RayBNN

    v2.0.3 350 #raybnn_cell #ray-tracer #opencl #cuda
  175. baracuda-cutlass-sys

    Header acquisition for NVIDIA CUTLASS as a baracuda workspace dependency. Sparse-checkout fetch with file-locked caching; emits cargo:include for downstream build.rs consumers.

    v0.0.1-alpha.13 #header #cuda #cutlass #gpu #ffi
  176. zenu-cuda

    CUDA bindings for Rust

    v0.1.0 #cuda #deep-learning #deep-learning-framework #cudnn #cuda-bindings #gpu #cu-blas #classification
  178. accel

    GPGPU Framework for Rust

    v0.3.1 #gpgpu #cuda
  179. blazen-llm-candle

    Local LLM backend for Blazen using candle inference engine

    v0.5.3 #blazen #llm-inference #inference-engine #candle #llm-provider #local-llm #cuda #metal #gpu
  180. cuda

    CUDA bindings

    v0.4.0-pre.2 #cuda-bindings #random #run-time #bindgen #driver
  181. baracuda-cudnn-sys

    Raw FFI bindings and dynamic loader for NVIDIA cuDNN (classic-API subset)

    v0.0.1-alpha.13 #cuda #run-time #driver #gpu #nvidia
  182. cudnn

    safe Rust wrapper for CUDA's cuDNN

    v1.3.1 #neural-network #cuda #nvidia
  183. baracuda-cusolver-sys

    Raw FFI bindings and dynamic loader for NVIDIA cuSOLVER (Dn subset)

    v0.0.1-alpha.13 #cuda #nvidia #run-time #driver #gpu
  184. baracuda-cusparse-sys

    Raw FFI bindings and dynamic loader for NVIDIA cuSPARSE

    v0.0.1-alpha.13 #cuda #gpu #nvidia #run-time #driver
  185. kaio-core

    KAIO core — PTX IR types and emission. Part of the KAIO GPU kernel authoring framework.

    v0.4.0 #cuda #cuda-ptx #codegen #gpu #ir
  186. baracuda-cublas-sys

    Raw FFI bindings and dynamic loader for NVIDIA cuBLAS (classic, Lt, Xt) libraries

    v0.0.1-alpha.13 #cuda #nvidia #run-time #driver #gpu
  187. baracuda-nvml-sys

    Raw FFI bindings and dynamic loader for the NVIDIA Management Library (NVML)

    v0.0.1-alpha.13 #nvidia #run-time #driver #cuda #gpu
  188. RayBNN_Optimizer

    Gradient Descent Optimizers and Genetic Algorithms using GPUs, CPUs, and FPGAs via CUDA, OpenCL, and oneAPI

    v2.0.1 140 #raybnn_optimizer #gradient-descent #opencl #cuda #math
  189. ventura-cuda

    cuda feature for ventura

    v0.1.0 #cuda #ventura #feature-for-ventura
  190. emixai

    Feature-gated AI helpers (audio, imaging, language, vision) for EssentialMix

    v0.6.0 #artificial-intelligence #helper #cuda #language #chatgpt #audio #feature-gated #metal #mkl #computer-vision
  191. baracuda-nccl-sys

    Raw FFI bindings and dynamic loader for NVIDIA NCCL (multi-GPU collective communication)

    v0.0.1-alpha.13 #cuda #nvidia #run-time #driver #gpu
  192. baracuda-tensorrt-sys

    Raw FFI bindings and dynamic loader for NVIDIA TensorRT (C API)

    v0.0.1-alpha.13 #cuda #gpu #nvidia #run-time #driver
  193. baracuda-nvjpeg-sys

    Raw FFI bindings and dynamic loader for NVIDIA nvJPEG

    v0.0.1-alpha.13 #nvidia #cuda #driver #run-time #gpu
  194. baracuda-curand-sys

    Raw FFI bindings and dynamic loader for NVIDIA cuRAND

    v0.0.1-alpha.13 #cuda #nvidia #run-time #gpu #driver
  195. singe-macros

    Procedural macros for the Singe framework

    v0.1.0-alpha.1 #deep-learning #cuda #machine-learning
  196. baracuda-cufft-sys

    Raw FFI bindings and dynamic loader for NVIDIA cuFFT

    v0.0.1-alpha.13 #cuda #run-time #nvidia #driver #gpu
  197. scir-gpu

    SciR GPU foundations: device arrays and CUDA (feature-gated) elementwise/FIR kernels with CPU parity

    v0.3.2 #cuda #wgpu #gpu #scipy #scir
  198. luminal_cuda

    Cuda compiler for luminal

    v0.2.0 #deep-learning #cuda #luminal #compiler
  199. cudarse-driver

    Bindings to the CUDA Driver API that tries to stay faithful to the original

    v0.1.0 #cuda #cuda-driver #cuda-bindings #hardware-acceleration #video-processing #turbo-metrics #npp #ssimulacra2 #amf #cudarse
  200. async-cuda-npp

    Async NVIDIA Performance Primitives for Rust

    v0.4.0 #cuda #npp #nvidia #async #gpu #nvidia-gpu
  201. cudf-sys

    Native build script for linking against NVIDIA libcudf

    v0.3.1 #cuda #rapids #gpu #api-bindings
  202. zerch-embed

    Local embedding model using ONNX Runtime

    v0.1.0 #onnx #embedding-model #download #local #embed #vector-embedding #cuda
  203. crown-jsasm

    A cryptographic library

    v0.19.0 #hashing #encryption #cuda #cryptography #encryption-hashing
  204. cudnn-sys

    FFI bindings to cuDNN

    v0.0.3 #cuda #nvidia #sys
  205. jawe-cuvs-sys-iii

    Low-level rust bindings to libcuvs

    v25.4.0 #vector-search #cuvs #nearest-neighbors-search #machine-learning #cluster-analysis #similarity-search #information-retrieval #cuda #vector-similarity #gpu
  206. candle_embed

    Text embeddings with Candle. Fast and configurable. Use any model from Hugging Face. CUDA or CPU powered.

    v0.1.4 130 #hugging-face #cuda #vector-embedding #search #embeddings
  207. jawe-cuvs-sys-iv

    Low-level rust bindings to libcuvs

    v25.4.0 #nearest-neighbors-search #cuvs #vector-search #machine-learning #similarity-search #cluster-analysis #gpu #information-retrieval #vector-similarity #cuda
  208. nam-supraseal-c2

    CUDA Groth16 proof generator for Filecoin

    v0.1.2-nam.0 #groth16 #proof #generator #filecoin #cuda #zk-snarks #benchmark #c1
  209. whisper-cpp-plus-sys

    Low-level FFI bindings for whisper.cpp

    v0.1.4 120 #whisper-cpp #cuda #open-blas #quantization #metal #bindings-for-whisper
  210. tensorrt-infer-sys

    Raw FFI bindings for NVIDIA TensorRT inference

    v0.1.0 270 #tensor-rt #cuda #inference #gpu #ffi
  211. cuda-oxide

    high-level, rusty wrapper over CUDA. It provides the best safety one can get when working with hardware.

    v0.4.0 #cuda #gpu #parallel
  212. gpufft-cuda-sys

    Raw FFI bindings to cuFFT + CUDA Runtime. Internal plumbing for gpufft.

    v0.1.2 #cuda #cufft #fft #gpu #ffi
  213. darknet-sys

    -sys crate for Rust darknet wrapper

    v0.4.0 150 #cuda #update #default #dynamic #recursion #git-submodule
  214. rcudnn-sys

    FFI bindings to cuDNN

    v0.5.0 #cudnn #nvidia #cuda #sys
  215. cmake-init

    Initialize CMake project at speed

    v0.1.0 #cuda #hip #cpp #openmpi
  216. oxidized-transformers

    Transformers library (not functional yet)

    v0.1.1 #llama #transformer-models #hugging-face #bert #oxidized #cuda #albert #float16
  217. rcublas-sys

    FFI bindings to cuBLAS

    v0.5.0 #cu-blas #cuda #nvidia #sys
  218. wgpu-cuda-interop

    Vulkan and CUDA memory interop

    v0.9.0 #cuda #vulkan #interop #wgpu #memory
  219. rummage-sys

    Raw FFI bindings to the Rummage GPU Nostr mining library (CUDA)

    v0.2.3 #gpu #nostr #mining #cuda #miner #vanity-key
  220. tensorgraph-sys

    backbone for tensorgraph, providing memory management across devices

    v0.1.11 #cuda #blas #numeric #neural-network #machine-learning
  221. aprender-gpu

    Pure Rust PTX generation for NVIDIA CUDA - no LLVM, no nvcc

    v0.33.0 #cuda #cuda-ptx #nvidia #gpu #simd