-
cudarc
Safe and minimal CUDA bindings
-
nvml-wrapper
A safe and ergonomic Rust wrapper for the NVIDIA Management Library
-
lru-slab
Pre-allocated storage with constant-time LRU tracking
-
vulkano
Safe wrapper for the Vulkan graphics API
-
whisper-rs
Rust bindings for whisper.cpp
-
femtovg
Antialiased 2D vector drawing library
-
opencl3
Khronos OpenCL 3.0 API and extensions
-
neptune
Poseidon hashing over BLS12-381 for Filecoin
-
pixels
A tiny library providing a GPU-powered pixel frame buffer
-
cl3
Khronos OpenCL 3.0 API and extensions
-
ocl
OpenCL bindings and interfaces for Rust
-
spirv-std
Standard functions and types for SPIR-V
-
ai-hwaccel
Universal AI hardware accelerator detection, capability querying, and workload planning for Rust
-
qmassa
Terminal-based tool for displaying GPU usage stats on Linux
-
kubectl-view-allocations
kubectl plugin to list allocations (cpu, memory, gpu,...) X (utilization, requested, limit, allocatable,...)
-
cubecl
Multi-platform high-performance compute language extension for Rust
-
system_info_collector
Fast application to collect OS information and create graphs from it
-
cuvs
RAPIDS vector search library
-
xilem
A next-generation cross-platform Rust UI framework
-
gpu-descriptor
agnostic descriptor allocator for Vulkan-like APIs
-
tracy_full
Fully featured bindings for the Tracy profiler
-
axonml-autograd
Automatic differentiation engine for Axonml ML framework
-
dynamo-llm
Dynamo LLM Library
-
narsil
A terminal-based system resource monitor — GPU-aware, Braille charts, per-char label inversion
-
warp-types
Type-safe GPU warp programming via linear typestate: compile-time prevention of shuffle-from-inactive-lane bugs
-
gpu-alloc
agnostic memory allocator for Vulkan-like APIs
-
beamterm-renderer
High-performance WebGL2 terminal renderer for beamterm, targeting sub-millisecond render times in web browsers
-
aprender
Next-generation ML framework in pure Rust: cargo install aprender for the apr CLI
-
diffusionx
A multi-threaded crate for random number generation and stochastic process simulation, with optional GPU acceleration
-
vhost-device-gpu
A virtio-gpu device using the vhost-user protocol
-
ranga
Core image processing library — color spaces, blend modes, pixel buffers, filters, and GPU compute for Rust
-
pathfinder_geometry
Basic SIMD-accelerated geometry/linear algebra
-
dynamo-memory
Memory management library for Dynamo
-
wgpu-llm-cli
Terminal-based chat interface for the wgpu LLM inference engine
-
oxicuda-driver
OxiCUDA Driver - Dynamic CUDA driver API wrapper via libloading (zero SDK dependency)
-
neonwhite_seed_finder
Find and simulate seeds for neon white level rushes
-
mold-ai
Local AI image generation CLI — FLUX, SDXL, SD3.5, Z-Image diffusion models on your GPU
-
poulpy-hal
providing layouts and a trait-based hardware acceleration layer with open extension points, matching the API and types of spqlios-arithmetic
-
nv-swaptop
A terminal user interface tool to monitor swap, NUMA topology, and GPU memory usage
-
orb8
eBPF-powered observability toolkit for Kubernetes with GPU telemetry
-
oxicuda-ptx
OxiCUDA PTX - PTX code generation DSL and IR for GPU kernel development
-
rave-cli
RAVE CLI — GPU-native video super-resolution engine
-
envy-tui
TUI manager for EnvyControl - GPU switching for Nvidia Optimus laptops
-
vulkane
Vulkan API bindings generated entirely from vk.xml, with a complete safe RAII wrapper covering compute and graphics: instance/device/queue, buffer, image, sampler, render pass, framebuffer…
-
numr
High-performance numerical computing with multi-backend GPU acceleration (CPU/CUDA/WebGPU)
-
msb_krun
Native Rust API for libkrun microVMs
-
tract-gpu
Tiny, no-nonsense, self contained, TensorFlow and ONNX inference
-
gpu-fft
performing Fast Fourier Transform (FFT) and Inverse FFT using GPU acceleration
-
ctt
Compress images to GPU texture formats
-
runmat-filesystem
Swappable filesystem abstraction for RunMat hosts (native, wasm, remote)
-
ryft-pjrt
Ryft bindings for PJRT
-
fast-umap
Configurable UMAP (Uniform Manifold Approximation and Projection) in Rust
-
cutty
A fast, cross-platform GPU terminal emulator
-
oxicuda-memory
OxiCUDA Memory - Type-safe GPU memory management with Rust ownership semantics
-
mold-ai-discord
Discord bot for mold — AI image generation via slash commands
-
quant-iron
high-performance, hardware-accelerated modular quantum computing library with a focus on physical applications. Quant-Iron provides tools to represent quantum states, apply standard quantum gates…
-
oxicuda-launch
OxiCUDA Launch - Type-safe GPU kernel launch infrastructure
-
burn-onnx
importing ONNX models into the Burn framework
-
oxicuda
Pure Rust CUDA replacement for the COOLJAPAN ecosystem (95% performance target)
-
beamterm-core
Platform-agnostic OpenGL terminal renderer using glow
-
torc
Workflow management system
-
mtl-gpu
Rust bindings to Apple's Metal framework
-
runpod-cli
RunPod CLI — auto-generated from OpenAPI spec
-
mosec
Model Serving made Efficient in the Cloud
-
ringkernel-cuda
CUDA backend for RingKernel - NVIDIA GPU support via cudarc
-
axonml-data
Data loading utilities for the Axonml ML framework
-
gptop
A cross-platform GPU monitor TUI with support for both Apple Silicon and NVIDIA GPUs
-
glcore-rs
The OpenGL core functions for Rust, also supports OpenGL ES
-
glances
A modern Glances-inspired TUI system monitor written in Rust — CPU, memory, GPU, Docker, network, disk, battery, alerts, and container API testing
-
cubecl-cpp
CPP transpiler for CubeCL
-
jolt-platform
Cross-platform battery and power monitoring for jolt
-
gameboy
emulator written in Rust and WebAssembly
-
luna-rs
LUNA EEG Foundation Model — inference in Rust with Burn ML
-
cudaforge
Advanced CUDA kernel builder for Rust with incremental builds, auto-detection, and external dependency support
-
mold-ai-inference
Candle-based inference engine for mold — FLUX, SDXL, SD3.5, Z-Image diffusion models
-
nvml-wrapper-sys
Generated bindings to the NVIDIA Management Library
-
kn-cuda-eval
A CUDA executor for neural network graphs
-
gpu-trace-perf
Plays a collection of GPU traces under different environments to evaluate driver changes on performance
-
beamterm-atlas
Font atlas generator for beamterm WebGL terminal renderer, creating GPU-optimized texture arrays from TTF/OTF fonts
-
zuna-rs
ZUNA EEG Foundation Model — inference in Rust with Burn ML
-
nvmon
Terminal-based NVIDIA GPU monitoring tool with real-time graphs using NVML
-
runmat-gc-api
Public API types for the RunMat garbage collector
-
vulkan-rust
Ergonomic Vulkan bindings for Rust, generated from vk.xml
-
mold-ai-tui
Terminal UI for mold — interactive AI image generation
-
zeusd
Zeus daemon
-
mlx-native
Pure-Rust Metal GPU compute library for MLX-compatible inference on Apple Silicon
-
neofetch
-
ringkernel-wavesim
Interactive 2D wave propagation showcase for RingKernel
-
oxifetch
program that displays key system information, such as OS details, uptime, CPU specs, memory usage, and more. The output includes an ASCII art logo and a quick overview of your machine's current status.
-
runmat-thread-local
Cross-platform thread-local storage helpers for RunMat (native and wasm)
-
gloss-rs
Top level crate for gloss-rs
-
kn-runtime
Dynamic wrapper around CPU and GPU inference
-
hardware
A no_std bare-metal hardware abstraction layer — all port I/O, memory and swap allocations are guarded at runtime. Do not consider this dependency stable before x.1.x
-
easy-async-opencl3
A declarative, multi-device asynchronous executor for OpenCL based on cl3
-
flash-rerank-cli
CLI for Flash-Rerank — compile, benchmark, serve, and download models
-
oversee
A modern system monitor for macOS with Apple Silicon GPU support
-
mold-ai-server
HTTP inference server for mold
-
vkcore-rs
The Vulkan core functions for Rust
-
neuronbox-cli
Local ML runner: declarative neuron.yaml, model store, daemon, and terminal dashboard
-
async-cuda
Async CUDA for Rust
-
vx-vision
GPU-accelerated computer vision for Apple Silicon via Metal compute shaders
-
mtl-sys
Low-level Objective-C runtime bindings for Metal
-
meganeura
E-graph optimized neural network training on Blade
-
ringkernel-ecosystem
Ecosystem integrations for RingKernel - actors, web frameworks, data processing, ML
-
rustorch-core
Core tensor library for RusTorch
-
cuneus
A WGPU-based shader development tool
-
mtl-foundation
Foundation framework bindings (NSObject, NSString, NSArray, etc.)
-
kronos-compute
A high-performance compute-only Vulkan implementation with cutting-edge GPU optimizations
-
forge-sort
GPU radix sort for Apple Silicon — 4,800+ Mkeys/s, 6 data types, zero-copy
-
burn-vision
Vision processing operations for burn tensors
-
flash-map
GPU-native concurrent hash map with bulk-only API. Robin Hood hashing, SoA layout, CUDA kernels. Designed for blockchain state, HFT, and batch-parallel workloads.
-
runmat-async
Shared async runtime error types and host I/O interaction primitives for RunMat
-
wbackend
WASMA – Resource-first runtime: CPU-priority, GPU-optional, platform-agnostic
-
nviwatch
A blazingly fast Rust-based TUI for managing and monitoring NVIDIA GPU processes
-
mtop-tui
Sudo-less system monitor for Apple Silicon Macs: beautiful real-time braille graphs for CPU, GPU, temperature, memory, disk, network, power, and processes
-
rtop
A system monitor implemented in Rust that tracks both system activity and GPU activity for NVIDIA GPUs
-
trtx
Safe Rust bindings to NVIDIA TensorRT-RTX (EXPERIMENTAL - NOT FOR PRODUCTION)
-
hw_dcmi_wrapper
A safe and ergonomic Rust wrapper for the Huawei DCMI API
-
runmat-time
Cross-platform time utilities for RunMat (monotonic + wall-clock helpers)
-
async-tensorrt
Async TensorRT for Rust
-
msb_krun_vmm
Virtual machine monitor for msb_krun microVMs
-
beamterm-data
Core data structures and binary serialization for the beamterm WebGL terminal renderer
-
kaio-candle
Candle bridge for KAIO — CustomOp bindings for 8 GPU ops (matmul_tc, matmul_tc_async, matmul_int4, matmul_int8, attention_tc, attention_tc_causal, qkv_project_int8, qkv_project_int4)…
-
tensor-crab
Rust-native ML library. No Python. No GIL. Just speed.
-
runmat-builtins
RunMat built-in functions and standard library components
-
mtl-fx
MetalFX bindings for AI upscaling and frame interpolation
-
ringkernel-metal
Metal backend for RingKernel - Apple GPU support
-
opencl-sys
OpenCL C FFI bindings for the Rust programming language
-
silicon-monitor
Silicon Monitor: Comprehensive hardware monitoring for CPUs, GPUs, NPUs, memory, I/O, and network silicon across all platforms
-
msb_krun_devices
Virtio device implementations for msb_krun microVMs
-
gllm-kernels
Low-level attention kernels for gllm with CUDA/ROCm support
-
rocm_smi_lib
easy to use crate for using rocm-smi from rust
-
cuda-rust-wasm
CUDA to Rust transpiler with WebGPU/WASM support
-
mabda
GPU foundation layer for AGNOS (device, buffers, compute, textures)
-
nnl
A high-performance neural network library for Rust with CPU and GPU support
-
burn-wgpu
WGPU backend for the Burn framework
-
hybrid-predict-trainer-rs
Hybridized predictive training framework with warmup, full-train, predict, and residual correction phases for accelerated deep learning
-
wa
Cross-platform window assistant made primarily for Rio terminal
-
ctt-cli
Command-line interface for the ctt texture compression library
-
ringkernel-wavesim3d
3D acoustic wave simulation with realistic physics, binaural audio, and GPU acceleration
-
inlyne
Introducing Inlyne, a GPU powered yet browserless tool to help you quickly view markdown files in the blink of an eye
-
general-mcmc
A compact Rust library for Markov Chain Monte Carlo (MCMC) methods with GPU support
-
ha-ndarray
A hardware-accelerated n-dimensional array
-
archx
High-performance CPU/GPU adaptive optimization library with SIMD and Multithreading
-
vyre-spec
Frozen data contracts for vyre — OpSpec, AlgebraicLaw, Category, IntrinsicTable
-
vkfetch-rs
Fetch program that displays basic information about your Vulkan-compatible graphics card(s)
-
freecycle
GPU-aware Ollama lifecycle manager for Windows. Monitors for games and GPU-intensive programs, automatically enabling/disabling networked Ollama access when the GPU is available.
-
blinc_gpu
Blinc GPU renderer - SDF-based rendering via wgpu
-
halldyll_starter_runpod
managing RunPod GPU pods - Provisioning, orchestration & state management
-
par-fractal
Cross-platform GPU-accelerated fractal renderer with 2D and 3D support
-
unmtx-gpu
Micro matrix library for neural networks that uses GPU
-
wgpu_render_manager
Cached Render/Compute Manager for wgpu (pipelines + bind groups + procedural textures automated)
-
rumus
A native-Rust deep learning framework with explicit memory safety and hardware acceleration
-
hanzo-pqc
Post-quantum cryptography primitives (ML-KEM, ML-DSA, SLH-DSA) for Hanzo ecosystem
-
runpod
client for the RunPod API
-
cubecl-common
Common crate for CubeCL
-
llmux
Hook-driven LLM model multiplexer with pluggable switch policy
-
pylate-rs
WebAssembly library for late interaction models
-
cudagrep
Safe Rust bindings for GPUDirect Storage zero-copy NVMe-to-GPU transfers
-
raytop
A real-time TUI monitor for Ray clusters
-
turboplot
A blazingly fast waveform renderer made for visualizing huge traces
-
image-colorizer
Never settle for images outside your colorscheme again!
-
kaio
Rust-native GPU kernel authoring framework. Write GPU compute kernels in Rust, automatically lower to PTX. Cross-platform (Windows + Linux), type-safe, no CUDA C++ required.
-
cubecl-wgpu
WGPU runtime for CubeCL
-
optirs
Advanced ML optimization and hardware acceleration library (main integration crate)
-
edgefirst-image
High-performance image processing with hardware acceleration for edge AI
-
infernum
CLI - From the depths, intelligence rises
-
llama-cpp-bindings
llama.cpp bindings for Rust
-
csep
Cosine Similarity Embeddings Print
-
bare-metal-kernels
Metal GPU kernels for LLM inference on Apple Silicon — 85+ optimized compute shaders
-
burn-cubecl
Generic backend that can be compiled just-in-time to any shader language target
-
archetype_asset
Fast, modular asset system with spatial preloading
-
keplemon
Expanded functionality for the Standardized Astrodynamics Algorithms Library (SAAL)
-
bevy_gpu_test
A test harness for running GPU compute shaders in Bevy and reading back results for CPU-side assertions
-
opentelemetry-system-metrics
System metric export through OpenTelemetry
-
tenflowers
Pure Rust implementation of TensorFlow - A comprehensive deep learning framework
-
axdriver_display
Common traits and types for graphics device drivers
-
vx-gpu
Shared-memory Metal buffer management for Apple Silicon UMA
-
feagi-npu-burst-engine
High-performance burst engine for FEAGI neural processing
-
trueno-zram-adaptive
ML-driven compression algorithm selection for trueno-zram
-
vyre
GPU compute intermediate representation with a standard operation library
-
memkit-gpu
Backend-agnostic GPU memory management for memkit
-
host_discovery
host discovery
-
morok-runtime
Kernel execution runtime for the Morok ML compiler
-
ax-driver-display
Common traits and types for graphics device drivers
-
neuronbox-runtime
Local ML runner: declarative neuron.yaml, model store, daemon, and terminal dashboard
-
ringkernel-cuda-codegen
CUDA code generation from Rust DSL for RingKernel stencil kernels
-
flodl-cli
libtorch manager and GPU diagnostic tool for Rust deep learning
-
wave-sycl
WAVE SYCL backend - translates WAVE binary to Intel SYCL kernel source
-
vyre-std
Vyre standard library: GPU DFA assembly pipeline, Aho-Corasick construction, and compositional arithmetic helpers
-
gatenative
Natively execute Gate circuits
-
cubecl-cuda
CUDA runtime for CubeCL
-
memkit-co
CPU-GPU memory coordination for the memkit ecosystem
-
bitnet-metal
Metal GPU acceleration for BitNet on Apple Silicon
-
honeycrisp
bare-metal Rust drivers for Apple Silicon — unified memory, CPU/AMX, Metal GPU, and Neural Engine
-
rocm-rs
Rust bindings for AMD ROCm libraries
-
ringkernel-cpu
CPU backend for RingKernel - testing and fallback implementation
-
volren-gpu
wgpu-based GPU volume renderer for volren-rs
-
horizon-lattice-render
Graphics rendering backend for Horizon Lattice using wgpu
-
ringkernel-wgpu
WebGPU backend for RingKernel - cross-platform GPU support
-
oxiphysics-gpu
GPU acceleration backends for the OxiPhysics engine
-
dora-operator-api-c
C API implementation for Dora Operator
-
dora-ros2-bridge
ROS2 bridge for dora-rs
-
moe-gpu-dsp
MoE-routed GPU signal processing framework — batch cuFFT, kernel dispatch, zero-copy pipelines
-
blit-compositor
blit headless Wayland compositor
-
skia-graphics-rs
High-performance 2D graphics library built on Skia with GPU acceleration
-
yule-gpu
GPU compute backends: Vulkan, CUDA, Metal, and CPU SIMD fallback
-
ec-gpu-gen
Code generator for field and elliptic curve operations on GPUs
-
pulsemon
Cross-platform system monitor TUI — CPU, memory, disk, GPU, ports, process management
-
nndex
In-memory nearest neighbor search engine
-
xdl-amp
Multi-backend GPU/ML acceleration for XDL
-
vyre-reference
Pure-Rust CPU reference interpreter for vyre IR — byte-identical oracle for backend conformance and small-data fallback
-
vkobject-rs
The Vulkan object wrappers for Rust
-
nvglances
A TUI system monitor with support for NVIDIA GPUs (CUDA/NVML) and Apple Silicon GPUs (Metal)
-
rsfgsea
High-performance fgsea-compatible preranked Gene Set Enrichment Analysis in Rust
-
images_and_words
GPU middleware and abstraction layer for high-performance graphics applications and games
-
hardware-query
Cross-platform Rust library for comprehensive hardware detection, real-time monitoring, power management, and AI/ML optimization
-
getgpuname
Gets the GPU name from the PCI-IDS database using either the provided parameters or the ones in /sys/class/drm/
-
nam-ec-gpu-gen
Code generator for field and elliptic curve operations on the GPUs
-
rust-ai-core
Unified AI engineering toolkit: orchestrates peft-rs, qlora-rs, unsloth-rs, axolotl-rs, bitnet-quantize, trit-vsa, vsa-optim-rs, and tritter-accel
-
ringkernel
GPU-native persistent actor model framework - Rust port of DotCompute Ring Kernel
-
zeta-reticula-server
GPU-accelerated ML inference server with Stripe billing, Hugging Face model caching, and SSE streaming
-
cubecl-hip
AMD ROCm HIP runtime for CubeCL
-
flash-rerank-benchmarks
Benchmark suite for Flash-Rerank
-
kenbun
Terminal system resource monitor for Linux
-
crystal-vk
Graphics wrapper for Vulkan
-
any-gpu
Tensor engine for every GPU. AMD, NVIDIA, Intel, Apple. One codebase, zero vendor lock-in. wgpu under the hood.
-
yule-sandbox
Cross-platform process sandboxing: seccomp, AppContainer, seatbelt
-
oxicuda-dnn
OxiCUDA DNN - GPU-accelerated deep learning primitives (cuDNN equivalent)
-
libinfer
Rust interface to TensorRT for high-performance GPU inference
-
wave-ptx
WAVE PTX backend - translates WAVE binary to NVIDIA PTX assembly
-
runpod-sdk
Unofficial Rust SDK for RunPod: deploy and scale GPU workloads with serverless endpoints and on-demand pods
-
zentype
A high-performance modular text rendering engine for Rust
-
yoinky
TUI tool for monitoring system resources like CPU, RAM, and GPU
-
wasma-windows-platform-wasma-sys
WASMA Windows Platform WASMA-Sys module
-
cubek-reduce
CubeK: Reduce Kernels
-
wave-metal
WAVE Metal backend - translates WAVE binary to Apple Metal Shading Language
-
burn-cubecl-fusion
Provide optimizations that can be used with cubecl based backends
-
tenflowers-neural
Neural network layers, models and training APIs for TenfloweRS
-
aprender-compute
High-performance SIMD compute library with GPU support, LLM inference engine, and GGUF model loading (was: trueno)
-
ringkernel-ir
Intermediate Representation for RingKernel GPU code generation
-
flashlight_tensor
gpu/cpu tensor library focused around matrix and neural network operations
-
ryft-xla-sys
Ryft bindings for XLA