gpu

Here are 431 public repositories matching this topic...

hujie-frank / SENet

Squeeze-and-Excitation Networks

caffe gpu senet

Updated Feb 25, 2019
Cuda

rapidsai / cugraph

cuGraph - RAPIDS Graph Analytics Library

graph graph-algorithms gpu cuda nvidia complex-networks graph-analysis graphml graph-framework rapids

Updated Mar 24, 2026
Cuda

CannyLab / tsne-cuda

GPU Accelerated t-SNE for CUDA with Python bindings

python gpu cuda multithreading data-visualization mnist data-analysis tsne-algorithm tsne barnes-hut-tsne barnes-hut fit-tsne tsne-cuda

Updated Oct 2, 2024
Cuda

NVIDIA / cub

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

cxx algorithms cpp gpu cpp14 cuda cpp11 nvidia cpp17 cub cpp20 cxx11 cxx14 cxx17 cxx20 nvidia-hpc-sdk

Updated Oct 9, 2023
Cuda

rapidsai / raft

RAFT contains fundamental widely-used algorithms and primitives for machine learning and information retrieval. The algorithms are CUDA-accelerated and form building blocks for more easily writing high performance applications.

Updated Mar 24, 2026
Cuda

Celebrandil / CudaSift

A CUDA implementation of SIFT for NVidia GPUs (1.2 ms on a GTX 1060)

gpu cuda nvidia vision sift

Updated Mar 17, 2026
Cuda

NVIDIA / nvbench

CUDA Kernel Benchmarking Library

benchmark performance gpu cuda nvidia cuda-kernels kernel-benchmark

Updated Mar 24, 2026
Cuda

NVIDIA / cuopt

GPU accelerated decision optimization

gpu optimization cuda linear-programming

Updated Mar 24, 2026
Cuda

brucefan1983 / GPUMD

Graphics Processing Units Molecular Dynamics

machine-learning neural-network simulation gpu cuda molecular-dynamics neuroevolution high-performance-computing molecular-dynamics-simulation phonon physics-simulation natural-evolution-strategies heat-transport gpumd machine-learning-potential

Updated Mar 23, 2026
Cuda

rapidsai / cuvs

cuVS - a library for vector search and clustering on the GPU

machine-learning information-retrieval statistics clustering gpu distance cuda sparse nearest-neighbors similarity-search vector-similarity anns vector-search llm vector-store neighborhood-methods

Updated Mar 24, 2026
Cuda

yassa9 / qwen600

Static suckless single batch CUDA-only qwen3-0.6B mini inference engine

gpu cuda transformer cuda-programming llm llamacpp llm-inference qwen qwen3

Updated Sep 8, 2025
Cuda

Bruce-Lee-LY / cuda_hgemm

Several optimization methods of half-precision general matrix multiplication (HGEMM) using tensor core with WMMA API and MMA PTX instruction.

gpu cuda cublas nvidia gemm matrix-multiply tensor-core hgemm

Updated Sep 8, 2024
Cuda

alicevision / popsift

PopSift is an implementation of the SIFT algorithm in CUDA.

computer-vision gpu cuda image-processing feature-extraction sift

Updated Jan 4, 2026
Cuda

QINZHAOYU / CudaSteps

A CUDA learning path based on the book "CUDA Programming: Basics and Practice" (《cuda编程-基础与实践》) by Zheyong Fan (樊哲勇).

gpu cuda nvidia

Updated Jan 15, 2024
Cuda

nosferalatu / SimpleGPUHashTable

A simple GPU hash table implemented in CUDA using lock free techniques

gpu cuda data-structures cuda-programming gpu-cuda-programs

Updated Feb 7, 2024
Cuda

FZJ-JSC / tutorial-multi-gpu

Efficient Distributed GPU Programming for Exascale, an SC/ISC Tutorial

hpc gpu mpi cuda multi-gpu supercomputing nccl exascale-computing sc23 sc21 nvshmem isc22 sc22 isc23 isc24 sc24 isc25 sc25

Updated Dec 3, 2025
Cuda

NVIDIA-Genomics-Research / GenomeWorks

SDK for GPU accelerated genome assembly and analysis

genomics mapping gpu cuda nvidia alignment python-api poa partial-order-alignment

Updated May 3, 2024
Cuda

owensgroup / RXMesh

GPU-accelerated triangle mesh processing

data-structure geometry gpu parallel-computing cuda mesh geometry-processing 3d 3d-graphics mesh-processing rxmesh

Updated Mar 22, 2026
Cuda

pyscf / gpu4pyscf

A plugin to use Nvidia GPU in PySCF package

gpu

Updated Mar 21, 2026
Cuda

NVlabs / parrot

Parrot is a C++ library for fused array operations using CUDA/Thrust. It provides efficient GPU-accelerated operations with lazy evaluation semantics, allowing for chaining of operations without unnecessary intermediate materializations.

algorithms gpu parallel cuda

Updated Mar 11, 2026
Cuda
