12 stars written in Cuda

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 17,374 2,055 Updated Feb 2, 2026

A massively parallel, optimal functional runtime in Rust

Cuda 11,237 436 Updated Nov 21, 2024

Tile primitives for speedy kernels

Cuda 3,331 276 Updated Apr 29, 2026

Mirage Persistent Kernel: Compiling LLMs into a MegaKernel

Cuda 2,235 201 Updated Apr 30, 2026

Learn CUDA Programming, published by Packt

Cuda 1,243 260 Updated Dec 30, 2023

RTP-LLM: Alibaba's high-performance LLM inference engine for diverse applications.

Cuda 1,107 179 Updated Apr 30, 2026

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 883 149 Updated Sep 26, 2025

Fast k nearest neighbor search using GPU

Cuda 546 111 Updated Aug 6, 2018

Optimizing SGEMM kernel functions on NVIDIA GPUs to a close-to-cuBLAS performance.

Cuda 415 52 Updated Jan 2, 2025

This is the first fully GPU Optimized IPC framework

Cuda 135 18 Updated Mar 20, 2026

Custom SpMM operations integrated into PyTorch

Cuda 11 Updated Apr 15, 2022