Starred repositories

11 starred repositories written in Cuda

LLM training in simple, raw C/CUDA

Cuda · 28,074 stars · 3,264 forks · Updated Jun 26, 2025
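A codebase like this implements the whole training loop as hand-written kernels. As a flavor of that style, here is a minimal sketch of a fused AdamW parameter update in raw CUDA (illustrative only, not llm.c's actual code; all names are hypothetical):

    // One thread per parameter: fused AdamW update step.
    __global__ void adamw_step(float* w, float* m, float* v, const float* g,
                               int n, float lr, float beta1, float beta2,
                               float eps, float wd, float bc1, float bc2) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        float gi = g[i];
        m[i] = beta1 * m[i] + (1.0f - beta1) * gi;       // first moment
        v[i] = beta2 * v[i] + (1.0f - beta2) * gi * gi;  // second moment
        float mhat = m[i] / bc1;  // bc1 = 1 - beta1^t (bias correction)
        float vhat = v[i] / bc2;  // bc2 = 1 - beta2^t
        w[i] -= lr * (mhat / (sqrtf(vhat) + eps) + wd * w[i]);  // decoupled weight decay
    }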

Instant neural graphics primitives: lightning fast NeRF and more

Cuda · 17,030 stars · 2,017 forks · Updated Oct 8, 2025

This package contains the original 2012 AlexNet code.

Cuda · 2,762 stars · 356 forks · Updated Mar 12, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized attention that achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.

Cuda · 2,622 stars · 256 forks · Updated Oct 28, 2025
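The generic building block behind such speedups is running the attention matmuls in low precision. A minimal sketch of per-row symmetric INT8 quantization (the generic idea only, not this project's actual algorithm; the kernel and names are illustrative, launched with one 256-thread block per row):

    #include <cstdint>

    // Each block quantizes one row of x to int8 with a per-row scale.
    __global__ void quantize_rows_int8(const float* __restrict__ x,
                                       int8_t* __restrict__ q,
                                       float* __restrict__ scale, int cols) {
        const int row = blockIdx.x;
        const float* xr = x + (size_t)row * cols;
        __shared__ float smax[256];
        float m = 0.0f;                                  // per-thread max |x|
        for (int c = threadIdx.x; c < cols; c += blockDim.x)
            m = fmaxf(m, fabsf(xr[c]));
        smax[threadIdx.x] = m;
        __syncthreads();
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {   // tree reduction to smax[0]
            if (threadIdx.x < s)
                smax[threadIdx.x] = fmaxf(smax[threadIdx.x], smax[threadIdx.x + s]);
            __syncthreads();
        }
        float sc = smax[0] / 127.0f + 1e-8f;             // symmetric scale, avoid div-by-zero
        if (threadIdx.x == 0) scale[row] = sc;
        for (int c = threadIdx.x; c < cols; c += blockDim.x)
            q[(size_t)row * cols + c] = (int8_t)rintf(xr[c] / sc);
    }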

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda · 1,801 stars · 463 forks · Updated Oct 9, 2023
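CUB packages exactly these block-wide patterns so you don't hand-roll reductions. A minimal sketch of a single-block sum using cub::BlockReduce, a real CUB primitive (the kernel and launch shape are illustrative):

    #include <cub/cub.cuh>

    // One 128-thread block reduces 128 ints to a single sum.
    __global__ void block_sum(const int* in, int* out) {
        typedef cub::BlockReduce<int, 128> BlockReduce;
        __shared__ typename BlockReduce::TempStorage temp;  // scratch the primitive requires
        int aggregate = BlockReduce(temp).Sum(in[threadIdx.x]);
        if (threadIdx.x == 0) *out = aggregate;  // result is only valid in thread 0
    }

Launched as block_sum<<<1, 128>>>(d_in, d_out). CUB itself now lives in the NVIDIA/cccl repository linked above.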

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda · 1,409 stars · 177 forks · Updated Feb 24, 2025

NeRFshop: Interactive Editing of Neural Radiance Fields

Cuda · 460 stars · 24 forks · Updated Mar 27, 2023

Fastest kernels written from scratch

Cuda · 384 stars · 51 forks · Updated Sep 18, 2025
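"From scratch" in this space usually starts with GEMM. A minimal shared-memory tiled SGEMM of the kind such repositories begin from (illustrative; not this repository's code):

    #define TILE 16

    // C = A * B for row-major MxK and KxN matrices.
    // Launch: grid((N+TILE-1)/TILE, (M+TILE-1)/TILE), block(TILE, TILE).
    __global__ void sgemm_tiled(const float* A, const float* B, float* C,
                                int M, int N, int K) {
        __shared__ float As[TILE][TILE], Bs[TILE][TILE];
        int row = blockIdx.y * TILE + threadIdx.y;
        int col = blockIdx.x * TILE + threadIdx.x;
        float acc = 0.0f;
        for (int t = 0; t < (K + TILE - 1) / TILE; ++t) {
            int ac = t * TILE + threadIdx.x;   // column of A this thread loads
            int br = t * TILE + threadIdx.y;   // row of B this thread loads
            As[threadIdx.y][threadIdx.x] = (row < M && ac < K) ? A[row * K + ac] : 0.0f;
            Bs[threadIdx.y][threadIdx.x] = (br < K && col < N) ? B[br * N + col] : 0.0f;
            __syncthreads();
            #pragma unroll
            for (int k = 0; k < TILE; ++k)
                acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
            __syncthreads();
        }
        if (row < M && col < N) C[row * N + col] = acc;
    }

Fast versions then layer register tiling, vectorized loads, and double buffering on top of this skeleton.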

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda · 344 stars · 36 forks · Updated Jul 10, 2025

Cuda extensions for PyTorch

Cuda · 11 stars · 2 forks · Updated Apr 22, 2025
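The standard shape of such an extension is a CUDA kernel plus a pybind11 binding built with torch.utils.cpp_extension. A minimal sketch (torch/extension.h, PYBIND11_MODULE, and TORCH_EXTENSION_NAME are the real PyTorch APIs; the scale op itself is hypothetical, not this repository's code):

    #include <torch/extension.h>

    __global__ void scale_kernel(const float* in, float* out, float a, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = a * in[i];  // elementwise y = a * x
    }

    torch::Tensor scale(torch::Tensor x, double a) {
        TORCH_CHECK(x.is_cuda() && x.scalar_type() == torch::kFloat32,
                    "scale expects a float32 CUDA tensor");
        auto xc = x.contiguous();                    // kernel assumes dense layout
        auto y = torch::empty_like(xc);
        const int n = static_cast<int>(xc.numel());
        const int threads = 256, blocks = (n + threads - 1) / threads;
        scale_kernel<<<blocks, threads>>>(xc.data_ptr<float>(), y.data_ptr<float>(),
                                          (float)a, n);
        return y;
    }

    PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
        m.def("scale", &scale, "Elementwise scale (CUDA)");
    }

On the Python side this compiles and loads with torch.utils.cpp_extension.load(name="scale_ext", sources=["scale.cu"]).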