Starred repositories

11 starred repositories written in Cuda

LLM training in simple, raw C/CUDA

Cuda · 28,074 stars · 3,264 forks · Updated Jun 26, 2025
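A codebase like this implements the whole training loop as hand-written kernels. As a flavor of that style, here is a minimal sketch of a fused AdamW parameter update in raw CUDA (illustrative only, not llm.c's actual code; all names are hypothetical):

    // One thread per parameter: fused AdamW update step.
    __global__ void adamw_step(float* w, float* m, float* v, const float* g,
                               int n, float lr, float beta1, float beta2,
                               float eps, float wd, float bc1, float bc2) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i >= n) return;
        float gi = g[i];
        m[i] = beta1 * m[i] + (1.0f - beta1) * gi;       // first moment
        v[i] = beta2 * v[i] + (1.0f - beta2) * gi * gi;  // second moment
        float mhat = m[i] / bc1;  // bc1 = 1 - beta1^t (bias correction)
        float vhat = v[i] / bc2;  // bc2 = 1 - beta2^t
        w[i] -= lr * (mhat / (sqrtf(vhat) + eps) + wd * w[i]);  // decoupled weight decay
    }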

Instant neural graphics primitives: lightning fast NeRF and more

Cuda · 17,030 stars · 2,017 forks · Updated Oct 8, 2025

This package contains the original 2012 AlexNet code.

Cuda · 2,762 stars · 356 forks · Updated Mar 12, 2025

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized attention that achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.

Cuda · 2,622 stars · 256 forks · Updated Oct 28, 2025
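The generic building block behind such speedups is running the attention matmuls in low precision. A minimal sketch of per-row symmetric INT8 quantization (the generic idea only, not this project's actual algorithm; the kernel and names are illustrative, launched with one 256-thread block per row):

    #include <cstdint>

    // Each block quantizes one row of x to int8 with a per-row scale.
    __global__ void quantize_rows_int8(const float* __restrict__ x,
                                       int8_t* __restrict__ q,
                                       float* __restrict__ scale, int cols) {
        const int row = blockIdx.x;
        const float* xr = x + (size_t)row * cols;
        __shared__ float smax[256];
        float m = 0.0f;                                  // per-thread max |x|
        for (int c = threadIdx.x; c < cols; c += blockDim.x)
            m = fmaxf(m, fabsf(xr[c]));
        smax[threadIdx.x] = m;
        __syncthreads();
        for (int s = blockDim.x / 2; s > 0; s >>= 1) {   // tree reduction to smax[0]
            if (threadIdx.x < s)
                smax[threadIdx.x] = fmaxf(smax[threadIdx.x], smax[threadIdx.x + s]);
            __syncthreads();
        }
        float sc = smax[0] / 127.0f + 1e-8f;             // symmetric scale, avoid div-by-zero
        if (threadIdx.x == 0) scale[row] = sc;
        for (int c = threadIdx.x; c < cols; c += blockDim.x)
            q[(size_t)row * cols + c] = (int8_t)rintf(xr[c] / sc);
    }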

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda · 1,801 stars · 463 forks · Updated Oct 9, 2023
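CUB packages exactly these block-wide patterns so you don't hand-roll reductions. A minimal sketch of a single-block sum using cub::BlockReduce, a real CUB primitive (the kernel and launch shape are illustrative):

    #include <cub/cub.cuh>

    // One 128-thread block reduces 128 ints to a single sum.
    __global__ void block_sum(const int* in, int* out) {
        typedef cub::BlockReduce<int, 128> BlockReduce;
        __shared__ typename BlockReduce::TempStorage temp;  // scratch the primitive requires
        int aggregate = BlockReduce(temp).Sum(in[threadIdx.x]);
        if (threadIdx.x == 0) *out = aggregate;  // result is only valid in thread 0
    }

Launched as block_sum<<<1, 128>>>(d_in, d_out). CUB itself now lives in the NVIDIA/cccl repository linked above.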

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda · 1,409 stars · 177 forks · Updated Feb 24, 2025

NeRFshop: Interactive Editing of Neural Radiance Fields

Cuda · 460 stars · 24 forks · Updated Mar 27, 2023

Fastest kernels written from scratch

Cuda · 384 stars · 51 forks · Updated Sep 18, 2025
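"From scratch" in this space usually starts with GEMM. A minimal shared-memory tiled SGEMM of the kind such repositories begin from (illustrative; not this repository's code):

    #define TILE 16

    // C = A * B for row-major MxK and KxN matrices.
    // Launch: grid((N+TILE-1)/TILE, (M+TILE-1)/TILE), block(TILE, TILE).
    __global__ void sgemm_tiled(const float* A, const float* B, float* C,
                                int M, int N, int K) {
        __shared__ float As[TILE][TILE], Bs[TILE][TILE];
        int row = blockIdx.y * TILE + threadIdx.y;
        int col = blockIdx.x * TILE + threadIdx.x;
        float acc = 0.0f;
        for (int t = 0; t < (K + TILE - 1) / TILE; ++t) {
            int ac = t * TILE + threadIdx.x;   // column of A this thread loads
            int br = t * TILE + threadIdx.y;   // row of B this thread loads
            As[threadIdx.y][threadIdx.x] = (row < M && ac < K) ? A[row * K + ac] : 0.0f;
            Bs[threadIdx.y][threadIdx.x] = (br < K && col < N) ? B[br * N + col] : 0.0f;
            __syncthreads();
            #pragma unroll
            for (int k = 0; k < TILE; ++k)
                acc += As[threadIdx.y][k] * Bs[k][threadIdx.x];
            __syncthreads();
        }
        if (row < M && col < N) C[row * N + col] = acc;
    }

Fast versions then layer register tiling, vectorized loads, and double buffering on top of this skeleton.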

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda · 344 stars · 36 forks · Updated Jul 10, 2025

Cuda extensions for PyTorch

Cuda · 11 stars · 2 forks · Updated Apr 22, 2025
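The standard shape of such an extension is a CUDA kernel plus a pybind11 binding built with torch.utils.cpp_extension. A minimal sketch (torch/extension.h, PYBIND11_MODULE, and TORCH_EXTENSION_NAME are the real PyTorch APIs; the scale op itself is hypothetical, not this repository's code):

    #include <torch/extension.h>

    __global__ void scale_kernel(const float* in, float* out, float a, int n) {
        int i = blockIdx.x * blockDim.x + threadIdx.x;
        if (i < n) out[i] = a * in[i];  // elementwise y = a * x
    }

    torch::Tensor scale(torch::Tensor x, double a) {
        TORCH_CHECK(x.is_cuda() && x.scalar_type() == torch::kFloat32,
                    "scale expects a float32 CUDA tensor");
        auto xc = x.contiguous();                    // kernel assumes dense layout
        auto y = torch::empty_like(xc);
        const int n = static_cast<int>(xc.numel());
        const int threads = 256, blocks = (n + threads - 1) / threads;
        scale_kernel<<<blocks, threads>>>(xc.data_ptr<float>(), y.data_ptr<float>(),
                                          (float)a, n);
        return y;
    }

    PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
        m.def("scale", &scale, "Elementwise scale (CUDA)");
    }

On the Python side this compiles and loads with torch.utils.cpp_extension.load(name="scale_ext", sources=["scale.cu"]).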