Starred repositories

33 stars written in Cuda

Instant neural graphics primitives: lightning fast NeRF and more

Cuda 17,360 2,058 Updated Feb 2, 2026

The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Cuda 4,472 925 Updated Aug 30, 2024

Fast parallel CTC.

Cuda 4,074 1,034 Updated Mar 4, 2024

Squeeze-and-Excitation Networks

Cuda 3,618 850 Updated Feb 25, 2019

Tile primitives for speedy kernels

Cuda 3,312 275 Updated Apr 8, 2026

How to optimize some algorithms in CUDA.

Cuda 2,915 267 Updated Apr 9, 2026

cuGraph - RAPIDS Graph Analytics Library

Cuda 2,161 349 Updated Apr 13, 2026

Sample codes for my CUDA programming book

Cuda 2,036 383 Updated Dec 14, 2025

[ARCHIVED] Cooperative primitives for CUDA C++. See https://github.com/NVIDIA/cccl

Cuda 1,827 463 Updated Oct 9, 2023

Fully Convolutional Instance-aware Semantic Segmentation

Cuda 1,565 407 Updated Sep 27, 2021

NCCL Tests

Cuda 1,485 363 Updated Mar 11, 2026

[MICRO'23, MLSys'22] TorchSparse: Efficient Training and Inference Framework for Sparse Convolution on GPUs.

Cuda 1,457 187 Updated Feb 24, 2025

Examples demonstrating available options to program multiple GPUs in a single node or a cluster

Cuda 883 149 Updated Sep 26, 2025

Reference implementation of real-time autoregressive wavenet inference

Cuda 745 125 Updated Jan 19, 2021

Stereo Matching by Training a Convolutional Neural Network to Compare Image Patches

Cuda 725 226 Updated Jan 25, 2018

Source code that accompanies The CUDA Handbook.

Cuda 571 197 Updated Mar 10, 2026

Integral Human Pose Regression

Cuda 487 76 Updated Apr 4, 2019

A UNIVERSAL MUSIC TRANSLATION NETWORK - a method for translating music across musical instruments and styles.

Cuda 464 71 Updated Aug 15, 2021

NVIDIA-accelerated zero latency video compression library for interactive remoting applications

Cuda 394 93 Updated Jun 3, 2020

GPU implementation of a fast generalized ANS (asymmetric numeral system) entropy encoder and decoder, with extensions for lossless compression of numerical and other data types in HPC/ML applications.

Cuda 381 33 Updated Mar 18, 2026

A CUDA backend for Torch7

Cuda 340 207 Updated Sep 11, 2017

GPU-accelerated Levenberg-Marquardt curve fitting in CUDA

Cuda 337 102 Updated Mar 12, 2026

Facebook's CUDA extensions.

Cuda 284 57 Updated Mar 27, 2019

Efficient GPU support for LLM inference with x-bit quantization (e.g. FP6, FP5).

Cuda 278 24 Updated Jul 16, 2025

CGBN: CUDA Accelerated Multiple Precision Arithmetic (Big Num) using Cooperative Groups

Cuda 241 75 Updated Mar 31, 2026
Cuda 214 173 Updated Aug 27, 2019

Fast CUDA Kernels for ResNet Inference.

Cuda 183 47 Updated May 26, 2019

Improved 3DGS rasterizer.

Cuda 129 5 Updated Feb 26, 2025

DietCode Code Release

Cuda 65 9 Updated Jul 21, 2022