  • nuaa
  • Shanghai

Organizations

@open-mmlab

7 stars written in Cuda

DeepEP: an efficient expert-parallel communication library

Cuda · 8,828 stars · 1,036 forks · Updated Dec 24, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda · 5,996 stars · 779 forks · Updated Dec 23, 2025

The project is an official implementation of our CVPR2019 paper "Deep High-Resolution Representation Learning for Human Pose Estimation"

Cuda · 4,455 stars · 928 forks · Updated Aug 30, 2024

How to optimize some algorithms in CUDA.

Cuda · 2,711 stars · 244 forks · Updated Dec 23, 2025

Distribution-Aware Coordinate Representation for Human Pose Estimation

Cuda · 562 stars · 83 forks · Updated May 17, 2024

AdaptiveGEMM: FP8 GEMM with Adaptation to Various Lengths of Group M

Cuda · 3 stars · 1 fork · Updated Nov 13, 2025

PyTorch bindings for CUTLASS and CUBLAS Grouped GEMM, Permute and Unpermute.

Cuda · 2 stars · 2 forks · Updated Nov 11, 2025