shanshanpt

Organizations

@AlibabaPAI @DeepRec-AI

Starred repositories

8 stars written in Cuda

DeepEP: an efficient expert-parallel communication library

Cuda · 8,811 stars · 1,029 forks · Updated Dec 5, 2025

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda · 5,966 stars · 776 forks · Updated Dec 8, 2025

FlashInfer: Kernel Library for LLM Serving

Cuda · 4,262 stars · 599 forks · Updated Dec 17, 2025

Additional utilities and helpers that extend TensorFlow for building recommendation systems, contributed and maintained by SIG Recommenders.

Cuda · 629 stars · 142 forks · Updated Sep 4, 2025

CUDA Data Parallel Primitives Library

Cuda · 437 stars · 96 forks · Updated Nov 9, 2018

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda · 358 stars · 38 forks · Updated Jul 10, 2025

A warp-oriented dynamic hash table for GPUs

Cuda · 76 stars · 18 forks · Updated Jan 19, 2024