Tao Peng shanshanpt

🗣️

Focusing

fighting

71 followers · 40 following

@AlibabaPAI @DeepRec-AI
Beijing, China.
https://orcid.org/0009-0008-4450-4768

Starred repositories

8 stars written in Cuda

deepseek-ai / DeepEP

DeepEP: an efficient expert-parallel communication library

Cuda · 8,811 stars · 1,029 forks · Updated Dec 5, 2025

deepseek-ai / DeepGEMM

DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling

Cuda · 5,966 stars · 776 forks · Updated Dec 8, 2025

flashinfer-ai / flashinfer

FlashInfer: Kernel Library for LLM Serving

Cuda · 4,262 stars · 599 forks · Updated Dec 17, 2025

tensorflow / recommenders-addons

Additional utils and helpers to extend TensorFlow when building recommendation systems, contributed and maintained by SIG Recommenders.

Cuda · 629 stars · 142 forks · Updated Sep 4, 2025

cudpp / cudpp

CUDA Data Parallel Primitives Library

Cuda · 437 stars · 96 forks · Updated Nov 9, 2018

mit-han-lab / Quest

[ICML 2024] Quest: Query-Aware Sparsity for Efficient Long-Context LLM Inference

Cuda · 358 stars · 38 forks · Updated Jul 10, 2025

ColfaxResearch / cutlass-kernels

Cuda · 253 stars · 37 forks · Updated Jul 11, 2024

owensgroup / SlabHash

A warp-oriented dynamic hash table for GPUs

Cuda · 76 stars · 18 forks · Updated Jan 19, 2024

Lists (4)

🔮 Future ideas

✨ Inspiration

🚀 My stack

🐧MyRepo

Starred topics

Tensorflow

Deep learning

Compiler