University of Science and Technology of China, Hefei, Anhui
https://github.com/guaguastandup
Starred repositories
6 results for source starred repositories written in CUDA
DeepEP: an efficient expert-parallel communication library
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
FlashInfer: Kernel Library for LLM Serving
How to optimize some algorithms in CUDA.
A lightweight design for computation-communication overlap.