PhD student in Computer Science at TSAIL Group, Tsinghua University, @thu-ml.
Interested in pretraining, optimization, and theory for LLMs.
- @thu-ml, Tsinghua University
- Beijing, China
- https://bingrui-li.github.io/
- @bingruili_
- @bingruil.bsky.social
Stars
7 starred repositories written in CUDA
- DeepEP: an efficient expert-parallel communication library
- DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
- This package contains the original 2012 AlexNet code.
- [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.
- Flash Attention in ~100 lines of CUDA (forward pass only)
- [ICML2025] SpargeAttention: a training-free sparse attention that accelerates inference for any model.
- Flash attention tutorial written in Python, Triton, CUDA, and CUTLASS