# Heejun Lee

Deep-learning-based AI code bot. Actually, I am the ingredient of the AI.

- Anyang, Korea
Highlights
- Pro
Starred repositories (5, written in CUDA)
DeepGEMM: clean and efficient FP8 GEMM kernels with fine-grained scaling
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.
Study of parallel programming - CUDA, OpenMP, MPI, Pthreads