  • Math.SDU
  • Jinan, Shandong, China

4 stars written in Cuda

[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves speedup of 2-5x compared to FlashAttention, without losing end-to-end metrics across language, image, and video models.

Cuda · 2,628 stars · 259 forks · Updated Nov 6, 2025

A guide to hand-writing CUDA operators and preparing for interviews

Cuda · 673 stars · 75 forks · Updated Aug 23, 2025

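Code implementations for the book CUDA C 编程权威指南 (Professional CUDA C Programming). Covers most of the code from chapters 2 through 8 of the book, along with the author's notes. Everything was written by hand by the author, so mistakes are inevitable; please use it with caution, and corrections are very welcome. If it helps you, please give it a Star; it means a lot to the author. Thanks!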

Cuda · 369 stars · 24 forks · Updated Oct 20, 2022

Solutions of LeetGPU

Cuda · 44 stars · 3 forks · Updated Oct 31, 2025