S-Lab, NTU SCSE PhD for 3D AIGC
- Nanyang Technological University
- Singapore
- https://buaacyw.github.io/
Starred repositories (4, written in CUDA)
- CUDA-accelerated rasterization of Gaussian splatting
- [ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.
- Flash Attention in ~100 lines of CUDA (forward pass only)
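The last entry is a compact flash-attention forward pass. As a rough illustration of the core idea it relies on — tiling over key/value blocks with an online softmax so the full N×N score matrix is never materialized — here is a minimal NumPy sketch. This is not the repository's CUDA code; all names (`flash_attention_forward`, `naive_attention`, the `block` parameter) are illustrative assumptions.

```python
import numpy as np

def naive_attention(Q, K, V):
    # Reference: standard softmax attention, materializes the full score matrix.
    S = Q @ K.T / np.sqrt(Q.shape[-1])
    P = np.exp(S - S.max(axis=-1, keepdims=True))
    P /= P.sum(axis=-1, keepdims=True)
    return P @ V

def flash_attention_forward(Q, K, V, block=4):
    # Tiled forward pass with an online softmax: processes K/V one block
    # at a time, keeping only running per-row maxima and denominators.
    N, d = Q.shape
    scale = 1.0 / np.sqrt(d)
    O = np.zeros_like(Q)          # running (unnormalized) output
    m = np.full(N, -np.inf)       # running row maxima of the scores
    l = np.zeros(N)               # running softmax denominators
    for j in range(0, K.shape[0], block):
        Kj, Vj = K[j:j + block], V[j:j + block]
        S = Q @ Kj.T * scale                    # scores for this KV tile only
        m_new = np.maximum(m, S.max(axis=-1))   # updated row maxima
        P = np.exp(S - m_new[:, None])          # tile weights, shifted by new max
        alpha = np.exp(m - m_new)               # rescale old accumulators
        l = alpha * l + P.sum(axis=-1)
        O = alpha[:, None] * O + P @ Vj
        m = m_new
    return O / l[:, None]

rng = np.random.default_rng(0)
Q = rng.standard_normal((8, 16))
K = rng.standard_normal((8, 16))
V = rng.standard_normal((8, 16))
assert np.allclose(flash_attention_forward(Q, K, V), naive_attention(Q, K, V))
```

The `alpha` rescaling is what makes the streaming pass exact: whenever a tile raises a row's maximum, the previously accumulated numerator and denominator are multiplied by `exp(m_old - m_new)` so all terms stay on the same scale. The CUDA version applies the same recurrence per thread-block tile in shared memory.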