Hyundai Steel - Republic of Korea
(UTC +09:00) - https://dongwoo-im.github.io/
Stars: 4 repositories written in CUDA
- Instant neural graphics primitives: lightning-fast NeRF and more
- [ICLR 2025, ICML 2025, NeurIPS 2025 Spotlight] Quantized attention achieving a 2-5x speedup over FlashAttention, without losing end-to-end metrics across language, image, and video models.
- Training materials associated with NVIDIA's CUDA Training Series (www.olcf.ornl.gov/cuda-training-series/)
- Efficient GPU support for LLM inference with x-bit quantization (e.g., FP6, FP5).