Starred repositories
A practical guide to high-performance Gluon kernel development on AMD GFX9 GPUs.
📚LeetCUDA: Modern CUDA Learn Notes with PyTorch for Beginners🐑, 200+ CUDA Kernels, Tensor Cores, HGEMM, FA-2 MMA.🎉
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
Automatic Video Generation from Scientific Papers
[DEPRECATED] Moved to the ROCm/rocm-libraries repo. NOTE: the develop branch is maintained as a read-only mirror.
Simulation platform for general-purpose robotics & embodied AI learning.
Tutorials, assignments, and competitions for MIT Deep Learning related courses.