    Repositories

    • Quantized Attention achieves speedups of 2-5x and 3-11x compared to FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.
      Cuda · 238 stars · Updated Oct 8, 2025
    • Fast and memory-efficient exact attention (fork of flash-attention)
      Python · 2k stars · Updated Sep 22, 2025
    • moondex (Public)
      MoonMath.ai’s index of knowledge
      Updated Sep 8, 2025
    • Jenga (Public)
      Official Implementation: Training-Free Efficient Video Generation via Dynamic Token Carving
      Python · 12 stars · Updated Aug 22, 2025
    • Radial Attention Official Implementation
      Python · 29 stars · Updated Aug 6, 2025
    • Wan: Open and Advanced Large-Scale Video Generative Models
      Python · 2k stars · Updated Jul 17, 2025