    Repositories

    • Quantized Attention achieves speedups of 2-5x and 3-11x compared to FlashAttention and xformers, respectively, without losing end-to-end metrics across language, image, and video models.
      Cuda · 238 stars · Updated Oct 8, 2025
    • Fast and memory-efficient exact attention (fork of flash-attention)
      Python · 2k stars · Updated Sep 22, 2025
    • moondex (Public)
      MoonMath.ai’s index of knowledge
      Updated Sep 8, 2025
    • Jenga (Public)
      Official Implementation: Training-Free Efficient Video Generation via Dynamic Token Carving
      Python · 12 stars · Updated Aug 22, 2025
    • Radial Attention Official Implementation
      Python · 29 stars · Updated Aug 6, 2025
    • Wan: Open and Advanced Large-Scale Video Generative Models
      Python · 2k stars · Updated Jul 17, 2025