- Salesforce AI Research
- Palo Alto
- https://zzxslp.github.io/
Stars
[ICLR2025, ICML2025, NeurIPS2025 Spotlight] Quantized Attention achieves a 2-5x speedup over FlashAttention without losing end-to-end metrics across language, image, and video models.
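The basic building block behind such kernels is low-bit quantization of attention inputs. Below is a minimal, illustrative sketch of symmetric per-tensor int8 quantization in pure Python; it is not the repository's fused GPU kernel, and the function names are hypothetical.

```python
# Illustrative sketch of symmetric per-tensor int8 quantization, the basic
# idea behind quantized attention kernels. Not the repository's GPU code.

def quantize_int8(values):
    """Map floats to int8 codes with a single per-tensor scale."""
    max_abs = max(abs(v) for v in values) or 1.0
    scale = max_abs / 127.0  # symmetric schemes use the range [-127, 127]
    codes = [max(-127, min(127, round(v / scale))) for v in values]
    return codes, scale

def dequantize_int8(codes, scale):
    """Recover approximate floats from int8 codes and the stored scale."""
    return [c * scale for c in codes]

# Round-trip: each recovered value lies within half a quantization step
# (scale / 2) of the original.
scores = [0.03, -1.25, 0.88, 2.5, -0.41]
codes, scale = quantize_int8(scores)
recovered = dequantize_int8(codes, scale)
```

The speedup in practice comes from running the matrix multiplies on the int8 codes directly; the round trip above only shows why accuracy is largely preserved.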
Curate, Annotate, and Manage Your Data in LightlyStudio.
A curated list of papers on reinforcement learning for video generation
A simple pip-installable Python tool to generate your own HTML citation world map from your Google Scholar ID.
Official implementation of the paper "Transfer between Modalities with MetaQueries"
A high-performance Python-based I/O system for large (and small) deep learning problems, with strong support for PyTorch.
Fine-tuning Stable Diffusion with Diffusers
Official implementation of the paper: "FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models"
A curated list of recent diffusion models for video generation, editing, and various other applications.
A comprehensive JAX/NNX library for diffusion and flow matching generative algorithms, featuring DiT (Diffusion Transformer) and its variants as the primary backbone with support for ImageNet train…
Enjoy the magic of Diffusion models!
Implementation of ST-MoE, the latest incarnation of MoE after years of research at Brain, in PyTorch
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
Efficient non-uniform quantization with GPTQ for GGUF
Course to get into Large Language Models (LLMs) with roadmaps and Colab notebooks.
Code for the paper [ICLR2025 Oral] "FlexPrefill: A Context-Aware Sparse Attention Mechanism for Efficient Long-Sequence Inference"
Efficient Triton implementation of Native Sparse Attention.
Implementation of the sparse attention pattern proposed by the DeepSeek team in their "Native Sparse Attention" paper
Official codebase used to develop Vision Transformer, SigLIP, MLP-Mixer, LiT and more.
A subset of PyTorch's neural network modules, written in Python using OpenAI's Triton.
Easily calculate FVD, PSNR, SSIM, and LPIPS to evaluate the quality of generated or predicted videos.
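Of those metrics, PSNR has a simple closed form. Below is a minimal sketch for 8-bit frames; it is independent of this repository's actual API, and the `psnr` helper is hypothetical.

```python
import math

# Minimal PSNR sketch for 8-bit frames (pixel values in [0, 255]).
# Illustrative only; not this repository's API.

def psnr(frame_a, frame_b, max_val=255.0):
    """Peak signal-to-noise ratio between two equal-length pixel sequences."""
    mse = sum((a - b) ** 2 for a, b in zip(frame_a, frame_b)) / len(frame_a)
    if mse == 0:
        return float("inf")  # identical frames
    return 10.0 * math.log10(max_val ** 2 / mse)

# Two 4-pixel "frames" differing by 5 at every pixel -> MSE = 25.
print(psnr([10, 20, 30, 40], [15, 25, 35, 45]))
```

FVD, SSIM, and LPIPS are more involved (pretrained features, windowed statistics), which is why a packaged evaluator is convenient.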
Efficient Triton Kernels for LLM Training
Helpful tools and examples for working with flex-attention