Stars
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
Fully open reproduction of DeepSeek-R1
Original reference implementation of "3D Gaussian Splatting for Real-Time Radiance Field Rendering"
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Janus-Series: Unified Multimodal Understanding and Generation Models
Let's make video diffusion practical!
HunyuanVideo: A Systematic Framework For Large Video Generation Model
StreamDiffusion: A Pipeline-Level Solution for Real-Time Interactive Generation
Text-to-3D & Image-to-3D & Mesh Exportation with NeRF + Diffusion.
Hallo: Hierarchical Audio-Driven Visual Synthesis for Portrait Image Animation
SUPIR aims at developing Practical Algorithms for Photo-Realistic Image Restoration In the Wild. Our new online demo is also released at suppixel.ai.
High-resolution models for human tasks.
SkyReels-V2: Infinite-length Film Generative model
SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer
[ICLR 2024 Oral] Generative Gaussian Splatting for Efficient 3D Content Creation
The simplest, fastest repository for training/finetuning small-sized VLMs.
InstantMesh: Efficient 3D Mesh Generation from a Single Image with Sparse-view Large Reconstruction Models
🚀 Efficient implementations of state-of-the-art linear attention models
A PyTorch library for implementing flow matching algorithms, featuring continuous and discrete flow matching implementations. It includes practical examples for both text and image modalities.
[ICLR 2025] Hallo2: Long-Duration and High-Resolution Audio-driven Portrait Image Animation
Code and models for the ICML 2024 paper "NExT-GPT: Any-to-Any Multimodal Large Language Model"