Stars
The most powerful and modular diffusion model GUI, API, and backend with a graph/nodes interface.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
A generative world for general-purpose robotics & embodied AI learning.
Generative Models by Stability AI
Official inference repo for FLUX.1 models
Transfer learning / domain adaptation / domain generalization / multi-task learning etc. Papers, code, datasets, applications, tutorials.
A little word cloud generator in Python
Official repo for consistency models.
One-for-All Multimodal Evaluation Toolkit Across Text, Image, Video, and Audio Tasks
An OpenMMLab toolbox for human pose estimation, skeleton-based action recognition, and action synthesis.
DeepLab v3+ model in PyTorch. Supports different backbones.
PyTorch implementation of Self-Attention Generative Adversarial Networks (SAGAN)
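2018/2019 campus, spring, and autumn recruitment interview notes covering natural language processing (NLP), deep learning, machine learning, C, C++, and Python. Also includes the questions from every machine learning / deep learning interview write-up the creator has seen. Besides the DL/ML material, other computer science knowledge relevant to algorithm positions is recorded as well. Questions for positions such as front end, testing, Java, or Android are not included.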
Automated Deep Learning: Neural Architecture Search Is Not the End (a curated list of AutoDL resources and an in-depth analysis)
Emu Series: Generative Multimodal Models from BAAI
A SOTA open-source image editing model that aims to provide performance comparable to closed-source models like GPT-4o and Gemini 2 Flash.
A state-of-the-art semi-supervised method for image recognition
Official implementation of "MIMO: Controllable Character Video Synthesis with Spatial Decomposed Modeling"
[NeurIPS 2025] MMaDA - Open-Sourced Multimodal Large Diffusion Language Models
[CVPR2024 Highlight] VBench - We Evaluate Video Generation
PyTorch implementations of C3D, R3D, and R2Plus1D models for video activity recognition.
PyTorch implementation of FractalGen https://arxiv.org/abs/2502.17437
[IJCV 2024] LaVie: High-Quality Video Generation with Cascaded Latent Diffusion Models
Vchitect-2.0: Parallel Transformer for Scaling Up Video Diffusion Models