Lists (23)
Agent
AIGC
AIGC_RL
CV
Deep Learning
frp
🔮 Future ideas
llm: repositories related to large models
MLLM
NR-IQA
open-mmlab
pyTorch: getting started with PyTorch
redis
VLM
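Medical segmentation models
Image fusion
Low-level training frameworks
Datasets
Airport logistics-vue-django
Model pruning
Knowledge distillation
Algorithm interview notes
Distillation-related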
Stars
Directly Aligning the Full Diffusion Trajectory with Fine-Grained Human Preference
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3, Qwen3-MoE, DeepSeek-R1, GLM4.5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, …
Masked Depth Modeling for Spatial Perception
Your own personal AI assistant. Any OS. Any Platform. The lobster way. 🦞
Memory for 24/7 proactive agents like openclaw (moltbot, clawdbot).
A lightweight implementation of the Qwen-Image-Edit model for inference and LoRA fine-tuning on 8×V100 GPUs
TurboDiffusion: 100–200× Acceleration for Video Diffusion Models
Glance: Accelerating Diffusion Models with 1 Sample
Light Image Video Generation Inference Framework
Ongoing research training transformer models at scale
A general fine-tuning kit geared toward image/video/audio diffusion models.
[CVPR2023] Lite-Mono: A Lightweight CNN and Transformer Architecture for Self-Supervised Monocular Depth Estimation
Official Implementation of Upsample Anything: A Simple and Hard to Beat Baseline for Feature Upsampling
Kandinsky 5.0: A family of diffusion models for Video & Image generation
Official PyTorch Implementation of "Flow Map Distillation Without Data"
MixGRPO: Unlocking Flow-based GRPO Efficiency with Mixed ODE-SDE
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL
An official implementation of DanceGRPO: Unleashing GRPO on Visual Generation
(CVPR 2025) From Slow Bidirectional to Fast Autoregressive Video Diffusion Models
(NeurIPS 2024 Oral 🔥) Improved Distribution Matching Distillation for Fast Image Synthesis
https://little-misfit.github.io/GRAG-Image-Editing/
[SIGGRAPH Asia 2025] Official Implementation of "ConsistEdit: Highly Consistent and Precise Training-free Visual Editing"
[ICLR 2026] ChronoEdit: Towards Temporal Reasoning for Image Editing and World Simulation
Fast and Universal 3D reconstruction model for versatile tasks
This project is the official implementation of 'DreamOmni2: Multimodal Instruction-based Editing and Generation'
Efficient vision foundation models for high-resolution generation and perception.