Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 600+ LLMs (Qwen3.5, DeepSeek-R1, GLM-5, InternLM3, Llama4, ...) and 300+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, GLM4.5v, Llava, Phi4, ...)…

Python 13,309 1,293 Updated Mar 23, 2026

facebookresearch / vggt

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 12,682 1,391 Updated Mar 3, 2026

cft0808 / edict

🏛️ Three Departments and Six Ministries · OpenClaw Multi-Agent Orchestration System: 9 specialized AI agents with real-time dashboard, model config, and full audit trails

Python 12,295 1,201 Updated Mar 17, 2026

Tencent-Hunyuan / HunyuanVideo

HunyuanVideo: A Systematic Framework For Large Video Generation Model

Python 11,867 1,213 Updated Nov 21, 2025

facebookresearch / sam3

The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…

Python 8,341 1,182 Updated Mar 18, 2026

QwenLM / Qwen-Image

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 7,602 463 Updated Feb 10, 2026

naver / dust3r

DUSt3R: Geometric 3D Vision Made Easy

Python 7,033 739 Updated Sep 24, 2025

ByteDance-Seed / Bagel

Open-source unified multimodal model

Python 5,763 508 Updated Oct 27, 2025

NVlabs / Sana

SANA: Efficient High-Resolution Image Synthesis with Linear Diffusion Transformer

Python 5,023 336 Updated Mar 17, 2026

deepseek-ai / Engram

Conditional Memory via Scalable Lookup: A New Axis of Sparsity for Large Language Models

Python 3,925 280 Updated Jan 14, 2026

facebookresearch / jepa

PyTorch code and models for V-JEPA self-supervised learning from video.

Python 3,625 371 Updated Feb 27, 2025

facebookresearch / ijepa

Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive arch…

Python 3,278 457 Updated May 8, 2024

Robbyant / lingbot-world

Advancing Open-source World Models

Python 3,230 264 Updated Mar 5, 2026

prs-eth / Marigold

[CVPR 2024 - Oral, Best Paper Award Candidate] Marigold: Repurposing Diffusion-Based Image Generators for Monocular Depth Estimation

Python 3,104 208 Updated Dec 10, 2025

Tencent-Hunyuan / Hunyuan3D-2.1

From Images to High-Fidelity 3D Assets with Production-Ready PBR Material

Python 3,024 435 Updated Oct 17, 2025

vita-epfl / Stable-Video-Infinity

[ICLR 26 Oral] Stable Video Infinity: Infinite-Length Video Generation with Error Recycling

Python 2,234 188 Updated Jan 19, 2026

LTH14 / JiT

PyTorch implementation of JiT https://arxiv.org/abs/2511.13720

Python 2,208 152 Updated Dec 8, 2025

stepfun-ai / Step1X-Edit

A SOTA open-source image editing model that aims for performance comparable to closed-source models such as GPT-4o and Gemini 2 Flash.

Python 2,169 95 Updated Dec 29, 2025

bytetriper / RAE

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,822 72 Updated Feb 25, 2026

apple / pico-banana-400k

Python 1,799 80 Updated Dec 16, 2025

character-ai / Ovi

Python 1,672 187 Updated Nov 15, 2025

JiuhaiChen / BLIP3o

Official implementation of BLIP3o-Series

Python 1,656 77 Updated Nov 29, 2025

AvaLovelace1 / BrickGPT

Official repository for BrickGPT, the first approach for generating physically stable toy brick models from text prompts.

Python 1,618 100 Updated Feb 7, 2026

Songsong Yu song2yu

Lists (1)

UMMs

Stars

hpcaitech / Open-Sora

black-forest-labs / flux

deepseek-ai / DeepSeek-OCR

openai / gpt-oss

deepseek-ai / Janus

Wan-Video / Wan2.1

Wan-Video / Wan2.2

modelscope / ms-swift