Stars
HuggingFace conversion and training library for Megatron-based models
An AI image generation prompt manager!
AI-powered Resume Expert based on Conversation
Official inference repo for FLUX.2 models
The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…
Native Multimodal Models are World Learners
Identifying and removing near-duplicate images using perceptual hashing.
Tritonbench is a collection of PyTorch custom operators with example inputs to measure their performance.
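🥢 Cook like Lao Xiang Ji 🐔. The main part was completed in 2024; this is not an official Lao Xiang Ji repository. The text comes from the "Lao Xiang Ji Dish Traceability Report" (《老乡鸡菜品溯源报告》) and has been summarized, edited, and organized. CookLikeHOC.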
Official codebase for "Self Forcing: Bridging Training and Inference in Autoregressive Video Diffusion" (NeurIPS 2025 Spotlight)
Lumina-DiMOO - An Open-Sourced Multi-Modal Large Diffusion Language Model
Milvus is a high-performance, cloud-native vector database built for scalable vector ANN search
VeOmni: Scaling Any Modality Model Training with Model-Centric Distributed Recipe Zoo
gpt-oss-120b and gpt-oss-20b are two open-weight language models by OpenAI
Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.
Developer-friendly OSS embedded retrieval library for multimodal AI. Search More; Manage Less.
NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024
Official Release of ICCV 2025 paper -- DiscretizedSDF
Kimi K2 is the large language model series developed by the Moonshot AI team
Long-RL: Scaling RL to Long Sequences (NeurIPS 2025)
Rethinking High-Quality Aesthetic Poster Generation in a Unified Framework
Official code for the paper: Depth Anything At Any Condition
Easy-to-use Python module to extract Exif metadata from digital image files.
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
[NeurIPS 2025] An official implementation of Flow-GRPO: Training Flow Matching Models via Online RL