Starred repositories
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
Implementation of paper - YOLOv7: Trainable bag-of-freebies sets new state-of-the-art for real-time object detectors
[NeurIPS 2024 Best Paper Award][GPT beats diffusion🔥] [scaling laws in visual generation📈] Official impl. of "Visual Autoregressive Modeling: Scalable Image Generation via Next-Scale Prediction". A…
Reference PyTorch implementation and models for DINOv3
OpenMMLab Multimodal Advanced, Generative, and Intelligent Creation Toolbox. Unlock the magic 🪄: Generative-AI (AIGC), easy-to-use APIs, awesome model zoo, diffusion models, for text-to-image genera…
Official Code for Stable Cascade
Accepted as [NeurIPS 2024] Spotlight Presentation Paper
YOLOv6: a single-stage object detection framework dedicated to industrial applications.
A PyTorch re-implementation of the official EfficientDet with SOTA performance in real time and pretrained weights.
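[🔞🔞🔞 Contains images unsuitable for minors] AI explorations and summaries built around my strengths in programming, painting, and writing: StableDiffusion is a powerful image generation model that can generate new images by evolving an existing image. ChatGPT is a Transformer-based language generation model that can automatically generate a suitable article for a given topic. GitHub Copilot is an intelligent programming assistant that speeds up everyday programming.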
Hunyuan-DiT: A Powerful Multi-Resolution Diffusion Transformer with Fine-Grained Chinese Understanding
Open-source and strong foundation image recognition models.
EfficientSAM: Leveraged Masked Image Pretraining for Efficient Segment Anything
From nobody to large language model (LLM) hero~ Stay tuned for updates!!!
Replication of simple CV Projects including attention, classification, detection, keypoint detection, etc.
Official PyTorch implementation of "EdgeSAM: Prompt-In-the-Loop Distillation for On-Device Deployment of SAM"
A minimal PyTorch implementation of probabilistic diffusion models for 2D datasets.
Train a 1B LLM on 1T tokens from scratch as an individual
[ECCV 2024] Tokenize Anything via Prompting
Phi2-Chinese-0.2B: train your own small Chinese Phi2 chat model from scratch, with support for plugging into langchain to load a local knowledge base for retrieval-augmented generation (RAG).
This is a repository about PCB defect detection.
[NeurIPS 2024] CV-VAE: A Compatible Video VAE for Latent Generative Video Models
MoVQGAN - a model for image encoding and reconstruction
Learning the YOLOv3 code from scratch
The official implementation of "[MASK] is All You Need"
Codebase for the Recognize Anything Model (RAM)