Stars
Official inference framework for 1-bit LLMs
[CVPR 2025] Official PyTorch implementation of "EdgeTAM: On-Device Track Anything Model"
[DEIMv2] Real Time Object Detection Meets DINOv3
[NeurIPS 2025] SpatialLM: Training Large Language Models for Structured Indoor Modeling
[CVPR 2023] MOTRv2: Bootstrapping End-to-End Multi-Object Tracking by Pretrained Object Detectors
Official Implementation of "VAU-R1: Advancing Video Anomaly Understanding via Reinforcement Fine-Tuning".
Everything about the SmolLM and SmolVLM family of models
🚀 [Large models] Train a 26M-parameter vision-language model (VLM) from scratch in just 1 hour! 🌏
Official implementation of "Referring Video Object Segmentation via Language Aligned Track Selection".
🚀🚀 [Large models] Train a small 26M-parameter GPT entirely from scratch in just 2 hours! 🌏
[AAAI 2026] SimROD: A Simple Baseline for Raw Object Detection with Global and Local Enhancements
dnth / DEIMKit
Forked from Intellindust-AI-Lab/DEIM
DEIMKit is a Python package that provides a wrapper for DEIM: DETR with Improved Matching for Fast Convergence. Check out the original repo for more details.
[CVPR 2025] DEIM: DETR with Improved Matching for Fast Convergence
[ICLR 2025] Automated Design of Agentic Systems
D-FINE: Redefine Regression Task of DETRs as Fine-grained Distribution Refinement [ICLR 2025 Spotlight]
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
PyTorch code and models for the DINOv2 self-supervised learning method.
Hackable and optimized Transformers building blocks, supporting a composable construction.
[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥
(Pattern Recognition) PyTorch implementation of “HTR-VT: Handwritten Text Recognition with Vision Transformer”
PyTorch Implementation of ECCV 2024 OOD-CV Workshop SSB Challenge (Open-Set Recognition Track) - 1st Place
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Implementations of few-shot object detection benchmarks
WeChat Mini Program components / API / cloud development examples
[ECCV 2024, Oral] DynamiCrafter: Animating Open-domain Images with Video Diffusion Priors
This is a resource list for low-light image enhancement
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.