Huazhong University of Science and Technology
Stars
(CVPR 2026 🎉) Official repository of the paper "PET-DINO: Unifying Visual Cues into Grounding DINO with Prompt-Enriched Training"
Fully autonomous & self-evolving research from idea to paper. Chat an Idea. Get a Paper. 🦞
Edit Banana: A framework for converting statistical formats into editable formats.
Official code of Franca: Nested Matryoshka Clustering for Scalable Visual Representation Learning
(ECCV 2024) VCP-CLIP: A visual context prompting model for zero-shot anomaly segmentation
[ICCV 2025] Disentangling Instance and Scene Contexts for 3D Semantic Scene Completion
This is a third-party implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection".
OV-DINO: Unified Open-Vocabulary Detection with Language-Aware Selective Fusion
A curated list of papers, datasets and resources pertaining to open vocabulary object detection.
[CVPR 2024] Official implementation of the paper "Visual In-context Learning"
Awesome OVD-OVS - A Survey on Open-Vocabulary Detection and Segmentation: Past, Present, and Future
[CVPR2023] This is an official implementation of the paper "DETRs with Hybrid Matching".
detrex is a research platform for DETR-based object detection, segmentation, pose estimation and other visual recognition tasks.
State-of-the-art Image & Video CLIP, Multimodal Large Language Models, and More!
Witness the aha moment of VLM with less than $3.
DINO-X: The World's Top-Performing Vision Model for Open-World Object Detection and Understanding
🏄 [ICLR 2025] OVTR: End-to-End Open-Vocabulary Multiple Object Tracking with Transformer
[ICLR2025] Official code implementation of Video-UTR: Unhackable Temporal Rewarding for Scalable Video MLLMs
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
Sky-T1: Train your own O1 preview model within $450
🚀【AAAI 2025】Cross-View Referring Multi-Object Tracking
Framework agnostic sliced/tiled inference + interactive ui + error analysis plots
[ECCV2022] MOTR: End-to-End Multiple-Object Tracking with TRansformer
[CVPR 2024] Official implementation of the paper "Salience DETR: Enhancing Detection Transformer with Hierarchical Salience Filtering Refinement"
A DETR-style framework for open-vocabulary detection (OVD). CVPR 2023