caoyunkang

Starred repositories

[AAAI 2026] https://huggingface.co/csgaobb/AdaptCLIP

Python 100 4 Updated Dec 18, 2025

[DEIMv2] Real Time Object Detection Meets DINOv3

Jupyter Notebook 1,279 131 Updated Dec 13, 2025

A curated list of publications on image and video segmentation leveraging Multimodal Large Language Models (MLLMs), highlighting state-of-the-art methods, innovative applications, and key advanceme…

176 5 Updated Dec 8, 2025

[AAAI 2026 Oral] LENS: Learning to Segment Anything with Unified Reinforced Reasoning

Python 82 4 Updated Dec 3, 2025
3 Updated Dec 3, 2025

NeurIPS 2025 Spotlight; ICLR 2024 Spotlight; CVPR 2024; EMNLP 2024

Python 1,780 76 Updated Nov 27, 2025
Python 7,512 444 Updated Dec 14, 2025

The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…

Python 6,252 726 Updated Dec 11, 2025

The repository provides code for running inference with the SAM 3D Body Model (3DB), links for downloading the trained model checkpoints and datasets, and example notebooks that show how to use the…

Python 2,296 215 Updated Dec 19, 2025

Project Page For "Seg-Zero: Reasoning-Chain Guided Segmentation via Cognitive Reinforcement"

Python 580 27 Updated Jul 30, 2025

Paper list for LLM/MLLM-based image segmentation

44 Updated Dec 10, 2025

[AAAI 2026] The Official Implementation for "Anomagic: Crossmodal Prompt-driven Zero-shot Anomaly Generation"

Python 70 3 Updated Dec 13, 2025

[AAAI 2026 Oral] The Official Implementation for "Towards High-Resolution 3D Anomaly Detection: A Scalable Dataset and Real-Time Framework for Subtle Industrial Defects"

Python 58 1 Updated Nov 19, 2025

(CVPR 2025 highlight✨) Official repository of paper "LLMDet: Learning Strong Open-Vocabulary Object Detectors under the Supervision of Large Language Models"

Python 524 28 Updated Dec 18, 2025

A curated publication list on open-vocabulary semantic segmentation and related areas (e.g., zero-shot semantic segmentation).

793 33 Updated Oct 23, 2025

[NeurIPS 2025 Spotlight] Official repository for the paper: OpenWorldSAM: Extending SAM2 for Universal Image Segmentation with Language Prompts

Python 17 1 Updated Nov 26, 2025

[AAAI 2026] Official Implementation for "AnoStyler: Text-Driven Localized Anomaly Generation via Lightweight Style Transfer"

Python 7 2 Updated Nov 14, 2025

Detect Anything via Next Point Prediction (Based on Qwen2.5-VL-3B)

Jupyter Notebook 1,012 66 Updated Dec 15, 2025

[JMS 2025] A Comprehensive Survey for Real-World Industrial Surface Defect Detection: Challenges, Approaches, and Prospects (Journal of Manufacturing Systems)

75 7 Updated Dec 18, 2025

A Curated List of Awesome Works in World Modeling, Aiming to Serve as a One-stop Resource for Researchers, Practitioners, and Enthusiasts Interested in World Modeling.

1,506 61 Updated Dec 18, 2025

We have summarised all 3D anomaly detection methods and datasets (still updating).

36 Updated Dec 21, 2025

AnomalyControl: Learning Cross-modal Semantic Features for Controllable Anomaly Synthesis

Python 41 1 Updated Aug 8, 2025

RLinf is a flexible and scalable open-source infrastructure designed for post-training foundation models (LLMs, VLMs, VLAs) via reinforcement learning.

Python 1,764 168 Updated Dec 21, 2025

Normal-Abnormal Guided Generalist Anomaly Detection (NeurIPS 2025)

Python 57 6 Updated Nov 21, 2025

OmniGen: Unified Image Generation. https://arxiv.org/pdf/2409.11340

Jupyter Notebook 4,290 367 Updated Dec 4, 2025

Official PyTorch Implementation of "Diffusion Transformers with Representation Autoencoders"

Python 1,640 53 Updated Nov 15, 2025

[NeurIPS 2025 Spotlight] Official implementation of the SIU3R: Simultaneous Scene Understanding and 3D Reconstruction Beyond Feature Alignment

Python 152 5 Updated Sep 25, 2025

Code and Data for NeurIPS 2025 paper "MuSLR: Multimodal Symbolic Logical Reasoning"

Jupyter Notebook 9 Updated Oct 5, 2025
Python 5 Updated Sep 30, 2025

[NeurIPS 2025] PANDA: Towards Generalist Video Anomaly Detection via Agentic AI Engineer

21 2 Updated Oct 2, 2025