Stars
[ICCV 2025 Highlight] OminiControl: Minimal and Universal Control for Diffusion Transformer
The simplest, fastest repository for training/finetuning small-sized VLMs.
Code from my thesis Articulated 3D Hand from a Single RGB Image, later published as Monocular 3D Hand Pose Estimation with Implicit Camera Alignment. Also contains notes from an earlier study on 2D…
[NeurIPS 2025] YOLOv12: Attention-Centric Real-Time Object Detectors
Production First and Production Ready End-to-End Speech Recognition Toolkit
[CVPR 2024 Highlight] Logit Standardization in Knowledge Distillation
Strong and Open Vision Language Assistant for Mobile Devices
Generate high-definition short videos with one click using large AI models.
PyTorch reimplementation of hand-biomechanical-constraints (ECCV2020)
[CVPR 2023] DepGraph: Towards Any Structural Pruning; LLMs, Vision Foundation Models, etc.
drewZZzz6 / YOLOv6
Forked from meituan/YOLOv6
YOLOv6: a single-stage object detection framework dedicated to industrial applications.
Automated dense category annotation engine that serves as the initial semantic labeling for the Segment Anything dataset (SA-1B).
repo for NIMBLE: A Non-rigid Hand Model with Bones and Muscles
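A deep-learning-based tumor diagnosis assistance system built around image segmentation. It uses AI to identify and delineate tumor regions and provides features of those regions to help doctors make a diagnosis. Includes complete model building, backend setup, industrial-grade deployment, and frontend access. TensorRT, PyTorch, OpenCV, Flask, Vue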