646 stars written in Python

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Python 7,828 595 Updated Jul 17, 2024

Make Python great again

Python 7,591 394 Updated Dec 4, 2019
Python 7,555 2,202 Updated Oct 23, 2025

AI Toolkit for Healthcare Imaging

Python 7,401 1,331 Updated Nov 6, 2025

PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO

Python 7,282 1,005 Updated Jul 3, 2024

[SIGGRAPH Asia 2022] VideoReTalking: Audio-based Lip Synchronization for Talking Head Video Editing In the Wild

Python 7,148 1,057 Updated Aug 5, 2024

Official repository of "SAMURAI: Adapting Segment Anything Model for Zero-Shot Visual Tracking with Motion-Aware Memory"

Python 6,989 481 Updated Mar 18, 2025
Python 6,702 1,131 Updated Nov 3, 2025

A state-of-the-art-level open visual language model | multimodal pretrained model

Python 6,690 445 Updated May 29, 2024

Google AI 2018 BERT pytorch implementation

Python 6,495 1,329 Updated Sep 15, 2023

OpenMMLab Computer Vision Foundation

Python 6,303 1,718 Updated Apr 25, 2025

AppAgent: Multimodal Agents as Smartphone Users, an LLM-based multimodal agent framework designed to operate smartphone apps.

Python 6,199 700 Updated Mar 19, 2025

Mobile-Agent: The Powerful GUI Agent Family

Python 6,197 619 Updated Oct 31, 2025

OpenMMLab's next-generation platform for general 3D object detection.

Python 6,103 1,690 Updated Jul 10, 2024

[CVPR 2024] Real-Time Open-Vocabulary Object Detection

Python 5,981 567 Updated Feb 26, 2025

Qwen-Image is a powerful image generation foundation model capable of complex text rendering and precise image editing.

Python 5,937 324 Updated Nov 7, 2025

[ICLR 2024] Fine-tuning LLaMA to follow Instructions within 1 Hour and 1.2M Parameters

Python 5,913 383 Updated Mar 14, 2024

Solve Visual Understanding with Reinforced VLMs

Python 5,678 366 Updated Oct 21, 2025

Chinese text classification with TextCNN, TextRNN, FastText, TextRCNN, BiLSTM_Attention, DPCNN, and Transformer; based on PyTorch, ready to use out of the box.

Python 5,671 1,261 Updated Sep 23, 2020

[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.

Python 5,326 452 Updated May 21, 2025

Open-source unified multimodal model

Python 5,257 455 Updated Oct 27, 2025

Ticket-grabbing script for Damai (大麦网)

Python 5,248 918 Updated Mar 13, 2024

Nightly release of ControlNet 1.1

Python 5,101 403 Updated Aug 8, 2024

Most popular metrics used to evaluate object detection algorithms.

Python 5,084 1,036 Updated Jun 29, 2025

Agent Laboratory is an end-to-end autonomous research workflow meant to assist you as the human researcher toward implementing your research ideas

Python 5,062 735 Updated Aug 20, 2025

VideoCrafter2: Overcoming Data Limitations for High-Quality Video Diffusion Models

Python 4,993 395 Updated Jul 10, 2024

This repo is a pipeline of VITS finetuning for fast speaker adaptation TTS, and many-to-many voice conversion

Python 4,985 738 Updated Jan 21, 2025

A Next-Generation Training Engine Built for Ultra-Large MoE Models

Python 4,967 381 Updated Nov 7, 2025

Repo for BenTsao [original name: HuaTuo (华驼)], Instruction-tuning Large Language Models with Chinese Medical Knowledge.

Python 4,887 493 Updated Feb 21, 2025