Stars
An open source implementation of CLIP.
Easy-to-use and powerful LLM and SLM library with awesome model zoo.
This repository contains the code for "A Lip Sync Expert Is All You Need for Speech to Lip Generation In the Wild", published at ACM Multimedia 2020. For the HD commercial model, please try out Sync Labs.
Minimal reproduction of DeepSeek R1-Zero
The official GitHub page for the survey paper "A Survey of Large Language Models".
Run any open-source LLMs, such as DeepSeek and Llama, as OpenAI compatible API endpoint in the cloud.
YOLOv10: Real-Time End-to-End Object Detection [NeurIPS 2024]
Use PEFT or Full-parameter to CPT/SFT/DPO/GRPO 500+ LLMs (Qwen3, Qwen3-MoE, Llama4, GLM4.5, InternLM3, DeepSeek-R1, ...) and 200+ MLLMs (Qwen3-VL, Qwen3-Omni, InternVL3.5, Ovis2.5, Llava, GLM4v, Ph…
YOLOv3 in PyTorch > ONNX > CoreML > TFLite
YOLOX is a high-performance anchor-free YOLO, exceeding yolov3~v5 with MegEngine, ONNX, TensorRT, ncnn, and OpenVINO supported. Documentation: https://yolox.readthedocs.io/
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.
Implementation of paper - YOLOv9: Learning What You Want to Learn Using Programmable Gradient Information
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
The TinyLlama project is an open endeavor to pretrain a 1.1B Llama model on 3 trillion tokens.
AI-based tool for removing hard-coded subtitles and text-like watermarks from videos or pictures, producing subtitle-free, watermark-free files at the original resolution. No third-party API required; everything runs locally.
An Easy-to-use, Scalable and High-performance RLHF Framework based on Ray (PPO & GRPO & REINFORCE++ & vLLM & Ray & Dynamic Sampling & Async Agentic RL)
A resource for learning about machine learning and deep learning.
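Say goodbye to boredom: a collection of small, practical Python examples. More quality Python tutorials at https://ai-jupyter.com
A GUI tool for extracting hard-coded subtitles (hardsubs) from videos and generating srt files. No third-party API required; text recognition runs locally. A deep-learning-based video subtitle extraction framework, including subtitle region detection and subtitle content extraction.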
BoxMOT: Pluggable SOTA multi-object tracking modules for segmentation, object detection and pose estimation models
Object detection, 3D detection, and pose estimation using center point detection.
💎1MB lightweight face detection model
Minimal PyTorch implementation of YOLOv3
Large World Model -- Modeling Text and Video with Millions Context
LMDeploy is a toolkit for compressing, deploying, and serving LLMs.
Official release of InternLM series (InternLM, InternLM2, InternLM2.5, InternLM3).
[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation
Effortless data labeling with AI support from Segment Anything and other awesome models.
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.