Stars
OpenMMLab Detection Toolbox and Benchmark
A modular graph-based Retrieval-Augmented Generation (RAG) system
JumpServer is an open-source Privileged Access Management (PAM) platform that provides DevOps and IT teams with on-demand and secure access to SSH, RDP, Kubernetes, Database and RemoteApp endpoints…
Open-Sora: Democratizing Efficient Video Production for All
A generative world for general-purpose robotics & embodied AI learning.
😘 让你“爱”上 GitHub,解决访问时图裂、加载慢的问题。(无需安装)
State-of-the-art 2D and 3D Face Analysis Project
Qwen3 is the large language model series developed by Qwen team, Alibaba Cloud.
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
The open source developer platform to build AI/LLM applications and models with confidence. Enhance your AI applications with end-to-end tracking, observability, and evaluations, all in one integra…
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
Xiaomi Home Integration for Home Assistant
Fast and memory-efficient exact attention
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
中文LLaMA&Alpaca大语言模型+本地CPU/GPU训练部署 (Chinese LLaMA & Alpaca LLMs)
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
Fast and flexible image augmentation library. Paper about the library: https://www.mdpi.com/2078-2489/11/2/125
An open source AutoML toolkit for automate machine learning lifecycle, including feature engineering, neural architecture search, model compression and hyper-parameter tuning.
Object Detection toolkit based on PaddlePaddle. It supports object detection, instance segmentation, multiple object tracking and real-time multi-person keypoint detection.
Question and Answer based on Anything.
A Fundamental End-to-End Speech Recognition Toolkit and Open Source SOTA Pretrained Models, Supporting Speech Recognition, Voice Activity Detection, Text Post-processing etc.
[CVPR 2023] SadTalker:Learning Realistic 3D Motion Coefficients for Stylized Audio-Driven Single Image Talking Face Animation
Open deep learning compiler stack for cpu, gpu and specialized accelerators
🏄 Scalable embedding, reasoning, ranking for images and sentences with CLIP
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
ComfyUI-Manager is an extension designed to enhance the usability of ComfyUI. It offers management functions to install, remove, disable, and enable various custom nodes of ComfyUI. Furthermore, th…
A paper list of object detection using deep learning.