Starred repositories
A complete computer science study plan to become a software engineer.
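《Hello 算法》 (Hello Algo): a data structures and algorithms tutorial with animated illustrations and one-click runnable code. Supports Python, Java, C++, C, C#, JS, Go, Swift, Rust, Ruby, Kotlin, TS, and Dart code. The Simplified and Traditional Chinese editions are updated in sync; English version in translation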
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
Instant voice cloning by MIT and MyShell. Audio foundation model.
OpenPose: Real-time multi-person keypoint detection library for body, face, hands, and foot estimation
Cross-platform, customizable ML solutions for live and streaming media.
Pretrain, finetune ANY AI model of ANY size on 1 or 10,000+ GPUs with zero code changes.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
[NeurIPS'23 Oral] Visual Instruction Tuning (LLaVA) built towards GPT-4V level capabilities and beyond.
MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
Event-driven network library for multi-threaded Linux server in C++11
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
Image Polygonal Annotation with Python (polygon, rectangle, circle, line, point and image-level flag annotation).
End-to-End Object Detection with Transformers
Annotate better with CVAT, the industry-leading data engine for machine learning. Used and trusted by teams at any scale, for data of any scale.
Image augmentation for machine learning experiments.
Pretrained models for TensorFlow.js
cvpr2024/cvpr2023/cvpr2022/cvpr2021/cvpr2020/cvpr2019/cvpr2018/cvpr2017 collection of papers, code, explanations, and livestreams, compiled by the 极市 team
High-Resolution 3D Assets Generation with Large Scale Hunyuan3D Diffusion Models.
Generate 3D objects conditioned on text or images