Stars
Fast, small, and fully autonomous AI assistant infrastructure — deploy anywhere, swap anything 🦀
Ultralytics YOLOv8, YOLOv9, YOLOv10, YOLOv11, YOLOv12 for ROS 2
Open Images is a dataset of ~9 million images that have been annotated with image-level labels and bounding boxes spanning thousands of classes.
Claraverse is an open-source, privacy-focused ecosystem to replace ChatGPT, Claude, N8N, and ImageGen with your own hosted LLM, keys, and compute. Includes desktop, iOS, and Android apps.
[CVPR 2025] Official PyTorch implementation of "EdgeTAM: On-Device Track Anything Model"
🤗 Transformers: the model-definition framework for state-of-the-art machine learning models in text, vision, audio, and multimodal models, for both inference and training.
Transformer Explained Visually: Learn How LLM Transformer Models Work with Interactive Visualization
GenAI Bootcamp: 2-day workshop presented at MLCon Berlin 2025. Code samples and slides.
An extremely fast Python linter and code formatter, written in Rust.
Object Detection Metrics. 14 object detection metrics: mean Average Precision (mAP), Average Recall (AR), Spatio-Temporal Tube Average Precision (STT-AP). This project supports different bounding b…
Reference PyTorch implementation and models for DINOv3
Copilot Chat extension for VS Code
Semantic segmentation models with 500+ pretrained convolutional and transformer-based backbones.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
✨ Build a beautiful and simple website in literally minutes. Demo at https://beautifuljekyll.com
Badges for your personal developer branding, profile, and projects.
Track-Anything is a flexible and interactive tool for video object tracking and segmentation, based on Segment Anything, XMem, and E2FGVI.
[NeurIPS 2025] YOLOv12: Attention-Centric Real-Time Object Detectors
BoxMOT: Pluggable SOTA multi-object tracking modules for segmentation, object detection and pose estimation models
⚡ Dynamically generated stats for your GitHub READMEs
GenAI Agent Framework, the Pydantic way
Real-time face swap and one-click video deepfake with only a single image
[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer
A TTS model capable of generating ultra-realistic dialogue in one pass.