vision
DeepSeek-VL2: Mixture-of-Experts Vision-Language Models for Advanced Multimodal Understanding
A collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-edge models like YOLO11, RT-DETR, SAM …
RF-DETR is a real-time object detection model architecture developed by Roboflow, state-of-the-art on COCO and designed for fine-tuning.
This repository offers a comprehensive collection of tutorials on state-of-the-art computer vision models and techniques. Explore everything from foundational architectures like ResNet to cutting-e…
This repository is a curated collection of links to courses and resources about Artificial Intelligence (AI).
A GUI agent application based on UI-TARS (a vision-language model) that lets you control your computer using natural language.
Unsloth Fine-tuning Notebooks for Google Colab, Kaggle, Hugging Face and more.