Stars
Python - 100 Days from Novice to Master
🧑🏫 60+ Implementations/tutorials of deep learning papers with side-by-side notes 📝; including transformers (original, xl, switch, feedback, vit, ...), optimizers (adam, adabelief, sophia, ...), ga…
🐙 Guides, papers, lectures, notebooks and resources for prompt engineering
A high-throughput and memory-efficient inference and serving engine for LLMs
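A minimal offline-inference sketch using vLLM's Python API, assuming the package is installed and a GPU is available; the model name is only an example:

```python
from vllm import LLM, SamplingParams

# Load a model once; vLLM manages the KV cache and batching internally.
llm = LLM(model="Qwen/Qwen2.5-7B-Instruct")
params = SamplingParams(temperature=0.7, max_tokens=128)

# Generate completions for a batch of prompts.
outputs = llm.generate(["Explain KV-cache paging in one sentence."], params)
print(outputs[0].outputs[0].text)
```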
The repository provides code for running inference with the SegmentAnything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
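A hedged sketch of point-prompted inference with the segment_anything package, following the pattern of its example notebooks; the checkpoint path and the dummy image are placeholders:

```python
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Build the model from a downloaded checkpoint (path is a placeholder).
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth")
predictor = SamPredictor(sam)

# image: H x W x 3 uint8 RGB array; a zero image stands in for a real photo.
image = np.zeros((480, 640, 3), dtype=np.uint8)
predictor.set_image(image)

# Prompt with one foreground point; returns candidate masks, quality scores, logits.
masks, scores, logits = predictor.predict(
    point_coords=np.array([[320, 240]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
```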
The 30 Days of Python programming challenge is a step-by-step guide to learning the Python programming language in 30 days. The challenge may take more than 100 days; follow your own pace. These videos ma…
Fully open reproduction of DeepSeek-R1
Deep learning for image processing, including classification, object detection, etc.
🤗 PEFT: State-of-the-art Parameter-Efficient Fine-Tuning.
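A minimal LoRA setup with 🤗 PEFT, assuming transformers and peft are installed; gpt2 and its c_attn projection are only an illustrative base model and target module:

```python
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("gpt2")  # example base model

# Wrap the base model with low-rank adapters on the attention projection.
config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["c_attn"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, config)
model.print_trainable_parameters()  # only the LoRA adapters are trainable
```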
Fast and memory-efficient exact attention
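A small sketch of calling the fused kernel via flash_attn_func, assuming the flash-attn package is installed and a CUDA device with fp16 support is available:

```python
import torch
from flash_attn import flash_attn_func

# q, k, v: (batch, seqlen, nheads, headdim) in fp16/bf16 on a CUDA device.
q = torch.randn(2, 1024, 8, 64, device="cuda", dtype=torch.float16)
k = torch.randn_like(q)
v = torch.randn_like(q)

out = flash_attn_func(q, k, v, causal=True)  # same shape as q
```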
The official repo of Qwen (通义千问) chat & pretrained large language model proposed by Alibaba Cloud.
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
✨✨Latest Advances on Multimodal Large Language Models
✔ (Completed) The most comprehensive deep learning notes, covering Tudui's PyTorch tutorial, Mu Li's Dive into Deep Learning, and Andrew Ng's Deep Learning courses
Qwen2.5-VL is the multimodal large language model series developed by the Qwen team at Alibaba Cloud.
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
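A hedged usage sketch for pytorch-grad-cam on a torchvision ResNet-50; the target layer and ImageNet class index (281, "tabby cat") are illustrative choices:

```python
import torch
from torchvision.models import resnet50
from pytorch_grad_cam import GradCAM
from pytorch_grad_cam.utils.model_targets import ClassifierOutputTarget

model = resnet50(weights="IMAGENET1K_V2").eval()
cam = GradCAM(model=model, target_layers=[model.layer4[-1]])

# input_tensor: normalized (N, 3, H, W) batch; random data stands in for a real image.
input_tensor = torch.randn(1, 3, 224, 224)
grayscale_cam = cam(input_tensor=input_tensor, targets=[ClassifierOutputTarget(281)])
# grayscale_cam[0] is an H x W heatmap in [0, 1] for the first image in the batch.
```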
A PyTorch implementation of the Transformer model in "Attention is All You Need".
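For reference, the core operation that paper introduces is scaled dot-product attention; a generic PyTorch sketch of the formula softmax(QKᵀ/√d_k)V, not the repository's own code:

```python
import math
import torch

def scaled_dot_product_attention(q, k, v, mask=None):
    # q, k, v: (..., seq_len, d_k); mask: broadcastable, 0 where attention is disallowed.
    scores = q @ k.transpose(-2, -1) / math.sqrt(q.size(-1))
    if mask is not None:
        scores = scores.masked_fill(mask == 0, float("-inf"))
    return torch.softmax(scores, dim=-1) @ v
```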
OpenMMLab Semantic Segmentation Toolbox and Benchmark.
[ECCV 2024] Official implementation of the paper "Grounding DINO: Marrying DINO with Grounded Pre-Training for Open-Set Object Detection"
[Lumina Embodied AI] A technical guide to embodied intelligence: Embodied-AI-Guide
PyTorch code for Vision Transformers training with the Self-Supervised learning method DINO
The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.
This is the official code for MobileSAM project that makes SAM lightweight for mobile applications and beyond!
[ICLR'24 spotlight] An open platform for training, serving, and evaluating large language models for tool learning.
TorchGeo: datasets, samplers, transforms, and pre-trained models for geospatial data
🦦 Otter, a multi-modal model based on OpenFlamingo (open-sourced version of DeepMind's Flamingo), trained on MIMIC-IT and showcasing improved instruction-following and in-context learning ability.
Diffusion model papers, survey, and taxonomy