Stars
⚡ Dynamically generated stats for your GitHub READMEs
Demonstrate all the questions on LeetCode in the form of animation. (Presents the reasoning behind LeetCode solutions as animations.)
Everything you need to know to get the job.
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
LeetCode Solutions: A Record of My Problem-Solving Journey.
CLI platform to experiment with codegen. Precursor to: https://lovable.dev
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
Google Research
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
A generative world for general-purpose robotics & embodied AI learning.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
✨✨Latest Advances on Multimodal Large Language Models
Qwen3-VL is the multimodal large language model series developed by Qwen team, Alibaba Cloud.
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
An emoji cheat sheet in Markdown
An open source implementation of CLIP.
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
A collection of resources and papers on Diffusion Models
This repository contains demos I made with the Transformers library by HuggingFace.
Enjoy the magic of Diffusion models!
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal chat model approaching GPT-4o performance.