Stars
YOLOv5 🚀 in PyTorch > ONNX > CoreML > TFLite
CLI platform to experiment with codegen. Precursor to: https://lovable.dev
Ray is an AI compute engine. Ray consists of a core distributed runtime and a set of AI Libraries for accelerating ML workloads.
The largest collection of PyTorch image encoders / backbones. Including train, eval, inference, export scripts, and pretrained weights -- ResNet, ResNeXT, EfficientNet, NFNet, Vision Transformer (V…
🤗 Diffusers: State-of-the-art diffusion models for image, video, and audio generation in PyTorch.
A generative world for general-purpose robotics & embodied AI learning.
Implementation of Vision Transformer, a simple way to achieve SOTA in vision classification with only a single transformer encoder, in PyTorch
Large-scale Self-supervised Pre-training Across Tasks, Languages, and Modalities
🤗 LeRobot: Making AI for Robotics more accessible with end-to-end learning
This is an official implementation for "Swin Transformer: Hierarchical Vision Transformer using Shifted Windows".
An open source implementation of CLIP.
Advanced AI Explainability for computer vision. Support for CNNs, Vision Transformers, Classification, Object detection, Segmentation, Image similarity and more.
Enjoy the magic of Diffusion models!
Implementation of Denoising Diffusion Probabilistic Model in PyTorch
[CVPR 2024 Oral] InternVL Family: A Pioneering Open-Source Alternative to GPT-4o. An open-source multimodal dialogue model approaching GPT-4o performance.
Flexible and powerful tensor operations for readable and reliable code (for PyTorch, JAX, TF and others)
PySlowFast: video understanding codebase from FAIR for reproducing state-of-the-art video models.
Example models using DeepSpeed
arXiv LaTeX Cleaner: Easily clean the LaTeX code of your paper to submit to arXiv
Unified framework for robot learning built on NVIDIA Isaac Sim
PyTorch implementation of MoCo: https://arxiv.org/abs/1911.05722
Google Drive Public File Downloader when Curl/Wget Fails
OpenMMLab's Next Generation Video Understanding Toolbox and Benchmark
OpenMMLab Video Perception Toolbox. It supports Video Object Detection (VID), Multiple Object Tracking (MOT), Single Object Tracking (SOT), Video Instance Segmentation (VIS) with a unified framework.
Deformable DETR: Deformable Transformers for End-to-End Object Detection.
Official codebase for I-JEPA, the Image-based Joint-Embedding Predictive Architecture. First outlined in the CVPR paper, "Self-supervised learning from images with a joint-embedding predictive arch…
[ECCV 2024] Official implementation of the paper "Semantic-SAM: Segment and Recognize Anything at Any Granularity"