Stars
The repository provides code for running inference with the Segment Anything Model (SAM), links for downloading the trained model checkpoints, and example notebooks that show how to use the model.
Google Research
CLIP (Contrastive Language-Image Pretraining): predict the most relevant text snippet given an image
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything
Qwen3-VL is the multimodal large language model series developed by the Qwen team, Alibaba Cloud.
This repository contains demos I made with the Transformers library by HuggingFace.
PyTorch code for BLIP: Bootstrapping Language-Image Pre-training for Unified Vision-Language Understanding and Generation
A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.
Solutions to Reinforcement Learning: An Introduction
Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"
An open-source library for GPU-accelerated robot learning and sim-to-real transfer.
A new codebase for popular Scene Graph Generation methods (2020). Visualization & Scene Graph Extraction on custom images/datasets are provided. It's also a PyTorch implementation of paper “Unbiase…
Neural question generation using transformers
A PyTorch reimplementation of bottom-up-attention models
[NeurIPS 2023 D&B] VidChapters-7M: Video Chapters at Scale
[ICCV 2021 Oral + TPAMI] Just Ask: Learning to Answer Questions from Millions of Narrated Videos
Official implementation of "Towards Generalizable Vision-Language Robotic Manipulation: A Benchmark and LLM-guided 3D Policy."
Learning Descriptive Image Captioning via Semipermeable Maximum Likelihood Estimation (NeurIPS 2023)