- MaVi, University of Bristol
- Bristol
- https://sid2697.github.io
- @Sid__Bansal
Stars
WiLoR: End-to-end 3D hand localization and reconstruction in-the-wild
MANO hand model in PyTorch (anatomy-consistent, anchors, etc.)
[CVPR 2023] Official repository for downloading, processing, visualizing, and training models on the ARCTIC dataset.
A procedural Blender pipeline for photorealistic training image generation
[CVPR 2019 Oral] Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation, in Python 3, TensorFlow, and Keras
The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…
[CVPR 2024✨Highlight] Official repository for HOLD, the first method that jointly reconstructs articulated hands and objects from monocular videos without assuming a pre-scanned object template and…
Codebase for "Every Shot Counts: Using Exemplars for Repetition Counting in Videos"
An open source SDK for logging, storing, querying, and visualizing multimodal and multi-rate data
GRAB: A Dataset of Whole-Body Human Grasping of Objects
Lightweight vanilla JavaScript library to compare multiple images with sliders. You can also add text and filters to your images.
Official implementation for the CVPR'23 paper: Visibility Aware Human-Object Interaction Tracking from Single RGB Camera
[3DV 2025] Code for "FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent" by Cameron Smith*, David Charatan*, Ayush Tewari, and Vincent Sitzmann
[CVPR 2024, Highlight] Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments
HaMeR: Reconstructing Hands in 3D with Transformers
Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"
[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation
Official code for the paper "GestSync: Determining who is speaking without a talking head" published at BMVC 2023
Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…
Open-sourced codes for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)
A state-of-the-art open visual language model | multimodal pretrained model
The official repo of Qwen-VL (通义千问-VL), a chat and pretrained large vision-language model proposed by Alibaba Cloud.
✨✨Latest Advances on Multimodal Large Language Models
Instruction Tuning with GPT-4