Sid2697

Organizations

@dbpedia @EGO4D-Consortium

WiLoR: End-to-end 3D hand localization and reconstruction in-the-wild

Python 471 41 Updated Mar 22, 2026

MANO hand model in PyTorch (anatomy consistent, anchors, etc)

Python 280 26 Updated Feb 2, 2026

[CVPR 2023] Official repository for downloading, processing, visualizing, and training models on the ARCTIC dataset.

Python 465 29 Updated Mar 4, 2026

A procedural Blender pipeline for photorealistic training image generation

Python 3,468 504 Updated Jan 20, 2026

[CVPR 2019 Oral] Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation in Python 3, TensorFlow, and Keras

Python 492 74 Updated Dec 2, 2022

DUSt3R: Geometric 3D Vision Made Easy

Python 7,053 745 Updated Sep 24, 2025

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 18,848 2,412 Updated Mar 20, 2026

[CVPR 2024✨Highlight] Official repository for HOLD, the first method that jointly reconstructs articulated hands and objects from monocular videos without assuming a pre-scanned object template and…

Python 470 14 Updated Mar 10, 2026

Codebase for "Every Shot Counts: Using Exemplars for Repetition Counting in Videos"

Python 29 Updated Dec 18, 2024

An open source SDK for logging, storing, querying, and visualizing multimodal and multi-rate data

Rust 10,476 707 Updated Apr 3, 2026

GRAB: A Dataset of Whole-Body Human Grasping of Objects

Python 363 35 Updated Mar 8, 2022

Lightweight vanilla JavaScript library for comparing multiple images with sliders. You can also add text and filters to your images.

JavaScript 70 6 Updated Mar 1, 2022

The official Meta Llama 3 GitHub site

Python 29,293 3,530 Updated Jan 26, 2025

Official implementation for the CVPR'23 paper: Visibility Aware Human-Object Interaction Tracking from Single RGB Camera

Python 76 3 Updated Jun 10, 2023

[3DV 2025] Code for "FlowMap: High-Quality Camera Poses, Intrinsics, and Depth via Gradient Descent" by Cameron Smith*, David Charatan*, Ayush Tewari, and Vincent Sitzmann

Python 975 92 Updated Mar 26, 2025

[CVPR 2024, Highlight] Living Scenes: Multi-object Relocalization and Reconstruction in Changing 3D Environments

Python 102 8 Updated Jul 5, 2024

MCC-HO

Python 57 7 Updated Dec 2, 2024

HaMeR: Reconstructing Hands in 3D with Transformers

Python 931 136 Updated Feb 7, 2026
Python 33 1 Updated Dec 4, 2025

Code for the paper "ViperGPT: Visual Inference via Python Execution for Reasoning"

Jupyter Notebook 1,714 130 Updated Jan 29, 2024

[CVPR 2024] Depth Anything: Unleashing the Power of Large-Scale Unlabeled Data. Foundation Model for Monocular Depth Estimation

Python 8,056 611 Updated Jul 17, 2024

Official code for the paper "GestSync: Determining who is speaking without a talking head" published at BMVC 2023

Python 47 2 Updated Sep 1, 2024

Welcome to the Llama Cookbook! This is your go-to guide for building with Llama: getting started with inference, fine-tuning, and RAG. We also show you how to solve end-to-end problems using Llama mode…

Jupyter Notebook 18,273 2,721 Updated Apr 1, 2026
Python 8,686 521 Updated Oct 9, 2024

Open-source code for MiniGPT-4 and MiniGPT-v2 (https://minigpt-4.github.io, https://minigpt-v2.github.io/)

Python 25,752 2,909 Updated Sep 2, 2024

A state-of-the-art open visual language model | multimodal pre-trained model

Python 6,735 452 Updated May 29, 2024

The official repo of Qwen-VL (通义千问-VL) chat & pretrained large vision language model proposed by Alibaba Cloud.

Python 6,605 489 Updated Aug 7, 2024

✨✨Latest Advances on Multimodal Large Language Models

17,566 1,120 Updated Apr 3, 2026

Instruction Tuning with GPT-4

HTML 4,335 311 Updated Jun 11, 2023