A curated list of awesome egocentric (first-person) video datasets, papers, benchmarks and resources.

Python 99 3 Updated May 13, 2026

Convert 2D videos and photos into interactive 3D scenes using ML-SHARP and Rerun. Explore your videos in 3D space with depth maps, navigation tools, and creative effects.

Python 86 5 Updated Apr 13, 2026

Code for "EgoX: Egocentric Video Generation from a Single Exocentric Video"

Python 699 47 Updated May 4, 2026

This is a collection of recent papers on reasoning in video generation models.

153 5 Updated May 13, 2026

DuoLoRA implementation

Python 9 Updated Oct 18, 2025

pySLAM is a hybrid Python/C++ Visual SLAM pipeline supporting monocular, stereo, and RGB-D cameras. It provides a broad set of modern local and global feature extractors, multiple loop-closure stra…

Python 3,296 522 Updated May 16, 2026

Official PyTorch implementation of the paper "Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs"

Python 97 15 Updated Jun 6, 2025

A Pocket-Sized MLLM for Ultra-Efficient Image and Video Understanding on Your Phone

Python 25,026 1,957 Updated May 17, 2026

[CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".

Python 56 1 Updated May 25, 2025

[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding

Python 377 33 Updated May 8, 2024

Group-wise Temporal Logit Adjustment for TAS

Python 10 Updated Oct 24, 2024

A curated list of awesome discrete diffusion model resources.

559 22 Updated Sep 9, 2025

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 13,119 1,460 Updated May 16, 2026

Simple diffusion models for image generation, built mainly for understanding and learning.

Python 8 2 Updated Mar 30, 2025

[WACV'25] Temporal Instructional Diagram Grounding in Unconstrained Videos

Python 5 Updated Dec 17, 2024

Video Annotation Tool

Vue 237 32 Updated Jun 18, 2024

[ICLR 2025] Video Action Differencing

Python 53 6 Updated Jul 3, 2025

A collection of my book notes on various subjects, mainly computer science

Java 3,006 784 Updated Mar 1, 2025

[ECCV 2024] Gated Temporal Action Anticipation for Stochastic Long-Term Anticipation

Python 23 1 Updated May 29, 2025

Code and data release for the paper "Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment" (NeurIPS 2023)

Python 19 3 Updated Apr 5, 2024

React + Next.js template for research websites (for PhD students, researchers, etc.)

TypeScript 228 94 Updated Jan 12, 2025

Visualizing the learned space-time attention using Attention Rollout

Jupyter Notebook 41 8 Updated Apr 1, 2022

MLX: An array framework for Apple silicon

C++ 26,288 1,809 Updated May 17, 2026

Jupyter Notebook 180 16 Updated Nov 10, 2024

Collection of AWESOME vision-language models for vision tasks

3,117 233 Updated Oct 14, 2025

A declarative drawing API in Python

Python 300 14 Updated Aug 28, 2024

It is my belief that you, the postgraduate students and job-seekers for whom the book is primarily meant, will benefit from reading it; however, it is my hope that even the most experienced research…

4,832 322 Updated Aug 22, 2025

A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.

775 40 Updated May 8, 2026