avijit9
🎯 Focusing

801 results for source starred repositories

DuoLoRA implementation

Python 5 Updated Oct 18, 2025

pySLAM is a Python-based Visual SLAM pipeline that supports monocular, stereo, and RGB-D cameras. It offers a wide range of modern local and global features, multiple loop-closing strategies, a vol…

Python 2,713 446 Updated Nov 5, 2025

Official PyTorch implementation of the paper "Chapter-Llama: Efficient Chaptering in Hour-Long Videos with LLMs"

Python 77 10 Updated Jun 6, 2025

MiniCPM-V 4.5: A GPT-4o Level MLLM for Single Image, Multi Image and High-FPS Video Understanding on Your Phone

Python 22,185 1,665 Updated Sep 24, 2025

[CVPR 2025] Official PyTorch code of "Enhancing Video-LLM Reasoning via Agent-of-Thoughts Distillation".

Python 50 Updated May 25, 2025

[ICCV 2023] UniVTG: Towards Unified Video-Language Temporal Grounding

Python 368 34 Updated May 8, 2024

Group-wise Temporal Logit Adjustment for TAS

Python 10 Updated Oct 24, 2024

A curated list for awesome discrete diffusion models resources.

488 19 Updated Sep 9, 2025

[CVPR 2025 Best Paper Award] VGGT: Visual Geometry Grounded Transformer

Python 11,530 1,184 Updated Oct 11, 2025

Building simple diffusion models for image generation. More so for understanding and learning.

Python 8 2 Updated Mar 30, 2025

[WACV'25] Temporal Instructional Diagram Grounding in Unconstrained Videos

Python 5 Updated Dec 17, 2024

Video Annotation Tool

Vue 226 29 Updated Jun 18, 2024

[ICLR 2025] Video Action Differencing

Python 47 2 Updated Jul 3, 2025

A collection of my book notes on various subjects, mainly computer science

Java 2,887 749 Updated Mar 1, 2025

[ECCV2024] Gated Temporal Action Anticipation for Stochastic Long-Term Anticipation

Python 18 1 Updated May 29, 2025

Code and data release for the paper "Learning Fine-grained View-Invariant Representations from Unpaired Ego-Exo Videos via Temporal Alignment" (NeurIPS 2023)

Python 19 2 Updated Apr 5, 2024

React + Next.js template for research websites (for PhD students, researchers, etc)

TypeScript 202 82 Updated Jan 12, 2025

Visualizing the learned space-time attention using Attention Rollout

Jupyter Notebook 37 8 Updated Apr 1, 2022

MLX: An array framework for Apple silicon

C++ 22,713 1,378 Updated Nov 5, 2025

Jupyter Notebook 139 12 Updated Nov 10, 2024

Collection of AWESOME vision-language models for vision tasks

2,987 222 Updated Oct 14, 2025

A declarative drawing API in Python

Python 298 15 Updated Aug 28, 2024

It is my belief that you, the postgraduate students and job-seekers for whom the book is primarily meant will benefit from reading it; however, it is my hope that even the most experienced research…

4,724 320 Updated Aug 22, 2025

A curated list of awesome prompt/adapter learning methods for vision-language models like CLIP.

698 31 Updated Sep 8, 2025

A PyTorch implementation of NeRF (Neural Radiance Fields) that reproduces the results.

Python 5,940 1,123 Updated Jul 25, 2024

Code and models for the ICML 2024 paper "Tell, Don't Show!: Language Guidance Eases Transfer Across Domains in Images and Videos"

Python 6 1 Updated May 18, 2024

Python 4,371 415 Updated Sep 14, 2025

Code release for "Detecting Twenty-thousand Classes using Image-level Supervision".

Python 1,985 220 Updated Mar 21, 2024

X-MIC: Cross-Modal Instance Conditioning for Egocentric Action Generalization, CVPR 2024

Python 11 Updated Nov 7, 2024