jihoojung0106

🏠

Working from home

JIHOO JUNG jihoojung0106

1 follower · 3 following

Lists (4)

Stars

itsqyh / Awesome-LMMs-Mechanistic-Interpretability

A curated collection of resources focused on the Mechanistic Interpretability (MI) of Large Multimodal Models (LMMs). This repository aggregates surveys, blog posts, and research papers that explor…

170 3 Updated Oct 20, 2025

kyegomez / Vit-RGTS

Open source implementation of "Vision Transformers Need Registers"

Python 201 19 Updated Oct 20, 2025

IDEA-Research / Grounded-Segment-Anything

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 17,256 1,564 Updated Sep 5, 2024

facebookresearch / sam2

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 18,052 2,284 Updated Dec 25, 2024

hkchengrex / Tracking-Anything-with-DEVA

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation

Python 1,467 137 Updated Apr 26, 2025

TencentARC / Video-Holmes

Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?

Python 82 2 Updated Jul 13, 2025

HumanMLLM / HumanOmniV2

Python 143 9 Updated Jul 31, 2025

z-x-yang / Segment-and-Track-Anything

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…

Jupyter Notebook 3,089 354 Updated Apr 25, 2024

facebookresearch / DePALM

Code accompanying our paper "Improved Baselines for Data-efficient Perceptual Augmentation of LLMs"

Python 3 Updated May 17, 2024

clemneo / llava-interp

Python 76 6 Updated Nov 5, 2024

kmeng01 / rome

Locating and editing factual associations in GPT (NeurIPS 2022)

Python 709 151 Updated Apr 20, 2024

OmriKaduri / vlm-interp

Code for paper: "What’s in the Image? A Deep-Dive into the Vision of Vision Language Models" (CVPR 2025)

Python 14 3 Updated May 1, 2025

bytedance / vidi

The official repo for "Vidi: Large Multimodal Models for Video Understanding and Editing"

Python 535 33 Updated Dec 11, 2025

LzVv123456 / VISTA

Python 64 4 Updated Jul 28, 2025

Everlyn-Labs / ANTRP

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

Python 163 42 Updated Mar 12, 2025

mhamilton723 / DenseAV

Official code for the CVPR 2024 Paper: Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language

Jupyter Notebook 85 13 Updated Jun 12, 2024

SAIC-MONTREAL / SAGE

Smart home Agent with Grounded Execution

Python 27 6 Updated Jul 22, 2024

sail-sg / Attention-Sink

[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)

Python 146 5 Updated Jul 8, 2025

mshukor / ima-lmms

[NeurIPS2024] Official code for (IMA) Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs

Python 22 4 Updated Oct 15, 2024

lorenzobasile / HeadPursuit

Code for the paper "Head Pursuit: Probing Attention Specialization in Multimodal Transformers" [NeurIPS 2025 spotlight]

Python 5 Updated Dec 4, 2025

Z1zs / MMNeuron

Official implementation of "MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model". Our codes are borrowed from Tang's language specific neurons imple…

Python 25 1 Updated Dec 20, 2024