jihoojung0106
🏠
Working from home


A curated collection of resources focused on the Mechanistic Interpretability (MI) of Large Multimodal Models (LMMs). This repository aggregates surveys, blog posts, and research papers that explor…

170 3 Updated Oct 20, 2025

Open source implementation of "Vision Transformers Need Registers"

Python 201 19 Updated Oct 20, 2025

Grounded SAM: Marrying Grounding DINO with Segment Anything & Stable Diffusion & Recognize Anything - Automatically Detect, Segment and Generate Anything

Jupyter Notebook 17,256 1,564 Updated Sep 5, 2024

The repository provides code for running inference with the Meta Segment Anything Model 2 (SAM 2), links for downloading the trained model checkpoints, and example notebooks that show how to use th…

Jupyter Notebook 18,052 2,284 Updated Dec 25, 2024

[ICCV 2023] Tracking Anything with Decoupled Video Segmentation

Python 1,467 137 Updated Apr 26, 2025

Video-Holmes: Can MLLM Think Like Holmes for Complex Video Reasoning?

Python 82 2 Updated Jul 13, 2025
Python 143 9 Updated Jul 31, 2025

An open-source project dedicated to tracking and segmenting any objects in videos, either automatically or interactively. The primary algorithms utilized include the Segment Anything Model (SAM) fo…

Jupyter Notebook 3,089 354 Updated Apr 25, 2024

Code accompanying our paper "Improved Baselines for Data-efficient Perceptual Augmentation of LLMs"

Python 3 Updated May 17, 2024
Python 76 6 Updated Nov 5, 2024

Locating and editing factual associations in GPT (NeurIPS 2022)

Python 709 151 Updated Apr 20, 2024

Code for paper: "What’s in the Image? A Deep-Dive into the Vision of Vision Language Models" (CVPR 2025)

Python 14 3 Updated May 1, 2025

The official repo for "Vidi: Large Multimodal Models for Video Understanding and Editing"

Python 535 33 Updated Dec 11, 2025
Python 64 4 Updated Jul 28, 2025

Intervening Anchor Token: Decoding Strategy in Alleviating Hallucinations for MLLMs

Python 163 42 Updated Mar 12, 2025

Official code for the CVPR 2024 Paper: Separating the "Chirp" from the "Chat": Self-supervised Visual Grounding of Sound and Language

Jupyter Notebook 85 13 Updated Jun 12, 2024

Smart home Agent with Grounded Execution

Python 27 6 Updated Jul 22, 2024

[ICLR 2025] When Attention Sink Emerges in Language Models: An Empirical View (Spotlight)

Python 146 5 Updated Jul 8, 2025

[NeurIPS2024] Official code for (IMA) Implicit Multimodal Alignment: On the Generalization of Frozen LLMs to Multimodal Inputs

Python 22 4 Updated Oct 15, 2024

Code for the paper "Head Pursuit: Probing Attention Specialization in Multimodal Transformers" [NeurIPS 2025 spotlight]

Python 5 Updated Dec 4, 2025

Official implementation of "MMNeuron: Discovering Neuron-Level Domain-Specific Interpretation in Multimodal Large Language Model". Our codes are borrowed from Tang's language specific neurons imple…

Python 25 1 Updated Dec 20, 2024

Code for "Finding and Editing Multi-Modal Neurons in Pre-Trained Transformers" (Findings of ACL 2024)

Python 11 2 Updated Aug 28, 2025
Python 9 Updated Nov 15, 2024

[ICLR'25] Official code for "Can Video LLMs Refuse to Answer? Alignment for Answerability in Video Large Language Models"

Python 34 2 Updated May 10, 2025

Code for the paper: Discover-then-Name: Task-Agnostic Concept Bottlenecks via Automated Concept Discovery. ECCV 2024.

Python 52 7 Updated Nov 3, 2024

A framework that allows you to apply Sparse Autoencoders to any model

Python 48 1 Updated Jul 11, 2025

[ICCV 2025] Auto Interpretation Pipeline and many other functionalities for Multimodal SAE Analysis.

Python 169 10 Updated Sep 26, 2025

Code for reproducing our paper "Not All Language Model Features Are Linear"

Jupyter Notebook 83 10 Updated Nov 27, 2024
Jupyter Notebook 112 12 Updated Feb 11, 2025

A curated list of resources for activation engineering

119 4 Updated Oct 2, 2025