☘️
🪴 🌱

Organizations

@scone-snu

Starred repositories

Python 113 3 Updated Nov 1, 2025

Toward Ambulatory Vision: Learning Visually-Grounded Active View Selection

14 Updated Dec 17, 2025

Vision Bridge Transformer at Scale

Python 126 6 Updated Dec 1, 2025

[NeurIPS 2024] Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation

Python 7,262 726 Updated Jan 22, 2025

G1: Bootstrapping Perception and Reasoning Abilities of Vision-Language Model via Reinforcement Learning

Python 93 2 Updated May 20, 2025

Witness the aha moment of VLM with less than $3.

Python 4,009 289 Updated May 19, 2025

[NIPS'25 Spotlight] Mulberry, an o1-like Reasoning and Reflection MLLM Implemented via Collective MCTS

Python 1,233 110 Updated Sep 19, 2025

A friendly programming language from the future

Haskell 6,445 292 Updated Dec 21, 2025

Stay in flow while building with AI

TypeScript 42 6 Updated Dec 20, 2025

A step-by-step reasoning framework for 3D scene understanding

10 1 Updated Nov 7, 2025

R1-VL: Learning to Reason with Multimodal Large Language Models via Step-wise Group Relative Policy Optimization

Python 444 Updated Dec 16, 2025

OS-Harm: A Benchmark for Measuring Safety of Computer Use Agents [NeurIPS 2025 Spotlight]

Jupyter Notebook 43 1 Updated Sep 18, 2025

Nav-R1: Reasoning and Navigation in Embodied Scenes

Python 85 Updated Oct 31, 2025

Official repo of "Chain-of-Visual-Thought: Teaching VLMs to See and Think Better with Continuous Visual Tokens"

Python 220 9 Updated Dec 9, 2025

Official PyTorch implementation of "Aligned but Stereotypical? The Hidden Influence of System Prompts on Social Bias in LVLM-Based Text-to-Image Models"

Python 12 Updated Dec 5, 2025

[ECCV 2024 Oral] Code for paper: An Image is Worth 1/2 Tokens After Layer 2: Plug-and-Play Inference Acceleration for Large Vision-Language Models

Python 534 22 Updated Jan 4, 2025

[NeurIPS 2025] We propose a first RL-based personalized image captioning framework with well-defined verifiable rewards.

Python 10 Updated Nov 17, 2025

Code for the paper "GRPO is Secretly a Process Reward Model": https://arxiv.org/abs/2509.21154

Python 5 Updated Oct 1, 2025
Python 164 8 Updated Nov 26, 2025

SAM 3D Objects

Python 5,018 463 Updated Dec 16, 2025

The repository provides code for running inference and finetuning with the Meta Segment Anything Model 3 (SAM 3), links for downloading the trained model checkpoints, and example notebooks that sho…

Python 6,283 728 Updated Dec 21, 2025

[Neurips'24 Spotlight] Visual CoT: Advancing Multi-Modal Language Models with a Comprehensive Dataset and Benchmark for Chain-of-Thought Reasoning

Python 413 21 Updated Dec 22, 2024

Codes for Visual Sketchpad: Sketching as a Visual Chain of Thought for Multimodal Language Models

Jupyter Notebook 272 15 Updated Aug 5, 2025

Think with 3D: Geometric Imagination Grounded Spatial Reasoning from Limited Views

Python 110 4 Updated Dec 9, 2025

Official repository for the A-OKVQA dataset

Python 106 14 Updated May 8, 2024
Python 63 8 Updated Feb 3, 2025

Code for 3D-LLM: Injecting the 3D World into Large Language Models

Python 1,164 71 Updated Jun 6, 2024

[CVPR 2023] Code for "3D Concept Learning and Reasoning from Multi-View Images"

Python 84 4 Updated Jan 20, 2024