Hyun-Bin Oh hyunbin70

3 followers · 3 following

Stars

kaist-ami / HDR-NSFF

[ICLR’26] Official PyTorch Implementation of “HDR-NSFF: High Dynamic Range Neural Scene Flow Fields”

Python 36 Updated Apr 19, 2026

Bizilizi / VGGSounder

VGGSounder, a multi-label audio-visual classification dataset with modality annotations.

Jupyter Notebook 16 Updated Jun 3, 2026

ZGCTroy / RealCam-Vid

Open-source video dataset with dynamic scenes and camera-movement annotations

Python 94 1 Updated Apr 24, 2025

Inception3D / Easi3R

[ICCV 2025] A simple training-free approach adapting DUSt3R for dynamic scenes.

Python 530 26 Updated Apr 1, 2025

NVIDIA / audio-flamingo

PyTorch implementation of Audio Flamingo: Series of Advanced Audio Understanding Language Models

1,144 96 Updated Dec 15, 2025

bytedance / Sa2VA

Official Repo For Pixel-LLM Codebase: Sa2VA (Arxiv-25), SAMTok (CVPR-26), VRT, SaSaSa2VA (1st-place solution for LSVOS)

Python 1,614 118 Updated Jun 15, 2026

BradyFU / Awesome-Multimodal-Large-Language-Models

✨✨Latest Advances on Multimodal Large Language Models

17,899 1,128 Updated Jun 18, 2026

humansensinglab / Hamba

[NeurIPS 2024] Hamba: Single-view 3D Hand Reconstruction with Graph-guided Bi-Scanning Mamba

Python 183 10 Updated Feb 1, 2026

Genesis-Embodied-AI / genesis-world

Simulation platform for general-purpose robotics & embodied AI learning.

Python 29,372 2,786 Updated Jun 17, 2026

hkchengrex / MMAudio

[CVPR 2025] MMAudio: Taming Multimodal Joint Training for High-Quality Video-to-Audio Synthesis

Python 2,210 259 Updated Feb 23, 2026

PKU-YuanGroup / LLaVA-CoT

[ICCV 2025] LLaVA-CoT, a visual language model capable of spontaneous, systematic reasoning

Python 2,137 82 Updated Dec 12, 2025

mega-sam / mega-sam

Code for the project "MegaSaM: Accurate, Fast and Robust Structure and Motion from Casual Dynamic Videos"

Python 1,318 79 Updated Jan 5, 2026

exitudio / MMM

Official repository for "MMM: Generative Masked Motion Model" (CVPR 2024 -- Highlight)

Jupyter Notebook 132 14 Updated Jul 5, 2025

RERV / VDT

[ICLR 2024] The official implementation of the paper "VDT: General-purpose Video Diffusion Transformers via Mask Modeling", by Haoyu Lu, Guoxing Yang, Nanyi Fei, Yuqi Huo, Zhiwu Lu, Ping Luo, Mingyu Ding.

Jupyter Notebook 255 15 Updated May 5, 2024

choijeongsoo / av2av

[CVPR 2024] AV2AV: Direct Audio-Visual Speech to Audio-Visual Speech Translation with Unified Audio-Visual Speech Representation

Python 48 4 Updated Sep 6, 2024

mok0102 / Video-LLaVA

Forked from PKU-YuanGroup/Video-LLaVA

Video-LLaVA: Learning United Visual Representation by Alignment Before Projection

Python 1 Updated May 6, 2024

seuqaj114 / paig

Code for the paper Physics-as-Inverse-Graphics: Joint Unsupervised Learning of Objects and Physics from Video

Python 41 11 Updated May 22, 2023

google-research / kubric

A data generation pipeline for creating semi-realistic synthetic multi-object videos with rich annotations such as instance segmentation masks, depth maps, and optical flow.

Jupyter Notebook 2,757 275 Updated May 21, 2026