daeunni
☘️
Researching for the happiness

[CVPR 2026] Ego2Web: A Web Agent Benchmark Grounded in Egocentric Videos

Python 8 1 Updated Mar 25, 2026

Official Python inference and LoRA trainer package for the LTX-2 audio–video generative model.

Python 5,220 788 Updated Mar 11, 2026
Python 1,673 188 Updated Nov 15, 2025

"Visual Prompt Selection for In-Context Learning Segmentation Framework"

Python 15 1 Updated Dec 13, 2024

[Awesome] 🔥🔥🔥 Latest Papers, Codes and Datasets on Streaming / Online Video Understanding

149 10 Updated Jan 13, 2026

Video-o3: Native Interleaved Clue Seeking for Long Video Multi-Hop Reasoning

Python 137 5 Updated Mar 6, 2026

When and How Much to Imagine: Adaptive Test-Time Scaling with World Models for Visual Spatial Reasoning

Python 13 1 Updated Feb 10, 2026

Track and Caption Any Motion: Query-Free Motion Discovery and Description in Videos

Python 3 Updated Dec 11, 2025

[CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga

Python 146 6 Updated Jan 19, 2026

📖 This is a repository for organizing papers, codes and other resources related to Visual Reinforcement Learning.

423 20 Updated Mar 6, 2026

Code for "StreamGaze: Gaze-Guided Temporal Reasoning and Proactive Understanding in Streaming Videos [CVPR 2026]"

Python 20 1 Updated Feb 21, 2026

Repository for NeurIPS 2025 Paper "Gaze-VLM: Bridging Gaze and VLMs via Attention Regularization for Egocentric Understanding"

Python 6 2 Updated Jan 19, 2026

[NeurIPS'25 Spotlight] ARM: Adaptive Reasoning Model

Python 65 3 Updated Mar 10, 2026

VideoLoom: A Video Large Language Model for Joint Spatial-Temporal Understanding

Python 25 Updated Jan 23, 2026

[NeurIPS 2025 spotlight] QFFT, Question-Free Fine-Tuning for Adaptive Reasoning

Python 91 1 Updated Nov 4, 2025
Jupyter Notebook 68 7 Updated Feb 23, 2026

Strefer: Empowering Video LLMs with Space-Time Referring and Reasoning via Synthetic Instruction Data

Jupyter Notebook 17 Updated Oct 29, 2025
Python 62 5 Updated Feb 27, 2026

A continuously updated project to track the latest progress in the field of multi-modal object tracking. This project focuses solely on single-object tracking.

Jupyter Notebook 1,201 55 Updated Mar 24, 2026

2026 AI/ML internship & new graduate job list updated daily

4,959 198 Updated Mar 25, 2026

[ICLR2026] Spatial Reasoning with Vision-Language Models

Python 41 1 Updated Jan 26, 2026

Official Repository for NeurIPS'25 Paper "Tool-Augmented Spatiotemporal Reasoning for Streamlining Video Question Answering Task"

Python 20 Updated Mar 20, 2026

[ECCV 2022] ByteTrack: Multi-Object Tracking by Associating Every Detection Box

Python 6,190 1,095 Updated Jun 19, 2024

Official implementation of RT-DETRv4: Painlessly Furthering Real-Time Object Detection with Vision Foundation Models

Python 375 40 Updated Jan 9, 2026

[CVPR 2024] Official RT-DETR (RTDETR paddle pytorch), Real-Time DEtection TRansformer, DETRs Beat YOLOs on Real-time Object Detection. 🔥 🔥 🔥

Python 5,003 590 Updated Mar 2, 2026

Spatial-Temporal Knowledge-Embedded Transformer for Video Scene Graph Generation (TIP 2024, ACM MM 2023)

Python 20 2 Updated Mar 13, 2024

Wan: Open and Advanced Large-Scale Video Generative Models

Python 14,859 1,798 Updated Mar 17, 2026

[CVPR 2026] Official Code for "ARM-Thinker: Reinforcing Multimodal Generative Reward Models with Agentic Tool Use and Visual Reasoning"

Python 90 3 Updated Feb 13, 2026

Wan: Open and Advanced Large-Scale Video Generative Models

Python 15,680 2,494 Updated Mar 5, 2026