DEL is a novel framework for dense semantic action localization in long, untrimmed real-world videos, aiming to accurately detect and classify multiple, potentially overlapping actions at fine-grai…
Updated Sep 6, 2025 -
UnAV_yolyolVA Public
Forked from ttgeng233/UnAV
Dense-Localizing Audio-Visual Events in Untrimmed Videos: A Large-Scale Benchmark and Baseline (CVPR 2023)
Python MIT License Updated Jan 9, 2025 -
FILS Public
FILS: Self-Supervised Video Feature Prediction In Semantic Language Space
Python MIT License Updated Jun 16, 2024 -
AVION Public
Forked from zhaoyue-zephyrus/AVION
Code release for "Training a Large Video Model on a Single Machine in a Day"
Python MIT License Updated Jun 16, 2024 -
SSVLI Public
Semantic Self-supervised Video Understanding Using Language Information
-
MOFO Public
The main contribution is to make self-supervised video representation learning more meaningful by raising awareness of motion data
-
scenic Public
Forked from google-research/scenic
Scenic: A Jax Library for Computer Vision Research and Beyond
Jupyter Notebook Apache License 2.0 Updated Sep 1, 2023 -
VideoMAE Public
Forked from MCG-NJU/VideoMAE
[NeurIPS 2022] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Python Other Updated Aug 28, 2023 -
MCL-Motion-Focused-Contrastive-Learning Public
Forked from YihengZhang-CV/MCL-Motion-Focused-Contrastive-Learning
Python Other Updated Mar 3, 2023