Stars
Official implementation of our paper "Finetuned Multimodal Language Models are High-Quality Image-Text Data Filters".
Open-source code for Generic Grouping Network (GGN, CVPR 2022)
The official pytorch implementation of our paper "Is Space-Time Attention All You Need for Video Understanding?"
A deep learning library for video understanding research.