Stars
[NeurIPS 2022 Spotlight] VideoMAE: Masked Autoencoders are Data-Efficient Learners for Self-Supervised Video Pre-Training
Code release for "Making a Bird AI Expert Work for You and Me (TPAMI 2023)".
The official implementation of 'Weakly Supervised Temporal Action Localization via Representative Snippet Knowledge Propagation' (CVPR 2022)
[TPAMI 2024] Official repo of "ETPNav: Evolving Topological Planning for Vision-Language Navigation in Continuous Environments"
Official PyTorch implementation of the paper "Revisiting Temporal Modeling for CLIP-based Image-to-Video Knowledge Transferring"
[NeurIPS 2022] Embracing Consistency: A One-Stage Approach for Spatio-Temporal Video Grounding
Foreground-Action Consistency Network for Weakly Supervised Temporal Action Localization
[CVPR'22 Oral] Temporal Alignment Networks for Long-term Video. Tengda Han, Weidi Xie, Andrew Zisserman.
PyTorch code for "Self-Supervised Predictive Learning: A Negative-Free Method for Sound Source Localization in Visual Scenes" (CVPR, 2022)
A human-annotated, fine-grained dataset for Vision-and-Language Navigation
Code of our ICIP 2021 paper CMF: Cascaded Multi-model Fusion for Referring Image Segmentation
Code release for Hu et al., "Language-Conditioned Graph Networks for Relational Reasoning", ICCV 2019
PyTorch Implementation for "Asymmetric Cross-Guided Attention Network for Actor and Action Video Segmentation From Natural Language Query"
IEEE TNNLS 2021, transformer, multi-graph transformer, graph, graph classification, sketch recognition, sketch classification, free-hand sketch, official code of the paper "Multi-Graph Transformer …
Faster-RCNN in Tensorflow
OpenMMLab Detection Toolbox and Benchmark
A large-scale edge-map dataset including 290,281 edge-maps corresponding to 345 object categories of the QuickDraw dataset.
PyTorch implementation of "Blazingly Fast Video Object Segmentation with Pixel-Wise Metric Learning" (Chen et al.)
jianhua2022 / MyDeepMetric
Forked from czyczyyzc/MyDeepMetric
An implementation of "Semantic Instance Segmentation via Deep Metric Learning" with TensorFlow
Auto-encoder Based Data Clustering Toolkit
Mask data and code for 'Mask-guided Contrastive Attention Model for Person Re-Identification' (CVPR-2018)
Dynamic Multimodal Instance Segmentation Guided by Natural Language Queries, ECCV 2018
A Neural Net Training Interface on TensorFlow, with focus on speed + flexibility
Demo code of the paper "Fast and Accurate Online Video Object Segmentation via Tracking Parts", in CVPR 2018
ESPNet: Efficient Spatial Pyramid of Dilated Convolutions for Semantic Segmentation