Starred repositories
OpenTAD is an open-source temporal action detection (TAD) toolbox based on PyTorch.
[CVPR 2024] Adapting Short-Term Transformers for Action Detection in Untrimmed Videos
[ECCV 2024] Official PyTorch implementation of "Classification Matters: Improving Video Action Detection with Class-Specific Attention"
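🔥🔥DatalinkX: a data synchronization system for heterogeneous data sources. It supports incremental or full synchronization of massive data volumes and data flow between sources such as HTTP, Oracle, MySQL, and ES, with intermediate transform operators such as SQL operators and large-model operators. Built on the Flink and Seatunnel engines, it provides flow task management, cascaded task configuration, task log collection, and other features🔥🔥
[No longer maintained; please use note286/xduts] XeLaTeX template for Xidian University graduate theses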
[WACV 2025] Exploiting VLM Localizability and Semantics for Open Vocabulary Action Detection
Kimi-Audio, an open-source audio foundation model excelling in audio understanding, generation, and conversation
Listen to Look: Action Recognition by Previewing Audio (CVPR 2020)
Code for the 2021 paper "Audio-Based Fine-Grained Classroom Activity Detection with Neural Networks"
[Tiny VAD] SG-VAD: Stochastic Gates Based Speech Activity Detection
[NeurIPS 2022 Spotlight] VideoMAE for Action Detection
[ICASSP 2024] Official code for Slowfast Network for Continuous Sign Language Recognition
[NeurIPS 2023] Official implementation of the paper "CAST: Cross-Attention in Space and Time for Video Action Recognition"
Temporal Action Localization with Cross Layer Task Decoupling and Refinement
Official Implementation of our WACV2023 paper: “Holistic Interaction Transformer Network for Action Detection”
Improving Mamba performance on video understanding tasks
[CVPR 2023] VideoMAE V2: Scaling Video Masked Autoencoders with Dual Masking
[ECCV 2024] Official implementation of Crowd-SAM: SAM as a Smart Annotator for Object Detection in Crowded Scenes
[ICCV 2023] Efficient Video Action Detection with Token Dropout and Context Refinement
[CVPR 2021] Actor-Context-Actor Relation Network for Spatio-temporal Action Localization
Spatio-Temporal Action Localization System
Hiera: A fast, powerful, and simple hierarchical vision transformer.
Code repository for the paper "On the Benefits of 3D Pose and Tracking for Human Action Recognition", (CVPR 2023)
We have implemented Track #1 for ICME 2024: Spatial Action Localization on the Chaotic World dataset. Our mAP on the validation set reaches 26.62%, and if we directly use officially provided chaos_tes…
Context-based Dialogue Act Recognition using Recurrent Neural Networks
Switchboard Dialog Act Corpus with Penn Treebank links