- A wizard who makes meals vanish
- Tokyo, 42nd Ward
- (UTC +09:00)
- http://kdplus.github.io/
- in/wyuxi
Highlights
- Pro
Starred repositories
RF-DETR is a real-time object detection and segmentation model architecture developed by Roboflow, SOTA on COCO and designed for fine-tuning.
Anki is a smart spaced repetition flashcard program
[ICCV 2025] Implementation of the paper "Q-Frame: Query-aware Frame Selection and Multi-Resolution Adaptation for Video-LLMs"
This repo contains the official implementation of the ICLR 2024 paper "Is ImageNet worth 1 video? Learning strong image encoders from 1 long unlabelled video"
End-to-End, Single-Stream Temporal Action Detection in Untrimmed Videos (Official Repo for SS-TAD)
[ECCV 2024] "Elucidating the Hierarchical Nature of Behavior with Masked Autoencoders"
A unified inference and post-training framework for accelerated video generation.
The official PyTorch implementation of the IEEE/CVF Computer Vision and Pattern Recognition (CVPR) '24 paper PREGO: online mistake detection in PRocedural EGOcentric videos.
[ICLR2025] A versatile image-to-image visual assistant, designed for image generation, manipulation, and translation based on free-form user instructions.
[ECCV 2024 & NeurIPS 2024] Official implementation of the paper TAPTR & TAPTRv2 & TAPTRv3
[CVPR 2024] Official implementation of "VRP-SAM: SAM with Visual Reference Prompt"
[ICLR 2024] Official Codebase for "InstructCV: Instruction-Tuned Text-to-Image Diffusion Models as Vision Generalists"
Painter & SegGPT Series: Vision Foundation Models from BAAI
(2024CVPR) MA-LMM: Memory-Augmented Large Multimodal Model for Long-Term Video Understanding
Code for Diffusion Action Segmentation (ICCV 2023)
The official implementation of Error Detection in Egocentric Procedural Task Videos
Temporal Action Detection & Weakly Supervised Temporal Action Detection & Temporal Action Proposal Generation
[CVPR 2024] Guided Slot Attention for Unsupervised Video Object Segmentation
[CVPR 2024] MovieChat: From Dense Token to Sparse Memory for Long Video Understanding
Implementation of "Similarity Contrastive Estimation for Self-Supervised Soft Contrastive Learning" WACV 2023.
SLIC: Self-Supervised Learning with Iterative Clustering for Human Action Videos [CVPR 2022]
[CVPR2025] Number it: Temporal Grounding Videos like Flipping Manga
Production-ready platform for agentic workflow development.
CoTracker is a model for tracking any point (pixel) on a video.
LiveBench: A Challenging, Contamination-Free LLM Benchmark
[arXiv:2309.16669] Code release for "Training a Large Video Model on a Single Machine in a Day"