Stars
[WACV'25 Oral] Fine-Tuning Image-Conditional Diffusion Models is Easier than You Think
Code for on-the-fly creation of pseudo video datasets as described in "How Important are Videos for Training Video LLMs?"
Updated Jun 11, 2025
Loss Functions in the Era of Semantic Segmentation: A Survey and Outlook
Mask4Former: Mask Transformer for 4D Panoptic Segmentation
[ICCV 2025] DONUT: A Decoder-Only Model for Trajectory Prediction
Estimate absolute 3D human poses from RGB images.
A logging handler allowing logging messages to be sent to Telegram.