18th ECCV 2024: Milan, Italy - Part VIII
- Ales Leonardis, Elisa Ricci, Stefan Roth, Olga Russakovsky, Torsten Sattler, Gül Varol:
Computer Vision - ECCV 2024 - 18th European Conference, Milan, Italy, September 29-October 4, 2024, Proceedings, Part VIII. Lecture Notes in Computer Science 15066, Springer 2025, ISBN 978-3-031-73241-6
- Mattia Segù, Luigi Piccinelli, Siyuan Li, Luc Van Gool, Fisher Yu, Bernt Schiele:
Walker: Self-supervised Multiple Object Tracking by Walking on Temporal Appearance Graphs. 1-18
- Sumin Lee, Yooseung Wang, Sangmin Woo, Changick Kim:
Spatio-Temporal Proximity-Aware Dual-Path Model for Panoramic Activity Recognition. 19-36
- Ali Hatamizadeh, Jiaming Song, Guilin Liu, Jan Kautz, Arash Vahdat:
DiffiT: Diffusion Vision Transformers for Image Generation. 37-55
- Zirui Shao, Feiyu Gao, Hangdi Xing, Zepeng Zhu, Zhi Yu, Jiajun Bu, Qi Zheng, Cong Yao:
WebRPG: Automatic Web Rendering Parameters Generation for Visual Presentation. 56-74
- Changshuo Wang, Meiqing Wu, Siew-Kei Lam, Xin Ning, Shangshu Yu, Ruiping Wang, Weijun Li, Thambipillai Srikanthan:
GPSFormer: A Global Perception and Local Structure Fitting-Based Transformer for Point Cloud Understanding. 75-92
- Ke Fan, Junshu Tang, Weijian Cao, Ran Yi, Moran Li, Jingyu Gong, Jiangning Zhang, Yabiao Wang, Chengjie Wang, Lizhuang Ma:
FreeMotion: A Unified Framework for Number-Free Text-to-Motion Synthesis. 93-109
- Zheng Jiang, Jinqing Zhang, Yanan Zhang, Qingjie Liu, Zhenghui Hu, Baohui Wang, Yunhong Wang:
FSD-BEV: Foreground Self-distillation for Multi-view 3D Object Detection. 110-126
- Yang Miao, Francis Engelmann, Olga Vysotska, Federico Tombari, Marc Pollefeys, Dániel Béla Baráth:
SceneGraphLoc: Cross-Modal Coarse Visual Localization on 3D Scene Graphs. 127-150
- Chenming Zhu, Tai Wang, Wenwei Zhang, Kai Chen, Xihui Liu:
ScanReason: Empowering 3D Visual Grounding with Reasoning Capabilities. 151-168
- Renrui Zhang, Dongzhi Jiang, Yichi Zhang, Haokun Lin, Ziyu Guo, Pengshuo Qiu, Aojun Zhou, Pan Lu, Kai-Wei Chang, Yu Qiao, Peng Gao, Hongsheng Li:
MATHVERSE: Does Your Multi-modal LLM Truly See the Diagrams in Visual Math Problems? 169-186
- Zhonghan Zhao, Wenhao Chai, Xuan Wang, Li Boyi, Shengyu Hao, Shidong Cao, Tian Ye, Gaoang Wang:
See and Think: Embodied Agent in Virtual Environment. 187-204
- Guangcheng Chen, Yicheng He, Li He, Hong Zhang:
PISR: Polarimetric Neural Implicit Surface Reconstruction for Textureless and Specular Objects. 205-222
- Xinpeng Liu, Yong-Lu Li, Ailing Zeng, Zizheng Zhou, Yang You, Cewu Lu:
Bridging the Gap Between Human Motion and Action Semantics via Kinematic Phrases. 223-240
- Ofir Abramovich, Niv Nayman, Sharon Fogel, Inbal Lavi, Ron Litman, Shahar Tsiper, Royee Tichauer, Srikar Appalaraju, Shai Mazor, R. Manmatha:
VisFocus: Prompt-Guided Vision Encoders for OCR-Free Dense Document Understanding. 241-259
- Zhihao Li, Biao Hou, Siteng Ma, Zitong Wu, Xianpeng Guo, Bo Ren, Licheng Jiao:
Masked Angle-Aware Autoencoder for Remote Sensing Images. 260-278
- Yi Wu, Ziqiang Li, Heliang Zheng, Chaoyue Wang, Bin Li:
Infinite-ID: Identity-Preserved Personalization via ID-Semantics Decoupling Paradigm. 279-296
- Zhi-Fan Wu, Lianghua Huang, Wei Wang, Yanheng Wei, Yu Liu:
MultiGen: Zero-Shot Image Generation from Multi-modal Prompts. 297-313
- Xianyu Chen, Ming Jiang, Qi Zhao:
GazeXplain: Learning to Predict Natural Language Explanations of Visual Scanpaths. 314-333
- Yifeng Zhang, Ming Jiang, Qi Zhao:
Learning Chain of Counterfactual Thought for Bias-Robust Vision-Language Reasoning. 334-351
- Hanrong Ye, Jason Kuen, Qing Liu, Zhe Lin, Brian L. Price, Dan Xu:
SegGen: Supercharging Segmentation Models with Text2Mask and Mask2Img Synthesis. 352-370
- Ishan Rajendrakumar Dave, Fabian Caba Heilbron, Mubarak Shah, Simon Jenni:
Sync from the Sea: Retrieving Alignable Videos from Large-Scale Datasets. 371-388
- Ishan Rajendrakumar Dave, Mamshad Nayeem Rizve, Mubarak Shah:
FinePseudo: Improving Pseudo-labelling Through Temporal-Alignablity for Semi-supervised Fine-Grained Action Recognition. 389-408
- Yu Liu, Fatimah Binti Khalid, Lei Wang, Youxi Zhang, Cunrui Wang:
Elegantly Written: Disentangling Writer and Character Styles for Enhancing Online Chinese Handwriting. 409-425
- Sipeng Zheng, Bohan Zhou, Yicheng Feng, Ye Wang, Zongqing Lu:
UniCode: Learning a Unified Codebook for Multimodal Large Language Models. 426-443
- Baifeng Shi, Ziyang Wu, Maolin Mao, Xin Wang, Trevor Darrell:
When Do We Not Need Larger Vision Models? 444-462
- Xianglong He, Junyi Chen, Sida Peng, Di Huang, Yangguang Li, Xiaoshui Huang, Chun Yuan, Wanli Ouyang, Tong He:
GVGEN: Text-to-3D Generation with Volumetric Representation. 463-479
- Zhening Liu, Xinjie Zhang, Jiawei Shao, Zehong Lin, Jun Zhang:
Bidirectional Stereo Image Compression with Cross-Dimensional Entropy Model. 480-496