{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,9,24]],"date-time":"2025-09-24T00:14:58Z","timestamp":1758672898241,"version":"3.44.0"},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2025,9]]},"abstract":"<jats:p>3D visual grounding aims to localize target objects in point clouds based on free-form natural language, which often describes both target and reference objects. Effective alignment between visual and text features is crucial for this task. However, existing two-stage methods that rely solely on object-level features can yield suboptimal accuracy, while one-stage methods that align only point-level features can be prone to noise. In this paper, we propose DDPA-3DVG, a novel framework that progressively aligns visual locations and language descriptions at multiple granularities. Specifically, we decouple natural language descriptions into distinct representations of target objects, reference objects, and their mutual relationships, while disentangling 3D scenes into object-level, voxel-level, and point-level features. By progressively fusing these dual-decoupled features from coarse to fine, our method enhances cross-modal alignment and achieves state-of-the-art performance on three challenging benchmarks\u2014ScanRefer, Nr3D, and Sr3D. The code will be released at https:\/\/github.com\/HDU-VRLab\/DDPA-3DVG.<\/jats:p>","DOI":"10.24963\/ijcai.2025\/117","type":"proceedings-article","created":{"date-parts":[[2025,9,19]],"date-time":"2025-09-19T08:10:40Z","timestamp":1758269440000},"page":"1044-1052","source":"Crossref","is-referenced-by-count":0,"title":["DDPA-3DVG: Vision-Language Dual-Decoupling and Progressive Alignment for 3D Visual Grounding"],"prefix":"10.24963","author":[{"given":"Hongjie","family":"Gu","sequence":"first","affiliation":[{"name":"Hangzhou Dianzi University"}]},{"given":"Jinlong","family":"Fan","sequence":"additional","affiliation":[{"name":"Hangzhou Dianzi University"}]},{"given":"Liang","family":"Zheng","sequence":"additional","affiliation":[{"name":"Hangzhou Dianzi University"}]},{"given":"Jing","family":"Zhang","sequence":"additional","affiliation":[{"name":"Wuhan University"}]},{"given":"Yuxiang","family":"Yang","sequence":"additional","affiliation":[{"name":"Hangzhou Dianzi University"}]}],"member":"10584","event":{"number":"34","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-2025","name":"Thirty-Fourth International Joint Conference on Artificial Intelligence {IJCAI-25}","start":{"date-parts":[[2025,8,16]]},"theme":"Artificial Intelligence","location":"Montreal, Canada","end":{"date-parts":[[2025,8,22]]}},"container-title":["Proceedings of the Thirty-Fourth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2025,9,23]],"date-time":"2025-09-23T11:33:02Z","timestamp":1758627182000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2025\/117"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2025,9]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2025\/117","relation":{},"subject":[],"published":{"date-parts":[[2025,9]]}}}