{"status":"ok","message-type":"work","message-version":"1.0.0","message":{"indexed":{"date-parts":[[2025,11,5]],"date-time":"2025-11-05T06:35:31Z","timestamp":1762324531243},"publisher-location":"California","reference-count":0,"publisher":"International Joint Conferences on Artificial Intelligence Organization","content-domain":{"domain":[],"crossmark-restriction":false},"short-container-title":[],"published-print":{"date-parts":[[2019,8]]},"abstract":"<jats:p>Visual focus of attention in multi-person discussions is a crucial nonverbal indicator in tasks such as inter-personal relation inference, speech transcription, and deception detection. However, predicting the focus of attention remains a challenge because the focus changes rapidly, the discussions are highly dynamic, and the people's behaviors are inter-dependent. Here we propose ICAF (Iterative Collective Attention Focus), a collective classification model to jointly learn the visual focus of attention of all people. Every person is modeled using a separate classifier. ICAF models the people collectively---the predictions of all other people's classifiers are used as inputs to each person's classifier. This explicitly incorporates inter-dependencies between all people's behaviors. We evaluate ICAF on a novel dataset of 5 videos (35 people, 109 minutes, 7604 labels in all) of the popular Resistance game and a widely-studied meeting dataset with supervised prediction. See our demo at https:\/\/cs.dartmouth.edu\/dsail\/demos\/icaf. ICAF outperforms the strongest baseline by 1%--5% accuracy in predicting the people's visual focus of attention. Further, we propose a lightly supervised technique to train models in the absence of training labels. We show that lightly supervised ICAF performs similarly to supervised ICAF, thus showing its effectiveness and generality on previously unseen videos.<\/jats:p>","DOI":"10.24963\/ijcai.2019\/626","type":"proceedings-article","created":{"date-parts":[[2019,7,28]],"date-time":"2019-07-28T07:46:05Z","timestamp":1564299965000},"page":"4504-4510","source":"Crossref","is-referenced-by-count":20,"title":["Predicting the Visual Focus of Attention in Multi-Person Discussion Videos"],"prefix":"10.24963","author":[{"given":"Chongyang","family":"Bai","sequence":"first","affiliation":[{"name":"Dartmouth College"}]},{"given":"Srijan","family":"Kumar","sequence":"additional","affiliation":[{"name":"Stanford University"},{"name":"Georgia Institute of Technology"}]},{"given":"Jure","family":"Leskovec","sequence":"additional","affiliation":[{"name":"Stanford University"}]},{"given":"Miriam","family":"Metzger","sequence":"additional","affiliation":[{"name":"University of California Santa Barbara"}]},{"given":"Jay F.","family":"Nunamaker","sequence":"additional","affiliation":[{"name":"University of Arizona"}]},{"given":"V. S.","family":"Subrahmanian","sequence":"additional","affiliation":[{"name":"Dartmouth College"}]}],"member":"10584","event":{"number":"28","sponsor":["International Joint Conferences on Artificial Intelligence Organization (IJCAI)"],"acronym":"IJCAI-2019","name":"Twenty-Eighth International Joint Conference on Artificial Intelligence {IJCAI-19}","start":{"date-parts":[[2019,8,10]]},"theme":"Artificial Intelligence","location":"Macao, China","end":{"date-parts":[[2019,8,16]]}},"container-title":["Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence"],"original-title":[],"deposited":{"date-parts":[[2019,7,28]],"date-time":"2019-07-28T07:50:33Z","timestamp":1564300233000},"score":1,"resource":{"primary":{"URL":"https:\/\/www.ijcai.org\/proceedings\/2019\/626"}},"subtitle":[],"proceedings-subject":"Artificial Intelligence Research Articles","short-title":[],"issued":{"date-parts":[[2019,8]]},"references-count":0,"URL":"https:\/\/doi.org\/10.24963\/ijcai.2019\/626","relation":{},"subject":[],"published":{"date-parts":[[2019,8]]}}}