research-article

Tapestry of Time and Actions: Modeling Human Activity Sequences Using Temporal Point Process Flows

Published: 15 April 2024

Abstract

Human beings engage in a vast range of activities and tasks that demonstrate their ability to adapt to different scenarios. These activities range from the simplest daily routines, like walking and sitting, to complex multi-level endeavors such as cooking a four-course meal. Any human activity can be represented as a temporal sequence of actions performed to achieve a certain goal. Unlike time series datasets extracted from electronics or machines, these action sequences are highly disparate in nature: the time taken to finish a sequence of actions can vary from person to person. Therefore, understanding the dynamics of these sequences is essential for many downstream tasks such as activity length prediction, goal prediction, and next action recommendation. Existing neural network based approaches that learn a continuous-time activity sequence are either restricted to visual data or designed for one particular task (i.e., limited to next action or goal prediction). In this article, we present ProActive, a neural marked temporal point process framework for modeling the continuous-time distribution of actions in an activity sequence while simultaneously addressing three high-impact problems: next action prediction, sequence goal prediction, and end-to-end sequence generation. Specifically, we utilize a self-attention module with temporal normalizing flows to model the influence and the inter-arrival times between actions in a sequence. Moreover, for time-sensitive prediction, we perform early detection of the sequence goal via a constrained margin-based optimization procedure. This in turn allows ProActive to predict the sequence goal using a limited number of actions. In addition, we propose a novel extension of the ProActive model, called ProActive++, that can handle variations in the order of actions (i.e., different methods of achieving a given goal). We demonstrate that this variant can learn the order in which the person or actor prefers to perform their actions. Extensive experiments on sequences derived from three activity recognition datasets show a significant accuracy boost for ProActive and ProActive++ over the state of the art in action and goal prediction, along with the first-ever application of end-to-end action sequence generation.
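
To make the idea of a temporal normalizing flow over inter-arrival times concrete, the following is a minimal sketch and not the ProActive implementation. It assumes the simplest possible flow, a log-normal one, in which a standard Gaussian sample z is mapped to a positive gap tau = exp(mu + sigma * z). The function names and the fixed `mu` and `sigma` are illustrative assumptions; in a model like the one described above, such parameters would presumably be produced by the self-attention module from the action history.

```python
import math
import random


def log_density(tau, mu, sigma):
    """Log-density of an inter-arrival time tau > 0 under a log-normal flow.

    Flow (an assumption for illustration): z ~ N(0, 1), tau = exp(mu + sigma * z).
    Change of variables gives
        log p(tau) = log N(z; 0, 1) - log(sigma) - log(tau),
    where z = (log(tau) - mu) / sigma is the inverse of the flow.
    """
    if tau <= 0 or sigma <= 0:
        raise ValueError("tau and sigma must be positive")
    z = (math.log(tau) - mu) / sigma
    log_base = -0.5 * (z * z + math.log(2.0 * math.pi))
    return log_base - math.log(sigma) - math.log(tau)


def sample_gap(mu, sigma, rng):
    """Draw one inter-arrival time by pushing a Gaussian sample through the flow."""
    return math.exp(mu + sigma * rng.gauss(0.0, 1.0))


def sequence_nll(timestamps, mu, sigma):
    """Negative log-likelihood of the gaps between consecutive action times.

    mu and sigma are held constant here (hypothetical simplification); a
    history-dependent model would supply a different pair for every gap.
    """
    gaps = [later - earlier for earlier, later in zip(timestamps, timestamps[1:])]
    return -sum(log_density(gap, mu, sigma) for gap in gaps)


if __name__ == "__main__":
    rng = random.Random(0)
    times = [0.0]
    for _ in range(5):
        times.append(times[-1] + sample_gap(0.0, 0.5, rng))
    print("sampled action times:", [round(t, 3) for t in times])
    print("sequence NLL:", round(sequence_nll(times, 0.0, 0.5), 3))
```

Because the flow is invertible, the same pair of parameters supports both likelihood evaluation (for training) and sampling (for sequence generation), which is the property that makes flows attractive for modeling inter-arrival times.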

Cited By

  • (2025) Daily activity-travel pattern identification using natural language processing and semantic matching. Journal of Transport Geography 122 (104057). DOI: 10.1016/j.jtrangeo.2024.104057. Online publication date: Jan-2025

    Published In

    ACM Transactions on Intelligent Systems and Technology  Volume 15, Issue 3
    June 2024
    646 pages
    EISSN:2157-6912
    DOI:10.1145/3613609
    Editor: Huan Liu

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 15 April 2024
    Online AM: 29 February 2024
    Accepted: 07 February 2024
    Revised: 22 January 2024
    Received: 26 July 2023
    Published in TIST Volume 15, Issue 3

    Author Tags

    1. Marked temporal point process
    2. continuous-time sequences
    3. activity modeling
    4. goal prediction
    5. sequence generation

    Qualifiers

    • Research-article

    Funding Sources

    • DS Chair of AI fellowship to Srikanta Bedathur
