skip to main content
research-article

Isomorphic Graph Embedding for Progressive Maximal Frequent Subgraph Mining

Published: 19 December 2023 Publication History

Abstract

Maximal frequent subgraph mining (MFSM) is the task of mining only maximal frequent subgraphs, i.e., subgraphs that are not a part of other frequent subgraphs. Although many intelligent systems require MFSM, MFSM is challenging compared to frequent subgraph mining (FSM), as maximal frequent subgraphs lie in the middle of graph lattice, and FSM algorithms must explore an exponential space and an NP-hard subroutine of frequency counting. Different from prior research, which primarily focused on optimal solutions, we introduce pmMine, a progressive graph neural framework designed for MFSM in a single large graph to attain an approximate solution. The framework combines isomorphic graph embedding, non-parametric partitioning, and an efficiently top-down pattern searching strategy. The critical insight that makes pmMine work is to define the concepts of rooted subgraph and isomorphic graph embedding, in which the costly isomorphism subroutine can be efficiently performed using similarity estimation in embedding space. In addition, pmMine returns the patterns identified during the mining process in a progressive manner. We validate the efficiency and effectiveness of our technique through extensive experiments on a variety of datasets spanning various domains.

References

[1]
Ehab Abdelhamid, Ibrahim Abdelaziz, Panos Kalnis, Zuhair Khayyat, and Fuad Jamour. 2016. ScaleMine: Scalable parallel frequent subgraph mining in a single large graph. In SC. 716–727.
[2]
Takuya Akiba, Shuji Suzuki, and Keisuke Fukuda. 2017. Extremely large minibatch SGD: Training ResNet-50 on ImageNet in 15 minutes. arXiv preprint arXiv:1711.04325 (2017).
[3]
Francesco Bonchi, Carlos Castillo, Aristides Gionis, and Alejandro Jaimes. 2011. Social network analysis and mining for business applications. ACM Transactions on Intelligent Systems and Technology (TIST) 2, 3 (2011), 1–37.
[4]
Hongyun Cai, Vincent W. Zheng, and Kevin Chen-Chuan Chang. 2018. A comprehensive survey of graph embedding: Problems, techniques, and applications. TKDE 30, 9 (2018).
[5]
Siming Chen, Shuai Chen, Zhenhuang Wang, Jie Liang, Yadong Wu, and Xiaoru Yuan. 2018. D-map+ interactive visual analysis and exploration of ego-centric and event-centric information diffusion patterns in social media. ACM Transactions on Intelligent Systems and Technology (TIST) 10, 1 (2018), 1–26.
[6]
Phyllis Zweig Chinn. 1971. The frequency partition of a graph. In Recent Trends in Graph Theory. 69–70.
[7]
Young-Rae Cho and Aidong Zhang. 2009. Predicting protein function by frequent functional association pattern mining in protein interaction networks. IEEE Transactions on Information Technology in Biomedicine 14, 1 (2009), 30–36.
[8]
Wanqiu Cui, Junping Du, Dawei Wang, Feifei Kou, and Zhe Xue. 2021. MVGAN: Multi-view graph attention network for social event detection. ACM Transactions on Intelligent Systems and Technology (TIST) 12, 3 (2021), 1–24.
[9]
Mohammed Elseidy, Ehab Abdelhamid, Spiros Skiadopoulos, and Panos Kalnis. 2014. GraMi: Frequent subgraph and pattern mining in a single large graph. PVLDB 7, 7 (2014).
[10]
Hao Fu, Aston Zhang, and Xing Xie. 2015. Effective social graph deanonymization based on graph structure and descriptive information. ACM Transactions on Intelligent Systems and Technology (TIST) 6, 4 (2015), 1–29.
[11]
Vitor Gomes, Pablo Barcellos, and Jacob Scharcanski. 2017. Stochastic shadow detection using a hypergraph partitioning approach. Pattern Recognition 63 (2017), 30–44.
[12]
Palash Goyal and Emilio Ferrara. 2018. Graph embedding techniques, applications, and performance: A survey. KBS 151 (2018), 78–94.
[13]
Saket Gurukar, Sayan Ranu, and Balaraman Ravindran. 2015. COMMIT: A scalable approach to mining communication motifs from dynamic networks. In SIGMOD. 475–489.
[14]
Nur Al Hasan Haldar, Jianxin Li, Mohammed Eunus Ali, Taotao Cai, Yunliang Chen, Timos Sellis, and Mark Reynolds. 2022. Top-k socio-spatial co-engaged location selection for social users. IEEE Transactions on Knowledge and Data Engineering 35, 5 (2022), 5325–5340.
[15]
Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In NIPS. 1024–1034.
[16]
Keith Henderson, Brian Gallagher, Tina Eliassi-Rad, Hanghang Tong, Sugato Basu, Leman Akoglu, Danai Koutra, Christos Faloutsos, and Lei Li. 2012. RolX: Structural role extraction & mining in large graphs. In KDD. 1231–1239.
[17]
Jun Huan, Wei Wang, Jan Prins, and Jiong Yang. 2004. SPIN: Mining maximal frequent subgraphs from graph databases. In KDD. 581–586.
[18]
Svante Janson, Tomasz Luczak, and Andrzej Rucinski. 2011. Random Graphs. Vol. 45. John Wiley & Sons.
[19]
Shengwei Ji, Chenyang Bu, Lei Li, and Xindong Wu. 2021. Local graph edge partitioning. ACM Transactions on Intelligent Systems and Technology (TIST) 12, 5 (2021), 1–25.
[20]
Yan Jia, Zhaoquan Gu, Zhihao Jiang, Cuiyun Gao, and Jianye Yang. 2023. Persistent graph stream summarization for real-time graph analytics. World Wide Web (2023), 1–21.
[21]
Yan Jia, Mengqi Lin, Ye Wang, Jianming Li, Kai Chen, Joanna Siebert, Geordie Z. Zhang, and Qing Liao. 2023. Extrapolation over temporal knowledge graph via hyperbolic embedding. CAAI Transactions on Intelligence Technology (2023).
[22]
Daniel Keysers and Walter Unger. 2003. Elastic image matching is NP-complete. Pattern Recognition Letters 24, 1-3 (2003), 445–453.
[23]
Thomas N. Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).
[24]
Giorgos Kollias, Shahin Mohammadi, and Ananth Grama. 2011. Network similarity decomposition (NSD): A fast and scalable approach to network alignment. TKDE 24, 12 (2011).
[25]
K. Mahesh Kumar and A. Rama Mohan Reddy. 2016. A fast DBSCAN clustering algorithm by accelerating neighbor searching using Groups method. Pattern Recognition 58 (2016), 39–48.
[26]
Michihiro Kuramochi and George Karypis. 2001. Frequent subgraph discovery. In ICDM.
[27]
Michihiro Kuramochi and George Karypis. 2005. Finding frequent patterns in a large sparse graph. DMKD 11, 3 (2005), 243–271.
[28]
Sejeong Kwon, Meeyoung Cha, and Kyomin Jung. 2017. Rumor detection over varying time windows. PloS One 12, 1 (2017), e0168344.
[29]
Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. 2005. Graphs over time: Densification laws, shrinking diameters and possible explanations. In KDD.
[30]
Jure Leskovec and Rok Sosič. 2016. SNAP: A general-purpose network analysis and graph-mining library. ACM Transactions on Intelligent Systems and Technology (TIST) 8, 1 (2016), 1–20.
[31]
Lei Li, Ping Ding, Huanhuan Chen, and Xindong Wu. 2021. Frequent pattern mining in big social graphs. TETCI (2021).
[32]
Mu Li, Tong Zhang, Yuqiang Chen, and Alexander J. Smola. 2014. Efficient mini-batch training for stochastic optimization. In KDD. 661–670.
[33]
Xuelong Li, Guosheng Cui, and Yongsheng Dong. 2017. Refined-graph regularization-based nonnegative matrix factorization. ACM Transactions on Intelligent Systems and Technology (TIST) 9, 1 (2017), 1–21.
[34]
Jinbo Liu, Yunliang Chen, Xiaohui Huang, Jianxin Li, and Geyong Min. 2023. GNN-based long and short term preference modeling for next-location prediction. Information Sciences 629 (2023), 1–14.
[35]
Bin Lu, Xiaoying Gan, Haiming Jin, Luoyi Fu, Xinbing Wang, and Haisong Zhang. 2022. Make more connections: Urban traffic flow forecasting with spatiotemporal adaptive gated graph convolution network. ACM Transactions on Intelligent Systems and Technology (TIST) 13, 2 (2022), 1–25.
[36]
Alan K. Mackworth. 1977. Consistency in networks of relations. Artificial Intelligence 8, 1 (1977), 99–118.
[37]
Christopher Morris, Martin Ritzert, Matthias Fey, William L. Hamilton, Jan Eric Lenssen, Gaurav Rattan, and Martin Grohe. 2019. Weisfeiler and Leman go neural: Higher-order graph neural networks. In AAAI, Vol. 33. 4602–4609.
[38]
Lam B. Q. Nguyen, Bay Vo, Ngoc-Thao Le, Vaclav Snasel, and Ivan Zelinka. 2020. Fast and scalable algorithms for mining subgraphs in a single large graph. EAAI 90 (2020).
[39]
Thanh Toan Nguyen, Thanh Trung Huynh, Matthias Weidlich, Quan Thanh Tho, Hongzhi Yin, Karl Aberer, and Quoc Viet Hung Nguyen. 2023. Scalable maximal subgraph mining with backbone-preserving graph convolutions. Information Sciences (2023), 119287.
[40]
Zhaolong Ning, Peiran Dong, Xiaojie Wang, Joel J. P. C. Rodrigues, and Feng Xia. 2019. Deep reinforcement learning for vehicular edge computing: An intelligent offloading system. ACM Transactions on Intelligent Systems and Technology (TIST) 10, 6 (2019), 1–24.
[41]
Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. DeepWalk: Online learning of social representations. In KDD. 701–710.
[42]
Saeed Salem, Mohammed Alokshiya, and Mohammad Al Hasan. 2021. RASMA: A reverse search algorithm for mining maximal frequent subgraphs. BioData Mining 14, 1 (2021).
[43]
Erich Schubert, Jorg Sander, Martin Ester, Hans Peter Kriegel, and Xiaowei Xu. 2017. DBSCAN revisited, revisited: Why and how you should (still) use DBSCAN. TODS 42, 3 (2017).
[44]
Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M. Borgwardt. 2011. Weisfeiler-Lehman graph kernels. JMLR 12, Sep. (2011).
[45]
Skyler Speakman, Yating Zhang, and Daniel B. Neill. 2013. Dynamic pattern detection with temporal consistency and connectivity constraints. In ICDM.
[46]
Lini T. Thomas, Satyanarayana R. Valluri, and Kamalakar Karlapalem. 2010. Margin: Maximal frequent subgraph mining. TKDD 4, 3 (2010), 1–42.
[47]
Nguyen Thanh Toan, Phan Thanh Cong, Duong Chi Thang, Nguyen Quoc Viet Hung, and Bela Stantic. 2018. Bootstrapping uncertainty in schema covering. In Databases Theory and Applications: 29th Australasian Database Conference, ADC 2018, Gold Coast, QLD, Australia, May 24–27, 2018, Proceedings 29. Springer, 336–342.
[48]
Thomas K. Tu, Jacob D. Moorman, Dominic Yang, Qinyi Chen, and Andrea L. Bertozzi. 2020. Inexact attributed subgraph matching. In 2020 IEEE International Conference on Big Data (Big Data’20). IEEE, 2575–2582.
[49]
Pengyang Wang, Yanjie Fu, Jiawei Zhang, Xiaolin Li, and Dan Lin. 2018. Learning urban community structures: A collective embedding perspective with periodic spatial-temporal mobility graphs. ACM Transactions on Intelligent Systems and Technology (TIST) 9, 6 (2018), 1–28.
[50]
Senzhang Wang, Meiyue Zhang, Hao Miao, Zhaohui Peng, and Philip S. Yu. 2022. Multivariate correlation-aware spatio-temporal graph convolutional networks for multi-scale traffic prediction. ACM Transactions on Intelligent Systems and Technology (TIST) 13, 3 (2022), 1–22.
[51]
Yuandong Wang, Hongzhi Yin, Tong Chen, Chunyang Liu, Ben Wang, Tianyu Wo, and Jie Xu. 2021. Passenger mobility prediction via representation learning for dynamic directed and weighted graphs. ACM Transactions on Intelligent Systems and Technology (TIST) 13, 1 (2021), 1–25.
[52]
Yiqun Xie, Xiaowei Jia, Shashi Shekhar, Han Bao, and Xun Zhou. 2021. Significant DBSCAN+: Statistically robust density-based clustering. ACM Transactions on Intelligent Systems and Technology (TIST) 12, 5 (2021), 1–26.
[53]
Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How powerful are graph neural networks?. In ICLR.
[54]
Xifeng Yan and Jiawei Han. 2002. gSpan: Graph-based substructure pattern mining. In ICDM. 721–724.
[55]
Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton, and Jure Leskovec. 2018. Graph convolutional neural networks for web-scale recommender systems. In KDD. 974–983.
[56]
R. Ying, A. Wang, J. You, and J. Leskovec. 2020. Frequent subgraph mining by walking in order embedding space. ICML (2020).
[57]
Mohammed J. Zaki, Wagner Meira Jr., and Wagner Meira. 2014. Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press.
[58]
Yanliang Zhu, Dongchun Ren, Yi Xu, Deheng Qian, Mingyu Fan, Xin Li, and Huaxia Xia. 2021. Simultaneous past and current social interaction-aware trajectory prediction for multiple intelligent agents in dynamic scenes. ACM Transactions on Intelligent Systems and Technology (TIST) 13, 1 (2021), 1–16.

Cited By

View all
  • (2025)Trust EEG epileptic seizure detection via evidential multi-view learningInformation Sciences10.1016/j.ins.2024.121699694(121699)Online publication date: Mar-2025

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Intelligent Systems and Technology
ACM Transactions on Intelligent Systems and Technology  Volume 15, Issue 1
February 2024
533 pages
EISSN:2157-6912
DOI:10.1145/3613503
  • Editor:
  • Huan Liu
Issue’s Table of Contents

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 December 2023
Online AM: 27 October 2023
Accepted: 17 October 2023
Revised: 19 September 2023
Received: 31 August 2022
Published in TIST Volume 15, Issue 1

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. Maximal frequent subgraph mining
  2. graph representation learning
  3. isomorphism testing

Qualifiers

  • Research-article

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)166
  • Downloads (Last 6 weeks)20
Reflects downloads up to 12 Feb 2025

Other Metrics

Citations

Cited By

View all
  • (2025)Trust EEG epileptic seizure detection via evidential multi-view learningInformation Sciences10.1016/j.ins.2024.121699694(121699)Online publication date: Mar-2025

View Options

Login options

Full Access

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Full Text

View this article in Full Text.

Full Text

Figures

Tables

Media

Share

Share

Share this Publication link

Share on social media