research-article

Isomorphic Graph Embedding for Progressive Maximal Frequent Subgraph Mining

Authors:

Thanh Toan Nguyen,

Thanh Tam Nguyen,

Thanh Hung Nguyen,

Thanh Thi Nguyen,

Quoc Viet Hung NguyenAuthors Info & Claims

ACM Transactions on Intelligent Systems and Technology, Volume 15, Issue 1

Article No.: 9, Pages 1 - 26

https://doi.org/10.1145/3630635

Published: 19 December 2023 Publication History

Abstract

Maximal frequent subgraph mining (MFSM) is the task of mining only maximal frequent subgraphs, i.e., subgraphs that are not a part of other frequent subgraphs. Although many intelligent systems require MFSM, MFSM is challenging compared to frequent subgraph mining (FSM), as maximal frequent subgraphs lie in the middle of graph lattice, and FSM algorithms must explore an exponential space and an NP-hard subroutine of frequency counting. Different from prior research, which primarily focused on optimal solutions, we introduce pmMine, a progressive graph neural framework designed for MFSM in a single large graph to attain an approximate solution. The framework combines isomorphic graph embedding, non-parametric partitioning, and an efficiently top-down pattern searching strategy. The critical insight that makes pmMine work is to define the concepts of rooted subgraph and isomorphic graph embedding, in which the costly isomorphism subroutine can be efficiently performed using similarity estimation in embedding space. In addition, pmMine returns the patterns identified during the mining process in a progressive manner. We validate the efficiency and effectiveness of our technique through extensive experiments on a variety of datasets spanning various domains.

References

[1]

Ehab Abdelhamid, Ibrahim Abdelaziz, Panos Kalnis, Zuhair Khayyat, and Fuad Jamour. 2016. ScaleMine: Scalable parallel frequent subgraph mining in a single large graph. In SC. 716–727.

[2]

Takuya Akiba, Shuji Suzuki, and Keisuke Fukuda. 2017. Extremely large minibatch SGD: Training ResNet-50 on ImageNet in 15 minutes. arXiv preprint arXiv:1711.04325 (2017).

[3]

Francesco Bonchi, Carlos Castillo, Aristides Gionis, and Alejandro Jaimes. 2011. Social network analysis and mining for business applications. ACM Transactions on Intelligent Systems and Technology (TIST) 2, 3 (2011), 1–37.

Digital Library

[4]

Hongyun Cai, Vincent W. Zheng, and Kevin Chen-Chuan Chang. 2018. A comprehensive survey of graph embedding: Problems, techniques, and applications. TKDE 30, 9 (2018).

[5]

Siming Chen, Shuai Chen, Zhenhuang Wang, Jie Liang, Yadong Wu, and Xiaoru Yuan. 2018. D-map+ interactive visual analysis and exploration of ego-centric and event-centric information diffusion patterns in social media. ACM Transactions on Intelligent Systems and Technology (TIST) 10, 1 (2018), 1–26.

[6]

Phyllis Zweig Chinn. 1971. The frequency partition of a graph. In Recent Trends in Graph Theory. 69–70.

[7]

Young-Rae Cho and Aidong Zhang. 2009. Predicting protein function by frequent functional association pattern mining in protein interaction networks. IEEE Transactions on Information Technology in Biomedicine 14, 1 (2009), 30–36.

[8]

Wanqiu Cui, Junping Du, Dawei Wang, Feifei Kou, and Zhe Xue. 2021. MVGAN: Multi-view graph attention network for social event detection. ACM Transactions on Intelligent Systems and Technology (TIST) 12, 3 (2021), 1–24.

Digital Library

[9]

Mohammed Elseidy, Ehab Abdelhamid, Spiros Skiadopoulos, and Panos Kalnis. 2014. GraMi: Frequent subgraph and pattern mining in a single large graph. PVLDB 7, 7 (2014).

[10]

Hao Fu, Aston Zhang, and Xing Xie. 2015. Effective social graph deanonymization based on graph structure and descriptive information. ACM Transactions on Intelligent Systems and Technology (TIST) 6, 4 (2015), 1–29.

Digital Library

[11]

Vitor Gomes, Pablo Barcellos, and Jacob Scharcanski. 2017. Stochastic shadow detection using a hypergraph partitioning approach. Pattern Recognition 63 (2017), 30–44.

Digital Library

[12]

Palash Goyal and Emilio Ferrara. 2018. Graph embedding techniques, applications, and performance: A survey. KBS 151 (2018), 78–94.

[13]

Saket Gurukar, Sayan Ranu, and Balaraman Ravindran. 2015. COMMIT: A scalable approach to mining communication motifs from dynamic networks. In SIGMOD. 475–489.

[14]

Nur Al Hasan Haldar, Jianxin Li, Mohammed Eunus Ali, Taotao Cai, Yunliang Chen, Timos Sellis, and Mark Reynolds. 2022. Top-k socio-spatial co-engaged location selection for social users. IEEE Transactions on Knowledge and Data Engineering 35, 5 (2022), 5325–5340.

[15]

Will Hamilton, Zhitao Ying, and Jure Leskovec. 2017. Inductive representation learning on large graphs. In NIPS. 1024–1034.

[16]

Keith Henderson, Brian Gallagher, Tina Eliassi-Rad, Hanghang Tong, Sugato Basu, Leman Akoglu, Danai Koutra, Christos Faloutsos, and Lei Li. 2012. RolX: Structural role extraction & mining in large graphs. In KDD. 1231–1239.

[17]

Jun Huan, Wei Wang, Jan Prins, and Jiong Yang. 2004. SPIN: Mining maximal frequent subgraphs from graph databases. In KDD. 581–586.

[18]

Svante Janson, Tomasz Luczak, and Andrzej Rucinski. 2011. Random Graphs. Vol. 45. John Wiley & Sons.

[19]

Shengwei Ji, Chenyang Bu, Lei Li, and Xindong Wu. 2021. Local graph edge partitioning. ACM Transactions on Intelligent Systems and Technology (TIST) 12, 5 (2021), 1–25.

Digital Library

[20]

Yan Jia, Zhaoquan Gu, Zhihao Jiang, Cuiyun Gao, and Jianye Yang. 2023. Persistent graph stream summarization for real-time graph analytics. World Wide Web (2023), 1–21.

[21]

Yan Jia, Mengqi Lin, Ye Wang, Jianming Li, Kai Chen, Joanna Siebert, Geordie Z. Zhang, and Qing Liao. 2023. Extrapolation over temporal knowledge graph via hyperbolic embedding. CAAI Transactions on Intelligence Technology (2023).

Digital Library

[22]

Daniel Keysers and Walter Unger. 2003. Elastic image matching is NP-complete. Pattern Recognition Letters 24, 1-3 (2003), 445–453.

Digital Library

[23]

Thomas N. Kipf and Max Welling. 2016. Semi-supervised classification with graph convolutional networks. arXiv preprint arXiv:1609.02907 (2016).

[24]

Giorgos Kollias, Shahin Mohammadi, and Ananth Grama. 2011. Network similarity decomposition (NSD): A fast and scalable approach to network alignment. TKDE 24, 12 (2011).

[25]

K. Mahesh Kumar and A. Rama Mohan Reddy. 2016. A fast DBSCAN clustering algorithm by accelerating neighbor searching using Groups method. Pattern Recognition 58 (2016), 39–48.

Digital Library

[26]

Michihiro Kuramochi and George Karypis. 2001. Frequent subgraph discovery. In ICDM.

Digital Library

[27]

Michihiro Kuramochi and George Karypis. 2005. Finding frequent patterns in a large sparse graph. DMKD 11, 3 (2005), 243–271.

Digital Library

[28]

Sejeong Kwon, Meeyoung Cha, and Kyomin Jung. 2017. Rumor detection over varying time windows. PloS One 12, 1 (2017), e0168344.

[29]

Jure Leskovec, Jon Kleinberg, and Christos Faloutsos. 2005. Graphs over time: Densification laws, shrinking diameters and possible explanations. In KDD.

[30]

Jure Leskovec and Rok Sosič. 2016. SNAP: A general-purpose network analysis and graph-mining library. ACM Transactions on Intelligent Systems and Technology (TIST) 8, 1 (2016), 1–20.

Digital Library

[31]

Lei Li, Ping Ding, Huanhuan Chen, and Xindong Wu. 2021. Frequent pattern mining in big social graphs. TETCI (2021).

[32]

Mu Li, Tong Zhang, Yuqiang Chen, and Alexander J. Smola. 2014. Efficient mini-batch training for stochastic optimization. In KDD. 661–670.

[33]

Xuelong Li, Guosheng Cui, and Yongsheng Dong. 2017. Refined-graph regularization-based nonnegative matrix factorization. ACM Transactions on Intelligent Systems and Technology (TIST) 9, 1 (2017), 1–21.

[34]

Jinbo Liu, Yunliang Chen, Xiaohui Huang, Jianxin Li, and Geyong Min. 2023. GNN-based long and short term preference modeling for next-location prediction. Information Sciences 629 (2023), 1–14.

Digital Library

[35]

Bin Lu, Xiaoying Gan, Haiming Jin, Luoyi Fu, Xinbing Wang, and Haisong Zhang. 2022. Make more connections: Urban traffic flow forecasting with spatiotemporal adaptive gated graph convolution network. ACM Transactions on Intelligent Systems and Technology (TIST) 13, 2 (2022), 1–25.

Digital Library

[36]

Alan K. Mackworth. 1977. Consistency in networks of relations. Artificial Intelligence 8, 1 (1977), 99–118.

Digital Library

[37]

Christopher Morris, Martin Ritzert, Matthias Fey, William L. Hamilton, Jan Eric Lenssen, Gaurav Rattan, and Martin Grohe. 2019. Weisfeiler and Leman go neural: Higher-order graph neural networks. In AAAI, Vol. 33. 4602–4609.

[38]

Lam B. Q. Nguyen, Bay Vo, Ngoc-Thao Le, Vaclav Snasel, and Ivan Zelinka. 2020. Fast and scalable algorithms for mining subgraphs in a single large graph. EAAI 90 (2020).

[39]

Thanh Toan Nguyen, Thanh Trung Huynh, Matthias Weidlich, Quan Thanh Tho, Hongzhi Yin, Karl Aberer, and Quoc Viet Hung Nguyen. 2023. Scalable maximal subgraph mining with backbone-preserving graph convolutions. Information Sciences (2023), 119287.

Digital Library

[40]

Zhaolong Ning, Peiran Dong, Xiaojie Wang, Joel J. P. C. Rodrigues, and Feng Xia. 2019. Deep reinforcement learning for vehicular edge computing: An intelligent offloading system. ACM Transactions on Intelligent Systems and Technology (TIST) 10, 6 (2019), 1–24.

Digital Library

[41]

Bryan Perozzi, Rami Al-Rfou, and Steven Skiena. 2014. DeepWalk: Online learning of social representations. In KDD. 701–710.

[42]

Saeed Salem, Mohammed Alokshiya, and Mohammad Al Hasan. 2021. RASMA: A reverse search algorithm for mining maximal frequent subgraphs. BioData Mining 14, 1 (2021).

[43]

Erich Schubert, Jorg Sander, Martin Ester, Hans Peter Kriegel, and Xiaowei Xu. 2017. DBSCAN revisited, revisited: Why and how you should (still) use DBSCAN. TODS 42, 3 (2017).

Digital Library

[44]

Nino Shervashidze, Pascal Schweitzer, Erik Jan van Leeuwen, Kurt Mehlhorn, and Karsten M. Borgwardt. 2011. Weisfeiler-Lehman graph kernels. JMLR 12, Sep. (2011).

[45]

Skyler Speakman, Yating Zhang, and Daniel B. Neill. 2013. Dynamic pattern detection with temporal consistency and connectivity constraints. In ICDM.

[46]

Lini T. Thomas, Satyanarayana R. Valluri, and Kamalakar Karlapalem. 2010. Margin: Maximal frequent subgraph mining. TKDD 4, 3 (2010), 1–42.

Digital Library

[47]

Nguyen Thanh Toan, Phan Thanh Cong, Duong Chi Thang, Nguyen Quoc Viet Hung, and Bela Stantic. 2018. Bootstrapping uncertainty in schema covering. In Databases Theory and Applications: 29th Australasian Database Conference, ADC 2018, Gold Coast, QLD, Australia, May 24–27, 2018, Proceedings 29. Springer, 336–342.

[48]

Thomas K. Tu, Jacob D. Moorman, Dominic Yang, Qinyi Chen, and Andrea L. Bertozzi. 2020. Inexact attributed subgraph matching. In 2020 IEEE International Conference on Big Data (Big Data’20). IEEE, 2575–2582.

[49]

Pengyang Wang, Yanjie Fu, Jiawei Zhang, Xiaolin Li, and Dan Lin. 2018. Learning urban community structures: A collective embedding perspective with periodic spatial-temporal mobility graphs. ACM Transactions on Intelligent Systems and Technology (TIST) 9, 6 (2018), 1–28.

Digital Library

[50]

Senzhang Wang, Meiyue Zhang, Hao Miao, Zhaohui Peng, and Philip S. Yu. 2022. Multivariate correlation-aware spatio-temporal graph convolutional networks for multi-scale traffic prediction. ACM Transactions on Intelligent Systems and Technology (TIST) 13, 3 (2022), 1–22.

Digital Library

[51]

Yuandong Wang, Hongzhi Yin, Tong Chen, Chunyang Liu, Ben Wang, Tianyu Wo, and Jie Xu. 2021. Passenger mobility prediction via representation learning for dynamic directed and weighted graphs. ACM Transactions on Intelligent Systems and Technology (TIST) 13, 1 (2021), 1–25.

Digital Library

[52]

Yiqun Xie, Xiaowei Jia, Shashi Shekhar, Han Bao, and Xun Zhou. 2021. Significant DBSCAN+: Statistically robust density-based clustering. ACM Transactions on Intelligent Systems and Technology (TIST) 12, 5 (2021), 1–26.

Digital Library

[53]

Keyulu Xu, Weihua Hu, Jure Leskovec, and Stefanie Jegelka. 2019. How powerful are graph neural networks?. In ICLR.

[54]

Xifeng Yan and Jiawei Han. 2002. gSpan: Graph-based substructure pattern mining. In ICDM. 721–724.

Digital Library

[55]

Rex Ying, Ruining He, Kaifeng Chen, Pong Eksombatchai, William L. Hamilton, and Jure Leskovec. 2018. Graph convolutional neural networks for web-scale recommender systems. In KDD. 974–983.

[56]

R. Ying, A. Wang, J. You, and J. Leskovec. 2020. Frequent subgraph mining by walking in order embedding space. ICML (2020).

[57]

Mohammed J. Zaki, Wagner Meira Jr., and Wagner Meira. 2014. Data Mining and Analysis: Fundamental Concepts and Algorithms. Cambridge University Press.

[58]

Yanliang Zhu, Dongchun Ren, Yi Xu, Deheng Qian, Mingyu Fan, Xin Li, and Huaxia Xia. 2021. Simultaneous past and current social interaction-aware trajectory prediction for multiple intelligent agents in dynamic scenes. ACM Transactions on Intelligent Systems and Technology (TIST) 13, 1 (2021), 1–16.

Cited By

Liu YXu CWen ZDong Y(2025)Trust EEG epileptic seizure detection via evidential multi-view learningInformation Sciences10.1016/j.ins.2024.121699694(121699)Online publication date: Mar-2025
https://doi.org/10.1016/j.ins.2024.121699

Index Terms

Isomorphic Graph Embedding for Progressive Maximal Frequent Subgraph Mining
1. Computing methodologies
  1. Machine learning
    1. Machine learning approaches
      1. Learning latent representations
2. Information systems

Recommendations

Mining Frequent Subgraph Patterns from Uncertain Graph Data

In many real applications, graph data is subject to uncertainties due to incompleteness and imprecision of data. Mining such uncertain graph data is semantically different from and computationally more challenging than mining conventional exact graph ...
An Optimization of Closed Frequent Subgraph Mining Algorithm
Abstract
Graph mining isamajor area of interest within the field of data mining in recent years. Akey aspect of graph mining is frequent subgraph mining. Central to the entire discipline of frequent subgraph mining is the concept of subgraph ...
Distributed frequent subgraph mining on evolving graph using SPARK

Within the graph mining context, frequent subgraph identification plays a key role in retrieving required information or patterns from the huge amount of data in a short period. The problem of finding frequent items in traditional mining changed to the ...

Comments

Information & Contributors

Information

Published In

cover image ACM Transactions on Intelligent Systems and Technology

ACM Transactions on Intelligent Systems and Technology Volume 15, Issue 1

February 2024

533 pages

EISSN:2157-6912

DOI:10.1145/3613503

Editor:
Huan Liu
Arizona State University, USA

Issue’s Table of Contents

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 19 December 2023

Online AM: 27 October 2023

Accepted: 17 October 2023

Revised: 19 September 2023

Received: 31 August 2022

Published in TIST Volume 15, Issue 1

Permissions

Request permissions for this article.

Request Permissions

Check for updates

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

1
Total Citations
View Citations
275
Total Downloads

Downloads (Last 12 months)166
Downloads (Last 6 weeks)20

Reflects downloads up to 12 Feb 2025

Other Metrics

View Author Metrics

Citations

Cited By

Liu YXu CWen ZDong Y(2025)Trust EEG epileptic seizure detection via evidential multi-view learningInformation Sciences10.1016/j.ins.2024.121699694(121699)Online publication date: Mar-2025
https://doi.org/10.1016/j.ins.2024.121699

View Options

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Full Text

View this article in Full Text.

Figures

Tables

Media

View full text|Download PDF

View Issue’s Table of Contents