research-article

Credit Card Fraud Detection via Intelligent Sampling and Self-supervised Learning

Published: 28 March 2024

Abstract

Credit card transaction volumes have risen sharply with the rapid growth of online shopping and digital payments, particularly during the COVID-19 pandemic. To safeguard cardholders, e-commerce companies, and financial institutions, an effective real-time fraud detection method built on modern artificial intelligence techniques is imperative. However, machine-learning-based fraud detection faces challenges such as inadequate transaction representation, noisy labels, and data imbalance. Practical considerations such as dynamic thresholds, concept drift, and verification latency must also be addressed. In this study, we designed a fraud detection method that extracts a series of representative spatial and temporal features to describe credit card transactions precisely. Furthermore, several auxiliary self-supervised objectives were developed to model cardholders’ behavior sequences. Intelligent sampling strategies were employed to eliminate potentially noisy labels, thereby reducing the level of data imbalance. The developed method also includes functions that cater to practical usage requirements. We applied this method to two real-world datasets, and the results indicated a higher F1 score than the most commonly used online fraud detection methods.
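The abstract reports results as F1 scores. As a minimal sketch of why F1 rather than accuracy is the usual yardstick under heavy class imbalance, the following computes both on toy labels. The data, the 1% fraud rate, and the two hypothetical detectors are illustrative assumptions, not taken from the paper or its datasets.

```python
def confusion_counts(y_true, y_pred):
    """Count true positives, false positives and false negatives (fraud = 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, fp, fn


def f1_score(y_true, y_pred):
    """F1 = 2 * precision * recall / (precision + recall) for the fraud class."""
    tp, fp, fn = confusion_counts(y_true, y_pred)
    if tp == 0:
        return 0.0  # no fraud caught: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def accuracy(y_true, y_pred):
    """Fraction of transactions labeled correctly."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Hypothetical toy data: 1000 transactions, 10 of them fraudulent.
y_true = [1] * 10 + [0] * 990

# Detector A (hypothetical): flags nothing as fraud.
always_legit = [0] * 1000

# Detector B (hypothetical): catches 8 of 10 frauds, raises 4 false alarms.
detector = [1] * 8 + [0] * 2 + [1] * 4 + [0] * 986

acc_a, f1_a = accuracy(y_true, always_legit), f1_score(y_true, always_legit)
acc_b, f1_b = accuracy(y_true, detector), f1_score(y_true, detector)

print(f"always-legit: accuracy={acc_a:.3f}, F1={f1_a:.3f}")
print(f"detector:     accuracy={acc_b:.3f}, F1={f1_b:.3f}")
```

In this toy setup the two detectors are nearly indistinguishable by accuracy, while F1 separates them clearly because it ignores the large pool of correctly labeled legitimate transactions.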

    Published In

    ACM Transactions on Intelligent Systems and Technology, Volume 15, Issue 2
    April 2024
    481 pages
    EISSN: 2157-6912
    DOI: 10.1145/3613561
    Editor: Huan Liu

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 28 March 2024
    Online AM: 23 January 2024
    Accepted: 30 December 2023
    Revised: 22 December 2023
    Received: 23 February 2023
    Published in TIST Volume 15, Issue 2

    Author Tags

    1. Self-supervised learning
    2. feature engineering
    3. intelligent sampling
    4. discriminative representation
    5. credit card fraud detection

    Qualifiers

    • Research-article

    Funding Sources

    • National Science and Technology Council, Taiwan
    • E.SUN AI & FinTech Research Center
    • National Yang Ming Chiao Tung University

    Cited By

    • (2025) The Significance of Generative AI in Enhancing Fraud Detection and Prevention Within the Banking Industry. Generative Artificial Intelligence in Finance. DOI: 10.1002/9781394271078.ch9 (159-173). Online publication date: 21-Jan-2025
    • (2024) Deep Learning and Machine Learning Models for Scalable Credit Card Default Prediction on Big Data. 2024 IEEE International Conference on Big Data (BigData). DOI: 10.1109/BigData62323.2024.10825743 (7292-7296). Online publication date: 15-Dec-2024
    • (2024) Enhancing Credit Card Fraud Detection Through Adaptive Model Optimization. 2024 IEEE 7th International Conference on Big Data and Artificial Intelligence (BDAI). DOI: 10.1109/BDAI62182.2024.10692581 (49-54). Online publication date: 5-Jul-2024
    • (2024) Credit card fraud detection based on federated graph learning. Expert Systems with Applications: An International Journal. DOI: 10.1016/j.eswa.2024.124979. 256:C. Online publication date: 5-Dec-2024
    • (2024) An online fuzzy fraud detection framework for credit card transactions. Expert Systems with Applications. DOI: 10.1016/j.eswa.2024.124127. 252 (124127). Online publication date: Oct-2024
    • (2024) Credit card fraud detection using the brown bear optimization algorithm. Alexandria Engineering Journal. DOI: 10.1016/j.aej.2024.06.040. 104 (171-192). Online publication date: Oct-2024
