Institute of Information Technology

Jahangirnagar University
Savar, Dhaka-1342

Credit Card Fraud Detection using Machine Learning

Submitted To:
Md. Mahmudur Rahman
Lecturer
Institute of Information Technology
Jahangirnagar University

Submitted By:

Group - 23
Name                  Roll
Md. Shakil Hossain    2023
Mahbubur Rahman       2024
Nahidul Islam         2028

ABSTRACT

Credit card fraud is a major problem, with billions of dollars lost each year. Credit card fraud
refers to the physical loss of a credit card or the loss of sensitive credit card information.
Machine learning can be used to detect such fraud by identifying patterns that are indicative of
fraudulent transactions, and many machine learning algorithms can be used for detection. This
project proposes to develop a machine learning model to detect credit card fraud. The model will
be trained on a dataset of historical credit card transactions and evaluated on a holdout
dataset of unseen transactions.
INTRODUCTION
'Fraud' in credit card transactions is the unauthorized and unwanted usage of an account by
someone other than the owner of that account. Fraud has been increasing drastically with the
progress of state-of-the-art technology and worldwide communication. Credit cards are one of
the most popular targets of fraud, but not the only one. Credit card fraud is a wide-ranging term
for theft and fraud committed using a credit card or any similar payment mechanism as a
fraudulent source of funds in a transaction. It has been a growing issue in the credit card
industry. Detecting credit card fraud is difficult with conventional processes, so the
development of credit card fraud detection models has become important for both academic and
business organizations. Fraud can be countered in two main ways: prevention and detection.
Prevention acts as a layer of protection that stops attacks from fraudsters. Detection comes
into play once prevention has failed: it identifies a fraudulent transaction and raises an alert
as soon as the transaction is triggered.
Machine learning is this generation's solution: it replaces such methodologies and can work
on large datasets, which is not easily possible for human beings. Machine learning techniques
fall into two main categories: supervised learning and unsupervised learning. Fraud detection can
be done either way, and which one to use can only be decided according to the dataset. Supervised
learning requires anomalies to be labelled in advance. During the last few years, several
supervised algorithms have been used in detecting credit card fraud. The data used in this
study is analyzed in two main ways: as categorical data and as numerical data. The dataset
originally comes with categorical data. The raw data can be prepared by data cleaning and other
basic preprocessing techniques. In the first approach, categorical data is transformed into
numerical data and then appropriate techniques are applied for the evaluation. In the second,
categorical data is used directly in the machine learning techniques to find the optimal algorithm.
This project consists of selecting the optimal algorithm for detecting fraud patterns through an
extensive comparison of machine learning techniques such as Logistic Regression, K-Nearest
Neighbors (KNN), and Decision Tree, using an effective performance measure for the detection of
fraudulent credit card transactions. The rest of this paper is organized as follows. Section 2
presents the literature review. Section 3 provides the experimental methodology, including
results. Finally, conclusions and discussions are presented in Section 4.
LITERATURE REVIEW
Earlier studies have proposed many approaches to fraud detection, ranging from supervised and
unsupervised approaches to hybrid ones. This makes it essential to learn the technologies
involved in credit card fraud detection and to have a clear understanding of the types of
credit card fraud. Through the analysis of various detection models, past researchers have
identified many problems regarding fraud detection. Classical algorithms such as Support
Vector Machines (SVM), Decision Tree (DT), Logistic Regression (LR), and Random Forest (RF)
have proven useful.
In paper [1], the European dataset was used, and a comparison was made between models
based on LR, DT, and RF. Among the three models, RF proved to be the best, with an accuracy of
95.5%, followed by DT with 94.3% and LR with 90%.
According to [2] and [3], K-Nearest Neighbors (KNN) and outlier detection techniques can also
be efficient in fraud detection. They have proven useful in minimizing false alarm rates and
increasing the fraud detection rate.
The KNN algorithm also performed well in the experiments of paper [4], where the authors tested
and compared it with other classical algorithms. Paper [5] discussed commonly used supervised
learning techniques and provided a thorough evaluation of them. The authors also showed that
how well each algorithm performs depends on the problem area.
The fraud detection system presented in paper [6] is built to handle class imbalance, the mix
of labelled and unlabelled data, and the processing of large datasets. The proposed system was
able to overcome all of these challenges.
Paper [7] highlights the cost of fraud detection and lack of adaptability as challenges in
the fraud detection process. When considering a system, both the cost of fraudulent behavior and
the cost of prevention should be taken into account. Lack of adaptability arises when the
algorithm is exposed to new types of fraud patterns and normal transactions.
PROPOSED METHODS
In this project we will use three different algorithms to predict whether a card transaction is
genuine or fraudulent. Descriptions of the algorithms considered are given below:

Logistic Regression:
This statistical classification model, based on probabilities, detects fraud using the logistic
curve. Since the value of the logistic curve varies from 0 to 1, it can be interpreted as a class
membership probability. The dataset fed as input to the model is divided into training and
testing sets. After training, the model is tested against a minimum threshold cut-off value
for prediction. Given such a threshold probability, logistic regression divides the plane with a
single line, separating the data points into exactly two regions.

Fig: The logistic regression model
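
The thresholding step described above can be illustrated with a minimal sketch. The weights, bias, and two-feature inputs below are hypothetical, hand-picked values rather than parameters fitted to any dataset, and the 0.5 cut-off is only an assumed default.

```python
import math

def sigmoid(z):
    """Logistic curve: maps any real number to a value in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def predict_proba(features, weights, bias):
    """Estimated probability of the fraud class for one transaction."""
    z = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(z)

def predict_label(features, weights, bias, threshold=0.5):
    """Return 1 (fraud) if the probability reaches the cut-off, else 0."""
    return 1 if predict_proba(features, weights, bias) >= threshold else 0

# Hypothetical, hand-picked parameters for two features (not fitted to data).
weights = [2.0, -1.0]
bias = -0.5

print(predict_proba([0.25, 0.0], weights, bias))  # z = 0, so the probability is 0.5
print(predict_label([1.0, 0.5], weights, bias))   # z = 1.0, above the cut-off: 1
print(predict_label([0.0, 1.0], weights, bias))   # z = -1.5, below the cut-off: 0
```

With these parameters the decision boundary is the single line 2*x1 - x2 - 0.5 = 0. Raising the threshold would flag fewer transactions as fraudulent, and lowering it would flag more.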

K-Nearest Neighbor (KNN):

This is a supervised learning technique that achieves consistently high performance in
comparison to other fraud detection techniques of supervised statistical pattern recognition [24].
Three factors mainly affect its performance: the distance metric used to identify the nearest
neighbors, the rule used to deduce a classification from the k nearest neighbors, and the number
of neighbors used to label the new sample. The algorithm classifies a new transaction by
finding the point nearest to it: if this nearest neighbor is classified as fraudulent, then the
new transaction is also labeled as fraudulent. Euclidean distance is a good choice for
calculating the distances in this scenario. This technique is fast but can result in false
alerts. Its performance can be improved by optimizing the distance metric.
Fig: Pros and Cons of K-Nearest Neighbors - From The GENESIS
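
As a rough illustration of the three factors named above (distance metric, voting rule, and number of neighbors), the following sketch classifies a point by majority vote among its k nearest neighbors under Euclidean distance. The data points and the function names are made up for this example.

```python
import math
from collections import Counter

def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train_points, train_labels, query, k=3):
    """Label `query` by majority vote among its k nearest training points."""
    order = sorted(range(len(train_points)),
                   key=lambda i: euclidean(train_points[i], query))
    nearest_labels = [train_labels[i] for i in order[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]

# Made-up transactions as (scaled amount, scaled time); 1 = fraud, 0 = genuine.
train_points = [(0.1, 0.2), (0.2, 0.1), (0.15, 0.25), (0.9, 0.8), (0.85, 0.9)]
train_labels = [0, 0, 0, 1, 1]

print(knn_predict(train_points, train_labels, (0.8, 0.85), k=1))  # nearest point is fraud: 1
print(knn_predict(train_points, train_labels, (0.2, 0.2), k=3))   # three genuine neighbors: 0
```

Setting k=1 reproduces the single-nearest-neighbor rule described in the text. Larger k values smooth the decision, at the cost of letting the majority class dominate when fraud cases are rare.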

Decision Tree:
A decision tree is a supervised learning algorithm in the form of a tree structure, consisting
of a root node and other nodes that are split, in a binary or multi-way manner, into child nodes,
with each tree using its own algorithm to perform the splitting process. As the tree grows, it
may overfit the training data, producing branches that reflect anomalies, errors, or noise.
Pruning is therefore used to improve the classification performance of the tree by removing
certain nodes. Their ease of use and their flexibility in handling attributes of different data
types make decision trees quite popular.

Fig: Decision Tree Algorithm in Machine Learning
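
The text does not say which splitting criterion is used, so the sketch below assumes one common choice, Gini impurity with binary splits on a single numeric feature. The amounts and labels are made up for illustration.

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))

def best_split(values, labels):
    """Find the threshold on one numeric feature that minimizes weighted Gini.

    Returns (threshold, weighted_impurity); rows with value <= threshold go left.
    """
    best = (None, float("inf"))
    candidates = sorted(set(values))
    for lo, hi in zip(candidates, candidates[1:]):
        threshold = (lo + hi) / 2  # midpoint between neighboring values
        left = [y for v, y in zip(values, labels) if v <= threshold]
        right = [y for v, y in zip(values, labels) if v > threshold]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (threshold, score)
    return best

# Made-up transaction amounts and labels (1 = fraud, 0 = genuine).
amounts = [10, 20, 30, 400, 500, 600]
labels = [0, 0, 0, 1, 1, 1]

print(gini(labels))                 # 0.5 for a 50/50 node
print(best_split(amounts, labels))  # (215.0, 0.0): a perfectly separating split
```

A full tree would apply `best_split` recursively to each child node over all features. Stopping early or removing nodes afterwards corresponds to the pruning mentioned above.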

Support Vector Machine:

Support vector machines (SVMs) are linear classifiers that work in high dimensionality: in
high dimensions, a task that is non-linear in the input space becomes linear, which makes SVMs
highly useful for detecting fraud. SVMs have two important features: a kernel function that
expresses the classification function in terms of dot products of projected input data points,
and the search for a hyperplane that maximizes the separation between classes while minimizing
overfitting of the training data. Together, these provide a very high generalization capability.

Fig: Support Vector Machine algorithm.
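
To make the role of the kernel function more concrete, the sketch below evaluates an SVM-style decision function f(x) = sum_i (alpha_i * y_i) * K(sv_i, x) + b. The support vectors, coefficients, and bias are hypothetical values chosen by hand, not the result of training, and the RBF kernel with gamma = 1.0 is only one assumed kernel choice.

```python
import math

def linear_kernel(a, b):
    """Plain dot product of two vectors."""
    return sum(x * y for x, y in zip(a, b))

def rbf_kernel(a, b, gamma=1.0):
    """Gaussian (RBF) kernel: a dot product in an implicit high-dimensional space."""
    sq_dist = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.exp(-gamma * sq_dist)

def decision_value(x, support_vectors, coeffs, bias, kernel):
    """f(x) = sum_i coeff_i * K(sv_i, x) + bias, where coeff_i = alpha_i * y_i."""
    return sum(c * kernel(sv, x) for sv, c in zip(support_vectors, coeffs)) + bias

def classify(x, support_vectors, coeffs, bias, kernel):
    """Return +1 or -1 depending on the side of the separating hyperplane."""
    return 1 if decision_value(x, support_vectors, coeffs, bias, kernel) >= 0 else -1

# Hypothetical support vectors and coefficients, hand-picked rather than trained.
support_vectors = [(1.0, 1.0), (-1.0, -1.0)]
coeffs = [0.25, -0.25]  # alpha_i * y_i
bias = 0.0

print(decision_value((1.0, 1.0), support_vectors, coeffs, bias, linear_kernel))  # 1.0, on the margin
print(classify((2.0, 1.0), support_vectors, coeffs, bias, linear_kernel))        # 1
print(classify((-0.5, -2.0), support_vectors, coeffs, bias, rbf_kernel))         # -1
```

Swapping `linear_kernel` for `rbf_kernel` changes only how similarity is measured. The decision function itself stays linear in the kernel values, which is the property the description above relies on.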

Dataset:
In this research the Credit Card Fraud Detection dataset was used, which can be downloaded
from Kaggle [8]. This dataset contains transactions made by European cardholders over two days
in September 2013.

Credit Card Fraud Detection

PROJECT PLAN

Fig: Project plan.

The project will be completed in different phases:

Data collection:
The first phase will involve collecting a dataset of historical credit card transactions. The
data will be collected from a variety of sources, including banks, credit card companies,
and merchants.

Data Cleaning:

Missing values can be detected with tools such as isnull() and heatmap(), and then handled in
one of the following ways:

• Impute the missing values with the mean, median, or mode of the column.
• Drop the rows with missing values.
• Use a machine learning model to predict the missing values.
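
The first two options can be sketched in plain Python as follows. The rows and column meanings are made up, `None` is assumed to mark a missing value, and the helper names are hypothetical.

```python
from statistics import mean, median, mode

def count_missing(rows):
    """Number of missing (None) values per column."""
    return [sum(v is None for v in column) for column in zip(*rows)]

def impute_column(values, strategy="mean"):
    """Replace None entries with the mean, median, or mode of the observed values."""
    observed = [v for v in values if v is not None]
    fill = {"mean": mean, "median": median, "mode": mode}[strategy](observed)
    return [fill if v is None else v for v in values]

def drop_incomplete_rows(rows):
    """Keep only the rows that contain no missing value."""
    return [row for row in rows if all(v is not None for v in row)]

# Made-up rows as (amount, hour of day); None marks a missing value.
rows = [(120.0, 14), (None, 9), (80.0, None), (100.0, 22)]

print(count_missing(rows))         # [1, 1]
print(drop_incomplete_rows(rows))  # [(120.0, 14), (100.0, 22)]
amounts = [row[0] for row in rows]
print(impute_column(amounts))      # [120.0, 100.0, 80.0, 100.0]
```

Dropping rows is the simplest option but discards data, which matters when fraud cases are already rare. Imputation keeps every row at the cost of introducing estimated values.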

Normalize the data:

Normalization is the process of scaling the data so that all of the features have a similar
range of values. This can help to improve the performance of machine learning models
by making the features more comparable.
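
Two common ways of scaling a feature are sketched below: min-max scaling to the range [0, 1] and standardization to zero mean and unit standard deviation. The text does not say which one the project uses, so both are shown as assumptions, with made-up amounts.

```python
from statistics import mean, pstdev

def min_max_scale(values):
    """Rescale values linearly to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def standardize(values):
    """Shift and scale values to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

amounts = [10.0, 20.0, 30.0, 50.0]
print(min_max_scale(amounts))  # [0.0, 0.25, 0.5, 1.0]
print(standardize(amounts))
```

In practice the scaling parameters (minimum and maximum, or mean and standard deviation) would normally be computed on the training data only and then reused on the holdout data, so that no information from the unseen transactions leaks into training.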
Model training:
The second phase will involve training the machine learning model on the collected data.
The model will be trained using a supervised learning algorithm, such as SVM.
Model evaluation:
The third phase will involve evaluating the performance of the machine learning model
on a holdout dataset of unseen transactions. The performance of the model will be
evaluated using metrics such as accuracy, precision, and recall.
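
One way to set aside the holdout dataset is a stratified split, which keeps the proportion of fraudulent and genuine transactions the same in both parts. This is a sketch under assumptions: the 80/20 ratio, the seed, and the function name are illustrative choices, not values taken from the project.

```python
import random

def stratified_holdout(labels, test_fraction=0.2, seed=0):
    """Split row indices into train/test so each class keeps its proportion."""
    rng = random.Random(seed)
    train_idx, test_idx = [], []
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # At least one example of every class goes to the holdout set.
        n_test = max(1, round(len(idx) * test_fraction))
        test_idx.extend(idx[:n_test])
        train_idx.extend(idx[n_test:])
    return sorted(train_idx), sorted(test_idx)

# Made-up labels: 95 genuine (0) and 5 fraudulent (1) transactions.
labels = [0] * 95 + [1] * 5
train_idx, test_idx = stratified_holdout(labels, test_fraction=0.2)

print(len(train_idx), len(test_idx))     # 80 20
print(sum(labels[i] for i in test_idx))  # 1 fraud case in the holdout set
```

A purely random split could leave the holdout set with no fraud cases at all when fraud is rare, which would make precision and recall impossible to estimate.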

Fig: Working Flow of Credit Card Fraud Detection

Timeline for Our Project:

Results and Evaluations
Expected Results:
• A machine learning model that can detect credit card fraud with high accuracy.
• A better understanding of the patterns that are indicative of fraudulent transactions.
• A framework for using machine learning to detect credit card fraud in real time.

Performance Metrics and Evaluation Methodology:

Confusion Matrix:
A confusion matrix is an N x N matrix used for evaluating the performance of a classification
model, where N is the number of target classes. The matrix compares the actual target values
with those predicted by the machine learning model.
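
For the two-class case (N = 2), the matrix and the accuracy, precision, and recall derived from it can be computed as in the following sketch. The label lists are made up, and the convention that rows are actual classes and columns are predicted classes is an assumption of this example.

```python
def confusion_matrix(actual, predicted, classes=(0, 1)):
    """N x N matrix: rows are actual classes, columns are predicted classes."""
    index = {c: i for i, c in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def binary_metrics(matrix):
    """Accuracy, precision, and recall for the positive class (index 1)."""
    (tn, fp), (fn, tp) = matrix
    total = tn + fp + fn + tp
    accuracy = (tp + tn) / total
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Made-up labels: 1 = fraud, 0 = genuine.
actual    = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]
predicted = [0, 0, 0, 0, 0, 1, 1, 1, 0, 0]

cm = confusion_matrix(actual, predicted)
print(cm)                  # [[5, 1], [2, 2]]
print(binary_metrics(cm))  # accuracy 0.7, precision 2/3, recall 0.5
```

Here the model is right on 7 of 10 transactions yet catches only half of the fraud cases, which is why precision and recall are worth reporting alongside accuracy when one class is much rarer than the other.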

Classification Report:
REFERENCES
[1] S. V. S. S. Lakshmi, S. D. Kavilla, "Machine Learning for Credit Card Fraud Detection
System", unpublished.
[2] N. Malini, M. Pushpa, "Analysis on Credit Card Fraud Identification Techniques based
on KNN and Outlier Detection", Advances in Electrical, Electronics, Information,
Communication and Bio-Informatics (AEEICB), 2017 Third International Conference on, pp.
255-258. IEEE.
[3] C. Navamani, M. Phil, S. Krishnan, "Credit Card Nearest Neighbor Based Outlier
Detection Techniques".
[4] J. O. Awoyemi, A. O. Adentumbi, S. A. Oluwadare, "Credit card fraud detection using
Machine Learning Techniques: A Comparative Analysis", Computing Networking and
Informatics (ICCNI), 2017 International Conference on, pp. 1-9. IEEE.
[5] R. Choudhary and H. K. Gianey, 2017 Int. Conf. Mach. Learn. Data Sci., pp. 3743, 2017.
[6] G. E. Melo-Acosta, F. Duitama-Muñoz, and J. D. Arias-Londoño, -supervised
Commun. Comput. (COLCOM), 2017 IEEE Colomb. Conf., pp. 16, 2017.
[7] Survey of Credit Card Fraud Detection Techniques: Data and Technique Oriented 26, 2016.
[8] Credit Card Fraud Detection dataset, Kaggle (transactions made in September 2013 by
European cardholders).
