Seminar Report
Bachelor of Technology
In
Computer Engineering
Submitted by
Gajanan Patil
A dynamic Churn prediction Model using soft
computing and Random Forest based Supervised
Learning
Guided by
Prof. R. V. Patil
S.S.V.P.S.’s B.S. DEORE COLLEGE OF ENGINEERING, DHULE
CERTIFICATE
Date: Guide
Place: Dhule Prof. R. V. Patil
Head Principal
Prof. Dr. B. R. Mandre Dr. Hitendra D. Patil
ACKNOWLEDGEMENT
This Seminar I report has taken its current shape after a lot of hard work and
perseverance, and not by me alone. I would like to express my sincere gratitude for
the assistance and support of the many people who helped to make it a
success.
My deepest appreciation and gratitude go to Prof. R. V. Patil, my guide, for his
guidance and enlightening comments throughout the seminar work. It has been an
altogether different experience to work with him, and I would like to thank him for his
helpful suggestions and numerous discussions. I gladly take this opportunity to thank
Prof. B. R. Mandre (Head of Department, Computer Engineering) and Dr. Hitendra D. Patil
(Principal, SSVPS, BSD, College of Engineering, Dhule) for providing facilities during
the progress of the thesis.
I wish to express my sincere thanks to Prof. R. V. Patil for his expert, sincere
and valuable guidance and the encouragement extended to me.
I am also thankful to all those who helped me directly or indirectly to develop
this thesis and complete it successfully. I would also like to thank all the staff for
their encouragement. They have always been very prompt in extending a
helping hand and sharing valuable technical knowledge. Special thanks to my family
and friends.
Gajanan Patil
ABBREVIATIONS
Abbreviations Details
Table of Contents
List of Tables
List of Figures
1 INTRODUCTION
1.1 Introduction
1.2 Background
1.3 Churn in Telecom Industry
1.4 Machine Learning Approach
1.5 Motivation
1.6 Scope
1.7 Goals
1.8 Organization of Thesis
1.9 Summary
2 LITERATURE REVIEW
2.1 Literature Survey
2.2 Summary of the Literature
2.3 Problem Statement and Objective
2.4 Summary
3 METHODOLOGY
3.1 System Analysis
3.1.1 Existing Algorithms
3.1.2 Limitations of previous algorithm
3.1.3 Analysis of the problem
3.1.4 Proposed System
3.2 Software Requirement Specification
3.2.1 Introduction
3.2.2 Overall Description
3.2.3 External interface requirements
3.2.4 Software requirements
3.2.5 Hardware requirements
3.3 Applications
3.4 Future Scope
3.5 System Architecture
3.5.1 Proposed architecture
3.5.2 Modules
3.6 UML Diagrams
3.6.1 Use case Diagram
3.6.2 Activity Diagram
3.6.3 Class Diagram
3.6.4 Data Flow Diagram
3.6.5 Component Diagram
3.6.6 Sequence Diagram
3.7 Summary
4 IMPLEMENTATION
4.1 System model
4.2 Training and Testing
4.3 Algorithms
4.4 Flow chart of System
4.5 Summary
5 RESULTS AND DISCUSSION
5.1 Experimental Setup
5.2 Simulation Scenario
5.3 Evaluation Metrics
5.4 Methods of Comparison
5.5 Result Analysis
5.6 Summary
6 ADVANTAGES
7 CONCLUSION
BIBLIOGRAPHY
PUBLICATION
APPENDIX
FIGURE INDEX
Figure 1.1: Machine learning approach
Figure 1.2: Supervised Learning approach
Figure 1.3: Semi-Supervised Learning approach
Figure 1.4: Un-Supervised Learning approach
Figure 1.5: Reinforcement Learning approach
Figure 3.1: Block diagram of Architecture
Figure 3.2: Use case Diagram
Figure 3.3: Activity Diagram for User
Figure 3.4: Class Diagram
Figure 3.5: DFD Level 0 Diagram
Figure 3.6: DFD Level 1 Diagram
Figure 3.7: Component Diagram
Figure 3.8: Sequence Diagram
Figure 4.1: Flow chart of System
Figure 5.1: Comparative analysis of various classification algorithms
Figure 5.2: Classification Report of Random Forest
Figure 5.3: Performance evaluation of Random Forest classification
Figure 5.4: Discrimination threshold evaluation of Random Forest classification
Figure 5.5: Classification Report of DT classification
Figure 5.6: Performance evaluation of DT classification
Figure 5.7: Discrimination threshold evaluation of DT classification
Figure 5.8: Classification Report of Bagging classification
Figure 5.9: Performance evaluation of Bagging classification
Figure 5.10: Discrimination threshold evaluation of Bagging classification
Figure 5.11: Classification Report of K-Neighbors
Figure 5.12: Performance evaluation of K-Neighbors classification
Figure 5.13: Discrimination threshold evaluation of K-Neighbors classification
TABLE INDEX
Table 1: Summary of Literature survey
Table 2: Testing Parameter for Algorithm
Table 3: Confusion Matrix Analysis
Table 4: Comparative analysis of various classification algorithms
ABSTRACT
Customer churn is a major problem and one of the most important concerns for large
companies. Because churn directly affects company revenues, especially in the
telecom field, companies are seeking ways to predict which customers are likely to
churn. Finding the factors that increase customer churn is therefore important for
taking the actions needed to reduce it. The main contribution of our work is a
churn prediction model that helps telecom operators predict the
customers who are most likely to churn. The model developed in this work applies
machine learning techniques on a big data platform and introduces a new approach to
feature engineering and selection. In addition to measuring the performance of the model, this
work identifies the churn factors that are essential in determining the root causes of
churn. By knowing the significant churn factors in customers' data, CRM can
improve productivity, recommend relevant promotions to groups of likely churners
based on similar behavior patterns, and substantially improve the marketing
campaigns of the company. The proposed churn prediction model is evaluated using
metrics such as accuracy, precision, recall, F-measure, and receiver operating
characteristic (ROC) area. Furthermore, it provides the factors behind customer
churn through the rules generated by the attribute-selected
classifier algorithm.
Key terms: Receiver Operating Characteristic, Deep learning, Convolution Neural
Network, churn prediction, Feature selection.
Chapter - 1
INTRODUCTION
This chapter introduces the churn prediction model and the different approaches to churn
prediction. It explains churn in the telecom industry and its causes, introduces
machine learning techniques such as supervised, unsupervised, and reinforcement
learning, and analyzes which approach is best suited to a churn prediction model.
1.1 Introduction
Consumers today go through a complex decision-making process before subscribing to any
one of the numerous telecom service options. The services provided by telecom
vendors are not highly differentiated, and number portability is commonplace. The mobile
telephone industry faces a similar churn problem [2] [9] [12], so customer loyalty becomes an
issue. Hence, it is becoming increasingly important for telecommunications companies to
proactively identify the factors that lead customers to unsubscribe and to take preventive
measures to retain them. To calculate the probable monthly churn, start with the
number of users who churn in that month. Divide by the total number of user days in that
month to get the number of churns per user day, then multiply by the number of days in the
month to get the resulting monthly churn rate. Research conducted over the past few
years has found data mining techniques to be effective in predicting consumer churn
[17]. Creating an efficient churn prediction model is an essential activity that requires a lot
of work, from determining appropriate predictor variables (features) in the large
volume of available customer data to choosing an effective predictive data mining technique
suitable for the feature set.
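The monthly churn calculation described above can be sketched in a few lines of Python. This is a minimal illustration only; the function name and the figures (50 churned users, 30,000 user days, a 30-day month) are hypothetical and are not taken from the dataset used in this work.

```python
def monthly_churn_rate(churned_users, total_user_days, days_in_month):
    """Return the probable monthly churn rate as a fraction."""
    if total_user_days <= 0:
        raise ValueError("total_user_days must be positive")
    # Churns per user day, as described in the text.
    churn_per_user_day = churned_users / total_user_days
    # Scale by the length of the month to get the monthly rate.
    return churn_per_user_day * days_in_month


# Hypothetical figures: 50 users churned in a 30-day month in which the
# user base accumulated 30,000 user days (about 1,000 active users per day).
rate = monthly_churn_rate(churned_users=50, total_user_days=30_000, days_in_month=30)
print(f"Monthly churn rate: {rate:.1%}")
```

Using user days rather than a single head count means that customers who were active for only part of the month are weighted by the time they were actually subscribed.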
A multi-layer perceptron approach for customer churn prediction has been used in
[14] on customer-related data such as customer profiling, calling patterns, and demographic
data, in addition to the network data that customers generate. Based on a customer's history
of calling behavior, it is possible to classify whether the customer is likely to leave or
not. Research done over the past decade has found data mining techniques to be effective in
predicting churn, and predictive modeling techniques in churn prediction
are also considered to be more accurate. Churn prediction systems and sentiment analysis
use classification as well as clustering techniques to identify churn customers and the
reasons behind the churning of telecom customers [18].
The telecom industry generates a large amount of data on a daily basis. Mining such
vast data with specific data mining techniques is a very tedious task, and predictions
from classical techniques are hard to interpret. Such telecommunication
data may sometimes contain signs of churn, and it is necessary to identify such problems.
Big companies implement churn prediction models to be able to detect possible churners
before they actually leave the company [16].
1.2 Background
Research conducted over the past few years has found data mining techniques to be
effective in predicting consumer churn [3]. Creating an efficient churn prediction
model is an essential activity that requires a lot of work, from determining appropriate
predictor variables (features) in the large volume of available customer data to choosing
an effective predictive data mining technique suitable for the feature set. Telecom industries
collect a large amount of customer-related data such as customer profiling, calling patterns,
and demographic data, in addition to the network data that customers generate [4]. Based on a
customer's history of calling behavior, it is possible to classify whether the customer is
likely to leave or not.
Research done over the past decade has found data mining techniques to be effective in
predicting churn, and predictive modeling techniques in churn prediction
are also considered to be more accurate. Churn prediction systems and sentiment analysis
use classification as well as clustering techniques to identify churn customers and the
reasons behind the churning of telecom customers. The telecom industry [7]
generates a large amount of data on a daily basis; mining such vast data with
specific data mining techniques is a very tedious task, and predictions from
classical techniques are hard to interpret. Various researchers have already described such
work on detecting churn in large data sets by combining static and dynamic approaches, but
these systems still face many problems in the actual identification of churn. Such
telecommunication data may sometimes contain signs of churn, and it is necessary to identify
such problems. Successful identification of churn from large data makes customer
relationship management (CRM) more effective, using various soft computing techniques such as
genetic algorithms and AdaBoost [9].
Using these classes, the Adaptive Neuro Fuzzy Inference System (ANFIS) is used to
develop a sensitive prediction model for churn management [1].
What is Churn Rate? What is the cause for customer churn in the Telecom
Industry?
The churn rate, also known as the rate of attrition or customer churn, is the rate at which
customers stop doing business with an entity [4]. It is most commonly expressed as the
percentage of service subscribers who discontinue their subscriptions within a given time
period. The churn rate in developing markets ranges from 20% to 70%. In some of these
markets, more than 90% of all mobile subscribers are on prepaid service. Some operators in
developing markets lose in aggregate their entire subscriber base to churn in a year [5].
In the present world, Business Intelligence (BI) helps businesses and organizations ask and
answer questions about their data. It helps companies make better decisions by showing
present and historical data within their business context. Self-service BI tools are now
available that help companies understand performance from various angles, so that
they can take action to drive better business outcomes on big data [10]. These tools can
mine massive datasets for performance insights relevant to customer churn, and then push
them to the attention of marketers, customer service managers, and executives so they can
factor these findings into subsequent decisions.
Rich customer data availability in Telecom Sector
Telecom providers, both Communication Service Providers (CSPs) and content providers,
have a unique opportunity to access rich customer data that isn't available to many other
industries. This is due to the nature of their products/services and the visibility they have to
the end-to-end supply chain of communication services. They can see content and service
usage through web services and centralized systems [9]. By accessing data from cell towers
and deployed infrastructure, companies can add a location dimension to the data. Reaching
into individual consumer devices, these companies gain visibility to the last mile of the
supply chain and can access data about the types of users/viewers of their services and
telemetry on end-user service performance [12].
a word from each term in an input text); parsing, which assigns a parse tree to an input
sentence, describing the syntactic structure of the sentence; etc. [17]. Other examples include
regression, which assigns a real-valued output to each input. Probabilistic classification is a
common subclass of classification. Algorithms of this type use statistical inference to find
the best class for a given instance. Unlike other algorithms, which simply output a "best"
class, probabilistic algorithms output the probability of the instance being a member of each
of the possible classes. The best class is then usually chosen as the one with the highest
probability. Such an algorithm has substantial advantages over non-probabilistic classifiers.
Supervised learning
Supervised learning is a machine learning methodology that operates on labeled data
and learns a mapping from inputs to known outputs using training instances, which is then
checked on test instances. Because the training data is labeled and properly categorized,
learning is a guided process conducted under supervision. The supervised technique (also
known as the probabilistic activation method) uses co-occurrence association rule mining
to find categories, similar to the first method [17].
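To make the idea of learning from labeled training instances and checking on test instances concrete, the following minimal sketch uses a one-nearest-neighbour classifier. The classifier is chosen here only for brevity and is not the method of [17]; the feature values and labels are hypothetical.

```python
import math


def nearest_neighbor_predict(train_features, train_labels, query):
    """Predict the label of `query` from its closest labeled training example."""
    best_index = min(
        range(len(train_features)),
        key=lambda i: math.dist(train_features[i], query),
    )
    return train_labels[best_index]


# Hypothetical labeled training data:
# (monthly minutes, customer service calls) -> class label.
train_features = [(300.0, 0.0), (280.0, 1.0), (120.0, 5.0), (90.0, 6.0)]
train_labels = ["stay", "stay", "churn", "churn"]

# Held-out test instances whose true labels are known but not shown to the model.
test_features = [(310.0, 1.0), (100.0, 4.0)]
test_labels = ["stay", "churn"]

predictions = [
    nearest_neighbor_predict(train_features, train_labels, x) for x in test_features
]
accuracy = sum(p == t for p, t in zip(predictions, test_labels)) / len(test_labels)
print(predictions, accuracy)
```

The labels in the training set are what make the process supervised: the model never has to discover the classes by itself, it only has to generalize them to unseen customers.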
Figure 1.2: Supervised Learning approach
Semi-Supervised Learning
Semi-supervised learning is a machine learning approach in which a small quantity of labeled
data is used together with unlabeled data. Combining different classifiers is also a
variation. The objective of semi-supervised learning is to learn from unlabeled data with
the help of a labeled data set [17].
Unsupervised Learning
Unsupervised learning is a frequently used strategy in which correlations are discovered and
grouping techniques are used for classification. It operates on unlabeled
data: the machine is given input vectors with no dependent (output)
variable and must find the clusters on its own. The suggested unsupervised technique (dubbed
the spreading activation method) learns relevant rules between notional words (defined as the
words in the sentence after deleting stop words and low frequency words) and the
considered categories using co-occurrence association rule mining in a similar fashion
[17].
In unsupervised classification, the system operates directly on the given data
or repository, which is neither marked nor labeled, and the process is not supervised.
Since the output variable is unknown, unsupervised learning can handle more complex
tasks than supervised methods.
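As an illustration of grouping unlabeled data, the sketch below runs a one-dimensional k-means clustering. K-means is used here as a simple stand-in for the grouping techniques mentioned above, not as the spreading activation method of [17]; the usage values and starting centers are hypothetical.

```python
def kmeans_1d(values, centers, iterations=10):
    """Cluster 1-D values around the given initial centers (Lloyd's algorithm)."""
    centers = list(centers)
    for _ in range(iterations):
        # Assignment step: put every value in the cluster of its nearest center.
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Update step: move each center to the mean of its cluster.
        # An empty cluster keeps its previous center.
        centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
    assignments = [
        min(range(len(centers)), key=lambda i: abs(v - centers[i])) for v in values
    ]
    return centers, assignments


# Hypothetical unlabeled data: monthly call minutes for six customers.
monthly_minutes = [95.0, 110.0, 120.0, 290.0, 300.0, 310.0]
centers, assignments = kmeans_1d(monthly_minutes, centers=[100.0, 200.0])
print(centers, assignments)
```

No class labels are supplied at any point; the two groups (low and high usage) emerge only from the distances between the values.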
Reinforcement Learning
Reinforcement learning operates on the basis of rewards and penalties. It can be seen
as learning how an agent benefits from its actions: an action may earn a
reward for the desired performance in a given context, or it may incur a penalty for
errors made.
The agent learns how to adjust its behavior in a given context. In each
case, the agent must properly analyze the situation and avoid penalties by
taking the right actions.
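The reward and penalty mechanism can be sketched with an epsilon-greedy agent that keeps a running value estimate for each action. This is a deliberately simplified, single-context (bandit) form of reinforcement learning; the action names, the reward function and all parameter values are hypothetical.

```python
import random


def train_agent(reward_for, actions, episodes=200, learning_rate=0.1,
                epsilon=0.1, seed=0):
    """Learn a value estimate per action from rewards and penalties."""
    rng = random.Random(seed)
    values = {action: 0.0 for action in actions}
    for _ in range(episodes):
        if rng.random() < epsilon:
            action = rng.choice(actions)            # explore a random action
        else:
            action = max(actions, key=values.get)   # exploit the best estimate
        reward = reward_for(action)
        # Move the estimate a small step towards the observed reward.
        values[action] += learning_rate * (reward - values[action])
    return values


def reward_for(action):
    """Hypothetical environment: one action is rewarded, the other penalized."""
    return 1.0 if action == "offer_discount" else -1.0


actions = ["offer_discount", "ignore_customer"]
values = train_agent(reward_for, actions)
best_action = max(values, key=values.get)
print(best_action, values)
```

The agent is never told which action is correct; it only observes the reward or penalty that follows each choice and gradually prefers the action with the higher estimated value.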
The health diagnostic system predicts disease using a neural
classification approach based on the suggested fuzzy theory. This system has a
sub-component called the severity section, which is responsible for determining the degree of
severity [8] [11] [18]. The user information is eventually categorized as normal or
affected by the infection. Intelligent fuzzy rules are used in the expert system to make
decisions about rehabilitative documents. The experimental findings indicate that the
proposed work outperforms the existing traditional classification
mechanisms.
1.5 Motivation
• The ability to accurately predict future churn rates is necessary because it helps
a business gain a better understanding of expected future revenue [7].
• Predicting churn rates can also help a business identify and improve upon areas
where customer service is lacking [4].
• In this research we propose a churn prediction model for the telecom sector that uses
machine learning to reduce future revenue losses [12].
1.6 Scope
Our intention is to collect data from a popular online customer review website for
churn prediction [3] [4], and to predict future churn using machine learning
algorithms. The system can work in a stable, real-time environment and aims for the
best prediction accuracy [5].
1.7 Goal
To categorize online customer reviews into different class labels.
1.9 Summary
This chapter describes the basic idea of churn and the problem of churn in the telecom industry.
Reducing churn is more important than ever, particularly in light of the telecom industry's
growing competitive pressures. At the present stage, many operators have not taken the steps
required to build a strong analytical foundation, to establish a truly
aspirational mandate for data-based decision-making, or to capitalize on analytical insights. The
companies that move quickly will be best positioned for success in the future.
Chapter - 2
LITERATURE SURVEY
This chapter gives the details of various churn prediction techniques and presents
the literature survey for churn prediction. A literature review helps to summarize
and synthesize the arguments and ideas of existing knowledge in a particular field without
adding any new contributions.
The optimal neural network architecture includes 14 input nodes, 1 hidden node
and 1 output node, trained with the Levenberg-Marquardt (LM) learning algorithm. The Multilayer
Perceptron (MLP) neural network approach was used to predict customer churn in one of the leading
telecommunications companies in Malaysia and was compared to the most common churn prediction
techniques, such as Multiple Regression Analysis and Logistic Regression Analysis.
The system in [5] builds an efficient and descriptive statistical churn model using
a Partial Least Squares (PLS) approach that focuses on strongly correlated variables in the data sets.
A preliminary analysis reveals that the proposed model provides more reliable results than
conventional prediction models and identifies the core variables needed to better explain
churning behavior. Additionally, several simple churn marketing strategies (network
management, overage management, and complaint management) are presented and discussed.
Burez and Van den Poel [6] study the problem of unbalanced data sets in churn prediction models
and compare the performance of Random Sampling, Advanced Under-Sampling, Gradient Boosting
Model, and Weighted Random Forest. The models were evaluated using the AUC and Lift
metrics. The study shows that the under-sampling technique outperforms the other
techniques evaluated.
Brandusoiu [7] describes an advanced data mining method for churn detection on a large
consumer dataset. Details of about 3500 consumers are analyzed, based on the number of
incoming and outgoing calls and texts. Several machine learning algorithms were
used for training and testing the classification, respectively. The system's estimated average
accuracy is about 90 percent for the entire dataset.
He et al. [8] developed a predictive model based on the Neural Network method to address
the issue of consumer churn at a major Chinese telecommunications corporation with
approximately 5.23 million subscribers. Prediction quality was measured by the overall
accuracy rate, which reached 91.1%.
Idris [9] proposed a genetic programming approach with AdaBoost to model the churn problem in
telecommunications. The model was tested on two standard data sets, one from Orange Telecom
and the other from cell2cell, with an accuracy of 89% for the cell2cell dataset and 63% for the other one.
Huang et al. [10] studied customer churn on a big data platform. The
researchers' aim was to show that big data significantly improves the process of churn
prediction, depending on the volume, variety and velocity of the data. A big data platform
was needed for feature engineering, to accommodate data from the Operation Support and
Business Support departments at China's biggest telecommunications firm. The Random
Forest algorithm was used and assessed using AUC.
According to [11], input features are clustered with the k-means and fuzzy c-means clustering
algorithms to place subscribers in separate discrete groups. Using these classes, the Adaptive
Neuro Fuzzy Inference System (ANFIS) is applied to construct a predictive
model for active churn management. The first prediction step begins with parallel Neuro fuzzy
classification. The FIS then takes the outputs of the Neuro fuzzy classifiers as input to decide on
churners' activities. Performance measurements can be used to recognize inefficiency problems.
Churn management metrics are associated with customer service, network services,
operations, and efficiency. GSM number portability is a vital criterion for determining
churners.
The system in [12] presents a new set of features to improve the recognition rate of potential
churners. The features are derived from call details and customer profiles and are categorized
as contract-related, call pattern description, and call pattern changes description features. The
features are tested using two probabilistic data mining algorithms, Naïve Bayes and Bayesian
Network, and their results are compared to those obtained with the C4.5 decision tree,
an algorithm commonly used in many classification and prediction tasks. These have
contributed, among other factors, to the risk that customers can easily switch to competitors.
One of the techniques that can be used to address this is to improve churn prediction from large
amounts of data with extraction in the near future.
The work in [13] formalizes the time-window selection method and reviews the
literature. It then analyzes the increase in churn model accuracy obtained by
extending the history of customer events from one to seventeen years, using logistic
regression, classification trees and bagging combined with classification trees. The practical
consequence is that researchers can significantly reduce data-related burdens such as data
storage, preparation and analysis. The amount that subscribers have to pay depends
on the subscription's duration and promotional context. The newspaper company sends a
letter to remind customers that their subscription is coming to an end and asks whether they
want to renew it, along with guidance on how to do so. Customers
cannot cancel their subscription, and they have a grace period of four weeks once their
subscription has lapsed.
According to [14], the most effective customer retention techniques should be used to
effectively reduce customer turnover rates. The research proposes a Multilayer Perceptron
(MLP) neural network approach to predict customer churn in one of Malaysia's leading
telecommunications firms. The findings were compared with the most common techniques
of churn prediction, such as Multiple Regression Analysis and Logistic Regression Analysis.
The optimal configuration of the neural network contains 14 input nodes, 1 hidden node and
1 output node, trained with the Levenberg-Marquardt (LM) learning algorithm.
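The 14-1-1 network shape reported in [14] can be illustrated with a forward pass written in plain Python. This is only a sketch of the topology: the sigmoid activations, the weights and the input values are assumptions made for illustration, and the Levenberg-Marquardt training step is not shown.

```python
import math


def sigmoid(x):
    """Logistic activation (an assumption; [14] may use a different function)."""
    return 1.0 / (1.0 + math.exp(-x))


def mlp_forward(inputs, hidden_weights, hidden_bias, output_weight, output_bias):
    """Forward pass of a 14-1-1 multilayer perceptron; returns a score in (0, 1)."""
    if len(inputs) != len(hidden_weights):
        raise ValueError("expected one hidden-layer weight per input feature")
    # Single hidden node: weighted sum of all 14 inputs, then activation.
    hidden = sigmoid(sum(w * x for w, x in zip(hidden_weights, inputs)) + hidden_bias)
    # Single output node: reads the hidden activation and produces a churn score.
    return sigmoid(output_weight * hidden + output_bias)


# Hypothetical, untrained parameters and one customer with 14 scaled features.
inputs = [0.5] * 14
hidden_weights = [0.1] * 14
score = mlp_forward(inputs, hidden_weights, hidden_bias=0.0,
                    output_weight=1.0, output_bias=0.0)
print(f"Churn score: {score:.3f}")
```

In practice the weights and biases would be fitted to labeled customer records, and the score would be compared with a threshold to decide whether a customer is flagged as a likely churner.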
The system in [15] builds an accurate and concise predictive churn model
using a Partial Least Squares (PLS) methodology based on highly correlated variables in the
data sets. A preliminary experiment shows that the presented model performs more
accurately than traditional prediction models and identifies key variables for
better understanding churning behavior. Additionally, a range of basic churn marketing
strategies (system management, overage management, and complaint management)
is presented and discussed.
Burez and Van den Poel [16] studied the problem of unbalanced datasets in churn
prediction models and compared the performance of Random Sampling, Advanced Under-
Sampling, Gradient Boosting Model, and Weighted Random Forests. They used the AUC and Lift
metrics to evaluate the models. The results showed that the under-sampling technique
outperformed the other tested techniques.
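The idea behind under-sampling an unbalanced churn dataset can be sketched as follows. The code shows plain random under-sampling, which is simpler than the advanced under-sampling evaluated in [16]; the function name and the synthetic records are hypothetical.

```python
import random


def random_undersample(rows, label_of, seed=0):
    """Balance the classes by randomly dropping rows from the larger classes."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    # Every class is reduced to the size of the smallest class.
    target = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical unbalanced data: 5 churners among 100 customers.
rows = [{"id": i, "churn": i < 5} for i in range(100)]
balanced = random_undersample(rows, label_of=lambda r: r["churn"])
print(len(balanced), sum(r["churn"] for r in balanced))
```

Under-sampling is normally applied to the training split only, so that the test data keeps the class distribution that the model will face in use.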
Schouten et al. [17] presented an advanced data mining methodology to predict
churn for prepaid customers using a dataset of call details for 3333 customers with 21 features
and a dependent churn parameter with two values: Yes/No. Some features include
information about the number of incoming and outgoing messages and voicemail for each
customer. The author applied the principal component analysis (PCA) algorithm to reduce the
data dimensions. Three machine learning algorithms were used to predict churn: Neural
Networks, Support Vector Machine, and Bayes Networks. The author used AUC to
measure the performance of the algorithms. The AUC values were 99.10%, 99.55% and
99.70% for Bayes Networks, Neural Networks and Support Vector Machine, respectively. The
dataset used in this study is small and contains no missing values.
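Since several of the surveyed studies report AUC, the sketch below shows one way to compute it directly from its definition: the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The labels and scores are hypothetical and are unrelated to the values reported in [17].

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # churner ranked above non-churner
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(positives) * len(negatives))


# Hypothetical ground truth (1 = churn) and predicted churn scores.
labels = [1, 1, 0, 0, 1, 0]
scores = [0.9, 0.6, 0.4, 0.7, 0.8, 0.2]
auc = roc_auc(labels, scores)
print(f"AUC = {auc:.3f}")
```

The pairwise form is quadratic in the number of examples and is meant only to show what the metric measures; rank-based implementations are preferable for large datasets.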
Karahoca [18] proposed a prediction model based on the Neural Network
algorithm in order to solve the problem of customer churn in a large Chinese telecom
company with about 5.23 million customers. Prediction quality was measured by the
overall accuracy rate, which reached 91.1%.
Kamalraj [19] proposed an approach based on genetic programming with
AdaBoost to model the churn problem in telecommunications. The model was tested on
two standard data sets, one from Orange Telecom and the other from cell2cell, with 89%
accuracy for the cell2cell dataset and 63% for the other one.
Tree[2] customers 112 calls pattern
attributes
3 | Neural network, Regression [3] | Unknown; 129,892 customers; 113 attributes | Demographic, value added, usage pattern | Heterogeneous dataset tedious to handle in similar pattern environments.
5 | Stepwise Variable Selection, partial least squares [5] | Cell2Cell dataset; 100,000 customers; 171 attributes | Behavioral information, customer care and demographics | Redundant features generate a high error rate.
7 | Binomial Logistic Regression model [7] | Iranian telco operator; 3150 customers; 15 attributes | Demographic, call usage pattern, customer care service | Language influence can generate an irrelevant feature vector.
models (GAM) Attributes payment
[8]
Behaviors
information
10 | Decision tree as well as machine learning algorithms [10] | Cell2Cell dataset; 100,000 customers; 171 attributes | Behavioral data, customer care and feature information | Generates the churn possibility, but sometimes generates a false ratio.
Problem Statement
The proposed research work aims to measure and identify churn through text analysis, using
NLP and a machine learning classifier; to identify changes in customer behavior patterns
during prediction; to identify the factors that most reduce the accuracy of churn
prediction; and to evaluate and calculate the churn rate month-wise as well as day-wise, which
is useful for enhancing the service quality of the system.
Objectives
To design and develop an approach for Churn Prediction with Sentiment Analysis on
a large dataset of customer reviews.
To implement the proposed system with various feature extraction as well as selection
techniques and evaluate the performance of the system.
To validate the proposed system with the respective machine learning algorithms and deploy
it in a real-time environment.
To explore and validate the proposed system through comparative analysis of classification
accuracy on various datasets.
2.4. Summary
A lot of research has been done on churn prediction from customer reviews.
Various techniques have been studied and analyzed. The work of various researchers has been
tabulated on the basis of their techniques, framework, dataset, evaluation metrics,
performance and limitations. By analyzing the survey we can conclude that a lot of work has
been done and that work is still ongoing, as there are still many areas that need to be
studied and applied.
Chapter - 3
METHODOLOGY
This chapter describes the system in detail. It gives detailed information about the
proposed system, along with the benefits and architecture of the model. It also explains the
existing churn prediction systems and their limitations and problems.
Research conducted over the past few years has found data mining techniques [8] [12] to be
effective in predicting consumer churn. Creating an efficient churn
prediction model is an essential activity that requires a lot of work, from determining
appropriate predictor variables (features) in the large volume of available customer data to
choosing an effective predictive data mining technique suitable for the feature set. Telecom
industries collect a large amount of customer-related data such as customer profiling, calling
patterns, and demographic data, in addition to the network data that customers generate. Based
on a customer's history of calling behavior, it is possible to classify whether the customer is
likely to leave or not.
Research done over the past decade has likewise found data mining techniques to be
effective in predicting churn [15]. Predictive modeling techniques are also considered to
be more accurate for churn prediction. Churn prediction systems and sentiment analysis use
classification as well as clustering techniques to identify churn customers and the
reasons behind the churning of telecom customers [11]. The telecom industry generates a
large amount of data on a daily basis; mining such vast data with specific data mining
techniques is a tedious task, and the predictions of classical techniques are hard to
interpret. Various researchers have already described such work to identify churn in large
data sets using static as well as dynamic approaches, but these systems still face many
problems in the actual identification of churn. Telecommunication data may sometimes
contain churn, and it is necessary to identify such cases. Successful identification of
churn from large data improves the effectiveness of customer relationship management
(CRM) [10].
In today's environment, customers who write comments tend to churn more frequently, while
voice mail plan customers tend to churn less frequently. Customers with four or more
customer service calls churn more than four times as often as other customers. We
calculate the average churn rate during model training using different machine learning
approaches and evaluate it on the test data [5].
To maximize the organization's sales, as we suggested in our study, accurately
predicting churn is critical. The cost of making an excessive retention effort (false
positives) and the cost of losing a customer because the model does not accurately
anticipate churn (false negatives) can be reduced by combining the customer lifetime value
with the churn prediction [19].
3.1.1 Existing Algorithms
According to [1], input features are clustered with k-means and fuzzy c-means to place
subscribers in independent, distinct classes. Using these groups, the Adaptive Neuro-Fuzzy
Inference System (ANFIS) is applied to construct a predictive model for successful churn
management. The first step towards prediction starts with parallel neuro-fuzzy
classification [18]. The FIS then uses the outputs of the neuro-fuzzy classifiers as input
to decide on the behavior of the churners. Performance metrics can be used to identify
issues of inefficiency. Churn reduction indicators are concerned with the facilities,
processes and performance of the customer support network. Versatility of GSM numbers is a
critical criterion for determining churners.
3.1.2 Limitations of previous algorithm
The main goal of the algorithm is to create a system that produces reliable results with
high precision. The machine learning algorithm in use seeks to accomplish the same thing.
The input of the system can be of any size or resolution, and the system does not depend
on the operating system. The dataset is used for both training and testing. The proposed
research work designs and develops an approach for churn prediction using NLP and machine
learning to enhance the system accuracy [8] [17]. Preprocessing is very important for
making the data useful, because noisy data can lead to poor results. The telecom dataset
contains many missing values, incorrect values such as "Null", and imbalanced attributes.
Our dataset has 29 features. We analyzed the dataset for filtering and reduced the number
of features so that it contains only useful features.
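The filtering step can be illustrated with a minimal sketch. It assumes the data arrives as
CSV rows; the column names, the set of markers treated as missing and the 50% cut-off are
hypothetical and only serve the illustration, since the report names "Null" but does not
specify the exact cleaning rules.

```python
import csv
import io

# Hypothetical markers treated as missing; the report only names "Null".
MISSING = {"", "null", "na", "nan"}


def is_missing(value):
    """Return True when a cell holds no usable value."""
    return value is None or value.strip().lower() in MISSING


def clean_rows(rows, max_missing_ratio=0.5):
    """Drop columns that are mostly missing, then drop rows with any missing value."""
    rows = list(rows)
    if not rows:
        return []
    columns = list(rows[0].keys())
    # Keep a column only if at most max_missing_ratio of its cells are missing.
    kept = [
        c for c in columns
        if sum(is_missing(r[c]) for r in rows) / len(rows) <= max_missing_ratio
    ]
    # Keep a row only if every remaining column has a value.
    return [
        {c: r[c] for c in kept}
        for r in rows
        if not any(is_missing(r[c]) for c in kept)
    ]


# Illustrative data, not taken from the report's dataset.
sample = io.StringIO(
    "customer_id,day_minutes,fax_plan,churn\n"
    "1,120.5,Null,0\n"
    "2,Null,Null,1\n"
    "3,98.0,Null,0\n"
)
cleaned = clean_rows(csv.DictReader(sample))
```

Here the `fax_plan` column is removed because every value is missing, and the second
customer is removed because `day_minutes` is missing.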
3.1.4 Proposed System
The proposed research work designs and develops an approach for churn prediction using
NLP and machine learning to enhance the system accuracy. We then identify changes in
customer behavior patterns during prediction [8]. We also evaluate the factors that most
reduce the accuracy of churn prediction, and finally calculate the churn rate month-wise
as well as day-wise, which is useful for enhancing the service quality of the system. In
this research we propose churn prediction from large-scale data. The system initially
deals with a synthetic telecommunication data set that contains some imbalanced metadata
[10]. Data preprocessing, data normalization, feature extraction and feature selection are
applied in turn. During execution, optimization strategies are used to eliminate redundant
features, which sometimes generate a high error rate. The proposed system is executed for
training and testing. After both phases are complete, the system reports the
classification accuracy for the entire data set [17].
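The month-wise and day-wise churn rate mentioned above can be sketched as follows. This is
an illustration under assumptions: each record is taken to be one customer observed on a
given date with a 0/1 churn flag, and the churn rate of a period is the share of churned
customers among the customers observed in that period. The function name and the sample
records are hypothetical.

```python
from collections import defaultdict
from datetime import date


def churn_rate_by_period(records, key):
    """records: iterable of (date, churned_flag); key maps a date to a period label."""
    totals = defaultdict(int)
    churned = defaultdict(int)
    for day, flag in records:
        period = key(day)
        totals[period] += 1
        churned[period] += int(bool(flag))
    # Churn rate per period = churned customers / observed customers.
    return {p: churned[p] / totals[p] for p in sorted(totals)}


# Illustrative records, not taken from the report's dataset.
records = [
    (date(2021, 1, 5), 0), (date(2021, 1, 5), 1),
    (date(2021, 1, 20), 0), (date(2021, 1, 21), 0),
    (date(2021, 2, 3), 1), (date(2021, 2, 3), 1),
]
monthly = churn_rate_by_period(records, key=lambda d: d.strftime("%Y-%m"))
daily = churn_rate_by_period(records, key=lambda d: d.isoformat())
```

The same function serves both granularities; only the `key` that maps a date to a period
label changes.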
analyze Churn Prediction for telecom sector. This document is written in the style given
below.
1. Font style : Times New Roman
2. Heading : 16 size, Bold
3. Sub-headings: 14 size, Bold
4. Description : 12 size
1. User friendly: The architecture is simple, so it allows users to navigate through
the project easily.
1. Maintainability: All the modules are clearly separated to allow future development of
the user interface and the system in a thoughtful and effective manner.
2. User interface: Users can easily load the dataset and obtain the required results.
User Documentation
A user manual will be included with the program to assist and guide users on how to
interact with the system and perform various tasks. This document is for any individual
user who needs to know about the system's basic architecture and standards. In the next
sections, we'll go through the main components and how they're used.
Input to the System: The input of the system is a CSV file of telecom customer data from
which customer churn can be estimated.
Output of the System: The system provides the customer churn prediction for the telecom
sector.
The external interface includes the user interface, hardware interface and software
interface. These are explained in detail as follows.
User Interface.
Hardware Interface.
The hardware interface required to run the system is a graphics card, or GPU (Graphics
Processing Unit). A standard personal computer has enough storage and processing speed
for developing a typical project. In addition, a graphics card is needed to perform large
computations and parallel tasks efficiently.
Software Interface.
The language used for developing the application is Python. The GitHub IDE is used as an
interface to deal with various Python packages. TensorFlow is also required to develop the
project. The operating system used can be either Windows 7
3.3. Applications
The proposed system can be implemented with a deep learning algorithm to achieve better
accuracy. Because the input data is large in size and volume, running the proposed system
on the HDFS framework with parallel machine learning algorithms can provide better results
at low computation cost.
The proposed research work designs and develops an approach for churn prediction using
NLP and machine learning to enhance the system accuracy. We then identify changes in
customer behavior patterns during prediction [4]. We also evaluate the factors that most
reduce the accuracy of churn prediction, and finally calculate the churn rate month-wise
as well as day-wise, which is useful for enhancing the service quality of the system. In
this research we propose churn prediction from large-scale data. The system initially
deals with a synthetic telecommunication data set that contains some imbalanced metadata.
Data preprocessing, data normalization, feature extraction and feature selection are
applied in turn [17]. During execution, optimization strategies are used to eliminate
redundant features, which sometimes generate a high error rate. The proposed system is
executed for training and testing. After both phases are complete, the system reports the
classification accuracy for the entire data set.
Figure 3.1 Block diagram of Architecture
System overview
The aim of this kind of research in the telecommunications industry is to help businesses
make more profit. Churn prediction has become one of the most important sources of income
for telecom companies. Therefore, this research aimed at building a system that predicts
customer churn for a telecom company. Such prediction models are expected to achieve high
AUC values. The sample data was divided into 70% for training and 30% for testing to
develop and evaluate the model [9]. We chose 10-fold cross-validation for evaluating and
optimizing hyperparameters. We used feature engineering and effective feature
transformation and selection approaches to make the features fit for machine learning
algorithms. Another concern was also found: the data was not balanced. Only about 5% of
the entries represent churned customers. This problem was solved by under-sampling or by
using tree algorithms that are not affected by this issue. In detecting churn in large
data and providing accurate predictions, our different classifiers can be more accurate.
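The 70/30 split and the under-sampling of the majority class can be sketched as below. This
is a minimal illustration, assuming each row is a dictionary with a 0/1 `churn` label; the
function names, the fixed seed and the synthetic rows with 5% churners are hypothetical.
Under-sampling is applied here to the training part only, which is a common choice and not
something the report states.

```python
import random


def train_test_split(rows, train_ratio=0.7, seed=0):
    """Shuffle the rows and cut them into a training part and a testing part."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    cut = int(round(len(shuffled) * train_ratio))
    return shuffled[:cut], shuffled[cut:]


def undersample(rows, label_key="churn", seed=0):
    """Randomly drop majority-class rows until both classes have the same size."""
    rng = random.Random(seed)
    churners = [r for r in rows if r[label_key] == 1]
    others = [r for r in rows if r[label_key] == 0]
    minority, majority = sorted([churners, others], key=len)
    balanced = minority + rng.sample(majority, len(minority))
    rng.shuffle(balanced)
    return balanced


# Synthetic rows with about 5% churners, mirroring the imbalance described above.
rows = [{"id": i, "churn": 1 if i < 5 else 0} for i in range(100)]
train, test = train_test_split(rows)
balanced_train = undersample(train)
```

After this step the training data holds as many non-churn rows as churn rows, while the
test part keeps the original class distribution.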
This work contributes a supervised approach to aspect category extraction that selects
suitable features and avoids redundancy by measuring the correlation between features. The
results obtained show that weighted term frequency with correlation achieves a
comparatively higher F-score. In this regard, features selected using weighted term
frequency are more relevant [16]. The overlap between features in an aspect category is
avoided by measuring the correlation.
3.5.2 Modules
Data Acquisition: First, the information for different telecom sector customers is
extracted based on certain parameters.
Preprocessing: We then apply various preprocessing steps such as lexical analysis,
stop word removal, stemming (Porter's algorithm), index term selection and data cleaning
in order to prepare the dataset.
Lexical analysis: Lexical analysis separates the input alphabet into:
1. Word characters (e.g. the letters a-z), and
2. Word separators (e.g. space, newline, and tab).
Stop word removal: Stop word removal refers to the removal of words that occur most
frequently in documents.
Stemming: Stemming replaces all the variants of a word with a single stem word.
Variants include plurals, gerund forms, third person suffixes, past tense suffixes, etc.
Data Training: We compile artificial as well as real time using online news data and
provide training with any machine learning classifier.
Testing with machine learning: We predict online news using any machine learning
classifier, weight calculator for real time or synthetic input data accordingly.
Analysis: We demonstrate the accuracy of proposed system and evaluate with other
existing systems
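The preprocessing module listed above can be sketched as a small pipeline. This is a
simplified illustration: the stop word list is a tiny hypothetical sample rather than the
standard dictionary, and the naive suffix stripping only stands in for the Porter algorithm
named in the text.

```python
import re

# Tiny illustrative stop word list; a standard dictionary would be used in practice.
STOP_WORDS = {"the", "is", "a", "an", "and", "of", "to", "my", "i", "was"}

# Naive suffix stripping, standing in for the Porter algorithm.
SUFFIXES = ("ing", "ed", "es", "s")


def tokenize(text):
    """Lexical analysis: keep runs of word characters (a-z), split on everything else."""
    return re.findall(r"[a-z]+", text.lower())


def stem(word):
    """Strip the first matching suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Tokenize, remove stop words and stem the remaining tokens."""
    return [stem(t) for t in tokenize(text) if t not in STOP_WORDS]


tokens = preprocess("The network dropped my calls and billing is confusing")
```

For the sample review, the pipeline keeps five stemmed tokens: `network`, `dropp`, `call`,
`bill` and `confus`.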
The Unified Modeling Language (UML) gives a standard way to write a system model
covering the conceptual ideas. It can be used for modeling a system independent of a
platform language. It is a graphical language for visualizing, specifying, constructing and
documenting information about software intensive system.
3.6.1 Use case Diagram:
Activity diagram is a flow chart to represent the flow from one activity to another activity.
The activity can be described as an operation of the system. The flow can be sequential,
branched or concurrent. Here we have two activity diagrams, one for the user and the other
for the system. The purpose of activity diagrams is to capture the dynamic behavior of the
system. These are drawn as follows.
3.6.3 Class Diagram
A class diagram in the Unified Modeling Language (UML) is a type of static structure
diagram that describes the structure of a system by showing the system's classes, their
attributes, operations (or methods), and the relationships among objects.
The class diagram is the main building block of object-oriented modeling. It is used
for general conceptual modeling of the structure of the application, and for detailed
modeling that translates the models into programming code. Class diagrams can also be
used for data modeling. The classes in a class diagram represent both the main elements
and interactions in the application, and the classes to be programmed.
3.6.4 Data Flow Diagram
A data flow diagram (DFD) maps out the flow of data for any process or system. It uses
defined symbols such as rectangles, circles and arrows, plus short text labels, to show
data inputs, outputs, storage points and the routes between each destination. Data flow
diagrams can range from simple, even hand-drawn process overviews, to in-depth,
multi-level DFDs that dig progressively deeper into how the data is handled. They can be
used to analyze an existing system or model a new one. Like all the best diagrams and
charts, a DFD can often visually "say" things that would be hard to explain in words, and
they work for both technical and nontechnical audiences, from developer to CEO. That is
why DFDs remain so popular after all these years.
3.6.4.2 DFD Level 1
3.6.5 Component Diagram
3.6.6 Sequence Diagram
3.7 Summary
The methodology chapter covers system analysis. The architecture of the proposed system
is outlined. Modeling describes a software system at a high level of abstraction. The
models are built for a better understanding of the system under development. Graphical
representations of the static, logical and dynamic views of the system and of the flow of
execution are shown. Different views of the system are represented using UML diagrams in
this chapter. Each UML diagram is designed to view a software system from a different
perspective and at a varying degree of abstraction.
Chapter - 4
IMPLEMENTATION
This chapter describes the implementation of the proposed system. It also describes the
necessary steps for churn prediction and gives details about each algorithm (Random
Forest, Decision Tree, Bagging Classifier and K-nearest neighbor). It also explains the
data set used by the algorithms for training and testing and gives the detailed flow chart
of the system.
Each sentence in the training dataset undergoes preprocessing such as tokenization, case
transformation (uppercase to lowercase), stop word filtering and stemming. The standard
stop word dictionary available at https://gist.github.com/larsyencken/1440509 is used.
Stemming and lemmatization are two crucial feature normalization methods used in the
preprocessing stage. The stemming method reduces all the inflected words in the text to a
root form called the stem, e.g., 'studying' and 'studies' are each converted into the stem
'study'. Lemmatization converts all the forms of a word to its base lemma. For example,
the terms 'studying' and 'studies' are converted to the lemma 'study'. Lemma features are
considered more accurate than stemmed features. In this experimentation, lemmas are
extracted as features and then passed through the feature selection methods.
Feature selection:
In the system, various feature selection approaches are analyzed and hybrid approach for
feature selection is proposed. The feature selection strategies analyzed are:
Term frequency (TF)
In this approach features are selected based on term frequency count. Term frequency of
each feature with respect to each aspect category is calculated.
A threshold is set for feature selection. Features having term frequency greater than 2 are
selected in each aspect category.
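The term frequency selection can be sketched as follows. The sketch assumes that the
training sentences are already preprocessed into token lists and grouped by aspect
category; the category names and tokens are hypothetical examples.

```python
from collections import Counter


def select_by_term_frequency(sentences_by_category, min_count=2):
    """Keep, per aspect category, the terms whose frequency is greater than min_count."""
    selected = {}
    for category, sentences in sentences_by_category.items():
        counts = Counter(token for tokens in sentences for token in tokens)
        selected[category] = sorted(t for t, c in counts.items() if c > min_count)
    return selected


# Illustrative token lists, not taken from the report's dataset.
corpus = {
    "billing": [["bill", "high"], ["bill", "charge"],
                ["bill", "late", "charge"], ["charge", "refund"]],
    "network": [["signal", "drop"], ["signal", "weak"], ["call", "drop", "signal"]],
}
features = select_by_term_frequency(corpus)
```

With the threshold of 2 used above, only `bill` and `charge` survive in the billing
category and only `signal` in the network category.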
In this approach, the weight of each term is calculated using Eq. (4.1). The weight of a
term $t$ is a conditional probability, where $X_{t,k}$ is the occurrence count of term $t$
in aspect category $k$ and $X_t$ is the total occurrence count of term $t$ in all aspect
categories. If the proportion of occurrences of a term in one aspect category is higher
than in the other aspect categories, the weight of the term increases. A threshold is set
on the weight for each aspect category. Terms (features) having a weight greater than the
threshold are selected to generate a binary train matrix. Weight calculation of terms is
also done in Kirui et al. [2]. This work follows a similar approach to [2] and proposes a
hybrid approach for feature selection using correlation to avoid redundancy in features.
In [2], weights are used to determine the aspect category of a test sentence, whereas in
this approach the weight is used for feature selection and further to generate a binary
train matrix.
$$\mathrm{weight}(t) = \frac{X_{t,k}}{X_t} \qquad (4.1)$$
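As a worked illustration of Eq. (4.1), the sketch below computes the weight of each term
per aspect category and applies a threshold. The counts, the category names and the
threshold value of 0.5 are hypothetical; the report does not state the threshold it used.

```python
def term_weights(counts_by_category):
    """counts_by_category[k][t] holds X_{t,k}; returns weight[k][t] = X_{t,k} / X_t."""
    totals = {}
    for counts in counts_by_category.values():
        for term, count in counts.items():
            totals[term] = totals.get(term, 0) + count
    return {
        k: {t: c / totals[t] for t, c in counts.items()}
        for k, counts in counts_by_category.items()
    }


# Illustrative occurrence counts, not taken from the report's dataset.
counts = {
    "billing": {"charge": 6, "call": 1},
    "network": {"charge": 2, "call": 3},
}
weights = term_weights(counts)

threshold = 0.5  # hypothetical threshold on the weight
selected = {
    k: sorted(t for t, w in ws.items() if w > threshold)
    for k, ws in weights.items()
}
```

The term `charge` occurs 6 times out of 8 in the billing category, so its weight there is
0.75 and it is selected for billing, while `call` is selected for network.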
In classification, features must be relevant but not redundant in order to increase the
accuracy of the classifier. In this strategy, the term frequency matrix obtained is used.
Features obtained using this matrix are relevant but may be redundant. To avoid
redundancy, the correlation of each feature with the other features in that aspect
category is calculated. The Pearson correlation coefficient is used to calculate the
correlation.
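$$r_{xy} = \frac{n \sum x y - \sum x \sum y}{\sqrt{\left[n \sum x^2 - \left(\sum x\right)^2\right]\left[n \sum y^2 - \left(\sum y\right)^2\right]}} \qquad (4.2)$$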
Eq. (4.2) is used to calculate the correlation of each term with the other terms, where x
and y are the vectors of two terms, each containing the term frequency with respect to
each aspect category. The correlation of a term t with the other terms in that aspect
category is averaged. Terms having an average correlation value less than or equal to 0.85
are selected to generate a binary train matrix.
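The correlation-based selection can be sketched as below. The sketch uses the standard
computational form of the Pearson correlation coefficient and the 0.85 cut-off from the
text; the function names and the sample frequency vectors are hypothetical, and a constant
vector is given a correlation of 0 by assumption.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equally long numeric vectors."""
    n = len(x)
    sx, sy = sum(x), sum(y)
    sxy = sum(a * b for a, b in zip(x, y))
    sxx = sum(a * a for a in x)
    syy = sum(b * b for b in y)
    denom = math.sqrt((n * sxx - sx * sx) * (n * syy - sy * sy))
    # Assumption: a constant vector has no defined correlation, so 0.0 is returned.
    return 0.0 if denom == 0 else (n * sxy - sx * sy) / denom


def select_non_redundant(vectors, max_corr=0.85):
    """Keep terms whose average correlation with all other terms is at most max_corr."""
    selected = []
    for term, x in vectors.items():
        others = [pearson(x, y) for other, y in vectors.items() if other != term]
        average = sum(others) / len(others) if others else 0.0
        if average <= max_corr:
            selected.append(term)
    return selected


# Term frequency of each term in three aspect categories (illustrative values).
vectors = {"bill": [5, 1, 0], "invoice": [10, 2, 0], "signal": [0, 1, 6]}
kept = select_non_redundant(vectors)
```

Because the correlations are averaged, a term is dropped only when it is highly correlated
with the other terms on average; in this small example all three terms are kept, whereas
`bill` and `invoice` on their own would both be dropped.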
Weighted Term Frequency with Correlation Coefficient (WTF+CC)
In this approach, the weighted matrix obtained in (ii) is used to generate a new matrix
which contains the weight of a term with respect to each aspect category. Eq. (4.2) is
used to calculate the correlation of each term with the other terms, where x and y are the
vectors of two terms, each containing the weight of the term in each aspect category.
Finally, a binary train matrix is generated as mentioned in (iii).
The contribution of this work is a supervised approach for aspect category extraction
which selects relevant features and avoids redundancy by calculating the correlation among
features.
The results obtained show that the weighted term frequency with correlation approach
achieves a comparatively higher F-score. Here, features selected using weighted term
frequency are more relevant. Redundancy among features in an aspect category is avoided by
calculating the correlation.
4.3 Algorithms
4.3.1 Bagging Classifier
Step 3:
Step 5: T get current state with timestamp
Generate event
end for
Input : Selected feature of all test instances D i….n , Training database policies {T1
………….Tn }
4.3.3 Knearest neighbour Classifier
Input: Train_DatasetF TrF[], Test_DatesetF TsF[], Threshold T.
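Since the steps of this algorithm are not reproduced here, the following is only a generic
sketch of K-nearest neighbor classification and not the exact procedure of this report. It
assumes numeric feature vectors, Euclidean distance and majority voting; the role of the
threshold T listed in the input is not specified in the text and is therefore left out. The
sample features and labels are hypothetical.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Euclidean distance between two equally long feature vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def knn_predict(train_x, train_y, query, k=3):
    """Classify query by majority vote among its k nearest training points."""
    distances = sorted(
        (euclidean(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Illustrative features, e.g. (customer service calls, complaints); label 1 = churn.
train_x = [(1, 0), (2, 1), (8, 5), (9, 4), (7, 6)]
train_y = [0, 0, 1, 1, 1]

churn_like = knn_predict(train_x, train_y, (8, 4))
loyal_like = knn_predict(train_x, train_y, (1, 1))
```

The first query lies among the churn examples and is labelled 1, while the second is
closest to the two non-churn examples and is labelled 0.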
4.4. Flow chart of System
• To apply similar NLP steps (stop word removal, lemmatization, feature extraction and
selection) and machine learning algorithms on large-scale data and identify the churn.
• To develop an algorithm which can work on structured, semi-structured as well as
unstructured large datasets.
• To improve the accuracy of the proposed system over classical machine learning algorithms.
• To measure and identify the churn through text analysis using NLP and a soft computing
based machine learning classifier.
• To identify changes in customer behavior patterns during prediction.
• To identify the factors that most reduce the accuracy of churn prediction.
• To calculate the churn rate month-wise as well as day-wise, which is useful for
enhancing the service quality of the system.
4.5 Summary
This chapter covers the implementation process of the project. The system model, the
desired goals of the system and the proposed system are described in this chapter.
Chapter - 5
This chapter provides the results and a complete analysis of each algorithm. The confusion
matrix describes the performance of each algorithm, and with its help it is easy to
analyze which algorithm is best for customer churn prediction.
Table 2: Testing Parameters for the Algorithm
Serial No   Properties/Parameters
1           Data Set
2           Size of Dataset
3           Number of Records for training and testing
4           Framework selection
5           Name of classifier
6           Epoch size
7           Activation function
8           Time required for execution
9           Accuracy of each algorithm
10          Error Rate of system
11          Elapsed time of specific algorithm
The system should provide n outcomes with minimum time complexity.
The algorithms use all the parameters described in Table 2.
The system collects data from the local file system, and the proposed data placement
mechanism can also provide fast detection of fake accounts from the application GUI.
whether two groups are confused. In supervised learning, a confusion matrix is a simple
tool for evaluating outcomes. It is used to describe the test results of a prediction
model. Each column of the matrix represents the instances in a predicted class, while each
row represents the instances in an actual class. Four independent experiments were
performed to test the discriminant function for various dataset formats.
The calculation of the measures derived from the confusion matrix, implemented after the
experimental analysis, is defined below:
The accuracy (Eq. 5.1) is the percentage of correct predictions out of the total number of
predictions.
It is measured using the following equation:
$$\mathrm{acc} = \frac{TP + TN}{TP + TN + FP + FN} \qquad (5.1)$$
The accuracy of the proposed system has been estimated using the equation above, and it
achieves about 97.23% correct predictions, which is better than all other methods.
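To compare the outcomes of different experiments, the F1-score was used as an
evaluation measure in this analysis. Precision and recall are used to compute the F1-score
(Eq. 5.2). In Eqs. (5.3) and (5.4), TP stands for true positives, FP for false positives,
and FN for false negatives.
$$F_1 = \frac{2 \cdot \mathrm{Precision} \cdot \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \qquad (5.2)$$
$$\mathrm{Precision} = \frac{TP}{TP + FP} \qquad (5.3)$$
$$\mathrm{Recall} = \frac{TP}{TP + FN} \qquad (5.4)$$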
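The accuracy, precision, recall and F1 measures used in this chapter can be computed
directly from the four cells of a confusion matrix. The sketch below uses the standard
definitions; the function name and the counts are hypothetical and are not the results of
this report.

```python
def classification_metrics(tp, tn, fp, fn):
    """Accuracy, precision, recall and F1-score from confusion matrix counts."""
    total = tp + tn + fp + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Illustrative counts for 100 test customers.
metrics = classification_metrics(tp=40, tn=45, fp=5, fn=10)
```

For these counts the accuracy is 85/100 = 0.85, the precision is 40/45, the recall is
40/50 = 0.8, and the F1-score is 80/95, roughly 0.84.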
The proposed system was implemented in a Windows environment using the open-source
Python platform. The file system dataset was used to extract the data from the file
system application. We create various data chunks to measure the classification
accuracy of the system with different deep learning algorithms.
Table 4 above depicts the comparative evaluation of various classification algorithms
for the proposed churn prediction module. K-Neighbors provides the lowest accuracy, while
Random Forest classification gives the highest accuracy of 95% across the
cross-validation runs. The same results are shown in Figure 5.1 below.
Figure 5.1: Comparative analysis of various classification algorithms
Random Forest
Figure 5.3: Performance evaluation of Random forest classification
The confusion matrix shows the actual performance of the model, and the ROC curve helps
to analyze the true positive rate against the false positive rate.
Figure 5.4: Discrimination threshold evaluation of Random Forest classification
The threshold plot shows the precision, recall and F1 scores; with its help we can
analyze which algorithm is best at predicting churn customers.
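The numbers behind a discrimination threshold plot can be sketched by sweeping the
threshold over the predicted churn scores and recomputing precision, recall and F1 at each
step. The scores, labels and thresholds below are hypothetical and only illustrate the
procedure.

```python
def threshold_sweep(scores, labels, thresholds):
    """Return (threshold, precision, recall, f1) for each discrimination threshold."""
    rows = []
    for t in thresholds:
        # A customer is predicted as churn when the score reaches the threshold.
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 0)
        fn = sum(1 for s, y in zip(scores, labels) if s < t and y == 1)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        if precision + recall:
            f1 = 2 * precision * recall / (precision + recall)
        else:
            f1 = 0.0
        rows.append((t, precision, recall, f1))
    return rows


# Illustrative predicted churn scores and true labels (1 = churn).
scores = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]
sweep = threshold_sweep(scores, labels, [0.25, 0.5, 0.75])
best_threshold = max(sweep, key=lambda row: row[3])[0]
```

In this small example a low threshold gives full recall with lower precision, a high
threshold gives the opposite, and the threshold with the highest F1-score is 0.75.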
Decision Tree
Figure 5.6: Performance evaluation of DT classification
The threshold plot shows the precision, recall and F1 scores; with its help we can
analyze which algorithm is best for predicting churn data.
Bagging Classifier
Figure 5.9: Performance evaluation of bagging classification
The confusion matrix is the best way to assess the actual performance of the model, and
the ROC curve helps to analyze the true positive rate against the false positive rate.
The threshold plot shows the precision, recall and F1 scores, which help to analyze the
performance of the classifier.
Knearest Neighbor
The confusion matrix shows the actual performance of the model, and the ROC curve helps
to analyze the true positive rate against the false positive rate.
Figure 5.13: Discrimination threshold evaluation of K-Neighbors classification
The threshold plot shows the precision, recall and F1 scores; with its help we can
analyze which algorithm is best at predicting churn and non-churn data.
5.6 Summary
This chapter describes the experimental data, the results and the performance of the
proposed model. It also provides details of the bigml dataset used as the input dataset
for the system. The results show that the proposed model performs better than the existing
methods. The results are more precise and accurate.
Chapter - 6
ADVANTAGES
Accuracy: Proposed system gives highest accuracy based on real time data with multiple
classification algorithms.
Identify at-risk customers: For any business that wants to enjoy the benefits of
customer churn prediction, machine learning opens dozens of opportunities. Machine
learning is able to analyze client behavior and measure their probability of churning. In
particular, to precisely identify churn rate, machine learning algorithms can be trained to
learn the behavior patterns of clients/partners who have already canceled their contracts or
any other relationships with a particular company and compare them with the existing ones.
Then correlations between the actions of active and inactive clients are done. As a result, the
algorithm recognizes the customers that are more likely to leave.
Identify pain points: Different companies lose their clients for different reasons. In most
cases, there are numerous "pain points," which remain unknown for product owners. From
the bad quality and absent features to unpleasant design and poor customer service — there
are a lot of details which you do not take into account that your clients do. Even if your
product is almost perfect, you can still reward your new customers with some attractive
discounts and offers and ignore your loyal ones. When a business applies churn prediction,
machine learning can do analysis and forecasts based not only on customer behavior but also
on the brands.
Identify methods to implement: After the root cause of client churn has been
identified, companies can reconsider and rebuild their products and change their business
strategy accordingly. Transformed data and automated flow can be used in CRM and
marketing automation systems. However, this doesn't mean that using machine learning for
churn prediction is about building a certain model for a certain task. It is more about domain
knowledge and an ability to deliver the best possible solution based on learning data,
processes, and behavior.
Chapter - 7
CONCLUSION
This work mainly focuses on identifying and detecting churn customers from a massive
telecommunications data set and discusses churn prediction systems produced by different
algorithms. Some systems still face problems in the conversion of linguistic data, which
can cause a high error rate during execution. Many researchers have put forward Natural
Language Processing (NLP) techniques together with various machine learning algorithms;
such a combination is likely to perform well when structuring data. Customer churn is a
major problem and one of the most important concerns for large companies. Because of its
direct effect on company revenues, especially in the telecom field, companies are seeking
to develop machine learning algorithms to predict potential customer churn. In this work,
Random Forest, Decision Tree, Bagging Classifier and K-nearest neighbor classifiers are
employed to predict churn. Among these algorithms, the Random Forest classifier gives the
highest accuracy of 95%, compared with the Decision Tree, Bagging Classifier and K-nearest
neighbor classifiers, whereas the KNN classifier has the lowest accuracy of 81%.
BIBLIOGRAPHY
[1] Karahoca, Adem, and Dilek Karahoca. "GSM churn management by using fuzzy c-
means clustering and adaptive neuro fuzzy inference system." Expert Systems with
Applications 38.3 (2011): 1814-1822.
[2] Kirui, Clement, et al. "Predicting customer churn in mobile telephony industry using
probabilistic classifiers in data mining." International Journal of Computer Science
Issues (IJCSI) 10.2 Part 1 (2013): 165.
[3] Ballings, Michel, and Dirk Van den Poel. "Customer event history for churn
prediction: How long is long enough?" Expert Systems with Applications 39.18
(2012): 13517-13522.
[4] Ismail, Mohammad Ridwan, et al. "A multi-layer perceptron approach for customer
churn prediction." International Journal of Multimedia and Ubiquitous Engineering
10.7 (2015): 213-222.
[5] Lee, Hyeseon, et al. "Mining churning behaviors and developing retention strategies
based on a partial least squares (PLS) model." Decision Support Systems 52.1
(2011): 207-216.
[6] Burez, Jonathan, and Dirk Van den Poel. "Handling class imbalance in customer
churn prediction." Expert Systems with Applications 36.3 (2009): 4626-4636.
[7] Brandusoiu I, Toderean Gavril, Ha B. Methods for churn prediction in the prepaid
mobile telecommunications industry. In: International conference on
communications. 2016. p. 97–100.
[8] He Y, He Z, Zhang D. A study on prediction of customer churns in fixed
communication network based on data mining. In: Sixth international conference on
fuzzy systems and knowledge discovery, vol. 1. 2009. p. 92–4.
[9] Idris A, Khan A, Lee YS. Genetic programming and adaboosting based churn prediction
for telecom. In: IEEE international conference on systems, man, and cybernetics (SMC).
2012. p. 1328–32.
[10] Huang F, Zhu M, Yuan K, Deng EO. Telco churn prediction with big data. In: ACM
SIGMOD international conference on management of data. 2015. p. 607–18.
[11] Karahoca, Adem, and Dilek Karahoca. "GSM churn management by using fuzzy c-
means clustering and adaptive neuro fuzzy inference system." Expert Systems with
Applications 38.3 (2011): 1814-1822.
[12] Kirui, Clement, et al. "Predicting customer churn in mobile telephony industry using
probabilistic classifiers in data mining." International Journal of Computer Science
Issues (IJCSI) 10.2 Part 1 (2013): 165.
[13] Ballings, Michel, and Dirk Van den Poel. "Customer event history for churn
prediction: How long is long enough?" Expert Systems with Applications 39.18
(2012): 13517-13522.
[14] Ismail, Mohammad Ridwan, et al. "A multi-layer perceptron approach for customer
churn prediction." International Journal of Multimedia and Ubiquitous Engineering
10.7 (2015): 213-222.
[15] Lee, Hyeseon, et al. "Mining churning behaviors and developing retention strategies
based on a partial least squares (PLS) model." Decision Support Systems 52.1
(2011): 207-216.
[16] Burez, Jonathan, and Dirk Van den Poel. "Handling class imbalance in customer
churn prediction." Expert Systems with Applications 36.3 (2009): 4626-4636.
[17] Schouten, Kim, et al. "Supervised and unsupervised aspect category detection for
sentiment analysis with co-occurrence data." IEEE transactions on cybernetics 48.4
(2017): 1263-1275.
[18] Karahoca, Adem, and Dilek Karahoca. "GSM churn management by using fuzzy c-
means clustering and adaptive neuro fuzzy inference system." Expert Systems with
Applications 38.3 (2011): 1814-1822.
[19] Kamalraj, N., and A. Malathi. "A survey on churn prediction techniques in
communication sector." International Journal of Computer Applications 64.5 (2013):
39-42.