Mini Project
ASSIGNMENT
MINI-PROJECT
SUBJECT NAME:
ADVANCED BUSINESS DECISIONS USING
ANALYTICS
SUBJECT CODE:
MBA224B2
SUBMITTED BY:
FUNDAMENTALS OF ANALYTICS
ASSIGNMENT
TABLE OF CONTENTS
EXECUTIVE SUMMARY
I. INTRODUCTION TO BUSINESS
II. INTRODUCTION TO DATA COLLECTION AND DATA METHODOLOGY
III. INTRODUCTION TO THE RESULTS OBTAINED AND INFERENCES DRAWN
IV. CONCLUSION
V. CODE USED
VI. REFERENCES
LIST OF TABLES
LIST OF FIGURES
EXECUTIVE SUMMARY
Problem Statement
To enhance the customer experience and drive sales, KFC aims to develop a purchase
recommendation system that utilizes association rules to generate personalized suggestions
based on customers' past purchase history and preferences. The system will analyze
transaction data to identify item relationships and patterns, providing targeted
recommendations to customers, thereby optimizing their purchasing journey and increasing
overall satisfaction.
The primary aim of this recommendation system is to leverage association rule mining
techniques to analyze transaction data and provide personalized suggestions to KFC
customers, thereby optimizing their purchasing journey. Through the utilization of association
rules, the system identifies item relationships within KFC's transaction history, enabling the
generation of targeted recommendations based on customers' past purchases and
preferences.
KFC, a subsidiary of Yum! Brands, Inc. (NYSE: YUM), is a global chicken restaurant brand
with a rich, decades-long history of success and innovation. It all started with one cook,
Colonel Harland Sanders, who created a finger lickin’ good recipe more than 80 years ago, a
list of secret herbs and spices scratched out on the back of the door to his kitchen. Today we
still follow his formula for success, with real cooks breading and freshly preparing our
delicious chicken by hand in more than 26,000 restaurants in over 150 countries and
territories around the world.
KFC's original product is pressure-fried chicken pieces, seasoned with Sanders' signature
recipe of "11 herbs and spices". The constituents of the recipe are a trade secret. Larger
portions of fried chicken are served in a cardboard "bucket", which has become a feature of
the chain since it was first introduced by franchisee Pete Harman in 1957. Since the early
1990s, KFC has expanded its menu to offer other chicken products such as chicken fillet
sandwiches and wraps, as well as salads and side dishes such as French fries and coleslaw,
desserts and soft drinks; the latter often supplied by PepsiCo. KFC is known for
its slogans "It's Finger Lickin' Good!", "Nobody does chicken like KFC" and "So good".
KFC, short for Kentucky Fried Chicken, is renowned for its signature menu items, including
crispy fried chicken, seasoned with a secret blend of 11 herbs and spices. Alongside their
classic chicken pieces, KFC offers a variety of tasty sides such as mashed potatoes with
gravy, coleslaw, biscuits, and seasoned fries. They also feature sandwiches, like the iconic
Zinger, and wraps for those seeking a handheld option. With its focus on flavorful chicken
and satisfying sides, KFC continues to delight customers worldwide with its delicious
offerings.
Here are a few selected products that are among the most popular and most frequently sold
items on KFC's menu.
Their prices start at about 119 and go up to around 1,000.
We can look at the profile of one customer who visits KFC to understand her preferences and
feedback.
Sarah is a busy young professional who often finds herself pressed for time. She enjoys
eating out but doesn't always have the luxury to sit down for a long meal. When she's craving
something quick and satisfying, KFC is one of her go-to options.
She appreciates the convenience of KFC's locations, often opting for the drive-thru when
she's on the run. Sarah enjoys the variety on the menu, especially the classic fried chicken,
but she's also conscious about her health, so she sometimes opts for grilled chicken or
salads. Overall, Sarah sees KFC as a reliable choice for satisfying her cravings when she's
short on time but still wants a delicious meal.
Introduction to Association Rules
Association rule mining is one of the popular techniques used to solve problems such as Market
Basket Analysis (Affinity Analysis) and Recommender systems/Collaborative filtering. It
is a descriptive, not predictive, method often used to discover interesting relationships hidden
in a large dataset, and it is an unsupervised learning method. Market basket analysis and
recommender systems are both popular in marketing for cross-selling products associated with
an item that a consumer is considering.
Association Rules Analysis: The association rules analysis focused on identifying frequent
itemsets, which are combinations of products that are frequently chosen together by
consumers. The concepts of support, confidence, and lift were utilized to identify meaningful
patterns in the data.
Handle missing values: Check for missing values in the dataset and decide on an appropriate
strategy for dealing with them (e.g., imputation or removal).
Remove duplicates: Eliminate any duplicate transactions or records to ensure data integrity.
Outlier detection: Identify and handle any outliers that may skew the results.
3. Transaction Encoding:
Convert data into transaction format: Transform the dataset into a transactional format where
each row represents a transaction and contains a list of items purchased.
Handle itemsets: Combine duplicate items within each transaction and ensure that each item
appears only once.
4. Data Transformation:
Binarization: Convert the transactional data into a binary format (0 or 1) representing the
presence or absence of each item in a transaction.
One-Hot Encoding: Encode categorical variables (e.g., item IDs) into a binary format using
one-hot encoding to represent each item as a binary vector.
5. Transaction Reduction:
Remove infrequent items: Filter out items that occur infrequently in the dataset, as they may
not provide significant insights and can lead to sparse association rules.
Prune transactions: Remove transactions that contain only a small number of items or do not
contribute significantly to the analysis.
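The preprocessing steps above can be sketched in base R. This is only an illustration under stated assumptions: the `orders` data frame, its column names `order_id` and `item`, and the 0.5 cutoff are hypothetical and are not taken from the project's dataset. Outlier handling is left out because it depends on the data.

```r
# Hypothetical raw data: one row per purchased item
# (order_id and item are assumed column names, not the project's real ones)
orders <- data.frame(
  order_id = c(1, 1, 1, 2, 2, 3, 3, 3, 3),
  item     = c("French Fries", "Pepsi", "Pepsi",
               "Zinger Burger", "Smoky Grilled",
               "French Fries", "Zinger Burger", "Smoky Grilled", NA),
  stringsAsFactors = FALSE
)

# Handle missing values: here rows with a missing item are simply dropped
orders <- orders[!is.na(orders$item), ]

# Remove duplicates so each item appears at most once per transaction
orders <- unique(orders)

# Binarization: rows = transactions, columns = items, TRUE = item present
basket <- unclass(table(orders$order_id, orders$item)) > 0

# Transaction reduction: keep items found in at least min_share of transactions
min_share <- 0.5   # assumed cutoff, for illustration only
item_support <- colMeans(basket)
basket_reduced <- basket[, item_support >= min_share, drop = FALSE]

print(basket_reduced)
```

A logical matrix of this shape is the kind of input that the `Data.bin` object in the code section is expected to hold; with the arules package loaded it can typically be coerced with `as(basket, "transactions")`.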
Association Rules
Association rules are a fundamental concept in data mining and machine learning,
particularly in the context of market basket analysis and recommendation systems. They
identify relationships or associations between items in a dataset, revealing patterns of co-
occurrence or correlation. These rules are typically represented as "if-then" statements,
indicating that if certain items are present in a transaction, then other items are likely to be
present as well.
Here's a detailed explanation of association rules:
Support: The support of an itemset is the proportion of transactions in the dataset
that contain that itemset. It measures the frequency of occurrence of the itemset and
is calculated as the number of transactions containing the itemset divided by the total
number of transactions.
Confidence: The confidence of an association rule measures the likelihood that the
presence of one item (antecedent) in a transaction implies the presence of another
item (consequent) as well. It is calculated as the support of the combined itemset
divided by the support of the antecedent itemset.
Lift: Lift measures the strength of association between two items, taking into account
the support of both the antecedent and consequent itemsets. It compares the
observed support of the combined itemset to what would be expected if the items
were independent.
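The three definitions above can be written as small base R functions. The sketch below uses a made-up five-transaction incidence matrix; the item names and values are illustrative assumptions, not the project's data.

```r
# Toy incidence matrix (hypothetical data): rows = transactions, columns = items
basket <- matrix(
  c(TRUE,  TRUE,  FALSE,
    TRUE,  TRUE,  TRUE,
    TRUE,  FALSE, FALSE,
    FALSE, TRUE,  TRUE,
    TRUE,  TRUE,  FALSE),
  ncol = 3, byrow = TRUE,
  dimnames = list(NULL, c("Fries", "Burger", "Pepsi"))
)

# Support: share of transactions that contain every item in `items`
support <- function(m, items) {
  mean(rowSums(m[, items, drop = FALSE]) == length(items))
}

# Confidence: support of lhs and rhs together, divided by support of lhs
confidence <- function(m, lhs, rhs) {
  support(m, c(lhs, rhs)) / support(m, lhs)
}

# Lift: confidence divided by support of rhs; 1 means independence
lift <- function(m, lhs, rhs) {
  confidence(m, lhs, rhs) / support(m, rhs)
}

support(basket, c("Fries", "Burger"))   # 3 of the 5 transactions
confidence(basket, "Fries", "Burger")   # 3 of the 4 Fries transactions
lift(basket, "Fries", "Burger")         # below 1 in this toy example
```

In this toy data Fries and Burger each appear in 4 of 5 transactions, so their lift comes out slightly below 1 even though they often appear together.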
Rule Metrics: Begin by examining the support, confidence, and lift values associated
with each rule. These metrics provide quantitative measures of the rule's significance
and strength:
Support: Indicates how frequently the itemset occurs in the dataset.
Confidence: Measures the probability of the consequent item given the
antecedent item.
Lift: Compares the observed support of the rule to what would be
expected if the items were independent.
Identify High-Lift Rules: Focus on rules with lift values greater than 1, as they
indicate a positive association between the antecedent and consequent items. These
rules suggest that the items tend to occur together more often than expected by
chance.
Understand Confidence Levels: Consider the confidence level of each rule to gauge
the strength of the association. Higher confidence values imply stronger relationships
between items. Rules with low confidence may indicate weaker associations or
potential areas for further investigation.
Explore Support Levels: Evaluate the support values to understand the frequency of
occurrence of each itemset in the dataset. Higher support values indicate that the
itemset is more common among transactions, while lower support values may signify
niche or less frequent patterns.
Consider Rule Interpretability: Assess the interpretability of the association rules by
examining the antecedent and consequent itemsets. Rules should be logical and
meaningful in the context of the dataset and domain. Ensure that the rules make
intuitive sense and align with prior knowledge or expectations.
Prune Rules: Depending on the objectives of your analysis, you may need to filter or
prune the association rules to focus on the most relevant or actionable insights. This
could involve setting thresholds for support, confidence, or lift values or applying
additional criteria based on domain knowledge.
Visualize Results: Use visualizations such as scatter plots, network diagrams, or
heatmaps to visualize the association rules and their key metrics. Visual
representations can help identify patterns, trends, and clusters within the data more
effectively.
Contextualize Findings: Interpret the association rule output within the broader
context of the business problem or research objectives. Consider how the discovered
patterns and relationships can inform decision-making, strategy development, or
operational improvements.
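As a rough illustration of the pruning step, the sketch below filters a small table of rules by thresholds and ranks the survivors by lift. The three rules and their metrics are the ones reported in the results of this report; the data frame layout and the thresholds (support 0.2, confidence 0.7, lift above 1) are assumptions chosen for the example.

```r
# Three two-item rules with the metrics reported in this report
rules <- data.frame(
  lhs        = c("Rice bowl", "Krunchy burger", "French Fries"),
  rhs        = c("Smoky grilled", "Chicken popcorn", "Noodles"),
  support    = c(0.4181, 0.3636, 0.3636),
  confidence = c(0.741, 0.800, 0.588),
  lift       = c(1.23, 1.571, 1.15),
  stringsAsFactors = FALSE
)

# Assumed thresholds, for illustration only
min_support    <- 0.2
min_confidence <- 0.7

# Keep rules that clear every threshold and show a positive association
keep <- rules$support >= min_support &
  rules$confidence >= min_confidence &
  rules$lift > 1
pruned <- rules[keep, ]

# Rank the remaining rules by lift, strongest first
pruned <- pruned[order(-pruned$lift), ]
print(pruned)
```

With these thresholds the French Fries rule is dropped because its confidence is below 0.7, and the Krunchy burger rule ranks first.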
Interpreting this finding reveals important insights into consumer behavior and preferences:
Popular Combination: Popular combinations refer to sets of KFC products that are
frequently purchased together by customers. These combinations represent patterns of
consumer behavior and preferences, providing valuable insights for menu optimization,
cross-selling strategies, and recommendation systems.
Complementary Products: Complementary products are items or goods that are typically
used or consumed together or in conjunction with each other. These products have a natural
affinity or relationship, and their consumption or use tends to enhance the overall value or
utility for the consumer. In other words, they complement each other in a way that makes
them more appealing or useful when used together than when used individually.
Opportunities for Recommendation: These are the areas where a recommendation system
can suggest products or services to users based on their preferences, behavior, or past
interactions. For a business like KFC, they are the scenarios in which the recommendation
system can suggest specific menu items or promotions to customers to enhance their
experience and drive sales.
The results presented are association rules generated from the transactions dataset,
indicating the relationships between different sets of products. Each rule consists of an
antecedent (lhs, left-hand side) and a consequent (rhs, right-hand side), along with
metrics such as support, confidence, coverage, lift, and count.
1. {} => {French Fries}:
Support: 0.6181818
Confidence: 0.6181818
Lift: 1
Interpretation: About 62% of all transactions include French Fries. Because the left-hand
side is empty, the rule is not conditioned on any other product, so its confidence equals its
support (62%). The lift of 1.0 confirms that the rule describes only the baseline popularity
of French Fries, not an association with another product.
2. {} => {Smoky Grilled}:
Support: 0.60000
Confidence: 0.60000
Lift: 1
Interpretation: Similarly, 60% of all transactions include Smoky Grilled. With an empty
left-hand side, the confidence of 60% is simply this baseline purchase rate, and the lift of
1.0 indicates no association beyond it.
3. {} => {Zinger Burger}:
Interpretation: Similarly, about 56% of all transactions include Zinger Burger. With an
empty left-hand side, the confidence of 56% is simply this baseline purchase rate.
4. {} => {Tandoori Burger}:
Support: 0.50909
Confidence: 0.50909
Lift: 1
Interpretation: About 51% of all transactions include Tandoori Burger. With an empty
left-hand side, the confidence of 51% is simply this baseline purchase rate. Again, the lift
of 1.0 indicates no association beyond what would be expected by chance.
5. {} => {Pepsi}:
Support: 0.4909091
Confidence: 0.4909091
Lift: 1
Interpretation: Similarly, about 49% of all transactions include Pepsi. With an empty
left-hand side, the confidence of 49% is simply this baseline purchase rate. Again, the lift
of 1.0 indicates no association beyond what would be expected by chance.
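The reason every rule with an empty left-hand side has confidence equal to support and a lift of exactly 1 follows directly from the definitions. A short derivation, writing supp, conf, and lift for the three metrics and B for the consequent itemset (this notation is introduced here for illustration), and using the fact that the empty itemset is contained in every transaction:

```latex
\[
\operatorname{supp}(\varnothing) = 1
\quad\Longrightarrow\quad
\operatorname{conf}(\varnothing \Rightarrow B)
  = \frac{\operatorname{supp}(\varnothing \cup B)}{\operatorname{supp}(\varnothing)}
  = \operatorname{supp}(B),
\qquad
\operatorname{lift}(\varnothing \Rightarrow B)
  = \frac{\operatorname{conf}(\varnothing \Rightarrow B)}{\operatorname{supp}(B)}
  = 1 .
\]
```

For example, with B = {French Fries} both the support and the confidence are 0.6181818, so the lift is 0.6181818 / 0.6181818 = 1.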
Interpretation: The scatter plot above gives a visual overview of the generated rules,
with each rule shown as a point plotting its support against its confidence. Applying a
cutoff of 0.55 to both support and confidence, together with a lift of 1, leaves only a
few rules for frequent 1-itemsets that can be taken forward.
The graph above is a pair plot, which shows scatter plots of the rules for each pair of
the metrics support, confidence, coverage, lift, and count.
Interpretation:
1. {Rice bowl} => {Smoky grilled}
Support = 0.4181
Lift = 1.23
Count = 23
Confidence = 0.741
The support indicates that about 41.8% of all transactions contain both Rice bowl and
Smoky grilled, and the confidence indicates that about 74% of Rice bowl purchases also
include Smoky grilled. The lift of 1.236 indicates a positive association between them.
2. {Zinger burger} => {Smoky grilled}
Support = 0.381
Lift = 1.129
Count = 21
Confidence = 0.67
The support indicates that about 38% of all transactions contain both Zinger burger and
Smoky grilled, and the confidence indicates that about 67% of Zinger burger purchases also
include Smoky grilled. The lift of 1.129 indicates a positive association between them.
3. {Krunchy burger} => {Chicken popcorn}
Support = 0.3636
Lift = 1.571
Count = 20
Confidence = 0.800
The support indicates that about 36% of all transactions contain both Krunchy burger and
Chicken popcorn, and the confidence indicates that 80% of Krunchy burger purchases also
include Chicken popcorn. The lift of 1.571 indicates a positive association between them.
4. {French Fries} => {Noodles}
Support = 0.3636
Lift = 1.15
Count = 20
Confidence = 0.588
The support indicates that about 36% of all transactions contain both French Fries and
Noodles, and the confidence indicates that about 59% of French Fries purchases also include
Noodles. The lift of 1.15 indicates a positive association between them.
Among the association rules obtained above, rules 1 and 2 have the highest support, 0.4181,
which indicates that about 41.8% of all transactions contain both Rice bowl and Smoky
grilled. The lift of 1.23 indicates a positive association between these items, and the
confidence is 0.74.
Rules 3 and 4, with a support of 0.381, a lift of 1.129, and a count of 21, form another
set of rules that can be adopted for business decisions. The remaining 8 rules have a
support of 0.363, with lift varying from 1.15 to 1.57, and also show positive associations.
Decision: Based on these rules, the pairing with a support of 0.41, Rice bowl with Smoky
grilled, can be considered for business decisions and for forming purchase strategies.
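The metrics of a rule are linked by simple arithmetic, which gives a quick consistency check on the reported numbers. The sketch below does this for {Rice bowl} => {Smoky grilled} using the values reported above. The total of 55 transactions is an assumption inferred from the reported counts and supports (for example 23 / 0.4181), not a figure stated in the report.

```r
# Reported values for {Rice bowl} => {Smoky grilled}
count_both <- 23      # transactions containing both items
conf       <- 0.741   # reported confidence
supp_rhs   <- 0.60    # support of {Smoky grilled} from the single-item rules

# Assumption: total number of transactions, inferred from count / support
n_transactions <- 55

supp_rule <- count_both / n_transactions   # support of the rule
lift_rule <- conf / supp_rhs               # lift = confidence / support(rhs)
supp_lhs  <- supp_rule / conf              # implied support of {Rice bowl}

round(c(support = supp_rule, lift = lift_rule, lhs_support = supp_lhs), 3)
```

The recomputed support (about 0.418) and lift (about 1.235) agree with the reported 0.4181 and 1.23, and the implied support of Rice bowl is about 0.56.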
In the scatter plot above, support is low, with almost all the rules at a support of about 0.36
or below, whereas confidence is high. The color of each point indicates the lift of that
rule.
Interpretation:
The association rules above are generated for the frequent 3-itemsets, with the lhs and rhs
shown together with the support, confidence, lift, and count of each rule. Rules 1, 2, and 3
have the highest support among all the rules, 0.30909. This means that 30.9% of all
transactions contain Chicken popcorn, Krunchy burger, and Smoky grilled together; the three
rules split this same itemset into different left-hand and right-hand sides.
The lift of each of these rules is greater than 1, indicating a positive association
among these items.
With the highest lift among these rules, 1.96, and a count of 17, the 3rd rule,
{Smoky grilled, Chicken popcorn} => {Krunchy burger}, is considered the best basis
for business decisions.
Interpretation:
The table above shows the association rules obtained for the frequent 4-itemsets, with their
support, confidence, and lift. The rules are sorted by support, highest first. The most
appropriate rule is selected on the basis of a high support, a lift above 1, and a high
confidence. From the table we can see that rules 1, 2, 3, and 4 have the highest support
among the 4-itemset rules.
Let's look at them in detail:
For rule 1, {Chicken popcorn, Krunchy burger, Zinger burger} => {Smoky grilled}, the
support is 0.25, which means that about 25% of all transactions contain all four products
together. A lift greater than 1 indicates a positive association among the combined products.
After support and lift, the confidence and the count of a rule should also be high for the
rule to be considered important.
Decision: In this case the support is the same for these 4 rules, but the 4th rule,
{Smoky grilled, Zinger burger, Chicken popcorn} => {Krunchy burger}, has a confidence of
0.93333 with a count of 14. The value 0.27272 reported with it is below 1 and matches the
rule's coverage (the share of transactions that contain its left-hand side) rather than its
lift. This rule can be considered for business decisions and for planning marketing
strategies accordingly.
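The figures quoted for the 4th rule can be cross-checked against each other. The sketch below recovers the number of transactions containing the left-hand side from the count and the confidence, then computes coverage and support. As before, the total of 55 transactions is an assumption inferred from the counts and supports in this report.

```r
# Reported values for {Smoky grilled, Zinger burger, Chicken popcorn} => {Krunchy burger}
count_rule <- 14        # transactions containing all four items
conf       <- 0.93333   # reported confidence

# Assumption: total number of transactions, inferred from the report's counts
n_transactions <- 55

# confidence = count_rule / count_lhs, so count_lhs = count_rule / confidence
count_lhs <- round(count_rule / conf)

coverage  <- count_lhs / n_transactions    # share of transactions containing the lhs
supp_rule <- count_rule / n_transactions   # support of the full rule

round(c(count_lhs = count_lhs, coverage = coverage, support = supp_rule), 5)
```

Under this assumption the coverage comes out at about 0.27273, which matches the 0.27272 quoted above, and the support is about 0.25.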
The strength of an association rule is assessed using support, confidence, and lift. These metrics help determine the significance and reliability of
the rules in revealing meaningful patterns and relationships in the data. Here's how you can
assess the strength of association rules:
Support:
Assess the support value of each rule, which indicates the frequency of occurrence of the
itemset in the dataset. Higher support values suggest that the rule is more significant and
relevant.
Confidence:
Evaluate the confidence level of each rule, which measures the reliability or strength of the
association between the antecedent and consequent itemsets.
Higher confidence values indicate stronger associations, suggesting a higher likelihood that
the presence of the antecedent item implies the presence of the consequent item.
Lift:
Examine the lift value of each rule, which compares the observed support of the combined
itemset to what would be expected if the items were independent.
Domain Knowledge:
Interpret the association rules in the context of domain knowledge, business goals, and
practical implications.
Consider the relevance and interpretability of the rules in relation to the specific industry,
market trends, customer preferences, and business strategies.
Interpreting the results obtained from a recommendation system for KFC products is crucial
for marketing purposes as it provides valuable insights into consumer behavior, preferences,
and opportunities for targeted promotions. Here's the interpretation of the results obtained
from the recommendation system and how they can be utilized for marketing purposes:
Cross-Selling Opportunities:
Seasonal Promotions:
Analyze association rules to identify seasonal trends and preferences, such as summer
favorites or holiday specials. Seasonal promotions and limited-time offers based on
association rules can capitalize on seasonal demand and increase customer engagement.
For example, promoting refreshing beverages and salads during the summer months.
Leverage association rules to identify opportunities for introducing and promoting new menu
items or limited-time offerings. By recommending new products to customers who have shown
interest in similar items, KFC can generate excitement and drive trial of new menu additions.
For example, promoting a new spicy chicken sandwich to customers who frequently order
spicy menu items.
Location-Based Recommendations:
Utilize location data to provide recommendations based on regional preferences and local
market trends. Tailoring recommendations to specific geographic locations allows KFC to
cater to local tastes and preferences, driving foot traffic and sales at individual restaurant
locations. For instance, promoting regional specialties or local favorites in targeted marketing
campaigns.
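To make the recommendation idea concrete, here is a minimal base R sketch of how mined rules could be turned into suggestions for a customer's current basket. The `recommend` function, the `;`-separated encoding of the left-hand side, and the data frame layout are hypothetical choices for illustration; the three rules and their lift values are taken from the results in this report.

```r
# Rules from this report; lhs items are separated by ";" (an assumed encoding)
rules <- data.frame(
  lhs  = c("Rice bowl", "Krunchy burger", "Smoky grilled;Chicken popcorn"),
  rhs  = c("Smoky grilled", "Chicken popcorn", "Krunchy burger"),
  lift = c(1.23, 1.571, 1.96),
  stringsAsFactors = FALSE
)

# Hypothetical helper: suggest items for a basket, strongest lift first
recommend <- function(basket, rules) {
  lhs_items <- strsplit(rules$lhs, ";", fixed = TRUE)
  # A rule applies when every item on its left-hand side is in the basket
  applies <- vapply(lhs_items, function(items) all(items %in% basket), logical(1))
  # Do not suggest items the customer already has
  candidates <- rules[applies & !(rules$rhs %in% basket), ]
  candidates <- candidates[order(-candidates$lift), ]
  unique(candidates$rhs)
}

recommend(c("Smoky grilled", "Chicken popcorn"), rules)
recommend(c("Rice bowl", "Krunchy burger"), rules)
```

Ranking by lift is one possible design choice; ranking by confidence would favor the suggestions most likely to be accepted instead.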
V. CODE USED
##################################################
# KFC purchase recommendation: association rules with the arules package
install.packages('arules')  # one-time install
library(arules)
## get rules
# Data.bin is the binary (0/1) transaction data prepared earlier.
# when running apriori(), include the minimum support, minimum confidence,
# and target as arguments.
##########################################
# frequent 1-itemsets
Data.freq1 <- apriori(Data.bin, parameter=list(minlen=1, maxlen=1, support=0.02,
target="frequent itemsets"))
summary(Data.freq1)
# show the 10 single items with the highest support
inspect(head(sort(Data.freq1, by = "support"), 10))
###################################
## frequent 2-itemsets (Rule Generation and Visualization)
#(we will keep support=0.2 and confidence =0.7)-Notice same results in excel
###################################
# frequent 3-itemsets
Data.freq3<- apriori(Data.bin, parameter=list(minlen=3, maxlen=3, support=0.25,
target="frequent itemsets"))
inspect(sort(Data.freq3, by ="support"))
####################################
# frequent 4-itemsets
Data.freq4 <- apriori(Data.bin, parameter=list(minlen=4, maxlen=4, support=0.02,
target="frequent itemsets"))
inspect(sort(Data.freq4, by ="support"))
# rules with exactly 4 items (lhs and rhs combined)
rules <- apriori(Data.bin, parameter=list(minlen=4,maxlen=4,support=0.23,
confidence=0.7, target = "rules"))
summary(rules)
inspect(rules)
# top 20 rules by support
inspect(head(sort(rules, by = "support"), n = 20))
################
#if we do not specify the minlen and maxlen, we get all the rules
# Notice that the size of each circle is based on the support of the rule, so the rule
# with the highest support, { } --> "French Fries" (0.618), has the biggest circle.