
Association (IML)

The document discusses association rule learning, an unsupervised machine learning method used to find relationships between variables in large datasets, particularly in market basket analysis. It outlines the steps involved in association rule generation, including data representation, support, confidence, and lift, along with their definitions and formulas. The application of association rule mining spans various fields such as retail, healthcare, and fraud detection, providing valuable insights for decision-making and optimization.

Government Polytechnic for Girls, Surat

Association
Algorithm of Unsupervised Machine Learning

Submitted By :
Fataniya Dhruvanshi N. (226150307053)
Hyalij Roshni J. (226150307072)
Jotaniya Arpita H. (226150307077)
Introduction of Association

• Definition: Association rule learning is an unsupervised
learning method used to find relationships between
variables in a large database.
• It determines the sets of items that occur together in the
dataset. Association rules make marketing strategy
more effective: for example, people who buy item X
(say, bread) also tend to purchase item Y
(butter or jam).
• Association rule learning is an important concept in
machine learning, and it is employed in market basket
analysis, web usage mining, continuous production, etc.
How does Association Rule Learning work?

Association rule learning works on the concept of if-then statements, such as "if A, then B".

Here the "if" element is called the antecedent, and the "then" element is called the consequent. A
relationship of this kind between two single items is said to have single cardinality. Rule learning is
all about creating such rules, and as the number of items increases, the cardinality increases
accordingly. So, to measure the associations between thousands of data items, several metrics are
used: support, confidence, and lift.
Steps of Association

1. Data representation
2. Support
3. Confidence
4. Lift
5. Association rule generation
Data representation

• Description: Prepare the data in a suitable format for mining association rules. The most common format is
transactional data, where each transaction contains a set of items (e.g., products bought together).
• Example:
1. Transaction 1: {Bread, Butter, Milk}
2. Transaction 2: {Bread, Milk}
3. Transaction 3: {Butter, Milk}
• The data is typically represented in a binary matrix where each row is a transaction, and each column is an
item. If an item is present in the transaction, it is marked with a 1; otherwise, it is marked with a 0.
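The binary matrix described above can be sketched in a few lines of Python. This is only an illustration using the three example transactions; the function name `to_binary_matrix` is a hypothetical helper, not part of any library.

```python
# The three example transactions, each stored as a set of items
transactions = [
    {"Bread", "Butter", "Milk"},
    {"Bread", "Milk"},
    {"Butter", "Milk"},
]

def to_binary_matrix(transactions):
    """Return (item names, matrix): one row per transaction, one column per item."""
    # Collect every distinct item and fix a column order
    items = sorted(set().union(*transactions))
    # 1 if the item is present in the transaction, otherwise 0
    matrix = [[1 if item in t else 0 for item in items] for t in transactions]
    return items, matrix

items, matrix = to_binary_matrix(transactions)
print(items)   # ['Bread', 'Butter', 'Milk']
for row in matrix:
    print(row)  # [1, 1, 1], then [1, 0, 1], then [0, 1, 1]
```

Each column can then be summed to see how many transactions contain that item.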
Support

• Description: Support measures how frequently an itemset appears in the dataset. It helps to filter out
less frequent itemsets.
• Formula: Support(X) = (Number of transactions containing X) / (Total number of transactions)
• Example: If "Bread" appears in 3 out of 5 transactions, the support for "Bread" is 3/5 = 0.6.
• Purpose: Itemsets with support below the minimum threshold are not considered for further analysis.
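As a minimal sketch, support can be computed by counting the transactions that contain every item of the itemset. The five transactions below are hypothetical data, chosen only so that "Bread" appears in 3 of 5 transactions as in the example, and the 0.5 threshold is likewise an assumption.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)

# Hypothetical data: "Bread" appears in 3 of the 5 transactions
transactions = [
    {"Bread", "Butter"},
    {"Bread", "Milk"},
    {"Bread", "Butter", "Milk"},
    {"Milk"},
    {"Eggs"},
]

print(support({"Bread"}, transactions))  # 0.6

# Keep only the single items that reach an assumed minimum support of 0.5
min_support = 0.5
all_items = sorted(set().union(*transactions))
frequent = [i for i in all_items if support({i}, transactions) >= min_support]
print(frequent)  # ['Bread', 'Milk']
```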
Confidence

• Description: Confidence measures the likelihood that the consequent (Y) occurs given
the antecedent (X). It evaluates how strong a rule is.
• Formula: Confidence(X → Y) = Support(X and Y together) / Support(X)
• Example: If "Bread" and "Butter" appear together in 2 out of the 3 transactions involving
"Bread", the confidence of the rule "Bread → Butter" is 2/3 ≈ 0.67.
• Purpose: Confidence indicates how reliable the rule X → Y is.
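A short sketch of the confidence calculation follows. The dataset is hypothetical, built so that "Bread" and "Butter" appear together in 2 of the 3 transactions containing "Bread". Because both supports are divided by the same total number of transactions, the sketch works directly with counts.

```python
def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def confidence(x, y, transactions):
    """Confidence(X -> Y) = Support(X and Y together) / Support(X).

    The total number of transactions cancels out, so raw counts are enough.
    Assumes X occurs in at least one transaction.
    """
    x, y = set(x), set(y)
    return count(x | y, transactions) / count(x, transactions)

# Hypothetical data: Bread occurs 3 times, Bread and Butter together 2 times
transactions = [
    {"Bread", "Butter"},
    {"Bread", "Milk"},
    {"Bread", "Butter", "Milk"},
    {"Milk"},
    {"Eggs"},
]

print(round(confidence({"Bread"}, {"Butter"}, transactions), 2))  # 0.67
print(confidence({"Butter"}, {"Bread"}, transactions))            # 1.0
```

The two results differ because confidence is directional: the rule "Butter → Bread" is measured against the transactions containing butter, not those containing bread.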
Lift
• Description: Lift measures the strength of an association rule by comparing the observed frequency of
the rule X → Y against its expected frequency if X and Y were independent.
• Formula: Lift(X → Y) = Support(X and Y together) / (Support(X) × Support(Y)) = Confidence(X → Y) / Support(Y)
• Interpretation:
Lift > 1: Items X and Y are more likely to occur together than by random chance (positive
association).
Lift = 1: X and Y are independent.
Lift < 1: X and Y are less likely to occur together (negative association).
• Example: A lift value of 1.5 for "Bread → Butter" means that customers who buy bread are 1.5 times
as likely to buy butter as customers in general.
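The interpretation above can be illustrated with a small sketch. It takes lift to be the support of X and Y together divided by the product of their individual supports. The six transactions are hypothetical, chosen so that "Bread → Butter" has a lift of 1.5, and `describe` is an illustrative helper.

```python
import math

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    itemset = set(itemset)
    return sum(1 for t in transactions if itemset <= t)

def lift(x, y, transactions):
    """Lift(X -> Y) = Support(X and Y) / (Support(X) * Support(Y)).

    Written with counts: (both / n) / ((cx / n) * (cy / n)) = both * n / (cx * cy).
    Assumes X and Y each occur in at least one transaction.
    """
    x, y = set(x), set(y)
    n = len(transactions)
    both = count(x | y, transactions)
    return both * n / (count(x, transactions) * count(y, transactions))

def describe(value):
    """Map a lift value to the interpretation used in the text."""
    if math.isclose(value, 1.0):
        return "independent"
    return "positive association" if value > 1.0 else "negative association"

# Hypothetical data: Bread in 4 transactions, Butter in 2, both together in 2
transactions = [
    {"Bread", "Butter"},
    {"Bread", "Butter"},
    {"Bread"},
    {"Bread", "Milk"},
    {"Milk"},
    {"Eggs"},
]

value = lift({"Bread"}, {"Butter"}, transactions)
print(value, describe(value))  # 1.5 positive association
value = lift({"Bread"}, {"Milk"}, transactions)
print(value, describe(value))  # 0.75 negative association
```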
Association rule generation
• Description: After calculating support, confidence, and lift, generate rules of the form X → Y
(e.g., "If X is bought, Y is likely to be bought").
• Steps:
1. Identify all frequent itemsets using support.
2. Generate rules from frequent itemsets by calculating confidence.
3. Validate rules based on confidence and lift.
4. Prune rules that don’t meet the minimum confidence or lift threshold.
• Example:
Rule: "Bread → Butter"
Confidence: 67%
Lift: 1.5 (positive association).
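The four steps can be sketched end to end as below. This is a brute-force enumeration limited to pairs of single items, not an optimized algorithm such as Apriori. The nine transactions and all three thresholds are hypothetical, chosen so that "Bread → Butter" comes out with a confidence of 67% and a lift of 1.5.

```python
from itertools import combinations

def count(itemset, transactions):
    """Number of transactions that contain every item in `itemset`."""
    return sum(1 for t in transactions if itemset <= t)

def generate_rules(transactions, min_support, min_confidence, min_lift):
    """Return (antecedent, consequent, confidence, lift) for single-item rules."""
    n = len(transactions)
    items = sorted(set().union(*transactions))
    rules = []
    for pair in combinations(items, 2):
        # Step 1: keep only frequent pairs
        both = count(set(pair), transactions)
        if both == 0 or both / n < min_support:
            continue
        # Step 2: a pair yields two candidate rules, one per direction
        for x, y in (pair, pair[::-1]):
            cx = count({x}, transactions)
            cy = count({y}, transactions)
            confidence = both / cx
            lift = both * n / (cx * cy)
            # Steps 3 and 4: validate, then prune rules below the thresholds
            if confidence >= min_confidence and lift >= min_lift:
                rules.append((x, y, round(confidence, 2), round(lift, 2)))
    return rules

# Hypothetical data: Bread in 3 transactions, Butter in 4, both together in 2
transactions = [
    {"Bread", "Butter"}, {"Bread", "Butter"}, {"Bread", "Milk"},
    {"Butter", "Milk"}, {"Butter"}, {"Milk"},
    {"Milk", "Eggs"}, {"Eggs"}, {"Milk"},
]

rules = generate_rules(transactions, min_support=0.2, min_confidence=0.6, min_lift=1.0)
print(rules)  # [('Bread', 'Butter', 0.67, 1.5)]
```

The reverse rule "Butter → Bread" has the same lift but a confidence of only 0.5, so it is pruned by the confidence threshold.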
Example of Association

Market Basket Analysis Example:


Problem: A grocery store wants to understand which products are
frequently bought together to optimize product placement and
promotions.
The store collects transactional data (items purchased together)
from customers. For example:
Transaction ID    Items purchased
1                 Milk, Bread, Butter
2                 Milk, Cereal, Bread, Butter
3                 Milk, Bread, Butter
4                 Sugar, Eggs
Steps of association example
Step 1: Identify Frequent Itemsets
The goal here is to find all itemsets that have support greater than a predefined threshold. Let's say our
minimum support is 50%.
Calculate Support for each itemset. Support refers to how frequently an itemset appears in the
transactions:
• Milk: 3 times (in Transactions 1, 2, and 3) → Support = 3/4 = 0.75
• Bread: 3 times → Support = 0.75
• Butter: 3 times → Support = 0.75
• Cereal: 1 time → Support = 0.25
• Sugar: 1 time → Support = 0.25
• Eggs: 1 time → Support = 0.25
Step 2: Determine itemset pairs and calculate support for each pair:
• Milk, Bread: 3 times → Support = 0.75
• Milk, Butter: 3 times → Support = 0.75
• Bread, Butter: 3 times → Support = 0.75
• Milk, Cereal: 1 time → Support = 0.25
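Steps 1 and 2 can be reproduced with a short counting sketch over the four transactions of the grocery example, using the same 50% minimum support. The variable names are illustrative.

```python
from collections import Counter
from itertools import combinations

# The four transactions from the grocery store example
transactions = [
    {"Milk", "Bread", "Butter"},
    {"Milk", "Cereal", "Bread", "Butter"},
    {"Milk", "Bread", "Butter"},
    {"Sugar", "Eggs"},
]
n = len(transactions)
min_support = 0.5

# Step 1: count how many transactions contain each single item
item_counts = Counter(item for t in transactions for item in t)

# Step 2: count each pair; sorting gives every pair one canonical order
pair_counts = Counter(
    pair for t in transactions for pair in combinations(sorted(t), 2)
)

# Convert counts to support and keep only the frequent itemsets
frequent_items = {i: c / n for i, c in item_counts.items() if c / n >= min_support}
frequent_pairs = {p: c / n for p, c in pair_counts.items() if c / n >= min_support}

print(frequent_items)  # Milk, Bread and Butter, each with support 0.75
print(frequent_pairs)  # the three pairs among Milk, Bread and Butter, each 0.75
```

Every itemset containing Cereal, Sugar, or Eggs has a support of 0.25 and is dropped by the threshold.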
Step 3: Calculate confidence and lift for the candidate rules
• Confidence is the ratio of how often the items in a rule appear together to how often the antecedent
appears by itself.
• For the rule "Milk → Bread":
Confidence = (Support of {Milk, Bread}) / (Support of {Milk}) = 0.75 / 0.75 = 1.0 (100%)
• Lift is calculated by comparing how often items appear together versus how often they would appear
together if they were independent:
Lift (Milk → Bread) = Confidence / Support of Bread = 1.0 / 0.75 ≈ 1.33
The final rules will include combinations of items with high confidence and lift, indicating strong
associations.
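The figures for "Milk → Bread" can be checked with exact fractions, which shows that the lift is exactly 4/3 (about 1.33). The sketch below applies the same calculation to every rule between the three frequent items; using `Fraction` is a choice made here for illustration.

```python
from fractions import Fraction
from itertools import permutations

# The four transactions from the grocery store example
transactions = [
    {"Milk", "Bread", "Butter"},
    {"Milk", "Cereal", "Bread", "Butter"},
    {"Milk", "Bread", "Butter"},
    {"Sugar", "Eggs"},
]

def support(itemset):
    """Exact support of `itemset` as a fraction of all transactions."""
    hits = sum(1 for t in transactions if itemset <= t)
    return Fraction(hits, len(transactions))

results = {}
# Every ordered pair of frequent items is one candidate rule X -> Y
for x, y in permutations(["Milk", "Bread", "Butter"], 2):
    confidence = support({x, y}) / support({x})
    lift = confidence / support({y})
    results[(x, y)] = (confidence, lift)
    print(f"{x} -> {y}: confidence={float(confidence):.2f}, lift={float(lift):.2f}")
```

All six rules come out with a confidence of 1.00 and a lift of 1.33, because Milk, Bread, and Butter always appear together in this small dataset.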
Features of Association

• Identifies Hidden Patterns: Finds relationships between variables that may not be immediately
apparent.
• Supports Decision-Making: Provides insights to support decision-making in various domains.
• Handles High-Dimensional Data: Can handle large datasets with many variables.
• Scalability: Can handle large datasets and is scalable to meet the needs of big data analytics.
• Flexibility: Can be used with different types of data, including categorical and transactional data,
and numerical data once it has been grouped into categories.
• Measures Rule Interestingness: Provides metrics such as support, confidence, and lift to measure the
interestingness and relevance of the discovered rules.
Applications
1. Market Basket Analysis: Identifies products frequently bought together in retail and e-commerce to
improve product placement and promotions.
2. Product Recommendations: Suggests related products based on past purchases, commonly used on
platforms like Amazon and Netflix.
3. Web Usage Mining: Analyzes user behavior on websites to improve layout, user experience, and
targeted ads.
4. Fraud Detection: Detects unusual patterns in transaction data to identify fraudulent activities in
banking and e-commerce.
5. Healthcare: Finds relationships between symptoms, diseases, and treatments for early diagnosis and
personalized care.
Conclusion

Association rule mining uncovers hidden patterns in data, offering valuable insights across industries. It helps
businesses make data-driven decisions, optimize operations, and enhance customer experiences. With
applications in retail, healthcare, fraud detection, and more, it’s a vital tool for extracting meaningful
relationships from large datasets.
Thank you
