Machine Learning Study Guide
Table of Contents
1. Machine Learning vs AI vs Deep Learning
2. Machine Learning: Supervised vs Unsupervised
3. Supervised Learning: Classification vs Regression
4. What is Unsupervised Learning
5. Linear Regression
6. Gradient Descent
7. Polynomial Regression
8. Logistic Regression
Machine Learning vs AI vs Deep Learning
       Definition
       Artificial Intelligence (AI) is the broad field of creating machines that can perform
       tasks requiring human-like intelligence. Machine Learning (ML) is a subset of AI
       focused on algorithms that learn patterns from data. Deep Learning (DL) is a subset
       of ML that uses neural networks with many layers.
       Intuition
       Think of AI as the entire universe of intelligent machines. ML is one planet where
       machines learn from data. Deep Learning is a specific country on that planet, where
       neural networks live.
       Mathematical / Technical Details
       Hierarchy:
       AI ⊇ ML ⊇ DL
       AI includes rule-based systems and expert systems.
       ML uses algorithms like regression and decision trees.
       DL uses neural networks such as CNNs and RNNs.
       Example
       Example: AI: A chess-playing program. ML: A spam filter that learns from emails. DL:
       A voice assistant like Siri using deep neural networks for speech recognition.
      Summary / Key Takeaways
      AI is the broadest, ML is a subset using data, DL is a further subset using neural
      networks.
Machine Learning: Supervised vs Unsupervised
Definition
      Supervised learning uses labeled data (inputs with correct outputs). Unsupervised
      learning uses unlabeled data to find hidden patterns.
Intuition
      Supervised: Like a teacher giving students questions with answers.
      Unsupervised: Like exploring a library without guidance, trying to organize books
      yourself.
More Details
      Supervised: Algorithms include Linear Regression, Decision Trees, SVM.
      Unsupervised: Algorithms include K-means clustering, PCA.
Example
   Supervised Learning Example – House Price Prediction
      •      Problem: You want to predict the price of a house.
      •      Data: You already have a dataset of houses where each record includes features
             (size in square meters, number of rooms, location) and the actual selling price.
      •      Process:
                1. Feed the model both the features (X) and labels (y = price).
                2. The algorithm (e.g., Linear Regression) learns the relationship between
                   features and price.
                3. After training, when you give the model a new house (say, 120 m², 3
                   rooms, city center), it can predict its price (e.g., $150,000).
      •      Key Point: Supervised = data with labels (answers are known during training).
   Unsupervised Learning Example – Customer Segmentation
      •      Problem: A retail company wants to understand its customer base to improve
             marketing.
      •      Data: You only have customer data like age, income, shopping frequency, but no
             labels (no predefined “customer type”).
       •     Process:
                 1. Use clustering algorithms (e.g., K-means).
                 2. The algorithm groups customers based on similarity. For example:
                        ▪   Group A: Young, low income, frequent buyers.
                        ▪   Group B: Middle-aged, high income, luxury buyers.
                        ▪   Group C: Older, moderate income, occasional buyers.
                 3. The business can now design different marketing campaigns for each
                    group.
       •     Key Point: Unsupervised = data without labels (the model discovers structure
             by itself).
Summary / Key Takeaways
Supervised = labeled data, predicts outputs. Unsupervised = unlabeled data, discovers
structure.
Supervised Learning: Classification vs Regression
Definition
       Classification assigns inputs into discrete categories.
       Regression predicts continuous values.
Intuition
       Classification is like sorting fruits into 'apple' or 'orange'. Regression is like
       predicting the weight of a fruit.
More Details
       Classification: Algorithms like Logistic Regression, Decision Trees.
       Regression: Algorithms like Linear Regression, Polynomial Regression.
Example
       Example: Classification vs Regression
       Classification Example – Sorting Fruits
             •   Problem: You want to automatically classify fruits as apple or orange.
             •   Data: Each fruit has features such as color, weight, and diameter.
             •   Process:
                 1. Training data:
                          ▪   Apple → (Red, 150g, 7cm)
                          ▪   Orange → (Orange, 200g, 8cm)
                          ▪   Apple → (Red, 170g, 7.5cm)
                          ▪   Orange → (Orange, 210g, 8.2cm)
                 2. The model (e.g., Logistic Regression or Decision Tree) learns
                    decision boundaries.
                 3. New fruit: (Red, 160g, 7.2cm). The model predicts → Apple.
          •   Output: A discrete label → “Apple” or “Orange”.
          •   Key Point: Classification = predicting categories (discrete classes).
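As a minimal stand-in for the logistic regression or decision tree named above, a 1-nearest-neighbour rule over the numeric features (weight, diameter) already reproduces the prediction in step 3; colour is omitted here to keep the features numeric.

```python
# Labeled training fruits: ((weight in g, diameter in cm), label).
training = [
    ((150, 7.0), "Apple"),
    ((200, 8.0), "Orange"),
    ((170, 7.5), "Apple"),
    ((210, 8.2), "Orange"),
]

def classify(features):
    """Return the label of the closest training example (squared distance)."""
    w, d = features
    _, label = min(training,
                   key=lambda t: (t[0][0] - w) ** 2 + (t[0][1] - d) ** 2)
    return label

print(classify((160, 7.2)))  # Apple
```

The output is a discrete label, never a number in between, which is what distinguishes classification from regression.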
     Regression Example – Predicting Fruit Weight
          •   Problem: You want to predict the weight of a fruit based on its diameter.
          •   Data:
                 o    Diameter (cm): [5, 6, 7, 8, 9]
                  o    Weight (g): [120, 140, 160, 180, 200]
          •   Process:
     1.       Train a Linear Regression model.
     2.       The model finds the relationship:
                      Weight = 20 + 20 × Diameter
     3.       For a fruit with diameter = 7.5 cm → Predicted Weight = 20 + 20 × 7.5 = 170 g.
          •   Output: A continuous value (170 g).
          •   Key Point: Regression = predicting numbers (continuous values).
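The closed-form least-squares fit behind step 2 can be written out directly: the slope is cov(x, y) / var(x) and the intercept follows from the means. The weights below are chosen to lie exactly on the line Weight = 20 + 20 × Diameter, so the fit recovers it cleanly.

```python
# Diameter (cm) and weight (g); the weights lie exactly on
# Weight = 20 + 20 * Diameter, so least squares recovers that line.
diameters = [5, 6, 7, 8, 9]
weights = [120, 140, 160, 180, 200]

n = len(diameters)
mean_x = sum(diameters) / n
mean_y = sum(weights) / n

# Ordinary least squares for one feature: slope = cov(x, y) / var(x).
slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(diameters, weights)) \
        / sum((x - mean_x) ** 2 for x in diameters)
intercept = mean_y - slope * mean_x

print(intercept, slope)           # 20.0 20.0
print(intercept + slope * 7.5)    # 170.0 g for a 7.5 cm fruit
```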
Summary / Key Takeaways
     Classification = discrete output, Regression = continuous output.
What is Unsupervised Learning
Definition
Unsupervised learning analyzes data without labels to find hidden structure or patterns.
Intuition
Like being given a box of mixed LEGO bricks with no instructions and trying to group them
by shape or color.
Mathematical / Technical Details
Techniques: Clustering (K-means, Hierarchical), Dimensionality Reduction (PCA, t-SNE).
Example
Example: Clustering – Grouping customers by purchasing behavior into segments (A, B, C).
Summary / Key Takeaways
Unsupervised learning = discovering hidden patterns in unlabeled data.
Linear Regression
Definition
A supervised algorithm that models the relationship between input (x) and output (y) using
a straight line.
Intuition
Think of drawing the best straight line that predicts exam scores from study hours.
Mathematical / Technical Details
Equation: y = β0 + β1x + ε
Example
Example: Linear Regression – Predicting Exam Scores
Problem
We want to predict a student’s exam score based on how many hours they studied.
Data (training set)
   •   Student A: 1 hour → 40 marks
   •   Student B: 2 hours → 50 marks
   •   Student C: 3 hours → 60 marks
    •    Student D: 4 hours → 70 marks
    •    Student E: 5 hours → 80 marks
Process
    1. Plot the points on a graph (Hours on x-axis, Scores on y-axis).
    2. The goal is to find the best straight line that goes through the cloud of points.
    3. Linear Regression calculates this line using the formula:
                                        y = β0 + β1x
where:
            o   y = predicted score
            o   x = study hours
            o   β0 = intercept (score when study hours = 0)
            o   β1 = slope (how much the score increases per extra study hour)
Result
From the data, the best line is:
                                        Score = 30 + 10 × Hours
Predictions
    •    If a student studies 6 hours → Predicted score = 30 + 10×6 = 90 marks.
    •    If a student studies 0 hours → Predicted score = 30 (the baseline).
Key Point
Linear Regression = finding the best straight line to predict a continuous outcome. In this
case, scores increase by 10 marks for each extra study hour.
Summary / Key Takeaways
Linear regression predicts continuous outcomes with a straight line.
Gradient Descent
Definition
An optimization algorithm to minimize cost functions by updating parameters step by step.
Intuition
Imagine going down a hill blindfolded: you feel the slope and take steps downhill until
reaching the bottom.
Mathematical / Technical Details
        Update rule:
                θ = θ - α * ∇J(θ),
                        where α is the learning rate and ∇J(θ) is the gradient of the
                        cost function J with respect to the parameters θ.
Example
Example: Minimizing error in linear regression coefficients using gradient descent steps.
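A hand-rolled version of the update rule makes the iteration concrete. The data below is the exam-score set from the Linear Regression section (which lies exactly on Score = 30 + 10 × Hours); the learning rate and iteration count are illustrative choices, not tuned values.

```python
# Gradient descent for y = b0 + b1*x, minimising mean squared error
# J(b0, b1) = (1/n) * sum((b0 + b1*x - y)^2).
hours = [1, 2, 3, 4, 5]
scores = [40, 50, 60, 70, 80]

b0, b1 = 0.0, 0.0   # initial parameters theta
alpha = 0.05        # learning rate alpha
n = len(hours)

for _ in range(10_000):
    # Gradient of J with respect to each parameter.
    errors = [b0 + b1 * x - y for x, y in zip(hours, scores)]
    grad_b0 = (2 / n) * sum(errors)
    grad_b1 = (2 / n) * sum(e * x for e, x in zip(errors, hours))
    # Update rule: theta = theta - alpha * grad J(theta).
    b0 -= alpha * grad_b0
    b1 -= alpha * grad_b1

print(round(b0, 2), round(b1, 2))  # 30.0 10.0
```

The loop slowly walks downhill on the error surface until it reaches the same line a closed-form fit would give.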
Summary / Key Takeaways
Gradient descent finds optimal parameters by iterative updates.
Polynomial Regression
Definition
A regression algorithm where the relationship between x and y is modeled as an nth degree
polynomial.
Intuition
Instead of a straight line, you draw a curve that fits data points better.
Mathematical / Technical Details
Equation: y = β0 + β1x + β2x² + ... + βnx^n + ε
Example
Example: Hours studied vs. performance may rise quickly and then flatten or dip, requiring quadratic terms.
Example: Polynomial Regression – Hours Studied vs Performance
Problem
Sometimes performance does not increase in a straight line with study hours. Maybe at first,
more hours give a big boost, but after a certain point, performance improves slowly (or
even drops due to fatigue).
Data Table
                            Hours Studied (x) Exam Score (y)
                            1                    40
                            2                    55
                            3                    65
                            4                    68
                            5                    70
                            6                    69
                            7                    67
Observation
   •   From 1 → 3 hours, the score increases quickly.
   •   From 4 → 5 hours, the improvement slows down.
   •   After 5 hours, performance actually drops slightly (maybe due to overstudying or
       tiredness).
Straight Line (Linear Regression)
   •   A simple linear regression would try to fit a single straight line, but it cannot
       capture the curve (rise then fall).
Polynomial Regression
   •   If we use a quadratic model (degree 2 polynomial), each row gains a
       squared feature x² alongside x and the constant (intercept) term:

       Hours Studied (x)   Hours Studied² (x²)   Constant   Exam Score (y)
       3                   9                     1          65
       4                   16                    1          68
       5                   25                    1          70
       6                   36                    1          69
       7                   49                    1          67
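Fitting the quadratic y = b0 + b1·x + b2·x² by least squares on the full data table above can be done by solving the 3×3 normal equations directly; the small Gaussian elimination below is a sketch of that. The fit opens downward (b2 < 0) with its peak near 5 hours, matching the observation.

```python
# Quadratic least squares: solve (X^T X) beta = X^T y for columns [1, x, x^2].
hours = [1, 2, 3, 4, 5, 6, 7]
scores = [40, 55, 65, 68, 70, 69, 67]

rows = [(1.0, x, x * x) for x in hours]                      # design matrix X
A = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
rhs = [sum(r[i] * y for r, y in zip(rows, scores)) for i in range(3)]

# Gaussian elimination (no pivoting needed: A is positive definite here).
for i in range(3):
    for k in range(i + 1, 3):
        f = A[k][i] / A[i][i]
        A[k] = [a - f * p for a, p in zip(A[k], A[i])]
        rhs[k] -= f * rhs[i]

# Back-substitution.
beta = [0.0, 0.0, 0.0]
for i in (2, 1, 0):
    beta[i] = (rhs[i] - sum(A[i][j] * beta[j] for j in range(i + 1, 3))) / A[i][i]

b0, b1, b2 = beta
print(round(b0, 2), round(b1, 2), round(b2, 2))   # 25.43 17.6 -1.69
print("peak near", round(-b1 / (2 * b2), 1), "hours")  # peak near 5.2 hours
```

The negative x² coefficient is what lets the curve rise and then fall, which no single straight line can do.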
Summary / Key Takeaways
Polynomial regression extends linear regression by fitting curves.
Logistic Regression
Definition
A classification algorithm that predicts probabilities using the logistic (sigmoid) function.
Intuition
Like predicting whether a student passes or fails based on study hours (yes/no outcome).
Mathematical / Technical Details
Equation: P(y=1|x) = 1 / (1 + e^-(β0 + β1x))
Example
Example: Predict probability of passing an exam: if Hours = 5 → P(pass)=0.9, Hours=1 →
P(pass)=0.2.
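Plugging coefficients into the sigmoid reproduces those probabilities. The values β0 ≈ -2.28 and β1 ≈ 0.896 below are reverse-engineered so that 5 hours gives P ≈ 0.9 and 1 hour gives P ≈ 0.2; they are illustrative, not fitted from real data.

```python
import math

# Illustrative coefficients chosen so that P(pass | 5 h) is about 0.9 and
# P(pass | 1 h) is about 0.2; not fitted from real data.
B0, B1 = -2.28, 0.896

def p_pass(hours):
    """Sigmoid: P(y=1 | x) = 1 / (1 + e^-(B0 + B1*x))."""
    return 1 / (1 + math.exp(-(B0 + B1 * hours)))

print(round(p_pass(5), 2))  # 0.9
print(round(p_pass(1), 2))  # 0.2
```

The output is always between 0 and 1, so it can be read as a probability and thresholded (e.g. at 0.5) to produce the final pass/fail class.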
Summary / Key Takeaways
Logistic regression is used for binary classification, outputting probabilities.