Lecture Slides for
Machine Learning
CS 4552
Instructor: Dr. Olfa MZOUGHI
Reference book
◼ "Introduction to Machine Learning" by ,Ethem Alpaydin
https://www.cmpe.boun.edu.tr/~ethem/i2ml/
Outline
◼ Introduction
◼ Supervised learning
The K-Nearest-Neighbour (KNN) algorithm
The Decision Tree
Bayes classifier
Neural Networks??
◼ Unsupervised learning
K-means
◼ Evaluation metrics
◼ Overfitting
CHAPTER 1:
Introduction
Data is Everywhere!
◼ Data is cheap and abundant (data warehouses, data marts…)
◼ Knowledge is expensive and scarce.
[Figure: example data sources include cyber security, e-commerce, traffic patterns, and social networking (Twitter)]
Why “Learn” ?
◼ Machine learning is programming computers to
optimize a performance criterion using example
data or past experience.
When we talk about learning
◼ There is no need to “learn” to calculate payroll
◼ Learning is used when:
Human expertise does not exist (navigating on Mars, exploring new relationships between data)
Human expertise is difficult to extract (medical image processing)
You need to deal with a huge number of options and factors (e.g., media sites rely on machine learning to sift through millions of options to give you song or movie recommendations)
Machine learning: a new programming paradigm
Classical programming: Data + Rules → Answers
Machine learning is a type of artificial intelligence (AI) that provides computers with the ability to learn without being explicitly programmed.
Machine learning: Data + Answers → Rules
Machine learning Tasks
◼ Supervised Learning
Classification
Regression
◼ Unsupervised Learning
Clustering
Association
◼ Reinforcement Learning
Classification (1)
◼ Example: Credit scoring
◼ Differentiating between low-risk and high-risk customers from their income and savings
Discriminant: IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
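A minimal sketch of this kind of discriminant in Python (the threshold values below are made up for illustration; in practice they are learned from labelled data):

# Hypothetical thresholds, chosen only for illustration
THETA_1 = 30_000  # income threshold (θ1)
THETA_2 = 10_000  # savings threshold (θ2)

def credit_risk(income, savings):
    # IF income > θ1 AND savings > θ2 THEN low-risk ELSE high-risk
    return "low-risk" if income > THETA_1 and savings > THETA_2 else "high-risk"

print(credit_risk(45_000, 15_000))  # low-risk
print(credit_risk(45_000, 5_000))   # high-risk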
Classification (2)
Classification (3)
◼ Find a model for the class attribute as a function of the values of the other attributes
Training set:
Tid | Employed | Level of Education | # years at present address | Credit Worthy
1   | Yes      | Graduate           | 5                          | Yes
2   | Yes      | High School        | 2                          | No
3   | No       | Undergrad          | 1                          | No
4   | Yes      | High School        | 10                         | Yes
…   | …        | …                  | …                          | …

Test set (Credit Worthy unknown):
Tid | Employed | Level of Education | # years at present address | Credit Worthy
1   | Yes      | Undergrad          | 7                          | ?
2   | No       | Graduate           | 3                          | ?
3   | Yes      | High School        | 2                          | ?
…   | …        | …                  | …                          | …

Training set → Learn model → Classifier → applied to the test set
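A minimal sketch of this train-then-predict workflow, assuming scikit-learn and a simple hand-made numeric encoding of the categorical attributes (the choice of a decision tree here is illustrative; the algorithm itself is covered later in the course):

from sklearn.tree import DecisionTreeClassifier

# Hand-rolled numeric encoding of the categorical attributes (an illustrative choice)
EMPLOYED = {"No": 0, "Yes": 1}
EDUCATION = {"High School": 0, "Undergrad": 1, "Graduate": 2}

def encode(employed, education, years):
    return [EMPLOYED[employed], EDUCATION[education], years]

# Training set: attributes with a known class attribute (Credit Worthy)
X_train = [encode("Yes", "Graduate", 5), encode("Yes", "High School", 2),
           encode("No", "Undergrad", 1), encode("Yes", "High School", 10)]
y_train = ["Yes", "No", "No", "Yes"]

# Test set: Credit Worthy is unknown ("?") and must be predicted
X_test = [encode("Yes", "Undergrad", 7), encode("No", "Graduate", 3),
          encode("Yes", "High School", 2)]

model = DecisionTreeClassifier().fit(X_train, y_train)  # learn the model from the training set
print(model.predict(X_test))                            # apply the learned classifier to the test set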
Classification: Applications
◼ Also known as pattern recognition
◼ Face recognition: Pose, lighting, occlusion (glasses,
beard), make-up, hair style
◼ Character recognition: Different handwriting styles.
◼ Speech recognition: Temporal dependency.
Use of a dictionary or the syntax of the language.
Sensor fusion: Combine multiple modalities; e.g., visual (lip image) and acoustic for speech
◼ Medical diagnosis: From symptoms to illnesses
◼ ...
Face Recognition
[Figure: training examples of a person and test images. Source: AT&T Laboratories, Cambridge UK, http://www.uk.research.att.com/facedatabase.html]
Regression Applications
◼ Predict the value of a given continuous-valued variable based on the values of other variables, assuming a linear or nonlinear model of dependency.
Regression (Example of linear dependency)
◼ Example: Price of a used car
x: car attributes
y: price
◼ Linear model: y = wx + w0
◼ How do we estimate the parameters wi (here, w and w0)?
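A minimal sketch of fitting such a linear model by least squares, assuming NumPy; the mileage/price numbers are made up for illustration:

import numpy as np

# Made-up data: x = mileage of a used car (in 1000 km), y = price (in 1000 $)
x = np.array([20, 40, 60, 80, 100, 120], dtype=float)
y = np.array([18.0, 15.5, 13.0, 11.0, 8.5, 6.0])

# Least-squares fit of the linear model y = w*x + w0
w, w0 = np.polyfit(x, y, deg=1)
print(f"w = {w:.3f}, w0 = {w0:.3f}")

# Predict the price of a car with 70 000 km
print(f"predicted price: {w * 70 + w0:.2f}")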
Regression
◼ Examples:
Predicting sales amounts of a new product based on advertising expenditure.
Predicting wind velocities as a function of
temperature, humidity, air pressure, etc.
Supervised Learning: Uses
◼ Prediction of future cases: Use the rule to predict
the output for future inputs
◼ Knowledge extraction: The rule is easy to
understand
◼ Compression: The rule is simpler than the data it
explains
◼ Outlier detection: Exceptions that are not covered
by the rule, e.g., fraud
Unsupervised Learning
◼ No need to supervise the model.
◼ The model works on its own to discover patterns and information that were previously undetected.
◼ It mainly deals with unlabelled data.
Clustering
◼ Finding groups of objects such that the objects in a
group will be similar (or related) to one another and
different from (or unrelated to) the objects in other
groups
Intra-cluster distances are minimized; inter-cluster distances are maximized.
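A minimal sketch of this idea using k-means (covered in a later chapter), assuming scikit-learn and made-up 2-D points:

import numpy as np
from sklearn.cluster import KMeans

# Two made-up groups of 2-D points
points = np.array([[1.0, 1.2], [0.8, 1.0], [1.1, 0.9],   # points around (1, 1)
                   [5.0, 5.1], [5.2, 4.9], [4.8, 5.0]])  # points around (5, 5)

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print(kmeans.labels_)           # cluster assigned to each point
print(kmeans.cluster_centers_)  # one centre per cluster
print(kmeans.inertia_)          # sum of squared intra-cluster distances (what k-means minimizes)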
Clustering examples
◼ Understanding
Customer segmentation for targeted marketing
◼ Summarization
Image compression: Color quantization
◼ Anomaly Detection
Detecting changes in the global forest cover.
Learning Associations
◼ Basket analysis:
P(Y | X): the probability that somebody who buys X also buys Y, where X and Y are products/services.
Example: P(chips | milk) = 0.7 (see the estimation sketch after this slide)
◼ Example of applications:
Market-basket analysis: Rules could be used for sales
promotion, shelf management, and inventory
management
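A minimal sketch of estimating such a conditional probability from a toy list of baskets (the transactions below are made up and do not reproduce the 0.7 from the slide):

# Made-up transactions: each set is one customer's basket
baskets = [{"milk", "chips"}, {"milk", "bread"}, {"milk", "chips", "eggs"},
           {"bread", "eggs"}, {"milk", "chips"}, {"milk", "eggs"}]

def conditional_probability(x, y, baskets):
    # Estimate P(y | x) = (# baskets containing both x and y) / (# baskets containing x)
    with_x = [b for b in baskets if x in b]
    return sum(y in b for b in with_x) / len(with_x)

print(conditional_probability("milk", "chips", baskets))  # fraction of milk buyers who also buy chips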
Reinforcement Learning
◼ Learning a policy: A sequence of outputs
◼ No supervised output but delayed reward
◼ An agent interacting with the world makes observations, takes actions, and is rewarded or punished; it should learn to choose actions so as to maximize the reward it obtains.
◼ Examples:
Game playing
Robot in a maze
...
Learning
◼ Supervised learning: classification, regression
◼ Unsupervised learning: clustering, association
◼ Reinforcement learning
Classification vs. clustering
            | Classification                                            | Clustering
Similarity  | Both classify data using features X
Differences | Supervised learning                                       | Unsupervised learning
            | Classes Y are known                                       | Classes Y are unknown
            | Train a classifier model f from training examples (X, Y) | Group data based on intra-class similarity (or inter-class diversity) within the feature space X
Classification vs. regression
            | Classification               | Regression
Similarity  | Both are supervised learning tasks that use training examples (X, Y), where X represents the input features and Y the output
Differences | Y is a class label (a name)  | Y is a continuous value
Summary
Resources: Datasets
◼ UCI Repository:
http://www.ics.uci.edu/~mlearn/MLRepository.html
◼ UCI KDD Archive:
http://kdd.ics.uci.edu/summary.data.application.html
◼ Statlib: http://lib.stat.cmu.edu/
◼ Delve: http://www.cs.utoronto.ca/~delve/