
Data

COMP5318/COMP4318 Machine Learning and Data Mining


Semester 1, 2023, Week 1b
Irena Koprinska

Reference: Tan ch. 2

Outline
• Nominal and numeric attributes
• Data cleaning
  • Noise
  • Missing values
• Data preprocessing
  • Data aggregation
  • Feature extraction
  • Feature subset selection
  • Converting features from one type to another
  • Normalization of feature values
• Similarity measures
  • Euclidean, Manhattan, Minkowski
  • Hamming, SMC, Jaccard coefficient
  • Cosine similarity
  • Correlation
Data

• Data is a collection of examples (also called instances, records, observations, objects)
• Examples are described with attributes (features, variables); one of the attributes may be the class

Example dataset - rows are examples, columns are attributes (Refund, Marital Status, Taxable Income) and the class (Cheat):

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes
Nominal and numeric attributes

• Two types of attributes:
  • nominal (categorical) - their values belong to a pre-specified, finite set of possibilities
  • numeric (continuous) - their values are numbers

Example of nominal attributes (weather data):

outlook   temp. humidity windy play
sunny     hot   high     false no
sunny     hot   high     true  no
overcast  hot   high     false yes
rainy     mild  high     false yes
rainy     cool  normal   false yes
rainy     cool  normal   true  no
overcast  cool  normal   true  yes
sunny     mild  high     false no
sunny     cool  normal   false yes
rainy     mild  normal   false yes
sunny     mild  normal   true  yes
overcast  mild  high     true  yes
overcast  hot   normal   false yes
rainy     mild  high     true  no

Example of numeric attributes (iris data; the first four attributes are numeric):

sepal length  sepal width  petal length  petal width  iris type
5.1           3.5          1.4           0.2          iris setosa
4.9           3.0          1.4           0.2          iris setosa
4.7           3.2          1.3           0.2          iris setosa
...
6.4           3.2          4.5           1.5          iris versicolor
6.9           3.1          4.9           1.5          iris versicolor
5.5           2.3          4.0           1.3          iris versicolor
6.5           2.8          4.6           1.5          iris versicolor
6.3           3.3          6.0           2.5          iris virginica
5.8           2.7          5.1           1.9          iris virginica
...
Types of data

• Data matrix (record data), e.g.:

Tid  Refund  Marital Status  Taxable Income  Cheat
1    Yes     Single          125K            No
2    No      Married         100K            No
3    No      Single          70K             No
4    Yes     Married         120K            No
5    No      Divorced        95K             Yes
6    No      Married         60K             No
7    Yes     Divorced        220K            No
8    No      Single          85K             Yes
9    No      Married         75K             No
10   No      Single          90K             Yes

• Transaction data, e.g.:

TID  Items
1    Bread, Coke, Milk
2    Beer, Bread
3    Beer, Coke, Diaper, Milk
4    Beer, Bread, Diaper, Milk
5    Coke, Diaper, Milk

• Sequential data, e.g. a genetic sequence:

GGTTCCGCCTTCAGCCCCGCGCC
CGCAGGGCCCGCCCCGCGCCGTC
GAGAAGGGCCCGCCTGGCGGGCG
GGGGGAGGCGGGGCCGCCCGAGC
CCAACCGAGTCCGACCAGGTGCC
CCCTCTGCTCGGCCTAGACCTGA
GCTCATTAGGCGGCAGCGGACAG
GCCAAGTAGAACACGCGAAGCGC
TGGGCTGCCTGCTGCGACCAGGG

• Graph data (e.g. molecular structure)
• Spatio-temporal data (e.g. average monthly temperature)
Data cleaning

Data cleaning

• Data is not perfect
• Noise, due to:
  • distortion of values
  • addition of spurious examples
  • inconsistent and duplicate data
• Missing values

Noise

• Causes: human errors when entering data, limitations of measuring instruments, flaws in the data collection process

• 1) Noise – distortion of values
  • Ex: distortion of human voice when talking on a poor phone line
  • Higher distortion => the shape of the signal may be lost

[Figure: a clean voice signal and the same signal with added noise]

Noise (2)

• 2) Noise – addition of spurious examples
  • Some are far from the other examples (they are outliers), some are mixed with the non-noisy data

Noise (3)

• 3) Noise – inconsistent and duplicate data
  • E.g. negative weight and height values, non-existing zip codes, 2 records for the same person – these need to be detected and corrected
  • Typically easier to detect and correct than the other two types of noise

• Reducing noise of types 1) and 2):
  • Use signal and image processing and outlier detection techniques before DM
  • Use ML algorithms that are more robust to noise – give acceptable results in the presence of noise

Dealing with missing values

• Various methods, e.g.:

1) Ignore all examples with missing values
  • Can be done if the percentage of missing values is small

2) Estimate the missing values by using the remaining values
  • Nominal attributes - replace the missing value of attribute A with:
    • the most common value for A, or
    • the most common value among the examples with the same class (if supervised learning)
  • Numeric attributes - replace with the average value of the nearest neighbors (the most similar examples)

Example: weather data with missing values (marked with ?):

outlook   temp. humidity windy play
sunny     hot   high     false no
sunny     hot   high     true  no
overcast  hot   high     false yes
rainy     mild  high     false yes
?         cool  normal   false yes
rainy     cool  normal   true  no
overcast  cool  normal   true  yes
sunny     mild  high     false no
sunny     ?     normal   false yes
rainy     mild  normal   false yes
sunny     mild  normal   true  yes
overcast  ?     high     true  yes
overcast  hot   normal   false yes
rainy     mild  high     true  no
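A minimal sketch of both strategies in Python, using pandas and scikit-learn; the tiny DataFrames are made up for illustration, and SimpleImputer/KNNImputer are just one possible implementation of each idea:

```python
import pandas as pd
from sklearn.impute import SimpleImputer, KNNImputer

# Nominal attribute: fill missing values with the most common value (mode)
weather = pd.DataFrame({
    "outlook": ["sunny", None, "overcast", "rainy"],
    "play":    ["no", "yes", "yes", "yes"],
})
mode_imputer = SimpleImputer(strategy="most_frequent")
weather[["outlook"]] = mode_imputer.fit_transform(weather[["outlook"]])

# Numeric attribute: fill missing values with the average of the nearest neighbours
iris_like = pd.DataFrame({
    "sepal_length": [5.1, 4.9, None, 6.4],
    "sepal_width":  [3.5, 3.0, 3.2, 3.2],
})
knn_imputer = KNNImputer(n_neighbors=2)
iris_like[:] = knn_imputer.fit_transform(iris_like)
```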
Data preprocessing

Data preprocessing

• Data aggregation
• Dimensionality reduction
  • Feature extraction
  • Feature subset selection
• Converting attributes from one type to another
• Normalization

Data aggregation

• Combining two or more attributes into one – purpose:
  • Data reduction - less memory and computation time; may allow the use of computationally more expensive ML algorithms
  • Change of scale - provides a high-level view
    • E.g. cities aggregated into states or countries
  • More stable data - aggregated data is less variable than non-aggregated data
    • E.g. consumed daily food (food_day1, food_day2, etc.) aggregated into weekly food to get a more reliable understanding of the diet (carbohydrates, fat, protein, etc.)
• Disadvantage – potential loss of interesting details
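As a small illustration of the daily-to-weekly idea, a minimal pandas sketch (the column names and values are made up):

```python
import pandas as pd

# Daily carbohydrate intake per person (hypothetical values for illustration)
daily = pd.DataFrame({
    "person": ["A"] * 7 + ["B"] * 7,
    "day":    list(range(1, 8)) * 2,
    "carbohydrates_g": [250, 310, 180, 220, 270, 400, 350,
                        150, 160, 170, 140, 155, 180, 165],
})

# Aggregate: 7 daily values -> 1 weekly total per person (change of scale, more stable)
weekly = daily.groupby("person")["carbohydrates_g"].sum().reset_index()
print(weekly)
```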

Feature extraction

• Feature extraction is the creation of features from raw data – a very important task
• Requires domain expertise
• Ex: classifying images into outdoors or indoors
  • raw data: color value for each pixel
  • extracted features: color histogram, dominant color, edge histogram, etc.
• May require mapping the data to a new space and then extracting features
  • The new space may reveal important characteristics

[Figure: a signal made of 2 sine waves + noise; after a Fourier transform, its power spectrum shows 2 peaks corresponding to the periods of the sine waves]
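A minimal sketch of this idea in Python/NumPy; the sampling rate, frequencies and noise level are made up for illustration:

```python
import numpy as np

# Raw data: a signal made of 2 sine waves plus noise
rng = np.random.default_rng(0)
t = np.linspace(0, 1, 1000, endpoint=False)          # 1 second sampled at 1000 Hz
signal = (np.sin(2 * np.pi * 7 * t) + np.sin(2 * np.pi * 40 * t)
          + 0.5 * rng.standard_normal(t.size))

# Map the data to a new space: power spectrum via the Fourier transform
power = np.abs(np.fft.rfft(signal)) ** 2
freqs = np.fft.rfftfreq(t.size, d=t[1] - t[0])

# Extracted features: the 2 dominant frequencies (peaks of the power spectrum)
top2 = freqs[np.argsort(power)[-2:]]
print(sorted(top2))   # ~ [7.0, 40.0]
```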


Feature subset selection

• The process of removing irrelevant and redundant features and selecting a small set of features that are necessary and sufficient for good classification
• Very important for successful classification
• Good feature selection typically improves accuracy
• Using fewer features also means:
  • Faster building of the classifiers, i.e. reduced computational cost
  • Often a more compact and easier to interpret classification rule

Useful references:
Kohavi, R., John, G.H.: Wrappers for feature subset selection. Artificial Intelligence, vol. 97, issue 1-2 (1997), pp. 273-324
Hall, M.: Correlation-based Feature Selection for Discrete and Numeric Class Machine Learning. 17th Int. Conf. on Machine Learning (ICML), Morgan Kaufmann (2000), pp. 359-366

Feature subset selection methods

• Brute force – try all possible combinations of features as input to a ML algorithm and select the best one (rarely possible in practice – too many combinations of features)
• Embedded – some ML algorithms can automatically select features (e.g. decision trees)
• Filter – select features before the ML algorithm is run; the feature selection is independent of the ML algorithm
  • Based on statistical measures, e.g. information gain, mutual information, odds ratio, etc.
  • Correlation-based feature selection, Relief
• Wrapper – select the best subset for a given ML algorithm; it uses the ML algorithm as a black box to evaluate different subsets and select the best one

Feature selection is well studied in ML and there are many excellent methods!
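A minimal sketch of a filter and a wrapper method using scikit-learn on the iris data; SelectKBest with mutual information and forward SequentialFeatureSelector are one possible choice each, not the specific methods named above:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.feature_selection import (SelectKBest, mutual_info_classif,
                                        SequentialFeatureSelector)
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# Filter: rank features by mutual information with the class, keep the best 2
filt = SelectKBest(score_func=mutual_info_classif, k=2).fit(X, y)
print("filter keeps features:", np.flatnonzero(filt.get_support()))

# Wrapper: use a k-NN classifier as a black box to evaluate feature subsets
knn = KNeighborsClassifier(n_neighbors=3)
wrap = SequentialFeatureSelector(knn, n_features_to_select=2,
                                 direction="forward").fit(X, y)
print("wrapper keeps features:", np.flatnonzero(wrap.get_support()))
```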
Feature weighting

• Can be used instead of feature reduction or in conjunction with it
• The more important features are assigned a higher weight, the less important ones a lower weight
  • manually – based on domain knowledge
  • automatically – some classification algorithms do it (e.g. boosting) or may do it if this option is selected (k-nearest neighbor)
• Key idea: features with higher weights play a more important role in the construction of the ML model

Converting attributes from one type to another

• Converting numeric attributes to nominal (discretization)
• Converting numeric and nominal attributes to binary attributes (binarization)

• Needed as some ML algorithms work only with numeric, nominal or binary attributes

Binarization

• Converting categorical and numeric attributes into binary attributes
• There is no best method; the best one is the one that works best for a given ML algorithm, but all possibilities cannot be evaluated
• Simple technique:
  • categorical attribute -> integer -> binary
  • numeric attribute -> categorical -> integer -> binary

[Table: example of converting a categorical attribute into binary attributes]
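A minimal sketch of the simple technique above using pandas; the bin edges and labels are arbitrary and only for illustration:

```python
import pandas as pd

# Categorical attribute -> binary attributes (one-hot / dummy encoding)
df = pd.DataFrame({"outlook": ["sunny", "overcast", "rainy", "sunny"]})
print(pd.get_dummies(df, columns=["outlook"]))
# -> one binary column per category: outlook_overcast, outlook_rainy, outlook_sunny

# Numeric attribute -> categorical -> binary
temp = pd.DataFrame({"temperature": [64, 70, 75, 85]})
temp["temp_cat"] = pd.cut(temp["temperature"], bins=[0, 70, 80, 100],
                          labels=["cool", "mild", "hot"])
print(pd.get_dummies(temp, columns=["temp_cat"]))
```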

Discretization

• Converting numeric attributes into nominal
• 2 types: unsupervised and supervised
  • Unsupervised – class information is not used
  • Supervised – class information is used
• Decisions to be taken when converting numeric -> nominal:
  • How many categories (intervals)?
  • Where should the splits be?

[Figure: 2-dimensional data where x and y are numeric attributes; the goal is to convert x from numeric to nominal]

Unsupervised discretization

• How many intervals?
  • The user specifies them, e.g. 4
• Where should the splits be? 3 methods:
  • equal width – 4 intervals with the same width, e.g. [0,5), [5,10), [10,15), [15,20)
  • equal frequency – 4 intervals with the same number of points in each of them
  • clustering (e.g. k-means) – 4 intervals determined by a clustering method

[Figure: the same data discretized into 4 intervals using equal width, equal frequency and k-means]
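A minimal sketch of the three methods using scikit-learn's KBinsDiscretizer on made-up 1-D data (strategy "uniform" = equal width, "quantile" = equal frequency, "kmeans" = clustering):

```python
import numpy as np
from sklearn.preprocessing import KBinsDiscretizer

# A single numeric attribute (illustrative values)
x = np.array([[1.0], [2.5], [3.0], [7.0], [7.5], [8.0], [12.0], [18.0]])

for strategy in ("uniform", "quantile", "kmeans"):
    disc = KBinsDiscretizer(n_bins=4, encode="ordinal", strategy=strategy)
    labels = disc.fit_transform(x).ravel().astype(int)   # interval index per value
    print(strategy, labels, "edges:", np.round(disc.bin_edges_[0], 2))
```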


Supervised discretization – entropy-based

• Splits are placed so that they maximize the purity of the intervals
• Entropy is a measure of the purity of a dataset (interval) S
  • The higher the entropy, the lower the purity of the dataset

entropy(S) = - Σ_i P_i log2(P_i),  where P_i is the proportion of examples from class i

• Ex.: Consider a split between 70 and 71. What is the entropy of the left and right datasets (intervals)?
• values of temperature, with the class of each example underneath:

64  65  68  69  70  71  72  73  74  75  80  81  83  85
yes no  yes yes yes no  no  no  yes yes no  yes yes no

entropy(S_left)  = -(4/5) log2(4/5) - (1/5) log2(1/5) = 0.722 bits
entropy(S_right) = -(4/9) log2(4/9) - (5/9) log2(5/9) = 0.991 bits
Entropy-based discretization - example

• Total entropy of the split = weighted average of the interval entropies

totalEntropy = Σ_{i=1}^{n} w_i entropy(S_i)

where w_i is the proportion of values in interval i and n is the number of intervals

• Algorithm: evaluate all possible splits and choose the best one (the one with the lowest total entropy); repeat recursively until a stopping criterion is satisfied (e.g. the user-specified number of splits is reached)

Entropy-based discretization – example (2)

• Attribute: temperature, with the class of each example underneath:

64  65  68  69  70  71  72  73  74  75  80  81  83  85
yes no  yes yes yes no  no  no  yes yes no  yes yes no

• 7 initial possible splits (between adjacent values where the class changes)
• For each of the 7 splits:
  • Compute the entropy of the 2 intervals
  • Compute the total entropy of the split
• Choose the best split (the one with minimum total entropy)
• Repeat for the remaining splits until the desired number of splits is reached
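A minimal sketch of the computation in Python/NumPy, reproducing the interval entropies above; for simplicity it evaluates a candidate split at every position between adjacent values rather than only at class boundaries:

```python
import numpy as np

def entropy(labels):
    """Entropy of a list of class labels: -sum_i P_i log2 P_i."""
    _, counts = np.unique(labels, return_counts=True)
    p = counts / counts.sum()
    return float(-(p * np.log2(p)).sum())

def total_entropy(labels, split_index):
    """Weighted average entropy of the two intervals produced by a split."""
    left, right = labels[:split_index], labels[split_index:]
    n = len(labels)
    return len(left) / n * entropy(left) + len(right) / n * entropy(right)

# Temperature example from the slides (values sorted, classes aligned)
values = [64, 65, 68, 69, 70, 71, 72, 73, 74, 75, 80, 81, 83, 85]
labels = ["yes", "no", "yes", "yes", "yes", "no", "no", "no",
          "yes", "yes", "no", "yes", "yes", "no"]

# Split between 70 and 71 -> left interval has 5 values, right has 9
print(round(entropy(labels[:5]), 3))       # 0.722
print(round(entropy(labels[5:]), 3))       # 0.991
print(round(total_entropy(labels, 5), 3))  # weighted average, ~0.895

# Evaluate all candidate split positions and pick the one with minimum total entropy
best = min(range(1, len(values)), key=lambda i: total_entropy(labels, i))
print("best split between", values[best - 1], "and", values[best])
```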

Normalization and standardization

• Attribute transformation to a new range, e.g. [0,1]
• Used to avoid the dominance of attributes with large values over attributes with small values
• Required for distance-based ML algorithms; some other algorithms also work better with normalized data
• E.g. age (in years) and annual income (in dollars) have different scales:
A=[20, 40 000]
B=[40, 60 000]
D(A,B) = |20-40| + |40 000-60 000| = 20 020
The difference in income dominates; age doesn't contribute

• Solution: first normalize or standardize, then calculate the distance

Normalization and standardization (2)

• Performed for each attribute

Normalization (also called min-max scaling):
x' = (x - min(x)) / (max(x) - min(x))

Standardization:
x' = (x - μ(x)) / σ(x)

where:
x – original value
x' – new value
x – all values of the attribute (a vector)
min(x) and max(x) – min and max values of the attribute (of the vector x)
μ(x) – mean value of the attribute
σ(x) – standard deviation of the attribute
Normalization - example

Examples with 2 attributes: age and income:


A=[20, 40 000]
B=[40, 60 000]
C=[25, 30 000]

Suppose that:
for age: min = 0, max=100
for income: min=0, max=100 000

After normalization:
A=[0.2, 0.4]
B=[0.4, 0.6]
C=[0.25, 0.3]

D(A,B)=|0.2-0.4| + |0.4-0.6|=0.4, i.e. income and age contribute equally
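A minimal sketch in Python; scikit-learn's MinMaxScaler and StandardScaler implement the two formulas from the data itself, and the last lines reproduce the slide's example using its assumed min/max ranges:

```python
import numpy as np
from sklearn.preprocessing import MinMaxScaler, StandardScaler

# Examples from the slide: [age, income]
X = np.array([[20, 40_000],
              [40, 60_000],
              [25, 30_000]], dtype=float)

# Min-max normalization and standardization (min/max/mean/std taken from the data)
print(MinMaxScaler().fit_transform(X))
print(StandardScaler().fit_transform(X))

# Manual min-max with the slide's assumed ranges, reproducing A=[0.2, 0.4] etc.
mins, maxs = np.array([0, 0]), np.array([100, 100_000])
print((X - mins) / (maxs - mins))
```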
Similarity measures

Measuring similarity

• Many ML algorithms require measuring the similarity between 2 examples
• Two main types of measures:
  • Distance
  • Correlation

Euclidean and Manhattan distance

• Distance measures for numeric attributes
• A, B – examples with attribute values a1, a2, ..., an and b1, b2, ..., bn
• E.g. A=[1, 3, 5], B=[1, 6, 9]

• Euclidean distance (L2 norm) – most frequently used

D(A,B) = sqrt( (a1-b1)^2 + (a2-b2)^2 + ... + (an-bn)^2 )
D(A,B) = sqrt( (1-1)^2 + (3-6)^2 + (5-9)^2 ) = 5

• Manhattan distance (L1 norm)

D(A,B) = |a1-b1| + |a2-b2| + ... + |an-bn|
D(A,B) = |1-1| + |3-6| + |5-9| = 7

Minkowski distance

• Minkowski distance – generalization of Euclidean and Manhattan

D(A,B) = ( |a1-b1|^q + |a2-b2|^q + ... + |an-bn|^q )^(1/q)
q – positive integer

• Weighted distance – each attribute is assigned a weight according to its importance (requires domain knowledge)
• Weighted Euclidean:

D(A,B) = sqrt( w1(a1-b1)^2 + w2(a2-b2)^2 + ... + wn(an-bn)^2 )
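A minimal sketch of the Minkowski and weighted Euclidean distances in Python/NumPy, checked against the example A=[1, 3, 5], B=[1, 6, 9]:

```python
import numpy as np

def minkowski(a, b, q):
    """Minkowski distance; q=1 gives Manhattan, q=2 gives Euclidean."""
    return float(np.sum(np.abs(np.asarray(a) - np.asarray(b)) ** q) ** (1.0 / q))

def weighted_euclidean(a, b, w):
    """Euclidean distance with per-attribute weights chosen by the user."""
    a, b, w = map(np.asarray, (a, b, w))
    return float(np.sqrt(np.sum(w * (a - b) ** 2)))

A, B = [1, 3, 5], [1, 6, 9]
print(minkowski(A, B, 2))                    # 5.0  (Euclidean)
print(minkowski(A, B, 1))                    # 7.0  (Manhattan)
print(weighted_euclidean(A, B, [1, 1, 1]))   # 5.0  (equal weights = plain Euclidean)
```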

Similarity between binary vectors

• Hamming distance = Manhattan distance for binary vectors
  • Counts the number of bits that differ

D(A,B) = |a1-b1| + |a2-b2| + ... + |an-bn|

A = [1 0 0 0 0 0 0 0 0 0]
B = [0 0 0 0 0 0 1 0 0 1]
D(A,B) = 3

• Similarity coefficients:
  f00: number of attribute positions where A=0 and B=0
  f01: number of attribute positions where A=0 and B=1
  f10: number of attribute positions where A=1 and B=0
  f11: number of attribute positions where A=1 and B=1
• Calculate these coefficients for the example above!
  Answer: f01 = 2, f10 = 1, f00 = 7, f11 = 0

Similarity between binary vectors (2)

• Simple Matching Coefficient (SMC) – number of matching 1-1 and 0-0 pairs divided by the number of attributes
SMC = (f11+f00)/(f01+f10+f11+f00)

Ex.: A = [1 0 0 0 0 0 0 0 0 0]
     B = [0 0 0 0 0 0 1 0 0 1]
f01 = 2, f10 = 1, f00 = 7, f11 = 0
SMC = (0+7) / (2+1+0+7) = 0.7

• Task: Suppose that A and B are the supermarket bills of 2 customers. Each product in the supermarket corresponds to a different attribute:
  • attribute value = 1 – the product was purchased
  • attribute value = 0 – the product was not purchased
• SMC is used to calculate the similarity between A and B. Is there any problem with using SMC?

Similarity between binary vectors (3)

• Yes, SMC will find all customer transactions (bills) to be similar
• Reason: the number of products that are not purchased in a transaction is much bigger than the number of products that are purchased
  • => f00 will be very high (not purchased products)
  • f11 will be low (purchased products)
  • f00 will be much higher than f11, and the effect of f11 will be lost

SMC = (f11+f00)/(f01+f10+f11+f00)

• => More generally, the problem is that the 2 vectors A and B contain many 0s, i.e. they are very sparse => SMC is not suitable for sparse data
A = [1 0 0 0 0 0 0 0 0 0]
B = [0 0 0 0 0 0 1 0 0 1]

SMC vs Jaccard

• An alternative: Jaccard coefficient
  • counts matching 1-1 pairs and ignores matching 0-0 pairs
J = f11/(f01+f10+f11)

A = [1 0 0 0 0 0 0 0 0 0]
B = [0 0 0 0 0 0 1 0 0 1]
f01 = 2, f10 = 1, f00 = 7, f11 = 0
J = 0 / (2 + 1 + 0) = 0 (A and B are dissimilar)

• Compare with SMC:
SMC = (0+7) / (2+1+0+7) = 0.7 (A and B are highly similar – incorrect)
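A minimal sketch computing the four coefficients, SMC and Jaccard for two binary vectors; the value returned for an all-zero Jaccard denominator is an assumption made here:

```python
import numpy as np

def smc_and_jaccard(a, b):
    """Simple Matching Coefficient and Jaccard coefficient for two binary vectors."""
    a, b = np.asarray(a), np.asarray(b)
    f11 = int(np.sum((a == 1) & (b == 1)))
    f00 = int(np.sum((a == 0) & (b == 0)))
    f01 = int(np.sum((a == 0) & (b == 1)))
    f10 = int(np.sum((a == 1) & (b == 0)))
    smc = (f11 + f00) / (f01 + f10 + f11 + f00)
    jaccard = f11 / (f01 + f10 + f11) if (f01 + f10 + f11) > 0 else 1.0
    return smc, jaccard

A = [1, 0, 0, 0, 0, 0, 0, 0, 0, 0]
B = [0, 0, 0, 0, 0, 0, 1, 0, 0, 1]
print(smc_and_jaccard(A, B))   # (0.7, 0.0) - SMC says "similar", Jaccard says "dissimilar"
```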

Cosine similarity

• Useful for sparse data (both binary and non-binary)
• Widely used for classification of text documents

cos(A,B) = (A . B) / (||A|| ||B||)

. – vector dot product, ||A|| – length of vector A

• Geometric representation: measures the angle between A and B
  • Cosine similarity = 1 => angle(A,B) = 0°
  • Cosine similarity = 0 => angle(A,B) = 90°

Cosine similarity - example

• Two document vectors:
d1 = [3 2 0 5 0 0 0 2 0 0]
d2 = [1 0 0 0 0 0 0 1 0 2]

cos(d1,d2) = (d1 . d2) / (||d1|| ||d2||)

d1 . d2 = 3*1 + 2*0 + 0*0 + 5*0 + 0*0 + 0*0 + 0*0 + 2*1 + 0*0 + 0*2 = 5
||d1|| = (3*3 + 2*2 + 0*0 + 5*5 + 0*0 + 0*0 + 0*0 + 2*2 + 0*0 + 0*0)^1/2 = 42^1/2 = 6.481
||d2|| = (1*1 + 0*0 + 0*0 + 0*0 + 0*0 + 0*0 + 0*0 + 1*1 + 0*0 + 2*2)^1/2 = 6^1/2 = 2.449
=> cos(d1,d2) = 0.315
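A minimal sketch of the cosine similarity in Python/NumPy, reproducing the result above:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity: dot product divided by the product of the vector lengths."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

d1 = [3, 2, 0, 5, 0, 0, 0, 2, 0, 0]
d2 = [1, 0, 0, 0, 0, 0, 0, 1, 0, 2]
print(round(cosine_similarity(d1, d2), 3))   # 0.315
```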

Correlation

• Measures linear relationship between numeric attributes
• Pearson correlation coefficient between vectors x and y with dimensionality n:

corr(x,y) = covar(x,y) / ( std(x) std(y) )

where:
mean(x)    = (1/n) Σ_{k=1}^{n} x_k
std(x)     = sqrt( Σ_{k=1}^{n} (x_k - mean(x))^2 / (n-1) )
covar(x,y) = (1/(n-1)) Σ_{k=1}^{n} (x_k - mean(x))(y_k - mean(y))

• Range: [-1, 1]
  • -1: perfect negative correlation
  • +1: perfect positive correlation
  • 0: no correlation
Correlation - examples

• Ex1: corr(x,y)=?
x=(-3, 6, 0, 3, -6)
y=( 1,-2, 0,-1, 2)

• Ex2: corr(x,y)=?
x=(3, 6, 0, 3, 6)
y=( 1, 2, 0, 1, 2)

• Ex3: corr(x,y)=?
x=(-3, -2, -1, 0, 1, 2, 3)
y=( 9, 4, 1, 0, 1, 4, 9)

Answers

• Ex1: corr(x,y) = -1 (perfect negative linear correlation)
x=(-3, 6, 0, 3, -6)
y=( 1,-2, 0,-1, 2)

• Ex2: corr(x,y) = +1 (perfect positive linear correlation)
x=(3, 6, 0, 3, 6)
y=(1, 2, 0, 1, 2)

• Ex3: corr(x,y) = 0 (no linear correlation)
x=(-3, -2, -1, 0, 1, 2, 3)
y=( 9, 4, 1, 0, 1, 4, 9)
However, there is a non-linear relationship: y = x^2
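A minimal sketch of the Pearson correlation in Python/NumPy, reproducing the three answers (up to floating-point rounding); np.corrcoef(x, y)[0, 1] gives the same values:

```python
import numpy as np

def pearson(x, y):
    """Pearson correlation coefficient: covar(x, y) / (std(x) * std(y))."""
    x, y = np.asarray(x, dtype=float), np.asarray(y, dtype=float)
    cov = np.sum((x - x.mean()) * (y - y.mean())) / (len(x) - 1)
    return float(cov / (x.std(ddof=1) * y.std(ddof=1)))

print(pearson([-3, 6, 0, 3, -6], [1, -2, 0, -1, 2]))             # -1.0
print(pearson([3, 6, 0, 3, 6], [1, 2, 0, 1, 2]))                 #  1.0
print(pearson([-3, -2, -1, 0, 1, 2, 3], [9, 4, 1, 0, 1, 4, 9]))  #  0.0
```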

Correlation – visual evaluation

Distance measures for nominal attributes

• Various options depending on the task and the type of data; requires domain expertise
• E.g.:
  • difference = 0 if the attribute values are the same
  • difference = 1 if they are not

• Example: 2 attributes – temperature and windy
temperature values: low and high
windy values: yes and no
A = (high, no)
B = (high, yes)
d(A,B) = (0+1)^1/2 = 1 (Euclidean distance)
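A minimal sketch of this 0/1 scheme combined with the Euclidean distance, matching the example above:

```python
import numpy as np

def nominal_euclidean(a, b):
    """Euclidean distance where each nominal attribute contributes 0 (equal) or 1 (different)."""
    diffs = np.array([0 if ai == bi else 1 for ai, bi in zip(a, b)])
    return float(np.sqrt(diffs.sum()))

A = ("high", "no")
B = ("high", "yes")
print(nominal_euclidean(A, B))   # 1.0
```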

