Classification (Discrete value output)
Regression (Predict real value output)
Clustering (Structure of the dataset)
Supervised Learning: training dataset (n x m) -> learning algorithm -> hypothesis h: X -> Y
Unsupervised Learning
Example: house prices in Thane
Size (X) | Price (Y)
525      | 25 L
1000     | 40 L
750      | ? (to be predicted)

Plot of price (Y) vs size (X): candidate hypotheses h1, h2, h3 fit the (X, Y) data, e.g. a multivariate form such as Y = 3X1 + 2X2.
Example parameter choices: (Θ0, Θ1) = (2, 0), (0, 1), (2, 1)
• h(x) = Θ0 + Θ1*x : linear regression equation (univariate); the multivariate form adds one term per feature
• Θ = parameters; choose Θ0, Θ1 to minimize J(Θ0, Θ1)
• Cost function J (squared error function): J(Θ0, Θ1) = 1/(2m) * Σ (h(x_i) - y_i)²
Why squared error? Raw errors can cancel out: 2 + (-2) + 4 + (-4) = 0, but the squared errors do not: 4 + 4 + 16 + 16 = 40.
Minimize J(Θ1)
Simplified hypothesis: h(x) = Θ1*x (i.e. Θ0 = 0)

Training data:
X  Y
1  1
2  2
3  3

Case 1: Θ1 = 1     J(Θ1) = 1/(2m) * (0² + 0² + 0²) = 0
Case 2: Θ1 = 0.5   J(Θ1) = 1/(2m) * ((0.5-1)² + (1-2)² + (1.5-3)²) = 3.5/6 ≈ 0.58
Case 3: Θ1 = -0.5  J(Θ1) = 1/(2m) * ((-0.5-1)² + (-1-2)² + (-1.5-3)²) = 31.5/6 = 5.25
Plot of J(Θ1) against Θ1: a bowl-shaped (convex) function with its minimum at Θ1 = 1.
Minimize J(Θ1), the objective function: this is an optimization problem, and the bowl shape gives faster convergence. (See the sketch below.)
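A minimal Python sketch (NumPy assumed) that recomputes J(Θ1) for the three cases above:

```python
# Recompute J(Θ1) for the simplified hypothesis h(x) = Θ1*x and the
# squared-error cost J = 1/(2m) Σ (h(x)-y)², using the toy data above.
import numpy as np

X = np.array([1.0, 2.0, 3.0])
Y = np.array([1.0, 2.0, 3.0])
m = len(X)

def cost(theta1):
    # J(Θ1) = 1/(2m) * Σ (Θ1*x - y)²
    return np.sum((theta1 * X - Y) ** 2) / (2 * m)

for theta1 in (1.0, 0.5, -0.5):
    print(f"Θ1 = {theta1:5.2f}  ->  J(Θ1) = {cost(theta1):.4f}")
# Expected: 0.0, ~0.583, 5.25; the bowl-shaped curve has its minimum at Θ1 = 1.
```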
For two parameters, J(Θ0, Θ1) is a surface over the (Θ0, Θ1) plane.

Gradient descent (to minimize J):
1. Start with some values of Θ0, Θ1 (random initialization, or e.g. Θ0 = Θ1 = 0)
2. Repeatedly update Θ to reduce J until convergence:
   Θj := Θj - α * ∂J/∂Θj, where α is the learning rate
3. The derivative plays the role of a slope: for a line Y = mx + c, slope m = (y2 - y1)/(x2 - x1)
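A hedged sketch of batch gradient descent on the toy data above; the update rule Θj := Θj - α * ∂J/∂Θj is standard, while the specific α and iteration count are illustrative choices, not from the notes:

```python
# Batch gradient descent for h(x) = Θ0 + Θ1*x on the toy (X, Y) data.
import numpy as np

X = np.array([1.0, 2.0, 3.0])
Y = np.array([1.0, 2.0, 3.0])
m = len(X)

theta0, theta1 = 0.0, 0.0   # step 1: initialize (here to 0; could be random)
alpha = 0.1                 # learning rate (illustrative)

for _ in range(500):        # step 2: update until (approximate) convergence
    h = theta0 + theta1 * X
    d_theta0 = np.sum(h - Y) / m          # ∂J/∂Θ0
    d_theta1 = np.sum((h - Y) * X) / m    # ∂J/∂Θ1
    theta0 -= alpha * d_theta0
    theta1 -= alpha * d_theta1

print(theta0, theta1)       # should approach Θ0 ≈ 0, Θ1 ≈ 1
```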
• Research directions: ML, DL/RL
• Literature sources: IEEE, ScienceDirect (Elsevier), ACM, Springer
• Research papers from 2016 onward represent the state of the art
• Approach 1: take an existing method X, identify a disadvantage, and propose an improved X1
• Approach 2: combine two methods, X1 + Y1 = hybrid X1Y1
h(x) = Θ0 + Θ1*x (linear)
Polynomial / quadratic regression adds higher-order terms, e.g. h(x) = Θ0 + Θ1x1 + Θ2x1² (quadratic) or up to Θ3x1³ (cubic).
Normalization / feature scaling
Classical pipeline: Raw data -> Feature extraction -> Feature space -> Feature selection / reduction (FR, GA, AN) -> classifier

Example feature table:
Color  | Size     | Weight | Class
Orange | 15-25 cm | 100 gm | Mango
Green  | 100 cm   | 1 Kg   | Watermelon
Example (activity recognition): 1 hr of raw sensor data, split into 1-minute windows -> 60 samples; at 100 values per minute, 100*60 = 6000 raw values. For each window compute features such as Mean, SD, Var, MG3, MG4, RMS (100*6 = 600), labelled with a class: Bus, Bicycle, Train, Walking. Features can also come from the frequency domain or the time-frequency domain.

Classical ML pipeline: Raw data -> Feature extraction -> Feature space -> remove redundant and non-sensitive features via Feature selection / Transformation (PCA) (FR, GA, AN) -> classifier -> Output

Deep learning pipeline: Raw data -> Deep learning method -> Output
Distinguishing feature of DL: the capability to extract features on its own.
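A hedged sketch of the per-window feature extraction described above; the window length and the random data are illustrative assumptions:

```python
# Per-window features (mean, SD, variance, RMS) from raw 1-D sensor data,
# following the "1 hr split into 1-minute windows" idea above.
import numpy as np

raw = np.random.randn(6000)            # e.g. 60 minutes * 100 values per minute
windows = raw.reshape(60, 100)         # one row per 1-minute window

features = np.column_stack([
    windows.mean(axis=1),                  # Mean
    windows.std(axis=1),                   # SD
    windows.var(axis=1),                   # Var
    np.sqrt((windows ** 2).mean(axis=1)),  # RMS
])
print(features.shape)                  # (60, 4): 60 samples, 4 features each
# These feature vectors, plus a class label such as Bus/Bicycle/Train/Walking,
# form the feature space fed to the classifier.
```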
Feature scaling and Min J(Θ): with features x1, x2 on very different scales, the contours of J are elongated and gradient descent needs many more iterations; with scaling, convergence is faster.
Min-max normalization (MinMaxScaler): scale each column to the 0-1 range, e.g. xi' = xi / max(x) per column, so 0 <= x1 <= 1 and 0 <= x2 <= 1.
Example: image pixel values in 0-255 are divided by 255.
(Ordinary least squares is the closed-form alternative to iterative minimization.)
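A small sketch of min-max normalization; the (x - min)/(max - min) form is used here, with division by max(x) alone being the special case where min(x) = 0 (e.g. pixels 0-255 divided by 255):

```python
# Min-max normalization: scale each column of X to the 0-1 range.
import numpy as np

X = np.array([[500.0, 2.0],
              [1000.0, 3.0],
              [750.0, 4.0]])

X_scaled = (X - X.min(axis=0)) / (X.max(axis=0) - X.min(axis=0))
print(X_scaled)   # every column now lies in [0, 1]
```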
h(x) = Θ0 + Θ1x1 + Θ2x2
h(x) = Θ0x0 + Θ1x1 + Θ2x2 + ... + Θnxn
Θ = [Θ0, Θ1, ..., Θn]ᵀ, X = [x0, x1, ..., xn]ᵀ with x0 = 1
h(x) = ΘᵀX
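A tiny sketch of the vectorized hypothesis h(x) = ΘᵀX with x0 = 1 prepended; the Θ values are illustrative:

```python
# Vectorized multivariate hypothesis h(x) = ΘᵀX.
import numpy as np

theta = np.array([1.0, 2.0, 3.0])      # Θ0, Θ1, Θ2 (illustrative values)
x = np.array([0.5, -1.0])              # raw features x1, x2
x = np.insert(x, 0, 1.0)               # prepend x0 = 1
h = theta @ x                          # ΘᵀX = Θ0 + Θ1*x1 + Θ2*x2
print(h)                               # 1 + 2*0.5 + 3*(-1) = -1.0
```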
Logistic regression: classification
2-class classification; examples: email -> spam / not spam, disease -> cancer / not cancer, yes / no, win / lose
Y = 0 or 1
Threshold = 0.5: h(x) > 0.5 -> Y = 1; h(x) <= 0.5 -> Y = 0
We want 0 <= h(x) <= 1.
Linear regression: h(x) = ΘᵀX; logistic regression: h(x) = g(ΘᵀX), where g(z) = 1/(1 + e^(-z)) is the sigmoid function.
h(x) is the probability that Y = 1 given x: p(y=1 | X, Θ), with p(y=0 | X, Θ) + p(y=1 | X, Θ) = 1.
Decision boundary example with features x1, x2: predict Y = 1 if x1 + x2 >= 5 and Y = 0 if x1 + x2 < 5, i.e. the straight line x1 + x2 = 5 in the (x1, x2) plane.
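A hedged sketch of the sigmoid hypothesis and the x1 + x2 = 5 decision boundary above; the weights Θ = [-5, 1, 1] are an illustrative choice that reproduces that boundary:

```python
# Sigmoid hypothesis h(x) = g(ΘᵀX) with a linear decision boundary x1 + x2 = 5.
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))          # sigmoid

theta = np.array([-5.0, 1.0, 1.0])           # Θ0, Θ1, Θ2 (assumed for illustration)

def predict(x1, x2):
    h = g(theta @ np.array([1.0, x1, x2]))   # 0 <= h(x) <= 1
    return 1 if h > 0.5 else 0               # threshold 0.5

print(predict(3, 3))   # x1 + x2 = 6 >= 5  -> Y = 1
print(predict(1, 2))   # x1 + x2 = 3 <  5  -> Y = 0
```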
Cost(h(x), y) = -log(h(x))    if y = 1
              = -log(1-h(x))  if y = 0
If y = 1: cost = 0 when h(x) = 1, but as h(x) -> 0 the cost -> infinity.
If y = 0: cost = 0 when h(x) = 0, but as h(x) -> 1 the cost -> infinity.
Minimize J(Θ), the average of this cost over the training set.
(Plot the two curves at https://www.desmos.com/calculator to see this.)
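A small numerical check of the two cost branches described above:

```python
# Logistic cost: near 0 when the prediction matches the label,
# very large when the prediction is confidently wrong.
import numpy as np

def cost(h, y):
    return -np.log(h) if y == 1 else -np.log(1.0 - h)

print(cost(0.99, 1))   # y = 1, h(x) near 1 -> cost near 0
print(cost(0.01, 1))   # y = 1, h(x) near 0 -> cost large (-> infinity as h -> 0)
print(cost(0.01, 0))   # y = 0, h(x) near 0 -> cost near 0
print(cost(0.99, 0))   # y = 0, h(x) near 1 -> cost large
```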
Multi-class case (Y = 1, 2, 3): One vs All (OVA) trains one binary classifier per class and predicts the class with the maximum h(x); the same OVA scheme works with NN and SVM, while decision trees (DT) handle multiple classes directly. (Sketch below.)
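A hedged sketch of One-vs-All prediction; the three Θ vectors are illustrative placeholders, not trained values:

```python
# One-vs-All: one sigmoid classifier per class, pick the class with the largest h(x).
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

thetas = {1: np.array([-1.0,  2.0, -1.0]),   # classifier for Y = 1 vs rest
          2: np.array([ 0.0, -1.0,  2.0]),   # classifier for Y = 2 vs rest
          3: np.array([ 1.0, -1.0, -1.0])}   # classifier for Y = 3 vs rest

def predict(x1, x2):
    x = np.array([1.0, x1, x2])
    scores = {c: g(theta @ x) for c, theta in thetas.items()}
    return max(scores, key=scores.get)       # class with the maximum h(x)

print(predict(2.0, 0.5))
```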
Model fit (for both linear and logistic regression):
Underfit -> high bias
Just right
Overfit -> high variance
Subset Selection
Reduce the number of features
1) Manually
2) Algorithm
Best subset selection
Forward and backward selection
Shrinkage method (Regularization)
Ridge (L2 Norm)
Lasso (L1 Norm) least absolute shrinkage and selection operator
P-norm
Regularization example:
Θ0 + Θ1x + Θ2x² (fits well)
Θ0 + Θ1x + Θ2x² + Θ3x³ + Θ4x⁴ (overfits)
Adding a large penalty such as + 1000*Θ3² + 1000*Θ4² to the cost drives Θ3 and Θ4 toward 0, effectively removing the higher-order terms.
Elastic net combines the L1 and L2 penalties; L21 norm.
Regularization parameter: λ.
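A hedged sketch of a ridge (L2) regularized cost; swapping the penalty for λ * Σ|Θj| gives lasso (L1). The value of λ and the data are illustrative:

```python
# Ridge (L2) regularized squared-error cost for linear regression.
import numpy as np

def ridge_cost(theta, X, y, lam):
    m = len(y)
    h = X @ theta
    mse = np.sum((h - y) ** 2) / (2 * m)
    penalty = lam * np.sum(theta[1:] ** 2) / (2 * m)   # Θ0 is not penalized
    return mse + penalty

X = np.array([[1.0, 1.0], [1.0, 2.0], [1.0, 3.0]])     # first column is x0 = 1
y = np.array([1.0, 2.0, 3.0])
print(ridge_cost(np.array([0.0, 1.0]), X, y, lam=10.0))
```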
Reference: Neural Networks and Learning Machines by Simon Haykin
McCulloch-Pitts model: a neuron characterized by its input/output behaviour.
Example: logic gates from a single unit with sigmoid g and bias input x0 = 1:
AND: h(x) = g(-15 + 10x1 + 10x2)
OR:  h(x) = g(-5 + 10x1 + 10x2)

X1 X2 | AND h(x)   | OR h(x)
0  0  | g(-15) ≈ 0 | g(-5) ≈ 0
0  1  | g(-5) ≈ 0  | g(5) ≈ 1
1  0  | g(-5) ≈ 0  | g(5) ≈ 1
1  1  | g(5) ≈ 1   | g(15) ≈ 1
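A short sketch reproducing the AND and OR units from the table above, with sigmoid as g:

```python
# Single-unit AND and OR gates, as in the truth table above.
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

def AND(x1, x2):
    return g(-15 + 10 * x1 + 10 * x2)   # h(x) = g(-15 + 10x1 + 10x2)

def OR(x1, x2):
    return g(-5 + 10 * x1 + 10 * x2)    # h(x) = g(-5 + 10x1 + 10x2)

for x1, x2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x1, x2, round(AND(x1, x2)), round(OR(x1, x2)))
# Prints the AND and OR truth tables (0/1 after rounding the sigmoid output).
```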
NN as a directed graph:
Synaptic link: x -> y = wx (linear)
Activation link: x -> y = g(wx) (non-linear)
Summing junction: inputs xi and xj contribute yi and yj, output Y = yi + yj
Computation graph: derivative intuition (slope = height/width):
Y = 2x:  x = 1 -> y = 2;  x = 1.001 -> y = 2.002   => slope = 2
         x = 4 -> y = 8;  x = 4.001 -> y = 8.002   => slope = 2
Y = x²:  x = 1 -> y = 1;  x = 1.001 -> y ≈ 1.002   => slope ≈ 2 (2x at x = 1)
         x = 4 -> y = 16; x = 4.001 -> y ≈ 16.008  => slope ≈ 8 (2x at x = 4)
Computation graph example: J(a, b, c) = 3(a + bc)
Forward pass: u = bc, v = a + u, J = 3v
(e.g. v = 11 gives J = 33; nudging v to 11.001 gives J = 33.003, so dJ/dv = 3)
Backward pass (chain rule):
dJ/dv = 3
dJ/du = dJ/dv * dv/du = 3 * 1 = 3
dJ/da = dJ/dv * dv/da = 3 * d(a+u)/da = 3 * 1 = 3
dJ/db = dJ/dv * dv/du * du/db = 3 * c
dJ/dc = dJ/dv * dv/du * du/dc = 3 * b
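A small numerical check of these chain-rule results; the values a = 5, b = 3, c = 2 are chosen so that v = 11 and J = 33 as above:

```python
# Forward pass through J(a, b, c) = 3(a + bc), with finite-difference derivatives.
def forward(a, b, c):
    u = b * c       # u = bc
    v = a + u       # v = a + u
    return 3 * v    # J = 3v

a, b, c = 5.0, 3.0, 2.0          # illustrative values (v = 11, J = 33)
eps = 0.001
print((forward(a + eps, b, c) - forward(a, b, c)) / eps)   # dJ/da ≈ 3
print((forward(a, b + eps, c) - forward(a, b, c)) / eps)   # dJ/db ≈ 3*c = 6
print((forward(a, b, c + eps) - forward(a, b, c)) / eps)   # dJ/dc ≈ 3*b = 9
```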
Logistic regression as one neuron: inputs X with weights W and bias b -> sum -> sigmoid -> a = h(x) = g(ΘᵀX) -> cost/loss
Loss: L(y, a) = -y*log(a) - (1-y)*log(1-a)
da/dz = a(1-a)   (since g'(z) = g(z)(1-g(z)))
dL/da = -y/a + (1-y)/(1-a)
dL/dz = dL/da * da/dz = a - y
dL/dw1 = dL/da * da/dz * dz/dw1 = (a - y)*x1
dL/db = a - y
Forward for one neuron: z = wᵀx + b, a = g(z), loss L(a, y)
Activation function choices: sigmoid, tanh, threshold, ReLU (a = max(0, z)), leaky ReLU (a = max(0.001z, z)), RBF
2-layer network: input layer (x1, x2, x3) -> hidden layer-1 (W1, b1 -> Z1 and a1) -> output layer (W2, b2 -> Z2 and a2) -> L(a2, y)
Shapes: X = a0: (3,1); W1: (3,3), b1: (3,1), Z1: (3,1), a1: (3,1); W2: (1,3), b2: (1,1), Z2: (1,1), A2: (1,1)
Parameters: W1, b1, W2, b2
Cost function: J(W1, b1, W2, b2) = 1/m Σ L(a2, y)
Initialization: W = random(-1, 1) * 0.0001 (small random values), b = 0
Forward propagation:
Z1 = W1·X + b1
A1 = g(Z1)
Z2 = W2·A1 + b2
A2 = g(Z2)   (sigmoid)

Backward propagation:
dZ2 = dL/dZ2 = dL/dA2 * dA2/dZ2 = A2 - Y
dW2 = dZ2 · A1ᵀ
db2 = dZ2
dZ1 = dL/dZ1 = (W2ᵀ · dZ2) * A1(1 - A1)
dW1 = dZ1 · Xᵀ
db1 = dZ1

Parameter update:
W2 = W2 - α·dW2
b2 = b2 - α·db2
W1 = W1 - α·dW1
b1 = b1 - α·db1
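A hedged sketch of one forward/backward pass for the 3-3-1 network above, with sigmoid activations in both layers and the shapes listed in the notes; the input, label, and learning rate are illustrative:

```python
# One forward/backward pass and parameter update for a 3-3-1 network.
import numpy as np

def g(z):
    return 1.0 / (1.0 + np.exp(-z))

np.random.seed(0)
X = np.random.randn(3, 1)                                   # a0, shape (3, 1)
Y = np.array([[1.0]])
W1 = np.random.uniform(-1, 1, (3, 3)) * 0.0001; b1 = np.zeros((3, 1))
W2 = np.random.uniform(-1, 1, (1, 3)) * 0.0001; b2 = np.zeros((1, 1))
alpha = 0.1

# Forward propagation
Z1 = W1 @ X + b1;  A1 = g(Z1)
Z2 = W2 @ A1 + b2; A2 = g(Z2)

# Backward propagation
dZ2 = A2 - Y
dW2 = dZ2 @ A1.T;  db2 = dZ2
dZ1 = (W2.T @ dZ2) * A1 * (1 - A1)
dW1 = dZ1 @ X.T;   db1 = dZ1

# Parameter update
W2 -= alpha * dW2; b2 -= alpha * db2
W1 -= alpha * dW1; b1 -= alpha * db1
print(Z1.shape, A2.shape)                                   # (3, 1) (1, 1)
```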
GA (Genetic Algorithm), e.g. for optimizing ELMs:
• Genetic representation / initial population: ELM1(L, w, b), ELM2(L, w, b), ELM3, ..., ELMn; population size e.g. 20, 50, 100 (multi-dimensional)
• Fitness function: RMSE on test data (e.g. keep candidates with RMSE < 0.3), or accuracy / F1-score
• New population is built from the initial population
• For i = 0 to T, with T preset (e.g. 100, 50, 20) or early stopping (e.g. after 5 generations without improvement)
• Selection function: tournament or roulette-wheel selection, giving parents e.g. A: ELM1, B: ELM5
• Crossover operator:
  - Node-based: randomly pick nodes (< L) from A and B; Child 1 = A + b(B), Child 2 = B + b(A)
  - Link-based: randomly choose weights w from A and B, or arithmetic crossover
• Mutation operator:
  - Node-based: delete a node / add a node
  - Link-based: exchange links
• Evaluate the fitness function (RMSE) on the children
(Grid search, by contrast, is time-consuming. A rough sketch follows this list.)
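A heavily hedged sketch of the GA loop above, evolving candidate weight vectors with tournament selection, link-based crossover (randomly choosing w from A and B), and a simple mutation; the fitness function is a placeholder standing in for "test RMSE of a trained ELM":

```python
# Minimal GA loop: selection, crossover, mutation over weight vectors.
import numpy as np

rng = np.random.default_rng(0)

def fitness(w):                      # placeholder for test RMSE of ELM(L, w, b)
    return np.sum((w - 0.5) ** 2)    # lower is better

pop = [rng.uniform(-1, 1, 10) for _ in range(20)]     # initial population, size 20

for generation in range(50):                          # T preset (or early stopping)
    def tournament():
        i, j = rng.choice(len(pop), 2, replace=False)
        return pop[i] if fitness(pop[i]) < fitness(pop[j]) else pop[j]
    children = []
    for _ in range(len(pop)):
        A, B = tournament(), tournament()             # selection
        mask = rng.random(10) < 0.5
        child = np.where(mask, A, B)                  # randomly choose w from A and B
        if rng.random() < 0.1:                        # mutation: perturb one link
            child[rng.integers(10)] += rng.normal(0, 0.1)
        children.append(child)
    pop = children

print(min(fitness(w) for w in pop))                   # best child fitness
```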
One-hot encoding and the softmax function
Why? Mapping categorical data (B, C, T) to 1, 2, 3 imposes an artificial order, and an output like 1.5 has no meaning.
One-hot encoding: one binary variable per category
B: 1, 0, 0
C: 0, 1, 0
T: 0, 0, 1
Softmax function: turns a score vector into probabilities that sum to 1.0; e.g. for [1, 3, 2] the normalized shares are 1/(1+3+2), 3/(1+3+2), 2/(1+3+2) (true softmax first exponentiates the scores: e^zi / Σj e^zj).
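A small sketch of one-hot encoding for (B, C, T) and of the softmax function:

```python
# One-hot encoding of categories and a numerically stable softmax.
import numpy as np

categories = ["B", "C", "T"]
one_hot = {c: np.eye(len(categories))[i] for i, c in enumerate(categories)}
print(one_hot["C"])            # [0. 1. 0.]

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract max for numerical stability
    return e / e.sum()

print(softmax(np.array([1.0, 3.0, 2.0])))   # probabilities that sum to 1.0
```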
Deep Learning
Why DNN? Same forward and back propagation, with more layers.
Parameters: W, b for each layer
Hyperparameters: learning rate, number of epochs, number of layers L, neurons per layer, activation function
(2018/2019 direction: broad learning.)
Convolutional Neural Network (CNN)
Edge detection with a filter / kernel / mask: vertical and horizontal edge filters; padding; stride = 1 or 2; max pooling; the image is treated as 2-D.
Convolving an m × n (× 3) image X with an a × b (× 3) filter gives an output of size (m - a + 1) × (n - b + 1); add a bias and apply ReLU: relu(WᵀX + b). With several filters (each with its own bias b1, b2, ...) the outputs are stacked.
Cost: J = 1/m Σ L(y', y)
Typical architecture: Input -> Conv -> Max pool -> Conv -> Max pool -> Conv -> Max pool -> FC -> FC -> FC -> output
Why convolution? Example: a 28 × 28 × 3 input has 2352 values; six 5 × 5 filters give a 24 × 24 × 6 output with 3456 values. A fully connected layer between them would need 2352 × 3456 ≈ 81 lakh weights, whereas the conv layer has only about 5 × 5 × 6 = 150 weights (456 parameters counting the 3 input channels and the biases).
Key properties: parameter sharing and sparsity of connections.
Pre-trained models: VGG-16, VGG-19, ResNet; transfer learning.
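A hedged sketch of a single-channel vertical-edge convolution with stride 1 and no padding, showing the (m - a + 1) × (n - b + 1) output size; the image and kernel are illustrative:

```python
# Vertical-edge detection by 2-D convolution (valid, stride 1).
import numpy as np

image = np.zeros((6, 6))
image[:, :3] = 10.0                           # bright left half, dark right half

kernel = np.array([[1, 0, -1],
                   [1, 0, -1],
                   [1, 0, -1]], dtype=float)  # vertical edge detector

m, n = image.shape
a, b = kernel.shape
out = np.zeros((m - a + 1, n - b + 1))        # (6-3+1) x (6-3+1) = 4 x 4
for i in range(out.shape[0]):
    for j in range(out.shape[1]):
        out[i, j] = np.sum(image[i:i+a, j:j+b] * kernel)

print(out)   # large values down the middle columns, where the vertical edge sits
```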
Sequence models (NLP)
X -> Y examples:
Audio -> person (speaker identification)
Music -> pattern / music generation
Sentiment analysis; hate speech analysis; trend in share market
Book / text -> suicide-risk detection
Syntax checking
Paragraph -> summary
Language-model example: choosing between "am" and "are" in a sentence such as "Mangoes ___ in VJTI ...".
RNN input/output configurations (inputs x0, x1, x2, x3 -> outputs y0, y1, y2, y3):
one to one, one to many, many to one, many to many (input length equal to the output length, or different).
Word-level language model with vocabulary {I, am, in, vjti}, one-hot encoded:
X0 = "I"  -> [1, 0, 0, 0]
X1 = "am" -> [0, 1, 0, 0]
At each step the network predicts the next word: P(? | I), then P(in | I, am), and so on.
Total sequence loss: L(y', y) = Σt L(y't, yt), with per-step loss L = -y*log(y') - (1-y)*log(1-y').
Unrolling the RNN (the same cell A repeated over time) leads to the vanishing gradient problem, and sometimes exploding gradients (NaN values).
Solution: LSTM / GRU cells.
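A hedged sketch of a single vanilla RNN step h_t = tanh(Wx·x_t + Wh·h_{t-1} + b); the repeated multiplication by Wh across many steps is what makes gradients vanish or explode, motivating LSTM/GRU. Sizes, weights, and the word indices are illustrative:

```python
# Vanilla RNN step applied along a one-hot encoded word sequence.
import numpy as np

vocab_size, hidden_size = 4, 3
rng = np.random.default_rng(1)
Wx = rng.normal(0, 0.1, (hidden_size, vocab_size))
Wh = rng.normal(0, 0.1, (hidden_size, hidden_size))
b = np.zeros((hidden_size, 1))

h = np.zeros((hidden_size, 1))
sentence = [0, 1, 2, 3]                            # indices of "I am in vjti"
for idx in sentence:
    x = np.zeros((vocab_size, 1)); x[idx] = 1.0    # one-hot input word
    h = np.tanh(Wx @ x + Wh @ h + b)               # hidden state carries the context
print(h.ravel())
```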
Error rate: fraction of samples with y ≠ y'
Accuracy: 1 - error rate (e.g. out of 100.0% of samples, 96.0% correct -> 4.0% error rate)
F-score = 2*P*R / (P + R)
Other measures: ROC analysis, kappa coefficient, R²
Confusion matrix (sensitivity = recall; specificity = TN / (TN + FP)):

                  Actual Y = 1      Actual Y = 0
Predicted Y' = 1  True +ve (TP) 7   False +ve (FP) 0
Predicted Y' = 0  False -ve (FN) 1  True -ve (TN) 10

Recall: of the actual positive class, what fraction did we correctly classify = TP / (TP + FN), range 0 to 1
Precision: of the predicted positives, what fraction are actually positive = TP / (TP + FP), range 0 to 1
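A short sketch computing these metrics from the confusion matrix above (TP = 7, FP = 0, FN = 1, TN = 10):

```python
# Classification metrics from the example confusion matrix.
TP, FP, FN, TN = 7, 0, 1, 10

accuracy    = (TP + TN) / (TP + FP + FN + TN)
error_rate  = 1 - accuracy
recall      = TP / (TP + FN)          # sensitivity
specificity = TN / (TN + FP)
precision   = TP / (TP + FP)
f_score     = 2 * precision * recall / (precision + recall)

print(accuracy, error_rate, recall, specificity, precision, round(f_score, 3))
# ≈ 0.944, 0.056, 0.875, 1.0, 1.0, 0.933
```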