
ML 24

Uploaded by

kajall.xxi

3143 2

Section A

(Compulsory)

1. (a) Consider a scenario where 6000 patients are tested

for Covid. Of these, 5000 are actually Covid

negative and 1000 are actually Covid positive.

For the Covid-positive patients, the test gave a

positive indication for only 700, and for the

Covid-negative patients, the test gave a positive

indication for 200. Construct a confusion matrix

for the above scenario and find the values of the

True Positive Rate (TPR), False Positive Rate

(FPR), Specificity and Sensitivity metrics. (5)
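
A minimal sketch of the arithmetic this question asks for, assuming "positive" means Covid positive. The variable names are illustrative; the counts come straight from the question text.

```python
# Counts stated in the question.
actual_positive = 1000
actual_negative = 5000

tp = 700                      # Covid-positive patients the test flagged as positive
fn = actual_positive - tp     # Covid-positive patients the test missed
fp = 200                      # Covid-negative patients the test flagged as positive
tn = actual_negative - fp     # Covid-negative patients correctly cleared

tpr = tp / (tp + fn)          # True Positive Rate
sensitivity = tpr             # Sensitivity is another name for TPR
fpr = fp / (fp + tn)          # False Positive Rate
specificity = tn / (tn + fp)  # Specificity equals 1 - FPR

print("               Predicted +  Predicted -")
print(f"Actual +       {tp:<12} {fn}")
print(f"Actual -       {fp:<12} {tn}")
print(f"TPR={tpr:.2f} FPR={fpr:.2f} Specificity={specificity:.2f}")
```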

(b) Answer the following: (5)

(i) What is the impact of a small dataset

relative to a large number of features?

(ii) For the given values theta_0=0.2,

theta_1=0.1, and theta_2=0.1; predict

values of dependent variable y for all

कालिन्दी महाविद्यालय पुस्तकालय


KALINDI COLLEGE LIBRARY

instances of the independent variables x1

and x2 given in the following data table,

using linear regression. Also compute the mean

squared error.

x1 x2 y
2 3
2 4 5
3 8 9
2 1 1.5
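
A hedged sketch of the prediction and error calculation. It assumes the model y = theta_0 + theta_1*x1 + theta_2*x2, defines the mean squared error with a 1/m factor (some courses use 1/(2m)), and uses only the three rows that show all three values, since the first row of the table is incomplete in this copy.

```python
# Parameters given in the question.
theta0, theta1, theta2 = 0.2, 0.1, 0.1

# (x1, x2, y) for the rows with all three values present (assumption:
# the incomplete first row is left out rather than guessed).
rows = [(2, 4, 5), (3, 8, 9), (2, 1, 1.5)]

def predict(x1, x2):
    """Linear regression hypothesis for two features."""
    return theta0 + theta1 * x1 + theta2 * x2

predictions = [predict(x1, x2) for x1, x2, _ in rows]

# Mean squared error with a 1/m factor.
mse = sum((y - p) ** 2 for (_, _, y), p in zip(rows, predictions)) / len(rows)

for (x1, x2, y), p in zip(rows, predictions):
    print(f"x1={x1} x2={x2} y={y} predicted={p:.2f}")
print(f"MSE={mse:.4f}")
```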

(c) Cluster the following set of data objects in two

clusters by applying one iteration of k-means

algorithm. Treat objects 2 and 5 as initial cluster

centres. Use Euclidean distance as the distance

metric. Determine updated cluster centre

coordinates. (5)

Object Number   X-coordinate   Y-coordinate
1               2              4
2               4              6
3               6              8
4               10             4
5               12             4
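
One k-means iteration on these five objects can be sketched as follows. This is an illustrative sketch, not a library implementation: it assigns each object to the nearer of the two initial centres (objects 2 and 5) and then averages each cluster.

```python
import math

# Object number -> (x, y), from the table in the question.
points = {1: (2, 4), 2: (4, 6), 3: (6, 8), 4: (10, 4), 5: (12, 4)}

# Initial centres are objects 2 and 5, as the question specifies.
centres = [points[2], points[5]]

# Assignment step: each object joins the cluster of its nearest centre.
clusters = [[], []]
for obj, p in points.items():
    distances = [math.dist(p, c) for c in centres]
    clusters[distances.index(min(distances))].append(obj)

# Update step: each new centre is the mean of its cluster members.
new_centres = [
    tuple(sum(points[o][axis] for o in members) / len(members) for axis in range(2))
    for members in clusters
]

print(clusters)      # cluster membership by object number
print(new_centres)   # updated centre coordinates
```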


(d) Differentiate between linear regression and

polynomial regression. Derive the gradient descent

algorithm to find the unknown parameters in

multivariate linear regression. (5)

(e) How does the PCA (Principal Component Analysis)

algorithm help in dimensionality reduction in machine

learning? Write the steps of the PCA algorithm.

(5)

(f) What is regularization? Write equations of cost

function for regularized linear and regularized

logistic regression. What will be the effect on

the model when the regularization parameter is set to

zero? (5)

(g) Consider the following dataset with 8 training

instances. Use k-NN algorithm (for k=3) to

determine the 'Result' status for a new test

instance with values CGPA = 7.6, Assessment = 60

and Project Points = 7. (5)


S.No.   CGPA   Assessment   Project Points   Result
1       9.2    85           8                Pass
2       8      80           7                Pass
3       8.5    81           8                Pass
4       6      45           5                Fail
5       6.5    50           4                Fail
6       8.2    72           7                Pass
7       5.8    38           5                Fail
8       8.9    91           9                Pass
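
A small k-NN sketch for this test instance. It assumes plain (unscaled) Euclidean distance over the three features and a majority vote among the k=3 nearest rows; row 2 is read as CGPA 8, Assessment 80, Project Points 7.

```python
import math
from collections import Counter

# (CGPA, Assessment, Project Points) -> Result, from the table.
training = [
    ((9.2, 85, 8), "Pass"),
    ((8.0, 80, 7), "Pass"),   # assumption: the stray "80" belongs to this row
    ((8.5, 81, 8), "Pass"),
    ((6.0, 45, 5), "Fail"),
    ((6.5, 50, 4), "Fail"),
    ((8.2, 72, 7), "Pass"),
    ((5.8, 38, 5), "Fail"),
    ((8.9, 91, 9), "Pass"),
]

query = (7.6, 60, 7)
k = 3

# Sort the training rows by distance to the query and keep the k nearest.
neighbours = sorted(training, key=lambda row: math.dist(row[0], query))[:k]

# Majority vote over the neighbours' labels.
votes = Counter(label for _, label in neighbours)
prediction = votes.most_common(1)[0][0]

for features, label in neighbours:
    print(features, label, round(math.dist(features, query), 2))
print("Predicted result:", prediction)
```

Because Assessment has a much larger range than the other two features, it dominates the unscaled distance; normalising the features first is a common alternative.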

Section - B

2. (a) Consider two features in a dataset and their

possible values as shown below: (4)

Income: values (medium, low, high, very high)

Status: values (SO, AO, Clerk)

Answer the following questions :

(i) Using Cartesian product on above feature

set, construct a new feature and generate

its possible values list.


(ii) State one advantage and one disadvantage

of above approach for feature construction.
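
The Cartesian-product construction in part (i) can be sketched with the standard library. The combined-value naming ("medium_SO" and so on) is an illustrative choice, not something the question prescribes.

```python
from itertools import product

income = ["medium", "low", "high", "very high"]
status = ["SO", "AO", "Clerk"]

# Every (income, status) pair becomes one value of the new feature.
income_status = [f"{i}_{s}" for i, s in product(income, status)]

print(len(income_status))
print(income_status)
```

The new feature has 4 x 3 = 12 possible values, which hints at the trade-off asked about in part (ii): interactions become explicit, but the number of values grows multiplicatively.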

(b) For the given set of points, identify clusters using

complete linkage in agglomerative clustering. Use

Euclidean distance to calculate the distance

between two points. (6)

Points   X coordinate   Y coordinate
P1       1              1
P2       1.5            1.5
P3       5              5
P4       3              4
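
A compact sketch of complete-linkage agglomerative clustering on these four points. It is a brute-force illustration: at each step it merges the pair of clusters whose farthest members are closest, and records the merge distance.

```python
import math
from itertools import combinations

points = {"P1": (1, 1), "P2": (1.5, 1.5), "P3": (5, 5), "P4": (3, 4)}

def complete_link(a, b):
    """Complete linkage: the largest pairwise distance between two clusters."""
    return max(math.dist(points[p], points[q]) for p in a for q in b)

# Start with every point in its own cluster (clusters are tuples of names).
clusters = [(name,) for name in points]
merges = []

while len(clusters) > 1:
    # Pick the pair of clusters with the smallest complete-linkage distance.
    a, b = min(combinations(clusters, 2), key=lambda pair: complete_link(*pair))
    merges.append((a, b, round(complete_link(a, b), 3)))
    clusters = [c for c in clusters if c not in (a, b)] + [a + b]

for a, b, d in merges:
    print(f"merge {a} + {b} at distance {d}")
```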

3. (a) Consider the following two dimensional space with

some data points such that circle points represent

positive class points and triangular points represent

negative class points separated by a decision

boundary as shown. (5)


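[Figure: two-dimensional plot, x-axis from 0 to 6 and y-axis from 0 to 4.5, showing circle (positive class) and triangle (negative class) points separated by a decision boundary. Legible point labels: (1.5, 3), (2, 3), (3.5, 1.5).]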

Answer the following questions :

(i) Identify the support vectors (with respect to

the SVM classifier applied to the above data).

(ii) Draw the marginal planes (with respect to

the SVM classifier applied to the above data).

(iii) Define marginal distance in the SVM

algorithm.

(b) Construct neural network for a two input NOR


gate using truth table. Show diagram for your
generated neural network model with weights.
(5)
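
One possible single-neuron realisation of a two-input NOR gate is sketched below. The weights (-1, -1), the bias 0.5 and the step activation are illustrative assumptions; other weight choices also reproduce the truth table.

```python
def step(z):
    """Threshold activation: fire (1) when the weighted sum is non-negative."""
    return 1 if z >= 0 else 0

# Hypothetical weights and bias chosen so that only input (0, 0) fires.
w1, w2, bias = -1.0, -1.0, 0.5

def nor(x1, x2):
    return step(w1 * x1 + w2 * x2 + bias)

# Evaluate the neuron on every row of the NOR truth table.
truth_table = {(a, b): nor(a, b) for a in (0, 1) for b in (0, 1)}

for inputs, output in truth_table.items():
    print(inputs, "->", output)
```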

4. (a) Apply the Naive Bayesian Classifier to predict

whether a car is stolen or not, with features

{Color: RED, Origin: Domestic, Type: SUV}, based

on the given dataset. (5)

Color    Type     Origin     Stolen
RED      SPORTS   DOMESTIC   YES
RED      SPORTS   DOMESTIC   NO
RED      SPORTS   DOMESTIC   YES
YELLOW   SPORTS   DOMESTIC   NO
YELLOW   SPORTS   IMPORTED   YES
YELLOW   SUV      IMPORTED   NO
YELLOW   SUV      IMPORTED   YES
YELLOW   SUV      DOMESTIC   NO
RED      SUV      IMPORTED   NO
RED      SPORTS   IMPORTED   YES
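
A minimal Naive Bayes sketch for this query, using raw frequency estimates without smoothing (an assumption; Laplace smoothing would change the numbers slightly). Each class score is the prior multiplied by the per-feature conditional probabilities.

```python
# (Color, Type, Origin, Stolen) rows from the table.
data = [
    ("RED", "SPORTS", "DOMESTIC", "YES"),
    ("RED", "SPORTS", "DOMESTIC", "NO"),
    ("RED", "SPORTS", "DOMESTIC", "YES"),
    ("YELLOW", "SPORTS", "DOMESTIC", "NO"),
    ("YELLOW", "SPORTS", "IMPORTED", "YES"),
    ("YELLOW", "SUV", "IMPORTED", "NO"),
    ("YELLOW", "SUV", "IMPORTED", "YES"),
    ("YELLOW", "SUV", "DOMESTIC", "NO"),
    ("RED", "SUV", "IMPORTED", "NO"),
    ("RED", "SPORTS", "IMPORTED", "YES"),
]

# Query in the same column order: Color, Type, Origin.
query = ("RED", "SUV", "DOMESTIC")

def score(label):
    """Unnormalised posterior: P(label) * product of P(feature value | label)."""
    rows = [r for r in data if r[3] == label]
    result = len(rows) / len(data)
    for i, value in enumerate(query):
        result *= sum(1 for r in rows if r[i] == value) / len(rows)
    return result

scores = {label: score(label) for label in ("YES", "NO")}
prediction = max(scores, key=scores.get)

print(scores)
print("Stolen?", prediction)
```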

(b) Differentiate between hold out method, leave one

out method and k-fold method for cross-validation.

Which of the above methods has low bias and

high variance? Justify. (5)


5. (a) Using the data given below, build a logistic

regression model to predict whether a student will

pass or fail based on exam score, using the gradient

descent algorithm. Assume initial values of 0 for the

model parameters (thetas) and a learning rate of 0.3.

Use one iteration of the gradient descent algorithm

to update the model parameters. (6)

Exam Score (x)   Pass/Fail (y)
50               0
55               0
60               0
65               1
70               1
75               1
80               1
85               1
90               1
95               1
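
A sketch of one batch gradient-descent step for this logistic regression. It assumes the labels for scores 55 and 60 are 0, the hypothesis sigmoid(theta0 + theta1*x), the usual cross-entropy gradient averaged over the m examples, and a simultaneous update of both parameters.

```python
import math

xs = [50, 55, 60, 65, 70, 75, 80, 85, 90, 95]
ys = [0, 0, 0, 1, 1, 1, 1, 1, 1, 1]   # assumption: scores 55 and 60 are labelled 0

theta0, theta1 = 0.0, 0.0   # initial parameters from the question
alpha = 0.3                 # learning rate from the question
m = len(xs)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Prediction error h(x) - y for every example, using the current parameters.
errors = [sigmoid(theta0 + theta1 * x) - y for x, y in zip(xs, ys)]

# Gradients of the average cross-entropy cost.
grad0 = sum(errors) / m
grad1 = sum(e * x for e, x in zip(errors, xs)) / m

# Simultaneous update of both parameters.
theta0 -= alpha * grad0
theta1 -= alpha * grad1

print(f"theta0={theta0:.3f} theta1={theta1:.3f}")
```

The large jump in theta1 comes from the unscaled exam scores; feature scaling is a common way to keep the step sizes comparable.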

(b) Using least squares method, learn the regression

coefficients for the data given below. Also predict

the value of y for x=12 using your learned

coefficients. (4)

X Y

2 21
4 27
6 29
8 64
10 86
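
The least-squares fit for this table can be sketched with the closed-form formulas for simple linear regression, assuming the model y = intercept + slope * x.

```python
xs = [2, 4, 6, 8, 10]
ys = [21, 27, 29, 64, 86]

n = len(xs)
mean_x = sum(xs) / n
mean_y = sum(ys) / n

# slope = sum((x - mean_x)(y - mean_y)) / sum((x - mean_x)^2)
slope = (
    sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    / sum((x - mean_x) ** 2 for x in xs)
)
intercept = mean_y - slope * mean_x

# Prediction requested in the question.
prediction = intercept + slope * 12

print(f"y = {intercept:.2f} + {slope:.2f} x")
print(f"y(12) = {prediction:.2f}")
```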

6. (a)

[Figure: feed-forward neural network with input nodes X1 and X2, hidden nodes h1 and h2, and output nodes y1 and y2. Weight labels: w1=0.1, w2=0.2, w3=0.3, w4=0.4.]

For given input values of x1 and x2 as 0.3 and 0.5

respectively, determine the values of output nodes

y1 and y2. Use bias b1=0.5 and b2=0.5. Use

sigmoid as the activation function for hidden as

well as output layer. (7)


(b) Explain the effect of the following factors in achieving

model convergence with respect to the gradient

descent algorithm. (3)

Learning rate is too small.

Learning rate is too large.
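
A toy illustration of both cases, assuming the simple cost f(x) = x^2 (gradient 2x) starting from x = 1. The function name `run_gd` and the three learning rates are hypothetical choices for demonstration only.

```python
def run_gd(learning_rate, steps=50, x=1.0):
    """Run plain gradient descent on f(x) = x**2 and return the final x."""
    for _ in range(steps):
        x -= learning_rate * 2 * x   # gradient of x**2 is 2x
    return x

slow = run_gd(0.001)      # too small: barely moves towards the minimum at 0
good = run_gd(0.1)        # moderate: converges close to 0
diverged = run_gd(1.1)    # too large: overshoots and grows in magnitude

print(f"lr=0.001 -> x={slow:.4f}")
print(f"lr=0.1   -> x={good:.6f}")
print(f"lr=1.1   -> x={diverged:.1f}")
```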

7. (a) Consider following training data for 5 persons. For

binary classification of a person as sick or not

sick create a decision tree model. Show all the

steps. (8)

Person No   A1    A2    A3    Class
1           Yes   Yes   Yes   Not Sick
2           Yes   No    Yes   Sick
3           No    No    Yes   Sick
4           No    Yes   Yes   Not Sick
5           No    Yes   No    Sick
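
A sketch of the root-node selection step, assuming an ID3-style tree that splits on the attribute with the highest information gain (the question does not name a splitting criterion, so Gini impurity would be an equally valid choice).

```python
import math
from collections import Counter

rows = [
    {"A1": "Yes", "A2": "Yes", "A3": "Yes", "Class": "Not Sick"},
    {"A1": "Yes", "A2": "No",  "A3": "Yes", "Class": "Sick"},
    {"A1": "No",  "A2": "No",  "A3": "Yes", "Class": "Sick"},
    {"A1": "No",  "A2": "Yes", "A3": "Yes", "Class": "Not Sick"},
    {"A1": "No",  "A2": "Yes", "A3": "No",  "Class": "Sick"},
]

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * math.log2(c / total) for c in Counter(labels).values())

def information_gain(attribute):
    """Entropy of the whole set minus the weighted entropy after splitting."""
    base = entropy([r["Class"] for r in rows])
    remainder = 0.0
    for value in {r[attribute] for r in rows}:
        subset = [r["Class"] for r in rows if r[attribute] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder

gains = {a: information_gain(a) for a in ("A1", "A2", "A3")}
root = max(gains, key=gains.get)

print({a: round(g, 3) for a, g in gains.items()})
print("Root attribute:", root)
```

The same gain computation would then be repeated on each impure branch of the chosen root to grow the rest of the tree.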

(b) Consider the expected and predicted outcomes of

a machine learning classifier on a data set

containing 7 observations. Calculate the


performance of the classifier using Jaccard Index


metric. (2)

Y expected    0  0  0  0  1  1  1
Y predicted   1  0  0  1  0  1  0
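
A short sketch of the Jaccard Index calculation, taking class 1 as the positive class. The ordering of the seven values in each row is an assumption, since the table layout in this copy is ambiguous; the index is TP / (TP + FP + FN).

```python
# Assumption: each row is read with its label-adjacent value first.
y_expected = [0, 0, 0, 0, 1, 1, 1]
y_predicted = [1, 0, 0, 1, 0, 1, 0]

pairs = list(zip(y_expected, y_predicted))
tp = sum(1 for e, p in pairs if e == 1 and p == 1)   # true positives
fp = sum(1 for e, p in pairs if e == 0 and p == 1)   # false positives
fn = sum(1 for e, p in pairs if e == 1 and p == 0)   # false negatives

# Jaccard Index: overlap of predicted and expected positives over their union.
jaccard = tp / (tp + fp + fn)

print(f"TP={tp} FP={fp} FN={fn} Jaccard={jaccard:.2f}")
```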
