Revision

The document discusses various data visualization techniques suitable for different types of variables, including histograms, bar charts, and scatter plots. It also covers clustering methods, the Self-Organizing Map, evaluation metrics for clustering, and the Apriori algorithm for mining frequent itemsets. Additionally, it addresses model evaluation techniques, overfitting, and the significance of predictors in regression analysis.

Uploaded by

lawjavier3


(A) A single continuous variable (e.g. height of a student): A histogram or
a box plot would be appropriate here. These plots can provide information
about the central tendency, dispersion, and skewness of the data.

(B) A single categorical variable (e.g. the days of a week): A bar chart or a
pie chart would work well in this case. These charts can show the
frequency or proportion of each category.

(C) A single continuous variable (e.g. personal income) and a single
categorical variable (e.g. gender): A box plot or a violin plot can be used
here. These plots can show the distribution of the continuous variable
across different categories.

(D) Two continuous variables (e.g. height and weight of a student): A
scatter plot would be the best choice for this type of data. It can show the
relationship or correlation between the two variables.

(E) Two categorical variables (e.g. highest qualification and gender): A
stacked bar chart or a mosaic plot can be used in this case. These plots
can show the relationship between two categorical variables.

(A) Partitional Clustering vs Hierarchical Clustering:

Connections: Both are methods used to group similar objects into clusters.
They aim to maximize the similarity within a cluster and minimize the
similarity between different clusters.

Differences:

Initialization: Partitional clustering (like k-means) starts from an initial
partition of the data points, often chosen at random, and iteratively
refines it. Agglomerative hierarchical clustering starts by treating each
data point as its own cluster and successively merges the closest clusters;
divisive hierarchical clustering starts from one cluster containing all the
points and successively splits it.

Number of Clusters: In partitional clustering, the number of clusters needs
to be specified in advance. In hierarchical clustering, it doesn’t need to be
specified at the start, and an entire hierarchy of clusters is created.

Flexibility: Hierarchical clustering provides more flexibility than partitional
clustering: the hierarchy can be visualized as a dendrogram, and cutting the
dendrogram at different heights yields different numbers of clusters without
rerunning the algorithm.
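The iterative refinement used by partitional clustering can be sketched as follows. This is a minimal illustration of k-means (Lloyd's algorithm), not a production implementation. The function name `kmeans`, the sample points, and the choice to pass the initial centroids explicitly (rather than drawing them at random) are assumptions made to keep the example deterministic.

```python
import math

def kmeans(points, centroids, max_iter=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in centroids]
    labels = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for j, old in enumerate(centroids):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(tuple(sum(x) / len(members) for x in zip(*members)))
            else:
                new_centroids.append(old)  # an empty cluster keeps its centroid
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return labels, centroids

# Hypothetical data: two well-separated groups of three points each.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels, centroids = kmeans(points, centroids=[points[0], points[3]])
print(labels)
```

Note that the number of clusters is fixed by the number of initial centroids, which is the "specified in advance" requirement described above.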

(B) Self-Organizing Map (SOM) and its “Topology Preserving Properties”:

A Self-Organizing Map (SOM) is a type of artificial neural network that is
trained using unsupervised learning to produce a low-dimensional
representation of the input space of the training samples, called a map.

The “topology preserving” property means that the SOM maintains the
topological properties of the input space. This means that if certain
patterns are close together in the input space, they are also close
together in the map, preserving the similarity relationships of the original
data.

(C) Parameters in Training SOM:

Grid size: This defines the dimensions of the grid of nodes on which the map
is created. It affects the granularity of the map.

Learning Rate: This controls how much the weights are adjusted at each
step. A high learning rate can speed up training but may make the map
unstable and less accurate, so the rate is usually decreased as training
proceeds.

Neighborhood Radius: This determines the extent to which the nodes
neighboring the best-matching node are updated for each training sample. A
larger radius means more nodes are considered neighbors; the radius is
typically shrunk over time.
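To make the roles of the learning rate and neighborhood radius concrete, here is a minimal sketch of a single SOM training step on a one-dimensional grid. The function name `som_step`, the Gaussian neighborhood weighting, and the example weights are assumptions for illustration; real SOM implementations usually use a two-dimensional grid and decay both parameters over time.

```python
import math

def som_step(weights, sample, learning_rate, radius):
    """Update a 1-D SOM for one training sample and return (bmu_index, new_weights)."""
    # Best-matching unit: the node whose weight vector is closest to the sample.
    bmu = min(range(len(weights)), key=lambda i: math.dist(weights[i], sample))
    updated = []
    for i, w in enumerate(weights):
        grid_dist = abs(i - bmu)  # distance on the grid, not in input space
        if grid_dist > radius:
            updated.append(list(w))  # outside the neighborhood: unchanged
            continue
        # Gaussian influence: 1 at the BMU, smaller for more distant neighbors.
        influence = math.exp(-(grid_dist ** 2) / (2 * radius ** 2)) if radius > 0 else 1.0
        updated.append([wi + learning_rate * influence * (xi - wi)
                        for wi, xi in zip(w, sample)])
    return bmu, updated

# Hypothetical map with four nodes and two input features.
weights = [[0.0, 0.0], [0.5, 0.5], [1.0, 1.0], [1.5, 1.5]]
sample = [1.0, 0.9]
bmu, new_weights = som_step(weights, sample, learning_rate=0.5, radius=1)
print(bmu, new_weights)
```

Because grid neighbors of the best-matching node are pulled toward the same sample, nearby nodes end up with similar weights, which is how the map comes to preserve topology.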

(D) Determining Better Clustering Result:

To determine whose clustering result is better, we can use internal
evaluation metrics such as the Silhouette Score, Davies-Bouldin Index, or
Dunn Index. These metrics consider both the compactness of the clusters
(how close the points within a cluster are) and the separation between
different clusters. Higher Silhouette and Dunn values indicate a better
clustering, whereas a lower Davies-Bouldin value is better. The choice of
metric can depend on the specific requirements of the task. For example, if
we care more about the separation of clusters, we might choose a metric
that emphasizes that aspect.
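As an illustration of how such a metric compares two results, the sketch below computes the mean silhouette score for two labelings of the same points. The function name `silhouette_score`, the data, and both labelings are hypothetical; the code assumes Euclidean distance, at least two clusters, and scores singleton clusters as 0 by convention.

```python
import math

def silhouette_score(points, labels):
    """Mean silhouette over all points: (b - a) / max(a, b) per point."""
    clusters = {}
    for p, lab in zip(points, labels):
        clusters.setdefault(lab, []).append(p)
    scores = []
    for p, lab in zip(points, labels):
        own = clusters[lab]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster (compactness).
        a = sum(math.dist(p, q) for q in own) / (len(own) - 1)
        # b: smallest mean distance to the members of any other cluster (separation).
        b = min(
            sum(math.dist(p, q) for q in members) / len(members)
            for other, members in clusters.items() if other != lab
        )
        scores.append((b - a) / max(a, b))
    return sum(scores) / len(scores)

points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
good = silhouette_score(points, [0, 0, 0, 1, 1, 1])  # matches the two groups
bad = silhouette_score(points, [0, 1, 0, 1, 0, 1])   # mixes the two groups
print(good, bad)
```

The labeling with the higher score would be preferred under this metric.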
(A) Apriori Algorithm: The Apriori algorithm is a popular algorithm for
mining frequent itemsets for boolean association rules. The main steps
are:

Set Generation: Scan the database and keep every individual item whose
support is at least the minimum support. These sets are the frequent
1-itemsets (L1).

Candidate Generation: Generate the candidate k-itemsets (Ck) from the
frequent (k-1)-itemsets found in the previous step (Lk-1). This is done by
joining Lk-1 with itself.

Pruning: Any (k-1)-itemset that is not frequent cannot be a subset of a
frequent k-itemset, so remove every candidate in Ck that has an infrequent
(k-1)-subset. Then scan the database to count the support of the remaining
candidates; those that meet the minimum support form Lk.

Iteration: Repeat the candidate generation and pruning steps until no
more frequent k-itemsets can be found.
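The steps above can be sketched in a few lines. This is a simplified, unoptimized illustration: the function name `apriori`, the basket contents, and the minimum support of 0.5 are all assumptions, and the join step here simply unions pairs of frequent itemsets rather than using the prefix-based join of the original algorithm.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: support} for every frequent itemset."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        return sum(itemset <= t for t in transactions) / n

    # L1: frequent 1-itemsets.
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items if support(frozenset([i])) >= min_support}
    frequent = {s: support(s) for s in current}
    k = 2
    while current:
        # Join: union pairs of frequent (k-1)-itemsets that form a k-itemset.
        candidates = {a | b for a in current for b in current if len(a | b) == k}
        # Prune: drop candidates that have any infrequent (k-1)-subset.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in current for s in combinations(c, k - 1))
        }
        # Count: scan the transactions and keep candidates meeting min_support.
        current = {c for c in candidates if support(c) >= min_support}
        frequent.update({c: support(c) for c in current})
        k += 1
    return frequent

# Hypothetical baskets.
baskets = [{"bread", "milk"}, {"bread", "eggs"}, {"bread", "milk", "eggs"}, {"milk"}]
frequent = apriori(baskets, min_support=0.5)
for itemset, sup in sorted(frequent.items(), key=lambda kv: (len(kv[0]), sorted(kv[0]))):
    print(sorted(itemset), sup)
```

In this example the candidate {bread, milk, eggs} is pruned without counting, because its subset {milk, eggs} is not frequent.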

(B) Creating Transactions: Let’s assume we have 10 transactions in total.
For the rule {A, D} => {F, H} to have a support of 0.3, 30% of the
transactions must contain {A, D, F, H}, which is 3 transactions. For the
rule to have a confidence of 0.6, 60% of the transactions that contain
{A, D} must also contain {F, H}. So, we need 3 / 0.6 = 5 transactions to
contain {A, D}. Here is one possible set of transactions:

{A, D, F, H}

{A, B, D, F, H}

{A, D, F, H}

{A, B, D}

{A, D}

{B, F, H}

{B, F}

{B, H}

{F, H}

{B}
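A quick way to check a candidate answer is to compute the support and confidence directly. The sketch below does this for a hypothetical set of ten baskets constructed for illustration, in which three baskets contain {A, D, F, H} and five contain {A, D}; the helper names `support` and `confidence` are also assumptions.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def confidence(antecedent, consequent, transactions):
    """supp(X and Y together) divided by supp(X)."""
    both = set(antecedent) | set(consequent)
    return support(both, transactions) / support(antecedent, transactions)

# Hypothetical baskets: 3 contain {A, D, F, H}, 2 more contain {A, D} only.
transactions = [
    {"A", "D", "F", "H"}, {"A", "B", "D", "F", "H"}, {"A", "D", "F", "H"},
    {"A", "B", "D"}, {"A", "D"},
    {"B", "F", "H"}, {"B", "F"}, {"B", "H"}, {"F", "H"}, {"B"},
]
rule_support = support({"A", "D", "F", "H"}, transactions)
rule_confidence = confidence({"A", "D"}, {"F", "H"}, transactions)
print(rule_support, rule_confidence)
```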

(C) Issue with “Confidence” Measure: The confidence measure has a
drawback: it only considers the popularity of the itemset X and does not
take into account the popularity of Y. This could lead to misleading results
if Y is very common in the dataset. For example, if bread is bought
frequently, the rule {eggs} => {bread} might have a high confidence not
because eggs and bread are strongly associated but simply because bread
is bought frequently.

To address this issue, other measures like lift or leverage are used. The lift
of a rule is the ratio of the observed support of X and Y together to the
support expected if X and Y were independent:
lift = supp(X ∪ Y) / (supp(X) * supp(Y)). A lift value greater than 1
indicates that X and Y appear together more often than expected under
independence, which gives us more information than the confidence measure
alone. Similarly, leverage computes the difference between the observed
frequency of X and Y appearing together and the frequency that would be
expected if X and Y were independent:
leverage = supp(X ∪ Y) - supp(X) * supp(Y). A leverage value of 0 indicates
independence.
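The eggs and bread situation can be reproduced numerically. In the hypothetical baskets below, bread appears in 9 of 10 baskets, so the rule {eggs} => {bread} has a high confidence even though its lift is below 1. The helper names and the data are assumptions for illustration.

```python
def support(itemset, transactions):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= set(t) for t in transactions) / len(transactions)

def rule_measures(x, y, transactions):
    """Return (confidence, lift, leverage) for the rule x => y."""
    s_x = support(x, transactions)
    s_y = support(y, transactions)
    s_xy = support(set(x) | set(y), transactions)
    return s_xy / s_x, s_xy / (s_x * s_y), s_xy - s_x * s_y

# Hypothetical baskets: bread in 9 of 10, eggs in 5 of 10, both in 4 of 10.
baskets = [{"eggs", "bread"}] * 4 + [{"eggs"}] + [{"bread"}] * 5
conf, lift_value, leverage_value = rule_measures({"eggs"}, {"bread"}, baskets)
print(conf, lift_value, leverage_value)
```

Here the confidence is 0.8, yet buying eggs makes bread slightly less likely than its baseline of 0.9, which the lift and the negative leverage both reveal.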
(A) Procedure for k-NN Classifier:

Preprocessing: Normalize the features of the M training samples so that no
single feature dominates the distance, and apply the same scaling to the N
hidden samples.

Choosing k: Use cross-validation on the training set to find an optimal ‘k’
value that minimizes error.

Distance Metric: Select an appropriate distance metric (e.g., Euclidean,
Manhattan) for measuring distances between samples.

Training: Use the M labeled samples to train the k-NN model by storing
them in memory.

Testing: Test the classifier once on the N hidden samples, comparing
predicted labels with actual labels. To keep this estimate unbiased, ‘k’
and other parameters are tuned using the training data only, not adjusted
against the hidden samples.
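The prediction step itself is short. The following is a minimal sketch of a k-NN classifier with Euclidean distance and majority voting; the function name `knn_predict` and the toy data are hypothetical, and the features are assumed to be already on a comparable scale.

```python
import math
from collections import Counter

def knn_predict(train_X, train_y, query, k):
    """Label the query by majority vote among its k nearest training samples."""
    neighbours = sorted(
        zip(train_X, train_y), key=lambda pair: math.dist(pair[0], query)
    )[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical training data: "training" is just storing these samples.
train_X = [(0, 0), (0, 1), (1, 0), (5, 5), (5, 6), (6, 5)]
train_y = ["a", "a", "a", "b", "b", "b"]

pred_near_a = knn_predict(train_X, train_y, query=(0.5, 0.5), k=3)
pred_near_b = knn_predict(train_X, train_y, query=(5.5, 5.5), k=3)
print(pred_near_a, pred_near_b)
```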

(B) Total Number of Weights in MLP: The total number of weights in this
MLP can be calculated as follows:

Number of weights from input layer to hidden layer = 45 features * 20
neurons = 900

Number of weights from hidden layer to output layer = 20 neurons * 3
targets = 60

Total number of weights = 900 + 60 = 960

This count excludes bias terms. If each hidden and output neuron also has a
bias, add 20 + 3 = 23 parameters, for 983 in total.
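The same count generalizes to any fully connected network. The helper below is a small sketch (the name `count_weights` and the `include_bias` flag are assumptions) that multiplies the sizes of consecutive layers and optionally adds one bias per non-input neuron.

```python
def count_weights(layer_sizes, include_bias=False):
    """Count connection weights in a fully connected MLP, optionally with biases."""
    total = 0
    for n_in, n_out in zip(layer_sizes, layer_sizes[1:]):
        total += n_in * n_out      # one weight per connection between the layers
        if include_bias:
            total += n_out         # one bias per neuron in the receiving layer
    return total

print(count_weights([45, 20, 3]))                     # weights only
print(count_weights([45, 20, 3], include_bias=True))  # weights plus biases
```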

(C) Computing Output for MLP: Given the threshold function, the output for
each sample can be computed as follows (assuming the threshold µ is 0.5
for both neurons):

Sample1:

Hidden neuron input: (1.5 * (+1)) + (1.2 * (+1)) = 2.7; Output: f(2.7) = 1
(since 2.7 > µ = 0.5)

Output neuron input: (2.7 * (-2)) + (0 * (+1)) = -5.4; Output: f(-5.4) = 0
(since -5.4 < µ = 0.5)

Repeat similar calculations for Sample2, Sample3, and Sample4.

(D) Linearity of MLP with Linear Activation Function: If all neurons in an
MLP network use a linear activation function f(x) = x, then the relationship
between the input and output of this network will remain linear. This is
because each layer then computes a linear (affine, if biases are present)
function of its input, and the composition of linear functions is also a
linear function. Therefore, no matter how many hidden layers the MLP has, if
all the neurons use a linear activation function, the overall network
represents a single linear transformation from the input to the output and
cannot model nonlinear relationships.
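This collapse can be checked numerically. In the sketch below, a two-layer network with identity activations and no biases gives the same output as a single layer whose weight matrix is the product of the two. The weight values and helper names are hypothetical.

```python
def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def matmul(A, B):
    """Multiply matrix A by matrix B (both lists of rows)."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

# Hypothetical weights: 2 inputs -> 3 hidden neurons -> 2 outputs.
W1 = [[1.0, -2.0], [0.5, 3.0], [2.0, 1.0]]
W2 = [[1.0, 0.0, -1.0], [2.0, 1.0, 0.5]]
x = [4.0, -1.0]

two_layer = matvec(W2, matvec(W1, x))    # f(x) = x at every neuron
collapsed = matvec(matmul(W2, W1), x)    # one equivalent linear layer
print(two_layer, collapsed)
```

Adding more linear layers only multiplies in more matrices, so the result is still one matrix applied to the input.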

(A) Hold-out and 10-fold Cross-Validation (CV):

Hold-out: This method involves splitting the dataset into two sets: a
training set and a testing set. The model is trained on the training set and
evaluated on the testing set. The strength of this method is its simplicity
and speed. However, its weakness is that the evaluation may vary
significantly depending on how the data is split.

10-fold CV: This method involves splitting the dataset into 10 equal parts,
or ‘folds’. The model is then trained 10 times, each time using 9 folds for
training and a different fold for testing. The final model performance is the
average of the performances of the 10 models. The strength of this
method is that it gives a more robust estimate of the model performance
than the hold-out method. However, it is computationally more expensive.
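The fold construction can be sketched without any library. The generator below (the name `k_fold_indices`, the seed, and the sample count of 25 are assumptions) shuffles the sample indices, deals them into k folds, and yields one train/test split per fold.

```python
import random

def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train_indices, test_indices) for each of the k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds of nearly equal size.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test_idx = folds[i]
        train_idx = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train_idx, test_idx

splits = list(k_fold_indices(25, k=10))
for train_idx, test_idx in splits:
    print(len(train_idx), len(test_idx))
```

Every sample is used for testing exactly once, which is why the averaged score depends less on any single split than a hold-out estimate does.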

(B) Overfitting Problem: Overfitting occurs when a model learns the
training data too well. It captures not only the underlying patterns but also
the noise and outliers in the training data. As a result, while the model
performs well on the training data, it performs poorly on unseen data (test
data) because it fails to generalize.

(C) Random Forest Method: Random Forest is an ensemble learning method
that operates by constructing multiple decision trees at training time, each
grown on a bootstrap sample of the training data with a random subset of
predictors considered at each split. It outputs the mean prediction of the
individual trees for regression problems and the majority vote for
classification problems. It improves upon a single decision tree by reducing
overfitting and providing a more robust prediction by averaging the results
of many trees.

(D) Inappropriateness of Linear Regression for Binary/Categorical
Response Variable: Linear regression is not suitable for binary/categorical
response variables because it may predict values outside the range of 0
and 1 for a binary response, which cannot be interpreted as probabilities.
For a categorical response with more than two levels, coding the categories
as numbers also imposes an arbitrary ordering and spacing. Logistic
regression is a more appropriate choice in this case as it models the
probability of the outcome, which always lies between 0 and 1.
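(E) Optimisation Problem for Soft Margin Classifier:

(i) The variable M in the optimization problem for the soft margin classifier
represents the margin of the classifier, that is, the distance from the
separating hyperplane to the margin boundary on either side. The goal of the
optimization problem is to maximize this margin.

(ii) The variable C is a tuning parameter that controls the trade-off
between maximizing the margin (M) and limiting the sum of the slack
variables (ϵ_i). Its effect depends on how the problem is written. If C
multiplies the sum of the ϵ_i as a penalty in the objective, a larger C
places more emphasis on minimizing the sum of the ϵ_i, which can lead to a
smaller margin if the data is not linearly separable. If C instead appears
as a budget in the constraint (sum of the ϵ_i ≤ C), a larger C tolerates
more margin violations and generally gives a wider margin.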
1) root (n=706, deviance=139239800, yval=3266.356)
|--- 2) totwrk>=2256.5 (n=369, deviance=70902640, yval=3151.081)
|    |--- 4) totwrk>=3693.5 (n=20, deviance=4127513, yval=2747.500)
|    |--- 5) totwrk<3693.5 (n=349, deviance=63330890, yval=3174.209)
|         |--- 10) totwrk>=2622.5 (n=174, deviance=35897580, yval=3122.690)
|         |    |--- 20) totwrk<2784.5 (n=47, deviance=11624900, yval=2929.553) *
|         |    |--- 21) totwrk>=2784.5 (n=127, deviance=21870690, yval=3194.165) *
|         |--- 11) totwrk<2622.5 (n=175, deviance=26512260, yval=3225.434) *
|--- 3) totwrk<2256.5 (n=337, deviance=58064950, yval=3392.576)
     |--- 6) educ>=7.5 (n=324, deviance=54736770, yval=3375.145)
     |    |--- 12) totwrk>=1174 (n=209, deviance=32660830, yval=3314.206) *
     |    |--- 13) totwrk<1174 (n=115, deviance=19889250, yval=3485.896) *
     |--- 7) educ<7.5 (n=13, deviance=776314, yval=3827.000) *

(B) Predicted Sleep Per Week: Using the regression tree, we can predict
the sleep per week for each record as follows:

Record 1: Follows path 2 -> 5 -> 11, predicted sleep = 3225.434 minutes
per week.

Record 2: Follows path 2 -> 4, predicted sleep = 2747.500 minutes per
week.

Record 3: Follows path 3 -> 6 -> 12, predicted sleep = 3314.206 minutes
per week.
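Following a path through the tree amounts to a chain of threshold comparisons. The function below is a direct transcription of the printed splits and node values; the function name `predict_sleep` and the input values in the usage lines are hypothetical, since the test records themselves are not reproduced here.

```python
def predict_sleep(totwrk, educ):
    """Return the predicted sleep (minutes per week) from the printed tree."""
    if totwrk >= 2256.5:                 # node 2
        if totwrk >= 3693.5:
            return 2747.500              # node 4
        if totwrk >= 2622.5:             # node 10
            if totwrk < 2784.5:
                return 2929.553          # node 20
            return 3194.165              # node 21
        return 3225.434                  # node 11
    if educ >= 7.5:                      # node 6
        if totwrk >= 1174:
            return 3314.206              # node 12
        return 3485.896                  # node 13
    return 3827.000                      # node 7

# Hypothetical inputs chosen to land on nodes 11, 4, and 12.
print(predict_sleep(totwrk=2400, educ=12))
print(predict_sleep(totwrk=4000, educ=12))
print(predict_sleep(totwrk=1500, educ=12))
```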

(C) Mean Square Error (MSE) and Mean Absolute Error (MAE): To compute
the MSE and MAE, we need the actual sleep values for all records in the
test set, which are not provided in the question. The formulas for MSE and
MAE are as follows:

MSE = (1/n) * Σ(actual - predicted)²

MAE = (1/n) * Σ|actual - predicted|
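Once the actual values are available, both formulas are one-liners. In the sketch below the `actual` values are invented purely to show the mechanics; only the predicted values come from the tree.

```python
def mse(actual, predicted):
    """Mean of the squared errors."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

def mae(actual, predicted):
    """Mean of the absolute errors."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

actual = [3000.0, 2800.0, 3400.0]          # hypothetical actual sleep values
predicted = [3225.434, 2747.500, 3314.206]  # tree predictions for three records
print(mse(actual, predicted), mae(actual, predicted))
```

Because MSE squares each error, it penalizes large misses more heavily than MAE does.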

(D) Support Vectors vs Usual Observations: In Support Vector Machine
(SVM) classification, support vectors are the data points that lie on the
margin or, in the soft margin case, on the wrong side of it. They are the
points closest to the decision boundary and the most difficult to classify,
whereas usual observations are those that lie farther away, on the correct
side of the margin. The support vectors alone determine the maximum-margin
decision boundary. If these points are moved, the position of the decision
boundary would change. However, moving the usual observations (those not on
the margin) would not affect the position of the decision boundary, as long
as they do not cross the margin.
(A) Precision and Recall for Each Classifier:

Linear SVM:

Precision = TP / (TP + FP) = 230 / (230 + 35) = 0.868

Recall = TP / (TP + FN) = 230 / (230 + 32) = 0.878

Nonlinear SVM with γ = 1:

Precision = 253 / (253 + 32) = 0.888

Recall = 253 / (253 + 9) = 0.966

Nonlinear SVM with γ = 500:

Precision = 180 / (180 + 12) = 0.938

Recall = 180 / (180 + 82) = 0.687

Based on these calculations, the nonlinear SVM with γ = 1 has the highest
recall and the nonlinear SVM with γ = 500 has the highest precision. If we
consider both precision and recall, the nonlinear SVM with γ = 1 is the
best overall: its F1 score (the harmonic mean of precision and recall) is
about 0.925, compared with about 0.873 for the linear SVM and about 0.793
for the nonlinear SVM with γ = 500.
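The comparison can be reproduced from the TP, FP, and FN counts used above. The sketch below also computes the F1 score as one common way to combine precision and recall; the function name and dictionary keys are assumptions.

```python
def precision_recall_f1(tp, fp, fn):
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

results = {
    "linear SVM": precision_recall_f1(tp=230, fp=35, fn=32),
    "nonlinear SVM, gamma=1": precision_recall_f1(tp=253, fp=32, fn=9),
    "nonlinear SVM, gamma=500": precision_recall_f1(tp=180, fp=12, fn=82),
}
best = max(results, key=lambda name: results[name][2])  # highest F1
for name, (p, r, f1) in results.items():
    print(f"{name}: precision={p:.3f} recall={r:.3f} f1={f1:.3f}")
print("best by F1:", best)
```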

(B) Information Gain from the First Split: Information gain is calculated as
the entropy of the parent node minus the weighted sum of the entropy of
the child nodes, where each child is weighted by its share of the parent's
observations. In this case, the entropy of the parent node is
-0.5 log2(0.5) - 0.5 log2(0.5) = 1. The entropy of the left child node is
-0.57 log2(0.57) - 0.43 log2(0.43) = 0.985, and the entropy of the right
child node is -0.46 log2(0.46) - 0.54 log2(0.54) = 0.997. The information
gain is therefore 1 - 0.7 * 0.985 - 0.3 * 0.997 = 0.011.
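The calculation is easier to get right when it is driven by class counts rather than rounded proportions. The sketch below uses hypothetical counts (a parent with 10 observations per class, split into children of 8/2 and 2/8) to show the mechanics; the function names are assumptions.

```python
import math

def entropy(counts):
    """Shannon entropy (base 2) of a node given its class counts."""
    total = sum(counts)
    return -sum((c / total) * math.log2(c / total) for c in counts if c > 0)

def information_gain(parent, children):
    """Parent entropy minus the size-weighted entropy of the children."""
    n = sum(parent)
    weighted = sum(sum(child) / n * entropy(child) for child in children)
    return entropy(parent) - weighted

# Hypothetical split: each child holds half of the parent's observations.
gain = information_gain(parent=[10, 10], children=[[8, 2], [2, 8]])
print(entropy([10, 10]), gain)
```

A split that leaves the children as mixed as the parent gives a gain close to 0, while a split into pure children gives a gain equal to the parent's entropy.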

(C) Predictor totwrk: The p-value for the predictor totwrk is less than 0.001,
which is less than the significance level of 0.10. Therefore, we reject the
null hypothesis that the coefficient of totwrk is zero. This suggests that
totwrk is a significant predictor of sleep at the 10% significance level.

(D) Differences in Sleep Between Men and Women: The p-value for the
predictor male is 0.0118, which is less than the significance level of 0.05.
Therefore, we reject the null hypothesis that the coefficient of male is
zero. This suggests that there is a significant difference in the minutes of
sleep at night per week between men and women at the 5% significance
level.

(E) MAE and MSE for the Test Set: To compute the Mean Absolute Error
(MAE) and Mean Squared Error (MSE), we need the actual and predicted
sleep values for all records in the test set, which are not provided in the
question. The formulas for MAE and MSE are as follows:

MAE = (1/n) * Σ|actual - predicted|

MSE = (1/n) * Σ(actual - predicted)²

Without the actual and predicted values, we cannot compute the MAE and
MSE, and therefore cannot compare the performance of the linear
regression model to the regression tree.
