Supervised Learning Algorithms Analysis


Name: Nilay Debnath

ID: CSE 06607735


Section: C
Batch- 66
Submitted to: Tanvir Rahman
Date of submission: 7-10-21
SUPERVISED LEARNING

Introduction:

Supervised learning is the machine learning task of learning a function that maps an input to an output based on example input-output pairs. It infers a function from labeled training data consisting of a set of training examples. In supervised learning, each example is a pair consisting of an input object (typically a vector) and a desired output value (also called the supervisory signal). A supervised learning algorithm analyzes the training data and produces an inferred function, which can be used for mapping new examples. An optimal scenario will allow for the algorithm to correctly determine the class labels for unseen instances. This requires the learning algorithm to generalize from the training data to unseen situations in a "reasonable" way. The parallel task in human and animal psychology is often referred to as concept learning.
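
As a minimal illustration of inferring a function from labeled pairs, the sketch below fits the simplest possible learner: a baseline that always predicts the most frequent training label, which is the idea behind the ZeroR scheme used later in this report. The training examples and the function names are hypothetical and are not taken from the Car dataset.

```python
from collections import Counter

def fit_majority_baseline(labels):
    """Return the most frequent label seen in the training data."""
    return Counter(labels).most_common(1)[0][0]

# Hypothetical (input vector, desired output) pairs, for illustration only.
training_examples = [
    (("low", "high"), "acc"),
    (("high", "low"), "unacc"),
    (("med", "low"), "unacc"),
]

majority = fit_majority_baseline(label for _, label in training_examples)

def predict(_input_vector):
    # The inferred "function" ignores its input and returns the majority label.
    return majority

print(predict(("low", "med")))
```

Any useful classifier should beat this baseline on unseen instances, which is why a ZeroR run is a common reference point.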

Abstraction:

This supervised learning study works with a dataset described by the following attributes:

• Buying
• Maint
• Doors
• Persons
• Lug_Boot
• Safety
• Condition

Using these attributes, the efficiency of six different algorithms was tested for this case. The outcomes are given in the Analysis and Result section.
Algorithms for the maint attribute

Run Information

1. weka.classifiers.rules.ZeroR

Scheme: weka.classifiers.rules.ZeroR

Relation: Car

Instances: 1728

Attributes: 7

buying

maint

doors

persons

lug_boot

safety

Condition

Test mode: 5-fold cross-validation

=== Classifier model (full training set) ===

ZeroR predicts class value: vhigh

Time taken to build model: 0 seconds

=== Stratified cross-validation ===

=== Summary ===

Correctly Classified Instances 430 24.8843 %

Incorrectly Classified Instances 1298 75.1157 %

Kappa statistic -0.0015


Mean absolute error 0.375

Root mean squared error 0.433

Relative absolute error 100 %

Root relative squared error 100 %

Total Number of Instances 1728

=== Detailed Accuracy By Class ===

               TP Rate  FP Rate  Precision  Recall  F-Measure  MCC     ROC Area  PRC Area  Class
               0.597    0.601    0.249      0.597   0.351      -0.003  0.498     0.249     vhigh
               0.199    0.200    0.249      0.199   0.221      -0.001  0.498     0.249     high
               0.199    0.201    0.249      0.199   0.221      -0.002  0.498     0.249     med
               0.000    0.000    ?          0.000   ?          ?       0.498     0.249     low
Weighted Avg.  0.249    0.250    ?          0.249   ?          ?       0.498     0.249

=== Confusion Matrix ===

   a    b    c    d   <-- classified as
 258   87   87    0 | a = vhigh
 259   86   87    0 | b = high
 260   86   86    0 | c = med
 260   86   86    0 | d = low
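
The accuracy and Kappa statistic in the summary can be recomputed from this confusion matrix. The sketch below uses the standard definition of Cohen's kappa (observed agreement corrected for chance agreement); the function name is illustrative, and the matrix values are copied from the ZeroR output above.

```python
def accuracy_and_kappa(matrix):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix."""
    n = sum(sum(row) for row in matrix)
    observed = sum(matrix[i][i] for i in range(len(matrix))) / n
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(col) for col in zip(*matrix)]
    # Agreement expected by chance, from the row and column marginals.
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / n**2
    kappa = (observed - expected) / (1 - expected)
    return observed, kappa

# Rows are actual classes, columns are predicted classes (vhigh, high, med, low).
zero_r_matrix = [
    [258, 87, 87, 0],
    [259, 86, 87, 0],
    [260, 86, 86, 0],
    [260, 86, 86, 0],
]

acc, kappa = accuracy_and_kappa(zero_r_matrix)
print(f"accuracy = {acc:.4%}, kappa = {kappa:.4f}")
```

A kappa this close to zero means the predictions agree with the true labels no better than chance, as expected for a baseline on four equally sized classes.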
2. weka.classifiers.rules.PART

Scheme: weka.classifiers.rules.PART -C 0.25 -M 2

Relation: Car

Instances: 1728

Attributes: 7

buying

maint

doors

persons

lug_boot

safety

Condition

Test mode: 5-fold cross-validation

=== Classifier model (full training set) ===

PART decision list

------------------

Condition = acc AND
buying = high: high (108.0/72.0)

Condition = unacc: vhigh (1210.0/850.0)

Condition = acc AND
buying = vhigh: med (72.0/36.0)

Condition = acc AND
safety = high: vhigh (89.0/43.0)

Condition = good: low (69.0/23.0)

safety = high: med (65.0/39.0)

lug_boot = big: vhigh (40.0/24.0)

lug_boot = small: med (35.0/21.0)

doors > 3: vhigh (20.0/12.0)

doors <= 2: med (10.0/6.0)

persons <= 4: med (5.0/3.0)

: vhigh (5.0/3.0)

Number of Rules : 12

Time taken to build model: 0.04 seconds

=== Stratified cross-validation ===

=== Summary ===

Correctly Classified Instances 315 18.2292 %

Incorrectly Classified Instances 1413 81.7708 %

Kappa statistic -0.0903

Mean absolute error 0.3537

Root mean squared error 0.4338

Relative absolute error 94.3215 %

Root relative squared error 100.1908 %

Total Number of Instances 1728


=== Detailed Accuracy By Class ===

               TP Rate  FP Rate  Precision  Recall  F-Measure  MCC     ROC Area  PRC Area  Class
               0.308    0.298    0.256      0.308   0.280      0.009   0.638     0.377     vhigh
               0.111    0.299    0.110      0.111   0.111      -0.188  0.485     0.234     high
               0.150    0.270    0.157      0.150   0.153      -0.121  0.525     0.263     med
               0.160    0.223    0.193      0.160   0.175      -0.068  0.584     0.345     low
Weighted Avg.  0.182    0.273    0.179      0.182   0.180      -0.092  0.558     0.305

=== Confusion Matrix ===

   a    b    c    d   <-- classified as
 133  156   85   58 | a = vhigh
 179   48  108   97 | b = high
 110  123   65  134 | c = med
  97  109  157   69 | d = low
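
The per-class Precision, Recall and F-Measure columns follow directly from this confusion matrix. The sketch below recomputes them with the usual definitions (precision = TP / column total, recall = TP / row total, F-measure = their harmonic mean); the function and variable names are illustrative.

```python
def per_class_metrics(matrix, class_names):
    """Return {class: {"precision", "recall", "f_measure"}} from a confusion matrix."""
    col_totals = [sum(col) for col in zip(*matrix)]
    metrics = {}
    for i, name in enumerate(class_names):
        true_positive = matrix[i][i]
        precision = true_positive / col_totals[i]   # of everything predicted as this class
        recall = true_positive / sum(matrix[i])     # of everything actually in this class
        f_measure = 2 * precision * recall / (precision + recall)
        metrics[name] = {
            "precision": precision,
            "recall": recall,
            "f_measure": f_measure,
        }
    return metrics

# Rows are actual classes, columns are predicted classes; values from the PART output.
part_matrix = [
    [133, 156, 85, 58],
    [179, 48, 108, 97],
    [110, 123, 65, 134],
    [97, 109, 157, 69],
]

part_metrics = per_class_metrics(part_matrix, ["vhigh", "high", "med", "low"])
for name, values in part_metrics.items():
    print(name, {key: round(value, 3) for key, value in values.items()})
```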


3. weka.classifiers.functions.Logistic

Scheme: weka.classifiers.functions.Logistic -R 1.0E-8 -M -1 -num-decimal-places 4

Relation: Car

Instances: 1728

Attributes: 7

buying

maint

doors

persons

lug_boot

safety

Condition

Test mode: 5-fold cross-validation

=== Classifier model (full training set) ===

Logistic Regression with ridge parameter of 1.0E-8

Coefficients...

                        Class
Variable                vhigh       high        med
===================================================
buying=vhigh          -0.4104    -0.2583    -0.0603
buying=high           -0.2765    -0.2119    -0.0781
buying=med             0.2        0.1098     0.0169
buying=low             0.4869     0.3605     0.1215
doors                  0.0017     0.0009     0
persons                0.3477     0.1927     0.0074
lug_boot=small        -0.2145    -0.1108    -0.0014
lug_boot=med           0.0479     0.0241     0.0007
lug_boot=big           0.1667     0.0867     0.0007
safety=low            -0.642     -0.3601    -0.012
safety=med             0.1167     0.0771     0.0132
safety=high            0.5253     0.283     -0.0012
Condition=unacc        3.5598     1.7988     0.0034
Condition=acc          2.0261     1.1919     0.1981
Condition=vgood      -14.8556    -0.0593    -0.1045
Condition=good       -14.5985   -15.1667    -0.8131
Intercept             -4.2465    -2.1353    -0.0077

Odds Ratios...

                        Class
Variable                vhigh       high        med
===================================================
buying=vhigh           0.6634     0.7723     0.9415
buying=high            0.7584     0.8091     0.9249
buying=med             1.2214     1.116      1.017
buying=low             1.6273     1.434      1.1292
doors                  1.0017     1.0009     1
persons                1.4158     1.2125     1.0074
lug_boot=small         0.8069     0.8951     0.9986
lug_boot=med           1.049      1.0244     1.0007
lug_boot=big           1.1814     1.0906     1.0007
safety=low             0.5263     0.6976     0.9881
safety=med             1.1238     1.0802     1.0133
safety=high            1.6909     1.3271     0.9988
Condition=unacc       35.1578     6.0422     1.0034
Condition=acc          7.5847     3.2932     1.2191
Condition=vgood        0          0.9425     0.9008
Condition=good         0          0          0.4435


Time taken to build model: 0.12 seconds
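
Each entry in the Odds Ratios table is the exponential of the matching entry in the Coefficients table. The sketch below checks this for a few vhigh-class rows copied from the output above. Because the printed coefficients are rounded to four decimals, the recomputed ratios are only expected to agree approximately; the tolerance used is an assumption made for this check.

```python
import math

# (coefficient, reported odds ratio) pairs for the vhigh class, copied from the output.
vhigh_rows = {
    "buying=vhigh": (-0.4104, 0.6634),
    "buying=low": (0.4869, 1.6273),
    "persons": (0.3477, 1.4158),
    "safety=low": (-0.642, 0.5263),
}

def odds_ratio(coefficient):
    """Odds ratio implied by a logistic regression coefficient."""
    return math.exp(coefficient)

for name, (coefficient, reported) in vhigh_rows.items():
    recomputed = odds_ratio(coefficient)
    print(f"{name}: exp({coefficient}) = {recomputed:.4f} (reported {reported})")
```

An odds ratio below 1 (for example buying=vhigh) means the attribute value lowers the odds of the class relative to the reference class, and a ratio above 1 raises them.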


=== Stratified cross-validation ===

=== Summary ===

Correctly Classified Instances 532 30.787 %

Incorrectly Classified Instances 1196 69.213 %

Kappa statistic 0.0772

Mean absolute error 0.3611

Root mean squared error 0.4268

Relative absolute error 96.2986 %

Root relative squared error 98.5638 %

Total Number of Instances 1728

=== Detailed Accuracy By Class ===

               TP Rate  FP Rate  Precision  Recall  F-Measure  MCC     ROC Area  PRC Area  Class
               0.567    0.382    0.331      0.567   0.418      0.162   0.655     0.369     vhigh
               0.148    0.131    0.274      0.148   0.192      0.021   0.508     0.247     high
               0.271    0.218    0.293      0.271   0.281      0.054   0.580     0.299     med
               0.245    0.191    0.299      0.245   0.270      0.058   0.601     0.366     low
Weighted Avg.  0.308    0.231    0.299      0.308   0.290      0.074   0.586     0.320
=== Confusion Matrix ===

   a    b    c    d   <-- classified as
 245   77   55   55 | a = vhigh
 197   64   92   79 | b = high
 153   48  117  114 | c = med
 145   45  136  106 | d = low
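
The MCC column for the Logistic run can be reproduced from this confusion matrix by treating each class in turn as "positive" and all other classes as "negative". The sketch below applies the standard Matthews correlation coefficient formula in that one-vs-rest fashion; the function name is illustrative.

```python
import math

def one_vs_rest_mcc(matrix, index):
    """Matthews correlation coefficient for one class against all others."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[index][index]
    fn = sum(matrix[index]) - tp                  # actual class, predicted as another
    fp = sum(row[index] for row in matrix) - tp   # other class, predicted as this one
    tn = total - tp - fn - fp
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denominator

# Rows are actual classes, columns are predicted classes; values from the Logistic output.
logistic_matrix = [
    [245, 77, 55, 55],
    [197, 64, 92, 79],
    [153, 48, 117, 114],
    [145, 45, 136, 106],
]

class_names = ["vhigh", "high", "med", "low"]
mcc_by_class = {
    name: one_vs_rest_mcc(logistic_matrix, i) for i, name in enumerate(class_names)
}
for name, value in mcc_by_class.items():
    print(f"{name}: MCC = {value:.3f}")
```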


Algorithms for the persons attribute

1. weka.classifiers.functions.GaussianProcesses

Scheme: weka.classifiers.functions.GaussianProcesses -L 1.0 -N 0 -K "weka.classifiers.functions.supportVector.PolyKernel -E 1.0 -C 250007" -S 1

Relation: Car

Instances: 1728

Attributes: 7

buying

maint

doors

persons

lug_boot

safety

Condition

Test mode: 5-fold cross-validation

=== Classifier model (full training set) ===

Gaussian Processes

Kernel used:
Linear Kernel: K(x,y) = <x,y>

All values shown based on: Normalize training data

Average Target Value : 0.555555555555551

Inverted Covariance Matrix:

Lowest Value = -0.020975481221554237

Highest Value = 0.993470109156667

Inverted Covariance Matrix * Target-value Vector:

Lowest Value = -0.7812415052997398

Highest Value = 0.8274300015945182

Time taken to build model: 5.4 seconds


=== Cross-validation ===

=== Summary ===

Correlation coefficient 0.5203

Mean absolute error 0.915

Root mean squared error 1.0651

Relative absolute error 82.2512 %

Root relative squared error 85.2746 %

Total Number of Instances 1728

2. weka.classifiers.functions.LinearRegression
=== Run information ===

Scheme: weka.classifiers.functions.LinearRegression -S 0 -R 1.0E-8 -num-decimal-places 4

Relation: Car

Instances: 1728

Attributes: 7

buying

maint

doors

persons

lug_boot

safety

Condition

Test mode: 5-fold cross-validation

=== Classifier model (full training set) ===

Linear Regression Model

persons =

     -0.1398 * buying=high,med,low +
     -0.296  * buying=med,low +
     -0.194  * maint=high,med,low +
     -0.2147 * maint=med,low +
     -0.234  * lug_boot=med,big +
     -0.6555 * safety=med,high +
     -0.2865 * safety=high +
      1.9373 * Condition=good,acc,vgood +
     -0.2593 * Condition=acc,vgood +
      0.5098 * Condition=vgood +
      4.3284


Time taken to build model: 0.03 seconds
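
To show how this model produces a prediction, the sketch below evaluates the equation for a single car. It assumes that a term such as `buying=high,med,low` is a 0/1 indicator that equals 1 when the car's buying value is one of the listed values, which is how Weka usually displays nominal attributes in a linear model. The function name and the example cars are hypothetical.

```python
def predict_persons(car):
    """Evaluate the reported linear model for one car (a dict of attribute values)."""
    def indicator(attribute, values):
        # Assumed meaning of "attr=a,b,c": 1 if the attribute takes any listed value.
        return 1.0 if car[attribute] in values else 0.0

    return (
        -0.1398 * indicator("buying", {"high", "med", "low"})
        - 0.296 * indicator("buying", {"med", "low"})
        - 0.194 * indicator("maint", {"high", "med", "low"})
        - 0.2147 * indicator("maint", {"med", "low"})
        - 0.234 * indicator("lug_boot", {"med", "big"})
        - 0.6555 * indicator("safety", {"med", "high"})
        - 0.2865 * indicator("safety", {"high"})
        + 1.9373 * indicator("Condition", {"good", "acc", "vgood"})
        - 0.2593 * indicator("Condition", {"acc", "vgood"})
        + 0.5098 * indicator("Condition", {"vgood"})
        + 4.3284
    )

# Hypothetical car for which every indicator is 0, so only the intercept remains.
reference_car = {
    "buying": "vhigh", "maint": "vhigh", "lug_boot": "small",
    "safety": "low", "Condition": "unacc",
}
# Hypothetical car for which every indicator is 1.
all_terms_car = {
    "buying": "low", "maint": "low", "lug_boot": "big",
    "safety": "high", "Condition": "vgood",
}

print(round(predict_persons(reference_car), 4))
print(round(predict_persons(all_terms_car), 4))
```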

=== Cross-validation ===

=== Summary ===

Correlation coefficient 0.519

Mean absolute error 0.9149

Root mean squared error 1.0663

Relative absolute error 82.235 %

Root relative squared error 85.365 %

Total Number of Instances 1728


3. weka.classifiers.misc.InputMappedClassifier

Scheme: weka.classifiers.misc.InputMappedClassifier -I -trim -W weka.classifiers.rules.ZeroR

Relation: Car

Instances: 1728

Attributes: 7

buying

maint

doors

persons

lug_boot

safety

Condition

Test mode: 5-fold cross-validation

=== Classifier model (full training set) ===

InputMappedClassifier:

ZeroR predicts class value: 3.6666666666666665

Attribute mappings:

Model attributes          Incoming attributes
---------------------     ----------------
(nominal) buying      --> 1 (nominal) buying
(nominal) maint       --> 2 (nominal) maint
(numeric) doors       --> 3 (numeric) doors
(numeric) persons     --> 4 (numeric) persons
(nominal) lug_boot    --> 5 (nominal) lug_boot
(nominal) safety      --> 6 (nominal) safety
(nominal) Condition   --> 7 (nominal) Condition

Time taken to build model: 0 seconds

=== Cross-validation ===

=== Summary ===

Correlation coefficient -0.0727

Mean absolute error 1.1125

Root mean squared error 1.2491

Relative absolute error 100 %

Root relative squared error 100 %

Total Number of Instances 1728
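
The relative errors reported for the persons runs can be read as each model's error divided by the error of this ZeroR baseline, which always predicts the mean. The sketch below recomputes them from the reported mean absolute and root mean squared errors. It assumes that this simple ratio is a fair approximation of how the relative measures are normalised, so small differences from the reported percentages are expected because the inputs are rounded.

```python
# Errors copied from the cross-validation summaries in this report.
zero_r_baseline = {"mae": 1.1125, "rmse": 1.2491}
model_errors = {
    "GaussianProcesses": {"mae": 0.915, "rmse": 1.0651},
    "LinearRegression": {"mae": 0.9149, "rmse": 1.0663},
}

def relative_errors(model, baseline):
    """Return (relative absolute error %, root relative squared error %)."""
    rae = 100 * model["mae"] / baseline["mae"]
    rrse = 100 * model["rmse"] / baseline["rmse"]
    return rae, rrse

for name, errors in model_errors.items():
    rae, rrse = relative_errors(errors, zero_r_baseline)
    print(f"{name}: RAE ~ {rae:.2f} %, RRSE ~ {rrse:.2f} %")
```

Both regression models land near 82 % and 85 %, meaning they remove only about 15 to 18 % of the baseline's error.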


Analysis and Result:

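Six algorithms were tested on the dataset, three for each of the two target attributes (maint and persons).

For the maint attribute I used three algorithms: rules.ZeroR, rules.PART, and functions.Logistic. Of these, functions.Logistic appears to be the best classifier for this case. It has the highest accuracy (30.787 %, against 24.8843 % for ZeroR and 18.2292 % for PART) and the lowest root relative squared error, although PART reports a lower relative absolute error (94.3215 %). The relative errors for functions.Logistic are:

Relative absolute error 96.2986 %

Root relative squared error 98.5638 %

For the persons attribute I used three algorithms: functions.GaussianProcesses, functions.LinearRegression, and misc.InputMappedClassifier. Because persons is numeric, these runs are regression tasks. Of the three, functions.GaussianProcesses appears marginally the best: it has the highest correlation coefficient (0.5203) and the lowest root relative squared error, although functions.LinearRegression is almost identical and has a slightly lower relative absolute error (82.235 %). The relative errors for functions.GaussianProcesses are:

Relative absolute error 82.2512 %

Root relative squared error 85.2746 %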


For 2nd Sheet
