Experiment-4
Aim- Demonstration of the classification rule process on the dataset student.arff using the J48 algorithm.
This experiment illustrates the use of the J48 classifier in Weka. The sample data set used in this
experiment is the “student” data, available in ARFF format. This document assumes that appropriate data
preprocessing has been performed.
Steps involved in this experiment:
Step-1: We begin the experiment by loading the data (student.arff) into Weka.
Step-2: Next we select the “Classify” tab and click the “Choose” button to select the “J48” classifier.
Step-3: Now we specify the various parameters. These can be specified by clicking in the text box to
the right of the “Choose” button. In this example, we accept the default values. The default version does
perform some pruning but does not perform reduced-error pruning.
Step-4: Under “Test options” in the main panel, we select 10-fold cross-validation as our
evaluation approach. Since we don’t have a separate evaluation data set, this is necessary to get a
reasonable idea of the accuracy of the generated model.
Step-5: We now click “Start” to generate the model. The ASCII version of the tree as well as the
evaluation statistics will appear in the right panel when the model construction is complete.
Step-6: Note that the classification accuracy of the model is about 69%. This indicates that more work
may be needed (either in preprocessing or in selecting the parameters for the classification).
Step-7: Weka also lets us view a graphical version of the classification tree. This can be done
by right-clicking the last result set and selecting “Visualize tree” from the pop-up menu.
Step-8: We will use our model to classify new instances.
Step-9: In the main panel, under “Test options”, click the “Supplied test set” radio button and then
click the “Set” button. This will pop up a window which will allow you to open the file containing the
test instances.
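The 10-fold cross-validation chosen in Step-4 can be sketched outside Weka as follows. This is a minimal illustration, not Weka's own procedure (which may differ in detail, for example by stratifying the folds). The labels and the "majority class" learner are hypothetical placeholders and have nothing to do with student.arff.

```python
import random
from collections import Counter

def k_fold_indices(n_instances, k=10, seed=1):
    """Shuffle instance indices and deal them into k nearly equal folds."""
    indices = list(range(n_instances))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]

def cross_validate(labels, train_and_predict, k=10, seed=1):
    """Return overall accuracy; every instance is tested exactly once."""
    folds = k_fold_indices(len(labels), k, seed)
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        # Train on everything that is not in the current test fold.
        train_idx = [i for i in range(len(labels)) if i not in held_out]
        predictions = train_and_predict(train_idx, test_idx)
        correct += sum(p == labels[i] for p, i in zip(predictions, test_idx))
    return correct / len(labels)

# Hypothetical class labels (assumption: made up for illustration only).
labels = ["yes"] * 12 + ["no"] * 8

def majority_learner(train_idx, test_idx):
    """Placeholder learner: always predict the most common training class."""
    majority = Counter(labels[i] for i in train_idx).most_common(1)[0][0]
    return [majority] * len(test_idx)

accuracy = cross_validate(labels, majority_learner)
print(f"10-fold cross-validated accuracy: {accuracy:.2%}")
```

Because each instance is held out exactly once, the reported accuracy is an estimate of how the model behaves on data it was not trained on, which is why this approach is useful when no separate evaluation set exists.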
Experiment-5
Aim- Demonstration of the classification rule process on the dataset employee.arff using the naïve Bayes
algorithm.
This experiment illustrates the use of the naïve Bayes classifier in Weka. The sample data set used in this
experiment is the “employee” data, available in ARFF format. This document assumes that appropriate
data preprocessing has been performed.
Steps involved in this experiment:
Step-1: We begin the experiment by loading the data (employee.arff) into Weka.
Step-2: Next we select the “Classify” tab and click the “Choose” button to select the “NaiveBayes” classifier.
Step-3: Now we specify the various parameters. These can be specified by clicking in the text box to
the right of the “Choose” button. In this example, we accept the default values.
Step-4: Under “Test options” in the main panel, we select 10-fold cross-validation as our
evaluation approach. Since we don’t have a separate evaluation data set, this is necessary to get a
reasonable idea of the accuracy of the generated model.
Step-5: We now click “Start” to generate the model. A text description of the model as well as the
evaluation statistics will appear in the right panel when the model construction is complete.
Step-6: Note that the classification accuracy of the model is about 69%. This indicates that more work
may be needed (either in preprocessing or in selecting the parameters for the classification).
Step-7: Weka also lets us view the results graphically. Since naïve Bayes does not build a tree, the
“Visualize tree” option does not apply here; instead, right-click the last result set and select, for
example, “Visualize classifier errors” from the pop-up menu.
Step-8: We will use our model to classify new instances.
Step-9: In the main panel, under “Test options”, click the “Supplied test set” radio button and then
click the “Set” button. This will pop up a window which will allow you to open the file containing the
test instances.
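To make the classifier used in this experiment more concrete, the following is a minimal sketch of naïve Bayes for nominal attributes with Laplace smoothing. It is not Weka's implementation (which also handles numeric attributes), and the attribute values and class labels below are hypothetical, since the contents of employee.arff are not listed here.

```python
import math
from collections import Counter, defaultdict

def train_naive_bayes(rows, labels):
    """Count class frequencies and per-class attribute-value frequencies."""
    class_counts = Counter(labels)
    value_counts = defaultdict(Counter)  # (attribute index, class) -> value counts
    domains = defaultdict(set)           # attribute index -> distinct values seen
    for row, label in zip(rows, labels):
        for j, value in enumerate(row):
            value_counts[(j, label)][value] += 1
            domains[j].add(value)
    return class_counts, value_counts, domains

def predict(model, row):
    """Return the class with the highest Laplace-smoothed log posterior."""
    class_counts, value_counts, domains = model
    total = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total)  # log prior P(class)
        for j, value in enumerate(row):
            seen = value_counts[(j, label)][value]
            # log P(value | class), with add-one smoothing for unseen values
            score += math.log((seen + 1) / (count + len(domains[j])))
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical training data: (age group, salary band) -> grade.
rows = [("young", "low"), ("young", "low"), ("old", "high"),
        ("old", "high"), ("old", "low"), ("young", "high")]
labels = ["junior", "junior", "senior", "senior", "senior", "junior"]

model = train_naive_bayes(rows, labels)
print(predict(model, ("old", "high")))   # -> senior
print(predict(model, ("young", "low")))  # -> junior
```

The "naïve" part is the assumption that attributes are independent given the class, which is why the per-attribute log probabilities are simply added together.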
Experiment-6
Aim- Demonstration of the clustering rule process on the dataset iris.arff using simple k-means.
This experiment illustrates the use of simple k-means clustering with the Weka Explorer. The sample data
set used for this example is based on the iris data, available in ARFF format. This document assumes
that appropriate preprocessing has been performed. The iris dataset includes 150 instances.
Steps involved in this Experiment:
Step 1: Run the Weka Explorer and load the data file iris.arff in the “Preprocess” interface.
Step 2: In order to perform clustering, select the ‘Cluster’ tab in the Explorer and click on the ‘Choose’
button. This step results in a dropdown list of available clustering algorithms.
Step 3: In this case we select ‘SimpleKMeans’.
Step 4: Next, click in the text box to the right of the ‘Choose’ button to get the popup window shown in the
screenshots. In this window we enter six as the number of clusters and we leave the value of the
seed as it is. The seed value is used in generating a random number, which is used for making the
initial assignment of instances to clusters.
Step 5: Once the options have been specified, we run the clustering algorithm. Here we must
make sure that, in the ‘Cluster mode’ panel, the ‘Use training set’ option is selected, and
then we click the ‘Start’ button. This process and the resulting window are shown in the following
screenshots.
Step 6: The result window shows the centroid of each cluster as well as statistics on the number and
the percentage of instances assigned to the different clusters. Here the cluster centroids are the mean vectors of
each cluster. These centroids can be used to characterize the clusters. For example, the centroid of cluster 1
shows the class Iris-versicolor, with a mean sepal length of 5.4706, sepal width of 2.4765, petal
width of 1.1294, and petal length of 3.7941.
Step 7: Another way of understanding the characteristics of each cluster is through visualization. We can do
this by right-clicking the result set in the ‘Result list’ panel and selecting ‘Visualize cluster
assignments’.
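The roles of the seed (Step 4) and of the centroids and cluster sizes (Step 6) can be sketched as follows. This is a minimal k-means illustration under stated assumptions: Euclidean distance, initial centroids drawn at random from the data, and a handful of made-up 2-D points rather than the iris measurements. Weka's SimpleKMeans may differ in its details.

```python
import random

def squared_distance(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mean_vector(points):
    """Component-wise mean of a non-empty list of points."""
    return tuple(sum(column) / len(points) for column in zip(*points))

def k_means(points, k, seed=10, max_iterations=100):
    rng = random.Random(seed)           # the seed fixes the random starting point
    centroids = rng.sample(points, k)   # initial centroids drawn from the data
    assignments = None
    for _ in range(max_iterations):
        # Assign every point to its nearest centroid.
        new_assignments = [
            min(range(k), key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        if new_assignments == assignments:
            break                       # converged: nothing changed cluster
        assignments = new_assignments
        # Recompute each centroid as the mean vector of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:                 # keep the old centroid if a cluster is empty
                centroids[c] = mean_vector(members)
    return centroids, assignments

# Hypothetical 2-D points (assumption: illustrative values only).
points = [(1.0, 1.0), (1.2, 0.8), (0.8, 1.2),
          (5.0, 5.0), (5.2, 4.8), (4.8, 5.2)]

centroids, assignments = k_means(points, k=2)
for c, centroid in enumerate(centroids):
    size = assignments.count(c)
    print(f"cluster {c}: centroid={centroid}, "
          f"{size} instances ({size / len(points):.0%})")
```

Running the sketch with a different `seed` can change the starting centroids and therefore the final clusters, which is why the seed is exposed as a parameter.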