Practical No. 4
Shruti Sandip Manval 2306114
Aim: Apply data cleansing on any two datasets.
(Excel)
Step 1: Identify Missing or Incorrect Data.
Step 2: Replace or Remove Missing Values
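Press Ctrl+H to open the "Find and Replace" window.
Replace each "?" with "Unknown". Type ~? in the "Find what" box, because a bare ? is a wildcard in Excel and would match any single character.
Replace the blank cells with "N/A".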
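The same replacement can be sketched outside Excel. The following is a minimal Python illustration, assuming the sheet has been read into a list of rows; the function name fill_missing and the sample rows are hypothetical and not part of the practical's dataset.

```python
def fill_missing(rows, unknown_marker="?", unknown_label="Unknown", blank_label="N/A"):
    """Return a copy of rows with '?' cells and blank cells replaced."""
    cleaned = []
    for row in rows:
        new_row = []
        for cell in row:
            # Ignore surrounding spaces when deciding whether a cell is missing
            text = cell.strip() if isinstance(cell, str) else cell
            if text == unknown_marker:
                new_row.append(unknown_label)
            elif text is None or text == "":
                new_row.append(blank_label)
            else:
                new_row.append(cell)
        cleaned.append(new_row)
    return cleaned


# Hypothetical sample rows, for illustration only
sample = [["Asha", "?", "Pune"], ["Ravi", "25", ""]]
print(fill_missing(sample))
```

Unlike Find and Replace, this sketch only replaces cells whose whole content is "?" or empty, so a "?" inside a longer text value is left alone.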
Step 3: Remove Duplicates
Go to the Data tab and select "Remove Duplicates" to eliminate identical entries.
To remove duplicates based on all columns, choose "Select All" and click "OK".
Preview results.
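As a rough equivalent of this step, the sketch below removes rows that are identical in every column and keeps the first occurrence of each. It assumes the data is a list of rows; the name remove_duplicates and the sample rows are hypothetical, and the comparison here is an exact match, which may differ from how Excel compares text.

```python
def remove_duplicates(rows):
    """Keep the first occurrence of each row, comparing all columns."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row)  # a row counts as a duplicate only if every column matches
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


# Hypothetical sample rows, for illustration only
sample = [["Asha", "21"], ["Ravi", "25"], ["Asha", "21"]]
print(remove_duplicates(sample))
```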
Step 4: Replace 0 with average value.
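Click on an empty cell and enter the formula, which averages only the non-zero values:
Syntax: =AVERAGEIF(range, criteria, [average_range])
Example: =AVERAGEIF(E2:E10,"<>0",E2:E10)
Then replace each 0 in the column with the computed average.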
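The calculation behind this step can be sketched in Python: average the non-zero values, then substitute that average for every 0. This is a minimal illustration that assumes the column holds plain numbers; the function name and the sample values are hypothetical.

```python
def replace_zeros_with_average(values):
    """Replace each 0 with the average of the non-zero values."""
    nonzero = [v for v in values if v != 0]
    if not nonzero:
        # Nothing to average, so return the column unchanged
        return list(values)
    average = sum(nonzero) / len(nonzero)
    return [average if v == 0 else v for v in values]


# Hypothetical column values, for illustration only
column = [10, 0, 20, 0, 30]
print(replace_zeros_with_average(column))
```

Excluding the zeros from the average matters: including them would pull the average down and understate the replacement value.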
Step 5: Final Result.
(Weka Tool)
Step 1: Load the Dataset into Weka
Open Weka.
Click on Explorer.
Click "Open file", then select your dataset.
Weka will display all attributes (columns) in the Preprocess tab.
Step 2: Handle Missing Values
Use the Edit button to manually enter values.
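Before editing values by hand, it can help to know how many are missing in each attribute. The sketch below counts them, assuming the dataset has been read into a header list and a list of rows, and that missing values appear as "?" (the marker used in ARFF files) or as empty cells. The function name and the sample data are hypothetical.

```python
def count_missing(rows, header, missing_marker="?"):
    """Count missing cells per attribute (column)."""
    counts = {name: 0 for name in header}
    for row in rows:
        for name, cell in zip(header, row):
            # Treat None, empty strings, and the marker as missing
            if cell is None or str(cell).strip() in ("", missing_marker):
                counts[name] += 1
    return counts


# Hypothetical sample data, for illustration only
header = ["outlook", "humidity"]
rows = [["sunny", "?"], ["?", "85"], ["rainy", ""]]
print(count_missing(rows, header))
```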
Step 3: Select the attribute, then open the drop-down list of options and select one of them.
Step 4: Preview the result.