1 Asdfadgaf

The document outlines a simulation exercise for data extraction and preprocessing techniques using MATLAB or Python. It includes objectives, procedures, and programming examples for handling missing values, normalization, encoding, outlier detection, and data smoothing. Additionally, it features pre-lab and post-lab questions to assess understanding and application of the techniques learned.

Uploaded by

boopathiboops2004

FLOWCHART:

Ex. No: 1
DATE:
SIMULATE THE DATA EXTRACTION FROM THE DATABASE AND VARIOUS DATA
PRE-PROCESSING TECHNIQUES FOR A GIVEN DATASET

OBJECTIVES:
To perform data extraction and preprocessing techniques on the given dataset.
AIM:
To simulate data extraction from a database and apply specific preprocessing techniques,
such as handling missing values, normalization, encoding, and outlier detection, using
MATLAB/PYTHON to enhance data quality.
SOFTWARE REQUIRED:
MATLAB R2022a / OpenCV / Google Colab
PROCEDURE FOR MATLAB:

1. Click on the MATLAB icon on the desktop.
2. Click on the ‘FILE’ menu on the menu bar.
3. Click on NEW M-File from the FILE menu.
4. Save the file in a directory.
5. Click on DEBUG from the menu bar and click Run.
6. Open the Command Window / Figure window for the output.

THEORY:
In machine learning, the following preprocessing techniques collectively improve the quality of the
dataset, making it more suitable for further analysis or machine learning applications:
Data Extraction:
Simulated by reading a CSV file: In this step, we simulate extracting data from a database
by reading it from a CSV file. This involves using MATLAB functions like readtable or csvread to
load the data into the workspace. The extracted data can then be processed and analyzed within
MATLAB.
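For the Python route named in the aim, the extraction step above can be sketched with the standard csv module. This is a minimal illustration, not the lab's prescribed program: the CSV text, the column names and the helper read_rows are all hypothetical stand-ins for a real database export.

```python
import csv
import io

# Hypothetical CSV content standing in for a database export.
CSV_TEXT = """name,marks,dept
Asha,78,ECE
Ravi,,CSE
Mani,91,ECE
"""

def read_rows(text):
    """Parse CSV text into a list of dicts, one dict per data row."""
    return list(csv.DictReader(io.StringIO(text)))

rows = read_rows(CSV_TEXT)
for row in rows:
    print(row)
```

An empty field (Ravi's marks here) is read as an empty string, so missing values have to be detected and handled explicitly in a later step.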
PROGRAM:

% Read the Excel file
filename = 'Student_Details.xlsx'; % Ensure the file is in the working directory
data = readtable(filename);

% Find and replace missing values with column mean
numericCols = varfun(@isnumeric, data, 'OutputFormat', 'uniform'); % Identify numeric columns
for i = find(numericCols)
    colData = data{:, i};
    meanValue = mean(colData, 'omitnan'); % Compute mean ignoring NaN
    colData(isnan(colData)) = meanValue;  % Replace NaN with mean
    data{:, i} = colData;
end

% Save the updated data back to a new Excel file
newFilename = 'Student_Details_Filled.xlsx';
writetable(data, newFilename);
disp('Missing values filled and saved successfully.');
Handling Missing Values:
Replaces missing values with the mean of the column: Missing data can cause issues in
analysis. To address this, we replace missing values with the mean value of their respective
columns. This can be done with MATLAB's fillmissing function or with an explicit loop over the
numeric columns. The dataset then remains complete, though mean replacement can understate
the variability of the affected columns.
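The mean-replacement idea can be sketched in Python as follows. This is a minimal illustration under stated assumptions: missing entries are represented as NaN, and the function name fill_with_mean and the sample marks are hypothetical.

```python
import math
from statistics import mean

def fill_with_mean(values):
    """Replace NaN entries with the mean of the non-missing entries."""
    present = [v for v in values if not math.isnan(v)]
    if not present:
        return list(values)  # nothing to compute a mean from
    column_mean = mean(present)
    return [column_mean if math.isnan(v) else v for v in values]

marks = [78.0, float("nan"), 91.0, 65.0]  # hypothetical column
filled = fill_with_mean(marks)
print(filled)
```

The mean is computed from the observed values only, which mirrors the 'omitnan' option used when averaging a column that contains NaN.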
Normalization:
Scales the data to the range [0, 1]: Normalization adjusts the values in the dataset to a
common scale, typically between 0 and 1. This is crucial when the features have different units or
ranges. MATLAB's normalize function with the 'range' method can be used to perform this
scaling, helping to improve the performance of machine learning algorithms.
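The scaling to [0, 1] described above is min-max scaling, x' = (x - min) / (max - min). A minimal Python sketch follows; the function name min_max_scale is hypothetical, and mapping a constant column to zeros is a choice made here for illustration, not something the manual specifies.

```python
def min_max_scale(values):
    """Linearly rescale values so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumption: a constant column carries no spread, so map it to zeros.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

scaled = min_max_scale([10.0, 20.0, 40.0])
print(scaled)
```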
Standardization:
Standardizes the data to have zero mean and unit variance: Standardization transforms the
data so that it has a mean of zero and a standard deviation of one. This is done using the zscore
function in MATLAB (Statistics and Machine Learning Toolbox). Standardization is particularly
useful when the features have different scales and benefits algorithms that are sensitive to
feature scale; it does not by itself make the data normally distributed.
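Standardization computes z = (x - mean) / std for every value. The Python sketch below is illustrative only; the function name standardize is hypothetical, and it uses the sample standard deviation (divisor n - 1), which to my understanding is also the default in MATLAB's zscore.

```python
from statistics import mean, stdev

def standardize(values):
    """Return z-scores: subtract the mean, divide by the sample std."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation, divisor n - 1
    if sigma == 0:
        # Assumption: a constant column is mapped to zeros.
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]

z_scores = standardize([2.0, 4.0, 6.0])
print(z_scores)
```

For the sample [2, 4, 6] the mean is 4 and the sample standard deviation is 2, so the z-scores are -1, 0 and 1.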
Encoding Categorical Variables:
Converts categorical variables to numerical values: Categorical variables must be
converted to numerical values for use in machine learning models. This can be done by converting
categories to integers using MATLAB's categorical and double functions, ensuring that the data is
in a suitable format for analysis.
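Integer encoding of categories can be sketched in Python as below. The function name encode_categories and the department labels are hypothetical. Codes start at 1 and follow the sorted order of the distinct labels, which is chosen here to resemble converting a categorical array to double.

```python
def encode_categories(labels):
    """Map each distinct label to an integer code (1-based, sorted order)."""
    categories = sorted(set(labels))
    code_of = {label: index + 1 for index, label in enumerate(categories)}
    return [code_of[label] for label in labels], categories

depts = ["ECE", "CSE", "ECE", "MECH"]  # hypothetical column
codes, categories = encode_categories(depts)
print(categories)
print(codes)
```

Integer codes impose an ordering (CSE < ECE < MECH) that the categories may not actually have, so models that treat the codes as magnitudes should be used with care.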
Outlier Detection and Removal:
Removes rows with outliers based on the z-score: Outliers can skew results and affect
model performance. We detect outliers using the z-score, which measures the number of standard
deviations a data point is from the mean. Data points with z-scores beyond a certain threshold
(e.g., 3) are considered outliers and can be removed to clean the dataset.
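The z-score rule above can be sketched in Python on a single column. The function name remove_outliers and the readings are hypothetical; for a full table, the same test would decide which whole rows to drop.

```python
from statistics import mean, stdev

def remove_outliers(values, threshold=3.0):
    """Keep only values whose absolute z-score is within the threshold."""
    mu = mean(values)
    sigma = stdev(values)  # sample standard deviation, divisor n - 1
    if sigma == 0:
        return list(values)  # constant data: nothing is flagged
    return [v for v in values if abs((v - mu) / sigma) <= threshold]

# Hypothetical readings: fourteen values near 50 and one extreme value.
readings = [48.0, 50.0, 52.0, 49.0, 51.0, 50.0, 47.0, 53.0,
            50.0, 49.0, 51.0, 52.0, 48.0, 50.0, 500.0]
cleaned = remove_outliers(readings)
print(len(readings), len(cleaned))
```

With the sample standard deviation, no z-score can exceed (n - 1) / sqrt(n) in magnitude, so a threshold of 3 cannot flag anything in a column with fewer than 11 values. A lower threshold may suit very small datasets.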
Data Smoothing:
Applies a moving average filter to smooth the data: Smoothing helps to reduce noise and
fluctuations in the data, making patterns more apparent. This can be achieved using a moving
average filter, implemented in MATLAB with the movmean function. By averaging data points
within a defined window, we produce a smoother dataset that is easier to analyze.
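A centered moving average can be sketched in Python as follows. The function name moving_average is hypothetical. The sketch assumes an odd window length and shortens the window at the two ends of the series instead of padding, which is one common convention.

```python
def moving_average(values, window=3):
    """Centered moving average; the window shrinks at the edges."""
    half = window // 2  # assumes an odd window length
    smoothed = []
    for i in range(len(values)):
        start = max(0, i - half)
        stop = min(len(values), i + half + 1)
        smoothed.append(sum(values[start:stop]) / (stop - start))
    return smoothed

signal = [1.0, 2.0, 9.0, 2.0, 1.0]  # hypothetical noisy series with a spike
smoothed = moving_average(signal, window=3)
print(smoothed)
```

A wider window smooths more aggressively but also flattens genuine short-lived features, so the window length is a trade-off to examine when answering the post-lab questions.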
PRELAB QUESTIONS:

1. What is the purpose of data extraction in the context of this experiment?
2. Which MATLAB function is used to read data from a CSV file?
3. Why is it important to handle missing values in a dataset?
4. Explain how normalization differs from standardization.
5. Describe the purpose of applying a moving average filter to data.
6. What are the potential impacts of outliers on data analysis and model performance?
7. How can normalization improve the performance of machine learning algorithms?
8. What are some common techniques for handling missing data, and why is replacing with
the mean a valid approach?

POSTLAB QUESTIONS:

1. Which preprocessing technique had the most significant impact on the dataset, and why?
2. Were there any challenges encountered during data extraction or preprocessing? How were
they addressed?
3. Evaluate the effectiveness of the moving average filter in smoothing the data. Did it help
reveal underlying patterns?
4. Based on your results, what additional preprocessing steps would you recommend to
further enhance the dataset's quality?
RESULT:

CORE COMPETENCY:

MARKS ALLOCATION:

Details                                       Marks      Marks Awarded
                                              Allotted   BOOPATHI V      DHARANESH P
                                                         (73772213110)   (73772213116)
Preparation                                   20
Conducting                                    20
Calculation / Graphs                          15
Results                                       10
Basic understanding (Core competency learned) 15
Viva                                          10
Record                                        10
Total                                         100

Signature of faculty
FLOW CHART:
