ML Fat

The document outlines the FAT exam for the course 'Machine Learning for Data Science (LAB)', detailing the steps taken to preprocess a dataset, divide it into training, validation, and test sets, and apply machine learning models including Random Forest and an Artificial Neural Network (ANN). It includes code snippets for data handling, model training, hyperparameter tuning, and performance evaluation using accuracy scores and confusion matrices. The document emphasizes the importance of model validation and performance metrics in machine learning tasks.

Uploaded by

Shiny Sundarmoorthy



Winter semester 23-24

Course code MDI4001

Course name Machine Learning for Data Science (LAB)

Submitted to:
Jyotismita Chaki
Jyotismita@vit.ac.in

FAT Exam

Submitted by :
Shiny. S (21MID0079)
Shiny.2021@vitstudent.ac.in

Date: 29 April 2024

a) Performing the preprocessing steps in the given dataset

CODE:

import pandas as pd

import numpy as np

data = pd.read_csv("agriculture_dataset.csv")

data.head()

data.info()

data.isnull().sum()

data.describe()

SCREENSHOT :
There are no null values in the dataset, so no further preprocessing steps are needed.
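A null check covers only one aspect of preprocessing. The following is a minimal sketch, assuming a small hypothetical DataFrame (the column names and values are illustrative and are not taken from agriculture_dataset.csv), of two further checks that could be run alongside it: duplicated rows and non-numeric columns.

```python
import pandas as pd

# Hypothetical stand-in for the real dataset; names and values are illustrative only
df = pd.DataFrame({
    "Soil moisture": [0.31, 0.45, 0.45, 0.52],
    "Temperature": [24.0, 27.5, 27.5, 22.1],
    "Plant type": ["wheat", "rice", "rice", "maize"],
})

missing_per_column = df.isnull().sum()       # NaN count per column
n_duplicates = int(df.duplicated().sum())    # rows that repeat an earlier row
non_numeric = df.select_dtypes(exclude="number").columns.tolist()

print(missing_per_column)
print("duplicate rows:", n_duplicates)
print("non-numeric columns:", non_numeric)
```

If such checks report duplicates or text-valued feature columns, dropping the duplicates or encoding those columns may be worth considering before training.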

b) Divide the dataset into train, validation, and test sets.

CODE :

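from sklearn.model_selection import train_test_split

# Features: the first six columns; target: the plant type
X = data.iloc[:, 0:6]
y = data['Plant type']

# Hold out 20% of the rows for evaluation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)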

SCREENSHOT:
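The split above produces two sets, while the task asks for three. A minimal sketch of one common way to obtain a separate validation set is to call train_test_split twice. The synthetic arrays below are assumptions standing in for the real features and labels, and the 60/20/20 ratio is only an example.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for the features and labels (assumption, not the real data)
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 6))
y = rng.integers(0, 3, size=100)

# Step 1: set aside 20% of the rows as the final test set
X_temp, X_test, y_temp, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

# Step 2: split the remaining 80% into train and validation
# (0.25 of 80% is 20% of the full data, giving a 60/20/20 split)
X_train, X_val, y_train, y_val = train_test_split(
    X_temp, y_temp, test_size=0.25, random_state=42)

print(len(X_train), len(X_val), len(X_test))  # 60 20 20
```

With this layout, the validation set can guide model selection while the test set is used once at the end.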
c) Use a suitable hyperparameter-tuned ML model to train the dataset.

A Random Forest classifier is used as the model for this dataset; its hyperparameters are tuned in part d.

CODE :

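from sklearn.ensemble import RandomForestClassifier

# Baseline Random Forest with 100 trees
rf_classifier = RandomForestClassifier(n_estimators=100, random_state=42)
rf_classifier.fit(X_train, y_train)

# Predictions on the held-out set
y_pred = rf_classifier.predict(X_test)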

SCREENSHOT:

d) After training, validate it and test the model’s performance

CODE:

from sklearn.metrics import accuracy_score

model = RandomForestClassifier(random_state=1, max_depth=10)
model.fit(X_train, y_train)

# Accuracy on the training data
pred_train = model.predict(X_train)
train_score = accuracy_score(y_train, pred_train)
print('train_accuracy_score', train_score)

# Accuracy on the held-out data (X_test serves as the validation set here)
pred_val = model.predict(X_test)
val_score = accuracy_score(y_test, pred_val)
print('val_accuracy_score', val_score)
SCREENSHOT:

Tuning the model's hyperparameters for better validation accuracy:

CODE:

from sklearn.metrics import (accuracy_score, confusion_matrix, precision_score,
                             recall_score, ConfusionMatrixDisplay)

from sklearn.model_selection import RandomizedSearchCV

from scipy.stats import randint

param_dist = {'n_estimators': randint(50,500),'max_depth': randint(1,20)}

rf = RandomForestClassifier()

rand_search = RandomizedSearchCV(rf, param_distributions=param_dist, n_iter=5, cv=5)

rand_search.fit(X_train, y_train)

# Create a variable for the best model

best_rf = rand_search.best_estimator_

# Print the best hyperparameters

print('Best hyperparameters:', rand_search.best_params_)

# Generate predictions with the best model

pred_train = best_rf.predict(X_train)

train_score = accuracy_score(y_train,pred_train)

print('train_accuracy_score',train_score)

pred_val = best_rf.predict(X_test)

val_score = accuracy_score(y_test,pred_val)
print('val_accuracy_score',val_score)

SCREENSHOT:
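RandomizedSearchCV already performs validation internally: each sampled parameter set is scored by cross-validation on the training data. The following is a minimal sketch, assuming synthetic data from make_classification and deliberately small search ranges (both are assumptions chosen to keep the run short), of reading that cross-validated score and then checking the refit model once on the held-out set.

```python
from scipy.stats import randint
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV, train_test_split

# Synthetic stand-in data (assumption: the real dataset is not available here)
X, y = make_classification(n_samples=200, n_features=6, n_informative=4,
                           n_classes=3, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0)

search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={"n_estimators": randint(10, 40),
                         "max_depth": randint(1, 10)},
    n_iter=3, cv=3, random_state=0)
search.fit(X_train, y_train)

cv_accuracy = search.best_score_              # mean accuracy over the CV folds
test_accuracy = search.score(X_test, y_test)  # accuracy of the refit best model

print("best parameters:", search.best_params_)
print("cross-validated accuracy:", cv_accuracy)
print("held-out accuracy:", test_accuracy)
```

Reporting both numbers separates the score used to pick the hyperparameters from the score on data the search never saw.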

Creating the confusion matrix:

CODE:

cm = confusion_matrix(y_test,pred_val)

ConfusionMatrixDisplay(confusion_matrix=cm).plot()

SCREENSHOT:
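precision_score and recall_score are imported above but never called. As a minimal sketch of what they measure, per-class precision and recall can be read directly off a confusion matrix. The matrix below is hypothetical and is not the result obtained in this exam; it follows the scikit-learn layout, where rows are true classes and columns are predicted classes.

```python
import numpy as np

# Hypothetical 3-class confusion matrix (rows: true class, columns: predicted class)
cm = np.array([[8, 1, 1],
               [2, 6, 2],
               [0, 1, 9]])

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)  # correct predictions / all predictions of that class
recall = true_positives / cm.sum(axis=1)     # correct predictions / all true members of that class
accuracy = true_positives.sum() / cm.sum()

print("precision per class:", precision)
print("recall per class:", recall)
print("accuracy:", accuracy)
```

A class with high recall but low precision is being over-predicted, which overall accuracy alone does not reveal.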
a) Perform the pre-processing steps if needed. If the pre-processing steps are not needed

CODE:

import numpy as np

import pandas as pd

from sklearn.model_selection import train_test_split

from sklearn.preprocessing import StandardScaler

from keras.models import Sequential

from keras.layers import Dense

data = pd.read_csv('University_dataset.csv')

data.head()

data.isnull().sum()

SCREENSHOT:
b) Divide the dataset into train, validation, and test sets.

CODE:

X = data.iloc[:, 1:6].values

y = data.iloc[:, 6].values

SCREENSHOT:

c) Can we use an ANN to train the dataset? If yes, then create an ANN, train and validate the
model using the dataset, and write a discussion of the model's performance in the answer
booklet provided. If no, then write your justification in the answer booklet provided.

CODE:

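# Split first, then fit the scaler on the training data only,
# so that test-set statistics do not leak into training
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)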

# Build the ANN

model = Sequential()

model.add(Dense(128, input_dim=5, activation='relu'))

model.add(Dense(64, activation='relu'))

model.add(Dense(1, activation='linear'))

# Compile the model

model.compile(loss='mean_squared_error', optimizer='adam', metrics=['mean_absolute_error'])

# Train the model

model.fit(X_train, y_train, epochs=100, batch_size=32, validation_data=(X_test, y_test))

# Evaluate the model
loss, mae = model.evaluate(X_test, y_test)
print(f'Loss: {loss}, Mean Absolute Error: {mae}')

# Make predictions

predictions = model.predict(X_test)
SCREENSHOT:
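The network above is a regression model (linear output, mean squared error loss), so its performance is discussed in terms of error rather than accuracy. The following is a minimal sketch, assuming small hypothetical target and prediction arrays in place of y_test and predictions, of three metrics that could support that discussion: mean absolute error, root mean squared error, and the coefficient of determination R².

```python
import numpy as np

# Hypothetical values standing in for y_test and the model's predictions
y_true = np.array([3.0, 5.0, 7.0, 9.0])
y_pred = np.array([2.5, 5.0, 7.5, 10.0])

errors = y_pred - y_true
mae = np.mean(np.abs(errors))          # average absolute error
rmse = np.sqrt(np.mean(errors ** 2))   # penalises large errors more strongly

ss_res = np.sum(errors ** 2)                      # residual sum of squares
ss_tot = np.sum((y_true - y_true.mean()) ** 2)    # total sum of squares
r2 = 1.0 - ss_res / ss_tot                        # share of variance explained

print(f"MAE: {mae:.3f}, RMSE: {rmse:.3f}, R2: {r2:.3f}")
```

With Keras output, predictions usually has shape (n, 1), so it may need to be flattened with predictions.ravel() before being compared with y_test in this way.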
