MLT 9

The document outlines a procedure for simulating a boosting ensemble method using MATLAB on a dataset for medical diagnosis prediction. It details the steps for initializing weak learners, defining loss functions, training on weighted data, and tracking cumulative training error over 1000 iterations. The final output includes a trained ensemble model, accuracy plots, and evaluation metrics such as precision, recall, and F1-score.

EX NO: 09                                                        DATE:

Simulate a boosting ensemble method for any dataset

AIM:

To simulate a boosting ensemble method for any dataset using MATLAB.

SOFTWARE REQUIRED:

 MATLAB

PROCEDURE

 Initialize weak learners, dataset, prediction array, and learning parameters (η, T, error).
 Define the loss function for different sample-label combinations.
 For each iteration, randomly sample a weighted training subset.
 For each learner, train on the weighted data, focusing on harder (previously misclassified) cases; a short sketch of the weight setup and weighted sampling follows this list.
 Accumulate weighted predictions from all learners.
 Repeat for all iterations and record training error.
 Display final ensemble model and plot accuracy with moving average.
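
The first data-handling steps above can be made concrete with a small sketch of uniform weight initialization and a weighted draw of a training subset. The variable names (X, Y, w, k) and the 80% subset fraction are illustrative assumptions, not values taken from the experiment.

% Sketch: uniform sample weights and a weighted draw of a training subset
% (X and Y are assumed to be the feature matrix and label vector).
n = size(X, 1);                    % number of training samples
w = ones(n, 1) / n;                % equal importance for every sample at the start
k = round(0.8 * n);                % assumed size of the weighted subset
idx = randsample(n, k, true, w);   % sample with replacement, biased by the weights
Xsub = X(idx, :);                  % subset seen by the current weak learner
Ysub = Y(idx);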

THEORY

Boosting, a supervised ensemble learning technique, is implemented to simulate medical diagnosis prediction based on patient symptom data. The model operates on a dataset where each instance includes features and a diagnosis label: Positive or Negative. The ensemble combines multiple weak classifiers, simple models such as decision stumps, trained sequentially so that each new learner focuses on previously misclassified cases. An error function evaluates how well each learner classifies the data, and higher weights are assigned to incorrect predictions. Initially, all samples receive equal weights to reflect uniform importance across the dataset.

The boosting process runs for 1000 iterations; in each iteration, a weak learner is trained using the current weighted distribution of the data. After training, each learner's error is calculated and its weight is computed from its accuracy. Misclassified samples have their weights increased, encouraging the next learner to focus on them in subsequent rounds. This process reinforces correct predictions and mitigates individual model weaknesses.

Throughout the iterations, the cumulative training error is tracked to evaluate ensemble improvement. After all rounds, the final ensemble is presented, representing a strong classifier formed from many weak learners. A graph is also plotted showing the prediction accuracy per iteration and its moving average, highlighting steady performance enhancement over time. This experiment demonstrates how ensemble learning enables robust predictive modeling by leveraging the strengths of multiple weak learners and iteratively refining performance.

PROGRAM

% Read the dataset
data = readtable('/MATLAB Drive/mlt/network_traffic_data.csv/loan/loan_approval_dataset.csv');

% Remove 'loan_id' column if it exists
if ismember('loan_id', data.Properties.VariableNames)
    data.loan_id = [];
end

% Convert 'education' to categorical if it is a cell array
if iscell(data.education)
    data.education = categorical(data.education);
end
data.education = categorical(data.education);  % Ensure it's categorical

% Create dummy variables for 'education'
eduDummies = dummyvar(data.education);
eduNames = categories(data.education);

% Add dummy variables to the dataset
for i = 1:length(eduNames)
    varName = matlab.lang.makeValidName(['Education_' eduNames{i}]);
    data.(varName) = eduDummies(:, i);
end

% Remove original 'education' column
data.education = [];

% Convert 'self_employed' to categorical and then to binary (Yes = 1, No = 0)
if iscell(data.self_employed)
    data.self_employed = categorical(data.self_employed);
end
data.self_employed = double(data.self_employed == 'Yes');

% Create binary label for 'approved' based on 'cibil_score'
data.approved = double(data.cibil_score >= 700);

% Define core features
features = {'no_of_dependents', 'self_employed', 'income_annum', ...
    'loan_amount', 'loan_term', 'cibil_score'};

% Append the dummy variable names for education
eduCols = startsWith(data.Properties.VariableNames, 'Education_');
features = [features, data.Properties.VariableNames(eduCols)];

% Extract features (X) and labels (Y)
X = table2array(data(:, features));
Y = table2array(data(:, 'approved'));

% Normalize the features
X = normalize(X);

% Split the data into 70% training and 30% testing
cv = cvpartition(Y, 'HoldOut', 0.3);
XTrain = X(training(cv), :);
YTrain = Y(training(cv));
XTest = X(test(cv), :);
YTest = Y(test(cv));

% Train the SVM model with RBF kernel
SVMModel = fitcsvm(XTrain, YTrain, ...
    'KernelFunction', 'rbf', ...
    'Standardize', true, ...
    'ClassNames', [0 1]);

% Predict on the test set
YPred = predict(SVMModel, XTest);

% Generate confusion matrix
confMat = confusionmat(YTest, YPred);
TP = confMat(2,2);
FP = confMat(1,2);
FN = confMat(2,1);
TN = confMat(1,1);

% Compute evaluation metrics
precision = TP / (TP + FP);
recall = TP / (TP + FN);
f1 = 2 * (precision * recall) / (precision + recall);
accuracy = (TP + TN) / sum(confMat(:));

% Display results
fprintf('Accuracy: %.2f%%\n', accuracy * 100);
fprintf('Precision: %.2f\n', precision);
fprintf('Recall: %.2f\n', recall);
fprintf('F1-Score: %.2f\n', f1);

% Compute scores for ROC curve
[~, scores] = predict(SVMModel, XTest);
[Xroc, Yroc, ~, AUC] = perfcurve(YTest, scores(:,2), 1);

% Plot the ROC curve
figure;
plot(Xroc, Yroc, 'b-', 'LineWidth', 2);
xlabel('False Positive Rate');
ylabel('True Positive Rate');
title(['ROC Curve (AUC = ' num2str(AUC, '%.2f') ')']);
grid on;
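
The listing above trains an RBF-kernel SVM on the loan data; if a boosting ensemble itself is to be fitted on the same train/test split, MATLAB's fitcensemble with AdaBoostM1 and decision-stump learners is one possible substitution. This is a sketch only; the cycle count and learning rate below are assumed values, not taken from the original program.

% Possible boosting counterpart to the SVM above (sketch; parameter values assumed)
stump = templateTree('MaxNumSplits', 1);        % decision-stump weak learner
boostModel = fitcensemble(XTrain, YTrain, ...
    'Method', 'AdaBoostM1', ...                 % adaptive boosting for binary labels
    'Learners', stump, ...
    'NumLearningCycles', 100, ...
    'LearnRate', 0.1);
YPredBoost = predict(boostModel, XTest);        % ensemble predictions on the test set
boostAcc = mean(YPredBoost == YTest);           % simple holdout accuracy
fprintf('Boosted ensemble accuracy: %.2f%%\n', boostAcc * 100);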

FLOWCHART:

Figure 1: Flowchart

OUTPUT:

Figure 2: Output

RESULT:
Thus, the simulation of a boosting ensemble for a given dataset was completed
successfully using MATLAB software.

CORE COMPETENCY:
Thus, we successfully learned how to simulate a boosting ensemble for a given dataset using
MATLAB software.
MARKS ALLOCATION:

Details                                          Marks Allotted   Vinantika E A   Vishmitha M
Preparation                                      20
Conducting                                       20
Calculation / Graphs                             15
Results                                          10
Basic understanding (Core competency learned)    15
Viva                                             10
Record                                           10
Total                                            100

Signature of faculty
