EX NO: 09                 Simulate a boosting ensemble method for any dataset
DATE:
AIM:
        To simulate a boosting ensemble method for any dataset.
SOFTWARE REQUIRED:
       MATLAB
PROCEDURE
       Initialize the weak learners, dataset, prediction array, and learning parameters (η, T, error).
       Define the loss function for the different sample-label combinations.
       For each iteration, randomly sample a weighted training subset.
       For each learner, train on the weighted data, focusing on the harder cases.
       Accumulate the weighted predictions from all learners.
       Repeat for all iterations and record the training error.
       Display the final ensemble model and plot accuracy with a moving average (a minimal sketch of these steps follows).
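The weight-update loop described above can be illustrated with the following minimal MATLAB sketch. The synthetic dataset, decision-stump weak learner, and all variable names here are assumptions for demonstration only, not the recorded program.

% Minimal AdaBoost-style sketch on synthetic two-class data (illustrative;
% the data and names below are hypothetical).
rng(1);                                 % reproducibility
N = 200;                                % number of samples
X = randn(N, 2);                        % two features
Y = sign(X(:,1) + 0.5*X(:,2) + 0.2*randn(N,1));
Y(Y == 0) = 1;                          % labels in {-1, +1}
T = 50;                                 % boosting rounds
w = ones(N, 1) / N;                     % uniform initial sample weights
F = zeros(N, 1);                        % accumulated ensemble score
acc = zeros(T, 1);                      % training accuracy per round
for t = 1:T
  % Weak learner: decision stump chosen to minimize the weighted error
  bestErr = inf;
  for f = 1:size(X, 2)
    for thr = unique(X(:, f))'
      for pol = [-1 1]
        h = pol * sign(X(:, f) - thr);
        h(h == 0) = 1;
        err = sum(w .* (h ~= Y));
        if err < bestErr
          bestErr = err;
          bestH = h;
        end
      end
    end
  end
  alpha = 0.5 * log((1 - bestErr) / max(bestErr, eps)); % learner weight
  w = w .* exp(-alpha * Y .* bestH);    % up-weight misclassified samples
  w = w / sum(w);                       % renormalize to a distribution
  F = F + alpha * bestH;                % accumulate weighted predictions
  acc(t) = mean(sign(F) == Y);          % record ensemble training accuracy
end
% Plot per-round accuracy with its moving average
figure;
plot(1:T, acc, 'b.-'); hold on;
plot(1:T, movmean(acc, 5), 'r-', 'LineWidth', 2);
xlabel('Iteration'); ylabel('Training accuracy');
legend('Per-round', 'Moving average'); grid on;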
THEORY
Boosting, a supervised ensemble learning technique, is implemented here to simulate medical
diagnosis prediction based on patient symptom data. The model operates on a dataset where
each instance includes features and a diagnosis label: Positive or Negative. The ensemble
combines multiple weak classifiers (simple models such as decision stumps) trained
sequentially to focus on misclassified cases. An error function evaluates how well each
learner classifies the data, with higher weights assigned to incorrect predictions. Initially, all
samples are assigned equal weights to reflect uniform importance across the dataset. The
boosting process runs for 1000 iterations; in each iteration, a weak learner is trained
using the current weighted distribution of the data. After training, each learner's error is
calculated, and its weight is computed from its accuracy. Misclassified samples have their
weights increased, encouraging the next learner to focus on them in subsequent rounds.
This process reinforces correct predictions and mitigates individual model
weaknesses. Throughout the iterations, the cumulative training error is tracked to evaluate
ensemble improvement. After all rounds, the final ensemble is presented, representing a
strong classifier formed from many weak learners.
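For reference, the error and weight updates described above match the standard AdaBoost rules (a sketch of the mathematics, assuming labels \(y_i \in \{-1, +1\}\), weak learners \(h_t\), and sample weights \(w_i\)):

\[
\varepsilon_t = \sum_{i} w_i \,\mathbf{1}\{h_t(x_i) \neq y_i\}, \qquad
\alpha_t = \tfrac{1}{2}\ln\frac{1-\varepsilon_t}{\varepsilon_t},
\]
\[
w_i \leftarrow \frac{w_i\, e^{-\alpha_t y_i h_t(x_i)}}{Z_t}, \qquad
H(x) = \operatorname{sign}\Big(\sum_{t=1}^{T} \alpha_t h_t(x)\Big),
\]

where \(\varepsilon_t\) is the weighted error of learner \(h_t\), \(\alpha_t\) is its vote weight, and \(Z_t\) renormalizes the sample weights to a distribution.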
A graph is also plotted showing the prediction accuracy per iteration and its moving average,
highlighting steady performance improvement over time. This experiment demonstrates how
ensemble learning enables robust predictive modeling by leveraging the strengths of multiple
weak learners and iteratively refining performance.
PROGRAM
% Read the dataset
data = readtable('/MATLAB Drive/mlt/loan/loan_approval_dataset.csv');
% Remove 'loan_id' column if it exists
if ismember('loan_id', data.Properties.VariableNames)
  data.loan_id = [];
end
% Convert 'education' to categorical if it is a cell array
if iscell(data.education)
  data.education = categorical(data.education);
end
data.education = categorical(data.education); % Ensure it's categorical
% Create dummy variables for 'education'
eduDummies = dummyvar(data.education);
eduNames = categories(data.education);
% Add dummy variables to the dataset
for i = 1:length(eduNames)
  varName = matlab.lang.makeValidName(['Education_' eduNames{i}]);
  data.(varName) = eduDummies(:, i);
end
% Remove the original 'education' column
data.education = [];
% Convert 'self_employed' to categorical and then to binary (Yes = 1, No = 0)
if iscell(data.self_employed)
  data.self_employed = categorical(data.self_employed);
end
data.self_employed = double(data.self_employed == 'Yes');
% Create binary label for 'approved' based on 'cibil_score'
data.approved = double(data.cibil_score >= 700);
% Define core features
features = {'no_of_dependents', 'self_employed', 'income_annum', ...
        'loan_amount', 'loan_term', 'cibil_score'};
% Append the dummy variable names for education
eduCols = startsWith(data.Properties.VariableNames, 'Education_');
features = [features, data.Properties.VariableNames(eduCols)];
% Extract features (X) and labels (Y)
X = table2array(data(:, features));
Y = table2array(data(:, 'approved'));
% Normalize the features
X = normalize(X);
% Split the data into 70% training and 30% testing
cv = cvpartition(Y, 'HoldOut', 0.3);
XTrain = X(training(cv), :);
YTrain = Y(training(cv));
XTest = X(test(cv), :);
YTest = Y(test(cv));
% Train the SVM model with RBF kernel
SVMModel = fitcsvm(XTrain, YTrain, ...
  'KernelFunction', 'rbf', ...
  'Standardize', true, ...
  'ClassNames', [0 1]);
% Predict on the test set
YPred = predict(SVMModel, XTest);
% Generate confusion matrix
confMat = confusionmat(YTest, YPred);
TP = confMat(2,2);
FP = confMat(1,2);
FN = confMat(2,1);
TN = confMat(1,1);
% Compute evaluation metrics
precision = TP / (TP + FP);
recall = TP / (TP + FN);
f1 = 2 * (precision * recall) / (precision + recall);
accuracy = (TP + TN) / sum(confMat(:));
% Display results
fprintf('Accuracy: %.2f%%\n', accuracy * 100);
fprintf('Precision: %.2f\n', precision);
fprintf('Recall: %.2f\n', recall);
fprintf('F1-Score: %.2f\n', f1);
% Compute scores for ROC curve
[~, scores] = predict(SVMModel, XTest);
[Xroc, Yroc, ~, AUC] = perfcurve(YTest, scores(:,2), 1);
% Plot the ROC curve
figure;
plot(Xroc, Yroc, 'b-', 'LineWidth', 2);
xlabel('False Positive Rate');
ylabel('True Positive Rate');
title(['ROC Curve (AUC = ' num2str(AUC, '%.2f') ')']);
grid on;
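% The listing above trains an SVM; as a hedged sketch of the boosting method
% named in the AIM, the same train/test split could instead feed an
% AdaBoostM1 ensemble of decision stumps (XTrain, YTrain, XTest, YTest are
% reused from the split above).
stump = templateTree('MaxNumSplits', 1);  % decision-stump weak learner
BoostModel = fitcensemble(XTrain, YTrain, ...
  'Method', 'AdaBoostM1', ...             % adaptive boosting
  'NumLearningCycles', 100, ...           % boosting rounds
  'Learners', stump, ...
  'LearnRate', 0.1);
YPredBoost = predict(BoostModel, XTest);
fprintf('Boosted ensemble accuracy: %.2f%%\n', 100 * mean(YPredBoost == YTest));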
FLOWCHART:
                                     Figure 1: Flowchart
Output:
                           Figure 2: Output plot
RESULT:
    Thus, the simulation of a boosting ensemble for a given dataset was completed
successfully using MATLAB software.
CORE COMPETENCY:
    Thus, successfully learned how to simulate a boosting ensemble for a given dataset using
MATLAB software.
MARKS ALLOCATION:
Details                                         Marks Allotted   Vinantika E A   Vishmitha M
Preparation                                           20
Conducting                                            20
Calculation / Graphs                                  15
Results                                               10
Basic understanding (Core competency learned)         15
Viva                                                  10
Record                                                10
Total                                                100
                                                         Signature of faculty