Unit 3
1. What are the different ways the multiple base-learners are combined to
   generate the final output?
   1) Multiexpert combination methods have base-learners that work in
      parallel. These methods can in turn be divided into two:
      - In the global approach, also called learner fusion, given an input,
        all base-learners generate an output and all these outputs are
        used. Examples are voting and stacking.
      - In the local approach, or learner selection, for example, in
        mixture of experts, there is a gating model, which looks at the
        input and chooses one (or very few) of the learners as responsible
        for generating the output. (A minimal sketch contrasting the two
        approaches follows after this answer.)
   2) Multistage combination methods use a serial approach where the next
      combination base-learner is trained with or tested on only the
      instances where the previous base-learners are not accurate enough.
      The idea is that the base-learners (or the different representations
      they use) are sorted in increasing complexity so that a complex
      base-learner is not used (or its complex representation is not
      extracted) unless the preceding simpler base-learners are not
      sufficiently confident.
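   To make the distinction concrete, here is a minimal sketch (not from the
   source) contrasting fusion and selection with three hypothetical toy
   base-learners on a 1-D input; the gating rule is an illustrative
   hand-written threshold, not a trained model.

```python
import numpy as np

def learner_a(x): return 2.0 * x          # hypothetical base-learner 1
def learner_b(x): return x + 1.0          # hypothetical base-learner 2
def learner_c(x): return 0.5 * x - 1.0    # hypothetical base-learner 3

learners = [learner_a, learner_b, learner_c]
x = 3.0

# Global approach (learner fusion): every learner fires and all outputs
# are combined, here by simple averaging.
fused = np.mean([f(x) for f in learners])

# Local approach (learner selection): a gating rule inspects the input and
# picks one learner as responsible for this region of the input space.
def gate(x):
    return 0 if x < 0 else (1 if x < 5 else 2)

selected = learners[gate(x)](x)

print(f"fusion output: {fused:.2f}, selection output: {selected:.2f}")
```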
2. What is a Voting Ensemble? How does it work?
   Voting ensembles are machine learning algorithms that fall under the
   ensemble techniques. As ensemble algorithms, they use multiple models to
   train on the dataset and to make predictions.
   There are two categories of voting ensembles:
      Classification
      Regression
   Voting Classifiers are the ensembles used in classification tasks in
   machine learning. In Voting Classifiers, multiple models of different
   machine learning algorithms are present, to which the whole dataset is
   fed, and every algorithm predicts once trained on the data. Once all the
   models have predicted on the sample data, the most-frequent strategy is
   used to get the final prediction: the category predicted most often by
   the multiple algorithms is treated as the final prediction of the model.
   For example, if three models predict YES and two models predict NO, YES
   would be considered the final prediction of the model.
   Voting Regressors are the same as voting classifiers, but they are used
   on regression problems, and the final output from this model is the mean
   of the predictions of all individual models. For example, if the outputs
   from the three models are 5, 10, and 15, then the final result would be
   the mean of these values, which is 10.
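   As a concrete illustration of hard (majority) voting, here is a minimal
   sketch assuming a recent scikit-learn; the dataset and the three base
   models are illustrative choices, not from the source.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

# voting="hard" takes the most frequent predicted class; voting="soft"
# averages predicted probabilities instead. The regression analogue
# (VotingRegressor) averages the numeric outputs.
clf = VotingClassifier(
    estimators=[
        ("lr", LogisticRegression(max_iter=1000)),
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("knn", KNeighborsClassifier()),
    ],
    voting="hard",
)
clf.fit(X_train, y_train)
print("test accuracy:", clf.score(X_test, y_test))
```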
3. Explain the concept of ensemble learning. (Anna University, May/June
   2019)
   Ensemble learning is a machine learning paradigm where multiple models,
   often called base learners, are trained and combined to solve the same
   problem. The main goal is to improve the overall performance,
   robustness, and generalizability of the model by leveraging the
   strengths of individual learners and compensating for their weaknesses.
4. Describe the bagging technique and its advantages.
   (Anna University, Nov/Dec 2018)
            Bagging, or Bootstrap Aggregating, is an ensemble method that
      involves training multiple models on different subsets of the training data
      created through bootstrap sampling. The predictions of these models are
      then combined, typically by averaging for regression or voting for
      classification. Bagging reduces variance and helps prevent overfitting,
      leading to more stable and reliable predictions.
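   A minimal bagging sketch, assuming a recent scikit-learn; the dataset
   and the base estimator are illustrative. Each of the 50 trees is fit on
   its own bootstrap sample of the training set.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import BaggingClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

bag = BaggingClassifier(
    estimator=DecisionTreeClassifier(),  # high-variance learners benefit most
    n_estimators=50,
    bootstrap=True,   # sample training instances with replacement
    random_state=0,
)
bag.fit(X_train, y_train)
print("test accuracy:", bag.score(X_test, y_test))
```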
5. What is boosting and how does it improve model performance?
   (Anna University, May/June 2020)
            Boosting is an ensemble technique that sequentially trains models,
      with each new model focusing on correcting the errors made by the
         previous ones. This method combines the strengths of each model to
         improve overall performance, particularly on difficult-to-classify
         instances. Boosting can reduce both bias and variance, leading to better
         generalization.
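   A minimal boosting sketch using AdaBoost, assuming a recent
   scikit-learn; the weak learner here is a depth-1 decision tree (a
   "stump"), an illustrative but common choice.

```python
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = load_breast_cancer(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Each boosting round re-weights the training instances so the next stump
# concentrates on the examples the previous stumps misclassified.
boost = AdaBoostClassifier(
    estimator=DecisionTreeClassifier(max_depth=1),
    n_estimators=100,
    random_state=0,
)
boost.fit(X_train, y_train)
print("test accuracy:", boost.score(X_test, y_test))
```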
6. Explain the concept of stacking in ensemble learning. (Anna University,
   Nov/Dec 2019)
   Stacking involves training multiple base models and then using their
   predictions as inputs to a meta-model, which learns to make the final
   prediction. The meta-model effectively combines the outputs of the base
   models, potentially improving the overall performance by leveraging the
   strengths of each base model.
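   A minimal stacking sketch, assuming a recent scikit-learn; the base
   models and the meta-model are illustrative choices.

```python
from sklearn.datasets import load_iris
from sklearn.ensemble import StackingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

stack = StackingClassifier(
    estimators=[
        ("svm", SVC(probability=True, random_state=0)),
        ("dt", DecisionTreeClassifier(random_state=0)),
    ],
    final_estimator=LogisticRegression(),  # meta-model over base outputs
    cv=5,  # meta-model is trained on cross-validated base predictions
)
stack.fit(X_train, y_train)
print("test accuracy:", stack.score(X_test, y_test))
```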
7. What is stacked generalization?
   Stacked generalization is a technique proposed by Wolpert (1992) that
   extends voting in that the way the output of the base-learners is
   combined need not be linear but is learned through a combiner system,
   f(·|Φ), which is another learner whose parameters Φ are also trained.
   The combiner learns what the correct output is when the base-learners
   give a certain output combination. We cannot train the combiner function
   on the training data because the base-learners may be memorizing the
   training set; the combiner system should actually learn how the
   base-learners make errors.
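   The following sketch builds Wolpert-style stacking "by hand" with
   scikit-learn, to show why the combiner is trained on out-of-fold
   base-learner outputs rather than on predictions over the raw training
   set (which the base-learners may have memorized); the models and data
   are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_predict
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)
base_learners = [DecisionTreeClassifier(random_state=0), KNeighborsClassifier()]

# Out-of-fold probabilities: each row is predicted by a model that never
# saw that row during fitting, so the combiner observes realistic errors.
meta_features = np.hstack([
    cross_val_predict(m, X, y, cv=5, method="predict_proba")
    for m in base_learners
])

combiner = LogisticRegression(max_iter=1000)  # the learned f(.|Phi)
combiner.fit(meta_features, y)
print("combiner input shape:", meta_features.shape)
```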
8. Difference between Bagging and Boosting
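   Bagging trains its base-learners independently and in parallel, each on
   a bootstrap sample drawn with replacement from the training set; all
   instances are weighted equally, predictions are combined by simple
   voting or averaging, and the main effect is variance reduction.
   Boosting trains its base-learners sequentially, with each new learner
   concentrating on the instances the previous learners got wrong;
   instances are re-weighted according to error, predictions are combined
   by weighted voting, and it can reduce bias as well as variance, though
   it is more sensitive to noisy data and outliers.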
Unsupervised Learning
9. What is the K-means clustering algorithm and how does it work? (Anna
   University, May/June 2018)
           The K-means clustering algorithm partitions a dataset into K
     clusters, where each data point belongs to the cluster with the nearest
     mean. The algorithm iteratively assigns data points to clusters based on
     the distance to the current cluster means, then updates the means based on
     the assigned points, until convergence.
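   A minimal NumPy sketch of the two alternating K-means steps on toy
   two-cluster data (illustrative, and it does not handle empty clusters);
   scikit-learn's KMeans does the same with better k-means++ initialization.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
K = 2
means = X[rng.choice(len(X), K, replace=False)]  # random initial centers

for _ in range(100):
    # Assignment step: each point joins the cluster with the nearest mean.
    labels = np.argmin(np.linalg.norm(X[:, None] - means[None], axis=2), axis=1)
    # Update step: each mean moves to the centroid of its assigned points.
    new_means = np.array([X[labels == k].mean(axis=0) for k in range(K)])
    if np.allclose(new_means, means):  # convergence: means stop moving
        break
    means = new_means

print("cluster means:\n", means)
```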
Instance-Based Learning
10. Describe the working of the K-Nearest Neighbors (KNN) algorithm.
    (Anna University, Nov/Dec 2017)
           The KNN algorithm classifies a data point based on the majority
     class among its K nearest neighbors in the feature space. The distance
     between points is typically measured using metrics such as Euclidean
     distance. KNN is a non-parametric, lazy learning algorithm that is simple
     and effective for many applications.
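    A minimal KNN sketch, assuming a recent scikit-learn; k = 5 and the
    default Euclidean metric are illustrative choices. Features are scaled
    first because KNN is distance-based.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=42)

knn = make_pipeline(
    StandardScaler(),                     # put features on comparable scales
    KNeighborsClassifier(n_neighbors=5),  # majority vote of 5 nearest points
)
knn.fit(X_train, y_train)
print("test accuracy:", knn.score(X_test, y_test))
```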
Gaussian Mixture Models and Expectation Maximization
11. Explain the Gaussian Mixture Model (GMM) and its applications. (Anna
    University, May/June 2021)
    A Gaussian Mixture Model is a probabilistic model that represents a
    distribution of data as a mixture of multiple Gaussian distributions.
    Each
     Gaussian component is characterized by its mean and covariance. GMMs
     are commonly used for clustering, density estimation, and anomaly
     detection.
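    A minimal GMM clustering sketch on toy 2-D data, assuming a recent
    scikit-learn; the model is fit by EM internally, and unlike K-means its
    assignments are soft (per-component probabilities).

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (100, 2)), rng.normal(4, 1.5, (100, 2))])

gmm = GaussianMixture(n_components=2, covariance_type="full", random_state=0)
gmm.fit(X)

print("means:\n", gmm.means_)
print("soft assignment of first point:", gmm.predict_proba(X[:1]).round(3))
```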
12. What is the Expectation-Maximization (EM) algorithm and how is it used
    in GMMs? (Anna University, Nov/Dec 2019)
    The Expectation-Maximization (EM) algorithm is used to estimate the
    parameters of models with latent variables, such as GMMs. It
    iteratively performs two steps: the Expectation step, which assigns
    probabilities to data points for each Gaussian component based on the
    current parameter estimates, and the Maximization step, which updates
    the parameters to maximize the likelihood of the data given these
    assignments. This process continues until convergence.
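    A compact NumPy sketch of EM for a two-component 1-D Gaussian mixture,
    mirroring the E and M steps described above; the data and fixed
    iteration count are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.concatenate([rng.normal(-2, 1, 200), rng.normal(3, 1, 200)])

# Initial guesses for mixing weights, means, and standard deviations.
pi = np.array([0.5, 0.5])
mu = np.array([-1.0, 1.0])
sigma = np.array([1.0, 1.0])

for _ in range(50):
    # E-step: responsibility of each component for each point under the
    # current parameter estimates (weighted Gaussian densities, normalized).
    dens = pi * np.exp(-0.5 * ((x[:, None] - mu) / sigma) ** 2) \
        / (sigma * np.sqrt(2 * np.pi))                  # shape (n, 2)
    resp = dens / dens.sum(axis=1, keepdims=True)
    # M-step: re-estimate parameters to maximize the expected log-likelihood.
    n_k = resp.sum(axis=0)
    pi = n_k / len(x)
    mu = (resp * x[:, None]).sum(axis=0) / n_k
    sigma = np.sqrt((resp * (x[:, None] - mu) ** 2).sum(axis=0) / n_k)

print("weights:", pi.round(2), "means:", mu.round(2), "stds:", sigma.round(2))
```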
PART B & C
Combining Multiple Learners and Ensemble Learning
1. Explain the concept of model combination schemes and discuss their
   importance in machine learning.
2. What is voting in ensemble learning? Differentiate between majority
   voting and weighted voting with examples.
3. Describe the bagging technique in ensemble learning. Explain its
   advantages and limitations.
4. Discuss the boosting algorithm. Explain how it differs from bagging in
   terms of methodology and applications.
5. What is stacking in ensemble learning? Illustrate its working with a
   practical example.
6. Compare bagging, boosting, and stacking in terms of implementation,
   advantages, and use cases.
7. Explain the role of weak learners in boosting. Provide an example to
   show how boosting combines them.
8. Analyze the impact of overfitting in ensemble methods and how it is
   addressed by techniques like bagging and boosting.
Unsupervised Learning
9. Explain the K-means clustering algorithm with an example. Discuss its
   limitations and solutions.
10. Discuss the initialization problem in K-means clustering and explain
    the K-means++ initialization technique.
11. Explain the working of Gaussian Mixture Models (GMM) for clustering.
    Compare GMM with K-means.
12. Describe the Expectation-Maximization (EM) algorithm used in Gaussian
    Mixture Models.
13. Discuss the advantages and disadvantages of using GMMs over K-means
    for unsupervised learning tasks.
14. Explain the criteria for choosing the number of clusters in clustering
    algorithms like K-means and GMMs.
15. Describe the elbow method and silhouette analysis for determining the
    optimal number of clusters.
Instance-Based Learning
16. What is instance-based learning? Discuss the working of the K-Nearest
    Neighbors (KNN) algorithm.
17. Explain how the distance metric affects the performance of KNN.
    Compare Euclidean, Manhattan, and Minkowski distances.
18. Discuss the effect of the value of k on the performance of the KNN
    algorithm. How can we optimize k?
19. Compare and contrast K-means clustering and KNN in terms of
    methodology and applications.
20. Explain the role of feature scaling in KNN and its impact on the
    algorithm's performance.