
UNIT-4

PART-1
Dimensionality Reduction
Dimensionality reduction is a technique in machine learning used to reduce the
number of input features (dimensions) while preserving as much information as
possible. It simplifies models, speeds up training, reduces overfitting, and makes
data visualization easier.
Why Dimensionality Reduction?
1. Curse of Dimensionality – As dimensions increase, data becomes sparse,
making models less effective.
2. Overfitting – Too many features can lead to models that fit noise instead of
patterns.
3. Computational Efficiency – Fewer features mean faster computation and
training.
4. Visualization – Reducing data to 2D or 3D allows for better visualization of
patterns and clusters.
Techniques for Dimensionality Reduction:
1. Feature Selection (Subset Selection)
 Selects a subset of the most important features.
 Methods:
o Filter Methods – Correlation, Chi-square, Mutual Information
o Wrapper Methods – Recursive Feature Elimination (RFE)
o Embedded Methods – Lasso Regression (L1 regularization)
2. Feature Extraction (Projection-based)
 Transforms existing features into a lower-dimensional space.
 Key Methods:
o Principal Component Analysis (PCA)
 Projects data onto orthogonal axes capturing maximum
variance.
o Linear Discriminant Analysis (LDA)
 Focuses on maximizing class separability.
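A minimal scikit-learn sketch of the filter and wrapper selection methods
listed above (the dataset, the value of k, and the estimator are illustrative
choices, not part of these notes):

# Filter and wrapper feature selection with scikit-learn (illustrative).
from sklearn.datasets import load_iris
from sklearn.feature_selection import SelectKBest, chi2, RFE
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)          # 4 features, 3 classes

# Filter method: keep the 2 features with the highest chi-square scores.
X_filter = SelectKBest(chi2, k=2).fit_transform(X, y)

# Wrapper method: recursively eliminate features using a model's coefficients.
rfe = RFE(LogisticRegression(max_iter=1000), n_features_to_select=2)
X_wrapper = rfe.fit_transform(X, y)

print(X_filter.shape, X_wrapper.shape)     # (150, 2) (150, 2)
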
Applications:
 Public Health – Analyzing large-scale epidemiological datasets.
 Genomics – Reducing the number of gene expressions for classification.
 Image Processing – Compressing image data while preserving critical
information.
 Natural Language Processing (NLP) – Reducing sparse word embeddings.

Linear Discriminant Analysis:


Linear Discriminant Analysis (LDA) is a dimensionality reduction technique used
for classification tasks. Unlike PCA, which focuses on maximizing variance, LDA
seeks to find a feature space that maximizes class separability.
How LDA Works
LDA projects data onto a lower-dimensional space where the classes are as
distinguishable as possible by maximizing the between-class variance while
minimizing the within-class variance.
Steps in LDA
1. Compute the Mean Vectors – Calculate the mean for each class.
2. Scatter Matrices –
o Within-Class Scatter – Measures how spread out data is within each
class.
o Between-Class Scatter – Measures how far the means of different
classes are from each other.
3. Solve the Eigenvalue Problem – Calculate eigenvectors and eigenvalues to
find the optimal projection directions.
4. Select the Top K Eigenvectors – Choose the eigenvectors corresponding to
the largest eigenvalues to reduce dimensionality.
5. Transform the Data – Project data onto the new lower-dimensional space.
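A minimal scikit-learn sketch of these steps (the library computes the
scatter matrices and solves the eigenvalue problem internally; the dataset
and parameter values are illustrative):

# LDA for dimensionality reduction with scikit-learn (illustrative sketch).
from sklearn.datasets import load_iris
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

X, y = load_iris(return_X_y=True)          # 4 features, 3 classes

# n_components can be at most (number of classes - 1), i.e. 2 here.
lda = LinearDiscriminantAnalysis(n_components=2)
X_lda = lda.fit_transform(X, y)            # supervised: uses the class labels

print(X_lda.shape)                         # (150, 2)
print(lda.explained_variance_ratio_)       # discriminative power per axis
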
Applications
 Public Health – Classifying patient outcomes (e.g., disease vs. no disease).
 Image Recognition – Facial recognition systems.
 Bioinformatics – Classifying gene expressions by disease.

Principal Component Analysis:


Principal Component Analysis (PCA) is an unsupervised dimensionality
reduction technique that transforms data into a new coordinate system where
the greatest variance lies along the first principal component, the second
greatest variance along the second component, and so on.
Why Use PCA?
 Reduce Dimensionality – Simplifies datasets by reducing the number of
features.
 Visualization – Allows high-dimensional data to be visualized in 2D or 3D.
 Remove Noise and Redundancy – Focuses on the most informative
features.
 Speed Up Algorithms – Reduces computational complexity for large
datasets.
How PCA Works
1. Standardize the Data – Scale the data to have zero mean and unit variance.
2. Compute the Covariance Matrix – Understand feature relationships.
3. Calculate Eigenvectors and Eigenvalues – Derive principal components
from the covariance matrix.
4. Rank Components – Sort eigenvectors by descending eigenvalues.
5. Select Top K Components – Choose the top K components to retain most of
the variance.
6. Transform the Data – Project original data onto the selected principal
components.
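A minimal scikit-learn sketch mirroring these steps (standardize, fit,
inspect the retained variance, project); the dataset and the choice of K = 2
are illustrative, not from these notes:

# PCA with scikit-learn (illustrative sketch).
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_iris(return_X_y=True)

# 1. Standardize the data (zero mean, unit variance).
X_std = StandardScaler().fit_transform(X)

# 2-5. PCA computes the covariance matrix, its eigenvectors and eigenvalues,
#      ranks them, and keeps the top K components.
pca = PCA(n_components=2)
X_pca = pca.fit_transform(X_std)           # 6. project onto the components

print(X_pca.shape)                         # (150, 2)
print(pca.explained_variance_ratio_)       # variance retained per component
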
Applications of PCA
 Public Health – Epidemiological data analysis to identify risk factors.
 Genomics – Reducing high-dimensional gene expression data.
 Image Compression – Extracting essential features from image data.
 Finance – Identifying principal factors in stock market data.
Factor Analysis:
Factor Analysis (FA) is a dimensionality reduction technique used to identify
latent variables (factors) that explain the observed correlations among
features. Unlike PCA, which focuses on maximizing variance, Factor Analysis
models the underlying structure in the data by assuming that the observed
variables are influenced by fewer unobserved factors.
Why Use Factor Analysis?
 Identify Hidden Patterns – Reveals latent constructs that drive observed
data.
 Reduce Noise – Separates noise from shared variance.
 Simplify Interpretation – Groups correlated features into fewer factors.
 Handle Multicollinearity – Useful for datasets with correlated features.

How Factor Analysis Works
1. Compute the Correlation Matrix – Examine the correlations among the
observed variables.
2. Estimate Factor Loadings – Calculate how much each observed variable
loads onto each factor.
3. Extract Factors – Identify the smallest number of factors that explain the
majority of variance.
4. Rotate Factors – Apply rotation (Varimax, Promax) to improve
interpretability by simplifying loadings.
5. Interpret Results – Examine factor loadings to determine which features
contribute to each factor.
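A minimal scikit-learn sketch of these steps (the number of factors and the
varimax rotation are illustrative choices, and the rotation option assumes a
reasonably recent scikit-learn version):

# Factor Analysis with scikit-learn (illustrative sketch).
from sklearn.datasets import load_iris
from sklearn.decomposition import FactorAnalysis

X, _ = load_iris(return_X_y=True)

fa = FactorAnalysis(n_components=2, rotation="varimax", random_state=0)
X_fa = fa.fit_transform(X)      # factor scores for each sample

print(X_fa.shape)               # (150, 2)
print(fa.components_)           # factor loadings (one row per factor)
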
Types of Factor Analysis
1. Exploratory Factor Analysis (EFA):
o Unsupervised, used to discover latent factors without prior
assumptions.
o Suitable when the factor structure is unknown.
2. Confirmatory Factor Analysis (CFA):
o Tests a predefined factor model to confirm hypothesized
relationships between variables.
o Used in hypothesis-driven research.
Applications of Factor Analysis
 Public Health – Identify latent factors influencing health outcomes (e.g.,
socioeconomic status, lifestyle).
 Psychology – Develop and validate psychological scales.
 Market Research – Group customer preferences into underlying segments.
 Finance – Identify factors driving stock market returns.

Independent Component Analysis:


Independent Component Analysis (ICA) is a dimensionality reduction
technique used to separate a multivariate signal into independent, non-
Gaussian components. Unlike PCA, which focuses on maximizing variance, ICA
aims to identify hidden sources that generate the observed data by
minimizing statistical dependence between components.
Why Use ICA?
 Blind Source Separation (BSS) – Separates mixed signals (e.g., separating
different audio sources from a single recording).
 Noise Reduction – Identifies and removes noise as an independent source.
 Feature Extraction – Useful for finding hidden patterns in data.
How ICA Works:

1. Mixing Model – The observed data X are assumed to be a linear mixture of
unknown, independent source signals S, written as X = AS, where A is the
unknown mixing matrix.
2. Estimate the Unmixing Matrix – Find a matrix W such that the recovered
components S = WX are as statistically independent as possible.
3. Assumptions in ICA:
o The source signals S are statistically independent.
o The source signals have non-Gaussian distributions.
o The mixing process is linear.
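A minimal blind source separation sketch with FastICA from scikit-learn: two
synthetic non-Gaussian sources are mixed linearly (X = AS) and then unmixed.
The signals and the mixing matrix are illustrative:

# Blind source separation with FastICA (illustrative sketch).
import numpy as np
from sklearn.decomposition import FastICA

t = np.linspace(0, 8, 2000)

# Two independent, non-Gaussian sources: a sine wave and a square-like wave.
s1 = np.sin(2 * t)
s2 = np.sign(np.sin(3 * t))
S = np.c_[s1, s2]

A = np.array([[1.0, 0.5], [0.4, 1.0]])   # mixing matrix (unknown in practice)
X = S @ A.T                              # observed mixed signals

ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)             # recovered independent components

print(S_est.shape)                       # (2000, 2)
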
Applications of ICA
 Biomedical Signal Processing – EEG and fMRI data analysis (separating
brain activity signals).
 Audio Processing – "Cocktail Party Problem" – Separating multiple
overlapping audio signals.
 Image Processing – Facial recognition and feature extraction.
 Finance – Identifying independent factors driving market data.
Example Use Case in Public Health
 EEG Data Analysis – ICA is used to analyze brain activity in public health
research to detect abnormal patterns associated with neurological
conditions.
 Epidemiological Data – Identify independent factors contributing to disease
spread.
Locally Linear Embedding:
Locally Linear Embedding (LLE) is a non-linear dimensionality reduction
technique that preserves the local structure of high-dimensional data by
mapping it to a lower-dimensional space. Unlike PCA or LDA, LLE focuses on
maintaining local relationships between neighboring data points rather than
global variance or class separability.
Why Use LLE?
 Non-Linear Data – Suitable for data lying on a non-linear manifold (e.g.,
curved surfaces).
 Preserves Local Geometry – Captures the local neighborhood structure of
data points.
 High-Dimensional Data – Effective for reducing dimensions in data with
complex, non-linear distributions.
Key Parameters in LLE
 Number of Neighbors (k): Controls the locality by specifying how many
neighbors to consider.
 Dimensionality (d): The target dimensionality of the embedded space.
Applications of LLE
 Public Health – Dimensionality reduction in epidemiological data to identify
hidden patterns.
 Genomics – Visualizing non-linear gene expression patterns.
 Image Processing – Uncovering low-dimensional representations of high-
dimensional image data.
 Finance – Modeling non-linear relationships in financial time series.
Example Use Case in Public Health
 Patient Clustering – LLE can group patients with similar symptoms or
conditions based on complex non-linear patterns in health data.
 Disease Spread Modeling – Visualizing the spread of diseases over regions
where data relationships are non-linear.
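A minimal LLE sketch with scikit-learn on the classic Swiss roll manifold; k
(n_neighbors) and d (n_components) are the parameters described above, with
illustrative values:

# Locally Linear Embedding with scikit-learn (illustrative sketch).
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import LocallyLinearEmbedding

X, _ = make_swiss_roll(n_samples=1000, random_state=0)   # 3-D curved manifold

lle = LocallyLinearEmbedding(n_neighbors=12, n_components=2, random_state=0)
X_lle = lle.fit_transform(X)

print(X_lle.shape)                  # (1000, 2)
print(lle.reconstruction_error_)    # how well local neighborhoods are preserved
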

Isomap (Isometric Mapping):


Isomap is a non-linear dimensionality reduction technique that extends
classical Multidimensional Scaling (MDS) by incorporating geodesic distances
to capture the intrinsic geometry of high-dimensional data. It is particularly
effective for data that lies on a non-linear manifold.
Why Use Isomap?
 Non-Linear Dimensionality Reduction – Suitable for complex, curved data
structures.
 Preserves Global Geometry – Maintains the global structure by
approximating geodesic distances.
 High-Dimensional Visualization – Helps project data onto lower
dimensions for visualization while retaining meaningful relationships.
How Isomap Works
1. Construct Neighborhood Graph
o Connect each data point to its k-nearest neighbors or points within a
certain radius.
o Form a weighted graph where edges represent Euclidean distances
between neighboring points.
2. Compute Geodesic Distances
o Use Dijkstra's or Floyd-Warshall algorithm to compute the shortest
path (geodesic) between all pairs of points in the graph.
o This approximates the true manifold distance.
3. Apply Classical MDS
o Perform classical Multidimensional Scaling (MDS) on the geodesic
distance matrix to find a lower-dimensional embedding that
preserves the pairwise geodesic distances.
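A minimal scikit-learn sketch of these three steps (the library builds the
neighborhood graph, computes shortest-path geodesic distances, and applies
classical MDS internally; parameter values are illustrative):

# Isomap with scikit-learn (illustrative sketch).
from sklearn.datasets import make_swiss_roll
from sklearn.manifold import Isomap

X, _ = make_swiss_roll(n_samples=1000, random_state=0)   # 3-D curved manifold

iso = Isomap(n_neighbors=10, n_components=2)
X_iso = iso.fit_transform(X)

print(X_iso.shape)                  # (1000, 2)
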
Key Parameters in Isomap
 Number of Neighbors (k): Controls locality by defining how many neighbors
to connect to each point.
 Output Dimensions (d): The target dimensionality for embedding.
Applications of Isomap
 Public Health – Reducing dimensionality in large epidemiological datasets
to detect non-linear patterns.
 Genomics – Uncovering the intrinsic structure of gene expression data.
 Image Analysis – Dimensionality reduction in face recognition and object
classification tasks.
 Natural Language Processing (NLP) – Visualizing word embeddings in
reduced space.
Example in Public Health
 Disease Progression Modeling – Isomap can help visualize how diseases
evolve over time by embedding patient data into a 2D space.
 Patient Clustering – Grouping patients with similar conditions while
capturing non-linear health indicators.
Least Squares Optimization:
Least Squares Optimization is a fundamental technique used to minimize the
sum of squared differences between predicted and actual values. It is widely
applied in regression models, curve fitting, and parameter estimation.
Why Use Least Squares?
 Simplicity and Efficiency – Computationally efficient and easy to
implement.
 Closed-Form Solution – In many cases, least squares provides an analytical
solution without iterative optimization.
 Interpretable Models – Produces interpretable coefficients in linear
regression.
 Optimal for Gaussian Noise – Minimizes error effectively when data is
affected by Gaussian noise.
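A minimal sketch of fitting a line y = w0 + w1*x by minimizing the sum of
squared residuals, once via the closed-form normal equations and once via
NumPy's least-squares routine; the synthetic data are illustrative:

# Least squares line fitting (illustrative sketch).
import numpy as np

rng = np.random.default_rng(0)
x = rng.uniform(0, 10, 50)
y = 2.0 + 3.0 * x + rng.normal(0, 1, 50)    # true line plus Gaussian noise

A = np.c_[np.ones_like(x), x]               # design matrix [1, x]

# Closed-form solution via the normal equations: (A^T A) w = A^T y.
w_closed = np.linalg.solve(A.T @ A, A.T @ y)

# Equivalent solution using NumPy's built-in least-squares solver.
w_lstsq, *_ = np.linalg.lstsq(A, y, rcond=None)

print(w_closed, w_lstsq)                    # both close to [2, 3]
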
Applications of Least Squares
 Public Health – Predicting patient outcomes based on health indicators.
 Epidemiology – Modeling disease spread using regression techniques.
 Genomics – Fitting models to gene expression data.
 Finance – Estimating risk models and portfolio optimization.
 Engineering – Curve fitting and system identification.
Example in Public Health
 Disease Prediction – Use least squares to build a regression model
predicting disease severity based on biomarkers.
 Risk Factor Analysis – Fit linear models to identify key health risk factors
affecting patient outcomes.
UNIT-4
PART-2
Evolutionary Learning
Evolutionary Learning:
Evolutionary Learning is an optimization technique inspired by biological
evolution. It involves simulating processes like natural selection, mutation,
and recombination to iteratively improve candidate solutions for complex
problems. This approach is often used when traditional optimization
techniques (like gradient descent) are insufficient for non-differentiable, highly
non-linear, or multimodal functions.
How Evolutionary Learning Works
1. Initialization
o Generate an initial population of candidate solutions randomly.
2. Evaluation
o Assess each candidate using a fitness function that quantifies how
well the solution solves the problem.
3. Selection
o Select candidates based on fitness (e.g., the fittest survive and pass
on their "genes").
4. Crossover (Recombination)
o Combine features from pairs of candidates to create offspring.
5. Mutation
o Apply random changes to some solutions to introduce diversity.
6. Replacement
o Replace the least fit candidates with offspring.
7. Iteration
o Repeat the process until convergence (e.g., fitness no longer
improves or a maximum number of iterations is reached).
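A minimal genetic-algorithm sketch of the seven steps above, maximizing the
number of 1-bits in a binary string (the classic "OneMax" toy problem). All
names and parameter values are illustrative, not part of these notes:

# A tiny genetic algorithm (illustrative sketch).
import random

random.seed(0)

GENOME_LEN = 20       # length of each candidate bit string
POP_SIZE = 30         # number of candidates per generation
GENERATIONS = 50
MUTATION_RATE = 0.02

def fitness(genome):
    # Fitness function: number of 1-bits (higher is better).
    return sum(genome)

def select(population):
    # Tournament selection: keep the fitter of two random candidates.
    a, b = random.sample(population, 2)
    return a if fitness(a) >= fitness(b) else b

def crossover(parent1, parent2):
    # Single-point crossover combines features of two parents.
    point = random.randint(1, GENOME_LEN - 1)
    return parent1[:point] + parent2[point:]

def mutate(genome):
    # Randomly flip bits to maintain diversity.
    return [1 - g if random.random() < MUTATION_RATE else g for g in genome]

# 1. Initialization: random population of candidate solutions.
population = [[random.randint(0, 1) for _ in range(GENOME_LEN)]
              for _ in range(POP_SIZE)]

for gen in range(GENERATIONS):
    # 2-6. Evaluate, select, recombine, mutate, and replace the population.
    population = [mutate(crossover(select(population), select(population)))
                  for _ in range(POP_SIZE)]

best = max(population, key=fitness)
print("Best fitness:", fitness(best), "Genome:", best)
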
Types of Evolutionary Algorithms
1. Genetic Algorithms (GA):
o Mimics the process of natural selection using crossover, mutation,
and selection.
o Useful for combinatorial optimization and feature selection.
2. Evolution Strategies (ES):
o Focuses on evolving populations through mutation and selection,
without crossover.
o Good for continuous optimization problems.
3. Genetic Programming (GP):
o Evolves programs or symbolic expressions to solve problems, often
applied in automated feature generation.
4. Differential Evolution (DE):
o Optimizes real-valued functions by evolving difference vectors
between candidates.
o Effective for continuous optimization tasks.
5. Neuroevolution:
o Evolves neural network architectures or weights, often used for
reinforcement learning tasks.
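As a quick illustration of item 4 above, SciPy ships a differential-evolution
optimizer; a minimal sketch, with an illustrative objective and bounds:

# Differential Evolution via SciPy (illustrative sketch).
from scipy.optimize import differential_evolution

def objective(v):
    # Sphere function: minimum value 0 at v = (0, 0).
    return v[0] ** 2 + v[1] ** 2

result = differential_evolution(objective, bounds=[(-5, 5), (-5, 5)], seed=0)
print(result.x, result.fun)     # approximately [0, 0] and 0
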
Mathematical Representation
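One common way to formalize the selection step is fitness-proportionate
("roulette wheel") selection. As a sketch, for a population of N candidates
x_1, ..., x_N with fitness function f (the symbols here are illustrative):

P(\text{select } x_i) = \frac{f(x_i)}{\sum_{j=1}^{N} f(x_j)}

so that fitter candidates are proportionally more likely to pass their genes
to the next generation.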

Applications of Evolutionary Learning


 Public Health – Optimize healthcare resource allocation, disease modeling,
and epidemiological predictions.
 Bioinformatics – Design gene regulatory networks and optimize drug
discovery pipelines.
 Robotics – Evolve control algorithms for autonomous robots.
 Finance – Optimize trading strategies and portfolio allocations.
 Image Processing – Evolve filters for feature extraction and object
recognition.
Example in Public Health
 Vaccine Distribution Optimization – Use evolutionary learning to optimize
vaccine distribution plans across regions to minimize disease spread.
 Medical Imaging – Evolve algorithms to detect anomalies in medical scans
automatically.

Genetic Algorithms:
Genetic Programming:
