Unit-7 ML
1. Genetic Algorithms
Genetic algorithms are based on an analogy with the genetic structure and behaviour of chromosomes in a population, resting on the following notions:
1. Individuals in a population compete for resources and mates.
2. Those individuals who are most successful (fittest) then mate to create more offspring than others.
3. Genes from the fittest parents propagate throughout the generation; sometimes parents create offspring that are better than either parent.
4. Thus each successive generation becomes better suited to its environment.
Search space
The population of individuals is maintained within a search space. Each individual represents one solution to the given problem. Each individual is coded as a finite-length vector of components, analogous to a chromosome; the variable components are analogous to genes. Thus a chromosome (individual) is composed of several genes (variable components).
Fitness Score
A fitness score is assigned to each individual and indicates its ability to "compete". Individuals with an optimal (or near-optimal) fitness score are sought.
The GA maintains a population of n individuals (chromosomes/solutions) along with their fitness scores. Individuals with better fitness scores are given more chances to reproduce than others: they are selected to mate and produce better offspring by combining the chromosomes of the parents. Since the population size is static, room has to be created for the new arrivals, so some individuals die and are replaced by them, eventually creating a new generation once all the mating opportunities of the old population are exhausted. The hope is that over successive generations better solutions arrive while the least fit die out.
Each new generation has, on average, more "good genes" than the individuals of previous generations, and thus better partial solutions. Once the offspring produced show no significant difference from the offspring produced by previous populations, the population has converged, and the algorithm is said to have converged to a set of solutions for the problem.
Operators of Genetic Algorithms
Once the initial generation is created, the algorithm evolves the generation using the following operators:
1) Selection Operator: The idea is to give preference to the
individuals with good fitness scores and allow them to pass their
genes to successive generations.
2) Crossover Operator: This represents mating between individuals. Two individuals are selected using the selection operator, and crossover sites are chosen randomly. The genes at these crossover sites are then exchanged, creating completely new individuals (offspring).
3) Mutation Operator: The key idea is to insert random genes into the offspring to maintain diversity in the population and avoid premature convergence.
For example, the following program evolves a random string toward a target string:
// C++ program to evolve a target string from a random
// string using a genetic algorithm
#include <bits/stdc++.h>
using namespace std;

const int POPULATION_SIZE = 100;  // individuals per generation
// Valid Genes
const string GENES =
"abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOP"\
"QRSTUVWXYZ 1234567890, .-;:_!\"#%&/()=?@${[]}";
// Target string to be generated
const string TARGET = "I love GeeksforGeeks";

// random integer in the range [start, end]
int random_num(int start, int end)
{ return start + rand() % (end - start + 1); }

// create a random gene, also used for mutation
char mutated_genes()
{ return GENES[random_num(0, GENES.size() - 1)]; }

// create a chromosome: a string of random genes
string create_gnome()
{
    int len = TARGET.size();
    string gnome = "";
    for(int i = 0; i < len; i++)
        gnome += mutated_genes();
    return gnome;
}

class Individual
{
public:
    string chromosome;
    int fitness;
    Individual(string chromosome);
    Individual mate(Individual parent2);
    int cal_fitness();
};

Individual::Individual(string chromosome)
{
    this->chromosome = chromosome;
    fitness = cal_fitness();
}

// perform mating and produce new offspring
Individual Individual::mate(Individual par2)
{
    string child_chromosome = "";
    for(size_t i = 0; i < chromosome.size(); i++)
    {
        float p = random_num(0, 100) / 100.0f;
        if(p < 0.45)           // gene from parent 1
            child_chromosome += chromosome[i];
        else if(p < 0.90)      // gene from parent 2
            child_chromosome += par2.chromosome[i];
        // otherwise insert random gene (mutate),
        // for maintaining diversity
        else
            child_chromosome += mutated_genes();
    }
    return Individual(child_chromosome);
}

// fitness = number of characters differing from TARGET (0 = done)
int Individual::cal_fitness()
{
    int fitness = 0;
    for(size_t i = 0; i < TARGET.size(); i++)
        if(chromosome[i] != TARGET[i])
            fitness++;
    return fitness;
}

// order individuals by fitness score for sorting
bool operator<(const Individual &ind1, const Individual &ind2)
{ return ind1.fitness < ind2.fitness; }

// Driver code
int main()
{
    srand((unsigned)(time(0)));
    int generation = 0;        // current generation
    vector<Individual> population;
    bool found = false;
    for(int i = 0; i < POPULATION_SIZE; i++)
    {
        string gnome = create_gnome();
        population.push_back(Individual(gnome));
    }
    while(!found)
    {
        // sort the population in increasing order of fitness score
        sort(population.begin(), population.end());
        if(population[0].fitness <= 0) { found = true; break; }

        // elitism: the fittest 10% go straight to the next generation
        vector<Individual> new_generation;
        int s = (10 * POPULATION_SIZE) / 100;
        for(int i = 0; i < s; i++)
            new_generation.push_back(population[i]);
        // the remaining 90% come from mating the fittest half
        s = (90 * POPULATION_SIZE) / 100;
        for(int i = 0; i < s; i++)
        {
            Individual parent1 = population[random_num(0, 50)];
            Individual parent2 = population[random_num(0, 50)];
            new_generation.push_back(parent1.mate(parent2));
        }
        population = new_generation;

        cout << "Generation: " << generation << "\t";
        cout << "String: " << population[0].chromosome << "\t";
        cout << "Fitness: " << population[0].fitness << "\n";
        generation++;
    }
    cout << "Generation: " << generation << "\t";
    cout << "String: " << population[0].chromosome << "\t";
    cout << "Fitness: " << population[0].fitness << "\n";
}
Output:
Generation: 1 String: tO{"-?=jH[k8=B4]Oe@} Fitness: 18
Generation: 2 String: tO{"-?=jH[k8=B4]Oe@} Fitness: 18
Generation: 3 String: .#lRWf9k_Ifslw #O$k_ Fitness: 17
Generation: 4 String: .-1Rq?9mHqk3Wo]3rek_ Fitness: 16
Generation: 5 String: .-1Rq?9mHqk3Wo]3rek_ Fitness: 16
Generation: 6 String: A#ldW) #lIkslw cVek) Fitness: 14
Generation: 7 String: A#ldW) #lIkslw cVek) Fitness: 14
Generation: 8 String: (, o x _x%Rs=, 6Peek3 Fitness: 13
.
.
.
Generation: 29 String: I lope Geeks#o, Geeks Fitness: 3
Generation: 30 String: I loMe GeeksfoBGeeks Fitness: 2
Generation: 31 String: I love Geeksfo0Geeks Fitness: 1
Generation: 32 String: I love Geeksfo0Geeks Fitness: 1
Generation: 33 String: I love Geeksfo0Geeks Fitness: 1
Generation: 34 String: I love GeeksforGeeks Fitness: 0
Note: the algorithm starts from random strings every time, so the output may differ.
As we can see from the output, the algorithm sometimes gets stuck at a local optimum; this can be improved by refining the fitness-score calculation or by tweaking the mutation and crossover operators.
Why use Genetic Algorithms
They are robust.
They provide optimisation over a large search space.
Unlike traditional AI systems, they do not break on slight changes in the input or in the presence of noise.
Applications of Genetic Algorithms
Genetic algorithms have many applications, some of which are:
Recurrent neural network training
Mutation testing
Code breaking
Filtering and signal processing
Learning fuzzy rule bases, etc.
2. Genotype Representation
Genotype representation is very important when implementing a genetic algorithm, as it directly impacts the algorithm's performance. A proper representation is one in which the mapping between the genotype space and the phenotype space is well defined.
1. Binary representation
Binary representation is one of the most common ways of representing chromosomes in a GA.
2. Integer representation
In the case of discrete-valued genes, we cannot always limit the values to binary, so instead we use integer representation. For example, to encode the four compass directions we could use the integers {1, 2, 3, 4}.
3. Permutation representation
Whenever the solutions are represented by an ordering of elements, we can use permutation representation. Consider the travelling salesman problem (TSP): a salesman has to visit all the cities, one city at a time, and then return to the source city. The order of cities in a TSP tour is naturally a permutation, so we can use the permutation representation.
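As a small illustration (not from the original notes; the helper name is hypothetical), a TSP chromosome under permutation representation can be stored as a vector of city indices, and its quality evaluated against a distance matrix:

```cpp
#include <vector>
using namespace std;

// A TSP chromosome under permutation representation: a permutation of
// city indices. tour_length evaluates it against a distance matrix,
// wrapping around at the end so the tour returns to the source city.
double tour_length(const vector<vector<double>>& dist,
                   const vector<int>& tour) {
    double total = 0.0;
    for (size_t i = 0; i < tour.size(); i++) {
        int from = tour[i];
        int to = tour[(i + 1) % tour.size()];  // last leg returns home
        total += dist[from][to];
    }
    return total;
}
```

Because every chromosome is a permutation, crossover and mutation operators must preserve that property, which is why TSP uses specialised operators rather than plain bit flips.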
3. Fitness Function
A fitness function should possess the following characteristics:
1. It must be fast to compute, since it is evaluated for every individual in every generation.
2. It must quantitatively measure how fit a given solution is, or how fit the individuals produced from that solution can be.
4. Genetic Algorithm: Hypothesis Space Search
As is clear from our illustrative example, genetic algorithms employ a randomized beam search to seek maximally fit hypotheses. Whereas the gradient descent search in backpropagation moves smoothly from one hypothesis to another, the genetic algorithm search can move much more abruptly: it replaces parent hypotheses with offspring that can be very different from the parents. For this reason, genetic algorithm search is less likely to fall into the kind of local minima that plague gradient descent methods.
In a genetic algorithm, the evolution of the population depends on the selection, recombination, and mutation steps. The schema theorem is one of the most widely used theorems for characterizing population evolution within a genetic algorithm, but because it fails to consider the positive effects of crossover and mutation it is in a way incomplete. Many more recent theoretical analyses have been proposed, based on models such as Markov chain models and the statistical mechanical model.
5. Key Components of Genetic Algorithms
1. Population: A set of candidate solutions.
2. Chromosomes: Data structures representing solutions,
typically encoded as strings of binary or other types of
genes.
3. Fitness Function: A function that evaluates how good
a solution is with respect to the problem being solved.
4. Selection: The process of choosing the fittest
individuals to reproduce.
5. Crossover: The process of combining two parent
solutions to produce offspring.
6. Mutation: The process of making random alterations to
offspring to maintain genetic diversity.
6. Introduction to Genetic Programming
Genetic Programming (GP) extends the concept of genetic
algorithms to evolve programs or expressions. Instead of
evolving a set of parameters or solutions, GP evolves entire
programs or expressions that can perform a task or solve a
problem.
Key Components of Genetic Programming
1. Population: A set of programs or expressions.
2. Fitness Function: A measure of how well a program or
expression performs a given task.
3. Selection, Crossover, and Mutation: Similar to GAs,
but applied to program structures or expressions.
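To make the idea concrete, here is a minimal sketch (all names hypothetical, not a standard GP library) of the kind of structure GP evolves: programs represented as expression trees that can be evaluated, and later mutated or crossed over at the subtree level:

```cpp
#include <memory>
#include <utility>
using namespace std;

// A tiny expression tree: internal nodes are operators ('+' or '*'),
// leaves are the variable 'x' or a constant 'c'. GP would evolve a
// population of such trees via subtree crossover and mutation.
struct Node {
    char op;                       // '+', '*', 'x', or 'c'
    double value;                  // used only when op == 'c'
    unique_ptr<Node> left, right;  // children for '+' and '*'
};

unique_ptr<Node> leaf_x() {
    auto n = make_unique<Node>(); n->op = 'x'; return n;
}
unique_ptr<Node> leaf_c(double v) {
    auto n = make_unique<Node>(); n->op = 'c'; n->value = v; return n;
}
unique_ptr<Node> branch(char op, unique_ptr<Node> l, unique_ptr<Node> r) {
    auto n = make_unique<Node>();
    n->op = op; n->left = std::move(l); n->right = std::move(r);
    return n;
}

// Evaluate the program at input x; a GP fitness function would compare
// such outputs against target data points.
double eval(const Node* n, double x) {
    switch (n->op) {
        case 'x': return x;
        case 'c': return n->value;
        case '+': return eval(n->left.get(), x) + eval(n->right.get(), x);
        default : return eval(n->left.get(), x) * eval(n->right.get(), x);
    }
}
```

For example, branch('+', branch('*', leaf_x(), leaf_x()), leaf_c(2)) represents the program x*x + 2.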
Key Principles of Evolutionary Algorithms
Evolutionary Algorithms (EAs), including GAs and GP, are based
on several fundamental principles:
1. Natural Selection: The idea that better solutions are
more likely to be selected for reproduction.
2. Genetic Variation: Diversity in the population is
introduced through crossover and mutation to explore a
wider solution space.
3. Survival of the Fittest: Solutions are evaluated based
on a fitness function, and the fittest solutions are more
likely to be selected for the next generation.
These principles ensure that the algorithm explores a variety of
solutions and converges towards optimal or near-optimal
solutions.
Examples in Optimization Problems
1. Knapsack Problem
The Knapsack Problem is a classic optimization problem where
the goal is to maximize the total value of items placed in a
knapsack without exceeding its weight capacity.
GA Approach:
1. Representation: Items are represented as binary
strings, where each bit indicates whether an item is
included.
2. Fitness Function: Evaluates the total value of selected
items while penalizing solutions that exceed the weight
limit.
3. Crossover and Mutation: Involve swapping items
between solutions or flipping bits to introduce variation.
Example: Solving the 0/1 Knapsack Problem using GAs can
yield good approximations even for large instances where exact
methods become computationally infeasible.
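As a sketch of the representation and fitness function just described (the function name is hypothetical), a candidate solution can be a vector of 0/1 genes, with infeasible selections penalized by scoring them zero:

```cpp
#include <vector>
using namespace std;

// Fitness of a 0/1 knapsack chromosome: the total value of the selected
// items, with a simple penalty that scores overweight selections as 0.
// (Softer penalties proportional to the excess weight are also common.)
int knapsack_fitness(const vector<int>& values,
                     const vector<int>& weights,
                     int capacity,
                     const vector<int>& chromosome) {
    int value = 0, weight = 0;
    for (size_t i = 0; i < chromosome.size(); i++) {
        if (chromosome[i]) {          // gene i == 1 => item i is included
            value  += values[i];
            weight += weights[i];
        }
    }
    return (weight <= capacity) ? value : 0;  // penalize infeasible picks
}
```

Standard single-point crossover and bit-flip mutation can then be applied directly to the binary chromosome.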
2. Traveling Salesman Problem (TSP)
The Traveling Salesman Problem involves finding the shortest
possible route that visits a set of cities and returns to the origin
city.
GA Approach:
1. Representation: Routes are represented as
permutations of city indices.
2. Fitness Function: Measures the total distance of the
route.
3. Crossover and Mutation: Involves techniques like
order crossover (OX) and swap mutation to preserve the
feasibility of routes.
Example: GAs can effectively solve TSP instances with
hundreds of cities, providing near-optimal solutions in a
reasonable time frame.
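The order crossover (OX) mentioned above can be sketched as follows (a simplified variant with hypothetical names; city labels are assumed to be 0..n-1):

```cpp
#include <vector>
using namespace std;

// Order crossover (OX): copy parent1[lo..hi] into the child, then fill
// the remaining slots with parent2's cities in their original order,
// skipping cities already copied. The result is always a valid
// permutation, so the offspring is a feasible tour.
vector<int> order_crossover(const vector<int>& p1, const vector<int>& p2,
                            int lo, int hi) {
    int n = p1.size();
    vector<int> child(n, -1);
    vector<bool> used(n, false);
    for (int i = lo; i <= hi; i++) {     // inherit a slice from parent 1
        child[i] = p1[i];
        used[p1[i]] = true;
    }
    int pos = (hi + 1) % n;
    for (int k = 0; k < n; k++) {        // scan parent 2 after the slice
        int city = p2[(hi + 1 + k) % n];
        if (!used[city]) {
            child[pos] = city;
            pos = (pos + 1) % n;
        }
    }
    return child;
}
```

A swap mutation (exchanging two random positions) likewise keeps the tour feasible, which is why these operators are preferred for TSP over plain bit manipulation.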
3. Scheduling Problems
Scheduling Problems involve assigning resources to tasks over
time, aiming to optimize certain criteria like minimizing
completion time or maximizing resource utilization.
GA Approach:
1. Representation: Schedules are represented as
sequences or matrices of tasks and resources.
2. Fitness Function: Measures the quality of the
schedule based on criteria like total completion time or
resource conflicts.
3. Crossover and Mutation: Involve swapping tasks or
adjusting schedules to explore different configurations.
Example: GAs are used in job-shop scheduling to minimize
makespan or in timetabling to ensure conflicts are resolved
efficiently.
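A minimal sketch of such a scheduling representation and fitness function (hypothetical names): assign[i] gives the machine that runs task i, and the fitness to be minimized is the makespan, i.e. the load of the busiest machine:

```cpp
#include <algorithm>
#include <vector>
using namespace std;

// A simple scheduling chromosome: assign[i] is the machine that runs
// task i. The makespan is the total duration on the busiest machine;
// a GA would try to minimize it.
int makespan(const vector<int>& durations, const vector<int>& assign,
             int n_machines) {
    vector<int> load(n_machines, 0);
    for (size_t i = 0; i < durations.size(); i++)
        load[assign[i]] += durations[i];         // accumulate per machine
    return *max_element(load.begin(), load.end());
}
```

Crossover can then exchange parts of the assignment vector between two schedules, and mutation can reassign a random task to a different machine.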
7. Real-World Applications of Genetic Algorithms and Genetic Programming
1. Optimizing Complex Systems
One of the classic applications of genetic algorithms is in
optimizing complex systems where traditional approaches
might fail due to the vastness of the solution space. For
example, GAs have been effectively used in the airline industry for
scheduling flights and crew assignments, considering numerous
constraints and objectives. Similarly, they have been applied to
optimize the layout of components on a computer chip, which
involves a highly complex configuration space.
2. Automated Design and Creativity
In the field of automated design, GAs can generate innovative
solutions to engineering problems. For instance, NASA has used
genetic algorithms to design antennas for spacecraft. These
algorithms generated unconventional, asymmetric designs that
performed better than the traditional symmetrical ones.
Similarly, in architecture, GAs have helped in creating novel
building layouts that optimize space utilization and
environmental factors like sunlight and airflow.
3. Financial Market Analysis
Genetic algorithms have found a niche in the financial sector
for portfolio optimization and algorithmic trading. By simulating
numerous investment scenarios, GAs help in identifying the
best allocation of assets that maximizes returns and minimizes
risks. Moreover, they are used in trading algorithms to predict
market movements and execute trades at optimal times.
4. Game Development and AI
In game development, genetic programming has been
employed to evolve behaviors for non-player characters (NPCs),
making them more challenging and realistic opponents. For
example, GAs have been used to develop strategic behaviors in
games like chess and Go, where the vast number of possible
moves makes brute force approaches impractical.
5. Machine Learning and Data Mining
GAs are increasingly integrated into machine learning
workflows, particularly in feature selection and model
optimization. By selecting the most relevant features from large
datasets, GAs improve the efficiency and accuracy of predictive
models. Additionally, they are used to optimize neural network
architectures in deep learning without human intervention.
6. Robotics and Autonomous Vehicles
In robotics, genetic programming can automate the design of
control systems for autonomous robots, enabling them to adapt
to new environments and perform complex tasks without
explicit programming. This approach has been instrumental in
developing autonomous vehicles, where GAs optimize driving
strategies based on real-time data.
LAMARCKIAN MODEL :
The Lamarckian model holds that the traits an individual acquires during its lifetime can be passed on to its offspring. It is named after the French biologist Jean-Baptiste Lamarck. Although modern biology has disregarded Lamarckism, since we know that only the information in the genotype is transmitted, from a computational point of view the Lamarckian model has been found to give better results for some kinds of problems.
In the Lamarckian model, a local search operator examines the neighbourhood (acquiring new traits), and if a better chromosome is found, it itself becomes the offspring.
BALDWINIAN MODEL :
Named after James Mark Baldwin (1896), this model is an intermediate idea. According to the Baldwinian model, a chromosome can encode the capability of learning beneficial behaviours. Unlike the Lamarckian model, it does not transmit the acquired traits to the next generation; nor does it completely ignore them, as the Darwinian model does.
Between these two extremes, the Baldwinian model encodes an individual's tendency to acquire traits, rather than the traits themselves. A local search operator examines the neighbourhood (acquiring new traits), and if a better chromosome is found at that step, only the improved fitness is assigned to the chromosome; the chromosome itself is not modified. The ability to acquire the trait is thus rewarded, though the trait does not get passed directly on to future generations.
A genetic algorithm combined with local search in this way is similar to a memetic algorithm. One must therefore choose the suitable model, Lamarckian or Baldwinian, to decide what the individuals become after local search; both follow a non-Darwinian methodology of computing evolutionary procedures in machine learning to guide the creation of new individuals.
Neural networks generated under the Lamarckian model tend to be small and require less time to design. The Lamarckian operator improves fitness quickly when used in the early generations, because part of what is acquired is passed on to the next fit individuals, which makes the Lamarckian model more suitable in such settings.
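The difference between the two models can be sketched in code (a toy example with hypothetical names, using a trivial fitness and local search in place of real ones):

```cpp
#include <algorithm>
#include <string>
#include <utility>
using namespace std;

struct Indiv { string chromosome; int fitness; };

// Toy fitness: the number of 'a' characters (higher is better).
int toy_fitness(const string& c) {
    return count(c.begin(), c.end(), 'a');
}

// Toy local search: "acquire a trait" by turning the first non-'a'
// character into 'a'; returns the improved chromosome and its fitness.
pair<string, int> local_search(string c) {
    for (char& ch : c) if (ch != 'a') { ch = 'a'; break; }
    return {c, toy_fitness(c)};
}

// Lamarckian: write back both the improved chromosome and its fitness,
// so the acquired trait is inherited by future offspring.
void lamarckian_step(Indiv& ind) {
    pair<string, int> improved = local_search(ind.chromosome);
    ind.chromosome = improved.first;
    ind.fitness = improved.second;
}

// Baldwinian: keep only the improved fitness; the chromosome itself is
// unchanged, so the trait is rewarded but not inherited directly.
void baldwinian_step(Indiv& ind) {
    ind.fitness = local_search(ind.chromosome).second;
}
```

The only difference between the two steps is whether the result of local search is written back into the chromosome, which is exactly the distinction between the two models.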
Parallelizing GA
Parallelizing a Genetic Algorithm (GA) can significantly improve its performance by
spreading the computational workload across multiple processors. Here's a high-level
overview of the steps involved in parallelizing a GA:
1. Divide the Population: Split the population of individuals into sub-populations, which
can be processed in parallel. Each sub-population will undergo the selection,
crossover, and mutation processes independently.
2. Evaluate Fitness: Evaluate the fitness of individuals within each sub-population using
parallel processors. This step is often the most computationally intensive and benefits
greatly from parallelization.
3. Migration: Periodically, exchange individuals between sub-populations. This helps
maintain genetic diversity and prevents sub-populations from converging to local
optima. There are different migration strategies, such as ring topology, fully connected
topology, and others.
4. Master-Slave Model: Alternatively, use a master-slave model where a master process
handles the population and distributes tasks (like fitness evaluation) to slave processes.
The master gathers results and performs genetic operations.
5. Synchronization: Ensure that the parallel processes are synchronized at key points in
the algorithm, such as after each generation. This isn't always necessary and depends
on the specific implementation.
6. Implementation Tools: Use parallel computing libraries or frameworks such as MPI (Message Passing Interface), OpenMP, or GPU-based libraries like CUDA for the implementation.
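As a sketch of the master-slave model described above (hypothetical names, using standard C++ threads rather than MPI or OpenMP), the master splits fitness evaluation across worker tasks and gathers the scores:

```cpp
#include <algorithm>
#include <future>
#include <vector>
using namespace std;

// Cheap stand-in for an expensive fitness function.
int fitness(int genome) { return genome * genome; }

// Master-slave evaluation: split the population into chunks, evaluate
// each chunk on its own worker task, then gather the results.
vector<int> evaluate_parallel(const vector<int>& population, int n_workers) {
    vector<int> scores(population.size());
    vector<future<void>> tasks;
    size_t chunk = (population.size() + n_workers - 1) / n_workers;
    for (int w = 0; w < n_workers; w++) {
        size_t lo = w * chunk;
        size_t hi = min(population.size(), lo + chunk);
        tasks.push_back(async(launch::async, [&, lo, hi] {
            for (size_t i = lo; i < hi; i++)   // each worker owns a slice
                scores[i] = fitness(population[i]);
        }));
    }
    for (auto& t : tasks) t.get();             // master waits for workers
    return scores;
}
```

Because each worker writes to a disjoint slice of the scores vector, no locking is needed; the master only synchronizes once, when it waits for all workers before running selection and crossover.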
1. Greedy Search
Greedy search is a straightforward search method that builds up
the solution piece by piece, always choosing the next piece that
offers the most immediate benefit.
2. Beam Search
Beam search is an optimization of the greedy search that keeps
track of multiple top candidates at each step, balancing between
diversity and best score among the candidates.
3. Random Search
As the name suggests, random search involves randomly
selecting possible solutions and evaluating them. It's less
structured but can be useful for exploring large, unknown spaces.
4. Genetic Algorithms
Inspired by the process of natural selection, genetic algorithms
are used to find approximate solutions by mimicking processes
like mutation, crossover, and natural selection.
5. Simulated Annealing
Simulated annealing explores the search space by occasionally accepting worse solutions, with a probability that decreases over time (the "temperature"), which allows it to escape local optima; it is often used when dealing with uncertainty and incomplete information.
Applications
These methods are highly applicable in various domains like
machine learning, artificial intelligence, game development, and
more. Understanding and choosing the right one can significantly
impact the efficiency and effectiveness of the induction process.