Artificial Intelligence
Unit 1
Contents
Artificial Intelligence
Characteristics of AI Program
Categories of System
Foundations of AI
Views of AI Goals
Components of AI Programs
Sub-areas of AI
Applications
Latest Perception of AI
Criteria for Success
Turing Test
Text Book
E. Rich and K. Knight, "Artificial intelligence",
TMH, 2nd ed., 1999.
Reference Book
Artificial Intelligence: A Modern Approach
Textbook by Peter Norvig and Stuart J. Russell
Assignments and Tests
Assignment
To be done in group of two students
2 project assignment
Any programming language can be used.
Demo need to be given on system with running program
Tests
Will be based on MCQ mostly and solving
practical problems of AI.
Warm UP exercises
Water Jug Problem
You are given two jugs, a 7-gallon one and a 4-
gallon one, a pump which has unlimited water
which you can use to fill the jug, and the ground
on which water may be poured. Neither jug has
any measuring markings on it. How can you get
exactly 2 gallons of water in the 4-gallon jug?
Warm up Exercises
8 Puzzle problem
The 8 puzzle consists of
eight numbered, movable
tiles set in a 3x3 frame.
One cell of the frame is
always empty thus making
it possible to move an
adjacent numbered tile
into the empty cell. Such a
puzzle is illustrated in
following diagram.
Artificial Intelligence
Artificial Intelligence
Quick Answer from Academia:
Modeling human cognition or mental faculty
using computers
Study of making computers do things which
at the moment people better
Making computers do things which require
intelligence
More Formal Definition of AI
AI is a branch of computer science which is
concerned with the study and creation of
computer systems that exhibit
some form of intelligence
OR
those characteristics which we associate
with intelligence in human behavior
AI is a broad area consisting of
different fields, from machine vision,
expert systems to the creation of
machines that can "think".
In order to classify machines as
"thinking", it is necessary to define
intelligence.
What is Intelligence?
Intelligence is a property of mind that
encompasses many related mental abilities,
such as the capabilities to
reason
plan
solve problems
think abstractly
comprehend ideas and language and
learn
Characteristics of AI systems
learn new concepts and tasks
reason and draw useful conclusions about
the world around us
remember complicated interrelated facts and draw
conclusions from them (inference)
understand a natural language or perceive
and comprehend a visual scene
look through cameras and see what's there
(vision), to move themselves and objects around
in the real world (robotics)
Contd..
plan sequences of actions to complete a goal
offer advice based on rules and situations
may not necessarily imitate human senses and
thought processes
but indeed, in performing some tasks differently, they
may actually exceed human abilities
capable of performing intelligent tasks effectively
and efficiently
perform tasks that require high levels of intelligence
Understanding of AI
AI techniques and ideas seem to be
harder to understand than most things in
computer science
AI shows best on complex problems for
which general principles don't help much,
though there are a few useful general
principles
Artificial intelligence is also difficult to
understand by its content.
Boundaries of AI are not well defined.
Often it means the advanced software
engineering, sophisticated software
techniques for hard problems that can't be
solved in any easy way.
AI programs - like people - are usually not
perfect, and even make mistakes.
It often means, nonnumeric ways of solving
problems, since people can't handle
numbers well.
Nonnumeric ways are generally "common
sense" ways, not necessarily the best ones.
Understanding of AI also requires an
understanding of related terms such as
intelligence, knowledge, reasoning,
thought, cognition, learning, and a number
of other computer related terms.
Categories of AI System
Systems that think like humans
Systems that act like humans
Systems that think rationally
Systems that act rationally
Systems that think like humans
Most of the time it is a black box where we are
not clear about our thought process.
One has to know functioning of brain and its
mechanism for possessing information.
It is an area of cognitive science.
The stimuli are converted into mental representation.
Cognitive processes manipulate representation to build
new representations that are used to generate actions.
Neural network is a computing model for
processing information similar to brain.
Systems that act like humans
The overall behavior of the system
should be human like.
It could be achieved by observation.
Systems that think rationally
Such systems rely on logic rather than human to
measure correctness.
For thinking rationally or logically, logic formulas
and theories are used for synthesizing outcomes.
For example,
given John is a human and all humans are mortal then
one can conclude logically that John is mortal
Not all intelligent behavior are mediated by logical
deliberation.
Systems that act rationally
Rational behavior means doing right thing.
Even if method is illogical, the observed
behavior must be rational.
Foundations of AI
Foundation of AI is based on
Mathematics
Neuroscience
Control Theory
Linguistics
Foundations - Mathematics
More formal logical methods
Boolean logic
Fuzzy logic
Uncertainty
The basis for most modern approaches to
handle uncertainty in AI applications can
be handled by
Probability theory
Modal and Temporal logics
Foundations - Neuroscience
How do the brain works?
Early studies (1824) relied on injured and abnormal
people to understand what parts of brain work
More recent studies use accurate sensors to
correlate brain activity to human thought
By monitoring individual neurons, monkeys can now
control a computer mouse using thought alone
Moore’s law states that computers will have as
many gates as humans have neurons in 2020
How close are we to have a mechanical brain?
Parallel computation, remapping, interconnections,….
Foundations – Control Theory
Machines can modify their behavior in response
to the environment (sense/action loop)
Water-flow regulator, steam engine governor,
thermostat
The theory of stable feedback systems (1894)
Build systems that transition from initial
state to goal state with minimum energy
In 1950, control theory could only describe
linear systems and AI largely rose as a
response to this shortcoming
Foundations - Linguistics
Speech demonstrates so much of human
intelligence
Analysis of human language reveals thought
taking place in ways not understood in other
settings
Children can create sentences they have never heard
before
Language and thought are believed to be tightly
intertwined
Two Views of AI Goals
AI is about duplicating what the (human)
brain DOES
Cognitive Science
AI is about duplicating what the (human)
brain SHOULD do
Rationality (doing things logically)
Cool Stuff in AI
Game playing agents
Machine learning
Speech
Language
Vision
Data Mining
Web agents …….
Useful Stuff
Medical Diagnosis
Fraud Detection
Object Identification
Space Shuttle Scheduling
Information Retrieval ….
AI Techniques
Rule-based
Fuzzy Logic
Neural Networks
Genetic Algorithms
Components of AI Program
AI techniques must be independent of
the problem domain as far as possible.
AI program should have
knowledge base
navigational capability
inferencing
Knowledge Base
AI programs should be learning in nature
and update its knowledge accordingly.
Knowledge base consists of facts and
rules.
Characteristics of Knowledge:
It is voluminous in nature and requires
proper structuring
It may be incomplete and imprecise
It may keep on changing (dynamic)
Navigational Capability
Navigational capability contains
various control strategies
Control Strategy
determines the rule to be applied
some heuristics (thump rule) may be
applied
Inferencing
Inferencing requires
search through knowledge base
and
derive new knowledge
Sub-areas of AI
Sub areas of AI are:
Knowledge representation
Theorem proving
Game playing
Common sense reasoning dealing with
uncertainty and decision making
Learning models, inference techniques, pattern
recognition, search and matching etc.
Logic (fuzzy, temporal, modal) in AI
Planning and scheduling
Sub-areas of AI – Contd..
Natural language understanding
Computer vision
Understanding spoken utterances
Intelligent tutoring systems
Robotics
Machine translation systems
Expert problem solving
Neural Networks, AI tools etc
Applications
Some of the applications are given below:
Business : Financial strategies, give advice
Engineering: check design, offer suggestions to
create new product
Manufacturing: Assembly, inspection & maintenance
Mining: used when conditions are dangerous
Hospital : monitoring, diagnosing & prescribing
Education : In teaching
household : Advice on cooking, shopping etc.
farming : prune trees & selectively harvest mixed
crops.
Latest Perception of AI
Three typical components of AI Systems
THE WORLD
Perception Action
Reasoning
Recent AI
Heavy use of
probability theory
decision theory
statistics
logic (fuzzy, modal, temporal)
Criteria For Success
Alan Turing proposed
the Turing Test which
determined if a
machine could think.
Recently computers
have convinced five
people out of ten that
the programs loaded
inside them were of
human intelligence
The Turing Test
Turing proposed operational test for intelligent
behavior in 1950.
Human
Human ?
Interrogator
AI system
ELIZA
Eliza was first program developed by Joseph
Wiezbaum to converse in English in mid
1960's
It passed turing test.
The following passage shows Eliza’s talking to
a teenage girl. Blue text Eliza’s response
Men are all alike.
IN WHAT WAY
They're always bugging us about something or other.
CAN YOU THINK OF A SPECIFIC EXAMPLE
Well, my boyfriend made me come here.
YOUR BOYFRIEND MADE YOU COME HERE
He says that I am depressed all the time.
I AM SORRY TO HEAR THAT YOU ARE DEPRESSED
It's true that I am unhappy
DO YOU THINK COMING HERE WILL HELP YOU NOT
TO BE UNHAPPY
I need help, that seems certain.
WHAT WOULD IT MEAN IF YOU GOT SOME HELP
Perhaps I could learn to get along with my mother.
TELL ME MORE ABOUT YOUR FAMILY
My mother takes care of me.
WHO ELSE IN YOUR FAMILY TAKES CARE OF YOU
My father.
YOUR FATHER
You are like my father in some ways.
WHAT RESEMBLANCE DO YOU SEE
You are not very aggressive but I think you don't want me to
notice that.
WHAT MAKES YOU THINK THAT I AM NOT VERY
AGGRESSIVE
You don’t argue with me.
WHY DO YOU THINK I DON’T ARGUE WITH YOU
You are afraid of me.
DOES IT PLEASE YOU TO BELIEVE, I AM AFRAID OF
YOU
My father is afraid of everybody.
WHAT ELSE COMES TO MIND WHEN YOU THINK OF
YOUR FATHER
Bullies.
DOES THAT HAVE ANYTHING TO DO WITH THE FACT
THAT YOUR BOYFRIEND MADE YOU COME HERE
Formal Description of a Problem
In AI, we will formally define a problem as
a space of all possible configurations where each
configuration is called a state
thus, we use the term state space
an initial state
one or more goal states
a set of rules/operators which move the problem from one
state to the next
In some cases, we may enumerate all possible states
(see monkey & banana problem on the next slide)
but usually, such an enumeration will be overwhelmingly
large so we only generate a portion of the state space, the
portion we are currently examining
The Monkey & Bananas Problem
A monkey is in a cage and bananas are suspended
from the ceiling, the monkey wants to eat a banana
but cannot reach them
in the room are a chair and a stick
if the monkey stands on the chair and waves the stick, he
can knock a banana down to eat it
what are the actions the monkey should take?
Initial state:
monkey on ground
with empty hand
bananas suspended
Goal state:
monkey eating
Actions:
climb chair/get off
grab X
wave X
eat X
Missionaries and Cannibals
3 missionaries and 3 cannibals are on one side of the river
with a boat that can take exactly 2 people across the river
how can we move the 3 missionaries and 3 cannibals across the river
with the constraint that the cannibals never outnumber the
missionaries on either side of the river (lest the cannibals start eating
the missionaries!)??
We can represent a state as a 6-item tuple:
(a, b, c, d, e, f)
a/b = number of missionaries/cannibals on left shore
c/d = number of missionaries/cannibals in boat
e/f = number of missionaries/cannibals on right shore
where a + b + c + d + e + f = 6
and a >= b unless a = 0, c >= d unless c = 0, and e >= f unless e = 0
Legal operations (moves) are
0, 1, 2 missionaries get into boat (c + d must be <= 2)
0, 1, 2 missionaries get out of boat
0, 1, 2 cannibals get into boat (c + d must be <= 2)
0, 1, 2 missionaries get out of boat
boat sails from left shore to right shore (c + d must be >= 1)
boat sails from right shore to left shore (c + d must be >= 1)
8 Puzzle
The 8 puzzle search space consists of 8! states (40320)
Search
Given a problem expressed as a state space
(whether explicitly or implicitly)
with operators/actions, an initial state and a goal state, how
do we find the sequence of operators needed to solve the
problem?
this requires search
Formally, we define a search space as [N, A, S, GD]
N = set of nodes or states of a graph
A = set of arcs (edges) between nodes that correspond to
the steps in the problem (the legal actions or operators)
S = a nonempty subset of N that represents start states
GD = a nonempty subset of N that represents goal states
Our problem becomes one of traversing the graph
from a node in S to a node in GD
we can use any of the numerous graph traversal techniques
for this but in general, they divide into two categories:
brute force – unguided search
heuristic – guided search
Consequences of Search
As shown a few slides back, the 8-puzzle has over 40000
different states
what about the 15 puzzle?
A brute force search means try all possible states blindly
until you find the solution
if a problem has a state space that consists of n moves where each
move has m possible choices, then there are 2m*n states
two forms of brute force search are: depth first search, breath first
search
A guided search examines a state and uses some heuristic
(usually a function) to determine how good that state is (how
close you might be to a solution) to help determine what
state to move to
hill climbing
best-first search
A/A* algorithm
Minimax
While a good heuristic can reduce the complexity from 2m*n
to something tractable, there is no guarantee so any form of
search is O(2n) in the worst case
Forward vs Backward Search
The common form of reasoning starts with data and leads to
conclusions
for instance, diagnosis is data-driven – given the patient symptoms,
we work toward disease hypotheses
we often think of this form of reasoning as “forward chaining” through rules
Backward search reasons from goals to actions
Planning and design are often goal-driven
“backward chaining”
Depth-first Search
Starting at node A, our search gives us:
A, B, E, K, S, L, T, F, M, C, G, N, H, O, P,
U, D, I, Q, J, R
Depth-first Search Example
Traveling Salesman Problem
Breadth-First Search
Starting at node A, our search would generate the
nodes in alphabetical order from A to U
Breadth-First Search Example
PRODUCTION SYSTEMS
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 5
Types of Production Systems.
A Knowledge representation formalism consists of collections of condition-action
rules(Production Rules or Operators), a database which is modified in accordance
with the rules, and a Production System Interpreter which controls the operation of
the rules i.e The 'control mechanism' of a Production System, determining the order
in which Production Rules are fired.
A system that uses this form of knowledge representation is called a production
system.
A production system consists of rules and factors. Knowledge is encoded in a
declarative from which comprises of a set of rules of the form
Situation ------------ Action
SITUATION that implies ACTION.
Example:-
IF the initial state is a goal state THEN quit.
Components of Production system
A global database
A set of production rules and
A control system
The global database is the central data structure used by an AI
production system,The production system.
The production rules operate on the global database.
Each rule has a precondition that is either satisfied or not by the
database. If the precondition is satisfied, the rule can be applied.
Application of the rule changes the database.
The control system chooses which applicable rule should be applied
and ceases computation when a termination condition on the
database is satisfied.
If several rules are to fire at the same time, the control system
resolves the conflicts.
Four classes of production
systems:-
A monotonic production system
A non monotonic production system
A partially commutative production system
Non pratially commutative system
A commutative production system.
Contd:
Monotonic production system
A system in which the application of a rule never prevents
the later application of another rule, that could have also
been applied at the time the first rule was selected.
Partial Commutative production system:-
A production system in which the application of a
particular sequence of rules transforms state X into state
Y, then any permutation of those rules that is allowable
also transforms state x into state Y.
Contd:
Non Monotonic System
A system in which the application of a
rule prevents the later application of another
rule, that could have also been applied at the
time the first rule was selected.
Commutative System
Production system which is both monotonic and
partial commutative.
Relationship between PS
Monotonic Non Monotonic
Partial Theorem Proving Robot navigation
commutative (Solving programming (NNE) or (NEN)
problem) (changes occur but can be
(solving ignorable reversed)
problems)
Non partial Chemical synthesis Bridge
Commutative
(irreversible
changes occur)
* Any production systems can be used to solve all type of problems
When to use which?
Partially commutative , monotonic production
systems are useful for solving ignorable problems.
These systems are important for man implementation
standpoint because they can be implemented without the
ability to backtrack to previous states, when it is discovered
that an incorrect path was followed. Such systems increase
the efficiency since it is not necessary to keep track of the
changes made in the search process.
Monotonic partially commutative systems are useful for
problems in which changes occur but can be reversed and in
which the order of operation is not critical (ex: 8 puzzle
problem).
Fig 6.1 A production system. Control loops until working memory
pattern no longer matches the conditions of any productions.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 6
Fig 6.2 Trace of a simple production system.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 7
Fig 6.3 The 8-puzzle as a production system.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 8
Fig 6.4 The 8-puzzle searched by a production system with loop detection
and depth-bound , from Nilsson (1971).
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 9
Fig 6.5 Legal moves of a chess knight.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 10
Fig 6.6 a 3 x 3 chessboard with move rules for the simplified knight tour
problem.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 11
Table 6.1 Production rules for the 3 x 3 knight problem.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 12
Fig6.7 A production system solution to the 3 x 3 knight’s tour problem.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 13
Fig 6.8 The recursive path algorithm as production system.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 14
Fig 6.9 Data-driven search in a production system.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 15
Fig 6.10 Goal-driven search in a production system.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 16
Fig 6.11 Bidirectional search missing in both directions, resulting in
excessive search.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 17
Fig 6.12 Bidirectional search meeting in the middle, eliminating much of
the space examined by unidirectional search.
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 18
Major advantages of production systems for artificial intelligence
Separation of Knowledge and Control
A Natural Mapping onto State Space Search
Modularity of Production Rules
Pattern-Directed Control
Opportunities for Heuristic Control of Search
Tracing and Explanation
Language Independence
A Plausible Model of Human Problem-Solving
Luger: Artificial Intelligence, 6th edition. © Pearson Education Limited, 2009 19
Issues in the design of the search
problem
Every Search process can be considered
either as
Traversal of search graph
The tree to searched must be in principle created in its
entirely from the rules defined for the search space
It is too large and never searched entirely in practice
Only some part is visited explicitly
Issues consists of:-
The direction of search
How to select applicable rule
How to represent each node of the search process
Traversal of search graph vs
search tree
Think of search space as
directed graph rather than a tree
As most of the rules are
repeated and hence generate
same node in the tree
Issues
Cycles may be introduces in the
graph
Effort required to understand
whether same node is generated
earlier also
Comparison of basic search
techniques
DFS
is effective when there are few sub trees in the search tree
that have only one connection point to the rest of the
states.
can be dangerous when the path closer to the START and
farther from the GOAL has been chosen.
Is best when the GOAL exists in the lower left portion of the
search tree.
Is effective when the search tree has a low branching
factor.
BFS
can work even in trees that are infinitely deep.
requires a lot of memory as number of nodes in level of the
tree increases exponentially.
is superior when the GOAL exists in the upper right portion
of a search tree.
09/28/20 Prof Saroj Kaushik 86
Heuristic Search
87
Heuristic Search Techniques
Direct techniques (blind search) are not
always possible (they require too much time
or memory).
Weak techniques can be effective if applied
correctly on the right kinds of tasks.
Typically require domain specific information.
88
Example: 8 Puzzle
1 2 3 1 2 3
7 8 4 8 4
6 5 7 6 5
89
1 2 3 GOAL 1 2 3
8 4 7 8 4
7 6 5 6 5
up
left right
1 2 3 1 2 3 1 2 3
7 8 4 7 8 4 7 4
6 5 6 5 6 8 5
Which move is best?
90
8 Puzzle Heuristics
Blind search techniques used an arbitrary
ordering (priority) of operations.
Heuristic search techniques make use of
domain specific information - a heuristic.
What heurisitic(s) can we use to decide which
8-puzzle move is “best” (worth considering
first).
91
8 Puzzle Heuristics
For now - we just want to establish some
ordering to the possible moves (the values of
our heuristic does not matter as long as it
ranks the moves).
Later - we will worry about the actual values
returned by the heuristic function.
92
A Simple 8-puzzle heuristic
Number of tiles in the correct position.
The higher the number the better.
Easy to compute (fast and takes little memory).
Probably the simplest possible heuristic.
93
Another approach
Number of tiles in the incorrect position.
This can also be considered a lower bound on the
number of moves from a solution!
The “best” move is the one with the lowest
number returned by the heuristic.
Is this heuristic more than a heuristic (is it always
correct?).
Given any 2 states, does it always order them properly
with respect to the minimum number of moves away
from a solution?
94
1 2 3 GOAL 1 2 3
8 4 7 8 4
7 6 5 6 5
up
left right
1 2 3 1 2 3 1 2 3
7 8 4 7 8 4 7 4
6 5 6 5 6 8 5
h=2 h=4 h=3
95
Another 8-puzzle heuristic
Count how far away (how many tile
movements) each tile is from it’s correct
position.
Sum up this count over all the tiles.
This is another estimate on the number of
moves away from a solution.
96
1 2 3 GOAL 1 2 3
8 4 7 8 4
7 6 5 6 5
up
left right
1 2 3 1 2 3 1 2 3
7 8 4 7 8 4 7 4
6 5 6 5 6 8 5
h=2 h=4 h=4
97
Techniques
There are a variety of search techniques that
rely on the estimate provided by a heuristic
function.
In all cases - the quality (accuracy) of the
heuristic is important in real-life application of
the technique!
98
Generate-and-test
Very simple strategy - just keep guessing.
do while goal not accomplished
generate a possible solution
test solution to see if it is a goal
Heuristics may be used to determine the
specific rules for solution generation.
99
Example - Traveling Salesman
Problem (TSP)
Traveler needs to visit n cities.
Know the distance between each pair of
cities.
Want to know the shortest route that visits all
the cities once.
n=80 will take millions of years to solve
exhaustively!
100
TSP Example
A 6 B
1 2
5 3
D 4 C
101
Generate-and-test Example
TSP - generation of
possible solutions is done
A B C D
in lexicographical order of
cities:
1. A - B - C - D B C D
2. A - B - D - C
3. A - C - B - D
C D B D C B
4. A - C - D - B
...
D C D B B C
102
Hill Climbing
Variation on generate-and-test:
generation of next state depends on feedback
from the test procedure.
Test now includes a heuristic function that
provides a guess as to how good each possible
state is.
There are a number of ways to use the
information returned by the test procedure.
103
Simple Hill Climbing
Use heuristic to move only to states that are
better than the current state.
Always move to better state when possible.
The process ends when all operators have
been applied and none of the resulting states
are better than the current state.
104
Simple Hill Climbing
Function Optimization
y = f(x)
y
x 105
Potential Problems with
Simple Hill Climbing
Will terminate when at local optimum.
The order of application of operators can
make a big difference.
Can’t see past a single move in the state
space.
106
Simple Hill Climbing Example
TSP - define state space as the set of all
possible tours.
Operators exchange the position of adjacent
cities within the current tour.
Heuristic function is the length of a tour.
107
TSP Hill Climb State Space
ABCD Initial State
ABCD
Swap 1,2 Swap 2,3
Swap 3,4 Swap 4,1
BACD
BACD ACBD
ACBD ABDC
ABDC DBCA
DBCA
Swap 1,2 Swap 3,4
Swap 2,3 Swap 4,1
CABD
CABD ABCD
ABCD ACDB
ACDB DCBA
DCBA
108
Steepest-Ascent Hill Climbing
A variation on simple hill climbing.
Instead of moving to the first state that is
better, move to the best possible state that is
one move away.
The order of operators does not matter.
Not just climbing to a better state, climbing up
the steepest slope.
109
Hill Climbing Termination
Local Optimum: all neighboring states are
worse or the same.
Plateau - all neighboring states are the same
as the current state.
Ridge - local optimum that is caused by
inability to apply 2 operators at once.
110
Problem in Hill Climbing
Initial State Possible Moves
Heuristic Dependence
Hill climbing is based on the value assigned
to states by the heuristic function.
The heuristic used by a hill climbing algorithm
does not need to be a static function of a
single state.
The heuristic can look ahead many states, or
can use other means to arrive at a value for a
state.
112
Best-First Search
Combines the advantages of Breadth-First
and Depth-First searchs.
DFS: follows a single path, don’t need to generate
all competing paths.
BFS: doesn’t get caught in loops or dead-end-
paths.
Best First Search: explore the most promising
path seen so far.
113
Best-First Search (cont.)
While goal not reached:
1. Generate all potential successor states and add to a list
of states.
2. Pick the best state in the list and go to it.
Similar to steepest-ascent, but don’t throw
away states that are not chosen.
114
Process of Best first Search
Simulated Annealing
Based on physical process of annealing a
metal to get the best (minimal energy) state.
Hill climbing with a twist:
allow some moves downhill (to worse states)
start out allowing large downhill moves (to much
worse states) and gradually allow only small
downhill moves.
116
Simulated Annealing (cont.)
The search initially jumps around a lot,
exploring many regions of the state space.
The jumping is gradually reduced and the
search becomes a simple hill climb (search
for local optimum).
117
Simulated Annealing
6
2 4
3 5
1
118
A* Algorithm (a sure test topic)
The A* algorithm uses a modified evaluation
function and a Best-First search.
A* minimizes the total path cost.
Under the right conditions A* provides the
cheapest cost solution in the optimal time!
119
A* evaluation function
The evaluation function f is an estimate of the
value of a node x given by:
f(x) = h’(x) + g(x)
g(x) is the cost to get from the start state to
state x.
h’(x) is the estimated cost to get from state x
to the goal state (the heuristic).
120
Modified State Evaluation
Value of each state is a combination of:
the cost of the path to the state
estimated cost of reaching a goal from the state.
The idea is to use the path to a state to
determine (partially) the rank of the state
when compared to other states.
This doesn’t make sense for DFS or BFS, but
is useful for Best-First Search.
121
Why we need modified evaluation
Consider a best-first search that generates
the same state many times.
Which of the paths leading to the state is the
best ?
Recall that often the path to a goal is the
answer (for example, the water jug problem)
122
A* Algorithm
The general idea is:
Best First Search with the modified evaluation
function.
h’(x) is an estimate of the number of steps from
state x to a goal state.
loops are avoided - we don’t expand the same
state twice.
Information about the path to the goal state is
retained.
123
A* Algorithm
1. Create a priority queue of search nodes (initially the start
state). Priority is determined by the function f )
2. While queue not empty and goal not found:
get best state x from the queue.
If x is not goal state:
generate all possible children of x (and save
path information with each node).
Apply f to each new node and add to queue.
Remove duplicates from queue (using f to pick
the best).
124
Example - Maze
START GOAL
125
Example - Maze
START GOAL
126
Do A* Guaranteed to find an optimal
solution?
h’ underestimates h h’ overestimates h
A* Optimality and Completeness
If the heuristic function h’ is admissible the
algorithm will find the optimal (shortest path)
to the solution in the minimum number of
steps possible (no optimal algorithm can do
better given the same heuristic).
An admissible heuristic is one that never
overestimates the cost of getting from a state
to the goal state (is pessimistic).
128
Admissible Heuristics
Given an admissible heuristic h’, path length
to each state given by g, and the actual path
length from any state to the goal given by a
function h.
We can prove that the solution found by A* is
the optimal solution.
129
A* Optimality Proof
Assume A* finds the (suboptimal) goal G2
and the optimal goal is G.
Since h’ is admissible: h’(G2)=h’(G)=0
Since G2 is not optimal: f(G2) > f(G).
At some point during the search some node n
on the optimal path to G is not expanded. We
know:
f(n) and f(G) are optimal
130
Proof (cont.)
We also know node n is not expanded before
G2, so:
f(G2) = f(n)
Combining these we know:
f(G2) = f(G)
This is a contradiction ! (G2 can’t be
suboptimal).
131
root (start state)
G G2
132
Problem Reduction
So far search strategies discussed were for
OR graphs.
Here several arcs indicate a different ways of
solving problem.
Another kind of structure is AND-OR graph
(tree).
Useful for representing the solution of problem
by decomposing it into smaller sub-problems.
Problem Reduction – Contd..
Each sub-problem is solved and final solution is
obtained by combining solutions of each sub-
problem.
Decomposition generates arcs that we will call
AND arc.
One AND arc may point to any number of
successors, all of which must be solved.
Such structure is called AND–OR graph rather
than simply AND graph.
Example of AND-OR Tree
Acquire TV
Steal TV Earn Money Buy TV
AND–OR Graph
To find a solution in AND–OR graph, we
need an algorithm similar to A*
with the ability to handle AND arc
appropriately.
In search for AND-OR graph, we will also
use the value of heuristic function f for
each node.
AND–OR Graph Search
Traverse AND-OR graph, starting from the
initial node and follow the current best path.
Accumulate the set of nodes that are on the
best path which have not yet been expanded.
Pick up one of these unexpanded nodes and
expand it.
Add its successors to the graph and compute f
(using only h) for each of them.
AND–OR Graph Search – Contd..
Change the f estimate of newly expanded node
to reflect the new information provided by its
successors.
Propagate this change backward through the graph to
the start.
Mark the best path which could be different from
the current best path.
Propagation of revised cost in AND-OR graph
was not there in A*.
Contd.. Example
Consider AND-OR graph given on next slide.
Let us assume that each arc with single successor will
have a cost of 1 and each AND arc with multiple
successor will have a cost of 1 for each of its
components for the sake of simplicity.
Here the numbers listed in the circular brackets ( ) are
estimated cost and the revised costs are enclosed in
square brackets [ ].
Thick lines indicate paths from a given node.
A
(20) (19) initially estimated values
[18] [28] revised values
B C D
(19) (8) (9) estimated values
E [17] F G [9] H I [17] J revised values
(5) (10) (3) (4) (8) (7) estimated values
Explanation
Initially we start from start node A and compute
heuristic values for each of its successors, say
{B, (C and D)} as {19, (8, 9)}.
The estimated cost of paths from A to B is 20 (19
+ cost of one arc from A to B) and from A to (C
and D) path is 19 ( 8+9 + cost of two arcs A to C
and A to D).
The path from A to (C and D) seems to be better.
So expend this AND path by expending C to {(G
and H)} and D to {(I and J)}.
Contd..
Now heuristic values of G, H, I and J are 3, 4, 8
and 7 respectively.
This leads to revised cost of C and D as 9 and 17
respectively.
These values are propagated up and the revised
costs of path from A through (C and D) is
calculated as 28 (9 + 17 + cost of arcs A to C and
A to D).
Now the revised cost of this path is 28 instead of
earlier estimation of 19 and this path is no longer a
best path.
Contd..
Then choose path from A to B for expansion.
After expansion we see that heuristic value of
node B is 17 thus making cost of path from A
to B to be 18.
This path is still best path so far, so further
explore path from A to B.
The process continues until either a solution
is found or all paths have lead to dead ends,
indicating that there is no solution.
One more example
Cyclic Graph
If a graph is cyclic (containing cycle) then the
algorithm discussed earlier does not operate
unless modified as follows:
If successor is generated and found to be already in
the graph, then
we must check that the node in the graph is not an ancestor of
the node being expanded.
If not, then newly discovered path to the node be entered in
the graph.
We can now state precisely the steps taken for
performing heuristic search of an AND-OR
graph.
Cyclic Graph – Contd..
Algorithm for searching AND-OR graph is called
AO*
Here we maintain single structure G, representing the
part of the search graph explicitly generated so far
rather than two lists, OPEN and CLOSED as in
previous algorithms.
Each node in the graph will
point both down to its immediate successors and up to
its immediate predecessor.
have an h value (an estimate of the cost of a path
from current node to a set of solution nodes)
associated with it.
Cyclic Graph – Contd..
We will not store g (cost from start to current
node) as it is not possible to compute a single
such value since there may be many paths to
the same state.
The value g is also not necessary because of the
top-down traversing of the best-known path which
guarantees that only nodes on the best path will
ever be considered for expansion.
So h will be good estimate for AND/OR graph
search.
The "Solve" labeling Procedure
A terminal node is labeled as
"solved" if it is a goal node (representing a solution of
sub-problem)
"unsolved" otherwise (as we can not further reduce it)
A non-terminal AND node labeled as
"solved" if all of its successors are "solved".
"unsolved" as soon as one of its successors is labeled
"unsolved".
A non-terminal OR node is labeled as
"solved" as soon as one of its successors is labeled
"solved".
"unsolved" if all its successors are "unsolved".
Example
1. After one cycle
A (3)
B (2) C (1) D (1)
2. After two cycle
A (4)
B (5) C (1) D (1)
Best path
E (4) F (6)
Example – contd..
3. After three cycle
A (5)
B (5) C (2) D (1)
Solved
E (4) F (6)
G (2) H (0) I (0)
Solved Solved
4. After four cycle
A (5)
Solved
B (5) C (2) D (1)
Solved
E (4) F (6)
G (2) H (0) I (0)
Solved Solved
AO* Algorithm
Let graph G consists initially the start node. Call it INIT.
Compute h(INIT).
Until INIT is SOLVED or h(INIT) > Threshold or Un_Sol
{1
Traverse the graph starting from INIT and follow the current
best path.
Accumulate the set of nodes that are on the path which
have not yet been expanded or labeled as SOLVED.
Select one of these unexpanded nodes. Call it NODE and
expand it.
Generate the successors of NODE. If there are none, then
assign Threshold as the value of this NODE else for each
SUCC that is also not ancestor of NODE do the following
{2
Add SUCC to the graph G and compute h for each.
If h(SUCC) = 0 then it is a solution node and label it as
SOLVED.
Contd..
Propagate the newly discovered information up the
graph as follows:
Initialize S with NODE.
Until S is empty
{3
Select from S, a node such that the selected node has no
ancestor in G occurring in S /* to avoid cycle */.
Call it CURRENT and remove it from S.
Compute the cost of each arcs emerging from CURRENT.
Cost of AND arc = (h of each of the nodes at the end of
the arc) + (cost of arc itself)
Assign the minimum of the costs as new h value of
CURRENT.
Mark the best path out of CURRENT
(with minimum cost).
Mark CURRENT node as SOLVED if all of the nodes
connected to it through the new marked arc have been
labeled SOLVED.
If CURRENT has been marked SOLVED or if the cost of
CURRENT was just changed, then new status must be
propagated back up the graph. So add to S all of the
ancestors of CURRENT.
3}
2 }
1 }
Longer Path May be Better
Consider another example
1
2 3 4 Unsolvable
5 6
7 8
9 10
Explanation
Nodes are numbered in order of their
generation.
Now node 10 is expanded at the next step
and one of its successors is node 5.
This new path to 5 is longer than the previous
path to 5 going through 3.
But since the path through 3 will only lead to
a solution as there is no solution to 4, so the
path through 10 is better.
Interaction between Sun goals
AO* may fail to take into account an interaction
between sub-goals.
A (10)
D (3)
E (2) C (5)
Assume that both C and E ultimately lead to a
solution.
Contd..
According to AO* algorithm, both C and D must be
solved to solve A.
Algorithm considers the solution of D as a
completely separate process from the solution of C.
As there is no interaction between these two sub-goals).
Looking just at the alternative from D, the path from
node E is the best path but it turns out that C is must
anyways, so it is better also to use it to satisfy D.
But to solve D, the path from node E is the best path
and will try to solve E.
AO* algorithm does not consider such interactions,
so it will find a non-optimal path.
Constrained Satisfaction
Many AI problems can be viewed as problems of
constrained satisfaction in which the goal is to solve
some problem state that satisfies a given set of
constraints.
Example of such a problem are
Crypt-Arithmetic puzzles.
Many design tasks can also be viewed as constrained
satisfaction problems.
N-Queen: Given the condition that no two queens on the
same row/column/diagonal attack each other.
Map colouring: Given a map, colour three regions in blue, red
and black, such that no two neighbouring regions have the
same colour.
Such problems do not require a new search methods.
They can be solved using any of the search strategies which
can be augmented with the list of constraints that change as
parts of the problem are solved.
Algorithm
Until a complete solution is found or all paths have
lead to dead ends
{
Select an unexpanded node of the search graph.
Apply the constraint inference rules to the selected
node to generate all possible new constraints.
If the set of constraints contain a contradiction, then
report that this path is a dead end.
If the set of constraint describes a complete solution,
then report success.
If neither a contradiction nor a complete solution has
been found, then
apply the problem space rules to generate new partial
solutions that are consistent with the current set of
constraints.
Insert these partial solutions into the search graph.
}
Crypt-Arithmetic puzzle
Problem Statement:
Solve the following puzzle by assigning numeral (0-9) in such a
way that each letter is assigned unique digit which satisfy the
following addition.
Constraints : No two letters have the same value. (The constraints
of arithmetic).
S E N D
+ M O R E
_________________________________
M O N E Y
_________________________________
Initial Problem State
S = ? ; E = ? ;N = ? ; D = ? ; M = ? ;O = ? ; R = ? ;Y = ?
Carries :
C4 = ? ; C3 = ? ; C2 = ? ; C1 = ?
C4 C3 C2 C1 Carry
S E N D
+ M O R E
_________________________________
M O N E Y
_________________________________
Constraint equations:
Y= D+E C1
E = N + R + C1 C2
N = E + O + C2 C3
O = S + M + C3 C4
M = C4
We can easily see that M has to be non
zero digit, so the value of C4 =1
1. M = C4 M = 1
C4 C3 C2 C1 Carry
2. O = S + M + C3 C4
For C4 =1, S + M + C3 > 9 S E N D
S + 1 + C3 > 9 S+C3 > 8.
If C3 = 0, then S = 9 else if C3 = 1, + M O R E
then S = 8 or 9. _________________________________
We see that for S = 9
C3 = 0 or 1
M O N E Y
It can be easily seen that C3 = 1 is _________________________________
not possible as O = S + M + C3
O = 11 O has to be assigned digit
1 but 1 is already assigned to M, so
not possible. Y= D+E C1
Therefore, only choice for C3 = 0, E = N + R + C1 C2
and thus O = 10. This implies that O
is assigned 0 (zero) digit. N = E + O + C2 C3
Therefore, O = 0 O = S + M + C3 C4
M = C4
M = 1, O = 0
3. Since C3 = 0; N = E + O + C2
produces no carry. C4 C3 C2 C1 Carry
As O = 0, N = E + C2 .
Since N E, therefore, C2 = 1. S E N D
Hence N = E + 1 + M O R E
Now E can take value from 2 to 8 _________________________________
{0,1,9 already assigned so far }
If E = 2, then N = 3.
M O N E Y
Since C2 = 1, from E = N + R + _________________________________
C1 , we get 12 = N + R + C1
If C1 = 0 then R = 9, which is not
possible as we are on the path
with S = 9
If C1 = 1 then R = 8, then
From Y = D + E , we get
10 + Y= D + 2 . Y= D+E C1
For no value of D, we can E = N + R + C1 C2
get Y. N = E + O + C2 C3
Try similarly for E = 3, 4. We fail in O = S + M + C3 C4
each case.
M = C4
If E = 5, then N = 6
Since C2 = 1, from E = N + R + C4 C3 C2 C1 Carry
C1 , we get 15 = N + R + C1 ,
If C1 = 0 then R = 9, which is not S E N D
possible as we are on the path
with S = 9. + M O R E
If C1 = 1 then R = 8, then _________________________________
From Y = D + E , we get 10 + M O N E Y
Y= D + 5 i.e., 5 + Y = D. _________________________________
If Y = 2 then D = 7. These
values are possible.
Hence we get the final solution as
given below and on backtracking, Y= D+E C1
we may find more solutions. E = N + R + C1 C2
N = E + O + C2 C3
S=9;E=5;N=6;D=7; O = S + M + C3 C4
M = 1 ; O = 0 ; R = 8 ;Y = 2 M = C4
Constraints:
Y = D + E C1
E = N + R + C1 C2
N = E + O + C2 C3
O = S + M + C3 C4
M = C4
Initial State
M = 1 C4 = 1
O = 1 + S + C3
O
S = 9 S = 8
O
C3 = 0 C3 = 1
O = 0 O = 1
Fixed
M = 1
O = 0
N = E + O + C2 = E + C2 C2 = 1 (must) N = E + 1
E = 2 E = 3 ….. E = 5
N = 3 N = 6
E = N + R + C1 E = N + R + C1
10 + 2 = 3 + R + C1 10 + 5 = 6 + R + C1
O O
R = 9 R = 8 R = 9 R = 8
C1 =0 C1 = 1 C1 = 0 C1 = 1
O O
10 + Y = D + E = D + 2 10 + Y = D + E = D + 5
O O
D = 8 D = 9 D = 7
Y = 0 Y = 1 Y = 2
The first solution obtained is:
M = 1, O = 0, S = 9, E = 5, N = 6, R = 8, D = 7, Y = 2
C4 C3 C2 C1 Carries
B A S E
+ B A L L
_________________________________
G A M E S
_________________________________
Constraints equations are:
E+L=S C1
S + L + C1= E C2
2A + C2 = M C3
2B + C3 = A C4
G = C4
Initial Problem State
G = ?; A = ?;M = ?; E = ?; S = ?; B = ?; L = ?
1. G = C4 G = 1
2. 2B+ C3 = A C4
2.1 Since C4 = 1, therefore, 2B+ C3 > 9 B can take values from 5 to 9.
2.2 Try the following steps for each value of B from 5 to 9 till we get a
possible value of B.
if C3 = 0 A = 0 M = 0 for C2 = 0 or M = 1 for C2 = 1
If B = 5
if C3 = 1 A = 1 (as G = 1 already)
For B = 6 we get similar contradiction while generating the search tree.
If B = 7 , then for C3 = 0, we get A = 4 M = 8 if C2 = 0 that leads to
contradiction, so this path is pruned. If C2 = 1, then M = 9 .
3. Let us solve S + L + C1 = E and E + L = S
Using both equations, we get 2L + C1 = 0 L=5 and C1 = 0
Using L = 5, we get S + 5 = E that should generate carry C2 = 1 as shown
above
So S+5 > 9 Possible values for E are {2, 3, 6, 8} (with carry bit C2 = 1 )
If E = 2 then S + 5 = 12 S = 7 (as B = 7 already)
If E = 3 then S + 5 = 13 S = 8.
Therefore E=3 and S = 8 are fixed up.
4. Hence we get the final solution as given below and on backtracking, we may find
more solutions. In this case we get only one solution.
G = 1; A = 4; M = 9; E = 3; S = 8;B = 7; L = 5