1 AI
“Weak AI”
- Machines can possibly act intelligently.
“Strong AI”
- Machines can actually think intelligently.
Most AI researchers take the weak hypothesis for granted, and don’t care about
the strong AI hypothesis.
* AI History
Turing Test:
* Search Techniques
There are two types of search:
1- Blind search
- Depth First Search.
- Breadth First Search.
- Hybrid Search.
2- Heuristic search
- Hill Climbing.
- Best First Search.
- A algorithm.
- A* algorithm.
The theory of state space search is the primary tool for answering these questions.
By representing a problem as a state space graph, the graph theory can be used to
analyze the structure and complexity of both the problem and the procedures used to
solve it.
If a graph is used, the problem of cycles occurs, so the best structure for representing a
problem is a tree (a graph with no cycles, in which there is a unique path between every
two nodes). It is important to distinguish between problems whose state space is a tree
and those that may contain loops (a graph). General graph search algorithms must detect
and eliminate loops from potential solution paths, while tree searches gain efficiency by
eliminating this test and its overhead. It is therefore useful to convert a graph
representation into a tree representation: break each cycle by duplicating the nodes that
can be reached along more than one path.
2
Example: convert a graph to a tree.
(figure: a graph over the nodes a, b, c, d, e on the left and its equivalent tree on the
right; nodes reachable along more than one path appear as duplicated leaves in the tree.)
Example: use depth first search to find the path between A and L for the following
search space:
(figure: the search space, a tree rooted at A, with children:
A → B, C, D, E; B → F, C; C → F, G, H; D → H; E → M, R; F → I, J; G → K, L; H → M; I → N.)
Solution:
Start=[A], goal=L.
Iteration #0:
Open=[A], closed=[ ], parent[A]=null.
Iteration #1:
X=A, L=[B,C,D,E], closed=[A], Parent[B]=A, parent[C]=A, parent[D]=A,
parent[E]=A. open=[B,C,D,E].
Iteration #2:
X=B, L=[F,C], closed=[A,B], Parent[F]=B, parent[C]=B, open=[F,C,D,E].
Iteration #3:
X=F, L=[I,J], closed=[A,B,F], Parent[I]=F, parent[J]=F, open=[I,J,C,D,E].
Iteration #4:
X=I, L=[N], closed=[A,B,F,I], Parent[N]=I, open=[N,J,C,D,E].
Iteration #5:
X=N, L=[ ], closed=[A,B,F,I,N], open=[J,C,D,E].
Iteration #6:
X=J, L=[ ], closed=[A,B,F,I,N,J], open=[C,D,E].
Iteration #7:
X=C, L=[F,G,H], closed=[A,B,F,I,N,J,C], Parent[G]=C, parent[H]=C,
open=[G,H,D,E].
Iteration #8:
X=G, L=[K,L], closed=[A,B,F,I,N,J,C,G], Parent[K]=G, parent[L]=G,
open=[K,L,H,D,E].
Iteration #9:
X=K, L=[ ], closed=[A,B,F,I,N,J,C,G,K], open=[L,H,D,E].
Iteration #10:
X=L
Since L is a goal, stop and find path.
Path: A B C G L
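The trace above can be reproduced by a short program. A minimal Python sketch, assuming the adjacency list below (reconstructed from the iterations): children are pushed on the front of open, and a child already on open is moved to the front with its parent updated, exactly as in the trace.

```python
# Depth first search with open/closed lists and parent pointers,
# following the iteration scheme of the worked example.
GRAPH = {  # adjacency list reconstructed from the trace (assumption)
    'A': ['B', 'C', 'D', 'E'], 'B': ['F', 'C'], 'C': ['F', 'G', 'H'],
    'D': ['H'], 'E': ['M', 'R'], 'F': ['I', 'J'], 'G': ['K', 'L'],
    'H': ['M'], 'I': ['N'],
}

def dfs(start, goal, graph):
    open_, closed, parent = [start], [], {start: None}
    while open_:
        x = open_.pop(0)                       # take the first state on open
        if x == goal:                          # goal found: walk parents back
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        closed.append(x)
        children = [c for c in graph.get(x, []) if c not in closed]
        for c in children:
            parent[c] = x                      # (re)set parent, as in the trace
        # children go to the FRONT of open; duplicates keep the front copy
        open_ = children + [n for n in open_ if n not in children]
    return None

print(dfs('A', 'L', GRAPH))                    # ['A', 'B', 'C', 'G', 'L']
```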
Example: use breadth first search to find the path between A and L for the
search space in the previous example:
Solution:
Start=[A], goal=L.
Iteration #0:
Open=[A], closed=[ ], parent[A]=null.
Iteration #1:
X=A, L=[B,C,D,E], closed=[A], Parent[B]=A, parent[C]=A, parent[D]=A,
parent[E]=A, open=[B,C,D,E].
Iteration #2:
X=B, L=[F,C], closed=[A,B], Parent[F]=B, open=[C,D,E,F].
Iteration #3:
X=C, L=[F,G,H], closed=[A,B,C], Parent[G]=C, parent[H]=C, open=[D,E,F,G,H].
Iteration #4:
X=D, L=[H], closed=[A,B,C,D], open=[E,F,G,H].
Iteration #5:
X=E, L=[M,R], closed=[A,B,C,D,E], Parent[M]=E, parent[R]=E, open=[F,G,H,M,R].
Iteration #6:
X=F, L=[I,J], closed=[A,B,C,D,E,F], Parent[I]=F, parent[J]=F, open=[G,H,M,R,I,J].
Iteration #7:
X=G, L=[K,L], closed=[A,B,C,D,E,F,G], Parent[K]=G, parent[L]=G,
open=[H,M,R,I,J,K,L].
Iteration #8:
X=H, L=[M], closed=[A,B,C,D,E,F,G,H], open=[M,R,I,J,K,L].
Iteration #9:
X=M, L=[ ], closed=[A,B,C,D,E,F,G,H,M], open=[R,I,J,K,L].
Iteration #10:
X=R, L=[ ], closed=[A,B,C,D,E,F,G,H,M,R], open=[I,J,K,L].
Iteration #11:
X=I, L=[N], closed=[A,B,C,D,E,F,G,H,M,R,I], Parent[N]=I, open=[J,K,L,N].
Iteration #12:
X=J, L=[ ], closed=[A,B,C,D,E,F,G,H,M,R,I,J], open=[K,L,N].
Iteration #13:
X=K, L=[ ], closed=[A,B,C,D,E,F,G,H,M,R,I,J,K], open=[L,N].
Iteration #14:
X=L
Since L is a goal, stop and find path.
Path: A C G L
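The breadth first variant differs only in how open is managed: children go to the end of open, and a child already on open or closed is skipped (so its original parent is kept). A minimal sketch under the same assumed adjacency list:

```python
# Breadth first search: new children are appended to the END of open,
# and a child already on open or closed is skipped.
GRAPH = {
    'A': ['B', 'C', 'D', 'E'], 'B': ['F', 'C'], 'C': ['F', 'G', 'H'],
    'D': ['H'], 'E': ['M', 'R'], 'F': ['I', 'J'], 'G': ['K', 'L'],
    'H': ['M'], 'I': ['N'],
}

def bfs(start, goal, graph):
    open_, closed, parent = [start], [], {start: None}
    while open_:
        x = open_.pop(0)
        if x == goal:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        closed.append(x)
        for c in graph.get(x, []):
            if c not in open_ and c not in closed:
                parent[c] = x
                open_.append(c)
    return None

print(bfs('A', 'L', GRAPH))                    # ['A', 'C', 'G', 'L']
```

Note the shorter path A, C, G, L: breadth first keeps parent[C]=A because C is skipped when regenerated from B.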
6
* Hybrid Search Algorithm
Apply depth first search iteratively, increasing the depth limit on the state space by
one each time, until the goal is found or the whole space has been searched without
success. Each time depth first search is reapplied, it starts again from the start state
and ignores any information from previous runs.
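A minimal sketch of this idea (iterative deepening), assuming the same search space as the examples above:

```python
# Hybrid (iterative deepening) search: depth first search with a depth
# limit that is raised by one on every restart, ignoring earlier runs.
GRAPH = {
    'A': ['B', 'C', 'D', 'E'], 'B': ['F', 'C'], 'C': ['F', 'G', 'H'],
    'D': ['H'], 'E': ['M', 'R'], 'F': ['I', 'J'], 'G': ['K', 'L'],
    'H': ['M'], 'I': ['N'],
}

def depth_limited(node, goal, graph, limit, path):
    if node == goal:
        return path + [node]
    if limit == 0:
        return None
    for child in graph.get(node, []):
        if child not in path:              # avoid cycling back along this path
            found = depth_limited(child, goal, graph, limit - 1, path + [node])
            if found:
                return found
    return None

def hybrid_search(start, goal, graph, max_depth=20):
    for limit in range(max_depth + 1):     # restart from scratch each time
        found = depth_limited(start, goal, graph, limit, [])
        if found:
            return found
    return None

print(hybrid_search('A', 'L', GRAPH))      # ['A', 'C', 'G', 'L']
```

Like breadth first search it finds a shallowest goal, but with the low memory cost of depth first search.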
* Logic
- Propositional Logic
- Predicate Logic
* Propositional Logic
“Water is liquid” is true.
“Today is Monday” is true.
“It is raining now” is true.
* Connectives
AND ∧
OR ∨
NOT ¬
IMPLIES →
EQUAL =
- Sentences formed by these connectives are called well formed formulas (wff).
- Brackets of the form (, ), [, ], {, } can be used to group symbols into sub-expressions
and thus control the order of evaluation, for example (P∧Q)=R and P∧(Q=R).
1- ¬(¬P) ≡ P.
2- (P∨Q) ≡ (¬P→Q).
3- ¬(P∧Q) ≡ (¬P∨¬Q).
4- ¬(P∨Q) ≡ (¬P∧¬Q).
5- P∨(Q∧R) ≡ (P∨Q)∧(P∨R).
6- P∧(Q∨R) ≡ (P∧Q)∨(P∧R).
7- P∨Q ≡ Q∨P.
8- P∧Q ≡ Q∧P.
9- (P∨Q)∨R ≡ P∨(Q∨R).
10- (P∧Q)∧R ≡ P∧(Q∧R).
11- P→Q ≡ ¬P∨Q. (important)
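Each identity can be verified mechanically by enumerating all truth assignments; a small Python check of identities 3 and 11:

```python
from itertools import product

def implies(p, q):
    """Truth value of P -> Q, computed by cases."""
    return q if p else True

# Identity 11: P -> Q  is equivalent to  (not P) or Q.
for p, q in product([True, False], repeat=2):
    assert implies(p, q) == ((not p) or q)

# Identity 3 (De Morgan): not(P and Q)  is equivalent to  (not P) or (not Q).
for p, q in product([True, False], repeat=2):
    assert (not (p and q)) == ((not p) or (not q))

print("identities 3 and 11 hold for all truth assignments")
```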
* Predicate Logic
predicate (arguments).
Examples
1- liquid (water).
is (water, liquid).
2- Today is Saturday
day (Saturday).
3- It is raining today.
weather (today, rain).
* Quantifiers
∀ : For all
∃ : There exists
Examples
∃X weather(X, rain) is true for some values of X.
* Definitions
1- Constant:
A constant refers to a specific object or to a property of an object. A constant
starts with a lower case letter.
2- Variable:
A variable is used to refer to general cases of objects or properties. Variables
start with an upper-case letter.
3- Function:
A function name starts with a lower case letter. It has an associated number
of arguments. Each argument can be either: constant, variable, or a function.
4- Term:
A term refers either to constant, variable, or a function.
- The English alphabet and the digits 0, 1, 2, ……, 9 and the underscore “ _ “
are used to construct a term.
5- Predicate:
A predicate names a relationship between zero or more objects. It starts with
a lower case letter and the name is constructed using the characters used for
terms.
6- Atomic Sentence:
Is a predicate with arity n followed by n terms enclosed in parentheses and
separated by commas, predicate logic sentences are delimited by the period
character “.”.
Examples:
- like (X,Y).
- ∀X ∃Y like(X,Y).
- ∃X ∀Y like(X,Y).
7- Literal:
A literal is an atomic sentence or the negation of an atomic sentence.
8- Clause:
A clause is one or more literals connected by the connectives (∧, ∨, ¬, →, =).
* Horn Clause
A Horn clause has the following form:
b1(X,Y) ∧ b2(X,Y) ∧ …… ∧ bn(X,Y) → a(X,Y), where the literals a, b1, …, bn
are all positive.
a(X,Y) is called the head of the clause and b1(X,Y) ∧ …… ∧ bn(X,Y) is called
the body of the clause.
1- The clause has no head: b1(X,Y) ∧ b2(X,Y) ∧ …… ∧ bn(X,Y) →. This
represents a goal to be proved.
2- The clause has no body: a(X,Y). This represents a fact.
3- The clause has a body and a head: b1(X,Y) ∧ ….. ∧ bn(X,Y) → a(X,Y).
In this case the clause is called a rule.
A Horn clause may also be written in the form: ¬b1(X,Y) ∨ ¬b2(X,Y) ∨ …. ∨ ¬bn(X,Y)
∨ a(X,Y). Thus a Horn clause can be defined as a clause with at most one positive literal.
* Common Identities
1- ¬∃X p(X) ≡ ∀X ¬p(X).
2- ¬∀X p(X) ≡ ∃X ¬p(X).
3- ∃X p(X) ≡ ∃Y p(Y).
4- ∀X p(X) ≡ ∀Y p(Y).
5- ∀X [p(X) ∧ q(X)] ≡ ∀X p(X) ∧ ∀Y q(Y).
6- ∃X [p(X) ∨ q(X)] ≡ ∃X p(X) ∨ ∃Y q(Y).
* Examples:
Represent: “all basketball players are tall”.
Solution\ ∀X tall(X). X ∈ {basketball players}.
∀X [basketball_player(X) → tall(X)]. X ∈ {players}.
∀X [player(X) ∧ play(X, basketball) → tall(X)]. X ∈ {all people}.
* Unification
Unification is the process of making two literals look alike.
Assume that we have a set of literals to be unified: {L1, L2, L3, …., Lk}. We seek a
substitution F = {(t1,v1), (t2,v2), ……, (tn,vn)} such that:
L1F = L2F = …… = LkF.
Example:
2- L1 = father (X,Y).
L2 = father (X1,Y1).
F = {(X,X1),(Y,Y1)}.
L1F = father (X,Y).
L2F = father (X,Y).
Q = {(ali,X),(ahmed,Y),(ali,X1),(ahmed,Y1)}.
L1Q = father (ali,ahmed).
L2Q = father (ali,ahmed).
R = {(ali,X),(ali,X1),(Y,Y1)}.
L1R = father (ali,Y).
L2R = father (ali,Y).
Let F be a substitution; then F is the mgu (most general unifier) of S = {L1, L2, …., Lk}
provided that for any other unifier Q of {L1, L2, ….., Lk} there exists a unifier R of
{L1, L2, …., Lk} such that: Q(S) = R(F(S)).
Unification is done by replacing a variable by a term (important).
* Procedure mgu
Begin
- Generate the first disagreement set D.
- Repeat
While (D is a disagreement set) do
If none of the terms in D consists of a variable by itself then
- stop and report failure, because the set of literals cannot
be unified by any substitution.
Else
- Convert the variable into a term by adding a pair of the
form (t,v) and simultaneously performing the substitution
in the literals.
If v is not in F as a second argument of a pair, then
- add the pair (tp,vp) to F.
Else
- stop and report failure.
End if
End if
End while
- generate the next disagreement set D;
Until (no D remains)
Return (F).
End.
Example:
Let L1 = p (X,f(Y)).
L2 = p(a,f(g(Z))).
Find the mgu.
Solution\
F = {}.
Step1:
D = {X,a}.
F = {(a,X)}.
L1 = L1F = p(a,f(Y)).
L2 = L2F = p(a,f(g(Z))).
Step2:
D = {f(Y),f(g(Z))}.
F = {(a,X),(g(Z),Y)}.
L1 = L1F = p(a,f(g(Z))).
L2 = L2F = p(a,f(g(Z))).
Stop.
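A simplified sketch of unification in Python (a compound term is a tuple whose first element is the functor, any string starting with an upper-case letter is a variable, and no occurs check is performed):

```python
def is_var(t):
    return isinstance(t, str) and t[:1].isupper()

def substitute(t, subst):
    """Apply a substitution {variable: term} to a term."""
    if is_var(t):
        return substitute(subst[t], subst) if t in subst else t
    if isinstance(t, tuple):                      # compound term
        return tuple(substitute(a, subst) for a in t)
    return t                                      # constant

def unify(t1, t2, subst=None):
    """Return the mgu of two terms as {variable: term}, or None on failure."""
    subst = {} if subst is None else subst
    t1, t2 = substitute(t1, subst), substitute(t2, subst)
    if t1 == t2:
        return subst
    if is_var(t1):
        subst[t1] = t2                            # replace a variable by a term
        return subst
    if is_var(t2):
        subst[t2] = t1
        return subst
    if isinstance(t1, tuple) and isinstance(t2, tuple) and len(t1) == len(t2):
        for a, b in zip(t1, t2):                  # walk the disagreement sets
            subst = unify(a, b, subst)
            if subst is None:
                return None
        return subst
    return None                                   # e.g. two different constants

L1 = ('p', 'X', ('f', 'Y'))                       # p(X, f(Y))
L2 = ('p', 'a', ('f', ('g', 'Z')))                # p(a, f(g(Z)))
print(unify(L1, L2))                              # {'X': 'a', 'Y': ('g', 'Z')}
```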
* Skolemization
Example:
2- ∀Y ∃X father(X,Y)
∀Y father(f(Y),Y).
f(Y) is a skolem function.
4- ∀X ∀Y ∃Z ∀W p(X,Y,Z,W).
∀X ∀Y ∀W p(X,Y,f(X,Y),W).
5- ∀X ∀Y ∃Z ∃W p(X,Y,Z,W).
∀X ∀Y p(X,Y,f(X,Y),g(X,Y)).
* Clause Form
Definition: A predicate logic expression (WFF, Well-Formed Formula) is in clause form
if it consists of a disjunction of literals only.
p1 ∨ p2 ∨ ….. ∨ pn is in clause form.
¬p1 ∨ p2 ∨ ….. ∨ pn is in clause form.
(p1 ∧ p2) → q1 is not in clause form.
Example:
Solution:
3- Standardize variables, so that each quantifier binds a variable with a different
name.
∀X {[¬p(X) ∨ ¬q(X)] ∨ [r(X,a) ∧ ∃Y (∀Z ¬r(Y,Z) ∨ s(X,Y))]} ∨ ∀W t(W).
5- Skolemize existential variables.
∀X ∀Z ∀W {[¬p(X) ∨ ¬q(X)] ∨ [r(X,a) ∧ (¬r(f(X),Z) ∨ s(X,f(X)))]} ∨ t(W).
9- Rename variables such that each clause uses a different set of variables.
Definition: (complete):
Inference Rules
1- Modus-Ponens.
2- Resolution.
Modus Ponens
If p → q
and p is true,
Conclusion: q is true.
Example:
C1: p → q.
C2: (q ∧ r) → t.
C3: p.
C4: r.
Is t true?
p → q
p is true.
q is true.
(q ∧ r) → t
q is true and r is true.
Conclusion: t is true.
* Resolution
If C1 and C2 are clauses in normal form such that C1 contains a literal and C2 contains
the negation of that literal, the result is a clause which consists of the disjunction of
all literals in C1 and C2 except the literal and its negation (P, ¬P). The resulting
clause is called the resolvent.
Example:
C1: p ∨ q ∨ r ∨ s
C2: t ∨ z ∨ ¬q
Resolvent: p ∨ r ∨ s ∨ t ∨ z.
Example:
C1: ¬p ∨ q
C2: p
C3: ¬q
C4 (C1 and C2): q
C5 (C4 and C3): □ (the empty clause)
If the empty clause □ appears in the result, it means there is a contradiction between
the terms (q, ¬q). So to prove a goal, add the negation of the goal and then apply
resolution; if □ appears, then the negation of the goal leads to a contradiction, which
means the goal is true, and vice-versa.
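Refutation by resolution can be sketched for the propositional case; the clauses below reproduce the modus ponens example (C1 to C4) plus the negated goal ¬t:

```python
# Proof by refutation: clauses are frozensets of literals;
# a negative literal is written with a leading '-'.
def negate(lit):
    return lit[1:] if lit.startswith('-') else '-' + lit

def resolve(c1, c2):
    """All resolvents of two clauses."""
    out = []
    for lit in c1:
        if negate(lit) in c2:
            out.append((c1 - {lit}) | (c2 - {negate(lit)}))
    return out

def refute(clauses):
    """Return True if the empty clause can be derived (goal proved)."""
    clauses = set(clauses)
    while True:
        new = set()
        for a in clauses:
            for b in clauses:
                if a != b:
                    for r in resolve(a, b):
                        if not r:             # empty clause: contradiction
                            return True
                        new.add(r)
        if new <= clauses:                    # nothing new: cannot be proved
            return False
        clauses |= new

# C1: -p v q, C2: -q v -r v t, C3: p, C4: r, negated goal: -t
kb = [frozenset({'-p', 'q'}), frozenset({'-q', '-r', 't'}),
      frozenset({'p'}), frozenset({'r'}), frozenset({'-t'})]
print(refute(kb))                             # True: t follows from the clauses
```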
Solution:
C2': ¬dog(X1) ∨ animal(X1).
C3': ¬animal(X2) ∨ die(X2).
g': ¬die(fido).
#1: resolve g' with C3' using {(fido,X2)}: C4: ¬animal(fido).
#2: resolve C4 with C2' using {(fido,X1)}: C5: ¬dog(fido).
#3: resolve C5 with C1': dog(fido): empty clause.
Example:
“ all people that are not poor and smart are happy. Those people that read are not
stupid. Ali can read and is wealthy. Happy people have easy life”.
Solution:
C1': poor(X1) ∨ ¬smart(X1) ∨ happy(X1).
C2': ¬read(X2) ∨ smart(X2).
C31': read(ali).
C32': ¬poor(ali).
C4': ¬happy(X3) ∨ easy_life(X3).
g': ¬easy_life(Z).
#1: resolve g' with C4' using {(Z,X3)}: C5: ¬happy(Z).
#2: resolve C5 with C1' using {(Z,X1)}: C6: poor(Z) ∨ ¬smart(Z).
#3: resolve C6 with C32' using {(ali,Z)}: C7: ¬smart(ali).
#4: resolve C7 with C2' using {(ali,X2)}: C8: ¬read(ali).
#5: resolve C8 with C31': empty clause.
Solution:
C1': r(X1) ∨ ¬t(X1).
C2': ¬d(X2) ∨ t(X2).
C31': d(a).
C32': h(a).
Round 1
Round 2
C20: empty.
(figure: resolution tree for the refutation above.)
The set of support consists initially of the negation of the goal; consequently, any
resolvent one of whose parents is in the set of support becomes a member of the set of
support. This strategy is good for dealing with a large number of clauses.
S = {¬h(Z) ∨ ¬r(Z), C4}.
2- resolve C4 and C2':
C5: ¬h(Z) ∨ ¬d(Z). {(Z,X2)}
S = {¬h(Z) ∨ ¬r(Z), C4, C5}.
3- resolve C5 and C31':
C6: ¬h(a). {(a,Z)}
S = {¬h(Z) ∨ ¬r(Z), C4, C5, C6}.
4- resolve C6 and C32':
C7: empty.
(figure: derivation tree; the goal resolves with C1' to give C4, C4 with C2' gives C5,
C5 with C31' gives C6, and C6 with C32' gives the empty clause.)
C1': r(X1) ∨ ¬t(X1).
C2': ¬d(X2) ∨ t(X2).
C31': d(a).
C32': h(a).
Goal: ¬h(Z) ∨ ¬r(Z).
(figure: derivation tree; the goal resolves with C1' to give C4, C4 with C2' gives C5,
C5 with C31' gives C6, and C6 with C32' gives the empty clause.)
Solution: Convert to normal form:
(figure: derivation tree; the goal resolves with C1' to give C4, C4 with C2' gives C5,
C5 with C31' gives C6, and C6 with C32' gives the empty clause.)
* Heuristics
May be defined as the study of methods and rules of discovery and invention.
In state space search heuristics are formalized as rules for choosing those
branches in the state space that are most likely to lead to an acceptable solution.
Example:
In vision, scenes are often ambiguous, allowing multiple interpretations of
objects. Heuristics are used to select the most likely interpretation.
A heuristic can lead a search algorithm to a suboptimal solution or fail to find
any solution at all.
* Heuristic Algorithms:
It is useful to think of heuristic algorithms as consisting of two parts:
1- A heuristic measure function (a measure that determines which path is the
best).
2- An algorithm that uses the measure to search the state space.
* Heuristic Functions:
There are heuristics of very general applicability and ones that represent
specific knowledge relevant to the solution of a particular problem. One
example of a good general-purpose heuristic is the nearest neighbor
algorithm, which works by selecting the locally superior alternative at each step.
General purpose heuristics may be coupled with some special purpose heuristics
so as to work well for the specific domain.
The purpose of the heuristics functions is to guide the search process in the most
profitable direction by suggesting, which path to follow first when more than one path
is available.
In general there is a trade off between the cost of evaluating a heuristic function
and the savings in search time that the function provides.
However, the more accurately the heuristic function estimates the true merits of
each node in the tree or graph, the more direct the solution process will be.
Examples:
1- 8-puzzle: the number of tiles that are in the place they belong to.
2- traveling salesman: the sum of the distances traveled.
* Search Algorithms
The strategy used for controlling a search is often critical in determining how
effective the heuristic functions are; here are some general-purpose control strategies:
1- Generate and test.
2- Hill climbing.
3- Best first search.
4- A* .
* Hill Climbing:
The strategy consists of the following steps:
2- From this solution apply a number of applicable rules to generate a new set
of proposed solutions.
5- Go back to step 2.
Example: Four cubes, each of whose sides is painted with one of 4 colors. A solution
to the puzzle consists of an arrangement of the cubes in a row such that on all four
sides of the row, one block face of each color is showing.
To solve the problem, we first need to define a heuristic function that describes
how close a particular configuration is to being a solution. One such solution is
simply the sum of the number of different colors on each of the four sides. Next we
need to define a set of rules that describe ways of transforming one configuration into
another. One rule will suffice: simply pick a block and rotate it 90° in any
direction.
The next step is to generate a starting state this can be done at random. Now hill
climbing can begin. We could try all possible rotations of all four blocks and see
which leads to the greatest improvement. Another strategy may be to try some of the
possible moves. This can be done by picking one block and see if there is any way to
rotate it to improve the situation. If there is, perform that rotation and continue. But
what if more than one of the possible rotations produces a desirable state?
* Hill Climbing Algorithm:
Begin
CS=start, found=false, Path=[]
While (not found) do
Begin
Add CS to Path
If CS=goal then return (path)
Else generate all child states of CS
Remove any child states of CS already on Path
If CS has no remaining child states then return (fail)
Else compute the heuristic value of all remaining child states
Choose a child state with the best heuristic value call it B
If B is not better than CS then return (fail)
Else CS=B
End
End
Example: use hill climbing search to find the path between A and P for the
following search space:
(figure: search space with a heuristic value attached to each node, e.g. A=20, F=8,
J=11, E=1, G=14, H=10, I=12, L=3, Q=14, R=13, O=3, and the goal P=0; the path
A, C, H, P leads to the goal.)
Solution:
Iteration CS Path
0 A []
1 C [A]
2 H [A,C]
3 P [A,C,H]
4 [A,C,H,P]
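The run above can be sketched in Python. The search space and heuristic values below are illustrative assumptions, chosen so that the run reproduces the path A, C, H, P of the example (lower values are better):

```python
# Hill climbing: always move to the child with the best heuristic value,
# failing if no child improves on the current state.
CHILDREN = {'A': ['B', 'C', 'D'], 'C': ['G', 'H'], 'H': ['O', 'P']}  # assumed
H = {'A': 20, 'B': 8, 'C': 4, 'D': 6, 'G': 14, 'H': 3, 'O': 3, 'P': 0}

def hill_climbing(start, goal):
    cs, path = start, []
    while True:
        path.append(cs)
        if cs == goal:
            return path
        children = [c for c in CHILDREN.get(cs, []) if c not in path]
        if not children:
            return None                       # dead end
        best = min(children, key=H.get)       # child with best heuristic value
        if H[best] >= H[cs]:
            return None                       # no child improves on CS: fail
        cs = best

print(hill_climbing('A', 'P'))                # ['A', 'C', 'H', 'P']
```

Note that the algorithm keeps no open list: a wrong local choice cannot be undone, which is exactly the weakness discussed next.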
With hill climbing one can arrive at the following situations:
1- A local maximum is a state that is better than all its neighbors but is not better
than other states further away. At a local maximum, all moves appear to make
things worse.
2- A plateau is a flat area of the search space in which a whole set of neighboring
states have the same value. On a plateau, it is not possible to determine the best
direction in which to move by making local comparisons.
3- A ridge is an area of the search space that is higher than surrounding areas but
that cannot be traversed by single moves in any one direction.
There are some ways of dealing with these problems, although the methods are by no
means guaranteed.
The solution:
1- Backtrack to some earlier node and try going in a different direction. This is
reasonable if at that node there was another direction that looked as promising
or almost as promising as one that was chosen. To adopt this strategy,
maintain a list of paths almost taken and go back to one of them, if the path
that was taken leads to a dead end.
2- Make a big jump in one direction to try to get to a new section of the search
space. This strategy may be used with plateaus. If the only rules available
describe single small steps, apply them several times in the same direction.
3- Apply two or more rules before doing the test. This corresponds to moving in
several directions at once. This is a good strategy for dealing with ridges.
* Best First Search:
An added step in the algorithm orders the states on open according to some heuristic
estimate of their closeness to the goal. Thus each iteration of the loop considers the
most promising state on the open list:
Example: use best first search to find the path between A and P for the following
search space:
(figure: search space with heuristic values A=20; B=2, C=4, D=6; E=5, F=5, G=4, H=3,
I=12, J=11; H's children are O=2 and the goal P=3.)
Solution:
Start=[A], goal=P.
Iteration #0:
open=[A], closed=[ ], parent[A]=null.
Iteration #1:
X=A, L=[B,C,D], open=[ B2,C4, ,D6], closed=[A].
parent[B]=A, parent[C]=A, parent[D]=A.
Iteration #2:
X=B, L=[E,F], open=[C4,E5,F5,D6], closed=[A,B].
parent[E]=B, parent[F]=B.
Iteration #3:
X=C, L=[G,H], open=[H3,G4,E5,F5,D6], closed=[A,B,C].
parent[G]=C, parent[H]=C.
Iteration #4:
X=H, L=[O,P], open=[O2,P3,G4,E5,F5,D6], closed=[A,B,C,H].
parent[O]=H, parent[P]=H.
Iteration #5:
X=O, L=[], open=[ P3,G4,E5,F5,D6], closed=[A,B,C,H,O].
Iteration #6:
X=P=goal then stop
Path: A C H P
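A sketch in Python, assuming the search space reconstructed above; open is re-sorted by heuristic value on every iteration:

```python
# Best first search: open is kept ordered by heuristic value (lower = better).
CHILDREN = {'A': ['B', 'C', 'D'], 'B': ['E', 'F'], 'C': ['G', 'H'],
            'H': ['O', 'P']}
H = {'A': 20, 'B': 2, 'C': 4, 'D': 6, 'E': 5, 'F': 5, 'G': 4, 'H': 3,
     'O': 2, 'P': 3}

def best_first(start, goal):
    open_, closed, parent = [start], [], {start: None}
    while open_:
        open_.sort(key=H.get)                 # most promising state first
        x = open_.pop(0)
        if x == goal:
            path = []
            while x is not None:
                path.append(x)
                x = parent[x]
            return path[::-1]
        closed.append(x)
        for c in CHILDREN.get(x, []):
            if c not in open_ and c not in closed:
                parent[c] = x
                open_.append(c)
    return None

print(best_first('A', 'P'))                   # ['A', 'C', 'H', 'P']
```

Unlike hill climbing, the open list lets the search recover from a bad local choice (here it briefly visits B and O but is not trapped by them).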
This state will have a greater probability of being on the shortest path to the goal.
The distance from the starting state to its descendants can be measured by maintaining
a depth count for each state. This count is (0) for the beginning state and is incremented
by (1) for each level of the search. It records the actual number of moves that have
been used to go from the starting state to each descendant. This can be added to the
heuristic evaluation of each state to bias the search in favor of states found shallower
in the graph.
In what sense is one heuristic better than another? This is the informedness of a
heuristic.
When a state is discovered using heuristic search, is there any guarantee that the
same state won't be found later in the search at a cheaper cost (with a shorter path
from the start state)? This is the property of monotonicity.
* Admissibility measures:
An algorithm is admissible if it is guaranteed to find a minimal path to a
solution whenever such a path exists.
Breadth first search is an admissible search strategy, since it looks at every state
at level n of the graph before considering any state at level n+1 so any goal nodes are
found along the shortest possible path, however, it is often too inefficient for practical
use.
* Definitions:
Consider the evaluation function f(n)=g(n)+h(n), where:
- n is any state encountered in the search.
- g(n) is the cost of n from the start state.
- h(n) is the heuristic estimate of the cost of going from n to goal.
If this evaluation function is used with the BFS algorithm, the result is called
algorithm A.
- g*(n) is the cost of the shortest path from the start node to node n.
- h*(n) is the actual cost of the shortest path from n to the goal.
- f*(n) = g*(n) + h*(n).
It follows that f*(n) is the actual cost of the optimal path from a start node to a goal
node that passes through n.
If BFS is used with the evaluation function f*, the resulting search strategy is
admissible. For most real problems, f* does not exist.
*Algorithm A:
Begin
Input the start node s and the goal node.
Open=[s]; closed=[], g[s]=0; pred[s]=null.
While open<>[] do
Begin
Remove the first element from open, call it X.
If X is a goal then return the solution path that led to X.
Else
Generate all possible children of X and put them in list L.
For each child Y of X do
Case
The child Y is not on open or closed:
Begin
g[Y]=g[X]+cost(X,Y).
f[Y]=g[Y]+h[Y].
Set pred[Y]=X.
Add the child Y to open.
End;
Else
The child Y is on open or on closed:
Begin
temp=f[Y]-g[Y]+g[X]+cost(X,Y).
If temp<f[Y] then
Begin
g[Y]=g[X]+cost(X,Y).
f[Y]=temp.
pred[Y]=X.
If Y is on closed then insert Y in open and
remove it from closed
End;
End;
End; {case}
Put X on closed
Reorder states on open according to heuristic value (best values first)
End; {while}
Return (failure)
End.
Example: use A algorithm to find the path between A and K for the following
search space
(figure: search space with heuristic values and edge costs:
A=40; A→B cost 8, A→C cost 5, A→D cost 7.
B=20, C=25, D=27; B→E cost 15, B→F cost 18; C→G cost 7, C→H cost 28, C→I cost 12;
D→I cost 7.
E=50, F=60, G=18, I=18; G→H cost 15, I→H cost 18.
H=10; H→J cost 7, H→K cost 4.
J=12, K=0.)
Solution:
Start=[A], goal=K.
Iteration #0:
Open=[A], closed=[ ], G[A]=0, pred[A]=null.
Iteration #1:
X=A, L=[B,C,D], G[B]=8, F[B]=28, pred[B]=A, G[C]=5 F[C]=30, pred[C]=A,
G[D]=7, F[D]=34, pred[D]=A, closed=[A], open=[B28,C30,D34].
Iteration #2:
X=B, L=[E,F], G[E]=23, F[E]=73, pred[E}=B, G[F]=26, F[F]=86, pred[F]=B,
closed=[A,B], open=[C30,D34,E73,F86].
Iteration #3:
X=C, L=[G,H,I], G[G]=12, F[G]=30, pred[G]=C, G[H]=33, F[H]=43, pred[H]=C,
G[I]=17, F[I]=35, pred[I]=C, closed=[A,B,C], open=[G30,D34,I35,H43,E73,F86].
Iteration #4:
X=G, L=[H], temp=37, G[H]=27, F[H]=37, pred[H]=G, closed=[A,B,C,G],
open=[D34,I35,H37,E73,F86].
Iteration #5:
X=D, L=[I], temp=32, G[I]=14, F[I]=32, pred[I]=D, closed=[A,B,C,G,D],
open=[I32,H37,E73,F86].
Iteration #6:
X=I, L=[H], temp=42, closed=[A,B,C,G,D,I], open=[H37,E73,F86].
Iteration #7:
X=H, L=[J,K], G[J]=34, F[J]=46, pred[J]=H, G[K]=31, F[K]=31, pred[K]=H,
closed=[A,B,C,G,D,I,H], open=[K31,J46].
Iteration #8:
X=K
Since K is a goal, stop and find path.
Path: A C G H K
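A sketch of algorithm A in Python, using the edge costs and heuristic values from the example; note how H is first reached through C at cost 33 and later improved through G to cost 27, exactly as in iterations 3 and 4:

```python
# Algorithm A (A* when h is admissible): f(n) = g(n) + h(n).
COST = {'A': {'B': 8, 'C': 5, 'D': 7}, 'B': {'E': 15, 'F': 18},
        'C': {'G': 7, 'H': 28, 'I': 12}, 'D': {'I': 7},
        'G': {'H': 15}, 'I': {'H': 18}, 'H': {'J': 7, 'K': 4}}
H = {'A': 40, 'B': 20, 'C': 25, 'D': 27, 'E': 50, 'F': 60,
     'G': 18, 'H': 10, 'I': 18, 'J': 12, 'K': 0}

def a_star(start, goal):
    g, f, pred = {start: 0}, {start: H[start]}, {start: None}
    open_, closed = [start], []
    while open_:
        open_.sort(key=f.get)                 # best f value first
        x = open_.pop(0)
        if x == goal:
            path = []
            while x is not None:
                path.append(x)
                x = pred[x]
            return path[::-1], g[goal]
        for y, c in COST.get(x, {}).items():
            tentative = g[x] + c
            if y not in g:                    # first time Y is generated
                g[y], f[y], pred[y] = tentative, tentative + H[y], x
                open_.append(y)
            elif tentative < g[y]:            # a cheaper path to Y was found
                g[y], f[y], pred[y] = tentative, tentative + H[y], x
                if y in closed:               # reopen Y
                    closed.remove(y)
                    open_.append(y)
        closed.append(x)
    return None

print(a_star('A', 'K'))                       # (['A', 'C', 'G', 'H', 'K'], 31)
```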
All A* algorithms are admissible. This theorem says that any A* algorithm (i.e. one
that uses a heuristic h(n) such that h(n) ≤ h*(n) for all n) is guaranteed to find the
minimal path from the start node to the goal, if such a path exists. Breadth first
search may be characterized as an A* algorithm in which f(n) = g(n) + 0 (i.e. h(n) = zero).
In the 8-puzzle, the heuristic h(n) of counting the number of tiles out of
place is less than or equal to the actual number of moves required to move
them to their goal positions. Hence this heuristic is admissible and guarantees the
optimal or shortest path solution.
* Monotonicity
The definition of the A* algorithm did not require that g(n) = g*(n); this means
that admissible heuristics may initially reach non-goal states along a suboptimal path,
as long as the algorithm eventually finds an optimal path to all states on the
path to the goal. A heuristic function h is monotonic if, for every state Si and its
child Sj, h(Si) − h(Sj) ≤ cost(Si,Sj), and h(goal) = 0.
This describes the monotone property, which is to say that the heuristic is
everywhere admissible, reaching each state along the shortest path from its ancestors.
If the graph search algorithm for BFS is used with a monotonic heuristic, an important
step may be omitted: since the heuristic finds the shortest path to any state the first
time that state is discovered, when a state is rediscovered it is not necessary to check
whether the new path is shorter.
S1 to S2 h(S1)-h(S2) ≤ cost (S1,S2)
S2 to S3 h(S2)-h(S3) ≤ cost (S2,S3)
S3 to S4 h(S3)-h(S4) ≤ cost (S3,S4)
.
.
.
Sg-1 to Sg h(Sg-1)-h(Sg) ≤ cost (Sg-1,Sg)
Summing up each column and using the monotone property h(Sg) = 0 gives, for the path
from S1 to Sg: h(S1) ≤ cost(S1,Sg).
*Informedness
For two A* heuristics h1 and h2, if h1(n) ≤ h2(n) for all states n in the search
space, heuristic h2 is said to be more informed than h1.
For the 8-puzzle, let h1 = 0 (breadth first search); this is trivially less than h*.
Also h2, the number of tiles out of place with respect to the goal state, is a lower
bound for h*. In this case:
h1 ≤ h2 ≤ h*
It follows that the number-of-tiles-out-of-place heuristic is more informed than
breadth first search; both h1 and h2 find the optimal path, but h2 evaluates many fewer
states.
Similarly, we can argue that the heuristic that calculates the sum of the direct
distances by which all the tiles are out of place is again more informed than the
number of tiles out of place.
One can visualize a sequence of search spaces, each smaller than the previous
one, converging on the optimal path solution.
Search in Games
*The minimax procedure:
Games are an important application area for heuristic algorithms. Two-person
games are more complicated than simple puzzles because of the existence of an
unpredictable opponent.
(figure: part of the state space for the game of nim, with states such as 6-1, 5-2,
and 4-3, and levels alternating between max and min.)
To play nim, a number of matches are placed on a table. At each move, the
player must divide a pile of matches into two non-empty piles with a different number
of matches in each pile. The first player who can no longer make a move loses the
game.
Each level in the search space is labelled according to whose move it is at that
point in the game (i.e. min or max). Each leaf node is given a value of (0) or (1)
depending on whether it is a win for max or for min. Minimax propagates these values
up the graph through successive parent nodes according to the rule:
- If the parent state is a max node, give it the maximum value among its children.
- If the parent state is a min node, give it the minimum value among its
children.
The value that is assigned to each state indicates the value of the best state that
player can hope to achieve.
The values of the leaf nodes are propagated up the graph using minimax.
This strategy is called an n-move look ahead, where n is the number of levels
explored. Since the leaves of this subgraph are not final states of the game, it is not
possible to give them values that reflect a win or a loss; instead each node is given a
value according to some heuristic evaluation function. The value that is propagated
back to the root node is not an indication of whether or not a win can be achieved,
but is simply the heuristic value of the best state that can be reached in n moves from
the start node.
Game graphs are searched by level, or ply. Each move by a player defines a new
ply of the graph. Game-playing programs typically look ahead a fixed ply depth.
The states on that ply are measured heuristically and the values are propagated
back up the graph using minimax. The search algorithm then uses these derived
values to select among possible next moves.
In the example opposite, since our goal is to maximize the value of the heuristic
function, we choose to move to B. Hence A's value is 8, since we can move to
a position with a value of 8.
(figure: one-ply look ahead; the moves B, C, and D have static values 8, 3, and −2.)
If we stop at a two-ply look ahead, we have the situation opposite, taking
into account that the opponent gets to choose which of the successor moves will be
made, and thus which terminal value should be backed up to the next level. Suppose
we make move B. The opponent's goal is to minimize the value of the evaluation
function and hence is expected to choose move F. This means that we will end up in a
very bad position (−6) even though there is a position E (9) which is very good. Once
the values from the second ply are backed up, it is clear that the correct move for us at
the first level is C. This process can be repeated for as many ply as time allows, and
the more accurate evaluations that are produced can be used to choose the correct
move at the top level. This alternation of maximizing and minimizing at alternate ply
corresponds to the opposing strategies of the two players and gives this method its
name: minimax.
(figure: two-ply look ahead; A is a max node whose children B, C, and D are min nodes,
with the leaf positions E through K below them.)
* Minimax Procedure
The recursive procedure for the minimax operation is assumed to return two
results.
▪ The backed up value of the path it chooses (value).
▪ The path it self.
It is called to compute the best move from the current position (CURRENT) and
takes two arguments: a position and the current depth of the search, e.g. minimax
(CURRENT, 0).
1- If the node is on the final level explored (i.e. the search should be stopped at the
current level), then the value of this node is that determined by the static
evaluation function and the path is nil.
2- Otherwise generate one more ply of the tree and return a list of nodes
(successors) representing the moves that could be made starting in the current
node.
4- If there are successors, then examine each element and keep track of the best one.
This is done as follows:
5- Initialize BS (best score) to the minimum value that the static function can
return; it will be updated to reflect the best score that can be achieved by an
element of the successors.
6- For each element of the successors (to be called succ) do the following:
b- Set new value to minus value (result succ). This will cause it to reflect
the merits of the position from the side opposite to that of the next
lower level.
c- If new value > BS, then we have found a successor that is better than any
that have been examined so far. Record this by:
Minimax requires two passes over the search space: the first to apply the
heuristic, the second to propagate values back up the tree. Minimax follows all
branches in the space, including many that could be ignored or pruned by a more
intelligent algorithm.
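The basic minimax procedure can be sketched over an explicit game tree. The tree and its static leaf values below are illustrative, arranged so that B backs up −6 and the correct move is C, as in the two-ply discussion above:

```python
# Minimax over an explicit game tree: interior nodes list their children,
# leaves carry static evaluation values (illustrative numbers).
TREE = {'A': ['B', 'C', 'D'], 'B': ['E', 'F'], 'C': ['G', 'H'],
        'D': ['I', 'J']}
VALUE = {'E': 9, 'F': -6, 'G': 3, 'H': 5, 'I': -2, 'J': 0}

def minimax(node, maximizing):
    """Return (backed-up value, path) from this node."""
    if node not in TREE:                      # leaf: use the static evaluation
        return VALUE[node], [node]
    results = [minimax(c, not maximizing) for c in TREE[node]]
    pick = max if maximizing else min         # alternate levels alternate rules
    value, path = pick(results, key=lambda r: r[0])
    return value, [node] + path

value, path = minimax('A', True)
print(value, path)                            # 3 ['A', 'C', 'G']
```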
(figure: A is a maximizing node with backed-up value ≥ 3; its children are the
minimizing nodes B = 3 and C ≤ −5; B's children D and E have static values 3 and 5,
and C's children are F = −5 and the unexamined G.)
In the example above, A is guaranteed a score of (3) or greater if we move to B;
any move that produces a score of less than 3 is worse than the move to B. After
examining node F, we know that the opponent is guaranteed a score of (−5) or less at
C, so after examining only F we are sure that a move to C is worse, regardless of the
score of node G; thus we need not explore node G at all. Cutting out one node
may not appear effective, but the process can be cost-effective if we eliminate not a
single node but an entire tree three ply deep.
(figure: a deeper game tree in which alpha-beta pruning cuts off an entire subtree
three ply deep.)
The idea of alpha-beta search is simple: rather than searching the entire space,
alpha-beta proceeds in a depth first fashion, first descending to the full ply depth
and applying the heuristic evaluation function to a state and all its siblings. Suppose
these are min nodes; the maximum of these min values is then backed up to the parent
(a max node). This value is then offered to the grandparent of these mins as a
potential β cutoff.
Next the algorithm descends to the other grandchildren and terminates exploration
of their parent if any of their values is equal to or larger than this beta value.
Note: if the maximizing node is not at the top of the tree, one must also consider the
alpha value that has passed down from a higher node. Thus at a maximizing level, α
should be set to either the value it had at the next highest maximizing level or the best
value found at this level whichever is greater. The corresponding statement can be
made about β.
The effectiveness of the α-β procedure depends greatly on the order in which
paths are examined. If the worst paths are examined first, then no cutoffs at all will
occur.
The idea behind the α-β procedure can be extended to cut off additional paths
that appear to be at best only slight improvements over paths that have already been
explored. The idea is to devote more time to exploring other parts of the tree where
there may be more gain.
*Other refinements:
There are other modifications to the minimax procedure that can also improve
its performance:
[Figure: node A with children B, C and D, drawn at three stages of lookahead:
fig (1) stops at B, C and D; fig (2) expands B one more level to E and F; fig (3)
expands the tree further to G, H, I and J.]
1- Waiting for quiescence. When node B in fig (1) is expanded one more level as in
fig (2), the estimate of the worth of B changes greatly; if we stop exploring the tree at
this level, the value (-4) is assigned to B and we will therefore decide that B is not a
good move. Hence, to be sure of the choice of move, we should continue the search
until no such drastic change occurs from one level to the next.
2- Secondary search. Another way that the accuracy of the minimax procedure
can be improved is to double check a chosen move to make sure that there is
not a hidden pitfall a few moves farther away. It is not very expensive to search
the single chosen branch an additional level or two to make sure that it still
looks good.
Expert Systems
The expert system breakthrough began in the late 1970s on the basis of the success
of a few programs that performed specific tasks almost as well as a human expert.
An expert system relies on a body of knowledge to perform somewhat difficult
tasks usually performed by a human expert.
An expert system deals successfully with problems for which clear algorithmic
solutions do not exist.
Expert systems are practical programs that use heuristic strategies to solve
specific classes of problems.
Many expert systems serve the role of an expert advisor to help experts solve
problems by providing them with useful advice.
[Figure: architecture of a rule-based expert system. The user interacts through the
user interface with the explanation generator and the inference engine; the inference
engine draws on the knowledge base and the working memory; knowledge enters the
knowledge base through the knowledge acquisition component, with the knowledge
engineer eliciting it from the expert; the components apart from the knowledge
contents constitute the shell.]
An expert system has the following components:
1- The knowledge base, which contains the necessary long-term memory of facts,
structures and rules that represent the knowledge about the domain of
expertise.
2- The inference engine, which controls the reasoning process of the system; it
uses the expert knowledge to solve the problem.
3- A user interface in the form of a communication facility that allows users to
query the system, supply information and receive advice.
4- The explanation facility, which provides explanations to justify the expert
system's recommendations or decisions. It helps the end user to feel more
assured about the actions and enables the developer to follow through the
operation of the system.
Expert systems have the following characteristics: -
1- They allow easy modification of the knowledge base (both in adding and
deleting skills or facts).
2- Reason heuristically using often-imperfect knowledge to obtain useful
problem solutions.
3- Support inspection of their reasoning processes both in presenting
intermediate steps and in answering questions about the solutions.
4- Conduct a dialogue with human specialists in a language that is natural to
them (i.e. expert systems are frequently interactive and use human-oriented
dialogue). The appearance of intelligence is enhanced to the extent that the
program can accept free-form input in simple sentences and can state its
conclusions in the same way. Such a system is said to have a natural language
interface.
5- Deal with uncertainty rather than only with clear facts and information.
6- Knowledge base and inference engine constitute separate parts of the system.
7- Most expert systems have been written for relatively specialized expert level
domains.
8- Generally, expert systems use a rule-based architecture for automated
reasoning. This is because rules are easy to understand and constitute a basic
construct in logic programming.
One of the design principles of rule-based expert systems is that the knowledge
base is kept separate from the inference engine. This is because:
1- The separation of the knowledge base and the inference engine makes it
possible to represent knowledge in a more natural fashion (if … then …)
than a program which embeds this knowledge in lower-level computer
code.
2- The separation of knowledge and control, along with the modularity
provided by the rules and other structures, allows changes to be made in the
knowledge base without affecting other parts of the program.
An expert system shell contains all the components of the expert system except
that the knowledge base and the case-specific data contain no information.
Programmers can use an empty shell and create a new knowledge base
appropriate to another application.
The root of the tree is the top-level goal to be proved. To prove the goal, we have to
traverse part of the tree. The part of the tree that we traverse to prove
the goal is called the PROOF path. The proof path itself is a subtree, which is called
the INFERENCE TREE.
- In backward chaining, we start at the root of the tree and follow the
branches, toward the leaves until we find facts in the fact base.
- In forward chaining, we start from the leaves and work our way toward
the root until we find a chain of branches that leads to the goal.
Example
1- if A and B then C
2- if C or D then E.
[Inference tree: root E with children C and D; C has children A and B.]
The inference tree consists of each rule represented as a conclusion with the
relevant premises nested beneath the conclusion. Node C is a conclusion in the first
rule and a premise in the second rule.
Control Structures
Control structures define the way and order in which facts, rules and parts of rules are used.
The choice of a control structure can enormously affect the success and
efficiency of a rule-based expert system. Rule-based expert systems typically
need a large number of rules about a problem domain in order to approach skilled
human performance. Hence the choice of control structure is important in such
systems.
1- Forward chaining.
2- Backward chaining.
Backward Chaining
In backward chaining the system is provided with a specific goal to prove. To
prove a goal, backward chaining begins with and focuses on the conclusions of rules.
The Prolog interpreter follows the backward chaining mechanism, or goal-directed
reasoning.
It gives the goal order top priority and then treats rule and fact order as of equal
importance; if alternative conclusions are possible, backward chaining tries to prove
the first, then tries to prove the second if the first fails, and so on.
Backward chaining is a good control structure when there are many more facts
than final conclusions or goals.
Example:
Consider the following rules:
R1: goal1:-fact1.
R2: goal1:-a,b.
R3: goal2(X):-c(X).
R4: a:-not (d).
R5: b:-d.
R6: b:-e.
R7: c(2):-not (e).
R8: d:-fact2,fact3.
R9: e:-fact2,fact4.
Suppose the goals are goal1 and goal2(X) and the facts are fact2 and fact3.
Solution
The rules for goal1 will be tried in the order R1 (fact1 is not known, so it fails),
then R2, which calls R4; R4 needs not(d), but R8 proves d from fact2 and fact3, so
R4 fails and with it R2 and goal1. For goal2(X), R3 calls R7, which needs not(e);
R9 fails because fact4 is not known, so not(e) succeeds, c(2) is proved and goal2(2)
follows.
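The behaviour of this example can be reproduced with a small backward chainer in Python (a propositional simplification: the variable in goal2(X) and c(X) is dropped, so they appear below as the atoms goal2 and c2, and negation as failure is assumed; the encoding is this sketch's own, not Prolog's):

```python
# Rules map a goal to alternative premise lists (R1-R9 above);
# a premise "not p" succeeds when p cannot be proved.
rules = {
    "goal1": [["fact1"], ["a", "b"]],
    "a":     [["not d"]],
    "b":     [["d"], ["e"]],
    "c2":    [["not e"]],
    "goal2": [["c2"]],
    "d":     [["fact2", "fact3"]],
    "e":     [["fact2", "fact4"]],
}
facts = {"fact2", "fact3"}

def prove(goal):
    if goal.startswith("not "):               # negation as failure
        return not prove(goal[4:])
    if goal in facts:                         # known fact
        return True
    for premises in rules.get(goal, []):      # try each rule in turn
        if all(prove(p) for p in premises):
            return True
    return False

print(prove("goal1"))   # False: d is provable, so a (i.e. not d) fails
print(prove("goal2"))   # True: fact4 is unknown, e fails, so not(e) succeeds
```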
Implementation of Backward Chaining
The goal of most expert systems is to reach a diagnosis; assume that this is
obtained in Prolog style by typing the query diagnosis(X), where X describes a
diagnosis.
Example: diagnosis("fuse blown"):- noise("pop").
One problem is that rules require advance entry of facts (often many facts).
Hence we can use the method of demanding facts only when needed. Define a
predicate ask as a way to get facts. This predicate has two arguments: the first is a
question text to be typed out on the screen and the second is a variable to be bound
to the answer.
ask(Q,A):- write(Q), write("?"), readln(A), assert(asked(Q,A)).
In large rule-based expert systems, two important coding techniques are used:-
Input coding groups user answers into categories. An important case is questions with
"yes" or "no" answers; we can define two new predicates, affirmative and negative,
for positive and negative answers respectively.
affirmative("yes").    negative("no").
affirmative("y").      negative("n").
affirmative("right").  negative("not").
affirmative("ok").     negative("impossible").
positive_answer(Q,A):- affirmative(A).
write(“please answer yes or no”), readln(A2),
retract(asked(Qcode,A)),
assert(asked(Qcode,A2)),affirmative(A2).
ask_if_not(Q):- not(ask_if(Q)).
Users may not always understand a question; we can let them type (?), give them
some explanatory text, and then give them another chance to answer:-
ask(Q,A):- asked(Q,A).
ask(Q,A):- not (asked(Q,A)), write(Q), write(“?”), readln(A2), ask2(Q,A2,A).
Output coding
1- Can code questions so we need not repeat their text at each mention in the rules.
This makes rules easier to read and helps prevent mistakes. Hence we may use a
predicate question_code of two arguments (a code word and a text string for the
question).
Example:
ask(Qcode,A):- asked(Qcode,A).
ask2(Q,Qcode,”?”,A):- explain(Qcode), ask(Qcode,A).
3- Can also code diagnoses, which helps when diagnoses are provable in many
different ways. Diagnosis coding requires a new top-level predicate that users may
query instead of diagnosis.
Forward Chaining
Often rule-based systems work from just a few facts but are capable of reaching
many possible conclusions. Examples are expert systems that identify an object
from a description of what you see at a distance, or diagnosis systems that tell you
what to do when the car breaks down from a description of what isn't working.
case where no variables were bound to make the match.
b- Mark F as used.
3- For each NOT expression in rules whose argument does not match any used
fact, add it to the fact list, mark it as unused and redo step 2. Consider the
expressions in rule order.
Example
2- Take fact3: R10 succeeds and a new fact d is proved. The set of new rules
becomes:
R1: goal1:- fact1.
R2: goal1:- a,b.
R3: goal2(X):- c(X).
R4: a:- not(d).
R5: b:- d.
R6: b:- e.
R7: c(2):- not(e).
R11: e:- fact4.
3- Fact d is used next; R5 now succeeds and gives a new fact b. R5 and R6 are
removed. The set of new rules becomes:
R1: goal1:- fact1.
R2: goal1:- a,b.
R3: goal2(X):- c(X).
R4: a:- not(d).
R7: c(2):- not(e).
R11: e:- fact4.
4- Fact b matches something in R2, giving R12: goal1:- a. Rule R2 can be
eliminated. The current set of rules:-
R1: goal1:- fact1.
R12: goal1:- a.
R3: goal2(X):- c(X).
R4: a:- not(d).
R7: c(2):- not(e).
R11: e:- fact4.
5- Hence there are no more facts to pursue; take the rules with NOTs.
6- Fact d is true, so R4 can never succeed. But fact e has not been proved. Hence
add not(e) to the list of facts.
7- This matches the right side of R7, hence c(2) is a fact too. Eliminate R7. The
current set of rules:-
R1: goal1:- fact1.
R12: goal1:- a.
R3: goal2(X):- c(X).
R4: a:- not(d).
R11: e:- fact4.
8- This matches the only expression on the right side of R3 when X=2, and
hence goal2(2) is a fact. We can't eliminate R3 now because a variable had to be
bound to make the match. The current set of rules:-
R1: goal1:- fact1.
R12: goal1:- a.
R3: goal2(X):- c(X).
R4: a:- not(d).
R11: e:- fact4.
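The fact-driven rewriting traced in steps 1-8 can be sketched in Python (a propositional simplification: variables are dropped, so goal2(2) and c(2) appear as goal2 and c, and NOT premises are only considered once no plain facts remain; the rule encoding is an assumption of this sketch):

```python
def forward_chain(rules, facts):
    """rules: list of (head, [premises]); a premise may be 'not p'.
    Returns the set of facts derivable by forward chaining."""
    rules = [(head, list(body)) for head, body in rules]   # private copies
    agenda, proved = list(facts), set()
    while agenda:
        f = agenda.pop(0)
        if f in proved:
            continue
        proved.add(f)
        for head, body in rules:
            if f in body:
                body.remove(f)            # premise satisfied: shrink the rule
                if not body:              # empty right side: head is a new fact
                    agenda.append(head)
        if not agenda:                    # no plain facts left: handle NOTs,
            for head, body in rules:      # assuming unproved arguments fail
                for p in list(body):
                    if p.startswith("not ") and p[4:] not in proved:
                        body.remove(p)
                        if not body:
                            agenda.append(head)
    return proved

rules = [("goal1", ["fact1"]), ("goal1", ["a", "b"]),
         ("goal2", ["c"]), ("a", ["not d"]), ("b", ["d"]), ("b", ["e"]),
         ("c", ["not e"]), ("d", ["fact2", "fact3"]), ("e", ["fact2", "fact4"])]
print(sorted(forward_chain(rules, ["fact2", "fact3"])))
# prints ['b', 'c', 'd', 'fact2', 'fact3', 'goal2']
```

As in the trace above, d, b, c and goal2 are derived, while goal1 is not (d is proved, so a fails).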
Example
R1: top(X,Y,Z):- side(Z,W,Y,X).
R2: side(A,B,7,D):- data(A,0,B),data(A,D,1).
Facts: data(3,0,1),data(3,2,1).
Solution
Obtain fact top(0,7,3).
Solution
R3: m(X):- n(X),b.
R4: b:-c.
R7: t:- s.
Remove R2
Remaining facts: c,n(12).
R12: m(X):- n(X).
R7: t:- s.
1- Rules must be represented as data. A rule may be encoded by a predicate rule
of two arguments: the left side and the list of right-side expressions.
Example:
a:- b. becomes rule(a,[b]).
c:- d,e,f. becomes rule(c,[d,e,f]).
g(X):- h(X,Y),not(f(Y)). becomes rule(g(X),[h(X,Y),not(f(Y))]).
2- Forward chaining requires that facts are identified and distinguished from
rules. This may be realized by making each fact an argument to a predicate
fact (with a single argument).
To implement focus-of-attention forward chaining and prevent fact reuse, we can
delete facts once considered by the system and copy them into a predicate usedfact
with one argument.
3- To take every fact F and find the rules whose right sides can match it, so as to
derive new rules and possibly new facts, the top-level goal of the program may
be written as:-
forward:- done.
forward:- fact(F),not(pursuit(F)),assertz(usedfact(F)),
retract(fact(F)),forward.
To make sure that all possible conclusions are reached, forward chaining
continues until there are no more facts:-
done:- not(fact(F)).
pursuit(F):- rule(L,R),rule_pursuit(F,L,R),fail.
forward:- done.
forward:- fact(F),not(pursuit(F)),assertz(usedfact(F)),
retract(fact(F)),forward.
pursuit(F):- rule(L,R),rule_pursuit(F,L,R),fail.
rule_pursuit(F,L,R):- member(F,R),delete(F,R,Rnew),retract(rule(L,R)),
new_rule(L,Rnew).
new_rule(L,[]):- not(fact(L)),asserta(fact(L)).
new_rule(L,R):- not(R=[]),asserta(rule(L,R)).
Note: The above solution assumes that rules do not contain NOTs.
1- For each rule R, treat its right side as a query about the facts (without
using any other rules via backward chaining). If R succeeds, add its left
side (with substitution bindings made) as a fact at the front of the list of
facts, and then eliminate from further consideration all rules whose
left sides are equivalent to this new fact; if the rule's left side has
variables, do this for every possible way of binding those variables.
2- Repeat the previous step with all the original rules, taking also the NOTs whose
arguments are not facts.
Example
Solution
Cycle 1: Rules R1, R2, R3, R5 and R6 are tried; none succeed. R4 and R7 have NOTs.
R8 is executed, fact d is obtained as a fact and hence asserted, R8 is
removed, and R9 fails.
Cycle 2: R5 is executed and fact b is asserted. Both R5 and R6 are eliminated. Now the
new rules are:
R1: goal1:- fact1.
R2: goal1:- a,b.
R3: goal2(X):- c(X).
R4: a:- not(d).
R7: c(2):- not(e).
R9: e:- fact2,fact4.
Cycle 5: R3 is executed with X=2 and goal2(2) must be a fact; R3 is not eliminated
because goal2(2) is more specific than goal2(X).
Note: This algorithm will only work with the restriction that no not(P) occurs in the right side of a rule
before a rule having P as its left side.
Each group may be considered as a separate rule-based expert system that may
call another. This idea is called partitioning, or a context-limiting control structure. To
implement such a control structure we normally use one additional partition of rules, a
"startup" partition, to look at key evidence and decide on the partition most relevant to
the problem. Partitions can also choose to transfer control to other partitions if, say,
none of their rules succeed.
A good example is the diagnosis of a malfunctioning car: electrical problems,
transmission, fuel system, car body, ….
Meta Rules
The specification of control may itself be implemented by a rule-based system. Meta
rules are just rules whose domain of knowledge is the operation of another rule-based
system; they are a kind of heuristic. Examples are rules to load partitions and rules
for rule ordering. Meta rules are in fact rules that treat other rules as data, usually by
choosing the one to use next.
Example: prefer(L1,R1,L2,R2):-length(R1,Len1),length(R2,Len2),Len1<Len2.
This rule says that a rule with shorter right side is preferred.
Meta rules permit flexible control adjustable to the processing context and hence
enhance general-purpose control structures. This means that meta rule
implementation is different for backward, hybrid and forward chaining.
The big advantage of meta rules is their flexibility and modifiability, which
allow precise control of a rule-based system.
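As a tiny illustration, the effect of the prefer meta rule above (rules with shorter right sides are tried first) can be sketched in Python, with rules encoded as hypothetical (head, premises) pairs:

```python
# Hypothetical rule base: (head, list of right-side premises).
rules = [("c", ["d", "e", "f"]), ("a", ["b"]), ("g", ["h", "i"])]

# Apply the meta rule as an ordering criterion: fewer premises first.
rules.sort(key=lambda rule: len(rule[1]))
print([head for head, _ in rules])   # prints ['a', 'g', 'c']
```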
Decision Lattices
Choosing a good sequence for rules can be important and hard. However,
computers can use storage structures besides sequences: rules can be organized in a
hierarchy called a decision lattice. Decision lattices do a restricted but
very efficient kind of reasoning (a kind of classification) and are useful
for simple expert systems. Consider an expert system to diagnose the malfunction of
an appliance.
[Figure: a decision lattice for appliance diagnosis. The root asks whether the
appliance works; the No and Yes branches lead to further questions, distinguishing,
for example, heating, mechanical and other problems.]
3- They need not explicitly pose questions but can examine buffer contents.
Their disadvantages are:
1- They can't reason flexibly, because they don't permit backtracking or the use of
variables.
2- They are difficult to modify and debug, since later questions assume certain
results of earlier questions.
3- They may be hard to build, because at each point you must try to determine
the best question to ask.
To build a decision lattice from a set of rules:
1- For every top-level rule, repeatedly substitute in the definitions of all
intermediate predicates on its right side until none remain; if there is more
than one rule proving an intermediate predicate, make multiple versions of the
rule, one for each possibility.
2- Examine the right sides of the new rules and pick a predicate expression that
appears unnegated in some rules and negated in others, in approximately equal
numbers (the more rules it appears in, and the more even the split, the better).
Call this the partitioning predicate expression and make the first question to the
user ask about it. Create branches from the starting node to new nodes, one
corresponding to each possible answer. Partition the rules into groups
corresponding to the answers and associate each group with one node (copies
of rules not mentioning the predicate expression should be put into every
group). Remove all occurrences of the expression and its negation from the rules.
Within each rule group, apply this step recursively, choosing a predicate that
partitions the remaining rules in the group best. Repeat this until no more
partitioning is possible.
Example
r:- a,d,not(e).
s:- not(a),not(c),q.
t:- not(a),p.
u:- a,d,e.
u:- a,q.
v:- not(a),not(b),c.
p:- b,c.
p:- not(c),d.
q:- not(d).
Solution
After step1:
r:- a,d,not(e).
s:- not(a),not(c),not(d).
t:- not(a),b,c.
t:- not(a),not(c),d.
u:- a,d,e.
u:- a,not(d).
v:- not(a),not(b),c.
After step 2, 'a' is chosen as the partitioning expression, splitting the rules into
two groups:
r:- d,not(e).        s:- not(c),not(d).
u:- d,e.             t:- b,c.
u:- not(d).          t:- not(c),d.
                     v:- not(b),c.
The first set will be used whenever the fact 'a' is true and the second will be
used whenever the fact 'a' is false. In the first group 'd' appears in all rules, so it can
be used as the partitioning expression; similarly 'c' can partition the second group.
This gives four rule groups or sub-databases:
r:- not(e).     u.     t:- b.          s:- not(d).
u:- e.                 v:- not(b).     t:- d.
Three groups have two rules and hence can be partitioned further to give a
unique answer.
Implementation
To implement a decision lattice:
1- Give code names to every node in the decision lattice, including the starting
nodes.
2- Define a successor predicate of two arguments that gives conditions for one
node to be followed by another node.
Example
successor(N1,N2):- ask_if(“a”).
successor(N1,N3):- ask_if_not(“a”).
diagnosis(N,N):- not(successor(N,_)).
diagnosis(D,Start):- successor(Start,X),diagnosis(D,X).
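For comparison, the same traversal can be sketched in Python (the questions, answers and diagnoses below are hypothetical placeholders for the lattice nodes):

```python
# A decision lattice as nested dictionaries: each internal node holds a
# question and one branch per answer; plain strings are leaf diagnoses.
lattice = {
    "question": "does the appliance work",
    "yes": "no repair needed",
    "no": {
        "question": "does it heat",
        "yes": "heating problem",
        "no": "mechanical or electrical problem",
    },
}

def diagnose(node, answers):
    """Follow the branches selected by the user's answers until a leaf."""
    while isinstance(node, dict):
        node = node[answers[node["question"]]]
    return node

print(diagnose(lattice, {"does the appliance work": "no",
                         "does it heat": "yes"}))   # prints heating problem
```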
Uncertainty
Probability is used to model degrees of uncertainty in the world. A probability is
the fraction of the time we expect something to be true. Rules in rule-based systems
can be:-
1- Absolute: when the things on the right side of a rule are absolutely true, the
conclusion on the left side is absolutely true.
2- Probabilistic: facts and conclusions hold only with some probability. For
example, the fact
battery("dead",0.03).
implies that the battery in a randomly picked car is dead 3% of the time, and the rule
battery("dead",0.2):- ignition("won't start",1.0).
says that 20% of the time when the car won't start, the battery is dead.
Probability Issues
1- OR combination issue
Different rules of inference may be written for the same fact from different
sources of evidence, each with its own probability. For example, if 50% of the time
when the radio won't play the battery is dead:
battery("dead",0.5):- radio("won't play",1.0).
To reason about whether the battery is dead, we should gather all the
relevant rules and facts, and then somehow combine the probabilities from the facts
and the successful rules to get a cumulative probability that the battery is dead. This is
called the OR combination.
2- Rule and evidence probability combination issue
Rules can be uncertain for a different reason than facts. In the preceding
example, the rule probability reflects the likelihood that the battery is dead when the
ignition won't start (the car might not start, for instance, because the engine is
flooded). If the evidence on the right side is itself uncertain, the probability that the
battery is dead is less than the rule probability 0.2, because of the probability of the
rule's right side (i.e. the fact or evidence). Hence the probability of the rule as a
whole must be combined with the probabilities of the facts on the right side. This is
called the rule probability with evidence probability combination issue.
3- AND combination issue
Rules can have several predicate expressions on their right sides, and if each
has a probability we must somehow combine these probabilities to get a total
probability for the right side.
Combining Evidence
Assuming statistical independence:
2- If the stock market index has gone up for three straight days, it will go down
tomorrow with probability 0.4.
AND Combination
In general, if events A, B, C, … are independent, then:-
P[A and B and C and …] = P(A)*P(B)*P(C)*…
Example
F=a,b,c.
Given that P(a)=0.7, P(b)=1, P(c)=0.95 and the rule probability is 0.8, the total
probability of F = 0.7*1*0.95*0.8 = 0.532.
OR Combination
P[A or B or C or …] = 1-[(1-P(A))*(1-P(B))*(1-P(C))*…]
Example
F=a or b or c.
Given that P(a)=0.7, P(b)=1, P(c)=0.95 and the rule probability is 0.8, the total
probability of F = 1-[(1-0.7)*(1-1)*(1-0.95)*(1-0.8)]
= 1-[0.3*0*0.05*0.2]
= 1-0 = 1
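Both combinations can be checked with a short Python sketch (statistical independence assumed, as above):

```python
def and_combine(evidence, rule_prob=1.0):
    """AND: multiply the evidence probabilities, then the rule probability."""
    p = rule_prob
    for prob in evidence:
        p *= prob
    return p

def or_combine(probs):
    """OR: one minus the probability that every disjunct fails."""
    q = 1.0
    for prob in probs:
        q *= 1.0 - prob
    return 1.0 - q

print(round(and_combine([0.7, 1.0, 0.95], 0.8), 3))   # prints 0.532
print(or_combine([0.7, 1.0, 0.95, 0.8]))              # prints 1.0
```

Note that any disjunct with probability 1 (here P(b)=1) forces the OR combination to 1, exactly as in the worked example.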