Introduction to
Artificial Intelligence
Lecture: Introduction to AI
Outline
● What is Artificial Intelligence (AI)?
● The Foundations of Artificial Intelligence
● The History of Artificial Intelligence
● The State of the Art
2
AI: Dreams for everyone
3
AI: Sophia Robot
4
AI: Deep Blue – AlphaGo
5
Intelligence vs. Artificial Intelligence
● Intelligence includes the capacity for logic, understanding,
learning, reasoning, creativity, problem solving, and more.
● Artificial intelligence (AI) attempts not just to understand but
also to build intelligent entities.
John McCarthy (1927 – 2011), Marvin Minsky (1927 – 2016), Allen Newell (1927 – 1992), Arthur Samuel (1901 – 1990), Herbert Simon (1916 – 2001)
6
The fields of AI
● AI research aims to build intelligent entities that are capable of
simulating humans in different aspects.
○ Thinking: learning, planning, knowledge refinement
○ Perception: see, hear, feel, etc.
○ Communication in natural languages
○ Manipulation and moving objects
7
What is AI?
8
Acting humanly
● The Turing Test approach (Alan Turing, 1950):
○ A computer passes the test if a human interrogator, after
posing some written questions, cannot tell whether the
written responses come from a person or from a computer.
9
Acting humanly
● Another “Turing Test”: Chihuahua or muffin?
10
Thinking humanly
● Cognitive modeling approach
● The program’s input-output behavior matches corresponding
human behavior.
● We need to get inside the actual workings of human minds
○ Introspection: trying to catch our own thoughts as they go
by
○ Psychological experiments: observing a person in action
○ Brain imaging: observing the brain in action
11
Thinking humanly
● General Problem Solver – GPS (Newell and Simon, 1961)
○ Not merely solve problems correctly
○ Compare the trace of its reasoning steps to traces of
human subjects solving the same problems
● Cognitive Science
○ Computer models from AI
○ Experimental techniques from psychology
● Cognitive science is now distinct from AI
○ The two fields share available theories, but neither yet fully
explains human intelligence
○ They still share a principal direction
12
Thinking rationally
● The laws of thought approach
● “Right thinking” = irrefutable reasoning processes
○ Example of Aristotle (384 BC – 322 BC)
All men are mortal. ∀x.man(x)⇒mortal(x)
Socrates is a man. man(Socrates)
Therefore, Socrates is mortal. mortal(Socrates)
● Obstacles:
○ Not all intelligent behavior is mediated by logical reasoning
○ Solving a problem “in principle” is different from solving it in
practice.
13
Acting rationally
● The rational agent approach
● Rational behavior = “doing the right thing”
○ The “right thing” is what is expected to maximize goal
achievement given the available information.
● An agent is just something that perceives and then acts
f: P* → A
● A rational agent acts to achieve the best outcome or, when
there is uncertainty, the best expected outcome.
14
Goals of AI
● AI studies the intelligent aspects of human behavior and
reproduces those actions using computers.
● Make computers more useful by letting them take over
dangerous or tedious tasks from humans
● Understand principles of human intelligence
15
Related research fields
Field – Description
● Philosophy: logic, methods of reasoning, mind as physical system, foundations of learning, language, rationality
● Mathematics: formal representation and proof, algorithms, computation, (un)decidability, (in)tractability, probability
● Economics: utility, decision theory, rational economic agents
● Neuroscience: neurons as information processing units
● Psychology/Cognitive Science: how people behave, perceive, process information, and represent knowledge
● Computer Engineering: building fast computers
● Control Theory: designing systems that maximize an objective function over time
● Linguistics: knowledge representation, grammar
16
Common topics in AI
● Search (includes Game Playing)
● Representing knowledge and reasoning with it
● Planning
● Learning
● Natural language processing
● Expert systems
● Interacting with the Environment
○ Vision, Speech recognition, Robotics, etc.
17
Pros and Cons of AI
● Pros:
○ More powerful and more useful computers
○ New and improved interfaces
○ Solve new problems
○ Better handling of information
○ Relieve information overload
○ Conversion of information into knowledge
● Cons:
○ Increased costs
○ Difficulty with software development - slow and expensive
○ Few experienced programmers
18
Solving problems by searching
● Search is the fundamental technique of AI, either “uninformed”
or “informed”.
● Search for the first answer that satisfies our goal or keep
searching until we find the best answer.
19
Knowledge and reasoning
● To act rationally in our environment, we must have some
way of describing that environment and drawing inferences
from that representation.
○ How do we describe what we know about the world?
○ How do we describe it concisely?
○ How do we describe it so that we can get hold of the right
piece of knowledge when we need it?
○ How do we generate new pieces of knowledge?
○ How do we deal with uncertain knowledge?
21
Knowledge and reasoning
● Propositional logic and predicate logic
● Inference techniques: forward chaining, backward chaining,
and resolution
● Uncertain knowledge and reasoning
22
Machine Learning
● If a system is going to act truly appropriately, then it must be
able to change its actions in the light of experience.
○ How do we generate new facts from old ?
○ How do we generate new concepts ?
○ How do we learn to distinguish different situations in new
environments?
23
History of AI
● 1940-1950: Early days
○ 1943: McCulloch & Pitts: Boolean circuit model of brain
○ 1950: Turing's “Computing Machinery and Intelligence”
● 1950—70: Excitement: Look, Ma, no hands!
○ 1950s: Early AI programs, including Samuel's checkers
program, Newell & Simon's Logic Theorist, Gelernter's
Geometry Engine
○ 1956: Dartmouth meeting: “Artificial Intelligence” adopted
○ 1965: Robinson's complete algorithm for logical reasoning
● 1970—90: Knowledge-based approaches
○ 1969—79: Early development of knowledge-based systems
○ 1980—88: Expert systems industry booms
○ 1988—93: Expert systems industry busts: “AI Winter”
24
History of AI
● 1990—: Statistical approaches
○ Resurgence of probability, focus on uncertainty
○ General increase in technical depth
○ Agents and learning systems... an “AI Spring”?
● 2000—: Where are we now?
25
Recent advancements
26
Recent advancements
27
Recent advancements
28
Homework
● Find three recent advancements in AI.
● Briefly introduce the selected advancements.
● Identify, as specifically as possible, the AI subfields they belong to.
29
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
30
Introduction to
Artificial Intelligence
Lecture: Intelligent Agents
Outline
● Agents and Environments
● The Concept of Rationality
● The Nature of Environments
● The Structures of Agents
2
What is an agent?
● AI studies how to make computers do things that, at present,
people do better.
● Such systems are called Agents.
● An agent perceives its environment through sensors and acts
upon that environment through actuators.
3
Examples of Agents
Agent Sensors Actuators
Human agent eyes, ears, and other hands, legs, vocal tract,
organs. etc.
Robotic agent cameras, infrared range levels, motors, etc.
finders, etc.
Software agent keystrokes, file displaying on screen,
contents, network writing files, sending
packets, etc. network packets, etc.
4
Agent’s Behavior
● Percept: the agent’s perceptual inputs at any given instant
● Percept sequence: the complete history of everything the agent
has ever perceived
● An agent’s behavior is described by the agent function that
maps any given percept sequence to an action.
f: P* → A
5
Example: Vacuum Agent
● Percepts: location and contents, e.g., [A,Dirty]
● Actions: Left, Right, Suck, Do Nothing
6
Vacuum Agent
Percept sequence Action
[A, Clean] Right
[A, Dirty] Suck
[B, Clean] Left
[B, Dirty] Suck
… …
[A, Clean], [A, Clean], [A, Clean] Right
[A, Clean], [A, Clean], [A, Dirty] Suck
7
Vacuum Agent
function VACUUM-AGENT([location, status]) returns an action
  if status = Dirty then return Suck
  else if location = A then return Right
  else if location = B then return Left
8
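As a minimal sketch, the table-driven behavior above can be written directly in Python; the percept format (location, status) and the location names 'A'/'B' follow the example percepts on the previous slides, not a prescribed interface.

def vacuum_agent(percept):
    # Simple reflex agent: the action depends only on the current percept.
    location, status = percept            # e.g., ('A', 'Dirty')
    if status == 'Dirty':
        return 'Suck'
    elif location == 'A':
        return 'Right'
    elif location == 'B':
        return 'Left'

print(vacuum_agent(('A', 'Dirty')))       # Suck
print(vacuum_agent(('B', 'Clean')))       # Left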
Importance of agents
● A tool for analyzing systems
● All areas of engineering can be seen as designing artifacts that
interact with the world.
● AI designs artifacts that have significant computational
resources and operate in task environments that require
nontrivial decision making.
9
Rational agents
● A rational agent is one that does the right thing.
● We need ways to measure success → Performance measure.
● Performance measure evaluates any given sequence of
environment states.
● General rule: Design performance measures according to
○ What one actually wants in the environment
○ Not how one thinks the agent should behave
● Example: vacuum agent
○ Bad: the amount of dirt cleaned up in a single eight-hour shift
(the agent could dump dirt and clean it up again)
○ Good: a clean floor, no matter how the agent behaves
10
Rationality
● Performance measure
○ Define the criterion of success
● Prior knowledge
○ What the agent knows about the environment
● Percept sequence
○ The agent’s percept to date
● Actions
○ What the agent can perform
11
Definition of rational agents
● For each possible percept sequence, a rational agent should
select an action that is expected to maximize its performance
measure, given the evidence provided by the percept sequence
and whatever built-in knowledge the agent has.
12
Example: Vacuum machine
● Performance measure
○ Award one point for each clean square at each time step,
over 10000 timesteps
● Prior knowledge about the environment
○ The geography of the environment (2 squares)
○ The effect of the actions
● Actions the agent can perform
○ Left, Right, Suck and Do Nothing
● Percept sequences
○ Where the agent is
○ Whether the location contains dirt
● The agent is rational.
13
Omniscience, Learning, Autonomy
● Omniscience: knows the actual outcome of its actions in advance; no other possible outcomes; impossible in the real world
● Learning: a rational agent must learn as much as possible from what it perceives
● Autonomy: a rational agent should be autonomous, learning what it can to compensate for partial or incorrect prior knowledge
14
Examples
● Hide and seek
● Apple collector
● https://youtu.be/shccS5kJvtQ
● https://youtu.be/FIz8ycRCSw0
15
Task environment
● Task environments are essentially the “problems” to which
rational agents are the “solutions”.
● PEAS:
○ Performance measure
○ Environment
○ Agent’s Actuators
○ Agent’s Sensors
● In designing an agent, the first step must always be to specify
the task environment (PEAS) as fully as possible.
16
Example: Automated taxi driver
● Agent type: taxi driver
● Performance measure: safe, fast, legal, comfortable trip,
maximize profits
● Environment: roads, other traffic, pedestrians, passengers.
● Sensors: cameras, sonar, speedometer, GPS, odometer,
accelerometer, engine sensors, etc.
● Actuators: steering, accelerator, brake, signal, horn, display.
17
Properties of Task Environment
● Fully observable vs. Partially observable
● Single agent vs. Multiagent
● Deterministic vs. Stochastic
● Episodic vs. Sequential
● Static vs. Dynamic
● Discrete vs. Continuous
● Known vs. Unknown
18
Properties of Task Environment
● Fully observable: The agent’s sensors give it access to the
complete state of the environment.
○ The agent need not maintain internal state to keep track of
the world.
● Partially observable
○ Noisy and inaccurate sensors
○ Parts of the state are simply missing from the sensor data,
e.g., a vacuum agent with only a local dirt sensor cannot tell
whether there is dirt in other squares
● Unobservable: The agent has no sensors at all
19
Properties of Task Environment
● Single agent: An agent operates by itself in an environment.
○ E.g., solving crossword → single-agent,
○ playing chess → two-agent
● Competitive vs. Cooperative multiagent environment
○ E.g., playing chess → competitive,
○ driving on road → cooperative
20
Properties of Task Environment
● Deterministic: The next state of the environment is completely
determined by the current state and the action executed by the
agent.
○ E.g., the vacuum world → deterministic,
○ driving on road → stochastic
● Most real situations are so complex that they must be treated
as stochastic.
21
Properties of Task Environment
● Episodic: The agent’s experience is divided into atomic
episodes, in each of which the agent receives a percept and
then performs a single action.
○ Quality of action depends just on the episode itself
○ Do not need to think ahead
● Sequential: A current decision could affect future decisions.
○ E.g., spotting defective parts on an assembly line vs.
playing chess
22
Properties of Task Environment
● Static: The environment is unchanged while an agent is
deliberating.
○ E.g., crossword puzzles → static, taxi driving → dynamic
● Dynamic: The agent is continuously asked what it wants to do
○ If it has not decided yet, that counts as deciding to do
nothing.
● Semi-dynamic: The environment itself does not change with the
passage of time but the agent’s performance score does
○ E.g., chess playing with a clock
23
Properties of Task Environment
● Discrete vs. continuous
○ The distinction applies to the state of the environment, to the way
time is handled, and to the agent’s percepts and actions
○ E.g., chess has a finite number of distinct states, percepts and
actions, while in taxi driving the vehicle’s speed and location
sweep through a range of continuous values smoothly over time.
● Known vs. unknown
○ Known environment: the outcomes (or outcome probabilities if the
environment is stochastic) for all actions are given.
○ Unknown environment: the agent needs to learn how it works to
make good decisions.
24
Structures of Agents
● Simple reflex agents
● Model-based reflex agents
● Goal-based agents
● Utility-based agents
25
Simple reflex agents
● Select actions based on the current percept, ignoring the rest of
the percept history
● The connection from percept to action is represented by
condition-action rules.
IF current percept THEN action
26
Simple reflex agents
27
Model-based reflex agents
● Partially observable → the agent must keep track of an internal
state
○ It depends on the percept history and reflects some of the
unobserved aspects
○ E.g., driving a car and changing lanes
● The agent program updates the internal state information as
time goes by, encoding two kinds of knowledge
○ How the world evolves independently of the agent
○ How the agent’s actions affect the world
28
Model-based reflex agents
29
Goal-based agents
● The current state of the environment alone is not always enough
● The agent further needs some sort of goal information that
describes desired situations.
○ E.g., at a road junction, the taxi can turn left, turn right, or
go straight on, depending on where the taxi is trying to get
to.
● Less efficient but more flexible
○ Knowledge that supports the decisions is represented
explicitly and can be modified
30
Goal-based agents
31
Utility-based agents
● Goals alone are inadequate to generate high-quality behavior
in most environments
○ Many action sequences can achieve the goal; some are
better, and some are worse
○ E.g., go home: Vinasun taxi or Grab car?
● An agent’s utility function is essentially an internalization of the
performance measure.
○ Goal → success, utility → degree of success (how
successful it is)
○ If state A is preferred over other states, then A has higher
utility
32
Utility-based agents
33
Learning Agents
34
Learning Agents
● Learning in intelligent agents is a process of modification of
each component of the agent to bring the components into
closer agreement with the available feedback information,
thereby improving the overall performance of the agent.
○ Learning element → Making improvement
○ Performance element → Selecting external actions
○ Critic → Tells the Learning element how well the agent is
doing with respect to fixed performance standard.
(Feedback from user or examples, good or not?)
○ Problem generator → Suggest actions that will lead to new
and informative experiences.
35
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
36
Introduction to
Artificial Intelligence
Lecture: Problem Solving Using Searching
Outline
● Problem-Solving Agents
● Example Problems
● Searching for Solutions
2
Holiday in Romania
3
Goal-based Agents
● Intelligent agents maximize their performance measure.
● Goals help organize behavior by limiting the objectives that the
agent is trying to achieve and the actions it considers.
4
Problem Formulation
● Consider a goal to be a set of world states in which the
objective is satisfied.
● Problem formulation is the process of deciding what actions
and states to consider, given a goal.
5
Properties of the Romania environment
● Observable
○ Each city has a sign indicating its presence for arriving
drivers.
○ The agent always knows the current state.
● Discrete
○ Each city is connected to a small number of other cities.
○ There are only finitely many actions to choose from any
given state.
● Known
○ The agent knows which states are reached by each action.
● Deterministic
○ Each action has exactly one outcome.
6
Solving problem by searching
● Search: the process of looking for a sequence of actions that
reaches the goal
● A search algorithm takes a problem as input and returns a
solution in the form of an action sequence.
● Execution phase: once a solution is found, the recommended
actions are carried out.
7
Solving problem by searching
function PROBLEM-SOLVING-AGENT(percept) returns an action
  persistent: seq, an action sequence, initially empty
              state, some description of the current world state
              goal, a goal, initially null
              problem, a problem formulation
  state ← UPDATE-STATE(state, percept)
  if seq is empty then
    goal ← FORMULATE-GOAL(state)
    problem ← FORMULATE-PROBLEM(state, goal)
    seq ← SEARCH(problem)
    if seq = failure then return a null action
  action ← FIRST(seq)
  seq ← REST(seq)
  return action
8
Well-defined problems and solutions
● A problem can be defined formally by five components.
○ Initial state: in which the agent starts
■ E.g., the agent in Romania has its initial state described as In(Arad)
○ Actions: the possible actions available to the agent
■ E.g., ACTIONS(Arad) = {Go(Sibiu), Go(Timisoara), Go(Zerind)}
○ Transition model: what each action does
■ E.g., RESULT(In(Arad), Go(Zerind)) = In(Zerind)
○ Successor: a state reachable from a given state by a single
action
9
Well-defined problems and solutions
○ Goal test: determine whether a given state is a goal state
■ The goal is specified by either an explicit set of possible
goal states or an abstract property.
■ E.g., In(Bucharest), checkmate
○ Path cost: a function that sets a numeric cost to each path
■ Non-negative, reflecting the agent’s performance
measure
■ E.g., c(In(Arad), Go(Zerind), In(Zerind)) = 75
● An optimal solution has the lowest path cost.
10
Formulating problems by abstraction
● Abstraction creates an approximate, simplified model of the
real world, which is otherwise too detailed for a computer.
● This is critical for automated problem solving.
● The choice of a good abstraction involves
○ Removing as much detail as possible while
○ Retaining validity and ensuring that the abstract actions are
easy to carry out.
11
The Vacuum-cleaner world
● States: determined by both the agent location and the dirt
locations
○ 2 × 2^2 = 8 possible world states (n × 2^n in general)
● Initial state: Any state can be designated as the initial state.
● Actions: Left, Right, and Suck
● Transition model: The actions have their expected effects.
● Goal test: whether all the squares are clean
● Path cost: each step costs 1
12
The Vacuum-cleaner world
13
The 8-puzzle
● States: the location of each of the eight tiles and the blank
● Initial state: any state can be designated as the initial state
● Actions: movements of the blank space
○ Left, Right, Up, or Down.
○ Different subsets of these are possible depending on where
the blank is
● Transition model: return a resulting state given a state and an
action
● Goal test: whether the state matches the goal configuration
● Path cost: each step costs 1
14
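A minimal Python sketch of this formulation, assuming a state is a 9-tuple read row by row with 0 for the blank (the start state below matches the homework figure later in this lecture); this is an illustration, not the course's reference code.

def actions(state):
    # Legal moves of the blank (0) on a 3x3 board stored as a 9-tuple.
    i = state.index(0)
    moves = []
    if i % 3 > 0: moves.append('Left')
    if i % 3 < 2: moves.append('Right')
    if i >= 3:    moves.append('Up')
    if i < 6:     moves.append('Down')
    return moves

def result(state, action):
    # Transition model: swap the blank with the neighboring tile.
    i = state.index(0)
    j = i + {'Left': -1, 'Right': 1, 'Up': -3, 'Down': 3}[action]
    s = list(state)
    s[i], s[j] = s[j], s[i]
    return tuple(s)

start = (1, 8, 2, 0, 4, 3, 7, 6, 5)       # 182 / _43 / 765
print(actions(start))                      # ['Right', 'Up', 'Down']
print(result(start, 'Up'))                 # (0, 8, 2, 1, 4, 3, 7, 6, 5)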
The 8-puzzle
15
The 8-queens
● Incremental formulation: add a queen step-by-step to the empty
initial state
● Complete-state formulation: start with all 8 queens on the board
and move them around
● The path cost is trivial because only the final state counts
16
The 8-queens
● States: any arrangement of 0 to 8 queens on the board
● Initial state: no queens on the board
● Actions: add a queen to any empty square
● Transition model: returns the board with a queen added to the
specified square
● Goal test: 8 queens are on the board, none attacked
● 64 · 63 ⋯ 57 ≈ 1.8 × 10^14 possible sequences to investigate
17
Search Tree
● Search algorithms consider many possible action sequences to
find the solution sequence.
● Search tree: the possible action sequences starting at the initial
state (root)
○ Branches are actions and nodes are states in the problem’s
state space
● Frontier: the set of all leaf nodes available for expansion at any
given point
● Search algorithms all share this basic structure; they vary in
how they choose which state to expand next, the so-called
search strategy.
18
Search Tree
19
Search Tree
function TREE-SEARCH(problem) returns a solution, or failure
  initialize the frontier using the initial state of problem
  loop do
    if the frontier is empty then return failure
    choose a leaf node and remove it from the frontier
    if the node contains a goal state then return the corresponding solution
    expand the chosen node, adding the resulting nodes to the frontier
20
Redundant paths
● Redundant paths are unavoidable.
● Following redundant paths may cause a tractable problem to
become intractable.
● This is true even for algorithms that know how to avoid infinite
loops.
21
Graph Search
function GRAPH-SEARCH(problem) returns a solution, or failure
  initialize the frontier using the initial state of problem
  initialize the explored set to be empty
  loop do
    if the frontier is empty then return failure
    choose a leaf node and remove it from the frontier
    if the node contains a goal state then return the corresponding solution
    add the node to the explored set
    expand the chosen node, adding the resulting nodes to the frontier
      only if they are not already in the frontier or the explored set
22
Graph Search
23
Infrastructure for search algorithms
● Each node n is structured by four components.
○ n.STATE: the state in the state space to which the node corresponds
○ n.PARENT: the node in the search tree that generated node n
○ n.ACTION: the action applied to the parent to generate n
○ n.PATH-COST: the cost, denoted by g(n), of the path from the initial state to the node, as indicated by the parent pointers
● The frontier can be implemented with a (priority) queue or stack.
● The explored set can be a hash table that allows for efficient
checking of repeated states.
24
Infrastructure for search algorithms
25
Problem-solving performance
● Completeness: does it always find a solution if one exists?
● Time complexity: how long does it take to find a solution?
● Space complexity: how much memory is needed to perform
the search?
● Optimality: does it always find a least-cost solution?
26
Homework
● Task 1:
○ Given the start state and the goal state of the 8-puzzle on
page 15.
○ Draw the search tree by expanding nodes.
○ The expansion stops when the goal state is expanded.
● Task 2:
○ Program Task 1 in Python using Google Colab
○ Use graph search instead of tree search
27
Homework
[Figure: hand-drawn search tree for the 8-puzzle homework, expanding the start state 182 / _43 / 765 with blank moves U, D, L.]
28
Homework
from graphviz import Digraph   # renders the search tree as a directed graph

dot = Digraph()
# Each node label shows a 3x3 board row by row; '_' marks the blank.
dot.node('0', '182\n_43\n765')   # start state
dot.node('1', '182\n4_3\n765')
dot.node('2', '_82\n143\n765')
dot.node('3', '182\n743\n_65')
# Edges are labeled with the move that produced each child state.
dot.edge('0', '1', 'L')
dot.edge('0', '2', 'D')
dot.edge('0', '3', 'U')
dot                              # display the rendered tree in the notebook
29
Homework
● Conduct homework in the given notebook
30
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
31
Introduction to
Artificial Intelligence
Lecture: Uninformed Search
Outline
● Uninformed Search Strategies
● Breadth-first Search
● Uniform-cost Search
● Depth-first Search
● Depth-limited Search
● Iterative Deepening Search
● Bidirectional Search
2
Uninformed Search Strategies
● No additional information about states beyond that provided in
the problem definition → Blind Search
○ Breadth-first Search
○ Uniform-cost Search
○ Depth-first Search
○ Depth-limited Search
○ Iterative Deepening Search
○ Bidirectional Search
3
Breadth-first Search
4
Breadth-first Search
● Implementation: frontier is a FIFO queue
● The goal test is applied to each node when it is generated
rather than when it is selected for expansion.
● Discard any new path to a state already in the frontier or in the
explored set.
5
Breadth-first Search
function BREADTH-FIRST-SEARCH(problem) returns a solution, or failure
  node ← a node with STATE = problem.INITIAL-STATE, PATH-COST = 0
  if problem.GOAL-TEST(node.STATE) then return SOLUTION(node)
  frontier ← a FIFO queue with node as the only element
  explored ← an empty set
  loop do
    if EMPTY?(frontier) then return failure
    node ← POP(frontier)
    add node.STATE to explored
    for each action in problem.ACTIONS(node.STATE) do
      child ← CHILD-NODE(problem, node, action)
      if child.STATE is not in explored or frontier then
        if problem.GOAL-TEST(child.STATE) then return SOLUTION(child)
        frontier ← INSERT(child, frontier)
6
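A compact Python sketch of the pseudocode (graph-search BFS with the goal test at generation time). It assumes states are hashable and that expand(state) yields (action, next_state) pairs; for the 8-puzzle sketch from the previous lecture, expand could be lambda s: [(a, result(s, a)) for a in actions(s)].

from collections import deque

def breadth_first_search(start, goal_test, expand):
    if goal_test(start):
        return []                          # empty action sequence
    frontier = deque([(start, [])])        # FIFO queue of (state, path)
    reached = {start}                      # explored set plus frontier states
    while frontier:
        state, path = frontier.popleft()
        for action, child in expand(state):
            if child not in reached:
                if goal_test(child):       # test when generated, as in the pseudocode
                    return path + [action]
                reached.add(child)
                frontier.append((child, path + [action]))
    return None                            # failure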
Breadth-first Search
● Draw the search tree S → G
● Write down the path
● Nodes at the same level are handled in alphabetical order
7
Breadth-first Search
8
Breadth-first Search
● Evaluation
○ Completeness: YES
○ Time complexity: O(b^d)
○ Space complexity: O(b^d)
○ Optimality: YES if step costs are uniform
● Terms:
○ b: maximum branching factor
○ d: depth of the shallowest solution
○ m: maximum depth of the search tree
9
Uniform-cost Search
● UCS expands the node 𝑛 with the lowest path cost 𝑔(𝑛)
● Implementation: frontier is a priority queue ordered by 𝑔
○ Equivalent to breadth-first search if step costs all equal
○ Equivalent to Dijkstra’s algorithm in general
● The goal test is applied to a node when it is selected for
expansion.
10
Uniform-cost Search
function UNIFORM-COST-SEARCH(problem) returns a solution, or failure
  node ← a node with STATE = problem.INITIAL-STATE, PATH-COST = 0
  frontier ← a priority queue ordered by PATH-COST, with node as the only element
  explored ← an empty set
  loop do
    if EMPTY?(frontier) then return failure
    node ← POP(frontier)
    if problem.GOAL-TEST(node.STATE) then return SOLUTION(node)
    add node.STATE to explored
    for each action in problem.ACTIONS(node.STATE) do
      child ← CHILD-NODE(problem, node, action)
      if child.STATE is not in explored or frontier then
        frontier ← INSERT(child, frontier)
      else if child.STATE is in frontier with higher PATH-COST then
        replace that frontier node with child
11
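A Python sketch using heapq as the priority queue; expand(state) is assumed to yield (action, next_state, step_cost) triples. Since heapq has no decrease-key, the pseudocode's "replace that frontier node" is emulated by pushing a cheaper duplicate and skipping stale entries on pop.

import heapq, itertools

def uniform_cost_search(start, goal_test, expand):
    counter = itertools.count()            # tie-breaker so heapq never compares states
    frontier = [(0, next(counter), start, [])]
    best_g = {start: 0}
    while frontier:
        g, _, state, path = heapq.heappop(frontier)
        if g > best_g.get(state, float('inf')):
            continue                       # stale entry; a cheaper path was found
        if goal_test(state):               # goal test on expansion, as in the pseudocode
            return path, g
        for action, child, cost in expand(state):
            g2 = g + cost
            if g2 < best_g.get(child, float('inf')):
                best_g[child] = g2
                heapq.heappush(frontier, (g2, next(counter), child, path + [action]))
    return None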
Uniform-cost Search
● Draw the search tree S → G
● Write down the path
12
Uniform-cost Search
● Search path: S → d → e → r → f → G, cost = 10
13
Uniform-cost Search
● Evaluation:
○ Completeness: YES if the best solution has a finite cost and
the minimum arc cost is positive
○ Time complexity: O(b^(1+⌊C*/ε⌋))
■ C*: the cost of the best solution
■ ε: minimal action cost
○ Space complexity: O(b^(1+⌊C*/ε⌋))
○ Optimality: YES
14
Depth-first Search
● Implementation: frontier is a LIFO Stack
● Evaluation:
○ Completeness: YES if loops prevented
○ Time complexity: 𝑂(𝑏𝑚)
○ Space complexity: 𝑂(𝑏𝑚)
○ Optimality: NO
15
Depth-first Search
● Draw the search tree S → G
● Write down the path
● Nodes at the same level are handled in alphabetical order
16
Depth-limited Search
● Standard DFS with a predetermined depth limit 𝑙, i.e., nodes at
depth 𝑙 are treated as if they have no successors.
→ infinite problems solved
● Depth limits can be based on knowledge of the problem.
17
Depth-limited Search
function DEPTH-LIMITED-SEARCH(problem, limit) returns a solution, or failure/cutoff
  return RECURSIVE-DLS(MAKE-NODE(problem.INITIAL-STATE), problem, limit)

function RECURSIVE-DLS(node, problem, limit) returns a solution, or failure/cutoff
  if problem.GOAL-TEST(node.STATE) then return SOLUTION(node)
  else if limit = 0 then return cutoff
  else
    cutoff_occurred? ← false
    for each action in problem.ACTIONS(node.STATE) do
      child ← CHILD-NODE(problem, node, action)
      result ← RECURSIVE-DLS(child, problem, limit − 1)
      if result = cutoff then cutoff_occurred? ← true
      else if result ≠ failure then return result
    if cutoff_occurred? then return cutoff else return failure
18
Depth-limited Search
● Evaluation:
○ Completeness: NO if l < d
○ Time complexity: O(b^l)
○ Space complexity: O(bl)
○ Optimality: NO in general (even when l > d)
19
Iterative Deepening Search
● General strategy, often used in combination with depth-first tree
search to find the best depth limit.
● Gradually increase the limit until a goal is found.
function ITERATIVE-DEEPENING-SEARCH(problem) returns a solution, or failure
  for depth = 0 to ∞ do
    result ← DEPTH-LIMITED-SEARCH(problem, depth)
    if result ≠ cutoff then return result
20
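A Python sketch of both procedures under the same assumptions as the earlier search sketches (expand(state) yields (action, next_state) pairs); 'cutoff' is returned as a sentinel string and None signals failure.

def depth_limited_search(state, goal_test, expand, limit):
    if goal_test(state):
        return []
    if limit == 0:
        return 'cutoff'
    cutoff_occurred = False
    for action, child in expand(state):
        result = depth_limited_search(child, goal_test, expand, limit - 1)
        if result == 'cutoff':
            cutoff_occurred = True
        elif result is not None:
            return [action] + result       # prepend the action that led to the solution
    return 'cutoff' if cutoff_occurred else None

def iterative_deepening_search(state, goal_test, expand, max_depth=50):
    # The pseudocode loops to infinity; a cap keeps this sketch terminating.
    for depth in range(max_depth + 1):
        result = depth_limited_search(state, goal_test, expand, depth)
        if result != 'cutoff':
            return result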
Iterative Deepening Search
21
Iterative Deepening Search
22
Iterative Deepening Search
● Evaluation:
○ Completeness: YES when the branching factor is finite
○ Time complexity: O(b^d)
○ Space complexity: O(bd)
○ Optimality: YES if step costs are uniform
23
Bidirectional Search
● Two simultaneous searches: one forward from the initial state
and the other backward from the goal state
● Hoping that the two searches meet in the middle
24
Summary
● Comparison of uninformed algorithms (tree-search versions)
25
Homework
● Conduct homework in the given notebook.
26
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
27
Introduction to
Artificial Intelligence
Lecture: Informed Search
Outline
● Informed Search Strategies
● Best-first Search
● Greedy Best-first Search
● A* Search
● Heuristic Dominance
2
Informed search strategies
● Use problem-specific knowledge beyond the definition of the
problem itself
● Find solutions more efficiently
● Provide significant speed-up in practice
3
What are heuristics?
● Additional knowledge of the problem is imparted to the search
algorithm using heuristics.
● A heuristic is any practical approach to problem solving that is
sufficient for reaching an immediate goal when finding an
optimal solution is impossible or impractical.
○ Not guaranteed to be optimal, perfect, logical, or rational
○ Speeds up the process of finding a satisfactory solution
○ Eases the cognitive load of making a decision
4
What are heuristics?
5
Best-first search
● An instance of the general TREE-SEARCH or GRAPH-
SEARCH algorithm
● A node is selected for expansion based on an evaluation
function, 𝒇(𝒏).
● Node with the lowest 𝑓(𝑛) is expanded first
● The choice of 𝑓 determines the search strategy.
6
Heuristic function
● h(n): estimated cost of the cheapest path from the state at
node n to a goal
● Unlike g(n), h(n) depends only on the state at that node
● Assumptions on h(n)
○ Arbitrary, nonnegative, problem-specific functions
○ Constraint: if n is a goal node, then h(n) = 0
7
g(n) vs h(n)
[Diagram: path S → n → G. g(n) is the path cost from S to n; h(n) is the estimated cost from n to G. UCS evaluates f(n) = g(n); g(S) = 0 and h(G) = 0.]
8
Greedy best-first search
[Diagram: same path S → n → G; greedy best-first search evaluates f(n) = h(n).]
9
Straight-line distance heuristic
10
Greedy best-first search
● Draw the search tree
● Determine the path
● Compute the path-cost
11
Greedy best-first search: Evaluation
● Completeness:
○ NO – may get stuck forever
● Time complexity
○ O(b^m) → reduced substantially with a good heuristic
● Space complexity
○ O(b^m) – keeps all nodes in memory
● Optimality
○ NO
12
A* search
● The most widely known form of best-first search
● Evaluate nodes by f(n) = g(n) + h(n)
○ 𝑔(𝑛) is the cost to reach the node 𝑛
○ h(𝑛) is the cost to get from 𝑛 to the goal
○ 𝑓(𝑛) = estimated cost of the cheapest solution through 𝑛
13
A* search
[Diagram: same path S → n → G; A* evaluates f(n) = g(n) + h(n).]
14
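A Python sketch of A* graph search, mirroring the UCS sketch from the previous lecture but ordering the frontier by f(n) = g(n) + h(n); h is any heuristic function of a state, and expand(state) is again assumed to yield (action, next_state, step_cost) triples.

import heapq, itertools

def a_star_search(start, goal_test, expand, h):
    counter = itertools.count()            # tie-breaker for equal f-values
    frontier = [(h(start), next(counter), 0, start, [])]
    best_g = {start: 0}
    while frontier:
        f, _, g, state, path = heapq.heappop(frontier)
        if g > best_g.get(state, float('inf')):
            continue                       # stale entry; a cheaper path was found
        if goal_test(state):
            return path, g
        for action, child, cost in expand(state):
            g2 = g + cost
            if g2 < best_g.get(child, float('inf')):
                best_g[child] = g2         # push improved entry ordered by f = g + h
                heapq.heappush(frontier,
                               (g2 + h(child), next(counter), g2, child, path + [action]))
    return None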
A* search
15
A* search
● Draw the search tree
● Determine the path
● Compute the path-cost
16
A* search: Evaluation
● Completeness
○ YES if all step costs exceed some finite 𝜖 and if 𝑏 is finite
○ (review the condition for completeness of UCS)
● Optimality
○ YES – with conditions on heuristic being used
● Time complexity
○ Exponential
● Space complexity
○ Exponential (keep all nodes in memory)
17
A* search is not always optimal
18
Heuristic: Admissibility
● h(n) must be an admissible heuristic
○ Never overestimates the cost to reach the goal → optimistic
○ E.g., Euclidean distance, Manhattan distance, straight-line
distance, etc.
19
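For instance, the Manhattan-distance heuristic for the 8-puzzle (using the 9-tuple representation with 0 for the blank, as sketched in the earlier problem-solving lecture) is admissible because every move changes one tile's distance by exactly one:

def manhattan(state, goal):
    # Sum over tiles 1..8 of |row difference| + |column difference|.
    dist = 0
    for tile in range(1, 9):               # the blank is ignored
        i, j = state.index(tile), goal.index(tile)
        dist += abs(i // 3 - j // 3) + abs(i % 3 - j % 3)
    return dist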
Heuristic: Admissibility
● If h(n) is admissible, A* using TREE-SEARCH is optimal
Suppose some suboptimal goal G2 has been generated and is in the frontier.
Let n be an unexpanded node in the frontier such that n is on a shortest path to an optimal goal G.
f(G2) = g(G2) since h(G2) = 0
g(G2) > g(G) since G2 is suboptimal
f(G) = g(G) since h(G) = 0
⇒ f(G2) > f(G) (1)
h(n) ≤ h*(n) since h is admissible
g(n) + h(n) ≤ g(n) + h*(n) = f(G) ⇒ f(n) ≤ f(G) (2)
From (1), (2): f(G2) > f(n) → A* will never select G2 for expansion
20
Heuristic: Consistency
● Admissibility is insufficient for graph search.
○ The optimal path to a repeated state could be discarded if it is
not the first one selected.
● h(n) is consistent if, for every node n and every successor n′ of n
generated by any action a,
h(n) ≤ c(n, a, n′) + h(n′)
● Every consistent heuristic is also admissible.
21
Heuristic: Consistency
● If h(n) is consistent, A* using GRAPH-SEARCH is optimal
If h(n) is consistent, the values of f(n) along any path are non-decreasing:
for a successor n′ of n, g(n′) = g(n) + c(n, a, n′), so
f(n′) = g(n′) + h(n′) = g(n) + c(n, a, n′) + h(n′) ≥ g(n) + h(n) = f(n)
Hence, whenever A* selects a node n for expansion, the optimal path to that node
has been found.
22
Contours of A* search
● A* expands nodes in order of increasing f-value
● A* will expand all nodes with cost f(n) < C*
23
Heuristic dominance
● Given two admissible heuristics h1 and h2
● If h2(n) ≥ h1(n) for all n, then h2 dominates h1
● A* using h2 will never expand more nodes than A* using h1
● It is better to use a heuristic function with higher values, provided
it is consistent and its computation time is not too long.
24
Homework
● Conduct homework in the given notebook.
25
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
26
Introduction to
Artificial Intelligence
Lecture: Local Search
Outline
● Optimization Problems
● Hill-climbing Search
● Simulated Annealing Search
● Local Beam Search
● Genetic Algorithm
2
Global search algorithms
● Global search: explore search spaces systematically
○ Keep one or more paths in memory and record which
alternatives have been explored at each point along the
path
● Many problems do not fit the “standard” search model
○ The final configuration matters, not the order in which it is
formed.
○ No “goal test” and no “path cost”
3
Optimization problems
● The optimization problems find the best state according to an
objective function
○ E.g., the Knight’s tour, TSP, scheduling, integrated-circuit
design, factory-floor layout, automatic programming, vehicle
routing, etc.
4
Local search algorithms
● Operate using a single current node and generally move only to
neighbors of that node
● Not systematic: the paths followed by the search are typically
not retained.
● Use very little memory: usually a constant amount
● Often able to find reasonable solutions in large or infinite
(continuous) state spaces.
5
Local search and Optimization
● “Pure optimization” problems
○ All states have an objective function
○ Goal is to find state with max (or min) objective value
● Local search can do quite well on these problems.
6
State-space landscape
● A landscape has both “location” and “elevation”: location
corresponds to a state, elevation to the heuristic cost or the
objective function
● A complete local search algorithm finds a goal if one exists.
● An optimal local search algorithm finds a global extremum
7
Hill-climbing Search
● A loop that continually moves in the direction of increasing
value and terminates when it reaches a “peak”.
● Not look ahead beyond the immediate neighbors of the current
state.
● No search tree maintained, only the state and the objective
function’s value for the current node recorded.
function HILL-CLIMBING(problem) returns a local maximum
  current ← MAKE-NODE(problem.INITIAL-STATE)
  loop do
    neighbor ← a highest-valued successor of current
    if neighbor.VALUE ≤ current.VALUE then return current.STATE
    current ← neighbor
8
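A Python sketch of steepest-ascent hill climbing; value(state) is the objective function and neighbors(state) returns the successor states (both assumed, problem-specific callables).

def hill_climbing(state, value, neighbors):
    while True:
        best = max(neighbors(state), key=value, default=None)
        if best is None or value(best) <= value(state):
            return state                   # no neighbor improves: a local maximum
        state = best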
Hill-climbing: 8-queens problem
● Complete-state formulation
○ All 8 queens on the board, one per column
● Successor function
○ Move a single queen to another square in the same column
○ 8×7=56 successors
● Heuristic cost function h(𝑛)
○ The number of pairs of queens that are ATTACKING each
other, either directly or indirectly
○ Global minimum has h(𝑛) = 0
9
Hill-climbing Search
● Hill climbing is often suboptimal, due to local maxima, ridges,
and plateaux
10
Hill-climbing Search Variants
● Stochastic hill climbing:
○ choose at random from among the uphill moves, with a
selection probability that varies with the move’s steepness
● First-choice hill climbing:
○ generate successors randomly until one is generated that is
better than the current state
● Random-restart hill climbing:
○ A series of hill-climbing searches from randomly generated
initial states until a goal is found
11
Simulated Annealing
● Combine hill climbing with a random walk in some way that
yields both efficiency and completeness
● Shake hard (i.e., at a high temperature) and then gradually
reduce the intensity of shaking (i.e., lower the temperature)
12
Simulated Annealing
function SIMULATED-ANNEALING(problem, schedule) returns a solution state
  inputs: problem, a problem
          schedule, a mapping from time to “temperature”
  current ← MAKE-NODE(problem.INITIAL-STATE)
  for t = 1 to ∞ do
    T ← schedule(t)
    if T = 0 then return current
    next ← a randomly selected successor of current
    ΔE ← next.VALUE − current.VALUE
    if ΔE > 0 then current ← next
    else current ← next only with probability e^(ΔE/T)
13
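A Python sketch of the pseudocode; random_neighbor(state) draws a random successor and schedule(t) maps time to temperature. The cooling schedule in the comment is only an illustrative choice, not one prescribed by the lecture.

import math, random

def simulated_annealing(state, value, random_neighbor, schedule):
    t = 1
    while True:
        T = schedule(t)
        if T == 0:
            return state
        nxt = random_neighbor(state)
        dE = value(nxt) - value(state)
        # Always accept uphill moves; accept downhill moves with prob. e^(dE/T).
        if dE > 0 or random.random() < math.exp(dE / T):
            state = nxt
        t += 1

# e.g., geometric cooling that reaches zero after 10,000 steps:
# schedule = lambda t: 0 if t > 10_000 else 100 * 0.99 ** t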
Local Beam Search
● Keep track of 𝑘 states rather than just one
● Begin with 𝑘 randomly generated states
● At each step, all successors of all 𝑘 states are generated
● If the goal is not found, select the 𝑘 best successors from the
complete list and repeat
14
Genetic Algorithms
● A variant of stochastic beam search
● Successor states are generated by combining two parent
states rather than by modifying a single state
● Reproduction is sexual rather than asexual
15
Genetic Algorithms
● Population: a set of 𝑘 randomly generated states that GA
begins with
● Fitness function: an objective function that rates each state
○ Higher values for better states
○ E.g., 8-queens: the number of non-attacking pairs of
queens (min = 0, max = 8×7/2 = 28)
● Produce the next generation by “simulated evolution”
○ Random selection
○ Crossover
○ Random mutation
● Each state, or individual, is represented as a string over a finite
alphabet – most commonly, a string of 0s and 1s.
16
Genetic Algorithms
● Pairs are selected at random for reproduction.
● The probability of being chosen for reproducing is directly
proportional to the fitness score.
● One individual may be selected several times or not at all.
17
Genetic Algorithms
function GENETIC-ALGORITHM(population, FITNESS-FN) returns an individual
  inputs: population, a set of individuals
          FITNESS-FN, a function that measures the fitness of an individual
  repeat
    new_population ← empty set
    for i = 1 to SIZE(population) do
      x ← RANDOM-SELECTION(population, FITNESS-FN)
      y ← RANDOM-SELECTION(population, FITNESS-FN)
      child ← REPRODUCE(x, y)
      if (small random probability) then child ← MUTATE(child)
      add child to new_population
    population ← new_population
  until some individual is fit enough, or enough time has elapsed
  return the best individual in population, according to FITNESS-FN
18
Genetic Algorithms
function REPRODUCE(x, y) returns an individual
  inputs: x, y, parent individuals
  n ← LENGTH(x)
  c ← random number from 1 to n
  return APPEND(SUBSTRING(x, 1, c), SUBSTRING(y, c + 1, n))
19
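The two genetic operators in Python, assuming individuals are strings over a finite alphabet as on the earlier slide; REPRODUCE is single-point crossover, and the mutate helper is an illustrative addition, not part of the pseudocode above.

import random

def reproduce(x, y):
    c = random.randrange(1, len(x))        # crossover point, 1 .. n-1
    return x[:c] + y[c:]

def mutate(child, alphabet='01'):
    i = random.randrange(len(child))       # replace one random position
    return child[:i] + random.choice(alphabet) + child[i + 1:]

print(reproduce('11110000', '00001111'))   # e.g., '11111111' when c == 4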
Homework
● Conduct homework in the given notebook.
20
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
21
Introduction to
Artificial Intelligence
Lecture: Adversarial Search
Outline
● Games
● Optimal Decisions in Games
● α-β Pruning
● Imperfect, Real-time Decisions
● Stochastic Games
2
Multiagent environments
● Each agent needs to consider the actions of other agents and
how they affect its own welfare.
● The unpredictability of other agents introduces contingencies
into the agent’s problem-solving process
● Game theory views any multiagent environment as a game.
○ The impact of each agent on the others is “significant,”
regardless of whether the agents are cooperative or
competitive.
● Types of games:
○ Perfect information vs Imperfect information
○ Deterministic vs Chance
A sequential game has perfect information if each player, when making any decision, is perfectly informed of all the events
that have previously occurred, including the "initialization event" of the game.
3
Types of game
● Perfect information, deterministic: chess, checkers, Go, Othello
● Perfect information, chance: backgammon, Monopoly
● Imperfect information, chance: bridge, poker, Scrabble, nuclear war
4
Adversarial search
● Adversarial search (known as games) covers competitive
environments in which the agents’ goals are in conflict.
● Zero-sum games of perfect information
○ Deterministic, fully observable environments, turn-taking,
two-player
○ The utility values at the end are always equal and opposite.
5
Primary assumptions
● Two players only, called MAX and MIN.
○ MAX moves first, and then they take turns moving until the
game ends
○ Winner gets reward, loser gets penalty.
● Both players have complete knowledge of the game’s state
● No element of chance
● Zero-sum games
○ The total payoff to all players is the same for every game
instance.
● Rational players
○ Each player always tries to maximize his/her utility
6
Games as search
● S0 – Initial state: how the game is set up at the start
● PLAYER(s): which player has the move in a state, MAX or MIN?
● ACTIONS(s) – Successor function: a list of (move, state) pairs
specifying legal moves
● RESULT(s, a) – Transition model: the result of move a in state s
● TERMINAL-TEST(s): is the game finished?
● UTILITY(s, p) – Utility function: a numerical value of a terminal
state s for a player p
7
Games vs. Search problems
● Complexity: games are too hard to solve exactly
● Time limits: some decision must be made even when computing
the optimal decision is infeasible
● Efficiency: games penalize inefficiency severely
○ Several interesting ideas on how to make the best possible
use of time have been spawned.
8
Search Tree of Tic-Tac-Toe
9
Optimal decision in games
● Normal search problem
○ An optimal solution is a sequence of actions leading to a goal
state.
● Games
○ A search strategy that guarantees a win for a player
○ The optimal strategy can be determined from the minimax
value of each node
● MINIMAX(s) =
UTILITY(s) if TERMINAL-TEST(s)
max_{a∈ACTIONS(s)} MINIMAX(RESULT(s, a)) if PLAYER(s) = MAX
min_{a∈ACTIONS(s)} MINIMAX(RESULT(s, a)) if PLAYER(s) = MIN
10
Example
[Figure: two-ply game tree annotated with MAX’s best move, MIN’s best moves, and the utility values for MAX at the leaves.]
11
The minimax algorithm
● Compute the minimax decision from the current state
● Use a simple recursive computation of the minimax values of
each successor state
○ The recursion proceeds all the way down to the leaves of
the tree, and then the minimax values are backed up
through the tree as the recursion unwinds.
12
The minimax algorithm
function MINIMAX-DECISION(state) returns an action
  return argmax_{a ∈ ACTIONS(state)} MIN-VALUE(RESULT(state, a))

function MAX-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← −∞
  for each a in ACTIONS(state) do
    v ← MAX(v, MIN-VALUE(RESULT(state, a)))
  return v

function MIN-VALUE(state) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state) do
    v ← MIN(v, MAX-VALUE(RESULT(state, a)))
  return v
13
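A recursive Python sketch of the pseudocode; game is an assumed object bundling actions, result, terminal_test, and utility for a two-player zero-sum game (any object with those four methods would do).

def minimax_decision(state, game):
    # MAX picks the action whose resulting state has the highest backed-up value.
    return max(game.actions(state),
               key=lambda a: min_value(game.result(state, a), game))

def max_value(state, game):
    if game.terminal_test(state):
        return game.utility(state)
    return max(min_value(game.result(state, a), game)
               for a in game.actions(state))

def min_value(state, game):
    if game.terminal_test(state):
        return game.utility(state)
    return min(max_value(game.result(state, a), game)
               for a in game.actions(state))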
The minimax algorithm
● A complete depth-first exploration of the game tree
● Completeness
○ Yes (if tree is finite)
● Optimality
○ Yes (against an optimal opponent)
● Time complexity
○ O(b^m)
● Space complexity
○ O(bm) (depth-first exploration)
14
The minimax algorithm
15
Problem with minimax search
● The number of game states is exponential in the tree’s depth
→ Do not examine every node
● Alpha-beta pruning: Prune away branches that cannot possibly
influence the final decision
● Bounded lookahead
○ Limit depth for each search
○ This is what chess players do: look ahead for a few moves
and see what looks best
16
Alpha-beta pruning
17
Alpha-beta pruning
18
Alpha-beta pruning
function ALPHA-BETA-SEARCH(state) returns an action
  v ← MAX-VALUE(state, −∞, +∞)
  return the action in ACTIONS(state) with value v

function MAX-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← −∞
  for each a in ACTIONS(state) do
    v ← MAX(v, MIN-VALUE(RESULT(state, a), α, β))
    if v ≥ β then return v
    α ← MAX(α, v)
  return v

function MIN-VALUE(state, α, β) returns a utility value
  if TERMINAL-TEST(state) then return UTILITY(state)
  v ← +∞
  for each a in ACTIONS(state) do
    v ← MIN(v, MAX-VALUE(RESULT(state, a), α, β))
    if v ≤ α then return v
    β ← MIN(β, v)
  return v
19
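The same minimax sketch with α-β bookkeeping added; the game interface is the same assumed one as in the minimax sketch.

import math

def alpha_beta_search(state, game):
    best_action, best_v = None, -math.inf
    for a in game.actions(state):
        v = min_value(game.result(state, a), game, -math.inf, math.inf)
        if v > best_v:
            best_action, best_v = a, v
    return best_action

def max_value(state, game, alpha, beta):
    if game.terminal_test(state):
        return game.utility(state)
    v = -math.inf
    for a in game.actions(state):
        v = max(v, min_value(game.result(state, a), game, alpha, beta))
        if v >= beta:
            return v                       # beta cutoff: MIN would never allow this
        alpha = max(alpha, v)
    return v

def min_value(state, game, alpha, beta):
    if game.terminal_test(state):
        return game.utility(state)
    v = math.inf
    for a in game.actions(state):
        v = min(v, max_value(game.result(state, a), game, alpha, beta))
        if v <= alpha:
            return v                       # alpha cutoff: MAX would never allow this
        beta = min(beta, v)
    return v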
Alpha-beta pruning
● Pruning does not affect the result
● Good move ordering improves effectiveness of pruning
● Killer move heuristic
● A transposition table avoids re-evaluating a state
20
Alpha-beta pruning
21
Heuristic minimax
● Both minimax and alpha-beta pruning search all the way to
terminal states.
○ This depth is usually impractical because moves must be
made in a reasonable amount of time (~ minutes).
● Cut off the search earlier with some depth limit
● Use an evaluation function
H-MINIMAX(s, d) =
EVAL(s) if CUTOFF-TEST(s, d)
max_{a∈ACTIONS(s)} H-MINIMAX(RESULT(s, a), d+1) if PLAYER(s) = MAX
min_{a∈ACTIONS(s)} H-MINIMAX(RESULT(s, a), d+1) if PLAYER(s) = MIN
22
Evaluation Functions
● The evaluation function should order the terminal states in the
same way as the true utility function does
○ States that are wins must evaluate better than draws, which
in turn must be better than losses.
● The computation must not take too long!
● For nonterminal states, their orders should be strongly
correlated with the actual chances of winning.
23
Cutting off search
● Minimax-Cutoff is identical to Minimax-Value except that
○ Terminal? is replaced by Cutoff?
○ Utility is replaced by Eval
if CUTOFF-TEST(state, depth) then return EVAL(state)
24
Stochastic Games
● Uncertain outcomes controlled by chance, not an adversary!
● Why wouldn’t we know what the result of an action will be?
○ Explicit randomness: rolling dice
○ Unpredictable opponents: the ghosts respond randomly
○ Actions can fail: when moving a robot, wheels might slip
25
Expectimax search
● Values reflect average-case (expectimax) outcomes, not worst-
case (minimax) outcomes
● Expectimax search: compute the average score under optimal
play
○ Max nodes as in minimax search
○ Chance nodes are like min nodes, but the outcome is
uncertain
○ Calculate expected utilities, i.e. take weighted average of
children
● The underlying uncertain-result problems can be formulated as
Markov Decision Processes
26
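A Python sketch of the expectimax recursion; is_max_node and outcomes (yielding (probability, next_state) pairs at chance nodes) are assumed extensions of the earlier game interface, not lecture-defined names.

def expectimax(state, game):
    if game.terminal_test(state):
        return game.utility(state)
    if game.is_max_node(state):
        return max(expectimax(game.result(state, a), game)
                   for a in game.actions(state))
    # Chance node: probability-weighted average of the outcome values.
    return sum(p * expectimax(s, game) for p, s in game.outcomes(state))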
Expectimax search
27
Expectimax search
● It is possible to perform pruning in expectimax search.
● Common techniques for pruning in expectimax include:
○ Depth Limiting: Limiting the depth of the search tree can
effectively prune branches that are too deep to be practically
explored.
○ Evaluation Function: Using an evaluation function to estimate
the value of a state without fully exploring its subtree. If the
evaluation function indicates that further exploration is unlikely to
yield significant improvements, you can prune the subtree.
○ Probabilistic Pruning: In scenarios where probabilities are
involved, you might prune branches that have very low
probabilities of occurring, as they contribute little to the overall
expectation.
○ Iterative Deepening: Iterative deepening can be combined with
pruning techniques to explore deeper parts of the tree only when
necessary, based on the current state of the search.
28
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
29
Introduction to
Artificial Intelligence
Lecture: Constraint Satisfaction Problems
Outline
● Constraint Satisfaction Problems (CSPs)
● Constraint Propagation: Inference in CSPs
● Backtracking Search for CSPs
● Local Search for CSPs
2
Constraint Satisfaction Problems (CSPs)
● A constraint satisfaction problem (CSP) uses a factored
representation for each state.
○ State = a set of variables, each of which has a value
○ Solution = each variable has a value that satisfies all
constraints on that variable
● Take advantage of the structure of states
● General-purpose rather than problem-specific heuristics
○ Identify variable-value combinations that violate the
constraints → eliminate large portions of the search space
all at once
○ Solutions to complex problems
3
Constraint Satisfaction Problems (CSPs)
● A CSP consists of the following three components
○ X = {X1, ..., Xn}: a set of variables
○ D = {D1, ..., Dn}: a set of domains, one for each variable
○ C: a set of constraints that state allowable combinations of
values
● Each constraint Ci consists of a pair ⟨scope, rel⟩
○ scope: a tuple of variables that participate in the constraint
○ rel: a relation that defines the values the participating
variables can take
4
Constraints in CSPs
● Assume that both 𝑋1 and 𝑋2 have the domain {𝐴,𝐵}
● “Two variables must have different values”
● A relation can be an explicit list of all tuples of values that
satisfy the constraint.
○ <(𝑋1, 𝑋2), {(𝐴, 𝐵), (𝐵, 𝐴)}>
● It can be an abstract relation that supports two operations
○ Test whether a tuple is a member of the relation
○ Enumerate the members of the relation
○ <(𝑋1, 𝑋2), 𝑋1 ≠ 𝑋2>
5
Solutions for CSPs
● Each state is defined by an assignment of values to some or all
the variables.
● A solution to a CSP is a consistent – complete assignment.
○ A consistent assignment does not violate any constraints.
○ A complete assignment has every variable assigned, while
a partial assignment assigns values to only some
variables.
6
Example: Map Coloring
● Each state is assigned a color in {red, green, blue}.
● Adjacent states have different colors.
● Constraint graph:
○ Nodes ⇔ Variables
○ Arcs ⇔ Constraints
7
Example: Map Coloring
● Variables: 𝑋 = {𝑊𝐴, 𝑁𝑇, 𝑄, 𝑁𝑆𝑊, 𝑉, 𝑆𝐴, 𝑇}
● Domains: 𝐷i = {𝑟𝑒𝑑, 𝑔𝑟𝑒𝑒𝑛, 𝑏𝑙𝑢𝑒}
● Constraints: Adjacent regions must have different colors
{𝑆𝐴 ≠ 𝑊𝐴, 𝑆𝐴 ≠ 𝑁𝑇, 𝑆𝐴 ≠ 𝑄, 𝑆𝐴 ≠ 𝑁𝑆𝑊,
𝑆𝐴 ≠ 𝑉, 𝑊𝐴 ≠ 𝑁𝑇, 𝑁𝑇 ≠ 𝑄, 𝑄 ≠ 𝑁𝑆𝑊 , 𝑁𝑆𝑊 ≠ 𝑉}
○ 𝑆𝐴 ≠ 𝑊𝐴 is a shortcut of <(𝑆𝐴,𝑊𝐴), 𝑆𝐴 ≠ 𝑊𝐴>
○ 𝑆𝐴 ≠ 𝑊𝐴 can be fully enumerated as
{(red,green), (red,blue), (green,red),
(green,blue), (blue,red), (blue,green)}
8
Example: Map Coloring
● There are many possible solutions to this problem.
{𝑊𝐴 = 𝑟𝑒𝑑, 𝑁𝑇 = 𝑔𝑟𝑒𝑒𝑛, 𝑄 = 𝑟𝑒𝑑, 𝑁𝑆𝑊 = 𝑔𝑟𝑒𝑒𝑛,
𝑉 = 𝑟𝑒𝑑, 𝑆𝐴 = 𝑏𝑙𝑢𝑒, 𝑇 = 𝑟𝑒𝑑}
9
Benefits of CSPs
● Provide natural representation for a wide variety of problems
● Many problems intractable in regular state-space search can
be solved quickly with CSP formulation.
● Better insights to the problem and its solution.
10
Variations on the CSP formalism
● Discrete and finite variables
○ n variables, domain size d → O(d^n) complete assignments
○ E.g., map coloring, scheduling with time limits, 8-queens
etc.
● Discrete, infinite domains
○ Sets of Integers, strings, etc.
○ E.g., job scheduling without deadlines
○ Constraint language: understand constraints without
enumeration
● Continuous domains
○ Real-world problems often involve continuous domains and
even real-valued variables.
11
CSPs in practice
● Operations Research (scheduling, timetabling)
○ Scheduling the time of observations on the Hubble Space
Telescope
○ Airline schedules
● Linear programming
○ Constraints must be linear equalities or inequalities
→ solved in time polynomial in the number of variables.
● Bioinformatics (DNA sequencing)
● Electrical engineering (circuit layout)
● Cryptography
● Computer vision: image interpretation
12
Types of constraints
● Unary constraint: restrict the value of a single variable
○ 𝑆𝐴 ≠ 𝑔𝑟𝑒𝑒𝑛
● Binary constraint: relate two variables
○ 𝑆𝐴 ≠ 𝑊𝐴
● Higher-order constraints: involve three or more variables
○ E.g., Professors A, B, and C cannot be on a committee
together
○ Can always be represented by multiple binary
constraints
● Global constraints: involving an arbitrary number of variables
○ Alldiff: all variables involved must have different values
13
Preference constraints
● Which solutions are preferred → soft constraints
○ E.g., 𝑟𝑒𝑑 is better than 𝑔𝑟𝑒𝑒𝑛
→ this often can be represented by a cost for each variable
assignment
● Constraint optimization problem (COP): a combination of
optimization with CSPs, e.g., linear programming
14
Constraint propagation
● Constraints help to reduce the number of legal values for a
variable, which in turn reduces the legal values for other
variables.
● Intertwined with search, or done as a preprocessing step.
● Enforcing local consistency in each part of a graph causes
inconsistent values to be eliminated throughout the graph.
15
Node consistency
● A single variable is node-consistent if all the values in the
variable’s domain satisfy the variable’s unary constraints.
● Eliminate all the unary constraints in a CSP by running node
consistency.
● E.g., if the South Australians dislike green, running node
consistency reduces the domain of SA to {red, blue}
16
Arc consistency
● A variable in a CSP is arc-consistent if every value in its
domain satisfies the variable’s binary constraints.
● Arc consistency may have no effect in several cases.
● E.g., in the Australia map, no matter what value is chosen for SA
(or for WA), there is a valid value for the other variable.
17
Arc consistency
18
Arc consistency
● Run as a preprocessor before the search starts or after each
assignment
● Arc consistency must be enforced repeatedly until no
inconsistency remains
● Need a systematic method for arc checking
○ If X loses a value, the neighbors of X need to be rechecked
→ incoming arcs can become inconsistent again while
outgoing arcs remain unaffected
19
AC-3 algorithm
function AC-3(csp)
returns false if an inconsistency is found and true otherwise
inputs: csp, a binary CSP with components (X, D, C)
local variables: queue, a queue of arcs, initially all the arcs in csp
while queue is not empty do
(Xi , Xj) ← REMOVE-FIRST(queue)
if REVISE(csp, Xi , Xj) then
if size of Di = 0 then return false
for each Xk in Xi.NEIGHBORS - {Xj} do
add (Xk , Xi) to queue
return true
20
AC-3 algorithm
function REVISE(csp, Xi , Xj)
returns true iff we revise the domain of Xi
revised ← false
for each x in Di do
if no value y in Dj allows (x ,y) to satisfy the constraint between Xi
and Xj then
delete x from Di
revised ← true
return revised
● The worst-case complexity is O(cd³)
○ n: number of variables, d: domain size of each variable, c: number of binary constraints (arcs)
21
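A direct Python transcription of AC-3 and REVISE is sketched below, assuming the dict-of-sets CSP representation from the map-coloring sketch earlier and hard-coding the ≠ constraint; a general implementation would look up the constraint relation instead.

from collections import deque

def revise(domains, Xi, Xj):
    # Delete every value of Xi with no supporting value in Xj (constraint Xi != Xj).
    revised = False
    for x in set(domains[Xi]):
        if not any(x != y for y in domains[Xj]):
            domains[Xi].discard(x)
            revised = True
    return revised

def ac3(domains, neighbors):
    # neighbors[X] = set of variables sharing a binary constraint with X.
    queue = deque((Xi, Xj) for Xi in neighbors for Xj in neighbors[Xi])
    while queue:
        Xi, Xj = queue.popleft()
        if revise(domains, Xi, Xj):
            if not domains[Xi]:
                return False              # a domain was wiped out: inconsistency
            for Xk in neighbors[Xi] - {Xj}:
                queue.append((Xk, Xi))    # incoming arcs must be rechecked
    return True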
Backtracking Search
● States are defined by the values assigned so far
○ Initial state: empty assignment { }
○ Successor function: assign a value to an unassigned
variable that agrees with the current assignment
→ fail if no legal assignments
○ Goal test: the current assignment is complete
● Depth-first search: choose values for one variable at a time and
backtrack when a variable has no legal values left
22
Backtracking Search
function BACKTRACKING-SEARCH(csp) returns a solution, or failure
return BACKTRACK({ }, csp)
function BACKTRACK(assignment, csp) returns a solution, or failure
if assignment is complete then return assignment
var ← SELECT-UNASSIGNED-VARIABLE(csp)
for each value in ORDER-DOMAIN-VALUES(var, assignment, csp) do
if value is consistent with assignment then
add {var = value} to assignment
inferences ← INFERENCE(csp, var, value)
if inferences != failure then
add inferences to assignment
result ← BACKTRACK(assignment, csp)
if result != failure then
return result
remove {var = value} and inferences from assignment
return failure
23
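A compact Python sketch of BACKTRACKING-SEARCH, using the same representation; the INFERENCE step is omitted and SELECT-UNASSIGNED-VARIABLE is the trivial first-unassigned choice rather than MRV.

def backtracking_search(variables, domains, neighbors):
    def consistent(var, value, assignment):
        # value must differ from every already-assigned neighbor
        return all(assignment.get(n) != value for n in neighbors[var])

    def backtrack(assignment):
        if len(assignment) == len(variables):
            return assignment                 # goal test: assignment is complete
        var = next(v for v in variables if v not in assignment)
        for value in domains[var]:
            if consistent(var, value, assignment):
                assignment[var] = value
                result = backtrack(assignment)
                if result is not None:
                    return result
                del assignment[var]           # undo and try the next value
        return None                           # no legal value left: backtrack
    return backtrack({})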
Backtracking Search
24
Variable and value ordering
● Minimum-remaining-values (MRV) heuristic: choose the
variable with the fewest legal values
○ E.g., after [𝑊𝐴 = 𝑟𝑒𝑑, 𝑁𝑇 = 𝑔𝑟𝑒𝑒𝑛] only one possible value
for 𝑆𝐴
25
Variable and value ordering
● Degree heuristic (DH): choose the variable that is involved in the
largest number of constraints on other unassigned variables.
○ E.g., 𝑆𝐴 has the highest degree, 5
26
Variable and value ordering
● Least constraining value (LCV) heuristic: given a variable,
choose the value that leaves the maximum flexibility for
subsequent variable assignments
27
Inference: Forward checking
● Keep track of remaining legal values for unassigned variables
● Terminate search when any variable has no legal values
28
Local search for CSPs
● Complete-state formulation
○ The initial state assigns a value to every variable, possibly
violating some constraints
○ The search then changes the value of one variable at a time,
trying to eliminate constraint violations
● Min-conflicts heuristic: select the value that causes the minimum
number of conflicts with other variables
● Min-conflicts is surprisingly effective for many CSPs.
29
MIN-CONFLICTS algorithm
function MIN-CONFLICTS(csp, max steps) returns a solution or failure
inputs: csp, a constraint satisfaction problem
max steps, the number of steps allowed before giving up
current ← an initial complete assignment for csp
for i = 1 to max steps do
if current is a solution for csp then return current
var ← a randomly chosen conflicted variable from csp.VARIABLES
value ← the value v for var that minimizes CONFLICTS(var, v,
current, csp)
set var = value in current
return failure
30
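A Python sketch of MIN-CONFLICTS for the same representation; CONFLICTS simply counts violated ≠ constraints against neighbors, and the initial assignment and tie-breaking are random (the pseudocode above leaves both open).

import random

def min_conflicts(variables, domains, neighbors, max_steps=10_000):
    def conflicts(var, value, current):
        return sum(1 for n in neighbors[var] if current[n] == value)

    # initial complete assignment, chosen at random
    current = {v: random.choice(sorted(domains[v])) for v in variables}
    for _ in range(max_steps):
        conflicted = [v for v in variables
                      if conflicts(v, current[v], current) > 0]
        if not conflicted:
            return current                    # current is a solution
        var = random.choice(conflicted)
        current[var] = min(domains[var],
                           key=lambda val: conflicts(var, val, current))
    return None                               # failure: give up after max_steps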
Local search for CSPs
● The landscape of a CSP under the min-conflicts heuristic
usually has a series of plateaux.
● Plateau search: allow sideways moves to another state with the
same score.
● Tabu search: keep a small list of recently visited states and
forbid the algorithm to return to those states
● Simulated annealing can also be used.
31
Constraint weighting
● Concentrate the search on the important constraints
● Each constraint is given a numeric weight Wi, with all weights initially 1.
● At each step, choose a variable/value pair to change that has
the lowest total weight of all violated constraints
● Increase the weight of each constraint that is violated by the
current assignment.
32
Homework
● Complete the homework exercises in the given notebook
33
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
34
Introduction to
Artificial Intelligence
Lecture: Logical Agents
Outline
● Knowledge-Based Agents
● Logic
● Propositional Logic
● Propositional Theorem Proving
● Effective Propositional Model Checking
2
Knowledge-based agents
● Supported by logic – a general class of representation
● Combine and recombine information to suit myriad purposes
● Knowledge base (KB): A set of sentences or facts
○ Each sentence represents some assertion about the world.
○ Axiom = a sentence that is not derived from other
sentences
● Inference: Derive (infer) new sentences from old ones
○ Add new sentences to the knowledge base and query what
is known
3
Knowledge-based agents
function KB-AGENT(percept) returns an action
persistent: KB, a knowledge base
t, a counter, initially 0, indicating time
TELL(KB, MAKE-PERCEPT-SENTENCE(percept, t))
action ← ASK(KB, MAKE-ACTION-QUERY(t))
TELL(KB, MAKE-ACTION-SENTENCE(action, t))
t←t+1
return action
Inference mechanisms are hidden inside TELL and ASK
4
Logic
● Logics are formal languages for representing information such
that conclusions can be drawn
● Syntax defines the well-formed sentences in the language
● Semantics define the "meaning" of sentences
● Models are mathematical abstractions that fix the truth or
falsehood of every relevant sentence.
● 𝑚 satisfies (or is a model of) 𝛼 if 𝛼 is true in model 𝑚
● 𝑀(𝛼) = the set of all models of 𝛼
5
Entailment
● A sentence follows logically from another sentence: 𝜶 ⊨ 𝜷
● 𝜶 ⊨𝜷 if and only if, in every model in which 𝜶 is true, 𝜷 is also
true.
○ 𝑀(𝛼) ⊆ 𝑀(𝛽)
○ 𝑥 = 0 entails 𝑥𝑦 = 0
● Entailment is a relationship between sentences that is based
on semantics.
6
Logical inference
● 𝐾𝐵 ⊨𝑖 𝛼 means 𝛼 can be derived from 𝐾𝐵 by procedure 𝑖.
● Soundness: 𝑖 is sound if whenever 𝐾𝐵 ⊨𝑖 𝛼, it is also true that
𝐾𝐵 ⊨ 𝛼
● Completeness: 𝑖 is complete if whenever 𝐾𝐵 ⊨ 𝛼, it is also true
that 𝐾𝐵 ⊨𝑖 𝛼
7
World and representation
● The reasoning agent gets its knowledge about the facts of the
world as a sequence of logical sentences
● Conclusions must be drawn only from those sentences, without the
agent having independent access to the world
● Thus it is very important that the agent’s reasoning is sound.
8
Propositional Logic: Syntax
● Propositional logic: the simplest logic illustrating basic ideas
● Constants: TRUE or FALSE
● Symbols stand for propositions (sentences): 𝑃, 𝑄, 𝑊, etc.
● Logical connectives
NOT ¬ Negation
AND ∧ Conjunction
OR ∨ Disjunction
IMPLIES ⇒ Implication (if..then)
IFF ⟺ Equivalence, biconditional
● Literal: atomic sentence (P) or negated atomic sentence (¬P)
9
Propositional Logic: Syntax
10
Semantics
● Each model specifies true/false for each proposition symbol
11
A simple inference procedure
● Given: a set of sentences, 𝑲𝑩, and sentence 𝜶
● Goal: answer 𝑲𝑩 ⊨ 𝜶? = “Does 𝑲𝑩 semantically entail 𝜶?”
● Approaches
○ Model-checking approach (Inference by enumeration)
○ Inference rules
○ Conversion to an unsatisfiability check, the complement of SAT
(resolution refutation)
12
Model-checking approach
● Check if 𝛼 is true in every model in which 𝐾𝐵 is true
● Draw a truth table for checking
13
Inference by (depth-first) enumeration
function TT-ENTAILS?(KB, α) returns true or false
inputs: KB, the knowledge base, a sentence in propositional logic
α, the query, a sentence in propositional logic
symbols ← a list of the proposition symbols in KB and α
return TT-CHECK-ALL(KB, α, symbols, { })
function TT-CHECK-ALL(KB, α, symbols, model) returns true or false
if EMPTY?(symbols) then
if PL-TRUE?(KB, model) then return PL-TRUE?(α, model)
else return true // when KB is false, always return true
else do
P ← FIRST(symbols)
rest ← REST(symbols)
return (TT-CHECK-ALL(KB,α,rest,model ∪ {P = true})
and TT-CHECK-ALL(KB,α,rest,model ∪ {P = false}))
14
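A Python sketch of TT-ENTAILS?; as a simplifying assumption, sentences are encoded as Python functions that map a model (a dict from symbol to Boolean) to a truth value.

def tt_entails(kb, alpha, symbols):
    # kb and alpha are callables: model dict -> bool
    def check_all(symbols, model):
        if not symbols:
            # when KB is false in this model, it imposes no requirement
            return alpha(model) if kb(model) else True
        p, rest = symbols[0], symbols[1:]
        return (check_all(rest, {**model, p: True}) and
                check_all(rest, {**model, p: False}))
    return check_all(list(symbols), {})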
Model-checking approach
● Given a KB containing the following rules and facts
R1: IF hot AND smoky THEN fire
R2: IF alarm_beeps THEN smoky
R3: IF fire THEN sprinklers_on
F1: alarm_beeps
F2: hot
● Represent the KB in propositional logic with given symbols
○ H = hot, S = smoky, F = fire, A = alarm_beeps, R = sprinklers_on
● Answer the question “Sprinklers_on?” by using the model-checking
approach.
15
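Using the tt_entails sketch above, the alarm knowledge base R1–R3, F1, F2 can be encoded with the given symbols and queried directly:

def implies(p, q):
    # material implication: p => q
    return (not p) or q

def kb(m):
    return (implies(m['H'] and m['S'], m['F'])   # R1: hot AND smoky => fire
            and implies(m['A'], m['S'])          # R2: alarm_beeps => smoky
            and implies(m['F'], m['R'])          # R3: fire => sprinklers_on
            and m['A'] and m['H'])               # F1: alarm_beeps, F2: hot

query = lambda m: m['R']                         # "Sprinklers_on?"
print(tt_entails(kb, query, ['H', 'S', 'F', 'A', 'R']))  # True: KB entails R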
Inference rules approach
● Theorem proving: Apply rules of inference directly to the
sentences in KB to construct a proof of the desired sentence
without consulting models
16
Logical equivalence
● Two sentences 𝛼 and 𝛽 are logically equivalent if they are true
in the same set of models.
𝜶 ≡ 𝜷 𝒊𝒇𝒇 𝜶 ⊨ 𝜷 𝒂𝒏𝒅 𝜷 ⊨ 𝜶
17
Validity
● A sentence is valid if it is true in all models.
● Valid sentences are also known as tautologies.
● Validity is connected to inference via the Deduction Theorem
𝛼 ⊨ 𝛽 𝑖𝑓𝑓 𝛼 ⇒ 𝛽 𝑖𝑠 𝑣𝑎𝑙𝑖𝑑
18
Satisfiability
● A sentence is satisfiable if it is true in some model.
● A sentence is unsatisfiable if it is true in no models.
● Satisfiability is connected to inference via the following
𝛼 ⊨ 𝛽 𝑖𝑓𝑓 𝛼 ∧ ¬𝛽 𝑖𝑠 𝑢𝑛𝑠𝑎𝑡𝑖𝑠𝑓𝑖𝑎𝑏𝑙𝑒
→ Refutation or proof by contradiction
● The SAT problem determines the satisfiability of sentences in
propositional logic (NP-complete).
19
Validity and Satisfiability
1. 𝐴 ∨ 𝐵 ⇒ 𝐴 ∧ 𝐶
2. 𝐴 ∧ 𝐵 ⇒ 𝐴 ∨ 𝐶
3. (𝐴 ∨ 𝐵) ∧ (¬𝐵 ∨ 𝐶) ⇒ 𝐴 ∨ 𝐶
4. (𝐴 ∨ ¬𝐵) ⇒ 𝐴 ∧ 𝐵
20
Inference and Proofs
● Proof: a chain of conclusions that leads to the desired goal
21
Inference and Proofs
22
Monotonicity
● The set of entailed sentences only increases as information is
added to the knowledge base.
𝑖𝑓 𝐾𝐵 ⊨ 𝛼 𝑡h𝑒𝑛 𝐾𝐵 ∧ 𝛽 ⊨ 𝛼
● Additional conclusions can be drawn without invalidating any
conclusion 𝛼 already inferred.
23
Proof by Resolution
● Proof by Inference Rules: sound but not complete
● Resolution: sound and complete, a single inference rule
● Given that ℓi and m are complementary literals (one is the negation of the other)
○ Unit resolution inference rule: from ℓ1 ∨ … ∨ ℓk and m, infer
ℓ1 ∨ … ∨ ℓi−1 ∨ ℓi+1 ∨ … ∨ ℓk
○ Full resolution rule: from ℓ1 ∨ … ∨ ℓk and m1 ∨ … ∨ mn, where ℓi and mj
are complementary, infer the disjunction of all the remaining literals
24
Proof by Resolution
25
Proof by Resolution
● Factoring: the resulting clause should contain only one copy of
each literal.
● For any sentences 𝛼 and 𝛽 in propositional logic, a resolution-
based theorem prover can decide whether 𝛼 ⊨ 𝛽.
26
Conjunctive Normal Form (CNF)
● Resolution applies only to clauses, i.e. disjunctions of literals
○ Convert all sentences in KB into clauses (CNF form)
● For example, convert A ⇔ (B ∨ C) into CNF
(¬A ∨ B ∨ C) ∧ (¬B ∨ A) ∧ (¬C ∨ A )
27
Conversion to CNF
● Eliminate ⇔: 𝛼 ⇔ 𝛽 ≡ (𝛼 ⇒ 𝛽) ∧ (𝛽 ⇒ 𝛼)
● Eliminate ⇒: 𝛼 ⇒ 𝛽 ≡ ¬𝛼 ∨ 𝛽
● The operator ¬ appears only in literals: “move ¬ inwards”
¬¬𝛼 ≡ 𝛼 (double-negation elimination)
¬(𝛼 ∧ 𝛽) ≡ ¬𝛼 ∨ ¬𝛽 (De Morgan)
¬(𝛼 ∨ 𝛽) ≡ ¬𝛼 ∧ ¬𝛽 (De Morgan)
● Apply the distributivity law to distribute ∨ over ∧
(𝛼 ∧ 𝛽) ∨ 𝛾 ≡ (𝛼 ∨ 𝛾) ∧ (𝛽 ∨ 𝛾)
28
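These conversion steps can be checked mechanically; for example, SymPy's to_cnf performs the same sequence of rewrites (a library demonstration added here, not part of the lecture's own material):

from sympy.abc import A, B, C
from sympy.logic.boolalg import Equivalent, to_cnf

# Convert A <=> (B | C) into CNF, as in the example above.
print(to_cnf(Equivalent(A, B | C)))
# a conjunction of clauses equivalent to (~A | B | C) & (~B | A) & (~C | A)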
The resolution algorithm
● Proof by contradiction: To show that 𝐾𝐵 ⊨ 𝛼, prove 𝐾𝐵 ∧ ¬𝛼 is
unsatisfiable
function PL-RESOLUTION(KB, α) returns true or false
inputs: KB, the knowledge base, a sentence in propositional logic
α, the query, a sentence in propositional logic
clauses ← the set of clauses in the CNF representation of KB ∧ ¬α
new ← { }
loop do
for each pair of clauses Ci, Cj in clauses do
resolvents ← PL-RESOLVE(Ci , Cj)
if resolvents contains the empty clause then return true
new ← new ∪ resolvents
if new ⊆ clauses then return false
clauses ← clauses ∪ new
29
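A Python sketch of PL-RESOLUTION, representing each clause as a frozenset of string literals where a leading '~' marks negation (an encoding chosen here for brevity):

from itertools import combinations

def resolve(ci, cj):
    # All resolvents of two clauses (frozensets of literals such as 'P', '~P').
    resolvents = set()
    for lit in ci:
        neg = lit[1:] if lit.startswith('~') else '~' + lit
        if neg in cj:
            resolvents.add(frozenset((ci - {lit}) | (cj - {neg})))
    return resolvents

def pl_resolution(clauses):
    # clauses: the CNF of KB ∧ ¬α; returns True iff KB ⊨ α (refutation).
    clauses = set(clauses)
    new = set()
    while True:
        for ci, cj in combinations(clauses, 2):
            resolvents = resolve(ci, cj)
            if frozenset() in resolvents:
                return True               # derived the empty clause
            new |= resolvents
        if new <= clauses:
            return False                  # no new clauses: not entailed
        clauses |= new

# KB = {P ∨ Q, ¬Q}, query α = P: add ¬P and refute.
print(pl_resolution({frozenset({'P', 'Q'}), frozenset({'~Q'}),
                     frozenset({'~P'})}))  # True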
Horn clauses and Definite clauses
● Definite clause: a disjunction of literals of which exactly one is
positive.
● Horn clause: a disjunction of literals of which at most one is
positive.
● Goal clause: clauses with no positive literals
30
KB of definite clauses
● KB containing only definite clauses are interesting.
● Every definite clause can be written as an implication.
● Inference can be done with forward-chaining and backward-
chaining algorithms.
● Deciding entailment can be done in linear time.
31
Forward chaining
● Key idea: Fire any rule whose premises are satisfied in the KB,
add its conclusion to the KB, until the query is found.
32
Forward chaining
function PL-FC-ENTAILS?(KB, q) returns true or false
inputs: KB, the knowledge base, a set of propositional definite clauses
q, the query, a proposition symbol
count ← a table, where count[c] is the number of symbols in c’s premise
inferred ← a table, where inferred[s] is initially false for all symbols
agenda ← a queue of symbols, initially symbols known to be true in KB
while agenda is not empty do
p ← POP(agenda)
if p = q then return true
if inferred[p] = false then
inferred[p] ← true
for each clause c in KB where p is in c.PREMISE do
decrement count[c]
if count[c] = 0 then add c.CONCLUSION to agenda
return false
33
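A Python sketch of PL-FC-ENTAILS?, with each definite clause encoded as a (premise set, conclusion) pair and known facts listed separately; applied to the alarm KB it confirms the earlier answer.

def pl_fc_entails(rules, facts, q):
    # rules: list of (premises, conclusion) pairs; facts: symbols true in the KB.
    count = {i: len(premises) for i, (premises, _) in enumerate(rules)}
    inferred = set()
    agenda = list(facts)
    while agenda:
        p = agenda.pop()
        if p == q:
            return True
        if p not in inferred:
            inferred.add(p)
            for i, (premises, conclusion) in enumerate(rules):
                if p in premises:
                    count[i] -= 1
                    if count[i] == 0:    # all premises satisfied: fire the rule
                        agenda.append(conclusion)
    return False

# The alarm KB again: H ∧ S ⇒ F, A ⇒ S, F ⇒ R, with facts A and H.
rules = [({'H', 'S'}, 'F'), ({'A'}, 'S'), ({'F'}, 'R')]
print(pl_fc_entails(rules, {'A', 'H'}, 'R'))  # True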
Forward chaining
34
Forward chaining
35
Backward chaining
● Key idea: Work backwards from the query 𝒒
● Avoid loops: check whether a new subgoal is already on the goal stack
● Avoid repeated work: check whether a new subgoal has already been
proved true, or has already failed
36
Backward chaining
37
Backward chaining
38
Backward chaining
39
Backward chaining
40
Backward chaining
41
Forward vs. Backward chaining
● Forward chaining: data-driven, automatic, unconscious
processing
● Backward chaining: goal-driven, good for problem-solving
42
Efficient propositional inference
● The SAT problem (checking satisfiability)
○ Testing entailment, 𝛼 ⊨ 𝛽? = testing unsatisfiability of 𝛼 ∧
¬𝛽
● Two families of efficient algorithms for general propositional
inference based on model checking
○ Complete backtracking search algorithms
DPLL algorithm (Davis, Putnam, Logemann, Loveland)
○ Incomplete local search algorithms (hill-climbing)
WalkSAT algorithm
43
The DPLL algorithm
● Determine whether an input propositional logic sentence (in
CNF) is satisfiable
● Improvements over truth table enumeration
1. Early termination
2. Pure symbol heuristic
3. Unit clause heuristic
44
The DPLL algorithm
● Early termination: A clause is true if any literal is true and a
sentence is false if any clause is false.
● Pure symbol heuristic: a pure symbol appears with the same
"sign" in all clauses, so it can be assigned the value that makes all of its literals true.
● Unit clause heuristic: a unit clause has only one literal left unassigned
(the others are already false), so that literal must be assigned true
45
The DPLL algorithm
function DPLL-SATISFIABLE?(s)
returns true or false
inputs: s, a sentence in propositional logic
clauses ← the set of clauses in the CNF representation of s
symbols ← a list of the proposition symbols in s
return DPLL(clauses, symbols,{ })
46
The DPLL algorithm
function DPLL(clauses, symbols, model) returns true or false
if every clause in clauses is true in model then return true
if some clause in clauses is false in model then return false
P, value ← FIND-PURE-SYMBOL(symbols, clauses, model)
if P is non-null then return DPLL(clauses, symbols – P, model ∪ {P=value})
P, value ← FIND-UNIT-CLAUSE(clauses, model)
if P is non-null then return DPLL(clauses, symbols – P, model ∪ {P=value})
P ← FIRST(symbols)
rest ← REST(symbols)
return DPLL(clauses, rest, model ∪ {P=true})
or DPLL(clauses, rest, model ∪ {P=false})
47
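A minimal Python sketch of DPLL with early termination and the unit-clause heuristic; the pure-symbol heuristic is omitted for brevity. Clauses are lists of (symbol, sign) pairs, e.g., (P ∨ ¬Q) is [('P', True), ('Q', False)] (an encoding chosen here).

def dpll(clauses, symbols, model):
    def status(clause):
        # True / False / None (undetermined) under the partial model
        unknown = False
        for sym, sign in clause:
            if sym not in model:
                unknown = True
            elif model[sym] == sign:
                return True
        return None if unknown else False

    vals = [status(c) for c in clauses]
    if all(v is True for v in vals):
        return True                     # early termination: every clause true
    if any(v is False for v in vals):
        return False                    # early termination: some clause false
    for c, v in zip(clauses, vals):
        if v is None:
            free = [(s, sign) for s, sign in c if s not in model]
            if len(free) == 1:          # unit clause: the assignment is forced
                sym, sign = free[0]
                return dpll(clauses, [s for s in symbols if s != sym],
                            {**model, sym: sign})
    p, rest = symbols[0], symbols[1:]   # otherwise branch on the next symbol
    return (dpll(clauses, rest, {**model, p: True}) or
            dpll(clauses, rest, {**model, p: False}))

def dpll_satisfiable(clauses, symbols):
    return dpll(clauses, list(symbols), {})

# (P ∨ Q) ∧ (¬P ∨ Q) ∧ (¬Q ∨ R) is satisfiable, e.g., Q = R = true.
print(dpll_satisfiable([[('P', True), ('Q', True)],
                        [('P', False), ('Q', True)],
                        [('Q', False), ('R', True)]], ['P', 'Q', 'R']))  # True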
The WalkSAT algorithm
● Incomplete, local search algorithm
● Evaluation function: The min-conflict heuristic of minimizing the
number of unsatisfied clauses
● Balance between greediness and randomness
48
The WalkSAT algorithm
● When the algorithm returns a model → The input sentence is
indeed satisfiable
● When it returns false → The sentence is unsatisfiable OR we
need to give it more time
● WalkSAT cannot always detect unsatisfiability
● It is most useful when a solution is expected to exist
49
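A WalkSAT sketch in the same clause encoding as the DPLL example; p is the probability of a random-walk step and max_flips bounds the search (both standard parameters, with default values assumed here).

import random

def walksat(clauses, p=0.5, max_flips=10_000):
    symbols = {s for c in clauses for s, _ in c}
    model = {s: random.choice([True, False]) for s in symbols}

    def num_unsat(m):
        return sum(1 for c in clauses
                   if not any(m[s] == sign for s, sign in c))

    for _ in range(max_flips):
        unsat = [c for c in clauses
                 if not any(model[s] == sign for s, sign in c)]
        if not unsat:
            return model                          # a satisfying model
        clause = random.choice(unsat)             # pick a violated clause
        if random.random() < p:
            sym = random.choice([s for s, _ in clause])  # random walk step
        else:
            def flipped_cost(s):
                model[s] = not model[s]           # tentatively flip
                cost = num_unsat(model)
                model[s] = not model[s]           # flip back
                return cost
            sym = min({s for s, _ in clause}, key=flipped_cost)  # greedy step
        model[sym] = not model[sym]
    return None    # failure: unsatisfiable OR the search needs more flips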
PySAT - Glucose3
! pip install python-sat
from pysat.solvers import Glucose3

g = Glucose3()
g.add_clause([1])           # x1
g.add_clause([2])           # x2
g.add_clause([-1, 3])       # ¬x1 ∨ x3
g.add_clause([-2, -3, -4])  # ¬x2 ∨ ¬x3 ∨ ¬x4
g.add_clause([-4])          # ¬x4
print(g.solve())            # True: the formula is satisfiable
g.get_model()               # [1, 2, 3, -4], i.e., x1, x2, x3 true and x4 false
50
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
51
Introduction to
Artificial Intelligence
Lecture: First Order Logic
Outline
● First-Order Logic (FOL)
● Unification and Lifting
● Forward Chaining
● Backward Chaining
● Resolution
2
First-order Logic
● Objects are referred by nouns and noun phrases.
○ E.g., people, houses, numbers, colors, Bill Gates, games,
wars, etc.
● Relations can be unary relations (properties) or 𝒏-ary relations,
represented by verbs and verb phrases
○ Properties: red, round, prime, etc.
○ 𝑛-ary relations: brother of, bigger than, part of, comes
between, etc.
● Functions are relations in which there is only one “value” for a
given “input.”
○ E.g., father of, best friend, one more than, etc.
3
First-order Logic
● “One plus two equals three.”
○ Object: one, two, three, one plus two
○ Relation: equal
○ Function: plus
● “The intelligent AlphaGo beat the world champion in 2016.”
○ Object: AlphaGo, world champion, 2016
○ Property: intelligent
○ Relation: beat
4
Types of Logic
Language            | Ontological Commitment             | Epistemological Commitment
                    | (What exists in the world)         | (What an agent believes about facts)
Propositional logic | Facts                              | True/false/unknown
First-order logic   | Facts, objects, relations          | True/false/unknown
Temporal logic      | Facts, objects, relations, time    | True/false/unknown
Probability logic   | Facts                              | Degree of belief ∈ [0,1]
Fuzzy logic         | Facts with degree of truth ∈ [0,1] | Known interval value
5
Models for a logic language
● Formal structures that constitute the possible worlds under
consideration
● Each model links the vocabulary of sentences to elements of
the possible world → determine the truth of any sentence
● Models for propositional logic link proposition symbols to
predefined truth values.
● The domain of a model is the set of objects (or domain
elements) it contains.
● Nonempty: every possible world must contain at least one
object
6
Models for a logic language
7
Symbols
● Constant symbols represent objects.
○ E.g., Richard, John, etc.
● Predicate symbols stand for relations.
○ E.g., Brother , OnHead, Person, King, and Crown, etc.
● Function symbols stand for functions.
○ E.g., LeftLeg
● Each predicate or function symbol comes with an arity that
fixes the number of arguments.
○ E.g., Brother(x,y) → binary, LeftLeg(x) → unary, etc.
● These symbols begin with uppercase letters by convention.
8
Intended interpretation
● Interpretation specifies exactly which objects, relations and
functions are referred to by the symbol.
● Each model includes an (intended) interpretation.
9
FOL syntax
10
FOL syntax
11
Terms
● A term is a logical expression that refers to an object
○ Constant symbols: John
○ Function symbols: LeftLeg(John)
● A complex term is formed by a function symbol followed by a
parenthesized list of terms as arguments.
● Term = function(term1,...,termn) or constant or variable
12
Atomic sentence
● An atomic sentence states a fact by using a predicate symbol
followed by a parenthesized list of terms.
● Atomic sentence = predicate(term1,...,termn)
● For example,
○ Brother(Richard, John)
○ Married(Father(Richard), Mother(John))
● It is true if the relation referred to by the predicate symbol holds
among the objects referred to by the arguments
13
Complex sentences
● Complex sentences are made from atomic sentences using
connectives.
● For example,
○ ¬Brother(LeftLeg(Richard), John)
○ Brother(Richard, John) ∧ Brother(John, Richard)
○ King(Richard) ∨ King(John)
○ ¬King(Richard) ⇒ King(John)
● Syntax and semantics are the same as in propositional logic.
14
Quantifiers: Universal quantification
● ∀<variables> <sentence>
○ E.g., “Students of FIT are smart.”
○ ∀𝑥 𝑆𝑡𝑢𝑑𝑒𝑛𝑡(𝑥, 𝐹𝐼𝑇) ⇒ 𝑆𝑚𝑎𝑟𝑡(𝑥)
● ∀x P is true in a model m iff P is true with x being each possible
object in the model.
● It is equivalent to the conjunction of instantiations of 𝑃.
(Student(Lan, FIT) ⇒ Smart(Lan)) ∧
(Student(Tuan, FIT) ⇒ Smart(Tuan)) ∧
(Student(Long, FIT) ⇒ Smart(Long)) ∧ …
15
Variables
● A variable is a term all by itself and can serve as the
argument of a function
○ E.g., in predicates 𝐾𝑖𝑛𝑔(𝑥) or in function 𝐿𝑒𝑓𝑡𝐿𝑒𝑔(𝑥).
● Usually represented by lowercase letters
● A term with no variables is called a ground term.
16
A common mistake to avoid
● Typically, ⇒ is the main connective with ∀
○ The rule draws its conclusion only for those objects for which
the premise is true
○ It says nothing at all about individuals for whom the premise
is false.
● Using ∧ instead makes the statement too strong
○ ∀x Student(x, FIT) ∧ Smart(x)
○ It means “everyone is a student of FIT and everyone is
smart.”
17
Quantifiers: Existential quantification
● ∃<variables> <sentence>
○ E.g., “Some students of FIT are smart.”
○ ∃𝑥 𝑆𝑡𝑢𝑑𝑒𝑛𝑡(𝑥,𝐹𝐼𝑇) ∧ 𝑆𝑚𝑎𝑟𝑡(𝑥)
● ∃x P is true in a model m iff P is true with x being some
possible object in the model.
● It is equivalent to the disjunction of instantiations of 𝑃.
(Student(Lan, FIT) ∧ Smart(Lan)) ∨
(Student(Tuan, FIT) ∧ Smart(Tuan)) ∨
(Student(Long, FIT) ∧ Smart(Long)) ∨ ...
18
Common mistake to avoid
● Typically, ∧ is the main connective with ∃
● Using ⇒ instead makes the statement too weak
○ ∃x Student(x, FIT) ⇒ Smart(x)
○ It is true whenever there is anyone who is not a student of FIT,
since a false premise makes the implication true.
19
Nested quantifiers
● Multiple quantifiers enable more complex sentences.
● Simplest cases: Quantifiers are of the same type
○ ∀𝑥∀𝑦 𝐵𝑟𝑜𝑡h𝑒𝑟(𝑥, 𝑦) ⇒ 𝑆𝑖𝑏𝑙𝑖𝑛𝑔(𝑥, 𝑦)
○ ∀𝑥∀𝑦 𝑆𝑖𝑏𝑙𝑖𝑛𝑔(𝑥, 𝑦) ⇔ 𝑆𝑖𝑏𝑙𝑖𝑛𝑔(𝑦, 𝑥)
● Mixtures
○ ∀𝑥∃𝑦 𝐿𝑜𝑣𝑒𝑠(𝑥, 𝑦) → “Everybody loves somebody.”
○ ∃𝑥∀𝑦 𝐿𝑜𝑣𝑒𝑠(𝑦, 𝑥) → “There is someone loved by everyone.”
● The order of quantification is therefore very important.
20
Nested quantifiers
● Rule: The variable belongs to the innermost quantifier that
mentions it.
○ ∀x (Crown(x) ∨ (∃x Brother(Richard, x)))
● Workaround: use different variable names with nested
quantifiers, e.g., ∃z Brother(Richard, z)
21
Quantifier duality
● De Morgan’s rules for quantifiers:
¬∀x P ≡ ∃x ¬P    ¬∃x P ≡ ∀x ¬P
∀x P ≡ ¬∃x ¬P    ∃x P ≡ ¬∀x ¬P
● For example, ∀x Likes(x, IceCream) ≡ ¬∃x ¬Likes(x, IceCream)
22
Equality symbol =
● 𝑡𝑒𝑟𝑚1 = 𝑡𝑒𝑟𝑚2 is true under a given interpretation iff 𝑡𝑒𝑟𝑚1 and
𝑡𝑒𝑟𝑚2 refer to the same object
● The negation insists that two terms are not the same.
∃x, y Brother(x, Richard) ∧ Brother(y, Richard) ∧ ¬(x = y)
23
Practise
∀m, c Mother(c) = m ⇔ Female(m) ∧ Parent(m, c)
∀w, h Husband(h, w) ⇔ Male(h) ∧ Spouse(h, w)
∀x Male(x) ⇔ ¬Female(x)
∀p, c Parent(p, c) ⇔ Child(c, p)
∀g, c Grandparent(g, c) ⇔ ∃p Parent(g, p) ∧ Parent(p, c)
∀x, y Sibling(x, y) ⇔ ¬(x = y) ∧ ∃p Parent(p, x) ∧ Parent(p, y)
24
Practise
● Diagnostic rule → infer cause from effect
○ ∀𝑠 𝐵𝑟𝑒𝑒𝑧𝑦(𝑠) ⇔ ∃𝑟 𝐴𝑑𝑗𝑎𝑐𝑒𝑛𝑡(𝑟, 𝑠) ∧ 𝑃𝑖𝑡(𝑟)
● Causal rule → infer effect from cause
○ ∀𝑟 𝑃𝑖𝑡(𝑟) ⇔ [∀𝑠 𝐴𝑑𝑗𝑎𝑐𝑒𝑛𝑡(𝑟, 𝑠) ⇒ 𝐵𝑟𝑒𝑒𝑧𝑦(𝑠)]
25
Universal Instantiation (UI)
● It is possible to infer any sentence obtained by substituting a
ground term for the variable.
● Let 𝑆𝑈𝐵𝑆𝑇(𝜃, 𝛼) be the result of applying the substitution 𝜃 to
the sentence 𝛼.
● Then the rule of Universal Instantiation is written: from ∀v α, infer
SUBST({v/g}, α) for any variable v and ground term g
26
Universal Instantiation (UI)
∀x King(x) ∧ Greedy(x) ⇒ Evil(x)
King(John) ∧ Greedy(John) ⇒ Evil(John)
King(Richard) ∧ Greedy(Richard) ⇒ Evil(Richard)
King(Father(John)) ∧ Greedy(Father(John)) ⇒ Evil(Father(John))
substitutions: {x/John}, {x/Richard}, {x/Father(John)}
27
Existential Instantiation (EI)
● It is possible to replace the variable by a single new constant
symbol.
● The rule of Existential Instantiation is written: from ∃v α, infer
SUBST({v/k}, α)
for any sentence α, variable v, and constant symbol k that does not
appear elsewhere in KB.
○ E.g., from ∃x Crown(x) ∧ OnHead(x, John), infer Crown(C) ∧ OnHead(C, John)
○ C does not appear elsewhere in KB → a Skolem constant
28
Universal / Existential Instantiation
● The UI rule can be applied many times to produce different
consequences.
● The EI rule can be applied once, and then the existentially
quantified sentence is discarded.
○ The new 𝐾𝐵 is not logically equivalent to the old but shown
to be inferentially equivalent.
29
Generalized Modus Ponens (GMP)
● For atomic sentences pi, pi′ and q, where there exists θ such
that SUBST(θ, pi′) = SUBST(θ, pi) for all i:
from p1′, p2′, …, pn′ and (p1 ∧ p2 ∧ … ∧ pn ⇒ q), infer SUBST(θ, q)
● For example, p1′ is King(John), p1 is King(x)
p2′ is Greedy(y), p2 is Greedy(x)
θ is {x/John, y/John}, q is Evil(x)
SUBST(θ, q) is Evil(John)
● All variables assumed universally quantified
● A lifted version of Modus Ponens → a sound inference rule
30
Unification
● Find substitutions that make different logical expressions look
identical
𝑼𝑵𝑰𝑭𝒀(𝒑, 𝒒) = 𝜽 where 𝑆𝑈𝐵𝑆𝑇(𝜃, 𝑝) = 𝑆𝑈𝐵𝑆𝑇(𝜃, 𝑞)
● For example,
p              | q                   | θ
Knows(John, x) | Knows(John, Jane)   | {x/Jane}
Knows(John, x) | Knows(y, Steve)     | {x/Steve, y/John}
Knows(John, x) | Knows(y, Mother(y)) | {x/Mother(John), y/John}
Knows(John, x) | Knows(x, Steve)     | fail
● Standardizing apart renames variables to avoid clashes; the last pair
unifies after renaming the second sentence to Knows(z, Steve)
31
Most General Unifier (MGU)
● UNIFY(Knows(John, x), Knows(y, z)) = θ
1. θ = {y/John, x/z}
2. θ = {y/John, x/John, z/John}
● The first unifier is more general than the second
● There is a single Most General Unifier (MGU) that is unique up
to renaming of variables.
MGU = {y/John, x/z}
32
Unification algorithm
function UNIFY(x , y, θ) returns a substitution to make x and y identical
inputs: x , a variable, constant, list, or compound expression
y, a variable, constant, list, or compound expression
θ, the substitution built up so far (optional, defaults to empty)
if θ = failure then return failure
else if x = y then return θ
else if VARIABLE?(x) then return UNIFY-VAR(x , y, θ)
else if VARIABLE?(y) then return UNIFY-VAR(y, x , θ)
else if COMPOUND?(x) and COMPOUND?(y) then
return UNIFY(x.ARGS, y.ARGS, UNIFY(x.OP, y.OP, θ))
else if LIST?(x) and LIST?(y) then
return UNIFY(x.REST, y.REST, UNIFY(x.FIRST, y.FIRST, θ))
else return failure
33
Unification algorithm
function UNIFY-VAR(var, x , θ) returns a substitution
if {var /val} ∈ θ then
return UNIFY(val, x , θ)
else if {x/val} ∈ θ then
return UNIFY(var, val, θ)
else if OCCUR-CHECK?(var, x) then
return failure
else return add {var/x} to θ
34
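A Python sketch of UNIFY and UNIFY-VAR, under the conventions assumed here that variables are lowercase strings and compound expressions are tuples like ('Knows', 'John', 'x'):

def is_variable(t):
    return isinstance(t, str) and t[0].islower()

def unify(x, y, theta):
    if theta is None:
        return None                      # failure propagates
    if x == y:
        return theta
    if is_variable(x):
        return unify_var(x, y, theta)
    if is_variable(y):
        return unify_var(y, x, theta)
    if isinstance(x, tuple) and isinstance(y, tuple) and len(x) == len(y):
        for xi, yi in zip(x, y):         # unify operator and arguments in turn
            theta = unify(xi, yi, theta)
        return theta
    return None

def unify_var(var, x, theta):
    if var in theta:
        return unify(theta[var], x, theta)
    if is_variable(x) and x in theta:
        return unify(var, theta[x], theta)
    if occurs(var, x, theta):
        return None                      # occur check
    return {**theta, var: x}

def occurs(var, x, theta):
    if var == x:
        return True
    if is_variable(x) and x in theta:
        return occurs(var, theta[x], theta)
    if isinstance(x, tuple):
        return any(occurs(var, xi, theta) for xi in x)
    return False

print(unify(('Knows', 'John', 'x'), ('Knows', 'y', ('Mother', 'y')), {}))
# {'y': 'John', 'x': ('Mother', 'y')}; applying θ through the chain of
# bindings gives x/Mother(John), as in the table above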
Forward Chaining
● A definite clause is a disjunction of literals of which exactly
one is positive.
● First-order literals can include variables, which are assumed to
be universally quantified.
● Not every 𝐾𝐵 can be converted into a set of definite clauses
due to the single-positive-literal restriction.
35
Forward Chaining
● “The law says that it is a crime for an American to sell weapons
to hostile nations. The country Nono, an enemy of America,
has some missiles, and all of its missiles were sold to it by
Colonel West, who is American.”
● American(x) ∧ Weapon(y) ∧ Sells(x, y, z) ∧ Hostile(z) ⇒ Criminal(x)
● ∃x Owns(Nono, x) ∧ Missile(x)
○ Owns(Nono, M1)
○ Missile(M1)
● Missile(x) ∧ Owns(Nono, x) ⇒ Sells(West, x, Nono)
● Missile(x) ⇒ Weapon(x)
● American(West)
● Enemy(x, America) ⇒ Hostile(x)
● Enemy(Nono, America)
36
Forward Chaining
37
Forward Chaining
38
Forward Chaining
● Soundness?
○ YES, every inference is just an application of GMP.
● Completeness?
○ YES for definite clause knowledge bases.
● It answers every query whose answers are entailed by any 𝐾𝐵
of definite clauses.
● Terminates for Datalog in a finite number of iterations
● Datalog = first-order definite clauses + no functions
● Entailment with definite clauses is semidecidable.
39
Backward chaining
40
Backward chaining
41
Backward chaining
● Depth-first recursive proof search
○ Space is linear in size of proof
● Incomplete due to infinite loops
○ Fix by checking current goal against every goal on stack
● Inefficient due to repeated subgoals (success and failure)
○ Fix using caching of previous results (extra space)
● Widely used for logic programming
42
CNF for First-order logic
● First-order resolution requires that sentences be in CNF
● Every sentence of first-order logic can be converted into an
inferentially equivalent CNF sentence.
● For example, the sentence
∀𝑥 𝐴𝑚𝑒𝑟𝑖𝑐𝑎𝑛(𝑥) ∧ 𝑊𝑒𝑎𝑝𝑜𝑛(𝑦) ∧ 𝑆𝑒𝑙𝑙𝑠(𝑥, 𝑦, 𝑧)
∧ 𝐻𝑜𝑠𝑡𝑖𝑙𝑒(𝑧) ⇒ 𝐶𝑟𝑖𝑚𝑖𝑛𝑎𝑙(𝑥)
becomes, in CNF,
¬𝐴𝑚𝑒𝑟𝑖𝑐𝑎𝑛(𝑥) ∨ ¬𝑊𝑒𝑎𝑝𝑜𝑛(𝑦) ∨ ¬𝑆𝑒𝑙𝑙𝑠(𝑥, 𝑦, 𝑧)
∨ ¬𝐻𝑜𝑠𝑡𝑖𝑙𝑒(𝑧) ∨ 𝐶𝑟𝑖𝑚𝑖𝑛𝑎𝑙(𝑥)
43
Conversion to CNF
● Eliminate implications
● Move ¬ inwards: ¬∀x p ≡ ∃x ¬p, ¬∃x p ≡ ∀x ¬p
● Standardize variables: each quantifier uses a different one
44
Conversion to CNF
● Skolemize: remove existential quantifiers by elimination
○ Simple case: translate ∃x P(x) into P(A), where A is a new
constant.
■ However, naively using constants inside a universal quantifier
changes the meaning: ∀x [Animal(A) ∧ ¬Loves(x, A)] ∨ Loves(B, x)
uses the same A and B for every x, an entirely different sentence.
○ Instead, use a Skolem function, e.g., F(x) and G(x); the arguments
of the Skolem function are all universally quantified variables in
whose scope the existential quantifier appears.
● Drop universal quantifiers
45
Conversion to CNF
● Distribute ∨ over ∧
46
Resolution
● For example,
47
Resolution
48
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
49
Introduction to
Artificial Intelligence
Lecture: Bayesian Networks
Outline
● Representing Knowledge in an Uncertain Domain
● Exact Inference in Bayesian Networks
● Constructing Bayesian Networks
2
Full joint probability distribution
● The full joint probability distribution (JPD) can answer any
question about the domain.
● (Conditional) independence relationships among variables can
greatly reduce the number of probabilities required.
● Bayesian networks can represent essentially any full JPD, and in
many cases do so very concisely.
○ Belief network, probabilistic network, causal network,
knowledge map
3
Bayesian networks
● A Bayesian network is a directed graph in which each node is
annotated with quantitative probability information.
● Each node represents a random discrete/continuous variable.
● A set of directed links or arrows connects pairs of nodes.
● Each node 𝑋i has a probability distribution 𝑷(𝑋i | 𝑃𝑎𝑟𝑒𝑛𝑡(𝑋i ))
that quantifies the effect of the parents on the node.
● The graph is a DAG; in the example figure, X and Y are the parents
of Z, and Y is the parent of P
4
Bayesian network topology
● The network topology defines the conditional independence
relationships that hold in the domain.
○ An arrow means that 𝑋 has a direct influence on 𝑌, which
suggests that causes should be parents of effects.
○ A domain expert decides what direct influences exist.
● Then, specify a conditional probability distribution for each
variable, given its parents.
5
BN Example
● Burglary and earthquakes directly affect the probability of the
alarm’s going off
● Whether John and Mary call depends only on the alarm.
6
Conditional probability table (CPT)
● The probabilities summarize a potentially infinite set of
circumstances in which an event does (or does not) happen.
● In this way, a small agent can cope with a very large world, at
least approximately.
7
Represent the full joint distribution
● An entry in the joint distribution is the probability of a complete
variable assignment, such as P(X1 = x1 ∧ ⋯ ∧ Xn = xn).
● Thus, it is the product of the appropriate elements of the CPTs
in the Bayesian network:
P(x1, …, xn) = ∏i P(xi | parents(Xi))
8
Represent the full joint distribution
9
Represent the full joint distribution
10
Notations
● 𝑋 denotes the query variable.
● 𝑬 denotes the set of evidence variables 𝐸1,..., 𝐸m, and 𝒆 is a
particular observed event.
● 𝑌 denotes the non-evidence, non-query variables 𝑌1,... , 𝑌n
(called the hidden variables).
● Thus, the complete set of variables is 𝑿 = {𝑋} ∪ 𝐸 ∪ 𝑌.
● A typical query asks for the posterior probability 𝑷(𝑋 | 𝒆)
11
Represent the full joint distribution
12
Inference by enumeration
● A query can be answered by computing sums of products of
conditional probabilities from the Bayesian network:
P(X | e) = α P(X, e) = α Σy P(X, e, y)
○ where α stands for the constant denominator term, which is
usually simplified during calculation.
13
Inference by enumeration
● Consider the following query
● The hidden variables are 𝐸𝑎𝑟th𝑞𝑢𝑎𝑘𝑒 and 𝐴𝑙𝑎𝑟𝑚.
● Using initial letters for the variables, we have
● For simplicity, we do this for 𝐵𝑢𝑟𝑔𝑙𝑎𝑟𝑦 = 𝑡𝑟𝑢𝑒.
● Complexity: 𝑶(𝒏𝟐𝒏) for a network with 𝑛 Boolean variables
14
Inference by enumeration
15
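A Python sketch of this enumeration query, using the standard textbook CPT numbers for the burglary network; these numbers are an assumption here, since the slides show the CPTs only as figures.

def P(var, val, evidence):
    # CPTs of the burglary network (standard textbook values, assumed here).
    if var == 'B':
        p = 0.001
    elif var == 'E':
        p = 0.002
    elif var == 'A':
        p = {(True, True): 0.95, (True, False): 0.94,
             (False, True): 0.29, (False, False): 0.001}[
                 (evidence['B'], evidence['E'])]
    elif var == 'J':
        p = 0.90 if evidence['A'] else 0.05
    else:  # var == 'M'
        p = 0.70 if evidence['A'] else 0.01
    return p if val else 1 - p

order = ['B', 'E', 'A', 'J', 'M']        # parents listed before children

def enumerate_all(variables, evidence):
    if not variables:
        return 1.0
    y, rest = variables[0], variables[1:]
    if y in evidence:
        return P(y, evidence[y], evidence) * enumerate_all(rest, evidence)
    return sum(P(y, v, evidence) * enumerate_all(rest, {**evidence, y: v})
               for v in (True, False))   # sum out the hidden variable y

def query(X, evidence):
    dist = {v: enumerate_all(order, {**evidence, X: v}) for v in (True, False)}
    alpha = 1 / sum(dist.values())       # the normalization constant
    return {v: alpha * p for v, p in dist.items()}

print(query('B', {'J': True, 'M': True}))
# approximately {True: 0.284, False: 0.716}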
Construct a Bayesian network
● Scenario 1: Network structure known and all variables
observable
○ Compute only the CPT entries
● Scenario 2: Network structure known while some variables
hidden
○ Gradient descent (greedy hill-climbing) method, i.e., search
for a solution along the steepest descent of a criterion
function
● Scenario 3: Network structure unknown, all variables
observable
○ Search through the model space to reconstruct network
topology
● Scenario 4: Network structure unknown and all variables
hidden
○ No good algorithms known for this purpose
16
Construct a Bayesian network
● The Chain Rule holds for any set of random variables:
P(x1, …, xn) = ∏i P(xi | xi−1, …, x1)
● We generally assert that, for every variable Xi in the network,
P(Xi | Xi−1, …, X1) = P(Xi | Parents(Xi))   (*)
17
Construct a Bayesian network
● Each node must be conditionally independent of its other
predecessors in the node ordering, given its parents.
● Nodes: Identify the set of variables required to model the
domain and order them {X1,..., Xn}.
● Links: For 𝑖 = 1 to 𝑛 do
○ Choose, from {X1, …, Xi−1}, a minimal set of parents for Xi
such that Equation (*) is satisfied.
○ For each parent insert a link from the parent to 𝑋𝑖.
○ CPTs: Write down the conditional probability table,
P(Xi | Parents(Xi))
● Intuitively, the parents of node Xi should contain all those nodes
in {X1, …, Xi−1} that directly influence Xi.
18
Construct a Bayesian network
● The network is guaranteed to be acyclic.
○ Each node is connected only to earlier nodes.
● Bayesian networks contain no redundant probability values.
○ If there is no redundancy, then there is no chance for
inconsistency.
● It is impossible for the domain expert to create a Bayesian
network that violates the axioms of probability.
19
Example
● You want to diagnose whether there is a fire in a building
● You can receive reports (possibly noisy) about whether
everyone is leaving the building.
● If everyone is leaving, this may have been caused by a fire
alarm.
● If there is a fire alarm, it may have been caused by a fire or by
tampering.
● If there is a fire, there may be smoke.
20
Example
● Start by choosing the random Boolean variables for this domain
● 𝑇𝑎𝑚𝑝𝑒𝑟𝑖𝑛𝑔 (𝑇) : the alarm has been tampered with
● 𝐹𝑖𝑟𝑒 (𝐹): there is a fire
● 𝐴𝑙𝑎𝑟𝑚 (𝐴): there is an alarm
● 𝑆𝑚𝑜𝑘𝑒 (𝑆): there is smoke
● 𝐿𝑒𝑎𝑣𝑖𝑛𝑔 (𝐿): there are lots of people leaving the building
● Report (R): the sensor reports that everyone is leaving the
building.
21
Example
● Define a total ordering of variables
○ Choose an order that follows the causal sequence of events
○ Fire (F), Tampering (T), Alarm (A), Smoke (S), Leaving (L),
Report (R)
● Consider the chain rule over this ordering and use the given clues
to simplify it:
P(F, T, A, S, L, R) = P(F) P(T | F) P(A | F, T) P(S | F, T, A) P(L | F, T, A, S) P(R | F, T, A, S, L)
which the independence assumptions above reduce to
P(F) P(T) P(A | F, T) P(S | F) P(L | A) P(R | L)
22
Example
23
Example
24
Example
25
Example
26
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
27
Introduction to
Artificial Intelligence
Lecture: Basic Machine Learning
Outline
● Introduction to Machine Learning
● ID3 Decision Tree Learning
● Naïve Bayesian Learning
2
Machine Learning
● Machine learning involves adaptive mechanisms that enable
computers to learn from experience, learn by example and
learn by analogy.
3
Machine Learning
4
Machine Learning
5
Supervised learning
● Learn a function that maps an input to an output based on
example input-output pairs.
● Example,
○ Spam detection: Decide which emails are spam and which
are important
○ Object detection and recognition: Localize and identify
instances of semantic objects of a certain class (e.g.,
humans, buildings, or cars) in digital images and videos.
6
Supervised learning
7
Classification vs. Regression
● Classification
○ Train a model to predict a categorical dependent variable
○ Case studies: predicting disease, classifying images,
predicting customer churn, buy or won’t buy, etc.
○ Binary classification vs. Multiclass classification vs.
Multilabel classification
● Regression
○ Train a model to predict a continuous dependent variable
○ Case studies: predicting height of children, predicting sales,
forecasting stock prices, etc.
8
Unsupervised learning
● Infer a function to describe hidden structure from "unlabeled"
data.
● Example,
○ Social network analysis: cluster users of social networks by
interest (community detection)
9
Semi-supervised learning
● The model is initially trained with a small amount of labeled
data and a large amount of unlabeled data.
10
Reinforcement learning
● The agent learns from the environment by interacting with it
and receives rewards for performing actions.
11
Machine Learning
12
Machine Learning
13
ID3 Decision Tree Learning
● The decision is based on the following attributes
1. Alternate: is there an alternative restaurant nearby?
2. Bar: is there a comfortable bar area to wait in?
3. Fri/Sat: is today Friday or Saturday?
4. Hungry: are we hungry?
5. Patrons: number of people in the restaurant (None, Some, Full)
6. Price: price range ($, $$, $$$)
7. Raining: is it raining outside?
8. Reservation: have we made a reservation?
9. Type: kind of restaurant (French, Italian, Thai, Burger)
10. WaitEstimate: estimated waiting time (0-10, 10-30, 30-60, >60)
14
ID3 Decision Tree Learning
15
ID3 Decision Tree Learning
16
ID3 Decision Tree
● Divide and conquer: Split data into smaller and smaller subsets
● Splits usually on a single variable
17
ID3 Decision Tree
● The remaining examples are all positive (or all negative)
→ DONE, it is possible to answer Yes or No
● There are some positive and some negative examples
→ choose the “best” attribute to split them
● No examples left at a branch
→ return a default value
● No attributes left but both positive and negative examples
→ return the plurality classification of remaining ones.
18
ID3 Decision Tree
function DECISION-TREE-LEARNING(examples, attributes, parent_examples) returns a tree
if examples is empty
then return PLURALITY-VALUE(parent_examples)
else if all examples have the same classification
then return the classification
else if attributes is empty
then return PLURALITY-VALUE(examples)
else
A ← argmax over a ∈ attributes of IMPORTANCE(a, examples)
tree ← a new decision tree with root test A
for each value vk of A do
exs ← {e : e ∈ examples and e.A = vk}
subtree ← DECISION-TREE-LEARNING(exs, attributes − A, examples)
add a branch to tree with label (A = vk) and subtree subtree
return tree
19
A purity measure with entropy
● Entropy is a measure of the uncertainty of a random variable V
with values vk:
H(V) = −Σk P(vk) log2 P(vk)
● vk is a class in V (e.g., yes/no in binary classification)
● P(vk) is the proportion of the number of elements in class vk to
the number of elements in V.
20
A purity measure with entropy
● Entropy is maximal when all possibilities are equally likely.
● Entropy is zero in a pure “yes” (or pure “no”) node.
● Decision tree aims to decrease the entropy in each node.
21
Average Entropy
22
Information Gain
23
Information Gain
● IG(Alternate, S) = 0
● IG(Bar, S) = 0
● IG(Fri/Sat, S) = 0.021
● IG(Hungry, S) = 0.196
● IG(Raining, S) = 0
● IG(Reservation, S) = 0.021
● IG(Patrons, S) = 0.541
● IG(Price, S) = 0.23
● IG(Type, S) = 0
● IG(WaitEstimate, S) = 0.208
24
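A small Python sketch of entropy and information gain, assuming examples are dicts mapping attribute names to values; the numbers above can be reproduced this way from the restaurant dataset.

from math import log2

def entropy(labels):
    # H(V) = -sum_k P(v_k) * log2 P(v_k)
    n = len(labels)
    return -sum((labels.count(v) / n) * log2(labels.count(v) / n)
                for v in set(labels))

def information_gain(examples, attr, label):
    # IG(A, S) = H(S) - sum_v (|S_v| / |S|) * H(S_v)
    n = len(examples)
    remainder = sum(
        (len(subset) / n) * entropy([e[label] for e in subset])
        for subset in ([e for e in examples if e[attr] == v]
                       for v in {e[attr] for e in examples}))
    return entropy([e[label] for e in examples]) - remainder

print(entropy(['yes'] * 6 + ['no'] * 6))  # 1.0: maximal for a 50/50 split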
Decision Tree Learning
● The largest information gain (0.541), i.e., the smallest average
remaining entropy (0.459), is achieved by splitting on Patrons.
25
Homework
26
Homework
from sklearn.tree import DecisionTreeClassifier, plot_tree
import pandas as pd

# Toy training set: features W, U, S and class label L
df = pd.DataFrame(
    [{'W': 1, 'U': 0, 'S': 0, 'L': 0},
     {'W': 1, 'U': 1, 'S': 2, 'L': 0},
     {'W': 0, 'U': 1, 'S': 1, 'L': 0},
     {'W': 0, 'U': 0, 'S': 1, 'L': 1},
     {'W': 1, 'U': 0, 'S': 2, 'L': 1},
     {'W': 0, 'U': 0, 'S': 2, 'L': 1},
    ]
)
clf = DecisionTreeClassifier()         # uses Gini impurity by default
clf.fit(df[['W', 'U', 'S']], df['L'])
plot_tree(clf)                         # visualize the learned tree
27
Homework
28
Bayesian classification
● A statistical classifier performs probabilistic prediction, i.e.,
predicts class membership probabilities
○ Foundation: Based on Bayes’ Theorem
29
The Bayes’ Theorem
● Total Probability Theorem: P(B) = Σi P(B | Ai) P(Ai)
● Let X be a data sample (“evidence”) with unknown class label
and H be a hypothesis that X belongs to class C.
● Bayes’ Theorem: P(H | X) = P(X | H) P(H) / P(X)
● Classification is to determine P(H | X), i.e., the probability that
the hypothesis H holds given the observed data sample X.
30
Example data set
31
The Bayes’ Theorem
● 𝑷(𝑯) (prior probability): the initial probability
○ E.g., 𝐗 will buy computer, regardless of age, income, ...
● 𝑃(𝐗) : the probability that sample data is observed
○ E.g., 𝐗 is 31..40 and has a medium income, regardless of
the buying
● 𝑃(𝐗 | 𝐻) (likelihood): the probability of observing the sample 𝐗,
given that the hypothesis holds
○ E.g., Given that 𝐗 will buy computer, the probability that 𝐗 is
31..40 and has a medium income
● P(H | X) (posterior probability): e.g., given that X is 31..40 and has a
medium income, the probability that X will buy a computer
32
The Bayes’ Theorem
● Informally,
posterior = likelihood × prior / evidence
● 𝐗 belongs to 𝐶i iff the probability 𝑃(𝐶i | 𝐗) is the highest among
all the 𝑃(𝐶𝑘 | 𝐗) for all the 𝑘 classes.
33
Classification with Bayes’ Theorem
● Let 𝐷 be a training set of tuples and associated class labels
● Each tuple is represented by an n-attribute vector X = (x1, x2, …, xn)
● Suppose there are 𝑚 classes 𝐶1, 𝐶2,..., 𝐶m
● Classification is to derive the maximum posteriori 𝑃(𝐶i | 𝐗) from
Bayes’ theorem
● P(X) is constant for all classes, so only P(X | Ci) P(Ci) needs to be
maximized.
34
Naïve Bayesian classifier
● Class-conditional independence: the attributes are assumed to have
no dependence relationships among them, given the class.
● The naïve Bayesian classification formula is written as
P(X | Ci) = ∏k=1..n P(xk | Ci)
35
Naive Bayesian for the example dataset
36
Naive Bayesian for the example dataset
37
Avoiding the zero-probability problem
38
Avoiding the zero-probability problem
● Laplacian correction (or Laplacian estimator): add 1 to each count,
P(xk | Ci) = (|xk, Ci| + 1) / (|Ci| + r)
○ where m is the number of classes (used analogously to correct the
priors), |xk, Ci| denotes the number of tuples containing both
Ak = xk and Ci, and r is the number of values of attribute Ak
● The “corrected” probability estimates are close to their
“uncorrected” counterparts.
39
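A counting-based naive Bayes sketch with the Laplacian correction on the likelihoods; examples are dicts, r[a] gives the number of possible values of attribute a, and the prior could be smoothed analogously with (|Ci| + 1) / (|D| + m).

from collections import Counter, defaultdict

def train_naive_bayes(examples, attrs, label, r):
    classes = Counter(e[label] for e in examples)      # class counts |Ci|
    counts = defaultdict(Counter)                      # counts[(a, c)][value]
    for e in examples:
        for a in attrs:
            counts[(a, e[label])][e[a]] += 1

    def predict(x):
        def score(c):
            p = classes[c] / len(examples)             # prior P(Ci)
            for a in attrs:
                # Laplacian correction: (count + 1) / (|Ci| + r)
                p *= (counts[(a, c)][x[a]] + 1) / (classes[c] + r[a])
            return p
        return max(classes, key=score)                 # maximum posteriori class
    return predict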
Naive Bayesian for the example dataset
40
Naive Bayesian for the example dataset
41
Handling missing values
● If the values of some attributes are missing, these attributes are
omitted from the product of probabilities
● As a result, the estimation is less accurate
● For example,
42
Homework
43
References
● Stuart Russell and Peter Norvig. 2009. Artificial Intelligence: A
Modern Approach (3rd ed.). Prentice Hall Press, Upper Saddle
River, NJ, USA.
● Lê Hoài Bắc, Tô Hoài Việt. 2014. Giáo trình Cơ sở Trí tuệ nhân
tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa học Tự
nhiên, ĐHQG-HCM.
● Nguyễn Ngọc Thảo, Nguyễn Hải Minh. 2020. Bài giảng Cơ sở
Trí tuệ Nhân tạo. Khoa Công nghệ Thông tin. Trường ĐH Khoa
học Tự nhiên, ĐHQG-HCM.
44