Rationality maximizes expected performance, while perfection maximizes actual performance.
Our
definition of rationality does not require omniscience, then, because the rational choice depends
only on the percept sequence to date. For example, if an agent does not look both ways before
crossing a busy road, then its percept sequence will not tell it that there is a large truck approaching
at high speed. Does our definition of rationality say that it is then acceptable to cross? Far from it. First, it would not be rational to cross the road given this uninformative percept
sequence: the risk of accident from crossing without looking is too great. Second, a rational agent
should choose the “looking” action before stepping into the street, because looking helps maximize
the expected performance. Doing actions in order to modify future percepts, sometimes called
information gathering, is an important part of rationality. A second example of information
gathering is provided by the exploration that must be undertaken by a vacuum-cleaning agent in an
initially unknown environment.
Learning: Our definition requires a rational agent not only to gather information but also to learn as
much as possible from what it perceives. The agent’s initial configuration could reflect some prior
knowledge of the environment, but as the agent gains experience this may be modified and
augmented. There are extreme cases in which the environment is completely known a priori; in such
cases the agent need not perceive or learn, it simply acts correctly.
Autonomy: To the extent that an agent relies on the prior knowledge of its designer rather than on
its own percepts, we say that the agent lacks autonomy. A rational agent should be autonomous—it
should learn what it can to compensate for partial or incorrect prior knowledge. For example, a
vacuum-cleaning agent that learns to foresee where and when additional dirt will appear will do
better than one that does not.
THE NATURE OF ENVIRONMENTS:
For the simple vacuum-cleaner agent, we had to specify the performance measure, the
environment, and the agent’s actuators and sensors. We group all these under the heading
of the task environment. For the acronymically minded, we call this the PEAS (Performance,
Environment, Actuators, Sensors) description.
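To make this concrete, the PEAS description of an automated taxi driver can be written out explicitly. The sketch below records one plausible set of entries; the specific items are illustrative rather than a complete specification.

# Illustrative PEAS description for an automated taxi driver.
taxi_peas = {
    "Performance measure": ["safe", "fast", "legal", "comfortable trip", "maximize profits"],
    "Environment": ["roads", "other traffic", "pedestrians", "customers"],
    "Actuators": ["steering", "accelerator", "brake", "signal", "horn", "display"],
    "Sensors": ["cameras", "sonar", "speedometer", "GPS", "odometer", "engine sensors", "keyboard"],
}

for component, entries in taxi_peas.items():
    print(component + ":", ", ".join(entries))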
Properties of task environments:
1. Discrete / Continuous: If there are a limited number of distinct, clearly defined states of
the environment, the environment is discrete (for example, chess); otherwise it is
continuous (for example, driving).
2. Observable / Partially Observable: If it is possible to determine the complete state of
the environment at each time point from the percepts, it is observable; otherwise it is
only partially observable.
3. Static / Dynamic: If the environment does not change while an agent is acting, then it is
static; otherwise it is dynamic.
4. Single agent / Multiple agents: The environment may contain other agents, which may
be of the same kind as the agent or of a different kind.
5. Accessible vs. inaccessible: If the agent’s sensory apparatus can have access to the
complete state of the environment, then the environment is accessible to that agent.
6. Deterministic vs. Non-deterministic: If the next state of the environment is completely
determined by the current state and the actions of the agent, then the environment is
deterministic; otherwise it is non-deterministic.
7. Episodic vs. Non-episodic: In an episodic environment, each episode consists of the
agent perceiving and then acting. The quality of its action depends just on the episode
itself. Subsequent episodes do not depend on the actions in the previous episodes.
Episodic environments are much simpler because the agent does not need to think
ahead.
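The same dimensions can be used to classify particular task environments. The sketch below records a plausible classification of two familiar examples along the properties listed above; the labels are illustrative rather than definitive.

# Illustrative classification of two task environments along the dimensions above.
environment_properties = {
    "chess with a clock": {
        "observable": "fully",
        "deterministic": True,
        "episodic": False,       # sequential: earlier moves constrain later ones
        "static": "semi",        # the board is static, but the clock keeps running
        "discrete": True,
        "agents": "multi",       # two competing players
    },
    "taxi driving": {
        "observable": "partially",
        "deterministic": False,  # other drivers behave unpredictably
        "episodic": False,
        "static": False,         # the world changes while the agent deliberates
        "discrete": False,       # continuous positions, speeds, steering angles
        "agents": "multi",
    },
}

for task, properties in environment_properties.items():
    print(task, properties)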
THE STRUCTURE OF AGENTS:
The job of AI is to design an agent program that implements the agent function— the
mapping from percepts to actions. We assume this program will run on some sort of
computing device with physical sensors and actuators—we call this the architecture:
agent = architecture + program.
Agent programs: The agent programs that we design in this book all have the same skeleton:
they take the current percept as input from the sensors and return an action to the
actuators.
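A minimal sketch of this skeleton, with purely illustrative names, might look as follows: the agent program is a callable that accepts the current percept and returns an action.

class AgentProgram:
    """Shared skeleton: map the current percept to an action."""
    def __call__(self, percept):
        raise NotImplementedError("each kind of agent supplies its own decision logic")

class NoOpAgent(AgentProgram):
    """Trivial placeholder, included only to show the percept-in, action-out shape."""
    def __call__(self, percept):
        return "NoOp"

print(NoOpAgent()(("A", "Dirty")))   # -> NoOp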
We outline four basic kinds of agent programs that embody the principles underlying almost
all intelligent systems:
• Simple reflex agents;
• Model-based reflex agents;
• Goal-based agents; and
• Utility-based agents.
• Simple reflex agents: The simplest kind of agent is the simple reflex agent. These agents
select actions on the basis of the current percept, ignoring the rest of the percept history.
An agent program for this agent is shown in Figure 2.8.
Simple reflex behaviors occur even in more complex environments. Imagine yourself as the
driver of the automated taxi. If the car in front brakes and its brake lights come on, then you
should notice this and initiate braking. In other words, some processing is done on the visual
input to establish the condition we call “The car in front is braking.” Then, this triggers some
established connection in the agent program to the action “initiate braking.” We call such a
connection a condition–action rule, written as
if car-in-front-is-braking then initiate-braking.
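A simple reflex agent built around condition-action rules like this one can be sketched as follows; the percept format and the rule set are hypothetical, chosen only to illustrate the idea.

def simple_reflex_taxi_agent(percept):
    """Select an action from the current percept alone, ignoring the percept history."""
    # Hypothetical percept: a dict of features already extracted from the sensors.
    if percept.get("car_in_front_is_braking"):
        return "initiate-braking"
    if percept.get("obstacle_ahead"):
        return "swerve"
    return "keep-driving"

print(simple_reflex_taxi_agent({"car_in_front_is_braking": True}))   # -> initiate-braking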
Model-based reflex agents:
The most effective way to handle partial observability is for the agent to keep track
of the part of the world it can’t see now. That is, the agent should maintain some
sort of internal state that depends on the percept history and thereby reflects at
least some of the unobserved aspects of the current state.
Figure 2.11 gives the structure of the model-based reflex agent with internal
state, showing how the current percept is combined with the old internal state to
generate the updated description of the current state, based on the agent’s model
of how the world works. The agent program is shown in Figure 2.12. The interesting
part is the function UPDATE-STATE, which is responsible for creating the new
internal state description.
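A rough sketch of this structure follows. The names update_state, model, and rules mirror the description above, but the concrete representations (dicts and predicate functions) are assumptions made only for the example.

class ModelBasedReflexAgent:
    """Keep internal state that reflects unobserved aspects of the world,
    update it with each new percept, then apply condition-action rules to it."""

    def __init__(self, model, rules, initial_state):
        self.model = model          # how the world evolves and how actions affect it
        self.rules = rules          # list of (condition, action) pairs over the state
        self.state = dict(initial_state)
        self.last_action = None

    def update_state(self, state, last_action, percept):
        """Combine the old state, the last action, and the new percept, using the
        model, to produce an updated description of the current state."""
        new_state = dict(state)
        new_state.update(self.model(state, last_action))  # predicted, unobserved changes
        new_state.update(percept)                         # observed changes take precedence
        return new_state

    def __call__(self, percept):
        self.state = self.update_state(self.state, self.last_action, percept)
        for condition, action in self.rules:
            if condition(self.state):
                self.last_action = action
                return action
        self.last_action = "NoOp"
        return "NoOp"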
Goal-based agents:
Figure 2.13 shows the goal-based agent’s structure. Sometimes goal-based action
selection is straightforward—for example, when goal satisfaction results
immediately from a single action. Sometimes it will be more tricky—for example,
when the agent has to consider long sequences of twists and turns in order to find a
way to achieve the goal. A goal-based agent, in principle, could reason that if the
car in front has its brake lights on, it will slow down. Given the way the world usually
evolves, the only action that will achieve the goal of not hitting other cars is to
brake.
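That kind of reasoning can be sketched as follows, assuming a hypothetical one-step model result(state, action) that predicts the outcome of each action; real goal-based agents may instead have to search over long action sequences.

def goal_based_agent(state, goal_test, actions, result):
    """Pick an action whose predicted outcome satisfies the goal."""
    for action in actions(state):
        if goal_test(result(state, action)):
            return action
    return "NoOp"

# Example use for the braking scenario (all names and values are illustrative):
def result(state, action):
    # Predicted next state: braking avoids the collision.
    return {"collision": state["car_in_front_braking"] and action != "brake"}

action = goal_based_agent(
    state={"car_in_front_braking": True},
    goal_test=lambda s: not s["collision"],
    actions=lambda s: ["brake", "keep-driving"],
    result=result,
)
print(action)   # -> brake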
Utility-based agents:
An agent’s utility function is essentially an internalization of the performance
measure. If the internal utility function and the external performance measure are in
agreement, then an agent that chooses actions to maximize its utility will be rational
according to the external performance measure.
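A minimal sketch of this idea, under assumed names: outcomes(state, action) is a hypothetical model returning (probability, next_state) pairs, utility scores states, and the agent picks the action with the highest expected utility.

def utility_based_agent(state, actions, outcomes, utility):
    """Choose the action that maximizes expected utility over predicted outcomes."""
    def expected_utility(action):
        return sum(p * utility(s) for p, s in outcomes(state, action))
    return max(actions(state), key=expected_utility)

# Example use: braking is slightly costly but avoiding a collision dominates.
def outcomes(state, action):
    if action == "brake":
        return [(1.0, {"collision": False, "delay": 1})]
    return [(0.7, {"collision": False, "delay": 0}), (0.3, {"collision": True, "delay": 0})]

def utility(s):
    return (0 if s["collision"] else 100) - s["delay"]

print(utility_based_agent({}, lambda s: ["brake", "keep-driving"], outcomes, utility))
# -> brake (expected utility 99 versus 70)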
Learning agents:
A learning agent can be divided into four conceptual components, as shown in
figure 2.15. The most important distinction is between the learning element, which is
responsible for making improvements, and the performance element, which is
responsible for selecting external actions. The performance element is what we have
previously considered to be the entire agent: it takes in percepts and decides on actions.
The learning element uses feedback from the critic on how the agent is doing and
determines how the performance element should be modified to do better in the future.
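This decomposition can be sketched as cooperating components. The interfaces below are illustrative only, and the fourth component, the problem generator (which suggests exploratory actions), is omitted for brevity.

class LearningAgent:
    """Performance element selects external actions; the critic judges behaviour
    against the performance standard; the learning element uses that feedback to
    modify the performance element so it does better in the future."""

    def __init__(self, performance_element, critic, learning_element):
        self.performance_element = performance_element   # the "whole agent" of earlier sections
        self.critic = critic
        self.learning_element = learning_element

    def __call__(self, percept):
        action = self.performance_element(percept)
        feedback = self.critic(percept, action)                     # how well are we doing?
        self.learning_element(self.performance_element, feedback)   # improve future behaviour
        return action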