CSI106
ARTIFICIAL
INTELLIGENCE
FALL 2024
OBJECTIVES
• Define and give a brief history of artificial intelligence.
• Describe how knowledge is represented in an intelligent agent.
• Show how expert systems can be used when a human expert is not available.
• Show how an artificial agent can be used to simulate mundane tasks performed
by human beings.
• Show how expert systems and mundane systems can use different search
techniques to solve problems.
• Show how the learning process in humans can be simulated, to some extent,
using neural networks that create the electronic version of a neuron called a
perceptron.
• Show how deep learning works and describe its advantages.
2
Content
13.1 Introduction
13.2 Knowledge Representation
13.3 Expert Systems
13.4 Perception
13.5 Neural Networks
13.6 Deep Learning
3
Introduction: What is artificial intelligence?
There is no universally agreed definition of artificial
intelligence.
Let’s accept the following definition that
matches the topics covered in this section.
Artificial intelligence is the study of programmed systems that can
simulate, to some extent, human activities such as perceiving,
thinking, learning, and acting.
4
Intro: History of artificial intelligence
• Artificial intelligence as an independent field of
study is relatively new, but it has some roots in the past.
It started 2400 years ago when the
Greek philosopher Aristotle invented
the concept of logical reasoning.
• The main idea of a thinking machine
came from Alan Turing, who proposed
the Turing test. The term "artificial
intelligence" was first coined by John
McCarthy in 1956.
Figure 13.1 History of artificial intelligence 5
Intro: The Turing test
• In 1950, Alan Turing proposed the Turing Test,
which provides a definition of intelligence in a
machine. The test simply compares the intelligent
behaviour of a human being with that of a
computer:
o An interrogator asks a set of questions that are
forwarded to both a computer and a human being.
o The interrogator receives two sets of responses but
does not know which set comes from the human and
which set from the computer.
o After careful examination of the two sets, if the
interrogator cannot definitely tell which set has come
from the computer and which from the human, the
computer has passed the test.
Figure 13.2 Simple Turing Test diagram
6
Intro: Intelligent agents
• An intelligent agent is a system that perceives its
environment, learns from it, and interacts with it
intelligently. Intelligent agents can be divided into
two broad categories:
o A software agent is a set of programs designed to do
particular tasks. E.g., some intelligent systems can be
used to organize electronic mail (email).
o A physical agent (robot) is a programmable system
that can be used to perform a variety of tasks.
• Simple robots can be used in manufacturing to do
routine jobs such as assembling, welding, or painting.
• Some organizations use mobile robots that do
delivery jobs such as distributing mail or
correspondence to different rooms.
Figure 13.3 Simple reflex agent diagram
7
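The perceive–decide–act loop of a simple reflex agent (Figure 13.3) can be sketched in a few lines; the percepts, rules, and action names below are hypothetical, chosen only to illustrate the idea.

```python
# A minimal (hypothetical) simple reflex agent: it maps each percept
# directly to an action through a fixed rule table, with no memory.
RULES = {
    "mail_arrived": "sort_mail",
    "obstacle":     "turn_left",
    "clear_path":   "move_forward",
}

def reflex_agent(percept):
    """Return the action matching the current percept (default: wait)."""
    return RULES.get(percept, "wait")

# One pass through a stream of percepts from the environment.
percepts = ["clear_path", "obstacle", "mail_arrived"]
actions = [reflex_agent(p) for p in percepts]
print(actions)  # ['move_forward', 'turn_left', 'sort_mail']
```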
Intro: Programming languages
• Data scientists choose different programming languages based on ease of use,
simplicity in programming syntax, number of machine learning libraries available,
integration with other programs like cloud infrastructure or visualization software,
and computational speed and efficiency.
• Some popular programming languages used in AI are described on the following slides.
8
Intro: Programming languages (cont.)
1. Python
This language is widely used by programmers because of its pure syntax and the
logical, strictly grammatical construction of the program.
● relatively fast development speed;
● rich and diverse set of libraries and tools;
● balance of low-level and high-level programming;
● allows algorithms testing without having to implement them;
● is still being actively developed.

2. Prolog
Appreciated by AI developers for its high level of abstraction, built-in search
engine, non-determinism, etc. One of the few languages that represents the
paradigm of declarative programming. The learning curve here is quite high.
● powerful and flexible programming structure;
● data structuring on the basis of a tree;
● automatic rollback option.
9
Intro: Programming languages (cont.)
3. LISP
Developed by John McCarthy for conducting research in the AI field and to
represent algorithms using a natural-language and symbolic model.
Main features include:
• Knowledge representation & reasoning
• Simple & flexible syntax
• Support for NLP
It has been declining in popularity, but retains a loyal user base in academia
and certain specialized applications.

4. JAVA
Java is, without a doubt, a significant AI programming language. One reason for
this is the language's widespread use in mobile app development. It's a fantastic
match, given how many mobile apps make use of AI. Java also offers simplified
debugging, and its easy-to-use syntax offers graphical data presentation and
incorporates both WORA and OO patterns.
10
Intro: Programming languages (cont.)
5. C++
C++ is another language that has been around for a long time yet is still a
viable option for AI. One of the reasons for this is the language's broad
flexibility, which makes it ideal for resource-intensive applications. C++ is a
low-level language that helps the AI model in production run more smoothly. And,
while C++ isn't everyone's first pick for AI engineers, it's worth noting that
many deep and machine learning libraries are created in the language.

6. R
R may not be the ideal language for AI, but it excels at crunching massive
numbers, making it superior to Python at scale. R is a suitable language for
artificial intelligence because of its built-in functional programming,
vectorial computation, and object-oriented nature.
11
Knowledge Representation (KR): Introduction
• If an artificial agent is supposed to solve some problems related to the real world,
it needs to be able to represent knowledge somehow. Facts are represented as
data structures that can be manipulated by programs stored inside the computer.
Figure 13.4 Four common methods for representing knowledge: semantic networks,
frame representation, predicate logic, and rule-based systems
13
KR: Semantic networks
• Semantic networks were developed
in the early 1960s by Richard H.
Richens.
• A semantic network uses directed
graphs to represent knowledge.
• Semantic networks use
o vertices to represent concepts, and
o edges (denoted by arrows) to represent the
relation between two concepts.
Figure 13.5 A simple semantic network
14
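A small sketch of this idea, using hypothetical concepts and relations (not the ones in Figure 13.5): each labelled, directed edge links two concept vertices.

```python
# Each edge (source, relation, target) is one arrow in the directed graph.
edges = [
    ("canary", "is-a", "bird"),
    ("bird",   "is-a", "animal"),
    ("bird",   "has",  "wings"),
]

def related(concept, relation):
    """All targets reachable from `concept` via one `relation` edge."""
    return [t for s, r, t in edges if s == concept and r == relation]

print(related("bird", "is-a"))  # ['animal']
```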
KR: Frame representation
• Frames are closely related to semantic networks.
In semantic networks, a graph is used to represent
knowledge; in frames, data structures (records) are
used to represent the same knowledge.
• One advantage of frames over semantic networks
is that programs can handle frames more easily
than semantic networks.
• Objects: a node in a semantic network ⇢ an object
in a set of frames, so an object can define a class, a
subclass, or an instance of a class.
• Slots: edges in semantic networks ⇢ slots (fields
in the data structure). The name of the slot defines
the type of the relationship, and the value of the
slot completes the relationship.
Figure 13.6 A set of frames representing a semantic network
15
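The correspondence between frames and records can be sketched with plain dictionaries; the frame names and slots below are hypothetical, not taken from Figure 13.6.

```python
# Each frame is a record; slot names encode the relationship type and
# slot values complete it.  "isa"/"instance-of" slots link the frames
# the way edges link nodes in a semantic network.
frames = {
    "animal": {"alive": True},
    "bird":   {"isa": "animal", "locomotion": "flies"},
    "tweety": {"instance-of": "bird", "colour": "yellow"},
}

def get_slot(name, slot):
    """Look up a slot, inheriting from parent frames when absent."""
    frame = frames[name]
    if slot in frame:
        return frame[slot]
    parent = frame.get("isa") or frame.get("instance-of")
    return get_slot(parent, slot) if parent else None

print(get_slot("tweety", "locomotion"))  # 'flies' (inherited from bird)
```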
KR: Predicate logic
• The most common knowledge representation is predicate logic. Predicate logic can be
used to represent complex facts. It is a well-defined language developed via a long
history of theoretical logic.
• Propositional logic is a language made up from a set of sentences that can be used to
carry out logical reasoning about the world.
Examples
o 𝑆(𝑥) = x is a student.
o 𝐸 = Classroom is empty.
o 𝐶25 = Classroom has 25 students
And we can combine quantifiers and predicates:
o (∀x)P(x) = x passed the exam.
• Meaning: all the x passed the exam.
o (∃x)P(x) = x passed the exam.
• Meaning: some of the x passed the exam. (At least one of the x has passed the exam.)
16
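Over a finite domain, the two quantifiers can be checked mechanically: ∀ corresponds to `all()` and ∃ to `any()`. A sketch with hypothetical exam data:

```python
# Over a finite domain, the universal quantifier (∀) corresponds to
# all() and the existential quantifier (∃) to any().
passed = {"Ann": True, "Bob": True, "Carl": False}  # hypothetical data

def P(x):            # P(x) = "x passed the exam"
    return passed[x]

domain = passed.keys()
print(all(P(x) for x in domain))  # (∀x)P(x) → False (Carl failed)
print(any(P(x) for x in domain))  # (∃x)P(x) → True
```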
KR: Predicate logic (cont.)
Operators
Propositional logic uses five operators: ¬ (NOT), ∧ (AND), ∨ (OR), → (implication), and ↔ (biconditional).
Sentence
A sentence in this language is defined recursively as shown below:
1. An uppercase letter, such as A, B, S, or T, that represents a statement in a natural
language, is a sentence.
2. Any of the two constant values (true and false) is a sentence.
3. If 𝑃 is a sentence, then ¬𝑃 is a sentence.
4. If P and Q are sentences, then P ∨ Q, P ∧ Q, P → Q, and P ↔ Q are sentences.
17
KR: Predicate logic (cont.)
Quantifiers
Predicate logic uses quantifiers. Two common quantifiers are:
• ∀ reads as "for all" – universal quantifier
• ∃ reads as "there exists" – existential quantifier
Deduction
• In predicate logic, if there is no quantifier, the verification of an argument is the
same as that in propositional logic. However, the verification becomes more
complicated if there are quantifiers.
18
KR: Rule-based systems
• A rule-based system represents knowledge using a set of rules that can be used
to deduce new facts from known facts. The rules express what is true if specific
conditions are met.
• A rule-based database is a set of if… then… statements of the form
"If A then B" (A → B), in which A is called the antecedent and B is called
the consequent.
• Note that in a rule-based system, each rule is handled independently without any
connection to other rules.
19
KR: Rule-based systems (cont.)
Components
• A rule-based system is made up of 3 components: an interpreter (or inference
engine), a knowledge base, and a fact database, as shown below.
Figure 13.7 The components of a rule-based system: a fact database, a knowledge
base, and an interpreter (inference engine)
20
Expert System (ES): Introduction
• Expert systems use the knowledge representation languages discussed in the
previous section to perform tasks that normally need human expertise. They can
be used in situations in which that expertise is in short supply, expensive, or
unavailable when required.
Figure 13.8 An Expert Systems Simplified Model
22
ES: Expert Systems – Example
• For example, in medicine, an expert system can narrow down a set of symptoms
to a likely subset of causes, a task normally carried out by a doctor.
Figure 13.9 A diagram of Medical Expert Systems
23
ES: Extracting knowledge
• An expert system is built on predefined knowledge about its field of expertise.
• Extracting knowledge from an expert is normally a difficult task, for several reasons:
1. The knowledge possessed by the expert is normally heuristic: it is based on probability
rather than certainty.
2. The expert often finds it hard to express their knowledge in such a way that it can be
stored in a knowledge base as exact rules. For example, it is hard for an electrical
engineer to show how, step by step, a faulty electric motor can be diagnosed. The
knowledge is normally intuitive.
3. Knowledge acquisition can only be done via personal interview with the expert, which
can be a tiring and boring task if the interviewer is not an expert in this type of
interview.
24
ES: Extracting knowledge (cont.)
• An expert system in medicine, for example, is built on the knowledge of a doctor
specialized in the field for which the system is built: an expert system is supposed
to do the same job as the human expert.
Figure 13.10: System Architecture for Medical Knowledge Extraction
25
ES: Extracting facts
• To be able to infer new facts or perform actions,
a fact database is needed in addition to the
knowledge base for a knowledge representation
language.
• The fact database in an expert system is case-
based, in which facts collected or measured are
entered into the system to be used by the
inference engine.
Figure 13.11: Flow Chart for Developing Case-based Reasoning Expert Systems 26
ES: Architecture
• An expert system can have up to seven components: user, user interface,
inference engine, knowledge base, fact database, explanation system, and
knowledge base editor.
• The inference engine is the heart of an expert system: it communicates with the
knowledge base, fact database, and the user interface.
Figure 13.12 The architecture of an expert system 27
ES: Architecture (cont.)
• The user uses the system to benefit from the expertise offered.
• The UI allows the user to interact with the system: it accepts natural
language from the user and interprets it for the system.
• Inference engine (heart of the system) uses the knowledge
base and the fact database to infer the action to be taken.
• Knowledge base is a collection of knowledge based on
interviews with experts in the relevant field of expertise.
• Fact database in an expert system is case-based. For each case, the user enters the available or measured data
into the fact database to be used by the inference engine for that particular case.
• Explanation system (may not be included), is used to explain the rationale behind the decision made by the
inference engine.
• Knowledge base editor (may not be included), is used to update the knowledge base if new experience has
been obtained from experts in the field. 28
Knowledge Base
• If the patient has a high fever and loss of taste or smell, consider COVID-19.
• If the patient has a runny nose and sneezing, consider a cold.
• If the patient has a high fever and muscle aches, consider the flu.
Fact Database
• Fact 1: The patient has a high fever.
• Fact 2: The patient has muscle aches.
• Fact 3: The patient does not have a runny nose.
• Fact 4: The patient does not have loss of taste or smell.
• Fact 5: The patient is sneezing.
Conclusion: the symptoms are indicative of the flu.
29
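A minimal interpreter for the knowledge base and fact database above can be sketched as follows; this illustrates the idea of firing rules against facts, not a production inference engine.

```python
# Facts as a set of true propositions (a symptom absent from the set
# is simply not asserted).
facts = {"high fever", "muscle aches", "sneezing"}

# Each rule: (set of antecedent symptoms, consequent diagnosis).
rules = [
    ({"high fever", "loss of taste or smell"}, "COVID-19"),
    ({"runny nose", "sneezing"},               "a cold"),
    ({"high fever", "muscle aches"},           "the flu"),
]

def diagnose(facts):
    """Fire every rule whose antecedents are all present in the facts."""
    return [diag for cond, diag in rules if cond <= facts]

print(diagnose(facts))  # ['the flu']
```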
Perception: Introduction
• One of the goals in AI is to create a
machine that behaves like an expert – an
expert system.
• Another goal is to create a machine that
behaves like an ordinary human.
o Image processing
o Language understanding
Figure 13.13 Perception diagram
31
2. Image processing
• Image processing or computer vision is an area of AI that deals with the perception
of objects through the artificial eyes of an agent, such as a camera. An image
processor takes a two-dimensional image from the outside world and tries to create
a description of the three-dimensional objects present in the scene.
• The input presented to an image processor is one or more images from the scene,
while the output is a description of the objects in the scene. The processor uses a
database containing the characteristics of objects for comparison.
Figure 13.14 Components of an image processor 32
2. Image processing – Step 1: Edge detection
• The first stage in image processing is edge detection: finding where the edges
in the image are. Edges can define the boundaries between an object and its
background in the image.
• Edges appear where there is a sharp contrast between the surfaces belonging to
an object and the environment, assuming that there is no camouflage. Edges show
discontinuity in surface, in depth, or in illumination.
Figure 13.14 Edge detection process 33
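Looking for intensity discontinuities can be sketched without any library; the synthetic image below (a bright square on a dark background) is hypothetical.

```python
# A synthetic 6x6 image: dark background (0.0), bright square (1.0).
img = [[1.0 if 2 <= r <= 4 and 2 <= c <= 4 else 0.0 for c in range(6)]
       for r in range(6)]

def horizontal_edges(image, thresh=0.5):
    """Mark pixels where intensity jumps between neighbouring columns."""
    return [[abs(row[c + 1] - row[c]) > thresh for c in range(len(row) - 1)]
            for row in image]

# Row 2 crosses the square's left and right boundaries.
print(horizontal_edges(img)[2])  # [False, True, False, False, True]
```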
2. Image processing - Step 2: Segmentation
• Segmentation divides the image into homogeneous segments or areas. In edge
detection, the boundaries of the object and the background are found; in
segmentation, the boundaries between different areas inside the object are
found. After segmentation, the object is divided into different areas.
• Several methods have been used for segmentation, e.g., thresholding, splitting,
and merging.
34
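Thresholding, the simplest of these methods, can be sketched as follows; the pixel values are hypothetical.

```python
# Threshold segmentation: each pixel is assigned to a segment according
# to whether its grey level is above or below a threshold.
img = [
    [0.1, 0.2, 0.8, 0.9],
    [0.1, 0.3, 0.7, 0.9],
]

def threshold(image, t=0.5):
    """Label each pixel 1 (above threshold) or 0 (at or below)."""
    return [[1 if p > t else 0 for p in row] for row in image]

print(threshold(img))  # [[0, 0, 1, 1], [0, 0, 1, 1]]
```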
2. Image processing – Step 3: Finding depth
• The next step in image analysis is to find the depth of the object or objects in the
image.
o Depth finding can help the intelligent agent to gauge how far the object is from it.
Two general methods have been used for this purpose: stereo vision and motion.
• Stereo vision: use two eyes or two cameras. The picture created from two cameras
can help the intelligent agent to gauge if the object is close or far away.
• Motion: create several images when one or more objects are moving. The relative
position of a moving object with respect to other objects in the scene can give a
clue to the distance of objects
35
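Under the usual pinhole-camera assumptions, stereo vision recovers depth from disparity as Z = f·B/d, where f is the focal length, B the distance between the two cameras, and d the pixel shift of the same point between the two images; the numbers below are hypothetical.

```python
# Standard pinhole stereo relation: depth Z = f * B / d.  A point that
# shifts a lot between the two cameras (large disparity) is close.
def depth_from_disparity(f_px, baseline_m, disparity_px):
    return f_px * baseline_m / disparity_px

print(depth_from_disparity(1000, 0.5, 250.0))  # 2.0 m: near object
print(depth_from_disparity(1000, 0.5, 50.0))   # 10.0 m: far object
```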
2. Image processing – Step 4: Finding orientation
• Orientation of the object in the scene can be found using two techniques:
shading and texture.
• Shading: The amount of light reflected from a surface depends on several factors.
If the optical properties of the different surfaces of an object are the same, the
amount of reflection depends on the orientation of the surface (its position
relative to the light source).
• Texture: (a regularly repeated pattern) can also help in finding the orientation or
the curvature of a surface.
36
2. Image processing – Step 5: Object recognition
• To recognize an object, the agent needs to have a model of the object in memory
for comparison. However, creating and storing a model for each object in the view
is an impossible task.
• One solution is to assume that the objects to be recognized are compound
objects made of a set of simple geometric shapes. These primitive shapes can be
created and stored in the intelligent agent’s memory, then classes of object that
we need the agent to recognize can be created from a combination of these
objects and stored.
37
3. Language understanding
• One of the inherent capabilities of a human being is to understand – that is,
interpret – the audio signals that they perceive. A machine that can understand
natural language can be very useful in daily life.
• We can divide the task of a machine that understands natural language
into 4 consecutive steps:
o speech recognition,
o syntactic analysis,
o semantic analysis, and
o pragmatic analysis.
Figure 13.17 Steps of understanding natural language
38
3. Language understanding - Speech recognition
• Speech recognition: the first step in NLP.
• A speech signal is analysed and the sequence of words it contains is extracted.
o Input to the speech recognition subsystem is a continuous (analog) signal
o Output is a sequence of words
o The signal needs to be divided into different sounds, sometimes called phonemes.
• The sounds then need to be combined into words.
Figure 13.18 An overview of speech recognition
39
3. Language understanding - Syntactic analysis
• Syntactic analysis step is used to define how words are to be grouped in a
sentence.
• This is a difficult task in a language like English, in which the function of a word in
a sentence is not determined by its position in the sentence.
Mary rewarded John.
John was rewarded by Mary.
It is always John who is rewarded, but in the 1st sentence John is in the last position
and Mary is in the 1st position.
A machine that hears any of the above sentences needs to interpret them correctly
and come to the same conclusion no matter which sentence is heard.
40
3. Language understanding - Semantic analysis
• Semantic analysis extracts the meaning of a sentence after it has been
syntactically analysed.
• It creates a representation of the objects involved in the sentence, their relations,
and their attributes.
• Any of the knowledge representation schemes previously discussed can be used
o Example: the sentence ‘John has a dog’ can be represented using predicate logic as:
∃x (dog(x) ∧ has(John, x))
41
3. Language understanding - Pragmatic analysis
• Pragmatic analysis is needed to further clarify the purpose of the sentence & to
remove ambiguities.
• Clarify purpose
o Can you swim a mile? – asking about the hearer’s ability
o Can you pass the salt? – a polite request
• Remove ambiguities
o A word can have more than one function – "hard" can be both adjective & adverb
o A word can also have more than one meaning – "bat" can be an animal or an object
o Two words with the same pronunciation can have different spellings and meanings
42
Introduction
• Most methods of enabling an artificial intelligence agent to learn use inductive
learning or learning by example.
• A large set of problems and their solutions are given to the machine from which to
learn.
• Neural networks try to simulate the learning process of the human brain using a
network of neurons.
Figure 13.19 A simplified view of an Artificial Neural Network 44
History of Neural Networks
• The first neural network was conceived of by Warren McCulloch and Walter Pitts in
1943. They wrote a seminal paper on how neurons may work and modelled their
ideas by creating a simple neural network using electrical circuits.
• This breakthrough model paved the way for neural network research in two areas
Figure 13.20 Biological processes in the brain Figure 13.21 The application of neural networks in AI
45
Biological neurons
• The human brain has billions of processing units, called neurons. Each neuron, on
average, is connected to several thousand other neurons. A neuron is made of
three parts: soma, axon, and dendrites, as shown below.
Figure 13.22 A simplified diagram of a neuron
46
Perceptron
• A perceptron is an artificial neuron similar to a single biological neuron. It takes a
set of weighted inputs, sums the inputs, and compares the result with a threshold
value. If the result is above the threshold value, the perceptron fires; otherwise, it
does not. When a perceptron fires, the output is 1; when it does not fire, the
output is 0.
• Figure 13.23 shows a perceptron with five inputs (x1 to x5) and five weights (w1 to
w5). In this perceptron, if T is the value of the threshold, the output is 1 when
w1x1 + w2x2 + w3x3 + w4x4 + w5x5 ≥ T, and 0 otherwise.
Figure 13.23 A perceptron 47
Example Perceptron
• Assume a case study with three inputs
and one output. There are already four
examples with known inputs and
outputs, as shown in the following table:
• This set of inputs is used to train a perceptron with all equal weights (𝑤1 = 𝑤2 = 𝑤3 ).
• Threshold = 0.8
• Originally, 𝑤 = 50%
• The weights remain the same if the output produced = the actual output.
• The weights are increased by 10% if the output produced < the actual output.
• The weights are decreased by 10% if the output produced > the actual output.
48
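The firing rule and the ±10% update rule can be sketched together; because the slide's example table is not reproduced here, the four training cases below are hypothetical.

```python
# Perceptron firing rule: output 1 iff the weighted input sum reaches
# the threshold T (0.8, per the slide).
T = 0.8

def fire(weights, inputs):
    s = sum(w * x for w, x in zip(weights, inputs))
    return 1 if s >= T else 0

# Training per the slide: weights start at 0.5 (50%); +10% when the
# produced output is below the target, -10% when above, else unchanged.
def train(examples, epochs=20):
    w = [0.5, 0.5, 0.5]
    for _ in range(epochs):
        for inputs, target in examples:
            out = fire(w, inputs)
            if out < target:
                w = [wi * 1.1 for wi in w]
            elif out > target:
                w = [wi * 0.9 for wi in w]
    return w

# Hypothetical training cases (three 0/1 inputs, one 0/1 output).
examples = [([1, 0, 0], 0), ([1, 1, 0], 1), ([0, 1, 1], 1), ([0, 0, 1], 0)]
w = train(examples)
print([fire(w, x) for x, _ in examples])  # [0, 1, 1, 0]: matches targets
```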
Multilayer Networks
• Several layers of perceptrons can be combined to create multilayer neural
networks. The output from each layer becomes the input to the next layer.
• The first layer is the input layer, the middle layers are the hidden layers,
and the last layer is the output layer. The nodes in the input layer are not
neurons; they are only distributors. The hidden nodes are normally used to
impose the weights on the output from the previous layer.
Figure 13.24 Multiple layer ANN 49
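A forward pass through such a network can be sketched in a few lines; the layer sizes, weights, and thresholds below are hypothetical.

```python
def layer(inputs, weights, thresholds):
    """One layer of perceptrons: each neuron fires (1) when its
    weighted input sum reaches its threshold."""
    outs = []
    for w_row, t in zip(weights, thresholds):
        s = sum(w * x for w, x in zip(w_row, inputs))
        outs.append(1 if s >= t else 0)
    return outs

# The input layer just distributes x; hidden and output layers are
# neurons, and each layer's output feeds the next layer.
x = [1, 0]                                   # distributed, not processed
hidden = layer(x, [[0.6, 0.6], [0.9, -0.4]], [0.5, 0.5])
output = layer(hidden, [[0.7, 0.7]], [0.5])
print(hidden, output)  # [1, 1] [1]
```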
Applications
• Handwriting Recognition – The
idea of Handwriting recognition
has become very important. This is
because handheld devices like the
Palm Pilot are becoming very
popular. Hence, we can use Neural
networks to recognize handwritten
characters.
• Traveling Salesman Problem –
Neural networks can also solve the
traveling salesman problem. But
this is to a certain degree of
approximation only.
50
Applications
• Image Compression – Vast
amounts of information are received
and processed at once by neural
networks. This makes them useful in
image compression. With the
Internet explosion and more sites
using more images on their sites,
using neural networks for image
compression is worth a look.
• Stock Exchange Prediction – The
day-to-day business of the stock
market is very complicated. Many
factors weigh in whether a given
stock will go up or down on any
given day. Thus, Neural networks
can examine a lot of information in a
fast manner and sort it all out. So we
can use them to predict stock prices.
51
DEEP LEARNING
52
Introduction
• Deep learning is a branch of
machine learning that uses data,
loads and loads of data, to teach
computers how to do things only
humans were capable of before.
• Deep learning is based on the
concept of ANN, or computational
systems that mimic the way the
human brain functions.
Figure 13.25 Deep Learning vs Human Brain
53
Technology
Deep learning is a fast-growing field, and new architectures and variants appear
every few weeks. We'll discuss the three major ones:
1. Convolutional Neural Network (CNN)
CNNs exploit spatially-local
correlation by enforcing a local
connectivity pattern between
neurons of adjacent layers.
Figure 13.26 A sample Convolutional Neural Network (CNN)
54
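The local connectivity pattern is what a discrete 2D convolution implements: each output unit depends only on a small patch of the previous layer. A sketch with a hypothetical 2×2 kernel:

```python
# Slide a small kernel over the image; each output value is connected
# only to the patch of inputs under the kernel -- the local
# connectivity pattern that CNNs exploit.
def conv2d(image, kernel):
    kh, kw = len(kernel), len(kernel[0])
    out = []
    for r in range(len(image) - kh + 1):
        row = []
        for c in range(len(image[0]) - kw + 1):
            row.append(sum(image[r + i][c + j] * kernel[i][j]
                           for i in range(kh) for j in range(kw)))
        out.append(row)
    return out

img = [[1, 0, 0],
       [1, 0, 0],
       [1, 0, 0]]
edge_kernel = [[1, -1],
               [1, -1]]          # responds to vertical intensity edges
print(conv2d(img, edge_kernel))  # [[2, 0], [2, 0]]
```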
Technology (cont.)
2. Recurrent Neural Network (RNN)
RNNs are called recurrent because they perform the same task for every element of
a sequence, with the output depending on the previous computations. In other
words, RNNs have a "memory" which captures information about what has been
calculated so far.
Figure 13.27 A simplified diagram of a Recurrent Neural Network
55
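One recurrent step can be sketched as follows; the weights and input sequence are hypothetical. The same function (and the same weights) is reused at every position, and the hidden state h carries the "memory" of previous computations.

```python
import math

# One recurrent step: new hidden state from the current input x and the
# previous hidden state h, using weights shared across all time steps.
def rnn_step(x, h, w_x=0.5, w_h=0.8):
    return math.tanh(w_x * x + w_h * h)

h = 0.0
for x in [1.0, 0.0, 0.0]:        # hypothetical input sequence
    h = rnn_step(x, h)
    print(round(h, 3))           # the first input's effect decays slowly
```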
Technology (cont.)
3. Long Short-Term Memory (LSTM)
LSTMs can learn "Very Deep Learning" tasks that require memories of events that
happened thousands or even millions of discrete time steps ago. LSTM works even
when there are long delays, and it can handle signals that have a mix of low- and
high-frequency components.
Figure 13.28 A simplified diagram of a Long-Short Term Memory
56
Advantages
1. It performs feature extraction automatically; there is no need to engineer features by hand.
2. Moving towards raw features
3. Better optimization
4. A new level of noise robustness
5. Multi-task and transfer learning
6. Better Architectures
Figure 13.29 Machine Learning Vs Deep Learning
57
Challenges
1. Need a large dataset
2. Because you need a large dataset, training time is usually significant.
3. The scale of a net's weights is important for performance. When the features are
of the same type this is not a problem. However, when the features are
heterogeneous, it is.
4. Parameters are hard to interpret – although there is progress being made.
5. Hyperparameter tuning is non-trivial.
58
REAL TIME APPLICATIONS
1. Automatic Colorization of Black and White Images
2. Automatically Adding Sounds To Silent Movies
3. Automatic Machine Translation
4. Object Classification and Detection in Photographs
5. Automatic Handwriting Generation
59