Internet Engineering
Jacek Mazurkiewicz, PhD
Softcomputing
Part 1: Introduction, Elementary ANNs
Formal Introduction
• contact hours, room No. 225 building C-3:
Tuesday: 11:00 - 13:00,
Monday: 7:30 - 9:00,
• slides: www.zsk.ict.pwr.wroc.pl
• „Professor Wiktor Zin”
• test: 22.01.2018 during lecture
• softcomputing: lecture + project
– project mark – 20% of the final mark
– bonus question!
Program
• Idea of intelligent processing
• Fuzzy sets and approximate reasoning
• Expert systems - knowledge base organization
• Expert systems - reasoning rules creation
• Expert systems: typical organization and applications
• Artificial neural networks: learning and retrieving algorithms
• Multilayer perceptron
• Kohonen neural network
• Hopfield neural network
• Hamming neural network
• Artificial neural networks: applications
• Genetic algorithms: description and classification
• Genetic algorithms: basic mechanisms and solutions
SUBJECT OBJECTIVES
C1. Knowledge of artificial neural networks in pattern recognition, digital signal
and data processing: network topologies, influence of parameters on network behavior.
C2. Knowledge of genetic algorithms used for data pre- and postprocessing.
C3. Knowledge of expert systems – reasoning rules and knowledge base creation for different tasks.
C4. Skills in using dedicated environments for the project phase: modeling and simulation
of softcomputing systems for different scientific problems.
SUBJECT EDUCATIONAL EFFECTS
relating to knowledge:
PEK_W01 – knows the rules and the idea of intelligent processing.
PEK_W02 – defines the fuzzy sets and understands the idea of approximate reasoning.
PEK_W03 – defines the knowledge base and reasoning rules, knows the construction of expert systems.
PEK_W04 – knows the architecture of typical artificial neural network structures, their learning and retrieving algorithms,
and their applications.
PEK_W05 – knows the description, classification and examples of applications of genetic algorithms.
relating to skills:
PEK_U01 – can use the environments for the project phase, modeling and simulation of artificial neural networks
as well as genetic algorithms in different pattern and digital-signal recognition tasks.
PEK_U02 – can use the environments for the project phase, modeling and implementation of expert systems
in dedicated fields of knowledge.
PEK_U03 – can use the environments for the project phase, modeling and implementation of fuzzy sets and fuzzy reasoning
in dedicated fields of knowledge.
Literature
• B. Bouchon-Meunier, Fuzzy Logic and Soft Computing
• O. Castillo, A. Bonarini, Soft Computing Applications
• M. Caudill, Ch. Butler, Understanding Neural Networks
• E. Damiani, Soft Computing in Software Engineering
• R. Hecht-Nielsen, Neurocomputing
• S. Y. Kung, Digital Neural Networks
• D. K. Pratihar, Soft Computing
• S. N. Sivanandam, S. N. Deepa, Principles of Soft Computing
• A. K. Srivastava, Soft Computing
• D. A. Waterman, A Guide to Expert Systems
• D. Zhang, Parallel VLSI Neural System Design
Why Neural Networks and Company?
Still in active use
Some problems cannot be solved in any other way
Human ability vs. classical programs
Works like a primitive model of the human brain
Artificial intelligence has power!
ANN + Fuzzy Logic + Expert Systems + Rough Sets + Ant Algorithms
= SoftComputing
The Story
1943 – McCulloch & Pitts
– model of artificial neuron
1949 – Hebb
– information stored by biological neural nets
1958 – Rosenblatt
– perceptron model
1960 – Widrow & Hoff
– first neurocomputer - Madaline
1969 – Minsky & Papert
– XOR problem – single-layer perceptron limitations
1986 – McClelland & Rumelhart
– backpropagation algorithm
Where Is Softcomputing in Use?
Letters, signs, characters, digits recognition
Recognition of ship types – data from sonar
Electric power prediction
Different kinds of simulators and computer games
Engine diagnostics – in planes, vehicles
Rock-type identification
Bomb searching devices
Neural Networks Realisation
Set of connected identical neurons
Artificial neuron based on a biological neuron
Hardware realisation – digital device
Software realisation – simulators
Artificial neural network – idea, algorithm, mathematical formulas
Works in parallel
No programming – learning process necessary
Learning
With a Teacher:
[Diagram: feature (learning) vector – the training data – feeds the classifier (parameters: weights); a teacher supervises the result of learning, i.e. the classification result]
Without a Teacher:
[Diagram: feature (learning) vector – the test data – feeds the classifier (parameters: weights); the result of learning, i.e. the classification result, is produced without any teacher]
Softcomputing vs. Classical Computer
Different limitations of softcomputing methods
No softcomputing:
– operations based on symbols: editors, algebraic equations
– calculations with a high level of precision
Softcomputing is very nice, but not as universal as a classical computer
Anatomy Foundations (1)
Nervous system – two-way, symmetrical
A set of structures divided into 4 parts: brain, cerebellum, prolonged cord (medulla oblongata), spinal cord
Spinal cord – receiving and transmission of data
Prolonged cord (medulla oblongata) – breathing, blood circulation, digestion
Cerebellum – movement control
Brain (ca. 1.3 kg) – 2 hemispheres – feeling, thinking, movement
Anatomy Foundations (2)
Anatomy Foundations (3)
Cerebral cortex – thickness: ca. 2 mm, area: ca. 1.5 m²
Cerebral cortex divided into 4 parts – lobes
Each lobe is corrugated (folded)
Each hemisphere is responsible for one half of the body:
the right for the left part, the left for the right part
Hemispheres are identical in structure, but their functions are different
Anatomy Foundations (4)
Brain composed of fibres with a large number of branches
Two types of cells in nervous tissue: neurons and glial cells
There are more glial cells:
– they do not take part in data transfer among neurons
– nutritive (supporting) functions
Ca. 20 billion neurons in the cerebral cortex
Ca. 100 billion neurons in the whole brain
Neuron: dendrites – inputs, axon – output, body of the neuron
Neuron: thousands of synapses – connections to other neurons
Anatomy Foundations (5)
Neurons in work:
• chemical-electrical signal transfer
• the cell generates electrical signals
• the electric pulse is changed into a chemical signal at the end of the axon
• chemical information is passed by neurotransmitters
• ca. 50 different types of neurons
• neurons work at frequencies of hundreds of Hz
• neurons are rather slow devices!
Anatomy Foundations (6)
Biological and Artificial Neural Nets
Artificial neural networks are a good solution for:
– testing already identified biological systems
– pattern recognition
– exploring alternative configurations to find their basic features
Artificial neural networks are primitive cousins of biological nets
Biological nets have sophisticated internal features important for their normal work
Biological nets have sophisticated time dependences, ignored in most artificial networks
Biological connections among neurons are diverse and complicated
Most architectures of artificial nets are unrealistic from the biological point of view
Most learning rules for artificial networks are unrealistic from the biological point of view
Most biological nets can only be compared to already trained artificial nets realising a function
described in a very detailed way
Linear ANN – ADALINE (ADAptive Linear Neuron)
single neuron's answer:
scalar description: $y = w_0 + \sum_{j=1}^{M} w_j x_j$
vector description: $y(\tilde{\mathbf{x}}) = \sum_{j=0}^{M} w_j x_j = \mathbf{w}^T \tilde{\mathbf{x}}$
where $\tilde{\mathbf{x}} = \mathrm{col}(x_0, x_1, \ldots, x_M)$, $\mathbf{w} = \mathrm{col}(w_0, w_1, \ldots, w_M)$, $x_0 = 1$
[Diagram: inputs $x_1, \ldots, x_M$ multiplied by weights $w_1, \ldots, w_M$, summed together with the bias weight $w_0$ to give the output $y$]
multi-output net:
M – number of input neurons
K – number of output neurons
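Illustration of the single-neuron answer above – a minimal Python/NumPy sketch; the function name and example values are only illustrative:

import numpy as np

def adaline_output(x, w):
    # y = w^T x_tilde, where x_tilde = [1, x1, ..., xM] (x0 = 1)
    x_tilde = np.concatenate(([1.0], x))
    return w @ x_tilde

# example: M = 3 inputs, arbitrary weights [w0, w1, w2, w3]
w = np.array([0.5, 1.0, -2.0, 0.25])
x = np.array([1.0, 0.5, 2.0])
print(adaline_output(x, w))   # 0.5 + 1.0*1.0 - 2.0*0.5 + 0.25*2.0 = 1.0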
Single-Layer Multi-Output Network
k-th neuron's answer: $y_k(\tilde{\mathbf{x}}) = \sum_{j=0}^{M} w_{kj}\, x_j$
matrix description: $\mathbf{y}(\tilde{\mathbf{x}}) = \mathbf{W}\tilde{\mathbf{x}}$, with
$\mathbf{W} = \begin{pmatrix} w_{10} & w_{11} & \cdots & w_{1M} \\ w_{20} & w_{21} & \cdots & w_{2M} \\ \vdots & \vdots & & \vdots \\ w_{K0} & w_{K1} & \cdots & w_{KM} \end{pmatrix}$
[Diagram: input neurons $1, x_1, x_2, \ldots, x_M$ fully connected by the weights $w_{kj}$ to the output neurons $y_1, y_2, \ldots, y_K$]
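A sketch of the matrix form $\mathbf{y} = \mathbf{W}\tilde{\mathbf{x}}$ in Python/NumPy; the weight values are only examples:

import numpy as np

def layer_output(x, W):
    # W has shape (K, M+1): one row per output neuron, column 0 holds the bias weights w_k0
    x_tilde = np.concatenate(([1.0], x))   # x0 = 1
    return W @ x_tilde                     # vector of K answers

W = np.array([[0.1, 0.5, -0.3],            # neuron 1: w10, w11, w12
              [0.0, 1.0,  1.0]])           # neuron 2: w20, w21, w22
x = np.array([2.0, 1.0])
print(layer_output(x, W))                  # [0.8, 3.0]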
Learning Procedure
experimental data: N series
$\mathbf{x}^1, \mathbf{x}^2, \ldots, \mathbf{x}^N$ – learning data
$\mathbf{t}^1, \mathbf{t}^2, \ldots, \mathbf{t}^N$ – required answers
$\mathbf{x}^n \mapsto \mathbf{t}^n$ – function implemented by the net
error function – mean-square error:
$E(\mathbf{W}) = \frac{1}{2}\sum_{n=1}^{N}\sum_{k=1}^{K}\left(y_k(\mathbf{x}^n) - t_k^n\right)^2 = \frac{1}{2}\sum_{n=1}^{N}\sum_{k=1}^{K}\left(\sum_{j=0}^{M} w_{kj}\, x_j^n - t_k^n\right)^2$
looking for a minimum of the $E(\mathbf{W})$ function:
$\frac{\partial E(\mathbf{W})}{\partial w_{kj}} = 0 \quad \text{for all } k, j$
Pseudoinverse Algorithm
$\frac{\partial E(\mathbf{W})}{\partial w_{kj}} = \frac{1}{2}\sum_{n=1}^{N} 2\left(\sum_{j'=0}^{M} w_{kj'}\, x_{j'}^n - t_k^n\right) x_j^n = 0$
$\sum_{n=1}^{N}\sum_{j'=0}^{M} w_{kj'}\, x_{j'}^n\, x_j^n = \sum_{n=1}^{N} t_k^n\, x_j^n \quad \text{for all } k, j$
where:
$\mathbf{X} = \begin{pmatrix} 1 & x_1^1 & \cdots & x_M^1 \\ 1 & x_1^2 & \cdots & x_M^2 \\ \vdots & \vdots & & \vdots \\ 1 & x_1^N & \cdots & x_M^N \end{pmatrix}, \quad \mathbf{T} = \begin{pmatrix} t_1^1 & t_2^1 & \cdots & t_K^1 \\ t_1^2 & t_2^2 & \cdots & t_K^2 \\ \vdots & \vdots & & \vdots \\ t_1^N & t_2^N & \cdots & t_K^N \end{pmatrix}, \quad \mathbf{W} = \begin{pmatrix} w_{10} & w_{11} & \cdots & w_{1M} \\ w_{20} & w_{21} & \cdots & w_{2M} \\ \vdots & \vdots & & \vdots \\ w_{K0} & w_{K1} & \cdots & w_{KM} \end{pmatrix}$
finally:
$\mathbf{T} = \mathbf{X}\mathbf{W}^T \;\Rightarrow\; \mathbf{X}^T\mathbf{T} = \mathbf{X}^T\mathbf{X}\mathbf{W}^T \;\Rightarrow\; \mathbf{W}^T = (\mathbf{X}^T\mathbf{X})^{-1}\mathbf{X}^T\mathbf{T} = \mathbf{X}^{\dagger}\mathbf{T}$
where $\mathbf{X}^{\dagger}$ – pseudoinverse of $\mathbf{X}$
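A sketch of the pseudoinverse (least-squares) solution $\mathbf{W}^T = \mathbf{X}^{\dagger}\mathbf{T}$ in Python/NumPy; the function name and the toy data are only illustrative:

import numpy as np

def train_pseudoinverse(inputs, targets):
    # inputs: N x M learning data, targets: N x K required answers
    N = inputs.shape[0]
    X = np.hstack([np.ones((N, 1)), inputs])   # add the x0 = 1 column
    W_T = np.linalg.pinv(X) @ targets          # W^T = pinv(X) @ T, shape (M+1) x K
    return W_T.T                               # W: K x (M+1)

# toy example: 2 inputs, 1 output, exactly linear target t = x1 + x2
inputs  = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
targets = np.array([[0.0], [1.0], [1.0], [2.0]])
print(np.round(train_pseudoinverse(inputs, targets), 3))   # approx [[0. 1. 1.]]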
Gradient-Type Algorithms (1)
iterative approach:
[Figure: successive weight iterates descending along the error surface towards its minimum]
steps:
– start from a random weight vector
– the new weight vector follows the negative gradient: $\Delta\mathbf{w} = -\eta\,\nabla_{\mathbf{w}} E$
– repeat the process, generating the sequence of weight vectors $\mathbf{w}^{(\tau)}$
– the components of the weight vectors are calculated by:
$w_{kj}^{(\tau+1)} = w_{kj}^{(\tau)} - \eta\,\frac{\partial E(\mathbf{w})}{\partial w_{kj}}$
error function: $E(\mathbf{w}) = \sum_{n} E^n(\mathbf{w})$, where $E^n(\mathbf{w}) = \frac{1}{2}\sum_{k=1}^{K}\left(\sum_{j=0}^{M} w_{kj}\, x_j^n - t_k^n\right)^2$
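A sketch of one batch gradient-descent step on $E(\mathbf{W})$ in Python/NumPy; the function name is illustrative and eta denotes the learning rate:

import numpy as np

def batch_gradient_step(W, X_tilde, T, eta):
    # X_tilde: N x (M+1) inputs with a leading column of ones, T: N x K targets, W: K x (M+1)
    Y = X_tilde @ W.T            # net answers, N x K
    grad = (Y - T).T @ X_tilde   # dE/dW, shape K x (M+1)
    return W - eta * grad        # w_kj(tau+1) = w_kj(tau) - eta * dE/dw_kj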
Gradient-Type Algorithms (2)
sequential approach:
$w_{kj}^{(\tau+1)} = w_{kj}^{(\tau)} - \eta\,\frac{\partial E^n}{\partial w_{kj}}, \qquad \frac{\partial E^n}{\partial w_{kj}} = \left(y_k(\mathbf{x}^n) - t_k^n\right) x_j^n = \delta_k^n\, x_j^n$
error – delta rule (Widrow-Hoff rule):
$\delta_k^n = y_k(\mathbf{x}^n) - t_k^n, \qquad w_{kj}^{(\tau+1)} = w_{kj}^{(\tau)} - \eta\,\delta_k^n\, x_j^n$
algorithm:
1. set the start values of the weights – e.g. at random
2. calculate the net answer for an available $\mathbf{x}^n$
3. calculate the error value $\delta_k^n$
4. calculate the new weights $w_{kj}^{(\tau+1)}$ according to the delta rule
5. repeat steps 2.–4. until $E$ is less than the required value
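A sketch of the sequential (delta-rule) training loop above in Python/NumPy; the function name, initialisation scale and stopping threshold are only illustrative:

import numpy as np

def train_delta_rule(inputs, targets, eta=0.05, e_max=1e-3, epochs=1000):
    N = inputs.shape[0]
    X = np.hstack([np.ones((N, 1)), inputs])                  # add x0 = 1
    W = np.random.randn(targets.shape[1], X.shape[1]) * 0.1   # 1. random start values
    for _ in range(epochs):
        for xn, tn in zip(X, targets):
            y = W @ xn                                        # 2. net answer for x^n
            delta = y - tn                                    # 3. error delta_k^n
            W -= eta * np.outer(delta, xn)                    # 4. delta-rule update
        E = 0.5 * np.sum((X @ W.T - targets) ** 2)
        if E < e_max:                                         # 5. stop when E is small enough
            break
    return W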
Perceptron (1)
the story:
– Rosenblatt (1962)
– classification task
– Widrow & Hoff (1960) – ADALINE
[Diagram: inputs $x_1, \ldots, x_M$ with weights $w_{i1}, \ldots, w_{iM}$ and the threshold weight $w_0$, producing the output $y_i$]
answer: $y(\mathbf{x}) = g\!\left(\sum_{j=0}^{M} w_j x_j\right) = g(\mathbf{w}^T\mathbf{x})$, where $w_0$ – threshold value
activation function ($a = \sum_{j=0}^{M} w_j x_j$):
bipolar: $g(a) = \begin{cases} 1 & \text{for } a \ge 0 \\ -1 & \text{for } a < 0 \end{cases}$   unipolar: $g(a) = \begin{cases} 1 & \text{for } a \ge 0 \\ 0 & \text{for } a < 0 \end{cases}$
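A minimal Python/NumPy sketch of the two step activations and the perceptron answer; the function names are illustrative:

import numpy as np

def g_bipolar(a):
    return 1 if a >= 0 else -1    # 1 for a >= 0, -1 for a < 0

def g_unipolar(a):
    return 1 if a >= 0 else 0     # 1 for a >= 0, 0 for a < 0

def perceptron_answer(x, w, g=g_bipolar):
    a = w @ np.concatenate(([1.0], x))   # a = sum_j w_j x_j with x0 = 1 (w0 = threshold term)
    return g(a)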
Perceptron (2)
error function: $\frac{\partial E}{\partial w_{jk}}$ does not exist, because $g(a)$ is not differentiable
perceptron criteria:
compare the actual value of $y_i$ with the required output value $d_i$ and:
– if $y_i = d_i$, the weight values $W_{ij}$ and $w_0$ are left unchanged
– if $y_i = 0$ and the required value $d_i = 1$, update the weights as follows:
$W_{ij}(t+1) = W_{ij}(t) + x_j, \quad b_i(t+1) = b_i(t) + 1$
where: $t$ – previous cycle, $t+1$ – actual cycle
– if $y_i = 1$ and $d_i = 0$, update the weights according to:
$W_{ij}(t+1) = W_{ij}(t) - x_j, \quad b_i(t+1) = b_i(t) - 1$
where: $b_i$ – bias (polarisation), $d_i$ – required neuron's output signal
Perceptron (3)
summarising:
1. look up the input learning vectors
2. if the classification is correct, the weights are not changed
3. if the classification is wrong:
– if $t^n = +1$, add $\eta\,\mathbf{x}^n$ to the weight values
– else subtract $\eta\,\mathbf{x}^n$ from the weight values
– the value of $\eta$ is not important – it can be set to 1, it only scales $\mathbf{w}_i$
$E = \sum_{k=1}^{p}\left(y_i^{(k)} - d_i^{(k)}\right)^2$
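A sketch of the full perceptron learning loop summarised above, for bipolar targets $t \in \{-1, +1\}$ (Python/NumPy; the function name and toy data are only illustrative):

import numpy as np

def train_perceptron(X, t, epochs=100, eta=1.0):
    X_tilde = np.hstack([np.ones((X.shape[0], 1)), X])   # bias input x0 = 1
    w = np.zeros(X_tilde.shape[1])                        # start values (here: zeros)
    for _ in range(epochs):
        errors = 0
        for xn, tn in zip(X_tilde, t):
            y = 1 if w @ xn >= 0 else -1                  # bipolar activation
            if y != tn:                                   # wrong classification:
                w += eta * tn * xn                        # add/subtract eta * x^n
                errors += 1
        if errors == 0:                                   # all patterns classified correctly
            break
    return w

# linearly separable toy example: AND function with bipolar coding
X = np.array([[-1, -1], [-1, 1], [1, -1], [1, 1]])
t = np.array([-1, -1, -1, 1])
print(train_perceptron(X, t))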
Perceptron – Problems (1)
linear separability: the XOR problem – Minsky & Papert (1969):

In1  In2  Out
 0    0    0
 0    1    1
 1    0    1
 1    1    0

[Diagram: the two classes C1 and C2 in the (X1, X2) plane cannot be separated by a single line y(x) = 0]
– a non-linearly separable problem
– solution: a multilayer net
Perceptron – Problems (2)
multilayer network for the XOR problem solution:
[Diagram: a two-layer network with hidden neurons s1 and s2 connected to the inputs by weights w; the output neuron S combines them with weights w and -2w]
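The exact weights in the slide's diagram are only partly recoverable; below is a sketch of one classic two-layer construction consistent with the -2w weight, using unipolar step neurons (Python; the thresholds are chosen for illustration):

def step(a):
    return 1 if a >= 0 else 0

def xor_net(x1, x2, w=1.0):
    s = step(w*x1 + w*x2 - 1.5*w)             # hidden neuron: AND detector
    return step(w*x1 + w*x2 - 2*w*s - 0.5*w)  # output: OR of the inputs minus 2w * AND

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, xor_net(x1, x2))        # prints the XOR truth table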