
Main idea: change the activation function

In contrast to sigmoidal functions, radial basis functions have radial symmetry about a center in n-space (n = # of inputs). The farther the input is from the center, the smaller the activation. This models the on-center off-surround phenomenon found in certain real neurons, for example in the visual system.
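For concreteness, a minimal MATLAB sketch (assuming the common Gaussian form of the radial basis function; other radial functions are possible) shows how the activation depends only on the distance of the input from the center:

% Gaussian radial basis function: activation depends only on ||x - c||
rbf = @(x, c, sigma) exp(-norm(x - c)^2 / (2*sigma^2));

c = [0; 0];              % center in 2-dimensional input space
sigma = 1.0;             % spread ("width") of the basis function
rbf([0; 0], c, sigma)    % 1.000 : input at the center, maximal activation
rbf([1; 0], c, sigma)    % 0.607 : farther away, smaller activation
rbf([3; 0], c, sigma)    % 0.011 : far from the center, activation near zero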

Main idea: geometry

Radial Basis Function Neural Networks


Topic 4
Note: lecture notes by Michael Negnevitsky (U of Tasmania, Australia) and Bob Keller (Harvey Mudd College, CA) are used

On-Center response in a lab


LGN (lateral geniculate nucleus) description, from http://www.science.gmu.edu/~nbanerje/csi801/report_html.htm

The LGN is a folded sheet of neurons (1.5 million cells), about the size of a credit card but about three times as thick, found on each side of the brain. The ganglion cells of the LGN transform the signals into a temporal series of discrete electrical impulses called action potentials or spikes. The ganglion cell responses are measured by recording the temporal pattern of action potentials caused by light stimulation. The receptive fields of the LGN neurons are circularly symmetric and have the same center-surround organization. The algebraic sum of the center and surround mechanisms has a vague resemblance to a sombrero with a tall peak, so this model of the receptive field is sometimes called the "Mexican-hat model." When the spatial profiles of the center and surround mechanisms can be described by Gaussian functions, the model is referred to as the "difference-of-Gaussians" model.
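A minimal MATLAB sketch of the difference-of-Gaussians model mentioned above (the amplitudes and widths are my own illustrative choices, not physiological values): a narrow excitatory center minus a wider inhibitory surround gives the Mexican-hat profile.

r = -5:0.1:5;                              % distance from the receptive-field center
center   = 1.0 * exp(-r.^2 / (2*0.5^2));   % tall, narrow center Gaussian
surround = 0.4 * exp(-r.^2 / (2*1.5^2));   % lower, wider surround Gaussian
dog = center - surround;                   % positive peak at 0, negative ring around it
plot(r, dog);                              % the sombrero / Mexican-hat shape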

LGN response

Modeling

Other RBF examples

Spread = 1/selectivity

RBF Network: two layers only

RBF network

Example: XOR with RBF

Example: XOR with RBF

Example: XOR with RBF
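The three slides above step through XOR; a minimal MATLAB sketch of the classic construction (Gaussian centers at the corners (1,1) and (0,0) with 2*sigma^2 = 1, assumed here following the usual textbook treatment) shows why it works:

X = [0 0; 0 1; 1 0; 1 1];                 % the four XOR inputs, one per row
d = [0; 1; 1; 0];                         % XOR targets
c1 = [1 1];  c2 = [0 0];                  % centers of the two hidden RBF units
phi = @(x, c) exp(-sum((x - c).^2));      % Gaussian RBF with 2*sigma^2 = 1
for i = 1:4
    h = [phi(X(i,:), c1), phi(X(i,:), c2)];   % hidden-layer representation
    fprintf('x = (%d,%d)  ->  hidden = (%.3f, %.3f)  target = %d\n', ...
            X(i,1), X(i,2), h(1), h(2), d(i));
end
% The two "false" inputs map near (0.135, 1) and (1, 0.135), while the two
% "true" inputs both map to about (0.368, 0.368), so a single linear output
% unit (with a bias) can now separate the classes.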

Example: Function approximation

demo

demo

RBF properties
RBF networks tend to have good interpolation properties, but not as good extrapolation properties as MLPs. For extrapolation with a given number of neurons, an MLP can be a much better fit. With proper setup, RBFNs can train orders of magnitude faster than backpropagation. RBFNs enjoy the same universal approximation property as MLPs: given sufficient neurons, any reasonable function can be approximated (with just 2 layers).

Example: matlab newrb


% NEWRB(P,T,GOAL,SPREAD,MN,DF) takes these arguments,
%   P      - RxQ matrix of Q input vectors.
%   T      - SxQ matrix of Q target class vectors.
%   GOAL   - Mean squared error goal, default = 0.0.
%   SPREAD - Spread of radial basis functions, default = 1.0.
%   MN     - Maximum number of neurons, default is Q.
%   DF     - Number of neurons to add between displays, default = 25.
% and returns a new radial basis network.
% The larger that SPREAD is the smoother the function approximation
% will be. Too large a spread means a lot of neurons will be
% required to fit a fast changing function. Too small a spread
% means many neurons will be required to fit a smooth function,
% and the network may not generalize well. Call NEWRB with
% different spreads to find the best value for a given problem.
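A minimal usage sketch (toy sine-fitting data of my own; requires the MATLAB Neural Network Toolbox, which provides newrb and sim):

P = -1:0.1:1;                            % 1xQ row of inputs
T = sin(2*pi*P) + 0.05*randn(size(P));   % 1xQ noisy targets
net = newrb(P, T, 0.01, 0.3);            % GOAL = 0.01 (MSE), SPREAD = 0.3
Y = sim(net, P);                         % RBF network outputs
plot(P, T, 'o', P, Y, '-');              % targets vs. the fitted function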

Demo: spreads are too small

Demo: spreads are too large

RBF training for weights, centers and spreads using gradient descent
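A minimal sketch of one such gradient-descent step (my own notation and code, assuming a Gaussian hidden layer and a single linear output; the update rules follow from differentiating the squared error 0.5*(d - y)^2):

% One stochastic-gradient step for an RBF network
%   y(x) = sum_j w(j) * exp(-||x - c(:,j)||^2 / (2*s(j)^2))
% x: Dx1 input, d: scalar target, w: Mx1 weights, c: DxM centers,
% s: Mx1 spreads, eta: learning rate.
function [w, c, s] = rbf_sgd_step(x, d, w, c, s, eta)
    M = numel(w);
    phi = zeros(M, 1);
    for j = 1:M
        phi(j) = exp(-norm(x - c(:,j))^2 / (2*s(j)^2));
    end
    y = w' * phi;                        % network output
    e = d - y;                           % error on this sample
    w_old = w;
    for j = 1:M
        w(j)   = w(j)   + eta * e * phi(j);
        c(:,j) = c(:,j) + eta * e * w_old(j) * phi(j) * (x - c(:,j)) / s(j)^2;
        s(j)   = s(j)   + eta * e * w_old(j) * phi(j) * norm(x - c(:,j))^2 / s(j)^3;
    end
end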

Some tricks on RBF NN training


Solving approach for RBF NN


Assume the spreads are fixed. Choose the N data points themselves as centers. It remains to find the weights. Define Phi_ji = phi(||x_i - x_j||), where phi is the radial basis function and x_i, x_j are training samples. The matrix Phi of values Phi_ji is called the interpolation matrix.

Training for centers and spreads is apparently very slow. So some have taken the approach of computing these parameters by other means and just training for the weights (at most).

Solving approach for RBF NN


The interpolation matrix has the property that Phi w = d, where w is the weight vector and d is the desired output vector over all training samples (since the samples are both data points and centers). If Phi is non-singular, then we can solve for the weights as w = Phi^-1 d.
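A minimal MATLAB sketch of this solve (Gaussian basis and a toy data set assumed; in code it is better to use the backslash operator than to form the inverse explicitly):

X = rand(10, 2);                         % N = 10 training samples in 2-D
d = sin(X(:,1)) + cos(X(:,2));           % desired output for each sample
sigma = 0.5;                             % fixed spread, as assumed above
N = size(X, 1);
Phi = zeros(N, N);
for j = 1:N
    for i = 1:N
        % Phi(j,i) = phi(||x_i - x_j||); every sample is also a center
        Phi(j, i) = exp(-norm(X(i,:) - X(j,:))^2 / (2*sigma^2));
    end
end
w = Phi \ d;                             % solves Phi * w = d
max(abs(Phi * w - d))                    % ~0: the network interpolates exactly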

Bias-Variance dilemma, or how to choose the numbers

Two devils: approximation error vs. overfitting on the training set. The reason for overfitting: too large a model loses the ability to generalize. How to detect it: when moving from the training set to the test set, errors do increase, but they should not increase by too much.

Selecting centers by clustering

One center per training sample may be overkill. There are ways to select centers as representatives among clusters, given, say, a fixed number of representatives.

K-means clustering

K-means clustering
Tries to minimize the SSE (sum of squared errors) between the points and the centers of their clusters. This is a heuristic procedure and is subject to the usual local-minima pitfalls. However, it is used quite often.

This determines which points belong to which clusters, as well as the centers of those clusters. The desired number k of clusters is specified.

Initialize k centers, e.g. by choosing them to be k distinct data points.
Repeat:
    For each data point, determine which center is closest. This determines each point's cluster for the current iteration.
    Compute the centroid (mean) of the points in each cluster. Make these the centers for the next iteration.
until the centers don't differ appreciably from their previous values.
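A minimal MATLAB sketch of this procedure (no toolbox calls; the iteration cap and stopping tolerance are my own choices), as one way to pick k RBF centers from the data:

function C = kmeans_centers(X, k)
    % X: NxD data matrix, k: desired number of clusters / centers
    N = size(X, 1);
    C = X(randperm(N, k), :);                 % initialize with k distinct data points
    for iter = 1:100
        % assignment step: find the closest center for every point
        labels = zeros(N, 1);
        for i = 1:N
            dists = sum((C - repmat(X(i,:), k, 1)).^2, 2);
            [~, labels(i)] = min(dists);
        end
        % update step: the centroid of each cluster becomes its new center
        C_old = C;
        for j = 1:k
            if any(labels == j)
                C(j, :) = mean(X(labels == j, :), 1);
            end
        end
        if max(abs(C(:) - C_old(:))) < 1e-9   % centers no longer move
            break;
        end
    end
end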

MLP vs RBF Case Studies (source: Yampolskiy and Novikov, RIT)
