In the brain, a typical neuron collects signals from others through a host of fine structures called
dendrites. The neuron sends out spikes of electrical activity through the axon (the output and
conducting structure), which can split into thousands of branches. At the end of each branch, a
synapse converts the activity from the axon into electrical effects that inhibit or excite activity in
the contacted (target) neuron. When a neuron receives excitatory input that is sufficiently large
compared with its inhibitory input, it sends a spike of electrical activity (an action potential) down
its axon.
Learning occurs by changing the effectiveness of the synapses so that the influence of
one neuron on another changes.
These general properties of neurons can be abstracted in order to study the dynamical behavior
of large ensembles of neuronal cells. Thus there have been many interesting attempts to mimic
the brain's learning processes by creating networks of artificial neurons. This approach consists of
deducing the essential features of neurons and their interconnections and then programming a
computer to simulate these features.
The synapse is modeled by a modifiable weight associated with each particular
connection. Most artificial networks do not reflect the detailed geometry of the dendrites and
axons, and they express the electrical output of a neuron as a single number that represents the
rate of firing.
Artificial neural networks are typically composed of interconnected units which serve as model
neurons.
For a neural network to perform a specific task, the connections between units must be specified and
the weights on the connections must be set appropriately. The connections determine whether it is
possible for one unit to influence another. The weights specify the strength of the influence.
A three-layer network can be trained as follows: first, the network is presented with a training
example consisting of a pattern of activities for the input units together with the pattern that
represents the desired output. Then it is determined how closely the actual output matches the
desired output. Next, the weight of each connection is changed in order to produce a better
approximation of the desired output.
Note that in this procedure, the experimenter must know in advance the desired output and then
has to force the network to behave accordingly. Therefore a learning rule is needed. The learning
rule governs the way in which each connection weight is to be modified, so that the goal (the
output pattern) is reached efficiently.
Investigators have devised many powerful learning rules of great practical value. However, it is
still not known which representations and learning procedures are actually used by the brain.
Each unit converts the pattern of incoming activities that it receives into a single outgoing activity
that it sends to other units. This conversion is performed in two steps. First, the unit multiplies each
incoming activity by the weight on the connection and adds together all these weighted inputs to get
a total input. Second, the unit uses an input-output function that transforms the total input into an
outgoing activity.
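The two-step conversion just described can be sketched in a few lines of Python (a minimal illustration; the particular activities, weights, and the choice of a sigmoid input-output function here are assumptions for the example, not part of any specific network):

```python
import math

def unit_output(activities, weights):
    """Convert a unit's incoming activities into one outgoing activity.

    Step 1: multiply each incoming activity by the weight on its
            connection and add the results to get a total input.
    Step 2: pass the total input through an input-output function
            (here a sigmoid) to get the outgoing activity.
    """
    total_input = sum(a * w for a, w in zip(activities, weights))
    return 1.0 / (1.0 + math.exp(-total_input))

# Three incoming activities and their connection weights.
print(unit_output([0.5, 1.0, 0.2], [0.4, -0.6, 1.5]))  # ≈ 0.475
```

The total input here is 0.5·0.4 + 1.0·(−0.6) + 0.2·1.5 = −0.1, which the sigmoid maps to roughly 0.475.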
The global behavior of an artificial neural network depends on both the weights and the input-
output function that is specified for the units. This function typically falls into one of three
categories: linear, threshold, or sigmoid. For linear units, the output activity is proportional to the
total weighted input. For threshold units, the output is set at one of two levels, depending on
whether the total input is greater than or less than some threshold value. In sigmoid units, the
output varies continuously but not linearly as the input changes. Sigmoid units bear a greater
resemblance to real neurons than linear or threshold units do.
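The three categories of input-output function can be illustrated as follows (a minimal sketch; the gain, threshold value, and output levels are illustrative assumptions):

```python
import math

def linear(total_input, gain=1.0):
    # Output activity is proportional to the total weighted input.
    return gain * total_input

def threshold(total_input, theta=0.0):
    # Output is one of two levels, depending on whether the total
    # input is greater than the threshold value theta.
    return 1.0 if total_input > theta else 0.0

def sigmoid(total_input):
    # Output varies continuously but not linearly with the input,
    # staying between 0 and 1.
    return 1.0 / (1.0 + math.exp(-total_input))

for x in (-2.0, 0.0, 2.0):
    print(x, linear(x), threshold(x), round(sigmoid(x), 3))
```

Comparing the three on the same inputs shows the difference: the linear unit passes the input through unchanged, the threshold unit jumps abruptly between its two levels, and the sigmoid unit changes smoothly between them.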
One common type of artificial neural network consists of three groups, or layers, of units. An input
layer is connected to a layer of intermediate units (called hidden units), which is in turn connected
to a layer of output units. The activities of the input units represent the incoming external
information that is fed into the network. The activity of each hidden unit is determined by the
activities of the input units and the weights on the connections between the input and hidden
units. The behavior of the output units depends, in turn, on the activity of the hidden units and
the weights between the hidden and output units.
Note that the hidden units are free to construct their own representation of the input. The weights
between the input and hidden units determine when each hidden unit is active. Thus by
modifying these weights a hidden unit can choose what it represents.
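A forward pass through such a three-layer network can be sketched as follows (the weight values are arbitrary illustrative assumptions, not trained ones, and the sigmoid input-output function is likewise an assumption):

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(activities, weights):
    """Compute one layer's activities from the previous layer's.

    weights[j] holds the connection weights from every unit in the
    previous layer to unit j of this layer.
    """
    return [sigmoid(sum(a * w for a, w in zip(activities, row)))
            for row in weights]

# Illustrative weights: two input units, two hidden units, one output unit.
input_to_hidden = [[0.8, -0.4],
                   [0.3,  0.9]]
hidden_to_output = [[1.2, -0.7]]

input_activities = [1.0, 0.5]          # incoming external information
hidden_activities = layer(input_activities, input_to_hidden)
output_activities = layer(hidden_activities, hidden_to_output)
print(output_activities)               # ≈ [0.574]
```

The hidden activities depend only on the input activities and the input-to-hidden weights, and the output depends in turn on the hidden activities and the hidden-to-output weights, matching the description above; changing the input-to-hidden weights changes what each hidden unit responds to.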