In the brain, a typical neuron collects signals from others through a host of fine structures called dendrites. The neuron sends out spikes of electrical activity through the axon (the output and conducting structure), which can split into thousands of branches. At the end of each branch, a synapse converts the activity from the axon into electrical effects that inhibit or excite activity in the contacted (target) neuron. When a neuron receives excitatory input that is sufficiently large compared with its inhibitory input, it sends a spike of electrical activity (an action potential) down its axon.

Learning occurs by changing the effectiveness of the synapses so that the influence of one neuron on another changes.

These general properties of neurons can be abstracted in order to study the dynamical behavior of large ensembles of neuronal cells. Thus there have been many interesting attempts to mimic the brain's learning processes by creating networks of artificial neurons.

This approach consists of deducing the essential features of neurons and their interconnections and then programming a computer to simulate these features.

Artificial neural networks are typically composed of interconnected units which serve as model neurons.

The synapse is modeled by a modifiable weight associated with each particular connection. Most artificial networks do not reflect the detailed geometry of the dendrites and axons, and they express the electrical output of a neuron as a single number that represents the rate of firing.

Each unit converts the pattern of incoming activities that it receives into a single outgoing activity that it sends to other units. This conversion is performed in two steps:

First, it multiplies each incoming activity by the weight on the connection and adds together all these weighted inputs to get a total input.

Second, the unit uses an input-output function that transforms the total input into an outgoing activity.
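The two steps above can be sketched in a few lines of Python. The function name, the example weights, and the choice of a sigmoid input-output function are illustrative assumptions, not part of the original description:

```python
import math

def unit_output(inputs, weights, activation):
    """Compute one unit's outgoing activity from its incoming activities.

    Step 1: multiply each incoming activity by its connection weight
            and sum them to get the total input.
    Step 2: pass the total input through the input-output function.
    """
    total_input = sum(x * w for x, w in zip(inputs, weights))
    return activation(total_input)

# Illustrative values: two incoming activities and a sigmoid unit.
sigmoid = lambda x: 1.0 / (1.0 + math.exp(-x))
activity = unit_output([1.0, 0.5], [0.8, -0.2], sigmoid)
```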

The global behavior of an artificial neural network depends on both the weights and the input-output function that is specified for the units. This function typically falls into one of three categories: linear, threshold, or sigmoid. For linear units, the output activity is proportional to the total weighted input. For threshold units, the output is set at one of two levels, depending on whether the total input is greater or less than some threshold value. In sigmoid units, the output varies continuously but not linearly as the input changes. Sigmoid units bear a greater resemblance to real neurons than linear or threshold units do.
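The three categories of input-output function could be written as follows; the particular slope and threshold values are arbitrary defaults chosen for illustration:

```python
import math

def linear(x, slope=1.0):
    # Output activity is proportional to the total input.
    return slope * x

def threshold(x, theta=0.0):
    # Output is one of two levels, depending on whether the
    # total input exceeds the threshold value theta.
    return 1.0 if x > theta else 0.0

def sigmoid(x):
    # Output varies continuously but nonlinearly with the input,
    # saturating toward 0 for large negative and 1 for large positive input.
    return 1.0 / (1.0 + math.exp(-x))
```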

For a neural network to perform a specific task, the connections between units must be specified and the weights on the connections must be set appropriately. The connections determine whether it is possible for one unit to influence another. The weights specify the strength of that influence.

One common type of artificial neural network consists of three groups, or layers, of units. An input layer is connected to a layer of intermediate units (called hidden units), which is in turn connected to a layer of output units. The activities of the input units represent the incoming external information that is fed into the network. The activity of each hidden unit is determined by the activities of the input units and the weights on the connections between the input and hidden units. The behavior of the output units depends, in turn, on the activity of the hidden units and the weights between the hidden and output units.
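A forward pass through such a three-layer network can be sketched as below. The network shape (2 inputs, 2 hidden units, 1 output) and the weight values are hypothetical, and sigmoid units are assumed throughout:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def layer(activities, weights):
    # weights[j][i] is the weight on the connection from unit i
    # of the previous layer to unit j of this layer.
    return [sigmoid(sum(w * a for w, a in zip(row, activities)))
            for row in weights]

def forward(inputs, w_hidden, w_output):
    hidden = layer(inputs, w_hidden)    # input layer -> hidden layer
    return layer(hidden, w_output)      # hidden layer -> output layer

# Hypothetical weights for a 2-input, 2-hidden, 1-output network:
w_h = [[0.5, -0.3], [0.8, 0.2]]
w_o = [[1.0, -1.0]]
output = forward([1.0, 0.0], w_h, w_o)
```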

Note that the hidden units are free to construct their own representation of the input. The weights between the input and hidden units determine when each hidden unit is active. Thus, by modifying these weights, a hidden unit can choose what it represents.

A three-layer network can be trained as follows: first, the network is presented with a training example consisting of a pattern of activities for the input units together with the pattern that represents the desired output. Then it is determined how closely the actual output matches the desired output. Next, the weight of each connection is changed in order to produce a better approximation of the desired output.
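The present-compare-adjust cycle can be illustrated in its simplest form with a single sigmoid unit trained by gradient descent (the delta rule). This is a deliberately reduced sketch, not the full procedure for a network with hidden units; the learning rate and starting weights are arbitrary:

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def train_step(inputs, target, weights, rate=0.5):
    """One training step for a single sigmoid unit.

    1. Present the input pattern and compute the actual output.
    2. Measure how far it is from the desired output.
    3. Nudge each weight to reduce that error.
    """
    out = sigmoid(sum(w * x for w, x in zip(weights, inputs)))
    error = target - out
    delta = error * out * (1.0 - out)   # slope of the squared error
    return [w + rate * delta * x for w, x in zip(weights, inputs)]

# Train the unit to output 1 for the input pattern [1.0, 1.0]:
weights = [0.1, -0.1]
for _ in range(1000):
    weights = train_step([1.0, 1.0], 1.0, weights)
```

Repeating the step gradually moves the weights so that the actual output approaches the desired output.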

Note that in this procedure, the experimenter must know the desired output in advance and then force the network to behave accordingly. Therefore a learning rule is needed. The learning rule governs the way in which each connection weight is to be modified so that the goal (the desired output pattern) is reached efficiently.

Investigators have devised many powerful learning rules of great practical value. However, it is still not known which representations and learning procedures are actually used by the brain.
