A question of great interest in neural network theory is the way such a network modifies its synaptic connections. It is in the synapses that memory is believed to be stored: the progression from input to output somehow leads to cognitive behaviour. When our work began more than ten years ago, this point of view was shared by relatively few people. Certainly, Kohonen was one of those who not only shared the attitude, but probably preceded us in advocating it. There had been some early work done on distributed memories by Pribram, Grossberg, Longuet-Higgins and Anderson. If you consider a neural network, there are at least two things you can be concerned with. You can look at the instantaneous behaviour, at the individual spikes, and you can think of the neurons as adjusting themselves over short time periods to what is around them. This has recently led to much work related to Hopfield's model; many people are now working on such relaxation models of neural networks. But we are primarily concerned with the longer-term behaviour of neural networks. To a certain extent this too can be formulated as a relaxation process, although it is a relaxation process with a much longer lifetime.
We realized very early, as did many others, that if we could put the proper synaptic strengths at the different junctions, then we would have a machine which, although it might not talk and walk, would begin to do some rather interesting things.
In the last few years we have learnt an enormous amount about how the immune system functions. We now have at least the outline of an immune system network theory that seems to account for much of the phenomenology (Hoffmann, 1980, 1982; Hoffmann et al., 1988). The many similarities between the immune system and the central nervous system suggested the possibility that the same kind of mathematical model could be applicable to both systems. We found that a neural network theory analogous to the immune system theory can indeed be formulated (Hoffmann, 1986). The basic variables in the immune system network theory are clone sizes; the corresponding variables in the neural network theory are the rates of firing of neurons. We need to postulate that neurons are slightly more complex than has been assumed in conventional neural network theories, namely that there can be hysteresis in the rate of firing of a neuron as the input level of the neuron is varied.
The added complexity of the hysteresis postulate is compensated by a new simplicity at the level of the network; the network can learn without any changes in the synaptic connection strengths (Hoffmann, Benson, Bree & Kinahan, 1986). Learned information is associated solely with a state vector; memory is a consequence of the fact that due to the hysteresis associated with each neuron, the system tends to stay in the region of an N-dimensional phase space to which its experiences have taken it. A network's stimulus–response behaviour is determined by its location in that space.
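The hysteresis postulate above can be made concrete in a few lines. The following is a minimal sketch, not the authors' implementation: a firing-rate unit with two thresholds (values are illustrative) that retains its previous state whenever the input lies between them, so memory can reside in the state vector rather than in the synaptic strengths.

```python
# Minimal sketch of a firing-rate neuron with hysteresis. Between the
# two thresholds the neuron keeps its previous firing state, so the
# network's memory lives in the N-dimensional state vector, not in the
# synaptic weights. Threshold values are illustrative assumptions.

class HysteresisNeuron:
    def __init__(self, theta_on=1.0, theta_off=-1.0):
        self.theta_on = theta_on    # input above this switches the neuron on
        self.theta_off = theta_off  # input below this switches it off
        self.firing = False         # current high/low firing state

    def update(self, net_input):
        if net_input >= self.theta_on:
            self.firing = True
        elif net_input <= self.theta_off:
            self.firing = False
        # between the thresholds the state is left unchanged (hysteresis)
        return self.firing

n = HysteresisNeuron()
trace = [n.update(x) for x in [2.0, 0.0, -0.5, -2.0, 0.0]]
# the neuron stays on through 0.0 and -0.5, switching off only at -2.0
```

Once the input has driven the unit into its high state, intermediate inputs no longer dislodge it; this is the sense in which the system "tends to stay in the region of phase space to which its experiences have taken it".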
There has recently been a marked increase in research activity regarding the structure and function of the brain. Much of this has been generated by the more general advances in biology, particularly at the molecular and microscopic levels, but it is probably fair to say that the stimulation has been due at least as much to recent advances in computer simulation. To accept this view does not mean that one is equating the brain to an electronic computer, of course; far from it, those involved in brain research have long since come to appreciate the considerable differences between the cerebral cortex and traditional computational hardware. But the computer is nevertheless a useful device in brain science, because it permits one to simulate processes which are difficult to monitor experimentally, and perhaps impossible to handle by theoretical analysis.
The articles in this book are written records of talks presented at a meeting held at the Gentofte Hotel, Copenhagen, during the three days August 20–22, 1986. They have been arranged in an order that places more general aspects of the subject towards the beginning, preceding those applications to specific facets of brain science which make up the balance of the book. The final chapters are devoted to a number of ramifications, including the design of experiments, communication and control.
The meeting could not have been held without the financial support generously donated by the Augustinus Foundation, the Carlsberg Foundation, the Mads Clausen (Danfoss) Foundation, the Danish Natural Science Research Council, the Hartmann Foundation, IBM, the Otto Mønsted Foundation, NORDITA, the NOVO Foundation, and SAS.
A prominent feature of the brain is the apparent diversity of its structure: the distribution of neurons and the way in which their dendrites and axon fibers differ in various brain centers. The pattern of inputs and outputs of each neuron in the brain most probably differs from that of any other neuron in the system, and this possibility clearly imposes constraints on any attempts at generalization. Yet, since its inception, microscopy of the central nervous system (CNS) has involved a sustained effort to define the laws of spatial arrangement and of connectivity distinguishing specific structures. However, the question which naturally arises from the above is whether these structural features may reflect, and perhaps determine, fundamental differences in the mode of operation of distinct brain structures. Alternatively, the possibility may exist that such structural specializations merely represent anatomical ‘accidents of development’, perhaps reflecting phylogenic origin, but playing a functional role which is no more significant than, for example, that of the appendix or the coccygeal vertebrae in man.
It is difficult to provide an answer to this question from the presently available anatomical and physiological data. Although substantial neurohistological data, on one hand, and neurophysiological information, on the other, are available, meaningful correlation of these two sets of data can only be accomplished in very isolated instances. In general, unlike recording from invertebrates, where the simplicity and viability of the nervous system makes it feasible to observe the elements recorded, physiological studies of the mammalian CNS are performed in a ‘blind’ fashion and it is exceedingly difficult to correlate these studies with the microscopical anatomy of the tissue.
Networks of formal neurons provide simple models for distributed, content-addressable, fault-tolerant memory. Many numerical and analytical results have been obtained recently, especially on the Hopfield model (Hopfield, 1982). In this model, a network of a large number of fully connected neurons, with symmetric interactions, has a memory capacity which increases linearly with the size of the network – that is, in fact, with the connectivity. When the total number of stored patterns exceeds this capacity, a catastrophic deterioration occurs, as total confusion sets in. Alternative schemes have been proposed (Hopfield, 1982; Nadal et al., 1986; Parisi, 1986) that avoid overloading: new patterns can always be learned, at the expense of the oldest stored ones, which get erased; for this reason, these schemes have been called palimpsests. Numerical and analytical results (Nadal et al., 1986; Mézard et al., 1986) detail the behavior of these models, which show striking analogies with the behavior of human short-term memory. We will review the main results, and point out the possible relevance for human working memory.
In Section 13.2 the origin of the catastrophic deterioration in the standard scheme (Hopfield model) is simply explained. In Section 13.3 simple modifications are shown to lead to short-term memory effects. In Section 13.4 properties of these networks are exposed in close analogy with known data of experimental psychology.
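The two storage schemes contrasted above can be sketched briefly. The following is an illustration under stated assumptions, not the specific published rules: the standard Hebb prescription for the Hopfield model, and a generic palimpsest-style variant in which weights are bounded after each new pattern so that old memories fade instead of the whole memory collapsing. The bound A and all sizes are placeholder values.

```python
import numpy as np

# Sketch of Hebbian storage in the Hopfield model, plus a generic
# bounded-weight "palimpsest" variant. The clipping scheme and all
# parameter values are illustrative assumptions, not a specific
# published rule.

rng = np.random.default_rng(0)
N = 200
patterns = rng.choice([-1, 1], size=(5, N))

def overlap(J, pattern, steps=10):
    # zero-temperature synchronous dynamics; measure recall quality
    s = pattern.copy()
    for _ in range(steps):
        s = np.where(J @ s >= 0, 1, -1)
    return np.mean(s == pattern)

# standard Hebb rule: J_ij = (1/N) * sum over patterns of xi_i * xi_j
J = sum(np.outer(p, p) for p in patterns) / N
np.fill_diagonal(J, 0)
recall_quality = overlap(J, patterns[0])   # near 1.0 well below capacity

# palimpsest-style variant: bound the weights so that new patterns
# gradually overwrite the contribution of the oldest ones
A = 0.3
Jp = np.zeros((N, N))
for p in patterns:
    Jp = np.clip(Jp + np.outer(p, p) / N, -A, A)
np.fill_diagonal(Jp, 0)
recent_recall = overlap(Jp, patterns[-1])  # the most recent pattern is retained
```

With only five patterns the bound never binds and both rules recall well; the qualitative difference appears only near and beyond the capacity limit, where the unbounded rule deteriorates catastrophically while the bounded one forgets gracefully.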
Neural networks with plasticity (dynamic connection coefficients) can recognize and associate stimulus patterns. Previous studies have shown the usefulness of a simple algorithm called ‘brain-washing’, which leads to networks that can have many eligible neurons with large variations in activity and complex cyclic modes (Clark & Winston, 1984). Methods of modifying connection coefficients are discussed and evaluated.
Successful pattern recognition with quasirandom, rather than topographic, networks would be much more significant and general. There is no doubt that topographic networks could be more efficient in the brain, but they are less adaptable to changing conditions. A quasirandom network could be trained to recognize temporal and spatial stimuli, while a topographic network would be limited to a particular type of stimulus.
Three types of neurons have been incorporated into the network. A group of 10 input (stimulus) neurons (Ni) send µi efferents to neurons in the main network (see Figs. 30.1 and 30.2). Neurons in the main network are interconnected by µa afferent connections and µe efferent connections. In addition a group of output neurons (No) can be included to monitor activity of the main network and to train the network.
Components of a successful, sensible and biologically feasible training algorithm will be discussed.
Physical limitations to training algorithms
A specified neuron obtains the majority of information from afferent and efferent neurons.
This article addresses a well-defined issue: does coherent firing of several neurons play a role in the function of the cerebral cortex? Its main purpose is to present the results of a computer simulation of a neural network in which the exact timing of impulses is indeed of paramount importance. And it is demonstrated that such a network would have potentially useful powers of discrimination and recall.
It is reasonably clear that the timing of the arrival of nerve impulses at a given neuron cannot be a matter of total indifference. The voltage across a neural membrane relaxes back towards its resting value, once a stimulus has been removed, so it is easy to envisage situations in which incoming impulses will fail to provoke a response unless they can act in unison by arriving simultaneously, or nearly so, at the somatic region. And there is a considerable corpus of evidence that the timing of incoming impulses is important. In the human auditory system, for example, small temporal offsets between impulse trains in the two cochlear nerves are exploited to locate sound sources, while the relative timing of impulses in the same nerve appears to be essential for the correct functioning of speech discrimination (Sachs, Voigt & Young, 1983). There is also evidence that the timing of sensory stimulation, down at the ten-millisecond level, is critically important for classical conditioning (Sutton & Barto, 1981). And on the clinical side, one sees an extreme example of neuronal synchronization in the case of epilepsy, which apparently arises from mutual excitation between neurons (Traub & Wong, 1982).
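The leaky-summation argument above can be captured in a toy model. The following sketch is illustrative only (the time constant, EPSP size and threshold are arbitrary assumptions, not physiological fits): the membrane potential decays exponentially between impulses, so two impulses reach threshold only if they arrive close together in time.

```python
import math

# Toy coincidence detector: the membrane potential leaks back toward
# rest between incoming impulses, so summation to threshold succeeds
# only for nearly simultaneous arrivals. All constants are illustrative.

def fires(arrival_times, tau=5.0, epsp=0.6, threshold=1.0):
    v = 0.0
    t_prev = None
    for t in sorted(arrival_times):
        if t_prev is not None:
            v *= math.exp(-(t - t_prev) / tau)  # exponential decay since last input
        v += epsp                               # each impulse contributes one EPSP
        t_prev = t
        if v >= threshold:
            return True
    return False

near = fires([0.0, 1.0])    # nearly simultaneous: summation reaches threshold
far = fires([0.0, 20.0])    # widely separated: the first EPSP has decayed away
```

A single impulse (0.6) never reaches threshold (1.0); two impulses separated by one time unit do, while the same two impulses twenty units apart do not.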
A huge amount of physiological research shows the importance of the role of the cerebellum in motor control (Ito, 1984; and many others). It is natural for physiologists to try to understand what the function of the cerebellum is, and when and which corrections or other modifications of cerebral cortical motor programs are introduced by the cerebellum. It seems to us that in recent decades the most interesting hypotheses on cerebellar performance have been proposed by the late David Marr in ‘A theory of cerebellar cortex’ (Marr, 1969). According to Marr, the main operation of the cerebellar circuitry is the switching on of proper motor commands by the current sensory input and the automatic adaptive acquisition of such cerebellar network capability. The location of the cerebellum – in the crest of almost all the ascending and descending nervous tracts – is definitely strategic for such a function. The unique combination of tens of thousands of granular cells (GrCs) and one climbing fibre (CF) at one Purkinje cell (PC) seems to be crucial for it.
The kernel of Marr's theory is composed of the postulate of GrC–PC synaptic modification due to simultaneous excitation of the climbing fibre and the parallel fibres (PFs). In other words, Marr supposed that the Purkinje cell memorizes the afferent conditions in which it ought to be active.
Cyclic phenomena and chaos in neural networks
G. Barna and P. Érdi, Central Research Institute for Physics of the Hungarian Academy of Sciences
Rhythmic behaviour is characteristic of the nervous system at different hierarchical levels. Periodic temporal patterns can be generated both by endogenous pacemaker neurons and by multicellular neural networks. At the single-cell level it was demonstrated, both experimentally and theoretically, that periodic membrane potentials could bifurcate to more complex oscillatory behaviour (ultimately identified as chaos) in response to drug treatment (Holden, Winlow & Haydon, 1982; Chay, 1984). Even alternation between periodically synchronized oscillation and chaotic behaviour has been found in periodically forced oscillators of squid giant axons (Aihara, Matsumoto & Ichikawa, 1985).
The appearance of quasi-periodicity and chaos has been associated with abnormal neural phenomena not only at the single-neuron level but also at the macroscopic scale, connecting chaotic EEG dynamics to epileptic seizure (Babloyantz, Salazar & Nicolis, 1985). At an intermediate level, chaotic behaviour was found in a model of the central dopaminergic neuronal system, and was associated with schizophrenia (King, Barchas & Huberman, 1984).
‘Normal’ and ‘abnormal’ dynamic behaviour, also at an intermediate, namely synaptic, level, has recently been investigated (Érdi & Barna, 1986; Érdi & Barna, 1987). Preliminary numerical calculations suggested that the regular periodic operation of a synaptic-level rhythmic generator of the cholinergic system requires a fine-tuned neurochemical control system. Even mild impairment of the metabolism might imply ‘abnormal’ dynamic synaptic activity.
Memory disorders associated with Alzheimer's disease, partially due to disturbance of the control system of acetylcholine (ACh) synthesis, can be accompanied by changes in the dynamic patterns of firing frequency (Wurtman, Hefti & Melamed, 1981).
The belief that complex macroscopic phenomena of everyday experience are consequences of cooperative effects and large-scale correlations among enormous numbers of primitive microscopic objects subject to short-range interactions, is the starting point of all mathematical models on which our interpretation of nature is based. The models used are basically of two types: continuous and discrete models.
The former have until now been the most common models of natural systems; their mathematical formulation in terms of differential equations allows analytic approaches that permit exact or approximate solutions. The power of these models can be appreciated if one considers that complex macroscopic phenomena, such as phase transitions, approach to equilibrium and so on, can be explained in terms of them when infinite (thermodynamic) limits are taken.
More recently, the great development of numerical computation has shown that discrete models can also be good candidates to explain complex phenomena, especially those connected with irreversibility, such as chaos, evolution of macroscopic systems from disordered to more ordered states and, in general, self-organizing systems. As a consequence, the interest in discrete models has vastly increased. Among these, cellular automata, C.A. for short, have received particular attention; we recapitulate their definition:
A discrete lattice of sites, the state of which at time t is described by integers whose values depend on those of the sites at the previous time t − τ (where τ is a fixed finite time delay).
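The definition above can be realized in a few lines. The sketch below uses elementary rule 110 on a ring of sites purely as a familiar concrete example; the choice of rule, lattice size and initial state are illustrative assumptions.

```python
# A minimal cellular automaton in the sense just defined: a discrete
# lattice of sites whose integer states are updated in parallel from
# the neighbouring states at the previous time step. Elementary rule
# 110 is used here purely as a familiar example.

def ca_step(cells, rule=110):
    n = len(cells)
    table = [(rule >> i) & 1 for i in range(8)]  # rule number -> lookup table
    return [table[(cells[(i - 1) % n] << 2) | (cells[i] << 1) | cells[(i + 1) % n]]
            for i in range(n)]

row = [0] * 10 + [1] + [0] * 10   # a single active site on a ring of 21
history = [row]
for _ in range(5):
    row = ca_step(row)
    history.append(row)
```

All sites are updated simultaneously from the previous configuration, which is exactly the parallel, finite-delay dynamics the definition requires.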
Neural networks of spin glass type reveal remarkable properties of a content-addressable memory (Hopfield, 1982; Amit et al., 1985; Kinzel, 1985a). They are able to retrieve the full information of a learned pattern from an initial state which contains only partial information. Recently much effort has been devoted to the modeling of networks based on Hebb's learning rule (Cooper et al., 1979). These networks are the Hopfield model and its modifications. All have in common a local learning rule which allows the storage of orthogonal patterns without errors. The learning rule is local if the change of the synaptic coefficient depends only on the states of the two interconnected neurons and possibly on the local field of the postsynaptic one. This property seems to be essential from a biological point of view. However, the storing capability of these networks is strongly limited by the fact that they are not able to store correlated patterns without errors (Kinzel, 1985b).
On the other hand, a storing procedure for correlated patterns is available (Personnaz et al., 1985; Kanter & Sompolinsky, 1986). But it involves matrix inversions which are not equivalent to a local learning mechanism. It is the purpose of this paper to present a new local learning rule for neural networks which are able to store both correlated and uncorrelated patterns. Moreover, this learning rule enables the network to fulfil two further important properties of natural networks: the learning process does not reverse the signs of the synaptic coefficients and leads to a network with asymmetric bonds even if it starts from a symmetric one.
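To make the notion of locality concrete, here is a hedged sketch of *a* local learning rule in the sense defined above: the change of J_ij depends only on the states of the two connected neurons and on the local field of the postsynaptic one. It is a generic perceptron-style iteration for illustration, not the specific rule proposed in this chapter.

```python
import numpy as np

# A generic local learning rule: the increment of row i of J depends
# only on xi_i, xi_j and the local field h_i of the postsynaptic
# neuron i. Sizes and the sweep limit are illustrative assumptions.

rng = np.random.default_rng(1)
N = 50
patterns = rng.choice([-1, 1], size=(3, N))
J = np.zeros((N, N))

for _ in range(100):                    # sweep until all patterns are stable
    stable = True
    for xi in patterns:
        h = J @ xi                      # local fields under the current weights
        for i in range(N):
            if xi[i] * h[i] <= 0:       # pattern not yet stored at neuron i
                J[i] += xi[i] * xi / N  # purely local increment
                J[i, i] = 0             # no self-connection
                stable = False
    if stable:
        break

# every training pattern is now a fixed point of the sign dynamics
ok = all(np.array_equal(np.sign(J @ xi), xi) for xi in patterns)
```

Because each row of J is adjusted independently using only locally available quantities, no matrix inversion is needed, and nothing in the iteration preserves the symmetry J_ij = J_ji.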
The mechanisms of the complex functions attributed mostly to the cerebral cortex are hidden in the collective behaviour of a vast neural network that cannot practically be described in detail or in general. Cyclic modes of activity which emerge spontaneously in the dynamics of neural networks may underlie possible mechanisms of short-term memory and associative thinking. The transitions from seemingly random activity patterns to cyclic activity have been examined in isolated networks with pseudorandomly chosen synapses and in networks with very simple architectures.
The basic computer model (Clark, Rafelski & Winston, 1985) envisions a collection of neurons, linked by a network of axons and dendrites that synapse onto one another. The synaptic interactions are modeled by a connection matrix V. The net algebraic strength of the connections from neuron j to neuron i, represented by the matrix element Vij, can be positive (excitatory), negative (inhibitory) or zero (no connection). In the present study, the Vij were chosen randomly, but in accord with certain specified gross network parameters, viz.
N = net size = number of neurons in net,
m = connection density = probability that a given j → i link exists,
h = fraction of inhibitory neurons.
No more than one connection (‘synapse’) was allowed from any source neuron j to a given target neuron i.
The neurons update their states synchronously, corresponding to the assumption of a universal time delay δ for direct signal transmission.
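The construction just described can be sketched as follows. This is an illustration under stated assumptions: the uniform weight magnitudes, the threshold value, and the specific parameter values are placeholders, not the authors' choices.

```python
import numpy as np

# Sketch of the connection-matrix construction described above: V_ij is
# the net strength of the j -> i link, generated pseudorandomly subject
# to the gross parameters N (net size), m (connection density) and
# h (fraction of inhibitory neurons). Weight magnitudes and the
# threshold are placeholder assumptions.

rng = np.random.default_rng(2)
N, m, h = 60, 0.2, 0.25

inhibitory = rng.random(N) < h       # each source neuron is excitatory or inhibitory
exists = rng.random((N, N)) < m      # at most one synapse per (j -> i) pair
np.fill_diagonal(exists, False)      # no self-connections
V = np.where(exists, rng.random((N, N)), 0.0)
V[:, inhibitory] *= -1.0             # columns of inhibitory sources become negative

# synchronous update with a universal delay: every neuron reads the
# state of the whole net at time t to compute its state at t + delta
theta = 0.5
state = (rng.random(N) < 0.5).astype(float)
state = ((V @ state) > theta).astype(float)
```

Making the sign a property of the source neuron (a whole column of V) rather than of the individual synapse is what the parameter h, the fraction of inhibitory neurons, implies.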
The retina is composed of a variety of cell types including the photoreceptors, horizontal cells, bipolar cells, amacrine cells, and retina ganglion cells (for a review see Grüsser & Grüsser-Cornehls, 1976). Only the retina ganglion cells (RGCs) send axons to the brain of the animals. Therefore any visual information the brain may rely on is mediated by cells of this type.
According to their response properties the RGCs are usually divided into four classes, which here for simplicity are called R1, R2, R3, and R4. We have restricted our attention to the classes R2 and R3 because these form the majority (about 93%) of cells projecting to the tectum opticum, which is that area in the brain where recognition of prey objects is supposed to be centered.
The recognition process starts in the retina. The overall operation of the ganglion cells (R2, R3) and their precursors (photoreceptors, etc.) on some arbitrary visual scene can be decomposed into the following more primitive operational components:
(1) Let x(s, t) be any distribution of light in the visual field, where x denotes light intensity (we do not consider colored scenes), s = (s1, s2) some point in the visual field, and t is time.
(2) These ganglion cells do not respond to stationary, but only to transitory, illumination.
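Component (2) can be illustrated with a minimal sketch: a unit that, like these ganglion cells, responds to transitory rather than stationary illumination, modelled here simply as the absolute temporal difference of the intensity at one point of the visual field. This is purely illustrative, not the chapter's model.

```python
# A minimal sketch of component (2): the response is zero while the
# input is stationary and nonzero at onsets and offsets of
# illumination. Modelled as an absolute temporal difference.

def transient_response(intensity_over_time):
    return [abs(b - a) for a, b in zip(intensity_over_time, intensity_over_time[1:])]

resp = transient_response([0, 0, 1, 1, 1, 0])
# constant stretches contribute nothing; the onset and offset each
# produce a response
```
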
In attempting a mathematical description of neural tissue populations, known as neural nets, two approaches can be taken: the global and the microscopic. The global approach gives a phenomenological description of neural tissue populations. The microscopic, derived through appropriate simplifications, gives properties of the net from the properties of its constituents: the neurons, the connections and the synapses. While phenomenological theories appear to be easier to build and, overall, yield more results that are in agreement with experiments, it is impossible to build a realistic model of the brain without knowing the detailed functioning, interactions and interrelations of its constituents.
For the construction of an actual neural net machine, theories based on the properties of the fundamental constituents of the net will be more applicable to discovering the laws that govern the secrets of biological information processing. For instance, it is more relevant to simulate the activities of the brain from the underlying fundamental laws of nature. Examples taken from physics itself clarify this point.
From a philosophical point of view it is more appealing to derive the laws of thermodynamics from the statistical behaviour of the particles that make up the system than to introduce thermodynamics as an independent branch of physics.
Analogously, it is more significant to derive the properties of superconductivity from the actual properties of the electrons and lattice than from phenomenological reasoning.
Many features of the functional architecture of the mammalian visual system have been experimentally identified during the past 25 years. Among the most striking of these features is the presence of layers of orientation-selective cells – cells whose response to an edge or bar in the appropriate portion of visual field is sensitive to the local orientation of the input. These cells are organized in bands or ‘columns’ of cells of the same or similar orientation preference. The preferred orientation varies roughly monotonically, but with frequent breaks and reversals, as one traverses the cell layer. Orientation-selective cells, organized in this fashion, are found in cat, monkey, and other mammalian systems. In macaque monkey, they are present at birth.
I have found that several salient features of mammalian visual system architecture – including orientation-selective cells and columns – emerge in a multilayered network of cells whose connections develop, one layer at a time, according to a synaptic modification rule of Hebb type. The theoretical base is biologically plausible, none of the assumptions is specific to visual processing, no orientation preferences are specified to the system at any stage, and the features emerge even in the absence of environmental input.
The development of this system is discussed in detail, and references to experimental work are provided, in a series of three papers (Linsker, 1986a,b,c).
The fundamental aim of simulation of neural nets is a better understanding of the functioning of the nervous system. Because of the complexities, simplifying assumptions have to be introduced from the beginning. In many frequently used models these assumptions are:
Each model neuron can be in one of few discrete states (e.g. ‘on’, ‘off’, ‘refractory’).
The totality of interactions between neurons is treated summarily by specifying a few parameters, often just one ‘synaptic strength’, which is determined randomly, or by a deterministic algorithm, or by a combination of both.
The information processing by a neuron consists of comparing the value of a certain parameter, which is determined by incoming signals from other neurons, with some threshold. Essentially, when the value of this parameter exceeds the threshold, the neuron transits to another one of its possible states. A typical example for this parameter is the membrane potential at the axon hillock, which is determined by summing the action potentials impinging on this neuron during a certain time and whose value determines whether the neuron ‘fires’ or not.
The hypothesis that individual neurons can be described in a simple way makes the simulation of networks containing many model neurons possible. On the other hand, there are important parts of nervous systems, where model neurons simplified to the extent described above have little in common with reality.
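The modelling assumptions listed above condense into a few lines: a discrete-state model neuron sums its weighted incoming signals and switches ‘on’ when the sum exceeds its threshold. The weights and threshold below are illustrative values only.

```python
# A model neuron in the sense of the assumptions above: it compares the
# summed weighted input (e.g. the membrane potential at the axon
# hillock) with a threshold and transits to the 'firing' state when the
# threshold is exceeded. All values are illustrative.

def update_neuron(weights, inputs, threshold):
    """Return 1 ('fires') if the summed weighted input exceeds the threshold, else 0."""
    potential = sum(w * x for w, x in zip(weights, inputs))
    return 1 if potential > threshold else 0

# three presynaptic neurons, one synaptic strength each
fired = update_neuron([0.5, -0.2, 0.8], [1, 1, 1], threshold=0.9)   # 1.1 > 0.9
silent = update_neuron([0.5, -0.2, 0.8], [1, 1, 0], threshold=0.9)  # 0.3 <= 0.9
```

It is exactly this radical simplification of the individual neuron that makes the simulation of networks with many such units feasible; the final paragraph above notes where it breaks down.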