Feeding is the means by which an organism acquires the materials for building, maintaining and moving the vehicle that carries the next generation. Since nutrition is the main requirement of all living systems, feeding, preceded by food-seeking behaviour, is a necessity of life. Organisms must feed to live, and they must also work to feed. Evolution has shaped the eating and drinking behaviour of all species, selecting for the mechanisms that motivate an organism to seek, select and ingest nutrients. There are two fundamental aspects of feeding: energy balance and diet selection. Energy balance deals with how much animals eat in relation to their energy expenditures, whilst diet selection deals with the mechanisms that allow omnivores to choose the appropriate nutrients. This chapter will deal with the former; Chapter 5 will be concerned with the latter.
Organisms regulate their nutritional intake according to short- and long-term energy needs. The mechanism responsible for this regulation, which involves the maintenance of a constant, optimal, internal environment, is called homeostasis. This concept was introduced by Claude Bernard (1879) in his discussion of ‘le milieu intérieur’ and the necessity of the organism keeping its internal environment at a constant, optimal level. (For a detailed discussion of the key components of homeostatic mechanisms and their character of operation, see Schulze (1995).) An example that illustrates the concept of homeostasis is that of the thermostat. After it is set for a certain temperature, it reacts to deviations from that temperature (the set point) by changing the environment so that such deviations are eliminated.
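The thermostat analogy can be stated very compactly in code. The sketch below is purely illustrative: the constants and function names are invented for this example, and physiological homeostatic mechanisms are of course far more elaborate than a two-state switch.

```python
# A minimal sketch of a set-point (homeostatic) feedback loop, using
# the thermostat example from the text. All names and constants are
# illustrative; real physiological regulators are far more complex.

SET_POINT = 20.0      # desired temperature (degrees C): the set point
DEADBAND = 0.5        # tolerated deviation before the system acts

def thermostat_step(temperature, heater_on):
    """Return the new heater state given the current temperature."""
    if temperature < SET_POINT - DEADBAND:
        return True            # too cold: counteract by heating
    if temperature > SET_POINT + DEADBAND:
        return False           # too warm: stop heating
    return heater_on           # within tolerance: no change needed

# Simulate: the room cools passively and warms when the heater is on.
temperature, heater_on = 15.0, False
for _ in range(50):
    heater_on = thermostat_step(temperature, heater_on)
    temperature += 0.8 if heater_on else -0.3
# temperature now hovers around SET_POINT: deviations are eliminated.
```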
Axonal guidance cues can influence growth cone behaviour by binding to specific cell surface receptors on growth cones. Activation of such receptors produces intracellular signalling events, which lead to changes in growth cone behaviour. The principal target of intracellular signalling pathways in growth cones is the cytoskeleton – the ‘final common path of action’ of guidance cues because it underlies growth cone motility and neurite extension – although other growth cone functions, such as plasma membrane growth, may also be targeted. Our understanding of the intracellular signalling pathways in growth cones is still fairly rudimentary. There are three main reasons for this. Firstly, although several families of guidance molecules have been identified and characterised, the growth cone receptors for these molecules, and thus their intracellular signalling mechanisms, are only now being discovered (see Chapter 3). Secondly, most experiments on signalling mechanisms have been done in vitro and focus on assaying neurite outgrowth rather than the pathfinding behaviour of growth cones, such as growth cone turning in ‘choice’ assays (see Chapter 3), which more closely resemble growth cone pathfinding in vivo. Thirdly, there is considerable reliance on the use of pharmacological reagents to inhibit or disrupt signalling pathways, an approach which is often compromised by lack of specificity or the unavailability of more than one reagent with a different mechanism of action.
Several methodological advances have had a significant impact on our understanding of the signalling events within neurites and growth cones.
Motivational influences that are specifically social constitute the subject of this chapter. Social motives involve activities that affect interactions among organisms of the same species as well as organisms of other species. I will deal with only two issues in this short chapter: the formation and maintenance of social bonds between individuals, and the performance of pro-social and altruistic acts. Both the functional and the proximate causal aspects of these phenomena will be considered in the analysis. The first and foremost social motive concerns the formation of the bond between an infant and its primary caretaker, usually the mother. The material in Chapter 3 deals with this issue from the perspective of the mother, whilst the material in this chapter complements it through an analysis of processes in the infant. The attachment of a human infant to its caretaker has biological roots. Both babies and adults are programmed by evolution to become attached in certain ways because the former are dependent upon the latter for survival. ‘Attachment behaviours’ refers to a broad class of behaviours that keep the infant in close proximity to an attachment figure. These behaviours include crying, clinging and approaching, as well as others produced when the infant is separated from the attachment figure.
Mammals live in a diverse array of habitats and social structures. The basic unit of the family is the mother and infant. However, our examination of material in Chapter 3 suggests that other conspecifics may be involved to some extent in infant caregiving. These caregivers can include the father, siblings or peers, or combinations of these conspecifics.
Drinking is the means by which an organism acquires the fluids necessary for the normal functioning of the cells in the body. Water is the medium through which the chemical processes of the body operate. It is the largest component of the body and its volume must be defended within narrow limits. ‘The proportion of water to lean body mass (the body without fat) is essentially constant at 70%’ (Rolls & Rolls, 1982). The energy processes of the cell occur within a fluid medium. In Chapter 4, it was noted that feeding, preceded by food-seeking behaviour, is a necessity of life. Similarly, drinking, preceded by water-seeking behaviour, is also a necessity of life. Living organisms are endowed with mechanisms that have been selected for, and which cause them to be motivated to seek and ingest water when the water balance of their internal environment is disturbed. The animal's nervous system is supplied with information from a sample of body fluids, and drinking decisions are based upon the state of this sample. Just as a thermostat samples temperature at one site in a room, it is assumed that the drinking mechanisms sample the fluid environment at one or more sites.
Fluid regulation in living organisms represents a balance between intake and excretion of water. Each side of the equation consists of a ‘regulated’ and an ‘unregulated’ component. The regulated component represents factors which act specifically to maintain body fluid homeostasis (water balance). The primary factors that regulate water balance are thirst and pituitary secretion of the anti-diuretic hormone (ADH), which is also known as vasopressin (Verbalis, 1990).
When animals are exposed to aversive stimuli, particularly painful ones, they are likely to respond by either withdrawing from or attacking the source of the stimuli. ‘Pain is an anatomically developed sensory system genetically differentiated for survival and the defence of the body. Responses to painful stimuli either involve the skeletal musculature or are internal but they are experimentally quantified as escape and avoidance’ (Le Magnen, 1998, p. 4). Both types of response serve adaptive functions. In many species, specific escape mechanisms have evolved for dealing with physical danger. One of the simplest is the withdrawal reflex that removes the organism from damaging stimuli. When specific taste receptors come into contact with bitter substances, a spitting reflex is triggered that protects the organism from ingesting the possibly toxic substances generally associated with a bitter taste. In Chapter 5 the mechanisms by which rats learn to avoid smells and tastes that have previously been followed by illness were examined. Animals will also react to aversive stimuli with attack behaviour, particularly when escape is difficult. Such a reaction is particularly evident in feral animals.
TWO-FACTOR THEORY OF AVOIDANCE BEHAVIOUR
The standard situation or apparatus used to study responses to aversive stimuli is one in which a rat is placed in a shuttle box – a long narrow box divided in half by a partition. The floor of the box is a grid of steel rods that can deliver a painful shock when activated by electricity. The rules of the experiment are as follows. The rat has a few seconds to cross the barrier to the other side of the box.
The term ‘synapse’ was coined by the English physiologist Charles Sherrington from the Greek ‘to fasten or clasp together’ (Sherrington, 1947). He inferred the existence of a gap between neurons that contact each other from physiological experiments using the ganglionic blocking agent nicotine, which he painted on to peripheral autonomic ganglia. The synapse is a specialised region of close apposition between two cells where intercellular communication occurs (Burns & Augustine, 1995). Synapses form between neurons or between a neuron and an effector cell, such as a muscle or an endocrine cell. At the culmination of pathfinding, the growth cone must select a synaptic partner and form a synapse: a process known as synaptogenesis. In the case of most axonal growth cones, synaptogenesis requires the growth cone to stop growing and to transform into a presynaptic nerve terminal (reviewed in: Rees, 1978; Vaughn, 1989; Garrity & Zipursky, 1995; Haydon & Drapeau, 1995). Although there are presynaptic dendrites in the adult central nervous system, dendritic growth cones develop mainly into postsynaptic sites.
In this chapter, I will review our understanding of the early events in synaptogenesis, particularly the selection of a synaptic partner by the growth cone and the initial stages of the interaction between the growth cone and its synaptic partner leading to the differentiation of the synapse.
Natural images have been shown to demonstrate enormous structural redundancy; this finding has motivated the incorporation of statistical image models into computational theories of visual processing, producing a variety of candidate encoding strategies (Field, 1987; Sirovich and Kirby, 1987; Baddeley and Hancock, 1991). Many of these strategies effectively filter out predictable correlational structure so as to directly or indirectly reduce the dimensionality of the visual input. One advantage of such strategies is that if the image data can be encoded into a representation whose axes lie closer to the “natural” axes of the visual input, thresholding might produce a “sparse-distributed” representation, i.e. one which would show only sparse neural activity in response to an expected stimulus. Perhaps the best-documented support for such a strategy has come from work by D. J. Field (1987), who investigated the global 2-D amplitude spectra (averaged across all orientations) of an ensemble of natural images; he found that the amplitude typically falls off as the inverse of radial spatial frequency f, that is, the corresponding power spectra Ŝ(f) fall off as f⁻². If visual signals with these properties were to be encoded by a bank of spatial-frequency-selective mechanisms or “channels” whose spatial-frequency bandwidths are constant in octaves, the outputs of each channel (Field, 1987) should exhibit similar energies (and therefore similar r.m.s. contrasts, since the channel outputs are assumed to have zero mean). The advantage of this so-called “scale invariance” is that by thresholding these channel outputs, a visual system can easily discount the “expected” structure of natural scenes.
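Field's measurement is easy to reproduce in outline. The sketch below is illustrative only (the placeholder image, bin count and fitting details are choices made for the example, not Field's procedure): it computes the orientation-averaged amplitude spectrum of a greyscale image and fits the log-log slope, which for natural images should lie near −1, i.e. amplitude ∝ 1/f and power ∝ f⁻².

```python
# A sketch of Field's (1987) style of analysis: compute the 2-D
# amplitude spectrum of a greyscale image, average it across all
# orientations, and inspect the falloff with radial frequency f.
import numpy as np

def radial_amplitude_spectrum(image, n_bins=64):
    """Orientation-averaged amplitude spectrum of a 2-D array."""
    amplitude = np.abs(np.fft.fftshift(np.fft.fft2(image)))
    h, w = image.shape
    y, x = np.indices((h, w))
    r = np.hypot(y - h / 2, x - w / 2)           # radial frequency (pixels)
    bins = np.linspace(1, r.max(), n_bins + 1)   # skip the DC component
    idx = np.digitize(r.ravel(), bins)
    f = 0.5 * (bins[:-1] + bins[1:])             # bin centres
    amp = np.array([amplitude.ravel()[idx == i + 1].mean()
                    for i in range(n_bins)])
    return f, amp

# For a natural image, log(amp) vs log(f) should have slope near -1,
# i.e. amplitude ~ 1/f and power ~ f^-2.
image = np.random.rand(256, 256)   # placeholder: substitute a real photo
f, amp = radial_amplitude_spectrum(image)
slope = np.polyfit(np.log(f), np.log(amp), 1)[0]
print(f"log-log slope: {slope:.2f} (white noise ~ 0; natural images ~ -1)")
```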
It is an assumption of the neural computation community that the brain, as the most successful pattern recognition system, is a useful model for deriving efficient algorithms on computers. But how can a useful interaction between brain research and artificial object recognition be realized? We see two ways in which such interaction can go wrong. On the one hand, very detailed modelling of biological networks may lead to a disregard of the task solved in the brain area being modelled. On the other hand, the neural network community may lose credibility through very rough simplification of the functional entities of brain processing; this can result in simple functional entities being questionably labelled ‘neurons’ or ‘layers’ in order to feign biological plausibility. In our view, it is important to understand the brain as a tool solving a certain task, and therefore to understand the functional meaning of principles of cortical processing such as hierarchical processing, sparse coding, and the ordered arrangement of features. Some researchers (e.g., Barlow, 1961c; Palm, 1980; Földiák, 1990; Atick, 1992b; Olshausen and Field, 1997) have already made important steps in this direction by interpreting some of the above-mentioned principles in terms of information theory. Others (e.g. Hummel and Biederman, 1992; Lades et al., 1992) have tried to initiate an interaction between brain research and artificial object recognition by building efficient and biologically motivated object recognition systems. Following these two lines of research, we suggest looking at a functional level of biological processing and utilizing abstract principles of cortical processing in an artificial object recognition system.
The hippocampus is anatomically and neurophysiologically one of the best known structures of the mammalian brain (for a review, see Witter, 1993). Moreover, it plays a fundamental role in memory storage (for a review, see Squire, 1992), and has extensive connections, both incoming and outgoing, with many areas of the neocortex through the entorhinal cortex (Squire et al., 1989). Partly for these reasons, the hippocampus has been the object of several theoretical investigations and models. One of the most widespread functional hypotheses is that the hippocampus acts as a fast episodic memory, and has to perform cued retrieval to release information to the neocortex, in which memory is slowly reorganized into semantic structures (see, e.g. McClelland et al., 1995). The hippocampus may not be a permanent store for episodes; experimentally, there appears to be “hippocampal” forgetting (Zola-Morgan and Squire, 1990), even if the relation between typical forgetting times and the storage of the same information in neocortex is not known. In any case, it is likely that information is stored in the hippocampus, and a mechanism for retrieving it from the hippocampus is needed (Rolls, 1995).
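The idea of cued retrieval from a fast associative store is often illustrated with a toy autoassociative network. The sketch below is such a generic illustration (Hopfield-style Hebbian storage, with all sizes and parameters invented for the example); it is not the model developed in this chapter.

```python
# A toy illustration of cued retrieval in an autoassociative network
# (Hopfield-style, Hebbian storage). A degraded cue converges back to
# the stored "episode". Generic sketch only, not the chapter's model.
import numpy as np

rng = np.random.default_rng(0)
N, P = 200, 5
patterns = rng.choice([-1, 1], size=(P, N))     # stored "episodes"

# Hebbian weight matrix, with no self-connections.
W = (patterns.T @ patterns).astype(float) / N
np.fill_diagonal(W, 0.0)

# Cue: a degraded version of pattern 0 (25% of units flipped).
cue = patterns[0].astype(float)
flip = rng.choice(N, size=N // 4, replace=False)
cue[flip] *= -1

# Iterate the network dynamics until the state settles.
state = cue
for _ in range(10):
    state = np.sign(W @ state)
    state[state == 0] = 1

overlap = (state @ patterns[0]) / N
print(f"overlap with stored episode after retrieval: {overlap:.2f}")
```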
Here we introduce a model of the information flow in the hippocampal system, focusing on the role of the connections between entorhinal cortex, CA3 and CA1. We note that our model is independent of the functional hypotheses just described, and of the detailed implications of behavioural theories. We base the model on neuroanatomical and physiological evidence, though with some form of approximation to allow for an analytical development.
To interact successfully with the everyday objects that surround us, we must be able to recognise these objects under widely differing conditions, such as novel viewpoints or changes in retinal size and location. Only if we can do this correctly can we determine the behavioural significance of these objects and decide whether the sphere in front of us should, for example, be kicked or eaten. Similar, although often finer, discriminations are required in face recognition. You might be presented with the task of deciding which side of the aisle is reserved for the groom's family at your cousin's wedding – a problem of familiar versus unfamiliar categorisation. On the other hand, the faces may be familiar and the task becomes one of distinguishing family members, such as your aunt from your sister. Such decisions have clear social significance and are crucial in deciding how to interact with other people.
Quite how we succeed in recognising people's faces or indeed any other objects remains the subject of much debate. Theories for how we represent objects and ultimately solve object recognition abound. One suggestion is that we construct mental 3D models which we can manipulate in size and orientation until a match to the observed object is found in our repertoire of objects (Marr and Nishihara, 1978; Marr, 1982). Other theories also work on the assumption that we store libraries of objects, but at the level of innumerable, deformable outline sketches or “templates” that can be matched to edges and other features detected in the viewed object (Ullman, 1989; Yuille, 1991; Hinton et al., 1992).
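The template idea can be made concrete with a deliberately minimal sketch: slide a stored template over the image and score each location by normalised cross-correlation. The deformable, feature-based templates cited above are far more sophisticated; everything below (array sizes, the planted target, the scoring rule) is an invented illustration.

```python
# A bare-bones illustration of template matching: slide a small
# template over an image and score each location by normalised
# cross-correlation. Deformable templates, as cited in the text,
# are far more flexible; this only conveys the basic idea.
import numpy as np

def match_template(image, template):
    """Return the (row, col) and score of the best-matching location."""
    th, tw = template.shape
    t = (template - template.mean()) / (template.std() + 1e-12)
    best_score, best_pos = -np.inf, (0, 0)
    for r in range(image.shape[0] - th + 1):
        for c in range(image.shape[1] - tw + 1):
            patch = image[r:r + th, c:c + tw]
            p = (patch - patch.mean()) / (patch.std() + 1e-12)
            score = (p * t).mean()          # normalised correlation
            if score > best_score:
                best_score, best_pos = score, (r, c)
    return best_pos, best_score

# Plant the template in a noisy image and recover its location.
rng = np.random.default_rng(1)
image = rng.normal(size=(64, 64))
template = rng.normal(size=(8, 8))
image[20:28, 30:38] += 3.0 * template
print(match_template(image, template))   # expect ((20, 30), score near 1)
```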
Qualitative measures show that an existing artificial neural network can perform invariant object recognition. Quantification of the level of performance of cells within this network, however, is shown to be problematic.
In line with contemporary neurophysiological analyses (e.g. Optican and Richmond, 1987; Tovee et al., 1993), a simplistic form of Shannon's information theory was applied to this performance measurement task. However, the results obtained are shown not to be useful – the perfect reliability of artificial cell responses highlights the implicit decoding power of pure Shannon information theory.
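The difficulty described here can be seen in a few lines of code: for a perfectly reliable (deterministic and invertible) mapping from stimuli to responses, the mutual information equals the full stimulus entropy, no matter how scrambled, and hence how hard to decode, the response code is. This is an illustrative sketch only, not the chapter's actual measurement procedure.

```python
# Why pure Shannon information is "too powerful" a measure here:
# for a perfectly reliable (deterministic, invertible) cell, the
# mutual information I(S;R) equals the full stimulus entropy H(S),
# however scrambled and hard-to-decode the response code is.
import numpy as np

def mutual_information(joint):
    """I(S;R) in bits from a joint probability table p(s, r)."""
    ps = joint.sum(axis=1, keepdims=True)     # marginal p(s)
    pr = joint.sum(axis=0, keepdims=True)     # marginal p(r)
    nz = joint > 0
    return float(np.sum(joint[nz] * np.log2(joint[nz] / (ps @ pr)[nz])))

n_stimuli = 8                        # equiprobable stimuli: H(S) = 3 bits
perm = np.random.permutation(n_stimuli)

# Deterministic cell: stimulus s always evokes response perm[s].
joint = np.zeros((n_stimuli, n_stimuli))
joint[np.arange(n_stimuli), perm] = 1.0 / n_stimuli

print(mutual_information(joint))     # 3.0 bits, regardless of perm
```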
Refinement of the definition of cell performance in terms of usable Shannon information (Shannon information which is available in a “useful” form) leads to the development of two novel performance measures. First, a cell's “information trajectory” quantifies standard information-theoretic performance across a range of decoders of increasing complexity – information made available using simple decoding is weighted more strongly than information only available using more complex decoding. Second, the nature of the application (the task the network attempts to solve) is used to design a decoder of appropriate complexity, leading to an exceptionally simple and reliable information-theoretic measure. Comparison of the various measures' performance in the original problem domain shows the superiority of the second novel measure.
The chapter concludes with the observation that reliable application of Shannon's information theory requires close consideration of the form in which signals may be decoded – in short, not all measurable information may be usable information.
Introduction
This chapter discusses an approach to performance measurement using information theory in the context of a model of invariant object recognition. Each of these terms is discussed in turn in the following sections.
Stochastic resonance (SR) is a phenomenon whereby random fluctuations and noise can enhance the detectability and/or the coherence of a weak signal in certain nonlinear dynamical systems (see e.g. Moss et al. (1994a); Wiesenfeld and Moss (1995); Bulsara and Gammaitoni (1996) and references therein). There is growing evidence that SR may play a role in the extreme sensitivity exhibited by various sensory neurons (Longtin et al., 1991; Douglass et al., 1993; Bezrukov and Vodyanoy, 1995; Collins et al., 1996); it has also been suggested that SR could feature at higher levels of brain function, such as in the perceptual interpretation of ambiguous figures (Riani and Simonotto, 1994; Simonotto et al., 1997; Bressloff and Roper, 1998). In the language of information theory, the main topic of this volume, SR is a method for optimising the Shannon information transfer rate (transinformation) of a memoryless channel (Heneghan et al., 1996).
Most studies of SR have been concerned with external noise, that is, a stochastic forcing term deliberately added to a non-linear system and controllable by the experimentalist. The archetypal system is a damped particle moving in a double-well potential. If the particle is driven by a weak periodic force, i.e. one in which the forcing amplitude is less than the barrier height, it will be confined to a single well and will oscillate about the minimum. However, if the particle is driven by weak noise it will switch between wells at a transition rate which depends exponentially on the noise strength, D (imagine cooking popcorn on a low heat in a large pan).
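The behaviour just described is straightforward to simulate. The following sketch uses Euler–Maruyama integration of the overdamped double-well system dx/dt = x − x³ + A sin(ωt) + √(2D) ξ(t); all parameter values are illustrative choices for the example, not values taken from the chapter.

```python
# A minimal Euler-Maruyama simulation of the overdamped double-well
# system described above: dx/dt = x - x**3 + A*sin(w*t) + sqrt(2D)*noise,
# with potential V(x) = x**4/4 - x**2/2 (wells at x = +/-1, barrier 1/4).
# The forcing A is subthreshold, so without noise the particle stays in
# one well; with moderate noise D its hopping can lock to the forcing.
import numpy as np

def simulate(D, A=0.1, w=2 * np.pi * 0.01, dt=0.01, steps=200_000, seed=0):
    rng = np.random.default_rng(seed)
    kicks = rng.normal(size=steps) * np.sqrt(2 * D * dt)
    x = np.empty(steps)
    x[0] = -1.0                                   # start in the left well
    for i in range(1, steps):
        drift = x[i-1] - x[i-1]**3 + A * np.sin(w * i * dt)
        x[i] = x[i-1] + drift * dt + kicks[i]
    return x

for D in (0.02, 0.12, 0.6):
    hops = np.sum(np.diff(np.sign(simulate(D))) != 0)
    print(f"D = {D:4.2f}: {hops} well-to-well crossings")
# Too little noise: almost no hopping. Too much: random hopping that
# ignores the forcing. In between lies the stochastic resonance regime,
# where hopping synchronises with the weak periodic signal.
```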
The final part consists of four chapters: two using information theory techniques to model the hippocampus and associated systems; one on the phenomenon of stochastic resonance in neuronal models; and one exploring the idea that cortical maps can be understood within the information maximisation framework.
The simple “hippocampus as memory system” metaphor has led the hippocampus to become one of the most modelled areas of the brain. These models extend from the very abstract (the hidden layer of a back-propagation network labelled “hippocampus”) to simulations in which almost every known detail of the anatomy and physiology is incorporated. The first of the two chapters on hippocampal modelling (Chapter 14) starts with a model of intermediate complexity. The model is simple enough to allow analytic results, whilst rich enough for its parameters to be related to known anatomy and physiology. In particular, Schultz et al.'s framework allows an investigation of the effects of changing the topography of connections between CA3 and CA1 (two subparts of the hippocampus), and of different forms of representation in CA3 (binary or graded activity).
The second chapter uses similar methods, but this time working on a model that includes more of the hippocampal circuitry: specifically the entorhinal cortex and the associated perforant pathways. This inevitably introduces more parameters and makes the model more complex, but by use of methods to reduce the dimensionality of the saddle-point equations, numerical results have been obtained.
Stochastic resonance is the initially counterintuitive phenomenon that the signal-to-noise ratio (and hence information transmission) of a non-linear system can sometimes be improved by the addition of noise.
The first part concentrates on how information theory can give us insight into low-level vision, an area with many characteristics that make it particularly appropriate for the application of such techniques. Chapter 2, by Burton, is a historical review of the application of information theory to understanding the retina and early cortical areas. The rather impressive matches of a number of models to data are described, together with the different emphases placed by researchers on dealing with noise, removing correlations, and having representations that are amenable to later processing.
Information theory only really works if information transmission is maximised subject to some constraint. In Chapter 3 Laughlin et al. explore the explanatory power of considering one very important constraint: the use of energy. This is conceptually very neat, since there is a universal biological unit of currency, the ATP molecule, allowing the costs of various neuronal transduction processes to be related to other important costs to an insect, such as the amount of energy required to fly. There is a wealth of ideas here, and the insect is an ideal animal with which to explore them, given our good knowledge of its physiology and the relative ease of collecting the large amounts of physiological data needed to estimate the statistics required for information-theoretic descriptions.
To apply the concepts of information theory, at a very minimum one needs a reasonable model of the input statistics. In vision, the de facto model rests on the observation that the power spectra of natural images have a characteristic structure: the power at a given spatial frequency is proportional to the inverse square of that frequency.
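Such a statistical model can also be used generatively: shaping white noise in the Fourier domain produces images with the stated 1/f² power spectrum. The sketch below is an illustrative construction (the image size, normalisation and use of independent random phases are choices made for the example), not a procedure described in this volume.

```python
# A sketch of using the "power ~ 1/f^2" model of input statistics
# generatively: shape noise in the Fourier domain so that power
# falls off as f^-2 (i.e. amplitude as 1/f).
import numpy as np

def synthetic_natural_image(size=256, seed=0):
    rng = np.random.default_rng(seed)
    fy = np.fft.fftfreq(size)[:, None]
    fx = np.fft.fftfreq(size)[None, :]
    f = np.hypot(fy, fx)                # radial spatial frequency
    f[0, 0] = 1.0                       # avoid division by zero at DC
    amplitude = 1.0 / f                 # amplitude ~ 1/f => power ~ f^-2
    phase = np.exp(2j * np.pi * rng.random((size, size)))
    image = np.fft.ifft2(amplitude * phase).real
    return (image - image.mean()) / image.std()

img = synthetic_natural_image()
# img now has (approximately) the second-order statistics of a natural
# scene, and can serve as input to the encoding models discussed here.
```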
Information theory makes two main contributions to the study of neural systems: first, it provides a theoretically clear and task-independent objective function for unsupervised learning; second, it serves as an analytical tool for evaluating the performance of a given model or a biological neural system. We must keep in mind, however, that an information measure in itself is not some kind of Holy Grail; the evolutionary success of an agent eventually comes down to its performance or fitness in a specific ecological niche. A frog may be exposed to the information in the patterns on the pages of a book, the pattern of clouds or a huge number of other possible pattern configurations in its sensory input stream, but the vast majority of this information is irrelevant to it, and it will do better concentrating just on small dark moving spots and large dark blobs in its proximity. The nervous system of higher animals, e.g. mammals, however, has a much more ambitious goal, which allows it to operate in a much more flexible way in environments that are completely novel and unexpected on an evolutionary time scale. That goal is to build a model of the sensory environment. This model is still subject to biological constraints, such as the nature and resolution of its sensors of physical signals, but the ways in which it can combine these signals become gradually more sophisticated.
Learning and using a new technique always takes time. Even if the question initially seems very straightforward, technicalities inevitably intrude. Therefore, before a researcher decides to use the methods information theory provides, it is worth finding out whether this set of tools is appropriate for the task in hand.
In this chapter I will therefore provide only a few important formulae and no rigorous mathematical proofs (Cover and Thomas (1991) is excellent in this respect). Neither will I provide simple “how to” recipes (for the psychologist, even after nearly 40 years, Attneave (1959) is still a good introduction). Instead, I hope to provide a non-mathematical introduction to the basic concepts and, using examples from the literature, to show the kind of questions information theory can be used to address. If, after reading this and the following chapters, the reader decides that the methods are inappropriate, he will have saved time. If, on the other hand, the methods seem potentially useful, it is hoped that this chapter provides a simple overview that will ease the growing pains.
What Is Information Theory?
Information theory was invented by Claude Shannon and introduced in his classic book The Mathematical Theory of Communication (Shannon and Weaver, 1949). What then is information theory? To quote three previous authors in historical order:
The “amount of information” is exactly the same concept that we talked about for years under the name “variance”. [Miller, 1956]
The technical meaning of “information” is not radically different from the everyday meaning; it is merely more precise.