NEURONAL NOISE
The part of a neuronal response that cannot apparently be accounted for by the stimulus. Part of this factor may arise from truly random effects (such as stochastic fluctuations in neuronal channels), and part from uncontrolled, but non-random, effects.

*Department of Brain and Cognitive Sciences, Meliora Hall, University of Rochester, Rochester, New York 14627, USA. ‡Gatsby Computational Neuroscience Unit, Alexandra House, 17 Queen Square, London WC1N 3AR, UK. §Department of Computer Science, University of Toronto, Toronto, Ontario M5S 1A4, Canada. e-mails: alex@bcs.rochester.edu; dayan@gatsby.ucl.ac.uk; zemel@cs.toronto.edu

A fundamental problem in computational neuroscience concerns how information is encoded by the neural architecture of the brain. What are the units of computation and how is information represented at the neural level? An important part of the answers to these questions is that individual elements of information are encoded not by single cells, but rather by populations or clusters of cells. This encoding strategy is known as population coding, and turns out to be common throughout the nervous system.

The 'place cells' that have been described in rats are a striking example of such a code. These neurons are believed to encode the location of an animal with respect to a world-centred reference frame in environments such as small mazes. Each cell is characterized by a 'place field', which is a small confined region of the maze that, when occupied by the rat, triggers a response from the cell1,2. Place fields within the hippocampus are usually distributed in such a way as to cover all locations in the maze, but with considerable spatial overlap between the fields. As a result, a large population of cells will respond to any given location. Visual features, such as orientation, colour, direction of motion, depth and many others, are also encoded with population codes in visual cortical areas3,4. Similarly, motor commands in the motor cortex also rely on population codes5, as do the nervous systems of invertebrates such as leeches or crickets6,7.

One key property of the population coding strategy is that it is robust — damage to a single cell will not have a catastrophic effect on the encoded representation because the information is encoded across many cells. However, population codes turn out to have other computationally desirable properties, such as mechanisms for NOISE removal, short-term memory and the instantiation of complex, NONLINEAR FUNCTIONS. Understanding the coding and computational properties of population codes has therefore become one of the main goals of computational neuroscience. We begin this review by describing the standard model of population coding. We then consider recent work that highlights important computations closely associated with population codes, before finally considering extensions to the standard model suggested by studies into more complex inferences.

The standard model
We will illustrate the properties of the standard model by considering the cells in visual area MT of the monkey. These cells respond to the direction of visual movement within a small area of the visual field.

The typical response of a cell in area MT to a given motion stimulus can be decomposed into two terms. The first term is an average response, which typically takes the form of a GAUSSIAN FUNCTION of the direction of motion. The second term is a noise term, and its value changes each time the stimulus is presented. Note that in this article we will refer to the 'response' to a stimulus as being the number of action potentials (spikes) per second measured over a few hundred milliseconds during the presentation of the stimulus. This measurement is also known as the spike count rate or the response […]
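To make the standard model concrete, the sketch below simulates such a population: each cell's average response is a Gaussian function of motion direction, and trial-to-trial variability is modelled as Poisson spike counts. The cell count (64), peak rate (40 spikes s⁻¹) and tuning width (40°) are illustrative assumptions, not values taken from the text.

```python
import numpy as np

def gaussian_tuning(s, preferred, peak=40.0, width=40.0):
    """Mean response f_i(s): a Gaussian function of direction (degrees)."""
    d = (s - preferred + 180.0) % 360.0 - 180.0   # wrapped angular difference
    return peak * np.exp(-0.5 * (d / width) ** 2)

rng = np.random.default_rng(0)
preferred = np.linspace(0.0, 360.0, 64, endpoint=False)  # 64 cells tile direction

s_true = 90.0                               # stimulus direction (deg)
mean = gaussian_tuning(s_true, preferred)   # first term: the average response
r = rng.poisson(mean)                       # second term adds noise: counts vary per trial

# r is a noisy "hill" of activity peaking near the cells tuned to ~90 deg.
```

On each simulated trial `r` differs, which is exactly the variability that makes decoding a non-trivial statistical problem.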
SYMMETRIC LATERAL CONNECTIONS
Lateral connections are formed between neurons at the same hierarchical level. For instance, the connections between cortical neurons in the same area and same layer are said to be lateral. Lateral connections are symmetric if any connection from neuron a to neuron b is matched by an identical connection from neuron b to neuron a.

[…] strategies lead to problems with noise, robustness and the sheer number of cells required. However, coding with many cells is often wasteful, requires complicated decoding computations, and has problems in cases such as transparency, when many directions of motion coexist at the same point in visual space. More recently, attention has turned to other aspects of population codes that are inspired by the idea of connections between population-coding units. These ideas are discussed in the following section.

Neural decoding
As we saw above, the variability of the neuronal responses to a given stimulus is the central obstacle for decoding methods. Sophisticated methods such as ML may form accurate estimates in the face of such noise, but one might wonder how such complex computations could be carried out in the brain. A possibility is for the population itself to actively remove some of the noise in its response.

Recently, Deneve et al.14 have shown how optimal ML estimation can be done with a biologically plausible neural network. Their neural network is composed of a population of units with bell-shaped tuning curves (FIG. 2). Each unit uses a nonlinear activation […] maximum likelihood.

Note that if the hill were always to stabilize at the same position (say 90°) for any initial noisy activities, the network would be a poor estimator. Indeed, this would entail that the network always estimate the direction to be 90° whether the true direction was in fact 35°, 95° or any other value. It is therefore essential that smooth hills centred on any point on the neuronal array be stable activity states, as, in most cases, any value of the stimulus s is possible. Networks with this property are called line (or surface) attractor networks. They are closely related to more conventional memory attractor networks, except that, instead of having separated discrete attractors, their attractors are continuous in this space of activities. Interestingly, the same architecture has been used to model short-term memory circuits for continuous variables such as the position of the head with respect to the environment19, and similar circuits have been investigated as forming the basis of the neural integrator in the oculomotor system20.

An important issue is whether the network used by Deneve et al. is biologically plausible. At first glance the answer seems to be that it is not. For example, the tuning curves of the units differ only in their preferred direction, in contrast to the actual neurons in area MT, area MST and elsewhere, whose tuning curves also differ in their width and maximum firing rate. Moreover, […]

Figure 2 | A neural implementation of a maximum likelihood estimator. [Upper graph: activity (spikes s⁻¹) plotted against preferred direction (deg).] The input layer (bottom), as in FIG. 1a,b, consists of 64 units with bell-shaped tuning curves whose activities constitute a noisy hill. This noisy hill is transmitted to the output layer by a set of feedforward connections. The output layer forms a recurrent network with lateral connections between units (we show only one representative set of connections and only nine of the 64 cells). The weights in the lateral connections are determined such that, in response to a noisy hill, the activity in the output layer converges over time onto a smooth hill of activity (upper graph). In essence, the output layer fits a smooth hill through the noisy hill, just like maximum likelihood (FIG. 1d). Deneve et al.14 have shown that, with proper choice of weights, the network is indeed an exact implementation of, or a close approximation to, a maximum likelihood estimator. The network can also be thought of as an optimal nonlinear noise filter, as it essentially removes the noise from the noisy hill.
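The hill-smoothing behaviour described in the Figure 2 caption can be sketched as follows. This is a simplified stand-in for the network of Deneve et al., not their exact model: lateral weights are a symmetric Gaussian of the difference between preferred directions, and a squaring nonlinearity with divisive normalization (plus an explicit rescaling) keeps the hill bounded. All parameter values are assumptions chosen for illustration.

```python
import numpy as np

n = 64
pref = np.linspace(0.0, 360.0, n, endpoint=False)

# Symmetric lateral connections: strength depends only on the
# wrapped difference between the two units' preferred directions.
d = (pref[:, None] - pref[None, :] + 180.0) % 360.0 - 180.0
W = np.exp(-0.5 * (d / 40.0) ** 2)

def relax(activity, steps=20, mu=0.01):
    """Iterate the recurrent dynamics: pool through W, then apply a
    squaring nonlinearity with divisive normalization."""
    o = activity.copy()
    for _ in range(steps):
        u = W @ o
        o = u ** 2 / (1.0 + mu * np.sum(u ** 2))
        o = o / o.max()   # rescale so the hill stays bounded in this sketch
    return o

rng = np.random.default_rng(1)
true_dir = 135.0
hill = np.exp(-0.5 * (((pref - true_dir + 180.0) % 360.0 - 180.0) / 40.0) ** 2)
noisy = np.clip(hill + 0.3 * rng.standard_normal(n), 0.0, None)

smooth = relax(noisy)                 # converges to a smooth hill of activity
estimate = pref[np.argmax(smooth)]   # peak of the smooth hill = direction estimate
```

Because the weight matrix depends only on differences between preferred directions, the dynamics treat every direction equivalently, which is what allows smooth hills centred anywhere on the array to be (approximately) stable states, as a line attractor requires.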
[…]taining many transparent layers of motion in all relevant directions. However, as the standard population-coding model is incapable of dealing with transparent motion correctly, it would also fail to cope with uncertainty coded in this way. In fact, the standard model cannot deal with any coding for uncertainty because its encoding model ignores uncertainty.

Uncertainty also arises in other ways. For instance, imagine seeing a very low-contrast moving stimulus (such as a black taxicab at night), the details of whose movement are hard to discern. In this case, the population code might be expected to capture uncertainty about the direction of motion in such a way that it could be resolved by integrating information either over long periods of observation, or from other sources, such as audition.

The standard population-coding model cannot represent motion transparency and motion uncertainty because of its implicit assumption that the whole population code is involved in representing just one direction of motion, rather than two, or potentially many. The neurophysiological data indicating that the activity of MT cells to transparency may resemble the average of their responses to single stimuli suggest that EQN 1 should be changed to:

r_i = α ∫_s g(s) f_i(s) ds + n    (5)

In EQN 5, g(s) is a function that indicates what motions are present or consistent with the image, and α is a constant (see below). Here, the motions in the image are convolved with the tuning function of the cell to give its output4,43,44. The breadth of the population activity conveys information about multiplicity or uncertainty. Recent results are consistent with this form of encoding model45,46, but it is possible that a more nonlinear combination method might offer a better model than averaging47.

Unfortunately, although this encoding model is straightforward, decoding and other computations are not. In the standard population-coding model, the presence of noise means that only a probability density over the motion direction, s, can be extracted from the responses, r. In the model based on EQN 5, the presence of noise means that only a probability density over the whole collection of motions, g(s), can be extracted. Of course, in the same way that a single value of s can be chosen in the standard case, a single function, g(s) (or even a single aspect of it, such as the two motions that are most consistent with it), can be chosen in the new case4,44. So, although we understand how to encode uncertainty with population codes, much work remains to be done to develop a theory of computation with these new codes, akin to the basis function theory for the standard population codes described above.

The multiplier α in EQN 5 deserves comment, as there is a controversy about the semantics of the decoded motions, g(s). This function can describe multiple transparent motions that are simultaneously present, or a single motion of uncertain direction. The overall level of activity in the population can provide information differentiating between these interpretations, for example if the number of motions is proportional to the total activity. Mechanisms to implement this type of control through manipulations of α have yet to be defined.

FULL-FIELD GRATING
A grating is a visual stimulus consisting of alternating light and dark bars, like the stripes on the United States flag. A full-field grating is a very wide grating that occupies the entire visual field.

Discussion
The theory and analysis of population codes are steadily maturing. In the narrowest sense, it is obvious that the brain must represent aspects of the world using more than just one cell, for reasons of robustness to noise and neuronal mortality. Accordingly, much of the early work on population codes was devoted to understanding how information is encoded, and how it might be simply decoded. As we have seen, more recent work has concentrated on more compelling computational properties of these codes. In particular, recurrent connections among the cells concerned can implement line attractor networks that can perform maximum likelihood noise removal; and population codes can provide a set of basis functions, such that complex, nonlinear functions can be represented simply as a linear combination of the output activities of the cells. Finally, we have considered an extension to the standard model of population encoding and decoding, in which the overall shape and magnitude of the activity among the population is used to convey information about multiple features (as in motion transparency) or feature uncertainty (as in the aperture problem).

We have focused here on population codes that are observed to be quite regular, with features such as gaussian-shaped tuning functions. Most of the theory we have discussed is also applicable to population codes using other tuning functions, such as sigmoidal, linear-rectified, or almost any other smooth nonlinear tuning curves. It also generalizes to codes that do not seem to be as regular as those for motion processing or space, such as the cells in the inferotemporal cortex of monkeys that are involved in representing faces48,49. These 'face cells' share many characteristics with population codes; however, we do not know what the natural coordinates might be (if, indeed, there are any) for the representation of faces. We can still build decoding models from observations of the activities of the cells, but the computations involving memory and noise removal are harder to understand.

Many open issues remain, including the integration of the recurrent models that remove noise optimally (assuming that there is only one stimulus) with the models allowing multiple stimuli. The curse of dimensionality that arises when considering that cells really code for multiple dimensions of stimulus features must also be addressed. Most importantly, richer models of computation with population codes are required.

Links
FURTHER INFORMATION Computational neuroscience | Gatsby computational neuroscience unit | Motion perception | Alex Pouget's web page | Peter Dayan's web page | Richard Zemel's web page
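As a numerical sketch of the encoding model of EQN 5 (ignoring the noise term n), the fragment below convolves a hypothetical set of motions g(s) with Gaussian tuning functions: a single motion yields a narrow population profile, whereas two transparent motions yield a broader one, illustrating how the breadth of activity can convey multiplicity or uncertainty. The cell count, tuning width and α are illustrative assumptions.

```python
import numpy as np

n = 64
pref = np.linspace(0.0, 360.0, n, endpoint=False)
s = np.linspace(0.0, 360.0, 360, endpoint=False)   # grid over direction (deg)
ds = s[1] - s[0]

def f(pref_i):
    """Tuning function f_i(s) of the cell with preferred direction pref_i."""
    d = (s - pref_i + 180.0) % 360.0 - 180.0
    return np.exp(-0.5 * (d / 30.0) ** 2)

def encode(g, alpha=1.0):
    """EQN 5 without noise: r_i = alpha * integral of g(s) f_i(s) ds."""
    return np.array([alpha * np.sum(g * f(p)) * ds for p in pref])

def delta(direction):
    """g(s) with unit mass concentrated at one motion direction."""
    g = np.zeros_like(s)
    g[int(direction / ds) % len(s)] = 1.0 / ds
    return g

r_single = encode(delta(90.0))                               # one motion
r_transparent = encode(0.5 * (delta(60.0) + delta(120.0)))   # two motions

def width(r):
    """Crude breadth measure: number of cells above half maximum."""
    return int(np.sum(r > 0.5 * r.max()))
```

Here `width(r_transparent)` exceeds `width(r_single)`: the transparent stimulus spreads activity across more of the population, which is the sense in which breadth encodes multiplicity.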
1. Bair, W. Spike timing in the mammalian visual system. Curr. Opin. Neurobiol. 9, 447–453 (1999).
2. Borst, A. & Theunissen, F. E. Information theory and neural coding. Nature Neurosci. 2, 947–957 (1999).
3. Usrey, W. & Reid, R. Synchronous activity in the visual system. Annu. Rev. Physiol. 61, 435–456 (1999).
4. Zemel, R., Dayan, P. & Pouget, A. Probabilistic interpretation of population codes. Neural Comput. 10, 403–430 (1998).
5. Tolhurst, D., Movshon, J. & Dean, A. The statistical reliability of signals in single neurons in cat and monkey visual cortex. Vision Res. 23, 775–785 (1982).
6. Földiak, P. in Computation and Neural Systems (eds Eeckman, F. & Bower, J.) 55–60 (Kluwer Academic Publishers, Norwell, Massachusetts, 1993).
7. Salinas, E. & Abbott, L. Vector reconstruction from firing rate. J. Comput. Neurosci. 1, 89–108 (1994).
8. Sanger, T. Probability density estimation for the interpretation of neural population codes. J. Neurophysiol. 76, 2790–2793 (1996).
9. Zhang, K., Ginzburg, I., McNaughton, B. & Sejnowski, T. Interpreting neuronal population activity by reconstruction: unified framework with application to hippocampal place cells. J. Neurophysiol. 79, 1017–1044 (1998).
10. Cox, D. & Hinkley, D. Theoretical statistics (Chapman and Hall, London, 1974).
11. Ferguson, T. Mathematical statistics: a decision theoretic approach (Academic, New York, 1967).
12. Paradiso, M. A theory of the use of visual orientation information which exploits the columnar structure of striate cortex. Biol. Cybernet. 58, 35–49 (1988).
    A pioneering study of the statistical properties of population codes, including the first use of Bayesian techniques to read out and analyse population codes.
13. Seung, H. & Sompolinsky, H. Simple model for reading neuronal population codes. Proc. Natl Acad. Sci. USA 90, 10749–10753 (1993).
14. Deneve, S., Latham, P. & Pouget, A. Reading population codes: a neural implementation of ideal observers. Nature Neurosci. 2, 740–745 (1999).
    Shows how a recurrent network of units with bell-shaped tuning curves can be wired to implement a close approximation to a maximum likelihood estimator. Maximum likelihood estimation is widely used in psychophysics to analyse human performance in simple perceptual tasks in a class of model known as 'ideal observer analysis'.
15. Georgopoulos, A., Kalaska, J. & Caminiti, R. On the relations between the direction of two-dimensional arm movements and cell discharge in primate motor cortex. J. Neurosci. 2, 1527–1537 (1982).
16. Pouget, A., Deneve, S., Ducom, J. & Latham, P. Narrow vs wide tuning curves: what's best for a population code? Neural Comput. 11, 85–90 (1999).
17. Pouget, A. & Thorpe, S. Connectionist model of orientation identification. Connect. Sci. 3, 127–142 (1991).
18. Regan, D. & Beverley, K. Postadaptation orientation discrimination. J. Opt. Soc. Am. A 2, 147–155 (1985).
19. Zhang, K. Representation of spatial orientation by the intrinsic dynamics of the head-direction cell ensemble: a theory. J. Neurosci. 16, 2112–2126 (1996).
    Head direction cells in rats encode the heading direction of the rat in world-centred coordinates. This internal compass is calibrated with sensory cues, maintained in the absence of these cues and updated after each movement of the head. This model shows how attractor networks of neurons with bell-shaped tuning curves to head direction can be wired to account for these properties.
20. Seung, H. How the brain keeps the eyes still. Proc. Natl Acad. Sci. USA 93, 13339–13344 (1996).
21. Poggio, T. A theory of how the brain might work. Cold Spring Harbor Symp. Quant. Biol. 55, 899–910 (1990).
    Introduces the idea that many computations in the brain can be formalized in terms of nonlinear mappings, and as such can be solved with population codes computing basis functions. Although completely living up to the title would be a tall order, this remains a most interesting proposal. A wide range of available neurophysiological data can be easily understood within this framework.
22. Pouget, A. & Sejnowski, T. Spatial transformations in the parietal cortex using basis functions. J. Cogn. Neurosci. 9, 222–237 (1997).
23. Andersen, R., Essick, G. & Siegel, R. Encoding of spatial location by posterior parietal neurons. Science 230, 456–458 (1985).
24. Squatrito, S. & Maioli, M. Gaze field properties of eye position neurons in areas MST and 7a of macaque monkey. Visual Neurosci. 13, 385–398 (1996).
25. Rumelhart, D., Hinton, G. & Williams, R. in Parallel Distributed Processing (eds Rumelhart, D., McClelland, J. & Group, P. R.) 318–362 (MIT Press, Cambridge, Massachusetts, 1986).
26. Zipser, D. & Andersen, R. A back-propagation programmed network that simulates response properties of a subset of posterior parietal neurons. Nature 331, 679–684 (1988).
27. Burnod, Y. et al. Visuomotor transformations underlying arm movements toward visual targets: a neural network model of cerebral cortical operations. J. Neurosci. 12, 1435–1453 (1992).
    A model of the coordinate transformation required for arm movements using a representation very similar to basis functions. This model was one of the first to relate the tuning properties of cells in the primary motor cortex to their computational role. In particular, it explains why cells in M1 change their preferred direction to hand movement with starting hand position.
28. Salinas, E. & Abbott, L. Transfer of coded information from sensory to motor networks. J. Neurosci. 15, 6461–6474 (1995).
    A model showing how a basis function representation can be used to learn visuomotor transformations with a simple Hebbian learning rule.
29. Olshausen, B. & Field, D. Emergence of simple-cell receptive field properties by learning a sparse code for natural images. Nature 381, 607–609 (1996).
30. Bishop, C., Svensén, M. & Williams, C. GTM: The generative topographic mapping. Neural Comput. 10, 215–234 (1998).
31. Lewicki, M. & Sejnowski, T. Learning overcomplete representations. Neural Comput. 12, 337–365 (2000).
32. Hinton, G. E. in Proceedings of the Ninth International Conference on Artificial Neural Networks 1–6 (IEEE, London, England, 1999).
33. Poggio, T. & Edelman, S. A network that learns to recognize three-dimensional objects. Nature 343, 263–266 (1990).
34. Salinas, E. & Abbott, L. Invariant visual responses from attentional gain fields. J. Neurophysiol. 77, 3267–3272 (1997).
35. Deneve, S. & Pouget, A. in Advances in Neural Information Processing Systems (eds Jordan, M., Kearns, M. & Solla, S.) (MIT Press, Cambridge, Massachusetts, 1998).
36. Groh, J. & Sparks, D. Two models for transforming auditory signals from head-centered to eye-centered coordinates. Biol. Cybernet. 67, 291–302 (1992).
37. Pouget, A. & Sejnowski, T. A neural model of the cortical representation of egocentric distance. Cereb. Cortex 4, 314–329 (1994).
38. Olshausen, B., Anderson, C. & Essen, D. V. A multiscale dynamic routing circuit for forming size- and position-invariant object representations. J. Comput. Neurosci. 2, 45–62 (1995).
39. Treue, S., Hol, K. & Rauber, H. Seeing multiple directions of motion – physiology and psychophysics. Nature Neurosci. 3, 270–276 (2000).
40. Shadlen, M., Britten, K., Newsome, W. & Movshon, J. A computational analysis of the relationship between neuronal and behavioral responses to visual motion. J. Neurosci. 16, 1486–1510 (1996).
41. Zemel, R. & Dayan, P. in Advances in Neural Information Processing Systems 11 (eds Kearns, M., Solla, S. & Cohn, D.) 174–180 (MIT Press, Cambridge, Massachusetts, 1999).
42. Simoncelli, E., Adelson, E. & Heeger, D. in Proceedings 1991 IEEE Computer Society Conference on Computer Vision and Pattern Recognition 310–315 (Los Alamitos, Los Angeles, 1991).
43. Watamaniuk, S., Sekuler, R. & Williams, D. Direction perception in complex dynamic displays: the integration of direction information. Vision Res. 29, 47–59 (1989).
44. Anderson, C. in Computational Intelligence Imitating Life 213–222 (IEEE Press, New York, 1994).
    Neurons are frequently suggested to encode single values (or at most 2–3 values for cases such as transparency). Anderson challenges this idea and argues that population codes might encode probability distributions instead. Being able to encode a probability distribution is important because it would allow the brain to perform Bayesian inference, an efficient way to compute in the face of uncertainty.
45. Recanzone, G., Wurtz, R. & Schwarz, U. Responses of MT and MST neurons to one and two moving objects in the receptive field. J. Neurophysiol. 78, 2904–2915 (1997).
46. Wezel, R. V., Lankheet, M., Verstraten, F., Maree, A. & Grind, W. V. D. Responses of complex cells in area 17 of the cat to bi-vectorial transparent motion. Vision Res. 36, 2805–2813 (1996).
47. Riesenhuber, M. & Poggio, T. Hierarchical models of object recognition in cortex. Nature Neurosci. 2, 1019–1025 (1999).
48. Perrett, D., Mistlin, A. & Chitty, A. Visual neurons responsive to faces. Trends Neurosci. 10, 358–364 (1987).
49. Bruce, V., Cowey, A., Ellis, A. & Perrett, D. Processing the Facial Image (Clarendon, Oxford, 1992).