Abstract
How the brain processes information accurately despite stochastic neural activity is a longstanding question^{1}. For instance, perception is fundamentally limited by the information that the brain can extract from the noisy dynamics of sensory neurons. Seminal experiments^{2,3} suggest that correlated noise in sensory cortical neural ensembles is what limits their coding accuracy^{4,5,6}, although how correlated noise affects neural codes remains debated^{7,8,9,10,11}. Recent theoretical work proposes that how a neural ensemble’s sensory tuning properties relate statistically to its correlated noise patterns is a greater determinant of coding accuracy than is absolute noise strength^{12,13,14}. However, without simultaneous recordings from thousands of cortical neurons with shared sensory inputs, it is unknown whether correlated noise limits coding fidelity. Here we present a 16-beam, two-photon microscope to monitor activity across the mouse primary visual cortex, along with analyses to quantify the information conveyed by large neural ensembles. We found that, in the visual cortex, correlated noise constrained signalling for ensembles with 800–1,300 neurons. Several noise components of the ensemble dynamics grew proportionally to the ensemble size and the encoded visual signals, revealing the predicted information-limiting correlations^{12,13,14}. Notably, visual signals were perpendicular to the largest noise mode, which therefore did not limit coding fidelity. The information-limiting noise modes were approximately ten times smaller and concordant with mouse visual acuity^{15}. Therefore, cortical design principles appear to enhance coding accuracy by restricting around 90% of noise fluctuations to modes that do not limit signalling fidelity, whereas much weaker correlated noise modes inherently bound sensory discrimination.
Data availability
The data that support the findings of this study are available from the corresponding authors upon reasonable request.
Code availability
We used open source software routines for image registration^{44} (http://bigwww.epfl.ch/thevenaz/turboreg/) and partial least squares analysis (https://www.mathworks.com/matlabcentral/fileexchange/18760partialleastsquaresanddiscriminantanalysis). Software code for extracting individual neurons and their Ca^{2+} activity traces from Ca^{2+} videos using principal component and then independent component analyses^{35,48} is freely available (https://www.mathworks.com/matlabcentral/fileexchange/25405emukamelcellsort), although for convenience we used a commercial version of these routines (Mosaic software, version 0.99.17; Inscopix). We wrote all other analysis routines in MATLAB (Mathworks; version 2017b). The primary software code used to support the findings of the study is available at Zenodo.org (https://zenodo.org/record/3593520#.XgWPuhKg2w).
References
 1.
von Neumann, J. The Computer and the Brain 2nd edn (Yale Univ. Press, 1958).
 2.
Britten, K. H., Shadlen, M. N., Newsome, W. T. & Movshon, J. A. The analysis of visual motion: a comparison of neuronal and psychophysical performance. J. Neurosci. 12, 4745–4765 (1992).
 3.
Newsome, W. T., Britten, K. H. & Movshon, J. A. Neuronal correlates of a perceptual decision. Nature 341, 52–54 (1989).
 4.
Zohary, E., Shadlen, M. N. & Newsome, W. T. Correlated neuronal discharge rate and its implications for psychophysical performance. Nature 370, 140–143 (1994).
 5.
Averbeck, B. B., Latham, P. E. & Pouget, A. Neural correlations, population coding and computation. Nat. Rev. Neurosci. 7, 358–366 (2006).
 6.
Cohen, M. R. & Kohn, A. Measuring and interpreting neuronal correlations. Nat. Neurosci. 14, 811–819 (2011).
 7.
Sompolinsky, H., Yoon, H., Kang, K. & Shamir, M. Population coding in neuronal systems with correlated noise. Phys. Rev. E 64, 051904 (2001).
 8.
Abbott, L. F. & Dayan, P. The effect of correlated variability on the accuracy of a population code. Neural Comput. 11, 91–101 (1999).
 9.
Shamir, M. & Sompolinsky, H. Implications of neuronal diversity on population coding. Neural Comput. 18, 1951–1986 (2006).
 10.
Ecker, A. S., Berens, P., Tolias, A. S. & Bethge, M. The effect of noise correlations in populations of diversely tuned neurons. J. Neurosci. 31, 14272–14283 (2011).
 11.
Oram, M. W., Földiák, P., Perrett, D. I. & Sengpiel, F. The ‘Ideal Homunculus’: decoding neural population signals. Trends Neurosci. 21, 259–265 (1998).
 12.
Kanitscheider, I., Coen-Cagli, R. & Pouget, A. Origin of information-limiting noise correlations. Proc. Natl Acad. Sci. USA 112, E6973–E6982 (2015).
 13.
Moreno-Bote, R. et al. Information-limiting correlations. Nat. Neurosci. 17, 1410–1417 (2014).
 14.
Pitkow, X., Liu, S., Angelaki, D. E., DeAngelis, G. C. & Pouget, A. How can single sensory neurons predict behavior? Neuron 87, 411–423 (2015).
 15.
Prusky, G. T., West, P. W. & Douglas, R. M. Behavioral assessment of visual acuity in mice and rats. Vision Res. 40, 2201–2209 (2000).
 16.
Baylor, D. A., Lamb, T. D. & Yau, K. W. Responses of retinal rods to single photons. J. Physiol. (Lond.) 288, 613–634 (1979).
 17.
Barlow, H. B. Retinal noise and absolute threshold. J. Opt. Soc. Am. 46, 634–639 (1956).
 18.
Siebert, W. M. Some implications of the stochastic behavior of primary auditory neurons. Kybernetik 2, 206–215 (1965).
 19.
Yatsenko, D. et al. Improved estimation and interpretation of correlations in neural circuits. PLoS Comput. Biol. 11, e1004083 (2015).
 20.
Kanitscheider, I., Coen-Cagli, R., Kohn, A. & Pouget, A. Measuring Fisher information accurately in correlated neural populations. PLoS Comput. Biol. 11, e1004218 (2015).
 21.
Ecker, A. S. et al. Decorrelated neuronal firing in cortical microcircuits. Science 327, 584–587 (2010).
 22.
Reich, D. S., Mechler, F. & Victor, J. D. Independent and redundant information in nearby cortical neurons. Science 294, 2566–2568 (2001).
 23.
Renart, A. et al. The asynchronous state in cortical circuits. Science 327, 587–590 (2010).
 24.
Stirman, J. N., Smith, I. T., Kudenov, M. W. & Smith, S. L. Wide field-of-view, multi-region, two-photon imaging of neuronal activity in the mammalian brain. Nat. Biotechnol. 34, 857–862 (2016).
 25.
Chen, J. L., Voigt, F. F., Javadzadeh, M., Krueppel, R. & Helmchen, F. Long-range population dynamics of anatomically defined neocortical networks. eLife 5, e14679 (2016).
 26.
Sofroniew, N. J., Flickinger, D., King, J. & Svoboda, K. A large field of view two-photon mesoscope with subcellular resolution for in vivo imaging. eLife 5, e14472 (2016).
 27.
Tsai, P. S. et al. Ultra-large field-of-view two-photon microscopy. Opt. Express 23, 13833–13847 (2015).
 28.
Niell, C. M. & Stryker, M. P. Modulation of visual responses by behavioral state in mouse visual cortex. Neuron 65, 472–479 (2010).
 29.
Bonin, V., Histed, M. H., Yurgenson, S. & Reid, R. C. Local diversity and fine-scale organization of receptive fields in mouse visual cortex. J. Neurosci. 31, 18506–18521 (2011).
 30.
Averbeck, B. B. & Lee, D. Effects of noise correlations on information encoding and decoding. J. Neurophysiol. 95, 3633–3644 (2006).
 31.
Cover, T. M. & Thomas, J. A. Elements of Information Theory 2nd edn (John Wiley & Sons, 2006).
 32.
Stringer, C., Michaelos, M. & Pachitariu, M. High precision coding in mouse visual cortex. Preprint at https://www.biorxiv.org/content/10.1101/679324v1 (2019).
 33.
Prusky, G. T. & Douglas, R. M. Characterization of mouse cortical spatial vision. Vision Res. 44, 3411–3418 (2004).
 34.
Glickfeld, L. L., Histed, M. H. & Maunsell, J. H. Mouse primary visual cortex is used to detect both orientation and contrast changes. J. Neurosci. 33, 19416–19422 (2013).
 35.
Lecoq, J. et al. Visualizing mammalian brain area interactions by dual-axis two-photon calcium imaging. Nat. Neurosci. 17, 1825–1829 (2014).
 36.
Pologruto, T. A., Sabatini, B. L. & Svoboda, K. ScanImage: flexible software for operating laser scanning microscopes. Biomed. Eng. Online 2, 13 (2003).
 37.
Madisen, L. et al. Transgenic mice for intersectional targeting of neural sensors and effectors with high specificity and performance. Neuron 85, 942–958 (2015).
 38.
Wekselblatt, J. B., Flister, E. D., Piscopo, D. M. & Niell, C. M. Large-scale imaging of cortical dynamics during sensory perception and behavior. J. Neurophysiol. 115, 2852–2866 (2016).
 39.
Chettih, S. N. & Harvey, C. D. Single-neuron perturbations reveal feature-specific competition in V1. Nature 567, 334–340 (2019).
 40.
Harvey, C. D., Coen, P. & Tank, D. W. Choice-specific sequences in parietal cortex during a virtual-navigation decision task. Nature 484, 62–68 (2012).
 41.
Chen, T. W. et al. Ultrasensitive fluorescent proteins for imaging neuronal activity. Nature 499, 295–300 (2013).
 42.
Huber, D. et al. Multiple dynamic representations in the motor cortex during sensorimotor learning. Nature 484, 473–478 (2012).
 43.
Kim, K. H. et al. Multifocal multiphoton microscopy based on multianode photomultiplier tubes. Opt. Express 15, 11658–11678 (2007).
 44.
Thévenaz, P., Ruttimann, U. E. & Unser, M. A pyramid approach to subpixel registration based on intensity. IEEE Trans. Image Process. 7, 27–41 (1998).
 45.
Preibisch, S., Saalfeld, S. & Tomancak, P. Globally optimal stitching of tiled 3D microscopic image acquisitions. Bioinformatics 25, 1463–1465 (2009).
 46.
Brown, M. & Lowe, D. G. Automatic panoramic image stitching using invariant features. Int. J. Comput. Vis. 74, 59–73 (2007).
 47.
Pnevmatikakis, E. A. & Giovannucci, A. NoRMCorre: An online algorithm for piecewise rigid motion correction of calcium imaging data. J. Neurosci. Methods 291, 83–94 (2017).
 48.
Mukamel, E. A., Nimmerjahn, A. & Schnitzer, M. J. Automated analysis of cellular signals from large-scale calcium imaging data. Neuron 63, 747–760 (2009).
 49.
Vogelstein, J. T. et al. Fast nonnegative deconvolution for spike train inference from population calcium imaging. J. Neurophysiol. 104, 3691–3704 (2010).
 50.
Bishop, C. M. Pattern Recognition and Machine Learning Vol. 1 (Springer, 2007).
 51.
Geladi, P. & Kowalski, B. R. Partial least-squares regression: a tutorial. Anal. Chim. Acta 185, 1–17 (1986).
 52.
Hastie, T., Tibshirani, R. & Friedman, J. The Elements of Statistical Learning (Springer, 2009).
 53.
Podgorski, K. & Ranganathan, G. Brain heating induced by near-infrared lasers during multiphoton microscopy. J. Neurophysiol. 116, 1012–1023 (2016).
 54.
Graner, M. W., Cumming, R. I. & Bigner, D. D. The heat shock response and chaperones/heat shock proteins in brain tumors: surface expression, release, and possible immune consequences. J. Neurosci. 27, 11214–11227 (2007).
 55.
Kalmbach, A. S. & Waters, J. Brain surface temperature under a craniotomy. J. Neurophysiol. 108, 3138–3146 (2012).
 56.
Wang, H. et al. Brain temperature and its fundamental properties: a review for clinical neuroscientists. Front. Neurosci. 8, 307 (2014).
 57.
Talan, M. Body temperature of C57BL/6J mice with age. Exp. Gerontol. 19, 25–29 (1984).
 58.
Greenberg, D. S., Houweling, A. R. & Kerr, J. N. D. Population imaging of ongoing neuronal activity in the visual cortex of awake rats. Nat. Neurosci. 11, 749–751 (2008).
 59.
Karimipanah, Y., Ma, Z., Miller, J. K., Yuste, R. & Wessel, R. Neocortical activity is stimulus- and scale-invariant. PLoS ONE 12, e0177396 (2017).
Acknowledgements
We acknowledge a Stanford Graduate Fellowship (O.I.R.), research support from the Howard Hughes Medical Institute (M.J.S.), the Stanford CNC Program (M.J.S.), DARPA (M.J.S.), an NSF CAREER Award (S.G.), and the Burroughs-Wellcome (S.G.), McKnight (S.G.), James S. McDonnell (S.G.) and Simons (S.G.) foundations. NIH grants MH085500 and DA028298 to H.Z. funded development of the GCaMP6f-tTA-dCre and Rasgrf2-2A-dCre mice. NIH grant R24NS098519 (M.J.S.) supports our effort to make the 16-beam two-photon microscope an open resource available to other laboratories. We thank T. Moore, P. Jercog, J. C. Jung, D. Vucinic, B. F. Grewe, E. T. W. Ho, H. Kim, X. Pitkow and T. Zhang for discussions, D. Flickinger and K. Svoboda for providing design files for the water-immersion objective lens, C. Niell for providing tetO-GCaMP6s/CaMK2a-tTA mice, and C. Irimia for animal husbandry.
Author information
Contributions
O.I.R., S.G. and M.J.S. designed experiments and analyses. O.I.R., J.A.L. and J.S. designed and built the microscope. O.I.R., J.A.L., O.H., Y.Z., R.C. and J.L. acquired and analysed data. S.G. developed theory and analysed data. H.Z. provided transgenic mice. O.I.R., S.G. and M.J.S. wrote the paper. All authors edited the paper. S.G. and M.J.S. supervised the research.
Ethics declarations
Competing interests
M.J.S. is a scientific cofounder of Inscopix, which produces the Mosaic software used to identify individual neurons in the Ca^{2+} videos. J.A.L. is also an Inscopix stockholder.
Additional information
Peer review information Nature thanks Stefano Panzeri and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data figures and tables
Extended Data Fig. 1 The discriminability of two sensory stimuli based on the activity patterns of two or more cells depends on the statistical relationship between the mean responses of the cells and their noise correlations, which in turn depends on visual neural circuitry.
a–f, Schematics of the distributions of responses by two cells to two distinct stimuli in six different cases. Cyan dots indicate joint responses of the cell pair to stimulus 1; orange dots indicate responses to stimulus 2. Ellipses convey the shapes of the statistical distributions of the responses to each stimulus. Three types of noise correlation are depicted. In a and d, the two cells have statistically independent noise fluctuations. In b and e, the cells share positively correlated noise fluctuations. In c and f, the cells share negatively correlated noise fluctuations. In all six cases, dashed lines indicate optimal linear boundaries for stimulus discrimination. The information in a–f is based on similar plots published previously^{5,11,30}. a–c, When both neurons have similar stimulus-response properties (for example, as schematized, when both cells have a smaller mean response to stimulus 1 than stimulus 2), positively correlated noise fluctuations (b) increase the overlap between the two response distributions and thereby impair stimulus discrimination, whereas negatively correlated noise fluctuations (c) improve stimulus discrimination as compared to the case with independent noise fluctuations (a). d–f, When the two neurons have opposite stimulus tuning (for example, as schematized, when neuron 1 responds more vigorously to stimulus 1 and neuron 2 responds more vigorously to stimulus 2), positively correlated noise fluctuations (e) decrease the overlap between the two response distributions as compared to the case with independent noise fluctuations (d) and thereby improve stimulus discrimination, whereas negatively correlated noise fluctuations (f) impair stimulus discrimination by increasing the overlap of the two response distributions. g, Cells in visual cortical areas, denoted by red circles, integrate signals from earlier stages of the visual pathway, as schematized by the input connections to two example cortical neurons.
Thus, as visual information propagates through neural circuitry, noise fluctuations become correlated between cells with similar receptive fields, leading to an upper bound on the amount of information that a neural ensemble can encode. h, Example receptive fields for cells in g. Cells in early stages of the visual processing pathway have relatively simple receptive fields. Integration of their activity patterns leads to more complex visual receptive fields in downstream visual areas. Dashed boxes enclose receptive fields (right) for the two example cells marked in g, as well as the receptive fields of cells providing visual inputs (left). i, A network’s pattern of synaptic connectivity constrains the dimensionality of the activity in downstream visual circuits^{12}. Left, in the early layers of the visual pathway, the dimensionality of ensemble activity is about the same order of magnitude as the number of photoreceptors. In downstream visual areas, due to the extraction of visual features, neural activity is constrained to a manifold of lower dimensionality (indicated by the red-shaded manifold in the space of all possible photoreceptor inputs). This manifold is determined by the set of receptive fields and hence the visual features that the downstream visual area detects. Grey ellipses (left) depict the distributions of photoreceptor responses to two distinct visual stimuli; after propagating through the visual circuitry these distributions are confined to the lower-dimensional manifold (red ellipses). Right, for a family of visual stimuli parameterized by a single variable, the mean neural ensemble responses lie along a corresponding tuning curve. Noise in the input circuitry propagates to downstream areas and leads to noise fluctuations in downstream neurons that are statistically correlated for cells with similar receptive fields.
This, in turn, implies that the magnitude of noise fluctuations along the neural tuning curve becomes proportional to the number of cells in a neural ensemble and indistinguishable from the encoded visual signals, which also increase in proportion to the number of cells. This proportional growth of noise and signal ultimately limits the ability to discriminate two visual stimuli. Thus, for neural ensembles with more than a certain number of cells, the encoded information reaches an upper bound. j, We simulated a two-layer, linear feedforward neural network to illustrate that information-limiting correlations are intrinsic to feedforward neural networks with overlapping receptive fields^{12}. Top, for three example output cells, the plot shows the synaptic weights of the inputs from cells in the first layer of the network. Bottom, diagram of connections between the two layers of the network. Symbols are defined as follows: x is the mean activity of cells in the first layer in response to a given stimulus; n is the noise in the activity of the input cells; r is the activity of the output cells. k, Digitized plots of spike counts for simulated activity in the network of j, for the two example input cells (yellow and black) and three example output cells (red, green, blue). The noise traces for the input cells came from independent Poisson random processes. External inputs to the network selectively drove either the yellow or the black cell, but owing to the presence of noise the two cells are occasionally active concurrently. l, Frequency plots of pairwise activity levels (rounded to the nearest integer) for pairs of output cells in the network of j. Yellow and black circles denote which of the two corresponding input cells received external input. The diameter of each circle denotes the number of time bins with a given pair of activity levels in the two cells.
Σ values are noise correlation coefficients and are larger for pairs of output cells with greater overlap in their receptive fields. m, Plot of the distribution of activity responses in the output cell layer, for the three example cells coloured green, red and blue in j. Data points are coloured either yellow or black, to indicate whether the output activity is a response to stimulation of the yellow- or black-coloured cell in the input layer. The red plane denotes the optimal linear classification boundary between the two stimulation conditions.
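The link between receptive-field overlap and noise correlation described in j–l can be illustrated with a minimal sketch. It is not the authors' simulation: the weight matrix `W`, the six input cells, the mean input `x` and the number of time bins are all hypothetical values chosen for illustration. The only elements taken from the caption are the linear two-layer structure, with r computed from the weighted sum of x + n, and the independent Poisson noise in the input cells.

```python
import numpy as np

# Hypothetical synaptic weights (rows: output cells, columns: input cells).
# Outputs 0 and 1 share two inputs, outputs 1 and 2 share one input,
# and outputs 0 and 2 share none.
W = np.array([
    [1.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [0.0, 1.0, 1.0, 1.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 1.0],
])


def simulate_outputs(weights, mean_input, n_bins, rng):
    """Simulate r = W (x + n) for n_bins time bins.

    Input spike counts are independent Poisson draws with mean x,
    so each draw equals x + n with zero-mean noise n.
    """
    spikes = rng.poisson(lam=mean_input, size=(n_bins, mean_input.size))
    return spikes @ weights.T


def predicted_noise_correlations(weights, mean_input):
    """Noise correlations implied by the linear model.

    For independent Poisson inputs the input noise covariance is diag(x),
    so the output noise covariance is W diag(x) W^T.
    """
    cov = weights @ np.diag(mean_input) @ weights.T
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)


rng = np.random.default_rng(0)
x = np.full(6, 2.0)  # hypothetical mean input activity for one stimulus
r = simulate_outputs(W, x, n_bins=50_000, rng=rng)

empirical = np.corrcoef(r, rowvar=False)
predicted = predicted_noise_correlations(W, x)
print(np.round(empirical, 2))
print(np.round(predicted, 2))
```

In this toy model the pairwise noise correlation equals the fraction of shared input variance, so output pairs with more overlapping weights are more strongly correlated, which is the qualitative pattern the caption describes for l.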
Extended Data Fig. 2 Spatiotemporal multiplexing of the illumination beams permits imaging of large fields of view at fast frame rates without thermal damage to brain tissue.
a, Computer-assisted design of the mechanical layout of the two-photon microscope. Scale bar, 0.5 m. b, In the pixel-multiplexing mode of imaging, each of the 16 beams is assigned to one of four different temporal phases within each cycle of the pixel clock (Extended Data Fig. 3b). Alternatively, in the line-multiplexing mode of imaging, only 8 of the 16 beam paths are used (Methods). In neither imaging mode are neighbouring beams ever active concurrently (Extended Data Fig. 3c), minimizing fluorescence scattering between active image tiles and allowing scattering into inactive image tiles to be corrected computationally (Extended Data Figs. 3d, e, 4a–g). c, To switch between the different sets of active beams, square-wave electronic signals control a set of three electro-optic modulators (EOMs). d, A Ti:sapphire laser provides ultrashort-pulsed infrared illumination. A half-wave (λ/2) plate and a polarizing beamsplitter enable power control. Three pairs of EOMs and polarizing beamsplitters direct the light into one of four main optical paths, with only one path illuminated during each of the four multiplexing phases. In each of these four main paths, three 50:50 beamsplitters create four beams of equal intensity, yielding up to 16 total beams but with only four on at any instant. A chopper blocks all light during the turnaround portion of the galvanometer scanning cycle. e, Seventy-five example fluorescence traces of Ca^{2+} activity in layer 2/3 pyramidal cells of an awake mouse. f, Maintaining brain temperature within physiological ranges during in vivo two-photon imaging requires a proper balance between heat loss through the cranial window and heating induced by the laser illumination^{53,55}. To directly verify that our cranial window preparation and imaging conditions properly balanced these two opposing effects, we measured brain temperature during two-photon imaging with the 16-beam microscope.
For these studies we used an implanted thermocouple^{53} and either the highest (blue trace) or lowest (green trace) time-averaged laser illumination intensity used for Ca^{2+} imaging elsewhere in this study (Methods). Consistent with previous work, before laser illumination commenced the brain temperature was about 9 °C below normal mouse body temperature^{55}, a state that is considered to be neuroprotective^{56}. By about 100 s after the start of imaging, brain temperatures attained steady-state values within the physiological range of C57BL/6 mice^{57} (grey shaded region; 36.3 °C–38.7 °C). Each trace is an average of three bouts of imaging for each of three separate mice. Coloured shading denotes the s.d. across the 9 individual measurements acquired at each illumination intensity. g–i, Fluorescence immunohistochemical analyses of tissue damage markers. To check whether in vivo imaging of brain tissue with the 16-beam instrument (4 mm^{2} field of view) induced any tissue damage, we immunostained post-mortem brain tissue sections using antibodies to two different damage markers, glial fibrillary acidic protein (GFAP) and heat shock protein 70 (HSP70), previously identified as indicators of laser-induced tissue damage^{53}. We also stained the sections with DAPI, which labels cell nuclei. We compared positive control tissue sections (g) that we had deliberately damaged in vivo with high-power (2,680 mW mm^{−2}) laser illumination, negative control sections (h) that received no laser illumination, and experimental tissue sections (i) that had undergone in vivo two-photon imaging at the highest level of laser illumination (80 mW mm^{−2}) used in this study for tracking Ca^{2+} dynamics in neocortical layer 2/3 pyramidal neurons. Together, these analyses verified the functionality of the antibodies and revealed no signs of tissue damage from two-photon imaging.
To image neurons in cortical layers deeper than layer 2/3, users have several options that avoid delivering excess heat to the brain (Supplementary Video 3, Supplementary Note). Scale bars, 500 μm. Results shown are representative of those from 8 cerebral hemispheres of 4 different mice. j, k, Comparisons between recent large-scale two-photon microscopes^{24,26}. The performance of a laser-scanning microscope closely relates to four main parameters: the scanner speed, image-frame acquisition rate, field of view, and pixel size (Supplementary Note). For microscopes that use a single laser beam to sweep in two dimensions across the field of view, these parameters obey the relationship FOV = d × v × f^{−1}, where FOV is the field-of-view area, d is the spacing between adjacent image lines (or equivalently the pixel width along the slow axis of laser scanning), v is the speed at which the beam is swept across the specimen by the fast-axis scanner, and f is the image-frame acquisition rate. By comparison, our approach using four active beams leads to an expression for the maximal field of view, FOV = 4 × d × v × f^{−1}. These relationships enable performance comparisons with other recently published large-scale two-photon microscopes^{24,26}. To illustrate, j shows a plot of the image-frame acquisition rate against the field-of-view area, given a line spacing of d = 1.15 μm. k shows how the image-frame acquisition rate depends on d for a 4 mm^{2} field of view. Solid red circles denote the performance of our microscope in its line-multiplexing imaging mode using an 8-kHz resonant galvanometer (Methods). Black data points denote performance options of another large two-photon microscope, which uses a pair of laser beams with temporally interleaved pulses^{24}, as calculated on the basis of its published capabilities.
Blue data points and associated blue dashed lines show performance options for a third large-scale microscope^{26}, as calculated on the basis of its published capabilities.
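The scaling relationship in j, k, FOV = N × d × v × f^{−1} with N active beams, can be rearranged to give the frame rate for a chosen field of view. The short sketch below does this arithmetic. The function name and the scan speed `v` are hypothetical; the caption does not state a value for v, so the number used here is an assumption for illustration only, whereas d = 1.15 μm and the 4 mm^{2} field of view are taken from the caption.

```python
def frame_rate_hz(fov_um2, line_spacing_um, scan_speed_um_per_s, n_active_beams=1):
    """Frame rate f from FOV = N * d * v / f, solved for f.

    fov_um2: field-of-view area in square micrometres.
    line_spacing_um: spacing d between adjacent image lines.
    scan_speed_um_per_s: fast-axis sweep speed v across the specimen.
    n_active_beams: number N of beams scanning concurrently.
    """
    return n_active_beams * line_spacing_um * scan_speed_um_per_s / fov_um2


FOV_UM2 = 4.0e6        # 4 mm^2, from the caption
LINE_SPACING_UM = 1.15  # d, from the caption
SCAN_SPEED = 5.0e6      # v in um/s; hypothetical value, not from the paper

single_beam = frame_rate_hz(FOV_UM2, LINE_SPACING_UM, SCAN_SPEED, n_active_beams=1)
four_beams = frame_rate_hz(FOV_UM2, LINE_SPACING_UM, SCAN_SPEED, n_active_beams=4)
print(f"single beam: {single_beam:.4f} Hz, four beams: {four_beams:.4f} Hz")
```

For fixed d, v and field of view, the frame rate scales linearly with the number of concurrently active beams, which is the advantage the caption attributes to the four-beam scheme.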
Extended Data Fig. 3 Data acquisition and post-processing for two-photon imaging with 16 time-multiplexed excitation beams.
a, Block diagram of the electronics for data acquisition and instrument control. PMT, photomultiplier tube; Preamp, preamplifier; ADC, analogue-to-digital converter; FPGA, field-programmable gate array; EOM, electro-optic modulator. b, Computer simulation of signal sampling in different stages of the pipeline in a. The ADC samples the analogue, preamplified and low-pass-filtered signals (blue) from one of the PMTs at a rate of 5 × 10^{7} samples per second. In each of the four temporal phases, the FPGA sums the digitized signals (red) from the ADC to yield the fluorescence intensity values of each image pixel (grey). c, Raw fluorescence images for each of the four excitation phases, acquired in an awake mouse expressing GCaMP6f in layer 2/3 cortical pyramidal cells and averaged over 100 frames (7.23 Hz acquisition rate). In each of the four phases, a distinct set of four PMTs detects most of the fluorescence emissions, creating four active image tiles within the 4 × 4 array. (Each of the four PMTs corresponds to one of the four laser beams that is active in that phase.) To illustrate, the four active tiles within the phase I image are each shaded with a different colour (shaded large square regions). However, close to the boundaries of each active tile, some fluorescence photons are detected by the other 12 PMTs. During signal unmixing these photons are reassigned to corresponding pixels in the correct adjacent active image tile. For instance, within the phase I image, photons detected in the areas outlined in colour (rectangles and small squares) are reassigned to the colour-corresponding active tiles. d, An image compiling the four sets of four active image tiles from the panels in c. e, During signal unmixing, we reassign scattered fluorescence photons to their correct pixels of origin, using the method shown in c, by reassigning the boundary regions of 128 pixels in width. The resulting image is displayed with the mean contrast equalized across tiles. Scale bars: c, e, 500 μm.
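The summation step in b can be sketched in a few lines. This is only an illustration of binning a stream of ADC samples into per-pixel, per-phase sums; it is not the FPGA firmware. The 5 × 10^{7} samples per second rate is from the caption and the 400-ns pixel dwell time is from the offline-processing description in Extended Data Fig. 5. The assumption that the four temporal phases split each pixel window evenly (five samples each), and the function names, are hypothetical.

```python
import numpy as np

ADC_RATE_HZ = 50e6      # 5 x 10^7 samples per second, from the caption
PIXEL_DWELL_S = 400e-9  # laser dwell time per pixel, from Extended Data Fig. 5
N_PHASES = 4            # temporal multiplexing phases


def samples_per_window(window_s, rate_hz=ADC_RATE_HZ):
    """Number of ADC samples that fall within a time window."""
    return int(round(window_s * rate_hz))


def sum_into_pixels(samples, n_phases, samples_per_phase):
    """Sum a 1D stream of ADC samples from one PMT into pixel values.

    Returns an array of shape (n_pixels, n_phases). Assumes, for
    illustration, that the phases occupy equal, consecutive sample windows.
    Trailing samples that do not fill a whole pixel are discarded.
    """
    samples = np.asarray(samples)
    block = n_phases * samples_per_phase
    n_pixels = samples.size // block
    trimmed = samples[: n_pixels * block]
    return trimmed.reshape(n_pixels, n_phases, samples_per_phase).sum(axis=2)


per_pixel = samples_per_window(PIXEL_DWELL_S)  # samples in one 400-ns window
per_phase = per_pixel // N_PHASES              # hypothetical even split

stream = np.arange(2 * per_pixel)              # toy sample stream for two pixels
pixels = sum_into_pixels(stream, N_PHASES, per_phase)
print(per_pixel, per_phase, pixels.shape)
```

Reshaping into (pixel, phase, sample) blocks keeps the demultiplexing a single vectorized sum; a different phase timing would only change how the blocks are sliced.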
Extended Data Fig. 4 Crosstalk unmixing procedure for reconstructing the full field of view enables accurate estimation of neural activity traces.
a, To quantify the extent of fluorescence scattering across image tiles, we acquired images in two distinct configurations that enabled us to distinguish fluorescence signals from any crosstalk due to fluorescence scattering across image tiles. Using an awake mouse expressing GCaMP6f in layer 2/3 cortical pyramidal cells, we first imaged with only one active laser beam and its corresponding PMT; the other 15 beams were blocked (configuration 1). In this configuration, there is no fluorescence scattering into the active image tile from the other 15 tiles, only the signals from the active tile. In configuration 2, we blocked the beam that had previously been active, unblocked the other 15 beams, operated the microscope with the normal multiplexing approach, and again sampled signals from all 16 PMTs. To estimate the extent of scattering into the tile with the blocked beam, we applied the computational unmixing procedure to the raw image data. To estimate how much scattered fluorescence affects cell sorting, we first extracted individual cells and their Ca^{2+} activity traces from the first dataset, obtained in configuration 1 without crosstalk. We then summed the images, frame by frame, from the two datasets, to create a mock dataset comprising unscattered plus scattered fluorescence signals, from which we again computationally extracted cells and their activity traces. This enabled a direct comparison between two datasets containing the exact same patterns of neural activity, with and without fluorescence scattering from other image tiles. b, Activity traces for four example cells, enabling comparisons of the Ca^{2+} activity traces (top), ΔF(t)/F_{0}, and the resulting traces of the estimated spike counts (bottom), between the datasets with (red traces) and without (black traces) inter-tile scattering. The traces with and without inter-tile scattered fluorescence signals are nearly indistinguishable by eye.
c, Histogram of the ratio of estimated spikes for the two datasets constructed in a, for all time bins (0.14 s per time bin) with an estimated spike count greater than 0.5. The mean ratio is 1.0 ± 0.06 (mean ± s.d.; N = 31 cells). Total number of time bins, 5,865. d–g, Studies of fluorescence scattering between the active image tiles in one temporal phase (Extended Data Fig. 2b) of the multiplexing scheme used for two-photon imaging. Throughout the paper, we corrected computationally for fluorescence scattering from active to inactive image tiles within each temporal phase of imaging (Extended Data Fig. 3c, Methods). This approach neglects the small amount of fluorescence scattering from active tiles to other active tiles, which in principle could also be computationally corrected using a more sophisticated method than the one we adopted. Hence, we examined experimentally the validity of our computational approach and the extent to which scattering between active tiles can be justifiably neglected. The amplitude of scattering between active tiles (d) varies with the location of each laser beam and its proximity to a tile boundary. We used fixed cortical tissue slices from adult GCaMP6f-tTA-dCre mice to measure the amplitude of such scattering effects when imaging at different depths within brain tissue. An image (e) of the spatial distribution of two-photon fluorescence excited 500 μm deep within a tissue slice shows that a majority of scattered fluorescence photons exits the brain tissue relatively near to the laser focus. By averaging over 100 different laser foci positions in each of 3 different brain slices, we determined the mean cross-sectional spatial profiles (f) of scattered fluorescence excited at different depths in tissue, as a function of the lateral displacement, x, from the laser focus. Profiles are shown normalized to unity at x = 0.
The inset of f shows a magnified view of these crosssectional profiles for x ∈ [–1,000 μm, –500 μm], that is, up to 1 mm away from the laser focus. We used these empirically determined scattering profiles to compute the probability (mean ± s.d.; N = 300 laser focus positions) (g) that a fluorescence photon originating in one active image tile would scatter into an adjacent active tile. Even when the laser focus is on the boundary of an image tile, this probability remains less than 0.02 for all tissue depths ≤ 600 μm. For our studies of layer 2/3 cortical pyramidal cells in live mice, the probability of a fluorescence photon scattering between active tiles is less than 0.01. In conclusion, computational corrections for fluorescence scattering that account solely for scattering from active to inactive tiles—and neglect scattering between different active tiles—are empirically well justified.
Extended Data Fig. 5 Pipeline of offline data processing and procedures for reducing the dimensionality of the neural ensemble activity data and calculating the decoding accuracy.
a, Pipeline of the offline procedures we applied to the acquired fluorescence signals to attain traces of neural activity. Steps coloured purple involve algorithms that use raw or processed image data. Steps coloured yellow involve algorithms that use cells’ spatial filters as their input arguments. Steps coloured green involve algorithms that use cells’ activity traces as their inputs. Purple steps, starting from the raw photocurrents from each of the 16 PMTs (sampled at 50 MHz and assigned to individual image pixels corresponding to a 400ns laser dwell time), we normalized the photocurrent signals by the gain of each individual PMT, to equalize the image intensity scale across the entire image. We then unmixed scattered fluorescence, as shown in Extended Data Fig. 3, and applied an image registration routine (TurboReg^{44}) to the videos from the individual image tiles. To highlight Ca^{2+} transients against baseline fluctuations, we used the fact that the twophoton fluorescence increases of GCaMP6 during Ca^{2+} transients are many times the s.d. of background noise. Thus, we converted the fluorescence trace of each pixel, F(t), into a trace of zscores, ΔF(t)/σ. Here ΔF(t) = F(t) – F_{0} denotes the deviation of the pixel from its mean value, F_{0}, and σ denotes the background noise of the pixel, which we estimated by taking the minimum of all standard deviation values calculated within a sliding 10s window^{35}. After transforming the movie data into this ΔF(t)/σ form, we identified neural cell bodies and processes using an established cellsorting algorithm that sequentially applies principal and independent component analyses (PCA and ICA) to extract the spatial filters and time traces of individual cells^{48}. Yellow steps, for all spatial filters corresponding to individual cell bodies, we thresholded the filters at 5% of each filter’s maximum intensity and set to zero any filter components with nonzero weights outside the soma. 
To attain neural activity traces, we then reapplied the set of resulting filters to the ΔF(t)/F_{0} movies. Green steps, to estimate the most likely number of spikes fired by each cell in each time bin, we applied a fast non-negative deconvolution algorithm to the ΔF/F_{0} trace of the cell^{49}. For each neuron, we downsampled (2×) the activity traces to time bins of 0.275 s by averaging the values within adjacent time bins. To make comparisons across similar behavioural states, we removed all trials during which the mouse was moving. b, Neural responses for each visual stimulus (A and B) are represented as matrices of size N_{neurons} × N_{trials} × N_{time bins}. To calculate the accuracy of stimulus discrimination, we first randomly chose a subset of neurons from the dataset. For decoding using the ‘instantaneous’ strategy (Fig. 3, Extended Data Figs. 7–10), we then chose a specific time bin, whereas for the ‘cumulative’ decoding strategy we treated all the different time bins up to a specific time, t, as independent dimensions of the population activity vector. We then split the trials in half, into a training set and a test set, each with equal numbers of trials with the A and B stimuli. We took the neural activity traces in the training set and normalized them by the s.d. of the cell’s activity about its mean, to create a set of z-score traces. We then performed PLS analysis to identify a low-dimensional basis that well captured the separation between the neural responses to the two sensory stimuli. Using the activity data in the test set, we applied the same normalization and dimensional reduction procedures and values as for the training set. We used the resulting distributions of responses to calculate d′ values and the eigenvectors of the noise covariance matrix. For each mouse we repeated this entire procedure for 100 different randomly chosen subsets of neurons.
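The decoding procedure of panel b can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the authors' code: the function names `pls1_basis` and `dprime_train_test`, the NIPALS variant of PLS with ±1 stimulus labels, and the pooled-covariance linear discriminant in the reduced space are all choices made here for illustration. The inputs are assumed to be single-time-bin response matrices of shape (trials × neurons).

```python
import numpy as np

def pls1_basis(X, y, n_comp):
    """Orthonormal PLS1 weight vectors (NIPALS) for features X and labels y."""
    X = X - X.mean(axis=0)
    y = y - y.mean()
    W = np.zeros((X.shape[1], n_comp))
    for k in range(n_comp):
        w = X.T @ y                      # direction of maximal covariance with y
        w /= np.linalg.norm(w)
        t = X @ w                        # trial scores along that direction
        X = X - np.outer(t, X.T @ t) / (t @ t)   # deflate X before the next component
        W[:, k] = w
    return W

def dprime_train_test(resp_a, resp_b, n_comp=5, seed=0):
    """Train a linear decoder on half the trials and return d' on the other half.

    resp_a, resp_b: arrays of shape (n_trials, n_neurons) for stimuli A and B.
    """
    rng = np.random.default_rng(seed)
    n = min(len(resp_a), len(resp_b))
    half = n // 2
    pa, pb = rng.permutation(len(resp_a))[:n], rng.permutation(len(resp_b))[:n]
    tr_a, te_a = resp_a[pa[:half]], resp_a[pa[half:]]
    tr_b, te_b = resp_b[pb[:half]], resp_b[pb[half:]]

    # z-score with statistics from the training set only, then reuse them on test data
    train = np.vstack([tr_a, tr_b])
    mu, sd = train.mean(axis=0), train.std(axis=0)
    sd[sd == 0] = 1.0
    z = lambda x: (x - mu) / sd

    labels = np.r_[-np.ones(half), np.ones(half)]
    W = pls1_basis(z(train), labels, n_comp)

    # optimal linear discriminant in the reduced space (assumes a shared noise covariance)
    za, zb = z(tr_a) @ W, z(tr_b) @ W
    cov = 0.5 * (np.atleast_2d(np.cov(za, rowvar=False)) +
                 np.atleast_2d(np.cov(zb, rowvar=False)))
    w = np.linalg.solve(cov, zb.mean(axis=0) - za.mean(axis=0))

    # evaluate on held-out trials: separation of means over pooled s.d.
    ya, yb = z(te_a) @ W @ w, z(te_b) @ W @ w
    return (yb.mean() - ya.mean()) / np.sqrt(0.5 * (ya.var(ddof=1) + yb.var(ddof=1)))
```

Calling `dprime_train_test` repeatedly on different random subsets of columns (neurons) would mimic the repetition over 100 subsets described above. The signed return value is positive only when the decoder trained on one half generalizes to the other half.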
Extended Data Fig. 6 Distributions of pairwise noise correlation coefficients do not differ significantly between pyramidal neurons in area V1 and higher-order visual areas.
a, Anatomical maps of visual cortical neurons that responded to each of the two stimuli. For these maps (but for no other analyses in the paper), we denoted a cell as responsive to one of the stimuli if, in at least one time bin during the 2-s stimulation period (0.275 s per bin), the difference between the cell’s mean response and its mean activity trace during the inter-trial intervals was more than twice the sum of the s.e.m. values for these two traces. Cells that responded only to stimulus A are shown in red, those that responded only to stimulus B are shown in blue, and those that responded to both stimuli are shown in purple. b, Mean Ca^{2+} responses (ΔF/F) of 25 example neurons to the two different moving grating stimuli, oriented at ±30°. Ca^{2+} activity traces are shown coloured during the stimulation period (marked with light grey shading) and black otherwise. Coloured shading about each trace denotes the s.e.m. over 217 trials of each type. The inset shows a schematic of the two stimuli, which appeared for 2 s per trial and were presented in random order. c, d, Histograms of the estimated mean spiking rates of individual neurons during visual stimulation (c) and the absolute values of the differential responses of the individual neurons to the two visual stimuli, |R_{A} – R_{B}| / (R_{A} + R_{B}) (d), where R_{A} and R_{B} denote the mean responses of a cell to stimuli A and B, respectively. The distributions of cells’ activity rates and preferences for one stimulus over the other were consistent with previous studies of rodent visual cortical neurons^{28,29,38,58,59}. Data shown are for N = 8,029 individual cells from N = 5 mice. Error bars are s.d. as estimated on the basis of counting errors. e, Histogram of noise correlation coefficients, r, between pairs of layer 2/3 pyramidal neurons, computed as in Fig. 2d, for V1 cell pairs (dashed lines) and cell pairs in higher-order visual areas (solid lines). 
The histograms show mean values across the two different visual stimuli for both the real neural activity traces, and for trialshuffled data in which each cell’s responses to each stimulus presentation were randomly permuted across the set of all presentations of the same stimulus. r values were computed on the basis of cells’ responses integrated over t = [0.5 s, 2 s] from the start of each trial. Histogram bin, 0.01. (N = 1,331,109 V1 cell pairs from 5 mice; N = 2,428,437 cell pairs from higherorder visual areas in 5 mice). f, Boxandwhisker plots of the mean and FWHM values of the distributions in e (real data only). Both statistical metrics are similar for the two classes of visual cortical neurons. Open circles denote individual data points for N = 5 mice. g, h, Histograms (g) and cumulative probability distributions (h) of noise correlation coefficients for all cell pairs (based on all recorded V1 and higherorder visual cortical neurons) with similar or differently tuned mean evoked responses to the two visual stimuli. Unlike Fig. 2e, which shows these distributions for only the most active cells (the highest decile), here the distributions include all cell pairs with either positively (red curves) or negatively (blue curves) correlated mean responses to the two stimuli. Within these two groups of cell pairs, we computed the noise correlation coefficient, r, for each cell pair. Owing to the extremely large number of cell pairs, the two distributions of r values differed significantly (***P < 10^{−13} for all 5 individual mice; twotailed Kolmogorov–Smirnov test; 3,482,186 positively correlated cell pairs in total; 3,464,094 negatively correlated pairs), even though the effect size was tiny and the two distributions were nearly identical. 
This result shows the difficulty of detecting information-limiting correlations by measuring pairwise noise correlations, because the variance in the individual r values is much greater than the difference between the mean values of the two distributions. i, Box-and-whisker plots of the mean values of the correlation coefficients in g, h. Open circles mark individual data points for N = 5 mice. b–i are based on 217–332 trials per stimulus condition in each of 5 mice. In f, i, boxes cover the middle 50% of values, horizontal lines denote medians, and whiskers span the full range of the data.
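The pairwise noise correlation and trial-shuffling computations referred to in e–i can be illustrated as follows. This is a hedged sketch rather than the analysis code used in the paper: the function names are hypothetical, and the input is assumed to be a (trials × cells) matrix of time-integrated responses to a single stimulus.

```python
import numpy as np

def pairwise_noise_correlations(resp):
    """Pearson r across trials for every cell pair.

    resp: (n_trials, n_cells) responses to ONE stimulus, so that
    trial-to-trial covariation reflects noise rather than tuning.
    """
    r = np.corrcoef(resp, rowvar=False)
    iu = np.triu_indices(resp.shape[1], k=1)   # each pair counted once
    return r[iu]

def trial_shuffle(resp, rng):
    """Permute each cell's trials independently within one stimulus condition.

    This keeps every cell's response distribution intact but removes
    the correlations between cells.
    """
    return np.column_stack([rng.permutation(resp[:, j])
                            for j in range(resp.shape[1])])
```

Histograms of the values returned for real and shuffled responses, averaged over the two stimuli, would correspond to the kind of comparison shown in e.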
Extended Data Fig. 7 Temporal integration of neural activity improves decoding performance, but quadratic and linear decoding yield identical biological conclusions.
a–c, To identify how many PLS dimensions were needed to determine d′ accurately, we divided data from each of 5 mice into three equally sized portions. We performed PLS analysis using trials in the first third. Onto the PLS dimensions thereby identified, we projected the neural ensemble activity in the second third of the data (training data). We retained only the first N_{R} dimensions of this projection and computed d′ in the reduced space (magenta data points) by identifying a hyperplane for optimal stimulus discrimination. Finally, we applied this discrimination strategy to the remaining third of the data (test data) and again calculated d′ (grey points). Plots show mean values of d′ as a function of N_{R} for the interval [0.83 s, 1.11 s] from stimulus onset (N = 5 mice; error bars denote s.d. across 100 different subsets of 1,000 neurons per mouse). We normalized d′ values to that found for N_{R} = 5 on the test dataset. For N_{R} > 5, discrimination performance declines owing to overfitting for all discrimination strategies: instantaneous (a), cumulative (b) and integrated (c). Hence, throughout the rest of the study we used N_{R} = 5 for all calculations of d′. d, Pearson correlation coefficients between the optimal linear decoding weights attained using instantaneous decoding at different time bins after the onset of grating stimuli (±30° orientations). These weights were highly correlated for different time bins, especially across the interval [0.5 s, 2 s], during which d′ reaches a plateau. Further, optimal decoders for each time bin yielded nearly equivalent decoding performance when applied to data from other time bins. For instance, the optimal decoder for the fourth time bin (t = 0.97 s), when applied to any other of the last five time bins, yielded a performance within less than 2% of that of the optimal instantaneous decoder in all mice. 
When applied to the first and second time bins, the decoder from the fourth time bin yielded decoding performances that were, respectively, 83 ± 11% and 90 ± 3% (mean ± s.d.; N = 5 mice; 217–232 trials per stimulus) of that of the optimal decoders. e, Plots of d′ versus time after stimulus onset, for instantaneous and cumulative decoding strategies (Fig. 3). For each mouse that viewed gratings oriented at ±30°, we chose 100 random subsets of 1,000 cells and normalized d′ values by those obtained using a timeintegrated decoding strategy, which involved optimal linear discrimination over one interval, [0.28 s, 1.94 s], covering most of the visual stimulation period. Green traces, mean d′ values for individual mice using a time bin of 275 ms. Error bars, s.d. across 5 mice. f, In the fivedimensional space used after truncating ensemble neural responses to the five leading PLS dimensions, the distributions of noise in the responses to the two stimuli were highly similar. Specifically, nondiagonal elements, Σ_{ij}, of the noise covariance matrices for the two stimulus conditions were highly correlated (r: 0.81 ± 0.16; mean ± s.d.; N = 5 mice), as computed for the interval [0.83 s, 1.11 s] after stimulus onset. This similarity argues that a linear discrimination strategy to classify the two sets of ensemble neural responses is near optimal, as confirmed in h. Values of Σ_{ij} are plotted as mean ± s.d., computed across 100 different randomly chosen subsets of 1,000 neurons per mouse. g, Using optimal linear decoding, d′ values saturated as the number of trials analysed increased. Colours denote individual mice. Data points were calculated for the interval [0.83 s, 1.11 s] after stimulus onset. Error bars, s.d. across 100 different randomly chosen subsets of 1,000 cells per mouse and stimulation trials. h, To check whether our results depended on our use of linear decoding, we tested whether quadratic decoding might yield different conclusions. 
We examined the KL divergence^{31}, a generalization of (d′)^{2} that makes no assumption about the statistical distributions under consideration. We computed the KL divergence, which equals (d′)^{2} for linear decoders, by using Gaussian approximations to the distributions of ensemble neural responses to the two different stimuli, and we plotted the results as a function of the number of cells, n, in the ensemble. First, to recapitulate our determinations of (d′)^{2} (magenta data points), we computed the KL divergence under the assumption the two different response distributions had distinct means but identical noise covariance matrices, which we estimated as the mean noise covariance matrix averaged over the two different stimulus conditions. This is equivalent to computing (d′)^{2}. Next, we relaxed the assumption that the two noise covariance matrices were equal and computed the KL divergence between the distributions of neural responses to stimulus B relative to those to stimulus A (blue points), and vice versa (red points) (Methods). For all mice, KL divergence values saturated with increasing n and, except in one mouse, were not much larger than (d′)^{2} values. Thus, quadratic decoders (which are optimal for discriminating two Gaussian distributions with different means and covariances) will yield the same basic conclusions as linear decoders (which are optimal for discriminating two Gaussian distributions with the same covariance matrix). Data points and error bars denote mean ± s.d. values computed in each mouse across 50 different randomly chosen subsets of cells and assignments of visual stimulation trials to decoder training and testing (Extended Data Fig. 5b). i, Mean neural responses, averaged across all cells, to stimuli A (top) and B (bottom) for the first and second halves of the experimental trials in each mouse. Error bars, s.d. across the set of trials. 
j, d′ values computed for each mouse using instantaneous decoders trained on the first half of the trials and tested on the second half (x axis), plotted with d′ values for an instantaneous decoder trained on the second half of the trials and tested on the first half (y axis). a–j are based on 217–332 trials per stimulus condition in each of 5 mice.
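The relationship between the KL divergence and (d′)^{2} used in h can be checked numerically under the Gaussian approximation. The sketch below uses the standard KL divergence in nats, for which two Gaussians with a shared covariance give exactly one half of (d′)^{2}; the factor of 2 applied in the usage note to match the convention of the legend is an assumption made here, as are the function names.

```python
import numpy as np

def gaussian_kl(mu0, cov0, mu1, cov1):
    """Standard KL( N(mu0, cov0) || N(mu1, cov1) ) in nats."""
    k = len(mu0)
    dmu = mu1 - mu0
    trace_term = np.trace(np.linalg.solve(cov1, cov0))
    maha = dmu @ np.linalg.solve(cov1, dmu)
    logdet = np.linalg.slogdet(cov1)[1] - np.linalg.slogdet(cov0)[1]
    return 0.5 * (trace_term + maha - k + logdet)

def dprime_squared(mu0, mu1, cov):
    """(d')^2 for two Gaussians that share the noise covariance matrix cov."""
    dmu = mu1 - mu0
    return dmu @ np.linalg.solve(cov, dmu)
```

With identical covariances, `2 * gaussian_kl(mu_a, cov, mu_b, cov)` equals `dprime_squared(mu_a, mu_b, cov)`. Passing the two stimulus-specific covariance estimates instead gives the two asymmetric values (B relative to A, and A relative to B) that relax the equal-covariance assumption.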
Extended Data Fig. 8 PLS-based decoding methods are robust to multiplicative gain modulation and common mode fluctuations in the neural ensemble dynamics and yield identical conclusions to regularized regression.
a, b, To test whether PLS analysis and dimensionality reduction might lead to underestimates of d′, we compared d′ values determined using an L2regularized regression (L2RR) performed in the full space of neural responses (a) to those found by PLS analysis (b). The two methods yielded similar estimates of d′, which both saturated with increasing numbers of neurons. Plots show d′ values (mean ± s.d.) for neural responses within [0.83 s, 1.11 s] after stimulus onset, computed across 100 different randomly chosen subsets of neurons and visual stimulation trials (Extended Data Fig. 5b). For PLS analyses, we used half of the trials in each subset for decoder training and the other half for testing. For L2RR we used 90% of the trials in each subset to determine the regression vector and the other 10% to determine d′. We varied the regularization parameter, k, within [1, 10^{5}] and used the maximum d′ value so obtained, as determined independently for each mouse, subset of neurons, and subset of trials (217–332 trials per stimulus condition in each of 5 mice). c–h, The conclusions of our study depend on comparisons of decoding performance between real and trialshuffled datasets. Thus, we checked whether our PLSbased decoding methods would robustly detect informationlimiting correlations in models in which such correlations were present but weak; avoid reporting informationlimiting correlations in models lacking such correlations; and be robust to the potential presence of other strong sources of neural trialtotrial variability—such as common mode fluctuations and multiplicative gain modulation—even when they make an orderofmagnitude greater contribution to neural variability than the informationlimiting noise fluctuations. We studied these issues using two different computational models (Methods). For both models we plotted empirically determined (d′)^{2} values as a function of the number of neurons in the ensemble. 
We compared determinations of (d′)^{2} using PLSbased decoding and those made using L2RR to the actual ground truth values of (d′)^{2} in each model. In each panel, the top and bottom plots show results for unshuffled and trialshuffled datasets, respectively. Data points and error bars denote mean ± s.d. values across 30 different simulations. To examine the combined effects of informationlimiting noise correlations and common mode fluctuations (c–f) we studied a model of neural ensemble responses in which the noise covariance matrix exhibited informationlimiting noise correlations via a single eigenvector f, the eigenvalue of which grew linearly with the number of cells in the ensemble. In addition to this rank 1 component, we included a noise term that was uncorrelated between different cells, as well as a common mode fluctuation, yielding a noise covariance matrix with the form Σ* = σ^{2}I + ε_{common}J + ε f^{T} f, where σ^{2} = 1 is the amplitude of uncorrelated noise, I is the identity matrix, J is a rank 1 matrix of all ones, reflecting a common mode fluctuation, and f is the informationlimiting direction, a vector that we chose randomly in each individual simulation from a multidimensional Gaussian distribution with unity variance in each dimension. The amplitude of informationlimiting correlations was ε = 0.002, approximately matching the level observed in the experimental data. We chose the difference in the means of the two stimulus response distributions, Δμ, to be aligned with f (Fig. 3a) and to have a magnitude of 0.2 so that the asymptotic value of d′ for large numbers of cells approximately matched that of the data. We compared decoding results attained with and without the presence of the common mode fluctuations in the neural responses. In the version of the model without common mode fluctuations, we set ε_{common} to zero. 
In this case (c) both PLS and L2RRbased decoders correctly detected the saturation of information in the real data but not in trialshuffled datasets. (See Extended Data Fig. 10h, k for theoretical results showing how the accuracy of d′ estimates from PLS analysis depends on the numbers of neurons and experimental trials in this particular model.) To verify that our methods would not incorrectly report an information saturation when it was in fact absent, we next set ε = 0 and confirmed that in the absence of informationlimiting noise correlations (d), neither decoder detected a saturation of information in the real or shuffled data. In the version of the model with common mode fluctuations, we set ε_{common} = 0.02, ten times the value of ε = 0.002. In this case (e), both PLS and L2RRbased decoders correctly detected the information saturation in the real but not in the shuffled data. To verify that common mode fluctuations alone cannot induce an illusory saturation of information (f), we set ε = 0 while maintaining ε_{common} = 0.02 and confirmed that neither PLS nor L2RRbased decoders reported an illusory information saturation. Overall, these results indicate that our methods accurately detect the presence of weak informationlimiting correlations buried within common mode noise that can be an order of magnitude larger, without falsely detecting informationlimiting correlations when they are absent. To study the possible effects of multiplicative gain modulation (g, h), we compared two versions of a model in which the responses of the V1 neural population either were or were not subject to a multiplicative stochastic gain modulation but were otherwise statistically equivalent. We modelled the V1 cell population as a set of Gabor filters (see Appendix section 5). 
In the model version with gain modulation, on each visual stimulation trial we multiplied the output of each Gabor filter by a randomly chosen factor, uniformly distributed between 50% and 150%, the value of which was the same for all cells but varied from trial to trial. In the model version without gain modulation (g) both PLS- and L2RR-based decoders detected the information saturation in the real but not in the trial-shuffled datasets. When we added global gain modulation to the model (h) both decoders correctly found the information saturation in the real but not in the shuffled datasets.
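The ground-truth (d′)^{2} of the first model (c–f) follows directly from its noise covariance matrix, Σ = σ^{2}I + ε_{common}J + ε f f^{T} (writing f as a column vector), with Δμ aligned to f. The sketch below is an illustration under stated assumptions, not the simulation code of the paper: the function name is hypothetical, Δμ is taken as 0.2 × f, and the trial-shuffled value is approximated by keeping only the diagonal of Σ, which corresponds to the limit of unlimited trials.

```python
import numpy as np

def model_dprime2(n, eps=0.002, eps_common=0.0, delta_s=0.2, sigma2=1.0, seed=0):
    """Ground-truth (d')^2 for real and trial-shuffled versions of the model.

    Returns (d2_real, d2_shuffled) for an ensemble of n cells.
    """
    rng = np.random.default_rng(seed)
    f = rng.standard_normal(n)          # information-limiting direction
    dmu = delta_s * f                   # signal aligned with f (assumed scaling)
    cov = (sigma2 * np.eye(n)
           + eps_common * np.ones((n, n))   # common mode fluctuation, rank 1
           + eps * np.outer(f, f))          # information-limiting term, rank 1
    d2_real = dmu @ np.linalg.solve(cov, dmu)
    # shuffling removes correlations between cells but keeps each cell's variance
    d2_shuffled = np.sum(dmu**2 / np.diag(cov))
    return d2_real, d2_shuffled
```

Without the common mode term, the real value reduces to Δs^{2}|f|^{2}/(1 + ε|f|^{2}), which saturates near Δs^{2}/ε = 20 for large n, whereas the shuffled value keeps growing with n. Setting `eps_common=0.02` adds a fluctuation ten times larger than ε and can only lower the real value.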
Extended Data Fig. 9 Moving grating visual stimuli oriented at ±6° are harder to distinguish on the basis of their evoked neural ensemble responses than gratings oriented at ±30°, but also reveal the saturation of information signalling in large neural populations.
a, (d′)^{2} values determined using an ‘instantaneous’ decoder for the interval [0.70 s, 0.94 s] from visual stimulation onset, plotted as a function of the number of cells, n, in the ensemble in mice presented moving gratings oriented at ±6°. Data points represent mean values determined across 100 different subsets of cells, and the shading represents s.e.m. As in Fig. 3f, g, we fit the (d′)^{2} values as a function of n using a oneparameter fit, (d′)^{2} = (d′)^{2}_{shuffled}/(1 + ε × n), where (d′)^{2}_{shuffled} (n) is the empirically determined value of (d′)^{2} for the same number of cells in the shuffled data, and ε is the fit parameter. For each mouse, for both real and trialshuffled data we normalized (d′)^{2} values by the value of (d′_{shuffled})^{2} for n = 1,000 neurons. Goodness of fit: R^{2} = 0.41 ± 0.17 (s.d). N = 5 mice. ε = 0.0021 ± 0.0008 (s.d.), 122–167 trials per stimulus condition for each mouse. b, Same as a, but using the ‘cumulative’ decoding strategy over the [0 s, 0.94 s] time interval. c, Boxandwhisker plots of the asymptotic values of d′ in the limit of many neurons (right) and the number of cells at which (d′)^{2} attains half its asymptotic value (left) as determined from parametric fits to the data of a and b for the instantaneous (open boxes) and cumulative (filled boxes) decoding strategies. Optimal linear decoders (green data) slightly but significantly outperformed diagonal decoders (black data) (**P < 0.0001; onetailed Wilcoxon rank sum test; N = 100 different randomly chosen assignments of trials to decoder training and test sets in each mouse; 122–167 trials per stimulus condition for each mouse; open circles denote mean values from N = 5 individual mice). 
d, e, Histograms for the real (unshuffled) and shuffled datasets of the ensemble neural responses to each of the two visual stimuli, projected onto the direction of the optimal decoding vector determined by PLS analysis, as computed in each mouse viewing moving gratings oriented either at ±30° (d) or ±6° (e), using all imaged neurons and the instantaneous decoding approach. Error bars denote counting errors. Values on the x axes are plotted for each mouse in units of the s.d. of its neural ensemble responses along the decoding vector for the shuffled data. For each mouse, the histograms have approximately equal shapes for the two visual stimuli, are unimodal and approximately symmetric about their mean values, bolstering the use of linear decoding and d′. This analysis involved 217–232 trials per stimulus condition per mouse in d and 122–167 trials per stimulus condition per mouse in e.
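The one-parameter relation of a, (d′)^{2} = (d′)^{2}_{shuffled}/(1 + ε × n), can be fitted in closed form once it is rearranged as (d′)^{2}_{shuffled}/(d′)^{2} − 1 = ε × n. The sketch below uses this linearized least-squares estimate as a simple stand-in; the fitting routine actually used in the paper is not specified in this legend, and the function names are hypothetical.

```python
import numpy as np

def fit_epsilon(n, d2_real, d2_shuffled):
    """Estimate epsilon from (d')^2 measured on real and trial-shuffled data.

    Fits a line through the origin to (d2_shuffled / d2_real - 1) versus n.
    """
    n = np.asarray(n, dtype=float)
    y = np.asarray(d2_shuffled, dtype=float) / np.asarray(d2_real, dtype=float) - 1.0
    return float(n @ y / (n @ n))

def predict_d2(n, d2_shuffled, eps):
    """Model prediction for the real data given the shuffled values."""
    return np.asarray(d2_shuffled, dtype=float) / (1.0 + eps * np.asarray(n, dtype=float))
```

Because the linearization weights large ensembles more heavily than a direct fit to (d′)^{2} would, the estimate may differ slightly from a nonlinear least-squares fit on noisy data.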
Extended Data Fig. 10 Hundreds of experimental trials sufficed to estimate the statistical structure of signals and noise in visual cortical coding.
a, b, PLS analysis represents ensemble neural responses in a low-dimensional subspace that aids understanding of visual discrimination (Fig. 4). On the basis of Extended Data Fig. 7a–c, computations here used the five most informative PLS dimensions. Each column shows results from an individual mouse that viewed gratings oriented at ±30° (217–332 trials per stimulus). Each colour denotes a different eigenvector, e_{α}, of the noise covariance matrix in the five-dimensional subspace. α denotes the dimension index, {1,2,3,4,5}. As illustrated in Fig. 4e, each mouse had multiple eigenvalues, λ_{α}, of the noise covariance matrix that increased with the number of cells, n, used for analysis. As shown in Fig. 4f, visual signals (defined as the mean separation, Δμ, between the two response distributions) also increased with n. a, b show eigenvalues λ_{α} (a) and signal components Δμ · e_{α} (b) plotted against the number of trials analysed. Both signal and noise estimates plateau, indicating that there were sufficient trials to accurately estimate signal and noise structure in the reduced five-dimensional space. Throughout a–d, lines and shading denote mean ± s.d. across 100 different randomly chosen subsets of cells and assignments of trials to decoder training and testing, except in a, b we used all cells from each mouse and 30 different assignments of trials. c, d, The statistical relationships between visual signals and noise show the largest noise mode is not information-limiting. Each mouse had multiple eigenvalues, λ_{α}, of the noise covariance matrix (c) that increased with n, the number of cells. Visual signals (d) also increased with n, as shown by decomposing Δμ into components along the five eigenvectors, e_{α}. In every mouse the eigenvector with the largest eigenvalue, e_{1}, was the least well aligned with the signals, Δμ (compare red curves in c, d). 
e, Plots of noise values, computed as in c, versus signal values, computed as in d, based on all recorded neurons from each mouse and the same 100 subsets of data used in c, d. The largest noise mode (red points) was generally an order of magnitude greater than noise modes that limited neural ensemble signalling (green and yellow points). f–k, In a–e and throughout much of the paper, we analysed populations of up to 2,191 neurons using 217–332 trials with each stimulus, which sufficed to accurately determine the Fisher information, (d′)^{2}, and principal eigenvectors of the noise covariance matrix (Fig. 4). By comparison, there were insufficient trials to accurately determine noise covariance matrix elements, that is, noise correlations between cell pairs (Fig. 2d). To explain this, we derived the accuracy with which d′ and principal noise covariance eigenvectors and eigenvalues can be estimated through PLS analysis of recordings of n neurons across P trials, using the computational model of Extended Data Fig. 8c (Appendix section 6 has derivations of results in f–k). The central idea, illustrated in f, is that one can estimate accurately the principal noise covariance eigenvector, because it has a large eigenvalue, λ, that grows linearly with n (\(\lambda \cong cn\), where c is a constant). The theory predicts that the correlation coefficient, \({\mathscr{C}}\), between estimated and actual eigenvectors is given by \({{\mathscr{C}}}^{2}=\frac{cP-1/(cn)}{cP+1}\), for \({c}^{2}Pn > 1\). Otherwise, \({\mathscr{C}}\) = 0. f shows predictions for \({\mathscr{C}}\) (black curve) versus the number of trials, P, for n = 2,000 and c = 0.005. We chose this value of c to fall within the lower range of growth rates for experimentally determined eigenvalues. The predicted \({\mathscr{C}}\) values match those describing the accuracy (red points) with which we could estimate the principal noise covariance eigenvector in the computational model. 
However, correlation coefficients (blue points) between estimated and actual individual elements of the noise covariance matrix were unsatisfactory, even with 800 trials. i shows predicted values of \({\mathscr{C}}\) as a joint function of n and P. Isocontours of \({\mathscr{C}}\) are hyperbolic, revealing a trade-off such that recording more cells enables accurate estimation of noise eigenvectors using fewer trials. We also derived how accurately one can estimate eigenvalues of the noise covariance matrix, as quantified using the ratio, \({\Re }_{\lambda }=\lambda /\hat{\lambda }\), where λ = cn is the actual eigenvalue in the model and \(\hat{\lambda }\) is the estimate based on P trials. The theory predicts \({\Re }_{\lambda }=\frac{cP}{cP+1}\) when \({c}^{2}Pn > 1\); otherwise we set \({\Re }_{\lambda }=0\), because we cannot accurately estimate the corresponding eigenvector when \({c}^{2}Pn < 1\). g plots predictions of \({\Re }_{\lambda }\) (black curve) versus P (for n = 2,000 cells and c = 0.005), which match the accuracy with which we estimated the model eigenvalues from simulated data (red dots). j shows \({\Re }_{\lambda }\) predictions as a joint function of n and P. We also studied how well one can estimate the Fisher information, (d′)^{2}, via PLS analysis of data with fewer trials than recorded neurons. We examined the ratio, \(\Re \), of the d′ estimate to its actual value using the model and simulated data of Extended Data Fig. 8c and found \({\Re }^{2}=\frac{1+n\varepsilon }{1/{{\mathscr{C}}}_{{\rm{PLS}}}^{2}+n\varepsilon }\), where \({{\mathscr{C}}}_{{\rm{PLS}}}^{2}=\frac{\Delta {s}^{2}P+4(\varepsilon +1/n)}{\Delta {s}^{2}P+4(\varepsilon +1)}\) is the predicted correlation coefficient between the PLS regression vector and the optimal one. Here Δs^{2} and ε determine the Fisher information in the model of Extended Data Fig. 8c via \({({d}_{{\rm{opt}}}^{{\prime} })}^{2}=\frac{n\Delta {s}^{2}}{1+n\varepsilon }\). 
As in Extended Data Fig. 8c, we used ε = 0.002 to match the growth rate of (d′)^{2} in experimental data with increasing n, and Δs^{2} = 0.04 to approximate the magnitude, \(\frac{\Delta {s}^{2}}{\varepsilon }\), of (d′)^{2} in the data for large n. \({{\mathscr{C}}}_{{\rm{PLS}}}^{2}\) increases monotonically with P and n, confirming that PLS regression improves as n and P increase. As \({{\mathscr{C}}}_{PLS}^{2}\) nears 1, so does \({\Re }^{2}\), indicating that PLS analysis can accurately estimate (d′)^{2}. h shows predictions for \({\Re }^{2}\) versus P for n = 2,000 cells (black curve). The theory matches the accuracy with which we estimated (d′)^{2} via PLS analyses of the simulated model data (red dots). k shows predicted \({\Re }^{2}\) values versus n and P. Isocontours of \({\Re }^{2}\) are hyperbolic, indicating recordings of more neurons permit accurate estimates of (d′)^{2} based on fewer trials.
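The closed-form predictions quoted in f–k are simple to evaluate numerically. The sketch below transcribes them into plain Python with hypothetical function names. It reads the numerator of the expression for \({{\mathscr{C}}}^{2}\) as cP − 1/(cn), the reading under which the prediction falls to zero exactly at the stated threshold \({c}^{2}Pn = 1\); this reading is an assumption of the sketch.

```python
def eigvec_corr2(n, P, c):
    """Predicted squared correlation between estimated and true principal noise eigenvector."""
    if c**2 * P * n <= 1:
        return 0.0                      # below threshold the eigenvector is not recoverable
    return (c * P - 1.0 / (c * n)) / (c * P + 1.0)

def eigval_ratio(n, P, c):
    """Predicted ratio between the actual eigenvalue (c * n) and its estimate from P trials."""
    if c**2 * P * n <= 1:
        return 0.0
    return c * P / (c * P + 1.0)

def pls_corr2(n, P, eps, ds2):
    """Predicted squared correlation between the PLS regression vector and the optimal one."""
    return (ds2 * P + 4.0 * (eps + 1.0 / n)) / (ds2 * P + 4.0 * (eps + 1.0))

def dprime_ratio2(n, P, eps, ds2):
    """Predicted squared ratio of the estimated d' to its actual value."""
    return (1.0 + n * eps) / (1.0 / pls_corr2(n, P, eps, ds2) + n * eps)
```

For n = 2,000 cells, c = 0.005 and P = 300 trials, these expressions give a squared eigenvector correlation of 0.56 and an eigenvalue ratio of 0.6. With ε = 0.002 and Δs^{2} = 0.04, the predicted accuracy of (d′)^{2} always exceeds that of the PLS regression vector itself, because the term n × ε appears in both the numerator and denominator.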
Supplementary information
Supplementary Information
Supplementary Appendix: Mathematical derivations and analyses regarding information-limiting noise correlations.
Supplementary Information
Supplementary Note: Technical discussion of large-scale two-photon imaging. References for the Supplementary material.
Supplementary Table
Supplementary Table 1: Summary of statistical results associated with the figures and extended data figures.
Video 1: The 16-beam two-photon microscope enables simultaneous monitoring of Ca^{2+} dynamics in >2,000 cortical neurons in an awake mouse. A two-photon Ca^{2+} video of the activity of layer 2/3 visual cortical pyramidal neurons expressing GCaMP6f in an awake mouse. 2,191 individual cells were identified in the full video dataset from this mouse. The data were recorded at 7.23 Hz and are played back at 8× real speed (30 fps playback, with each frame the average of two image acquisitions). The field-of-view is 2 mm × 2 mm.
Video 2: Large-scale Ca^{2+} dynamics of layer 2/3 neocortical neurons in an awake mouse, recorded at 17.5 Hz over a 2 mm × 2 mm field-of-view. A two-photon Ca^{2+} video of the activity of layer 2/3 visual cortical pyramidal neurons expressing GCaMP6f in an awake mouse. The data were recorded at 17.5 Hz and are played back at 6.8× real speed (30 fps playback), with each displayed video frame equaling the average of four image acquisitions. During preprocessing, we stitched together the 16 image tiles, corrected for brain motion artifacts via image registration, and adjusted the contrast to highlight the details (Methods).
Video 3: Large-scale Ca^{2+} dynamics of layer 5 neocortical pyramidal cells in an awake mouse imaged with the 16-beam microscope. A two-photon Ca^{2+} video acquired 500 μm below the cortical surface in a transgenic mouse (tetO-GCaMP6s/CaMK2a-tTA) expressing GCaMP6s in a subset of layer 5 cortical pyramidal cells^{38}. The greater Ca^{2+} affinity and fluorescence output of GCaMP6s as compared to GCaMP6f enabled us to image layer 5 cells using the same total illumination power as the maximum value (320 mW) used elsewhere in the paper for studying layer 2/3 neurons expressing GCaMP6f. The data were recorded at 17.5 Hz, processed in the same way as Supplementary Video 2, and played back at 6.8× real speed (30 fps playback). The field-of-view is 2 mm × 2 mm.
Cite this article
Rumyantsev, O.I., Lecoq, J.A., Hernandez, O. et al. Fundamental bounds on the fidelity of sensory cortical coding. Nature 580, 100–105 (2020). https://doi.org/10.1038/s41586-020-2130-2