Abstract
Motion detection represents one of the critical tasks of the visual system and has motivated a large body of research. However, it remains unclear precisely why the response of retinal ganglion cells (RGCs) to simple artificial stimuli does not predict their response to complex, naturalistic stimuli. To explore this topic, we use Motion Clouds (MC), which are synthetic textures that preserve properties of natural images and are merely parameterized, in particular by modulating the spatiotemporal spectrum complexity of the stimulus by adjusting the frequency bandwidths. By stimulating the retina of the diurnal rodent, Octodon degus with MC we show that the RGCs respond to increasingly complex stimuli by narrowing their adjustment curves in response to movement. At the level of the population, complex stimuli produce a sparser code while preserving movement information; therefore, the stimuli are encoded more efficiently. Interestingly, these properties were observed throughout different populations of RGCs. Thus, our results reveal that the response at the level of RGCs is modulated by the naturalness of the stimulus - in particular for motion - which suggests that the tuning to the statistics of natural images already emerges at the level of the retina.
Similar content being viewed by others
Introduction
Motion detection is essential for animal survival, and studies show that many retinal cells1,2,3 and a considerable portion of the cortex4 are involved in its processing. However, most of these studies have focused on the detection of artificial stimuli, such as random dots, lines, or moving gratings, while ignoring the natural signals from the environment in which animals have evolved and to which their visual system would be attuned to5,6,7,8,9. Moreover, computational models based on the response to simple, artificial stimuli fail to predict the response to naturalistic images10,11. Nevertheless, working with natural images directly is not always possible due to their complexity regarding critical parameters of signals processing, including visual content and its variability. The latter makes it hard to control the motion information that is effectively being presented to the visual system12,13.
In the visual cortex, motion is detected at different levels and by populations of cells attuned to different parameters of the stimulus (see4 for a review). It has been reported that visual neurons stimulated with complex images, which should be closer to natural images, change their tuning curves compared to the response obtained with simple stimuli, becoming narrower and thus more selective to speed14. In turn, stimulation with naturalistic images generates sparser code15,16 and eye tracking responses become more precise17.
In the retina, the most basic form of motion detection is the response to moving textures in which Y-like Retinal Ganglion Cells (RGCs) increase their firing rate when sub-regions of their Receptive Field (RF) alternate between light and dark luminosity levels through time, regardless of the direction of motion18,19. It has been demonstrated that these mechanisms arises from the RF properties of ON and OFF Bipolar Cells (BCs) that connect to the RGCs20 and transmit their activity in a non-linear fashion. This non-linear integration of the activity of BCs allows that the alternating activation of cells of both polarities do not cancel each other upon reaching RGCs21. These BCs can generate fast transient responses when there is a luminance change of the corresponding polarity (dark to light or light to dark respectively), so each time that BCs detect a temporal change in luminance, they will generate EPSPs in the corresponding RGCs. The integration time of the RGCs is longer than the time course of the response of the BCs, so fast successive activations will be added. In this way, the RGCs will respond with higher firing rates to higher speeds because of the higher rate of alternating activation of the ON and OFF sub-fields.
More specialized computations are carried out by, e. g. Direction Selective RGCs (DSRGC) which respond preferentially to a single direction in a varied range of speeds depending on the RGC type and the species1,3,22. The circuitry involved in the computation of DSRGCs relates the integrated activity of BCs and starburst amacrine cells (SACs). At least two types of calculations are simultaneously present in the retina projecting to different places of the central nervous system23,24,25. ON-OFF DSRGCs have been reported both in mouse and rabbits, responding to ON and OFF stimuli and being selective to four different types of motion direction in a broad range of speed. Besides, there also exists an On-DSRGC type reacting to slow velocities23. Rabbit and salamander retina report Object Motion RGCs detectors, which fire when a small patch of the image moves in a different pattern than the rest of the background26,27. More recently, it has been reported that the mouse retina contains a non-DS high-definition RGC, which is mainly selective to the motion of small objects across their RF28. Another sophisticated processing of motion, present in salamander retina, is the response to Motion Onset29, in which the initiation of movement within the RF of the RGC causes a higher response than does an object moving smoothly through it. While it has been shown that the retina is capable of advanced computations beyond standard processing30, many capabilities remain unexplored31, for example, the fine tuning of retinal responses to motion in natural images.
In this study, using retinas from Octodon degus, a diurnal crepuscular rodent with a high proportion of cones (30%)32 and a relatively high density of RGCs, we focus on the properties of RGCs responses to motion in several scenarios. To overcome the limitations of working with simple, low-dimensional artificial stimuli, while at the same time avoiding the problems of working with natural images12, we used synthetic random textures called Motion Clouds (MC)33, which preserve some of the properties of natural images. The MCs mimic the spatiotemporal changes in luminance of natural scenes using a generative model for the geometric transformations of visual objects (translation, rotation, zoom)34. In contrast to a traditional drifting grating which has a single spatial and temporal frequency, a MC resembles a dynamic texture for which components (textons) may have different sizes and move at different speeds (see Fig. 1a for comparison of these two stimuli). While in general, natural images have their power distributed along a broad range of frequencies, MC can be parametrized to have a restricted range of frequencies to limit their information content or to have a broad spectrum to resemble natural images. As such, each generated texture corresponds to a compound of different moving gratings with random positions (phases) yet with spatiotemporal properties distributed along a parameterizable area in Fourier space. Here, we use the same distribution near the spatiotemporal speed plane, and control the amount of components by the bandwidth around the central spatial frequency. This defines a simple parameter to control the complexity of the input stimulus as the amount of compounded components in the texture (see Fig. 1b). This manipulation has previously been used successfully to study speed discrimination in humans17, where it was shown that additional information contained in the stimuli improves eye pursuit gain and precision.
To study such a property at the level of the retina, the activity recorded from a population of RGCs was compared using simple moving stimuli (drifting gratings) vs. MC’s with the same speed and two levels of naturalness, regarding their spatial frequency spectrum (narrow and broad bandwidth). We found that for a significant fraction (66%) of RGCs responding to the motion, the cells’ tuning bandwidths become narrower when the spatiotemporal frequency content of the stimulus increased. The set of RGCs studied contains cells traditionally assigned to different types, indicating that this property is not exclusive of a single class, but instead is widespread in the retina. At the population level, this change in tunning bandwidth results in sparser codes, which in turn reflects a more efficient coding of the motion information. Therefore, our findings show that fine-tuning of motion detection to natural image statistics already emerges at the level of the retina.
Results
We characterized the response from a population of speed responsive RGCs, recorded from retinal patches of 2 young Octodon degus, a diurnal rodent, having a total of 308 RGCs (see Methods section and Fig. 2 for details).
Speed responsive cells in the retina as characterized using gratings
We measured RGCs responses to a set of drifting gratings (artificial simple stimuli), with varying speed and spatial frequency (see Fig. 3a). For each spatial frequency, the response to speed averaged over trials was fitted to a skewed Gaussian. In Fig. 3b, we show how preference for certain speeds changes as a function of spatial frequency. As expected for RGCs, preferred speed decreases at higher spatial frequencies. In the example shown, the response is maximal at intermediate speeds and spatial frequencies, diminishing gradually as one moves away from that preferred combination of parameters. When looking at the fitted curves, it can be seen that the whole curve shifts towards lower speeds when increasing the spatial frequency.
A subset of the complete set of valid RGCs was separated and classified as speed responsive cells (SR), i.e. cells whose response is modulated by speed, by keeping only cells whose average firing rate correlated with the response strength, by estimating its correlation with the standardized component of the amplitude spectrum zF135; and then, cells whose speed response curve to the drifting gratings has a good fit to eq. 3 at every spatial frequency tested (χ2 < 0.05 for the normalized response). We analyzed retinal patches from two animals, where 81 and 109 SR cells were found, respectively, representing ≈26% and ≈29% of the total RGCs recorded. Figure 4a shows the total number of RGCs found in a sample piece of retina, and their respective RF, of which the SR cells are highlighted in color (a second example is shown in Figure S1 in the Supplementary Information section).
To discriminate between SR and Non-SR cells, we analyzed the RF properties of each class. In terms of ON/OFF RGCs distribution, and similar to previous studies performed in the same animal model36, a low ratio of ON/OFF cells were found. The temporal profile of a cell’s response was characterized using the parameters indicated in Fig. 4b: zero-cross and peak-time. In Fig. 4c we show the distribution of these two parameters along the entire population of RGCs, separated by their speed responsive selectivity (SR cells are represented in red, while Non-SR in black). Interestingly, only the zero-cross parameter is significantly different between SR and Non-SR cell populations (p = 0.026 Kolmogorov-Smirnov test, p = 0.65 for the difference in peak time). Additionally, there was no direct match with other traditional criteria for classification of RGCs, such as Fast-Slow and Transient-Sustained using the latency and transience index37 of the response to a full-field flash, since our set of SR cells contained cells belonging to each class (24 Fast-Transient, 13 Fast-Sustained, 19 Slow-Transient and 25 Slow-Sustained from 81 SR cells).
Speed selectivity changes with the complexity of the stimulus
When presenting a stimulus with the similar basic properties as the drifting gratings but with energy distributed along a larger area in parameter space, the responses from RGCs differ significantly. Examining in detail the response at the preferred spatial frequency (Fig. 5a) reveals that the firing rate response to lower and higher speeds is much weaker for the MC stimuli compared to the grating, while the responses become equivalent only around the preferred, intermediate speeds. This effect is more evident when looking at the tuning curves (Fig. 5b), where the response decreases more sharply when moving away from the preferred parameters for the complex stimuli. The change in the shape of the curve is evidenced by a change in the σ value of the fit, which decreases progressively from 1.76 to 1.45 and 0.95 as the complexity of the stimulus increases, while maintaining the same preferred speed (1.47;1.37;1.49 mm/s respectively). The firing rates are stable across repetitions, so the narrower tuning is not due to drifts or decay in cell activity. When considering all the conditions tested, the response profile to the MC stimuli cover a smaller area in parameter space along both axes, when viewing it as a two dimensional map of the response at each combination of parameters (Fig. 5c). As expected, speed preference decreases with spatial frequency, as seen by the slanted response profile.
This change in the response results in tuning curves with narrower profiles, thus the tuning of the RGC becomes more specific for certain parameters of the stimulus. Besides this change in the shape of the tuning curve, for many RGCs, the response to the MC stimuli has lower firing rates, as is the case for the example shown in Fig. 5b, while for others, the responses are equivalent and even higher in some cases (see examples in Fig. 5d).
Another interesting and unexpected finding is that some of the RGCs with narrower tuning for speed also present changes in their preferred speed, mostly towards higher speeds (Fig. 5d, panels on the right). Low firing rates obtained for MCs compared to gratings could be explained by differences in the contrast stimuli, but not the sharpening of the speed-selective curve (see Box 1). The low contrast observed in MCs is due to an increase of the spatial frequency bandwidth. In the grating analysis, we observe that for higher spatial frequency bandwidths, the peak activity for the central frequency gets smaller, and not narrower.
To quantify the change in the tuning bandwidth of the grating versus MC response, we measured the properties of the curve fitted to the response to speed (see Methods section for details). In Fig. 5d we show some examples of RGCs with narrower tuning in terms of lower response to the non-preferred speeds. At the population level, a large proportion of the SR cells show a decrease in the bandwidth of the response to speed at their preferred spatial frequency, for the complex stimuli when compared to the response to drifting gratings. This is evident when looking at the comparison of σv in Fig. 5e; for both MC series, a large portion of the points falls below the line of identity, meaning that the values are lower than for the grating stimulus. The distributions of σv are skewed towards lower values (Fig. 5f), meaning that a large portion of the speed responsive RGCs have narrower tunings when stimulated with the complex stimulus (T = 1009 p = 0.00216 for grating vs. narrow bw. MC, T = 843 p = 0.00012 for the broad bw. MC, Wilcoxon signed-rank test). With regard to the preferred speed, even though that for some RGCs the preferred speed increases (see the clusters at the lowest and highest speeds in Fig. 5g), overall, the differences are not highly significant (Fig. 5h T = 1159 p = 0.018 for grating vs. narrow bw. MC, T = 1632 p = 0.893 for the broad bw. MC, Wilcoxon signed-rank test).
Population responses becomes sparser for more naturalistic stimuli
When pooling the response of many RGCs, as a motion sensing neuron in higher cortical areas would do14, the response profile of the recorded population also shows differences when stimulated with the complex, more naturalistic stimuli. As expected from the individual responses, the response profile to speed depends on the spatial frequency (Fig. 6a left panel). The curve not only shifts when changing the spatial frequency, but the peak response also decays as the spatial frequency increases.
When comparing the response to the simple vs complex stimuli, the preference for lower speeds at higher spatial frequencies is still observed, but with a significant decrease in response for almost every condition (Fig. 6a middle and right panels). However, this decrease in response is larger for the lower speeds at every spatial frequency tested, to the point that for the two lowest speeds, the difference in response between spatial frequencies becomes non-significant (no significant difference at speeds 0.25 and 0.5 mm/s, p > 0.05 Wilcoxon signed-rank test), so the left part of all the tuning curves converge to the same values.
However, these lower firing rates elicited by the complex stimuli would not necessarily be detrimental to the coding properties of the retina. Instead, lower average firing rates could be associated with sparse coding. To see if this were the case, we computed the average population sparseness (eq. 5) for each stimulus condition (how many RGCs and how much they are responding to each stimulus). When compared to the simple stimulus, there are significant increases in sparseness in 8 conditions for the MC with narrow bandwidth and 19 conditions for the broad bandwidth stimulus at each combination of speed and spatial frequency tested (Fig. 6b).
A sparse response encodes motion information efficiently
To determine if the sparser retinal response is still encoding the motion information, we applied a reconstruction framework to decode the trajectory of a moving stimulus from the response of a group of RGCs and then extract information derived from the trajectory. As can be seen in Fig. 7, we constructed retinal images from the spikes emitted by each RGCs and its respective RF. To assess the ability to extract motion information from the reconstruction, we applied a process analogous to the way velocity is computed in cortical visual areas38 (see Methods section for details). An example of this process is shown in Fig. 7. Even though that for some of the conditions the first stage of decoding was enough to get the correct speed (in the example, the lowest and the two highest speeds), in general the decoding improves at the Motion Energy and Opponent Motion stages, in the sense that the set of filters that produces the highest activation is the one that matches the presented speed.
The precision of speed decoding for the different conditions can be seen in Fig. 8. The motion traces obtained for all the stimuli can be seen in Fig. 8a. To measure the quality of the estimation, we computed the error rate as 1 minus the number of trials for which the estimation is correct, over the total number of trials. As seen in Fig. 8b, some of the conditions have zero error rate, while for the lowest speed, the error rate is 1 for many of the spatial frequencies for the three types of stimulus. For the gratings, most values are either one or zero, while for the MCs some conditions show intermediate values, meaning that for some trials the decoding was correct and for others it failed. When looking at the cumulative error rate across spatial frequencies (Fig. 8c) it becomes apparent that the decoding error for gratings is very high at the lowest and highest spatial frequencies, decreasing gradually towards the intermediate spatial frequencies. The same pattern can be seen for the MCs, however, the difference in decoding error for different spatial frequencies is lower, and stays within an acceptable range for all conditions. For broad bandwidth stimuli, the error rate is distributed more evenly across spatial frequencies, with bad decoding performance concentrated at the lowest speed.
While the distribution of the errors is different for the three types of stimuli, overall, the total error rate is slightly lower for the MC stimuli (Fig. 8c, rightmost plot, T = 0.0, p = 0.016 and T = 3.0, p = 0.034 Wilcoxon signed-rank test). Thus, it can be considered that there is a slight improvement in decoding performance. Since we were able to estimate the stimulus speed with an accuracy of ≈0.7 (1 - the total error rate) for the three types of stimuli, it could be argued from an information point of view, that the retina is transmitting at least enough information about the stimulus to encode speed under the three conditions. However, as shown in Fig. 5, MCs elicit responses with lower firing rates. So, if we quantify the cost of accuracy for each type of stimulus, in terms of the number of spikes used to successfully transmit the speed of the stimulus, we get that for the gratings the value is ≈8.26 spikes/cell, while for the MC narrow and broad is 7.56 and 5.57 respectively, and these values are statistically significant (T = 0.0, p = 0.005 for the MC narrow and T = 0.0, p = 0.0050 for the MC broad, Wilcoxon signed-rank test).
Discussion
Our results strongly suggest that the tuning to naturalistic stimuli is already present at the level of the retina. For the first time, this showed that the tuning curves of response to motion become narrower under stimulation with increasingly more complex images, where we defined complexity as the number of compounded features of different spatial frequency but similar speed in the input texture. In particular, the bandwidth parameter controls complexity as it may generate a simple crystal-like stimulus (a grating component) with a very narrow bandwidth up to a smoke-like texture with a similar 1/f fall-off as natural images as the bandwidth grows to infinity. In the present work, the difference between each set of stimuli is the spatial frequency content, i.e. the variety in sizes for the elements that constitute the images; while the three classes of stimuli have the same average properties (spatial frequency and speed) for each set, the MC contain additional signals or components around this central value.
Additional signals in a stimulus can modulate the response of a RGC in different ways. It has been shown that isolated stimuli that fail to modify the activity of a RGC can alter the response to another stimulus when presented simultaneously39. The latter would stem from nonlinearities in the processing of the signal performed by the RGC, in the sense that the response to a complex stimulus is not merely the response to the sum of its components. In some cases, the presence of signals around the RF, product of an increased area of stimulation, is enough to trigger changes in the response15. One of the mechanisms responsible for this effect is surround suppression, in which the presence of a stimulus in an area surrounding the spatial RF will decrease the response to the stimulus in the center, which is partly explained by an increase in the amplitude of inhibitory post-synaptic potentials16.
In our experiments, all stimuli cover the same area, which exceeds the area of the RFs of the recorded RCGs, so the presence of a stimulus in the surround would not by itself be enough to explain the changes in the observed response. Surround suppression occurs under a variety of stimulation patterns, both spatially and temporally40. However, for simple stimuli it has been shown that the effect is most active when the parameters of the stimulus on the center and the surround are the same41, which would be the case for the gratings in our experiments; however, our data shows that this is not the case Fig. 5.
Seminal works in vision psychophysics established that spatial frequency sensitivity is dependent on contrast and overall luminance levels42. In general, sensitivity to high spatial and temporal frequency decreases at lower contrast levels. It could be argued then that our results reflect the effects of changes in contrast and not changes in spatial frequency. With our naturalistic stimulus, the relative contrast of each component of the images indeed scales with its spatial frequency, so components with higher spatial frequencies have lower contrast, following the same 1/f scaling characteristic of natural images. However, the contrast of the main components (those with the central spatial frequency sf0 or close to it) have the same contrast as the grating stimulus, while only the high-frequency components have lower contrast; this distribution of contrasts is stationary during the whole experiment, so the primary response drivers have the same contrast across the three types of stimuli. On the other hand, the effect of lower contrast on tuning curves is the opposite of what we report, as stimuli with lower contrast generate broader tuning curves43, at least in the visual cortex. Finally, it has been shown that adaptation can alter the relationship between spatio-temporal tuning curves and contrast44. Even though retinal adaptation - occurring at multiple stages - to luminance levels is relatively fast (in the order of seconds), the stimulation time necessary to generate spatial frequency adaptation is longer, so we explicitly designed the stimulation protocol to avoid adaptation by limiting the exposure time to 3 seconds and interspersing the images with 1 second of blank, average luminance stimulation. The stability in the response during stimulation and across trials for the three classes of moving stimulus (Fig. 5) shows that we can consider the effect of contrast, concerning the impact of spatial frequency bandwidth, to be negligible.
It has been shown that a non-linear interaction between the center and surround parts of the RF in RGCs modulates the response to spatial patterns in a context-dependent manner and that this effect is more significant during naturalistic stimulation45. We think that this effect can explain, at least in part, the results that we observe; the additional signals contained in the complex stimulus would act as non-linear inhibitors, resulting in lower response when the parameters are farther from the preference of the RGC, producing narrower tuning curves. Moreover, the narrower speed selectivity tuning observed in RGCs, when stimulated with naturalistic-like stimuli such as MCs, seems to be a general property of many RGCs types in the retina and not related to a single RGC type. Recently, it has been reported the relevance of gap-junction connections between BCs in motion computation46,47. This early mechanism, together with nonlinearity computations at the output of BCs48, increases RGCs motion sensitivity to correlated spatio-temporal input, such as the case of MCs.
Interestingly, together with this narrower bandwidth, approximately a third of the RGCs show an increase in preferred speed (see two examples in Fig. 5, right panels), while others preserve their preference (Fig. 5, left panels). The change in the speed preference cannot be attributed to changes in the spatial frequency content of the stimulus. Due to the bandwidth of the stimuli, the images will include additional spatial frequencies, but the shape of the spectrum (see Methods: Naturalistic stimuli) determines that most of the additional frequencies will be higher than the central frequency (sf0). If the RGC processed these higher spatial frequencies linearly, they should shift preference towards lower speeds - but our results show the opposite. From our analysis, it is difficult to relate or predict this behavior from other properties of the RGC’s response, since the shift in the preferred speed appears not to correlate with any of the RF’s characteristics.
Which are the possible implications or consequences of having these narrower tunings for naturalistic stimuli? At the level of individual RGCs, each RGC will only respond when the stimulus is near the preferred parameters. But this does not mean that a RGC will encode its preferred stimulus. Indeed, it has been shown that steeper tuning curves will encode more efficiently because stimuli will be more readily discriminated in the high slope range49. As such, these results mean that for speed-selective RGCs an input with a broader spectrum of spatial frequencies will have a higher precision in speed, independently of the local scale (spatial frequency) of the stimulus. This mechanism speaks to the dual mechanisms at the origin of selective responses (here speed) and of the invariance to other features (here scale). In addition to this effect at the individual RGC level, when looking at the population code, narrower tuning curves can lead to less overlap between RCGS, so in conjunction, the code will contain less redundancy, an essential aspect of the efficient coding theory50,51,52. In support of this, we showed that population sparseness increases for the naturalistic stimuli, so in this respect, the retina is performing better under these types of stimulation, in the sense that a sparse code is more efficient from an information transmission point of view.
To see if these changes in tuning curves affect the coding of the motion signal, we implemented the stimulus reconstruction and speed decoding method. We chose speed as the relevant parameter for motion estimation, but we expect that with the proper adjustment, any property of the stimulus could be recovered, provided one can obtain a sufficiently good reconstruction53. Another simplification that we took advantage of, is that our experimental design is based on the independent presentation of a single speed for each stimulus set (with slight variations around an average value for the MCs); this means that for the duration of each segment of stimulation, the whole population of RGCs is “seeing” the same speed. Thus, to recover the speed of the stimulus from the reconstruction, we only need to get a single value from the whole population and for the entire presentation of each sequence.
Our reconstruction method could be improved by varying some of the parameters, such as the weights of the sum, to minimize the error of the results, as is usually done in this type of work. However, even though when the models obtained by those methods are biologically plausible, the process of a posteriori maximization of the quality of decoding is hardly naturalistic, because no biological sensory system has access to the “real” value of the signal to perform the minimization step that is essential to this kind of approach.
Another possible improvement would be to incorporate nonlinearities in the decoding process, possibly by taking into account the history of the firing of a RGC instead taking every spike with the same value. Nonlinear models have been reported to give better stimulus reconstruction, especially for complex images54.
In any case, even though our reconstructions have a relatively low degree of fidelity, they contain enough information to compute the speed of moving features in the images successfully. The latter is crucial because it means that even though these complex stimuli elicit lower retinal responses, the relevant information is still contained in these lower responses. In this regime of lower firing rates, energy expenditure would be lower, and the cost of information would be lower55. Thus, this simplified approach is good enough to support the hypothesis that the retina would efficiently encode naturalistic stimuli.
In conclusion, we have shown that the retina, of a diurnal rodent, would be adapted to at least some of the characteristics of natural images, specifically those related to motion and the spatio-temporal information content and correlations used in this study. This adaptability is expressed in the narrower shapes of the tuning curves responses, which would lead to less overlap between RGCs in the feature space and thus less redundant and sparser population code, with smaller firing rates, which translates into less energy expenditure and less channel saturation. One limit of the stimuli that we used here is that they are densely packing the image space. However, natural images may exhibit some skewness and positive kurtosis in the distribution of luminance, reflecting the fact that the visual images are in general generated using a sparse number of components. This may introduce a different level of hierarchy in our definition of complexity: sparser textures may introduce spatial structures compared to the stationary nature of the MC textures used in this study. An interesting perspective is, therefore, to extend such stimulations in the retina to different levels of sparseness and to challenge the hypothesis that the processing in the retina could be differentially adapted to these different levels.
Methods
Animals
Young Octodon degus born in captivity are maintained in a controlled facility. Prior to each experiment, animals were put in darkness for 30 minutes, then deeply anesthetized with halothane and beheaded. Eyes were removed and dissected at room temperature under red illumination. Experimental procedures were approved by the Institutional Committee on Bioethics for Animal Research from the University and in accordance with the bioethics regulation of the Chilean Research Council (CONICYT).
Degus are diurnal rodents with 30% of their photoreceptors being cones and a comparatively high number of RGCs, qualities that makes them a good model for studying vision. In addition, the large area of the retina makes degus especially suitable for Multi Electrode array recordings as good coverage of all the electrodes in the matrix can be obtained. For a complete description of the model see32,56.
Electrophysiological recordings
The experimental protocol was described before and follows36. Briefly, the physiological response of degu RGCs to different types of visual stimuli was measured using a Multi Electrode Array557,58 (USB MEA256, Multichannel Systems GmbH, Reutlingen, Germany) using 256MEA100/30iR-ITO matrices and sampling at 20 kHz. After removing the eyes, the posterior hemisphere was dissected in quadrants and the pigmented epithelium separated from the retina. Finally, a piece of retina was mounted on a dialysis membrane and mounted on a perfusion chamber, which was then lowered onto the electrode array with the RGCs side down. Recording commenced under perfusion with AMES medium bubbled with 95% O2 5% CO2 at 33 °C and the pH adjusted to 7.4.
Visual Stimulus Presentation
Stimulus display was performed with a conventional DLP projector using custom optics to reduce and focus the image onto the photoreceptor layer, projecting from the RGC side. The projected size of a pixel is ≈4 µm, maintaining an average irradiance of 70 nWmm−2, covering an area of ≈2 × 2 mm, exceeding the area covered by the electrodes. Regarding the spectral emission of the DLP used here, the LEDs with peak emission at 460 nm and 520 nm (USB4000 Fiber Optic Spectrometer, Ocean Optics) covers the main green-cones with peak at 510 nm and barely, to −2 log maximal peak sensitivity of, the 370 nm UV-cones present in the dichromatic visual system of Octodon degus56. Cones in degus are close to a 30% of total number of photoreceptors where green-cones (or M-pigment) represent 85% and UV-cones 15% (or UV-pigment)32. In that respect our main results and conclusions, at photopic conditions, are extracted from the response of M-pigment cones. Stimuli were generated using the code available at https://github.com/NeuralEnsemble/MotionClouds. Timing of the images was controlled using a custom-built software based on Psychtoolbox for MATLAB. Spike sorting analysis was performed using Spyking-Circus59. Data was analyzed via redistributable Jupyter notebooks using SciPy and running Python 3 kernels. All data fitting procedures were performed with LMFIT. Figures were generated with Matplotlib.
Maximal speed projection is given by the time needed for a point to cross half of the projected image in one frame time, i.e., 200 px/frame. As the projector works at 60 frames, it gives us an estimation of 12000 px/s. Considering that one pixel covers 4 μm (4 μm/px), it generates a maximal speed of 48 mm/s, which is higher than the highest rate used in our study (4 mm/s).
Characterization of spatiotemporal tuning of RGCs
To produce a standard characterization of RGCs responses to different stimulations, we measured the response to a white noise checkerboard pattern (block size = 0.05 mm, 35 × 35 blocks) for 1200 s at 60 Hz and to sinusoidal drifting gratings at maximum contrast (minimum and maximum Weber contrast of ≈−94 and ≈129 respectively), with spatial frequencies of 0.66, 0.9, 1.26, 1.8, 2.52 and 3.6 cycles/mm, and speeds of 0.25, 0.5, 1.0, 2.0 and 4.0 mm/s in sequences of three seconds with 10 repetitions of each. Due to the constraints in recording length stemming from tissue viability, we focused only on changes in response to the speed of the moving stimulus and not to its direction, so the present protocol uses a single direction for all sets of stimuli.
RFs were estimated from the response to the checkerboard stimulus by reverse correlation, yielding the Spike-Triggered Average (STA)60 as a three-dimensional spatiotemporal impulse response (Fig. 2e). The spatial characterization was performed by fitting a bi-dimensional Gaussian function at the time point of maximum amplitude, then drawing an ellipse at one standard deviation. The size of each RF was calculated as the radius of the circle with the same area as the ellipse fit61. The shape was quantified by the ellipse-eccentricity ε defined as \(\varepsilon =\sqrt{1-{(b/a)}^{2}}\) where a and b are respectively the radius of the major and minor axis; since noisy STAs generate fittings with high ellipse-eccentricities, cells with ε > 0.9 were discarded. The temporal profile was computed as the time course of the intensity at the point with the largest variance, and then fitted to a difference of two cascades of low-pass filters62. Basal activity was set to zero and amplitude normalized, such that it follows the form
where t represents time in number of image frames before the spike, τ1 and τ2 describe the temporal decay of the response of each filter; scalars p1 and p2 are amplitude responses of each filter, and n is a free parameter.
RGCs sensitive to motion
Tuning to motion was evaluated from the response to the drifting gratings; the response to each set of speeds and spatial frequencies was measured by Peristimulus Time Histogram (PSTH) and Fourier analysis for 10 repetitions of each sequence. The beginning of the PSTH is set to 200 ms after stimulus start to discard the response to stimulus onset. To select the cells responsive to motion, first we selected RGCs with response modulated by the stimulus. As a measure of the strength of the modulation of the response by the stimulus, we computed the standarized F1 (zF1)35, defined as:
where F1 is the amplitude component at the temporal frequency of the stimulus, mean(FFT) is the mean amplitude of the spectrum over the range of frequencies from 1/T to the Nyquist frequency, and SD(FFT) is the standard deviation of amplitudes in the frequency spectrum over the same range of frequencies. We discarded all the cells whose zF1 values were not correlated with the corresponding average firing rate (Pearson correlation, p > 0.05).
Tuning to speed v was fitted to a skewed Gaussian in which the response to speed R(v) takes the form
where V is the preferred speed, σv the curve width, ζ represents the skew of the curve and A a scaling factor14. Since σv is highly correlated with tuning bandwidth (width of the function at half-height)63, we used σv for all bandwidth related functions. Changes in tuning bandwidth were measured as the normalized difference in σv as
From eq. 3, we define Speed Responsive (SR) RGCs as those whose response to the drifting gratings has a good fit to the equation at every spatial frequency tested (χ2 < 0.05).
The selection criteria for SR RGCs did not elicit direction or orientation selective responses. Moreover, computing the Direction Selectivity Index for all the SR RGCs, we observe practically no cell with a DSI larger than 0.5, the value that is usually considered the minimum to classify a cell as Direction Selective.
Population coding
Population response was evaluated as the average of the PSTH of all speed responsive RGCs. Sparseness of the response to each stimulus was measured as Population Sparseness (Sp), defined in64 as
where Ai is the average response of cell i to the stimulus.
Naturalistic stimuli
We used Motion Clouds (MCs)33, with an image size of 400 × 400 px (projected size ≈1.6 × 1, 6 mm) for 3 s, to measure the response of the retina to the motion of stimuli with different levels of naturalness (in terms of their spatio-temporal correlations). MCs are synthetic dynamical textures that mimic some key properties of natural images34, while allowing for the precise control over the signals that are being presented, which is a highly desirable property for the study of sensory systems under naturalistic conditions12. The traits of these textures are determined by three parameters for the center of the envelope (size, speed, and orientation), along with their respective bandwidths. These bandwidths control the area (in parameter space) in which the power of the stimulus will be spread.
Our protocols consisted of a set of different sequences with the same motion information (mean speed and spatial frequency), but in which we progressively varied the spatial frequency bandwidth (i.e. the range of sizes), as this was reported to influence visual motion detection17. These match the parameters of the drifting grating stimuli (equivalent to an MC with infinitely narrow bandwidth), and textures with a narrow and a wide bandwidth of spatial frequencies (Fig. 1a blue, green and orange ellipses respectively). For narrow and broad bandwidth textures, amplitude scales with spatial frequency: for the narrow bandwidth stimuli, the scaling factor is chosen for the spectra to not overlap, while for the broad band stimuli the scaling factor is 1 resulting in Bsf = sf0, so that each spectrum overlaps with the adjacent one along the line of equal speed, see Table 1 for the specific parameters. Figure 1b shows an example stimulus, comparing the spatiotemporal image of a drifting grating and MCs with narrow and broad bandwidth, illustrating the smooth transition from a “crystal-like” pattern (the grating) to progressively more complex and naturalistic stimuli using a single parameter (Bsf). Under this characterization, a collection of well-sampled natural images would correspond to MCs with infinite bandwidth, i.e. it would contain every spatial frequency, so drifting gratings and natural images would be on opposite ends of the complexity scale, with our MCs constituting a gradual transition between them. The responses for the three types of stimuli were constructed to obtain a similar Michelson contrast (that is, their intensity values vary from 0 to 1). However, the intensity distribution was different, as seen in Fig. 1c. Intensity was quantified by measuring the Weber contrasts distribution: for each intensity value the average intensity was subtracted and divided by the average intensity. Because of their design, sinusoidal grids have a higher proportion of brightest and darkest pixels, while the rest of the values have a uniform representation. On the other hand, the MCs have a large portion of pixels with intermediate luminance, with extreme values being rare, which resembles natural images.
Trajectory Reconstruction and velocity estimation
The first step is the reconstruction of the trajectory of a moving stimulus from the response of a population of RGCs. As recently demonstrated in54, the Decoding Fields obtained by maximum likelihood are very similar to the RFs obtained by traditional methods of reverse correlation. Based on this, we made a spatiotemporal reconstruction of the stimulus by convolving the response vectors of each RGCs with its respective RF and then performing a weighted sum across the whole population. The response \({r}_{i}^{c}\) is the number of spikes at each interval i with Δt = 16.66 ms. The RF is computed by reverse correlation from the response to a checkerboard stimulus. Since we are interested in stimuli moving only in one direction, we collapse the tridimensional spatiotemporal RF to a bidimensional representation \(R{F}_{x,t}^{c}\) containing only time and one spatial dimension. Thus, the intensity of the reconstructed stimulus at each point in space and time, I(x, i), is given by
where a is a constant offset, ωc is the weighting factor for each cell, \(R{F}_{x,j}^{c}\) is the value of the receptive field at point x at time jΔt before the spike, and NΔt is the length of the RF estimation.
Speed of the moving stimulus is assessed by a multi-stage process following standard motion processing models38. The first step is the convolution of the reconstructed stimulus with sets of filters resembling Reichardt detectors, approximated by slanted Gabor filters oriented in space time with preference for different speeds (given by the slope in the space/time plane, thus the speed tuning is given by the angle of the Gabor kernel). Each kernel is paired with another of the same properties, but with its phase shifted, forming a quadrature pair. The response of each quadrature pair is rectified and added to obtain the Motion Energy for each speed. The process is repeated to compute the Motion Energy in the opposite direction, and finally, both are subtracted to obtain the net motion signal for each speed. The estimated speed is determined by the set of filters with the highest activation in a winner-takes-all approach. An example of the process is shown in Fig. 7.
References
Barlow, H. B., Hill, R. M. & Levick, W. R. Retinal ganglion cells responding selectively to direction and speed if image motion in the rabbit. The Journal of Physiology 173, 377–407 (1964).
Olveczky, B. P., Baccus, S. A. & Meister, M. Retinal adaptation to object motion. Neuron 56, 689–700 (2007).
Vaney, D. I., Sivyer, B. & Taylor, W. R. Direction selectivity in the retina: symmetry and asymmetry in structure and function. Nature Reviews Neuroscience 1–15 (2012).
Burr, D. & Thompson, P. Motion psychophysics: 1985-2010. Vision Research 51, 1431–1456 (2011).
Simoncelli, E. P. & Olshausen, B. A. Natural image statistics and neural representation. Annual Review of Neuroscience 24, 1193–1216 (2001).
Geisler, W. S. Visual Perception and the Statistical Properties of Natural Scenes. Annual Review of Psychology 59, 167–192 (2008).
Dyakova, O., Lee, Y.-J., Longden, K. D., Kiselev, V. G. & Nordström, K. A higher order visual neuron tuned to the spatial amplitude spectra of natural scenes. Nature communications 6, 8522 (2015).
Yu, Y., Romero, R. & Lee, T. S. Preference of Sensory Neural Coding for 1/f Signals. Physical Review Letters 94, 108103–4 (2005).
Olshausen, B. & Field, D. Sparse coding of sensory inputs. Current opinion in neurobiology 14, 481–487 (2004).
David, S. V. Natural Stimulus Statistics Alter the Receptive Field Structure of V1 Neurons. Journal of Neuroscience 24, 6991–7006 (2004).
Carandini, M. et al. Do we know what the early visual system does? The Journal of neuroscience: the official journal of the Society for Neuroscience 25, 10577–10597 (2005).
Rust, N. C. & Movshon, J. A. In praise of artifice. Nature neuroscience 8, 1647–1650 (2005).
Geisler, W. S. & Burge, J. Optimal speed estimation in natural image movies predicts human performance. Nature communications 6, 1–11 (2015).
Priebe, N. J. Tuning for Spatiotemporal Frequency and Speed in Directionally Selective Neurons of Macaque Striate Cortex. Journal of Neuroscience 26, 2941–2950 (2006).
Vinje, W. E. & Gallant, J. L. Natural stimulation of the nonclassical receptive field increases information transmission efficiency in V1. The Journal of neuroscience: the official journal of the Society for Neuroscience 22, 2904–2915 (2002).
Haider, B. et al. Synaptic and network mechanisms of sparse and reliable visual cortical activity during nonclassical receptive field stimulation. Neuron 65, 107–121 (2010).
Simoncini, C., Perrinet, L. U., Montagnini, A., Mamassian, P. & Masson, G. S. More is not always better: adaptive gain control explains dissociation between perception and action. Nature neuroscience 15, 1–10 (2012).
Enroth-Cugell, C., Robson, J. G., Schweitzer-Tong, D. E. & Watson, A. B. Spatio-temporal interactions in cat retinal ganglion cells showing linear spatial summation. The Journal of Physiology 341, 279–307 (1983).
Shapley, R. & Lennie, P. Spatial frequency analysis in the visual system. Annual Review of Neuroscience 8, 547–581 (1985).
Demb, J. B., Haarsma, L., Freed, M. A. & Sterling, P. Functional circuitry of the retinal ganglion cell’s nonlinear receptive field. The Journal of neuroscience: the official journal of the Society for Neuroscience 19, 9756–9767 (1999).
Demb, J. B., Zaghloul, K., Haarsma, L. & Sterling, P. Bipolar cells contribute to nonlinear spatial summation in the brisk-transient (Y) ganglion cell in mammalian retina. The Journal of neuroscience: the official journal of the Society for Neuroscience 21, 7447–7454 (2001).
Demb, J. B. Cellular Mechanisms for Direction Selectivity in the Retina. Neuron 55, 179–186 (2007).
Kanjhan, R. & Sivyer, B. Two types of ON direction-selective ganglion cells in rabbit retina. Neuroscience Letters 483, 105–109 (2010).
Wei, W., Hamby, A. M., Zhou, K. & Feller, M. B. Development of asymmetric inhibition underlying direction selectivity in the retina. Nature 469, 402–406 (2011).
Cruz-Martn, A. et al. A dedicated circuit links direction-selective retinal ganglion cells to the primary visual cortex. Nature 507, 358–361 (2014).
Olveczky, B. P., Baccus, S. A. & Meister, M. Segregation of object and background motion in the retina. Nature 423, 401–408 (2003).
Baccus, S. A., Olveczky, B. P., Manu, M. & Meister, M. A Retinal Circuit That Computes Object Motion. Journal of Neuroscience 28, 6807–6817 (2008).
Jacoby, J. & Schwartz, G. W. Three Small-Receptive-Field Ganglion Cells in the Mouse Retina Are Distinctly Tuned to Size, Speed, and Object Motion. Journal of Neuroscience 37, 610–625 (2017).
Chen, E. Y. et al. Alert Response to Motion Onset in the Retina. Journal of Neuroscience 33, 120–132 (2013).
Gollisch, T. & Meister, M. Eye Smarter than Scientists Believed:Neural Computations in Circuits of the Retina. Neuron 65, 150–164 (2010).
Masland, R. H. & Martin, P. R. The unsolved mystery of vision. Current Biology 17, R577–R582 (2007).
Jacobs, G. H., Calderone, J. B., Fenwick, J. A., Krogh, K. & Williams, G. A. Visual adaptations in a diurnal rodent, Octodon degus. Journal of comparative physiology. A, Neuroethology, sensory, neural, and behavioral physiology 189, 347–361 (2003).
Leon, P. S., Vanzetta, I., Masson, G. S. & Perrinet, L. U. Motion clouds: model-based stimulus synthesis of natural-like random textures for the study of motion perception. Journal of Neurophysiology 107, 3217–3226 (2012).
Vacher, J., Meso, A. I., Perrinet, L. U. & Peyré, G. Biologically inspired dynamic textures for probing motion perception. In Cortes, C. et al. (eds) Advances in Neural Information Processing Systems 28, 1909–1917, http://arxiv.org/abs/1511.02705 (Curran Associates, Inc., 2015).
Wypych, M. et al. Standardized F1: a consistent measure of strength of modulation of visual responses to sine-wave drifting gratings. Vision research 72, 14–33 (2012).
Palacios-Muñoz, A. et al. Role of connexin channels in the retinal light response of a diurnal rodent. Frontiers in cellular neuroscience 8, 249 (2014).
Farrow, K. & Masland, R. H. Physiological clustering of visual channels in the mouse retina. Journal of Neurophysiology 105, 1516–1530 (2011).
Bradley, D. C. & Goyal, M. S. Velocity computation in the primate visual system. Nature Reviews Neuroscience 9, 686–695 (2008).
Vilankar, K. P. & Field, D. J. Selectivity, hyperselectivity, and the tuning of V1 neurons. Journal of Vision 17, 9–22 (2017).
Osaki, H., Naito, T., Sadakane, O., Okamoto, M. & Sato, H. Surround suppression by high spatial frequency stimuli in the cat primary visual cortex. European Journal of Neuroscience 33, 923–932 (2011).
Serrano-Pedraza, I., Grady, J. P. & Read, J. C. A. Spatial frequency bandwidth of surround suppression tuning curves. Journal of Vision 12, 24–24 (2012).
Kelly, D. H. Spatial frequency selectivity in the retina. Vision Research 15, 665–672 (1975).
Skottun, B. C., Bradley, A. & Ramoa, A. S. Effect of contrast on spatial frequency tuning of neurones in area 17 of cat’s visual cortex. Experimental brain research 63, 431–435 (1986).
Webster, M. A. & Miyahara, E. Contrast adaptation and the spatial structure of natural images. Journal of the Optical Society of America. A, Optics, image science, and vision 14, 2355–2366 (1997).
Turner, M. H., Schwartz, G. W. &Rieke, F. Receptive field center-surround interactions mediate context-dependent spatial contrast encoding in the retina. bioRxiv (2018).
Kuo, S. P., Schwartz, G. W. & Rieke, F. Nonlinear Spatiotemporal Integration by Electrical and Chemical Synapses in the Retina. Neuron 90, 320–332 (2016).
Manookin, M. B., Patterson, S. S. & Linehan, C. M. Neural Mechanisms Mediating Motion Sensitivity in Parasol Ganglion Cells of the Primate Retina. Neuron 97, 1–19 (2018).
Gollisch, T. Features and functions of nonlinear spatial integration by retinal ganglion cells. Journal of Physiology-Paris 1–11 (2013).
Butts, D. A. & Goldman, M. S. Tuning Curves, Neuronal Variability, and Sensory Coding. PLoS Biology 4, e92 (2006).
Barlow, H. B. Possible principles underlying the transformation of sensory messages. Sensory communication (1961).
Barlow, H. Redundancy reduction revisited. Network (Bristol, England) 12, 241–253 (2001).
Kayser, C. Processing of complex stimuli and natural scenes in the visual cortex. Current opinion in neurobiology 14, 468–473 (2004).
Borst, A. & Theunissen, F. E. Information theory and neural coding. Nature neuroscience 2, 947–957 (1999).
Botella-Soler, V., Deny, S., Martius, G., Marre, O. & Tkačik, G. Nonlinear decoding of a complex movie from the mammalian retina. PLoS Computational Biology 14, e1006057–27 (2018).
Laughlin, S. B., de Ruyter van Steveninck, R. R. & Anderson, J. C. The metabolic cost of neural information. Nature neuroscience 1, 36–41 (1998).
Chávez, A. E., Bozinovic, F., Peichl, L. & Palacios, A. G. Retinal Spectral Sensitivity, Fur Coloration, and Urine Reflectance in the Genus Octodon (Rodentia): Implications for Visual Ecology. Investigative Ophthalmology & Visual Science 44, 2290–2296 (2003).
Litke, A. M. et al. What does the eye tell the brain? Development of a system for the large-scale recording of retinal output activity. Nuclear Science, IEEE Transactions on 51, 1434–1440 (2004).
Segev, R., Goodhouse, J., Puchalla, J. & Berry, M. J. Recording spikes from a large fraction of the ganglion cells in a retinal patch. Nature neuroscience 7, 1154–1161 (2004).
Yger, P. et al. Fast and accurate spike sorting in vitro and in vivo for up to thousands of electrodes. Tech. Rep. (2016).
Chichilnisky, E. J. A simple white noise analysis of neuronal light responses. Network (Bristol, England) 12, 199–213 (2001).
Petrusca, D. et al. Identification and Characterization of a Y-Like Primate Retinal Ganglion Cell Type. Journal of Neuroscience 27, 11019–11027 (2007).
Chichilnisky, E. J. & Kalmar, R. S. Functional asymmetries in ON and OFF ganglion cells of primate retina. The Journal of neuroscience: the official journal of the Society for Neuroscience 22, 2737–2747 (2002).
Pinto, L. & Baron, J. Spatiotemporal Frequency Tuning Dynamics of Neurons in the Owl Visual Wulst. Journal of Neurophysiology 103, 3424–3436 (2010).
Zylberberg, J. & DeWeese, M. R. Sparse Coding Models Can Exhibit Decreasing Sparseness while Learning Sparse Codes for Natural Images. PLoS Computational Biology 9, e1003182 (2013).
Acknowledgements
Fondecyt 1150638 and 1140403, CONICYT-Basal Project FB0008, Millennium Institute ICM-P09-022-F. L.U.P. was supported by ANR projects “BalaV1” (ANR-13-BSV4-0014-02) and “TRAJECTORY” (ANR-15-CE37-0011). C.R.R. was supported by a CONICYT Ph.D. scholarship. We thank Dr. John Ewer for proofreading the manuscript.
Author information
Authors and Affiliations
Contributions
M.J.E., L.U.P., A.P. conceived the study. A.P. provided the animals, equipment for electrophysiology and data analysis. C.R.R. designed the stimulation protocols and conducted the experiments. C.R.R. and L.U.P. analyzed the results. C.R.R. and M.J.E. prepared the figures. All authors reviewed the manuscript.
Corresponding authors
Ethics declarations
Competing Interests
The authors declare no competing interests.
Additional information
Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Ravello, C.R., Perrinet, L.U., Escobar, MJ. et al. Speed-Selectivity in Retinal Ganglion Cells is Sharpened by Broad Spatial Frequency, Naturalistic Stimuli. Sci Rep 9, 456 (2019). https://doi.org/10.1038/s41598-018-36861-8
Received:
Accepted:
Published:
DOI: https://doi.org/10.1038/s41598-018-36861-8
This article is cited by
-
Cortical recurrence supports resilience to sensory variance in the primary visual cortex
Communications Biology (2023)
-
Learning heterogeneous delays in a layer of spiking neurons for fast motion detection
Biological Cybernetics (2023)
-
Visual Cognition–Inspired Multi-View Vehicle Re-Identification via Laplacian-Regularized Correlative Sparse Ranking
Cognitive Computation (2021)
Comments
By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.