Speed-Selectivity in Retinal Ganglion Cells is Sharpened by Broad Spatial Frequency, Naturalistic Stimuli

Motion detection represents one of the critical tasks of the visual system and has motivated a large body of research. However, it remains unclear precisely why the response of retinal ganglion cells (RGCs) to simple artificial stimuli does not predict their response to complex, naturalistic stimuli. To explore this topic, we use Motion Clouds (MC), which are synthetic textures that preserve properties of natural images and are merely parameterized, in particular by modulating the spatiotemporal spectrum complexity of the stimulus by adjusting the frequency bandwidths. By stimulating the retina of the diurnal rodent, Octodon degus with MC we show that the RGCs respond to increasingly complex stimuli by narrowing their adjustment curves in response to movement. At the level of the population, complex stimuli produce a sparser code while preserving movement information; therefore, the stimuli are encoded more efficiently. Interestingly, these properties were observed throughout different populations of RGCs. Thus, our results reveal that the response at the level of RGCs is modulated by the naturalness of the stimulus - in particular for motion - which suggests that the tuning to the statistics of natural images already emerges at the level of the retina.

SCIENTIFIC REPORtS | (2019) 9:456 | DOI: 10.1038/s41598-018-36861-8 they will generate EPSPs in the corresponding RGCs. The integration time of the RGCs is longer than the time course of the response of the BCs, so fast successive activations will be added. In this way, the RGCs will respond with higher firing rates to higher speeds because of the higher rate of alternating activation of the ON and OFF sub-fields. More specialized computations are carried out by, e. g. Direction Selective RGCs (DSRGC) which respond preferentially to a single direction in a varied range of speeds depending on the RGC type and the species 1,3,22 . The circuitry involved in the computation of DSRGCs relates the integrated activity of BCs and starburst amacrine cells (SACs). At least two types of calculations are simultaneously present in the retina projecting to different places of the central nervous system [23][24][25] . ON-OFF DSRGCs have been reported both in mouse and rabbits, responding to ON and OFF stimuli and being selective to four different types of motion direction in a broad range of speed. Besides, there also exists an On-DSRGC type reacting to slow velocities 23 . Rabbit and salamander retina report Object Motion RGCs detectors, which fire when a small patch of the image moves in a different pattern than the rest of the background 26,27 . More recently, it has been reported that the mouse retina contains a non-DS high-definition RGC, which is mainly selective to the motion of small objects across their RF 28 . Another sophisticated processing of motion, present in salamander retina, is the response to Motion Onset 29 , in which the initiation of movement within the RF of the RGC causes a higher response than does an object moving smoothly through it. While it has been shown that the retina is capable of advanced computations beyond standard processing 30 , many capabilities remain unexplored 31 , for example, the fine tuning of retinal responses to motion in natural images.
In this study, using retinas from Octodon degus, a diurnal crepuscular rodent with a high proportion of cones (30%) 32 and a relatively high density of RGCs, we focus on the properties of RGCs responses to motion in several scenarios. To overcome the limitations of working with simple, low-dimensional artificial stimuli, while at the same time avoiding the problems of working with natural images 12 , we used synthetic random textures called Motion Clouds (MC) 33 , which preserve some of the properties of natural images. The MCs mimic the spatiotemporal changes in luminance of natural scenes using a generative model for the geometric transformations of visual objects (translation, rotation, zoom) 34 . In contrast to a traditional drifting grating which has a single spatial and temporal frequency, a MC resembles a dynamic texture for which components (textons) may have different sizes and move at different speeds (see Fig. 1a for comparison of these two stimuli). While in general, natural images have their power distributed along a broad range of frequencies, MC can be parametrized to have a restricted range of frequencies to limit their information content or to have a broad spectrum to resemble natural images. As such, each generated texture corresponds to a compound of different moving gratings with random positions (phases) yet with spatiotemporal properties distributed along a parameterizable area in Fourier space. Here, we (a) Bi-dimensional representation of the spatio-temporal frequency space. Points along the continuous line correspond to simple drifting grating stimuli moving at a given speed but with different spatial frequencies (blue dots). MC stimuli have their spectral energy distributed in an "ellipse" warped along a given speed plane. Interestingly, these can be parameterized with the same mean spatial (sf 0 ) frequency and speed (V) as gratings, but with different levels of spatial frequency bandwidths (defined by the parameter B sf , with B sf = 0 for the grating) which will define the level of complexity of the sequence. The green ellipse represents a narrow bandwidth stimulus and the orange ellipse a broader bandwidth stimulus. (b) Examples of stimuli at different complexity levels. For simplicity, we plot the light intensity along a single row of the image (vertical axis) for the whole duration of the stimulation (horizontal axis). We show the drifting grating (top panel), which is constituted by a single spatial and temporal frequency (and thus a single speed, seen as the slope in this view), and the MCs (middle and lower panels), which have the same speed and central spatial frequency but with a narrow or wide bandwidth in spatial frequency space. (c) Contrast distribution of the stimuli. The three types of stimuli were built to have the same Michelson contrast, but their contrast distribution is different. By construction, sinusoidal gratings have a higher proportion of brightest and darkest pixels, while the rest of the values have uniform representation; on the other hand, MCs have a large proportion of pixels with intermediate luminance, with extreme values being rare, resulting in images where low contrast is more frequent than high contrast, as is the case of natural images. use the same distribution near the spatiotemporal speed plane, and control the amount of components by the bandwidth around the central spatial frequency. This defines a simple parameter to control the complexity of the input stimulus as the amount of compounded components in the texture (see Fig. 1b). This manipulation has previously been used successfully to study speed discrimination in humans 17 , where it was shown that additional information contained in the stimuli improves eye pursuit gain and precision.
To study such a property at the level of the retina, the activity recorded from a population of RGCs was compared using simple moving stimuli (drifting gratings) vs. MC's with the same speed and two levels of naturalness, regarding their spatial frequency spectrum (narrow and broad bandwidth). We found that for a significant fraction (66%) of RGCs responding to the motion, the cells' tuning bandwidths become narrower when the spatiotemporal frequency content of the stimulus increased. The set of RGCs studied contains cells traditionally assigned to different types, indicating that this property is not exclusive of a single class, but instead is widespread in the retina. At the population level, this change in tunning bandwidth results in sparser codes, which in turn reflects a more efficient coding of the motion information. Therefore, our findings show that fine-tuning of motion detection to natural image statistics already emerges at the level of the retina.

Results
We characterized the response from a population of speed responsive RGCs, recorded from retinal patches of 2 young Octodon degus, a diurnal rodent, having a total of 308 RGCs (see Methods section and Fig. 2 for details).
Speed responsive cells in the retina as characterized using gratings. We measured RGCs responses to a set of drifting gratings (artificial simple stimuli), with varying speed and spatial frequency (see Fig. 3a). For each spatial frequency, the response to speed averaged over trials was fitted to a skewed Gaussian. In Fig. 3b, we show how preference for certain speeds changes as a function of spatial frequency. As expected for RGCs, preferred speed decreases at higher spatial frequencies. In the example shown, the response is maximal at intermediate speeds and spatial frequencies, diminishing gradually as one moves away from that preferred combination of parameters. When looking at the fitted curves, it can be seen that the whole curve shifts towards lower speeds when increasing the spatial frequency.
A subset of the complete set of valid RGCs was separated and classified as speed responsive cells (SR), i.e. cells whose response is modulated by speed, by keeping only cells whose average firing rate correlated with the response strength, by estimating its correlation with the standardized component of the amplitude spectrum zF1 35 ; and then, cells whose speed response curve to the drifting gratings has a good fit to eq. 3 at every spatial frequency tested (χ 2 < 0.05 for the normalized response). We analyzed retinal patches from two animals, where 81 and 109 SR cells were found, respectively, representing ≈26% and ≈29% of the total RGCs recorded. Figure 4a shows the total number of RGCs found in a sample piece of retina, and their respective RF, of which the SR cells are highlighted in color (a second example is shown in Figure S1 in the Supplementary Information section).
To discriminate between SR and Non-SR cells, we analyzed the RF properties of each class. In terms of ON/ OFF RGCs distribution, and similar to previous studies performed in the same animal model 36 , a low ratio of ON/ OFF cells were found. The temporal profile of a cell's response was characterized using the parameters indicated in Fig. 4b: zero-cross and peak-time. In Fig. 4c we show the distribution of these two parameters along the entire population of RGCs, separated by their speed responsive selectivity (SR cells are represented in red, while Non-SR in black). Interestingly, only the zero-cross parameter is significantly different between SR and Non-SR cell populations (p = 0.026 Kolmogorov-Smirnov test, p = 0.65 for the difference in peak time). Additionally, there was no direct match with other traditional criteria for classification of RGCs, such as Fast-Slow and Transient-Sustained using the latency and transience index 37 of the response to a full-field flash, since our set of SR cells contained cells belonging to each class (24 Fast-Transient, 13 Fast-Sustained, 19 Slow-Transient and 25 Slow-Sustained from 81 SR cells).
Speed selectivity changes with the complexity of the stimulus. When presenting a stimulus with the similar basic properties as the drifting gratings but with energy distributed along a larger area in parameter space, the responses from RGCs differ significantly. Examining in detail the response at the preferred spatial frequency (Fig. 5a) reveals that the firing rate response to lower and higher speeds is much weaker for the MC stimuli compared to the grating, while the responses become equivalent only around the preferred, intermediate speeds. This effect is more evident when looking at the tuning curves (Fig. 5b), where the response decreases more sharply when moving away from the preferred parameters for the complex stimuli. The change in the shape of the curve is evidenced by a change in the σ value of the fit, which decreases progressively from 1.76 to 1.45 and 0.95 as the complexity of the stimulus increases, while maintaining the same preferred speed (1.47;1.37;1.49 mm/s respectively). The firing rates are stable across repetitions, so the narrower tuning is not due to drifts or decay in cell activity. When considering all the conditions tested, the response profile to the MC stimuli cover a smaller area in parameter space along both axes, when viewing it as a two dimensional map of the response at each combination of parameters (Fig. 5c). As expected, speed preference decreases with spatial frequency, as seen by the slanted response profile.
This change in the response results in tuning curves with narrower profiles, thus the tuning of the RGC becomes more specific for certain parameters of the stimulus. Besides this change in the shape of the tuning curve, for many RGCs, the response to the MC stimuli has lower firing rates, as is the case for the example shown in Fig. 5b, while for others, the responses are equivalent and even higher in some cases (see examples in Fig. 5d).
Another interesting and unexpected finding is that some of the RGCs with narrower tuning for speed also present changes in their preferred speed, mostly towards higher speeds (Fig. 5d, panels on the right). Low firing rates obtained for MCs compared to gratings could be explained by differences in the contrast stimuli, but not the sharpening of the speed-selective curve (see Box 1). The low contrast observed in MCs is due to an increase of the spatial frequency bandwidth. In the grating analysis, we observe that for higher spatial frequency bandwidths, the peak activity for the central frequency gets smaller, and not narrower.
To quantify the change in the tuning bandwidth of the grating versus MC response, we measured the properties of the curve fitted to the response to speed (see Methods section for details). In Fig. 5d we show some examples of RGCs with narrower tuning in terms of lower response to the non-preferred speeds. At the population level, a large proportion of the SR cells show a decrease in the bandwidth of the response to speed at their preferred spatial frequency, for the complex stimuli when compared to the response to drifting gratings. This is evident when looking at the comparison of σ v in Fig. 5e; for both MC series, a large portion of the points falls below the line of identity, meaning that the values are lower than for the grating stimulus. The distributions of σ v are skewed towards lower values (Fig. 5f), meaning that a large portion of the speed responsive RGCs have Population responses becomes sparser for more naturalistic stimuli. When pooling the response of many RGCs, as a motion sensing neuron in higher cortical areas would do 14 , the response profile of the recorded population also shows differences when stimulated with the complex, more naturalistic stimuli. As expected from the individual responses, the response profile to speed depends on the spatial frequency ( Fig. 6a left panel). The curve not only shifts when changing the spatial frequency, but the peak response also decays as the spatial frequency increases. When comparing the response to the simple vs complex stimuli, the preference for lower speeds at higher spatial frequencies is still observed, but with a significant decrease in response for almost every condition ( Fig. 6a middle and right panels). However, this decrease in response is larger for the lower speeds at every spatial frequency tested, to the point that for the two lowest speeds, the difference in response between spatial frequencies becomes non-significant (no significant difference at speeds 0.25 and 0.5 mm/s, p > 0.05 Wilcoxon signed-rank test), so the left part of all the tuning curves converge to the same values.
However, these lower firing rates elicited by the complex stimuli would not necessarily be detrimental to the coding properties of the retina. Instead, lower average firing rates could be associated with sparse coding. To see if this were the case, we computed the average population sparseness (eq. 5) for each stimulus condition (how many RGCs and how much they are responding to each stimulus). When compared to the simple stimulus, there are significant increases in sparseness in 8 conditions for the MC with narrow bandwidth and 19 conditions for the broad bandwidth stimulus at each combination of speed and spatial frequency tested (Fig. 6b).

A sparse response encodes motion information efficiently.
To determine if the sparser retinal response is still encoding the motion information, we applied a reconstruction framework to decode the trajectory of a moving stimulus from the response of a group of RGCs and then extract information derived from the trajectory. As can be seen in Fig. 7, we constructed retinal images from the spikes emitted by each RGCs and its respective RF. To assess the ability to extract motion information from the reconstruction, we applied a process analogous to the way velocity is computed in cortical visual areas 38 (see Methods section for details). An example of this process is shown in Fig. 7. Even though that for some of the conditions the first stage of decoding was enough to get the correct speed (in the example, the lowest and the two highest speeds), in general the decoding improves at the Motion Energy and Opponent Motion stages, in the sense that the set of filters that produces the highest activation is the one that matches the presented speed. The precision of speed decoding for the different conditions can be seen in Fig. 8. The motion traces obtained for all the stimuli can be seen in Fig. 8a. To measure the quality of the estimation, we computed the error rate as 1 minus the number of trials for which the estimation is correct, over the total number of trials. As seen in Fig. 8b, some of the conditions have zero error rate, while for the lowest speed, the error rate is 1 for many of the spatial frequencies for the three types of stimulus. For the gratings, most values are either one or zero, while for the MCs some conditions show intermediate values, meaning that for some trials the decoding was correct and for others it failed. When looking at the cumulative error rate across spatial frequencies (Fig. 8c) it becomes apparent that the decoding error for gratings is very high at the lowest and highest spatial frequencies, decreasing gradually towards the intermediate spatial frequencies. The same pattern can be seen for the MCs, however, the difference in decoding error for different spatial frequencies is lower, and stays within an acceptable range for all conditions. For broad bandwidth stimuli, the error rate is distributed more evenly across spatial frequencies, with bad decoding performance concentrated at the lowest speed.
While the distribution of the errors is different for the three types of stimuli, overall, the total error rate is slightly lower for the MC stimuli (Fig. 8c, rightmost plot, T = 0.0, p = 0.016 and T = 3.0, p = 0.034 Wilcoxon signed-rank test). Thus, it can be considered that there is a slight improvement in decoding performance. Since we were able to estimate the stimulus speed with an accuracy of ≈0.7 (1 -the total error rate) for the three types of stimuli, it could be argued from an information point of view, that the retina is transmitting at least enough information about the stimulus to encode speed under the three conditions. However, as shown in Fig. 5, MCs elicit responses with lower firing rates. So, if we quantify the cost of accuracy for each type of stimulus, in terms of the number of spikes used to successfully transmit the speed of the stimulus, we get that for the gratings the

Discussion
Our results strongly suggest that the tuning to naturalistic stimuli is already present at the level of the retina. For the first time, this showed that the tuning curves of response to motion become narrower under stimulation with increasingly more complex images, where we defined complexity as the number of compounded features of different spatial frequency but similar speed in the input texture. In particular, the bandwidth parameter controls complexity as it may generate a simple crystal-like stimulus (a grating component) with a very narrow bandwidth up to a smoke-like texture with a similar 1/f fall-off as natural images as the bandwidth grows to infinity. In the present work, the difference between each set of stimuli is the spatial frequency content, i.e. the variety in sizes for the elements that constitute the images; while the three classes of stimuli have the same average properties (spatial frequency and speed) for each set, the MC contain additional signals or components around this central value.
Additional signals in a stimulus can modulate the response of a RGC in different ways. It has been shown that isolated stimuli that fail to modify the activity of a RGC can alter the response to another stimulus when presented simultaneously 39 . The latter would stem from nonlinearities in the processing of the signal performed by the RGC, in the sense that the response to a complex stimulus is not merely the response to the sum of its components. In some cases, the presence of signals around the RF, product of an increased area of stimulation, is enough to trigger changes in the response 15 . One of the mechanisms responsible for this effect is surround suppression, in which the presence of a stimulus in an area surrounding the spatial RF will decrease the response to the stimulus in the center, which is partly explained by an increase in the amplitude of inhibitory post-synaptic potentials 16 .
In our experiments, all stimuli cover the same area, which exceeds the area of the RFs of the recorded RCGs, so the presence of a stimulus in the surround would not by itself be enough to explain the changes in the observed response. Surround suppression occurs under a variety of stimulation patterns, both spatially and temporally 40 . However, for simple stimuli it has been shown that the effect is most active when the parameters of the stimulus on the center and the surround are the same 41 , which would be the case for the gratings in our experiments; however, our data shows that this is not the case Fig. 5.
Seminal works in vision psychophysics established that spatial frequency sensitivity is dependent on contrast and overall luminance levels 42 . In general, sensitivity to high spatial and temporal frequency decreases at lower contrast levels. It could be argued then that our results reflect the effects of changes in contrast and not changes in spatial frequency. With our naturalistic stimulus, the relative contrast of each component of the images indeed scales with its spatial frequency, so components with higher spatial frequencies have lower contrast, following show the average response across trials for the whole population of Speed Responsive RGCs for each combination of speed and spatial frequency used; the lighter shades show increasing spatial frequency, the error bars show standard deviation. The encircled dots indicate all the conditions for which the response is significantly smaller than the response to the grating stimulus (p < 0.05 Wilcoxon signed-rank test). The response to speeds for each spatial frequency was fitted to an skewed Gaussian. As expected, at higher spatial frequencies the preferred speed shifts towards lower values, together with a decrease in maximum response, however, this decrease becomes more pronounced when the complexity of the stimulus increases, to the point where the response to lower speeds becomes very similar for different spatial frequencies (no significant difference at speeds 0.25 and 0.5 mms (p > 0.05 Wilcoxon signed-rank test). (b) Sparseness in the code, calculated as population sparseness (eq. 5), is a measure of how many RGCs are responding and their relative level of activity. The 2-D plots show the difference in average sparseness between grating and MC for each condition tested. In general, it is higher for the extreme parameters, however, for the more complex stimulus, the area of high sparseness is larger. The asterisks indicate the conditions for which the sparseness is higher compared to that obtained using grating stimulus (Wilcoxon signed-rank test, p < 0.05). the same 1/f scaling characteristic of natural images. However, the contrast of the main components (those with the central spatial frequency sf 0 or close to it) have the same contrast as the grating stimulus, while only the high-frequency components have lower contrast; this distribution of contrasts is stationary during the whole experiment, so the primary response drivers have the same contrast across the three types of stimuli. On the other hand, the effect of lower contrast on tuning curves is the opposite of what we report, as stimuli with lower contrast generate broader tuning curves 43 , at least in the visual cortex. Finally, it has been shown that adaptation can alter the relationship between spatio-temporal tuning curves and contrast 44 . Even though retinal adaptationoccurring at multiple stages -to luminance levels is relatively fast (in the order of seconds), the stimulation time necessary to generate spatial frequency adaptation is longer, so we explicitly designed the stimulation protocol to avoid adaptation by limiting the exposure time to 3 seconds and interspersing the images with 1 second of blank, average luminance stimulation. The stability in the response during stimulation and across trials for the three classes of moving stimulus (Fig. 5) shows that we can consider the effect of contrast, concerning the impact of spatial frequency bandwidth, to be negligible. It has been shown that a non-linear interaction between the center and surround parts of the RF in RGCs modulates the response to spatial patterns in a context-dependent manner and that this effect is more significant during naturalistic stimulation 45 . We think that this effect can explain, at least in part, the results that we observe; the additional signals contained in the complex stimulus would act as non-linear inhibitors, resulting in lower response when the parameters are farther from the preference of the RGC, producing narrower tuning curves. Moreover, the narrower speed selectivity tuning observed in RGCs, when stimulated with naturalistic-like stimuli such as MCs, seems to be a general property of many RGCs types in the retina and not related to a single RGC type. Recently, it has been reported the relevance of gap-junction connections between BCs in motion computation 46,47 . This early mechanism, together with nonlinearity computations at the output of BCs 48 , increases RGCs motion sensitivity to correlated spatio-temporal input, such as the case of MCs. Interestingly, together with this narrower bandwidth, approximately a third of the RGCs show an increase in preferred speed (see two examples in Fig. 5, right panels), while others preserve their preference (Fig. 5, left  panels). The change in the speed preference cannot be attributed to changes in the spatial frequency content of the stimulus. Due to the bandwidth of the stimuli, the images will include additional spatial frequencies, but the shape of the spectrum (see Methods: Naturalistic stimuli) determines that most of the additional frequencies will be higher than the central frequency (sf 0 ). If the RGC processed these higher spatial frequencies linearly, they should shift preference towards lower speeds -but our results show the opposite. From our analysis, it is difficult to relate or predict this behavior from other properties of the RGC's response, since the shift in the preferred speed appears not to correlate with any of the RF's characteristics.
Which are the possible implications or consequences of having these narrower tunings for naturalistic stimuli? At the level of individual RGCs, each RGC will only respond when the stimulus is near the preferred parameters. But this does not mean that a RGC will encode its preferred stimulus. Indeed, it has been shown that steeper tuning curves will encode more efficiently because stimuli will be more readily discriminated in the high slope range 49 . As such, these results mean that for speed-selective RGCs an input with a broader spectrum of spatial The representation is the same as in Fig. 7. For the three types of stimulus, the motion traces (and their slopes) are easily seen for all conditions, except at the lower speeds and higher spatial frequencies, which is consistent with lower responses under those conditions (Fig. 6). (b) Error rate in the decoding of the speed estimation (number of times that the algorithm estimated the correct speed over the total number of trials). The 2-D plots show the error rate at each spatial frequency and speed, for each type of stimulus. (c) Error rate for each spatial frequency and total error rate. Error lines show standard deviation between trials. Bars with no error mean that speed decoding was successfully performed between stimulus repetitions, for a given condition. In the rightmost plot, solid columns show total error rate and open bars show the Accuracy cost. Accuracy cost is computed as the success rate multiplied by the average firing (total number of spikes/number of cells); asterisks indicate significant difference with respect to the Grating (p < 0.05 Wilcoxon signed-rank test).
SCIENTIFIC REPORtS | (2019) 9:456 | DOI:10.1038/s41598-018-36861-8 frequencies will have a higher precision in speed, independently of the local scale (spatial frequency) of the stimulus. This mechanism speaks to the dual mechanisms at the origin of selective responses (here speed) and of the invariance to other features (here scale). In addition to this effect at the individual RGC level, when looking at the population code, narrower tuning curves can lead to less overlap between RCGS, so in conjunction, the code will contain less redundancy, an essential aspect of the efficient coding theory [50][51][52] . In support of this, we showed that population sparseness increases for the naturalistic stimuli, so in this respect, the retina is performing better under these types of stimulation, in the sense that a sparse code is more efficient from an information transmission point of view. To see if these changes in tuning curves affect the coding of the motion signal, we implemented the stimulus reconstruction and speed decoding method. We chose speed as the relevant parameter for motion estimation, but we expect that with the proper adjustment, any property of the stimulus could be recovered, provided one can obtain a sufficiently good reconstruction 53 . Another simplification that we took advantage of, is that our experimental design is based on the independent presentation of a single speed for each stimulus set (with slight variations around an average value for the MCs); this means that for the duration of each segment of stimulation, the whole population of RGCs is "seeing" the same speed. Thus, to recover the speed of the stimulus from the reconstruction, we only need to get a single value from the whole population and for the entire presentation of each sequence.
Our reconstruction method could be improved by varying some of the parameters, such as the weights of the sum, to minimize the error of the results, as is usually done in this type of work. However, even though when the models obtained by those methods are biologically plausible, the process of a posteriori maximization of the quality of decoding is hardly naturalistic, because no biological sensory system has access to the "real" value of the signal to perform the minimization step that is essential to this kind of approach.
Another possible improvement would be to incorporate nonlinearities in the decoding process, possibly by taking into account the history of the firing of a RGC instead taking every spike with the same value. Nonlinear models have been reported to give better stimulus reconstruction, especially for complex images 54 .
In any case, even though our reconstructions have a relatively low degree of fidelity, they contain enough information to compute the speed of moving features in the images successfully. The latter is crucial because it means that even though these complex stimuli elicit lower retinal responses, the relevant information is still contained in these lower responses. In this regime of lower firing rates, energy expenditure would be lower, and the cost of information would be lower 55 . Thus, this simplified approach is good enough to support the hypothesis that the retina would efficiently encode naturalistic stimuli.
In conclusion, we have shown that the retina, of a diurnal rodent, would be adapted to at least some of the characteristics of natural images, specifically those related to motion and the spatio-temporal information content and correlations used in this study. This adaptability is expressed in the narrower shapes of the tuning curves responses, which would lead to less overlap between RGCs in the feature space and thus less redundant and sparser population code, with smaller firing rates, which translates into less energy expenditure and less channel saturation. One limit of the stimuli that we used here is that they are densely packing the image space. However, natural images may exhibit some skewness and positive kurtosis in the distribution of luminance, reflecting the fact that the visual images are in general generated using a sparse number of components. This may introduce a different level of hierarchy in our definition of complexity: sparser textures may introduce spatial structures compared to the stationary nature of the MC textures used in this study. An interesting perspective is, therefore, to extend such stimulations in the retina to different levels of sparseness and to challenge the hypothesis that the processing in the retina could be differentially adapted to these different levels.

Methods
Animals. Young Octodon degus born in captivity are maintained in a controlled facility. Prior to each experiment, animals were put in darkness for 30 minutes, then deeply anesthetized with halothane and beheaded. Eyes were removed and dissected at room temperature under red illumination. Experimental procedures were approved by the Institutional Committee on Bioethics for Animal Research from the University and in accordance with the bioethics regulation of the Chilean Research Council (CONICYT).
Degus are diurnal rodents with 30% of their photoreceptors being cones and a comparatively high number of RGCs, qualities that makes them a good model for studying vision. In addition, the large area of the retina makes degus especially suitable for Multi Electrode array recordings as good coverage of all the electrodes in the matrix can be obtained. For a complete description of the model see 32,56 . Electrophysiological recordings. The experimental protocol was described before and follows 36 . Briefly, the physiological response of degu RGCs to different types of visual stimuli was measured using a Multi Electrode Array5 57,58 (USB MEA256, Multichannel Systems GmbH, Reutlingen, Germany) using 256MEA100/30iR-ITO matrices and sampling at 20 kHz. After removing the eyes, the posterior hemisphere was dissected in quadrants and the pigmented epithelium separated from the retina. Finally, a piece of retina was mounted on a dialysis membrane and mounted on a perfusion chamber, which was then lowered onto the electrode array with the RGCs side down. Recording commenced under perfusion with AMES medium bubbled with 95% O2 5% CO2 at 33 °C and the pH adjusted to 7.4.

Visual Stimulus Presentation.
Stimulus display was performed with a conventional DLP projector using custom optics to reduce and focus the image onto the photoreceptor layer, projecting from the RGC side. The projected size of a pixel is ≈4 µm, maintaining an average irradiance of 70 nWmm −2 , covering an area of ≈2 × 2 mm, exceeding the area covered by the electrodes. Regarding the spectral emission of the DLP used here, the LEDs with peak emission at 460 nm and 520 nm (USB4000 Fiber Optic Spectrometer, Ocean Optics) covers the main green-cones with peak at 510 nm and barely, to −2 log maximal peak sensitivity of, the 370 nm UV-cones present in the dichromatic visual system of Octodon degus 56 . Cones in degus are close to a 30% of total number of photoreceptors where green-cones (or M-pigment) represent 85% and UV-cones 15% (or UV-pigment) 32 . In that respect our main results and conclusions, at photopic conditions, are extracted from the response of M-pigment cones. Stimuli were generated using the code available at https://github.com/NeuralEnsemble/MotionClouds. Timing of the images was controlled using a custom-built software based on Psychtoolbox for MATLAB. Spike sorting analysis was performed using Spyking-Circus 59 . Data was analyzed via redistributable Jupyter notebooks using SciPy and running Python 3 kernels. All data fitting procedures were performed with LMFIT. Figures were generated with Matplotlib. Maximal speed projection is given by the time needed for a point to cross half of the projected image in one frame time, i.e., 200 px/frame. As the projector works at 60 frames, it gives us an estimation of 12000 px/s. Considering that one pixel covers 4 μm (4 μm/px), it generates a maximal speed of 48 mm/s, which is higher than the highest rate used in our study (4 mm/s).

Characterization of spatiotemporal tuning of RGCs.
To produce a standard characterization of RGCs responses to different stimulations, we measured the response to a white noise checkerboard pattern (block size = 0.05 mm, 35 × 35 blocks) for 1200 s at 60 Hz and to sinusoidal drifting gratings at maximum contrast (minimum and maximum Weber contrast of ≈−94 and ≈129 respectively), with spatial frequencies of 0.66, 0.9, 1.26, 1.8, 2.52 and 3.6 cycles/mm, and speeds of 0.25, 0.5, 1.0, 2.0 and 4.0 mm/s in sequences of three seconds with 10 repetitions of each. Due to the constraints in recording length stemming from tissue viability, we focused only on changes in response to the speed of the moving stimulus and not to its direction, so the present protocol uses a single direction for all sets of stimuli.
RFs were estimated from the response to the checkerboard stimulus by reverse correlation, yielding the Spike-Triggered Average (STA) 60 as a three-dimensional spatiotemporal impulse response (Fig. 2e). The spatial characterization was performed by fitting a bi-dimensional Gaussian function at the time point of maximum amplitude, then drawing an ellipse at one standard deviation. The size of each RF was calculated as the radius of the circle with the same area as the ellipse fit 61 . The shape was quantified by the ellipse-eccentricity ε defined as ε = − b a 1 ( / ) 2 where a and b are respectively the radius of the major and minor axis; since noisy STAs generate fittings with high ellipse-eccentricities, cells with ε > 0.9 were discarded. The temporal profile was computed as the time course of the intensity at the point with the largest variance, and then fitted to a difference of two cascades of low-pass filters 62 . Basal activity was set to zero and amplitude normalized, such that it follows the form (1) where t represents time in number of image frames before the spike, τ 1 and τ 2 describe the temporal decay of the response of each filter; scalars p 1 and p 2 are amplitude responses of each filter, and n is a free parameter.
RGCs sensitive to motion. Tuning to motion was evaluated from the response to the drifting gratings; the response to each set of speeds and spatial frequencies was measured by Peristimulus Time Histogram (PSTH) and Fourier analysis for 10 repetitions of each sequence. The beginning of the PSTH is set to 200 ms after stimulus start to discard the response to stimulus onset. To select the cells responsive to motion, first we selected RGCs with response modulated by the stimulus. As a measure of the strength of the modulation of the response by the stimulus, we computed the standarized F1 (zF1) 35 , defined as: where F1 is the amplitude component at the temporal frequency of the stimulus, mean(FFT) is the mean amplitude of the spectrum over the range of frequencies from 1/T to the Nyquist frequency, and SD(FFT) is the standard deviation of amplitudes in the frequency spectrum over the same range of frequencies. We discarded all the cells whose zF1 values were not correlated with the corresponding average firing rate (Pearson correlation, p > 0.05). Tuning to speed v was fitted to a skewed Gaussian in which the response to speed R(v) takes the form where V is the preferred speed, σ v the curve width, ζ represents the skew of the curve and A a scaling factor 14 .
Since σ v is highly correlated with tuning bandwidth (width of the function at half-height) 63 , we used σ v for all bandwidth related functions. Changes in tuning bandwidth were measured as the normalized difference in σ v as From eq. 3, we define Speed Responsive (SR) RGCs as those whose response to the drifting gratings has a good fit to the equation at every spatial frequency tested (χ 2 < 0.05). The selection criteria for SR RGCs did not elicit direction or orientation selective responses. Moreover, computing the Direction Selectivity Index for all the SR RGCs, we observe practically no cell with a DSI larger than 0.5, the value that is usually considered the minimum to classify a cell as Direction Selective.
Population coding. Population response was evaluated as the average of the PSTH of all speed responsive RGCs. Sparseness of the response to each stimulus was measured as Population Sparseness (S p ), defined in 64 as where A i is the average response of cell i to the stimulus.
Naturalistic stimuli. We used Motion Clouds (MCs) 33 , with an image size of 400 × 400 px (projected size ≈1.6 × 1, 6 mm) for 3 s, to measure the response of the retina to the motion of stimuli with different levels of naturalness (in terms of their spatio-temporal correlations). MCs are synthetic dynamical textures that mimic some key properties of natural images 34 , while allowing for the precise control over the signals that are being presented, which is a highly desirable property for the study of sensory systems under naturalistic conditions 12 . The traits of these textures are determined by three parameters for the center of the envelope (size, speed, and orientation), along with their respective bandwidths. These bandwidths control the area (in parameter space) in which the power of the stimulus will be spread. Our protocols consisted of a set of different sequences with the same motion information (mean speed and spatial frequency), but in which we progressively varied the spatial frequency bandwidth (i.e. the range of sizes), as this was reported to influence visual motion detection 17 . These match the parameters of the drifting grating stimuli (equivalent to an MC with infinitely narrow bandwidth), and textures with a narrow and a wide  Table 1. Parameters used to build each set of motion stimuli. For the gratings, the temporal frequency tf is modulated to obtain the same speeds at every spatial frequency sf. Motion Clouds (MCs) do not have a tf parameter, they are controlled directly by the speed parameter V. Note that the MCs software works in pixel units, so a proper scaling has to be applied. In our case, the conversion factor between pixels and mm is 240:1.
SCIENTIFIC REPORtS | (2019) 9:456 | DOI:10.1038/s41598-018-36861-8 bandwidth of spatial frequencies (Fig. 1a blue, green and orange ellipses respectively). For narrow and broad bandwidth textures, amplitude scales with spatial frequency: for the narrow bandwidth stimuli, the scaling factor is chosen for the spectra to not overlap, while for the broad band stimuli the scaling factor is 1 resulting in B sf = sf 0 , so that each spectrum overlaps with the adjacent one along the line of equal speed, see Table 1 for the specific parameters. Figure 1b shows an example stimulus, comparing the spatiotemporal image of a drifting grating and MCs with narrow and broad bandwidth, illustrating the smooth transition from a "crystal-like" pattern (the grating) to progressively more complex and naturalistic stimuli using a single parameter (B sf ). Under this characterization, a collection of well-sampled natural images would correspond to MCs with infinite bandwidth, i.e. it would contain every spatial frequency, so drifting gratings and natural images would be on opposite ends of the complexity scale, with our MCs constituting a gradual transition between them. The responses for the three types of stimuli were constructed to obtain a similar Michelson contrast (that is, their intensity values vary from 0 to 1). However, the intensity distribution was different, as seen in Fig. 1c. Intensity was quantified by measuring the Weber contrasts distribution: for each intensity value the average intensity was subtracted and divided by the average intensity. Because of their design, sinusoidal grids have a higher proportion of brightest and darkest pixels, while the rest of the values have a uniform representation. On the other hand, the MCs have a large portion of pixels with intermediate luminance, with extreme values being rare, which resembles natural images.
Trajectory Reconstruction and velocity estimation. The first step is the reconstruction of the trajectory of a moving stimulus from the response of a population of RGCs. As recently demonstrated in 54 , the Decoding Fields obtained by maximum likelihood are very similar to the RFs obtained by traditional methods of reverse correlation. Based on this, we made a spatiotemporal reconstruction of the stimulus by convolving the response vectors of each RGCs with its respective RF and then performing a weighted sum across the whole population. The response r i c is the number of spikes at each interval i with Δt = 16.66 ms. The RF is computed by reverse correlation from the response to a checkerboard stimulus. Since we are interested in stimuli moving only in one direction, we collapse the tridimensional spatiotemporal RF to a bidimensional representation RF x t c , containing only time and one spatial dimension. Thus, the intensity of the reconstructed stimulus at each point in space and time, I(x, i), is given by where a is a constant offset, ω c is the weighting factor for each cell, RF x j c , is the value of the receptive field at point x at time jΔt before the spike, and NΔt is the length of the RF estimation.
Speed of the moving stimulus is assessed by a multi-stage process following standard motion processing models 38 . The first step is the convolution of the reconstructed stimulus with sets of filters resembling Reichardt detectors, approximated by slanted Gabor filters oriented in space time with preference for different speeds (given by the slope in the space/time plane, thus the speed tuning is given by the angle of the Gabor kernel). Each kernel is paired with another of the same properties, but with its phase shifted, forming a quadrature pair. The response of each quadrature pair is rectified and added to obtain the Motion Energy for each speed. The process is repeated to compute the Motion Energy in the opposite direction, and finally, both are subtracted to obtain the net motion signal for each speed. The estimated speed is determined by the set of filters with the highest activation in a winner-takes-all approach. An example of the process is shown in Fig. 7.