Thank you for visiting nature.com. You are using a browser version with limited support for CSS. To obtain the best experience, we recommend you use a more up to date browser (or turn off compatibility mode in Internet Explorer). In the meantime, to ensure continued support, we are displaying the site without styles and JavaScript.

# Functional characterization of retinal ganglion cells using tailored nonlinear modeling

## Abstract

The mammalian retina encodes the visual world in action potentials generated by 20–50 functionally and anatomically-distinct types of retinal ganglion cell (RGC). Individual RGC types receive synaptic input from distinct presynaptic circuits; therefore, their responsiveness to specific features in the visual scene arises from the information encoded in synaptic input and shaped by postsynaptic signal integration and spike generation. Unfortunately, there is a dearth of tools for characterizing the computations reflected in RGC spike output. Therefore, we developed a statistical model, the separable Nonlinear Input Model, to characterize the excitatory and suppressive components of RGC receptive fields. We recorded RGC responses to a correlated noise (“cloud”) stimulus in an in vitro preparation of mouse retina and found that our model accurately predicted RGC responses at high spatiotemporal resolution. It identified multiple receptive fields reflecting the main excitatory and suppressive components of the response of each neuron. Significantly, our model accurately identified ON-OFF cells and distinguished their distinct ON and OFF receptive fields, and it demonstrated a diversity of suppressive receptive fields in the RGC population. In total, our method offers a rich description of RGC computation and sets a foundation for relating it to retinal circuitry.

## Introduction

The early stages of sensory processing encode sensory input in a format that allows higher brain areas to extract information essential to guide behavior. In the retina, over twenty types of retinal output neurons – retinal ganglion cells (RGCs) – serve as “feature detectors” that each encode specific components of the visual world and convey them to the brain1,2,3. Feature detection is accomplished by distributing the output of photoreceptors to parallel circuits, each of which originates with one of ~15 types of bipolar cell and provides input to multiple types of RGCs4. Thus, responses of different RGCs reflect computations performed by the various presynaptic, parallel circuits4. Although the anatomy and retinal circuitry supporting the diversity of RGC types has been studied in molecular5, physiological6, and computational detail7,8,9, the methodology necessary to represent the diversity of retinal computations in RGC outputs is relatively anemic.

Being able to describe the “diversity of computation” requires sufficiently complex mathematical descriptions of how RGCs process visual stimuli. The most commonly used mathematical model of retinal computation is the linear-nonlinear (LN) model, which describes the input-output transformation of the circuit in two steps: one, a linear filter that emphasizes particular spatial and temporal features in the stimulus delivered to the photoreceptors; two, a nonlinear function that captures spike generation in RGCs10. The LN model is appealing in its simplicity and yields an easily interpretable spatiotemporal filter (also called the linear receptive field, or RF), which typically has a center-surround structure and either ON or OFF selectivity in time (i.e., its polarity indicates whether neurons respond to increases or decreases in luminance). The “spiking nonlinearity” describes how sensitively neuron firing rates depend on the stimulus11.

Although the LN model can differentiate ON from OFF cells based on their RFs, the single linear filter of the LN model condenses all retinal processing into a single step, which masks the diverse processing channels presynaptic to RGCs12. This is a particular problem in the mouse retina, where ON-OFF cells (i.e., cells that respond both to increments and decrements in luminance) are the most common type13. ON-OFF cells have at least one ON and one OFF RF, and a single linear RF therefore averages ON and OFF filters together11,14,15. Likewise, a linear filter cannot distinguish between ON excitation and OFF inhibition (or vice versa).

Thus, more sophisticated (and necessarily nonlinear) models that involve the characterization of multiple RFs are required to capture the diversity in retinal computation15,16,17,18. Nonlinear models are generally more difficult to fit to data because they require more parameters (e.g., multiple RFs) and a nonlinear description that can capture a sufficient range of computation. Here, we use a “space-time separable” form of the Nonlinear Input Model (NIM)15 to fit multiple excitatory and suppressive RFs by maximum-likelihood modeling. Space-time separability refers to representing spatiotemporal RFs as combinations of separate spatial and temporal filters; this permits much more efficient parameter estimation15,18,19,20. Furthermore, because this approach does not require uncorrelated “white noise” stimuli, we can characterize RGCs experimentally using tailored “cloud” stimuli, which have spatial features on multiple scales. The NIM described detailed spatiotemporal RF maps in ON, OFF, and (uniquely) ON-OFF RGCs in the mammalian retina, and it revealed suppressive RFs unobserved in standard LN analyses. Such detail provides a much fuller picture of the computations represented in RGC outputs and provides the means to understand their functional diversity.

## Methods

### Neurophysiology

Experimental use of mice was performed under an animal protocol approved by the IACUC of the University of Maryland. All methods were performed in compliance with National Institutes of Health guidelines and regulations. C57bl/6 mice of either sex were dark-adapted for 1–2 hrs before isoflurane anesthesia followed by euthanasia by decapitation. Eyes were removed and retinas dissected free in Ames’ medium (Sigma) bubbled with 95% O2/5% CO2 (Carbogen) at room temperature. Ventral retinae were isolated in small, rectangular sections and placed RGCs down on a 6 × 10 perforated multielectrode array (Multichannel Systems, Tübingen, Germany). After a rest period of at least 30 minutes (to permit tissue adhesion to the MEA), RGC responses to light stimuli were recorded while the tissue was perfused with Ames’ medium bubbled with Carbogen and kept at 32 °C. All dissection procedures were performed under dim red illumination.

### Stimuli

Visual stimuli were generated using the Psychophysics Toolbox in Matlab and presented in the UV spectrum using a modified DLP projector (HP Notebook Projection Companion Projector, Model: HSTNN-FP01) (frame rate = 60 Hz). Projector output (Imean ≈ 5 × 103 photons/cm2/s, 398 nm; measured using a fiber-coupled spectrophotometer: Ocean Optics USB4000, Ocean Optics, Dunedin, FL, USA) was focused through the MEA and onto the photoreceptor layer using a microscope objective (Olympus) optimized for UV transmission. The mouse retina contains both S and M opsin- expressing cones found in a ventral→dorsal gradient, with S-opsin dominating phototransduction signaling in the ventral retina21,22,23. Here, we chose to use a UV stimulus because the mouse retina responds across a broader frequency range in response to UV as opposed to green stimuli24. Stimuli include: (1) 120 repeats of a whole-field stimulus that alternated between black and white, changing every 500 ms; (2) Gaussian white noise (GWN) flickering checkerboard (pixel size = 44.77 µm); (3) spatially correlated “cloud” stimulus; and (4) repeated, short periods of cloud (short repeats) for cross validation. The cloud stimulus was generated by low-pass filtering the GWN using a two-dimensional spatial Gaussian filter centered at the origin (in Fourier space) with a standard deviation of ~1.3 cycles/mm. Note that while this does not result in any single spatial scale, it does give the appearance of features roughly on the order of RF field sizes, as demonstrated in Fig. 1. The total duration of GWN, cloud, and short repeats was 20 min, and was broken into two 10-min blocks, separated by 10-min resting intervals. A 60-s long conditioning, whole-field gray light at Imean was presented at the beginning of each stimulus (block). Data was recorded at 50 kHz using the MC_Rack software (Multichannel Systems, Tübingen, Germany). Spikes were sorted using an offline sorter (Plexon Inc).

### Data analysis

A majority of experiments and neurons were recorded using just cloud stimulation (24 preparations, 431 well-isolated RGCs). Due to the instability of long recordings over the two 10-min blocks of cloud stimulus and repeats, we only used both blocks to fit models for 12 of the 24 experiments, using two criteria: (1) average firing rates of neurons did not change more than 50% from the first to second experiment; (2) cross-validated likelihood of the LN model did not change by more than 50%. If neither condition was met, we used the block which contained (on average) the best fits without suppressive terms added (LN or ON-OFF fits). We also recorded responses to a novel 10-sec repeated stimulus sequence, although only a handful of these were stable stable by these criteria. As a result, we were able to generate PSTHs to compare to model predictions in the cases shown in Figures, but did not use these numbers in population statistics.

For a smaller set of experiments (6 experiments, 125 well-isolated neurons), we recorded responses to both cloud and GWN stimuli in order to facilitate comparisons of spatial RFs between conditions. Here, 81/125 neurons with clear temporal kernels in the sGLM fits for both cloud and GWN conditions and RFs did not overlap with the edge of stimulation were selected for analysis. Note that these comparisons were based on comparing ON an OFF RFs (without regard to later analyses identifying ON-OFF neurons), and thus would implicitly eliminate some ON-OFF cells with strange linear RFs. Also, because performance comparisons were not important for this analysis, we did not use the stringent stability above for rejecting segments of the data.

Data were binned at exactly twice the stimulus refresh frequency (roughly 16 ms) for all subsequent analyses.

### Linear and Nonlinear modeling

This work follows general modeling for the Nonlinear Input Model (NIM) described in detail in McFarland et al.15; modified to incorporate separable spatiotemporal RFs. As described below, the model fitting code (in Matlab) has been made publicly available, and should complement the descriptions here and in previous publications. Briefly, all models were fit to maximize the penalized Poisson log-likelihood of the model given the data25, given by:

$$LL=\sum _{t}[{R}_{obs}(t)\mathrm{log}\,r(t)-r(t)]-P[\theta ]$$

where r(t) is the predicted spike output of the model, Robs(t) is the binned spike count of the neuron being modeled, and P[θ] is a regularization penalty that depends on the model parameters θ.

The predicted firing rate of the GLM model is given, familiarly, by r(t) = F[Ks(t)]25,26 (also Eq. 1, below), where we are using vector notation [bold] to denote the linear projection of the stimulus at a given time point (which includes all spatial positions and time lags) and the spatiotemporal filter K = K(x, y, τ) over all spatial positions and lags (back in time). The spiking nonlinearity F[.] is defined to be F[x] = log[1 + exp(x)]15,25. For the separable GLM, we approximated the full spatiotemporal kernel K(x, y, τ) as the outer product of a spatial kernel ksp(x, y) and temporal kernel kt(τ), resulting in the separable GLM (sGLM) used below.

The predicted firing rate of the NIM is given by15 (Eq. 2, below). As with the sGLM, the separable NIM (sNIM) involved approximating each subunit RF as the outer product of a spatial and a temporal kernel.

Because the outer product does not uniquely specify either kernel (because one can multiply one by any constant and divide the other to get a mathematically equivalent result), we constrained the spatial kernels to be normalized (i.e., fixed overall magnitude), and to have positive centers.

Models were fit using gradient ascent of the LL (above), as described in detail previously15. However, due to the separable kernels, each RF was fit by alternating fits of the spatial component (holding the temporal component fixed) and temporal component (holding the spatial component fixed) until convergence. For each neuron, we first fit computed the spike triggered average, and then used singular value decomposition to initialize the sGLM. Once the sGLM was fit, we initialized an sNIM to have ON-OFF selectivity: with two excitatory kernels, where the first kernel was initialized with that of the sGLM, and the second had the same temporal kernel multiplied by −1. If the result had a better LL than the LN fit and maintained an ON-OFF form, we then fit models with additional suppressive terms. Otherwise, we generated an NIM with a single excitatory subunit (initialized using the sGLM) and added suppressive terms. Note that each subunit could be one of four general types given the range of kernels that were fit: 1) ON excitatory, 2) suppressive, 3) OFF excitatory, and 4) OFF suppressive. ON and OFF refer to the preferred stimulus, and excitatory and suppressive indicate whether the filter output will be positive or negative.

All models were regularized using a Laplacian square penalty separately on the spatial and temporal kernels – which penalizes the second derivative15, resulting the penalized log-likelihood (above).

### Spatial RF analysis

The robust identification of multiple spatiotemporal RFs contributing to a given RGC response allows for detailed analysis of their spatial integration properties, as well as detecting the presence of surrounds. To this end, we fit circular Gaussians to a given spatial RF, fitting the precise center, standard deviation, and amplitude (4-parameters) by minimizing the mean-squared error using simplex optimization. We also could fit elliptical Gaussians (two additional parameters) robustly, but we found that this did not reveal additional useful information. With the centers identified, we could then take the average magnitude of pixels at any given radius from the center, generating a radial profile w(r) that revealed the center and surround. We also reported the center and surround areas, given by identifying the zero-crossing of w(r) and performing a weighted integration of (2πr)w(r) on either side of this zero-crossing.

### Availability of modeling code

All modeling code is available through our lab website at http://neurotheory.umd.edu/Code.

## Results

We recorded retinal ganglion cells (RGCs) from in vitro mouse retina using a multielectrode array. We first consider recordings made during the presentation of spatiotemporal white noise in order to perform standard receptive field (RF) analyses. The RF of RGCs is usually defined to be the optimal linear filter k that best predicts the cell’s spike responses27 and is the basis of the Linear-Nonlinear (LN) model, which generates a firing rate prediction of the response r(t) given the stimulus s(t) as follows (Fig. 1A):

$$r(t)=F[{\sum }_{i}{k}_{i}{s}_{i}(t)]\equiv F[{\bf{k}}\cdot {\bf{s}}(t)]$$
(1)

where i indexes the components of s(t) and the corresponding filter k that are relevant to the response at time t (including spatial locations and time lags into the past), and F[.] is a spiking nonlinearity that maps the output of the filter to a firing rate. Note we are using boldface to represent vectors, i.e., k = [k1 k2 k3 …]. When the stimulus is uncorrelated across its dimensions, then the optimal linear filter can be simply estimated using the spike triggered average (STA)10,27, defined as the average stimulus that evoked a spike. The spiking nonlinearity also can be estimated easily using the histogram10,27. For this reason, the LN model estimated in the context of Gaussian white noise (GWN; Fig. 1B, left) is the predominant method used to characterize RGCs10.

In using a white noise stimulus, however, consideration of the spatial extent and temporal duration of the pixels that compose the stimulus is critical28,29. Spatial and temporal parameters often are chosen empirically either to drive strong RGC responses or to yield RFs with good appearances, or both. A good RF for RGCs usually has a relatively smooth center covered by a few pixels (Fig. 1C, left); note, though, that in many publications the RFs pictured are interpolated into a higher resolution image and smoothed. While, in principle, smoother RFs can be obtained with finer spatial resolution using smaller pixels, having a large number of uncorrelated pixels tiling any one feature (e.g., the RF center) will typically not drive robust responses, because the summed luminance over that area will not deviate much from gray (mean) luminance. In the context of MEA recording, where RFs of many sizes might be sampled, the stimulus is usually adjusted to the most prevalent RF sizes, and therefore can be maladjusted for others29.

Furthermore, even an optimal choice of pixel size for a given RF center can bias RF estimation to features of that size. Receptive fields that have larger scales, such as the surround, will not be driven effectively by GWN stimuli optimized for the center: again, this occurs because the uncorrelated light and dark pixels will average to gray and not drive the larger spatial features. Consequently, features such as the RF surround will contribute minimally to the response to GWN stimulation and often not be visible in the spike-triggered average (Fig. 1C, left), which is a common issue with GWN characterizations throughout previous work.

### Likelihood based estimation of RFs using cloud stimuli

We presented a “cloud” stimulus to the same neurons characterized using GWN. This cloud stimulus was designed to have the same pixel size and duration as the GWN stimulus, but it introduced spatial correlations within each frame; these were generated by applying a spatial low-pass filter to a GWN stimulus (see Methods). Such correlations implicitly lead to dark and bright areas at a range of spatial scales (Fig. 1B, right) while maintaining the same high spatial resolution as GWN. The introduction of such correlations confounds traditional spike-triggered characterizations because the presence of correlations requires an additional (and noisy) step in computing the optimal linear filter: deconvolution of stimulus correlations from the STA27,30. This problem can be avoided by direct estimation of the linear filer by maximum a posteriori estimation, such as in the framework of the Generalized Linear Model (GLM)25,26. Such optimization is easily performed in principle (given standard computers and software), but it can introduce problems of overfitting due to the large number of parameters in the spatiotemporal filter that must be estimated simultaneously. For example, an appropriate spatial and temporal resolution for our experiments (21 × 21 spatial grid and 40 time lags at 16 ms resolution) results in thousands of parameters (17,640 in this case) that must be fit simultaneously. Application of smoothness regularization can mitigate overfitting, but its ability to suppress the noise in the RF without over-smoothing is limited, given typical amounts of data. Thus, in the context of relatively high-resolution spatiotemporal estimation problems, the GLM often provides little (if any) advantage over the STA in the context of a GWN stimulus; STA estimation is not affected by the number of parameters because each element is an independently performed average.

This major limitation of GLM estimation of cell RFs can be overcome by low-rank approximation18,19,20, which involves representing the spatiotemporal RF as a combination of separately estimated spatial and temporal RFs. Therefore, we adapted GLM estimation techniques to estimate space-time separable filters by representing the spatiotemporal filter as a spatial filter multiplied by a temporal filter, i.e., k(x, y, τ) = ksp(x, y) × kt(τ). This separable form is fit through block-coordinate descent: after initializing with the [correlated] STA, the spatial filter is held fixed while the temporal kernel is estimated (40 parameters), and then the temporal kernel is held fixed while estimating the spatial kernel (441 parameters). Alternating continues until the fit converges (see Methods), and corresponding spatial and temporal regularization is applied to each separately. Such an approach yields clean estimates of both (Fig. 1C, right): with much smaller amount of noise than the STA, and the appearance of additional details of the RF such as an opposite-polarity surround.

The separable GLM can be applied to both GWN and cloud data, yielding nearly identical temporal kernels (Fig. 1C). RFs measured from the cloud stimulus context, however, yield much more consistent features (Fig. 1D). Note that both RFs are optimized in their respective contexts: differences in spatial structure represent the spatial patterns that best predicted responses using an LN model. Although the neuron shown fired similar number of spikes in each context (58,219 in GWN and 61,358 in CLD), RF elements of the cloud drive the RGC much more efficiently, leading to more robust features in the estimated RF (Fig. 1D) including a smaller latency to spike (Fig. 1C, bottom right). This is most evident in the appearance of the RF surround, which we measured by fitting the center to a circular Gaussian (Fig. 1D, red) and then measuring the average value of the RF at concentric distances to yield a precise measurement of its strength (Fig. 1E). While there was a large diversity in surround strength for the LN models measured across neurons, the cloud stimulus yielded stronger surrounds in all but a handful of cases (Fig. 1F).

### Estimation of ON-OFF receptive fields

A large fraction of mouse RGCs are ON-OFF, meaning they are excited by both increments and decrements in luminance (Fig. 2A). Such selectivity defies characterization with linear RFs: linear stimulus processing implicitly averages the selectivity of different RF components into a single RF, and overlapping ON and OFF components cancel, i.e., (kon · s) + (koff · s) = (kon + koff) · s. For two RF components to contribute separately to excitation, ON and OFF selectivity must be represented in an additional nonlinear stage of processing (Fig. 2A):

$$r(t)=F[{f}_{1}[{{\bf{k}}}_{1}\cdot {\bf{s}}(t)]+{f}_{2}[{{\bf{k}}}_{2}\cdot {\bf{s}}(t)]]$$
(2)

This model has the form of an LNLN cascade31,32,33 and has been formulated in a maximum likelihood context as the Nonlinear Input Model (NIM)15. We adapted the NIM to incorporate separable spatiotemporal filters, and the resulting model, the separable NIM (sNIM) was fit to the same data as the separable GLM described above. To begin, we show an example cell that responds to both increments and decrements in luminance (i.e., ON and OFF stimuli) in a slowly modulated full-field stimulus (Fig. 2B). The selectivity demonstrated by the STA suggested that this cell is an ON cell because the calculation is limited to computing a single RF. The separable GLM, however, provided additional resolution and revealed a spatial asymmetry in the spatial RF (right), suggesting spatially offset and partially cancelling ON and OFF components. The sNIM fit to the same data demonstrated robust, circular ON and OFF responses, each with a discernible surround (Fig. 2C). The resulting two filters are represented as vectors that define a plane (Fig. 2E) in the much higher-dimensional “stimulus space”, which represents all possible types of feature selectivity. This plane defined by the detected ON and OFF filters largely contains the GLM filter, which in this case matched the sum of the two NIM filters (Fig. 2F). Although most of the power in the STA is dominated by noise, its projection into this plane is also in the same direction as the GLM (Fig. 2B, middle).

The sensitivity of the sNIM reveals ON and OFF selectivity in RGCs that otherwise would appear to be exclusively ON or OFF cells. ON-OFF cells are often characterized based on response to increments and decrements in full-field luminance (e.g., Fig. 2A). Measures of the peak responses to ON and OFF steps7 yield a distribution of response types (Fig. 3A, blue), ranging from pure ON to pure OFF with a large number of intermediate values. Such measures, however, are poor indicators of true ON-OFF cells, as demonstrated by the histogram of values for neurons with clear ON and OFF excitatory components in the sNIM (red). For example, we show an example neuron having a negligible response to a full-field decrement in luminance (Fig. 3B) and a clear ON RF predicted by the GLM (Fig. 3C). This cell, however, is best described by an ON-OFF sNIM (Fig. 3D), and its ON-OFF selectivity can be demonstrated from analysis of responses to repeated presentations of a unique cloud stimulus (Fig. 3G).

We thus characterized neurons as being ON-OFF based on whether their best model (i.e., that with the highest cross-validated model performance) had two excitatory subunits with opposing polarity. We used the log-likelihood (cross-validated, LLx), which is ideal for long continuous trials because it does not require repeated stimuli to compute (as compared with the more common measure such as explainable variance R2), as a measure of model performance33,34. The LLx-improvement for the ON-OFF model relative to the separable GLM was quite variable across labeled ON-OFF cells (Fig. 4A), with a median improvement of 55% (N = 141). There, however, were a large number of ON-OFF cells with small improvements (Fig. 4B). However, this was largely due to the distribution of ON-OFF bias (Fig. 4C), such neurons that were highly dominated by ON- or OFF- responses had little to gain by correctly predicting the response to opposite polarity, and thus large ON-OFF bias was associated with very little model improvement. While the PSTH-based analysis shown above (Fig. 3G) can demonstrate the clear presence of ON and OFF responses, such analysis was not possible with most of the dataset, because it depended on having repeated stimuli for each neuron that happens to contain noise patterns matching its ON and OFF RFs. As a result, model-based labeling cannot be definitive, but rather demonstrates the much more wide-spread presence of ON-OFF selectivity than what is traditionally reported through electrophysiological characterizations.

### Detection of suppressive surrounds

Although ON-OFF cells are the most obvious examples of nonlinear processing in the retina, most RGCs exhibit some form of nonlinear processing in their responses, such as adaptation to stimulus contrast35,36,37,38 and the generation of temporal precision39,40,41. In the context of the NIM model, such nonlinear properties might be explained through the addition of suppressive subunits33,42,43,44,45, e.g., the same model structure that fit ON-OFF cells (Fig. 2C) with the second subunit’s output multiplied by −1 such that it can only suppress the firing rate. Indeed, the majority of OFF cells have suppressive RF(s) identified by the sNIM (e.g., Fig. 5). These suppressive subunits typically have roughly the same spatial extent as the excitatory subunit (and LN model RF) (Fig. 5A, left), but delayed in temporal kernel (Fig. 5A, middle). Such delayed suppression is consistent with similar models fit to cat RGCs44 and LGN cells15,33,44. Compared with the corresponding LN model, both the spatial and temporal filters of the NIM have broader dimensions; likely because the LN model’s RF is the average of filters that largely cancel each other. The contribution of the suppression tends to make the response more transient due to the delay33.

Other RGCs models had an opposite-sign (or “Pull”) suppression in addition to a same-sign suppression (Fig. 5C). There was in fact every combination evident in the sNIM fits to ON and OFF cells, with a majority of OFF cells having a detectable suppressive component, while slightly more than half of ON cells did not (Fig. 5D). However, the detection of suppressive RFs typically lead to much smaller improvements in model performance (Fig. 5E; median = 10.0%, N = 290), because the effect of suppressive RFs only had a small impact on the predicted firing rate (Fig. 5B), related to its precision33. Nevertheless, the detection of the diversity in suppressive RFs demonstrates the ability to more accurately reflect the diverse computation of different RF types.

Suppressive RFs were also detected by the sNIM in ON-OFF neurons. Indeed, for the example neuron described by an ON and OFF excitatory RF (Fig. 6A), a delayed suppressive RF could be added to each excitatory RF (Fig. 6B). This pattern held in a majority of ON-OFF cells (Fig. 6C), although – like with the ON and OFF cells considered earlier, the addition of suppression only led to small improvements in model performance over the already large improvements associated with the ON-OFF models without suppression (Fig. 6D; median = 14.0%). However, when the sNIM is considered as a whole, it does result in considerable model improvements relative to the separable GLM (Fig. 6E; median = 89.7%, N = 141). Thus, the diversity of suppression is another element of the sNIM that contributes to the computational diversity of RGCs.

## Discussion

Here, we present a state-of-the-art modeling approach for characterizing retinal computation. It is based on extracellular recording of RGC action potentials evoked by “cloud” stimuli, which are correlated noise46,47. Analysis of responses to clouds reveals selectivity of RGCs across spatial scales: here, we used a maximum likelihood estimation applied to an LNLN cascade, which is selective to multiple spatiotemporal “features” (i.e., receptive fields), to describe stimulus-driven spike train data. To handle the large number of spatial and temporal dimensions of RFs, we adapted fitting procedures of the Nonlinear Input Model15 to use space-time separable RFs (i.e., the separable NIM; sNIM). Using this experimental and computational approach, we identified a diverse array of computations in the observed RGC population, which included a large number of ON-OFF cells. Our approach to characterizing RGC receptive fields (RFs) can be extended to detail differences between individual RGC types as well as between the retinas of different species (primate, mouse, salamander, etc.).

The sNIM utilizes a range of methods from existing statistical modeling methods; combining them to generate a robust description of retinal computation is novel and unique. The sNIM is based on the methods of the NIM15,33,44, modified to have space-time separable RFs15,18,19,20,33,48 to make it feasible to fit multiple spatiotemporal RFs per neuron at high spatial and temporal resolution. Furthermore, we employ cloud stimuli46,47 – essentially white noise stimuli filtered to have spatial correlations – to drive responses to features across spatial scales. This facilitated robust fitting of salient features of RGC computation, including separate ON and OFF and suppressive RFs.

Statistical models of RGCs (and related LGN neurons) are some of the most successful in neuroscience, in large part due to their predominantly linear behavior49. As a result, models of RGCs have been largely based on spike-triggered averages (STAs)10, which – although not explicitly “fit” to maximize likelihood – are the optimal linear filters in the context of stimuli such as Gaussian white noise10,27. Such models, however, capture only linear stimulus processing (i.e., a single RF) and must be used with uncorrelated stimuli. Spike-triggered covariance analysis50,51,52 extended spike-triggered techniques to nonlinear models, although are only able to yield a “feature space” (e.g., Fig. 2), within which the true features that the neuron is driven by exist. Thus, while such approaches identify multiple excitatory and suppressive features, they cannot be used to characterize these separate components contributing to RGC processing14,15,33,52. Furthermore, STC-based approaches14,17,50 can require a prohibitive amount of data for the types of spatiotemporal characterization, and for example could not be applied for recordings considered here.

The use of rectified nonlinearities in the context of an LNLN cascade, however, more clearly distinguishes the separate features that contributing to RGC computations, due to the two-stage nonlinear processing in such cascades better approximating the form of nonlinear processing in the retinal circuit15,18,44. Such an approach was recently used in the salamander retina, where it identified multiple excitatory features comprising single OFF RFs, likely corresponding to separate bipolar cell inputs18. This detection of excitatory RFs in another recent study in salamander was able to picture the spatial arrangement of such putative bipolar cell inputs in two-dimensions17. Neither study, however, detected ON-OFF cells nor suppressive RFs, likely reflecting the predominance of OFF RGCs in the salamander retina9. These OFF cells exhibit highly nonlinear excitatory integration53,54, when compared with those of the mouse retina. It is intriguing to think that more data, or tailored one-18 or two-dimensional16 stimuli, would allow such models to dissect more finely the distinct inputs integrated by different types of mammalian RGCs.

Naturally, the LNLN cascade only approximates the true nonlinearities present in RGC computation, and more detailed studies have been able to model particular elements of RGCs that cannot be captured in this general framework. Such nonlinear response properties include temporal precision33,40, nonlinear spatial integration16,17,55,56, contrast gain control35,43 and models based on mechanisms such as synaptic depression57,58 and presynaptic inhibition45. Such more detailed models will only be fittable in limited circumstances owing to the large number of component parameters and thus are not yet general. Likewise, the methods presented here are not ideal for characterization of direction-selectivity due to the non-separability of their RF components, which thus would require much larger amounts of data to adequately constrain and/or targeted visual stimuli. The problem of constraining more complex models is starting to be addressed using new machine-learning methods that can fit large populations of recorded neurons simultaneously58,59 and these provide an opportunity to fit large numbers of parameters with more data. Such techniques, however, are still in their infancy and currently have limited interpretability.

In this context, the sNIM is relatively simple: in a robust and user-friendly framework, it is able to approximate many of the nonlinearities revealed by more detailed studies. Assumptions about the specific structure of the model, such as space-time separability of the RFs, and the limit in the number of RF subunits, are necessarily approximations, but they maximize robustness and interpretability. Thus, the sNIM serves as a much-improved baseline model relative to the LN analysis that has dominated retinal computational analysis until recently.

### ON-OFF selectivity and the diversity of RGC computation

While making up only a small RGC population in cat and primate59,60, ON-OFF cells represent a significant fraction of the RGCs in mouse13,61. ON-OFF cells appear to be common in salamander retina as well54,62. Furthermore, an intriguing recent study utilizing LN analysis suggested that the response polarity of the linear receptive fields of mouse and pig RGCs varied with illumination intensity63: simple ON or OFF characterizations of retinal neurons therefore may be insufficient to capture their physiological functionality. Similar “polarity reversal” was observed in salamander OFF RGCs in response to “peripheral image shift”64, suggesting that robust methods to characterize ON-OFF responses are necessary14. The results we presented here demonstrate that the sNIM detects ON-OFF selectivity more sensitively than physiological characterizations using flashed stimuli and/or based on the linear RF7,9,11. The clear RFs detected also permit detailed characterization of the measured RFs (Fig. 1) and thus provide a framework for understanding the contribution of ON-OFF processing in vision.

More broadly, the revolution in mouse genetics has allowed for the discovery of new RGCs – e.g., J-RGCs65, and W3 cells13 – and the study of their presynaptic circuitry. Further, new anatomical and statistical techniques5,6 have amplified our understanding of novel and known RGC types. Nonlinear modeling approaches have offered a picture of cell diversity in more general stimulus contexts. The sNIM can be applied to general stimuli and offers a robust description of RGC computation that can be related to each cell’s stimulus selectivity, a diversity of nonlinear response properties, and potentially underlying mechanisms of processing within the visual circuitry.

## References

1. Barlow, H. B. & Levick, W. R. The mechanism of directionally selective units in rabbit’s retina. J Physiol 178, 477–504 (1965).

2. Münch, T. A. et al. Approach sensitivity in the retina processed by a multifunctional neural circuit. Nat Neurosci 12, 1308–1316 (2009).

3. Dhande, O. S., Stafford, B. K., Lim, J.-H. A. & Huberman, A. D. Contributions of Retinal Ganglion Cells to Subcortical Visual Processing and Behaviors. Annu Rev Vis Sci 1, 291–328 (2015).

4. Demb, J. B. & Singer, J. H. Functional Circuitry of the Retina. Annu Rev Vis Sci 1, 263–289 (2015).

5. Helmstaedter, M. et al. Connectomic reconstruction of the inner plexiform layer in the mouse retina. Nature 500, 168–174 (2013).

6. Baden, T. et al. The functional diversity of retinal ganglion cells in the mouse. Nature 529, 345–350 (2016).

7. Carcieri, S. M., Jacobs, A. L. & Nirenberg, S. H. Classification of retinal ganglion cells: a statistical approach. J Neurophysiol 90, 1704–1713 (2003).

8. Seung, H. S. & Sümbül, U. Neuronal cell types and connectivity: lessons from the retina. Neuron 83, 1262–1272 (2014).

9. Segev, R., Puchalla, J. L. & Berry, M. J. II. Functional organization of ganglion cells in the salamander retina. J Neurophysiol 95, 2277–2292 (2006).

10. Chichilnisky, E. J. A simple white noise analysis of neuronal light responses. Network 12, 199–213 (2001).

11. Chichilnisky, E. J. & Kalmar, R. S. Functional asymmetries in ON and OFF ganglion cells of primate retina. J Neurosci 22, 2737–2747 (2002).

12. Asari, H. & Meister, M. Divergence of visual channels in the inner retina. Nat Neurosci 15, 1581–1589 (2012).

13. Zhang, Y., Kim, I.-J., Sanes, J. R. & Meister, M. The most numerous ganglion cell type of the mouse retina is a selective feature detector. PNAS 109, E2391–8 (2012).

14. Cantrell, D. R., Cang, J., Troy, J. B. & Liu, X. Non-centered spike-triggered covariance analysis reveals neurotrophin-3 as a developmental regulator of receptive field properties of ON-OFF retinal ganglion cells. PLoS Comput Biol 6, e1000967 (2010).

15. McFarland, J. M., Cui, Y. & Butts, D. A. Inferring nonlinear neuronal computation based on physiologically plausible inputs. PLoS Comput Biol 9, e1003143 (2013).

16. Freeman, J., Field, G. D., Li, P. H., Greschner, M. & Gunning, D. E. Mapping nonlinear receptive field structure in primate retina at single cone resolution. Elife, https://doi.org/10.7554/eLife.05241.001 (2015).

17. Liu, J. K. et al. Inference of neuronal functional circuitry with spike-triggered non-negative matrix factorization. Nat Commun 8, 149 (2017).

18. Maheswaranathan, N., Kastner, D. B., Baccus, S. A. & Ganguli, S. Inferring hidden structure in multilayered neural circuits. PLoS Comput Biol 14, e1006291 (2018).

19. Thorson, I. L., Liénard, J. & David, S. V. The Essential Complexity of Auditory Receptive Fields. PLoS Comput Biol 11, e1004628 (2015).

20. Park, M. & Pillow, J. W. Bayesian inference for low rank spatiotemporal neural receptive fields. Advances in Neural Information Processing (2013).

21. Lyubarsky, A. L., Daniele, L. L. & Pugh, E. N. From candelas to photoisomerizations in the mouse eye by rhodopsin bleaching in situ and the light-rearing dependence of the major components of the mouse ERG. Vision Research 44, 3235–3251 (2004).

22. Applebury, M. L. et al. The murine cone photoreceptor: a single cone type expresses both S and M opsins with retinal spatial patterning. Neuron 27, 513–523 (2000).

23. Jacobs, G. H., Williams, G. A. & Fenwick, J. A. Influence of cone pigment coexpression on spectral sensitivity and color vision in the mouse. Vision Research 44, 1615–1622 (2004).

24. Wang, Y. V., Weick, M. & Demb, J. B. Spectral and temporal sensitivity of cone-mediated responses in mouse retinal ganglion cells. J Neurosci 31, 7670–7681 (2011).

25. Paninski, L. Maximum likelihood estimation of cascade point-process neural encoding models. Network 15, 243–262 (2004).

26. Truccolo, W., Eden, U. T., Fellows, M. R., Donoghue, J. P. & Brown, E. N. A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. J Neurophysiol 93, 1074–1089 (2005).

27. Dayan, P. & Abbott, L. F. Theoretical neuroscience: computational and mathematical modeling of neural systems. (Massachusetts Institute of Technology Press, 2001).

28. Meister, M., Pine, J. & Baylor, D. A. Multi-neuronal signals from the retina: acquisition and analysis. J Neurosci Methods 51, 95–106 (1994).

29. Pamplona, D., Hilgen, G., Cessac, B., Sernagor, E. & Kornprobst, P. A super-resolution approach for receptive fields estimation of neuronal ensembles. BMC Neurosci 16, P130 (2015).

30. Theunissen, F. E. et al. Estimating spatio-temporal receptive fields of auditory and visual neurons from their responses to natural stimuli. Network 12, 289–316 (2001).

31. Korenberg, M. J., Sakai, H. M. & Naka, K.-I. Dissection of the neuron network in the catfish inner retina. III. Interpretation of spike kernels. J Neurophysiol 61, 1110–1120 (1989).

32. Hochstein, S. & Shapley, R. M. Linear and nonlinear spatial subunits in Y cat retinal ganglion cells. J Physiol 262, 265–284 (1976).

33. Butts, D. A., Weng, C., Jin, J. Z., Alonso, J.-M. & Paninski, L. Temporal precision in the visual pathway through the interplay of excitation and stimulus-driven suppression. J Neurosci 31, 11313–11327 (2011).

34. Pillow, J. W., Paninski, L., Uzzell, V. J., Simoncelli, E. P. & Chichilnisky, E. J. Prediction and decoding of retinal ganglion cell responses with a probabilistic spiking model. J Neurosci 25, 11003–11013 (2005).

35. Shapley, R. M. & Victor, J. D. The effect of contrast on the transfer properties of cat retinal ganglion cells. J Physiol 285, 275–298 (1978).

36. Kim, K. J. & Rieke, F. Temporal contrast adaptation in the input and output signals of salamander retinal ganglion cells. J Neurosci 21, 287–299 (2001).

37. Brown, S. P. & Masland, R. H. Spatial scale and cellular substrate of contrast adaptation by retinal ganglion cells. Nat Neurosci 4, 44–51 (2001).

38. Baccus, S. A. & Meister, M. Fast and slow contrast adaptation in retinal circuitry. Neuron 36, 909–919 (2002).

39. Liu, R. C., Tzonev, S., Rebrik, S. P. & Miller, K. D. Variability and information in a neural code of the cat lateral geniculate nucleus. J Neurophysiol 86, 2789–2806 (2001).

40. Berry, M. J. II & Meister, M. Refractoriness and neural precision. J Neurosci 18, 2200–2211 (1998).

41. Uzzell, V. J. & Chichilnisky, E. J. Precision of spike trains in primate retinal ganglion cells. J Neurophysiol 92, 780–789 (2004).

42. Bonin, V., Mante, V. & Carandini, M. The statistical computation underlying contrast gain control. J Neurosci 26, 6346–6353 (2006).

43. Mante, V., Bonin, V. & Carandini, M. Functional mechanisms shaping lateral geniculate responses to artificial and natural stimuli. Neuron 58, 625–638 (2008).

44. Butts, D. A., Cui, Y. & Casti, A. R. R. Nonlinear computations shaping temporal processing of precortical vision. J Neurophysiol 116, 1344–1357 (2016).

45. Cui, Y., Wang, Y. V., Park, S. J. H., Demb, J. B. & Butts, D. A. Divisive suppression explains high-precision firing and contrast adaptation in retinal ganglion cells. eLife 5, e19460 (2016).

46. Niell, C. M. & Stryker, M. P. Highly selective receptive fields in mouse visual cortex. J Neurosci 28, 7520–7536 (2008).

47. Leon, P. S., Vanzetta, I., Masson, G. S. & Perrinet, L. U. Motion clouds: model-based stimulus synthesis of natural-like random textures for the study of motion perception. J Neurophysiol 107, 3217–3226 (2012).

48. Cai, D., DeAngelis, G. C. & Freeman, R. D. Spatiotemporal receptive field organization in the lateral geniculate nucleus of cats and kittens. J Neurophysiol 78, 1045–1061 (1997).

49. Shapley, R. M. Linear and nonlinear systems analysis of the visual system: why does it seem so linear? A review dedicated to the memory of Henk Spekreijse. Vision Research 49, 907–921 (2009).

50. Fairhall, A. L. et al. Selectivity for multiple stimulus features in retinal ganglion cells. J Neurophysiol 96, 2724–2738 (2006).

51. Samengo, I. & Gollisch, T. Spike-triggered covariance: geometric proof, symmetry properties, and extension beyond Gaussian stimuli. J Comput Neurosci 34, 137–161 (2013).

52. Liu, J. K. & Gollisch, T. Spike-Triggered Covariance Analysis Reveals Phenomenological Diversity of Contrast Adaptation in the Retina. PLoS Comput Biol 11, e1004425 (2015).

53. Gollisch, T. & Meister, M. Modeling convergent ON and OFF pathways in the early visual system. Biological Cybernetics 99, 263–278 (2008).

54. Gollisch, T. & Meister, M. Rapid neural coding in the retina with relative spike latencies. Science 319, 1108–1111 (2008).

55. Gollisch, T. Features and functions of nonlinear spatial integration by retinal ganglion cells. J Physiol-Paris 107, 338–348 (2013).

56. Schwartz, G. W. et al. The spatial structure of a nonlinear receptive field. Nat Neurosci 15, 1572–1580 (2012).

57. Jarsky, T. et al. A synaptic mechanism for retinal adaptation to luminance and contrast. J Neurosci 31, 11003–11015 (2011).

58. Ozuysal, Y. & Baccus, S. A. Linking the computational structure of variance adaptation to biophysical mechanisms. Neuron 73, 1002–1015 (2012).

59. Troy, J. B. & Shou, T. The receptive fields of cat retinal ganglion cells in physiological and pathological states: where we are after half a century of research. Prog Retin Eye Res 21, 263–302 (2002).

60. Perry, V. H., Oehler, R. & Cowey, A. Retinal ganglion cells that project to the dorsal lateral geniculate nucleus in the macaque monkey. Neuroscience 12, 1101–1123 (1984).

61. Sun, W., Li, N. & He, S. Large-scale morphological survey of mouse retinal ganglion cells. J Comp Neurol 451, 115–126 (2002).

62. Hensley, S. H., Yang, X. L. & Wu, S. M. Relative contribution of rod and cone inputs to bipolar cells and ganglion cells in the tiger salamander retina. J Neurophysiol 69, 2086–2098 (1993).

63. Tikidji-Hamburyan, A. et al. Retinal output changes qualitatively with every change in ambient illuminance. Nat Neurosci 18, 66–74 (2015).

64. Geffen, M. N., de Vries, S. E. J. & Meister, M. Retinal ganglion cells can rapidly change polarity from Off to On. PLoS Biol 5, e65 (2007).

65. Kim, D. S. et al. Identification of molecular markers of bipolar cells in the murine retina. J Comp Neurol 507, 1795–1810 (2008).

## Acknowledgements

This work was supported by the Department of Biology at University of Maryland, College Park, NIH EY021372 (Q.S. and J.S.) and NSF IIS-1350990 (D.A.B., P.G., and A.K.B.).

## Author information

Authors

### Contributions

Q.S., J.H.S. and D.A.B. designed the experiments. Q.S. performed the MEA recordings. Q.S., P.G., A.K.B. and D.A.B. performed data analysis. Q.S., J.H.S. and D.A.B. wrote the paper.

### Corresponding author

Correspondence to Qing Shi.

## Ethics declarations

### Competing Interests

The authors declare no competing interests.

Publisher’s note: Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

## Rights and permissions

Reprints and Permissions

Shi, Q., Gupta, P., Boukhvalova, A. et al. Functional characterization of retinal ganglion cells using tailored nonlinear modeling. Sci Rep 9, 8713 (2019). https://doi.org/10.1038/s41598-019-45048-8

• Accepted:

• Published:

• DOI: https://doi.org/10.1038/s41598-019-45048-8