Neuronal coding of multiscale temporal features in communication sequences within the bat auditory cortex

Experimental evidence indicates that cortical oscillations represent the multiscale temporal modulations present in natural stimuli, yet little is known about the processing of these multiple timescales at a neuronal level. Here, using extracellular recordings from the auditory cortex (AC) of awake bats (Carollia perspicillata), we show the existence of three neuronal types which represent different levels of the temporal structure of conspecific vocalizations, and therefore constitute direct evidence of multiscale temporal processing of naturalistic stimuli by neurons in the AC. These neuronal subpopulations synchronize differently to local-field potentials, particularly in theta- and high-frequency bands, and are informative to a different degree in terms of their spike rate. Interestingly, we also observed that both low- and high-frequency cortical oscillations can be highly informative about the listened calls. Our results suggest that multiscale neuronal processing allows for the precise and non-redundant representation of natural vocalizations in the AC.

LFPs recorded in response to seq4, across trials, for the same unit shown in A. The observed distribution from the data is depicted on the left, whereas a surrogate amplitude distribution (from one repetition) is shown on the right (see Supplementary Methods). The bottom row follows the same conventions described above, but the amplitude distribution across phase bins is shown for the unit's spontaneous activity.
The modulation indexes (Midxs) for each case are indicated in the figure. Note that, to allow for further PAC analyses and quantifications, Midxs from the observed data were z-normalized to a surrogate distribution (m = 500 repetitions), in which the relationship between the 50-100 Hz LFP amplitude was consistently disassociated from the 4-8 Hz LFP phase (see Methods). (c) z-normalized Midxs across neuronal groups (STs, blue, n = 22; BTs, red, n = 37; and NTs, black, n = 22), in response to the two longest sequences used as stimuli (i.e. seq2 and seq4, strong line colors), and during spontaneous activity (light line colors). There was significant PAC between high frequency LFPs and theta band oscillations, during spontaneous activity and also while the animal listened to the calls, independently of the unit type considered (red stars on top of each boxplot indicate z-values significantly higher than 2; FDR-corrected Wilcoxon signed-rank tests; pcorr <= 0.03; *, pcorr < 0.05; **, pcorr < 0.01; ***, pcorr < 0.001). There were no significant differences between z-normalized Midxs during spontaneous activity and during sound stimulation (FDR-corrected Wilcoxon signed-rank tests; pcorr > 0.08), nor did we observe significant differences between neuronal groups (FDR-corrected Wilcoxon rank-sum tests; pcorr > 0.27), suggesting that in our data, the PAC between theta and fast LFPs was not a reliable marker to distinguish the temporal patterns observed in ST, BT or NT units.

Periodicity analysis of neuronal responses
Periodicities present in the responses from groups of ST, BT or NT units were assessed using the autocorrelation of the spiking PDFs, taking into account that slow (or fast) periodicities would be marked by low (or high) frequency oscillations in the autocorrelogram. Formally, responses of a certain neuronal group were considered to carry slow or fast periodicities if the z-normalized power in low or high frequencies of the set of units comprising the group was significantly higher than a z-score of 2 (FDR-corrected one-tailed Wilcoxon signed-rank tests; significance when pcorr < 0.05; red stars in Supplementary Figure 3). When comparing values across groups, FDR-corrected Wilcoxon rank-sum tests were performed (statistical significance for pcorr < 0.05).
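A minimal sketch of the autocorrelation step, assuming the spiking PDF is a binned firing probability sampled at rate fs. The function name autocorr_spectrum, the 1 ms bin size and the toy 10 Hz modulation are illustrative assumptions, not values from the study; the surrogate-based z-normalization is not shown.

```python
import numpy as np

def autocorr_spectrum(pdf, fs):
    """Autocorrelogram of a spiking PDF and the power spectrum of that autocorrelogram."""
    x = np.asarray(pdf, dtype=float)
    x = x - x.mean()                        # remove the DC offset
    ac = np.correlate(x, x, mode="full")    # lags -(n-1) .. (n-1)
    ac = ac / ac.max()                      # equals 1 at zero lag
    power = np.abs(np.fft.rfft(ac)) ** 2    # slow/fast periodicities show up here
    freqs = np.fft.rfftfreq(ac.size, d=1.0 / fs)
    return ac, freqs, power

# Toy example (hypothetical data): firing probability modulated at 10 Hz,
# 1 ms bins (fs = 1000 Hz), 1 s long.
fs = 1000.0
t = np.arange(1000) / fs
pdf = 1.0 + np.cos(2 * np.pi * 10.0 * t)
pdf /= pdf.sum()

ac, freqs, power = autocorr_spectrum(pdf, fs)
peak_freq = freqs[np.argmax(power[1:]) + 1]   # skip the DC bin
```

In this toy case the spectral peak of the autocorrelogram falls close to the injected 10 Hz modulation, which is the sense in which slow or fast periodicities appear as low- or high-frequency power.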

Stimulus-field coherence
Coherence between cortical oscillations and the natural sequences was calculated using the stimulus-field coherence (StimFC) metric. Conceptually, the StimFC is similar to the SFC: a frequency-dependent, normalized synchronization index that quantifies how LFPs lock to the stimulus. In this case, LFP windows extend from the stimulus onset until its offset (note that this implies variations in LFP segment length across sequences). The average (across trials, n = 50) of the LFP windows (stimulus-triggered average, StimTA) retains oscillatory components whose phase is consistent throughout stimulus presentations, and therefore related to the stimulus, while non-synchronized components are "averaged-out". Similar to the SFC calculations, the power of the StimTA is normalized to the average power of the individual LFP windows, which yields values between 0 and 1 indicating the strength of coherence in a certain frequency of the spectrum. LFP-stimulus synchronization was calculated on a unit-per-unit basis, for every distress sequence analyzed. Power spectra were obtained using the multitaper method [1] available in Chronux [2], using 5 tapers and a TW product of 8.
The StimFC was z-normalized to a surrogate distribution in which phase consistencies across trials were destroyed as described below. For each trial (considering a certain unit and natural sequence), the LFP trace was split at a random time-point, and the resulting two segments were swapped. By doing this, surrogate StimTAs lost the synchronized oscillatory components present in the observed data. Based on the above, surrogate StimFC values, representing results with chance-level coherence, were calculated. The described procedure was repeated 500 times, in which surrogate values were computed with the same parameters used for the observed data. Finally, StimFC values were z-normalized, on a frequency bin basis, to the corresponding surrogate distribution (note that the above was done on a unit per unit basis, for all natural sequences tested).
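The StimFC and its split-and-swap surrogate can be sketched as follows. This is an illustration under stated assumptions, not the original analysis code: a plain FFT periodogram replaces the Chronux multitaper estimate, and the function names (stim_fc, swap_surrogate, z_stim_fc) and the toy LFP data are hypothetical.

```python
import numpy as np

def stim_fc(lfp_trials):
    """Power of the trial average (StimTA) divided by the mean single-trial power."""
    lfp = np.asarray(lfp_trials, dtype=float)          # shape (n_trials, n_samples)
    p_avg = np.abs(np.fft.rfft(lfp.mean(axis=0))) ** 2
    p_trials = (np.abs(np.fft.rfft(lfp, axis=1)) ** 2).mean(axis=0)
    return p_avg / p_trials                            # between 0 and 1 per bin

def swap_surrogate(lfp_trials, rng):
    """Split each trial at a random time point and swap the two segments."""
    lfp = np.asarray(lfp_trials, dtype=float)
    out = np.empty_like(lfp)
    for i, trial in enumerate(lfp):
        cut = rng.integers(1, trial.size)              # random split point
        out[i] = np.concatenate((trial[cut:], trial[:cut]))
    return out

def z_stim_fc(lfp_trials, n_surr=500, seed=0):
    """z-normalize the observed StimFC, bin by bin, to the surrogate distribution."""
    rng = np.random.default_rng(seed)
    observed = stim_fc(lfp_trials)
    surr = np.array([stim_fc(swap_surrogate(lfp_trials, rng)) for _ in range(n_surr)])
    # The swap preserves each trial's mean, so the DC bin has no surrogate
    # variance and its z-score is undefined; warnings are silenced for that bin.
    with np.errstate(divide="ignore", invalid="ignore"):
        z = (observed - surr.mean(axis=0)) / surr.std(axis=0)
    return observed, z

# Toy example (hypothetical data): 50 trials, 1 s at 1 kHz, an 8 Hz component
# that is phase-locked across trials, plus independent noise on every trial.
fs, n_trials, n_samples = 1000.0, 50, 1000
t = np.arange(n_samples) / fs
data_rng = np.random.default_rng(1)
lfp = np.sin(2 * np.pi * 8.0 * t) + data_rng.standard_normal((n_trials, n_samples))
observed, z = z_stim_fc(lfp)
freqs = np.fft.rfftfreq(n_samples, d=1.0 / fs)         # bin k corresponds to k Hz here
```

With these toy data the 8 Hz bin shows a StimFC close to 1 and a large z-score, whereas bins containing only noise stay near the surrogate level.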

Phase amplitude coupling
The phase amplitude coupling (PAC) analyses were performed as described in [3], on a unit per unit basis, using LFPs recorded in response to sequences 2 and 4 (with LFP windows extending from 100 ms after stimulus onset, until stimulus offset). Other sequences (1 and 3) were not used because their lengths did not allow for a robust estimation of PAC. In each case, the phase modulator was considered to be the theta-band (4-8 Hz) LFP, whereas the signal whose amplitude was being modulated was considered to be the fast (50-100 Hz) LFP. The instantaneous phase of theta-band oscillations, φ(t), as well as the amplitude envelope of fast LFPs, A(t), were obtained from the Hilbert transform of the respective filtered LFP traces [3], and were paired in a composite time series [φ(t), A(t)]. Phases were then binned into 36 equally sized bins (10° angular size), and for each bin j we accumulated the mean instantaneous energy of the fast LFP (A(t)), coupled with phases that fell within that particular bin. The amplitude of each bin was normalized to the sum across bins, which results in a distribution that holds the properties of a discrete probability density function. This distribution is referred to as the "amplitude distribution" [3]. The modulation index (Midx) was then obtained by normalizing the observed amplitude distribution (P) with a uniform distribution (U), by means of the Kullback-Leibler distance. Thus, the Midx ranges from 0 to 1 representing, respectively, absolute absence of modulation, or perfect modulation. Similar analyses were performed for LFPs recorded in spontaneous activity, using randomly chosen LFP windows of the same length as those used for estimating PAC during sound stimulation.
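The Midx computation described above can be sketched as follows. This is a hedged illustration: band-limited analytic signals are obtained here by FFT masking (an ideal band-pass combined with the Hilbert transform) rather than with the filters used in the study, the Kullback-Leibler distance is divided by log(36) so that the index lies between 0 and 1, and the helper names (analytic_band, modulation_index) and the toy signals are hypothetical.

```python
import numpy as np

def analytic_band(x, fs, lo, hi):
    """Band-limited analytic signal via FFT masking (ideal band-pass plus Hilbert)."""
    x = np.asarray(x, dtype=float)
    spec = np.fft.fft(x)
    freqs = np.fft.fftfreq(x.size, d=1.0 / fs)
    mask = (freqs >= lo) & (freqs <= hi)       # positive frequencies inside the band
    return np.fft.ifft(2.0 * spec * mask)

def modulation_index(lfp, fs, phase_band=(4, 8), amp_band=(50, 100), n_bins=36):
    """Amplitude distribution over phase bins and its normalized KL distance from uniform."""
    phase = np.angle(analytic_band(lfp, fs, *phase_band))   # theta phase
    amp = np.abs(analytic_band(lfp, fs, *amp_band))         # fast-LFP envelope
    edges = np.linspace(-np.pi, np.pi, n_bins + 1)
    idx = np.clip(np.digitize(phase, edges) - 1, 0, n_bins - 1)
    # Assumes every phase bin contains at least one sample.
    mean_amp = np.array([amp[idx == j].mean() for j in range(n_bins)])
    p = mean_amp / mean_amp.sum()                           # amplitude distribution P
    nz = p > 0
    kl = np.sum(p[nz] * np.log(p[nz] * n_bins))             # KL(P || U), U = 1/n_bins
    return kl / np.log(n_bins), p

# Toy signals (hypothetical): 5 s at 1 kHz, 6 Hz theta plus a 70 Hz component
# whose envelope either follows the theta phase (coupled) or is constant.
fs = 1000.0
t = np.arange(5000) / fs
theta = np.cos(2 * np.pi * 6.0 * t)
fast = 0.5 * np.cos(2 * np.pi * 70.0 * t)
coupled = theta + (1.0 + theta) * fast
uncoupled = theta + fast

mi_coupled, p_coupled = modulation_index(coupled, fs)
mi_uncoupled, p_uncoupled = modulation_index(uncoupled, fs)
```

The coupled signal yields a clearly non-zero index while the uncoupled one stays at numerical zero. In practice the observed value would still be z-normalized against a surrogate distribution.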
To test for the significance of the observed PAC, we computed a surrogate distribution of Midx values from data in which the relationship between the phase of theta-band LFPs and the amplitude of fast oscillations was reduced to chance. The above was accomplished by block-swapping instantaneous phase traces across trials at random time points 500 times.
Such procedure yields surrogates in which there is minimal distortion of the phase-amplitude

Stimulus characterization
In order to quantify how much information auditory cortical units provided about the natural distress sequences, we calculated how informative these units were in terms of their ability to distinguish specific parts of the acoustic stimulus [5][6][7][8]. To that end, the sequence used as stimulus was divided into a set of M consecutive and non-overlapping segments sk (k = 1, 2, 3, …, M; time windows used for calculating main results were of length T = 4 ms), and each of these sub-stimuli sk was treated as an independent member of the set S (i.e. the whole divided sequence). This approach makes no assumptions about which features of the call elicited a response, and is therefore well suited to quantify the information content of the responses to the sequences. Note that it is assumed that all sub-stimuli sk are equiprobable.

Information in neural codes
To quantify the information contained in the firing rate of individual units (Irate), we determined the number of spikes occurring in response to each of the sub-stimuli (sk) considered, and based on that we then computed the information content. In this case, the response set was defined as the occurrence or not of a spike (R = {0, 1}), and P(r) was then the probability of firing (or not) in each unit. P(r) was estimated throughout the total set of 50 trials used for stimulation. We considered the time window for the sub-stimuli (T = 4 ms) to be sufficiently short as to assume that only one spike would occur inside of them in a given trial, and thus the information estimates provided in this manuscript were calculated with binarized responses (i.e. only the occurrence of a spike was noted, not the total number of spikes in a certain window). Note that this assumption can only underestimate the total value of Irate, but we verified that there was no significant information loss due to the response binarization by comparing information estimates with and without binarizing the response (Wilcoxon signed-rank tests, pcorr > 0.09).
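A minimal sketch of Irate for binarized responses, assuming the standard Shannon mutual information between equiprobable sub-stimuli and the response, I(S;R) = Σk P(sk) Σr P(r|sk) log2[P(r|sk) / P(r)]. The formula, the function name rate_information and the toy spike matrices are assumptions made for illustration; no limited-sampling bias correction is applied.

```python
import numpy as np

def rate_information(spikes):
    """Mutual information (bits) between sub-stimulus identity and spike occurrence.

    spikes: array of shape (n_trials, M); entry > 0 if the unit fired during
    sub-stimulus k on that trial (responses are binarized to R = {0, 1}).
    """
    b = (np.asarray(spikes) > 0).astype(float)
    p1_given_s = b.mean(axis=0)                              # P(r = 1 | s_k)
    p_r_given_s = np.stack([1.0 - p1_given_s, p1_given_s])   # shape (2, M)
    p_r = p_r_given_s.mean(axis=1, keepdims=True)            # P(r), equiprobable s_k
    with np.errstate(divide="ignore", invalid="ignore"):
        terms = p_r_given_s * np.log2(p_r_given_s / p_r)
    terms = np.nan_to_num(terms)                             # convention: 0 * log 0 = 0
    return terms.sum(axis=0).mean()                          # average over the M sub-stimuli

# Toy examples (hypothetical): 50 trials, M = 4 sub-stimuli.
selective = np.zeros((50, 4), dtype=int)
selective[:, 0] = 1                  # fires on every trial, but only in sub-stimulus 1
unselective = np.ones((50, 4), dtype=int)   # fires everywhere: response carries no information

i_selective = rate_information(selective)
i_unselective = rate_information(unselective)
```

For the selective toy unit the result equals the entropy of a binary response with P(spike) = 0.25 (about 0.81 bits), whereas the unselective unit yields 0 bits.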
In order to compute the information provided by the LFP phase (Iphase), we filtered the LFP in 11 different and non-overlapping frequency bands, ranging from 4 to 72 Hz, using a 4th order
For the rate and phase information (Irate_phase), a spike in a sub-stimulus was labeled with the symbol of the phase bin into which the average instantaneous phase (within the sub-stimulus) fell. This way, spikes were associated with a certain phase bin of the LFP in each of the frequency bands analyzed, yielding a set of responses that can be represented as R = {0, 1, 2, …}, where 0 implies the occurrence of no spike, whereas the nonzero symbols indicate the occurrence of a spike, labeled with a certain phase bin.
If adding the phase of the LFP to the spike rate provides additional information not contained in the rate alone, and this information is indeed genuine (i.e. the phase of the LFP is not a function of the spiking rate), then the information in Irate_phase must still be higher than 0, even when the Irate is nullified. This was achieved in practice by selecting sub-stimuli in which the spike rate was the same (rendering this particular code non-informative) and quantifying the information in Irate_phase. As mentioned above, observing an Irate_phase > 0 would be a way to verify that the phase-of-firing at a particular frequency band of the LFP provides additional and genuine information not contained in the rate-of-firing alone [7,8]. The procedure of selecting stimulus epochs in which the units fired with the same rate was done for rates up to 55 spikes/s. Rates higher than 55 spikes/s were not considered because no more than 33% of the units (29 units) could be used for analyses.
We calculated the percentage of recorded units that exhibited a significant increase in information when considering Irate_phase at each fixed spiking rate. Our criterion was that a unit showed a significant increase in the rate-phase code if: i) the unit's Irate_phase value considering a particular fixed spiking rate was significantly higher (z-score > 1.67, equivalent to a p-value of 0.05 in a one-tailed test) than a surrogate distribution; and ii) the unit's information content in Irate was not significantly higher than the corresponding surrogate distribution. In other words, those criteria amount to considering a significant increase per unit if the information in Irate_phase was higher than 0 and the information in Irate was not.
We also quantified the information content in the response of two simultaneously recorded units to the presented natural vocalizations (Ijoint). To that end, we used a spatial code which takes into account the identity of the unit that elicited (or not) a spike. The response in this