Introduction

Selective attention is believed to facilitate auditory task performance by enhancing neural representation of behaviorally meaningful sounds1,2,3,4,5,6,7,8. Task-related plasticity is a neural correlate of selective attention that is characterized by transient changes in both the gain of neuronal responses to auditory targets, and the shape of spectrotemporal receptive fields (STRFs)4,6,7. During pure-tone detection and discrimination tasks, individual neurons become more responsive to auditory targets, while STRFs become more selective for the auditory target frequency4,6,7.

Converging lines of evidence from both anatomical and neurophysiological studies suggest that task-related plasticity in A1 may be greater in cortical layer 2/3 (L2/3) than layers 4–6 (L4–6), due to intracortical network activity within L2/3 that is believed to mediate top-down control of sensory processing9,10,11,12,13. The L2/3 intracortical network may provide a pathway for prefrontal cortex to bias A1 responsiveness in favor of behaviorally meaningful sounds14,15,16. However, the laminar profile of task-related plasticity in A1 remains unclear since few studies have recorded simultaneously across layers during auditory task performance17,18. In humans, behavioral detection of a frequency modulation sweep sharpened frequency tuning in superficial cortical layers more than in middle-deep layers17. In monkeys, intermodal attention-related suppressive effects predominated neural responses in superficial cortical layers, yet response enhancement was dominant in middle-deep layers18. Long-lasting effects of auditory training on A1 responses to sounds in anesthetized rats include an enhancement of neural responses in L2/3, but not in layer 419. It is believed that high-frequency LFPs (i.e. “high-gamma” LFPs >80 Hz) measure synchronous spiking from many neurons20. Here, we hypothesized that task-related plasticity might be (1) greater in superficial L2/3 than in middle-deep L4–6, and (2) similar for multi-unit spiking and high-gamma LFPs.

Results

Target response enhancement was greatest in L2/3

We studied the laminar profile of rapid task-related plasticity by recording from a 24 channel linear electrode array (Plexon U-probe) inserted through the dura, orthogonal to the surface of A1, in two ferrets that were performing an auditory detection task (Fig. 1a). During task performance trials, the animal heard a sequence of reference noises followed by a pure-tone target (Fig. 1b). Upon detecting the target, the animal was trained to stop licking a waterspout to avoid a mild shock4. Neural responses to the same sounds used during the task were also measured while the animal was in a passive, quiescent state, to provide a within-animal passive control condition for neural activity. Wide-band recordings from the 24 channel linear array allowed us to analyze multi-unit spiking and high-gamma LFP magnitudes across a 1.8 mm depth that included layers 2/3–621 (see Methods and Fig. 1c). All statistical comparisons were done using a bootstrap hypothesis test (see Methods).

Figure 1

Awake-behaving experimental paradigm and electrode depth registration. (a) Head-fixed preparation. Ferrets were implanted with a metal post that was attached to the skull and held fixed during awake-behaving neurophysiological experiments. The ferret performed the task while we recorded from primary auditory cortex (A1) using a 24 channel linear electrode array (Plexon U-probe). (b) Pure-tone detection task. Two ferrets were trained to do a conditioned avoidance Go/No-Go pure-tone detection task. In each trial of the task, the animal heard a sequence of reference noises followed by a pure-tone target. Reference noises were “Go” signals, during which the animal was free to lick a waterspout. Upon detecting the target (the “No-Go” signal), the animal stopped licking the water spout to avoid a mild shock. The target frequency was different for each experiment. (c) Electrode depth-registration. The left panel shows an example of how the layer 2/3 (L2/3) vs. layers 4–6 (L4–6) border (dashed line) was computed for a single penetration of the 24 channel linear electrode array in A1. Local field potential (LFP) responses to 100 ms tones were used to find a common marker of depth across penetrations (i.e., for depth registration). Registration began by first identifying the electrode with the shortest LFP response latency (Eτ, white square), then finding the LFP waveform correlation coefficients (ρ) between Eτ and all other electrodes in the same penetration. The border between the first neighboring electrode pair with positive and negative correlation coefficients defined the L2/3 vs. L4–6 border21,45,46,47,48. Laminar profiles were averaged across penetrations after first aligning to the border. The right panel shows the average depth-registered LFP laminar profile in response to 100 ms tones.

During task performance (average target detection d’ = 1.3, std = 0.74), we found that neural responses to targets were enhanced relative to reference responses, and target enhancement was greater in superficial L2/3 than in middle-deep L4–6. Figures 2a and 3a show laminar profiles for multi-unit spiking and high-gamma LFPs, respectively. Each figure shows the average response to target and reference sounds, during passive and behavior conditions. Both target and reference sounds evoked responses across layers 2/3–6, during both passive and behavior conditions. To quantify task-related plasticity (P) from neural responses (PResp), we first found the fractional change (i.e., ratio) of stimulus response amplitudes during behavior vs. passive trials, separately for target (TarBhv/Pass) and reference (RefBhv/Pass) sounds. This defined the task-related stimulus responses for targets and references. Then we took the difference between target and reference ratios (PResp = TarBhv/Pass − RefBhv/Pass; Figs 2b,c and 3b,c). Positive values of PResp (red in Figs 2b and 3b) indicate relative target response enhancement during auditory task performance. We found that target enhancement was the predominant effect for both multi-unit spiking (Fig. 2b,c) and high-gamma LFPs (Fig. 3b,c). Furthermore, target enhancement was greater in L2/3 electrodes for most recordings (see cumulative distribution functions in Figs 2c and 3c). The average target enhancement values for L2/3 vs. L4–6 were 0.48 vs. 0.09 for multi-unit spiking (p < 0.001) and 0.52 vs. 0.19 for high-gamma LFPs (p < 0.001). The 9% average target enhancement we measured in multi-unit spiking from L4–6 agrees with previous measurements5 of multi-unit responses in A1.

Figure 2

Laminar profiles of task-related plasticity in multi-unit (MU) spiking from A1. Panel (a) shows the average laminar profile of MU responses to reference noises (Ref.; top row) and target tones (Tar.; bottom row), and in passive (left column) and behavior (right column) conditions. Depth is marked relative to the L2/3 border (see Methods and Fig. 1c). The vertical black line shows when the tone was presented. The color bar indicates that the MU response data are shown as the change in the spike rate (ΔSPK) relative to the silent baseline before the tone in each trial. The profiles in each panel were normalized to the peak value across profiles. Panel (b) shows the laminar profile of task-related plasticity for MU spiking, on the same depth axis as panel (a). To quantify task-related plasticity for MUs and high-gamma LFPs, we first found the fractional change (i.e., ratio) of stimulus response amplitudes between behavior vs. passive experiments. This defined the task-related stimulus responses for targets and references. We then took the difference in task-related stimulus responses for targets minus references to define task-related plasticity. The color bar indicates that task-related plasticity is shown as either target enhancement (Enh.; red) or suppression (Sup.; blue). Panel (c) shows the cumulative distribution functions (CDFs) and grand-averages of task-related plasticity for all recordings in L2/3 (red) and L4–6 (blue). Error bars and shading show 1 standard error of the mean (sem). Stars and the solid comparison bar indicate averages that were significantly different than 0 (p < 0.001, bootstrap test). Population sizes (n) indicate the number of electrodes per average after applying noise rejection (see Methods).

Figure 3

Laminar profiles of task-related plasticity in high-gamma LFPs from A1. Panel (a) shows the average laminar profile of high-gamma responses to reference noises (Ref.; top row) and target tones (Tar.; bottom row), and in passive (left column) and behavior (right column) conditions. Depth is marked relative to the L2/3 border (see Methods and Fig. 1c). The vertical black line shows when the tone was presented. The color bar indicates that the high-gamma response data are shown as the change in the response magnitude (ΔMAG) relative to the silent baseline before the tone in each trial. The profiles in each panel were normalized to the peak value across profiles. Panel (b) shows the laminar profile of task-related plasticity for high-gamma LFPs, on the same depth axis as panel (a). The color bar indicates that task-related plasticity is shown as either target enhancement (Enh.; red) or suppression (Sup.; blue). Panel (c) shows the CDFs and grand-averages of task-related plasticity for all recordings in L2/3 (red) and L4–6 (blue). Error bars and shading show 1 sem. Stars and the solid comparison bar indicate averages that were significantly different than 0 (p < 0.001, bootstrap test). Population sizes (n) indicate the number of electrodes per average after applying noise rejection (see Methods).

Enhanced target selectivity in STRFs was greatest in L2/3

Task-related plasticity has previously been described in A1 using STRFs computed from single- and multi-unit spiking4,6,7. Here, we extend that analysis by computing STRFs from high-gamma LFPs (bottom row, Fig. 4), in addition to multi-unit spiking (top row, Fig. 4). STRFs computed from reference noises estimate the magnitude of neural responses to target tones, relative to other pure-tone frequencies. We analyzed the 2-dimensional STRFs in the same manner as 1-dimensional response traces to compute laminar profiles of task-related plasticity (i.e., PSTRF; Fig. 4b,e), with the additional step of first aligning each STRF to the target frequency bin before averaging. We found that enhanced target selectivity (i.e., peaks in PSTRF; red in Fig. 4b,e) was the predominant effect in STRFs. Enhancement was greater in L2/3 than in L4–6 (Fig. 4c,f). The average STRF target enhancement for electrodes in L2/3 vs. L4–6 was 0.6 vs. 0.27 for multi-unit spiking (p < 0.001) and 0.46 vs. 0.28 for high-gamma LFPs (p < 0.01). For both multi-unit spiking and high-gamma LFPs, the STRFs indicated that target enhancement resulted from a reduction of inhibitory fields (blue in Fig. 4a,d) and increased excitatory fields (red in Fig. 4a,d). This can be seen by comparing the left and right columns in panels a and d in Fig. 4. The STRF prediction of 27% target enhancement in multi-units from middle-deep L4–6 (Fig. 4c) agrees with previous measurements of task-related plasticity in A1 multi-unit STRFs4,7. Thus, we found that multi-unit spiking and high-gamma LFPs are similarly predictive of the effects of selective attention on A1 responses to behaviorally meaningful sounds.

Figure 4

Laminar profiles of spectrotemporal receptive fields (STRFs) in A1. The top and bottom rows show the data for multi-units and high-gamma LFPs, respectively. Panels (a,d) show the average depth-registered and target-aligned (ΔTarget) STRF laminar profiles for L2/3 (top row in each panel) and L4–6 (bottom row in each panel), and in passive (left column in each panel) and behavior (right column in each panel) conditions. Red and blue indicate excitatory and inhibitory fields, respectively. Panels (b,e) show the laminar profile of task-related plasticity. Red and blue indicate target enhancement and suppression, respectively, during task performance. Panels (c,f) show the CDFs (top of each panel) and grand-average (bottom) of task-related plasticity for all electrodes in L2/3 (red) and L4–6 (blue). Error bars and shading show 1 sem. Stars indicate averages that were significantly different than 0 (p < 0.001, bootstrap test). Solid and dashed comparison bars indicate significant differences between layers (p < 0.001 and p < 0.01, respectively, bootstrap test). Population sizes (n) indicate the number of electrodes per average after applying noise rejection (see Methods).

The persistence of target enhancement was common across cortical layers

Task-related plasticity can persist for minutes to hours after task performance ends4,7. We measured the persistence of task-related plasticity in the minutes following task performance, when the animal returned to a passive, quiescent state (i.e., during a “post-passive” condition). In Figs 2–4, task-related plasticity was found, for both neural responses and STRFs, by computing TarBhv/Pass − RefBhv/Pass. We quantified the persistence of task-related plasticity similarly by comparing the post-passive state vs. the “pre-passive” state that occurred before task performance, i.e., we computed Ppersistence = TarPre/Post − RefPre/Post. We found a similar pattern of persistence in both the neural responses (Fig. 5a) and STRFs (Fig. 5b) computed from both multi-unit spiking and high-gamma LFPs: target enhancement was greatest during task performance and tended to decrease toward the pre-passive state after task performance.

Figure 5

The persistence of task-related plasticity. Panels (a,b) show the persistence of task-related plasticity (i.e. target enhancement) for both neural responses and STRFs, respectively. The left and right columns of each panel show the results for multi-unit spiking and high-gamma LFPs, respectively. We found that target enhancement often persisted in the minutes after task performance but was usually less than during the task. Stars indicate averages that were significantly different than 0 (p < 0.001, bootstrap test). Error bars show 1 sem. Solid, dashed, and dotted lines indicate significant decay of task-related plasticity (p < 0.001, p < 0.01, p < 0.05, respectively).

Discussion

We recorded laminar profiles of neural activity in A1 during the performance of a pure-tone detection task and found that task-related plasticity was greater in L2/3 than in L4–6. The predominant effect of task-related plasticity was to enhance both neural responses to auditory targets and STRF selectivity for auditory target frequencies. Since target enhancement quickly decayed in the minutes following task performance, and the enhancement was similar for both multi-unit spiking and high-gamma LFPs, our results support rapid task-related plasticity as a neural correlate of attention22,23.

The dominance of target enhancement in L2/3 suggests that intracortical modulation of stimulus selectivity in A1 is an important neural correlate of selective attention. Top-down projections from prefrontal cortex are known to target neurons in supragranular layers in auditory cortex24,25,26. Neurons in prefrontal cortex show greater selectivity than A1 for behaviorally meaningful sounds15, and stimulation of orbitofrontal cortex causes changes in A1 pure-tone frequency tuning16,27 that resemble the task-related plasticity observed here and in previous studies4,5,6,7. Simultaneous recordings from frontal cortex and auditory cortex reveal behavior-dependent changes in functional connectivity13,15.

Figures 2b and 3b show that the greatest target enhancement measured from L2/3 tended to peak more than ~75 ms after the tone onset. This long latency for maximal target enhancement in L2/3 also indicates the importance of intracortical connections in task-related plasticity. Future studies measuring task-related plasticity simultaneously in laminar profiles of A1 and higher-order cortex will help to clarify the intracortical network dynamics of target enhancement during auditory tasks. Furthermore, the use of larger animal population sizes will clarify how task-related plasticity varies across individuals.

Recent evidence suggests that the attentional circuit is not strictly cortical, but may also involve contributions from the reticular nucleus of the thalamus (TRN) and the sensory thalamus28,29. It has recently been shown that prefrontal projections to the visual thalamus, but not primary visual cortex, exert a causal effect on behavior during an auditory-visual divided attention task29. Thus, cortico-thalamo-cortical loops may play an important role in modulating responses in all sensory cortices, including A130. In a recent paper, Guo, et al.28 describe an A1 layer 6 corticothalamic (CT) circuit in the mouse that biases sound processing in auditory cortex towards either improved auditory detection or discrimination. Activation of layer 6 neurons has been shown to increase inhibition across all cortical layers31, yet we found that the gain of target responses increased in L2/3 during behavior. Thus, if the layer 6 CT circuit establishes the task-dependent state of operation in A1 (i.e., detection vs. discrimination states), then it is likely that additional intracortical and cortico-thalamo-cortical attentional mechanisms subsequently disinhibit auditory responses in L2/3 during task performance32, enabling target enhancement. There are several mechanisms that may contribute to the enhancement of responses in supragranular layers, including functional connectivity with higher-order cortical areas14,15,16,33, intrinsic intracortical plasticity mechanisms34, and the activation of layer 5B cortico-thalamo-cortical loops35.

Our study supports the growing body of evidence indicating the importance of circuitry in L2/3 for plasticity in sensory cortex8,18,19,32,34,36,37. For example, neurons in L2/3 of both auditory and barrel cortex show enhanced modulation of dendritic spine growth, both during and after auditory- and whisker-based learning, respectively36,38. To understand how changes in the amplitude of local synaptic input relate to task-related plasticity in spiking and high-gamma LFPs, in future studies we will characterize laminar profiles of low frequency LFPs using current source density (CSD) analysis18,19,20,39. Our results suggest that intracortical modulation of auditory processing is important not only for establishing long-lasting experience-related plasticity19,36,38,40 but also for enabling rapid task-related plasticity as a neural correlate for selective attention.

Methods

Neural activity was recorded in primary auditory cortex (A1) of 2 adult, female ferrets during performance of an auditory task in 24 total experiments (12 experiments per animal). All experimental procedures were approved by the University of Maryland (UMD) Animal Care and Use Committee, and performed in accordance with UMD and National Institutes of Health guidelines and regulations.

Animals were trained to detect a pure-tone target after a series of reference noises composed of temporally orthogonal ripple combinations (TORCs)41. Animals were initially trained in sound-attenuated testing booths where they could move freely. Once they reached behavioral criterion on the task (discrimination ratio >0.6), they were implanted with a head-post and trained to perform the task while their heads were held fixed to facilitate stability in neurophysiological recording. Behavior and stimulus presentation were controlled by custom software written in Matlab (MathWorks).

Acoustic stimuli

Target tones were pure sine waves (5-ms onset and offset ramps), with frequency held fixed during a block of trials, but varied randomly between experiments. Reference noises consisted of a set of TORCs with a spectral resolution of 0–1.2 cycles/octave and temporal envelope resolution of 4–48 Hz41. Targets and references always had the same duration (2 s, 0.8 s inter-stimulus interval) and sound level (65 to 80 dB SPL) during neurophysiological recordings. All sounds were synthesized using a 44 kHz sampling rate, and presented through a free-field speaker that was equalized to achieve a flat gain.

Pure-tone detection task

Two animals were trained to perform a conditioned avoidance Go/No-go pure-tone detection task42 (Fig. 1a,b). Training was initiated by delivering water from a spout while presenting reference noises. The animals quickly learned to freely lick the spout during references. Target tones were then introduced and the animals learned to stop licking the spout in a 0.4 s time-window after the target to avoid a mild shock to the tongue (free-moving behavior) or to the tail (head-fixed behavior). On each trial, the number of references presented before the target varied randomly from one to six. Catch trials were also used, in which targets were absent. Performance was assessed by the sensitivity index, d’, calculated from the probability of hits (reduced licking after target offset) vs. false alarms (reduced licking after reference offset)43.
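The sensitivity index can be illustrated with a minimal sketch, assuming the standard signal-detection definition d’ = z(hit rate) − z(false-alarm rate). The function name d_prime and the clipping constant eps are illustrative assumptions and are not taken from the analysis code used in this study.

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate, eps=1e-3):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate)."""
    # Clip rates away from 0 and 1 so the inverse normal CDF stays finite.
    # The clipping value eps is an assumption, not a parameter from the paper.
    hit = min(max(hit_rate, eps), 1.0 - eps)
    fa = min(max(fa_rate, eps), 1.0 - eps)
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(hit) - z(fa)

# Hypothetical session: 80% hits, 35% false alarms
print(round(d_prime(0.80, 0.35), 2))
```

Equal hit and false-alarm rates give d’ = 0, i.e., no ability to distinguish targets from references.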

Neurophysiology

Each animal was implanted with a steel head-post to allow for stable recording, and a small craniotomy (1–2 mm diameter) was opened over A1. Recordings were verified as being in A1 according to their tonotopic organization, auditory response latency, and simple frequency tuning. Data acquisition was controlled using the MATLAB software MANTA44. Neural activity was recorded using a 24 channel Plexon U-Probe (electrode impedance: ~1 MOhm, 75 μm inter-electrode spacing). The probe was inserted through the dura, orthogonal to the brain’s surface, until most channels displayed spontaneous spiking.

Extracting neural responses

Multi-unit spikes were extracted on each electrode by band-pass filtering the raw signal between 300 and 6,000 Hz, then isolating spikes by peak detection (4σ threshold). Peri-stimulus time histograms (PSTHs) of spiking were computed using 10 ms bins. We analyzed multi-units instead of single-units because previous reports have indicated that task-related plasticity is more robust for multi-units4, which emphasizes that the behavioral relevance of task-related plasticity is predominant in neural populations, rather than single-units.
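The threshold-and-bin steps above can be sketched as follows. This is a minimal illustration that assumes the signal has already been band-pass filtered, that spikes are detected as positive-going local peaks, and that the threshold is the mean plus 4 standard deviations of the filtered trace; the function names and the peak polarity are assumptions rather than details reported here.

```python
import math
from statistics import mean, pstdev

def detect_spikes(filtered, fs, n_sigma=4.0):
    """Return spike times (s) at local peaks above mean + n_sigma * std.

    `filtered` is assumed to be an already band-pass filtered trace
    sampled at `fs` Hz. Positive-going peaks are an assumption.
    """
    threshold = mean(filtered) + n_sigma * pstdev(filtered)
    times = []
    for i in range(1, len(filtered) - 1):
        x = filtered[i]
        # A spike is a supra-threshold sample that is a local maximum.
        if x > threshold and x >= filtered[i - 1] and x > filtered[i + 1]:
            times.append(i / fs)
    return times

def psth(spike_times, duration, bin_width=0.010):
    """Bin spike times into `bin_width`-second bins; return spikes/s per bin."""
    n_bins = int(round(duration / bin_width))
    counts = [0] * n_bins
    for t in spike_times:
        b = math.floor(t / bin_width)
        if 0 <= b < n_bins:
            counts[b] += 1
    return [c / bin_width for c in counts]
```

For trial-averaged PSTHs, the per-trial rates returned by psth would be averaged bin by bin across trials.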

On the same electrodes used to extract multi-units, we also extracted high-gamma local field potentials (LFPs) by band-pass filtering the raw signal between 80 and 300 Hz, then taking the magnitude of the filtered signal’s Hilbert transform, and finally low-pass filtering below 70 Hz. LFPs on a given electrode were only kept if the signal-to-noise ratio (SNR) was greater than 1. This criterion eliminated 2 of 24 experiments from the dataset. Only the data from trials with correct behavioral responses were kept for analysis.

Computing spectrotemporal receptive fields (STRFs)

STRFs were estimated by reverse correlation between each time-varying neural response (i.e., multi-unit spiking and high-gamma LFPs) and the TORCs presented during experiments20. Positive STRF values indicate time-frequency components of the TORC that correlated with increased neural responses (i.e., an excitatory field), and negative values indicate components that correlated with decreased responses (i.e. an inhibitory field). An STRF was only included in further analyses if its SNR was above the 25th percentile of the SNR distribution. Before averaging STRFs across electrodes, we aligned each STRF so that the frequency bin containing the target was in the center of the frequency axis.
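A minimal sketch of reverse correlation and target alignment is given below. It assumes a plain mean-removed cross-correlation between the response and the stimulus envelope in each frequency bin, which may differ from the TORC-specific estimator used for the reported STRFs. The function names, the data layout (stim[f][t], resp[t]), and the zero-padding of rows shifted in from outside the frequency axis are all illustrative assumptions.

```python
def reverse_correlation_strf(stim, resp, n_lags):
    """Estimate strf[f][lag] by mean-removed cross-correlation.

    stim[f][t]: stimulus envelope in frequency bin f at time bin t.
    resp[t]:    neural response (spike rate or high-gamma magnitude).
    """
    n_t = len(resp)
    r_mean = sum(resp) / n_t
    strf = []
    for chan in stim:
        s_mean = sum(chan) / n_t
        row = []
        for lag in range(n_lags):
            acc = 0.0
            # Correlate the response at time t with the stimulus lag bins earlier.
            for t in range(lag, n_t):
                acc += (resp[t] - r_mean) * (chan[t - lag] - s_mean)
            row.append(acc / (n_t - lag))
        strf.append(row)
    return strf

def align_to_target(strf, target_bin):
    """Shift frequency rows so `target_bin` lands on the center row.

    Rows shifted in from outside the axis are zero-filled (an assumption).
    """
    n_f, n_lags = len(strf), len(strf[0])
    shift = n_f // 2 - target_bin
    out = [[0.0] * n_lags for _ in range(n_f)]
    for f in range(n_f):
        g = f + shift
        if 0 <= g < n_f:
            out[g] = list(strf[f])
    return out
```

Positive entries of the returned matrix correspond to excitatory fields and negative entries to inhibitory fields, matching the sign convention described above.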

Depth-Registration

Each penetration of the linear electrode array produced a laminar profile of auditory responses in A1 across a 1.8 mm depth; however, the absolute depth varied across penetrations. In order to align all penetrations to the same depth, LFP responses to 100 ms tones were measured during the passive condition to find a common marker of depth (Fig. 1c). The marker was found for each penetration by first identifying the electrode with the shortest LFP response latency (Eτ), indicating an electrode depth at thalamorecipient layer 4. We then found the correlation coefficient between the average LFP waveform from Eτ and the LFP waveforms on all other electrodes in the same penetration. The border between the first neighboring electrode pair with positive and negative correlation coefficients defined the superficial vs. middle-deep layer border, corresponding to layer 2/3 (L2/3) and L4–6, respectively21,45,46,47,48. Laminar profiles were averaged across penetrations by aligning to the calculated border. Because of the neural response SNR criterion, data from the top two electrodes were also eliminated from all experiments, which removed data that may have included layer 121. Thus, we were able to measure 1.6 mm laminar profiles that included layers 2/3–621,45,46,47,48. We did not include 4 of the remaining 22 penetrations because the LFP correlations became negative in deep electrodes, suggesting that the penetration was not orthogonal to the surface or to the cortical layers.
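The depth-registration procedure can be sketched as follows. This is a minimal illustration that assumes electrodes are indexed from most superficial (index 0) to deepest, that response latencies have already been measured, and that the search for the sign change runs from Eτ toward the cortical surface. The function names and the search direction are assumptions, not details taken from the analysis code.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # flat waveform: correlation undefined, treated as 0
    return sxy / (sxx * syy) ** 0.5

def find_layer_border(lfp, latencies):
    """Locate the L2/3 vs. L4-6 border for one penetration.

    lfp[e]:       average LFP waveform on electrode e (0 = most superficial).
    latencies[e]: LFP response latency on electrode e.
    Returns (e_tau, upper), where the border lies between electrodes
    `upper` and `upper + 1`, or (e_tau, None) if no sign change is found.
    """
    # Electrode with the shortest latency marks the thalamorecipient depth.
    e_tau = min(range(len(latencies)), key=lambda e: latencies[e])
    rho = [pearson(lfp[e_tau], w) for w in lfp]
    # Walk from e_tau toward the surface until the correlation flips sign.
    for e in range(e_tau, 0, -1):
        if rho[e] > 0 and rho[e - 1] < 0:
            return e_tau, e - 1
    return e_tau, None
```

Once the border index is known for each penetration, laminar profiles can be shifted so that the border falls at a common depth before averaging.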

Quantifying the laminar profile of task-related plasticity

To quantify task-related plasticity (P) from neural responses (PResp), we first computed the ratio of response amplitudes during behavior vs. passive trials, separately for target (TarBhv/Pass) and reference (RefBhv/Pass) sounds. Then we took the difference between target and reference ratio (PResp = TarBhv/Pass − RefBhv/Pass). PResp was normalized between +/−1 for each experiment before averaging across experiments. Positive values of PResp indicate target response enhancement during auditory task performance. We analyzed the 2-dimensional STRFs in the same manner as 1-dimensional response traces to compute STRF laminar profiles of task-related plasticity (i.e., PSTRF), with the additional step of first aligning each STRF to the target frequency bin before averaging. Data from electrodes in each penetration were separated into either L2/3 or L4–6 STRFs, since these were the regions quantitatively defined by depth-registration.
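As a minimal sketch of this computation, the example below takes one response amplitude per electrode and condition. The function names are illustrative, and the normalization is assumed to be division by the largest absolute value within an experiment, which is one way to bound values between −1 and +1; the exact normalization used here may differ.

```python
def task_related_plasticity(tar_bhv, tar_pass, ref_bhv, ref_pass):
    """P = Tar(Bhv/Pass) - Ref(Bhv/Pass), computed per electrode."""
    return [
        tb / tp - rb / rp
        for tb, tp, rb, rp in zip(tar_bhv, tar_pass, ref_bhv, ref_pass)
    ]

def normalize_unit_range(values):
    """Scale values into [-1, 1] by the largest absolute value (assumed scheme)."""
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)
    return [v / peak for v in values]

# Hypothetical response amplitudes for two electrodes in one experiment
p = task_related_plasticity(
    tar_bhv=[3.0, 2.2], tar_pass=[2.0, 2.0],
    ref_bhv=[1.0, 2.0], ref_pass=[1.0, 2.0],
)
print(normalize_unit_range(p))
```

In this hypothetical case the target response grows by 50% on the first electrode while the reference response is unchanged, giving P = 0.5 before normalization; a negative P would indicate relative target suppression.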

Significant differences between PSTRF and PResp from L2/3 vs. L4–6 were determined by a non-parametric bootstrap hypothesis test. Given two data sets, P1 and P2, having sample sizes of n and m, respectively, we tested P1 and P2 against the null hypothesis that they were drawn from a common distribution. Accepting the null hypothesis indicated that there was no statistically significant difference between the means of P1 and P2. For each bootstrap iteration, we resampled from P1 and P2, with replacement, to form distributions used for statistical testing. The minimum of sample sizes n and m determined the number of resampled values from P1 and P2, for each of 100,000 resampling iterations. We estimated cumulative distribution functions (CDFs) for task-related plasticity by bootstrapping parametric fits of a Gaussian CDF to the data from each experiment.
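The resampling scheme can be sketched as follows. The text above specifies resampling with replacement, min(n, m) values per draw, and 100,000 iterations; how the p-value is derived from the resampled distributions is not stated, so this sketch assumes a two-sided p-value based on the fraction of resampled mean differences falling on either side of zero. The function name and the seed argument are illustrative assumptions.

```python
import random

def bootstrap_mean_difference_test(p1, p2, n_iter=100_000, seed=0):
    """Two-sided bootstrap p-value for a difference in means (assumed scheme)."""
    rng = random.Random(seed)
    k = min(len(p1), len(p2))  # number of resampled values per data set
    n_le = 0  # iterations with resampled difference <= 0
    n_ge = 0  # iterations with resampled difference >= 0
    for _ in range(n_iter):
        m1 = sum(rng.choices(p1, k=k)) / k  # resample P1 with replacement
        m2 = sum(rng.choices(p2, k=k)) / k  # resample P2 with replacement
        d = m1 - m2
        if d <= 0:
            n_le += 1
        if d >= 0:
            n_ge += 1
    return min(1.0, 2.0 * min(n_le, n_ge) / n_iter)
```

With this scheme, a returned value of 0 means that no resampled difference crossed zero, which would be reported as p < 1/n_iter rather than as an exact zero.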