Prediction-related neural activity during silent periods is sharply tuned

Prior experience shapes sensory perception by enabling the formation of expectations with regards to the occurrence of upcoming sensory events. Especially in the visual modality, an increasing number of studies show that prediction-related neural signals carry feature-specific information about the stimulus. This is less established in the auditory modality, in particular without bottom-up signals driving neural activity. We studied whether auditory predictions are sharply tuned to even carry tonotopic specific information. For this purpose, we conducted a Magnetoencephalography (MEG) experiment in which participants passively listened to sound sequences of varying regularity (i.e. entropy). Importantly, sound presentation was occasionally omitted. This allowed us to assess whether and how carrier frequency specific information in the MEG signal is modulated according to the entropy level, especially during the silent (omission) periods. Using multivariate decoding analysis, our main finding is that only during an ordered (most predictable) sensory context does neural activity during omission periods contain carrier frequency specific information that can be used to classify neural activity elicited by genuine sounds. This shows that tonotopically specific patterns can be activated by top-down processes and supports the notion that predictions in the human auditory system can be sharply tuned.


Introduction
Our capacity to constantly predict incoming sensory inputs based on past experiences is fundamental to adapting our behavior in complex environments. A core enabling process is the identification of statistical regularities in sensory input, which does not require any voluntary allocation of processing resources (e.g. selective attention) and occurs more or less automatically in healthy brains 1. Analogous to other sensory modalities 2,3, auditory cortical information processing takes place in hierarchically organized streams along putative ventral and dorsal pathways 4. These streams reciprocally connect different portions of auditory cortex with frontal and parietal regions 4,5. This hierarchical anatomical architecture yields auditory cortical processing regions sensitive to top-down modulations, thereby enabling modulatory effects of predictions. In this context, a relevant question is to what extent prediction-related top-down modulations (pre)activate the same or similar neural ensembles as established for genuine sensory stimulation.
Such fine-tuning of neural activity would be suggested by frameworks that propose the existence of internal generative models 6-9, inferring causal structure of sensory events in our environment and the sensory consequences of our actions. A relevant process of validating and optimizing these internal models is the prediction of incoming stimulus events, by influencing activity of corresponding neural ensembles in respective sensory areas. Deviations from these predictions putatively lead to (prediction) error signals, which are passed on in a bottom-up manner to adapt the internal model, thereby continuously improving predictions 7 (for an alternative predictive coding architecture see 10). According to this line of reasoning, predicted input should lead to weaker neural activation than input that was not predicted, which has been illustrated previously in the visual 11 and the auditory modality 12. Support for the idea that predictions engage neurons specifically tuned to (expected) stimulus features has been more challenging to address and has come mainly from the visual modality (for review see 13). In an fMRI study, Smith and Muckli 14 showed that early visual cortical regions (V1 and V2), which process occluded parts of a scene, carry sufficient information to decode different visual scenes above chance. Importantly, activity patterns in the occlusion condition generalized to a non-occlusion control condition, implying that context-related top-down feedback or input via lateral connections modulates visual cortex in a feature-specific manner. In a similar vein, it has been shown that mental replay of a visual stimulus sequence is accompanied by V1 activity that resembles activity patterns driven in a feedforward manner by the real sequence 1.
Beyond more or less automatically generated predictions, explicit attentional focus on specific visual stimulus categories also goes along with similar feature-specific modifications in early and higher visual cortices, even in the absence of visual stimulation 15. Overall, for the visual modality, these studies underline that top-down processes lead to sharper tuning of neural activity to contain more information about the predicted and/or attended stimulus (feature).
Studies as to whether predictions in the auditory domain (pre)activate specific sensory representations in a sharply tuned manner are scarce, especially in humans (for animal work see e.g. 16,17). Sharpened tuning curves of neurons in A1 during selective auditory attention have been established in animal experiments 18, even though this does not necessarily generalize to automatically formed predictions. A line of evidence could be drawn from research in marmoset monkeys, in which a reduction of auditory activity is seen during vocalization 19. This effect is abolished when fed-back vocal utterances are pitch shifted 20. A recent work suggests that even inner speech may be sufficient to produce reduced neural activity, but only when the presented sounds matched those internally verbalized 21. Using invasive recordings in a small set of human epilepsy patients, it was shown that masked speech is restored by specific activity patterns in bilateral auditory cortices 22, an effect reminiscent of a report in the visual modality 1 (for other studies investigating similar auditory continuity illusion phenomena see 23-25). Albeit feature-specific, this "filling-in" type of activity pattern observed during phoneme restoration cannot clarify conclusively whether it requires top-down input. In principle, these results could also be largely generated via bottom-up thalamocortical input driving feature-relevant neural ensembles via lateral or feedforward connections. To resolve this issue, putative sharp tuning via predictions needs to be shown in the absence of feedforward input (i.e. silence). Furthermore, the exact timing of the effects could provide important evidence on whether predictions come along with feature-specific pre-activations of relevant neural ensembles 13.
The goal of the present study was to investigate in healthy human participants whether predictions in the auditory modality are exerted in a carrier frequency (i.e. tonotopic) specific manner. For this purpose, we merged an omission paradigm with a regularity modulation paradigm (for overview see 26, and see Figure 1 for the specific details). So-called omission responses occur when an expected tone is replaced by silence. Frequently this response has been investigated in the context of Mismatch Negativity (MMN 27) paradigms, which undoubtedly have been the most common approach of studying the processing of statistical regularities in human auditory processing 28-30. This evoked response occurs upon a deviance from a "standard" stimulus sequence, that is, a sequence characterized by a rule endowing it with a certain degree of order. For omission responses (e.g. 31), this order is usually established in a temporal sense, that is, allowing precise predictions when a tone will occur 32 (for a study using a repetition suppression design see 33). The neural responses during these silent periods are of outstanding interest since they cannot be explained by any feedforward propagation of activity elicited by a physical stimulus. Thus, omission of an acoustic stimulation will lead to a neural response, as long as this omission violates a regular sequence of acoustic stimuli, that is, it occurs unexpectedly. Previous works have identified auditory cortical contributions to the omission response (e.g. 33). Interestingly, and underlining the importance of a top-down input driving the omission response, a recent DCM study by Chennu et al. 34 illustrates that it can be best explained when assuming top-down driving inputs into higher order cortical areas (e.g. frontal cortex). While establishing temporal predictions via a constant stimulation rate, we varied the regularity of the sound sequence by parametrically modulating its entropy level (see e.g. 35,36).
Using different carrier frequencies, sound sequences varied between random (high entropy; transition probabilities from one sound to all others at chance level) and ordered (low entropy; transition probability from one sound to another one above chance). Our reasoning was that omission-related neural responses should contain carrier frequency specific information that is modulated by the entropy level of the contextual sound sequence. Using a time generalization decoding approach 37, we find evidence that particularly during the low entropy (highly ordered) sequence, neural activity in the omission period contains carrier frequency specific information similar to activity observed during real sound presentation. This work shows that prediction-related neural activity in the auditory system is sharply tuned even down to the tonotopic level.
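The relation between transition probabilities and entropy level described above can be made concrete with a minimal sketch. It is an illustration under stated assumptions, not the stimulus code of this study: the function names, the choice of a single "preferred" successor per tone, and the omission probability are hypothetical, and only the number of tones (four) is taken from the decoding analyses reported below.

```python
import math
import random

N_TONES = 4  # four carrier frequencies (the decoding analyses use 4 classes)


def transition_matrix(p_preferred):
    """Row i holds P(next tone | current tone i).

    Assumption for illustration: the preferred successor of tone i is
    (i + 1) % N_TONES and receives probability p_preferred; the remaining
    probability is split evenly over the other tones.
    """
    rest = (1.0 - p_preferred) / (N_TONES - 1)
    return [[p_preferred if j == (i + 1) % N_TONES else rest
             for j in range(N_TONES)]
            for i in range(N_TONES)]


def conditional_entropy(matrix):
    """Mean entropy (bits) of the next-tone distribution.

    Rows are weighted equally, which is valid here because every column
    of the matrix also sums to one, so all tones occur equally often.
    """
    total = 0.0
    for row in matrix:
        total -= sum(p * math.log2(p) for p in row if p > 0)
    return total / len(matrix)


def generate_sequence(matrix, n_events, p_omission, rng):
    """Return (tone_index, is_omitted) pairs; omitted events keep the
    label of the tone the sequence would have delivered."""
    tone = rng.randrange(N_TONES)
    events = []
    for _ in range(n_events):
        events.append((tone, rng.random() < p_omission))
        tone = rng.choices(range(N_TONES), weights=matrix[tone])[0]
    return events


# Hypothetical usage: a random and a more ordered context.
random_context = transition_matrix(0.25)   # all transitions at chance
ordered_context = transition_matrix(0.75)  # one transition above chance
sequence = generate_sequence(ordered_context, 500, 0.1, random.Random(1))
```

With all transitions at chance, the conditional entropy is 2 bits, the maximum for four tones; raising the probability of one transition lowers it, which is the sense in which the ordered context is more predictable.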

Results
Sound and omission evoked responses show differential relationship with entropy
We found clear sound- and omission-related cortical evoked responses at the grand average level. Disregarding the specific entropy level as well as carrier frequency, during a time period of 50-200 ms post event onset, striking overlaps between sound and omission evoked generators were observed in the right primary auditory cortex (A1; see Figure 2A, left panel). Despite this spatial overlap, the temporal dynamics in right A1 showed a faster peak when an actual sound was presented as compared to when an omission occurred (see Figure 2A, right panel). Outside of right A1, pronounced evoked responses were also observed for sounds and omissions in an idiosyncratic manner: while left A1 was also strongly activated by sounds, omissions went along with strong evoked activity in the primary visual cortex (V1; see Figure 2A, left panel). The latter may be associated with the fact that unexpected omissions may involuntarily lead to an orientation response, which involves visual exploration. Since this issue was not relevant to the research focus, it was not followed up further.
To assess the relationship between evoked activity and the entropy level, we performed a regression analysis in time-windows 100-200 ms and 200-300 ms post-event onset. For the early (100-200 ms) time-window, no effect at the cluster-corrected level was observed (all p's > .1). In the later (200-300 ms) time-window, lower entropy (more regularity) was reflected in weaker evoked responses (negative cluster: p = .007). This effect showed maxima at 263 and 297 ms after sound onset and was localized to precuneus and right striatum for both time points (Figure 2B, left panel). Right A1 was also implicated, but only for the later period (Figure 2B, left and right panel). No effect at a cluster-corrected level was obtained for omissions (p's > .19). Given our hypothesis that, contrary to sound evoked responses, omission evoked responses should increase with decreasing entropy, we applied a regression test to a normalized contrast restricted to left and right A1 (Figure 2A). This measure normalized each condition with respect to the highest entropy (random) condition, and the difference between sound and omission within each entropy level was entered into the statistic. For the left A1, a significant cluster-corrected effect was obtained at later time points (250-300 ms; right A1: n.s.; see Figure 2C, left panel), which was driven by a differential relationship with entropy level for sounds and omissions.
Overall, the analysis of evoked responses establishes common generators for omissions and genuine sounds in right A1. This region and a set of non-auditory regions (precuneus and right striatum) show decreasing sound evoked responses with increasing regularity of the sound sequence. In left A1, omission evoked activity increased with increasing regularity of the sound sequence.
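The normalized contrast used for the left and right A1 analysis can be sketched as follows. This is a simplified illustration rather than the analysis code: it assumes that normalization is a subtraction of the random (RND) condition, that the entropy levels are ordered RND, MM, MP, OR from least to most regular, and that one amplitude value per condition is available; the cluster-corrected statistics are not reproduced, and all function names are hypothetical.

```python
# Entropy levels, assumed to be ordered from random to ordered.
LEVELS = ["RND", "MM", "MP", "OR"]


def normalized_contrast(sound_amp, omission_amp):
    """Sound-minus-omission difference per entropy level, after each
    event type has been referenced to its own RND condition
    (normalization by subtraction is an assumption)."""
    contrast = []
    for level in LEVELS:
        sound_rel = sound_amp[level] - sound_amp["RND"]
        omission_rel = omission_amp[level] - omission_amp["RND"]
        contrast.append(sound_rel - omission_rel)
    return contrast


def slope(values):
    """Least-squares slope of values against equally spaced levels 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((i - x_mean) * (v - y_mean) for i, v in enumerate(values))
    denominator = sum((i - x_mean) ** 2 for i in range(n))
    return numerator / denominator


# Hypothetical amplitudes (arbitrary units) for one participant and time point:
# sound responses shrink and omission responses grow with regularity.
sound = {"RND": 10, "MM": 9, "MP": 8, "OR": 7}
omission = {"RND": 2, "MM": 3, "MP": 4, "OR": 5}
contrast = normalized_contrast(sound, omission)
trend = slope(contrast)
```

A slope that differs reliably from zero across participants would indicate that sounds and omissions relate differently to the entropy level, which is the pattern reported for left A1.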

Figure 2 Sound and omission evoked response modulation according to entropy level. (A) Source localization of evoked brain activity in response to sound or omission between 50 and 200 ms (left panel) and time course of source activity in right Heschl's gyrus (right panel). Brain source clusters are masked at 85% maximum activity. All trials were averaged separately for sounds and omissions, disregarding entropy level and tone frequency. For visualization purposes, a baseline from -50 ms to sound/omission onset was subtracted from the averaged time course. (B) Linear regression analysis over the different contextual regularity conditions (4 entropy levels) for the time period between 200 and 300 ms. (C) Linear regression analysis on the normalized differential contrast between sound and omission trials, restricted to left and right Heschl's gyri (see methods section). The time course of the effect statistics is shown in the left panel. Left Heschl's gyrus normalized difference with respect to the highest entropy level (RND) for each entropy level (MM, MP and OR) and stimulation type (sound and omission).

Single-trial neural activity during sound contains information about entropy level
The results in the previous step were obtained after trial averaging, which leaves open the question of what information is contained in the signal at the single-trial level. To validate our decoding approach, we first followed up on the strong evoked effect showing differential sound evoked response amplitudes for the different entropy levels. Using all magnetometers (i.e. discarding the spatial pattern) and a time-generalization decoding approach showed that the entropy level of the condition in which the sound was embedded could be decoded above chance from virtually any time point and generalized to any other time point. Only the on-diagonal result is depicted in Figure 3 (left panel; the off-diagonal patterns will be part of a separate manuscript), showing globally above chance level decoding accuracy. This finding of temporally generalizable non-specific neural patterns fits well with the fact that conditions were presented in blocks. A transient increase following ~100-200 ms post-stimulus onset can be observed, which is somewhat earlier than the evoked response effect described above. This underlines that the outcome of this decoding analysis is not merely an alternative depiction of the evoked response analysis. In order to identify potential neural generators that drive the described effect, the time-generalization analysis was followed up by a searchlight analysis in source space.
Since sensor level analysis suggested a temporally stable (in the sense of almost always significant) neural pattern, the entire 0-300 ms time period was used for this purpose. The analysis revealed above chance decoding accuracy spread throughout almost the entire brain. In order to identify potential "hotspots", a 10% of maximum decoding accuracy threshold was introduced, showing that the largest effect was obtained in the right hemisphere, encompassing large portions of the temporal cortex. Based on this analysis, we can state that information about the regularity of the sound sequence is also contained at the single-trial level in a temporally stable manner, and that the right temporal cortex may play a pronounced role in representing the regularity of sound sequences.
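The logic of time-generalization decoding, as used throughout the following sections, can be sketched in a few lines. The example is a schematic stand-in and not the pipeline of this study: it uses a nearest-centroid classifier (an assumption, since the classifier is not specified in this section), omits cross-validation, and all function names are hypothetical. A classifier is trained on the sensor pattern at one time point and tested at every time point, giving a training-time by testing-time accuracy matrix.

```python
def fit_centroids(patterns, labels):
    """Average the sensor patterns of each class."""
    sums, counts = {}, {}
    for pattern, label in zip(patterns, labels):
        if label not in sums:
            sums[label] = [0.0] * len(pattern)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], pattern)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def predict(centroids, pattern):
    """Return the label of the closest class centroid (squared Euclidean distance)."""
    def squared_distance(centroid):
        return sum((c - p) ** 2 for c, p in zip(centroid, pattern))
    return min(centroids, key=lambda label: squared_distance(centroids[label]))


def time_generalization(train_X, train_y, test_X, test_y):
    """Return accuracy[t_train][t_test].

    X is indexed as X[trial][time][sensor]. Training and test trials are
    passed separately, so the same function covers within-condition decoding
    (disjoint folds of sound trials) and cross-condition decoding
    (e.g. train on omission trials, test on sound trials).
    """
    n_train_times = len(train_X[0])
    n_test_times = len(test_X[0])
    accuracy = []
    for t_train in range(n_train_times):
        centroids = fit_centroids([trial[t_train] for trial in train_X], train_y)
        row = []
        for t_test in range(n_test_times):
            hits = sum(predict(centroids, trial[t_test]) == label
                       for trial, label in zip(test_X, test_y))
            row.append(hits / len(test_y))
        accuracy.append(row)
    return accuracy
```

The diagonal of the returned matrix (equal training and testing time) corresponds to the on-diagonal results; off-diagonal entries show how far a pattern learned at one latency generalizes to other latencies.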

Figure 3
Decoding of entropy level. Outcome of decoding accuracy of the on-diagonal shows temporally stable above chance scores, with a transient increase following ~100-200 ms post-stimulus onset (left panel). Searchlight analysis on source level using a wide temporal window (0-300 ms) shows a widely distributed pattern with a maximum in right temporal regions (image thresholded at 10% of maximum).

Single-trial neural activity during sound contains information about tone frequency
Prior to addressing the more challenging question of whether silent periods (i.e. omissions) contain carrier frequency specific information (omission-to-sound decoding), we first tested whether this was in general possible using neural activity to actual sound presentation (sound-to-sound decoding). In an analogous approach to the one previously described, we derived time-generalized carrier frequency decoding performance separately for the different entropy levels. Sound frequency could be decoded high above chance from MEG activity, regardless of the regularity of the sound sequence (Figure 4A). For all entropy levels, a temporally relatively stable pattern emerges between 100 and 300 ms, with the highest accuracy clustering along the diagonal. The off-diagonal pattern becomes descriptively more pronounced with increasing regularity of the sound sequence. This impression is confirmed by statistically testing a linear trend (Figure 4B; left), showing a significant neural pattern at ~200 ms training time generalizing for ~100 ms. This effect is observed in spite of sound evoked responses showing overall decreased amplitudes with increasing regularity of the sequence (Figure 2B). We investigated potential generators for this significant neural pattern with a source-level searchlight decoding analysis performed within this significant Prior to this effect, a further pattern of increasing accuracy emerges as a function of entropy levels between ~100 to ~120 ms, yielding a significant linear trend (Figure 4B; left). This pattern consists of a strong on-diagonal part, monotonically decreasing in strength as a function of temporal distance. In principle, it could contain some prediction-related pre-activations. However, given the setup of the experiment, the effect is likely generated in large part by carry-over neural activity of the previous tone, which is exploited by the classifier in decoding the present tone frequency.
However, another pattern within this time frame is an increasing off-diagonal decoding accuracy emerging ~80-100 ms post-stimulus and extending from pre-sound periods to almost 200 ms (see Figure 4B; left). The timing almost coincides with the overall onset of the on-diagonal increase in decoding accuracy, but the 45° orientation of this effect with respect to the diagonal makes it unlikely that it is artifactual. The pattern for the pre-sound period indicates that while carry-over is certainly a part, some parts are similar to neural patterns driven by the new sound. To test this impression more formally, we averaged decoding accuracy within selected slices from the time generalization matrix that reflected how strongly a pre-sound neural pattern generalizes over time (see Figure 4C, left). Assuming that the influence of the preceding sound (and thereby the carry-over) should be strongest at the earliest intervals and subsequently become weaker, temporal decoding accuracy profiles between 70 and 150 ms were modeled by linear regression (see Figure 4C, middle). Deviations from the carry-over model were assessed by calculating the residuals from the linear fit for every time point in this window. Within the aforementioned time-window of 80-100 ms, residuals appeared to increase as a function of the entropy level. Statistically significant deviations were only obtained for non-random sound sequences, being particularly pronounced in the ordered condition at 80 to 90 ms. Altogether, the results from this analysis are difficult to reconcile with carry-over neural activity from the previous tone (i.e. neural patterns elicited by a tone ~80-100 ms being the same as the pattern to a previous tone of a different frequency), but instead are more parsimoniously explained by test-tone carrier-frequency specific effects that are already present prior to actual stimulus onset.
This would speak in favor not only of the predictability of the sound sequence affecting sound processing in a tonotopically specific manner, but also of this manipulation instantiating tonotopically specific pre-activations.

Frequency of expected but omitted tones can be decoded only during regular sound sequences
Our analyses so far show that regularity of the sound sequence affects neural responses to sounds and even influences the performance of a classifier to decode tone frequencies.
However, the putative effects of predictions up to this point (except for the omission evoked response; see Figure 1C) have been obtained in the presence of a sound. This is not sufficient to address our main question of whether prediction-mediated neural processes are of sufficient granularity to contain information also about what sound was predicted in the absence of any acoustic information. Pursuing an analogous time-generalized decoding approach as previously described, we tested whether neural patterns around omissions (training set) can be found during genuine tones (test set). Indeed, activity ~150-250 ms following onset of the omission could classify sound frequency in the test data set significantly above chance level. However, this was only the case for the ordered condition (Figure 5A).
Testing a linear trend across the time-generalization decoding results confirms this general pattern (Figure 5B, left). Using a searchlight analysis on source level data (thresholded at p corrected < .01), we followed up on probable generators of the sensor level effect. For this purpose we focused on the late significant effect, that is, between ~100 and 200 ms training time (from omissions) tested on neural patterns ~250 ms following sound onset. This analysis shows a right hemispheric dominant pattern encompassing particular regions of the auditory cortex, but also motor and premotor regions (Figure 5B, right). Furthermore, a strong linear trend was also identified in anterior cingulate cortex and subcortical structures such as hippocampus and thalamus.
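The idea of testing a linear trend across entropy levels can be illustrated for a single cell of the time-generalization matrix. The sketch rests on assumptions that are not taken from this study: equally spaced contrast weights for the four levels (ordered RND, MM, MP, OR), a one-sample t statistic across participants, and invented accuracy values. The cluster-based correction over the full matrix is not reproduced.

```python
import math

# Linear contrast weights for four equally spaced entropy levels
# (assumed order: RND, MM, MP, OR).
TREND_WEIGHTS = [-3, -1, 1, 3]


def trend_score(accuracies):
    """Weighted sum of one participant's decoding accuracies across levels.

    Positive values mean accuracy rises with increasing regularity.
    """
    return sum(w * a for w, a in zip(TREND_WEIGHTS, accuracies))


def one_sample_t(values):
    """t statistic for the null hypothesis that the mean of values is zero."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean / math.sqrt(variance / n)


# Invented accuracies (chance = 0.25) for three participants at one
# training-time by testing-time point.
group = [
    [0.25, 0.26, 0.27, 0.29],
    [0.24, 0.25, 0.27, 0.28],
    [0.25, 0.25, 0.26, 0.27],
]
scores = [trend_score(participant) for participant in group]
t_value = one_sample_t(scores)
```

Repeating this for every cell of the matrix yields a map of trend statistics, which would then need correction for multiple comparisons before any cell is interpreted.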
Also, at the 90 ms testing time period, the activity bears similarities to the one recorded in the pre-omission period. This effect is reminiscent of the one reported above (see Figure 4), and we followed up on the question of whether this is a trivial carry-over effect or one indicating a test-tone frequency specific pre-activation effect using the same approach. However, here we used only the time-generalization slice that contains information as to whether pre-omission neural patterns can be found during tone presentation. As previously reported, deviations from a linear regression fit were most pronounced at 90 ms (see Figure 5C, middle), which is significant albeit at a non-Bonferroni-corrected level (see Figure 5C, right). In light of the more strongly powered analysis in the previous section, we take this as corroborating evidence that neural patterns in the frequency of the test tone are already present prior to the presentation of the actual sound, underlining the proactive nature of predictions. Most importantly, however, the results in this section unequivocally show that regularity of the sound sequence, putatively modulating predictions, leads to tonotopically specific neural activity patterns during omission periods.

Figure 5 Omission-to-sound decoding results. (A) Time-generalization analysis on sensor data (magnetometers) for each entropy level. Omission trials were used as training data and sound trials as testing data. Classifiers were trained at different time points to decode tone-frequency (4 classes). Each matrix presents the result of decoding for each entropy level where classifier accuracy was significantly above chance level (25%), masked at p corrected < 0.005. Only MP and OR entropy conditions show significant decoding at the group level. (B) Linear trend analysis between entropy levels. In the left panel, the time generalization matrix represents a significant linear trend for decoding accuracy values over the group (masked at p corrected < 0.05). The significant time-by-time points inside the black square are used for the source level decoding searchlight linear-trend analysis in the right panel (masked at p corrected < 0.01). (C) Pre-activation pattern analysis for ordered context. Training time points from -100 ms to 0 ms were averaged over the testing time course (see dashed black rectangle in left panel). A linear regression fit was used to remove previous stimulation carry-over activity (middle panel). One can observe a residual accuracy increase around 90 ms in testing time, speaking in favor of pre-stimulation activation patterns related to tone-frequency prediction.
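The carry-over correction used in the pre-activation analyses (linear fit to the decoding accuracy profile, followed by inspection of the residuals) can be sketched as follows. This is a minimal illustration, assuming ordinary least squares over the 70-150 ms window and an invented accuracy profile; the function names are hypothetical and the statistical testing of the residuals is omitted.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y on x."""
    n = len(x)
    x_mean = sum(x) / n
    y_mean = sum(y) / n
    slope = (sum((a - x_mean) * (b - y_mean) for a, b in zip(x, y))
             / sum((a - x_mean) ** 2 for a in x))
    return slope, y_mean - slope * x_mean


def carryover_residuals(times_ms, accuracy, window=(70, 150)):
    """Residual decoding accuracy per time point within the window.

    The straight line stands for carry-over from the previous tone, assumed
    to fade steadily; positive residuals mark time points where accuracy
    exceeds what that fading alone would predict.
    """
    pairs = [(t, a) for t, a in zip(times_ms, accuracy)
             if window[0] <= t <= window[1]]
    x = [t for t, _ in pairs]
    y = [a for _, a in pairs]
    slope, intercept = linear_fit(x, y)
    return {t: a - (slope * t + intercept) for t, a in pairs}


# Invented profile: steadily fading accuracy with a small bump at 90 ms.
times = list(range(70, 151, 10))
profile = [0.30 - 0.0005 * (t - 70) for t in times]
profile[times.index(90)] += 0.02
residuals = carryover_residuals(times, profile)
peak_time = max(residuals, key=residuals.get)
```

Because the line is fitted to the same points it is subtracted from, a bump also pulls the fit slightly towards itself, so residuals are a conservative estimate of any deviation from the carry-over model.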

Discussion
In this study, we investigated neural activity during passive listening to auditory tone sequences by manipulating their entropy levels and thereby the predictability of an upcoming sound. We used MVPA applied to MEG data to show that, next to more abstract features such as the entropy level, neural responses contain sufficient information to decode the carrier frequency of tones. Our main result reveals that single-trial brain responses to unexpected omissions in a predictable (low entropy) context can be used to decode carrier frequency when a sound is presented. This study provides strong support that top-down prediction-related processes are sharply tuned to contain tonotopically specific information. While the finding of sharp tuning of neural activity is not surprising, given in particular invasive recordings from the animal auditory cortex (e.g. during vocalizations, see 19,20; or shift of tuning curves following explicit manipulations of attention to specific tone frequencies, see 18,38), our work is a critical extension of previous human studies, for which a tonotopically tuned effect of predictions has not been shown so far. Critically, given that omission responses have been considered as pure prediction signals 26,39, our work illustrates that sharp tuning via predictions does not require bottom-up thalamocortical drive.

Sound-evoked activity decreases with increasing regularity, omission-evoked activity increases
As a general test of our data quality, we first focused on evoked responses to follow up on some previously reported findings 31,32,40,41. Both omissions and sounds elicit the largest evoked responses in the right primary auditory cortex, independently of tone-frequency and entropy level. Interestingly, in contrast to sounds, expected but omitted sounds appear to elicit marked evoked activity in the visual cortex. We speculate that unexpected omissions constitute salient events that require reorienting 42, thereby phase resetting visual activity (for general evidence for audiovisual phase resetting in humans see e.g. 43,44). This potentially interesting question is, however, outside the scope of this manuscript and would require further follow-up studies. Most importantly, sounds and omissions show differential evoked response patterns depending on the contextual entropy level, in particular during later periods of the evoked response (>200 ms). For sound-evoked brain responses, amplitude increases with entropy, whereas for omission-evoked brain responses, amplitude decreases with entropy. While the omission evoked effect was maximal in left A1, the sound evoked effect was more widespread, also involving the striatum and precuneus. The fact that the latter effect goes beyond auditory regions is not surprising, given that activity in these regions has been reported to be modulated by manipulations of regularity in previous studies 45,46. For example, Rauschecker 47 ascribes to the basal ganglia, along with other dorsal stream auditory regions, a role in matching sounds with expectations formed by previous presentations. Overall, our analysis of evoked responses is fully consistent with previous work 36 and notions of precision-based predictive coding 26, which suggest that neural responses decrease to expected events whereas they increase to unexpected events.

Single-trial MEG activity contains low- and high-level auditory information
To pursue our main research question, we relied on MVPA applied to MEG data 37,48. In particular, prior to addressing whether neural activity during omissions contains carrier-frequency specific information, it was important to illustrate the decoding analysis performance when a sound was actually presented. A priori, this is not a trivial undertaking, given that the small spatial extent of the auditory cortex 49 likely produces highly correlated topographical patterns for different pure tones and that mapping tonotopic organization using non-invasive electrophysiological tools has had mixed success (for critical overview see e.g. 50). Considering this challenging background, it is remarkable that all participants showed a stable pattern with marked post-stimulus onset decoding increases after ~90 ms. This pattern was observed for all entropy levels and encompassed an on-diagonal increase fading out after ~300 ms and a temporally stable off-diagonal increase, indicating a generalizable pattern emerging after ~100 ms and remaining elevated for ~100-200 ms. While this analysis included all sensors and was therefore spatially agnostic, it hints at a rich dynamic that goes beyond a transient activation of a circumscribed brain region (e.g. A1; see also below). Overall, this finding underlines that non-invasive electrophysiological methods such as MEG can be used to decode low-level auditory features such as the carrier frequency of tones. This corroborates and extends findings from the visual modality, for which successful decoding of low-level stimulus features such as contrast edge orientation has been demonstrated previously 51.
Going beyond this low-level information, we also addressed whether a representation of a more abstract feature, such as the sequence's entropy level, could also be decoded from the non-invasive data. Functionally, extracting regularities requires an integration over a longer time period, and previous MEG works focusing on evoked responses have identified in particular slow (DC) shifts to reflect transitions from random to regular sound sequences 36. This fits with our result showing that the entropy level of a sound sequence can be decoded above chance at virtually any time point, implying an ongoing (slow) process tracking regularities that is transiently increased following the presentation of a sound. Taken together, the successful decoding of low- and high-level auditory information underlines the significant potential of applying MVPA tools to non-invasive electrophysiological data to address research questions in auditory cognitive neuroscience that would be difficult to pursue using conventional approaches 22,52.

Predictions can be formed in a spectrally sharply tuned manner
Using an MVPA approach with time generalization allowed us to assess whether, beyond the level of differential evoked responses, carrier frequency related neural activity during sound or omission is systematically modulated by the entropy level. In both cases, a clear post-stimulus onset activity pattern was obtained that exhibited a linear relationship to entropy level across participants, in the sense that increasing regularity (i.e. lower entropy) went along with improved decoding accuracy. In particular, when training and testing on sounds, a pattern emerged after ~150 ms that generalized until ~300 ms and putatively involved left-dominant auditory and non-auditory regions. Note that this effect for sound-to-sound decoding was obtained while overall evoked response strength decreased with increasing regularity. This effect is reminiscent of findings in the visual modality suggesting a sharpening of the neural response profile by expectations, that is, a reduction of neural responses in the visual cortex while at the same time representational information is enhanced 53. Via our time-generalized omission-to-sound MVPA, we can assert that predictions can sharply tune relevant neural ensembles in a purely top-down manner, without any confounding influence of a sound. In this case, an increasing regularity of the sound sequence went along with better decoding for a time period of ~100-250 ms training time generalizing to ~200-300 ms testing time. This finding supports and extends a previous experiment by Sanmiguel et al. 28 showing that omission responses are sensitive not only to timing, but also to the precise features of the stimulus. Interestingly, the informative time period during the omission is clearly earlier than the omission-evoked response peak and also the period yielding a relationship to entropy level. Also, in contrast to the latter omission-evoked effect, which was mainly pronounced in left A1, the time-generalized MVPA effect showed a right hemispheric dominance.
Altogether, the results underline that our decoding approach uncovers patterns in the data that are not immediately apparent from the evoked responses. It is worth noting that, conforming to the general pattern in this study, the omission-to-sound decoding effect is not confined strictly to auditory regions, but also encompasses (pre)motor and frontal regions as well as subcortical regions such as the thalamus and hippocampus. This is in line with previous studies in the visual modality implying an involvement of medial temporal and prefrontal regions in the generation of predictions based on the statistical regularity of sensory input 54,55,56. The differential lateralization patterns for the sound-to-sound and omission-to-sound linear trend effects may be surprising at first sight, especially for the auditory cortex, if one assumes them to reflect pure prediction responses as suggested previously by some authors 57. However, differential patterns could make sense considering that the absence of an expected stimulation will create a greater amount of surprise than when a sound is presented as expected. This difference could in principle involve a non-overlapping set of brain regions. In any case, this aspect does not change the fact that the late omission-to-sound time-generalized MVPA effect is driven exclusively by top-down processes, illustrating for the first time in humans that prediction-related processes in the auditory system can be tonotopically tuned.
While later latencies could in principle contain a complex mix of prediction- and surprise-related processes, the act of predicting usually contains a notion of pre-activating relevant neural ensembles, a pattern that has been previously illustrated in the visual modality (e.g. 1,58). For the omission response, this was put forward by Bendixen et al. 39, even though the reported evoked response effects cannot be directly seen as signatures of pre-activation. We found neural patterns ~90 ms following sound onset to generalize to pre-sound periods that could not be explained by a simple linear carry-over effect from the previous sound. It is thus most parsimonious to assume that, next to the activity related to the carrier frequency of the previous tone, pre-sound periods contain relevant information about the carrier frequency of the upcoming tone. However, future studies will need to examine this in greater detail, since the current design cannot, for example, completely exclude a reactivation of patterns of previous neural activity by a new sound, even though this is not the most parsimonious assumption for the described regression residual effects. Next to this caveat, overall decoding accuracy was in absolute terms not high, especially for the critical analysis (i.e. omission-to-sound decoding). However, it should be noted that we refrained from the widespread practice of sub-averaging trials 48,51, which boosts classification accuracies significantly. When compared to cognitive neuroscientific M/EEG studies that perform decoding on genuine single trials and focus on group-level effects (rather than feature-optimizing on the individual level as in BCI applications), the strength of our effects is comparable (e.g. 59,60).

Methods
Participants
A total of 34 volunteers (16 females) took part in the experiment, giving written informed consent. At the time of the experiment, the average age was 26.6 ± 5.6 SD years. All participants reported no previous neurological or psychiatric disorder, and reported normal or corrected-to-normal vision. The experimental protocol was approved by the ethics committee of the University of Salzburg and was carried out in accordance with the Declaration of Helsinki.
Stimuli and experimental procedure
Before entering the magnetoencephalography (MEG) cabin, five head position indicator (HPI) coils were applied on the scalp. Anatomical landmarks (nasion and left/right pre-auricular points), the HPI locations, and around 300 headshape points were sampled using a Polhemus FASTTRAK digitizer. After a 5 min resting state session (not reported in this study), the actual experimental paradigm started. The subjects watched a movie (Cirque du Soleil: Worlds Away) while passively listening to tone sequences. Auditory stimuli were presented binaurally using MEG-compatible tubal in-ear headphones (SOUNDPixx, VPixx technologies, Canada). This particular movie was chosen for the absence of speech and dialogue, and the soundtrack was substituted with the sound stimulation sequences. These sequences were composed of four different pure (sinusoidal) tones, ranging from 200 to 2000 Hz and logarithmically spaced (that is: 200 Hz, 431 Hz, 928 Hz, 2000 Hz), each lasting 100 ms (5 ms linear fade in / out). Tones were presented at a rate of 3 Hz. Overall, the participants were exposed to four blocks, each containing 4000 stimuli, with every block lasting about 22 min. Each block was balanced with respect to the number of presentations per tone frequency. Within each block, 10% of the stimuli were omitted, thus yielding 400 omission trials (100 per omitted sound frequency). While the overall number of trials per sound frequency was equal within each block, blocks differed in the order of the tones, which was parametrically modulated in its entropy level using different transition matrices 61. In more detail, the random condition (RND; see Figure 1) was characterized by equal transition probability from one sound to another, thereby preventing any possibility of accurately predicting an upcoming stimulus (high entropy).
In the ordered condition (OR), presentation of one sound was followed with high (75%) probability by another sound (low entropy).
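The construction of such sequences can be illustrated with a minimal sketch. It is not the stimulus code used in the study: the function names are hypothetical, and both the choice of which tone follows which in the ordered matrix and the even spread of the remaining 25% over the other tones are assumptions made for the example. The sampler also does not enforce the exact per-tone balancing or the omissions described above.

```python
import math
import random

def log_spaced_tones(f_min=200.0, f_max=2000.0, n=4):
    """Return n frequencies spaced logarithmically between f_min and f_max."""
    ratio = (f_max / f_min) ** (1.0 / (n - 1))
    return [f_min * ratio ** i for i in range(n)]

def random_matrix(n=4):
    """RND condition: every transition is equally likely."""
    return [[1.0 / n] * n for _ in range(n)]

def ordered_matrix(n=4, p_high=0.75):
    """OR condition: each tone is followed by one particular tone with p_high.

    Assumption: the preferred successor is the next tone (cyclically) and the
    remaining probability is spread evenly over the other tones.
    """
    p_rest = (1.0 - p_high) / (n - 1)
    return [[p_high if j == (i + 1) % n else p_rest for j in range(n)]
            for i in range(n)]

def row_entropy_bits(matrix):
    """Mean Shannon entropy (in bits) over the rows of a transition matrix."""
    ent = [-sum(p * math.log2(p) for p in row if p > 0) for row in matrix]
    return sum(ent) / len(ent)

def sample_sequence(matrix, length, seed=0):
    """Draw a tone-index sequence from a first-order Markov chain."""
    rng = random.Random(seed)
    state = rng.randrange(len(matrix))
    seq = [state]
    for _ in range(length - 1):
        state = rng.choices(range(len(matrix)), weights=matrix[state])[0]
        seq.append(state)
    return seq
```

Under these assumptions, the uniform matrix has the maximal entropy of 2 bits per transition for four tones, and the ordered matrix has a lower value, which is the sense in which the ordered condition is more predictable.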

MEG data acquisition and preprocessing
The magnetic signal was recorded at 1000 Hz (hardware filters: 0.1-330 Hz) in a standard passive magnetically shielded room (AK3b, Vacuumschmelze, Germany) using a whole-head MEG system (Elekta Neuromag Triux, Elekta Oy, Finland). Signals were sampled with 102 magnetometers and 204 orthogonally placed planar gradiometers at 102 different positions. We used a signal space separation algorithm implemented in the Maxfilter program (version 2.2.15) provided by the MEG manufacturer to remove external noise from the MEG signal (mainly 16.6 Hz, and 50 Hz plus harmonics) and to realign data to a common standard head position (trans default Maxfilter parameter) across different blocks, based on the measured head position at the beginning of each block 63.
Data analysis was done using the Fieldtrip toolbox 64 (git version 20170919) and in-house built scripts. First, a high-pass filter at 0.1 Hz (6th-order zero-phase Butterworth filter) was applied to the continuous data. Then the data were segmented from 600 ms before to 600 ms after target stimulation onset and downsampled to 256 Hz for the ERF analysis, and to 100 Hz for the decoding part. Trials containing physiological or acquisition artifacts were rejected. A semi-automatic artifact detection routine identified statistical outliers of trials in the datasets using a set of summary statistics (variance, maximum absolute amplitude, maximum z-value). These trials were removed from each dataset. Across subjects, an average of 721 ± 266 SD (4.5 ± 1.7 SD %) trials were rejected. In all further analyses, the number of trials for the different carrier frequencies was balanced for each subject to prevent any bias across conditions 65. Finally, the epoched data were 30 Hz low-pass filtered (6th-order zero-phase Butterworth filter) prior to further analysis.

Source level analysis
Preprocessed data were projected to source level using an LCMV beamformer analysis 66. For each participant, realistically shaped, single-shell head models 67 were computed by co-registering the participants' headshapes either with their structural MRI (15 participants) or, when no individual MRI was available (19 participants), with a standard brain from the Montreal Neurological Institute (MNI, Montreal, Canada), warped to the individual headshape. A grid with 1 cm resolution based on an MNI template brain was morphed to fit the brain volume of each participant. A common spatial filter (for each grid point and each participant) was computed using the leadfields and the common covariance matrix, taking into account the data from all trials (i.e. including sound and omission trials from all conditions). The covariance window for the beamformer filter calculation ranged from 200 ms pre-stimulus to 500 ms post-stimulus. Using this common filter, the sensor-level single-trial time series were projected onto the 3D grid. For the evoked response, the resulting sound and omission trials were averaged relative to the stimulus onset and the absolute value was calculated. This yields for each condition a sound- or omission-related

Multivariate Pattern Analysis (MVPA)
We used multivariate pattern analysis as implemented in CoSMoMVPA 68 (git version 20170505). MVPA decoding was first performed using a time-generalized decoding analysis that included all magnetometers. Specific time slices from the time-generalization matrix were followed up by a spatial-searchlight decoding at source level (see below). We performed decoding analysis based on single-trial sensor-level data and single-trial normalized (z-scored) source data.
Overall, three decoding approaches were taken:
• Entropy-level decoding: In a first step, we kept only trials with sound presentation (removing omission trials) to investigate brain activity modulated by different experimental contexts. For this purpose, we defined four decoding targets (classes) based on block type (4 contexts: RND, MM, MP, OR).
• Sound-to-sound decoding: To test whether we could classify carrier frequency in general, we defined four targets (classes) for the decoding related to the carrier frequency of the sound presented on each trial (4 carrier frequencies).
• Omission-to-sound decoding: To test whether omission periods contain carrier-frequency-specific neural activity, omission trials were labeled according to the carrier frequency of the sound which would have been presented. These trials were used to train the classifier, which was subsequently applied to a test set of trials during which sounds were presented.
Using a Linear Discriminant Analysis (LDA) classifier, we performed a decoding analysis at each time point around stimulus / omission onset. A two-fold cross-validation scheme was applied for entropy-level and sound-to-sound decoding, using two randomly assigned sets of single trials. For the omission-to-sound decoding analysis, the training set was restricted to omission trials and the testing set contained only sound trials. Trials were balanced in the training and testing sets by using a random subset of trials in which the number of trials was equalized between the four conditions (i.e. 4 target classes: 4 entropy levels or 4 carrier frequencies, depending on the decoding analysis). In all cases, training and testing partitions contained different sets of data.
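The trial handling described in this paragraph (random two-fold assignment and equalizing trial counts across the four classes) can be sketched as follows. This is an illustration under stated assumptions rather than the CoSMoMVPA routines used in the study: the helper names are hypothetical, and splitting first and balancing within each fold afterwards is one possible order of operations.

```python
import random
from collections import defaultdict

def two_fold_split(indices, seed=0):
    """Randomly assign trial indices to two disjoint folds."""
    rng = random.Random(seed)
    shuffled = list(indices)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    return shuffled[:half], shuffled[half:]

def balance_trials(labels, indices=None, seed=0):
    """Return sorted trial indices with an equal number of trials per class.

    labels: class label of every trial; indices: optional subset to balance
    (defaults to all trials). The largest classes are randomly subsampled
    down to the size of the smallest class.
    """
    rng = random.Random(seed)
    if indices is None:
        indices = range(len(labels))
    by_class = defaultdict(list)
    for idx in indices:
        by_class[labels[idx]].append(idx)
    n_min = min(len(v) for v in by_class.values())
    keep = []
    for idxs in by_class.values():
        keep.extend(rng.sample(idxs, n_min))
    return sorted(keep)
```

For the two-fold analyses, each fold returned by `two_fold_split` would be passed through `balance_trials` separately; for omission-to-sound decoding, the omission trials and the sound trials would each be balanced on their own, since they already form disjoint training and testing sets.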
Classification accuracy for each subject was averaged at the group level and reported to depict the classifier's ability to decode over time (i.e. time-generalization analysis at sensor level) and over space (i.e. searchlight analysis at source level). The time generalization method was used to assess how well each LDA classifier, trained at a given time point of the training set, generalizes to every time point in the testing set 37. For the sound-to-sound and omission-to-sound decoding, time generalization was calculated for each entropy level separately, resulting in four generalization matrices, one for each entropy level. This was necessary to assess whether the contextual sound sequence influences classification accuracy on a systematic level. Significant clusters of time points were followed up by a searchlight analysis across brain sources. In this analysis we used local neighborhood features in source space (source radius of 1.5 cm). All significant searchlight accuracy results were averaged over the time cluster and reported on brain maps.
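The logic of the time-generalization matrix can be made concrete with a small, dependency-free sketch. Note the substitution: to keep the example short, a nearest-centroid classifier stands in for the LDA classifier used in the study, and all function names and the nested-list data layout are assumptions for illustration. For omission-to-sound decoding, `train` would hold omission trials labeled by the expected tone and `test` would hold sound trials.

```python
def centroid(rows):
    """Element-wise mean of a list of equally long feature vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]

def fit_centroids(X, y):
    """Compute one mean pattern per class. X: feature vectors; y: labels."""
    classes = sorted(set(y))
    return {c: centroid([x for x, lab in zip(X, y) if lab == c])
            for c in classes}

def predict(centroids, x):
    """Assign x to the class with the nearest centroid (squared distance)."""
    def sqdist(a, b):
        return sum((ai - bi) ** 2 for ai, bi in zip(a, b))
    return min(centroids, key=lambda c: sqdist(centroids[c], x))

def time_generalization(train, y_train, test, y_test):
    """Train at every time point, test at every time point.

    train, test: nested lists indexed as data[trial][time][sensor].
    Returns acc[train_time][test_time], the fraction of correct predictions.
    """
    n_train_t = len(train[0])
    n_test_t = len(test[0])
    acc = []
    for t_tr in range(n_train_t):
        model = fit_centroids([trial[t_tr] for trial in train], y_train)
        row = []
        for t_te in range(n_test_t):
            hits = sum(predict(model, trial[t_te]) == lab
                       for trial, lab in zip(test, y_test))
            row.append(hits / len(test))
        acc.append(row)
    return acc
```

Off-diagonal entries of the returned matrix indicate that a pattern learned at one latency also discriminates the classes at another latency, which is how a training window and a differing testing window can both be reported for a single effect.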

Statistical analysis
For the evoked responses, we tested the dependence on entropy level using a regression test (depsamplesregT in Fieldtrip). Results for sounds and omissions were sorted from random to ordered, respectively. Testing sound- and omission-evoked responses separately on a whole-brain level first, we defined an early (100-200 ms) and a late time window (200-300 ms) based on previous studies in this domain (for an overview see 26,69). In order to account for multiple comparisons, we used a nonparametric cluster permutation test 70 as implemented in Fieldtrip, using 1000 permutations and p < .025 to threshold the clusters. Neighboring grid points were clustered (minimum number of grid points in a cluster when their distance was below 1.5 cm). Given previous works 11,12 and also theoretical reasoning 13, we hypothesized decreasing evoked responses to sounds the more ordered the sound sequence became. On the other hand, for omissions, we expected evoked responses to increase the more ordered the sequences became, since within these sequences expectations and violations thereof should be stronger 28. This latter prediction was not evident at a whole-brain cluster-corrected level. In order to target this differential prediction for sound- and omission-evoked responses in a more direct manner, we implemented a normalized contrast and focused on the left and right auditory cortex (as given by the grand average; see Figure 2A). In this procedure, we first normalized each condition (i.e. sound / omission x entropy level) by the evoked response of the random sequence (e.g. OR_norm = [OR - RND] / [OR + RND]). For the regression analysis, we entered the difference of the normalized contrasts between omission and sound (e.g. OR_diff = OR_norm(omission) - OR_norm(sound)). According to our hypothesis, the differential relationship to entropy level for sound- and omission-evoked responses should be reflected in a monotonically increasing difference.
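The normalized contrast and its omission-minus-sound difference amount to simple arithmetic, sketched below. The function names are hypothetical, the ordering of the two intermediate levels (MM before MP) is assumed from the order in which the block types are listed, and amplitudes are taken to be the non-negative (absolute-valued) evoked responses so that the denominator is positive.

```python
def normalized_contrast(cond, rnd):
    """Return (cond - RND) / (cond + RND) for non-negative amplitudes."""
    return (cond - rnd) / (cond + rnd)

def contrast_difference(omission, sound, levels=("RND", "MM", "MP", "OR")):
    """Difference of normalized contrasts, omission minus sound, per level.

    omission, sound: dicts mapping entropy level -> evoked amplitude.
    levels: entropy levels sorted from random to ordered (assumed order).
    """
    diffs = {}
    for lvl in levels:
        om = normalized_contrast(omission[lvl], omission["RND"])
        so = normalized_contrast(sound[lvl], sound["RND"])
        diffs[lvl] = om - so
    return diffs
```

By construction the value for the random condition is zero. If omission responses grow and sound responses shrink with increasing regularity, the differences increase from RND to OR, which is the monotonic trend the regression test targets.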
The multivariate analysis results were tested at the group level by comparing the resulting individual accuracy maps against chance level (25% with 4 classes) using a nonparametric approach implemented in CoSMoMVPA 68, adopting 10,000 permutations to generate a null distribution. The p-value threshold was set at p < 0.005 for cluster-level correction to control for multiple comparisons using a threshold-free method for clustering 71, which has been used and validated for MEG/EEG data 72,73. The time generalization results and searchlight brain maps at the group level were thresholded using a mask with corrected z-score > 2.58 (or p corrected < 0.005). We also tested the dependence of classification results on entropy level using a regression test (depsamplesregT in Fieldtrip), following a statistical procedure analogous to that of the evoked response analysis. Only the significant time-by-time points identified in the sensor-level time generalization were used to test the source-level dependence of searchlight decoding results on entropy level using a similar regression test.

Data and Code Availability
Further information and requests for resources or data should be directed to and will be fulfilled by the corresponding author.