Introduction

Theories of semantic embodiment propose that concepts are, at least in part, represented within the same neural sensorimotor systems that were involved in their acquisition (e.g.1,2). Motor action concepts, for example, are considered to be represented within the brain’s motor cortices. In other words, the neuronal assemblies that represent the concept ‘to kick’ are thought to overlap with those involved in physically kicking one’s leg.

In support of the hypothesised overlap between action execution and action comprehension, there is considerable evidence that healthy participants are faster to perform actions in response to sentences if those actions are congruent with the actions described by the sentences – e.g. pushing a joystick away from you in response to the sentence “Close the drawer”3. One study observed that healthy participants were selectively slower to make semantic judgments of sentences that describe actions if their motor system had been recently fatigued by repetitive action4. Furthermore, there is evidence that participants experience interference in planning and executing hand actions when they are simultaneously required to access the meanings of visually presented action words5. Functional neuroimaging data suggest that listening to sentences describing actions elicits greater activity in premotor cortex than listening to sentences that do not describe actions6. Moreover, some studies have reported effector-specific overlap between tasks that involve action comprehension and action execution in premotor and primary motor cortices7,8,9,10,11.

One commonly used marker of sensorimotor cortex activation is the modulation of electrophysiological oscillations in the mu (8–12 Hz) and beta (13–30 Hz) bands over the top of the head. Reductions in the amplitude of oscillations in these ranges in electroencephalography (EEG) and magnetoencephalography (MEG), often referred to as event related desynchronisations (ERD12), are ubiquitous when healthy individuals complete motor tasks, including imagery, observation, and execution (e.g.13,14,15). These motor ERDs are often reported to be somatotopically distributed, suggesting sources within somatotopic cortices, although this is variable across studies16,17,18,19,20. Nevertheless, data from simultaneous EEG-fMRI indicate broad somatosensory and motor cortical generators of the mu and beta rhythms in motor tasks21. Furthermore, the temporal resolution of electrophysiology allows researchers to separate cortical activity that may contribute to semantic access (i.e. within the first ~400 milliseconds post-stimulus22) from that which occurs subsequent to semantic access, such as implicit or explicit mental imagery.

Consequently, and often cited in support of embodied semantics, somatotopy of MEG-recorded ERDs in the mu and beta bands has been reported from 200 ms after presentation of action words23. Event-related potentials (ERPs) elicited by verbs and nouns also begin to differentiate from 200 ms post-stimulus24,25, while source estimates of ERPs and MEG event-related fields have also suggested somatotopy of premotor and primary motor cortical contributions to semantic access of action words from 150 ms post-stimulus26,27,28,29. The timing of this motor activation, hundreds of milliseconds before any overt semantic judgment by the participant, is viewed by some as support for the hypothesis that the motor system is engaged in service of semantic access of action concepts.

Nevertheless, post-lexical cortical activity has also been reported. For example, verbs and nouns have been associated with late (>500 ms post-stimulus) topographically distinct oscillations in the beta range (25–35 Hz)24. Larger beta ERDs have been observed in response to verbs relative to nouns from 600 ms post-stimulus30 with putative generators in primary motor cortex31. However, the opposite pattern - i.e. larger ERDs for nouns relative to verbs over central scalp - has also been reported in the high beta/low gamma range32, suggesting the activity of multiple non-overlapping oscillatory mechanisms in semantic processing.

Perhaps mirroring the variability of the above evidence, a meta-analysis of fMRI and PET activation foci found insufficient evidence for the specific involvement of motor cortices in action verb processing, and instead observed a more consistent role for left lateral temporo-occipital cortex33. The authors concluded that action representations may overlap more with the cortical regions involved in perceiving actions (i.e. visual motion areas), rather than those involved in performing them (i.e. motor cortices). While this interpretation still falls within an embodied view of semantics, it calls into question the role of motor cortices in action verb comprehension. Nevertheless, there is evidence that motor cortical involvement in comprehension varies as a function of task goals8,34, and may therefore not be evident on average across a broad literature. However, congruent sensorimotor activations have been reported in studies that are entirely task free26 or that specifically instruct participants not to attend to the word stimuli35, suggesting that sensorimotor activation should still be evident across task demands. We are specifically interested in the potential for sensorimotor activation under relatively shallow task demands as it speaks to the automaticity of sensorimotor involvement, and so here report an EEG study designed to investigate the specific cortical contributions to comprehension in the case of healthy participants accessing the meaning of visually-presented verbs in a low-demand, definition verification task.

Many high temporal resolution studies, reviewed above, have contrasted broad categories of words, such as verbs and nouns, which also differ on a range of potentially confounding psycholinguistic variables, such as imageability36. Furthermore, studies of neural responses to effector-specific words (e.g. lick, pick, kick) have relied on inverse models to accurately separate sub-regions of sensorimotor cortices26,28,29. Here, we describe a study of semantic access of verbs that differ in the extent to which they describe action. As a comparison of effector-specific words is necessarily limited by the accuracy of inverse models of activity in neighbouring cortical sub-regions, we here chose not to compare categories of stimuli organised by effector, but to create two categories of words that differ in the extent to which they describe motor action, but which do not differ in their levels of imageability. Our aim with this approach was to increase the likelihood of detecting sensorimotor activation as we predicted that verbs describing motor actions would activate (at least, sub-regions of) sensorimotor cortex, while non-motor verbs would activate a diffuse and heterogenous set of cortical regions outside of those activated by the motor verbs, thus eliciting differential profiles of cortical generators on average. A similar approach by others in the field has led to differential sensorimotor activations (e.g.36,37) although, unlike our study, the word categories in those studies were not matched in imageability. Therefore, here we test the hypothesis that semantic access of motor verbs (e.g. ‘grab’) recruits dissociable regions of cortex, as measured by ERPs and ERDs, from non-motor verbs (e.g. ‘fail’; when matched on imageability). By employing source estimation of these effects, we also tested whether the differential activity can be explained by overlap with cortical regions involved in performing actions (i.e. motor cortices) and those involved in perceiving actions (i.e. left lateral temporo-occipital cortex).

Results

Behaviour

Participants judged the correctness of verb definitions with high accuracy (hit rate: M = 91.79%, SD = 11.25%; false alarm rate: M = 3.03%, SD = 2.96%) which we interpret as evidence of the group’s attention to the meaning of stimuli.

Sensor analyses: ERPs

One positive and one negative spatial cluster exceeded our significance threshold in the 164–203-ms time-window (p = 0.008 and p = 0.002, respectively). The two clusters are located over both sides of a dipolar distribution of voltage differences, with an anterior positivity and posterior negativity (Fig. 1B). Global dissimilarity within the 164–203-ms time-window was significantly greater than that expected by chance (GD = 0.416, p = 0.004), suggesting that the neural generators underlying motor and non-motor verb processing are not entirely overlapping in this time-window.
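For readers unfamiliar with the measure, global dissimilarity between two scalp maps is commonly defined as the root-mean-square difference between the maps after each has been average-referenced and scaled to unit global field power. The sketch below follows that standard definition. The function names and the four-electrode toy maps are hypothetical, and this is not the analysis code used for the results reported here.

```python
import math


def average_reference(topography):
    """Subtract the mean across electrodes so the map sums to zero."""
    mean = sum(topography) / len(topography)
    return [v - mean for v in topography]


def global_field_power(topography):
    """Root mean square across electrodes of the average-referenced map."""
    centred = average_reference(topography)
    return math.sqrt(sum(v * v for v in centred) / len(centred))


def global_dissimilarity(map_a, map_b):
    """RMS difference between two GFP-normalised, average-referenced maps.

    Ranges from 0 (identical topographies) to 2 (polarity-inverted topographies).
    """
    gfp_a = global_field_power(map_a)
    gfp_b = global_field_power(map_b)
    norm_a = [v / gfp_a for v in average_reference(map_a)]
    norm_b = [v / gfp_b for v in average_reference(map_b)]
    n = len(norm_a)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(norm_a, norm_b)) / n)


# Toy maps (hypothetical values, arbitrary units) over four electrodes.
anterior_positive = [2.0, 1.0, -1.0, -2.0]
same_shape_stronger = [4.0, 2.0, -2.0, -4.0]  # same topography, larger amplitude
inverted = [-2.0, -1.0, 1.0, 2.0]             # polarity-reversed topography

gd_same = global_dissimilarity(anterior_positive, same_shape_stronger)
gd_inverted = global_dissimilarity(anterior_positive, inverted)
```

Because both maps are normalised before comparison, a pure difference in response strength yields a value near zero, whereas a change in the spatial distribution, and hence in the configuration of underlying generators, yields a larger value.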

Figure 1

ERP scalp topographies from 164–203 ms post-stimulus. (A) Grand average topographies separated according to condition. (B, left) Grand average topography of the difference between conditions. Electrodes contributing to the two clusters are marked. (B, right) Tukey boxplots and individual subject mean voltages within the two significant clusters.

No clusters were formed in any of the other three time-windows of interest, nor did any other global dissimilarity analyses exceed our statistical threshold (106–160-ms: GD = 0.160, p = 0.998; 207–293-ms: GD = 0.148, p = 0.325; 297–418-ms: GD = 0.066, p = 0.399).

Sensor analyses: Oscillations

Global dissimilarity within the 555–785-ms time-window of the mu band was significantly greater than that expected by chance (GD = 0.385, p = 0.009; Fig. 2A), indicating that the neural generators underlying the mu band reactivity to motor and non-motor verbs are not entirely overlapping in this time-window. The scalp maps in Fig. 2A clearly show occipital alpha reactivity in this time-window, with a greater extent of the motor verb ERD over central midline electrodes. In the same time-window, one cluster of voltage differences was formed with p = 0.050 centred on electrode Cz, although this fails to pass our two-tailed threshold of p < 0.025.

Figure 2

Mu band scalp topographies from 555–785 ms post-stimulus. (A) Grand average topographies separated according to condition. (B, left) Grand average topography of the difference between conditions. Electrodes contributing to the cluster are marked. Note that this cluster does not pass the threshold of p < 0.025, but is shown to visualise the difference that drives the significant global dissimilarity (p = 0.009). (B, right) Tukey boxplots and individual subject mean power within the cluster of electrodes.

No clusters were formed in the early time-window for the mu band (66–551-ms), or in either of the time-windows for the beta band. No other global dissimilarity analyses exceeded our statistical threshold (Mu band 66–551-ms: GD = 0.216, p = 0.143; Beta band 90–160-ms: GD = 0.769, p = 0.364; Beta band 164–633-ms: GD = 0.240, p = 0.685).

Source analyses: ERP

Source analyses on the ERP dataset did not reveal any significant differences between motor and non-motor verbs in any of the tested AAL regions (paired-sample t-tests: FDR-corrected p-values > 0.05). Figure 3 shows unthresholded T-values of the difference between ERPs to motor and non-motor verbs, averaged over the time window that showed significant clusters at the sensor level (164–203 ms). Although the source results of the difference between motor and non-motor trials show a maximum over central and fronto-central areas, no significant clusters were observed (p > 0.6). To test whether there are AAL regions that respond in the same way to motor and non-motor verbs within the analysed time window, a Bayesian t-test was computed for each AAL region. It revealed substantial evidence for the null in all regions (BF10 < 0.333) except the left pre-central gyrus, which exhibited weaker evidence in favour of the null (BF10 = 0.422; see Table 1 for a detailed description of the statistical tests), suggesting that the signal in the analysed time window of the ERP in these regions is the same for motor and non-motor verbs.

Figure 3

Whole brain source estimate for the difference between ERPs to motor and non-motor verbs averaged across the time-window of significant difference in the sensor data. The figure shows the difference of the source estimates between motor and non-motor trials across the time window 164–203 ms as unthresholded T-values. [A = anterior; P = posterior; L = left; R = right].

Table 1 Paired sample T-Tests and Bayesian T-Tests of ERP source estimate in individual AAL regions.

Source analyses: Mu/alpha-oscillations

Source estimates of the differential mu/alpha (see Discussion for consideration of whether this is a mu or alpha rhythm) response during the time-window of the significant effect at the sensor level reveal a broad negativity over centro-posterior brain regions (see Fig. 4A). The difference in the left superior parietal lobule survived multiple comparisons correction, reflecting a reduction in mu/alpha power in response to motor verbs relative to an increase in mu/alpha power to non-motor verbs (T(19) = −3.249; p = 0.029 FDR corrected; Fig. 4B), which was further confirmed by a Bayesian t test (BF10 = 10.598). Mu/alpha power distributions of motor and non-motor conditions in that AAL region however were not significantly different from 0 (motor: T(19) = 0.107; non-motor: T(19) = 0.604). Applying the Bayesian t test to all other AAL regions (see Table 2 for detailed description of statistical tests) showed some evidence for the Null in the posterior middle temporal gyrus (BF10 = 0.232), the left pre-central gyrus (BF10 = 0.237), and the right middle occipital gyrus (BF10 = 0.251), suggesting that mu/alpha power in these regions is not modulated differently upon processing of motor or non-motor words.
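The correction for multiple comparisons across regions can be illustrated with the Benjamini-Hochberg step-up procedure. This is a minimal sketch that assumes that procedure, as the specific FDR variant is not restated in this section. The p-values in the example are invented for illustration and are not those reported in Table 2.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical uncorrected p-values for five regions of interest.
raw_p = [0.004, 0.024, 0.010, 0.500, 0.900]
adjusted_p = benjamini_hochberg(raw_p)
significant = [p < 0.05 for p in adjusted_p]
```

A region is then reported as significant only if its adjusted p-value falls below the chosen threshold, which controls the expected proportion of false positives among the regions declared significant.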

Figure 4

Source results Mu/Alpha oscillations. (A) Mu/alpha power difference values averaged over participants and the time window that showed a significant negative cluster on the sensor level (555–785 ms). (B) AAL region ‘left superior parietal lobule’ (in orange) that showed a significant mu/alpha power difference between motor and non-motor conditions (p = 0.029; FDR-corrected). The Tukey boxplots represent mu/alpha power values in the AAL region ‘left superior parietal lobule’ averaged over the time window of interest (555–785 ms) for motor and non-motor conditions, with individual subject means overlaid. Abbreviations identify spatial landmarks: A = anterior; P = posterior; L = left; R = right.

Table 2 Paired sample T-Tests and Bayesian T-Tests of mu/alpha power source estimate in individual AAL regions.

Discussion

We tested the hypothesis that semantic access of verbs that describe motor actions involves dissociable regions of cortex from semantic access of verbs that do not describe motor actions. In support of this hypothesis, we observed significant differences between the ERPs elicited by these two classes of stimuli in a time-window 164–203 ms post-stimulus. Global dissimilarity analysis in this time-window provided strong evidence that the ERPs in response to motor and non-motor verbs are not generated by entirely overlapping regions of cortex (p = 0.004). This result is consistent with widely-accepted distributed views in which a concept’s semantic features are represented across cortices (see, for example38), but is not in itself sufficiently consistent with an embodied view of semantics, in which action concepts are considered to be represented within the cortical sensorimotor system.

The timing of our observed ERP effect is consistent with that reported in a study of Dutch arm action verbs and non-action verbs, in which ERPs diverged from 155–174 ms post-stimulus36. Furthermore, source estimation implicated bilateral motor cortices (precentral gyri) as generators of that effect, and was therefore interpreted as evidence for embodied semantics of action. While the early onset of our observed ERP difference is consistent with cortical activity in support of semantic access22, we found no evidence that this ERP effect originated within cortical regions that would be consistent with embodied semantics – e.g. occipito-temporal (perceptual) areas identified in a meta-analysis of fMRI and PET results33 or bilateral pre- and primary motor cortices (see Tables 1 and 4). While the difference between motor and non-motor verb trials on the source level shows its maximum over central and fronto-central areas (see Fig. 3), neither significant clusters nor significant differences in the tested AAL regions were observed. Indeed, all cluster p-values were >0.6, all AAL p-values were >0.25, and our Bayesian analyses suggested substantial evidence for the null hypothesis of no effect in all AAL regions of interest (BF10 < 0.333) apart from the left pre-central gyrus, which exhibited weaker evidence that was nevertheless in favour of the null hypothesis. Taken together, these results are inconsistent with the hypothesised specific role of sensorimotor cortex in action verb comprehension.

Table 3 Descriptive and inferential statistics for each psycholinguistic variable across Motor and Non-Motor stimuli.
Table 4 MNI coordinates of the centre of mass of all investigated AAL regions and peak locations from the associated meta-analysis study (Watson et al.33).

One possible reason for the conflicting source estimates between our study and that of Vanhoutte et al.36 is that the verbs they used differed considerably in their imageability ratings between action and non-action categories (reported p < 0.001) while the verbs in our study did not differ significantly between categories (p = 0.098) and a Bayesian analysis indicated weak evidence in favour of the null hypothesis of no difference in imageability between the two categories of stimuli (BF10 = 0.831). Thus, it is possible that the activity localised in bilateral motor cortices in the study by Vanhoutte et al. stems from a differential process of, potentially implicit, mental imagery between categories, that is absent from our data due to the more closely matched levels of imageability. This same argument can be applied to evidence for early sensorimotor activation from Moseley et al.37 whose visually-presented action words also differed significantly from their comparison categories in imageability (reported p < 0.001). Nevertheless, Vanhoutte et al.36, and others39,40,41, argue that the onset of this post-stimulus effect is too quick to reflect mental imagery. Conversely, critics of strong embodied accounts argue that such apparent rapid activation of motor cortices is not sufficient evidence for motor cortical involvement in representing the concept itself, as the activity could also reflect spreading activation from an abstracted representation1. If indeed our lack of evidence for rapid sensorimotor activation is driven by our controlling for verb imageability across categories in our study, the reported rapid sensorimotor activations across the literature where imageability is not controlled may reflect a form of implicit motor imagery, or motor simulation9, rather than activation of the concept representations themselves. However, a direct test of this link is required. 
Furthermore, while we endeavoured to match our word categories on a range of variables that can modulate neural responses, including imageability, there are inevitably more psycholinguistic variables that we did not measure but that may also modulate neural responses42 or differ between our word categories in ways that are not specific to our desired embodiment contrast. While we cannot investigate all possible variables for our stimulus set, we have made all of our data and stimuli available online so that they can be investigated in future along other linguistic variables. Nevertheless, while our ERP data are consistent with differential semantic activations between verb categories, they do not provide evidence for a specific contribution of the cortical regions implicated by an embodied account.

Even though it was not our intention, many of the motor words show a bias towards hand-related motor actions (e.g. snip, catch). The lack of significant differences in the ERP between motor and non-motor conditions could therefore be due to the non-specificity of the investigated regions of interest. We therefore reinvestigated our ERP data on the source level by focusing the analysis on regions of interest centred on the left and right hand-knobs (centre MNI-coordinates x: −39/+39, y: −24, z: 57 mm; from43,44; plus 6 surrounding virtual electrodes, +/−1 on the x-, y-, and z-coordinates). However, these effector-specific regions of interest did not reveal any significant difference between motor and non-motor trials either (left hand knob: p = 0.216; BF10 = 0.474; right hand knob: p = 0.339; BF10 = 0.355).
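The construction of each hand-knob region of interest, a centre point plus its six neighbours along the three axes, can be sketched as follows. The function name is hypothetical, and the +/−1 offset is treated here as one step of the source grid. The spacing of that grid in mm is not stated in this section, so `GRID_STEP_MM` is a placeholder rather than the value used in the analysis.

```python
def roi_with_neighbours(centre, grid_step_mm):
    """Return the centre point plus its six face-adjacent grid neighbours."""
    x, y, z = centre
    points = [centre]
    for axis in range(3):
        for sign in (-1, 1):
            offset = [0.0, 0.0, 0.0]
            offset[axis] = sign * grid_step_mm
            points.append((x + offset[0], y + offset[1], z + offset[2]))
    return points


# Centre MNI coordinates (mm) of the hand knobs, as given in the text.
LEFT_HAND_KNOB = (-39.0, -24.0, 57.0)
RIGHT_HAND_KNOB = (39.0, -24.0, 57.0)

GRID_STEP_MM = 1.0  # placeholder: the actual source-grid spacing is an assumption

left_roi = roi_with_neighbours(LEFT_HAND_KNOB, GRID_STEP_MM)
right_roi = roi_with_neighbours(RIGHT_HAND_KNOB, GRID_STEP_MM)
```

Each region of interest therefore contains seven virtual electrodes, whose source estimates would be averaged before the statistical comparison between conditions.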

To increase the signal-to-noise ratio of our EEG data, we presented our participants with the full set of word stimuli four times across the study. It is therefore possible that our recorded neural responses became reduced across repetitions, and may therefore obscure semantic effects on average across all repetitions. However, a comparison of the difference between motor and non-motor ERPs in the early time-window (164–203 ms) between the first quarter of the study (i.e. repetition 1) and the last quarter of the study (i.e. repetition 4) failed to form any clusters, suggesting a negligible impact of priming on our results.

Alongside an early ERP effect, we also observed a differential modulation of the mu rhythm (8–12 Hz) in a late time-window (555–785 ms). While the differences in power of the mu rhythm in this time-window did not exceed our statistical threshold at the sensor level (p = 0.050 where alpha = 0.025), a global dissimilarity analysis indicated strong evidence for different scalp distributions of the mu rhythm ERD (p = 0.009). As in the case of the ERPs above, this result suggests that the mu ERDs in response to motor and non-motor verbs are not generated by entirely overlapping regions of cortex. Furthermore, our source analyses provided strong evidence (BF10 = 10.598, p = 0.029 FDR-corrected) for a generator of this difference within the left superior parietal lobule – a region identified in a meta-analysis of fMRI and PET studies of lexical-semantic processing of action words or images33. Watson et al.33 suggested that the meta-analytical concordance in this parietal region stems from the use of object-directed action concepts across studies, as overlapping regions of parietal cortex are thought to support production of object-directed actions45, and lesions of parietal cortex are linked to deficits in recognising object-directed actions46. Indeed, many of the motor stimuli used in our study are concepts that describe object-directed actions – e.g. pull, hurl, carve.

Nevertheless, our mu data are also not consistent with an embodied view of action semantics as the effect is both late in time and not localised to specific sensorimotor cortices. The so-called mu rhythm (8–12 Hz) in fact shares the same frequency band as the alpha rhythm but is differentiated by the fact that it is distributed over the top of the head with putative generators in rolandic regions21. However, our data did not show significant differences in this frequency band within rolandic brain regions. Indeed, our data provided substantial evidence for the null within left pre-central gyrus using Bayesian equivalent t-tests (BF10 = 0.236) – i.e. evidence that 8–12 Hz power does not differ on average in this brain region during semantic access of both motor and non-motor verbs. It may therefore be more accurate to describe our observed 8–12 Hz activity as reflecting the alpha rhythm, rather than the mu rhythm. The alpha rhythm has been characterised as an active inhibition mechanism whereby power increases functionally inhibit sensory processing in task-irrelevant brain regions, while power decreases boost processing in other regions47. In this sense, our observed greater decrease in 8–12 Hz (alpha) power within the left superior parietal lobule in response to motor verbs may reflect a boost in processing in this brain region, perhaps as part of accessing the object-directed features of motor actions45. This finding is also in line with the stronger activation observed in fMRI and PET data in this region in response to action-related lexical stimuli33, as alpha power is often reported to be negatively correlated with the BOLD response of fMRI48,49,50.

Just as with the mu rhythm, beta band oscillations (13–30 Hz) classically represent activation of the sensorimotor cortex upon imagining, executing, or observing a movement (e.g.13,14,15). However, we did not observe any effects in the beta band. Nevertheless, if we continue to assume that our observed 8–12 Hz effect reflects the alpha rhythm rather than the mu rhythm, the lack of evidence for modulation of the beta band in our data is not surprising, and again fails to support an embodied view of motor action concepts.

While a number of authors report evidence for activity over motor regions during action-word processing and subsequently argue for an embodied account of action meaning, there are others who critically question this interpretation. For example, according to Bedny and Caramazza51, it is unclear whether the reported modulations over motor areas reflect the general role of these areas in language processing or reveal a specificity of motor areas in the understanding of action words. In a review article, the authors argue that there is more evidence for the role of left middle temporal gyrus in action word comprehension than sensorimotor regions – a position supported by the subsequent meta-analysis of Watson et al.33. Nevertheless, our findings at the source level do not provide evidence for a role of either of these cortical regions. As noted above, Bayesian t-tests of the alpha power effect provide evidence for the null hypothesis within the left pre-central gyrus, consistent with a more critical view of semantic embodiment of action verbs. Furthermore, our source analyses of the early ERP effect also revealed substantial evidence in favour of the null within all but one of the investigated AAL regions.

In conclusion, our data are consistent with a rapid differential activation of cortex when accessing the meaning of motor and non-motor verbs, followed by a later post-lexical involvement of left superior parietal lobule. However, our data do not provide direct support for a specific role of sensorimotor cortices when healthy individuals access the meaning of individual motor action verbs. To establish the extent to which embodied cognition applies to semantic representations, we must continue to delineate the specific task goals and/or contexts in which sensorimotor cortices are recruited in service of comprehension, as our task may have encouraged a relatively shallow reading of the words34.

Methods

Participants

A total of forty-nine healthy participants (students from the University of Birmingham) took part in the studies and were compensated with either course credits or cash. Twenty of these participants took part in an initial behavioural study to validate the stimuli list (median age = 21.5; range: 18–31), and the remaining twenty-nine participants took part in the EEG study. Five participants were excluded from the EEG study due to excessive artefact, resulting in twenty-four participants (median age = 21; range: 19–28) for analysis. A sample size of 24 in a two-tailed within-subjects design gives 80% power to detect an effect size of 0.6 (ref. 52). All participants reported being monolingual native English speakers, between 18 and 35 years old, right-handed, with no history of epilepsy, and no diagnosis of dyslexia. The experimental procedures were approved by the Ethical Review Committee of the University of Birmingham (ERN_15–1367AP3) and all research was performed in accordance with relevant guidelines. All participants gave written informed consent in accordance with the Declaration of Helsinki.

Stimuli

We constructed an initial list of 100 monosyllabic bodily action words and 100 monosyllabic non-bodily action words. We then used the Match software53 to select the stimuli set that best matched the motor and non-motor verbs on the basis of the following psycholinguistic variables: number of letters, number of phonemes, log frequency, orthographic neighbourhood, phonological neighbourhood, concreteness, imageability, and mean age of acquisition. The MRC Psycholinguistic Database54 provided the values for number of letters, number of phonemes, and concreteness. Word frequency values were taken from the British National Corpus Frequency database55. N-Watch software56 provided the orthographic and phonemic neighbourhood measures. Imageability ratings were taken from57, and age of acquisition ratings from58. This procedure resulted in a list of 36 motor and 36 non-motor verbs.

To validate this word list, we collected data from a group of healthy participants. Each participant was presented with each word individually, and instructed to create a sentence incorporating that word. Upon completion of this task, participants were again presented with each word and instructed to rate from 1 to 7 the extent to which the verb described a “voluntary and bodily action or movement” (7 as highest). We subsequently removed all words that were used as nouns by more than half of the participants or that received inconsistent ratings across participants (3 words per condition: tug, stroll, split, glare, glow, bleed). The resulting set of motor words was rated significantly higher than the set of non-motor words (Wilcoxon’s Signed Rank Test, Z = 561, p < 0.001), thus validating the motoric difference in meaning across lists while approximately controlling for potentially confounding psycholinguistic variables.

T-tests (Table 3) revealed no significant differences between the two final lists (33 words per condition) for any of the variables (p > 0.09). Bayesian T-tests (conducted with JASP v. 0.8.0.0 software59,60) revealed at least substantial evidence in favour of the null hypothesis for the majority of variables (i.e. BF10 ≤ 1/3), and weak evidence in favour of the null hypothesis for number of phonemes (BF10 = 0.397), phonological neighbourhood (BF10 = 0.600), and imageability (BF10 = 0.831).
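The verbal labels attached to Bayes factors throughout this paper can be made explicit with a small helper. The sketch below assumes conventional Jeffreys-style cut-offs of 3 and 10, applied symmetrically to evidence for the null and for the alternative. The function name is hypothetical, and the cut-offs are an assumption chosen to be consistent with the labels used in the text rather than a restatement of JASP output.

```python
def describe_bf10(bf10):
    """Label a Bayes factor (alternative over null) with a verbal category."""
    if bf10 <= 0:
        raise ValueError("A Bayes factor must be positive.")
    if bf10 == 1:
        return "no evidence either way"
    favoured = "alternative" if bf10 > 1 else "null"
    # Express the strength as a ratio >= 1 regardless of direction.
    ratio = bf10 if bf10 > 1 else 1.0 / bf10
    if ratio < 3:
        strength = "weak"
    elif ratio < 10:
        strength = "substantial"
    else:
        strength = "strong"
    return f"{strength} evidence for the {favoured}"


# Bayes factors quoted in the text.
imageability_label = describe_bf10(0.831)
precentral_mu_label = describe_bf10(0.237)
parietal_mu_label = describe_bf10(10.598)
```

Under these cut-offs, BF10 = 0.831 for imageability is labelled as weak evidence for the null, which is why the matching of imageability across lists is described as approximate rather than conclusive.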

Procedure

The paradigm was programmed and presented using the Matlab Psychophysics Toolbox (Matlab Psychtoolbox-3; www.psychtoolbox.org)61. Participants sat approximately 100 cm from a 27-inch PC monitor, with refresh rate of 60 Hz, 1920 × 1080 resolution, and 32-bit colour depth.

Each trial (Fig. 5) began with a central grey fixation cross on a black background for 1500-ms, followed by a central white fixation cross for 200-ms, a blank screen for 1300-ms, and finally the word presented in lower case Arial (size 80) at the centre of the screen for 200-ms, followed by 1300-ms of blank screen. To promote participants’ attention to the meaning of the stimuli, 25% of trials were followed by presentation of a word definition, taken from web-based dictionaries, and the participant was required to judge whether the definition matched the preceding word. Responses were given via keyboard, with response hand counterbalanced across participants (i.e. left-hand to answer “correct” and right-hand to answer “incorrect” for half of the participants, and vice versa for the other half). Definitions matched the preceding word exactly half of the time. Stimulus order and the stimuli chosen for presentation of definitions were randomised. Due to a bug in the presentation script, the order of stimuli was identical for half of the participants. Nevertheless, the order of stimuli for those participants was unpredictable. At the end of every trial, a blank screen was presented for between 1000- and 2000-ms, selected on each trial from a uniform distribution.
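The trial structure can be expressed as a simple timeline with the word onset defined as 0 ms. The following is an illustrative sketch in Python rather than the Psychtoolbox presentation script: the names are hypothetical, the durations are those given above, and the conversion to monitor refreshes assumes the 60 Hz display described in this section.

```python
import random

REFRESH_HZ = 60

# (event label, duration in ms), in presentation order, as described above.
TRIAL_EVENTS = [
    ("grey fixation cross", 1500),
    ("white fixation cross", 200),
    ("blank screen", 1300),
    ("word", 200),
    ("blank screen", 1300),
]


def onsets_relative_to_word(events):
    """Return (label, onset_ms) pairs with the word onset defined as 0 ms."""
    word_index = [label for label, _ in events].index("word")
    word_onset = sum(duration for _, duration in events[:word_index])
    onsets, elapsed = [], 0
    for label, duration in events:
        onsets.append((label, elapsed - word_onset))
        elapsed += duration
    return onsets


def duration_in_frames(duration_ms, refresh_hz=REFRESH_HZ):
    """Number of monitor refreshes spanned by a duration."""
    return round(duration_ms * refresh_hz / 1000)


def sample_inter_trial_interval(rng):
    """Draw the end-of-trial blank duration (ms) from a uniform distribution."""
    return rng.uniform(1000, 2000)


onsets = onsets_relative_to_word(TRIAL_EVENTS)
word_frames = duration_in_frames(200)
iti_ms = sample_inter_trial_interval(random.Random(0))  # seed is arbitrary
```

On this timeline the grey fixation cross appears 3000 ms before the word, the white fixation cross 1500 ms before it, and each 200-ms event spans 12 refreshes of a 60 Hz monitor.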

Figure 5

Trial procedure with timings relative to stimulus presentation.

To improve the signal-to-noise ratio, participants completed four runs of the above procedure, resulting in 132 trials per condition. Across all 4 runs, a definition for each word was presented exactly once. Participants also completed a brief practice session of six trials to familiarise themselves with the structure of the task. Practice stimuli were the words rejected during the stimuli validation procedure described in the Stimuli section above.
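The trial counts implied by this design can be checked with a few lines of arithmetic. The variable names are hypothetical; the inputs (33 words per condition, two conditions, four runs, one definition per word) are those stated in the text.

```python
WORDS_PER_CONDITION = 33
CONDITIONS = 2
RUNS = 4

# Every word is shown once per run.
trials_per_condition = WORDS_PER_CONDITION * RUNS
total_trials = trials_per_condition * CONDITIONS

# Each word is followed by a definition exactly once across the four runs.
definition_trials = WORDS_PER_CONDITION * CONDITIONS
definition_proportion = definition_trials / total_trials

# Definitions match the preceding word half of the time.
matching_definitions = definition_trials // 2
```

The 66 definition trials out of 264 trials in total correspond to the 25% of trials followed by a definition, of which 33 present a matching definition.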

EEG pre-processing

We recorded EEG with a 128-channel Biosemi ActiveTwo system, with two additional electrodes recording from the mastoid processes. Data were sampled at 256 Hz and referenced to CMS (Common Mode Sense) and DRL (Driven Right Leg). Offline, the EEG signal was digitally filtered between 0.5 and 40 Hz, segmented into epochs from 1500-ms prestimulus until 1500-ms poststimulus, re-referenced to the average of the mastoids, and baseline-corrected to the 200-ms prestimulus period. All offline pre-processing was performed with a combination of the Matlab toolbox EEGLAB (version 14.0.0b62) and custom scripts.

Artefact rejection proceeded in three steps. First, channels and trials with excessive or non-stationary artefact were identified by visual inspection and discarded. Across participants, a median of 4 channels (range 0–12) and a median of 38.5 trials (range 11–73) were discarded. Second, we conducted Independent Component Analysis (ICA) of the remaining data (EEGLAB’s runica algorithm) to identify and remove components that described eye blinks and eye movements. Any previously removed channels were then interpolated back into the data. Finally, trials with artefacts that had not been effectively cleaned by the above procedure were identified with visual inspection and discarded.

Prior to analysis, all data were re-referenced to the average of all channels, and baseline corrected to the 200-ms prestimulus period. A median of 113.5 trials per participant contributed to each condition (Motor range: 93–127; Non-motor range: 98–127).

EEG/MRI co-registration

We recorded the electrode positions of each participant relative to the surface of the head with a Polhemus Fastrak device using the Brainstorm Digitize application (Brainstorm v. 3.463) running in Matlab. Furthermore, on a separate day, we acquired a T1-weighted anatomical scan of the head (nose included) of each participant with a 1 mm resolution using a 3 T Philips Achieva MRI scanner (32-channel head coil). This T1-weighted anatomical scan was then co-registered with the digitised electrode locations using FieldTrip64.

Sensor analyses: ERPs

Analyses of ERPs proceeded in two stages. First, we calculated the global field power65 of the grand average of all trials (i.e. both conditions together) to identify time-windows of interest. Global field power (GFP) is the summed square of voltages, and is a principled means of identifying component peak latencies from an orthogonal contrast. We then identified a time-window around each peak by inspecting the global dissimilarity65 – the mean of the root mean square of voltage differences between consecutive time-points, after the data have been scaled by the global field power. Deflections in the time-course of global dissimilarity therefore suggest boundaries between scalp topographies. Due to the focus on processing in support of semantic access, we selected four ERP topographies approximately within the first 400-ms post-stimulus: 106-ms–160-ms, 164-ms–203-ms, 207-ms–293-ms, and 297-ms–418-ms (see Fig. 6).
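The two measures can be illustrated with a minimal Python sketch. It assumes the common formulation in which GFP is the spatial standard deviation of the average-referenced voltages across electrodes, and global dissimilarity is the root mean square difference between two GFP-normalised topographies; the exact scaling used in the analysis scripts may differ. The function names are illustrative and are not part of EEGLAB or FieldTrip.

```python
import math


def gfp(topography):
    """Global field power of one scalp map (assumed: spatial standard deviation)."""
    n = len(topography)
    mean = sum(topography) / n
    return math.sqrt(sum((v - mean) ** 2 for v in topography) / n)


def normalise(topography):
    """Average-reference a map and scale it to unit GFP."""
    n = len(topography)
    mean = sum(topography) / n
    g = gfp(topography)
    return [(v - mean) / g for v in topography]


def global_dissimilarity(map_a, map_b):
    """Root mean square difference between two GFP-normalised maps."""
    a, b = normalise(map_a), normalise(map_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def gd_time_course(data):
    """Dissimilarity between consecutive time points.

    `data` is a list of topographies, one list of voltages per time point.
    """
    return [global_dissimilarity(data[t], data[t + 1])
            for t in range(len(data) - 1)]
```

Under this formulation the dissimilarity is 0 for two maps that differ only in strength and 2 for maps with inverted polarity, which is why peaks in its time-course can be read as transitions between topographies.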

Figure 6

Global Field Power (GFP) and Global Dissimilarity (GD) time-courses of the ERPs and mu and beta band-power estimates. Shaded areas represent the time-windows selected for subsequent analyses. Note that these are plots of grand average data across conditions, and are therefore orthogonal to the subsequent motor versus non-motor analyses.

ERPs within each time-window of interest were compared with the cluster mass method of the open-source Matlab toolbox FieldTrip (version 2016061964). First, for each participant and condition, we averaged the voltages at each electrode within the time-window of interest. Next, a two-tailed dependent samples t-test between conditions was conducted at each electrode. Spatially adjacent t-values with p-values passing the threshold (alpha = 0.05) were then clustered based on their spatial proximity. Spatial clusters were required to involve at least 4 neighbouring electrodes. To correct for multiple comparisons, a randomisation procedure produced 1000 Monte Carlo permutations of the above method to estimate the probability of the observed cluster under the null hypothesis66. We used a cluster alpha threshold of 0.025 because we tested for both positive and negative effects.
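The logic of the cluster mass procedure can be sketched in Python as follows. This is a simplified illustration and not FieldTrip's implementation: the neighbourhood structure, the critical t-value (`t_crit`), and all function names are assumptions supplied by the caller, the requirement of at least 4 neighbouring electrodes is interpreted here as a minimum cluster size, and only the largest cluster (by absolute mass) is evaluated. Swapping condition labels within a participant is implemented as a sign flip of that participant's difference values.

```python
import math
import random


def paired_t(diffs):
    """Dependent-samples t-value (assumes non-zero variance of the differences)."""
    n = len(diffs)
    mean = sum(diffs) / n
    var = sum((d - mean) ** 2 for d in diffs) / (n - 1)
    return mean / math.sqrt(var / n)


def cluster_masses(t_values, neighbours, t_crit, sign, min_size):
    """Summed t-values of connected supra-threshold electrodes of one polarity."""
    active = {e for e, t in enumerate(t_values) if sign * t > t_crit}
    masses, seen = [], set()
    for start in active:
        if start in seen:
            continue
        stack, members = [start], []
        seen.add(start)
        while stack:  # depth-first search over neighbouring electrodes
            e = stack.pop()
            members.append(e)
            for nb in neighbours[e]:
                if nb in active and nb not in seen:
                    seen.add(nb)
                    stack.append(nb)
        if len(members) >= min_size:
            masses.append(sum(t_values[e] for e in members))
    return masses


def cluster_permutation_test(cond_a, cond_b, neighbours, t_crit,
                             n_perm=1000, min_size=4, seed=0):
    """cond_a and cond_b hold one list of electrode values per participant."""
    rng = random.Random(seed)
    diffs = [[a - b for a, b in zip(sa, sb)] for sa, sb in zip(cond_a, cond_b)]
    n_elec = len(diffs[0])

    def largest_mass(d):
        t = [paired_t([s[e] for s in d]) for e in range(n_elec)]
        masses = (cluster_masses(t, neighbours, t_crit, +1, min_size)
                  + cluster_masses(t, neighbours, t_crit, -1, min_size))
        return max((abs(m) for m in masses), default=0.0)

    observed = largest_mass(diffs)
    null = []
    for _ in range(n_perm):
        flipped = []
        for s in diffs:
            flip = rng.choice((1, -1))  # swap condition labels within participant
            flipped.append([flip * v for v in s])
        null.append(largest_mass(flipped))
    p_value = sum(m >= observed for m in null) / n_perm
    return observed, p_value
```

Because the null distribution is built from the largest cluster in each permutation, a single comparison against it controls the family-wise error rate across electrodes.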

As we hypothesise that the neural representations of motor and non-motor verbs are not entirely overlapping, we also tested for differences in the scalp topographies across conditions with a randomisation test of global dissimilarity (see67). For each time-window of interest, we calculated the global dissimilarity (i.e. the root mean square difference in GFP-normalised voltages) between the grand-average topographies of the two conditions. We then estimated the probability of observing that global dissimilarity (or a value larger) under the null hypothesis. Specifically, we randomly shuffled data across conditions, while maintaining within-subject pairings of condition, and re-calculated global dissimilarity as above. The p-value is the proportion of global dissimilarities from 1000 randomisations that are larger than the observed global dissimilarity.
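The randomisation test described above can be sketched in Python as follows. The sketch assumes that global dissimilarity is the root mean square difference between average-referenced, GFP-normalised grand-average maps, and that shuffling while maintaining within-subject pairings means swapping the two condition labels within randomly chosen participants; the function names are illustrative only.

```python
import math
import random


def _normalise(topography):
    """Average-reference a map and scale it to unit global field power."""
    n = len(topography)
    mean = sum(topography) / n
    centred = [v - mean for v in topography]
    g = math.sqrt(sum(v ** 2 for v in centred) / n)
    return [v / g for v in centred]


def dissimilarity(map_a, map_b):
    """Root mean square difference between two GFP-normalised maps."""
    a, b = _normalise(map_a), _normalise(map_b)
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)) / len(a))


def grand_average(maps):
    """Average a list of per-participant topographies electrode by electrode."""
    return [sum(values) / len(maps) for values in zip(*maps)]


def diss_randomisation_test(cond_a, cond_b, n_perm=1000, seed=0):
    """cond_a and cond_b hold one topography per participant, in the same order."""
    rng = random.Random(seed)
    observed = dissimilarity(grand_average(cond_a), grand_average(cond_b))
    n_larger = 0
    for _ in range(n_perm):
        perm_a, perm_b = [], []
        for a, b in zip(cond_a, cond_b):
            if rng.random() < 0.5:  # swap labels within this participant
                a, b = b, a
            perm_a.append(a)
            perm_b.append(b)
        permuted = dissimilarity(grand_average(perm_a), grand_average(perm_b))
        if permuted > observed:
            n_larger += 1
    return observed, n_larger / n_perm
```

Because the maps are normalised before comparison, this test is sensitive to differences in the shape of the topography and not to differences in overall response strength.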

Sensor analyses: oscillations

To estimate power in each frequency band of interest (mu: 8–12 Hz; beta: 13–30 Hz) we filtered all individual trials within the band of interest (EEGLAB firls) and extracted the squared envelope of the signal (i.e. the squared complex magnitude of the Hilbert-transformed signal). We then averaged trials of the same condition within each participant’s data, and converted post-stimulus values to decibels relative to the mean power in a pre-stimulus baseline (−600 to −200 ms), selected so as not to be contaminated by temporally-smeared post-stimulus power estimates. Subsequent statistical procedures were identical to the ERP sensor analyses above. Due to previous evidence of late oscillatory changes during verb processing, we identified time-windows within the first 800-ms post-stimulus from the GFP and GD time-courses of the two frequency bands: mu: 66-ms–551-ms, 555-ms–785-ms; beta: 90-ms–160-ms, 164-ms–633-ms (see Fig. 6).
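The envelope and decibel steps can be illustrated with a minimal Python sketch. It assumes the input has already been band-pass filtered (the firls filtering step is not reproduced), computes the analytic signal with a direct discrete Fourier transform that is only practical for short signals, and uses hypothetical function names; it is not the analysis code.

```python
import cmath
import math


def analytic_signal(x):
    """Analytic signal of a real sequence via a direct (O(n^2)) DFT."""
    n = len(x)
    spectrum = [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
                for k in range(n)]
    half = n // 2
    gains = [0.0] * n          # negative frequencies are zeroed
    gains[0] = 1.0             # DC component is kept once
    if n % 2 == 0:
        gains[half] = 1.0      # Nyquist component is kept once
        upper = half
    else:
        upper = half + 1
    for k in range(1, upper):  # positive frequencies are doubled
        gains[k] = 2.0
    return [sum(gains[k] * spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)) / n
            for t in range(n)]


def squared_envelope(x):
    """Squared complex magnitude of the Hilbert-transformed signal."""
    return [abs(z) ** 2 for z in analytic_signal(x)]


def to_decibels(power, times_ms, baseline_ms=(-600, -200)):
    """Express power in dB relative to the mean power in the baseline window."""
    base = [p for p, t in zip(power, times_ms)
            if baseline_ms[0] <= t <= baseline_ms[1]]
    base_mean = sum(base) / len(base)
    return [10 * math.log10(p / base_mean) for p in power]
```

In this scheme a value of 0 dB means no change from baseline, and negative values correspond to an event related desynchronisation.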

Source analysis

We performed source analyses on data of 20 out of the 24 participants because we were unable to acquire anatomical MRI scans for the remaining four participants. From the subject-specific T1-weighted anatomical scans, individual boundary element head models (BEM; four layers) were constructed using the ‘dipoli’ method of the Matlab toolbox FieldTrip64. Digitised electrode locations were aligned to the surface of the scalp layer that was extracted from the segmented T1-weighted anatomical scans using fiducial points and head shape as reference points.

Data previously analysed at the sensor level were then projected onto the source level. To allow for direct statistical comparison between motor and non-motor verbs, the number of trials was balanced between conditions by randomly removing trials from the condition with more data (discarded trials: median 3, range 0–10) until both datasets had the same number of trials (median 112, range 94–125).
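Trial balancing of this kind can be sketched in a few lines of Python. The function name and the `seed` argument are illustrative assumptions; the sketch simply keeps a random subset of the larger condition so that both conditions end up with the same number of trials, preserving the original trial order.

```python
import random


def balance_trials(trials_a, trials_b, seed=None):
    """Randomly discard trials from the larger condition until counts match."""
    rng = random.Random(seed)
    n_keep = min(len(trials_a), len(trials_b))
    # Sample indices without replacement, then restore the original order.
    keep_a = sorted(rng.sample(range(len(trials_a)), n_keep))
    keep_b = sorted(rng.sample(range(len(trials_b)), n_keep))
    return [trials_a[i] for i in keep_a], [trials_b[i] for i in keep_b]
```

For example, with 115 trials in one condition and 112 in the other, three randomly chosen trials are removed from the first condition and the second is left unchanged.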

ERPs whole brain

For the ERP source estimate, we followed the analysis approach presented by Popov et al.68 (source reconstruction of the main ERP components using this analysis approach is described in detail in the Supplementary Material). Data were first filtered between 1 and 40 Hz, using a firws filter as implemented in the ft_preprocessing function of FieldTrip (using default parameters). Then, the sensor covariance matrix was estimated over a time window including pre- and post-stimulus time points of interest (−500 ms to +500 ms) and a common spatial filter (including trials of both conditions) was computed using a Linearly Constrained Minimum Variance (LCMV) beamformer69,70,71. Specific beamformer parameters were chosen based on the approach used by Popov et al.68, including a fixed dipole orientation, a weighted normalisation (to reduce the centre-of-head bias), and a regularisation parameter of 5% to increase the signal-to-noise ratio. This common spatial filter was then used for source estimation of motor and non-motor trials. The dipole moments of both conditions were extracted in the post-stimulus time window of interest that showed significant clusters on the sensor level (164–203 ms), and their absolute values were averaged over time points to obtain one average value per grid point (virtual electrode, VE). To test for significant differences between the motor and the non-motor condition, a cluster-based permutation test as implemented in the FieldTrip toolbox was computed over subjects. The overall brain response, averaged over 20 participants, is represented in Fig. 3.
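For orientation, the textbook unit-gain form of the LCMV spatial filter for a single fixed-orientation dipole is given below. This is a sketch of the general technique and not necessarily the exact computation performed by FieldTrip, in particular because the weight normalisation mentioned above is not included. The symbols are introduced here for illustration: C is the sensor covariance matrix, l the lead field of the grid point, N the number of sensors, and x(t) the sensor data. We assume that the 5% regularisation is expressed relative to the mean sensor variance.

```latex
\mathbf{C}_{\lambda} = \mathbf{C} + \lambda \mathbf{I},
\qquad
\lambda = 0.05 \, \frac{\operatorname{tr}(\mathbf{C})}{N},
\qquad
\mathbf{w}^{\top} = \frac{\mathbf{l}^{\top} \mathbf{C}_{\lambda}^{-1}}
                         {\mathbf{l}^{\top} \mathbf{C}_{\lambda}^{-1} \mathbf{l}},
\qquad
\hat{s}(t) = \mathbf{w}^{\top} \mathbf{x}(t)
```

Because the filter is computed from the covariance of both conditions together, any difference between the motor and non-motor source estimates cannot be attributed to differences in the spatial filter itself.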

Mu/Alpha oscillations whole brain

For the mu/alpha oscillation dataset, trials were defined as time windows from −800 ms to 1200 ms relative to stimulus presentation. To increase computational efficiency, individual trials were concatenated, resulting in one continuous data stream for each subject. The sensor covariance matrix was estimated over the whole data stream and a common spatial filter was constructed using a Linearly Constrained Minimum Variance (LCMV) beamformer69,70,71. We applied a fixed dipole orientation and a regularisation parameter of 3% to increase the signal-to-noise ratio. Then, the time courses of all virtual electrodes (VEs) were computed. To obtain mu/alpha power values, these time courses were Hilbert-transformed and their absolute values squared. Data were then epoched into individual trials of 1.6 s [−600 ms to 1000 ms], excluding the first and last 200 ms to avoid potential artefacts due to data discontinuities. Then, trials were split into motor and non-motor conditions and the average over trials within each condition was computed. After baseline-correcting the data to dB using the pre-stimulus time window [−600 ms to −200 ms], average values over the time window that showed a significant cluster on the sensor level between motor and non-motor conditions (555 ms–785 ms) were computed. Figure 4A shows the overall brain response to the difference in mu/alpha power between motor and non-motor verbs, averaged over 20 participants.

Automated Anatomical Labelling (AAL) analysis

We focused our analyses on the specific anatomical regions identified in the meta-analysis of Watson et al.33 described in the introduction to this manuscript, and on the bilateral precentral gyri based on evidence from the literature7,8,9,10,11. The resulting seven anatomical regions of interest (see Table 4 and Fig. 7 for anatomical details) were defined using the automated anatomical labelling (AAL) atlas (see72,73 for similar analyses with MEG and EEG data). One of these seven AAL regions, the left middle temporal gyrus (left MTG), was further subdivided into anterior (aMTG) and posterior (pMTG) middle temporal gyrus, since only the left pMTG was of interest in this study (cf.33). This was done by selecting only those left MTG VEs located posterior to the centre of mass of the left MTG AAL region.

Figure 7

Locations of the seven investigated AAL regions (A = anterior; P = posterior; L = left; R = right).

To investigate which AAL regions showed a significant difference between motor and non-motor trials at the ERP level, we took the average values obtained from the whole-brain analysis (see ERPs whole brain above) for all virtual electrodes (VEs) within each region and weighted them based on the Euclidean distance between each VE and the centre of mass of the respective AAL region (cf.72). To extract potential mu/alpha power differences between the motor and non-motor conditions at the source level, we first extracted the time courses of all VEs within each AAL region of interest and weighted them based on the Euclidean distance between each VE and the centre of mass of the respective AAL region (cf.72). Subsequently, the time courses of all VEs were Hilbert-transformed and the absolute values squared before summing across VEs. Further processing followed the procedure described above (Mu/Alpha oscillations whole brain). To test for statistical significance, paired-sample t-tests between the motor and non-motor conditions were performed for both the ERP and the mu/alpha datasets (20 vs. 20 values for each AAL region). Resulting p-values were further corrected for multiple comparisons using the False Discovery Rate (FDR74,75). For the AAL region showing significant differences between motor and non-motor conditions, additional one-sample t-tests were computed to test whether the means of the distributions differed significantly from 0.
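The multiple-comparison correction across AAL regions can be illustrated with a short Python sketch. It assumes the Benjamini-Hochberg step-up procedure at a false discovery rate of q = 0.05; the specific FDR variant and threshold used in the analysis are given by the cited references, and the function name is hypothetical.

```python
def fdr_bh(p_values, q=0.05):
    """Benjamini-Hochberg step-up procedure.

    Returns one boolean per input p-value, True where the null hypothesis
    is rejected at false discovery rate q.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k whose p-value satisfies p_(k) <= (k / m) * q.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * q:
            max_rank = rank
    # Reject every hypothesis ranked at or below that k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[idx] = True
    return rejected
```

With seven regions, for instance, the smallest p-value is compared against 0.05/7, the second smallest against 2 × 0.05/7, and so on, so the criterion becomes less strict as the rank increases.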

To test for evidence in favour of the null hypothesis, Bayes Factor analyses with default priors (r = 0.707) were carried out on the ERP and mu/alpha data for each AAL region separately, following the approach of76.

Plotting

All data plots were made with Matlab and edited in a desktop publisher. Colour palettes are taken from Color Brewer 2 (http://colorbrewer2.org/) or in-house customized colour maps. Source results were plotted using caret (http://www.nitrc.org/projects/caret/)77.