An ALE meta-analytical review of the neural correlates of abstract and concrete words

Several clinical studies have reported a double dissociation between abstract and concrete concepts, suggesting that they are processed by at least partly different networks in the brain. However, neuroimaging data do not seem to be in line with neuropsychological reports. Using the ALE method, we ran a meta-analysis on 32 brain-activation imaging studies that considered only nouns and verbs. Five clusters were associated with concrete words and four clusters with abstract words. When only nouns were selected, three left activation clusters were found to be associated with concrete stimuli and only one with abstract nouns (left IFG). These results confirm that concrete and abstract word processing involves at least partially segregated brain areas, the IFG being relevant for abstract nouns and verbs, while more posterior temporoparietal-occipital regions seem to be crucial for processing concrete words, in contrast with the neuropsychological literature that suggests an anterior temporal involvement for concrete words. We investigated the possible reasons that produce different outcomes in neuroimaging and clinical studies.

An advantage for concrete words as compared to abstract words has been demonstrated in a series of psycholinguistic studies. Neurologically unimpaired participants perform better on concrete than abstract words in free recall, cued recall, paired-associate learning and recognition; their reaction times in visual lexical decision are shorter with concrete than abstract words 1 . This effect is known as the "concreteness effect", and it increases in aphasic patients. This is especially evident in non-fluent aphasia, for example in patients with agrammatism 2 , where it has been found in spontaneous speech 3 , reading 4 , writing 5 , repetition 6 , naming 7 , and comprehension 8 . Several theories [9][10][11][12] have been proposed to explain this advantage of concrete words, but they share a common feature, namely a quantitative distinction between concrete and abstract concepts, with concrete items more strongly represented than abstract ones, either because they benefit from a verbal and visuo-perceptual representation 10 or thanks to a larger contextual support 12 or a larger number of semantic features 9,11 . For instance, the Dual Coding Theory 10 postulates that concrete concepts are supported by both perceptual and verbal representations, while abstract words are based exclusively on linguistic information. From the Dual Coding Theory perspective, the advantage of concrete compared to abstract concepts is attributed to the additional contribution of the sensorimotor systems triggered by imagery-based richer representations, presumably involving both hemispheres (not only the left hemisphere), and to a greater number of units activated in the semantic system for concrete words 10 . The hub-and-spokes model assumes that words are processed in a neural network containing one or more amodal hubs, sensorimotor modality-specific regions, and connections between them (cross-modal conjunctive representations) 13,14 .
Initially, it was hypothesized that the anterior temporal lobes (ATLs) were the main hub, but, later on, other potential high-order and low-order hubs have been introduced (e.g., left posterior cingulate cortex, dorsomedial pre-frontal cortex, inferior frontal gyrus, inferior parietal cortex, precuneus) 15,16 . From the Dual Hub Theory 17,18 perspective the ATL processes taxonomic knowledge (shared features, e.g., dog → wolf) while the temporo-parietal areas, including the posterior middle temporal gyrus (pMTG), are involved in thematic knowledge (contiguity relations based on co-occurrence in events or scenarios, e.g., dog-leash).
To account for the reversed concreteness effect, it has been proposed that abstract and concrete concepts are distinguished by the manner in which they are acquired, and by the relative weight of sensory-perceptual features in their representation 20 . An alternative explanation by Crutch and Warrington 35 , points to a fundamental
1. We exclusively selected papers that used only word stimuli and presented specific contrasts (concrete > abstract and abstract > concrete stimuli).
2. We used a different method, choosing the more popular Activation Likelihood Estimation [41][42][43] (ALE) as compared to the multilevel kernel density analysis (MKDA) 44 applied by Wang et al. 37 . MKDA and ALE produce similar results, both using the location (xyz-coordinates) of local maxima reported by the individual studies, but MKDA uses a spherical kernel whose radius is determined by the analyst 45 , while ALE applies a Gaussian kernel whose FWHM is empirically determined. Moreover, our analyses were conducted with the latest version of the GingerALE software, which rectified some of the previous limitations of this instrument (e.g., the frequently used FDR correction is no longer supported 43 ) and proposes new best-practice ALE recommendations such as the cluster-level family-wise error (FWE) corrected threshold of p < 0.05 46 .
Finally, this is also an update of the previous reviews, including publications from the last 10 years.
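The kernel difference between MKDA and ALE mentioned above can be illustrated with a minimal sketch. The function names, the 10 mm radius and the 10 mm FWHM are assumptions chosen for illustration only; they are not values used by MKDA, by GingerALE, or in this meta-analysis, and the Gaussian is scaled to a peak of 1 only to make the two shapes comparable.

```python
import numpy as np

def spherical_kernel(dist_mm, radius_mm):
    """MKDA-style indicator kernel: 1 inside the analyst-chosen sphere, 0 outside."""
    return (np.asarray(dist_mm, dtype=float) <= radius_mm).astype(float)

def gaussian_kernel(dist_mm, fwhm_mm):
    """ALE-style Gaussian kernel, scaled here to a peak of 1 for comparison."""
    sigma = fwhm_mm / np.sqrt(8.0 * np.log(2.0))  # standard FWHM-to-sigma conversion
    return np.exp(-np.asarray(dist_mm, dtype=float) ** 2 / (2.0 * sigma ** 2))

# Distances (mm) from a reported peak; the values are illustrative.
distances = np.array([0.0, 5.0, 10.0, 15.0])
sphere = spherical_kernel(distances, radius_mm=10.0)   # hard cut-off at the radius
gauss = gaussian_kernel(distances, fwhm_mm=10.0)       # smooth decay with distance
```

The spherical kernel weights every voxel within the radius equally, whereas the Gaussian kernel falls to half of its peak at a distance of FWHM/2 and keeps decaying smoothly beyond it.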

Materials and methods
The present systematic review was conducted under the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines 47 .
Studies selection. Our meta-analysis is based on 32 neuroimaging studies exploring the neural basis of concrete and abstract word processing, using either PET or fMRI on adult participants, published between January 1996 and February 2021. Studies were selected using four electronic databases: MEDLINE (accessed by PubMed, https://www.ncbi.nlm.nih.gov/pubmed), PsycARTICLES (via EBSCOHost, https://search.ebscohost.com), PsycINFO (via EBSCOHost) and Web of Science (https://webofknowledge.com/). The search terms used were: (1) "semantic decision", "semantic judgment"; "abstract words", "concrete words", "abstract concepts", "concrete concepts", "lexical decision" AND (2) "imaging", "MRI", "PET". Additional sources such as reference lists of included studies and relevant systematic reviews were also checked. Titles, abstracts, and full-text articles were screened and evaluated for eligibility based on the following criteria:
Inclusion criteria:
• Imaging technique: PET or fMRI,
• Reported stereotaxic coordinates (in the MNI or Talairach atlases),
• Whole-brain voxel-based data analyses,
www.nature.com/scientificreports/
Exclusion criteria:
• Region-of-interest analyses,
• Multiple single-case analyses,
• Sample population of minors,
• Sample population of neurological, brain-damaged, cognitively impaired or psychiatric patients,
• Only concrete > baseline or abstract > baseline contrasts,
• Articles from the gray literature (i.e., literature that is not formally published in sources such as books or journal articles, e.g. unpublished Ph.D. thesis),
• Presentations from international meetings with no specific data provided, perspective and opinion publications, case reports, series of cases, previous reviews or meta-analyses,
• Studies not published in, or translated into English,
• Phrases or sentences stimuli,
• Studies without adequate information (e.g., stereotaxic coordinates) to analyze the concrete vs. abstract contrasts and no reply from the authors after asking for the missing data.
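The eligibility criteria listed above can be expressed as a simple screening filter. This is only a sketch: the `StudyRecord` fields and the `is_eligible` and `screen` helpers are hypothetical names introduced for illustration, and the actual screening was done by the authors on titles, abstracts and full texts, not by code.

```python
from dataclasses import dataclass

@dataclass
class StudyRecord:
    """Hypothetical screening record; all field names are illustrative."""
    technique: str               # e.g. "PET", "fMRI", "EEG"
    has_coordinates: bool        # MNI or Talairach peaks reported (or supplied on request)
    whole_brain: bool            # False for region-of-interest analyses
    healthy_adults: bool         # False for minors or patient samples
    stimuli: str                 # "words", "phrases", "sentences"
    direct_contrast: bool        # concrete > abstract or abstract > concrete reported
    published_in_english: bool   # False for gray literature or untranslated work

def is_eligible(study: StudyRecord) -> bool:
    """Apply the inclusion and exclusion criteria as a single conjunction."""
    return (
        study.technique in {"PET", "fMRI"}
        and study.has_coordinates
        and study.whole_brain
        and study.healthy_adults
        and study.stimuli == "words"
        and study.direct_contrast
        and study.published_in_english
    )

def screen(studies):
    """Return only the records that satisfy every criterion."""
    return [s for s in studies if is_eligible(s)]
```

Keeping each criterion as a separate boolean field makes it easy to report, for a PRISMA-style flow diagram, which criterion excluded a given record.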
We used a general and broad initial search. We looked for publications that reported word stimuli and concrete > abstract or abstract > concrete MRI and PET contrasts. In a second step, we classified the included papers by type of knowledge, type of task, type of stimuli, and type of investigation method. These classifications led to exploratory sub-analyses, with the purpose of controlling confounding factors as much as possible. As previously specified, we looked for publications that reported the concrete > abstract words or abstract > concrete words contrast, without analyzing the exact strategies that the authors applied to divide word stimuli into the two categories. Often, the abstractness/concreteness constructs are operationalized in the papers based on two rating methods: (1) asking participants to classify a word as concrete taking into consideration the degree to which it refers to a tangible entity in the world (it has clear references to material objects); or (2) evaluating its imageability, i.e., the ease with which the word elicits a mental image. Generally speaking, words referring to something that exists in reality and can be experienced directly through the senses are considered concrete (e.g., animals, tools), while words whose meaning cannot be experienced directly but can be defined by other words, internal sensory experience, and linguistic information are classified as abstract (e.g., emotions, morality, social interaction, time).
After removing duplicates, research papers which did not satisfy the above criteria were excluded. For example, several studies focused on sentences or phrases 48,49 ; or reported only words > baseline contrasts 50 . The more conservative concrete > abstract and abstract > concrete contrast (as opposed to concrete > baseline or abstract > baseline contrasts) was chosen in order to avoid a variety of baselines that could range from resting state, fixation cross 51 to pseudowords 52 and number or letters 53 and could affect the interpretation of the results, since subtractions from different baselines create different activation patterns. We acknowledge that this type of contrast does not reveal which brain regions equally support the processing of both concrete and abstract words. The question that this meta-analysis can answer is which are the most replicated data in the literature (in terms of brain activation and words representation) when contrasting abstract and concrete words. Moreover, since concrete and abstract words can dissociate, we aimed at assessing in which anatomical correlates they differ, and not the common ones.
If the same data were reported in different publications, we chose the most recent one and with the highest number of participants 54,55 .
Uncertainties regarding some inclusions were solved by the authors through discussion. The PRISMA flow of information diagram was used to track the search process as presented in Fig. 1 and the main characteristics of the studies included in this meta-analysis are reported in Table 1.
Classification of the raw data before clustering analyses. From the selected papers, only the stereotactic coordinates representing the concrete > abstract or abstract > concrete contrasts were extracted. Following this procedure, we obtained 295 foci from a total sample of 535 participants. The stereotaxic coordinates reported in terms of the Talairach and Tournoux atlas 56 were transformed into the MNI (Montreal Neurological Institute) stereotaxic space 57 using the tal2icbm transforms implemented in the GingerALE software 41,43,58 .
For all the stereotaxic coordinates we extracted the relevant information about the statistical comparisons that generated them. More explicitly, we reported the MNI coordinates (MNI x,y,z), the name of the first author, the journal and the year of publication of the paper, the technique (PET or fMRI) and the stereotactic space used, the age of participants, the type of task, the nature of the contrast from which the peak was extracted, the statistical thresholds, the stimulus type (nouns or verbs) and the presentation modality (auditory or visual).
Clustering procedure. Once the set of MNI coordinates was obtained, the meta-analyses were carried out using the revised ALE algorithm 41,43 implemented in the GingerALE software Version 3.0.2 58 (http://brainmap.org/ale). The ALE algorithm aims to identify areas where the convergence of reported coordinates across experiments is higher than expected from a random spatial association. The logic behind this approach implies a spatial probability distribution modeled for each activation peak included in the dataset of interest. Reported foci are treated as centers of 3D Gaussian probability distributions capturing the spatial uncertainty associated with each focus 58 . The between-subject variance is weighted by the number of participants per study, since larger sample sizes should provide more reliable approximations of the "true" activation effect. The voxel-by-voxel union of these distributions is used as an activation likelihood map, subsequently tested for statistical significance against randomly generated sets of foci. ALE has been shown to be a reliable way of blending evidence from multiple studies 43 and has been used successfully in different fields (e.g., 59 ). More specifically, we used the following procedure:
• Anatomical filtering: we applied a first filtering of the coordinates using the most conservative (smallest) mask available in the GingerALE software, and 17 foci out of the total of 295 fell outside the mask.
• ALE maps (which quantify the degree of overlap in peak activation across experiments) were calculated using the modified ALE algorithm and the random-effects model 41,43 .
• Thresholding procedure: for each ALE calculation described below, significance was tested using 1000 permutations with a cluster-forming threshold of p < 0.001 (uncorrected). In order to control for false positives, significance was corrected with a cluster-level family-wise error threshold of p < 0.05 46 , as used by other meta-analytic studies 60 .
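The core of the ALE computation described above (Gaussian blobs around foci, combined as a voxel-wise union across experiments) can be sketched as follows. This is a simplified illustration, not the GingerALE implementation: `illustrative_fwhm` and its constants are assumptions that only mimic the idea that larger samples yield narrower kernels, the 8 mm³ voxel volume is assumed, and no permutation-based thresholding is included. Taking the voxel-wise maximum within an experiment follows the revised ALE algorithm, which prevents nearby foci from one experiment from accumulating.

```python
import numpy as np

FWHM_TO_SIGMA = 1.0 / np.sqrt(8.0 * np.log(2.0))

def illustrative_fwhm(n_subjects, template_mm=6.0, subject_mm=12.0):
    """ASSUMPTION: toy formula and constants; the kernel narrows as the sample grows."""
    return float(np.sqrt(template_mm ** 2 + subject_mm ** 2 / n_subjects))

def modeled_activation(grid_mm, foci_mm, fwhm_mm, voxel_mm3=8.0):
    """Per-experiment map: voxel-wise maximum over 3D Gaussians centred on the foci."""
    grid = np.asarray(grid_mm, dtype=float)   # shape (n_voxels, 3)
    foci = np.asarray(foci_mm, dtype=float)   # shape (n_foci, 3)
    sigma = fwhm_mm * FWHM_TO_SIGMA
    sq_dist = ((grid[:, None, :] - foci[None, :, :]) ** 2).sum(axis=2)
    density = np.exp(-sq_dist / (2.0 * sigma ** 2)) / (2.0 * np.pi * sigma ** 2) ** 1.5
    # Density times voxel volume approximates the probability mass in each voxel.
    return (voxel_mm3 * density).max(axis=1)

def ale_map(grid_mm, experiments):
    """Union across experiments: ALE = 1 - prod_i (1 - MA_i) at every voxel."""
    maps = np.stack([
        modeled_activation(grid_mm, foci, illustrative_fwhm(n))
        for foci, n in experiments
    ])
    return 1.0 - np.prod(1.0 - maps, axis=0)

# Toy example on a line of voxels along x (coordinates in mm); not data from the paper.
grid = [[x, 0, 0] for x in range(-20, 21, 2)]
experiments = [([[0, 0, 0]], 12), ([[2, 0, 0]], 20), ([[-16, 0, 0]], 10)]
ale = ale_map(grid, experiments)
```

In the toy example, the two experiments with neighbouring foci produce a higher ALE value around x = 0 to 2 mm than the isolated focus at x = -16 mm, which is the convergence that the permutation test then evaluates against randomly placed foci.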
Unfortunately, ALE cannot deal with designs involving multiple independent variables, and in this paper we intended to consider the role of different variables such as (1) stimulus type (nouns only, verbs only or all word stimuli), (2) modality of presentation (visual only, auditory only or both visual and auditory), and (3) task specificity (e.g., lexical, semantic tasks or all tasks). The ALE strategy we chose in this case was to consider separate sets of foci for each variable and run one meta-analysis for each of these sets when the number of papers was large enough. To this purpose, the overall dataset was divided a-posteriori into several subsets, which automatically implied running meta-analyses on a low number of foci (lowering the power). An important limitation of this approach is that we are not able to statistically assess the interaction between variables like stimulus type and task.
Table 1. Descriptive information of the 32 experiments included in the meta-analysis. Age is reported in years and when it was specified means and standard deviations are presented. p values (the statistical threshold for the neuroimaging univariate analysis conducted in the included papers) are reported as they were presented in the original articles; the exact value and the correction procedure were not always specified. ns, not specified.
The analyses were based on the following contrasts:
(1) An analysis including the activation peaks associated with word processing independently of the stimulus type and task.
(2) An analysis of peaks associated with noun processing only (because the number of studies including verbs only was too small (4 studies) for a specific analysis on this type of stimuli 70,76,82,83 ):
• concrete nouns > abstract nouns included 107 stereotactic activation loci from 15 studies (5 foci out of mask), 251 participants;
• abstract nouns > concrete nouns included 99 stereotactic activation loci from 18 studies (8 foci out of mask), 324 participants.
(3) An analysis including the activation peaks associated with word processing independently of the stimulus type (verbs, nouns or adjectives), but taking into consideration only visually presented stimuli.
(4) An analysis of peaks associated with lexical (words or non-words classification task) or semantic decision tasks (e.g., pleasantness decision task, answering a question about the stimuli), excluding all the studies based on memory tasks (2 studies), perceptual decision task (1 study), mental image generation (3 studies), and passive reading (2 studies):
• concrete > abstract word (only lexical and semantic tasks) included 114 stereotactic activation loci from 16 studies, 273 participants;
• abstract > concrete word (only lexical and semantic tasks) included 116 stereotactic activation loci from 17 studies, 289 participants.
We explored a-posteriori the role of (1) stimulus type (nouns), (2) modality of presentation (visual stimuli), and (3) task specificity (lexical and semantic tasks) in order to control as much as possible for each of these variables, i.e., to increase the accuracy of the results.
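The a-posteriori division into subsets can be sketched as a filter over the extracted foci, followed by a count of foci, studies and participants for each subset. The row layout, the `summarize_subset` helper and the `min_experiments` cutoff are hypothetical and introduced only for illustration; the toy rows are not data from the included studies.

```python
def summarize_subset(rows, min_experiments, **criteria):
    """Select foci whose fields match `criteria` and summarize the resulting subset.

    `rows` is a list of dicts, one per focus (hypothetical layout).
    `criteria` maps a field name to the set of accepted values.
    `min_experiments` is an analyst-chosen cutoff (assumption), used only to flag
    subsets that may be too small for a dedicated meta-analysis.
    """
    selected = [r for r in rows if all(r[k] in allowed for k, allowed in criteria.items())]
    # Count each study once, even when it contributes several foci.
    participants = {r["study"]: r["n"] for r in selected}
    return {
        "foci": len(selected),
        "studies": len(participants),
        "participants": sum(participants.values()),
        "enough_studies": len(participants) >= min_experiments,
    }

# Toy rows for illustration only.
rows = [
    {"study": "A", "n": 12, "contrast": "concrete>abstract", "stimulus": "nouns", "modality": "visual", "task": "lexical"},
    {"study": "A", "n": 12, "contrast": "concrete>abstract", "stimulus": "nouns", "modality": "visual", "task": "lexical"},
    {"study": "B", "n": 20, "contrast": "concrete>abstract", "stimulus": "verbs", "modality": "auditory", "task": "semantic"},
    {"study": "C", "n": 10, "contrast": "abstract>concrete", "stimulus": "nouns", "modality": "visual", "task": "memory"},
]
nouns_only = summarize_subset(rows, 2, contrast={"concrete>abstract"}, stimulus={"nouns"})
all_words = summarize_subset(rows, 2, contrast={"concrete>abstract"})
```

Because each variable is handled by a separate filter, a sketch like this makes explicit why interactions between variables (e.g., stimulus type by task) cannot be tested: every subset is analyzed on its own.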
One might argue against the inclusion of PET and fMRI studies in the same meta-analysis because of the substantial methodological differences between the two techniques in terms of experimental design, processing, spatial localization and cluster accuracy. Because the number of studies investigating concrete vs. abstract words is small, we decided to include data from both techniques in order to increase power in the analyses. Nevertheless, in Appendix A (supplementary materials), we present the data analysis after excluding all PET studies (figures and tables are numbered as e.g., Fig. 1A and Table 1A).
For anatomical labeling and figures, we capitalized on the Automatic Anatomical Labeling (AAL) template available in the MRIcron visualization Software (https:// www. nitrc. org/ proje cts/ mricr on).

Results
Once the appropriate studies were collected, we used activation likelihood estimation (ALE) to meta-analytically remodel available neuroimaging data.
Concrete > abstract meta-analysis. The GingerALE procedure run over the concrete words > abstract words set of coordinates identified a total of 5 clusters, with 1-4 individual peaks each, from 4 to 11 different studies (Fig. 2). Regions that were consistently activated across experiments were localized in the bilateral middle temporal gyrus and posterior cingulate, the left parahippocampal gyrus, left fusiform gyrus, bilateral precuneus and angular gyri, left superior occipital gyrus and left cerebellum culmen. The peak distribution for each significant cluster is reported in Table 2. A similar activation pattern, except for the right hemisphere involvement, was observed when only studies reporting exclusively noun stimuli were taken into consideration (concrete nouns > abstract nouns). We observed three left activation clusters (Fig. 3, Table 3) situated in the middle temporal gyrus, parahippocampal gyrus, posterior cingulate, precuneus, superior occipital gyrus, and culmen (left cerebellum anterior lobe).
The ALE procedure run over the concrete words > abstract words, visual stimuli only set of coordinates, identified a total of 5 clusters, with 1-6 individual peaks each, from 4 to 8 different studies (Fig. 4). Regions that were consistently activated across experiments were localized in the left middle temporal gyrus, bilateral posterior cingulate, and parahippocampal gyrus, left fusiform gyrus, bilateral precuneus and angular gyri, left superior occipital gyrus and left cerebellum culmen. The peaks distribution for each significant cluster is reported in Table 4.
A comparable activation pattern was observed when only studies based on lexical and semantic tasks were taken into consideration. The analysis indicated 4 activation clusters correlated with concrete words > abstract words-lexical and semantic tasks: bilateral middle temporal gyrus, left posterior cingulate and the left parahippocampal gyri, bilateral precuneus, left angular, left superior occipital gyrus and left cerebellum culmen (Fig. 5, Table 5).
Abstract > concrete meta-analysis. The revised ALE algorithm discriminated four clusters that correlated with abstract word processing in a healthy population (Fig. 6), from four to 12 different papers (Table 6). Our analyses identified a robust neural pattern of activity in the left frontal and temporal lobes, specifically, the inferior frontal gyrus, the superior and middle temporal gyri and left inferior parietal.
When only abstract nouns (abstract nouns > concrete nouns) were analyzed, the results indicated a single cluster with two peaks, from 9 studies, in the left inferior frontal gyrus (Fig. 7, Table 7).
We identified three clusters associated with abstract words processing in a healthy population when only studies reporting abstract visual stimuli were included (Fig. 8), from 4 to 12 different papers (Table 8). Our analyses revealed a robust neural pattern of activity in the frontal and temporal lobes, specifically, the inferior frontal gyrus and the superior and middle temporal gyri.
When only foci from lexical and semantic tasks were analyzed, the results indicated 2 clusters (with 1-4 individual peaks each, from 3 to 9 different studies), in the left inferior frontal gyrus, superior and middle temporal gyrus (Fig. 9, Table 9).
As previously specified, due to the very small number of studies we could not conduct sub-analyses based on the (1) verbs only, (2) other types of tasks present in the included publications like mental image generation, memory tasks, or perceptual decision task only; (3) auditory stimuli only.
In Table 10 the descriptive information for each sub-analysis is reported.
In Appendix A (supplementary materials) we present the analyses without PET data. Except for a few clusters that have a smaller number of voxels and one cluster that fragmented into two smaller ones (for the abstract > concrete contrast), all the other clusters are perfectly overlapped (for detail see from Tables 2A, 3 Table 10. A in which we presented the Brodmann area (BA) for the activated clusters with a brief description.

Discussion
As we pointed out in the introduction, neuropsychological studies suggest a role of the lateral prefrontal cortex in processing abstract words and of the left anterior temporal lobe in processing concrete ones. These data are not confirmed by neuroimaging studies. We ran a meta-analysis using more stringent criteria to assess whether imaging data support this segregation and in which components the two networks differ. Many variables could influence our findings concerning the neural correlates, such as the type of task, the type of stimuli, and the stimulus presentation modality. We tried to control for all these factors in order to obtain accurate results.
Since the number of studies was limited, we could not analyze data according to type of task (e.g., lexical decision task vs. semantic task), but at least we excluded those studies without a semantic or lexical decision task. The task performed during the fMRI scan is particularly relevant because the activation observed while passively hearing or reading words might be very different from the one observed during semantic judgments for concrete vs. abstract words (e.g., deciding which is better associated with a table: a chair or a bench?). Regarding the stimulus type, there is an ongoing debate concerning nouns vs. verbs in general 38 . This question becomes even more difficult when we try to separate abstract and concrete nouns, and abstract and concrete verbs (we could not analyze verbs separately because of the lack of studies). Both Wang et al. 37 and Binder et al. 40 combined different types of stimuli, e.g., words, sentences, fixed expressions such as idioms, and short stories, without further focusing on the stimulus type. Furthermore, since the objective of Binder et al. 40 was to investigate semantic processing in general and not the concrete and abstract distinction (although they ran a sub-analysis on these two categories), the activation peaks meta-analyzed were obtained from different contrasts: concrete and abstract stimuli > baseline, concrete > abstract and abstract > concrete stimuli. This choice is understandable given their objective, but the results could be biased by the type of contrast applied; indeed, discrepancies in the patterns of cortical activation across studies may be attributable, at least in part, to differences in baseline tasks, and hence reflect the limits of the subtractive logic. Thirty-two imaging studies were included, which evaluated the activation patterns in response to concrete and abstract concepts. All the data included in the ALE analysis are based on the general linear model (GLM).
We also looked for studies that used the more modern multivariate pattern analysis, i.e., a set of methods that analyze neural responses as patterns of activity 90 , in order to have a separate dataset with this type of methods. Unfortunately, we found a very small number of publications preventing a further meta-analytic procedure 48,91,92 .
The results of this meta-analysis, consistent with those of previous ones 37, 40 , confirmed that concrete and abstract word processing relies, at least in part, on different brain regions. Based on the currently available data we could not investigate the existence of overlapping networks between concrete and abstract words. The ALE procedure was completely data-driven, without a prior theoretical basis, and the results are constrained only by the nature of our data (e.g., the limited temporal resolution of the neuroimaging techniques, the correlational nature of the data), and by our inclusion/exclusion criteria. As previously mentioned, experiments testing for greater activation for concrete than abstract words (concrete words > abstract words) converge in the temporo-parieto-occipital regions; namely, the left middle temporal gyrus, left fusiform, left parahippocampal and lingual gyri, bilateral angular gyrus and precuneus, bilateral posterior cingulate, left superior occipital gyrus and left culmen in the cerebellum. The neuroimaging evidence indicates that concrete concept processing is at least partly associated with the perceptual system, and also relies on mental imagery (precuneus, superior occipital gyrus). Binder et al. 40 found significant overlap for concrete stimuli in the angular gyrus bilaterally, left mid-fusiform gyrus, left posterior cingulate, and left dorsomedial prefrontal cortex (DMPFC). With the exception of the DMPFC, whose involvement might be related to the stimulus complexity and/or different baselines, all the other regions are confirmed by our data. At variance with Wang et al.'s meta-analysis 37 , we found a bilateral involvement of the posterior cingulate cortex (PCC), angular gyrus and precuneus. Although involved in many semantic-based tasks, the function of the PCC in semantic cognition is still debated. The following hypotheses have been proposed: (1) this region could act as a supramodal convergence zone 40 , (2) PCC activation could reflect the greater engagement of an imagery-based perceptual system for concrete stimuli, or (3) the PCC might be an interface between semantic knowledge and episodic memory 91 . The precuneus also seems associated with visuospatial imagery, a hypothesis supported by experiments conducted on episodic memory retrieval and linguistic tasks which required the processing of high imagery words or mental image generation 83 . The same regions were found when only nouns were considered (concrete nouns > abstract nouns contrast), with the difference that the right hemisphere activation disappeared. The two right hemisphere clusters might be specifically correlated with action verbs, but this result could also be a consequence of the lack of power due to the limited number of studies (15 studies in the nouns dataset vs. 22 in the noun-and-verb dataset).
Figure caption (fragment): [...] Table 3. The images are presented in neurological convention.
Table 3. Concrete > abstract nouns clusters. Included 107 stereotactic activation loci from 15 studies, 251 participants, chosen min. cluster size 720 mm³. All the values and labels were extracted from the GingerALE output files. Clusters are ordered for decreasing volume size. Coordinates (x, y, z) are in the MNI space. H = Hemisphere; ALE = activation likelihood estimation; Nr. = number of studies that contributed to each cluster; L = left; BA = Brodmann area; ** = between brackets are the number of foci from each study that contributed to that specific cluster.
The results on abstract words replicated those reported by Wang and colleagues 37 and Binder et al. 40 ; higher activation for abstract compared to concrete words conditions (abstract words > concrete words) is more frequently reported in a left lateralized network, encompassing the inferior frontal gyrus (IFG, Brodmann areas 45, 47), a very small portion of the precentral gyrus, the superior and middle temporal gyri, and inferior parietal. They are also in line with the results observed in brain-damaged patients.
It has been suggested that the ventrolateral prefrontal cortex (VLPFC) implements semantic control in two steps 93 . Step 1 constitutes controlled access to stored representations when bottom-up input is not sufficient. Step 2 operates at post-retrieval and is thought to bias competition among representations that have been activated during Step 1. According to Badre and Wagner 94 , both steps recruit VLPFC, though different parts of it, with BA 45 involved in Step 2. In other words, IFG activation could reflect a higher level of semantic control processes (additional resources), since abstract stimuli might require semantic selection, inhibition of irrelevant cues, effortful integration, top-down control and working-memory related processes 95 , in agreement with the context availability theory 96 . In line with this hypothesis, this region showed greater activation for abstract words when a judgment task was performed following irrelevant cues and reduced activation when semantic decisions were made with contextual help, supporting the idea that this area responds more strongly to abstract words because their meanings are inherently more variable and require more control during linguistic processing as compared to the concrete ones 53,97 . An alternative explanation is offered by Della Rosa and colleagues 98 ; using a lexical decision task, they found that the left IFG was particularly active during presentation of words characterized by low imageability and low context availability. The authors' interpretation was that this area could be a functional convergence zone between imageability and context availability, differentiating abstract from concrete concepts. In neuroimaging studies, besides the IFG, additional clusters were found in the left superior and middle temporal lobe. However, when only nouns were considered (and not verbs), these clusters lost significance, supporting the idea that the cerebral networks dedicated to noun and verb processing might be slightly different.
On the other hand, results on concrete words do not support neuropsychological data. Indeed, apart from several single case reports, a study comparing the behavioral variant of frontotemporal dementia (FTD), in which there is a predominant prefrontal atrophy, with the semantic variant, characterized by anterior temporal atrophy, showed that while the former group had an increased concreteness effect, a reversed effect was found in the semantic variant group. Similarly, patients with left anterior temporal lobe (ATL) resection show the same pattern of reversed concreteness effect 33 .
One possible reason for these inconsistent results is the type of task; the neuroimaging studies used pleasantness judgments, memory tasks, lexical decision, etc., while, in general, patients are examined by means of naming and comprehension tasks and, occasionally, also semantic judgments. Orena et al. 36 , for example, using direct electrical stimulation (DES) for brain mapping during awake surgery, found no behavioral differences between BA 44 and BA 38 stimulation while patients performed a lexical decision task, but they registered a dissociation between abstract and concrete words during a concreteness judgment task; in particular, abstract words were impaired during stimulation of BA 44 and concrete words during BA 38 stimulation. Neuroimaging studies are often hard to compare, and many variables could influence the reported results, such as the duration of the stimulus presentation, the number of stimuli, and the stimulus types. For example, within the same type of experiment, some studies presented a large number of stimuli (e.g., 164 nouns 74 ), while in other studies only four words were repeated for more than 140 trials 78 . Moreover, the selected stimuli varied greatly among studies, encompassing emotions, mind states, living and nonliving things, of different frequency of use, age of acquisition and imageability.
Table 4. Concrete > abstract words-visual stimuli-clusters. Included 121 stereotactic activation loci from 18 studies, 301 participants, chosen min. cluster size 616 mm³. All the values and labels were extracted from the GingerALE output files. Clusters are ordered for decreasing volume size. Coordinates (x, y, z) are in the MNI space. H = Hemisphere; ALE = activation likelihood estimation; Nr. = number of studies that contributed to each cluster; L = left; BA = Brodmann area; ** = between brackets are the number of foci from each study that contributed to that specific cluster; R = right.
In addition, many studies used "concreteness" and "imageability" interchangeably, although these are in fact two distinct properties that can differently affect naming and recall [99][100][101] .
We also controlled for presentation modality. No relevant differences were observed between the analysis combining auditory and visual stimuli and the analysis restricted to visually presented words (see Figs. 5, 9). This may be partly due to the very small number of studies using auditory stimuli (only 5 studies out of 32).
Another relevant element is the participants' age, since aging can modify neural organization due to neuroplasticity 102 . With two exceptions 69,77 in which the participants' mean age was > 70, all other studies included a young population with a mean age < 30 (see Table 1). Neuropsychological studies (on patients) involve a different population, with ages ranging from 55 to 75. Information obtained from healthy young people may not be optimal for interpreting data from elderly, brain-damaged patients.
According to Eickhoff et al. 46 , the statistical power of the current meta-analysis to detect not only large, but also small- and medium-sized effects can be considered acceptable. Nevertheless, meta-analytic power is intrinsically limited by the amount of currently available data, especially for two sub-analyses: (1) concrete nouns > abstract nouns, with only 15 independent experiments, and (2) lexical and semantic task, concrete words > abstract words, with 16 studies. This indicates that, in these two cases, we cannot properly control the influence of individual experiments and that we might have failed to detect small effects. Another limitation is related to the sample size of the included experiments, which ranged from 6 to 28 participants. We acknowledge the need to consider only well-designed and controlled studies but, taking into account the limited number of papers, we were forced to include data from studies with uncorrected p values (see Table 1), risking subtle activation differences that may underlie abstract-concrete differences. This meta-analysis is focused on how representations of abstract and concrete words are processed in the brain. Regarding this last point, future research should better understand the specific role of each region within the semantic network, how they are connected, and specify how task and stimuli characteristics interact and modify activation patterns.
Considering the main question, we can confirm that concrete and abstract words involve at least partially segregated brain areas, the IFG being relevant for abstract nouns and verbs; in contrast, we could not find evidence of ATL involvement for concrete items. Our data indicate a more posterior activation for concrete words in regions that are often correlated with mental imagery processes. This meta-analysis seems to support the hypothesis that abstract and concrete words have partly separate neural correlates, but the specific features that differentiate these two classes of stimuli are still open to discussion. The cortical regions that are commonly activated in imaging studies investigating concrete and abstract words seem more congruent with the Dual Coding Theory 10 , i.e., concrete words have richer representations, depending on both hemispheres. Regarding the hub-and-spoke model (hub regions interacting with modality-specific processing areas), we observed activation patterns in areas considered neural crossroads of the semantic network, like the posterior cingulate region, the anterior temporal lobe, and the left inferior frontal gyrus 91,98,103 , but these data cannot be interpreted in the frame of this theoretical model.
The lack of converging evidence from clinical neuropsychological and neuroimaging data might be explained by several variables, such as task and stimulus type, differences in age and brain plasticity between the two populations (young vs. elderly people), etc. These discrepancies deserve further investigation, for example by

Table 6. The images are presented in neurological convention.

Figure 9. Clusters activated by the abstract > concrete words, semantic and lexical task contrast. The crosses are centered on the areas corresponding to the stereotactic coordinates reported in Table 9. The images are presented in neurological convention.