Introduction

Hallucinations, which are a hallmark of schizophrenia, also occur with a significant incidence in the general population1,2,3, although their prevalence rate varies widely across studies. Certain cognitive mechanisms have been demonstrated to underlie both hallucinations in non-clinical individuals and those experienced by patients with schizophrenia, suggesting a continuum from normality to pathological experience4,5,6,7. One of the shared cognitive bases of non-clinical and clinical hallucinations appears to be dysfunction in source monitoring8,9,10.

Source monitoring is a broad concept which encompasses various overlapping functions such as reality monitoring/discrimination (the ability to distinguish imagined from perceived events), self-monitoring (the ability to recognize one’s overt or covert productions as one’s own), and source memory (the ability to remember, rather than identify, the origin of information). Impairment in various types of source-monitoring processes in individuals with hallucinations has been studied through a plurality of paradigms. Notably, signal detection tasks have been used to study reality-discrimination processes. It has been repeatedly observed that a liberal response bias, reflecting a tendency to make false detections of auditory signals that were not emitted, is related to hallucinations in schizophrenia patients11,12,13 and to hallucination proneness in non-clinical individuals11,14,15,16,17,18,19. Auditory verbal imagery20 and prior expectations21,22 have been demonstrated to play a crucial role in the false detection of speech in hallucination-prone individuals. Impairments in source memory have been demonstrated by altered response bias in recognition tasks. In such tasks participants are not required to detect stimuli but rather to remember after a delay whether stimuli have been previously presented. Hallucinations in schizophrenia patients have been found to be associated with a liberal response bias in word recognition23,24,25 and picture recognition26,27, reflecting a tendency to falsely remember words or pictures which had not been presented in the encoding phase. A liberal response bias in word recognition25,28 and picture recognition27 was also found to be associated with hallucination proneness in non-clinical individuals. Two studies used the Deese–Roediger–McDermott paradigm, which induces false recognition of words, to investigate hallucination proneness in healthy participants. The authors did not calculate a response bias but did report that auditory hallucination proneness was correlated with increased rates of false recognitions of words strongly associated with target words29 and of non-associated words with negative emotional valence30.

Like false detections, false recognitions may be seen as stemming from self-monitoring failure, with inability to discriminate internally-generated from externally-produced stimuli. Indeed, non-target stimuli in the recognition test may seem familiar because of shared features with internal representations of words or images, and therefore they are mistaken for stimuli presented at encoding. As far as verbal hallucinations are concerned, false recognitions of non-target words might result from the mistaking of inner verbal productions for verbal experimental material. In line with this, a recent study in which we contrasted high- and low-frequency words revealed that verbal hallucinations in patients, as well as hallucination proneness in healthy individuals, were specifically related to the false recognition of high-frequency words25. In a recognition test, individuals presenting verbal hallucinations might fail to distinguish the words previously presented in the experimental target list from those that seem familiar on account of their readily available internal representation through inner speech, as might be the case with the high-frequency words. According to the source-monitoring framework, confusion between imagination and perception may arise from dysfunctional judgment (i.e., internal/external comparison) processes, or from the fact that internal events either present the characteristics of perceived events or lack those of imagined events8. Thus, such failure in the self-monitoring of verbal material might stem from an altered threshold in evaluation/decision processes, i.e., a tendency to laxly give the status of perception to internally-produced verbal events. Alternatively, independent of the integrity of the internal/external comparison processes, inner verbal productions might be abnormally salient and thereby less distinguishable from externally-presented words.

Brain studies have demonstrated across a variety of paradigms that verbal hallucinations are associated with abnormal activity and lateralization of temporal auditory regions, and with altered functional connectivity within the language network31,32,33. While dysfunction of the auditory and language processing regions might generally underlie voice hearing, similarities and differences in neural alterations between clinical and non-clinical hallucinations, and between state and trait hallucinations, are not clearly delineated. According to a recent review, decreased asymmetry seems to be observed also in non-clinical voice hearers; activity of the left superior temporal gyrus might differentiate state vs. trait verbal hallucinations31. Studies that have examined the cortical underpinnings of reality-monitoring/self-monitoring processes suggest that verbal hallucinations are further associated with dysfunction in various regions involved in the appraisal and monitoring of self-generated information34,35,36,37. These regions include the anterior cingulate cortex as well as subregions of the temporal lobe (involved in recollective experience and feeling of familiarity) and anterior prefrontal cortex.

In this neuroimaging study of a non-clinical sample, we aimed to determine whether the previously observed role of word frequency in the association between false recognitions and verbal hallucinations25 corresponded to differential cortical processing, which would strengthen the view that inner speech is implicated in this symptom. False recognitions are known to activate recollective brain regions largely overlapping with those activated by correct recognitions, notably in the parietal and temporal cortex, while prefrontal regions supporting monitoring and cognitive control processes have been found to be more active during false than correct recognitions38. fMRI studies have revealed that low-frequency words elicit greater activation of the left inferior frontal gyrus and other language-related brain regions than do high-frequency words, allegedly reflecting the fact that high-frequency words are more easily accessed and require less phonological and semantic processing39. We contrasted high- and low-frequency words and hypothesized that verbal hallucination proneness would be associated with increased rates of false recognitions of high-frequency words. Verbal hallucination proneness was further expected to be associated with activation of language-related regions, as well as regions involved in recollection and decision processes, during false recognitions of non-presented words. We investigated whether the putative specific association with false recognitions of high-frequency words corresponds to a distinct cerebral activation pattern for the high-frequency words when compared to that for the low-frequency words. In hallucination-prone participants, false recognitions of high-frequency words might be associated with abnormal activation of decisional brain regions, which would suggest altered self-monitoring processes. On the other hand, the high-frequency words might undergo atypical processing at the language level, suggesting abnormality of inner speech.

In order to demonstrate that impairment in the monitoring of inner speech is a specific underpinning of verbal hallucinations rather than a general feature of psychosis, we also studied potential associations of response bias and cerebral activity with two factors similarly involved in psychotic experience, namely visual imagery and delusion proneness. No association with increased rates of false recognitions or increased activity in language brain regions was expected for either of them. Potential associations with activity in other brain areas, such as visual, recollective, and decisional areas, were explored. With respect to word frequency effects, visual imagery was expected to impact the cortical processing of high-frequency words, which were concrete and lent themselves readily to the formation of visual mental images, in contrast to uncommon words. The cortical impact of delusion proneness was not expected to be modulated by word frequency.

Method

Participants

Thirty-seven (18 female) Spanish-speaking participants with normal or corrected-to-normal vision were recruited from the general population by means of announcements: age: mean = 38.8, sd = 11.3; education level: median = 5 [1 = no studies; 2 = uncompleted primary studies; 3 = completed primary studies; 4 = high school uncompleted; 5 = high school completed; 6 = undergraduate studies; 7 = bachelor’s or master’s degree; 8 = doctorate]; verbal IQ (Word Accentuation test40): mean = 103.1, sd = 7.8. The inclusion criteria were age between 18 and 60 years and fluency in Spanish. The exclusion criteria were neurological or mental illness, intellectual disability, head injury, alcohol or drug abuse in the past six months, and current severe physical disease, as well as the standard exclusion criteria for participation in fMRI procedures, namely claustrophobia and metallic implants including fitted pacemakers and cochlear implants. The study was approved by the ethics committee of the Parc Sanitari Sant Joan de Déu, Barcelona, Spain, and it was conducted in accordance with the guidelines and regulations relevant for experimental studies of human subjects. All participants provided written informed consent before the task was administered.

Scales for hallucination proneness, delusion proneness, and visual imagery

Hallucination proneness was assessed by means of a Spanish adaptation of the Launay-Slade Hallucination Scale (LSHS41,42), a self-questionnaire which measures proneness to hallucinatory experiences in various modalities. Two additional items were mixed with the LSHS items, although they were not taken into account in the computation of the LSHS score: ‘I can easily identify animals or things in the clouds’, and ‘When I see spots (of painting, humidity…), I can see faces, silhouettes, or objects in them’. Similar to the LSHS items, each of these new items had to be rated from 0 to 3 by the participants according to the frequency of the experience. The total score obtained on these two items constituted a visual imagery score (m = 2.16, sd = 1.38; range: 0–5). This visual imagery subscale, meant to assess the spontaneity and abundance of visual imagery rather than its vividness, had been validated and already used in a previous study of healthy participants43. A global hallucination proneness score was tallied by adding up the sub-scores for all LSHS items excluding the two new items (m = 6.73, sd = 5.19; range: 0–19). In addition, a verbal hallucination proneness score was computed by adding up the sub-scores obtained on the corresponding items (‘In the past, I have had the experience of hearing a person’s voice and then found that no one was there’, ‘I often hear a voice speaking my thoughts aloud’, and ‘I have been troubled by hearing voices in my head’) (m = 0.73, sd = 1.19; range: 0–4). Proneness to delusions was assessed by means of the Peters Delusion Inventory scale44 (m = 7.9, sd = 7.6; range: 0–33).

The verbal hallucination and delusion proneness scores did not follow normal distribution; they were normalised by square root transformation before data analysis.
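The transformation described above can be sketched as follows. This is a minimal illustration only: the function name and the example scores are hypothetical and are not taken from the study data.

```python
import math


def sqrt_transform(scores):
    """Square-root transform non-negative scale scores to reduce positive skew."""
    if any(s < 0 for s in scores):
        raise ValueError("square-root transformation requires non-negative scores")
    return [math.sqrt(s) for s in scores]


# Hypothetical verbal hallucination proneness scores (illustrative, not study data).
example_scores = [0, 0, 1, 4, 0, 2]
transformed = sqrt_transform(example_scores)
```

Because both scales have a minimum of 0, the square root is defined for every participant and no offset is needed.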

Material

Six lists of 24 concrete nouns, equivalent in the total number of syllables but differing in the word frequency of use (Corpus de Referencia del Español Actual), were constructed. Three lists included high-frequency words (e.g., book, dress) (average frequency per million: m = 123.5, m = 122.9, and m = 120.3, respectively) and 3 included low-frequency words (e.g., apron, whistle) (average frequency per million: m = 2.96 for each list). Two high-frequency and 2 low-frequency lists were used as targets and the remaining two lists (1 high- and 1 low-frequency) were used as distractors. Two of the target lists (1 high- and 1 low-frequency), referred to as ‘read’ lists, were to be read by the participant, and the other two (1 high- and 1 low-frequency), referred to as ‘heard’ lists, were to be read aloud by the experimenter. Each target list was assigned to the ‘read’ or ‘heard’ condition in a counterbalanced way. The order of the conditions (heard-read-heard-read or read-heard-read-heard) and type of list (high-high-low-low, or low-low-high-high) was counterbalanced as well.

Procedure

The scales and fMRI task were administered in one session of approximately 2 h. The participants received financial compensation.

Outside scanner

The four target lists (1 high-frequency/heard, 1 high-frequency/read, 1 low-frequency/heard, 1 low-frequency/read), each displayed on a sheet of paper, were presented and the participants were instructed to memorize the words. In the ‘read’ condition they were required to read the word list aloud once, while in the ‘heard’ condition the experimenter read the word list aloud once. In order to prevent a floor effect, we split the lists into two half-lists. After the reading or hearing of each half-list (12 words), the participants were required to write down as many words as they could remember.

Inside scanner

Read/heard discrimination task: The 96 target words were presented on the screen, one by one and in random order. The participants had to press one of two keys to indicate, after each word, whether it had been ‘read’ or ‘heard’ [The results of this task will not be reported on in this paper].

Old/new recognition task: Forty-eight target words (12 from each target list) and 48 distractors (24 high- and 24 low-frequency words) were presented in pseudo-random order, one by one for 3.5 s, separated by fixation crosses with random durations between 5.5 and 9 s extracted from an exponential distribution, with mean = 6.68 s. The participants had to press one of two keys to indicate, after each word, whether it had been previously presented in the target lists or was new.
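One way to generate fixation durations with the properties stated above is a shifted exponential distribution truncated to the 5.5–9 s window. The sketch below is an assumption about how such a jitter could be produced, not a description of the software actually used. The rate parameter (0.6 per second) is not reported in the text; it is chosen here only because it yields a theoretical mean close to 6.68 s.

```python
import math
import random

LOW, HIGH = 5.5, 9.0  # fixation bounds in seconds (from the text)
RATE = 0.6            # assumed exponential rate (1/s); not reported in the text


def truncated_exp_mean(low, high, rate):
    """Theoretical mean of low + Exp(rate), truncated to [low, high]."""
    span = high - low
    return low + 1.0 / rate - span / math.expm1(rate * span)


def sample_fixation(rng, low=LOW, high=HIGH, rate=RATE):
    """Draw one fixation duration by rejection sampling."""
    while True:
        duration = low + rng.expovariate(rate)
        if duration <= high:
            return duration


rng = random.Random(0)  # fixed seed so the sketch is reproducible
durations = [sample_fixation(rng) for _ in range(96)]  # one fixation per trial
```

With these assumed parameters, `truncated_exp_mean(LOW, HIGH, RATE)` is approximately 6.68 s, and short fixations are more frequent than long ones, which is the usual motivation for exponential jitter in event-related designs.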

Before the task began, two short lists of words (1 ‘heard’ and 1 ‘read’) were presented as practice outside the scanner. In the scanner, a few practice trials were also administered.

fMRI data acquisition

MRI data for the participants were acquired with a General Electric 1.5 Tesla Signa HDe scanner (General Electric Healthcare, Milwaukee, WI, USA) at Parc Sanitari Sant Joan de Déu, using an 8-channel head coil. For each participant, a high-resolution T1-weighted FSPGR structural image with the axial plane parallel to the AC-PC axis was acquired using the following parameters: 2 mm slice thickness, TR = 12.24 ms, TE = 3.84 ms, FOV = 24 cm, acquisition matrix = 512 × 512, flip angle = 20°, voxel size = 0.47 × 0.47 × 2.00 mm3. A T2*-weighted functional echoplanar imaging sequence depicting BOLD contrast was also obtained. In total, 294 volumes were collected with AC-PC axial orientation, with the following scanning parameters: 26 slices, 4 mm thickness, 1 mm gap, TR = 2000 ms, TE = 40 ms, FOV = 24 cm, acquisition matrix = 64 × 64, flip angle = 90°, voxel size = 3.75 × 3.75 × 5.00 mm3. The first 7 volumes in each run were discarded to allow for magnetic saturation effects. Visual stimuli were presented on a rear projection screen and viewed through a mirror mounted on the head coil, and all responses were collected with an MR-compatible response box (fORP, Current Designs, Inc., USA; www.curdes.com).
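The reported voxel sizes follow from the field of view and acquisition matrix given above. The short check below is only a worked arithmetic sketch; the helper name is hypothetical, and reading the 5.00 mm functional slice dimension as slice thickness plus gap is an inference that happens to match the reported values.

```python
def in_plane_voxel_mm(fov_cm, matrix_size):
    """In-plane voxel edge (mm) = field of view (mm) / matrix size."""
    return fov_cm * 10.0 / matrix_size


# Functional EPI: FOV = 24 cm, matrix = 64 x 64
functional_in_plane = in_plane_voxel_mm(24, 64)

# Structural T1: FOV = 24 cm, matrix = 512 x 512
structural_in_plane = in_plane_voxel_mm(24, 512)

# Functional through-plane spacing: 4 mm slice thickness + 1 mm gap (inferred)
functional_through_plane = 4 + 1
```

The results (3.75 mm, about 0.47 mm, and 5 mm) agree with the voxel sizes stated in the acquisition parameters.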

fMRI data preprocessing

Imaging data were analyzed using SPM12 (Wellcome Department of Imaging Neuroscience, London; www.fil.ion.ucl.ac.uk/spm) running under MATLAB (Release 2009a, The MathWorks, Inc., Natick, Massachusetts). All of the functional volumes for each participant were spatially realigned to the mean image in each series, in order to correct for small head movements. Motion parameters were examined for each subject to ensure that no movements larger than the voxel size were present. The resulting series were warped into MNI space using isotropic voxels (3 × 3 × 3 mm3) with SPM’s standard normalization procedure, and then spatially smoothed using a Gaussian kernel of 8 mm full-width-at-half-maximum.

fMRI data analysis

The preprocessed fMRI data were analyzed with an event-related model, using SPM12. In order to assess random effects at the individual level, the activity associated with the experimental conditions was modelled with a hemodynamic response function (HRF) and its time derivative. Displacement and rotation motion parameters were included as confounds in the individual model. A 200 s high-pass filter cut-off was used to remove low frequency noise, together with a first-order autoregression model to correct for temporal autocorrelation.

Four event types were determined by the responses in the old/new recognition task: target words identified as old (correct recognitions), distractors identified as new (correct rejections), target words erroneously identified as new (omissions), and distractors erroneously identified as old (false recognitions). Linear contrasts were constructed to test the experimental effects of interest. These contrasts were entered into a second level analysis in which subjects were treated as a random effect.
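The four event types can be expressed as a simple mapping from stimulus status and response. The sketch below is illustrative only; the function and label names are hypothetical and do not come from the authors' analysis scripts.

```python
def classify_trial(is_target, responded_old):
    """Map one old/new recognition trial to one of the four event types."""
    if is_target:
        # Target word: 'old' is a correct recognition, 'new' is an omission.
        return "correct_recognition" if responded_old else "omission"
    # Distractor: 'old' is a false recognition, 'new' is a correct rejection.
    return "false_recognition" if responded_old else "correct_rejection"


# Example: a distractor that the participant reports as previously presented.
example_event = classify_trial(is_target=False, responded_old=True)
```

In an event-related model, each trial onset would then be assigned to the regressor matching its event type, separately for high- and low-frequency words.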

The resulting statistical parametric maps were generated using a cluster-defining threshold at voxel level defined by p < 0.001 and a cluster-level threshold defined by a family-wise-error (FWE) corrected p < 0.05. When necessary, a more restrictive FWE-corrected threshold at voxel level was used to separate brain activity clusters that extended across several brain structures.

Measures

The numbers of correctly recalled high- and low-frequency words in the free recall task were tallied. The numbers of high- and low-frequency target words correctly reported as targets in the recognition task, as well as the numbers of high- and low-frequency distractor words erroneously reported as targets (false recognitions), were recorded with the computer, as were the response times for each type of response. The numbers of correctly and erroneously reported words were combined to compute a recognition efficiency index, Pr, reflecting the ability to discriminate target words from distractors (Pr = rate of correct recognitions − rate of false recognitions), and a response bias index, Br, reflecting the tendency to report distractor words as targets (Br = rate of false recognitions / (1 − Pr))45. Pr and Br indices were computed for each type of word (high frequency, low frequency), and an averaged Pr index was derived (Pr-global).
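As a minimal sketch of the two indices, assuming Pr is the hit rate minus the false recognition rate and Br is the false recognition rate divided by (1 − Pr), the computation for one word type could look as follows. The function name and the example counts are hypothetical; the denominators of 24 reflect the 24 targets and 24 distractors per frequency level in the recognition task.

```python
def recognition_indices(hits, false_recognitions, n_targets, n_distractors):
    """Return (Pr, Br) for one word type.

    Pr = hit rate - false recognition rate (discrimination).
    Br = false recognition rate / (1 - Pr) (response bias).
    Br is undefined when Pr == 1; None is returned for Br in that case.
    """
    hit_rate = hits / n_targets
    fr_rate = false_recognitions / n_distractors
    pr = hit_rate - fr_rate
    br = None if pr >= 1.0 else fr_rate / (1.0 - pr)
    return pr, br


# Hypothetical participant: 18 of 24 targets recognized, 6 of 24 distractors endorsed.
pr_example, br_example = recognition_indices(18, 6, 24, 24)
```

A Br value above 0.5 indicates a liberal bias (a tendency to respond 'old' when uncertain), and a value below 0.5 a conservative bias.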

Statistical design

Behavioural data

First, the word frequency effects were tested by contrasting the numbers of correctly recalled high- vs. low-frequency words, as well as the Pr indices for high- vs. low-frequency words (t-tests).
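Since each participant contributes a score for both word types, these contrasts can be read as paired comparisons; this is an interpretation, consistent with the 36 degrees of freedom reported for 37 participants in the Results. A minimal sketch of a paired t statistic, with hypothetical data, is:

```python
import math
import statistics


def paired_t(x, y):
    """Return (t, df) for a paired-samples t-test on two equal-length lists."""
    if len(x) != len(y):
        raise ValueError("paired samples must have the same length")
    diffs = [a - b for a, b in zip(x, y)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)  # sample SD (n - 1 denominator)
    t_value = mean_diff / (sd_diff / math.sqrt(n))
    return t_value, n - 1


# Hypothetical recall counts for four participants (illustrative, not study data).
high_freq = [5, 7, 6, 9]
low_freq = [3, 6, 4, 6]
t_value, df = paired_t(high_freq, low_freq)
```

The p value would then be obtained from the t distribution with `df` degrees of freedom, for example with a statistics package.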

Regression analyses were then conducted on the response bias for high- and low-frequency words, and on the response times for the false recognitions of high- and low-frequency words. A regression analysis was computed for each variable, with the rating scale scores (LSHS, delusion proneness, and visual imagery) and four socio-demographic measures (age, sex, education level, and verbal IQ) as predictors. These latter measures were entered to control for their potentially confounding effect on the investigated associations, as sociodemographic factors have been demonstrated to have an impact on verbal memory46,47. In the event that a significant association with the LSHS score was observed, a post-hoc analysis of the effect of hallucination proneness was conducted. The regression analysis of the variable was recomputed after replacing the LSHS score with the verbal hallucination proneness score to test the hypothesis that verbal hallucinations were specifically involved in false recognitions.

Neuroimaging data

FMRI analyses were conducted on the correctly recognized high- and low-frequency target words and on the false recognitions of high- and low-frequency distractor words. In a first set of analyses, the verbal hallucination proneness score was entered as covariate along with the visual imagery score to determine the effect of each while controlling for their potential overlap. The Pr-global index was also entered in the model to control for the impact of recognition efficiency. Lastly, only sex and verbal IQ—which were found to impact cerebral activity in our previous studies—were entered to reduce the number of covariates. Preliminary analyses of the potential effect of each socio-demographic variable did not reveal any association of age or education level with cerebral activity in any of the contrasts studied. Then, two other sets of analyses were conducted on the same contrasts to determine the specificity of the cerebral activity associated with verbal hallucination proneness. The verbal hallucination proneness score in the model was replaced first by the LSHS score and then by the delusion proneness score, while the same other covariates were used.

Results

Behavioural data

T-tests revealed that the participants recalled significantly more high- than low-frequency words (m = 23.3, sd = 5.2 vs. m = 19.7, sd = 3.2; t(36) = 5.98, p < 0.0001). On the other hand, they demonstrated greater recognition of the low- than of the high-frequency words (m = 0.56, sd = 0.17 vs. m = 0.51, sd = 0.18; t(36) = 2.56, p < 0.015).

Br-high frequency

Regression analysis indicated that both visual imagery and delusion proneness scores made a near-zero contribution to the response bias, and so they were removed from the predictors. A regression analysis involving only LSHS score and the socio-demographic measures revealed that the LSHS score was a significant predictor of Br-high frequency, in the sense that global hallucination proneness was associated with an increased tendency to make false recognitions of non-presented high-frequency words, as expected (β = 0.42, p < 0.05). The post-hoc analysis indicated that, when the verbal hallucination proneness subscore was entered in this model instead of the LSHS score, it also made a significant contribution to Br-high frequency (β = 0.38, p < 0.05). Education level made a trend contribution to Br-high frequency in this latter model (β = 0.48, p < 0.09), while no significant association emerged for age (β = 0.42, p > 0.10), sex (β = 0.09, p > 0.10), or verbal IQ (β = − 0.25, p > 0.10).

Br-low frequency

The LSHS score was strongly associated with the response bias for the low-frequency words (β = 0.74, p < 0.015) while neither visual imagery (β = − 0.41) nor delusion proneness (β = − 0.21) score made any significant contribution to it (p > 0.10 in both cases). When the verbal hallucination proneness score was entered in the model instead of the LSHS score in the post-hoc analysis, its contribution to the response bias did not reach statistical significance (β = 0.41, p > 0.10), and no significant association was observed for visual imagery (β = − 0.21), delusion proneness (β = − 0.18), age (β = − 0.12), sex (β = 0.27), education level (β = 0.11), or verbal IQ (β = 0.17) (p > 0.10 in all cases).

Response time for the false recognitions of high-frequency words

The LSHS score did not make any contribution to the response time (β = − 0.05, p > 0.84) while the delusion proneness score was negatively associated with it (β = − 0.66, p < 0.01). No significant association with the visual imagery score was observed (β = 0.14). Education level made a significant (β = − 0.68, p < 0.05) and age a trend (β = − 0.49, p < 0.08) contribution to the response time, while sex (β = 0.07) and verbal IQ (β = 0.04) were unrelated to it.

Response time for the false recognitions of low-frequency words

No association with the LSHS score was observed (β = − 0.17, p > 0.52). The delusion proneness score was again negatively associated with the response time (β = − 0.56, p < 0.025), while the visual imagery score was positively associated with it (β = 0.56, p < 0.05). Education level tended to make a contribution to the response time (β = − 0.57, p < 0.06), while age (β = − 0.32), sex (β = 0.16), and verbal IQ (β = − 0.25) were not significantly associated with it (p > 0.10 in all cases).

Neuroimaging data

The results of the analyses conducted with the verbal hallucination proneness score as covariate are reported in Tables 1 and 2. Verbal hallucination proneness was significantly associated with activation of language areas (left Heschl’s gyrus, Broca’s area) and of the anterior cingulate during false recognitions of non-target words, as expected, while an association with activation of a recollective area, the left angular gyrus, emerged during correct recognitions of target words. However, these associations with correct and false recognitions were observed only for the low-frequency words (see Fig. 1). With respect to visual imagery score, associations with decreased activation of various brain areas were observed, and they pertained to the correct and false recognitions of high-frequency, but not low-frequency words, as expected (see Fig. 2). In particular, the left planum temporale and Broca’s area were under-activated during the false recognitions of these words, as were the posterior cingulate and right cerebellum (crus 1).

Table 1 Brain activation areas significantly associated with each covariate (verbal hallucination proneness, visual imagery, sex, verbal IQ, and Pr-global) during the correct recognition of high- and low-frequency words in the 37 participants.
Table 2 Brain activation areas significantly associated with each covariate (verbal hallucination proneness, visual imagery, sex, verbal IQ, and Pr-global) during the false recognition of high- and low-frequency words in the 37 participants.
Figure 1

Activation clusters positively associated with verbal hallucination proneness score during correct recognitions (yellow) and false recognitions (red) of low-frequency words, while controlling for visual imagery score, sex, verbal IQ, and Pr-global. The slices depicting MNI coordinates derive from peak activations for the low-frequency word contrasts reported in Tables 1 and 2.

Figure 2

Activation clusters negatively associated with visual imagery score during correct recognitions (purple) and false recognitions (green) of high-frequency words, while controlling for verbal hallucination proneness score, sex, verbal IQ, and Pr-global. The slices depicting MNI coordinates derive from peak activations for the high-frequency word contrasts reported in Tables 1 and 2.

When the LSHS score was entered in the model instead of the verbal hallucination score, the analyses did not reveal any significant activation associated with this global hallucination score during the correct or false recognitions of either high- or low-frequency words. When the delusion proneness score was entered in the model instead of the verbal or global hallucination score, delusion proneness was found to be significantly associated with increased activation of various brain areas—right cerebellum, left middle frontal gyrus, bilateral middle temporal gyrus—during the correct recognitions of low-frequency words and the false recognitions of high-frequency words (see Table 3).

Table 3 Brain activation areas significantly associated with delusion proneness during the correct and false recognition of words, in the 37 participants.

Discussion

Verbal hallucination proneness

Verbal hallucinations are the most commonly observed type of hallucination in patients with schizophrenia, and they are assumed to stem from self-monitoring failure through which inner speech is misattributed to an external source48,49,50,51,52. Inner speech has also been linked to auditory-verbal hallucination proneness in non-clinical individuals53,54,55. One factor potentially relevant to the study of inner speech is the sense of familiarity conveyed by the verbal material that is being processed, and we therefore varied the frequency of use of the experimental words. As expected, proneness to hallucinations in the verbal modality was significantly associated with increased rates of false recognition of the high-frequency, but not the low-frequency words. This observation extends a finding previously observed in schizophrenia patients to the general population25.

The pattern of differential associations with cerebral activity confirms the implication of word familiarity in false recognitions. Verbal hallucination proneness was associated with activation of language and decisional brain regions during the false recognition of non-target words. However, these associations pertained only to the low-frequency words. During the false recognitions of these words, hallucination proneness was indeed associated with significant activation of left Heschl’s gyrus and Broca’s area, involved in speech perception and speech production, respectively. Cortical regions engaged in language reception and language production are consistently activated during auditory-verbal hallucinations56, and left Heschl’s gyrus in particular has been proposed as a key region for this symptom57,58. No similar activation was observed for the high-frequency words. One interpretation is that inner speech might be over-salient in verbal hallucination-prone individuals, and therefore the feeling of familiarity conveyed by common words might be so sharp that their processing requires little activation of language-related areas.

Further, verbal hallucination proneness was significantly associated with activation of the anterior cingulate cortex during the false recognition of low-frequency words. The anterior cingulate cortex is involved in decision-making in conflictual situations59. In schizophrenia patients it has been found to be involved in the appraising of errors60,61. Previous neuroimaging studies have revealed activation of the anterior cingulate during false recognition of faces in healthy participants62, and false memories for pictures in healthy participants63 and schizophrenia patients64. Interestingly, it was reported that schizophrenia patients with verbal hallucinations, in contrast to healthy participants and non-hallucinating patients, failed to activate the anterior cingulate during the appraisal of self/alien speech35 and the generation of inner speech65. In our verbal hallucination-prone participants, a similar lack of significant activation of this brain region during false recognition of non-target high-frequency words suggests that these words, exceedingly accessible, were confidently sensed as having been recently presented, while the expected cognitive conflict arose for the judgement of the non-target low-frequency words. The anterior cingulate might be crucially involved in the cognitive biases associated with hallucinations. Indeed, a recent review identified this brain structure as a shared neural mechanism of aberrant salience and source monitoring in psychosis66.

The examination of correct recognitions further demonstrates the differential processing of familiar vs. uncommon words in verbal hallucination-prone individuals. Indeed, during correct recognitions of low-frequency words, verbal hallucination proneness was associated with activation of the lingual gyrus, engaged in the visual processing of words67, as well as with activation of a brain region involved in memory retrieval, namely the left angular gyrus68. No similar activations were observed during correct recognitions of high-frequency words, which suggests that these correct recognitions arose more from a guess than from an authentic retrieval. Thus, the pattern of behavioural and neuroimaging findings indicates that in verbal hallucination-prone individuals, the experimental words that seemed familiar, be they targets or distractors, were liberally endorsed as previously presented (i.e., perceived) words without implementation of the necessary linguistic, recollective, and decisional processes. It is worth noting that the differential pattern of cerebral activity and response bias as a function of word frequency is specific to hallucination proneness in the verbal modality. Indeed, the global hallucination proneness score was significantly associated with liberal response bias for both types of word, and it was not associated with any cerebral activity for either. A technical point should be made that the emergence of a dissociated pattern in verbal hallucination-prone participants is likely to have been facilitated by the experimental procedure of intermixing high- and low-frequency words in the recognition list, thereby increasing their differential processing. Less distinctive results might have been observed if pure recognition lists of high-frequency and of low-frequency words had been contrasted.

Previous cognitive studies that have employed other paradigm types have similarly demonstrated involvement of impaired self-monitoring of inner speech in non-clinical hallucinations14,29,30,69,70,71. Confusion between inner speech and perception appears to be a shared underpinning of clinical and non-clinical verbal hallucinations, supporting a continuum model of verbal hallucinations72,73,74,75,76. This confusion may stem from defective self-monitoring comparison processes, impeding appropriate evaluation of internal vs. external production. In our study, though, the fact that decisional processes were adequately implemented for judging the low-frequency words suggests that self-monitoring comparison processes were not intrinsically defective but rather that they failed to be recruited for the judging of the familiar words. Self-monitoring errors might also result from abnormal salience of inner speech through increased vividness or increased abundance of this material. Within the source monitoring framework, abnormal vividness of inner speech would make it seem more similar to perceived speech; abnormal abundance of inner speech, reflecting an easy production without cognitive effort, would make it seem dissimilar to cognitively-produced internal events, and therefore more likely to be mistaken for perception8,77,78,79. Abnormal salience of inner speech might be a characteristic of individuals presenting clinical or non-clinical verbal hallucinations, while a clinical hallucinatory level might be reached when self-monitoring disruption further occurs. It should be kept in mind that cognitive mechanisms other than inner speech misattribution, such as intrusive memories and cognitive disinhibition5,80, are also likely to participate in the formation of verbal hallucinations.

Visual imagery

Meanwhile, visual imagery, which also contributes to psychotic experience, was not associated with increased rates of false recognition of either type of word, and its pattern of associations with cerebral activity was entirely distinct from that observed for verbal hallucination proneness. Visual imagery relies largely on the same cortical bases as visual perception81, and it appears to have an impact on verbal processing. Indeed, various fMRI studies which contrasted concrete vs. abstract words or manipulated word imageability have demonstrated that the processing of highly imageable concrete words was associated with activation of visual-related brain regions82,83. A study focused on visual imagery revealed that only a subgroup of individuals who demonstrated high visual imagery propensity activated a visual brain region during the processing of common concrete words, suggesting that they had formed a visual mental image of the designated object43. In the current study, visual imagery selectively impacted the judgement of high-frequency words, as expected, and it was associated with decreased, rather than increased, cerebral activity. In particular, during false recognitions of high-frequency words, higher visual imagery scores were associated with decreased activation of two regions involved in memory retrieval, the posterior cingulate cortex and the cerebellum-crus I, and of two verbal areas, namely the left planum temporale and Broca’s area. Individuals with high visual imagery scores probably formed visual mental images of the familiar words that were presented, thereby de-activating the brain regions usually recruited for verbal processing. This observation is compatible with studies which used the Deese–Roediger–McDermott paradigm and demonstrated that the instruction to form visual mental images of target words at encoding resulted in fewer false recognitions of non-target words84,85,86.

Delusion proneness

High levels of delusion proneness did not lead to increased rates of false recognitions of words, corroborating what was observed in another healthy sample25. This suggests that misattribution of inner speech is not a mechanism involved in this symptom. Our behavioural data reveal that delusion proneness was associated, rather, with faster response times when making false recognitions of both high- and low-frequency words. These short response times might reflect the ‘jumping-to-conclusions’ bias and overconfidence in incorrect memories consistently observed in delusional schizophrenia patients and non-clinical delusion-prone individuals87,88,89. At the cortical level, delusion proneness was associated with activation of the right cerebellum-crus I, a brain region engaged in autobiographical memory retrieval90, during false recognition of high-frequency words. It was further associated with bilateral activation of the middle temporal gyrus, which is involved in semantic processing91,92. False recognitions in delusion-prone individuals might result from semantic and reasoning abnormalities rather than from any deficiency in the monitoring of self-generated information.

Limitations and conclusions

Our conclusions are limited by the low incidence of verbal hallucination proneness in the sample and the restricted range observed for this symptom score. The differential processing of high- vs. low-frequency words ought to be tested in a large sample of verbal hallucination-prone individuals. It should be noted, though, that the analyses revealed significant associations of cerebral activation with the verbal but not the global hallucination score, in spite of the much wider range of the latter score. Another important limitation is that the manipulation of word frequency can only be assumed to tap into the processes engaged in inner speech. At the methodological level, the fact that the word presentation format differed between encoding and recognition may have affected the results to some extent. Nonetheless, our combined behavioural and neuroimaging data corroborate the view that proneness to verbal hallucinations in non-clinical individuals, similar to verbal hallucinations in schizophrenia patients, hinges on a decreased ability to distinguish inner speech from perceived verbal information. With respect to the other psychosis-related factors that we investigated, visual imagery was associated with deactivation of language-related brain areas during false memories of highly imageable words, while proneness to delusions appears to be associated with hasty decisions rather than with increased rates of false memories.