Trait-level paranoia shapes inter-subject synchrony in brain activity during an ambiguous social narrative

Individuals often interpret the same event in different ways. How do personality traits modulate patterns of brain activity evoked by a complex stimulus? Participants listened to a narrative during functional MRI describing a deliberately ambiguous social scenario, designed such that some individuals would find it highly suspicious, while others would find it less so. Using inter-subject correlation analysis, we identified several brain areas that were differentially synchronized during listening between participants with high and low trait-level paranoia, including theory-of-mind regions. Event-related analysis indicated that while superior temporal cortex responded reliably to mentalizing events in all participants, anterior temporal and medial prefrontal cortex responded to such events only in high-paranoia individuals. Analyzing participants’ speech as they freely recalled the narrative revealed semantic and syntactic features that also scaled with paranoia. Results indicate that trait paranoia acts as an intrinsic ‘prime’, producing different neural and behavioral responses to the same stimulus across individuals.

That two individuals may see the same event in different ways is a truism of human nature. Examples are found at many scales, from low-level perceptual judgments to interpretations of complex, extended scenarios. This latter phenomenon is known as the "Rashomon effect" 1 after a 1950 Japanese film in which four eyewitnesses give contradictory accounts of a crime and its aftermath, raising the point that for multifaceted, emotionally charged events, there may be no single version of the truth.

What accounts for these individual differences in interpretation? Assuming everyone has access to the same perceptual information, personality traits may bias different individuals toward one interpretation or another. Paranoia is one such trait, in that individuals with strong paranoid tendencies may be more likely to assign a nefarious interpretation to otherwise neutral events. While paranoia in its extreme is a hallmark symptom of schizophrenia and other psychoses, trait-level paranoia, like many experiences and behaviors associated with psychiatric illness, exists as a continuum rather than a dichotomy 2,3. On a behavioral level, up to 30 percent of people report experiencing certain types of paranoid thoughts (e.g., 'I need to be on my guard against others') on a regular basis 4 and trait-level paranoia in the population follows an exponential, rather than bimodal, distribution 5.

Few neuroimaging studies have investigated paranoia as a continuum between normality and pathology; the majority simply contrast healthy controls and patients suffering from clinical delusions. However, a handful of reports from subclinical populations describe patterns of brain activity that scale parametrically with tendency toward paranoid or delusional ideation. For example, it has been reported that higher-paranoia individuals show less activity in the medial temporal lobe during memory retrieval and less activity in the cerebellum during sentence completion 6, less activity in temporal regions during social reflection 7 and auditory oddball detection 8, but higher activity in the insula and medial prefrontal cortex during self-referential processing 9 and differential patterns of activity in these regions as well as the amygdala while viewing emotional pictures 10.

Such highly controlled paradigms enable precise inferences about evoked brain activity, but potentially at the expense of real-world validity. For example, brain response to social threat is often assessed with decontextualized static photographs of unfamiliar faces presented rapidly in series (see 11 for a review). Compare this to threat detection in the real world, which involves perceiving and interacting with both familiar and unfamiliar faces in a rich, dynamic social context. Paranoid thoughts that eventually reach clinical significance usually have a slow, insidious onset, involving complex interplay between a person's intrinsic tendencies and his or her experiences in the world. In studying paranoia and other trait-level individual differences, then, it is important to complement highly controlled paradigms with more naturalistic stimuli.

Narrative is an attractive paradigm for several reasons. First, narrative is an ecologically valid way to study belief formation in action.
Theories of fiction posit that readers model narratives in a Bayesian framework in much the same way as real-world information 12, and story comprehension and theory-of-mind processes share overlapping neural resources 13. Second, a standardized narrative stimulus provides identical input, so any variation in interpretation reflects individuals' intrinsic biases in how they assign salience, learn and form beliefs. Third, from a neuroimaging perspective, narrative listening is a continuous, engaging task that involves much of the brain 14 and yields data lending itself to innovative, data-driven analyses such as inter-subject correlation 15

Our primary approach for analyzing the fMRI data was inter-subject correlation (ISC), which is a model-free way to identify brain regions responding reliably to a naturalistic stimulus across subjects 15,16. In this approach, the timecourse from each voxel in one subject's brain across the duration of the stimulus is correlated with the timecourse of the same voxel in a second subject's brain. Voxels that show high correlations in their timecourses across subjects are considered to have a stereotyped functional role in processing the stimulus. The advantage of this approach is that it does not require the investigator to have an a priori model of the task, nor to assume any fixed hemodynamic response function.

In a first-pass analysis, we calculated ISC at each voxel across the whole sample of n = 22 participants, using a recently developed statistical approach that relies on a linear mixed-effects model with crossed random effects to appropriately account for the correlation structure embedded in the data 21. Results are shown in Fig. 2.
As expected given the audio-linguistic nature of the stimulus, ISC was highest in primary auditory cortex and language regions along the superior temporal lobe, but we also observed widespread ISC in other parts of association cortex, including frontal, parietal, midline and temporal areas, as well as the posterior cerebellum. These results replicate previous reports that complex naturalistic stimuli induce stereotyped responses across participants in not only the relevant primary cortex, but also higher-order brain regions 14,15,22.

Also as expected, ISC was generally lower or absent in primary motor and somatosensory cortex, although we did observe significant ISC in parts of primary visual cortex, despite the fact that there was no timecourse of visual input during the story. (To encourage imagery and engagement, we had participants fixate on a static photograph that was thematically relevant to the story during listening, so the observed ISC in visual cortex may reflect similarities in the timecourse of internally generated visualization across participants.)

Trait-level paranoia modulates neural response to the narrative

Having established that story listening evokes widespread neural synchrony across the entire sample, we next sought to determine if there were brain regions whose degree of ISC was modulated by trait-level paranoia. Using a median split of GPTS-A scores, we stratified our sample into a low-paranoia (GPTS-A ≤ 18, n = 11) and high-paranoia (GPTS-A ≥ 19, n = 11) group (Fig. 1b). We then used the same linear mixed-effects model described above formulated as a two-group contrast to reveal areas that are differentially synchronized across paranoia groups.

We were primarily interested in two contrasts. First, which voxels show greater ISC among pairs of high-paranoia participants versus low-paranoia participants, or vice versa?
This contrast reveals regions that have a stereotyped response in one group but not the other, suggesting a more prominent role for that region in processing the stimulus. Second, which voxels show greater ISC among pairs of participants within the same paranoia group (i.e., high-high and low-low) than across groups (high-low)? This contrast reveals regions that have equally stereotyped responses within each group, but whose response timecourses are qualitatively different across groups, suggesting that such regions may be responding to different events within the stimulus depending on paranoia level. These contrasts are schematized in Fig. 1c.

Searches for these coordinates on Neurosynth, an automatic fMRI results synthesizer for mapping between neural and cognitive states 23, indicated that for the left temporal pole and right anterior mPFC clusters, top meta-analysis terms included "mentalizing", "mental states", "intentions", and "theory mind". There were no regions showing a statistically significant difference in the reverse direction (low-paranoia > high-paranoia).

In the second contrast, a set of regions also emerged as being more synchronized within paranoia groups than across paranoia groups. These were the left middle occipital gyrus

Table 1.) For example, if the high-paranoia participants have better overall attentional and cognitive abilities, they might simply be paying closer attention to the story, inflating ISC values but not necessarily because of selective attention to ambiguous or suspicious details. However, there were no differences between high- and low-paranoia participants on any of the cognitive tasks we administered (verbal IQ, vocabulary, fluid intelligence or working memory), making it unlikely that observed differences are due to trait-level differences in attention or cognition.
As for state-level attention during the story, there was no relationship between paranoia and number of comprehension questions answered correctly, total word count during the recall task, or self-report measures of engagement and attention. We also explored potential imaging-based confounds, and found that paranoia was not related to amount of head motion during the scan (as measured by mean framewise displacement), number of censored frames, or temporal signal-to-noise ratio (tSNR). In terms of demographics, paranoia groups did not differ in age or sex breakdown. Thus we are reasonably confident that the observed effects are driven by true trait-level differences in paranoia between individuals.

Probing response to mentalizing events using an encoding model of the task

Results of the first contrast from the two-group ISC analysis indicated that certain brain regions showed a more stereotyped response in high-paranoia versus low-paranoia individuals. What features of the narrative were driving activity in these regions? In theory, ISC allows for reverse correlation, in which peaks of activation in a given region's timecourse are used to recover the stimulus events that evoked them 15. In practice, this is often difficult. Especially with narrative stimuli, in which structure is built up over relatively long timescales 14, it is challenging to pinpoint exactly which event (word, phrase, or sentence) triggered an increase in BOLD activity.

Rather than rely on reverse correlation, which is a purely data-driven decoding approach, we took an encoding approach: we created a model of the task based on events that we hypothesized would stimulate differing interpretations across individuals, and evaluated the degree to which certain regions of interest (ROIs) responded to such events, using a standard general linear model (GLM) analysis.
Specifically, we labeled sentences in the story when the main character was experiencing an ambiguous (i.e., possibly suspicious) social interaction, and/or sentences when she was explicitly reasoning about the intentions of other characters. For brevity, we refer to these timepoints as "mentalizing events." In creating the regressor, all events were time-locked to the end of the last word of the labeled sentences, when participants are presumably evaluating information they just heard and integrating it into their situation model of the story.

We hypothesized that the two ROIs from the previous analysis known to be involved in theory-of-mind and mentalizing, specifically the left temporal pole and right medial PFC, would show higher evoked activity to mentalizing events in individuals with higher trait-level paranoia. We included two additional ROIs, the left posterior superior temporal sulcus and left Heschl's gyrus, as a positive and negative control, respectively. We selected the left posterior superior temporal sulcus as a positive control because of its well-established role in theory-of-mind and mentalizing processes, and the fact that it emerged as highly synchronized across all participants (cf. Fig. 2) but did not show a group difference (cf. Fig. 3); thus we hypothesized that this region should respond to mentalizing events in all participants, regardless of paranoia score. Conversely, left Heschl's gyrus (primary auditory cortex) should only respond to low-level acoustic properties of the stimulus and not show preferential activation to mentalizing events in either group or the sample as a whole.
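As a rough illustration of the encoding approach described above, the sketch below builds an event regressor time-locked to sentence offsets and estimates one ROI's response with ordinary least squares. It is a minimal sketch under stated assumptions, not the analysis pipeline used in this study: the double-gamma response shape, the function names (`canonical_hrf`, `event_regressor`, `roi_beta`) and the intercept-only nuisance model are all illustrative choices that the text does not specify.

```python
import numpy as np
from scipy.stats import gamma


def canonical_hrf(tr, duration=32.0):
    """Double-gamma response sampled every `tr` seconds (an assumed, commonly used shape)."""
    t = np.arange(0.0, duration, tr)
    peak = gamma.pdf(t, 6)          # positive lobe, maximal near 5 s
    undershoot = gamma.pdf(t, 16)   # late undershoot, maximal near 15 s
    hrf = peak - undershoot / 6.0
    return hrf / hrf.sum()


def event_regressor(offsets_s, n_trs, tr=1.0):
    """Place a unit impulse at each sentence offset (seconds) and convolve with the response."""
    sticks = np.zeros(n_trs)
    idx = np.round(np.asarray(offsets_s, dtype=float) / tr).astype(int)
    idx = idx[(idx >= 0) & (idx < n_trs)]  # ignore offsets outside the run
    sticks[idx] = 1.0
    return np.convolve(sticks, canonical_hrf(tr))[:n_trs]


def roi_beta(roi_ts, regressor):
    """Least-squares slope of an ROI timecourse on the event regressor (intercept included)."""
    X = np.column_stack([np.ones_like(regressor), regressor])
    beta, *_ = np.linalg.lstsq(X, np.asarray(roi_ts, dtype=float), rcond=None)
    return beta[1]
```

With one coefficient per participant and ROI, the group comparison reduces to comparing these slopes between the low- and high-paranoia groups, or rank-correlating them with paranoia score.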

For each participant, we regressed the timecourse of each of these four ROIs against the mentalizing-events regressor and compared the resulting regression coefficients between groups

To confirm that these results hold if paranoia is treated as a continuous variable, we also calculated the rank correlation between paranoia score and beta coefficient for all four ROIs

As an additional control, to check that this effect was specific to mentalizing events and not just any sentence offset, we created an inverse regressor comprising all non-mentalizing events (i.e., by flipping the binary labels from the mentalizing-events regressor, such that all sentences were labeled except those containing an ambiguous social interaction or explicit mentalizing as described above). As expected, there were no differences between paranoia groups in any of the four ROIs in response to non-mentalizing sentences (Fig. 4d), and no continuous relationships between regression coefficient and paranoia (Fig. 4e). This suggests that trait-level paranoia is associated with differential sensitivity of the left temporal pole and right medial PFC to not just any type of information, but specifically to socially ambiguous information that presumably triggers theory-of-mind processes.

'friend', 'social'). Also associated with high trait-level paranoia were frequent use of adjectives as well as anxiety- and risk-related words (e.g., 'bad', 'crisis'); drives, a meta-category that includes words concerning affiliation, achievement, power, reward and risk; and health-related words (e.g., 'clinic', 'fever', 'infected'; recall that the story featured a doctor treating patients in a remote village; cf. Box 1).
Features with strongly negative loadings, indicating an inverse relationship with paranoia, included male references (e.g., 'him', 'his', 'man', 'father'); anger-related words ('yell', 'annoyed'); function words ('it', 'from', 'so', 'with'); and conjunctions ('and', 'but', 'until'). See Fig. 5b for specific examples for selected categories from participants' speech transcripts.

After the free-speech prompts, participants answered a series of multiple-choice questions. First, they were asked to rate the degree to which they were experiencing various emotions (suspicion, paranoia, sadness, happiness, confusion, anxiety, etc.; 16 in total) on a scale from 1 to 5. Most ratings skewed low; for example, the highest paranoia rating was 3, and only six subjects rated their paranoia level higher than 1. Interestingly, there was no significant correlation between trait-level paranoia score and self-reported paranoia (r_s = -0.02, p_uncorr = 0.91) or suspicion (r_s = 0.11, p_uncorr = 0.62) following the story. Neither were any of the other emotion ratings significantly correlated with trait-level paranoia (all uncorrected p > 0.12; see

Next, participants were asked to rate the three central characters on six personality dimensions (trustworthy, impulsive, considerate, intelligent, likeable, naïve). None of these personality ratings were significantly correlated with trait-level paranoia (all uncorrected p > 0.09; data not shown).

Finally, participants were asked to rate the likelihood of each of six scenarios. There was a trend such that individuals with higher trait-level paranoia were more likely to disagree with one of these scenarios ("Juan and the other villagers had not known anything about the disease before Carmen arrived"; r_s = -0.50, p_uncorr = 0.02), but this did not survive correction for multiple comparisons.
None of the other scenario likelihood ratings significantly correlated with trait-level paranoia (all uncorrected p > 0.19; see Fig. 5d).

Overall, then, participants' free speech was much more sensitive to individual differences in trait-level paranoia than their answers on the multiple-choice questionnaire. Self-report is a coarse measure that may suffer from response bias; behavior provides a richer feature set that allows for the discovery of more subtle associations. In studying nuanced individual differences, then, these results highlight the desirability of capturing behavior in naturalistic ways.

Here, we have shown that trait-level paranoia acts as a lens through which individuals perceive an ambiguous stimulus, yielding a spectrum of responses across participants at both the neural and behavioral levels. While our in-scanner narrative listening task evoked widespread inter-subject correlations in brain activity across all participants (Fig. 2), stratifying participants according to paranoia revealed differential inter-subject synchrony in various brain regions. Specifically, high-paranoia participants showed greater ISC in the left temporal pole, left precuneus, and two regions in the right medial prefrontal cortex (Fig. 3a). Participants more similar on trait-level paranoia (whether high or low) tended to have more similar activity patterns in left middle occipital gyrus and right angular gyrus (Fig. 3b). A follow-up event-related analysis showed that the left temporal pole and right medial prefrontal cortex selectively responded to mentalizing events in high-paranoia participants, but not in low-paranoia participants (Fig. 4). Finally, analyzing participants' speech as they freely recalled the narrative revealed semantic and syntactic features that also scaled with trait-level paranoia (Fig. 5).

The fundamental advance of the present work is to show that an intrinsic personality trait can act as an "implicit prime" that colors how individuals perceive and interpret complex events. Previous work using naturalistic tasks has shown that brain activity and behavioral responses are sensitive to experimenter instructions, i.e., an explicit prime 18,19, or to the nature of the stimulus itself, i.e., whether it is more or less compelling or entertaining 25-27. The present study extends these results in an important new direction, suggesting that in addition to such explicit modulations, there is substantial implicit variation in neural response to a naturalistic stimulus that stems from trait-level individual differences.

Our results have implications for the neural basis of both trait- and state-related paranoia. The relative hyperactivity of theory-of-mind regions in high-paranoia individuals fits with the conception of paranoia as "over-mentalizing", or the tendency to excessively attribute (malevolent) intentions to other people's actions 28. In high-paranoia individuals, the story's ambiguous or mentalizing events triggered a response across more of the brain, activating not only the posterior superior temporal cortex as in low-paranoia individuals, but also an extended set of regions including the temporal pole and medial prefrontal cortex. Both of these regions are sometimes, but not always, reported in theory-of-mind tasks broadly construed; individual differences may at least partially explain the inconsistencies in the literature 29. While their precise roles in such tasks remain unclear, it has been suggested that the medial PFC is specifically involved in meta-cognitive processes of reflecting on feelings and intentions 30, and that the temporal pole is responsible for binding complex stimuli to emotional responses 31.
It is conceivable that both of these processes are more active in individuals with higher trait-level paranoia.

But because delusions typically have a slow, insidious onset, it is nearly impossible to retrospectively recover specific triggering events in individual patients. A related challenge is that while thematically similar, each patient's delusion is unique in its details. Thus it is difficult to devise material that will evoke comparable responses across patients. One solution is to craft a model context using a stimulus that is ambiguous yet controlled (i.e., identical across participants, permitting meaningful comparisons of time-locked evoked activity), such as the one used in this work. Future work should study patient populations using paradigms such as this one, as they may shed light on mechanisms of delusion formation and/or provide eventual diagnostic or prognostic value.

While there is little work investigating how brain activity is altered during a naturalistic stimulus in psychiatric populations, a handful of studies have used such paradigms in autism, finding that ISC is lower among autistic individuals than typically developing controls while watching movies of social interactions 38-40. Notably, the degree of asynchrony scales with autism-like phenotype severity in both the patient and control groups 39. It is interesting to juxtapose these reports with the present results, in which individuals with a stronger paranoia phenotype were more synchronized during exposure to socially relevant material; ultimately, this fits with the notion of autism and psychosis as opposite ends of the same spectrum, involving hypo- and hyper-mentalization, respectively 41,42.
Eventually, it may be possible to combine a naturalistic stimulus with an ISC-based analysis that cuts across diagnostic labels to examine how the synchrony of neural response varies across both healthy and impaired populations.

From a methodological perspective, much of the research using fMRI to study individual differences has shifted focus in recent years from activation measured in task-based conditions to functional connectivity measured predominantly at rest 43-47. Both suffer from limitations: traditional tasks are tightly controlled paradigms that often lack ecological validity; resting-state scans, on the other hand, are entirely unconstrained, making it difficult to separate signal from noise. Furthermore, very few resting-state scans use behavioral monitoring, so while they may detect individual or group differences, it is usually impossible to recover the nature of the mental events that give rise to such differences. Naturalistic tasks may be a happy medium for studying both group-level functional brain organization and individual differences 48,49. We and others have argued that such tasks could serve as a "stress test" to draw out individual variation in brain and behaviors of interest 50-53, enhancing the signal-to-noise ratio in the search for

Carmen is a young American doctor who journeys to the Amazon to work in a small village health clinic. There, she meets Juan, a village leader, and Alba, a young girl whom she befriends. Soon after arriving, Carmen sees a series of patients with a very serious (and seemingly highly contagious) fever. At times it seems that Juan and the villagers are fully open and forthcoming with Carmen, while at other times their behavior is harder to interpret and it seems as if they may be hiding something. Carmen begins to wonder if the villagers had known about the disease before she arrived, and if she had somehow been deliberately lured to the remote location. The story ends abruptly, when Carmen discovers that Alba herself is sick, and that unbeknownst to her, Alba is Juan's daughter. Carmen fears she may have already been infected, and wonders what to do next.
that are more synchronized between pairs of high-paranoia participants than pairs of low-paranoia participants (contrast schematized in top panel, cf. Fig. 1c). Significant clusters were detected in the left temporal pole, two regions in the right medial prefrontal cortex (one anterior and one dorsal and posterior), and the left precuneus. No clusters were detected in the opposite direction (low > high). b) Results from a whole-brain, voxelwise contrast revealing brain regions that are more synchronized within paranoia groups (i.e., high-high and low-low pairs) than across paranoia groups (i.e., high-low pairs; contrast schematized in top panel, cf. Fig. 1c). Clusters were detected in the right angular gyrus and left lateral occipital cortex. For both contrasts, results are shown at an initial threshold of p < 0.002 with cluster correction corresponding to p < 0.05.

participants' trait-level paranoia and their likelihood ratings of various potential scenarios following the narrative (based on a Likert scale from 1 to 5). Likelihood rating for one scenario (denoted with †) was significant at p < 0.05 but this did not survive correction for multiple comparisons.

Table 1. Trait paranoia was unrelated to potential confounding variables. There were no significant differences between high- and low-paranoia participants in terms of demographics, cognitive abilities, fMRI data quality or attention to the stimulus. Categorical comparisons were carried out using Student's t-tests between the low and high paranoia groups as determined by median split (degrees of freedom for all t-tests = 20). Continuous comparisons were carried out using Spearman (rank) correlation between raw paranoia score and the variable of interest. All p-values are raw (uncorrected). *Measured with a chi-squared test.
FD, framewise displacement; tSNR, temporal signal-to-noise ratio; WRAT, Wide Range Achievement Test.

Research Center. After an initial localizing scan, a high-resolution 3D volume was collected using a magnetization prepared rapid gradient echo (MPRAGE) sequence (208 contiguous sagittal slices, slice thickness = 1 mm, matrix size 256 × 256, field of view = 256 mm, TR = 2400 ms, TE = 1.9 ms, flip angle = 8°). Functional images were acquired using a multiband T2*-sensitive gradient-recalled single shot echo-planar imaging pulse sequence (TR = 1000 ms, TE = 30 ms, voxel size = 2.0 mm³, flip angle = 60°, bandwidth = 1976 Hz/pixel, matrix size = 110 × 110, field of view = 220 mm × 220 mm, multiband factor = 4).

We acquired the following functional scans: 1) an initial eyes-open resting-state run (6:00/360 TRs in duration) during which subjects were instructed to relax and think of nothing in particular; 2) a movie-watching run using Inscapes 55 (7:00/420 TRs); 3) three narrative-listening runs corresponding to parts I, II and III of the story (21:50/1310 TRs in total); and 4) a post-narrative, eyes-open resting-state run (6:00/360 TRs) during which subjects were instructed to reflect on the story they had just heard. The present work focuses exclusively on data acquired during narrative listening. The narrative stimulus was delivered through MRI-compatible audio headphones, and a short "volume check" scan was conducted just prior to the first narrative run to ensure that participants could adequately hear the stimulus above the scanner noise. To promote engagement, during the three narrative runs, participants were asked to fixate on a static image of a jungle settlement and to actively imagine the story events as they unfolded.

Following conversion of the original DICOM images to NIFTI format, AFNI (Cox 1996) was used to preprocess MRI data.
The functional time series went through the following preprocessing steps: despiking, head motion correction, affine alignment with anatomy, nonlinear alignment to a Talairach template (TT_N27), and smoothing with an isotropic FWHM

were outliers. Censored time points were set to zero rather than removed altogether (this is the conventional way to do censoring, but especially important for inter-subject correlation analyses, to preserve the temporal structure across participants). The final output of this preprocessing pipeline was a single functional run concatenating data from the three story runs (total duration = 21:50, 1310 TRs). All analyses were conducted in volume space and projected to the surface for visualization purposes.

We used mean framewise displacement (MFD), a per-participant summary metric, to assess the amount of head motion in the sample. MFD was overall relatively low (after censoring: mean = 0.075 mm, s.d. = 0.026, range = 0.035-0.14). Number of censored time points during the story was overall low but followed a right-skewed distribution (range = 0-135, median = 4, median absolute deviation = 25). All 22 participants in the final analysis retained at least 89 percent of the total time points in the story, so missing data was not a substantial concern. Still, we performed additional control analyses to ensure that number of censored timepoints and amount of head motion were not associated with paranoia score in any way that would confound interpretation of the results (see Table 1).
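The censoring step above can be illustrated with a short sketch: flagged time points are zeroed in place so that every participant's series keeps the same length and alignment to the stimulus. This is an illustration only, not AFNI's implementation; the function names and the boolean-mask representation of the censor list are assumptions made for the example.

```python
import numpy as np


def censor_to_zero(timeseries, censor_mask):
    """Return a copy with censored time points set to zero; the length is unchanged."""
    ts = np.array(timeseries, dtype=float, copy=True)
    ts[np.asarray(censor_mask, dtype=bool)] = 0.0  # works for 1-D or (time, voxel) arrays
    return ts


def retained_fraction(n_censored, n_total):
    """Fraction of time points that survive censoring."""
    return 1.0 - n_censored / n_total
```

For the most heavily censored participant reported above (135 of 1310 TRs), `retained_fraction(135, 1310)` is roughly 0.897, consistent with the statement that every participant retained at least 89 percent of time points.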

Inter-subject correlation
Following preprocessing, inter-subject correlation (ISC) during the story was computed across all possible pairs of subjects (i, j) using AFNI's 3dTcorrelate function, resulting in 231 (n*(n-1)/2, where n = 22) unique ISC maps, where the value at each voxel represents the Pearson's correlation between that voxel's timecourse in subject i and its timecourse in subject j.

To identify voxels demonstrating statistically significant ISC across all 231 subject pairs, we performed inference at the single-group level using a recently developed linear mixed-effects (LME) model with a crossed random-effects formulation to accurately account for the correlation structure embedded in the ISC data 21. This approach has been characterized extensively, including a comparison to non-parametric approaches, and found to demonstrate proper control for false positives and good power attainment 21. The resulting map was corrected for multiple comparisons and thresholded for visualization using a voxelwise false discovery rate threshold of q < 0.001 (Fig. 2).

In a second analysis, we stratified participants according to a median split of scores on the GPTS-A subscale. We used these groups to identify voxels that had higher ISC values within one paranoia group or the other, or higher ISC values within rather than across paranoia groups. To this end, we used a two-group formulation of the LME model. This model gives the following outputs: voxelwise population ISC values within group 1 (G_11); voxelwise population ISC values within group 2 (G_22); voxelwise population ISC values between the two groups that reflect the ISC effect between any pair of subjects with each belonging to different groups (G_12). These outputs can be compared to obtain several possible contrasts. Here, we were primarily interested in two of these contrasts: 1) G_11 versus G_22; and 2) G_11 and G_22 versus G_12.
The maps resulting from each of these contrasts were thresholded using an initial voxelwise threshold of p < 0.002 and controlled for family-wise error (FWE) using a cluster size threshold of 50 voxels, corresponding to a corrected p-value of 0.05. We opted for a particularly stringent initial p-threshold in light of recent concerns about false positives arising from performing cluster correction on maps with more lenient initial thresholds 56.
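
As a rough illustration of the bookkeeping described in this section (not AFNI's 3dTcorrelate, and not the LME model itself), the following Python sketch computes pairwise Pearson correlations for a single voxel's timecourses and labels each subject pair as within-group or across-group after a median split. The function names, the handling of ties at the median, and the choice of calling the lower-scoring group "group 1" are assumptions made for this example only.

```python
from itertools import combinations
from math import sqrt
from statistics import median


def pearson(x, y):
    """Pearson correlation between two equal-length timecourses."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    dx = [a - mx for a in x]
    dy = [b - my for b in y]
    num = sum(a * b for a, b in zip(dx, dy))
    den = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return num / den


def pairwise_isc(timecourses):
    """Map each unordered subject pair (i, j), i < j, to its correlation."""
    return {
        (i, j): pearson(timecourses[i], timecourses[j])
        for i, j in combinations(range(len(timecourses)), 2)
    }


def median_split(scores):
    """Label each subject 1 (at or below the median) or 2 (above it).

    Assigning ties to group 1 is an assumption of this sketch.
    """
    cut = median(scores)
    return [2 if s > cut else 1 for s in scores]


def pair_type(i, j, groups):
    """Classify a pair as within group 1, within group 2, or across groups."""
    if groups[i] != groups[j]:
        return "G12"
    return "G11" if groups[i] == 1 else "G22"


# With n = 22 subjects there are n*(n-1)/2 = 231 unique pairs.
n_pairs = len(list(combinations(range(22), 2)))
```

In a whole-brain analysis the same pairwise computation would be repeated at every voxel, and the group-level inference would come from the LME model rather than from simple averages over the labeled pairs.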

Event-related analysis
Creating the regressor. One of the authors (E.S.F.) manually labeled sentences containing either an ambiguous social interaction or an instance of the main character [...]

[...] Count LIWC; 24, liwc.net, a software program that takes as input a given text and counts the percentage of words falling into different syntactic and semantic categories. Because LIWC was developed by researchers with interests in social, clinical, health, and cognitive psychology, the language categories were created to capture people's social and psychological states.

We restricted LIWC output to the 67 linguistic (syntactic and semantic) categories, excluding categories relating to metadata (e.g., percentage of words found in the LIWC dictionary), as well as categories irrelevant to spoken language (e.g., punctuation). Thus, our final LIWC output was a 22 × 67 matrix where each row corresponds to a participant and each column to a category.

These categories can be scaled very differently from one another. For example, words in the syntactic category "pronoun" accounted for between 10.3 and 20.5 percent of speech transcripts, while words in the semantic category "leisure" accounted for only 0 to 1.09 percent. To give approximately equal weight to all categories, we standardized each category (to have zero mean and unit variance) across participants before performing partial least squares regression (PLSR) as described in the next section. This ensures that the resulting PLS components are not simply dominated by variance in categories that are represented heavily in all human speech.
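
The standardization step can be sketched as follows. This is a minimal Python illustration, assuming a participants-by-categories matrix stored as a list of rows. The toy values are invented for the example (they only borrow the range endpoints quoted above), and the use of the sample standard deviation and the treatment of constant columns are assumptions rather than details given in the text.

```python
from statistics import mean, stdev


def zscore_columns(matrix):
    """Standardize each column of a participants-by-categories matrix."""
    standardized_cols = []
    for col in zip(*matrix):
        mu, sd = mean(col), stdev(col)  # sample SD (n - 1), an assumption
        # A constant column has no between-participant variance;
        # mapping it to zeros is a choice made for this sketch.
        standardized_cols.append(
            [0.0 if sd == 0 else (v - mu) / sd for v in col]
        )
    # Transpose back to one row per participant.
    return [list(row) for row in zip(*standardized_cols)]


# Toy matrix: 3 hypothetical participants x 2 categories
# ("pronoun"-like and "leisure"-like percentages).
liwc = [
    [10.3, 0.00],
    [15.0, 0.50],
    [20.5, 1.09],
]
z = zscore_columns(liwc)
```

After this step both columns have the same spread, so a category that is frequent in all speech no longer dominates purely because of its scale.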

Relating speech features to paranoia
To determine which speech features were most related to trait-level paranoia, we submitted the data to a partial least squares regression (PLSR) with the z-scored speech features as X (predictors) and GPTS-A score as Y (response), implemented in MATLAB as plsregress. PLSR is a latent variable approach to modeling the covariance structure between two matrices, which seeks to find the direction in X space that explains the maximum variance in Y space. It is well suited to the current problem, because it can handle a predictor matrix with more variables than observations, as well as multi-collinearity among the predictors.

In a first-pass analysis, we ran a model with 10 components to determine the number of components needed to explain most of the variance in trait-level paranoia. Results of this analysis indicated that the first component was sufficient to explain 72.3 percent of the total variance in GPTS-A score, so we selected just this component for visualization and interpretation. Predictor loadings for this component are visualized in Fig. 5a.
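
To make the idea of a first PLS component concrete, the sketch below computes it for a single response variable in plain Python. It is not the plsregress implementation used here: it assumes the standard single-response result that the first weight vector is proportional to the covariance between each centered predictor and the centered response, and the function and variable names are hypothetical. Scaling conventions for weights and scores differ across implementations.

```python
from math import sqrt


def center(values):
    """Subtract the mean from a sequence of numbers."""
    m = sum(values) / len(values)
    return [v - m for v in values]


def first_pls_component(X, y):
    """First PLS component for a single response.

    X: list of rows (observations x predictors); y: list of responses.
    Returns (weights, scores, fraction of variance in y explained).
    Assumes y covaries with at least one predictor (nonzero weights).
    """
    cols = [center(col) for col in zip(*X)]  # centered predictor columns
    yc = center(y)
    # Weights: covariance direction between each predictor and y.
    w = [sum(a * b for a, b in zip(col, yc)) for col in cols]
    norm = sqrt(sum(v * v for v in w))
    w = [v / norm for v in w]
    # Scores: projection of each observation onto the weight vector.
    t = [sum(a * b for a, b in zip(row, w)) for row in zip(*cols)]
    # Share of variance in y captured by regressing y on the scores.
    ty = sum(a * b for a, b in zip(t, yc))
    r2 = ty * ty / (sum(a * a for a in t) * sum(b * b for b in yc))
    return w, t, r2
```

Because the weights come from covariances rather than from inverting a predictor covariance matrix, this computation remains well defined when there are more predictors than observations, which is the situation described above (67 categories, 22 participants).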